Bilkent Repository :: Browsing by Subject "Learning algorithms"

Open Access
Adaptive ensemble learning with confidence bounds for personalized diagnosis
(AAAI Press, 2016) Tekin, Cem; Yoon, J.; Van Der Schaar, M.
With the advances in the field of medical informatics, automated clinical decision support systems are becoming the de facto standard in personalized diagnosis. In order to establish high accuracy and confidence in personalized diagnosis, massive amounts of distributed, heterogeneous, correlated and high-dimensional patient data from different sources such as wearable sensors, mobile applications, Electronic Health Record (EHR) databases etc. need to be processed. This requires learning both locally and globally due to privacy constraints and/or distributed nature of the multimodal medical data. In the last decade, a large number of meta-learning techniques have been proposed in which local learners make online predictions based on their locally-collected data instances, and feed these predictions to an ensemble learner, which fuses them and issues a global prediction. However, most of these works do not provide performance guarantees or, when they do, these guarantees are asymptotic. None of these existing works provide confidence estimates about the issued predictions or rate of learning guarantees for the ensemble learner. In this paper, we provide a systematic ensemble learning method called Hedged Bandits, which comes with both long run (asymptotic) and short run (rate of learning) performance guarantees. Moreover, we show that our proposed method outperforms all existing ensemble learning techniques, even in the presence of concept drift.
Open Access
Adaptive hierarchical space partitioning for online classification
(IEEE, 2016) Kılıç, O. Fatih; Vanlı, N. D.; Özkan, H.; Delibalta, İ.; Kozat, Süleyman Serdar
We propose an online algorithm for supervised learning with strong performance guarantees under the empirical zero-one loss. The proposed method adaptively partitions the feature space in a hierarchical manner and generates a powerful finite combination of basic models. This yields a strong classification method: a piecewise linear classifier that can work well on highly non-linear, complex data. The introduced algorithm also has a computational complexity that scales linearly with the dimension of the feature space, the depth of the partitioning, and the number of processed data points. Through experiments, we show that the introduced algorithm outperforms state-of-the-art ensemble techniques on various well-known machine learning data sets.
Open Access
Application of the RIMARC algorithm to a large data set of action potentials and clinical parameters for risk prediction of atrial fibrillation
(Springer, 2015) Ravens, U.; Katircioglu-Öztürk, D.; Wettwer, E.; Christ, T.; Dobrev, D.; Voigt, N.; Poulet, C.; Loose, S.; Simon, J.; Stein, A.; Matschke, K.; Knaut, M.; Oto, E.; Oto, A.; Güvenir, H. A.
Ex vivo recorded action potentials (APs) in human right atrial tissue from patients in sinus rhythm (SR) or atrial fibrillation (AF) display a characteristic spike-and-dome or triangular shape, respectively, but variability is huge within each rhythm group. The aim of our study was to apply the machine-learning algorithm ranking instances by maximizing the area under the ROC curve (RIMARC) to a large data set of 480 APs combined with retrospectively collected general clinical parameters and to test whether the rules learned by the RIMARC algorithm can be used for accurately classifying the preoperative rhythm status. APs were included from 221 SR and 158 AF patients. During a learning phase, the RIMARC algorithm established a ranking order of 62 features by predictive value for SR or AF. The model was then challenged with an additional test set of features from 28 patients in whom rhythm status was blinded. The accuracy of the risk prediction for AF by the model was very good (0.93) when all features were used. Without the seven AP features, accuracy still reached 0.71. In conclusion, we have shown that training the machine-learning algorithm RIMARC with an experimental and clinical data set allows predicting a classification in a test data set with high accuracy. In a clinical setting, this approach may prove useful for finding hypothesis-generating associations between different parameters.
Open Access
Auto-tuning similarity search algorithms on multi-core architectures
(2013) Gedik, B.
In recent times, large high-dimensional datasets have become ubiquitous. Video and image repositories, financial, and sensor data are just a few examples of such datasets in practice. Many applications that use such datasets require the retrieval of data items similar to a given query item, or the nearest neighbors (NN or k-NN) of a given item. Another common query is the retrieval of multiple sets of nearest neighbors, i.e., multi k-NN, for different query items on the same data. With commodity multi-core CPUs becoming more and more widespread at lower costs, developing parallel algorithms for these search problems has become increasingly important. While the core nearest neighbor search problem is relatively easy to parallelize, it is challenging to tune it for optimality. This is due to the fact that the various performance-specific algorithmic parameters, or "tuning knobs", are inter-related and also depend on the data and query workloads. In this paper, we present (1) a detailed study of the various tuning knobs and their contributions on increasing the query throughput for parallelized versions of the two most common classes of high-dimensional multi-NN search algorithms: linear scan and tree traversal, and (2) an offline auto-tuner for setting these knobs by iteratively measuring actual query execution times for a given workload and dataset. We show experimentally that our auto-tuner reaches near-optimal performance and significantly outperforms un-tuned versions of parallel multi-NN algorithms for real video repository data on a variety of multi-core platforms. © 2013 Springer Science+Business Media New York.
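As a rough illustration of the linear-scan class of multi k-NN search mentioned in this abstract, the sketch below answers several queries against the same data by scanning every point once per query. It is a sequential, untuned baseline written for this listing, not the paper's parallel implementation or its auto-tuner; the function names `knn_linear_scan` and `multi_knn`, the Euclidean metric, and the toy data are all assumptions.

```python
import heapq
import math


def knn_linear_scan(data, query, k):
    """Return the indices of the k points in `data` closest to `query`.

    Assumes Euclidean distance; every point is visited once (linear scan).
    """
    return heapq.nsmallest(
        k, range(len(data)), key=lambda i: math.dist(data[i], query)
    )


def multi_knn(data, queries, k):
    """Answer several k-NN queries on the same data, one scan per query."""
    return [knn_linear_scan(data, q, k) for q in queries]


# Toy 2-D data set and two query points (illustrative values only).
data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
queries = [(0.0, 0.1), (4.0, 4.0)]
result = multi_knn(data, queries, k=2)
print(result)  # [[0, 3], [2, 1]]
```

In a parallel version, the outer loop over queries (or chunks of `data`) is the natural unit of work to distribute across cores; how to size those units is the kind of tuning knob the abstract refers to.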
Open Access
Automatic detection of geospatial objects using multiple hierarchical segmentations
(Institute of Electrical and Electronics Engineers, 2008-07) Akçay, H. G.; Aksoy, S.
The object-based analysis of remotely sensed imagery provides valuable spatial and structural information that is complementary to pixel-based spectral information in classification. In this paper, we present novel methods for automatic object detection in high-resolution images by combining spectral information with structural information exploited by using image segmentation. The proposed segmentation algorithm uses morphological operations applied to individual spectral bands using structuring elements in increasing sizes. These operations produce a set of connected components forming a hierarchy of segments for each band. A generic algorithm is designed to select meaningful segments that maximize a measure consisting of spectral homogeneity and neighborhood connectivity. Given the observation that different structures appear more clearly at different scales in different spectral bands, we describe a new algorithm for unsupervised grouping of candidate segments belonging to multiple hierarchical segmentations to find coherent sets of segments that correspond to actual objects. The segments are modeled by using their spectral and textural content, and the grouping problem is solved by using the probabilistic latent semantic analysis algorithm that builds object models by learning the object-conditional probability distributions. The automatic labeling of a segment is done by computing the similarity of its feature distribution to the distribution of the learned object models using the Kullback-Leibler divergence. The performances of the unsupervised segmentation and object detection algorithms are evaluated qualitatively and quantitatively using three different data sets with comparative experiments, and the results show that the proposed methods are able to automatically detect, group, and label segments belonging to the same object classes. © 2008 IEEE.
Open Access
A beam search algorithm to optimize robustness under random machine breakdowns and processing time variability
(Institute of Industrial Engineers, 2007) Gören, S.; Sabuncuoğlu, İhsan
The vast majority of the machine scheduling research assumes complete information about the scheduling problem and a static environment in which scheduling systems operate. In practice, however, scheduling systems are subject to considerable uncertainty in dynamic environments. The ability to cope with uncertainty in the scheduling process is becoming increasingly important in today's highly dynamic and competitive business environments. Two effective approaches have appeared in the literature: reactive and proactive scheduling. The objective in reactive scheduling is to revise schedules as necessary, while proactive scheduling attempts to incorporate future disruptions when generating schedules. In this paper, we take a proactive scheduling approach to solve a machine scheduling problem with two sources of uncertainty: processing time variability and machine breakdowns. We define two robustness measures and develop a heuristic based on the beam search methodology to optimize them. The computational results show that the proposed algorithms perform significantly better than a number of heuristics available in the literature.
Open Access
Classification of regional ionospheric disturbance based on machine learning techniques
(European Space Agency, 2016) Terzi, Merve Begüm; Arıkan, Orhan; Karatay, S.; Arıkan, F.; Gulyaeva, T.
In this study, Total Electron Content (TEC) estimated from GPS receivers is used to model the regional and local variability that differs from global activity along with solar and geomagnetic indices. For the automated classification of regional disturbances, a classification technique based on the Support Vector Machine (SVM), a robust and widely used machine learning technique, is proposed. The performance of the developed classification technique is demonstrated for the midlatitude ionosphere over Anatolia using TEC estimates generated from GPS data provided by the Turkish National Permanent GPS Network (TNPGN-Active) for the solar maximum year of 2011. By applying the developed classification technique to Global Ionospheric Map (GIM) TEC data, which is provided by the NASA Jet Propulsion Laboratory (JPL), it is shown that SVM can be a suitable learning method to detect anomalies in TEC variations.
Open Access
Contextual learning for unit commitment with renewable energy sources
(IEEE, 2017) Lee, H. -S.; Tekin, Cem; Schaar, M.; Lee, J. -W.
In this paper, we study a unit commitment (UC) problem minimizing operating costs of the power system with renewable energy sources. We develop a contextual learning algorithm for UC (CLUC) which learns which UC schedule to choose based on the context information such as past load demand and weather condition. CLUC does not require any prior knowledge on the uncertainties such as the load demand and the renewable power outputs, and learns them over time using the context information. We characterize the performance of CLUC analytically, and prove its optimality in terms of the long-term average cost. Through the simulation results, we show the performance of CLUC and the effectiveness of utilizing the context information in the UC problem.
Open Access
Fast insect damage detection in wheat kernels using transmittance images
(IEEE, 2004-07) Çataltepe, Z.; Pearson, T.; Cetin, A. Enis
We used transmittance images and different learning algorithms to classify insect-damaged and undamaged wheat kernels. Using the histogram of the pixels of the wheat images as the feature, and the linear model as the learning algorithm, we achieved a False Positive Rate (1-specificity) of 0.12 at a True Positive Rate (sensitivity) of 0.8 and an Area Under the ROC Curve (AUC) of 0.90 ± 0.02. Combining the linear model and a Radial Basis Function Network in a committee resulted in an FP Rate of 0.09 at a TP Rate of 0.8 and an AUC of 0.93 ± 0.03.
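The two metrics quoted in this abstract, the false positive rate at a fixed true positive rate and the AUC, can be computed from classifier scores as sketched below. This is a generic illustration, not the authors' evaluation code; the function names and the score values are made up for the example, and higher scores are assumed to mean "more likely damaged".

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def fpr_at_tpr(scores_pos, scores_neg, target_tpr):
    """Smallest false positive rate among thresholds with TPR >= target_tpr."""
    best = 1.0
    for t in sorted(set(scores_pos) | set(scores_neg)):
        tpr = sum(s >= t for s in scores_pos) / len(scores_pos)
        fpr = sum(s >= t for s in scores_neg) / len(scores_neg)
        if tpr >= target_tpr:
            best = min(best, fpr)
    return best


# Hypothetical scores for damaged (positive) and undamaged (negative) kernels.
damaged = [0.9, 0.8, 0.7, 0.6, 0.3]
undamaged = [0.65, 0.4, 0.2, 0.1, 0.05]
print(auc(damaged, undamaged))              # 0.88
print(fpr_at_tpr(damaged, undamaged, 0.8))  # 0.2
```

The pairwise definition of AUC is quadratic in the number of samples; it is used here because it makes the meaning of the metric explicit, not because it is the fastest way to compute it.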
Open Access
Gambler's ruin bandit problem
(IEEE, 2017) Akbarzadeh, Nima; Tekin, Cem
In this paper, we propose a new multi-armed bandit problem called the Gambler's Ruin Bandit Problem (GRBP). In the GRBP, the learner proceeds in a sequence of rounds, where each round is a Markov Decision Process (MDP) with two actions (arms): a continuation action that moves the learner randomly over the state space around the current state; and a terminal action that moves the learner directly into one of the two terminal states (goal and dead-end state). The current round ends when a terminal state is reached, and the learner incurs a positive reward only when the goal state is reached. The objective of the learner is to maximize its long-term reward (expected number of times the goal state is reached), without having any prior knowledge on the state transition probabilities. We first prove a result on the form of the optimal policy for the GRBP. Then, we define the regret of the learner with respect to an omnipotent oracle, which acts optimally in each round, and prove that it increases logarithmically over rounds. We also identify a condition under which the learner's regret is bounded. A potential application of the GRBP is optimal medical treatment assignment, in which the continuation action corresponds to a conservative treatment and the terminal action corresponds to a risky treatment such as surgery.
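The name of the problem comes from the classical gambler's ruin random walk. As a hedged illustration, the sketch below assumes the continuation action moves the learner one state up with probability `p` and one state down with probability `1 - p`, with state `0` as the dead-end and state `G` as the goal; the abstract only says the move is random around the current state, so this step model and the names `goal_probability`, `s`, `G`, and `p` are assumptions. Under that assumption, the chance of reaching the goal by always continuing has the standard closed form used here. It does not reproduce the paper's optimal policy or its regret analysis.

```python
def goal_probability(s, G, p):
    """P(hit state G before state 0) for a +/-1 random walk started at s.

    p is the probability of stepping up; assumes 0 < p < 1 and 0 <= s <= G.
    """
    if p == 0.5:
        return s / G
    r = (1 - p) / p  # ratio of down-step to up-step probability
    return (1 - r**s) / (1 - r**G)


# Starting in the middle of a 5-state chain (states 0..4).
print(goal_probability(2, 4, 0.5))  # 0.5
print(goal_probability(2, 4, 0.6))  # about 0.692 (exactly 9/13)
```

A learner that does not know `p` cannot evaluate this expression directly, which is why the problem is posed as a bandit problem with unknown transition probabilities.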
Open Access
Jamming bandits-a novel learning method for optimal jamming
(Institute of Electrical and Electronics Engineers Inc., 2016) Amuru, S.; Tekin, C.; Van Der Schaar, M.; Buehrer, R.M.
Can an intelligent jammer learn and adapt to unknown environments in an electronic warfare-type scenario? In this paper, we answer this question in the positive, by developing a cognitive jammer that adaptively and optimally disrupts the communication between a victim transmitter-receiver pair. We formalize the problem using a multiarmed bandit framework where the jammer can choose various physical layer parameters such as the signaling scheme, power level and the on-off/pulsing duration in an attempt to obtain power efficient jamming strategies. We first present online learning algorithms to maximize the jamming efficacy against static transmitter-receiver pairs and prove that these algorithms converge to the optimal (in terms of the error rate inflicted at the victim and the energy used) jamming strategy. Even more importantly, we prove that the rate of convergence to the optimal jamming strategy is sublinear, i.e., the learning is fast in comparison to existing reinforcement learning algorithms, which is particularly important in dynamically changing wireless environments. Also, we characterize the performance of the proposed bandit-based learning algorithm against multiple static and adaptive transmitter-receiver pairs.
Open Access
Modeling interestingness of streaming classification rules as a classification problem
(Springer, 2005-06) Aydın, Tolga; Güvenir, Halil Altay
Inducing classification rules on domains from which information is gathered at regular periods generally produces so many rules that selecting the interesting ones among all discovered rules becomes an important task. At each period, new classification rules are induced using the newly gathered information from the domain. These rules therefore stream through time and are called streaming classification rules. In this paper, an interactive classification rules' interestingness learning algorithm (ICRIL) is developed to automatically label the classification rules either as "interesting" or "uninteresting" with limited user interaction. In our study, VFFP (Voting Fuzzified Feature Projections), a feature projection based incremental classification algorithm, is also developed in the framework of ICRIL. The concept description learned by the VFFP is the interestingness concept of streaming classification rules. © Springer-Verlag Berlin Heidelberg 2006.
Open Access
Neural networks for improved target differentiation and localization with sonar
(Pergamon Press, 2001) Ayrulu, B.; Barshan, B.
This study investigates the processing of sonar signals using neural networks for robust differentiation of commonly encountered features in indoor robot environments. Differentiation of such features is of interest for intelligent systems in a variety of applications. Different representations of amplitude and time-of-flight measurement patterns acquired from a real sonar system are processed. In most cases, best results are obtained with the low-frequency component of the discrete wavelet transform of these patterns. Modular and non-modular neural network structures trained with the back-propagation and generating-shrinking algorithms are used to incorporate learning in the identification of parameter relations for target primitives. Networks trained with the generating-shrinking algorithm demonstrate better generalization and interpolation capability and faster convergence rate. Neural networks can differentiate more targets employing only a single sensor node, with a higher correct differentiation percentage (99%) than achieved with previously reported methods (61-90%) employing multiple sensor nodes. A sensor node is a pair of transducers with fixed separation, that can rotate and scan the target to collect data. Had the number of sensing nodes been reduced in the other methods, their performance would have been even worse. The success of the neural network approach shows that the sonar signals do contain sufficient information to differentiate all target types, but the previously reported methods are unable to resolve this identifying information. This work can find application in areas where recognition of patterns hidden in sonar signals is required. Some examples are system control based on acoustic signal detection and identification, map building, navigation, obstacle avoidance, and target-tracking applications for mobile robots and other intelligent systems. Copyright © 2001 Elsevier Science Ltd.
Open Access
An Online Causal Inference Framework for Modeling and Designing Systems Involving User Preferences: A State-Space Approach
(Hindawi Limited, 2017) Delibalta, I.; Baruh, L.; Kozat, S. S.
We provide a causal inference framework to model the effects of machine learning algorithms on user preferences. We then use this mathematical model to prove that the overall system can be tuned to alter those preferences in a desired manner. A user can be an online shopper or a social media user, exposed to digital interventions produced by machine learning algorithms. A user preference can be anything from inclination towards a product to a political party affiliation. Our framework uses a state-space model to represent user preferences as latent system parameters which can only be observed indirectly via online user actions such as a purchase activity or social media status updates, shares, blogs, or tweets. Based on these observations, machine learning algorithms produce digital interventions such as targeted advertisements or tweets. We model the effects of these interventions through a causal feedback loop, which alters the corresponding preferences of the user. We then introduce algorithms in order to estimate and later tune the user preferences to a particular desired form. We demonstrate the effectiveness of our algorithms through experiments in different scenarios. © 2017 Ibrahim Delibalta et al.
Open Access
A privacy-preserving solution for the bipartite ranking problem
(IEEE, 2016-12) Faramarzi, Noushin Salek; Ayday, Erman; Güvenir, H. Altay
In this paper, we propose an efficient privacy-preserving solution for a bipartite ranking algorithm. The bipartite ranking problem can be considered as finding a function that ranks positive instances (in a dataset) higher than the negative ones. However, one common concern for all the existing schemes is the privacy of individuals in the dataset. That is, one (e.g., a researcher) needs to access the records of all individuals in the dataset in order to run the algorithm. This privacy concern puts limitations on the use of sensitive personal data for such analysis. The RIMARC (Ranking Instances by Maximizing Area under the ROC Curve) algorithm solves the bipartite ranking problem by learning a model to rank instances. As part of the model, it learns weights for each feature by analyzing the area under the receiver operating characteristic (ROC) curve. The RIMARC algorithm has been shown to be more accurate and efficient than its counterparts. Thus, we use this algorithm as a building block and provide a privacy-preserving version of the RIMARC algorithm using homomorphic encryption and secure multi-party computation. Our proposed algorithm lets a data owner outsource the storage and processing of its encrypted dataset to a semi-trusted cloud. Then, a researcher can get the results of his/her queries (to learn the ranking function) on the dataset by interacting with the cloud. During this process, neither the researcher nor the cloud learns any information about the raw dataset. We prove the security of the proposed algorithm and show its efficiency via experiments on real data.
Open Access
Supervised machine learning algorithm for arrhythmia analysis
(IEEE, 1997) Güvenir, H. Altay; Acar, Burak; Demiröz, Gülşen; Çekin, A.
A new machine learning algorithm for the diagnosis of cardiac arrhythmia from standard 12-lead ECG recordings is presented. The algorithm is called VFI5, for Voting Feature Intervals. VFI5 is a supervised and inductive learning algorithm for inducing classification knowledge from examples. The input to VFI5 is a training set of records. Each record contains clinical measurements from ECG signals and some other information such as sex, age, and weight, along with the decision of an expert cardiologist. The knowledge representation is based on a recent technique called Feature Intervals, where a concept is represented by the projections of the training cases on each feature separately. Classification in VFI5 is based on a majority voting among the class predictions made by each feature separately. A comparison indicates that VFI5 outperforms other standard algorithms such as the Naive Bayesian and Nearest Neighbor classifiers.
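The idea of letting each feature vote separately can be sketched as follows. This is a deliberately simplified stand-in, not VFI5 itself: it assumes each class is summarized on each feature by a single [min, max] range of its training values, whereas the abstract does not describe how VFI5 constructs or weights its intervals. The function names, feature meanings, and sample records are hypothetical.

```python
from collections import defaultdict


def train(records, labels):
    """For every feature and class, store the [min, max] range seen in training."""
    ranges = defaultdict(dict)  # feature index -> {class label: (lo, hi)}
    for x, y in zip(records, labels):
        for f, v in enumerate(x):
            lo, hi = ranges[f].get(y, (v, v))
            ranges[f][y] = (min(lo, v), max(hi, v))
    return ranges


def predict(ranges, x):
    """Each feature splits one vote among the classes whose range contains its value."""
    votes = defaultdict(float)
    for f, v in enumerate(x):
        hits = [c for c, (lo, hi) in ranges[f].items() if lo <= v <= hi]
        for c in hits:
            votes[c] += 1.0 / len(hits)
    # Return the class with the most votes, or None if no feature voted.
    return max(votes, key=votes.get) if votes else None


# Hypothetical records: (heart rate, QRS-like duration); values are made up.
records = [(60, 0.10), (65, 0.12), (110, 0.20), (120, 0.11)]
labels = ["normal", "normal", "arrhythmia", "arrhythmia"]
model = train(records, labels)
print(predict(model, (62, 0.115)))  # normal
print(predict(model, (115, 0.18)))  # arrhythmia
```

Because every feature is handled on its own, a feature whose value falls outside all stored ranges simply casts no vote instead of blocking the prediction.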
Open Access
A tabu search algorithm for sparse placement of wavelength converters in optical networks
(Springer, 2004) Sengezer, N.; Karasan, E.
In this paper, we study the problem of placing a limited number of wavelength converting nodes in a multi-fiber network with static traffic demands and propose a tabu search based heuristic algorithm. The objective of the algorithm is to achieve the performance of full wavelength conversion, in terms of minimizing the total number of fibers used in the network, by placing a minimum number of wavelength converting nodes. We also present a greedy algorithm and compare its performance with the tabu search algorithm. Finally, we present numerical results that demonstrate the high correlation between placing a wavelength converting node and the amount of transit traffic passing through that node. © Springer-Verlag 2004.
Open Access
Technical note-optimal structural results for assemble-to-order generalized M-Systems
(INFORMS Inst. for Operations Res. and the Management Sciences, 2014) Nadar, E.; Akan, M.; Scheller-Wolf, A.
We consider an assemble-to-order generalized M-system with multiple components and multiple products, batch ordering of components, random lead times, and lost sales. We model the system as an infinite-horizon Markov decision process and seek an optimal policy that specifies when a batch of components should be produced (i.e., inventory replenishment) and whether an arriving demand for each product should be satisfied (i.e., inventory allocation). We characterize optimal inventory replenishment and allocation policies under a mild condition on component batch sizes via a new type of policy: lattice-dependent base stock and lattice-dependent rationing. © 2014 INFORMS.
Open Access
Theories and proofs in fault diagnosis
(Springer, 1998-09) Çiçekli, İlyas
This paper illustrates how theories (contexts), fail branches, and the ability to control the construction of proofs in MetaProlog play an important role in the expression of the fault diagnosis problem. These facilities of MetaProlog make it easier to represent digital circuits and the fault diagnosis algorithm on them. MetaProlog theories are used both in the representation of digital circuits and in the implementation of the fault diagnosis algorithm. Fail branches and the ability to control their construction play a key role during the construction of hypotheses to explain the fault in a given faulty circuit.
Open Access
Two learning approaches for protein name extraction
(Academic Press, 2009) Tatar, S.; Cicekli, I.
Protein name extraction, one of the basic tasks in automatic extraction of information from biological texts, remains challenging. In this paper, we explore the use of two different machine learning techniques and present the results of the conducted experiments. In the first method, a bigram language model is used to extract protein names. In the second, we use an automatic rule learning method that can identify protein names located in biological texts. In both cases, we generalize protein names by using hierarchically categorized syntactic token types. We conducted our experiments on two different datasets. Our first method, based on the bigram language model, achieved an F-score of 67.7% on the YAPEX dataset and 66.8% on the GENIA corpus. The developed rule learning method obtained an F-score of 61.8% on the YAPEX dataset and 61.0% on the GENIA corpus. The results of the comparative experiments demonstrate that both techniques are applicable to the task of automatic protein name extraction, a prerequisite for the large-scale processing of biomedical literature. © 2009 Elsevier Inc. All rights reserved.