Browsing by Subject "Feature selection"
Now showing 1 - 10 of 10

Item Open Access
Feature selection using stochastic approximation with Barzilai and Borwein non-monotone gains (Elsevier Ltd, 2021-08)
Aksakallı, V.; Yenice, Z. D.; Malekipirbazari, Milad; Kargar, Kamyar
With the recent emergence of machine learning problems with a massive number of features, feature selection (FS) has become an increasingly important tool to mitigate the effects of the so-called curse of dimensionality. FS aims to eliminate redundant and irrelevant features for models that are faster to train, easier to understand, and less prone to overfitting. This study presents a wrapper FS method based on Simultaneous Perturbation Stochastic Approximation (SPSA) with Barzilai and Borwein (BB) non-monotone gains within a pseudo-gradient descent framework wherein performance is measured via cross-validation. We illustrate that SPSA with BB gains (SPSA-BB) provides dramatic improvements in terms of the number of iterations for convergence with minimal degradation in cross-validated error performance over the current state-of-the-art approach with monotone gains (SPSA-MON). In addition, SPSA-BB requires only one internal parameter and therefore it eliminates the need for careful fine-tuning of numerous other internal parameters as in SPSA-MON or comparable meta-heuristic FS methods such as genetic algorithms (GA). Our particular implementation includes gradient averaging as well as gain smoothing for better convergence properties. We present computational experiments on various public datasets with Nearest Neighbors and Naive Bayes classifiers as wrappers. We present comparisons of SPSA-BB against the full set of features, SPSA-MON, as well as seven popular meta-heuristic-based FS algorithms including GA and particle swarm optimization. Our results indicate that SPSA-BB converges to a good feature set in about 50 iterations on average regardless of the number of features (whether a dozen or more than 1000 features) and its performance is quite competitive.
SPSA-BB can be considered extremely fast for a wrapper method and therefore it stands as a high-performing new feature selection method that is also computationally feasible in practice.

Item Open Access
Human action recognition with line and flow histograms (IEEE, 2008-12)
İkizler, Nazlı; Cinbiş, R. Gökberk; Duygulu, Pınar
We present a compact representation for human action recognition in videos using line and optical flow histograms. We introduce a new shape descriptor based on the distribution of lines which are fitted to boundaries of human figures. By using an entropy-based approach, we apply feature selection to densify our feature representation, thus minimizing classification time without degrading accuracy. We also use a compact representation of optical flow for motion information. Using line and flow histograms together with global velocity information, we show that high-accuracy action recognition is possible, even in challenging recording conditions. © 2008 IEEE.

Item Open Access
Human activity recognition using inertial/magnetic sensor units (Springer, Berlin, Heidelberg, 2010)
Altun, Kerem; Barshan, Billur
This paper provides a comparative study on the different techniques of classifying human activities that are performed using body-worn miniature inertial and magnetic sensors. The classification techniques implemented and compared in this study are: Bayesian decision making (BDM), the least-squares method (LSM), the k-nearest neighbor algorithm (k-NN), dynamic time warping (DTW), support vector machines (SVM), and artificial neural networks (ANN). Daily and sports activities are classified using five sensor units worn by eight subjects on the chest, the arms, and the legs. Each sensor unit comprises a triaxial gyroscope, a triaxial accelerometer, and a triaxial magnetometer. Principal component analysis (PCA) and sequential forward feature selection (SFFS) methods are employed for feature reduction.
For a small number of features, SFFS demonstrates better performance and should be preferred, especially in real-time applications. The classifiers are validated using different cross-validation techniques. Among the different classifiers we have considered, BDM results in the highest correct classification rate with relatively small computational cost. © 2010 Springer-Verlag Berlin Heidelberg.

Item Open Access
Land cover classification with multi-sensor fusion of partly missing data (American Society for Photogrammetry and Remote Sensing, 2009-05)
Aksoy, S.; Koperski, K.; Tusk, C.; Marchisio, G.
We describe a system that uses decision tree-based tools for seamless acquisition of knowledge for classification of remotely sensed imagery. We concentrate on three important problems in this process: information fusion, model understandability, and handling of missing data. The importance of multi-sensor information fusion and the use of decision tree classifiers for such problems have been well studied in the literature. However, these studies have been limited to the cases where all data sources have full coverage for the scene under consideration. Our contribution in this paper is to show how decision tree classifiers can be learned with alternative (surrogate) decision nodes and result in models that are capable of dealing with missing data during both training and classification to handle cases where one or more measurements do not exist for some locations. We present a detailed performance evaluation regarding the effectiveness of these classifiers for information fusion and feature selection, and study three different methods for handling missing data in comparative experiments. The results show that surrogate decisions incorporated into decision tree classifiers provide powerful models for fusing information from different data layers while being robust to missing data.
© 2009 American Society for Photogrammetry and Remote Sensing.

Item Open Access
Machine-based classification of ADHD and nonADHD participants using time/frequency features of event-related neuroelectric activity (Elsevier Ireland Ltd, 2017)
Öztoprak, H.; Toycan, M.; Alp, Y. K.; Arıkan, Orhan; Doğutepe, E.; Karakaş, S.
Objective: Attention-deficit/hyperactivity disorder (ADHD) is the most frequent diagnosis among children who are referred to psychiatry departments. Although ADHD was discovered at the beginning of the 20th century, its diagnosis is still confronted with many problems. Method: A novel classification approach that discriminates ADHD and nonADHD groups over the time-frequency domain features of event-related potential (ERP) recordings that are taken during the Stroop task is presented. The Time-Frequency Hermite-Atomizer (TFHA) technique is used for the extraction of high-resolution time-frequency domain features that are highly localized in the time-frequency domain. Based on an extensive investigation, Support Vector Machine-Recursive Feature Elimination (SVM-RFE) was used to obtain the best discriminating features. Results: When the best three features were used, the classification accuracy for the training dataset reached 98%, and the use of five features further improved the accuracy to 99.5%. The accuracy was 100% for the testing dataset. Based on extensive experiments, the delta band emerged as the most contributing frequency band and statistical parameters emerged as the most contributing feature group. Conclusion: The classification performance of this study suggests that TFHA can be employed as an auxiliary component of the diagnostic and prognostic procedures for ADHD. Significance: The features obtained in this study can potentially contribute to the neuroelectrical understanding and clinical diagnosis of ADHD.

Item Open Access
Machine-based learning system: classification of ADHD and non-ADHD participants (IEEE, 2017)
Öztoprak, H.; Toycan, M.; Alp, Y. K.; Arıkan, Orhan; Doğutepe, E.; Karakaş, S.
Attention-deficit/hyperactivity disorder (ADHD) is the most frequent diagnosis among children who are referred to psychiatry departments. Although ADHD was discovered at the beginning of the 20th century, its diagnosis is confronted with many problems. In this paper, a novel classification approach that discriminates ADHD and non-ADHD groups over the time-frequency domain features of ERP recordings is presented. Support Vector Machine-Recursive Feature Elimination (SVM-RFE) was used to obtain the best discriminating features. When only three of these features were used, the classification accuracy reached 98%, and the use of six features further improved the classification accuracy to 99.5%. The proposed scheme was tested with a new experimental setup and 100% accuracy was obtained. The results were obtained using RCV. The classification performance of this study suggests that TFHA can be employed as a core component of the diagnostic and prognostic procedures of various psychiatric illnesses.

Item Open Access
Mining of remote sensing image archives using spatial relationship histograms (IEEE, 2008-07)
Kalaycılar, Fırat; Kale, Aslı; Zamalieva, Daniya; Aksoy, Selim
We describe a new image representation using spatial relationship histograms that extend our earlier work on modeling image content using attributed relational graphs. These histograms are constructed by classifying the regions in an image, computing the topological and distance-based spatial relationships between these regions, and counting the number of times different groups of regions are observed in the image. We also describe a selection algorithm that produces very compact representations by identifying the distinguishing region groups that are frequently found in a particular class of scenes but rarely exist in others.
Experiments using Ikonos scenes illustrate the effectiveness of the proposed representation in retrieval of images containing complex types of scenes such as dense and sparse urban areas. © 2008 IEEE.

Item Open Access
Performance comparison of feature selection and extraction methods with random instance selection (Elsevier Ltd, 2021-10-01)
Malekipirbazari, Milad; Aksakallı, V.; Shafqat, W.; Eberhard, A.
In pattern recognition, irrelevant and redundant features together with a large number of noisy instances in the underlying dataset decrease the performance of trained models and make the training process considerably slower, if not practically infeasible. In order to combat this so-called curse of dimensionality, one option is to resort to feature selection (FS) methods designed to select the features that contribute the most to the performance of the model, and another option is to utilize feature extraction (FE) methods that map the original feature space into a new space with lower dimensionality. These two methods together are called feature reduction (FR) methods. On the other hand, deploying an FR method on a dataset with a massive number of instances can become a major challenge, from both memory and run time perspectives, due to the complex numerical computations involved in the process. The research question we consider in this study is a rather simple, yet novel one: do these FR methods really need the whole set of instances (WSI) available for the best performance, or can we achieve similar performance levels by selecting a much smaller random subset of WSI prior to deploying an FR method? In this work, we provide empirical evidence based on comprehensive computational experiments that the answer to this critical research question is in the affirmative. Specifically, with simple random instance selection followed by FR, the amount of data needed for training a classifier can be drastically reduced with minimal impact on classification performance.
We also provide recommendations on which FS/FE method to use in conjunction with which classifier.

Item Open Access
Qualitative test-cost sensitive classification (Elsevier BV, 2010)
Cebe, M.; Gunduz Demir, C.
This paper reports a new framework for test-cost sensitive classification. It introduces a new loss function definition, in which misclassification cost and cost of feature extraction are combined qualitatively and the loss is conditioned on current and estimated decisions as well as their consistency. This loss function definition is motivated by the following issues. First, for many applications, the relation between different types of costs can be expressed roughly and usually only in terms of ordinal relations, but not as a precise quantitative number. Second, the redundancy between features can be used to decrease the cost; it is possible not to consider a new feature if it is consistent with the existing ones. In this paper, we show the feasibility of the proposed framework for medical diagnosis problems. Our experiments demonstrate that this framework can significantly decrease feature extraction cost without decreasing accuracy. © 2010 Elsevier B.V. All rights reserved.

Item Open Access
Qualitative test-cost sensitive classification (2008)
Cebe, Mümin
Decision making is a procedure for selecting the best action among several alternatives. In many real-world problems, a decision has to be made under circumstances in which one has to pay to acquire information. In this thesis, we propose a new framework for test-cost sensitive classification that considers the misclassification cost together with the cost of feature extraction, which arises from the effort of acquiring features. This proposed framework introduces two new concepts to test-cost sensitive learning to better model real-world problems: qualitativeness and consistency. First, this framework introduces the incorporation of qualitative costs into the problem formulation.
This incorporation becomes important for many real-world problems, from finance to medical diagnosis, since the relation between the misclassification cost and the cost of feature extraction could be expressed only roughly and typically in terms of ordinal relations for these problems. For example, in cancer diagnosis, it could be expressed that the cost of misdiagnosis is larger than the cost of a medical test. However, in the test-cost sensitive classification literature, the misclassification cost and the cost of feature extraction are combined quantitatively to obtain a single loss/utility value, which requires expressing the relation between these costs as a precise quantitative number. Second, the proposed framework considers the consistency between the current information and the information after feature extraction to decide which features to extract. For example, it does not extract a new feature if it brings no new information but just confirms the current one; in other words, if the new feature is totally consistent with the current information. By doing so, the proposed framework could significantly decrease the cost of feature extraction, and hence the overall cost, without decreasing the classification accuracy. Such consistency behavior has not been considered in the previous test-cost sensitive literature. We conduct our experiments on three medical data sets and the results demonstrate that the proposed framework significantly decreases the feature extraction cost without decreasing the classification accuracy.
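
The consistency idea in the abstract above (do not extract a feature that can only confirm the current decision) can be illustrated with a toy sketch. This is not the thesis's formulation: it ignores the qualitative cost ordering entirely, swaps in a plain naive Bayes decision rule, and every name and number in it (`PRIORS`, `LIKELIHOODS`, `decide`, `is_worth_extracting`, the two hypothetical tests) is an assumption made for illustration only.

```python
from math import log

# Hypothetical toy model (all numbers invented for illustration):
# class priors and per-feature likelihoods P(value | class).
PRIORS = {"benign": 0.7, "malignant": 0.3}
VALUES = ("neg", "pos")
LIKELIHOODS = {
    "test_a": {"benign": {"neg": 0.9, "pos": 0.1},
               "malignant": {"neg": 0.2, "pos": 0.8}},
    "test_b": {"benign": {"neg": 0.6, "pos": 0.4},
               "malignant": {"neg": 0.5, "pos": 0.5}},
}


def decide(observed):
    """Return the most probable class given observed feature values (naive Bayes)."""
    scores = {}
    for label, prior in PRIORS.items():
        score = log(prior)
        for feature, value in observed.items():
            score += log(LIKELIHOODS[feature][label][value])
        scores[label] = score
    return max(scores, key=scores.get)


def is_worth_extracting(feature, observed):
    """True if at least one possible value of `feature` would change the decision.

    A feature whose every outcome is consistent with the current decision
    brings no new information in this toy setting, so it is skipped.
    """
    current = decide(observed)
    return any(decide({**observed, feature: value}) != current for value in VALUES)


if __name__ == "__main__":
    for name in LIKELIHOODS:
        print(name, "worth extracting:", is_worth_extracting(name, {}))
```

In this sketch `test_a` can overturn the prior decision and is therefore worth paying for, while `test_b` can only confirm it and is skipped. A fuller treatment would also weigh the ordinal relation between misclassification cost and test cost, as the abstract describes.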