Browsing by Subject "Classification (of information)"
Item Open Access
Activity recognition invariant to sensor orientation with wearable motion sensors (MDPI AG, 2017)
Yurtman, A.; Barshan, B.
Most activity recognition studies that employ wearable sensors assume that the sensors are attached at pre-determined positions and orientations that do not change over time. Since this is not the case in practice, it is of interest to develop wearable systems that operate invariantly to sensor position and orientation. We focus on invariance to sensor orientation and develop two alternative transformations to remove the effect of absolute sensor orientation from the raw sensor data. We test the proposed methodology in activity recognition with four state-of-the-art classifiers using five publicly available datasets containing various types of human activities acquired by different sensor configurations. While the ordinary activity recognition system cannot handle incorrectly oriented sensors, the proposed transformations allow the sensors to be worn at any orientation at a given position on the body, and achieve nearly the same activity recognition performance as the ordinary system for which the sensor units are not rotatable. The proposed techniques can be applied to existing wearable systems without much effort, by simply transforming the time-domain sensor data at the pre-processing stage. © 2017 by the authors. Licensee MDPI, Basel, Switzerland.

Item Open Access
Application of the RIMARC algorithm to a large data set of action potentials and clinical parameters for risk prediction of atrial fibrillation (Springer, 2015)
Ravens, U.; Katircioglu-Öztürk, D.; Wettwer, E.; Christ, T.; Dobrev, D.; Voigt, N.; Poulet, C.; Loose, S.; Simon, J.; Stein, A.; Matschke, K.; Knaut, M.; Oto, E.; Oto, A.; Güvenir, H.
A.
Ex vivo recorded action potentials (APs) in human right atrial tissue from patients in sinus rhythm (SR) or atrial fibrillation (AF) display a characteristic spike-and-dome or triangular shape, respectively, but variability is huge within each rhythm group. The aim of our study was to apply the machine-learning algorithm ranking instances by maximizing the area under the ROC curve (RIMARC) to a large data set of 480 APs combined with retrospectively collected general clinical parameters and to test whether the rules learned by the RIMARC algorithm can be used for accurately classifying the preoperative rhythm status. APs were included from 221 SR and 158 AF patients. During a learning phase, the RIMARC algorithm established a ranking order of 62 features by predictive value for SR or AF. The model was then challenged with an additional test set of features from 28 patients in whom rhythm status was blinded. The accuracy of the risk prediction for AF by the model was very good (0.93) when all features were used. Without the seven AP features, accuracy still reached 0.71. In conclusion, we have shown that training the machine-learning algorithm RIMARC with an experimental and clinical data set allows predicting a classification in a test data set with high accuracy. In a clinical setting, this approach may prove useful for finding hypothesis-generating associations between different parameters.

Item Open Access
Authorship attribution: performance of various features and classification methods (IEEE, 2007-11)
Bozkurt, İlker Nadi; Bağlıoğlu, Özgür; Uyar, Erkan
Authorship attribution is the process of determining the writer of a document. Many classification techniques have been applied to this task in the literature. In this paper we explore information retrieval methods such as tf-idf structure with support vector machines, parametric and nonparametric methods with supervised and unsupervised (clustering) classification techniques in authorship attribution.
We performed various experiments with articles gathered from the Turkish newspaper Milliyet. We performed experiments on different features extracted from these texts with different classifiers, and combined these results to improve our success rates. We identified which classifiers give satisfactory results on which feature sets. According to the experiments, the success rates change dramatically with different combinations; however, the best among them are the support vector classifier with bag of words, and Gaussian with function words. © 2007 IEEE.

Item Open Access
Boosting performance of directory-based cache coherence protocols with coherence bypass at subpage granularity and a novel on-chip page table (ACM, 2016-05)
Soltaniyeh, M.; Kadayıf, I.; Öztürk, Özcan
Chip multiprocessors (CMPs) require effective cache coherence protocols as well as fast virtual-to-physical address translation mechanisms for high performance. Directory-based cache coherence protocols are the state-of-the-art approaches in many-core CMPs to keep the data blocks coherent at the last level private caches. However, the area overhead and high associativity requirement of the directory structures may not scale well with an increasing number of cores. As shown in some prior studies, a significant percentage of data blocks are accessed by only one core; therefore, it is not necessary to keep track of these in the directory structure. In this study, we have two major contributions. First, we show that compared to the classification of cache blocks at page granularity as done in some previous studies, data block classification at subpage level helps to detect considerably more private data blocks. Consequently, it reduces the percentage of blocks required to be tracked in the directory significantly compared to similar page level classification approaches.
This, in turn, enables smaller directory caches with lower associativity to be used in CMPs without hurting performance, thereby helping the directory structure to scale gracefully with the increasing number of cores. Memory block classification at subpage level, however, may increase the frequency of the Operating System's (OS) involvement in updating the maintenance bits belonging to subpages stored in page table entries, nullifying some portion of the performance benefits of subpage level data classification. To overcome this, we propose a distributed on-chip page table as our second contribution. © 2016 Copyright held by the owner/author(s).

Item Open Access
Çağrı merkezi metin madenciliği yaklaşımı [Call center text mining approach] (IEEE, 2017-05)
Yiğit, İ. O.; Ateş, A. F.; Güvercin, Mehmet; Ferhatosmanoğlu, Hakan; Gedik, Buğra
Today, the ability to convert call center conversation recordings from speech to text makes it possible to apply text mining methods to the call transcripts. This study aims to evaluate the content of a call in terms of sentiment (positive/negative) and to measure customer satisfaction and customer representative performance using the call transcripts. New features were extracted from the call transcripts with text mining methods. Using the features obtained from the texts, prediction models that enable evaluation of the content of the call recordings were built with classification and regression methods. The prediction models resulting from this study are intended to be used in the call centers within Türk Telekom.

Item Open Access
Chat mining for gender prediction (Springer, 2006-10)
Küçükyılmaz, Tayfun; Cambazoğlu, B. Barla; Aykanat, Cevdet; Can, Fazlı
The aim of this paper is to investigate the feasibility of predicting the gender of a text document's author using linguistic evidence.
For this purpose, term- and style-based classification techniques are evaluated over a large collection of chat messages. Prediction accuracies up to 84.2% are achieved, illustrating the applicability of these techniques to gender prediction. Moreover, the reverse problem is exploited, and the effect of gender on the writing style is discussed. © Springer-Verlag Berlin Heidelberg 2006.

Item Open Access
Classification by voting feature intervals (Springer, 1997-04)
Demiröz, Gülşen; Güvenir, H. Altay
A new classification algorithm called VFI (for Voting Feature Intervals) is proposed. A concept is represented by a set of feature intervals on each feature dimension separately. Each feature participates in the classification by distributing real-valued votes among classes. The class receiving the highest vote is declared to be the predicted class. VFI is compared with the Naive Bayesian Classifier (NBC), which also considers each feature separately. Experiments on real-world datasets show that VFI performs comparably to, and even better than, NBC in terms of classification accuracy. Moreover, VFI is faster than NBC on all datasets. © Springer-Verlag Berlin Heidelberg 1997.

Item Open Access
Comparative analysis of different approaches to target classification and localization with sonar (IEEE, 2001-08)
Ayrulu, Birsel; Barshan, Billur
Different classification and fusion techniques were compared for target classification and localization with sonar. Target localization performance of artificial neural networks (ANN) was found to be better than that of the target differentiation algorithm (TDA) and fusion techniques.
The target classification performance of non-parametric approaches was better than that of parameterized density estimator (PDE) using homoscedastic and heteroscedastic NM for statistical pattern recognition techniques.

Item Open Access
Competitive and online piecewise linear classification (IEEE, 2013)
Özkan, Hüseyin; Donmez, M. A.; Pelvan, O. S.; Akman, A.; Kozat, Süleyman S.
In this paper, we study the binary classification problem in machine learning and introduce a novel classification algorithm based on the 'Context Tree Weighting Method'. The introduced algorithm incrementally learns a classification model through sequential updates in the course of a given data stream, i.e., each data point is processed only once and forgotten after the classifier is updated, and asymptotically achieves the performance of the best piecewise linear classifiers defined by the 'context tree'. Since the computational complexity is only linear in the depth of the context tree, our algorithm is highly scalable and appropriate for real time processing. We present experimental results on several benchmark data sets and demonstrate that our method provides significant computational improvement both in the test (5 ∼ 35×) and training phases (40 ∼ 1000×), while achieving high classification accuracy in comparison to the SVM with RBF kernel. © 2013 IEEE.

Item Open Access
Data imputation through the identification of local anomalies (Institute of Electrical and Electronics Engineers Inc., 2015)
Ozkan, H.; Pelvan, O. S.; Kozat, S. S.
We introduce a comprehensive and statistical framework in a model free setting for a complete treatment of localized data corruptions due to severe noise sources, e.g., an occluder in the case of a visual recording. Within this framework, we propose: 1) a novel algorithm to efficiently separate, i.e., detect and localize, possible corruptions from a given suspicious data instance and 2) a maximum a posteriori estimator to impute the corrupted data.
As a generalization of the Euclidean distance, we also propose a novel distance measure, which is based on the ranked deviations among the data attributes and is empirically shown to be superior in separating the corruptions. Our algorithm first splits the suspicious instance into parts through a binary partitioning tree in the space of data attributes and iteratively tests those parts to detect local anomalies using the nominal statistics extracted from an uncorrupted (clean) reference data set. Once each part is labeled as anomalous versus normal, the corresponding binary patterns over this tree that characterize corruptions are identified and the affected attributes are imputed. Under a certain conditional independence structure assumed for the binary patterns, we analytically show that the false alarm rate of the introduced algorithm in detecting the corruptions is independent of the data and can be directly set without any parameter tuning. The proposed framework is tested over several well-known machine learning data sets with synthetically generated corruptions and experimentally shown to produce remarkable improvements for classification purposes, with strong corruption separation capabilities. Our experiments also indicate that the proposed algorithms outperform the typical approaches and are robust to varying training phase conditions. © 2015 IEEE.

Item Open Access
EEG sinyallerinde gamma tepkisinin tespiti [Detection of the gamma response in EEG signals] (IEEE, 2006-04)
Tüfekçi, D. İlhan; Karakaş, S.; Arıkan, Orhan
In the detection of the existence of the early gamma response, subjective methods have been used. In this study, an automated gamma detection technique is developed based on the features obtained from the time-frequency representation of the EEG signal in the gamma frequency band. The technique easily discriminates between cases with and without a gamma response for the generated synthetic data.
The classification produced by the technique agrees with that of the expert opinion in 77% of cases for real EEG data. © 2006 IEEE.

Item Open Access
Estimating the chance of success in IVF treatment using a ranking algorithm (Springer, 2015)
Güvenir, H. A.; Misirli, G.; Dilbaz, S.; Ozdegirmenci, O.; Demir, B.; Dilbaz, B.
In medicine, estimating the chance of success for treatment is important in deciding whether to begin the treatment or not. This paper focuses on the domain of in vitro fertilization (IVF), where estimating the outcome of a treatment is crucial in the decision to proceed with treatment for both the clinicians and the infertile couples. IVF treatment is a costly process, and a very stressful one for couples who want to have a baby. If an initial evaluation indicates a low pregnancy rate, the couple may decide not to start the IVF treatment. The aim of this study is twofold: first, to develop a technique that can be used to estimate the chance of success for a couple who wants to have a baby and, second, to determine the attributes and their particular values affecting the outcome in IVF treatment. We propose a new technique, called success estimation using a ranking algorithm (SERA), for estimating the success of a treatment using a ranking-based algorithm. The particular ranking algorithm used here is RIMARC. The performance of the new algorithm is compared with two well-known algorithms that assign class probabilities to query instances. The algorithms used in the comparison are Naïve Bayes Classifier and Random Forest. The comparison is done in terms of area under the ROC curve, accuracy and execution time, using tenfold stratified cross-validation. The results indicate that the proposed SERA algorithm has a potential to be used successfully to estimate the probability of success in medical treatment.

Item Open Access
Exploiting interclass rules for focused crawling (IEEE, 2004)
Altingövde, I.
S.; Ulusoy, Özgür
A baseline crawler was developed at Bilkent University based on a focused-crawling approach. The focused crawler is an agent that targets a particular topic and visits and gathers only a relevant, narrow Web segment while trying not to waste resources on irrelevant materials. The rule-based Web-crawling approach uses linkage statistics among topics to improve a baseline focused crawler's harvest rate and coverage. The crawler also employs a canonical topic taxonomy to train a naïve-Bayesian classifier, which then helps determine the relevancy of crawled pages.

Item Open Access
FAME: Face association through model evolution (IEEE, 2015-06)
Gölge, Eren; Duygulu, Pınar
We attack the problem of building classifiers for public faces from web images collected through querying a name. The search results are very noisy even after face detection, with several irrelevant faces corresponding to other people. Moreover, the photographs are taken in the wild with large variety in poses and expressions. We propose a novel method, Face Association through Model Evolution (FAME), that is able to prune the data in an iterative way, for the models associated with a name to evolve. The idea is based on capturing discriminative and representative properties of each instance and eliminating the outliers. The final models are used to classify faces on novel datasets with different characteristics. On benchmark datasets, our results are comparable to or better than the state-of-the-art studies for the task of face identification. © 2015 IEEE.

Item Open Access
Farklı yapay sinir ağı temelli sınıflandırıcılar ile insan hareketi tanımlama [Human activity recognition with different artificial neural network based classifiers] (IEEE, 2017-05)
Çatalbaş, Burak; Morgül, Ömer; Çatalbaş, Bahadır
Human Activity Recognition is a popular research topic because of its importance and the difficulty of reaching high classification rates with a limited feature vector.
As the inertial measurement units embedded in smartphones make individuals' movements easier to measure, the amount of data collected in this field is growing, which makes it possible to design more successful classifiers. Artificial neural networks can outperform conventional classifiers on classification problems. In this study, several artificial neural network architectures were tried in order to propose a neural network based classifier for the University of California, Irvine (UCI) data set, and the success rates obtained with these classifiers were compared with the results reported in the literature for the same data set.

Item Open Access
İki durumlu bir beyin bilgisayar arayüzünde özellik çıkarımı ve sınıflandırma [Feature extraction and classification in a two-state brain-computer interface] (IEEE, 2017-10)
Altındiş, Fatih; Yılmaz, B.
Brain-computer interface (BCI) technology is used to enable communication with the outside world for many people, such as ALS and paralysis patients, who have lost motor neuron function and have restricted mobility. This study aims to classify motor imagery through a real-time EEG processing simulation, using an EEG data set recorded at the University of Graz in Austria. The data set contains two-channel EEG signals recorded from 8 subjects while they imagined moving the right or the left hand. From each participant, a total of 120 motor imagery trial signals of approximately 9 seconds each were recorded, 60 for the right hand and 60 for the left. These signals were filtered. The 24-, 32-, and 40-element feature vectors consist of relative power change values (GGDD, from the Turkish term) obtained using band-pass filters.
In this study, classification was performed with linear discriminant analysis (LDA), k-nearest neighbors (KNN), and support vector machines (SVM), and the best classification performance was shown to be obtained with the 24-value feature vector and the LDA classification method.

Item Open Access
Interactive training of advanced classifiers for mining remote sensing image archives (ACM, 2004)
Aksoy, Selim; Koperski, K.; Tusk, C.; Marchisio, G.
Advances in satellite technology and availability of downloaded images constantly increase the sizes of remote sensing image archives. Automatic content extraction, classification and content-based retrieval have become highly desired goals for the development of intelligent remote sensing databases. The common approach for mining these databases uses rules created by analysts. However, incorporating GIS information and human expert knowledge with digital image processing improves remote sensing image analysis. We developed a system that uses decision tree classifiers for interactive learning of land cover models and mining of image archives. Decision trees provide a promising solution for this problem because they can operate on both numerical (continuous) and categorical (discrete) data sources, and they do not require any assumptions about either the distributions or the independence of attribute values. This is especially important for the fusion of measurements from different sources like spectral data, DEM data and other ancillary GIS data. Furthermore, using surrogate splits provides the capability of dealing with missing data during both training and classification, and enables handling instrument malfunctions or the cases where one or more measurements do not exist for some locations.
Quantitative and qualitative performance evaluation showed that decision trees provide powerful tools for modeling both pixel and region contents of images and mining of remote sensing image archives.

Item Open Access
Large-scale cluster-based retrieval experiments on Turkish texts (ACM, 2007)
Altıngövde, İsmail Şengör; Özcan, Rıfat; Öcalan, Hüseyin C.; Can, Fazlı; Ulusoy, Özgür
We present cluster-based retrieval (CBR) experiments on the largest available Turkish document collection. Our experiments evaluate retrieval effectiveness and efficiency on both an automatically generated clustering structure and a manual classification of documents. In particular, we compare CBR effectiveness with full-text search (FS) and evaluate several implementation alternatives for CBR. Our findings reveal that CBR yields effectiveness figures comparable to those of FS. Furthermore, by using a specifically tailored cluster-skipping inverted index we significantly improve in-memory query processing efficiency of CBR in comparison to other traditional CBR techniques and even FS.

Item Open Access
Maximizing benefit of classifications using feature intervals (Springer, Berlin, Heidelberg, 2003)
İkizler, Nazlı; Güvenir, H. Altay
There is a great need for classification methods that can properly handle asymmetric cost and benefit constraints of classifications. In this study, we aim to emphasize the importance of classification benefits by means of a new classification algorithm, Benefit-Maximizing classifier with Feature Intervals (BMFI), that uses feature projection based knowledge representation.
Empirical results show that BMFI has promising performance compared to recent cost-sensitive algorithms in terms of the benefit gained.

Item Open Access
Modeling interestingness of streaming classification rules as a classification problem (Springer, 2005-06)
Aydın, Tolga; Güvenir, Halil Altay
Inducing classification rules on domains from which information is gathered at regular periods generally yields so many rules that selecting the interesting ones among all discovered rules becomes an important task. At each period, using the newly gathered information from the domain, the new classification rules are induced. Therefore, these rules stream through time and are called streaming classification rules. In this paper, an interactive classification rules' interestingness learning algorithm (ICRIL) is developed to automatically label the classification rules either as "interesting" or "uninteresting" with limited user interaction. In our study, VFFP (Voting Fuzzified Feature Projections), a feature projection based incremental classification algorithm, is also developed in the framework of ICRIL. The concept description learned by the VFFP is the interestingness concept of streaming classification rules. © Springer-Verlag Berlin Heidelberg 2006.
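
Several entries in this listing (VFI, BMFI, VFFP) describe classifiers in which each feature is considered separately and casts votes for classes. As a rough illustration of that general idea only, the Python sketch below records one value range per class on each feature and lets every feature split a single vote among the classes whose range contains the query value. This is a deliberately simplified stand-in, not the published VFI, BMFI, or VFFP algorithm: the one-range-per-class representation, the equal vote split, and the names `train_ranges` and `predict_by_votes` are all assumptions made for the example.

```python
from collections import defaultdict


def train_ranges(samples, labels):
    """For every feature, record the (min, max) range observed for each class.

    Simplifying assumption: one range per class per feature, rather than the
    finer-grained intervals a real feature-interval method would build.
    """
    n_features = len(samples[0])
    ranges = [dict() for _ in range(n_features)]
    for x, y in zip(samples, labels):
        for f, value in enumerate(x):
            lo, hi = ranges[f].get(y, (value, value))
            ranges[f][y] = (min(lo, value), max(hi, value))
    return ranges


def predict_by_votes(ranges, x):
    """Each feature splits one vote equally among the classes whose range
    contains the query value; the class with the highest total is returned.
    Returns None when no feature can vote for any class.
    """
    votes = defaultdict(float)
    for f, value in enumerate(x):
        matching = [c for c, (lo, hi) in ranges[f].items() if lo <= value <= hi]
        for c in matching:
            votes[c] += 1.0 / len(matching)
    if not votes:
        return None
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Toy data, invented for the example: two features, two classes.
    samples = [(1.0, 10.0), (2.0, 12.0), (8.0, 11.0), (9.0, 30.0)]
    labels = ["a", "a", "b", "b"]
    model = train_ranges(samples, labels)
    print(predict_by_votes(model, (1.5, 10.5)))
    print(predict_by_votes(model, (8.5, 11.5)))
```

Because every feature votes independently, a query can still be classified when some features fall outside all known ranges, which is one reason per-feature voting schemes are attractive for data with missing or unusual values.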