A discretization method based on maximizing the area under receiver operating characteristic curve
Please cite this item using this persistent URL: http://hdl.handle.net/11693/21091
International Journal of Pattern Recognition and Artificial Intelligence
- Research Paper 
Many machine learning algorithms require categorical features, so all numeric-valued data must first be discretized into intervals. In this paper, we present a new discretization method based on the area under the receiver operating characteristic (ROC) curve (AUC) measure. Maximum area under ROC curve-based discretization (MAD) is a global, static and supervised discretization method. MAD uses the sorted order of the continuous values of a feature and discretizes the feature in such a way that the AUC based on that feature is maximized. The proposed method is compared with alternative discretization methods such as ChiMerge, Entropy-Minimum Description Length Principle (MDLP), Fixed Frequency Discretization (FFD), and Proportional Discretization (PD). FFD and PD were proposed recently and are designed for Naïve Bayes learning. Like MAD, ChiMerge is a merging discretization method. Evaluations are performed in terms of M-Measure, an AUC-based metric for multi-class classification, and accuracy values obtained from Naïve Bayes and Aggregating One-Dependence Estimators (AODE) algorithms on real-world datasets. Empirical results show that MAD is a strong candidate as an alternative to other discretization methods. © 2013 World Scientific Publishing Company.
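The relationship between cut points and the AUC of a single feature can be illustrated with a small two-class sketch. This is not the MAD algorithm from the paper; it only shows, under simplifying assumptions, how the AUC of one feature might be measured before and after discretization. The function names `auc` and `discretize`, the toy data, and the chosen cut points are all hypothetical, and ties between instances in the same interval are assumed to count as half a correct ranking.

```python
from bisect import bisect_right


def auc(scores, labels):
    """Two-class AUC by pairwise comparison: a positive ranked above a
    negative counts 1, a tie counts 0.5 (assumed tie handling)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one instance of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def discretize(values, cut_points):
    """Map each value to the index of its interval.
    cut_points must be sorted in ascending order."""
    return [bisect_right(cut_points, v) for v in values]


# Toy feature (hypothetical data): larger values tend to be class 1.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = [0, 0, 1, 0, 1, 1]

# AUC of the raw, continuous feature.
raw_auc = auc(values, labels)

# AUC after discretizing into three intervals with hand-picked cuts.
good_auc = auc(discretize(values, [2.5, 4.5]), labels)

# AUC after a single, poorly placed cut.
poor_auc = auc(discretize(values, [1.5]), labels)

print(raw_auc, good_auc, poor_auc)
```

On this toy data the three-interval discretization yields a slightly higher AUC than the raw ordering, while the single poorly placed cut yields a lower one, which is the kind of difference a method that searches for AUC-maximizing cut points would exploit.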