A discretization method based on maximizing the area under receiver operating characteristic curve
Please cite this item using this persistent URL: http://hdl.handle.net/11693/21091
International Journal of Pattern Recognition and Artificial Intelligence
- Research Paper 
Many machine learning algorithms require categorical features, so all numeric-valued data must first be discretized into intervals. In this paper, we present a new discretization method based on the area under the receiver operating characteristic (ROC) curve (AUC) measure. Maximum area under ROC curve-based discretization (MAD) is a global, static and supervised discretization method. MAD uses the sorted order of the continuous values of a feature and discretizes the feature so that the AUC based on that feature is maximized. The proposed method is compared with alternative discretization methods: ChiMerge, Entropy-Minimum Description Length Principle (MDLP), Fixed Frequency Discretization (FFD), and Proportional Discretization (PD). FFD and PD were proposed recently and are designed for Naïve Bayes learning. Like MAD, ChiMerge is a merging discretization method. Evaluations are performed on real-world datasets in terms of M-Measure, an AUC-based metric for multi-class classification, and the accuracy obtained from the Naïve Bayes and Aggregating One-Dependence Estimators (AODE) algorithms. Empirical results show that MAD is a strong alternative to the other discretization methods. © 2013 World Scientific Publishing Company.
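The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way to discretize a feature so that its AUC is maximized. It assumes a two-class problem with labels 0 and 1, and assumes that larger feature values indicate the positive class. The convex-hull step is an assumption of this sketch, not a detail taken from the abstract: each distinct value is treated as a threshold and plotted as a ROC point, and only the thresholds on the upper convex hull are kept as cut points, since merging the values between two hull vertices replaces a dented part of the curve with a straight segment. The function names `roc_points`, `upper_hull` and `mad_like_cut_points` are hypothetical.

```python
from collections import Counter


def roc_points(values, labels):
    """ROC points (fpr, tpr, threshold) for the rule 'value >= threshold => positive'."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    pos_at = Counter(v for v, y in zip(values, labels) if y == 1)
    neg_at = Counter(v for v, y in zip(values, labels) if y == 0)
    points = [(0.0, 0.0, None)]  # threshold above every value: nothing is positive
    tp = fp = 0
    for v in sorted(set(values), reverse=True):
        tp += pos_at[v]
        fp += neg_at[v]
        points.append((fp / neg, tp / pos, v))
    return points


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); negative means a right turn at a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(points):
    """Upper convex hull of points already ordered from (0, 0) to (1, 1)."""
    hull = []
    for p in points:
        # Drop the previous vertex while it lies on or below the new segment.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def mad_like_cut_points(values, labels):
    """Return (ascending cut points, area under the hull) for one feature.

    Assumption of this sketch: binary labels, higher values favour class 1.
    A cut point c separates the interval 'value < c' from 'value >= c'.
    """
    hull = upper_hull(roc_points(values, labels))
    cuts = sorted(t for _, _, t in hull[1:-1])  # interior hull vertices only
    # Trapezoidal area under the hull polyline.
    auc = sum((x1 - x0) * (y0 + y1) / 2
              for (x0, y0, _), (x1, y1, _) in zip(hull, hull[1:]))
    return cuts, auc


values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 1, 0, 1, 1, 0, 1]
cut_points, hull_auc = mad_like_cut_points(values, labels)
print(cut_points, hull_auc)
```

On this toy data the sketch yields four intervals whose share of positive instances rises from the lowest interval to the highest. A multi-class setting, as evaluated in the paper with M-Measure, would need an extension that this sketch does not attempt.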