Show simple item record

dc.contributor.author: Kurtcephe, M. (en_US)
dc.contributor.author: Güvenir, H.A. (en_US)
dc.date.accessioned: 2016-02-08T09:41:01Z
dc.date.available: 2016-02-08T09:41:01Z
dc.date.issued: 2013 (en_US)
dc.identifier.issn: 0218-0014 (en_US)
dc.identifier.uri: http://hdl.handle.net/11693/21091
dc.description.abstract: Many machine learning algorithms require categorical features, so all numeric-valued data must first be discretized into intervals. In this paper, we present a new discretization method based on the area under the receiver operating characteristic (ROC) curve (AUC) measure. Maximum area under ROC curve-based discretization (MAD) is a global, static and supervised discretization method. MAD uses the sorted order of the continuous values of a feature and discretizes the feature in such a way that the AUC based on that feature is maximized. The proposed method is compared with alternative discretization methods: ChiMerge, Entropy-Minimum Description Length Principle (MDLP), Fixed Frequency Discretization (FFD), and Proportional Discretization (PD). FFD and PD were proposed recently and are designed for Naïve Bayes learning. ChiMerge, like MAD, is a merging discretization method. Evaluations are performed in terms of M-Measure, an AUC-based metric for multi-class classification, and accuracy values obtained from the Naïve Bayes and Aggregating One-Dependence Estimators (AODE) algorithms on real-world datasets. Empirical results show that MAD is a strong candidate as an alternative to other discretization methods. © 2013 World Scientific Publishing Company. (en_US)
dc.language.iso: English (en_US)
dc.source.title: International Journal of Pattern Recognition and Artificial Intelligence (en_US)
dc.relation.isversionof: http://dx.doi.org/10.1142/S021800141350002X (en_US)
dc.subject: area under ROC curve (en_US)
dc.subject: Data mining (en_US)
dc.subject: discretization (en_US)
dc.subject: Area under roc curve (AUC) (en_US)
dc.subject: Discretization method (en_US)
dc.subject: Discretizations (en_US)
dc.subject: Multi-class classification (en_US)
dc.subject: Real-world datasets (en_US)
dc.subject: Receiver operating characteristic curves (en_US)
dc.subject: Receiver operating characteristics curves (ROC) (en_US)
dc.subject: Supervised discretization (en_US)
dc.subject: Learning algorithms (en_US)
dc.subject: Learning systems (en_US)
dc.subject: Volume measurement (en_US)
dc.subject: Discrete event simulation (en_US)
dc.title: A discretization method based on maximizing the area under receiver operating characteristic curve (en_US)
dc.type: Research Paper (en_US)
dc.department: Computer Engineering, Bilkent University, Ankara, 06800, Turkey (en_US)
dc.department: EECS, Case Western Reserve University, Cleveland, OH, 44106, United States (en_US)
dc.citation.volumeNumber: 27 (en_US)
dc.citation.issueNumber: 1 (en_US)
dc.identifier.doi: 10.1142/S021800141350002X (en_US)

