A discretization method based on maximizing the area under receiver operating characteristic curve

dc.citation.epage: 26 (en_US)
dc.citation.issueNumber: 1 (en_US)
dc.citation.spage: 1 (en_US)
dc.citation.volumeNumber: 27 (en_US)
dc.contributor.author: Kurtcephe, M. (en_US)
dc.contributor.author: Güvenir, H. A. (en_US)
dc.date.accessioned: 2016-02-08T09:41:01Z
dc.date.available: 2016-02-08T09:41:01Z
dc.date.issued: 2013 (en_US)
dc.department: Department of Computer Engineering (en_US)
dc.description.abstract: Many machine learning algorithms require the features to be categorical. Hence, they require all numeric-valued data to be discretized into intervals. In this paper, we present a new discretization method based on the area under the receiver operating characteristic (ROC) curve (AUC) measure. Maximum area under ROC curve-based discretization (MAD) is a global, static and supervised discretization method. MAD uses the sorted order of the continuous values of a feature and discretizes the feature in such a way that the AUC based on that feature is maximized. The proposed method is compared with alternative discretization methods: ChiMerge, Entropy-Minimum Description Length Principle (MDLP), Fixed Frequency Discretization (FFD), and Proportional Discretization (PD). FFD and PD were proposed recently and are designed for Naïve Bayes learning. ChiMerge, like MAD, is a merging discretization method. Evaluations are performed on real-world datasets in terms of M-Measure, an AUC-based metric for multi-class classification, and the accuracy values obtained from the Naïve Bayes and Aggregating One-Dependence Estimators (AODE) algorithms. Empirical results show that MAD is a strong candidate as an alternative to other discretization methods. © 2013 World Scientific Publishing Company. (en_US)
dc.description.provenance: Made available in DSpace on 2016-02-08T09:41:01Z (GMT). No. of bitstreams: 1. bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5). Previous issue date: 2013 (en)
dc.identifier.doi: 10.1142/S021800141350002X (en_US)
dc.identifier.issn: 0218-0014
dc.identifier.uri: http://hdl.handle.net/11693/21091
dc.language.iso: English (en_US)
dc.publisher: World Scientific Publishing Co. Pte. Ltd. (en_US)
dc.relation.isversionof: http://dx.doi.org/10.1142/S021800141350002X (en_US)
dc.source.title: International Journal of Pattern Recognition and Artificial Intelligence (en_US)
dc.subject: Area under ROC curve (en_US)
dc.subject: Data mining (en_US)
dc.subject: Discretization (en_US)
dc.subject: Area under roc curve (AUC) (en_US)
dc.subject: Discretization method (en_US)
dc.title: A discretization method based on maximizing the area under receiver operating characteristic curve (en_US)
dc.type: Article (en_US)
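
The abstract describes MAD as discretizing a feature, using the sorted order of its values, so that the AUC obtained from that feature is maximized. The sketch below is not the authors' procedure. It only illustrates, for a two-class toy example, how the AUC of a single feature might be computed before and after discretization, so that candidate cut-points can be compared. The function names `auc` and `discretize`, the cut-points, and the data are all hypothetical.

```python
from bisect import bisect_right


def auc(scores, labels):
    """AUC of one feature used as a ranking score (label 1 = positive).

    Computed as the fraction of (positive, negative) pairs in which the
    positive instance has the higher score; ties count as half a pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def discretize(values, cut_points):
    """Replace each value by the index of its interval.

    cut_points must be sorted in ascending order (assumption of this sketch).
    """
    return [bisect_right(cut_points, v) for v in values]


# Hypothetical toy data: one continuous feature and binary class labels.
values = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9]
labels = [0, 0, 1, 1, 0, 1]

raw_auc = auc(values, labels)
coarse_auc = auc(discretize(values, [0.5]), labels)
fine_auc = auc(discretize(values, [0.3, 0.5, 0.7]), labels)

print(raw_auc, coarse_auc, fine_auc)
```

On this toy data the three-cut-point discretization scores higher than the raw feature, because placing 0.35 and 0.4 in the same interval turns a mis-ordered pair into a tie. The single cut-point scores lower, because it merges pairs that were correctly ordered.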

Files

Original bundle

Name: A discretization method based on maximizing the area under receiver operating characteristic curve.pdf
Size: 358.37 KB
Format: Adobe Portable Document Format
Description: Full printable version