Browsing by Subject "Area under roc curve (AUC)"
Now showing 1 - 2 of 2
- Results Per Page
- Sort Options
Item Open Access A discretization method based on maximizing the area under receiver operating characteristic curve(World Scientific Publishing Co. Pte. Ltd., 2013) Kurtcephe, M.; Güvenir H. A.Many machine learning algorithms require the features to be categorical. Hence, they require all numeric-valued data to be discretized into intervals. In this paper, we present a new discretization method based on the receiver operating characteristics (ROC) Curve (AUC) measure. Maximum area under ROC curve-based discretization (MAD) is a global, static and supervised discretization method. MAD uses the sorted order of the continuous values of a feature and discretizes the feature in such a way that the AUC based on that feature is to be maximized. The proposed method is compared with alternative discretization methods such as ChiMerge, Entropy-Minimum Description Length Principle (MDLP), Fixed Frequency Discretization (FFD), and Proportional Discretization (PD). FFD and PD have been recently proposed and are designed for Naïve Bayes learning. ChiMerge is a merging discretization method as the MAD method. Evaluations are performed in terms of M-Measure, an AUC-based metric for multi-class classification, and accuracy values obtained from Naïve Bayes and Aggregating One-Dependence Estimators (AODE) algorithms by using real-world datasets. Empirical results show that MAD is a strong candidate to be a good alternative to other discretization methods. © 2013 World Scientific Publishing Company.Item Open Access Ranking instances by maximizing the area under ROC curve(Institute of Electrical and Electronics Engineers, 2013) Guvenir, H. A.; Kurtcephe, M.In recent years, the problem of learning a real-valued function that induces a ranking over an instance space has gained importance in machine learning literature. Here, we propose a supervised algorithm that learns a ranking function, called ranking instances by maximizing the area under the ROC curve (RIMARC). Since the area under the ROC curve (AUC) is a widely accepted performance measure for evaluating the quality of ranking, the algorithm aims to maximize the AUC value directly. For a single categorical feature, we show the necessary and sufficient condition that any ranking function must satisfy to achieve the maximum AUC. We also sketch a method to discretize a continuous feature in a way to reach the maximum AUC as well. RIMARC uses a heuristic to extend this maximization to all features of a data set. The ranking function learned by the RIMARC algorithm is in a human-readable form; therefore, it provides valuable information to domain experts for decision making. Performance of RIMARC is evaluated on many real-life data sets by using different state-of-the-art algorithms. Evaluations of the AUC metric show that RIMARC achieves significantly better performance compared to other similar methods. © 1989-2012 IEEE.