Qualitative test-cost sensitive classification
Date
Authors
Editor(s)
Advisor
Supervisor
Co-Advisor
Co-Supervisor
Instructor
Source Title
Print ISSN
Electronic ISSN
Publisher
Volume
Issue
Pages
Language
Type
Journal Title
Journal ISSN
Volume Title
Usage Stats
views
downloads
Series
Abstract
Decision making is a procedure for selecting the best action among several alternatives. In many real-world problems, decision has to be taken under the circumstances in which one has to pay to acquire information. In this thesis, we propose a new framework for test-cost sensitive classification that considers the misclassification cost together with the cost of feature extraction, which arises from the effort of acquiring features. This proposed framework introduces two new concepts to test-cost sensitive learning for better modeling the real-world problems: qualitativeness and consistency. First, this framework introduces the incorporation of qualitative costs into the problem formulation. This incorporation becomes important for many real world problems, from finance to medical diagnosis, since the relation between the misclassification cost and the cost of feature extraction could be expressed only roughly and typically in terms of ordinal relations for these problems. For example, in cancer diagnosis, it could be expressed that the cost of misdiagnosis is larger than the cost of a medical test. However, in the test-cost sensitive classification literature, the misclassification cost and the cost of feature extraction are combined quantitatively to obtain a single loss/utility value, which requires expressing the relation between these costs as a precise quantitative number. Second, the proposed framework considers the consistency between the current information and the information after feature extraction to decide which features to extract. For example, it does not extract a new feature if it brings no new information but just confirms the current one; in other words, if the new feature is totally consistent with the current information. By doing so, the proposed framework could significantly decrease the cost of feature extraction, and hence, the overall cost without decreasing the classification accuracy. Such consistency behavior has not been considered in the previous test-cost sensitive literature. We conduct our experiments on three medical data sets and the results demonstrate that the proposed framework significantly decreases the feature extraction cost without decreasing the classification accuracy.