Implicit concept drift detection for multi-label data streams

Date
2022-01
Publisher
Bilkent University
Language
English
Type
Thesis
Abstract

Many real-world applications adopt multi-label data streams as the need for algorithms to deal with rapidly generated data increases. For such streams, changes in data distribution, also known as concept drift, cause the existing classification models to rapidly lose their effectiveness. To assist the classifiers, we propose a novel algorithm called Label Dependency Drift Detector (LD3), an implicit (unsupervised) concept drift detector using label dependencies within the data for multi-label data streams. Our study exploits the dynamic temporal dependencies between labels using a label influence ranking method, which leverages a data fusion algorithm and uses the produced ranking to detect concept drift. LD3 is the first unsupervised concept drift detection algorithm in the multi-label classification problem area. In this study, we perform an extensive evaluation of LD3 by comparing it with 14 prevalent supervised concept drift detection algorithms that we adapt to the problem area using 12 datasets and a baseline classifier. The results show that LD3 provides between 19.8% and 68.6% better predictive performance than comparable detectors on both real-world and synthetic data streams.

Keywords
Big data, Multi-label data stream, Multi-label classification, Concept drift, Drift detection, Data fusion