Unsupervised concept drift detection using sliding windows: two contributions

Gözüaçık, Ömer

Unsupervised concept drift detection using sliding windows: two contributions

Available

The embargo period has ended, and this item is now available.

Files

10366616.pdf (4.41 MB)

Date

2020-10

Authors

Gözüaçık, Ömer

Advisor

Can, Fazlı

BUIR Usage Stats

2
views

98
downloads

Abstract

Data stream mining has become an important research area over the past decade due to the increasing amount of data available today. Sources from various domains generate limitless volume of data in temporal order. Such data are referred to as data streams, and generally, they are nonstationary as the characteristics of the data evolve over time. This phenomenon is called concept drift, and it is an issue of great importance in the literature since it makes models outdated and decreases their predictive performance. In the presence of concept drift, adapting the change in data is necessary to have more robust and effective classifiers. Drift detectors are designed to run jointly with the classification models, updating them when a significant change in the data distribution is observed. In this study, we propose two unsupervised concept drift detection methods: D3 and OCDD. In D3, we use a discriminative classifier over a sliding window to monitor the change in the distribution of data. When the old and the new data are separable with the discriminative classifier, a drift is signaled. In OCDD, we use a one-class classifier over a sliding window. We monitor the number of outliers identified in the sliding window. We claim that the number of outliers are the signs of a new concept, and define concept drift detection as the continuous form of anomaly detection. A drift is signaled if the percentage of the outliers are over a pre-determined threshold. We perform a comprehensive evaluation on the latest and the most prevalent concept drift detectors using 13 datasets. The results show that OCDD outperforms the other methods by producing models with significantly better predictive performances on both real-world and synthetic datasets. D3 is on par with the other methods.

Keywords

Data stream, Concept drift

Degree Discipline

Computer Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)

Permalink

http://hdl.handle.net/11693/54498

Collections

Graduate School of Engineering and Science

Language

English

Type

Thesis

Full item page

Unsupervised concept drift detection using sliding windows: two contributions

Files

Date

Authors

Editor(s)

Advisor

Supervisor

Co-Advisor

Co-Supervisor

Instructor

BUIR Usage Stats

Series

Abstract

Source Title

Publisher

Course

Other identifiers

Book Title

Keywords

Degree Discipline

Degree Level

Degree Name

Citation

Permalink

Published Version (Please cite this version)

Collections

Language

Type

Unsupervised concept drift detection using sliding windows: two contributions

Files

Date

Authors

Editor(s)

Advisor

Supervisor

Co-Advisor

Co-Supervisor

Instructor

BUIR Usage Stats

Share

Series

Abstract

Source Title

Publisher

Course

Other identifiers

Book Title

Keywords

Degree Discipline

Degree Level

Degree Name

Citation

Permalink

Published Version (Please cite this version)

Collections

Language

Type