Unsupervised concept drift detection using sliding windows: two contributions
Author(s)
Advisor
Can, Fazlı
Date
2020-10
Publisher
Bilkent University
Language
English
Type
Thesis
Abstract
Data stream mining has become an important research area over the past decade
due to the increasing amount of data available today. Sources from various domains generate a limitless volume of data in temporal order. Such data are referred
to as data streams, and generally, they are nonstationary as the characteristics
of the data evolve over time. This phenomenon is called concept drift, and it
is an issue of great importance in the literature since it makes models outdated
and decreases their predictive performance. In the presence of concept drift,
adapting to the change in the data is necessary to keep classifiers robust and effective. Drift detectors are designed to run jointly with the classification models,
updating them when a significant change in the data distribution is observed.
In this study, we propose two unsupervised concept drift detection methods: D3
and OCDD. In D3, we use a discriminative classifier over a sliding window to
monitor the change in the distribution of data. When the old and the new data
are separable with the discriminative classifier, a drift is signaled. In OCDD,
we use a one-class classifier over a sliding window. We monitor the number of
outliers identified in the sliding window. We claim that outliers are
signs of a new concept, and define concept drift detection as the continuous
form of anomaly detection. A drift is signaled if the percentage of outliers
exceeds a predetermined threshold. We perform a comprehensive evaluation of
the latest and most prevalent concept drift detectors using 13 datasets. The
results show that OCDD outperforms the other methods by producing models
with significantly better predictive performance on both real-world and synthetic
datasets. D3 is on par with the other methods.
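
The two detection ideas described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, and everything below is an assumption made for illustration: the function names (`d3_style_drift`, `ocdd_style_drift`, `detect_stream`), the window sizes and thresholds, the use of logistic regression scored by AUC on its own training data as the discriminative classifier, and the centroid-distance model that stands in for a real one-class classifier. The window update rule in `detect_stream` is likewise a guess, not taken from the abstract.

```python
import math


def auc(scores, labels):
    """Chance that a random 'new' sample outscores a random 'old' one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def d3_style_drift(old, new, auc_threshold=0.7, epochs=200, lr=0.1):
    """D3-style check: label old samples 0 and new samples 1, fit a logistic
    regression (assumed classifier), and signal drift if the two are separable."""
    X = old + new
    y = [0] * len(old) + [1] * len(new)
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):  # plain batch gradient descent on the log loss
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    scores = [sum(wj * xj for wj, xj in zip(w, xi)) + b for xi in X]
    # Simplification: AUC is measured on the training data, not on a held-out split.
    return auc(scores, y) >= auc_threshold


def ocdd_style_drift(old, new, outlier_threshold=0.5, quantile=0.95):
    """OCDD-style check: fit a one-class model on old data and signal drift if the
    share of outliers among new data exceeds the threshold. The model here is a
    stand-in: a point is an outlier if it lies beyond a quantile of the old
    samples' distances to their centroid."""
    n, d = len(old), len(old[0])
    center = [sum(x[j] for x in old) / n for j in range(d)]
    dists = sorted(math.dist(x, center) for x in old)
    radius = dists[min(n - 1, int(quantile * n))]
    outliers = sum(math.dist(x, center) > radius for x in new)
    return outliers / len(new) > outlier_threshold


def detect_stream(stream, detector, old_size=100, new_size=50):
    """Slide a window over the stream and yield the indices where drift is signaled."""
    window = []
    for i, x in enumerate(stream):
        window.append(x)
        if len(window) < old_size + new_size:
            continue
        if detector(window[:old_size], window[old_size:]):
            yield i
            window = window[old_size:]  # assumed rule: keep only the new concept
        else:
            window = window[new_size:]  # assumed rule: slide forward by one batch


if __name__ == "__main__":
    # Synthetic stream with one abrupt change in the feature's range at index 300.
    stream = [[(i % 10) / 10] for i in range(300)]
    stream += [[5 + (i % 10) / 10] for i in range(300)]
    print("OCDD-style detections:", list(detect_stream(stream, ocdd_style_drift)))
    print("D3-style detections:", list(detect_stream(stream, d3_style_drift)))
```

Neither check uses class labels, which is what makes both detectors unsupervised: D3 asks whether old and new samples can be told apart, while OCDD asks how many new samples look anomalous relative to the old ones. In practice the stand-in models would be replaced by a proper discriminative classifier and a proper one-class classifier, and the thresholds would need tuning per dataset.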