Online minimax optimal density estimation and anomaly detection in nonstationary environments

Limited Access
This item is unavailable until:
2020-08-25
Date
2017-07
Advisor
Kozat, Süleyman Serdar
Publisher
Bilkent University
Language
English
Abstract

Online anomaly detection has attracted significant attention in recent years due to its applications in network monitoring, cybersecurity, surveillance and sensor failure detection. To this end, we introduce an algorithm that sequentially processes data to detect anomalies in time series. Our algorithm consists of two stages: density estimation and anomaly detection. First, we construct a probability density function to model the normal data. Then, we threshold the density of the newly observed data to detect anomalies. We approach this problem from an information theoretic perspective and, for the first time in the literature, propose minimax optimal schemes for both stages to create an optimal anomaly detection algorithm in a strong deterministic sense. For the first stage, we introduce an online density estimator that is minimax optimal for general nonstationary exponential families of distributions without any assumptions on the observation sequence. Our algorithm does not require a priori knowledge of the time horizon, the drift of the underlying distribution or the time instants at which the parameters of the source change. Our results are guaranteed to hold in an individual sequence manner. For the second stage, we propose an online threshold selection scheme that has logarithmic performance bounds against the best threshold chosen in hindsight. Our complete algorithm adaptively updates its parameters in a truly sequential manner to achieve log-linear regrets in both stages. Because it takes a universal prediction perspective on density estimation, our anomaly detection algorithm can be used in an unsupervised, semi-supervised or supervised manner. Through synthetic and real-life experiments, we demonstrate substantial performance gains with respect to the state-of-the-art.
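
The two-stage structure described in the abstract (estimate a density online, then threshold the density of each new observation) can be illustrated with a minimal Python sketch. This is not the minimax optimal estimator or the threshold selection scheme of the thesis. It assumes a single scalar Gaussian (one member of the exponential family), tracks drift with an exponential forgetting factor, and uses a fixed threshold. The class name `OnlineGaussianDetector` and the parameters `threshold`, `forgetting` and `min_var` are hypothetical and chosen only for illustration.

```python
import math


class OnlineGaussianDetector:
    """Illustrative two-stage detector: running Gaussian fit, then a density threshold.

    Assumption: a simple stand-in, not the minimax optimal scheme of the thesis.
    """

    def __init__(self, threshold=1e-3, forgetting=0.99, min_var=1e-6):
        self.threshold = threshold    # hypothetical fixed density threshold
        self.forgetting = forgetting  # hypothetical forgetting factor for nonstationarity
        self.min_var = min_var        # variance floor to keep the density finite
        self.weight = 0.0             # effective number of samples seen
        self.mean = 0.0
        self.var = 1.0

    def density(self, x):
        """Gaussian density of x under the current parameter estimates."""
        norm = math.sqrt(2.0 * math.pi * self.var)
        return math.exp(-((x - self.mean) ** 2) / (2.0 * self.var)) / norm

    def step(self, x):
        """Score x with the current model, then update the model with x."""
        # Stage 2: threshold the density of the new observation.
        is_anomaly = self.density(x) < self.threshold

        # Stage 1: exponentially weighted update of mean and variance.
        self.weight = self.forgetting * self.weight + 1.0
        lr = 1.0 / self.weight
        delta = x - self.mean
        self.mean += lr * delta
        self.var = max((1.0 - lr) * (self.var + lr * delta * delta), self.min_var)
        return is_anomaly


if __name__ == "__main__":
    detector = OnlineGaussianDetector()
    # Warm up on a bounded, slowly varying signal.
    for i in range(200):
        detector.step(0.1 * math.sin(i))
    print(detector.step(0.05))  # a typical value
    print(detector.step(10.0))  # a value far outside the observed range
```

In this sketch the first few observations may be flagged spuriously because the variance estimate starts near its floor, so a warm-up period is assumed. The fixed `threshold` is the piece the thesis replaces with an online selection scheme that competes with the best threshold chosen in hindsight.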
