Online minimax optimal density estimation and anomaly detection in nonstationary environments
Kozat, Süleyman Serdar
Please cite this item using this persistent URL: http://hdl.handle.net/11693/33561
Online anomaly detection has attracted significant attention in recent years due to its applications in network monitoring, cybersecurity, surveillance and sensor failure. To this end, we introduce an algorithm that sequentially processes data to detect anomalies in time series. Our algorithm consists of two stages: density estimation and anomaly detection. First, we construct a probability density function to model the normal data. Then, we threshold the density of the newly observed data to detect anomalies. We approach this problem from an information theoretic perspective and, for the first time in the literature, propose minimax optimal schemes for both stages to create an optimal anomaly detection algorithm in a strong deterministic sense. For the first stage, we introduce an online density estimator that is minimax optimal for general nonstationary exponential-family distributions without any assumptions on the observation sequence. Our algorithm does not require a priori knowledge of the time horizon, the drift of the underlying distribution or the time instants at which the parameters of the source change. Our results are guaranteed to hold in an individual sequence manner. For the second stage, we propose an online threshold selection scheme that has logarithmic performance bounds against the best threshold chosen in hindsight. Our complete algorithm adaptively updates its parameters in a truly sequential manner to achieve log-linear regret in both stages. Because it takes a universal prediction perspective on density estimation, our anomaly detection algorithm can be used in an unsupervised, semi-supervised or supervised manner. Through synthetic and real-life experiments, we demonstrate substantial performance gains with respect to the state-of-the-art.