Browsing by Subject "Bandwidth selection"
Now showing 1 - 3 of 3
Item Open Access
Online anomaly detection with bandwidth optimized hierarchical kernel density estimators (IEEE, 2020)
Kerpicci, M.; Ozkan, H.; Kozat, Süleyman Serdar
We propose a novel unsupervised anomaly detection algorithm that can work for sequential data from any complex distribution in a truly online framework with mathematically proven strong performance guarantees. First, a partitioning tree is constructed to generate a doubly exponentially large hierarchical class of observation space partitions, and every partition region trains an online kernel density estimator (KDE) with its own unique dynamical bandwidth. At each time, the proposed algorithm optimally combines the class estimators to sequentially produce the final density estimate. We mathematically prove that the proposed algorithm learns the optimal partition with kernel bandwidths that are optimized in both a region-specific and a time-varying manner. The estimated density is then compared with a data-adaptive threshold to detect anomalies. Overall, the computational complexity is only linear in both the tree depth and the data length. In our experiments, we observe significant improvements in anomaly detection accuracy compared with state-of-the-art techniques.

Item Open Access
Online anomaly detection with kernel density estimators (2019-07)
Kerpiççi, Mine
We study online anomaly detection in an unsupervised framework and introduce an algorithm to detect anomalies in sequential data. We first sequentially learn the density of the observed data with a novel kernel-based hierarchical approach, for which we also provide a regret bound in a competitive manner against an exponentially large class of estimators. In our approach, we use a binary partitioning tree and apply the nonparametric Kernel Density Estimation (KDE) method at each node of the tree. The partitioning tree allows us not only to generate a large class of estimators, of size doubly exponential in the depth, that we compete against in estimating the density, but also to organize the class hierarchically and so obtain a computationally efficient final estimate. Moreover, we do not assume any underlying distribution for the data, so our algorithm can work for data coming from any unknown, arbitrarily complex distribution. Note that the end-to-end processing in our work is truly online. For this, we exploit a random Fourier kernel expansion for sequentially exact kernel evaluations without repeated access to past data. Our algorithm learns not only the optimal partitioning of the observation space but also the optimal bandwidth, which is locally tuned for the optimal partition. Thus, we solve the bandwidth selection problem in KDE methods in a highly novel and computationally efficient way. Finally, as the data density is learned sequentially in the stream, we compare the estimated density with a threshold to detect anomalies. We also learn the threshold adaptively over time to reach the optimal threshold. In our experiments with synthetic and real datasets, we illustrate significant performance improvements achieved by our method over state-of-the-art anomaly detection algorithms.

Item Open Access
Parametrik olmayan yoğunluk tahmincileri ile ardışık anomali tespiti [Sequential anomaly detection with nonparametric density estimators] (IEEE, 2019-04)
Kerpiççi, Mine; Kozat, Süleyman S.; Özkan, H.
In this paper, we introduce an anomaly detection algorithm that finds anomalies in the observed data with a two-stage method in an unsupervised framework. In the first stage, the density of the sequentially observed data is estimated with a novel kernel-based method. To this end, the observation space is partitioned, and in each region a nonparametric kernel density estimator (KDE) is used without any assumption about the data distribution. Then, the density estimate is compared with a threshold to decide whether the data point is an anomaly. In addition, the bandwidth selection problem in kernel-based methods is solved efficiently. To this end, a set of kernel bandwidths is assigned to each region, and each estimator is driven over time toward the best bandwidth choice for the region it belongs to. Numerical examples show that the introduced algorithm achieves a large performance improvement over anomaly detection methods frequently used in the literature.
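
The abstracts above share one pipeline: estimate the density online with kernels, choose among candidate bandwidths as data arrives, and flag points whose estimated density falls below a threshold. The following Python sketch illustrates that idea in one dimension only and is not the authors' algorithm. It omits the partitioning tree, uses the standard random Fourier feature approximation of the Gaussian kernel, combines a fixed bandwidth set with likelihood-based mixture weights, and uses a fixed threshold instead of a learned one. The names `RFFDensity` and `BandwidthMixture`, the bandwidth set, the feature count, and the threshold value are all assumptions made for illustration.

```python
import math
import random


class RFFDensity:
    """Online Gaussian KDE in 1-D, approximated with random Fourier features."""

    def __init__(self, bandwidth, num_features, rng):
        self.h = bandwidth
        # Frequencies follow the Gaussian kernel's spectral density N(0, 1/h^2).
        self.w = [rng.gauss(0.0, 1.0 / bandwidth) for _ in range(num_features)]
        self.b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
        self.mean_z = [0.0] * num_features  # running mean of feature vectors
        self.n = 0

    def features(self, x):
        scale = math.sqrt(2.0 / len(self.w))
        return [scale * math.cos(w * x + b) for w, b in zip(self.w, self.b)]

    def density(self, x):
        # z(x) . mean_z approximates the average kernel value against all past
        # points, so no past sample has to be stored or revisited.
        if self.n == 0:
            return 0.0
        dot = sum(zi * mi for zi, mi in zip(self.features(x), self.mean_z))
        return dot / (self.h * math.sqrt(2.0 * math.pi))

    def update(self, x):
        self.n += 1
        for i, zi in enumerate(self.features(x)):
            self.mean_z[i] += (zi - self.mean_z[i]) / self.n


class BandwidthMixture:
    """Combines one estimator per candidate bandwidth (hypothetical helper)."""

    def __init__(self, bandwidths, num_features=1000, threshold=0.1, seed=0):
        rng = random.Random(seed)
        self.experts = [RFFDensity(h, num_features, rng) for h in bandwidths]
        self.log_w = [0.0] * len(bandwidths)  # log of unnormalized weights
        self.threshold = threshold  # fixed here; an assumption of this sketch

    def weights(self):
        top = max(self.log_w)
        unnorm = [math.exp(lw - top) for lw in self.log_w]
        total = sum(unnorm)
        return [u / total for u in unnorm]

    def density(self, x):
        return sum(w * e.density(x) for w, e in zip(self.weights(), self.experts))

    def observe(self, x):
        """Score x first, then update weights and estimators with it."""
        per_expert = [e.density(x) for e in self.experts]
        score = sum(w * d for w, d in zip(self.weights(), per_expert))
        for k, expert in enumerate(self.experts):
            if expert.n > 0:
                # Reward bandwidths that assigned high density to the new point;
                # the clamp guards against non-positive approximate densities.
                self.log_w[k] += math.log(max(per_expert[k], 1e-6))
            expert.update(x)
        return score, score < self.threshold


if __name__ == "__main__":
    stream = random.Random(1)
    detector = BandwidthMixture([0.2, 0.5, 1.0])
    for _ in range(400):
        detector.observe(stream.gauss(0.0, 1.0))
    print(detector.observe(0.0))  # typical point
    print(detector.observe(6.0))  # far from the training data
```

Each point is scored before it is used for learning, so the weights at any time depend only on past data. In the works listed above the bandwidth is additionally tuned per partition region and the threshold is adapted over time; both would need to be added on top of this sketch.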