Online anomaly detection with kernel density estimators

buir.advisor: Kozat, Süleyman Serdar
dc.contributor.author: Kerpiççi, Mine
dc.date.accessioned: 2019-08-02T08:02:16Z
dc.date.available: 2019-08-02T08:02:16Z
dc.date.copyright: 2019-07
dc.date.issued: 2019-07
dc.date.submitted: 2019-07-29
dc.department: Department of Electrical and Electronics Engineering [en_US]
dc.description: Cataloged from PDF version of article. [en_US]
dc.description: Thesis (M.S.) : Bilkent University, Department of Electrical and Electronics Engineering, İhsan Doğramacı Bilkent University, 2019. [en_US]
dc.description: Includes bibliographical references (leaves 40-44). [en_US]
dc.description.abstract: We study online anomaly detection in an unsupervised framework and introduce an algorithm to detect anomalies in sequential data. We first sequentially learn the density of the observed data with a novel kernel-based hierarchical approach, for which we also provide a regret bound in a competitive manner against an exponentially large class of estimators. In our approach, we use a binary partitioning tree and apply the nonparametric Kernel Density Estimation (KDE) method at each node of the tree. The partitioning tree allows us not only to generate a large class of estimators, of size doubly exponential in the depth, that we compete against in estimating the density, but also to organize the class hierarchically to obtain a computationally efficient final estimate. Moreover, we do not assume any underlying distribution for the data, so our algorithm can work for data coming from any unknown, arbitrarily complex distribution. The end-to-end processing in our work is truly online: we exploit a random Fourier kernel expansion for sequentially exact kernel evaluations without repeated access to past data. Our algorithm learns not only the optimal partitioning of the observation space but also the optimal bandwidth, which is locally tuned for the optimal partition. Thus, we solve the bandwidth selection problem in KDE methods in a novel and computationally efficient way. Finally, as the data density is learned sequentially from the stream, we compare the estimated density with a threshold to detect anomalies. We also learn the threshold adaptively over time to achieve the optimal threshold. In our experiments with synthetic and real datasets, we illustrate significant performance improvements achieved by our method over state-of-the-art anomaly detection algorithms. [en_US]
dc.description.degree: M.S. [en_US]
dc.description.provenance: Submitted by Betül Özen (ozen@bilkent.edu.tr) on 2019-08-02T08:02:16Z No. of bitstreams: 1 Mine Kerpicci - Thesis.pdf: 814903 bytes, checksum: 38114070a2ba051814eac87d6a25c5ac (MD5) [en]
dc.description.provenance: Made available in DSpace on 2019-08-02T08:02:16Z (GMT). No. of bitstreams: 1 Mine Kerpicci - Thesis.pdf: 814903 bytes, checksum: 38114070a2ba051814eac87d6a25c5ac (MD5) Previous issue date: 2019-07 [en]
dc.description.statementofresponsibility: by Mine Kerpiççi [en_US]
dc.embargo.release: 2020-01-29
dc.format.extent: xi, 44 leaves : charts (some color) ; 30 cm. [en_US]
dc.identifier.itemid: B160106
dc.identifier.uri: http://hdl.handle.net/11693/52290
dc.language.iso: English [en_US]
dc.publisher: Bilkent University [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Online anomaly detection [en_US]
dc.subject: Kernel density estimation [en_US]
dc.subject: Bandwidth selection [en_US]
dc.subject: Regret analysis [en_US]
dc.title: Online anomaly detection with kernel density estimators [en_US]
dc.title.alternative: Çekirdek yoğunluk tahmincileri ile çevrimiçi anomali tespiti [en_US]
dc.type: Thesis [en_US]
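
Illustrative sketch (not part of the catalog record, and not the thesis's algorithm): the abstract describes scoring each new observation by an online kernel density estimate built from a random Fourier expansion and flagging it when the density falls below a threshold. The Python below shows only that core idea under loud simplifying assumptions: a single Gaussian kernel with one fixed bandwidth, no partitioning tree, and a fixed rather than learned threshold. The names `OnlineRFFDensity`, `detect`, `bandwidth`, `n_features`, and `warmup` are hypothetical, and in this sketch the random features approximate the kernel rather than evaluate it exactly.

```python
import numpy as np


class OnlineRFFDensity:
    """Streaming Gaussian KDE approximated with random Fourier features.

    Assumption: one fixed bandwidth; the thesis instead tunes bandwidths
    locally over a partitioning tree, which is not reproduced here.
    """

    def __init__(self, dim, bandwidth=0.5, n_features=2000, seed=0):
        rng = np.random.default_rng(seed)
        # Frequencies drawn from the Gaussian kernel's spectral density.
        self.W = rng.normal(scale=1.0 / bandwidth, size=(n_features, dim))
        self.b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        self.scale = np.sqrt(2.0 / n_features)
        # Normalizing constant of a Gaussian kernel in `dim` dimensions.
        self.norm = (2.0 * np.pi * bandwidth**2) ** (dim / 2.0)
        self.mean_z = np.zeros(n_features)  # running mean of feature vectors
        self.n = 0

    def _features(self, x):
        return self.scale * np.cos(self.W @ np.asarray(x, dtype=float) + self.b)

    def density(self, x):
        """Approximate KDE value at x using only the running feature mean."""
        if self.n == 0:
            return 0.0
        # The approximation can dip below zero, so clip at zero.
        return max(float(self._features(x) @ self.mean_z) / self.norm, 0.0)

    def update(self, x):
        """Fold x into the running mean; past samples are never revisited."""
        self.n += 1
        self.mean_z += (self._features(x) - self.mean_z) / self.n


def detect(stream, dim, threshold, warmup=50):
    """Flag each sample whose estimated density is below a fixed threshold."""
    est = OnlineRFFDensity(dim)
    flags = []
    for t, x in enumerate(stream):
        score = est.density(x)  # score before the sample is absorbed
        flags.append(t >= warmup and score < threshold)
        est.update(x)
    return flags
```

A possible use, again with made-up data: build an array of shape `(n_samples, dim)`, call `detect(data, dim=data.shape[1], threshold=0.05)`, and read the returned list of booleans in stream order. Memory stays constant in the stream length because only the running feature mean is stored.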

Files

Original bundle
Name: Mine Kerpicci - Thesis.pdf
Size: 795.8 KB
Format: Adobe Portable Document Format
Description: Full printable version
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission