Online learning under adverse settings

Özkan, Hüseyin

buir.advisor	Kozat, S. Serdar
dc.contributor.author	Özkan, Hüseyin
dc.date.accessioned	2016-05-02T07:12:24Z
dc.date.available	2016-05-02T07:12:24Z
dc.date.copyright	2015-05
dc.date.issued	2015-05
dc.date.submitted	01-06-2015
dc.description	Cataloged from PDF version of article.	en_US
dc.description	Includes bibliographical references (leaves 145-164).	en_US
dc.description.abstract	We present novel solutions for contemporary real-life applications that generate data at unforeseen rates and in unpredictable forms, including non-stationarity, corruptions, missing/mixed attributes and high dimensionality. In particular, we introduce novel algorithms for online learning, where the observations are received sequentially and processed only once without being stored, under adverse settings: i) no or limited assumptions can be made about the data source, ii) the observations can be corrupted and iii) the data is to be processed at extremely fast rates. The introduced algorithms are highly effective and efficient with strong mathematical guarantees, and are shown, through comprehensive real-life experiments, to significantly outperform the competitors under such adverse conditions. We develop a novel, highly dynamic ensemble method without any stochastic assumptions on the data source. The presented method is asymptotically guaranteed to perform as well as (i.e., to be competitive against) the best expert in the ensemble, where the competitor, i.e., the best expert, is itself specifically designed to continuously improve over time in a completely data-adaptive manner. In addition, our algorithm achieves significantly superior modeling power (hence, significantly superior prediction performance) through a hierarchical and self-organizing approach while mitigating overtraining issues by combining (taking finite unions of) low-complexity methods. In contrast, the state-of-the-art ensemble techniques are heavily dependent on static and unstructured expert ensembles. In this regard, we rigorously solve the resulting issues, such as the oversensitivity to source statistics and the incompatibility between the modeling power and the computational load/precision. Our results hold uniformly for every possible input stream in the deterministic sense, regardless of whether the source statistics are stationary or non-stationary.
Furthermore, we directly address data corruptions by developing novel, versatile imputation methods and thoroughly demonstrate that anomaly detection, in addition to being an important learning problem in its own right, is extremely effective for corruption detection/imputation purposes. To that end, for the first time in the literature, we develop the online implementation of the Neyman-Pearson characterization for anomalies in stationary or non-stationary fast streaming temporal data. The introduced anomaly detection algorithm maximizes the detection power at a specified, controllable constant false alarm rate with no parameter tuning in a truly online manner. Our algorithms can process any streaming data at extremely fast rates without requiring a training phase or a priori information while bearing strong performance guarantees. Through extensive experiments over real/synthetic benchmark data sets, we also show that our algorithms significantly outperform the state-of-the-art as well as the most recently proposed techniques in the literature, with remarkable adaptation capabilities to non-stationarity.	en_US
dc.description.statementofresponsibility	by Hüseyin Özkan.	en_US
dc.embargo.release	2016-06-01
dc.format.extent	xvi, 164 leaves : charts.	en_US
dc.identifier.itemid	B150302
dc.identifier.uri	http://hdl.handle.net/11693/29022
dc.language.iso	English	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Online Learning	en_US
dc.subject	Supervised learning	en_US
dc.subject	Prediction	en_US
dc.subject	Classification	en_US
dc.subject	Regression	en_US
dc.subject	Anomaly detection	en_US
dc.subject	Big data	en_US
dc.subject	Adverse conditions	en_US
dc.subject	Deterministic analysis	en_US
dc.subject	Worst case	en_US
dc.subject	Non-stationarity	en_US
dc.subject	Concept change	en_US
dc.subject	Self-organizing	en_US
dc.subject	Decision tree	en_US
dc.subject	Hidden Markov model	en_US
dc.subject	HMM	en_US
dc.subject	Partially observable HMM states	en_US
dc.subject	Label errors	en_US
dc.subject	Corruption	en_US
dc.subject	Noise	en_US
dc.subject	Anomaly	en_US
dc.subject	Imputation	en_US
dc.subject	Time series	en_US
dc.subject	Neyman-Pearson	en_US
dc.title	Online learning under adverse settings	en_US
dc.title.alternative	Karşıt koşullar altında çevrimiçi öğrenme	en_US
dc.type	Thesis	en_US
thesis.degree.discipline	Electrical and Electronic Engineering
thesis.degree.grantor	Bilkent University
thesis.degree.level	Doctoral
thesis.degree.name	Ph.D. (Doctor of Philosophy)

Files

Original bundle

Name: thesis.pdf
Size: 3.34 MB
Format: Adobe Portable Document Format
Description: Full printable version

Collections

Graduate School of Engineering and Science