Online learning under adverse settings

buir.advisor: Kozat, S. Serdar
dc.contributor.author: Özkan, Hüseyin
dc.date.accessioned: 2016-05-02T07:12:24Z
dc.date.available: 2016-05-02T07:12:24Z
dc.date.copyright: 2015-05
dc.date.issued: 2015-05
dc.date.submitted: 01-06-2015
dc.description: Cataloged from PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (leaves 145-164). [en_US]
dc.description: Thesis (Ph. D.): Bilkent University, Department of Electrical and Electronics Engineering, İhsan Doğramacı Bilkent University, 2015. [en_US]
dc.description.abstract: We present novel solutions for contemporary real-life applications that generate data at unforeseen rates and in unpredictable forms, including non-stationarity, corruption, missing or mixed attributes, and high dimensionality. In particular, we introduce novel algorithms for online learning, where observations are received sequentially and processed only once without being stored, under adverse settings: i) no or only limited assumptions can be made about the data source, ii) the observations can be corrupted, and iii) the data must be processed at extremely fast rates. The introduced algorithms are effective and efficient, carry strong mathematical guarantees, and are shown through comprehensive real-life experiments to significantly outperform their competitors under such adverse conditions. We develop a novel, highly dynamic ensemble method that makes no stochastic assumptions about the data source. The method is asymptotically guaranteed to perform as well as (i.e., to be competitive against) the best expert in the ensemble, where this competitor is itself designed to improve continuously over time in a completely data-adaptive manner. In addition, our algorithm achieves significantly superior modeling power, and hence significantly superior prediction performance, through a hierarchical and self-organizing approach, while mitigating overtraining by combining (taking finite unions of) low-complexity methods. In contrast, state-of-the-art ensemble techniques depend heavily on static and unstructured expert ensembles. In this regard, we rigorously resolve the resulting issues, such as oversensitivity to source statistics and the incompatibility between modeling power and computational load/precision. Our results hold uniformly for every possible input stream in the deterministic sense, regardless of whether the source statistics are stationary or non-stationary.
Furthermore, we directly address data corruption by developing novel, versatile imputation methods, and we thoroughly demonstrate that anomaly detection, in addition to being an important stand-alone learning problem, is extremely effective for corruption detection and imputation. To that end, for the first time in the literature, we develop an online implementation of the Neyman-Pearson characterization of anomalies in stationary or non-stationary fast-streaming temporal data. The introduced anomaly detection algorithm maximizes the detection power at a specified, controllable constant false alarm rate, with no parameter tuning, in a truly online manner. Our algorithms can process any streaming data at extremely fast rates without requiring a training phase or a priori information, while bearing strong performance guarantees. Through extensive experiments over real and synthetic benchmark data sets, we also show that our algorithms significantly outperform the state of the art as well as the most recently proposed techniques in the literature, with remarkable adaptation to non-stationarity. [en_US]
dc.description.provenance: Submitted by Betül Özen (ozen@bilkent.edu.tr) on 2016-05-02T07:12:24Z No. of bitstreams: 1 thesis.pdf: 3505007 bytes, checksum: ad0f438ba0b866787e7d9f62174d2f03 (MD5) [en]
dc.description.provenance: Made available in DSpace on 2016-05-02T07:12:24Z (GMT). No. of bitstreams: 1 thesis.pdf: 3505007 bytes, checksum: ad0f438ba0b866787e7d9f62174d2f03 (MD5) Previous issue date: 2015-05 [en]
dc.description.statementofresponsibility: by Hüseyin Özkan. [en_US]
dc.embargo.release: 2016-06-01
dc.format.extent: xvi, 164 leaves : charts. [en_US]
dc.identifier.itemid: B150302
dc.identifier.uri: http://hdl.handle.net/11693/29022
dc.language.iso: English [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Online Learning [en_US]
dc.subject: Supervised learning [en_US]
dc.subject: Prediction [en_US]
dc.subject: Classification [en_US]
dc.subject: Regression [en_US]
dc.subject: Anomaly detection [en_US]
dc.subject: Big data [en_US]
dc.subject: Adverse conditions [en_US]
dc.subject: Deterministic analysis [en_US]
dc.subject: Worst case [en_US]
dc.subject: Non-stationarity [en_US]
dc.subject: Concept change [en_US]
dc.subject: Self-organizing [en_US]
dc.subject: Decision tree [en_US]
dc.subject: Hidden markov model [en_US]
dc.subject: HMM [en_US]
dc.subject: Partially observable HMM states [en_US]
dc.subject: Label errors [en_US]
dc.subject: Corruption [en_US]
dc.subject: Noise [en_US]
dc.subject: Anomaly [en_US]
dc.subject: Imputation [en_US]
dc.subject: Time series [en_US]
dc.subject: Neyman-pearson [en_US]
dc.title: Online learning under adverse settings [en_US]
dc.title.alternative: Karşıt koşullar altında çevrimiçi öğrenme [en_US]
dc.type: Thesis [en_US]
thesis.degree.discipline: Electrical and Electronic Engineering
thesis.degree.grantor: Bilkent University
thesis.degree.level: Doctoral
thesis.degree.name: Ph.D. (Doctor of Philosophy)
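
The abstract's guarantee of performing asymptotically as well as the best expert, for every input stream and without stochastic assumptions, is the kind of statement made by classical expert-mixture methods. As a rough illustration only, the sketch below uses the textbook exponentially weighted average forecaster over a fixed set of experts. It is not the thesis's hierarchical, self-organizing algorithm, and the function name `exp_weighted_forecast`, the learning rate `eta`, and the toy data are all assumptions made for this example.

```python
import math


def exp_weighted_forecast(expert_preds, outcomes, eta):
    """Run the exponentially weighted average forecaster (textbook version).

    expert_preds[t][k] is expert k's prediction in [0, 1] at round t;
    outcomes[t] is the true value in [0, 1], revealed after predicting.
    Returns (forecaster_loss, per_expert_losses) under squared loss.
    """
    n_experts = len(expert_preds[0])
    log_w = [0.0] * n_experts      # log-weights; all experts start equal
    cum_loss = [0.0] * n_experts   # cumulative loss of each expert
    total = 0.0                    # cumulative loss of the mixture

    for preds, y in zip(expert_preds, outcomes):
        # Normalize the weights in a numerically stable way.
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        z = sum(w)
        # Predict with the weighted average, then observe y (single pass).
        y_hat = sum(wk * p for wk, p in zip(w, preds)) / z
        total += (y_hat - y) ** 2
        # Penalize each expert exponentially in its own loss.
        for k, p in enumerate(preds):
            loss = (p - y) ** 2
            cum_loss[k] += loss
            log_w[k] -= eta * loss
    return total, cum_loss


# Toy stream (made up): expert 0 always says 0.2, expert 1 always says 0.8.
T = 200
outcomes = [0.8 if t % 10 else 0.2 for t in range(T)]
expert_preds = [[0.2, 0.8] for _ in range(T)]
eta = math.sqrt(8 * math.log(2) / T)  # standard tuning for a known horizon T
total, cum_loss = exp_weighted_forecast(expert_preds, outcomes, eta)
regret = total - min(cum_loss)
```

For a convex loss bounded in [0, 1], the classical analysis of this forecaster bounds `regret` by sqrt((T/2) ln N) for N experts, whatever the input sequence. That is the sense in which a mixture can be competitive against the best fixed expert; the thesis's claim concerns a competitor that itself adapts over time, which this sketch does not attempt to model.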

Files

Original bundle

Name: thesis.pdf
Size: 3.34 MB
Format: Adobe Portable Document Format
Description: Full printable version

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission