Prediction with expert advice: on the role of contexts, bandit feedback and risk-awareness

buir.advisor: Tekin, Cem
dc.contributor.author: Ekşioğlu, Kubilay
dc.date.accessioned: 2018-12-25T12:18:45Z
dc.date.available: 2018-12-25T12:18:45Z
dc.date.copyright: 2018-12
dc.date.issued: 2018-12
dc.date.submitted: 2018-12-21
dc.department: Department of Electrical and Electronics Engineering
dc.description: Cataloged from PDF version of article.
dc.description: Thesis (M.S.): Bilkent University, Department of Electrical and Electronics Engineering, İhsan Doğramacı Bilkent University, 2018.
dc.description: Includes bibliographical references (leaves 54-59).
dc.description.abstract: Along with the rapid growth in the size of data generated and collected over time, the need for online algorithms that can provide answers without any offline training has increased considerably. In this thesis, we consider the prediction with expert advice problem under the online learning framework. Specifically, we consider problems where experts have asymmetric information about the sample space. First, we propose an algorithm that selects a subset of the experts and makes predictions based on the advice of this subset. Then, we propose another algorithm that clusters samples in an online manner and makes predictions based on the history of observations and decisions within each cluster. Next, we consider the Safe Bandit, a variant of the Risk-Aware Multi-Armed Bandit, where the goal is to minimize the number of rounds in which a risky arm is chosen. Adopting mean-variance as the risk notion, we define an arm as risky if its mean-variance is higher than a given threshold. Using this, we define a new regret measure called Risk Violation Regret (RVR), which depends on the number of times risky arms are selected. Then, we propose a learning algorithm called Exploration and Exploitation with Risk Thresholds (EXERT), and prove that it achieves O(1) RVR with high probability. Afterwards, we use EXERT in an expert selection problem, where each expert corresponds to a neural network with a reject option. For this, we propose a method to train these neural networks and use them to evaluate the performance of EXERT on real-world datasets.
dc.description.degree: M.S.
dc.description.statementofresponsibility: by Kubilay Ekşioğlu.
dc.embargo.release: 2019-06-21
dc.format.extent: xi, 59 leaves : charts (some color) ; 30 cm.
dc.identifier.itemid: B159204
dc.identifier.uri: http://hdl.handle.net/11693/48211
dc.language.iso: English
dc.publisher: Bilkent University
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Prediction with Expert Advice
dc.subject: Multi Armed Bandits
dc.subject: Online Learning
dc.subject: Neural Networks
dc.title: Prediction with expert advice: on the role of contexts, bandit feedback and risk-awareness
dc.title.alternative: Uzman önerileriyle tahmin: bağlamların, haydut geribildirimin ve risk farkındalığının rolü üzerine
dc.type: Thesis
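
The abstract defines an arm as risky when its mean-variance exceeds a given threshold, and ties the regret measure to how often risky arms are selected. The sketch below illustrates that idea only; it is not EXERT and not the thesis's exact definitions. It assumes one common form of mean-variance (variance minus rho times mean, with rho a risk-tolerance parameter), estimates it from observed samples rather than from the true reward distributions, and uses a plain count of risky selections as a stand-in for a violation measure. The function names and the parameters `rho` and `threshold` are hypothetical.

```python
from statistics import fmean, pvariance


def mean_variance(rewards, rho):
    """Empirical mean-variance of one arm (assumed form: variance - rho * mean)."""
    return pvariance(rewards) - rho * fmean(rewards)


def risky_arms(samples, rho, threshold):
    """Indices of arms whose empirical mean-variance is above the threshold."""
    return {
        arm
        for arm, rewards in enumerate(samples)
        if mean_variance(rewards, rho) > threshold
    }


def risk_violations(choices, risky):
    """Number of rounds in which a risky arm was selected."""
    return sum(1 for arm in choices if arm in risky)


# Hypothetical data: arm 0 pays a steady 0.5, arm 1 alternates between 0 and 1.
samples = [[0.5, 0.5, 0.5, 0.5], [0.0, 1.0, 0.0, 1.0]]
risky = risky_arms(samples, rho=1.0, threshold=-0.3)
violations = risk_violations([0, 1, 1, 0, 1], risky)
```

Both arms have the same mean here, so only the variance term separates them; a learner that keeps this count bounded stops selecting the high-variance arm once it has been identified.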
Files
Original bundle
Name: KubilayEksioglu_10225045.pdf
Size: 624.2 KB
Format: Adobe Portable Document Format
Description: Full printable version
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission