Multi-armed bandit algorithms for communication networks and healthcare

buir.advisor: Tekin, Cem
dc.contributor.author: Demirel, İlker
dc.date.accessioned: 2022-06-10T06:05:59Z
dc.date.available: 2022-06-10T06:05:59Z
dc.date.copyright: 2022-06
dc.date.issued: 2022-06
dc.date.submitted: 2022-06-09
dc.description: Cataloged from PDF version of article. [en_US]
dc.description: Thesis (Master's): Bilkent University, Department of Electrical and Electronics Engineering, İhsan Doğramacı Bilkent University, 2022. [en_US]
dc.description: Includes bibliographical references (leaves 84-99). [en_US]
dc.description.abstract: Multi-armed bandits (MAB) is a well-established sequential decision-making framework. While the simplest MAB framework is useful in modeling a wide range of real-world applications, from adaptive clinical trial design to financial portfolio management, it requires further extensions for other problems. We propose three novel MAB algorithms that are useful in optimizing bolus-insulin dose recommendation in type-1 diabetes, best channel identification in cognitive radio networks, and online recommender systems. First, we introduce and study the “safe leveling” problem, where the learner's objective is to keep the arm outcomes close to a target level rather than maximize them. We propose a novel algorithm, ESCADA, with cumulative regret and safety guarantees. We demonstrate its effectiveness against straightforward adaptations of standard MAB algorithms to the “leveling task”. Next, we study the “federated multi-armed bandit” (FMAB) problem, where a cohort of clients play the same MAB game to learn the globally best arm. We consider adversarial “Byzantine” clients disturbing the learning process with false model updates and propose a robust algorithm, Fed-MoM-UCB. We provide theoretical guarantees on Fed-MoM-UCB while identifying the performance sacrifices that robustness requires. Finally, we study the “combinatorial multi-armed bandits with probabilistically triggered arms” (CMAB-PTA) problem, where the learner chooses a set of arms at each round that may trigger other arms. CMAB-PTA is useful in modeling various problems such as influence maximization on graphs and online recommendation systems. We propose a Gaussian process-based algorithm, ComGP-UCB. We provide upper bounds on its regret and demonstrate its effectiveness against state-of-the-art baselines when arm outcomes are correlated. [en_US]
dc.description.provenance: Submitted by Betül Özen (ozen@bilkent.edu.tr) on 2022-06-10T06:05:59Z No. of bitstreams: 1 B161020.pdf: 3511711 bytes, checksum: 3c873be42fed5b97a5eadc159801727d (MD5) [en]
dc.description.provenance: Made available in DSpace on 2022-06-10T06:05:59Z (GMT). No. of bitstreams: 1 B161020.pdf: 3511711 bytes, checksum: 3c873be42fed5b97a5eadc159801727d (MD5) Previous issue date: 2022-06 [en]
dc.description.statementofresponsibility: by İlker Demirel [en_US]
dc.embargo.release: 2022-12-06
dc.format.extent: xiii, 115 leaves : charts ; 30 cm. [en_US]
dc.identifier.itemid: B161020
dc.identifier.uri: http://hdl.handle.net/11693/80674
dc.language.iso: English [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Multi-armed bandits [en_US]
dc.subject: Federated learning [en_US]
dc.subject: Machine learning [en_US]
dc.subject: Communication networks [en_US]
dc.subject: Healthcare [en_US]
dc.title: Multi-armed bandit algorithms for communication networks and healthcare [en_US]
dc.title.alternative: İletişim ağları ve sağlık uygulamaları için çok kollu haydut algoritmaları [en_US]
dc.type: Thesis [en_US]
thesis.degree.discipline: Electrical and Electronic Engineering
thesis.degree.grantor: Bilkent University
thesis.degree.level: Master's
thesis.degree.name: MS (Master of Science)
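
Illustrative note on the abstract's "leveling task": the abstract mentions straightforward adaptations of standard MAB algorithms in which the learner keeps arm outcomes close to a target level instead of maximizing them. The Python sketch below shows one way such an adaptation might look. It is not ESCADA and is not taken from the thesis. It runs the textbook UCB1 rule on a surrogate reward, 1 - |outcome - target|, and it has no safety mechanism. The function name, the Gaussian noise model, the clipping of outcomes to [0, 1], and all parameter values are assumptions made for illustration only.

```python
import math
import random


def ucb1_leveling(arm_means, target, rounds, noise=0.05, seed=0):
    """Run UCB1 on the surrogate reward 1 - |outcome - target|.

    Hypothetical setup: each arm returns a noisy outcome clipped to [0, 1].
    Returns how many times each arm was played.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms          # number of plays per arm
    reward_sums = [0.0] * n_arms   # accumulated surrogate reward per arm

    def ucb_index(arm, t):
        # Empirical mean reward plus the standard UCB1 exploration bonus.
        mean = reward_sums[arm] / counts[arm]
        return mean + math.sqrt(2.0 * math.log(t) / counts[arm])

    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # initialization: play every arm once
        else:
            arm = max(range(n_arms), key=lambda a: ucb_index(a, t))

        # Assumed outcome model: Gaussian noise around the arm mean, clipped.
        outcome = min(1.0, max(0.0, rng.gauss(arm_means[arm], noise)))

        # Leveling objective: reward is highest when the outcome hits the target.
        reward = 1.0 - abs(outcome - target)

        counts[arm] += 1
        reward_sums[arm] += reward

    return counts


if __name__ == "__main__":
    plays = ucb1_leveling([0.2, 0.5, 0.9], target=0.5, rounds=2000)
    print(plays)
```

In this toy setting the arm whose mean outcome sits at the target, not the arm with the largest mean, should collect most of the plays. That is the only difference from reward-maximizing UCB1 that the sketch is meant to show.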

Files

Original bundle

Name: B161020.pdf
Size: 3.35 MB
Format: Adobe Portable Document Format
Description: Full printable version

License bundle

Name: license.txt
Size: 1.69 KB
Format: Item-specific license agreed upon to submission
Description: