Multi-armed bandit algorithms for communication networks and healthcare

Demirel, İlker

Files

B161020.pdf (3.35 MB)

Date

2022-06

Authors

Demirel, İlker

Advisor

Tekin, Cem

Abstract

Multi-armed bandits (MAB) is a well-established sequential decision-making framework. While the simplest MAB framework is useful in modeling a wide range of real-world applications, from adaptive clinical trial design to financial portfolio management, it requires further extensions for other problems. We propose three novel MAB algorithms, with applications to bolus-insulin dose recommendation in type-1 diabetes, best channel identification in cognitive radio networks, and online recommender systems. First, we introduce and study the “safe leveling” problem, where the learner's objective is to keep the arm outcomes close to a target level rather than to maximize them. We propose a novel algorithm, ESCADA, with cumulative regret and safety guarantees. We demonstrate its effectiveness against straightforward adaptations of standard MAB algorithms to the “leveling” task. Next, we study the “federated multi-armed bandit” (FMAB) problem, where a cohort of clients play the same MAB game to learn the globally best arm. We consider adversarial “Byzantine” clients that disturb the learning process with false model updates and propose a robust algorithm, Fed-MoM-UCB. We provide theoretical guarantees on Fed-MoM-UCB and identify the performance sacrifices that robustness requires. Finally, we study the “combinatorial multi-armed bandits with probabilistically triggered arms” (CMAB-PTA) problem, where the learner chooses a set of arms at each round that may trigger other arms. CMAB-PTA is useful in modeling various problems such as influence maximization on graphs and online recommendation systems. We propose a Gaussian process-based algorithm, ComGP-UCB. We provide upper bounds on its regret and demonstrate its effectiveness against state-of-the-art baselines when arm outcomes are correlated.
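To make the “leveling” objective concrete, the following minimal Python sketch contrasts it with the usual maximization objective. It is not ESCADA and carries no safety or regret guarantees. It only illustrates, under assumed names and made-up numbers, what it means to prefer the arm whose mean outcome is closest to a target level. The functions `best_arm_maximize`, `best_arm_level`, `leveling_regret`, and `greedy_leveling`, as well as the arm means and target used below, are hypothetical and do not come from the thesis.

```python
import random


def best_arm_maximize(means):
    """Standard MAB objective: the arm with the largest mean outcome."""
    return max(range(len(means)), key=lambda a: means[a])


def best_arm_level(means, target):
    """Leveling objective: the arm whose mean outcome is closest to the target."""
    return min(range(len(means)), key=lambda a: abs(means[a] - target))


def leveling_regret(means, target, chosen):
    """Per-round regret under leveling: extra distance from the target
    incurred by `chosen` compared with the best leveling arm."""
    gaps = [abs(m - target) for m in means]
    return gaps[chosen] - min(gaps)


def greedy_leveling(means, target, rounds=200, noise=1.0, seed=0):
    """Naive baseline (an assumption for illustration only): play each arm
    once, then always pick the arm whose empirical mean is closest to target."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    total_regret = 0.0
    for t in range(rounds):
        if t < k:
            arm = t  # initial pass: try every arm once
        else:
            arm = min(range(k), key=lambda a: abs(sums[a] / counts[a] - target))
        outcome = means[arm] + rng.gauss(0.0, noise)  # noisy arm outcome
        counts[arm] += 1
        sums[arm] += outcome
        total_regret += leveling_regret(means, target, arm)
    return counts, total_regret


if __name__ == "__main__":
    means = [70.0, 110.0, 180.0]  # hypothetical mean outcomes of three arms
    target = 112.5                # hypothetical target level
    print(best_arm_maximize(means), best_arm_level(means, target))
    print(greedy_leveling(means, target))
```

With these made-up numbers the maximizing arm and the leveling arm differ, which is why a standard MAB algorithm cannot be applied to the leveling task unchanged. The greedy rule above explores unsafely during its initial pass, which is the kind of behavior a safety guarantee is meant to rule out.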

Keywords

Multi-armed bandits, Federated learning, Machine learning, Communication networks, Healthcare

Degree Discipline

Electrical and Electronic Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)

Permalink

http://hdl.handle.net/11693/80674

Collections

Graduate School of Engineering and Science

Language

English

Type

Thesis
