Risk-averse multi-armed bandit problem

Malekipirbazari, Milad

Available

The embargo period has ended, and this item is now available.

Files

10414096.pdf (1.79 MB)

Date

2021-08

Authors

Malekipirbazari, Milad

Advisor

Çavuş İyigün, Özlem

BUIR Usage Stats

12 views, 83 downloads

Abstract

In the classical multi-armed bandit (MAB) problem, the aim is to find a policy that maximizes the expected total reward, which implicitly assumes that the decision maker is risk-neutral. In some real-life applications, however, decision makers are risk-averse. In this study, we design a new setting for the classical MAB problem based on the concept of dynamic risk measures, where the aim is to find a policy with the best risk-adjusted total discounted outcome. We provide a theoretical analysis of the MAB problem in this novel setting and propose two different priority-index heuristics that give risk-averse allocation indices with structures similar to the Gittins index. The first heuristic is based on Lagrangian duality, and its indices are expressed as the Lagrangian multiplier corresponding to the activation constraint. For the second heuristic, we present a theoretical analysis based on Whittle's retirement problem and propose a generalized version of the restart-in-state formulation of the Gittins index to compute the proposed risk-averse allocation indices. Finally, as a practical application of the proposed methods, we focus on the optimal design of clinical trials and apply our risk-averse MAB approach to perform risk-averse treatment allocation based on a Bayesian Bernoulli model. We evaluate the performance of our approach against other allocation rules, including fixed randomization.
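To make the restart-in-state idea concrete, the following is a minimal sketch, not the thesis's algorithm. It computes the classical (risk-neutral) Gittins index of a single Bernoulli arm with a Beta(a, b) posterior by value iteration on the restart-in-state problem. The one-step evaluation is pluggable, so the expectation can be swapped for a risk-averse mapping. The names `gittins_index_restart`, `expectation`, and `mean_worst_case`, the mean/worst-case mixture used as the risk mapping, the truncation depth, and the discount factor are all assumptions made for illustration. How the thesis actually generalizes the formulation is not stated in the abstract.

```python
def expectation(p, v_success, v_failure):
    """Risk-neutral one-step evaluation of a two-outcome reward-to-go."""
    return p * v_success + (1.0 - p) * v_failure


def mean_worst_case(lam):
    """Hypothetical risk-averse mapping: mix of the mean and the worst outcome.

    lam = 0 gives the expectation; lam = 1 gives the pure worst case.
    """
    def rho(p, v_success, v_failure):
        mean = p * v_success + (1.0 - p) * v_failure
        return (1.0 - lam) * mean + lam * min(v_success, v_failure)
    return rho


def gittins_index_restart(a0, b0, beta=0.9, depth=50,
                          one_step=expectation, tol=1e-9, max_iter=10000):
    """Index of a Bernoulli arm with Beta(a0, b0) posterior.

    In every state the decision maker may either continue from that state
    or restart from (a0, b0); the index is (1 - beta) times the value at
    (a0, b0). The posterior tree is truncated after `depth` pulls.
    """
    # V[i][j] is the value at posterior Beta(a0 + i, b0 + j), with i + j <= depth.
    V = [[0.0] * (depth + 1 - i) for i in range(depth + 1)]

    def cont(i, j):
        # Value of pulling the arm once from state (i, j): reward 1 on success.
        a, b = a0 + i, b0 + j
        p = a / (a + b)
        return one_step(p, 1.0 + beta * V[i + 1][j], beta * V[i][j + 1])

    for _ in range(max_iter):
        restart = cont(0, 0)  # value of restarting from the initial state
        delta = 0.0
        for n in range(depth, -1, -1):
            for i in range(n + 1):
                j = n - i
                if n == depth:
                    # Truncation approximation: freeze the posterior mean.
                    a, b = a0 + i, b0 + j
                    keep = (a / (a + b)) / (1.0 - beta)
                else:
                    keep = cont(i, j)
                new = max(keep, restart)
                delta = max(delta, abs(new - V[i][j]))
                V[i][j] = new
        if delta < tol:
            break
    return (1.0 - beta) * V[0][0]


neutral = gittins_index_restart(1, 1)
averse = gittins_index_restart(1, 1, one_step=mean_worst_case(0.3))
print(neutral, averse)
```

In this sketch the risk-averse evaluation never exceeds the expectation, so the resulting index is no larger than the risk-neutral one. A priority-index rule would then pull, at each step, the arm whose current posterior has the highest index.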

Keywords

Multi-armed bandit, Gittins index, Dynamic risk-aversion, Coherent risk measures, Markov decision process, Clinical trials

Degree Discipline

Industrial Engineering

Degree Level

Doctoral

Degree Name

Ph.D. (Doctor of Philosophy)

Permalink

http://hdl.handle.net/11693/76469

Collections

Graduate School of Engineering and Science

Language

English

Type

Thesis
