Risk-averse allocation indices for multiarmed bandit problem

Malekipirbazari, Milad; Çavuş, Özlem

Files

Risk-averse_allocation_indices_for_multiarmed_bandit_problem.pdf (782.35 KB)

Date

2021-01-25

Authors

Malekipirbazari, Milad

Çavuş, Özlem

Abstract

In the classical multiarmed bandit problem, the aim is to find a policy maximizing the expected total reward, implicitly assuming that the decision-maker is risk-neutral. In some real-life applications, however, decision-makers are risk-averse. In this article, we design a new setting based on the concept of dynamic risk measures, where the aim is to find a policy with the best risk-adjusted total discounted outcome. We provide a theoretical analysis of the multiarmed bandit problem with respect to this novel setting and propose a priority-index heuristic that gives risk-averse allocation indices having a structure similar to the Gittins index. Although an optimal policy is shown not always to have an index-based form, empirical results demonstrate the effectiveness of this heuristic and show that with risk-averse allocation indices we can achieve optimal or near-optimal interpretable policies.

Source Title

IEEE Transactions on Automatic Control

Publisher

IEEE

Keywords

Coherent risk measures, Dynamic allocation index, Dynamic risk-aversion, Gittins index, Multiarmed bandit (MAB)

Permalink

http://hdl.handle.net/11693/76828

Published Version (Please cite this version)

https://doi.org/10.1109/TAC.2021.3053539

Collections

Scholarly Publications - Industrial Engineering

Language

English

Type

Article
