Risk-averse allocation indices for multiarmed bandit problem

Malekipirbazari, Milad; Çavuş, Özlem

buir.contributor.author	Malekipirbazari, Milad
buir.contributor.author	Çavuş, Özlem
buir.contributor.orcid	Malekipirbazari, Milad|0000-0002-3212-6498
buir.contributor.orcid	Çavuş, Özlem|0000-0002-9901-0836
dc.citation.epage	5529	en_US
dc.citation.issueNumber	11	en_US
dc.citation.spage	5522	en_US
dc.citation.volumeNumber	66	en_US
dc.contributor.author	Malekipirbazari, Milad
dc.contributor.author	Çavuş, Özlem
dc.date.accessioned	2022-01-27T10:30:18Z
dc.date.available	2022-01-27T10:30:18Z
dc.date.issued	2021-01-25
dc.department	Department of Industrial Engineering	en_US
dc.description.abstract	In the classical multiarmed bandit problem, the aim is to find a policy that maximizes the expected total reward, which implicitly assumes that the decision-maker is risk-neutral. In some real-life applications, however, decision-makers are risk-averse. In this article, we design a new setting based on the concept of dynamic risk measures, where the aim is to find a policy with the best risk-adjusted total discounted outcome. We provide a theoretical analysis of the multiarmed bandit problem in this novel setting and propose a priority-index heuristic that yields risk-averse allocation indices with a structure similar to the Gittins index. Although an optimal policy is shown not always to have an index-based form, empirical results demonstrate the strong performance of this heuristic and show that risk-averse allocation indices can achieve optimal or near-optimal interpretable policies.	en_US
dc.identifier.doi	10.1109/TAC.2021.3053539	en_US
dc.identifier.eissn	1558-2523
dc.identifier.issn	0018-9286
dc.identifier.uri	http://hdl.handle.net/11693/76828
dc.language.iso	English	en_US
dc.publisher	IEEE	en_US
dc.relation.isversionof	https://doi.org/10.1109/TAC.2021.3053539	en_US
dc.source.title	IEEE Transactions on Automatic Control	en_US
dc.subject	Coherent risk measures	en_US
dc.subject	Dynamic allocation index	en_US
dc.subject	Dynamic risk-aversion	en_US
dc.subject	Gittins index	en_US
dc.subject	Multiarmed bandit (MAB)	en_US
dc.title	Risk-averse allocation indices for multiarmed bandit problem	en_US
dc.type	Article	en_US

Files

Original bundle

Name: Risk-averse_allocation_indices_for_multiarmed_bandit_problem.pdf
Size: 782.35 KB
Format: Adobe Portable Document Format

Collections

Scholarly Publications - Industrial Engineering