An online minimax optimal algorithm for adversarial multiarmed bandit problem

Gökçesu, Kaan; Kozat, Süleyman Serdar

buir.contributor.author	Gökçesu, Kaan
buir.contributor.author	Kozat, Süleyman Serdar
dc.citation.epage	5580	en_US
dc.citation.issueNumber	11	en_US
dc.citation.spage	5565	en_US
dc.citation.volumeNumber	29	en_US
dc.contributor.author	Gökçesu, Kaan	en_US
dc.contributor.author	Kozat, Süleyman Serdar	en_US
dc.date.accessioned	2019-02-21T16:05:49Z	en_US
dc.date.available	2019-02-21T16:05:49Z	en_US
dc.date.issued	2018	en_US
dc.department	Department of Electrical and Electronics Engineering	en_US
dc.description.abstract	We investigate the adversarial multiarmed bandit problem and introduce an online algorithm that asymptotically achieves the performance of the best switching bandit arm selection strategy. Our algorithms are truly online such that we do not use the game length or the number of switches of the best arm selection strategy in their constructions. Our results are guaranteed to hold in an individual sequence manner, since we have no statistical assumptions on the bandit arm losses. Our regret bounds, i.e., our performance bounds with respect to the best bandit arm selection strategy, are minimax optimal up to logarithmic terms. We achieve the minimax optimal regret with computational complexity only log-linear in the game length. Thus, our algorithms can be efficiently used in applications involving big data. Through an extensive set of experiments involving synthetic and real data, we demonstrate significant performance gains achieved by the proposed algorithm with respect to the state-of-the-art switching bandit algorithms. We also introduce a general efficiently implementable bandit arm selection framework, which can be adapted to various applications.	en_US
dc.description.sponsorship	Manuscript received September 11, 2016; revised April 26, 2017, July 26, 2017, September 23, 2017, November 20, 2017, and January 23, 2018; accepted January 31, 2018. Date of publication March 8, 2018; date of current version October 16, 2018. This work was supported in part by the Turkish Academy of Sciences Outstanding Researcher Programme and in part by the Scientific and Technological Research Council of Turkey under Contract 113E517. (Corresponding author: Kaan Gokcesu.) K. Gokcesu was with the Department of Electrical and Electronics Engineering, Bilkent University, 06800 Ankara, Turkey. He is now with the Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (e-mail: gokcesu@mit.edu).	en_US
dc.identifier.doi	10.1109/TNNLS.2018.2806006	en_US
dc.identifier.eissn	2162-2388	en_US
dc.identifier.issn	2162-237X	en_US
dc.identifier.uri	http://hdl.handle.net/11693/50275	en_US
dc.language.iso	English	en_US
dc.publisher	Institute of Electrical and Electronics Engineers	en_US
dc.relation.isversionof	https://doi.org/10.1109/TNNLS.2018.2806006	en_US
dc.relation.project	Bilkent Üniversitesi - Massachusetts Institute of Technology, MIT - Türkiye Bilimsel ve Teknolojik Araştırma Kurumu, TÜBİTAK: 113E517	en_US
dc.source.title	IEEE Transactions on Neural Networks and Learning Systems	en_US
dc.subject	Adversarial multiarmed bandit	en_US
dc.subject	Big data	en_US
dc.subject	Individual sequence manner	en_US
dc.subject	Minimax optimal	en_US
dc.subject	Switching bandit	en_US
dc.title	An online minimax optimal algorithm for adversarial multiarmed bandit problem	en_US
dc.type	Article	en_US

Files

Original bundle

Name: An_online_minimax_optimal_algorithm_for_adversarial_multiarmed_bandit_problem.pdf
Size: 1.84 MB
Format: Adobe Portable Document Format
Description: Full printable version

Collections

Scholarly Publications - Electrical and Electronics Engineering