An online minimax optimal algorithm for adversarial multiarmed bandit problem
buir.contributor.author | Gökçesu, Kaan | |
buir.contributor.author | Kozat, Süleyman Serdar | |
dc.citation.epage | 5580 | en_US |
dc.citation.issueNumber | 11 | en_US |
dc.citation.spage | 5565 | en_US |
dc.citation.volumeNumber | 29 | en_US |
dc.contributor.author | Gökçesu, Kaan | en_US |
dc.contributor.author | Kozat, Süleyman Serdar | en_US |
dc.date.accessioned | 2019-02-21T16:05:49Z | en_US |
dc.date.available | 2019-02-21T16:05:49Z | en_US |
dc.date.issued | 2018 | en_US |
dc.department | Department of Electrical and Electronics Engineering | en_US |
dc.description.abstract | We investigate the adversarial multiarmed bandit problem and introduce an online algorithm that asymptotically achieves the performance of the best switching bandit arm selection strategy. Our algorithms are truly online such that we do not use the game length or the number of switches of the best arm selection strategy in their constructions. Our results are guaranteed to hold in an individual sequence manner, since we have no statistical assumptions on the bandit arm losses. Our regret bounds, i.e., our performance bounds with respect to the best bandit arm selection strategy, are minimax optimal up to logarithmic terms. We achieve the minimax optimal regret with computational complexity only log-linear in the game length. Thus, our algorithms can be efficiently used in applications involving big data. Through an extensive set of experiments involving synthetic and real data, we demonstrate significant performance gains achieved by the proposed algorithm with respect to the state-of-the-art switching bandit algorithms. We also introduce a general efficiently implementable bandit arm selection framework, which can be adapted to various applications. | en_US |
dc.description.provenance | Made available in DSpace on 2019-02-21T16:05:49Z (GMT). No. of bitstreams: 1 Bilkent-research-paper.pdf: 222869 bytes, checksum: 842af2b9bd649e7f548593affdbafbb3 (MD5) Previous issue date: 2018 | en |
dc.description.sponsorship | Manuscript received September 11, 2016; revised April 26, 2017, July 26, 2017, September 23, 2017, November 20, 2017, and January 23, 2018; accepted January 31, 2018. Date of publication March 8, 2018; date of current version October 16, 2018. This work was supported in part by the Turkish Academy of Sciences Outstanding Researcher Programme and in part by the Scientific and Technological Research Council of Turkey under Contract 113E517. (Corresponding author: Kaan Gokcesu.) K. Gokcesu was with the Department of Electrical and Electronics Engineering, Bilkent University, 06800 Ankara, Turkey. He is now with the Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (e-mail: gokcesu@mit.edu). | en_US |
dc.identifier.doi | 10.1109/TNNLS.2018.2806006 | en_US |
dc.identifier.eissn | 2162-2388 | en_US |
dc.identifier.issn | 2162-237X | en_US |
dc.identifier.uri | http://hdl.handle.net/11693/50275 | en_US |
dc.language.iso | English | en_US |
dc.publisher | Institute of Electrical and Electronics Engineers | en_US |
dc.relation.isversionof | https://doi.org/10.1109/TNNLS.2018.2806006 | en_US |
dc.relation.project | Bilkent Üniversitesi - Massachusetts Institute of Technology, MIT - Türkiye Bilimsel ve Teknolojik Araştırma Kurumu, TÜBİTAK: 113E517 | en_US |
dc.source.title | IEEE Transactions on Neural Networks and Learning Systems | en_US |
dc.subject | Adversarial multiarmed bandit | en_US |
dc.subject | Big data | en_US |
dc.subject | Individual sequence manner | en_US |
dc.subject | Minimax optimal | en_US |
dc.subject | Switching bandit | en_US |
dc.title | An online minimax optimal algorithm for adversarial multiarmed bandit problem | en_US |
dc.type | Article | en_US |
Files
Original bundle
- Name: An_online_minimax_optimal_algorithm_for_adversarial_multiarmed_bandit_problem.pdf
- Size: 1.84 MB
- Format: Adobe Portable Document Format
- Description: Full printable version