An online minimax optimal algorithm for adversarial multiarmed bandit problem

buir.contributor.author: Gökçesu, Kaan
buir.contributor.author: Kozat, Süleyman Serdar
dc.citation.epage: 5580 (en_US)
dc.citation.issueNumber: 11 (en_US)
dc.citation.spage: 5565 (en_US)
dc.citation.volumeNumber: 29 (en_US)
dc.contributor.author: Gökçesu, Kaan (en_US)
dc.contributor.author: Kozat, Süleyman Serdar (en_US)
dc.date.accessioned: 2019-02-21T16:05:49Z (en_US)
dc.date.available: 2019-02-21T16:05:49Z (en_US)
dc.date.issued: 2018 (en_US)
dc.department: Department of Electrical and Electronics Engineering (en_US)
dc.description.abstract: We investigate the adversarial multiarmed bandit problem and introduce an online algorithm that asymptotically achieves the performance of the best switching bandit arm selection strategy. Our algorithms are truly online such that we do not use the game length or the number of switches of the best arm selection strategy in their constructions. Our results are guaranteed to hold in an individual sequence manner, since we have no statistical assumptions on the bandit arm losses. Our regret bounds, i.e., our performance bounds with respect to the best bandit arm selection strategy, are minimax optimal up to logarithmic terms. We achieve the minimax optimal regret with computational complexity only log-linear in the game length. Thus, our algorithms can be efficiently used in applications involving big data. Through an extensive set of experiments involving synthetic and real data, we demonstrate significant performance gains achieved by the proposed algorithm with respect to the state-of-the-art switching bandit algorithms. We also introduce a general efficiently implementable bandit arm selection framework, which can be adapted to various applications. (en_US)
dc.description.provenance: Made available in DSpace on 2019-02-21T16:05:49Z (GMT). No. of bitstreams: 1 Bilkent-research-paper.pdf: 222869 bytes, checksum: 842af2b9bd649e7f548593affdbafbb3 (MD5) Previous issue date: 2018 (en)
dc.description.sponsorship: Manuscript received September 11, 2016; revised April 26, 2017, July 26, 2017, September 23, 2017, November 20, 2017, and January 23, 2018; accepted January 31, 2018. Date of publication March 8, 2018; date of current version October 16, 2018. This work was supported in part by the Turkish Academy of Sciences Outstanding Researcher Programme and in part by the Scientific and Technological Research Council of Turkey under Contract 113E517. (Corresponding author: Kaan Gokcesu.) K. Gokcesu was with the Department of Electrical and Electronics Engineering, Bilkent University, 06800 Ankara, Turkey. He is now with the Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (e-mail: gokcesu@mit.edu). (en_US)
dc.identifier.doi: 10.1109/TNNLS.2018.2806006 (en_US)
dc.identifier.eissn: 2162-2388 (en_US)
dc.identifier.issn: 2162-237X (en_US)
dc.identifier.uri: http://hdl.handle.net/11693/50275 (en_US)
dc.language.iso: English (en_US)
dc.publisher: Institute of Electrical and Electronics Engineers (en_US)
dc.relation.isversionof: https://doi.org/10.1109/TNNLS.2018.2806006 (en_US)
dc.relation.project: Bilkent Üniversitesi - Massachusetts Institute of Technology, MIT - Türkiye Bilimsel ve Teknolojik Araştırma Kurumu, TÜBİTAK: 113E517 (en_US)
dc.source.title: IEEE Transactions on Neural Networks and Learning Systems (en_US)
dc.subject: Adversarial multiarmed bandit (en_US)
dc.subject: Big data (en_US)
dc.subject: Individual sequence manner (en_US)
dc.subject: Minimax optimal (en_US)
dc.subject: Switching bandit (en_US)
dc.title: An online minimax optimal algorithm for adversarial multiarmed bandit problem (en_US)
dc.type: Article (en_US)

Files

Original bundle

Name: An_online_minimax_optimal_algorithm_for_adversarial_multiarmed_bandit_problem.pdf
Size: 1.84 MB
Format: Adobe Portable Document Format
Description: Full printable version