      Distributed multi-agent online learning based on global feedback

      Author
      Tekin, C.
      Zhang, S.
      Schaar, Mihaela van der
      Date
      2015-05-01
      Source Title
      IEEE Transactions on Signal Processing
      Print ISSN
      1053-587X
      Electronic ISSN
      1941-0476
      Publisher
      Institute of Electrical and Electronics Engineers
      Volume
      63
      Issue
      9
      Pages
      2225 - 2238
      Language
      English
      Type
      Article
      Abstract
      In this paper, we develop online learning algorithms that enable agents to cooperatively learn how to maximize the overall reward, without exchanging any information among themselves, in scenarios where only noisy global feedback is available. We prove that the learning regrets of our algorithms (the losses incurred due to uncertainty) increase logarithmically in time, and thus the time-averaged reward converges to the optimal average reward. We also illustrate how the regret depends on the size of the action space, and we show that this relationship is influenced by how informative the reward structure is about each agent's individual action. When the overall reward is fully informative, regret is shown to be linear in the total number of actions of all the agents. When the reward function is not informative, regret is linear in the number of joint actions. Our analytic and numerical results show that the proposed learning algorithms significantly outperform existing online learning solutions in terms of regret and learning speed. We illustrate how our theoretical framework can be used in practice by applying it to online Big Data mining using distributed classifiers.
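The gap between the two scaling regimes named in the abstract can be made concrete with a small counting sketch. It is not taken from the paper: the function names and the example action counts are hypothetical, and it only compares the two quantities the regret is said to be linear in (the total number of individual actions versus the number of joint actions).

```python
from math import prod


def total_individual_actions(action_counts):
    """Sum of per-agent action counts (the fully informative reward case)."""
    return sum(action_counts)


def joint_actions(action_counts):
    """Product of per-agent action counts (the non-informative reward case)."""
    return prod(action_counts)


if __name__ == "__main__":
    # Hypothetical setup: three agents with four actions each.
    counts = [4, 4, 4]
    print("total individual actions:", total_individual_actions(counts))
    print("joint actions:", joint_actions(counts))
```

Under these assumed counts the first quantity grows additively with the number of agents while the second grows multiplicatively, which is why the informativeness of the reward matters for how regret scales.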
      Keywords
      Big data mining
      Distributed cooperative learning
      Multiagent learning
      Multiarmed bandits
      Online learning
      Reward informativeness
      Permalink
      http://hdl.handle.net/11693/49389
      Published Version (Please cite this version)
      http://doi.org/10.1109/TSP.2015.2403288
      Collections
      • Department of Electrical and Electronics Engineering