
      Unified intrinsically motivated exploration for off-policy learning in continuous action spaces

      Author(s)
      Sağlam, Baturay
      Mutlu, Furkan B.
      Dalmaz, Onat
      Kozat, Süleyman S.
      Date
      2022-08-29
      Source Title
      Signal Processing and Communications Applications Conference (SIU)
      Print ISSN
      2165-0608
      Publisher
      IEEE
      Pages
      1-4
      Language
      Turkish
      Type
      Conference Paper
      Abstract
      In continuous control, exploration is maintained with undirected methods, in which random noise perturbs the network parameters or the selected actions. Intrinsically motivated exploration is a good alternative to undirected techniques, but it has been studied only for discrete action domains. This study unifies the intrinsic incentives in the existing reinforcement learning literature through a deterministic artificial goal generation rule for off-policy learning. Under this rule, the agent gains additional reward if it chooses actions that lead it to useful regions of the state space. An extensive set of experiments indicates that the introduced artificial reward rule significantly improves the performance of the off-policy baseline algorithms.
       
      Keşif, rastgele gürültünün ağ parametrelerini veya seçilen eylemleri bozduğu, yönlendirilmemiş yöntemler kullanılarak sürekli kontrolde sürdürülmektedir. İçsel olarak yönlendirilen keşif, yönlendirilmemiş tekniklere iyi bir alternatiftir ancak yalnızca ayrık eylem alanları için incelenmiştir. Mevcut pekiştirmeli öğrenme literatüründeki içsel teşvikler, bu çalışmada politika-dışı öğrenme için deterministik bir yapay hedef oluşturma kuralıyla birleştirilmiştir. Ajan, kendisini yararlı durum uzaylarına götüren eylemleri seçerse, bu uygulama aracılığıyla ek bir ödül kazanmaktadır. Kapsamlı bir deney seti, tanıtılan yapay ödül kuralının, politika-dışı temel algoritmaların performansını önemli ölçüde geliştirdiğini göstermektedir.
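The contrast the abstract draws between undirected and intrinsically motivated exploration can be illustrated with a minimal Python sketch. It is not the paper's method: this record does not state the goal generation rule, so the bonus below (the squared prediction error of a forward model) is a commonly used stand-in chosen only for illustration. All function names, the noise scale `sigma`, and the weight `scale` are hypothetical.

```python
import random


def noisy_action(action, sigma, low, high, rng):
    """Undirected exploration: add Gaussian noise to each action
    dimension, then clip the result to the valid range [low, high]."""
    return [min(high, max(low, a + rng.gauss(0.0, sigma))) for a in action]


def prediction_error_bonus(predicted_next_state, next_state):
    """Hypothetical intrinsic bonus (an assumption, not the paper's rule):
    squared error between a forward model's prediction and the state
    that was actually observed."""
    return sum((p - s) ** 2 for p, s in zip(predicted_next_state, next_state))


def augmented_reward(extrinsic, bonus, scale):
    """Directed exploration: the reward stored for off-policy learning is
    the environment reward plus a scaled intrinsic bonus."""
    return extrinsic + scale * bonus


rng = random.Random(0)
explored = noisy_action([0.5, -0.9], sigma=0.3, low=-1.0, high=1.0, rng=rng)
bonus = prediction_error_bonus([1.0, 2.0], [1.0, 4.0])
reward = augmented_reward(extrinsic=1.0, bonus=bonus, scale=0.5)
```

In an off-policy setting, the augmented reward would typically be written to the replay buffer in place of the raw environment reward, so the baseline algorithm's update rule needs no other change.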
      Keywords
      Deep reinforcement learning
      Exploration
      Intrinsic motivation
      Continuous control
      Off-policy learning
      Derin pekiştirmeli öğrenme
      Keşif
      İçsel motivasyon
      Sürekli kontrol
      Politika-dışı öğrenme
      Permalink
      http://hdl.handle.net/11693/111288
      Published Version (Please cite this version)
      https://www.doi.org/10.1109/SIU55565.2022.9864795
      Collections
      • Department of Electrical and Electronics Engineering