      Efficient online learning algorithms based on LSTM neural networks

      Author
      Ergen, T.
      Kozat, S. S.
      Date
      2018
      Source Title
      IEEE Transactions on Neural Networks and Learning Systems
      Print ISSN
      2162-237X
      Publisher
      Institute of Electrical and Electronics Engineers
      Volume
      29
      Issue
      8
      Pages
      3772 - 3783
      Language
      English
      Type
      Article
      Abstract
      We investigate online nonlinear regression and introduce novel regression structures based on long short-term memory (LSTM) networks. To train these LSTM-based structures online, we put the underlying architecture in a state space form and introduce efficient particle filtering (PF)-based updates. We also provide stochastic gradient descent and extended Kalman filter-based updates. Our PF-based training method guarantees convergence to the optimal parameter estimate in the mean square error sense, provided that we have a sufficient number of particles and satisfy certain technical conditions. More importantly, by controlling the number of particles, we achieve this performance with a computational complexity on the order of that of first-order gradient-based methods. Since our approach is generic, we also introduce a gated recurrent unit (GRU)-based approach by directly replacing the LSTM architecture with the GRU architecture, and we demonstrate the superiority of our LSTM-based approach in the sequential prediction task on different real-life data sets. In addition, the experimental results illustrate significant performance improvements achieved by the introduced algorithms with respect to conventional methods over several benchmark real-life data sets.
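
      The idea of training a recurrent model through a state space form and PF-based updates can be illustrated with a minimal sketch. This is not the algorithm from the paper: it assumes a toy scalar recurrent cell in place of an LSTM, random-walk parameter dynamics, and a Gaussian observation model, and every name in it (`step_hidden`, `pf_update`, `predict`, `q_std`, `r_std`) is hypothetical. Each particle carries both the parameters and the hidden state, and the cost per observation grows linearly with the number of particles.

```python
import math
import random


def step_hidden(theta, h, x):
    """One step of a toy scalar recurrent cell (a stand-in for an LSTM)."""
    w, u, _ = theta
    return math.tanh(w * h + u * x)


def predict(particles, weights, x):
    """Weighted-mean one-step prediction, made before the target is seen."""
    return sum(
        wt * theta[2] * step_hidden(theta, h, x)
        for (theta, h), wt in zip(particles, weights)
    )


def pf_update(particles, weights, x, y, rng, q_std=0.02, r_std=0.1):
    """One bootstrap particle filter update on the observation (x, y).

    Each particle is (theta, h): parameters and hidden state together.
    q_std and r_std are assumed noise levels chosen for illustration.
    """
    n = len(particles)
    new_particles, new_weights = [], []
    for (theta, h), w_prev in zip(particles, weights):
        # Random-walk dynamics on the parameters (an assumption of this sketch).
        theta = tuple(p + rng.gauss(0.0, q_std) for p in theta)
        h = step_hidden(theta, h, x)
        err = y - theta[2] * h
        # Unnormalised Gaussian likelihood of the observed target.
        lik = math.exp(-0.5 * (err / r_std) ** 2)
        new_particles.append((theta, h))
        # The tiny constant guards against all weights underflowing to zero.
        new_weights.append(w_prev * lik + 1e-300)
    total = sum(new_weights)
    new_weights = [w / total for w in new_weights]
    # Resample when the effective sample size drops below n / 2.
    ess = 1.0 / sum(w * w for w in new_weights)
    if ess < n / 2:
        new_particles = rng.choices(new_particles, weights=new_weights, k=n)
        new_weights = [1.0 / n] * n
    return new_particles, new_weights
```

      In use, one would initialise the particles from a prior over the parameters, call `predict` on each new input, and then call `pf_update` once the target arrives. The number of particles is the knob that trades accuracy against per-step cost.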
      Keywords
      Gated recurrent unit (GRU)
      Kalman filtering
      Long short term memory (LSTM)
      Online learning
      Particle filtering (PF)
      Regression
      Stochastic gradient descent (SGD)
      Permalink
      http://hdl.handle.net/11693/50272
      Published Version (Please cite this version)
      https://doi.org/10.1109/TNNLS.2017.2741598
      Collections
      • Department of Electrical and Electronics Engineering