Efficient online learning algorithms based on LSTM neural networks

buir.contributor.author: Ergen, Tolga
buir.contributor.author: Kozat, Süleyman Serdar
dc.citation.epage: 3783 [en_US]
dc.citation.issueNumber: 8 [en_US]
dc.citation.spage: 3772 [en_US]
dc.citation.volumeNumber: 29 [en_US]
dc.contributor.author: Ergen, Tolga [en_US]
dc.contributor.author: Kozat, Süleyman Serdar [en_US]
dc.date.accessioned: 2019-02-21T16:05:46Z
dc.date.available: 2019-02-21T16:05:46Z
dc.date.issued: 2018 [en_US]
dc.department: Department of Electrical and Electronics Engineering [en_US]
dc.description.abstract: We investigate online nonlinear regression and introduce novel regression structures based on long short-term memory (LSTM) networks. To train these structures online, we put the underlying architecture in state-space form and introduce efficient particle filtering (PF)-based updates. We also provide stochastic gradient descent and extended Kalman filter-based updates. Our PF-based training method guarantees convergence to the optimal parameter estimate in the mean-square-error sense, provided that we have a sufficient number of particles and satisfy certain technical conditions. More importantly, by controlling the number of particles, we achieve this performance with a computational complexity on the order of that of first-order gradient-based methods. Since our approach is generic, we also introduce a gated recurrent unit (GRU)-based approach by directly replacing the LSTM architecture with the GRU architecture, and we demonstrate the superiority of the LSTM-based approach in sequential prediction tasks on different real-life data sets. In addition, the experimental results illustrate significant performance improvements achieved by the introduced algorithms over conventional methods on several benchmark real-life data sets.
dc.description.provenance: Made available in DSpace on 2019-02-21T16:05:46Z (GMT). No. of bitstreams: 1 Bilkent-research-paper.pdf: 222869 bytes, checksum: 842af2b9bd649e7f548593affdbafbb3 (MD5) Previous issue date: 2018 [en]
dc.description.sponsorship: Manuscript received October 30, 2016; revised May 5, 2017 and August 15, 2017; accepted August 15, 2017. Date of publication September 13, 2017; date of current version July 18, 2018. This work was supported by TUBITAK under Contract 115E917. (Corresponding author: Tolga Ergen.) The authors are with the Department of Electrical and Electronics Engineering, Bilkent University, 06800 Ankara, Turkey (e-mail: ergen@ee.bilkent.edu.tr; kozat@ee.bilkent.edu.tr).
dc.identifier.doi: 10.1109/TNNLS.2017.2741598
dc.identifier.issn: 2162-237X
dc.identifier.uri: http://hdl.handle.net/11693/50272
dc.language.iso: English
dc.publisher: Institute of Electrical and Electronics Engineers
dc.relation.isversionof: https://doi.org/10.1109/TNNLS.2017.2741598
dc.relation.project: Bilkent Üniversitesi - 115E917
dc.source.title: IEEE Transactions on Neural Networks and Learning Systems [en_US]
dc.subject: Gated recurrent unit (GRU) [en_US]
dc.subject: Kalman filtering [en_US]
dc.subject: Long short term memory (LSTM) [en_US]
dc.subject: Online learning [en_US]
dc.subject: Particle filtering (PF) [en_US]
dc.subject: Regression [en_US]
dc.subject: Stochastic gradient descent (SGD) [en_US]
dc.title: Efficient online learning algorithms based on LSTM neural networks [en_US]
dc.type: Article [en_US]

Files

Original bundle

Name: Efficient-Online-Learning-Algorithms-Based-on-LSTM-Neural-Networks.pdf
Size: 1.34 MB
Format: Adobe Portable Document Format
Description: Full printable version