Efficient online training algorithms for recurrent neural networks
Abstract
Recurrent Neural Networks (RNNs) are widely used for online regression due to their ability to learn nonlinear temporal dependencies. Among RNN models, Long Short-Term Memory networks (LSTMs) are commonly preferred in practice, since they can learn long-term dependencies while avoiding the vanishing gradient problem. However, this performance improvement usually comes at the price of a large number of parameters, which makes LSTM training demanding in terms of both computation and data. In this thesis, we address the computational challenges of LSTM training. We introduce two training algorithms designed to obtain the online regression performance of LSTMs with lower computational requirements than the state of the art. The introduced algorithms are truly online, i.e., they assume neither an underlying data-generating process nor access to future information, only that the dataset is bounded. We discuss the theoretical guarantees of the introduced algorithms, along with their asymptotic convergence behavior. Finally, we demonstrate their performance through extensive numerical studies on real and synthetic datasets, and show that they achieve the regression performance of LSTMs with significantly shorter training times.