Online learning with recurrent neural networks
Author
Ergen, Tolga
Advisor
Kozat, Süleyman Serdar
Date
2018-07
Publisher
Bilkent University
Language
English
Type
Thesis
Abstract
In this thesis, we study online learning with Recurrent Neural Networks (RNNs).
Particularly, in Chapter 2, we investigate online nonlinear regression and introduce
novel regression structures based on the Long Short Term Memory (LSTM)
network, which is an advanced RNN architecture. To train these novel LSTM
based structures, we introduce highly efficient and effective Particle Filtering
(PF) based updates. We also provide Stochastic Gradient Descent (SGD) and
Extended Kalman Filter (EKF) based updates. Our PF based training method
guarantees convergence to the optimal parameter estimation in the Mean Square
Error (MSE) sense. In Chapter 3, we investigate online training of LSTM architectures
in a distributed network of nodes, where each node employs an LSTM
based structure for online regression. We first provide a generic LSTM based regression
structure for each node. In order to train this structure, we introduce a
highly effective and efficient Distributed PF (DPF) based training algorithm. We
also introduce a Distributed EKF (DEKF) based training algorithm. Here, our
DPF based training algorithm guarantees convergence to the performance of the
optimal centralized LSTM parameters in the MSE sense. In Chapter 4, we investigate
variable length data regression in an online setting and introduce an energy
efficient regression structure built on LSTM networks. To reduce the complexity
of this structure, we first replace the regular multiplication operations with an
energy efficient operator. We then apply factorizations to the weight matrices so
that the total number of parameters to be trained is significantly reduced. We
then introduce online training algorithms. Through a set of experiments, we illustrate
significant performance gains and complexity reductions achieved by the
introduced algorithms with respect to the state-of-the-art methods.
Keywords
Online Learning
Recurrent Neural Network (RNN)
Extended Kalman filtering (EKF)
Particle filtering (PF)
Stochastic Gradient Descent (SGD)