Online learning with recurrent neural networks

buir.advisor: Kozat, Süleyman Serdar
dc.contributor.author: Ergen, Tolga
dc.date.accessioned: 2018-07-30T10:58:16Z
dc.date.available: 2018-07-30T10:58:16Z
dc.date.copyright: 2018-07
dc.date.issued: 2018-07
dc.date.submitted: 2018-07-17
dc.description: Cataloged from PDF version of article.
dc.description: Thesis (M.S.): İhsan Doğramacı Bilkent University, Department of Electrical and Electronics Engineering, 2018.
dc.description: Includes bibliographical references (leaves 80-87).
dc.description.abstract: In this thesis, we study online learning with Recurrent Neural Networks (RNNs). In particular, in Chapter 2, we investigate online nonlinear regression and introduce novel regression structures based on the Long Short Term Memory (LSTM) network, an advanced RNN architecture. To train these novel LSTM based structures, we introduce highly efficient and effective Particle Filtering (PF) based updates. We also provide Stochastic Gradient Descent (SGD) and Extended Kalman Filter (EKF) based updates. Our PF based training method guarantees convergence to the optimal parameter estimation in the Mean Square Error (MSE) sense. In Chapter 3, we investigate online training of LSTM architectures in a distributed network of nodes, where each node employs an LSTM based structure for online regression. We first provide a generic LSTM based regression structure for each node. In order to train this structure, we introduce a highly effective and efficient Distributed PF (DPF) based training algorithm. We also introduce a Distributed EKF (DEKF) based training algorithm. Here, our DPF based training algorithm guarantees convergence to the performance of the optimal centralized LSTM parameters in the MSE sense. In Chapter 4, we investigate variable length data regression in an online setting and introduce an energy efficient regression structure built on LSTM networks. To reduce the complexity of this structure, we first replace the regular multiplication operations with an energy efficient operator. We then apply factorizations to the weight matrices so that the total number of parameters to be trained is significantly reduced. We then introduce online training algorithms. Through a set of experiments, we illustrate significant performance gains and complexity reductions achieved by the introduced algorithms with respect to the state of the art methods.
dc.description.provenance: Submitted by Betül Özen (ozen@bilkent.edu.tr) on 2018-07-30T10:58:16Z. No. of bitstreams: 1; my_thesis.pdf: 2259158 bytes, checksum: e6ebf2a440d035de934e94f77801c503 (MD5)
dc.description.provenance: Made available in DSpace on 2018-07-30T10:58:16Z (GMT). No. of bitstreams: 1; my_thesis.pdf: 2259158 bytes, checksum: e6ebf2a440d035de934e94f77801c503 (MD5). Previous issue date: 2018-07
dc.description.statementofresponsibility: by Tolga Ergen.
dc.embargo.release: 2021-07-17
dc.format.extent: xii, 87 leaves : graphics (some color) ; 30 cm.
dc.identifier.itemid: B158690
dc.identifier.uri: http://hdl.handle.net/11693/47693
dc.language.iso: English
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Online Learning
dc.subject: Recurrent Neural Network (RNN)
dc.subject: Extended Kalman Filtering (EKF)
dc.subject: Particle Filtering (PF)
dc.subject: Stochastic Gradient Descent (SGD)
dc.title: Online learning with recurrent neural networks
dc.title.alternative: Yinelenen sinir ağları ile çevrimiçi öğrenim [English: Online learning with recurrent neural networks]
dc.type: Thesis
thesis.degree.discipline: Electrical and Electronic Engineering
thesis.degree.grantor: Bilkent University
thesis.degree.level: Master's
thesis.degree.name: MS (Master of Science)
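
The abstract above describes, at a high level, online (sample-by-sample) regression with an LSTM whose parameters are updated as data arrives. As a purely illustrative aid, and not the thesis's actual algorithm, the following minimal NumPy sketch shows the generic setup the abstract assumes: an LSTM cell maintains a state, a linear readout produces the prediction, and each new observation triggers one SGD step on the squared error. All names here (lstm_step, hidden_size, the toy target signal) are this sketch's assumptions; the thesis's PF and EKF based updates are not reproduced.

# A minimal sketch (not the thesis's exact method): online regression with
# a single LSTM cell and a linear readout, updated by SGD after each sample.
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 8

# Stacked LSTM weights for the input, forget, cell, and output gates.
W = rng.normal(0.0, 0.1, (4 * hidden_size, input_size + hidden_size))
b = np.zeros(4 * hidden_size)
w_out = rng.normal(0.0, 0.1, hidden_size)  # linear readout

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c):
    z = W @ np.concatenate([x, h]) + b
    i, f, g, o = np.split(z, 4)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

# Online loop: predict, observe the desired value, take one SGD step on the
# squared error. For brevity only the readout is trained here; the thesis
# also trains the LSTM weights (via SGD, EKF, or PF based updates).
h, c = np.zeros(hidden_size), np.zeros(hidden_size)
lr = 0.05
for t in range(1000):
    x_t = rng.normal(size=input_size)   # incoming feature vector
    d_t = np.sin(x_t.sum())             # stand-in for the desired signal
    h, c = lstm_step(x_t, h, c)
    y_t = w_out @ h                     # prediction
    err = d_t - y_t
    w_out += lr * err * h               # gradient step on 0.5 * err**2

For Chapter 4's complexity reduction, the abstract mentions factorizing the weight matrices to shrink the number of trainable parameters. A sketch of that general idea, assuming a plain rank-r factorization (the thesis's specific factorization may differ):

# Replace a dense m-by-n weight matrix with W ~ A @ B of rank r, cutting the
# trainable parameter count from m*n down to r*(m + n).
m, n, r = 32, 40, 4
A = rng.normal(0.0, 0.1, (m, r))
B = rng.normal(0.0, 0.1, (r, n))
x = rng.normal(size=n)
y = A @ (B @ x)  # plays the role of W @ x with far fewer parameters
print(m * n, "dense parameters vs", r * (m + n), "factorized")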

Files

Original bundle
Name: my_thesis.pdf
Size: 2.15 MB
Format: Adobe Portable Document Format
Description: Full printable version
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission