Online training of LSTM networks in distributed systems for variable length data sequences

buir.contributor.author: Kozat, Serdar
dc.citation.epage: 5165
dc.citation.issueNumber: 10
dc.citation.spage: 5159
dc.citation.volumeNumber: 29
dc.contributor.author: Ergen, T.
dc.contributor.author: Kozat, Serdar
dc.date.accessioned: 2019-02-21T16:05:48Z
dc.date.available: 2019-02-21T16:05:48Z
dc.date.issued: 2018
dc.department: Department of Electrical and Electronics Engineering
dc.description.abstract: In this brief, we investigate online training of long short-term memory (LSTM) architectures in a distributed network of nodes, where each node employs an LSTM-based structure for online regression. In particular, each node sequentially receives a variable-length data sequence with its label and can only exchange information with its neighbors to train the LSTM architecture. We first provide a generic LSTM-based regression structure for each node. In order to train this structure, we put the LSTM equations in a nonlinear state-space form for each node and then introduce a highly effective and efficient distributed particle filtering (DPF)-based training algorithm. We also introduce a distributed extended Kalman filtering-based training algorithm for comparison. Here, our DPF-based training algorithm guarantees convergence to the performance of the optimal LSTM coefficients in the mean square error sense under certain conditions. We achieve this performance with communication and computational complexity on the order of first-order gradient-based methods. Through both simulated and real-life examples, we illustrate significant performance improvements with respect to the state-of-the-art methods.
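For context, the following is a minimal single-node sketch (Python/NumPy) of the general idea the abstract describes: the flat LSTM weight vector is treated as the latent state of a nonlinear state-space model, and a basic sequential importance resampling particle filter updates it online from (sequence, label) pairs. This is not the paper's DPF algorithm; the distributed information exchange between neighboring nodes is omitted, and all dimensions, noise levels, and data below are assumptions invented for illustration.

import numpy as np

# Minimal sketch, assuming: a single node, scalar regression targets, a Gaussian
# observation model, and a random-walk state transition on the LSTM weights.
# All sizes and noise levels are illustrative, not taken from the paper.

rng = np.random.default_rng(0)
n_in, n_hid, n_particles = 2, 4, 100
q_std, r_std = 0.01, 0.1  # assumed process / observation noise std. deviations

# Total weight count: 4 gates (input/recurrent/bias) plus a linear output layer.
n_params = 4 * n_hid * (n_in + n_hid + 1) + n_hid + 1

def lstm_predict(theta, X):
    """Run an LSTM over a variable-length sequence X (T x n_in) using weights
    unpacked from the flat vector theta; return a scalar regression output."""
    k = 0
    def take(shape):
        nonlocal k
        size = int(np.prod(shape))
        block = theta[k:k + size].reshape(shape)
        k += size
        return block
    W = take((4 * n_hid, n_in))   # input weights for the 4 gates, stacked
    U = take((4 * n_hid, n_hid))  # recurrent weights
    b = take((4 * n_hid,))        # gate biases
    w_out, b_out = take((n_hid,)), take((1,))
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    h, c = np.zeros(n_hid), np.zeros(n_hid)
    for x in X:                   # standard LSTM recursion (the state-space map)
        z = W @ x + U @ h + b
        i, f = sig(z[:n_hid]), sig(z[n_hid:2 * n_hid])
        o, g = sig(z[2 * n_hid:3 * n_hid]), np.tanh(z[3 * n_hid:])
        c = f * c + i * g
        h = o * np.tanh(c)
    return w_out @ h + b_out[0]

# Particle cloud over the weight vector, with uniform importance weights.
particles = 0.1 * rng.standard_normal((n_particles, n_params))
weights = np.full(n_particles, 1.0 / n_particles)

def pf_step(X, y):
    """One online update: propagate particles, reweight by the likelihood of
    the observed label y, resample, and return the current output estimate."""
    global particles, weights
    particles = particles + q_std * rng.standard_normal(particles.shape)
    preds = np.array([lstm_predict(th, X) for th in particles])
    log_w = np.log(weights) - 0.5 * ((y - preds) / r_std) ** 2
    log_w -= log_w.max()                      # for numerical stability
    weights = np.exp(log_w)
    weights /= weights.sum()
    estimate = weights @ preds                # MMSE estimate of the output
    idx = rng.choice(n_particles, size=n_particles, p=weights)
    particles = particles[idx]                # multinomial resampling
    weights = np.full(n_particles, 1.0 / n_particles)
    return estimate

# Toy stream of variable-length sequences with a simple synthetic target.
for t in range(20):
    T = int(rng.integers(3, 8))
    X = rng.standard_normal((T, n_in))
    y = 0.5 * X.sum()
    pf_step(X, y)

Resampling after every step is the simplest choice; practical filters typically resample only when the effective sample size drops, and the paper's distributed variant further combines estimates across neighboring nodes.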
dc.description.sponsorship: Manuscript received June 15, 2017; revised September 8, 2017; accepted November 1, 2017. Date of publication December 7, 2017; date of current version September 17, 2018. This work was supported by TUBITAK under Contract 115E917. (Corresponding author: Tolga Ergen.) The authors are with the Department of Electrical and Electronics Engineering, Bilkent University, 06800 Ankara, Turkey (e-mail: ergen@ee.bilkent.edu.tr; kozat@ee.bilkent.edu.tr).
dc.identifier.doi: 10.1109/TNNLS.2017.2770179
dc.identifier.issn: 2162-237X
dc.identifier.uri: http://hdl.handle.net/11693/50274
dc.language.iso: English
dc.publisher: Institute of Electrical and Electronics Engineers
dc.relation.isversionof: https://doi.org/10.1109/TNNLS.2017.2770179
dc.relation.project: Bilkent Üniversitesi - IEEE Foundation, IEEE - 115E917
dc.source.title: IEEE Transactions on Neural Networks and Learning Systems
dc.subject: Distributed learning
dc.subject: Extended Kalman filtering (EKF)
dc.subject: Long short-term memory (LSTM) networks
dc.subject: Online learning
dc.subject: Particle filtering
dc.title: Online training of LSTM networks in distributed systems for variable length data sequences
dc.type: Article

Files

Original bundle

Name: Online_training-of_LSTM_networks_in_distributed_systems_for_variable_lenght_data_sequences.pdf
Size: 683.84 KB
Format: Adobe Portable Document Format
Description: Full printable version