Online training of LSTM networks in distributed systems for variable length data sequences
buir.contributor.author | Kozat, Serdar | |
dc.citation.epage | 5165 | en_US |
dc.citation.issueNumber | 10 | en_US |
dc.citation.spage | 5159 | en_US |
dc.citation.volumeNumber | 29 | en_US |
dc.contributor.author | Ergen, T. | en_US |
dc.contributor.author | Kozat, Serdar | en_US |
dc.date.accessioned | 2019-02-21T16:05:48Z | |
dc.date.available | 2019-02-21T16:05:48Z | |
dc.date.issued | 2018 | en_US |
dc.department | Department of Electrical and Electronics Engineering | en_US |
dc.description.abstract | In this brief, we investigate online training of long short-term memory (LSTM) architectures in a distributed network of nodes, where each node employs an LSTM-based structure for online regression. In particular, each node sequentially receives a variable length data sequence with its label and can only exchange information with its neighbors to train the LSTM architecture. We first provide a generic LSTM-based regression structure for each node. In order to train this structure, we put the LSTM equations in a nonlinear state-space form for each node and then introduce a highly effective and efficient distributed particle filtering (DPF)-based training algorithm. We also introduce a distributed extended Kalman filtering-based training algorithm for comparison. Here, our DPF-based training algorithm guarantees convergence to the performance of the optimal LSTM coefficients in the mean square error sense under certain conditions. We achieve this performance with communication and computational complexity on the order of the first-order gradient-based methods. Through both simulated and real-life examples, we illustrate significant performance improvements with respect to the state-of-the-art methods. | |
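The abstract describes casting the LSTM equations as a nonlinear state-space model whose hidden state and weights are then estimated by particle filtering. As a rough illustration of that idea (not the paper's distributed algorithm: this is a single-node sketch with no inter-node communication, and all dimensions, noise levels, and the random-walk parameter dynamics are assumptions for illustration), one can treat the LSTM weights as a latent state, propagate a particle cloud through the LSTM recursion, and reweight by the observation likelihood:

```python
import numpy as np

rng = np.random.default_rng(0)


def lstm_cell(x, h, c, W):
    """One LSTM step; W packs input/forget/output/candidate gate weights."""
    z = np.concatenate([x, h])
    i = 1.0 / (1.0 + np.exp(-(W[0] @ z)))   # input gate
    f = 1.0 / (1.0 + np.exp(-(W[1] @ z)))   # forget gate
    o = 1.0 / (1.0 + np.exp(-(W[2] @ z)))   # output gate
    g = np.tanh(W[3] @ z)                   # candidate cell state
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new


def particle_filter_lstm(T=50, n_particles=200, n_h=2, n_x=1):
    """Sketch of particle-filter training: LSTM weights as a latent state."""
    dim = 4 * n_h * (n_h + n_x)
    particles = rng.normal(0.0, 0.5, (n_particles, dim))  # weight hypotheses
    weights = np.full(n_particles, 1.0 / n_particles)
    h = np.zeros((n_particles, n_h))
    c = np.zeros((n_particles, n_h))

    # Synthetic regression sequence (placeholder for a node's data stream).
    xs = rng.normal(size=(T, n_x))
    ys = np.array([0.3 * xs[t, 0] + 0.5 * xs[max(0, t - 1), 0]
                   for t in range(T)])

    pred = 0.0
    for t in range(T):
        # Artificial random-walk dynamics on the parameters.
        particles += rng.normal(0.0, 0.01, particles.shape)
        preds = np.empty(n_particles)
        for p in range(n_particles):
            W = particles[p].reshape(4, n_h, n_h + n_x)
            h[p], c[p] = lstm_cell(xs[t], h[p], c[p], W)
            preds[p] = h[p].sum()  # simple linear readout of the hidden state
        # Reweight by Gaussian observation likelihood, then normalize.
        weights *= np.exp(-0.5 * (ys[t] - preds) ** 2 / 0.1) + 1e-300
        weights /= weights.sum()
        # Resample when the effective sample size collapses.
        if 1.0 / np.sum(weights ** 2) < n_particles / 2:
            idx = rng.choice(n_particles, n_particles, p=weights)
            particles, h, c = particles[idx], h[idx], c[idx]
            weights[:] = 1.0 / n_particles
        pred = float(weights @ preds)  # posterior-mean prediction
    return pred, float(weights.sum())
```

The paper's DPF algorithm additionally exchanges information between neighboring nodes and comes with a mean-square-error convergence guarantee; none of that machinery is reproduced here.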
dc.description.provenance | Made available in DSpace on 2019-02-21T16:05:48Z (GMT). No. of bitstreams: 1 Bilkent-research-paper.pdf: 222869 bytes, checksum: 842af2b9bd649e7f548593affdbafbb3 (MD5) Previous issue date: 2018 | en |
dc.description.sponsorship | Manuscript received June 15, 2017; revised September 8, 2017; accepted November 1, 2017. Date of publication December 7, 2017; date of current version September 17, 2018. This work was supported by TUBITAK under Contract 115E917. (Corresponding author: Tolga Ergen.) The authors are with the Department of Electrical and Electronics Engineering, Bilkent University, 06800 Ankara, Turkey (e-mail: ergen@ee.bilkent.edu.tr; kozat@ee.bilkent.edu.tr). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNNLS.2017.2770179 2162-237X © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. | |
dc.identifier.doi | 10.1109/TNNLS.2017.2770179 | |
dc.identifier.issn | 2162-237X | |
dc.identifier.uri | http://hdl.handle.net/11693/50274 | |
dc.language.iso | English | |
dc.publisher | Institute of Electrical and Electronics Engineers | |
dc.relation.isversionof | https://doi.org/10.1109/TNNLS.2017.2770179 | |
dc.relation.project | Bilkent Üniversitesi - IEEE Foundation, IEEE - 115E917 | |
dc.source.title | IEEE Transactions on Neural Networks and Learning Systems | en_US |
dc.subject | Distributed learning | en_US |
dc.subject | Extended Kalman filtering (EKF) | en_US |
dc.subject | Long short term memory (LSTM) networks | en_US |
dc.subject | Online learning | en_US |
dc.subject | Particle filtering | en_US |
dc.title | Online training of LSTM networks in distributed systems for variable length data sequences | en_US |
dc.type | Article | en_US |
Files
Original bundle
1 - 1 of 1
- Name: Online_training-of_LSTM_networks_in_distributed_systems_for_variable_lenght_data_sequences.pdf
- Size: 683.84 KB
- Format: Adobe Portable Document Format
- Description: Full printable version