Nonlinear regression with hierarchical recurrent neural networks under missing data

buir.contributor.author: Şahin, Safa Onur
buir.contributor.author: Kozat, Süleyman Serdar
buir.contributor.orcid: Şahin, Safa Onur|0000-0001-8528-058X
buir.contributor.orcid: Kozat, Süleyman Serdar|0000-0002-6488-3848
dc.citation.epage: 5025
dc.citation.issueNumber: 10
dc.citation.spage: 5012
dc.citation.volumeNumber: 5
dc.contributor.author: Şahin, Safa Onur
dc.contributor.author: Kozat, Süleyman Serdar
dc.date.accessioned: 2025-02-24T09:16:41Z
dc.date.available: 2025-02-24T09:16:41Z
dc.date.issued: 2024-10
dc.department: Department of Electrical and Electronics Engineering
dc.description.abstract: We investigate nonlinear regression of variable-length sequential data that suffer from missing inputs. We introduce the hierarchical-LSTM network, a novel hierarchical architecture based on LSTM networks. The hierarchical-LSTM architecture contains a set of LSTM networks, each trained as an expert for processing inputs that follow a particular presence-pattern, i.e., we partition the input space into subspaces in a hierarchical manner based on the presence-patterns and assign a specific LSTM network to each subpattern. We adaptively combine the outputs of these LSTM networks based on the presence-pattern and construct the final output at each time step. The introduced algorithm protects the LSTM networks against performance losses due to: 1) the statistical mismatches commonly faced by widely used imputation methods; and 2) imputation drift, since our architecture uses only the existing inputs without any assumption on the missing data. In addition, the computational load of our algorithm is lower than that of conventional algorithms in terms of the number of multiplication operations, particularly under high missingness ratios. We emphasize that our architecture can be readily applied to other recurrent architectures such as RNNs and GRU networks. The hierarchical-LSTM network demonstrates significant performance improvements over state-of-the-art methods on several well-known real-life and financial datasets. We also openly share the source code of our algorithm to facilitate other studies and the reproducibility of our results. Future work may explore selecting a subset of presence-patterns instead of using all of them, so that the hierarchical-LSTM architecture can be used with large window lengths while keeping the number of parameters and the computational load at the same level.
dc.identifier.doi: 10.1109/TAI.2024.3404414
dc.identifier.eissn: 2691-4581
dc.identifier.uri: https://hdl.handle.net/11693/116736
dc.language.iso: English
dc.publisher: IEEE
dc.relation.isversionof: https://dx.doi.org/10.1109/TAI.2024.3404414
dc.rights: CC BY-NC-ND 4.0 DEED (Attribution-NonCommercial-NoDerivatives 4.0 International)
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.source.title: IEEE Transactions on Artificial Intelligence
dc.subject: Long short-term memory (LSTM)
dc.subject: Missing data
dc.subject: Mixture of experts
dc.subject: Recurrent neural networks (RNNs)
dc.subject: Time series regression/prediction
dc.title: Nonlinear regression with hierarchical recurrent neural networks under missing data
dc.type: Article
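The abstract above describes a mixture of per-presence-pattern LSTM experts. The following is a minimal, illustrative PyTorch sketch of that routing idea, not the authors' released implementation: the class name HierarchicalLSTM, the use of one nn.LSTMCell per pattern, and the plain averaging combiner are all assumptions made for illustration.

```python
# Illustrative sketch only (assumed names and combination rule), not the
# paper's released code. One LSTM expert is created per non-empty
# presence-pattern of a length-m input window; at each step, every expert
# whose pattern uses only observed entries is run, and their predictions
# are combined.
import torch
import torch.nn as nn


class HierarchicalLSTM(nn.Module):
    def __init__(self, window: int, hidden: int):
        super().__init__()
        self.window = window
        self.hidden = hidden
        self.experts = nn.ModuleDict()
        for p in range(1, 2 ** window):
            # Bit i of p marks whether window position i is observed.
            in_dim = bin(p).count("1")  # expert sees only present entries
            self.experts[str(p)] = nn.LSTMCell(in_dim, hidden)
        self.readout = nn.Linear(hidden, 1)

    def init_states(self, batch: int = 1):
        # Fresh (h, c) hidden states for every expert.
        return {
            int(k): (torch.zeros(batch, self.hidden),
                     torch.zeros(batch, self.hidden))
            for k in self.experts
        }

    def forward(self, x, mask, states):
        # x: (batch, window) inputs; missing entries may hold any filler.
        # mask: length-window sequence of 0/1 flags, 1 = observed.
        p = sum(int(m) << i for i, m in enumerate(mask))
        outs = []
        for q_str, expert in self.experts.items():
            q = int(q_str)
            # Skip experts whose pattern needs an entry that is missing.
            if (q & p) != q:
                continue
            idx = [i for i in range(self.window) if (q >> i) & 1]
            h, c = expert(x[:, idx], states[q])
            states[q] = (h, c)
            outs.append(self.readout(h))
        if not outs:  # fully missing step: no expert fires
            return None, states
        # Plain averaging stands in for the paper's adaptive combination.
        return torch.stack(outs).mean(dim=0), states


model = HierarchicalLSTM(window=2, hidden=8)
states = model.init_states()
y, states = model(torch.randn(1, 2), mask=(1, 0), states=states)
```

In this sketch, every expert whose pattern is covered by the observed entries contributes equally; the paper instead combines the expert outputs adaptively and organizes the experts hierarchically over the presence-patterns.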

Files

Original bundle

Name: Nonlinear_Regression_With_Hierarchical_Recurrent_Neural_Networks_Under_Missing_Data.pdf
Size: 1.49 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission