Nonlinear regression with hierarchical recurrent neural networks under missing data
Abstract
We investigate nonlinear regression for variable-length sequential data with missing inputs. We introduce the hierarchical-LSTM network, a novel hierarchical architecture based on LSTM networks. The hierarchical-LSTM architecture contains a set of LSTM networks, each trained as an expert for processing inputs that follow a particular presence-pattern; i.e., we hierarchically partition the input space into subspaces based on the presence-patterns and assign a specific LSTM network to each subpattern. At each time step, we adaptively combine the outputs of these LSTM networks according to the observed presence-pattern to construct the final output. The introduced algorithm protects the LSTM networks against performance losses due to: 1) statistical mismatches commonly faced by the widely used imputation methods; and 2) imputation drift, since our architecture uses only the existing inputs without any assumption on the missing data. In addition, our algorithm has a lower computational load than the conventional algorithms in terms of the number of multiplication operations, particularly under high missingness ratios. We emphasize that our architecture can be readily applied to other recurrent architectures such as RNNs and GRU networks. The hierarchical-LSTM network achieves significant performance improvements over the state-of-the-art methods on several well-known real-life and financial datasets. We also openly share the source code of our algorithm to facilitate other studies and the reproducibility of our results. Future work may explore selecting a subset of presence-patterns instead of using all presence-patterns, so that the hierarchical-LSTM architecture can be used with large window lengths while keeping the number of parameters and the computational load at the same level.
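To make the presence-pattern routing concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of the core idea: each binary presence-pattern over the input features gets its own small recurrent expert that consumes only the observed components, so no imputation is ever performed. Plain NumPy recurrent cells stand in for the LSTM experts, and all weights, dimensions, and pattern lists here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d, h = 2, 4                      # input dim, hidden dim (illustrative)
patterns = [(0,), (1,), (0, 1)]  # observed-index tuples; all-missing is skipped

# One expert per presence-pattern; the input weight is shaped to the
# observed subset only, so missing entries never enter the computation.
experts = {
    p: {"W": rng.standard_normal((h, len(p))) * 0.1,
        "U": rng.standard_normal((h, h)) * 0.1}
    for p in patterns
}

def step(x, mask, states):
    """Route one time step to the expert matching its presence-pattern
    and update only that expert's hidden state."""
    p = tuple(i for i in range(d) if mask[i])
    if not p:                          # nothing observed: keep all states
        return states, np.zeros(h)
    w = experts[p]
    s = np.tanh(w["W"] @ x[list(p)] + w["U"] @ states[p])
    states = {**states, p: s}
    return states, s                   # active expert's state as the output

states = {p: np.zeros(h) for p in patterns}
seq = rng.standard_normal((5, d))
masks = [(1, 1), (1, 0), (0, 1), (1, 1), (0, 0)]

outputs = []
for x, m in zip(seq, masks):
    states, y = step(x, m, states)
    outputs.append(y)
```

The full architecture additionally combines the experts' outputs adaptively and uses LSTM cells; this sketch uses hard routing to a single expert per step purely to illustrate how presence-patterns partition the input space.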