Browsing by Subject "Recurrent neural networks (RNNs)"
Now showing 1 - 5 of 5
Item Open Access
Achieving online regression performance of LSTMs with simple RNNs (Institute of Electrical and Electronics Engineers, 2021-06-17)
Vural, Nuri Mert; İlhan, Fatih; Yılmaz, Selim Fırat; Ergüt, S.; Kozat, Süleyman Serdar
Recurrent neural networks (RNNs) are widely used for online regression due to their ability to generalize nonlinear temporal dependencies. As an RNN model, long short-term memory networks (LSTMs) are commonly preferred in practice, as these networks are capable of learning long-term dependencies while avoiding the vanishing gradient problem. However, due to their large number of parameters, training LSTMs requires considerably longer training time compared to simple RNNs (SRNNs). In this article, we achieve the online regression performance of LSTMs with SRNNs efficiently. To this end, we introduce a first-order training algorithm with a linear time complexity in the number of parameters. We show that SRNNs trained with our algorithm provide regression performance very similar to that of LSTMs in two to three times shorter training time. We support our experimental results with a strong theoretical analysis that provides regret bounds on the convergence rate of our algorithm. Through an extensive set of experiments, we verify our theoretical work and demonstrate significant performance improvements of our algorithm with respect to LSTMs and other state-of-the-art learning models.

Item Open Access
Markovian RNN: an adaptive time series prediction network with HMM-based switching for nonstationary environments (Institute of Electrical and Electronics Engineers Inc., 2023-02-01)
İlhan, Fatih; Karaahmetoğlu, Oğuzhan; Balaban, İ.; Kozat, Süleyman Serdar
We investigate nonlinear regression for nonstationary sequential data.
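The first listing above describes training a simple RNN for online regression with a first-order algorithm whose per-step cost is linear in the number of parameters. A minimal sketch of that setting, assuming a plain SGD step with a one-step truncated gradient (the class name, learning rate, and update rule are illustrative assumptions, not the authors' actual algorithm):

```python
import numpy as np

class OnlineSRNN:
    """Toy single-layer SRNN trained online with a first-order update."""

    def __init__(self, n_in, n_hidden, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.Wx = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.Wh = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        self.w_out = rng.normal(0.0, 0.1, n_hidden)
        self.h = np.zeros(n_hidden)
        self.lr = lr

    def step(self, x, y):
        """Predict y from x, then take one SGD step; the one-step truncated
        gradient keeps the per-step cost linear in the parameter count."""
        h_prev = self.h
        self.h = np.tanh(self.Wx @ x + self.Wh @ h_prev)
        y_hat = self.w_out @ self.h
        err = y_hat - y
        g = err * self.w_out * (1.0 - self.h ** 2)   # backprop through tanh only
        self.w_out -= self.lr * err * self.h
        self.Wx -= self.lr * np.outer(g, x)
        self.Wh -= self.lr * np.outer(g, h_prev)
        return y_hat, err ** 2

# usage: one-step-ahead online prediction of a sine wave
model = OnlineSRNN(n_in=1, n_hidden=8)
series = np.sin(0.1 * np.arange(600))
losses = [model.step(series[i:i + 1], series[i + 1])[1]
          for i in range(len(series) - 1)]
```

The squared losses collected in `losses` shrink as the online updates proceed, which is the behavior this toy setup is meant to show.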
In most real-life applications, such as business domains including finance, retail, energy, and economics, time series data exhibit nonstationarity due to the temporally varying dynamics of the underlying system. We introduce a novel recurrent neural network (RNN) architecture that adaptively switches between internal regimes in a Markovian way to model the nonstationary nature of the given data. Our model, the Markovian RNN, employs a hidden Markov model (HMM) for regime transitions, where each regime controls the hidden state transitions of the recurrent cell independently. We jointly optimize the whole network in an end-to-end fashion. Through an extensive set of experiments with synthetic and real-life datasets, we demonstrate significant performance gains compared to conventional methods such as Markov-switching ARIMA, RNN variants, and recent statistical and deep learning-based methods. We also interpret the inferred parameters and regime belief values to analyze the underlying dynamics of the given sequences.

Item Open Access
Markovian RNN: an adaptive time series prediction network with HMM-based switching for nonstationary environments (Institute of Electrical and Electronics Engineers, 2021-08-09)
İlhan, Fatih; Karaahmetoğlu, Oğuzhan; Balaban, İ.; Kozat, Süleyman Serdar
We investigate nonlinear regression for nonstationary sequential data. In most real-life applications, such as business domains including finance, retail, energy, and economics, time series data exhibit nonstationarity due to the temporally varying dynamics of the underlying system. We introduce a novel recurrent neural network (RNN) architecture that adaptively switches between internal regimes in a Markovian way to model the nonstationary nature of the given data. Our model, the Markovian RNN, employs a hidden Markov model (HMM) for regime transitions, where each regime controls the hidden state transitions of the recurrent cell independently. We jointly optimize the whole network in an end-to-end fashion.
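The Markovian RNN described above couples an HMM belief update with per-regime recurrent transitions. A rough sketch of the switching mechanism alone, assuming a sticky transition matrix and a Gaussian likelihood of each regime's prediction error (all names and the exact update forms are illustrative assumptions, not the paper's formulation):

```python
import numpy as np

class MarkovianRNNSketch:
    """K regimes, each with its own recurrent weights; an HMM belief over
    regimes decides how their candidate hidden states are mixed."""

    def __init__(self, n_in, n_hidden, n_regimes, seed=0):
        rng = np.random.default_rng(seed)
        K = n_regimes
        self.Wx = rng.normal(0.0, 0.3, (K, n_hidden, n_in))
        self.Wh = rng.normal(0.0, 0.3, (K, n_hidden, n_hidden))
        self.w_out = rng.normal(0.0, 0.3, n_hidden)
        self.A = np.full((K, K), 0.1 / (K - 1))      # sticky HMM transitions
        np.fill_diagonal(self.A, 0.9)
        self.belief = np.full(K, 1.0 / K)
        self.h = np.zeros(n_hidden)

    def step(self, x, y):
        # each regime proposes its own hidden-state transition independently
        cand = np.tanh(self.Wx @ x + self.Wh @ self.h)   # (K, n_hidden)
        preds = cand @ self.w_out                        # (K,) per-regime outputs
        prior = self.A.T @ self.belief                   # HMM predict step
        y_hat = prior @ preds                            # belief-weighted output
        # correct the belief with a Gaussian likelihood of each regime's error
        post = prior * np.exp(-0.5 * (preds - y) ** 2)
        self.belief = post / post.sum()
        self.h = self.belief @ cand                      # mix hidden states
        return y_hat

# usage: a few online steps on a toy sequence
m = MarkovianRNNSketch(n_in=1, n_hidden=4, n_regimes=3)
outs = [m.step(np.array([np.sin(0.2 * k)]), np.sin(0.2 * (k + 1)))
        for k in range(50)]
```

The belief vector stays a valid probability distribution over regimes at every step, which is what lets it be interpreted to analyze the underlying dynamics.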
Through an extensive set of experiments with synthetic and real-life datasets, we demonstrate significant performance gains compared to conventional methods such as Markov-switching ARIMA, RNN variants, and recent statistical and deep learning-based methods. We also interpret the inferred parameters and regime belief values to analyze the underlying dynamics of the given sequences.

Item Open Access
Nonlinear regression with hierarchical recurrent neural networks under missing data (IEEE, 2024-10)
Şahin, Safa Onur; Kozat, Süleyman Serdar
We investigate nonlinear regression of variable-length sequential data where the data suffer from missing inputs. We introduce the hierarchical-LSTM network, a novel hierarchical architecture based on LSTM networks. The hierarchical-LSTM architecture contains a set of LSTM networks, where each LSTM network is trained as an expert for processing the inputs that follow a particular presence-pattern, i.e., we partition the input space into subspaces in a hierarchical manner based on the presence-patterns and assign specific LSTM networks to these subpatterns. We adaptively combine the outputs of these LSTM networks based on the presence-pattern and construct the final output at each time step. The introduced algorithm protects the LSTM networks against performance losses due to: 1) statistical mismatches commonly faced by widely used imputation methods; and 2) imputation drift, since our architecture uses only the existing inputs without any assumption on the missing data. In addition, the computational load of our algorithm is lower than that of conventional algorithms in terms of the number of multiplication operations, particularly under high missingness ratios. We emphasize that our architecture can be readily applied to other recurrent architectures such as RNNs and GRU networks.
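The presence-pattern idea above (one expert per pattern of observed inputs, using only existing inputs with no imputation) can be illustrated with a deliberately simplified flat, linear version; the hypothetical LMS experts below stand in for the paper's hierarchically combined LSTM experts:

```python
import numpy as np

class PresencePatternExperts:
    """One expert per presence-pattern; only observed inputs are used,
    so no imputation (and hence no imputation drift) is involved."""

    def __init__(self, lr=0.1):
        self.lr = lr
        self.experts = {}   # presence-pattern (tuple of bools) -> weights

    def step(self, x, y):
        mask = ~np.isnan(x)              # which inputs exist at this step
        key = tuple(mask.tolist())
        w = self.experts.setdefault(key, np.zeros(int(mask.sum())))
        xo = x[mask]                     # route only the existing inputs
        y_hat = float(w @ xo)
        self.experts[key] = w + self.lr * (y - y_hat) * xo   # LMS update
        return y_hat

# usage: y = x0 + 2*x1, with the second input missing on ~30% of steps
rng = np.random.default_rng(1)
model = PresencePatternExperts()
for _ in range(3000):
    x = rng.normal(size=2)
    y = x[0] + 2.0 * x[1]
    if rng.random() < 0.3:
        x = np.array([x[0], np.nan])     # second feature unobserved
    model.step(x, y)
```

Each presence-pattern gets its own expert, so the full-presence expert learns the true coefficients while the partial-presence expert learns the best fit over its observed subspace, without either being corrupted by imputed values.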
The hierarchical-LSTM network demonstrates significant performance improvements with respect to state-of-the-art methods on several well-known real-life and financial datasets. We also openly share the source code of our algorithm to facilitate other studies and the reproducibility of our results. Future work may explore selecting a subset of presence-patterns instead of using all of them, so that the hierarchical-LSTM architecture can be used with large window lengths while keeping the number of parameters and the computational load at the same level.

Item Open Access
Nonuniformly sampled data processing using LSTM networks (Institute of Electrical and Electronics Engineers, 2019)
Şahin, Safa Onur; Kozat, Süleyman Serdar
We investigate classification and regression for nonuniformly sampled variable-length sequential data and introduce a novel long short-term memory (LSTM) architecture. In particular, we extend the classical LSTM network with additional time gates, which incorporate the time information as a nonlinear scaling factor on the conventional gates. We also provide forward-pass and backward-pass update equations for the proposed LSTM architecture. We show that our approach is superior to the classical LSTM architecture when there is correlation between time samples. In our experiments, we achieve significant performance gains with respect to the classical LSTM and phased-LSTM architectures. In this sense, the proposed LSTM architecture is highly appealing for applications involving nonuniformly sampled sequential data.
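A sketch of the time-gate idea from the last listing: each conventional gate is scaled by a sigmoid function of the elapsed time Δt since the previous sample. The exact gate form, parameter names, and initialization below are my assumptions for illustration, not the paper's equations:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_hidden, seed=0):
    rng = np.random.default_rng(seed)
    p = {}
    for name in ("i", "f", "o", "g"):
        p["W" + name] = rng.normal(0.0, 0.1, (n_hidden, n_in + n_hidden))
        p["b" + name] = np.zeros(n_hidden)
    # scalar time-gate parameters for the input and forget gates
    p.update(wti=0.5, bti=0.0, wtf=-0.5, btf=2.0)
    return p

def time_lstm_step(x, dt, h, c, p):
    z = np.concatenate([x, h])
    i = sigmoid(p["Wi"] @ z + p["bi"])       # input gate
    f = sigmoid(p["Wf"] @ z + p["bf"])       # forget gate
    o = sigmoid(p["Wo"] @ z + p["bo"])       # output gate
    g = np.tanh(p["Wg"] @ z + p["bg"])       # candidate cell update
    # time gates: nonlinear scaling of the conventional gates by dt
    ti = sigmoid(p["wti"] * dt + p["bti"])   # trust new input more after long gaps
    tf = sigmoid(p["wtf"] * dt + p["btf"])   # forget more after long gaps
    c = (f * tf) * c + (i * ti) * g
    h = o * np.tanh(c)
    return h, c

# usage: process a nonuniformly sampled sequence of (value, elapsed-time) pairs
p = init_params(n_in=1, n_hidden=4)
h, c = np.zeros(4), np.zeros(4)
for value, dt in [(0.3, 0.1), (-0.2, 1.7), (0.8, 0.4)]:
    h, c = time_lstm_step(np.array([value]), dt, h, c, p)
```

Because the time gates are smooth functions of Δt, the whole cell stays differentiable, which is what makes the forward-pass and backward-pass update equations mentioned in the abstract tractable.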