Non-uniformly sampled sequential data processing
Author(s)
Advisor
Date
2019-09
Publisher
Bilkent University
Language
English
Type
Thesis
Abstract
We study classification and regression for variable length sequential data, which
is either non-uniformly sampled or contains missing samples. In most sequential
data processing studies, one assumes that the data sequence is uniformly sampled and
complete, i.e., does not contain missing input values. However, non-uniformly
sampled sequences and the missing data problem appear in a wide range of fields
such as medical imaging and financial data. To resolve these problems, certain
preprocessing techniques, statistical assumptions and imputation methods are
usually employed. However, these approaches suffer since the statistical assumptions do not hold in general and the imputation of artificially generated and
unrelated inputs deteriorates the model. To mitigate these problems, in chapter
2, we introduce a novel Long Short-Term Memory (LSTM) architecture. In particular, we extend the classical LSTM network with additional time gates, which
incorporate the time information as a nonlinear scaling factor on the conventional gates. We also provide forward pass and backward pass update equations
for the proposed LSTM architecture. We show that our approach is superior to
the classical LSTM architecture when there is correlation between time samples.
In chapter 3, we investigate regression for variable length sequential data containing missing samples and introduce a novel tree architecture based on
LSTM networks. In our architecture, we employ a variable
number of LSTM networks, which use only the existing inputs in the sequence,
in a tree-like architecture without any statistical assumptions or imputations on
the missing data. In particular, we incorporate the missingness information by
selecting a subset of these LSTM networks based on the presence pattern of a certain
number of previous inputs.
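
The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the thesis's formulation: the abstract gives no equations, so the sigmoid form of the time gate, the scalar setting, and the names `time_scaled_gate`, `presence_pattern_index`, `w_t`, `b_t` and `k` are all assumptions made for illustration. The first function scales a conventional gate by a nonlinear function of the time gap between samples. The second turns the presence pattern of the last `k` inputs into an integer that a selection among several LSTM networks could key on.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Logistic function, the usual nonlinearity for LSTM gates."""
    return 1.0 / (1.0 + math.exp(-x))


def time_scaled_gate(gate_preact: float, dt: float, w_t: float, b_t: float) -> float:
    """Scale a conventional gate by a nonlinear function of the time gap dt.

    Assumption: the time gate is sigmoid(w_t * dt + b_t); the thesis may
    use a different parameterization.
    """
    conventional = sigmoid(gate_preact)   # usual gate value in (0, 1)
    time_gate = sigmoid(w_t * dt + b_t)   # hypothetical time gate in (0, 1)
    return conventional * time_gate


def presence_pattern_index(present: Sequence[bool], k: int) -> int:
    """Map the presence pattern of the last k inputs to an integer in [0, 2**k).

    present[i] is True when the i-th input exists and False when it is missing.
    Positions before the start of the sequence are treated as missing.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    window = list(present[-k:])
    window = [False] * (k - len(window)) + window  # left-pad short histories
    index = 0
    for bit in window:
        index = (index << 1) | int(bit)  # oldest input becomes the highest bit
    return index


if __name__ == "__main__":
    # Gate value for a pre-activation of 0.3 and a time gap of 2.0 units.
    print(time_scaled_gate(0.3, dt=2.0, w_t=-0.5, b_t=1.0))
    # Pattern "missing, present" over the last two inputs.
    print(presence_pattern_index([True, False, True], k=2))
```

With `k` previous inputs there are `2**k` possible presence patterns, so an index of this kind can distinguish every pattern without imputing the missing values.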
Keywords
Long short-term memory
Recurrent neural networks

Non-uniform sampling
Missing data
Supervised learning