Relevance feedback and sparsity handling methods for temporal data
Please cite this item using this persistent URLhttp://hdl.handle.net/11693/47732
Data with temporal ordering arises in many natural and digital processes with an increasing importance and immense number of applications. This study provides solutions to data mining problems in analyzing time series both in standalone and sparse networked cases. We initially develop a methodology for browsing time series repositories by forming new time series queries based on user annotations. The result set for each query is formed using diverse selection methods to increase the effectiveness of the relevance feedback (RF) mechanism. In addition to RF, a unique aspect of time series data is considered and representation feedback methods are proposed to converge to the outperforming representation type among various transformations based on user annotations as opposed to manual selection. These methods are based on partitioning of the result set according to representation performance and a weighting approach which amplifies different features from multiple representations. We subsequently propose the utilization of autoencoders to summarize the time series into a data-aware sparse representation to both decrease computation load and increase the accuracy. Experiments on a large variety of real data sets prove that the proposed methods improve the accuracy significantly and data-aware representations have recorded similar performances while reducing the data and computational load. As a more demanding case, the time series dataset may be incomplete needing interpolation approaches to apply data mining techniques. In this regard, we analyze a sparse time series data with an underlying time varying network. We develop a methodology to generate a road network time series dataset using noisy and sparse vehicle trajectories and evaluate the result using time varying shortest path solutions.