Browsing by Subject "Time series"
Now showing 1 - 16 of 16
Item Open Access
BolT: Fused window transformers for fMRI time series analysis (Elsevier B.V., 2023-05-18)
Bedel, Hasan Atakan; Şıvgın, Irmak; Dalmaz, Onat; Ul Hassan Dar, Salman; Çukur, Tolga
Deep-learning models have enabled performance leaps in analysis of high-dimensional functional MRI (fMRI) data. Yet, many previous methods are suboptimally sensitive for contextual representations across diverse time scales. Here, we present BolT, a blood-oxygen-level-dependent transformer model, for analyzing multi-variate fMRI time series. BolT leverages a cascade of transformer encoders equipped with a novel fused window attention mechanism. Encoding is performed on temporally-overlapped windows within the time series to capture local representations. To integrate information temporally, cross-window attention is computed between base tokens in each window and fringe tokens from neighboring windows. To gradually transition from local to global representations, the extent of window overlap and thereby number of fringe tokens are progressively increased across the cascade. Finally, a novel cross-window regularization is employed to align high-level classification features across the time series. Comprehensive experiments on large-scale public datasets demonstrate the superior performance of BolT against state-of-the-art methods. Furthermore, explanatory analyses to identify landmark time points and regions that contribute most significantly to model decisions corroborate prominent neuroscientific findings in the literature.

Item Open Access
Diversity based Relevance Feedback for Time Series Search (2013)
Eravci, B.; Ferhatosmanoglu, H.
We propose a diversity based relevance feedback approach for time series data to improve the accuracy of search results. We first develop the concept of relevance feedback for time series based on dual-tree complex wavelet (CWT) and SAX based approaches.
We aim to enhance the search quality by incorporating diversity in the results presented to the user for feedback. We then propose a method which utilizes the representation type as part of the feedback, as opposed to a human choosing based on a preprocessing or training phase. The proposed methods utilize a weighting to handle the relevance feedback of important properties for both single and multiple representation cases. Our experiments on a large variety of time series data sets show that the proposed diversity based relevance feedback improves the retrieval performance. Results confirm that representation feedback incorporates item diversity implicitly and achieves good performance even when using simple nearest neighbor as the retrieval method. To the best of our knowledge, this is the first study on diversification of time series search to improve retrieval accuracy and representation feedback. © 2013 VLDB Endowment.

Item Open Access
Efficiency of the Turkish stock exchange with respect to monetary variables: a cointegration analysis (Elsevier BV, 1996)
Muradoglu, Y. G.; Metin, K.
In this study, we test the semistrong form of the efficient market hypothesis in Turkey by using the recently developed techniques in time series econometrics, namely unit roots and cointegration. The long run relationship between stock prices and inflation is investigated by assuming the possible existence of a proxy effect. Conclusions are made as to the efficiency of the Turkish Stock Exchange and its possible implications for investors.
To our knowledge, this is among the pioneering studies conducted in an emerging market that uses an updated econometric methodology to allow for an analysis of long run steady state properties together with short run dynamics.

Item Open Access
End-to-end hybrid architectures for effective sequential data prediction (2023-08)
Aydın, Mustafa Enes
We investigate nonlinear prediction in an online setting and introduce two hybrid models that effectively mitigate, via end-to-end architectures, the need for hand-designed features and manual model selection issues of conventional nonlinear prediction/regression methods. Particularly, we first use an enhanced recurrent neural network (LSTM) to extract features from sequential signals, while preserving the state information, i.e., the history, and soft gradient boosted decision trees (sGBDT) to produce the final output. The connection is in an end-to-end fashion and we jointly optimize the whole architecture using stochastic gradient descent. Secondly, we again use recursive structures (LSTM) for automatic feature extraction out of raw data but accompany it with a traditional linear time series model (SARIMAX) to deal with the intricacies of the sequential data, e.g., seasonality. The unification of the models is again in a joint manner; it is through a single state space and we optimize the entire architecture using particle filtering. The proposed frameworks are generic so that one can use other recurrent architectures, e.g., GRUs, and differentiable machine learning algorithms as well as time series models that have state space representations in lieu of the specific models presented. We demonstrate the learning behavior of the models on synthetic data and the significant performance improvements over the conventional methods and the disjoint counterparts over various real life datasets, with which we also show the generic nature of the frameworks.
Furthermore, we openly share the source code of the proposed methods to facilitate further research.

Item Open Access
Environment Kuznets curve for CO2 emissions: a cointegration analysis for China (Elsevier Ltd, 2009)
Jalil, A.; Mahmud, S. F.
This study examines the long-run relationship between carbon emissions and energy consumption, income and foreign trade in the case of China by employing time series data of 1975-2005. In particular the study aims at testing whether environmental Kuznets curve (EKC) relationship between CO2 emissions and per capita real GDP holds in the long run or not. Auto regressive distributed lag (ARDL) methodology is employed for empirical analysis. A quadratic relationship between income and CO2 emission has been found for the sample period, supporting EKC relationship. The results of Granger causality tests indicate one way causality runs through economic growth to CO2 emissions. The results of this study also indicate that the carbon emissions are mainly determined by income and energy consumption in the long run. Trade has a positive but statistically insignificant impact on CO2 emissions. © 2009 Elsevier Ltd. All rights reserved.

Item Open Access
Essays on financial development and economic growth (2007)
Şendeniz Yüncü, İlkay
The relationship between financial development and economic growth is analyzed in this dissertation. The first essay investigates the roles of banking sector development and stock market development in economic growth and the role of economic growth in banking sector development and stock market development in 64 developed and emerging markets over the period 1994-2003 using dynamic panel data techniques. In emerging markets, a statistically significant and positive interdependence is observed both between banking sector development and economic growth and between stock market development and economic growth.
The results show that in developed markets, although economic growth positively affects financial development, banking sector development and stock market development have no statistically significant effects on economic growth, supporting the demand-following view. In the second essay, the role of futures markets in economic growth is investigated using both dynamic panel data and time-series techniques. Dynamic panel estimation results give evidence of a statistically significant and positive relationship between futures market development and economic growth. The results are consistent with models, which predict that well-functioning financial markets promote economic growth. Time-series analyses results indicate that this relationship is more robust for the countries that have medium-sized futures markets. It is concluded that risk management through futures markets improves economic growth mostly in countries with developing futures markets.

Item Open Access
Modeling of spatio-temporal Hawkes processes with randomized kernels (IEEE, 2020)
İlhan, Fatih; Kozat, Süleyman Serdar
We investigate spatio-temporal event analysis using point processes. Inferring the dynamics of event sequences spatio-temporally has many practical applications including crime prediction, social media analysis, and traffic forecasting. In particular, we focus on spatio-temporal Hawkes processes that are commonly used due to their capability to capture excitations between event occurrences. We introduce a novel inference framework based on randomized transformations and gradient descent to learn the process. We replace the spatial kernel calculations by randomized Fourier feature-based transformations. The introduced randomization by this representation provides flexibility while modeling the spatial excitation between events. Moreover, the system described by the process is expressed within closed-form in terms of scalable matrix operations.
During the optimization, we use maximum likelihood estimation approach and gradient descent while properly handling positivity and orthonormality constraints. The experiment results show the improvements achieved by the introduced method in terms of fitting capability in synthetic and real-life datasets with respect to the conventional inference methods in the spatio-temporal Hawkes process literature. We also analyze the triggering interactions between event types and how their dynamics change in space and time through the interpretation of learned parameters.

Item Open Access
Online anomaly detection under Markov statistics with controllable type-I error (Institute of Electrical and Electronics Engineers Inc., 2016)
Ozkan, H.; Ozkan, F.; Kozat, S. S.
We study anomaly detection for fast streaming temporal data with real time Type-I error, i.e., false alarm rate, controllability; and propose a computationally highly efficient online algorithm, which closely achieves a specified false alarm rate while maximizing the detection power. Regardless of whether the source is stationary or nonstationary, the proposed algorithm sequentially receives a time series and learns the nominal attributes - in the online setting - under possibly varying Markov statistics. Then, an anomaly is declared at a time instance, if the observations are statistically sufficiently deviant. Moreover, the proposed algorithm is remarkably versatile since it does not require parameter tuning to match the desired rates even in the case of strong nonstationarity. The presented study is the first to provide the online implementation of Neyman-Pearson (NP) characterization for the problem such that the NP optimality, i.e., maximum detection power at a specified false alarm rate, is nearly achieved in a truly online manner.
In this regard, the proposed algorithm is highly novel and appropriate especially for the applications requiring sequential data processing at large scales/high rates due to its parameter-tuning free computational efficient design with the practical NP constraints under stationary or non-stationary source statistics. © 2015 IEEE.

Item Open Access
Online anomaly detection with minimax optimal density estimation in nonstationary environments (Institute of Electrical and Electronics Engineers, 2018)
Gokcesu, K.; Kozat, Süleyman Serdar
We introduce a truly online anomaly detection algorithm that sequentially processes data to detect anomalies in time series. In anomaly detection, while the anomalous data are arbitrary, the normal data have similarities and generally conforms to a particular model. However, the particular model that generates the normal data is generally unknown (even nonstationary) and needs to be learned sequentially. Therefore, a two stage approach is needed, where in the first stage, we construct a probability density function to model the normal data in the time series. Then, in the second stage, we threshold the density estimation of the newly observed data to detect anomalies. We approach this problem from an information theoretic perspective and propose minimax optimal schemes for both stages to create an optimal anomaly detection algorithm in a strong deterministic sense. To this end, for the first stage, we introduce a completely online density estimation algorithm that is minimax optimal with respect to the log-loss and achieves Merhav's lower bound for general nonstationary exponential-family of distributions without any assumptions on the observation sequence. For the second stage, we propose a threshold selection scheme that is minimax optimal (with logarithmic performance bounds) against the best threshold chosen in hindsight with respect to the surrogate logistic loss.
Apart from the regret bounds, through synthetic and real life experiments, we demonstrate substantial performance gains with respect to the state-of-the-art density estimation based anomaly detection algorithms in the literature.

Item Open Access
Online learning under adverse settings (2015-05)
Özkan, Hüseyin
We present novel solutions for contemporary real life applications that generate data at unforeseen rates in unpredictable forms including non-stationarity, corruptions, missing/mixed attributes and high dimensionality. In particular, we introduce novel algorithms for online learning, where the observations are received sequentially and processed only once without being stored, under adverse settings: i) no or limited assumptions can be made about the data source, ii) the observations can be corrupted and iii) the data is to be processed at extremely fast rates. The introduced algorithms are highly effective and efficient with strong mathematical guarantees; and are shown, through the presented comprehensive real life experiments, to significantly outperform the competitors under such adverse conditions. We develop a novel highly dynamical ensemble method without any stochastic assumptions on the data source. The presented method is asymptotically guaranteed to perform as well as, i.e., competitive against, the best expert in the ensemble, where the competitor, i.e., the best expert, itself is also specifically designed to continuously improve over time in a completely data adaptive manner. In addition, our algorithm achieves a significantly superior modeling power (hence, a significantly superior prediction performance) through a hierarchical and self-organizing approach while mitigating over training issues by combining (taking finite unions of) low-complexity methods. On the contrary, the state-of-the-art ensemble techniques are heavily dependent on static and unstructured expert ensembles.
In this regard, we rigorously solve the resulting issues such as the over sensitivity to source statistics as well as the incompatibility between the modeling power and the computational load/precision. Our results uniformly hold for every possible input stream in the deterministic sense regardless of the stationary or non-stationary source statistics. Furthermore, we directly address the data corruptions by developing novel versatile imputation methods and thoroughly demonstrate that the anomaly detection -in addition to being stand alone an important learning problem- is extremely effective for corruption detection/imputation purposes. To that end, as the first time in the literature, we develop the online implementation of the Neyman-Pearson characterization for anomalies in stationary or non-stationary fast streaming temporal data. The introduced anomaly detection algorithm maximizes the detection power at a specified controllable constant false alarm rate with no parameter tuning in a truly online manner. Our algorithms can process any streaming data at extremely fast rates without requiring a training phase or a priori information while bearing strong performance guarantees. Through extensive experiments over real/synthetic benchmark data sets, we also show that our algorithms significantly outperform the state-of-the-art as well as the most recently proposed techniques in the literature with remarkable adaptation capabilities to non-stationarity.

Item Open Access
Online nonlinear modeling for big data applications (2017-12)
Khan, Farhan
We investigate online nonlinear learning for several real life, adaptive signal processing and machine learning applications involving big data, and introduce algorithms that are both efficient and effective. We present novel solutions for learning from the data that is generated at high speed and/or have big dimensions in a non-stationary environment, and needs to be processed on the fly.
We specifically focus on investigating the problems arising from adverse real life conditions in a big data perspective. We propose online algorithms that are robust against the non-stationarities and corruptions in the data. We emphasize that our proposed algorithms are universally applicable to several real life applications regardless of the complexities involving high dimensionality, time varying statistics, data structures and abrupt changes. To this end, we introduce a highly robust hierarchical trees algorithm for online nonlinear learning in a high dimensional setting where the data lies on a time varying manifold. We escape the curse of dimensionality by tracking the subspace of the underlying manifold and use the projections of the original high dimensional regressor space onto the underlying manifold as the modified regressor vectors for modeling of the nonlinear system. By using the proposed algorithm, we reduce the computational complexity to the order of the depth of the tree and the memory requirement to only linear in the intrinsic dimension of the manifold. We demonstrate the significant performance gains in terms of mean square error over the other state of the art techniques through simulated as well as real data. We then consider real life applications of online nonlinear learning modeling, such as network intrusions detection, customers' churn analysis and channel estimation for underwater acoustic communication. We propose sequential and online learning methods that achieve significant performance in terms of detection accuracy, compared to the state-of-the-art techniques. We specifically introduce structured and deep learning methods to develop robust learning algorithms. Furthermore, we improve the performance of our proposed online nonlinear learning models by introducing mixture-of-experts methods and the concept of boosting.
The proposed algorithms achieve significant performance gain over the state-of-the-art methods with significantly reduced computational complexity and storage requirements in real life conditions.

Item Open Access
Scaling forecasting algorithms using clustered modeling (Association for Computing Machinery, 2015)
Gür, İ.; Güvercin, M.; Ferhatosmanoglu, H.
Research on forecasting has traditionally focused on building more accurate statistical models for a given time series. The models are mostly applied to limited data due to efficiency and scalability problems. However, many enterprise applications require scalable forecasting on large number of data series. For example, telecommunication companies need to forecast each of their customers’ traffic load to understand their usage behavior and to tailor targeted campaigns. Forecasting models are typically applied on aggregate data to estimate the total traffic volume for revenue estimation and resource planning. However, they cannot be easily applied to each user individually as building accurate models for large number of users would be time consuming. The problem is exacerbated when the forecasting process is continuous and the models need to be updated periodically. This paper addresses the problem of building and updating forecasting models continuously for multiple data series. We propose dynamic clustered modeling for forecasting by utilizing representative models as an analogy to cluster centers. We apply the models to each individual series through iterative nonlinear optimization. We develop two approaches: The Integrated Clustered Modeling integrates clustering and modeling simultaneously, and the Sequential Clustered Modeling applies them sequentially. Our findings indicate that modeling an individual’s behavior using its segment can be more scalable and accurate than the individual model itself. The grouped models avoid overfits and capture common motifs even on noisy data.
Experimental results from a telco CRM application show the method is efficient and scalable, and also more accurate than having separate individual models.

Item Open Access
Time and context sensitive optimization of machine learning models for sequential data prediction (2024-07)
Fazla, Arda
We investigate the nonlinear prediction of sequential time series data through the mixture/combination of machine learning models. First, we introduce a novel ensemble learning approach that effectively combines multiple base learners in a time-aware and context-sensitive manner. This process involves a weight optimization problem targeting a specific loss function while considering (non)convex constraints on the linear combination of base learners. These constraints are theoretically analyzed under known statistics and are automatically incorporated into the meta-learner as part of the optimization process during training. Next, we introduce a direct two-stage approach based on the combination of linear and nonlinear models, where we jointly optimize the parameters of both models to minimize the final regression error. By this joint optimization, we alleviate the well-known underfitting and overfitting problems in modeling sequential data. As the linear model, we use a traditional linear time series forecasting model (SARIMAX) and as the nonlinear model, we use boosted soft decision trees (Soft GBDT). For both of our approaches, we illustrate notable performance improvements on real-life data and well-known competition datasets compared to traditional ensemble/mixture techniques and state-of-the-art forecasting models in the machine learning literature.
Additionally, we make the source code of both of our approaches publicly available to facilitate further research and comparison.

Item Open Access
Time-aware and context-sensitive ensemble learning for sequential data (Institute of Electrical and Electronics Engineers, 2023-09-26)
Fazla, Arda; Aydın, Mustafa E.; Kozat, Suleyman Serdar
We investigate sequential time series data through ensemble learning. Conventional ensemble algorithms and the recently introduced ones have provided significant performance improvements in widely publicized time series prediction competitions for stationary data. However, recent studies are inadequate in capturing the temporally varying statistics for non-stationary data. To this end, we introduce a novel approach using a meta learner that effectively combines base learners in both a time varying and context-dependent manner. Our approach is based on solving a weight optimization problem that minimizes a specific loss function with constraints on the linear combination of the base learners. The constraints are theoretically analyzed under known statistics and integrated into the learning procedure of the meta-learner as part of the optimization in an automated manner. We demonstrate significant performance improvements on real-life data and well-known competition datasets over the widely used conventional ensemble methods and the state-of-the-art forecasting methods in the machine learning literature. Furthermore, we openly share the source code of our method to facilitate further research and comparison.

Item Open Access
Wavelet energy ratio unit root tests (Taylor and Francis Inc., 2016)
Trokić, M.
This article uses wavelet theory to propose a frequency domain nonparametric and tuning parameter-free family of unit root tests. The proposed test exploits the wavelet power spectrum of the observed series and its fractional partial sum to construct a test of the unit root based on the ratio of the resulting scaling energies.
The proposed statistic enjoys good power properties and is robust to severe size distortions even in the presence of serially correlated MA(1) errors with a highly negative moving average (MA) parameter, as well as in the presence of random additive outliers. Any remaining size distortions are effectively eliminated using a novel wavestrapping algorithm. Copyright © 2016 Taylor & Francis Group, LLC

Item Open Access
Wind power prediction using machine learning and deep learning algorithms (IEEE - Institute of Electrical and Electronics Engineers, 2023-08-28)
Şimşek, Ecem; Güngör, Ayşemüge; Karavelioğlu, Öykü; Yerli, Mustafa Tolga
In this study, we attempt to predict long-term wind power generation with machine learning and deep learning methods, using a dataset that contains the wind power generation values of 10 zones. In this context, the importance of accurately predicting renewable energy production is emphasized in connection with machine learning and deep learning methods. The methods used in the study were selected based on the literature review and the characteristics of the time series datasets. Since the dataset includes the basic wind components, a detailed feature analysis was performed, and the dataset was enriched with the newly added features. The hyperparameters of the models were optimized separately for each region in the dataset, and the models were run with these hyperparameters. The results of the models were evaluated with different error metrics and compared with each other, and the models with the lowest error scores were identified.