Browsing by Author "Ergen, Tolga"
Showing items 1-11 of 11
Item (Open Access): An efficient bandit algorithm for general weight assignments (IEEE, 2017)
Authors: Gökçesu, Kaan; Ergen, Tolga; Çiftçi, S.; Kozat, Süleyman Serdar
Abstract: In this paper, we study the adversarial multi-armed bandit problem and present a generally implementable, efficient bandit arm selection structure. Since we make no statistical assumptions on the bandit arm losses, the results in the paper are guaranteed to hold in an individual-sequence manner. The introduced framework achieves the optimal regret bounds by employing general weight assignments on bandit arm selection sequences; hence, it can be used for a wide range of applications.

Item (Open Access): Efficient online learning algorithms based on LSTM neural networks (Institute of Electrical and Electronics Engineers, 2018)
Authors: Ergen, Tolga; Kozat, Süleyman Serdar
Abstract: We investigate online nonlinear regression and introduce novel regression structures based on long short-term memory (LSTM) networks. For the introduced structures, we also provide highly efficient and effective online training methods. To train these novel LSTM-based structures, we put the underlying architecture in a state-space form and introduce highly efficient and effective particle filtering (PF) based updates. We also provide stochastic gradient descent and extended Kalman filter based updates. Our PF-based training method guarantees convergence to the optimal parameter estimate in the mean-square-error sense, provided that we have a sufficient number of particles and satisfy certain technical conditions. More importantly, we achieve this performance with a computational complexity on the order of first-order gradient-based methods by controlling the number of particles.
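The general weight-assignment framework in the bandit paper above belongs to the exponential-weights family. A minimal Exp3-style sketch follows (pure Python; the learning rate, exploration rate, and toy loss sequence are illustrative assumptions, not values from the paper):

```python
import math
import random

def exp3(losses, eta=0.1, gamma=0.05, seed=0):
    """Minimal Exp3-style adversarial bandit: exponential weights over arms
    with importance-weighted loss estimates (illustrative parameters)."""
    rng = random.Random(seed)
    n_arms = len(losses[0])
    weights = [1.0] * n_arms
    total_loss = 0.0
    for round_losses in losses:
        z = sum(weights)
        # Mix the weight-based distribution with uniform exploration.
        probs = [(1 - gamma) * w / z + gamma / n_arms for w in weights]
        arm = rng.choices(range(n_arms), weights=probs)[0]
        loss = round_losses[arm]      # only the chosen arm's loss is observed
        total_loss += loss
        est = loss / probs[arm]       # unbiased importance-weighted estimate
        weights[arm] *= math.exp(-eta * est)
    return weights, total_loss

# Toy adversarial sequence: arm 1 consistently suffers the lowest loss.
losses = [[1.0, 0.1, 0.9] for _ in range(200)]
weights, total = exp3(losses)
print(max(range(3), key=lambda i: weights[i]))  # index of the best-weighted arm
```

No statistical assumption on the losses is needed for this scheme, which matches the individual-sequence flavor of the paper's guarantees.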
Since our approach is generic, we also introduce a gated recurrent unit (GRU) based approach by directly replacing the LSTM architecture with the GRU architecture, and we demonstrate the superiority of our LSTM-based approach in the sequential prediction task on different real-life data sets. In addition, the experimental results illustrate significant performance improvements achieved by the introduced algorithms with respect to the conventional methods over several different benchmark real-life data sets.

Item (Open Access): A highly efficient recurrent neural network architecture for data regression (IEEE, 2018)
Authors: Ergen, Tolga; Ceyani, Emir
Abstract: In this paper, we study online nonlinear data regression and propose a highly efficient architecture based on long short-term memory (LSTM) networks. We also introduce online training algorithms to learn the parameters of the introduced architecture. We first propose an LSTM-based architecture for data regression. To diminish the complexity of this architecture, we use an energy-efficient operator (ef-operator) instead of the multiplication operation. We then factorize the matrices of the LSTM network to reduce the total number of parameters to be learned. To train the parameters of this structure, we introduce online learning methods based on the exponentiated gradient (EG) and stochastic gradient descent (SGD) algorithms. Experimental results demonstrate considerable performance and efficiency improvements provided by the introduced architecture.

Item (Open Access): Neural networks based online learning (IEEE, 2017)
Authors: Ergen, Tolga; Kozat, Süleyman Serdar
Abstract: In this paper, we investigate online nonlinear regression and introduce novel algorithms based on long short-term memory (LSTM) networks. We first put the underlying architecture in a nonlinear state-space form and introduce highly efficient particle filtering (PF) based updates, as well as extended Kalman filter (EKF) based updates.
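The PF-based updates described above treat the unknown network parameters as states to be filtered. A minimal bootstrap-particle-filter sketch for a single recurrent weight illustrates the idea (all model constants, noise levels, and the toy recursion are illustrative assumptions, not the papers' LSTM setup):

```python
import math
import random

def pf_estimate(xs, ys, n_particles=500, step=0.02, obs_std=0.1, seed=1):
    """Bootstrap particle filter treating an unknown scalar weight w of a
    toy recurrent model h_t = tanh(w * x_t + 0.5 * h_{t-1}) as a slowly
    drifting state (random walk). Illustrative constants throughout."""
    rng = random.Random(seed)
    particles = [rng.uniform(-2.0, 2.0) for _ in range(n_particles)]
    hidden = [0.0] * n_particles
    for x, y in zip(xs, ys):
        # Propagate: random-walk jitter on the parameter particles.
        particles = [w + rng.gauss(0.0, step) for w in particles]
        hidden = [math.tanh(w * x + 0.5 * h) for w, h in zip(particles, hidden)]
        # Weight by the Gaussian observation likelihood (log-domain for stability).
        logw = [-(y - h) ** 2 / (2 * obs_std ** 2) for h in hidden]
        m = max(logw)
        wts = [math.exp(l - m) for l in logw]
        z = sum(wts)
        wts = [w / z for w in wts]
        # Multinomial resampling keeps the particle cloud on likely parameters.
        idx = rng.choices(range(n_particles), weights=wts, k=n_particles)
        particles = [particles[i] for i in idx]
        hidden = [hidden[i] for i in idx]
    return sum(particles) / n_particles  # posterior-mean estimate of w

# Simulate data from the model with true w = 0.8, then recover it.
rng = random.Random(0)
true_w, h = 0.8, 0.0
xs, ys = [], []
for _ in range(100):
    x = rng.uniform(-1, 1)
    h = math.tanh(true_w * x + 0.5 * h)
    xs.append(x)
    ys.append(h + rng.gauss(0.0, 0.1))
est = pf_estimate(xs, ys)
print(round(est, 2))
```

The per-step cost is linear in the number of particles, which is the knob the papers use to keep complexity on the order of first-order gradient methods.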
Our PF-based training method guarantees convergence to the optimal parameter estimate under certain assumptions. We achieve this performance with a computational complexity on the order of first-order gradient-based methods by controlling the number of particles. The experimental results illustrate significant performance improvements achieved by the introduced algorithms with respect to the conventional methods.

Item (Open Access): A novel anomaly detection approach based on neural networks (Institute of Electrical and Electronics Engineers, 2018)
Authors: Ergen, Tolga; Kerpiççi, Mine
Abstract: In this paper, we introduce a Long Short-Term Memory (LSTM) network based anomaly detection algorithm that works in an unsupervised framework. We first introduce an LSTM-based structure that maps variable-length data sequences to fixed-length sequences. We then propose a scoring function for anomaly detection based on the One-Class Support Vector Machine (OC-SVM) formulation. For training, we propose a gradient-based algorithm to find the optimal parameters of both the LSTM architecture and the OC-SVM formulation. Since we modify the original OC-SVM formulation, we also provide convergence results relating the modified formulation to the original one. Thus, the proposed algorithm is able to process variable-length data sequences and provides high performance on time series data. In our experiments, we illustrate significant performance improvements with respect to the conventional methods.

Item (Open Access): A novel distributed anomaly detection algorithm based on support vector machines (Elsevier, 2020-01)
Authors: Ergen, Tolga; Kozat, Süleyman S.
Abstract: In this paper, we study anomaly detection in a distributed network of nodes and introduce a novel algorithm based on Support Vector Machines (SVMs). We first reformulate the conventional SVM optimization problem for a distributed network of nodes.
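Gradient-based training of an SVM in its primal form, which this line of work builds on, can be sketched with a Pegasos-style sub-gradient loop. This is a single-node, linear, supervised sketch under assumed hyperparameters; the papers' distributed and one-class variants add communication and a different loss:

```python
import random

def train_primal_svm(data, labels, lam=0.01, epochs=50, seed=0):
    """Primal sub-gradient descent on the regularized hinge loss
    (Pegasos-style): min_w  lam/2 ||w||^2 + (1/n) sum max(0, 1 - y_i <w, x_i>).
    Hyperparameters are illustrative."""
    rng = random.Random(seed)
    dim = len(data[0])
    w = [0.0] * dim
    t = 0
    idx = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            t += 1
            eta = 1.0 / (lam * t)  # standard decaying Pegasos step size
            margin = labels[i] * sum(wj * xj for wj, xj in zip(w, data[i]))
            # Sub-gradient: shrink w, then push along y*x on a margin violation.
            w = [(1 - eta * lam) * wj for wj in w]
            if margin < 1:
                w = [wj + eta * labels[i] * xj for wj, xj in zip(w, data[i])]
    return w

# Toy linearly separable data: the label is the sign of the first coordinate.
rng = random.Random(1)
data = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
labels = [1 if x[0] > 0 else -1 for x in data]
w = train_primal_svm(data, labels)
acc = sum(1 for x, y in zip(data, labels)
          if (1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1) == y) / len(data)
print(acc)
```

Because each update touches only one sample and the current weight vector, the per-step cost stays low, which is what makes the fully distributed, communication-limited setting of the Elsevier paper tractable.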
We then directly train the parameters of this SVM architecture in its primal form using a gradient-based algorithm in a fully distributed manner, i.e., each node in our network communicates only with its neighboring nodes in order to train the parameters. Therefore, we not only obtain a high-performing anomaly detection algorithm thanks to the strong modeling capabilities of SVMs, but also achieve significantly reduced communication load and computational complexity due to our fully distributed and efficient gradient-based training. Here, we provide a training algorithm in a supervised framework; however, we also provide extensions of our implementation to an unsupervised framework. We illustrate the performance gains achieved by our algorithm via several benchmark real-life and synthetic experiments.

Item (Open Access): Novelty detection using soft partitioning and hierarchical models (IEEE, 2017)
Authors: Ergen, Tolga; Gökçesu, Kaan; Şimşek, Mustafa; Kozat, Süleyman Serdar
Abstract: In this paper, we study the novelty detection problem and introduce an online algorithm. The algorithm sequentially receives an observation, generates a decision, and then updates its parameters. In the first step, the algorithm constructs a score function to model the underlying distribution. In the second step, this score function is thresholded to produce the final decision for the observed data. We obtain the score using a versatile and adaptive nested decision tree, employing nested soft decision trees to partition the observation space in a hierarchical manner. Based on the sequential performance, we optimize all the components of the tree structure in an adaptive manner. Although this online adaptation provides powerful modeling abilities, it might suffer from overfitting. To circumvent the overfitting problem, we employ the intermediate nodes of the tree to generate subtrees and then combine them in an adaptive manner.
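The soft hierarchical partitioning described above can be illustrated with a tiny fixed soft binary tree, where a sigmoid gate at each internal node routes a sample left or right and the score mixes leaf values by path probability. The depth, gate weights, and leaf values below are illustrative assumptions; the paper's trees are nested and adapted online:

```python
import math

def soft_tree_score(x, gates, leaves):
    """Score from a depth-2 soft binary tree: internal node n has a sigmoid
    gate sigma(<w_n, x> + b_n) giving the probability of routing right, and
    each leaf's value is weighted by the probability of its root-to-leaf path.
    (Illustrative fixed structure.)"""
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def gate(n):
        w, b = gates[n]
        return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

    p_root = gate(0)                  # P(go right at the root)
    p_l, p_r = gate(1), gate(2)       # P(go right at each child)
    path_probs = [
        (1 - p_root) * (1 - p_l),     # leaf LL
        (1 - p_root) * p_l,           # leaf LR
        p_root * (1 - p_r),           # leaf RL
        p_root * p_r,                 # leaf RR
    ]
    # Path probabilities sum to 1, so the score is a soft mixture of leaves.
    return sum(p * v for p, v in zip(path_probs, leaves))

# Gates split on the two input coordinates; leaf values act as region scores.
gates = [([4.0, 0.0], 0.0), ([0.0, 4.0], 0.0), ([0.0, 4.0], 0.0)]
leaves = [0.1, 0.4, 0.6, 0.9]
print(round(soft_tree_score([1.0, 1.0], gates, leaves), 3))
```

Because every gate is differentiable, all components of such a tree can be updated with gradient steps based on sequential performance, which is what enables the paper's adaptive combination of subtrees.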
The experiments illustrate that the introduced algorithm significantly outperforms the state-of-the-art methods.

Item (Open Access): Online distributed nonlinear regression via neural networks (IEEE, 2017)
Authors: Ergen, Tolga; Kozat, Süleyman Serdar
Abstract: In this paper, we study the nonlinear regression problem in a network of nodes and introduce long short-term memory (LSTM) based algorithms. In order to learn the parameters of the LSTM architecture in an online manner, we put the LSTM equations into a nonlinear state-space form and then introduce our distributed particle filtering (DPF) based training algorithm, which asymptotically achieves the optimal training performance. In our simulations, we illustrate the performance improvement achieved by the introduced algorithm with respect to the conventional methods.

Item (Open Access): Online learning with recurrent neural networks (2018-07)
Author: Ergen, Tolga
Abstract: In this thesis, we study online learning with Recurrent Neural Networks (RNNs). In particular, in Chapter 2, we investigate online nonlinear regression and introduce novel regression structures based on the Long Short-Term Memory (LSTM) network, an advanced RNN architecture. To train these novel LSTM-based structures, we introduce highly efficient and effective Particle Filtering (PF) based updates. We also provide Stochastic Gradient Descent (SGD) and Extended Kalman Filter (EKF) based updates. Our PF-based training method guarantees convergence to the optimal parameter estimate in the Mean Square Error (MSE) sense. In Chapter 3, we investigate online training of LSTM architectures in a distributed network of nodes, where each node employs an LSTM-based structure for online regression. We first provide a generic LSTM-based regression structure for each node. In order to train this structure, we introduce a highly effective and efficient Distributed PF (DPF) based training algorithm. We also introduce a Distributed EKF (DEKF) based training algorithm.
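Distributed training over a network of nodes, as in the entries above, typically alternates a local update with a neighborhood combination step. A minimal adapt-then-combine diffusion-LMS sketch shows the pattern (the graph, step size, and scalar linear model are illustrative assumptions, not the papers' LSTM or particle-filtering setting):

```python
import random

def diffusion_lms(neighbors, steps=300, mu=0.05, seed=2):
    """Adapt-then-combine diffusion LMS: each node takes a local stochastic-
    gradient step on its own observation, then averages the intermediate
    estimates over its neighborhood. Illustrative scalar model y = 0.7*x + noise."""
    rng = random.Random(seed)
    true_w = 0.7
    n = len(neighbors)
    w = [0.0] * n
    for _ in range(steps):
        # Adapt: local LMS step at every node on its private data.
        psi = []
        for i in range(n):
            x = rng.uniform(-1, 1)
            y = true_w * x + rng.gauss(0.0, 0.1)
            psi.append(w[i] + mu * x * (y - w[i] * x))
        # Combine: each node averages over its neighborhood (uniform weights).
        w = [sum(psi[j] for j in nb) / len(nb) for nb in neighbors]
    return w

# A 4-node line graph; each neighborhood includes the node itself.
neighbors = [[0, 1], [0, 1, 2], [1, 2, 3], [2, 3]]
estimates = diffusion_lms(neighbors)
print([round(v, 2) for v in estimates])
```

Only neighbor-to-neighbor messages are exchanged, yet every node's estimate approaches the common parameter, which is the behavior the DPF and DEKF algorithms above generalize to LSTM parameters.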
Here, our DPF-based training algorithm guarantees convergence to the performance of the optimal centralized LSTM parameters in the MSE sense. In Chapter 4, we investigate variable-length data regression in an online setting and introduce an energy-efficient regression structure built on LSTM networks. To reduce the complexity of this structure, we first replace the regular multiplication operations with an energy-efficient operator. We then apply factorizations to the weight matrices so that the total number of parameters to be trained is significantly reduced. Finally, we introduce online training algorithms. Through a set of experiments, we illustrate significant performance gains and complexity reductions achieved by the introduced algorithms with respect to the state-of-the-art methods.

Item (Open Access): Recurrent neural networks based online learning algorithms for distributed systems (Institute of Electrical and Electronics Engineers, 2018)
Authors: Ergen, Tolga; Şahin, S. Onur; Kozat, S. Serdar
Abstract: In this paper, we investigate online parameter learning for Long Short-Term Memory (LSTM) architectures in distributed networks. We first introduce an LSTM-based structure for regression. We then provide the equations of this structure in a state-space form for each node in our network. Using this form, we learn the parameters via our Distributed Particle Filtering (DPF) based training method, which asymptotically converges to the optimal parameter set provided that certain mild requirements are satisfied. While achieving this performance, our training method incurs only a computational load similar to that of efficient first-order gradient-based training methods. Through real-life experiments, we show substantial performance gains compared to the conventional methods.

Item (Open Access): Team-optimal online estimation of dynamic parameters over distributed tree networks (Elsevier, 2019)
Authors: Kılıç, O. F.; Ergen, Tolga; Sayın, M.; Kozat, Süleyman
Abstract: We study online parameter estimation over a distributed network, where the nodes collaboratively estimate a dynamically evolving parameter using noisy observations. The nodes are equipped with processing and communication capabilities and can share their observations or local estimates with their neighbors. Conventional distributed estimation algorithms cannot perform team-optimal online estimation in the finite-horizon global mean-square error (MSE) sense. To this end, we present a team-optimal distributed estimation algorithm, based on the disclosure of local estimates, for tracking an underlying dynamic parameter. We first show that the optimal estimate can be achieved through the diffusion of all time-stamped observations for any arbitrary network, and we prove that team optimality through disclosure of local estimates is possible only for certain network topologies, such as tree networks. We then derive an iterative algorithm that recursively calculates the combination weights of the disclosed information and constructs the team-optimal estimate at each time step. Through a series of simulations, we demonstrate the superior performance of the proposed algorithm with respect to state-of-the-art diffusion distributed estimation algorithms in terms of convergence rate and finite-horizon MSE levels. We also show that while conventional distributed estimation schemes cannot track highly dynamic parameters, through optimal weight and estimate construction, the proposed algorithm maintains a stable MSE performance.
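The single-node building block behind tracking a dynamically evolving parameter from noisy observations is a recursive filter. A scalar Kalman-filter sketch for a random-walk parameter illustrates this (the variances and simulated data are illustrative assumptions; the paper's team-optimal construction adds the network-wide combination of disclosed estimates):

```python
import random

def kalman_track(ys, q=0.01, r=0.04):
    """Scalar Kalman filter for a random-walk parameter
    theta_t = theta_{t-1} + w_t, observed as y_t = theta_t + v_t,
    with process variance q and observation variance r (illustrative)."""
    est, p = 0.0, 1.0
    out = []
    for y in ys:
        p += q                   # predict: the random walk inflates uncertainty
        k = p / (p + r)          # Kalman gain
        est += k * (y - est)     # correct with the innovation
        p *= (1 - k)             # posterior variance
        out.append(est)
    return out

# Track a slowly drifting parameter from noisy observations.
rng = random.Random(3)
theta, ys, truth = 0.5, [], []
for _ in range(200):
    theta += rng.gauss(0.0, 0.1)   # process noise std = sqrt(q)
    truth.append(theta)
    ys.append(theta + rng.gauss(0.0, 0.2))  # observation noise std = sqrt(r)
estimates = kalman_track(ys)
err = sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)
print(round(err, 3))
```

Filtering reduces the tracking error below the raw observation noise; the paper's contribution is showing when and how a network of such nodes can match the optimal centralized filter by disclosing only local estimates.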