Browsing by Author "Bonab, H."
Item Open Access
A broad ensemble learning system for drifting stream classification (Institute of Electrical and Electronics Engineers, 2023-08-21)
Bakhshi, Sepehr; Ghahramanian, Pouya; Bonab, H.; Can, Fazlı
In a data stream environment, classification models must effectively and efficiently handle concept drift. Ensemble methods are widely used for this purpose; however, the ones available in the literature either use a large data chunk to update the model or learn the data one by one. In the former, the model may miss the changes in the data distribution, while in the latter, the model may suffer from inefficiency and instability. To address these issues, we introduce a novel ensemble approach based on the Broad Learning System (BLS), where mini chunks are used at each update. BLS is an effective lightweight neural architecture recently developed for incremental learning. Although it is fast, it requires huge data chunks for effective updates and is unable to handle dynamic changes observed in data streams. Our proposed approach, named Broad Ensemble Learning System (BELS), uses a novel updating method that significantly improves best-in class model accuracy. It employs an ensemble of output layers to address the limitations of BLS and handle drifts. Our model tracks the changes in the accuracy of the ensemble components and reacts to these changes. We present our mathematical derivation of BELS, perform comprehensive experiments with 35 datasets that demonstrate the adaptability of our model to various drift types, and provide its hyperparameter, ablation, and imbalanced dataset performance analysis.
The experimental results show that the proposed approach outperforms 10 state-of-the-art baselines, and supplies an overall improvement of 18.59% in terms of average prequential accuracy.

Item Open Access
Less is more: a comprehensive framework for the number of components of ensemble classifiers (IEEE, 2019)
Bonab, H.; Can, Fazlı
The number of component classifiers chosen for an ensemble greatly impacts the prediction ability. In this paper, we use a geometric framework for a priori determining the ensemble size, which is applicable to most of the existing batch and online ensemble classifiers. There are only a limited number of studies on the ensemble size examining majority voting (MV) and weighted MV (WMV). Almost all of them are designed for batch-mode, hardly addressing online environments. Big data dimensions and resource limitations, in terms of time and memory, make the determination of ensemble size crucial, especially for online environments. For the MV aggregation rule, our framework proves that the more strong components we add to the ensemble, the more accurate predictions we can achieve. For the WMV aggregation rule, our framework proves the existence of an ideal number of components, which is equal to the number of class labels, with the premise that components are completely independent of each other and strong enough. While giving the exact definition for a strong and independent classifier in the context of an ensemble is a challenging task, our proposed geometric framework provides a theoretical explanation of diversity and its impact on the accuracy of predictions. We conduct a series of experimental evaluations to show the practical value of our theorems and existing challenges.

Item Open Access
A novel online stacked ensemble for multi-label stream classification (ACM, 2018)
Büyükçakır, Alican; Bonab, H.; Can, Fazlı
As data streams become more prevalent, the necessity for online algorithms that mine this transient and dynamic data becomes clearer.
Multi-label data stream classification is a supervised learning problem where each instance in the data stream is classified into one or more pre-defined sets of labels. Many methods have been proposed to tackle this problem, including but not limited to ensemble-based methods. Some of these ensemble-based methods are specifically designed to work with certain multi-label base classifiers; some others employ online bagging schemes to build their ensembles. In this study, we introduce a novel online and dynamically-weighted stacked ensemble for multi-label classification, called GOOWE-ML, that utilizes spatial modeling to assign optimal weights to its component classifiers. Our model can be used with any existing incremental multi-label classification algorithm as its base classifier. We conduct experiments with 4 GOOWE-ML-based multi-label ensembles and 7 baseline models on 7 real-world datasets from diverse areas of interest. Our experiments show that GOOWE-ML ensembles yield consistently better results in terms of predictive performance in almost all of the datasets, with respect to the other prominent ensemble models.

Item Open Access
Unsupervised concept drift detection with a discriminative classifier (Association for Computing Machinery, 2019)
Gözüaçık, Ömer; Büyükçakır, Alican; Bonab, H.; Can, Fazlı
In data stream mining, one of the biggest challenges is to develop algorithms that deal with the changing data. As data evolve over time, static models become outdated. This phenomenon is called concept drift, and it is investigated extensively in the literature. Detecting and subsequently adapting to concept drifts yield more robust and better performing models. In this study, we present an unsupervised method called D3 which uses a discriminative classifier with a sliding window to detect concept drift by monitoring changes in the feature space.
It is a simple method that can be used along with any existing classifier that does not intrinsically have a drift adaptation mechanism. We experiment on the most prevalent concept drift detectors using 8 datasets. The results demonstrate that D3 outperforms the baselines, yielding models with higher performances on both real-world and synthetic datasets.
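
The drift check described in the abstract above (a discriminative classifier applied to a sliding window of features) can be illustrated with a minimal sketch. This is not the authors' D3 implementation. The logistic-regression discriminator, the AUC score, the 0.7 threshold, and every function name below are assumptions chosen for illustration. The sketch labels instances from the older part of the window 0 and the newer part 1, trains a classifier to tell them apart using features only, and flags drift when the two parts are easy to separate. As a simplification, it scores the same instances it was trained on.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    """Linear score w.x + b for one feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_logistic(xs, ys, epochs=200, lr=0.1):
    """Fit a logistic regression by batch gradient descent (illustrative only)."""
    dim = len(xs[0])
    w = [0.0] * dim
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(score(w, b, x)) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def detect_drift(old_window, new_window, threshold=0.7):
    """Return True if a classifier can separate old from new instances.

    The threshold is an assumed value, not one taken from the abstract.
    No class labels from the stream are used, only feature vectors.
    """
    xs = old_window + new_window
    ys = [0] * len(old_window) + [1] * len(new_window)
    w, b = train_logistic(xs, ys)
    scores = [score(w, b, x) for x in xs]
    return auc(scores, ys) > threshold


if __name__ == "__main__":
    old = [[i / 10.0] for i in range(10)]
    unchanged = [list(x) for x in old]       # same feature distribution
    shifted = [[x[0] + 5.0] for x in old]    # feature values moved by +5
    print(detect_drift(old, unchanged))
    print(detect_drift(old, shifted))
```

Because the check uses only feature vectors, it could in principle run alongside any downstream classifier, which matches the abstract's claim that the detector is independent of the model it protects.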