      A theoretical framework on the ideal number of classifiers for online ensembles in data streams

      Author(s)
      Bonab, Hamed R.
      Can, Fazlı
      Date
      2016-10
      Source Title
      International Conference on Information and Knowledge Management, Proceedings
      Publisher
      ACM
      Pages
      2053 - 2056
      Language
      English
      Type
      Conference Paper
      Abstract
      Determining the ideal number of component classifiers for an ensemble a priori is an important problem. The volume and velocity of big data streams make this even more crucial in terms of prediction accuracy and resource requirements. Only a limited number of studies address this problem for batch mode, and none address it for online environments. Our theoretical framework shows that using the same number of independent component classifiers as class labels gives the highest accuracy. We prove the existence of an ideal number of classifiers for an ensemble under the weighted majority voting aggregation rule. In our experiments, we use two state-of-the-art online ensemble classifiers with six synthetic and six real-world data streams. In practice, component classifiers are not independent, which violates an assumption of our theoretical framework and makes determining the exact ideal number of classifiers nearly impossible. We therefore suggest upper bounds for the number of classifiers that gives the highest accuracy. An important implication of our study is that online ensemble classifiers should be compared at these ideal values, since comparing them at a fixed number of classifiers can be misleading. © 2016 ACM.
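      The abstract names weighted majority voting as the aggregation rule. As a rough illustration only, the following Python sketch shows the general idea: each component classifier casts a vote for a class label, and the label with the largest total weight wins. The function name, the example labels, and the weights are all hypothetical; how the paper derives its weights is not stated in this record.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label whose votes carry the largest total weight.

    predictions: one predicted label per component classifier.
    weights: one non-negative weight per component classifier
             (hypothetical values; not taken from the paper).
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")

    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight

    # On a tie, max() returns the label that was seen first.
    return max(totals, key=totals.get)


if __name__ == "__main__":
    votes = ["a", "b", "b"]
    weights = [0.9, 0.3, 0.4]
    # "b" has more votes, but "a" carries more total weight (0.9 vs 0.7).
    print(weighted_majority_vote(votes, weights))
```

      With equal weights the rule reduces to plain majority voting, so the same three votes would yield "b" instead.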
      Keywords
      Big data stream
      Ensemble size
      Weighted majority voting
      Data communication systems
      Knowledge management
      Data stream
      Ensemble classifiers
      Independent components
      Number of components
      Resource requirements
      Theoretical framework
      Big data
      Permalink
      http://hdl.handle.net/11693/37519
      Published Version (Please cite this version)
      http://dx.doi.org/10.1145/2983323.2983907
      Collections
      • Department of Computer Engineering 1510