A theoretical framework on the ideal number of classifiers for online ensembles in data streams

Bonab, Hamed R.; Can, Fazlı

dc.citation.epage	2056	en_US
dc.citation.spage	2053	en_US
dc.contributor.author	Bonab, Hamed R.	en_US
dc.contributor.author	Can, Fazlı	en_US
dc.coverage.spatial	Indianapolis, Indiana, USA
dc.date.accessioned	2018-04-12T11:42:43Z
dc.date.available	2018-04-12T11:42:43Z
dc.date.issued	2016-10	en_US
dc.department	Department of Computer Engineering	en_US
dc.description	Date of Conference: 24-28 October, 2016
dc.description	Conference name: CIKM '16 Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
dc.description.abstract	A priori determining the ideal number of component classifiers of an ensemble is an important problem. The volume and velocity of big data streams make this even more crucial in terms of prediction accuracies and resource requirements. There is a limited number of studies addressing this problem for batch mode and none for online environments. Our theoretical framework shows that using the same number of independent component classifiers as class labels gives the highest accuracy. We prove the existence of an ideal number of classifiers for an ensemble, using the weighted majority voting aggregation rule. In our experiments, we use two state-of-the-art online ensemble classifiers with six synthetic and six real-world data streams. The violation of providing independent component classifiers for our theoretical framework makes determining the exact ideal number of classifiers nearly impossible. We suggest upper bounds for the number of classifiers that gives the highest accuracy. An important implication of our study is that comparing online ensemble classifiers should be done based on these ideal values, since comparing based on a fixed number of classifiers can be misleading. © 2016 ACM.	en_US
dc.identifier.doi	10.1145/2983323.2983907	en_US
dc.identifier.uri	http://hdl.handle.net/11693/37519	en_US
dc.language.iso	English	en_US
dc.publisher	ACM	en_US
dc.relation.isversionof	http://dx.doi.org/10.1145/2983323.2983907	en_US
dc.source.title	International Conference on Information and Knowledge Management, Proceedings	en_US
dc.subject	Big data stream	en_US
dc.subject	Ensemble size	en_US
dc.subject	Weighted majority voting	en_US
dc.subject	Data communication systems	en_US
dc.subject	Knowledge management	en_US
dc.subject	Data stream	en_US
dc.subject	Ensemble classifiers	en_US
dc.subject	Independent components	en_US
dc.subject	Number of components	en_US
dc.subject	Resource requirements	en_US
dc.subject	Theoretical framework	en_US
dc.subject	Big data	en_US
dc.title	A theoretical framework on the ideal number of classifiers for online ensembles in data streams	en_US
dc.type	Conference Paper	en_US

Files

Original bundle

Name: A theoretical framework on the ideal number of classifiers for online ensembles in data streams.pdf
Size: 498.78 KB
Format: Adobe Portable Document Format
Description: Full printable version

Collections

Scholarly Publications - Computer Engineering