A theoretical framework on the ideal number of classifiers for online ensembles in data streams

dc.citation.epage: 2056 (en_US)
dc.citation.spage: 2053 (en_US)
dc.contributor.author: Bonab, Hamed R. (en_US)
dc.contributor.author: Can, Fazlı (en_US)
dc.coverage.spatial: Indianapolis, Indiana, USA
dc.date.accessioned: 2018-04-12T11:42:43Z
dc.date.available: 2018-04-12T11:42:43Z
dc.date.issued: 2016-10 (en_US)
dc.department: Department of Computer Engineering (en_US)
dc.description: Date of Conference: 24-28 October, 2016
dc.description: Conference name: CIKM '16 Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
dc.description.abstract: Determining a priori the ideal number of component classifiers for an ensemble is an important problem. The volume and velocity of big data streams make this even more crucial in terms of prediction accuracy and resource requirements. Only a limited number of studies address this problem for batch mode, and none address it for online environments. Our theoretical framework shows that using the same number of independent component classifiers as class labels gives the highest accuracy. We prove the existence of an ideal number of classifiers for an ensemble that uses the weighted majority voting aggregation rule. In our experiments, we use two state-of-the-art online ensemble classifiers with six synthetic and six real-world data streams. Because component classifiers in practice violate the independence assumption of our theoretical framework, determining the exact ideal number of classifiers is nearly impossible. We therefore suggest upper bounds for the number of classifiers that gives the highest accuracy. An important implication of our study is that online ensemble classifiers should be compared based on these ideal values, since comparisons based on a fixed number of classifiers can be misleading. © 2016 ACM. (en_US)
dc.description.provenance: Made available in DSpace on 2018-04-12T11:42:43Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 179475 bytes, checksum: ea0bedeb05ac9ccfb983c327e155f0c2 (MD5) Previous issue date: 2016 (en)
dc.identifier.doi: 10.1145/2983323.2983907 (en_US)
dc.identifier.uri: http://hdl.handle.net/11693/37519 (en_US)
dc.language.iso: English (en_US)
dc.publisher: ACM (en_US)
dc.relation.isversionof: http://dx.doi.org/10.1145/2983323.2983907 (en_US)
dc.source.title: International Conference on Information and Knowledge Management, Proceedings (en_US)
dc.subject: Big data stream (en_US)
dc.subject: Ensemble size (en_US)
dc.subject: Weighted majority voting (en_US)
dc.subject: Data communication systems (en_US)
dc.subject: Knowledge management (en_US)
dc.subject: Data stream (en_US)
dc.subject: Ensemble classifiers (en_US)
dc.subject: Independent components (en_US)
dc.subject: Number of components (en_US)
dc.subject: Resource requirements (en_US)
dc.subject: Theoretical framework (en_US)
dc.subject: Big data (en_US)
dc.title: A theoretical framework on the ideal number of classifiers for online ensembles in data streams (en_US)
dc.type: Conference Paper (en_US)

Files

Original bundle

Name: A theoretical framework on the ideal number of classifiers for online ensembles in data streams.pdf
Size: 498.78 KB
Format: Adobe Portable Document Format
Description: Full printable version