Accelerating the HyperLogLog cardinality estimation algorithm

Bozkus, C.; Fraguela, B. B.

dc.citation.volumeNumber	2017	en_US
dc.contributor.author	Bozkus, C.	en_US
dc.contributor.author	Fraguela, B. B.	en_US
dc.date.accessioned	2018-04-12T11:01:45Z
dc.date.available	2018-04-12T11:01:45Z
dc.date.issued	2017	en_US
dc.department	Department of Computer Engineering	en_US
dc.description.abstract	In recent years, vast amounts of data of different kinds, from pictures and videos from our cameras to software logs from sensor networks and Internet routers operating day and night, are being generated. This has led to new big data problems, which require new algorithms to handle these large volumes of data and as a result are very computationally demanding because of the volumes to process. In this paper, we parallelize one of these new algorithms, namely, the HyperLogLog algorithm, which estimates the number of different items in a large data set with minimal memory usage, as it lowers the typical memory usage of this type of calculation from O(n) to O(1). We have implemented parallelizations based on OpenMP and OpenCL and evaluated them in a standard multicore system, an Intel Xeon Phi, and two GPUs from different vendors. The results obtained in our experiments, in which we reach a speedup of 88.6 with respect to an optimized sequential implementation, are very positive, particularly taking into account the need to run this kind of algorithm on large amounts of data. © 2017 Cem Bozkus and Basilio B. Fraguela.	en_US
dc.identifier.doi	10.1155/2017/2040865	en_US
dc.identifier.issn	1058-9244	en_US
dc.identifier.uri	http://hdl.handle.net/11693/37065	en_US
dc.language.iso	English	en_US
dc.publisher	Hindawi Limited	en_US
dc.relation.isversionof	https://doi.org/10.1155/2017/2040865	en_US
dc.source.title	Scientific Programming	en_US
dc.subject	Application programming interfaces (API)	en_US
dc.subject	Program processors	en_US
dc.subject	Sensor networks	en_US
dc.subject	Cardinality estimations	en_US
dc.subject	Internet routers	en_US
dc.subject	Large amounts of data	en_US
dc.subject	Large datasets	en_US
dc.subject	Multi-core systems	en_US
dc.subject	Parallelizations	en_US
dc.subject	Sequential implementation	en_US
dc.subject	Software logs	en_US
dc.subject	Big data	en_US
dc.title	Accelerating the HyperLogLog cardinality estimation algorithm	en_US
dc.type	Article	en_US
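
Illustrative sketch (not taken from the article): the abstract notes that HyperLogLog estimates the number of distinct items using a fixed amount of memory. The Python below is a minimal version of the textbook algorithm, assuming a 64-bit BLAKE2b hash, 2^p registers, and only the small-range correction. The class name `HyperLogLog`, the parameter `p`, and the `merge` method are hypothetical names chosen for illustration. The register-wise maximum in `merge` is a well-known property of these sketches that allows partial results to be combined; it is not claimed to be the authors' OpenMP or OpenCL design.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch with m = 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16 in this sketch")
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        # Bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Assumption: a 64-bit BLAKE2b digest stands in for the hash function.
        digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        idx = h >> (64 - self.p)                  # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # Two sketches with the same p combine by register-wise maximum.
        if other.p != self.p:
            raise ValueError("cannot merge sketches with different p")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def estimate(self):
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw
```

With p = 12 the sketch keeps 4096 small registers regardless of input size, and the expected relative error is roughly 1.04 / sqrt(4096), about 1.6%.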

Files

Original bundle

Name: Accelerating the HyperLogLog Cardinality Estimation Algorithm.pdf
Size: 1.39 MB
Format: Adobe Portable Document Format
Description: Full printable version

Collections

Scholarly Publications - Computer Engineering