Improving the performance of similarity joins using graphics processing unit

Korkmaz, Zeynep

buir.advisor	Aykanat, Cevdet
dc.contributor.author	Korkmaz, Zeynep
dc.date.accessioned	2016-01-08T18:25:02Z
dc.date.available	2016-01-08T18:25:02Z
dc.date.issued	2012
dc.description	Cataloged from PDF version of article.	en_US
dc.description	Includes bibliographical references.	en_US
dc.description.abstract	The similarity join is an important operation in data mining and is used in many applications across varying domains. A similarity join operator takes one or two sets of data points and outputs the pairs of points whose distance in the data space is within a certain threshold value, ε. The baseline nested-loop approach computes the distances between all pairs of objects. For large sets of objects, where the nested-loop paradigm yields excessively long query times, accelerating this operator becomes more important. The computing capability of recent GPUs, with the help of a general-purpose parallel computing architecture (CUDA), has attracted many researchers. With this motivation, we propose two similarity join algorithms for the Graphics Processing Unit (GPU). To exploit the advantages of general-purpose GPU computing, we first propose an improved nested-loop join algorithm (GPU-INLJ) for the specific environment of the GPU. We also present a partitioning-based join algorithm (KMEANS-JOIN) that guarantees each partition can be joined independently without missing any join pair. Our experiments demonstrate massive performance gains and the suitability of our algorithms for large datasets.	en_US
dc.description.statementofresponsibility	Korkmaz, Zeynep	en_US
dc.format.extent	xi, 63 leaves, illustrations	en_US
dc.identifier.itemid	B134549
dc.identifier.uri	http://hdl.handle.net/11693/15817
dc.language.iso	English	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Similarity join	en_US
dc.subject	K-means clustering	en_US
dc.subject	General purpose graphics processing unit	en_US
dc.subject	CUDA	en_US
dc.subject.lcc	QA76.9.D343 K67 2012	en_US
dc.subject.lcsh	Data mining.	en_US
dc.subject.lcsh	Parallel programming (Computer science)	en_US
dc.subject.lcsh	Simulation methods.	en_US
dc.subject.lcsh	Computer simulation.	en_US
dc.title	Improving the performance of similarity joins using graphics processing unit	en_US
dc.type	Thesis	en_US
thesis.degree.discipline	Computer Engineering
thesis.degree.grantor	Bilkent University
thesis.degree.level	Master's
thesis.degree.name	MS (Master of Science)
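
For readers unfamiliar with the operator described in the abstract, the baseline nested-loop similarity join can be sketched roughly as follows. This is only an illustrative CPU sketch, not the thesis's GPU-INLJ or KMEANS-JOIN algorithm. The function name, the choice of Euclidean distance, and the inclusive comparison against the threshold are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple


def nested_loop_similarity_join(
    r: Sequence[Sequence[float]],
    s: Sequence[Sequence[float]],
    epsilon: float,
) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where distance(r[i], s[j]) <= epsilon.

    Assumptions (not taken from the thesis): Euclidean distance and an
    inclusive threshold comparison.
    """
    pairs = []
    # Compare every point in r against every point in s: O(|r| * |s|) distances.
    for i, p in enumerate(r):
        for j, q in enumerate(s):
            if math.dist(p, q) <= epsilon:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    r_points = [(0.0, 0.0), (1.0, 1.0)]
    s_points = [(0.0, 0.5), (3.0, 3.0)]
    print(nested_loop_similarity_join(r_points, s_points, epsilon=1.0))
```

The quadratic number of distance computations in the two loops is the cost that the abstract identifies as the motivation for acceleration on large datasets.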

Files

Original bundle

Name: 0006521.pdf
Size: 1.01 MB
Format: Adobe Portable Document Format

Collections

Graduate School of Engineering and Science