Improving the performance of similarity joins using graphics processing unit
buir.advisor | Aykanat, Cevdet | |
dc.contributor.author | Korkmaz, Zeynep | |
dc.date.accessioned | 2016-01-08T18:25:02Z | |
dc.date.available | 2016-01-08T18:25:02Z | |
dc.date.issued | 2012 | |
dc.description | Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2012. | en_US |
dc.description | Thesis (Master's) -- Bilkent University, 2012. | en_US |
dc.description | Includes bibliographical references. | en_US |
dc.description.abstract | The similarity join is an important operation in data mining and is used in many applications from varying domains. A similarity join operator takes one or two sets of data points and outputs pairs of points whose distance in the data space is within a certain threshold value, ε. The baseline nested loop approach computes the distances between all pairs of objects. For large sets of objects, the nested loop paradigm yields query times that are too long, so accelerating this operator becomes more important. The computing capability of recent GPUs, with the help of a general-purpose parallel computing architecture (CUDA), has attracted many researchers. With this motivation, we propose two similarity join algorithms for the Graphics Processing Unit (GPU). To exploit the advantages of general-purpose GPU computing, we first propose an improved nested loop join algorithm (GPU-INLJ) for the specific environment of the GPU. We also present a partitioning-based join algorithm (KMEANS-JOIN) that guarantees each partition can be joined independently without missing any join pair. Our experiments demonstrate massive performance gains and the suitability of our algorithms for large datasets. | en_US |
dc.description.provenance | Made available in DSpace on 2016-01-08T18:25:02Z (GMT). No. of bitstreams: 1 0006521.pdf: 1056204 bytes, checksum: 2d614939bfc2deadac0691a2e0b855a9 (MD5) | en |
dc.description.statementofresponsibility | Korkmaz, Zeynep | en_US |
dc.format.extent | xi, 63 leaves, illustrations | en_US |
dc.identifier.itemid | B134549 | |
dc.identifier.uri | http://hdl.handle.net/11693/15817 | |
dc.language.iso | English | en_US |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.subject | Similarity join | en_US |
dc.subject | K-means clustering | en_US |
dc.subject | General purpose graphics processing unit | en_US |
dc.subject | CUDA | en_US |
dc.subject.lcc | QA76.9.D343 K67 2012 | en_US |
dc.subject.lcsh | Data mining. | en_US |
dc.subject.lcsh | Parallel programming (Computer science) | en_US |
dc.subject.lcsh | Simulation methods. | en_US |
dc.subject.lcsh | Computer simulation. | en_US |
dc.title | Improving the performance of similarity joins using graphics processing unit | en_US |
dc.type | Thesis | en_US |
thesis.degree.discipline | Computer Engineering | |
thesis.degree.grantor | Bilkent University | |
thesis.degree.level | Master's | |
thesis.degree.name | MS (Master of Science) |
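
The baseline nested loop approach mentioned in the abstract can be illustrated with a minimal CPU sketch. It is not the thesis's GPU-INLJ or KMEANS-JOIN algorithm. It assumes a self-join over a single set of points, Euclidean distance, and an inclusive threshold (distance <= ε); the function name `nested_loop_similarity_join` and its parameters are hypothetical.

```python
import math


def nested_loop_similarity_join(points, eps):
    """Baseline self-join sketch: compare every pair of points once.

    Assumptions (not taken from the thesis): Euclidean distance,
    inclusive threshold, and results reported as index pairs (i, j)
    with i < j.
    """
    pairs = []
    for i in range(len(points)):
        # Start at i + 1 so each unordered pair is tested exactly once.
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
    print(nested_loop_similarity_join(sample, 1.5))
```

The double loop performs n(n-1)/2 distance computations for n points, which is the quadratic cost that motivates the GPU-based algorithms described in the abstract.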