Data decomposition techniques for parallel tree-based k-means clustering

buir.advisor: Gürsoy, Atilla
dc.contributor.author: Şen, Cenk
dc.date.accessioned: 2016-01-08T18:25:35Z
dc.date.available: 2016-01-08T18:25:35Z
dc.date.issued: 2002
dc.department: Department of Computer Engineering [en_US]
dc.description: Ankara : The Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University, 2002. [en_US]
dc.description: Thesis (Master's) -- Bilkent University, 2002. [en_US]
dc.description: Includes bibliographical references (leaves 78-80). [en_US]
dc.description.abstract: The main computation in k-means clustering is calculating the distances between cluster centroids and patterns. As the number of patterns and the number of centroids increase, the time needed to complete these computations increases. This computational load requires high-performance computers and/or algorithmic improvements. The parallel tree-based k-means algorithm on distributed-memory machines combines algorithmic improvements with the high computational capacity of parallel computers to deal with huge datasets. Its performance is affected by the data decomposition technique used. In this thesis, we present a novel data decomposition technique to improve the performance of the parallel tree-based k-means algorithm on distributed-memory machines. The proposed tree-based decomposition techniques try to decrease the total number of distance calculations by assigning compact subspaces to processors. A compact subspace improves the performance of the pruning function of the tree-based k-means algorithm. We implemented the algorithm and conducted experiments on a PC cluster. Our experimental results demonstrate that the tree-based decomposition technique outperforms the random and stripwise decomposition techniques. [en_US]
dc.description.degree: M.S. [en_US]
dc.description.provenance: Made available in DSpace on 2016-01-08T18:25:35Z (GMT). No. of bitstreams: 1 0002066.pdf: 1831069 bytes, checksum: 1ba08a7499eced7f1afce4d86930e4e1 (MD5) [en]
dc.description.statementofresponsibility: Şen, Cenk [en_US]
dc.format.extent: xii, 80 leaves, illustrations, tables [en_US]
dc.identifier.itemid: BILKUTUPB066942
dc.identifier.uri: http://hdl.handle.net/11693/15853
dc.language.iso: English [en_US]
dc.publisher: Bilkent University [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Clustering [en_US]
dc.subject: Parallel algorithm [en_US]
dc.subject: Load balancing [en_US]
dc.subject: Data decomposition [en_US]
dc.subject.lcc: QA76.58 .S46 2002 [en_US]
dc.subject.lcsh: Data mining. [en_US]
dc.subject.lcsh: Computer algorithms. [en_US]
dc.title: Data decomposition techniques for parallel tree-based k-means clustering [en_US]
dc.type: Thesis [en_US]
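
The abstract contrasts random, stripwise, and tree-based decompositions by how compact the subspace assigned to each processor is. The following Python sketch illustrates that idea only; it is not the thesis's implementation. The tree-based variant here is assumed to be a kd-tree-style median split along the widest dimension, the bounding-box diagonal is an assumed stand-in for "compactness", and all function names are hypothetical.

```python
import random


def bounding_box_diagonal(points):
    """Diagonal length of the axis-aligned bounding box of `points`."""
    dims = range(len(points[0]))
    return sum(
        (max(p[d] for p in points) - min(p[d] for p in points)) ** 2
        for d in dims
    ) ** 0.5


def random_decomposition(points, parts, seed=0):
    """Shuffle the patterns and deal them out round-robin."""
    shuffled = points[:]
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::parts] for i in range(parts)]


def stripwise_decomposition(points, parts):
    """Sort along the first coordinate and cut into contiguous strips."""
    ordered = sorted(points, key=lambda p: p[0])
    n = len(ordered)
    return [ordered[i * n // parts:(i + 1) * n // parts] for i in range(parts)]


def tree_decomposition(points, parts):
    """Assumed kd-tree-style split: halve at the median of the widest
    dimension until `parts` blocks exist (`parts` must be a power of two)."""
    if parts == 1:
        return [points]
    dims = range(len(points[0]))
    widest = max(
        dims,
        key=lambda d: max(p[d] for p in points) - min(p[d] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[widest])
    mid = len(ordered) // 2
    return (tree_decomposition(ordered[:mid], parts // 2)
            + tree_decomposition(ordered[mid:], parts // 2))


def mean_diagonal(blocks):
    """Average bounding-box diagonal over all blocks (smaller = more compact)."""
    return sum(bounding_box_diagonal(b) for b in blocks) / len(blocks)


# Synthetic 2-D patterns, uniform in the unit square (illustrative data only).
rng = random.Random(42)
points = [(rng.random(), rng.random()) for _ in range(2000)]

for name, blocks in [
    ("random", random_decomposition(points, 16)),
    ("stripwise", stripwise_decomposition(points, 16)),
    ("tree-based", tree_decomposition(points, 16)),
]:
    print(f"{name:10s} mean bounding-box diagonal: {mean_diagonal(blocks):.3f}")
```

On uniformly distributed data, the tree-style blocks should have the smallest bounding boxes and the random blocks the largest. A tighter box around a processor's patterns gives a tree-based pruning step more opportunity to rule out distant centroids, which is the effect the abstract attributes to compact subspaces.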

Files

Original bundle
Name: 0002066.pdf
Size: 1.75 MB
Format: Adobe Portable Document Format