Partitioning models for scaling distributed graph computations

buir.advisorAykanat, Cevdet
dc.contributor.authorDemirci, Gündüz Vehbi
dc.date.accessioned2019-09-10T12:54:35Z
dc.date.available2019-09-10T12:54:35Z
dc.date.copyright2019-08
dc.date.issued2019-08
dc.date.submitted2019-09-05
dc.descriptionCataloged from PDF version of article.en_US
dc.descriptionThesis (Ph.D.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2019.en_US
dc.descriptionIncludes bibliographical references (leaves 124-137).en_US
dc.description.abstractThe focus of this thesis is intelligent partitioning models and methods for scaling the performance of parallel graph computations on distributed-memory systems. Distributed databases utilize graph partitioning to provide servers with data locality and workload balance. Some queries performed on a database may form cascades because queries trigger each other. Current partitioning methods consider the graph structure and logs of the query workload. We introduce the cascade-aware graph partitioning problem with the objective of minimizing the overall cost of communication operations between servers during cascade processes. We propose a randomized algorithm that integrates the graph structure and cascade processes to use as input for large-scale partitioning. Experiments on graphs representing real social networks demonstrate the effectiveness of the proposed solution in terms of the partitioning objectives. Sparse general matrix-matrix multiplication (SpGEMM) is a key computational kernel used in scientific computing and high-performance graph computations. We propose an SpGEMM algorithm for the Accumulo database that enables high-performance distributed parallelism through its iterator framework. The proposed algorithm provides write locality and avoids scanning input matrices multiple times by utilizing Accumulo's batch scanning capability and node-level parallelism structures. We also propose a matrix partitioning scheme that reduces the total communication volume and provides workload balance among servers. Extensive experiments performed on both real-world and synthetic sparse matrices show that the proposed algorithm and matrix partitioning scheme provide significant performance improvements. The scalability of parallel SpGEMM algorithms is heavily communication bound. Multidimensional partitioning of SpGEMM's workload is essential to achieve higher scalability. We propose hypergraph models that utilize the arrangement of processors and also attain a multidimensional partitioning of SpGEMM's workload. Thorough experimentation performed on both realistic and synthetically generated SpGEMM instances demonstrates the effectiveness of the proposed partitioning models.en_US
dc.description.provenanceSubmitted by Betül Özen (ozen@bilkent.edu.tr) on 2019-09-10T12:54:35Z No. of bitstreams: 1 thesis.pdf: 1033056 bytes, checksum: f2fb6c65841c0cf70fe8bb38329ddea5 (MD5)en
dc.description.provenanceMade available in DSpace on 2019-09-10T12:54:35Z (GMT). No. of bitstreams: 1 thesis.pdf: 1033056 bytes, checksum: f2fb6c65841c0cf70fe8bb38329ddea5 (MD5) Previous issue date: 2019-09en
dc.description.statementofresponsibilityby Gündüz Vehbi Demircien_US
dc.embargo.release2020-03-05
dc.format.extentxiv, 137 leaves : charts ; 30 cm.en_US
dc.identifier.itemidB128688
dc.identifier.urihttp://hdl.handle.net/11693/52406
dc.language.isoEnglishen_US
dc.rightsinfo:eu-repo/semantics/openAccessen_US
dc.subjectGraph partitioningen_US
dc.subjectPropagation modelsen_US
dc.subjectInformation cascadeen_US
dc.subjectSocial networksen_US
dc.subjectRandomized algorithmsen_US
dc.subjectScalabilityen_US
dc.subjectDatabasesen_US
dc.subjectAccumuloen_US
dc.subjectGraphuloen_US
dc.subjectParallel and distributed computingen_US
dc.subjectSparse matricesen_US
dc.subjectSparse matrix-matrix multiplicationen_US
dc.subjectSpGEMMen_US
dc.subjectMatrix partitioningen_US
dc.subjectData localityen_US
dc.subjectHypergraph partitioningen_US
dc.subjectNumerical linear algebraen_US
dc.titlePartitioning models for scaling distributed graph computationsen_US
dc.title.alternativeDağıtık çizge hesaplamalarının ölçeklendirilmesi için bölümleme yöntemlerien_US
dc.typeThesisen_US
thesis.degree.disciplineComputer Engineering
thesis.degree.grantorBilkent University
thesis.degree.levelDoctoral
thesis.degree.namePh.D. (Doctor of Philosophy)

Files

Original bundle
Name: thesis.pdf
Size: 1008.84 KB
Format: Adobe Portable Document Format
Description: Full printable version