Improving the performance of 1D vertex parallel GNN training on distributed memory systems
buir.advisor | Aykanat, Cevdet | |
dc.contributor.author | Taşcı, Kutay | |
dc.date.accessioned | 2024-08-08T13:48:12Z | |
dc.date.available | 2024-08-08T13:48:12Z | |
dc.date.copyright | 2024-07 | |
dc.date.issued | 2024-07 | |
dc.date.submitted | 2024-08-02 | |
dc.description | Cataloged from PDF version of article. | |
dc.description | Thesis (Master's): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2024. | |
dc.description | Includes bibliographical references (leaves 49-52). | |
dc.description.abstract | Graph Neural Networks (GNNs) are pivotal for analyzing data within graph-structured domains such as social media, biological networks, and recommendation systems. Despite their advantages, scaling GNN training to large datasets in distributed settings poses significant challenges because computation and communication costs are difficult to manage. The objective of this work is to scale 1D vertex-parallel GNN training on distributed memory systems via (i) a two-constraint partitioning formulation for better computational load balancing and (ii) overlapping communication with computation to reduce communication overhead. In the proposed two-constraint formulation, the first constraint encodes the computational load balance during forward propagation, whereas the second constraint encodes the computational load balance during backward propagation. We propose three methods for overlapping communication with computation, each performing the overlap at a different level. These methods were tested against traditional approaches on benchmark datasets and demonstrated improved training efficiency without altering the model structure. The outcomes indicate that multi-constraint graph partitioning and the integration of communication and computation overlapping schemes can significantly mitigate the challenges of distributed GNN training. The research concludes with recommendations for future work, including adapting these techniques to dynamic and more complex GNN architectures, which promises further improvements in the efficiency and applicability of GNNs in real-world scenarios. | |
dc.description.provenance | Submitted by Betül Özen (ozen@bilkent.edu.tr) on 2024-08-08T13:48:12Z No. of bitstreams: 1 B138237.pdf: 363621 bytes, checksum: b4ae43abed7422047837ac44ddfb2559 (MD5) | en |
dc.description.provenance | Made available in DSpace on 2024-08-08T13:48:12Z (GMT). No. of bitstreams: 1 B138237.pdf: 363621 bytes, checksum: b4ae43abed7422047837ac44ddfb2559 (MD5) Previous issue date: 2024-07 | en |
dc.description.statementofresponsibility | by Kutay Taşcı | |
dc.format.extent | x, 52 leaves : charts ; 30 cm. | |
dc.identifier.itemid | B138237 | |
dc.identifier.uri | https://hdl.handle.net/11693/115727 | |
dc.language.iso | English | |
dc.rights | info:eu-repo/semantics/openAccess | |
dc.subject | Graph neural networks | |
dc.subject | Parallel and distributed memory systems | |
dc.subject | Graph partitioning | |
dc.subject | Load balancing | |
dc.subject | Overlapping communication with computation | |
dc.title | Improving the performance of 1D vertex parallel GNN training on distributed memory systems | |
dc.title.alternative | Dağıtık bellek sistemlerinde 1D düğüm paralel GNN eğitiminin performansının iyileştirilmesi | |
dc.type | Thesis | |
thesis.degree.discipline | Computer Engineering | |
thesis.degree.grantor | Bilkent University | |
thesis.degree.level | Master's | |
thesis.degree.name | MS (Master of Science) |
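
The two-constraint formulation described in the abstract can be illustrated with a small sketch that measures load balance separately for forward and backward propagation. This is not the thesis's implementation: the function name `constraint_imbalance`, the per-vertex weight pairs, and the toy inputs are all hypothetical, and how the thesis actually derives the two weights is not stated in this record.

```python
from typing import List, Sequence, Tuple


def constraint_imbalance(
    weights: Sequence[Tuple[float, float]],
    part_of: Sequence[int],
    num_parts: int,
) -> List[float]:
    """Return max-load / average-load for each of the two constraints.

    weights[v] is an assumed (forward_cost, backward_cost) pair for vertex v;
    part_of[v] is the part (processor) that vertex v is assigned to.
    A ratio of 1.0 means that constraint is perfectly balanced.
    """
    # Accumulate the forward and backward load of every part.
    loads = [[0.0, 0.0] for _ in range(num_parts)]
    for (w_fwd, w_bwd), p in zip(weights, part_of):
        loads[p][0] += w_fwd
        loads[p][1] += w_bwd

    ratios = []
    for c in range(2):
        avg = sum(load[c] for load in loads) / num_parts
        heaviest = max(load[c] for load in loads)
        ratios.append(heaviest / avg if avg > 0 else 1.0)
    return ratios


# Hypothetical toy input: 4 vertices assigned to 2 parts.
weights = [(3.0, 1.0), (1.0, 3.0), (2.0, 2.0), (2.0, 2.0)]
balanced = constraint_imbalance(weights, [0, 0, 1, 1], 2)
skewed = constraint_imbalance(weights, [0, 1, 1, 1], 2)
```

In this toy input the first assignment balances both constraints, while the second leaves one part overloaded in both phases by different amounts, which is the situation a single-constraint partitioner cannot distinguish.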