Reduce operations: send volume balancing while minimizing latency

buir.contributor.author: Karsavuran, M. Ozan
buir.contributor.author: Aykanat, Cevdet
dc.citation.epage: 1473 [en_US]
dc.citation.issueNumber: 6 [en_US]
dc.citation.spage: 1461 [en_US]
dc.citation.volumeNumber: 31 [en_US]
dc.contributor.author: Karsavuran, M. Ozan [en_US]
dc.contributor.author: Acer, S. [en_US]
dc.contributor.author: Aykanat, Cevdet [en_US]
dc.date.accessioned: 2021-02-11T11:43:37Z
dc.date.available: 2021-02-11T11:43:37Z
dc.date.issued: 2020
dc.department: Department of Computer Engineering [en_US]
dc.description.abstract: The communication hypergraph model was proposed in a two-phase setting for encapsulating multiple communication cost metrics (bandwidth and latency), which have proven to be important in parallelizing irregular applications. In the first phase, computational-task-to-processor assignment is performed with the objective of minimizing the total communication volume while maintaining computational load balance. In the second phase, communication-task-to-processor assignment is performed with the objective of minimizing the total number of messages while maintaining communication-volume balance. The reduce-communication hypergraph model fails to correctly encapsulate send-volume balancing. We propose a novel vertex weighting scheme that enables part weights to correctly encode the send-volume loads of processors for send-volume balancing. The model also increases the total communication volume during partitioning. To limit this increase, we propose a method that utilizes the recursive bipartitioning framework and refines each bipartition by vertex swaps. For performance evaluation, we consider column-parallel SpMV, which is one of the most widely known applications in which the reduce-task assignment problem arises. Extensive experiments on 313 matrices show that, compared to the existing model, the proposed models achieve considerable improvements in all communication cost metrics. These improvements lead to an average decrease of 30 percent in parallel SpMV time on 512 processors for 70 matrices with high irregularity. [en_US]
dc.identifier.doi: 10.1109/TPDS.2020.2964536 [en_US]
dc.identifier.issn: 1045-9219
dc.identifier.uri: http://hdl.handle.net/11693/55082
dc.language.iso: English [en_US]
dc.publisher: IEEE [en_US]
dc.relation.isversionof: https://dx.doi.org/10.1109/TPDS.2020.2964536 [en_US]
dc.source.title: IEEE Transactions on Parallel and Distributed Systems [en_US]
dc.subject: Communication hypergraph [en_US]
dc.subject: Communication cost [en_US]
dc.subject: Maximum communication volume [en_US]
dc.subject: Communication volume [en_US]
dc.subject: Latency [en_US]
dc.subject: Recursive bipartitioning [en_US]
dc.subject: Hypergraph partitioning [en_US]
dc.subject: Sparse matrix [en_US]
dc.subject: Sparse matrix-vector multiplication [en_US]
dc.title: Reduce operations: send volume balancing while minimizing latency [en_US]
dc.type: Article [en_US]

Files

Original bundle
Name: Reduce_Operations_Send_Volume_Balancing_While_Minimizing_Latency.pdf
Size: 835.75 KB
Format: Adobe Portable Document Format