Reducing latency cost in 2D sparse matrix partitioning models

buir.contributor.author: Aykanat, Cevdet
dc.citation.epage: 24 [en_US]
dc.citation.spage: 1 [en_US]
dc.citation.volumeNumber: 57 [en_US]
dc.contributor.author: Selvitopi, O. [en_US]
dc.contributor.author: Aykanat, Cevdet [en_US]
dc.date.accessioned: 2018-04-12T10:53:30Z
dc.date.available: 2018-04-12T10:53:30Z
dc.date.issued: 2016 [en_US]
dc.department: Department of Computer Engineering [en_US]
dc.description.abstract: Sparse matrix partitioning is a common technique for improving the performance of parallel linear iterative solvers. Compared to solvers for symmetric linear systems, solvers for nonsymmetric systems offer more potential for addressing multiple communication metrics, because different partitions can be adopted on the input and output vectors of sparse matrix-vector multiplication operations. In this regard, there exist works based on one-dimensional (1D) and two-dimensional (2D) fine-grain partitioning models that effectively address both bandwidth and latency costs in nonsymmetric solvers. In this work, we propose two new models based on 2D checkerboard and jagged partitioning. These models aim to minimize the total message count while maintaining a balance on the communication volume loads of processors; hence, they address both bandwidth and latency costs. We evaluate all partitioning models on two nonsymmetric system solvers implemented using the widely adopted PETSc toolkit, and we conduct extensive experiments with these solvers on a modern system (a BlueGene/Q machine), successfully scaling them up to 8K processors. Along with the proposed models, we put the practical aspects of eight evaluated models (two 1D-based and six 2D-based) under thorough analysis. To the best of our knowledge, this is the first work that analyzes the practical performance of 2D models on this scale. Among the evaluated models, those that rely on 2D jagged partitioning obtain the most promising results by striking a balance between minimizing bandwidth and latency costs. [en_US]
dc.identifier.doi: 10.1016/j.parco.2016.04.004 [en_US]
dc.identifier.issn: 0167-8191
dc.identifier.uri: http://hdl.handle.net/11693/36792
dc.language.iso: English [en_US]
dc.publisher: Elsevier BV [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1016/j.parco.2016.04.004 [en_US]
dc.source.title: Parallel Computing [en_US]
dc.subject: Bandwidth overhead [en_US]
dc.subject: Latency overhead [en_US]
dc.subject: Nonsymmetric linear systems [en_US]
dc.subject: Parallel iterative solvers [en_US]
dc.subject: Sparse matrix partitioning [en_US]
dc.subject: Sparse matrix-vector multiplication [en_US]
dc.subject: Bandwidth [en_US]
dc.subject: Costs [en_US]
dc.subject: Iterative methods [en_US]
dc.subject: Linear systems [en_US]
dc.subject: Parallel processing systems [en_US]
dc.subject: Bandwidth overheads [en_US]
dc.subject: Sparse matrices [en_US]
dc.subject: Matrix algebra [en_US]
dc.title: Reducing latency cost in 2D sparse matrix partitioning models [en_US]
dc.type: Article [en_US]
Files
Original bundle
Name: Reducing latency cost in 2D sparse matrix partitioning models.pdf
Size: 1.66 MB
Format: Adobe Portable Document Format
Description: Full Printable Version