Show simple item record

dc.contributor.author: Acer, S. [en_US]
dc.contributor.author: Selvitopi, O. [en_US]
dc.contributor.author: Aykanat, C. [en_US]
dc.date.accessioned: 2019-02-21T16:01:46Z
dc.date.available: 2019-02-21T16:01:46Z
dc.date.issued: 2018
dc.identifier.issn: 0743-7315
dc.identifier.uri: http://hdl.handle.net/11693/49915
dc.description.abstract: For the parallelization of sparse matrix-vector multiplication (SpMV) on distributed memory systems, nonzero-based fine-grain and medium-grain partitioning models attain the lowest communication volume and computational imbalance among all partitioning models. This usually comes, however, at the expense of high message count, i.e., high latency overhead. This work addresses this shortcoming by proposing new fine-grain and medium-grain models that are able to minimize communication volume and message count in a single partitioning phase. The new models utilize message nets in order to encapsulate the minimization of total message count. We further fine-tune these models by proposing delayed addition and thresholding for message nets in order to establish a trade-off between the conflicting objectives of minimizing communication volume and message count. The experiments on an extensive dataset of nearly one thousand matrices show that the proposed models improve the total message count of the original nonzero-based models by up to 27% on the average, which is reflected on the parallel runtime of SpMV as an average reduction of 15% on 512 processors.
dc.language.iso: English
dc.source.title: Journal of Parallel and Distributed Computing
dc.relation.isversionof: https://doi.org/10.1016/j.jpdc.2018.08.005
dc.subject: Communication overhead [en_US]
dc.subject: Fine-grain partitioning [en_US]
dc.subject: Hypergraph [en_US]
dc.subject: Load balancing [en_US]
dc.subject: Medium-grain partitioning [en_US]
dc.subject: Recursive bipartitioning [en_US]
dc.subject: Row-column-parallel SpMV [en_US]
dc.subject: Sparse matrix [en_US]
dc.subject: Sparse matrix-vector multiplication [en_US]
dc.title: Optimizing nonzero-based sparse matrix partitioning models via reducing latency [en_US]
dc.type: Article [en_US]
dc.department: Department of Computer Engineering
dc.citation.spage: 145 [en_US]
dc.citation.epage: 158 [en_US]
dc.citation.volumeNumber: 122 [en_US]
dc.identifier.doi: 10.1016/j.jpdc.2018.08.005
dc.publisher: Academic Press
dc.embargo.release: 2020-12-01 [en_US]


