Partitioning models for scaling parallel sparse matrix-matrix multiplication

Authors: Kadir Akbudak, Oğuz Selvitopi, Cevdet Aykanat
Volume: 4, Issue: 3, Pages: 13:1–13:34
Date available in repository: 2019-02-12
Date issued: 2018
Department: Department of Computer Engineering
Abstract: We investigate outer-product-parallel, inner-product-parallel, and row-by-row-product-parallel formulations of sparse matrix-matrix multiplication (SpGEMM) on distributed-memory architectures. For each of these three formulations, we propose a hypergraph model and a bipartite graph model for distributing SpGEMM computations based on one-dimensional (1D) partitioning of the input matrices. We also propose a communication hypergraph model for each formulation for distributing communication operations. The computational graph and hypergraph models adopted in the first phase aim at minimizing the total message volume and balancing the computational loads of processors, whereas the communication hypergraph models adopted in the second phase aim at minimizing the total message count and balancing the message-volume loads of processors. That is, the computational partitioning models reduce the bandwidth cost and the communication hypergraph models reduce the latency cost. Our extensive parallel experiments on up to 2048 processors for a wide range of realistic SpGEMM instances show that although the outer-product-parallel formulation scales better, the row-by-row-product-parallel formulation is more viable due to its significantly lower partitioning overhead and competitive scalability. For the computational partitioning models, our experimental findings indicate that the proposed bipartite graph models are attractive alternatives to their hypergraph counterparts because of their lower partitioning overhead. Finally, we show that by reducing the latency cost in addition to the bandwidth cost through the communication hypergraph models, the parallel SpGEMM time can be improved by up to 32%.
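To make the row-by-row-product formulation discussed in the abstract concrete, the following is a minimal sequential sketch (not the paper's code) of row-by-row SpGEMM, also known as Gustavson's algorithm. Matrices are represented as hypothetical dict-of-dicts sparse rows; in the 1D row-parallel scheme the paper studies, each processor would own a block of rows of A and compute the corresponding rows of C, communicating the rows of B it needs.

```python
def spgemm_row_by_row(A, B):
    """Row-by-row-product SpGEMM: C[i,:] = sum_k A[i,k] * B[k,:].

    A and B are sparse matrices stored as {row: {col: value}}.
    Returns C in the same format, keeping only nonzero rows.
    """
    C = {}
    for i, a_row in A.items():
        c_row = {}
        # Row i of C is a sparse linear combination of the rows of B
        # selected by the nonzeros in row i of A.
        for k, a_ik in a_row.items():
            for j, b_kj in B.get(k, {}).items():
                c_row[j] = c_row.get(j, 0) + a_ik * b_kj
        if c_row:
            C[i] = c_row
    return C
```

In the parallel setting, the partitioning models in the paper decide which processor owns which rows so that the accesses `B.get(k, ...)` above mostly hit locally owned rows, reducing communication.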
Provenance: Submitted by Türkan Cesur (cturkan@bilkent.edu.tr) and made available in DSpace on 2019-02-12; previous issue date: 2018-04.
DOI: 10.1145/3155292
eISSN: 2329-4957
ISSN: 2329-4949
URI: http://hdl.handle.net/11693/49306
Language: English
Publisher: Association for Computing Machinery
Is version of: http://doi.org/10.1145/3155292
Source title: ACM Transactions on Parallel Computing
Keywords: Sparse matrix-matrix multiplication; SpGEMM; Hypergraph partitioning; Graph partitioning; Communication cost; Bandwidth; Latency
Type: Article

Files

Original bundle:
Partitioning_models_for_scaling_parallel_sparse_matrix-matrix_multiplicatio.pdf (1.9 MB, Adobe Portable Document Format): full printable version

License bundle:
license.txt (1.71 KB): item-specific license agreed upon at submission