Exploiting locality in sparse matrix-matrix multiplication on many-core architectures
Date
2017
Source Title
IEEE Transactions on Parallel and Distributed Systems
Print ISSN
1045-9219
Publisher
IEEE Computer Society
Volume
28
Issue
8
Pages
2258 - 2271
Language
English
Type
Article
Abstract
Exploiting spatial and temporal localities is investigated for efficient row-by-row parallelization of the general sparse matrix-matrix multiplication (SpGEMM) operation of the form C=AB on many-core architectures. Hypergraph and bipartite graph models are proposed for 1D rowwise partitioning of matrix A to evenly partition the work across threads with the objective of reducing the number of B-matrix words to be transferred from the memory and between different caches. A hypergraph model is proposed for B-matrix column reordering to exploit spatial locality in accessing entries of thread-private temporary arrays, which are used to accumulate results for C-matrix rows. A similarity graph model is proposed for B-matrix row reordering to increase temporal reuse of these accumulation array entries. The proposed models and methods are tested on a wide range of sparse matrices from real applications, and the experiments were carried out on a 60-core Intel Xeon Phi processor, as well as a two-socket Xeon processor. Results show the validity of the models and methods proposed for enhancing the locality in parallel SpGEMM operations. © 1990-2012 IEEE.
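The row-by-row formulation and the accumulation array mentioned in the abstract can be illustrated with a minimal serial sketch. It is not the authors' implementation: the function name `spgemm_rowwise`, the CSR argument names, and the `marker` array are assumptions made for illustration, and the partitioning and reordering models themselves are not shown. In a parallel run, each thread would hold its own `acc` and `marker` arrays and process its assigned rows of A.

```python
def spgemm_rowwise(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val, n_cols_b):
    """Compute C = A * B for CSR inputs, one row of C at a time (illustrative sketch)."""
    acc = [0.0] * n_cols_b     # dense accumulation array (thread-private in a parallel run)
    marker = [-1] * n_cols_b   # marker[j] == i means acc[j] holds a partial sum for row i
    c_ptr, c_idx, c_val = [0], [], []

    for i in range(len(a_ptr) - 1):
        touched = []  # column indices written while building row i of C
        # Each nonzero a_ik scales row k of B and adds it into the accumulator.
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k, a_ik = a_idx[p], a_val[p]
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                if marker[j] != i:   # first touch of column j for this row
                    marker[j] = i
                    acc[j] = 0.0
                    touched.append(j)
                acc[j] += a_ik * b_val[q]
        # Gather the accumulated entries into row i of C in column order.
        for j in sorted(touched):
            c_idx.append(j)
            c_val.append(acc[j])
        c_ptr.append(len(c_idx))

    return c_ptr, c_idx, c_val


# Usage: A = [[1, 2], [0, 3]], B = [[0, 4], [5, 0]] in CSR form.
c_ptr, c_idx, c_val = spgemm_rowwise(
    [0, 2, 3], [0, 1, 1], [1.0, 2.0, 3.0],
    [0, 1, 2], [1, 0], [4.0, 5.0],
    n_cols_b=2,
)
```

In this sketch, the positions `j` written in `acc` are determined by the column indices of B, and the rows of B that are read are determined by the nonzero columns of each A row. These are the access patterns that the column reordering, row reordering, and rowwise partitioning described in the abstract aim to make more cache-friendly.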
Keywords
Bipartite graph model
Computational hypergraph model
Intel Xeon Phi
SpGEMM
Computer architecture
Graph theory
Bipartite graphs
Data locality
Graph clustering
Graph model
Graph partitioning
Hypergraph clustering
Hypergraph model
Hypergraph partitioning
Many-core architecture
Sparse matrices
Sparse matrix-matrix multiplications
Matrix algebra
Permalink
http://hdl.handle.net/11693/37098
Published Version (Please cite this version)
http://dx.doi.org/10.1109/TPDS.2017.2656893
Related items
Showing items related by title, author, creator and subject.
- Improving performance of sparse matrix dense matrix multiplication on large-scale parallel systems
  Acer, S.; Selvitopi, O.; Aykanat, Cevdet (Elsevier BV, 2016)
  We propose a comprehensive and generic framework to minimize multiple and different volume-based communication cost metrics for sparse matrix dense matrix multiplication (SpMM). SpMM is an important kernel that finds ...
- Encapsulating multiple communication-cost metrics in partitioning sparse rectangular matrices for parallel matrix-vector multiplies
  Uçar, B.; Aykanat, Cevdet (SIAM, 2004)
  This paper addresses the problem of one-dimensional partitioning of structurally unsymmetric square and rectangular sparse matrices for parallel matrix-vector and matrix-transpose-vector multiplies. The objective is to ...
- Spatiotemporal graph and hypergraph partitioning models for sparse matrix-vector multiplication on many-core architectures
  Abubaker, Nabil; Akbudak, K.; Aykanat, Cevdet (IEEE Computer Society, 2019)
  There exist graph/hypergraph partitioning-based row/column reordering methods for encoding either spatial or temporal locality for sparse matrix-vector multiplication (SpMV) operations. Spatial and temporal hypergraph ...