Browsing by Subject "Sparse Matrix"

Now showing 1 - 2 of 2

Open Access
Hypergraph partitioning based models and methods for exploiting cache locality in sparse matrix-vector multiplication
(Society for Industrial and Applied Mathematics, 2013-02-27) Akbudak, K.; Kayaaslan, E.; Aykanat, Cevdet
Sparse matrix-vector multiplication (SpMxV) is a kernel operation widely used in iterative linear solvers. The same sparse matrix is multiplied by a dense vector repeatedly in these solvers. Matrices with irregular sparsity patterns make it difficult to utilize cache locality effectively in SpMxV computations. In this work, we investigate single-and multiple-SpMxV frameworks for exploiting cache locality in SpMxV computations. For the single-SpMxV framework, we propose two cache-size-aware row/column reordering methods based on one-dimensional (1D) and two-dimensional (2D) top-down sparse matrix partitioning. We utilize the column-net hypergraph model for the 1D method and enhance the row-column-net hypergraph model for the 2D method. The primary aim in both of the proposed methods is to maximize the exploitation of temporal locality in accessing input vector entries. The multiple-SpMxV framework depends on splitting a given matrix into a sum of multiple nonzero-disjoint matrices. We propose a cache-size-aware splitting method based on 2D top-down sparse matrix partitioning by utilizing the row-column-net hypergraph model. The aim in this proposed method is to maximize the exploitation of temporal locality in accessing both input-and output-vector entries. We evaluate the validity of our models and methods on a wide range of sparse matrices using both cache-miss simulations and actual runs by using OSKI. Experimental results show that proposed methods and models outperform state-of-the-art schemes. (c)2013 Society for Industrial and Applied Mathematics
Open Access
Increasing data reuse in parallel sparse matrix-vector and matrix-transpose-vector multiply on shared-memory architectures
(2014) Karsavuran, Mustafa Ozan
Sparse matrix-vector and matrix-transpose-vector multiplications (Sparse AAT x) are the kernel operations used in iterative solvers. Sparsity pattern of the input matrix A, as well as its transpose, remains the same throughout the iterations. CPU cache could not be used properly during these Sparse AAT x operations due to irregular sparsity pattern of the matrix. We propose two parallelization strategies for Sparse AAT x. Our methods partition A matrix in order to exploit cache locality for matrix nonzeros and vector entries. We conduct experiments on the recently-released Intel R Xeon PhiTM coprocessor involving large variety of sparse matrices. Experimental results show that proposed methods achieve higher performance improvement than the state-of-the-art methods in the literature.