Browsing by Subject "Graph model"

Item Open Access
Exploiting locality in sparse matrix-matrix multiplication on many-core architectures
(IEEE Computer Society, 2017) Akbudak, K.; Aykanat, Cevdet
Exploiting spatial and temporal localities is investigated for efficient row-by-row parallelization of the general sparse matrix-matrix multiplication (SpGEMM) operation of the form C=AB on many-core architectures. Hypergraph and bipartite graph models are proposed for 1D rowwise partitioning of matrix A to evenly partition the work across threads with the objective of reducing the number of B-matrix words to be transferred from the memory and between different caches. A hypergraph model is proposed for B-matrix column reordering to exploit spatial locality in accessing entries of thread-private temporary arrays, which are used to accumulate results for C-matrix rows. A similarity graph model is proposed for B-matrix row reordering to increase temporal reuse of these accumulation array entries. The proposed models and methods are tested on a wide range of sparse matrices from real applications, and the experiments were carried out on a 60-core Intel Xeon Phi processor, as well as a two-socket Xeon processor. Results show the validity of the models and methods proposed for enhancing the locality in parallel SpGEMM operations. © 1990-2012 IEEE.

Item Open Access
Reordering methods for exploiting spatial and temporal localities in parallel sparse matrix-vector multiplication
(2016-08) AbuBaker, Nabil
Sparse matrix-vector multiplication (SpMV) is a very important kernel operation for many scientific applications. For irregular sparse matrices, the SpMV operation suffers from poor cache performance due to the irregular accesses of the input vector entries.
In this work, we propose row and column reordering methods based on graph partitioning (GP) and hypergraph partitioning (HP) in order to exploit spatial and temporal localities in accessing input vector entries by clustering rows/columns with a similar sparsity pattern close to each other. The proposed methods exploit spatial and temporal localities separately (using either rows or columns of the matrix in a GP or HP method), simultaneously (using both rows and columns), and in a two-phased manner (using either rows or columns in each phase). We evaluate the validity of the proposed models on a 60-core Xeon Phi co-processor for a large set of sparse matrices arising from different applications. The performance results confirm the validity and the effectiveness of the proposed methods and models.

Item Open Access
Spatiotemporal graph and hypergraph partitioning models for sparse matrix-vector multiplication on many-core architectures
(IEEE Computer Society, 2019) Abubaker, Nabil; Akbudak, K.; Aykanat, Cevdet
There exist graph/hypergraph partitioning-based row/column reordering methods for encoding either spatial or temporal locality for sparse matrix-vector multiplication (SpMV) operations. Spatial and temporal hypergraph models in these methods are extended to encapsulate both spatial and temporal localities based on cut/uncut net categorization obtained from vertex partitioning. These extensions of spatial and temporal hypergraph models encode the spatial locality primarily and the temporal locality secondarily, and vice versa, respectively. However, the literature lacks models that simultaneously encode both spatial and temporal localities utilizing only vertex partitioning for further improving the performance of SpMV on shared-memory architectures. In order to fill this gap, we propose a novel spatiotemporal hypergraph model that leads to a one-phase spatiotemporal reordering method which encodes both types of locality simultaneously.
We also propose a framework for spatiotemporal methods which encodes both types of locality in two dependent phases and two separate phases. The validity of the proposed spatiotemporal models and methods is tested on a wide range of sparse matrices, and the experiments are performed on both a 60-core Intel Xeon Phi processor and a Xeon processor. Results show the validity of the methods via almost doubling the Gflop/s performance through enhancing data locality in parallel SpMV operations.
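
The abstracts listed here share one idea: SpMV reads the input vector at positions given by each row's column indices, and relabeling the columns can bring the entries read together closer to each other without changing the result. The sketch below illustrates only that idea. It is not taken from any of the listed works: the CSR layout, the function names (`spmv_csr`, `reorder_columns`, `x_access_span`), and the crude span-based locality proxy are all assumptions made for illustration, and the column permutation in the usage note is chosen by hand rather than produced by a graph or hypergraph partitioner.

```python
def spmv_csr(indptr, indices, data, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            # x is read at position indices[k]; for irregular matrices
            # these reads jump around the input vector.
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def reorder_columns(indices, x, new_pos):
    """Relabel column j as new_pos[j] and move the entries of x to match.

    new_pos is a hypothetical, hand-supplied permutation; a real method
    would derive it from a partitioning model.
    """
    new_indices = [new_pos[j] for j in indices]
    new_x = [0.0] * len(x)
    for j, value in enumerate(x):
        new_x[new_pos[j]] = value
    return new_indices, new_x


def x_access_span(indptr, indices):
    """Illustrative locality proxy: sum over rows of (max col - min col)."""
    total = 0
    for row in range(len(indptr) - 1):
        cols = indices[indptr[row]:indptr[row + 1]]
        if cols:
            total += max(cols) - min(cols)
    return total
```

For a 3-by-4 matrix whose rows touch columns {0, 3}, {1, 2}, and {0, 3}, the mapping `new_pos = [0, 2, 3, 1]` places columns 0 and 3 next to each other. `spmv_csr` returns the same output vector before and after the relabeling, while `x_access_span` drops from 7 to 3. A smaller span only suggests that each row's reads fall in fewer cache lines; it is not a performance measurement.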