Browsing by Author "Akbudak, Kadir"
Item Open Access Cache locality exploiting methods and models for sparse matrix-vector multiplication (Bilkent University, 2009) Akbudak, Kadir
The sparse matrix-vector multiplication (SpMxV) is an important kernel operation widely used in linear solvers. The same sparse matrix is multiplied by a dense vector repeatedly in these solvers to solve a system of linear equations. High performance gains can be obtained if we can take advantage of today’s deep cache hierarchy in SpMxV operations. Matrices with irregular sparsity patterns make it difficult to utilize data locality effectively in SpMxV computations. Different techniques are proposed in the literature to utilize the cache hierarchy effectively via exploiting data locality during SpMxV. In this work, we investigate two distinct frameworks for cache-aware/oblivious SpMxV: single matrix-vector multiply and multiple submatrix-vector multiplies. For the single matrix-vector multiply framework, we propose a cache-size aware top-down row/column-reordering approach based on 1D sparse matrix partitioning by utilizing the recently proposed appropriate hypergraph models of sparse matrices, and a cache-oblivious bottom-up approach based on hierarchical clustering of rows/columns with similar sparsity patterns. We also propose a column compression scheme as a preprocessing step which makes these two approaches cache-line-size aware. The multiple submatrix-vector multiplies framework depends on partitioning the matrix into multiple nonzero-disjoint submatrices. For an effective matrix-to-submatrix partitioning required in this framework, we propose a cache-size aware top-down approach based on 2D sparse matrix partitioning by utilizing the recently proposed fine-grain hypergraph model. For this framework, we also propose a traveling salesman formulation for an effective ordering of individual submatrix-vector multiply operations. We evaluate the validity of our models and methods on a wide range of sparse matrices.
Experimental results show that the proposed methods and models outperform state-of-the-art schemes.

Item Open Access Hypergraph models for parallel sparse matrix-matrix multiplication (Bilkent University, 2015-09) Akbudak, Kadir
Multiplication of two sparse matrices (i.e., sparse matrix-matrix multiplication, which is abbreviated as SpGEMM) is a widely used kernel in many applications such as molecular dynamics simulations, graph operations, and linear programming. We identify parallel formulations of the SpGEMM operation in the form of C = AB for distributed-memory architectures. Using these formulations, we propose parallel SpGEMM algorithms that have multiplication and communication phases: the multiplication phase consists of local SpGEMM computations without any communication, and the communication phase consists of transferring required input/output matrices. For these algorithms, three hypergraph models are proposed. These models are used to partition input and output matrices simultaneously. The input matrices A and B are partitioned in one dimension in all of these hypergraph models. The output matrix C is partitioned in two dimensions, which is nonzero-based, in the first hypergraph model, and it is partitioned in one dimension in the second and third models. In partitioning of these hypergraph models, the constraint on vertex weights corresponds to computational load balancing among processors for the multiplication phase of the proposed SpGEMM algorithms, and the objective, which is minimizing the cutsize defined in terms of costs of the cut hyperedges, corresponds to minimizing the communication volume due to transferring required matrix entries in the communication phase of the SpGEMM algorithms. We also propose models for reducing the total number of messages while maintaining balance on communication volumes handled by processors during the communication phase of the SpGEMM algorithms.
An SpGEMM library for distributed memory architectures is developed in order to verify the empirical validity of our models. The library uses MPI (Message Passing Interface) for performing communication in the parallel setting. The developed SpGEMM library is run on SpGEMM instances from various realistic applications and the experiments are carried out on a large parallel IBM BlueGene/Q system, named JUQUEEN. In the experimentation of the proposed hypergraph models, high speedup values are observed.

Item Open Access Parallelization of Sparse Matrix Kernels for big data applications (Springer, 2016) Selvitopi, Oğuz; Akbudak, Kadir; Aykanat, Cevdet; Pop, F.; Kołodziej, J.; Di Martino, B.
Analysis of big data on large-scale distributed systems often necessitates efficient parallel graph algorithms that are used to explore the relationships between individual components. Graph algorithms use the basic adjacency list representation for graphs, which can also be viewed as a sparse matrix. This correspondence between representation of graphs and sparse matrices makes it possible to express many important graph algorithms in terms of basic sparse matrix operations, where the literature for optimization is more mature. For example, graph analytic libraries such as Pegasus and Combinatorial BLAS use sparse matrix kernels for a wide variety of operations on graphs. In this work, we focus on two such important sparse matrix kernels: sparse matrix–sparse matrix multiplication (SpGEMM) and sparse matrix–dense matrix multiplication (SpMM). We propose partitioning models for efficient parallelization of these kernels on large-scale distributed systems. Our models aim at reducing and improving communication volume while balancing computational load, which are two vital performance metrics on distributed systems.
We show that by exploiting sparsity patterns of the matrices through our models, the parallel performance of SpGEMM and SpMM operations can be significantly improved.

Item Open Access Partitioning models for scaling parallel sparse matrix-matrix multiplication (Association for Computing Machinery, 2018) Akbudak, Kadir; Selvitopi, Oğuz; Aykanat, Cevdet
We investigate outer-product–parallel, inner-product–parallel, and row-by-row-product–parallel formulations of sparse matrix-matrix multiplication (SpGEMM) on distributed memory architectures. For each of these three formulations, we propose a hypergraph model and a bipartite graph model for distributing SpGEMM computations based on one-dimensional (1D) partitioning of input matrices. We also propose a communication hypergraph model for each formulation for distributing communication operations. The computational graph and hypergraph models adopted in the first phase aim at minimizing the total message volume and balancing the computational loads of processors, whereas the communication hypergraph models adopted in the second phase aim at minimizing the total message count and balancing the message volume loads of processors. That is, the computational partitioning models reduce the bandwidth cost and the communication hypergraph models reduce the latency cost. Our extensive parallel experiments on up to 2048 processors for a wide range of realistic SpGEMM instances show that although the outer-product–parallel formulation scales better, the row-by-row-product–parallel formulation is more viable due to its significantly lower partitioning overhead and competitive scalability. For computational partitioning models, our experimental findings indicate that the proposed bipartite graph models are attractive alternatives to their hypergraph counterparts because of their lower partitioning overhead.
Finally, we show that by reducing the latency cost in addition to the bandwidth cost through the communication hypergraph models, the parallel SpGEMM time can be further improved by up to 32%.
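
For readers unfamiliar with the three SpGEMM formulations named in the last abstract, the following is a minimal single-process sketch of how each one organizes the computation of C = AB. It is not the authors' distributed implementation and involves no partitioning or communication; the dict-of-dicts matrix layout and all function names are assumptions made for illustration only.

```python
from collections import defaultdict

# Assumed layout (illustrative only): a sparse matrix is a dict of rows,
# {row_index: {col_index: value}}, storing nonzeros only.

def to_columns(M):
    """Return the column-major view {col: {row: value}} of a row-major matrix."""
    cols = defaultdict(dict)
    for i, row in M.items():
        for j, v in row.items():
            cols[j][i] = v
    return dict(cols)

def spgemm_row_by_row(A, B):
    """Row i of C is the sum of rows k of B scaled by the nonzeros a_ik of row i of A."""
    C = {}
    for i, a_row in A.items():
        acc = defaultdict(float)
        for k, a_ik in a_row.items():
            for j, b_kj in B.get(k, {}).items():
                acc[j] += a_ik * b_kj
        if acc:
            C[i] = dict(acc)
    return C

def spgemm_outer_product(A, B):
    """C is the sum over k of the outer product (column k of A) x (row k of B)."""
    C = defaultdict(lambda: defaultdict(float))
    for k, a_col in to_columns(A).items():
        b_row = B.get(k, {})
        for i, a_ik in a_col.items():
            for j, b_kj in b_row.items():
                C[i][j] += a_ik * b_kj
    return {i: dict(row) for i, row in C.items()}

def spgemm_inner_product(A, B):
    """Entry c_ij is the inner product of row i of A and column j of B."""
    B_cols = to_columns(B)
    C = {}
    for i, a_row in A.items():
        for j, b_col in B_cols.items():
            common = a_row.keys() & b_col.keys()  # indices k with a_ik and b_kj both nonzero
            if common:
                C.setdefault(i, {})[j] = float(sum(a_row[k] * b_col[k] for k in common))
    return C
```

All three functions compute the same product; they differ only in which unit of work is atomic (a row of C, an outer product contributing partial results to many entries of C, or a single entry of C), which is the distinction the partitioning models in the abstract build on.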