Vectorization and parallelization of the conjugate gradient algorithm on hypercube-connected vector processors
Scott, D. S.
Microprocessing and Microprogramming
67 - 82
Solution of large sparse linear systems of equations in the form Ax = b constitutes a significant amount of the computation in the simulation of physical phenomena. For example, the finite element discretization of a regular domain, with proper ordering of the variables x, renders a banded N × N coefficient matrix A. The Conjugate Gradient (CG) algorithm [2,3] is an iterative method for solving sparse matrix equations and is widely used because of its convergence properties. This paper describes an implementation of the Conjugate Gradient algorithm that exploits both vectorization and parallelization on a 2-dimensional hypercube with vector processors at each node (iPSC-VX/d2). The implementation achieves efficient parallelization by using a version of the CG algorithm suited to coarse-grain parallelism [4,5], which reduces the number of communication steps required, and by overlapping the computations on the vector processor with internode communication. With parallelization and vectorization, a speedup of 58 over a μVax II is obtained for large problems on a two-dimensional vector hypercube (iPSC-VX/d2).
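For orientation, the textbook serial CG iteration can be sketched in a few lines of Python. This is a minimal illustration only: it is not the coarse-grain variant of [4,5] or the hypercube implementation described in the paper. The function names (`matvec_tridiag`, `dot`, `conjugate_gradient`), the constant-coefficient tridiagonal test matrix, and the tolerance are all assumptions chosen for the example. The two inner products per iteration are the points where a distributed implementation would typically need global communication.

```python
def matvec_tridiag(diag, off, x):
    """Return y = A x for a symmetric tridiagonal (banded) matrix A
    with constant diagonal `diag` and constant off-diagonal `off`."""
    n = len(x)
    y = [diag * x[i] for i in range(n)]
    for i in range(n - 1):
        y[i] += off * x[i + 1]
        y[i + 1] += off * x[i]
    return y


def dot(u, v):
    """Inner product; in a parallel setting this is a global reduction."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A, given only
    a function `matvec` that applies A to a vector.
    Returns (x, number_of_iterations_performed)."""
    n = len(b)
    if max_iter is None:
        max_iter = 2 * n
    x = [0.0] * n
    r = list(b)          # residual r = b - A x, with x = 0 initially
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for k in range(max_iter):
        if rs_old ** 0.5 < tol:
            return x, k
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)            # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                 # direction update factor
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter


if __name__ == "__main__":
    # Hypothetical test problem: 1-D Laplacian-like banded matrix.
    x_true = [float(i) for i in range(1, 9)]
    b = matvec_tridiag(2.0, -1.0, x_true)
    x, iters = conjugate_gradient(lambda v: matvec_tridiag(2.0, -1.0, v), b)
    print(iters, x)
```

Passing the matrix as a `matvec` function rather than a stored array keeps the sketch independent of the storage scheme, so the same loop would work with any banded or sparse format.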
Conjugate gradient algorithm
Hypercube-connected vector processors