Reducing communication overhead in sparse matrix and tensor computations

Karsavuran, Mustafa Ozan

Available

The embargo period has ended, and this item is now available.

Files

10349631.pdf (3.04 MB)

Date

2020-08

Authors

Karsavuran, Mustafa Ozan

Advisor

Aykanat, Cevdet

Abstract

Encapsulating multiple communication cost metrics, i.e., bandwidth and latency, has proven important in reducing communication overhead in the parallelization of sparse and irregular applications. The communication hypergraph model was proposed in a two-phase setting for encapsulating multiple communication cost metrics. The reduce-communication hypergraph model fails to correctly encapsulate send-volume balancing. We propose a novel vertex weighting scheme that enables part weights to correctly encode the send-volume loads of processors for send-volume balancing. The model also increases the total communication volume during partitioning. To limit this increase, we propose a method that utilizes the recursive bipartitioning (RB) paradigm and refines each bipartition by vertex swaps. For performance evaluation, we consider column-parallel sparse matrix-vector multiplication (SpMV), which is one of the most widely known applications in which the reduce-task assignment problem arises. Extensive experiments on 313 matrices show that, compared to the existing model, the proposed models achieve considerable improvements in all communication cost metrics. These improvements lead to an average decrease of 30 percent in parallel SpMV time on 512 processors for 70 matrices with high irregularity.
We further enhance the reduce-communication hypergraph model so that it also encapsulates the minimization of the maximum number of messages sent by a processor. For this purpose, we propose a novel cutsize metric, which we realize using the RB paradigm while partitioning the reduce-communication hypergraph. We also introduce a new type of net for the communication hypergraph, which models limiting the increase in the total communication volume directly in the partitioning objective. Experiments on 300 matrices show that the proposed models achieve considerable improvements in communication cost metrics, which lead to better column-parallel SpMM time on 1024 processors.
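To make the cost metrics named above concrete, the following minimal sketch computes them for the reduce phase of a column-parallel SpMV. It is an illustration under stated assumptions, not the thesis's model or code: the function name `reduce_comm_metrics` and its parameters are hypothetical, each partial result sent to the owner of an output entry is counted as one word, and all words a processor sends to the same destination are assumed to travel in a single message.

```python
from collections import defaultdict

def reduce_comm_metrics(nonzeros, col_part, row_owner, num_procs):
    """Communication metrics of the reduce phase in column-parallel SpMV.

    nonzeros  : iterable of (i, j) coordinates of the sparse matrix
    col_part  : col_part[j] is the processor that owns column j (and x_j)
    row_owner : row_owner[i] is the processor that owns output entry y_i
    All names are hypothetical; this is a sketch, not the thesis's code.
    """
    # Processors that compute a partial result for each output entry y_i.
    producers = defaultdict(set)
    for i, j in nonzeros:
        producers[i].add(col_part[j])

    send_volume = [0] * num_procs                     # words sent per processor
    send_targets = [set() for _ in range(num_procs)]  # distinct destinations
    for i, procs in producers.items():
        owner = row_owner[i]
        for p in procs:
            if p != owner:
                send_volume[p] += 1        # one partial result for y_i
                send_targets[p].add(owner)

    msg_counts = [len(t) for t in send_targets]
    return {
        "total_volume": sum(send_volume),       # bandwidth metric
        "max_send_volume": max(send_volume),    # send-volume balance
        "total_messages": sum(msg_counts),      # latency metric
        "max_send_messages": max(msg_counts),   # max messages sent by a processor
    }

# A 3x3 example with one column per processor.
nonzeros = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 0), (2, 2)]
col_part = [0, 1, 2]
balanced = reduce_comm_metrics(nonzeros, col_part, [0, 1, 2], 3)
skewed = reduce_comm_metrics(nonzeros, col_part, [1, 1, 2], 3)
```

In this small example both reduce-task assignments have the same total volume, but the second one concentrates two words and two messages on processor 0, which is the kind of imbalance that send-volume balancing and the maximum-message metric are meant to capture.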
We propose a hypergraph model for general medium-grain sparse tensor partitioning that does not enforce any topological constraint on the partitioning. The proposed model is based on splitting the given tensor into nonzero-disjoint component tensors. A mode-dependent coarse-grain hypergraph is then constructed for each component tensor. We propose a net amalgamation operation that forms a composite medium-grain hypergraph from these mode-dependent coarse-grain hypergraphs to correctly encapsulate the minimization of the communication volume. We propose a heuristic that splits the nonzeros of dense slices to obtain sparse slices in the component tensors. We also utilize the well-known RB paradigm to improve the quality of the splitting heuristic. Finally, we propose a medium-grain tripartite graph model that aims at faster partitioning at the expense of a higher total communication volume. Parallel experiments conducted on 10 real-world tensors on up to 1024 processors confirm the validity of the proposed hypergraph and graph models.
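As a rough illustration of what "communication volume" means for a partitioned sparse tensor, the sketch below evaluates the standard connectivity-minus-one count over slices for a given assignment of nonzeros to parts. It does not implement the medium-grain hypergraph model, the net amalgamation, or the splitting heuristic described above. The function name `tensor_comm_volume` is hypothetical, and the count assumes that the owner of each factor-matrix row is one of the parts holding a nonzero in the corresponding slice, with volume measured in factor-matrix rows per communication phase.

```python
from collections import defaultdict

def tensor_comm_volume(nonzeros, nz_part):
    """Connectivity-minus-one communication volume of a nonzero partition.

    nonzeros : list of index tuples, one per tensor nonzero
    nz_part  : nz_part[t] is the part that nonzeros[t] is assigned to
    Hypothetical helper for illustration only.
    """
    # Parts that hold at least one nonzero in each (mode, index) slice.
    parts_per_slice = defaultdict(set)
    for idx, part in zip(nonzeros, nz_part):
        for mode, i in enumerate(idx):
            parts_per_slice[(mode, i)].add(part)
    # A slice shared by k parts needs k - 1 row transfers (assumption above).
    return sum(len(parts) - 1 for parts in parts_per_slice.values())

# A 2x2x2 tensor with four nonzeros split between two parts along mode 0.
nonzeros = [(0, 0, 0), (0, 1, 1), (1, 1, 0), (1, 0, 1)]
volume = tensor_comm_volume(nonzeros, [0, 0, 1, 1])
```

Here the mode-0 slices are each held by a single part, while every slice in modes 1 and 2 is shared by both parts, so only those two modes contribute to the volume.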

Keywords

Distributed-memory systems, Parallel computing, Communication cost, Recursive bipartitioning, Graph partitioning, Hypergraph partitioning, Sparse matrix, Sparse tensor

Degree Discipline

Computer Engineering

Degree Level

Doctoral

Degree Name

Ph.D. (Doctor of Philosophy)

Permalink

http://hdl.handle.net/11693/53980

Collections

Graduate School of Engineering and Science

Language

English

Type

Thesis
