Performance improvement on latency-bound parallel HPC applications by message sharing between processors

Date

2021-02

Editor(s)

Advisor

Aykanat, Cevdet

Supervisor

Co-Advisor

Co-Supervisor

Instructor

Source Title

Print ISSN

Electronic ISSN

Publisher

Volume

Issue

Pages

Language

English

Type

Journal Title

Journal ISSN

Volume Title

Attention Stats
Usage Stats
6
views
17
downloads

Series

Abstract

The performance of paralellized High Performance Computing (HPC) applica-tions is tied to the efficiency of the underlying processor-to-processor commu-nication. In latency-bound applications, the performance runs into bottleneck by the processor that is sending the maximum number of messages to the other processors. To reduce the latency overhead, we propose a two-phase message-sharing-based algorithm, where the bottleneck processor (the processor sending the maximum number of messages) is paired with another processor. In the first phase, the bottleneck processor is paired with the processor that has the maxi-mum number of common outgoing messages. In the second phase, the bottleneck processor is paired with the processor that has the minimum number of outgo-ing messages. In both phases, the processor pair share the common outgoing messages between them, reducing their total number of outgoing messages, but especially the number of outgoing messages of the bottleneck processor. We use Sparse Matrix-Vector Multiplication as the kernel application and a 512-processor setting for the experiments. The proposed message-sharing algorithm achieves a reduction of 84% in the number of messages sent by the bottleneck processor and a reduction of 60% in the total number of messages in the system.

Course

Other identifiers

Book Title

Degree Discipline

Computer Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)

Citation

Published Version (Please cite this version)