Efficient HLS-based implementation of Sparse Matrix-Vector Multiplication on FPGA
Abstract
Sparse Matrix-Vector Multiplication (SpMV) is a core kernel in many scientific applications. SpMV is a communication-bound algorithm that suffers from poor spatial locality. Its inherently irregular memory access patterns give it a low computation-to-communication ratio, which wastes DRAM traffic and leads to poor bandwidth utilization. The recently published Propagation Blocking (PB) methodology tackles this communication bottleneck by dividing the execution into a binning phase and an accumulation phase, improving locality at the cost of additional memory accesses. Building upon the PB approach, in this study we use high-level synthesis to design two FPGA kernels, one for the binning phase and one for the accumulation phase, which run sequentially. Experimental results and projections on larger data show that our design can provide up to 7.9x speedup over the CPU baseline implementation.
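The two-phase structure of Propagation Blocking can be illustrated with a minimal software sketch in plain C++. This is not the FPGA design described in the abstract: the coordinate-format input, the names `Nonzero`, `Update`, `bin_updates`, `accumulate_bins`, and `spmv_pb`, and the `bin_width` parameter are all assumptions made for illustration. The binning phase turns each nonzero into a partial product and appends it to the bin that covers its output row; the accumulation phase then processes one bin at a time, so writes to the output vector stay within a narrow row range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One nonzero of the sparse matrix in coordinate (COO) form (assumed layout).
struct Nonzero { std::size_t row; std::size_t col; double val; };

// A partial product destined for output row `row`.
struct Update { std::size_t row; double contrib; };

// Binning phase: compute val * x[col] for every nonzero and append the
// result to the bin that owns its output row (bin = row / bin_width).
std::vector<std::vector<Update>> bin_updates(const std::vector<Nonzero>& a,
                                             const std::vector<double>& x,
                                             std::size_t num_rows,
                                             std::size_t bin_width) {
    assert(bin_width > 0);
    const std::size_t num_bins = (num_rows + bin_width - 1) / bin_width;
    std::vector<std::vector<Update>> bins(num_bins);
    for (const Nonzero& nz : a) {
        bins[nz.row / bin_width].push_back({nz.row, nz.val * x[nz.col]});
    }
    return bins;
}

// Accumulation phase: drain one bin at a time, so all writes to y fall
// inside a window of at most bin_width consecutive rows.
std::vector<double> accumulate_bins(const std::vector<std::vector<Update>>& bins,
                                    std::size_t num_rows) {
    std::vector<double> y(num_rows, 0.0);
    for (const std::vector<Update>& bin : bins) {
        for (const Update& u : bin) {
            y[u.row] += u.contrib;
        }
    }
    return y;
}

// The two phases run one after the other, as in the abstract's description.
std::vector<double> spmv_pb(const std::vector<Nonzero>& a,
                            const std::vector<double>& x,
                            std::size_t num_rows,
                            std::size_t bin_width) {
    return accumulate_bins(bin_updates(a, x, num_rows, bin_width), num_rows);
}
```

In this sketch the bins are the "additional memory accesses" the abstract mentions: every partial product is written once during binning and read once during accumulation. The intended payoff is that those accesses are sequential, while the random accesses to the output vector are confined to a range small enough to stay in fast memory.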