Accelerating pagerank with a heterogeneous two phase CPU-FPGA algorithm
Date
Authors
Editor(s)
Advisor
Supervisor
Co-Advisor
Co-Supervisor
Instructor
BUIR Usage Stats
views
downloads
Series
Abstract
PageRank is a network analysis algorithm that is used to measure the importance of each vertex in a graph. Fundamentally it is a Sparse Matrix-Vector multiplication problem and suffers from the same bottlenecks, such as irregular memory access and low computation-to-communication ratio. Moreover, the existing Field Programmable Gate Array (FPGA) accelerators for PageRank algorithm either require large portions of the graph to be in-memory, which is not suitable for big data applications or cannot fully utilize the memory bandwidth. Recently published Propagation Blocking(PB) methodology improves the performance of PageRank by dividing the execution into binning and accumulation phases. In this paper, we propose a heterogeneous high-throughput implementation of the PB algorithm where the binning phase executed on the FPGA while accumulation is done on a CPU. Unlike prior solutions, our design can handle graphs of any sizes with no need for an on-board FPGA memory. We also show that despite the low frequency of our device, compared to the CPU, by offloading random writes to an accelerator we can still improve the performance significantly. Experimental results show that with our proposed accelerator, PB algorithm can gain up to 40% speedup.