Browsing by Subject "Graph analytics"
Now showing 1 - 4 of 4
Results Per Page
Sort Options
Item Open Access Accelerating pagerank with a heterogeneous two phase CPU-FPGA algorithm(Bilkent University, 2020-12) Usta, FurkanPageRank is a network analysis algorithm that is used to measure the importance of each vertex in a graph. Fundamentally it is a Sparse Matrix-Vector multiplication problem and suffers from the same bottlenecks, such as irregular memory access and low computation-to-communication ratio. Moreover, the existing Field Programmable Gate Array (FPGA) accelerators for PageRank algorithm either require large portions of the graph to be in-memory, which is not suitable for big data applications or cannot fully utilize the memory bandwidth. Recently published Propagation Blocking(PB) methodology improves the performance of PageRank by dividing the execution into binning and accumulation phases. In this paper, we propose a heterogeneous high-throughput implementation of the PB algorithm where the binning phase executed on the FPGA while accumulation is done on a CPU. Unlike prior solutions, our design can handle graphs of any sizes with no need for an on-board FPGA memory. We also show that despite the low frequency of our device, compared to the CPU, by offloading random writes to an accelerator we can still improve the performance significantly. Experimental results show that with our proposed accelerator, PB algorithm can gain up to 40% speedup.Item Open Access Accelerator design for graph analytics(Bilkent University, 2016-06) Yeşil, ŞerifWith the increase in data available online, data analysis became a significant problem in today’s datacenters. Moreover, graph analytics is one of the significant application domains in big data era. However, traditional architectures such as CPUs and Graphics Processing Units (GPUs) fail to serve the needs of graph applications. Unconventional properties of graph applications such as irregular memory accesses, load balancing, and irregular computation challenge current computing systems which are either throughput oriented or built on top of traditional locality based memory subsystems. On the other hand, an emerging technique hardware customization, can help us to overcome these problems since they are expected to be energy efficient. Considering the power wall, hardware customization becomes more desirable. In this dissertation, we propose a hardware accelerator framework that is capable of handling irregular, vertex centric, and asynchronous graph applications. Developed high level SystemC models gives an abstraction to the programmer allowing to implement the hardware without extensive knowledge about the underlying architecture. With the given template, programmers are not limited to a single application since they can develop any graph application as long as it fits to the given template abstract. Besides the ability to develop different applications, the given template also decreases the time spent on developing and testing different accelerators. Additionally, an extensive experimental study shows that the proposed template can outperform a high-end 24 core CPU system up to 3x with up to 65x power efficiency.Item Open Access Graph analytics accelerators for cognitive systems(Institute of Electrical and Electronics Engineers, 2017) Ozdal, M. M.; Yesil, S.; Kim, T.; Ayupov, A.; Greth, J.; Burns, S.; Ozturk, O.Hardware accelerators are known to be performance and power efficient. This article focuses on accelerator design for graph analytics applications, which are commonly used kernels for cognitive systems. The authors propose a templatized architecture that is specifically optimized for vertex-centric graph applications with irregular memory access patterns, asynchronous execution, and asymmetric convergence. The proposed architecture addresses the limitations of existing CPU and GPU systems while providing a customizable template. The authors' experiments show that the generated accelerators can outperform a high-end CPU system with up to 3 times better performance and 65 times better power efficiency. © 1981-2012 IEEE.Item Open Access Parallelization of Sparse Matrix Kernels for big data applications(Springer, 2016) Selvitopu, Oğuz; Akbudak, Kadir; Aykanat, Cevdet; Pop, F.; Kołodziej, J.; Di Martino, B.Analysis of big data on large-scale distributed systems often necessitates efficient parallel graph algorithms that are used to explore the relationships between individual components. Graph algorithms use the basic adjacency list representation for graphs, which can also be viewed as a sparse matrix. This correspondence between representation of graphs and sparse matrices makes it possible to express many important graph algorithms in terms of basic sparse matrix operations, where the literature for optimization is more mature. For example, the graph analytic libraries such as Pegasus and Combinatorial BLAS use sparse matrix kernels for a wide variety of operations on graphs. In this work, we focus on two such important sparse matrix kernels: Sparse matrix–sparse matrix multiplication (SpGEMM) and sparse matrix–dense matrix multiplication (SpMM). We propose partitioning models for efficient parallelization of these kernels on large-scale distributed systems. Our models aim at reducing and improving communication volume while balancing computational load, which are two vital performance metrics on distributed systems. We show that by exploiting sparsity patterns of the matrices through our models, the parallel performance of SpGEMM and SpMM operations can be significantly improved.