Site-based partitioning and repartitioning techniques for parallel pagerank computation

Cevahir, A.; Aykanat, Cevdet; Turk, A.; Cambazoglu, B. B.

Files

Site-based partitioning and repartitioning techniques for parallel pagerank computation.pdf (5.53 MB)

Date

2011-05

Abstract

The PageRank algorithm is an important component in effective web search. At the core of this algorithm are repeated sparse matrix-vector multiplications, where the web matrices involved grow along with the web and are stored in a distributed manner due to space limitations. Hence, the PageRank computation, which is frequently repeated, must be performed in parallel with high efficiency and low preprocessing overhead while taking the initially distributed nature of the web matrices into account. Our contributions in this work are twofold. First, we investigate the application of state-of-the-art sparse matrix partitioning models to attain high efficiency in parallel PageRank computations, with a particular focus on reducing the preprocessing overhead they introduce. For this purpose, we evaluate two different compression schemes on the web matrix using the site information inherently available in links. Second, we consider the more realistic scenario of starting with initially distributed data and extend our algorithms to cover the repartitioning of such data for efficient PageRank computation. We report performance results using our parallelization of a state-of-the-art PageRank algorithm on two different PC clusters with 40 and 64 processors. Experiments show that the proposed techniques achieve high speedups while incurring a preprocessing overhead of several iterations (for some instances, less than a single iteration) of the underlying sequential PageRank algorithm. © 2011 IEEE.
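As a rough illustration of the two ideas the abstract names, the following sketch shows (a) a sequential PageRank power iteration in which each pass is one sparse matrix-vector multiplication over the link structure, and (b) one simple way to collapse a page-level graph into a much smaller site-level graph using the host name in each link. This is not the authors' implementation: the function names `pagerank` and `site_graph`, the damping value 0.85, the uniform handling of dangling pages, and the example URLs are all assumptions made for illustration, and the paper's actual compression schemes and partitioning models are not reproduced here.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration; each pass is one sparse matrix-vector multiply.

    out_links maps a page URL to the list of URLs it links to
    (hypothetical input format chosen for this sketch).
    """
    pages = sorted(set(out_links) | {v for vs in out_links.values() for v in vs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by pages without out-links is spread uniformly (assumption).
        dangling = sum(rank[p] for p in pages if not out_links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        # Sparse multiply: only the nonzeros (the links) are visited.
        for src, targets in out_links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


def site_graph(out_links):
    """Collapse pages into sites; edge weight counts inter-site links."""
    weights = defaultdict(int)
    for src, targets in out_links.items():
        src_site = urlsplit(src).netloc
        for dst in targets:
            dst_site = urlsplit(dst).netloc
            if src_site != dst_site:  # intra-site links vanish after collapsing
                weights[(src_site, dst_site)] += 1
    return dict(weights)


# Tiny made-up web: two pages on one site, one page on another.
links = {
    "http://a.example/": ["http://a.example/1", "http://b.example/"],
    "http://a.example/1": ["http://a.example/"],
    "http://b.example/": ["http://a.example/"],
}
ranks = pagerank(links)
sites = site_graph(links)
```

In this toy input most links stay within a site, so the site-level graph has far fewer edges than the page-level one. A partitioner run on the smaller graph would have less work to do, which is the intuition behind reducing preprocessing overhead through site information.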

Source Title

IEEE Transactions on Parallel and Distributed Systems

Publisher

Institute of Electrical and Electronics Engineers

Keywords

Graph partitioning, Hypergraph partitioning, PageRank, Parallelization, Repartitioning, Sparse matrix partitioning, Sparse matrix-vector multiplication, Web search

Permalink

http://hdl.handle.net/11693/22022

Published Version (Please cite this version)

http://dx.doi.org/10.1109/TPDS.2010.119

Collections

Scholarly Publications - Computer Engineering

Language

English

Type

Article
