Web-site-based partitioning techniques for efficient parallelization of the PageRank computation
Author(s)
Advisor
Aykanat, Cevdet
Date
2006
Publisher
Bilkent University
Language
English
Type
Thesis
Abstract
Web search engines use ranking techniques to order Web pages in query results.
PageRank is an important technique, which orders Web pages according to the
linkage structure of the Web. The efficiency of the PageRank computation is important
since the constantly evolving nature of the Web requires this computation
to be repeated many times. PageRank computation includes repeated iterative
sparse matrix-vector multiplications. Due to the enormous size of the Web matrix
to be multiplied, PageRank computations are usually carried out on parallel
systems. However, efficiently parallelizing PageRank is not an easy task, because
of the irregular sparsity pattern of the Web matrix. Graph- and hypergraph-partitioning-based
techniques are widely used for efficiently parallelizing matrix-vector
multiplications. Recently, a hypergraph-partitioning-based decomposition
technique for fast parallel computation of PageRank was proposed. This technique
aims to minimize the communication overhead of the parallel matrix-vector multiplication.
However, that technique has a high preprocessing time,
which makes it impractical. In this work, we propose 1D (rowwise
and columnwise) and 2D (fine-grain and checkerboard) decomposition models
using Web-site-based graph- and hypergraph-partitioning techniques. The proposed
models minimize the communication overhead of the parallel PageRank computations
with a reasonable preprocessing time. The models encapsulate not only
the matrix-vector multiplication, but also the overall iterative algorithm. Our
experiments show that the proposed models achieve fast PageRank computation
with low preprocessing time, compared with those in the literature.
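
As an illustration of the ideas in the abstract, the following is a minimal sketch and not the thesis's implementation. It shows PageRank as a repeated sparse matrix-vector multiplication, and it counts how many rank values must cross processor boundaries per iteration under a 1D rowwise split. The function names, the toy link graph, the damping factor of 0.85, and the uniform handling of dangling pages are all assumptions made for this example. The thesis relies on graph and hypergraph partitioning, whereas this sketch only compares two fixed page-to-processor assignments.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration on a link graph given as {page: [pages it links to]}."""
    pages = sorted(out_links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by dangling pages (no out-links) is spread uniformly.
        dangling = sum(rank[p] for p in pages if not out_links[p])
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Sparse matrix-vector product: each link is one nonzero of the matrix.
        for p in pages:
            for q in out_links[p]:
                new[q] += damping * rank[p] / len(out_links[p])
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank


def rowwise_comm_volume(out_links, owner):
    """Words sent per iteration in a 1D rowwise split (simplified model).

    The owner of page q computes rank[q], so it needs rank[p] for every
    page p that links to q. Each distinct (p, remote processor) pair
    costs one word.
    """
    sends = set()
    for p, targets in out_links.items():
        for q in targets:
            if owner[p] != owner[q]:
                sends.add((p, owner[q]))
    return len(sends)


# Hypothetical toy Web: two sites, "a" and "b", with mostly intra-site links.
links = {
    "a/1": ["a/2", "a/3"],
    "a/2": ["a/1"],
    "a/3": ["a/1", "b/1"],
    "b/1": ["b/2", "b/3"],
    "b/2": ["b/1"],
    "b/3": ["b/1", "a/1"],
}

# Assignment 1: keep each site on one processor (site-based).
by_site = {p: 0 if p.startswith("a/") else 1 for p in links}
# Assignment 2: deal pages out alternately, ignoring site boundaries.
interleaved = {p: i % 2 for i, p in enumerate(sorted(links))}

ranks = pagerank(links)
site_volume = rowwise_comm_volume(links, by_site)
interleaved_volume = rowwise_comm_volume(links, interleaved)
```

On this toy graph the site-based assignment sends 2 words per iteration and the interleaved assignment sends 6. Most links stay inside a site, so keeping a site's pages together leaves fewer rank values to exchange.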