Steady-state analysis of Google-like stochastic matrices
Abstract
Many search engines use a two-step process to retrieve web pages related to a user's query. In the first step, traditional text processing is performed to find all pages matching the given query terms. Due to the massive size of the web, this step can result in thousands of retrieved pages. In the second step, many search engines sort the list of retrieved pages according to some ranking criterion to make it manageable for the user. One popular way to create this ranking is to exploit the additional information inherent in the hyperlink structure of the web. One successful and well-publicized link-based ranking system is PageRank, the ranking system used by the Google search engine. The dynamically changing matrices that reflect the hyperlink structure of the web and are used by Google in ranking pages are not only very large, but also sparse and reducible; they are stochastic except for some zero rows. Ranking pages amounts to solving for the steady-state vectors of linear combinations of these matrices with appropriately chosen rank-1 matrices. The method of choice for this task appears to be the power method. Certain improvements have been obtained using techniques such as quadratic extrapolation and iterative aggregation. In this thesis, we propose iterative methods based on various block partitionings, including those with triangular diagonal blocks obtained using cutsets, for the computation of the steady-state vector of such stochastic matrices. The proposed iterative methods, together with the power and quadratically extrapolated power methods, are coded into a software tool. Experimental results on benchmark matrices show that it is possible to recommend Gauss-Seidel for easier web problems and block Gauss-Seidel with partitionings based on a block upper triangular form for the remaining problems, although the latter takes about twice as much memory as the quadratically extrapolated power method.
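To make the steady-state formulation above concrete, the following is a minimal sketch of the plain power method applied to a small link matrix with one zero row. It is not the software tool described in the thesis. The function name `pagerank_power`, the damping value `alpha = 0.85`, the uniform teleportation vector `v`, and the 4-page matrix `P` are all assumptions chosen for illustration. The two rank-1 corrections (one replacing zero rows, one for teleportation) are applied implicitly, so the sparse structure of `P` is never destroyed.

```python
import numpy as np

def pagerank_power(P, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power method for the steady-state vector x of
    G = alpha * (P + d v^T) + (1 - alpha) * e v^T,
    where d marks the zero rows of P, e is the all-ones vector,
    and v is a uniform probability vector (an assumption of this sketch)."""
    n = P.shape[0]
    v = np.full(n, 1.0 / n)           # hypothetical uniform teleportation vector
    dangling = (P.sum(axis=1) == 0)   # pages with no outgoing links (zero rows)
    x = v.copy()                      # initial probability vector
    for k in range(max_iter):
        # x^T G computed without forming G: the rank-1 terms reduce to
        # scalar multiples of v because x sums to 1.
        x_new = alpha * (x @ P + x[dangling].sum() * v) + (1.0 - alpha) * v
        if np.abs(x_new - x).sum() < tol:   # 1-norm of the change
            return x_new, k + 1
        x = x_new
    return x, max_iter

# Illustrative 4-page web; page 3 has no outgoing links, so its row is zero.
P = np.array([
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1/3, 1/3, 0.0, 1/3],
    [0.0, 0.0, 0.0, 0.0],
])

x, iters = pagerank_power(P)
print("ranking vector:", x)
print("iterations:", iters)
```

With damping `alpha`, the error of the power method shrinks by roughly a factor of `alpha` per iteration, which is the convergence behavior that extrapolation and Gauss-Seidel-type methods aim to improve on.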