Incorporating the surfing behavior of web users into PageRank

buir.advisor: Aykanat, Cevdet
dc.contributor.author: Ashyralyyev, Shatlyk
dc.date.accessioned: 2016-01-08T20:05:15Z
dc.date.available: 2016-01-08T20:05:15Z
dc.date.issued: 2013
dc.description: Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2013.
dc.description: Thesis (Master's) -- Bilkent University, 2013.
dc.description: Includes bibliographical references (leaves 68-73).
dc.description.abstract: One of the most crucial factors that determine the effectiveness of a large-scale commercial web search engine is the ranking (i.e., order) in which web search results are presented to the end user. In modern web search engines, the skeleton for the ranking of web search results is constructed using a combination of the global (i.e., query-independent) importance of web pages and their relevance to the given search query. In this thesis, we are concerned with estimating the global importance of web pages. So far, two different types of data sources have been used to estimate the importance of web pages, independently of each other: the hyperlink structure of the web (e.g., PageRank) and the surfing behavior of web users (e.g., BrowseRank). Unfortunately, both types of data sources have certain limitations. The hyperlink structure of the web is not very reliable and is vulnerable to bad intent (e.g., web spam), because hyperlinks can be easily edited by web content creators. On the other hand, the browsing behavior of web users has limitations such as sparsity and low web coverage. In this thesis, we combine these two types of feedback under a hybrid page importance estimation model in order to alleviate the above-mentioned drawbacks. Our experimental results indicate that the proposed hybrid model leads to a better estimation of page importance according to an evaluation metric that uses the user click information obtained from the Yahoo! web search engine's query logs as the ground-truth ranking. We conduct all of our experiments in a realistic setting, using a very large-scale web page collection (around 6.5 billion web pages) and web browsing data (around two billion web page visits) collected through the Yahoo! toolbar.
dc.description.provenance: Made available in DSpace on 2016-01-08T20:05:15Z (GMT). No. of bitstreams: 1 0006904.pdf: 5535923 bytes, checksum: 2bc220f01e87d82f716eb14a531cdf8b (MD5)
dc.description.statementofresponsibility: Ashyralyyev, Shatlyk
dc.format.extent: xii, 73 leaves, graphics, tables
dc.identifier.itemid: B138905
dc.identifier.uri: http://hdl.handle.net/11693/17013
dc.language.iso: English
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Page quality
dc.subject: Web search
dc.subject: Ranking
dc.subject: PageRank
dc.subject: BrowseRank
dc.subject.lcc: TK5105.884 .A74 2013
dc.subject.lcsh: Search engines--Programming.
dc.subject.lcsh: Web search engines--Mathematical models.
dc.subject.lcsh: Information storage and retrieval systems.
dc.subject.lcsh: Internet searching.
dc.subject.lcsh: World Wide Web--Statistical methods.
dc.subject.lcsh: Information behavior.
dc.title: Incorporating the surfing behavior of web users into PageRank
dc.type: Thesis
thesis.degree.discipline: Computer Engineering
thesis.degree.grantor: Bilkent University
thesis.degree.level: Master's
thesis.degree.name: MS (Master of Science)

Files

Original bundle

Name: 0006904.pdf
Size: 5.28 MB
Format: Adobe Portable Document Format