A result cache invalidation scheme for web search engines
Author
Alıcı, Şadiye
Advisor
Ulusoy, Özgür
Date
2011
Publisher
Bilkent University
Language
English
Type
Thesis
Abstract
The result cache is a vital component for the efficiency of large-scale web
search engines, and keeping cached query results fresh remains an open
research challenge. To address this problem, we propose a new mechanism for
identifying queries whose cached results are stale. The basic idea is to
record the generation time of each cached query result and compare it with the
update times of posting lists and documents to decide whether the result is
stale.
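
The timestamp comparison described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's actual implementation: the names `CachedResult`, `is_stale`, `term_updated_at`, and `doc_updated_at` are hypothetical, and the rule used here (a result is stale if any query term's posting list or any result document was updated after the result was generated) is one plausible reading of the abstract.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CachedResult:
    """Hypothetical cache entry; field names are illustrative assumptions."""
    terms: List[str]      # terms of the cached query
    doc_ids: List[int]    # documents appearing in the cached result list
    generated_at: float   # time at which the result was computed


def is_stale(entry: CachedResult,
             term_updated_at: Dict[str, float],
             doc_updated_at: Dict[int, float]) -> bool:
    """Assumed rule: stale if anything the result depends on changed later."""
    # A posting list of a query term was updated after the result was generated.
    for term in entry.terms:
        if term_updated_at.get(term, 0.0) > entry.generated_at:
            return True
    # A document in the result list was updated after the result was generated.
    for doc_id in entry.doc_ids:
        if doc_updated_at.get(doc_id, 0.0) > entry.generated_at:
            return True
    return False


# Usage: the entry is fresh until one of its result documents is updated.
entry = CachedResult(terms=["web", "cache"], doc_ids=[3, 7], generated_at=100.0)
term_times = {"web": 90.0, "cache": 95.0}
doc_times = {3: 80.0, 7: 99.0}
fresh_before = is_stale(entry, term_times, doc_times)   # False
doc_times[7] = 120.0
stale_after = is_stale(entry, term_times, doc_times)    # True
```

Terms or documents with no recorded update time are treated here as never updated, which is a simplification chosen for the sketch.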
The proposed technique is evaluated using a Wikipedia document collection
with real update information and a real-life query log. Throughout the
experiments, we compare our approach in detail with two baseline strategies
from the literature. We show that our technique achieves good prediction
accuracy relative to the baseline based on the time-to-live (TTL) mechanism.
Moreover, it is easy to implement and incurs less processing overhead on the
system than a recently proposed, more sophisticated invalidation mechanism.