A result cache invalidation scheme for web search engines

Date

2011

Editor(s)

Advisor

Ulusoy, Özgür

Supervisor

Co-Advisor

Co-Supervisor

Instructor

Source Title

Print ISSN

Electronic ISSN

Publisher

Bilkent University

Volume

Issue

Pages

Language

English

Journal Title

Journal ISSN

Volume Title

Series

Abstract

The result cache is a vital component for the efficiency of large-scale web search engines, and maintaining the freshness of cached query results is a current research challenge. As a remedy to this problem, our work proposes a new mechanism to identify queries whose cached results are stale. The basic idea behind our mechanism is to maintain and compare the generation time of query results with the update times of posting lists and documents to decide on staleness of query results. The proposed technique is evaluated using a Wikipedia document collection with real update information and a real-life query log. Throughout the experiments, we compare our approach with two baseline strategies from literature together with a detailed evaluation. We show that our technique has good prediction accuracy, relative to the baseline based on the time-to-live (TTL) mechanism. Moreover, it is easy to implement and it incurs less processing overhead on the system relative to a recently proposed, more sophisticated invalidation mechanism.

Course

Other identifiers

Book Title

Citation

item.page.isversionof