Cost-aware strategies for query result caching in Web search engines

Date
2011
Authors
Ozcan, R.
Altingovde, I. S.
Ulusoy, O.
Source Title
ACM Transactions on the Web
Print ISSN
1559-1131
Publisher
Association for Computing Machinery
Volume
5
Issue
2
Pages
9:1 - 9:25
Language
English
Type
Article
Abstract

Search engines and large-scale IR systems need to cache query results for efficiency and scalability. Static and dynamic caching techniques (as well as their combinations) are employed to cache query results effectively. In this study, we propose cost-aware strategies for static and dynamic caching setups. Our research is motivated by two key observations: (i) query processing costs may vary significantly among different queries, and (ii) the processing cost of a query is not proportional to its popularity (i.e., its frequency in the previous logs). The first observation implies that cache misses have different, that is, nonuniform, costs in this context. The second observation implies that typical caching policies, based solely on query popularity, cannot always minimize the total cost. Therefore, we propose to explicitly incorporate the query costs into the caching policies. Simulation results using two large Web crawl datasets and a real query log reveal that the proposed approach improves overall system performance in terms of the average query execution time. © 2011 ACM.
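
The two observations in the abstract can be illustrated with a toy example. The sketch below is not the policy proposed in the article, whose details are not given in this record; it only assumes one simple way to make a static cache cost-aware, namely ranking queries by frequency times processing cost instead of by frequency alone. The function names, the query strings, and all numbers are hypothetical, and the sketch further assumes that future query frequencies match the past log and that every cached result occupies one slot.

```python
from typing import Dict, List, Tuple

# Hypothetical statistics: query -> (frequency in a past log, processing cost in ms).
QueryStats = Dict[str, Tuple[int, float]]


def fill_static_cache(stats: QueryStats, capacity: int, cost_aware: bool) -> List[str]:
    """Pick `capacity` queries for a static cache.

    cost_aware=False ranks by frequency only (popularity-based baseline).
    cost_aware=True ranks by frequency * cost, an assumed scoring rule
    used here purely for illustration.
    """
    def score(item: Tuple[str, Tuple[int, float]]) -> float:
        _, (freq, cost) = item
        return freq * cost if cost_aware else float(freq)

    ranked = sorted(stats.items(), key=score, reverse=True)
    return [query for query, _ in ranked[:capacity]]


def total_miss_cost(stats: QueryStats, cached: List[str]) -> float:
    """Total processing cost paid for queries that are not in the cache."""
    cached_set = set(cached)
    return sum(freq * cost for q, (freq, cost) in stats.items() if q not in cached_set)


# Made-up numbers: the rarest query is by far the most expensive to process.
stats: QueryStats = {
    "weather": (100, 1.0),
    "news": (80, 2.0),
    "rare long query": (10, 50.0),
}

freq_cache = fill_static_cache(stats, capacity=1, cost_aware=False)
cost_cache = fill_static_cache(stats, capacity=1, cost_aware=True)

print(freq_cache, total_miss_cost(stats, freq_cache))
print(cost_cache, total_miss_cost(stats, cost_cache))
```

With these made-up numbers, the popularity-based cache keeps the most frequent query but leaves the expensive one uncached, so its total miss cost is higher than that of the cost-aware cache. This is the sense in which a policy based solely on popularity may fail to minimize total cost when misses have nonuniform costs.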

Keywords
Query result caching, Web search engines, Cache miss, Caching policy, Data sets, Processing costs, Query costs, Query execution time, Query logs, Query results, Simulation result, Static and dynamic, Total costs, Costs, Information retrieval, Search engines, User interfaces