Browsing by Subject "Time-to-live"

Now showing 1 - 4 of 4

Open Access
Adaptive time-to-live strategies for query result caching in web search engines
(2012) Alıcı, Sadiye; Altıngövde, I. Ş.; Rıfat, Özcan; Cambazoğlu, B. Barla; Ulusoy, Özgür
An important research problem that has recently started to receive attention is the freshness issue in search engine result caches. In the current techniques in literature, the cached search result pages are associated with a fixed time-to-live (TTL) value in order to bound the staleness of search results presented to the users, potentially as part of a more complex cache refresh or invalidation mechanism. In this paper, we propose techniques where the TTL values are set in an adaptive manner, on a per-query basis. Our results show that the proposed techniques reduce the fraction of stale results served by the cache and also decrease the fraction of redundant query evaluations on the search engine backend compared to a strategy using a fixed TTL value for all queries. © 2012 Springer-Verlag Berlin Heidelberg.
Open Access
Efficient result caching mechanisms in search engines
(2014) Sazoğlu, Fethi Burak
The performance of a search engine depends on its components such as crawler, indexer and processor. The query latency, accuracy and recency of the results play crucial role in determining the performance. High performance can be provided with powerful hardware in the data center, but keeping the operational costs restrained is mandatory for search engines for commercial durability. This thesis focuses on techniques to boost the performance of search engines by means of reducing both the number of queries issued to the backend and the cost to process a query stream. This can be accomplished by taking advantage of the temporal locality of the queries. Caching the result for a recently issued query removes the need to reprocess this query when it is issued again by the same or different user. Therefore, deploying query result cache decreases the load on the resources of the search engine which increases the processing power. The main objective of this thesis is to improve search engine performance by enhancing productivity of result cache. This is done by endeavoring to maximize the cache hit rate and minimizing the processing cost by using the per query statistics such as frequency, timestamp and cost. While providing high hit rates and low processing costs improves performance, the freshness of the queries in the cache has to be considered as well for user satisfaction. Therefore, a variety of techniques are examined in this thesis to bound the staleness of cache results without blasting the backend with refresh queries. The offered techniques are demonstrated to be efficient by using real query log data from a commercial search engine.
Open Access
Strategies for setting time-to-live values in result caches
(ACM, 2013-10-11) Sazoğlu, Fethi Burak; Cambazoğlu, B. B.; Özcan, R.; Altıngövde, İsmail Şengör; Ulusoy, Özgür
In web query result caching, staleness of queries are often bounded via a time-to-live (TTL) mechanism, which expires the validity of cached query results at some point in time. In this work, we evaluate the performance of three alternative TTL mechanisms: time-based TTL, frequency-based TTL, and click-based TTL. Moreover, we propose hybrid approaches obtained by pair-wise combination of these mechanisms. Our results indicate that combining time-based TTL with frequency-based TTL yields superior performance (i.e., lower stale query traffic and less redundant computation) than using a particular mechanism in isolation. Copyright is held by the owner/author(s).
Open Access
Timestamp-based result cache invalidation for web search engines
(ACM, 2011) Alıcı, Sadiye; Altingovde I.S.; Özcan, Rıfat; Cambazoglu, B.B.; Ulusoy, Özgür
The result cache is a vital component for efficiency of large-scale web search engines, and maintaining the freshness of cached query results is the current research challenge. As a remedy to this problem, our work proposes a new mechanism to identify queries whose cached results are stale. The basic idea behind our mechanism is to maintain and compare generation time of query results with update times of posting lists and documents to decide on staleness of query results. The proposed technique is evaluated using a Wikipedia document collection with real update information and a real-life query log. We show that our technique has good prediction accuracy, relative to a baseline based on the time-to-live mechanism. Moreover, it is easy to implement and incurs less processing overhead on the system relative to a recently proposed, more sophisticated invalidation mechanism.