Browsing by Author "Cambazoglu, B.B."
Open Access item: Query forwarding in geographically distributed search engines (ACM, 2010)
Cambazoglu, B.B.; Varol, Emre; Kayaaslan, Enver; Aykanat, Cevdet; Baeza-Yates, R.
Query forwarding is an important technique for preserving the result quality in distributed search engines where the index is geographically partitioned over multiple search sites. The key component in query forwarding is the thresholding algorithm by which the forwarding decisions are given. In this paper, we propose a linear-programming-based thresholding algorithm that significantly outperforms the current state-of-the-art in terms of achieved search efficiency values. Moreover, we evaluate a greedy heuristic for partial index replication and investigate the impact of result cache freshness on query forwarding performance. Finally, we present some optimizations that improve the performance further, under certain conditions. We evaluate the proposed techniques by simulations over a real-life setting, using a large query log and a document collection obtained from Yahoo!. © 2010 ACM.

Open Access item: Timestamp-based cache invalidation for search engines (ACM, 2011)
Alıcı, Sadiye; Altıngövde, İsmail Şengör; Özcan, Rıfat; Cambazoglu, B.B.; Ulusoy, Özgür
We propose a new mechanism to predict stale queries in the result cache of a search engine. The novelty of our approach is in the use of timestamps in staleness predictions. We show that our approach incurs very little overhead on the system while its prediction accuracy is comparable to earlier works. © 2011 Authors.

Open Access item: Timestamp-based result cache invalidation for web search engines (ACM, 2011)
Alıcı, Sadiye; Altingovde, I.S.; Özcan, Rıfat; Cambazoglu, B.B.; Ulusoy, Özgür
The result cache is a vital component for efficiency of large-scale web search engines, and maintaining the freshness of cached query results is the current research challenge. As a remedy to this problem, our work proposes a new mechanism to identify queries whose cached results are stale. The basic idea behind our mechanism is to maintain and compare generation time of query results with update times of posting lists and documents to decide on staleness of query results. The proposed technique is evaluated using a Wikipedia document collection with real update information and a real-life query log. We show that our technique has good prediction accuracy, relative to a baseline based on the time-to-live mechanism. Moreover, it is easy to implement and incurs less processing overhead on the system relative to a recently proposed, more sophisticated invalidation mechanism.
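
The timestamp comparison described in the last abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `CachedResult`, `UpdateTimestamps`, `is_stale`, and `is_expired_ttl` are hypothetical, and the rule used here (any posting list or result document updated after the result was generated marks the entry stale) is a simplifying assumption. The paper's actual invalidation policies may be more selective. A time-to-live check is included only to show the kind of baseline the abstract mentions.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class CachedResult:
    """A cached query result (hypothetical structure for illustration)."""
    query_terms: Tuple[str, ...]
    doc_ids: Tuple[int, ...]
    generated_at: float  # time at which the result was computed


@dataclass
class UpdateTimestamps:
    """Last update times for posting lists and documents (assumed layout)."""
    term_updated_at: Dict[str, float] = field(default_factory=dict)
    doc_updated_at: Dict[int, float] = field(default_factory=dict)


def is_stale(entry: CachedResult, stamps: UpdateTimestamps) -> bool:
    """Assumed rule: stale if any relevant update is newer than the result."""
    never = float("-inf")
    # A query term's posting list changed after the result was generated.
    for term in entry.query_terms:
        if stamps.term_updated_at.get(term, never) > entry.generated_at:
            return True
    # A document in the cached result changed after the result was generated.
    for doc_id in entry.doc_ids:
        if stamps.doc_updated_at.get(doc_id, never) > entry.generated_at:
            return True
    return False


def is_expired_ttl(entry: CachedResult, now: float, ttl_seconds: float) -> bool:
    """Time-to-live baseline: expire purely by age, ignoring index updates."""
    return now - entry.generated_at > ttl_seconds
```

The difference between the two checks is that the timestamp rule keeps an old entry alive as long as nothing it depends on has changed, whereas the time-to-live rule discards it by age alone.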