Cambazoglu, B.B.; Varol, Emre; Kayaaslan, Enver; Aykanat, Cevdet; Baeza-Yates, R.
2016-02-08
2016-02-08
2010
http://hdl.handle.net/11693/28554
Date of Conference: July 19 - 23, 2010
Query forwarding is an important technique for preserving the result quality in distributed search engines where the index is geographically partitioned over multiple search sites. The key component in query forwarding is the thresholding algorithm by which the forwarding decisions are given. In this paper, we propose a linear-programming-based thresholding algorithm that significantly outperforms the current state-of-the-art in terms of achieved search efficiency values. Moreover, we evaluate a greedy heuristic for partial index replication and investigate the impact of result cache freshness on query forwarding performance. Finally, we present some optimizations that improve the performance further, under certain conditions. We evaluate the proposed techniques by simulations over a real-life setting, using a large query log and a document collection obtained from Yahoo!. © 2010 ACM.
English
Distributed IR; Index replication; Linear programming; Optimization; Query forwarding; Result caching; Search engines
Distributed IR; Distributed search engines; Document collection; Greedy heuristics; Index replication; Key component; Multiple search sites; Query forwarding; Query logs; Result caching; Search efficiency; Thresholding algorithms
Information retrieval; Network routing; Optimization; Search engines; Linear programming
Query forwarding in geographically distributed search engines
Conference Paper
10.1145/1835449.1835467