Performance comparison of query evaluation techniques in parallel text retrieval systems
Tokuç, A Aylin
Today’s state-of-the-art search engines utilize the inverted index data structure for fast text retrieval on large document collections. To parallelize the retrieval process, the inverted index should be distributed among multiple index servers. Generally, the inverted index is distributed in either a term-based or a document-based fashion. The performance of both schemes depends on the total number of disk accesses and the total volume of communication in the system. The classical approach for both distributions is to use the Central Broker Query Evaluation Scheme (CB) for parallel text retrieval. It is known that in this approach the central broker is heavily loaded and becomes a bottleneck. Recently, an alternative query evaluation technique, named the Pipelined Query Evaluation Scheme (PPL), has been proposed to alleviate this problem by performing the merge operation on the index servers. In this study, we analyze the scalability and relative performance of CB and PPL under various query loads to report the benefits and drawbacks of each method.
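The two index distributions and the two evaluation schemes named in the abstract can be illustrated with a small Python sketch. It is not the implementation evaluated in the thesis. The toy index, the function names, the round-robin and modulo assignments, and the additive term-frequency scoring are all assumptions made for illustration. In particular, passing a running set of score accumulators from server to server is only one plausible reading of "performing the merge operation on the index servers".

```python
from collections import defaultdict

# Hypothetical toy inverted index: term -> {doc_id: term frequency}.
INDEX = {
    "parallel": {1: 5, 2: 2},
    "text": {1: 3, 3: 4},
    "retrieval": {2: 6, 3: 1},
}


def partition_by_term(index, num_servers):
    """Term-based distribution: each server holds complete posting lists."""
    servers = [dict() for _ in range(num_servers)]
    for i, term in enumerate(sorted(index)):
        # Round-robin assignment of terms (an assumption of this sketch).
        servers[i % num_servers][term] = dict(index[term])
    return servers


def partition_by_document(index, num_servers):
    """Document-based distribution: each server indexes a subset of documents."""
    servers = [defaultdict(dict) for _ in range(num_servers)]
    for term, postings in index.items():
        for doc, freq in postings.items():
            # Modulo assignment of documents (an assumption of this sketch).
            servers[doc % num_servers][term][doc] = freq
    return servers


def local_scores(server, query):
    """Partial document scores computed from one server's share of the index."""
    scores = defaultdict(int)
    for term in query:
        for doc, freq in server.get(term, {}).items():
            scores[doc] += freq
    return dict(scores)


def central_broker(servers, query):
    """CB-style: every server returns partial scores; the broker merges them all."""
    merged = defaultdict(int)
    for server in servers:
        for doc, score in local_scores(server, query).items():
            merged[doc] += score
    return dict(merged)


def pipelined(servers, query):
    """PPL-style: accumulators travel from server to server; each server
    merges its own partial scores before forwarding them."""
    accumulators = {}
    for server in servers:
        for doc, score in local_scores(server, query).items():
            accumulators[doc] = accumulators.get(doc, 0) + score
    return accumulators


query = ["parallel", "text", "retrieval"]
term_servers = partition_by_term(INDEX, 2)
doc_servers = partition_by_document(INDEX, 2)
print(central_broker(term_servers, query))
print(pipelined(term_servers, query))
print(central_broker(doc_servers, query))
```

Both schemes produce the same scores in this sketch. The difference is where the merge work happens: in `central_broker` a single node combines every partial result, while in `pipelined` that work is spread over the index servers.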
Keywords
Parallel text retrieval
Central broker query evaluation
Pipelined query evaluation
QA76.5 .T64 2008
Parallel processing (Electronic computers).
Text processing (Computer science).