Effect of inverted index partitioning schemes on performance of query processing in parallel text retrieval systems

Date

2006-11

Authors

Cambazoğlu, B. Barla
Çatal, A.
Aykanat, Cevdet

Source Title

21st International Symposium on Computer and Information Sciences – ISCIS 2006

Print ISSN

0302-9743

Publisher

Springer

Pages

717 - 725

Language

English

Abstract

Shared-nothing, parallel text retrieval systems require an inverted index, representing a document collection, to be partitioned among a number of processors. In general, the index can be partitioned based on either the terms or the documents in the collection, and the way the partitioning is done greatly affects the query processing performance of the parallel system. In this work, we investigate the effect of these two index partitioning schemes on query processing. We conduct experiments on a 32-node PC cluster, considering the case where the index is stored entirely on disk. Performance results are reported for a large (30 GB) document collection using an MPI-based parallel query processing implementation. © Springer-Verlag Berlin Heidelberg 2006.
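
To make the two schemes named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It builds a toy in-memory inverted index and splits it either by term or by document. The function names, the round-robin assignment of terms, and the modulo assignment of documents are all assumptions chosen for brevity; the paper's actual assignment strategy, disk layout, and MPI communication are not described in this record.

```python
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index: term -> sorted list of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def partition_by_term(index, num_procs):
    """Term-based: each processor holds the complete posting lists
    of a subset of terms (round-robin assignment is an assumption)."""
    parts = [dict() for _ in range(num_procs)]
    for i, term in enumerate(sorted(index)):
        parts[i % num_procs][term] = index[term]
    return parts


def partition_by_document(index, num_procs):
    """Document-based: each processor holds, for every term, only the
    postings of its own documents (doc_id modulo is an assumption)."""
    parts = [dict() for _ in range(num_procs)]
    for term, postings in index.items():
        for doc_id in postings:
            parts[doc_id % num_procs].setdefault(term, []).append(doc_id)
    return parts


# Hypothetical four-document collection, used only for illustration.
docs = {
    0: "parallel text retrieval",
    1: "inverted index",
    2: "parallel index partitioning",
    3: "text index",
}
index = build_index(docs)
term_parts = partition_by_term(index, 2)
doc_parts = partition_by_document(index, 2)
```

Under term-based partitioning, a query term is answered by the single processor that owns its posting list, whereas under document-based partitioning every processor may hold a fragment of the list and contributes partial results for its own documents.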

Keywords

Data processing, Information retrieval, Magnetic disk storage, Parallel processing systems, Program processors, Query languages, Text processing, Document collection, Index partitioning schemes, Inverted index partitioning, Parallel text retrieval systems, Indexing (of information)

Citation