Effect of inverted index partitioning schemes on performance of query processing in parallel text retrieval systems

Date
2006-11
Source Title
21th International Symposium on Computer and Information Sciences – ISCIS 2006
Print ISSN
0302-9743
Publisher
Springer
Pages
717 - 725
Language
English
Type
Conference Paper
Abstract

Shared-nothing parallel text retrieval systems require the inverted index that represents a document collection to be partitioned among a number of processors. In general, the index can be partitioned by either the terms or the documents in the collection, and the choice of partitioning greatly affects the query processing performance of the parallel system. In this work, we investigate the effect of these two index partitioning schemes on query processing. We conduct experiments on a 32-node PC cluster, considering the case where the index is stored entirely on disk. Performance results are reported for a large (30 GB) document collection using an MPI-based parallel query processing implementation. © Springer-Verlag Berlin Heidelberg 2006.
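
The two partitioning schemes named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's MPI implementation: the toy documents, the round-robin assignment rule, and all function names (build_index, partition_by_term, partition_by_document, merge_partitions) are assumptions made here purely for illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Build a toy inverted index: term -> sorted list of document ids."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for term in set(text.split()):
            index[term].append(doc_id)
    return dict(index)

def partition_by_term(index, k):
    """Term-based: each processor stores the full posting list of some terms.
    Round-robin over sorted terms is an assumed, illustrative assignment."""
    parts = [dict() for _ in range(k)]
    for i, term in enumerate(sorted(index)):
        parts[i % k][term] = list(index[term])
    return parts

def partition_by_document(index, k):
    """Document-based: each processor stores, for every term, only the
    postings of the documents assigned to it (here: doc_id modulo k)."""
    parts = [defaultdict(list) for _ in range(k)]
    for term, postings in index.items():
        for doc_id in postings:
            parts[doc_id % k][term].append(doc_id)
    return [dict(p) for p in parts]

def merge_partitions(parts):
    """Recombine the partial posting lists held by all processors."""
    merged = defaultdict(list)
    for part in parts:
        for term, postings in part.items():
            merged[term].extend(postings)
    return {term: sorted(postings) for term, postings in merged.items()}

# Hypothetical four-document collection split over two processors.
docs = [
    "parallel text retrieval",
    "inverted index partitioning",
    "parallel query processing",
    "text index",
]
index = build_index(docs)
term_parts = partition_by_term(index, 2)
doc_parts = partition_by_document(index, 2)
```

In the term-based layout a query term's entire posting list lives on one processor, whereas in the document-based layout the posting list of a term may be spread across processors and the partial lists have to be merged to answer a query.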

Keywords
Data processing, Information retrieval, Magnetic disk storage, Parallel processing systems, Program processors, Query languages, Text processing, Document collection, Index partitioning schemes, Inverted index partitioning, Parallel text retrieval systems, Indexing (of information)