A parallel framework for in-memory construction of term-partitioned inverted indexes

Date
2012
Source Title
The Computer Journal
Print ISSN
0010-4620
Publisher
Oxford University Press
Volume
55
Issue
11
Pages
1317-1330
Language
English
Type
Article
Abstract

With the advances in cloud computing and the large main memories made addressable by 64-bit architectures, it is possible to tackle large problems using memory-based solutions. Construction of parallel, term-partitioned inverted indexes is a communication-intensive task and is well suited to memory-based modeling. In this paper, we provide an efficient parallel framework for in-memory construction of term-partitioned inverted indexes. We show that, by utilizing an efficient bucketing scheme, we can eliminate the need to generate a global vocabulary. We propose and investigate assignment schemes that reduce the communication overhead while minimizing the imbalance in storage and in final query processing. We also present a study on how communication among processors should be carried out with limited communication memory in order to reduce the total inversion time. We present several communication-memory organizations and discuss their advantages and shortcomings. The conducted experiments indicate promising results. © 2012 The Author. Published by Oxford University Press on behalf of The British Computer Society.
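
As a rough illustration of why bucketing can remove the need for a global vocabulary, the following Python sketch simulates term-partitioned inversion in a single process. It is not the paper's method: the hash function (CRC32), the round-robin bucket-to-processor assignment, and every function name here are assumptions made for the example. The point it shows is only that, when each term maps deterministically to a bucket, every processor can decide where a term's postings belong without first agreeing on a shared term list.

```python
import zlib
from collections import defaultdict


def bucket_of(term, num_buckets):
    # Assumed hash: CRC32 is deterministic across processes and runs,
    # so every processor computes the same bucket for the same term.
    return zlib.crc32(term.encode("utf-8")) % num_buckets


def owner_of(bucket, num_procs):
    # Assumed assignment: round-robin buckets over processors.
    # Other assignment schemes could be swapped in here.
    return bucket % num_procs


def local_inversion(docs):
    # Invert one processor's document partition: term -> list of doc ids.
    postings = defaultdict(list)
    for doc_id, text in docs:
        for term in set(text.lower().split()):
            postings[term].append(doc_id)
    return postings


def build_term_partitioned_index(doc_partitions, num_buckets=64):
    # One document partition per simulated processor.
    num_procs = len(doc_partitions)
    indexes = [defaultdict(list) for _ in range(num_procs)]
    for docs in doc_partitions:
        for term, doc_ids in local_inversion(docs).items():
            # Stands in for the communication step: the postings are
            # "sent" to the processor that owns the term's bucket.
            dest = owner_of(bucket_of(term, num_buckets), num_procs)
            indexes[dest][term].extend(doc_ids)
    for index in indexes:
        for doc_ids in index.values():
            doc_ids.sort()
    return indexes


if __name__ == "__main__":
    partitions = [
        [(0, "parallel inverted index"), (1, "inverted index")],
        [(2, "parallel framework")],
    ]
    for proc, index in enumerate(build_term_partitioned_index(partitions, 8)):
        print(proc, dict(index))
```

In a real distributed setting the inner loop would be replaced by message exchange among processors, and the choice of bucket-to-processor assignment would affect both communication volume and load balance, which is the trade-off the abstract refers to.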

Keywords
Index inversion, Memory-based inversion, Parallel inversion, Term-based partitioning