Iterative-improvement-based declustering heuristics for multi-disk databases

Date
2005
Source Title
Information Systems
Print ISSN
0306-4379
Electronic ISSN
1873-6076
Publisher
Elsevier
Volume
30
Issue
1
Pages
47 - 70
Language
English
Type
Article
Abstract

Data declustering is an important issue for reducing query response times in multi-disk database systems. In this paper, we propose a declustering method that utilizes the available information on query distribution, data distribution, data-item sizes, and disk capacity constraints. The proposed method exploits the natural correspondence between a data set with a given query distribution and a hypergraph. We define an objective function that exactly represents the aggregate parallel query-response time for the declustering problem and adapt the iterative-improvement-based heuristics successfully used in hypergraph partitioning to this objective function. We propose a two-phase algorithm that first obtains an initial K-way declustering by recursively bipartitioning the data set, then applies multi-way refinement on this declustering. We provide effective gain models and efficient implementation schemes for both phases. The experimental results on a wide range of realistic data sets show that the proposed method provides a significant performance improvement compared with the state-of-the-art declustering strategy based on similarity-graph partitioning.
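As a rough illustration of the kind of objective the abstract describes, the sketch below treats each query as a set of data items (a hyperedge) and takes a query's parallel response time to be the load on the most heavily used disk among those holding its items. The function names, the unit item sizes, the equal query frequencies, and the brute-force single-item move loop are all assumptions made for this example. The loop is a simplified stand-in for iterative improvement; it does not reproduce the paper's gain models, its recursive bipartitioning phase, or its implementation schemes.

```python
from collections import Counter


def query_response_time(query, assignment, sizes=None):
    """Load on the busiest disk for one query (unit item sizes if sizes is None)."""
    load = Counter()
    for item in query:
        load[assignment[item]] += 1 if sizes is None else sizes[item]
    return max(load.values()) if load else 0


def aggregate_response_time(queries, assignment, freqs=None, sizes=None):
    """Frequency-weighted sum of per-query response times."""
    total = 0
    for i, query in enumerate(queries):
        freq = 1 if freqs is None else freqs[i]
        total += freq * query_response_time(query, assignment, sizes)
    return total


def refine(queries, assignment, num_disks, capacity, max_passes=10):
    """Greedy single-item moves under a per-disk item-count capacity.

    Hypothetical helper: it recomputes the full objective for every
    candidate move instead of maintaining gains incrementally.
    """
    assignment = dict(assignment)
    disk_load = Counter(assignment.values())
    best = aggregate_response_time(queries, assignment)
    for _ in range(max_passes):
        improved = False
        for item in sorted(assignment):
            src = assignment[item]
            for dst in range(num_disks):
                if dst == src or disk_load[dst] >= capacity:
                    continue
                assignment[item] = dst
                cost = aggregate_response_time(queries, assignment)
                if cost < best:
                    # Keep the move and update the bookkeeping.
                    best = cost
                    disk_load[src] -= 1
                    disk_load[dst] += 1
                    src = dst
                    improved = True
                else:
                    # Undo the trial move.
                    assignment[item] = src
        if not improved:
            break
    return assignment, best


# Toy data set: four items, three queries, everything initially on disk 0.
queries = [("a", "b"), ("b", "c"), ("a", "c", "d")]
initial = {"a": 0, "b": 0, "c": 0, "d": 0}
initial_cost = aggregate_response_time(queries, initial)
refined, refined_cost = refine(queries, initial, num_disks=2, capacity=2)
```

Recomputing the objective for every trial move keeps the sketch short but scales poorly; avoiding that cost is presumably what the gain models mentioned in the abstract are for.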

Keywords
Parallel database systems, Declustering, Hypergraph partitioning, Iterative improvement, Weighted similarity graph, Maxcut graph partitioning