Selective replicated declustering for arbitrary queries

buir.contributor.author: Aykanat, Cevdet
dc.citation.epage: 386 [en_US]
dc.citation.spage: 375 [en_US]
dc.contributor.author: Oktay, K. Yasin [en_US]
dc.contributor.author: Türk, Ata [en_US]
dc.contributor.author: Aykanat, Cevdet [en_US]
dc.coverage.spatial: Delft, The Netherlands
dc.date.accessioned: 2016-02-08T12:27:25Z
dc.date.available: 2016-02-08T12:27:25Z
dc.date.issued: 2009-08 [en_US]
dc.department: Department of Computer Engineering [en_US]
dc.description: Date of Conference: 25-28 August, 2009
dc.description: Conference name: European Conference on Parallel Processing. Euro-Par 2009: Euro-Par 2009 Parallel Processing
dc.description.abstract: Data declustering is used to minimize query response times in data-intensive applications. In this technique, the query retrieval process is parallelized by distributing the data among several disks; it is useful in applications, such as geographic information systems, that access huge amounts of data. Declustering with replication is an extension of declustering that allows data replicas in the system. Many replicated declustering schemes have been proposed. Most of these schemes generate two or more copies of all data items. However, some applications have very large data sizes, and even having two copies of all data items may not be feasible. In such systems, selective replication is a necessity. Furthermore, existing replication schemes are not designed to utilize query distribution information when such information is available. In this study, we propose a replicated declustering scheme that decides both which data items to replicate and how to assign all data items to disks when replication capacity is limited. We use available query information to decide the replication and partitioning of the data and try to optimize the aggregate parallel response time. We propose and implement a Fiduccia-Mattheyses-like iterative improvement algorithm to obtain a two-way replicated declustering and use this algorithm in a recursive framework to generate a multi-way replicated declustering. Experiments conducted with arbitrary queries on real datasets show that, especially for low replication constraints, the proposed scheme yields better performance than existing replicated declustering schemes. © 2009 Springer. [en_US]
dc.description.provenance: Made available in DSpace on 2016-02-08T12:27:25Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2009 [en]
dc.identifier.doi: 10.1007/978-3-642-03869-3_37 [en_US]
dc.identifier.uri: http://hdl.handle.net/11693/28697 [en_US]
dc.language.iso: English [en_US]
dc.publisher: Springer [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1007/978-3-642-03869-3_37 [en_US]
dc.source.title: European Conference on Parallel Processing. Euro-Par 2009: Euro-Par 2009 Parallel Processing [en_US]
dc.subject: Data declustering [en_US]
dc.subject: Data items [en_US]
dc.subject: Data replica [en_US]
dc.subject: Data-intensive application [en_US]
dc.subject: Declustering [en_US]
dc.subject: Declustering scheme [en_US]
dc.subject: Iterative improvements [en_US]
dc.subject: Query distributions [en_US]
dc.subject: Query information [en_US]
dc.subject: Query response [en_US]
dc.subject: Query retrieval [en_US]
dc.subject: Real data sets [en_US]
dc.subject: Response time [en_US]
dc.subject: Selective replication [en_US]
dc.subject: Very large datum [en_US]
dc.subject: Artificial intelligence [en_US]
dc.subject: Bioinformatics [en_US]
dc.subject: Disks (machine components) [en_US]
dc.subject: Disks (structural components) [en_US]
dc.subject: Distributed computer systems [en_US]
dc.subject: Geographic information systems [en_US]
dc.subject: Response time (computer systems) [en_US]
dc.title: Selective replicated declustering for arbitrary queries [en_US]
dc.type: Conference Paper [en_US]
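
The objective named in the abstract, aggregate parallel response time, can be illustrated for the two-disk case with a minimal sketch. This is not the authors' implementation: the function names, the unit retrieval cost per data item, and the assumption that every queried item is stored on at least one of the two disks are all introduced here for illustration. A replicated item may be fetched from either disk, so the best schedule for one query balances the replicated items between the two disks.

```python
from math import ceil


def two_disk_response_time(query, disk_a, disk_b):
    """Fewest parallel retrieval steps for one query on two disks.

    Illustrative assumptions: each item costs one step to read, and every
    item in `query` is stored on disk_a, disk_b, or both (a replica).
    """
    only_a = sum(1 for item in query if item in disk_a and item not in disk_b)
    only_b = sum(1 for item in query if item in disk_b and item not in disk_a)
    both = sum(1 for item in query if item in disk_a and item in disk_b)
    total = only_a + only_b + both
    # Replicated items can be routed to the less loaded disk, so the bound is
    # the larger of the unavoidable per-disk loads and an even split.
    return max(only_a, only_b, ceil(total / 2))


def aggregate_response_time(queries, disk_a, disk_b):
    """Frequency-weighted sum of per-query response times.

    `queries` is a list of (set_of_items, frequency) pairs.
    """
    return sum(freq * two_disk_response_time(q, disk_a, disk_b)
               for q, freq in queries)


# Hypothetical workload: one query over four items, one frequent two-item query.
queries = [({1, 2, 3, 4}, 1), ({1, 2}, 2)]
no_replication = aggregate_response_time(queries, {1, 2}, {3, 4})
replicate_item_2 = aggregate_response_time(queries, {1, 2}, {2, 3, 4})
```

In this toy workload, replicating only item 2 lowers the aggregate cost, which is the kind of query-aware, selective choice the abstract describes; an iterative improvement heuristic would search over such moves under the replication capacity limit.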

Files

Original bundle

Name: Selective replicated declustering for arbitrary queries.pdf
Size: 411.74 KB
Format: Adobe Portable Document Format
Description: Full printable version