Efficient discovery of join plans in schemaless data

Acar, Aybar C.; Motro, A.

Files

Efficient discovery of join plans in schemaless data.pdf (830.26 KB)

Date

2009-09

Authors

Acar, Aybar C.

Motro, A.

Abstract

We describe a method of inferring join plans for a set of relation instances, in the absence of any metadata, such as attribute domains, attribute names, or constraints (e.g., keys or foreign keys). Our method enumerates the possible join plans in order of likelihood, based on the compatibility of a pair of columns and their suitability as join attributes (i.e., their appropriateness as keys). We outline two variants of the approach. The first variant is accurate but potentially time-consuming, especially for large relations that do not fit in memory. The second variant is an approximation of the first and hence less accurate, but it is considerably more efficient, allowing the method to be used online, even for large relations. We provide experimental results showing how both variants scale in terms of performance as the number of candidate join attributes and the size of the relations increase. We also characterize the accuracy of the approximate variant with respect to the exact variant. Copyright ©2009 ACM.
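
The abstract does not spell out how column pairs are scored, so the following is only an illustrative sketch of the general idea of ranking candidate join attributes without metadata, not the authors' algorithm. It assumes two stand-in measures that are not taken from the paper: value containment as a proxy for column compatibility, and the fraction of distinct values as a proxy for key suitability. The function names, the column labels, and the sample data are all hypothetical.

```python
from itertools import product


def uniqueness(column):
    """Fraction of distinct values; 1.0 means the column could serve as a key.

    Assumed stand-in for "appropriateness as a key", not the paper's measure.
    """
    return len(set(column)) / len(column) if column else 0.0


def containment(child, parent):
    """Fraction of distinct child values that also appear in parent.

    Assumed stand-in for "compatibility of a pair of columns".
    """
    child_vals, parent_vals = set(child), set(parent)
    if not child_vals:
        return 0.0
    return len(child_vals & parent_vals) / len(child_vals)


def rank_join_candidates(left, right):
    """Score every (left column, right column) pair, highest score first.

    left and right map positional column labels to lists of values,
    since no attribute names are assumed to be available.
    """
    scored = []
    for (lname, lcol), (rname, rcol) in product(left.items(), right.items()):
        score = containment(lcol, rcol) * uniqueness(rcol)
        scored.append((score, lname, rname))
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored


# Hypothetical relation instances with positional column labels only.
orders = {"c0": [1, 2, 2, 3], "c1": [10, 20, 20, 30]}
customers = {"c0": [1, 2, 3, 4], "c1": ["a", "b", "b", "c"]}

ranking = rank_join_candidates(orders, customers)
print(ranking[0])
```

Here the pair formed by the first column of each relation ranks highest, because every value in the first relation's column appears in the second relation's column and that column has no repeated values. An approximate variant in the spirit of the abstract could compute the same scores on a sample of rows instead of the full columns, trading accuracy for speed; that too is an assumption about how such an approximation might look.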

Source Title

IDEAS '09 Proceedings of the 2009 International Database Engineering & Applications Symposium

Publisher

ACM

Keywords

Dependency inference, Join inference, Schema matching, Foreign keys, Metadata, Keys (for locks)

Permalink

http://hdl.handle.net/11693/28695

Published Version (Please cite this version)

http://dx.doi.org/10.1145/1620432.1620434

Collections

Scholarly Publications - Computer Engineering

Language

English

Type

Conference Paper
