Exploiting index pruning methods for clustering XML collections
Date
2010
Abstract
In this paper, we first employ the well-known Cover-Coefficient Based Clustering Methodology (C3M) for clustering XML documents. Next, we apply index pruning techniques from the literature to reduce the size of the document vectors. Our experiments show that, in certain cases, it is possible to prune up to 70% of the collection (more specifically, of the underlying document vectors) and still generate a clustering structure whose quality matches that obtained from the original collection, as measured by a set of evaluation metrics. © 2010 Springer-Verlag Berlin Heidelberg.
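As an illustration of what pruning a document vector can look like, here is a minimal sketch. It is not the paper's method: the abstract does not say which pruning techniques were used, so the sketch assumes a simple per-document strategy that keeps only the highest-weighted terms. The function name `prune_document_vector`, the `prune_ratio` parameter, and the toy weights are all hypothetical.

```python
def prune_document_vector(vector, prune_ratio):
    """Return a copy of one document vector with its lowest-weighted terms removed.

    vector      -- dict mapping term -> weight (hypothetical representation)
    prune_ratio -- fraction of the entries to drop, e.g. 0.7 drops about 70%

    Assumption: entries are ranked by weight within a single document.
    The pruning techniques evaluated in the paper may rank entries differently.
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    if not vector:
        return {}
    # Always keep at least one term so that no document becomes empty.
    keep = max(1, round(len(vector) * (1.0 - prune_ratio)))
    # Sort by descending weight; break ties by term for deterministic output.
    ranked = sorted(vector.items(), key=lambda item: (-item[1], item[0]))
    return dict(ranked[:keep])


# Toy collection: two documents with made-up term weights.
collection = {
    "doc1": {"xml": 3.0, "index": 2.5, "the": 0.1, "of": 0.1, "cluster": 1.8},
    "doc2": {"query": 2.0, "xml": 1.5, "and": 0.2, "a": 0.1},
}
pruned = {
    doc_id: prune_document_vector(vec, 0.5) for doc_id, vec in collection.items()
}
```

The pruned vectors could then be passed to any clustering algorithm in place of the originals, which is the kind of comparison the abstract describes.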
Source Title
Focused Retrieval and Evaluation
Publisher
Springer, Berlin, Heidelberg
Keywords
Cover-coefficient based clustering, Index pruning, Clustering index, Document vectors, Evaluation metrics, Pruning methods, Pruning techniques, Query languages, XML, Markup languages, Quality control
Language
English