Large-scale cluster-based retrieval experiments on Turkish texts
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
891 - 892
We present cluster-based retrieval (CBR) experiments on the largest available Turkish document collection. Our experiments evaluate retrieval effectiveness and efficiency on both an automatically generated clustering structure and a manual classification of documents. In particular, we compare the effectiveness of CBR with that of full-text search (FS) and evaluate several implementation alternatives for CBR. Our findings reveal that CBR yields effectiveness figures comparable to those of FS. Furthermore, by using a specifically tailored cluster-skipping inverted index, we significantly improve the in-memory query processing efficiency of CBR in comparison to other traditional CBR techniques and even FS.
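The abstract names a cluster-skipping inverted index without describing it. The following is a minimal sketch of the general idea only, not the authors' implementation: each term's posting list is grouped into per-cluster blocks, so that query processing can step over the blocks of clusters that were not selected as best matches. The function names (`build_index`, `search`), the toy documents, the weights, and the additive scoring are all assumptions made for illustration; a real index would typically store skip pointers inside a flat posting list rather than nested Python lists.

```python
from collections import defaultdict


def build_index(docs, cluster_of):
    """Build a toy cluster-grouped inverted index (hypothetical helper).

    docs:       {doc_id: {term: weight}}
    cluster_of: {doc_id: cluster_id}
    Returns {term: [(cluster_id, [(doc_id, weight), ...]), ...]},
    with the blocks of each term ordered by cluster id.
    """
    blocks = defaultdict(lambda: defaultdict(list))
    for doc_id in sorted(docs):
        for term, weight in docs[doc_id].items():
            blocks[term][cluster_of[doc_id]].append((doc_id, weight))
    return {term: sorted(by_cluster.items()) for term, by_cluster in blocks.items()}


def search(index, query_terms, best_clusters, top_k=10):
    """Score only documents in the selected clusters (assumed additive scoring)."""
    scores = defaultdict(float)
    for term in query_terms:
        for cluster_id, postings in index.get(term, []):
            if cluster_id not in best_clusters:
                continue  # skip the whole block for a non-selected cluster
            for doc_id, weight in postings:
                scores[doc_id] += weight
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Toy data, invented for this example.
docs = {
    1: {"ankara": 2.0, "haber": 1.0},
    2: {"ankara": 1.0},
    3: {"haber": 3.0},
    4: {"ankara": 1.5, "haber": 0.5},
}
cluster_of = {1: 0, 2: 0, 3: 1, 4: 1}

index = build_index(docs, cluster_of)
results = search(index, ["ankara", "haber"], best_clusters={1})
```

In this sketch the selection of best-matching clusters is taken as given (`best_clusters`); only the skipping step during posting-list traversal is shown.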
Classification (of information)
Cluster based retrieval
Full-text search (FS)
Published Version (Please cite this version): http://dx.doi.org/10.1145/1277741.1277961
Showing items related by title, author, creator and subject.
Türel, Anıl; Can, Fazlı (Springer, Berlin, Heidelberg, 2011) Search engines present query results as a long ordered list of web snippets divided into several pages. Post-processing of retrieval results for easier access of desired information is an important research problem. In ...
Can, F.; Altingövde, I.S.; Demir, E. (Elsevier, 2004) Our research shows that for large databases, without considerable additional storage overhead, cluster-based retrieval (CBR) can compete with the time efficiency and effectiveness of the inverted index-based full search ...
Altıngövde, İsmail Şengör; Atılgan, Duygu; Ulusoy, Özgür (Springer, Berlin, Heidelberg, 2010) In this paper, we first employ the well known Cover-Coefficient Based Clustering Methodology (C3M) for clustering XML documents. Next, we apply index pruning techniques from the literature to reduce the size of the document ...