Large-scale cluster-based retrieval experiments on Turkish texts
Author
Altıngövde, İsmail Şengör
Özcan, Rıfat
Öcalan, Hüseyin C.
Can, Fazlı
Ulusoy, Özgür
Date
2007
Source Title
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Publisher
ACM
Pages
891 - 892
Language
English
Type
Conference Paper
Abstract
We present cluster-based retrieval (CBR) experiments on the largest available Turkish document collection. Our experiments evaluate retrieval effectiveness and efficiency on both an automatically generated clustering structure and a manual classification of documents. In particular, we compare the effectiveness of CBR with that of full-text search (FS) and evaluate several implementation alternatives for CBR. Our findings reveal that CBR yields effectiveness figures comparable to those of FS. Furthermore, by using a specifically tailored cluster-skipping inverted index, we significantly improve the in-memory query processing efficiency of CBR in comparison to other traditional CBR techniques and even FS.
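The abstract names a cluster-skipping inverted index without describing it. The following is a minimal illustrative sketch, not the authors' implementation: it assumes each term's postings are grouped by cluster, so that query processing can pass over whole groups belonging to clusters that were not selected. The function names (`build_index`, `search`), the raw term-frequency scoring, and the way `best_clusters` is supplied are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_index(docs, cluster_of):
    """Build a toy cluster-grouped inverted index (illustrative assumption).

    docs: {doc_id: [term, ...]}; cluster_of: {doc_id: cluster_id}.
    Returns {term: [(cluster_id, [(doc_id, tf), ...]), ...]},
    with groups ordered by cluster id and postings by doc id.
    """
    grouped = defaultdict(lambda: defaultdict(dict))
    for doc_id, terms in docs.items():
        cluster_id = cluster_of[doc_id]
        for term in terms:
            postings = grouped[term][cluster_id]
            postings[doc_id] = postings.get(doc_id, 0) + 1
    return {
        term: [(c, sorted(by_cluster[c].items())) for c in sorted(by_cluster)]
        for term, by_cluster in grouped.items()
    }


def search(index, query_terms, best_clusters):
    """Score only documents in best_clusters (hypothetical tf-sum scoring).

    Posting groups of clusters outside best_clusters are skipped as a
    whole, without visiting their individual postings.
    """
    scores = defaultdict(int)
    for term in query_terms:
        for cluster_id, postings in index.get(term, []):
            if cluster_id not in best_clusters:
                continue  # skip the entire group for this cluster
            for doc_id, tf in postings:
                scores[doc_id] += tf
    # Highest score first; ties broken by ascending doc id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this sketch the set of best clusters is passed in by the caller; in a CBR system it would come from a separate cluster-selection step, which is outside the scope of the example.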
Keywords
Cluster-based retrieval
Cluster-skipping
Inverted index
Turkish
Classification (of information)
Cluster analysis
Data processing
Query languages
Search engines
Cluster based retrieval
Full-text search (FS)
Information retrieval
Permalink
http://hdl.handle.net/11693/27076
Published Version (Please cite this version)
http://dx.doi.org/10.1145/1277741.1277961
Related items
Showing items related by title, author, creator and subject.
-
A new approach to search result clustering and labeling
Türel, Anıl; Can, Fazlı (Springer, Berlin, Heidelberg, 2011)
Search engines present query results as a long ordered list of web snippets divided into several pages. Post-processing of retrieval results for easier access of desired information is an important research problem. In ...
-
Efficiency and effectiveness of query processing in cluster-based retrieval
Can, F.; Altingövde, I.S.; Demir, E. (Elsevier, 2004)
Our research shows that for large databases, without considerable additional storage overhead, cluster-based retrieval (CBR) can compete with the time efficiency and effectiveness of the inverted index-based full search ...
-
EHPBS: Energy harvesting prediction based scheduling in wireless sensor networks
Akgun, B.; Aykın, Irmak (IEEE, 2013)
The clustering algorithms designed for traditional sensor networks have been adapted for energy harvesting sensor networks (EHWSN). However, in these algorithms, the intra-cluster MAC protocols to be used were either not ...