Efficiency and effectiveness of query processing in cluster-based retrieval

Can, F.; Altingövde, I.S.; Demir, E.

dc.citation.epage	717	en_US
dc.citation.issueNumber	8	en_US
dc.citation.spage	697	en_US
dc.citation.volumeNumber	29	en_US
dc.contributor.author	Can, F.	en_US
dc.contributor.author	Altingövde, I.S.	en_US
dc.contributor.author	Demir, E.	en_US
dc.date.accessioned	2016-02-08T10:25:06Z
dc.date.available	2016-02-08T10:25:06Z	en_US
dc.date.issued	2004	en_US
dc.department	Department of Computer Engineering	en_US
dc.description.abstract	Our research shows that for large databases, without considerable additional storage overhead, cluster-based retrieval (CBR) can compete with the time efficiency and effectiveness of the inverted index-based full search (FS). The proposed CBR method employs a storage structure that blends the cluster membership information into the inverted file posting lists. This approach significantly reduces the cost of similarity calculations for document ranking during query processing and improves efficiency. For example, in terms of in-memory computations, our new approach can reduce query processing time to 39% of FS. The experiments confirm that the approach is scalable and system performance improves with increasing database size. In the experiments, we use the cover coefficient-based clustering methodology (C3M), and the Financial Times database of TREC containing 210158 documents of size 564 MB defined by 229748 terms with a total of 29545234 inverted index elements. This study provides CBR efficiency and effectiveness experiments using the largest corpus in an environment that employs no user interaction or user behavior assumption for clustering. © 2003 Elsevier Ltd. All rights reserved.	en_US
dc.identifier.doi	10.1016/S0306-4379(03)00062-0	en_US
dc.identifier.issn	0306-4379	en_US
dc.identifier.issn	1873-6076	en_US
dc.identifier.uri	http://hdl.handle.net/11693/24168	en_US
dc.language.iso	English	en_US
dc.publisher	Elsevier	en_US
dc.relation.isversionof	http://dx.doi.org/10.1016/S0306-4379(03)00062-0	en_US
dc.source.title	Information Systems	en_US
dc.subject	Cluster-based retrieval	en_US
dc.subject	Clustering	en_US
dc.subject	Information retrieval	en_US
dc.subject	Performance	en_US
dc.subject	Query processing	en_US
dc.subject	Algorithms	en_US
dc.subject	Data structures	en_US
dc.subject	Indexing (of information)	en_US
dc.subject	Information science	en_US
dc.subject	Optimization	en_US
dc.subject	In-memory computations	en_US
dc.subject	Query languages	en_US
dc.title	Efficiency and effectiveness of query processing in cluster-based retrieval	en_US
dc.type	Article	en_US
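
The abstract describes a storage structure that blends cluster membership information into the inverted file posting lists. The record does not give the layout, so the following is only a minimal sketch of the general idea under stated assumptions: postings are grouped by cluster identifier, the best-matching clusters have already been chosen by some earlier step, and the score is a plain sum of term frequencies. The names build_index, rank, and selected_clusters are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_index(docs, cluster_of):
    """Build an inverted index whose postings are grouped by cluster.

    docs:       {doc_id: {term: term_frequency}}
    cluster_of: {doc_id: cluster_id}
    Returns     {term: {cluster_id: [(doc_id, term_frequency), ...]}}

    Assumption: grouping postings by cluster is one possible way to
    keep membership information inside the posting lists.
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, terms in docs.items():
        for term, tf in terms.items():
            index[term][cluster_of[doc_id]].append((doc_id, tf))
    return index


def rank(index, query_terms, selected_clusters):
    """Score only documents that belong to the selected clusters.

    Postings of all other clusters are never visited, which is where
    the saving in similarity calculations would come from.
    The score here is a simple sum of term frequencies (an assumption).
    """
    scores = defaultdict(int)
    for term in query_terms:
        by_cluster = index.get(term, {})
        for cluster_id in selected_clusters:
            for doc_id, tf in by_cluster.get(cluster_id, ()):
                scores[doc_id] += tf
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {
    1: {"query": 2, "cluster": 1},
    2: {"query": 1},
    3: {"query": 5},
}
cluster_of = {1: "A", 2: "A", 3: "B"}
index = build_index(docs, cluster_of)
print(rank(index, ["query", "cluster"], ["A"]))
```

Restricting the ranking to cluster "A" leaves document 3 unscored, while passing both clusters gives the same ranking a full search over the index would produce with this scoring function.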

Files

Original bundle

Name: Efficiency and effectiveness of query processing in cluster-based retrieval.pdf
Size: 263.84 KB
Format: Adobe Portable Document Format
Description: Full printable version

Collections

Scholarly Publications - Computer Engineering