Efficiency and effectiveness of query processing in cluster-based retrieval

dc.citation.epage: 717 (en_US)
dc.citation.issueNumber: 8 (en_US)
dc.citation.spage: 697 (en_US)
dc.citation.volumeNumber: 29 (en_US)
dc.contributor.author: Can, F. (en_US)
dc.contributor.author: Altingövde, I.S. (en_US)
dc.contributor.author: Demir, E. (en_US)
dc.date.accessioned: 2016-02-08T10:25:06Z
dc.date.available: 2016-02-08T10:25:06Z (en_US)
dc.date.issued: 2004 (en_US)
dc.department: Department of Computer Engineering (en_US)
dc.description.abstract: Our research shows that for large databases, without considerable additional storage overhead, cluster-based retrieval (CBR) can compete with the time efficiency and effectiveness of the inverted index-based full search (FS). The proposed CBR method employs a storage structure that blends the cluster membership information into the inverted file posting lists. This approach significantly reduces the cost of similarity calculations for document ranking during query processing and improves efficiency. For example, in terms of in-memory computations, our new approach can reduce query processing time to 39% of FS. The experiments confirm that the approach is scalable and system performance improves with increasing database size. In the experiments, we use the cover coefficient-based clustering methodology (C3M), and the Financial Times database of TREC containing 210,158 documents of size 564 MB defined by 229,748 terms with a total of 29,545,234 inverted index elements. This study provides CBR efficiency and effectiveness experiments using the largest corpus in an environment that employs no user interaction or user behavior assumption for clustering. © 2003 Elsevier Ltd. All rights reserved. (en_US)
dc.identifier.doi: 10.1016/S0306-4379(03)00062-0 (en_US)
dc.identifier.issn: 0306-4379
dc.identifier.issn: 1873-6076
dc.identifier.uri: http://hdl.handle.net/11693/24168 (en_US)
dc.language.iso: English (en_US)
dc.publisher: Elsevier (en_US)
dc.relation.isversionof: http://dx.doi.org/10.1016/S0306-4379(03)00062-0 (en_US)
dc.source.title: Information Systems (en_US)
dc.subject: Cluster-based retrieval (en_US)
dc.subject: Clustering (en_US)
dc.subject: Information retrieval (en_US)
dc.subject: Performance (en_US)
dc.subject: Query processing (en_US)
dc.subject: Algorithms (en_US)
dc.subject: Data structures (en_US)
dc.subject: Indexing (of information) (en_US)
dc.subject: Information science (en_US)
dc.subject: Optimization (en_US)
dc.subject: In-memory computations (en_US)
dc.subject: Query languages (en_US)
dc.title: Efficiency and effectiveness of query processing in cluster-based retrieval (en_US)
dc.type: Article (en_US)
Files
Original bundle
Name: Efficiency and effectiveness of query processing in cluster-based retrieval.pdf
Size: 263.84 KB
Format: Adobe Portable Document Format
Description: Full printable version
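
Illustrative sketch (not taken from the article): the abstract describes a storage structure that blends cluster membership information into the inverted file posting lists. One way such a structure might look is shown below in Python. Everything here is an assumption made for illustration: the function names `build_index` and `rank`, the layout of each posting list as per-cluster blocks, the use of summed term frequencies as a stand-in for the similarity measure, and the premise that the set of best-matching clusters has already been chosen by some earlier step.

```python
from collections import defaultdict


def build_index(docs, cluster_of):
    """Build an inverted index whose posting lists are grouped by cluster.

    docs:       {doc_id: {term: term_frequency}}   (hypothetical input shape)
    cluster_of: {doc_id: cluster_id}               (hypothetical input shape)
    Returns {term: [(cluster_id, [(doc_id, tf), ...]), ...]}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for doc_id, terms in docs.items():
        for term, tf in terms.items():
            grouped[term][cluster_of[doc_id]].append((doc_id, tf))
    # Freeze into plain lists, ordered by cluster id and then by document id.
    return {
        term: [(c, sorted(postings)) for c, postings in sorted(blocks.items())]
        for term, blocks in grouped.items()
    }


def rank(index, query_terms, selected_clusters):
    """Score only documents in the selected clusters.

    The score is a plain sum of term frequencies, used here as a placeholder
    for whatever similarity measure a real system would apply.
    """
    scores = defaultdict(int)
    for term in query_terms:
        for cluster_id, postings in index.get(term, []):
            if cluster_id not in selected_clusters:
                continue  # skip the whole block without touching its postings
            for doc_id, tf in postings:
                scores[doc_id] += tf
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Under these assumptions, restricting a query to a few clusters means that posting-list blocks belonging to other clusters are skipped as a unit, which is one plausible reading of how embedding cluster membership in the posting lists could reduce the number of similarity calculations.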