Static index pruning in web search engines: combining term and document popularities with query views

dc.citation.epage: 2-28 [en_US]
dc.citation.issueNumber: 1 [en_US]
dc.citation.spage: 2-1 [en_US]
dc.citation.volumeNumber: 30 [en_US]
dc.contributor.author: Altingovde, I. S. [en_US]
dc.contributor.author: Ozcan, R. [en_US]
dc.contributor.author: Ulusoy, O. [en_US]
dc.date.accessioned: 2016-02-08T09:48:43Z
dc.date.available: 2016-02-08T09:48:43Z
dc.date.issued: 2012 [en_US]
dc.department: Department of Computer Engineering [en_US]
dc.description.abstract: Static index pruning techniques permanently remove a presumably redundant part of an inverted file to reduce the file size and query processing time. These techniques differ in how they decide which parts of an index can be removed safely; that is, without changing the top-ranked query results. As defined in the literature, the query view of a document is the set of query terms that access this particular document, that is, that retrieve this document among their top results. In this paper, we first propose using query views to improve the quality of the top results compared against the original results. We incorporate query views in a number of static pruning strategies, namely term-centric, document-centric, term popularity based and document access popularity based approaches, and show that the new strategies considerably outperform their counterparts, especially at higher levels of pruning and for both disjunctive and conjunctive query processing. Additionally, we combine the notions of term and document access popularity to form new pruning strategies, and further extend these strategies with query views. The new strategies improve the result quality especially for conjunctive query processing, which is the default and most common search mode of a search engine. © 2012 ACM. [en_US]
dc.description.provenance: Made available in DSpace on 2016-02-08T09:48:43Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2012 [en]
dc.identifier.doi: 10.1145/2094072.2094074 [en_US]
dc.identifier.issn: 1046-8188
dc.identifier.uri: http://hdl.handle.net/11693/21609
dc.language.iso: English [en_US]
dc.publisher: Association for Computing Machinery [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1145/2094072.2094074 [en_US]
dc.source.title: ACM Transactions on Information Systems [en_US]
dc.subject: Query view [en_US]
dc.subject: Static inverted index pruning [en_US]
dc.subject: Conjunctive queries [en_US]
dc.subject: Document access [en_US]
dc.subject: File sizes [en_US]
dc.subject: Inverted files [en_US]
dc.subject: Pruning strategy [en_US]
dc.subject: Pruning techniques [en_US]
dc.subject: Query results [en_US]
dc.subject: Query terms [en_US]
dc.subject: Query languages [en_US]
dc.subject: Query processing [en_US]
dc.subject: Search engines [en_US]
dc.subject: Information retrieval systems [en_US]
dc.title: Static index pruning in web search engines: combining term and document popularities with query views [en_US]
dc.type: Article [en_US]

Files

Original bundle

Name: Static index pruning in web search engines Combining term and document popularities with query views.pdf
Size: 2.32 MB
Format: Adobe Portable Document Format
Description: Full printable version