Timestamp-based result cache invalidation for web search engines

dc.citation.epage: 982 (en_US)
dc.citation.spage: 973 (en_US)
dc.contributor.author: Alıcı, Sadiye (en_US)
dc.contributor.author: Altingovde, I.S. (en_US)
dc.contributor.author: Özcan, Rıfat (en_US)
dc.contributor.author: Cambazoglu, B.B. (en_US)
dc.contributor.author: Ulusoy, Özgür (en_US)
dc.coverage.spatial: Beijing, China (en_US)
dc.date.accessioned: 2016-02-08T12:18:21Z
dc.date.available: 2016-02-08T12:18:21Z
dc.date.issued: 2011 (en_US)
dc.department: Department of Computer Engineering (en_US)
dc.description: Date of Conference: July 24 - 28, 2011 (en_US)
dc.description.abstract: The result cache is a vital component for the efficiency of large-scale web search engines, and maintaining the freshness of cached query results is a current research challenge. As a remedy to this problem, our work proposes a new mechanism to identify queries whose cached results are stale. The basic idea behind our mechanism is to maintain the generation time of each query result and compare it with the update times of posting lists and documents to decide whether the result is stale. The proposed technique is evaluated using a Wikipedia document collection with real update information and a real-life query log. We show that our technique has good prediction accuracy relative to a baseline based on the time-to-live mechanism. Moreover, it is easy to implement and incurs less processing overhead on the system than a recently proposed, more sophisticated invalidation mechanism. (en_US)
dc.description.provenance: Made available in DSpace on 2016-02-08T12:18:21Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2011 (en)
dc.identifier.doi: 10.1145/2009916.2010046 (en_US)
dc.identifier.uri: http://hdl.handle.net/11693/28360 (en_US)
dc.language.iso: English (en_US)
dc.publisher: ACM (en_US)
dc.relation.isversionof: http://dx.doi.org/10.1145/2009916.2010046 (en_US)
dc.source.title: SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (en_US)
dc.subject: Cache invalidation (en_US)
dc.subject: Freshness (en_US)
dc.subject: Result cache (en_US)
dc.subject: Web search (en_US)
dc.subject: Document collection (en_US)
dc.subject: Generation time (en_US)
dc.subject: Invalidation mechanism (en_US)
dc.subject: New mechanisms (en_US)
dc.subject: Prediction accuracy (en_US)
dc.subject: Processing overhead (en_US)
dc.subject: Query logs (en_US)
dc.subject: Query results (en_US)
dc.subject: Research challenges (en_US)
dc.subject: Time-to-live (en_US)
dc.subject: Web searches (en_US)
dc.subject: Wikipedia (en_US)
dc.subject: Research (en_US)
dc.subject: Search engines (en_US)
dc.subject: User interfaces (en_US)
dc.subject: Websites (en_US)
dc.subject: Information retrieval (en_US)
dc.title: Timestamp-based result cache invalidation for web search engines (en_US)
dc.type: Conference Paper (en_US)
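
Illustrative sketch (not part of the record)

The abstract describes comparing the generation time of a cached query result with the update times of posting lists and documents. The Python sketch below shows one simple way such a comparison could look. It is not the authors' implementation: the record gives no algorithmic detail, so the names CachedResult, is_stale, term_updated_at and doc_updated_at are hypothetical, and the rule "any newer timestamp means stale" is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CachedResult:
    """A cached result page and the time it was generated (hypothetical structure)."""
    query_terms: List[str]
    doc_ids: List[int]
    generated_at: float


def is_stale(entry: CachedResult,
             term_updated_at: Dict[str, float],
             doc_updated_at: Dict[int, float]) -> bool:
    """Assumed rule: stale if any related timestamp is newer than the result."""
    # A posting list updated after generation could add or reorder matches.
    for term in entry.query_terms:
        if term_updated_at.get(term, 0.0) > entry.generated_at:
            return True
    # A result document updated after generation could change the result page.
    for doc_id in entry.doc_ids:
        if doc_updated_at.get(doc_id, 0.0) > entry.generated_at:
            return True
    return False


entry = CachedResult(["cache", "invalidation"], [7, 42], generated_at=100.0)
print(is_stale(entry, {"cache": 90.0}, {7: 95.0}))   # False: nothing is newer
print(is_stale(entry, {"cache": 120.0}, {7: 95.0}))  # True: posting list updated later
```

Terms and documents with no recorded update time are treated as never updated (timestamp 0.0), which is another simplifying assumption of this sketch.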

Files

Original bundle

Name: Timestamp-based result cache invalidation for web search engines.pdf
Size: 691.72 KB
Format: Adobe Portable Document Format
Description: Full printable version