Segmenting and labeling query sequences in a multidatabase environment

dc.citation.epage: 384 [en_US]
dc.citation.issueNumber: PART 1 [en_US]
dc.citation.spage: 367 [en_US]
dc.citation.volumeNumber: 7044 [en_US]
dc.contributor.author: Acar, Aybar C. [en_US]
dc.contributor.author: Motro, A. [en_US]
dc.coverage.spatial: Hersonissos, Crete, Greece [en_US]
dc.date.accessioned: 2016-02-08T12:17:32Z
dc.date.available: 2016-02-08T12:17:32Z
dc.date.issued: 2011 [en_US]
dc.department: Department of Computer Engineering [en_US]
dc.description: Conference name: Confederated International Conferences: CoopIS, DOA-SVI, and ODBASE 2011 [en_US]
dc.description: Date of Conference: October 17-21, 2011 [en_US]
dc.description.abstract: When gathering information from multiple independent data sources, users will generally pose a sequence of queries to each source, combine (union) or cross-reference (join) the results in order to obtain the information they need. Furthermore, when gathering information, there is a fair bit of trial and error involved, where queries are recursively refined according to the results of a previous query in the sequence. From the point of view of an outside observer, the aim of such a sequence of queries may not be immediately obvious. We investigate the problem of isolating and characterizing subsequences representing coherent information retrieval goals out of a sequence of queries sent by a user to different data sources over a period of time. The problem has two sub-problems: segmenting the sequence into subsequences, each representing a discrete goal; and labeling each query in these subsequences according to how they contribute to the goal. We propose a method in which a discriminative probabilistic model (a Conditional Random Field) is trained with pre-labeled sequences. We have tested the accuracy with which such a model can infer labels and segmentation on novel sequences. Results show that the approach is very accurate (> 95% accuracy) when there are no spurious queries in the sequence and moderately accurate even in the presence of substantial noise (∼70% accuracy when 15% of queries in the sequence are spurious). © 2011 Springer-Verlag. [en_US]
dc.description.provenance: Made available in DSpace on 2016-02-08T12:17:32Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2011 [en]
dc.identifier.doi: 10.1007/978-3-642-25109-2_24 [en_US]
dc.identifier.doi: 10.1007/978-3-642-25109-2 [en_US]
dc.identifier.issn: 0302-9743
dc.identifier.uri: http://hdl.handle.net/11693/28325
dc.language.iso: English [en_US]
dc.publisher: Springer, Berlin, Heidelberg [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1007/978-3-642-25109-2_24 [en_US]
dc.relation.isversionof: https://doi.org/10.1007/978-3-642-25109-2 [en_US]
dc.source.title: On the Move to Meaningful Internet Systems: OTM 2011 [en_US]
dc.subject: Data Management [en_US]
dc.subject: Information Integration [en_US]
dc.subject: Query Processing [en_US]
dc.subject: Conditional random field [en_US]
dc.subject: Data source [en_US]
dc.subject: Multidatabases [en_US]
dc.subject: Probabilistic models [en_US]
dc.subject: Query sequence [en_US]
dc.subject: Sub-problems [en_US]
dc.subject: Trial and error [en_US]
dc.subject: Image segmentation [en_US]
dc.subject: Information management [en_US]
dc.subject: Information retrieval [en_US]
dc.subject: Internet [en_US]
dc.subject: Query processing [en_US]
dc.subject: Search engines [en_US]
dc.title: Segmenting and labeling query sequences in a multidatabase environment [en_US]
dc.type: Conference Paper [en_US]

Files

Original bundle

Name: Segmenting and labeling query sequences in a multidatabase environment.pdf
Size: 332.29 KB
Format: Adobe Portable Document Format
Description: Full printable version