Segmenting and labeling query sequences in a multidatabase environment

Acar, Aybar C.; Motro, A.

dc.citation.epage	384	en_US
dc.citation.issueNumber	PART 1	en_US
dc.citation.spage	367	en_US
dc.citation.volumeNumber	7044	en_US
dc.contributor.author	Acar, Aybar C.	en_US
dc.contributor.author	Motro, A.	en_US
dc.coverage.spatial	Hersonissos, Crete, Greece	en_US
dc.date.accessioned	2016-02-08T12:17:32Z
dc.date.available	2016-02-08T12:17:32Z
dc.date.issued	2011	en_US
dc.department	Department of Computer Engineering	en_US
dc.description	Conference name: Confederated International Conferences: CoopIS, DOA-SVI, and ODBASE 2011	en_US
dc.description	Date of Conference: October 17-21, 2011	en_US
dc.description.abstract	When gathering information from multiple independent data sources, users will generally pose a sequence of queries to each source and then combine (union) or cross-reference (join) the results in order to obtain the information they need. Furthermore, when gathering information, there is a fair bit of trial and error involved, where queries are recursively refined according to the results of a previous query in the sequence. From the point of view of an outside observer, the aim of such a sequence of queries may not be immediately obvious. We investigate the problem of isolating and characterizing subsequences representing coherent information retrieval goals out of a sequence of queries sent by a user to different data sources over a period of time. The problem has two sub-problems: segmenting the sequence into subsequences, each representing a discrete goal; and labeling each query in these subsequences according to how it contributes to the goal. We propose a method in which a discriminative probabilistic model (a Conditional Random Field) is trained with pre-labeled sequences. We have tested the accuracy with which such a model can infer labels and segmentation on novel sequences. Results show that the approach is very accurate (> 95% accuracy) when there are no spurious queries in the sequence and moderately accurate even in the presence of substantial noise (∼70% accuracy when 15% of queries in the sequence are spurious). © 2011 Springer-Verlag.	en_US
dc.identifier.doi	10.1007/978-3-642-25109-2_24	en_US
dc.identifier.doi	10.1007/978-3-642-25109-2	en_US
dc.identifier.issn	0302-9743	en_US
dc.identifier.uri	http://hdl.handle.net/11693/28325	en_US
dc.language.iso	English	en_US
dc.publisher	Springer, Berlin, Heidelberg	en_US
dc.relation.isversionof	http://dx.doi.org/10.1007/978-3-642-25109-2_24	en_US
dc.relation.isversionof	https://doi.org/10.1007/978-3-642-25109-2	en_US
dc.source.title	On the Move to Meaningful Internet Systems: OTM 2011	en_US
dc.subject	Data Management	en_US
dc.subject	Information Integration	en_US
dc.subject	Query Processing	en_US
dc.subject	Conditional random field	en_US
dc.subject	Data source	en_US
dc.subject	Multidatabases	en_US
dc.subject	Probabilistic models	en_US
dc.subject	Query sequence	en_US
dc.subject	Sub-problems	en_US
dc.subject	Trial and error	en_US
dc.subject	Image segmentation	en_US
dc.subject	Information management	en_US
dc.subject	Information retrieval	en_US
dc.subject	Internet	en_US
dc.subject	Query processing	en_US
dc.subject	Search engines	en_US
dc.title	Segmenting and labeling query sequences in a multidatabase environment	en_US
dc.type	Conference Paper	en_US

Files

Original bundle

Name: Segmenting and labeling query sequences in a multidatabase environment.pdf
Size: 332.29 KB
Format: Adobe Portable Document Format
Description: Full printable version

Collections

Scholarly Publications - Computer Engineering