Acar, Aybar C.
Motro, A.
2016-02-08
2016-02-08
2011
0302-9743
http://hdl.handle.net/11693/28325
Conference name: Confederated International Conferences: CoopIS, DOA-SVI, and ODBASE 2011
Date of Conference: October 17-21, 2011

When gathering information from multiple independent data sources, users will generally pose a sequence of queries to each source, combine (union) or cross-reference (join) the results in order to obtain the information they need. Furthermore, when gathering information, there is a fair bit of trial and error involved, where queries are recursively refined according to the results of a previous query in the sequence. From the point of view of an outside observer, the aim of such a sequence of queries may not be immediately obvious. We investigate the problem of isolating and characterizing subsequences representing coherent information retrieval goals out of a sequence of queries sent by a user to different data sources over a period of time. The problem has two sub-problems: segmenting the sequence into subsequences, each representing a discrete goal; and labeling each query in these subsequences according to how they contribute to the goal. We propose a method in which a discriminative probabilistic model (a Conditional Random Field) is trained with pre-labeled sequences. We have tested the accuracy with which such a model can infer labels and segmentation on novel sequences. Results show that the approach is very accurate (> 95% accuracy) when there are no spurious queries in the sequence and moderately accurate even in the presence of substantial noise (∼70% accuracy when 15% of queries in the sequence are spurious).
© 2011 Springer-Verlag.

English
Data Management; Information Integration; Query Processing
Conditional random field; Data source; Information Integration; Multidatabases; Probabilistic models; Query sequence; Sub-problems; Trial and error; Image segmentation; Information management; Information retrieval; Internet; Query processing; Search engines
Segmenting and labeling query sequences in a multidatabase environment
Conference Paper
10.1007/978-3-642-25109-2_24
10.1007/978-3-642-25109-2