Segmenting and labeling query sequences in a multidatabase environment

Acar, Aybar C.; Motro, A.

Files

Segmenting and labeling query sequences in a multidatabase environment.pdf (332.29 KB)

Date

2011

Authors

Acar, Aybar C.

Motro, A.

Abstract

When gathering information from multiple independent data sources, users generally pose a sequence of queries to each source and then combine (union) or cross-reference (join) the results to obtain the information they need. Gathering information also involves a fair amount of trial and error, in which queries are recursively refined according to the results of a previous query in the sequence. From the point of view of an outside observer, the aim of such a sequence of queries may not be immediately obvious. We investigate the problem of isolating and characterizing subsequences that represent coherent information retrieval goals within a sequence of queries sent by a user to different data sources over a period of time. The problem has two sub-problems: segmenting the sequence into subsequences, each representing a discrete goal; and labeling each query in these subsequences according to how it contributes to the goal. We propose a method in which a discriminative probabilistic model (a Conditional Random Field) is trained with pre-labeled sequences. We have tested the accuracy with which such a model can infer labels and segmentation on novel sequences. Results show that the approach is very accurate (> 95% accuracy) when there are no spurious queries in the sequence and moderately accurate even in the presence of substantial noise (∼70% accuracy when 15% of queries in the sequence are spurious). © 2011 Springer-Verlag.
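
To illustrate how the two sub-problems relate, the sketch below shows one possible way to recover the segmentation from a sequence of per-query labels. It is not the authors' method: the tag scheme ("B-<role>" opens a new goal, "I-<role>" continues the current goal, "O" marks a spurious query), the role names, and the function `split_into_goals` are all hypothetical, and the labels are assumed to come from an already trained sequence model such as a Conditional Random Field.

```python
from typing import List, Tuple

# Hypothetical tag scheme (an assumption, not taken from the paper):
#   "B-<role>"  the query opens a new goal and plays <role> in it
#   "I-<role>"  the query continues the current goal
#   "O"         the query is spurious and belongs to no goal
def split_into_goals(
    queries: List[str], tags: List[str]
) -> List[List[Tuple[str, str]]]:
    """Group labeled queries into segments, one segment per inferred goal."""
    if len(queries) != len(tags):
        raise ValueError("queries and tags must have the same length")

    segments: List[List[Tuple[str, str]]] = []
    current: List[Tuple[str, str]] = []
    for query, tag in zip(queries, tags):
        if tag == "O":
            continue  # spurious queries are dropped, the open goal stays open
        prefix, _, role = tag.partition("-")
        if prefix == "B" and current:
            # A new goal begins, so close the segment collected so far.
            segments.append(current)
            current = []
        current.append((query, role))
    if current:
        segments.append(current)
    return segments


# Example input: five queries with labels as a trained model might emit them.
example_queries = ["q1", "q2", "q3", "q4", "q5"]
example_tags = ["B-initial", "I-refine", "O", "B-initial", "I-join"]
example_goals = split_into_goals(example_queries, example_tags)
```

Under this assumed encoding, segmentation and labeling reduce to a single sequence-labeling task, which is the kind of problem a linear-chain model is typically applied to; whether the paper encodes segment boundaries this way is not stated in the abstract.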

Source Title

On the Move to Meaningful Internet Systems: OTM 2011

Publisher

Springer, Berlin, Heidelberg

Keywords

Data Management, Information Integration, Query Processing, Conditional random field, Data source, Multidatabases, Probabilistic models, Query sequence, Sub-problems, Trial and error, Image segmentation, Information management, Information retrieval, Internet, Search engines

Permalink

http://hdl.handle.net/11693/28325

Published Version (Please cite this version)

http://dx.doi.org/10.1007/978-3-642-25109-2_24
https://doi.org/10.1007/978-3-642-25109-2

Collections

Scholarly Publications - Computer Engineering

Language

English

Type

Conference Paper
