Querying web metadata: Native score management and text support in databases
Print ISSN
1557-4644
Abstract
In this article, we discuss the issues involved in adding a native score management system to object-relational databases for querying Web metadata, which describes the semantic content of Web resources. The Web metadata model is based on topics (representing entities), relationships among topics (called metalinks), and importance scores (sideway values) of topics and metalinks. We extend database relations with scoring functions and importance scores. We add score-management clauses with well-defined semantics to SQL and propose the sideway value algebra (SVA) to evaluate the extended SQL queries. The SQL extensions and the SVA are illustrated through two Web resources, namely, the DBLP Bibliography and the SIGMOD Anthology. The SQL extensions include clauses for propagating input tuple importance scores to output tuples during query processing, clauses that specify query stopping conditions, threshold predicates (a type of approximate similarity predicate for text comparisons), and user-defined-function-based predicates. The propagated importance scores are then used to rank the output tuples and return a small number of them. The query stopping conditions are propagated to SVA operators during query processing. We show that our SQL extensions are well-defined, meaning that, given a database and a query Q, the output tuples of Q and their importance scores stay the same under any query processing scheme. To process the SQL extensions, we discuss two SVA operators, namely, the sideway value algebra join and topic closure, give their implementation algorithms, and report on their experimental evaluation.
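The general idea of a score-propagating join with a threshold predicate and a stopping condition can be pictured with a small Python sketch. Everything in it is an assumption made for illustration and is not taken from the article: the function names `similarity` and `scored_join` are hypothetical, the text similarity measure is Python's `difflib` ratio, the input scores are combined by multiplication, and the stopping condition is simply "return the k highest-scored output tuples". The article's actual SQL clauses and SVA join algorithm may differ in all of these respects.

```python
import heapq
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive text similarity in [0, 1] (illustrative choice)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def scored_join(left, right, threshold, k, combine=lambda x, y: x * y):
    """Join two lists of (text, importance_score) tuples.

    A pair qualifies when its text similarity meets the threshold
    (a stand-in for a threshold predicate). The output score is
    propagated from the two input scores via `combine` (multiplication
    is an assumption). Only the k highest-scored results are returned,
    which stands in for a query stopping condition.
    """
    results = []
    for l_text, l_score in left:
        for r_text, r_score in right:
            if similarity(l_text, r_text) >= threshold:
                results.append((combine(l_score, r_score), l_text, r_text))
    # nlargest returns the results sorted by descending output score.
    return heapq.nlargest(k, results)
```

This sketch materializes every qualifying pair before ranking. An operator that receives the stopping condition during query processing, as the abstract describes, could instead stop early once the top k results are known.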