Implementation of a topic map data model for a Web-based information resource

The Web has become a vast information resource in recent years. Millions of people use the Web regularly, and the number is increasing rapidly. The Web is the largest venue in the world for social, economic, educational, and other activities, and anyone anywhere in the world can visit it without leaving his or her seat. Because of its size, finding desired data on the Web in a timely and cost-effective way is a problem of wide interest. In the last several years, many search engines have been created to help Web users find desired information. However, most of these search engines employ topic-independent search methods that rely heavily on keyword-based approaches, so users are presented with many irrelevant search results. In this thesis, we present a data model that uses topic map standards for Web-based information resources. In this model, topics, topic associations, and topic occurrences (called topic metalinks and topic sources, respectively, in this study) are the fundamental concepts. The presented model is in fact a metadata model that describes the content of the Web-based information resource and creates virtual knowledge maps over it. The Web-based information resource is thereby semantically indexed, which allows efficient search and querying of its data. Additionally, we employ full-text indexing in the presented model using a widely accepted method, the inverted file index. Because the data grow rapidly, the inverted file index must be updated dynamically as new documents are added. We have implemented an efficient dynamic update scheme for the inverted file index employed in the model. The presented topic map data model thus combines the strengths of keyword-based and topic-centric search methods.
We also provide a prototype search engine demonstrating that the presented model contributes substantially to efficient and effective search and querying of Web-based information resources.
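
To make the combination of topic-centric and keyword-based search concrete, the following is a minimal Python sketch under stated assumptions, not the implementation developed in the thesis. The names `Topic`, `Metalink`, `TopicMapIndex`, and all method names are hypothetical. The incremental update shown here simply adds postings in place when a document arrives; it stands in for, and should not be read as, the dynamic update scheme of the thesis.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A topic and its topic sources (occurrences), held as document ids."""
    name: str
    sources: set = field(default_factory=set)


@dataclass(frozen=True)
class Metalink:
    """A typed association (topic metalink) between two topics."""
    kind: str
    from_topic: str
    to_topic: str


class TopicMapIndex:
    """Hypothetical container pairing a topic map with an inverted file index."""

    def __init__(self):
        self.topics = {}                    # topic name -> Topic
        self.metalinks = []                 # list of Metalink
        self.postings = defaultdict(dict)   # term -> {doc_id: term frequency}

    def add_topic_source(self, topic_name, doc_id):
        # Register doc_id as a source (occurrence) of the named topic.
        topic = self.topics.setdefault(topic_name, Topic(topic_name))
        topic.sources.add(doc_id)

    def add_metalink(self, kind, from_topic, to_topic):
        # Create both topics if needed, then record the association.
        for name in (from_topic, to_topic):
            self.topics.setdefault(name, Topic(name))
        self.metalinks.append(Metalink(kind, from_topic, to_topic))

    def add_document(self, doc_id, text):
        # Naive in-place update: only the posting lists of the terms in the
        # new document are touched; the index is never rebuilt.
        for term in text.lower().split():
            entry = self.postings[term]
            entry[doc_id] = entry.get(doc_id, 0) + 1

    def keyword_search(self, query):
        # Documents that contain every query term (conjunctive search).
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], {}))
        for term in terms[1:]:
            result &= set(self.postings.get(term, {}))
        return result

    def topic_search(self, topic_name):
        # Documents registered as sources of the topic.
        topic = self.topics.get(topic_name)
        return set(topic.sources) if topic else set()

    def combined_search(self, topic_name, query):
        # Topic-centric filter intersected with the keyword match.
        return self.topic_search(topic_name) & self.keyword_search(query)
```

In this sketch a combined query first narrows the candidates to the sources of a topic and then applies the keyword match, so a document that mentions the query terms but is not a source of the topic is excluded.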

Metadata, XML, topic maps, Web-based information resource, Web search, inverted file index, dynamic update, Web data modeling, semantic indexing
