Browsing by Subject "XML"
Now showing 1 - 5 of 5
Item Open Access
Exploiting index pruning methods for clustering XML collections (Springer, Berlin, Heidelberg, 2010)
Altıngövde, İsmail Şengör; Atılgan, Duygu; Ulusoy, Özgür
In this paper, we first employ the well-known Cover-Coefficient Based Clustering Methodology (C3M) for clustering XML documents. Next, we apply index pruning techniques from the literature to reduce the size of the document vectors. Our experiments show that, in certain cases, it is possible to prune up to 70% of the collection (or, more specifically, the underlying document vectors) and still generate a clustering structure of the same quality as that of the original collection, in terms of a set of evaluation metrics. © 2010 Springer-Verlag Berlin Heidelberg.

Item Open Access
Implementation of a topic map data model for a Web-based information resource (Bilkent University, 2002)
Kutlutürk, Mustafa
The Web has become a vast information resource in recent years. Millions of people use the Web on a regular basis, and the number is increasing rapidly. The Web is the largest center in the world, presenting almost all social, economic, educational, and other activities, and anyone anywhere in the world can visit this huge place without leaving their seat. Due to its size, finding desired data on the Web in a timely and cost-effective way is a problem of wide interest. In the last several years, many search engines have been created to help Web users find desired information. However, most of these search engines employ topic-independent search methods that rely heavily on keyword-based approaches, which present users with many unnecessary search results. In this thesis, we present a data model using topic map standards for Web-based information resources. In this model, topics, topic associations, and topic occurrences (called topic metalinks and topic sources in this study) are the fundamental concepts.
In fact, the presented model is a metadata model that describes the content of the Web-based information resource and creates virtual knowledge maps over the modeled information resource. Thus, semantic indexing of the Web-based information resource is performed to allow efficient search and querying of the data on the resource. Additionally, we employ full-text indexing in the presented model by using a widely accepted method, the inverted file index. Due to the rapid increase of data, the inverted file index must be updated dynamically as new documents are added. We have implemented an efficient dynamic update scheme in the presented model for the inverted file index. The presented topic map data model combines the strengths of both keyword-based and topic-centric search methods. We also provide a prototype search engine demonstrating that our model contributes substantially to the problem of efficient and effective search and querying of Web-based information resources.

Item Open Access
Metadata-based modeling of information resources on the web (Wiley, 2004)
Özel, S. A.; Altingövde, S.; Ulusoy, Özgür; Özsoyoǧlu, G.; Özsoyoǧlu, Z. M.
This paper deals with the problem of modeling Web information resources using expert knowledge and personalized user information for improved Web searching capabilities. We propose a "Web information space" model, which is composed of Web-based information resources (HTML/XML [Hypertext Markup Language/Extensible Markup Language] documents on the Web), expert advice repositories (domain-expert-specified metadata for information resources), and personalized information about users (captured as user profiles that indicate users' preferences about experts as well as users' knowledge about topics).
Expert advice, the heart of the Web information space model, is specified using topics and relationships among topics (called metalinks), along the lines of the recently proposed topic maps. Topics and metalinks constitute metadata that describe the contents of the underlying HTML/XML Web resources. The metadata specification process is semiautomated, and it exploits XML DTDs (Document Type Definitions) to allow domain-expert-guided mapping of DTD elements to topics and metalinks. The expert advice is stored in an object-relational database management system (DBMS). To demonstrate the practicality and usability of the proposed Web information space model, we created a prototype expert advice repository of more than one million topics/metalinks for the DBLP (Database and Logic Programming) Bibliography data set. We also present a query interface that provides sophisticated querying facilities for DBLP Bibliography resources using the expert advice repository.

Item Open Access
XML retrieval using pruned element-index files (Springer, Berlin, Heidelberg, 2010)
Altıngövde, İsmail Şengör; Atılgan, Duygu; Ulusoy, Özgür
An element-index is a crucial mechanism for supporting content-only (CO) queries over XML collections. A full element-index that indexes each element along with the content of its descendants involves high redundancy and reduces query processing efficiency. A direct index, on the other hand, indexes only the content that is directly under each element and disregards the descendants. This results in a smaller index, but possibly at the cost of some reduction in system effectiveness. In this paper, we propose using static index pruning techniques to obtain more compact index files that can still deliver retrieval performance comparable to that of a full index. We also compare the retrieval performance of these pruning-based approaches to that of some other strategies that make use of a direct element-index.
Our experiments, conducted along the lines of the INEX evaluation framework, reveal that pruned index files yield retrieval performance comparable to or even better than that of the full index and direct index for several tasks in the ad hoc track. © 2010 Springer-Verlag Berlin Heidelberg.

Item Open Access
XML-based framework for web-based neurocardiovascular simulation (Bilkent University, 2004)
Uzun, İsmail
Mathematical modeling and numerical simulation of the neurocardiovascular control system have played an important role in better understanding its function and in diagnosing neurological disorders. Current simulations of neurocardiovascular models are carried out using desktop applications, which lack remote access and information sharing facilities. Although web technology has penetrated all areas of research and professional life during the past two decades, the opportunities it provides have not been fully exploited in this area. Moving from the desktop to the web promises global access, platform independence, information sharing, and easy maintainability. Given these features, the need for a framework that enables web-based simulation of neurocardiovascular system models becomes clear. In this thesis, we have proposed and implemented an XML-based framework that enables web-based simulation of neurocardiovascular models. In this context, we implemented an XML-based description language for structured description of neurocardiovascular models, a Java-based simulation package, and supporting software to form a web-based architecture. XML is becoming the universal standard for exchange of structured data over the web. Therefore, we use XML to propose a generic description language, NeuroCardioVascular Markup Language (NCVML), which supports the description of a wide range of models.
We expect neurocardiovascular model descriptions to be encoded in NCVML and to be carried over the web in this format. The Java-based simulation package, NCVJSim, contains a built-in library of domain-specific components and a simulator. The library can be extended over time. Additionally, using Java dynamic class loading and the Java reflection mechanism, we implemented the ability to incorporate user-implemented Java classes at run time. Finally, to achieve web-based access and computing, Java Servlet technology and HTML are utilized. Our proposed framework is designed to serve all types of models; it is therefore not restricted to a particular mathematical neurocardiovascular model.
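
The run-time incorporation of user-implemented classes described in the last abstract can be illustrated with a minimal sketch of Java dynamic class loading and reflection. This is not NCVJSim code: the `DynamicComponentLoader` class, the `Gain` component, and the choice of the standard `DoubleUnaryOperator` interface as a stand-in for a model component are all assumptions made for illustration.

```java
import java.lang.reflect.Constructor;
import java.util.function.DoubleUnaryOperator;

public class DynamicComponentLoader {

    /** Hypothetical stand-in for a user-implemented component: doubles its input. */
    public static class Gain implements DoubleUnaryOperator {
        @Override
        public double applyAsDouble(double x) {
            return 2.0 * x;
        }
    }

    /**
     * Loads a class by its binary name at run time and instantiates it
     * through its public no-argument constructor.
     */
    public static DoubleUnaryOperator load(String className)
            throws ReflectiveOperationException {
        // Resolve the class by name; nothing about it is known at compile time.
        Class<?> raw = Class.forName(className);
        // Reject classes that do not implement the expected interface
        // (asSubclass throws ClassCastException in that case).
        Class<? extends DoubleUnaryOperator> type =
                raw.asSubclass(DoubleUnaryOperator.class);
        // Use reflection to find and invoke the no-argument constructor.
        Constructor<? extends DoubleUnaryOperator> ctor = type.getDeclaredConstructor();
        return ctor.newInstance();
    }

    public static void main(String[] args) throws Exception {
        // The class name would normally come from user input or a model file;
        // here it defaults to the example component above.
        String name = args.length > 0 ? args[0] : Gain.class.getName();
        DoubleUnaryOperator component = load(name);
        System.out.println(component.applyAsDouble(1.5)); // prints 3.0
    }
}
```

In this sketch the class must already be on the classpath. Loading classes from a user-supplied directory or JAR would typically involve a separate class loader such as `java.net.URLClassLoader`, which is omitted here.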