Bilkent News Portal : a system with new event detection and tracking capabilities
Author(s)
Advisor
Can, Fazlı
Date
2009
Publisher
Bilkent University
Language
English
Type
Thesis
Item Usage Stats
169
views
135
downloads
Abstract
News portal services such as browsing, retrieving, and filtering have become an
important research and application area as a result of the information explosion on the
Internet. In this work, we give implementation details of the Bilkent News Portal, which
contains various novel features, ranging from personalization to new event detection and
tracking, that aim to address the needs of news consumers. The thesis
presents the architecture, data and file structures, and experimental foundations of the
news portal. For the implementation and evaluation of the new event detection and
tracking component, we developed a test collection, BilCol2005. The collection
contains 209,305 documents from the entire year of 2005 and covers several events,
eighty of which are annotated by humans. It enables empirical assessment of new
event detection and tracking algorithms for Turkish. To construct the test
collection, we developed a web application, ETracker, following the guidelines of the
Topic Detection and Tracking (TDT) research initiative. Furthermore, we experimentally evaluated the impact of
various information retrieval (IR) parameters that have to be decided during the
implementation of a news portal that provides filtering and retrieval capabilities. For
this purpose, we investigated the effects of stemming, document length, query length,
and scalability issues.
Keywords
Content based information filtering
Information retrieval (IR)
New event detection and tracking
News portal
Test collection construction
TDT