Browsing by Subject "TDT"
Item Open Access
Bilkent News Portal: a system with new event detection and tracking capabilities (2009)
Öcalan, Hüseyin Çağdaş
News portal services such as browsing, retrieving, and filtering have become an important research and application area as a result of the information explosion on the Internet. In this work, we give implementation details of the Bilkent News Portal, which contains various novel features, ranging from personalization to new event detection and tracking capabilities, aimed at addressing the needs of news consumers. The thesis presents the architecture, data and file structures, and experimental foundations of the news portal. For the implementation and evaluation of the new event detection and tracking component, we developed a test collection: BilCol2005. The collection contains 209,305 documents from the entire year of 2005 and covers several events, eighty of which are annotated by humans. It enables empirical assessment of new event detection and tracking algorithms on Turkish. For the construction of our test collection, a web application, ETracker, was developed following the guidelines of the TDT (Topic Detection and Tracking) research initiative. Furthermore, we experimentally evaluated the impact of various information retrieval (IR) parameters that have to be decided during the implementation of a news portal that provides filtering and retrieval capabilities. For this purpose, we investigated the effects of stemming, document length, query length, and scalability issues.

Item Open Access
New event detection using chronological term ranking (2009)
Bağlıoğlu, Özgür
News web pages are an important resource for news consumers since the Internet provides the most up-to-date information. However, the abundance of this information is overwhelming. To address this problem, news articles should be organized in various ways. For example, new event detection (NED) and tracking studies aim to do so by categorizing news stories according to events.
Generally, important issues are presented at the beginning of news articles. Based on this observation, we modify the term weighting component of the Okapi similarity measure in several different ways and use the resulting measures in NED. We perform numerous experiments in Turkish using the BilCol2005 test collection, which contains 209,305 documents from the entire year of 2005 and covers several events, eighty of which are annotated by humans. In this study, we developed various chronological term ranking (CTR) functions that use term positions and several parameters. Our experimental results show that CTR in combination with Okapi improves the effectiveness of a baseline system by up to 13%. We demonstrate that NED using CTR performs robustly on different versions of the TDT collection generated by N-pass detection evaluation. The tests indicate that the improvements are statistically significant.
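
The abstract above does not state the actual CTR functions, so the following is only a minimal sketch of the general idea: an Okapi (BM25-style) term weight multiplied by a bonus that favors terms first appearing early in a story. The function names, the linear decay, and the `strength` and `threshold` parameters are all hypothetical choices made for illustration, not the functions evaluated in the thesis.

```python
import math
from collections import Counter


def ctr_weight(first_pos, doc_len, strength=0.5):
    """Hypothetical position bonus (an assumption, not the thesis's function).

    Returns 1 + strength for a term at position 0 and decays linearly
    toward 1 as the first occurrence moves to the end of the document.
    """
    if doc_len <= 0:
        return 1.0
    return 1.0 + strength * (1.0 - first_pos / doc_len)


def okapi_ctr_vector(tokens, df, n_docs, avg_len, k1=1.2, b=0.75, strength=0.5):
    """Build a term -> weight map: BM25-style weight times the position bonus."""
    tf = Counter(tokens)
    first = {}
    for i, term in enumerate(tokens):
        first.setdefault(term, i)  # remember only the first occurrence
    doc_len = len(tokens)
    vec = {}
    for term, freq in tf.items():
        n_t = df.get(term, 0)  # number of documents containing the term
        idf = math.log((n_docs - n_t + 0.5) / (n_t + 0.5) + 1.0)
        norm = k1 * (1.0 - b + b * doc_len / avg_len)
        okapi_tf = freq * (k1 + 1.0) / (freq + norm)
        vec[term] = idf * okapi_tf * ctr_weight(first[term], doc_len, strength)
    return vec


def similarity(a, b):
    """Dot product over the terms shared by two weight maps."""
    return sum(w * b[t] for t, w in a.items() if t in b)


def is_new_event(vec, previous, threshold):
    """Flag a story as a new event if it is dissimilar to every earlier story."""
    return all(similarity(vec, p) < threshold for p in previous)
```

With `strength=0` the position bonus is always 1, so the sketch reduces to a plain BM25-style baseline, which makes it easy to compare a position-aware variant against an unmodified one.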