New event detection using chronological term ranking

Date

2009

Advisor

Can, Fazlı

Language

English

Abstract

News web pages are an important resource for news consumers because the Internet provides the most up-to-date information. However, the abundance of this information is overwhelming, so news articles need to be organized in various ways. New event detection (NED) and tracking studies address this problem by categorizing news stories according to events. In general, the important issues are presented at the beginning of a news article. Based on this observation, we modify the term weighting component of the Okapi similarity measure in several different ways and use the resulting measures in NED. In this study, we develop various chronological term ranking (CTR) functions that use term positions and several parameters. We perform numerous experiments in Turkish using the BilCol2005 test collection, which contains 209,305 documents from the entire year of 2005 and covers several events, eighty of which are annotated by humans. Our experimental results show that CTR in combination with Okapi improves the effectiveness of a baseline system by up to 13%. We demonstrate that NED using CTR performs robustly on different versions of the TDT collection generated by N-pass detection evaluation. The tests indicate that the improvements are statistically significant.
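
The idea in the abstract, weighting terms by where they appear and feeding those weights into an Okapi-style similarity for NED, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the linear decay in `ctr_weight`, the parameter `c`, the simplified dot-product `similarity`, the threshold rule in `is_new_event`, and all function names are hypothetical and are not the CTR functions or the exact Okapi formulation used in the thesis.

```python
import math
from collections import Counter


def ctr_weight(first_pos, doc_len, c=0.5):
    """Hypothetical position factor: 1.0 at the start of the document,
    decreasing linearly toward 1 - c at the end (assumed form)."""
    return 1.0 - c * (first_pos / max(doc_len, 1))


def okapi_ctr_vector(tokens, df, n_docs, avg_len, k1=1.2, b=0.75, c=0.5):
    """Okapi BM25-style term weights scaled by the position factor above."""
    tf = Counter(tokens)
    first = {}
    for i, term in enumerate(tokens):
        first.setdefault(term, i)  # position of the first occurrence
    dl = len(tokens)
    vec = {}
    for term, f in tf.items():
        d = df.get(term, 0)
        idf = math.log((n_docs - d + 0.5) / (d + 0.5) + 1.0)
        tf_part = f * (k1 + 1) / (f + k1 * (1 - b + b * dl / avg_len))
        vec[term] = idf * tf_part * ctr_weight(first[term], dl, c)
    return vec


def similarity(v1, v2):
    """Simplified similarity: dot product over shared terms."""
    return sum(w * v2[t] for t, w in v1.items() if t in v2)


def is_new_event(doc_vec, previous_vecs, threshold):
    """Flag a story as a new event if no earlier story is similar enough."""
    return all(similarity(doc_vec, p) < threshold for p in previous_vecs)


tokens = ["quake", "in", "city", "report"]
df = {t: 2 for t in tokens}
vec = okapi_ctr_vector(tokens, df, n_docs=100, avg_len=4.0)
```

In this sketch, a term that first appears early in a story ("quake") receives a larger weight than an otherwise identical term that first appears later ("report"), which reflects the observation that important issues tend to be presented at the beginning of news articles.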

Degree Discipline

Computer Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)
