Novelty detection in topic tracking

dc.contributor.advisor: Can, Fazlı
dc.contributor.author: Aksoy, Cem
dc.department: Department of Computer Engineering
dc.description: Ankara : The Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University, 2010.
dc.description: Thesis (Master's) -- Bilkent University, 2010.
dc.description: Includes bibliographical references (leaves 51-56).
dc.description.abstract: News portals provide many services to news consumers, such as information retrieval, personalized information filtering, summarization, and news clustering. Additionally, many news portals that draw on multiple sources enrich their content and so enable users to evaluate developments from different perspectives. However, the increasing number of sources and incoming news items makes it difficult for news consumers to find news of interest. For this reason, different types of organizational operations are applied to ease browsing. New event detection and tracking (NEDT) is one of these operations; it aims to organize news with respect to the events reported. NEDT by itself may not be enough to satisfy news consumers’ needs, because the use of multiple sources can lead to repeated information among the tracking news of a topic. In this thesis, we investigate the use of novelty detection (ND) in tracking the news of a topic. To this end, we built a Turkish ND experimental collection, BilNov, consisting of 59 topics with an average of 51 tracking news items per topic. We propose three methods: a cosine similarity-based ND method, a language model-based ND method, and a cover coefficient-based ND method. Additionally, we experiment with category-based threshold learning, which has not previously been studied in the ND literature. We also provide some experimental pointers for ND in Turkish, such as restriction of document vector lengths and smoothing methods. Finally, we experiment on the TREC Novelty Track 2004 dataset. Experiments conducted with BilNov show that the language model-based ND method significantly outperforms the other two methods and that category-based threshold learning gives promising results compared to general threshold learning.
dc.description.statementofresponsibility: Aksoy, Cem
dc.format.extent: xi, 65 leaves
dc.publisher: Bilkent University
dc.subject: Novelty Detection
dc.subject: Topic Tracking
dc.subject.lcc: Z699 .A57 2010
dc.subject.lcsh: Information storage and retrieval systems.
dc.subject.lcsh: Information retrieval.
dc.subject.lcsh: Automatic tracking.
dc.subject.lcsh: Cross-language information retrieval.
dc.title: Novelty detection in topic tracking
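
The abstract names a cosine similarity-based ND method but this record does not describe its details. The following is only a minimal sketch of the general idea, not the thesis's implementation: a tracking document is flagged as novel when its highest cosine similarity to any earlier document of the same topic falls below a threshold. The whitespace tokenization, raw term counts, the function names `cosine` and `flag_novel`, and the threshold value 0.5 are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_novel(docs, threshold=0.5):
    """Return one boolean per document, in arrival order.

    A document is flagged novel when its maximum similarity to all
    previously seen documents is below `threshold` (an assumed value).
    """
    seen = []   # term-frequency vectors of earlier documents
    flags = []
    for text in docs:
        vec = Counter(text.lower().split())  # naive tokenization (assumption)
        max_sim = max((cosine(vec, prev) for prev in seen), default=0.0)
        flags.append(max_sim < threshold)
        seen.append(vec)
    return flags


if __name__ == "__main__":
    stream = [
        "earthquake hits city",
        "earthquake hits city",
        "rescue teams arrive tonight",
    ]
    print(flag_novel(stream))
```

In this sketch the first document of a topic is always novel, since there is nothing earlier to compare it against. A threshold of this kind would normally be learned from labeled data, either once for all topics or separately per category, as the abstract mentions.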