New event detection and tracking in Turkish

buir.advisor: Can, Fazlı
dc.contributor.author: Kardaş, Süleyman
dc.date.accessioned: 2016-01-08T18:10:16Z
dc.date.available: 2016-01-08T18:10:16Z
dc.date.issued: 2009
dc.description: Ankara : The Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University, 2009. [en_US]
dc.description: Thesis (Master's) -- Bilkent University, 2009. [en_US]
dc.description: Includes bibliographical references (leaves 66-73). [en_US]
dc.description.abstract: The amount of information and the number of information resources on the Internet have been growing rapidly for over a decade. This is also true for on-line news and news providers. To overcome information overload, news consumers prefer to track the topics that they are interested in. Topic detection and tracking (TDT) applications aim to organize the temporally ordered stories of a news stream according to the events they report. Two major problems in TDT are new event detection (NED) and topic tracking (TT). NED focuses on finding the first story of each previously unseen event; TT focuses on finding all subsequent stories on a topic defined by a small number of initial stories. In this thesis, the NED and TT problems are investigated in detail using the first large-scale test collection (BilCol2005) developed by the Bilkent Information Retrieval Group. The collection contains 209,305 documents from the entire year of 2005 and covers several events, eighty of which are annotated by humans. The experimental results show that a simple word truncation stemming method is statistically competitive with a sophisticated stemming approach that pays attention to the morphological structure of the language. Our statistical findings illustrate that word stopping and the contents of the associated stopword list are important, and that removing stopwords from the content can significantly improve system performance. We demonstrate that the confidence scores of two different similarity measures can be combined in a straightforward manner to improve effectiveness. [en_US]
dc.description.provenance: Made available in DSpace on 2016-01-08T18:10:16Z (GMT). No. of bitstreams: 1. 0003828.pdf: 1438158 bytes, checksum: 2359589bc42b40e9be74e93b2a95e9a5 (MD5) [en]
dc.description.statementofresponsibility: Kardaş, Süleyman [en_US]
dc.format.extent: xv, 77 leaves, graphics [en_US]
dc.identifier.itemid: BILKUTUPB116269
dc.identifier.uri: http://hdl.handle.net/11693/14880
dc.language.iso: English [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: New Event Detection [en_US]
dc.subject: Topic Detection and Tracking [en_US]
dc.subject: Turkish [en_US]
dc.subject.lcc: Z699 .K37 2009 [en_US]
dc.subject.lcsh: Information storage and retrieval systems. [en_US]
dc.subject.lcsh: Information retrieval. [en_US]
dc.subject.lcsh: Automatic tracking. [en_US]
dc.title: New event detection and tracking in Turkish [en_US]
dc.type: Thesis [en_US]
thesis.degree.discipline: Computer Engineering
thesis.degree.grantor: Bilkent University
thesis.degree.level: Master's
thesis.degree.name: MS (Master of Science)
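
Illustrative note: the abstract mentions word truncation stemming, stopword removal, and new event detection over a time-ordered news stream. The sketch below shows one simple way these pieces could fit together. It is not the thesis's implementation: the prefix length, the tiny stopword list, the cosine similarity measure, the threshold value, and all function names are assumptions chosen only for illustration.

```python
import math
from collections import Counter

# All constants below are illustrative assumptions, not values from the thesis.
STOPWORDS = {"ve", "bir", "bu", "da", "de", "ile", "için"}  # hypothetical mini list
PREFIX_LEN = 5     # assumed truncation length for prefix stemming
THRESHOLD = 0.2    # assumed similarity threshold for "new event"


def tokenize(text):
    """Lowercase, strip edge punctuation, drop stopwords, truncate to a prefix.

    Turkish-specific casing (dotted/dotless i) is ignored in this sketch.
    """
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [w[:PREFIX_LEN] for w in words if w and w not in STOPWORDS]


def vectorize(text):
    """Raw term-frequency vector over truncated terms."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect_new_events(stories, threshold=THRESHOLD):
    """Flag a story as a first story if it is dissimilar to all earlier ones.

    `stories` must be in temporal order; returns one boolean per story.
    """
    seen, flags = [], []
    for text in stories:
        vec = vectorize(text)
        best = max((cosine(vec, old) for old in seen), default=0.0)
        flags.append(best < threshold)
        seen.append(vec)
    return flags


stream = [
    "Ankara'da deprem oldu",
    "Ankara'daki deprem sonrası yardım",
    "Futbol maçı bu akşam oynanacak",
]
print(detect_new_events(stream))
```

In this toy stream the second story shares the truncated terms "ankar" and "depre" with the first, so it is treated as a follow-up rather than a new event. A realistic system would typically use term weighting and a tuned threshold.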

Files

Original bundle
Name: 0003828.pdf
Size: 1.37 MB
Format: Adobe Portable Document Format