New event detection and topic tracking in Turkish

dc.citation.epage: 819 [en_US]
dc.citation.issueNumber: 4 [en_US]
dc.citation.spage: 802 [en_US]
dc.citation.volumeNumber: 61 [en_US]
dc.contributor.author: Can, F. [en_US]
dc.contributor.author: Kocberber, S. [en_US]
dc.contributor.author: Baglioglu, O. [en_US]
dc.contributor.author: Kardas, S. [en_US]
dc.contributor.author: Ocalan, H. C. [en_US]
dc.contributor.author: Uyar, E. [en_US]
dc.date.accessioned: 2016-02-08T09:59:15Z
dc.date.available: 2016-02-08T09:59:15Z
dc.date.issued: 2010 [en_US]
dc.department: Department of Computer Engineering [en_US]
dc.description.abstract: Topic detection and tracking (TDT) applications aim to organize the temporally ordered stories of a news stream according to the events. Two major problems in TDT are new event detection (NED) and topic tracking (TT). These problems focus on finding the first stories of new events and identifying all subsequent stories on a certain topic defined by a small number of sample stories. In this work, we introduce the first large-scale TDT test collection for Turkish and investigate the NED and TT problems in this language. We present our test-collection-construction approach, which is inspired by the TDT research initiative. We show that in TDT for Turkish, with some similarity measures, a simple word truncation stemming method can compete with a lemmatizer-based stemming approach. Our findings show that, contrary to our earlier observations on Turkish information retrieval, in NED, word stopping has an impact on effectiveness. We demonstrate that the confidence scores of two different similarity measures can be combined in a straightforward manner for higher effectiveness. The influence of several similarity measures on effectiveness is also investigated. We show that it is possible to deploy TT applications in Turkish that can be used in operational settings. © 2010 ASIS&T. [en_US]
dc.description.provenance: Made available in DSpace on 2016-02-08T09:59:15Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2010 [en]
dc.identifier.doi: 10.1002/asi.21264 [en_US]
dc.identifier.issn: 2330-1635
dc.identifier.uri: http://hdl.handle.net/11693/22372
dc.language.iso: English [en_US]
dc.publisher: John Wiley & Sons, Inc. [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1002/asi.21264 [en_US]
dc.source.title: Journal of the American Society for Information Science and Technology [en_US]
dc.subject: Confidence score [en_US]
dc.subject: Event detection [en_US]
dc.subject: Number of samples [en_US]
dc.subject: Research initiatives [en_US]
dc.subject: Similarity measure [en_US]
dc.subject: Test Collection [en_US]
dc.subject: Topic detection and tracking [en_US]
dc.subject: Topic tracking [en_US]
dc.subject: Turkish [en_US]
dc.subject: Information services [en_US]
dc.title: New event detection and topic tracking in Turkish [en_US]
dc.type: Article [en_US]

Files

Original bundle
Name: New event detection and topic tracking in turkish.pdf
Size: 243.42 KB
Format: Adobe Portable Document Format
Description: Full printable version