
      New event detection and topic tracking in Turkish

      Author(s)
      Can, F.
      Kocberber, S.
      Baglioglu, O.
      Kardas, S.
      Ocalan, H. C.
      Uyar, E.
      Date
      2010
      Source Title
      Journal of the American Society for Information Science and Technology
      Print ISSN
      2330-1635
      Publisher
      John Wiley & Sons, Inc.
      Volume
      61
      Issue
      4
      Pages
      802 - 819
      Language
      English
      Type
      Article
      Abstract
      Topic detection and tracking (TDT) applications aim to organize the temporally ordered stories of a news stream according to the events they describe. Two major problems in TDT are new event detection (NED) and topic tracking (TT). These problems focus, respectively, on finding the first stories of new events and on identifying all subsequent stories on a certain topic defined by a small number of sample stories. In this work, we introduce the first large-scale TDT test collection for Turkish, and investigate the NED and TT problems in this language. We present our test-collection-construction approach, which is inspired by the TDT research initiative. We show that in TDT for Turkish with some similarity measures, a simple word truncation stemming method can compete with a lemmatizer-based stemming approach. Our findings show that, contrary to our earlier observations on Turkish information retrieval, word stopping has an impact on effectiveness in NED. We demonstrate that the confidence scores of two different similarity measures can be combined in a straightforward manner for higher effectiveness. The influence of several similarity measures on effectiveness is also investigated. We show that it is possible to deploy TT applications in Turkish that can be used in operational settings. © 2010 ASIS&T.
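The new event detection task described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine measure, the prefix length `PREFIX_LEN`, the decision value `THRESHOLD`, the function names, and the sample stories are all assumptions chosen for illustration. Each incoming story is reduced to truncated word stems, compared with every earlier story, and flagged as a first story when its best similarity stays below the threshold.

```python
import math
from collections import Counter

PREFIX_LEN = 5   # assumed truncation length, not taken from the paper
THRESHOLD = 0.2  # assumed novelty threshold, chosen for illustration


def truncate_stem(word, n=PREFIX_LEN):
    """Word truncation stemming: keep only the first n characters."""
    return word[:n]


def vectorize(text):
    """Turn a story into a bag of truncated, lowercased tokens."""
    return Counter(truncate_stem(tok) for tok in text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_new_events(stories, threshold=THRESHOLD):
    """Flag each story, in arrival order, as a first story or not.

    A story is flagged as new when its best similarity to every
    earlier story stays below the threshold.
    """
    seen = []
    flags = []
    for text in stories:
        vec = vectorize(text)
        best = max((cosine(vec, old) for old in seen), default=0.0)
        flags.append(best < threshold)
        seen.append(vec)
    return flags


# Hypothetical ASCII-only sample stories, in arrival order.
stories = [
    "deprem istanbul sabah saatlerinde hissedildi",
    "istanbul deprem sonrasi okullar tatil edildi",
    "merkez bankasi faiz kararini acikladi",
]
print(detect_new_events(stories))  # [True, False, True]
```

The second story shares two truncated stems with the first, so its similarity exceeds the assumed threshold and it is treated as a follow-up rather than a new event. Real Turkish text would need locale-aware lowercasing (dotted and dotless i), which this sketch sidesteps by using ASCII input.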
      Keywords
      Confidence score
      Event detection
      Number of samples
      Research initiatives
      Similarity measure
      Test Collection
      Topic detection and tracking
      Topic tracking
      Turkish
      Information services
      Permalink
      http://hdl.handle.net/11693/22372
      Published Version (Please cite this version)
      http://dx.doi.org/10.1002/asi.21264
      Collections
      • Department of Computer Engineering