    • CoDet: a new algorithm for containment and near duplicate detection in text corpora

      Varol, Emre (Bilkent University, 2012)
      In this thesis, we investigate containment detection, which is a generalized version of the well-known near-duplicate detection problem concerning whether a document is a subset of another document. In text-based ...
    • CoDet: Sentence-based containment detection in news corpora 

      Varol, E.; Can, F.; Aykanat, C.; Kaya, O. (2011)
      We study a generalized version of the near-duplicate detection problem which concerns whether a document is a subset of another document. In text-based applications, document containment can be observed in exact-duplicates, ...
    • Developing a text categorization template for Turkish news portals 

      Toraman, C.; Can, F.; Koçberber, S. (2011)
      In news portals, text category information is needed for news presentation. However, for many news stories the category information is unavailable, incorrectly assigned or too generic. This makes the text categorization a ...
    • New event detection and topic tracking in Turkish 

      Can, F.; Kocberber, S.; Baglioglu, O.; Kardas, S.; Ocalan, H. C.; Uyar, E. (John Wiley & Sons, Inc., 2010)
      Topic detection and tracking (TDT) applications aim to organize the temporally ordered stories of a news stream according to the events. Two major problems in TDT are new event detection (NED) and topic tracking (TT). These ...
    • Redif extraction in handwritten Ottoman literary texts 

      Can, E. F.; Duygulu, P.; Can, F.; Kalpakli, M. (2010)
      Repeated patterns, rhymes and redifs, are among the fundamental building blocks of Ottoman Divan poetry. They provide integrity of a poem by connecting its parts and bring a melody to its voice. In Ottoman literature, poets ...