    • CoDet: a new algorithm for containment and near duplicate detection in text corpora

      Varol, Emre (Bilkent University, 2012)
      In this thesis, we investigate containment detection, which is a generalized version of the well-known near-duplicate detection problem concerning whether a document is a subset of another document. In text-based ...
    • CoDet: Sentence-based containment detection in news corpora 

      Varol, Emre; Can, Fazlı; Aykanat, Cevdet; Kaya, Oğuz (ACM, 2011)
      We study a generalized version of the near-duplicate detection problem which concerns whether a document is a subset of another document. In text-based applications, document containment can be observed in exact-duplicates, ...
    • Developing a text categorization template for Turkish news portals 

      Toraman, Çağrı; Can, Fazlı; Koçberber, Seyit (IEEE, 2011)
      In news portals, text category information is needed for news presentation. However, for many news stories the category information is unavailable, incorrectly assigned or too generic. This makes the text categorization a ...
    • New event detection and topic tracking in Turkish 

      Can, F.; Kocberber, S.; Baglioglu, O.; Kardas, S.; Ocalan, H. C.; Uyar, E. (John Wiley & Sons, Inc., 2010)
      Topic detection and tracking (TDT) applications aim to organize the temporally ordered stories of a news stream according to the events. Two major problems in TDT are new event detection (NED) and topic tracking (TT). These ...
    • Redif extraction in handwritten Ottoman literary texts 

      Can, Ethem F.; Duygulu, Pınar; Can, Fazlı; Kalpaklı, Mehmet (IEEE, 2010)
      Repeated patterns, rhymes and redifs, are among the fundamental building blocks of Ottoman Divan poetry. They provide integrity of a poem by connecting its parts and bring a melody to its voice. In Ottoman literature, poets ...