Browsing by Keywords "Test Collection"
CoDet: a new algorithm for containment and near duplicate detection in text corpora
(Bilkent University, 2012) In this thesis, we investigate containment detection, which is a generalized version of the well-known near-duplicate detection problem concerning whether a document is a subset of another document. In text-based ...
CoDet: Sentence-based containment detection in news corpora
(ACM, 2011) We study a generalized version of the near-duplicate detection problem which concerns whether a document is a subset of another document. In text-based applications, document containment can be observed in exact-duplicates, ...
Developing a text categorization template for Turkish news portals
(IEEE, 2011) In news portals, text category information is needed for news presentation. However, for many news stories the category information is unavailable, incorrectly assigned or too generic. This makes the text categorization a ...
New event detection and topic tracking in Turkish
(John Wiley & Sons, Inc., 2010) Topic detection and tracking (TDT) applications aim to organize the temporally ordered stories of a news stream according to the events. Two major problems in TDT are new event detection (NED) and topic tracking (TT). These ...
Redif extraction in handwritten Ottoman literary texts
(IEEE, 2010) Repeated patterns, rhymes and redifs, are among the fundamental building blocks of Ottoman Divan poetry. They provide integrity of a poem by connecting its parts and bring a melody to its voice. In Ottoman literature, poets ...