Integrated segmentation and recognition of connected Ottoman script
Please cite this item using this persistent URL
http://hdl.handle.net/11693/14738
Advisor
Ulusoy, Özgür
Publisher
Bilkent University
Abstract
In this thesis, a novel context-sensitive segmentation and recognition method
for connected letters in Ottoman script is proposed. This method first extracts
a set of possible segments from a connected script and determines the candidate
letters to which the extracted segments are most similar. Next, a function is defined
for scoring each syntactically correct sequence of these candidate letters.
To find the candidate letter sequence that maximizes the score function, a directed
acyclic graph is constructed. The letters are finally recognized by computing the
longest path in this graph. Experiments on a collection of printed Ottoman
documents show that the proposed method achieves very high precision and
recall in character recognition. In a further set of experiments,
we also demonstrate that the framework can serve as a building block for an
information retrieval system for digital Ottoman archives.
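
The longest-path step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes that nodes are cut positions 0..n along the connected script, that each candidate segment is an edge (start, end, letter, score), and that the score of a sequence is the plain sum of its segment scores. The name `best_letter_sequence`, the tuple layout, and the additive score are all hypothetical, and the syntactic-correctness check on letter sequences is omitted.

```python
from typing import List, Optional, Tuple

# Hypothetical candidate segment: it spans cut positions start..end and
# carries the best-matching letter with a similarity score.
Candidate = Tuple[int, int, str, float]


def best_letter_sequence(
    n: int, candidates: List[Candidate]
) -> Optional[Tuple[float, List[str]]]:
    """Return the maximum-score path from node 0 to node n, or None.

    Assumption: the sequence score is the sum of segment scores. The
    score function in the thesis may differ.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (n + 1)  # best[i]: top score of any path 0 -> i
    back: List[Optional[Candidate]] = [None] * (n + 1)
    best[0] = 0.0

    # Every edge goes from a smaller to a larger position, so processing
    # edges by end position guarantees best[start] is already final.
    for cand in sorted(candidates, key=lambda c: c[1]):
        start, end, _letter, score = cand
        if not 0 <= start < end <= n:
            raise ValueError(f"invalid segment span: {cand!r}")
        if best[start] == neg_inf:
            continue  # start position is not reachable from node 0
        total = best[start] + score
        if total > best[end]:
            best[end] = total
            back[end] = cand

    if best[n] == neg_inf:
        return None  # no chain of segments covers the whole script

    # Walk the back-pointers from n to 0 to recover the letters.
    letters: List[str] = []
    pos = n
    while pos > 0:
        cand = back[pos]
        assert cand is not None
        letters.append(cand[2])
        pos = cand[0]
    letters.reverse()
    return best[n], letters


# Toy input with placeholder letters and made-up scores.
toy = [
    (0, 1, "a", 2.0),
    (1, 2, "b", 1.0),
    (2, 3, "c", 3.0),
    (0, 2, "d", 4.0),
    (1, 3, "e", 2.5),
]
result = best_letter_sequence(3, toy)
print(result)  # (7.0, ['d', 'c'])
```

Because segment edges always point forward along the script, position order is already a topological order, so a single pass over the edges suffices and no general graph search is needed.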