Integrated segmentation and recognition of connected Ottoman script
Yalnız, İsmet Zeki
This thesis proposes a novel context-sensitive method for segmenting and recognizing connected letters in Ottoman script. The method first extracts a set of possible segments from a connected script and determines the candidate letters to which the extracted segments are most similar. Next, a function is defined for scoring each syntactically correct sequence of these candidate letters. To find the candidate letter sequence that maximizes the score function, a directed acyclic graph is constructed, and the letters are recognized by computing the longest path in this graph. Experiments on a collection of printed Ottoman documents show that the proposed method achieves very high precision and recall in character recognition. A further set of experiments demonstrates that the framework can be used as a building block for an information retrieval system for digital Ottoman archives.
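The longest-path step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes that candidate segments are given as spans between ordered cut positions, that each candidate carries a single similarity score, and that the score of a sequence is the sum of its letters' scores. The function name `best_letter_sequence`, the tuple layout, and the example letters and scores are all hypothetical, and the context-sensitive syntax checks mentioned in the abstract are left out.

```python
from collections import defaultdict


def best_letter_sequence(num_cuts, candidates):
    """Return (letters, score) for the highest-scoring path from cut 0 to cut num_cuts.

    candidates: iterable of (start, end, letter, score) tuples, where a candidate
    letter covers the script between cut positions start and end (start < end).
    Assumption: the sequence score is the sum of the individual letter scores.
    Returns None when no sequence of candidates covers the whole script.
    """
    # Group edges by their end node. Because every edge goes from a smaller
    # to a larger cut index, the indices 0..num_cuts are a topological order.
    incoming = defaultdict(list)
    for start, end, letter, score in candidates:
        if not 0 <= start < end <= num_cuts:
            raise ValueError(f"invalid span ({start}, {end})")
        incoming[end].append((start, letter, score))

    neg_inf = float("-inf")
    best = [neg_inf] * (num_cuts + 1)   # best[i]: top score of a path 0 -> i
    back = [None] * (num_cuts + 1)      # back[i]: (previous cut, letter) on that path
    best[0] = 0.0

    # Dynamic programming over the DAG in topological order.
    for node in range(1, num_cuts + 1):
        for start, letter, score in incoming[node]:
            if best[start] == neg_inf:
                continue  # start is not reachable from cut 0
            total = best[start] + score
            if total > best[node]:
                best[node] = total
                back[node] = (start, letter)

    if best[num_cuts] == neg_inf:
        return None

    # Walk the back-pointers from the last cut to recover the letters.
    letters = []
    node = num_cuts
    while node != 0:
        node, letter = back[node]
        letters.append(letter)
    letters.reverse()
    return letters, best[num_cuts]


# Hypothetical example: three cut intervals, four candidate letters.
example = [
    (0, 1, "a", 0.5),
    (1, 2, "b", 0.5),
    (0, 2, "c", 0.75),  # one wide segment competing with "a" + "b"
    (2, 3, "d", 0.5),
]
result = best_letter_sequence(3, example)
```

In this example the two narrow segments together outscore the single wide one, so the path through "a", "b", "d" is selected. Under the additive-score assumption, the longest path in a directed acyclic graph can be found in time linear in the number of candidate segments, which is why the graph formulation avoids enumerating every possible letter sequence.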
Keywords
Optical character recognition (OCR)
Segmentation and recognition of connected scripts
Information retrieval (IR)
TA1640 .Y34 2008
Optical character recognition devices.