Integrated segmentation and recognition of connected Ottoman script

Date
2009-11
Source Title
Optical Engineering
Print ISSN
0091-3286
Publisher
SPIE - International Society for Optical Engineering
Volume
48
Issue
11
Pages
1 - 12
Language
English
Type
Article
Abstract

We propose a novel context-sensitive segmentation and recognition method for connected letters in Ottoman script. The method first extracts a set of segments from a connected script and determines the candidate letters to which the extracted segments are most similar. Next, a function is defined for scoring each syntactically correct sequence of these candidate letters. To find the candidate letter sequence that maximizes the score function, a directed acyclic graph is constructed, and the letters are recognized by computing the longest path in this graph. Experiments on a collection of printed Ottoman documents show that the proposed method achieves precision and recall above 90% for character recognition. In a further set of experiments, we also demonstrate that the framework can be used as a building block for an information retrieval system for digital Ottoman archives. © 2009 Society of Photo-Optical Instrumentation Engineers.
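
The longest-path step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each candidate letter covers a horizontal span of the connected component, that positions along the component are the graph nodes, and that the score of a sequence is simply the sum of per-candidate similarity scores (the paper's score function is context-sensitive and is not reproduced here). The name `best_letter_sequence`, the `Candidate` tuple layout, and the letter labels are all hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical candidate: (start, end, letter, score), covering positions [start, end).
Candidate = Tuple[int, int, str, float]


def best_letter_sequence(
    n: int, candidates: List[Candidate]
) -> Optional[Tuple[float, List[str]]]:
    """Return (score, letters) for the highest-scoring path from position 0 to n.

    Nodes are positions 0..n; every candidate is an edge start -> end weighted
    by its score. Because start < end for every edge, the graph is acyclic and
    increasing position is a valid topological order.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (n + 1)                       # best score reaching each node
    back: List[Optional[Candidate]] = [None] * (n + 1)  # edge used to reach each node
    best[0] = 0.0

    # Relax edges in order of their start node, so best[start] is final
    # before any edge leaving that node is considered.
    for cand in sorted(candidates, key=lambda c: c[0]):
        start, end, _letter, score = cand
        if not (0 <= start < end <= n):
            raise ValueError(f"invalid candidate span: {cand}")
        if best[start] == neg_inf:
            continue  # start node is not reachable from position 0
        if best[start] + score > best[end]:
            best[end] = best[start] + score
            back[end] = cand

    if best[n] == neg_inf:
        return None  # no sequence of candidates covers the whole component

    # Walk the back-pointers from n to 0 to recover the letter sequence.
    letters: List[str] = []
    pos = n
    while pos != 0:
        cand = back[pos]
        assert cand is not None
        letters.append(cand[2])
        pos = cand[0]
    letters.reverse()
    return best[n], letters


# Made-up candidates over a component of width 6 (labels are placeholders).
example = [
    (0, 2, "a", 0.75), (2, 6, "b", 0.5),
    (0, 3, "c", 0.5), (3, 6, "d", 1.0),
    (2, 4, "e", 0.5), (4, 6, "f", 0.5),
]
result = best_letter_sequence(6, example)
```

A purely additive score, as assumed here, tends to favor sequences with more segments. A context-sensitive score function such as the one the abstract mentions would weight edges differently, but the path search itself would keep the same shape.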

Keywords
Connected scripts, Historical document analysis, Information retrieval, Optical character recognition, Building blocks, Context-sensitive, Directed acyclic graphs, Integrated segmentation and recognition, Longest path, Precision and recall, Recognition methods, Score function, Experiments, Search engines