Segmentation based Ottoman text and matching based Kufic image analysis

Adıgüzel, Hande

Files

0006591.pdf (38.99 MB)

Date

2013

Authors

Adıgüzel, Hande

Advisor

Şahin, Pınar Duygulu

Abstract

Large archives of historical documents attract researchers from all around the world. The increasing demand for access to these archives makes automatic retrieval and recognition of historical documents crucial. The Ottoman archives are one of the largest collections of historical documents. Although Ottoman is no longer a spoken language, many researchers are interested in accessing the archived material. This thesis presents two studies on Ottoman document analysis. The first is document segmentation, a crucial pre-processing task for retrieval and recognition. The second is a more specific retrieval and recognition problem: matching Islamic patterns in Kufic images. For the segmentation task, layout, line and word segmentation are studied. Layout segmentation is obtained via Log-Gabor filtering. Four different algorithms are proposed for line segmentation, and a simple morphological method is preferred for word segmentation. Datasets are constructed from documents in Ottoman and in other languages (English, Greek and Bangla) to test the script independence of the methods. Experiments show that the segmentation steps give satisfactory results. The second task aims to detect Islamic patterns in Kufic images. Sub-patterns are treated as the basic units, and matching is used for the analysis. Sub-patterns are represented as graphs, and graph and sub-graph isomorphism are used to match them. Kufic images are analyzed in three different ways. Given a query pattern, all instances of the query can be found through retrieval. Going further, images in the entire dataset can be automatically labeled using known patterns. Finally, patterns that repeat inside an image can be automatically discovered. As there is no existing Kufic dataset, a new one is constructed by collecting images from the Internet, and promising results are obtained on this dataset.
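The abstract does not describe the morphological word segmentation method in detail, so the following is only a minimal sketch of one common approach, not the thesis's implementation. It assumes a binarized text-line image given as rows of 0/1 values, dilates the column occupancy with a flat structuring element so that small gaps between characters close, and reports the remaining runs as word candidates. The names `dilate_1d`, `segment_words` and the `radius` parameter are hypothetical and chosen for illustration.

```python
def dilate_1d(mask, radius):
    """Binary dilation of a 1-D mask with a flat element of width 2*radius+1."""
    n = len(mask)
    return [any(mask[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def segment_words(line_image, radius=1):
    """Return (start, end) column spans (end exclusive) of word candidates.

    line_image is a binarized text line: a list of rows, each a list of 0/1,
    where 1 marks ink. Gaps of at most 2*radius columns are merged.
    """
    if not line_image:
        return []
    # Column occupancy: True where any row has ink in that column.
    columns = [any(col) for col in zip(*line_image)]
    merged = dilate_1d(columns, radius)

    # Collect runs of consecutive True values in the dilated mask.
    spans, start = [], None
    for x, on in enumerate(merged):
        if on and start is None:
            start = x
        elif not on and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, len(merged)))

    # Shrink each run back to the ink it actually contains.
    tight = []
    for s, e in spans:
        inked = [x for x in range(s, e) if columns[x]]
        tight.append((inked[0], inked[-1] + 1))
    return tight
```

In this sketch `radius` controls which gaps count as inter-character rather than inter-word. A suitable value depends on the script and the scan resolution and would have to be tuned per dataset.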

Keywords

Historical Manuscripts, Ottoman Documents, Layout Segmentation, Line Segmentation, Word Segmentation, Islamic Pattern Matching

Degree Discipline

Computer Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)

Permalink

http://hdl.handle.net/11693/15892

Collections

Graduate School of Engineering and Science

Language

English

Type

Thesis
