Segmentation based Ottoman text and matching based Kufic image analysis

Adıgüzel, Hande

buir.advisor	Şahin, Pınar Duygulu
dc.contributor.author	Adıgüzel, Hande
dc.date.accessioned	2016-01-08T18:26:19Z
dc.date.available	2016-01-08T18:26:19Z
dc.date.issued	2013
dc.description	Cataloged from PDF version of article.	en_US
dc.description	Includes bibliographical references (leaves 80-88).	en_US
dc.description.abstract	Large archives of historical documents attract many researchers from all around the world. The increasing demand to access those archives makes automatic retrieval and recognition of historical documents crucial. Ottoman archives are one of the largest collections of historical documents. Although Ottoman is not a currently spoken language, many researchers from all around the world are interested in accessing the archived material. This thesis proposes two Ottoman document analysis studies. The first is a crucial pre-processing task for retrieval and recognition: the segmentation of documents. The second is a more specific retrieval and recognition problem that aims to match Islamic patterns in Kufic images. For the segmentation task, layout, line and word segmentation are studied. Layout segmentation is obtained via Log-Gabor filtering. Four different algorithms are proposed for line segmentation, and a simple morphological method is preferred for word segmentation. Datasets are constructed with documents in Ottoman and in other languages (English, Greek and Bangla) to test the script independence of the methods. Experiments show that our segmentation steps give satisfactory results. The second task aims to detect Islamic patterns in Kufic images. The sub-patterns are considered as basic units and matching is used for the analysis. Graphs are preferred to represent sub-patterns, and graph and sub-graph isomorphism are used to match them. Kufic images are analyzed in three different ways. Given a query pattern, all instances of the query can be found through retrieval. Going further, images in the entire dataset can be automatically labeled using known patterns. Finally, patterns that repeat inside an image can be automatically discovered. As there is no existing Kufic dataset, a new one is constructed by collecting images from the Internet, and promising results are obtained on this dataset.	en_US
dc.description.statementofresponsibility	Adıgüzel, Hande	en_US
dc.format.extent	xiv, 101, graphics, illustrations, facsims	en_US
dc.identifier.itemid	B139311
dc.identifier.uri	http://hdl.handle.net/11693/15892
dc.language.iso	English	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Historical Manuscripts	en_US
dc.subject	Ottoman Documents	en_US
dc.subject	Layout Segmentation	en_US
dc.subject	Line Segmentation	en_US
dc.subject	Word Segmentation	en_US
dc.subject	Islamic Pattern Matching	en_US
dc.subject.lcc	QA76.9.T48 A35 2013	en_US
dc.subject.lcsh	Text processing (Computer science)	en_US
dc.subject.lcsh	Information storage and retrieval systems.	en_US
dc.subject.lcsh	Archives--Data processing.	en_US
dc.subject.lcsh	Writing--Identification--Data processing.	en_US
dc.subject.lcsh	Computational linguistics.	en_US
dc.subject.lcsh	Natural language processing.	en_US
dc.title	Segmentation based Ottoman text and matching based Kufic image analysis	en_US
dc.type	Thesis	en_US
thesis.degree.discipline	Computer Engineering
thesis.degree.grantor	Bilkent University
thesis.degree.level	Master's
thesis.degree.name	MS (Master of Science)

Files

Name: 0006591.pdf
Size: 38.99 MB
Format: Adobe Portable Document Format

Collections

Graduate School of Engineering and Science