Browsing by Author "Ataer, Esra"


Open Access
Bilkent university at TRECVID 2006
(National Institute of Standards and Technology, 2006-11) Aksoy, Selim; Duygulu, Pınar; Akçay, Hüseyin Gökhan; Ataer, Esra; Baştan, Muhammet; Can, Tolga; Çavuş, Özge; Doğrusöz, Emel; Gökalp, Demir; Akaydın, Ateş; Akoğlu, Leman; Angın, Pelin; Cinbiş, R. Gökberk; Gür, Tunay; Ünlü, Mehmet
We describe our third participation in the TRECVID video retrieval evaluation, which includes one high-level feature extraction run, two manual search runs, and one interactive search run. All of these runs used a system trained on the common development collection. Only visual and textual information was used: the visual information consisted of color, texture, and edge-based low-level features, and the textual information consisted of the speech transcript provided in the collection.
Open Access
Matching ottoman words: an image retrieval approach to historical document indexing
(ACM, 2007-07) Ataer, Esra; Duygulu, Pınar
Large archives of Ottoman documents are of interest to many historians around the world, but they remain largely inaccessible because manually transcribing such a huge volume is difficult. Automatic transcription is required, but due to the characteristics of Ottoman documents, character-recognition-based systems may not yield satisfactory results. It is also desirable to store the documents in image form, since they may contain important drawings, especially signatures. For these reasons, in this study we treat the problem as an image retrieval problem, viewing Ottoman words as images, and we propose a solution based on image matching techniques. The bag-of-visterms approach, which has been shown to be successful in classifying objects and scenes, is adapted for matching word images. Each word image is represented by a set of visual terms obtained by vector quantization of SIFT descriptors extracted from salient points. Similar words are then matched based on the similarity of the distributions of their visual terms. The experiments are carried out on printed and handwritten documents containing over 10,000 words. The results show that the proposed system is able to retrieve words with high accuracy and to capture the semantic similarities between words. Copyright 2007 ACM.
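The pipeline described in this abstract (quantize local descriptors into visual terms, then compare the term distributions of word images) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes the SIFT descriptors have already been extracted elsewhere and are passed in as NumPy arrays of shape (n, 128), it uses a plain k-means for the vector quantization step, and it ranks words by L1 distance between histograms. The function names, the choice of k-means, and the L1 distance are all assumptions made for this example.

```python
import numpy as np

def assign(descriptors, centers):
    """Index of the nearest codebook entry (visual term) for each descriptor."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(descriptors, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed quantizer); returns a (k, d) codebook."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[start].astype(float)
    for _ in range(iters):
        labels = assign(descriptors, centers)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def visterm_histogram(descriptors, centers):
    """Normalized distribution of visual terms for one word image."""
    labels = assign(descriptors, centers)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()

def rank_words(query_hist, word_hists):
    """Indices of word images, most similar first (L1 distance, assumed)."""
    dists = [np.abs(query_hist - h).sum() for h in word_hists]
    return np.argsort(dists)
```

In use, a codebook would be built once from descriptors pooled over many word images, each word image would then be turned into a histogram with `visterm_histogram`, and a query word would be compared against the stored histograms with `rank_words`.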
Open Access
A new representation for matching words
(2007) Ataer, Esra
Large archives of historical documents are challenging for many researchers around the world, and they remain largely inaccessible because manually indexing and transcribing such a huge volume is difficult. In addition, electronic imaging tools and image processing techniques are gaining importance with the rapid increase in the digitization of materials in libraries and archives. In this thesis, a language-independent method is proposed for representing word images, which enables retrieval and indexing of documents. While character recognition methods suffer from preprocessing and overtraining, we use a different method, based on extracting words from documents and representing each word image with the features of invariant regions. The bag-of-words approach, which has been shown to be successful in classifying objects and scenes, is adapted for matching words. Since curvature points, connection points, and dots are important visual features for distinguishing two words from each other, we make use of salient points, which have been shown to be successful in representing such distinctive areas and are heavily used for matching. The Difference of Gaussian (DoG) detector, which finds scale-invariant regions, and the Harris Affine detector, which detects affine-invariant regions, are used to detect such areas, and the detected keypoints are described with Scale Invariant Feature Transform (SIFT) features. Each word image is then represented by a set of visual terms obtained by vector quantization of SIFT descriptors, and similar words are matched based on the similarity of these representations using different distance measures. These representations are used both for document retrieval and for word spotting. The experiments are carried out on Arabic, Latin, and Ottoman datasets, which include different writing styles and different writers. The results show that the proposed method is successful at retrieval and indexing of documents even with different scripts and different writers, and since it is language independent, it can easily be adapted to other languages as well. The retrieval performance of the system is comparable to that of state-of-the-art methods in this field. In addition, the system is successful at capturing semantic similarities, which is useful for indexing, and it does not include any supervised step.
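The abstract mentions matching representations "by using different distance measures" without naming them. As an illustration only, the sketch below shows a few measures commonly applied to normalized histograms; which measures the thesis actually evaluates is not stated here, so the selection and the function names are assumptions.

```python
import numpy as np

def l1_distance(p, q):
    """Sum of absolute bin differences."""
    return float(np.abs(p - q).sum())

def euclidean_distance(p, q):
    """Straight-line distance between the two histograms."""
    return float(np.sqrt(((p - q) ** 2).sum()))

def chi_square_distance(p, q, eps=1e-12):
    """Chi-square distance; eps avoids division by zero on empty bins."""
    return float(0.5 * (((p - q) ** 2) / (p + q + eps)).sum())

def intersection_distance(p, q):
    """One minus histogram intersection, for histograms that sum to 1."""
    return float(1.0 - np.minimum(p, q).sum())

# Two toy visual-term histograms that share half of their mass.
p = np.array([0.5, 0.5, 0.0])
q = np.array([0.5, 0.0, 0.5])
for measure in (l1_distance, euclidean_distance,
                chi_square_distance, intersection_distance):
    print(measure.__name__, round(measure(p, q), 4))
```

The measures differ mainly in how they weight sparsely populated bins, which is one reason a retrieval system may compare several of them on the same representation.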
Open Access
Osmanlıca kelimeleri eşleme [Matching Ottoman words]
(IEEE, 2007-06) Ataer, Esra; Duygulu, Pınar
Large archives of Ottoman documents are of interest to many researchers around the world, but they remain largely inaccessible because manually transcribing such a huge volume is difficult. Automatic transcription is required, but due to the characteristics of Ottoman writing, character-recognition-based systems may not yield satisfactory results. It is also desirable to store the documents in image form, since they may contain important parts such as miniatures and tughras (signatures). For these reasons, in this study we treat the problem as an image retrieval problem, viewing Ottoman words as images, and we propose a solution based on image matching techniques. The bag-of-visterms approach, which has been shown to be successful in classifying objects and scenes, is adapted for matching word images. Each word image is represented by a set of visual terms obtained by vector quantization of SIFT descriptors extracted from salient points. Similar words are then matched based on the similarity of the distributions of their visual terms. The experiments are carried out on printed and handwritten documents containing over 10,000 words. The results show that the proposed system is able to retrieve words with high accuracy and to capture the semantic similarities between words.
Open Access
PATIKAweb: a Web service for querying, visualizing and analyzing a graph-based pathway database
(ISCB Org, 2005-06) Aksay, Çağrı; Arık, Fatma; Ataer, Esra; Ayaz, Aslı; Babur, Özgün; Belviranlı, Mehmet E.; Çetintaş, Ahmet; Çolak, Recep; Çözen, G.; Demir, Emek; Dilek, Alptuğ; Doğrusöz, Uğur; Giral, Erhan; Kaya, Engin; Küçük, Evren; Tekin, A. S.; Yıldırım, Hilmi; Erson, Zeynep
PATIKAweb provides a Web service for retrieving and analyzing biological pathways in the PATIKA database, which currently contains data integrated from popular public pathway databases such as Reactome. It features a user-friendly interface, dynamic visualization, advanced graph-theoretic queries for extracting biologically important phenomena, and facilities for exporting to various exchange formats.
Open Access
Retrieval of Ottoman documents
(ACM, 2006-10) Ataer, Esra; Duygulu, Pınar
There is a growing need to access historical Ottoman documents stored in large archives, and tools for automatic searching, indexing, and transcription of these documents are therefore required. In this paper, we present a method for the retrieval of Ottoman documents based on word matching. The method first segments the documents into word images and then uses a hierarchical matching technique to find similar instances of the word images. The experiments show that promising results can be achieved even with simple features. Copyright 2006 ACM.
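The abstract does not say which features or stages the hierarchical matching uses. Purely as an illustration of the coarse-to-fine idea, the sketch below first discards candidates with a cheap test and only then applies a finer comparison. The two stages (aspect ratio, then column-wise ink profile), the thresholds, and all function names are hypothetical choices for this example, not the method from the paper. Word images are assumed to be binary NumPy arrays with ink pixels set to 1.

```python
import numpy as np

def aspect_ratio(img):
    """Width divided by height of a word image."""
    height, width = img.shape
    return width / height

def vertical_profile(img, bins=16):
    """Ink count per column, resampled to a fixed length and normalized."""
    cols = img.sum(axis=0).astype(float)
    xs = np.linspace(0, len(cols) - 1, bins)
    prof = np.interp(xs, np.arange(len(cols)), cols)
    total = prof.sum()
    return prof / total if total > 0 else prof

def hierarchical_match(query, candidates, ratio_tol=0.25, profile_tol=0.5):
    """Return (index, distance) pairs that pass both stages, best first."""
    q_ratio = aspect_ratio(query)
    q_prof = vertical_profile(query)
    matches = []
    for idx, cand in enumerate(candidates):
        # Stage 1 (hypothetical): cheap shape test on the aspect ratio.
        if abs(aspect_ratio(cand) - q_ratio) > ratio_tol * q_ratio:
            continue
        # Stage 2 (hypothetical): finer test on the ink profile (L1 distance).
        dist = float(np.abs(vertical_profile(cand) - q_prof).sum())
        if dist <= profile_tol:
            matches.append((idx, dist))
    return sorted(matches, key=lambda m: m[1])
```

The point of the ordering is that the inexpensive first stage prunes most candidates, so the costlier comparison runs only on a short list.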