Wavelet domain textual coding of Ottoman script images
Gerek Omer, N.
Cetin, A.Enis, Tewfik Ahmed H.
Please cite this item using this persistent URLhttp://hdl.handle.net/11693/27725
Proceedings of SPIE - The International Society for Optical Engineering
- Conference Paper 
Image coding using wavelet transform, DCT, and similar transform techniques is well established. On the other hand, these coding methods neither take into account the special characteristics of the images in a database nor are they suitable for fast database search. In this paper, the digital archiving of Ottoman printings is considered. Ottoman documents are printed in Arabic letters. Witten et al. describes a scheme based on finding the characters in binary document images and encoding the positions of the repeated characters This method efficiently compresses document images and is suitable for database research, but it cannot be applied to Ottoman or Arabic documents as the concept of character is different in Ottoman or Arabic. Typically, one has to deal with compound structures consisting of a group of letters. Therefore, the matching criterion will be according to those compound structures. Furthermore, the text images are gray tone or color images for Ottoman scripts for the reasons that are described in the paper. In our method the compound structure matching is carried out in wavelet domain which reduces the search space and increases the compression ratio. In addition to the wavelet transformation which corresponds to the linear subband decomposition, we also used nonlinear subband decomposition. The filters in the nonlinear subband decomposition have the property of preserving edges in the low resolution subband image.