BUIR Repository :: Browsing by Subject "Image analysis"

Open Access
Alignment of uncalibrated images for multi-view classification
(IEEE, 2011) Arık, Sercan Ömer; Vural, E.; Frossard, P.
Efficient solutions for the classification of multi-view images can be built on graph-based algorithms when little information is known about the scene or cameras. Such methods typically require a pair-wise similarity measure between images, where a common choice is the Euclidean distance. However, the accuracy of the Euclidean distance as a similarity measure is restricted to cases where images are captured from nearby viewpoints. In settings with large transformations and viewpoint changes, alignment of images is necessary prior to distance computation. We propose a method for the registration of uncalibrated images that capture the same 3D scene or object. We model the depth map of the scene as an algebraic surface, which yields a warp model in the form of a rational function between image pairs. The warp model is computed by minimizing the registration error, where the registered image is a weighted combination of two images generated with two different warp functions estimated from feature matches and image intensity functions in order to provide robust registration. We demonstrate the flexibility of our alignment method by experimentation on several wide-baseline image pairs with arbitrary scene geometries and texture levels. Moreover, the results on multi-view image classification suggest that the proposed alignment method can be effectively used in graph-based classification algorithms for the computation of pairwise distances where it achieves significant improvements over distance computation without prior alignment. © 2011 IEEE.
Open Access
Approximate Fourier domain expression for Bloch-Siegert shift
(John Wiley and Sons Inc., 2015) Turk, E. A.; Ider, Y. Z.; Ergun, A. S.; Atalar, Ergin
Purpose: In this study, a new simple Fourier domain-based analytical expression for the Bloch-Siegert (BS) shift-based B1 mapping method is proposed to obtain |B1+| more accurately while using short BS pulse durations and small off-resonance frequencies. Theory and Methods: A new simple analytical expression for the BS shift is derived by simplifying the Bloch equations. In this expression, the phase is calculated in terms of the Fourier transform of the radiofrequency pulse envelope, thus making the off- and on-resonance effects more easily understandable. To verify the accuracy of the proposed expression, Bloch simulations and MR experiments are performed for the hard, Fermi, and Shinnar-Le Roux pulse shapes. Results: Analyses of the BS phase shift-based B1 mapping method in terms of radiofrequency pulse shape, pulse duration, and off-resonance frequency show that |B1+| can be obtained more accurately with the aid of this new expression. Conclusions: In this study, a new simple frequency domain analytical expression is proposed for the BS shift. Using this expression, |B1+| values can be predicted from the phase data using the frequency spectrum of the radiofrequency pulse. This method works well even for short pulse durations and small offset frequencies.
Open Access
Automatic detection of compound structures by joint selection of region groups from a hierarchical segmentation
(Institute of Electrical and Electronics Engineers, 2016) Akçay, H. G.; Aksoy, S.
A challenging problem in remote sensing image analysis is the detection of heterogeneous compound structures such as different types of residential, industrial, and agricultural areas that are composed of spatial arrangements of simple primitive objects such as buildings and trees. We describe a generic method for the modeling and detection of compound structures that involve arrangements of an unknown number of primitives in large scenes. The modeling process starts with a single example structure, considers the primitive objects as random variables, builds a contextual model of their arrangements using a Markov random field, and learns the parameters of this model via sampling from the corresponding maximum entropy distribution. The detection task is formulated as the selection of multiple subsets of candidate regions from a hierarchical segmentation where each set of selected regions constitutes an instance of the example compound structure. The combinatorial selection problem is solved by the joint sampling of groups of regions by maximizing the likelihood of their individual appearances and relative spatial arrangements. Experiments using very high spatial resolution images show that the proposed method can effectively localize an unknown number of instances of different compound structures that cannot be detected by using spectral and shape features alone.
Open Access
Automatic detection of geospatial objects using multiple hierarchical segmentations
(Institute of Electrical and Electronics Engineers, 2008-07) Akçay, H. G.; Aksoy, S.
The object-based analysis of remotely sensed imagery provides valuable spatial and structural information that is complementary to pixel-based spectral information in classification. In this paper, we present novel methods for automatic object detection in high-resolution images by combining spectral information with structural information exploited by using image segmentation. The proposed segmentation algorithm uses morphological operations applied to individual spectral bands using structuring elements in increasing sizes. These operations produce a set of connected components forming a hierarchy of segments for each band. A generic algorithm is designed to select meaningful segments that maximize a measure consisting of spectral homogeneity and neighborhood connectivity. Given the observation that different structures appear more clearly at different scales in different spectral bands, we describe a new algorithm for unsupervised grouping of candidate segments belonging to multiple hierarchical segmentations to find coherent sets of segments that correspond to actual objects. The segments are modeled by using their spectral and textural content, and the grouping problem is solved by using the probabilistic latent semantic analysis algorithm that builds object models by learning the object-conditional probability distributions. The automatic labeling of a segment is done by computing the similarity of its feature distribution to the distribution of the learned object models using the Kullback-Leibler divergence. The performances of the unsupervised segmentation and object detection algorithms are evaluated qualitatively and quantitatively using three different data sets with comparative experiments, and the results show that the proposed methods are able to automatically detect, group, and label segments belonging to the same object classes. © 2008 IEEE.
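The labeling step described in this abstract, comparing a segment's feature distribution with each learned object model by Kullback-Leibler divergence, can be illustrated with a minimal sketch. The histograms, the model names, the direction of the divergence (segment relative to model), and the helper names `kl_divergence` and `label_segment` are all assumptions made for illustration; they are not taken from the paper.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two discrete distributions.

    A small eps guards against division by zero and log(0) (an assumption of this sketch).
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def label_segment(segment_dist, object_models):
    """Return the name of the object model with the smallest divergence from the segment."""
    return min(object_models, key=lambda name: kl_divergence(segment_dist, object_models[name]))

# Hypothetical 3-bin feature distributions for two learned object models.
models = {"building": [0.7, 0.2, 0.1], "tree": [0.1, 0.3, 0.6]}
label = label_segment([0.6, 0.3, 0.1], models)
print(label)
```

The segment is assigned to whichever model it diverges from least; a real system would use the spectral and textural feature distributions learned by the grouping step rather than hand-written histograms.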
Open Access
Automatic multimedia cross-modal correlation discovery
(ACM, 2004-08) Pan, J.-Y.; Yang, H.-J.; Faloutsos, C.; Duygulu, Pınar
Given an image (or video clip, or audio song), how do we automatically assign keywords to it? The general problem is to find correlations across the media in a collection of multimedia objects like video clips, with colors, and/or motion, and/or audio, and/or text scripts. We propose a novel, graph-based approach, "MMG", to discover such cross-modal correlations. Our "MMG" method requires no tuning, no clustering, no user-determined constants; it can be applied to any multi-media collection, as long as we have a similarity function for each medium; and it scales linearly with the database size. We report auto-captioning experiments on the "standard" Corel image database of 680 MB, where it outperforms domain specific, fine-tuned methods by up to 10 percentage points in captioning accuracy (50% relative improvement).
Open Access
Comparison and combination of two novel commercial detection methods
(IEEE, 2004-06) Duygulu, Pınar; Chen, M.-Y.; Hauptmann, A.
Detection and removal of commercials plays an important role when searching for important broadcast news video material. In this study, two novel approaches are proposed based on two distinctive characteristics of commercials, namely, repetitive use of commercials over time and distinctive color and audio features. Furthermore, proposed strategies for combining the results of the two methods yield even better performance. Experiments show over 90% recall and precision on a test set of 5 hours of ABC and CNN broadcast news data.
Open Access
Computationally efficient wavelet affine invariant functions for shape recognition
(IEEE, 2004) Bala, E.; Çetin, A. Enis
An affine invariant function for object recognition is constructed from wavelet coefficients of the object boundary. In previous works, undecimated dyadic wavelet transform was used to construct affine invariant functions. In this paper, an algorithm based on decimated wavelet transform is developed to compute an affine invariant function. As a result computational complexity is reduced without decreasing recognition performance. Experimental results are presented. © 2004 IEEE.
Open Access
Computer vision based analysis of potato chips-A tool for rapid detection of acrylamide level
(Wiley - VCH Verlag GmbH & Co. KGaA, 2006) Gökmen, V.; Senyuva, H. Z.; Dülek, B.; Çetin, E.
In this study, analysis of digital color images of fried potato chips was combined with parallel LC-MS based analysis of acrylamide in order to develop a rapid tool for the estimation of acrylamide during processing. Pixels of the fried potato image were classified into three sets based on their Euclidean distances to the representative mean values of typical bright yellow, yellowish brown, and dark brown regions using a semiautomatic segmentation algorithm. The feature parameter extracted from the segmented image was the NA2 value, defined as the number of pixels in Set-2 divided by the total number of pixels of the entire fried potato image. Using training images of potato chips, it was shown that there was a strong linear correlation (r = 0.989) between acrylamide level and NA2 value. Images of a number of test samples were analyzed to predict their acrylamide level by means of this correlation data. The results confirmed that the computer vision system described here provided an explicit and meaningful description for the inspection and evaluation of potato chips. Assuming a provisional threshold limit of 1000 ng/g for acrylamide, test samples could be successfully inspected with only one failure out of 60 potato chips.
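The NA2 feature defined in this abstract can be sketched in a few lines: each pixel is assigned to the nearest of three representative mean colors, and NA2 is the share of pixels that fall in Set-2. The RGB values below are invented for illustration, and treating Set-2 as the yellowish brown class (the second one listed) is an assumption; the semiautomatic selection of the representative means is not modeled.

```python
def nearest_set(pixel, means):
    """Index of the representative mean colour closest to `pixel`.

    Squared Euclidean distance is used; it gives the same ordering as the distance itself.
    """
    dists = [sum((c - m) ** 2 for c, m in zip(pixel, mean)) for mean in means]
    return dists.index(min(dists))

def na2(pixels, means):
    """Fraction of pixels assigned to Set-2 (index 1 in `means`)."""
    labels = [nearest_set(p, means) for p in pixels]
    return labels.count(1) / len(labels)

# Hypothetical RGB means: bright yellow, yellowish brown, dark brown.
means = [(240, 220, 90), (180, 140, 60), (90, 60, 30)]
# Hypothetical pixels of one chip image.
pixels = [(235, 215, 100), (175, 150, 70), (185, 135, 55), (95, 65, 35)]
value = na2(pixels, means)
print(value)
```

In the study, NA2 values from training images are then related to measured acrylamide levels through a linear fit; the fit coefficients are not given in the abstract, so none are assumed here.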
Open Access
Current constrained voltage scaled reconstruction (CCVSR) algorithm for MR-EIT and its performance with different probing current patterns
(Institute of Physics Publishing, 2003) Birgül, Ö.; Eyüboğlu, B. M.; İder, Y. Z.
Conventional injected-current electrical impedance tomography (EIT) and magnetic resonance imaging (MRI) techniques can be combined to reconstruct high resolution true conductivity images. The magnetic flux density distribution generated by the internal current density distribution is extracted from MR phase images. This information is used to form a fine detailed conductivity image using an Ohm's law based update equation. The reconstructed conductivity image is assumed to differ from the true image by a scale factor. EIT surface potential measurements are then used to scale the reconstructed image in order to find the true conductivity values. This process is iterated until a stopping criterion is met. Several simulations are carried out for opposite and cosine current injection patterns to select the best current injection pattern for a 2D thorax model. The contrast resolution and accuracy of the proposed algorithm are also studied. In all simulation studies, realistic noise models for voltage and magnetic flux density measurements are used. It is shown that, in contrast to the conventional EIT techniques, the proposed method has the capability of reconstructing conductivity images with uniform and high spatial resolution. The spatial resolution is limited by the larger element size of the finite element mesh and twice the magnetic resonance image pixel size.
Open Access
Design of a novel MRI compatible manipulator for image guided prostate interventions
(IEEE, 2005-02) Krieger, A.; Susil, R. C.; Ménard, C.; Coleman, J. A.; Fichtinger, G.; Atalar, Ergin; Whitcomb, L. L.
This paper reports a novel remotely actuated manipulator for access to prostate tissue under magnetic resonance imaging guidance (APT-MRI) device, designed for use in a standard high-field MRI scanner. The device provides three-dimensional MRI guided needle placement with millimeter accuracy under physician control. Procedures enabled by this device include MRI guided needle biopsy, fiducial marker placements, and therapy delivery. Its compact size allows for use in both standard cylindrical and open configuration MRI scanners. Preliminary in vivo canine experiments and first clinical trials are reported.
Open Access
Detection of fungal damaged popcorn using image property covariance features
(Elsevier, 2012) Yorulmaz, O.; Pearson, T. C.; Çetin, A.
Covariance-matrix-based features were applied to the detection of popcorn infected by a fungus that causes a symptom called "blue-eye". This infection of popcorn kernels causes economic losses due to the kernels' poor appearance and the frequently disagreeable flavor of the popped kernels. Images of kernels were obtained to distinguish damaged from undamaged kernels using image-processing techniques. Features for distinguishing blue-eye-damaged from undamaged popcorn kernel images were extracted from covariance matrices computed using various image pixel properties. The covariance matrices were formed using different property vectors that consisted of the image coordinate values, their intensity values and the first and second derivatives of the vertical and horizontal directions of different color channels. Support Vector Machines (SVM) were used for classification purposes. An overall recognition rate of 96.5% was achieved using these covariance based features. A relatively low false positive rate of 2.4% was obtained, which is important for reducing the economic loss due to healthy kernels being discarded as fungal damaged. The image processing method is not computationally expensive, so it could be implemented in real-time sorting systems to separate damaged popcorn or other grains that have textural differences.
Open Access
Dynamic texture detection, segmentation and analysis
(ACM, 2007-07) Töreyin, Behçet Uğur; Dedeoğlu, Yiğithan; Çetin, A. Enis; Fazekas, S.; Chetverikov, D.; Amiaz, T.; Kiryati, N.
Dynamic textures are common in natural scenes. Examples of dynamic textures in video include fire, smoke, clouds, trees in the wind, sky, sea and ocean waves etc. In this showcase, (i) we develop real-time dynamic texture detection methods in video and (ii) present solutions to video object classification based on motion information. Copyright 2007 ACM.
Open Access
Estimation of depth fields suitable for video compression based on 3-D structure and motion of objects
(Institute of Electrical and Electronics Engineers, 1998-06) Alatan, A. A.; Onural, L.
Intensity prediction along motion trajectories removes temporal redundancy considerably in video compression algorithms. In three-dimensional (3-D) object-based video coding, both 3-D motion and depth values are required for temporal prediction. The required 3-D motion parameters for each object are found by the correspondence-based E-matrix method. The estimation of the correspondences - two-dimensional (2-D) motion field - between the frames and segmentation of the scene into objects are achieved simultaneously by minimizing a Gibbs energy. The depth field is estimated by jointly minimizing a defined distortion and bitrate criterion using the 3-D motion parameters. The resulting depth field is efficient in the rate-distortion sense. Bit-rate values corresponding to the lossless encoding of the resultant depth fields are obtained using predictive coding; prediction errors are encoded by a Lempel-Ziv algorithm. The results are satisfactory for real-life video scenes.
Open Access
A fast algorithm for subpixel accuracy image stabilization for digital film and video
(SPIE, 1998) Eroğlu, Çiğdem; Erdem, A. T.
This paper introduces a novel method for subpixel accuracy stabilization of unsteady digital films and video sequences. The proposed method offers a near-closed-form solution to the estimation of the global subpixel displacement between two frames, which causes their misregistration. The criterion function used is the mean-squared error over the displaced frames, in which image intensities at subpixel locations are evaluated using bilinear interpolation. The proposed algorithm is both faster and more accurate than the search-based solutions found in the literature. Experimental results also demonstrate the superiority of the proposed method over the spatio-temporal differentiation and surface fitting algorithms. Furthermore, the proposed algorithm is designed so that it is insensitive to frame-to-frame intensity variations. It is also possible to estimate any affine motion between two frames by applying the proposed algorithm on three non-collinear points in the unsteady frame.
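The criterion named in this abstract, mean-squared error between two frames with subpixel intensities obtained by bilinear interpolation, can be illustrated as follows. This sketch only evaluates the criterion for a given displacement; it does not reproduce the paper's near-closed-form estimator. Clamping coordinates at the image border and the function names `bilinear` and `displaced_mse` are assumptions of the example.

```python
import math

def bilinear(img, x, y):
    """Intensity of `img` (a list of rows) at real-valued (x, y), clamped to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom

def displaced_mse(ref, cur, dx, dy):
    """Mean-squared error between `ref` and `cur` sampled at a subpixel shift (dx, dy)."""
    h, w = len(ref), len(ref[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            diff = ref[y][x] - bilinear(cur, x + dx, y + dy)
            total += diff * diff
    return total / (h * w)

# Hypothetical frames: a horizontal intensity ramp and the same ramp shifted by half a pixel.
ref = [[10.0 * x for x in range(6)] for _ in range(4)]
cur = [[10.0 * (x - 0.5) for x in range(6)] for _ in range(4)]
print(displaced_mse(ref, cur, 0.0, 0.0), displaced_mse(ref, cur, 0.5, 0.0))
```

The error is lower at the true half-pixel shift than at zero shift; it does not reach zero here only because of the border clamping. A search-based stabilizer would evaluate this criterion over many candidate shifts, which is the cost the paper's method is reported to avoid.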
Open Access
Fast insect damage detection in wheat kernels using transmittance images
(IEEE, 2004-07) Çataltepe, Z.; Pearson, T.; Çetin, A. Enis
We used transmittance images and different learning algorithms to classify insect damaged and un-damaged wheat kernels. Using the histogram of the pixels of the wheat images as the feature, and the linear model as the learning algorithm, we achieved a False Positive Rate (1-specificity) of 0.12 at the True Positive Rate (sensitivity) of 0.8 and an Area Under the ROC Curve (AUC) of 0.90 ± 0.02. Combining the linear model and a Radial Basis Function Network in a committee resulted in a FP Rate of 0.09 at the TP Rate of 0.8 and an AUC of 0.93 ± 0.03.
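The figures quoted in this abstract (a false positive rate at a fixed true positive rate, and an area under the ROC curve) come from sweeping a decision threshold over classifier scores. A minimal sketch of that computation is shown below; the scores and labels are invented for illustration and are unrelated to the wheat kernel data.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by thresholding at each distinct score, highest first."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical classifier scores; label 1 marks a damaged kernel, 0 an undamaged one.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
area = auc(curve)
print(curve, area)
```

Reading the false positive rate at a chosen true positive rate, as the abstract does at a sensitivity of 0.8, amounts to locating the point on this curve with the required TPR.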
Open Access
Feature extraction with the fractional Fourier transform
(Bilkent University, 1998) Güleryüz, Özgür
In this work, alternative design and implementation techniques for feature extraction applications are proposed. The proposed techniques amount to decomposing the overall feature extraction problem into a global linear system followed by a local nonlinear system. Different output representations for representation of input features are also allowed and used in these techniques. These different output representations bring an additional degree of freedom to the feature extraction problems. The systems provide multi-outputs consisting of different features of the input signal or image. Efficient implementation of the linear part of the system is obtained by using fractional Fourier filtering circuits. Expressions for the proposed techniques are derived and several illustrative examples are given in which efficient implementations for feature extraction applications are obtained.
Open Access
Finding compound structures in images using image segmentation and graph-based knowledge discovery
(IEEE, 2009-07) Zamalieva, Daniya; Aksoy, Selim; Tilton, J. C.
We present an unsupervised method for discovering compound image structures that are comprised of simpler primitive objects. An initial segmentation step produces image regions with homogeneous spectral content. Then, the segmentation is translated into a relational graph structure whose nodes correspond to the regions and the edges represent the relationships between these regions. We assume that the region objects that appear together frequently can be considered as strongly related. This relation is modeled using the transition frequencies between neighboring regions, and the significant relations are found as the modes of a probability distribution estimated using the features of these transitions. Experiments using an Ikonos image show that subgraphs found within the graph representing the whole image correspond to parts of different high-level compound structures. ©2009 IEEE.
Open Access
Finding people frequently appearing in news
(Springer, 2006-07) Özkan, Derya; Duygulu, Pınar
We propose a graph-based method to improve the performance of person queries in large news video collections. The method benefits from the multi-modal structure of videos and integrates text and face information. Using the idea that a person appears more frequently when his/her name is mentioned, we first use the speech transcript text to limit our search space for a query name. Then, we construct a similarity graph with nodes corresponding to all of the faces in the search space, and the edges corresponding to similarity of the faces. With the assumption that the images of the query name will be more similar to each other than to other images, the problem is then transformed into finding the densest component in the graph corresponding to the images of the query name. The same graph algorithm is applied for detecting and removing the faces of the anchorpeople in an unsupervised way. The experiments are conducted on 229 news videos provided by NIST for TRECVID 2004. The results show that the proposed method outperforms text-only methods and provides cues for large-scale face recognition. © Springer-Verlag Berlin Heidelberg 2006.
Open Access
Human action recognition using distribution of oriented rectangular patches
(Springer, 2007-10) İkizler, Nazlı; Duygulu, Pınar
We describe a "bag-of-rectangles" method for representing and recognizing human actions in videos. In this method, each human pose in an action sequence is represented by oriented rectangular patches extracted over the whole body. Then, spatial oriented histograms are formed to represent the distribution of these rectangular patches. In order to carry the information from the spatial domain described by the bag-of-rectangles descriptor to the temporal domain for recognition of the actions, four different methods are proposed: (i) frame by frame voting, which recognizes the actions by matching the descriptors of each frame, (ii) global histogramming, which extends the idea of the Motion Energy Image proposed by Bobick and Davis to rectangular patches, (iii) a classifier based approach using SVMs, and (iv) adaptation of Dynamic Time Warping on the temporal representation of the descriptor. Detailed experiments are carried out on the action dataset of Blank et al. High success rates (100%) show that with a very simple and compact representation, we can achieve robust recognition of human actions, compared to complex representations. © Springer-Verlag Berlin Heidelberg 2007.
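Method (iv) relies on Dynamic Time Warping, which compares two sequences that may run at different speeds. The following is a minimal sketch of the standard DTW recurrence on scalar sequences; in the paper the per-frame items would be bag-of-rectangles descriptors with a suitable distance, so the scalar values, the absolute-difference distance, and the name `dtw_distance` are assumptions for illustration only.

```python
def dtw_distance(a, b, dist=lambda u, v: abs(u - v)):
    """Dynamic Time Warping cost between sequences `a` and `b`.

    `dist` compares one element of `a` with one element of `b`.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] and b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical per-frame feature values: the same action performed slowly and quickly,
# and a different action.
slow = [0, 0, 1, 1, 2, 2, 1, 0]
fast = [0, 1, 2, 1, 0]
other = [2, 2, 2, 0, 0]
print(dtw_distance(slow, fast), dtw_distance(slow, other))
```

The slow and fast versions of the same pattern align at zero cost, while the different pattern does not, which is the property that makes DTW suitable for comparing actions of varying duration.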
Open Access
Identification of relative protein bands in polyacrylamide gel electrophoresis (PAGE) using a multi-resolution snake algorithm
(Informa Healthcare, 1999-06) Gürcan, M. N.; Koyutürk, M.; Yildiz, H. S.; Çetin-Atalay, R.; Çetin, A. Enis
In polyacrylamide gel electrophoresis (PAGE) image analysis, it is important to determine the percentage of the protein of interest of a protein mixture. This study presents reliable computer software to determine this percentage. The region of interest containing the protein band is detected using the snake algorithm. The iterative snake algorithm is implemented in a multi-resolutional framework. The snake is initialized on a low-resolution image. Then, the final position of the snake at the low resolution is used as the initial position in the higher-resolution image. Finally, the area of the protein is estimated as the area enclosed by the final position of the snake.