Browsing by Subject "Object recognition"
Now showing 1 - 20 of 41
Item Open Access
Automated detection of objects using multiple hierarchical segmentations (IEEE, 2007-07)
Akçay, H. Gökhan; Aksoy, Selim
We introduce an unsupervised method that combines both spectral and structural information for automatic object detection. First, a segmentation hierarchy is constructed by combining structural information extracted by morphological processing with spectral information summarized using principal components analysis. Then, segments that maximize a measure consisting of spectral homogeneity and neighborhood connectivity are selected as candidate structures for object detection. Given the observation that different structures appear more clearly in different principal components, we present an algorithm that is based on probabilistic Latent Semantic Analysis (PLSA) for grouping the candidate segments belonging to multiple segmentations and multiple principal components. The segments are modeled using their spectral content, and the PLSA algorithm builds object models by learning the object-conditional probability distributions. Labeling of a segment is done by computing the similarity of its spectral distribution to the distribution of object models using the Kullback-Leibler divergence. Experiments on two data sets show that our method is able to automatically detect, group, and label segments belonging to the same object classes. © 2007 IEEE.

Item Open Access
Automatic detection of geospatial objects using multiple hierarchical segmentations (Institute of Electrical and Electronics Engineers, 2008-07)
Akçay, H. G.; Aksoy, S.
The object-based analysis of remotely sensed imagery provides valuable spatial and structural information that is complementary to pixel-based spectral information in classification. In this paper, we present novel methods for automatic object detection in high-resolution images by combining spectral information with structural information exploited by using image segmentation.
The proposed segmentation algorithm uses morphological operations applied to individual spectral bands using structuring elements of increasing sizes. These operations produce a set of connected components forming a hierarchy of segments for each band. A generic algorithm is designed to select meaningful segments that maximize a measure consisting of spectral homogeneity and neighborhood connectivity. Given the observation that different structures appear more clearly at different scales in different spectral bands, we describe a new algorithm for unsupervised grouping of candidate segments belonging to multiple hierarchical segmentations to find coherent sets of segments that correspond to actual objects. The segments are modeled by using their spectral and textural content, and the grouping problem is solved by using the probabilistic latent semantic analysis algorithm that builds object models by learning the object-conditional probability distributions. The automatic labeling of a segment is done by computing the similarity of its feature distribution to the distribution of the learned object models using the Kullback-Leibler divergence. The performances of the unsupervised segmentation and object detection algorithms are evaluated qualitatively and quantitatively using three different data sets with comparative experiments, and the results show that the proposed methods are able to automatically detect, group, and label segments belonging to the same object classes. © 2008 IEEE.

Item Open Access
Çarpıcıdan bağımsız ortak fark matrisi kullanarak video ve görüntü işleme (IEEE, 2009-04)
Çetin, A. Enis; Duman, Kaan; Tuna, Hakan; Eryıldırım, Abdulkadir
In this paper, we present a fast algorithm for moving object tracking, face detection, license plate localization, and region description, based on a region descriptor obtained by defining a new operator that forms a semigroup over the real numbers. This new operator requires no multiplications. Using this operator, we define a matrix, called the codifference matrix, that characterizes image regions. In the license plate localization application, we estimate codifference matrices from license plate regions and store them in a database. To identify license plate regions in real-time video, we first determine the frames that contain moving regions; then, we compare the codifference matrices of license-plate-sized regions, within the moving regions or within the whole image, against the license plate codifference matrices in the database to decide whether a region contains a plate.

Item Open Access
Color vision in humans and computers (IEEE, 2008)
Boyacı, Hüseyin; Akarun, L.
Humans and many other species rely on color for object recognition. What are the biological underpinnings of color vision, and how can we computationally model human color perception? In this study we briefly summarize recent advances regarding the very early, retinal stages of color vision, as well as recent behavioral models of color perception in a three-dimensional world within rich context. We also highlight recent developments on the neuroimaging front that allow researchers to begin systematically studying the cortical processes related to color vision. © 2008 IEEE.

Item Open Access
Computationally efficient wavelet affine invariant functions for 2D object recognition (IEEE, 2003)
Bala, E.; Çetin, A. Enis
In this paper, an affine invariant function is presented for object recognition from wavelet coefficients of the object boundary. In previous works, the undecimated wavelet transform was used for affine invariant functions. In this paper, an algorithm based on the decimated wavelet transform is developed to compute the affine invariant function. As a result, computational complexity is significantly reduced without decreasing recognition performance. Experimental results are presented.

Item Open Access
Computationally efficient wavelet affine invariant functions for shape recognition (IEEE, 2004)
Bala, E.; Çetin, A. Enis
An affine invariant function for object recognition is constructed from wavelet coefficients of the object boundary. In previous works, the undecimated dyadic wavelet transform was used to construct affine invariant functions. In this paper, an algorithm based on the decimated wavelet transform is developed to compute an affine invariant function. As a result, computational complexity is reduced without decreasing recognition performance. Experimental results are presented. © 2004 IEEE.

Item Open Access
Connectivity-guided adaptive lifting transform for image like compression of meshes (IEEE, 2007-05)
Köse, Kıvanç; Çetin, A. Enis; Güdükbay, Uğur; Onural, Levent
We propose a new connectivity-guided adaptive wavelet transform based mesh compression framework. The 3D mesh is first transformed to 2D images on a regular grid structure by performing orthogonal projections onto the image plane. Then, this image-like representation is wavelet transformed using a lifting structure employing an adaptive predictor that takes advantage of the connectivity information of mesh vertices. The wavelet domain data is then encoded using the "Set Partitioning In Hierarchical Trees" (SPIHT) method or JPEG2000. The SPIHT approach is progressive because the resolution of the reconstructed mesh can be changed by varying the length of the 1D data stream created by the algorithm. In the JPEG2000 based approach, quantization of the coefficients determines the quality of the reconstruction. The results of the SPIHT based algorithm are observed to be superior to the JPEG2000 based mesh coder and MPEG-3DGC in the rate-distortion sense.

Item Open Access
The effect of task on cue usefulness for visual scene classification (2017-05)
Karaca, Meltem
Detecting objects in the environment is one of the most fundamental functions of the visual system. Humans are highly effective at this, and past studies have shown that we can process things like whether or not an animal is present in a scene within 150 msec.
Different lines of research have also examined possible cues that may be useful for rapid object detection and scene classification, and have found cues such as color, luminance, shape, and texture to be diagnostic. Studies examining the degree to which different cues are effective for detecting objects have found that shape and texture are the most important. However, it is unclear whether cue effectiveness depends on the task being employed. The discriminative information contained in different cues may vary depending on the task. This master's thesis examines the effects of task-relevant information on which cues are most useful for visual detection. In order to investigate the impact of task type on visual cue usefulness, participants were asked to perform animal and water detection tasks. They were presented with natural scenes that contain animals or water. We found significant differences in cue usefulness depending on the task. Corresponding differences were also found for reaction times based on the different cues. The results indicated that the effectiveness of visual cues depends on the nature of the task, and different cues might be more or less useful when individuals are instructed to do different kinds of tasks.

Item Unknown
Effects of surface reflectance and 3D shape on perceived rotation axis (Association for Research in Vision and Ophthalmology, 2013)
Doerschner, K.; Yilmaz, O.; Kucukoglu, G.; Fleming, R. W.
Surface specularity distorts the optic flow generated by a moving object in a way that provides important cues for identifying surface material properties (Doerschner, Fleming et al., 2011). Here we show that specular flow can also affect the perceived rotation axis of objects. In three experiments, we investigate how three-dimensional shape and surface material interact to affect the perceived rotation axis of unfamiliar irregularly shaped and isotropic objects.
We analyze observers' patterns of errors in a rotation axis estimation task under four surface material conditions: shiny, matte textured, matte untextured, and silhouette. In addition to the expected large perceptual errors in the silhouette condition, we find that the patterns of errors for the other three material conditions differ from each other and across shape category, yielding the largest differences in error magnitude between shiny and matte, textured isotropic objects. Rotation axis estimation is a crucial implicit computational step in perceiving structure from motion; therefore, we test whether a structure-from-motion-based model can predict the perceived rotation axis for shiny and matte, textured objects. Our model's predictions closely follow observers' data, even yielding the same reflectance-specific perceptual errors. Unlike previous work (Caudek & Domini, 1998), our model does not rely on the assumption of affine image transformations; however, a limitation of our approach is its reliance on projected correspondence, and it thus has difficulty accounting for the perceived rotation axis of smooth shaded objects and silhouettes. In general, our findings are in line with earlier research demonstrating that shape from motion can be extracted based on several different types of optical deformation (Koenderink & Van Doorn, 1976; Norman & Todd, 1994; Norman, Todd, & Orban, 2004; Pollick, Nishida, Koike, & Kawato, 1994; Todd, 1985).

Item Unknown
Estimation of depth fields suitable for video compression based on 3-D structure and motion of objects (Institute of Electrical and Electronics Engineers, 1998-06)
Alatan, A. A.; Onural, L.
Intensity prediction along motion trajectories removes temporal redundancy considerably in video compression algorithms. In three-dimensional (3-D) object-based video coding, both 3-D motion and depth values are required for temporal prediction.
The required 3-D motion parameters for each object are found by the correspondence-based E-matrix method. The estimation of the correspondences, i.e., the two-dimensional (2-D) motion field between the frames, and the segmentation of the scene into objects are achieved simultaneously by minimizing a Gibbs energy. The depth field is estimated by jointly minimizing a defined distortion and bitrate criterion using the 3-D motion parameters. The resulting depth field is efficient in the rate-distortion sense. Bit-rate values corresponding to the lossless encoding of the resultant depth fields are obtained using predictive coding; prediction errors are encoded by a Lempel-Ziv algorithm. The results are satisfactory for real-life video scenes.

Item Unknown
Eye tracking using Markov models (IEEE, 2004)
Bağcı, A. M.; Ansari, R.; Khokhar, A.; Çetin, A. Enis
We propose an eye detection and tracking method based on color and geometrical features of the human face using a monocular camera. In this method, a decision is made on whether the eyes are closed or not and, using a Markov chain framework to model temporal evolution, the subject's gaze is determined. The method can successfully track facial features even while the head assumes various poses, as long as the nostrils are visible to the camera. We compare our method with recently proposed techniques, and the results show that it provides more accurate tracking and robustness to variations in the view of the face. A procedure for detecting tracking errors is employed to recover the loss of feature points in case of occlusion or very fast head movement. The method may be used in monitoring a driver's alertness and detecting drowsiness, and also in applications requiring non-contact human-computer interaction.

Item Unknown
Fine-grained object recognition and zero-shot learning in multispectral imagery (IEEE, 2018)
Sümbül, Gencer; Aksoy, Selim; Cinbiş, R. G.
We present a method for the fine-grained object recognition problem, which aims to recognize the type of an object among a large number of sub-categories, and for the zero-shot learning scenario on multispectral images. In order to establish a relation between seen classes and new unseen classes, a compatibility function between image features extracted from a convolutional neural network and auxiliary information about the classes is learned. Knowledge transfer for unseen classes is carried out by maximizing this function. The performance of the model (15.2%), evaluated with manually annotated attributes, a natural language model, and a scientific taxonomy as auxiliary information, is promisingly better than that of the other methods for 16 test classes.

Item Unknown
Fine-grained object recognition and zero-shot learning in remote sensing imagery (Institute of Electrical and Electronics Engineers, 2018)
Sümbül, Gencer; Cinbis, R. G.; Aksoy, Selim
Fine-grained object recognition, which aims to identify the type of an object among a large number of subcategories, is an emerging application with the increasing resolution that exposes new details in image data. Traditional fully supervised algorithms fail to handle this problem where there is low between-class variance and high within-class variance for the classes of interest with small sample sizes. We study an even more extreme scenario named zero-shot learning (ZSL) in which no training example exists for some of the classes. ZSL aims to build a recognition model for new unseen categories by relating them to seen classes that were previously learned. We establish this relation by learning a compatibility function between image features extracted via a convolutional neural network and auxiliary information that describes the semantics of the classes of interest by using training samples from the seen classes. Then, we show how knowledge transfer can be performed for the unseen classes by maximizing this function during inference.
We introduce a new data set that contains 40 different types of street trees in 1-ft spatial resolution aerial data, and evaluate the performance of this model with manually annotated attributes, a natural language model, and a scientific taxonomy as auxiliary information. The experiments show that the proposed model achieves 14.3% recognition accuracy for the classes with no training examples, which is significantly better than a random guess accuracy of 6.3% for 16 test classes, and than three other ZSL algorithms.

Item Unknown
Flame detection in video using hidden Markov models (IEEE, 2005)
Töreyin, B. Uğur; Dedeoğlu, Yiğithan; Çetin, A. Enis
This paper proposes a novel method to detect flames in video by processing the data generated by an ordinary camera monitoring a scene. In addition to ordinary motion and color clues, the flame flicker process is also detected by using a hidden Markov model. Markov models representing flames and flame-colored ordinary moving objects are used to distinguish the flame flicker process from the motion of flame-colored moving objects. Spatial color variations in flames are also evaluated by the same Markov models. These clues are combined to reach a final decision. False alarms due to ordinary motion of flame-colored moving objects are greatly reduced when compared to existing video-based fire detection systems.

Item Unknown
Fractional Fourier transform pre-processing for neural networks and its application to object recognition (Elsevier, 2002-01)
Barshan, Billur; Ayrulu, Birsel
This study investigates fractional Fourier transform pre-processing of input signals to neural networks. The fractional Fourier transform is a generalization of the ordinary Fourier transform with an order parameter a. Judicious choice of this parameter can lead to overall improvement of the neural network performance. As an illustrative example, we consider recognition and position estimation of different types of objects based on their sonar returns.
Raw amplitude and time-of-flight patterns acquired from a real sonar system are processed, demonstrating reduced error in both recognition and position estimation of objects. (C) 2002 Elsevier Science Ltd. All rights reserved.

Item Unknown
Haber videolarında nesne tanıma ve otomatik etiketleme (IEEE, 2006-04)
Baştan, Muhammet; Duygulu, Pınar
We propose a new approach to the object recognition problem motivated by the availability of large annotated image and video collections. Similar to translation from one language to another, this approach considers the object recognition problem as the translation of visual elements to words. The visual elements represented in feature space are first categorized into a finite set of blobs. Then, the correspondences between the blobs and the words are learned using a method adapted from Statistical Machine Translation. Finally, the correspondences, in the form of a probability table, are used to predict words for particular image regions (region naming), for entire images (auto-annotation), or to associate the automatically generated speech transcript text with the correct video frames (video alignment). Experimental results are presented on the TRECVID 2004 data set, which consists of about 150 hours of news videos associated with manual annotations and speech transcript text. © 2006 IEEE.

Item Unknown
İmge işleme yoluyla çubuk kod yersenimi (IEEE, 2005-05)
Öktem, R.; Çetin, A. Enis
This abstract addresses barcode region extraction using image processing. Since barcodes consist of parallel light and dark lines, they stand out in a binary edge map as connected parallel lines at a particular orientation. Exploiting this property, the presented algorithms aim to localize the barcode region by means of morphology and free-angle thresholding. The Sobel operator and a binary subband transform are used to construct the edge map, and the two methods are compared in terms of time complexity and performance.

Item Unknown
Man-made object classification in SAR images using 2-D cepstrum (IEEE, 2009-05)
Eryildirim, A.; Çetin, A. Enis
In this paper, a novel descriptive feature parameter extraction method from Synthetic Aperture Radar (SAR) images is proposed. The new method is based on the two-dimensional (2-D) real cepstrum. This novel 2-D cepstrum method is compared with the principal component analysis (PCA) method by testing over the MSTAR image database. The extracted features are classified using a Support Vector Machine (SVM). We demonstrate that discrimination of natural background (clutter) and man-made objects (metal objects) in SAR imagery is possible using the 2-D cepstrum feature parameters. In addition, the computational cost of the cepstrum method is lower than that of the PCA method. Experimental results are presented. © 2009 IEEE.

Item Unknown
Mining web images for concept learning (2014-08)
Golge, Eren
We attack the problem of learning concepts automatically from noisy Web image search results. The idea is based on discovering common characteristics shared among category images by posing two novel methods that are able to organize the data while eliminating irrelevant instances. We propose a novel clustering and outlier detection method, namely Concept Map (CMAP). Given an image collection returned for a concept query, CMAP provides clusters pruned of outliers. Each cluster is used to train a model representing a different characteristic of the concept. Another method is Association through Model Evolution (AME). It prunes the data in an iterative manner and progressively finds a better set of images with an evaluation score computed for each iteration.
The idea is based on capturing the discriminativeness and representativeness of each instance against a large number of random images and eliminating the outliers. The final model is used for classification of novel images. These two methods are applied to different benchmark problems, and we observed compelling or better results compared to state-of-the-art methods.

Item Unknown
Modeling interaction of fluid, fabric, and rigid objects for computer graphics (IEEE, 2006)
Bayraktar, Serkan; Güdükbay, Uğur; Özgüç, Bülent
Simulating everyday phenomena such as fluids, rigid objects, or cloth and their interaction has been a challenge for the computer graphics community for decades. In this article, techniques to model such interactions are explained briefly, and some of the results of applying these techniques are presented. © 2006 IEEE.
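Several of the items listed above (the 2007 and 2008 hierarchical segmentation papers) label a candidate segment by comparing its feature distribution against learned object-model distributions with the Kullback-Leibler divergence and picking the closest model. The sketch below illustrates only that final labeling step; the class names and toy histograms are hypothetical placeholders, not the authors' code or data (in the papers, the distributions come from PLSA-learned object models over spectral/textural features).

```python
import math

def kl_divergence(p, q, eps=1e-12):
    # D(p || q) = sum_i p_i * log(p_i / q_i); eps guards against empty bins.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def label_segment(segment_hist, object_models):
    # Assign the class whose learned distribution is closest
    # (smallest KL divergence) to the segment's feature distribution.
    return min(object_models, key=lambda c: kl_divergence(segment_hist, object_models[c]))

# Toy example: two hypothetical object models over a 4-bin feature histogram.
models = {
    "building": [0.7, 0.2, 0.05, 0.05],
    "road": [0.1, 0.1, 0.4, 0.4],
}
print(label_segment([0.6, 0.3, 0.05, 0.05], models))  # prints "building"
```

Note that KL divergence is asymmetric (D(p || q) != D(q || p)), so the direction of the comparison, segment distribution versus model distribution, matters.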