Browsing by Subject "Object detection"

Open Access
Automatic detection of compound structures by joint selection of region groups from a hierarchical segmentation
(Institute of Electrical and Electronics Engineers, 2016) Akçay, H. G.; Aksoy, S.
A challenging problem in remote sensing image analysis is the detection of heterogeneous compound structures such as different types of residential, industrial, and agricultural areas that are composed of spatial arrangements of simple primitive objects such as buildings and trees. We describe a generic method for the modeling and detection of compound structures that involve arrangements of an unknown number of primitives in large scenes. The modeling process starts with a single example structure, considers the primitive objects as random variables, builds a contextual model of their arrangements using a Markov random field, and learns the parameters of this model via sampling from the corresponding maximum entropy distribution. The detection task is formulated as the selection of multiple subsets of candidate regions from a hierarchical segmentation where each set of selected regions constitutes an instance of the example compound structure. The combinatorial selection problem is solved by the joint sampling of groups of regions by maximizing the likelihood of their individual appearances and relative spatial arrangements. Experiments using very high spatial resolution images show that the proposed method can effectively localize an unknown number of instances of different compound structures that cannot be detected by using spectral and shape features alone.
Open Access
Automatic detection of compound structures by joint selection of region groups from multiple hierarchical segmentations
(2016-09) Akçay, Hüseyin Gökhan
A challenging problem in remote sensing image interpretation is the detection of heterogeneous compound structures such as different types of residential, industrial, and agricultural areas that are composed of spatial arrangements of simple primitive objects such as buildings and trees. We describe a generic method for the modeling and detection of compound structures that involve arrangements of an unknown number of primitives appearing in different primitive object layers in large scenes. The modeling process starts with example structures, considers the primitive objects as random variables, builds a contextual model of their arrangements using a Markov random field, and learns the parameters of this model via sampling from the corresponding maximum entropy distribution. The detection task is reduced to the selection of multiple subsets of candidate regions from multiple hierarchical segmentations corresponding to different primitive object layers where each set of selected regions constitutes an instance of the example compound structures. The combinatorial selection problem is solved by joint sampling of groups of regions by maximizing the likelihood of their individual appearances and relative spatial arrangements under the model learned from the example structures of interest. Moreover, we incorporate linear equality and inequality constraints on the candidate regions to prevent the co-selection of redundant overlapping regions and to enforce a particular spatial layout that must be respected by the selected regions. The constrained selection problem is formulated as a linearly constrained quadratic program that is solved via a variant of the primal-dual algorithm called the Difference of Convex algorithm by rewriting the non-convex program as the difference of two convex programs. Extensive experiments using very high spatial resolution images show that the proposed method can provide good localization of an unknown number of instances of different compound structures that cannot be detected by using spectral and shape features alone.
Open Access
Automatic detection of geospatial objects using multiple hierarchical segmentations
(Institute of Electrical and Electronics Engineers, 2008-07) Akçay, H. G.; Aksoy, S.
The object-based analysis of remotely sensed imagery provides valuable spatial and structural information that is complementary to pixel-based spectral information in classification. In this paper, we present novel methods for automatic object detection in high-resolution images by combining spectral information with structural information exploited by using image segmentation. The proposed segmentation algorithm uses morphological operations applied to individual spectral bands using structuring elements in increasing sizes. These operations produce a set of connected components forming a hierarchy of segments for each band. A generic algorithm is designed to select meaningful segments that maximize a measure consisting of spectral homogeneity and neighborhood connectivity. Given the observation that different structures appear more clearly at different scales in different spectral bands, we describe a new algorithm for unsupervised grouping of candidate segments belonging to multiple hierarchical segmentations to find coherent sets of segments that correspond to actual objects. The segments are modeled by using their spectral and textural content, and the grouping problem is solved by using the probabilistic latent semantic analysis algorithm that builds object models by learning the object-conditional probability distributions. The automatic labeling of a segment is done by computing the similarity of its feature distribution to the distribution of the learned object models using the Kullback-Leibler divergence. The performances of the unsupervised segmentation and object detection algorithms are evaluated qualitatively and quantitatively using three different data sets with comparative experiments, and the results show that the proposed methods are able to automatically detect, group, and label segments belonging to the same object classes. © 2008 IEEE.
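The labeling step described in this abstract can be illustrated with a minimal sketch. It assumes each segment and each learned object model is summarized as a discrete feature histogram; the histograms, the class names, and the direction of the divergence (segment relative to model) are assumptions made for illustration and are not taken from the paper.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def label_segment(segment_hist, object_models):
    """Return the model name with the smallest divergence from the segment histogram."""
    return min(object_models,
               key=lambda name: kl_divergence(segment_hist, object_models[name]))

# Hypothetical feature histograms (each sums to 1); not data from the paper.
models = {"building": [0.7, 0.2, 0.1], "vegetation": [0.1, 0.3, 0.6]}
segment = [0.6, 0.3, 0.1]
print(label_segment(segment, models))
```

The segment is assigned to the model it diverges from least, which is one simple way to turn a divergence into a label.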
Open Access
Bağlamsal çıkarımla nesne sezimi
(IEEE, 2009-04) Kalaycılar, Fırat; Aksoy, Selim
In this paper, an object detection system that utilizes contextual relationships between individually detected objects to improve the overall detection performance is introduced. The first contribution of this work is the modelling of real-world object relationships (beside, on, near, etc.) that can be probabilistically inferred using measurements in the 2D image space. The other contribution is the assignment of final labels to the detected objects by maximizing a scene probability function that is defined jointly using both individual object labels and their pairwise spatial relationships. The most consistent scene configuration is obtained by solving the maximization problem using linear optimization. Experiments on two different office data sets showed that incorporating real-world spatial relationships as contextual information improved the overall detection performance. ©2009 IEEE.
Open Access
ConceptMap: mining noisy web data for concept learning
(Springer, 2014-09) Gölge, Eren; Duygulu, Pınar
We attack the problem of learning concepts automatically from noisy Web image search results. The idea is based on discovering common characteristics shared among subsets of images by posing a method that is able to organise the data while eliminating irrelevant instances. We propose a novel clustering and outlier detection method, namely Concept Map (CMAP). Given an image collection returned for a concept query, CMAP provides clusters pruned from outliers. Each cluster is used to train a model representing a different characteristic of the concept. The proposed method outperforms the state-of-the-art studies on the task of learning from noisy web data for low-level attributes, as well as high-level object categories. It is also competitive with the supervised methods in learning scene concepts. Moreover, results on naming faces support the generalisation capability of the CMAP framework to different domains. CMAP is capable of working at large scale with no supervision by exploiting the available sources. © 2014 Springer International Publishing.
Open Access
Detection and classification of objects and texture
(2009) Tuna, Hakan
Object and texture recognition are two important subjects in computer vision. An efficient and fast algorithm to compute a short and efficient feature vector for classification of images is crucial for smart video surveillance systems. In this thesis, feature extraction methods for object and texture classification are investigated, compared and developed. A method for object classification based on shape characteristics is developed. Object silhouettes are extracted from videos by using the background subtraction method. Contours of the objects are obtained from these silhouettes and these 2-D contour signals are transformed into 1-D signals by using a type of radial transformation. Discrete cosine transformation is used to acquire the frequency characteristics of these signals and a support vector machine (SVM) is employed for classification of objects according to this frequency information. This method is implemented and integrated into a real-time system together with object tracking. For the texture recognition problem, we defined a new computationally efficient operator forming a semigroup on real numbers. The new operator does not require any multiplications. The codifference matrix based on the new operator is defined and an image descriptor using the codifference matrix is developed. Texture recognition and license plate identification examples based on the new descriptor are presented. We compared our method with the regular covariance matrix method. Our method has lower computational complexity and it is experimentally shown that it performs as well as the regular covariance method.
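The shape pipeline in this abstract (2-D contour, then a 1-D radial signal, then DCT coefficients) can be sketched as follows. The abstract does not specify the radial transformation, so this sketch assumes a centroid-distance signature; the scale normalization and the number of retained coefficients are likewise illustrative choices, and the SVM stage is omitted.

```python
import math

def radial_signature(contour):
    """Distance from the contour centroid to each contour point, in contour order."""
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)
    return [math.hypot(x - cx, y - cy) for x, y in contour]

def dct2(signal):
    """Unnormalized DCT-II of a real sequence."""
    n = len(signal)
    return [sum(s * math.cos(math.pi * (i + 0.5) * k / n)
                for i, s in enumerate(signal))
            for k in range(n)]

def shape_features(contour, num_coeffs=8):
    """Scale-normalized radial signature followed by the first DCT coefficients."""
    sig = radial_signature(contour)
    mean = sum(sig) / len(sig)
    sig = [s / mean for s in sig]  # assumed normalization for scale invariance
    return dct2(sig)[:num_coeffs]

# Hypothetical contour: 16 points sampled uniformly on a circle of radius 5.
circle = [(5 * math.cos(2 * math.pi * t / 16), 5 * math.sin(2 * math.pi * t / 16))
          for t in range(16)]
features = shape_features(circle)
```

For a circle the signature is constant, so all energy falls in the first coefficient; less regular silhouettes spread energy into higher coefficients, which is what a classifier would then use.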
Open Access
Detection of compound structures by region group selection from hierarchical segmentations
(IEEE, 2016-07) Akçay, H. Gökhan; Aksoy, Selim
Detection of compound structures that are comprised of different arrangements of simpler primitive objects has been a challenging problem as commonly used bag-of-words models are limited in capturing spatial information. We have developed a generic method that considers the primitive objects as random variables, builds a contextual model of their arrangements using a Markov random field, and detects new instances of compound structures through automatic selection of subsets of candidate regions from a hierarchical segmentation by maximizing the likelihood of their individual appearances and relative spatial arrangements. In this paper, we extend the model to handle different types of primitive objects that come from multiple hierarchical segmentations. Results are shown for the detection of different types of housing estates in a WorldView-2 image. © 2016 IEEE.
Open Access
Detection of compound structures using a gaussian mixture model with spectral and spatial constraints
(Institute of Electrical and Electronics Engineers Inc., 2014) Arı, C.; Aksoy, S.
The increasing spectral and spatial resolution of new-generation remotely sensed images necessitates the joint use of both types of information for detection and classification tasks. This paper describes a new approach for detecting heterogeneous compound structures such as different types of residential, agricultural, commercial, and industrial areas that are composed of spatial arrangements of primitive objects such as buildings, roads, and trees. The proposed approach uses Gaussian mixture models (GMMs), in which the individual Gaussian components model the spectral and shape characteristics of the individual primitives and an associated layout model is used to model their spatial arrangements. We propose a novel expectation-maximization (EM) algorithm that solves the detection problem using constrained optimization. The input is an example structure of interest that is used to estimate a reference GMM and construct spectral and spatial constraints. Then, the EM algorithm fits a new GMM to the target image data so that the pixels with high likelihoods of being similar to the Gaussian object models while satisfying the spatial layout constraints are identified without any requirement for region segmentation. Experiments using WorldView-2 images show that the proposed method can detect high-level structures that cannot be modeled using traditional techniques. © 1980-2012 IEEE.
Open Access
Detection of compound structures using hierarchical clustering of statistical and structural features
(IEEE, 2011) Akçay, H. Gökhan; Aksoy, Selim
We describe a new procedure that combines statistical and structural characteristics of simple primitive objects to discover compound structures in images. The statistical information that is modeled using spectral, shape, and position data of individual objects, and structural information that is modeled in terms of spatial alignments of neighboring object groups are encoded in a graph structure that contains the primitive objects at its vertices, and the edges connect the potentially related objects. Experiments using WorldView-2 data show that hierarchical clustering of these vertices can find high-level compound structures that cannot be obtained using traditional techniques. © 2011 IEEE.
Open Access
Detection of compound structures using multiple hierarchical segmentations
(IEEE, 2012) Akçay, H. Gökhan; Aksoy, Selim
In this paper, our aim is to discover compound structures composed of regions obtained from hierarchical segmentations of multiple spectral bands. A region adjacency graph is constructed by representing regions as vertices and connecting spatially close vertices with edges. Then, dissimilarities between neighboring vertices are computed using statistical and structural features, and are assigned as edge weights. Finally, the compound structures are detected by removing the edges with relatively large weights and extracting the connected components of the remaining graph. Experiments using WorldView-2 images show that grouping of these vertices according to different criteria can extract high-level compound structures that cannot be obtained using traditional techniques. © 2012 IEEE.
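The final grouping step described here (drop heavy edges, then take connected components) can be sketched with a union-find structure. The edge list, the dissimilarity values, and the fixed `max_weight` cutoff below are hypothetical; the paper's actual features and its criterion for "relatively large" weights are not reproduced.

```python
def compound_structures(num_regions, weighted_edges, max_weight):
    """Group regions into connected components after removing edges heavier than max_weight."""
    parent = list(range(num_regions))

    def find(v):
        # Find the component root, compressing the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v, w in weighted_edges:
        if w <= max_weight:  # keep only edges between sufficiently similar regions
            parent[find(u)] = find(v)

    groups = {}
    for v in range(num_regions):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())

# Hypothetical region adjacency graph: (region_a, region_b, dissimilarity).
edges = [(0, 1, 0.1), (1, 2, 0.2), (2, 3, 0.9), (3, 4, 0.15)]
print(compound_structures(5, edges, max_weight=0.5))
```

Here the heavy edge between regions 2 and 3 is removed, so the five regions split into two candidate structures.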
Open Access
Detection of tree trunks as visual landmarks in outdoor environments
(2010) Yıldız, Tuğba
One of the basic problems to be addressed for a robot navigating in an outdoor environment is the tracking of its position and state. A fundamental first step in using algorithms for solving this problem, such as various visual Simultaneous Localization and Mapping (SLAM) strategies, is the extraction and identification of suitable stationary “landmarks” in the environment. This is particularly challenging outdoors, where geometrically consistent features such as lines are not frequent. In this thesis, we focus on using trees as persistent visual landmark features in outdoor settings. Existing work to this end only uses intensity information in images and does not work well in low-contrast settings. In contrast, we propose a novel method to incorporate both color and intensity information as well as regional attributes in an image towards robust detection of tree trunks. We describe both extensions to the well-known edge-flow method as well as complementary Gabor-based edge detection methods to extract dominant edges in the vertical direction. The final stages of our algorithm then group these vertical edges into potential tree trunks using the integration of perceptual organization and all available image features. We characterize the detection performance of our algorithm for two different datasets, one homogeneous dataset with different images of the same tree types and a heterogeneous dataset with images taken from a much more diverse set of trees under more dramatic variations in illumination, viewpoint and background conditions. Our experiments show that our algorithm correctly finds up to 90% of trees with a false-positive rate lower than 15% in both datasets. These results establish that the integration of all available color, intensity and structure information results in a high-performance tree trunk detection system that is suitable for use within a SLAM framework and that outperforms other methods that only use image intensity information.
Open Access
Finding compound structures in images using image segmentation and graph-based knowledge discovery
(IEEE, 2009-07) Zamalieva, Daniya; Aksoy, Selim; Tilton, J. C.
We present an unsupervised method for discovering compound image structures that are comprised of simpler primitive objects. An initial segmentation step produces image regions with homogeneous spectral content. Then, the segmentation is translated into a relational graph structure whose nodes correspond to the regions and the edges represent the relationships between these regions. We assume that the region objects that appear together frequently can be considered as strongly related. This relation is modeled using the transition frequencies between neighboring regions, and the significant relations are found as the modes of a probability distribution estimated using the features of these transitions. Experiments using an Ikonos image show that subgraphs found within the graph representing the whole image correspond to parts of different high-level compound structures. ©2009 IEEE.
Open Access
Fire detection and 3D fire propagation estimation for the protection of cultural heritage areas
(Copernicus GmbH, 2010) Dimitropoulos, K.; Köse, Kıvanç; Grammalidis, N.; Çetin, A. Enis
Beyond taking precautionary measures to avoid a forest fire, early warning and immediate response to a fire breakout are the only ways to avoid great losses and damage to the environment and cultural heritage. To this end, this paper presents a computer vision based algorithm for wildfire detection and a 3D fire propagation estimation system. The main detection algorithm is composed of four sub-algorithms detecting (i) slow moving objects, (ii) smoke-coloured regions, (iii) rising regions, and (iv) shadow regions. After detecting a wildfire, the main focus should be the estimation of its propagation direction and speed. If the model of the vegetation and other important parameters like wind speed, slope, aspect of the ground surface, etc. are known, the propagation of fire can be estimated. This propagation can then be visualized in any 3D-GIS environment that supports KML files.
Open Access
Image mining using directional spatial constraints
(Institute of Electrical and Electronics Engineers, 2010-01) Aksoy, S.; Cinbiş, R. G.
Spatial information plays a fundamental role in building high-level content models for supporting analysts' interpretations and automating geospatial intelligence. We describe a framework for modeling directional spatial relationships among objects and using this information for contextual classification and retrieval. The proposed model first identifies image areas that have a high degree of satisfaction of a spatial relation with respect to several reference objects. Then, this information is incorporated into the Bayesian decision rule as spatial priors for contextual classification. The model also supports dynamic queries by using directional relationships as spatial constraints to enable object detection based on the properties of individual objects as well as their spatial relationships to other objects. Comparative experiments using high-resolution satellite imagery illustrate the flexibility and effectiveness of the proposed framework in image mining with significant improvements in both classification and retrieval performance.
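The idea of using spatial relationships as priors in the Bayesian decision rule can be illustrated with a minimal sketch. It assumes per-class likelihoods from an appearance model and per-class spatial priors derived from how well a location satisfies a directional relation; all numbers and class names below are hypothetical and do not come from the paper.

```python
def classify_with_spatial_prior(likelihoods, spatial_priors):
    """Bayes rule: posterior is proportional to likelihood times prior."""
    scores = {c: likelihoods[c] * spatial_priors[c] for c in likelihoods}
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all class scores are zero")
    posteriors = {c: s / total for c, s in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors

# Hypothetical values: appearance alone slightly favours "shadow",
# but the spatial prior (e.g. position relative to a reference object) favours "water".
likelihoods = {"water": 0.4, "shadow": 0.6}
spatial_priors = {"water": 0.8, "shadow": 0.2}
label, posteriors = classify_with_spatial_prior(likelihoods, spatial_priors)
print(label, posteriors)
```

In this toy case the spatial prior overturns the appearance-only decision, which is the kind of contextual correction the abstract describes.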
Open Access
Improving the performance of YOLO-based detection algorithms for small object detection in UAV-taken images
(2023-01) Şahin, Öykü
Recent advances in computer vision yield emerging novel applications, such as object detection, for camera-equipped unmanned aerial vehicles. The accuracy of the existing object detection solutions running on images acquired by Unmanned Aerial Vehicles (UAVs) is limited when compared to the performance of the object detection solutions designed for ground-taken images. Existing object detection solutions demonstrate lower performance on aerial datasets for reasons originating from the nature of the UAVs. These reasons can be summarized as: (i) the lack of large drone datasets with different types of objects, (ii) the larger variance in both scale and orientation of objects in drone images, and (iii) the difference in shape and texture of the features between the ground and the aerial images. Due to these reasons, YOLO-based models, a popular family of one-stage object detectors, perform insufficiently in UAV-based applications. In this thesis, two improved YOLO models, YOLODrone and YOLODrone+, are introduced for detecting objects in drone images. The performance of the models is tested on the VisDrone2019 and SkyDataV1 datasets and improved results are reported when compared to the original YOLOv3 and YOLOv5 models.
Open Access
Object detection and synthetic infrared image generation for UAV-based aerial images
(2023-09) Özkanoğlu, Mehmet Akif
This thesis contains two main works related to aerial image processing. In the first work (the first main part of this thesis), we present novel approaches to detect objects in aerial images. We introduce a novel object detection algorithm based on CenterNet which yielded state-of-the-art results on many metrics on many aerial benchmark datasets at the time this thesis was written. In this part, we study the effect of different loss functions and architectures on the detection performance for objects in aerial images taken by UAVs. We show that our proposed approaches help improve certain aspects of the learning process for detecting objects in aerial images. To train recent deep learning-based supervised object detection algorithms, the availability of annotations is essential. Many algorithms today use both infrared (IR) and visible (RGB) image pairs as input. However, large datasets (such as VisDrone [1] or ImageNet [2]) are typically captured in the visible spectrum. Therefore, a domain transfer-based approach to artificially generate infrared equivalents of the visible images for existing datasets is presented in the second part of this thesis. Such image pairs can then be used to train object detection algorithms for either mode in future work.
Open Access
Object detection using optical and LiDAR data fusion
(IEEE, 2016-07) Taşar, Onur; Aksoy, Selim
Fusion of aerial optical and LiDAR data has been a popular problem in remote sensing as they carry complementary information for object detection. We describe a stratified method that involves separately thresholding the normalized digital surface model derived from LiDAR data and the normalized difference vegetation index derived from spectral bands to obtain candidate image parts that contain different object classes, and incorporates spectral and height data with spatial information in a graph cut framework to segment the rest of the image where such separation is not possible. Experiments using a benchmark data set show that the performance of the proposed method, which uses a small amount of supervision, is comparable to those in the literature. © 2016 IEEE.
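The thresholding stage of this method can be sketched per pixel as follows. The normalized difference vegetation index is computed with its standard definition, (NIR - Red) / (NIR + Red); the two threshold values and the stratum names are hypothetical placeholders, and the graph cut stage for ambiguous pixels is not shown.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom

def stratify(height, nir, red, height_thr=2.0, ndvi_thr=0.3):
    """Assign a pixel to a (height, vegetation) stratum by two separate thresholds.

    height is the normalized digital surface model value (assumed to be in metres);
    both thresholds are illustrative, not values from the paper.
    """
    height_stratum = "high" if height > height_thr else "low"
    vegetation_stratum = "vegetated" if ndvi(nir, red) > ndvi_thr else "non-vegetated"
    return height_stratum, vegetation_stratum

# Hypothetical pixels: (height above ground, NIR reflectance, red reflectance).
print(stratify(10.0, 0.2, 0.3))
print(stratify(0.1, 0.6, 0.1))
```

Each stratum would then hold candidates for different object classes, with the remaining uncertain areas left to the segmentation step.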
Open Access
Object detection using optical and lidar data fusion with graph-cuts
(2017-03) Taşar, Onur
Object detection in remotely sensed data has been a popular problem and is commonly used in a wide range of applications in domains such as agriculture, navigation, environmental management, urban monitoring and mapping. However, using only one type of data source may not be sufficient to solve this problem. Fusion of aerial optical and LiDAR data has been a promising approach in remote sensing as they carry complementary information for object detection. We propose frameworks that partition the data in multiple levels and detect objects with minimal supervision in the partitioned data. In the preprocessing step, our methodology thresholds the data according to height and divides it into smaller components so that it can be processed efficiently. For the classification task, we propose two graph cut based procedures that detect objects in each component using height information from LiDAR, spectral information from aerial data, and spatial information from adjacency maps. The first procedure provides a binary classification, whereas the second one performs a multi-class classification. We use the first procedure to separate buildings from trees in the high pixels, and roads from grass areas in the low pixels. The second procedure is used to detect all of the classes in each component at once. The only supervision our proposed methodology requires consists of samples that are used to estimate the weights of the edges in the graph for the graph-cut procedures. Experiments using a benchmark data set show that the performance of the proposed methodology, which uses a small amount of supervision, is comparable to those in the literature.
Open Access
Offloading deep learning powered vision tasks from UAV to 5G edge server with denoising
(Institute of Electrical and Electronics Engineers, 2023-06-20) Özer, S.; İlhan, H. E.; Özkanoğlu, Mehmet Akif; Çırpan, H. A.
Offloading computationally heavy tasks from an unmanned aerial vehicle (UAV) to a remote server helps improve battery life and can help reduce resource requirements. Deep learning based state-of-the-art computer vision tasks, such as object segmentation and detection, are computationally heavy algorithms, requiring large memory and computing power. Many UAVs use (pretrained) off-the-shelf versions of such algorithms. Offloading such power-hungry algorithms to a remote server could help UAVs save power significantly. However, deep learning based algorithms are susceptible to noise, and a wireless communication system, by its nature, introduces noise to the original signal. When the signal represents an image, noise affects the image. There has not been much work studying the effect of the noise introduced by the communication system on pretrained deep networks. In this work, we first analyze how reliable it is to offload deep learning based computer vision tasks (including both object segmentation and detection) by focusing on the effect of various parameters of a 5G wireless communication system on the transmitted image, and demonstrate how the noise introduced by the 5G system reduces the performance of the offloaded deep learning task. Then solutions are introduced to eliminate (or reduce) the negative effect of the noise. The proposed framework starts by introducing many classical techniques as alternative solutions, and then introduces a novel deep learning based solution to denoise the given noisy input image. The performance of various denoising algorithms on offloading both object segmentation and object detection tasks is compared. Our proposed deep transformer-based denoiser algorithm (NR-Net) yields state-of-the-art results in our experiments.
Open Access
Performance measures for object detection evaluation
(Elsevier BV, 2010) Özdemir, B.; Aksoy, S.; Eckert, S.; Pesaresi, M.; Ehrlich, D.
We propose a new procedure for quantitative evaluation of object detection algorithms. The procedure consists of a matching stage for finding correspondences between reference and output objects, an accuracy score that is sensitive to object shapes as well as boundary and fragmentation errors, and a ranking step for final ordering of the algorithms using multiple performance indicators. The procedure is illustrated on a building detection task where the resulting rankings are consistent with the visual inspection of the detection maps. © 2009 Elsevier B.V. All rights reserved.