Browsing by Subject "Computer vision"
Now showing 1 - 20 of 59
Item Open Access
3D human pose search using oriented cylinders (IEEE, 2009-09-10)
Pehlivan, Selen; Duygulu, Pınar
In this study, we present a representation based on a new 3D search technique for volumetric human poses, which is then used to recognize actions in three-dimensional video sequences. We generate a set of cylinder-like 3D kernels of various sizes and orientations. These kernels are searched over 3D volumes to find high-response regions. The distribution of these responses is then used to represent a 3D pose. We use the proposed representation for (i) pose retrieval using Nearest Neighbor (NN) based and Support Vector Machine (SVM) based classification methods, and for (ii) action recognition on a set of actions using Dynamic Time Warping (DTW) and Hidden Markov Model (HMM) based classification methods. Evaluations on the IXMAS dataset support the effectiveness of such a robust pose representation. ©2009 IEEE.

Item Open Access
Attributes2Classname: a discriminative model for attribute-based unsupervised zero-shot learning (IEEE, 2017-10)
Demirel, B.; Cinbiş, Ramazan Gökberk; İkizler-Cinbiş, N.
We propose a novel approach for unsupervised zero-shot learning (ZSL) of classes based on their names. Most existing unsupervised ZSL methods aim to learn a model for directly comparing image features and class names. However, this proves to be a difficult task due to the dominance of non-visual semantics in the underlying vector-space embeddings of class names. To address this issue, we discriminatively learn a word representation such that the similarities between class names and combinations of attribute names fall in line with visual similarity. Contrary to traditional zero-shot learning approaches that are built upon attribute presence, our approach bypasses the laborious attribute-class relation annotations for unseen classes. In addition, our proposed approach renders text-only training possible; hence, the training can be augmented without the need to collect additional image data. The experimental results show that our method yields state-of-the-art results for unsupervised ZSL on three benchmark datasets. © 2017 IEEE.

Item Open Access
Camera-based virtual environment interaction on mobile devices (Springer, 2006-11)
Çapin, Tolga; Haro, A.; Setlur, V.; Wilkinson, S.
Mobile virtual environments, with real-time 3D and 2D graphics, are now possible on smart phones and other camera-enabled devices. Using computer vision, the camera sensor can be treated as an input modality in applications by analyzing the incoming live video. We present our tracking algorithm and several mobile virtual environment and gaming prototypes, including a 3D first-person shooter, a 2D puzzle game, and a simple action game. Camera-based interaction provides a user experience that is not possible through traditional means, and maximizes the use of the limited display size. © Springer-Verlag Berlin Heidelberg 2006.

Item Open Access
Combined filtering and key-frame reduction of motion capture data with application to 3DTV (WSCG, 2006-01-02)
Önder, Onur; Erdem, Ç.; Erdem, T.; Güdükbay, Uğur; Özgüç, Bülent
A new method for combined filtering and key-frame reduction of motion capture data is proposed. Filtering of motion capture data is necessary to eliminate any jitter introduced by a motion capture system. Key-frame reduction, on the other hand, allows animators to easily edit motion data by representing animation curves with a significantly smaller number of key frames. The proposed technique achieves key-frame reduction and jitter removal simultaneously by fitting a Hermite curve to motion capture data using dynamic programming. Copyright © UNION Agency - Science Press.
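A minimal sketch of the dynamic time warping (DTW) step mentioned in the pose-search record above: once each frame has a pose descriptor (e.g., a kernel-response histogram), actions can be compared by DTW distance between descriptor sequences. The Euclidean frame distance, descriptor size, and gallery labels below are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: DTW distance between two sequences of per-frame pose descriptors.
import numpy as np

def dtw_distance(seq_a: np.ndarray, seq_b: np.ndarray) -> float:
    """seq_a: (T1, D) and seq_b: (T2, D) arrays of per-frame pose descriptors."""
    t1, t2 = len(seq_a), len(seq_b)
    cost = np.full((t1 + 1, t2 + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, t1 + 1):
        for j in range(1, t2 + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])  # frame-to-frame distance
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[t1, t2])

# Example: nearest-neighbour action labelling against a small gallery (random stand-ins).
query = np.random.rand(30, 64)
gallery = {"walk": np.random.rand(40, 64), "kick": np.random.rand(25, 64)}
print(min(gallery, key=lambda name: dtw_distance(query, gallery[name])))
```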
Item Open Access
Computationally efficient wavelet affine invariant functions for 2D object recognition (IEEE, 2003)
Bala, E.; Çetin, A. Enis
In this paper, an affine invariant function is presented for object recognition from wavelet coefficients of the object boundary. In previous work, the undecimated wavelet transform was used for affine invariant functions. In this paper, an algorithm based on the decimated wavelet transform is developed to compute the affine invariant function. As a result, computational complexity is significantly reduced without decreasing recognition performance. Experimental results are presented.

Item Open Access
Computer vision based analysis of potato chips - A tool for rapid detection of acrylamide level (Wiley - VCH Verlag GmbH & Co. KGaA, 2006)
Gökmen, V.; Senyuva, H. Z.; Dülek, B.; Çetin, E.
In this study, analysis of digital color images of fried potato chips was combined with parallel LC-MS based analysis of acrylamide in order to develop a rapid tool for the estimation of acrylamide during processing. Pixels of the fried potato image were classified into three sets based on their Euclidean distances to the representative mean values of typical bright yellow, yellowish brown, and dark brown regions using a semiautomatic segmentation algorithm. The feature parameter extracted from the segmented image was the NA2 value, defined as the number of pixels in Set-2 divided by the total number of pixels of the entire fried potato image. Using training images of potato chips, it was shown that there was a strong linear correlation (r = 0.989) between acrylamide level and NA2 value. Images of a number of test samples were analyzed to predict their acrylamide level by means of this correlation data. The results confirmed that the computer vision system described here provided an explicit and meaningful description of potato chips for inspection and evaluation purposes. Assuming a provisional threshold limit of 1000 ng/g for acrylamide, test samples could be successfully inspected with only one failure out of 60 potato chips.

Item Open Access
Computer vision based forest fire detection (IEEE, 2008)
Töreyin, B. Uğur; Çetin, A. Enis
Lookout posts are commonly installed in forests all around Turkey and the world. Most of these posts have electricity. Surveillance cameras can be placed on these surveillance towers to detect possible forest fires. Currently, the average fire detection time is 5 minutes in manned lookout towers. The aim of the proposed computer vision based method is to reduce the average fire detection time. The detection method is based on wavelet-based analysis of the background images at various update rates.

Item Open Access
Computer vision based method for real-time fire and flame detection (Elsevier BV, 2006-01-01)
Töreyin, B. U.; Dedeoǧlu, Y.; Güdükbay, Uğur; Çetin, A. Enis
This paper proposes a novel method to detect fire and/or flames in real time by processing the video data generated by an ordinary camera monitoring a scene. In addition to ordinary motion and color clues, flame and fire flicker is detected by analyzing the video in the wavelet domain. Quasi-periodic behavior in flame boundaries is detected by performing a temporal wavelet transform. Color variations in flame regions are detected by computing the spatial wavelet transform of moving fire-colored regions. Another clue used in the fire detection algorithm is the irregularity of the boundary of the fire-colored region. All of the above clues are combined to reach a final decision. Experimental results show that the proposed method is very successful in detecting fire and/or flames. In addition, it drastically reduces the false alarms issued for ordinary fire-colored moving objects compared to methods using only motion and color clues. © 2005 Elsevier B.V. All rights reserved.
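A minimal sketch of the colour-classification idea described in the potato chip record above: assign each pixel to the nearest of three representative colours and report the fraction of pixels in the second set (the NA2-style feature). The reference RGB values below are illustrative guesses, not the calibrated values used in the paper.

```python
# Hedged sketch: nearest-colour pixel classification and NA2-style feature.
import numpy as np

REFERENCE_COLOURS = np.array([
    [225, 200,  80],   # Set-1: bright yellow (assumed RGB)
    [170, 120,  50],   # Set-2: yellowish brown (assumed RGB)
    [ 80,  45,  20],   # Set-3: dark brown (assumed RGB)
], dtype=float)

def na2_feature(rgb_image: np.ndarray) -> float:
    """rgb_image: (H, W, 3) uint8 array of a segmented chip region."""
    pixels = rgb_image.reshape(-1, 3).astype(float)
    # Euclidean distance of every pixel to each reference colour, shape (N, 3).
    dists = np.linalg.norm(pixels[:, None, :] - REFERENCE_COLOURS[None, :, :], axis=2)
    labels = dists.argmin(axis=1)          # nearest reference colour per pixel
    return float((labels == 1).mean())     # share of pixels assigned to Set-2
```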
Item Open Access
Computer vision based text and equation editor for LaTeX (IEEE, 2004-06)
Öksüz, Özcan; Güdükbay, Uğur; Çetin, Enis
In this paper, we present a computer vision based text and equation editor for LaTeX. The user writes text and equations on paper, and a camera attached to a computer records the actions of the user. In particular, positions of the pen tip in consecutive image frames are detected. Next, directional and positional information about characters is calculated using these positions. Then, this information is used for on-line character classification. After characters and symbols are found, the corresponding LaTeX code is generated.

Item Open Access
Computer vision based unistroke keyboard system and mouse for the handicapped (IEEE, 2003-07)
Erdem, M. Erkut; Erdem, İ. Aykut; Atalay, Volkan; Çetin, A. Enis
In this paper, a unistroke keyboard based on computer vision is described for the handicapped. The keyboard can be made of paper or fabric containing an image of a keyboard, which has an upside-down U-shape. It can even be displayed on a computer screen. Each character is represented by a non-overlapping rectangular region on the keyboard image, and the user enters a character by illuminating a character region with a laser pointer. The keyboard image is monitored by a camera, and illuminated key locations are recognized. During the text entry process, the user neither has to turn the laser light off nor lift it from the keyboard. A disabled person who has difficulty using his/her hands may attach the laser pointer to a pair of eyeglasses and easily enter text by moving his/her head to point the laser beam at a character location. In addition, a mouse-like device can be developed based on the same principle. The user can move the cursor by moving the laser light on the computer screen, which is monitored by a camera. © 2003 IEEE.

Item Open Access
A confirmatory test for sperm in sexual assault samples using a microfluidic-integrated cell phone imaging system (Elsevier, 2020)
Deshmukh, S.; İnci, Fatih; Karaaslan, M. G.; Öğüt, M. G.; Duncan, D.; Klevan, L.; Duncan, G.; Demirci, U.
Rapid and efficient processing of sexual assault evidence to accelerate forensic investigation and decrease casework backlogs is urgently needed. Therefore, the standardized protocols currently used in forensic laboratories can benefit from continued innovation to handle the increasing number and complexity of samples being submitted to forensic labs. To our knowledge, there is currently no available rapid and portable forensic screening technology based on a confirmatory test for sperm identification in a sexual assault kit. Here, we present a novel forensic sample screening tool, i.e., a microchip integrated with a portable cell phone imaging platform that records and processes images for further investigation and storage. The platform (i) precisely and rapidly screens swab samples (<15 min after sample preparation on-chip); (ii) selectively captures sperm from mock sexual assault samples using a novel and previously published SLeX-based surface chemistry treatment; (iii) separates non-sperm contents (epithelial cells and debris in this case) out of the channel by flow prior to imaging; (iv) captures cell phone images on a portable cellphone-integrated imaging platform; (v) quantitatively differentiates sperm cells from epithelial cells using a morphology detection code that leverages Laplacian of Gaussian and Hough gradient transform methods; (vi) is sensitive within a forensic cut-off (>95% accuracy) compared to manual counts; (vii) provides a cost-effective and timely solution to a problem which in the past has taken a great deal of time; and (viii) handles small volumes of sample (20 μL). This integration of the cellphone imaging platform and cell recognition algorithms with disposable microchips can be a new direction toward a direct visual test to screen and differentiate sperm from epithelial cell types in forensic samples for a crime laboratory scenario. With further development, this integrated platform could assist a sexual assault nurse examiner (SANE) in a hospital or sexual assault treatment center facility to flag sperm-containing samples prior to further downstream testing.
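A minimal sketch in the spirit of the morphology detection step named in the record above (Laplacian of Gaussian plus Hough gradient transform): smooth the image, emphasise blob-like structures with a Laplacian-of-Gaussian response, then search for circular candidates with OpenCV's Hough gradient method. All parameter values and the function name are placeholders that would need tuning to real imagery; this is not the authors' code.

```python
# Hedged sketch: LoG response followed by Hough gradient circle detection.
import cv2
import numpy as np

def detect_round_cells(gray: np.ndarray) -> np.ndarray:
    """gray: (H, W) uint8 image. Returns an array of (x, y, r) circle candidates."""
    blurred = cv2.GaussianBlur(gray, (9, 9), sigmaX=2.0)
    log = cv2.Laplacian(blurred, cv2.CV_32F, ksize=5)            # LoG-style response
    log_norm = cv2.normalize(log, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    circles = cv2.HoughCircles(log_norm, cv2.HOUGH_GRADIENT, dp=1.5, minDist=12,
                               param1=120, param2=25, minRadius=3, maxRadius=12)
    return np.zeros((0, 3)) if circles is None else circles[0]
```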
Item Open Access
Deepfake detection through motion magnification inspired feature manipulation (2022-09)
Mirzayev, Aydamir
Synthetically generated media content poses a significant threat to information security in the online domain. Manipulated videos and images of celebrities, politicians, and ordinary citizens, if aimed at misrepresentation and defamation, can cause significant damage to one's reputation. Early detection of such content is crucial for timely alleviation of the further spread of questionable information. In past years, a significant number of deepfake detection frameworks have proposed utilizing motion magnification as a preprocessing step aimed at revealing transitional inconsistencies relevant to the prediction outcome. However, such an approach is sub-optimal, since commonly used motion manipulation methods are optimized for a limited set of controlled motions and display significant visual artifacts when used outside of their domain. To this end, rather than applying motion magnification as a separate processing step, we propose to test trainable motion magnification-inspired feature manipulation units as an addition to a convolutional-LSTM classification network. In our approach, we aim to take a first step toward understanding the use of magnification-like architectures in the task of video classification rather than aiming at full integration. We test our approach on the Celeb-DF dataset, which is composed of more than five thousand synthetic videos generated using the DeepFakes fake-generation method. We treat the manipulation unit as another network layer and test the performance of the network both with and without it. To ensure the consistency of our results, we perform multiple experiments with the same configurations and report the average accuracy. In our experiments, we observe an average 3% jump in accuracy when the feature manipulation unit is incorporated into the network.
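A hedged sketch of the kind of convolutional-LSTM video classifier discussed in the thesis record above, with an optional magnification-inspired unit that amplifies frame-to-frame feature differences by a learnable factor. The architecture, layer sizes, and the DiffAmplifier module are illustrative assumptions of mine, not the thesis implementation.

```python
# Hedged sketch: per-frame CNN features -> optional difference amplification -> LSTM -> class scores.
import torch
import torch.nn as nn

class DiffAmplifier(nn.Module):
    """Amplify temporal differences of per-frame features (motion-magnification flavoured)."""
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(1.0))     # learnable amplification factor

    def forward(self, feats):                            # feats: (B, T, D)
        diff = feats[:, 1:] - feats[:, :-1]              # temporal differences
        amplified = feats[:, 1:] + self.alpha * diff
        return torch.cat([feats[:, :1], amplified], dim=1)

class ConvLSTMClassifier(nn.Module):
    def __init__(self, use_manipulation=True, feat_dim=128, hidden=64):
        super().__init__()
        self.backbone = nn.Sequential(                   # tiny per-frame CNN encoder
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim))
        self.manipulate = DiffAmplifier() if use_manipulation else nn.Identity()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)                 # real vs. fake

    def forward(self, clips):                            # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        feats = self.backbone(clips.flatten(0, 1)).view(b, t, -1)
        feats = self.manipulate(feats)
        _, (h, _) = self.lstm(feats)
        return self.head(h[-1])

logits = ConvLSTMClassifier()(torch.randn(2, 8, 3, 64, 64))   # (2, 2) class scores
```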
Item Open Access
Detection of tree trunks as visual landmarks in outdoor environments (2010)
Yıldız, Tuğba
One of the basic problems to be addressed for a robot navigating in an outdoor environment is the tracking of its position and state. A fundamental first step in using algorithms for solving this problem, such as various visual Simultaneous Localization and Mapping (SLAM) strategies, is the extraction and identification of suitable stationary "landmarks" in the environment. This is particularly challenging outdoors, where geometrically consistent features such as lines are not frequent. In this thesis, we focus on using trees as persistent visual landmark features in outdoor settings. Existing work to this end only uses intensity information in images and does not work well in low-contrast settings. In contrast, we propose a novel method that incorporates both color and intensity information as well as regional attributes in an image towards robust detection of tree trunks. We describe both extensions to the well-known edge-flow method and complementary Gabor-based edge detection methods to extract dominant edges in the vertical direction. The final stages of our algorithm then group these vertical edges into potential tree trunks using the integration of perceptual organization and all available image features. We characterize the detection performance of our algorithm on two different datasets: a homogeneous dataset with different images of the same tree types, and a heterogeneous dataset with images taken from a much more diverse set of trees under more dramatic variations in illumination, viewpoint, and background conditions. Our experiments show that our algorithm correctly finds up to 90% of trees with a false-positive rate lower than 15% in both datasets. These results establish that the integration of all available color, intensity, and structure information results in a high-performance tree trunk detection system, suitable for use within a SLAM framework, that outperforms methods using only image intensity information.

Item Open Access
A discussion on homography between stationary multi-camera systems and the soccer field model (IEEE, 2012)
Baysal, Sermetcan; Duygulu, Pınar; Kayalar, Ceren
Computer vision based athlete tracking systems use different methods to segment players from the background and then track them automatically throughout the video. It is insufficient to know a player's position on the image plane if we want to extract a performance analysis of the player; image-plane coordinates need to be transformed into real-world coordinates representing the position of the player on the field. Since the soccer field is planar, the mapping between the world coordinate system and the image coordinate system can be described by a planar homography. In this paper, we provide a discussion on homography calculations between a three-camera player tracking system and the real-world soccer field model. © 2012 IEEE.

Item Open Access
Distinct representations in occipito-temporal, parietal, and premotor cortex during action perception revealed by fMRI and computational modeling (Elsevier, 2019)
Ürgen, Burcu A.; Pehlivan, S.; Saygın, A.
Visual processing of actions is supported by a network consisting of occipito-temporal, parietal, and premotor regions in the human brain, known as the Action Observation Network (AON). In the present study, we investigate what aspects of visually perceived actions are represented in this network using fMRI and computational modeling. Human subjects performed an action perception task during scanning. We characterized different aspects of the stimuli, ranging from purely visual properties such as form and motion to higher-level aspects such as intention, using computer vision and categorical modeling. We then linked the models of the stimuli to the three nodes of the AON with representational similarity analysis. Our results show that different nodes of the network represent different aspects of actions. While occipito-temporal cortex performs visual analysis of actions by integrating form and motion information, parietal cortex builds on these visual representations and transforms them into more abstract and semantic representations coding the target of the action, the action type, and intention. Taken together, these results shed light on the neuro-computational mechanisms that support visual perception of actions and support the view that the AON is a hierarchical system in which increasing levels of the cortex code increasingly complex features.
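A minimal sketch of the representational similarity analysis step mentioned in the record above: build representational dissimilarity matrices (RDMs) from a stimulus-by-feature model matrix and a stimulus-by-voxel response matrix, then compare their condensed RDMs with Spearman correlation. Input shapes and the correlation-distance metric are assumptions for illustration, not the authors' analysis code.

```python
# Hedged sketch: model RDM vs. neural RDM comparison for one region of interest.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(features: np.ndarray) -> np.ndarray:
    """features: (n_stimuli, n_dims). Returns condensed correlation-distance RDM."""
    return pdist(features, metric="correlation")

def rsa_score(model_features: np.ndarray, roi_responses: np.ndarray) -> float:
    """Spearman correlation between a model RDM and an ROI's neural RDM."""
    rho, _ = spearmanr(rdm(model_features), rdm(roi_responses))
    return float(rho)

# Example with random stand-ins for 20 action stimuli.
motion_model = np.random.rand(20, 50)      # e.g., motion-based descriptors per stimulus
parietal_roi = np.random.rand(20, 300)     # e.g., voxel responses per stimulus
print(rsa_score(motion_model, parietal_roi))
```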
Item Open Access
Entropi fonsiyonuna dayalı uyarlanır karar tümleştirme yapısı (2012-04)
Günay, Osman; Töreyin, B. U.; Köse, Kıvanç; Çetin, A. Enis
Abstract (translated from Turkish): In this paper, an adaptive decision fusion framework based on the entropy function is developed for use in image analysis and computer vision applications. In this framework, the compound algorithm consists of several sub-algorithms, each of which produces its own decision as a zero-centered real number representing its confidence level. The decision values are combined linearly using weights that are updated online by an active fusion method based on entropic projections onto the convex sets describing the sub-algorithms. The framework also includes an oracle, usually a human expert, who provides feedback to the decision fusion algorithm. The performance of the proposed decision fusion algorithm is tested on a video-based forest fire detection system that we developed.

Item Open Access
Exploiting architectural features of a computer vision platform towards reducing memory stalls (Springer, 2020)
Mustafa, Naveed Ul; O’Riordan, M. J.; Rogers, S.; Öztürk, Özcan
Computer vision applications are becoming more and more popular in embedded systems such as drones, robots, tablets, and mobile devices. These applications are both compute and memory intensive, with memory-bound stalls (MBS) making up a significant part of their execution time. For maximum reduction in memory stalls, compilers need to consider the architectural details of a platform and utilize its hardware components efficiently. In this paper, we propose a compiler optimization for a vision-processing system through classification of memory references to reduce MBS. As the proposed optimization is based on the architectural features of a specific platform, i.e., Myriad 2, it can only be applied to other platforms having similar architectural features. The optimization consists of two steps: affinity analysis and affinity-aware instruction scheduling. We suggest two different approaches for affinity analysis, i.e., source code annotation and automated analysis. We use the LLVM compiler infrastructure to implement the proposed optimization. Application of the annotation-based approach to a memory-intensive program shows a reduction in stall cycles of 67.44%, leading to a 25.61% improvement in execution time. We use 11 different image-processing benchmarks to evaluate the automated analysis approach. Experimental results show that classification of memory references reduces stall cycles by 69.83% on average. As all benchmarks are both compute and memory intensive, we achieve an improvement in execution time of up to 30%, with a modest average of 5.79%.
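A minimal sketch relating to the adaptive decision fusion record above ("Entropi fonsiyonuna dayalı..."): sub-algorithm confidence values are fused linearly, and the weight vector is nudged toward the set of weights whose fused decision matches the oracle's feedback. For simplicity this uses a plain orthogonal projection update rather than the entropic projection described in the paper; names and values are illustrative.

```python
# Hedged sketch: linear decision fusion with a projection-style weight update.
import numpy as np

def fuse(weights: np.ndarray, decisions: np.ndarray) -> float:
    return float(weights @ decisions)

def update_weights(weights: np.ndarray, decisions: np.ndarray, oracle: float,
                   mu: float = 1.0) -> np.ndarray:
    """Project weights toward the hyperplane {w : w . decisions == oracle}."""
    error = oracle - weights @ decisions
    return weights + mu * error * decisions / (decisions @ decisions + 1e-12)

weights = np.full(4, 0.25)                     # four sub-algorithms, equal initial weights
decisions = np.array([0.8, -0.2, 0.5, 0.1])    # zero-centered per-frame confidence values
oracle = 1.0                                   # human operator: "fire present"
weights = update_weights(weights, decisions, oracle)
print(fuse(weights, decisions))
```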
Item Open Access
FAME: Face association through model evolution (IEEE, 2015-06)
Gölge, Eren; Duygulu, Pınar
We attack the problem of building classifiers for public faces from web images collected through querying a name. The search results are very noisy even after face detection, with several irrelevant faces corresponding to other people. Moreover, the photographs are taken in the wild, with a large variety of poses and expressions. We propose a novel method, Face Association through Model Evolution (FAME), that is able to prune the data in an iterative way so that the models associated with a name evolve. The idea is based on capturing discriminative and representative properties of each instance and eliminating the outliers. The final models are used to classify faces on novel datasets with different characteristics. On benchmark datasets, our results are comparable to or better than the state-of-the-art studies for the task of face identification. © 2015 IEEE.

Item Open Access
Fire detection and 3D fire propagation estimation for the protection of cultural heritage areas (Copernicus GmbH, 2010)
Dimitropoulos, K.; Köse, Kıvanç; Grammalidis, N.; Çetin, A. Enis
Beyond taking precautionary measures to avoid a forest fire, early warning and immediate response to a fire breakout are the only ways to avoid great losses and damage to the environment and cultural heritage. To this end, this paper aims to present a computer vision based algorithm for wildfire detection and a 3D fire propagation estimation system. The main detection algorithm is composed of four sub-algorithms detecting (i) slow moving objects, (ii) smoke-coloured regions, (iii) rising regions, and (iv) shadow regions. After detecting a wildfire, the main focus should be the estimation of its propagation direction and speed. If the vegetation model and other important parameters such as wind speed, slope, and aspect of the ground surface are known, the propagation of the fire can be estimated. This propagation can then be visualized in any 3D-GIS environment that supports KML files.

Item Open Access
Flexible test-bed for unusual behavior detection (ACM, 2007-07)
Petrás, I.; Beleznai, C.; Dedeoğlu, Yiğithan; Pardàs, M.; Kovács, L.; Szlávik, Z.; Havasi, L.; Szirányi, T.; Töreyin, B. Uğur; Güdükbay, Uğur; Çetin, Ahmet Enis; Canton-Ferrer, C.
Visual surveillance and activity analysis is an active research field of computer vision. As a result, several different algorithms have been produced for this purpose. To obtain more robust systems, it is desirable to integrate the different algorithms. To help achieve this goal, we propose a flexible, distributed software collaboration framework and present a prototype system for automatic event analysis. Copyright 2007 ACM.
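A minimal sketch of the "slow moving object" sub-algorithm idea mentioned in the wildfire-detection record above: maintain two running-average background models with different update rates and flag pixels that have already been absorbed into the fast model but still differ from the slow one. The update rates and threshold are placeholders, not values from the cited papers.

```python
# Hedged sketch: slow-moving region detection with two background models.
import numpy as np

class SlowMotionDetector:
    def __init__(self, shape, fast_rate=0.10, slow_rate=0.01, threshold=20.0):
        self.fast = np.zeros(shape, dtype=float)
        self.slow = np.zeros(shape, dtype=float)
        self.fast_rate, self.slow_rate, self.threshold = fast_rate, slow_rate, threshold

    def update(self, gray_frame: np.ndarray) -> np.ndarray:
        frame = gray_frame.astype(float)
        self.fast += self.fast_rate * (frame - self.fast)   # quickly adapting background
        self.slow += self.slow_rate * (frame - self.slow)   # slowly adapting background
        # Slow-moving regions: close to the fast model but still far from the slow one.
        mask = (np.abs(frame - self.fast) < self.threshold) & \
               (np.abs(frame - self.slow) > self.threshold)
        return mask

detector = SlowMotionDetector((240, 320))
mask = detector.update(np.random.randint(0, 256, (240, 320)).astype(np.uint8))
```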