Browsing by Subject "Video analysis"
Now showing 1 - 4 of 4
Item (Open Access): Fuzzy color histogram-based video segmentation (Academic Press, 2010)
Küçüktunç, O.; Güdükbay, Uğur; Ulusoy, Özgür

We present a fuzzy color histogram-based shot-boundary detection algorithm specialized for content-based copy detection applications. The proposed method aims to detect both cuts and gradual transitions (fades, dissolves) effectively in videos subject to heavy transformations such as cam-cording, pattern insertions, and strong re-encoding. Along with the color histogram generated with the fuzzy linking method in the L*a*b* color space, the system extracts a mask for still regions and the window of any picture-in-picture transformation for each detected shot, both of which are useful in a content-based copy detection system. Experimental results show that our method detects shot boundaries effectively and reduces false alarms compared to state-of-the-art shot-boundary detection algorithms. © 2009 Elsevier Inc. All rights reserved.

Item (Open Access): A multi-modal video analysis approach for car park fire detection (Elsevier, 2013)
Verstockt, S.; Hoecke, S. V.; Beji, T.; Merci, B.; Gouverneur, B.; Çetin, A. Enis; Potter, P. D.; Walle, R. V. D.

In this paper, a novel multi-modal flame and smoke detector is proposed for the detection of fire in large open spaces such as car parks. The flame detector is based on the visual and amplitude images of a time-of-flight camera. Using this multi-modal information, flames can be detected very accurately by visual flame-feature analysis and amplitude-disorder detection. To detect the flame-related features at low cost, moving objects in visual images are analyzed over time. If an object has a high probability for each of the flame characteristics, it is labeled as a candidate flame region. Simultaneously, the amplitude disorder is also investigated.
Regions with high accumulated amplitude differences and high values in all detail images of the discrete wavelet transform of the amplitude image are also labeled as candidate flame regions. Finally, when at least one visual candidate flame region overlaps an amplitude candidate flame region, a fire alarm is raised. The smoke detector, on the other hand, focuses on global changes in the depth images of the time-of-flight camera that have no significant impact on the amplitude images; this behavior was found to be unique to smoke. Experiments show that the proposed detectors improve the accuracy of fire detection in car parks. The flame detector achieves an average flame detection rate of 93% with hardly any false positives, and the time-of-flight-based smoke detector achieves a smoke detection rate of 88%.

Item (Open Access): Multiple view human activity recognition (2012)
Pehlivan, Selen

This thesis explores the human activity recognition problem when multiple views are available. We follow two main directions: we first present a system that performs volume matching using 3D volumes constructed from calibrated cameras, and we then present a flexible system based on frame matching directly over multiple views. We compare multiple-view systems with single-view systems and measure the recognition improvements gained from additional views through various experiments. The initial part of the thesis introduces compact representations for volumetric data obtained through reconstruction. The video frames recorded by many cameras with significant overlap are fused by reconstruction, and the reconstructed volumes serve as substitutes for action poses. We propose new pose descriptors over these three-dimensional volumes. Our first descriptor is based on a histogram of oriented cylinders of various sizes and orientations. We then propose a second descriptor that is view-independent and does not require pose alignment.
We show the importance of discriminative pose representations within simple activity classification schemes. The activity recognition framework based on volume matching yields promising results compared to the state of the art. Volume reconstruction is a natural approach to multi-camera data fusion, but there may be only a few cameras with overlapping views. In the second part of the thesis, we introduce an architecture that adapts to varying numbers of cameras and features. The system collects and fuses activity judgments from the cameras using a voting scheme, and requires no camera calibration. Performance generally improves with more cameras and more features; training and test cameras do not need to overlap; and camera drop-in or drop-out is handled easily with little penalty. Experiments quantify these performance penalties and the advantages of using multiple views versus a single view.

Item (Open Access): Two-person interaction recognition via spatial multiple instance embedding (Academic Press Inc., 2015)
Sener, F.; Ikizler-Cinbis, N.

In this work, we look into the problem of recognizing two-person interactions in videos. Our method integrates multiple visual features in a weakly supervised manner by utilizing an embedding-based multiple instance learning framework. In our proposed method, first, several visual features that capture the shape and motion of the interacting people are extracted from each detected person region in a video; then, two-person visual descriptors are formed. Since the relative spatial locations of the interacting people are likely to complement the visual descriptors, we propose spatial multiple instance embedding, which implicitly incorporates the distances between people into the multiple instance learning process.
Experimental results on two benchmark datasets validate that using two-person visual descriptors together with spatial multiple instance learning offers an effective way to infer the type of interaction. © 2015 Elsevier Inc.
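The embedding-based multiple instance learning mentioned in the last item can be sketched in generic form: each video is treated as a "bag" of instance descriptors (here, two-person visual features that could include the inter-person distance), and the bag is mapped to a fixed-length vector by its similarity to a set of prototype instances. This is a minimal MILES-style sketch under assumed names and parameters, not the authors' implementation:

```python
# Hypothetical MILES-style multiple-instance embedding. The prototype
# set, sigma, and descriptor layout are illustrative assumptions.
import numpy as np

def embed_bag(bag, prototypes, sigma=1.0):
    """Map a bag (n_instances x d) to a fixed-length vector: one
    similarity value per prototype, taken over the most similar
    instance in the bag."""
    # Pairwise squared distances between instances and prototypes.
    d2 = ((bag[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    # Gaussian similarity; max over instances gives one value per prototype.
    return np.exp(-d2 / (sigma ** 2)).max(axis=0)

# Toy example: 3 instances of 5-dim descriptors (one dimension could
# encode the normalized distance between the two interacting people).
rng = np.random.default_rng(0)
bag = rng.normal(size=(3, 5))
prototypes = rng.normal(size=(4, 5))  # e.g. instances pooled from training bags
vec = embed_bag(bag, prototypes)
print(vec.shape)  # (4,) -- fixed length, ready for a standard classifier
```

The embedded vector has one entry per prototype regardless of how many instances the bag contains, which is what lets a conventional classifier operate on weakly labeled videos.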
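The fuzzy color histogram idea from the first item above can likewise be sketched generically: instead of hard-assigning each pixel to a single color bin, every pixel contributes to nearby bins with soft membership weights, and a shot boundary is flagged when consecutive frame histograms differ sharply. The bin layout, sigma, and threshold below are illustrative assumptions, not the paper's values (the paper works in the L*a*b* color space with fuzzy linking):

```python
# Hypothetical sketch of fuzzy histogram binning for shot-boundary
# detection; parameters are placeholders, not the published method.
import numpy as np

def fuzzy_histogram(pixels, centers, sigma=12.0):
    """pixels: (n, 3) colors; centers: (k, 3) bin centers in the same
    color space. Returns a normalized k-bin fuzzy histogram."""
    d2 = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * sigma ** 2))   # membership of each pixel in each bin
    w /= w.sum(axis=1, keepdims=True)    # each pixel's memberships sum to 1
    h = w.sum(axis=0)
    return h / h.sum()

def is_cut(frame_a, frame_b, centers, threshold=0.3):
    """Flag a hard cut when the L1 distance between consecutive
    fuzzy histograms exceeds a threshold."""
    ha = fuzzy_histogram(frame_a.reshape(-1, 3), centers)
    hb = fuzzy_histogram(frame_b.reshape(-1, 3), centers)
    return float(np.abs(ha - hb).sum()) > threshold
```

Because neighboring bins share mass, small color shifts caused by re-encoding perturb the histogram less than they would with hard binning, which is the motivation for fuzzy assignment in copy-detection settings.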