Histogram of oriented rectangles: a new pose descriptor for human action recognition
Image and Vision Computing
1515 - 1526
Most approaches to human action recognition build complex models that require extensive parameter estimation and computation time. In this study, we show that human actions can be represented simply by pose, without dealing with a complex representation of dynamics. Based on this idea, we propose a novel pose descriptor, which we call the Histogram-of-Oriented-Rectangles (HOR), for representing and recognizing human actions in videos. We represent each human pose in an action sequence by oriented rectangular patches extracted over the human silhouette. We then form spatial oriented histograms to represent the distribution of these rectangular patches. We use several matching strategies to carry the information from the spatial domain described by the HOR descriptor to the temporal domain. These are (i) nearest neighbor classification, which recognizes actions by matching the descriptors of each frame, (ii) global histogramming, which extends the idea of the Motion Energy Image proposed by Bobick and Davis to rectangular patches, (iii) a classifier-based approach using Support Vector Machines, and (iv) an adaptation of Dynamic Time Warping to the temporal representation of the HOR descriptor. For cases where the pose descriptor alone is not sufficiently discriminative, such as differentiating the actions "jogging" and "running", we also incorporate a simple velocity descriptor as a prior to the pose-based classification step. We test our system with different configurations and experiment on two commonly used action datasets: the Weizmann dataset and the KTH dataset. Results show that our method is superior to other methods on the Weizmann dataset, with a perfect accuracy rate of 100%, and is comparable to other methods on the KTH dataset, with a very high success rate close to 90%. These results show that a simple and compact representation can achieve robust recognition of human actions compared to complex representations. © 2009 Elsevier B.V. All rights reserved.
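As a rough illustration of the pipeline the abstract describes, the sketch below bins rectangle detections into a spatial and orientation histogram per frame, compares two frame sequences with a textbook Dynamic Time Warping recurrence, and labels a query by its nearest neighbor. It is not the authors' implementation: the function names, the 3x3 grid, the 12 orientation bins, the `(cx, cy, angle_deg)` rectangle format, and the Euclidean frame distance are all assumptions made for this example, and the rectangle extraction step itself is not shown.

```python
import numpy as np


def frame_histogram(rects, bbox, grid=3, n_orient=12):
    """Bin rectangle detections into a grid x grid x n_orient histogram.

    rects: iterable of (cx, cy, angle_deg) tuples (hypothetical format),
           assumed to lie inside the bounding box.
    bbox:  (x0, y0, width, height) of the silhouette bounding box.
    grid and n_orient are illustrative defaults, not values from the paper.
    """
    x0, y0, w, h = bbox
    hist = np.zeros((grid, grid, n_orient))
    for cx, cy, angle in rects:
        # Spatial cell of the rectangle centre, clamped to the last cell.
        col = min(int((cx - x0) / w * grid), grid - 1)
        row = min(int((cy - y0) / h * grid), grid - 1)
        # Orientation bin over [0, 180) degrees.
        ori = int((angle % 180.0) / 180.0 * n_orient) % n_orient
        hist[row, col, ori] += 1
    return hist.ravel()


def dtw_distance(seq_a, seq_b):
    """Standard DTW cost between two sequences of frame descriptors.

    Uses the Euclidean distance between frames (an assumption).
    """
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # skip a frame of seq_a
                                 cost[i, j - 1],      # skip a frame of seq_b
                                 cost[i - 1, j - 1])  # match both frames
    return cost[n, m]


def classify(query, labelled):
    """Return the label of the training sequence closest to query under DTW.

    labelled: list of (label, sequence) pairs.
    """
    return min(labelled, key=lambda item: dtw_distance(query, item[1]))[0]


# Toy data: two frames with different rectangle layouts.
box = (0, 0, 30, 30)
h0 = frame_histogram([(5, 5, 0)], box)
h1 = frame_histogram([(25, 25, 95)], box)
train = [("wave", [h0, h1, h0]), ("walk", [h1, h1, h0])]
label = classify([h0, h0, h1, h0], train)
```

In this toy example the query repeats the first frame of the "wave" sequence, so DTW aligns the two at zero cost even though their lengths differ, which is the property that motivates warping over frame-by-frame comparison.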
Human motion understanding
Dynamic time warping
Nearest neighbor classification
Human form models
Published Version (Please cite this version): http://dx.doi.org/10.1016/j.imavis.2009.02.002
Showing items related by title, author, creator and subject.
Sener F.; Ikizler-Cinbis, N. (Academic Press Inc., 2015) In this work, we look into the problem of recognizing two-person interactions in videos. Our method integrates multiple visual features in a weakly supervised manner by utilizing an embedding-based multiple instance ...
Karşılıklı bilgi ölçütü kullanılarak giyilebilir hareket duyucu sinyallerinin aktivite tanıma amaçlı analizi [Analysis of wearable motion sensor signals for activity recognition using the mutual information criterion] Dobrucalı, Oğuzcan; Barshan, Billur (IEEE, 2014-04) In detecting human activities with wearable motion sensors, selecting a suitable sensor configuration is an important issue. This issue, the number and type of sensors to be used, and the position and orientation at which they will be fixed ...
Şener, Fadime; Samet, Nermin; Duygulu, Pınar; Ikizler-Cinbis, N. (IEEE, 2013) In this work, we study the task of recognizing human actions from noisy videos and the effects of noise on recognition performance, and propose a possible solution. Datasets available in computer vision literature are relatively ...