Browsing by Subject "Human actions"
Now showing 1 - 5 of 5
Item (Open Access): On recognizing actions in still images via multiple features (Springer, Berlin, Heidelberg, 2012)
Şener, Fadime; Bas, C.; Ikizler-Cinbis, N.
We propose a multi-cue approach for recognizing human actions in still images, where relevant object regions are discovered and utilized in a weakly supervised manner. Our approach does not require any explicitly trained object detector or part/attribute annotation. Instead, a multiple instance learning approach is used over sets of object hypotheses in order to represent objects relevant to the actions. We test our method on the extensive Stanford 40 Actions dataset [1] and achieve a significant performance gain compared to the state-of-the-art. Our results show that using multiple object hypotheses within multiple instance learning is effective for human action recognition in still images, and that such an object representation is suitable for use in conjunction with other visual features. © 2012 Springer-Verlag.

Item (Open Access): Pose sentences: a new representation for action recognition using sequence of pose words (IEEE, 2008-12)
Hatun, Kardelen; Duygulu, Pınar
We propose a method for recognizing human actions in videos. Inspired by recent bag-of-words approaches, we represent actions as documents consisting of words, where a word refers to the pose in a frame. Histogram of oriented gradients (HOG) features are used to describe poses, which are then vector quantized to obtain pose-words. As an alternative to bag-of-words approaches, which represent actions only as collections of words and discard their temporal characteristics, we represent videos as ordered sequences of pose-words, that is, as pose sentences. String matching techniques are then exploited to find the similarity of two action sequences. In experiments performed on the data set of Blank et al., a performance of 92% is obtained.
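The pose-sentence pipeline above (per-frame pose descriptors, vector quantization into pose words, string matching between the resulting sequences) can be sketched as follows. This is a minimal illustration: the 2-D descriptors, the hand-picked codebook, and the plain Levenshtein distance are toy stand-ins, not the paper's HOG features or its matching setup.

```python
# Sketch of the "pose sentences" idea: quantize per-frame descriptors into
# discrete pose words, then compare videos by string matching.

def quantize(descriptor, codebook):
    """Map a pose descriptor to the index of its nearest codebook entry (a pose word)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist2(descriptor, codebook[i]))

def edit_distance(s, t):
    """Levenshtein distance between two pose sentences (lists of pose-word indices)."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        curr = [i]
        for j, b in enumerate(t, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (a != b)))  # substitution
        prev = curr
    return prev[-1]

# Toy codebook of three "poses" in a 2-D descriptor space (illustrative only).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
video_a = [(0.1, 0.0), (0.9, 0.1), (0.1, 0.9)]  # frames of one action clip
video_b = [(0.0, 0.1), (0.1, 1.0), (0.1, 0.9)]  # frames of another clip

sentence_a = [quantize(f, codebook) for f in video_a]  # -> [0, 1, 2]
sentence_b = [quantize(f, codebook) for f in video_b]  # -> [0, 2, 2]
similarity = edit_distance(sentence_a, sentence_b)     # smaller = more similar
```

In practice the codebook would be learned (e.g. by k-means over HOG descriptors) rather than fixed by hand; the key point is that ordering is preserved, so temporal structure contributes to the match.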
© 2008 IEEE.

Item (Open Access): Recognizing actions from still images (IEEE, 2008-12)
İkizler, Nazlı; Cinbiş, R. Gökberk; Pehlivan, Selen; Duygulu, Pınar
In this paper, we approach the problem of understanding human actions from still images. Our method represents the pose with spatial and orientational histogramming of rectangular regions on a parse probability map. We use LDA to obtain a more compact and discriminative feature representation, and binary SVMs for classification. Our results on a new dataset collected for this problem show that a rectangle-histogramming approach can discriminate actions to a great extent. We also show how this approach can be used in an unsupervised setting. To the best of our knowledge, this is one of the first studies to recognize actions in still images. © 2008 IEEE.

Item (Open Access): Recognizing human actions using key poses (IEEE, 2010)
Baysal, Sermetcan; Kurt, Mehmet Can; Duygulu, Pınar
In this paper, we explore the idea of using pose alone, without any temporal information, for human action recognition. In contrast to other studies using complex action representations, we propose a simple method that relies on extracting "key poses" from action sequences. Our contribution is two-fold. First, representing the pose in a frame as a collection of line-pairs, we propose a matching scheme between two frames to compute their similarity. Second, to extract "key poses" for each action, we present an algorithm that selects the most representative and discriminative poses from a set of candidates. Our experimental results on the KTH and Weizmann datasets show that pose information by itself is quite effective in capturing the nature of an action and sufficient to distinguish one action from another.
© 2010 IEEE.

Item (Open Access): Two-person interaction recognition via spatial multiple instance embedding (Academic Press Inc., 2015)
Sener, F.; Ikizler-Cinbis, N.
In this work, we look into the problem of recognizing two-person interactions in videos. Our method integrates multiple visual features in a weakly supervised manner by utilizing an embedding-based multiple instance learning framework. First, several visual features that capture the shape and motion of the interacting people are extracted from each detected person region in a video. Then, two-person visual descriptors are formed. Since the relative spatial locations of interacting people are likely to complement the visual descriptors, we propose spatial multiple instance embedding, which implicitly incorporates the distances between people into the multiple instance learning process. Experimental results on two benchmark datasets validate that using two-person visual descriptors together with spatial multiple instance learning offers an effective way to infer the type of interaction. © 2015 Elsevier Inc.
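Two of the items above (the still-image and two-person interaction works) rest on the same weakly supervised idea: an image or video is a "bag" of candidate instances (object hypotheses, person-pair descriptors), and the bag counts as positive if at least one instance is. A minimal sketch of that max-over-instances view, with an invented linear scorer and toy features in place of the papers' learned embeddings:

```python
# Minimal multiple-instance-learning sketch: score a bag of candidate
# regions by its strongest instance. The linear weights and feature
# vectors below are illustrative only, not from either paper.

def instance_score(features, weights):
    """Linear score for one candidate region (e.g. an object hypothesis)."""
    return sum(f * w for f, w in zip(features, weights))

def bag_score(bag, weights):
    """Max over instances: the bag is as positive as its best instance."""
    return max(instance_score(inst, weights) for inst in bag)

# Toy bags: each instance is a small feature vector for one candidate region.
weights = (1.0, -0.5)
positive_bag = [(0.2, 0.9), (1.5, 0.1)]  # one strong instance suffices
negative_bag = [(0.1, 0.8), (0.0, 0.4)]  # no instance scores well

label = 1 if bag_score(positive_bag, weights) > 0 else 0
```

Training in the papers additionally learns the scorer (and, in the 2015 work, an embedding that folds inter-person distances into the instance representation), but the weak supervision enters exactly here: only the bag label is given, never which instance is responsible for it.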