Pose sentences : a new representation for understanding human actions

Date
2008
Advisor
Duygulu, Pınar
Publisher
Bilkent University
Language
English
Abstract

In this thesis, we address the problem of human action recognition from video sequences. Our main contributions are a compact use of poses for representing videos and, most importantly, treating actions as pose-sentences and exploiting string matching approaches for classification. We focus on single actions, where the actor performs one simple action throughout the video sequence. We represent actions as documents consisting of words, where a word refers to the pose in a frame. We consider pose information a powerful source for describing actions. In search of a robust pose descriptor, we use four well-known techniques to extract pose information: Histogram of Oriented Gradients, k-Adjacent Segments, Shape Context, and Optical Flow Histograms. To represent actions, we first generate a codebook that acts as a dictionary for our action dataset. Each action sequence is then represented as a sequence of pose-words, that is, as a pose-sentence. The similarity between two actions is obtained using string matching techniques. We also apply a bag-of-poses approach for comparison and show the superiority of pose-sentences. We test the efficiency of our method on two widely used benchmark datasets, Weizmann and KTH. We show that pose is indeed very descriptive for representing actions and that, without examining the complex dynamic characteristics of actions, one can apply simple techniques with equally successful results.
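
The pipeline described in the abstract (codebook, pose-words, pose-sentences, string matching, and the bag-of-poses baseline) can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not say which string matching technique or codebook construction is used, so the Levenshtein edit distance, the nearest-centre word assignment, and all function names below are assumptions made for illustration. The codebook is taken as given (for instance, cluster centres of per-frame pose descriptors).

```python
from collections import Counter
from typing import List, Sequence, Tuple


def nearest_codeword(descriptor: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest to the descriptor.

    Assumption: squared Euclidean distance; the abstract does not name a metric.
    """
    best_index, best_dist = 0, float("inf")
    for index, centre in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, centre))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def to_pose_sentence(frame_descriptors: Sequence[Sequence[float]],
                     codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each frame's pose descriptor to a pose-word, keeping frame order."""
    return [nearest_codeword(d, codebook) for d in frame_descriptors]


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Levenshtein distance between two pose-sentences.

    Assumption: unit cost for insertion, deletion, and substitution.
    """
    previous = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        current = [i]
        for j, word_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if word_a == word_b else 1)
            current.append(min(previous[j] + 1,      # delete word_a
                               current[j - 1] + 1,   # insert word_b
                               substitution))
        previous = current
    return previous[-1]


def bag_of_poses(sentence: Sequence[int], codebook_size: int) -> List[int]:
    """Order-free baseline: histogram of pose-word counts."""
    counts = Counter(sentence)
    return [counts.get(word, 0) for word in range(codebook_size)]


def classify_nearest(query: Sequence[int],
                     labelled: Sequence[Tuple[Sequence[int], str]]) -> str:
    """Label the query with the label of the closest training pose-sentence."""
    return min(labelled, key=lambda item: edit_distance(query, item[0]))[1]
```

Under these assumptions, two sequences that contain the same poses in a different order produce identical bag-of-poses histograms but a non-zero edit distance, which is the kind of ordering information a pose-sentence keeps and a bag-of-poses discards.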
