Understanding human motion: recognition and retrieval of human activities
Author
İkizler, Nazlı
Advisor
Duygulu, Pınar
Date
2008
Publisher
Bilkent University
Language
English
Type
Thesis
Abstract
The ever-growing video archives contain a vast amount of interesting information
regarding human actions and activities. In this thesis, we approach the problem of extracting
this information and understanding human motion from a computer vision perspective.
We propose solutions for two distinct scenarios, ordered from simple to complex. In
the first scenario, we deal with the problem of single action recognition in relatively
simple settings. We believe that human pose encapsulates many useful clues for recognizing
the ongoing action, and we can represent this shape information for 2D single
actions in very compact forms, before going into details of complex modeling. We
show that high-accuracy single human action recognition is possible (1) using spatial
oriented histograms of rectangular regions when the silhouette is extractable, and (2) using
the distribution of boundary-fitted lines when silhouette information is missing.
We demonstrate that, in videos, we can further improve recognition accuracy by
adding local and global motion information. We also show that, within a discriminative
framework, shape information is quite useful even for human
action recognition in still images.
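To make the first representation concrete, the following is a minimal illustrative sketch (in Python, using NumPy and SciPy) of a spatially binned histogram over oriented rectangular regions fitted to a binary silhouette. The function name, grid size, rectangle shape, and fit threshold are hypothetical choices for illustration, not the thesis implementation.

```python
# Illustrative sketch: scan a binary silhouette with a small rectangle at
# several orientations and accumulate, per cell of an N x N spatial grid,
# how often a rectangle of each orientation fits the foreground well.
import numpy as np
from scipy.ndimage import rotate, convolve

def oriented_rectangle_histogram(silhouette, grid=3, n_orientations=12,
                                 rect_shape=(12, 4), fit_threshold=0.9):
    """silhouette: 2D binary array (1 = person).
    Returns a (grid, grid, n_orientations) histogram, L1-normalised."""
    H, W = silhouette.shape
    hist = np.zeros((grid, grid, n_orientations), dtype=float)
    base = np.ones(rect_shape, dtype=float)            # axis-aligned rectangle
    for o in range(n_orientations):
        angle = 180.0 * o / n_orientations             # orientations in [0, 180)
        kernel = rotate(base, angle, reshape=True, order=0)
        area = kernel.sum()
        # response = fraction of the rotated rectangle covered by foreground
        response = convolve(silhouette.astype(float), kernel / area,
                            mode='constant')
        ys, xs = np.nonzero(response >= fit_threshold)
        for y, x in zip(ys, xs):
            gy = min(int(grid * y / H), grid - 1)       # spatial grid cell
            gx = min(int(grid * x / W), grid - 1)
            hist[gy, gx, o] += 1.0
    total = hist.sum()
    return hist / total if total > 0 else hist
```

The resulting descriptor could then be fed to any standard classifier; motion cues (e.g., per-frame descriptor differences) would be appended in the video setting.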
Our second scenario involves recognition and retrieval of complex human activities
in more complicated settings, such as changing backgrounds and
viewpoints. We describe a method of representing human activities in 3D that allows
a collection of motions to be queried without examples, using a simple and effective
query language. Our approach is based on units of activity at segments of the body
that can be composed across time and across the body to produce complex queries.
The presence of search units is inferred automatically by tracking the body, lifting the
tracks to 3D, and comparing them to models trained using motion capture data. Our models
of short-timescale limb behaviour are built from a labelled motion capture set. Our query
language makes use of finite state automata and requires a simple text encoding
and no visual examples. We show results for a large range of queries applied to a
collection of complex motions and activities. We compare with discriminative methods
applied to tracker data; our method offers significantly improved performance. We
show experimental evidence that our method is robust to view direction and is unaffected
by some important changes of clothing.
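As a rough illustration of how per-segment activity units might be composed across the body and across time with a finite state automaton, the sketch below (in Python) uses hypothetical segment names, labels, and helper functions; it is a simplified analogue of the idea, not the thesis query language itself.

```python
# Each frame carries inferred activity labels per body segment. A query state
# is a conjunction of (segment, label) requirements (composition across the
# body); a simple finite state automaton accepts if the states occur in order
# (composition across time), each sustained for a minimum number of frames.
from typing import Dict, List, Tuple

Frame = Dict[str, str]             # e.g. {"legs": "walk", "arms": "swing"}
State = List[Tuple[str, str]]      # conjunction, e.g. [("legs", "walk")]

def matches(frame: Frame, state: State) -> bool:
    # composition across the body: every segment requirement must hold
    return all(frame.get(segment) == label for segment, label in state)

def fsa_accepts(frames: List[Frame], query: List[State], min_len: int = 3) -> bool:
    """Accept if the query states occur in order, each sustained for at
    least min_len consecutive frames."""
    state_idx, run = 0, 0
    for frame in frames:
        if state_idx == len(query):
            break
        if matches(frame, query[state_idx]):
            run += 1
            if run >= min_len:     # state satisfied: advance the automaton
                state_idx, run = state_idx + 1, 0
        else:
            run = 0
    return state_idx == len(query)

# Example query: "walk, then wave while standing"
frames = [{"legs": "walk", "arms": "swing"}] * 4 + \
         [{"legs": "stand", "arms": "wave"}] * 4
query = [[("legs", "walk")], [("legs", "stand"), ("arms", "wave")]]
print(fsa_accepts(frames, query))  # True
```

Because a query is just ordered text-encoded states, no visual example is needed at query time; the per-frame segment labels would come from the tracked and 3D-lifted body compared against motion-capture-trained models.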
Keywords
Human motion
classification
image and video processing
activity retrieval
activity recognition
action recognition