A key-pose based representation for human action recognition
This thesis utilizes a key-pose based representation to recognize human actions in videos. We believe that the pose of the human figure is a powerful source for describing the nature of the ongoing action in a frame. Each action can be represented by a unique set of frames that include all the possible spatial configurations of the human body parts throughout the time the action is performed. Such set of frames for each action referred as “key poses” uniquely distinguishes that action from the rest. For extracting “key poses”, we define a similarity value between the poses in a pair of frames by using the lines forming the human figure along with a shape matching method. By the help of a clustering algorithm, we group the similar frames of each action into a number of clusters and use the centroids as “key poses” for that action. Moreover, in order to utilize the motion information present in the action, we include simple line displacement vectors for each frame in the “key poses” selection process. Experiments on Weizmann and KTH datasets show the effectiveness of our key-pose based approach in representing and recognizing human actions.