Browsing by Subject "Body, Human--Computer simulation."
Now showing 1 - 11 of 11
Item Open Access
A comparative study on human activity classification with miniature inertial and magnetic sensors (2011)
Yüksek, Murat Cihan

This study provides a comparative assessment of different techniques for classifying human activities performed using body-worn miniature inertial and magnetic sensors. The classification techniques compared in this study are: naive Bayesian (NB) classifier, artificial neural networks (ANNs), dissimilarity-based classifier (DBC), various decision-tree methods, Gaussian mixture model (GMM), and support vector machines (SVM). The algorithms for these techniques are provided in two commonly used open-source environments: Waikato environment for knowledge analysis (WEKA), a Java-based software package, and pattern recognition toolbox (PRTools), a MATLAB toolbox. Human activities are classified using five sensor units worn on the chest, the arms, and the legs. Each sensor unit comprises a tri-axial gyroscope, a tri-axial accelerometer, and a tri-axial magnetometer. A feature set extracted from the raw sensor data using principal component analysis (PCA) is used in the classification process. Three different cross-validation techniques are employed to validate the classifiers. A performance comparison of the classification techniques is provided in terms of their correct differentiation rates, confusion matrices, and computational cost. The methods that result in the highest correct differentiation rates are found to be ANN (99.2%), SVM (99.2%), and GMM (99.1%). The magnetometer is the best type of sensor to use in classification, whereas the gyroscope is the least useful. Considering the locations of the sensor units on the body, the sensors worn on the legs seem to provide the most valuable information.
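As an illustration of the kind of pipeline this abstract describes (PCA-reduced features fed to several classifiers under cross-validation), the sketch below uses scikit-learn rather than the WEKA/PRTools environments of the thesis; the feature matrix X and labels y are random placeholders standing in for features extracted from the body-worn sensor signals.

```python
# Sketch of a classifier comparison under cross-validation, in the spirit of
# the study above. X and y are hypothetical placeholders, not the thesis data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 60))        # placeholder feature matrix
y = rng.integers(0, 5, size=200)      # placeholder labels for 5 activities

classifiers = {
    "NB":   GaussianNB(),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "SVM":  SVC(kernel="rbf"),
}
for name, clf in classifiers.items():
    # reduce the raw features with PCA before classification, as in the thesis
    pipe = make_pipeline(PCA(n_components=20), clf)
    scores = cross_val_score(pipe, X, y, cv=10)   # 10-fold cross-validation
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```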
Item Open Access
Human activity classification with miniature inertial sensors (2009)
Tunçel, Orkun

This thesis provides a comparative study on activity recognition using miniature inertial sensors (gyroscopes and accelerometers) and magnetometers worn on the human body. The classification methods used and compared in this study are: a rule-based algorithm (RBA) or decision tree, least-squares method (LSM), k-nearest neighbor algorithm (k-NN), dynamic time warping (DTW-1 and DTW-2), and support vector machines (SVM). In the first part of this study, eight different leg motions are classified using only two single-axis gyroscopes. In the second part, human activities are classified using five sensor units worn on different parts of the body. Each sensor unit comprises a tri-axial gyroscope, a tri-axial accelerometer, and a tri-axial magnetometer. Different feature sets are extracted from the raw sensor data and used in the classification process. A number of feature extraction and reduction techniques (principal component analysis) as well as different cross-validation techniques have been implemented and compared. A performance comparison of these classification methods is provided in terms of their correct differentiation rates, confusion matrices, pre-processing and training times, and classification times. Among the classification techniques we have considered and implemented, SVM, in general, gives the highest correct differentiation rate, followed by k-NN. The classification time for RBA is the shortest, followed by the SVM or LSM, k-NN or DTW-1, and DTW-2 methods. SVM requires the longest training time, whereas DTW-2 takes the longest amount of classification time. Although there is no significant difference between the correct differentiation rates obtained by the different cross-validation techniques, repeated random sub-sampling uses the shortest amount of classification time, whereas leave-one-out requires the longest.

Item Open Access
Intelligent sensing for robot mapping and simultaneous human localization and activity recognition (2011)
Altun, Kerem

We consider three different problems in two different sensing domains, namely ultrasonic sensing and inertial sensing. Since the applications considered in each domain are inherently different, this thesis is composed of two main parts. The approach common to the two parts is that raw data acquired from simple sensors is processed intelligently to extract useful information about the environment.

In the first part, we employ active snake contours and Kohonen's self-organizing feature maps (SOMs) for representing and evaluating discrete point maps of indoor environments efficiently and compactly. We develop a generic error criterion for comparing two different sets of points based on the Euclidean distance measure. The point sets can be chosen as (i) two different sets of map points acquired with different mapping techniques or different sensing modalities, (ii) two sets of curve points fitted to maps extracted by different mapping techniques or sensing modalities, or (iii) a set of extracted map points and a set of fitted curve points. The error criterion makes it possible to compare the accuracy of maps obtained with different techniques among themselves, as well as with an absolute reference. We optimize the parameters of the active snake contours and SOMs using uniform sampling of the parameter space and particle swarm optimization. A demonstrative example from ultrasonic mapping is given based on experimental data and compared with a very accurate laser map, considered an absolute reference. Both techniques can fill the erroneous gaps in discrete point maps. Snake curve fitting results in more accurate maps than SOMs because it is more robust to outliers. The two methods and the error criterion are sufficiently general that they can also be applied to discrete point maps acquired with other mapping techniques and other sensing modalities.

In the second part, we use body-worn inertial/magnetic sensor units for recognition of daily and sports activities, as well as for human localization in GPS-denied environments. Each sensor unit comprises a tri-axial gyroscope, a tri-axial accelerometer, and a tri-axial magnetometer. The error characteristics of the sensors are modeled using the Allan variance technique, and the parameters of the low- and high-frequency error components are estimated. Then, we provide a comparative study on the different techniques of classifying human activities performed using body-worn miniature inertial and magnetic sensors. Human activities are classified using five sensor units worn on the chest, the arms, and the legs. We compute a large number of features extracted from the sensor data and reduce these features using both principal components analysis (PCA) and sequential forward feature selection (SFFS). We consider eight different pattern recognition techniques and provide a comparison in terms of their correct classification rates, computational costs, and training and storage requirements. Results with sensors mounted on various locations on the body are also provided. The results indicate that if the system is trained with the data of an individual person, it is possible to obtain correct classification rates of over 99% with a simple quadratic classifier such as the Bayesian decision method. However, if the training data of that person are not available beforehand, one has to resort to more complex classifiers, with an expected correct classification rate of about 85%.

We also consider the human localization problem using body-worn inertial/magnetic sensors. Inertial sensors are characterized by drift error caused by the integration of their rate output to obtain position information. Because of this drift, the position and orientation data obtained from inertial sensor signals are reliable over only short periods of time. Therefore, position updates from externally referenced sensors are essential. However, if the map of the environment is known, the activity context of the user provides information about position. In particular, switches in the activity context correspond to discrete locations on the map. By performing activity recognition simultaneously with localization, one can detect activity context switches and use the corresponding position information as position updates in the localization filter. The localization filter also involves a smoother, which combines the two estimates obtained by running the zero-velocity update (ZUPT) algorithm both forward and backward in time. We performed experiments with eight subjects in an indoor and an outdoor environment involving "walking," "turning," and "standing" activities. Using the error criterion from the first part of the thesis, we show that the position errors can be decreased by about 85% on average. We also present the results of a 3-D experiment performed in a realistic indoor environment and demonstrate that it is possible to achieve over 90% error reduction in position by performing activity recognition simultaneously with localization.
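The generic map-comparison criterion in the first part of the Altun thesis is based on Euclidean distances between two point sets. The sketch below shows one plausible instance of such a criterion, a symmetric average nearest-neighbor distance; the thesis' exact formula is not reproduced here, and the two point sets are random placeholders.

```python
# Sketch: a symmetric nearest-neighbor error between two point sets, as one
# plausible reading of the Euclidean-distance-based map comparison criterion
# described above. The thesis' exact definition is not reproduced here.
import numpy as np
from scipy.spatial import cKDTree

def point_set_error(A, B):
    """Average nearest-neighbor distance from A to B and from B to A."""
    d_ab, _ = cKDTree(B).query(A)   # distance from each point in A to nearest in B
    d_ba, _ = cKDTree(A).query(B)
    return 0.5 * (d_ab.mean() + d_ba.mean())

map_points = np.random.rand(300, 2)   # e.g., points from an ultrasonic map
reference  = np.random.rand(500, 2)   # e.g., an accurate laser reference map
print(point_set_error(map_points, reference))
```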
Item Open Access
A key-pose based representation for human action recognition (2011)
Kurt, Mehmet Can

This thesis utilizes a key-pose based representation to recognize human actions in videos. We believe that the pose of the human figure is a powerful source for describing the nature of the ongoing action in a frame. Each action can be represented by a unique set of frames that includes all the possible spatial configurations of the human body parts throughout the time the action is performed. Such a set of frames for each action, referred to as its "key poses", uniquely distinguishes that action from the rest. To extract the key poses, we define a similarity value between the poses in a pair of frames by using the lines forming the human figure along with a shape matching method. With the help of a clustering algorithm, we group the similar frames of each action into a number of clusters and use the centroids as the key poses for that action. Moreover, in order to utilize the motion information present in the action, we include simple line displacement vectors for each frame in the key-pose selection process. Experiments on the Weizmann and KTH datasets show the effectiveness of our key-pose based approach in representing and recognizing human actions.
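A hedged sketch of the clustering step described above: per-frame pose descriptors (random placeholders here, standing in for the line-based descriptors of the thesis) are grouped with k-means, and the cluster centroids serve as key poses; a test frame is then assigned to its nearest key pose.

```python
# Sketch: selecting "key poses" as cluster centroids of per-frame pose
# descriptors. The descriptors and cluster count are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

def key_poses(frames, k=8):
    """Cluster per-frame descriptors; the centroids serve as key poses."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(frames)
    return km.cluster_centers_

train_frames = np.random.rand(500, 32)   # descriptors for one action class
kp = key_poses(train_frames)

test_frame = np.random.rand(32)
nearest = np.argmin(np.linalg.norm(kp - test_frame, axis=1))
print("closest key pose:", nearest)
```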
Item Open Access
A line based pose representation for human action recognition (2011)
Baysal, Sermetcan

In this thesis, we utilize a line based pose representation to recognize human actions in videos. We represent the pose in each frame by a collection of line-pairs, so that limb and joint movements are better described and the geometrical relationships among the lines forming the human figure are captured. We contribute to the literature by proposing a new method that matches the line-pairs of two poses to compute the similarity between them. Moreover, to encapsulate the global motion information of a pose sequence, we introduce line-flow histograms, which are extracted by matching line segments in consecutive frames. Experimental results on the Weizmann and KTH datasets emphasize the power of our pose representation and show the effectiveness of using pose ordering and line-flow histograms together in grasping the nature of an action and distinguishing it from others. Finally, we demonstrate the applicability of our approach to multi-camera systems on the IXMAS dataset.

Item Open Access
A multi scale motion saliency method for keyframe extraction from motion capture sequences (2010)
Halit, Cihan

Motion capture is an increasingly popular animation technique; however, the data acquired by motion capture can become substantial. This makes it difficult to use motion capture data in a number of applications, such as motion editing, motion understanding, automatic motion summarization, motion thumbnail generation, or motion database search and retrieval. To overcome this limitation, we propose an automatic approach to extract keyframes from a motion capture sequence. We treat the input sequence as motion curves, and obtain the most salient parts of these curves using a newly proposed metric called "motion saliency". We select the curves to be analyzed by a dimension reduction technique, principal component analysis. We then apply frame reduction techniques to extract the most important frames as keyframes of the motion. With this approach, around 8% of the frames are selected as keyframes for motion capture sequences. We have quantified our results both mathematically and through user tests.

Item Open Access
Multiple view human activity recognition (2012)
Pehlivan, Selen

This thesis explores the human activity recognition problem when multiple views are available. We follow two main directions: we first present a system that performs volume matching using 3D volumes constructed from calibrated cameras; we then present a flexible system based on frame matching directly using multiple views. We compare the multiple-view systems to single-view systems, and measure the performance improvement in recognition obtained by using more views through various experiments. The initial part of the thesis introduces compact representations for volumetric data gained through reconstruction. The video frames recorded by many cameras with significant overlap are fused by reconstruction, and the reconstructed volumes are used as substitutes for action poses. We propose new pose descriptors over these three-dimensional volumes. Our first descriptor is based on the histogram of oriented cylinders of various sizes and orientations. We then propose another descriptor, which is view-independent and does not require pose alignment. We show the importance of discriminative pose representations within simpler activity classification schemes. The activity recognition framework based on volume matching yields promising results compared to the state of the art. Volume reconstruction is one natural approach to multi-camera data fusion, but there may be only a few cameras with overlapping views. In the second part of the thesis, we introduce an architecture that is adaptable to varying numbers of cameras and features. The system collects and fuses activity judgments from the cameras using a voting scheme. The architecture requires no camera calibration. Performance generally improves when there are more cameras and more features; training and test cameras do not need to overlap; and camera drop-in or drop-out is handled easily with little penalty. Experiments demonstrate these performance characteristics and the advantages of using multiple views versus a single view.
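The second part of the Pehlivan thesis fuses per-camera activity judgments with a voting scheme. One plausible minimal reading is a majority vote over the labels predicted by each view; the weighting and tie-breaking details below are assumptions, not the thesis' specification.

```python
# Sketch: fusing per-camera activity judgments with a simple majority vote,
# one plausible reading of the voting scheme described above.
from collections import Counter

def fuse_votes(camera_labels):
    """camera_labels: one predicted activity label per camera view."""
    counts = Counter(camera_labels)
    label, _ = counts.most_common(1)[0]   # ties resolved arbitrarily here
    return label

print(fuse_votes(["walk", "walk", "run", "walk", "wave"]))  # -> "walk"
```

Because the vote operates on labels rather than pixels or volumes, cameras can be added or dropped without recalibration, which matches the drop-in/drop-out property described above.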
Item Open Access
Pose sentences: a new representation for understanding human actions (2008)
Hatun, Kardelen

In this thesis we address the problem of human action recognition from video sequences. Our main contribution to the literature is the compact use of poses in representing videos and, most importantly, treating actions as pose-sentences and exploiting string matching approaches for classification. We focus on single actions, where the actor performs one simple action throughout the video sequence. We represent actions as documents consisting of words, where a word refers to a pose in a frame. We regard pose information as a powerful source for describing actions. In search of a robust pose descriptor, we make use of four well-known techniques to extract pose information: Histogram of Oriented Gradients, k-Adjacent Segments, Shape Context, and Optical Flow Histograms. To represent actions, we first generate a codebook which acts as a dictionary for our action dataset. Action sequences are then represented as pose-sentences, i.e., sequences of pose-words. The similarity between two actions is obtained using string matching techniques. We also apply a bag-of-poses approach for comparison purposes and show the superiority of pose-sentences. We test the efficiency of our method on two widely used benchmark datasets, Weizmann and KTH. We show that pose is indeed very descriptive in representing actions, and that without having to examine the complex dynamic characteristics of actions, one can apply simple techniques with equally successful results.
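A sketch of the pose-sentence comparison described above: frames are quantized to codebook pose-words, and two actions are compared as symbol sequences. Levenshtein edit distance is used here as one standard string-matching technique; the abstract does not commit to this particular matcher, so treat it as an illustrative choice.

```python
# Sketch: comparing two actions as "pose-sentences" -- sequences of pose-word
# IDs from a learned codebook -- with Levenshtein edit distance.
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance between sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (x != y)))   # substitution
        prev = cur
    return prev[-1]

sentence_a = [3, 3, 7, 7, 1, 4]   # pose-word IDs for one action sequence
sentence_b = [3, 7, 7, 1, 1, 4]
print(edit_distance(sentence_a, sentence_b))   # -> 2
```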
Item Open Access
Real-time parameterized locomotion generation (2008)
Akbay, Muzaffer

Reuse and blending of captured motions for creating realistic motions of the human body is considered one of the challenging problems in animation and computer graphics. Locomotion (walking, running, and jogging) is one of the most common types of daily human motion. Based on the blending of multiple motions, we propose a two-stage approach for generating locomotion according to user-specified parameters, such as linear and angular velocities. Starting from a large dataset of various motions, we construct a motion graph of similar short motion segments. This process includes the selection of motions according to a set of predefined criteria, the correction of errors in foot positioning, pre-adjustments, motion synchronization, and transition partitioning. In the second stage, we generate an animation according to the specified parameters by following a path on the graph during run-time, which can be performed in real time. Two different blending techniques are used at this step, depending on the number of input motions: blending based on scattered data interpolation, and blending based on linear interpolation. Our approach provides an expandable and efficient motion generation system which can be used for real-time applications.

Item Open Access
Recognition and classification of human activities using wearable sensors (2012)
Yurtman, Aras

We address the problem of detecting and classifying human activities using two different types of wearable sensors. In the first part of the thesis, a comparative study on the different techniques of classifying human activities using tag-based radio-frequency (RF) localization is provided. Position data of multiple RF tags worn on the human body are acquired asynchronously and non-uniformly. Curves fitted to the data are re-sampled uniformly and then segmented. The effect of varying the relevant system parameters on the system accuracy is investigated. Various curve-fitting, segmentation, and classification techniques are compared, and the combination resulting in the best performance is presented. The classifiers are validated through the use of two different cross-validation methods. For the complete classification problem with 11 classes, the proposed system demonstrates an average classification error of 8.67% and 21.30% for 5-fold and subject-based leave-one-out (L1O) cross-validation, respectively. When the number of classes is reduced to five by omitting the transition classes, these errors become 1.12% and 6.52%. The system demonstrates acceptable classification performance even though tag-based RF localization does not provide very accurate position measurements.

In the second part, data acquired from five sensory units worn on the human body, each containing a tri-axial accelerometer, a gyroscope, and a magnetometer, during 19 different human activities are used to calculate inter-subject and inter-activity variations in the data with different methods. Absolute, Euclidean, and dynamic time-warping (DTW) distances are used to assess the similarity of the signals. The comparisons are made using time-domain data and feature vectors. Different normalization methods are used and compared. The "best" subject is defined and identified according to his/her average distance to the other subjects. Based on one of the similarity criteria proposed here, an autonomous system that detects and evaluates physical therapy exercises using inertial sensors and magnetometers is developed. An algorithm that detects all the occurrences of one or more template signals (exercise movements) in a long signal (a physical therapy session), while allowing some distortion, is proposed based on DTW. The algorithm classifies each execution as one of the exercises and evaluates it as correct/incorrect, identifying the error type if there is any. To evaluate the performance of the algorithm in physical therapy, a dataset consisting of one template execution and ten test executions of each of the three execution types of eight exercise movements performed by five subjects is recorded, for a total of 120 and 1,200 exercise executions in the training and test sets, respectively, as well as many idle time intervals in the test signals. The proposed algorithm detects 1,125 executions in the whole test set. 8.58% of the executions are missed and 4.91% of the idle intervals are incorrectly detected as an execution. The accuracy is 93.46% for exercise classification and 88.65% for both exercise and execution type classification. The proposed system may be used both to estimate the intensity of the physical therapy session and to evaluate the executions in order to provide feedback to the patient and the specialist.
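The detection algorithm in the second part of the Yurtman thesis is built on dynamic time warping. Below is a minimal DTW distance between an exercise template and a candidate segment of a session signal; the thresholding, multi-template handling, and execution-type evaluation described above are omitted, and the signals are synthetic placeholders.

```python
# Sketch: dynamic time warping (DTW) between an exercise template and a
# candidate segment of a therapy session, the core of the detection scheme
# described above. Only the distance computation is shown.
import numpy as np

def dtw_distance(s, t):
    """DTW with unit step pattern and absolute-difference local cost."""
    n, m = len(s), len(t)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

template = np.sin(np.linspace(0, 2 * np.pi, 50))          # placeholder movement
segment  = np.sin(np.linspace(0, 2 * np.pi, 60)) * 1.1    # distorted execution
print(dtw_distance(template, segment))
```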
Item Open Access
Understanding human motion: recognition and retrieval of human activities (2008)
İkizler, Nazlı

Within the ever-growing video archives is a vast amount of interesting information regarding human actions and activities. In this thesis, we approach the problem of extracting this information and understanding human motion from a computer vision perspective. We propose solutions for two distinct scenarios, ordered from simple to complex. In the first scenario, we deal with the problem of single action recognition in relatively simple settings. We believe that human pose encapsulates many useful clues for recognizing the ongoing action, and that we can represent this shape information for 2D single actions in very compact forms, before going into the details of complex modeling. We show that high-accuracy single human action recognition is possible 1) using spatial oriented histograms of rectangular regions when the silhouette is extractable, and 2) using the distribution of boundary-fitted lines when the silhouette information is missing. We demonstrate that, in videos, we can further improve recognition accuracy by adding local and global motion information. We also show that, within a discriminative framework, shape information is quite useful even in the case of human action recognition in still images. Our second scenario involves the recognition and retrieval of complex human activities in more complicated settings, such as in the presence of changing backgrounds and viewpoints. We describe a method of representing human activities in 3D that allows a collection of motions to be queried without examples, using a simple and effective query language. Our approach is based on units of activity at segments of the body that can be composed across time and across the body to produce complex queries. The presence of search units is inferred automatically by tracking the body, lifting the tracks to 3D, and comparing them to models trained using motion capture data. Our models of short-time-scale limb behaviour are built using a labelled motion capture set. Our query language makes use of finite state automata and requires simple text encoding and no visual examples. We show results for a large range of queries applied to a collection of complex motions and activities. We compare with discriminative methods applied to tracker data; our method offers significantly improved performance. We show experimental evidence that our method is robust to view direction and is unaffected by some important changes of clothing.
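The query language described above is built on finite state automata over body-segment activity units. The toy sketch below runs a deterministic automaton over a sequence of per-interval activity labels; the states, labels, and transition table are invented for illustration and do not reproduce the thesis' actual encoding.

```python
# Sketch: a toy deterministic finite-state automaton over activity labels,
# illustrating the flavor of the FSA-based query language described above.
# All states, labels, and transitions here are hypothetical.
def fsa_accepts(labels, transitions, start, accept):
    """Run a deterministic FSA over a label sequence; reject on dead ends."""
    state = start
    for lab in labels:
        state = transitions.get((state, lab))
        if state is None:
            return False
    return state in accept

# Query: "walk, then wave" encoded as a two-stage automaton.
T = {("q0", "walk"): "q0", ("q0", "wave"): "q1", ("q1", "wave"): "q1"}
print(fsa_accepts(["walk", "walk", "wave", "wave"], T, "q0", {"q1"}))  # True
```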