Multiple view human activity recognition

buir.advisor: Duygulu, Pınar
dc.contributor.author: Pehlivan, Selen
dc.date.accessioned: 2016-01-08T18:23:14Z
dc.date.available: 2016-01-08T18:23:14Z
dc.date.issued: 2012
dc.department: Department of Computer Engineering [en_US]
dc.description: Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2012. [en_US]
dc.description: Thesis (Ph. D.) -- Bilkent University, 2012. [en_US]
dc.description: Includes bibliographical references (leaves 94-100). [en_US]
dc.description.abstract: This thesis explores the human activity recognition problem when multiple views are available. We follow two main directions: we first present a system that performs volume matching using 3D volumes constructed from calibrated cameras, and we then present a flexible system based on frame matching that uses the multiple views directly. We compare multiple view systems with single view systems and measure, through various experiments, how much recognition performance improves as more views are used. The first part of the thesis introduces compact representations for volumetric data obtained through reconstruction. The video frames recorded by many cameras with significant overlap are fused by reconstruction, and the reconstructed volumes are used as substitutes for action poses. We propose new pose descriptors over these three-dimensional volumes. Our first descriptor is based on a histogram of oriented cylinders of various sizes and orientations. We then propose another descriptor that is view-independent and does not require pose alignment. We show the importance of discriminative pose representations within simpler activity classification schemes. The activity recognition framework based on volume matching achieves promising results compared to the state of the art. Volume reconstruction is a natural approach to multi-camera data fusion, but there may be only a few cameras with overlapping views. In the second part of the thesis, we introduce an architecture that adapts to varying numbers of cameras and features. The system collects and fuses activity judgments from the cameras using a voting scheme. The architecture requires no camera calibration. Performance generally improves when there are more cameras and more features; training and test cameras do not need to overlap; and a camera dropping in or out is handled easily with little penalty. Experiments support these findings on the performance penalties and on the advantages of using multiple views over a single view. [en_US]
dc.description.degree: Ph.D. [en_US]
dc.description.provenance: Made available in DSpace on 2016-01-08T18:23:14Z (GMT). No. of bitstreams: 1 0006408.pdf: 6037933 bytes, checksum: fa444c49a57895c88967ee6229e02679 (MD5) [en]
dc.description.statementofresponsibility: Pehlivan, Selen [en_US]
dc.format.extent: xix, 100 leaves, illustrations [en_US]
dc.identifier.uri: http://hdl.handle.net/11693/15693
dc.language.iso: English [en_US]
dc.publisher: Bilkent University [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Video analysis [en_US]
dc.subject: Human activity recognition [en_US]
dc.subject: Multiple views [en_US]
dc.subject: Multiple cameras [en_US]
dc.subject: Pose representation [en_US]
dc.subject.lcc: QP301 .P44 2012 [en_US]
dc.subject.lcsh: Human locomotion--Computer simulation. [en_US]
dc.subject.lcsh: Body, Human--Computer simulation. [en_US]
dc.subject.lcsh: Image processing--Digital techniques. [en_US]
dc.subject.lcsh: Computer simulation. [en_US]
dc.subject.lcsh: Digital computer vision. [en_US]
dc.subject.lcsh: Pattern recognition systems. [en_US]
dc.title: Multiple view human activity recognition [en_US]
dc.type: Thesis [en_US]

Files

Original bundle
Name: 0006408.pdf
Size: 5.76 MB
Format: Adobe Portable Document Format