Multimodal video-based personality recognition using Long Short-Term Memory and convolutional neural networks

buir.advisor: Güdükbay, Uğur
dc.contributor.author: Aslan, Süleyman
dc.date.accessioned: 2019-08-08T07:45:53Z
dc.date.available: 2019-08-08T07:45:53Z
dc.date.copyright: 2019-07
dc.date.issued: 2019-07
dc.date.submitted: 2019-07-16
dc.description: Cataloged from PDF version of article. [en_US]
dc.description: Thesis (M.S.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2019. [en_US]
dc.description: Includes bibliographical references (leaves 42-57). [en_US]
dc.description.abstract: Personality computing and affective computing, where recognition of personality traits is essential, have recently gained increasing interest and attention in many research areas. The personality traits are described by the Five-Factor Model along five dimensions: openness, conscientiousness, extraversion, agreeableness, and neuroticism. We propose a novel approach to recognize these five personality traits of people from videos. Personality and emotion affect the speaking style, facial expressions, body movements, and linguistic factors in social contexts, and they are affected by environmental elements. For this reason, we develop a multimodal system to recognize apparent personality traits based on various modalities such as the face, environment, audio, and transcription features. In our method, we use modality-specific neural networks that learn to recognize the traits independently, and we obtain a final prediction of apparent personality with a feature-level fusion of these networks. We employ pre-trained deep convolutional neural networks such as ResNet and VGGish networks to extract high-level features and Long Short-Term Memory networks to integrate temporal information. We train the large model consisting of modality-specific subnetworks using a two-stage training process. We first train the subnetworks separately and then fine-tune the overall model using these trained networks. We evaluate the proposed method using the ChaLearn First Impressions V2 challenge dataset. Our approach obtains the best overall “mean accuracy” score, averaged over five personality traits, compared to the state of the art. [en_US]
dc.description.provenance: Submitted by Betül Özen (ozen@bilkent.edu.tr) on 2019-08-08T07:45:53Z No. of bitstreams: 1 suleyman_aslan_thesis.pdf: 21686121 bytes, checksum: 641e67866255d837e53e045059339c41 (MD5) [en]
dc.description.provenance: Made available in DSpace on 2019-08-08T07:45:53Z (GMT). No. of bitstreams: 1 suleyman_aslan_thesis.pdf: 21686121 bytes, checksum: 641e67866255d837e53e045059339c41 (MD5) Previous issue date: 2019-07 [en]
dc.description.statementofresponsibility: by Süleyman Aslan [en_US]
dc.embargo.release: 2020-01-16
dc.format.extent: xi, 57 leaves : illustrations, charts, graphics ; 30 cm. [en_US]
dc.identifier.itemid: B133465
dc.identifier.uri: http://hdl.handle.net/11693/52318
dc.language.iso: English [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Deep learning [en_US]
dc.subject: Convolutional Neural Network (CNN) [en_US]
dc.subject: Recurrent Neural Network (RNN) [en_US]
dc.subject: Long Short-Term Memory (LSTM) network [en_US]
dc.subject: Personality traits [en_US]
dc.subject: Personality trait recognition [en_US]
dc.subject: Multimodal information [en_US]
dc.title: Multimodal video-based personality recognition using Long Short-Term Memory and convolutional neural networks [en_US]
dc.title.alternative: Çok kipli uzun kısa-süreli bellek ve Evrişimli Sinir Ağları ile videoda kişilik tanıma [en_US]
dc.type: Thesis [en_US]
thesis.degree.discipline: Computer Engineering
thesis.degree.grantor: Bilkent University
thesis.degree.level: Master's
thesis.degree.name: MS (Master of Science)

Files

Original bundle
Name: suleyman_aslan_thesis.pdf
Size: 20.68 MB
Format: Adobe Portable Document Format
Description: Full printable version
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission