Browsing by Subject "Multimodal fusion"
Now showing 1 - 4 of 4
Item (Open Access)
Affect and personality aware analysis of speech content for automatic estimation of depression severity (2023-09)
Gönç, Kaan

The detection of depression has gained significant scientific attention for its potential in early diagnosis and intervention. In light of this, we propose a novel approach that places exclusive emphasis on textual features for depression severity estimation. The proposed method seamlessly integrates affect (emotion and sentiment) and personality features as distinct yet interconnected modalities within a transformer-based architecture. Our key contribution lies in a masked multimodal joint cross-attention fusion, which adeptly combines the information gleaned from these different text modalities. This fusion approach empowers the model not only to discern subtle contextual cues within textual data but also to comprehend intricate interdependencies between the modalities. A comprehensive experimental evaluation is undertaken to meticulously assess the individual components of the proposed architecture, as well as extraneous ones that are not inherent to it. The evaluation also includes assessments in a unimodal setting, where the impact of each modality is examined individually. The findings from these experiments substantiate the self-contained efficacy of our architecture. Furthermore, we explore the significance of individual sentences within speech content, offering valuable insights into the contribution of specific textual cues, and we perform a segmented evaluation of the proposed method for different ranges of depression severity. Finally, we compare our method with existing state-of-the-art studies utilizing different combinations of auditory, visual, and textual features. The results demonstrate that our method achieves promising performance in depression severity estimation, outperforming the other methods.
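The entry above describes a masked multimodal joint cross-attention fusion over text-derived streams (speech content, affect, and personality). The exact architecture is not reproduced here; the sketch below is only a minimal illustration of cross-attention fusion between two such streams using standard PyTorch modules. The module names, dimensions, masking convention, and mean pooling are assumptions, not the author's implementation.

```python
# Minimal sketch: cross-attention fusion of two text-derived feature streams
# (e.g., content embeddings and affect/personality embeddings) with optional
# padding masks. Names and dimensions are illustrative assumptions only.
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        # Each stream attends to the other; outputs are pooled, concatenated, projected.
        self.attn_a_to_b = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.attn_b_to_a = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.proj = nn.Linear(2 * d_model, d_model)

    def forward(self, feats_a, feats_b, mask_a=None, mask_b=None):
        # feats_*: (batch, seq_len, d_model); mask_*: (batch, seq_len), True = padding.
        a_enriched, _ = self.attn_a_to_b(feats_a, feats_b, feats_b,
                                         key_padding_mask=mask_b)
        b_enriched, _ = self.attn_b_to_a(feats_b, feats_a, feats_a,
                                         key_padding_mask=mask_a)
        # Pool over the sequence dimension and fuse into a single joint representation.
        fused = torch.cat([a_enriched.mean(dim=1), b_enriched.mean(dim=1)], dim=-1)
        return self.proj(fused)  # (batch, d_model), e.g., fed to a regression head


# Toy usage with random sentence-level features.
fusion = CrossAttentionFusion()
a = torch.randn(2, 10, 256)  # e.g., content embeddings
b = torch.randn(2, 10, 256)  # e.g., affect/personality embeddings
print(fusion(a, b).shape)    # torch.Size([2, 256])
```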
Item (Open Access)
Multi-contrast MRI synthesis with channel-exchanging-network (IEEE, 2022-08-29)
Dalmaz, Onat; Aytekin, İdil; Dar, Salman Ul Hassan; Erdem, Aykut; Erdem, Erkut; Çukur, Tolga

Magnetic resonance imaging (MRI) is used in many diagnostic applications, as it offers high soft-tissue contrast and is a non-invasive medical imaging method. MR signal levels differ according to the parameters T1, T2, and PD, which vary with the chemical structure of the tissues. However, long scan times may limit the acquisition of multiple contrasts, and when multi-contrast images are acquired, they can be noisy. To overcome this limitation of MRI, multi-contrast synthesis can be utilized. In this paper, we propose a deep learning method based on a Channel-Exchanging-Network (CEN) for multi-contrast image synthesis. Demonstrations are provided on the IXI dataset. The proposed CEN-based model is compared against alternative methods based on CNNs and GANs. Our results show that the proposed model achieves superior performance to the competing methods.
(An illustrative sketch of the channel-exchanging idea is given at the end of this listing.)

Item (Open Access)
Multimodal analysis of personality traits on videos of self-presentation and induced behavior (Springer, 2020)
Giritlioğlu, Dersu; Mandira, Burak; Yılmaz, Selim Fırat; Ertenli, C. U.; Akgür, Berhan Faruk; Kınıklıoğlu, Merve; Kurt, Aslı Gül; Mutlu, E.; Dibeklioğlu, Hamdi

Personality analysis is an important area of research in several fields, including psychology, psychiatry, and neuroscience. With the recent dramatic improvements in machine learning, it has also become a popular research area in computer science. While current computational methods are able to interpret behavioral cues (e.g., facial expressions, gestures, and voice) to estimate the level of (apparent) personality traits, accessible assessment tools are still substandard for practical use, not to mention the need for fast and accurate methods for such analyses. In this study, we present multimodal deep architectures to estimate the Big Five personality traits from (temporal) audio-visual cues and transcribed speech. Furthermore, for a detailed analysis of personality traits, we have collected a new audio-visual dataset, namely the Self-presentation and Induced Behavior Archive for Personality Analysis (SIAP). In contrast to the available datasets, SIAP introduces recordings of induced behavior in addition to self-presentation (speech) videos. With thorough experiments on the SIAP and ChaLearn LAP First Impressions datasets, we systematically assess the reliability of different behavioral modalities and their combined use. Furthermore, we investigate the characteristics and discriminative power of induced behavior for personality analysis, showing that induced behavior indeed includes signs of personality traits.

Item (Open Access)
Personality-aware deception detection from behavioral cues (2021-09)
Mandıra, Burak

We encounter deceptive behavior in our daily lives almost every day. Even the most reliable among us can sometimes be deceptive, either deliberately or unintentionally. Since people are mostly unsuccessful at detecting lies, automated deception detection systems are needed, particularly in high-stakes scenarios such as court trials. We propose a fully automated, personality-aware deception detection model that uses videos as input. To our knowledge, we are the first to consider analyzing the personality of subjects in a deception detection task. The proposed model is a multimodal approach that uses both facial expression and voice-related cues, in addition to the personality traits of subjects, in its analyses. After personality traits are extracted, they are combined with deception features, which are based on expression cues. The deception, voice, and personality modules are built on spatiotemporal architectures such as 3D-ResNeXt and CNN-GRU to better capture the temporal dynamics of the input. Finally, the model combines the expression and voice modalities using a GRU-based fusion model. The proposed model is evaluated on the Real-Life Trials dataset, which consists of recordings of real court trials. The results suggest that the use of personality traits facilitates the deception detection task. When personality features are employed in addition to deception features, there is up to a 20.4% (relative) improvement in the performance of the deception module. When voice-related cues are additionally considered, we obtain a further 15.4% (relative) improvement.
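The deception detection entry above combines per-frame expression and voice cues through a GRU-based fusion model and additionally conditions on personality traits. The thesis implementation is not reproduced here; the following is only a minimal sketch of one way such a fusion could look in PyTorch, where the feature dimensions, module names, and the single-logit output head are all assumptions.

```python
# Minimal sketch: GRU-based fusion of per-frame expression and voice features,
# concatenated with subject personality traits before classification.
# Dimensions and names are illustrative assumptions, not the thesis code.
import torch
import torch.nn as nn


class GruFusion(nn.Module):
    def __init__(self, face_dim=512, voice_dim=128, traits_dim=5, hidden=128):
        super().__init__()
        # Frame-level expression and voice features are concatenated and summarized by a GRU.
        self.gru = nn.GRU(face_dim + voice_dim, hidden, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(hidden + traits_dim, 64), nn.ReLU(),
            nn.Linear(64, 1))  # deceptive vs. truthful logit

    def forward(self, face_seq, voice_seq, traits):
        # face_seq: (batch, T, face_dim), voice_seq: (batch, T, voice_dim), traits: (batch, traits_dim)
        fused_seq = torch.cat([face_seq, voice_seq], dim=-1)
        _, last_hidden = self.gru(fused_seq)          # (1, batch, hidden)
        clip_repr = last_hidden.squeeze(0)            # (batch, hidden)
        return self.classifier(torch.cat([clip_repr, traits], dim=-1))


# Toy usage with 30 frames of random features and Big Five trait scores.
model = GruFusion()
logit = model(torch.randn(2, 30, 512), torch.randn(2, 30, 128), torch.randn(2, 5))
print(logit.shape)  # torch.Size([2, 1])
```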
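As referenced in the multi-contrast MRI synthesis entry, channel exchanging lets two modality branches swap feature channels whose BatchNorm scaling factors are near zero, so uninformative channels in one branch are refilled from the other. The sketch below only illustrates that general idea, not the published CEN implementation; the function name, threshold value, and two-branch setup are assumptions.

```python
# Minimal sketch: channel exchanging between two modality branches, in the spirit
# of Channel-Exchanging-Networks. Channels whose BatchNorm scale (gamma) is near
# zero in one branch are replaced by the corresponding channels of the other branch.
# The threshold and setup are illustrative assumptions only.
import torch
import torch.nn as nn


def exchange_channels(x1, x2, bn1: nn.BatchNorm2d, bn2: nn.BatchNorm2d,
                      threshold: float = 1e-2):
    # x1, x2: (batch, channels, H, W) feature maps from two contrast branches.
    gamma1 = bn1.weight.abs()  # per-channel BN scaling factors of branch 1
    gamma2 = bn2.weight.abs()  # per-channel BN scaling factors of branch 2
    swap1 = (gamma1 < threshold).view(1, -1, 1, 1)  # channels branch 1 gives up
    swap2 = (gamma2 < threshold).view(1, -1, 1, 1)  # channels branch 2 gives up
    out1 = torch.where(swap1, x2, x1)  # branch 1 borrows from branch 2
    out2 = torch.where(swap2, x1, x2)  # and vice versa
    return out1, out2


# Toy usage with two branches sharing the same channel width.
bn_a, bn_b = nn.BatchNorm2d(16), nn.BatchNorm2d(16)
feat_a, feat_b = torch.randn(1, 16, 32, 32), torch.randn(1, 16, 32, 32)
feat_a, feat_b = exchange_channels(bn_a(feat_a), bn_b(feat_b), bn_a, bn_b)
```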