Systematic analysis of speech transcription modeling for reliable assessment of depression severity

Authors: Kaynak, Ergün Batuhan; Dibeklioğlu, Hamdi
Dates: 2025-02-21; 2025-02-21; 2024-04-27
URI: https://hdl.handle.net/11693/116536

Abstract: For assessing depression severity, we rigorously investigate a segmented deep learning framework that predicts depression levels from speech transcriptions. Within this framework, we examine how effectively well-known deep learning models generate useful features for gauging depression. We validate the chosen models on the openly accessible Extended Distress Analysis Interview Corpus (EDAIC). Through our findings and analytical commentary, we demonstrate that valuable features for depression severity estimation can be obtained without leveraging the sequential relationships among textual descriptors. Specifically, temporal aggregation of latent representations surpasses the current best-performing methods that rely on recurrent models, with an 8.8% improvement in Concordance Correlation Coefficient (CCC).

Language: English
License: CC BY-NC 4.0 (https://creativecommons.org/licenses/by-nc/4.0/)
Keywords: Depression severity assessment; Text analysis; Deep learning; Speech transcription
Type: Article
DOI: 10.35377/saucis...1381522
ISSN: 2636-8129
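
The abstract refers to two ideas that a short example may make more concrete: order-independent aggregation of latent representations, and the Concordance Correlation Coefficient (CCC) used for evaluation. The Python sketch below is illustrative only and is not the authors' implementation. The record does not say which aggregation function was used, so mean pooling is an assumption here, and the function names `mean_pool` and `concordance_correlation_coefficient` are hypothetical. The CCC follows Lin's standard definition with population (biased) variance and covariance.

```python
from statistics import fmean


def mean_pool(embeddings):
    """Average a list of equal-length vectors into one vector.

    Assumption: mean pooling stands in for the "temporal aggregation"
    mentioned in the abstract. The result ignores the order of the inputs,
    so no sequential relationship is modeled.
    """
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    if any(len(vec) != dim for vec in embeddings):
        raise ValueError("all embeddings must have the same length")
    return [fmean([vec[i] for vec in embeddings]) for i in range(dim)]


def concordance_correlation_coefficient(y_true, y_pred):
    """Lin's CCC: 2*cov / (var_true + var_pred + (mean_true - mean_pred)^2)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = fmean(y_true)
    mean_p = fmean(y_pred)
    # Population (divide-by-n) variance and covariance.
    var_t = fmean([(t - mean_t) ** 2 for t in y_true])
    var_p = fmean([(p - mean_p) ** 2 for p in y_pred])
    cov = fmean([(t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)])
    denom = var_t + var_p + (mean_t - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both inputs are the same constant")
    return 2 * cov / denom


if __name__ == "__main__":
    # Toy per-utterance vectors for one interview (made-up numbers).
    utterances = [[1.0, 2.0], [3.0, 4.0]]
    print(mean_pool(utterances))
    # Toy severity scores (made-up numbers), shifted by a constant offset.
    print(concordance_correlation_coefficient([1, 2, 3], [2, 3, 4]))
```

Unlike Pearson correlation, the CCC penalizes a systematic offset between predictions and labels: in the toy call above the two series are perfectly correlated, yet the score falls below 1 because their means differ.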