Natural speech representations in the human brain during a cocktail party

Date
2021-08
Advisor
Çukur, Tolga
Publisher
Bilkent University
Language
English
Type
Thesis
Abstract

Humans are remarkably adept at selectively listening to a desired speaker in a crowded environment while filtering out non-target speakers in the background. Attention is key to solving this difficult cocktail-party task, yet a detailed characterization of attentional effects on speech representations is lacking. It remains unclear across which levels of speech features attentional modulation occurs in each brain area during the cocktail-party task, and how strong that modulation is. It is also unclear whether unattended speech is represented in cortex during selective listening and, if so, at what feature levels its representations are maintained. To address these questions, we recorded whole-brain blood-oxygen-level-dependent (BOLD) responses while subjects either passively listened to single-speaker stories or selectively attended to a male or a female speaker in temporally overlaid stories, in separate experiments. Spectral, articulatory, and semantic models of the natural stories were constructed to enable comprehensive assessment of the hierarchy of speech features. Intrinsic selectivity profiles were identified via voxelwise models fit to passive-listening responses. Attentional modulations were then quantified based on model predictions for attended and unattended stories in the cocktail-party task. We find that acoustic representations are confined to the early auditory cortex whereas linguistic representations are broadly distributed across cortex; that attention causes broad modulations at multiple levels of speech representation (articulatory and semantic), growing stronger towards later stages of processing; and that unattended speech is represented up to the semantic level in parabelt auditory cortex. These results provide insights into speech perception and the attentional mechanisms that underlie the ability to selectively listen to a desired speaker in noisy multi-speaker environments.
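
The analysis logic summarized in the abstract (fit voxelwise models on passive-listening data, then compare their predictions for the attended and unattended stories) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the data are simulated, the model is a plain closed-form ridge regression without hemodynamic delays, and `attention_index` is a hypothetical normalized difference rather than the thesis's actual metric.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression; one weight column per voxel (column of Y)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def columnwise_corr(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    return (A * B).sum(axis=0) / (
        np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    )

def attention_index(r_att, r_unatt):
    """Hypothetical index in [-1, 1]; positive when the attended story predicts better."""
    return (r_att - r_unatt) / (np.abs(r_att) + np.abs(r_unatt))

# Simulated data (assumption): 300 time points, 10 features, 5 voxels.
rng = np.random.default_rng(0)
n_time, n_feat, n_vox = 300, 10, 5
true_w = rng.standard_normal((n_feat, n_vox))

# Passive listening: responses are driven by the single story's features.
X_passive = rng.standard_normal((n_time, n_feat))
Y_passive = X_passive @ true_w + 0.5 * rng.standard_normal((n_time, n_vox))
W = fit_ridge(X_passive, Y_passive, alpha=10.0)

# Cocktail party: simulated responses weight the attended story more heavily.
X_att = rng.standard_normal((n_time, n_feat))
X_unatt = rng.standard_normal((n_time, n_feat))
Y_cocktail = (0.8 * X_att + 0.2 * X_unatt) @ true_w
Y_cocktail = Y_cocktail + 0.5 * rng.standard_normal((n_time, n_vox))

# Prediction accuracy of the passive-listening model for each story.
r_att = columnwise_corr(X_att @ W, Y_cocktail)
r_unatt = columnwise_corr(X_unatt @ W, Y_cocktail)
idx = attention_index(r_att, r_unatt)
print(np.round(idx, 2))
```

In this toy setting, a voxel whose responses follow the attended story more closely receives a positive index, while a prediction score for the unattended story that stays above zero would indicate that unattended speech is still represented at that feature level.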

Keywords
Functional magnetic resonance imaging (fMRI), Cocktail-party, Dorsal and ventral stream, Encoding model, Natural speech