Source and filter estimation for Throat-Microphone speech enhancement

Turan, M. A. T.; Erzin, E.

dc.citation.epage	275	en_US
dc.citation.issueNumber	2	en_US
dc.citation.spage	265	en_US
dc.citation.volumeNumber	24	en_US
dc.contributor.author	Turan, M. A. T.	en_US
dc.contributor.author	Erzin, E.	en_US
dc.date.accessioned	2018-04-12T10:42:45Z
dc.date.available	2018-04-12T10:42:45Z
dc.date.issued	2016	en_US
dc.department	Department of Electrical and Electronics Engineering	en_US
dc.description.abstract	In this paper, we propose a new statistical enhancement system for throat microphone recordings through source and filter separation. Throat microphones (TM) are skin-attached piezoelectric sensors that capture speech signals in the form of tissue vibrations. Due to their limited bandwidth, TM-recorded speech suffers from reduced intelligibility and naturalness. We investigate learning phone-dependent Gaussian mixture model (GMM)-based statistical mappings, using parallel recordings from an acoustic microphone (AM) and a TM, to enhance the spectral envelope and excitation signals of the TM speech. The proposed mappings address the phone-dependent variability of tissue conduction in TM recordings. While the spectral envelope mapping estimates the line spectral frequency (LSF) representation of AM from TM recordings, the excitation mapping is constructed based on the spectral energy difference (SED) of the AM and TM excitation signals. The excitation enhancement is modeled as an estimation of the SED features from the TM signal. The proposed enhancement system is evaluated using both objective and subjective tests. Objective evaluations are performed with the log-spectral distortion (LSD), wideband perceptual evaluation of speech quality (PESQ), and mean-squared error (MSE) metrics. Subjective evaluations are performed with an A/B comparison test. Experimental results indicate that the proposed phone-dependent mappings improve on phone-independent mappings. Furthermore, enhancement of the TM excitation through statistical mappings of the SED features introduces significant objective and subjective performance improvements to the enhancement of TM recordings. ©2015 IEEE.	en_US
dc.identifier.doi	10.1109/TASLP.2015.2499040	en_US
dc.identifier.issn	2329-9290
dc.identifier.uri	http://hdl.handle.net/11693/36510
dc.language.iso	English	en_US
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	en_US
dc.relation.isversionof	http://dx.doi.org/10.1109/TASLP.2015.2499040	en_US
dc.source.title	IEEE/ACM Transactions on Audio, Speech, and Language Processing	en_US
dc.subject	Gaussian mixture model	en_US
dc.subject	Speech enhancement	en_US
dc.subject	Statistical mapping	en_US
dc.subject	Throat microphone	en_US
dc.subject	Bandpass filters	en_US
dc.subject	Gaussian distribution	en_US
dc.subject	Mapping	en_US
dc.subject	Mean square error	en_US
dc.subject	Microphones	en_US
dc.subject	Photomapping	en_US
dc.subject	Quality control	en_US
dc.subject	Source separation	en_US
dc.subject	Speech	en_US
dc.subject	Speech communication	en_US
dc.subject	Speech intelligibility	en_US
dc.subject	Telephone sets	en_US
dc.subject	Tissue	en_US
dc.subject	Excitation enhancement	en_US
dc.subject	Gaussian Mixture Model	en_US
dc.subject	Line spectral frequencies	en_US
dc.subject	Log spectral distortions	en_US
dc.subject	Perceptual evaluation of speech qualities	en_US
dc.subject	Subjective evaluations	en_US
dc.subject	Subjective performance	en_US
dc.subject	Throat microphones	en_US
dc.subject	Audio recordings	en_US
dc.title	Source and filter estimation for Throat-Microphone speech enhancement	en_US
dc.type	Article	en_US

Files

Original bundle

Name: Source and Filter Estimation for Throat-Microphone Speech Enhancement.pdf
Size: 1.98 MB
Format: Adobe Portable Document Format
Description: Full printable version

Collections

Scholarly Publications - Electrical and Electronics Engineering