Browsing by Author "Erzin, E."

Open Access
Adaptive filtering for non-gaussian stable processes
(IEEE, 1994) Arıkan, Orhan; Çetin, A. Enis; Erzin, E.
A large class of physical phenomena observed in practice exhibits non-Gaussian behavior. In this letter, α-stable distributions, which have heavier tails than the Gaussian distribution, are considered to model non-Gaussian signals. Adaptive signal processing in the presence of such noise is a requirement of many practical problems. Since direct application of commonly used adaptation techniques fails in these applications, new algorithms for adaptive filtering for α-stable random processes are introduced.
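For α-stable noise with α < 2 the variance is infinite, so mean-squared-error criteria such as the one behind LMS are not well defined. The sketch below shows one well-known alternative, a least-mean-p-norm (LMP) style update that uses a lower-order moment of the error; it is offered as an illustration only and is not claimed to be the algorithm introduced in the letter. The function name `lmp_step` and the parameters `mu` and `p` are assumptions made for this example.

```python
def lmp_step(weights, x, desired, mu=0.01, p=1.0):
    """One least-mean-p-norm style update (illustrative, hypothetical helper).

    weights : current filter coefficients
    x       : current input vector (same length as weights)
    desired : desired output sample
    mu      : step size (assumed value)
    p       : norm order, typically chosen with 1 <= p < alpha
    Returns (new_weights, error).
    """
    y = sum(w * xi for w, xi in zip(weights, x))
    error = desired - y
    if error == 0.0:
        # Nothing to correct; also avoids 0 ** negative for p < 1.
        return list(weights), error
    sign = 1.0 if error > 0 else -1.0
    # Gradient of |e|^p with respect to the weights is -p |e|^(p-1) sign(e) x.
    gain = mu * p * abs(error) ** (p - 1.0) * sign
    return [w + gain * xi for w, xi in zip(weights, x)], error
```

With `p = 2` the update reduces to the usual LMS step (up to a factor of 2 in the step size), while `p = 1` gives a sign-error update whose step does not grow with the size of an impulsive error.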
Open Access
Interframe differential coding of line spectrum frequencies
(IEEE, 1994) Erzin, E.; Çetin, A. Enis
Line spectrum frequencies (LSF's) uniquely represent the linear predictive coding (LPC) filter of a speech frame. In many vocoders LSF's are used to encode the LPC parameters. In this paper, an inter-frame differential coding scheme is presented for the LSF's. The LSF's of the current speech frame are predicted by using both the LSF's of the previous frame and some of the LSF's of the current frame. Then, the difference resulting from prediction is quantized.
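The prediction-then-quantization idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's scheme: the predictor form, the coefficient `b`, the uniform quantizer, and the step size `step` are all hypothetical. Each LSF is predicted from the same LSF in the previous frame, corrected by the shift already observed in the next-lower LSF of the current frame, and only the quantized prediction error is transmitted.

```python
def encode_frame(lsf, prev_hat, b=0.5, step=0.01):
    """Encode one frame of LSFs; returns (indices, reconstructed LSFs).

    lsf      : LSFs of the current frame
    prev_hat : reconstructed LSFs of the previous frame
    b, step  : hypothetical predictor weight and quantizer step size
    """
    indices, cur_hat = [], []
    for i, value in enumerate(lsf):
        pred = prev_hat[i]
        if i > 0:
            # Use the shift of the already coded lower LSF of this frame.
            pred += b * (cur_hat[i - 1] - prev_hat[i - 1])
        idx = round((value - pred) / step)  # uniform scalar quantizer
        indices.append(idx)
        cur_hat.append(pred + idx * step)
    return indices, cur_hat


def decode_frame(indices, prev_hat, b=0.5, step=0.01):
    """Rebuild the LSFs from quantizer indices using the same predictor."""
    cur_hat = []
    for i, idx in enumerate(indices):
        pred = prev_hat[i]
        if i > 0:
            pred += b * (cur_hat[i - 1] - prev_hat[i - 1])
        cur_hat.append(pred + idx * step)
    return cur_hat
```

Because the encoder predicts from reconstructed values rather than from the original ones, the decoder can form exactly the same prediction and the quantization error does not accumulate within a frame.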
Open Access
Line spectral frequency representation of subbands for speech recognition
(1995) Erzin, E.; Çetin, A.E.
In this paper, a new set of speech feature parameters is constructed from subband analysis based Line Spectral Frequencies (LSFs). The speech signal is divided into several subbands and the resulting subsignals are represented by LSFs. The performance of the new speech feature parameters, SUBLSFs, is compared with the widely used Mel Scale Cepstral Coefficients (MELCEPs). SUBLSFs are observed to be more robust than the MELCEPs in the presence of car noise. © 1995.
Open Access
Source and filter estimation for Throat-Microphone speech enhancement
(Institute of Electrical and Electronics Engineers Inc., 2016) Turan, M. A. T.; Erzin, E.
In this paper, we propose a new statistical enhancement system for throat microphone recordings through source and filter separation. Throat microphones (TM) are skin-attached piezoelectric sensors that can capture speech sound signals in the form of tissue vibrations. Due to their limited bandwidth, TM-recorded speech suffers from reduced intelligibility and naturalness. In this paper, we investigate learning phone-dependent Gaussian mixture model (GMM)-based statistical mappings using parallel recordings of acoustic microphone (AM) and TM for enhancement of the spectral envelope and excitation signals of the TM speech. The proposed mappings address the phone-dependent variability of tissue conduction with TM recordings. While the spectral envelope mapping estimates the line spectral frequency (LSF) representation of AM from TM recordings, the excitation mapping is constructed based on the spectral energy difference (SED) of AM and TM excitation signals. The excitation enhancement is modeled as an estimation of the SED features from the TM signal. The proposed enhancement system is evaluated using both objective and subjective tests. Objective evaluations are performed with the log-spectral distortion (LSD), the wideband perceptual evaluation of speech quality (PESQ) and mean-squared error (MSE) metrics. Subjective evaluations are performed with an A/B comparison test. Experimental results indicate that the proposed phone-dependent mappings exhibit enhancements over phone-independent mappings. Furthermore, enhancement of the TM excitation through statistical mappings of the SED features introduces significant objective and subjective performance improvements to the enhancement of TM recordings. ©2015 IEEE.
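Of the objective metrics named above, log-spectral distortion is simple enough to illustrate directly. The sketch below uses a common definition (root-mean-square difference of the log power spectra in dB within a frame, averaged over frames); the exact variant used in the paper is not stated in the abstract, so the function name, the `floor` guard, and the frame layout are assumptions.

```python
import math


def log_spectral_distortion(ref_frames, est_frames, floor=1e-12):
    """Mean log-spectral distortion in dB over paired frames.

    ref_frames, est_frames : lists of frames, each a list of power-spectrum
                             values on the same frequency grid (assumed layout)
    floor                  : small constant guarding against log of zero
    """
    per_frame = []
    for ref, est in zip(ref_frames, est_frames):
        squared_db = [
            (10.0 * math.log10(max(r, floor) / max(e, floor))) ** 2
            for r, e in zip(ref, est)
        ]
        # RMS over frequency bins of the dB difference for this frame.
        per_frame.append(math.sqrt(sum(squared_db) / len(squared_db)))
    return sum(per_frame) / len(per_frame)
```

A lower value means the estimated spectrum is closer to the reference; identical spectra give zero, and a uniform tenfold power mismatch gives 10 dB.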
Open Access
Teager energy based feature parameters for speech recognition in car noise
(Institute of Electrical and Electronics Engineers, 1999-10) Jabloun, F.; Çetin, A. Enis; Erzin, E.
In this letter, a new set of speech feature parameters based on multirate signal processing and the Teager energy operator is introduced. The speech signal is first divided into nonuniform subbands in mel-scale using a multirate filterbank, then the Teager energies of the subsignals are estimated. Finally, the feature vector is constructed by log-compression and inverse discrete cosine transform (DCT) computation. The new feature parameters have robust speech recognition performance in the presence of car engine noise.
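The discrete Teager energy operator at the core of these features has a standard form, ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The sketch below applies it to a single subband signal and log-compresses the average; the multirate filterbank and the transform stage are omitted, and the helper names, the use of the absolute value before averaging, and the `floor` guard are assumptions for illustration rather than details taken from the letter.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The first and last samples are skipped because they lack a neighbour.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def log_mean_teager(subband, floor=1e-12):
    """Log-compressed average Teager energy of one subband signal."""
    psi = teager_energy(subband)
    mean = sum(abs(v) for v in psi) / len(psi)
    return math.log(max(mean, floor))
```

For a pure sinusoid A·cos(Ωn + φ) the operator returns the constant A²·sin²(Ω), so it tracks both amplitude and frequency, unlike the plain squared-sample energy.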