Publication


Featured research published by Jan S. Erkelens.


IEEE Transactions on Audio, Speech, and Language Processing | 2007

Minimum Mean-Square Error Estimation of Discrete Fourier Coefficients With Generalized Gamma Priors

Jan S. Erkelens; Richard C. Hendriks; Richard Heusdens; Jesper Jensen

This paper considers techniques for single-channel speech enhancement based on the discrete Fourier transform (DFT). Specifically, we derive minimum mean-square error (MMSE) estimators of speech DFT coefficient magnitudes as well as of complex-valued DFT coefficients based on two classes of generalized gamma distributions, under an additive Gaussian noise assumption. The resulting generalized DFT magnitude estimator has as a special case the existing scheme based on a Rayleigh speech prior, while the complex DFT estimators generalize existing schemes based on Gaussian, Laplacian, and Gamma speech priors. Extensive simulation experiments with speech signals degraded by various additive noise sources verify that significant improvements are possible with the more recent estimators based on super-Gaussian priors. The increase in perceptual evaluation of speech quality (PESQ) over the noisy signals is about 0.5 points for street noise and about 1 point for white noise, nearly independent of input signal-to-noise ratio (SNR). The assumptions made for deriving the complex DFT estimators are less accurate than those for the magnitude estimators, leading to a higher maximum achievable speech quality with the magnitude estimators.
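
The sketch below shows the general DFT-domain, gain-based enhancement loop that this line of work builds on. It uses the Wiener gain, which is the MMSE complex-DFT estimator under a Gaussian speech prior, i.e. only the special case that the paper's generalized-gamma estimators subsume; the paper's own gain functions are not reproduced. The decision-directed a priori SNR estimate (Ephraim-Malah style), the function name, and the frame settings are illustrative choices, not values from the paper.

```python
# Minimal sketch of gain-based DFT-domain speech enhancement (Gaussian-prior special case).
import numpy as np
from scipy.signal import stft, istft

def enhance_wiener(noisy, fs, noise_psd, nperseg=512, alpha_dd=0.98):
    """noisy: time-domain signal; noise_psd: per-bin noise variance, e.g. the mean
    periodogram of a noise-only stretch of the recording."""
    _, _, Y = stft(noisy, fs=fs, nperseg=nperseg)            # Y[k, l]: noisy DFT coefficients
    S_hat = np.zeros_like(Y)
    for l in range(Y.shape[1]):
        gamma = np.abs(Y[:, l]) ** 2 / noise_psd             # a posteriori SNR
        if l == 0:
            xi = np.maximum(gamma - 1.0, 0.0)                # a priori SNR, first frame
        else:                                                # decision-directed smoothing
            xi = alpha_dd * np.abs(S_hat[:, l - 1]) ** 2 / noise_psd \
                 + (1.0 - alpha_dd) * np.maximum(gamma - 1.0, 0.0)
        gain = xi / (1.0 + xi)                               # Wiener gain (Gaussian prior)
        S_hat[:, l] = gain * Y[:, l]
    _, s_hat = istft(S_hat, fs=fs, nperseg=nperseg)
    return s_hat
```

The super-Gaussian magnitude and complex-DFT estimators derived in the paper would replace the `gain` line with their respective gain functions of `xi` and `gamma`.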


IEEE Transactions on Audio, Speech, and Language Processing | 2008

Tracking of Nonstationary Noise Based on Data-Driven Recursive Noise Power Estimation

Jan S. Erkelens; Richard Heusdens

This paper considers estimation of the noise spectral variance from speech signals contaminated by highly nonstationary noise sources. The method can accurately track fast changes in noise power level (up to about 10 dB/s). In each time frame, for each frequency bin, the noise variance estimate is updated recursively with the minimum mean-square error (MMSE) estimate of the current noise power. A time- and frequency-dependent smoothing parameter is used, which is varied according to an estimate of speech presence probability. In this way, the amount of speech power leaking into the noise estimates is kept low. For the estimation of the noise power, a spectral gain function is used, which is found by an iterative data-driven training method. The proposed noise tracking method is tested on various stationary and nonstationary noise sources, for a wide range of signal-to-noise ratios, and compared with two state-of-the-art methods. When used in a speech enhancement system, improvements in segmental signal-to-noise ratio of more than 1 dB can be obtained for the most nonstationary noise sources at high noise levels.
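
The following sketch illustrates the recursive structure described above: the per-bin noise variance is updated with an estimate of the current noise power, and the amount of smoothing is tied to a speech presence probability. The soft-decision noise-power estimate used here is a generic stand-in; the paper obtains it from a data-driven spectral gain, which is not reproduced. All parameter values and names are illustrative.

```python
# Sketch of per-bin recursive noise-power tracking with a speech-presence-driven update.
import numpy as np

def track_noise_psd(noisy_power, alpha=0.8, prior_speech=0.5, xi_h1=10.0):
    """noisy_power: array [n_frames, n_bins] of |Y(k,l)|^2. Returns the noise PSD track."""
    n_frames, _ = noisy_power.shape
    sigma_n2 = noisy_power[:5].mean(axis=0)                  # initialise from first few frames
    track = np.empty_like(noisy_power)
    for l in range(n_frames):
        # a posteriori SNR, clipped to avoid overflow in the exponential below
        snr_post = np.minimum(noisy_power[l] / np.maximum(sigma_n2, 1e-12), 40.0)
        # speech presence probability from a simple two-hypothesis Gaussian model
        lr = np.exp(snr_post * xi_h1 / (1.0 + xi_h1)) / (1.0 + xi_h1)
        p_speech = 1.0 / (1.0 + (1.0 - prior_speech) / (prior_speech * lr))
        # soft-decision estimate of the current noise power
        noise_pow_hat = (1.0 - p_speech) * noisy_power[l] + p_speech * sigma_n2
        # recursive smoothing; alpha could itself be made time/frequency dependent
        sigma_n2 = alpha * sigma_n2 + (1.0 - alpha) * noise_pow_hat
        track[l] = sigma_n2
    return track
```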


IEEE Transactions on Audio, Speech, and Language Processing | 2010

Correlation-Based and Model-Based Blind Single-Channel Late-Reverberation Suppression in Noisy Time-Varying Acoustical Environments

Jan S. Erkelens; Richard Heusdens

This paper considers suppression of late reverberation and additive noise in single-channel speech recordings. The reverberation introduces long-term correlation in the observed signal. In the first part of this work, we show how this correlation can be used to estimate the late reverberant spectral variance (LRSV) without having to assume a specific model for the room impulse responses (RIRs) and without needing explicit estimates of RIR model parameters. That makes this correlation-based approach more robust against RIR modeling errors. However, the correlation-based method can follow only slow time variations in the RIRs. Existing model-based methods use statistical models for the RIRs that depend on one or more parameters that have to be estimated blindly. The common statistical models lead to simple expressions for the LRSV that depend on past values of the spectral variance of the reverberant, noise-free signal. All existing model-based LRSV estimators in the literature are derived assuming the RIRs to be time-invariant realizations of a stochastic process. In the second part of this paper, we go one step further and analyze time-varying RIRs. We show that in this case the reverberation tends to become decorrelated. We discuss the relations between different RIR models and their corresponding LRSV estimators. We show theoretically that simple estimators similar to those in the time-invariant case exist, provided that the reverberation time T60 and direct-to-reverberation ratio (DRR) of the RIRs remain nearly constant during an interval of the order of a few frames. We show that the reverberation time can be taken frequency-bin independent in DFT-based enhancement algorithms. Experiments with time-varying RIRs validate the analysis. Experiments with additive nonstationary noise and time-invariant RIRs show the influence of blind estimation of the reverberation time and the DRR.
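
As a point of reference for the model-based family discussed above, the sketch below implements the generic Lebart-style LRSV estimator under Polack's exponential-decay RIR model with a single, frequency-independent reverberation time T60. It is not necessarily the exact expression derived in the paper; the function name, the delay of 8 frames, and the absence of a DRR correction are simplifying assumptions.

```python
# Sketch of a model-based late-reverberant spectral variance (LRSV) estimator
# under Polack's exponential-decay model.
import numpy as np

def lrsv_polack(reverberant_psd, t60, frame_shift_s, late_delay_frames=8):
    """reverberant_psd: [n_frames, n_bins] spectral variance of the (noise-free)
    reverberant signal. Returns the LRSV estimate per frame and bin."""
    delta = 3.0 * np.log(10.0) / t60                 # decay rate of Polack's model
    T_l = late_delay_frames * frame_shift_s          # start of the "late" part (seconds)
    decay = np.exp(-2.0 * delta * T_l)
    lrsv = np.zeros_like(reverberant_psd)
    # late reverberation in frame l is a decayed copy of the reverberant variance
    # late_delay_frames earlier
    lrsv[late_delay_frames:] = decay * reverberant_psd[:-late_delay_frames]
    return lrsv
```

In a complete system the reverberant PSD would itself be estimated recursively from the enhanced signal, and T60 and the DRR would be estimated blindly, as the paper's experiments examine.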


IEEE Transactions on Speech and Audio Processing | 1997

Bias propagation in the autocorrelation method of linear prediction

Jan S. Erkelens; P.M.T. Broersen

Many low bit-rate speech coders use the autocorrelation method (ACM) to find a linear prediction model of the speech signal. A time-domain analysis of the ACM for autoregressive estimation is given. It is shown that a small bias in a reflection coefficient close to one in absolute value is propagated and prohibits an accurate estimation of further reflection coefficients. Tapered data windows largely reduce this effect, but increase the variance of the models.
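
The following numerical illustration (not taken from the paper) makes the window effect concrete: the autocorrelation method is applied to a synthetic AR process whose first reflection coefficient is close to -1, once with a rectangular window and once with a Hamming taper. The process order, pole locations, and frame length are arbitrary choices.

```python
# Autocorrelation method of linear prediction via the Levinson-Durbin recursion,
# comparing rectangular and tapered analysis windows on a synthetic AR process.
import numpy as np
from scipy.signal import lfilter

def acorr_method(frame, order):
    """Returns (prediction error filter coefficients, reflection coefficients)."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:len(frame) + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    refl = np.zeros(order)
    for m in range(1, order + 1):
        k = -(r[m] + a[1:m] @ r[m - 1:0:-1]) / err           # next reflection coefficient
        refl[m - 1] = k
        a[1:m + 1] = a[1:m + 1] + k * a[m - 1::-1][:m]       # Levinson-Durbin update
        err *= (1.0 - k * k)
    return a, refl

rng = np.random.default_rng(0)
true_a = np.poly([0.98, 0.9])                 # AR(2) with poles close to the unit circle
x = lfilter([1.0], true_a, rng.standard_normal(4000))
frame = x[2000:2240]                          # one analysis frame of 240 samples

_, k_rect = acorr_method(frame, 10)
_, k_hamm = acorr_method(frame * np.hamming(len(frame)), 10)
print("rectangular window:", np.round(k_rect, 3))
print("Hamming taper:     ", np.round(k_hamm, 3))
```

With the first reflection coefficient near -1, the higher-order reflection coefficients estimated from the rectangular window are typically biased, while the tapered window reduces this at the cost of somewhat higher variance, in line with the abstract.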


IEEE Signal Processing Letters | 2008

On the Estimation of Complex Speech DFT Coefficients Without Assuming Independent Real and Imaginary Parts

Jan S. Erkelens; Richard C. Hendriks; Richard Heusdens

This letter considers the estimation of speech signals contaminated by additive noise in the discrete Fourier transform (DFT) domain. Existing complex-DFT estimators assume independence of the real and imaginary parts of the speech DFT coefficients, although this is not in line with measurements. In this letter, we derive some general results on these estimators under more realistic assumptions. Assuming that speech and noise are independent, that speech DFT coefficients have uniform phase, and that noise DFT coefficients have a Gaussian density, we show theoretically that the spectral gain function for speech DFT estimation is real and upper-bounded by the corresponding gain function for spectral magnitude estimation. We also show that the minimum mean-square error (MMSE) estimator of the speech phase equals the noisy phase. No assumptions are made about the distribution of the speech spectral magnitudes. Recently, speech spectral amplitude estimators have been derived under a generalized-gamma amplitude distribution. As an example, we derive the corresponding complex-DFT estimators without making the independence assumption.
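
One of the letter's results can be checked numerically: if the speech DFT coefficient has uniform phase and the noise is complex Gaussian, the posterior mean E[S | Y = y] has the same phase as y, so the MMSE phase estimate is the noisy phase and the complex-DFT gain is real. The Monte Carlo sketch below is not from the letter; the gamma-distributed amplitude prior and the observed value y are arbitrary choices, and the result should hold for any amplitude distribution.

```python
# Monte Carlo check that the posterior mean of a uniform-phase speech DFT coefficient
# observed in complex Gaussian noise keeps the noisy phase.
import numpy as np

rng = np.random.default_rng(1)
n = 2_000_000
sigma_n2 = 0.5                                            # noise variance
amp = rng.gamma(shape=1.5, scale=1.0, size=n)             # speech amplitude prior (arbitrary)
phase = rng.uniform(0.0, 2.0 * np.pi, size=n)             # uniform speech phase
s = amp * np.exp(1j * phase)                              # samples from the speech prior

y = 0.8 + 0.6j                                            # one observed noisy coefficient
w = np.exp(-np.abs(y - s) ** 2 / sigma_n2)                # complex-Gaussian likelihood weights
s_mmse = np.sum(w * s) / np.sum(w)                        # posterior mean E[S | y]

print("phase of noisy coefficient:", np.angle(y))
print("phase of MMSE estimate:    ", np.angle(s_mmse))    # equal up to Monte Carlo error
print("gain |E[S|y]| / |y|:       ", np.abs(s_mmse) / np.abs(y))
```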


international conference on acoustics, speech, and signal processing | 1995

On the statistical properties of line spectrum pairs

Jan S. Erkelens; P.M.T. Broersen

Accurate quantization of the LPC model is of prime importance for the quality of low bitrate speech coders. In the literature, the quantization properties of several representations of the LPC model have been studied. The best results have generally been obtained with the LSP frequencies. In scalar quantization schemes, the immittance spectrum pairs (ISP) perform even slightly better. The good quantization performance of LSP and ISP can be attributed to their theoretical statistical properties: they are uncorrelated when estimated from stationary autoregressive processes, in contrast to the other representations. For small variations in the coefficients of any representation, the spectral distortion can be expressed as a weighted squared distortion measure. The optimal weighting matrix is the inverse of the covariance matrix of the coefficients. For the LSP and ISP this matrix is a diagonal matrix and hence the best weighting factors are the inverses of the theoretical variances. The difference between the LSP and ISP is due to their distributions in speech.
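
For readers unfamiliar with the LSP representation, the sketch below (not code from the paper) shows how LSP frequencies are obtained from an LPC polynomial: the roots of the sum and difference polynomials P(z) = A(z) + z^-(p+1) A(z^-1) and Q(z) = A(z) - z^-(p+1) A(z^-1) lie on the unit circle and interlace, and their angles in (0, pi) are the LSP frequencies. The example polynomial is an arbitrary stable choice.

```python
# Convert LPC coefficients to line spectrum pair (LSP) frequencies.
import numpy as np

def lpc_to_lsp(a):
    """a: LPC coefficients [1, a1, ..., ap]. Returns sorted LSP frequencies in radians."""
    a = np.asarray(a, dtype=float)
    az = np.concatenate([a, [0.0]])           # A(z), padded to degree p+1
    rev = az[::-1]                             # coefficients of z^-(p+1) A(z^-1)
    P = az + rev                               # sum polynomial (palindromic)
    Q = az - rev                               # difference polynomial (antipalindromic)
    roots = np.concatenate([np.roots(P), np.roots(Q)])
    angles = np.angle(roots)
    # keep one angle per conjugate pair, dropping the trivial roots at z = 1 and z = -1
    return np.sort(angles[(angles > 1e-6) & (angles < np.pi - 1e-6)])

# Example on a toy 4th-order LPC polynomial (arbitrary, stable):
a = np.poly([0.7 * np.exp(1j * 0.5), 0.7 * np.exp(-1j * 0.5),
             0.8 * np.exp(1j * 1.8), 0.8 * np.exp(-1j * 1.8)]).real
print(np.round(lpc_to_lsp(a), 3))
```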


international conference on acoustics, speech, and signal processing | 2008

Fast noise tracking based on recursive smoothing of MMSE noise power estimates

Jan S. Erkelens; Richard Heusdens

We consider estimation of the noise spectral variance from speech signals contaminated by highly nonstationary noise sources. In each time frame, for each frequency bin, the noise variance estimate is updated recursively with the minimum mean-square error (MMSE) estimate of the current noise power. For the estimation of the noise power, a spectral gain function is used, which is found by an iterative data-driven training method. The proposed noise tracking method can accurately track fast changes in noise level (up to about 10 dB/s). When compared to the minimum statistics method for various noise sources in a speech enhancement system, improvements in segmental signal-to-noise ratio of more than 1 dB are obtained.
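
The abstract's "data-driven" spectral gain can be pictured as a lookup table learned from training material. The toy sketch below is an assumption, not the paper's actual procedure: on data where the true noise power is known, frames are binned by their a posteriori SNR and the gain in each bin is set to the average ratio of true noise power to noisy periodogram. The paper uses an iterative training method; this one-pass version only conveys the idea.

```python
# Toy illustration of training and applying a noise-power gain table.
import numpy as np

def train_gain_table(noisy_power, true_noise_power, noise_psd_est, n_bins=30):
    """All inputs are flat arrays over (frame, frequency) points from training data."""
    snr_post = noisy_power / np.maximum(noise_psd_est, 1e-12)
    edges = np.linspace(0.0, 20.0, n_bins + 1)                # a posteriori SNR bins
    idx = np.clip(np.digitize(snr_post, edges) - 1, 0, n_bins - 1)
    gains = np.ones(n_bins)
    for b in range(n_bins):
        sel = idx == b
        if np.any(sel):
            gains[b] = np.mean(true_noise_power[sel] / noisy_power[sel])
    return edges, gains

def estimate_noise_power(noisy_power, noise_psd_est, edges, gains):
    """Look up the trained gain and apply it to the noisy periodogram."""
    snr_post = noisy_power / np.maximum(noise_psd_est, 1e-12)
    idx = np.clip(np.digitize(snr_post, edges) - 1, 0, len(gains) - 1)
    return gains[idx] * noisy_power
```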


IEEE Transactions on Speech and Audio Processing | 1998

LPC interpolation by approximation of the sample autocorrelation function

Jan S. Erkelens; P.M.T. Broersen

Conventionally, the energy of analysis frames is not taken into account for linear prediction (LPC) interpolation. Incorporating the frame energy improves the subjective quality of interpolation, but increases the spectral distortion (SD). The main reason for this discrepancy is that outliers increase in the low-energy parts of segments with rapid changes in energy. The energy is most naturally combined with a normalized autocorrelation representation.
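
The sketch below illustrates one plausible reading of the approach (it is not code from the paper): each frame is represented by its normalized autocorrelation and its energy, the interpolated autocorrelation is an energy-weighted mix of the normalized ones, and the interpolated LPC model is recovered by solving the normal equations. The function names, model order, and interpolation factor are illustrative.

```python
# Interpolating LPC models through an energy-weighted autocorrelation representation.
import numpy as np
from scipy.linalg import solve_toeplitz

def frame_autocorr(frame, order):
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:len(frame) + order]
    return r / r[0], r[0]                      # normalized autocorrelation and frame energy

def interpolate_lpc(frame_a, frame_b, lam, order=10):
    """lam in [0, 1]: 0 returns frame_a's model, 1 returns frame_b's model."""
    rho_a, e_a = frame_autocorr(frame_a, order)
    rho_b, e_b = frame_autocorr(frame_b, order)
    # energy-weighted interpolation of the normalized autocorrelation functions
    r = (1.0 - lam) * e_a * rho_a + lam * e_b * rho_b
    # autocorrelation method: solve the Toeplitz normal equations for the predictor
    pred = solve_toeplitz(r[:order], r[1:order + 1])
    return np.concatenate([[1.0], -pred])      # prediction error filter A(z)
```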


international conference on acoustics, speech, and signal processing | 1994

Analysis of spectral interpolation with weighting dependent on frame energy

Jan S. Erkelens; P.M.T. Broersen

Spectral interpolation improves the performance of low bit rate speech coders without increasing the bit rate. We have investigated the problem of spectral interpolation by means of autoregressive theory. Our analysis is supported by experiments on stationary and nonstationary data and by experiments on real speech data. The main conclusions of our study are that the energy of the analysis frames should be used in the interpolation process, that it is beneficial to give analysis frames an overlap of p samples, where p is the order of the model, and that the reflection coefficients, log area ratios, and the arcsine of reflection coefficients are less suitable for interpolation.


international conference on acoustics, speech, and signal processing | 2009

Single-microphone late-reverberation suppression in noisy speech by exploiting long-term correlation in the DFT domain

Jan S. Erkelens; Richard Heusdens

We consider blind late-reverberation suppression in speech signals measured with a single microphone in noisy environments. We exploit the fact that reverberant speech shows correlation over longer time spans than clean speech by predicting the contribution of reverberant energy to the current observed spectrum from the enhanced spectra of previous frames. The prediction parameters are recursively updated with estimates of the correlation coefficients between the current reverberant spectrum and enhanced previous spectra. The contributions of late reverberation and noise are suppressed by a standard noise reduction algorithm. The algorithm is shown to decrease the long-term correlation. It achieves significant improvements in segmental speech-to-interference ratio and Bark spectral distortion for typical reverberation times and noise levels, while introducing almost no distortion in clean speech.
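
A heavily simplified sketch of the correlation-based idea described above follows: the late-reverberant power in the current frame is predicted from the enhanced power of an earlier frame, with a per-bin prediction weight tracked recursively from cross- and auto-moments. The actual estimator in the paper uses more delays and different update rules; the function name, the forgetting factor, and the single-delay structure here are assumptions made for illustration.

```python
# Correlation-driven prediction of the late-reverberant spectral variance (one-frame update).
import numpy as np

def predict_lrsv(noisy_power, enhanced_power_delayed, state, beta=0.95):
    """noisy_power, enhanced_power_delayed: arrays over frequency bins for one frame.
    state: dict of recursively tracked moments per bin, initialised with small positives."""
    state["cross"] = beta * state["cross"] + (1 - beta) * noisy_power * enhanced_power_delayed
    state["auto"] = beta * state["auto"] + (1 - beta) * enhanced_power_delayed ** 2
    weight = state["cross"] / np.maximum(state["auto"], 1e-12)   # per-bin regression weight
    weight = np.clip(weight, 0.0, 1.0)        # late reverberation cannot exceed the observation
    return weight * enhanced_power_delayed    # predicted late-reverberant spectral variance
```

The predicted variance would then be passed, together with a noise PSD estimate, to a standard spectral gain as in the other sketches above.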

Collaboration


Dive into Jan S. Erkelens's collaborations.

Top Co-Authors

Richard Heusdens
Delft University of Technology

P.M.T. Broersen
Delft University of Technology

Richard C. Hendriks
Delft University of Technology

Arturo Tejada
Delft University of Technology