Gerald Enzner
Ruhr University Bochum
Publication
Featured research published by Gerald Enzner.
Signal Processing | 2006
Gerald Enzner; Peter Vary
Acoustic echo canceler and postfilter for residual echo suppression are two essential building blocks of a hands-free voice communication system. Based on Kalman filter theory, we derive a simple and advanced algorithm for the optimum joint statistical adaptation of both filter coefficients in time-varying and noisy acoustic environments. The Kalman filter utilizes a stochastic state-space model of the acoustic echo path which is formulated entirely in the frequency domain. The resulting adaptive algorithm is computationally efficient and inherently robust, i.e., the adaptation does not require additional regularization or control mechanisms.
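As a rough illustration of the idea of Kalman-based adaptation under a stochastic state-space model, the following sketch tracks a single scalar coefficient. It is not the frequency-domain algorithm of the paper: the function name `kalman_track`, the parameters `a`, `q`, `r`, and the test signal are all assumptions made for this example.

```python
import random

def kalman_track(x, y, a=0.9999, q=1e-6, r=1e-4):
    """Track one scalar coefficient w from observations y[n] = x[n] * w[n] + noise.

    Assumed (hypothetical) model parameters:
      a: first-order Markov transition factor
      q: process noise variance
      r: observation noise variance
    """
    w_hat, p = 0.0, 1.0  # state estimate and its error variance
    for x_n, y_n in zip(x, y):
        # Measurement update: the gain shrinks automatically as p shrinks,
        # so no separate step-size control is needed in this toy setting.
        gain = p * x_n / (x_n * x_n * p + r)
        w_hat += gain * (y_n - x_n * w_hat)
        p = (1.0 - gain * x_n) * p
        # Prediction through the first-order Markov model.
        w_hat = a * w_hat
        p = a * a * p + q
    return w_hat, p

# Hypothetical test signal: constant coefficient observed in weak noise.
rng = random.Random(1)
true_w = 0.7
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [x_n * true_w + rng.gauss(0.0, 0.01) for x_n in x]
w_hat, p = kalman_track(x, y)
```

In this sketch the error variance `p` plays the role of an internal reliability measure that sets the effective step size, which is one way to read the abstract's remark about adaptation without additional control mechanisms.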
IEEE Transactions on Audio, Speech, and Language Processing | 2012
Sarmad Malik; Gerald Enzner
In this paper, we address adaptive acoustic echo cancellation in the presence of an unknown memoryless nonlinearity preceding the echo path. We approach the problem by considering a basis-generic expansion of the memoryless nonlinearity. By absorbing the coefficients of the nonlinear expansion into the unknown echo path, the cascade observation model is transformed into an equivalent multichannel structure, which we further augment with a multichannel first-order Markov model. For the resulting multichannel state-space model, we then derive a recursive Bayesian estimator that takes the form of an adaptive Kalman algorithm in the discrete Fourier transform (DFT) domain. We show that such a recursive estimator can be realized via a stable and structurally efficient multichannel state-space frequency-domain adaptive filter. We demonstrate that our algorithm, which stems from a contained framework, provides effective nonlinear echo cancellation in the presence of continuous double-talk, varying degrees of nonlinear distortion, and changes in the echo path.
IEEE Transactions on Signal Processing | 2010
Matthias Pawig; Gerald Enzner; Peter Vary
Hands-free terminals for speech communication employ adaptive filters to reduce echoes resulting from the acoustic coupling between loudspeaker and microphone. When using a personal computer with commercial audio hardware for teleconferencing, a sampling frequency offset between the loudspeaker output D/A converter and the microphone input A/D converter often occurs. In this case, state-of-the-art echo cancellation algorithms fail to track the correct room impulse response. In this paper, we present a novel least mean square (LMS-type) adaptive algorithm to estimate the frequency offset and resynchronize the signals using arbitrary sampling rate conversion. In conjunction with a normalized LMS-type adaptive filter for room impulse response tracking, the proposed system largely removes the deteriorating effects of a frequency offset up to several Hz and restores the functionality of echo cancellation.
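The normalized LMS filter mentioned above can be sketched in a few lines. This covers only the impulse response tracking component, not the frequency offset estimation or resampling proposed in the paper. The function names, the step size `mu`, the regularizer `eps`, and the four-tap echo path are assumptions chosen for illustration.

```python
import random

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Estimate an FIR echo path from far-end signal x and microphone signal d."""
    w = [0.0] * num_taps    # adaptive filter coefficients
    buf = [0.0] * num_taps  # most recent far-end samples, newest first
    errors = []
    for x_n, d_n in zip(x, d):
        buf = [x_n] + buf[:-1]
        y_n = sum(w_i * b_i for w_i, b_i in zip(w, buf))  # echo estimate
        e_n = d_n - y_n                                   # residual echo
        norm = sum(b_i * b_i for b_i in buf) + eps        # input power
        step = mu * e_n / norm
        w = [w_i + step * b_i for w_i, b_i in zip(w, buf)]
        errors.append(e_n)
    return w, errors

def simulate_echo(x, h):
    """Convolve far-end signal x with a (hypothetical) echo path h."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

# Hypothetical noise-free scenario with synchronized sampling clocks.
rng = random.Random(0)
far_end = [rng.gauss(0.0, 1.0) for _ in range(2000)]
true_path = [0.8, -0.4, 0.2, 0.1]
mic = simulate_echo(far_end, true_path)
w_hat, residual = nlms_identify(far_end, mic, num_taps=4)
```

The sketch assumes perfectly synchronized converters. A sampling frequency offset would violate that assumption, which is the failure case the abstract describes.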
International Conference on Acoustics, Speech, and Signal Processing | 2011
Sarmad Malik; Gerald Enzner
We consider the task of acoustic system identification, where the input signal undergoes a memoryless nonlinear transformation before convolving with an unknown linear system. We focus on the possibility of modeling the nonlinearity with different basis functions, namely the established power series and the proposed Fourier expansion. In this work the unknown coefficients of generic basis functions are merged with the unknown linear system to obtain an equivalent multichannel structure. We use a multichannel DFT-domain algorithm for learning the underlying coefficients of both types of basis functions. We show that the Fourier modeling achieves faster convergence and better learning of the underlying nonlinearity than the polynomial basis.
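The step of merging basis coefficients with the linear system can be checked numerically for the power series case. The sketch below assumes a nonlinearity f(x) = a[0]*x + a[1]*x**2 + a[2]*x**3 and a short FIR system `h`. All names and values are hypothetical, and the learning algorithm itself is not shown.

```python
def convolve(x, h):
    """Causal FIR filtering of x with h, truncated to len(x) samples."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def cascade_output(x, a, h):
    """Memoryless nonlinearity f(x) = sum_p a[p] * x**(p+1), followed by FIR system h."""
    fx = [sum(a_p * x_n ** (p + 1) for p, a_p in enumerate(a)) for x_n in x]
    return convolve(fx, h)

def multichannel_output(x, a, h):
    """Equivalent structure: channel p has input x**(p+1) and impulse response a[p] * h."""
    total = [0.0] * len(x)
    for p, a_p in enumerate(a):
        channel_in = [x_n ** (p + 1) for x_n in x]
        channel_path = [a_p * h_k for h_k in h]
        channel_out = convolve(channel_in, channel_path)
        total = [t + y for t, y in zip(total, channel_out)]
    return total

# Hypothetical example values.
x = [0.5, -0.3, 0.9, 0.1, -0.7, 0.4]
a = [1.0, 0.2, -0.1]   # assumed power series coefficients
h = [0.6, 0.3, -0.2]   # assumed linear system
y_cascade = cascade_output(x, a, h)
y_multi = multichannel_output(x, a, h)
```

Both structures produce the same output up to rounding, so an adaptive multichannel filter can learn the products `a[p] * h` without estimating the nonlinearity and the linear system separately. A Fourier basis would replace the powers `x**(p+1)` with sinusoidal functions of `x`.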
International Conference on Acoustics, Speech, and Signal Processing | 2002
Gerald Enzner; Rainer Martin; Peter Vary
Residual echo arises in hands-free telephony equipment due to insufficient echo canceler convergence, but can be suppressed using a postfilter. The most important control parameter for postfilter adaptation is therefore the residual echo power spectral density (PSD). In this contribution we present and compare residual echo PSD estimation techniques. We introduce a new partitioned block-adaptive estimator delivering unbiased residual echo PSD estimates in strongly reverberant and noisy acoustic environments.
European Transactions on Telecommunications | 2002
Gerald Enzner; Rainer Martin; Peter Vary
Residual echo arises in hands-free telephony equipment due to insufficient echo canceler convergence, but can be suppressed using a postfilter. The residual echo power spectral density is the most crucial control parameter for both frequency-domain acoustic echo cancellation and combined residual echo and noise postfiltering. In this contribution we present and compare residual echo power spectral estimation techniques. We introduce a new partitioned block-adaptive estimation technique delivering considerably improved residual echo estimates in strongly reverberant and noisy acoustic environments. We show that the adaptation loop of the frequency-domain adaptive filter (FDAF) can be used simultaneously for residual echo power estimation and tracking of the echo path impulse response. In this way, the FDAF and the postfilter concept supplement each other in a true synergy with low complexity. The resulting echo and noise control system proves to be robust in double talk situations as well.
International Conference on Acoustics, Speech, and Signal Processing | 2010
Sarmad Malik; Gerald Enzner
A linear dynamical model can be used to describe the evolution of an unknown system in noisy conditions. However, in most applications model parameters of a dynamical system are not known a priori, bringing into question the optimality of traditional state-only estimators. In this paper, we consider block-frequency-domain dynamical models and formulate an optimal framework for low-latency joint state and parameter estimation. We show that the resulting variational expectation-maximization algorithm in the block-frequency-domain offers a comprehensive and efficient solution for the joint estimation task.
International Conference on Acoustics, Speech, and Signal Processing | 2008
Gerald Enzner
Head-related impulse responses (HRIRs) are the key to spatial realism in auditory virtual environments (AVEs). However, the measurement of discrete-azimuth HRIRs and their interpolation has been recognized as a tedious and delicate experimental procedure. We therefore suggest an adaptive filtering concept for continuous HRIR acquisition that completely avoids the traditional sampling and interpolation issue. Using an LMS-type adaptive algorithm, the HRIRs - at any azimuth - are extracted from a one-shot binaural recording. During data acquisition, the subject of interest is continuously rotated in the horizontal plane in order to capture the corresponding spatial information. In particular, the paper provides a profound theoretical and experimental analysis of the resulting HRIR inaccuracy in terms of the mean-square error. Furthermore, the optimal step-size parameter of the LMS-type adaptive algorithm is determined for which the minimum HRIR inaccuracy is attained.
Academic Press Library in Signal Processing | 2014
Gerald Enzner; Herbert Buchner; Alexis Favrot; Fabian Kuech
Modern systems for acoustic man-machine communication, such as teleconferencing equipment or speech dialog systems, will have to provide hands-free voice interfaces with full-duplex ability in order to allow for most natural interaction with users. Acoustic echo control is the enabling technology for suppressing the unavoidable feedback between acoustic reproduction and recording, i.e., the echo, such that an enhanced microphone signal is transmitted to a remote talker or supplied to a speech terminal. In this chapter, we first recall the established signal processing strategies for acoustic echo cancellation, mostly based on the family of deterministic least-squares adaptive filters. We then proceed with our main theme, which consists of the recent developments towards a more statistical framework for acoustic echo control. Our approach is guided by the motivation of taking various uncertainties regarding the echo generation process into account, while maintaining backwards compatibility for the inclusion of established technology. With unified notation, we present different optimum filtering solutions which were tailored for robustness against the specific uncertainties related to echo path variability, nonlinearity, and clock drift in a system. Furthermore, we provide a generalization of our concepts to systems with multiple reproduction channels, where algorithmic complexity and correlation of loudspeaker signals represent additional challenges.
IEEE Transactions on Audio, Speech, and Language Processing | 2014
Dominic Schmid; Gerald Enzner; Sarmad Malik; Dorothea Kolossa; Rainer Martin
Room reverberation and background noise severely degrade the quality of hands-free speech communication systems. In this work, we address the problem of combined speech dereverberation and noise reduction using a variational Bayesian (VB) inference approach. Our method relies on a multichannel state-space model for the acoustic channels that combines frame-based observation equations in the frequency domain with a first-order Markov model to describe the time-varying nature of the room impulse responses. By modeling the channels and the source signal as latent random variables, we formulate a lower bound on the log-likelihood function of the model parameters given the observed microphone signals and iteratively maximize it using an online expectation-maximization approach. Our derivation yields update equations to jointly estimate the channel and source posterior distributions and the remaining model parameters. An inspection of the resulting VB algorithm for blind equalization and channel identification (VB-BENCH) reveals that the presented framework includes previously proposed methods as special cases. Finally, we evaluate the performance of our approach in terms of speech quality, adaptation times, and speech recognition results to demonstrate its effectiveness for a wide range of reverberation and noise conditions.