Publication


Featured research published by Walter Kellermann.


IEEE Transactions on Speech and Audio Processing | 2005

A generalization of blind source separation algorithms for convolutive mixtures based on second-order statistics

Herbert Buchner; Robert Aichner; Walter Kellermann

We present a general broadband approach to blind source separation (BSS) for convolutive mixtures based on second-order statistics. This avoids several known limitations of the conventional narrowband approximation, such as the internal permutation problem. In contrast to traditional narrowband approaches, the new framework simultaneously exploits the nonwhiteness property and nonstationarity property of the source signals. Using a novel matrix formulation, we rigorously derive the corresponding time-domain and frequency-domain broadband algorithms by generalizing a known cost function which inherently allows joint optimization for several time lags of the correlations. Based on the broadband approach, time-domain constraints are obtained which provide a deeper understanding of the internal permutation problem in traditional narrowband frequency-domain BSS. For both the time-domain and the frequency-domain versions, we discuss links to well-known and also to novel algorithms that constitute special cases. Moreover, using the so-called generalized coherence, links between the time-domain and the frequency-domain algorithms can be established, showing that our cost function leads to an update equation with an inherent normalization ensuring a robust adaptation behavior. The concept is applicable to offline, online, and block-online algorithms by introducing a general weighting function allowing for tracking of time-varying real acoustic environments.


Workshop on Applications of Signal Processing to Audio and Acoustics | 2013

The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech

Keisuke Kinoshita; Marc Delcroix; Takuya Yoshioka; Tomohiro Nakatani; Armin Sehr; Walter Kellermann; Roland Maas

Recently, substantial progress has been made in the field of reverberant speech signal processing, including both single- and multichannel dereverberation techniques, and automatic speech recognition (ASR) techniques robust to reverberation. To evaluate state-of-the-art algorithms and obtain new insights regarding potential future research directions, we propose a common evaluation framework including datasets, tasks, and evaluation metrics for both speech enhancement and ASR techniques. The proposed framework will be used as a common basis for the REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge. This paper describes the rationale behind the challenge, and provides a detailed description of the evaluation framework and benchmark results.


IEEE Signal Processing Magazine | 2012

Making Machines Understand Us in Reverberant Rooms: Robustness Against Reverberation for Automatic Speech Recognition

Takuya Yoshioka; Armin Sehr; Marc Delcroix; Keisuke Kinoshita; Roland Maas; Tomohiro Nakatani; Walter Kellermann

Speech recognition technology has left the research laboratory and is increasingly coming into practical use, enabling a wide spectrum of innovative and exciting voice-driven applications that are radically changing our way of accessing digital services and information. Most of today's applications still require a microphone located near the talker. However, almost all of these applications would benefit from distant-talking speech capturing, where talkers are able to speak at some distance from the microphones without the encumbrance of handheld or body-worn equipment [1]. For example, applications such as meeting speech recognition, automatic annotation of consumer-generated videos, speech-to-speech translation in teleconferencing, and hands-free interfaces for controlling consumer products, like interactive TV, will greatly benefit from distant-talking operation. Furthermore, for a number of unexplored but important applications, distant microphones are a prerequisite. This means that distant-talking speech recognition technology is essential for extending the availability of speech recognizers as well as enhancing the convenience of existing speech recognition applications.


Signal Processing | 2000

Adaptation of a memoryless preprocessor for nonlinear acoustic echo cancelling

Alexander Stenger; Walter Kellermann

Acoustic echo cancellers (AECs) in today's hands-free telephones rely on the assumption of a linear echo path. However, low-cost audio equipment or constraints of portable communication systems cause nonlinear distortions in the loudspeaker and its amplifier, which limit the echo reduction of linear AECs. Such an echo path can be modelled by a memoryless nonlinear function preceding the linear FIR filter. Algorithms for joint adaptation of both stages and stepsize normalizations are derived. As examples of the preprocessor, a hard-clipping curve and a polynomial are considered. Fast convergence of the latter is achieved with signal orthogonalization or RLS adaptation. Adaptation stepsize control mechanisms are derived using a novel system distance measure. Experiments under adverse conditions and with real hardware demonstrate robust convergence with both models, and an echo reduction improvement of up to 10 dB at amplitude peaks. For a straightforward implementation, computational cost is increased by a factor of 1.5–4.5 compared to a linear AEC, but ways for complexity reduction are outlined.
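The echo-path model described in this abstract (a memoryless nonlinearity followed by a linear FIR filter) can be sketched as follows. This is only an illustrative forward model under assumed values: the function names, the clipping threshold, and the two FIR taps are hypothetical, and the adaptation algorithms derived in the paper are not reproduced here.

```python
def hard_clip(x, threshold):
    """Memoryless hard-clipping nonlinearity: limit x to [-threshold, threshold]."""
    return max(-threshold, min(threshold, x))


def echo_path(samples, threshold, fir_taps):
    """Apply the memoryless clipper, then a linear FIR filter (direct convolution)."""
    clipped = [hard_clip(s, threshold) for s in samples]
    out = []
    for n in range(len(clipped)):
        acc = 0.0
        for k, h in enumerate(fir_taps):
            if n - k >= 0:  # samples before the start are treated as zero
                acc += h * clipped[n - k]
        out.append(acc)
    return out


# Hypothetical far-end signal; samples above the threshold get clipped.
far_end = [0.5, 2.0, -3.0, 0.25]
echo = echo_path(far_end, threshold=1.0, fir_taps=[1.0, 0.5])
```

A linear AEC can only model the FIR stage, which is why the clipped samples at amplitude peaks limit its echo reduction.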


International Conference on Acoustics, Speech, and Signal Processing | 2004

TRINICON: a versatile framework for multichannel blind signal processing

Herbert Buchner; Robert Aichner; Walter Kellermann

In this paper we present a framework for multichannel blind signal processing for convolutive mixtures, such as blind source separation (BSS) and multichannel blind deconvolution (MCBD). It is based on the use of multivariate pdfs and a compact matrix notation which considerably simplifies the representation and handling of the algorithms. By introducing these techniques into an information-theoretic cost function, we can exploit the three fundamental signal properties: nonwhiteness, nongaussianity, and nonstationarity. This results in a versatile tool that we call TRINICON (Triple-N ICA for convolutive mixtures). Both links to popular algorithms and several novel algorithms follow from the general approach. In particular, we introduce a new concept of multichannel blind partial deconvolution (MCBPD) for speech which prevents a complete whitening of the output signals, i.e., the vocal tract is excluded from the equalization. This is especially interesting for automatic speech recognition applications. Moreover, we show results for BSS using multivariate spherically invariant random processes (SIRP) to efficiently model speech, and show how the approach carries over to MCBPD. These concepts are also suitable for an efficient implementation in the frequency domain by using a rigorous broadband derivation avoiding the internal permutation problem and circularity effects.


Archive | 2004

Blind Source Separation for Convolutive Mixtures: A Unified Treatment

Herbert Buchner; Robert Aichner; Walter Kellermann

Blind source separation (BSS) algorithms for time series can exploit three properties of the source signals: nonwhiteness, nonstationarity, and nongaussianity. While methods utilizing the first two properties are usually based on second-order statistics (SOS), higher-order statistics (HOS) must be considered to exploit nongaussianity. In this chapter, we consider all three properties simultaneously to design BSS algorithms for convolutive mixtures within a new generic framework. This concept derives its generality from an appropriate matrix notation combined with the use of multivariate probability densities for considering the time-dependencies of the source signals. Based on a generalized cost function we rigorously derive the corresponding time-domain and frequency-domain broadband algorithms. Due to the broadband approach, time-domain constraints are obtained which provide a more detailed understanding of the internal permutation problem in traditional narrowband frequency-domain BSS. For both the time-domain and the frequency-domain versions, we discuss links to well-known and also to novel algorithms that follow as special cases of the framework. Moreover, we use models for correlated spherically invariant random processes (SIRPs), which are well suited for a variety of source signals including speech, to obtain efficient solutions in the HOS case. The concept provides a basis for offline, online, and block-online algorithms by introducing a general weighting function, thereby allowing for tracking of time-varying real acoustic environments.


Journal of the Acoustical Society of America | 2006

Acoustic source detection and localization based on wavefield decomposition using circular microphone arrays

Heinz Teutsch; Walter Kellermann

This paper is concerned with the problem of detecting and localizing multiple wideband acoustic sources by applying the notion of wavefield decomposition using circular microphone arrays optionally mounted into cylindrical baffles. The decomposed wavefield representation serves as a basis for so-called modal array signal processing algorithms, which have the significant advantage over classical array signal processing algorithms that they inherently support multiple wideband acoustic sources. A rigorous derivation of modal array signal processing algorithms for source detection and localization is presented, as well as performance evaluations by means of measurements using an actual real-time-capable implementation.


IEEE Transactions on Audio, Speech, and Language Processing | 2011

Adaptive Combination of Volterra Kernels and Its Application to Nonlinear Acoustic Echo Cancellation

Luis Antonio Azpicueta-Ruiz; Marcus Zeller; Aníbal R. Figueiras-Vidal; Jerónimo Arenas-García; Walter Kellermann

The combination of filters concept is a simple and flexible method to circumvent various compromises hampering the operation of adaptive linear filters. Recently, applications which require the identification of not only linear, but also nonlinear systems are widely studied. In this paper, we propose a combination of adaptive Volterra filters as the most versatile nonlinear models with memory. Moreover, we develop a novel approach that shows a similar behavior but significantly reduces the computational load by combining Volterra kernels rather than complete Volterra filters. Following an outline of the basic principles, the second part of the paper focuses on the application to nonlinear acoustic echo cancellation scenarios. As the ratio of the linear to nonlinear echo signal power is, in general, a priori unknown and time-variant, the performance of nonlinear echo cancellers may be inferior to a linear echo canceller if the nonlinear distortion is very low. Therefore, a modified version of the combination of kernels is developed obtaining a robust behavior regardless of the level of nonlinear distortion. Experiments with noise and speech signals demonstrate the desired behavior and the robustness of both the combination of Volterra filters and the combination of kernels approaches in different application scenarios.
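The combination-of-filters idea mentioned in this abstract can be illustrated with a convex combination of two filter outputs, where the mixing weight is a sigmoid of an adapted parameter. This is a minimal sketch of one common form from the combination-of-filters literature, not the paper's kernel-wise scheme; the function names, the step size `mu`, and the sample values are assumptions made for illustration.

```python
import math


def sigmoid(a):
    """Map the unconstrained parameter a to a mixing weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))


def combine_step(y1, y2, desired, a, mu=1.0):
    """One step of a convex combination of two filter outputs y1 and y2.

    Returns the combined output and the updated mixing parameter.
    The update is a gradient step on the squared combined error.
    """
    lam = sigmoid(a)
    y = lam * y1 + (1.0 - lam) * y2
    err = desired - y
    a_new = a + mu * err * (y1 - y2) * lam * (1.0 - lam)
    return y, a_new


# Hypothetical case: filter 1 matches the desired signal, filter 2 does not,
# so the mixing parameter moves toward filter 1.
y, a_new = combine_step(y1=1.0, y2=0.0, desired=1.0, a=0.0)
```

Because the weight always stays between 0 and 1, the combined output cannot be worse than a fixed blend of the two components, which is the robustness property the abstract appeals to when the level of nonlinear distortion is unknown.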


International Conference on Acoustics, Speech, and Signal Processing | 1997

Strategies for combining acoustic echo cancellation and adaptive beamforming microphone arrays

Walter Kellermann

New concepts for efficient combination of acoustic echo cancellation (AEC) and adaptive beamforming microphone arrays (ABMA) are presented. By decomposing common beamforming methods into a time-invariant part, which the AEC can integrate, and a separate time-variant part, the number of echo cancellers is minimized without rendering the system identification problem more difficult. Methods for controlling the interaction of ABMA and AEC are outlined and implementations for typical microphone array applications are discussed.


EURASIP Journal on Advances in Signal Processing | 2007

MAP-based underdetermined blind source separation of convolutive mixtures by hierarchical clustering and ℓ1-norm minimization

Stefan Winter; Walter Kellermann; Hiroshi Sawada; Shoji Makino

We address the problem of underdetermined BSS. While most previous approaches are designed for instantaneous mixtures, we propose a time-frequency-domain algorithm for convolutive mixtures. We adopt a two-step method based on a general maximum a posteriori (MAP) approach. In the first step, we estimate the mixing matrix based on hierarchical clustering, assuming that the source signals are sufficiently sparse. The algorithm works directly on the complex-valued data in the time-frequency domain and shows better convergence than algorithms based on self-organizing maps. The assumption of Laplacian priors for the source signals in the second step leads to an algorithm for estimating the source signals. It involves the ℓ1-norm minimization of complex numbers because of the use of the time-frequency-domain approach. We compare a combinatorial approach initially designed for real numbers with a second-order cone programming (SOCP) approach designed for complex numbers. We found that although the former approach is not theoretically justified for complex numbers, its results are comparable to, or even better than, the SOCP solution. The advantage is a lower computational cost for problems with low input/output dimensions.
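The combinatorial idea behind the second step can be sketched for the real-valued case with two mixtures: try every pair of mixing-matrix columns, solve the resulting 2x2 system exactly, and keep the solution with the smallest ℓ1 norm. This is an illustrative sketch only; the function name and the example mixing matrix are hypothetical, and the paper applies the approach to complex-valued time-frequency data, which is not covered here.

```python
from itertools import combinations


def min_l1_two_mixtures(A, x):
    """Minimum-l1 source estimate for x = A s with two real mixtures.

    A is a 2 x N mixing matrix given as two rows; x is the 2-element
    observation. Returns a length-N source vector with at most two
    nonzero entries, or None if every column pair is singular.
    """
    n_sources = len(A[0])
    best = None
    for i, j in combinations(range(n_sources), 2):
        det = A[0][i] * A[1][j] - A[0][j] * A[1][i]
        if abs(det) < 1e-12:
            continue  # columns i and j are (nearly) parallel
        # Cramer's rule for the 2x2 system formed by columns i and j.
        si = (x[0] * A[1][j] - A[0][j] * x[1]) / det
        sj = (A[0][i] * x[1] - x[0] * A[1][i]) / det
        cost = abs(si) + abs(sj)
        if best is None or cost < best[0]:
            s = [0.0] * n_sources
            s[i], s[j] = si, sj
            best = (cost, s)
    return None if best is None else best[1]


# Hypothetical example: three sources, two mixtures; the observation lies
# along the third column, so only the third source is active.
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
estimate = min_l1_two_mixtures(A, [2.0, 2.0])
```

The number of column subsets grows combinatorially with the number of sources and mixtures, which is consistent with the abstract's remark that the cost advantage holds for low input/output dimensions.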

Collaboration


Dive into Walter Kellermann's collaborations.

Top Co-Authors

Herbert Buchner

Technical University of Berlin

Roland Maas

University of Erlangen-Nuremberg

Armin Sehr

University of Erlangen-Nuremberg

Christian Hofmann

University of Erlangen-Nuremberg

Wolfgang Herbordt

University of Erlangen-Nuremberg

Christian Huemmer

University of Erlangen-Nuremberg

Andreas Schwarz

University of Erlangen-Nuremberg

Klaus Reindl

University of Erlangen-Nuremberg

Martin Schneider

University of Erlangen-Nuremberg

Hendrik Barfuss

University of Erlangen-Nuremberg
