Markus Buck
Nuance Communications
Publication
Featured research published by Markus Buck.
Signal Processing | 2006
Markus Buck; Tim Haulick; Hans-Jörg Pfleiderer
The application of microphone arrays and beamforming techniques for speech acquisition promises significant improvement compared to systems operating with a single microphone. Adaptive beamformers can potentially outperform fixed beamformers, particularly in the case of time-varying sound field characteristics or in the case of coherent noise such as interfering speakers, loudspeaker signals, etc. However, for real-world applications adaptive beamformers carry the risk of severe signal degradation. Disturbances such as mismatched microphones, an imprecise steering direction, or reverberation due to multi-path propagation may cause an adaptive beamformer to distort the desired signal. Microphone mismatch naturally arises from production tolerances as well as from aging effects in the long run. This contribution presents a class of adaptive self-calibration methods. These methods perform a calibration in the background during normal operation of the system and therefore avoid the need for an additional costly calibration procedure. Based on a systematic approach, new configurations as well as some well-known configurations are derived. The performance of the different self-calibration configurations is examined in a car environment.
EURASIP Journal on Advances in Signal Processing | 2015
Simon Graf; Tobias Herbig; Markus Buck; Gerhard Schmidt
In many speech signal processing applications, voice activity detection (VAD) plays an essential role in separating an audio stream into time intervals that contain speech activity and time intervals where speech is absent. Many features that reflect the presence of speech have been introduced in the literature. However, to our knowledge, no extensive comparison has been provided yet. In this article, we therefore present a structured overview of several established VAD features that target different properties of speech. We categorize the features with respect to the properties that are exploited, such as power, harmonicity, or modulation, and evaluate the performance of some dedicated features. The importance of temporal context is discussed in relation to latency restrictions imposed by different applications. Our analyses allow for selecting promising VAD features and finding a reasonable trade-off between performance and complexity.
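As a rough illustration of the simplest feature category named above (power), the following sketch marks a frame as speech when its power exceeds an estimated noise floor by a margin. It is not taken from the article: the function names, the frame length, the 6 dB margin, and the use of the minimum frame power as noise floor are all assumptions made for this example.

```python
import math

def frame_power_db(frame):
    """Mean power of one frame in dB (floored to avoid log of zero)."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))

def power_vad(samples, frame_len=160, margin_db=6.0):
    """Return one boolean per frame: True where power exceeds the noise floor.

    Assumption for illustration: the noise floor is the minimum frame power,
    a crude stand-in for a proper noise estimator.
    """
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    levels = [frame_power_db(f) for f in frames]
    noise_floor = min(levels)
    return [level > noise_floor + margin_db for level in levels]

# Synthetic input: a quiet segment, a louder segment, then quiet again.
quiet = [0.01 * math.sin(0.3 * n) for n in range(320)]
loud = [0.5 * math.sin(0.3 * n) for n in range(320)]
decisions = power_vad(quiet + loud + quiet)
```

A frame-wise decision like this adds no look-ahead, so it suits low-latency applications, but it cannot distinguish speech from any other loud sound; features based on harmonicity or modulation address that at higher complexity.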
2008 Hands-Free Speech Communication and Microphone Arrays | 2008
Tobias Wolff; Markus Buck
We present a new approach for residual transient noise suppression at the output of an arbitrary beamformer. A spatial optimum estimate for the instantaneous a posteriori SNR is derived on the basis of the output signals of a blocking matrix. The optimization problem is formulated in the logarithmic domain and statistical models for the obtained quantities are given. Based on these models the optimization problem is solved in the maximum a posteriori sense. It is shown that the performance of speech recognition systems in non-stationary noise scenarios is improved considerably compared to the performance achieved with a Wiener filter applied to the beamformer output.
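For orientation, the Wiener filter baseline mentioned above can be sketched as a per-bin spectral gain. This is a textbook form, not the proposed spatial estimator: it assumes the a priori SNR is approximated as the a posteriori SNR minus one, and the function name and gain floor are hypothetical choices for this example.

```python
def wiener_gain(a_posteriori_snr, gain_floor=0.1):
    """Wiener gain for one frequency bin.

    Assumption: the a priori SNR xi is approximated by (gamma - 1), where
    gamma is the a posteriori SNR (noisy power divided by noise power).
    The floor limits the maximum attenuation.
    """
    xi = max(a_posteriori_snr - 1.0, 0.0)
    gain = xi / (1.0 + xi)
    return max(gain, gain_floor)

# A bin dominated by speech is passed almost unchanged,
# a bin at or below the noise level is attenuated down to the floor.
speech_bin_gain = wiener_gain(11.0)
noise_bin_gain = wiener_gain(0.5)
```

Because such a gain relies on a noise power estimate that is typically smoothed over time, it reacts slowly to transient noise, which is the situation the spatial a posteriori SNR estimate is meant to handle.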
International Conference on Acoustics, Speech, and Signal Processing | 2011
Timo Matheja; Markus Buck; Achim Eichentopf
Distributed microphone systems in cars usually provide dedicated microphones for several speakers, where each microphone captures the desired speech signal as well as possible. The signal quality may differ strongly among the speaker channels depending on the microphone position, the microphone type, the kind of background noise, and the speaker. When these signals are combined into a weighted mix, annoying switching artifacts may result. In this contribution a new dynamic signal mixer is presented that uses spectral preprocessing to compensate both for different speech signal levels and for different background noise levels and colorations. Thus, artifacts are avoided and smooth transitions can be achieved for the speech level and the background noise spectrum at speaker changes.
International Conference on Acoustics, Speech, and Signal Processing | 2009
Markus Buck; Tobias Wolff; Tim Haulick; Gerhard Schmidt
Compact microphone arrays allow for directional filtering with a minimum of installation space. They are therefore particularly suitable for automotive applications. Typically, compact arrays are realized as differential arrays or filter-and-sum beamformers, both of which show limited performance in terms of directivity. In this contribution we present a novel system for directional filtering for compact arrays. This system consists of two closely spaced microphones and incorporates an adaptive beamformer as well as a spatial post-filter which is designed to suppress non-stationary noise.
International Conference on Acoustics, Speech, and Signal Processing | 2012
Timo Matheja; Markus Buck; Tobias Wolff
In cars with integrated distributed microphone systems, each speaker usually has a dedicated microphone. The broadband speaker activity detection that is often required can be performed by simply evaluating the power ratios among the microphones, but transient interferers such as indicator noise, cars passing outside, or speech from interfering speakers may be wrongly attributed to one speaker's activity. In this contribution a new method is presented that exploits patterns of signal-power-inverted subbands, i.e., subbands in which a distant microphone shows higher energy than the closest one. Using a distance measure, the currently observed pattern of these power-inverted subbands is compared to reference patterns that are learned online and depend on the speaker position. During noise-only periods as well as during transient interfering signals, the patterns do not match the reference, so false speaker activity detections can be reduced.
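The simple power-ratio detection that the abstract takes as its starting point can be sketched as follows. This covers only that baseline, not the proposed pattern matching; the function name, the 6 dB threshold, and the rule of comparing the strongest channel to the second strongest are assumptions made for this example.

```python
import math

def detect_active_speaker(mic_powers, ratio_threshold_db=6.0):
    """Return the index of the dominant microphone, or None if none dominates.

    mic_powers holds one broadband power value per dedicated microphone.
    Assumption: a speaker counts as active when the strongest channel exceeds
    the second strongest by at least ratio_threshold_db.
    """
    if len(mic_powers) < 2:
        raise ValueError("need at least two microphones")
    order = sorted(range(len(mic_powers)),
                   key=lambda i: mic_powers[i], reverse=True)
    strongest, runner_up = order[0], order[1]
    ratio_db = 10.0 * math.log10(
        max(mic_powers[strongest], 1e-12) / max(mic_powers[runner_up], 1e-12))
    return strongest if ratio_db >= ratio_threshold_db else None

# Four seats: the first channel dominates by 10 dB in the first frame,
# while diffuse noise gives similar power everywhere in the second frame.
driver_frame = detect_active_speaker([1.0, 0.1, 0.05, 0.02])
noise_frame = detect_active_speaker([0.2, 0.18, 0.19, 0.2])
```

A transient interferer close to one microphone would pass this test just as a speaker does, which is the failure case that the subband pattern comparison is designed to reduce.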
EURASIP Journal on Advances in Signal Processing | 2013
Timo Matheja; Markus Buck; Tim Fingscheidt
Supporting multiple active speakers in automotive hands-free or speech dialog applications is an interesting issue, not least for reasons of comfort. Therefore, a multi-channel system for enhancement of speech signals captured by distributed distant microphones in a car environment is presented. Each of the potential speakers in the car has a dedicated directional microphone close to their position that captures the corresponding speech signal. The aim of the resulting overall system is twofold: On the one hand, a combination of an arbitrary pre-defined subset of speakers’ signals can be performed, e.g., to create an output signal in a hands-free telephone conference call for a far-end communication partner. On the other hand, annoying cross-talk components from interfering sound sources occurring in multiple different mixed output signals are to be eliminated, motivated by the possibility of other hands-free applications being active in parallel. The system includes several signal processing stages. A dedicated signal processing block for interfering speaker cancellation attenuates the cross-talk components of undesired speech. Further signal enhancement comprises the reduction of residual cross-talk and background noise. Subsequently, a dynamic signal combination stage merges the processed single-microphone signals to obtain appropriate mixed signals at the system output that may be passed to applications such as telephony or a speech dialog system. Based on signal power ratios between the individual microphone signals, an appropriate speaker activity detection, and with it a robust control mechanism for the whole system, is presented. The proposed system may be dynamically configured and has been evaluated for a car setup with four speakers sitting in the car cabin under various noise conditions.
Archive | 2010
Markus Buck; Eberhard Hänsler; Mohamed Krini; Gerhard Schmidt; Tobias Wolff
This chapter contains sections titled: Introduction; Signal Processing in Subband Domain; Multichannel Echo Cancellation; Speaker Localization; Beamforming; Sensor Calibration; Postprocessing; Conclusions; References
International Workshop on Acoustic Signal Enhancement | 2016
Ingo Schalk-Schupp; Friedrich Faubel; Markus Buck; Andreas Wendemuth
This work shows a novel method to suppress linear and nonlinear residual echo components after application of a linear echo canceler. The main idea is to separately treat linear and nonlinear residual echo components, as linear echo is reduced by AEC, while nonlinear echo passes the AEC unchanged. In particular, it is shown that a very simple model, such as a hard clipping function, is sufficient to approximate the nonlinear residual echo power; and that the clipping threshold can be estimated by comparing the broad-band predicted nonlinear residual echo power (produced with the current clipping threshold estimate) to the broad-band observed nonlinear residual echo power (obtained through linear AEC and subtraction of the linear residual echo power, as determined with linear coupling factors). Experimental evaluations show ERLE improvements of up to 14.9 dB compared to linear echo cancellation and suppression, at a negligible decrease in speech quality during double talk.
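The idea of matching predicted to observed nonlinear residual power can be illustrated with a small offline sketch. It makes several assumptions that are not taken from the paper: the nonlinear residual is modeled as the difference between the reference signal and its hard-clipped version, the observed power is given directly rather than derived from AEC output, and the threshold is found by a batch bisection search instead of an adaptive update. All names are hypothetical.

```python
import math

def hard_clip(x, threshold):
    """Limit a sample to the range [-threshold, threshold]."""
    return max(-threshold, min(threshold, x))

def nonlinear_residual_power(signal, threshold):
    """Mean power of the signal part removed by hard clipping (assumed model)."""
    return sum((x - hard_clip(x, threshold)) ** 2 for x in signal) / len(signal)

def estimate_clipping_threshold(signal, observed_power, iterations=40):
    """Bisection on the threshold until predicted power matches observed power.

    The predicted power falls as the threshold rises, so a prediction above
    the observation means the current threshold estimate is too low.
    """
    lo, hi = 0.0, max(abs(x) for x in signal)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if nonlinear_residual_power(signal, mid) > observed_power:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Synthetic check: generate the "observed" power with a known threshold of 0.5
# and recover that threshold from the power comparison alone.
reference = [math.sin(2.0 * math.pi * n / 50.0) for n in range(500)]
observed = nonlinear_residual_power(reference, 0.5)
estimate = estimate_clipping_threshold(reference, observed)
```

The search needs only one broad-band power value per comparison, which reflects why a single scalar threshold is attractive as a model parameter.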
2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays | 2011
Tobias Wolff; Markus Buck
In this contribution we present an adaptive beamformer-postfilter system which can be used to suppress non-stationary noises. The emphasis lies on the spatial filtering property of the proposed postfilter. Due to the spatial characteristic of the postfilter, an interfering speaker in background noise can be suppressed effectively while maintaining the quality of the desired speech using only two microphones.