Volker Stahl | Researchain

Publication

Featured research published by Volker Stahl.

international conference on acoustics, speech, and signal processing | 2000

Quantile based noise estimation for spectral subtraction and Wiener filtering

Volker Stahl; Alexander Fischer; Rolf Bippus

Elimination of additive noise from a speech signal is a fundamental problem in audio signal processing. In this paper we restrict our considerations to the case where only a single microphone recording of the noisy signal is available. The algorithms which we investigate proceed in two steps. First, the noise power spectrum is estimated. A method based on temporal quantiles in the power spectral domain is proposed and compared with pause detection and recursive averaging. The second step is to eliminate the estimated noise from the observed signal by spectral subtraction or Wiener filtering. The database used in the experiments comprises 6034 utterances of German digits and digit strings by 770 speakers in 10 different cars. Without noise reduction, we obtain an error rate of 11.7%. Quantile based noise estimation and Wiener filtering reduce the error rate to 8.6%. Similar improvements are achieved in an experiment with artificial, non-stationary noise.
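The two steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the quantile `q=0.5` (the median), and the gain floor `floor=0.1` are assumptions chosen for illustration, and the Wiener gain uses the common simplification `(X - N) / X` on power values.

```python
def quantile_noise_estimate(power_frames, q=0.5):
    """Estimate the noise power in each frequency bin as the q-quantile
    over time of a power spectrogram given as a list of frames.
    q=0.5 is an assumed value, not one taken from the abstract."""
    num_bins = len(power_frames[0])
    noise = []
    for k in range(num_bins):
        # Sort the temporal trajectory of bin k and pick the q-quantile.
        track = sorted(frame[k] for frame in power_frames)
        idx = min(int(q * len(track)), len(track) - 1)
        noise.append(track[idx])
    return noise


def wiener_gains(frame, noise, floor=0.1):
    """Per-bin Wiener-style gain (X - N) / X, limited from below by an
    assumed gain floor to avoid negative or vanishing gains."""
    gains = []
    for x, n in zip(frame, noise):
        gain = (x - n) / x if x > 0 else 0.0
        gains.append(max(gain, floor))
    return gains


# Toy power spectrogram: 5 frames, 2 bins; speech-like bursts in frames 3 and 4.
frames = [[1.0, 4.0], [1.0, 4.0], [9.0, 4.0], [1.0, 100.0], [1.0, 4.0]]
noise = quantile_noise_estimate(frames)
gains = wiener_gains(frames[2], noise)
```

Because speech occupies each frequency bin only part of the time, a quantile of the bin's power over time tends to track the noise level without needing an explicit pause detector, which is the motivation the abstract gives for comparing it with pause detection.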

international conference on acoustics, speech, and signal processing | 2001

Acoustic synthesis of training data for speech recognition in living room environments

Volker Stahl; Alexander Fischer; Rolf Bippus

Despite continuous progress in robust automatic speech recognition, acoustic mismatch between training and test conditions is still a major problem. Consequently, large speech collections must be conducted in many environments. An alternative approach is to generate training data synthetically by filtering clean speech with impulse responses and/or adding noise signals from the target domain. We compare the performance of a speech recognizer trained on recorded speech in the target domain with a system trained on suitably transformed clean speech. In order to obtain comparable results, our experiments are based on two-channel recordings with a close-talk and a distant microphone, which produce the clean signal and the target domain signal respectively. By filtering and adding noise we obtain error rates which are only 10% higher for natural number recognition and 30% higher for a command recognition task compared to training with target domain data.
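The synthesis idea in this abstract (filter clean speech with an impulse response, then add target-domain noise) can be sketched as follows. This is a toy illustration under stated assumptions: the function names and the `snr_db` parameter are hypothetical, the noise is assumed to be at least as long as the speech and not all zero, and the abstract does not say how noise levels were chosen.

```python
import math


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def power(samples):
    """Mean squared amplitude of a sample sequence."""
    return sum(v * v for v in samples) / len(samples)


def synthesize(clean, impulse_response, noise, snr_db):
    """Filter clean speech with a room impulse response and add noise
    scaled so that the filtered-speech-to-noise power ratio is snr_db.
    snr_db is an assumed control parameter for this sketch."""
    filtered = convolve(clean, impulse_response)[:len(clean)]
    noise = noise[:len(filtered)]
    scale = math.sqrt(power(filtered) / (power(noise) * 10.0 ** (snr_db / 10.0)))
    return [f + scale * n for f, n in zip(filtered, noise)]


clean = [1.0, 0.0, 0.0, 0.0]
room = [1.0, 0.5]                 # hypothetical two-tap impulse response
noise = [1.0, -1.0, 1.0, -1.0]    # hypothetical noise recording
noisy = synthesize(clean, room, noise, snr_db=0.0)
```

In practice the impulse response and noise would be measured in the target environment, and an FFT-based convolution would replace the quadratic-time loop used here for clarity.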

international conference on acoustics, speech, and signal processing | 1999

Database and online adaptation for improved speech recognition in car environments

Alexander Fischer; Volker Stahl

Data collections in the car environment require much more effort in terms of cost and time than in the telephone or the office environment. Therefore we apply supervised database adaptation from the telephone environment to the car environment to allow quick setup of car environment recognizers. Further reduction of word error rate is obtained by unsupervised online adaptation during recognition. We investigate the common techniques MLLR and MAP for that purpose. We give results on command word recognition in the car environment for all combinations of database and online adaptation in task-dependent and task-independent scenarios. The possibility of setting up speech recognizers for the car environment based on telephone data and a limited amount of adaptation material from the car environment is demonstrated.
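As a rough illustration of one of the two techniques named above, the sketch below shows the textbook MAP update for a single Gaussian mean, which interpolates between a prior mean (for example, from telephone data) and adaptation data (for example, from the car). The abstract gives no formulas, so the function name, the prior weight `tau=10.0`, and the data are assumptions, not the authors' setup.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Textbook MAP update of one Gaussian mean:
    (tau * prior_mean + sum_t gamma_t * x_t) / (tau + sum_t gamma_t),
    where gamma_t is the occupation probability of the Gaussian for
    frame x_t and tau is an assumed prior weight."""
    dim = len(prior_mean)
    occupancy = sum(posteriors)
    weighted_sum = [0.0] * dim
    for x, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * x[d]
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


prior = [0.0, 0.0]                              # hypothetical source-domain mean
car_frames = [[2.0, 4.0] for _ in range(10)]    # hypothetical adaptation frames
adapted = map_adapt_mean(prior, car_frames, [1.0] * 10)
```

With little adaptation data the estimate stays near the prior mean, and with more data it moves toward the sample mean of the adaptation frames, which is why this kind of update suits a limited amount of adaptation material.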

international conference on acoustics, speech, and signal processing | 1998

Subword unit based speech recognition in car environments

Alexander Fischer; Volker Stahl

This paper presents results of speaker-independent speech recognition experiments concerning acoustic front-ends, models and their structures in car environments. The database comprises 350 speakers in 6 different cars. We investigate whole-word models, context-independent phoneme models and context-dependent within-word phoneme models. We studied task-dependent (same vocabulary context in training and test) phoneme models and present first results on task-independent (broad context in training, i.e. phonetically rich material) scenarios. The latter allows flexible vocabulary definition for applications with dynamically changing command words or new applications avoiding an expensive data collection. Acoustic preprocessing is carried out with mel-cepstrum combined with spectral subtraction and SNR normalization. The task-dependent word error rates are well below 3% for both whole-word and phoneme models. The task-independent scenarios have to be worked on further.

Reliable Computing | 1997

Error Reduction of the Taylor Centered Form by Half and an Inner Estimation of the Range

Volker Stahl

We show that the overestimation error of the interval Horner method for univariate polynomials on a centered interval is reduced at least by half if the interval is split at its midpoint zero and the interval Horner method is applied to both halves separately. This observation is used to reduce the overestimation error of the Taylor centered form at least by half. Further, it can be used to compute error bounds for the Taylor centered form and an inner range estimation.
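The effect of splitting a centered interval at zero can be seen in a small sketch of the interval Horner method. The helper names are hypothetical, intervals are plain `(lo, hi)` tuples, and ordinary floating-point arithmetic is used without outward rounding, so this illustrates the idea but is not a rigorous enclosure.

```python
def interval_mul(a, b):
    """Product of two intervals given as (lo, hi) tuples."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))


def horner_interval(coeffs, x):
    """Interval Horner evaluation of a univariate polynomial.
    coeffs are ordered from the highest degree down to the constant."""
    acc = (coeffs[0], coeffs[0])
    for c in coeffs[1:]:
        prod = interval_mul(acc, x)
        acc = (prod[0] + c, prod[1] + c)
    return acc


def horner_split_at_zero(coeffs, radius):
    """Apply interval Horner separately on [-radius, 0] and [0, radius]
    and return the hull of the two results."""
    left = horner_interval(coeffs, (-radius, 0.0))
    right = horner_interval(coeffs, (0.0, radius))
    return (min(left[0], right[0]), max(left[1], right[1]))


# p(x) = x^2 - x on [-1, 1]; its exact range is [-0.25, 2].
coeffs = [1.0, -1.0, 0.0]
unsplit = horner_interval(coeffs, (-1.0, 1.0))    # (-2.0, 2.0)
split = horner_split_at_zero(coeffs, 1.0)         # (-1.0, 2.0)
```

For this example the lower bound overshoots the exact range by 1.75 without splitting and by 0.75 with splitting, which is consistent with the reduction by at least half stated in the abstract.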

Archive | 1999