Volker Stahl
Philips
Publication
Featured research published by Volker Stahl.
international conference on acoustics, speech, and signal processing | 2000
Volker Stahl; Alexander Fischer; Rolf Bippus
Elimination of additive noise from a speech signal is a fundamental problem in audio signal processing. In this paper we restrict our considerations to the case where only a single microphone recording of the noisy signal is available. The algorithms we investigate proceed in two steps. First, the noise power spectrum is estimated. A method based on temporal quantiles in the power spectral domain is proposed and compared with pause detection and recursive averaging. The second step is to eliminate the estimated noise from the observed signal by spectral subtraction or Wiener filtering. The database used in the experiments comprises 6034 utterances of German digits and digit strings by 770 speakers in 10 different cars. Without noise reduction, we obtain an error rate of 11.7%. Quantile-based noise estimation and Wiener filtering reduce the error rate to 8.6%. Similar improvements are achieved in an experiment with artificial, non-stationary noise.
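The two steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the quantile `q=0.5`, and the gain floor `0.05` are assumptions chosen for illustration, and the input is assumed to be a precomputed power spectrogram given as a list of frames.

```python
def quantile_noise_estimate(power_frames, q=0.5):
    """Estimate the noise power per frequency bin as the q-quantile over time.

    power_frames: list of frames, each a list of power values per bin.
    q: quantile in [0, 1]; 0.5 (the median) is an assumed value.
    """
    num_bins = len(power_frames[0])
    noise = []
    for k in range(num_bins):
        # Sort the temporal track of bin k and pick the q-quantile.
        track = sorted(frame[k] for frame in power_frames)
        noise.append(track[int(q * (len(track) - 1))])
    return noise


def wiener_gain(power_frame, noise, floor=0.05):
    """Per-bin Wiener-style gain (X - N) / X, limited below by an assumed floor."""
    gains = []
    for x, n in zip(power_frame, noise):
        gain = (x - n) / x if x > 0 else 0.0
        gains.append(max(gain, floor))
    return gains


def spectral_subtraction(power_frame, noise, floor=0.05):
    """Power spectral subtraction with a spectral floor relative to the input."""
    return [max(x - n, floor * x) for x, n in zip(power_frame, noise)]


# Toy spectrogram: five frames, two bins; most frames contain noise only.
frames = [[1.0, 2.0], [1.0, 2.0], [9.0, 2.0], [1.0, 10.0], [1.0, 2.0]]
noise = quantile_noise_estimate(frames)
gains = wiener_gain(frames[2], noise)
cleaned = spectral_subtraction(frames[2], noise)
```

The quantile estimate relies on the idea that each frequency bin carries speech energy only part of the time, so a quantile of its temporal track reflects the noise level without requiring explicit pause detection.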
international conference on acoustics, speech, and signal processing | 2001
Volker Stahl; Alexander Fischer; Rolf Bippus
Despite continuous progress in robust automatic speech recognition, acoustic mismatch between training and test conditions is still a major problem. Consequently, large speech collections must be conducted in many environments. An alternative approach is to generate training data synthetically by filtering clean speech with impulse responses and/or adding noise signals from the target domain. We compare the performance of a speech recognizer trained on recorded speech in the target domain with a system trained on suitably transformed clean speech. In order to obtain comparable results, our experiments are based on two-channel recordings with a close-talk and a distant microphone, which produce the clean signal and the target-domain signal, respectively. By filtering and adding noise we obtain error rates which are only 10% higher for natural number recognition and 30% higher for a command recognition task compared to training with target domain data.
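A minimal sketch of the data synthesis idea in this abstract: clean speech is convolved with an impulse response, and a noise signal is added at a chosen signal-to-noise ratio. The function names and the SNR-based scaling are assumptions for illustration, since the abstract does not say how the noise level was set.

```python
import math


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def mean_power(samples):
    """Mean squared amplitude of a sample sequence."""
    return sum(v * v for v in samples) / len(samples)


def add_noise_at_snr(speech, noise, snr_db):
    """Add noise scaled so that the speech-to-noise power ratio equals snr_db.

    Assumes the noise recording is at least as long as the speech.
    """
    noise = noise[:len(speech)]
    target_ratio = 10.0 ** (snr_db / 10.0)
    scale = math.sqrt(mean_power(speech) / (mean_power(noise) * target_ratio))
    return [s + scale * n for s, n in zip(speech, noise)]


def simulate_target_domain(clean, impulse_response, noise, snr_db):
    """Filter clean speech with an impulse response, then add scaled noise."""
    filtered = convolve(clean, impulse_response)[:len(clean)]
    return add_noise_at_snr(filtered, noise, snr_db)


filtered = convolve([1.0, 2.0, 3.0], [1.0, 0.5])
noisy = add_noise_at_snr([1.0, -1.0, 1.0, -1.0], [2.0, 2.0, -2.0, -2.0], 0.0)
```

For realistic signal lengths an FFT-based convolution would be the usual choice. The direct form is used here only to keep the sketch dependency-free.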
international conference on acoustics, speech, and signal processing | 1999
Alexander Fischer; Volker Stahl
Data collections in the car environment require much more effort in terms of cost and time than collections in the telephone or office environment. Therefore we apply supervised database adaptation from the telephone environment to the car environment to allow quick setup of car environment recognizers. Further reduction of word error rate is obtained by unsupervised online adaptation during recognition. We investigate the common techniques MLLR and MAP for that purpose. We give results on command word recognition in the car environment for all combinations of database and online adaptation in task-dependent and task-independent scenarios. The possibility of setting up speech recognizers for the car environment based on telephone data and a limited amount of adaptation material from the car environment is demonstrated.
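As a rough illustration of the two adaptation techniques named in this abstract, the sketch below shows the standard MAP update of a Gaussian mean and the application of an MLLR-style affine transform to a mean vector. It does not reproduce the paper's setup: the function names and the prior weight `tau` are assumptions, and the estimation of the MLLR transform itself is omitted.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """MAP update of a Gaussian mean.

    new_mean = (tau * prior_mean + sum_t gamma_t * x_t) / (tau + sum_t gamma_t)

    prior_mean: mean from the source (e.g. telephone) model.
    frames: adaptation feature vectors x_t.
    posteriors: occupation probabilities gamma_t of this Gaussian per frame.
    tau: prior weight (assumed value); larger tau trusts the prior more.
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)
    weighted_sum = [
        sum(g * x[d] for g, x in zip(posteriors, frames)) for d in range(dim)
    ]
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


def mllr_transform_mean(mean, A, b):
    """Apply an MLLR-style affine transform: new_mean = A * mean + b."""
    return [
        sum(A[r][c] * mean[c] for c in range(len(mean))) + b[r]
        for r in range(len(b))
    ]


adapted = map_adapt_mean([0.0, 0.0], [[2.0, 4.0], [2.0, 4.0]], [1.0, 1.0], tau=2.0)
shifted = mllr_transform_mean([1.0, 2.0], [[1.0, 0.0], [0.0, 2.0]], [0.5, -1.0])
```

With little adaptation data the MAP estimate stays close to the prior mean, while a shared MLLR transform moves many Gaussians at once, which is one reason the two techniques are often combined.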
international conference on acoustics, speech, and signal processing | 1998
Alexander Fischer; Volker Stahl
This paper presents results of speaker-independent speech recognition experiments concerning acoustic front-ends, models and their structures in car environments. The database comprises 350 speakers in 6 different cars. We investigate whole-word models, context-independent phoneme models and context-dependent within-word phoneme models. We study task-dependent (same vocabulary context in training and test) phoneme models and present first results on task-independent (broad context in training, i.e. phonetically rich material) scenarios. The latter allow flexible vocabulary definition for applications with dynamically changing command words or new applications, avoiding an expensive data collection. Acoustic preprocessing is carried out with mel-cepstrum combined with spectral subtraction and SNR normalization. The task-dependent word error rates are well below 3% for both whole-word and phoneme models. The task-independent scenarios require further work.
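The mel-cepstrum mentioned in this abstract is usually computed by taking the logarithm of mel-spaced filterbank energies and applying a discrete cosine transform. The sketch below shows only those two ingredients. The mel formula is one commonly used variant, the function names are assumptions, and the paper's actual front-end parameters are not given in the abstract.

```python
import math


def hz_to_mel(freq_hz):
    """Convert Hz to mel using a commonly used formula (an assumed variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def dct_ii(values):
    """Unnormalized DCT-II: c_k = sum_n x_n * cos(pi * k * (n + 0.5) / N)."""
    n_values = len(values)
    return [
        sum(x * math.cos(math.pi * k * (n + 0.5) / n_values)
            for n, x in enumerate(values))
        for k in range(n_values)
    ]


def mel_cepstrum(filterbank_energies, num_coeffs):
    """Cepstral coefficients from positive mel filterbank energies."""
    log_energies = [math.log(e) for e in filterbank_energies]
    return dct_ii(log_energies)[:num_coeffs]


round_trip = mel_to_hz(hz_to_mel(1000.0))
flat = dct_ii([1.0, 1.0, 1.0, 1.0])
```

A flat log spectrum yields energy only in the zeroth coefficient, which illustrates why the higher cepstral coefficients describe spectral shape rather than overall level.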
Reliable Computing | 1997
Volker Stahl
We show that the overestimation error of the interval Horner method for univariate polynomials on a centered interval is reduced at least by half if the interval is split at its midpoint zero and the interval Horner method is applied to both halves separately. This observation is used to reduce the overestimation error of the Taylor centered form at least by half. Further, it can be used to compute error bounds for the Taylor centered form and an inner range estimation.
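A small sketch can make the splitting idea in this abstract concrete. It evaluates a polynomial with the interval Horner scheme on a centered interval, then on the two halves split at zero, and takes the hull of the two results. The function names are assumptions, and the sketch uses plain floating-point arithmetic without outward rounding, so it is illustrative rather than rigorous.

```python
def interval_mul(a, b):
    """Product of two intervals given as (lo, hi) tuples."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))


def horner_interval(coeffs, x):
    """Interval Horner evaluation; coeffs are ordered from highest degree down."""
    lo, hi = coeffs[0], coeffs[0]
    for c in coeffs[1:]:
        lo, hi = interval_mul((lo, hi), x)
        lo, hi = lo + c, hi + c
    return (lo, hi)


def horner_interval_split(coeffs, x):
    """Apply interval Horner to [x_lo, 0] and [0, x_hi], then take the hull."""
    left = horner_interval(coeffs, (x[0], 0.0))
    right = horner_interval(coeffs, (0.0, x[1]))
    return (min(left[0], right[0]), max(left[1], right[1]))


# p(x) = x^2 - x on [-1, 1]; the exact range is [-0.25, 2].
coeffs = [1.0, -1.0, 0.0]
whole = horner_interval(coeffs, (-1.0, 1.0))
split = horner_interval_split(coeffs, (-1.0, 1.0))
```

In this example the unsplit evaluation gives [-2, 2] and the split evaluation gives [-1, 2]. Measured against the exact range [-0.25, 2], the lower bound's overestimation drops from 1.75 to 0.75, consistent with the reduction by at least half stated in the abstract.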
Archive | 1999
Friedhelm Wuppermann; Volker Stahl
Archive | 2002
Volker Stahl; Alexander Fischer
Archive | 2000
Volker Stahl; Alexander Fischer
Archive | 2001
Volker Stahl
conference of the international speech communication association | 1999
Rolf Bippus; Alexander Fischer; Volker Stahl