T. Claes

Katholieke Universiteit Leuven

Publication

Featured research published by T. Claes.

IEEE Transactions on Speech and Audio Processing | 1998

A novel feature transformation for vocal tract length normalization in automatic speech recognition

T. Claes; Ioannis Dologlou; L. ten Bosch; D. Van Compernolle

This paper proposes a method to transform acoustic models that have been trained on one group of speakers for use on speech from a different group in hidden Markov model based (HMM-based) automatic speech recognition. Features are transformed on the basis of assumptions regarding the difference in vocal tract length between the groups of speakers. First, the vocal tract length (VTL) of each group is estimated from the average third formant F3. Second, the linear acoustic theory of speech production is applied to warp the spectral characteristics of the existing models so as to match the incoming speech. The mapping is composed of subsequent nonlinear submappings. By locally linearizing it and comparing results in the output, a linear approximation of the exact mapping was obtained, which is accurate as long as the warping is reasonably small. The feature vector, which is computed from a speech frame, consists of the mel scale cepstral coefficients (MFCC) along with delta and delta² cepstra as well as delta and delta² energy. The method has been tested on the TI-digits database, which contains adult and children's speech, consisting of isolated digits and digit strings of different lengths. The word error rate when trained on adults and tested on children with transformed adult models is decreased by more than a factor of two compared to the nontransformed case.
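The first step above (estimating VTL from the average F3) can be sketched as follows. This is a minimal illustration, not the paper's transformation: it assumes the textbook uniform-tube model, in which the third resonance of a tube closed at one end is F3 = 5c / (4L), and it applies a plain linear frequency warp rather than the linearized cepstral mapping the abstract describes. The function names, the speed of sound, and any F3 values passed in are hypothetical.

```python
SPEED_OF_SOUND_M_S = 350.0  # assumed value for air in the vocal tract


def vtl_from_f3(f3_hz, c=SPEED_OF_SOUND_M_S):
    """Uniform-tube estimate of vocal tract length in metres.

    For a tube closed at one end, F3 = 5c / (4L), so L = 5c / (4 * F3).
    """
    if f3_hz <= 0:
        raise ValueError("F3 must be positive")
    return 5.0 * c / (4.0 * f3_hz)


def warp_factor(f3_train_hz, f3_test_hz):
    """Ratio by which the training group's formants must be scaled.

    Because formants are inversely proportional to VTL in this model,
    the ratio equals VTL_train / VTL_test.
    """
    return f3_test_hz / f3_train_hz


def warp_frequencies(freqs_hz, alpha):
    """Linear warp of a frequency axis (a simplification of the paper's mapping)."""
    return [alpha * f for f in freqs_hz]
```

With made-up average F3 values of 2500 Hz for the training group and 3000 Hz for the test group, `warp_factor(2500.0, 3000.0)` gives a scaling of about 1.2, which `warp_frequencies` then applies to the model's frequency axis.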

International Conference on Acoustics, Speech, and Signal Processing | 1994

On the importance of the microphone position for speech recognition in the car

J. Smolders; T. Claes; G. Sablon; D. Van Compernolle

One of the problems with speech recognition in the car is the position of the far-talk microphone. The position determines not only how much noise is picked up from the car (engine, tires, ...) and from other sources (traffic, wind, ...), but also the acoustical transfer function. To compare microphone positions, we recorded a multispeaker database in a car with 7 different microphone positions and compared them on the basis of SNR and recognition rate. The position at the ceiling right in front of the speaker gave the best results.
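As a rough illustration of the SNR criterion, the sketch below estimates SNR in dB from a segment containing speech plus noise and a noise-only segment recorded at the same microphone position. The abstract does not say how SNR was measured, so the estimator, the function names, and the assumption that speech and noise are uncorrelated are all hypothetical.

```python
import math


def mean_power(samples):
    """Average squared amplitude of a segment."""
    if not samples:
        raise ValueError("segment must not be empty")
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech_plus_noise, noise_only):
    """Estimate SNR in dB for one microphone position.

    Assumes speech and noise are uncorrelated, so the speech power is the
    power of the noisy segment minus the power of the noise-only segment.
    """
    p_noise = mean_power(noise_only)
    p_speech = mean_power(speech_plus_noise) - p_noise
    if p_noise <= 0:
        return math.inf
    if p_speech <= 0:
        return -math.inf
    return 10.0 * math.log10(p_speech / p_noise)
```

Computing this value for each candidate position gives one of the two rankings the abstract mentions; recognition rate would still have to be measured separately.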

International Conference on Acoustics, Speech, and Signal Processing | 1996

SNR-normalisation for robust speech recognition

T. Claes; D. Van Compernolle

A new normalisation technique for speech recognition in adverse conditions is presented. Specifically, the influence of additive noise in combination with convolutive distortions is considered. In the proposed method, a masking constant is added to the outputs of a mel scale triangular filterbank. This is done for both testing and training samples. The goal is to normalise the signal-to-noise ratio (SNR) in each frequency band by adapting the masking constant to the measured SNR or dynamic range in that band. This makes the extracted parameters less sensitive to the noise level and also suppresses the influence of channel distortions. The method is easy to implement and works on-line. Experimental results are given on the NOISEX-92 database and on real car data.
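The masking idea can be sketched for a single filterbank band as follows. This is an illustration under stated assumptions, not the authors' implementation: the dynamic range of a band is taken as the ratio of its largest to its smallest energy over an utterance, and the masking constant m is chosen so that (peak + m) / (floor + m) equals a target range. The function names and the way peak and floor are measured are hypothetical.

```python
import math


def masking_constant(peak, floor, target_range_db):
    """Constant m such that (peak + m) / (floor + m) equals the target range.

    Solving for m gives m = (peak - R * floor) / (R - 1), with R the target
    range as a linear power ratio. Adding a non-negative constant can only
    reduce the range, so m is 0 when the band is already below the target.
    """
    if floor <= 0 or peak < floor:
        raise ValueError("need 0 < floor <= peak")
    target = 10.0 ** (target_range_db / 10.0)
    if target <= 1.0:
        raise ValueError("target range must be above 0 dB")
    if peak / floor <= target:
        return 0.0
    return (peak - target * floor) / (target - 1.0)


def normalise_band(energies, target_range_db):
    """Add the masking constant to one band's energies and take the log."""
    m = masking_constant(max(energies), min(energies), target_range_db)
    return [math.log(e + m) for e in energies]
```

Applying `normalise_band` to every band, for training and test data alike, caps each band's dynamic range at the same target. Bands whose range is already below the target are left unchanged in this sketch.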

International Conference on Spoken Language Processing | 1996

Spectral estimation and normalisation for robust speech recognition

T. Claes; Fei Xie; D. Van Compernolle

Speech recognition in adverse conditions remains a difficult and challenging problem. It has already been shown that normalisation of the dynamic range (SNR) of the frequency channels in a mel scale triangular filter bank (MFCC) improves the robustness against both additive and convolutional noise. Nevertheless, because the method is based on a masking technique, the improvement is small when the SNR is lower than the target (normalised) SNR. A solution to this problem is to enhance the filter bank energies before the masking technique is applied. For this purpose the authors developed a non-linear spectral estimator (NSE) for speech recognition that operates on the log filter bank energies. NSE enhances these filter bank energies and makes SNR-normalisation effective even at very low SNRs. Experimental results are given on the NOISEX-92 database. Better recognition performance is seen even at 0 dB SNR.

Conference of the International Speech Communication Association | 1997