Publication


Featured research published by Klaus Reinhard.


Speech Communication | 1999

Parametric subspace modeling of speech transitions

Klaus Reinhard; Mahesan Niranjan

This paper describes an attempt at capturing segmental transition information for speech recognition tasks. The slowly varying dynamics of spectral trajectories carry much discriminant information that is very crudely modelled by traditional approaches such as HMMs. In approaches such as recurrent neural networks there is the hope, but not the convincing demonstration, that such transitional information could be captured. The method presented here starts from the very different position of explicitly capturing the trajectory of short-time spectral parameter vectors on a subspace in which the temporal sequence information is preserved. This was approached by introducing a temporal constraint into the well-known technique of Principal Component Analysis (PCA). On this subspace, parametric modelling of the trajectory was attempted, and a distance metric was computed to perform classification of diphones. Using the Principal Curves method of Hastie and Stuetzle and the Generative Topographic Mapping (GTM) technique of Bishop, Svensén and Williams, a description of the temporal evolution in terms of latent variables was obtained. On the difficult problem of /bee/, /dee/, /gee/ it was possible to retain discriminatory information with a small number of parameters. Experimental illustrations present results on the ISOLET and TIMIT databases.


International Conference on Acoustics, Speech, and Signal Processing | 2006

Robust Endpoint Detection for Speech Recognition Based on Discriminative Feature Extraction

Koichi Yamamoto; Firas Jabloun; Klaus Reinhard; Akinori Kawamura

Accurate endpoint detection is important for improving speech recognition capability. This paper proposes a novel endpoint detection method which combines energy-based and likelihood ratio-based voice activity detection (VAD) criteria, where the likelihood ratio is calculated with speech/non-speech Gaussian mixture models (GMMs). Moreover, the proposed method introduces the discriminative feature extraction (DFE) technique in order to improve the speech/non-speech classification. The DFE is used in the training of the parameters required for calculating the likelihood ratio. Experimental results have shown that the proposed endpointer achieves good performance compared to an energy-based endpointer in terms of start-of-speech (SOS) and end-of-speech (EOS) detection. Due to the improvement of the endpointer, the performance of automatic speech recognition (ASR) has also been improved.
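
As a rough illustration of the idea in this abstract, the sketch below combines an energy test with a speech/non-speech GMM log-likelihood ratio for each frame and then picks start-of-speech and end-of-speech frames. It is not the authors' implementation: the fusion rule (both criteria must fire), the thresholds, the minimum run length, and all function names are assumptions made for illustration, and the DFE training step is not shown.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) triples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def frame_is_speech(x, energy_db, speech_gmm, noise_gmm,
                    energy_thresh_db=-40.0, llr_thresh=0.0):
    """Assumed fusion rule: the energy AND likelihood-ratio tests must both pass."""
    llr = gmm_loglik(x, speech_gmm) - gmm_loglik(x, noise_gmm)
    return energy_db > energy_thresh_db and llr > llr_thresh

def find_endpoints(decisions, min_run=3):
    """Return (start, end) frame indices spanning the first and last speech run
    of at least min_run frames, or None if no such run exists."""
    runs, start = [], None
    # A trailing False closes any run that reaches the last frame.
    for i, d in enumerate(list(decisions) + [False]):
        if d and start is None:
            start = i
        elif not d and start is not None:
            if i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    if not runs:
        return None
    return runs[0][0], runs[-1][1]
```

Requiring a minimum run length is one simple way to keep isolated false alarms from being reported as endpoints; the abstract does not say how the original method smooths frame decisions.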


International Conference on Acoustics, Speech, and Signal Processing | 1998

Parametric subspace modelling of speech transitions

Klaus Reinhard; Mahesan Niranjan

We report on an attempt to capture segmental transition information for speech recognition tasks. The slowly varying dynamics of spectral trajectories carry much discriminant information that is very crudely modelled by traditional approaches such as HMMs. In approaches such as recurrent neural networks there is the hope, but not the convincing demonstration, that such transitional information could be captured. We start from the very different position of explicitly capturing the trajectory of short-time spectral parameter vectors on a subspace in which the temporal sequence information is preserved (time-constrained principal component analysis). On this subspace, we attempt a parametric modelling of the trajectory and compute a distance metric to perform classification of diphones. Much of the discriminant information is still retained in this subspace. This is illustrated on the isolated transitions /bee/, /dee/ and /gee/.


International Conference on Acoustics, Speech, and Signal Processing | 1999

Diphone multi-trajectory subspace models

Klaus Reinhard; Mahesan Niranjan

We report on an extension of our work on capturing speech transitions embedded in diphones using trajectory models. The slowly varying dynamics of spectral trajectories carry much discriminant information that is very crudely modelled by traditional approaches such as HMMs. We improved our methodology of explicitly capturing the trajectory of short-time spectral parameter vectors by introducing multi-trajectory concepts in a probabilistic framework. An optimal subspace selection is presented which finds the most discriminant plane for classification. Using the E-set from the TIMIT database, results suggest that discriminant information is preserved in the subspace.


International Conference on Acoustics, Speech, and Signal Processing | 2000

Matched filter design for diphone subspace models

Klaus Reinhard; Mahesan Niranjan

Considering the perceptual importance of phonetic transitions as minimal contextual variant units, this paper explicitly models the inter-phone dynamics covered in diphones. Subspace projections based on a time-constrained PCA (TC-PCA) are developed which focus on the temporal evolution. They reveal characteristic trajectories present in a low-dimensional spectral representation, facilitating robust parameter estimation while simultaneously optimising the discriminant information. A matched filter design is applied to a multiple-hypotheses rescoring scheme, which enables operating in a very low-dimensional parameter space. Using this multiple-hypotheses paradigm, the effectiveness of the complementary information gained by explicitly modelling inter-phone dynamics is shown on the TIMIT database, resulting in improved phone error rates.


Archive | 1998

Subspace Models For Speech Transitions Using Principal Curves

Klaus Reinhard; Mahesan Niranjan


Speech Communication | 2002

Diphone subspace mixture trajectory models for HMM Complementation

Klaus Reinhard; Mahesan Niranjan


International Conference on Artificial Neural Networks | 1997

Non-linear speech transition visualization

Klaus Reinhard; Mahesan Niranjan


Conference of the International Speech Communication Association | 2015

PATSY - it's all about pronunciation!

Caroline Kaufhold; Vadim Gamidov; Andreas Kießling; Klaus Reinhard; Elmar Nöth


Archive | 2004

Systeme et methode permettant une intervention acoustique [System and method enabling acoustic intervention]

Jochen Junkawitsch; Raymond Brückner; Klaus Reinhard; Stefan Dobler

Collaboration


Dive into Klaus Reinhard's collaborations.

Top Co-Authors

Andreas Kießling

University of Erlangen-Nuremberg

Caroline Kaufhold

University of Erlangen-Nuremberg

Elmar Nöth

University of Erlangen-Nuremberg
