S. Rieck
University of Erlangen-Nuremberg
Publication
Featured research published by S. Rieck.
international conference on acoustics, speech, and signal processing | 1992
Ernst Günter Schukat-Talamazzini; H. Niemann; W. Eckert; Thomas Kuhn; S. Rieck
The authors address the choice of suitable subword units for the hidden Markov model (HMM)-based front-end of a speaker-independent large vocabulary continuous speech dialog system (EVAR). In contrast to the well-known approach of using context-dependent phone-like units (for instance generalized triphones) the authors developed inventories of larger-sized subword units, so-called context-freezing units (CFU). CFU models can be considered as an approximation to the extremely desirable situation of having whole word HMMs under the limiting conditions of the training speech data at hand. Recognition experiments indicate an advantage of the context-freezing units over triphone/biphone/phone combinations in terms of the achieved word accuracy, at least in the case of German speech. Using triphones with contexts generalized by means of broad phonetic classes, the authors achieved results comparable to the CFU ones.
international conference on acoustics, speech, and signal processing | 1993
Ernst Günter Schukat-Talamazzini; M. Bielecki; Heinrich Niemann; Thomas Kuhn; S. Rieck
Three algorithms to speed up full covariance multivariate Gaussian vector quantizers are presented. The speed-up is achieved by avoiding the distance calculation for a considerable number of codebook classes at each input frame. In two cases, this pruning is guided by thresholds $U_{\kappa\lambda}$ which are computed for each class pair in a preprocessing stage. The computation of the $U_{\kappa\lambda}$ values from the codebook parameters by the gradient projection method leads to an admissible search strategy. Two of the proposed search procedures trade accuracy for speed. Both of them speed up vector quantization more than fivefold at very low frame error rates, and without any degradation of word accuracy.
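As a rough illustration of the per-class cost that such pruning avoids, the following sketch assumes the usual log-likelihood distance for a full covariance Gaussian codebook class. The abstract does not state the formula, so the symbols $\mu_\kappa$ (class mean), $\Sigma_\kappa$ (class covariance) and $D$ (feature dimension) are assumptions introduced here.

```latex
% Assumed distance of input frame x to codebook class \kappa (not taken from the paper)
d_\kappa(x) = (x - \mu_\kappa)^{\top} \Sigma_\kappa^{-1} (x - \mu_\kappa) + \ln \det \Sigma_\kappa ,
\qquad
\hat{\kappa}(x) = \arg\min_{\kappa} \, d_\kappa(x)
```

Under this assumption each evaluation of $d_\kappa(x)$ costs on the order of $D^2$ operations, so skipping classes that cannot win the minimum reduces the work per frame roughly in proportion to the number of classes pruned.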
international conference on pattern recognition | 1992
S. Rieck; Ernst Günter Schukat-Talamazzini; H. Niemann
Presents a new approach to speaker adaptation based on semi-continuous hidden Markov models (SCHMM). The authors introduce a modification of the semi-continuous codebook updating which allows rapid speaker adaptation. The approach is based on the idea that phonetic information already incorporated in a trained model should be used to update the codebook. Thus the different acoustic representation of a new speaker is learned while the connection between codebook entries and model states remains the same. Several experiments were carried out with a small speech sample. It is possible to demonstrate that the new codebook updating performs better than conventional SCHMM codebook updating and that using a speech sample comprising about 40 seconds of adaptation speech is enough to achieve 50 percent of the difference in performance between full speaker-dependent training and no adaptation at all.
KONVENS 92: 1. Konferenz "Verarbeitung Natürlicher Sprache", Nürnberg, 7. - 9. Oktober 1992 | 1992
W. Eckert; Gernot A. Fink; Andreas Kießling; Ralf Kompe; Thomas Kuhn; Franz Kummert; Marion Mast; H. Niemann; Elmar Nöth; R. Prechtel; S. Rieck; Gerhard Sagerer; A. Scheuer; G. Schukat-Talamazzini; B. Seestaedt
This article deals with the speech-understanding dialog system EVAR, in particular with the system's linguistic processing. EVAR's task is to conduct an information-retrieval dialog about the German InterCity train system. The linguistic knowledge is represented uniformly in a semantic network. The knowledge base is well structured according to a layered linguistic model. The interface to speech recognition is the word-hypothesis level. The control algorithm is formulated independently of the application and allows dynamic switching between the two basic analysis strategies, top-down and bottom-up. The knowledge represented in the system is used both to guide the recognition phase and in the understanding phase. The system is able to process queries despite erroneous recognition results. Results are presented for a speaker-dependent and a multi-speaker version of the recognizer.
Archive | 2011
Heinrich Niemann; Elmar Nöth; Ernst Günter Schukat-Talamazzini; Andreas Kießling; Ralf Kompe; Thomas Kuhn; S. Rieck
In order to cope with the problems of spontaneous speech (including, for example, hesitations and non-words) it is necessary to extract from the speech signal all information it contains. Modeling of words by segmental units should be supported by suprasegmental units since valuable information is represented in the prosody of an utterance. We present an approach to flexible and efficient modeling of speech by segmental units and describe extraction and use of suprasegmental information.
Archive | 1992
Thomas Kuhn; S. Kunzmann; Elmar Nöth; S. Rieck; Ernst Günter Schukat-Talamazzini
We present an iterative method to optimize the word recognition rate for a data-driven analysis in continuous speech by using a large set of speech samples. After a short description of our system environment, a bootstrapping method for iterative parameter estimation is discussed. The bootstrapping procedure is initialized by using a limited amount of hand-labeled training data to estimate the statistical parameters roughly. In the second step the statistical parameters are estimated more exactly on the basis of unlabeled training data. Some experimental results for the bootstrapping method performed on unlabeled training data are given in comparison with results achieved by parameter estimation on labeled training data.
Archive | 1992
S. Rieck; Ernst Günter Schukat-Talamazzini; Thomas Kuhn; S. Kunzmann; Elmar Nöth
This paper describes a method to automatically generate the labels for a new speech database from an existing manually labeled speech database. This becomes necessary when new standards are introduced and the speech signals have to be resampled. A dynamic time warping (DTW) algorithm is used to match the original and the resampled speech signals. The comparison is carried out on mel-based features. To reduce computation time the search space for the DTW algorithm is restricted. Several experiments were carried out with a normal density Bayes classifier to check the quality of the new labelings. The results showed only a slight decrease in performance when using the new labelings.
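The label-transfer idea can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes a Sakoe-Chiba band as the search-space restriction (the abstract does not say which restriction was used), a Euclidean frame distance, and one label per frame. The names `dtw_band` and `transfer_labels` are hypothetical.

```python
import math


def dtw_band(a, b, band):
    """Align feature sequences a and b (lists of equal-length vectors) with DTW.

    The search is restricted to cells with |i - j| <= band (assumed
    Sakoe-Chiba band). Returns (total_cost, path), where path is a list of
    aligned (index_in_a, index_in_b) pairs.
    """
    n, m = len(a), len(b)
    band = max(band, abs(n - m))  # widen so the end point stays reachable
    inf = math.inf
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - band), min(m, i + band) + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end point to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def transfer_labels(labels, path):
    """Copy per-frame labels of the original signal onto the new signal.

    Each frame of the new signal takes the label of the first original
    frame aligned to it.
    """
    out = {}
    for i, j in path:
        out.setdefault(j, labels[i])
    return [out[j] for j in sorted(out)]


# Toy example with one-dimensional "features" standing in for mel features.
original = [[0.0], [0.0], [1.0], [1.0]]
resampled = [[0.0], [1.0], [1.0]]
total_cost, path = dtw_band(original, resampled, band=1)
new_labels = transfer_labels(["s", "s", "a", "a"], path)
```

A narrower band lowers the number of cells evaluated from roughly n·m to roughly n·(2·band + 1), at the risk of excluding the true alignment if the two signals drift further apart than the band allows.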
conference of the international speech communication association | 1993
Ernst Günter Schukat-Talamazzini; Heinrich Niemann; Wieland Eckert; Thomas Kuhn; S. Rieck
conference of the international speech communication association | 1993
Wieland Eckert; Thomas Kuhn; Heinrich Niemann; S. Rieck; A. Scheuer; Ernst Günter Schukat-Talamazzini
Archive | 1995
Ralf Kompe; Werner Eckert; Andreas Kiessling; Thomas Kuhn; Maura B. Mast; Heinrich Niemann; Elmar Nöth; Ernst Günter Schukat-Talamazzini; S. Rieck