Ute Essen
Philips
Publications
Featured research published by Ute Essen.
international conference on acoustics, speech, and signal processing | 1992
Ute Essen; Volker Steinbiss
Training corpora for stochastic language models are virtually always too small for maximum-likelihood estimation, so smoothing the models is of great importance. The authors derive the cooccurrence smoothing technique for stochastic language modeling and give experimental evidence for its validity. Using word-bigram language models, cooccurrence smoothing improved the test-set perplexity by 14% on a German 100000-word text corpus and by 10% on an English 1-million word corpus.
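As a rough illustration of the evaluation measure used here, the test-set perplexity of a word-bigram model can be sketched as follows. Add-alpha smoothing stands in for the paper's cooccurrence smoothing, whose exact form the abstract does not give; all identifiers are illustrative.

```python
import math
from collections import Counter

def bigram_perplexity(train_tokens, test_tokens, alpha=1.0):
    """Test-set perplexity of a smoothed word-bigram model.

    Add-alpha smoothing is a stand-in for the paper's cooccurrence
    smoothing, which the abstract does not specify in detail.
    """
    vocab_size = len(set(train_tokens) | set(test_tokens))
    bigrams = Counter(zip(train_tokens, train_tokens[1:]))
    unigrams = Counter(train_tokens)

    log_prob, n = 0.0, 0
    for prev, word in zip(test_tokens, test_tokens[1:]):
        # smoothed conditional probability p(word | prev)
        p = (bigrams[(prev, word)] + alpha) / (unigrams[prev] + alpha * vocab_size)
        log_prob += math.log(p)
        n += 1
    # perplexity = exp(-average log-probability per predicted word)
    return math.exp(-log_prob / n)
```

A lower perplexity means the model assigns higher probability to the test text; the 14% and 10% improvements above are reductions in this quantity.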
international conference on acoustics, speech, and signal processing | 1991
Hermann Ney; Ute Essen
The authors study various problems related to smoothing bigram probabilities for natural language modeling: the type of interpolation, i.e. linear vs. nonlinear, the optimal estimation of interpolation parameters, and the use of word equivalence classes (parts of speech). A nonlinear interpolation method that results in significant improvements over linear interpolation in the experimental tests is proposed. It is shown that the leaving-one-out method in combination with the maximum likelihood criterion can be efficiently used for the optimal estimation of interpolation parameters. In addition, an automatic clustering procedure is developed for finding word equivalence classes using a maximum likelihood criterion. Experimental results are presented for two text databases: a German database with 100000 words and an English database with 1.1 million words.
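The linear interpolation baseline discussed above combines bigram and unigram relative frequencies, p(w|v) = lam * f(w|v) + (1 - lam) * f(w). A minimal sketch of estimating the weight by maximizing held-out likelihood with EM follows; held-out estimation is a simplified stand-in for the paper's leaving-one-out procedure, and all names are illustrative.

```python
from collections import Counter

def estimate_lambda(train, heldout, iters=20):
    """EM estimate of the weight lam in the linear interpolation
    p(w|v) = lam * f(w|v) + (1 - lam) * f(w),
    chosen to maximize the likelihood of a held-out text (a simplified
    stand-in for leaving-one-out estimation on the training data)."""
    bigrams = Counter(zip(train, train[1:]))
    unigrams = Counter(train)
    total = len(train)

    def f_bigram(v, w):
        return bigrams[(v, w)] / unigrams[v] if unigrams[v] else 0.0

    def f_unigram(w):
        return unigrams[w] / total

    lam = 0.5
    for _ in range(iters):
        num = den = 0.0
        for v, w in zip(heldout, heldout[1:]):
            pb = lam * f_bigram(v, w)          # bigram component mass
            pu = (1 - lam) * f_unigram(w)      # unigram component mass
            if pb + pu == 0.0:
                continue
            num += pb / (pb + pu)  # expected responsibility of the bigram part
            den += 1
        lam = num / den
    return lam
```

Each EM iteration re-weights the two components by how much of the held-out probability mass each one explains; the update cannot leave the interval [0, 1].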
Philips Journal of Research | 1995
Volker Steinbiss; Hermann Ney; Xavier L. Aubert; Stefan Besling; Christian Dugast; Ute Essen; Dieter Geller; Reinhard Kneser; H.-G. Meier; Martin Oerder; Bach-Hiep Tran
This paper gives an overview of the Philips Research system for continuous-speech recognition. The recognition architecture is based on an integrated statistical approach. The system has been successfully applied to various tasks in American English and German, ranging from small vocabulary tasks to very large vocabulary tasks and from recognition only to speech understanding. Here, we concentrate on phoneme-based continuous-speech recognition for large vocabulary recognition as used for dictation, which covers a significant part of our research work on speech recognition. We describe this task and report on experimental results. In order to allow a comparison with the performance of other systems, a section with an evaluation on the standard North American Business news (NAB2) task (dictation of American English newspaper text) is supplied.
International Journal of Pattern Recognition and Artificial Intelligence | 1994
Hermann Ney; Volker Steinbiss; Bach-Hiep Tran; Ute Essen
This paper gives an overview of a research system for phoneme-based, large-vocabulary continuous-speech recognition. The system to be described has been applied to the SPICOS task, the DARPA RM task and a 12000-word dictation task. Experimental results for these three tasks will be presented. Like many other systems, the recognition architecture is based on an integrated statistical approach. In this paper, we describe the characteristic features of the system as opposed to other systems: (1) The Viterbi criterion is consistently applied both in training and testing. (2) Continuous mixture densities are used without any tying or smoothing; this approach can be viewed as a sort of ‘statistical template matching’. (3) Time-synchronous beam search is used consistently throughout all tasks; extensions using a tree organization of the vocabulary and phoneme lookahead are presented so that a 12000-word task can be handled.
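The time-synchronous beam search of point (3) can be sketched in miniature: at each frame, every active hypothesis is extended, and hypotheses whose log score falls more than a fixed beam below the current best are pruned. This toy version omits the tree-organized lexicon and phoneme look-ahead, and its data layout and names are assumptions for illustration.

```python
import math

def beam_viterbi(obs_scores, trans, beam=5.0):
    """Time-synchronous Viterbi search with beam pruning.

    obs_scores[t][s]: log-likelihood of the frame at time t in state s.
    trans[s]:         list of (next_state, log_transition_prob) pairs.
    Hypotheses more than `beam` below the frame's best score are dropped.
    """
    active = {0: 0.0}  # state -> best log score so far; start in state 0
    for scores in obs_scores:
        new_active = {}
        for state, score in active.items():
            for nxt, tp in trans[state]:
                cand = score + tp + scores[nxt]
                if cand > new_active.get(nxt, -math.inf):
                    new_active[nxt] = cand  # keep best path into nxt
        best = max(new_active.values())
        # beam pruning: discard hypotheses far below the best one
        active = {s: sc for s, sc in new_active.items() if sc >= best - beam}
    best_state = max(active, key=active.get)
    return best_state, active[best_state]
```

Because pruning is applied frame by frame against the same time index, all surviving hypotheses remain directly comparable, which is the point of the time-synchronous organization.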
Speech Communication | 1995
Volker Steinbiss; Hermann Ney; Ute Essen; Bach-Hiep Tran; Xavier L. Aubert; Christian Dugast; Reinhard Kneser; H.-G. Meier; Martin Oerder; Dieter Geller; W. Höllerbauer; H. Bartosik
This paper gives an overview of the Philips research system for phoneme-based, large-vocabulary, continuous-speech recognition. The system has been successfully applied to various tasks in the German and (American) English languages, ranging from small vocabulary tasks to very large vocabulary tasks. Here, we concentrate on continuous-speech recognition for dictation in real applications, the dictation of legal reports and radiology reports in German. We describe this task and report on experimental results. We also describe a commercial PC-based dictation system which includes a PC implementation of our scientific recognition prototype. In order to allow for a comparison with the performance of other systems, a section with an evaluation on the standard Wall Street Journal task (dictation of American English newspaper text) is supplied. The recognition architecture is based on an integrated statistical approach. We describe the characteristic features of the system as opposed to other systems: 1. the Viterbi criterion is consistently applied both in training and testing; 2. continuous mixture densities are used without tying or smoothing; 3. time-synchronous beam search in connection with a phoneme look-ahead is applied to a tree-organized lexicon.
Archive | 1993
Ute Essen; Hermann Ney
The language model part of a speech recognition system provides information about the probabilities of word sequences. The probabilities are estimated beforehand from a large set of training data, so the language model does not reflect any short-term fluctuations in word use. In order to enable adaptation to those fluctuations, we added a dynamic component, the cache memory, which uses the word frequencies of the recent past to update the static word probabilities. Compared with a standard bigram language model, we achieved perplexity improvements of 8% and 23%, respectively, depending on the heterogeneity of the data.
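A minimal sketch of such a cache component follows, assuming a linear combination of the static bigram probability with the relative frequency of each word in a fixed-size window of recent words. The abstract does not specify the combination scheme; the class name, the mixing weight, and the add-one smoothing of the static part are all illustrative.

```python
from collections import Counter, deque

class CacheBigramLM:
    """Static bigram model plus a dynamic cache component:
    p(w | v) = (1 - mu) * p_static(w | v) + mu * f_cache(w),
    where f_cache(w) is the relative frequency of w among the last
    `cache_size` observed words (illustrative formulation)."""

    def __init__(self, train, cache_size=200, mu=0.1):
        self.bigrams = Counter(zip(train, train[1:]))
        self.unigrams = Counter(train)
        self.vocab_size = len(set(train)) or 1
        self.cache = deque(maxlen=cache_size)
        self.mu = mu

    def prob(self, prev, word):
        # add-one smoothed static bigram probability
        p_static = (self.bigrams[(prev, word)] + 1) / \
                   (self.unigrams[prev] + self.vocab_size)
        if not self.cache:
            return p_static
        p_cache = self.cache.count(word) / len(self.cache)
        return (1 - self.mu) * p_static + self.mu * p_cache

    def observe(self, word):
        self.cache.append(word)  # update the dynamic component
```

Feeding recognized words into `observe` raises the probability of recently used words, which is exactly the short-term adaptation effect described above.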
Archive | 1995
Reinhard Kneser; Ute Essen; Hermann Ney
The probability estimates in stochastic language modelling often depend on some additional parameters apart from the training data. These parameters are typically related to the probabilities of events not seen in the training data, and conventional maximum-likelihood methods therefore fail to determine them. We present a special form of cross-validation, the leaving-one-out concept, to solve this problem. The application of this technique to several different modelling approaches demonstrates its flexibility and, in some cases, its computational simplicity. Experiments, performed on an English corpus of 1.1 million words, show its good generalization capability.
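One simple instance of the leaving-one-out idea: holding out each training token in turn, the held-out token is an unseen event for a model trained on the remaining tokens exactly when its count in the full data is 1. The fraction of such singleton tokens therefore estimates the total probability mass of unseen events. This is an illustrative reduction, not the paper's full derivation.

```python
from collections import Counter

def loo_unseen_mass(tokens):
    """Leaving-one-out estimate of the probability of unseen words.

    A token held out of the training data is unseen to the remaining
    tokens iff its count in the full data is 1, so the singleton
    fraction estimates the unseen-event probability mass.
    """
    counts = Counter(tokens)
    singletons = sum(1 for t in tokens if counts[t] == 1)
    return singletons / len(tokens)
```

This is the quantity that plain maximum likelihood sets to zero, which is why the cross-validated estimate is needed at all.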
Computer Speech & Language | 1994
Hermann Ney; Ute Essen; Reinhard Kneser
IEEE Transactions on Pattern Analysis and Machine Intelligence | 1995
Hermann Ney; Ute Essen; Reinhard Kneser
conference of the international speech communication association | 1993
Volker Steinbiss; Hermann Ney; B.-H. Tran; Ute Essen; Reinhard Kneser; Martin Oerder; H.-G. Meier; Xavier L. Aubert; Christian Dugast; Dieter Geller; W. Höllerbauer; H. Bartosik