Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Stefan Ortmanns is active.

Publication


Featured research published by Stefan Ortmanns.


IEEE Signal Processing Magazine | 1999

Dynamic programming search for continuous speech recognition

Hermann Ney; Stefan Ortmanns

The authors give a unifying view of the dynamic programming approach to the search problem. They review the search problem from the statistical point of view and show how the search space results from the acoustic and language models required by the statistical approach. Starting from the baseline one-pass algorithm using a linear organization of the pronunciation lexicon, they extend the baseline algorithm in various dimensions. To handle a large vocabulary, they show how the search space can be structured in combination with a lexical prefix tree organization of the pronunciation lexicon. In addition, they show how this structure of the search space can be combined with a time-synchronous beam search concept and how the search space can be constructed dynamically during the recognition process. In particular, to increase the efficiency of the beam search concept, they integrate the language model look-ahead into the pruning operation. To produce sentence alternatives rather than only the single best sentence, they extend the search strategy to generate a word graph. Finally, they report experimental results on a 64,000-word task that demonstrate the efficiency of the various search concepts presented.
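The core mechanism of the paper, time-synchronous dynamic programming with beam pruning, can be illustrated with a minimal sketch. The toy left-to-right HMM, its scores, and the function name below are invented for illustration and are not the authors' implementation:

```python
import math

# Toy left-to-right HMM: states 0..2; allowed successor states per state.
TRANS = {0: [0, 1], 1: [1, 2], 2: [2]}
# Emission log-probabilities per frame (invented numbers).
EMIT = [
    {0: -0.1, 1: -2.0, 2: -3.0},
    {0: -1.5, 1: -0.2, 2: -2.5},
    {0: -3.0, 1: -1.0, 2: -0.1},
]

def beam_search(beam_width=5.0):
    """Time-synchronous dynamic-programming search with beam pruning."""
    hyps = {0: 0.0}                              # state -> best log score so far
    for frame in EMIT:                           # process frames time-synchronously
        new_hyps = {}
        for state, score in hyps.items():
            for nxt in TRANS[state]:
                s = score + frame[nxt]
                if s > new_hyps.get(nxt, -math.inf):
                    new_hyps[nxt] = s            # DP recombination: keep the best path
        best = max(new_hyps.values())
        # Beam pruning: drop hypotheses that fall too far below the current best.
        hyps = {st: sc for st, sc in new_hyps.items() if sc >= best - beam_width}
    return hyps

scores = beam_search()
# The best-scoring hypothesis ends in the final HMM state (state 2).
```

All surviving hypotheses are advanced in lockstep over the frames, so their scores are directly comparable, which is what makes the relative beam threshold meaningful.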


Proceedings of the IEEE | 2000

Progress in dynamic programming search for LVCSR

Hermann Ney; Stefan Ortmanns

Initially introduced in the late 1960s and early 1970s, dynamic programming algorithms have become increasingly popular in automatic speech recognition. There are two reasons why this has occurred. First, the dynamic programming strategy can be combined with a very efficient and practical pruning strategy so that very large search spaces can be handled. Second, the dynamic programming strategy has turned out to be extremely flexible in adapting to new requirements. Examples of such requirements are the lexical tree organization of the pronunciation lexicon and the generation of a word graph instead of the single best sentence. We attempt to review the use of dynamic programming search strategies for large-vocabulary continuous speech recognition (LVCSR). The following methods are described in detail: search using a lexical tree, language-model look-ahead, and word-graph generation.


international conference on spoken language processing | 1996

Language-model look-ahead for large vocabulary speech recognition

Stefan Ortmanns; Hermann Ney; Andreas Eiden

Presents an efficient look-ahead technique which incorporates the language model knowledge at the earliest possible stage during the search process. This so-called language model look-ahead is built into the time-synchronous beam search algorithm using a tree-organized pronunciation lexicon for a bigram language model. The language model look-ahead technique exploits the full knowledge of the bigram language model by distributing the language model probabilities over the nodes of the lexical tree for each predecessor word. We present a method for handling the resulting memory requirements. The recognition experiments performed on the 20,000-word North American Business task (Nov. 1996) demonstrate that, in comparison with the unigram look-ahead, a reduction by a factor of 5 in the acoustic search effort can be achieved without loss in recognition accuracy.
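The central idea, distributing the bigram probabilities over the nodes of the lexical prefix tree so that each node carries the best LM score of any word reachable below it, can be sketched as follows. The toy lexicon, scores, and function name are invented for illustration:

```python
# Toy pronunciation lexicon: word -> phoneme sequence (invented).
LEXICON = {"sat": "s a t", "sit": "s i t", "sun": "s u n"}
# Bigram log-probabilities P(w | predecessor) for one fixed predecessor word (invented).
BIGRAM = {"sat": -1.0, "sit": -2.5, "sun": -0.5}

def lm_lookahead(lexicon, bigram):
    """For every tree node (phoneme prefix), store the best bigram score
    of any word whose pronunciation passes through that node."""
    la = {}
    for word, phones in lexicon.items():
        prefix = ()
        for ph in phones.split():
            prefix = prefix + (ph,)              # walk down the lexical tree
            la[prefix] = max(la.get(prefix, float("-inf")), bigram[word])
    return la

la = lm_lookahead(LEXICON, BIGRAM)
# At the shared root arc "s" the look-ahead is the best score over all three
# words (-0.5); after branching to "s a" only "sat" remains, so it is -1.0.
```

During search, the look-ahead score attached to the current tree node can be added to a hypothesis before the word identity is known, so the beam pruning already reflects LM knowledge; one such table is needed per predecessor word, which is the memory problem the paper addresses.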


international conference on acoustics speech and signal processing | 1999

High quality word graphs using forward-backward pruning

Achim Sixtus; Stefan Ortmanns

This paper presents an efficient method for constructing high quality word graphs for large vocabulary continuous speech recognition. The word graphs are constructed in a two-pass strategy. In the first pass, a huge word graph is produced using the time-synchronous lexical tree search method. Then, in the second pass, this huge word graph is pruned by applying a modified forward-backward algorithm. To analyze the characteristic properties of this word graph pruning method, we present a detailed comparison with the conventional time-synchronous forward pruning. The recognition experiments, carried out on the North American Business (NAB) 20,000-word task, demonstrate that, in comparison to the forward pruning, the new method leads to a significant reduction in the size of the word graph without an increase in the graph word error rate.
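The second-pass idea can be sketched on a toy word graph: combine the forward score to an edge's start node with the edge score and the backward score from its end node, and keep only edges lying on a path close to the best one. The graph, scores, threshold, and function name below are invented for illustration:

```python
import math

# Toy word graph: edges (from_node, to_node, word, log score); invented.
# Node numbers are a topological order from START to END.
EDGES = [
    (0, 1, "this", -1.0),
    (0, 1, "these", -4.0),
    (1, 2, "is", -0.5),
    (1, 2, "was", -3.0),
    (2, 3, "it", -0.8),
]
START, END = 0, 3

def forward_backward_prune(edges, threshold=2.0):
    """Keep only edges that lie on some path scoring within `threshold`
    of the globally best path through the graph."""
    nodes = {n for e in edges for n in e[:2]}
    fwd = {n: -math.inf for n in nodes}
    fwd[START] = 0.0
    for a, b, _, s in sorted(edges):             # forward pass in topological order
        fwd[b] = max(fwd[b], fwd[a] + s)
    bwd = {n: -math.inf for n in nodes}
    bwd[END] = 0.0
    for a, b, _, s in sorted(edges, reverse=True):   # backward pass
        bwd[a] = max(bwd[a], s + bwd[b])
    best = fwd[END]
    return [e for e in edges if fwd[e[0]] + e[3] + bwd[e[1]] >= best - threshold]

pruned = forward_backward_prune(EDGES)
# Keeps "this", "is", "it"; the two low-scoring alternatives are pruned.
```

Unlike forward-only pruning, each edge is judged by the best *complete* path through it, so an edge with a good prefix but hopeless continuation can be removed.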


international conference on acoustics speech and signal processing | 1998

The RWTH large vocabulary continuous speech recognition system

Hermann Ney; Lutz Welling; Stefan Ortmanns; Klaus Beulen; Frank Wessel

We present an overview of the RWTH Aachen large vocabulary continuous speech recognizer. The recognizer is based on continuous density hidden Markov models and a time-synchronous left-to-right beam search strategy. Experimental results on the ARPA Wall Street Journal (WSJ) corpus verify the effects of several system components, namely linear discriminant analysis, vocal tract normalization, pronunciation lexicon and cross-word triphones, on the recognition performance.


international conference on spoken language processing | 1996

A comparison of time conditioned and word conditioned search techniques for large vocabulary speech recognition

Stefan Ortmanns; Hermann Ney; E. Seide; I. Lindam

Compares the search effort required by the word-conditioned and the time-conditioned tree search methods. Both methods are based on a time-synchronous, left-to-right beam search using a tree-organized lexicon. Whereas the word-conditioned method is well-known and widely used, the time-conditioned method is novel in the context of 20,000-word vocabulary recognition. We extend both methods to handle trigram language models in a one-pass strategy. Both methods were tested on a train schedule inquiry task (1,850 words, telephone speech) and on the North American Business development corpus (20,000 words) of November 1994.


Computer Speech & Language | 2000

Look-ahead techniques for fast beam search

Stefan Ortmanns; Hermann Ney

This paper presents two look-ahead techniques for speeding up large vocabulary continuous speech recognition. These two techniques are the language model look-ahead and the phoneme look-ahead; both are incorporated into the pruning process of the time-synchronous one-pass beam search algorithm. The search algorithm is based on a tree-organized pronunciation lexicon in connection with a bigram language model. Both look-ahead techniques have been tested on the 20,000-word NAB'94 task (ARPA North American Business Corpus). The recognition experiments show that the combination of bigram language model look-ahead and phoneme look-ahead reduces the size of the search space by a factor of about 30 and the computational effort by a factor of 5 without affecting the word recognition accuracy in comparison with no look-ahead pruning technique.


IEEE Transactions on Speech and Audio Processing | 2000

The time-conditioned approach in dynamic programming search for LVCSR

Stefan Ortmanns; Hermann Ney

This paper presents the time-conditioned approach in dynamic programming search for large-vocabulary continuous-speech recognition. The following topics are presented: the baseline algorithm, a time-synchronous beam search version, a comparison with the word-conditioned approach, and a comparison with stack decoding. The approach has been successfully tested on the NAB task using a vocabulary of 64,000 words.


GI Jahrestagung | 1997

Architecture and Search Organization for Large Vocabulary Continuous Speech Recognition

Stefan Ortmanns; Lutz Welling; Klaus Beulen; Frank Wessel; Hermann Ney

This paper gives an overview of an architecture and search organization for large vocabulary, continuous speech recognition (LVCSR at RWTH). In the first part of the paper, we describe the principle and architecture of an LVCSR system. In particular, the issues of modeling and search for phoneme-based recognition are discussed. In the second part, we review the word-conditioned lexical tree search algorithm from the viewpoint of how the search space is organized. Further, we extend this method to produce high quality word graphs. Finally, we present some recognition results on the ARPA North American Business (NAB'94) task for a 64,000-word vocabulary (American English, continuous speech, speaker independent).


Computer Speech & Language | 1997

A word graph algorithm for large vocabulary continuous speech recognition

Stefan Ortmanns; Hermann Ney; Xavier L. Aubert

Collaboration


Dive into Stefan Ortmanns's collaborations.

Top Co-Authors


Hermann Ney

RWTH Aachen University
