Martin Sundermeyer
RWTH Aachen University
Publications
Featured research published by Martin Sundermeyer.
IEEE Transactions on Audio, Speech, and Language Processing | 2015
Martin Sundermeyer; Hermann Ney; Ralf Schlüter
Language models have traditionally been estimated based on relative frequencies, using count statistics that can be extracted from huge amounts of text data. More recently, it has been found that neural networks are particularly powerful at estimating probability distributions over word sequences, giving substantial improvements over state-of-the-art count models. However, the performance of neural network language models strongly depends on their architectural structure. This paper compares count models to feedforward, recurrent, and long short-term memory (LSTM) neural network variants on two large-vocabulary speech recognition tasks. We evaluate the models in terms of perplexity and word error rate, experimentally validating the strong correlation of the two quantities, which we find to hold regardless of the underlying type of the language model. Furthermore, neural networks incur an increased computational complexity compared to count models, and they model context dependencies differently, often exceeding the number of words that are taken into account by count-based approaches. These differences require efficient search methods for neural networks, and we analyze the potential improvements that can be obtained when applying advanced algorithms to the rescoring of word lattices on large-scale setups.
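As a sketch of the perplexity measure evaluated in this abstract, the following minimal Python example computes perplexity from per-word log-probabilities; the function name and the toy probabilities are illustrative, not taken from the paper:

```python
import math

def perplexity(log_probs):
    """Perplexity of a word sequence from its per-word natural-log
    conditional probabilities: PPL = exp(-(1/N) * sum_i log p(w_i | h_i)).
    Lower perplexity generally correlates with lower word error rate."""
    return math.exp(-sum(log_probs) / len(log_probs))

# Toy example with four assumed conditional probabilities.
probs = [0.2, 0.1, 0.25, 0.05]
ppl = perplexity([math.log(p) for p in probs])
```

Because the probabilities multiply to 1/4000, the resulting perplexity is the geometric-mean inverse probability, 4000^(1/4) ≈ 7.95.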
empirical methods in natural language processing | 2014
Martin Sundermeyer; Tamer Alkhouli; Joern Wuebker; Hermann Ney
This work presents two different translation models using recurrent neural networks. The first one is a word-based approach using word alignments. Second, we present phrase-based translation models that are more consistent with phrase-based decoding. Moreover, we introduce bidirectional recurrent neural models to the problem of machine translation, allowing us to use the full source sentence in our models, which is also of theoretical interest. We demonstrate that our translation models are capable of improving strong baselines already including recurrent neural language models on three tasks: IWSLT 2013 German→English, BOLT Arabic→English and Chinese→English. We obtain gains up to 1.6% BLEU and 1.7% TER by rescoring 1000-best lists.
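The n-best rescoring mentioned in this abstract can be sketched as a generic log-linear reranking scheme; the function names and the interpolation weight below are assumptions for illustration, not the paper's implementation:

```python
def rescore_nbest(nbest, nn_score, lam=0.5):
    """Rerank an n-best list of (hypothesis, baseline_score) pairs by
    adding a weighted neural-model score to each hypothesis' baseline
    decoder score, returning the highest-scoring hypothesis."""
    best_hyp, _ = max(
        ((hyp, base + lam * nn_score(hyp)) for hyp, base in nbest),
        key=lambda pair: pair[1],
    )
    return best_hyp

# Toy 2-best list: the baseline prefers "a c"; the neural score flips it.
nbest = [("a b", -2.0), ("a c", -1.5)]
nn = {"a b": -1.0, "a c": -3.0}.get
best = rescore_nbest(nbest, nn)
```

In practice the interpolation weight would be tuned on held-out data, e.g. to maximize BLEU or minimize TER.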
international conference on acoustics, speech, and signal processing | 2013
Martin Sundermeyer; Ilya Oparin; Jean-Luc Gauvain; B. Freiberg; Ralf Schlüter; Hermann Ney
Research on language modeling for speech recognition has increasingly focused on the application of neural networks. Two competing concepts have been developed: on the one hand, feedforward neural networks that represent an n-gram approach; on the other, recurrent neural networks that may learn context dependencies spanning more than a fixed number of predecessor words. To the best of our knowledge, no comparison has been carried out between feedforward and state-of-the-art recurrent networks when applied to speech recognition. This paper analyzes this aspect in detail on a well-tuned French speech recognition task. In addition, we propose a simple and efficient method to normalize language model probabilities across different vocabularies, and we show how to speed up training of recurrent neural networks by parallelization.
international conference on acoustics, speech, and signal processing | 2011
Martin Sundermeyer; Markus Nussbaum-Thom; Simon Wiesler; Christian Plahl; A. El-Desoky Mousa; Stefan Hahn; David Nolden; Ralf Schlüter; Hermann Ney
Recognizing Broadcast Conversational (BC) speech data is a difficult task, which can be regarded as one of the major challenges beyond the recognition of Broadcast News (BN).
international conference on acoustics, speech, and signal processing | 2012
Ilya Oparin; Martin Sundermeyer; Hermann Ney; Jean-Luc Gauvain
Neural network language models (NNLMs) have recently become an important complement to conventional n-gram language models (LMs) in speech-to-text systems. However, little is known about the behavior of NNLMs. The analysis presented in this paper aims to understand which types of events are better modeled by NNLMs as compared to n-gram LMs, in what cases improvements are most substantial, and why this is the case. Such an analysis is important for deriving further benefit from NNLMs used in combination with conventional n-gram models. The analysis is carried out for different types of neural network (feed-forward and recurrent) LMs. The results, showing for which types of events NNLMs provide better probability estimates, are validated on two setups that differ in their size and degree of data homogeneity.
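The combination of an NNLM with a conventional n-gram LM that this abstract refers to is typically realized by linear interpolation of the two probability estimates; a minimal sketch, where the weight is an assumed hyperparameter tuned on held-out data:

```python
def interpolate(p_ngram, p_nn, lam=0.5):
    """Linear interpolation of an n-gram LM probability and a neural
    LM probability for the same word and history:
        p(w | h) = lam * p_nn(w | h) + (1 - lam) * p_ngram(w | h).
    lam is an illustrative value, not one from the paper."""
    return lam * p_nn + (1 - lam) * p_ngram

p = interpolate(p_ngram=0.01, p_nn=0.03, lam=0.5)
```

Since both inputs are valid probability distributions and the weights sum to one, the interpolated model remains properly normalized.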
adaptive multimedia retrieval | 2009
Marc Wichterich; Christian Beecks; Martin Sundermeyer; Thomas Seidl
Expanding on our preliminary work [1], we present a novel method to heuristically adapt the Earth Mover's Distance to relevance feedback. Moreover, we detail an optimization-based method that takes feedback from the current and past relevance feedback iterations into account in order to improve the degree to which the Earth Mover's Distance reflects the preference information given by the user. As shown by our experiments, the adaptation of the Earth Mover's Distance results in a larger number of relevant objects in fewer feedback iterations compared to existing query movement techniques for the Earth Mover's Distance.
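For intuition about the underlying metric: in one dimension with unit ground distance between adjacent bins, the Earth Mover's Distance between two normalized histograms reduces to the summed absolute difference of their cumulative distributions. A minimal sketch of this special case (not the adaptive, feedback-driven variant the paper proposes):

```python
def emd_1d(h1, h2):
    """Earth Mover's Distance between two normalized 1-D histograms
    over the same ordered bins, with unit distance between neighboring
    bins: EMD = sum over bins of |CDF1 - CDF2|."""
    cum_diff, total = 0.0, 0.0
    for a, b in zip(h1, h2):
        cum_diff += a - b
        total += abs(cum_diff)
    return total

# All mass moves two bins to the right: cost 1.0 * 2 = 2.0.
d = emd_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
```

The general EMD over arbitrary ground-distance matrices requires solving a transportation problem (a linear program), which is what makes feedback-driven adaptation of the ground distances interesting.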
international conference on acoustics, speech, and signal processing | 2012
Jesús Andrés-Ferrer; Martin Sundermeyer; Hermann Ney
The smoothing of n-gram models is a core technique in language modelling (LM). Modified Kneser-Ney (mKN) ranks among the best smoothing techniques. This technique discounts a fixed quantity from the observed counts in order to approximate the Turing-Good (TG) counts. Although the TG counts optimise the leaving-one-out (L1O) criterion, the discounting parameters introduced in mKN do not. Moreover, the approximation to the TG counts for large counts is heavily simplified. In this work, both ideas are addressed: the estimation of the discounting parameters by L1O, and better functional forms to approximate larger TG counts. The L1O performance is compared with cross-validation (CV) and the mKN baseline on two large vocabulary tasks.
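The fixed mKN discounts this abstract refers to are conventionally set in closed form from counts-of-counts (the standard Chen & Goodman estimates), as an approximation to Turing-Good discounting; a sketch with illustrative counts, not values from the paper's corpora:

```python
def mkn_discounts(n1, n2, n3, n4):
    """Closed-form modified Kneser-Ney discounts from counts-of-counts
    n_r (the number of distinct n-grams seen exactly r times):
        Y   = n1 / (n1 + 2*n2)
        D1  = 1 - 2*Y*n2/n1
        D2  = 2 - 3*Y*n3/n2
        D3+ = 3 - 4*Y*n4/n3
    These are fixed per count class, which is exactly the restriction
    the paper revisits via leaving-one-out estimation."""
    y = n1 / (n1 + 2 * n2)
    d1 = 1 - 2 * y * n2 / n1
    d2 = 2 - 3 * y * n3 / n2
    d3p = 3 - 4 * y * n4 / n3
    return d1, d2, d3p

# Illustrative counts-of-counts.
d1, d2, d3p = mkn_discounts(n1=1000, n2=400, n3=200, n4=100)
```

For typical corpora, where n_r decreases with r, the discounts come out positive and increase with the count class.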
conference of the international speech communication association | 2012
Martin Sundermeyer; Ralf Schlüter; Hermann Ney
conference of the international speech communication association | 2012
Zoltán Tüske; Ralf Schlüter; Hermann Ney; Martin Sundermeyer
IWSLT | 2011
Lori Lamel; Sandrine Courcinous; Julien Despres; Jean-Luc Gauvain; Yvan Josse; Kevin Kilgour; Florian Kraft; Viet Bac Le; Hermann Ney; Markus Nußbaum-Thom; Ilya Oparin; Tim Schlippe; Ralf Schlüter; Tanja Schultz; Thiago Fraga-Silva; Sebastian Stüker; Martin Sundermeyer; Bianca Vieru; Ngoc Thang Vu; Alex Waibel