Markus Freitag | Researchain

Publication

Featured research published by Markus Freitag.

workshop on statistical machine translation | 2014

EU-BRIDGE MT: Combined Machine Translation

Markus Freitag; Stephan Peitz; Joern Wuebker; Hermann Ney; Matthias Huck; Rico Sennrich; Nadir Durrani; Maria Nadejde; Philip Williams; Philipp Koehn; Teresa Herrmann; Eunah Cho; Alex Waibel

This paper describes one of the collaborative efforts within EU-BRIDGE to further advance the state of the art in machine translation between two European language pairs, German→English and English→German. Three research institutes involved in the EU-BRIDGE project combined their individual machine translation systems and participated with a joint setup in the shared translation task of the evaluation campaign at the ACL 2014 Eighth Workshop on Statistical Machine Translation (WMT 2014). We combined up to nine different machine translation engines via system combination. RWTH Aachen University, the University of Edinburgh, and Karlsruhe Institute of Technology developed several individual systems which serve as system combination input. We devoted special attention to building syntax-based systems and combining them with the phrase-based ones. The joint setups yield empirical gains of up to 1.6 points in BLEU and 1.0 points in TER on the WMT newstest2013 test set compared to the best single systems.

conference of the european chapter of the association for computational linguistics | 2014

Jane: Open Source Machine Translation System Combination

Markus Freitag; Matthias Huck; Hermann Ney

Different machine translation engines can be remarkably dissimilar not only with respect to their technical paradigm, but also with respect to the translation output they yield. System combination is a method for combining the output of multiple machine translation engines in order to benefit from the strengths of each of the individual engines. In this work we introduce a novel system combination implementation which is integrated into Jane, RWTH’s open source statistical machine translation toolkit. On the most recent Workshop on Statistical Machine Translation system combination shared task, we achieve improvements of up to 0.7 points in BLEU over the best system combination hypotheses which were submitted for the official evaluation. Moreover, we enhance our system combination pipeline with additional n-gram language models and lexical translation models.

The Prague Bulletin of Mathematical Linguistics | 2011

A Guide to Jane, an Open Source Hierarchical Translation Toolkit

Daniel Stein; David Vilar; Stephan Peitz; Markus Freitag; Matthias Huck; Hermann Ney

Jane is RWTH's hierarchical phrase-based translation toolkit. It includes tools for phrase extraction, translation and scaling factor optimization, with efficient and documented programs of which large parts can be parallelized. The decoder features syntactic enhancements, reorderings, triplet models, discriminative word lexica, and support for a variety of language model formats. In this article, we will review the main features of Jane and explain the overall architecture. We will also indicate where and how new models can be included.

The Prague Bulletin of Mathematical Linguistics | 2012

Hierarchical Phrase-Based Translation with Jane 2

Matthias Huck; Jan-Thorsten Peter; Markus Freitag; Stephan Peitz; Hermann Ney

In this paper, we give a survey of several recent extensions to hierarchical phrase-based machine translation that have been implemented in version 2 of Jane, RWTH's open source statistical machine translation toolkit. We focus on the following techniques: insertion and deletion models, lexical scoring variants, reordering extensions with non-lexicalized reordering rules and with a discriminative lexicalized reordering model, and soft string-to-dependency hierarchical machine translation. We describe the fundamentals of each of these techniques and present experimental results obtained with Jane 2 to confirm their usefulness in state-of-the-art hierarchical phrase-based translation (HPBT).

workshop on statistical machine translation | 2015

Local System Voting Feature for Machine Translation System Combination

Markus Freitag; Jan-Thorsten Peter; Stephan Peitz; Minwei Feng; Hermann Ney

In this paper, we enhance the traditional confusion network system combination approach with an additional model trained by a neural network. This work is motivated by the fact that the commonly used binary system voting models only assign each input system a global weight which is responsible for the global impact of each input system on all translations. This prevents individual systems with low system weights from having influence on the system combination output, although in some situations this could be helpful. Further, words which have only been seen by one or few systems rarely have a chance of being present in the combined output. We train a local system voting model by a neural network which is based on the words themselves and the combinatorial occurrences of the different system outputs. This gives system combination the option to prefer other systems at different word positions even for the same sentence.

meeting of the association for computational linguistics | 2017

Beam Search Strategies for Neural Machine Translation

Markus Freitag; Yaser Al-Onaizan

The basic concept in Neural Machine Translation (NMT) is to train a large Neural Network that maximizes the translation performance on a given parallel corpus. NMT then uses a simple left-to-right beam-search decoder to generate new translations that approximately maximize the trained conditional probability. The current beam search strategy generates the target sentence word by word from left to right while keeping a fixed number of active candidates at each time step. First, this simple search is less adaptive as it also expands candidates whose scores are much worse than the current best. Secondly, it does not expand hypotheses if they are not within the best scoring candidates, even if their scores are close to the best one. The latter can be avoided by increasing the beam size until no performance improvement can be observed. While this can yield better performance, it has the drawback of a slower decoding speed. In this paper, we concentrate on speeding up the decoder by applying a more flexible beam search strategy whose candidate size may vary at each time step depending on the candidate scores. We speed up the original decoder by up to 43% for the two language pairs German-English and Chinese-English without losing any translation quality.
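The idea of a beam whose size varies with the candidate scores can be illustrated with a small sketch. The abstract does not spell out the pruning rules, so the code below is only one plausible instance: it caps the beam at a maximum size and additionally drops any candidate whose probability falls below a fixed fraction of the best candidate's probability. The function name `prune_candidates` and the parameters `max_beam` and `rel_threshold` are hypothetical and not taken from the paper.

```python
import math


def prune_candidates(candidates, max_beam, rel_threshold):
    """Keep a variable number of candidates for one decoding time step.

    candidates:    list of (hypothesis, log_prob) pairs
    max_beam:      upper bound on the number of survivors (hypothetical name)
    rel_threshold: a candidate survives only if its probability is at least
                   rel_threshold times the best probability (assumed rule)
    """
    if not candidates:
        return []
    # Classic beam step: sort by score and keep at most max_beam entries.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)[:max_beam]
    # Flexible step: drop candidates that are far worse than the current best.
    best_log_prob = ranked[0][1]
    cutoff = best_log_prob + math.log(rel_threshold)
    return [c for c in ranked if c[1] >= cutoff]


step = [("a", math.log(0.5)), ("b", math.log(0.4)),
        ("c", math.log(0.05)), ("d", math.log(0.01))]
survivors = prune_candidates(step, max_beam=3, rel_threshold=0.5)
print([hyp for hyp, _ in survivors])
```

With a fixed beam of three, candidate "c" would be expanded even though it is ten times less probable than the best candidate; the relative cutoff removes it, which is where a decoding speed-up of this kind would come from.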

Archive | 2017

Investigations on machine translation system combination

Markus Freitag; Hermann Ney; François Yvon

Machine translation is a task in the field of natural language processing whose objective is to translate documents from one human language into another human language without any human interaction. There has been extensive research in the field of machine translation and many different machine translation approaches have emerged. Current machine translation systems are based on different paradigms, such as phrases, phrases with gaps, hand-written rules, syntactical rules or neural networks. All approaches have been proven to perform well on several international evaluation campaigns, but no one has emerged as the superior approach. In this thesis, we investigate the combination of different machine translation approaches to benefit from all of them. The combination of outputs from multiple machine translation systems has been successfully applied in state-of-the-art machine translation evaluations for several years. System combination is a reliable method to combine the benefits of different machine translation systems into one single translation output. System combination relies on the concept of majority voting and the assumption that different machine translation engines produce different errors at different positions, but the majority agrees on a correct translation. Confusion network decoding has emerged as one of the most successful approaches in combining machine translation outputs. The main goal of this thesis is to develop novel methods to improve the translation quality of confusion network system combination. In this thesis, we introduce a novel system combination implementation which has been made available as an open-source toolkit to the research community. We extend previously invented approaches by the addition of several models and show that our methods produce translation results that are better than or similar to those of the previously invented approaches.
Moreover, compared to one single system combination approach, our implementation is significantly better in several translation tasks. On top of this high-level baseline, we extend the confusion network approach with an additional model learned by a neural network. The system combination output is typically a combination of the best available system engines and ignores the output of weaker translation systems, although they could be helpful in some situations. We show that our novel model also takes weaker systems into account and detects the positions where the weaker systems help to improve the quality of the combined translation. One of the most important steps in system combination is the pairwise alignment process between the different input systems. We introduce a novel alignment algorithm which is based on the source sentence and improves the translation quality of our combined translation. In addition to automatic evaluations, we also let humans evaluate our novel approach. Furthermore, we investigate the effect of decoding direction in the commonly used phrase-based and hierarchical phrase-based machine translation approaches. We show how to benefit from system combination and combine different machine translation setups that are based on different decoding directions. In addition, we investigate techniques to combine the different configurations in an earlier stage, e.g. after the alignment training or the phrase extraction step. Finally, we present our recent evaluation results that were obtained with our previously invented methods. We participated in the most recent international evaluation campaigns and demonstrate that our methods outperform the translation setups of all participating top-ranked international research labs in several language pairs.
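The majority-voting principle behind confusion network system combination can be shown with a toy sketch. It assumes the system outputs have already been aligned into slots, with one word per system in each slot and an empty string standing for an empty arc. The function `vote` and the `system_weights` parameter are hypothetical names for illustration; this is not the implementation from the thesis or from Jane, which uses further models on top of the voting features.

```python
from collections import defaultdict

EPS = ""  # marks an empty arc: the system contributes no word to this slot


def vote(confusion_network, system_weights=None):
    """Pick the word with the highest summed system weight in each slot.

    confusion_network: list of slots; each slot lists one word per system
    system_weights:    optional global weight per system (uniform if omitted)
    """
    output = []
    for slot in confusion_network:
        weights = system_weights or [1.0] * len(slot)
        totals = defaultdict(float)
        for word, weight in zip(slot, weights):
            totals[word] += weight
        # Ties go to the word that appears first in the slot.
        best = max(totals, key=totals.get)
        if best != EPS:
            output.append(best)
    return output


network = [
    ["the", "the", "a"],
    ["house", "home", "house"],
    [EPS, EPS, "building"],
]
print(vote(network))
print(vote(network, system_weights=[0.2, 0.2, 0.7]))
```

With uniform weights the two agreeing systems win every slot. Once a single system carries a large global weight it dominates all positions, which is the limitation of global system weights that the neural local voting model described in the abstract is meant to address.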

international conference on computational linguistics | 2012

Jane 2: Open Source Phrase-based and Hierarchical Statistical Machine Translation

Joern Wuebker; Matthias Huck; Stephan Peitz; Malte Nuhn; Markus Freitag; Jan-Thorsten Peter; Saab Mansour; Hermann Ney

workshop on statistical machine translation | 2010

The RWTH Aachen Machine Translation System for WMT 2010

Matthias Huck; Joern Wuebker; Christoph Schmidt; Markus Freitag; Stephan Peitz; Daniel Stein; Arnaud Dagnelies; Saab Mansour; Gregor Leusch; Hermann Ney

IWSLT | 2011