Matthias Huck | Researchain

Publication

Featured research published by Matthias Huck.

Workshop on Statistical Machine Translation | 2015

Findings of the 2015 Workshop on Statistical Machine Translation

Ondrej Bojar; Rajen Chatterjee; Christian Federmann; Barry Haddow; Matthias Huck; Chris Hokamp; Philipp Koehn; Varvara Logacheva; Christof Monz; Matteo Negri; Matt Post; Carolina Scarton; Lucia Specia; Marco Turchi

This paper presents the results of the WMT15 shared tasks, which included a standard news translation task, a metrics task, a tuning task, a task for run-time estimation of machine translation quality, and an automatic post-editing task. This year, 68 machine translation systems from 24 institutions were submitted to the ten translation directions in the standard translation task. An additional 7 anonymized systems were included, and all systems were evaluated both automatically and manually. The quality estimation task had three subtasks, with a total of 10 teams submitting 34 entries. The pilot automatic post-editing task had a total of 4 teams submitting 7 entries.

Meeting of the Association for Computational Linguistics | 2016

Findings of the 2016 Conference on Machine Translation

Ondřej Bojar; Rajen Chatterjee; Christian Federmann; Yvette Graham; Barry Haddow; Matthias Huck; Antonio Jimeno Yepes; Philipp Koehn; Varvara Logacheva; Christof Monz; Matteo Negri; Aurélie Névéol; Mariana L. Neves; Martin Popel; Matt Post; Raphael Rubino; Carolina Scarton; Lucia Specia; Marco Turchi; Karin Verspoor; Marcos Zampieri

This paper presents the results of the WMT16 shared tasks, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), an automatic post-editing task, and a bilingual document alignment task. This year, 102 MT systems from 24 institutions (plus 36 anonymized online systems) were submitted to the 12 translation directions in the news translation task. The IT-domain task received 31 submissions from 12 institutions in 7 directions, and the biomedical task received 15 submissions from 5 institutions. Evaluation was both automatic and manual (relative ranking and 100-point scale assessments). The quality estimation task had three subtasks, with a total of 14 teams submitting 39 entries. The automatic post-editing task had a total of 6 teams submitting 11 entries.

Workshop on Statistical Machine Translation | 2014

Edinburgh’s Syntax-Based Systems at WMT 2014

Philip Williams; Rico Sennrich; Maria Nadejde; Matthias Huck; Eva Hasler; Philipp Koehn

This paper describes the string-to-tree systems built at the University of Edinburgh for the WMT 2014 shared translation task. We developed systems for English-German, Czech-English, French-English, German-English, Hindi-English, and Russian-English. This year we improved our English-German system through target-side compound splitting, morphosyntactic constraints, and refinements to parse tree annotation; we addressed the out-of-vocabulary problem using transliteration for Hindi and Russian and using morphological reduction for Russian; we improved our German-English system through tree binarization; and we reduced system development time by filtering the tuning sets.

Machine Translation | 2012

Jane: an advanced freely available hierarchical machine translation toolkit

David Vilar; Daniel Stein; Matthias Huck; Hermann Ney

In this article we will describe the design and implementation of Jane, an efficient hierarchical phrase-based (HPB) toolkit developed at RWTH Aachen University. The system has been used by RWTH at several international evaluation campaigns, including the WMT and NIST evaluations, and is now freely available for non-commercial application. We will go through the main features of Jane, which include, among others, support for different search strategies, different language model formats, support for syntax-based enhancements to the HPB machine translation paradigm, string-to-dependency translation, extended lexicon models, different methods for minimum-error-rate training and distributed operation on a computer cluster. Special attention has been paid to the efficiency of the decoder, clean code and quality assurance through unit and regression testing. Results on current machine translation tasks are reported, which show that the system is able to obtain state-of-the-art performance.

Workshop on Statistical Machine Translation | 2014

EU-BRIDGE MT: Combined Machine Translation

Markus Freitag; Stephan Peitz; Joern Wuebker; Hermann Ney; Matthias Huck; Rico Sennrich; Nadir Durrani; Maria Nadejde; Philip Williams; Philipp Koehn; Teresa Herrmann; Eunah Cho; Alex Waibel

This paper describes one of the collaborative efforts within EU-BRIDGE to further advance the state of the art in machine translation between two European language pairs, German→English and English→German. Three research institutes involved in the EU-BRIDGE project combined their individual machine translation systems and participated with a joint setup in the shared translation task of the evaluation campaign at the ACL 2014 Eighth Workshop on Statistical Machine Translation (WMT 2014). We combined up to nine different machine translation engines via system combination. RWTH Aachen University, the University of Edinburgh, and Karlsruhe Institute of Technology developed several individual systems which serve as system combination input. We devoted special attention to building syntax-based systems and combining them with the phrase-based ones. The joint setups yield empirical gains of up to 1.6 points in BLEU and 1.0 points in TER on the WMT newstest2013 test set compared to the best single systems.

Conference of the European Chapter of the Association for Computational Linguistics | 2014

Jane: Open Source Machine Translation System Combination

Markus Freitag; Matthias Huck; Hermann Ney

Different machine translation engines can be remarkably dissimilar not only with respect to their technical paradigm, but also with respect to the translation output they yield. System combination is a method for combining the output of multiple machine translation engines in order to benefit from the strengths of each of the individual engines. In this work we introduce a novel system combination implementation which is integrated into Jane, RWTH’s open source statistical machine translation toolkit. On the most recent Workshop on Statistical Machine Translation system combination shared task, we achieve improvements of up to 0.7 points in BLEU over the best system combination hypotheses which were submitted for the official evaluation. Moreover, we enhance our system combination pipeline with additional n-gram language models and lexical translation models.

The Prague Bulletin of Mathematical Linguistics | 2011

A Guide to Jane, an Open Source Hierarchical Translation Toolkit

Daniel Stein; David Vilar; Stephan Peitz; Markus Freitag; Matthias Huck; Hermann Ney

Jane is RWTH’s hierarchical phrase-based translation toolkit. It includes tools for phrase extraction, translation and scaling factor optimization, with efficient and documented programs of which large parts can be parallelized. The decoder features syntactic enhancements, reorderings, triplet models, discriminative word lexica, and support for a variety of language model formats. In this article, we will review the main features of Jane and explain the overall architecture. We will also indicate where and how new models can be included.

Workshop on Statistical Machine Translation | 2015

The Edinburgh/JHU Phrase-based Machine Translation Systems for WMT 2015

Barry Haddow; Matthias Huck; Alexandra Birch; Nikolay Bogoychev; Philipp Koehn

This paper describes the submission of the University of Edinburgh and the Johns Hopkins University for the shared translation task of the EMNLP 2015 Tenth Workshop on Statistical Machine Translation (WMT 2015). We set up phrase-based statistical machine translation systems for all ten language pairs of this year’s evaluation campaign, which are English paired with Czech, Finnish, French, German, and Russian in both translation directions. Novel research directions we investigated include: neural network language models and bilingual neural network language models, a comprehensive use of word classes, and sparse lexicalized reordering features.

Empirical Methods in Natural Language Processing | 2014

Preference Grammars and Soft Syntactic Constraints for GHKM Syntax-based Statistical Machine Translation

Matthias Huck; Hieu Hoang; Philipp Koehn

In this work, we investigate the effectiveness of two techniques for a feature-based integration of syntactic information into GHKM string-to-tree statistical machine translation (Galley et al., 2004): (1.) Preference grammars on the target language side promote syntactic well-formedness during decoding while also allowing for derivations that are not linguistically motivated (as in hierarchical translation). (2.) Soft syntactic constraints augment the system with additional source-side syntax features while not modifying the set of string-to-tree translation rules or the baseline feature scores. We conduct experiments with a state-of-the-art setup on an English→German translation task. Our results suggest that preference grammars for GHKM translation are inferior to the plain target-syntactified model, whereas the enhancement with soft source syntactic constraints provides consistent gains. By employing soft source syntactic constraints with sparse features, we are able to achieve improvements of up to 0.7 points BLEU and 1.0 points TER.

Workshop on Statistical Machine Translation | 2014

Augmenting String-to-Tree and Tree-to-String Translation with Non-Syntactic Phrases

Matthias Huck; Hieu Hoang; Philipp Koehn

We present an effective technique to easily augment GHKM-style syntax-based machine translation systems (Galley et al., 2006) with phrase pairs that do not comply with any syntactic well-formedness constraints. Non-syntactic phrase pairs are distinguished from syntactic ones in order to avoid harming effects. We apply our technique in state-of-the-art string-to-tree and tree-to-string setups. For tree-to-string translation, we furthermore investigate novel approaches for translating with source-syntax GHKM rules in association with input tree constraints and input tree features.