Philip Williams | Researchain

Publication

Featured researches published by Philip Williams.

workshop on statistical machine translation | 2014

Edinburgh’s Syntax-Based Systems at WMT 2014

Philip Williams; Rico Sennrich; Maria Nadejde; Matthias Huck; Eva Hasler; Philipp Koehn

This paper describes the string-to-tree systems built at the University of Edinburgh for the WMT 2014 shared translation task. We developed systems for English-German, Czech-English, French-English, German-English, Hindi-English, and Russian-English. This year we improved our English-German system through target-side compound splitting, morphosyntactic constraints, and refinements to parse tree annotation; we addressed the out-of-vocabulary problem using transliteration for Hindi and Russian and using morphological reduction for Russian; we improved our German-English system through tree binarization; and we reduced system development time by filtering the tuning sets.

workshop on statistical machine translation | 2014

EU-BRIDGE MT: Combined Machine Translation

Markus Freitag; Stephan Peitz; Joern Wuebker; Hermann Ney; Matthias Huck; Rico Sennrich; Nadir Durrani; Maria Nadejde; Philip Williams; Philipp Koehn; Teresa Herrmann; Eunah Cho; Alex Waibel

This paper describes one of the collaborative efforts within EU-BRIDGE to further advance the state of the art in machine translation between two European language pairs, German→English and English→German. Three research institutes involved in the EU-BRIDGE project combined their individual machine translation systems and participated with a joint setup in the shared translation task of the evaluation campaign at the ACL 2014 Eighth Workshop on Statistical Machine Translation (WMT 2014). We combined up to nine different machine translation engines via system combination. RWTH Aachen University, the University of Edinburgh, and Karlsruhe Institute of Technology developed several individual systems which serve as system combination input. We devoted special attention to building syntax-based systems and combining them with the phrase-based ones. The joint setups yield empirical gains of up to 1.6 points in BLEU and 1.0 points in TER on the WMT newstest2013 test set compared to the best single systems.

Proceedings of the Second Conference on Machine Translation | 2017

The University of Edinburgh's Neural MT Systems for WMT17

Rico Sennrich; Alexandra Birch; Anna Currey; Ulrich Germann; Barry Haddow; Kenneth Heafield; Antonio Barone; Philip Williams

This paper describes the University of Edinburgh's submissions to the WMT17 shared news translation and biomedical translation tasks. We participated in 12 translation directions for news, translating between English and Czech, German, Latvian, Russian, Turkish and Chinese. For the biomedical task we submitted systems for English to Czech, German, Polish and Romanian. Our systems are neural machine translation systems trained with Nematus, an attentional encoder-decoder. We follow our setup from last year and build BPE-based models with parallel and back-translated monolingual training data. Novelties this year include the use of deep architectures, layer normalization, and more compact models due to weight tying and improvements in BPE segmentations. We perform extensive ablative experiments, reporting on the effectiveness of layer normalization, deep architectures, and different ensembling techniques.

Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers | 2016

Edinburgh's Statistical Machine Translation Systems for WMT16

Philip Williams; Rico Sennrich; Maria Nadejde; Matthias Huck; Barry Haddow; Ondrej Bojar

This paper describes the University of Edinburgh’s phrase-based and syntax-based submissions to the shared translation tasks of the ACL 2016 First Conference on Machine Translation (WMT16). We submitted five phrase-based and five syntax-based systems for the news task, plus one phrase-based system for the biomedical task.

Computer Speech & Language | 2015

A tree does not make a well-formed sentence: Improving syntactic string-to-tree statistical machine translation with more linguistic knowledge

Rico Sennrich; Philip Williams; Matthias Huck

Synchronous context-free grammars (SCFGs) can be learned from parallel texts that are annotated with target-side syntax, and can produce translations by building target-side syntactic trees from source strings. Ideally, producing syntactic trees would entail that the translation is grammatically well-formed, but in reality, this is often not the case. Focusing on translation into German, we discuss various ways in which string-to-tree translation models over- or undergeneralise. We show how these problems can be addressed by choosing a suitable parser and modifying its output, by introducing linguistic constraints that enforce morphological agreement and constrain subcategorisation, and by modelling the productive generation of German compounds.

Synthesis Lectures on Human Language Technologies | 2016

Syntax-based Statistical Machine Translation

Philip Williams; Rico Sennrich; Matt Post; Philipp Koehn

This unique book provides a comprehensive introduction to the most popular syntax-based statistical machine translation models, filling a gap in the current literature for researchers and developers in human language technologies. While phrase-based models have previously dominated the field, syntax-based approaches have proved a popular alternative, as they elegantly solve many of the shortcomings of phrase-based models. The heart of this book is a detailed introduction to decoding for syntax-based models. The book begins with an overview of synchronous context-free grammar (SCFG) and synchronous tree-substitution grammar (STSG) along with their associated statistical models. It also describes how three popular instantiations (Hiero, SAMT, and GHKM) are learned from parallel corpora. It introduces and details hypergraphs and associated general algorithms, as well as algorithms for decoding with both tree and string input. Special attention is given to efficiency, including search approximations ...

Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra) | 2014

Using Feature Structures to Improve Verb Translation in English-to-German Statistical MT

Philip Williams; Philipp Koehn

SCFG-based statistical MT models have proven effective for modelling syntactic aspects of translation, but still suffer problems of overgeneration. The production of German verbal complexes is particularly challenging since highly discontiguous constructions must be formed consistently, often from multiple independent rules. We extend a strong SCFG-based string-to-tree model to incorporate a rich feature-structure based representation of German verbal complex types and compare verbal complex production against that of the reference translations, finding a high baseline rate of error. By developing model features that use source-side information to influence the production of verbal complexes we are able to substantially improve the type accuracy as compared to the reference.

Morgan & Claypool Publishers | 2016