Felix Stahlberg
University of Cambridge
Publications
Featured research published by Felix Stahlberg.
Meeting of the Association for Computational Linguistics | 2016
Felix Stahlberg; Eva Hasler; Aurelien Waite; Bill Byrne
We investigate the use of hierarchical phrase-based SMT lattices in end-to-end neural machine translation (NMT). Weight pushing transforms the Hiero scores for complete translation hypotheses, with the full translation grammar score and full n-gram language model score, into posteriors compatible with NMT predictive probabilities. With a slightly modified NMT beam-search decoder we find gains over both Hiero and NMT decoding alone, with practical advantages in extending NMT to very large input and output vocabularies.
International Conference on Natural Language Generation | 2017
Eva Hasler; Felix Stahlberg; Marcus Tomalin; Adrià de Gispert; Bill Byrne
We compare several language models for the word-ordering task and propose a new bag-to-sequence neural model based on attention-based sequence-to-sequence models. We evaluate the model on a large German WMT data set where it significantly outperforms existing models. We also describe a novel search strategy for LM-based word ordering and report results on the English Penn Treebank. Our best model setup outperforms prior work both in terms of speed and quality.
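LM-based word ordering can be pictured as a search over orderings of a bag of words scored left to right by a language model. The sketch below uses a toy bigram table rather than the paper's bag-to-sequence neural model; all scores are illustrative assumptions:

```python
# Toy bigram log-probabilities; a real system would use a trained neural
# or n-gram language model (these numbers are purely illustrative).
BIGRAMS = {
    ("<s>", "the"): -0.3, ("the", "cat"): -0.5,
    ("cat", "sat"): -0.4, ("sat", "</s>"): -0.2,
}
DEFAULT = -5.0  # back-off score for unseen bigrams

def lm_score(prev, tok):
    return BIGRAMS.get((prev, tok), DEFAULT)

def order_bag(bag, beam_size=3):
    """Beam search over orderings of `bag`, scored left to right by the LM."""
    beam = [(0.0, ["<s>"], list(bag))]  # (score, sequence, remaining words)
    for _ in range(len(bag)):
        candidates = []
        for score, seq, remaining in beam:
            for i, tok in enumerate(remaining):
                candidates.append((score + lm_score(seq[-1], tok),
                                   seq + [tok],
                                   remaining[:i] + remaining[i + 1:]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    # Add the end-of-sentence score and return the best ordering.
    best = max(beam, key=lambda c: c[0] + lm_score(c[1][-1], "</s>"))
    return best[1][1:]

ordered = order_bag(["sat", "the", "cat"])
```

Because the search space grows factorially with bag size, the beam width is what keeps this tractable; novel search strategies, as in the paper, trade off this pruning against quality.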
Computer Speech & Language | 2017
Eva Hasler; Adrià de Gispert; Felix Stahlberg; Aurelien Waite; Bill Byrne
Long and complex input sentences can be a challenge for translation systems. Source simplification is a way to reduce the complexity of the input. Translation lattices allow the output spaces of full and simplified inputs to be combined. Constraining the hypothesis space to translations of simplified inputs can be beneficial. Long sentences with complex syntax and long-distance dependencies pose difficulties for machine translation systems. Short sentences, on the other hand, are usually easier to translate. We study the potential of addressing this mismatch using text simplification: given a simplified version of the full input sentence, can we use it in addition to the full input to improve translation? We show that the spaces of original and simplified translations can be effectively combined using translation lattices, and we compare two decoding approaches that process both inputs at different levels of integration. On source-annotated portions of WMT test sets, and on top of strong baseline systems combining hierarchical and neural translation for two language pairs, we demonstrate that source simplification can help to improve translation quality.
Archive | 2016
Eva Hasler; Adrià de Gispert; Felix Stahlberg; Aurelien Waite; William Byrne
This data set contains subsets of English-German test sets from the Workshop on Machine Translation (WMT) which have been annotated with manual text simplification information on the source side, in the form of gap begin and gap end symbols ( , ). The data was tokenized and truecased using the processing scripts distributed with the Moses SMT system. The source simplifications were produced by workers recruited on the crowdsourcing platform Crowdflower (https://www.crowdflower.com). We asked workers to simplify a sentence by deleting words and punctuation while trying to retain the most important information in the shortened sentence. Their performance was controlled using test questions and a second Crowdflower task in which workers were asked to identify bad simplifications from the first task. The outcomes of the second task were aggregated by combining an agreement score with the average worker trust score for each simplification. We then selected randomly from the simplifications with a combined score of at least 0.5.
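The quality-filtering step described above (combining an agreement score with the average worker trust and keeping simplifications scoring at least 0.5) could be implemented along these lines. The equal-weight average is an assumption for illustration, since the description does not give the exact combination formula:

```python
def combined_score(agreement, avg_trust, weight=0.5):
    # Weighted combination of inter-worker agreement and average worker
    # trust; the 50/50 weighting is an assumption, not the dataset's
    # documented formula.
    return weight * agreement + (1 - weight) * avg_trust

def filter_simplifications(candidates, threshold=0.5):
    """Keep simplifications whose combined score meets the threshold."""
    return [c for c in candidates
            if combined_score(c["agreement"], c["avg_trust"]) >= threshold]

# Hypothetical candidate simplifications with per-item quality signals.
candidates = [
    {"text": "short version A", "agreement": 0.8, "avg_trust": 0.7},
    {"text": "short version B", "agreement": 0.2, "avg_trust": 0.4},
]
kept = filter_simplifications(candidates)
```

The random sampling mentioned in the description would then draw from `kept` rather than from all candidates, so that only simplifications clearing the quality bar enter the released data.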
Conference of the European Chapter of the Association for Computational Linguistics | 2017
Felix Stahlberg; Adrià de Gispert; Eva Hasler; Bill Byrne
arXiv: Computation and Language | 2016
Felix Stahlberg; Eva Hasler; Bill Byrne
Empirical Methods in Natural Language Processing | 2017
Felix Stahlberg; Bill Byrne
Meeting of the Association for Computational Linguistics | 2018
Danielle Saunders; Felix Stahlberg; Adrià de Gispert; Bill Byrne
Meeting of the Association for Computational Linguistics | 2018
Danielle Saunders; Felix Stahlberg; Adrià de Gispert; Bill Byrne
Conference of the Association for Machine Translation in the Americas | 2018
Felix Stahlberg; Danielle Saunders; Gonzalo Iglesias; Bill Byrne