Fabienne Cap
University of Stuttgart
Publications
Featured research published by Fabienne Cap.
Conference of the European Chapter of the Association for Computational Linguistics | 2014
Fabienne Cap; Alexander M. Fraser; Marion Weller; Aoife Cahill
Compounding in morphologically rich languages is a highly productive process which often causes SMT approaches to fail because of unseen words. We present an approach for translation into a compounding language that splits compounds into simple words for training and, due to an underspecified representation, allows for free merging of simple words into compounds after translation. In contrast to previous approaches, we use features projected from the source language to predict compound mergings. We integrate our approach into end-to-end SMT and show that many compounds matching the reference translation are produced which did not appear in the training data. Additional manual evaluations support the usefulness of generalizing compound formation in SMT.
Proceedings of the First Workshop on Computational Approaches to Compound Analysis (ComAComA 2014) | 2014
Marion Weller; Fabienne Cap; Stefan Müller; Sabine Schulte im Walde; Alexander M. Fraser
The paper presents an approach to morphological compound splitting that takes the degree of compositionality into account. We apply our approach to German noun compounds and particle verbs within a German-English SMT system, and study the effect of splitting only compositional compounds as opposed to aggressive splitting. A qualitative study explores the translational behaviour of non-compositional compounds.
North American Chapter of the Association for Computational Linguistics | 2015
Fabienne Cap; Manju Nirmal; Marion Weller; Sabine Schulte im Walde
Support-verb constructions (i.e., multiword expressions combining a semantically light verb with a predicative noun) are problematic for standard statistical machine translation systems, because SMT systems cannot distinguish between literal and idiomatic uses of the verb. We work on the German to English translation direction, for which the identification of support-verb constructions is challenging due to the relatively free word order of German. We show that we achieve improved translation quality for verb-object support-verb constructions by marking the verbs when they occur in such constructions. Additional evaluations revealed that our systems produce more correct verb translations than a contrastive baseline system without verb markup.
Workshop on Statistical Machine Translation | 2014
Fabienne Cap; Marion Weller; Anita Ramm; Alexander M. Fraser
We present the CimS submissions to the 2014 Shared Task for the language pair EN→DE. We address the major problems that arise when translating into German: complex nominal and verbal morphology, productive compounding and flexible word ordering. Our morphology-aware translation systems handle word formation issues on different levels of morpho-syntactic modeling.
Workshop on Statistical Machine Translation | 2015
Fabienne Cap; Marion Weller; Anita Ramm; Alexander M. Fraser
We present the CimS submissions to the WMT 2015 Shared Task for the translation direction English to German. Similar to our previous submissions, all of our systems are aware of the complex nominal morphology of German. In this paper, we combine source-side reordering and target-side compound processing with basic morphological processing in order to obtain improved translation results. We also report on morphological processing for English to French.
Workshop on Statistical Machine Translation | 2014
Daniel Quernheim; Fabienne Cap
We present the IMS-TTT submission to WMT14, an experimental statistical tree-to-tree machine translation system based on the multi bottom-up tree transducer including rule extraction, tuning and decoding. Thanks to input parse forests and a “no pruning” strategy during decoding, the obtained translations are competitive. The drawbacks are a restricted coverage of 70% on test data, in part due to exact input parse tree matching, and a relatively high runtime. Advantages include easy redecoding with a different weight vector, since the full translation forests can be stored after the first decoding pass.
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers | 2016
Jörg Tiedemann; Fabienne Cap; Jenna Kanerva; Filip Ginter; Sara Stymne; Robert Östling; Marion Weller-Di Marco
This paper summarises the contributions of the teams at the University of Helsinki, Uppsala University and the University of Turku to the news translation tasks for translating from and to Finnish. Our models address the problem of treating morphology and data coverage in various ways. We introduce a new efficient tool for word alignment and discuss factorisations, gappy language models and reinflection techniques for generating proper Finnish output. The results demonstrate once again that training data is the most effective way to increase translation performance.
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017) | 2017
Fabienne Cap
Show me your variance and I tell you who you are - Deriving compound compositionality from word alignments
North American Chapter of the Association for Computational Linguistics | 2015
Fabienne Cap; Ina Rösiger; Jonas Kuhn
We present a manually annotated word alignment of Franz Kafka’s “Verwandlung” and use this as a controlled test case to assess the principled usefulness of word alignment as an additional information source for the (monolingually motivated) identification of literary characters, focusing on the technically well-explored task of co-reference resolution. This pilot set-up allows us to illustrate a number of methodological components interacting in a modular architecture. In general, co-reference resolution is a relatively hard task, but the availability of word-aligned translations can provide additional indications, as there is a tendency for translations to explicate underspecified or vague passages.
Conference of the European Chapter of the Association for Computational Linguistics | 2012
Alexander M. Fraser; Marion Weller; Aoife Cahill; Fabienne Cap