Publication


Featured research published by Johann Roturier.


Machine Translation | 2014

Quality evaluation in community post-editing

Linda Mitchell; Sharon O'Brien; Johann Roturier

Machine translation is increasingly being deployed to translate user-generated content (UGC). In many situations, post-editing is required to ensure that the translations are correct and comprehensible for the users. Post-editing by professional translators is not always feasible in the context of UGC within online communities and so members of such communities are sometimes asked to translate or post-edit content on behalf of the community. How should we measure the quality of UGC that has been post-edited by community members? Is quality evaluation by community members a feasible alternative to professional evaluation techniques? This paper describes the outcomes of three quality evaluation methods for community post-edited content: (1) an error annotation performed by a trained linguist; (2) evaluation of fluency and fidelity by domain specialists; (3) evaluation of fluency by community members. The study finds that there are correlations of evaluation results between the domain specialist evaluation and the community evaluation for content machine translated from English into German in an online technical support community. Interestingly, the community evaluators were more critical in their ratings for fluency than the domain experts. Although the results of the error annotation seem to contradict those obtained in the domain specialist evaluation, a higher number of errors in the error annotation appears to result in lower scores in the domain specialist evaluation. We conclude that, within the context of this evaluation, post-editing by community members is feasible, though with considerable variation across individuals, and that evaluation by the community is also a feasible proposition.


Empirical Methods in Natural Language Processing | 2015

Foreebank: Syntactic Analysis of Customer Support Forums

Rasoul Samad Zadeh Kaljahi; Jennifer Foster; Johann Roturier; Corentin Ribeyre; Teresa Lynn; Joseph Le Roux

We present a new treebank of English and French technical forum content which has been annotated for grammatical errors and phrase structure. This double annotation allows us to empirically measure the effect of errors on parsing performance. While it is slightly easier to parse the corrected versions of the forum sentences, the errors are not the main factor in making this kind of text hard to parse.


Machine Translation | 2015

Quality estimation-guided supplementary data selection for domain adaptation of statistical machine translation

Pratyush Banerjee; Raphael Rubino; Johann Roturier; Josef van Genabith

The problem of domain adaptation in statistical machine translation systems emanates from the fundamental assumption that test and training data are drawn independently from the same distribution (topic, domain, genre, style etc.). In real-life translation tasks, the sparseness of in-domain parallel training data often leads to poor model estimation, and consequently poor translation quality. Domain adaptation by supplementary data selection aims at addressing this specific issue by selecting relevant parallel training data from out-of-domain or general-domain bi-text to enhance the quality of a poor baseline system. State-of-the-art research in data selection focuses on the development of novel similarity measures to improve the relevance of selected data. However, in this paper we approach the problem from a different perspective. In contrast to the conventional approach of using the entire available target-domain data as a reference for supplementary data selection, we restrict the reference set to only those sentences that are expected to be poorly translated by the baseline MT system using a Quality Estimation model. Our rationale is to focus help (i.e. supplementary training material) where it is needed most. Automatic quality estimation techniques are used to identify such poorly translated sentences in the target domain. The experiments reported in this paper show that (i) this technique provides statistically significant improvements over the unadapted baseline translation and (ii) using significantly smaller amounts of supplementary data our approach achieves results comparable to state-of-the-art approaches using conventional reference sets.


Joint Conference on Lexical and Computational Semantics | 2014

Semantic Role Labelling with minimal resources: Experiments with French

Rasoul Samad Zadeh Kaljahi; Jennifer Foster; Johann Roturier

This paper describes a series of French semantic role labelling experiments which show that a small set of manually annotated training data is superior to a much larger set containing semantic role labels which have been projected from a source language via word alignment. Using universal part-of-speech tags and dependencies makes little difference over the original fine-grained tagset and dependency scheme. Moreover, there seems to be no improvement gained from projecting semantic roles between direct translations rather than between indirect translations.


Conference of the European Chapter of the Association for Computational Linguistics | 2014

The ACCEPT Portal: An Online Framework for the Pre-editing and Post-editing of User-Generated Content

Violeta Seretan; Johann Roturier; David Silva; Pierrette Bouillon

With the development of Web 2.0, a lot of content is nowadays generated online by users. Due to its characteristics (e.g., use of jargon and abbreviations, typos, grammatical and style errors), the user-generated content poses specific challenges to machine translation. This paper presents an online platform devoted to the pre-editing of user-generated content and its post-editing, two main types of human assistance strategies which are combined with domain adaptation and other techniques in order to improve the translation of this type of content. The platform has recently been released publicly and is being tested by two main types of user communities, namely, technical forum users and volunteer translators.


Empirical Methods in Natural Language Processing | 2014

Syntax and Semantics in Quality Estimation of Machine Translation

Rasoul Samad Zadeh Kaljahi; Jennifer Foster; Johann Roturier

We employ syntactic and semantic information in estimating the quality of machine translation from a new data set which contains source text from English customer support forums and target text consisting of its machine translation into French. These translations have been both post-edited and evaluated by professional translators. We find that quality estimation using syntactic and semantic information on this data set can hardly improve over a baseline which uses only surface features. However, the performance can be improved when they are combined with such surface features. We also introduce a novel metric to measure translation adequacy based on predicate-argument structure match using word alignments. While word alignments can be reliably used, the two main factors affecting the performance of all semantic-based methods seem to be the low quality of semantic role labelling (especially on ill-formed text) and the lack of nominal predicate annotation.


Machine Translation | 2015

Miguel Á. Bernal-Merino: Translation and localisation in video games: making entertainment software global

Johann Roturier

Translation and Localisation in Video Games is a monograph aimed at scholars, researchers and students in the field of translation studies who wish to specialize in this revenue-generating type of entertainment software. Existing localisation practitioners and video game publishers should also benefit from this book to better understand the cultural issues that are at stake when going global. As the subtitle, Making Entertainment Software Global, suggests, the book focuses on the translation and localisation work that is required to adapt video games to other cultures and locales. Even though the video games industry is one of the fastest growing segments of the entertainment industry, it has received little academic or industrial focus apart from the recent work of O’Hagan and Mangiron (2013) and Chandler and Deming (2011). This monograph is divided into five core chapters, which are preceded by a short introduction and followed by a brief conclusion. All of these chapters (apart from the conclusion) finish with a list of possible research projects (or practical tasks) that can be used by the individual reader to think further about some of the topics covered in the chapters or in a classroom as activities with students. Even though each chapter could be easily read as a standalone piece (e.g. localisation practitioners may be tempted to focus on Chapter 5 while educators may want to focus on Chapter 6), a linear reading provides detailed background and context for any of the remaining chapters. And context matters: throughout this monograph, one of the key messages of the author is that translators often do not currently have enough context to produce quality translations to create an immersive experience for users. This observation, which is not unique to the video game localisation industry, is obviously significant for those story-rich games where the user plays alone for a long period of time, but


Archive | 2012

Domain Adaptation in SMT of User-Generated Forum Content Guided by OOV Word Reduction: Normalization and/or Supplementary Data?

Pratyush Banerjee; Sudip Kumar Naskar; Johann Roturier; Andy Way; Josef van Genabith


Workshop on Statistical Machine Translation | 2012

DCU-Symantec Submission for the WMT 2012 Quality Estimation Task

Raphael Rubino; Jennifer Foster; Joachim Wagner; Johann Roturier; Rasoul Samad Zadeh Kaljahi; Fred Hollowood


Archive | 2010

Improving the post-editing experience using translation recommendation: a user study

Yifan He; Yanjun Ma; Johann Roturier; Andy Way; Josef van Genabith

Collaboration


Dive into Johann Roturier's collaborations.

Top Co-Authors


Andy Way

Dublin City University