Jean Senellart | Researchain

Publication

Featured research published by Jean Senellart.

workshop on statistical machine translation | 2007

Statistical Post-Editing on SYSTRAN's Rule-Based Translation System

Loïc Dugast; Jean Senellart; Philipp Koehn

This article describes the combination of a SYSTRAN system with a statistical post-editing (SPE) system. We document qualitative analysis on two experiments performed in the shared task of the ACL 2007 Workshop on Statistical Machine Translation. Comparative results and more integrated hybrid techniques are discussed.

meeting of the association for computational linguistics | 2017

OpenNMT: Open-Source Toolkit for Neural Machine Translation

Guillaume Klein; Yoon Kim; Yuntian Deng; Jean Senellart; Alexander M. Rush

We describe an open-source toolkit for neural machine translation (NMT). The toolkit prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research into model architectures, feature representations, and source modalities, while maintaining competitive performance and reasonable training requirements. The toolkit consists of modeling and translation support, as well as detailed pedagogical documentation about the underlying techniques.

workshop on statistical machine translation | 2009

SMT and SPE Machine Translation Systems for WMT'09

Holger Schwenk; Sadaf Abdul Rauf; Loïc Barrault; Jean Senellart

This paper describes the development of several machine translation systems for the 2009 WMT shared task evaluation. We only consider the translation between French and English. We describe a statistical system based on the Moses decoder and a statistical post-editing system using SYSTRAN's rule-based system. We also investigated techniques to automatically extract additional bilingual texts from comparable corpora.

workshop on statistical machine translation | 2008

Can we Relearn an RBMT System?

Loïc Dugast; Jean Senellart; Philipp Koehn

This paper describes SYSTRAN's submissions for the shared task of the third Workshop on Statistical Machine Translation at ACL. Our main contribution consists in a French-English statistical model trained without the use of any human-translated parallel corpus. Instead, we translated a monolingual corpus with the SYSTRAN rule-based translation engine to produce the parallel corpus. The results are provided herein, along with a measure of error analysis.
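
The corpus-construction step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: `build_synthetic_corpus` and `toy_rbmt` are hypothetical names, and the word-for-word lexicon merely stands in for a real rule-based engine.

```python
from typing import Callable, Iterable, List, Tuple


def build_synthetic_corpus(
    monolingual: Iterable[str],
    translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each non-empty source sentence with its machine translation."""
    pairs = []
    for sentence in monolingual:
        sentence = sentence.strip()
        if not sentence:
            continue  # skip blank lines in the monolingual corpus
        pairs.append((sentence, translate(sentence)))
    return pairs


def toy_rbmt(sentence: str) -> str:
    # Hypothetical stand-in for a rule-based engine: word-for-word lookup.
    lexicon = {"le": "the", "chat": "cat", "dort": "sleeps"}
    return " ".join(lexicon.get(w, w) for w in sentence.lower().split())


corpus = build_synthetic_corpus(["Le chat dort", "", "le chat"], toy_rbmt)
```

The resulting (source, machine translation) pairs would then serve as training data for a statistical system in place of human-translated text.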

workshop on statistical machine translation | 2009

Statistical Post Editing and Dictionary Extraction: Systran/Edinburgh Submissions for ACL-WMT2009

Loïc Dugast; Jean Senellart; Philipp Koehn

We describe here the two Systran/University of Edinburgh submissions for WMT2009. They involve a statistical post-editing model with a particular handling of named entities (English to French and German to English) and the extraction of phrasal rules (English to French).

workshop on statistical machine translation | 2008

First Steps towards a General Purpose French/English Statistical Machine Translation System

Holger Schwenk; Jean-Baptiste Fouet; Jean Senellart

This paper describes an initial version of a general purpose French/English statistical machine translation system. The main features of this system are the open-source Moses decoder, the integration of a bilingual dictionary and a continuous space target language model. We analyze the performance of this system on the test data of the WMT08 evaluation.

meeting of the association for computational linguistics | 1998

Locating Noun Phrases with Finite State Transducers

Jean Senellart

We present a method for constructing, maintaining and consulting a database of proper nouns. We describe noun phrases composed of a proper noun and/or a description of a human occupation. They are formalized by finite state transducers (FST) and large coverage dictionaries and are applied to a corpus of newspapers. We take into account synonymy and hyperonymy. This first stage of our parsing procedure has a high degree of accuracy. We show how we can handle requests such as: Find all newspaper articles in a general corpus mentioning the French prime minister, or How is Mr. X referred to in the corpus; what have been his different occupations throughout the period over which our corpus extends? In the first case, nontrivial occurrences of noun phrases are located, that is, phrases not containing words present in the request, but either synonyms or proper nouns relevant to the request. The results of the search are far better than those obtained by a keyword-based engine. Most answers are correct, except for some cases of homonymy (where a human reader would also fail without more context). Also, the treatment of people having several different occupations is not fully resolved. We have built, for French, a library of about one thousand such FSTs, and English FSTs are under construction. The same method can be used to locate and propose new proper nouns, simply by replacing given proper names in the same FSTs by variables.

international conference on computational linguistics | 2008

Tighter Integration of Rule-Based and Statistical MT in Serial System Combination

Nicola Ueffing; Jens Stephan; Loïc Dugast; George F. Foster; Roland Kuhn; Jean Senellart; Jin Yang

Recent papers have described machine translation (MT) based on an automatic post-editing or serial combination strategy whereby the input language is first translated into the target language by a rule-based MT (RBMT) system, then the target language output is automatically post-edited by a phrase-based statistical machine translation (SMT) system. This approach has been shown to improve MT quality over RBMT or SMT alone. In this previous work, there was a very loose coupling between the two systems: the SMT system only had access to the final 1-best translations from RBMT. Furthermore, the previous work involved European language pairs and relatively small training corpora. In this paper, we describe a more tightly integrated serial combination for the Chinese-to-English MT task. We will present experimental evaluation results on the 2008 NIST constrained data track where a significant gain in terms of both automatic and subjective metrics is achieved through the tighter coupling of the two systems.

recent advances in natural language processing | 2017

Domain Control for Neural Machine Translation

Catherine Kobus; Josep Maria Crego; Jean Senellart

Machine translation systems are very sensitive to the domains they were trained on. Several domain adaptation techniques have been studied in depth. We propose a new technique for neural machine translation (NMT) that we call domain control, which is performed at runtime using a unique neural network covering multiple domains. The presented approach shows quality improvements when compared to dedicated domains translating on any of the covered domains and even on out-of-domain data. In addition, model parameters do not need to be re-estimated for each domain, making this effective for real use cases. Evaluation is carried out on English-to-French translation for two different testing scenarios. We first consider the case where an end-user performs translations on a known domain. Secondly, we consider the scenario where the domain is not known and is predicted at the sentence level before translating. Results show consistent accuracy improvements for both conditions.
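
The two testing scenarios in this abstract can be illustrated with a small sketch. The abstract does not say how the domain is signaled to the network; prepending a tag token to the source sentence is one common approach and is an assumption here, as are the `@label@` tag format and the names `add_domain_tag` and `toy_predict`.

```python
from typing import Callable, Optional


def add_domain_tag(
    sentence: str,
    domain: Optional[str],
    predict: Callable[[str], str],
) -> str:
    """Prefix a source sentence with a domain tag (assumed '@label@' format).

    If the domain is known it is used directly; otherwise it is predicted
    for this sentence before translation.
    """
    label = domain if domain is not None else predict(sentence)
    return f"@{label}@ {sentence}"


def toy_predict(sentence: str) -> str:
    # Hypothetical stand-in for a sentence-level domain classifier.
    return "medical" if "patient" in sentence.lower() else "news"


known = add_domain_tag("Restart the server.", "it", toy_predict)
guessed = add_domain_tag("The patient is stable.", None, toy_predict)
```

In both cases a single multi-domain model would receive the tagged sentence, so no per-domain parameters need to be re-estimated.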

Archive | 2010