Philippe Langlais | Researchain

Publication

Featured research published by Philippe Langlais.

Empirical Methods in Natural Language Processing | 2005

Translating with Non-contiguous Phrases

Michel Simard; Nicola Cancedda; Bruno Cavestro; Marc Dymetman; Eric Gaussier; Cyril Goutte; Kenji Yamada; Philippe Langlais; Arne Mauser

This paper presents a phrase-based statistical machine translation method based on non-contiguous phrases, i.e. phrases with gaps. A method for producing such phrases from word-aligned corpora is proposed. A statistical translation model that deals with such phrases is also presented, along with a training method based on maximizing translation accuracy as measured with the NIST evaluation metric. Translations are produced by a beam-search decoder. Experimental results demonstrate that the proposed method generalizes better from the training data.

North American Chapter of the Association for Computational Linguistics | 2000

TransType: a computer-aided translation typing system

Philippe Langlais; George F. Foster; Guy Lapalme

This paper describes the embedding of a statistical translation system within a text editor to produce TRANSTYPE, a system that watches over the user as he or she types a translation and repeatedly suggests completions for the text already entered. This innovative Embedded Machine Translation system is thus a specialized means of helping produce high quality translations.
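
To make the completion idea concrete, here is a minimal sketch in Python. It is not the TransType implementation: the `candidates` table, its scores, and the function name `suggest_completion` are hypothetical stand-ins for whatever a statistical translation model would propose for the current position. The sketch only shows how a typed prefix can narrow the candidates down to a single suggestion.

```python
from typing import Dict, Optional


def suggest_completion(prefix: str, candidates: Dict[str, float]) -> Optional[str]:
    """Return the highest-scoring candidate word that extends the typed prefix.

    `candidates` maps target-language words to scores; in a real system these
    would come from a translation model (assumed here, not specified by the
    abstract).
    """
    # Keep only words that strictly extend what the user has already typed.
    matches = [
        (score, word)
        for word, score in candidates.items()
        if word.startswith(prefix) and word != prefix
    ]
    if not matches:
        return None  # nothing to suggest for this prefix
    best_score, best_word = max(matches, key=lambda m: m[0])
    return best_word


# Hypothetical scores for the next target word.
candidates = {"maison": 0.5, "main": 0.3, "chambre": 0.2}
suggestion = suggest_completion("ma", candidates)
```

Each new keystroke would call the function again with the longer prefix, so the suggestion is revised as the translator types.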

Machine Translation | 2002

TransType: Development-Evaluation Cycles to Boost Translator's Productivity

Philippe Langlais; Guy Lapalme

We present TransType, a new approach to Machine-Aided Translation in which the human translator maintains control of the translation process while being helped by real-time completions proposed by a statistical translation engine. The TransType approach is first presented through a series of prototypes that illustrate their underlying translation model and graphical interface. The results of two rounds of in situ evaluation of TransType prototypes are discussed, followed by a set of lessons learned in these experiments. We show that translators value this approach but that, given the short time allotted for the evaluation, they were not able to measurably increase their productivity. TransType is compared with other approaches, and new perspectives are elaborated for a new version being developed in the context of a Fifth Framework European Community Project.

Meeting of the Association for Computational Linguistics | 1998

Methods and Practical Issues in Evaluating Alignment Techniques

Philippe Langlais; Michel Simard; Jean Véronis

This paper describes the work achieved in the first half of a 4-year cooperative research project (ARCADE), financed by AUPELF-UREF. The project is devoted to the evaluation of parallel text alignment techniques. In its first period, ARCADE ran a competition between six systems on a sentence-to-sentence alignment task, which yielded two main types of results. First, a large reference bilingual corpus comprising texts of different genres was created, each presenting various degrees of difficulty with respect to the alignment task. Second, significant methodological progress was made both on the evaluation protocols and metrics, and on the algorithms used by the different systems. For the second phase, which is now underway, ARCADE has been opened to a larger number of teams who will tackle the problem of word-level alignment.

Archive | 2000

Evaluation of parallel text alignment systems

Jean Véronis; Philippe Langlais

This chapter describes the ARCADE project, concerned with the evaluation of parallel text alignment systems. The project is composed of two tracks, devoted to the evaluation of alignment at sentence and word level respectively, and is planned for a four-year period. At the time of this report, twelve systems have participated in the sentence track, and five in the word track. Substantial progress has been made on the evaluation methodology, metrics and protocols, and a large reference corpus has been produced. The results show that sentence-level alignment is quite satisfactory (over 98.5% accuracy on “normal” texts), although it degrades sharply for texts that do not match perfectly at the structural level (e.g., missing fragments or order differences). State-of-the-art word alignment systems leave considerable room for improvement, since they reach only ca. 75% accuracy on the “translation spotting” task on which they were evaluated.
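
As an illustration of how an alignment can be scored against a reference, the following Python sketch computes precision, recall and F-measure over sets of aligned pairs. This is a common scheme and an assumption here: the abstract does not spell out the exact ARCADE metrics, and the `Link` type and `alignment_scores` name are hypothetical.

```python
from typing import Set, Tuple

# A link pairs a source-segment index with a target-segment index.
Link = Tuple[int, int]


def alignment_scores(reference: Set[Link], proposed: Set[Link]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) of proposed links against a reference."""
    correct = len(reference & proposed)  # links found in both sets
    precision = correct / len(proposed) if proposed else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy example: four reference links, three proposed, two of them correct.
reference = {(0, 0), (1, 1), (2, 2), (3, 3)}
proposed = {(0, 0), (1, 1), (2, 3)}
precision, recall, f_measure = alignment_scores(reference, proposed)
```

In this toy case two of the three proposed links are correct and two of the four reference links are recovered, so precision is 2/3 and recall is 1/2.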

Machine Translation | 2000

Unit Completion for a Computer-aided Translation Typing System

Philippe Langlais; George F. Foster; Guy Lapalme

This work is in the context of TransType, a system that watches over the users as they type a translation and repeatedly suggests completions for the text already entered. The users may either accept, modify, or ignore these suggestions. We describe the design, implementation, and performance of a prototype which suggests completions of units of text that are longer than one word.

Workshop on Statistical Machine Translation | 2008

Limsi's Statistical Translation Systems for WMT'08

Daniel Déchelotte; Gilles Adda; Alexandre Allauzen; Hélène Bonneau-Maynard; Olivier Galibert; Jean-Luc Gauvain; Philippe Langlais; François Yvon

This paper describes our statistical machine translation systems based on the Moses toolkit for the WMT08 shared task. We address the Europarl and News conditions for the following language pairs: English with French, German and Spanish. For Europarl, n-best rescoring is performed using an enhanced n-gram or a neuronal language model; for the News condition, language models incorporate extra training data. We also report unconvincing results of experiments with factored models.

International Conference on Computational Linguistics | 2002

Improving a general-purpose Statistical Translation Engine by terminological lexicons

Philippe Langlais

The past decade has witnessed exciting work in the field of Statistical Machine Translation (SMT). However, accurate evaluation of its potential in real-life contexts is still a questionable issue. In this study, we investigate the behavior of an SMT engine faced with a corpus far different from the one it has been trained on. We show that terminological databases are obvious resources that should be used to boost the performance of a statistical engine. We propose and evaluate a way of integrating terminology into an SMT engine which yields a significant reduction in word error rate.
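
For readers unfamiliar with the metric, word error rate is commonly defined as the word-level edit distance between a system output and a reference translation, divided by the reference length. The Python sketch below follows that standard definition; it is not the evaluation script used in the paper, and the whitespace tokenization and the function name `word_error_rate` are simplifying assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()  # naive whitespace tokenization (an assumption)
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words yields an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
wer = word_error_rate("a b c d", "a x c")
```

A lower value is better, and the value can exceed 1.0 when the hypothesis is much longer than the reference.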

Machine Translation | 2010

TransSearch: from a bilingual concordancer to a translation finder

Julien Bourdaillet; Stéphane Huet; Philippe Langlais; Guy Lapalme

As basic as bilingual concordancers may appear, they are some of the most widely used computer-assisted translation tools among professional translators. Nevertheless, they still do not benefit from recent breakthroughs in machine translation. This paper describes the improvement of the commercial bilingual concordancer TransSearch in order to embed a word alignment feature. The use of statistical word alignment methods allows the system to spot user query translations, and thus the tool is transformed into a translation search engine. We describe several translation identification and postprocessing algorithms that enhance the application. The excellent results obtained using a large translation memory consisting of 8.3 million sentence pairs are confirmed via human evaluation.

Meeting of the Association for Computational Linguistics | 2009

Improvements in Analogical Learning: Application to Translating Multi-Terms of the Medical Domain

Philippe Langlais; François Yvon; Pierre Zweigenbaum

Handling terminology is an important matter in a translation workflow. However, current Machine Translation (MT) systems do not yet offer anything proactive beyond tools that assist in managing terminological databases. In this work, we investigate several enhancements to analogical learning and test our implementation on translating medical terms. We show that the analogical engine works equally well when translating from and into a morphologically rich language, or when dealing with language pairs written in different scripts. Combining it with a phrase-based statistical engine leads to significant improvements.
