Loïc Barrault | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Loïc Barrault is active.

Explore More

Publication

Featured researches published by Loïc Barrault.

empirical methods in natural language processing | 2017

Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

Alexis Conneau; Douwe Kiela; Holger Schwenk; Loïc Barrault; Antoine Bordes

Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features. Efforts to obtain embeddings for larger chunks of text, such as sentences, have however not been so successful. Several attempts at learning unsupervised representations of sentences have not reached satisfactory enough performance to be widely adopted. In this paper, we show how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks. Much like how computer vision uses ImageNet to obtain features, which can then be transferred to other tasks, our work tends to indicate the suitability of natural language inference for transfer learning to other NLP tasks. Our encoder is publicly available.

The Prague Bulletin of Mathematical Linguistics | 2010

Open Source Machine Translation System Combination

Loïc Barrault

Many: Open Source Machine Translation System Combination This paper describes a push-the-button MT system combination toolkit. The combination is based on the creation of a lattice made on several confusion networks (CN) connected together. This lattice is then decoded with a token-pass decoder to provide the best and/or n-best outputs. Each CN is built using a modified version of the TERp tool. The toolkit is made of several scripts along a core program developed in Java. It is totally configurable and the parameters can be tuned quite easily.

workshop on statistical machine translation | 2009

SMT and SPE Machine Translation Systems for WMT'09

Holger Schwenk; Sadaf Abdul Rauf; Loïc Barrault; Jean Senellart

This paper describes the development of several machine translation systems for the 2009 WMT shared task evaluation. We only consider the translation between French and English. We describe a statistical system based on the Moses decoder and a statistical post-editing system using SYSTRANs rule-based system. We also investigated techniques to automatically extract additional bilingual texts from comparable corpora.

international conference on acoustics, speech, and signal processing | 2008

Frame-based acoustic feature integration for speech understanding

Loïc Barrault; Christophe Servan; Driss Matrouf; Georges Linarès; R. De Mori

With the purpose of improving spoken language understanding (SLU) performance, a combination of different acoustic speech recognition (ASR) systems is proposed. State a posteriori probabilities obtained with systems using different acoustic feature sets are combined with log-linear interpolation. In order to perform a coherent combination of these probabilities, acoustic models must have the same topology (i.e. same set of states). For this purpose, a fast and efficient twin model training protocol is proposed. By a wise choice of acoustic feature sets and log-linear interpolation of their likelihood ratios, a substantial concept error rate (CER) reduction has been observed on the test part of the French MEDIA corpus.

arXiv: Computation and Language | 2016

Does Multimodality Help Human and Machine for Translation and Image Captioning

Ozan Caglayan; Walid Aransa; Yaxing Wang; Marc Masana; Mercedes García-Martínez; Fethi Bougares; Loïc Barrault; Joost van de Weijer

This paper presents the systems developed by LIUM and CVC for the WMT16 Multimodal Machine Translation challenge. We explored various comparative methods, namely phrase-based systems and attentional recurrent neural networks models trained using monomodal or multimodal data. We also performed a human evaluation in order to estimate the usefulness of multimodal data for human machine translation and image description generation. Our systems obtained the best results for both tasks according to the automatic evaluation metrics BLEU and METEOR.

Machine Translation | 2014

Translation project adaptation for MT-enhanced computer assisted translation

Mauro Cettolo; Nicola Bertoldi; Marcello Federico; Holger Schwenk; Loïc Barrault; Christophe Servan

The effective integration of MT technology into computer-assisted translation tools is a challenging topic both for academic research and the translation industry. In particular, professional translators consider the ability of MT systems to adapt to the feedback provided by them to be crucial. In this paper, we propose an adaptation scheme to tune a statistical MT system to a translation project using small amounts of post-edited texts, like those generated by a single user in even just one day of work. The same scheme can be applied on a larger scale in order to focus general purpose models towards the specific domain of interest. We assess our method on two domains, namely information technology and legal, and four translation directions, from English to French, Italian, Spanish and German. The main outcome is that our adaptation strategy can be very effective provided that the seed data used for adaptation is ‘close enough’ to the remaining text to be translated; otherwise, MT quality neither improves nor worsens, thus showing the robustness of our method.

International Conference on NLP | 2012

Parallel Texts Extraction from Multimodal Comparable Corpora

Haithem Afli; Loïc Barrault; Holger Schwenk

Statistical machine translation (SMT) systems depend on the availability of domain-specific bilingual parallel text. However parallel corpora are a limited resource and they are often not available for some domains or language pairs. We analyze the feasibility of extracting parallel sentences from multimodal comparable corpora. This work extends the use of comparable corpora by using audio sources instead of texts on the source side. The audio is transcribed by an automatic speech recognition system and translated with a baseline SMT system. We then use information retrieval in a large text corpus in the target language to extract parallel sentences. We have performed a series of experiments on data of the IWSLT’11 speech translation task that shows the feasibility of our approach.

international conference on acoustics, speech, and signal processing | 2006

Characterizing Feature Variability in Automatic Speech Recognition Systems

Loïc Barrault; Driss Matrouf; R. De Mori; Roberto Gemello; Franco Mana

A method is described for predicting acoustic feature variability by analyzing the consensus and relative entropy of phoneme posterior probability distributions obtained with different acoustic models having the same type of observations. Variability prediction is used for diagnosis of automatic speech recognition (ASR) systems. When errors are likely to occur, different feature sets are considered for correcting recognition results. Experimental results are provided on the CH1 Italian portion of AURORA3

The Prague Bulletin of Mathematical Linguistics | 2017

NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems

Ozan Caglayan; Mercedes García-Martínez; Adrien Bardet; Walid Aransa; Fethi Bougares; Loïc Barrault

Abstract In this paper, we present nmtpy, a flexible Python toolkit based on Theano for training Neural Machine Translation and other neural sequence-to-sequence architectures. nmtpy decouples the specification of a network from the training and inference utilities to simplify the addition of a new architecture and reduce the amount of boilerplate code to be written. nmtpy has been used for LIUM’s top-ranked submissions to WMT Multimodal Machine Translation and News Translation tasks in 2016 and 2017.

Computer Speech & Language | 2017

Introduction to the special issue on deep learning approaches for machine translation

Marta Ruiz Costa-Jussà; Alexandre Allauzen; Loïc Barrault; Kyunghun Cho; Holger Schwenk

Deep learning is revolutionizing speech and natural language technologies since it is offering an effective way to train systems and obtaining significant improvements. The main advantage of deep learning is that, by developing the right architecture, the system automatically learns features from data without the need of explicitly designing them. This machine learning perspective is conceptually changing how speech and natural language technologies are addressed. In the case of Machine Translation (MT), deep learning was first introduced in standard statistical systems. By now, end-to-end neural MT systems have reached competitive results. This special issue introductory paper addresses how deep learning has been gradually introduced in MT. This introduction covers all topics contained in the papers included in this special issue, which basically are: integration of deep learning in statistical MT; development of the end-to-end neural MT system; and introduction of deep learning in interactive MT and MT evaluation. Finally, this introduction sketches some research directions that MT is taking guided by deep learning.

Explore More