Publication


Featured research published by Anthony Rousseau.


The Prague Bulletin of Mathematical Linguistics | 2013

XenC: An Open-Source Tool for Data Selection in Natural Language Processing

Anthony Rousseau

In this paper we describe XenC, an open-source tool for data selection aimed at Natural Language Processing (NLP) in general and Statistical Machine Translation (SMT) or Automatic Speech Recognition (ASR) in particular. Usually, when building an SMT or ASR system, the task at hand is tied to a specific application domain, such as news articles or scientific talks. The goal of XenC is to select data relevant to that task, which is then used to build the statistical models for the system. Selection is done by computing the difference between cross-entropy scores of sentences from a large out-of-domain corpus and sentences from a corpus considered in-domain for the task. Written in C++, the tool can operate on monolingual or bilingual data and is language-independent. XenC, now part of the LIUM toolchain for SMT, has been actively developed since December 2011 and is used in many MT projects.
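The cross-entropy difference described in the abstract can be illustrated with a small Python sketch. This is not XenC's code: it assumes a reading in which each out-of-domain sentence is scored by its cross-entropy under an in-domain model minus its cross-entropy under an out-of-domain model, and it swaps in add-one-smoothed unigram models as a stand-in for real n-gram language models. All function names are hypothetical.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count word frequencies over whitespace-tokenized sentences."""
    counts = Counter(w for s in sentences for w in s.split())
    return counts, sum(counts.values())


def cross_entropy(sentence, model, vocab_size):
    """Per-word cross-entropy in bits under an add-one-smoothed unigram model."""
    counts, total = model
    words = sentence.split()
    if not words:
        return 0.0
    log_prob = sum(
        math.log2((counts[w] + 1) / (total + vocab_size)) for w in words
    )
    return -log_prob / len(words)


def rank_by_entropy_difference(in_domain, out_of_domain):
    """Sort out-of-domain sentences by H_in(s) - H_out(s), lowest first.

    A lower score means the sentence looks more like the in-domain data
    (assumed scoring convention, not taken from the XenC source).
    """
    in_model = train_unigram(in_domain)
    out_model = train_unigram(out_of_domain)
    # Shared vocabulary size, plus one slot for unseen words.
    vocab_size = len(set(in_model[0]) | set(out_model[0])) + 1
    scored = [
        (
            cross_entropy(s, in_model, vocab_size)
            - cross_entropy(s, out_model, vocab_size),
            s,
        )
        for s in out_of_domain
    ]
    return sorted(scored)


if __name__ == "__main__":
    in_domain = [
        "the speech recognizer decodes audio",
        "the acoustic model scores audio frames",
    ]
    out_of_domain = [
        "the acoustic model decodes speech",
        "stock markets fell sharply today",
    ]
    for score, sentence in rank_by_entropy_difference(in_domain, out_of_domain):
        print(f"{score:+.3f}  {sentence}")
```

In practice one would keep only the top-ranked portion of the out-of-domain corpus; where to cut the list is a tuning decision that this sketch leaves open.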


Text, Speech and Dialogue | 2014

LIUM and CRIM ASR System Combination for the REPERE Evaluation Campaign

Anthony Rousseau; Gilles Boulianne; Paul Deléglise; Yannick Estève; Vishwa Gupta; Sylvain Meignier

This paper describes the ASR system proposed by the SODA consortium for the ASR task of the French REPERE evaluation campaign. The official REPERE test corpus is composed of TV shows. The full ASR system was produced by combining two ASR systems built by two members of the consortium. Each system has its own specific features: one uses an i-vector-based speaker adaptation of deep neural networks for acoustic modeling, while the other rescores word lattices with continuous space language models. The combined system won the REPERE evaluation campaign on the ASR task. On the REPERE test corpus, this composite ASR system reaches a word error rate of 13.5%.


Spoken Language Technology Workshop | 2016

LIUM ASR systems for the 2016 Multi-Genre Broadcast Arabic challenge

Natalia A. Tomashenko; Kévin Vythelingum; Anthony Rousseau; Yannick Estève

This paper describes the automatic speech recognition (ASR) systems developed by LIUM in the framework of the 2016 Multi-Genre Broadcast (MGB-2) Challenge in the Arabic language. LIUM participated in the first of the two proposed tasks, namely the speech-to-text transcription of Aljazeera recordings. We present the approaches and details found in our systems, as well as our results in the evaluation campaign: the primary LIUM ASR system attained the second position. The main aspects come from the use of GMM-derived features for training a DNN, combined with the use of time-delay neural networks for acoustic models, the use of two different approaches in order to automatically phonetize Arabic words, and finally, the training data selection strategy for acoustic and language models.


IEEE Automatic Speech Recognition and Understanding Workshop | 2015

CRIM and LIUM approaches for multi-genre broadcast media transcription

Vishwa Gupta; Paul Deléglise; Gilles Boulianne; Yannick Estève; Sylvain Meignier; Anthony Rousseau

The Multi-Genre Broadcast Challenge at ASRU 2015 is a controlled evaluation of speech recognition, speaker diarization, and lightly supervised alignment using BBC TV recordings. The CRIM and LIUM teams participated in the speech recognition part of the challenge with a joint submission. This paper presents CRIM's and LIUM's contributions. Each team made different choices in developing its ASR system. In this way, we expected to compare and evaluate different approaches to diarization and acoustic modeling, and to obtain complementary ASR systems for effective merging. CRIM's main contributions are the use of a training scenario similar to multi-lingual training to estimate the deep neural net (DNN) acoustic models with most of the data, the use of a pruned trigram model for search, and the use of a genre-dependent quadgram language model for rescoring the lattice from the search. For LIUM, the focus was on fast decoding with high accuracy. The final word error rates (WER) after merging show that it is possible to get reasonable WER with automatically aligned files. The final global WER of 25.1% corresponds to a WER reduction of about 20% absolute in comparison to the ASR baseline system provided by the organizers.
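For readers unfamiliar with the metric, the word error rate quoted above is conventionally the number of word substitutions, deletions, and insertions needed to turn the reference transcript into the system output, divided by the number of reference words. The following is a minimal Python sketch of that standard definition. It is not the challenge's scoring tool, and the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

An absolute reduction is a plain difference between two such rates, expressed in percentage points, which is distinct from a relative reduction measured against the baseline's own rate.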


North American Chapter of the Association for Computational Linguistics | 2012

Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation

Holger Schwenk; Anthony Rousseau; Mohammed Attik


Language Resources and Evaluation | 2012

TED-LIUM: an Automatic Speech Recognition dedicated corpus

Anthony Rousseau; Paul Deléglise; Yannick Estève


Language Resources and Evaluation | 2014

Enhancing the TED-LIUM Corpus with Selected Data for Language Modeling and More TED Talks

Anthony Rousseau; Paul Deléglise; Yannick Estève


International Workshop on Spoken Language Translation | 2011

LIUM’s systems for the IWSLT 2011 Speech Translation Tasks

Anthony Rousseau; Fethi Bougares; Paul Deléglise; Holger Schwenk; Yannick Estève


International Workshop on Spoken Language Translation (IWSLT) | 2014

The LIUM English-to-French Spoken Language Translation System and the Vecsys/LIUM Automatic Speech Recognition System for Italian Language for IWSLT 2014

Anthony Rousseau; Loïc Barrault; Paul Deléglise; Yannick Estève; Holger Schwenk; Samir Bennacef; Armando Muscariello; Stephan Vanni


International Workshop on Spoken Language Translation (IWSLT) 2010 | 2010

LIUM's Statistical Machine Translation System for IWSLT 2010

Anthony Rousseau; Loïc Barrault; Paul Deléglise; Yannick Estève

Collaboration


Dive into Anthony Rousseau's collaborations.

Top Co-Authors

Holger Schwenk

Centre national de la recherche scientifique

Gilles Boulianne

Institut national de la recherche scientifique

Vishwa Gupta

Institut national de la recherche scientifique
