Publications


Featured research published by Benjamin Lecouteux.


ACM Transactions on Accessible Computing | 2015

Evaluation of a Context-Aware Voice Interface for Ambient Assisted Living: Qualitative User Study vs. Quantitative System Evaluation

Michel Vacher; Sybille Caffiau; François Portet; Brigitte Meillon; Camille Roux; Elena Elias; Benjamin Lecouteux; Pedro Chahuara

This article presents an experiment with seniors and people with visual impairment in a voice-controlled smart home using the Sweet-Home system. The experiment reveals weaknesses in automatic speech recognition that must be addressed, as well as the need for better adaptation to the user and the environment. Users were disturbed by the rigid structure of the grammar and were eager to adapt it to their own preferences. Surprisingly, although no humanoid aspect was introduced in the system, the senior participants were inclined to embody it. Despite these areas for improvement, the system was favorably assessed as allaying most participants' fears related to the loss of autonomy.


IEEE Transactions on Audio, Speech, and Language Processing | 2013

Dynamic Combination of Automatic Speech Recognition Systems by Driven Decoding

Benjamin Lecouteux; Georges Linarès; Yannick Estève; Guillaume Gravier

Combining automatic speech recognition (ASR) systems generally relies on the posterior merging of the outputs or on acoustic cross-adaptation. In this paper, we propose an integrated approach where the outputs of secondary systems are integrated into the search algorithm of a primary one. In this driven decoding algorithm (DDA), the secondary systems are viewed as observation sources that are evaluated and combined with others by the primary search algorithm. DDA is evaluated on a subset of the ESTER I corpus consisting of 4 hours of French radio broadcast news. Results demonstrate that DDA significantly outperforms vote-based approaches: we obtain a 14.5% relative word error rate improvement over the best single system, as opposed to 6.7% with a ROVER combination. An in-depth analysis of DDA shows its ability to improve robustness (gains are greater in adverse conditions) and a relatively low dependency on the search algorithm: applying DDA to both A*- and beam-search-based decoders yields similar performance.


international conference on acoustics, speech, and signal processing | 2008

Generalized driven decoding for speech recognition system combination

Benjamin Lecouteux; Georges Linarès; Yannick Estève; Guillaume Gravier

The driven decoding algorithm (DDA) was initially proposed as an integrated approach for combining two automatic speech recognition (ASR) systems. It consists of guiding the search algorithm of a primary ASR system with the one-best hypothesis of an auxiliary system. In this paper, we generalize DDA to confusion-network-driven decoding and propose new schemes for multiple-system combination. Since previous experiments involved two ASR systems on broadcast news data, the proposed extended DDA is evaluated using three ASR systems from different labs. Results show that generalized DDA significantly outperforms the ROVER method: we obtain a 15.7% relative word error rate improvement with respect to the best single system, as opposed to 8.5% with the ROVER combination.
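The ROVER baseline that DDA is compared against merges system outputs by voting. A minimal sketch, assuming the outputs have already been aligned word-slot by word-slot (real ROVER builds this alignment itself, and can also weight votes by confidence):

```python
from collections import Counter

def rover_vote(aligned_hyps):
    """Majority vote over pre-aligned system outputs, one word per column.
    '<eps>' marks a null slot (a system emitted nothing at that position)."""
    out = []
    for column in zip(*aligned_hyps):
        word, _ = Counter(column).most_common(1)[0]
        if word != "<eps>":  # a winning null slot emits no word
            out.append(word)
    return out

sys_a = ["speech", "is", "hard"]
sys_b = ["speech", "as", "hard"]
sys_c = ["speech", "is", "card"]
print(rover_vote([sys_a, sys_b, sys_c]))  # ['speech', 'is', 'hard']
```

The contrast with DDA is that this voting happens purely on final outputs, whereas driven decoding lets the secondary hypotheses influence the primary system's search itself.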


workshop on statistical machine translation | 2014

LIG System for Word Level QE task at WMT14

Ngoc Quang Luong; Laurent Besacier; Benjamin Lecouteux

This paper describes our word-level QE system for the WMT 2014 shared task on the Spanish-English pair. Compared to WMT 2013, this year's task differs in the lack of SMT setting information and additional resources. We report how we overcame this challenge to retain most of the important features that performed well in last year's system. Novel features related to the availability of multiple systems' outputs (a new point this year) are also proposed and experimented with alongside the baseline set. The system is optimized in several ways: tuning the classification threshold, combining with WMT 2013 data, and refining with a feature selection strategy on our development set, before dealing with the test set for submission.


international conference on acoustics, speech, and signal processing | 2007

System Combination by Driven Decoding

Benjamin Lecouteux; Georges Linarès; Yannick Estève; Julie Mauclair

The combination of automatic speech recognition (ASR) systems generally relies on a posteriori merging of system outputs or on cross-adaptation. In this paper, we propose an integrated approach where the search of a primary system is driven by the outputs of a secondary one, using the one-best hypotheses and the word posteriors gathered from the secondary system. Experiments are carried out within the experimental framework of the ESTER evaluation campaign (S. Galliano et al. 2005). Results show that the driven decoding algorithm significantly outperforms the two single ASR systems (8% relative WER reduction, 1.7% absolute). Finally, we investigate the interactions between driven decoding and cross-adaptation. The best cross-adaptation strategy combined with the driven decoding process brings the final absolute gain to about 1.9% WER.


international conference on acoustics, speech, and signal processing | 2011

A segment-level confidence measure for Spoken Document Retrieval

Grégory Senay; Georges Linarès; Benjamin Lecouteux

This paper presents a semantic confidence measure that aims to predict the relevance of automatic transcripts for a Spoken Document Retrieval (SDR) task. The proposed prediction method relies on the combination of an Automatic Speech Recognition (ASR) confidence measure and a Semantic Compacity Index (SCI), which estimates the relevance of words given the semantic context in which they occur. Experiments are conducted on the French broadcast news corpus ESTER by simulating a classical SDR usage scenario: users submit text queries to a search engine that is expected to return the most relevant documents for the query. Results demonstrate the benefit of using semantic-level information to predict transcript indexability.
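The combination described above can be sketched roughly as follows. This is an illustrative toy, not the paper's formulation: the co-occurrence table, the averaging, and the interpolation weight `beta` are all assumptions.

```python
def semantic_compacity(word, context, cooc):
    """Toy Semantic Compacity Index: average co-occurrence strength of a
    word with the other words of its segment (cooc is a hypothetical
    corpus-derived co-occurrence table keyed by unordered word pairs)."""
    others = [w for w in context if w != word]
    if not others:
        return 0.0
    return sum(cooc.get(frozenset((word, w)), 0.0) for w in others) / len(others)

def segment_confidence(words, asr_conf, cooc, beta=0.5):
    """Interpolate ASR confidence and SCI per word, then average over the
    segment to predict how indexable its transcript is."""
    scores = [(1 - beta) * c + beta * semantic_compacity(w, words, cooc)
              for w, c in zip(words, asr_conf)]
    return sum(scores) / len(scores)

cooc = {frozenset(("election", "vote")): 0.8,
        frozenset(("election", "banana")): 0.0,
        frozenset(("vote", "banana")): 0.0}
words = ["election", "vote", "banana"]
# "banana" is acoustically confident but semantically isolated,
# so it drags the segment score down.
print(round(segment_confidence(words, [0.9, 0.8, 0.9], cooc), 3))
```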


Computer Speech & Language | 2012

Integrating imperfect transcripts into speech recognition systems for building high-quality corpora

Benjamin Lecouteux; Georges Linarès; Stanislas Oger

The training of state-of-the-art automatic speech recognition (ASR) systems requires huge relevant training corpora. The cost of such databases is high and remains a major limitation for the development of speech-enabled applications in particular contexts (e.g. low-density languages or specialized domains). On the other hand, a large amount of data can be found in news prompts, movie subtitles, scripts, etc. The use of such data as training corpora could provide a low-cost solution to the acoustic model estimation problem. Unfortunately, prior transcripts are seldom exact with respect to the content of the speech signal, and suffer from a lack of temporal information. This paper tackles the issue of improving prompt-based speech corpora by addressing the problems mentioned above. We propose a method for locating accurate transcript segments in speech signals and automatically correcting errors or gaps in the transcript surrounding these segments. The method relies on a new decoding strategy in which the search algorithm is driven by the imperfect transcription of the input utterances. The experiments are conducted on French, using the ESTER database and a set of recordings (and associated prompts) from RTBF (Radio Television Belge Francophone). The results demonstrate the effectiveness of the proposed approach in terms of both error correction and text-to-speech alignment.
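The first step, locating transcript segments that reliably match the audio, can be illustrated with a much simplified stand-in: anchoring on n-grams shared between a first-pass ASR output and the imperfect prompt. The function name, the trigram choice, and the exact-match criterion are assumptions for illustration; the paper does this inside a transcript-driven decoder.

```python
def anchor_segments(asr_words, prompt_words, n=3):
    """Return (asr_index, prompt_index) pairs where an n-gram of the ASR
    output exactly matches an n-gram of the imperfect prompt; such anchors
    mark regions where the prompt can be trusted, while the stretches
    between them are candidates for automatic correction."""
    prompt_ngrams = {tuple(prompt_words[i:i + n]): i
                     for i in range(len(prompt_words) - n + 1)}
    anchors = []
    for i in range(len(asr_words) - n + 1):
        gram = tuple(asr_words[i:i + n])
        if gram in prompt_ngrams:
            anchors.append((i, prompt_ngrams[gram]))
    return anchors

asr = ["good", "evening", "here", "is", "the", "news"]
prompt = ["good", "evening", "this", "is", "the", "news"]
print(anchor_segments(asr, prompt))  # [(3, 3)]: "is the news" anchors
```

A real system would then re-decode the unanchored stretches with the prompt as a soft constraint, which is the driven decoding strategy the abstract describes.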


knowledge and systems engineering | 2014

Word Confidence Estimation and Its Integration in Sentence Quality Estimation for Machine Translation

Ngoc-Quang Luong; Laurent Besacier; Benjamin Lecouteux

This paper proposes ideas for building an effective estimator that predicts the quality of words in a Machine Translation (MT) output. We integrate a number of features of various types (system-based, lexical, syntactic and semantic) into the conventional feature set for training our baseline classifier. After experiments with all features, we deploy a feature selection strategy to filter the best-performing ones. Then, a method that combines multiple “weak” classifiers into a strong “composite” classifier, taking advantage of their complementarity, allows us to achieve better performance in terms of F-score. Finally, we exploit word confidence scores to improve the estimation system at the sentence level.
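The weak-to-composite combination and the word-to-sentence step can be sketched as below. This is a toy under stated assumptions: the threshold classifiers, the weights, and the good-word-ratio sentence score are all hypothetical, not the paper's actual classifiers or features.

```python
def composite_predict(weak_classifiers, weights, features):
    """Combine 'weak' word-level classifiers into a composite one by a
    weighted vote over their {+1, -1} (good/bad word) decisions."""
    score = sum(w * clf(features) for clf, w in zip(weak_classifiers, weights))
    return 1 if score >= 0 else -1

def sentence_quality(word_labels):
    """Toy sentence-level score: fraction of words predicted 'good'."""
    return sum(1 for y in word_labels if y == 1) / len(word_labels)

# Hypothetical weak classifiers thresholding a single feature value.
clfs = [lambda x: 1 if x > 0.3 else -1,
        lambda x: 1 if x > 0.5 else -1,
        lambda x: 1 if x > 0.7 else -1]
weights = [0.2, 0.5, 0.3]
labels = [composite_predict(clfs, weights, x) for x in (0.9, 0.6, 0.1)]
print(labels, sentence_quality(labels))  # [1, 1, -1] 0.666...
```

The point of the combination is that the weighted vote can recover a correct decision (the 0.6 case here) even when an individual weak classifier gets it wrong.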


conference of the european chapter of the association for computational linguistics | 2014

Word Confidence Estimation for SMT N-best List Re-ranking

Ngoc Quang Luong; Laurent Besacier; Benjamin Lecouteux

This paper proposes to use Word Confidence Estimation (WCE) information to improve MT outputs via N-best list re-ranking. From the confidence label assigned to each word in the MT hypothesis, we add six scores to the baseline log-linear model in order to re-rank the N-best list. First, the correlation between the WCE-based sentence-level scores and conventional evaluation scores (BLEU, TER, TERp-A) is investigated. Then, N-best list re-ranking is evaluated over different WCE system performance levels: from our real and efficient WCE system (ranked 1st in the WMT 2013 Quality Estimation task) to an oracle WCE (which simulates an interactive scenario where a user simply validates words of an MT hypothesis and the new output is automatically regenerated). The results suggest that our real WCE system slightly (but significantly) improves the baseline while the oracle one boosts it dramatically; better WCE leads to better MT quality.
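Re-ranking under a log-linear model, with WCE-derived scores added as extra features, can be sketched as follows. The two example features and the lambda weights are illustrative assumptions (the paper adds six WCE scores and tunes the weights).

```python
import math

def rerank_nbest(nbest, extra_scores, lambdas):
    """Re-rank an N-best list of (hypothesis, baseline log-score) pairs
    under a log-linear model: each extra feature list contributes
    lambda * feature[i] to hypothesis i's total score."""
    def total(i):
        base = nbest[i][1]
        return base + sum(l * f[i] for l, f in zip(lambdas, extra_scores))
    order = sorted(range(len(nbest)), key=total, reverse=True)
    return [nbest[i][0] for i in order]

nbest = [("hyp A", -2.0), ("hyp B", -2.1)]
wce_good_ratio = [0.5, 0.9]              # fraction of words labelled 'good'
wce_logprob = [math.log(0.4), math.log(0.8)]
ranking = rerank_nbest(nbest, [wce_good_ratio, wce_logprob],
                       lambdas=[1.0, 0.5])
print(ranking)  # the WCE features promote "hyp B" over the baseline winner
```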


Eurasip Journal on Audio, Speech, and Music Processing | 2010

Query-Driven Strategy for On-the-Fly Term Spotting in Spontaneous Speech

Mickael Rouvier; Georges Linarès; Benjamin Lecouteux

Spoken utterance retrieval has been widely studied in recent decades, with the purpose of indexing large audio databases or detecting keywords in continuous speech streams. While the indexing of closed corpora can be performed as a batch process, on-line spotting systems have to detect the targeted spoken utterances synchronously. We propose a two-level architecture for on-the-fly term spotting. The first level performs a fast detection of the speech segments that probably contain the targeted utterance. The second level refines the detection on the selected segments, using a speech recognizer based on a query-driven decoding algorithm. Experiments are conducted on both broadcast and spontaneous speech corpora, and we investigate the impact of the spontaneity level on system performance. Results show that our method remains effective even when recognition rates are significantly degraded by disfluencies.
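The two-level architecture can be sketched as a cheap filter followed by an expensive confirmation pass. Everything here is a stand-in: the phone-overlap filter, the threshold, and the exact-match "confirmation" replace the paper's acoustic first pass and query-driven decoder.

```python
def fast_filter(segments, query_phones, min_overlap=0.5):
    """Level 1: cheaply keep segments whose phone inventory covers at
    least min_overlap of the query's phones (a stand-in for the paper's
    fast acoustic detection)."""
    q = set(query_phones)
    return [s for s in segments
            if len(q & set(s["phones"])) / len(q) >= min_overlap]

def confirm(segment, query_words):
    """Level 2 stand-in for query-driven decoding: an exact search for
    the query word sequence in the segment's transcript."""
    text, q = segment["words"], list(query_words)
    return any(text[i:i + len(q)] == q for i in range(len(text) - len(q) + 1))

segments = [
    {"phones": ["b", "oh", "n", "zh", "u", "r"], "words": ["bonjour", "paris"]},
    {"phones": ["s", "a", "l", "u"], "words": ["salut", "tout", "le", "monde"]},
]
hits = [s for s in fast_filter(segments, ["b", "oh", "n"])
        if confirm(s, ["bonjour"])]
print([h["words"] for h in hits])  # [['bonjour', 'paris']]
```

The design rationale carries over from the paper: the first level keeps latency low by discarding most of the stream, so the costly recognizer only runs on a few candidate segments.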

Collaboration


Benjamin Lecouteux's top co-authors:

- Laurent Besacier (Centre national de la recherche scientifique)
- Michel Vacher (Centre national de la recherche scientifique)
- François Portet (Centre national de la recherche scientifique)
- Didier Schwab (French Institute for Research in Computer Science and Automation)
- Pedro Chahuara (Centre national de la recherche scientifique)