Langzhou Chen
Centre national de la recherche scientifique
Publication
Featured research published by Langzhou Chen.
International Conference on Acoustics, Speech, and Signal Processing | 2003
Jean-Luc Gauvain; Lori Lamel; Holger Schwenk; Gilles Adda; Langzhou Chen; Fabrice Lefèvre
This paper describes the development of a speech recognition system for the processing of telephone conversations, starting with a state-of-the-art broadcast news transcription system. We identify major changes and improvements in acoustic and language modeling, as well as decoding, which are required to achieve state-of-the-art performance on conversational speech. Some major changes on the acoustic side include the use of speaker normalization (VTLN), the need to cope with channel variability, and the need for efficient speaker adaptation and better pronunciation modeling. On the linguistic side the primary challenge is to cope with the limited amount of language model training data. To address this issue we make use of a data selection technique, and a smoothing technique based on a neural network language model. At the decoding level lattice rescoring and minimum word error decoding are applied. On the development data, the improvements yield an overall word error rate of 24.9% whereas the original BN transcription system had a word error rate of about 50% on the same data.
International Conference on Acoustics, Speech, and Signal Processing | 2004
Richard M. Schwartz; Thomas Colthurst; Nicolae Duta; Herbert Gish; Rukmini Iyer; Chia-Lin Kao; Daben Liu; Owen Kimball; Jeff Z. Ma; John Makhoul; Spyros Matsoukas; Long Nguyen; Mohammed Noamany; Rohit Prasad; Bing Xiang; Dongxin Xu; Jean-Luc Gauvain; Lori Lamel; Holger Schwenk; Gilles Adda; Langzhou Chen
We report on the results of the first evaluations for the BBN/LIMSI system under the new DARPA EARS program. The evaluations were carried out for conversational telephone speech (CTS) and broadcast news (BN) for three languages: English, Mandarin, and Arabic. In addition to providing system descriptions and evaluation results, the paper highlights methods that worked well across the two domains and those few that worked well on one domain but not the other. For the BN evaluations, which had to be run under 10 times real-time, we demonstrated that a joint BBN/LIMSI system with a time constraint achieved better results than either system alone.
International Conference on Acoustics, Speech, and Signal Processing | 2003
Langzhou Chen; Jean-Luc Gauvain; Lori Lamel; Gilles Adda
Unsupervised language model adaptation for speech recognition is challenging, particularly for complicated tasks such as the transcription of broadcast news (BN) data. This paper presents an unsupervised adaptation method for language modeling based on information retrieval techniques. The method is designed for the broadcast news transcription task where the topics of the audio data cannot be predicted in advance. Experiments are carried out using the LIMSI American English BN transcription system and the NIST 1999 BN evaluation sets. The unsupervised adaptation method reduces the perplexity by 7% relative to the baseline LM and yields a 2% relative improvement for a 10xRT system.
International Conference on Acoustics, Speech, and Signal Processing | 2004
Lori Lamel; Jean-Luc Gauvain; Gilles Adda; Martine Adda-Decker; L. Canseco; Langzhou Chen; Olivier Galibert; Abdelkhalek Messaoudi; Holger Schwenk
The paper summarizes recent work underway at LIMSI on speech-to-text transcription in multiple languages. The research has been oriented towards the processing of broadcast audio and conversational speech for information access. Broadcast news transcription systems have been developed for seven languages, and it is planned to address several other languages in the near term. Research on conversational speech has mainly focused on the English language, with some initial work on French, Arabic and Spanish. Automatic processing must take into account the characteristics of the audio data, such as needing to deal with the continuous data stream, specificities of the language and the use of an imperfect word transcription for accessing the information content. Our experience thus far indicates that at today's word error rates, the techniques used in one language can be successfully ported to other languages, and most of the language specificities concern lexical and pronunciation modeling.
Meeting of the Association for Computational Linguistics | 2001
Jean-Luc Gauvain; Lori Lamel; Gilles Adda; Martine Adda-Decker; Claude Barras; Langzhou Chen; Yannick de Kercadio
This paper addresses recent progress in speaker-independent, large vocabulary, continuous speech recognition, which has opened up a wide range of near and mid-term applications. One rapidly expanding application area is the processing of broadcast audio for information access. At LIMSI, broadcast news transcription systems have been developed for English, French, German, Mandarin and Portuguese, and systems for other languages are under development. Audio indexation must take into account the specificities of audio data, such as needing to deal with the continuous data stream and an imperfect word transcription. Some near-term application areas are audio data mining, selective dissemination of information and media monitoring.
ACM Transactions on Speech and Language Processing | 2004
Langzhou Chen; Lori Lamel; Jean-Luc Gauvain; Gilles Adda
Conference of the International Speech Communication Association | 2001
Langzhou Chen; Jean-Luc Gauvain; Lori Lamel; Gilles Adda; Martine Adda-Decker
International Conference on Acoustics, Speech, and Signal Processing | 2004
Langzhou Chen; Lori Lamel; Jean-Luc Gauvain
Conference of the International Speech Communication Association | 2000
Langzhou Chen; Lori Lamel; Gilles Adda; Jean-Luc Gauvain
Archive | 2001
Langzhou Chen; Jean-Luc Gauvain; Lori Lamel; Gilles Adda; Martine Adda