Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Bing Xiang is active.

Publication


Featured research published by Bing Xiang.


International Conference on Acoustics, Speech, and Signal Processing | 2006

Morphological Decomposition for Arabic Broadcast News Transcription

Bing Xiang; Kham Nguyen; Long Nguyen; Richard M. Schwartz; J. Makhoul

In this paper, we present a novel approach for morphological decomposition in large-vocabulary Arabic speech recognition. It achieved a low out-of-vocabulary (OOV) rate as well as high recognition accuracy in a state-of-the-art Arabic broadcast news transcription system. In this approach, compound words are decomposed into stems and affixes in both the language model and acoustic model training data. The decomposed words in the recognition output are re-joined before scoring. Four algorithms are evaluated and compared in this work. The best system achieved a 1.9% absolute reduction (9.8% relative) in word error rate (WER) compared to the 64K-word baseline. The recognition performance of this system is also comparable to a 300K-word recognition system trained on the normal words. Meanwhile, the decomposed system is much faster and needs less memory than systems with vocabularies larger than 64K words.
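The decompose-then-rejoin idea can be illustrated with a short sketch. The Python below is illustrative only: it splits words against a small, hypothetical affix list, marks detached affixes with '+', and glues them back onto their stems before scoring. The affix inventory, the markers, and the minimum-stem-length guard are assumptions for the example, not the paper's four algorithms.

# Minimal sketch of affix splitting and re-joining, with a toy affix list
# (hypothetical; not the paper's actual Arabic affix inventory).
PREFIXES = ["wAl", "Al", "w", "b", "l"]   # assumed Buckwalter-style prefixes
SUFFIXES = ["hm", "hA", "At", "h", "p"]   # assumed suffixes
MIN_STEM = 3                              # guard against over-decomposition

def decompose(word):
    """Split a word into marked prefix, stem, and suffix tokens for training."""
    parts = []
    for p in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(p) and len(word) - len(p) >= MIN_STEM:
            parts.append(p + "+")         # '+' marks a detached prefix
            word = word[len(p):]
            break
    suffix = ""
    for s in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(s) and len(word) - len(s) >= MIN_STEM:
            suffix, word = "+" + s, word[:-len(s)]
            break
    parts.append(word)
    if suffix:
        parts.append(suffix)
    return parts

def rejoin(tokens):
    """Glue marked affixes back onto their stems before WER scoring."""
    words, pending = [], ""
    for tok in tokens:
        if tok.endswith("+"):                 # prefix: hold until the next stem
            pending += tok[:-1]
        elif tok.startswith("+") and words:   # suffix: attach to previous word
            words[-1] += tok[1:]
        else:
            words.append(pending + tok)
            pending = ""
    return words

print(decompose("AlktAb"))                # ['Al+', 'ktAb']
print(rejoin(["Al+", "ktAb", "+hm"]))     # ['AlktAbhm']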


IEEE Transactions on Audio, Speech, and Language Processing | 2006

Advances in transcription of broadcast news and conversational telephone speech within the combined EARS BBN/LIMSI system

Spyridon Matsoukas; Jean-Luc Gauvain; Gilles Adda; Thomas Colthurst; Chia-Lin Kao; Owen Kimball; Lori Lamel; Fabrice Lefèvre; Jeff Z. Ma; John Makhoul; Long Nguyen; Rohit Prasad; Richard M. Schwartz; Holger Schwenk; Bing Xiang

This paper describes the progress made in the transcription of broadcast news (BN) and conversational telephone speech (CTS) within the combined BBN/LIMSI system from May 2002 to September 2004. During that period, BBN and LIMSI collaborated in an effort to produce significant reductions in the word error rate (WER), as directed by the aggressive goals of the Effective, Affordable, Reusable Speech-to-Text [Defense Advanced Research Projects Agency (DARPA) EARS] program. The paper focuses on general modeling techniques that led to recognition accuracy improvements, as well as engineering approaches that enabled efficient use of large amounts of training data and fast decoding architectures. Special attention is given to efforts to integrate components of the BBN and LIMSI systems, discussing the tradeoff between speed and accuracy for various system combination strategies. Results on the EARS progress test sets show that the combined BBN/LIMSI system achieved relative WER reductions of 47% and 51% on the BN and CTS domains, respectively.


International Conference on Acoustics, Speech, and Signal Processing | 2004

Speech recognition in multiple languages and domains: the 2003 BBN/LIMSI EARS system

Richard M. Schwartz; Thomas Colthurst; Nicolae Duta; Herbert Gish; Rukmini Iyer; Chia-Lin Kao; Daben Liu; Owen Kimball; Jeff Z. Ma; John Makhoul; Spyros Matsoukas; Long Nguyen; Mohammed Noamany; Rohit Prasad; Bing Xiang; Dongxin Xu; Jean-Luc Gauvain; Lori Lamel; Holger Schwenk; Gilles Adda; Langzhou Chen

We report on the results of the first evaluations for the BBN/LIMSI system under the new DARPA EARS program. The evaluations were carried out for conversational telephone speech (CTS) and broadcast news (BN) in three languages: English, Mandarin, and Arabic. In addition to providing system descriptions and evaluation results, the paper highlights methods that worked well across the two domains and those few that worked well on one domain but not the other. For the BN evaluations, which had to run in under 10 times real-time, we demonstrated that a joint BBN/LIMSI system with a time constraint achieved better results than either system alone.


International Conference on Acoustics, Speech, and Signal Processing | 2007

Integrating Speech Recognition and Machine Translation

Spyros Matsoukas; Ivan Bulyko; Bing Xiang; Kham Nguyen; Richard M. Schwartz; John Makhoul

This paper presents a set of experiments that we conducted in order to optimize the performance of an Arabic/English machine translation system on broadcast news and conversational speech data. Proper integration of speech-to-text (STT) and machine translation (MT) requires special attention to issues such as sentence boundary detection, punctuation, STT accuracy, tokenization, conversion of spoken numbers and dates to written form, optimization of MT decoding weights, and scoring. We discuss these issues, and show that a carefully tuned STT/MT integration can lead to significant translation accuracy improvements compared to simply feeding the regular STT output to a text MT system.
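One of the normalization steps named above, converting spoken numbers to written form, is easy to sketch. The Python below handles only simple two-word English numbers and is purely illustrative of the kind of rewriting applied between STT output and MT input; it is not the paper's actual pipeline, and the word tables are assumptions for the example.

# Toy example of rewriting spoken numbers in recognizer output before MT
# (illustrative only; real systems also cover dates, ordinals, large numbers).
UNITS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
         "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}

def numbers_to_digits(tokens):
    """Rewrite simple spoken numbers ('twenty five' -> '25') in a token stream."""
    out, i = [], 0
    while i < len(tokens):
        tok = tokens[i].lower()
        if tok in TENS:
            value = TENS[tok]
            if i + 1 < len(tokens) and tokens[i + 1].lower() in UNITS:
                value += UNITS[tokens[i + 1].lower()]
                i += 1                    # consume the units word as well
            out.append(str(value))
        elif tok in UNITS:
            out.append(str(UNITS[tok]))
        else:
            out.append(tokens[i])
        i += 1
    return out

print(numbers_to_digits("the report cited twenty five incidents".split()))
# ['the', 'report', 'cited', '25', 'incidents']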


International Conference on Acoustics, Speech, and Signal Processing | 2005

Cluster-dependent acoustic modeling [speech recognition applications]

Bing Xiang; Long Nguyen; Spyros Matsoukas; Richard M. Schwartz

In this paper, we present cluster-dependent acoustic modeling for large-vocabulary speech recognition. With large amounts of acoustic training data, we build multiple cluster-dependent models (CDMs), each focusing on a group of speakers in order to represent speaker-dependent characteristics. The approach is motivated by the fact that a sufficiently trained speaker-dependent (SD) model outperforms a speaker-independent (SI) model. We decode each test speaker's data using CDMs selected under certain criteria to achieve high recognition accuracy. Various speaker clustering and model selection techniques are proposed and compared on the task of broadcast news (BN) transcription. The CDMs provided more than 1% absolute gain in unadapted decoding and 0.5% gain in adapted decoding compared to our baseline system on the EARS BN 2003 development test set.
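The model-selection step can be pictured with a small sketch: assign each test speaker to the cluster whose centroid of mean acoustic features is nearest, then decode with that cluster's model. The centroid-distance rule below is a stand-in assumption; the paper compares several clustering and selection criteria that are not reproduced here.

# Illustrative cluster selection for decoding (assumed centroid-distance rule).
import numpy as np

def cluster_centroids(speaker_means, assignments, num_clusters):
    """Average per-speaker mean feature vectors within each training cluster."""
    dim = len(next(iter(speaker_means.values())))
    sums = np.zeros((num_clusters, dim))
    counts = np.zeros(num_clusters)
    for spk, mean_vec in speaker_means.items():
        c = assignments[spk]                       # cluster index of this speaker
        sums[c] += mean_vec
        counts[c] += 1
    return sums / np.maximum(counts, 1)[:, None]   # avoid division by zero

def select_cdm(test_features, centroids):
    """Pick the cluster-dependent model whose centroid is closest to the speaker."""
    spk_mean = test_features.mean(axis=0)          # (num_frames, dim) -> (dim,)
    distances = np.linalg.norm(centroids - spk_mean, axis=1)
    return int(np.argmin(distances))               # index of the CDM to decode with

# Usage: decode the speaker's data with models[select_cdm(test_feats, centroids)].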


North American Chapter of the Association for Computational Linguistics | 2007

Combining Outputs from Multiple Machine Translation Systems

Antti-Veikko I. Rosti; Necip Fazil Ayan; Bing Xiang; Spyridon Matsoukas; Richard M. Schwartz; Bonnie J. Dorr


Conference of the International Speech Communication Association | 2005

Recent progress in Arabic broadcast news transcription at BBN.

Mohamed Afify; Long Nguyen; Bing Xiang; Sherif M. Abdou; John Makhoul


2004 Rich Transcription Workshop, Palisades, NY | 2004

The 2004 BBN/LIMSI 10xRT English Broadcast News Transcription System

Long Nguyen; Sherif M. Abdou; Mohamed Afify; John Makhoul; Spyros Matsoukas; Richard M. Schwartz; Bing Xiang; Lori Lamel; Jean-Luc Gauvain; Gilles Adda; Holger Schwenk; Fabrice Lefèvre


Conference of the International Speech Communication Association | 2005

The effects of speech recognition and punctuation on information extraction performance.

John Makhoul; Alex Baron; Ivan Bulyko; Long Nguyen; Lance A. Ramshaw; David Stallard; Richard M. Schwartz; Bing Xiang


Conference of the International Speech Communication Association | 2005

The BBN RT04 English broadcast news transcription system.

Long Nguyen; Bing Xiang; Mohamed Afify; Sherif M. Abdou; Spyros Matsoukas; Richard M. Schwartz; John Makhoul

Collaboration


Dive into Bing Xiang's collaborations.

Top Co-Authors


Long Nguyen

Northeastern University


Gilles Adda

Centre national de la recherche scientifique


Holger Schwenk

Centre national de la recherche scientifique


Jean-Luc Gauvain

Centre national de la recherche scientifique


Lori Lamel

Centre national de la recherche scientifique
