Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Chia-Lin Kao is active.

Publication


Featured research published by Chia-Lin Kao.


IEEE Transactions on Audio, Speech, and Language Processing | 2006

Advances in transcription of broadcast news and conversational telephone speech within the combined EARS BBN/LIMSI system

Spyridon Matsoukas; Jean-Luc Gauvain; Gilles Adda; Thomas Colthurst; Chia-Lin Kao; Owen Kimball; Lori Lamel; Fabrice Lefèvre; Jeff Z. Ma; John Makhoul; Long Nguyen; Rohit Prasad; Richard M. Schwartz; Holger Schwenk; Bing Xiang

This paper describes the progress made in the transcription of broadcast news (BN) and conversational telephone speech (CTS) within the combined BBN/LIMSI system from May 2002 to September 2004. During that period, BBN and LIMSI collaborated in an effort to produce significant reductions in the word error rate (WER), as directed by the aggressive goals of the Effective, Affordable, Reusable Speech-to-Text (EARS) program of the Defense Advanced Research Projects Agency (DARPA). The paper focuses on general modeling techniques that led to recognition accuracy improvements, as well as engineering approaches that enabled efficient use of large amounts of training data and fast decoding architectures. Special attention is given to efforts to integrate components of the BBN and LIMSI systems, discussing the tradeoff between speed and accuracy for various system combination strategies. Results on the EARS progress test sets show that the combined BBN/LIMSI system achieved relative reductions of 47% and 51% on the BN and CTS domains, respectively.


International Conference on Acoustics, Speech, and Signal Processing | 2004

Speech recognition in multiple languages and domains: the 2003 BBN/LIMSI EARS system

Richard M. Schwartz; Thomas Colthurst; Nicolae Duta; Herbert Gish; Rukmini Iyer; Chia-Lin Kao; Daben Liu; Owen Kimball; Jeff Z. Ma; John Makhoul; Spyros Matsoukas; Long Nguyen; Mohammed Noamany; Rohit Prasad; Bing Xiang; Dongxin Xu; Jean-Luc Gauvain; Lori Lamel; Holger Schwenk; Gilles Adda; Langzhou Chen

We report on the results of the first evaluations for the BBN/LIMSI system under the new DARPA EARS program. The evaluations were carried out for conversational telephone speech (CTS) and broadcast news (BN) for three languages: English, Mandarin, and Arabic. In addition to providing system descriptions and evaluation results, the paper highlights methods that worked well across the two domains and those few that worked well on one domain but not the other. For the BN evaluations, which had to be run under 10 times real-time, we demonstrated that a joint BBN/LIMSI system with a time constraint achieved better results than either system alone.


International Conference on Acoustics, Speech, and Signal Processing | 2008

Recent improvements and performance analysis of ASR and MT in a speech-to-speech translation system

David Stallard; Chia-Lin Kao; Kriste Krstovski; Daben Liu; Premkumar Natarajan; Rohit Prasad; Shirin Saleem; Krishna Subramanian

We report on recent ASR and MT work on our English/Iraqi Arabic speech-to-speech translation system. We present detailed results for both objective and subjective evaluations of translation quality, along with a detailed analysis and categorization of translation errors. We also present novel ideas for quantifying the relative importance of different subjective error categories, and for assigning the blame for an error to a particular phrase pair in the translation model.


Computer Speech & Language | 2013

BBN TransTalk: Robust multilingual two-way speech-to-speech translation for mobile platforms

Rohit Prasad; Prem Natarajan; David Stallard; Shirin Saleem; Shankar Ananthakrishnan; Stavros Tsakalidis; Chia-Lin Kao; Fred Choi; Ralf Meermeier; Mark Rawls; Jacob Devlin; Kriste Krstovski; Aaron Challenner

In this paper we present a speech-to-speech (S2S) translation system called BBN TransTalk that enables two-way communication between speakers of English and speakers who do not understand or speak English. BBN TransTalk has been configured for several languages, including Iraqi Arabic, Pashto, Dari, Farsi, Malay, Indonesian, and Levantine Arabic. We describe the key components of our system: automatic speech recognition (ASR), machine translation (MT), text-to-speech (TTS), the dialog manager, and the user interface (UI). In addition, we present novel techniques for overcoming specific challenges in developing high-performing S2S systems. For ASR, we present techniques for dealing with the lack of pronunciation and linguistic resources and for effectively modeling ambiguity in the pronunciations of words in these languages. For MT, we describe techniques for dealing with data sparsity as well as modeling context. We also present and compare different user confirmation techniques for detecting errors that can cause the dialog to drift or stall.


International Conference on Acoustics, Speech, and Signal Processing | 2010

Pashto speech recognition with limited pronunciation lexicon

Rohit Prasad; Stavros Tsakalidis; Ivan Bulyko; Chia-Lin Kao; Prem Natarajan

Automatic speech recognition (ASR) for low resource languages continues to be a difficult problem. In particular, colloquial dialects of Arabic, Farsi, and Pashto pose significant challenges in pronunciation dictionary creation. Therefore, most state-of-the-art ASR engines rely on the grapheme-as-phoneme approach for creating pronunciation dictionaries in these languages. While the grapheme approach simplifies ASR training, it performs significantly worse than a system trained with a high-quality phonetic dictionary. In this paper, we explore two techniques for bridging the performance gap between the grapheme and the phonetic approaches, without requiring manual pronunciations for all the words in the training data. The first approach is based on learning letter-to-sound rules from a small set of manual pronunciations in Pashto, and the second approach uses a hybrid phoneme/grapheme representation for recognition. Through experimental results on colloquial Pashto, we demonstrate that both techniques perform as well as a full phonetic system while requiring manual pronunciations for only a small fraction of the words in the acoustic training data.


Spoken Language Technology Workshop | 2008

Recent improvements in BBN's English/Iraqi speech-to-speech translation system

Fred Choi; Stavros Tsakalidis; Shirin Saleem; Chia-Lin Kao; Ralf Meermeier; Kriste Krstovski; Christine Moran; Krishna Subramanian; David Stallard; Rohit Prasad; Prem Natarajan

We report on recent improvements in our English/Iraqi Arabic speech-to-speech translation system. User interface improvements include a novel parallel approach to user confirmation which makes confirmation cost-free in terms of dialog duration. Automatic speech recognition improvements include the incorporation of state-of-the-art techniques in feature transformation and discriminative training. Machine translation improvements include a novel combination of multiple alignments derived from various pre-processing techniques, such as Arabic segmentation and English word compounding, higher order N-grams for target language model, and use of context in form of semantic classes and part-of-speech tags.


Spoken Language Technology Workshop | 2008

Name aware speech-to-speech translation for English/Iraqi

Rohit Prasad; Christine Moran; Fred Choi; Ralf Meermeier; Shirin Saleem; Chia-Lin Kao; David Stallard; Prem Natarajan

In this paper, we describe a novel approach that exploits intra-sentence and dialog-level context for improving translation performance on spoken Iraqi utterances that contain named entities (NEs). Dialog-level context is used to predict whether the Iraqi response is likely to contain names, and the intra-sentence context is used to determine which words are named entities. While we do not address the problem of translating out-of-vocabulary (OOV) NEs in spoken utterances, we show that our approach is capable of translating OOV names in text input. To demonstrate the efficacy of our approach, we present results on an internal test set as well as the June 2008 DARPA TRANSTAC name evaluation set.


Conference of the International Speech Communication Association | 2007

Rapid and accurate spoken term detection

David R. H. Miller; Michael Kleber; Chia-Lin Kao; Owen Kimball; Thomas Colthurst; Stephen A. Lowe; Richard M. Schwartz; Herbert Gish


Conference of the International Speech Communication Association | 2005

The 2004 BBN/LIMSI 20xRT English Conversational Telephone Speech Recognition System

Rohit Prasad; Spyros Matsoukas; Chia-Lin Kao; Jeff Z. Ma; Dongxin Xu; Thomas Colthurst; Owen Kimball; Richard M. Schwartz; Jean-Luc Gauvain; Lori Lamel; Holger Schwenk; Gilles Adda; Fabrice Lefèvre


Conference of the International Speech Communication Association | 2007

The BBN 2007 displayless English/Iraqi speech-to-speech translation system

David Stallard; Fred Choi; Chia-Lin Kao; Kriste Krstovski; Premkumar Natarajan; Rohit Prasad; Shirin Saleem; Krishna Subramanian

Collaboration


Dive into Chia-Lin Kao's collaborations.

Top Co-Authors

Kriste Krstovski

University of Massachusetts Amherst


Prem Natarajan

University of Southern California
