Fred Choi | Researchain

Publication

Featured research published by Fred Choi.

Analytics for Noisy Unstructured Text Data | 2007

Finding structure in noisy text: topic classification and unsupervised clustering

Premkumar Natarajan; Rohit Prasad; Krishna Subramanian; Shirin Saleem; Fred Choi; Richard M. Schwartz

This paper addresses two types of classification of noisy, unstructured text such as newsgroup messages: (1) spotting messages containing topics of interest, and (2) automatic conceptual organization of messages without prior knowledge of topics of interest. In addition to applying our hidden Markov model methodology to spotting topics of interest in newsgroup messages, we present a robust methodology for rejecting messages which are off-topic. We describe a novel approach for automatically organizing a large, unstructured collection of messages. The approach applies an unsupervised topic clustering procedure to generate a hierarchical tree of topics.

Computer Speech & Language | 2013

BBN TransTalk: Robust multilingual two-way speech-to-speech translation for mobile platforms

Rohit Prasad; Prem Natarajan; David Stallard; Shirin Saleem; Shankar Ananthakrishnan; Stavros Tsakalidis; Chia-Lin Kao; Fred Choi; Ralf Meermeier; Mark Rawls; Jacob Devlin; Kriste Krstovski; Aaron Challenner

In this paper we present a speech-to-speech (S2S) translation system called the BBN TransTalk that enables two-way communication between speakers of English and speakers who do not understand or speak English. The BBN TransTalk has been configured for several languages including Iraqi Arabic, Pashto, Dari, Farsi, Malay, Indonesian, and Levantine Arabic. We describe the key components of our system: automatic speech recognition (ASR), machine translation (MT), text-to-speech (TTS), dialog manager, and the user interface (UI). In addition, we present novel techniques for overcoming specific challenges in developing high-performing S2S systems. For ASR, we present techniques for dealing with lack of pronunciation and linguistic resources and effective modeling of ambiguity in pronunciations of words in these languages. For MT, we describe techniques for dealing with data sparsity as well as modeling context. We also present and compare different user confirmation techniques for detecting errors that can cause the dialog to drift or stall.

Spoken Language Technology Workshop | 2008

Recent improvements in BBN's English/Iraqi speech-to-speech translation system

Fred Choi; Stavros Tsakalidis; Shirin Saleem; Chia-Lin Kao; Ralf Meermeier; Kriste Krstovski; Christine Moran; Krishna Subramanian; David Stallard; Rohit Prasad; Prem Natarajan

We report on recent improvements in our English/Iraqi Arabic speech-to-speech translation system. User interface improvements include a novel parallel approach to user confirmation, which makes confirmation cost-free in terms of dialog duration. Automatic speech recognition improvements include the incorporation of state-of-the-art techniques in feature transformation and discriminative training. Machine translation improvements include a novel combination of multiple alignments derived from various pre-processing techniques, such as Arabic segmentation and English word compounding, higher-order N-grams for the target language model, and the use of context in the form of semantic classes and part-of-speech tags.

2007 IEEE International Conference on Portable Information Devices | 2007

Real-Time Speech-to-Speech Translation for PDAs

Rohit Prasad; Kriste Krstovski; Fred Choi; Shirin Saleem; Prem Natarajan; Michael Decerbo; David Stallard

In this paper we present a speech-to-speech translation system configured for translingual communication in English and colloquial Iraqi on a mobile, handheld device. The end-to-end system employs a medium/large vocabulary n-gram speech recognition engine for recognizing English and colloquial Iraqi, a question canonicalizer for mapping a recognized English question or command to one of the questions supported in the system, a concept translation engine for translating recognized Iraqi text, and a text-to-speech synthesis engine for playing back the English translation of the Iraqi utterance to the English speaker. In addition to describing the system architecture and the functionality of the components, we present optimization techniques that enable low-latency, real-time speech recognition on low-power hardware platforms.
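The question canonicalizer mentioned in this abstract maps a recognized English utterance onto one of a fixed set of supported questions. The abstract does not say how the mapping is done, so the following is only a minimal sketch under stated assumptions: it uses token-set (Jaccard) overlap as a stand-in similarity measure, and the function names, the threshold value, and the `SUPPORTED` list are all hypothetical.

```python
# Hypothetical sketch of a question canonicalizer. The similarity measure
# (Jaccard overlap), the threshold, and the question list are assumptions
# made for illustration; they are not taken from the paper.

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,?!").lower() for w in text.split())
    return {t for t in tokens if t}

def jaccard(a, b):
    """Token-set overlap in [0, 1]; 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def canonicalize(recognized, supported, threshold=0.5):
    """Return the supported question closest to the recognizer output.

    Returns None when no supported question reaches the threshold, so the
    caller can ask the speaker to rephrase instead of guessing.
    """
    rec = tokenize(recognized)
    best, best_score = None, 0.0
    for question in supported:
        score = jaccard(rec, tokenize(question))
        if score > best_score:
            best, best_score = question, score
    return best if best_score >= threshold else None

# Illustrative question set (made up for this example).
SUPPORTED = [
    "What is your name?",
    "Where do you live?",
    "Do you need medical help?",
]
```

Returning `None` below the threshold is a design choice for this sketch: in a closed-set system, rejecting an unmatched utterance is usually safer than forcing it onto the nearest supported question.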

Spoken Language Technology Workshop | 2008

Name aware speech-to-speech translation for English/Iraqi

Rohit Prasad; Christine Moran; Fred Choi; Ralf Meermeier; Shirin Saleem; Chia-Lin Kao; David Stallard; Prem Natarajan

In this paper, we describe a novel approach that exploits intra-sentence and dialog-level context for improving translation performance on spoken Iraqi utterances that contain named entities (NEs). Dialog-level context is used to predict whether the Iraqi response is likely to contain names, and the intra-sentence context is used to determine which words are named entities. While we do not address the problem of translating out-of-vocabulary (OOV) NEs in spoken utterances, we show that our approach is capable of translating OOV names in text input. To demonstrate the efficacy of our approach, we present results on an internal test set as well as the June 2008 DARPA TRANSTAC name evaluation set.

International Conference on Acoustics, Speech, and Signal Processing | 2010

Evaluating different confirmation strategies for speech-to-speech translation systems

David Stallard; Rohit Prasad; Shankar Ananthakrishnan; Fred Choi; Shirin Saleem; Prem Natarajan

Speech-to-speech translation systems have made a great deal of progress in recent years. But users of such systems still face the problem of not knowing whether the system has translated their utterance correctly. Various confirmation strategies can be used to address this problem. Some of these generate a confirmation utterance for the user to approve, such as reading back the ASR result, or performing “back-translation” to translate the system's translation output back into the source language. Other strategies use automated methods such as confidence measures to eliminate likely mistranslations. We propose a methodology for quantitatively evaluating the effectiveness of these different strategies, and present results of experiments using this methodology.
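One of the automated strategies named in this abstract is filtering by confidence measure. The abstract does not describe the evaluation methodology itself, so the sketch below is only an illustration of how a confidence threshold could be scored against labeled utterances. The metric names, the function `evaluate_threshold`, and the sample data are assumptions, not the paper's method or results.

```python
# Hypothetical sketch: scoring a confidence-threshold confirmation strategy.
# The metrics and the sample data are illustrative assumptions only.

def evaluate_threshold(samples, threshold):
    """Score a reject-below-threshold strategy.

    samples: list of (confidence, is_correct) pairs, one per utterance.
    Utterances with confidence below the threshold are rejected.
    """
    accepted = [ok for conf, ok in samples if conf >= threshold]
    rejected = [ok for conf, ok in samples if conf < threshold]

    errors_total = sum(1 for _, ok in samples if not ok)
    correct_total = len(samples) - errors_total
    errors_caught = sum(1 for ok in rejected if not ok)
    correct_rejected = sum(1 for ok in rejected if ok)

    return {
        # Share of mistranslations that the threshold filtered out.
        "error_catch_rate": errors_caught / errors_total if errors_total else 0.0,
        # Share of correct translations that were wrongly filtered out.
        "false_reject_rate": correct_rejected / correct_total if correct_total else 0.0,
        # Accuracy of what the listener actually hears.
        "accepted_accuracy": sum(accepted) / len(accepted) if accepted else 0.0,
    }

# Made-up (confidence, is_correct) pairs for demonstration.
SAMPLES = [(0.9, True), (0.8, True), (0.7, False),
           (0.4, False), (0.3, True), (0.2, False)]

metrics = evaluate_threshold(SAMPLES, threshold=0.5)
```

Sweeping `threshold` over a range of values would trace the trade-off between catching mistranslations and discarding correct ones, which is one way such strategies could be compared.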

Spoken Language Technology Workshop | 2006

Design and evaluation of the 2006 BBN English/Iraqi two-way speech translation system

David Stallard; Fred Choi; Kriste Krstovski; Prem Natarajan; Rohit Prasad; Shirin Saleem; Raid Suleiman

In this paper, we present a 2-way speech-to-speech translation system for English and Iraqi colloquial Arabic, the dialect of Arabic spoken by ordinary people in Iraq. The application domain of the system is military force protection, including municipal services surveys, detainee screening, and descriptions of people, houses, vehicles, etc. The system uses statistical speech recognition, and a combination of prerecorded questions and statistical machine translation with speech synthesis to translate the speech recognition output. We present evaluation results, along with an analysis of the gap between Iraqi-to-English and English-to-Iraqi translation performance.

Conference of the International Speech Communication Association | 2007