Fred Choi
BBN Technologies
Publication
Featured research published by Fred Choi.
Analytics for Noisy Unstructured Text Data | 2007
Premkumar Natarajan; Rohit Prasad; Krishna Subramanian; Shirin Saleem; Fred Choi; Richard M. Schwartz
This paper addresses two types of classification of noisy, unstructured text such as newsgroup messages: (1) spotting messages containing topics of interest, and (2) automatic conceptual organization of messages without prior knowledge of topics of interest. In addition to applying our hidden Markov model methodology to spotting topics of interest in newsgroup messages, we present a robust methodology for rejecting messages which are off-topic. We describe a novel approach for automatically organizing a large, unstructured collection of messages. The approach applies an unsupervised topic clustering procedure to generate a hierarchical tree of topics.
Computer Speech & Language | 2013
Rohit Prasad; Prem Natarajan; David Stallard; Shirin Saleem; Shankar Ananthakrishnan; Stavros Tsakalidis; Chia-Lin Kao; Fred Choi; Ralf Meermeier; Mark Rawls; Jacob Devlin; Kriste Krstovski; Aaron Challenner
In this paper we present a speech-to-speech (S2S) translation system called the BBN TransTalk that enables two-way communication between speakers of English and speakers who do not understand or speak English. The BBN TransTalk has been configured for several languages including Iraqi Arabic, Pashto, Dari, Farsi, Malay, Indonesian, and Levantine Arabic. We describe the key components of our system: automatic speech recognition (ASR), machine translation (MT), text-to-speech (TTS), dialog manager, and the user interface (UI). In addition, we present novel techniques for overcoming specific challenges in developing high-performing S2S systems. For ASR, we present techniques for dealing with lack of pronunciation and linguistic resources and effective modeling of ambiguity in pronunciations of words in these languages. For MT, we describe techniques for dealing with data sparsity as well as modeling context. We also present and compare different user confirmation techniques for detecting errors that can cause the dialog to drift or stall.
Spoken Language Technology Workshop | 2008
Fred Choi; Stavros Tsakalidis; Shirin Saleem; Chia-Lin Kao; Ralf Meermeier; Kriste Krstovski; Christine Moran; Krishna Subramanian; David Stallard; Rohit Prasad; Prem Natarajan
We report on recent improvements in our English/Iraqi Arabic speech-to-speech translation system. User interface improvements include a novel parallel approach to user confirmation which makes confirmation cost-free in terms of dialog duration. Automatic speech recognition improvements include the incorporation of state-of-the-art techniques in feature transformation and discriminative training. Machine translation improvements include a novel combination of multiple alignments derived from various pre-processing techniques, such as Arabic segmentation and English word compounding, higher-order N-grams for the target language model, and use of context in the form of semantic classes and part-of-speech tags.
2007 IEEE International Conference on Portable Information Devices | 2007
Rohit Prasad; Kriste Krstovski; Fred Choi; Shirin Saleem; Prem Natarajan; Michael Decerbo; David Stallard
In this paper we present a speech-to-speech translation system configured for translingual communication in English and colloquial Iraqi on a mobile, handheld device. The end-to-end system employs a medium/large vocabulary n-gram speech recognition engine for recognizing English and colloquial Iraqi, a question canonicalizer for mapping a recognized English question or command to one of the questions supported in the system, a concept translation engine for translating recognized Iraqi text, and a text-to-speech synthesis engine for playing back the English translation for the Iraqi to the English speaker. In addition to describing the system architecture and the functionality of the components, we present optimization techniques that enable low-latency, real-time speech recognition on low-power hardware platforms.
Spoken Language Technology Workshop | 2008
Rohit Prasad; Christine Moran; Fred Choi; Ralf Meermeier; Shirin Saleem; Chia-Lin Kao; David Stallard; Prem Natarajan
In this paper, we describe a novel approach that exploits intra-sentence and dialog-level context for improving translation performance on spoken Iraqi utterances that contain named entities (NEs). Dialog-level context is used to predict whether the Iraqi response is likely to contain names and the intra-sentence context is used to determine words that are named entities. While we do not address the problem of translating out-of-vocabulary (OOV) NEs in spoken utterances, we show that our approach is capable of translating OOV names in text input. To demonstrate the efficacy of our approach, we present results on an internal test set as well as the 2008 June DARPA TRANSTAC name evaluation set.
International Conference on Acoustics, Speech, and Signal Processing | 2010
David Stallard; Rohit Prasad; Shankar Ananthakrishnan; Fred Choi; Shirin Saleem; Prem Natarajan
Speech-to-speech translation systems have made a great deal of progress in recent years. But users of such systems still face the problem of not knowing whether the system has translated their utterance correctly. Various confirmation strategies can be used to address this problem. Some of these generate a confirmation utterance for the user to approve, such as reading back the ASR result, or performing “back-translation” to translate the system's translation output back into the source language. Other strategies use automated methods such as confidence measures to eliminate likely mistranslations. We propose a methodology for quantitatively evaluating the effectiveness of these different strategies, and present results of experiments using this methodology.
Spoken Language Technology Workshop | 2006
David Stallard; Fred Choi; Kriste Krstovski; Prem Natarajan; Rohit Prasad; Shirin Saleem; Raid Suleiman
In this paper, we present a 2-way speech-to-speech translation system for English and Iraqi colloquial Arabic, the dialect of Arabic spoken by ordinary people in Iraq. The application domain of the system is military force protection, including municipal services surveys, detainee screening, and descriptions of people, houses, vehicles, etc. The system uses statistical speech recognition, and a combination of prerecorded questions and statistical machine translation with speech synthesis to translate the speech recognition output. We present evaluation results, along with an analysis of the gap between Iraqi-to-English and English-to-Iraqi translation performance.
Conference of the International Speech Communication Association | 2007
David Stallard; Fred Choi; Chia-Lin Kao; Kriste Krstovski; Premkumar Natarajan; Rohit Prasad; Shirin Saleem; Krishna Subramanian
International Conference on Pattern Recognition | 2012
Shiv Naga Prasad Vitaladevuni; Fred Choi; Rohit Prasad; Premkumar Natarajan
Conference of the International Speech Communication Association | 2006
David Stallard; Fred Choi; Kriste Krstovski; Prem Natarajan; Rohit Prasad; Shirin Saleem