Daben Liu | Researchain

Publication

Featured research published by Daben Liu.

International Conference on Acoustics, Speech, and Signal Processing | 2003

Novel approaches to Arabic speech recognition: report from the 2002 Johns-Hopkins Summer Workshop

Katrin Kirchhoff; Jeff A. Bilmes; Sourin Das; Nicolae Duta; Melissa Egan; Gang Ji; Feng He; John Henderson; Daben Liu; Mohammed Noamany; Patrick Schone; Richard M. Schwartz; Dimitra Vergyri

Although Arabic is currently one of the most widely spoken languages in the world, there has been relatively little speech recognition research on Arabic compared to other languages. Moreover, most previous work has concentrated on the recognition of formal rather than dialectal Arabic. This paper reports on our project at the 2002 Johns Hopkins Summer Workshop, which focused on the recognition of dialectal Arabic. Three problems were addressed: (a) the lack of short vowels and other pronunciation information in Arabic texts; (b) the morphological complexity of Arabic; and (c) the discrepancies between dialectal and formal Arabic. We present novel approaches to automatic vowel restoration, morphology-based language modeling and the integration of out-of-corpus language model data, and report significant word error rate improvements on the LDC Arabic CallHome task.

Communications of The ACM | 2000

Integrated technologies for indexing spoken language

Francis Kubala; Sean Colbath; Daben Liu; Amit Srivastava; John Makhoul

... because it is impossible to efficiently locate information in large audio archives. By itself, speech does not permit content-based searches for information like those commonly employed for text documents over the Internet. But now, after more than a decade of steady advances in speech recognition, speaker identification, and language understanding, it is possible to begin building usable automatic content-based indexing tools for spoken language by integrating these emerging technologies. Once these tools become ...

International Conference on Acoustics, Speech, and Signal Processing | 2002

Audio Indexing of Arabic broadcast news

Jay Billa; Mohammed Noamany; Amit Srivastava; Daben Liu; Rebecca Stone; J. Xu; John Makhoul; Francis Kubala

This paper describes the development of the BBN Audio Indexing System for broadcast news in Arabic. Key issues addressed in this work revolve around the three major components of the audio indexing system: automatic speech recognition, speaker identification, and named entity identification. The system deals with several challenges introduced by the Arabic language, including the absence of short vowels in written text and the presence of compound words that are formed by the concatenation of certain conjunctions, prepositions, articles, and pronouns, as prefixes and suffixes to the word stem. The lack of short vowels in the transcripts prompted a novel solution that further demonstrated the power of hidden Markov models to deal with ambiguity. Another challenge was the acquisition of appropriate language modeling data, given the absence of broadcast news data for that purpose. We present performance results for all three components of the Audio Indexing System, which we believe represent the state of the art for Arabic broadcast news.

International Conference on Acoustics, Speech, and Signal Processing | 2004

Speech recognition in multiple languages and domains: the 2003 BBN/LIMSI EARS system

Richard M. Schwartz; Thomas Colthurst; Nicolae Duta; Herbert Gish; Rukmini Iyer; Chia-Lin Kao; Daben Liu; Owen Kimball; Jeff Z. Ma; John Makhoul; Spyros Matsoukas; Long Nguyen; Mohammed Noamany; Rohit Prasad; Bing Xiang; Dongxin Xu; Jean-Luc Gauvain; Lori Lamel; Holger Schwenk; Gilles Adda; Langzhou Chen

We report on the results of the first evaluations for the BBN/LIMSI system under the new DARPA EARS program. The evaluations were carried out for conversational telephone speech (CTS) and broadcast news (BN) for three languages: English, Mandarin, and Arabic. In addition to providing system descriptions and evaluation results, the paper highlights methods that worked well across the two domains and those few that worked well on one domain but not the other. For the BN evaluations, which had to be run under 10 times real-time, we demonstrated that a joint BBN/LIMSI system with a time constraint achieved better results than either system alone.

ACM Computing Surveys | 1999

Rough'n'Ready: a meeting recorder and browser

Francis Kubala; Sean Colbath; Daben Liu; John Makhoul

The objective of this effort is to integrate and enhance existing technologies in speech recognition, speaker identification, and topic classification to provide cost-effective transcription, structural summarization, and retrieval of user-specified aspects of meetings. A software system consisting of a meeting recorder and browser was designed and developed to provide a higher-level view of collaborative meetings, whether co-located or distributed, and a way to browse through and listen to the parts most relevant to the user.

Hawaii International Conference on System Sciences | 2000

Spoken documents: creating searchable archives from continuous audio

Sean Colbath; Francis Kubala; Daben Liu; Amit Srivastava

Current search technologies for audio rely on the cataloguer of the data to provide additional keywords or metadata to enable retrieval. This can lead to haphazard cataloging and misleading searches, and provides the end user with no summarization, editing, or information extraction capabilities. One obvious way to tackle this problem is to transcribe the speech-based audio using automatic speech recognition technology.

International Conference on Acoustics, Speech, and Signal Processing | 2008

Recent improvements and performance analysis of ASR and MT in a speech-to-speech translation system

David Stallard; Chia-Lin Kao; Kriste Krstovski; Daben Liu; Premkumar Natarajan; Rohit Prasad; Shirin Saleem; Krishna Subramanian

We report on recent ASR and MT work on our English/Iraqi Arabic speech-to-speech translation system. We present detailed results for both objective and subjective evaluations of translation quality, along with a detailed analysis and categorization of translation errors. We also present novel ideas for quantifying the relative importance of different subjective error categories, and for assigning the blame for an error to a particular phrase pair in the translation model.

Proceedings of the IEEE | 2000