Francis Kubala | Researchain

Publication

Featured research published by Francis Kubala.

Communications of the ACM | 2000

Integrated technologies for indexing spoken language

Francis Kubala; Sean Colbath; Daben Liu; Amit Srivastava; John Makhoul

... because it is impossible to efficiently locate information in large audio archives. By itself, speech does not permit content-based searches for information like those commonly employed for text documents over the Internet. But now, after more than a decade of steady advances in speech recognition, speaker identification, and language understanding, it is possible to begin building usable automatic content-based indexing tools for spoken language by integrating these emerging technologies. Once these tools become ...

International Conference on Acoustics, Speech, and Signal Processing | 2002

Audio Indexing of Arabic broadcast news

Jay Billa; Mohammed Noamany; Amit Srivastava; Daben Liu; Rebecca Stone; J. Xu; John Makhoul; Francis Kubala

This paper describes the development of the BBN Audio Indexing System for broadcast news in Arabic. Key issues addressed in this work revolve around the three major components of the audio indexing system: automatic speech recognition, speaker identification, and named entity identification. The system deals with several challenges introduced by the Arabic language, including the absence of short vowels in written text and the presence of compound words that are formed by the concatenation of certain conjunctions, prepositions, articles, and pronouns, as prefixes and suffixes to the word stem. The lack of short vowels in the transcripts prompted a novel solution that further demonstrated the power of hidden Markov models to deal with ambiguity. Another challenge was the acquisition of appropriate language modeling data, given the absence of broadcast news data for that purpose. We present performance results for all three components of the Audio Indexing System, which we believe represent the state of the art for Arabic broadcast news.

International Conference on Acoustics, Speech, and Signal Processing | 1989

Iterative normalization for speaker-adaptive training in continuous speech recognition

M.-W. Feng; Richard G. Schwartz; Francis Kubala; J. Makhoul

The authors present several techniques to improve an algorithm presented last year for speaker-adaptive training in continuous speech recognition. The previous method uses a transformation matrix to modify the hidden Markov model (HMM) parameters of a prechosen prototype speaker to model a target speaker. To estimate the transformation matrix, it aligns a set of target speech with the same set of speech uttered by the prototype speaker using dynamic time warping. The authors focus on improving two aspects of the previous method: the modeling of the spectral differences between two speakers, and the accuracy of the alignment. To improve the modeling of the spectral differences, they implemented a phoneme-dependent mapping procedure which transforms the prototype HMMs to the estimated target HMMs using a set of phoneme-dependent matrices. To improve the alignment, the authors developed a model of silence, a linear duration normalization, and an iterative normalization procedure. They tested the new methods on the standard DARPA database with a grammar of perplexity 60. The results show a 30% word-error reduction compared to the previous algorithm.
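
The alignment step mentioned in this abstract, dynamic time warping, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `dtw_align`, the use of scalar features, and the absolute-difference frame distance are all assumptions made for illustration, and the abstract does not state what features or path constraints the paper used.

```python
from typing import List, Sequence, Tuple


def dtw_align(a: Sequence[float], b: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Align two non-empty 1-D sequences with basic dynamic time warping.

    Returns the total alignment cost and the warping path as a list of
    (index_in_a, index_in_b) pairs. Hypothetical helper for illustration.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed frame distance
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end, preferring the diagonal step on ties.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


total, path = dtw_align([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
print(total, path)
```

In this toy example the second sequence repeats one value, so the path maps a single frame of the first sequence to two frames of the second. In a speaker-adaptation setting, aligned frame pairs of this kind would be the input for estimating a mapping between the two speakers.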

Proceedings of the NATO Advanced Study Institute on Recent Advances in Speech Understanding and Dialog Systems | 1988

Acoustic-phonetic decoding of speech

Richard M. Schwartz; Yen-Lu Chow; M. Dunham; Owen Kimball; M. Krasner; Francis Kubala; J. Makhoul; P. Price; Salim E. Roucos

Several methods for acoustic-phonetic decoding are reviewed. Emphasis is placed on the need for mathematical methods for speech recognition. Several examples of statistical methods are described. The authors present several techniques for incorporating “speech knowledge” into these statistical models, and provide a simple formalism for using multiple knowledge sources in a coherent speech recognition system.

North American Chapter of the Association for Computational Linguistics | 2003

TAP-XL: an automated analyst's assistant

Sean Colbath; Francis Kubala

The TAP-XL Automated Analyst's Assistant is an application designed to help an English-speaking analyst write a topical report, culling information from a large inflow of multilingual, multimedia data. It gives users the ability to spend their time finding more data relevant to their task, and gives them translingual reach into other languages by leveraging human language technology.

North American Chapter of the Association for Computational Linguistics | 2004

Multilingual video and audio news alerting

David D. Palmer; Patrick Bray; Marc Reichman; Katherine Rhodes; Noah White; Andrew Merlino; Francis Kubala

This paper describes a fully automated, real-time broadcast news video and audio processing system. The system combines speech recognition, machine translation, and cross-lingual information retrieval components to enable real-time alerting from live English and Arabic news sources.

Archive | 1999