Andrew Aaron | Researchain

Publication

Featured research published by Andrew Aaron.

International Conference on Acoustics, Speech, and Signal Processing | 2003

Recent improvements to the IBM trainable speech synthesis system

Ellen Eide; Andrew Aaron; Raimo Bakis; R. Cohen; Robert E. Donovan; Wael Hamza; T. Mathes; Michael Picheny; M. Polkosky; M. Smith; M. Viswanathan

In this paper we describe the current status of the trainable text-to-speech system at IBM. Recent algorithmic and database changes to the system have led to significant gains in the output quality. On the algorithms side, we have introduced statistical models for predicting pitch and duration targets which replace the rule-based target generation previously employed. Additionally, we have changed the cost function and the search strategy, introduced a post-search pitch smoothing algorithm, and improved our method of preselection. Through the combined data and algorithmic contributions, we have been able to significantly improve (p < 0.0001) the mean opinion score (MOS) of our female voice, from 3.68 to 4.85 when heard over loudspeakers and to 5.42 when heard over the telephone (seven-point scale).

International Conference on Acoustics, Speech, and Signal Processing | 2001

Speech recognition for DARPA Communicator

Andrew Aaron; Scott Saobing Chen; Paul S. Cohen; Satya Dharanipragada; Ellen Eide; Martin Franz; Jean-Michel LeRoux; Xiaoqiang Luo; Benoît Maison; Lidia Mangu; T. Mathes; Miroslav Novak; Peder A. Olsen; Michael Picheny; Harry Printz; Bhuvana Ramabhadran; Andrej Sakrajda; George Saon; Borivoj Tydlitát; Karthik Visweswariah; D. Yuk

We report the results of investigations in acoustic modeling, language modeling and decoding techniques, for the DARPA Communicator, a speaker-independent, telephone-based dialog system. By a combination of methods, including enlarging the acoustic model, augmenting the recognizer vocabulary, conditioning the language model upon the dialog state, and applying a post-processing decoding method, we lowered the overall word error rate from 21.9% to 15.0%, a gain of 6.9% absolute and 31.5% relative.
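The absolute and relative gains quoted in this abstract follow from the two word error rates alone. As a minimal sketch (the function name `wer_reduction` is hypothetical and not taken from the paper), the arithmetic can be checked like this:

```python
def wer_reduction(before: float, after: float) -> tuple[float, float]:
    """Return (absolute, relative) word error rate reduction.

    Inputs are percentages. The absolute reduction is in percentage
    points; the relative reduction is a percentage of the starting rate.
    """
    if before <= 0:
        raise ValueError("starting error rate must be positive")
    absolute = before - after
    relative = absolute / before * 100.0
    return absolute, relative


# Figures reported in the abstract: 21.9% lowered to 15.0%.
absolute, relative = wer_reduction(21.9, 15.0)
print(f"{absolute:.1f} points absolute, {relative:.1f}% relative")
```

The relative figure is the absolute reduction divided by the starting error rate, which is why a 6.9 point drop from 21.9% corresponds to roughly a 31.5% relative improvement.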

SSW | 2004