Trevor Strohman
Publications
Featured research published by Trevor Strohman.
International Conference on Acoustics, Speech, and Signal Processing | 2015
Zhenzhen Kou; Daisy Stanton; Fuchun Peng; Francoise Beaufays; Trevor Strohman
The pronunciation dictionary, or lexicon, is an essential component in an automatic speech recognition (ASR) system in that incorrect pronunciations cause systematic misrecognitions. It typically consists of a list of word-pronunciation pairs written by linguists, and a grapheme-to-phoneme (G2P) engine to generate pronunciations for words not in the list. The hand-generated list can never keep pace with the growing vocabulary of a live speech recognition system, and the G2P is usually of limited accuracy. This is especially true for proper names whose pronunciations may be influenced by various historical or foreign-origin factors. In this paper, we propose a language-independent approach to detect misrecognitions and their corrections from voice search logs. We learn previously unknown pronunciations from this data, and demonstrate that they significantly improve the quality of a production-quality speech recognition system.
New Era for Robust Speech Recognition, Exploiting Deep Learning | 2017
Michiel Bacchiani; Francoise Beaufays; Alexander H. Gruenstein; Pedro J. Moreno; Johan Schalkwyk; Trevor Strohman; Heiga Zen
Since the wide adoption of smartphones, speech as an input modality has developed from a science fiction dream to a widely accepted technology. The quality bar that this technology had to meet to fuel that adoption is high, and meeting it has been a continuous focus of research activities at Google. Early adoption of large neural network model deployments and training of such models on large datasets have significantly improved core recognition accuracy. Adoption of novel approaches like long short-term memory models and connectionist temporal classification has further improved accuracy and reduced latency. In addition, algorithms that allow adaptive language modeling improve accuracy based on the context of the speech input. Focus on expanding coverage of the user population in terms of languages and speaker characteristics (e.g., child speech) has led to novel algorithms that further pushed the universal speech input vision. Continuing this trend, our most recent investigations have been on noise and far-field robustness. Tackling speech processing in those environments will enable applications in in-car, wearable, and in-the-home scenarios and as such be another step towards true universal speech input. This chapter will briefly describe the algorithmic developments at Google over the past decade that have brought speech processing to where it is today.
Archive | 2009
Robert M. Wyman; Trevor Strohman; Paul Haahr; Laramie Leavitt; John Sarapata
Archive | 2012
Trevor Strohman
Archive | 2013
Fuchun Peng; Francoise Beaufays; Brian Strope; Xin Lei; Pedro J. Moreno Mengibar; Trevor Strohman
Archive | 2010
Adam Sadovsky; Paul Haahr; Trevor Strohman; Per Bjornsson; Jun Xu; Gabriel Schine; Jay Shrauner
Archive | 2014
Fuchun Peng; Francoise Beaufays; Brian Strope; Xin Lei; Pedro J. Moreno Mengibar; Trevor Strohman
Archive | 2013
Brian Strope; Francoise Beaufays; Trevor Strohman
Archive | 2013
Fuchun Peng; Francoise Beaufays; Brian Strope; Xin Lei; Pedro J. Moreno Mengibar; Trevor Strohman
arXiv: Computation and Language | 2018
Arun Narayanan; Ananya Misra; Khe Chai Sim; Golan Pundak; Anshuman Tripathi; Mohamed Elfeky; Parisa Haghani; Trevor Strohman; Michiel Bacchiani