Alexander H. Gruenstein
Publications
Featured research published by Alexander H. Gruenstein.
International Conference on Acoustics, Speech, and Signal Processing | 2016
Ian McGraw; Rohit Prabhavalkar; Raziel Alvarez; Montse Gonzalez Arenas; Kanishka Rao; David Rybach; Ouais Alsharif; Hasim Sak; Alexander H. Gruenstein; Francoise Beaufays; Carolina Parada
We describe a large vocabulary speech recognition system that is accurate, has low latency, and yet has a small enough memory and computational footprint to run faster than real-time on a Nexus 5 Android smartphone. We employ a quantized Long Short-Term Memory (LSTM) acoustic model trained with connectionist temporal classification (CTC) to directly predict phoneme targets, and further reduce its memory footprint using an SVD-based compression scheme. Additionally, we minimize our memory footprint by using a single language model for both dictation and voice command domains, constructed using Bayesian interpolation. Finally, in order to properly handle device-specific information, such as proper names and other context-dependent information, we inject vocabulary items into the decoder graph and bias the language model on-the-fly. Our system achieves 13.5% word error rate on an open-ended dictation task, running with a median speed that is seven times faster than real-time.
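The SVD-based compression scheme is only named in the abstract; below is a minimal numpy sketch of the general technique, using illustrative matrix sizes and rank rather than the paper's actual layer dimensions. A trained weight matrix is factored into two low-rank matrices whose product approximates it, so the layer can be replaced by two smaller matrix multiplications.

```python
import numpy as np

# Illustrative sketch of SVD-based weight compression. Sizes and rank are
# assumptions for demonstration, not the parameters used in the paper.
rows, cols, rank = 1024, 512, 64

rng = np.random.default_rng(0)
W = rng.standard_normal((rows, cols)).astype(np.float32)  # stand-in for a trained weight matrix

# Truncated SVD: keep only the top-`rank` singular values/vectors.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :rank] * s[:rank]   # shape (rows, rank)
B = Vt[:rank, :]             # shape (rank, cols)

# A @ B approximates W with far fewer parameters.
print("params:", W.size, "->", A.size + B.size)
print("relative error:", np.linalg.norm(W - A @ B) / np.linalg.norm(W))
```

Note that a random matrix like the one above is far from low rank, so the reported error is pessimistic; trained acoustic-model weights are typically much better approximated at a given rank.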
New Era for Robust Speech Recognition: Exploiting Deep Learning | 2017
Michiel Bacchiani; Francoise Beaufays; Alexander H. Gruenstein; Pedro J. Moreno; Johan Schalkwyk; Trevor Strohman; Heiga Zen
Since the wide adoption of smartphones, speech as an input modality has developed from a science-fiction dream into a widely accepted technology. The quality demands that fueled this adoption are high and have been a continuous focus of research at Google. Early deployment of large neural network models, trained on large datasets, significantly improved core recognition accuracy. The adoption of newer approaches such as long short-term memory models and connectionist temporal classification has further improved accuracy and reduced latency. In addition, adaptive language modeling algorithms improve accuracy based on the context of the speech input. A focus on expanding coverage of the user population in terms of languages and speaker characteristics (e.g., child speech) has led to novel algorithms that further advanced the vision of universal speech input. Continuing this trend, our most recent investigations target noise and far-field robustness. Handling speech in those conditions will enable in-car, wearable, and in-the-home applications, and is another step toward truly universal speech input. This chapter briefly describes the algorithmic developments at Google over the past decade that have brought speech processing to where it is today.
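The adaptive, context-based language modeling mentioned above is described only at a high level. As a loose illustration of the general idea (not the chapter's actual algorithm), the sketch below rescoring an n-best list so that hypotheses containing context-relevant phrases, such as a user's contact names, receive a log-score boost:

```python
import math

def rescore(hypotheses, context_phrases, boost=2.0):
    """Re-rank (text, log_score) hypotheses, boosting those that contain context phrases."""
    rescored = []
    for text, log_score in hypotheses:
        matches = sum(1 for phrase in context_phrases if phrase in text)
        rescored.append((text, log_score + boost * matches))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# Without biasing the generic hypothesis wins; with a contact-name context it is overtaken.
nbest = [("call anne marie", math.log(0.35)), ("call ann marie", math.log(0.20))]
print(rescore(nbest, context_phrases={"ann marie"}))
```

The systems described in these publications apply such biasing inside the decoder itself rather than as a post-hoc rescoring pass; the toy example only shows the intended effect on ranking.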
Archive | 2010
Michael J. Lebeau; William J. Byrne; Nicholas Jitkoff; Alexander H. Gruenstein
Conference of the International Speech Communication Association | 2013
Xin Lei; Andrew W. Senior; Alexander H. Gruenstein; Jeffrey S. Sorensen
Archive | 2011
Alexander H. Gruenstein; William J. Byrne
Conference of the International Speech Communication Association | 2010
Brandon M. Ballinger; Cyril Allauzen; Alexander H. Gruenstein; Johan Schalkwyk
Archive | 2011
William J. Byrne; Brett Rolston Lider; Nicholas Jitkoff; Alexander H. Gruenstein; Benedict John Davies
Archive | 2010
William J. Byrne; Alexander H. Gruenstein; Douglas H. Beeferman
Archive | 2013
Alexander H. Gruenstein; Petar Aleksic
Conference of the International Speech Communication Association | 2012
Ian McGraw; Alexander H. Gruenstein