Todd Andrew Stephenson
University of Edinburgh
Publications
Featured research published by Todd Andrew Stephenson.
IEEE Automatic Speech Recognition and Understanding Workshop | 2003
Mathew Magimai.-Doss; Todd Andrew Stephenson; Samy Bengio
State-of-the-art ASR systems typically use phonemes as the subword units. We investigate a system in which the word models are defined in terms of two different subword units, namely phonemes and graphemes. We train models for both subword units and then perform decoding using either both units or just one. We studied this system for American English, where the correspondence between graphemes and phonemes is weak. The study was carried out in the framework of a state-of-the-art hybrid HMM/ANN system. The results show good potential in using graphemes as auxiliary subword units.
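A minimal sketch of scoring a word hypothesis with two independently trained subword models, as the abstract describes. The log-linear combination and all names here are illustrative assumptions, not necessarily the paper's exact decoding scheme:

```python
# Hypothetical per-word log-likelihoods from two separately trained
# subword models (phoneme-based and grapheme-based) for one hypothesis.
log_lik_phoneme = -42.7
log_lik_grapheme = -45.1

def combined_score(lp, lg, weight=0.5):
    """Log-linear combination of phoneme- and grapheme-model scores.

    weight=1.0 decodes with phonemes only, weight=0.0 with graphemes only,
    intermediate values use both subword units.
    """
    return weight * lp + (1.0 - weight) * lg

score_both = combined_score(log_lik_phoneme, log_lik_grapheme)
score_phonemes_only = combined_score(log_lik_phoneme, log_lik_grapheme, weight=1.0)
```

With `weight=1.0` the combined score reduces exactly to the phoneme model's score, so a single decoder can cover the "either both or just one subword unit" conditions the abstract mentions.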
IEEE Workshop on Neural Networks for Signal Processing | 2002
Todd Andrew Stephenson; J. Escofet; Mathew Magimai-Doss
Pitch and energy are two fundamental features describing speech and are important in human speech recognition. However, when incorporated as features in automatic speech recognition (ASR), they usually cause a significant degradation in recognition performance, owing to the noise inherent in estimating or modeling them. We show experimentally how this can be corrected either by conditioning the emission distributions upon these features or by marginalizing these features out in recognition. Since this is not straightforward with standard hidden Markov models (HMMs), the work has been performed in the framework of dynamic Bayesian networks (DBNs), which offer more flexibility in defining the topology of the emission distributions and in specifying which variables should be marginalized out.
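The two options the abstract contrasts, conditioning on an auxiliary feature versus marginalizing it out, can be sketched for a single HMM state with a discrete auxiliary variable. All numbers and names below are hypothetical placeholders, not values from the paper:

```python
import math

def gauss_pdf(x, mean, var):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

# Hypothetical emission model for one HMM state q: the acoustic observation x
# is conditioned on a discrete auxiliary variable a (e.g., a quantized pitch bin).
p_aux = [0.3, 0.7]        # p(a | q): prior over the auxiliary variable in state q
means = [1.0, 2.0]        # mean of p(x | q, a) for each auxiliary value
variances = [0.5, 0.5]    # variance of p(x | q, a) for each auxiliary value

def emission_conditioned(x, a):
    """p(x | q, a): used when the auxiliary variable is observed."""
    return gauss_pdf(x, means[a], variances[a])

def emission_marginalized(x):
    """p(x | q) = sum_a p(a | q) * p(x | q, a): used when a is hidden."""
    return sum(pa * gauss_pdf(x, m, v)
               for pa, m, v in zip(p_aux, means, variances))
```

In a DBN this choice is just a matter of declaring the auxiliary node observed or hidden; the marginalized emission is the mixture over the auxiliary variable's values.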
International Conference on Pattern Recognition | 2002
Todd Andrew Stephenson; Mathew Magimai-Doss
In standard automatic speech recognition (ASR), hidden Markov models (HMMs) calculate their emission probabilities with an artificial neural network (ANN) or a Gaussian distribution conditioned only upon the hidden state variable. Stephenson et al. (2001) showed the benefit of also conditioning the emission distributions upon a discrete auxiliary variable, which is observed in training and hidden in recognition. Related work (Fujinaga et al., 2001) has shown the utility of conditioning the emission distributions on a continuous auxiliary variable. We apply mixed Bayesian networks (BNs) to extend these works by introducing a continuous auxiliary variable that is observed in training but hidden in recognition. We find that an auxiliary pitch variable that is itself conditioned upon the hidden state can degrade performance unless the auxiliary variable is also hidden. The performance can furthermore be improved by making the auxiliary pitch variable independent of the hidden state.
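For a continuous auxiliary variable, hiding it in recognition means integrating it out of the emission density. Under a linear-Gaussian assumption this integral is closed-form; the model and all parameter values below are an illustrative assumption, not the paper's actual emission topology:

```python
import math

def gauss_pdf(x, mean, var):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

# Hypothetical linear-Gaussian emission: a continuous auxiliary variable a
# (e.g., pitch) with prior a ~ N(mu_a, var_a), and observation
# x | a ~ N(w * a + b, var_x).
mu_a, var_a = 120.0, 25.0   # illustrative pitch prior (Hz)
w, b, var_x = 0.01, 0.5, 0.2

def emission_marginalized(x):
    """p(x) with a integrated out analytically:
    x ~ N(w * mu_a + b, var_x + w**2 * var_a)."""
    return gauss_pdf(x, w * mu_a + b, var_x + w ** 2 * var_a)
```

The marginal stays Gaussian, with the auxiliary variable's uncertainty inflating the observation variance by `w**2 * var_a`, which is one reason marginalizing a noisy pitch estimate can behave better than conditioning on it.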
Conference of the International Speech Communication Association | 1998
Simon King; Todd Andrew Stephenson; Stephen Isard; Paul Taylor; Alex Strachan
IEEE Transactions on Speech and Audio Processing | 2004
Todd Andrew Stephenson; Mathew Magimai.-Doss
Conference of the International Speech Communication Association | 2001
Todd Andrew Stephenson; Mathew Magimai.-Doss
Archive | 2003
M. Magimai; Todd Andrew Stephenson; Hervé Bourlard; Samy Bengio
Conference of the International Speech Communication Association | 2004
Mathew Magimai.-Doss; Todd Andrew Stephenson; Shajith Ikbal
International Conference on Spoken Language Processing | 2002
Todd Andrew Stephenson; Mathew Magimai.-Doss
Archive | 2002
Mathew Magimai.-Doss; Todd Andrew Stephenson; Hervé Bourlard