Publication


Featured research published by Gordon Ramsay.


Speech Communication | 1997

Production models as a structural basis for automatic speech recognition

Li Deng; Gordon Ramsay; Don X. Sun

We postulate in this paper that highly structured speech production models will have much to contribute to the ultimate success of speech recognition in view of the weaknesses of the theoretical foundation underpinning current technology. These weaknesses are analyzed in terms of phonological modeling and of phonetic-interface modeling. We present two probabilistic speech recognition models with the structure designed based on approximations to human speech production mechanisms, and conclude by suggesting that many of the advantages to be gained from interaction between speech production and speech recognition communities will develop from integrating production models with the probabilistic analysis-by-synthesis strategy currently used by the technology community.


Journal of the Acoustical Society of America | 2008

The Influence of Constriction Geometry on Sound Generation in Fricative Consonants

Gordon Ramsay

Sound generation in fricative consonants is traditionally supposed to depend only on the Reynolds Number, usually defined in terms of the constriction area and the volume velocity at the constriction. The potential influence of the detailed three‐dimensional geometry of the constriction is often ignored, even though previous empirical studies have shown this to have an important effect on the spectral shape of the source and the overall sound strength. At present, the physical processes governing turbulent jet formation and aeroacoustic source generation in fricative consonants are not fully understood. In this paper, we use large‐eddy simulations of three‐dimensional viscous incompressible flow to visualize the development of the turbulent flow field and aeroacoustic source distribution in an elliptical duct representing the vocal tract, for elliptical, laminar, and grooved constriction shapes that share the same cross‐sectional area function. By contrasting results for these geometries, we test the hypo...


International Conference on Spoken Language Processing | 1996

Optimal filtering and smoothing for speech recognition using a stochastic target model

Gordon Ramsay; Li Deng

Presents a stochastic target model of speech production, where articulator motion in the vocal tract is represented by the state of a Markov-modulated linear dynamical system, driven by a piecewise-deterministic control trajectory and observed through a non-linear function representing the articulatory-acoustic mapping. Optimal filtering and smoothing algorithms for estimating the hidden states of the model from acoustic measurements are derived using a measure-change technique and require the solution of recursive integral equations. A sub-optimal approximation is developed and illustrated using examples taken from real speech.


Journal of the Acoustical Society of America | 1994

A stochastic framework for articulatory speech recognition

Gordon Ramsay; Li Deng

One of the major difficulties in incorporating knowledge of speech production constraints into speech recognition lies in the problem of adequately characterizing the relationship between an articulatory description of speech and its lexical and acoustic counterparts, and in developing procedures for recovering such an articulatory description from acoustic input. In this paper, a new stochastic framework for articulatory speech recognition is presented aimed at addressing these issues. Utterances are described in terms of overlapping gestural units that are built into a Markov state structure. Each gestural combination is identified with a set of acoustic/articulatory correlates embodied in a target distribution on an articulatory parameter space, while articulatory motion is represented by a stochastic linear dynamical system whose parameters and input are indexed on the Markov state. A piecewise‐linear approximation to the articulatory‐acoustic mapping derived from an explicit production model transform...


IEEE Signal Processing Letters | 1995

Tracking nonstationary targets using a dynamical system with Markov-modulated parameters

Gordon Ramsay; Li Deng

Tracking moving targets from partial measurements is an important problem with applications in control theory and pattern recognition. This letter presents a statistical framework for modeling target motion as the output of a linear dynamical system driven by a random step function representing sequences of idealized target positions. Target trajectories are observed through a noisy nonlinear measurement function, whereas system parameters are modulated by a semi-Markov chain representing changes in target regime. An algorithm is presented for maximum-likelihood parameter estimation from a corpus of observations, and potential applications to articulatory speech recognition are discussed.
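
The kind of model this abstract describes can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses a scalar first-order system, an ordinary Markov chain in place of the semi-Markov chain (so regime dwell times are geometric), and tanh as a stand-in for the nonlinear measurement function. The function name `simulate` and all numeric values are hypothetical.

```python
import math
import random


def simulate(num_steps=200, seed=0, process_std=0.02, noise_std=0.05):
    """Simulate a scalar target-tracking system with regime-dependent parameters.

    All names and values here are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    targets = [-1.0, 0.0, 1.0]  # assumed idealized target position for each regime
    poles = [0.90, 0.80, 0.90]  # assumed regime-dependent dynamics, each in (0, 1)
    stay_prob = 0.95            # assumed chance of remaining in the current regime

    regime = 0
    state = targets[regime]
    regimes, states, observations = [], [], []
    for _ in range(num_steps):
        # Regime switch: a plain Markov chain (a simplification of semi-Markov).
        if rng.random() > stay_prob:
            regime = rng.choice([r for r in range(len(targets)) if r != regime])
        # Linear dynamics driven by the piecewise-constant target (a step input).
        a = poles[regime]
        state = a * state + (1.0 - a) * targets[regime] + rng.gauss(0.0, process_std)
        # Noisy nonlinear measurement; tanh is only a placeholder nonlinearity.
        observation = math.tanh(state) + rng.gauss(0.0, noise_std)
        regimes.append(regime)
        states.append(state)
        observations.append(observation)
    return regimes, states, observations
```

Calling `simulate()` returns the regime sequence, the hidden states, and the observations. Estimating the parameters from the observations alone is the maximum-likelihood problem the letter addresses, and it is not attempted in this sketch.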


International Conference on Spoken Language Processing | 1996

A non-linear filtering approach to stochastic training of the articulatory-acoustic mapping using the EM algorithm

Gordon Ramsay

Current techniques for training representations of the articulatory-acoustic mapping from data rely on artificial simulations to provide codebooks of articulatory and acoustic measurements, which are then modelled by simple functional approximations. This paper outlines a stochastic framework for adapting an artificial model to real speech from acoustic measurements alone, using the EM algorithm. It is shown that parameter and state estimation problems for articulatory-acoustic inversion can be solved by adopting a statistical approach based on non-linear filtering.


Journal of the Acoustical Society of America | 2016

The clockwork music of speech: Gestural synthesis in 18th and 19th century speaking machines

Gordon Ramsay

Gestural theories of phonology developed in the latter half of the 20th century have proposed that intrinsically timed sequences of linguistically significant actions of the vocal tract are the natural embodiment of speech, in stark contrast to earlier theories that emphasized the role of pure acoustic, auditory or articulatory representations in speech production and perception. Although these recent theories apparently represent a significant change in viewpoint, and have been highly influential, it has been largely forgotten that most of these ideas can actually be traced back to the 18th and 19th centuries, when they were the dominant perspective. Remarkably, early attempts to create speaking robots considered from the outset, and endeavoured to replicate, not only the physics of sound production in realistic vocal tract geometries, but also the sequencing of those geometries over time. This presentation traces the prehistory of gestural synthesis by using unpublished historical documents to analyze t...


Journal of the Acoustical Society of America | 2016

Developmental progressions in the harmonic structure of infant-directed speech

Jhonelle Bailey; Shweta Ghai; Gordon Ramsay

Caregivers speak differently to infants than to adults early in life. Previous studies of this special register, “motherese,” have sought to identify key differences in voice quality between infant-directed and adult-directed speech. Acoustic properties most salient to infants appear to reside in exaggerations of prosodic structure, but these findings are limited by speech analysis techniques that do not fully capture the acoustic structure of the maternal voice or the way motherese changes over the course of development. In this study, multitaper analysis was used to measure developmental progressions in the complete harmonic structure of the maternal voice. Samples of adult- and infant-directed speech were extracted from home audio recordings of 20 mothers collected monthly from 0-24 months using LENA technology. Multitaper analysis was used to calculate the time-varying amplitude and phase of every harmonic component, the residual noise component, and the spectral envelope, permitting a complete statis...


Journal of the Acoustical Society of America | 2009

Talking heads: Speech synthesis and embodied cognition.

Philip E. Rubin; Gordon Ramsay; Eric Vatikiotis-Bateson

This presentation provides a brief overview of 300 years of effort toward the creation of talking heads: mechanical, electronic, and/or computational models of human speech. Speech, language, communication, and cognition are fundamentally shaped, in part, by both biological and physical factors. To understand this grounding and how to effectively replicate its most salient aspects in synthesis systems requires us to pay serious attention to the structure, kinematics, and dynamics of the articulators; the organization and characterization of complex, emergent behavior in multimodal systems; and the consideration of how events within such systems unfold over multiple time scales. New approaches that will help advance our knowledge and improve our synthesis tools and techniques will be discussed.


Journal of the Acoustical Society of America | 2009

Parametric excitation of the acoustic eigenmodes of the vocal tract by motion of the vocal folds: Non‐aeroacoustic mechanisms of phonation.

Gordon Ramsay

Studies of voice production have focused on aeroacoustic mechanisms of phonation, describing how energy in the background flow is converted into sources of sound. This paper describes a non‐aeroacoustic mechanism that may be important in explaining how sound is generated by vocal fold vibration. According to dynamical systems theory, temporal variations in the parameters of a system may result in parametric excitation of the system. Under appropriate conditions, modulation of the parameters of the acoustic wave operator governing sound propagation in the vocal tract by changes in glottal geometry should be sufficient to induce parametric excitation of the vocal tract eigenmodes, creating sources of sound additional to those predicted by classical aeroacoustics. Rapid changes in the shape of the glottis during phonation are already known to create modulations of the formants within each glottal cycle, but the role of these modulations in exciting the formants has not been established. To test this hypothes...

Collaboration


Dive into Gordon Ramsay's collaborations.

Top Co-Authors

Catherine T. Best

University of Western Sydney

Eric Vatikiotis-Bateson

University of British Columbia
