Publication


Featured research published by Leo Lee.


international conference on acoustics, speech, and signal processing | 2004

A structured speech model with continuous hidden dynamics and prediction-residual training for tracking vocal tract resonances

Li Deng; Leo Lee; Hagai Attias; Alex Acero

A novel approach is developed for efficient and accurate tracking of vocal tract resonances, which are natural frequencies of the resonator from larynx to lips, in fluent speech. The tracking algorithm is based on a version of the structured speech model consisting of continuous-valued hidden dynamics and a piecewise-linearized prediction function from resonance frequencies and bandwidths to LPC cepstra. We present details of the piecewise linearization design process and an adaptive training technique for the parameters that characterize the prediction residuals. An iterative tracking algorithm is described and evaluated that embeds both the prediction-residual training and the piecewise linearization design in an adaptive Kalman filtering framework. Experiments on tracking vocal tract resonances in Switchboard speech data demonstrate high accuracy in the results, as well as the effectiveness of residual training embedded in the algorithm. Our approach differs from traditional formant trackers in that it provides meaningful results even during consonantal closures when the supra-laryngeal source may cause no spectral prominences in speech acoustics.
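The Kalman filtering idea behind this kind of tracker can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a scalar resonance frequency, a random-walk state model, and a direct noisy observation of the frequency, whereas the abstract describes a piecewise-linearized mapping to LPC cepstra with adaptively trained residuals. The function name and the noise variances `q` and `r` are hypothetical.

```python
import random


def kalman_track(observations, q=25.0, r=400.0, x0=500.0, p0=1.0e4):
    """Scalar Kalman filter with a random-walk state model (illustrative only).

    q  : assumed process-noise variance
    r  : assumed observation-noise variance
    x0 : initial state mean, p0 : initial state variance
    Returns the list of filtered state estimates.
    """
    x, p = x0, p0
    estimates = []
    for y in observations:
        # Predict: a random walk keeps the mean and inflates the variance.
        p = p + q
        # Update: blend the prediction and the observation via the Kalman gain.
        k = p / (p + r)
        x = x + k * (y - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# Usage: track a slowly rising synthetic "resonance" (Hz) through noise.
rng = random.Random(0)
truth = [500.0 + 2.0 * t for t in range(100)]
noisy = [f + rng.gauss(0.0, 20.0) for f in truth]
tracked = kalman_track(noisy)
```

Because the filter carries a state estimate forward even when an observation is uninformative, a tracker of this family can keep producing values where no spectral peak is visible, which is the property the abstract highlights for consonantal closures.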


international conference on acoustics, speech, and signal processing | 2003

Variational inference and learning for segmental switching state space models of hidden speech dynamics

Leo Lee; Hagai Attias; Li Deng

This paper describes novel and powerful variational EM algorithms for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production. Hidden dynamic models (HDMs) have recently become a promising class of acoustic models that incorporate crucial speech-specific knowledge and overcome many inherent weaknesses of traditional HMMs. However, the lack of powerful and efficient statistical learning algorithms is one of the main obstacles preventing them from being well studied and widely used. Since exact inference and learning are intractable, a variational approach is taken to develop effective approximate algorithms. We have implemented the segmental constraint crucial for modeling speech dynamics and present algorithms for recovering hidden speech dynamics and discrete speech units from acoustic data only. The effectiveness of the algorithms is verified by experiments on simulated data and Switchboard speech data.


international conference on acoustics, speech, and signal processing | 2001

A functional articulatory dynamic model for speech production

Leo Lee; Paul W. Fieguth; Li Deng

This paper introduces a statistical speech production model. The model synthesizes natural speech by modeling some key dynamic properties of vocal articulators in a linear/nonlinear state-space framework. The goal-oriented movements of the articulators (tongue tip, tongue dorsum, upper lip, lower lip, and jaw) are described by a linear dynamic state equation. The resulting articulatory trajectories, combined with the effects of the velum and larynx, are nonlinearly mapped into the acoustic feature space (MFCCs). The key challenges in this model are the development of a nonlinear parameter estimation methodology and the incorporation of appropriate prior assumptions into the articulatory dynamic structure. Such a model can also be applied directly to speech recognition to better account for coarticulation and phonetic reduction phenomena with considerably fewer parameters than HMM-based approaches.


international conference on image processing | 2005

A probabilistic living cell segmentation model

Nezamoddin N. Kachouie; Leo Lee; Paul W. Fieguth

A better understanding of cell behavior is very important in drug and disease research. Cell size, shape, and motility may play a key role in stem-cell specialization or cancer development. However, inferring these values manually from image sequences is so onerous that automated methods of cell tracking and segmentation are in high demand, especially given the increasing amount of cell data being collected. In this paper, a novel probabilistic cell model is designed to segment individual hematopoietic stem cells (HSCs) extracted from mouse bone marrow cells. The proposed cell model has been successfully applied to HSC segmentation, identifying the most probable cell locations in the image on the basis of cell brightness and morphology.


international conference on image processing | 2003

Parametric contour estimation by simulated annealing

Michael Jamieson; Paul W. Fieguth; Leo Lee

Virtually all implementations of simulated annealing are simplified by assuming discrete unknowns; however, continuous-parameter annealing has many potential applications in image processing. Widely scattered problems such as formant tracking, boundary estimation, and phase unwrapping can all be approached as the annealed minimization of continuous B-spline parameters. The benefits of simulated annealing are well known, including insensitivity to initial conditions and the ability to solve problems with many local minima. Discrete-variable annealing has seen broad application; however, continuous-variable annealing is limited by the computational challenge of Gibbs sampling. In this paper, we develop efficient approaches to sampling, illustrated in the context of contour tracking in noisy images.
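Annealing over a continuous parameter can be sketched as follows. This is not the paper's sampler: it swaps Gibbs sampling for a simpler Metropolis rule with Gaussian proposals, works on a single scalar instead of B-spline parameters, and uses a hypothetical test energy `bumpy` with several local minima. The cooling schedule and step size are assumptions chosen only for illustration.

```python
import math
import random


def anneal(energy, x0, steps=5000, t0=5.0, t_end=1e-3, step_size=0.5, seed=0):
    """Minimize energy(x) over a real-valued x by Metropolis-style annealing."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for i in range(steps):
        # Geometric cooling from t0 down to t_end.
        t = t0 * (t_end / t0) ** (i / max(steps - 1, 1))
        # Continuous proposal: perturb the current parameter with Gaussian noise.
        cand = x + rng.gauss(0.0, step_size)
        e_cand = energy(cand)
        # Metropolis rule: always accept downhill, sometimes accept uphill.
        if e_cand <= e or rng.random() < math.exp((e - e_cand) / t):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


def bumpy(x):
    """Hypothetical test energy: a shallow bowl with sinusoidal ripples."""
    return 0.1 * x * x + math.sin(3.0 * x)


# Usage: start far from the lowest basin and let the schedule cool.
best_x, best_e = anneal(bumpy, x0=4.0)
```

At high temperature the uphill acceptances let the search leave poor basins, which is the source of the insensitivity to initial conditions mentioned in the abstract; as the temperature falls the walk settles into a single minimum.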


international conference on acoustics, speech, and signal processing | 2004

A multimodal variational approach to learning and inference in switching state space models [speech processing application]

Leo Lee; Hagai Attias; Li Deng; Paul W. Fieguth

An important general model for discrete-time signal processing is the switching state space (SSS) model, which generalizes the hidden Markov model and the Gaussian state space model. Inference and parameter estimation in this model are known to be computationally intractable. This paper presents a powerful new approximation to the SSS model. The approximation is based on a variational technique that preserves the multimodal nature of the continuous state posterior distribution. Furthermore, by incorporating a windowing technique, the resulting EM algorithm has complexity that is linear in the length of the time series. An alternative Viterbi decoding with frame-based likelihood is also presented, which is crucial for the speech application that originally motivated this work. Our experiments focus on demonstrating the effectiveness of the algorithm through extensive simulations. A typical example in speech processing is also included to show the potential of this approach for practical applications.
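To make the structure of a switching state space model concrete, the sketch below samples from a generic one: a discrete Markov chain selects, at each step, the parameters of a scalar linear-Gaussian state equation, and the continuous state is observed through a linear-Gaussian channel. The parameterization (`trans`, `a`, `b`, `q`, `c`, `r`) and the function name are assumptions for illustration, not the paper's notation, and the sketch covers only the generative side, not the variational inference.

```python
import random


def sample_sss(n, trans, a, b, q, c=1.0, r=0.1, seed=0):
    """Draw n steps from a scalar switching state space model (illustrative).

    trans : row-stochastic transition matrix of the discrete regime
    a, b  : per-regime dynamics, x_t = a[s] * x_{t-1} + b[s] + noise
    q     : per-regime process-noise variances
    c, r  : observation gain and observation-noise variance
    """
    rng = random.Random(seed)
    s, x = 0, 0.0
    states, hidden, observed = [], [], []
    for _ in range(n):
        # Discrete switch: draw the next regime from row s of trans.
        s = rng.choices(range(len(trans)), weights=trans[s])[0]
        # Continuous dynamics: linear-Gaussian, parameters picked by regime s.
        x = a[s] * x + b[s] + rng.gauss(0.0, q[s] ** 0.5)
        # Observation: noisy linear function of the continuous state.
        y = c * x + rng.gauss(0.0, r ** 0.5)
        states.append(s)
        hidden.append(x)
        observed.append(y)
    return states, hidden, observed


# Usage: two regimes with different targets and time constants.
states, hidden, observed = sample_sss(
    200, trans=[[0.95, 0.05], [0.10, 0.90]],
    a=[0.9, 0.5], b=[0.0, 1.0], q=[0.01, 0.04],
)
```

With a single regime the model reduces to an ordinary Gaussian state space model. With several regimes, the exact posterior over the continuous state is a Gaussian mixture whose component count grows exponentially with the sequence length, which is why approximations such as the variational one described above are needed.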


Journal of the Acoustical Society of America | 2006

Method of speech recognition using variational inference with switching state space models

Hagai Attias; Leo Lee; Li Deng


Journal of the Acoustical Society of America | 2010

Method of speech recognition using multimodal variational inference with switching state space models

Hagai Attias; Li Deng; Leo Lee


IEEE Transactions on Audio, Speech, and Language Processing | 2007

Adaptive Kalman Smoothing for Tracking Vocal Tract Resonances Using a Continuous-Valued Hidden Dynamic Model

Li Deng; Leo Lee; Hagai Attias; Alex Acero

Collaboration


Dive into Leo Lee's collaborations.

Top Co-Authors

Nezamoddin N. Kachouie

Florida Institute of Technology
