Publication


Featured research published by Robert W. Morris.


Medical Engineering & Physics | 2002

Reconstruction of speech from whispers

Robert W. Morris; Mark A. Clements

This paper investigates a method for the real-time reconstruction of normal speech from whispers. Such a system could be used by aphonic individuals as a voice prosthesis, and could also provide improved verbal communication when normal speech is not appropriate. The normal speech is synthesized using the mixed excitation linear prediction (MELP) model. Differences between whispered and phonated speech are discussed, and methods for estimating the parameters of this model from whispered speech in real time are proposed; these include smoothing the noisy linear prediction spectra, modifying the formants, and synthesizing the excitation signal. Trade-offs between computational complexity, delay, and accuracy of the different methods are discussed.
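
As a rough sketch of the re-excitation idea (an illustration only, not the published system): each whispered frame is analyzed with linear prediction and re-synthesized from a mixed pulse-plus-noise excitation, in the spirit of the MELP model. The formant correction and spectral smoothing the paper describes are omitted here, and the frame length, model order, and fixed pitch are assumed values.

```python
# Minimal sketch: re-excite whispered speech through per-frame LPC synthesis filters.
# Not the published system; formant correction and spectral smoothing are omitted.
import numpy as np
from scipy.signal import lfilter

def lpc(frame, order=10):
    """Autocorrelation-method LPC via Levinson-Durbin; returns ([1, a1..ap], error)."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:len(frame) + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-12
    for i in range(1, order + 1):
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        a[1:i + 1] = a[1:i + 1] + k * a[i - 1::-1][:i]
        err *= 1.0 - k * k
    return a, err

def resynthesize(whisper, fs=8000, frame_len=200, order=10, f0=120.0):
    """Re-excite each whispered frame with a mixed pulse/noise source (fixed pitch)."""
    out = np.zeros(len(whisper))
    period = int(fs / f0)
    for start in range(0, len(whisper) - frame_len + 1, frame_len):
        frame = whisper[start:start + frame_len] * np.hamming(frame_len)
        a, err = lpc(frame, order)
        gain = np.sqrt(err)
        pulses = np.zeros(frame_len)
        pulses[::period] = np.sqrt(period)           # unit-power impulse train
        noise = np.random.randn(frame_len)
        excitation = 0.7 * pulses + 0.3 * noise      # fixed voiced/noise mix for illustration
        out[start:start + frame_len] = lfilter([gain], a, excitation)
    return out
```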


IEEE Signal Processing Letters | 2002

Modification of formants in the line spectrum domain

Robert W. Morris; Mark A. Clements

A method for modifying formant locations and bandwidths directly in the line spectrum domain is developed. This method is based on first-order approximations between different representations of the linear prediction spectrum and does not require finding the roots of the prediction polynomial. In addition, this method is less sensitive to pole interaction problems associated with direct pole modification.
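
A rough illustration of formant modification in the line spectrum domain (a sketch assuming an even prediction order, and using the standard P(z)/Q(z) construction rather than the paper's first-order approximations): shifting the line spectral frequency pair nearest a formant moves that formant without ever factoring the prediction polynomial itself.

```python
# Sketch of shifting a formant via its line spectral frequencies (LSFs). Uses the
# standard P(z)/Q(z) construction and assumes an even prediction order; the paper's
# method instead relies on first-order approximations and avoids any root finding.
import numpy as np

def poly2lsf(a):
    """LSFs (radians, sorted) of an LPC polynomial a = [1, a1, ..., ap], p even."""
    a = np.asarray(a, float)
    a_ext = np.concatenate([a, [0.0]])
    P = a_ext + a_ext[::-1]                       # symmetric polynomial, root at z = -1
    Q = a_ext - a_ext[::-1]                       # antisymmetric polynomial, root at z = +1
    ang = np.concatenate([np.angle(np.roots(P)), np.angle(np.roots(Q))])
    return np.sort(ang[(ang > 1e-6) & (ang < np.pi - 1e-6)])

def lsf2poly(lsf):
    """Rebuild the LPC polynomial; sorted LSFs alternate between P and Q roots."""
    def build(angles, extra):
        poly = np.array([1.0])
        for w in angles:
            poly = np.convolve(poly, [1.0, -2.0 * np.cos(w), 1.0])
        return np.convolve(poly, extra)
    P = build(lsf[0::2], [1.0, 1.0])
    Q = build(lsf[1::2], [1.0, -1.0])
    return 0.5 * (P + Q)[:-1]

def shift_formant(a, formant_hz, delta_hz, fs=8000):
    """Crudely move the LSF nearest formant_hz (and its neighbor) by delta_hz."""
    lsf = poly2lsf(a)
    i = int(np.argmin(np.abs(lsf - 2 * np.pi * formant_hz / fs)))
    j = min(i + 1, len(lsf) - 1)
    lsf[[i, j]] += 2 * np.pi * delta_hz / fs      # caller must keep LSFs ordered in (0, pi)
    return lsf2poly(np.sort(lsf))
```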


International Conference on Acoustics, Speech, and Signal Processing | 2005

Comparison of autoregressive parameter estimation algorithms for speech processing and recognition

Robert W. Morris; Jon A. Arrowood; Mark A. Clements

Noise mitigation systems for speech coding and recognition have primarily focused on spectral subtraction techniques due to their well-understood behavior and computational simplicity. As computational complexity becomes less of a constraint, understanding the characteristics of different estimation schemes becomes more important. The merits of two algorithms based on direct estimation of the linear prediction spectrum of a speech signal are explored: maximum likelihood (ML) and minimum mean square error (MMSE) estimation of the autoregressive speech spectrum. The MMSE algorithm improves objective quality effectively at low SNRs and improves speech recognition accuracy by 20-30% on the Aurora2 test set, at the cost of requiring two orders of magnitude more operations than the ML method. Because of these improvements, autoregressive-based algorithms should be considered for future noise-robust speech processing tasks.
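
A small numerical illustration of the underlying problem (not the paper's ML or MMSE estimators): additive white noise biases the standard autocorrelation (Yule-Walker) AR estimate toward a flat spectrum, and the log-spectral distortion grows as the SNR falls. The AR(2) model, SNRs, and signal length below are assumed values.

```python
# Illustration: additive white noise flattens the Yule-Walker AR spectrum estimate,
# which is the degradation that noise-aware autoregressive estimators address.
import numpy as np
from scipy.linalg import toeplitz
from scipy.signal import lfilter, freqz

rng = np.random.default_rng(0)
true_a = np.array([1.0, -1.6, 0.95])                  # assumed "speech-like" AR(2) resonance
clean = lfilter([1.0], true_a, rng.standard_normal(16000))

def yule_walker(x, order):
    """Standard autocorrelation (Yule-Walker) AR estimate, returned as [1, a1..ap]."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order] / len(x)
    a = np.linalg.solve(toeplitz(r[:order]), -r[1:order + 1])
    return np.concatenate([[1.0], a])

def log_spectral_distortion(a_hat, a_true, n=512):
    """RMS difference (dB) between the estimated and true AR spectra."""
    _, h_hat = freqz([1.0], a_hat, worN=n)
    _, h_true = freqz([1.0], a_true, worN=n)
    d = 20 * np.log10(np.abs(h_true) / np.abs(h_hat))
    return np.sqrt(np.mean(d ** 2))

for snr_db in (30, 10, 0):
    noise = rng.standard_normal(len(clean))
    noise *= np.std(clean) / (np.std(noise) * 10 ** (snr_db / 20))
    a_hat = yule_walker(clean + noise, order=2)
    print(f"{snr_db:2d} dB SNR: LSD = {log_spectral_distortion(a_hat, true_a):.2f} dB")
```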


International Conference on Acoustics, Speech, and Signal Processing | 2002

Efficient second-order adaptation for large vocabulary distributed speech recognition

Robert W. Morris; Michael E. Deisher

This paper describes practical implementation details for a second-order approximation to the parallel model combination (PMC) algorithm with application to large vocabulary distributed speech recognition. The proposed method is capable of simultaneously adapting to noise and channel changes. A more accurate method for computing the derivatives based on numeric integration PMC is introduced. The proposed second-order adaptation algorithm requires only twice the memory and computation of standard Jacobian Adaptation (JA). This represents a 382-fold reduction in memory and a 29-fold reduction in computation. Moreover, the proposed algorithm produces models that are much closer to the PMC-derived models than standard JA.
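
A minimal sketch of the kind of Taylor-series adaptation involved (per-dimension log-add model combination with first- and second-order terms; not the paper's numeric-integration PMC or its exact second-order scheme): the corrupted-speech mean is expanded around a reference noise estimate, and the second-order term shrinks the approximation error when the noise level changes.

```python
# Sketch of first- vs second-order adaptation of a corrupted-speech mean in the
# log-spectral domain (diagonal, per dimension), using the generic log-add
# model-combination approximation. All values below are assumed for illustration.
import numpy as np

def adapt_mean(mu_x, mu_n0, mu_n, order=2):
    """Approximate mu_y = log(exp(mu_x) + exp(mu_n)) by a Taylor expansion
    around the reference noise mu_n0 (Jacobian adaptation, optionally 2nd order)."""
    mu_y0 = np.logaddexp(mu_x, mu_n0)             # exact combination at reference noise
    J = np.exp(mu_n0 - mu_y0)                     # d mu_y / d mu_n = e^n / (e^x + e^n)
    delta = mu_n - mu_n0
    mu_y = mu_y0 + J * delta
    if order >= 2:
        H = np.exp(mu_x + mu_n0 - 2 * mu_y0)      # d2 mu_y / d mu_n2
        mu_y += 0.5 * H * delta ** 2
    return mu_y

# Quick check against the exact log-add combination after a noise level change:
mu_x  = np.array([2.0, 1.0, 0.5])                 # assumed clean log-spectral mean
mu_n0 = np.array([0.0, 0.0, 0.0])                 # reference noise at adaptation time
mu_n  = mu_n0 + 1.0                               # noise power rises by 1 nat (~4.3 dB)
exact = np.logaddexp(mu_x, mu_n)
print("1st-order error:", np.abs(adapt_mean(mu_x, mu_n0, mu_n, 1) - exact))
print("2nd-order error:", np.abs(adapt_mean(mu_x, mu_n0, mu_n, 2) - exact))
```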


International Conference on Acoustics, Speech, and Signal Processing | 2002

Estimation of speech spectra from whispers

Robert W. Morris; Mark A. Clements

The most obvious difference between normal and whispered speech is the excitation. However, there are other significant spectral differences between these two modes of speech. In particular, the formant locations are raised in whispers because of increased coupling between the vocal tract and the trachea. In addition, the noise excitation increases the variance of spectral estimates such as linear prediction. In order to reconstruct quality phonated speech from whispers, it is necessary to remove the bias in the formants and reduce the variance of the spectrum. Methods for estimating the linear prediction spectrum from whispered speech are proposed, including modification of the formants in the line spectrum domain and smoothing of the noisy linear prediction spectra.
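
One simple way to reduce the variance of noisy linear prediction estimates (a sketch only; the paper's exact smoothing may differ) is to recursively average the per-frame autocorrelations before Levinson-Durbin, so neighboring frames share information; the formant correction itself can then be done in the line spectrum domain as in the earlier sketch.

```python
# Sketch: exponentially smooth per-frame autocorrelation vectors before LPC analysis
# to reduce the variance of the resulting spectra. The smoothing constant is assumed.
import numpy as np

def smoothed_autocorrelations(frames, order=10, alpha=0.7):
    """frames: iterable of 1-D arrays; returns smoothed lag-0..order autocorrelations."""
    smoothed, r_bar = [], None
    for frame in frames:
        r = np.correlate(frame, frame, mode="full")[len(frame) - 1:len(frame) + order]
        r_bar = r if r_bar is None else alpha * r_bar + (1 - alpha) * r
        smoothed.append(r_bar.copy())
    return smoothed    # feed each vector to Levinson-Durbin (see the LPC sketch above)
```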


North American Chapter of the Association for Computational Linguistics | 2004

Scoring algorithms for wordspotting systems

Robert W. Morris; Jon A. Arrowood; Peter S. Cardillo; Mark A. Clements

When evaluating wordspotting systems, one normally compares receiver operating characteristic curves and different measures of accuracy. However, many other factors are relevant to a system's usability for searching speech. In this paper, we discuss measures of quality for confidence scores and propose algorithms for producing scores that are optimal with respect to these criteria.
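
As a generic illustration of score calibration (a hedged sketch, not the specific algorithms of the paper): raw detector scores can be mapped to confidence values in [0, 1] by fitting a logistic model on labeled hits and false alarms. The scores and labels below are made up.

```python
# Hypothetical illustration: Platt-style logistic calibration of raw wordspotting
# scores into confidences, fitted by maximum likelihood on labeled detections.
import numpy as np
from scipy.optimize import minimize

def fit_calibration(scores, labels):
    """Fit p(hit | score) = sigmoid(a * score + b) by maximum likelihood."""
    scores, labels = np.asarray(scores, float), np.asarray(labels, float)
    def nll(params):
        a, b = params
        z = a * scores + b
        # negative Bernoulli log-likelihood: log(1 + e^z) - y*z, summed over detections
        return np.sum(np.logaddexp(0.0, z) - labels * z)
    a, b = minimize(nll, x0=[1.0, 0.0]).x
    return lambda s: 1.0 / (1.0 + np.exp(-(a * np.asarray(s, float) + b)))

# Usage with made-up detector outputs (1 = true keyword hit, 0 = false alarm):
calibrate = fit_calibration([0.2, 0.9, 0.5, 1.4, 0.8, 1.8], [0, 0, 1, 1, 0, 1])
print(calibrate([0.3, 1.0, 1.7]))
```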


IEEE Workshop on Speech Coding | 2002

Autoregressive parameter estimation of speech in noise

Robert W. Morris; Mark A. Clements; J.S. Collura

We describe a method for estimating the spectral parameters of speech corrupted by additive noise based on prior statistics of their trajectories. This method uses a two-stage estimation procedure. In the first step, the maximum likelihood estimate of the line spectrum pair frequencies and average power is determined. However, these estimates are known to have an unacceptably large variance and follow unnatural trajectories. To improve these estimates, we propose modeling the spectral parameters with a jump Markov linear system. This model accommodates both the rapid transitions that occur during consonants, and the slowly changing dynamics of vowels. We use this model to derive a new estimator for autoregressive speech parameters that does not introduce delay and compares favorably with the MELPe speech enhancement scheme.
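
A single-regime simplification of the trajectory model (an assumption for illustration; the paper uses a jump Markov linear system that switches between regimes): a causal random-walk Kalman filter smooths per-frame line spectrum estimates without introducing delay.

```python
# Sketch: causally smooth noisy per-frame LSF estimates with a random-walk Kalman
# filter. The jump Markov linear system of the paper generalizes this by switching
# between, e.g., vowel-like (slow) and consonant-like (fast) process-noise regimes.
import numpy as np

def kalman_smooth_lsf(noisy_lsf, process_var=1e-4, meas_var=1e-2):
    """noisy_lsf: (n_frames, order) array of raw LSF estimates in radians."""
    noisy_lsf = np.asarray(noisy_lsf, float)
    x = noisy_lsf[0].copy()                    # state estimate (one scalar per LSF)
    p = np.full_like(x, meas_var)              # per-LSF estimate variance
    out = [x.copy()]
    for z in noisy_lsf[1:]:
        p = p + process_var                    # predict: random-walk state model
        k = p / (p + meas_var)                 # Kalman gain
        x = x + k * (z - x)                    # update with the new frame's estimate
        p = (1.0 - k) * p
        out.append(x.copy())
    return np.array(out)
```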


International Conference on Acoustics, Speech, and Signal Processing | 2001

Maximum-likelihood compensation of zero-memory nonlinearities in speech signals

Robert W. Morris; Mark A. Clements

An algorithm to blindly compensate zero-memory nonlinear distortions of speech waveforms is derived and analyzed. This method finds a maximum-likelihood estimate of the distortion without a priori knowledge of the microphone characteristics by using the expectation-maximization algorithm. The autoregressive signal model coefficients are solved jointly with the nonlinearity estimate produced by an extended Kalman filter. A new family of nonlinear functions is also developed for use with this algorithm, although the method can estimate the shape of any parametric zero-memory nonlinearity. Such nonlinear distortions can degrade speech recognition rates while lowering perceptual quality only slightly. The compensation algorithm improves automatic speech recognition of distorted speech for a variety of such nonlinearities.
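
A crude illustration of the blind-compensation idea (not the paper's EM and extended-Kalman procedure): assume a parametric memoryless distortion family, apply candidate inverses, and keep the one under which a low-order autoregressive model fits best, with LPC prediction gain as a rough stand-in for the likelihood the EM algorithm maximizes. The mu-law-style distortion and the parameter grid below are assumed.

```python
# Crude illustration: grid-search an inverse for an assumed mu-law-style memoryless
# distortion by scoring how well an AR model fits each compensated waveform.
import numpy as np
from scipy.signal import lfilter
from scipy.linalg import toeplitz

rng = np.random.default_rng(1)
clean = lfilter([1.0], [1.0, -1.5, 0.9], rng.standard_normal(8000))
clean /= np.max(np.abs(clean))

mu = 255.0                                               # assumed distortion parameter
distorted = np.sign(clean) * np.log1p(mu * np.abs(clean)) / np.log1p(mu)

def prediction_gain(x, order=10):
    """LPC prediction gain in dB (autocorrelation method)."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order] / len(x)
    a = np.linalg.solve(toeplitz(r[:order]), -r[1:order + 1])
    return 10 * np.log10(r[0] / (r[0] + np.dot(a, r[1:order + 1])))

def inverse_mu(y, mu_hat):
    """Inverse of the assumed mu-law-style compressor for a candidate parameter."""
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu_hat)) / mu_hat

# The gain should peak near the true parameter when the assumed family matches.
for mu_hat in (10.0, 50.0, 255.0, 1000.0):
    print(mu_hat, round(prediction_gain(inverse_mu(distorted, mu_hat)), 2), "dB")
print("uncompensated:", round(prediction_gain(distorted), 2), "dB")
```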


Archive | 2008

Keyword spotting using a phoneme-sequence index

Jon A. Arrowood; Robert W. Morris; Mark Finlay; Scott A. Judy


Archive | 2003

Enhancement and recognition of whispered speech

Robert W. Morris; Mark A. Clements

Collaboration


Dive into Robert W. Morris's collaboration.

Top Co-Authors

Mark A. Clements, Georgia Institute of Technology
Jon A. Arrowood, Georgia Institute of Technology
Joseph DiVita, Space and Naval Warfare Systems Center Pacific
Ralph Johnson, Space and Naval Warfare Systems Center Pacific
Vladimir Goncharoff, University of Illinois at Chicago