Publication


Featured research published by Kathleen E. Cummings.


Journal of the Acoustical Society of America | 1995

Analysis of the glottal excitation of emotionally styled and stressed speech

Kathleen E. Cummings; Mark A. Clements

The problems of automatic recognition of and synthesis of multistyle speech have become important topics of research in recent years. This paper reports an extensive investigation of the variations that occur in the glottal excitation of eleven commonly encountered speech styles. Glottal waveforms were extracted from utterances of non-nasalized vowels for two speakers for each of the eleven speaking styles. The extracted waveforms were parametrized into four duration-related and two slope-related values. Using these six parameters, the glottal waveforms from the eleven styles of speech were analyzed both qualitatively and quantitatively. The glottal waveforms from each style of speech have been shown to be significantly and identifiably different from all other styles, thereby confirming the importance of the glottal waveform in conveying speech style information and in causing speech waveform variations. The degree of variation in styled glottal waveforms has been shown to be consistent when trained on one speaker and compared with another.


International Conference on Acoustics, Speech, and Signal Processing | 1990

Analysis of glottal waveforms across stress styles

Kathleen E. Cummings; Mark A. Clements

Results from a study of glottal waveforms derived from 11 different styles of stressed speech are described. The general goal of this research is to investigate the differences in the glottal excitation across specific speaking styles. Using a glottal inverse filtering method tailored specifically for the speech under study, an analysis of glottal waveforms from voiced speech spoken under various conditions is performed. The extracted waveforms are parameterized using six descriptors, and statistical analysis of the resulting database allows very good characterization of each style with respect to these parameters. Specifically, each style is shown to have a unique profile which will allow high performance in a stress style identification task.


International Conference on Acoustics, Speech, and Signal Processing | 1993

Application of the analysis of glottal excitation of stressed speech to speaking style modification

Kathleen E. Cummings; Mark A. Clements

Two extensions of previous research into the combined areas of analysis of stressed speech and glottal modeling are presented. First, a simple pattern recognition principle is used to show that, given an unknown glottal waveform, the style can be correctly identified with roughly 90% accuracy. Deviant styles such as angry, loud, and soft can be correctly identified with accuracy approaching 100%. These results confirm the importance of the glottal excitation in conveying stress and in contributing to the variability of speech waveforms. Second, several speaking style modification algorithms have been developed and are reported. These algorithms are able to modify styled speech to sound more normal and normal speech to sound styled. In one example, subjective listening tests demonstrate that styled speech can be modified to sound significantly more normal with these algorithms.
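The abstract does not say which pattern recognition principle was used. As a rough illustration only, the sketch below assumes a nearest-centroid (minimum-distance) classifier over six-parameter glottal descriptors. The function names and all numeric values are hypothetical, synthetic placeholders and are not taken from the paper.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def train(labelled):
    """Build one centroid per style from {style: [feature vectors]}."""
    return {style: centroid(vectors) for style, vectors in labelled.items()}

def classify(model, x):
    """Return the style whose centroid is closest to x (Euclidean distance)."""
    return min(model, key=lambda style: math.dist(model[style], x))

# Synthetic six-parameter vectors (four duration-like, two slope-like values).
# These numbers are made up purely to exercise the code.
training = {
    "normal": [[0.50, 0.30, 0.20, 0.10, 1.0, -2.0],
               [0.52, 0.28, 0.20, 0.11, 1.1, -2.1]],
    "loud":   [[0.40, 0.20, 0.40, 0.08, 1.8, -3.5],
               [0.42, 0.19, 0.39, 0.07, 1.9, -3.4]],
    "soft":   [[0.60, 0.35, 0.05, 0.15, 0.5, -1.0],
               [0.58, 0.36, 0.06, 0.14, 0.6, -1.1]],
}

model = train(training)
unknown = [0.41, 0.20, 0.40, 0.08, 1.85, -3.45]
predicted_style = classify(model, unknown)
```

With real measurements the parameters would normally be standardized first, so that no single descriptor dominates the distance.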


International Conference on Acoustics, Speech, and Signal Processing | 1995

Synthesizing styled speech using the Klatt synthesizer

Janet C. Rutledge; Kathleen E. Cummings; Daniel Lambert; Mark A. Clements

This paper reports the implementation of high-quality synthesis of speech with varying speaking styles using the Klatt (1980) synthesizer. This research is based on previously-reported research that determined that the glottal waveforms of various styles of speech are significantly and identifiably different. Given the parameter tracks that control the synthesis of a normal version of an utterance, those parameters that control known acoustic correlates of speaking style are varied appropriately, relative to normal, to synthesize styled speech. In addition to varying the parameters that control the glottal waveshape, phoneme duration, phoneme intensity, and pitch contour are also varied appropriately. Listening tests that demonstrate that the synthetic speech is perceptibly and appropriately styled, and that the synthetic speech is natural-sounding, were performed, and the results are presented.


SoutheastCon | 1989

Estimation and comparison of the glottal source waveform across stress styles using glottal inverse filtering

Kathleen E. Cummings; Mark A. Clements; John H. L. Hansen

An iterative method to extract the glottal waveform using inverse filtering is presented. The method is then applied to the analysis of glottal waveforms from utterances displaying eleven different styles of speech: normal, slow, fast, angry, loud, soft, clear, question, two different task loading conditions, and speech produced while the talker is presented with noise through headphones (Lombard effect). Results of the analysis are presented in terms of statistical descriptors of the extracted glottal waveforms for each style. Typical examples of the waveforms, displaying the salient features of each, are shown, and a qualitative discussion of the results is presented. It is concluded that this method of extracting glottal waveforms using inverse filtering, while time-consuming, gives reasonable results. The extracted waveforms are consistent across utterances, and changes in the glottal waveshape that theoretically should occur under different stress conditions are present in the extracted glottal waveforms.


International Conference on Acoustics, Speech, and Signal Processing | 1995

Modelling speech production using Yee's finite difference method

Kathleen E. Cummings; James G. Maloney; Mark A. Clements

This paper describes a model of speech production based on solving for acoustic wave propagation in the vocal tract using a finite-difference time-domain (FDTD) technique. This FDTD technique was first developed by Yee (1966) and utilizes a discretization scheme in which pressure and velocity components are interleaved in both space and time. The specific implementation of this model of speech production, including discretization of the coupled acoustic wave equations, boundary conditions, stability criteria, values of model constants, and method of excitation, are presented. The accuracy of the model is verified by comparing the FDTD results to the theoretically expected results for a well-known acoustics problem. The FDTD model of speech production has been used in a variety of experiments, and several results, including those that compare the use of several common glottal models as excitation, are presented.
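The interleaving of pressure and velocity in space and time can be illustrated with a much simpler case than the paper's vocal tract model. The sketch below is a one-dimensional, lossless, uniform tube. It is not the authors' implementation: the grid size, the raised-cosine source, the pressure-release far end, and the constants (`c`, `rho`, `dx`) are all assumptions chosen for illustration. Pressure samples sit at cell positions and velocity samples sit halfway between them, and the two fields are updated alternately (leapfrog). In one dimension the stability criterion is a Courant number `c * dt / dx` of at most 1.

```python
import math

def source(n, width):
    """Raised-cosine pressure pulse, exactly zero outside 0..width steps."""
    if 0 <= n <= width:
        return 0.5 * (1.0 - math.cos(2.0 * math.pi * n / width))
    return 0.0

def simulate_tube(n_cells=200, probe=50, n_steps=120, pulse_width=40,
                  c=350.0, rho=1.2, dx=1.0e-3, courant=1.0):
    """Leapfrog FDTD for 1-D linear acoustics on a staggered grid.

    p[i] lives at x = i * dx; u[i] lives at x = (i + 0.5) * dx, half a
    time step later. Returns the pressure recorded at cell `probe`.
    """
    dt = courant * dx / c               # time step from the Courant number
    p = [0.0] * n_cells                 # pressure at integer grid points
    u = [0.0] * (n_cells - 1)           # velocity at half grid points
    ku = dt / (rho * dx)                # coefficient in du/dt = -(1/rho) dp/dx
    kp = rho * c * c * dt / dx          # coefficient in dp/dt = -rho c^2 du/dx
    history = []
    for n in range(n_steps):
        p[0] = source(n, pulse_width)   # hard pressure source at the left end
        for i in range(n_cells - 1):
            u[i] -= ku * (p[i + 1] - p[i])
        # The right end p[n_cells - 1] is never updated and stays 0,
        # an idealized pressure-release (open) boundary.
        for i in range(1, n_cells - 1):
            p[i] -= kp * (u[i] - u[i - 1])
        history.append(p[probe])
    return history

history = simulate_tube()
```

In this 1-D case a Courant number of exactly 1 lets the pulse advance one cell per time step, so the probe should see a delayed copy of the source. Smaller Courant numbers remain stable but introduce numerical dispersion.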


Journal of the Acoustical Society of America | 1994

Synthesizing multistyle speech using the Klatt synthesizer

Daniel Lambert; Kathleen E. Cummings; Janet C. Rutledge; Mark A. Clements

Synthesizing multistyle speech has been an important topic of research in recent years. In this research, 11 commonly encountered speech styles have been synthesized by varying pitch, duration, intensity, and the glottal excitation on the Klatt Synthesizer 88, KLSYN88. These 11 speech styles include angry, clear, 50% tasking, 70% tasking, fast, Lombard, loud, normal, question, slow, and soft. All of the styles proved to be intelligible and appropriately styled based on subjective listening tests. The parameter variations of the glottal excitation are based on the results of statistical analyses [K. Cummings, Analysis, Syn., and Rec. of Stressed Speech, Ph.D. thesis, Georgia Inst. of Tech., 1992] that demonstrated the importance of glottal excitation changes in styled speech. These statistical analyses demonstrated that the glottal excitation of each of the eleven styles is significantly and identifiably different. One utterance of the word “hot” was synthesized in the normal style. The other ten styles o...


Journal of the Acoustical Society of America | 1996

Analysis of the glottal excitation of intoxicated versus sober speech: A first report.

Kathleen E. Cummings; Steven B. Chin; David B. Pisoni

The objective of the research reported here was to perform an analysis of the voicing characteristics of sober and intoxicated speech in order to assess the possibility of detecting intoxication from the speech signal. Because the glottal excitation is the result of complicated and intricate motions by the vocal folds, it should be significantly affected by alcohol use. In this study, excitation parameters were extracted from non‐nasalized vowels in eight utterances spoken in both sober and intoxicated conditions by four speakers. Fifteen parameters, including pitch, pitch contour, rms intensity measures, and measures of shimmer and jitter, were extracted directly from the speech waveform. As expected, the most significant differences between sober and intoxicated speech were in parameters measuring perturbations in adjacent pitch periods. Also as expected, there were no significant differences in more global parameters such as average pitch. Additionally, glottal waveforms were extracted from two of the ...
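Jitter and shimmer quantify cycle-to-cycle perturbation in pitch period and in amplitude, respectively. The abstract does not give the exact formulas used, so the sketch below assumes one common "local" definition: the mean absolute difference between adjacent cycles divided by the mean value. The function names and the sample numbers are illustrative only.

```python
def local_perturbation(values):
    """Mean absolute difference of adjacent values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))

def jitter(periods_s):
    """Local jitter from consecutive pitch periods, in seconds."""
    return local_perturbation(periods_s)

def shimmer(peak_amplitudes):
    """Local shimmer from consecutive per-cycle peak amplitudes."""
    return local_perturbation(peak_amplitudes)

# Made-up measurements: four pitch periods and their peak amplitudes.
periods = [0.010, 0.011, 0.010, 0.011]
amplitudes = [1.00, 0.90, 1.00, 0.90]

jitter_value = jitter(periods)
shimmer_value = shimmer(amplitudes)
```

Both measures are zero for a perfectly periodic signal. Because they depend only on differences between adjacent cycles, they can change while a global measure such as average pitch stays the same.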


Journal of the Acoustical Society of America | 1994

Modeling speech production using finite difference techniques

Kathleen E. Cummings; Mark A. Clements

Because of limitations in computing resources and analysis methods, speech processing techniques have relied on linear models that make simplifying assumptions about the manner of propagation, losses, and the geometry of the vocal tract. This research has developed a new, more sophisticated model of speech production that is based on numerical simulation of wave propagation in the vocal tract. The model uses a numerical analysis technique, finite differencing, to solve for wave propagation in a complicated geometry that is more representative of the vocal tract than previous models. The objective in developing this model of speech production is to enable the researcher to investigate and compare theories of speech production of varying complexity. The model is designed as a research tool in which it is possible to vary the driving equations and boundary conditions as well as the vocal tract geometry and the excitation to the vocal tract. Simulations that compare the speech produced given several different...


Archive | 1995

Adaptation of FDTD techniques to acoustic modeling

James G. Maloney; Kathleen E. Cummings

Collaboration


Kathleen E. Cummings's collaborations.

Top Co-Authors

Mark A. Clements
Georgia Institute of Technology

Daniel Lambert
Georgia Institute of Technology

John H. L. Hansen
University of Texas at Dallas

David B. Pisoni
Indiana University Bloomington