Publication


Featured research published by Kenneth N. Stevens.


Journal of the Acoustical Society of America | 1972

Emotions and Speech: Some Acoustical Correlates

Carl E. Williams; Kenneth N. Stevens

This paper describes some further attempts to identify and measure those parameters in the speech signal that reflect the emotional state of a speaker. High‐quality recordings were obtained of professional “method” actors reading the dialogue of a short scenario specifically written to contain various emotional situations. Excerpted portions of the recordings were subjected to both quantitative and qualitative analyses. A comparison was also made of recordings from a real‐life situation, in which the emotions of a speaker were clearly defined, with recordings from an actor who simulated the same situation. Anger, fear, and sorrow situations tended to produce characteristic differences in contour of fundamental frequency, average speech spectrum, temporal characteristics, precision of articulation, and waveform regularity of successive glottal pulses. Attributes for a given emotional situation were not always consistent from one speaker to another.


Journal of the Acoustical Society of America | 1978

Invariant cues for place of articulation in stop consonants

Kenneth N. Stevens; Sheila E. Blumstein

In a series of experiments, identification responses for place of articulation were obtained for synthetic stop consonants in consonant-vowel syllables with different vowels. The acoustic attributes of the consonants were systematically manipulated, the selection of stimulus characteristics being guided in part by theoretical considerations concerning the expected properties of the sound generated in the vocal tract as place of articulation is varied. Several stimulus series were generated with and without noise bursts at the onset, and with and without formant transitions following consonantal release. Stimuli with transitions only, and with bursts plus transitions, were consistently classified according to place of articulation, whereas stimuli with bursts only and no transitions were not consistently identified. The acoustic attributes of the stimuli were examined to determine whether invariant properties characterized each place of articulation independent of vowel context. It was determined that the gross shape of the spectrum sampled at the consonantal release showed a distinctive shape for each place of articulation: a prominent midfrequency spectral peak for velars, a diffuse-rising spectrum for alveolars, and a diffuse-falling spectrum for labials. These attributes are evident for stimuli containing transitions only, but are enhanced by the presence of noise bursts at the onset.


Journal of the Acoustical Society of America | 1979

Acoustic invariance in speech production: Evidence from measurements of the spectral characteristics of stop consonants

Sheila E. Blumstein; Kenneth N. Stevens

On the basis of theoretical considerations and the results of experiments with synthetic consonant-vowel syllables, it has been hypothesized that the short-time spectrum sampled at the onset of a stop consonant should exhibit gross properties that uniquely specify the consonantal place of articulation independent of the following vowel. The aim of this paper is to test this hypothesis by measuring the spectrum sampled at the onsets and offsets of a large number of consonant-vowel (CV) and vowel-consonant (VC) syllables containing both voiced and voiceless stops produced by several speakers. Templates were devised in an attempt to capture three classes of spectral shapes: diffuse-rising, diffuse-falling, and compact, corresponding to alveolar, labial, and velar consonants, respectively. Spectra were derived from the utterances by sampling at the consonantal release of CV syllables and at the implosion and burst release of VC syllables, and these spectra (smoothed by a linear prediction algorithm) were matched against the templates. It was found that about 85% of the spectra at initial consonant release and at final burst release were correctly classified by the templates, although there was some variability across vowel contexts. The spectra sampled at the implosion were not consistently classified. A preliminary examination of spectra sampled at the release of nasal consonants in CV syllables showed a somewhat lower accuracy of classification by the same templates. Overall, the results support an hypothesis that, in natural speech, the acoustic characteristics of stop consonants, specified in terms of the gross spectral shape sampled at the discontinuity in the acoustic signal, show invariant properties independent of the adjacent vowel or of the voicing characteristics of the consonant. The implication is that the auditory system is endowed with detectors that are sensitive to these kinds of gross spectral shapes, and that the existence of these detectors helps the infant to organize the sounds of speech into their natural classes.


Journal of the Acoustical Society of America | 1974

Role of formant transitions in the voiced‐voiceless distinction for stops

Kenneth N. Stevens; Dennis H. Klatt

Previous research on acoustic cues responsible for the voiced‐voiceless distinction in prestressed English plosives has emphasized the importance of voicing onset time with respect to plosive release (VOT). Voiced plosives in English normally have a short VOT (less than 20–30 msec) and a significant formant transition is present following voice onset. Voiceless plosives in prestressed position, on the other hand, have relatively long VOTs (greater than about 50 msec) and the formant transitions are essentially completed prior to voice onset. Our experiments with synthetic speech compare the role of VOT and the presence or absence of a significant formant transition following voicing onset as cues for the voiced‐voiceless distinction. The data indicate that there is a significant trading relationship between these two cues. The presence or absence of a rapid spectral change following voice onset produces up to 15‐msec change in the location of the perceived phoneme boundary as measured in terms of absolut...


Journal of the Acoustical Society of America | 1976

Perceptual invariance and onset spectra for stop consonants in different vowel environments.

Sheila E. Blumstein; Kenneth N. Stevens

A series of listening tests with brief synthetic consonant-vowel syllables was carried out to determine whether the initial part of a syllable can provide cues to place of articulation for voiced stop consonants independent of the remainder of the syllable. The data show that stimuli as short as 10-20 ms sampled from the onset of a consonant-vowel syllable can be reliably identified for consonantal place of articulation, whether the second and higher formants contain moving or straight transitions and whether or not an initial burst is present. In most instances, these brief stimuli also contain sufficient information for vowel identification. Stimulus continua in which formant transitions ranged from values appropriate to [b], [d], [g] in various vowel environments, and in which stimulus durations were 20 and 46 ms, yielded categorical labeling functions with a few exceptions. These results are consistent with a theory of speech perception in which consonant place of articulation is cued by invariant properties derived from the spectrum sampled in a 10-20 ms time window adjacent to consonantal onset or offset.


Journal of the Acoustical Society of America | 1955

Development of a Quantitative Description of Vowel Articulation

Kenneth N. Stevens; Arthur S. House

A set of parameters that yield a simple yet reasonably accurate description of the articulation of vowel sounds is developed. The articulatory description is potentially useful in a speech band‐width compression system based on the coding of articulatory data. The parameters give information on the position of the tongue constriction, the size of the constriction formed by the tongue, and the dimensions in the vicinity of the mouth opening. An electrical analog of the vocal tract is utilized to obtain experimental relations between the articulatory parameters and the formant frequencies. Contours of vowel articulation are derived from these data. The relation of the contours to classical phonetics is discussed.


Journal of the Acoustical Society of America | 1961

On the Properties of Voiceless Fricative Consonants

John M. Heinz; Kenneth N. Stevens

According to an acoustical theory of speech production, the spectra of voiceless fricatives can be characterized by poles and zeros whose frequency locations are dependent on the vocal‐tract configuration and on the location of the source of excitation within the vocal tract. The locations of the important poles and zeros in the spectra of fricatives can be determined by a matching process whereby comparison spectra synthesized by electric circuits are matched against the spectra under analysis. This method has been used to determine the frequencies and bandwidths of the important poles and zeros for several versions of /f/, /s/, and /∫/. Based on these findings, a simplified electrical model is developed for the synthesis of voiceless fricatives. The model consists of a noise‐excited electric circuit characterized by a pole and a zero whose frequency locations can be varied. Stimuli generated by this model, both in isolation and in syllables, are presented to listeners for identification. The results of ...


IEEE Transactions on Information Theory | 1962

Speech recognition: A model and a program for research

Morris Halle; Kenneth N. Stevens

A speech recognition model is proposed in which the transformation from an input speech signal into a sequence of phonemes is carried out largely through an active or feedback process. In this process, patterns are generated internally in the analyzer according to an adaptable sequence of instructions until a best match with the input signal is obtained. Details of the process are given, and the areas where further research is needed are indicated.
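
The active matching process described in this abstract can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration: the three-value "patterns", the symbol set, the squared-error distance, and the exhaustive search all stand in for the paper's internally generated patterns and adaptable instruction sequence, which the abstract does not specify.

```python
from itertools import product

# Hypothetical generative table: each symbol produces a fixed 3-value pattern.
# These numbers are invented for illustration only.
PATTERNS = {
    "a": (1.0, 0.2, 0.1),
    "i": (0.2, 1.0, 0.3),
    "u": (0.3, 0.1, 1.0),
}


def synthesize(symbols):
    """Generate an internal pattern for a candidate symbol sequence."""
    out = []
    for s in symbols:
        out.extend(PATTERNS[s])
    return out


def distance(x, y):
    """Squared-error mismatch between two equal-length patterns."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def recognize(signal, length):
    """Try every candidate sequence and keep the one whose synthesized
    pattern best matches the input (exhaustive search as a stand-in for
    an adaptive search strategy)."""
    best, best_err = None, float("inf")
    for candidate in product(sorted(PATTERNS), repeat=length):
        err = distance(synthesize(candidate), signal)
        if err < best_err:
            best, best_err = candidate, err
    return best, best_err
```

Calling `recognize` on a slightly perturbed copy of `synthesize(("i", "a"))` recovers the sequence `("i", "a")`, since that candidate's generated pattern gives the smallest mismatch.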


Journal of the Acoustical Society of America | 1971

Airflow and Turbulence Noise for Fricative and Stop Consonants: Static Considerations

Kenneth N. Stevens

A number of speech sounds are generated by creating turbulent airflow in the vicinity of a constriction in the vocal tract. The equations relating the airflow through such a constriction, the pressure drop across the constriction, and the dimensions of the constriction are reviewed and are summarized in graphical form. Previous theoretical and experimental data on turbulence noise generation at a constriction or obstruction in a tube are described. These data are consistent with a model that represents a turbulence noise source in the vocal tract as an equivalent sound‐pressure source whose magnitude is proportional to the pressure drop across the constriction or obstruction. The characteristics of the sound radiated from the mouth opening are determined by the source location and radiation characteristics, as well as by the properties of the source. The airflow and acoustic characteristics for various classes of speech sounds produced with turbulence noise at the glottis or at a supraglottal constriction...


Journal of the Acoustical Society of America | 1992

Acoustic and perceptual characteristics of voicing in fricatives and fricative clusters

Kenneth N. Stevens; Sheila E. Blumstein; Laura B. Glicksman; Martha W. Burton; Kathleen Kurowski

Several types of measurements were made to determine the acoustic characteristics that distinguish between voiced and voiceless fricatives in various phonetic environments. The selection of measurements was based on a theoretical analysis that indicated the acoustic and aerodynamic attributes at the boundaries between fricatives and vowels. As expected, glottal vibration extended over a longer time in the obstruent interval for voiced fricatives than for voiceless fricatives, and there were more extensive transitions of the first formant adjacent to voiced fricatives than for the voiceless cognates. When two fricatives with different voicing were adjacent, there were substantial modifications of these acoustic attributes, particularly for the syllable-final fricative. In some cases, these modifications lead to complete assimilation of the voicing feature. Several perceptual studies with synthetic vowel-consonant-vowel stimuli and with edited natural stimuli examined the role of consonant duration, extent and location of glottal vibration, and extent of formant transitions on the identification of the voicing characteristics of fricatives. The perceptual results were in general consistent with the acoustic observations and with expectations based on the theoretical model. The results suggest that listeners base their voicing judgments of intervocalic fricatives on an assessment of the time interval in the fricative during which there is no glottal vibration. This time interval must exceed about 60 ms if the fricative is to be judged as voiceless, except that a small correction to this threshold is applied depending on the extent to which the first-formant transitions are truncated at the consonant boundaries.
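
The decision rule suggested at the end of this abstract can be written as a minimal sketch. The function name and the `f1_correction_ms` parameter are hypothetical: the abstract states only that a small correction depends on first-formant truncation, not its size or sign, so the correction is left as a caller-supplied value defaulting to zero.

```python
def judge_voicing(unvoiced_interval_ms, threshold_ms=60.0, f1_correction_ms=0.0):
    """Classify an intervocalic fricative from the duration of the interval
    with no glottal vibration.

    threshold_ms: approximate 60 ms boundary reported in the abstract.
    f1_correction_ms: assumed adjustment to the threshold; its actual
        dependence on first-formant transitions is not given in the abstract.
    """
    effective_threshold = threshold_ms + f1_correction_ms
    if unvoiced_interval_ms > effective_threshold:
        return "voiceless"
    return "voiced"
```

With the default arguments, an 80 ms interval without glottal vibration is labeled voiceless and a 40 ms interval is labeled voiced.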

Collaboration


Dive into Kenneth N. Stevens's collaborations.

Top Co-Authors

Helen M. Hanson
Massachusetts Institute of Technology

Dennis H. Klatt
Massachusetts Institute of Technology

Samuel Jay Keyser
Massachusetts Institute of Technology