
Publication


Featured research published by Robert T. Gayvert.


Journal of the Acoustical Society of America | 1993

Identification of steady‐state vowels synthesized from the Peterson and Barney measurements

James Hillenbrand; Robert T. Gayvert

The purpose of this study was to determine how well listeners can identify vowels based exclusively on static spectral cues. This was done by asking listeners to identify steady-state synthesized versions of 1520 vowels (76 talkers × 10 vowels × 2 repetitions) using Peterson and Barney's measured values of F0 and F1-F3 [J. Acoust. Soc. Am. 24, 175-184 (1952)]. The values for all control parameters remained constant throughout the 300-ms duration of each stimulus. A second set of 1520 signals was identical to these stimuli except that a falling pitch contour was used. The identification error rate for the flat-formant, flat-pitch signals was 27.3%, several times greater than the 5.6% error rate shown by Peterson and Barney's listeners. The introduction of a falling pitch contour resulted in a small but statistically reliable reduction in the error rate. The implications of these results for interpreting pattern recognition studies using the Peterson and Barney database are discussed. Results are also discussed in relation to the role of dynamic cues in vowel identification.


Journal of the Acoustical Society of America | 2002

Speech perception based on spectral peaks versus spectral shape

James Hillenbrand; Robert A. Houde; Robert T. Gayvert

This study was designed to measure the relative contributions to speech intelligibility of spectral envelope peaks (including, but not limited to, formants) versus the detailed shape of the spectral envelope. The problem was addressed by asking listeners to identify sentences and nonsense syllables that were generated by two structurally identical source-filter synthesizers, one of which constructs the filter function based on the detailed spectral envelope shape while the other constructs the filter function using a purposely coarse estimate that is based entirely on the distribution of peaks in the envelope. Viewed in the broadest terms, the results showed that nearly as much speech information is conveyed by the peaks-only method as by the detail-preserving method. Just as clearly, however, every test showed some measurable advantage for spectral detail, although the differences were not large in absolute terms.


Journal of the Acoustical Society of America | 2003

Open‐source software for speech perception research

Robert T. Gayvert; James Hillenbrand

The purpose of this paper is to describe some relatively simple software that can be used for performing such routine tasks as controlling listening experiments (e.g., simple labeling, discrimination using procedures such as ABX, oddity, same–different, etc., sentence intelligibility, magnitude estimation, and so on), recording responses and response latencies, analyzing and plotting the results of those experiments, displaying instructions, and making scripted audio recordings. The software runs under Windows and is controlled by creating text files that allow the experimenter to specify key features of the experiment such as the stimuli that are to be presented, the randomization scheme, inter‐stimulus and inter‐trial intervals, the format of the output file, and the layout of response alternatives on the screen. Some simple demonstrations will be provided, along with instructions for downloading the software. [Work supported by NIH.]


Journal of Speech, Language, and Hearing Research | 2015

Phonetics Exercises Using the Alvin Experiment-Control Software

James Hillenbrand; Robert T. Gayvert; Michael J. Clark

Purpose: Exercises are described that were designed to provide practice in phonetic transcription for students taking an introductory phonetics course. The goal was to allow instructors to offload much of the drill that would otherwise need to be covered in class or handled with paper-and-pencil tasks using text rather than speech as input. Method: The exercises were developed using Alvin, a general-purpose software package for experiment design and control. The simplest exercises help students learn sound-symbol associations. For example, a vowel-transcription exercise presents listeners with consonant-vowel-consonant syllables on each trial; students are asked to choose among buttons labeled with phonetic symbols for 12 vowels. Several word-transcription exercises are included in which students hear a word and are asked to enter a phonetic transcription. Immediate feedback is provided for all of the exercises. An explanation of the methods that are used to create exercises is provided. Results: Although no formal evaluation was conducted, comments on course evaluations suggest that most students found the exercises to be useful. Conclusions: Exercises were developed for use in an introductory phonetics course. The exercises can be used in their current form, they can be modified to suit individual needs, or new exercises can be developed.


Journal of the Acoustical Society of America | 1988

Statistical approaches to formant tracking

Robert T. Gayvert; James Hillenbrand

Formant trackers that rely on peak picking tend to make occasional large errors. This paper investigates two general methods for determining formant locations without explicit use of peak information. Both approaches involve statistical models derived from hand‐traced formant tracks. In the first method, individual formant frequency values are estimated using a maximum likelihood classifier. In the second, formant probability distributions are found for each element of a vector quantization codebook, and formant values are then determined by conditional mean estimates. Hidden Markov models or simple smoothing can then be applied to provide continuity constraints. Both of these techniques have been quantitatively analyzed using a database of 78 utterances produced by four males and four females. The performance of these trackers across different training and testing sets will be discussed. [Work supported by Rome Air Development Center under contract F3060285‐C‐0008.]
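The second method above (per-codeword formant distributions followed by conditional mean estimates) can be illustrated with a minimal sketch. Everything here is synthetic and hypothetical: the 2-D "spectral" features, the F1 values, and the tiny fixed codebook stand in for real training data and a real vector quantizer, and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical training data: each frame has a 2-D "spectral" feature
# vector and a hand-traced F1 value (all synthetic, for illustration).
feats = rng.normal(0, 1, size=(200, 2))
f1 = 500 + 300 * feats[:, 0] + rng.normal(0, 20, size=200)

# Tiny VQ codebook: a few codewords chosen from the training frames.
codebook = feats[::50]          # 4 codewords

def quantize(x, codebook):
    """Index of the nearest codeword (Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(codebook - x, axis=1)))

# Conditional mean estimate: average the hand-traced F1 values of all
# training frames that map to each codeword.
cells = np.array([quantize(x, codebook) for x in feats])
cond_mean = {i: float(f1[cells == i].mean()) for i in np.unique(cells)}

def estimate_f1(x):
    """Formant estimate for a new frame: the conditional mean of its cell."""
    return cond_mean[quantize(x, codebook)]

est = estimate_f1(feats[0])
```

A real system would use a much larger codebook and would then apply the smoothing or hidden Markov continuity constraints mentioned in the abstract; this sketch shows only the table-lookup core of the estimator.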


Journal of the Acoustical Society of America | 1987

Speaker‐independent vowel classification based on fundamental frequency and formant frequencies

James Hillenbrand; Robert T. Gayvert

A quadratic discriminant classification technique was used to classify spectral measurements from vowels spoken by men, women, and children. The parameters used to train the discriminant classifier consisted of various combinations of fundamental frequency and the three lowest formant frequencies. Several nonlinear auditory transforms were evaluated. Unlike previous studies using a linear discriminant classifier, there was no advantage in category separability for any of the nonlinear auditory transforms over a linear frequency scale, and no advantage for spectral distances over absolute frequencies. However, it was found that parameter sets using nonlinear transforms and spectral differences reduced the differences between phonetically equivalent tokens produced by different groups of talkers.
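A quadratic discriminant classifier of the kind described above fits one Gaussian (mean, covariance, prior) per vowel category and assigns each token to the class with the highest log-likelihood. The sketch below uses made-up two-vowel clusters in (F1, F2) space, not the actual measurements from the study; the cluster centers and spreads are purely illustrative.

```python
import numpy as np

def fit_qda(X, y):
    """Fit class means, covariances, and priors for a quadratic
    discriminant classifier (one Gaussian per vowel category)."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False), len(Xc) / len(X))
    return params

def predict_qda(X, params):
    """Assign each token to the class maximizing Gaussian
    log-likelihood plus log prior."""
    preds = []
    for x in X:
        best, best_score = None, -np.inf
        for c, (mu, cov, prior) in params.items():
            diff = x - mu
            score = (-0.5 * diff @ np.linalg.inv(cov) @ diff
                     - 0.5 * np.log(np.linalg.det(cov))
                     + np.log(prior))
            if score > best_score:
                best, best_score = c, score
        preds.append(best)
    return np.array(preds)

# Synthetic two-vowel demo: well-separated clusters in (F1, F2) space, in Hz.
rng = np.random.default_rng(0)
iy = rng.normal([300, 2300], 50, size=(50, 2))   # /i/-like tokens
aa = rng.normal([750, 1100], 50, size=(50, 2))   # /a/-like tokens
X = np.vstack([iy, aa])
y = np.array(["iy"] * 50 + ["aa"] * 50)
params = fit_qda(X, y)
acc = float((predict_qda(X, params) == y).mean())
```

The nonlinear auditory transforms evaluated in the paper (e.g., converting the Hz values to a perceptual scale before fitting) would simply be applied to the columns of `X` before training.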


Journal of the Acoustical Society of America | 2017

Multi-fiber coding on the auditory nerve and the origin of critical-band masking

Robert A. Houde; James Hillenbrand; Robert T. Gayvert; John F. Houde

Understanding the physiological mechanisms that underlie the exquisite frequency discrimination abilities of listeners remains a central problem in auditory science. We describe a computational model of the cochlea and auditory nerve that was developed to evaluate the frequency analysis capabilities of a system in which the output of a basilar membrane filter, transduced into a probability-of-firing function by an inner hair cell, is encoded on the auditory nerve as the instantaneous sum of firings on a critical band of fibers surrounding that filter channel and transmitted to the central nervous system for narrow-band frequency analysis. Performance of the model on vowels over a wide range of input levels was found to be robust and accurate, comparable to the Average Localized Synchronized Rate results of Young and Sachs [J. Acoust. Soc. Am. 1979, 66, 1381-1403]. Model performance in perceptual threshold simulations was also evaluated. The model succeeded in replicating psychophysical results reported in classic studies of critical band masking.


Journal of the Acoustical Society of America | 2013

A simulation of neural coding and auditory frequency analysis

Robert A. Houde; James Hillenbrand; Robert T. Gayvert; John F. Houde

Our understanding of the neural mechanisms underlying the very fine auditory frequency discrimination exhibited by listeners remains far from complete. To investigate this question we developed a functional model of the cochlear process in sufficient detail to allow the simulation of the principal characteristics of the cochlea's response to multi-tone and noise stimuli over a wide range of input levels. The model simulates level-dependent changes in frequency selectivity, combination-tone distortion, tone-on-tone suppression and masking, adaptation, and critical-band masking. The model is structured as 3000 channels, each consisting of a basilar membrane bandpass filter and inner-hair-cell assembly. Input to each channel is the stapes displacement signal, and the output consists of ten independent stochastic point processes that are transmitted to the CNS on auditory-nerve fibers (ANFs). Our main purpose is to address these questions: (1) What narrowband spectrum information is available in the cochlea o...


Journal of the Acoustical Society of America | 1989

Effects of tree structure and statistical methods on broad phonetic classification

James W. Delmege; James Hillenbrand; Robert T. Gayvert

The goal of this project was to develop a system for assigning individual frames of a speech signal to one of four broad phonetic categories: vowel‐like, strong fricative, weak fricative, and silence. Classification results were compared from a K‐means clustering algorithm and a maximum likelihood distance measure. In addition to the comparison of statistical methods, this study compared classification performance using several tree‐structured, decision‐making techniques. Training and test data consisted of various combinations of 98 utterances produced by five male and five female speakers. Results showed very little difference between the K‐means and maximum likelihood methods. However, the nature of the decision tree had a significant effect on the performance of the classifier. [Work supported by Rome Air Development Center and the Air Force Office of Scientific Research as part of the Northeast Artificial Intelligence Consortium (Contract No. F3060285‐C‐60008) and by Redcom Laboratories, Victor, NY.]
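A frame-by-frame broad classification of the kind described above can be sketched as a nearest-centroid decision, which is the building block of the K-means-style approach. The 2-D features, cluster centers, and category spreads below are invented for illustration and bear no relation to the measurements used in the study.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic 2-D frame features for the four broad categories
# (illustrative stand-ins for energy / zero-crossing style measures).
centers = {"vowel-like": [8.0, 1.0], "strong fricative": [6.0, 8.0],
           "weak fricative": [2.0, 6.0], "silence": [0.5, 0.5]}
frame_list, label_list = [], []
for name, c in centers.items():
    frame_list.append(rng.normal(c, 0.4, size=(40, 2)))
    label_list += [name] * 40
frames = np.vstack(frame_list)
labels = np.array(label_list)

# Nearest-centroid classifier: each frame goes to the category whose
# training centroid is closest in Euclidean distance.
centroids = {n: frames[labels == n].mean(axis=0) for n in centers}

def classify(frame):
    return min(centroids, key=lambda n: np.linalg.norm(frame - centroids[n]))

acc = float(np.mean([classify(f) == l for f, l in zip(frames, labels)]))
```

The maximum likelihood variant in the abstract would replace the Euclidean distance with a per-category Mahalanobis distance, and the decision-tree comparison would replace the single four-way decision with a cascade of binary ones.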


Journal of the Acoustical Society of America | 1989

ESPRIT: A signal processing environment with a visual programming interface

Robert T. Gayvert; John A. Biles; Harvey Rhody; James Hillenbrand

ESPRIT (Explorer speech processing system from the Rochester Institute of Technology) is an integrated speech research development environment that runs on the TI Explorer, optionally augmented by the TMS‐320 based Odyssey DSP board. The goal of ESPRIT is to provide speech scientists, linguists, and engineers with an intuitive environment in which to collect, process, and display speech signals. ESPRIT's module editor allows users who are not programmers to draw data‐flow programs made up of built‐in and user‐defined speech processing algorithms, display functions, and standard utilities. ESPRIT's display editor allows users to manipulate the graphical displays that result from running these programs to zoom, scroll, rearrange, take precise measurements, and perform a variety of other operations. While ESPRIT provides standard signal processing algorithms (FFT, LPC) and displays (waveforms, spectrograms, waterfalls, spectral slices), users who develop their own Lisp or TMS 320 programs can easily install them t...

Collaboration

Robert T. Gayvert's top co-authors:

James Hillenbrand, Western Michigan University
John F. Houde, University of California
Michael J. Clark, Western Michigan University
John A. Biles, Rochester Institute of Technology