Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where James Hillenbrand is active.

Publication


Featured research published by James Hillenbrand.


Journal of the Acoustical Society of America | 1994

Acoustic characteristics of American English vowels

James Hillenbrand; Laura A. Getty; Michael J. Clark; Kimberlee Wheeler

The purpose of this study was to replicate and extend the classic study of vowel acoustics by Peterson and Barney (PB) [J. Acoust. Soc. Am. 24, 175-184 (1952)]. Recordings were made of 45 men, 48 women, and 46 children producing the vowels /i, ɪ, e, ɛ, æ, ɑ, ɔ, o, ʊ, u, ʌ, ɝ/ in h-V-d syllables. Formant contours for F1-F4 were measured from LPC spectra using a custom interactive editing tool. For comparison with the PB data, formant patterns were sampled at a time that was judged by visual inspection to be maximally steady. Analysis of the formant data shows numerous differences between the present data and those of PB, both in terms of average frequencies of F1 and F2 and the degree of overlap among adjacent vowels. As with the original study, listening tests showed that the signals were nearly always identified as the vowel intended by the talker. Discriminant analysis showed that the vowels were less well separated than in the PB data when classification was based on a static sample of the formant pattern. However, the vowels can be separated with a high degree of accuracy if duration and spectral change information is included.
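The final point above — that duration and spectral-change information separate the vowels far better than a single static formant sample — can be illustrated with a toy classifier. Everything below is hypothetical: the feature layout (two formant samples plus duration) follows the abstract, but the numbers are invented and the nearest-class-mean rule is an illustrative stand-in for the discriminant analysis actually used.

```python
import math

# Hypothetical feature vectors: [F1@20%, F2@20%, F1@80%, F2@80%, duration_ms].
# Values are illustrative, not taken from the Hillenbrand et al. data.
TRAINING = {
    "ae": [[660, 1720, 640, 1700, 280], [680, 1750, 650, 1690, 300]],
    "ei": [[480, 2100, 400, 2300, 260], [470, 2150, 410, 2280, 250]],
}

def class_means(training):
    """Mean feature vector per vowel category."""
    means = {}
    for vowel, vectors in training.items():
        n = len(vectors)
        means[vowel] = [sum(v[i] for v in vectors) / n
                        for i in range(len(vectors[0]))]
    return means

def classify(sample, means):
    """Assign a token to the nearest class mean in crudely scaled space."""
    def dist(a, b):
        return math.sqrt(sum(((x - y) / 100.0) ** 2 for x, y in zip(a, b)))
    return min(means, key=lambda v: dist(sample, means[v]))

means = class_means(TRAINING)
print(classify([670, 1730, 645, 1695, 290], means))  # an /ae/-like token
```

Because each token carries two samples of the formant pattern plus duration, tokens whose onsets overlap can still be pulled apart by their offglides or lengths — the mechanism the study credits for the improved separability.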


Journal of the Acoustical Society of America | 1997

Effects of consonant environment on vowel formant patterns

James Hillenbrand; Michael J. Clark; Terrance M. Nearey

A significant body of evidence has accumulated indicating that vowel identification is influenced by spectral change patterns. For example, a large-scale study of vowel formant patterns showed substantial improvements in category separability when a pattern classifier was trained on multiple samples of the formant pattern rather than a single sample at steady state [J. Hillenbrand et al., J. Acoust. Soc. Am. 97, 3099-3111 (1995)]. However, in the earlier study all utterances were recorded in a constant /hVd/ environment. The main purpose of the present study was to determine whether a close relationship between vowel identity and spectral change patterns is maintained when the consonant environment is allowed to vary. Recordings were made of six men and six women producing eight vowels in isolation and in CVC syllables. The CVC utterances consisted of all combinations of seven initial consonants (/h,b,d,g,p,t,k/) and six final consonants (/b,d,g,p,t,k/). Formant frequencies for F1-F3 were measured every 5 ms during the vowel using an interactive editing tool. Results showed highly significant effects of phonetic environment. As with an earlier study of this type, particularly large shifts in formant patterns were seen for rounded vowels in alveolar environments [K. Stevens and A. House, J. Speech Hear. Res. 6, 111-128 (1963)]. Despite these context effects, substantial improvements in category separability were observed when a pattern classifier incorporated spectral change information. Modeling work showed that many aspects of listener behavior could be accounted for by a fairly simple pattern classifier incorporating F0, duration, and two discrete samples of the formant pattern.


Journal of the Acoustical Society of America | 1984

Limits on phonetic accuracy in foreign language speech production

James Emil Flege; James Hillenbrand

This study examined the French syllables /tu/ ("tous") and /ty/ ("tu") produced in three speaking tasks by native speakers of American English and French talkers living in the U.S. In a paired-comparison task, listeners correctly identified more of the vowels produced by French than American talkers, and more vowels produced by experienced than inexperienced American speakers of French. An acoustic analysis revealed that the American talkers produced /u/ with significantly higher F2 values than the French talkers, but produced /y/ with F2 values equal to those of the French talkers. A labeling task revealed that the /y/ vowels produced by the experienced and inexperienced Americans were identified equally well, but that the experienced Americans produced a more identifiable /u/ than the inexperienced Americans. It is hypothesized that English speakers learn French /y/ rapidly because this vowel is not — like French /u/ — judged to be equivalent to a vowel of English.


Journal of the Acoustical Society of America | 2000

Some effects of duration on vowel recognition

James Hillenbrand; Michael J. Clark; Robert A. Houde

This study was designed to examine the role of duration in vowel perception by testing listeners on the identification of CVC syllables generated at different durations. Test signals consisted of synthesized versions of 300 utterances selected from a large, multitalker database of /hVd/ syllables [Hillenbrand et al., J. Acoust. Soc. Am. 97, 3099-3111 (1995)]. Four versions of each utterance were synthesized: (1) an original duration set (vowel duration matched to the original utterance), (2) a neutral duration set (duration fixed at 272 ms, the grand mean across all vowels), (3) a short duration set (duration fixed at 144 ms, two standard deviations below the mean), and (4) a long duration set (duration fixed at 400 ms, two standard deviations above the mean). Experiment 1 used a formant synthesizer, while a second experiment was an exact replication using a sinusoidal synthesis method that represented the original vowel spectrum more precisely than the formant synthesizer. Findings included: (1) duration had a small overall effect on vowel identity, since the great majority of signals were identified correctly at their original durations and at all three altered durations; (2) despite the relatively small average effect of duration, some vowels were significantly affected by duration; (3) some vowel contrasts that differ systematically in duration were minimally affected by duration; (4) a simple pattern recognition model appears to be capable of accounting for several features of the listening test results, especially the greater influence of duration on some vowels than others; and (5) because a formant synthesizer does an imperfect job of representing the fine details of the original vowel spectrum, results using the formant-synthesized signals led to a slight overestimate of the role of duration in vowel recognition, especially for the shortened vowels.


Annals of Otology, Rhinology, and Laryngology | 2003

Cepstral Peak Prominence: A More Reliable Measure of Dysphonia

Yolanda D. Heman-Ackah; Deirdre D. Michael; Margaret M. Baroody; Rosemary Ostrowski; James Hillenbrand; Reinhardt J. Heuer; Michelle Horman; Robert T. Sataloff

Quantification of perceptual voice characteristics allows the assessment of voice changes. Acoustic measures of jitter, shimmer, and noise-to-harmonic ratio (NHR) are often unreliable. Measures of cepstral peak prominence (CPP) may be more reliable predictors of dysphonia. Trained listeners analyzed voice samples from 281 patients. The NHR, amplitude perturbation quotient, smoothed pitch perturbation quotient, percent jitter, and CPP were obtained from sustained vowel phonation, and the CPP was obtained from running speech. For the first time, normal and abnormal values of CPP were defined, and they were compared with other acoustic measures used to predict dysphonia. The CPP for running speech is a good predictor and a more reliable measure of dysphonia than are acoustic measures of jitter, shimmer, and NHR.
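Cepstral peak prominence itself is straightforward to sketch: take the log-magnitude spectrum of an analysis frame, take the cepstrum of that, and measure how far the cepstral peak rises above a regression line fit across the searched quefrency range. The stdlib-only Python below uses a naive DFT; window choice, smoothing, and the exact regression span are simplified relative to published CPP implementations, and the sampling rate and pitch range are illustrative choices.

```python
import cmath
import math
import random

FS = 8000   # sampling rate in Hz (illustrative)
N = 400     # analysis frame length in samples

def dft_mag(x):
    """Magnitude of the DFT, computed naively (O(n^2)) to stay stdlib-only."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]

def cpp(signal, fs=FS, pitch_lo=60.0, pitch_hi=300.0):
    """Rough cepstral peak prominence (dB) for one frame: height of the
    cepstral peak above a least-squares line over the searched quefrencies."""
    spec = dft_mag(signal)
    log_spec = [20.0 * math.log10(s + 1e-12) for s in spec]
    cep = dft_mag(log_spec)
    cep_db = [20.0 * math.log10(c + 1e-12) for c in cep]

    # Quefrency bins corresponding to the allowed pitch range.
    lo, hi = int(fs / pitch_hi), int(fs / pitch_lo)
    ks = list(range(lo, hi + 1))
    ys = [cep_db[k] for k in ks]

    # Least-squares regression line over the searched range.
    n = len(ks)
    mx, my = sum(ks) / n, sum(ys) / n
    slope = (sum((k - mx) * (y - my) for k, y in zip(ks, ys))
             / sum((k - mx) ** 2 for k in ks))
    intercept = my - slope * mx

    return max(y - (slope * k + intercept) for k, y in zip(ks, ys))

# A strongly periodic, voice-like frame (200-Hz harmonic complex) ...
periodic = [sum(math.sin(2 * math.pi * 200 * h * t / FS) / h
                for h in range(1, 11)) for t in range(N)]
# ... versus an aperiodic frame of uniform noise.
random.seed(0)
noise = [random.uniform(-1.0, 1.0) for _ in range(N)]

print(cpp(periodic), cpp(noise))
```

A clearly periodic frame produces a tall, narrow rahmonic well above the regression line, while dysphonic (noisier, less periodic) voices flatten the cepstral peak — which is why a higher CPP indicates a less dysphonic voice.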


Journal of the Acoustical Society of America | 1988

Perception of aperiodicities in synthetically generated voices

James Hillenbrand

The purpose of this study was to investigate univariate relationships between perceived dysphonia and variation in pitch perturbation, amplitude perturbation, and additive noise. A time-domain, pitch-synchronous synthesis technique was used to generate sustained vowels varying in each of the three acoustic dimensions. A panel of trained listeners provided direct magnitude estimates of roughness in the case of the stimuli varying in pitch and amplitude perturbation, and breathiness in the case of the stimuli varying in additive noise. Very strong relationships were found between perceived roughness and either pitch or amplitude perturbation. However, unlike results reported previously for nonspeech stimuli, the subjective quality associated with pitch perturbation was quite different from that associated with amplitude perturbation. Results also showed that perceived roughness was affected not only by the amount of perturbation, but also by the degree of correlation between adjacent pitch or amplitude values. A strong relationship was found between perceived breathiness and signal-to-noise ratio. Contrary to previous findings, there was no interaction between signal-to-noise ratio and the amount of high-frequency energy in the periodic component of the stimulus: Stimuli with similar signal-to-noise ratios received similar ratings, regardless of differences in the spectral slope of the periodic component.
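The perturbation dimensions manipulated above can be quantified very simply once cycle-by-cycle periods and peak amplitudes are in hand. The sketch below shows one common formulation; exact definitions of jitter and shimmer vary across analysis systems, and the 5-ms base period and ±1% perturbation are illustrative.

```python
import math

def percent_jitter(periods):
    """Mean absolute difference between consecutive pitch periods,
    as a percentage of the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def shimmer_db(amps):
    """Mean absolute dB ratio between consecutive cycle peak amplitudes."""
    ratios = [abs(20.0 * math.log10(a / b)) for a, b in zip(amps, amps[1:])]
    return sum(ratios) / len(ratios)

# Alternating +/-1% perturbation around a 5-ms base period.
periods = [5.05 if i % 2 else 4.95 for i in range(20)]
print(percent_jitter(periods))           # ~2.0
print(shimmer_db([1.0, 0.95, 1.0, 0.95]))
```

The study's finding that roughness depends not only on the amount of perturbation but also on the correlation between adjacent cycles is worth noting here: these summary statistics collapse over cycle ordering, so two signals with identical jitter values can sound different.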


Attention Perception & Psychophysics | 2009

The role of f0 and formant frequencies in distinguishing the voices of men and women.

James Hillenbrand; Michael J. Clark

The purpose of the present study was to determine the contributions of fundamental frequency (f0) and formants in cuing the distinction between men’s and women’s voices. A source-filter synthesizer was used to create four versions of 25 sentences spoken by men: (1) unmodified synthesis, (2) f0 only shifted up toward values typical of women, (3) formants only shifted up toward values typical of women, and (4) both f0 and formants shifted up. Identical methods were used to generate four corresponding versions of 25 sentences spoken by women, but with downward shifts. Listening tests showed that (1) shifting both f0 and formants was usually effective (∼82%) in changing the perceived sex of the utterance, and (2) shifting either f0 or formants alone was usually ineffective in changing the perceived sex. Both f0 and formants are apparently needed to specify speaker sex, though even together these cues are not entirely effective. Results also suggested that f0 is somewhat more important than formants. A second experiment used the same methods, but isolated /hVd/ syllables were used as test signals. Results were broadly similar, with the important exception that, on average, the syllables were more likely to shift perceived talker sex with shifts in f0 and/or formants.


Attention Perception & Psychophysics | 1986

Psychoacoustics of a chilling sound

D. Lynn Halpern; Randolph Blake; James Hillenbrand

We digitally synthesized versions of the sound of a sharp object scraping across a slate surface (which mimics the sound of fingernails scraping across a blackboard) to determine whether spectral content or amplitude contour contributed to its obnoxious quality. Using magnitude estimation, listeners rated each synthesized sound’s unpleasantness. Contrary to intuition, removal of low, but not of high, frequencies lessened the sound’s unpleasantness. Manipulations of the signal amplitude had no significant impact on listeners’ unpleasantness estimates. Evidently, low-frequency spectral factors contribute primarily to the discomfort associated with this sound.


Journal of the Acoustical Society of America | 1993

Identification of steady‐state vowels synthesized from the Peterson and Barney measurements

James Hillenbrand; Robert T. Gayvert

The purpose of this study was to determine how well listeners can identify vowels based exclusively on static spectral cues. This was done by asking listeners to identify steady-state synthesized versions of 1520 vowels (76 talkers x 10 vowels x 2 repetitions) using Peterson and Barney's measured values of F0 and F1-F3 [J. Acoust. Soc. Am. 24, 175-184 (1952)]. The values for all control parameters remained constant throughout the 300-ms duration of each stimulus. A second set of 1520 signals was identical to these stimuli except that a falling pitch contour was used. The identification error rate for the flat-formant, flat-pitch signals was 27.3%, several times greater than the 5.6% error rate shown by Peterson and Barney's listeners. The introduction of a falling pitch contour resulted in a small but statistically reliable reduction in the error rate. The implications of these results for interpreting pattern recognition studies using the Peterson and Barney database are discussed. Results are also discussed in relation to the role of dynamic cues in vowel identification.


Journal of the Acoustical Society of America | 2002

Evaluation of a strategy for automatic formant tracking

Terrance M. Nearey; Peter F. Assmann; James Hillenbrand

Variations on an automatic formant tracking strategy developed at Alberta will be compared to manual formant measurements from two databases of vowels spoken by men, women, and children (in Texas or Michigan). "Correct" vowel formant candidates for F1, F2, and F3 may be found roughly 85-90 percent of the time for adult male speakers using autocorrelation LPC with the following settings: F3 maximum at 3000 Hz, LPC order of 14, sampling rate of 10 kHz [J. Markel and A. Gray, Linear Prediction of Speech (Springer, New York, 1975)]. Experience shows good results are also often found with females' and children's speech, provided the sampling rate and F3 maximum are scaled appropriately for each speaker. Our new basic strategy involves analyzing each utterance at several distinct sampling rates and coordinated F3 cutoff frequencies with a fixed LPC order. Each scaling choice provides an independent set of candidates that is post-processed by a simple tracking algorithm.
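The core of the autocorrelation LPC analysis named above is easy to sketch: compute the frame's autocorrelation up to the chosen order, then solve the Yule-Walker equations with the Levinson-Durbin recursion. The sketch below stops at the predictor coefficients; formant candidates would then come from the roots of the predictor polynomial and the multi-rate candidate generation described in the abstract, both omitted here. The AR(2) test signal and its coefficients are illustrative, not from the study.

```python
import random

def autocorr(x, order):
    """Autocorrelation r[0..order] of a signal frame."""
    n = len(x)
    return [sum(x[t] * x[t + k] for t in range(n - k)) for k in range(order + 1)]

def levinson_durbin(r):
    """Solve the Yule-Walker equations for predictor coefficients a[1..p]
    (convention x[t] ~ sum_j a[j] * x[t-j]) given autocorrelations r[0..p]."""
    p = len(r) - 1
    a = [0.0] * (p + 1)
    e = r[0]
    for i in range(1, p + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / e
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        e *= 1.0 - k * k
    return a[1:], e  # coefficients and residual prediction error

# Synthetic AR(2) signal: x[t] = 0.9*x[t-1] - 0.5*x[t-2] + white noise.
random.seed(1)
x = [0.0, 0.0]
for _ in range(3000):
    x.append(0.9 * x[-1] - 0.5 * x[-2] + random.gauss(0.0, 1.0))
coeffs, err = levinson_durbin(autocorr(x[2:], 2))
print(coeffs)  # close to [0.9, -0.5]
```

Running the same recursion at order 14 on speech downsampled to 10 kHz, as the abstract describes, yields up to seven conjugate pole pairs, from which F1-F3 candidates are selected.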

Collaboration


Dive into James Hillenbrand's collaborations.

Top Co-Authors

Michael J. Clark

Western Michigan University

James Emil Flege

University of Alabama at Birmingham

Dennis R. Ingrisano

University of Northern Colorado

John F. Houde

University of California

Reinhardt J. Heuer

Thomas Jefferson University