
Publications


Featured research published by Brian D. Simpson.


Journal of the Acoustical Society of America | 2001

Informational and energetic masking effects in the perception of multiple simultaneous talkers.

Douglas S. Brungart; Brian D. Simpson; Mark A. Ericson; Kimberly R. Scott

Although many researchers have examined the role that binaural cues play in the perception of spatially separated speech signals, relatively little is known about the cues that listeners use to segregate competing speech messages in a monaural or diotic stimulus. This series of experiments examined how variations in the relative levels and voice characteristics of the target and masking talkers influence a listener's ability to extract information from a target phrase in a 3-talker or 4-talker diotic stimulus. Performance in this speech perception task decreased systematically when the level of the target talker was reduced relative to the masking talkers. Performance also generally decreased when the target and masking talkers had similar voice characteristics: the target phrase was most intelligible when the target and masking phrases were spoken by different-sex talkers, and least intelligible when the target and masking phrases were spoken by the same talker. However, when the target-to-masker ratio was less than 3 dB, overall performance was usually lower with one different-sex masker than with all same-sex maskers. In most of the conditions tested, the listeners performed better when they were exposed to the characteristics of the target voice prior to the presentation of the stimulus. The results of these experiments demonstrate how monaural factors may play an important role in the segregation of speech signals in multitalker environments.


Journal of the Acoustical Society of America | 2000

A speech corpus for multitalker communications research

Robert S. Bolia; W. Todd Nelson; Mark A. Ericson; Brian D. Simpson

A database of speech samples from eight different talkers has been collected for use in multitalker communications research. Descriptions of the nature of the corpus, the data collection methodology, and the means for obtaining copies of the database are presented.


Journal of the Acoustical Society of America | 2006

Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation

Douglas S. Brungart; Peter S. Chang; Brian D. Simpson; DeLiang Wang

When a target speech signal is obscured by an interfering speech waveform, comprehension of the target message depends both on the successful detection of the energy from the target speech waveform and on the successful extraction and recognition of the spectro-temporal energy pattern of the target out of a background of acoustically similar masker sounds. This study attempted to isolate the effects that energetic masking, defined as the loss of detectable target information due to the spectral overlap of the target and masking signals, has on multitalker speech perception. This was achieved through the use of ideal time-frequency binary masks that retained those spectro-temporal regions of the acoustic mixture that were dominated by the target speech but eliminated those regions that were dominated by the interfering speech. The results suggest that energetic masking plays a relatively small role in the overall masking that occurs when speech is masked by interfering speech but a much more significant role when speech is masked by interfering noise.
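
The ideal time-frequency segregation procedure described above amounts to a cell-by-cell comparison of target and masker spectrograms: retain the time-frequency units the target dominates, discard the rest. A minimal Python sketch under that reading; the function name, the 0 dB local criterion, and the toy spectrograms are illustrative assumptions, not the study's implementation.

```python
import numpy as np

def ideal_binary_mask(target_spec, masker_spec, lc_db=0.0):
    """Ideal binary mask: 1 where the target's local SNR exceeds
    the criterion lc_db, 0 elsewhere."""
    eps = 1e-12  # avoid log of zero
    snr_db = 20 * np.log10((np.abs(target_spec) + eps) /
                           (np.abs(masker_spec) + eps))
    return (snr_db > lc_db).astype(float)

# Toy magnitude spectrograms (frequency bins x time frames).
rng = np.random.default_rng(0)
target = rng.rayleigh(1.0, (64, 100))
masker = rng.rayleigh(1.0, (64, 100))

mask = ideal_binary_mask(target, masker)
# Keep only the target-dominated regions of the mixture.
segregated = (target + masker) * mask
```

In practice the spectrograms would come from a short-time Fourier transform or an auditory filterbank of the actual speech signals; the mask is "ideal" only in the sense that it is computed from the separately known target and masker.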


Journal of the Acoustical Society of America | 2002

The effects of spatial separation in distance on the informational and energetic masking of a nearby speech signal.

Douglas S. Brungart; Brian D. Simpson

Although many studies have shown that intelligibility improves when a speech signal and an interfering sound source are spatially separated in azimuth, little is known about the effect that spatial separation in distance has on the perception of competing sound sources near the head. In this experiment, head-related transfer functions (HRTFs) were used to process stimuli in order to simulate a target talker and a masking sound located at different distances along the listener's interaural axis. One of the signals was always presented at a distance of 1 m, and the other signal was presented 1 m, 25 cm, or 12 cm from the center of the listener's head. The results show that distance separation has very different effects on speech segregation for different types of maskers. When speech-shaped noise was used as the masker, most of the intelligibility advantages of spatial separation could be accounted for by spectral differences in the target and masking signals at the ear with the higher signal-to-noise ratio (SNR). When a same-sex talker was used as the masker, the intelligibility advantages of spatial separation in distance were dominated by binaural effects that produced the same performance improvements as a 4-5 dB increase in the SNR of a diotic stimulus. These results suggest that distance-dependent changes in the interaural difference cues of nearby sources play a much larger role in the reduction of the informational masking produced by an interfering speech signal than in the reduction of the energetic masking produced by an interfering noise source.
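
The "ear with the higher signal-to-noise ratio" analysis above can be sketched as a broadband per-ear SNR comparison. A minimal Python sketch; better_ear_snr_db, the diotic toy target, and the 0.25 attenuation factor are illustrative assumptions, not the study's stimuli.

```python
import numpy as np

def better_ear_snr_db(target_lr, masker_lr):
    """Broadband SNR in each ear; return the more favorable value in dB.

    target_lr, masker_lr: (2, n_samples) arrays of left/right ear signals.
    """
    snr_per_ear = 10 * np.log10(np.sum(target_lr ** 2, axis=1) /
                                np.sum(masker_lr ** 2, axis=1))
    return float(np.max(snr_per_ear))

# Toy example: a diotic target with a masker attenuated in the right ear,
# loosely mimicking a nearby source displaced along the interaural axis.
rng = np.random.default_rng(1)
target = np.tile(rng.standard_normal(1000), (2, 1))
masker = np.vstack([rng.standard_normal(1000),          # left ear, full level
                    0.25 * rng.standard_normal(1000)])  # right ear, attenuated
snr = better_ear_snr_db(target, masker)  # dominated by the right (better) ear
```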


Journal of the Acoustical Society of America | 2002

Within-ear and across-ear interference in a cocktail-party listening task

Douglas S. Brungart; Brian D. Simpson

Although many researchers have shown that listeners are able to selectively attend to a target speech signal when a masking talker is present in the same ear as the target speech or when a masking talker is present in a different ear than the target speech, little is known about selective auditory attention in tasks with a target talker in one ear and independent masking talkers in both ears at the same time. In this series of experiments, listeners were asked to respond to a target speech signal spoken by one of two competing talkers in their right (target) ear while ignoring a simultaneous masking sound in their left (unattended) ear. When the masking sound in the unattended ear was noise, listeners were able to segregate the competing talkers in the target ear nearly as well as they could with no sound in the unattended ear. When the masking sound in the unattended ear was speech, however, speech segregation in the target ear was substantially worse than with no sound in the unattended ear. When the masking sound in the unattended ear was time-reversed speech, speech segregation was degraded only when the target speech was presented at a lower level than the masking speech in the target ear. These results show that within-ear and across-ear speech segregation are closely related processes that cannot be performed simultaneously when the interfering sound in the unattended ear is qualitatively similar to speech.


ACM Transactions on Applied Perception | 2005

Optimizing the spatial configuration of a seven-talker speech display

Douglas S. Brungart; Brian D. Simpson

Although there is substantial evidence that performance in multitalker listening tasks can be improved by spatially separating the apparent locations of the competing talkers, very little effort has been made to determine the best locations and presentation levels for the talkers in a multichannel speech display. In this experiment, a call sign based color and number identification task was used to evaluate the effectiveness of three different spatial configurations and two different level normalization schemes in a seven-channel binaural speech display. When only two spatially adjacent channels of the seven-channel system were active, overall performance was substantially better with a geometrically spaced spatial configuration (with far-field talkers at −90°, −30°, −10°, 0°, +10°, +30°, and +90° azimuth) or a hybrid near-far configuration (with far-field talkers at −90°, −30°, 0°, +30°, and +90° azimuth and near-field talkers at ±90°) than with a more conventional linearly spaced configuration (with far-field talkers at −90°, −60°, −30°, 0°, +30°, +60°, and +90° azimuth). When all seven channels were active, performance was generally better with a “better-ear” normalization scheme that equalized the levels of the talkers in the more intense ear than with a default normalization scheme that equalized the levels of the talkers at the center of the head. The best overall performance in the seven-talker task occurred when the hybrid near-far spatial configuration was combined with the better-ear normalization scheme. This combination resulted in a 20% increase in the number of correct identifications relative to the baseline condition with linearly spaced talker locations and no level normalization. Although this is a relatively modest improvement, it should be noted that it could be achieved at little or no cost simply by reconfiguring the HRTFs used in a multitalker speech display.
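
The azimuth layouts and the better-ear normalization scheme above can be written down compactly. A minimal sketch; the helper name and the 65 dB reference level are illustrative assumptions, and the near-field ±90° talkers of the hybrid configuration are noted but not modeled.

```python
import numpy as np

# Far-field azimuths (degrees) of the three configurations described above.
configs = {
    "linear":    [-90, -60, -30, 0, 30, 60, 90],
    "geometric": [-90, -30, -10, 0, 10, 30, 90],
    "hybrid":    [-90, -30, 0, 30, 90],  # plus near-field talkers at +/-90
}

def better_ear_gains(levels_lr_db, ref_db=65.0):
    """Better-ear normalization: per-talker gain (dB) that equalizes each
    talker's level in its more intense ear at ref_db.

    levels_lr_db: (n_talkers, 2) array of per-ear levels in dB.
    """
    levels_lr_db = np.asarray(levels_lr_db, dtype=float)
    return ref_db - levels_lr_db.max(axis=1)

# A talker louder in the right ear is turned down; one already at the
# reference in its louder ear is left alone.
gains = better_ear_gains([[60.0, 70.0], [65.0, 63.0]])
```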


Journal of the Acoustical Society of America | 2005

Precedence-based speech segregation in a virtual auditory environment

Douglas S. Brungart; Brian D. Simpson; Richard L. Freyman

When a masking sound is spatially separated from a target speech signal, substantial releases from masking typically occur both for speech and noise maskers. However, when a delayed copy of the masker is also presented at the location of the target speech (a condition that has been referred to as the front target, right-front masker or F-RF configuration), the advantages of spatial separation vanish for noise maskers but remain substantial for speech maskers. This effect has been attributed to precedence, which introduces an apparent spatial separation between the target and masker in the F-RF configuration that helps the listener to segregate the target from a masking voice but not from a masking noise. In this study, virtual synthesis techniques were used to examine variations of the F-RF configuration in an attempt to more fully understand the stimulus parameters that influence the release from masking obtained in that condition. The results show that the release from speech-on-speech masking caused by the addition of the delayed copy of the masker is robust across a wide variety of source locations, masker locations, and masker delay values. This suggests that the speech unmasking that occurs in the F-RF configuration is not dependent on any single perceptual cue and may indicate that F-RF speech segregation is only partially based on the apparent left-right location of the RF masker.


Journal of the Acoustical Society of America | 2009

Multitalker speech perception with ideal time-frequency segregation: Effects of voice characteristics and number of talkers

Douglas S. Brungart; Peter S. Chang; Brian D. Simpson; DeLiang Wang

When a target voice is masked by an increasingly similar masker voice, increases in energetic masking are likely to occur due to increased spectro-temporal overlap in the competing speech waveforms. However, the impact of this increase may be obscured by informational masking effects related to the increased confusability of the target and masking utterances. In this study, the effects of target-masker similarity and the number of competing talkers on the energetic component of speech-on-speech masking were measured with an ideal time-frequency segregation (ITFS) technique that retained all the target-dominated time-frequency regions of a multitalker mixture but eliminated all the time-frequency regions dominated by the maskers. The results show that target-masker similarity has a small but systematic impact on energetic masking, with roughly a 1 dB release from masking for same-sex maskers versus same-talker maskers and roughly an additional 1 dB release from masking for different-sex masking voices. The results of a second experiment measuring ITFS performance with up to 18 interfering talkers indicate that energetic masking increased systematically with the number of competing talkers. These results suggest that energetic masking differences related to target-masker similarity have a much smaller impact on multitalker listening performance than energetic masking effects related to the number of competing talkers in the stimulus and non-energetic masking effects related to the confusability of the target and masking voices.


Proceedings of the Human Factors and Ergonomics Society Annual Meeting | 2005

Evaluation of Bone-Conduction Headsets for Use in Multitalker Communication Environments

Bruce N. Walker; Raymond M. Stanley; Nandini Iyer; Brian D. Simpson; Douglas S. Brungart

Standard audio headphones are useful in many applications, but they cover the ears of the listener and thus may impair the perception of ambient sounds. Bone-conduction headphones offer a possible alternative, but traditionally their use has been limited to monaural applications due to the high propagation speed of sound in the human skull. Here we show that stereo bone-conduction headsets can be used to provide a limited amount of interaural isolation in a dichotic speech perception task. The results suggest that reliable spatial separation is possible with bone-conduction headsets, but that they probably cannot be used to lateralize signals to extreme left or right apparent locations.


Proceedings of the Human Factors and Ergonomics Society Annual Meeting | 2004

3D Audio Cueing for Target Identification in a Simulated Flight Task

Brian D. Simpson; Douglas S. Brungart; Robert H. Gilkey; Jeffrey L. Cowgill; Ronald C. Dallman; Randall F. Green; Kevin L. Youngblood; Thomas Moore

Modern Traffic Advisory Systems (TAS) can increase flight safety by providing pilots with real-time information about the locations of nearby aircraft. However, most current collision avoidance systems rely on non-intuitive visual and audio displays that may not allow pilots to take full advantage of this information. In this experiment, we compared the response times required for subjects participating in a fully-immersive simulated flight task to visually acquire and identify nearby targets under four different simulated TAS display conditions: 1) no display; 2) a visual display combined with a non-spatialized warning sound; 3) a visual display combined with a clock-coordinate speech signal; and 4) a visual display combined with a spatialized auditory warning sound. The results show that response times varied in an orderly fashion as a function of display condition, with the slowest times occurring in the no display condition and the fastest times occurring in the 3D audio display condition, where they were roughly 25% faster than those without the 3D audio cues.

Collaboration


Dive into Brian D. Simpson's collaborations.

Top Co-Authors

Douglas S. Brungart, Air Force Research Laboratory
Nandini Iyer, Air Force Research Laboratory
Griffin D. Romigh, Air Force Research Laboratory
Mark A. Ericson, Air Force Research Laboratory
Richard L. McKinley, Wright-Patterson Air Force Base
Robert S. Bolia, Wright-Patterson Air Force Base
Matthew G. Wisniewski, State University of New York System
Victor Finomore, Air Force Research Laboratory