
Publication


Featured research published by Sarah E. Yoho.


Journal of the Acoustical Society of America | 2013

An algorithm to improve speech recognition in noise for hearing-impaired listeners

Eric W. Healy; Sarah E. Yoho; Yuxuan Wang; DeLiang Wang

Despite considerable effort, monaural (single-microphone) algorithms capable of increasing the intelligibility of speech in noise have remained elusive. Successful development of such an algorithm is especially important for hearing-impaired (HI) listeners, given their particular difficulty in noisy backgrounds. In the current study, an algorithm based on binary masking was developed to separate speech from noise. Unlike the ideal binary mask, which requires prior knowledge of the premixed signals, the masks used to segregate speech from noise in the current study were estimated by training the algorithm on speech not used during testing. Sentences were mixed with speech-shaped noise and with babble at various signal-to-noise ratios (SNRs). Testing using normal-hearing and HI listeners indicated that intelligibility increased following processing in all conditions. These increases were larger for HI listeners, for the modulated background, and for the least-favorable SNRs. They were also often substantial, allowing several HI listeners to improve intelligibility from scores near zero to values above 70%.
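
The ideal binary mask that the abstract contrasts against can only be computed when the premixed speech and noise are available, which is exactly the prior knowledge the trained algorithm must do without. Below is a minimal sketch of that reference computation, assuming STFT magnitudes and an illustrative 0 dB local criterion (the paper's exact parameters are not stated in the abstract):

```python
# Sketch of the ideal binary mask (IBM): requires the premixed signals.
import numpy as np
from scipy.signal import stft, istft

def ideal_binary_mask(speech, noise, fs, lc_db=0.0, nperseg=512):
    """Mark each time-frequency unit 1 where the local SNR exceeds the
    local criterion (lc_db), else 0. lc_db and nperseg are illustrative."""
    _, _, S = stft(speech, fs=fs, nperseg=nperseg)
    _, _, N = stft(noise, fs=fs, nperseg=nperseg)
    local_snr_db = 10 * np.log10(np.abs(S) ** 2 / (np.abs(N) ** 2 + 1e-12))
    return (local_snr_db > lc_db).astype(float)

def apply_mask(mixture, mask, fs, nperseg=512):
    """Zero out noise-dominant units of the mixture and resynthesize."""
    _, _, M = stft(mixture, fs=fs, nperseg=nperseg)
    _, out = istft(M * mask, fs=fs, nperseg=nperseg)
    return out
```

The algorithm described in the abstract approximates this mask from the noisy mixture alone, having been trained on speech not used during testing.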


Journal of the Acoustical Society of America | 2016

Large-scale training to increase speech intelligibility for hearing-impaired listeners in novel noises

Jitong Chen; Yuxuan Wang; Sarah E. Yoho; DeLiang Wang; Eric W. Healy

Supervised speech segregation has been recently shown to improve human speech intelligibility in noise, when trained and tested on similar noises. However, a major challenge involves the ability to generalize to entirely novel noises. Such generalization would enable hearing aid and cochlear implant users to improve speech intelligibility in unknown noisy environments. This challenge is addressed in the current study through large-scale training. Specifically, a deep neural network (DNN) was trained on 10 000 noises to estimate the ideal ratio mask, and then employed to separate sentences from completely new noises (cafeteria and babble) at several signal-to-noise ratios (SNRs). Although the DNN was trained at the fixed SNR of -2 dB, testing using hearing-impaired listeners demonstrated that speech intelligibility increased substantially following speech segregation using the novel noises and unmatched SNR conditions of 0 dB and 5 dB. Sentence intelligibility benefit was also observed for normal-hearing listeners in most noisy conditions. The results indicate that DNN-based supervised speech segregation with large-scale training is a very promising approach for generalization to new acoustic environments.
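
The ideal ratio mask used as the DNN's training target is commonly defined as the square-rooted ratio of speech energy to speech-plus-noise energy in each time-frequency unit; the sketch below assumes that common form, together with the fixed-SNR mixing used to create training material (the paper's exact features and mask definition are not given in the abstract):

```python
# Sketch of the ideal ratio mask (IRM) and fixed-SNR mixing.
import numpy as np
from scipy.signal import stft

def ideal_ratio_mask(speech, noise, fs, nperseg=512):
    """One common IRM form: sqrt(S^2 / (S^2 + N^2)) per unit."""
    _, _, S = stft(speech, fs=fs, nperseg=nperseg)
    _, _, N = stft(noise, fs=fs, nperseg=nperseg)
    s2, n2 = np.abs(S) ** 2, np.abs(N) ** 2
    return np.sqrt(s2 / (s2 + n2 + 1e-12))

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so the mixture sits at the requested overall SNR
    (e.g., the -2 dB training SNR used here)."""
    gain = np.sqrt(np.mean(speech ** 2) /
                   (np.mean(noise ** 2) * 10 ** (snr_db / 10)))
    return speech + gain * noise
```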


Journal of the Acoustical Society of America | 2015

An algorithm to increase speech intelligibility for hearing-impaired listeners in novel segments of the same noise type

Eric W. Healy; Sarah E. Yoho; Jitong Chen; Yuxuan Wang; DeLiang Wang

Machine learning algorithms to segregate speech from background noise hold considerable promise for alleviating limitations associated with hearing impairment. One of the most important considerations for implementing these algorithms into devices such as hearing aids and cochlear implants involves their ability to generalize to conditions not employed during the training stage. A major challenge involves the generalization to novel noise segments. In the current study, sentences were segregated from multi-talker babble and from cafeteria noise using an algorithm that employs deep neural networks to estimate the ideal ratio mask. Importantly, the algorithm was trained on segments of noise and tested using entirely novel segments of the same nonstationary noise type. Substantial sentence-intelligibility benefit was observed for hearing-impaired listeners in both noise types, despite the use of unseen noise segments during the test stage. Interestingly, normal-hearing listeners displayed benefit in babble but not in cafeteria noise. This result highlights the importance of evaluating these algorithms not only in human subjects, but in members of the actual target population.
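
The generalization test here turns on drawing training and test material from disjoint portions of the same long nonstationary noise recording. A hypothetical splitting helper illustrating that protocol (the segment counts, lengths, and half-and-half split are placeholders, not the paper's procedure):

```python
import numpy as np

def split_noise_segments(noise, seg_len, n_train, n_test, seed=0):
    """Draw training segments from the first half of a long noise
    recording and test segments from the second half, so every test
    segment is entirely novel to training."""
    rng = np.random.default_rng(seed)
    half = len(noise) // 2

    def draw(lo, hi, n):
        starts = rng.integers(lo, hi - seg_len, size=n)
        return [noise[s:s + seg_len] for s in starts]

    return draw(0, half, n_train), draw(half, len(noise), n_test)
```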


Journal of the Acoustical Society of America | 2013

Role and relative contribution of temporal envelope and fine structure cues in sentence recognition by normal-hearing listeners

Frédéric Apoux; Sarah E. Yoho; Carla L. Youngdahl; Eric W. Healy

The present study investigated the role and relative contribution of envelope and temporal fine structure (TFS) to sentence recognition in noise. Target and masker stimuli were added at five different signal-to-noise ratios (SNRs) and filtered into 30 contiguous frequency bands. The envelope and TFS were extracted from each band by Hilbert decomposition. The final stimuli consisted of the envelope of the target/masker sound mixture at x dB SNR and the TFS of the same sound mixture at y dB SNR. A first experiment showed a very limited contribution of TFS cues, indicating that sentence recognition in noise relies almost exclusively on temporal envelope cues. A second experiment showed that replacing the carrier of a sound mixture with noise (vocoder processing) cannot be considered equivalent to disrupting the TFS of the target signal by adding a background noise. Accordingly, a re-evaluation of the vocoder approach as a model to further understand the role of TFS cues in noisy situations may be necessary. Overall, these data are consistent with the view that speech information is primarily extracted from the envelope while TFS cues are primarily used to detect glimpses of the target.
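
A sketch of the Hilbert decomposition the stimuli were built from: each bandpass-filtered signal is split into its envelope (the magnitude of the analytic signal) and its TFS (the cosine of the instantaneous phase). The filter order and band edges below are illustrative; the study used 30 contiguous bands:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def envelope_and_tfs(x, fs, band_edges):
    """Per band: envelope = |analytic signal|, TFS = cos(instantaneous
    phase). band_edges is a list of (low_hz, high_hz) pairs."""
    envelopes, tfs = [], []
    for lo, hi in band_edges:
        sos = butter(4, [lo, hi], btype='bandpass', fs=fs, output='sos')
        band = sosfiltfilt(sos, x)
        analytic = hilbert(band)
        envelopes.append(np.abs(analytic))
        tfs.append(np.cos(np.angle(analytic)))
    return envelopes, tfs
```

Recombining, band by band, the envelope of the mixture at x dB SNR with the TFS of the mixture at y dB SNR yields the stimuli described above.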


Journal of the Acoustical Society of America | 2014

Speech-cue transmission by an algorithm to increase consonant recognition in noise for hearing-impaired listeners

Eric W. Healy; Sarah E. Yoho; Yuxuan Wang; Frédéric Apoux; DeLiang Wang

Consonant recognition was assessed following extraction of speech from noise using a more efficient version of the speech-segregation algorithm described in Healy, Yoho, Wang, and Wang [(2013) J. Acoust. Soc. Am. 134, 3029-3038]. Substantial increases in recognition were observed following algorithm processing, which were significantly larger for hearing-impaired (HI) than for normal-hearing (NH) listeners in both speech-shaped noise and babble backgrounds. As observed previously for sentence recognition, older HI listeners having access to the algorithm performed as well or better than young NH listeners in conditions of identical noise. It was also found that the binary masks estimated by the algorithm transmitted speech features to listeners in a fashion highly similar to that of the ideal binary mask (IBM), suggesting that the algorithm is estimating the IBM with substantial accuracy. Further, the speech features associated with voicing, manner of articulation, and place of articulation were all transmitted with relative uniformity and at relatively high levels, indicating that the algorithm and the IBM transmit speech cues without obvious deficiency. Because the current implementation of the algorithm is much more efficient, it should be more amenable to real-time implementation in devices such as hearing aids and cochlear implants.


Journal of the Acoustical Society of America | 2013

Can envelope recovery account for speech recognition based on temporal fine structure?

Frédéric Apoux; Carla L. Youngdahl; Sarah E. Yoho; Eric W. Healy

Over the past decade, several studies have demonstrated that normal-hearing listeners can achieve high levels of speech recognition when presented with only the temporal fine structure (TFS) of speech stimuli. Initial suggestions to explain these findings were that they were the result of the auditory system’s ability to recover envelope information from the TFS (envelope recovery; ER). A number of studies have since shown decreasing ER with increasing numbers of analysis filters (the filters used to decompose the signal) while intelligibility from speech-TFS remains almost unaffected. Accordingly, it is now assumed that speech information is present in the TFS. A recent psychophysical study, however, showed that envelope information remains in the TFS after decomposition, suggesting a possible role of ER in speech-TFS understanding. The present study investigated this potential role. In contrast to previous work, a clear influence of analysis filter bandwidth on speech-TFS understanding was established...
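
Envelope recovery can be demonstrated by refiltering a TFS-only stimulus: the filtering reintroduces envelope fluctuations that can then be read out and compared with the original band envelope. A minimal sketch of that readout (filter parameters are illustrative):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def recovered_envelope(tfs_signal, fs, lo, hi, order=4):
    """Refilter a TFS-only signal into one band and extract the envelope
    the filtering reintroduces; narrower analysis filters in the original
    decomposition leave less envelope to recover."""
    sos = butter(order, [lo, hi], btype='bandpass', fs=fs, output='sos')
    band = sosfiltfilt(sos, tfs_signal)
    return np.abs(hilbert(band))
```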


Journal of the Acoustical Society of America | 2014

An algorithm to improve speech recognition in noise for hearing-impaired listeners: Consonant identification and articulatory feature transmission

Eric W. Healy; Sarah E. Yoho; Yuxuan Wang; Frédéric Apoux; Carla L. Youngdahl; DeLiang Wang

Previous work has shown that a supervised-learning algorithm estimating the ideal binary mask (IBM) can improve sentence intelligibility in noise for hearing-impaired (HI) listeners from scores below 30% to above 80% [Healy et al., J. Acoust. Soc. Am. 134 (2013)]. The algorithm generates a binary mask by using a deep neural network to classify speech-dominant and noise-dominant time-frequency units. In the current study, these results are extended to consonant recognition, in order to examine the specific speech cues responsible for the observed performance improvements. Consonant recognition in speech-shaped noise or babble was examined in normal-hearing and HI listeners in three conditions: unprocessed, noise removed via the IBM, and noise removed via the classification-based algorithm. The IBM demonstrated substantial performance improvements, averaging up to 45 percentage points. The algorithm also produced sizeable gains, averaging up to 34 percentage points. An information-transmission analysis of cues associated with ...
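
The information-transmission analysis mentioned at the end of this abstract is conventionally computed in the style of Miller and Nicely from a consonant confusion matrix: responses are pooled by feature category (e.g., voiced versus voiceless) and the mutual information between stimulus and response categories is normalized by the stimulus entropy. A sketch of that computation (the paper's exact feature groupings are not given in this excerpt):

```python
import numpy as np

def relative_transmitted_info(confusion):
    """Relative information transmitted for one feature: mutual
    information between stimulus and response categories, divided by
    the stimulus entropy. `confusion` is a stimulus-by-response count
    matrix pooled over that feature's categories."""
    p = confusion / confusion.sum()   # joint probabilities
    px = p.sum(axis=1)                # stimulus marginals
    py = p.sum(axis=0)                # response marginals
    mutual = 0.0
    for i in range(p.shape[0]):
        for j in range(p.shape[1]):
            if p[i, j] > 0:
                mutual += p[i, j] * np.log2(p[i, j] / (px[i] * py[j]))
    h_stim = -sum(q * np.log2(q) for q in px if q > 0)
    return mutual / h_stim
```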


Journal of the Acoustical Society of America | 2018

The noise susceptibility of various speech bands

Sarah E. Yoho; Frédéric Apoux; Eric W. Healy

The degrading influence of noise on various critical bands of speech was assessed. A modified version of the compound method [Apoux and Healy (2012) J. Acoust. Soc. Am. 132, 1078-1087] was employed to establish this noise susceptibility for each speech band. Noise was added to the target speech band at various signal-to-noise ratios to determine the amount of noise required to reduce the contribution of that band by 50%. It was found that noise susceptibility is not equal across the speech spectrum, as is commonly assumed and incorporated into modern indexes. Instead, the signal-to-noise ratio required to equivalently impact various speech bands differed by as much as 13 dB. This noise susceptibility formed an irregular pattern across frequency, despite the use of multi-talker speech materials designed to reduce the potential influence of a particular talker's voice. Basic trends in the pattern of noise susceptibility across the spectrum nevertheless emerged. Further, no systematic relationship was observed between noise susceptibility and speech band importance. It is argued here that susceptibility to noise and band importance are different phenomena, and that this distinction may be underappreciated in previous works.
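
The core manipulation can be sketched as titrating band-matched noise against a single critical band of speech: noise filtered into the same band is scaled to a target in-band SNR, and the SNR is swept until the band's measured contribution halves. The sketch below shows only that SNR manipulation; the compound method itself involves additional steps not shown:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def add_band_noise(speech_band, fs, lo, hi, snr_db, seed=0):
    """Corrupt one critical band of speech with noise filtered into the
    same band and scaled to a target in-band SNR."""
    rng = np.random.default_rng(seed)
    sos = butter(4, [lo, hi], btype='bandpass', fs=fs, output='sos')
    noise = sosfiltfilt(sos, rng.standard_normal(len(speech_band)))
    gain = np.sqrt(np.mean(speech_band ** 2) /
                   (np.mean(noise ** 2) * 10 ** (snr_db / 10)))
    return speech_band + gain * noise
```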


Journal of Speech Language and Hearing Research | 2018

The Effect of Remote Masking on the Reception of Speech by Young School-Age Children.

Carla L. Youngdahl; Eric W. Healy; Sarah E. Yoho; Frédéric Apoux; Rachael Frush Holt

Purpose: Psychoacoustic data indicate that infants and children are less likely than adults to focus on a spectral region containing an anticipated signal and are more susceptible to remote masking of a signal. These detection tasks suggest that infants and children, unlike adults, do not listen selectively. However, less is known about children's ability to listen selectively during speech recognition. Accordingly, the current study examines remote masking during speech recognition in children and adults. Method: Adults and 7- and 5-year-old children performed sentence recognition in the presence of various spectrally remote maskers. Intelligibility was determined for each remote-masker condition, and performance was compared across age groups. Results: It was found that speech recognition for 5-year-olds was reduced in the presence of spectrally remote noise, whereas the maskers had no effect on the 7-year-olds or adults. Maskers of different bandwidth and remoteness had similar effects. Conclusions: In accord with psychoacoustic data, young children do not appear to focus on a spectral region of interest and ignore other regions during speech recognition. This tendency may help account for their typically poorer speech perception in noise. This study also appears to capture an important developmental stage, during which a substantial refinement in spectral listening occurs.


Journal of the Acoustical Society of America | 2017

How susceptibility to noise varies across speech frequencies

Sarah E. Yoho; Eric W. Healy; Frédéric Apoux

It has long been assumed that the corrupting influence of noise on speech is uniform across all frequencies, and that the contribution of each speech frequency decreases at the same rate as noise is added. This assumption is evident in numerous previous works, and is seen clearly in the Speech Intelligibility Index, where the contribution of each speech band is scaled in the same way according to the amount of noise present. Here, it is argued that susceptibility to noise may differ across speech bands. To test this hypothesis, the compound method [Apoux and Healy (2012) J. Acoust. Soc. Am. 132, 1078-1087] was modified to evaluate the noise susceptibility of individual critical bands of speech. Noise was added to each target speech band and the signal-to-noise ratio (SNR) required to reduce the contribution of that band by half was estimated. It was found that noise susceptibility varies greatly across speech bands, and that the SNR required to similarly affect each band differed by as much as 12 dB. Interestingly, no obvious systematic relationship appeared to exist between band importance and noise susceptibility.

