
Publication


Featured research published by Robert V. Shannon.


Science | 1995

Speech Recognition with Primarily Temporal Cues

Robert V. Shannon; Fan-Gang Zeng; Vivek Kamath; John Wygonski; Michael Ekelid

Nearly perfect speech recognition was observed under conditions of greatly reduced spectral information. Temporal envelopes of speech were extracted from broad frequency bands and were used to modulate noises of the same bandwidths. This manipulation preserved temporal envelope cues in each band but restricted the listener to severely degraded information on the distribution of spectral energy. The identification of consonants, vowels, and words in simple sentences improved markedly as the number of bands increased; high speech recognition performance was obtained with only three bands of modulated noise. Thus, the presentation of a dynamic temporal pattern in only a few broad spectral regions is sufficient for the recognition of speech.
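The processing chain described above (bandpass analysis, envelope extraction, modulation of band-limited noise) is the classic noise-band vocoder. The sketch below is a minimal illustration of that idea, not the authors' original implementation; band edges and the envelope cutoff are hypothetical parameters.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def noise_vocoder(speech, fs, band_edges, env_cutoff=160.0, seed=0):
    """Discard within-band spectral detail, keeping only the temporal
    envelope of each analysis band to modulate same-band noise."""
    rng = np.random.default_rng(seed)
    out = np.zeros_like(speech, dtype=float)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        band_sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfilt(band_sos, speech)
        # Envelope extraction: half-wave rectification + low-pass filtering
        env_sos = butter(2, env_cutoff, btype="low", fs=fs, output="sos")
        env = sosfilt(env_sos, np.maximum(band, 0.0))
        # Modulate noise restricted to the same frequency band
        noise = sosfilt(band_sos, rng.standard_normal(len(speech)))
        out += env * noise
    return out

# Placeholder signal standing in for a speech recording
fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 440 * t)
processed = noise_vocoder(speech, fs, band_edges=[100, 800, 1500, 4000])
```

With three band edges pairs (four edges, three bands) this mirrors the paper's finding that as few as three modulated-noise bands can support high speech recognition.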


Journal of the Acoustical Society of America | 2001

Speech recognition in noise as a function of the number of spectral channels: Comparison of acoustic hearing and cochlear implants

Lendra M. Friesen; Robert V. Shannon; Deniz Başkent; Xiaosong Wang

Speech recognition was measured as a function of spectral resolution (number of spectral channels) and speech-to-noise ratio in normal-hearing (NH) and cochlear-implant (CI) listeners. Vowel, consonant, word, and sentence recognition were measured in five normal-hearing listeners, ten listeners with the Nucleus-22 cochlear implant, and nine listeners with the Advanced Bionics Clarion cochlear implant. Recognition was measured as a function of the number of spectral channels (noise bands or electrodes) at signal-to-noise ratios of +15, +10, +5, and 0 dB, and in quiet. Performance with three different speech processing strategies (SPEAK, CIS, and SAS) was similar across all conditions, and improved as the number of electrodes increased (up to seven or eight) for all conditions. For all noise levels, vowel and consonant recognition with the SPEAK speech processor did not improve with more than seven electrodes, while for normal-hearing listeners, performance continued to increase up to at least 20 channels. Speech recognition on more difficult speech materials (word and sentence recognition) showed a marginally significant increase in Nucleus-22 listeners from seven to ten electrodes. The average implant score on all processing strategies was poorer than scores of NH listeners with similar processing. However, the best CI scores were similar to the normal-hearing scores for that condition (up to seven channels). CI listeners with the highest performance level increased in performance as the number of electrodes increased up to seven, while CI listeners with low levels of speech recognition did not increase in performance as the number of electrodes was increased beyond four. These results quantify the effect of number of spectral channels on speech recognition in noise and demonstrate that most CI subjects are not able to fully utilize the spectral information provided by the number of electrodes used in their implant.


Journal of the Acoustical Society of America | 1998

Effects of noise and spectral resolution on vowel and consonant recognition: Acoustic and electric hearing

Qian-Jie Fu; Robert V. Shannon; Xiaosong Wang

Current multichannel cochlear implant devices provide high levels of speech performance in quiet. However, performance deteriorates rapidly with increasing levels of background noise. The goal of this study was to investigate whether the noise susceptibility of cochlear implant users is primarily due to the loss of fine spectral information. Recognition of vowels and consonants was measured as a function of signal-to-noise ratio in four normal-hearing listeners in conditions simulating cochlear implants with both CIS and SPEAK-like strategies. Six conditions were evaluated: 3-, 4-, 8-, and 16-band processors (CIS-like), a 6/20 band processor (SPEAK-like), and unprocessed speech. Recognition scores for vowels and consonants decreased as the S/N level worsened in all conditions, as expected. Phoneme recognition threshold (PRT) was defined as the S/N at which the recognition score fell to 50% of its level in quiet. The unprocessed speech had the best PRT, which worsened as the number of bands decreased. Recognition of vowels and consonants was further measured in three Nucleus-22 cochlear implant users using either their normal SPEAK speech processor or a custom processor with a four-channel CIS strategy. The best cochlear implant user showed similar performance with the CIS strategy in quiet and in noise to that of normal-hearing listeners when listening to correspondingly spectrally degraded speech. These findings suggest that the noise susceptibility of cochlear implant users is at least partly due to the loss of spectral resolution. Efforts to improve the effective number of spectral information channels should improve implant performance in noise.
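The phoneme recognition threshold (PRT) defined above is the S/N at which the score falls to 50% of its value in quiet. A simple way to estimate it from measured data is linear interpolation on the psychometric function; the scores below are hypothetical, for illustration only.

```python
import numpy as np

def phoneme_recognition_threshold(snr_db, scores, quiet_score):
    """S/N at which the recognition score falls to 50% of its quiet value,
    found by linear interpolation between measured points."""
    target = 0.5 * quiet_score
    snr = np.asarray(snr_db, dtype=float)
    sc = np.asarray(scores, dtype=float)
    order = np.argsort(snr)
    # np.interp requires scores increasing with S/N, as they are here
    return float(np.interp(target, sc[order], snr[order]))

# Hypothetical percent-correct scores at each S/N; quiet score = 80%
snrs = [-5, 0, 5, 10, 15]
scores = [10, 30, 55, 70, 78]
prt = phoneme_recognition_threshold(snrs, scores, quiet_score=80.0)  # 2.0 dB
```

A lower (more negative) PRT indicates better noise tolerance, so the unprocessed-speech condition in the study yields the lowest PRT and the coarsest processors the highest.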


Journal of the Acoustical Society of America | 2000

Speech recognition with reduced spectral cues as a function of age

Laurie S. Eisenberg; Robert V. Shannon; Amy S. Martinez; John Wygonski; Arthur Boothroyd

Adult listeners are able to recognize speech even under conditions of severe spectral degradation. To assess the developmental time course of this robust pattern recognition, speech recognition was measured in two groups of children (5-7 and 10-12 years of age) as a function of the degree of spectral resolution. Results were compared to recognition performance of adults listening to the same materials and conditions. The spectral detail was systematically manipulated using a noise-band vocoder in which filtered noise bands were modulated by the amplitude envelope from the same spectral bands in speech. Performance scores between adults and older children did not differ statistically, whereas scores by younger children were significantly lower; they required more spectral resolution to perform at the same level as adults and older children. Part of the deficit in younger children was due to their inability to utilize fully the sensory information, and part was due to their incomplete linguistic/cognitive development. The fact that young children cannot recognize spectrally degraded speech as well as adults suggests that a long learning period is required for robust acoustic pattern recognition. These findings have implications for the application of auditory sensory devices for young children with early-onset hearing loss.


Journal of the Acoustical Society of America | 1995

Importance of tonal envelope cues in Chinese speech recognition

Qian-Jie Fu; Fan-Gang Zeng; Robert V. Shannon; Sigfrid D. Soli

Temporal waveform envelope cues provide significant information for English speech recognition, and, when combined with lip reading, could produce near‐perfect consonant identification performance [Van Tasell et al., 1152–1161 (1987)]. Tonal patterns are important for Chinese speech recognition and can be effectively conveyed by temporal envelope cues [D. H. Whalen and Y. Xu, Phonetica 49, 25–47 (1992)]. This study investigates whether tones can help Chinese‐speaking listeners use envelope cues more effectively than English listeners. The speech envelope was extracted from broad frequency bands and used to modulate a noise of the same bandwidth. Mandarin vowels, consonants, tones, and sentences were identified by ten native Chinese‐speaking listeners with 1, 2, 3, and 4 noise bands (or channels). The results showed that recognition of vowels, consonants and sentences increases dramatically with the number of channels, a pattern similar to that observed in English speech recognition. However, tones were co...


Journal of the Acoustical Society of America | 1996

Speech recognition with altered spectral distribution of envelope cues

Robert V. Shannon; Fan-Gang Zeng; John Wygonski

Recognition of consonants, vowels, and sentences was measured in conditions of reduced spectral resolution and distorted spectral distribution of temporal envelope cues. Speech materials were processed through four bandpass filters (analysis bands), half-wave rectified, and low-pass filtered to extract the temporal envelope from each band. The envelope from each speech band modulated a band-limited noise (carrier bands). Analysis and carrier bands were manipulated independently to alter the spectral distribution of envelope cues. Experiment I demonstrated that the location of the cutoff frequencies defining the bands was not a critical parameter for speech recognition, as long as the analysis and carrier bands were matched in frequency extent. Experiment II demonstrated a dramatic decrease in performance when the analysis and carrier bands did not match in frequency extent, which resulted in a warping of the spectral distribution of envelope cues. Experiment III demonstrated a large decrease in performance when the carrier bands were shifted in frequency, mimicking the basal position of electrodes in a cochlear implant. Experiment IV showed a relatively minor effect of the overlap in the noise carrier bands, simulating the overlap in neural populations responding to adjacent electrodes in a cochlear implant. Overall, these results show that, for four bands, the frequency alignment of the analysis bands and carrier bands is critical for good performance, while the exact frequency divisions and overlap in carrier bands are not as critical.


Journal of the Acoustical Society of America | 1992

Temporal modulation transfer functions in patients with cochlear implants

Robert V. Shannon

Thresholds for the detection of amplitude modulation were measured in cochlear implant patients as a function of modulation frequency. Three types of threshold measures were taken: detection of amplitude modulation, detection of low-frequency sinusoidal current waveforms, and detection of beats in two-tone complexes. The temporal modulation transfer function (TMTF), defined as the plot of modulation detection thresholds as a function of modulation frequency, shows low-pass filter characteristics with similar cutoff frequencies for all three tasks. The similarity of these three measures suggests a common temporal mechanism. While modulation detection differs somewhat in normal-hearing and implanted listeners, both exhibit the same general characteristics. The TMTFs are low pass with a cutoff frequency near 70 Hz for normal-hearing listeners and near 140 Hz for implanted listeners. Patients with cochlear implants can best detect temporal modulation at modulation frequencies below 300 Hz, and are most sensitive to 80- to 100-Hz modulation. At high carrier levels many implant patients could detect smaller modulation amplitudes than normal-hearing listeners, a finding that is consistent with the smaller intensity DLs for some implanted listeners at high levels. These results demonstrate that, while implant listeners cannot discriminate steady-state, high-frequency stimuli, speech information might be conveyed by the envelope of the high-frequency components of speech.
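TMTFs like those above are traced with sinusoidally amplitude-modulated (SAM) stimuli, varying the modulation index m at each modulation frequency until detection threshold is reached. A minimal sketch of such a stimulus (acoustic rather than electrical, with hypothetical parameters):

```python
import numpy as np

def sam_stimulus(fs, dur, carrier_hz, mod_hz, m):
    """Sinusoidally amplitude-modulated tone: envelope (1 + m*sin) applied
    to a carrier. m is the modulation index (0 = none, 1 = full)."""
    t = np.arange(int(fs * dur)) / fs
    envelope = 1.0 + m * np.sin(2 * np.pi * mod_hz * t)
    return envelope * np.sin(2 * np.pi * carrier_hz * t)

x = sam_stimulus(fs=16000, dur=0.5, carrier_hz=1000, mod_hz=100, m=0.25)
# Modulation depth is conventionally reported as 20*log10(m) dB
depth_db = 20 * np.log10(0.25)  # about -12 dB
```

Plotting threshold m (in dB) against modulation frequency gives the low-pass-shaped TMTF the abstract describes.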


Otolaryngology-Head and Neck Surgery | 1993

Auditory Brainstem Implant: I. Issues in Surgical Implantation

Derald E. Brackmann; William E. Hitselberger; Ralph A. Nelson; Jean K. Moore; Michael Waring; Franco Portillo; Robert V. Shannon; Fred F. Telischi

Most patients with neurofibromatosis type 2 (NF2) are totally deaf after removal of their bilateral acoustic neuromas. Twenty-five patients with neurofibromatosis type 2 have been implanted with a brainstem electrode during surgery to remove an acoustic neuroma. The electrode is positioned in the lateral recess of the fourth ventricle, adjacent to the cochlear nuclei. The present electrode consists of three platinum plates mounted on a Dacron mesh backing, a design that has been demonstrated to be biocompatible and positionally stable in an animal model. Correct electrode placement depends on accurate identification of anatomic landmarks from the translabyrinthine surgical approach and also on intrasurgical electrophysiologic monitoring. Some tumors and their removal can result in significant distortion of the brainstem and surrounding structures. Even in the absence of identifiable anatomic landmarks, electrode location can be adjusted during surgical placement to find the location that maximizes the auditory evoked response and minimizes activation of other monitored cranial nerves. Stimulation of the electrodes produces auditory sensations in most patients, with results similar to those of single-channel cochlear implants. A coordinated multidisciplinary team is essential for successful application of an auditory brainstem implant.


Acta Oto-laryngologica | 2004

The number of spectral channels required for speech recognition depends on the difficulty of the listening situation.

Robert V. Shannon; Qian-Jie Fu; John J. Galvin III

Cochlear implants provide a limited number of electrodes, each of which represents a channel of spectral information. Studies have shown that implant recipients are not receiving all of the information from the channels presented to their implant. The present paper provides a quantitative framework for evaluating how many spectral channels of information are necessary for speech recognition. Speech and melody recognition data from previous studies with cochlear implant simulations are compared as a function of the number of spectral channels of information. A quantitative model is applied to the results. Speech recognition performance increases as the number of spectral channels increases. A sigmoid function best describes this increase when plotted as a function of the log number of channels. As speech materials become more difficult, the function shifts to the right, indicating that more spectral channels of information are required. A model proposed by Plomp provides a single index to relate the difficulty of the task to the number of spectral channels needed for moderate recognition performance. In conclusion, simple sentence recognition in quiet can be achieved with only 3-4 channels of spectral information, while more complex materials can require 30 or more channels for an equivalent level of performance. The proposed model provides a single index that not only quantifies the number of functional channels in a cochlear implant, but also predicts the level of performance for different listening tasks.
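The paper's model, a sigmoid in the log number of channels that shifts rightward as materials get harder, can be sketched as a logistic function. The parameter values below are hypothetical, chosen only to illustrate the easy-versus-hard shift; they are not the fitted values from the paper.

```python
import numpy as np

def sigmoid_score(n_channels, n50, slope, max_score=100.0):
    """Logistic function of log(number of channels). n50 is the channel
    count giving half the maximum score; harder listening tasks shift
    n50 to the right (more channels needed)."""
    x = np.log(np.asarray(n_channels, dtype=float))
    return max_score / (1.0 + np.exp(-slope * (x - np.log(n50))))

channels = [1, 2, 4, 8, 16]
# Hypothetical parameters: easy sentences saturate with few channels,
# harder materials require many more
easy = sigmoid_score(channels, n50=2.0, slope=4.0)
hard = sigmoid_score(channels, n50=8.0, slope=4.0)
```

Reading off the channel count at a fixed criterion score for each curve gives a single difficulty index of the kind the paper attributes to Plomp's framework.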


Journal of the Acoustical Society of America | 1999

Recognition of spectrally degraded and frequency-shifted vowels in acoustic and electric hearing

Qian-Jie Fu; Robert V. Shannon

The present study measured the recognition of spectrally degraded and frequency-shifted vowels in both acoustic and electric hearing. Vowel stimuli were passed through 4, 8, or 16 bandpass filters and the temporal envelopes from each filter band were extracted by half-wave rectification and low-pass filtering. The temporal envelopes were used to modulate noise bands which were shifted in frequency relative to the corresponding analysis filters. This manipulation not only degraded the spectral information by discarding within-band spectral detail, but also shifted the tonotopic representation of spectral envelope information. Results from five normal-hearing subjects showed that vowel recognition was sensitive to both spectral resolution and frequency shifting. The effect of a frequency shift did not interact with spectral resolution, suggesting that spectral resolution and spectral shifting are orthogonal in terms of intelligibility. High vowel recognition scores were observed for as few as four bands. Regardless of the number of bands, no significant performance drop was observed for tonotopic shifts equivalent to 3 mm along the basilar membrane, that is, for frequency shifts of 40%-60%. Similar results were obtained from five cochlear implant listeners, when electrode locations were fixed and the spectral location of the analysis filters was shifted. Changes in recognition performance in electrical and acoustic hearing were similar in terms of the relative location of electrodes rather than the absolute location of electrodes, indicating that cochlear implant users may at least partly accommodate to the new patterns of speech sounds after long-term exposure to their normal speech processor.
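The equivalence of a 3-mm basilar-membrane shift and a 40%-60% frequency shift follows from the standard Greenwood (1990) place-to-frequency map for the human cochlea, sketched below. The 20-mm reference place is an illustrative choice, not a value from the abstract.

```python
def greenwood_freq(x_mm, A=165.4, a=0.06, k=1.0):
    """Greenwood place-frequency function for the human cochlea:
    x_mm is distance from the apex in mm, result is frequency in Hz."""
    return A * (10 ** (a * x_mm) - k)

# A 3-mm basal shift from a mid-cochlear place (20 mm from the apex)
f0 = greenwood_freq(20.0)
f1 = greenwood_freq(23.0)
shift = (f1 - f0) / f0  # fractional frequency shift, roughly 0.55
```

For places well above the apex the shift approaches a constant factor of 10^(0.06*3) ≈ 1.51, i.e. about a 50% frequency increase, consistent with the 40%-60% range the study reports.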

Collaboration


Dive into Robert V. Shannon's collaborations.

Top Co-Authors

Qian-Jie Fu
University of California

Derald E. Brackmann
University of Southern California

Fan-Gang Zeng
University of California

Deniz Başkent
University Medical Center Groningen

John J. Galvin
University of California