
Publications


Featured research published by Odette Scharenborg.


Cognitive Science | 2005

How Should a Speech Recognizer Work?

Odette Scharenborg; Dennis Norris; Louis ten Bosch; James M. McQueen

Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication follows largely from the fact that research in these related fields has focused on the mechanics of how speech can be recognized. In Marr's (1982) terms, emphasis has been on the algorithmic and implementational levels rather than on the computational level. In this article, we provide a computational-level analysis of the task of speech recognition, which reveals the close parallels between research concerned with HSR and ASR. We illustrate this relation by presenting a new computational model of human spoken-word recognition, built using techniques from the field of ASR, that, in contrast to existing models of HSR, recognizes words from real speech input.


Journal of Phonetics | 2011

Acoustic reduction in conversational Dutch: A quantitative analysis based on automatically generated segmental transcriptions

Barbara Schuppler; Mirjam Ernestus; Odette Scharenborg; Lou Boves

In spontaneous, conversational speech, words are often reduced compared to their citation forms, such that a word like yesterday may sound like [ˈjeʃeɪ]. The present paper investigates such acoustic reduction. The study of reduction needs large corpora that are transcribed phonetically. The first part of this paper describes an automatic transcription procedure used to obtain such a large phonetically transcribed corpus of Dutch spontaneous dialogues, which is subsequently used for the investigation of acoustic reduction. First, the orthographic transcriptions were adapted for automatic processing. Next, the phonetic transcription of the corpus was created by means of a forced alignment using a lexicon with multiple pronunciation variants per word. These variants were generated by applying phonological and reduction rules to the canonical phonetic transcriptions of the words. The second part of this paper reports the results of a quantitative analysis of reduction in the corpus on the basis of the generated transcriptions and gives an inventory of segmental reductions in standard Dutch. Overall, we found that reduction is more pervasive in spontaneous Dutch than previously documented.
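
The variant-generation step described above can be illustrated with a minimal, hypothetical Python sketch: optional reduction rules (here just schwa deletion and word-final /t/ deletion, invented for illustration) are applied in all combinations to a canonical transcription to expand the lexicon used for forced alignment. The actual study used a much richer set of phonological and reduction rules for Dutch.

# Hypothetical sketch: expand a canonical transcription into pronunciation
# variants by applying optional reduction rules, as a forced-alignment
# lexicon requires. Rules and the example word are invented for illustration.
from itertools import combinations

def schwa_deletion(phones):
    # Optionally delete schwas.
    return [p for p in phones if p != "@"]

def final_t_deletion(phones):
    # Optionally drop a word-final /t/.
    return phones[:-1] if phones and phones[-1] == "t" else phones

RULES = [schwa_deletion, final_t_deletion]

def variants(canonical):
    # Apply every subset of the optional rules to the canonical form.
    forms = set()
    for r in range(len(RULES) + 1):
        for subset in combinations(RULES, r):
            phones = list(canonical)
            for rule in subset:
                phones = rule(phones)
            forms.add(tuple(phones))
    return sorted(forms)

# Dutch "gebeurt" in a toy SAMPA-like notation: canonical /x @ b 2 r t/
for v in variants(["x", "@", "b", "2", "r", "t"]):
    print(" ".join(v))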


Journal of the Acoustical Society of America | 2010

Unsupervised speech segmentation: An analysis of the hypothesized phone boundaries

Odette Scharenborg; Vincent Wan; Mirjam Ernestus

Despite using different algorithms, most unsupervised automatic phone segmentation methods achieve similar performance in terms of percentage correct boundary detection. Nevertheless, unsupervised segmentation algorithms are not able to perfectly reproduce manually obtained reference transcriptions. This paper investigates fundamental problems for unsupervised segmentation algorithms by comparing a phone segmentation obtained using only the acoustic information present in the signal with a reference segmentation created by human transcribers. The analyses of the output of an unsupervised speech segmentation method that uses acoustic change to hypothesize boundaries showed that acoustic change is a fairly good indicator of segment boundaries: over two-thirds of the hypothesized boundaries coincide with segment boundaries. Statistical analyses showed that the errors are related to segment duration, sequences of similar segments, and inherently dynamic phones. In order to improve unsupervised automatic speech segmentation, current one-stage bottom-up segmentation methods should be expanded into two-stage segmentation methods that are able to use a mix of bottom-up information extracted from the speech signal and automatically derived top-down information. In this way, unsupervised methods can be improved while remaining flexible and language-independent.
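
The bottom-up approach analyzed above, hypothesizing a boundary wherever frame-to-frame acoustic change peaks and scoring the hypotheses against a reference segmentation within a small tolerance, can be sketched in a few lines. This is a hedged illustration rather than the paper's algorithm; the features, threshold, and 20 ms tolerance below are placeholders.

# Sketch: hypothesize boundaries at local peaks of acoustic change and
# score them against a reference segmentation. Inputs are placeholders.
import numpy as np

def hypothesize_boundaries(features, threshold=5.0):
    # features: (n_frames, n_dims) array, e.g. MFCCs at a 10 ms frame shift.
    # A boundary is hypothesized at local peaks of frame-to-frame change.
    change = np.linalg.norm(np.diff(features, axis=0), axis=1)
    peaks = [i for i in range(1, len(change) - 1)
             if change[i] > threshold
             and change[i] >= change[i - 1]
             and change[i] >= change[i + 1]]
    return np.array(peaks)

def hit_rate(hypothesized, reference, tolerance=2):
    # Fraction of reference boundaries matched by some hypothesis within
    # `tolerance` frames (2 frames = 20 ms at a 10 ms shift).
    hits = sum(np.any(np.abs(hypothesized - r) <= tolerance) for r in reference)
    return hits / len(reference)

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 13))            # placeholder "features"
hyp = hypothesize_boundaries(feats)
print(hit_rate(hyp, reference=np.array([20, 55, 90, 130, 170])))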


Cognition | 2013

Phonological abstraction without phonemes in speech perception.

Holger Mitterer; Odette Scharenborg; James M. McQueen

Recent evidence shows that listeners use abstract prelexical units in speech perception. Using the phenomenon of lexical retuning in speech processing, we ask whether those units are necessarily phonemic. Dutch listeners were exposed to a Dutch speaker producing ambiguous phones between the Dutch syllable-final allophones approximant [r] and dark [l]. These ambiguous phones replaced either final /r/ or final /l/ in words in a lexical-decision task. This differential exposure affected perception of ambiguous stimuli on the same allophone continuum in a subsequent phonetic-categorization test: Listeners exposed to ambiguous phones in /r/-final words were more likely to perceive test stimuli as /r/ than listeners with exposure in /l/-final words. This effect was not found for test stimuli on continua using other allophones of /r/ and /l/. These results confirm that listeners use phonological abstraction in speech perception. They also show that context-sensitive allophones can play a role in this process, and hence that context-insensitive phonemes are not necessary. We suggest there may be no one unit of perception.


Wiley Interdisciplinary Reviews: Cognitive Science | 2012

Models of spoken-word recognition

Andrea Weber; Odette Scharenborg

All words of the languages we know are stored in the mental lexicon. Psycholinguistic models describe in which format lexical knowledge is stored and how it is accessed when needed for language use. The present article summarizes key findings in spoken-word recognition by humans and describes how models of spoken-word recognition account for them. Although current models of spoken-word recognition differ considerably in the details of implementation, there is general consensus among them on at least three aspects: multiple word candidates are activated in parallel as a word is being heard, activation of word candidates varies with the degree of match between the speech signal and stored lexical representations, and activated candidate words compete for recognition. No consensus has been reached on other aspects, such as the flow of information between different processing levels and the format of stored prelexical and lexical representations. WIREs Cogn Sci 2012, 3:387-401. doi: 10.1002/wcs.1178
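
The three points of consensus can be made concrete with a toy activation-competition loop in the spirit of models such as TRACE or Shortlist. This is a minimal sketch of the shared ideas, not any particular published model; the lexicon, match scores, and constants are all invented.

# Toy activation-competition dynamics illustrating the three consensus points:
# parallel activation, graded bottom-up match, and lexical competition.
import numpy as np

words = ["captain", "captive", "cap", "cat"]
# Invented graded match of each candidate to a partial input like "capti..."
match = np.array([0.9, 0.85, 0.6, 0.2])

activation = np.zeros(len(words))
for step in range(20):
    excitation = match                                   # bottom-up support
    inhibition = 0.3 * (activation.sum() - activation)   # lateral competition
    activation = np.clip(
        activation + 0.1 * (excitation - inhibition - activation), 0.0, 1.0)

# All candidates are active in parallel; the best-matching ones dominate.
for w, a in sorted(zip(words, activation), key=lambda x: -x[1]):
    print(f"{w:8s} {a:.2f}")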


Speech Communication | 2010

Native and non-native listeners' perception of English consonants in different types of noise

Mirjam Broersma; Odette Scharenborg

This paper shows that the effect of different types of noise on recognition of different phonemes by native versus non-native listeners is highly variable, even within classes of phonemes with the same manner or place of articulation. In a phoneme identification experiment, English and Dutch listeners heard all 24 English consonants in VCV stimuli in quiet and in three types of noise: competing talker, speech-shaped noise, and modulated speech-shaped noise (all with SNRs of -6 dB). Differential effects of noise type for English and Dutch listeners were found for eight consonants (/p t k g m n ŋ r/) but not for the other 16 consonants. For those eight consonants, effects were again highly variable: each noise type hindered non-native listeners more than native listeners for some of the target sounds, but none of the noise types did so for all of the target sounds, not even for phonemes with the same manner or place of articulation. The results imply that the noise types employed will strongly affect the outcomes of any study of native and non-native speech perception in noise.
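
Mixing speech and noise at a fixed SNR, as in the -6 dB conditions above, comes down to rescaling the noise so that the speech-to-noise power ratio hits the target. A minimal sketch with placeholder signals (the random arrays stand in for real audio samples):

# Scale a noise signal so that speech + noise has a target SNR (here -6 dB).
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    # Required noise power is p_speech / 10**(snr_db / 10).
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

rng = np.random.default_rng(1)
speech = rng.normal(size=16000)   # 1 s of placeholder "speech" at 16 kHz
noise = rng.normal(size=16000)
mixed = mix_at_snr(speech, noise, snr_db=-6.0)
# Verify the realized SNR (should print approximately -6.0)
print(10 * np.log10(np.mean(speech**2) / np.mean((mixed - speech)**2)))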


Speech Communication | 2010

Language-independent processing in speech perception: Identification of English intervocalic consonants by speakers of eight European languages

Martin Cooke; Maria Luisa Garcia Lecumberri; Odette Scharenborg; Wim A. van Dommelen

Processing speech in a non-native language requires listeners to cope with influences from their first language and to overcome the effects of limited exposure and experience. These factors may be particularly important when listening in adverse conditions. However, native listeners also suffer in noise, and the intelligibility of speech in noise clearly depends on factors which are independent of a listener's first language. The current study explored the issue of language-independence by comparing the responses of eight listener groups differing in native language when confronted with the task of identifying English intervocalic consonants in three masker backgrounds, viz. stationary speech-shaped noise, temporally-modulated speech-shaped noise, and competing English speech. The study analysed the effects of (i) noise type, (ii) speaker, (iii) vowel context, (iv) consonant, (v) phonetic feature classes, (vi) stress position, (vii) gender, and (viii) stimulus onset relative to noise onset. A significant degree of similarity in the response to many of these factors was evident across all eight language groups, suggesting that acoustic and auditory considerations play a large role in determining intelligibility. Language-specific influences were observed in the rankings of individual consonants and in the masking effect of competing speech relative to speech-modulated noise.
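
One way to quantify the cross-group similarity in consonant rankings mentioned above is a rank correlation between two groups' per-consonant identification scores. A minimal sketch with invented accuracy values; scipy's spearmanr is a standard choice for this, though the paper does not specify its exact analysis tooling:

# Compare two listener groups' consonant identification rankings with a
# Spearman rank correlation. All accuracy values below are invented.
from scipy.stats import spearmanr

consonants = ["p", "t", "k", "b", "d", "g", "m", "n", "s", "z"]
group_a = [0.92, 0.88, 0.85, 0.80, 0.78, 0.70, 0.95, 0.93, 0.90, 0.60]
group_b = [0.85, 0.82, 0.80, 0.70, 0.66, 0.55, 0.90, 0.89, 0.84, 0.45]

rho, p = spearmanr(group_a, group_b)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")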


Journal of the Acoustical Society of America | 2003

Bridging automatic speech recognition and psycholinguistics: Extending Shortlist to an end-to-end model of human speech recognition (L)

Odette Scharenborg; L.F.M. ten Bosch; L.W.J. Boves; Dennis Norris

This letter evaluates potential benefits of combining human speech recognition (HSR) and automatic speech recognition by building a joint model of an automatic phone recognizer (APR) and a computational model of HSR, viz., Shortlist [Norris, Cognition 52, 189-234 (1994)]. Experiments based on "real-life" speech highlight critical limitations posed by some of the simplifying assumptions made in models of human speech recognition. These limitations could be overcome by avoiding hard phone decisions at the output side of the APR, and by using a match between the input and the internal lexicon that flexibly copes with deviations from canonical phonemic representations.
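
The point about avoiding hard phone decisions can be illustrated with a small sketch: instead of matching a single best-phone string against lexicon entries, lexical candidates are scored directly against soft phone posteriors. This is a hypothetical illustration, not the Shortlist/APR implementation; the posteriors, phone inventory, and toy lexicon are invented.

# Score lexical candidates against soft phone posteriors instead of a hard
# best-phone string, so ambiguous segments keep multiple candidates alive.
import numpy as np

phones = ["k", "ae", "t", "d"]                  # toy phone inventory
# Posterior probability of each phone for three input segments (rows sum to 1)
posteriors = np.array([[0.70, 0.10, 0.10, 0.10],   # sounds most like /k/
                       [0.10, 0.80, 0.05, 0.05],   # most like /ae/
                       [0.10, 0.10, 0.45, 0.35]])  # ambiguous /t/ vs /d/

lexicon = {"cat": ["k", "ae", "t"], "cad": ["k", "ae", "d"]}

def score(word_phones):
    # Sum of log posteriors of the word's canonical phones, segment by segment.
    idx = [phones.index(p) for p in word_phones]
    return sum(np.log(posteriors[i, j]) for i, j in enumerate(idx))

for word, pron in lexicon.items():
    print(word, round(score(pron), 2))
# A hard /t/-vs-/d/ decision would eliminate one candidate outright; the
# soft match keeps both alive with graded scores.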


Attention Perception & Psychophysics | 2013

Comparing lexically guided perceptual learning in younger and older listeners

Odette Scharenborg; Esther Janse

Numerous studies have shown that younger adults engage in lexically guided perceptual learning in speech perception. Here, we investigated whether older listeners are also able to retune their phonetic category boundaries. More specifically, in this research we tried to answer two questions. First, do older adults show perceptual-learning effects of similar size to those of younger adults? Second, do differences in lexical behavior predict the strength of the perceptual-learning effect? An age group comparison revealed that older listeners do engage in lexically guided perceptual learning, but there were two age-related differences: Younger listeners had a stronger learning effect right after exposure than did older listeners, but the effect was more stable for older than for younger listeners. Moreover, a clear link was shown to exist between individuals’ lexical-decision performance during exposure and the magnitude of their perceptual-learning effects. A subsequent analysis on the results of the older participants revealed that, even within the older participant group, with increasing age the perceptual retuning effect became smaller but also more stable, mirroring the age group comparison results. These results could not be explained by differences in hearing loss. The age effect may be accounted for by decreased flexibility in the adjustment of phoneme categories or by age-related changes in the dynamics of spoken-word recognition, with older adults being more affected by competition from similar-sounding lexical competitors, resulting in less lexical guidance for perceptual retuning. In conclusion, our results clearly show that the speech perception system remains flexible over the life span.


Attention Perception & Psychophysics | 2015

The role of attentional abilities in lexically-guided perceptual learning by older listeners

Odette Scharenborg; Andrea Weber; Esther Janse

This study investigates two variables that may modify lexically guided perceptual learning: individual hearing sensitivity and attentional abilities. Older Dutch listeners (aged 60+ years, varying from good hearing to mild-to-moderate high-frequency hearing loss) were tested on a lexically guided perceptual learning task using the contrast [f]-[s]. This contrast mainly differentiates between the two consonants in the higher frequencies, and thus is supposedly challenging for listeners with hearing loss. The analyses showed that older listeners generally engage in lexically guided perceptual learning. Hearing loss and selective attention did not modify perceptual learning in our participant sample, while attention-switching control did: listeners with poorer attention-switching control showed a stronger perceptual learning effect. We postulate that listeners with better attention-switching control may, in general, rely more strongly on bottom-up acoustic information compared to listeners with poorer attention-switching control, making them in turn less susceptible to lexically guided perceptual learning. Our results, moreover, clearly show that lexically guided perceptual learning is not lost when acoustic processing is less accurate.

Collaboration


Dive into Odette Scharenborg's collaborations.

Top Co-Authors

Lou Boves (Radboud University Nijmegen)
Esther Janse (Radboud University Nijmegen)
Louis ten Bosch (Radboud University Nijmegen)
L.W.J. Boves (Radboud University Nijmegen)
Polina Drozdova (Radboud University Nijmegen)
Andrea Weber (University of Tübingen)
Martin Cooke (University of the Basque Country)