Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Roel Smits is active.

Publication


Featured researches published by Roel Smits.


Journal of the Acoustical Society of America | 2004

Patterns of English phoneme confusions by native and non-native listeners

Anne Cutler; Andrea Weber; Roel Smits; Nicole Cooper

Native American English and non-native (Dutch) listeners identified either the consonant or the vowel in all possible American English CV and VC syllables. The syllables were embedded in multispeaker babble at three signal-to-noise ratios (0, 8, and 16 dB). The phoneme identification performance of the non-native listeners was less accurate than that of the native listeners. All listeners were adversely affected by noise. With these isolated syllables, initial segments were harder to identify than final segments. Crucially, the effects of language background and noise did not interact; the performance asymmetry between the native and non-native groups was not significantly different across signal-to-noise ratios. It is concluded that the frequently reported disproportionate difficulty of non-native listening under disadvantageous conditions is not due to a disproportionate increase in phoneme misidentifications.


Journal of the Acoustical Society of America | 2004

An acoustic description of the vowels of Northern and Southern Standard Dutch

Patti Adank; Roeland van Hout; Roel Smits

A database is presented of measurements of the fundamental frequency, the frequencies of the first three formants, and the duration of the 15 vowels of Standard Dutch as spoken in the Netherlands (Northern Standard Dutch) and in Belgium (Southern Standard Dutch). The speech material consisted of read monosyllabic utterances in a neutral consonantal context (i.e., /sVs/). Recordings were made for 20 female talkers and 20 male talkers, who were stratified for the factors age, gender, and region. Of the 40 talkers, 20 spoke Northern Standard Dutch and 20 spoke Southern Standard Dutch. The results indicated that the nine monophthongal Dutch vowels /a [see symbol in text] epsilon i I [see symbol in text] u y Y/ can be separated fairly well given their steady-state characteristics, while the long mid vowels /e o ø/ and three diphthongal vowels /epsilon I [see symbol in text]u oey/ also require information about their dynamic characteristics. The analysis of the formant values indicated that Northern Standard Dutch and Southern Standard Dutch differ little in the formant frequencies at steady-state for the nine monophthongal vowels. Larger differences between these two language varieties were found for the dynamic specifications of the three long mid vowels, and, to a lesser extent, of the three diphthongal vowels.


Journal of Phonetics | 2004

Acoustical and perceptual analysis of the voicing distinction in Dutch initial plosives: the role of prevoicing

Petra M. Van Alphen; Roel Smits

Abstract Three experiments investigated the voicing distinction in Dutch initial labial and alveolar plosives. The difference between voiced and voiceless Dutch plosives is generally described in terms of the presence or absence of prevoicing (negative voice onset time). Experiment 1 showed, however, that prevoicing was absent in 25% of voiced plosive productions across 10 speakers. The production of prevoicing was influenced by place of articulation of the plosive, by whether the plosive occurred in a consonant cluster or not, and by speaker sex. Experiment 2 was a detailed acoustic analysis of the voicing distinction, which identified several acoustic correlates of voicing. Prevoicing appeared to be by far the best predictor. Perceptual classification data revealed that prevoicing was indeed the strongest cue that listeners use when classifying plosives as voiced or voiceless. In the cases where prevoicing was absent, other acoustic cues influenced classification, such that some of these tokens were still perceived as being voiced. These secondary cues were different for the two places of articulation. We discuss the paradox raised by these findings: although prevoicing is the most reliable cue to the voicing distinction for listeners, it is not reliably produced by speakers.


Speech Communication | 2008

Supervised and unsupervised learning of multidimensionally varying non-native speech categories

Martijn Goudbeek; Anne Cutler; Roel Smits

The acquisition of novel phonetic categories is hypothesized to be affected by the distributional properties of the input, the relation of the new categories to the native phonology, and the availability of supervision (feedback). These factors were examined in four experiments in which listeners were presented with novel categories based on vowels of Dutch. Distribution was varied such that the categorization depended on the single dimension duration, the single dimension frequency, or both dimensions at once. Listeners were clearly sensitive to the distributional information, but unidimensional contrasts proved easier to learn than multidimensional. The native phonology was varied by comparing Spanish versus American English listeners. Spanish listeners found categorization by frequency easier than categorization by duration, but this was not true of American listeners, whose native vowel system makes more use of duration-based distinctions. Finally, feedback was either available or not; this comparison showed supervised learning to be significantly superior to unsupervised learning.


Journal of the Acoustical Society of America | 2003

Unfolding of phonetic information over time: A database of Dutch diphone perception

Roel Smits; Natasha Warner; James M. McQueen; Anne Cutler

We present the results of a large-scale study on speech perception, assessing the number and type of perceptual hypotheses which listeners entertain about possible phoneme sequences in their language. Dutch listeners were asked to identify gated fragments of all 1179 diphones of Dutch, providing a total of 488,520 phoneme categorizations. The results manifest orderly uptake of acoustic information in the signal. Differences across phonemes in the rate at which fully correct recognition was achieved arose as a result of whether or not potential confusions could occur with other phonemes of the language (long with short vowels, affricates with their initial components, etc.). These data can be used to improve models of how acoustic-phonetic information is mapped onto the mental lexicon during speech comprehension.


Journal of Experimental Psychology: Human Perception and Performance | 2006

Categorization of sounds

Roel Smits; Joan A. Sereno; Allard Jongman

The authors conducted 4 experiments to test the decision-bound, prototype, and distribution theories for the categorization of sounds. They used as stimuli sounds varying in either resonance frequency or duration. They created different experimental conditions by varying the variance and overlap of 2 stimulus distributions used in a training phase and varying the size of the stimulus continuum used in the subsequent test phase. When resonance frequency was the stimulus dimension, the pattern of categorization-function slopes was in accordance with the decision-bound theory. When duration was the stimulus dimension, however, the slope pattern gave partial support for the decision-bound and distribution theories. The authors introduce a new categorization model combining aspects of decision-bound and distribution theories that gives a superior account of the slope patterns across the 2 stimulus dimensions.


Journal of Experimental Psychology: Human Perception and Performance | 2001

Evidence for hierarchical categorization of coarticulated phonemes

Roel Smits

The reported research investigates how listeners recognize coarticulated phonemes. First, 2 data sets from experiments on the recognition of coarticulated phonemes published by D. H. Whalen (1989) are reanalyzed. The analyses indicate that listeners used categorization strategies involving a hierarchical dependency. Two new experiments are reported investigating the production and perception of fricative-vowel syllables. On the basis of measurements of acoustic cues on a large set of natural utterances, it was predicted that listeners would use categorization strategies involving a dependency of the fricative categorization on the perceived vowel. The predictions were tested in a perception experiment using a 2-dimensional synthetic fricative-vowel continuum. Model analyses of the results pooled across listeners confirmed the predictions. Individual analyses revealed some variability in the categorization dependencies used by different participants.


Journal of Experimental Psychology: Human Perception and Performance | 2009

Supervised and Unsupervised Learning of Multidimensional Acoustic Categories

Martijn Goudbeek; Daniel Swingley; Roel Smits

Learning to recognize the contrasts of a language-specific phonemic repertoire can be viewed as forming categories in a multidimensional psychophysical space. Research on the learning of distributionally defined visual categories has shown that categories defined over 1 dimension are easy to learn and that learning multidimensional categories is more difficult but tractable under specific task conditions. In 2 experiments, adult participants learned either a unidimensional or a multidimensional category distinction with or without supervision (feedback) during learning. The unidimensional distinctions were readily learned and supervision proved beneficial, especially in maintaining category learning beyond the learning phase. Learning the multidimensional category distinction proved to be much more difficult and supervision was not nearly as beneficial as with unidimensionally defined categories. Maintaining a learned multidimensional category distinction was only possible when the distributional information that identified the categories remained present throughout the testing phase. We conclude that listeners are sensitive to both trial-by-trial feedback and the distributional information in the stimuli. Even given limited exposure, listeners learned to use 2 relevant dimensions, albeit with considerable difficulty.


Attention Perception & Psychophysics | 2001

Hierarchical categorization of coarticulated phonemes: A theoretical analysis

Roel Smits

This article is concerned with the question of how listeners recognize coarticulated phonemes. The problem is approached from a pattern classification perspective. First, the potential acoustical effects of coarticulation are defined in terms of the patterns that form the input to a classifier. Next, a categorization model called HICAT is introduced that incorporates hierarchical dependencies to optimally deal with this input. The model allows the position, orientation, and steepness of one phoneme boundary to depend on the perceived value of a neighboring phoneme. It is argued that, if listeners do behave like statistical pattern recognizers, they may use the categorization strategies incorporated in the model. The HICAT model is compared with existing categorization models, among which are the fuzzylogical model of perception and Nearey’s diphone-biased secondary-cue model. Finally, a method is presented by which categorization strategies that are likely to be used by listeners can be predicted from distributions of acoustical cues as they occur in natural speech.


Journal of Phonetics | 2000

Temporal distribution of information for human consonant recognition in VCV utterances

Roel Smits

Abstract The temporal distribution of perceptually relevant information for consonant recognition in British English VCVs is investigated. The information distribution in the vicinity of consonantal closure and release was measured by presenting initial and final portions, respectively, of naturally produced VCV utterances to listeners for categorization. A multidimensional scaling analysis of the results provided highly interpretable, four-dimensional geometrical representations of the confusion patterns in the categorization data. In addition, transmitted information as a function of truncation point was calculated for the features manner place and voicing. The effects of speaker, vowel context, stress, and distinctive feature on the resulting information distributions were tested statistically. It was found that, although all factors are significant, the location and spread of the distributions depends principally on the distinctive feature, i.e., the temporal distribution of perceptually relevant information is very different for the features manner, place, and voicing.

Collaboration


Dive into the Roel Smits's collaboration.

Top Co-Authors

Avatar
Top Co-Authors

Avatar

Patti Adank

University College London

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Daniel Swingley

University of Pennsylvania

View shared research outputs
Top Co-Authors

Avatar

Roeland van Hout

Radboud University Nijmegen

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Andrea Weber

University of Tübingen

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Researchain Logo
Decentralizing Knowledge