Publication


Featured research published by Santiago Barreda.


Journal of the Acoustical Society of America | 2012

The direct and indirect roles of fundamental frequency in vowel perception.

Santiago Barreda; Terrance M. Nearey

Several experiments have found that changing the intrinsic f0 of a vowel can have an effect on perceived vowel quality. It has been suggested that these shifts may occur because f0 is involved in the specification of vowel quality in the same way as the formant frequencies. Another possibility is that f0 affects vowel quality indirectly, by changing a listener's assumptions about characteristics of a speaker who is likely to have uttered the vowel. In the experiment outlined here, participants were asked to listen to vowels differing in terms of f0 and their formant frequencies and report vowel quality and the apparent speaker's gender and size on a trial-by-trial basis. The results presented here suggest that f0 affects vowel quality mainly indirectly via its effects on the apparent-speaker characteristics; however, f0 may also have some residual direct effects on vowel quality. Furthermore, the formant frequencies were also found to have significant indirect effects on vowel quality by way of their strong influence on the apparent speaker.


Journal of the Acoustical Society of America | 2012

Vowel normalization and the perception of speaker changes: An exploration of the contextual tuning hypothesis

Santiago Barreda

Many experiments have reported a perceptual advantage for vowels presented in blocked- versus mixed-voice conditions. Nusbaum and colleagues [Nusbaum and Morin (1992). In Speech Perception, Speech Production, and Linguistic Structure, edited by Y. Tohkura, Y. Sagisaka, and E. Vatikiotis-Bateson (OHM, Tokyo), pp. 113-134; Magnuson and Nusbaum (2007). J. Exp. Psychol. Hum. Percept. Perform. 33(2), 391-409] present results which suggest that the size of this advantage may be related to the facility with which listeners can detect speaker changes, so that combinations of less similar voices can result in better performance than combinations of more similar voices. To test this, a series of synthetic voices (differing in their source characteristics and/or formant-spaces) was used in a speeded-monitoring task. Vowels were presented in blocks made up of tokens from one or two synthetic voices. Results indicate that formant-space differences, in the absence of source differences between voices in a block, were unlikely to result in the perception of multiple voices, leading to lower accuracy and relatively faster reaction times. Source differences between voices in a block resulted in the perception of multiple voices, increased reaction times, and a decreased negative effect of formant-space differences between voices on identification accuracy. These results are consistent with a process in which the detection of speaker changes guides the appropriate or inappropriate use of extrinsic information in normalization.


Journal of Phonetics | 2016

Investigating the use of formant frequencies in listener judgments of speaker size

Santiago Barreda

The formant-pattern present in a given vowel sound will be determined by the vocal-tract length (VTL) of the speaker as well as by phoneme-specific information. Although human listeners tend to associate lower formant-frequencies with larger speakers, it is unclear whether they are responding to VTL information in speech sounds, or simply responding to the formant-pattern present in the sound. In this experiment listeners were presented with pairs of synthetic vowels from the set of (/i ae ʊ/), which could differ on the basis of simulated VTL and vowel category, within-pair. Listeners were divided into groups based on the number of formants contained by stimulus vowels (2, 3, 4, and 5-formant vowel groups). For each trial, listeners were asked to indicate which vowel sounded like it had been produced by a taller speaker. Results indicate that listeners do not rely solely on VTL cues when making speaker-size judgments, and that they exhibit biases towards selecting given phonemes as taller, even when contrary to the VTL differences between the voices. Furthermore, the higher formants (up to F5) are used by listeners when making speaker-size judgments, though not in a manner consistent with VTL-based speaker-size judgments.


Journal of the Acoustical Society of America | 2013

Training listeners to report the acoustic correlate of formant-frequency scaling using synthetic voices.

Santiago Barreda; Terrance M. Nearey

The vocal tract length of a speaker is the primary determinant of the range of formant frequencies (FFs) produced by that speaker. Listeners have demonstrated sensitivity to the average FFs produced by voices, for example, in estimating the relative heights of two speakers based on their speech. However, it is not known whether they can learn to identify voices based on the acoustic characteristic associated with the average FFs produced by a voice (this characteristic will be referred to as FF-scaling). To investigate this, a series of vowels corresponding to voices that differed in their average f0 and/or FF-scaling were synthesized. Listeners (n = 71) were trained to identify these voices using a training procedure where, for each trial, they heard the vowels representing a voice and then had to identify the stimulus voice from among a series of candidate voices that differed in terms of their FF-scaling and/or their f0. Results indicate that listeners can identify voices on the basis of FF-scaling quite accurately and consistently after only a short training session and that, although f0 weakly influences these estimates, they are most strongly determined by the stimulus FFs.


Pattern Recognition Letters | 2016

Bayes covariant multi-class classification

Ondrej Šuch; Santiago Barreda

Defining notion of Bayes covariance. Proving existence and uniqueness of Bayes covariant classifier of 3 categories. Explicit construction of a Bayes covariant classifier for any number of categories. A proof that previously considered methods are not Bayes covariant. Comparison of various methods for combining pairwise classifiers via MDS, and speech frame classification.

We consider multi-class classification models built from complete sets of pairwise binary classifiers. The Bradley-Terry model is often used to estimate posterior distributions in this setting. We introduce the notion of Bayes covariance, which holds if the multi-class classifier respects multiplicative group action on class priors. As a consequence, a Bayes covariant method yields the same result whether new priors are considered before or after combination of the individual classifiers, which has several practical advantages for systems with feedback. In the paper, we construct a Bayes covariant combining method and compare it with previously published methods both in Monte Carlo simulations and on a practical speech frame recognition task.


Journal of the Acoustical Society of America | 2018

Apparent-talker height is influenced by Mandarin lexical tone

Santiago Barreda; Zoey Y. Liu

Apparent-talker height is determined by a talker's fundamental frequency (f0) and spectral information, typically indexed using formant frequencies (FFs). Barreda [(2017b). J. Acoust. Soc. Am. 141, 4781-4792] reports that the apparent height of a talker can be influenced by vowel-specific variation in the f0 or FFs of a sound. In this experiment, native speakers of Mandarin were presented with a series of syllables produced by talkers of different apparent heights. Results indicate that there is substantial variability in the estimated height of a single talker based on lexical tone, as well as the inherent f0 and FFs of vowel phonemes.


Journal of Phonetics | 2017

Listeners respond to phoneme-specific spectral information when assessing speaker size from speech

Santiago Barreda

Spectral information in speech sounds varies as a function of linguistic content, as well as the vocal-tract length (VTL) of the speaker. It is usually considered that human listeners rely on VTL information when assessing apparent speaker-size. However, a recent experiment (Barreda, 2016) found that listeners respond to the specific spectral-content of speech sounds rather than simply responding to speaker VTL information. This results in biases towards identifying certain phonemes with larger speakers independently of VTL information. To investigate this, listeners were asked to judge relative speaker-size based on vowel pairs differing in vowel quality and/or apparent speaker VTL. Additionally, one group of listeners was asked to report relative-height differences, while another group was trained to report relative-VTL differences directly. Results indicate that both groups of listeners exhibited substantial biases towards associating certain phonemes with larger speakers. In addition, listeners showed substantial variation both in their sensitivity to specific acoustic cues, and in their general approach to speaker size estimation. For example, some listeners rely primarily on VTL cues while others rely heavily on phoneme-specific spectral information.


Respiratory Physiology & Neurobiology | 2015

Developmental nicotine exposure adversely effects respiratory patterning in the barbiturate anesthetized neonatal rat.

Santiago Barreda; Ian J. Kidder; Jordan A. Mudery; E. Fiona Bailey

Neonates at risk for sudden infant death syndrome (SIDS) are hospitalized for cardiorespiratory monitoring; however, monitoring is costly and generates large quantities of averaged data that serve as poor predictors of infant risk. In this study, we used a traditional autocorrelation function (ACF), testing its suitability as a tool to detect subtle alterations in respiratory patterning in vivo. We applied the ACF to chest wall motion tracings obtained from rat pups in the period corresponding to the mid-to-end of the third trimester of human pregnancy. Pups were drawn from two groups: nicotine-exposed and saline-exposed at each age (i.e., P7, P8, P9, and P10). Respiratory-related motions of the chest wall were recorded in room air and in response to an arousal stimulus (FIO2 14%). The autocorrelation function was used to determine measures of breathing rate and respiratory patterning. Unlike alternative tools such as Poincaré plots that depict an averaged difference in a measure breath to breath, the ACF when applied to a digitized chest wall trace yields an instantaneous sample of data points that can be used to compare (data) points at the same time in the next breath or in any subsequent number of breaths. The moment-to-moment evaluation of chest wall motion detected subtle differences in respiratory pattern between rat pups exposed to nicotine in utero and age-matched saline-exposed peers. The ACF can be applied online as well as to existing data sets and requires comparatively short sampling windows (∼2 min). As shown here, the ACF could be used to identify factors that precipitate or minimize instability and thus offers a quantitative measure of risk in vulnerable populations.
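The use of an autocorrelation function to derive a breathing rate from a digitized chest wall trace can be illustrated with a minimal sketch. This is not the authors' analysis pipeline: the function names, the sampling rate, the plausible period range, and the synthetic sinusoidal trace are all assumptions made for illustration. The rate is taken from the lag of the highest ACF peak, and the height of that peak gives a rough index of how regular the pattern is.

```python
import numpy as np

def autocorrelation(x):
    """Normalized autocorrelation of a 1-D signal for lags 0..len(x)-1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                        # remove DC offset
    full = np.correlate(x, x, mode="full")  # lags -(N-1)..(N-1)
    acf = full[full.size // 2:]             # keep non-negative lags
    return acf / acf[0]                     # lag 0 normalized to 1

def breathing_rate_from_acf(trace, fs, min_period_s=0.2, max_period_s=5.0):
    """Estimate breaths per minute from the highest ACF peak.

    The period bounds are hypothetical defaults, not values from the study.
    Returns (rate_per_min, peak_height); a peak height near 1 suggests a
    highly regular breath-to-breath pattern.
    """
    acf = autocorrelation(trace)
    lo = max(1, int(min_period_s * fs))
    hi = min(int(max_period_s * fs), acf.size - 1)
    lag = lo + int(np.argmax(acf[lo:hi + 1]))
    return 60.0 * fs / lag, float(acf[lag])

# Synthetic stand-in for a chest wall trace: 30 s sampled at 100 Hz,
# with one breath every 0.5 s (assumed values, for illustration only).
fs = 100
t = np.arange(30 * fs) / fs
trace = np.sin(2 * np.pi * 2.0 * t)
rate, regularity = breathing_rate_from_acf(trace, fs)
```

Comparing `regularity` between recordings is one simple way such a measure could be used to contrast groups or conditions, though the study's actual patterning measures are not specified here.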


Journal of the Acoustical Society of America | 2013

The perception of formant-frequency range is affected by veridical and judged fundamental frequency

Santiago Barreda; Terrance M. Nearey

The vowels produced by different speakers vary in terms of their fundamental frequency (f0) and formant frequencies (FFs). Variation in the production of a given vowel category between speakers of different sizes is primarily according to a single multiplicative parameter (related to speaker vocal-tract length). This parameter, which we refer to as FF-scaling, has an associated perceptual quality that listeners may use to determine apparent speaker characteristics and vowel quality. In a previous experiment [Barreda & Nearey. 2011. J. Acoust. Soc. Am., 129, p. 2661], listeners were trained to identify a limited set of voices based on FF-scaling and f0 differences. The current study presented listeners with a large number of voices (n = 4000) varying in FF-scaling and f0, arranged in a two-dimensional space where one dimension corresponded to each acoustic characteristic. Listeners were played a voice, and asked to indicate its location on the board, thereby providing an f0 and FF-scaling estimate for the vo...


Journal of the Acoustical Society of America | 2018

Modeling the perception of children's age from speech acoustics

Santiago Barreda; Peter F. Assmann

Adult listeners were presented with /hVd/ syllables spoken by boys and girls ranging from 5 to 18 years of age. Half of the listeners were informed of the sex of the speaker; the other half were not. Results indicate that veridical age in children can be predicted accurately based on the acoustic characteristics of the talker's voice and that listener behavior is highly predictable on the basis of speech acoustics. Furthermore, listeners appear to incorporate assumptions about talker sex into their estimates of talker age, even when information about the talker's sex is not explicitly provided for them.

Collaboration

Top Co-Authors

Peter F. Assmann, University of Texas at Dallas

Zoey Y. Liu, University of California