James W. Dias
University of California, Riverside
Publications
Featured research published by James W. Dias.
Attention, Perception, & Psychophysics | 2016
James W. Dias; Lawrence D. Rosenblum
Talkers automatically imitate aspects of perceived speech, a phenomenon known as phonetic convergence. Talkers have previously been found to converge to auditory and visual speech information. Furthermore, talkers converge more to the speech of a conversational partner who is seen and heard, relative to one who is just heard (Dias & Rosenblum Perception, 40, 1457–1466, 2011). A question raised by this finding is what visual information facilitates the enhancement effect. In the following experiments, we investigated the possible contributions of visible speech articulation to visual enhancement of phonetic convergence within the noninteractive context of a shadowing task. In Experiment 1, we examined the influence of the visibility of a talker on phonetic convergence when shadowing auditory speech either in the clear or in low-level auditory noise. The results suggest that visual speech can compensate for convergence that is reduced by auditory noise masking. Experiment 2 further established the visibility of articulatory mouth movements as being important to the visual enhancement of phonetic convergence. Furthermore, the word frequency and phonological neighborhood density characteristics of the words shadowed were found to significantly predict phonetic convergence in both experiments. Consistent with previous findings (e.g., Goldinger Psychological Review, 105, 251–279, 1998), phonetic convergence was greater when shadowing low-frequency words. Convergence was also found to be greater for low-density words, contrasting with previous predictions of the effect of phonological neighborhood density on auditory phonetic convergence (e.g., Pardo, Jordan, Mallari, Scanlon, & Lewandowski Journal of Memory and Language, 69, 183–195, 2013). Implications of the results for a gestural account of phonetic convergence are discussed.
Journal of Cognitive Psychology | 2017
Lawrence D. Rosenblum; James W. Dias; Josh Dorsi
The perceptual brain is designed around multisensory input. Areas once thought dedicated to a single sense are now known to work with multiple senses. It has been argued that the multisensory nature of the brain reflects a cortical architecture for which task, rather than sensory system, is the primary design principle. This supramodal thesis is supported by recent research on human echolocation and multisensory speech perception. In this review, we discuss the behavioural implications of a supramodal architecture, especially as they pertain to auditory perception. We suggest that the architecture implies a degree of perceptual parity between the senses and that cross-sensory integration occurs early and completely. We also argue that a supramodal architecture implies that perceptual experience can be shared across modalities and that this sharing should occur even without bimodal experience. We finish by briefly suggesting areas of future research.
Ecological Psychology | 2016
Lawrence D. Rosenblum; Josh Dorsi; James W. Dias
One important contribution of Carol Fowler's direct approach to speech perception is its account of multisensory perception. This supramodal account proposes a speech function that detects supramodal information available across audition, vision, and touch. This detection allows for the recovery of articulatory primitives that provide the basis of a common currency shared between modalities as well as between perception and production. Common currency allows for perceptual experience to be shared between modalities and supports perceptually guided speaking as well as production-guided perception. In this report, we discuss the contribution and status of the supramodal approach relative to recent research in multisensory speech perception. We argue that the approach has helped motivate a multisensory revolution in perceptual psychology. We then review the new behavioral and neurophysiological research on (a) supramodal information, (b) cross-sensory sharing of experience, and (c) perceptually guided speaking as well as production-guided speech perception. We conclude that Fowler's supramodal theory has fared quite well in light of this research.
Journal of the Acoustical Society of America | 2016
Josh Dorsi; Lawrence D. Rosenblum; James W. Dias; Dana Ashkar
Training with audio-visual speech improves subsequent auditory speech perception more than training with auditory-alone speech (e.g., Bernstein et al., 2013). What is the source of this bimodal training advantage? One explanation is that perceivers rely on learned bimodal associations. Alternatively, perceivers could be exploiting natural, amodal regularities available in both the auditory and visual signals. To address this question, we tested multisensory training stimuli for which observers had no prior bimodal associative experience. It is known that felt articulations, acquired by placing a hand on a speaker's face, can provide information for speech perception (see Treille et al., 2014, for a review). Importantly, these effects are found in participants with no prior experience perceiving speech through touch. If training with audio-haptic speech improves auditory speech perception, then this bimodal advantage cannot be due to learned associations but likely reflects sensitivity to amodal information. ...
Journal of the Acoustical Society of America | 2016
Chase Weinholtz; James W. Dias
Previous research suggests that many speech sounds are perceived as discrete categories. For example, perceivers typically identify acoustic stimuli from a continuum ranging from one syllable to another (e.g., /va/-to-/ba/) exclusively as one syllable or the other, with a sharp shift in identification near the middle of the continuum. Further, pairs of continuum stimuli are typically easier to discriminate if they span this sharp change in stimulus identification. Walden et al. (1987) investigated whether visual (lipread) speech is perceived categorically using a continuum of digitally morphed mouths, but failed to find any strong evidence for categorical perception. In the current investigation, a technique developed by Baart and Vroomen (2010) was used to create a /va/-to-/ba/ continuum of visual test stimuli. A video of a talker articulating /va/ was digitally superimposed over a video of the same talker articulating /ba/. Opacity of the /va/ video was then adjusted in 10% inc...
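As a rough illustration of this stimulus-construction step, the sketch below blends corresponding frames of two frame-aligned videos of the same talker articulating /va/ and /ba/, stepping the /va/ opacity in 10% increments. It is a minimal sketch only, assuming OpenCV for frame I/O; the file names, codec, and helper names are illustrative and not drawn from Baart and Vroomen (2010) or from this study.

```python
import numpy as np
import cv2  # OpenCV, assumed here for reading, blending, and writing frames


def blend_frames(frame_va, frame_ba, va_opacity):
    """Alpha-blend a /va/ frame over a /ba/ frame at the given /va/ opacity (0.0-1.0)."""
    return cv2.addWeighted(frame_va, va_opacity, frame_ba, 1.0 - va_opacity, 0.0)


def build_continuum_step(va_path, ba_path, out_path, va_opacity):
    """Write one continuum step by blending the two videos frame by frame."""
    cap_va, cap_ba = cv2.VideoCapture(va_path), cv2.VideoCapture(ba_path)
    fps = cap_va.get(cv2.CAP_PROP_FPS)
    size = (int(cap_va.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap_va.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    while True:
        ok_va, frame_va = cap_va.read()
        ok_ba, frame_ba = cap_ba.read()
        if not (ok_va and ok_ba):
            break
        writer.write(blend_frames(frame_va, frame_ba, va_opacity))
    for handle in (cap_va, cap_ba, writer):
        handle.release()


# An 11-step /va/-to-/ba/ continuum in 10% opacity increments (0% to 100% /va/).
for i, opacity in enumerate(np.linspace(0.0, 1.0, 11)):
    build_continuum_step("va.mp4", "ba.mp4", f"continuum_{i:02d}.mp4", opacity)
```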
Journal of the Acoustical Society of America | 2015
Dominique C. Simmons; James W. Dias; Josh Dorsi; Lawrence D. Rosenblum
Observers can match unfamiliar faces to corresponding voices (Kamachi et al., 2003). Observers can also use dynamic point-light displays containing isolated visible articulations to match faces to voices (Rosenblum et al., 2006) and to sinewave versions of those voices (Lachs & Pisoni, 2004), suggesting that isolated idiolectic information can support this skill. Cross-modal skills also extend to facilitation of speech perception. Familiarity with a talker in one modality can facilitate speech perception in another modality (Rosenblum, Miller, & Sanchez, 2007; Sanchez, Dias, & Rosenblum, 2013). Using point-light and sinewave techniques, we tested whether talker learning transfers across modalities. If learning of idiolectic talker information can transfer across modalities, observers should better learn to auditorily recognize talkers they have previously seen. Sixteen subjects trained to recognize five point-light talkers. Eight of these subjects then trained to recognize sinewave voices of the five previ...
Journal of the Acoustical Society of America | 2008
James W. Dias; Lorin Lachs
Auditory and visual perceptual processes interact during the identification of speech sounds. Some evaluations of this interaction have utilized a comparison of performance on audio-only and audiovisual word recognition tasks. A measure derived from these data, R, can be used as an index of the perceptual gain due to multisensory stimulation relative to unimodal stimulation. Recent evidence has indicated that cross-modal relationships between the acoustic and optical forms of speech stimuli exist. Furthermore, this cross-modal information may be used by the perceptual mechanisms responsible for integrating disparate sensory signals. However, little is known about the ways in which acoustic and optic signals carry cross-modal information. The present experiment manipulated the acoustic form of speech in systematic ways that selectively disrupted candidate sources of cross-modal information in the acoustic signal. Participants were then asked to perform a simple word recognition task with the transformed words i...
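The abstract does not spell out how R is computed. One widely used convention for this kind of gain index, in the spirit of Sumby and Pollack's relative visual gain, normalizes the audiovisual benefit by the room for improvement left by audio-alone performance; the sketch below assumes that definition and is not necessarily the exact measure used here.

```python
def multisensory_gain(p_audio: float, p_audiovisual: float) -> float:
    """Relative gain from adding visual information, normalized by the room
    for improvement left by audio-alone performance (assumed definition):
        R = (AV - A) / (1 - A)
    where A and AV are proportions correct in the audio-only and
    audiovisual conditions.
    """
    if not 0.0 <= p_audio < 1.0:
        raise ValueError("audio-alone accuracy must lie in [0, 1)")
    return (p_audiovisual - p_audio) / (1.0 - p_audio)


# e.g., 55% correct audio-only vs. 82% correct audiovisual
print(multisensory_gain(0.55, 0.82))  # 0.6: 60% of the possible improvement realized
```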
Quarterly Journal of Experimental Psychology | 2018
Josh Dorsi; Navin Viswanathan; Lawrence D. Rosenblum; James W. Dias
The Irrelevant Sound Effect (ISE) is the finding that background sound impairs accuracy for visually presented serial recall tasks. Among various auditory backgrounds, speech typically acts as the strongest distractor. According to the changing-state hypothesis, speech is a disruptive background because it is more complex than other nonspeech backgrounds. In the current study, we evaluate an alternative explanation by examining whether the speech-likeness of the background (speech fidelity) contributes, beyond signal complexity, to the ISE. We did this by using noise-vocoded speech as a background. In Experiment 1, we varied the complexity of the background by manipulating the number of vocoding channels. Results indicate that the ISE increases with the number of channels, suggesting that more complex signals produce greater ISEs. In Experiment 2, we varied complexity and speech fidelity independently. At each channel level, we selectively reversed a subset of channels to create a low-fidelity signal equated in overall complexity. Experiment 2 results indicated that speech-like noise-vocoded speech produces a larger ISE than selectively reversed noise-vocoded speech. Finally, in Experiment 3, we evaluated the locus of the speech-fidelity effect by assessing the distraction produced by these stimuli in a missing-item task. In this task, even though noise-vocoded speech disrupted task performance relative to silence, neither its complexity nor its speech fidelity contributed to this effect. Together, these findings indicate a clear role for the speech fidelity of the background, beyond its changing-state quality and its attention-capture potential.
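As a sketch of the stimulus manipulations described above, the code below noise-vocodes a signal into N log-spaced channels and optionally time-reverses a chosen subset of channels before summing. It is a minimal illustration, assuming SciPy; the analysis band, filter order, and channel spacing are assumptions rather than the parameters used in the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert


def noise_vocode(signal, fs, n_channels, reversed_channels=()):
    """Noise-vocode `signal`: split it into log-spaced frequency channels,
    extract each channel's amplitude envelope, and use the envelopes to
    modulate band-limited noise. Channels listed in `reversed_channels`
    are time-reversed before summing (the low-fidelity manipulation)."""
    lo, hi = 100.0, min(8000.0, 0.45 * fs)        # analysis band (assumed)
    edges = np.geomspace(lo, hi, n_channels + 1)  # logarithmic channel edges
    out = np.zeros(len(signal), dtype=float)
    for ch in range(n_channels):
        sos = butter(4, [edges[ch], edges[ch + 1]], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        envelope = np.abs(hilbert(band))                          # channel envelope
        carrier = sosfiltfilt(sos, np.random.randn(len(signal)))  # band-limited noise
        channel = envelope * carrier
        if ch in reversed_channels:
            channel = channel[::-1]               # time-reverse this channel only
        out += channel
    return out / np.max(np.abs(out))              # normalize to avoid clipping


# e.g., an 8-channel vocoder with half of the channels time-reversed
# (illustrative parameters; `speech` is a mono waveform sampled at `fs` Hz)
# low_fidelity = noise_vocode(speech, fs=44100, n_channels=8, reversed_channels={1, 3, 5, 7})
```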
Journal of the Acoustical Society of America | 2016
James W. Dias; Lawrence D. Rosenblum
Selective adaptation of speech information can change perceptual categorization of ambiguous phonetic information (e.g., Vroomen and Baart, 2012). The results of a number of studies suggest that selective adaptation may depend on sensory-specific information shared between the adaptor and test stimuli (e.g., Roberts and Summerfield, 1981; Saldaña and Rosenblum, 1994). For example, adaptation to heard syllables can change perception of heard syllables, but adaptation to lipread syllables has not been found to significantly change perception of heard syllables. The lack of crossmodal influence in selective speech adaptation is inconsistent with other phenomena suggesting that crossmodal influences occur early in the speech process (e.g., Rosenblum, 2008). For the current investigation, an attempt to replicate crossmodal speech adaptation again failed to yield significant effects. However, a meta-analysis including these results with past attempts (Roberts and Summerfield, 1981) revealed that small ...
Journal of the Acoustical Society of America | 2015
James W. Dias; Lawrence D. Rosenblum
Human perceivers unconsciously imitate the subtle articulatory characteristics of perceived speech (phonetic convergence) during live conversation [e.g., Pardo, 2006] and when shadowing (saying aloud) pre-recorded speech [e.g., Goldinger, 1998]. Perceivers converge along acoustical speech dimensions when shadowing auditory (heard) and visual (lipread) speech [Miller, Sanchez, & Rosenblum, 2010], suggesting the information to which perceivers converge may be similar across sensory modalities. It is known that phonetic convergence to auditory speech is reduced when shadowing high-frequency words and when shadowing responses are delayed. These findings suggest that stored lexical representations and working memory processing time may modulate speech imitation [Goldinger, 1998]. The question arises of whether phonetic convergence to shadowed visual speech would demonstrate the same effects of word frequency and shadowing response delay. Phonetic convergence was reduced when shadowing lip-read high-frequency w...