Zilong Xie
University of Texas at Austin
Publications
Featured research published by Zilong Xie.
Neuropsychologia | 2015
Zilong Xie; W. Todd Maddox; Valerie S. Knopik; John E. McGeary; Bharath Chandrasekaran
Listeners vary substantially in their ability to recognize speech in noisy environments. Here we examined the role of genetic variation on individual differences in speech recognition in various noise backgrounds. Background noise typically varies in the levels of energetic masking (EM) and informational masking (IM) imposed on target speech. Relative to EM, release from IM is hypothesized to place greater demand on executive function to selectively attend to target speech while ignoring competing noises. Recent evidence suggests that the long allele variant in exon III of the DRD4 gene, primarily expressed in the prefrontal cortex, may be associated with enhanced selective attention to goal-relevant high-priority information even in the face of interference. We investigated the extent to which this polymorphism is associated with speech recognition in IM and EM conditions. In an unscreened adult sample (Experiment 1) and a larger screened replication sample (Experiment 2), we demonstrate that individuals with the DRD4 long variant show better recognition performance in noise conditions involving significant IM, but not in EM conditions. In Experiment 2, we also obtained neuropsychological measures to assess the underlying mechanisms. Mediation analysis revealed that this listening condition-specific advantage was mediated by enhanced executive attention/working memory capacity in individuals with the long allele variant. These findings suggest that DRD4 may contribute specifically to individual differences in speech recognition ability in noise conditions that place demands on executive function.
Journal of Neurophysiology | 2017
Zilong Xie; Rachel Reetzke; Bharath Chandrasekaran
While lifelong language experience modulates subcortical encoding of pitch patterns, there is emerging evidence that short-term training introduced in adulthood also shapes subcortical pitch encoding. Here we use a cross-language design to examine the stability of language experience-dependent subcortical plasticity over multiple days. We then examine the extent to which behavioral relevance induced by sound-to-category training leads to plastic changes in subcortical pitch encoding in adulthood relative to adolescence, a period of ongoing maturation of subcortical and cortical auditory processing. Frequency-following responses (FFRs), which reflect phase-locked activity from subcortical neural ensembles, were elicited while participants passively listened to pitch patterns reflective of Mandarin tones. In experiment 1, FFRs were recorded across three consecutive days from native Chinese-speaking (n = 10) and English-speaking (n = 10) adults. In experiment 2, FFRs were recorded from native English-speaking adolescents (n = 20) and adults (n = 15) before, during, and immediately after a session of sound-to-category training, as well as a day after training ceased. Experiment 1 demonstrated the stability of language experience-dependent subcortical plasticity in pitch encoding across multiple days of passive exposure to linguistic pitch patterns. In contrast, experiment 2 revealed an enhancement in subcortical pitch encoding that emerged a day after the sound-to-category training, with some developmental differences observed. Taken together, these findings suggest that behavioral relevance is a critical component for the observation of plasticity in the subcortical encoding of pitch. NEW & NOTEWORTHY We examine the timescale of experience-dependent auditory plasticity to linguistically relevant pitch patterns. We find extreme stability in lifelong experience-dependent plasticity. We further demonstrate that subcortical function in adolescents and adults is modulated by a single session of sound-to-category training. Our results suggest that behavioral relevance is a necessary ingredient for neural changes in pitch encoding to be observed throughout human development. These findings contribute to the neurophysiological understanding of long- and short-term experience-dependent modulation of pitch.
Attention, Perception, & Psychophysics | 2017
Kristin J. Van Engen; Zilong Xie; Bharath Chandrasekaran
In noisy situations, visual information plays a critical role in the success of speech communication: listeners are better able to understand speech when they can see the speaker. Visual influence on auditory speech perception is also observed in the McGurk effect, in which discrepant visual information alters listeners’ auditory perception of a spoken syllable. When hearing /ba/ while seeing a person saying /ga/, for example, listeners may report hearing /da/. Because these two phenomena have been assumed to arise from a common integration mechanism, the McGurk effect has often been used as a measure of audiovisual integration in speech perception. In this study, we test whether this assumed relationship exists within individual listeners. We measured participants’ susceptibility to the McGurk illusion as well as their ability to identify sentences in noise across a range of signal-to-noise ratios in audio-only and audiovisual modalities. Our results do not show a relationship between listeners’ McGurk susceptibility and their ability to use visual cues to understand spoken sentences in noise, suggesting that McGurk susceptibility may not be a valid measure of audiovisual integration in everyday speech processing.
Cognition & Emotion | 2015
Bharath Chandrasekaran; Kristin J. Van Engen; Zilong Xie; Christopher G. Beevers; W. Todd Maddox
It is widely acknowledged that individuals with elevated depressive symptoms exhibit deficits in interpersonal communication. Research has primarily focused on speech production in individuals with elevated depressive symptoms. Little is known about speech perception in individuals with elevated depressive symptoms, especially in challenging listening conditions. Here, we examined speech perception in young adults with low-depressive (LD) or high-depressive (HD) symptoms in the presence of a range of maskers. Maskers were selected to reflect various levels of informational masking (IM), which refers to cognitive interference due to signal and masker similarity, and energetic masking (EM), which refers to peripheral interference due to signal degradation by the masker. Speech intelligibility data revealed that individuals with HD symptoms did not differ from those with LD symptoms during EM, but they exhibited a selective deficit during IM. Since IM is a common occurrence in real-world social settings, this listening deficit may exacerbate communicative difficulties.
Journal of Neuroscience Methods | 2017
Fernando Llanos; Zilong Xie; Bharath Chandrasekaran
BACKGROUND The frequency-following response (FFR) is a scalp-recorded electrophysiological potential reflecting phase-locked activity from neural ensembles in the auditory system. The FFR is often used to assess the robustness of subcortical pitch processing. Due to low signal-to-noise ratio at the single-trial level, FFRs are typically averaged across thousands of stimulus repetitions. Prior work using this approach has shown that subcortical encoding of linguistically-relevant pitch patterns is modulated by long-term language experience. NEW METHOD We examine the extent to which a machine learning approach using hidden Markov modeling (HMM) can be utilized to decode Mandarin tone categories from scalp-recorded electrophysiological activity. We then assess the extent to which the HMM can capture biologically relevant effects (language experience-driven plasticity). To this end, we recorded FFRs to four Mandarin tones from 14 adult native speakers of Chinese and 14 adult native speakers of English. We trained an HMM to decode tone categories from FFR averages of varying size. RESULTS AND COMPARISONS WITH EXISTING METHODS Tone categories were decoded with above-chance accuracies using the HMM. The HMM-derived metric (decoding accuracy) revealed a robust effect of language experience, such that FFRs from native Chinese speakers yielded greater accuracies than FFRs from native English speakers. Critically, the language experience-driven plasticity was captured with average sizes significantly smaller than those used in the extant literature. CONCLUSIONS Our results demonstrate the feasibility of HMM in assessing the robustness of neural pitch encoding. Machine-learning approaches can complement extant analytical methods that capture auditory function and could reduce the number of trials needed to capture biological phenomena.
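As a rough illustration only (not the authors' published pipeline), the sketch below shows one way such an HMM decoder could be organized: fit one Gaussian HMM per tone category on averaged FFR waveforms, then assign each held-out average to the category whose model yields the highest log-likelihood. The hmmlearn package, the array shapes, the windowing of waveforms into observation sequences, and all parameter values are assumptions made for this sketch.
```python
import numpy as np
from hmmlearn import hmm

def make_averages(trials, n_avg, n_sets, rng):
    """Build n_sets averaged FFRs, each from n_avg randomly drawn single trials."""
    return np.stack([
        trials[rng.choice(len(trials), size=n_avg, replace=False)].mean(axis=0)
        for _ in range(n_sets)
    ])

def to_sequence(avg_waveform, win=25):
    """Slice an averaged waveform into consecutive windows (one HMM observation per window)."""
    n = len(avg_waveform) // win
    return avg_waveform[: n * win].reshape(n, win)

def train_models(train_averages, n_states=4):
    """Fit one Gaussian HMM per tone category on that category's averaged FFRs."""
    models = {}
    for tone, averages in train_averages.items():
        seqs = [to_sequence(a) for a in averages]
        X = np.vstack(seqs)               # stacked observations
        lengths = [len(s) for s in seqs]  # sequence boundaries for hmmlearn
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        m.fit(X, lengths)
        models[tone] = m
    return models

def decode(models, avg_waveform):
    """Label an averaged FFR with the tone whose HMM assigns it the highest log-likelihood."""
    seq = to_sequence(avg_waveform)
    return max(models, key=lambda tone: models[tone].score(seq))

# Hypothetical usage: ffr_trials[tone] is an (n_trials, n_timepoints) array of single-trial FFRs.
# rng = np.random.default_rng(0)
# train = {tone: make_averages(trials, n_avg=50, n_sets=20, rng=rng)
#          for tone, trials in ffr_trials.items()}
# models = train_models(train)
# predicted_tone = decode(models, make_averages(ffr_trials["T1"], 50, 1, rng)[0])
```
Decoding accuracy on held-out averages, computed per listener and per average size, would then serve as the kind of group-level plasticity metric the abstract describes.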
Brain and Behavior | 2017
Han G. Yi; Zilong Xie; Rachel Reetzke; Alexandros G. Dimakis; Bharath Chandrasekaran
Scalp-recorded electrophysiological responses to complex, periodic auditory signals reflect phase-locked activity from neural ensembles within the auditory system. These responses, referred to as frequency-following responses (FFRs), have been widely utilized to index typical and atypical representation of speech signals in the auditory system. One of the major limitations in FFR is the low signal-to-noise ratio at the level of single trials. For this reason, the analysis relies on averaging across thousands of trials. The ability to examine the quality of single-trial FFRs will allow investigation of trial-by-trial dynamics of the FFR, which has been impossible due to the averaging approach.
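To make the averaging limitation concrete, here is a toy simulation (synthetic data and arbitrary parameter values, not the study's recordings) of why conventional FFR analyses pool thousands of sweeps: averaging N noisy repetitions of a weak periodic response improves the amplitude signal-to-noise ratio of the average by roughly the square root of N.
```python
import numpy as np

# Toy simulation: a weak periodic "FFR" buried in much larger single-trial noise.
rng = np.random.default_rng(0)
fs, dur, f0 = 10_000, 0.2, 100                # sampling rate (Hz), duration (s), pitch (Hz)
t = np.arange(int(fs * dur)) / fs
signal = 0.1 * np.sin(2 * np.pi * f0 * t)     # weak phase-locked response
noise_sd = 1.0                                # single-trial noise dominates the response

def snr_after_averaging(n_trials):
    """SNR of the across-trial average relative to the residual noise left after averaging."""
    trials = signal + rng.normal(0.0, noise_sd, size=(n_trials, len(t)))
    avg = trials.mean(axis=0)
    residual = avg - signal
    return signal.std() / residual.std()

for n in (1, 100, 1000, 4000):
    print(f"{n:5d} trials -> SNR ~ {snr_after_averaging(n):.2f}")
```
At the single-trial level the simulated response sits well below the noise floor, which is the situation motivating the single-trial quality measures this paper pursues.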
PLOS ONE | 2016
Rachel Reetzke; Boji Pak-Wing Lam; Zilong Xie; Li Sheng; Bharath Chandrasekaran
Recognizing speech in adverse listening conditions is a significant cognitive, perceptual, and linguistic challenge, especially for children. Prior studies have yielded mixed results on the impact of bilingualism on speech perception in noise. Methodological variations across studies make it difficult to converge on a conclusion regarding the effect of bilingualism on speech-in-noise performance. Moreover, there is a dearth of speech-in-noise evidence for bilingual children who learn two languages simultaneously. The aim of the present study was to examine the extent to which various adverse listening conditions modulate differences in speech-in-noise performance between monolingual and simultaneous bilingual children. To that end, sentence recognition was assessed in twenty-four school-aged children (12 monolinguals; 12 simultaneous bilinguals, age of English acquisition ≤ 3 yrs.). We implemented a comprehensive speech-in-noise battery to examine recognition of English sentences across different modalities (audio-only, audiovisual), masker types (steady-state pink noise, two-talker babble), and a range of signal-to-noise ratios (SNRs; 0 to -16 dB). Results revealed no difference in performance between monolingual and simultaneous bilingual children across each combination of modality, masker, and SNR. Our findings suggest that when English age of acquisition and socioeconomic status are similar between groups, monolingual and bilingual children exhibit comparable speech-in-noise performance across a range of conditions analogous to everyday listening environments.
Current Biology | 2018
Rachel Reetzke; Zilong Xie; Fernando Llanos; Bharath Chandrasekaran
Although challenging, adults can learn non-native phonetic contrasts with extensive training [1, 2], indicative of perceptual learning beyond an early sensitivity period [3, 4]. Training can alter low-level sensory encoding of newly acquired speech sound patterns [5]; however, the time-course, behavioral relevance, and long-term retention of such sensory plasticity are unclear. Some theories argue that sensory plasticity underlying signal enhancement is immediate and critical to perceptual learning [6, 7]. Others, like the reverse hierarchy theory (RHT), posit a slower time-course for sensory plasticity [8]. RHT proposes that higher-level categorical representations guide immediate, novice learning, while lower-level sensory changes do not emerge until expert stages of learning [9]. We trained 20 English-speaking adults to categorize a non-native phonetic contrast (Mandarin lexical tones) using a criterion-dependent sound-to-category training paradigm. Sensory and perceptual indices were assayed across operationally defined learning phases (novice, experienced, over-trained, and 8-week retention) by measuring the frequency-following response, a neurophonic potential that reflects fidelity of sensory encoding, and the perceptual identification of a tone continuum. Our results demonstrate that while robust changes in sensory encoding and perceptual identification of Mandarin tones emerged with training and were retained, such changes followed different timescales. Sensory changes were evidenced and related to behavioral performance only when participants were over-trained. In contrast, changes in perceptual identification reflecting improvement in categorical percept emerged relatively earlier. Individual differences in perceptual identification, and not sensory encoding, related to faster learning. Our findings support the RHT: sensory plasticity accompanies, rather than drives, expert levels of non-native speech learning.
Archive | 2017
Rachel Reetzke; Zilong Xie; Bharath Chandrasekaran
Literacy acquisition is complex and multifactorial. Successful literacy acquisition places extreme demands on sensory and cognitive processes. Individuals with reading disorders demonstrate a range of linguistic, sensory, and cognitive deficits. In this chapter, the relationship between reading ability and the frequency-following response (FFR) is examined. The utility of the FFR in assessment of successful literacy and reading disorders is reviewed along with the use of FFR as an index of remediation. Finally, the chapter concludes with a discussion of current issues and future directions regarding the utility of the FFR as an objective neural metric of deficits in literacy disorders. Throughout these sections the distinct cognitive, linguistic, and experiential influences on the FFR are highlighted to further demonstrate how the FFR to speech may serve as an auditory biomarker to predict literacy disorders.
Journal of the Acoustical Society of America | 2017
Fernando Llanos; Zilong Xie; Bharath Chandrasekaran
Pitch encoding is often studied with the frequency-following response (FFR), a scalp-recorded potential reflecting phase-locked activity from auditory subcortical ensembles. Prior work using the FFR has shown that long-term language experience modulates subcortical encoding of linguistically-relevant pitch patterns. These studies typically rely on averaging FFRs across thousands of repetitions, due to the low signal-to-noise ratio of single-trial FFRs. Here, we evaluated the extent to which hidden Markov models (HMMs), trained with fewer trials, can be used to quantify pitch encoding as well as capture language experience-dependent plasticity in pitch encoding. FFRs were recorded from fourteen native Mandarin Chinese speakers and fourteen native American English speakers passively listening to four Mandarin tones (1000 trials per tone). HMMs were used to recognize FFRs to each tone in individual participants. Specifically, HMMs were trained and tested across FFR sets of different sizes, ranging from 50 to 500 trials. Results showed that HMMs we...