Publication


Featured research published by Ross K. Maddox.


The Journal of Neuroscience | 2008

Invariance and Sensitivity to Intensity in Neural Discrimination of Natural Sounds

Cyrus P. Billimoria; Benjamin J. Kraus; Rajiv Narayan; Ross K. Maddox; Kamal Sen

Intensity variation poses a fundamental problem for sensory discrimination because changes in the response of sensory neurons as a result of stimulus identity, e.g., a change in the identity of the speaker uttering a word, can potentially be confused with changes resulting from stimulus intensity, for example, the loudness of the utterance. Here we report on the responses of neurons in field L, the primary auditory cortex homolog in songbirds, which allow for accurate discrimination of birdsongs that is invariant to intensity changes over a large range. Such neurons comprise a subset of a population that is highly diverse, in terms of both discrimination accuracy and intensity sensitivity. We find that the neurons with a high degree of invariance also display a high discrimination performance, and that the degree of invariance is significantly correlated with the reproducibility of spike timing on a short time scale and the temporal sparseness of spiking activity. Our results indicate that a temporally sparse spike timing-based code at a primary cortical stage can provide a substrate for intensity-invariant discrimination of natural sounds.
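
As a rough illustration of how discrimination from spike timing can be quantified (not the study's actual analysis), the sketch below smooths spike trains with an exponential kernel and assigns each test trial to the nearest stimulus template; the kernel width, response window, and toy spike times are assumptions.

    # Hedged sketch: nearest-template discrimination on exponentially smoothed
    # spike trains. Kernel width and toy spike times are illustrative only.
    import numpy as np

    def smooth(spike_times, t_grid, tau=0.01):
        """Convolve a spike train with a causal exponential kernel (width tau, s)."""
        out = np.zeros_like(t_grid)
        for s in spike_times:
            mask = t_grid >= s
            out[mask] += np.exp(-(t_grid[mask] - s) / tau)
        return out

    def classify(trial, templates, t_grid, tau=0.01):
        """Index of the stimulus whose smoothed template is closest to the trial."""
        r = smooth(trial, t_grid, tau)
        dists = [np.sum((r - smooth(tpl, t_grid, tau)) ** 2) for tpl in templates]
        return int(np.argmin(dists))

    t_grid = np.arange(0.0, 2.0, 0.001)                 # 2-s window, 1-ms grid
    song_a = np.array([0.10, 0.45, 0.90, 1.30])         # toy template spike times (s)
    song_b = np.array([0.20, 0.60, 1.10, 1.70])
    trial = np.array([0.11, 0.47, 0.88, 1.31])
    print(classify(trial, [song_a, song_b], t_grid))    # -> 0 (matches song_a)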


JARO: Journal of the Association for Research in Otolaryngology | 2012

Influence of task-relevant and task-irrelevant feature continuity on selective auditory attention.

Ross K. Maddox; Barbara G. Shinn-Cunningham

Past studies have explored the relative strengths of auditory features in a selective attention task by pitting features against one another and asking listeners to report the words perceived in a given sentence. While these studies show that the continuity of competing features affects streaming, they did not address whether the influence of specific features is modulated by volitionally directed attention. Here, we explored whether the continuity of a task-irrelevant feature affects the ability to selectively report one of two competing speech streams when attention is specifically directed to a different feature. Sequences of simultaneous pairs of spoken digits were presented in which exactly one digit of each pair matched a primer phrase in pitch and exactly one digit of each pair matched the primer's location. Within a trial, location and pitch were randomly paired; they either were consistent with each other from digit to digit or were switched (e.g., the sequence from the primer's location changed pitch across digits). In otherwise identical blocks, listeners were instructed to report digits matching the primer either in location or in pitch. Listeners were told to ignore the irrelevant feature, if possible, in order to perform well. Listener responses depended on task instructions, proving that top–down attention alters how a subject performs the task. Performance improved when the separation of the target and masker in the task-relevant feature increased. Importantly, the values of the task-irrelevant feature also influenced performance in some cases. Specifically, when instructed to attend location, listeners performed worse as the separation between target and masker pitch increased, especially when the spatial separation between digits was small. These results indicate that task-relevant and task-irrelevant features are perceptually bound together: continuity of task-irrelevant features influences selective attention in an automatic, obligatory manner, consistent with the idea that auditory attention operates on objects.


PLOS Biology | 2012

Competing sound sources reveal spatial effects in cortical processing.

Ross K. Maddox; Cyrus P. Billimoria; Ben P. Perrone; Barbara G. Shinn-Cunningham; Kamal Sen

Neurons in the avian auditory forebrain show strong sensitivity to the spatial configuration of two competing sources, even though there is only weak spatial dependence for any single source.


eLife | 2015

Auditory selective attention is enhanced by a task-irrelevant temporally coherent visual stimulus in human listeners

Ross K. Maddox; Huriye Atilgan; Jennifer K. Bizley; Adrian Lee

In noisy settings, listening is aided by correlated dynamic visual cues gleaned from a talker's face—an improvement often attributed to visually reinforced linguistic information. In this study, we aimed to test the effect of audio–visual temporal coherence alone on selective listening, free of linguistic confounds. We presented listeners with competing auditory streams whose amplitude varied independently and a visual stimulus with varying radius, while manipulating the cross-modal temporal relationships. Performance improved when the auditory target's time course matched that of the visual stimulus. The fact that the coherence was between task-irrelevant stimulus features suggests that the observed improvement stemmed from the integration of auditory and visual streams into cross-modal objects, enabling listeners to better attend the target. These findings suggest that in everyday conditions, where listeners can often see the source of a sound, temporal cues provided by vision can help listeners to select one sound source from a mixture. DOI: http://dx.doi.org/10.7554/eLife.04995.001
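
As a loose sketch of the coherence manipulation described above (modulation cutoff, trial length, and sample rate are assumptions, not the study's parameters), the code below builds two independently modulated envelopes and a visual radius time course that either tracks the target envelope or varies on its own.

    # Illustrative sketch of the audio-visual coherence manipulation: two
    # independently modulated auditory envelopes and a visual radius that either
    # matches the target's envelope or varies independently. Parameter values
    # are assumptions for illustration.
    import numpy as np

    def smoothed_noise_envelope(n_samples, fs, cutoff_hz=7.0, rng=None):
        """Low-pass-filtered noise rescaled to [0, 1], used as an amplitude envelope."""
        rng = np.random.default_rng() if rng is None else rng
        noise = rng.standard_normal(n_samples)
        spec = np.fft.rfft(noise)
        freqs = np.fft.rfftfreq(n_samples, 1.0 / fs)
        spec[freqs > cutoff_hz] = 0.0                  # crude low-pass filter
        env = np.fft.irfft(spec, n_samples)
        env -= env.min()
        return env / env.max()

    fs, dur = 1000, 14.0                               # envelope rate (Hz), trial length (s)
    n = int(fs * dur)
    target_env = smoothed_noise_envelope(n, fs)        # target stream's amplitude
    masker_env = smoothed_noise_envelope(n, fs)        # independent masker amplitude

    coherent_with_target = True                        # condition flag
    visual_radius = (target_env if coherent_with_target
                     else smoothed_noise_envelope(n, fs))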


Trends in Neurosciences | 2016

Defining Auditory-Visual Objects: Behavioral Tests and Physiological Mechanisms

Jennifer K. Bizley; Ross K. Maddox; Adrian Lee

Crossmodal integration is a term applicable to many phenomena in which one sensory modality influences task performance or perception in another sensory modality. We distinguish the term binding as one that should be reserved specifically for the process that underpins perceptual object formation. To unambiguously differentiate binding from other types of integration, behavioral and neural studies must investigate perception of a feature orthogonal to the features that link the auditory and visual stimuli. We argue that supporting true perceptual binding (as opposed to other processes such as decision-making) is one role for cross-sensory influences in early sensory cortex. These early multisensory interactions may therefore form a physiological substrate for the bottom-up grouping of auditory and visual stimuli into auditory-visual (AV) objects.


Frontiers in Neuroscience | 2014

Improving spatial localization in MEG inverse imaging by leveraging intersubject anatomical differences

Eric Larson; Ross K. Maddox; Adrian Lee

Modern neuroimaging techniques enable non-invasive observation of ongoing neural processing, with magnetoencephalography (MEG) in particular providing direct measurement of neural activity with millisecond time resolution. However, accurately mapping measured MEG sensor readings onto the underlying source neural structures remains an active area of research. This so-called “inverse problem” is ill posed, and poses a challenge for source estimation that is often cited as a drawback limiting MEG data interpretation. However, anatomically constrained MEG localization estimates may be more accurate than commonly believed. Here we hypothesize that, by combining anatomically constrained inverse estimates across subjects, the spatial uncertainty of MEG source localization can be mitigated. Specifically, we argue that differences in subject brain geometry yield differences in point-spread functions, resulting in improved spatial localization across subjects. To test this, we use standard methods to combine subject anatomical MRI scans with coregistration information to obtain an accurate forward (physical) solution, modeling the MEG sensor data resulting from brain activity originating from different cortical locations. Using a linear minimum-norm inverse to localize this brain activity, we demonstrate that a substantial increase in the spatial accuracy of MEG source localization can result from combining data from subjects with differing brain geometry. This improvement may be enabled by an increase in the amount of available spatial information in MEG data as measurements from different subjects are combined. This approach becomes more important in the face of practical issues of coregistration errors and potential noise sources, where we observe even larger improvements in localization when combining data across subjects. Finally, we use a simple auditory N100(m) localization task to show how this effect can influence localization using a recorded neural dataset.
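
As a loose illustration of the anatomically constrained minimum-norm pipeline described above, the sketch below uses the open-source MNE-Python package on its bundled sample dataset; the file names, event code, and regularization values are illustrative assumptions rather than the settings used in the study, and a recent MNE version (where data_path() returns a pathlib.Path) is assumed.

    # Hedged sketch of an anatomically constrained minimum-norm estimate with
    # MNE-Python. Paths, event code, and parameters are illustrative only.
    import mne
    from mne.datasets import sample
    from mne.minimum_norm import make_inverse_operator, apply_inverse

    data_path = sample.data_path()
    meg_path = data_path / "MEG" / "sample"

    # Epoch auditory responses around the N100(m) window.
    raw = mne.io.read_raw_fif(meg_path / "sample_audvis_raw.fif", preload=True)
    events = mne.find_events(raw, stim_channel="STI 014")
    epochs = mne.Epochs(raw, events, event_id=1, tmin=-0.2, tmax=0.5,
                        baseline=(None, 0), picks="meg")
    evoked = epochs.average()

    # Forward (physical) model built from the subject's anatomical MRI and
    # coregistration; assumed precomputed here.
    fwd = mne.read_forward_solution(meg_path / "sample_audvis-meg-oct-6-fwd.fif")

    # Noise covariance from the baseline, then the linear minimum-norm inverse.
    noise_cov = mne.compute_covariance(epochs, tmax=0.0)
    inv = make_inverse_operator(evoked.info, fwd, noise_cov, loose=0.2, depth=0.8)

    # Source estimate on the subject's cortical surface (dSPM-weighted).
    stc = apply_inverse(evoked, inv, lambda2=1.0 / 9.0, method="dSPM")
    print(stc.data.shape)  # (n_sources, n_times)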


Neuron | 2018

Integration of Visual Information in Auditory Cortex Promotes Auditory Scene Analysis through Multisensory Binding

Huriye Atilgan; Stephen Michael Town; Katherine C. Wood; Gareth Jones; Ross K. Maddox; Adrian Lee; Jennifer K. Bizley

How and where in the brain audio-visual signals are bound to create multimodal objects remains unknown. One hypothesis is that temporal coherence between dynamic multisensory signals provides a mechanism for binding stimulus features across sensory modalities. Here, we report that when the luminance of a visual stimulus is temporally coherent with the amplitude fluctuations of one sound in a mixture, the representation of that sound is enhanced in auditory cortex. Critically, this enhancement extends to include both binding and non-binding features of the sound. We demonstrate that visual information conveyed from visual cortex via the phase of the local field potential is combined with auditory information within auditory cortex. These data provide evidence that early cross-sensory binding provides a bottom-up mechanism for the formation of cross-sensory objects and that one role for multisensory binding in auditory cortex is to support auditory scene analysis.


JARO: Journal of the Association for Research in Otolaryngology | 2012

Neuron-specific stimulus masking reveals interference in spike timing at the cortical level.

Eric H. Larson; Ross K. Maddox; Ben P. Perrone; Kamal Sen; Cyrus P. Billimoria

The auditory system is capable of robust recognition of sounds in the presence of competing maskers (e.g., other voices or background music). This capability arises despite the fact that masking stimuli can disrupt neural responses at the cortical level. Since the origins of such interference effects remain unknown, in this study, we work to identify and quantify neural interference effects that originate due to masking occurring within and outside receptive fields of neurons. We record from single and multi-unit auditory sites from field L, the auditory cortex homologue in zebra finches. We use a novel method called spike timing-based stimulus filtering that uses the measured response of each neuron to create an individualized stimulus set. In contrast to previous adaptive experimental approaches, which have typically focused on the average firing rate, this method uses the complete pattern of neural responses, including spike timing information, in the calculation of the receptive field. When we generate and present novel stimuli for each neuron that mask the regions within the receptive field, we find that the time-varying information in the neural responses is disrupted, degrading neural discrimination performance and decreasing spike timing reliability and sparseness. We also find that, while removing stimulus energy from frequency regions outside the receptive field does not significantly affect neural responses for many sites, adding a masker in these frequency regions can nonetheless have a significant impact on neural responses and discriminability without a significant change in the average firing rate. These findings suggest that maskers can interfere with neural responses by disrupting stimulus timing information with power either within or outside the receptive fields of neurons.
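
As a simplified stand-in for the receptive-field step of this approach (plain spike-triggered averaging rather than the paper's spike-timing-based filtering, with all dimensions and thresholds chosen only for illustration), the sketch below estimates a spectro-temporal receptive field and flags the frequency channels where an in-field masker could be placed.

    # Hedged sketch: estimate a spectro-temporal receptive field (STRF) by
    # spike-triggered averaging, then flag the frequency channels carrying most
    # of its energy. A simplified stand-in, not the paper's method.
    import numpy as np

    def strf_from_spikes(spectrogram, spike_frames, n_lags=40):
        """spectrogram: (n_freq, n_frames); spike_frames: spike times as frame indices."""
        n_freq, _ = spectrogram.shape
        strf = np.zeros((n_freq, n_lags))
        count = 0
        for t in spike_frames:
            if t >= n_lags:
                strf += spectrogram[:, t - n_lags:t]   # stimulus preceding each spike
                count += 1
        return strf / max(count, 1)

    def in_field_channels(strf, frac=0.5):
        """Frequency channels whose STRF energy exceeds frac of the maximum."""
        energy = np.sum(strf ** 2, axis=1)
        return np.flatnonzero(energy >= frac * energy.max())

    rng = np.random.default_rng(0)
    spec = rng.random((32, 2000))                      # toy spectrogram (freq x frames)
    spikes = rng.integers(40, 2000, size=300)          # toy spike frame indices
    print(in_field_channels(strf_from_spikes(spec, spikes)))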


Journal of the Acoustical Society of America | 2009

The intelligibility of pointillistic speech

Gerald Kidd; Timothy Streeter; Antje Ihlefeld; Ross K. Maddox; Christine R. Mason

A form of processed speech is described that is highly discriminable in a closed-set identification format. The processing renders speech into a set of sinusoidal pulses played synchronously across frequency. The processing and results from several experiments are described. The number and width of frequency analysis channels and tone-pulse duration were variables. In one condition, various proportions of the tones were randomly removed. The processed speech was remarkably resilient to these manipulations. This type of speech may be useful for examining multitalker listening situations in which a high degree of stimulus control is required.
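
A rough sketch of this style of processing is given below: speech is split into log-spaced frequency channels, each channel's envelope sets the level of brief tone pulses at the channel center frequency, pulses are gated synchronously across channels, and a proportion of pulses can be randomly removed. The channel count, pulse duration, and scaling are assumptions, not the published parameters.

    # Hedged sketch of "pointillistic" processing: speech reduced to synchronous
    # tone pulses at channel centre frequencies. Parameters are illustrative;
    # assumes fs is comfortably above 2 * f_hi.
    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def pointillize(speech, fs, n_channels=16, f_lo=200.0, f_hi=6000.0,
                    pulse_dur=0.02, keep_prob=1.0, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        edges = np.geomspace(f_lo, f_hi, n_channels + 1)       # log-spaced band edges
        pulse_len = int(round(pulse_dur * fs))
        n_pulses = len(speech) // pulse_len
        window = np.hanning(pulse_len)
        t = np.arange(pulse_len) / fs
        out = np.zeros(n_pulses * pulse_len)
        for lo, hi in zip(edges[:-1], edges[1:]):
            sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
            env = np.abs(hilbert(sosfiltfilt(sos, speech)))     # channel envelope
            fc = np.sqrt(lo * hi)                               # channel centre frequency
            for k in range(n_pulses):
                if rng.random() > keep_prob:                    # randomly remove pulses
                    continue
                amp = env[k * pulse_len:(k + 1) * pulse_len].mean()
                out[k * pulse_len:(k + 1) * pulse_len] += (
                    amp * window * np.sin(2 * np.pi * fc * t))
        return out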


International Conference on Latent Variable Analysis and Signal Separation | 2018

Generating Talking Face Landmarks from Speech

Sefik Emre Eskimez; Ross K. Maddox; Chenliang Xu; Zhiyao Duan

The presence of a corresponding talking face has been shown to significantly improve speech intelligibility in noisy conditions and for hearing-impaired populations. In this paper, we present a system that can generate landmark points of a talking face from acoustic speech in real time. The system uses a long short-term memory (LSTM) network and is trained on frontal videos of 27 different speakers with automatically extracted face landmarks. After training, it can produce talking face landmarks from the acoustic speech of unseen speakers and utterances. The training phase contains three key steps. We first transform landmarks of the first video frame to pin the two eye points into two predefined locations and apply the same transformation on all of the following video frames. We then remove the identity information by transforming the landmarks into a mean face shape across the entire training dataset. Finally, we train an LSTM network that takes the first- and second-order temporal differences of the log-mel spectrogram as input to predict face landmarks in each frame. We evaluate our system using the mean-squared error (MSE) between predicted and ground-truth lip landmarks, as well as their first- and second-order temporal differences. We further evaluate our system by conducting subjective tests, in which subjects try to distinguish real from generated videos of talking face landmarks. Both tests show promising results.
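
A minimal sketch of this kind of sequence model is shown below, using PyTorch; the hidden size, mel-band count, and 68-point landmark layout are assumptions rather than the paper's exact configuration.

    # Rough sketch of an LSTM mapping delta / delta-delta log-mel features to
    # 2-D face-landmark coordinates, trained with an MSE loss. Dimensions are
    # illustrative assumptions.
    import torch
    import torch.nn as nn

    N_MELS = 64                 # log-mel bands (assumed)
    N_LANDMARKS = 68            # standard 68-point face landmark set (assumed)
    FEAT_DIM = 2 * N_MELS       # first- and second-order temporal differences

    class SpeechToLandmarks(nn.Module):
        def __init__(self, hidden_size=256, num_layers=2):
            super().__init__()
            self.lstm = nn.LSTM(FEAT_DIM, hidden_size, num_layers,
                                batch_first=True)
            self.proj = nn.Linear(hidden_size, 2 * N_LANDMARKS)  # (x, y) per point

        def forward(self, feats):
            # feats: (batch, time, FEAT_DIM) -> (batch, time, 2 * N_LANDMARKS)
            out, _ = self.lstm(feats)
            return self.proj(out)

    model = SpeechToLandmarks()
    feats = torch.randn(8, 100, FEAT_DIM)           # a batch of 100-frame utterances
    target = torch.randn(8, 100, 2 * N_LANDMARKS)   # ground-truth landmark coordinates
    loss = nn.MSELoss()(model(feats), target)       # MSE on landmark positions
    loss.backward()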

Collaboration


Dive into Ross K. Maddox's collaborations.

Top Co-Authors

Adrian Lee (University of Washington)
Chenliang Xu (University of Rochester)
Dean Pospisil (University of Washington)
Zhiyao Duan (University of Rochester)
Huriye Atilgan (University College London)