Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Andrew J. R. Simpson is active.

Publication


Featured research published by Andrew J. R. Simpson.


Speech Communication | 1998

The effect of cue-enhancement on the intelligibility of nonsense word and sentence materials presented in noise

Valerie Hazan; Andrew J. R. Simpson

Two sets of experiments were performed to test the perceptual benefits of enhancing consonantal regions which contain a high density of acoustic cues to phonemic contrasts. In the first set, hand-annotated consonantal regions of natural vowel–consonant–vowel (VCV) stimuli were amplified to increase their salience, and filtered to stylise the cues they contained. In the second set, corresponding regions in natural semantically-unpredictable sentence (SUS) material were annotated and enhanced in the same way. Both sets of stimuli were combined with speech-shaped noise and presented to normally-hearing listeners. The VCV experiments showed statistically significant improvements in intelligibility as a result of enhancement; significant improvements were also obtained for sentence material after some adjustments in enhancement strategies and levels. These results demonstrate the benefits gained from enhancement techniques which use knowledge of acoustic cues to phonetic contrasts to improve the intelligibility of speech in the presence of background noise.
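
The amplification step lends itself to a short illustration. Below is a minimal sketch assuming hand-annotated region boundaries in seconds and a uniform gain; the gain, ramp length, and annotation format are hypothetical stand-ins, not the values or procedure used by Hazan and Simpson.

```python
import numpy as np

def enhance_regions(signal, sample_rate, regions, gain_db=6.0, ramp_ms=10.0):
    """Amplify annotated consonantal regions of a speech signal.

    regions: list of (start_s, end_s) pairs in seconds. The gain and
    ramp length are illustrative assumptions, not the paper's values.
    """
    out = signal.astype(float).copy()
    gain = 10.0 ** (gain_db / 20.0)
    for start_s, end_s in regions:
        i0 = int(start_s * sample_rate)
        i1 = int(end_s * sample_rate)
        env = np.full(i1 - i0, gain)
        # Short linear ramps avoid audible clicks at the region edges.
        ramp = min(int(sample_rate * ramp_ms / 1000.0), (i1 - i0) // 2)
        env[:ramp] = np.linspace(1.0, gain, ramp)
        env[len(env) - ramp:] = np.linspace(gain, 1.0, ramp)
        out[i0:i1] *= env
    return out
```

Mixing the enhanced signal with speech-shaped noise at a fixed signal-to-noise ratio would then reproduce the presentation conditions described above.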


PLOS Neglected Tropical Diseases | 2016

Diagnostics in Ebola Virus Disease in Resource-Rich and Resource-Limited Settings

Robert J. Shorten; Colin S. Brown; Michael Jacobs; Simon Rattenbury; Andrew J. R. Simpson; Stephen Mepham

The Ebola virus disease (EVD) outbreak in West Africa was unprecedented in scale and location. Limited access to both diagnostic and supportive pathology assays in both resource-rich and resource-limited settings had a detrimental effect on the identification and isolation of cases as well as individual patient management. Limited access to such assays in resource-rich settings resulted in delays in differentiating EVD from other illnesses in returning travellers, in turn utilising valuable resources until a diagnosis could be made. This had a much greater impact in West Africa, where it contributed to the initial failure to contain the outbreak. This review explores diagnostic assays of use in EVD in both resource-rich and resource-limited settings, including their respective limitations, and some novel assays and approaches that may be of use in future outbreaks.


PLOS ONE | 2013

Tuning of Human Modulation Filters Is Carrier-Frequency Dependent

Andrew J. R. Simpson; Joshua D. Reiss; David McAlpine

Recent studies employing speech stimuli to investigate ‘cocktail-party’ listening have focused on entrainment of cortical activity to modulations at syllabic (5 Hz) and phonemic (20 Hz) rates. The data suggest that cortical modulation filters (CMFs) are dependent on the sound-frequency channel in which modulations are conveyed, potentially underpinning a strategy for separating speech from background noise. Here, we characterize modulation filters in human listeners using a novel behavioral method. Within an ‘inverted’ adaptive forced-choice increment detection task, listening level was varied whilst contrast was held constant for ramped increments with effective modulation rates between 0.5 and 33 Hz. Our data suggest that modulation filters are tonotopically organized (i.e., vary along the primary, frequency-organized, dimension). This suggests that the human auditory system is optimized to track rapid (phonemic) modulations at high sound-frequencies and slow (prosodic/syllabic) modulations at low frequencies.
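
As a rough illustration of the stimulus construction, the sketch below applies a single linearly ramped level increment to a carrier. Treating one up-plus-down ramp pair as one modulation period (so the effective rate is roughly 1 / total ramp duration) and the 3 dB depth are assumptions for illustration, not the paper's parameters.

```python
import numpy as np

def ramped_increment(carrier, sample_rate, rate_hz, depth_db=3.0):
    """Apply one linearly ramped level increment, centred in the carrier.

    One up-ramp plus down-ramp is treated as one modulation period, so
    the effective modulation rate is 1 / (total ramp duration). Mapping
    and depth are illustrative assumptions. Assumes the carrier is at
    least one modulation period long.
    """
    half = int(sample_rate / (2.0 * rate_hz))   # samples per half-ramp
    peak = 10.0 ** (depth_db / 20.0)
    env = np.ones(len(carrier))
    mid = len(carrier) // 2
    env[mid - half:mid] = np.linspace(1.0, peak, half)
    env[mid:mid + half] = np.linspace(peak, 1.0, half)
    return carrier * env
```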


Conference of the International Speech Communication Association | 2016

Combining Mask Estimates for Single Channel Audio Source Separation using Deep Neural Networks

Emad M. Grais; Gerard Roma; Andrew J. R. Simpson; Mark D. Plumbley

Deep neural networks (DNNs) are usually used for single channel source separation to predict either soft or binary time frequency masks. The masks are used to separate the sources from the mixed signal. Binary masks produce separated sources with more distortion and less interference than soft masks. In this paper, we propose to use another DNN to combine the estimates of binary and soft masks to achieve the advantages and avoid the disadvantages of using each mask individually. We aim to achieve separated sources with low distortion and low interference between each other. Our experimental results show that combining the estimates of binary and soft masks using DNN achieves lower distortion than using each estimate individually and achieves as low interference as the binary mask.
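
A minimal sketch of the combination idea, with the second DNN replaced by a generic `combiner` callable; the feature layout and the stand-in combiner are assumptions for illustration, not the authors' architecture.

```python
import numpy as np

def combine_mask_estimates(mix_mag, soft_mask, binary_mask, combiner):
    """Fuse soft- and binary-mask source estimates for one source.

    mix_mag: mixture magnitude spectrogram (freq x time).
    combiner: stands in for the second DNN, mapping the stacked
    features to a refined magnitude estimate.
    """
    feats = np.stack(
        [mix_mag, soft_mask * mix_mag, binary_mask * mix_mag], axis=-1)
    return combiner(feats)

# Trivial stand-in combiner: average the two masked spectra.
toy_combiner = lambda f: 0.5 * (f[..., 1] + f[..., 2])
```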


European Signal Processing Conference | 2016

Evaluation of audio source separation models using hypothesis-driven non-parametric statistical methods

Andrew J. R. Simpson; Gerard Roma; Emad M. Grais; Russell Mason; Christopher Hummersone; Antoine Liutkus; Mark D. Plumbley

Audio source separation models are typically evaluated using objective separation quality measures, but rigorous statistical methods have yet to be applied to the problem of model comparison. As a result, it can be difficult to establish whether or not reliable progress is being made during the development of new models. In this paper, we provide a hypothesis-driven statistical analysis of the results of the recent source separation SiSEC challenge involving twelve competing models tested on separation of voice and accompaniment from fifty pieces of “professionally produced” contemporary music. Using non-parametric statistics, we establish reliable evidence for meaningful conclusions about the performance of the various models.
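
This style of analysis can be imitated at small scale with standard non-parametric tools. The sketch below compares two models' per-track scores with a Wilcoxon signed-rank test; the scores are synthetic placeholders, and the specific test choice here is an assumption rather than a claim about the paper's exact procedure.

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical per-track SDR scores (dB) for two separation models on
# the same 50 test pieces; real evaluation scores would go here.
rng = np.random.default_rng(0)
sdr_a = rng.normal(5.0, 2.0, size=50)
sdr_b = sdr_a + rng.normal(0.3, 1.0, size=50)

# Paired non-parametric test: no normality assumption on the scores.
stat, p = wilcoxon(sdr_b, sdr_a)
print(f"Wilcoxon W = {stat:.1f}, p = {p:.4f}")
```

With twelve competing models, pairwise comparisons like this would need a multiple-comparison correction (e.g. Bonferroni).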


International Conference on Latent Variable Analysis and Signal Separation | 2017

Discriminative Enhancement for Single Channel Audio Source Separation using Deep Neural Networks

Emad M. Grais; Gerard Roma; Andrew J. R. Simpson; Mark D. Plumbley

The sources separated by most single channel audio source separation techniques are usually distorted and each separated source contains residual signals from the other sources. To tackle this problem, we propose to enhance the separated sources to decrease the distortion and interference between the separated sources using deep neural networks (DNNs). Two different DNNs are used in this work. The first DNN is used to separate the sources from the mixed signal. The second DNN is used to enhance the separated signals. To consider the interactions between the separated sources, we propose to use a single DNN to enhance all the separated sources together. To reduce the residual signals of one source from the other separated sources (interference), we train the DNN for enhancement discriminatively to maximize the dissimilarity between the predicted sources. The experimental results show that using discriminative enhancement decreases the distortion and interference between the separated sources.
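
The discriminative training idea reduces to an objective that rewards fit to each source's own reference while penalising similarity to the other source. A minimal sketch, with the exact functional form and the weight `lam` as assumptions:

```python
import numpy as np

def discriminative_loss(est1, est2, ref1, ref2, lam=0.05):
    """Sketch of a discriminative enhancement objective.

    Minimises reconstruction error for each source while maximising
    the dissimilarity between each estimate and the *other* source's
    reference. Form and weight are illustrative assumptions.
    """
    fit = np.mean((est1 - ref1) ** 2) + np.mean((est2 - ref2) ** 2)
    cross = np.mean((est1 - ref2) ** 2) + np.mean((est2 - ref1) ** 2)
    return fit - lam * cross
```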


PLOS ONE | 2013

The dynamic range paradox: a central auditory model of intensity change detection

Andrew J. R. Simpson; Joshua D. Reiss

In this paper we use empirical loudness modeling to explore a perceptual sub-category of the dynamic range problem of auditory neuroscience. Humans are able to reliably report perceived intensity (loudness), and discriminate fine intensity differences, over a very large dynamic range. It is usually assumed that loudness and intensity change detection operate upon the same neural signal, and that intensity change detection may be predicted from loudness data and vice versa. However, while loudness grows as intensity is increased, improvement in intensity discrimination performance does not follow the same trend, and so dynamic range estimations of the underlying neural signal from loudness data contradict estimations based on intensity just-noticeable difference (JND) data. In order to account for this apparent paradox we draw on recent advances in auditory neuroscience. We test the hypothesis that a central model, featuring central adaptation to the mean loudness level and operating on the detection of maximum central-loudness rate of change, can account for the paradoxical data. We use numerical optimization to find adaptation parameters that fit data for continuous-pedestal intensity change detection over a wide dynamic range. The optimized model is tested on a selection of equivalent pseudo-continuous intensity change detection data. We also report a supplementary experiment which confirms the modeling assumption that the detection process may be modeled as rate-of-change. Data are obtained from a listening test (N = 10) using linearly ramped increment-decrement envelopes applied to pseudo-continuous noise with an overall level of 33 dB SPL. Increments with half-ramp durations between 5 and 50,000 ms are used. The intensity JND is shown to increase towards long duration ramps (p < 10⁻⁶). From the modeling, the following central adaptation parameters are derived: central dynamic range of 0.215 sones, 95% central normalization, and a central loudness JND constant of 5.5×10⁻⁵ sones per ms. Through our findings, we argue that loudness reflects peripheral neural coding, and the intensity JND reflects central neural coding.
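
A schematic reading of the detection model, using the paper's reported parameter values as defaults but with the computation itself a loose sketch rather than the authors' implementation:

```python
import numpy as np

def detection_statistic(loudness, dt_ms, norm=0.95, dyn_range=0.215):
    """Max rate of change of a centrally adapted loudness trace.

    loudness: instantaneous loudness (sones) sampled every dt_ms.
    Subtracting `norm` of the mean mimics central normalisation;
    clipping mimics the limited central dynamic range. Schematic only.
    """
    central = loudness - norm * np.mean(loudness)
    central = np.clip(central, 0.0, dyn_range)
    return np.max(np.abs(np.diff(central))) / dt_ms   # sones per ms
```

Under this reading, an increment becomes detectable when the statistic exceeds the central loudness JND constant (5.5×10⁻⁵ sones per ms above).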


International Conference on Spoken Language Processing | 1996

Enhancing information-rich regions of natural VCV and sentence materials presented in noise

Valerie Hazan; Andrew J. R. Simpson

Two sets of experiments to test the perceptual benefits of enhancing information-rich regions of consonants in natural speech were performed. In the first set, hand-annotated consonantal regions of natural vowel-consonant-vowel (VCV) stimuli were amplified to increase their salience, and filtered to stylize the cues they contained. In the second set, natural semantically unpredictable sentence (SUS) material was annotated and enhanced in the same way. Both sets of stimuli were combined with speech-shaped noise and presented to normally-hearing listeners. Both sets of experiments showed statistically significant improvements in intelligibility as a result of enhancement, although the increase was greater for VCV than for SUS. These results demonstrate the benefits gained from enhancement techniques which use knowledge about acoustic cues to phonetic contrasts to improve the resistance of speech to noise.


Frontiers in Psychology | 2015

Auditory scene analysis and sonified visual images. Does consonance negatively impact on object formation when using complex sonified stimuli?

David J. Brown; Andrew J. R. Simpson; Michael J. Proulx

A critical task for the brain is the sensory representation and identification of perceptual objects in the world. When the visual sense is impaired, hearing and touch must take primary roles and in recent times compensatory techniques have been developed that employ the tactile or auditory system as a substitute for the visual system. Visual-to-auditory sonifications provide a complex, feature-based auditory representation that must be decoded and integrated into an object-based representation by the listener. However, we don’t yet know what role the auditory system plays in the object integration stage and whether the principles of auditory scene analysis apply. Here we used coarse sonified images in a two-tone discrimination task to test whether auditory feature-based representations of visual objects would be confounded when their features conflicted with the principles of auditory consonance. We found that listeners (N = 36) performed worse in an object recognition task when the auditory feature-based representation was harmonically consonant. We also found that this conflict was not negated with the provision of congruent audio–visual information. The findings suggest that early auditory processes of harmonic grouping dominate the object formation process and that the complexity of the signal, and additional sensory information have limited effect on this.
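
For orientation, a generic visual-to-auditory mapping of the kind used to build such stimuli: image rows map to tone frequencies and columns are scanned in time. The frequency range and scan rate below are illustrative assumptions, not the study's stimulus parameters.

```python
import numpy as np

def sonify(image, sample_rate=16000, col_dur=0.1, f_lo=200.0, f_hi=4000.0):
    """Sonify a coarse binary image (rows -> pitch, columns -> time)."""
    n_rows, n_cols = image.shape
    freqs = np.geomspace(f_hi, f_lo, n_rows)   # top row = highest pitch
    n = int(sample_rate * col_dur)
    t = np.arange(n) / sample_rate
    cols = []
    for c in range(n_cols):
        active = freqs[image[:, c] > 0]
        tone = sum(np.sin(2 * np.pi * f * t) for f in active)
        cols.append(tone if np.ndim(tone) else np.zeros(n))
    return np.concatenate(cols)
```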


PLOS ONE | 2013

Syncopation and the Score

Chunyang Song; Andrew J. R. Simpson; Christopher Harte; Marcus T. Pearce; Mark B. Sandler

The score is a symbolic encoding that describes a piece of music, written according to the conventions of music theory, which must be rendered as sound (e.g., by a performer) before it may be perceived as music by the listener. In this paper we provide a step towards unifying music theory with music perception in terms of the relationship between notated rhythm (i.e., the score) and perceived syncopation. In our experiments we evaluated this relationship by manipulating the score, rendering it as sound and eliciting subjective judgments of syncopation. We used a metronome to provide explicit cues to the prevailing rhythmic structure (as defined in the time signature). Three-bar scores with time signatures of 4/4 and 6/8 were constructed using repeated one-bar rhythm-patterns, with each pattern built from basic half-bar rhythm-components. Our manipulations gave rise to various rhythmic structures, including polyrhythms and rhythms with missing strong- and/or down-beats. Listeners (N = 10) were asked to rate the degree of syncopation they perceived in response to a rendering of each score. We observed higher degrees of syncopation in time signatures of 6/8, for polyrhythms, and for rhythms featuring a missing down-beat. We also found that the location of a rhythm-component within the bar has a significant effect on perceived syncopation. Our findings provide new insight into models of syncopation and point the way towards areas in which the models may be improved.
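
The stimulus construction can be sketched with a hypothetical half-bar component inventory; the components below and the timing scheme are illustrative stand-ins, not the paper's actual component set.

```python
# Hypothetical half-bar rhythm-components, as onset times in units of
# one half-bar (the paper defines its own component inventory).
COMPONENTS = {
    "A": [0.0],        # onset on the beat
    "B": [0.5],        # off-beat onset only
    "C": [0.0, 0.5],   # onsets on and off the beat
    "-": [],           # silent half-bar
}

def bar_onsets(code, bar_dur):
    """Onset times in seconds for a one-bar pattern built from two
    half-bar components, e.g. code='AB'; repeat for a three-bar score."""
    half = bar_dur / 2.0
    return [i * half + t * half
            for i, c in enumerate(code)
            for t in COMPONENTS[c]]
```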

Collaboration


Dive into Andrew J. R. Simpson's collaborations.

Top Co-Authors

Gerard Roma, Georgia Institute of Technology
Joshua D. Reiss, Queen Mary University of London
Michael J. Terrell, Queen Mary University of London
Valerie Hazan, University College London
Mark B. Sandler, Queen Mary University of London
Andrew McPherson, Queen Mary University of London
David J. Brown, Queen Mary University of London