Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where André Coy is active.

Publication


Featured research published by André Coy.


Computer Speech & Language | 2010

Speech fragment decoding techniques for simultaneous speaker identification and speech recognition

Jon Barker; Ning Ma; André Coy; Martin Cooke

This paper addresses the problem of recognising speech in the presence of a competing speaker. We review a speech fragment decoding technique that treats segregation and recognition as coupled problems. Data-driven techniques are used to segment a spectro-temporal representation into a set of fragments, such that each fragment is dominated by one or other of the speech sources. A speech fragment decoder is used which employs missing data techniques and clean speech models to simultaneously search for the set of fragments and the word sequence that best matches the target speaker model. The paper investigates the performance of the system on a recognition task employing artificially mixed target and masker speech utterances. The fragment decoder produces significantly lower error rates than a conventional recogniser, and mimics the pattern of human performance that is produced by the interplay between energetic and informational masking. However, at around 0 dB the performance is generally quite poor. An analysis of the errors shows that a large number of target/masker confusions are being made. The paper presents a novel fragment-based speaker identification approach that allows the target speaker to be reliably identified across a wide range of SNRs. This component is combined with the recognition system to produce significant improvements. When the target and masker utterances have the same gender, the recognition system performs at 0 dB as well as human listeners; in other conditions the error rate is roughly twice the human error rate.
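To make the decoding idea above concrete, here is a minimal, hypothetical sketch of the fragment-subset search. It is a toy reconstruction rather than the authors' system: a single diagonal-Gaussian "clean speech" model stands in for the HMM word models, the word-sequence search is dropped, and all 2^N target/background labellings of the fragments are enumerated exhaustively (the paper's decoder searches this space efficiently). All names are illustrative.

import itertools

import numpy as np
from scipy.stats import norm

def missing_data_loglik(spec, mask, mu, sigma):
    """Missing-data scoring: cells assigned to the target (mask True)
    are scored under the clean-speech model; background cells are fully
    marginalised and so contribute log 1 = 0 to the total."""
    return np.sum(norm.logpdf(spec, loc=mu, scale=sigma)[mask])

def decode_fragments(spec, fragments, mu, sigma):
    """Brute-force search over all foreground/background labellings of
    the fragments for the one whose target cells best match the
    clean-speech model (toy stand-in for the paper's decoder)."""
    n = len(fragments)
    best_score, best_subset = -np.inf, ()
    for k in range(n + 1):
        for subset in itertools.combinations(range(n), k):
            mask = np.zeros(spec.shape, dtype=bool)
            for i in subset:
                mask |= fragments[i]
            score = missing_data_loglik(spec, mask, mu, sigma)
            if score > best_score:
                best_score, best_subset = score, subset
    return best_subset, best_score

# Toy data: a 20-frame, 8-channel "log-spectrogram" and three fragments,
# the first of which is made to match the clean-speech model.
rng = np.random.default_rng(0)
spec = rng.normal(size=(20, 8))                 # background "noise"
mu, sigma = np.zeros(8), np.full(8, 0.1)        # toy clean-speech model
fragments = []
for t0 in (0, 7, 14):
    frag = np.zeros(spec.shape, dtype=bool)
    frag[t0:t0 + 6, :4] = True
    fragments.append(frag)
spec[fragments[0]] = rng.normal(scale=0.1, size=fragments[0].sum())
subset, score = decode_fragments(spec, fragments, mu, sigma)
print("target fragments:", subset)              # expected: (0,)

In the full system this labelling search is interleaved with the word search, so segregation and recognition constrain each other rather than being solved in sequence.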


Speech Communication | 2007

An automatic speech recognition system based on the scene analysis account of auditory perception

André Coy; Jon Barker

Despite many years of concentrated research, the performance gap between automatic speech recognition (ASR) and human speech recognition (HSR) remains large. The difference between ASR and HSR is particularly evident when considering the response to additive noise. Whereas human performance is remarkably robust, ASR systems are brittle and only operate well within the narrow range of noise conditions for which they were designed. This paper considers how humans may achieve noise robustness. We take the view that robustness is achieved because the human perceptual system treats the problems of speech recognition and sound source separation as being tightly coupled. Taking inspiration from Bregman's Auditory Scene Analysis account of auditory organisation, we present a speech recognition system which couples these processes by using a combination of primitive and schema-driven processes: first, a set of coherent spectro-temporal fragments is generated by primitive segmentation techniques; then, a decoder based on statistical ASR techniques performs a simultaneous search for the correct background/foreground segmentation and word sequence hypothesis. Mutually supporting solutions to both the source segmentation and speech recognition problems arise as a result. The decoder is tested on a challenging corpus of connected digit strings mixed monaurally at 0 dB and recognition performance is compared with that achieved by listeners using identical data. The results, although preliminary, are encouraging and suggest that techniques which interface ASA and statistical ASR have great potential. The paper concludes with a discussion of future research directions that may further develop this class of perceptually motivated ASR solutions.
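The evaluation corpora in this line of work are artificial monaural mixtures at a controlled target-to-masker ratio, with 0 dB meaning equal energy. A minimal sketch of such mixing, assuming the two utterances are available as sampled waveforms in NumPy arrays; the function name and trimming convention are mine, not the corpus tooling's:

import numpy as np

def mix_at_snr(target, masker, snr_db):
    """Scale the masker so the target-to-masker energy ratio equals
    snr_db, then sum into a single (monaural) channel; at 0 dB the two
    sources contribute equal energy."""
    n = min(len(target), len(masker))           # trim to common length
    target, masker = target[:n], masker[:n]
    gain = np.sqrt(np.sum(target**2) / (np.sum(masker**2) * 10 ** (snr_db / 10)))
    return target + gain * masker

# Example: two toy one-second "utterances" at 16 kHz, mixed at 0 dB.
rng = np.random.default_rng(1)
mix = mix_at_snr(rng.normal(size=16000), rng.normal(size=16000), snr_db=0.0)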


International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2005

Recognising speech in the presence of a competing speaker using a 'speech fragment decoder'

André Coy; Jon Barker

This paper addresses the problem of recognising speech in the presence of a competing speech source. A novel two stage approach is described. A spectral representation is first divided into a set of spectro-temporal fragments where each fragment is believed to be due to a single acoustic source. An unknown subset of these will be due to the target speech source. The standard ASR search is then extended to find the most likely combination of speech model sequence and fragment subset. The technique is tested with a fragment generation stage using pitch information to locate harmonic energy components, and image processing techniques to segment the inharmonic regions of the spectrogram. The system achieves an accuracy of 65.1% on a 0 dB simultaneous connected digit sequence task with cross-gender mixtures. Extension of the technique to handle matched-gender utterances is discussed.
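As a rough illustration of the image-processing flavour of the fragment generation stage described above, the sketch below thresholds a log-spectrogram and treats each connected region of above-threshold energy as a candidate fragment. This is deliberately much simpler than the paper's pitch-based harmonic grouping and inharmonic-region segmentation; the function name and threshold are illustrative assumptions.

import numpy as np
from scipy import ndimage

def connected_fragments(log_spec, threshold_db=-6.0):
    """Return one boolean mask per connected region of energy within
    threshold_db of the spectrogram peak (4-connectivity, the
    scipy.ndimage.label default in 2-D)."""
    active = log_spec > log_spec.max() + threshold_db
    labels, n = ndimage.label(active)
    return [labels == i for i in range(1, n + 1)]

# Toy usage on a random (time x frequency) log-magnitude array.
rng = np.random.default_rng(2)
log_spec = 20 * np.log10(np.abs(rng.normal(size=(50, 64))) + 1e-6)
print(len(connected_fragments(log_spec)), "candidate fragments")

Fragments of this kind are exactly the input consumed by the subset search sketched under the 2010 paper above.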


Conference of the International Speech Communication Association (Interspeech) | 2015

Remote Speech Technology for Speech Professionals - the CloudCAST initiative

Phil D. Green; Ricard Marxer; Stuart P. Cunningham; Heidi Christensen; Frank Rudzicz; Maria Yancheva; André Coy; Massimiliano Malavasi; Lorenzo Desideri

Clinical applications of speech technology face two challenges. The first is data sparsity. There is little data available to underpin techniques which are based on machine learning and, because it is difficult to collect disordered speech corpora, the only way to address this problem is by pooling what is produced from systems which are already in use. The second is personalisation. This field demands individual solutions: technology which adapts to its user rather than demanding that the user adapt to it. Here we introduce a project, CloudCAST, which addresses these two problems by making remote, adaptive technology available to professionals who work with speech: therapists, educators and clinicians.

Index Terms: assistive technology, clinical applications of speech technology


Conference of the International Speech Communication Association (Interspeech) | 2016

CloudCAST - Remote Speech Technology for Speech Professionals.

Phil D. Green; Ricard Marxer; Stuart P. Cunningham; Heidi Christensen; Frank Rudzicz; Maria Yancheva; André Coy; Massimiliano Malavasi; Lorenzo Desideri; Fabio Tamburini

Recent advances in speech technology are potentially of great benefit to the professionals who help people with speech problems: therapists, pathologists, educators and clinicians. There are three obstacles to progress which we seek to address in the CloudCAST project:

• the design of applications deploying the technology should be user-driven;
• the computing resource should be available remotely;
• the software should be capable of personalisation: clinical applications demand individual solutions.

CloudCAST aims to provide such a resource and, in addition, to gather the data produced as the applications are used, to underpin the machine learning required for further progress.


Speech Communication | 2007

Exploiting correlogram structure for robust speech recognition with multiple speech sources

Ning Ma; Phil D. Green; Jon Barker; André Coy


Conference of the International Speech Communication Association (Interspeech) | 2006

Recent advances in speech fragment decoding techniques

Jon Barker; André Coy; Ning Ma; Martin Cooke


Conference of the International Speech Communication Association (Interspeech) | 2006

A multipitch tracker for monaural speech segmentation

André Coy; Jon Barker


Conference of the International Speech Communication Association (Interspeech) | 2005

Soft Harmonic Masks for Recognising Speech in the Presence of a Competing Speaker

André Coy; Jon Barker


AAATE Conf. | 2017

Cloud-Based Speech Technology for Assistive Technology Applications (CloudCAST).

Stuart P. Cunningham; Phil D. Green; Heidi Christensen; José Joaquín Atria; André Coy; Massimiliano Malavasi; Lorenzo Desideri; Frank Rudzicz

Collaboration


Dive into André Coy's collaborations.

Top Co-Authors

Jon Barker, University of Sheffield
Ning Ma, University of Sheffield