Helen L. Bear
University of East Anglia
Publications
Featured research published by Helen L. Bear.
International Conference on Acoustics, Speech, and Signal Processing | 2016
Helen L. Bear; Richard W. Harvey
To undertake machine lip-reading, we try to recognise speech from a visual signal. Current work often uses viseme classification supported by language models, with varying degrees of success. A few recent works suggest that phoneme classification, in the right circumstances, can outperform viseme classification. In this work we present a novel two-pass method of training phoneme classifiers which uses previously trained viseme classifiers in the first pass. With our new training algorithm, we show classification performance which significantly improves on previous lip-reading results.
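The two-pass structure can be sketched briefly. The code below is a hedged illustration only, using flat classifiers rather than the paper's recognition pipeline; `X`, `phoneme_labels`, and `phoneme_to_viseme` are hypothetical inputs.

```python
# A minimal sketch of the two-pass idea, assuming feature vectors X,
# phoneme labels, and a phoneme->viseme map as inputs. Illustrative only;
# not the paper's actual classifiers.
import numpy as np
from sklearn.linear_model import LogisticRegression

def two_pass_train(X, phoneme_labels, phoneme_to_viseme):
    # Pass 1: collapse phoneme labels to visemes and train a viseme classifier.
    viseme_labels = np.array([phoneme_to_viseme[p] for p in phoneme_labels])
    viseme_clf = LogisticRegression(max_iter=1000).fit(X, viseme_labels)

    # Pass 2: augment the visual features with the viseme posteriors,
    # then train phoneme classifiers on the augmented representation.
    X_aug = np.hstack([X, viseme_clf.predict_proba(X)])
    phoneme_clf = LogisticRegression(max_iter=1000).fit(X_aug, phoneme_labels)
    return viseme_clf, phoneme_clf

def two_pass_predict(viseme_clf, phoneme_clf, X):
    # Recognition reuses the first-pass posteriors as extra evidence.
    X_aug = np.hstack([X, viseme_clf.predict_proba(X)])
    return phoneme_clf.predict(X_aug)
```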
International Symposium on Visual Computing | 2014
Helen L. Bear; Richard W. Harvey; Barry-John Theobald; Yuxuan Lan
A critical assumption of all current visual speech recognition systems is that there are visual speech units, called visemes, which can be mapped to the units of acoustic speech, the phonemes. Although a number of maps have been published, their effectiveness is rarely tested, particularly on visual-only lip-reading (many works use audio-visual speech). Here we examine 120 mappings and consider whether any are stable across talkers. We show a method for devising maps based on phoneme confusions from an automated lip-reading system, and we present new mappings that show improvements for individual talkers.
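One plausible reading of the confusion-based map construction is to cluster phonemes that the recogniser confuses with one another. The sketch below is hedged and illustrative; the hierarchical clustering stands in for the authors' actual procedure, and `confusion` (a square matrix of phoneme confusion counts) and `phonemes` are assumed inputs.

```python
# A hedged sketch: derive a phoneme-to-viseme map by grouping phonemes
# with similar confusion patterns. Not necessarily the authors' method.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def confusion_to_viseme_map(confusion, phonemes, n_visemes):
    # Symmetrise and normalise the raw confusion counts into similarities.
    sim = confusion + confusion.T
    sim = sim / sim.sum(axis=1, keepdims=True).clip(min=1e-9)
    sim = (sim + sim.T) / 2.0  # restore symmetry after row-normalising

    # Turn similarity into a distance and cluster hierarchically into
    # the requested number of viseme classes.
    dist = 1.0 - sim / sim.max()
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    groups = fcluster(tree, t=n_visemes, criterion="maxclust")
    return {p: f"V{g}" for p, g in zip(phonemes, groups)}
```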
International Conference on Image Processing | 2014
Helen L. Bear; Richard W. Harvey; Barry-John Theobald; Yuxuan Lan
Visual-only speech recognition depends on a number of factors that can be difficult to control, such as lighting, identity, motion, emotion, and expression. But some factors, such as video resolution, are controllable, so it is surprising that there is not yet a systematic study of the effect of resolution on lip-reading. Here we use a new data set, the Rosetta Raven data, to train and test recognisers so we can measure the effect of video resolution on recognition accuracy. We conclude that, contrary to common practice, resolution need not be high for automatic lip-reading. However, it is highly unlikely that automatic lip-reading can work reliably when the distance between the bottom of the lower lip and the top of the upper lip is less than four pixels at rest.
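A resolution sweep of the kind described here can be mocked up simply: downsample each frame and track how many pixels the lip aperture spans at each scale. This is a hypothetical sketch, not the paper's experimental code; `frame` and `lip_height_px` are assumed inputs.

```python
# Hypothetical sketch of a resolution sweep: downsample a frame and flag
# scales where the resting lip aperture falls below ~4 pixels, the
# reliability threshold suggested above. Not the paper's actual code.
import cv2

def lip_resolution_sweep(frame, lip_height_px, scales=(1.0, 0.5, 0.25, 0.125)):
    results = []
    h, w = frame.shape[:2]
    for s in scales:
        small = cv2.resize(frame, (max(1, int(w * s)), max(1, int(h * s))),
                           interpolation=cv2.INTER_AREA)
        scaled_lip = lip_height_px * s  # lip aperture in pixels at this scale
        usable = scaled_lip >= 4.0      # below ~4 px, lip-reading is unreliable
        results.append((small.shape[1], small.shape[0], scaled_lip, usable))
    return results
```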
arXiv: Computer Vision and Pattern Recognition | 2015
Helen L. Bear; Stephen J. Cox; Richard W. Harvey
arXiv: Computer Vision and Pattern Recognition | 2015
Helen L. Bear; Richard W. Harvey; Yuxuan Lan
arXiv: Computer Vision and Pattern Recognition | 2014
Helen L. Bear; Gari Owen; Richard W. Harvey; Barry-John Theobald
arXiv: Computer Vision and Pattern Recognition | 2017
Helen L. Bear
arXiv: Computer Vision and Pattern Recognition | 2017
Kwanchiva Thangthai; Helen L. Bear; Richard W. Harvey
arXiv: Computer Vision and Pattern Recognition | 2017
Helen L. Bear; Sarah L. Taylor
arXiv: Computer Vision and Pattern Recognition | 2018
Jake Burton; David Frank; Madhi Saleh; Nassir Navab; Helen L. Bear