Ibrahim Almajai
University of East Anglia
Publications
Featured research published by Ibrahim Almajai.
IEEE Transactions on Audio, Speech, and Language Processing | 2011
Ibrahim Almajai; Ben Milner
The aim of this work is to examine whether visual speech information can be used to enhance audio speech that has been contaminated by noise. First, an analysis of audio and visual speech features is made, which identifies the pair with highest audio-visual correlation. The study also reveals that higher audio-visual correlation exists within individual phoneme sounds rather than globally across all speech. This correlation is exploited in the proposal of a visually derived Wiener filter that obtains clean speech and noise power spectrum statistics from visual speech features. Clean speech statistics are estimated from visual features using a maximum a posteriori framework that is integrated within the states of a network of hidden Markov models to provide phoneme localization. Noise statistics are obtained through a novel audio-visual voice activity detector which utilizes visual speech features to make robust speech/nonspeech classifications. The effectiveness of the visually derived Wiener filter is evaluated subjectively and objectively and is compared with three different audio-only enhancement methods over a range of signal-to-noise ratios.
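The enhancement step described above can be sketched in code. This is a minimal illustration, not the paper's implementation: it assumes the clean-speech and noise power spectra have already been estimated (in the paper, from visual features via MAP estimation within an HMM network, and from the audio-visual voice activity detector, respectively) and simply applies the resulting Wiener gain to the noisy spectrum. The function name and interface are hypothetical.

```python
import numpy as np

def wiener_enhance(noisy_stft, clean_psd_est, noise_psd_est, eps=1e-10):
    """Apply a per-frame Wiener gain H = S / (S + N) to a noisy STFT.

    noisy_stft:    complex STFT of the noisy speech, shape (frames, bins)
    clean_psd_est: estimated clean-speech power spectrum per frame
                   (in the paper, derived from visual speech features)
    noise_psd_est: estimated noise power spectrum per frame
                   (in the paper, from the audio-visual VAD)
    eps:           small constant to avoid division by zero
    """
    gain = clean_psd_est / (clean_psd_est + noise_psd_est + eps)
    # The gain lies in [0, 1): bins dominated by estimated speech power
    # are passed through, noise-dominated bins are attenuated.
    return noisy_stft * gain
```

The enhanced spectrum would then be inverted (e.g. by overlap-add) to produce the time-domain signal; that reconstruction step is omitted here.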
International Conference on Acoustics, Speech, and Signal Processing | 2016
Ibrahim Almajai; Stephen J. Cox; Richard W. Harvey; Yuxuan Lan
Recent improvements in tracking and feature extraction mean that speaker-dependent lip-reading of continuous speech using a medium-size vocabulary (around 1000 words) is realistic. However, the recognition of previously unseen speakers has been found to be a very challenging task, because of the large variation in lip shapes across speakers and the lack of large, tracked databases of visual features, which are very expensive to produce. By adapting a technique that is established in speech recognition but has not previously been used in lip-reading, we show that error rates for speaker-independent lip-reading can be significantly reduced. Furthermore, we show that error rates can be reduced further by the additional use of Deep Neural Networks (DNNs). We also find that there is no need to map phonemes to visemes for context-dependent visual speech transcription.
International Conference on Acoustics, Speech, and Signal Processing | 2007
Jonathan Darch; Ben Milner; Ibrahim Almajai; Saeed Vaseghi
This work develops a statistical framework to predict acoustic features (fundamental frequency, formant frequencies and voicing) from MFCC vectors. An analysis of correlation between acoustic features and MFCCs is made both globally across all speech and within phoneme classes, and also from speaker-independent and speaker-dependent speech. This leads to the development of both a global prediction method, using a Gaussian mixture model (GMM) to model the joint density of acoustic features and MFCCs, and a phoneme-specific prediction method using a combined hidden Markov model (HMM)-GMM. Prediction accuracy measurements show the phoneme-dependent HMM-GMM system to be more accurate which agrees with the correlation analysis. Results also show prediction to be more accurate from speaker-dependent speech which also agrees with the correlation analysis.
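The global GMM prediction method described above amounts to computing the conditional expectation of the acoustic features given an MFCC vector under a joint Gaussian mixture. The sketch below shows that standard GMM conditional-mean calculation under the assumption that a joint GMM over z = [x; y] (MFCCs stacked with acoustic features) has already been trained; the function name and interface are illustrative, not from the paper, and the phoneme-specific HMM-GMM variant is not shown.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_conditional_mean(x, weights, means, covs, dx):
    """MMSE prediction E[y | x] from a joint GMM over z = [x; y].

    x:       observed vector (e.g. an MFCC frame), shape (dx,)
    weights: (K,) mixture weights
    means:   (K, dx+dy) joint component means
    covs:    (K, dx+dy, dx+dy) full joint component covariances
    dx:      dimensionality of the observed vector x
    """
    K = len(weights)
    resp = np.empty(K)
    cond_means = []
    for k in range(K):
        mu_x, mu_y = means[k][:dx], means[k][dx:]
        Sxx = covs[k][:dx, :dx]   # covariance of x within component k
        Syx = covs[k][dx:, :dx]   # cross-covariance of y and x
        # Responsibility of component k for the observed x
        resp[k] = weights[k] * multivariate_normal.pdf(x, mu_x, Sxx)
        # Per-component conditional mean of y given x (linear regression)
        cond_means.append(mu_y + Syx @ np.linalg.solve(Sxx, x - mu_x))
    resp /= resp.sum()
    # Responsibility-weighted sum of the per-component predictions
    return sum(r * m for r, m in zip(resp, cond_means))
```

A phoneme-specific variant, as in the paper's HMM-GMM system, would first localize the phoneme with an HMM and then apply a per-phoneme joint GMM of this form.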
European Signal Processing Conference | 2008
Ibrahim Almajai; Ben Milner
AVSP | 2007
Ibrahim Almajai; Ben Milner
International Conference on Acoustics, Speech, and Signal Processing | 2007
Ibrahim Almajai; Ben Milner; Jonathan Darch; Saeed Vaseghi
Conference of the International Speech Communication Association | 2006
Ibrahim Almajai; Ben Milner; Jonathan Darch
Conference of the International Speech Communication Association | 2009
Ibrahim Almajai; Ben Milner
AVSP | 2009
Ibrahim Almajai; Ben Milner
Language Resources and Evaluation | 2014
Jean-Philippe Goldman; Adrian Leeman; Marie-José Kolly; Ingrid Hove; Ibrahim Almajai; Volker Dellwo; Steven Moran