Network


Darryl Stewart's latest external collaborations at the country level.

Hotspot


Dive into the research topics where Darryl Stewart is active.

Publication


Featured research published by Darryl Stewart.


EURASIP Journal on Image and Video Processing | 2008

Comparison of image transform-based features for visual speech recognition in clean and corrupted videos

Rowan Seymour; Darryl Stewart; Ji Ming

We present the results of a study into the performance of a variety of image transform-based feature types for speaker-independent visual speech recognition of isolated digits, including the first reported use of features extracted using a discrete curvelet transform. The study compares methods for selecting features of each type and shows the relative benefits of both static and dynamic visual features. The features are tested on clean video data and on video data corrupted in a variety of ways, to assess each feature type's robustness to potential real-world conditions. One of the test conditions involves a novel form of video corruption we call jitter, which simulates camera and/or head movement during recording.
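
The jitter corruption can be pictured concretely: each frame is re-cropped at a small random offset, mimicking camera or head movement between frames. Below is a minimal NumPy sketch of this idea; the shift magnitude and crop geometry are illustrative assumptions, not the paper's exact protocol.

    import numpy as np

    def jitter(frames, max_shift=4, seed=0):
        """Simulate camera/head movement by cropping each frame at a
        small random offset (illustrative re-creation of 'jitter';
        max_shift is an assumed parameter)."""
        rng = np.random.default_rng(seed)
        h, w = frames[0].shape[:2]
        out = []
        for f in frames:
            dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
            y0 = max_shift + dy          # crop origin, stays in bounds
            x0 = max_shift + dx
            out.append(f[y0:h - 2 * max_shift + y0,
                         x0:w - 2 * max_shift + x0])
        return np.stack(out)             # all crops share the same shape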


IEEE Transactions on Systems, Man, and Cybernetics | 2014

Robust Audio-Visual Speech Recognition Under Noisy Audio-Video Conditions

Darryl Stewart; Rowan Seymour; Adrian Pass; Ji Ming

This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements, and can be used alongside, many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances, with corruption added to the video and/or audio streams using a variety of types (e.g., MPEG-4 video compression) and levels of noise. The experiments show that this approach gives excellent performance in comparison to a well-known dynamic stream weighting approach and to any fixed-weight integration approach, both in clean conditions and when noise is added to either stream. Furthermore, our experiments show that MWSP dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams, and according to the naturally fluctuating relative reliability of the modalities even in clean conditions. MWSP is shown to maintain robust recognition performance in all tested conditions while requiring no prior knowledge about the type or level of noise.
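
The core of such dynamic stream weighting can be sketched as follows: at each frame, candidate audio/video weights are scanned and the weight that maximizes the combined state posterior is retained. This is a simplified illustration of the idea, not the authors' exact formulation.

    import numpy as np

    def mwsp_frame(log_p_audio, log_p_video,
                   weights=np.linspace(0.0, 1.0, 11)):
        """Per frame, pick the stream weight maximizing the combined
        state posterior (sketch of maximum weighted stream posterior
        integration; the weight grid is an assumption).
        Inputs are per-state log-likelihoods for each stream."""
        best_post, best_scores = -np.inf, None
        for lam in weights:
            combined = lam * log_p_audio + (1.0 - lam) * log_p_video
            # Normalize the combined scores to a posterior over states.
            post = np.exp(combined - np.logaddexp.reduce(combined))
            if post.max() > best_post:
                best_post, best_scores = post.max(), combined
        return best_scores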


IEEE Transactions on Speech and Audio Processing | 2005

Subband correlation and robust speech recognition

James McAuley; Ji Ming; Darryl Stewart; Philip Hanna

This paper investigates the effect of modeling subband correlation for noisy speech recognition. Subband feature streams are assumed to be independent in many subband-based speech recognition systems, but experimental results suggest this assumption is unrealistic. We propose a method to incorporate correlation into subband speech feature streams: all possible combinations of subbands are created, and each combination is treated as a single frequency band by calculating a single feature vector for it. The resulting feature vectors therefore capture information about every band in the combination, as well as the dependency across the bands. Although the new features incur higher computational complexity, our experimental results show that they effectively capture the correlation between the subbands while making minimal assumptions about the structure of that correlation. Experiments conducted on the TIDigits database demonstrate improved accuracy for clean speech recognition and improved robustness in the presence of both stationary and nonstationary band-selective noise, compared to a system assuming subband independence.
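
The combination scheme can be made concrete: for B subbands, every non-empty subset is merged into a single band, and one feature vector is computed for the merged band so that cross-band dependency is captured inside the vector itself. A sketch, with the per-combination feature simplified to a DCT of pooled log filter-bank energies (an illustrative assumption):

    from itertools import combinations
    import numpy as np
    from scipy.fft import dct

    def subband_combination_features(log_fbank, n_subbands=4, n_ceps=6):
        """Split a log filter-bank frame into subbands, then compute a
        cepstral-style vector for every subband combination so each
        vector also encodes dependency across its bands (simplified
        illustration; band count and feature size are assumptions)."""
        bands = np.array_split(np.asarray(log_fbank), n_subbands)
        feats = []
        for r in range(1, n_subbands + 1):
            for combo in combinations(range(n_subbands), r):
                merged = np.concatenate([bands[i] for i in combo])
                feats.append(dct(merged, norm='ortho')[:n_ceps])
        return feats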


International Conference on Acoustics, Speech, and Signal Processing | 2005

Speaker identification in unknown noisy conditions - a universal compensation approach

Ji Ming; Darryl Stewart; Saeed Vaseghi

We consider speaker identification involving background noise, assuming no knowledge about the noise characteristics. A new method, universal compensation (UC), is studied as a solution to the problem. UC extends the missing-feature method (recognition based only on reliable data) to be robust to any corruption type, including full corruption that affects all time-frequency components of the speech representation. It achieves robustness to unknown, full noise corruption through a novel combination of multi-condition training and the missing-feature method. The combination of these two strategies makes the new method potentially capable of dealing with arbitrary additive noise, with arbitrary temporal-spectral characteristics, based only on clean speech training data and simulated noise data, without requiring knowledge of the actual noise. The SPIDRE database is used for the evaluation, with various corruptions drawn from real-world noise data. The results obtained are encouraging.
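
The multi-condition half of the recipe amounts to expanding the clean training corpus with simulated noise at several SNRs. A minimal sketch, assuming 1-D waveform arrays and a single noise recording longer than each utterance (both assumptions for illustration):

    import numpy as np

    def multicondition_corpus(clean_utts, noise, snrs_db=(20, 10, 5, 0)):
        """Expand clean training data into multiple simulated noise
        conditions (one ingredient of universal compensation; the
        noise source and SNR grid are illustrative assumptions)."""
        rng = np.random.default_rng(0)
        corpus = list(clean_utts)                  # keep the clean copies
        for x in clean_utts:
            for snr in snrs_db:
                start = rng.integers(0, len(noise) - len(x))
                n = noise[start:start + len(x)]
                # Scale the noise segment to hit the target SNR.
                gain = np.sqrt((x ** 2).mean()
                               / ((n ** 2).mean() * 10 ** (snr / 10)))
                corpus.append(x + gain * n)
        return corpus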


International Conference on Acoustics, Speech, and Signal Processing | 1999

Improving speech recognition performance by using multi-model approaches

Ji Ming; Philip Hanna; Darryl Stewart; Marie Owens; Francis Jack Smith

Most current speech recognition systems are built upon a single type of model, e.g. an HMM or a certain type of segment-based model, and furthermore typically employ only one type of acoustic feature, e.g. MFCCs and their variants. As a result, the system may not be robust if the modeling assumptions are violated. Recent research efforts have investigated the use of multi-scale/multi-band acoustic features for robust speech recognition. This paper describes a multi-model approach as an alternative and complement to the multi-feature approaches. The multi-model approach seeks a combination of different types of acoustic models, thereby integrating the capability of each individual model for capturing discriminative information. An example system built upon the combination of the standard HMM technique with a segment-based modeling technique was implemented. Experiments on both isolated-word and continuous speech recognition show improved performance over each of the individual models considered in isolation.
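
One common way to realize such a model combination is a log-linear interpolation of the two recognizers' hypothesis scores. A minimal sketch under that assumption (the interpolation weight and the score interface are illustrative, not the paper's exact scheme):

    def combined_score(hmm_logp, segment_logp, alpha=0.5):
        """Log-linear combination of an HMM score and a segment-model
        score for the same hypothesis; alpha is an assumed weight."""
        return alpha * hmm_logp + (1.0 - alpha) * segment_logp

    def recognize(hypotheses):
        """hypotheses: list of (word, hmm_logp, segment_logp) triples;
        returns the word whose combined score is highest."""
        return max(hypotheses, key=lambda h: combined_score(h[1], h[2]))[0]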


International Conference on Image Processing | 2010

An investigation into features for multi-view lipreading

Adrian Pass; Jianguo Zhang; Darryl Stewart

In this paper we present, for the first time, results showing the effect of speaker head pose angle on automatic lip-reading performance over a wide range of closely spaced angles. We analyse the effect head pose has on the features themselves and show that, by selecting coefficients with minimum variance with respect to pose angle, recognition performance can be improved when the training and testing pose angles differ. Experiments are conducted using the initial phase of a unique multi-view audio-visual database designed specifically for research and development of pose-invariant lip-reading systems. We first show that it is the higher-order horizontal spatial frequency components that become most detrimental as the pose deviates. Second, we assess the performance of different feature selection masks across a range of pose angles, including a new mask based on minimum cross-pose variance coefficients. We report a relative improvement of 50% in word error rate when using our selection mask over a common energy-based selection during profile-view lip-reading.
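
The minimum cross-pose variance selection can be sketched directly: compute each coefficient's mean per pose angle, measure how much that mean varies across angles, and keep the most stable coefficients. The array shapes and mask size below are illustrative assumptions.

    import numpy as np

    def min_cross_pose_variance_mask(feats_by_pose, n_keep=30):
        """feats_by_pose: dict mapping pose angle -> (n_frames, n_coeffs)
        array of DCT coefficients. Returns the indices of the n_keep
        coefficients whose per-pose means vary least across angles
        (sketch of the minimum cross-pose variance idea)."""
        pose_means = np.stack([f.mean(axis=0)
                               for f in feats_by_pose.values()])
        cross_pose_var = pose_means.var(axis=0)    # variance over poses
        return np.argsort(cross_pose_var)[:n_keep]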


IET Biometrics | 2013

Gender classification via lips: static and dynamic features

Darryl Stewart; Adrian Pass; Jianguo Zhang

Automatic gender classification has many security and commercial applications. Various modalities have been investigated for gender classification, with face-based classification being the most popular. In some real-world scenarios the face may be partially occluded, and classification must then be based on individual parts of the face, known as local features. The authors investigate gender classification using lip movements, and show for the first time that important gender-specific information can be obtained from the way in which a person moves their lips during speech. Furthermore, this study indicates that lip dynamics during speech provide greater gender-discriminative information than lip appearance alone. They also show that lip dynamics and appearance contain complementary gender information, such that a model which captures both traits gives the highest overall classification result. They use discrete cosine transform-based features and Gaussian mixture modelling to model lip appearance and dynamics, and employ the XM2VTS database for their experiments. These experiments show that a model which captures lip dynamics along with appearance can improve gender classification rates by between 16% and 21% compared with models of lip appearance only.
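
The static-plus-dynamic modelling pipeline can be sketched as follows: append temporal difference ("delta") features to the per-frame static lip features and fit one Gaussian mixture model per gender. The simple two-frame delta and the GMM settings are illustrative assumptions, not the authors' exact configuration.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def add_deltas(static):
        """Append first-order temporal differences (the 'dynamic'
        part) to per-frame static lip features."""
        deltas = np.diff(static, axis=0, prepend=static[:1])
        return np.hstack([static, deltas])

    def train_gender_gmm(utterances, n_components=8):
        """utterances: list of (n_frames, n_coeffs) feature arrays
        from one gender; returns a fitted diagonal-covariance GMM."""
        X = np.vstack([add_deltas(u) for u in utterances])
        return GaussianMixture(n_components, covariance_type='diag').fit(X)

    def classify(utterance, gmm_male, gmm_female):
        X = add_deltas(utterance)
        return 'male' if gmm_male.score(X) > gmm_female.score(X) else 'female'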


International Conference on Pattern Recognition | 2000

Discrete Chebyshev transform. A natural modification of the DCT

Pat Corr; Darryl Stewart; Philip Hanna; Ji Ming; Francis Jack Smith

Although the discrete cosine transform (DCT) is widely used for feature extraction in pattern recognition, it is shown that it converges slowly for most theoretically smooth functions. A modification of the DCT is described, based on a change of variable, which turns it into a new transform, the discrete Chebyshev transform (DChT), which converges very rapidly for the same smooth functions. Although this rapid convergence is largely destroyed by the noise in real experimental data, the discrete Chebyshev transform is still generally better than the DCT when the data can be sampled at nonequidistant points. The improvement over the DCT gives a theoretical explanation for the improved speech recognition obtained using Mel-frequency cepstral coefficients, which choose the sampling frequencies of a DCT to correspond to the human perception of pitch. It is shown that this sampling is similar to the sampling used in the discrete Chebyshev transform.
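
The change of variable is the classical x = cos(theta) substitution: sampling at the Chebyshev points turns the Chebyshev expansion into a cosine series, so the coefficients drop out of an ordinary DCT. A minimal sketch using SciPy's DCT-II (the node count is an arbitrary choice):

    import numpy as np
    from scipy.fft import dct

    def discrete_chebyshev_transform(f, n=16):
        """Sample f at the Chebyshev nodes x_k = cos(pi*(2k+1)/(2n))
        and apply a DCT-II; the result is the first n coefficients of
        the Chebyshev expansion of f."""
        k = np.arange(n)
        x = np.cos(np.pi * (2 * k + 1) / (2 * n))  # nonequidistant nodes
        c = dct(f(x), type=2) / n                  # coefficients of T_j
        c[0] /= 2
        return c

    # For a smooth function the coefficients decay very rapidly:
    coeffs = discrete_chebyshev_transform(np.exp)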


Digital Image Computing: Techniques and Applications | 2009

Investigations into the Robustness of Audio-Visual Gender Classification to Background Noise and Illumination Effects

Darryl Stewart; Hongbin Wang; Jiali Shen; Paul C. Miller

In this paper we investigate the robustness of a multimodal gender profiling system which uses face and voice modalities. We use support vector machines combined with principal component analysis features to model faces, and Gaussian mixture models with Mel Frequency Cepstral Coefficients to model voices. Our results show that these approaches perform well individually in ‘clean’ training and testing conditions but that their performance can deteriorate substantially in the presence of audio or image corruptions such as additive acoustic noise and differing image illumination conditions. However, our results also show that a straightforward combination of these modalities can provide a gender classifier which is robust when tested in the presence of corruption in either modality. We also show that in most of the tested conditions the multimodal system can automatically perform on a par with whichever single modality is currently the most reliable.
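
The straightforward combination referred to above can be realized as late (score-level) fusion of the two single-modality classifiers. A minimal sketch, assuming each classifier already outputs a calibrated posterior probability and using an equal weighting (an assumption, not the paper's tuned setting):

    def fuse_gender(face_prob_male, voice_prob_male, w=0.5):
        """Late fusion of face and voice posteriors; w balances the
        modalities (0.5 = equal trust, an illustrative choice)."""
        p = w * face_prob_male + (1.0 - w) * voice_prob_male
        return 'male' if p >= 0.5 else 'female'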


Ambient Intelligence | 2018

Agile risk management using software agents

Edzreena Edza Odzaly; Des Greer; Darryl Stewart

Risk management is an important process in software engineering. However, it can be perceived as somewhat contrary to the more lightweight processes used in Agile methods, so an appropriate and realistic risk management model is required, along with tool support that minimizes human effort. We propose the use of software agents to carry out risk management tasks, making use of data collected from the project environment to detect risks. This paper describes the underlying risk management model in an Agile risk tool in which software agents are used to support the identification, assessment and monitoring of risk. It demonstrates the interaction between agents, the agents' compliance with designated rules, and how agents can react to changes in project environment data. The results, demonstrated using case studies, show that agents are useful for detecting risk and reacting dynamically to changes in the project environment, thus helping to minimize the human effort in managing risk.
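
The rule-driven behaviour of such agents can be pictured with a toy monitor: each rule maps project-environment data to a detected risk, and the agent re-evaluates the rules whenever the data change. All identifiers and thresholds below are hypothetical.

    # Toy rule base: predicate over project data -> risk description.
    RULES = [
        (lambda env: env['velocity'] < 0.7 * env['planned_velocity'],
         'Schedule risk: velocity below 70% of plan'),
        (lambda env: env['open_defects'] > env['defect_threshold'],
         'Quality risk: open defects above threshold'),
    ]

    def monitor(env):
        """Re-evaluate all rules against the current project data and
        return the currently active risks."""
        return [msg for rule, msg in RULES if rule(env)]

    risks = monitor({'velocity': 12, 'planned_velocity': 20,
                     'open_defects': 5, 'defect_threshold': 10})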

Collaboration


Dive into Darryl Stewart's collaborations.

Top Co-Authors

Philip Hanna, Queen's University Belfast
Ji Ming, Queen's University Belfast
Adrian Pass, Queen's University Belfast
Ian M. O'Neill, Queen's University Belfast
Peter Jancovic, University of Birmingham
Rowan Seymour, Queen's University Belfast
Des Greer, Queen's University Belfast
F. Jack Smith, Queen's University Belfast