Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Ronan Flynn is active.

Publication


Featured research published by Ronan Flynn.


Speech Communication | 2008

Combined speech enhancement and auditory modelling for robust distributed speech recognition

Ronan Flynn; Edward Jones

The performance of automatic speech recognition (ASR) systems in the presence of noise is an area that has attracted a lot of research interest. Additive noise from interfering noise sources, and convolutional noise arising from transmission channel characteristics both contribute to a degradation of performance in ASR systems. This paper addresses the problem of robustness of speech recognition systems in the first of these conditions, namely additive noise. In particular, the paper examines the use of the auditory model of Li et al. [Li, Q., Soong, F.K., Siohan, O., 2000. A high-performance auditory feature for robust speech recognition. In: Proc. 6th Internat. Conf. on Spoken Language Processing (ICSLP), Vol. III. pp. 51-54] as a front-end for a HMM-based speech recognition system. The choice of this particular auditory model is motivated by the results of a previous study by Flynn and Jones [Flynn, R., Jones, E., 2006. A comparative study of auditory-based front-ends for robust speech recognition using the Aurora 2 database. In: Proc. IET Irish Signals and Systems Conf., Dublin, Ireland. pp. 111-116] in which this auditory model was found to exhibit superior performance for the task of robust speech recognition using the Aurora 2 database [Hirsch, H.G., Pearce, D., 2000. The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions. In: Proc. ISCA ITRW ASR2000, Paris, France. pp. 181-188]. In the speech recognition system described here, the input speech is pre-processed using an algorithm for speech enhancement. A number of different methods for the enhancement of speech, combined with the auditory front-end of Li et al., are evaluated for the purpose of robust connected digit recognition. The ETSI basic [ETSI ES 201 108 Ver. 1.1.3, 2003. 
Speech processing, transmission and quality aspects (STQ); distributed speech recognition; front-end feature extraction algorithm; compression algorithms] and advanced [ETSI ES 202 050 Ver. 1.1.5, 2007. Speech processing, transmission and quality aspects (STQ); distributed speech recognition; advanced front-end feature extraction algorithm; compression algorithms] front-ends proposed for DSR are used as a baseline for comparison. In addition to their effects on speech recognition performance, the speech enhancement algorithms are also assessed using perceptual speech quality tests, in order to examine if a correlation exists between perceived speech quality and recognition performance. Results indicate that the combination of speech enhancement pre-processing and the auditory model front-end provides an improvement in recognition performance in noisy conditions over the ETSI front-ends.
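The paper evaluates several enhancement methods that are not reproduced here; as an illustration only, the enhance-then-extract pipeline can be sketched with a minimal spectral-subtraction enhancer (a hypothetical stand-in for the algorithms actually studied) feeding an arbitrary feature extractor:

```python
import numpy as np

def spectral_subtraction(noisy, noise_est, floor=0.01):
    """Enhance one frame by subtracting an estimate of the noise
    magnitude spectrum, keeping the noisy phase (classic spectral
    subtraction with a spectral floor)."""
    spec = np.fft.rfft(noisy)
    mag = np.abs(spec) - np.abs(np.fft.rfft(noise_est))
    mag = np.maximum(mag, floor * np.abs(spec))  # avoid negative magnitudes
    return np.fft.irfft(mag * np.exp(1j * np.angle(spec)), n=len(noisy))

def front_end(frames, noise_est, extract_features):
    """Hypothetical pipeline: enhance each frame, then hand the enhanced
    frame to the recogniser's feature extractor (e.g. an auditory model)."""
    return [extract_features(spectral_subtraction(f, noise_est))
            for f in frames]
```

With a perfect noise estimate the subtraction removes most of the interfering energy while leaving the target tone intact, which is the intuition behind pre-processing the speech before feature extraction.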


IEEE Transactions on Consumer Electronics | 2008

Robust distributed speech recognition using speech enhancement

Ronan Flynn; Edward Jones

The issue of robustness in the presence of noise is regarded as a significant bottleneck in the commercialisation of speech recognition products, particularly in mobile environments. This paper examines the use of an auditory model combined with a speech enhancement algorithm as a robust front-end for a distributed speech recognition (DSR) system, whereby front-end functionality is implemented on a limited-resource consumer device such as a mobile phone, while back-end classifier functionality is carried out by a remote server.


Speech Communication | 2012

Reducing bandwidth for robust distributed speech recognition in conditions of packet loss

Ronan Flynn; Edward Jones

This paper proposes a method to reduce the bandwidth requirements for a distributed speech recognition (DSR) system, with minimal impact on recognition performance. Bandwidth reduction is achieved by applying a wavelet decomposition to feature vectors extracted from speech using an auditory-based front-end. The resulting vectors undergo vector quantisation and are then combined in pairs for transmission over a statistically modelled channel that is subject to packet burst loss. Recognition performance is evaluated in the presence of both background noise and packet loss. When there is no packet loss, results show that the proposed method can reduce the bandwidth required to 50% of the bandwidth required for the system in which the proposed method is not used, without compromising recognition performance. The bandwidth can be further reduced to 25% of the baseline for a slight decrease in recognition performance. Furthermore, in the presence of packet loss, the proposed method for bandwidth reduction, when combined with a suitable redundancy scheme, gives a 29% reduction in bandwidth when compared to an established packet loss mitigation technique, for comparable recognition performance.
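The specific wavelet used in the paper is not detailed in this abstract; a minimal sketch using the Haar wavelet (an assumption for illustration) shows how keeping only the approximation coefficients halves the vector length per decomposition level, matching the 50% and 25% bandwidth figures above:

```python
import numpy as np

def haar_approx(vec, levels=1):
    """Keep only the Haar approximation (low-pass) coefficients of a
    feature vector, discarding the detail coefficients. Each level
    halves the number of elements to transmit."""
    v = np.asarray(vec, dtype=float)
    for _ in range(levels):
        v = (v[0::2] + v[1::2]) / np.sqrt(2.0)  # Haar low-pass step
    return v

features = np.arange(16.0)          # stand-in for one auditory feature vector
half = haar_approx(features, 1)     # 8 coefficients -> ~50% bandwidth
quarter = haar_approx(features, 2)  # 4 coefficients -> ~25% bandwidth
```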


Digital Signal Processing | 2010

Robust distributed speech recognition in noise and packet loss conditions

Ronan Flynn; Edward Jones

This paper examines the performance of a Distributed Speech Recognition (DSR) system in the presence of both background noise and packet loss. Recognition performance is examined for feature vectors extracted from speech using a physiologically-based auditory model, as an alternative to the more commonly-used Mel Frequency Cepstral Coefficient (MFCC) front-end. The feature vectors produced by the auditory model are vector quantised and combined in pairs for transmission over a statistically modelled channel that is subject to packet burst loss. In order to improve recognition performance in the presence of noise, the speech is enhanced prior to feature extraction using Wiener filtering. Packet loss mitigation to compensate for missing features is also used to further improve performance. Speech recognition results show the benefit of combining speech enhancement and packet loss mitigation to compensate for channel and environmental degradations.
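The quantise-and-pair step described above can be sketched as follows; the toy codebook and the bit layout of the paired payload are assumptions for illustration, not the paper's actual codebook or packet format:

```python
import numpy as np

def vq_encode(vec, codebook):
    """Vector quantisation: return the index of the nearest codebook
    entry under Euclidean distance."""
    return int(np.argmin(np.linalg.norm(codebook - vec, axis=1)))

def pack_pair(idx_a, idx_b, bits=8):
    """Combine two quantised frame indices into one payload, mirroring
    the 'combined in pairs for transmission' step; a lost packet then
    costs two consecutive feature vectors."""
    return (idx_a << bits) | idx_b

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
idx = vq_encode(np.array([0.9, 1.1]), codebook)  # nearest entry is [1, 1]
```

Pairing frames halves the packet rate but makes burst loss costlier, which is why the paper combines it with packet loss mitigation.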


Speech Communication | 2012

Short Communication: Feature selection for reduced-bandwidth distributed speech recognition

Ronan Flynn; Edward Jones

The impact on speech recognition performance in a distributed speech recognition (DSR) environment of two methods used to reduce the dimension of the feature vectors is examined in this paper. The motivation behind reducing the dimension of the feature set is to reduce the bandwidth required to send the feature vectors over a channel from the client front-end to the server back-end in a DSR system. In the first approach, the features are empirically chosen to maximise recognition performance. A data-centric transform-based dimensionality-reduction technique is applied in the second case. Test results for the empirical approach show that individual coefficients have different impacts on the speech recognition performance, and that certain coefficients should always be present in an empirically selected reduced feature set for given training and test conditions. Initial results show that for the empirical method, the number of elements in a feature vector produced by an established DSR front-end can be reduced by 23% with low impact on the recognition performance (less than 8% relative performance drop compared to the full bandwidth case). Using the transform-based approach, for a similar impact on recognition performance, the number of feature vector elements can be reduced by 30%. Furthermore, for best recognition performance, the results indicate that the SNR of the speech signal should be considered using either approach when selecting the feature vector elements that are to be included in a reduced feature set.
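The abstract does not name the transform used; PCA is one standard data-driven, transform-based dimensionality reduction and serves here purely as an illustrative stand-in. Reducing a 13-element MFCC-like vector to 9 elements corresponds to roughly the 30% reduction reported above:

```python
import numpy as np

def pca_reduce(X, keep):
    """Project feature vectors onto their top principal components,
    a data-driven transform that concentrates variance in fewer
    elements (one example of transform-based reduction)."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centred data = principal axes,
    # ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:keep].T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 13))    # stand-in for 13-element feature vectors
X_small = pca_reduce(X, keep=9)   # ~30% fewer elements per vector
```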


ACM Multimedia | 2017

Comparing User QoE via Physiological and Interaction Measurements of Immersive AR and VR Speech and Language Therapy Applications

Conor Keighrey; Ronan Flynn; Siobhan Murray; Sean Brennan; Niall Murray

Virtual reality (VR) and augmented reality (AR) applications are gaining significant attention in industry and academia as potential avenues to support truly immersive and interactive multimedia experiences. Understanding the user-perceived quality of immersive multimedia experiences is critical to the success of these technologies. However, this is a multidimensional and multifactorial problem. The user quality of experience (QoE) is influenced by human, context and system factors. Attempts to understand QoE via multimedia quality assessment have typically involved users reporting their experiences via post-test questionnaires. More recently, efforts have been made to automatically collect objective metrics, such as physiological measurements, that can quantitatively reflect user QoE. In this context, this paper presents a novel comparison of objective quality measures of immersive AR and VR applications through physiological metrics (electrodermal activity (EDA) and heart rate (HR)) and interaction metrics (response times (RT), incorrect responses, and miss-clicks). The analysis shows consistency in terms of physiological ratings and miss-click metrics between the AR and VR groups. Interestingly, the AR group reported lower response times and fewer incorrect responses compared to the VR group. The difference between the AR and VR groups was statistically significant for the incorrect-response metric, and for the response-time metric it was statistically significant in 45.5% of the cases tested, at the 95% confidence level.


Irish Signals and Systems Conference | 2015

Evaluation of wains as a classifier for automatic speech recognition

Rosemary T. Salaja; Ronan Flynn; Michael Russell

This paper introduces a new back-end classifier for a speech recognition system that is based on artificial life (ALife). The ALife species being used for classification purposes are called wains, which were developed using the Créatúr framework. The speech recognition task used in the evaluation of the new classifier is that of isolated digit recognition. Performance of the proposed back-end classifier is benchmarked for comparison purposes against a hidden Markov model (HMM)-based classifier. Mel frequency cepstral coefficients extracted from the speech by a front-end processor are used by the two back-ends for classification of the speech utterances. In tests carried out to date, the performance of the proposed classifier falls short of the performance of the HMM-based classifier. However, the results suggest that with further training the recognition accuracy of the proposed classifier can be significantly improved.
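The HMM baseline scores an utterance under each digit model and picks the highest-scoring one. A minimal sketch of that scoring step, the forward algorithm in log space, is shown below; the two-state model and probabilities are illustrative, not the paper's configuration:

```python
import numpy as np

def forward_loglik(obs_logprob, log_pi, log_A):
    """Log-likelihood of an observation sequence under one HMM via the
    forward algorithm. obs_logprob[t, s] is the log-probability of the
    frame at time t under state s; an isolated-digit recogniser would
    evaluate this for each digit's HMM and pick the argmax."""
    alpha = log_pi + obs_logprob[0]
    for t in range(1, len(obs_logprob)):
        # Sum over predecessor states in log space, then emit frame t.
        alpha = obs_logprob[t] + np.logaddexp.reduce(
            alpha[:, None] + log_A, axis=0)
    return float(np.logaddexp.reduce(alpha))
```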


International Conference on Signal Processing | 2007

Robust connected digit recognition using speech enhancement and an auditory model front-end

Ronan Flynn; Edward Jones

This paper addresses the problem of speech recognition in noisy conditions. In particular, the paper examines the use of an auditory model as a front-end for an HMM-based speech recognition system. To further improve the performance of the auditory-based recognition system in background noise, the input speech is pre-processed using an algorithm for speech enhancement. Preliminary results indicate that the combination of speech enhancement pre-processing and the auditory-model front-end provides an improvement in recognition performance in noisy conditions over the system without speech enhancement.


ACM SIGMM Conference on Multimedia Systems | 2018

A QoE assessment method based on EDA, heart rate and EEG of a virtual reality assistive technology system

Debora Salgado; Felipe Roque Martins; Thiago Braga Rodrigues; Conor Keighrey; Ronan Flynn; Eduardo Lázaro Martins Naves; Niall Murray

The key aim of various assistive technology (AT) systems is to augment an individual's functioning whilst supporting an enhanced quality of life (QoL). In recent times, we have seen the emergence of Virtual Reality (VR) based assistive technology systems, made possible by the availability of commercially available Head Mounted Displays (HMDs). The use of VR for AT aims to support levels of interaction and immersion not previously possible with more traditional AT solutions. Crucial to the success of these technologies is understanding, from the user perspective, the influencing factors that affect the user Quality of Experience (QoE). In addition to the typical QoE metrics, other factors to consider are human behaviours such as mental and emotional state, posture and gestures. In terms of trying to objectively quantify such factors, there is a wide range of wearable sensors that can monitor physiological signals and provide reliable data. In this demo, we capture and present the user's EEG, heart rate, EDA and head motion during the use of an AT VR application. The prototype is composed of a sensor system, built from wearable sensors for the acquisition of biological signals, and a presentation system: a virtual wheelchair simulator that interfaces to a typical LCD display.


Irish Signals and Systems Conference | 2017

Speech intelligibility and quality: A comparative study of speech enhancement algorithms

Michael Russell; Ronan Flynn; Xiaodong Xu

Mobile devices are widely used today for speech communication. The environments in which these devices are used are widely varied, and often the level of background noise in the speaker's environment can be significant. The purpose of speech enhancement is to reduce the level of background noise, ideally to such a level that it is not noticed by the listener. While speech enhancement algorithms can significantly reduce the noise level in a speech signal, improving speech quality, it is widely recognized that enhancement algorithms can have a negative impact on speech intelligibility. This paper compares the effect of three different speech enhancement algorithms on the intelligibility and the quality of speech. This work is the initial phase of an investigation into mitigating the impact of speech enhancement algorithms on speech intelligibility. The speech enhancement algorithms evaluated each use a different approach for noise reduction, namely, a statistical model-based algorithm, a noise estimation algorithm and a wavelet packet decomposition-based algorithm. Two objective speech intelligibility measurements and three objective speech quality measurements are used to assess the performance of the enhancement algorithms. The results of the experiments show that all the speech enhancement algorithms in this study have a negative impact on speech intelligibility, to varying degrees.
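The paper's specific intelligibility and quality measures are not named in this abstract; as a simple illustration of what an objective, intrusive (reference-based) measure looks like, segmental SNR can be sketched as follows. It is a far cruder proxy than the measures used in such studies:

```python
import numpy as np

def segmental_snr(clean, processed, frame=256, lo=-10.0, hi=35.0):
    """Frame-averaged segmental SNR in dB between a clean reference and
    a processed signal. Per-frame values are clipped to [lo, hi], the
    conventional range for this measure."""
    n = min(len(clean), len(processed)) // frame * frame
    c = clean[:n].reshape(-1, frame)
    e = (clean[:n] - processed[:n]).reshape(-1, frame)  # error signal
    snr = 10 * np.log10((c ** 2).sum(axis=1)
                        / np.maximum((e ** 2).sum(axis=1), 1e-12))
    return float(np.clip(snr, lo, hi).mean())
```

An intrusive measure like this needs the clean reference, which is available in laboratory evaluations of enhancement algorithms but not in deployed systems.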

Collaboration


Dive into Ronan Flynn's collaborations.

Top Co-Authors

Niall Murray, Athlone Institute of Technology
Edward Jones, National University of Ireland
Michael Russell, Athlone Institute of Technology
Brian Lee, Athlone Institute of Technology
Conor Keighrey, Athlone Institute of Technology
Enda Fallon, Athlone Institute of Technology
Jonny O'Dwyer, Athlone Institute of Technology
Rosemary T. Salaja, Athlone Institute of Technology
Sean Hayes, Athlone Institute of Technology
Yuansong Qiao, Athlone Institute of Technology