Damien Kelly
Publication
Featured research published by Damien Kelly.
International Conference on Acoustics, Speech, and Signal Processing | 2014
Julius Kammerl; Neil Birkbeck; Sasi Inguva; Damien Kelly; Andrew Joseph Crawford; Hugh Denman; Anil C. Kokaram; Caroline Pantofaru
Given the proliferation of consumer media recording devices, events often give rise to a large number of recordings. These recordings are taken from different spatial positions and do not have reliable timestamp information. In this paper, we present two robust graph-based approaches for synchronizing multiple audio signals. The graphs are constructed atop the over-determined system resulting from pairwise signal comparison using cross-correlation of audio features. The first approach uses a Minimum Spanning Tree (MST) technique, while the second uses Belief Propagation (BP) to solve the system. Both approaches can provide excellent solutions and robustness to pairwise outliers; however, the MST approach is much less complex than BP. In addition, an experimental comparison of audio-feature-based synchronization shows that spectral flatness outperforms the zero-crossing rate and signal energy.
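The spanning-tree idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sync_offsets_mst`, the `(i, j, offset, confidence)` tuple layout, and the use of a confidence score as the edge weight are all assumptions made for illustration. The sketch keeps the most trusted pairwise offsets that connect all recordings, then accumulates offsets along the tree from a reference recording, so a low-confidence outlier pair is never used.

```python
from collections import defaultdict


def sync_offsets_mst(n, pairwise):
    """Estimate one time offset per recording from pairwise estimates.

    n        -- number of recordings, indexed 0..n-1
    pairwise -- list of (i, j, offset, confidence), where offset estimates
                t_j - t_i and confidence is a hypothetical match score
                (for example, a cross-correlation peak height)
    Returns a dict mapping recording index to its offset relative to recording 0.
    """
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Kruskal's algorithm, taking the most confident edges first. This yields
    # a maximum-confidence spanning tree (an MST on cost = -confidence).
    tree = defaultdict(list)
    for i, j, offset, _conf in sorted(pairwise, key=lambda e: -e[3]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree[i].append((j, offset))
            tree[j].append((i, -offset))

    # Walk the tree from recording 0, accumulating offsets along the edges.
    offsets = {0: 0.0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v, offset in tree[u]:
            if v not in offsets:
                offsets[v] = offsets[u] + offset
                stack.append(v)
    return offsets


# Three recordings; the (0, 2) estimate is a low-confidence outlier.
pairs = [(0, 1, 2.0, 0.9), (1, 2, 3.0, 0.8), (0, 2, 40.0, 0.1)]
result = sync_offsets_mst(3, pairs)
```

In this toy input, the tree is built from the two high-confidence pairs, so recording 2 is placed 5.0 units after recording 0 and the inconsistent 40.0 estimate is ignored.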
Journal of the Acoustical Society of America | 2015
Andrew Hines; Eoin Gillen; Damien Kelly; Jan Skoglund; Anil C. Kokaram; Naomi Harte
Streaming services seek to optimise their use of bandwidth across audio and visual channels to maximise the quality of experience for users. This letter evaluates whether objective quality metrics can predict the audio quality for music encoded at low bitrates by comparing objective predictions with results from listener tests. Three objective metrics were benchmarked: PEAQ, POLQA, and ViSQOLAudio. The results demonstrate that objective metrics designed for speech quality assessment have strong potential for quality assessment of low bitrate audio codecs.
ACM Multimedia | 2014
Andrew Hines; Eoin Gillen; Damien Kelly; Jan Skoglund; Anil C. Kokaram; Naomi Harte
Users of audio-visual streaming services expect an ever-increasing quality of experience. Channel bandwidth remains a bottleneck commonly addressed with lossy compression schemes for both the video and audio streams. Anecdotal evidence suggests a strongly perceived link between bit rate and quality. This paper presents three audio quality listening experiments using the ITU MUSHRA methodology to assess a number of audio codecs typically used by streaming services. They were assessed for a range of bit rates using three presentation modes: consumer-quality headphones, studio-quality headphones, and loudspeakers. Our results indicate that with consumer-quality headphones, listeners did not differentiate between codecs with bit rates greater than 48 kb/s (p >= 0.228). For studio-quality headphones and loudspeakers, AAC-LC at 128 kb/s and higher was differentiated over other codecs (p <= 0.001). The results provide insights into quality of experience that will guide future development of objective audio quality metrics.
International Conference on Image Processing | 2012
Anil C. Kokaram; Damien Kelly; Hugh Denman; Andrew Joseph Crawford
The vast majority of previous work in noise reduction for visual media has assumed uncorrelated white noise sources. In practice, this assumption is almost always violated by real media. Film grain noise is never white, and this paper highlights that the same applies to almost all consumer video content. We therefore present an algorithm for measuring the spatial and temporal spectral density of noise in archived video content, whether it originated on a consumer digital camera or on film. As an example of how this information can be used for video denoising, the spectral density is then used for spatio-temporal noise reduction in the Fourier frequency domain. Results show improved performance for noise reduction in an easily pipelined system.
International Conference on Acoustics, Speech, and Signal Processing | 2011
Damien Kelly; Anil C. Kokaram; Frank Boland
An automated system is presented for reducing a multi-view lecture recording into a single-view video containing a best-view summary of active speakers. The system uses skin color detection and voxel-based analysis to locate likely speaker positions. Using time-delay estimates from multiple microphones, speech activity is analyzed for each speaker position. The Viterbi algorithm is then used to estimate a track of the active speaker which maximizes the observed speech activity. This novel approach is termed Voxel-based Viterbi Active Speaker Tracking (V-VAST) and is shown to track speakers with an accuracy of 0.23 m. Using the tracking information, the system then extracts from the available camera views the most frontal face view of the active speaker to display.
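The Viterbi step in this abstract can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the V-VAST system: the function name `viterbi_track`, the per-frame `activity` score table, and the fixed `switch_penalty` are hypothetical. The abstract does not specify the paper's actual state space or transition model. Each state is a candidate speaker position, and the best path maximizes total speech activity minus a cost for every change of position.

```python
def viterbi_track(activity, switch_penalty=1.0):
    """Return the state sequence with the highest total score.

    activity[t][s] -- hypothetical speech-activity score for candidate
                      speaker position s at frame t
    switch_penalty -- assumed cost subtracted whenever the track changes
                      position between consecutive frames
    """
    n_states = len(activity[0])
    score = list(activity[0])  # best path score ending in each state
    back = []                  # back-pointers, one list per later frame

    for frame in activity[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor for state s, charging a penalty for switching.
            def arrival(p, s=s):
                return score[p] - (switch_penalty if p != s else 0.0)

            best_prev = max(range(n_states), key=arrival)
            new_score.append(arrival(best_prev) + frame[s])
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state to recover the whole path.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Two candidate positions; position 1 is briefly more active at frame 2.
scores = [[1.0, 0.0], [1.0, 0.0], [0.4, 0.6], [1.0, 0.0]]
smoothed = viterbi_track(scores, switch_penalty=1.0)
framewise = viterbi_track(scores, switch_penalty=0.0)
```

With the penalty enabled, the momentary blip at frame 2 is not worth two switches, so the track stays on position 0. With the penalty set to zero, the result reduces to a per-frame maximum.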
Applications of Digital Image Processing XLI | 2018
Anil C. Kokaram; Chao Chen; Yilin Wang; Jessie Lin; Balu Adsumilli; Steve Benting; Neil Birkbeck; Damien Kelly; Michele Covell; Sasi Inguva
The development of video quality metrics and perceptual video quality metrics has been a well-established pursuit for more than 25 years. The body of work has been seen to be most relevant for improving the performance of visual compression algorithms. However, modeling the human perception of video with an algorithm of some sort is notoriously complicated. As a result, the perceptual coding of video remains challenging, and no standards have incorporated perceptual video quality metrics within their specification. In this paper we present the use of video metrics at the system level of a video processing pipeline. We show that it is possible to combine the artefact detection and correction process by posing the problem as a classification exercise. We also present the use of video metrics as part of a classical testing pipeline for software infrastructure, but here it is sensitive to the perceived quality in picture degradation.
IEEE Transactions on Broadcasting | 2017
Colm Sloan; Naomi Harte; Damien Kelly; Anil C. Kokaram; Andrew Hines
Digital audio broadcasting services transmit substantial amounts of data that is encoded to minimize bandwidth whilst maximizing user quality of experience. Many large service providers continually alter codecs to improve the encoding process. Performing subjective tests to validate each codec alteration would be impractical, necessitating the use of objective perceptual audio quality models. This paper evaluates the quality scores from ViSQOLAudio, an objective perceptual audio quality model, against the quality scores of PEAQ, POLQA, and PEMO-Q on three datasets containing fullband audio encoded with a variety of codecs and bitrates. The results show that ViSQOLAudio was more accurate than all other models on two of the datasets and performed well on the third, demonstrating the utility of ViSQOLAudio for predicting the perceptual audio quality for encoded music.
Quality of Multimedia Experience | 2016
Colm Sloan; Naomi Harte; Damien Kelly; Anil C. Kokaram; Andrew Hines
When a user uploads audio files to a music streaming service, these files are subsequently re-encoded to lower bitrates to target different devices, e.g. low bitrate for mobile. To save time and bandwidth uploading files, some users encode their original files using a lossy codec. The metadata for these files cannot always be trusted as users might have encoded their files more than once. Determining the lowest bitrate of the files allows the streaming service to skip the process of encoding the files to bitrates higher than that of the uploaded files, saving on processing and storage space. This paper presents a model that uses quality predictions from ViSQOLAudio, a full reference objective audio quality metric, as features in combination with a multi-class support vector machine classifier. An experiment on twice-encoded files found that low bitrate codecs could be classified using audio quality features. The experiment also provides insights into the implications of multiple transcodes from a quality perspective.
Quality of Multimedia Experience | 2014
François Pitié; Damien Kelly; Thierry Foucu; Naomi Harte; Anil C. Kokaram
Quality assessment in the streaming media industry has matured to the stage that it encompasses not only traditional notions of pixel and audio sample integrity but also file format consistency and the media consumption experience itself. Audio/Video synchronisation has already been established as an associated measure of media quality and is well known in video conferencing and movie streaming applications. This paper presents a new system for the assessment of audio and video synchronisation in a media file. The system incorporates the idea of learning features which are robust to coding artefacts to establish robust fingerprints for A/V Sync measurement. Results from large-scale testing of 30,000 clips from YouTube show why measurement of A/V Sync is important for file-based video repositories and highlight issues that can now be addressed quantitatively.
Archive | 2014
Neil Birkbeck; Isasi Inguva; Damien Kelly; Andrew Joseph Crawford; Hugh Denman; Perry Tobin; Steve Benting; Anil C. Kokaram; Jeremy Doig