Publication


Featured research published by Peter Jancovic.


EURASIP Journal on Advances in Signal Processing | 2011

Automatic Detection and Recognition of Tonal Bird Sounds in Noisy Environments

Peter Jancovic; Münevver Köküer

This paper presents a study of automatic detection and recognition of tonal bird sounds in noisy environments. The detection of spectro-temporal regions containing bird tonal vocalisations is based on exploiting the spectral shape to identify sinusoidal components in the short-time spectrum. The detection method provides a tonal-based feature representation that is employed for automatic bird recognition. The recognition system uses Gaussian mixture models to model 165 different bird syllables, produced by 95 bird species. Standard models, as well as models compensating for the effect of the noise, are employed. Experiments are performed on bird sound recordings corrupted by white noise and real-world environmental noise. The proposed detection method shows high detection accuracy of bird tonal components. The employed tonal-based features show significant recognition accuracy improvements over the Mel-frequency cepstral coefficients, in both standard and noise-compensated models, and strong robustness to mismatch between the training and testing conditions.


IEEE Transactions on Speech and Audio Processing | 2002

Robust speech recognition using probabilistic union models

Ji Ming; Peter Jancovic; Francis Jack Smith

This paper introduces a new statistical approach, namely the probabilistic union model, for speech recognition involving partial, unknown frequency-band corruption. Partial frequency-band corruption accounts for the effect of a family of real-world noises. Previous methods based on the missing feature theory usually require the identity of the noisy bands. This identification can be difficult for unexpected noise with unknown, time-varying band characteristics. The new model combines the local frequency-band information based on the union of random events, to reduce the dependence of the model on information about the noise. This model partially accomplishes the target: offering robustness to partial frequency-band corruption, while requiring no information about the noise. This paper introduces the theory and implementation of the union model, and is focused on several important advances. These new developments include a new algorithm for automatic order selection, a generalization of the modeling principle to accommodate partial feature stream corruption, and a combination of the union model with conventional noise reduction techniques to deal with a mixture of stationary noise and unknown, nonstationary noise. For the evaluation, we used the TIDIGITS database for speaker-independent connected digit recognition. The utterances were corrupted by various types of additive noise, stationary or time-varying, assuming no knowledge about the noise characteristics. The results indicate that the new model offers significantly improved robustness in comparison to other models.
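The "union of random events" combination described above can be illustrated with a small sketch. It assumes one common reading of the model: given likelihoods for N frequency bands and an order M (the number of bands presumed corrupted), the combined score is the sum, over every subset of N - M bands, of the product of the band likelihoods in that subset. The function name and the numbers are hypothetical, and normalisation constants are omitted.

```python
from itertools import combinations
from math import prod

def union_likelihood(band_likelihoods, order):
    """Unnormalised union-model score (illustrative sketch).

    Sums, over every subset of (N - order) bands, the product of the
    band likelihoods in that subset. order=0 reduces to the ordinary
    product over all bands.
    """
    n = len(band_likelihoods)
    if not 0 <= order < n:
        raise ValueError("order must satisfy 0 <= order < number of bands")
    keep = n - order
    return sum(prod(subset) for subset in combinations(band_likelihoods, keep))

# Hypothetical band likelihoods; the last band is heavily corrupted.
clean = [0.5, 0.4, 0.3, 0.2]
noisy = [0.5, 0.4, 0.3, 1e-9]

# With order 0 a single bad band collapses the score; with order 1 the
# subsets that exclude the bad band keep the score close to its clean value.
print(union_likelihood(noisy, 0), union_likelihood(noisy, 1))
```

Because the sum ranges over all subsets, the score never needs to know which band is the noisy one, only how many bands to leave out.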


IEEE Signal Processing Letters | 2007

Estimation of Voicing-Character of Speech Spectra Based on Spectral Shape

Peter Jancovic; Münevver Köküer

This letter presents a method for estimation of the voicing-character of speech spectra. It is based on a calculation of a similarity between the shape of the signal short-term magnitude spectra and spectra of the frame-analysis window, which is weighted by the signal magnitude spectra. It is demonstrated that the proposed voicing measure is related to the local SNR of noise-corrupted voiced speech. The performance is evaluated for detection of voiced regions in the spectra of speech corrupted by various types of noise. In terms of false acceptance and false rejection, the experimental results show errors of less than 5% for speech corrupted by white noise at a local SNR of 10 dB. In terms of recognition accuracy, an ASR system using the voicing information estimated by the proposed method performs similarly to one using full a priori knowledge about the noise.


IEEE Transactions on Signal Processing | 2008

Speech Signal Enhancement Based on MAP Algorithm in the ICA Space

Xin Zou; Peter Jancovic; Ju Liu; Münevver Köküer

This paper presents a novel maximum a posteriori (MAP) denoising algorithm based on independent component analysis (ICA). We demonstrate that the employment of individual ICA transformations for signal and noise can provide the best estimate within the linear framework. The signal enhancement problem is categorized based on the distribution of signal and noise being Gaussian or non-Gaussian, and the estimation rule is derived for each of the categories. Our theoretical analysis shows that under the assumption of Gaussian noise the proposed algorithm leads to some well-known enhancement techniques, i.e., the Wiener filter and sparse code shrinkage. The analysis of the denoising capability shows that the proposed algorithm is most efficient for non-Gaussian signals corrupted by non-Gaussian noise. We employed the generalized Gaussian model (GGM) to model the distributions of speech and noise. Experimental evaluation is performed in terms of signal-to-noise ratio (SNR) and spectral distortion measure. Experimental results show that the proposed algorithms achieve significant improvement in enhancement performance in both Gaussian and non-Gaussian noise.
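The link between MAP estimation and the Wiener filter mentioned in the abstract can be seen in the simplest case. The short derivation below is a textbook sketch, not the paper's own derivation: it assumes a single transformed coefficient y = s + n with independent, zero-mean Gaussian signal and noise, and the symbols are chosen here for illustration.

```latex
% Assumptions: y = s + n, with s ~ N(0, \sigma_s^2) and n ~ N(0, \sigma_n^2) independent.
\begin{align}
\hat{s}_{\mathrm{MAP}}
  &= \arg\max_{s}\; \bigl[\log p(y \mid s) + \log p(s)\bigr] \\
  &= \arg\max_{s}\; \left[-\frac{(y - s)^2}{2\sigma_n^2} - \frac{s^2}{2\sigma_s^2}\right] \\
\intertext{Setting the derivative with respect to $s$ to zero gives}
\frac{y - s}{\sigma_n^2} - \frac{s}{\sigma_s^2} &= 0
\quad\Longrightarrow\quad
\hat{s}_{\mathrm{MAP}} = \frac{\sigma_s^2}{\sigma_s^2 + \sigma_n^2}\, y .
\end{align}
```

The resulting gain is the familiar Wiener gain; a non-Gaussian signal prior in the same setup yields a nonlinear shrinkage rule instead.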


IEEE Signal Processing Letters | 2015

Acoustic Recognition of Multiple Bird Species Based on Penalized Maximum Likelihood

Peter Jancovic; Münevver Köküer

An automatic system for recognition of multiple bird species in audio recordings is presented. Time-frequency segmentation of the acoustic scene is obtained by employing a sinusoidal detection algorithm, which does not require any estimate of noise and is able to handle multiple simultaneous bird vocalizations. Each segment is characterized as a sequence of frequencies over time, referred to as a frequency track. Each bird species is represented by a hidden Markov model that models the temporal evolution of frequency tracks. The decision on the number and identity of bird species in a given recording is obtained based on maximizing the overall likelihood of the set of detected segments, with a penalization applied for increasing the number of bird models used. Experimental evaluations are performed on audio field recordings containing 30 bird species. The presence of multiple bird species is simulated by joining the set of detected segments from several bird species. Results show that the proposed method can achieve recognition performance for multiple bird species not far from that obtained for single bird species, and considerably outperforms majority voting methods.
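The penalized decision rule described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: each detected segment is scored by the best-fitting model within a candidate species set, the penalty is a fixed cost per model, and the search is a brute-force enumeration of subsets. The function name, the penalty values, and the log-likelihood numbers are all hypothetical.

```python
from itertools import combinations

def select_species(segment_loglik, penalty, max_species=None):
    """Pick the species set maximising penalised total log-likelihood.

    segment_loglik maps a species name to a list of per-segment
    log-likelihoods (same segment order for every species).
    Score of a set = sum over segments of the best log-likelihood
    within the set, minus penalty * (number of species in the set).
    """
    species = sorted(segment_loglik)
    n_segments = len(segment_loglik[species[0]])
    limit = len(species) if max_species is None else min(max_species, len(species))
    best_set, best_score = None, float("-inf")
    for k in range(1, limit + 1):
        for subset in combinations(species, k):
            total = sum(
                max(segment_loglik[s][i] for s in subset)
                for i in range(n_segments)
            )
            score = total - penalty * k
            if score > best_score:
                best_set, best_score = set(subset), score
    return best_set, best_score

# Hypothetical scores for three segments under three species models.
scores = {
    "A": [-1.0, -1.0, -10.0],
    "B": [-10.0, -10.0, -1.0],
    "C": [-2.0, -2.0, -2.0],
}
print(select_species(scores, penalty=2.0))   # small penalty: two species
print(select_species(scores, penalty=10.0))  # large penalty: one species
```

Raising the penalty trades fit for parsimony, which is how the rule decides the number of species as well as their identity.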


International Conference on Acoustics, Speech, and Signal Processing | 2014

Bird species recognition from field recordings using HMM-based modelling of frequency tracks

Peter Jancovic; Münevver Köküer; Martin J. Russell

This paper presents an automatic system for recognition of bird species from field audio recordings. The proposed system employs a novel method for detection of sinusoidal components in the acoustic scene. This provides a segmentation of the signal and also a feature representation of each segment in terms of frequencies over time, referred to as a frequency track. We employ hidden Markov models (HMMs) to model the temporal evolution of frequency tracks. We demonstrate the effect of including local temporal dynamics of frequency tracks and HMM modelling parameters. Experiments are performed on over 33 hours of field recordings, containing 30 bird species. Evaluations demonstrate that the HMM-based temporal modelling provides considerable performance improvement over a system based on Gaussian mixture modelling. The proposed HMM-based system is capable of recognising bird species with accuracy over 85% from only 3 seconds of detected signal.
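To make the idea of scoring a frequency track against a species HMM concrete, here is a minimal sketch of the forward algorithm in the log domain. It assumes a one-dimensional track (one frequency value in Hz per frame) and a single Gaussian per state; the abstract does not give the actual model configuration, so the topology, parameter values, and function names below are hypothetical.

```python
import math

NEG_INF = float("-inf")

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def track_log_likelihood(track, log_init, log_trans, means, variances):
    """Forward algorithm: log p(track | HMM) for Gaussian emissions."""
    n = len(means)
    alpha = [log_init[j] + log_gauss(track[0], means[j], variances[j])
             for j in range(n)]
    for x in track[1:]:
        alpha = [
            logsumexp([alpha[i] + log_trans[i][j] for i in range(n)])
            + log_gauss(x, means[j], variances[j])
            for j in range(n)
        ]
    return logsumexp(alpha)

# Two hypothetical left-to-right models: a rising and a falling whistle.
log_init = [0.0, NEG_INF]                        # always start in state 0
log_trans = [[math.log(0.5), math.log(0.5)],     # stay or advance
             [NEG_INF, 0.0]]                     # final state is absorbing
variances = [200.0 ** 2, 200.0 ** 2]
rising = dict(means=[2000.0, 3000.0], variances=variances)
falling = dict(means=[3000.0, 2000.0], variances=variances)

track = [2000.0, 2100.0, 2900.0, 3000.0]         # frequencies in Hz
for name, model in (("rising", rising), ("falling", falling)):
    print(name, track_log_likelihood(track, log_init, log_trans, **model))
```

The species whose model gives the higher log-likelihood for the track would be preferred; in this toy case the rising model fits the upward sweep better.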


Journal of the Acoustical Society of America | 2001

A probabilistic union model with automatic order selection for noisy speech recognition

Peter Jancovic; Ji Ming

A critical issue in exploiting the potential of the sub-band-based approach to robust speech recognition is the method of combining the sub-band observations, for selecting the bands unaffected by noise. A new method for this purpose, i.e., the probabilistic union model, was recently introduced. This model has been shown to be capable of dealing with band-limited corruption, requiring no knowledge about the band position and statistical distribution of the noise. A parameter within the model, which we call its order, gives the best results when it equals the number of noisy bands. Since this information may not be available in practice, in this paper we introduce an automatic algorithm for selecting the order, based on the state duration pattern generated by the hidden Markov model (HMM). The algorithm has been tested on the TIDIGITS database corrupted by various types of additive band-limited noise with unknown noisy bands. The results have shown that the union model equipped with the new algorithm can achieve a recognition performance similar to that achieved when the number of noisy bands is known. The results show a very significant improvement over the traditional full-band model, without requiring prior information on either the position or the number of noisy bands. The principle of the algorithm for selecting the order based on state duration may also be applied to other sub-band combination methods.


IEEE Signal Processing Letters | 2012

Contrasting the Effects of Different Frequency Bands on Speaker and Accent Identification

Saeid Safavi; Abualsoud Hanani; Martin J. Russell; Peter Jancovic; Michael J. Carey

This letter presents an experimental study investigating the effect of frequency sub-bands on regional accent identification (AID) and speaker identification (SID) performance on the ABI-1 corpus. The AID and SID systems are based on Gaussian mixture modeling. The SID experiments show up to 100% accuracy when using the full 11.025 kHz bandwidth. The best AID performance of 60.34% is obtained when using band-pass filtered (0.23-3.4 kHz) speech. The experiments using isolated narrow sub-bands show that the regions (0-0.77 kHz) and (3.40-11.02 kHz) are the most useful for SID, while those in the region (0.34-3.44 kHz) are best for AID. AID experiments are also performed with intersession variability compensation, which provides the biggest performance gain in the (2.23-5.25 kHz) region.


IEEE International Conference on Healthcare Informatics | 2014

Intelligent Assistive System Using Real-Time Action Recognition for Stroke Survivors

Emilie M. D. Jean-Baptiste; Roozbeh Nabiei; Manish Parekh; Evangelia Fringi; Bogna Drozdowska; Chris Baber; Peter Jancovic; Pia Rotshein; Martin J. Russell

CogWatch is an EU project developing technologies for cognitive rehabilitation of stroke patients. The CogWatch prototype is an automatic system to re-train patients with Apraxia or Action Disorganization Syndrome (AADS) to complete activities of daily living (ADLs). This paper describes the approach to automatic planning based on a Markov Decision Process, and real-time action recognition (AR) based on instrumented objects using Hidden Markov Models. The experimental results demonstrate the ability of a psychologically plausible planning system integrated in a Task Model (TM) to improve task performance via user simulation, and the viability of the approach to AR.


2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE) | 2016

Delay reduction in real-time recognition of human activity for stroke rehabilitation

Roozbeh Nabiei; Maryam Najafian; Manish Parekh; Peter Jancovic; Martin J. Russell

Assisting patients to perform activities of daily living (ADLs) is a challenging task for both human and machine. Hence, developing a computer-based rehabilitation system to re-train patients to carry out daily activities is an essential step towards facilitating rehabilitation of stroke patients with apraxia and action disorganization syndrome (AADS). This paper presents a real-time hidden Markov model (HMM) based human activity recognizer, and proposes a technique to reduce the time delay incurred during the decoding stage. Results are reported for complete tea-making trials. In this study, the input features are recorded using sensors attached to the objects involved in the tea-making task, plus hand coordinate data captured using a Kinect™ sensor. A coaster of sensors, comprising an accelerometer and three force-sensitive resistors, is packaged in a unit which can be easily attached to the base of an object. A parallel asynchronous set of detectors, each responsible for the detection of one sub-goal in the tea-making task, is used to address challenges arising from overlaps between human actions. The proposed activity recognition system with the modified HMM topology provides a practical solution to the action recognition problem and reduces the time delay by 64% with no loss in accuracy.

Collaboration

Top Co-Authors

Xin Zou (University of Birmingham)
Ji Ming (Queen's University Belfast)
Philip Weber (University of Birmingham)
Linxue Bai (University of Birmingham)
S. M. Houghton (University of Birmingham)
Saeid Safavi (University of Birmingham)
Cham Athwal (Birmingham City University)
Darryl Stewart (Queen's University Belfast)