
Publication


Featured research published by Douglas D. O'Shaughnessy.


EURASIP Journal on Audio, Speech, and Music Processing | 2008

Experiments on automatic recognition of nonnative Arabic speech

Yousef Ajami Alotaibi; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy

The automatic recognition of foreign-accented Arabic speech is a challenging task because it involves a large number of nonnative accents. In addition, the nonnative speech data available for training are generally insufficient. Moreover, compared to other languages, Arabic has attracted relatively few research efforts. In this paper, we are concerned with the problem of nonnative speech in a speaker-independent, large-vocabulary speech recognition system for Modern Standard Arabic (MSA). We analyze some major differences at the phonetic level in order to determine which phonemes play a significant part in recognition performance for both native and nonnative speakers. Special attention is given to specific Arabic phonemes. The performance of an HMM-based Arabic speech recognition system is analyzed with respect to speaker gender and native origin. The WestPoint Modern Standard Arabic database from the Linguistic Data Consortium (LDC) and the Hidden Markov Model Toolkit (HTK) are used throughout all experiments. Our study shows that the best overall phoneme recognition performance is obtained when nonnative speakers are involved in both the training and testing phases. This is not the case when a language model and phonetic lattice networks are incorporated in the system. At the phonetic level, the results show that female nonnative speakers perform better than male nonnative speakers, and that emphatic phonemes yield a significant decrease in performance when they are uttered by both male and female nonnative speakers.


Information Sciences, Signal Processing and their Applications | 2010

Text-independent distributed speaker identification and verification using GMM-UBM speaker models for mobile communications

Foezur Rahman Chowdhury; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy

This paper presents the simulation results of a speaker identification and verification (SIDV) system that would be efficient for resource-limited mobile devices. The proposed system works as a text-independent system within the distributed speech recognition (DSR) framework and is designed to identify a target speaker or imposter using short digit utterances rather than long utterances. In this distributed SIDV (DSIDV), the target speaker model is developed using the most popular generative approach, the GMM-UBM system. A Gaussian Mixture Model (GMM) for each true speaker is derived from the Universal Background Model (UBM) by Bayesian maximum a posteriori (MAP) adaptation. The objective of this paper is to show how speaker recognition and verification over telephone channels can be done using short utterances and DSR technology robust to channel distortions. The ETSI Aurora2 speech corpus was used in these experiments. The experimental results show that the proposed DSIDV system yields excellent identification and detection performance in an ETSI DSR evaluation task and would be suitable for small handheld mobile devices.
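The derivation of a speaker GMM from a UBM by MAP adaptation can be illustrated with a minimal sketch. It assumes the common mean-only "relevance factor" form of MAP adaptation on one-dimensional features; the abstract does not state which parameters the authors adapt, so the function names, the `relevance` value, and the 1-D simplification are all illustrative assumptions rather than the paper's configuration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_adapt_means(ubm_weights, ubm_means, ubm_vars, frames, relevance=16.0):
    """Return speaker-adapted means; weights and variances stay those of the UBM.

    `relevance` is an assumed relevance factor: larger values keep the
    adapted means closer to the UBM when little speaker data is available.
    """
    counts = [0.0] * len(ubm_means)      # soft frame counts per mixture
    weighted = [0.0] * len(ubm_means)    # posterior-weighted sums of the frames
    for x in frames:
        likes = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(ubm_weights, ubm_means, ubm_vars)]
        total = sum(likes)
        if total == 0.0:
            continue                     # frame too far from every mixture
        for i, like in enumerate(likes):
            post = like / total          # posterior of mixture i for this frame
            counts[i] += post
            weighted[i] += post * x
    adapted = []
    for i, mu in enumerate(ubm_means):
        alpha = counts[i] / (counts[i] + relevance)   # data-dependent mixing weight
        data_mean = weighted[i] / counts[i] if counts[i] > 0.0 else mu
        adapted.append(alpha * data_mean + (1.0 - alpha) * mu)
    return adapted
```

With only a few frames, `alpha` stays small and the speaker model remains close to the UBM, which is one reason this style of adaptation is often considered for short utterances.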


International Journal of Speech Technology | 2012

Bayesian on-line spectral change point detection: a soft computing approach for on-line ASR

M. F. Chowdhury; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy

Current automatic speech recognition (ASR) works in off-line mode and needs prior knowledge of stationary or quasi-stationary test conditions to reach the expected word recognition accuracy. These requirements limit the use of ASR in real-world applications, where test conditions are highly non-stationary and not known a priori. This paper presents an innovative frame dynamic rapid adaptation and noise compensation technique for tracking highly non-stationary noises and its application to on-line ASR. The proposed algorithm is based on a soft computing model using Bayesian on-line inference for spectral change point detection (BOSCPD) in unknown non-stationary noises. BOSCPD is tested with the MCRA noise tracking technique for on-line rapid environmental change learning in different non-stationary noise scenarios. The test results show that the proposed BOSCPD technique significantly reduces the delay in spectral change point detection compared to the baseline MCRA and its derivatives. The proposed BOSCPD soft computing model is tested for joint additive and channel distortions compensation (JAC)-based on-line ASR in unknown test conditions using non-stationary noisy speech samples from the Aurora 2 speech database. The simulation results for the on-line ASR show significant improvement in recognition accuracy compared to the baseline Aurora 2 distributed speech recognition (DSR) in batch mode.
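The idea of Bayesian on-line change point detection can be sketched on a scalar sequence. The code below is a generic run-length filter with a constant hazard rate and a Gaussian observation model of known variance; it is not the BOSCPD algorithm of the paper, which operates on noise spectra and is not specified in the abstract. All names and default values are assumptions chosen for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_run_lengths(data, hazard=0.02, obs_var=1.0, prior_mean=0.0, prior_var=100.0):
    """Return the most probable run length after each observation.

    A drop of the run length back to a small value signals a detected change.
    """
    run_probs = [1.0]                        # P(run length = r) before any data
    params = [(prior_mean, prior_var)]       # belief about the segment mean per run length
    result = []
    for x in data:
        # Predictive probability of x under each current run length.
        pred = [normal_pdf(x, m, v + obs_var) for m, v in params]
        joint = [p * q for p, q in zip(run_probs, pred)]
        change = hazard * sum(joint)                     # run length resets to 0
        growth = [(1.0 - hazard) * j for j in joint]     # run length grows by 1
        run_probs = [change] + growth
        total = sum(run_probs)
        run_probs = [p / total for p in run_probs]
        # Conjugate update of the mean belief for every surviving run.
        updated = []
        for m, v in params:
            new_var = 1.0 / (1.0 / v + 1.0 / obs_var)
            updated.append((new_var * (m / v + x / obs_var), new_var))
        params = [(prior_mean, prior_var)] + updated
        result.append(max(range(len(run_probs)), key=run_probs.__getitem__))
    return result
```

Because the run-length distribution is updated after every sample, a level shift shows up as soon as the first sample of the new segment arrives, which is the property that makes this family of methods attractive for on-line use.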


International Journal of Computers and Applications | 2007

Incorporating phonetic knowledge into an evolutionary subspace approach for robust speech recognition

Sid-Ahmed Selouani; Douglas D. O'Shaughnessy; Jean Caelen

The reliability of automatic speech recognition (ASR) systems is closely related to the parameterization process, which is expected to accurately characterize the phonetic, dynamic and static components in speech. For this purpose, ASR methods build speech sound models based on large speech corpora that attempt to include common sources of variability that may occur in real-life conditions. Nevertheless, not all variabilities can reasonably be covered. For that reason, the performance of current ASR systems, whose designs are predicated on relatively noise-free conditions, degrades rapidly in the presence of high-level adverse conditions. To cope with mismatched (adverse) conditions and to achieve noise robustness, we present in this paper an original approach that operates in two steps. The first one consists of integrating in the front-end process, besides mean-subtracted mel-frequency cepstral coefficients, acoustic distinctive features that provide a more convenient interface to higher-level components of ASR systems. The second step consists of combining subspace filtering and genetic algorithms to get less-variant parameters. The advantages of this approach include that no estimation of noise is required and the recognition system is not modified. The effectiveness of the method is assessed in high interfering car noise by using a noisy subset of the TIMIT database. The results obtained show that the proposed method drastically reduces the word error rate for a wide range of signal-to-noise ratios.


Canadian Conference on Electrical and Computer Engineering | 2010

Frame recursive dynamic mean bias removal technique for robust environment-aware speech recognition in real world applications

Foezur Rahman Chowdhury; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy

In this paper, we investigated and simulated the frame recursive dynamic mean bias removal technique in the cepstral domain with a time smoothing parameter in order to improve the robustness of automatic speech recognition (ASR) in real-time environments. The objective of this simulation was to examine the suitability of the frame recursive cepstral mean bias removal technique as part of an effort to develop a single-channel joint additive noise and channel distortion compensation (JAC) algorithm in feature space for real-world applications. The Aurora2 speech corpus was used in this simulation. The simulation results show that the frame recursive dynamic mean bias removal technique performs better in real-time scenarios than conventional (non-real-time) approaches at improving the robustness of ASR under noisy conditions.
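A frame recursive mean removal with a time smoothing parameter can be sketched as an exponentially weighted running mean that is subtracted from each cepstral frame as it arrives. The update rule, the initialisation with the first frame, and the default `alpha` are assumptions made for illustration; the abstract does not give the exact recursion used by the authors.

```python
def recursive_mean_removal(frames, alpha=0.95):
    """Subtract an exponentially smoothed running mean from each cepstral frame.

    `frames` is a list of equal-length coefficient lists. `alpha` is the
    assumed time smoothing parameter: values near 1 make the mean change slowly.
    """
    if not frames:
        return []
    mean = list(frames[0])   # initialise the running mean with the first frame
    out = []
    for frame in frames:
        # Update the running mean using only the current and past frames.
        mean = [alpha * m + (1.0 - alpha) * c for m, c in zip(mean, frame)]
        out.append([c - m for c, m in zip(frame, mean)])
    return out
```

Unlike utterance-level cepstral mean subtraction, this form needs no look-ahead, so a constant channel offset added to every frame is cancelled without waiting for the end of the utterance.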


International Symposium on Signal Processing and Information Technology | 2004

Investigation into a Mel subspace based front-end processing for robust speech recognition

Sid-Ahmed Selouani; Douglas D. O'Shaughnessy

This paper addresses the issue of noise reduction applied to robust large-vocabulary continuous-speech recognition (CSR). We investigate strategies based on subspace filtering, which has proven very effective in the area of speech enhancement. We compare original hybrid techniques that combine the Karhunen-Loève transform (KLT), multilayer perceptron (MLP) and genetic algorithms (GAs) in order to get less-variant Mel-frequency parameters. The advantages of these methods include that they do not require estimation of either noise or speech spectra. To evaluate the effectiveness of these methods, an extensive set of recognition experiments is carried out in a severe interfering car noise environment for a wide range of SNRs varying from 16 dB to -4 dB using a noisy version of the TIMIT database.
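For readers unfamiliar with how a noisy test set at a given SNR is typically built, the following sketch scales a noise signal so that the ratio of average speech power to average noise power matches a target in dB. It illustrates the general definition only; the abstract does not describe how the noisy TIMIT version was produced, and the function name is hypothetical.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` after scaling it to the requested SNR in dB."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0.0:
        raise ValueError("noise signal has zero power")
    # SNR_dB = 10 * log10(speech_power / (gain^2 * noise_power)), solved for gain.
    gain = math.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

At -4 dB the scaled noise carries about 2.5 times the power of the speech, which gives a sense of how severe the lower end of the tested range is.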


International Conference on Acoustics, Speech, and Signal Processing | 2012

A highly non-stationary noise tracking and compensation algorithm, with applications to speech enhancement and on-line ASR

Foezur Rahman Chowdhury; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy

This paper presents a noise tracking and estimation algorithm for highly non-stationary noises using the Bayesian on-line spectral change point detection (BOSCPD) technique. In BOSCPD, the local minima search window update technique of the minima controlled recursive averaging (MCRA) algorithm is made a function of spectral change point detection. The novelty of this algorithm is that it can detect rapid changes instantly and update the non-stationary noise estimate more quickly than the MCRA-based algorithms. The BOSCPD algorithm shows improvement in objective quality measures in terms of higher SNR and lower output distortion scores for speech enhancement. It is also tested to track and compensate for rapidly varying noises in on-line automatic speech recognition (ASR) using the Aurora 2 speech database. The simulation results show significant improvement in recognition accuracy compared to the baseline MCRA technique.


Information Sciences, Signal Processing and their Applications | 2012

A soft computing approach to improve the robustness of on-line ASR in previously unseen highly non-stationary acoustic environments

Mohammad Foezur Rahman Chowdhury; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy

This paper presents a soft noise compensation algorithm in the feature space to improve the noise robustness of HMM-based on-line automatic speech recognition (ASR) in unknown highly non-stationary acoustic environments. Current hard computing techniques fail to track and compensate for non-stationary noises properly in previously unseen acoustic environments. The proposed soft noise compensation algorithm is based on a joint additive background noises and channel distortions compensation (JAC) technique in feature space. In this novel soft JAC (SJAC), we use an evolutionary dynamic multi-swarm particle swarm optimization (DMS-PSO)-based soft computing (SC) technique in the front-end, and a frame synchronous bias compensation technique in the back-end of the ASR, respectively, for frame adaptive modeling and compensation of the background additive noises and channel distortions in feature space that are highly non-linear and non-Gaussian. The experimental results show that the proposed evolutionary DMS-PSO-based SJAC technique achieves significant improvement in the recognition performance of on-line ASR compared to our previously developed baseline Bayesian on-line spectral change point detection (BOSCPD)-based SJAC technique when evaluated on the Aurora 2 speech database.
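As background on the optimizer family named above, here is a minimal sketch of plain global-best particle swarm optimization. It is deliberately the basic single-swarm variant, not the dynamic multi-swarm DMS-PSO used in the paper, and the inertia and acceleration constants are commonly quoted defaults assumed here for illustration.

```python
import random


def pso_minimize(objective, dim, bounds, particles=20, iterations=200, seed=0):
    """Minimize `objective` over a box with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    w, c1, c2 = 0.729, 1.49445, 1.49445      # assumed inertia and acceleration constants
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]           # best position seen by each particle
    best_val = [objective(p) for p in pos]
    g = min(range(particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val
```

The search needs only objective values, not gradients, which is why swarm methods are often considered for non-linear, non-Gaussian compensation problems such as the one described in the abstract.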


European Signal Processing Conference | 2009

A method utilizing window function frequency characteristics for noise-robust spectral pitch estimation

Iman Haji Abolhassani; Douglas D. O'Shaughnessy; Sid-Ahmed Selouani


European Signal Processing Conference | 2008

Speech enhancement based on a hybrid a priori signal-to-noise ratio (SNR) estimator and a self-adaptive Lagrange multiplier

Md. Jahangir Alam; Sid-Ahmed Selouani; Douglas D. O'Shaughnessy

Collaboration



Top Co-Authors

Amr H. Nour-Eldin

Université du Québec à Montréal

Hesham Tolba

Université du Québec à Montréal

Iman Haji Abolhassani

Université du Québec à Montréal

M. F. Chowdhury

Université du Québec à Montréal
