Publication


Featured research published by James D. Gordy.


IEEE Transactions on Audio, Speech, and Language Processing | 2006

On the perceptual performance limitations of echo cancellers in wideband telephony

James D. Gordy; Rafik A. Goubran

In this paper, standard echo canceller performance measures are evaluated in terms of psychoacoustic aspects of human hearing. The focus is on wideband speech communications systems with long round-trip delays of 200 ms and up present in the transmission path. The results of a simple acoustic echo cancellation experiment are analyzed with a standard psychoacoustic model, revealing that steady-state echo return loss enhancement and mean square error cannot be used to determine whether residual echo is perceivable in the presence of background noise. In addition, a simple modification to the normalized least mean square (NLMS) algorithm is introduced by adding a perceptual preemphasis filter. Simulation results and listening tests show that it is possible to improve the perceived performance of an echo canceller during convergence by placing greater emphasis on frequencies at which the human auditory system is most sensitive.
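For readers unfamiliar with the baseline algorithm named in this abstract, the following is a minimal sketch of a plain NLMS adaptive filter identifying a short echo path. It does not reproduce the paper's perceptual preemphasis filter, whose design is not given here. The function name `nlms_identify`, the four-tap `echo_path`, the step size, and the white-noise input are all illustrative assumptions.

```python
import random

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Plain NLMS: adapt an FIR filter so its output tracks d given input x."""
    w = [0.0] * num_taps            # adaptive filter coefficients
    buf = [0.0] * num_taps          # most recent input samples, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # echo estimate
        e = dn - y                                   # residual echo
        norm = sum(b * b for b in buf) + eps         # input energy (regularized)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors

# Hypothetical noise-free echo path and white-noise far-end signal.
rng = random.Random(0)
echo_path = [0.5, -0.3, 0.2, 0.1]
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Generate the echo by convolving x with the echo path.
d = []
hist = [0.0] * len(echo_path)
for xn in x:
    hist = [xn] + hist[:-1]
    d.append(sum(h * b for h, b in zip(echo_path, hist)))

w_hat, errors = nlms_identify(x, d, len(echo_path))
```

With a noise-free echo and a white input, `w_hat` should settle very close to `echo_path`. Speech input and background noise, the conditions the paper studies, make convergence slower and less uniform.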


International Conference on Multimedia and Expo | 2005

A Perceptual Performance Measure for Adaptive Echo Cancellers in Packet-Based Telephony

James D. Gordy; Rafik A. Goubran

This paper investigates performance measures of adaptive echo cancellers for packet-based telephony. It is shown that steady-state echo return loss enhancement (ERLE) does not accurately reflect perceived echo canceller convergence when background noise is present. An upper bound is derived for the maximum perceivable ERLE achievable in practice, and an algorithm is introduced for calculating ERLE that incorporates these masking effects based on a perceptual hearing model. Simulation and informal listening test results show a clear correspondence between the new performance measure and the perceptual upper bound induced by background noise.
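As a point of reference, the conventional (non-perceptual) ERLE is usually computed as the ratio of echo power before cancellation to residual power after cancellation, expressed in dB. The sketch below shows only that standard definition, not the perceptual measure proposed in the paper. The function name `erle_db` and the sample values are illustrative assumptions.

```python
import math

def erle_db(mic_signal, residual_signal):
    """Conventional ERLE in dB: microphone echo power over residual power."""
    p_mic = sum(v * v for v in mic_signal) / len(mic_signal)
    p_res = sum(v * v for v in residual_signal) / len(residual_signal)
    return 10.0 * math.log10(p_mic / p_res)

# Hypothetical signals: the residual has one tenth of the echo amplitude,
# which corresponds to a 20 dB power reduction.
echo = [1.0, -1.0, 1.0, -1.0]
residual = [0.1, -0.1, 0.1, -0.1]
value = erle_db(echo, residual)
```

When the residual also contains background noise, this ratio cannot exceed the echo-to-noise ratio no matter how well the filter has converged. That is one reason a plain ERLE figure can be hard to interpret in noisy conditions.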


Instrumentation and Measurement Technology Conference | 2005

Fast System Identification Using Affine Projection and a Critically Sampled Subband Adaptive Filter

James D. Gordy; Rafik A. Goubran

This paper investigates the use of a subband affine projection (AP) algorithm for solving system identification problems using critically sampled subband adaptive filters. The subband AP is first analyzed with respect to theoretical rate of convergence and computational complexity. Simulation results for the algorithm are presented in the context of measuring a room impulse response for acoustic echo cancellation and tracking changes to the impulse response over time. In these simulations, the subband AP is compared to other subband adaptation algorithms using two-, four-, and eight-channel filter banks and to fullband adaptation algorithms. To evaluate the algorithm in a practical implementation, experimental results are presented for echo cancellation using speech input signals in a conference room. It is shown that a four-channel filter bank with subband AP can achieve an average mean square error that is 5 dB lower than a subband normalized least-mean-square algorithm during initial filter convergence.


International Conference on Acoustics, Speech, and Signal Processing | 2004

A combined LPC-based speech coder and filtered-X LMS algorithm for acoustic echo cancellation

James D. Gordy; Rafik A. Goubran

This paper presents a novel acoustic echo canceller structure based on combining the filtered-X LMS algorithm with an LPC-based speech coder for use in videoconferencing and VoIP. The algorithm updates coefficients using filtered versions of the input and error signals obtained by directly tapping the short-term excitation signal from the speech decoder, and by filtering the error signal with a bank of FIR decorrelation filters constructed from the LPC synthesis filter coefficients. The proposed algorithm was implemented using ITU G.729, and simulation results with 2000-tap room impulse responses show a faster and more constant rate of convergence than NLMS using speech input signals and an average 10 dB greater ERLE observed during convergence.


Workshop on Applications of Signal Processing to Audio and Acoustics | 2005

A low-complexity doubletalk detector for acoustic echo cancellers in packet-based telephony

James D. Gordy; Rafik A. Goubran

This paper investigates a normalized cross-correlation-based doubletalk detector for acoustic echo cancellers. A novel low-complexity version is presented suitable for implementation on IP-enabled telephones employing LPC-based speech coders. In particular, the algorithm obtains a decorrelated input signal and decorrelation filter coefficients directly from the speech decoder. The proposed algorithm is implemented using ITU G.729, and simulation results are collected for a typical room. Calibration data are obtained and shown to be independent of the speech input signal. It is also shown that the proposed algorithm has the same performance as a full-complexity version employing the same decorrelation filter order.


Canadian Conference on Electrical and Computer Engineering | 2008

Beamformer performance limits in monaural and binaural hearing aid applications

James D. Gordy; Martin Bouchard; Tyseer Aboulnasr

This paper investigates the performance limits of beamforming algorithms in monaural and binaural multi-microphone hearing aids. Directional microphones and minimum variance distortionless response (MVDR) beamformer designs are reviewed in the context of hearing aids. A general binaural filter-and-sum beamformer structure is described based on combining microphone signals from right and left hearing aids. The monaural and binaural algorithms are evaluated with respect to directivity pattern, directivity index, and noise gain. Simulation results show that for a frontal source signal, a binaural beamformer is capable of increasing the directivity by approximately 3 dB compared to a single-array beamformer, but with no increase in the noise gain.
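The MVDR design reviewed in this abstract has a well-known closed form, w = R^-1 d / (d^H R^-1 d), where R is the noise covariance matrix and d is the steering vector toward the source. The sketch below evaluates it for a two-microphone array at a single frequency. The function name `mvdr_weights_2mic`, the identity noise covariance, and the steering vector are illustrative assumptions and are not taken from the paper.

```python
def mvdr_weights_2mic(R, d):
    """MVDR weights w = R^-1 d / (d^H R^-1 d) for a 2x2 covariance R."""
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    # Closed-form inverse of a 2x2 matrix.
    r_inv = [[R[1][1] / det, -R[0][1] / det],
             [-R[1][0] / det, R[0][0] / det]]
    u = [r_inv[0][0] * d[0] + r_inv[0][1] * d[1],
         r_inv[1][0] * d[0] + r_inv[1][1] * d[1]]          # R^-1 d
    denom = d[0].conjugate() * u[0] + d[1].conjugate() * u[1]  # d^H R^-1 d
    return [u[0] / denom, u[1] / denom]

# Hypothetical case: spatially uncorrelated noise of equal power at both
# microphones, and a source that reaches the second microphone with a
# 90-degree phase shift relative to the first.
R = [[1.0 + 0j, 0.0 + 0j],
     [0.0 + 0j, 1.0 + 0j]]
d = [1.0 + 0j, 0.0 + 1j]
w = mvdr_weights_2mic(R, d)

# Beamformer response toward the source, w^H d; the distortionless
# constraint requires this to equal 1.
response = w[0].conjugate() * d[0] + w[1].conjugate() * d[1]
```

The distortionless constraint is what distinguishes MVDR from an unconstrained noise minimizer: the source direction passes with unit gain while output noise power is minimized.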


International Conference on Acoustics, Speech, and Signal Processing | 2008

Reduced-complexity proportionate NLMS employing block-based selective coefficient updates

James D. Gordy; Tyseer Aboulnasr; Martin Bouchard

This paper proposes a selective coefficient update algorithm for reducing the complexity of the proportionate normalized least-mean-square (P-NLMS) class of algorithms. It is shown that an optimal subset of coefficients to update, namely those minimizing the a posteriori error, cannot be constructed efficiently. A suboptimal block-based coefficient selection algorithm is presented that combines proportional weighting of the input signal vector with fast ranking methods. It is compared to existing suboptimal algorithms with respect to complexity overhead and convergence rate. Simulations show that the proposed algorithm produces performance approaching that of the optimal subset while maintaining a low coefficient selection overhead.


IEEE Transactions on Audio, Speech, and Language Processing | 2007

Statistical Analysis of Doubletalk Detection for Calibration and Performance Evaluation

James D. Gordy; Rafik A. Goubran

Doubletalk detection is an important part of a practical echo canceller implementation, but a difficult problem is calibrating the doubletalk detector for arbitrary environments and input signals. In this paper, it is shown that a statistical model of a doubletalk detection variable's probability density function (PDF) can be used to obtain an optimal detection threshold and expected detection performance curves. In particular, a statistical analysis of a recently proposed cross-correlation-based doubletalk detector is presented. The doubletalk detection variable is modeled in terms of its constituent parameter estimators, resulting in conditional PDFs in the absence and presence of doubletalk. These are used to obtain a signal-adaptive detection threshold for calibration, and to provide the expected doubletalk detection probability. Simulations are presented comparing the theoretical and measured detection probabilities, relative to a fixed detection threshold, for speech input and doubletalk signals. The results indicate a close agreement with the proposed model for moderate-to-high levels of doubletalk.


International Conference on Multimedia and Expo | 2006

Postfiltering for Suppression of Residual Echo from Vocoder Distortion in Packet-Based Telephony

James D. Gordy; Rafik A. Goubran

This paper investigates postfiltering for residual echo suppression in networks employing low-bit-rate speech compression in the echo path. Simulations show that the residual echo from nonlinear vocoder distortion with ITU G.729 is proportional to the input signal LPC spectrum. An algorithm is proposed to estimate the residual echo power spectrum using a frequency-dependent scaling factor. The algorithm is incorporated into a psychoacoustic postfilter for residual echo suppression and compared to an existing estimator with a fixed scaling factor. Experiments with speech input and near-end signals show an average 0.85 dB lower spectral distortion and 0.4 higher estimated mean opinion score.


Asilomar Conference on Signals, Systems and Computers | 2004

Reduced-delay mixing of compressed speech signals for VoIP and cellular telephony

James D. Gordy; Rafik A. Goubran

This paper introduces techniques for reducing the computational complexity of re-encoding mixed speech using LPC-based vocoders, with application to conferencing over VoIP and cellular telephony networks. Our methods exploit a priori knowledge of the LPC synthesis filter coefficients, codebook gains and pitch periods for the compressed source speech signals. Experimental results with a modified G.729 vocoder show an average spectral distortion of 1.61 dB compared to reference mixed signals, 1.34 dB less than that introduced by the full encoder. Furthermore, the proposed techniques introduce no additional decrease in speech quality measured using ITU P.862 (PESQ) and informal MOS tests.
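The abstract reports spectral distortion in dB without stating the exact formula. One common definition is the root-mean-square difference between two log power spectra, sketched below for a single frame. This may not be the definition the authors used. The function name `log_spectral_distortion_db` and the spectra are illustrative assumptions.

```python
import math

def log_spectral_distortion_db(ref_power, test_power):
    """RMS difference, in dB, between two power spectra of equal length."""
    diffs = [10.0 * math.log10(r / t) for r, t in zip(ref_power, test_power)]
    return math.sqrt(sum(x * x for x in diffs) / len(diffs))

# Hypothetical power spectra: every bin of the test spectrum is ten times
# the reference, so each bin differs by 10 dB and the RMS value is 10 dB
# up to floating-point rounding.
ref = [1.0, 2.0, 4.0]
test = [10.0, 20.0, 40.0]
sd = log_spectral_distortion_db(ref, test)
```

In practice the measure is typically averaged over many speech frames, which is consistent with the "average spectral distortion" wording in the abstract.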
