James D. Gordy
Carleton University
Publication
Featured research published by James D. Gordy.
IEEE Transactions on Audio, Speech, and Language Processing | 2006
James D. Gordy; Rafik A. Goubran
In this paper, standard echo canceller performance measures are evaluated in terms of psychoacoustic aspects of human hearing. The focus is on wideband speech communications systems with long round-trip delays of 200 ms or more in the transmission path. The results of a simple acoustic echo cancellation experiment are analyzed with a standard psychoacoustic model, revealing that steady-state echo return loss enhancement and mean square error cannot be used to determine whether residual echo is perceivable in the presence of background noise. In addition, a simple modification to the normalized least mean square (NLMS) algorithm is introduced by adding a perceptual preemphasis filter. Simulation results and listening tests show that it is possible to improve the perceived performance of an echo canceller during convergence by placing greater emphasis on frequencies at which the human auditory system is most sensitive.
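For readers unfamiliar with the baseline algorithm, the following is a minimal sketch of standard NLMS system identification, assuming a noise-free microphone signal and a white-noise far-end input. The names `nlms_identify`, `fir`, `echo_path`, `far_end`, and `mic` are hypothetical, and the perceptual preemphasis filter described in the abstract is not reproduced here.

```python
import random

def fir(x, h):
    """Filter signal x with FIR impulse response h (zero initial state)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Adapt an FIR filter with NLMS so its output tracks d.

    Returns (weights, errors). mu is the normalized step size and eps is a
    small regularizer that avoids division by zero for silent input.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps          # most recent input sample first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # echo estimate
        e = dn - y                                   # residual echo
        norm = sum(b * b for b in buf) + eps
        step = mu * e / norm
        w = [wi + step * bi for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors

# Hypothetical 4-tap echo path and white-noise far-end signal.
random.seed(0)
echo_path = [0.6, -0.3, 0.1, 0.05]
far_end = [random.gauss(0.0, 1.0) for _ in range(2000)]
mic = fir(far_end, echo_path)
weights, errors = nlms_identify(far_end, mic, num_taps=4)
```

In this idealized setting the adapted `weights` approach `echo_path`. With speech input and background noise, convergence is slower and the residual error no longer vanishes.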
International Conference on Multimedia and Expo | 2005
James D. Gordy; Rafik A. Goubran
This paper investigates performance measures of adaptive echo cancellers for packet-based telephony. It is shown that steady-state echo return loss enhancement (ERLE) does not accurately reflect perceived echo canceller convergence when background noise is present. An upper bound is derived for the maximum perceivable ERLE achievable in practice, and an algorithm is introduced for calculating ERLE that incorporates these masking effects based on a perceptual hearing model. Simulation and informal listening test results show a clear correspondence between the new performance measure and the perceptual upper bound induced by background noise.
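As a point of reference, conventional ERLE is the ratio of microphone signal power to residual signal power, expressed in dB. The sketch below computes that conventional measure only; it is not the perceptual ERLE proposed in the paper. The function name `erle_db` and the toy signals are assumptions made for illustration. The example also shows why background noise caps the measured value: even if the echo is removed perfectly, the residual still contains the noise.

```python
import math

def erle_db(mic, residual):
    """Conventional ERLE in dB: 10*log10(mean(mic^2) / mean(residual^2))."""
    p_mic = sum(v * v for v in mic) / len(mic)
    p_res = sum(v * v for v in residual) / len(residual)
    return 10.0 * math.log10(p_mic / p_res)

# Toy signals: echo power 1.0, noise power 0.01, uncorrelated over the frame.
echo = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, 0.1, -0.1, -0.1]
mic = [e + n for e, n in zip(echo, noise)]

# With perfect echo removal the residual is the noise alone, so the
# measured ERLE equals 10*log10((P_echo + P_noise) / P_noise).
erle = erle_db(mic, noise)
```

For these toy signals the ratio is (1.0 + 0.01) / 0.01 = 101, or roughly 20 dB, regardless of how well the adaptive filter has converged.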
Instrumentation and Measurement Technology Conference | 2005
James D. Gordy; Rafik A. Goubran
This paper investigates the use of a subband affine projection (AP) algorithm for solving system identification problems using critically sampled subband adaptive filters. The subband AP is first analyzed with respect to theoretical rate of convergence and computational complexity. Simulation results for the algorithm are presented in the context of measuring a room impulse response for acoustic echo cancellation and tracking changes to the impulse response over time. In these simulations, the subband AP is compared to other subband adaptation algorithms using two-, four-, and eight-channel filter banks and to fullband adaptation algorithms. To evaluate the algorithm in a practical implementation, experimental results are presented for echo cancellation using speech input signals in a conference room. It is shown that a four-channel filter bank with subband AP can achieve an average mean square error that is 5 dB lower than a subband normalized least-mean-square algorithm during initial filter convergence.
International Conference on Acoustics, Speech, and Signal Processing | 2004
James D. Gordy; Rafik A. Goubran
This paper presents a novel acoustic echo canceller structure based on combining the filtered-X LMS algorithm with an LPC-based speech coder for use in videoconferencing and VoIP. The algorithm updates coefficients using filtered versions of the input and error signals obtained by directly tapping the short-term excitation signal from the speech decoder, and by filtering the error signal with a bank of FIR decorrelation filters constructed from the LPC synthesis filter coefficients. The proposed algorithm was implemented using ITU G.729, and simulation results with 2000-tap room impulse responses show a faster and more constant rate of convergence than NLMS using speech input signals and an average 10 dB greater ERLE observed during convergence.
Workshop on Applications of Signal Processing to Audio and Acoustics | 2005
James D. Gordy; Rafik A. Goubran
This paper investigates a normalized cross-correlation-based doubletalk detector for acoustic echo cancellers. A novel low-complexity version is presented suitable for implementation on IP-enabled telephones employing LPC-based speech coders. In particular, the algorithm obtains a decorrelated input signal and decorrelation filter coefficients directly from the speech decoder. The proposed algorithm is implemented using ITU G.729, and simulation results are collected for a typical room. Calibration data are obtained and shown to be independent of the speech input signal. It is also shown that the proposed algorithm has the same performance as a full-complexity version employing the same decorrelation filter order.
Canadian Conference on Electrical and Computer Engineering | 2008
James D. Gordy; Martin Bouchard; Tyseer Aboulnasr
This paper investigates the performance limits of beamforming algorithms in monaural and binaural multi-microphone hearing aids. Directional microphones and minimum variance distortionless response (MVDR) beamformer designs are reviewed in the context of hearing aids. A general binaural filter-and-sum beamformer structure is described based on combining microphone signals from right and left hearing aids. The monaural and binaural algorithms are evaluated with respect to directivity pattern, directivity index, and noise gain. Simulation results show that for a frontal source signal, a binaural beamformer is capable of increasing the directivity by approximately 3 dB compared to a single-array beamformer, but with no increase in the noise gain.
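The MVDR design mentioned above has a well-known closed form, w = R⁻¹d / (dᴴR⁻¹d), where R is the noise covariance matrix and d is the steering vector toward the target. The following is a minimal two-microphone sketch of that formula in plain Python. The covariance values, the phase delay, and the helper names `solve` and `mvdr_weights` are hypothetical and are not taken from the paper.

```python
import cmath

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):                    # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def mvdr_weights(R, d):
    """MVDR weights w = R^{-1} d / (d^H R^{-1} d)."""
    rinv_d = solve(R, d)
    denom = sum(di.conjugate() * vi for di, vi in zip(d, rinv_d))
    return [vi / denom for vi in rinv_d]

# Hypothetical steering vector (inter-microphone phase delay of 0.7 rad)
# and a Hermitian, positive-definite noise covariance matrix.
d = [1 + 0j, cmath.exp(-0.7j)]
R = [[1.0 + 0j, 0.3 + 0.1j],
     [0.3 - 0.1j, 1.0 + 0j]]
w = mvdr_weights(R, d)

# Distortionless constraint: the response toward the target, w^H d, is 1.
response = sum(wi.conjugate() * di for wi, di in zip(w, d))
```

The final line checks the defining property of the design: the target direction passes with unit gain while output noise power is minimized subject to that constraint.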
International Conference on Acoustics, Speech, and Signal Processing | 2008
James D. Gordy; Tyseer Aboulnasr; Martin Bouchard
This paper proposes a selective coefficient update algorithm for reducing the complexity of the proportionate normalized least-mean-square (P-NLMS) class of algorithms. It is shown that an optimal subset of coefficients to update, namely those minimizing the a posteriori error, cannot be constructed efficiently. A sub-optimal block-based coefficient selection algorithm is presented that combines proportional weighting of the input signal vector with fast ranking methods. It is compared to existing sub-optimal algorithms with respect to complexity overhead and convergence rate. Simulations show that the proposed algorithm produces performance approaching that of the optimal subset while maintaining a low coefficient selection overhead.
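To illustrate the general idea of selective coefficient updating, here is a sketch of one step of a simple M-max style partial-update NLMS, in which only the `m` taps paired with the largest-magnitude input samples are adapted. This is a generic stand-in, not the block-based proportionate selection proposed in the paper. The function name `mmax_nlms_step` and the example values are assumptions, and a full sort is used for clarity where a practical implementation would use a faster ranking method.

```python
def mmax_nlms_step(w, x_buf, d, m, mu=0.5, eps=1e-8):
    """One partial-update NLMS step.

    w      current filter coefficients
    x_buf  input regressor (same length as w)
    d      desired sample
    m      number of coefficients to update this step
    Returns (new_weights, error, selected_indices).
    """
    y = sum(wi * xi for wi, xi in zip(w, x_buf))
    e = d - y
    # Rank taps by input magnitude and keep the m largest.
    ranked = sorted(range(len(w)), key=lambda i: abs(x_buf[i]), reverse=True)
    selected = set(ranked[:m])
    # Normalize by the full regressor energy (an assumption of this sketch).
    norm = sum(xi * xi for xi in x_buf) + eps
    new_w = [wi + mu * e * xi / norm if i in selected else wi
             for i, (wi, xi) in enumerate(zip(w, x_buf))]
    return new_w, e, selected

# Example: update 2 of 4 taps.
w0 = [0.0, 0.0, 0.0, 0.0]
x_buf = [0.1, -2.0, 0.5, 1.5]
new_w, err, selected = mmax_nlms_step(w0, x_buf, d=1.0, m=2)
```

Here taps 1 and 3 carry the largest input magnitudes, so only those two coefficients change. The trade-off is fewer multiply-accumulates per step against some loss in convergence rate.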
IEEE Transactions on Audio, Speech, and Language Processing | 2007
James D. Gordy; Rafik A. Goubran
Doubletalk detection is an important part of a practical echo canceller implementation, but a difficult problem is calibrating the doubletalk detector for arbitrary environments and input signals. In this paper, it is shown that a statistical model of a doubletalk detection variable's probability density function (PDF) can be used to obtain an optimal detection threshold and expected detection performance curves. In particular, a statistical analysis of a recently proposed cross-correlation-based doubletalk detector is presented. The doubletalk detection variable is modeled in terms of its constituent parameter estimators, resulting in conditional PDFs in the absence and presence of doubletalk. These are used to obtain a signal-adaptive detection threshold for calibration, and to provide expected doubletalk detection probability. Simulations with speech input and doubletalk signals compare the theoretical and measured detection probabilities against those obtained with a fixed detection threshold. The results indicate a close agreement with the proposed model for moderate-to-high levels of doubletalk.
International Conference on Multimedia and Expo | 2006
James D. Gordy; Rafik A. Goubran
This paper investigates postfiltering for residual echo suppression in networks employing low-bit-rate speech compression in the echo path. Simulations show that the residual echo from nonlinear vocoder distortion with ITU G.729 is proportional to the input signal LPC spectrum. An algorithm is proposed to estimate the residual echo power spectrum using a frequency-dependent scaling factor. The algorithm is incorporated into a psychoacoustic postfilter for residual echo suppression and compared to an existing estimator with a fixed scaling factor. Experiments with speech input and near-end signals show an average 0.85 dB lower spectral distortion and 0.4 higher estimated mean opinion score.
Asilomar Conference on Signals, Systems and Computers | 2004
James D. Gordy; Rafik A. Goubran
This paper introduces techniques for reducing the computational complexity of re-encoding mixed speech using LPC-based vocoders, with application to conferencing over VoIP and cellular telephony networks. Our methods exploit a priori knowledge of the LPC synthesis filter coefficients, codebook gains and pitch periods for the compressed source speech signals. Experimental results with a modified G.729 vocoder show an average spectral distortion of 1.61 dB compared to reference mixed signals, 1.34 dB less than that introduced by the full encoder. Furthermore, the proposed techniques introduce no additional decrease in speech quality measured using ITU P.862 (PESQ) and informal MOS tests.