Publication


Featured research published by Jean-Marc Valin.


IEEE Transactions on Audio, Speech, and Language Processing | 2010

A High-Quality Speech and Audio Codec With Less Than 10-ms Delay

Jean-Marc Valin; Timothy B. Terriberry; Christopher B. Montgomery; Gregory Maxwell

With increasing quality requirements for multimedia communications, audio codecs must maintain both high quality and low delay. Typically, audio codecs offer either low delay or high quality, but rarely both. We propose a codec that simultaneously addresses both these requirements, with a delay of only 8.7 ms at 44.1 kHz. It uses gain-shape algebraic vector quantization in the frequency domain with time-domain pitch prediction. We demonstrate that the proposed codec operating at 48 kb/s and 64 kb/s outperforms both G.722.1C and MP3 and has quality comparable to AAC-LD, despite having less than one fourth of the algorithmic delay of these codecs.


IEEE Transactions on Audio, Speech, and Language Processing | 2007

On Adjusting the Learning Rate in Frequency Domain Echo Cancellation With Double-Talk

Jean-Marc Valin

One of the main difficulties in echo cancellation is the fact that the learning rate needs to vary according to conditions such as double-talk and echo path change. In this paper, we propose a new method of varying the learning rate of a frequency-domain echo canceller. This method is based on the derivation of the optimal learning rate of the normalized least mean square (NLMS) algorithm in the presence of noise. The method is evaluated in conjunction with the multidelay block frequency domain (MDF) adaptive filter. We demonstrate that it performs better than current double-talk detection techniques and is simple to implement.
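For readers unfamiliar with NLMS, the following is a minimal time-domain sketch of the textbook algorithm with a fixed learning rate. It is not the paper's method: the variable learning rate, the frequency-domain MDF structure, and double-talk are all left out. The function name `nlms_identify`, the three-tap `echo_path`, and the values of `mu` and `eps` are illustrative assumptions.

```python
import random


def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Textbook NLMS: adapt FIR weights so that filtering x approximates d.

    mu is a fixed learning rate (assumption: the paper varies it over time).
    """
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Most recent num_taps input samples, newest first, zero-padded.
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * uk for wk, uk in zip(w, u))   # echo estimate
        e = d[n] - y                               # residual after cancellation
        norm = sum(uk * uk for uk in u) + eps      # input energy normalisation
        w = [wk + mu * e * uk / norm for wk, uk in zip(w, u)]
        errors.append(e)
    return w, errors


# Hypothetical noise-free setup: a short echo path driven by white noise.
rng = random.Random(0)
echo_path = [0.5, -0.3, 0.1]
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [
    sum(echo_path[k] * far_end[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(far_end))
]
weights, errors = nlms_identify(far_end, mic, num_taps=3)
```

With no near-end speech or noise, a fixed learning rate is enough for `weights` to approach `echo_path`. The difficulty the abstract describes arises once noise or double-talk is added to `mic`, because a fixed `mu` then trades convergence speed against misadjustment.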


International Conference on Acoustics, Speech, and Signal Processing | 1999

On the limits of speech recognition in noise

Stephen Douglas Peters; Peter Stubley; Jean-Marc Valin

We consider the performance of speech recognition in noise and focus on its sensitivity to the acoustic feature set. In particular, we examine the perceived information reduction imposed on a speech signal using a feature extraction method commonly used for automatic speech recognition. We observe that the human recognition rates on noisy digit strings drop considerably as the speech signal undergoes the typical loss of phase and loss of frequency resolution. Steps are taken to ensure that human subjects are constrained in ways similar to those of an automatic recognizer. The high correlation between the performance of the human listeners and that of our connected digit recognizer leads us to some interesting conclusions, including that typical cepstral processing is insufficient to support speech information in noise.


arXiv: Sound | 2008

Perceptually-Motivated Nonlinear Channel Decorrelation for Stereo Acoustic Echo Cancellation

Jean-Marc Valin

Acoustic echo cancellation with stereo signals is generally an under-determined problem because of the high coherence between the left and right channels. In this paper, we present a novel method of significantly reducing inter-channel coherence without affecting the audio quality. Our work takes into account psychoacoustic masking and binaural auditory cues. The proposed non-linear processing combines a shaped comb-allpass (SCAL) filter with the injection of psychoacoustically masked noise. We show that the proposed method performs significantly better than other known methods for reducing inter-channel coherence.


IEEE Signal Processing Letters | 2007

Interference-Normalized Least Mean Square Algorithm

Jean-Marc Valin; Iain B. Collings

An interference-normalized least mean square (INLMS) algorithm for robust adaptive filtering is proposed. The INLMS algorithm extends the gradient-adaptive learning rate approach to the case where the signals are nonstationary. In particular, we show that the INLMS algorithm can work even for highly nonstationary interference signals, where previous gradient-adaptive learning rate algorithms fail.


International Conference on Communications | 2009

Reflected Simplex Codebooks for Limited Feedback MIMO Beamforming

Daniel J. Ryan; Iain B. Collings; Jean-Marc Valin

This paper proposes Reflected Simplex codebooks for limited feedback beamforming in multiple-input multiple-output (MIMO) wireless systems. The codebooks are a geometric construction based on simplices and the A_n lattice. We propose a fast codebook search and indexing algorithm. We show that such codebooks perform better than or comparably to other codebooks, with much lower implementation complexity.


Electronic Imaging | 2015

Perceptual vector quantization for video coding

Jean-Marc Valin; Timothy B. Terriberry

This paper applies energy conservation principles to the Daala video codec using gain-shape vector quantization to encode a vector of AC coefficients as a length (gain) and direction (shape). The technique originates from the CELT mode of the Opus audio codec, where it is used to conserve the spectral envelope of an audio signal. Conserving energy in video has the potential to preserve textures rather than low-passing them. Explicitly quantizing a gain allows a simple contrast masking model with no signaling cost. Vector quantizing the shape keeps the number of degrees of freedom the same as scalar quantization, avoiding redundancy in the representation. We demonstrate how to predict the vector by transforming the space it is encoded in, rather than subtracting off the predictor, which would make energy conservation impossible. We also derive an encoding of the vector-quantized codewords that takes advantage of their non-uniform distribution. We show that the resulting technique outperforms scalar quantization by an average of 0.90 dB on still images, equivalent to a 24.8% reduction in bitrate at equal quality, while for videos, the improvement averages 0.83 dB, equivalent to a 13.7% reduction in bitrate.
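The gain-shape split described in this abstract can be illustrated with a minimal sketch. It covers only the decomposition of a vector into a length and a unit-norm direction, plus a uniform scalar quantizer for the gain. The shape codebook, the prediction step, and the codeword encoding from the paper are not reproduced, and the function names and the `step` value are assumptions made for illustration.

```python
import math


def gain_shape_split(coeffs):
    """Split a coefficient vector into its length (gain) and direction (shape)."""
    gain = math.sqrt(sum(c * c for c in coeffs))
    if gain == 0.0:
        # A zero vector has no direction; return an all-zero shape.
        return 0.0, [0.0] * len(coeffs)
    return gain, [c / gain for c in coeffs]


def quantize_gain(gain, step=0.5):
    """Uniform scalar quantizer for the gain (step size is an assumption)."""
    return round(gain / step) * step


def reconstruct(gain, shape):
    """Rebuild the vector by scaling the unit-norm shape by the gain."""
    return [gain * s for s in shape]


coeffs = [3.0, -4.0, 0.0, 12.0]           # hypothetical AC coefficients
gain, shape = gain_shape_split(coeffs)    # gain is the Euclidean norm, 13.0
q_gain = quantize_gain(gain)
decoded = reconstruct(q_gain, shape)
decoded_energy = sum(v * v for v in decoded)
```

Because the shape has unit norm, the energy of `decoded` equals `q_gain ** 2` no matter how the direction is represented. This is the sense in which coding the gain explicitly conserves energy, provided the quantized shape is also kept at unit norm.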


Global Communications Conference | 2008

Adaptive Rate Control for Aggregated VoIP Traffic

Fariza Sabrina; Jean-Marc Valin

This paper presents a novel mechanism for dynamically adapting the quality of congestion-controlled voice over IP (VoIP) applications on the Internet in real time. The system uses our proposed variable bit rate speech codec called Speex, which can dynamically adjust the encoding bit rate (and hence the speech quality) based on both the feedback information about the network congestion and the instantaneous speech properties. Our extensive NS2 simulation results prove that the proposed system indeed provides the highest quality speech while maximising the bandwidth utilisation and reducing the network congestion.


International Conference on Acoustics, Speech, and Signal Processing | 2007

A New Robust Frequency Domain Echo Canceller with Closed-Loop Learning Rate Adaptation

Jean-Marc Valin; Iain B. Collings

One of the main difficulties in echo cancellation is the fact that the learning rate needs to vary according to conditions such as double-talk and echo path change. Several methods have been proposed to vary the learning rate. In this paper we propose a new closed-loop method where the learning rate is proportional to a misalignment parameter, which is in turn estimated based on a gradient adaptive approach. The method is presented in the context of a multidelay block frequency domain (MDF) echo canceller. We demonstrate that the proposed algorithm outperforms current popular double-talk detection techniques by up to 6 dB.


Global Communications Conference | 2009

Priority Based Dynamic Rate Control for VoIP Traffic

Fariza Sabrina; Jean-Marc Valin

This paper presents a novel mechanism for dynamic rate control of prioritised Voice over IP (VoIP) traffic in real time. The system uses our proposed variable bit rate speech codec called Speex, which can dynamically adjust the encoding bit rate (and hence the voice quality) based on the feedback information about the network congestion, flow priority, and the instantaneous speech properties. Our extensive NS2 simulation results, along with results from the ITU-T standard speech quality evaluation tool (PESQ), show that the proposed system indeed provides the highest quality speech while maximising the bandwidth utilisation and reducing the network congestion.

Collaboration


Top Co-Authors

Christopher B. Montgomery

National Institute of Standards and Technology

Fariza Sabrina

University of New South Wales