Publication


Featured research published by Tommy Vaillancourt.


International Conference on Acoustics, Speech, and Signal Processing | 2007

ITU-T G.729.1: An 8-32 kbit/s Scalable Coder Interoperable with G.729 for Wideband Telephony and Voice over IP

Stéphane Ragot; Balazs Kovesi; Romain Trilling; David Virette; Nicolas Duc; Dominique Massaloux; Stéphane Proust; Bernd Geiser; Martin Gartner; Stefan Schandl; Hervé Taddei; Yang Gao; Eyal Shlomot; Hiroyuki Ehara; Koji Yoshida; Tommy Vaillancourt; Redwan Salami; Mi Suk Lee; Do Young Kim

This paper describes the scalable coder - G.729.1 - which has been recently standardized by ITU-T for wideband telephony and voice over IP (VoIP) applications. G.729.1 can operate at 12 different bit rates from 32 down to 8 kbit/s with wideband quality starting at 14 kbit/s. This coder is a bitstream interoperable extension of ITU-T G.729 based on three embedded stages: narrowband cascaded CELP coding at 8 and 12 kbit/s, time-domain bandwidth extension (TDBWE) at 14 kbit/s, and split-band MDCT coding with spherical vector quantization (VQ) and pre-echo reduction from 16 to 32 kbit/s. Side information - consisting of signal class, phase, and energy - is transmitted at 12, 14 and 16 kbit/s to improve the resilience and recovery of the decoder in case of frame erasures. The quality, delay, and complexity of G.729.1 are summarized based on ITU-T results.
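As a rough illustration of the embedded structure this abstract describes, the following sketch enumerates a 12-rate ladder and maps each rate to the stage named in the abstract. The 2 kbit/s step above 14 kbit/s and the function names `bit_rates_kbps`, `stage_for_rate`, and `highest_decodable_rate` are assumptions made for illustration; they are not taken from the paper.

```python
# Hypothetical sketch of an embedded bit-rate ladder, based only on the abstract.
# Assumption: rates above 14 kbit/s advance in 2 kbit/s steps, giving 12 rates in total.

def bit_rates_kbps():
    """Return the assumed ladder: 8, 12, then 14 to 32 in steps of 2 kbit/s."""
    return [8, 12] + list(range(14, 33, 2))

def stage_for_rate(rate_kbps):
    """Map a ladder rate to the embedded stage named in the abstract."""
    if rate_kbps not in bit_rates_kbps():
        raise ValueError(f"unsupported rate: {rate_kbps}")
    if rate_kbps <= 12:
        return "CELP"   # narrowband cascaded CELP at 8 and 12 kbit/s
    if rate_kbps == 14:
        return "TDBWE"  # time-domain bandwidth extension at 14 kbit/s
    return "MDCT"       # split-band MDCT coding from 16 to 32 kbit/s

def highest_decodable_rate(received_kbps):
    """Embedded decoding: pick the highest ladder rate not exceeding what arrived."""
    usable = [r for r in bit_rates_kbps() if r <= received_kbps]
    if not usable:
        raise ValueError("fewer bits received than the core layer needs")
    return max(usable)
```

Because the bitstream is embedded, a truncated frame is still decodable at the next lower rate in the ladder, which is what `highest_decodable_rate` models.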


International Conference on Acoustics, Speech, and Signal Processing | 2008

ITU-T G.EV-VBR baseline codec

Milan Jelinek; Tommy Vaillancourt; Ali Erdem Ertan; Jacek Stachurski; Anssi Rämö; Lasse Laaksonen; Jon Gibbs; Stefan Bruhn

We present the G.EV-VBR winning candidate codec recently selected by Question 9 of Study Group 16 (Q9/16) of ITU-T as a baseline for the development of a scalable solution for wideband speech and audio compression at rates between 8 kb/s and 32 kb/s. The Q9/16 codec is an embedded codec comprising 5 layers where higher layer bitstreams can be discarded without affecting the decoding of the lower layers. The two lower layers are based on the CELP technology where the core layer takes advantage of signal classification based encoding. The higher layers encode the weighted error signal from lower layers using overlap-add transform coding. The codec has been designed with the primary objective of high-performance wideband speech coding for error-prone telecommunications channels, without compromising the quality for narrowband/wideband speech or wideband music signals. The codec performance is demonstrated with selected test results.


International Conference on Acoustics, Speech, and Signal Processing | 2007

Efficient Frame Erasure Concealment in Predictive Speech Codecs using Glottal Pulse Resynchronisation

Tommy Vaillancourt; Milan Jelinek; Redwan Salami; Roch Lefebvre

Error propagation after a frame loss is an important factor in quality degradation for predictive speech coders. This is mainly due to the lack of synchronization in the adaptive codebook in the good frames following a frame erasure. This article presents a method for resynchronizing the glottal pulse after an erased frame. The method uses an extra frame delay at the decoder, and can be applied with or without additional side information. The approach has been implemented in the G.729.1 standard and results in improved decoder convergence after erased frames. Subjective tests have demonstrated that this improves perceived quality in the presence of frame erasures.


IEEE Communications Magazine | 2009

G.718: A new embedded speech and audio coding standard with high resilience to error-prone transmission channels

Milan Jelinek; Tommy Vaillancourt; Jon Gibbs

This article is an overview of the standardization, architecture, and performance of the new ITU-T Recommendation G.718. G.718 is an embedded variable bit rate codec providing a scalable solution for compression of 8 and 16 kHz sampled speech and audio signals at rates between 8 kb/s and 32 kb/s. It comprises five layers where higher-layer bitstreams can be discarded without affecting the lower layers' decoding. The codec also has an optional core layer interoperable with ITU-T G.722.2 (3GPP AMR-WB) at 12.65 kb/s. G.718 was designed to provide high speech quality at low bit rates and to be robust to significant rates of frame erasures or packet losses. It is also targeting good quality for generic audio at higher rates.


International Conference on Acoustics, Speech, and Signal Processing | 2008

Quality evaluation of the G.EV-VBR speech codec

Anssi Rämö; Henri Toukomaa; S. Craig Greer; Lasse Laaksonen; Jacek Stachurski; A. Erdem Ertan; Jonas Svedberg; Jon Gibbs; Tommy Vaillancourt

ITU-T has selected the candidate submitted by Ericsson, Nokia, Motorola, VoiceAge, and Texas Instruments as the baseline for the G.EV-VBR coding standard. G.EV-VBR is an embedded scalable speech codec that uses state-of-the-art technology to provide the most efficient encoded speech available for various real-time applications. EV-VBR encodes both narrowband (NB) and wideband (WB) speech signals starting at 8 kbps. Near perfect wideband representation is achieved at 32 kbps for all signal types. The bit stream is divided into five robust layers, providing sufficient granularity, in particular for VoIP applications. In addition, an extension to the codec will provide super-wideband and stereo capability by adding layers to the codec. Extensive listening tests were conducted during the ITU-T selection phase to support selection of the best-performing candidate. The selected EV-VBR candidate passed 69 of 70 required and 25 of 28 objective terms of reference.


International Conference on Acoustics, Speech, and Signal Processing | 2015

Packet-loss concealment technology advances in EVS

Jérémie Lecomte; Tommy Vaillancourt; Stefan Bruhn; Ho-Sang Sung; Ke Peng; Kei Kikuiri; Bin Wang; Shaminda Subasingha; Julien Faure

EVS, the newly standardized 3GPP codec for Enhanced Voice Services, was developed for mobile services such as VoLTE, where error resilience is essential. The paper outlines the advances in packet loss concealment made during EVS development by presenting a high-level description of the technical features present in the final standardized codec. Coupled with jitter buffer management, the EVS codec provides robustness against late or lost packets. The advantages of the new EVS codec over reference codecs are further discussed based on listening test results.


International Conference on Acoustics, Speech, and Signal Processing | 2015

Advances in low bitrate time-frequency coding

Tommy Vaillancourt; Vladimir Malenovsky; Redwan Salami; Zexin Liu; Lei Miao; Jon Gibbs; Milan Jelinek

In this paper a novel technique is presented to efficiently mix traditional ACELP time domain coding with a frequency domain coding model to improve the quality of generic audio signals coded at low bitrates without additional delay. The paper discusses how to integrate parts of a traditional Algebraic Code Excited Linear Prediction (ACELP) speech codec to create a time-domain contribution which coexists with a frequency based coding model. A mechanism to determine the value of the time-domain contribution is proposed and a method is described how the frequency-domain contribution might be added without increasing the overall delay of the codec. The proposed method forms part of the recently standardised 3GPP EVS codec.


International Conference on Acoustics, Speech, and Signal Processing | 2015

Two-stage speech/music classifier with decision smoothing and sharpening in the EVS codec

Vladimir Malenovsky; Tommy Vaillancourt; Wang Zhe; Ki-hyun Choo; Venkatraman S. Atti

In most internationally recognized standardized multi-mode codecs, signal classification is performed in a single step by either linear discrimination or SNR-based metrics. The speech/music classifier of the EVS codec achieves greater discrimination than these single-step models by combining Gaussian mixture modelling (GMM) with a series of context-based improvement layers. Additionally, unlike traditional GMM classifiers the EVS model adopts a short hangover period, allowing it to track transitions between music and speech. Misclassifications are mitigated by applying a novel decision smoothing and sharpening technique. The results in relatively static environments demonstrate that the new two-stage approach with selective hangover leads to classification accuracies comparable to speech/music classifiers with longer hangovers. They also show that the new approach leads to faster and more accurate switching of coding modes than conventional classifiers for more complex audio environments such as advertisements, jingles and speech superimposed on music.
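To make the idea of a hangover period concrete, here is a minimal sketch of generic hangover-based decision smoothing for a two-class frame classifier. It is not the EVS smoothing and sharpening technique, which the abstract does not detail; the function name `smooth_with_hangover` and the default `hangover` value are assumptions chosen for illustration.

```python
# Hypothetical sketch: generic hangover smoothing of per-frame class decisions.
# Assumption: the output class changes only after the raw decision has disagreed
# with it for more than `hangover` consecutive frames.

def smooth_with_hangover(raw_decisions, hangover=2):
    """Return smoothed decisions; isolated flips shorter than the hangover are ignored."""
    if not raw_decisions:
        return []
    current = raw_decisions[0]   # class currently reported
    pending = 0                  # consecutive frames disagreeing with `current`
    smoothed = []
    for decision in raw_decisions:
        if decision == current:
            pending = 0
        else:
            pending += 1
            if pending > hangover:
                current = decision
                pending = 0
        smoothed.append(current)
    return smoothed
```

A shorter hangover lets the output follow speech/music transitions sooner at the cost of passing through more isolated misclassifications, which is the trade-off the abstract refers to.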


International Conference on Acoustics, Speech, and Signal Processing | 2009

Inter-tone noise reduction in a low bit rate CELP decoder

Tommy Vaillancourt; Milan Jelinek; Redwan Salami; Vladimir Malenovsky; Roch Lefebvre

In this paper we present a novel technique to enhance music signals encoded using a low bit rate CELP coder. The method is based on reduction of inter-tone quantization noise for decoded music signals without affecting the quality for speech signals. The proposed technique consists of two modules. The first module is used to discriminate between stable tonal sounds and other sounds and the second module is used to reduce the inter-tone quantization noise in the stable tonal segments. The inter-tone noise is reduced by means of spectral subtraction. The proposed method is a part of the newly standardised ITU-T G.718 codec.
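The two-module structure described in this abstract can be sketched roughly as follows, assuming a per-bin magnitude spectrum and a per-bin noise estimate are already available. This is textbook spectral subtraction with a spectral floor, not the G.718 algorithm; the names `spectral_subtract` and `enhance_frame`, the `floor` parameter, and the boolean tonal flag standing in for the first module are all assumptions.

```python
# Hypothetical sketch: generic spectral subtraction gated by a tonal-stability flag.
# Assumption: `is_stable_tonal` is produced by some separate classifier (module one).

def spectral_subtract(magnitudes, noise_estimate, floor=0.1):
    """Subtract the noise estimate per bin, never going below floor * original magnitude."""
    if len(magnitudes) != len(noise_estimate):
        raise ValueError("spectrum and noise estimate must have the same length")
    return [max(m - n, floor * m) for m, n in zip(magnitudes, noise_estimate)]

def enhance_frame(magnitudes, noise_estimate, is_stable_tonal, floor=0.1):
    """Apply noise reduction only to stable tonal frames; leave other frames unchanged."""
    if not is_stable_tonal:
        return list(magnitudes)
    return spectral_subtract(magnitudes, noise_estimate, floor)
```

Gating on the tonal flag reflects the abstract's stated goal of enhancing music without affecting the quality of speech signals.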


European Signal Processing Conference | 2017

Flexible and scalable transform-domain codebook for high bit rate CELP coders

Vaclav Eksler; Bruno Bessette; Milan Jelinek; Tommy Vaillancourt

The Code-Excited Linear Prediction (CELP) model is very efficient in coding speech at low bit rates. However, if the bit rate of the coder is increased, the CELP model does not gain in quality as quickly as other approaches. Moreover, the computational complexity of the CELP model generally increases significantly at higher bit rates. In this paper we focus on a technique that aims to overcome these limitations by means of a special transform-domain codebook within the CELP model. We show by the example of the AMR-WB codec that the CELP model with the new flexible and scalable codebook improves the quality at high bit rates at no additional complexity cost.

Collaboration


Explore Tommy Vaillancourt's collaborations.

Top Co-Authors

Redwan Salami

Université de Sherbrooke

Roch Lefebvre

Université de Sherbrooke