David Virette
Huawei
Publications
Featured research published by David Virette.
International Conference on Latent Variable Analysis and Signal Separation | 2012
Cyril Joder; Felix Weninger; Florian Eyben; David Virette; Björn W. Schuller
In this paper, we present an on-line semi-supervised algorithm for real-time separation of speech and background noise. The proposed system is based on Nonnegative Matrix Factorization (NMF), where fixed speech bases are learned from training data, whereas the noise components are estimated in real time from the recent past. Experiments with spontaneous conversational speech and real-life non-stationary noise show that this system performs as well as a supervised NMF algorithm exploiting noise components learned from the same noise environment as the test sample. Furthermore, it outperforms a supervised system trained on different noise conditions.
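The core idea of fixing pre-trained speech bases while adapting only the noise model can be sketched as follows. This is an illustrative reimplementation under standard KL-divergence multiplicative updates, not the paper's exact algorithm; the function name and all dimensions are assumptions.

```python
import numpy as np

def semi_supervised_nmf(V, W_speech, n_noise=8, n_iter=100, eps=1e-12):
    """Separate speech from a magnitude spectrogram V (bins x frames).

    The speech bases W_speech are fixed (pre-trained); only the noise
    bases and all activations are updated with standard KL-divergence
    multiplicative update rules, so the noise model adapts to the data.
    """
    rng = np.random.default_rng(0)
    n_bins, n_frames = V.shape
    n_speech = W_speech.shape[1]
    W_noise = rng.random((n_bins, n_noise)) + eps
    H = rng.random((n_speech + n_noise, n_frames)) + eps
    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])
        # activation update: H <- H * (W^T (V / WH)) / (W^T 1)
        ratio = V / (W @ H + eps)
        H *= (W.T @ ratio) / (W.sum(axis=0)[:, None] + eps)
        # noise-basis update (speech bases stay fixed)
        ratio = V / (np.hstack([W_speech, W_noise]) @ H + eps)
        H_n = H[n_speech:]
        W_noise *= (ratio @ H_n.T) / (H_n.sum(axis=1)[None, :] + eps)
    V_hat = np.hstack([W_speech, W_noise]) @ H + eps
    # Wiener-style mask keeps the speech share of each time-frequency bin
    mask = (W_speech @ H[:n_speech]) / V_hat
    return mask * V
```

In an on-line setting, the same updates would run on a sliding window of recent frames rather than the whole spectrogram.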
International Conference on Acoustics, Speech, and Signal Processing | 2013
Wenyu Jin; W. Bastiaan Kleijn; David Virette
We introduce a method for 2-D spatial multizone soundfield reproduction based on describing the desired multizone soundfield as an orthogonal expansion of basis functions over the desired reproduction region. This approach finds the solution to the Helmholtz equation that is closest to the desired soundfield in a weighted least squares sense. The orthogonal basis set is formed using QR factorization, taking as input a suitable set of solutions of the Helmholtz equation. The coefficients of the Helmholtz solution wavefields can then be calculated, reducing the multizone sound reproduction problem to the reconstruction of a set of basis wavefields over the desired region. The method also lends itself to more practical loudspeaker configurations. The approach is shown to be effective both for accurately reproducing sound in the selected bright zone and for minimizing sound leakage into the predefined quiet zone.
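The QR-based basis construction can be illustrated as follows: sample plane waves (which are solutions of the 2-D Helmholtz equation) at points of the reproduction region, orthogonalize the sampled columns with QR, and project a desired soundfield onto the resulting orthonormal basis. Frequency, geometry, and the plane-wave dictionary are illustrative choices, not the paper's setup.

```python
import numpy as np

c, f = 343.0, 1000.0
k = 2 * np.pi * f / c                        # wavenumber at 1 kHz
rng = np.random.default_rng(1)
pts = rng.uniform(-0.5, 0.5, size=(400, 2))  # sample points in the region (m)
angles = np.linspace(0.0, 2 * np.pi, 64, endpoint=False)
dirs = np.stack([np.cos(angles), np.sin(angles)])   # 2 x 64 propagation dirs
A = np.exp(1j * k * (pts @ dirs))            # each column: one plane wave
Q, R = np.linalg.qr(A)                       # orthonormal basis over samples
# desired field: a plane wave arriving from 30 degrees
d = np.exp(1j * k * (pts @ np.array([np.cos(np.pi / 6), np.sin(np.pi / 6)])))
coeffs = Q.conj().T @ d                      # least-squares coefficients
relative_err = np.linalg.norm(Q @ coeffs - d) / np.linalg.norm(d)
```

A weighted least squares fit, as in the abstract, would scale the rows of `A` and `d` by per-point weights (e.g. emphasizing the bright zone) before the QR step.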
International Conference on Acoustics, Speech, and Signal Processing | 2013
Cyril Joder; Felix Weninger; David Virette; Björn W. Schuller
In this work, we study the usefulness of several types of sparsity penalties for speech separation using supervised and semi-supervised Nonnegative Matrix Factorization (NMF). We compare different criteria from the literature to two novel penalty functions based on Wiener entropy, in a large-scale evaluation on spontaneous speech overlaid by realistic domestic noise, as well as music and stationary environmental noise corpora. The results show that enforcing the sparsity constraint in the separation phase does not improve perceptual quality. In the learning phase, however, it yields a better estimation of the basis spectra, especially in the case of supervised NMF, where the proposed criteria delivered the best results.
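As a generic illustration of how a sparsity penalty enters NMF, here is a KL-divergence NMF with an L1 penalty on the activations. The Wiener-entropy penalties proposed in the paper are not reproduced; this is the common textbook form, with invented function and parameter names.

```python
import numpy as np

def sparse_nmf(V, rank=8, lam=0.0, n_iter=150, eps=1e-12):
    """KL-divergence NMF with a generic L1 sparsity penalty on the
    activations H: the penalty adds `lam` to the denominator of the
    H update, shrinking small activations toward zero. Bases W are
    renormalized so sparsity cannot be undone by rescaling."""
    rng = np.random.default_rng(0)
    F, T = V.shape
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + eps))) / (W.sum(axis=0)[:, None] + lam + eps)
        W *= ((V / (W @ H + eps)) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        W /= W.sum(axis=0, keepdims=True) + eps   # unit-sum basis columns
    return W, H
```

Raising `lam` trades reconstruction accuracy for sparser activations, which matches the paper's observation that the penalty matters most when learning the bases rather than at separation time.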
International Conference on Acoustics, Speech, and Signal Processing | 2013
Cyril Joder; Felix Weninger; David Virette; Björn W. Schuller
We present a novel method to integrate noise estimates by unsupervised speech enhancement algorithms into a semi-supervised non-negative matrix factorization framework. A multiplicative update algorithm is derived to estimate a non-negative noise dictionary given a time-varying background noise estimate with a stationarity constraint. A large-scale, speaker-independent evaluation is carried out on spontaneous speech overlaid with the official CHiME 2011 Challenge corpus of realistic domestic noise, as well as music and stationary environmental noise corpora. In the results, the proposed method delivers a higher signal-to-distortion ratio and better objective perceptual measures than standard semi-supervised NMF or spectral subtraction based on the same noise estimation algorithm, and further gains can be expected from speaker adaptation.
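For reference, the spectral subtraction baseline that the abstract compares against can be sketched in a few lines: subtract a noise-magnitude estimate from the noisy magnitude spectrogram and floor the result to limit musical-noise artifacts. The function name and default parameters are illustrative.

```python
import numpy as np

def spectral_subtraction(noisy_mag, noise_mag, alpha=1.0, floor=0.05):
    """Classic spectral subtraction: remove a noise-magnitude estimate
    from the noisy magnitude spectrogram. `alpha > 1` over-subtracts;
    the spectral floor (a fraction of the input) prevents negative
    magnitudes and reduces musical noise."""
    return np.maximum(noisy_mag - alpha * noise_mag, floor * noisy_mag)
```

The paper's contribution, by contrast, feeds such a time-varying noise estimate into the NMF noise dictionary update rather than subtracting it directly.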
International Conference on Acoustics, Speech, and Signal Processing | 2013
Wenhai Wu; Lei Miao; Yue Lang; David Virette
This paper presents a novel low bit rate parametric stereo coding scheme which uses a whole-band inter-channel time difference (WITD) and a whole-band inter-channel phase difference (WIPD) together with a new, effective downmixing method. Inter-channel level differences and inter-channel phase differences are also employed in the proposed stereo coding to further improve quality in an embedded structure. This scheme is applied to the stereo extension of ITU-T G.722 at 56+8 kbit/s with a frame length of 5 ms. Listening test results are provided to assess the quality of the proposed downmixing method and the WITD/WIPD coding scheme separately.
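The inter-channel cues named above can be illustrated with a short sketch: per-band level and phase differences from one stereo frame's spectra, plus a single whole-band phase difference taken from the energy-weighted cross-spectrum. This is a plausible textbook computation, not the codec's actual analysis; WITD estimation (typically via cross-correlation) is omitted.

```python
import numpy as np

def stereo_cues(left, right, eps=1e-12):
    """Per-band inter-channel level difference (ILD, in dB) and phase
    difference (IPD, in radians) for one stereo frame, plus a whole-band
    phase difference (angle of the summed cross-spectrum)."""
    Lf, Rf = np.fft.rfft(left), np.fft.rfft(right)
    ild = 20.0 * np.log10((np.abs(Lf) + eps) / (np.abs(Rf) + eps))
    cross = Lf * np.conj(Rf)          # cross-spectrum, energy-weighted
    ipd = np.angle(cross)
    wipd = np.angle(np.sum(cross))    # whole-band phase difference
    return ild, ipd, wipd
```

For example, if the right channel is the left channel at half amplitude, the ILD at the signal's bin is about 6 dB and the whole-band phase difference is zero.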
International Conference on Acoustics, Speech, and Signal Processing | 2013
David Virette; Yue Lang; Lei Miao; Wenhai Wu; Balazs Kovesi; Claude Lamblin; Stéphane Ragot
This paper presents the two new ITU-T Recommendations G.722 Annex D and G.711.1 Annex F, which are stereo extensions of the wideband codecs ITU-T G.722 and G.711.1 and their superwideband extensions (G.722 Annex B and G.711.1 Annex D). An embedded scalable structure is used to add stereo extension layers on top of the wideband or superwideband core coding. Wideband stereo modes are supported at the bit rates of 64/80 and 96/128 kbit/s for G.722 and G.711.1 (respectively), while superwideband stereo modes are supported at 80/96/112/128 and 112/128/144/160 kbit/s. The parametric stereo coding model is based on a frequency domain downmix, wideband inter-channel differences estimation, quantization and synthesis, low complexity coherence analysis and synthesis, stereo transient detection and stereo post-processing. An overview of formal ITU-T characterization listening tests illustrates the performance of these codecs.
Archive | 2015
Yue Lang; David Virette
Archive | 2014
David Virette; Yang Gao; Wei Xiao
Archive | 2014
Deming Zhang; David Virette; Yue Lang
Archive | 2014
Anisse Taleb; Jianfeng Xu; David Virette