Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Roland Badeau is active.

Publication


Featured research published by Roland Badeau.


International Conference on Acoustics, Speech, and Signal Processing | 2016

Complex NMF under phase constraints based on signal modeling: Application to audio source separation

Paul Magron; Roland Badeau; Bertrand David

Nonnegative Matrix Factorization (NMF) is a powerful tool for decomposing mixtures of audio signals in the Time-Frequency (TF) domain. In the source separation framework, phase recovery for each extracted component is necessary for synthesizing time-domain signals. The Complex NMF (CNMF) model aims to jointly estimate the spectrogram and the phase of the sources, but requires constraining the phase in order to produce satisfactory-sounding results. We propose to incorporate phase constraints based on signal models within the CNMF framework: a phase unwrapping constraint that enforces a form of temporal coherence, and a constraint based on the repetition of audio events, which models the phases of the sources within onset frames. We also provide an algorithm for estimating the model parameters. The experimental results highlight the benefit of including such constraints in the CNMF framework for separating overlapping components in complex audio mixtures.
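
The sketch below is not the authors' CNMF algorithm; it is a minimal illustration, under toy assumptions (random "spectrogram", arbitrary rank, hop size and sampling rate), of the two ingredients mentioned above: a plain NMF magnitude decomposition and a phase-unwrapping step that propagates phases linearly across frames to enforce a simple form of temporal coherence.

```python
# Minimal sketch (not the authors' CNMF algorithm): plain NMF on a magnitude
# spectrogram, followed by a phase-unwrapping step that makes the phase of
# each bin advance linearly with time.
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-12):
    """Multiplicative-update NMF minimizing the Euclidean distance."""
    F, T = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def unwrap_phases(mag, freqs, hop, sr):
    """Propagate phases frame by frame: phi[t] = phi[t-1] + 2*pi*f*hop/sr."""
    F, T = mag.shape
    phi = np.zeros((F, T))
    for t in range(1, T):
        phi[:, t] = phi[:, t - 1] + 2 * np.pi * freqs * hop / sr
    return mag * np.exp(1j * phi)

# Toy usage: factorize a random nonnegative "spectrogram" and rebuild phases.
sr, hop, n_fft = 16000, 256, 1024
freqs = np.arange(n_fft // 2 + 1) * sr / n_fft
V = np.random.default_rng(1).random((n_fft // 2 + 1, 50))
W, H = nmf(V, rank=4)
X_hat = unwrap_phases(W @ H, freqs, hop, sr)   # complex-valued TF estimate
```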


IEEE Transactions on Audio, Speech, and Language Processing | 2016

Projection-based demixing of spatial audio

Derry Fitzgerald; Antoine Liutkus; Roland Badeau

We propose a method to unmix multichannel audio signals into their different constitutive spatial objects. To achieve this, we characterize an audio object through both a spatial and a spectro-temporal model. The particularity of the spatial model we pick is that it neither assumes an object has only one underlying source point, nor does it attempt to model the complex room acoustics. Instead, it focuses on a listener perspective, and takes each object as the superposition of many contributions with different incoming directions and interchannel delays. Our spectro-temporal probabilistic model is based on the recently proposed α-harmonisable processes, which are adequate for signals with large dynamics, such as audio. Then, the main originality of this paper is to provide a new way to estimate and exploit the interchannel dependences of an object for the purpose of demixing. In the Gaussian α = 2 case, previous research focused on covariance structures. This approach is no longer valid for α < 2, where covariances are not defined. Instead, we show how simple linear combinations of the mixture channels can be used to learn the model parameters, and the method we propose consists of pooling the estimates based on many projections to correctly account for the original multichannel audio. Intuitively, each such downmix of the mixture provides a new perspective where some objects are canceled or enhanced. Finally, we also explain how to recover the different spatial audio objects once all parameters have been computed. Performance of the method is illustrated on the separation of stereophonic music signals.
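
As a minimal illustration (not the paper's estimator), the following sketch shows why linear combinations of the mixture channels are informative: projecting a toy stereo mixture onto the direction orthogonal to one source's panning vector cancels that source while keeping the other. The panning angles and signals are made-up assumptions.

```python
# Toy illustration (not the paper's method): cancel one panned source in a
# stereo mixture by projecting onto the direction orthogonal to its panning
# vector; the single-channel "downmix" then contains only the other source.
import numpy as np

rng = np.random.default_rng(0)
s1, s2 = rng.standard_normal(1000), rng.standard_normal(1000)

# Instantaneous stereo panning: each source has a direction in channel space.
a1 = np.array([np.cos(0.2), np.sin(0.2)])   # panning vector of source 1
a2 = np.array([np.cos(1.1), np.sin(1.1)])   # panning vector of source 2
mix = np.outer(a1, s1) + np.outer(a2, s2)   # 2 x N stereo mixture

# Projection orthogonal to a2: source 2 is cancelled, source 1 is kept.
p = np.array([-a2[1], a2[0]])
downmix = p @ mix
print(np.allclose(downmix, (p @ a1) * s1))  # True: only a scaled source 1 remains
```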


IEEE Transactions on Audio, Speech, and Language Processing | 2016

Multichannel Audio Source Separation With Probabilistic Reverberation Priors

Simon Leglaive; Roland Badeau; Gaël Richard

Incorporating prior knowledge about the sources and/or the mixture is a way to improve under-determined audio source separation performance. A great number of informed source separation techniques concentrate on taking priors on the sources into account, but fewer works have focused on constraining the mixing model. In this paper, we address the problem of under-determined multichannel audio source separation in reverberant conditions. We target a semi-informed scenario where some room parameters are known. Two probabilistic priors on the frequency response of the mixing filters are proposed. Early reverberation is characterized by an autoregressive model, while, in accordance with results from statistical room acoustics, late reverberation is represented by an autoregressive moving average model. Both reverberation models are defined in the frequency domain. They aim to translate the temporal characteristics of the mixing filters into frequency-domain correlations. Our approach leads to a maximum a posteriori estimation of the mixing filters, which is carried out with the expectation-maximization algorithm. We experimentally show the superiority of this approach compared with a maximum likelihood estimation of the mixing filters.
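
The following toy sketch is not the paper's model; it only illustrates the underlying idea that the temporal decay of a reverberant mixing filter induces correlations between neighbouring bins of its frequency response. The reverberation time, sampling rate and synthetic filter tails are illustrative assumptions.

```python
# Toy illustration (not the paper's reverberation priors): synthetic late
# reverberation tails with an exponential T60 decay have frequency responses
# whose neighbouring bins are correlated, which is the kind of structure the
# frequency-domain priors described above capture.
import numpy as np

rng = np.random.default_rng(0)
sr, n, t60 = 16000, 4096, 0.4                                  # assumed values
decay = np.exp(-3 * np.log(10) * np.arange(n) / (t60 * sr))    # -60 dB at t60
tails = decay * rng.standard_normal((500, n))                  # 500 synthetic filter tails

H = np.fft.rfft(tails, axis=1)                                 # frequency responses
# Normalized correlation between one bin and its neighbour, averaged over
# realizations: clearly non-zero because the filters decay quickly in time.
rho = np.mean(H[:, 100] * np.conj(H[:, 101])) / np.mean(np.abs(H[:, 100]) ** 2)
print(abs(rho))
```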


International Conference on Acoustics, Speech, and Signal Processing | 2017

Phase-dependent anisotropic Gaussian model for audio source separation

Paul Magron; Roland Badeau; Bertrand David

Phase reconstruction of complex components in the time-frequency domain is a challenging but necessary task for audio source separation. While traditional approaches do not exploit phase constraints that originate from signal modeling, some prior information about the phase can be obtained from sinusoidal modeling. In this paper, we introduce a probabilistic mixture model which allows us to incorporate such phase priors within a source separation framework. While the magnitudes are estimated beforehand, the phases are modeled by von Mises random variables whose location parameters are the phase priors. We then approximate this intractable model by an anisotropic Gaussian model, in which the phase dependencies are preserved. This enables us to derive an MMSE estimator of the sources which optimally combines Wiener filtering and prior phase estimates. Experimental results highlight the potential of incorporating phase priors into mixture models for separating overlapping components in complex audio mixtures.
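
The sketch below is not the MMSE estimator derived in the paper; it is only a toy illustration, for a single TF bin, of the two ingredients that the anisotropic Gaussian model combines optimally: a Wiener magnitude estimate and a prior phase (such as one obtained from sinusoidal modeling). The power estimates, prior phases and blending weight are arbitrary assumptions.

```python
# Toy illustration for a single TF bin (not the paper's MMSE estimator):
# classical Wiener filtering gives both sources the mixture phase; a naive
# phase-aware variant keeps the Wiener magnitudes but pulls each phase
# towards its prior. The paper instead derives the optimal combination from
# an anisotropic Gaussian model.
import numpy as np

x = 1.0 + 0.5j                       # observed mixture at one TF bin
v1, v2 = 2.0, 0.5                    # assumed source power estimates
phi1_prior, phi2_prior = 0.3, -1.2   # assumed phase priors (radians)
lam = 0.7                            # arbitrary blending weight

# Classical Wiener estimates (mixture phase inherited by both sources).
s1_w = v1 / (v1 + v2) * x
s2_w = v2 / (v1 + v2) * x

def blend_phase(s_wiener, phi_prior):
    """Keep the Wiener magnitude, interpolate the phase towards the prior."""
    z = lam * np.exp(1j * phi_prior) + (1 - lam) * np.exp(1j * np.angle(s_wiener))
    return np.abs(s_wiener) * np.exp(1j * np.angle(z))

print(blend_phase(s1_w, phi1_prior), blend_phase(s2_w, phi2_prior))
```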


International Conference on Acoustics, Speech, and Signal Processing | 2017

Alpha-stable multichannel audio source separation

Simon Leglaive; Umut Simsekli; Antoine Liutkus; Roland Badeau; Gaël Richard

In this paper, we focus on modeling multichannel audio signals in the short-time Fourier transform domain for the purpose of source separation. We propose a probabilistic model based on a class of heavy-tailed distributions, in which the observed mixtures and the latent sources are jointly modeled using multivariate alpha-stable distributions. As opposed to conventional Gaussian models, where the observations are constrained to lie within a few standard deviations of the mean, the proposed heavy-tailed model allows us to account for spurious data or important uncertainties in the model. We develop a Monte Carlo Expectation-Maximization algorithm for inferring the sources from the proposed model. We show that our approach leads to significant performance improvements in audio source separation under corrupted mixtures and in spatial audio object coding.
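
As a minimal illustration of the modeling assumption only (not the Monte Carlo EM separation algorithm), the sketch below compares the tails of Gaussian and symmetric alpha-stable samples; the heavier tails are what make the model robust to spurious data. The choice alpha = 1.5 and the sample size are arbitrary assumptions.

```python
# Toy illustration of the modelling assumption (not the separation algorithm):
# symmetric alpha-stable samples have far heavier tails than Gaussian ones.
import numpy as np
from scipy.stats import levy_stable

n = 20_000
gauss = np.random.default_rng(0).standard_normal(n)
stable = levy_stable.rvs(alpha=1.5, beta=0.0, size=n, random_state=0)

# Fraction of samples with magnitude larger than 5:
print((np.abs(gauss) > 5).mean())    # essentially 0 for the Gaussian
print((np.abs(stable) > 5).mean())   # noticeably larger for alpha = 1.5
```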


International Conference on Acoustics, Speech, and Signal Processing | 2016

PROJET — Spatial audio separation using projections

Derry Fitzgerald; Antoine Liutkus; Roland Badeau

We propose a projection-based method for the unmixing of multichannel audio signals into their different constituent spatial objects. Here, spatial objects are modelled using a unified framework which handles both point sources and diffuse sources. We then propose a novel methodology to estimate and take advantage of the spatial dependencies of an object. Where previous research has processed the original multichannel mixtures directly and has been principally focused on the use of inter-channel covariance structures, here we instead process projections of the multichannel signal on many different spatial directions. These linear combinations consist of observations where some spatial objects are cancelled or enhanced. We then propose an algorithm which takes these projections as the observations, discarding dependencies between them. Since each one contains global information regarding all channels of the original multichannel mixture, this provides an effective means of learning the parameters of the original audio, while avoiding the need for joint processing of all the channels. We further show how to recover the separated spatial objects and demonstrate the use of the technique on stereophonic music signals.


International Conference on Acoustics, Speech, and Signal Processing | 2017

Multichannel audio source separation: Variational inference of time-frequency sources from time-domain observations

Simon Leglaive; Roland Badeau; Gaël Richard

A great number of methods for multichannel audio source separation are based on probabilistic approaches in which the sources are modeled as latent random variables in a Time-Frequency (TF) domain. For reverberant mixtures, it is common to approximate the time-domain convolutive mixing process as being instantaneous in the short-term Fourier transform domain, under the assumption of short mixing filters. The TF latent sources are then inferred from the TF mixture observations. In this paper we propose to infer the TF latent sources from the time-domain observations. This approach allows us to exactly model the convolutive mixing process. The inference procedure relies on a variational expectation-maximization algorithm. Under significant reverberation conditions, our approach leads to a signal-to-distortion ratio improvement of 5.5 dB compared with the usual TF approximation of the convolutive mixing process.
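
The sketch below is not the paper's variational EM algorithm; it only quantifies the narrowband approximation that the paper avoids, namely replacing time-domain convolution by a per-bin multiplication in the STFT domain. The filter lengths, FFT size and test signal are illustrative assumptions.

```python
# Toy illustration (not the paper's variational EM): relative error of the
# narrowband approximation, i.e. replacing time-domain convolution by a
# per-bin multiplication in the STFT domain, for a short and a long filter.
import numpy as np
from scipy.signal import stft, istft, fftconvolve

rng = np.random.default_rng(0)
sr, n_fft = 16000, 1024
s = rng.standard_normal(4 * sr)

def narrowband_error(filter_len):
    # Synthetic decaying filter of the requested length.
    h = rng.standard_normal(filter_len) * np.exp(-3.0 * np.arange(filter_len) / filter_len)
    x_true = fftconvolve(s, h)[: len(s)]          # exact convolutive mixing
    _, _, S = stft(s, fs=sr, nperseg=n_fft)
    H = np.fft.rfft(h, n_fft)                     # filter response per STFT bin
    _, x_approx = istft(H[:, None] * S, fs=sr, nperseg=n_fft)
    x_approx = x_approx[: len(s)]
    return np.linalg.norm(x_true - x_approx) / np.linalg.norm(x_true)

print(narrowband_error(64))     # short filter: modest error
print(narrowband_error(4096))   # long reverberant filter: much larger error
```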


International Workshop on Acoustic Signal Enhancement | 2016

Anechoic phase estimation from reverberant signals

Arthur Belhomme; Yves Grenier; Roland Badeau; Eric Humbert

Most dereverberation methods aim to reconstruct the anechoic magnitude spectrogram, given a reverberant signal. Regardless of the method, the dereverberated signal is systematically synthesized with the reverberant phase. This corrupted phase reintroduces reverberation and distortion into the signal. This is why we also intend to reconstruct the anechoic phase, given a reverberant signal. As a first step before addressing speech signals, we propose in this paper a method for estimating the anechoic phase of reverberant chirp signals. Our method provides an accurate estimate of the instantaneous phase and improves objective measures of dereverberation.
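
As a minimal illustration (not the paper's dereverberation method), the sketch below shows what the instantaneous phase of a chirp is, by comparing the analytic phase of a clean linear chirp with the phase recovered through the Hilbert transform. The chirp parameters and duration are arbitrary assumptions.

```python
# Toy illustration (not the dereverberation method): instantaneous phase of a
# clean linear chirp, recovered via the Hilbert transform and compared with
# the analytic expression phi(t) = 2*pi*(f0*t + 0.5*k*t^2).
import numpy as np
from scipy.signal import hilbert

sr = 16000
t = np.arange(0, 1.0, 1 / sr)
f0, f1 = 200.0, 2000.0                       # assumed start / end frequencies (Hz)
k = (f1 - f0) / t[-1]                        # chirp rate
phase_true = 2 * np.pi * (f0 * t + 0.5 * k * t ** 2)
x = np.cos(phase_true)                       # anechoic chirp

phase_est = np.unwrap(np.angle(hilbert(x)))
err = np.abs(phase_est - phase_true)[sr // 10 : -sr // 10]   # ignore edge effects
print(err.max())                             # small away from the edges
```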


International Conference on Acoustics, Speech, and Signal Processing | 2016

Stochastic thermodynamic integration: Efficient Bayesian model selection via stochastic gradient MCMC

Umut Simsekli; Roland Badeau; Gaël Richard; Ali Taylan Cemgil

Model selection is a central topic in Bayesian machine learning, and it requires the estimation of the marginal likelihood of the data under the models to be compared. During the last decade, conventional model selection methods have become less attractive because of their high computational requirements. In this study, we propose a computationally efficient model selection method by integrating ideas from the Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) literature and statistical physics. As opposed to conventional methods, the proposed method has very low computational needs and can be implemented almost without modifying existing SG-MCMC code. We provide an upper bound for the bias of the proposed method. Our experiments show that our method is 40 times faster than the baseline method at finding the optimal model order in a matrix factorization problem.
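
The sketch below is plain thermodynamic integration on a toy conjugate Gaussian model, not the stochastic-gradient MCMC variant proposed in the paper; it only illustrates the identity the method builds on, log p(x) = ∫_0^1 E_{p_β(θ|x)}[log p(x|θ)] dβ, where p_β tempers the likelihood with the power β. All parameter values are arbitrary assumptions.

```python
# Toy illustration of plain thermodynamic integration on a conjugate Gaussian
# model (not the stochastic-gradient MCMC method of the paper): the tempered
# posteriors are Gaussian, so they can be sampled exactly, and the estimate
# can be checked against the closed-form marginal likelihood.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
n, sigma2, tau2 = 20, 1.0, 4.0
x = rng.normal(1.0, np.sqrt(sigma2), size=n)    # data from N(theta, sigma2)

def loglik(theta):
    return -0.5 * n * np.log(2 * np.pi * sigma2) - 0.5 * np.sum((x - theta) ** 2) / sigma2

betas = np.linspace(0.0, 1.0, 50)               # inverse-temperature grid
expected_ll = []
for beta in betas:
    prec = 1.0 / tau2 + beta * n / sigma2       # tempered posterior precision
    mean = beta * np.sum(x) / sigma2 / prec     # tempered posterior mean
    thetas = rng.normal(mean, 1.0 / np.sqrt(prec), size=2000)
    expected_ll.append(np.mean([loglik(th) for th in thetas]))
expected_ll = np.array(expected_ll)
log_evidence_ti = np.sum(0.5 * (expected_ll[1:] + expected_ll[:-1]) * np.diff(betas))

# Exact marginal likelihood of this conjugate model, for comparison.
cov = sigma2 * np.eye(n) + tau2 * np.ones((n, n))
log_evidence_exact = multivariate_normal(mean=np.zeros(n), cov=cov).logpdf(x)
print(log_evidence_ti, log_evidence_exact)      # should agree closely
```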


International Conference on Latent Variable Analysis and Signal Separation | 2017

Sketching for Nearfield Acoustic Imaging of Heavy-Tailed Sources

Mathieu Fontaine; Charles Vanwynsberghe; Antoine Liutkus; Roland Badeau

We propose a probabilistic model for acoustic source localization with a known but arbitrary geometry of the microphone array. The approach has several features. First, it relies on a simple nearfield acoustic model for wave propagation. Second, it does not require the number of active sources to be known; on the contrary, it produces a heat map representing the energy of a large set of candidate locations, thus imaging the acoustic field. Third, it relies on a heavy-tailed α-stable probabilistic model, whose most important feature is to yield an estimation strategy where the multichannel signals need to be processed only once, in a simple online procedure called sketching. This sketching produces a fixed-size representation of the data that is then analyzed for localization. The resulting algorithm has low computational complexity, and in this paper we demonstrate that it compares favorably with the state of the art for localization in realistic simulations of reverberant environments.
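
The sketch below is not the alpha-stable sketching estimator of the paper; it is a plain nearfield delay-and-sum energy map, shown only to illustrate what imaging the acoustic field over a grid of candidate locations means for a known array geometry. The array geometry, source position and signal are made-up assumptions, and distance attenuation is ignored.

```python
# Toy illustration (not the paper's sketching estimator): nearfield
# delay-and-sum energy map over a grid of candidate locations, for a known
# 4-microphone array; the peak of the map indicates the source position.
# Distance attenuation is ignored for simplicity.
import numpy as np

c, sr, n = 343.0, 16000, 4096
mics = np.array([[0.0, 0.0], [0.3, 0.0], [0.0, 0.3], [0.3, 0.3]])  # mic positions (m)
src = np.array([1.2, 0.8])                                          # true source (m)

rng = np.random.default_rng(0)
freqs = np.fft.rfftfreq(n, 1 / sr)
S = np.fft.rfft(rng.standard_normal(n))

def delays(point):
    return np.linalg.norm(mics - point, axis=1) / c     # propagation delays (s)

# Observed spectra: fractional delays applied as phase shifts in frequency.
X = S * np.exp(-2j * np.pi * freqs * delays(src)[:, None])

# Energy heat map over candidate locations (delay-compensate, sum, measure energy).
grid = [np.array([gx, gy]) for gx in np.linspace(0, 2, 41) for gy in np.linspace(0, 2, 41)]
energy = [np.sum(np.abs(np.sum(X * np.exp(2j * np.pi * freqs * delays(p)[:, None]), axis=0)) ** 2)
          for p in grid]
print(grid[int(np.argmax(energy))])   # close to the true source position [1.2, 0.8]
```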

Collaboration


Dive into Roland Badeau's collaborations.

Top Co-Authors

Gaël Richard (Université Paris-Saclay)
Simon Leglaive (Université Paris-Saclay)
Paul Magron (Institut Mines-Télécom)
Yousra Bekhti (Université Paris-Saclay)