Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Cédric Févotte is active.

Publication


Featured research published by Cédric Févotte.


IEEE Transactions on Audio, Speech, and Language Processing | 2006

Performance measurement in blind audio source separation

Emmanuel Vincent; Rémi Gribonval; Cédric Févotte

In this paper, we discuss the evaluation of blind audio source separation (BASS) algorithms. Depending on the exact application, different distortions can be allowed between an estimated source and the desired true source. We consider four different sets of such allowed distortions, from time-invariant gains to time-varying filters. In each case, we decompose the estimated source into a true source part plus error terms corresponding to interferences, additive noise, and algorithmic artifacts. Then, we derive a global performance measure using an energy ratio, plus a separate performance measure for each error term. These measures are computed and discussed for the results of several BASS problems with various difficulty levels.
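
As a rough illustration of this decomposition idea (not the authors' own evaluation code), the Python sketch below handles only the simplest allowed-distortion set, a time-invariant gain, and omits the additive-noise term; the function name and all variable names are our own.

import numpy as np

def bss_eval_gain(estimate, sources, j):
    """Toy decomposition of an estimated source into target, interference
    and artifact parts for the time-invariant gain case (noise term omitted).
    `sources` has shape (n_sources, n_samples); `j` indexes the true source
    that `estimate` is supposed to recover."""
    s = sources[j]
    # Target: orthogonal projection of the estimate onto the true source.
    target = (estimate @ s) / (s @ s) * s
    # Interference: projection onto the span of all sources, minus the target.
    A = sources.T                                    # (n_samples, n_sources)
    coeffs, *_ = np.linalg.lstsq(A, estimate, rcond=None)
    proj_all = A @ coeffs
    interference = proj_all - target
    # Artifacts: whatever the algorithm added outside the span of the sources.
    artifacts = estimate - proj_all
    def ratio_db(num, den):
        return 10 * np.log10(np.sum(num**2) / np.sum(den**2))
    sdr = ratio_db(target, interference + artifacts)   # global measure
    sir = ratio_db(target, interference)               # interference rejection
    sar = ratio_db(target + interference, artifacts)   # artifact level
    return sdr, sir, sar

With a perfect estimate the interference and artifact terms vanish and all three ratios diverge; the more permissive distortion sets (time-invariant or time-varying filters) simply enlarge the subspace onto which the estimate is projected.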


Neural Computation | 2009

Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis

Cédric Févotte; Nancy Bertin; Jean-Louis Durrieu

This letter presents theoretical, algorithmic, and experimental results about nonnegative matrix factorization (NMF) with the Itakura-Saito (IS) divergence. We describe how IS-NMF is underlaid by a well-defined statistical model of superimposed Gaussian components and is equivalent to maximum likelihood estimation of variance parameters. This setting can accommodate regularization constraints on the factors through Bayesian priors. In particular, inverse-gamma and gamma Markov chain priors are considered in this work. Estimation can be carried out using a space-alternating generalized expectation-maximization (SAGE) algorithm; this leads to a novel type of NMF algorithm, whose convergence to a stationary point of the IS cost function is guaranteed. We also discuss the links between the IS divergence and other cost functions used in NMF, in particular the Euclidean distance and the generalized Kullback-Leibler (KL) divergence. As such, we describe how IS-NMF can also be performed using a gradient-based multiplicative algorithm (a standard algorithm structure in NMF) whose convergence is observed in practice, though not proven. Finally, we report a thorough experimental comparative study of Euclidean-NMF, KL-NMF, and IS-NMF algorithms applied to the power spectrogram of a short piano sequence recorded in real conditions, with various initializations and model orders. Then we show how IS-NMF can successfully be employed for denoising and upmix (mono-to-stereo conversion) of an original piece of early jazz music. These experiments indicate that IS-NMF correctly captures the semantics of audio and is better suited to the representation of music signals than NMF with the usual Euclidean and KL costs.
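
For readers who want to experiment with the factorization itself, here is a minimal NumPy sketch of the multiplicative updates for the IS cost (the gradient-based algorithm whose convergence is observed but not proven), not the SAGE algorithm; the function name, defaults, and the small constant eps are our own choices.

import numpy as np

def is_nmf(V, K, n_iter=200, eps=1e-12, rng=None):
    """Multiplicative-update IS-NMF sketch: a nonnegative matrix V (F x N),
    e.g. a power spectrogram, is approximated by W @ H under the
    Itakura-Saito divergence."""
    rng = np.random.default_rng(rng)
    F, N = V.shape
    W = rng.random((F, K)) + eps
    H = rng.random((K, N)) + eps
    for _ in range(n_iter):
        V_hat = W @ H + eps
        # Standard multiplicative updates for the IS cost (beta = 0 case).
        H *= (W.T @ (V * V_hat**-2)) / (W.T @ V_hat**-1)
        V_hat = W @ H + eps
        W *= ((V * V_hat**-2) @ H.T) / (V_hat**-1 @ H.T)
    return W, H

A typical audio use is V = np.abs(stft)**2 with K set to the expected number of elementary spectral patterns (notes, drum hits, and so on).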


IEEE Transactions on Audio, Speech, and Language Processing | 2010

Multichannel Nonnegative Matrix Factorization in Convolutive Mixtures for Audio Source Separation

Alexey Ozerov; Cédric Févotte

We consider inference in a general data-driven object-based model of multichannel audio data, assumed generated as a possibly underdetermined convolutive mixture of source signals. We work in the short-time Fourier transform (STFT) domain, where convolution is routinely approximated as linear instantaneous mixing in each frequency band. Each source STFT is given a model inspired by nonnegative matrix factorization (NMF) with the Itakura-Saito divergence, which underlies a statistical model of superimposed Gaussian components. We address estimation of the mixing and source parameters using two methods. The first consists of maximizing the exact joint likelihood of the multichannel data using an expectation-maximization (EM) algorithm. The second consists of maximizing the sum of individual likelihoods of all channels using a multiplicative update algorithm inspired by NMF methodology. Our decomposition algorithms are applied to stereo audio source separation in various settings, covering blind and supervised separation, music and speech sources, synthetic instantaneous and convolutive mixtures, as well as professionally produced music recordings. Our EM method produces competitive results with respect to the state of the art, as illustrated on two tasks from the international Signal Separation Evaluation Campaign (SiSEC 2008).
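
To make the local Gaussian model concrete, the sketch below shows only the source reconstruction step it implies: once per-frequency mixing vectors and per-source NMF variances have been estimated (by either of the two methods above), each source is recovered by Wiener filtering in every time-frequency bin. The array layout, names, and regularization constant are our own assumptions rather than the authors' code.

import numpy as np

def multichannel_wiener(X, A, V):
    """Wiener-filter reconstruction under a local Gaussian model.
    X: mixture STFT, shape (I, F, N) for I channels, F bins, N frames.
    A: per-frequency mixing matrices, shape (F, I, J) for J sources.
    V: per-source variances, shape (J, F, N), e.g. V[j] = W_j @ H_j from NMF."""
    I, F, N = X.shape
    J = V.shape[0]
    S = np.zeros((J, F, N), dtype=complex)
    for f in range(F):
        a = A[f]                                     # (I, J) mixing at bin f
        for n in range(N):
            # Mixture covariance: sum_j v_j(f,n) a_j a_j^H, lightly regularized.
            Sigma_x = sum(V[j, f, n] * np.outer(a[:, j], a[:, j].conj())
                          for j in range(J)) + 1e-9 * np.eye(I)
            x = X[:, f, n]
            for j in range(J):
                # MMSE (Wiener) estimate: s_j = v_j a_j^H Sigma_x^{-1} x.
                S[j, f, n] = V[j, f, n] * (a[:, j].conj() @ np.linalg.solve(Sigma_x, x))
    # Loops are kept explicit for clarity; a real implementation would vectorize.
    return S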


Neural Computation | 2011

Algorithms for nonnegative matrix factorization with the β-divergence

Cédric Févotte; Jérôme Idier

This letter describes algorithms for nonnegative matrix factorization (NMF) with the β-divergence (β-NMF). The β-divergence is a family of cost functions parameterized by a single shape parameter β that takes the Euclidean distance, the Kullback-Leibler divergence, and the Itakura-Saito divergence as special cases (β = 2, 1, 0, respectively). The proposed algorithms are based on a surrogate auxiliary function (a local majorization of the criterion function). We first describe a majorization-minimization (MM) algorithm that leads to multiplicative updates, which differ from standard heuristic multiplicative updates by a β-dependent power exponent. The monotonicity of the heuristic algorithm can, however, be proven for β ∈ (0, 1) using the proposed auxiliary function. Then we introduce the concept of the majorization-equalization (ME) algorithm, which produces updates that move along constant level sets of the auxiliary function and lead to larger steps than MM. Simulations on synthetic and real data illustrate the faster convergence of the ME approach. The letter also describes how the proposed algorithms can be adapted to two common variants of NMF: penalized NMF (when a penalty function of the factors is added to the criterion function) and convex NMF (when the dictionary is assumed to belong to a known subspace).
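
A minimal NumPy sketch of the MM multiplicative updates described above, i.e. the heuristic update raised to a β-dependent exponent γ(β); the function name, defaults, and the small constant eps are ours, and the ME variant is not included.

import numpy as np

def beta_nmf_mm(V, K, beta=1.0, n_iter=200, eps=1e-12, rng=None):
    """Majorization-minimization multiplicative updates for beta-NMF:
    approximate the nonnegative matrix V (F x N) by W @ H under the
    beta-divergence."""
    rng = np.random.default_rng(rng)
    F, N = V.shape
    W = rng.random((F, K)) + eps
    H = rng.random((K, N)) + eps
    # beta-dependent exponent that makes the multiplicative updates monotone.
    if beta < 1:
        gamma = 1.0 / (2.0 - beta)
    elif beta <= 2:
        gamma = 1.0            # heuristic and MM updates coincide here
    else:
        gamma = 1.0 / (beta - 1.0)
    for _ in range(n_iter):
        Vh = W @ H + eps
        H *= ((W.T @ (V * Vh**(beta - 2))) / (W.T @ Vh**(beta - 1)))**gamma
        Vh = W @ H + eps
        W *= (((V * Vh**(beta - 2)) @ H.T) / (Vh**(beta - 1) @ H.T))**gamma
    return W, H

Setting beta to 2, 1, or 0 recovers the Euclidean, KL, and IS special cases mentioned in the abstract.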


IEEE Transactions on Audio, Speech, and Language Processing | 2006

A Bayesian Approach for Blind Separation of Sparse Sources

Cédric Févotte; Simon J. Godsill

We present a Bayesian approach for blind separation of linear instantaneous mixtures of sources having a sparse representation in a given basis. The distributions of the coefficients of the sources in the basis are modeled by a Student t distribution, which can be expressed as a scale mixture of Gaussians, and a Gibbs sampler is derived to estimate the sources, the mixing matrix, the input noise variance, and the hyperparameters of the Student t distributions. The method allows for separation of underdetermined (more sources than sensors) noisy mixtures. Results are presented with audio signals using a modified discrete cosine transform basis and compared with a finite-mixture-of-Gaussians prior approach. These results show the improved sound quality obtained with the Student t prior and the better robustness of the Markov chain Monte Carlo approach to mixing matrices close to singularity.
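
The modeling device at the heart of this approach, writing the Student t prior as a scale mixture of Gaussians, is easy to illustrate in isolation. The sketch below only shows how such heavy-tailed coefficients are generated (with our own parameter names); it is not the paper's Gibbs sampler over sources, mixing matrix, noise variance, and hyperparameters.

import numpy as np

def student_t_scale_mixture(nu, sigma, size, rng=None):
    """Draw Student t samples via the scale-mixture-of-Gaussians construction:
    lambda ~ Gamma(nu/2, rate = nu/2), then s | lambda ~ N(0, sigma^2 / lambda)
    is marginally Student t with nu degrees of freedom and scale sigma."""
    rng = np.random.default_rng(rng)
    lam = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=size)   # scale = 1/rate
    return rng.normal(0.0, sigma / np.sqrt(lam), size=size)

Small values of nu give the heavy tails that favour sparse coefficients (most samples near zero, a few large ones), and conditioning on the auxiliary scales makes the coefficients conditionally Gaussian, which is what keeps the Gibbs sampling steps tractable.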


IEEE Transactions on Audio, Speech, and Language Processing | 2010

Source/Filter Model for Unsupervised Main Melody Extraction From Polyphonic Audio Signals

Jean-Louis Durrieu; Gaël Richard; Bertrand David; Cédric Févotte

Extracting the main melody from a polyphonic music recording seems natural even to untrained human listeners. To a certain extent it is related to the concept of source separation, with the human ability to focus on a specific source in order to extract relevant information. In this paper, we propose a new approach for the estimation and extraction of the main melody (and in particular the leading vocal part) from polyphonic audio signals. To that aim, we propose a new signal model where the leading vocal part is explicitly represented by a specific source/filter model. The proposed representation is investigated in the framework of two statistical models: a Gaussian Scaled Mixture Model (GSMM) and an extended Instantaneous Mixture Model (IMM). For both models, the estimation of the different parameters is done within a maximum-likelihood framework adapted from single-channel source separation techniques. The desired sequence of fundamental frequencies is then inferred from the estimated parameters. The results obtained in a recent evaluation campaign (MIREX08) show that the proposed approaches are very promising and reach state-of-the-art performance on all test sets.
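
The source/filter idea itself, independently of the paper's exact GSMM/IMM parameterization (which we do not attempt to reproduce here), can be illustrated by building a voiced-frame power spectrum as a harmonic comb shaped by a smooth spectral envelope. Everything in the sketch below, names and constants included, is a generic illustration of that idea.

import numpy as np

def voiced_frame_spectrum(f0, fs=16000, n_fft=2048, n_harmonics=30,
                          formant_centers=(500.0, 1500.0, 2500.0),
                          formant_width=300.0):
    """Generic source/filter toy model of one voiced frame: a harmonic comb
    (the 'source', driven by the fundamental frequency f0) multiplied by a
    smooth envelope (the 'filter', here crude Gaussian bumps standing in for
    formants)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    # Source part: narrow peaks at integer multiples of f0.
    source = np.zeros_like(freqs)
    for h in range(1, n_harmonics + 1):
        source += np.exp(-0.5 * ((freqs - h * f0) / 10.0) ** 2)
    # Filter part: smooth spectral envelope.
    envelope = sum(np.exp(-0.5 * ((freqs - c) / formant_width) ** 2)
                   for c in formant_centers)
    return source * envelope          # power spectrum of the voiced frame

Melody extraction in this spirit amounts to asking, frame by frame, which f0 (combined with some admissible envelope) best explains the part of the mixture attributed to the lead voice.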


IEEE Workshop on Applications of Signal Processing to Audio and Acoustics | 2009

Factorial Scaled Hidden Markov Model for polyphonic audio representation and source separation

Alexey Ozerov; Cédric Févotte; Maurice Charbit

We present a new probabilistic model for polyphonic audio termed the factorial scaled hidden Markov model (FS-HMM), which generalizes several existing models, notably the Gaussian scaled mixture model and the Itakura-Saito nonnegative matrix factorization (NMF) model. We describe two expectation-maximization (EM) algorithms for maximum likelihood estimation, which differ by the choice of complete data set. The second EM algorithm, based on a reduced complete data set and multiplicative updates inspired by NMF methodology, exhibits much faster convergence. We consider the FS-HMM in different configurations for the difficult problem of speech/music separation from a single channel and report satisfying results.


IEEE Signal Processing Letters | 2004

Two contributions to blind source separation using time-frequency distributions

Cédric Févotte; Christian Doncarli

We present two improvements/extensions of a previous deterministic blind source separation (BSS) technique, by Belouchrani and Amin, that involves joint diagonalization of a set of Cohen's class spatial time-frequency distributions. The first contribution concerns the extension of the BSS technique to the stochastic case using the spatial Wigner-Ville spectrum. Then, we show that Belouchrani and Amin's technique can be interpreted as a practical implementation of the general equations we provide in the stochastic case. The second contribution is a new criterion aimed at selecting more efficiently the time-frequency locations where the spatial matrices should be jointly diagonalized, introducing single-autoterm selection. Simulation results on stochastic time-varying autoregressive moving average (TVARMA) signals demonstrate the improved efficiency of the method.


IEEE Signal Processing Magazine | 2014

Static and Dynamic Source Separation Using Nonnegative Factorizations: A unified view

Paris Smaragdis; Cédric Févotte; Gautham J. Mysore; Nasser Mohammadiha; Matthew D. Hoffman

Source separation models that make use of nonnegativity in their parameters have been gaining popularity in the last few years, spawning a significant number of publications on the topic. Although these techniques are conceptually similar to other matrix decompositions, they are surprisingly more effective in extracting perceptually meaningful sources from complex mixtures. In this article, we will examine the various methodologies and extensions that make up this family of approaches and present them under a unified framework. We will begin with a short description of the basic concepts, and in the subsequent sections we will delve into more detail and explore some of the latest extensions.


IEEE International Conference on Acoustics, Speech, and Signal Processing | 2011

Multichannel nonnegative tensor factorization with structured constraints for user-guided audio source separation

Alexey Ozerov; Cédric Févotte; Raphaël Blouet; Jean-Louis Durrieu

Separating multiple tracks from professionally produced music recordings (PPMRs) is still a challenging problem. We address this task with a user-guided approach in which the separation system is provided with segmental information indicating the time activations of the particular instruments to separate. This information may typically be retrieved from manual annotation. We use a so-called multichannel nonnegative tensor factorization (NTF) model, in which the original sources are observed through a multichannel convolutive mixture and in which the source power spectrograms are jointly modeled by a 3-valence (time/frequency/source) tensor. Our user-guided separation method produced competitive results at the 2010 Signal Separation Evaluation Campaign, with sufficient quality for real-world music editing applications.
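
The user-guidance mechanism is easiest to picture in a simplified single-channel NMF surrogate of the model: the manual annotations become a binary time-activation mask imposed on the activation matrix. The sketch below shows only that constraint (reusing Itakura-Saito multiplicative updates); it is not the multichannel convolutive NTF model of the paper, and all names are hypothetical.

import numpy as np

def user_guided_nmf(V, masks, n_iter=200, eps=1e-12, rng=None):
    """Single-channel illustration of user-guided separation. `V` is an
    (F x N) power spectrogram; `masks` is a (K x N) binary array where
    masks[k, n] = 1 iff the instrument assigned to component k is annotated
    as active in frame n. Activations outside the annotated segments are
    forced to zero."""
    rng = np.random.default_rng(rng)
    F, N = V.shape
    K = masks.shape[0]
    W = rng.random((F, K)) + eps
    H = (rng.random((K, N)) + eps) * masks      # start inside the annotations
    for _ in range(n_iter):
        Vh = W @ H + eps
        H *= (W.T @ (V * Vh**-2)) / (W.T @ Vh**-1)
        H *= masks                              # enforce the time activations
        Vh = W @ H + eps
        W *= ((V * Vh**-2) @ H.T) / (Vh**-1 @ H.T)
    return W, H

Because multiplicative updates preserve zeros, the mask only needs to be applied once in principle; keeping it inside the loop simply makes the constraint explicit.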

Collaboration


Dive into Cédric Févotte's collaborations.

Top Co-Authors

Slim Essid

Université Paris-Saclay


Jean-Louis Durrieu

École Polytechnique Fédérale de Lausanne


Francis R. Bach

École Normale Supérieure
