Simon Arberet
École Polytechnique Fédérale de Lausanne
Publications
Featured research published by Simon Arberet.
information sciences, signal processing and their applications | 2010
Simon Arberet; Alexey Ozerov; Ngoc Q. K. Duong; Emmanuel Vincent; Rémi Gribonval; Frédéric Bimbot; Pierre Vandergheynst
We address the problem of blind audio source separation in the under-determined and convolutive case. The contribution of each source to the mixture channels in the time-frequency domain is modeled by a zero-mean Gaussian random vector with a full rank covariance matrix composed of two terms: a variance which represents the spectral properties of the source and which is modeled by a nonnegative matrix factorization (NMF) model and another full rank covariance matrix which encodes the spatial properties of the source contribution in the mixture. We address the estimation of these parameters by maximizing the likelihood of the mixture using an expectation-maximization (EM) algorithm. Theoretical propositions are corroborated by experimental studies on stereo reverberant music mixtures.
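The covariance structure described in this abstract can be illustrated with a small numerical sketch. Everything below is an assumption made for illustration: the sizes, the variable names, the random values, and the choice of one spatial covariance matrix per frequency bin. It only builds the model for a single source and does not implement the EM estimation.

```python
import numpy as np

rng = np.random.default_rng(0)
F, N, K, I = 5, 4, 3, 2  # frequency bins, time frames, NMF components, channels (hypothetical sizes)

# NMF model of the source spectral variance: v[f, n] = sum_k W[f, k] * H[k, n]
W = rng.random((F, K))
H = rng.random((K, N))
v = W @ H

# One full-rank Hermitian positive-definite spatial covariance per frequency bin (assumed layout)
B = rng.standard_normal((F, I, I)) + 1j * rng.standard_normal((F, I, I))
R = B @ B.conj().transpose(0, 2, 1) + 1e-3 * np.eye(I)

# Covariance of the source contribution at each time-frequency point:
# Sigma[f, n] = v[f, n] * R[f]
Sigma = v[:, :, None, None] * R[:, None, :, :]
```

Each `Sigma[f, n]` is an `I x I` matrix whose scale follows the spectral model and whose shape follows the spatial model.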
IEEE Transactions on Image Processing | 2013
Mohammad Golbabaee; Simon Arberet; Pierre Vandergheynst
We propose and analyze a new model for hyperspectral images (HSIs) based on the assumption that the whole signal is composed of a linear combination of a few sources, each of which has a specific spectral signature, and that the spatial abundance maps of these sources are themselves piecewise smooth and therefore efficiently encoded via typical sparse models. We derive new sampling schemes exploiting this assumption and give theoretical lower bounds on the number of measurements required to reconstruct HSI data and recover their source model parameters. This allows us to segment HSIs into their source abundance maps directly from compressed measurements. We also propose efficient optimization algorithms and perform extensive experimentation on synthetic and real datasets, which reveals that our approach can be used to encode HSIs with far fewer measurements and less computational effort than traditional compressive sensing methods.
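A toy version of the linear source model in this abstract is sketched below. The sizes, the piecewise-constant abundance maps, and the random signatures are all assumptions chosen for illustration, and the image is flattened to one dimension for brevity. The sketch shows only the data model, not the sampling schemes or recovery algorithms.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_bands, n_sources = 64, 30, 3  # hypothetical sizes

# Piecewise-constant abundance maps, one column per source (assumed toy layout)
S = np.zeros((n_pixels, n_sources))
S[:20, 0] = 1.0
S[20:45, 1] = 1.0
S[45:, 2] = 1.0

# One spectral signature per source (rows), here random nonnegative values
H = rng.random((n_sources, n_bands))

# Each pixel's spectrum is a linear combination of the source signatures
X = S @ H

rank = np.linalg.matrix_rank(X)
n_model_params = S.size + H.size  # parameters of the factorized model
n_data_values = X.size            # values in the full data cube
```

Here the data matrix has rank equal to the number of sources, and the factorized model has far fewer parameters than the full data, which is the kind of structure the abstract proposes to exploit.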
international conference on independent component analysis and signal separation | 2006
Simon Arberet; Rémi Gribonval; Frédéric Bimbot
We propose a robust method to estimate the number of audio sources and the mixing matrix in a linear instantaneous mixture, even with more sources than sensors. Our method is based on a multiscale Short Time Fourier Transform (STFT), and relies on the assumption that in the neighborhood of some (unknown) scales and time-frequency points, only one source contributes to the mixture. Such time-frequency regions provide local estimates of the corresponding columns of the mixing matrix. Our main contribution is a new clustering algorithm called DEMIX to estimate the number of sources and the mixing matrix based on such local estimates. In contrast to DUET or other similar sparsity-based algorithms, which rely on a global scatter plot, our algorithm exploits a local confidence measure to weight the influence of each time-frequency point in the estimated matrix. Inspired by the work of Deville, the confidence measure relies on the time-frequency local persistence of the activity/inactivity of each source. Experiments are provided with stereophonic mixtures and show the improved performance of DEMIX compared to K-means or ELBG clustering algorithms.
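The idea of weighting local estimates by a confidence measure can be sketched as follows. This is not the DEMIX algorithm: the function name, the use of a local principal component analysis, and the eigenvalue-ratio confidence score are assumptions made for illustration, and the data are synthetic coefficients rather than an actual STFT.

```python
import numpy as np

def local_direction_and_confidence(X):
    """X: (2, P) complex mixture coefficients from one time-frequency neighbourhood.
    Returns a unit direction estimate and a confidence score in dB
    (hypothetical definition: ratio of largest to smallest local eigenvalue)."""
    # For a real-valued mixing matrix, the real part of the local covariance suffices
    C = np.real(X @ X.conj().T) / X.shape[1]
    eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
    u = eigvecs[:, -1]
    if u[0] < 0:                          # resolve the sign ambiguity
        u = -u
    conf_db = 10 * np.log10(eigvals[-1] / max(eigvals[0], 1e-300))
    return u, conf_db

rng = np.random.default_rng(0)
P = 50
a = np.array([np.cos(0.3), np.sin(0.3)])  # mixing column of source 1 (synthetic)
b = np.array([np.cos(1.2), np.sin(1.2)])  # mixing column of source 2 (synthetic)
s1 = rng.standard_normal(P) + 1j * rng.standard_normal(P)
s2 = rng.standard_normal(P) + 1j * rng.standard_normal(P)
noise = 0.01 * (rng.standard_normal((2, P)) + 1j * rng.standard_normal((2, P)))

X_single = np.outer(a, s1) + noise                 # only source 1 is active
X_both = np.outer(a, s1) + np.outer(b, s2) + noise # both sources are active

u_single, conf_single = local_direction_and_confidence(X_single)
u_both, conf_both = local_direction_and_confidence(X_both)
```

In a region where a single source dominates, the local scatter is close to one-dimensional, so the confidence is high and the direction is close to the corresponding mixing column. Where both sources are active, the confidence drops and the estimate would receive little weight in a clustering step.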
international conference on acoustics, speech, and signal processing | 2007
Simon Arberet; Rémi Gribonval; Frédéric Bimbot
We propose a new method, called DEMIX anechoic, to estimate the mixing conditions, i.e. the number of audio sources plus the attenuation and time delay of each source, in an underdetermined anechoic mixture. The method relies on the assumption that in the neighborhood of some time-frequency points, only one source contributes to the mixture. Such time-frequency points, located with a local confidence measure, provide estimates of the attenuation, as well as the phase difference at some frequency, of the corresponding source. The time delay parameters are estimated, by a method similar to GCC-PHAT, on points having close attenuations. As opposed to DUET-like methods, our method can estimate time delays larger than one sample. Experiments show that DEMIX anechoic correctly estimates the number of directions in more than 65% of the cases for mixtures of up to 6 sources, and outperforms DUET in estimation accuracy by a factor of 10.
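For readers unfamiliar with GCC-PHAT, which the abstract cites as the model for its delay estimation step, a minimal textbook sketch is given below. It is the plain GCC-PHAT estimator applied to two full signals, not the DEMIX anechoic procedure; the function name and the synthetic test signal are assumptions.

```python
import numpy as np

def gcc_phat_delay(x1, x2, max_lag):
    """Estimate the integer delay (in samples) of x2 relative to x1 with GCC-PHAT.
    max_lag must be at least 1."""
    n = len(x1) + len(x2)  # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n)
    X2 = np.fft.rfft(x2, n)
    cross = X2 * np.conj(X1)
    cross /= np.maximum(np.abs(cross), 1e-12)  # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n)
    # Reorder so that index i corresponds to lag i - max_lag
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag

rng = np.random.default_rng(0)
true_delay = 7
x1 = rng.standard_normal(1024)
# x2 is an attenuated copy of x1 delayed by several samples
x2 = 0.5 * np.concatenate((np.zeros(true_delay), x1[:-true_delay]))

estimated_delay = gcc_phat_delay(x1, x2, max_lag=20)
```

Because the search runs over a range of lags, a delay of several samples can be recovered, and the attenuation does not affect the estimate since only the phase of the cross-spectrum is kept.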
IEEE Transactions on Audio, Speech, and Language Processing | 2013
Simon Arberet; Pierre Vandergheynst; Rafael E. Carrillo; Jean-Philippe Thiran; Yves Wiaux
We propose a novel algorithm for source signal estimation from an underdetermined convolutive mixture assuming known mixing filters. Most state-of-the-art methods deal with anechoic or short reverberant mixtures, assuming a synthesis sparse prior in the time-frequency domain and a narrowband approximation of the convolutive mixing process. In this paper, we address the source estimation of convolutive mixtures with a new algorithm based on i) an analysis sparse prior, ii) a reweighting scheme so as to increase the sparsity, iii) a wideband data-fidelity term in a constrained form. We show, through theoretical discussions and simulations, that this algorithm is particularly well suited for source separation of realistic reverberant mixtures. In particular, the proposed algorithm outperforms state-of-the-art methods on reverberant mixtures of audio sources by more than 2 dB of signal-to-distortion ratio on the BSS Oracle dataset.
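The reweighting idea in item ii) can be illustrated with the generic reweighted ℓ1 recipe, in which coefficients that were small in the previous solution receive larger weights in the next weighted ℓ1 problem. The sketch below is an assumption-laden illustration of that recipe only: the weight formula, the function names, and the parameter values are not taken from the paper, and the constrained wideband data-fidelity term is not modelled.

```python
import numpy as np

def reweight(coeffs, eps):
    """Weights for the next weighted-l1 problem: small coefficients get large weights."""
    return 1.0 / (np.abs(coeffs) + eps)

def weighted_soft_threshold(coeffs, weights, tau):
    """Proximal operator of tau * sum_i weights[i] * |coeffs[i]| (real or complex input)."""
    mag = np.abs(coeffs)
    scale = np.maximum(1.0 - tau * weights / np.maximum(mag, 1e-300), 0.0)
    return coeffs * scale

coeffs = np.array([3.0, 0.1, -2.0, 0.05])  # synthetic analysis coefficients
weights = reweight(coeffs, eps=0.1)
shrunk = weighted_soft_threshold(coeffs, weights, tau=0.1)
```

With these values the two small coefficients are set to zero while the two large ones are barely changed, which is how reweighting pushes the solution toward greater sparsity than a uniform threshold would.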
asilomar conference on signals, systems and computers | 2010
Mohammad Golbabaee; Simon Arberet; Pierre Vandergheynst
This paper describes a novel framework for compressive sampling (CS) of multichannel signals that are highly dependent across the channels. In this work, we assume that a small number of sources generate the multichannel observations based on a linear mixture model. Moreover, sources are assumed to have sparse/compressible representations in some orthonormal basis. The main contribution of this paper lies in 1) rephrasing the CS acquisition of multichannel data as a compressive blind source separation problem, and 2) proposing an optimization problem and a recovery algorithm to estimate both the sources and the mixing matrix (and thus the whole data) from the compressed measurements. A number of experiments on the acquisition of hyperspectral images show that our proposed algorithm obtains a reconstruction error between 10 dB and 15 dB less than other state-of-the-art CS methods.
Signal Processing | 2012
Simon Arberet; Alexey Ozerov; Frédéric Bimbot; Rémi Gribonval
The underdetermined blind audio source separation (BSS) problem is often addressed in the time-frequency (TF) domain assuming that each TF point is modeled as an independent random variable with sparse distribution. On the other hand, methods based on structured spectral models, such as the Spectral Gaussian Scaled Mixture Models (Spectral-GSMMs) or Spectral Non-negative Matrix Factorization models, perform better because they exploit the statistical diversity of audio source spectrograms, thus going beyond the simple sparsity assumption. However, in the case of discrete state-based models, such as Spectral-GSMMs, learning the models from the mixture can be computationally very expensive. One of the main problems is that using a classical Expectation-Maximization procedure often leads to an exponential complexity with respect to the number of sources. In this paper, we propose a framework with a linear complexity to learn spectral source models (including discrete state-based models) from noisy source estimates. Moreover, this framework allows different probabilistic models to be combined, which can be seen as a form of probabilistic fusion. We illustrate that methods based on this framework can significantly improve the BSS performance compared to the state-of-the-art approaches.
international conference on latent variable analysis and signal separation | 2010
Prasad Sudhakar; Simon Arberet; Rémi Gribonval
We propose a framework for blind multiple filter estimation from convolutive mixtures, exploiting the time-domain sparsity of the mixing filters and the disjointness of the sources in the time-frequency domain. The proposed framework includes two steps: (a) a clustering step, to determine the frequencies where each source is active alone; (b) a filter estimation step, to recover the filter associated with each source from the corresponding incomplete frequency information. We show how to solve the filter estimation step (b) using convex programming, and we explore numerically the factors that drive its performance. Step (a) remains challenging, and we discuss possible strategies that will be studied in future work.
IEEE Signal Processing Letters | 2014
Simon Arberet; Pierre Vandergheynst
The performance of audio source separation from an underdetermined convolutive mixture assuming known mixing filters can be significantly improved by using an analysis sparse prior optimized by a reweighted ℓ1 scheme and a wideband data-fidelity term, as demonstrated by a recent article. In this letter, we show that the performance can be improved even more significantly by exploiting a low-rank prior on the source spectrograms. We present a new algorithm to estimate the sources based on i) an analysis sparse prior, ii) a reweighting scheme so as to increase the sparsity, iii) a wideband data-fidelity term in a constrained form, and iv) a low-rank constraint on the source spectrograms. Evaluation on reverberant music mixtures shows that the resulting algorithm improves state-of-the-art methods by more than 2 dB of signal-to-distortion ratio.
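The abstract does not state how the low-rank constraint in item iv) is enforced. One standard building block, shown here purely as an assumed illustration, is the best rank-r approximation of a matrix obtained by truncating its singular value decomposition. The function name, the target rank, and the synthetic spectrogram are hypothetical.

```python
import numpy as np

def best_rank_r(M, r):
    """Best rank-r approximation of M in the Frobenius norm (truncated SVD)."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vh[:r, :]

rng = np.random.default_rng(0)
# Synthetic nonnegative "spectrogram" of rank 2 plus a small perturbation
low_rank = rng.random((40, 2)) @ rng.random((2, 60))
noisy = low_rank + 0.01 * rng.standard_normal((40, 60))

approx = best_rank_r(noisy, r=2)
rank_after = np.linalg.matrix_rank(approx)
```

The truncation removes most of the perturbation, so `approx` is closer to `low_rank` than `noisy` is. Music spectrograms often contain repeated spectral patterns, which is the usual motivation for a low-rank prior.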
international conference on acoustics, speech, and signal processing | 2011
Simon Arberet; Prasad Sudhakar; Rémi Gribonval
We propose an approach for the estimation of sparse filters from a convolutive mixture of sources, exploiting the time-domain sparsity of the mixing filters and the sparsity of the sources in the time-frequency (TF) domain. The proposed approach is based on a wideband formulation of the cross-relation (CR) in the TF domain and on a framework including two steps: (a) a clustering step, to determine the TF points where the CR is valid; (b) a filter estimation step, to recover the set of filters associated with each source. We propose for the first time a method to blindly perform the clustering step (a) and we show that the proposed approach based on the wideband CR outperforms the narrowband approach and the GCC-PHAT approach by between 5 dB and 20 dB.
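The cross-relation that this approach builds on can be checked numerically in the time domain. When a single source s reaches two channels through filters a1 and a2, then x1 = a1 * s and x2 = a2 * s, so x1 * a2 = x2 * a1 by commutativity of convolution. The sketch below uses synthetic signals and hypothetical sparse filters; it verifies the identity and shows that it breaks when a second source is active, but it does not implement the paper's TF-domain formulation or its clustering step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sparse mixing filters from source 1 to channels 1 and 2
a1 = np.zeros(16)
a1[[0, 5, 11]] = [1.0, 0.5, -0.3]
a2 = np.zeros(16)
a2[[2, 7]] = [0.8, -0.4]

s1 = rng.standard_normal(200)  # synthetic source signal
x1 = np.convolve(a1, s1)
x2 = np.convolve(a2, s1)

# Cross-relation: x1 * a2 equals x2 * a1 when only source 1 is active
cr_residual_single = np.max(np.abs(np.convolve(x1, a2) - np.convolve(x2, a1)))

# A second source with different filters breaks the relation for (a1, a2)
b1 = np.zeros(16)
b1[[1, 9]] = [0.7, 0.2]
b2 = np.zeros(16)
b2[[0, 4]] = [-0.6, 0.9]
s2 = rng.standard_normal(200)
y1 = x1 + np.convolve(b1, s2)
y2 = x2 + np.convolve(b2, s2)
cr_residual_both = np.max(np.abs(np.convolve(y1, a2) - np.convolve(y2, a1)))
```

The first residual is zero up to floating-point error, while the second is not, which is why a step that identifies where the cross-relation holds is needed before the filters can be estimated from it.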