Romain Hennequin
Télécom ParisTech
Publications
Featured research published by Romain Hennequin.
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2011
Romain Hennequin; Bertrand David; Roland Badeau
In this paper we present a new technique for monaural source separation in musical mixtures, which uses the knowledge of the musical score. This information is used to initialize an algorithm which computes a parametric decomposition of the spectrogram based on non-negative matrix factorization (NMF). This algorithm provides time-frequency masks which are used to separate the sources with Wiener filtering.
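The NMF-plus-Wiener-masking pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it runs plain Euclidean multiplicative updates on a toy random spectrogram, and random initializations stand in for the score-informed ones. The function name and all dimensions are hypothetical.

```python
import numpy as np

def nmf_wiener_separation(V, W_init, H_init, n_iter=100, eps=1e-12):
    """Multiplicative-update NMF (Euclidean cost) followed by
    Wiener-style time-frequency masking, one mask per atom.

    V: magnitude-squared spectrogram (freq x time).
    W_init, H_init: initializations; in score-informed NMF these
    would encode which notes may be active and when.
    """
    W, H = W_init.copy(), H_init.copy()
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ (W @ H) + eps)
        W *= (V @ H.T) / ((W @ H) @ H.T + eps)
    # Wiener-style masks: each atom's share of the model spectrogram.
    V_hat = W @ H + eps
    masks = [np.outer(W[:, k], H[k, :]) / V_hat for k in range(W.shape[1])]
    return W, H, masks

# Toy mixture: random nonnegative "spectrogram", two atoms.
rng = np.random.default_rng(0)
V = np.abs(rng.normal(size=(64, 40))) ** 2
W0 = np.abs(rng.normal(size=(64, 2)))
H0 = np.abs(rng.normal(size=(2, 40)))
W, H, masks = nmf_wiener_separation(V, W0, H0)
```

In practice each mask would be applied to the complex STFT of the mixture and inverted to recover a source; grouping several atoms per source works the same way, summing their numerators before dividing by `V_hat`.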
IEEE Transactions on Audio, Speech, and Language Processing | 2011
Romain Hennequin; Roland Badeau; Bertrand David
Real-world sounds often exhibit time-varying spectral shapes, as observed in the spectrogram of a harpsichord tone or that of a transition between two pronounced vowels. Whereas the standard non-negative matrix factorization (NMF) assumes fixed spectral atoms, an extension is proposed where the temporal activations (coefficients of the decomposition on the spectral atom basis) become frequency dependent and follow a time-varying autoregressive moving average (ARMA) modeling. This extension can thus be interpreted with the help of a source/filter paradigm and is referred to as source/filter factorization. This factorization leads to an efficient single-atom decomposition for a single audio event with strong spectral variation (but with constant pitch). The new algorithm is tested on real audio data and shows promising results.
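The source/filter idea above can be illustrated numerically: one fixed spectral atom with scalar gains is modulated by the squared magnitude response of an ARMA filter whose coefficients drift over time, yielding a single atom with a moving resonance. This is a sketch with arbitrary filter coefficients (the paper estimates them from data); `arma_magnitude_response` is a hypothetical helper, not code from the paper.

```python
import numpy as np

def arma_magnitude_response(b, a, n_freq):
    """Squared magnitude response |B(e^{jw}) / A(e^{jw})|^2 of an ARMA
    filter, sampled on n_freq bins from 0 up to Nyquist."""
    w = np.pi * np.arange(n_freq) / n_freq
    z = np.exp(-1j * np.outer(w, np.arange(max(len(b), len(a)))))
    num = z[:, :len(b)] @ b
    den = z[:, :len(a)] @ a
    return np.abs(num) ** 2 / np.abs(den) ** 2

# One spectral atom w(f), scalar gains h(t), and an all-pole filter
# whose pole drifts over time: a single atom with a moving resonance.
n_freq, n_time = 128, 20
atom = np.abs(np.random.default_rng(1).normal(size=n_freq))
gains = np.linspace(1.0, 0.5, n_time)
V = np.empty((n_freq, n_time))
for t in range(n_time):
    pole = 0.6 + 0.3 * t / (n_time - 1)  # time-varying AR coefficient
    filt = arma_magnitude_response(np.array([1.0]),
                                   np.array([1.0, -pole]), n_freq)
    V[:, t] = gains[t] * atom * filt
```

A standard NMF would need many fixed atoms to cover this spectral trajectory; the source/filter factorization covers it with one atom plus the time-varying filter.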
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2015
Simon Leglaive; Romain Hennequin; Roland Badeau
In this paper, we propose a new method for singing voice detection based on a Bidirectional Long Short-Term Memory (BLSTM) Recurrent Neural Network (RNN). This classifier takes both past and future temporal context into account when deciding on the presence or absence of singing voice, thus exploiting the inherently sequential nature of short-term feature extraction in a piece of music. The BLSTM-RNN contains several hidden layers, so it is able to extract, from low-level features, a simple representation fitted to our task. The results we obtain significantly outperform state-of-the-art methods on a common database.
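How a bidirectional recurrent layer combines past and future context can be sketched with plain tanh cells rather than actual LSTM cells, and with random untrained weights, purely to show the mechanics assumed above: one pass left-to-right, one pass right-to-left, and the two hidden states concatenated per frame.

```python
import numpy as np

def bidirectional_rnn(X, Wf, Uf, Wb, Ub):
    """Minimal bidirectional recurrent layer (plain tanh cells, not
    LSTM): each output frame concatenates a forward hidden state
    (summarizing the past) and a backward one (summarizing the future).

    X: sequence of feature frames, shape (T, d_in)."""
    T, _ = X.shape
    d_h = Wf.shape[0]
    h_fwd = np.zeros((T, d_h))
    h_bwd = np.zeros((T, d_h))
    h = np.zeros(d_h)
    for t in range(T):                     # left-to-right: past context
        h = np.tanh(Wf @ X[t] + Uf @ h)
        h_fwd[t] = h
    h = np.zeros(d_h)
    for t in reversed(range(T)):           # right-to-left: future context
        h = np.tanh(Wb @ X[t] + Ub @ h)
        h_bwd[t] = h
    return np.concatenate([h_fwd, h_bwd], axis=1)

rng = np.random.default_rng(2)
T, d_in, d_h = 50, 13, 8                   # e.g. 13 low-level features/frame
X = rng.normal(size=(T, d_in))
H = bidirectional_rnn(X,
                      rng.normal(size=(d_h, d_in)) * 0.1,
                      rng.normal(size=(d_h, d_h)) * 0.1,
                      rng.normal(size=(d_h, d_in)) * 0.1,
                      rng.normal(size=(d_h, d_h)) * 0.1)
```

A per-frame voice/no-voice classifier would sit on top of `H`; the LSTM gating that the paper actually uses addresses training over long sequences but does not change this bidirectional structure.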
IEEE Signal Processing Letters | 2011
Romain Hennequin; Bertrand David; Roland Badeau
In this paper, we present a complete proof that the β-divergence is a particular case of Bregman divergence. This little-known result makes it possible to straightforwardly apply theorems about Bregman divergences to β-divergences. This is of interest for numerous applications since these divergences are widely used, for instance in non-negative matrix factorization (NMF).
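The result can be checked numerically for one value of β (the paper proves it in general): with the convex generating function φ(x) = x^β / (β(β−1)), the Bregman divergence D_φ(x, y) = φ(x) − φ(y) − φ′(y)(x − y) reproduces the β-divergence. The helper names below are illustrative.

```python
import numpy as np

def beta_divergence(x, y, beta):
    """beta-divergence for beta not in {0, 1}."""
    return (x**beta + (beta - 1) * y**beta
            - beta * x * y**(beta - 1)) / (beta * (beta - 1))

def bregman_divergence(x, y, phi, dphi):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - phi'(y)(x - y)."""
    return phi(x) - phi(y) - dphi(y) * (x - y)

beta = 1.5
phi = lambda x: x**beta / (beta * (beta - 1))   # generating convex function
dphi = lambda x: x**(beta - 1) / (beta - 1)     # its derivative

x = np.linspace(0.1, 5.0, 50)
y = np.linspace(0.2, 4.0, 50)
d1 = beta_divergence(x, y, beta)
d2 = bregman_divergence(x, y, phi, dphi)
# d1 and d2 agree elementwise, and both are nonnegative as a Bregman
# divergence of a convex function must be.
```

Expanding D_φ algebraically gives (x^β + (β−1)y^β − βxy^{β−1}) / (β(β−1)), term by term the β-divergence; β = 2 recovers half the squared Euclidean distance.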
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2010
Romain Hennequin; Roland Badeau; Bertrand David
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2014
Romain Hennequin; Juan José Burred; Simon Maller; Pierre Leveau
In this paper, we present a new method to perform underdetermined audio source separation using a spoken or sung reference signal to inform the separation process. This method explicitly models possible differences between the spoken reference and the target signal, such as pitch differences and time lag. We show that the proposed algorithm outperforms state-of-the-art methods.
Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) | 2011
Romain Hennequin; Roland Badeau; Bertrand David
In this paper, we present a new method for decomposing musical spectrograms. This method is similar to shift-invariant Probabilistic Latent Component Analysis, but whereas the latter works with constant-Q spectrograms (i.e. with a logarithmic frequency resolution), our technique is designed to decompose standard short-time Fourier transform spectrograms (i.e. with a linear frequency resolution). This makes it possible to easily reconstruct the latent signals (which can be useful for source separation).
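Why shift-invariant decompositions are naturally posed on a log-frequency axis can be illustrated with a toy harmonic spectrum: on a constant-Q-like axis, transposing by an octave is a pure translation by a fixed number of bins, whereas on the linear frequency axis of an STFT the same transposition dilates the pattern, which is the situation the method above addresses. The pattern generator and axis parameters below are arbitrary illustrative choices, not the paper's.

```python
import numpy as np

def harmonic_pattern(f0, freqs, n_harm=10):
    """Toy magnitude spectrum: Gaussian bumps at the harmonics of f0,
    with bandwidth proportional to frequency (constant-Q-like)."""
    s = np.zeros_like(freqs)
    for k in range(1, n_harm + 1):
        c = k * f0
        s += np.exp(-((freqs - c) ** 2) / (2 * (0.01 * c) ** 2))
    return s

# Log-spaced frequency axis with exactly 750 bins per octave.
bins_per_octave = 750
log_freqs = 50.0 * 2.0 ** (np.arange(4000) / bins_per_octave)

note_a3 = harmonic_pattern(220.0, log_freqs)
note_a4 = harmonic_pattern(440.0, log_freqs)   # one octave up

# On this axis the octave shift is a pure translation by
# bins_per_octave bins: the invariance shift-invariant PLCA exploits.
shifted = np.empty_like(note_a3)
shifted[:bins_per_octave] = 0.0
shifted[bins_per_octave:] = note_a3[:-bins_per_octave]
```

On a linear axis no fixed translation maps `note_a3` onto `note_a4`, since the harmonic spacing doubles; handling that dilation directly on STFT spectrograms is what distinguishes the proposed decomposition.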
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2017
Romain Hennequin; Jimena Royo-Letelier; Manuel Moussallam
In this paper, we propose a method for detecting traces of lossy compression encoding, such as MP3 or AAC, in PCM audio. The method is based on a convolutional neural network (CNN) applied to audio spectrograms and trained on the output of various lossy audio codecs at various bitrates. Our method shows good performance on a large database and is robust to codec type and resampling.
Journal of Sound and Vibration | 2011
Marc Rébillat; Romain Hennequin; Etienne Corteel; Brian F. G. Katz
Proc. of the 13th International Conference on Digital Audio Effects (DAFx) | 2010
Romain Hennequin; Roland Badeau; Bertrand David