Publication


Featured research published by Pierre-Michel Bousquet.


International Conference on Acoustics, Speech, and Signal Processing | 2012

I-vectors in the context of phonetically-constrained short utterances for speaker verification

Anthony Larcher; Pierre-Michel Bousquet; Kong Aik Lee; Driss Matrouf; Haizhou Li; Jean-François Bonastre

Short speech duration remains a critical factor of performance degradation when deploying a speaker verification system. To overcome this difficulty, a large number of commercial applications impose the use of fixed pass-phrases. In this context, we show that the performance of the popular i-vector approach can be greatly improved by taking advantage of the phonetic information that they convey. Moreover, as i-vectors require a conditioning process to reach high accuracy, we show that further improvements are possible by taking advantage of this phonetic information within the normalisation process. We compare two methods, Within Class Covariance Normalization (WCCN) and Eigen Factor Radial (EFR), both relying on parameters estimated on the same development data. Our study suggests that WCCN is more robust to data mismatch but less efficient than EFR when the development data has a better match with the test data.


Pattern Recognition Letters | 2014

Feature selection using Principal Component Analysis for massive retweet detection

Mohamed Morchid; Richard Dufour; Pierre-Michel Bousquet; Georges Linarès; Juan-Manuel Torres-Moreno

Social networks have become a major actor in massive information propagation. In the context of the Twitter platform, its popularity is due in part to the capability of relaying messages (i.e. tweets) posted by users. This particular mechanism, called retweet, allows users to massively share tweets they consider potentially interesting for others. In this paper, we propose to study the behavior of tweets that have been massively retweeted in a short period of time. We first analyze specific tweet features through a Principal Component Analysis (PCA) to better understand the behavior of highly forwarded tweets as opposed to those retweeted only a few times. Finally, we propose to automatically detect the massively retweeted messages. The qualitative study is used to select the features allowing the best classification performance. We show that selecting only the most correlated features leads to the best classification accuracy (F-measure of 65.7%), with a gain of about 2.4 points over the complete set of features.


International Conference on Acoustics, Speech, and Signal Processing | 2014

Improving dialogue classification using a topic space representation and a Gaussian classifier based on the decision rule

Mohamed Morchid; Richard Dufour; Pierre-Michel Bousquet; Mohamed Bouallegue; Georges Linarès; Renato De Mori

In this paper, we study the impact of dialogue representations and classification methods in the task of theme identification of telephone conversation services having highly imperfect automatic transcriptions. Two dialogue representations are first compared: the classical Term Frequency-Inverse Document Frequency with Gini purity criteria (TF-IDF-Gini) method and the Latent Dirichlet Allocation (LDA) approach. We then propose to study an original classification method that takes advantage of the LDA topic space representation, highlighted as the best dialogue representation. To do so, two assumptions about topic representation led us to choose a Gaussian process (GP) based method. This approach is compared with a Support Vector Machine (SVM) classification method. Results show that the GP approach is a better solution to deal with the multiple theme complexity of a dialogue, regardless of the conditions studied (manual or automatic transcriptions). We finally discuss the impact of the topic space reduction on the classification accuracy.


International Conference on Acoustics, Speech, and Signal Processing | 2015

Additive noise compensation in the i-vector space for speaker recognition

Waad Ben Kheder; Driss Matrouf; Jean-François Bonastre; Moez Ajili; Pierre-Michel Bousquet

The performance of state-of-the-art speaker recognition systems degrades considerably in noisy environments, even though they achieve very good results in clean conditions. To deal with this strong limitation, we aim in this work to remove the noisy part of an i-vector directly in the i-vector space. Our approach has the advantage of operating only at the i-vector extraction level, leaving the other steps of the system unchanged. A maximum a posteriori (MAP) procedure is applied to obtain a clean version of the noisy i-vectors, taking advantage of prior knowledge about the distribution of clean i-vectors. To perform this MAP estimation, Gaussian assumptions are made over the distributions of clean and noise i-vectors. Operating on NIST 2008 data, we show a relative improvement of up to 60% compared with the baseline system. Our approach also outperforms the “multi-style” backend training technique. The efficiency of the proposed method is obtained at the price of a relatively high computational cost. We present at the end some ideas to improve this aspect.
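
As an illustration of the MAP step described in this abstract, the estimate has a closed form under the stated Gaussian assumptions. This is a sketch only: the additive model, the independence of clean i-vector and noise, and the symbols (y for the noisy i-vector, x for the clean one, n for the noise, with means μ and covariances Σ) are assumptions introduced here, and the paper's own notation and estimation details may differ.

```latex
% Assumed additive model in the i-vector space (x and n independent)
y = x + n, \qquad x \sim \mathcal{N}(\mu_x, \Sigma_x), \qquad n \sim \mathcal{N}(\mu_n, \Sigma_n)

% MAP estimate: maximise the log-posterior of the clean i-vector given y
\hat{x} = \arg\max_x \left[ \ln \mathcal{N}(y - x;\, \mu_n, \Sigma_n) + \ln \mathcal{N}(x;\, \mu_x, \Sigma_x) \right]

% Setting the gradient with respect to x to zero
\Sigma_n^{-1}\,(y - \hat{x} - \mu_n) \;-\; \Sigma_x^{-1}\,(\hat{x} - \mu_x) = 0

% Closed-form solution
\hat{x} = \left( \Sigma_x^{-1} + \Sigma_n^{-1} \right)^{-1} \left( \Sigma_n^{-1}\,(y - \mu_n) + \Sigma_x^{-1}\,\mu_x \right)
```

When the noise covariance is very large relative to the clean covariance, the estimate falls back toward the clean prior mean; when it is very small, the estimate approaches y minus the noise mean.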


International Conference on Acoustics, Speech, and Signal Processing | 2011

Discriminant binary data representation for speaker recognition

Jean-François Bonastre; Pierre-Michel Bousquet; Driss Matrouf; Xavier Anguera

In the supervector UBM/GMM paradigm, each acoustic file is represented by the mean parameters of a GMM. This supervector space is used as a data representation space, which has a high dimensionality. Moreover, this space is not intrinsically discriminant, and a complete speech segment is represented by only one vector, which largely removes the possibility of taking temporal or sequential information into account. This work proposes a new approach where each acoustic frame is represented in a discriminant binary space. The proposed approach relies on a UBM to structure the acoustic space in regions. Each region is then populated with a set of Gaussian models, denoted as “specificities”, able to emphasize speaker-specific information. Each acoustic frame is mapped into the discriminant binary space, turning “on” or “off” all the specificities to create a large binary vector. All subsequent steps (speaker reference extraction, likelihood estimation and decision) take place in this binary space. Although this work is only a first step in this direction, the experiments based on the NIST SRE 2008 framework demonstrate the potential of the proposed approach. Moreover, this approach opens the opportunity to rethink all the classical processes using a discrete, binary view.


International Conference on Statistical Language and Speech Processing | 2014

Robust Speaker Recognition Using MAP Estimation of Additive Noise in i-vectors Space

Waad Ben Kheder; Driss Matrouf; Pierre-Michel Bousquet; Jean-François Bonastre; Moez Ajili

In the last few years, the use of i-vectors along with a generative back-end has become the new standard in speaker recognition. An i-vector is a compact representation of a speaker utterance extracted from a low-dimensional total variability subspace. Although current speaker recognition systems achieve very good results in clean training and test conditions, the performance degrades considerably in noisy environments. The compensation of the noise effect is currently a research subject of major importance. As far as we know, there has been no serious attempt to treat the noise problem directly in the i-vector space without relying on data distributions computed on a prior domain. This paper proposes a full-covariance Gaussian modeling of the clean i-vector and noise distributions in the i-vector space, then introduces a technique to estimate a clean i-vector given the noisy version and the noise density function using a MAP approach. Based on NIST data, we show that it is possible to improve the baseline system performance by up to 60%. A noise-adding tool is used to help simulate a real-world noisy environment at different signal-to-noise ratio levels.


Pattern Recognition Letters | 2014

Session compensation using binary speech representation for speaker recognition

Gabriel Hernández-Sierra; José R. Calvo; Jean-François Bonastre; Pierre-Michel Bousquet

We aim to present the power of a new speech representation, the Speaker Binary Key. A new variant of the within-class scatter matrix for session compensation is proposed. The covariance matrix using common attributes contains much more information. The i-vector and binary key frameworks contain complementary information.

Recently, a simple representation of a speech excerpt was proposed, as a binary matrix where each acoustic frame is represented by a binary vector. This new approach relies on the UBM paradigm but shifts the speaker recognition workspace from a continuous probabilistic space to a discrete, binary space, allowing easy access to the speaker-discriminant information. In addition to the time-related abilities of this representation, it also allows the system to work with a more compact representation based on cumulative vectors. A cumulative vector is the sum of a set of frame-based binary vectors. In this space, global information can be exploited to compensate for the effects of session variability. This work is mainly dedicated to this aspect. A new variability compensation method in the cumulative vector space is proposed in order to remove not only the unwanted attributes of session variability but also the common attributes among speakers. This is done by incorporating the information common to all classes in the projection matrix. A specificity selection approach using a mask in the cumulative vector space is also proposed. This aims to reduce the non-informative coefficients. The experimental validation, done on the NIST-SRE framework, demonstrates the efficiency of the proposed solutions, which show an EER improvement from 42% to 61%. The combination of i-vector and binary approaches, using the proposed methods, showed the complementarity of the discriminatory information exploited by each of them.
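
The cumulative vector defined in this abstract (the sum of a set of frame-based binary vectors) can be illustrated with a minimal sketch. The function name `cumulative_vector` and the toy frames are hypothetical, chosen here for illustration; this is not the authors' implementation and it leaves out how the binary vectors themselves are produced.

```python
from typing import List


def cumulative_vector(binary_frames: List[List[int]]) -> List[int]:
    """Sum frame-level binary vectors element-wise into one cumulative vector.

    Hypothetical helper: each inner list is one acoustic frame, with one
    0/1 entry per "specificity".
    """
    if not binary_frames:
        raise ValueError("at least one frame is required")
    dim = len(binary_frames[0])
    if any(len(frame) != dim for frame in binary_frames):
        raise ValueError("all frames must have the same dimension")
    if any(bit not in (0, 1) for frame in binary_frames for bit in frame):
        raise ValueError("frames must contain only 0 or 1")
    # zip(*frames) iterates over columns, i.e. one specificity at a time.
    return [sum(column) for column in zip(*binary_frames)]


# Toy example: three frames, four specificities.
frames = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
print(cumulative_vector(frames))  # [2, 1, 2, 0]
```

Each entry counts how many frames activated the corresponding specificity, so the segment is summarised by a single fixed-size vector regardless of its duration.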


Iberoamerican Congress on Pattern Recognition | 2013

Identify the Benefits of the Different Steps in an i-Vector Based Speaker Verification System

Pierre-Michel Bousquet; Jean-François Bonastre; Driss Matrouf

This paper focuses on the analysis of the i-vector paradigm, a compact representation of spoken utterances that is used by most state-of-the-art speaker verification systems. This work was mainly motivated by the need to quantify the impact of each step of these systems on the final performance, especially their ability to model data according to a theoretical Gaussian framework. These investigations highlight the key points of the approach, in particular a core conditioning procedure, that lead to the success of the i-vector paradigm.


Spoken Language Technology Workshop | 2016

Quaternion Neural Networks for Spoken Language Understanding

Titouan Parcollet; Mohamed Morchid; Pierre-Michel Bousquet; Richard Dufour; Georges Linarès; Renato De Mori

Machine Learning (ML) techniques have allowed a great performance improvement on different challenging Spoken Language Understanding (SLU) tasks. Among these methods, Neural Networks (NN), or Multilayer Perceptrons (MLP), have recently received great interest from researchers due to their capability to represent complex internal structures in a low-dimensional subspace. However, MLPs employ document representations based on basic word-level or topic-based features. Therefore, these basic representations reveal little in the way of document statistical structure, since they only consider words or topics contained in the document as a “bag-of-words”, ignoring relations between them. We propose to remedy this weakness by extending the complex features based on Quaternion algebra presented in [1] to neural networks called QMLP. This original QMLP approach is based on hyper-complex algebra to take feature dependencies in documents into consideration. New document features, based on the document structure itself and used as input of the QMLP, are also investigated in this paper, in comparison to those initially proposed in [1]. Experiments made on an SLU task from a real framework of human spoken dialogues showed that our QMLP approach associated with the proposed document features outperforms other approaches, with an accuracy gain of 2% with respect to the MLP based on real numbers and more than 3% with respect to the first Quaternion-based features proposed in [1]. We finally demonstrated that fewer iterations are needed by our QMLP architecture to be efficient and to reach promising accuracies.
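
The quaternion algebra that the QMLP builds on can be illustrated by its basic multiplication, the Hamilton product. The sketch below is a generic illustration of that product, not the authors' QMLP: the `Quaternion` type and `hamilton_product` function are hypothetical names introduced here, and how the paper maps document features onto the four components is not shown.

```python
from typing import NamedTuple


class Quaternion(NamedTuple):
    """A quaternion r + i*I + j*J + k*K (illustrative type, not from the paper)."""
    r: float
    i: float
    j: float
    k: float


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Multiply two quaternions; the product is not commutative."""
    return Quaternion(
        p.r * q.r - p.i * q.i - p.j * q.j - p.k * q.k,
        p.r * q.i + p.i * q.r + p.j * q.k - p.k * q.j,
        p.r * q.j - p.i * q.k + p.j * q.r + p.k * q.i,
        p.r * q.k + p.i * q.j - p.j * q.i + p.k * q.r,
    )


I = Quaternion(0, 1, 0, 0)
J = Quaternion(0, 0, 1, 0)

print(hamilton_product(I, J))  # Quaternion(r=0, i=0, j=0, k=1), i.e. I*J = K
print(hamilton_product(J, I))  # Quaternion(r=0, i=0, j=0, k=-1), i.e. J*I = -K
```

Every output component depends on all four components of both operands, which is one way a quaternion-valued layer can couple the features grouped into a single quaternion instead of treating them independently.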


Computer Speech & Language | 2017

Fast i-vector denoising using MAP estimation and a noise distributions database for robust speaker recognition

Waad Ben Kheder; Driss Matrouf; Pierre-Michel Bousquet; Jean-François Bonastre; Moez Ajili

We use a normal distribution model for both clean and noisy i-vectors. We use an additive model of the noise in the i-vector space. We use a MAP estimator to clean up noisy i-vectors based on both the clean i-vector and noise distributions in the i-vector space.

Since the i-vector paradigm was introduced in the field of speaker recognition, many techniques have been proposed to deal with additive noise within this framework. Due to the complexity of its effect in the i-vector space, a lot of effort has been put into dealing with noise in other domains (speech enhancement, feature compensation, robust i-vector extraction and robust scoring). As far as we know, there has been no serious attempt to handle the noise problem directly in the i-vector space without relying on data distributions computed on a prior domain. The aim of this paper is twofold. First, it proposes a full-covariance Gaussian modeling of the clean i-vector and noise distributions in the i-vector space and introduces a technique to estimate a clean i-vector given the noisy version and the noise density function using the MAP approach. Based on NIST data, we show that it is possible to improve the baseline system performance by up to 60%. Second, in order to make this algorithm usable in a real application and reduce the computational time needed by i-MAP, we propose an extension that requires building a noise distribution database in the i-vector space in an off-line step and using it later in the test phase. We show that it is possible to achieve comparable results using this approach (up to 57% relative EER improvement) with a sufficiently large noise distribution database.

Collaboration


Dive into Pierre-Michel Bousquet's collaborations.

Top Co-Authors

Jean-François Bonastre

Institut Universitaire de France
