Mohammed Senoussaoui
École de technologie supérieure
Publications
Featured research published by Mohammed Senoussaoui.
IEEE Transactions on Audio, Speech, and Language Processing | 2014
Mohammed Senoussaoui; Patrick Kenny; Themos Stafylakis; Pierre Dumouchel
Speaker clustering is a crucial step in speaker diarization. The short duration of speech segments in telephone dialogue and the absence of prior information on the number of clusters dramatically increase the difficulty of diarizing spontaneous telephone conversations. We propose a simple iterative Mean Shift algorithm based on the cosine distance to perform speaker clustering under these conditions. Two variants of the cosine-distance Mean Shift are compared in an exhaustive practical study. We report state-of-the-art results, as measured by the Diarization Error Rate and the Number of Detected Speakers, on the LDC CallHome telephone corpus.
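The cosine-distance Mean Shift procedure described above can be sketched as follows. This is a minimal illustration, not the authors' exact implementation: it assumes a flat (threshold) kernel over cosine similarity, and the `threshold`, `tol`, and mode-merging cutoff are all illustrative choices.

```python
import numpy as np

def cosine_mean_shift(X, threshold=0.5, max_iter=50, tol=1e-6):
    """Iterative Mean Shift clustering under the cosine distance (sketch).

    Each point's mode is repeatedly shifted to the length-normalized mean
    of all points whose cosine similarity to it exceeds `threshold`.
    """
    # Length-normalize so cosine similarity reduces to a dot product.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    modes = X.copy()
    for _ in range(max_iter):
        sims = modes @ X.T                           # cosine similarities
        weights = (sims > threshold).astype(float)   # flat-kernel window
        new_modes = weights @ X
        new_modes /= np.linalg.norm(new_modes, axis=1, keepdims=True)
        shift = np.abs(new_modes - modes).max()
        modes = new_modes
        if shift < tol:
            break
    # Merge modes that converged to (almost) the same point -> cluster labels.
    labels, centers = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if m @ c > 0.999:          # illustrative merge cutoff
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return np.array(labels)
```

Note that, unlike k-means, the number of clusters (here, speakers) is not specified in advance; it emerges from how many distinct modes survive the merge step, which is exactly the property the paper exploits when the number of speakers is unknown.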
international conference on acoustics, speech, and signal processing | 2013
Mohammed Senoussaoui; Patrick Kenny; Pierre Dumouchel; Themos Stafylakis
Speaker clustering is an important task in many applications, such as speaker diarization and speech recognition. It can be performed within a single multi-speaker recording (diarization) or across a set of different recordings. In this work we are interested in the former case, and we propose a simple iterative Mean Shift (MS) algorithm to deal with this problem. Traditionally, the MS algorithm is based on the Euclidean distance; we propose to use the cosine distance instead, yielding a new version of the MS algorithm. We report results, as measured by speaker and cluster impurities, on the NIST SRE 2008 datasets.
international conference on acoustics, speech, and signal processing | 2011
Mohammed Senoussaoui; Patrick Kenny; Pierre Dumouchel; Fabio Castaldo
The work presented in this paper is an extension of our two previous works [1, 2]. In the first paper [1], we proposed a low-dimensional feature (i-vector) extractor suitable for both the telephone and microphone data of the NIST speaker recognition evaluation dataset. The second paper [2] introduced the use of the Probabilistic Linear Discriminant Analysis (PLDA) framework with a heavy-tailed distribution for speaker verification. The advantage of PLDA is that it requires neither eigenchannel modeling nor score normalization. However, this approach is known for its success on telephone speech data but not on microphone data. We propose to overcome this drawback by using PLDA both as a second pass in the front-end feature extraction and as a classifier. We present results on female speakers for the interview-interview condition of the NIST 2010 SRE. As measured by the equal error rate (EER) and the NIST detection cost function (DCF), results with raw scores are 17% better than with score normalization. We have also calibrated our scores and achieve a minimum and an actual DCF of 0.559 and 0.607, respectively.
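To give a concrete sense of PLDA as a classifier for i-vectors, here is a sketch of the two-covariance Gaussian PLDA verification score, i.e. the log-likelihood ratio between the same-speaker and different-speaker hypotheses for a pair of i-vectors. This is a simplified textbook formulation assuming known between- and within-speaker covariances `B` and `W`, not the heavy-tailed variant or the two-pass front end from the paper.

```python
import numpy as np

def _log_gauss(x, cov):
    # log N(x; 0, cov), computed via slogdet/solve for numerical stability
    d = len(x)
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)

def plda_llr(w1, w2, mu, B, W):
    """Two-covariance PLDA log-likelihood ratio (sketch).

    Model: w = mu + y + eps, with y ~ N(0, B) shared within a speaker
    and eps ~ N(0, W) per recording. A positive score favors the
    same-speaker hypothesis.
    """
    d = len(mu)
    x = np.concatenate([w1 - mu, w2 - mu])
    T = B + W                      # total covariance of a single i-vector
    Z = np.zeros((d, d))
    cov_same = np.block([[T, B], [B, T]])   # shared y correlates the pair
    cov_diff = np.block([[T, Z], [Z, T]])   # independent speakers
    return _log_gauss(x, cov_same) - _log_gauss(x, cov_diff)
```

The appeal noted in the abstract follows from this form: channel variability is absorbed into `W`, so no separate eigenchannel model is needed, and the score is already a calibrated-style likelihood ratio, reducing the need for score normalization.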
conference of the international speech communication association | 2016
Mohammed Senoussaoui; Patrick Cardinal; Najim Dehak; Alessandro L. Koerich
Native-language identification is the task of determining a speaker's native language based only on their speech in a second language. In this paper we propose the use of the well-known i-vector representation of the speech signal to detect the native language of an English speaker. The i-vector representation has shown excellent performance on the closely related task of distinguishing between different languages. We have evaluated different ways of extracting i-vectors in order to adapt them to the specificities of the native-language detection task. Experimental results on the 2016 ComParE Native Language sub-challenge test set show that the proposed system, based on a conventional i-vector extractor, outperforms the baseline system with a 42% relative improvement.
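Once i-vectors are extracted, the identification back end can be as simple as scoring each test i-vector against per-language mean i-vectors under the cosine similarity. The sketch below illustrates that idea on precomputed vectors; the label names and the nearest-class-mean back end are illustrative assumptions, not the paper's actual classifier.

```python
import numpy as np

def train_language_means(ivectors, labels):
    """Length-normalized mean i-vector per native-language class (sketch)."""
    labels = np.asarray(labels)
    means = {}
    for c in sorted(set(labels)):
        m = ivectors[labels == c].mean(axis=0)
        means[c] = m / np.linalg.norm(m)
    return means

def classify(ivec, means):
    """Assign the class whose mean i-vector has the highest cosine similarity."""
    v = ivec / np.linalg.norm(ivec)
    return max(means, key=lambda c: v @ means[c])
```

In practice a discriminative back end (e.g. logistic regression or an SVM on the i-vectors) would typically replace the nearest-mean rule, but the pipeline shape (fixed-length i-vector, then a lightweight classifier) is the same.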
international conference on acoustics, speech, and signal processing | 2017
Mohammed Senoussaoui; João Felipe Santos; Tiago H. Falk
Reverberation and noise are known to be the two main culprits behind poor performance in far-field speech applications, such as automatic speech recognition. Recent research has suggested that reverberation-aware speech enhancement (or speech technologies in general) could be used to improve performance. However, recent results also show that existing blind room-acoustics characterization algorithms are not robust under ambient noise, and there is still room for improvement in such settings. In this paper, several fusion approaches are proposed for noise-robust reverberation time estimation. More specifically, feature- and score-level fusion of short- and long-term speech temporal dynamics features are proposed. With noise-aware feature-level fusion, improvements of up to 15.4% in root mean square error were observed. Score-level fusion, in turn, showed further improvements of up to 9.8%. Relative to a recently proposed noise-robust benchmark algorithm, improvements of 30% were observed, thus showing the advantages of speech temporal dynamics fusion approaches for noise-robust reverberation time estimation.
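The two fusion strategies compared above differ only in where the combination happens: feature-level fusion concatenates the short- and long-term feature sets before a single estimator is trained, while score-level fusion combines the predictions of separately trained estimators. A minimal sketch, with the fusion weight `w` and the RMSE metric as illustrative choices:

```python
import numpy as np

def feature_level_fusion(F_short, F_long):
    """Concatenate short- and long-term features per utterance (early fusion)."""
    return np.hstack([F_short, F_long])

def score_level_fusion(pred_a, pred_b, w=0.5):
    """Weighted average of two estimators' reverberation-time predictions
    (late fusion); w would normally be tuned on held-out data."""
    return w * pred_a + (1 - w) * pred_b

def rmse(y_true, y_pred):
    """Root mean square error, the evaluation metric used in the abstract."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))
```

Score-level fusion helps whenever the two estimators make partially uncorrelated errors: averaging cancels part of each estimator's error, which is consistent with the further gains reported on top of feature-level fusion.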
Odyssey | 2010
Mohammed Senoussaoui; Patrick Kenny; Najim Dehak; Pierre Dumouchel
conference of the international speech communication association | 2011
Mohammed Senoussaoui; Patrick Kenny; Niko Brümmer; Edward de Villiers; Pierre Dumouchel
Odyssey | 2012
Themos Stafylakis; Patrick Kenny; Mohammed Senoussaoui; Pierre Dumouchel
Odyssey | 2012
Mohammed Senoussaoui; Najim Dehak; Patrick Kenny; Réda Dehak; Pierre Dumouchel
conference of the international speech communication association | 2012
Themos Stafylakis; Patrick Kenny; Mohammed Senoussaoui; Pierre Dumouchel