Mohammed Senoussaoui | Researchain

Publication

Featured research published by Mohammed Senoussaoui.

IEEE Transactions on Audio, Speech, and Language Processing | 2014

A Study of the Cosine Distance-Based Mean Shift for Telephone Speech Diarization

Mohammed Senoussaoui; Patrick Kenny; Themos Stafylakis; Pierre Dumouchel

Speaker clustering is a crucial step for speaker diarization. The short duration of speech segments in telephone dialogue and the absence of prior information on the number of clusters dramatically increase the difficulty of this problem when diarizing spontaneous telephone conversations. We propose a simple iterative Mean Shift algorithm based on the cosine distance to perform speaker clustering under these conditions. Two variants of the cosine distance Mean Shift are compared in an exhaustive practical study. We report state-of-the-art results as measured by the Diarization Error Rate and the Number of Detected Speakers on the LDC CallHome telephone corpus.
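The idea of a Mean Shift driven by cosine distance can be illustrated with a minimal sketch. It is not the authors' implementation: the flat kernel, the `bandwidth` value, the mode-merging step, and all function names are assumptions chosen for illustration, and the toy 2-D vectors stand in for real segment representations.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_distance(a, b):
    """Cosine distance between two unit-length vectors."""
    return 1.0 - sum(x * y for x, y in zip(a, b))

def cosine_mean_shift(points, bandwidth=0.3, max_iter=50, tol=1e-9):
    """Flat-kernel mean shift using cosine distance (illustrative sketch).

    Returns one cluster label per input point and the list of cluster modes.
    The number of clusters is not given in advance; it falls out of the modes.
    """
    data = [normalize(p) for p in points]
    modes = []
    for start in data:
        mode = start
        for _ in range(max_iter):
            # Neighbors are all points within `bandwidth` of the current mode.
            neighbors = [p for p in data if cosine_distance(mode, p) <= bandwidth]
            if not neighbors:  # defensive guard against rounding effects
                break
            mean = [sum(col) / len(neighbors) for col in zip(*neighbors)]
            new_mode = normalize(mean)  # project the mean back onto the sphere
            shift = cosine_distance(mode, new_mode)
            mode = new_mode
            if shift < tol:
                break
        modes.append(mode)

    # Assumed post-processing: modes closer than the bandwidth share a cluster.
    centers, labels = [], []
    for mode in modes:
        for idx, center in enumerate(centers):
            if cosine_distance(mode, center) <= bandwidth:
                labels.append(idx)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return labels, centers

# Toy data: two groups of directions, one near the x-axis, one near the y-axis.
toy_points = [[1.0, 0.05], [1.0, -0.05], [0.9, 0.0],
              [0.05, 1.0], [-0.05, 1.0], [0.0, 2.0]]
toy_labels, toy_centers = cosine_mean_shift(toy_points)
```

Because only the angle between vectors matters, `[0.0, 2.0]` lands in the same cluster as the other near-vertical points despite its larger norm.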

International Conference on Acoustics, Speech, and Signal Processing | 2013

Efficient iterative mean shift based cosine dissimilarity for multi-recording speaker clustering

Mohammed Senoussaoui; Patrick Kenny; Pierre Dumouchel; Themos Stafylakis

Speaker clustering is an important task in many applications, such as speaker diarization and speech recognition. It can be performed within a single multi-speaker recording (diarization) or across a set of different recordings. In this work we are interested in the former case and propose a simple iterative Mean Shift (MS) algorithm to deal with this problem. Traditionally, the MS algorithm is based on the Euclidean distance. We propose using the cosine distance instead to build a new version of the MS algorithm. We report results as measured by speaker and cluster impurities on NIST SRE 2008 datasets.

International Conference on Acoustics, Speech, and Signal Processing | 2011

Well-calibrated heavy tailed Bayesian speaker verification for microphone speech

Mohammed Senoussaoui; Patrick Kenny; Pierre Dumouchel; Fabio Castaldo

The work presented in this paper extends our two previous works [1, 2]. In the first paper [1], we proposed a low-dimensional feature (i-vector) extractor that is suitable for both telephone and microphone data from the NIST speaker recognition evaluation dataset. The second paper [2] introduced the use of a Probabilistic Linear Discriminant Analysis (PLDA) framework with a heavy-tailed distribution for speaker verification. The advantage of PLDA is that it requires neither eigenchannel modeling nor score normalization. However, this approach is known to succeed only on telephone speech data, not on microphone data. We propose to overcome this drawback by using PLDA as a second pass in the front-end feature extraction as well as a classifier. We present results on female speakers for the interview-interview condition in the NIST 2010 SRE. As measured by equal error rate (EER) and the NIST detection cost function (DCF), results with raw scores are 17% better than with score normalization. We have also calibrated our scores, achieving a minimum DCF of 0.559 and an actual DCF of 0.607.
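For readers unfamiliar with the equal error rate used in the abstract above, here is a minimal sketch of how it can be estimated from verification scores. The threshold sweep, the function name, and the toy scores are assumptions for illustration; they are not taken from the paper, and evaluation toolkits typically interpolate the error curves rather than pick the nearest threshold.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where the miss rate and false-alarm rate are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss rate: target trials wrongly rejected.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: non-target trials wrongly accepted.
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer = gap, (miss + false_alarm) / 2
    return eer

# Hypothetical scores: higher means "more likely the same speaker".
toy_targets = [2.0, 1.5, 0.2, 1.8]
toy_nontargets = [-1.0, 0.5, -0.3, 0.1]
toy_eer = equal_error_rate(toy_targets, toy_nontargets)
```

With these toy scores, one target falls below the best threshold and one non-target falls above it, so both error rates equal 0.25.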

Conference of the International Speech Communication Association | 2016

Native language detection using the i-vector framework

Mohammed Senoussaoui; Patrick Cardinal; Najim Dehak; Alessandro L. Koerich

Native-language identification is the task of determining a speaker’s native language based only on their speech in a second language. In this paper we propose using the well-known i-vector representation of the speech signal to detect the native language of an English speaker. The i-vector representation has shown excellent performance on the closely related task of distinguishing between different languages. We evaluated different ways of extracting i-vectors in order to adapt them to the specifics of the native language detection task. Experimental results on the 2016 ComParE Native Language Sub-Challenge test set show that the proposed system, based on a conventional i-vector extractor, outperforms the baseline system with a 42% relative improvement.

International Conference on Acoustics, Speech, and Signal Processing | 2017

Speech temporal dynamics fusion approaches for noise-robust reverberation time estimation

Mohammed Senoussaoui; João Felipe Santos; Tiago H. Falk

Reverberation and noise are known to be the two most important culprits for poor performance in far-field speech applications, such as automatic speech recognition. Recent research has suggested that reverberation-aware speech enhancement (or speech technologies, in general) could be used to improve performance. However, recent results also show that existing blind room acoustics characterization algorithms are not robust under ambient noise, and there is still room for improvement under such settings. In this paper, several fusion approaches are proposed for noise-robust reverberation time estimation. More specifically, feature- and score-level fusion of short- and long-term speech temporal dynamics features are proposed. With noise-aware feature-level fusion, gains of up to 15.4% could be seen in root mean square error. Score-level fusion, in turn, showed further improvements of up to 9.8%. Relative to a recently proposed noise-robust benchmark algorithm, improvements of 30% could be seen, thus showing the advantages of speech temporal dynamics fusion approaches for noise-robust reverberation time estimation.
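The quantities mentioned in this abstract (score-level fusion, root mean square error, relative improvement) can be made concrete with a small sketch. The equal-weight average used for fusion, the function names, and the toy reverberation-time values are all assumptions for illustration; the paper's actual fusion scheme is not described in this listing.

```python
import math

def rmse(estimates, targets):
    """Root mean square error between estimated and reference values."""
    squared = [(e - t) ** 2 for e, t in zip(estimates, targets)]
    return math.sqrt(sum(squared) / len(squared))

def fuse_scores(short_term, long_term, weight=0.5):
    """Score-level fusion as a weighted average of two estimators' outputs.

    The fixed weight is an assumption; a real system might learn it.
    """
    return [weight * s + (1.0 - weight) * l for s, l in zip(short_term, long_term)]

def relative_improvement(baseline_error, new_error):
    """Percentage reduction of the error relative to a baseline."""
    return 100.0 * (baseline_error - new_error) / baseline_error

# Hypothetical reverberation times in seconds (reference and two estimators).
toy_reference = [0.4, 0.6, 0.8]
toy_short_term = [0.5, 0.7, 0.9]     # overestimates by 0.10 s
toy_long_term = [0.35, 0.55, 0.75]   # underestimates by 0.05 s

toy_fused = fuse_scores(toy_short_term, toy_long_term)
toy_gain = relative_improvement(rmse(toy_short_term, toy_reference),
                                rmse(toy_fused, toy_reference))
```

In this toy case the two estimators err in opposite directions, so averaging them cancels part of the error, which is the usual motivation for fusing complementary estimators.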

Odyssey | 2010