Jean-Luc Rouas
Centre national de la recherche scientifique
Publications
Featured research published by Jean-Luc Rouas.
Speech Communication | 2005
Jean-Luc Rouas; Jérôme Farinas; François Pellegrino; Régine André-Obrecht
This paper deals with an approach to Automatic Language Identification based on rhythmic modelling. Besides phonetics and phonotactics, rhythm is one of the most promising features to consider for language identification, even if its extraction and modelling are not straightforward. Indeed, one of the main problems to address is what to model. In this paper, an algorithm for rhythm extraction is described: using a vowel detection algorithm, rhythmic units related to syllables are segmented. Several parameters are extracted (consonantal and vowel duration, cluster complexity) and modelled with a Gaussian mixture. Experiments are performed on read speech for 7 languages (English, French, German, Italian, Japanese, Mandarin and Spanish), and results reach up to 86 ± 6% correct discrimination between stress-timed, mora-timed and syllable-timed classes of languages, and 67 ± 8% correct language identification on average for the 7 languages with utterances of 21 seconds. These results are discussed and compared with those obtained with a standard acoustic Gaussian mixture modelling approach (88 ± 5% correct identification for the 7-language identification task).
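The core idea (Gaussian modelling of rhythmic-unit parameters, then maximum-likelihood identification) can be sketched as follows. This is a minimal illustration, not the paper's implementation: it fits a single diagonal Gaussian per class instead of a full Gaussian mixture, and the feature values are invented toy data.

```python
import math

def fit_gaussian(samples):
    # Fit a diagonal Gaussian (one component, a simplification of the
    # paper's Gaussian mixture) to rhythmic feature vectors.
    n, d = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(d)]
    var = [max(sum((s[i] - mean[i]) ** 2 for s in samples) / n, 1e-6)
           for i in range(d)]
    return mean, var

def log_likelihood(x, model):
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def identify(x, models):
    # Pick the rhythm class whose model gives the highest likelihood.
    return max(models, key=lambda name: log_likelihood(x, models[name]))

# Toy rhythmic units: (consonant duration, vowel duration, cluster complexity)
models = {
    "syllable-timed": fit_gaussian([(0.08, 0.09, 1.0), (0.09, 0.10, 1.2)]),
    "stress-timed":   fit_gaussian([(0.14, 0.06, 2.4), (0.15, 0.07, 2.6)]),
}
print(identify((0.15, 0.06, 2.5), models))  # stress-timed
```

In the paper each language's rhythmic units feed a multi-component mixture; the decision rule over class likelihoods is the same.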
international conference on acoustics, speech, and signal processing | 2003
Julien Pinquier; Jean-Luc Rouas; Régine André-Obrecht
In this paper, we present and merge two speech/music classification approaches that we have developed. The first is a differentiated modeling approach based on spectral analysis, implemented with GMMs. The other is based on three original features: entropy modulation, stationary segment duration and number of segments. These are merged with the classical 4 Hz modulation energy. Our classification system is a fusion of the two approaches. It is divided into two classifications (speech/non-speech and music/non-music) and provides 94% accuracy for speech detection and 90% for music detection, with one second of input signal. Besides the spectral information and GMMs classically used in speech/music discrimination, these simple parameters bring complementary and efficient information.
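The two-stage decision scheme described above can be sketched as two independent binary classifiers whose outputs combine into a four-way label. The scores and threshold below are hypothetical placeholders, not values from the paper.

```python
def classify(speech_score, music_score, threshold=0.5):
    # Two independent binary decisions (speech/non-speech, music/non-music)
    # combined into a four-way label, mirroring the paper's two-stage system.
    is_speech = speech_score >= threshold
    is_music = music_score >= threshold
    if is_speech and is_music:
        return "speech+music"
    if is_speech:
        return "speech"
    if is_music:
        return "music"
    return "other"

print(classify(0.9, 0.2))  # speech
print(classify(0.7, 0.8))  # speech+music
```

Running the two detectors independently rather than as one four-class classifier lets each be tuned (features, models) for its own task, which is the design choice the abstract describes.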
international conference on acoustics, speech, and signal processing | 2003
Jean-Luc Rouas; Jérôme Farinas; François Pellegrino; Régine André-Obrecht
This paper deals with an approach to Automatic Language Identification using only prosodic modeling. Current approaches to language identification focus mainly on phonotactics because it gives the best results. We propose here to evaluate the relevance of prosodic information for language identification with read studio recordings (previous experiment [1]) and spontaneous telephone speech. For read speech, experiments were performed on the five languages of the MULTEXT database [2]. On the MULTEXT corpus, our prosodic system achieved an identification rate of 79% on the five-language discrimination task. For spontaneous speech, experiments are made on the ten languages of the OGI Multilingual telephone speech corpus [3]. On the OGI MLTS corpus, the results are given for language-pair discrimination tasks and are compared with results from [4]. In conclusion, while our prosodic system achieves good performance on read speech, it may not capture the complexity of spontaneous speech prosody.
international conference on acoustics, speech, and signal processing | 2002
Jérôme Farinas; François Pellegrino; Jean-Luc Rouas; Régine André-Obrecht
This paper deals with an approach to Automatic Language Identification based on rhythmic modeling and vowel system modeling. Experiments are performed on read speech for 5 European languages. They show that rhythm and stress may be automatically extracted and are relevant for language identification: using cross-validation, 78% correct identification is reached with 21 s utterances. The vowel system modeling, tested in the same conditions (cross-validation), is efficient and yields 70% correct identification for the 21 s utterances. Lastly, merging the two models slightly improves the results.
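Merging two models is commonly done by combining their per-language log-likelihood scores; a minimal sketch under that assumption (the weight and the score values below are illustrative, not the paper's):

```python
def merge_models(rhythm_ll, vowel_ll, alpha=0.5):
    # Hypothetical linear fusion of the rhythm model and vowel system model
    # log-likelihoods; alpha balances the two streams.
    return {lang: alpha * rhythm_ll[lang] + (1 - alpha) * vowel_ll[lang]
            for lang in rhythm_ll}

# Toy per-language log-likelihoods from the two models
rhythm = {"fr": -12.0, "de": -10.5}
vowel = {"fr": -9.0, "de": -11.0}
merged = merge_models(rhythm, vowel)
print(max(merged, key=merged.get))  # fr
```

Here the vowel system model tips the decision toward "fr" even though the rhythm model alone preferred "de", which is how fusion of complementary models can improve over either one.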
ieee international conference on automatic face gesture recognition | 2015
Zuheng Ming; Aurélie Bugeau; Jean-Luc Rouas; Takaaki Shochi
Automatic facial expression recognition has developed over the past two decades. Recognition of posed facial expressions and detection of Action Units (AUs) have already made great progress. More recently, automatic estimation of the variation of facial expression, either in terms of AU intensities or in terms of the values of dimensional emotions, has emerged in the field of facial expression analysis. However, discriminating different intensities of AUs is a far more challenging task than AU detection due to several intractable problems. Aiming to continue standardized evaluation procedures and push the limits of current research, the second Facial Expression Recognition and Analysis challenge (FERA2015) was organized. In this context, we propose a method that fuses different appearance and geometry features in a multi-kernel Support Vector Machine (SVM) for the automatic estimation of AU intensities. Our approach, which benefits from adapting the different features to a multi-kernel SVM, is shown to outperform conventional methods based on a single feature type with a single-kernel SVM.
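The multi-kernel idea is a weighted combination of kernels, each computed on its own feature stream. A minimal sketch, assuming one RBF kernel on appearance features and one linear kernel on geometry features; the weight `beta` and `gamma` values are illustrative (in multi-kernel SVM training they would be learned):

```python
import math

def rbf_kernel(x, y, gamma=0.1):
    # RBF kernel, a typical choice for appearance (e.g. texture) features.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def linear_kernel(x, y):
    # Linear kernel, a typical choice for geometry (e.g. landmark) features.
    return sum(a * b for a, b in zip(x, y))

def multi_kernel(x_app, y_app, x_geo, y_geo, beta=0.6):
    # Weighted sum of per-feature-stream kernels; the combined kernel is
    # still positive semi-definite, so a standard SVM solver can use it.
    return beta * rbf_kernel(x_app, y_app) + (1 - beta) * linear_kernel(x_geo, y_geo)

print(multi_kernel([1.0, 0.0], [1.0, 0.0], [1.0, 2.0], [1.0, 2.0]))  # 2.6
```

Since a convex combination of valid kernels is itself a valid kernel, the fusion happens inside the SVM rather than by concatenating heterogeneous features into one vector.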
international conference on pattern recognition | 2004
Jorge Gutiérrez; Jean-Luc Rouas; Régine André-Obrecht
Making a pattern recognition decision with the maximum-likelihood rule is a particular case of the risk-based Bayesian decision rule, obtained when the loss function is symmetrical zero-one and the classes are equally probable a priori. When the recognition system is composed of several experts, we can take their estimated performance at the class level into account as a key heuristic factor that weights the loss function and drives the recognition process while fusing their decisions. These indices are formally computed by applying the discriminant factor analysis method. The experiments are carried out in the automatic language identification domain with a system composed of several identification experts. Fusion of expert decisions is achieved by building statistical classifiers.
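The relationship between the Bayes risk rule and the maximum-likelihood special case can be shown in a few lines. The posteriors and loss values below are invented for illustration; the point is that a zero-one loss recovers the MAP decision, while a loss weighted by per-class expert reliability can change it.

```python
def bayes_decision(posteriors, loss):
    # Choose the class c minimizing expected risk R(c) = sum_k loss[c][k] P(k|x).
    classes = list(posteriors)
    def risk(c):
        return sum(loss[c][k] * posteriors[k] for k in classes)
    return min(classes, key=risk)

posteriors = {"en": 0.55, "fr": 0.45}

# Symmetrical zero-one loss: the risk rule reduces to picking the max posterior.
zero_one = {"en": {"en": 0.0, "fr": 1.0}, "fr": {"en": 1.0, "fr": 0.0}}
print(bayes_decision(posteriors, zero_one))  # en

# Weighting the loss by (hypothetical) expert reliability can flip the decision:
# here mistaking "fr" for "en" is deemed less costly because this expert is
# known to over-report "en".
weighted = {"en": {"en": 0.0, "fr": 1.0}, "fr": {"en": 0.6, "fr": 0.0}}
print(bayes_decision(posteriors, weighted))  # fr
```

This is the mechanism the abstract describes: the per-class performance indices enter the decision only through the loss matrix, leaving the Bayesian machinery unchanged.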
asian conference on intelligent information and database systems | 2015
Toyoaki Nishida; Masakazu Abe; Takashi Ookaki; Divesh Lala; Sutasinee Thovuttikul; Hengjie Song; Yasser F. O. Mohammad; Christian Nitschke; Yoshimasa Ohmoto; Atsushi Nakazawa; Takaaki Shochi; Jean-Luc Rouas; Aurélie Bugeau; Fabien Lotte; Ming Zuheng; Geoffrey Letournel; Marine Guerry; Dominique Fourer
Synthetic evidential study (SES) is a novel approach to understanding and augmenting the collective thought process through substantiation by interactive media. It consists of a role-playing game by participants, projection of the resulting play into a shared virtual space, critical discussion with mediated role-play, and componentization for reuse. We present the conceptual framework of SES, initial findings from an SES workshop, supporting technologies for SES, potential applications of SES, and future challenges.
ieee international conference semantic computing | 2012
Ioannidis Leonidas; Jean-Luc Rouas
In this paper we propose a method for singing voice detection in popular music recordings. The method is based on statistical learning of spectral features extracted from the audio tracks. We use Mel Frequency Cepstrum Coefficients (MFCCs) to train two Gaussian Mixture Models (GMMs). Special attention is given to our novel approach for smoothing the errors produced by the automatic classification by exploiting the semantic content of the songs, which significantly boosts the overall performance of the system.
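Frame-level GMM decisions are typically noisy, and smoothing them over time corrects isolated errors. A minimal sketch of one common smoothing strategy, majority voting over a sliding window (the paper's actual smoothing exploits song structure, which this toy version does not):

```python
def smooth(frame_labels, window=5):
    # Majority vote over a sliding window of frame decisions (1 = vocal,
    # 0 = non-vocal): isolated classifier errors inside a region flip back.
    half = window // 2
    out = []
    for i in range(len(frame_labels)):
        seg = frame_labels[max(0, i - half): i + half + 1]
        out.append(1 if sum(seg) * 2 > len(seg) else 0)
    return out

raw = [1, 1, 0, 1, 1, 0, 0, 0, 1, 0]  # noisy frame-level decisions
print(smooth(raw))  # [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
```

The lone 0 inside the vocal region and the lone 1 inside the non-vocal region are both corrected, which is the effect the abstract attributes to its semantic smoothing step.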
content based multimedia indexing | 2015
Alejandro Ramírez; Jenny Benois-Pineau; Mireya Saraí García Vázquez; Andrei Stoian; Michel Crucianu; Mariko Nakano; Francisco J. García-Ugalde; Jean-Luc Rouas; Henri Nicolas; Jean Carrive
In this paper we present the Mex-Culture Multimedia platform, the first prototype of multimedia indexing and retrieval for large-scale access to digitized Mexican cultural audio-visual content. The platform is designed as an open and extensible architecture of Web services. The different architectural layers and media services are presented, supporting a rich set of scenarios. These comprise summarization of audio-visual content in cross-media description spaces, video queries by actions, key-frame and image queries by example, and audio analysis services. Specific attention is paid to the selection of data representative of Mexican cultural content. Scalability issues are addressed as well.
conference of the international speech communication association | 2002
Julien Pinquier; Jean-Luc Rouas; Régine André-Obrecht