Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Fabien Ringeval is active.

Publication


Featured research published by Fabien Ringeval.


Pattern Recognition Letters | 2015

Prediction of asynchronous dimensional emotion ratings from audiovisual and physiological data

Fabien Ringeval; Florian Eyben; Eleni Kroupi; Anıl Yüce; Jean-Philippe Thiran; Touradj Ebrahimi; Denis Lalanne; Björn W. Schuller

We study the relevance of context learning for handling asynchrony of annotation. We unite audiovisual and physiological data for continuous affect analysis. We propose multi-time-resolution feature extraction from multimodal data. The use of context learning allows the reaction-time delay of raters to be taken into account. Fusion of audiovisual and physiological data performs best on arousal and valence. Automatic emotion recognition systems based on supervised machine learning require reliable annotation of affective behaviours to build useful models. Whereas the dimensional approach is becoming more and more popular for rating affective behaviours in continuous time domains, e.g., arousal and valence, methodologies that take into account the reaction lags of human raters are still rare. We therefore investigate the relevance of machine learning algorithms able to integrate contextual information in the modelling, as long short-term memory recurrent neural networks do, to automatically predict emotion from several (asynchronous) raters in continuous time domains, i.e., arousal and valence. Evaluations are performed on the recently proposed RECOLA multimodal database (27 subjects, 5 min of data and six raters for each), which includes audio, video, and physiological (ECG, EDA) data. In fact, studies uniting audiovisual and physiological information are still very rare. Features are extracted with various window sizes for each modality, and performance for automatic emotion prediction is compared across different neural network architectures and fusion approaches (feature-level/decision-level). The results show that: (i) LSTM networks can deal with the (asynchronous) dependencies found between continuous emotion ratings and video data, (ii) predicting emotional valence requires a longer analysis window than arousal and (iii) decision-level fusion leads to better performance than feature-level fusion. The best performance (concordance correlation coefficient) for multimodal emotion prediction is 0.804 for arousal and 0.528 for valence.
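The figure of merit quoted above, the concordance correlation coefficient (CCC), penalises both scale and offset errors between a predicted trace and the gold-standard trace. A minimal numpy sketch (the helper name concordance_cc and the toy traces are ours, not from the paper):

```python
import numpy as np

def concordance_cc(gold: np.ndarray, pred: np.ndarray) -> float:
    """Concordance correlation coefficient between a gold-standard rating
    trace and a predicted trace (both 1-D arrays of the same length)."""
    gold_mean, pred_mean = gold.mean(), pred.mean()
    covariance = np.mean((gold - gold_mean) * (pred - pred_mean))
    return (2 * covariance /
            (gold.var() + pred.var() + (gold_mean - pred_mean) ** 2))

# Toy example: a constant annotation bias is penalised, unlike Pearson's r.
gold = np.sin(np.linspace(0, 10, 500))   # e.g. an arousal trace
pred = gold + 0.3                        # perfectly correlated but offset
print(concordance_cc(gold, pred))        # < 1.0 despite r == 1.0
```

This sensitivity to bias is why the CCC, rather than Pearson's correlation, is commonly reported for continuous affect prediction.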


Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction | 2008

Exploiting a Vowel Based Approach for Acted Emotion Recognition

Fabien Ringeval; Mohamed Chetouani

This paper is devoted to the description of a new approach for emotion recognition. Our contribution is based on both the extraction and the characterization of phonemic units such as vowels and consonants, which are provided by a pseudo-phonetic speech segmentation phase combined with a vowel detector. For the emotion recognition task, we explore acoustic and prosodic features from these pseudo-phonetic segments (vowels and consonants), and we compare this approach with traditional voiced and unvoiced segments. The classification is performed with the well-known k-NN classifier (k-nearest neighbors) on two different emotional speech databases: Berlin (German) and Aholab (Basque).
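As a rough illustration of the classification stage only, the sketch below runs scikit-learn's k-NN classifier with cross-validation on a placeholder matrix of segment-level features; the feature contents, label set and sizes are hypothetical stand-ins, not the descriptors or databases used in the paper.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical segment-level feature matrix: one row per detected vowel or
# consonant segment, columns standing in for acoustic/prosodic descriptors
# (e.g. spectral statistics, F0 statistics, duration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))        # placeholder features
y = rng.integers(0, 6, size=200)      # placeholder emotion classes

knn = KNeighborsClassifier(n_neighbors=5)   # the k-NN classifier named above
scores = cross_val_score(knn, X, y, cv=5)
print("mean cross-validated accuracy:", scores.mean())
```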


Cognitive Computation | 2009

Time-Scale Feature Extractions for Emotional Speech Characterization

Mohamed Chetouani; Ammar Mahdhaoui; Fabien Ringeval

Emotional speech characterization is an important issue for the understanding of interaction. This article discusses the time-scale analysis problem in feature extraction for emotional speech processing. We describe a computational framework for combining segmental and supra-segmental features for emotional speech detection. The statistical fusion is based on the estimation of local a posteriori class probabilities, and the overall decision employs weighting factors directly related to the duration of the individual speech segments. This strategy is applied to a real-world application: detection of Italian motherese in authentic and longitudinal parent–infant interaction at home. The results suggest that short- and long-term information, represented respectively by the short-term spectrum and the prosody parameters (fundamental frequency and energy), provide a robust and efficient time-scale analysis. A similar fusion methodology is also investigated through the use of a phonetic-specific characterization process. This strategy is motivated by the fact that there are variations across emotional states at the phoneme level. A time-scale based on both vowels and consonants is proposed, and it provides a relevant and discriminant feature space for acted emotion recognition. The experimental results on two different databases, Berlin (German) and Aholab (Basque), show that the best performance is obtained by our phoneme-dependent approach. These findings demonstrate the relevance of taking into account phoneme dependency (vowels/consonants) for emotional speech characterization.
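A minimal sketch of the duration-weighted fusion rule as we read it from the abstract: per-segment class posteriors are combined with weights proportional to segment duration, and the fused distribution is argmax-decoded. The function name, array shapes and toy numbers are ours.

```python
import numpy as np

def duration_weighted_decision(posteriors, durations):
    """Fuse per-segment class posteriors into one decision, weighting each
    segment by its duration (a sketch of the fusion rule described above).

    posteriors : (n_segments, n_classes) local a-posteriori probabilities
    durations  : (n_segments,) segment durations in seconds
    """
    posteriors = np.asarray(posteriors, dtype=float)
    weights = np.asarray(durations, dtype=float)
    weights = weights / weights.sum()                 # normalise the weights
    fused = (weights[:, None] * posteriors).sum(axis=0)
    return int(np.argmax(fused)), fused

# Three segments, two classes (e.g. motherese vs. other speech).
label, probs = duration_weighted_decision(
    posteriors=[[0.9, 0.1], [0.4, 0.6], [0.3, 0.7]],
    durations=[2.0, 0.5, 0.5])
print(label, probs)   # the long first segment dominates the decision
```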


affective computing and intelligent interaction | 2013

On the Influence of Emotional Feedback on Emotion Awareness and Gaze Behavior

Fabien Ringeval; Andreas Sonderegger; Basilio Noris; Aude Billard; Jürgen Sauer; Denis Lalanne

This paper examines how emotion feedback influences emotion awareness and gaze behavior. Simulating a videoconference setup, 36 participants watched 12 emotional video sequences selected from the SEMAINE database. All participants wore an eye-tracker to measure gaze behavior and were asked to rate the perceived emotion for each video sequence. Three conditions were tested: (c1) no feedback, i.e., the original video sequences; (c2) correct feedback, i.e., an emoticon integrated into the video showing the emotion depicted by the person in the video; and (c3) random feedback, i.e., the emoticon displays at random an emotional state that may or may not correspond to that of the person. The results showed that emotion feedback had a significant influence on gaze behavior, e.g., over time random feedback led to a decrease in the frequency of gaze episodes. No effect of emotion display was observed for emotion recognition. However, experiments on automatic emotion recognition using gaze behavior provided good performance, with better scores for arousal than for valence, and very good performance was obtained in the automatic recognition of the correctness of the emotion feedback.


Image and Vision Computing | 2017

Strength modelling for real-world automatic continuous affect recognition from audiovisual signals

Jing Han; Zixing Zhang; Nicholas Cummins; Fabien Ringeval; Björn W. Schuller

Automatic continuous affect recognition from audiovisual cues is arguably one of the most active research areas in machine learning. In addressing this regression problem, the advantages of particular models, such as the global-optimisation capability of Support Vector Machine for Regression and the context-sensitive capability of memory-enhanced neural networks, have been frequently explored, but in an isolated way. Motivated to leverage the individual advantages of these techniques, this paper proposes and explores a novel framework, Strength Modelling, in which two models are concatenated hierarchically. In doing this, the strength information of the first model, as represented by its predictions, is joined with the original features, and this expanded feature space is then used as the input of the successive model. A major advantage of Strength Modelling, besides its ability to hierarchically explore the strengths of different machine learning algorithms, is that it can work together with conventional feature- and decision-level fusion strategies for multimodal affect recognition. To highlight the effectiveness and robustness of the proposed approach, extensive experiments have been carried out on two time- and value-continuous spontaneous emotion databases (RECOLA and SEMAINE) using audio and video signals. The experimental results indicate that employing Strength Modelling can deliver a significant performance improvement for both arousal and valence in unimodal and bimodal settings. The results further show that the proposed system is competitive with or outperforms other state-of-the-art approaches, while remaining simple to implement.
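A minimal sketch of the Strength Modelling idea on toy data, assuming scikit-learn's SVR as the first model and a plain MLP standing in for the memory-enhanced neural network used in the paper; the shapes, hyper-parameters and synthetic target are purely illustrative.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor

# Toy stand-ins for frame-level audiovisual features and a continuous
# affect target (e.g. arousal).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 40))
y = 0.5 * X[:, 0] + rng.normal(scale=0.1, size=1000)

# First model: its predictions carry the "strength" information.
first = SVR().fit(X, y)
strength = first.predict(X).reshape(-1, 1)

# Second model sees the original features plus the first model's predictions.
# (The paper uses a memory-enhanced neural network here; an MLP stands in.)
X_expanded = np.hstack([X, strength])
second = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500).fit(X_expanded, y)
print(second.score(X_expanded, y))
```

The same expanded feature space can then be combined with feature- or decision-level fusion across modalities, as the abstract notes.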


international conference on acoustics, speech, and signal processing | 2017

Prediction-based learning for continuous emotion recognition in speech

Jing Han; Zixing Zhang; Fabien Ringeval; Björn W. Schuller

In this paper, a prediction-based learning framework is proposed for the continuous prediction task of emotion recognition from speech, which is one of the key components of affective computing in multimedia. The main goal of this framework is to exploit the individual advantages of different regression models cooperatively and to the fullest. To this end, we take two widely used regression models as examples, i.e., support vector regression and a bidirectional long short-term memory recurrent neural network. We concatenate the two models in a tandem structure in different ways, forming a unified cascaded framework. The outputs predicted by the former model are combined with the original features as the input of the following model for the final predictions. The experimental results on a time- and value-continuous spontaneous emotion database (RECOLA) show that the prediction-based learning framework significantly outperforms the individual models for both the arousal and valence dimensions, and provides significantly better results than other state-of-the-art methodologies on this corpus.


international conference on human-computer interaction | 2013

Computer-Supported Work in Partially Distributed and Co-located Teams: The Influence of Mood Feedback

Andreas Sonderegger; Denis Lalanne; Luisa Bergholz; Fabien Ringeval; Jürgen Sauer

This article examines the influence of mood feedback on different outcomes of teamwork in two different collaborative work environments. Employing a 2 × 2 between-subjects design, mood feedback (present vs. not present) and communication mode (face-to-face vs. video conferencing) were manipulated experimentally. We used a newly developed collaborative communication environment, called EmotiBoard, which is a large vertical interactive screen with which team members can interact in a face-to-face discussion or as a spatially distributed team. To support teamwork, this tool provides visual feedback of each team member's emotional state. Thirty-five teams of three persons each (with a confederate in each team) completed three different tasks, during which mood, performance, subjective workload, and team satisfaction were measured. Results indicated that the evaluation of the other team members' emotional state was more accurate when mood feedback was presented. In addition, mood feedback influenced team performance positively in the video conference condition and negatively in the face-to-face condition. Furthermore, participants in the video conference condition were more satisfied after task completion than participants in the face-to-face condition. The findings indicate that the mood feedback tool helps teams gain a more accurate understanding of team members' emotional states in different work situations.


international conference on signals circuits and systems | 2009

Emotional speech characterization based on multi-features fusion for face-to-face interaction

Ammar Mahdhaoui; Fabien Ringeval; Mohamed Chetouani

Speech contains nonverbal elements known as paralanguage, including voice quality, emotion and speaking style, as well as prosodic features such as rhythm, intonation and stress. The study of nonverbal communication has focused on face-to-face interaction, since the behaviors of communicators play a major role during social interaction and carry information between the different speakers. In this paper, we describe a computational framework for combining different features for emotional speech detection. The statistical fusion is based on the estimation of local a posteriori class probabilities, and the overall decision employs weighting factors directly related to the duration of the individual speech segments. This strategy is applied to a real-life application: detection of motherese in authentic and longitudinal parent-infant interaction at home. The results suggest that short- and long-term information provide a robust and efficient time-scale analysis. A similar fusion methodology is also investigated through the use of a phonetic-specific characterization process. This strategy is motivated by the fact that there are variations across emotional states at the phoneme level. A time-scale based on both vowels and consonants is proposed, and it provides a relevant and discriminant feature space for acted emotion recognition.


international conference on acoustics, speech, and signal processing | 2017

Reconstruction-error-based learning for continuous emotion recognition in speech

Jing Han; Zixing Zhang; Fabien Ringeval; Björn W. Schuller

To advance the performance of continuous emotion recognition from speech, we introduce a reconstruction-error-based (RE-based) learning framework with memory-enhanced Recurrent Neural Networks (RNNs). In this framework, two successive RNN models are adopted: the first is used as an autoencoder that reconstructs the original features, and the second performs the emotion prediction. The RE of the original features is used as a complementary descriptor, which is merged with the original features and fed to the second model. The assumption behind this framework is that the system can learn its own 'drawback', as expressed by the RE. Experimental results on the RECOLA database show that the proposed framework significantly outperforms the baseline systems without any RE information in terms of the Concordance Correlation Coefficient (.729 vs. .710 for arousal, .360 vs. .237 for valence), and also significantly outperforms other state-of-the-art methods.
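A minimal sketch of the reconstruction-error idea on toy data: a feed-forward MLP stands in for the paper's memory-enhanced RNNs, the reconstruction error is computed per feature, and the augmented feature matrix feeds the second regressor. All shapes, hyper-parameters and the synthetic target are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Toy frame-level acoustic features and a continuous affect target.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 30))
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.1, size=1000)

# First model: an autoencoder trained to reconstruct the input features;
# the per-feature absolute error is the complementary "drawback" descriptor.
autoencoder = MLPRegressor(hidden_layer_sizes=(8,), max_iter=1000).fit(X, X)
reconstruction_error = np.abs(X - autoencoder.predict(X))

# Second model: predicts emotion from the features plus their reconstruction error.
X_augmented = np.hstack([X, reconstruction_error])
regressor = MLPRegressor(hidden_layer_sizes=(32,), max_iter=1000).fit(X_augmented, y)
print(regressor.score(X_augmented, y))
```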


Research in Autism Spectrum Disorders | 2011

Differential Language Markers of Pathology in Autism, Pervasive Developmental Disorder Not Otherwise Specified and Specific Language Impairment.

Julie Demouy; Monique Plaza; Jean Xavier; Fabien Ringeval; Mohamed Chetouani; Didier Périsse; Dominique Chauvin; Sylvie Viaux; Bernard Golse; David Cohen; Laurence Robel

Collaboration


Dive into Fabien Ringeval's collaborations.

Top Co-Authors

Jing Han
University of Augsburg

Maja Pantic
Imperial College London

Roddy Cowie
Queen's University Belfast

Laurence Robel
Necker-Enfants Malades Hospital

Monique Plaza
Centre national de la recherche scientifique