Laurence Devillers
Paris-Sorbonne University
Publication
Featured research published by Laurence Devillers.
Affective Computing and Intelligent Interaction | 2007
Ellen Douglas-Cowie; Roddy Cowie; Ian Sneddon; Cate Cox; Orla Lowry; Margaret McRorie; Jean-Claude Martin; Laurence Devillers; Sarkis Abrilian; Anton Batliner; Noam Amir; Kostas Karpouzis
The HUMAINE project is concerned with developing interfaces that will register and respond to emotion, particularly pervasive emotion (forms of feeling, expression and action that colour most of human life). The HUMAINE Database provides naturalistic clips which record that kind of material, in multiple modalities, and labelling techniques that are suited to describing it.
Neural Networks | 2005
Laurence Devillers; Laurence Vidrascu; Lori Lamel
Since the early studies of human behavior, emotion has attracted the interest of researchers in many disciplines of neuroscience and psychology. More recently, it has become a growing field of research in computer science and machine learning. We are exploring how the expression of emotion is perceived by listeners and how to represent and automatically detect a subject's emotional state in speech. In contrast with most previous studies, conducted on artificial data with archetypal emotions, this paper addresses some of the challenges faced when studying real-life, non-basic emotions. We present a new annotation scheme allowing the annotation of emotion mixtures. Our studies of real-life spoken dialogs from two call center services reveal the presence of many blended emotions, dependent on the dialog context. Several classification methods (SVM, decision trees) are compared to identify relevant emotional states from prosodic, disfluency and lexical cues extracted from the real-life spoken human-human interactions.
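The classifier comparison can be illustrated with a minimal sketch, assuming per-utterance feature vectors and binary labels; the features and labels below are synthetic placeholders, not the paper's data.

```python
# Hypothetical sketch, not the authors' code: comparing an SVM and a decision tree
# on per-utterance feature vectors (e.g. prosodic, disfluency and lexical cues).
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))      # placeholder features: F0/energy statistics, pause counts, lexical scores
y = rng.integers(0, 2, size=200)    # placeholder labels: 0 = neutral, 1 = non-neutral emotion

for name, clf in [("SVM", SVC(kernel="rbf", C=1.0)),
                  ("Decision tree", DecisionTreeClassifier(max_depth=5))]:
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean cross-validation accuracy {scores.mean():.2f}")
```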
Speech Communication | 2008
Chloé Clavel; Ioana Vasilescu; Laurence Devillers; Gaël Richard; Thibaut Ehrette
This paper addresses the issue of automatic emotion recognition in speech. We focus on a type of emotional manifestation that has rarely been studied in speech processing: fear-type emotions occurring during abnormal situations (here, unplanned events where human life is threatened). This study is dedicated to a new application of emotion recognition: public safety. The starting point of this work is the definition and collection of data illustrating extreme emotional manifestations in threatening situations. For this purpose we develop the SAFE corpus (Situation Analysis in a Fictional and Emotional corpus), based on fiction movies. It consists of 7 h of recordings organized into 400 audiovisual sequences. The corpus contains recordings of both normal and abnormal situations and provides a wide range of contexts and therefore of emotional manifestations. In this way, it not only addresses the lack of corpora illustrating strong emotions, but also provides a useful basis for studying a wide variety of emotional manifestations. We define a task-dependent annotation strategy whose particularity is to describe simultaneously the emotion and the evolution of the situation in context. The emotion recognition system is based on these data and must handle a wide range of unknown speakers and situations in noisy sound environments. It consists of a fear vs. neutral classification. The novelty of our approach lies in the use of dissociated acoustic models of the voiced and unvoiced content of speech, which are merged at the decision step of the classification system. The results are quite promising given the complexity and diversity of the data: the error rate is about 30%.
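The decision-level fusion of the two dissociated models can be sketched as a weighted combination of per-segment fear scores; the weights, threshold and score semantics here are assumptions for illustration, not the paper's actual fusion rule.

```python
# Illustrative sketch of decision-step fusion (assumed weights and threshold, not the
# SAFE system's actual parameters): each acoustic model yields a fear probability for
# its segment type, and the merged score drives the fear vs. neutral decision.
def fuse_decision(p_fear_voiced: float, p_fear_unvoiced: float,
                  w_voiced: float = 0.6, w_unvoiced: float = 0.4,
                  threshold: float = 0.5) -> str:
    p_fear = w_voiced * p_fear_voiced + w_unvoiced * p_fear_unvoiced
    return "fear" if p_fear >= threshold else "neutral"

print(fuse_decision(0.80, 0.55))  # -> fear
print(fuse_decision(0.30, 0.40))  # -> neutral
```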
International Journal of Humanoid Robotics | 2006
Jean-Claude Martin; Radoslaw Niewiadomski; Laurence Devillers; Stéphanie Buisine; Catherine Pelachaud
One of the challenges of designing virtual humans is the definition of appropriate models of the relation between realistic emotions and the coordination of behaviors in several modalities. In this paper, we present the annotation, representation and modeling of multimodal visual behaviors occurring during complex emotions. We illustrate our work using a corpus of TV interviews. This corpus has been annotated at several levels of information: communicative acts, emotion labels, and multimodal signs. We have defined a copy-synthesis approach to drive an Embodied Conversational Agent from these different levels of information. The second part of our paper focuses on a model of complex emotions (superposition and masking) in the facial expressions of the agent. We explain how the complementary aspects of our work on the corpus and the computational model are used to specify complex emotional behaviors.
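One way to picture the three annotation levels that drive the agent is a per-segment record holding the communicative act, the (possibly blended) emotion labels and the multimodal signs; the field names and values below are illustrative assumptions, not the project's actual coding format.

```python
# Hypothetical data structure for the annotation levels mentioned above
# (communicative acts, emotion labels, multimodal signs); not the actual corpus format.
from dataclasses import dataclass, field

@dataclass
class Segment:
    start: float
    end: float
    communicative_act: str                                # e.g. "inform", "complain"
    emotions: list[str] = field(default_factory=list)     # possibly blended, e.g. ["anger", "sadness"]
    signs: dict[str, str] = field(default_factory=dict)   # modality -> observed behavior

clip = [Segment(0.0, 2.3, "complain", ["anger", "sadness"],
                {"face": "frown", "gesture": "self-touch", "speech": "raised F0"})]
for seg in clip:
    print(seg.communicative_act, seg.emotions, list(seg.signs))
```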
International Conference on Spoken Language Processing | 1996
Samir Bennacef; Laurence Devillers; Sophie Rosset; Lori Lamel
Dialog management is of particular importance in telephone-based services. In this paper we describe our recent activities in dialog management and natural language generation in the LIMSI RAILTEL system for access to rail travel information. The aim of the LE-MLAP project RAILTEL was to assess the capabilities of spoken language technology for interactive telephone information services. Because all interaction is over the telephone, oral dialog management and response generation are very important aspects of the overall system design and usability. Each dialog is analysed to determine the source of any errors (speech recognition, understanding, information retrieval, processing or dialog management). An analysis is provided for 100 dialogs taken from the RAILTEL field trials with naive subjects accessing timetable information.
International Conference on Multimedia and Expo | 2003
Laurence Devillers; Lori Lamel; Ioana Vasilescu
Detecting emotions in the context of automated call center services can be helpful for following the evolution of human-computer dialogues, enabling dynamic modification of the dialogue strategies and influencing the final outcome. The emotion detection work reported here is part of a larger study aiming to model user behavior in real interactions. We make use of a corpus of real agent-client spoken dialogues in which the manifestation of emotion is quite complex, and it is common to have shaded emotions since the interlocutors attempt to control the expression of their internal attitude. Our aims are to define appropriate emotions for call center services, to annotate the dialogues, to validate the presence of emotions via perceptual tests, and to find robust cues for emotion detection. In contrast to research carried out on artificial data with simulated emotions, for real-life corpora the set of appropriate emotion labels must be determined. Two studies are reported: the first investigates automatic emotion detection using linguistic information, whereas the second concerns perceptual tests for identifying emotions as well as the prosodic and textual cues which signal them. About 11% of the utterances are annotated with non-neutral emotion labels. Preliminary experiments using lexical cues detect about 70% of these labels.
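The first study's use of lexical cues can be illustrated with a small bag-of-words classifier over utterance transcriptions; the utterances, labels and model choice (naive Bayes) below are invented for illustration and are not the paper's data or method.

```python
# Toy sketch of emotion detection from lexical cues alone; sample utterances and
# labels are invented, and the classifier choice is an assumption for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

utterances = ["this is unacceptable I have waited an hour",
              "thank you that is all I needed",
              "I am really upset nothing works",
              "fine the schedule suits me"]
labels = ["negative", "neutral", "negative", "neutral"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(utterances, labels)
print(model.predict(["I am upset about the waiting time"]))  # -> ['negative']
```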
Affective Computing and Intelligent Interaction | 2005
Laurence Devillers; Sarkis Abrilian; Jean-Claude Martin
The modeling of realistic emotional behavior is needed for various applications in multimodal human-machine interaction, such as emotion detection in a surveillance system or the design of natural Embodied Conversational Agents. Yet building such models requires appropriate definitions of the various levels of representation: the emotional context, the emotion itself, and the observed multimodal behaviors. This paper presents the multi-level emotion and context coding scheme that was defined following the annotation of fifty-one videos of TV interviews. Results of the annotation analysis show the complexity and richness of the real-life data: around 50% of the clips feature mixed emotions with conflicting multimodal cues. A typology of mixed emotional patterns is proposed, showing that cause-effect conflicts and masked acted emotions are perceptually difficult to annotate with respect to the valence dimension.
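One way to make such emotion mixtures explicit is to combine the major/minor labels given by several annotators into a soft label vector per segment; the weighting below is a hypothetical illustration, not the published coding scheme.

```python
# Illustrative (assumed) combination of annotators' major/minor emotion labels
# into a soft mixture vector; weights are arbitrary, not the scheme's actual values.
from collections import Counter

def emotion_mixture(annotations, major_weight=2.0, minor_weight=1.0):
    """annotations: one (major_label, minor_label_or_None) pair per annotator."""
    scores = Counter()
    for major, minor in annotations:
        scores[major] += major_weight
        if minor:
            scores[minor] += minor_weight
    total = sum(scores.values())
    return {label: round(w / total, 2) for label, w in scores.items()}

print(emotion_mixture([("anger", "sadness"), ("sadness", None)]))  # -> {'anger': 0.4, 'sadness': 0.6}
```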
Speech Communication | 1997
Lori Lamel; Samir Bennacef; Sophie Rosset; Laurence Devillers; S. Foukia; J. J. Gangolf; Jean-Luc Gauvain
This paper describes the RailTel system developed at LIMSI to provide vocal access to static train timetable information in French, and a field trial carried out to assess the technical adequacy of available speech technology for interactive services. The data collection system used to carry out the field trials is based on the LIMSI MASK spoken language system and runs on a Unix workstation with a high-quality telephone interface. The spoken language system allows a mixed-initiative dialog where the user can provide any information at any point in time. Experienced users are thus able to provide all the information needed for database access in a single sentence, whereas less experienced users tend to provide shorter responses, allowing the system to guide them. The RailTel field trial was carried out using a common methodology defined by the consortium. A total of 100 novice subjects participated in the field trials, each calling the system one time and completing a user questionnaire. Of the callers, 72% successfully completed their scenario. The subjective assessment of the prototype was for the most part favourable, with subjects expressing an interest in using such a service.
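The mixed-initiative behavior described above can be sketched as simple slot filling where the system only prompts for constraints the caller has not yet supplied; the slot names and prompts are simplified placeholders, not the RailTel implementation.

```python
# Minimal sketch of mixed-initiative slot filling (placeholder slots and prompts,
# not the RailTel system): the caller may supply any subset of constraints per turn.
REQUIRED = ("departure", "arrival", "date")

def next_prompt(filled: dict) -> str | None:
    missing = [slot for slot in REQUIRED if slot not in filled]
    if not missing:
        return None                      # all constraints known: query the timetable
    return f"Please give the {missing[0]}."

state = {}
state.update({"departure": "Paris", "arrival": "Lyon"})  # experienced caller, one sentence
print(next_prompt(state))                                # -> Please give the date.
state["date"] = "tomorrow"
print(next_prompt(state))                                # -> None (ready for database access)
```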
Archive | 1997
Jean-Luc Gauvain; Samir Bennacef; Laurence Devillers; Lori Lamel; Sophie Rosset
The aim of the Multimodal-Multimedia Automated Service Kiosk (MASK) project is to pave the way for more advanced public service applications through user interfaces employing multimodal, multimedia input and output. The project has analyzed the technological requirements in the context of users and the tasks they perform in carrying out travel enquiries, and developed a prototype information kiosk that will be installed in the Gare St. Lazare in Paris. The kiosk will improve the effectiveness of such services by enabling interaction through the coordinated use of multimodal inputs (speech and touch) and multimedia output (sound, video, text, and graphics), and in doing so create the opportunity for new public services. Vocal input is managed by a spoken language system, which aims to provide a natural interface between the user and the computer through the use of simple and natural dialogues. In this paper the architecture and capabilities of the spoken language system are described, with emphasis on the speaker-independent, large-vocabulary continuous speech recognizer, the natural language component (including semantic analysis and dialogue management), and the response generator. We also describe our data collection and evaluation activities, which are crucial to system development.
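The component chain described above (recognizer, semantic analysis, dialogue management, response generation) can be pictured as a simple pipeline of stubs; every function below is a hypothetical placeholder, not the MASK code.

```python
# Schematic pipeline with stub components (all hypothetical, not the MASK system):
# recognizer -> semantic analysis -> dialogue manager -> response generator.
def recognize(audio: bytes) -> str:
    return "I want to go to Marseille tomorrow morning"   # stub recognition result

def semantic_analysis(words: str) -> dict:
    frame = {}
    if "Marseille" in words:
        frame["arrival"] = "Marseille"
    if "tomorrow" in words:
        frame["date"] = "tomorrow"
    return frame

def dialogue_manager(frame: dict, context: dict) -> dict:
    context.update(frame)
    context["missing"] = [s for s in ("departure", "arrival", "date") if s not in context]
    return context

def generate_response(context: dict) -> str:
    if context["missing"]:
        return "From which station would you like to leave?"
    return "Here are the trains matching your request."

context: dict = {}
print(generate_response(dialogue_manager(semantic_analysis(recognize(b"")), context)))
```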
Intelligent Virtual Agents | 2006
Stéphanie Buisine; Sarkis Abrilian; Radoslaw Niewiadomski; Jean-Claude Martin; Laurence Devillers; Catherine Pelachaud
Real-life emotions are often blended and involve several simultaneous superposed or masked emotions. This paper reports on a study of the perception of multimodal emotional behaviors in Embodied Conversational Agents. The experimental study aims to evaluate whether people properly detect the signs of emotion in different modalities (speech, facial expressions, gestures) when these emotions are superposed or masked. We compared the perception of emotional behaviors annotated in a corpus of TV interviews and replayed by an expressive agent at different levels of abstraction. The results provide insights on the use of such protocols for studying the effect of various models and modalities on the perception of complex emotions.