Kevin El Haddad
University of Mons
Publications
Featured research published by Kevin El Haddad.
Affective Computing and Intelligent Interaction | 2015
Laurence Devillers; Sophie Rosset; Guillaume Dubuisson Duplessis; Mohamed A. Sehili; Lucile Bechade; Agnes Delaborde; Clément Gossart; Vincent Letard; Fan Yang; Yücel Yemez; Bekir Berker Turker; T. Metin Sezgin; Kevin El Haddad; Stéphane Dupont; Daniel Luzzati; Yannick Estève; Emer Gilmartin; Nick Campbell
Thanks to its remarkable ability to convey amusement and engagement, laughter is one of the most important social markers in human interactions. Laughing together helps set up a positive atmosphere and favors the creation of new relationships. This paper presents a data collection of social interaction dialogs involving humor between a human participant and a robot. In this work, interaction scenarios have been designed in order to study social markers such as laughter. They have been implemented within two automatic systems developed in the Joker project: a social dialog system using paralinguistic cues and a task-based dialog system using linguistic content. One of the major contributions of this work is to provide a context in which to study human laughter produced during human-robot interaction. The collected data will be used to build a generic intelligent user interface that provides a multimodal dialog system with social communication skills, including humor and other informal, socially oriented behaviors. This system will emphasize the fusion of verbal and non-verbal channels for emotional and social behavior perception, interaction and generation capabilities.
International Conference on Acoustics, Speech, and Signal Processing | 2015
Kevin El Haddad; Stéphane Dupont; Jérôme Urbain; Thierry Dutoit
This paper presents an HMM-based synthesis approach for speech-laughs. The cornerstone of this project is the co-occurrence of smiling and laughter bursts, in varying proportions, within amused speech utterances. A corpus with three complementary speaking styles was used to train the underlying HMM models: neutral speech, speech-smile, and laughter in different articulatory configurations. Two types of speech-laughs were then synthesized: one combining neutral speech and laughter bursts, and the other combining speech-smile and laughter bursts. Synthesized stimuli were then rated in terms of perceived amusement and naturalness. Results show the compound effect of laughter bursts and smiling on both amusement and naturalness and open interesting perspectives.
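As a rough illustration of the assembly step only (not the actual synthesis pipeline), the sketch below splices stand-in laughter-burst waveforms into a stand-in speech waveform at chosen positions; all signals and insertion points are hypothetical placeholders.

```python
# Sketch: assembling a speech-laugh stimulus by inserting laughter bursts into
# a (neutral or smiled) speech waveform at given sample positions. Waveforms
# are plain numpy arrays here; the corpora and positions are hypothetical.
import numpy as np

def insert_bursts(speech, bursts, positions):
    """Insert each burst waveform at the corresponding sample position."""
    pieces, cursor = [], 0
    for burst, pos in sorted(zip(bursts, positions), key=lambda x: x[1]):
        pieces.append(speech[cursor:pos])
        pieces.append(burst)
        cursor = pos
    pieces.append(speech[cursor:])
    return np.concatenate(pieces)

sr = 16000
speech = np.zeros(3 * sr)                                   # stand-in sentence
laughs = [np.ones(int(0.3 * sr)), np.ones(int(0.2 * sr))]   # stand-in bursts
stimulus = insert_bursts(speech, laughs, positions=[sr, 2 * sr])
print(stimulus.shape)   # original length plus the burst lengths
```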
IEEE International Conference on Automatic Face and Gesture Recognition | 2015
Kevin El Haddad; Stéphane Dupont; Nicolas D'Alessandro; Thierry Dutoit
This paper presents an HMM-based speech-smile synthesis system. To build it, databases of three speech styles were recorded. The system was used to study to what extent synthesized speech-smiles (defined as Duchenne smiles in our work) and spread-lips speech (speech produced with the lips spread) communicate amusement. Our evaluation results showed that the synthesized speech-smile sentences are perceived as more amused than the spread-lips ones. An acoustic analysis of the pitch and first two formants is also provided.
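For the pitch part of such an acoustic analysis, a minimal autocorrelation-based F0 estimate can look like the sketch below; the paper's actual analysis tooling is not specified here, and the test signal is synthetic.

```python
# Sketch: a simple autocorrelation-peak F0 estimate on a voiced frame.
import numpy as np

def estimate_f0(frame, sr, fmin=80, fmax=400):
    """Return an autocorrelation-based F0 estimate of a voiced frame, in Hz."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + np.argmax(ac[lo:hi])
    return sr / lag

sr = 16000
t = np.arange(sr) / sr
voiced = np.sin(2 * np.pi * 180 * t) + 0.3 * np.sin(2 * np.pi * 360 * t)
print(round(estimate_f0(voiced, sr), 1))   # close to 180 Hz
```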
International Symposium on Signal Processing and Information Technology | 2015
Kevin El Haddad; Hüseyin Çakmak; Alexis Moinet; Stéphane Dupont; Thierry Dutoit
Smiling is not only a visual expression. When it occurs together with speech, it also alters the acoustic realization of that speech. Being able to synthesize speech altered by smiling can hence contribute to the naturalness and expressiveness of interactive systems. In this work, we present a first attempt at developing a Hidden Markov Model (HMM)-based synthesis system that allows the degree of smile in speech to be controlled. It relies on a model interpolation technique, enabling speech-smile sentences to be generated with various smiling intensities. Sentences synthesized with this approach have been evaluated through a perceptual test, and encouraging results are reported.
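A minimal sketch of the model-interpolation idea, under the simplifying assumption that each HMM state carries a single Gaussian whose parameters can be mixed linearly; the state layout and parameter values below are hypothetical, not the paper's models.

```python
# Sketch: linear interpolation of Gaussian emission parameters between a
# neutral-speech HMM and a speech-smile HMM; alpha acts as the smile degree.
import numpy as np

def interpolate_states(mu_a, mu_b, var_a, var_b, alpha):
    """Return interpolated (means, variances); alpha=0 -> neutral, 1 -> smiled."""
    means = (1.0 - alpha) * mu_a + alpha * mu_b
    variances = (1.0 - alpha) * var_a + alpha * var_b
    return means, variances

# Toy example: 5 HMM states, 3-dimensional spectral features per state.
rng = np.random.default_rng(0)
mu_neutral, mu_smile = rng.normal(size=(5, 3)), rng.normal(size=(5, 3))
var_neutral = var_smile = np.ones((5, 3))

for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):   # increasing smile intensity
    mu, var = interpolate_states(mu_neutral, mu_smile, var_neutral, var_smile, alpha)
    print(f"smile degree {alpha:.2f}: first state mean {mu[0].round(2)}")
```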
International Conference on Multimodal Interfaces | 2016
Kevin El Haddad; Hüseyin Çakmak; Emer Gilmartin; Stéphane Dupont; Thierry Dutoit
In this work, we experiment with the use of smiling and laughter to help create more natural and efficient listening agents. We present preliminary results on a system that predicts smile and laughter sequences in one dialogue participant based on observations of the other participant's behavior. This system also predicts the level of intensity, or arousal, of these sequences. We also describe an audiovisual concatenative synthesis process used to generate laughter and smiling sequences, producing multilevel amusement expressions from a dataset of audiovisual laughs. We thus present two contributions: one in the generation of smiling and laughter responses, the other in the prediction of which laughter and smiles to use in response to an interlocutor's behaviour. Both the synthesis system and the prediction system have been evaluated via Mean Opinion Score tests and give satisfying, promising results that open the door to interesting perspectives.
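As a toy illustration of the unit-selection step in such a concatenative approach (the dataset entries and the intensity scale below are invented), a unit of the requested kind whose annotated intensity is closest to the predicted level could be retrieved as follows.

```python
# Sketch: picking an amusement-expression unit from a labeled audiovisual
# dataset so that its annotated intensity matches a predicted arousal level.
from dataclasses import dataclass
import random

@dataclass
class Unit:
    kind: str        # "smile" or "laugh"
    intensity: int   # e.g. 1 (low) .. 3 (high)
    clip: str        # path to the audiovisual segment (hypothetical names)

dataset = [
    Unit("smile", 1, "smile_low_01.mp4"),
    Unit("smile", 2, "smile_mid_03.mp4"),
    Unit("laugh", 2, "laugh_mid_07.mp4"),
    Unit("laugh", 3, "laugh_high_02.mp4"),
]

def select_unit(kind, predicted_intensity):
    """Return a random unit of the requested kind with the closest intensity."""
    candidates = [u for u in dataset if u.kind == kind]
    best = min(abs(u.intensity - predicted_intensity) for u in candidates)
    return random.choice([u for u in candidates
                          if abs(u.intensity - predicted_intensity) == best])

print(select_unit("laugh", predicted_intensity=3).clip)
```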
European Signal Processing Conference | 2016
Kevin El Haddad; Hüseyin Çakmak; Martin Sulír; Stéphane Dupont; Thierry Dutoit
Affect bursts are short, isolated, non-verbal expressions of affect, expressed vocally or facially. In this paper we present an attempt at synthesizing audio affect bursts at several levels of arousal. This work concerns three types of affect bursts: disgust, startle and surprise expressions. Data are first gathered for each of these affect bursts at two levels of arousal. Each level of each emotion is then modeled using Hidden Markov Models. A weighted linear interpolation technique is then used to obtain intermediate levels from these models. The synthesized affect bursts are finally evaluated in a perception test.
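The same weighted-interpolation principle, applied per affect-burst type between a low- and a high-arousal model, can be sketched as below; each "model" is reduced to a single parameter vector and all values are hypothetical.

```python
# Sketch: deriving intermediate-arousal affect-burst parameters as weighted
# combinations of a low- and a high-arousal model per burst type.
import numpy as np

burst_models = {   # hypothetical per-type parameter vectors
    "disgust":  {"low": np.array([1.0, 0.2]), "high": np.array([2.0, 0.9])},
    "startle":  {"low": np.array([0.5, 0.1]), "high": np.array([1.8, 0.8])},
    "surprise": {"low": np.array([0.8, 0.3]), "high": np.array([2.2, 1.0])},
}

def intermediate(models, weight):
    """weight in [0, 1]: 0 -> low-arousal model, 1 -> high-arousal model."""
    return (1.0 - weight) * models["low"] + weight * models["high"]

for burst, models in burst_models.items():
    ladder = [intermediate(models, w).round(2).tolist() for w in (0.25, 0.5, 0.75)]
    print(burst, ladder)
```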
IEEE Global Conference on Signal and Information Processing | 2015
Kevin El Haddad; Stéphane Dupont; Hüseyin Çakmak; Thierry Dutoit
In this paper, we present our work on classifying speech-smile versus shaking vowels. An efficient classification system would be a first step towards estimating, from speech signals alone, amusement levels beyond smiling, since shaking vowels represent a transition from smile to laughter superimposed on speech. A database containing examples of both classes has been collected from acted and spontaneous speech corpora. An experimental study using several acoustic feature sets is presented here, and novel features are also proposed. The best configuration achieves a 30.1% error rate, hence performing well above chance.
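A toy sketch of such a two-class pipeline, with invented acoustic features and a nearest-centroid classifier standing in for the feature sets and classifiers actually studied in the paper:

```python
# Sketch: speech-smile vs. shaking-vowel classification with toy features
# (frame-energy variance, zero-crossing rate) and nearest-centroid decision.
import numpy as np

def toy_features(signal, frame=400, hop=160):
    """Return a 2-D feature vector: frame-energy variance and zero-crossing rate."""
    frames = [signal[i:i + frame] for i in range(0, len(signal) - frame, hop)]
    energies = np.array([np.mean(f ** 2) for f in frames])
    zcr = np.mean(np.abs(np.diff(np.sign(signal)))) / 2
    return np.array([energies.var(), zcr])

rng = np.random.default_rng(1)
smile_ex = [toy_features(rng.normal(size=8000)) for _ in range(20)]
shaky_ex = [toy_features(rng.normal(size=8000) *
                         (1 + 0.5 * np.sin(np.linspace(0, 40, 8000))))
            for _ in range(20)]

centroids = {"speech-smile": np.mean(smile_ex, axis=0),
             "shaking-vowel": np.mean(shaky_ex, axis=0)}

def classify(x):
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))

print(classify(toy_features(rng.normal(size=8000))))
```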
European Signal Processing Conference | 2015
Kevin El Haddad; Hüseyin Çakmak; Stéphane Dupont; Thierry Dutoit
In this work, we present a study dedicated to improving speech-laugh synthesis quality. The impact of two factors is evaluated. The first is the addition of breath-intake sounds after laughter bursts in speech. The second is the repetition of the words interrupted by laughs in the speech-laugh sentences. Several configurations are evaluated through subjective perceptual tests. We report an improvement in the naturalness of the synthesized speech-laughs when breath-intake sounds are added. We could not, however, draw a conclusion concerning a possible positive impact of repeating the interrupted words on speech-laugh synthesis quality.
Affective Computing and Intelligent Interaction | 2015
Hüseyin Çakmak; Kevin El Haddad; Thierry Dutoit
In this paper we propose synchronization rules between acoustic and visual laughter synthesis systems. Previous works have addressed acoustic and visual laughter synthesis separately, each following an HMM-based approach. The need for synchronization rules comes from the fact that, for laughter, HMM-based synthesis cannot be performed with a unified system in which common transcriptions are shared, as has been shown to be possible for audio-visual speech synthesis. Acoustic and visual models are therefore trained independently, without any synchronization constraints. In this work, we propose rules derived from the analysis of audio and visual laughter transcriptions in order to generate visual laughter transcriptions corresponding to audio laughter data.
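One possible shape for such rules, sketched with a hypothetical label inventory (not the paper's actual transcription labels): copy segment boundaries from the audio transcription and substitute each audio label with a visual one.

```python
# Sketch: a rule-based mapping from an audio laughter transcription to a
# visual one, preserving segment durations. Labels and mapping are invented.
AUDIO_TO_VISUAL = {
    "inhalation": "mouth-open",
    "voiced-burst": "mouth-open-smile",
    "nasal": "smile-closed",
    "silence": "neutral",
}

def audio_to_visual(audio_segments):
    """audio_segments: list of (label, start_s, end_s) tuples."""
    return [(AUDIO_TO_VISUAL.get(label, "neutral"), start, end)
            for label, start, end in audio_segments]

audio = [("inhalation", 0.00, 0.21), ("voiced-burst", 0.21, 0.74),
         ("silence", 0.74, 0.90)]
print(audio_to_visual(audio))
```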
International Conference on Statistical Language and Speech Processing | 2017
Kevin El Haddad; Ilaria Torre; Emer Gilmartin; Hüseyin Çakmak; Stéphane Dupont; Thierry Dutoit; Nick Campbell
In this paper we present the AmuS database, which contains about three hours of amused speech data recorded from two male subjects and one female subject, in two languages, French and English. We review previous work on smiled speech and speech-laughs. We describe an acoustic analysis of part of our database, and a perception test comparing speech-laughs with smiled and neutral speech. We show the suitability of the AmuS data for amused speech synthesis by training HMM-based models for neutral and smiled speech for each voice and comparing them using an on-line CMOS test.
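As a small illustration of how a CMOS comparison is typically aggregated (the listener scores below are invented, not results from the paper):

```python
# Sketch: aggregating comparison mean opinion scores (CMOS). Each listener
# rates pairs (system A vs. system B) on a scale from -3 (A much worse)
# to +3 (A much better); the CMOS is the mean rating per voice.
import statistics

ratings = {  # hypothetical listener scores: smiled vs. neutral synthesis
    "voice_1": [1, 2, 0, 1, 2, 1],
    "voice_2": [2, 1, 1, 0, 2, 2],
}

for voice, scores in ratings.items():
    print(voice, "CMOS =", round(statistics.mean(scores), 2))
```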