Publication


Featured research published by Mohammad Soleymani.


IEEE Transactions on Affective Computing | 2012

A Multimodal Database for Affect Recognition and Implicit Tagging

Mohammad Soleymani; Jeroen Lichtenauer; Thierry Pun; Maja Pantic

MAHNOB-HCI is a multimodal database recorded in response to affective stimuli with the goal of emotion recognition and implicit tagging research. A multimodal setup was arranged for synchronized recording of face videos, audio signals, eye gaze data, and peripheral/central nervous system physiological signals. Twenty-seven participants from both genders and different cultural backgrounds participated in two experiments. In the first experiment, they watched 20 emotional videos and self-reported their felt emotions using arousal, valence, dominance, and predictability as well as emotional keywords. In the second experiment, short videos and images were shown once without any tag and then with correct or incorrect tags. Agreement or disagreement with the displayed tags was assessed by the participants. The recorded videos and bodily responses were segmented and stored in a database. The database is made available to the academic community via a web-based system. The collected data were analyzed and single modality and modality fusion results for both emotion recognition and implicit tagging experiments are reported. These results show the potential uses of the recorded modalities and the significance of the emotion elicitation protocol.
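
The abstract above describes synchronized multimodal recordings segmented per stimulus. As a rough illustration only, the sketch below shows how one trial from such a database might be represented in code; all field names and the segmentation helper are assumptions for illustration, not the dataset's actual schema or API.

```python
# Illustrative sketch only (not the official MAHNOB-HCI format): a possible
# in-memory representation of one synchronized, stimulus-aligned trial.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Trial:
    participant_id: int
    stimulus_id: int                     # which of the 20 emotional videos
    face_video_path: str                 # synchronized face recording
    audio_path: str
    eye_gaze: np.ndarray                 # (n_samples, 2) gaze coordinates
    physiology: dict                     # e.g. {"EEG": ..., "ECG": ..., "GSR": ...}
    self_report: dict = field(default_factory=dict)  # arousal, valence, dominance,
                                                     # predictability, keywords

def segment(signal: np.ndarray, fs: float, t_start: float, t_end: float) -> np.ndarray:
    """Cut the stimulus-aligned segment out of a continuously recorded signal."""
    return signal[int(t_start * fs): int(t_end * fs)]
```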


International Journal of Human-Computer Studies / International Journal of Man-Machine Studies | 2009

Short-term emotion assessment in a recall paradigm

Guillaume Chanel; Joep Johannes Maria Kierkels; Mohammad Soleymani; Thierry Pun

The work presented in this paper aims at assessing human emotions using peripheral as well as electroencephalographic (EEG) physiological signals on short time periods. Three specific areas of the valence-arousal emotional space are defined, corresponding to negatively excited, positively excited, and calm-neutral states. An acquisition protocol based on the recall of past emotional life episodes was designed to acquire data from both peripheral and EEG signals. Pattern classification is used to distinguish between the three areas of the valence-arousal space. The performance of several classifiers was evaluated on 10 participants and different feature sets: peripheral features, EEG time-frequency features, and EEG pairwise mutual information (MI) features. Comparison of results obtained using either peripheral or EEG signals confirms the interest of using EEG to assess valence and arousal in emotion recall conditions. The obtained accuracy for the three emotional classes is 63% using EEG time-frequency features, which is better than the results obtained in previous studies using EEG and similar classes. Fusion of the different feature sets at the decision level using a summation rule was also shown to improve accuracy, to 70%. Furthermore, the rejection of non-confident samples finally led to a classification accuracy of 80% for the three classes.
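
To make the fusion and rejection steps mentioned above concrete, here is a minimal sketch assuming per-modality feature matrices and an off-the-shelf classifier. It illustrates decision-level fusion by a summation rule with confidence-based rejection, not the authors' exact pipeline.

```python
# Sketch of decision-level fusion (summation rule) plus rejection of
# non-confident samples; classifier choice and threshold are assumptions.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def fuse_and_reject(feature_sets, labels, test_sets, threshold=0.5):
    """feature_sets / test_sets: lists of (n_samples, n_features) arrays,
    one per modality (peripheral, EEG time-frequency, EEG MI)."""
    summed = None
    for X_train, X_test in zip(feature_sets, test_sets):
        clf = LinearDiscriminantAnalysis().fit(X_train, labels)
        proba = clf.predict_proba(X_test)        # per-classifier posteriors
        summed = proba if summed is None else summed + proba
    summed /= len(feature_sets)                  # summation (averaging) rule
    confident = summed.max(axis=1) >= threshold  # reject non-confident samples
    predictions = summed.argmax(axis=1)
    return predictions, confident
```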


International Conference on Multimedia Retrieval | 2011

Automatic tagging and geotagging in video collections and communities

Martha Larson; Mohammad Soleymani; Pavel Serdyukov; Stevan Rudinac; Christian Wartena; Vanessa Murdock; Gerald Friedland; Roeland Ordelman; Gareth J. F. Jones

Automatically generated tags and geotags hold great promise to improve access to video collections and online communities. We give an overview of three tasks offered in the MediaEval 2010 benchmarking initiative, describing for each its use scenario, its definition, and the data set released. For each task, a reference algorithm that was used within MediaEval 2010 is presented, and comments on lessons learned are included. The Tagging Task (Professional) involves automatically matching episodes in a collection of Dutch television with subject labels drawn from the keyword thesaurus used by the archive staff. The Tagging Task (Wild Wild Web) involves automatically predicting the tags that are assigned by users to their online videos. Finally, the Placing Task requires automatically assigning geo-coordinates to videos. The specification of each task admits the use of the full range of available information, including user-generated metadata, speech recognition transcripts, audio, and visual features.


International Symposium on Multimedia | 2008

Affective Characterization of Movie Scenes Based on Multimedia Content Analysis and User's Physiological Emotional Responses

Mohammad Soleymani; Guillaume Chanel; Joep Johannes Maria Kierkels; Thierry Pun

In this paper, we propose an approach for affective representation of movie scenes based on the emotions that are actually felt by spectators. Such a representation can be used for characterizing the emotional content of video clips for, e.g., affective video indexing and retrieval or neuromarketing studies. A dataset of 64 different scenes from eight movies was shown to eight participants. While watching these clips, their physiological responses were recorded. The participants were also asked to self-assess their felt emotional arousal and valence for each scene. In addition, content-based audio and video features were extracted from the movie scenes in order to characterize each one. Degrees of arousal and valence were estimated by a linear combination of features from physiological signals, as well as by a linear combination of content-based features. We showed that a significant correlation exists between the arousal/valence provided by the spectators' self-assessments and the affective grades obtained automatically from either physiological responses or audio-video features. This demonstrates the ability to use multimedia features and physiological responses to predict the expected affect of the user in response to emotional video content.
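
As a sketch of the estimation and correlation analysis described above, the snippet below fits a linear combination of features to self-assessed grades and reports a Pearson correlation; variable names and the choice of regressor are assumptions, not the authors' implementation.

```python
# Sketch: estimate arousal/valence as a linear combination of features and
# check the correlation with self-assessments.
import numpy as np
from sklearn.linear_model import LinearRegression
from scipy.stats import pearsonr

def estimate_and_correlate(features, self_assessed):
    """features: (n_scenes, n_features) physiological or audio-video features;
    self_assessed: (n_scenes,) arousal or valence grades from a participant."""
    model = LinearRegression().fit(features, self_assessed)
    estimated = model.predict(features)          # linear combination of features
    r, p_value = pearsonr(estimated, self_assessed)
    return estimated, r, p_value
```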


ACM Multimedia | 2013

1000 songs for emotional analysis of music

Mohammad Soleymani; Micheal N. Caro; Erik M. Schmidt; Cheng-Ya Sha; Yi-Hsuan Yang

Music is composed to be emotionally expressive, and emotional associations provide an especially natural domain for indexing and recommendation in today's vast digital music libraries. But such libraries require powerful automated tools, and the development of systems for automatic prediction of musical emotion presents a myriad of challenges. The perceptual nature of musical emotion necessitates the collection of data from human subjects. The interpretation of emotion varies between listeners; thus each clip needs to be annotated by a distribution of subjects. In addition, the sharing of large music content libraries for the development of such systems, even for academic research, presents complicated legal issues which vary by country. This work presents a new publicly available dataset for music emotion recognition research and a baseline system. In addressing the difficulties of emotion annotation we have turned to crowdsourcing, using Amazon Mechanical Turk, and have developed a two-stage procedure for filtering out poor-quality workers. The dataset consists entirely of Creative Commons music from the Free Music Archive, which, as the name suggests, can be shared freely without penalty. The final dataset contains 1000 songs, each annotated by a minimum of 10 subjects, which is larger than many currently available music emotion datasets.
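
The abstract mentions a two-stage procedure for filtering out poor-quality workers but does not spell out the criteria here; the sketch below shows one plausible consistency check, assumed purely for illustration: workers whose ratings correlate poorly with their peers' average on shared clips are flagged.

```python
# Hypothetical worker-consistency filter (not the paper's actual procedure):
# flag workers whose ratings disagree with the peer average on shared clips.
import numpy as np
from scipy.stats import pearsonr

def flag_inconsistent_workers(ratings, min_corr=0.2):
    """ratings: dict mapping worker_id -> {clip_id: rating}."""
    flagged = []
    for worker, worker_ratings in ratings.items():
        own, peers = [], []
        for clip, value in worker_ratings.items():
            peer_values = [r[clip] for w, r in ratings.items()
                           if w != worker and clip in r]
            if peer_values:
                own.append(value)
                peers.append(np.mean(peer_values))
        if len(own) >= 3:
            r, _ = pearsonr(own, peers)
            if r < min_corr:
                flagged.append(worker)           # candidate for removal
    return flagged
```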


Proceedings of the 2nd ACM Workshop on Multimedia Semantics | 2008

Affective ranking of movie scenes using physiological signals and content analysis

Mohammad Soleymani; Guillaume Chanel; Joep Johannes Maria Kierkels; Thierry Pun

In this paper, we propose an approach for affective ranking of movie scenes based on the emotions that are actually felt by spectators. Such a ranking can be used for characterizing the affective, or emotional, content of video clips. The ranking can for instance help determine which video clip from a database elicits, for a given user, the most joy. This in turn will permit video indexing and retrieval based on affective criteria corresponding to a personalized user affective profile. A dataset of 64 different scenes from 8 movies was shown to eight participants. While watching, their physiological responses were recorded; namely, five peripheral physiological signals (GSR - galvanic skin resistance, EMG - electromyograms, blood pressure, respiration pattern, skin temperature) were acquired. After watching each scene, the participants were asked to self-assess their felt arousal and valence for that scene. In addition, movie scenes were analyzed in order to characterize each with various audio- and video-based features capturing the key elements of the events occurring within that scene. Arousal and valence levels were estimated by a linear combination of features from physiological signals, as well as by a linear combination of content-based audio and video features. We show that a correlation exists between arousal- and valence-based rankings provided by the spectators' self-assessments and rankings obtained automatically from either physiological signals or audio-video features. This demonstrates the ability to use physiological responses of participants to characterize video scenes and to rank them according to their emotional content. It further shows that audio-visual features, either individually or combined, can fairly reliably be used to predict the spectator's felt emotion for a given scene. The results also confirm that participants exhibit different affective responses to movie scenes, which emphasizes the need for the emotional profiles to be user-dependent.
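
To illustrate the ranking comparison described above, here is a minimal sketch that scores scenes by a linear combination of features and compares the resulting ranking against self-assessments with a rank correlation; the weights and feature layout are assumptions, not the paper's exact setup.

```python
# Sketch: rank scenes by a linear feature combination and compare with the
# ranking induced by self-assessed arousal or valence.
import numpy as np
from scipy.stats import spearmanr

def affective_rank_correlation(features, weights, self_assessed):
    """features: (n_scenes, n_features); weights: (n_features,) linear-combination
    weights; self_assessed: (n_scenes,) felt arousal or valence per scene."""
    scores = features @ weights                  # linear combination per scene
    rho, p_value = spearmanr(scores, self_assessed)  # rank agreement
    return rho, p_value
```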


Affective Computing and Intelligent Interaction | 2009

A Bayesian framework for video affective representation

Mohammad Soleymani; Joep Johannes Maria Kierkels; Guillaume Chanel; Thierry Pun

Emotions that are elicited in response to a video scene contain valuable information for multimedia tagging and indexing. The novelty of this paper is to introduce a Bayesian classification framework for affective video tagging that allows taking contextual information into account. A set of 21 full-length movies was first segmented, and informative content-based features were extracted from each shot and scene. Shots were then emotionally annotated, providing ground-truth affect. The arousal of shots was computed using a linear regression on the content-based features. Bayesian classification based on the shots' arousal and content-based features allowed tagging these scenes into three affective classes, namely calm, positively excited, and negatively excited. To improve classification accuracy, two contextual priors have been proposed: the movie genre prior, and the temporal dimension prior consisting of the probability of transition between emotions in consecutive scenes. The F1 classification measure of 54.9% that was obtained on three emotional classes with a naïve Bayes classifier was improved to 63.4% after utilizing all the priors.
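
The sketch below illustrates the Bayesian combination described above, with a likelihood term, a genre prior, and a scene-to-scene transition prior; all probability values are placeholders, not numbers from the paper.

```python
# Sketch: combine a content-based likelihood with a genre prior and a
# temporal transition prior to obtain a posterior over affective classes.
import numpy as np

CLASSES = ["calm", "positively excited", "negatively excited"]

def scene_posterior(likelihood, genre_prior, transition_row):
    """likelihood: P(features | class), shape (3,), e.g. from a naive Bayes model;
    genre_prior: P(class | movie genre), shape (3,);
    transition_row: P(class | class of previous scene), shape (3,)."""
    posterior = likelihood * genre_prior * transition_row   # unnormalized
    return posterior / posterior.sum()

# Hypothetical usage: pick the most probable affective class for one scene.
likelihood = np.array([0.5, 0.3, 0.2])
genre_prior = np.array([0.2, 0.5, 0.3])          # e.g. comedy favours positive
transition_row = np.array([0.4, 0.4, 0.2])       # previous scene was calm
best = int(np.argmax(scene_posterior(likelihood, genre_prior, transition_row)))
print(CLASSES[best])
```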


International Conference on Multimedia and Expo | 2009

Queries and tags in affect-based multimedia retrieval

Joep Johannes Maria Kierkels; Mohammad Soleymani; Thierry Pun

An approach for implementing affective information as tags for multimedia content indexing and retrieval is presented. The approach can be used for implicit as well as explicit tags and is presented here using data recorded during the viewing of movie fragments, comprising annotations and physiological signal recordings. For retrieval based on affective queries, a representation of the query words is defined in the arousal-valence space in the form of a Gaussian probability distribution, and a retrieval method based on this representation is presented. Retrieval accuracy is validated using precision and recall measures. Results show that the use of arousal and valence as affective tags can improve retrieval results.
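
As an illustration of the Gaussian query representation described above, the sketch below scores clips by the density of a 2-D arousal-valence Gaussian at each clip's affective tag; the query parameters and tag values are hypothetical.

```python
# Sketch: represent an affective query word as a Gaussian in arousal-valence
# space and rank clips by the density at their (arousal, valence) tags.
import numpy as np
from scipy.stats import multivariate_normal

def rank_clips(query_mean, query_cov, clip_tags):
    """clip_tags: (n_clips, 2) array of (arousal, valence) tags per clip."""
    density = multivariate_normal(mean=query_mean, cov=query_cov)
    scores = density.pdf(clip_tags)              # match between clip tag and query
    return np.argsort(scores)[::-1]              # best-matching clips first

# Hypothetical query "joyful": high valence (0.8), moderately high arousal (0.6).
order = rank_clips(query_mean=[0.6, 0.8], query_cov=np.diag([0.1, 0.1]),
                   clip_tags=np.array([[0.5, 0.7], [-0.3, -0.6], [0.8, 0.2]]))
```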


IEEE Transactions on Multimedia | 2014

Corpus Development for Affective Video Indexing

Mohammad Soleymani; Martha Larson; Thierry Pun; Alan Hanjalic

Affective video indexing is the area of research that develops techniques to automatically generate descriptions of video content that encode the emotional reactions which the video content evokes in viewers. This paper provides a set of corpus development guidelines based on state-of-the-art practice intended to support researchers in this field. Affective descriptions can be used for video search and browsing systems offering users affective perspectives. The paper is motivated by the observation that affective video indexing has yet to fully profit from the standard corpora (data sets) that have benefited conventional forms of video indexing. Affective video indexing faces unique challenges, since viewer-reported affective reactions are difficult to assess. Moreover, affect assessment efforts must be carefully designed in order both to cover the types of affective responses that video content evokes in viewers and to capture the stable and consistent aspects of these responses. We first present background information on affect and multimedia and related work on affective multimedia indexing, including existing corpora. Three dimensions emerge as critical for affective video corpora, and form the basis for our proposed guidelines: the context of viewer response, personal variation among viewers, and the effectiveness and efficiency of corpus creation. Finally, we present examples of three recent corpora and discuss how these corpora make progressive steps towards fulfilling the guidelines.


International Conference on Multimedia and Expo | 2014

Continuous emotion detection using EEG signals and facial expressions

Mohammad Soleymani; Sadjad Asghari-Esfeden; Maja Pantic; Yun Fu

Emotions play an important role in how we select and consume multimedia. Recent advances in affect detection have focused on detecting emotions continuously. In this paper, for the first time, we continuously detect valence from electroencephalogram (EEG) signals and facial expressions in response to videos. Multiple annotators provided valence levels continuously while watching the frontal facial videos of participants who watched short emotional videos. Power spectral features from EEG signals as well as facial fiducial points are used as features to detect valence levels for each frame continuously. We study the correlation between features from EEG and facial expressions and continuous valence. We have also verified our model's performance for emotional highlight detection using emotion recognition from EEG signals. Finally, the results of multimodal fusion between facial expressions and EEG signals are presented. With such models, we will be able to detect spontaneous and subtle affective responses over time and use them for video highlight detection.
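
To make the feature-extraction step concrete, the sketch below computes log band-power features from an EEG window aligned with one video frame using Welch's method; the frequency bands, sampling rate, and window length are common defaults assumed here, not values taken from the paper.

```python
# Sketch: per-frame EEG power spectral features via Welch's method; band
# definitions and parameters are illustrative assumptions.
import numpy as np
from scipy.signal import welch

BANDS = {"theta": (4, 8), "alpha": (8, 12), "beta": (12, 30), "gamma": (30, 45)}

def frame_band_power(eeg_window, fs=256.0):
    """eeg_window: (n_channels, n_samples) EEG aligned with one video frame."""
    freqs, psd = welch(eeg_window, fs=fs, nperseg=min(256, eeg_window.shape[-1]))
    features = []
    for low, high in BANDS.values():
        mask = (freqs >= low) & (freqs < high)
        features.append(np.log(psd[:, mask].mean(axis=-1) + 1e-12))  # log band power
    return np.concatenate(features)              # one feature vector per frame
```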

Collaboration


Dive into Mohammad Soleymani's collaborations.

Top Co-Authors

Maja Pantic | Imperial College London

Martha Larson | Delft University of Technology