Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Marco Paleari is active.

Publication


Featured research published by Marco Paleari.


Content-Based Multimedia Indexing | 2008

Toward emotion indexing of multimedia excerpts

Marco Paleari; Benoit Huet

Multimedia indexing is about developing techniques allowing people to effectively find media. Content-based methods become necessary when dealing with large databases. Current technology allows exploring the emotional space, which is known to carry very interesting semantic information. In this paper we state the need for an integrated method which extracts reliable affective information and attaches this semantic information to the medium itself. We describe SAMMI [1], a framework explicitly designed to fulfill this need, and we present a list of possible applications, pointing out the advantages that emotional information can bring. Finally, different scenarios are considered for emotion recognition, involving different modalities, feature sets, fusion algorithms, and result-optimization methods such as temporal averaging or thresholding.
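To make the result-optimization step concrete, the following Python sketch (not SAMMI's actual implementation) applies temporal averaging to per-frame emotion scores and attaches a label only when a confidence threshold is passed; the window length, threshold, class count, and synthetic scores are all assumptions.

```python
import numpy as np

def label_excerpt(scores, window=25, threshold=0.6):
    """scores: (n_frames, n_classes) per-frame emotion classifier outputs."""
    kernel = np.ones(window) / window
    # Temporal averaging: smooth each class trajectory independently.
    smoothed = np.vstack([np.convolve(scores[:, c], kernel, mode="same")
                          for c in range(scores.shape[1])]).T
    mean_scores = smoothed.mean(axis=0)
    best = int(np.argmax(mean_scores))
    # Thresholding: attach a label only when the evidence is strong enough.
    return best if mean_scores[best] >= threshold else None

rng = np.random.default_rng(0)
frame_scores = rng.random((200, 6))   # 200 frames, 6 hypothetical classes
frame_scores[:, 2] += 0.4             # make one emotion dominate the excerpt
print(label_excerpt(frame_scores))    # -> 2
```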


Proceedings of the 1st ACM International Workshop on Human-Centered Multimedia | 2006

Toward multimodal fusion of affective cues

Marco Paleari; Christine L. Lisetti

During face-to-face communication, it has been suggested that as much as 70% of what people communicate when talking directly with others is conveyed through paralanguage, involving multiple modalities combined together (e.g. voice tone and volume, body language). In an attempt to render human-computer interaction more similar to human-human communication and enhance its naturalness, research on sensory acquisition and interpretation of single modalities of human expression has seen ongoing progress over the last decade. This progress is rendering current research on artificial sensor fusion of multiple modalities an increasingly important research domain, in order to reach better accuracy for congruent messages on the one hand, and possibly to be able to detect incongruent messages across multiple modalities on the other (incongruence being itself a message about the nature of the information being conveyed). Accurate interpretation of emotional signals - quintessentially multimodal - would hence particularly benefit from multimodal sensor fusion and interpretation algorithms. In this paper we provide a state of the art of multimodal fusion and describe one way to implement a generic framework for multimodal emotion recognition. The system is developed within the MAUI framework [31] and Scherer's Component Process Theory (CPT) [49, 50, 51, 24, 52], with the goal of being modular and adaptive. We want the designed framework to be able to accept different single- and multi-modality recognition systems and to automatically adapt the fusion algorithm to find optimal solutions. The system also aims to be adaptive to channel (and system) reliability.
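As a minimal illustration of reliability-adaptive fusion of the kind described above, the Python sketch below combines unimodal posteriors with weights proportional to each channel's estimated reliability; the modality names, reliability values, and posteriors are invented for the example and do not come from the paper.

```python
import numpy as np

def fuse(posteriors, reliabilities):
    """Reliability-weighted average of unimodal class posteriors."""
    w = np.array([reliabilities[m] for m in posteriors])
    w = w / w.sum()                     # normalize channel weights
    stacked = np.vstack(list(posteriors.values()))
    return w @ stacked                  # (n_channels,) @ (n_channels, n_classes)

face = np.array([0.7, 0.2, 0.1])    # hypothetical facial-expression posterior
voice = np.array([0.3, 0.5, 0.2])   # hypothetical vocal-prosody posterior
fused = fuse({"face": face, "voice": voice},
             {"face": 0.9, "voice": 0.4})   # voice channel deemed less reliable
print(fused, fused.argmax())
```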


Conference on Multimedia Modeling | 2009

Evidence Theory-Based Multimodal Emotion Recognition

Marco Paleari; Rachid Benmokhtar; Benoit Huet

Automatic recognition of human affective states is still a largely unexplored and challenging topic. Even more issues arise when dealing with variable quality of the inputs or when aiming for real-time, unconstrained, person-independent scenarios. In this paper, we explore audio-visual multimodal emotion recognition. We present SAMMI, a framework designed to extract real-time emotion appraisals from non-prototypical, person-independent facial expressions and vocal prosody. Different probabilistic methods for fusion are compared and evaluated alongside a novel fusion technique called NNET. Results show that NNET can improve the recognition score (CR+) by about 19% and the mean average precision by about 30% with respect to the best unimodal system.
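The abstract does not detail the NNET architecture, so the sketch below shows a generic neural-network late-fusion scheme in the same spirit: a small MLP is trained on the concatenated outputs of two unimodal classifiers. The synthetic scores, network size, and train/test split are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
n, n_classes = 600, 6
labels = rng.integers(0, n_classes, n)

def unimodal_scores(noise):
    """Simulated classifier outputs: noisy one-hot evidence per sample."""
    s = np.eye(n_classes)[labels] + noise * rng.random((n, n_classes))
    return s / s.sum(axis=1, keepdims=True)

audio, video = unimodal_scores(0.8), unimodal_scores(0.5)
X = np.hstack([audio, video])       # concatenated unimodal outputs
fusion = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
fusion.fit(X[:500], labels[:500])   # train the fusion network
print("fused held-out accuracy:", fusion.score(X[500:], labels[500:]))
```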


IEEE Conference on Cybernetics and Intelligent Systems | 2010

Features for multimodal emotion recognition: An extensive study

Marco Paleari; Ryad Chellali; Benoit Huet

The ability to recognize emotions in natural human communications is known to be very important for mankind. In recent years, a considerable number of researchers have investigated techniques allowing computers to replicate this capability by analyzing both prosody (voice) and facial expressions. The applications of the resulting systems are manifold and range from gaming to indexing and retrieval, through chat and health care. To the best of our knowledge, no study has ever reported results comparing the effectiveness of several features for automatic emotion recognition. In this work, we present an extensive study on feature selection for automatic, audio-visual, real-time, and person-independent emotion recognition. More than 300,000 different neural networks have been trained in order to compare the performance of 64 features and 11 different feature sets under 450 different analysis settings. Results show that: 1) to build an optimal emotion recognition system, different emotions should be classified via different features, and 2) different features, in general, require different processing.
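A toy version of this comparison protocol, assuming synthetic data and invented feature-set names, might look like the following: one small network is trained per feature set for a single emotion, and the held-out accuracies are compared.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
n = 400
y = rng.integers(0, 2, n)           # one-vs-rest labels for a single emotion
feature_sets = {                    # synthetic, mildly informative features
    "prosody": rng.random((n, 10)) + 0.3 * y[:, None],
    "landmarks": rng.random((n, 20)) + 0.1 * y[:, None],
}
for name, X in feature_sets.items():
    net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=1000, random_state=0)
    net.fit(X[:300], y[:300])
    print(f"{name}: held-out accuracy = {net.score(X[300:], y[300:]):.2f}")
```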


PLOS ONE | 2014

Quantifying Forearm Muscle Activity during Wrist and Finger Movements by Means of Multi-Channel Electromyography

Marco Gazzoni; Nicolo Celadon; Davide Mastrapasqua; Marco Paleari; Valentina Margaria; Paolo Ariano

The study of hand and finger movement is an important topic with applications in prosthetics, rehabilitation, and ergonomics. Surface electromyography (sEMG) is the gold standard for the analysis of muscle activation. Previous studies investigated the optimal electrode number and positioning on the forearm to obtain information representative of muscle activation and robust to movements. However, the sEMG spatial distribution on the forearm during hand and finger movements and its changes due to different hand positions have never been quantified. The aim of this work is to quantify 1) the spatial localization of surface EMG activity of distinct forearm muscles during dynamic free movements of the wrist and single fingers and 2) the effect of hand position on sEMG activity distribution. The subjects performed cyclic dynamic tasks involving the wrist and the fingers. The wrist tasks and the hand opening/closing task were performed with the hand in prone and neutral positions. A sensorized glove was used for kinematics recording. sEMG signals were acquired from the forearm muscles using a grid of 112 electrodes integrated into a stretchable textile sleeve. The areas of sEMG activity have been identified by a segmentation technique after a data dimensionality reduction step based on Non-Negative Matrix Factorization applied to the EMG envelopes. The results show that 1) it is possible to identify distinct areas of sEMG activity on the forearm for different fingers and 2) hand position influences sEMG activity level and spatial distribution. This work gives new quantitative information about sEMG activity distribution on the forearm in healthy subjects and provides a basis for future works on the identification of optimal electrode configurations for sEMG-based control of prostheses, exoskeletons, or orthoses. An example of using this information to optimize the detection system for estimating joint kinematics from sEMG is reported.
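The dimensionality-reduction step can be sketched as follows, assuming synthetic envelopes and a hypothetical two-source layout on the 112-electrode grid: Non-Negative Matrix Factorization splits the envelope matrix into spatial maps and activation signals, and a simple threshold segments each map into an active electrode area.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(3)
n_electrodes, n_samples = 112, 1000
# Synthetic envelopes: two spatially localized sources plus noise.
spatial = np.zeros((n_electrodes, 2))
spatial[10:30, 0] = 1.0             # hypothetical area for one muscle
spatial[60:85, 1] = 1.0             # hypothetical area for another muscle
activation = np.abs(rng.standard_normal((2, n_samples)))
envelopes = spatial @ activation + 0.05 * rng.random((n_electrodes, n_samples))

model = NMF(n_components=2, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(envelopes)  # (electrodes, components): spatial maps
H = model.components_               # (components, time): activation signals
# Segment each spatial map by thresholding to find the active electrode area.
for c in range(W.shape[1]):
    active = np.flatnonzero(W[:, c] > 0.5 * W[:, c].max())
    print(f"component {c}: electrodes {active.min()}..{active.max()}")
```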


International Conference on Image Processing | 2008

A multimodal approach to music transcription

Marco Paleari; Benoit Huet; Antony Schutz; Dirk T. M. Slock

Music transcription refers to the extraction of a human-readable and interpretable description from a recording of a music performance. Automatic music transcription remains a challenging research problem when dealing with polyphonic sounds or when certain constraints are removed. Some instruments, like guitars and violins, add ambiguity to the problem, as the same note can be played at different positions. When dealing with guitar music, tablatures are often preferred to the usual music score, as they present information in a more accessible way. Here, we address this issue with a system which uses the visual modality to support traditional audio transcription techniques. The system is composed of four modules, which have been implemented and evaluated: a system which tracks the position of the fretboard on a video stream, a system which automatically detects the position of the guitar on the first fret to initialize the first system, a system which detects the position of the hand on the guitar, and finally a system which fuses the visual and audio information to extract a tablature. Results show that this kind of multimodal approach can easily disambiguate 89% of notes in a deterministic way.
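The audio-visual disambiguation idea can be illustrated with a short sketch: on a standard-tuned guitar a detected pitch maps to several (string, fret) candidates, and a visually estimated hand position selects among them. The tuning table is standard; the hand-position value and the nearest-fret selection rule are assumptions, not the paper's fusion module.

```python
OPEN_STRINGS = [40, 45, 50, 55, 59, 64]   # standard tuning E2 A2 D3 G3 B3 E4 (MIDI)
N_FRETS = 19

def candidates(midi_note):
    """All (string, fret) pairs that produce the given MIDI note."""
    return [(s, midi_note - open_note)
            for s, open_note in enumerate(OPEN_STRINGS)
            if 0 <= midi_note - open_note <= N_FRETS]

def disambiguate(midi_note, hand_fret):
    """Pick the candidate whose fret lies closest to the detected hand position."""
    return min(candidates(midi_note), key=lambda sf: abs(sf[1] - hand_fret))

print(candidates(64))                 # E4 is playable at several positions
print(disambiguate(64, hand_fret=5))  # the visual cue selects one of them
```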


IEEE International Workshop on Advances in Sensors and Interfaces | 2013

A wireless address-event representation system for ATC-based multi-channel force wireless transmission

Paolo Motto Ros; Marco Paleari; Nicolo Celadon; Alessandro Sanginario; Alberto Bonanno; Marco Crepaldi; Paolo Ariano; Danilo Demarchi

This paper extends Average Threshold Crossing (ATC) wireless transmission to the multi-channel case by using Address-Event Representation (AER) to convey information. The information is encoded in the timing of the transmitted packets, which in turn carry the identifier of the event source. By integrating an Impulse Radio Ultra-Wide Band (IR-UWB) transmitter and choosing the proper protocol and modulation, we can aim to minimize power consumption and provide error detection. The whole system, fully asynchronous, has been implemented in a full-custom chip; besides having multiple independent inputs, it can be configured both to deploy a multi-chip system (with a single receiver) and to optimize wireless transmission parameters. The paper concludes with additional theoretical simulations of the ATC scheme to justify further analyses for our specific application area, which regards movement recognition.
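A highly simplified software model of the AER scheme, with an assumed one-byte packet format and channel count, is sketched below: each event packet carries only the source address, and per-channel information is recovered from packet timing.

```python
import struct

N_CHANNELS = 8                        # assumed channel count

def encode_event(channel):
    """Pack a source-channel address into a one-byte AER packet."""
    assert 0 <= channel < N_CHANNELS
    return struct.pack("B", channel)

def decode_stream(timed_packets):
    """Recover (time, channel) events; per-channel rate reflects activity."""
    return [(t, struct.unpack("B", p)[0]) for t, p in timed_packets]

# Hypothetical received stream: arrival times in seconds plus raw packets.
stream = [(0.001, encode_event(2)), (0.004, encode_event(2)),
          (0.005, encode_event(7))]
print(decode_stream(stream))          # channel 2 crosses the threshold more often
```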


SAMT'07: Proceedings of the 2nd International Conference on Semantic and Digital Media Technologies (Semantic Multimedia) | 2007

SAMMI: semantic affect-enhanced multimedia indexing

Marco Paleari; Benoit Huet; Brian Duffy

Multimedia indexing is about developing techniques allowing people to effectively find media. Content-based methods become necessary when dealing with big databases. Current technology allows exploring the emotional space which is known to carry very interesting semantic information. In this paper we state the need for an integrated method which extracts reliable affective information and attaches this semantic information to the medium itself. We present a list of possible applications and advantages that the emotional information can bring about, together with a framework called SAMMI and the preliminary results of this newly initiated research work.


International Conference on Social Robotics | 2010

Bimodal emotion recognition

Marco Paleari; Ryad Chellali; Benoit Huet

When interacting with robots, we show a plethora of affective reactions typical of natural communication. Indeed, emotions are embedded in our communications and represent a predominant channel for conveying relevant, high-impact information. In recent years, more and more researchers have tried to exploit this channel for human-robot interaction (HRI) and human-computer interaction (HCI). Two key abilities are needed for this purpose: the ability to display emotions and the ability to automatically recognize them. In this work we present our system for computer-based automatic recognition of emotions and the new results we obtained on a small dataset of quasi-unconstrained emotional videos extracted from TV series and movies. The results are encouraging, showing a recognition rate of about 74%.


Multimedia Signal Processing | 2009

Face dynamics for biometric people recognition

Marco Paleari; Carmelo Velardo; Benoit Huet; Jean-Luc Dugelay

Biometric systems have gained the attention of both the research community and industry, becoming an important topic in real application scenarios. Face recognition is, together with fingerprint, among the most used techniques: it is natural for humans to recognize people from facial appearance, the technology is mature, and, unlike fingerprint, it is completely unintrusive. Existing systems focus only on the appearance of subjects, considering facial expressions an obstacle to their aim. On the other hand, such systems present several limitations when dealing with variable illumination conditions, head pose, day-to-day variations (e.g. beard, glasses, or make-up), etc. Furthermore, most current techniques do not exploit dynamics to detect the liveness of the tested subjects. In this paper we present a study on person recognition from the dynamics of facial feature points. The aim of this work is to demonstrate that the dynamics of facial expressions can be seen as a biometric characteristic. Therefore, only dynamic characteristics are considered, and the adopted features are purged of all appearance information. The results clearly show that relevant biometric information can be extracted from facial expressions and other dynamics of the face.
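A minimal sketch of the underlying idea, with an assumed landmark count and an invented motion-statistics descriptor (not the paper's exact features), could look like this: static shape is subtracted from tracked facial feature points so that only the dynamics remain, and per-point velocity statistics form the biometric vector.

```python
import numpy as np

def dynamic_descriptor(landmarks):
    """landmarks: (n_frames, n_points, 2) tracked facial feature points."""
    # Remove static shape: subtract each point's temporal mean so that only
    # motion around the rest position remains (appearance is purged).
    motion = landmarks - landmarks.mean(axis=0, keepdims=True)
    velocity = np.diff(motion, axis=0)
    # Summarize per-point motion statistics into one fixed-length vector.
    return np.concatenate([velocity.std(axis=0).ravel(),
                           np.abs(velocity).mean(axis=0).ravel()])

rng = np.random.default_rng(4)
track = rng.random((120, 20, 2))          # 120 frames, 20 landmarks
print(dynamic_descriptor(track).shape)    # -> (80,) biometric vector
```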

Collaboration


Dive into Marco Paleari's collaborations.

Top Co-Authors

Paolo Ariano | Istituto Italiano di Tecnologia
Alain Favetto | Istituto Italiano di Tecnologia
Nicolo Celadon | Istituto Italiano di Tecnologia
Ryad Chellali | Istituto Italiano di Tecnologia
Alessandro Sanginario | Istituto Italiano di Tecnologia
Andrea Lince | Istituto Italiano di Tecnologia
Marco Crepaldi | Istituto Italiano di Tecnologia
Michela Di Girolamo | Istituto Italiano di Tecnologia
Paolo Motto Ros | Istituto Italiano di Tecnologia