Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Ognjen Rudovic is active.

Publication


Featured research published by Ognjen Rudovic.


Computer Vision and Pattern Recognition | 2016

Copula Ordinal Regression for Joint Estimation of Facial Action Unit Intensity

Robert Walecki; Ognjen Rudovic; Vladimir Pavlovic; Maja Pantic

Joint modeling of the intensity of facial action units (AUs) from face images is challenging due to the large number of AUs (30+) and their intensity levels (6). This is in part due to the lack of suitable models that can efficiently handle such a large number of outputs/classes simultaneously, but also due to the lack of labelled target data. For this reason, the majority of the methods proposed so far resort to independent classifiers for the AU intensity. This is suboptimal for at least two reasons: the facial appearance of some AUs changes depending on the intensity of other AUs, and some AUs co-occur more often than others. Encoding this is expected to improve the estimation of target AU intensities, especially in the case of noisy image features, head-pose variations and imbalanced training data. To this end, we introduce a novel modeling framework, Copula Ordinal Regression (COR), that leverages the power of copula functions and CRFs to disentangle the probabilistic modeling of AU dependencies from the marginal modeling of the AU intensity. Consequently, the COR model achieves the joint learning and inference of intensities of multiple AUs, while being computationally tractable. We show on two challenging datasets of naturalistic facial expressions that the proposed approach consistently outperforms (i) independent modeling of AU intensities, and (ii) the state-of-the-art approach for the target task.
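To make the copula idea concrete, here is a minimal sketch (not the authors' code; the marginal CDFs and the correlation are toy values) of how a bivariate Gaussian copula couples two ordinal AU-intensity marginals, so that co-occurring intensity levels receive higher joint probability than under independence. In COR, such pairwise copulas form the cliques of a CRF over all AUs.

import numpy as np
from scipy.stats import norm, multivariate_normal

def gaussian_copula_cdf(u, v, rho):
    """C(u, v) = Phi2(Phi^-1(u), Phi^-1(v); rho), clipped for numerical stability."""
    u = np.clip(u, 1e-9, 1 - 1e-9)
    v = np.clip(v, 1e-9, 1 - 1e-9)
    cov = [[1.0, rho], [rho, 1.0]]
    return multivariate_normal(mean=[0, 0], cov=cov).cdf([norm.ppf(u), norm.ppf(v)])

def joint_pmf(F1, F2, rho):
    """Joint PMF over intensity pairs via rectangle differences of the copula."""
    P = np.zeros((len(F1), len(F2)))
    for a in range(len(F1)):
        for b in range(len(F2)):
            ua, ua_prev = F1[a], (F1[a - 1] if a > 0 else 0.0)
            vb, vb_prev = F2[b], (F2[b - 1] if b > 0 else 0.0)
            P[a, b] = (gaussian_copula_cdf(ua, vb, rho)
                       - gaussian_copula_cdf(ua_prev, vb, rho)
                       - gaussian_copula_cdf(ua, vb_prev, rho)
                       + gaussian_copula_cdf(ua_prev, vb_prev, rho))
    return P

# Toy marginal CDFs over 6 intensity levels (0..5) for two AUs.
F1 = np.array([0.5, 0.7, 0.8, 0.9, 0.97, 1.0])
F2 = np.array([0.4, 0.65, 0.8, 0.9, 0.96, 1.0])
P = joint_pmf(F1, F2, rho=0.6)  # positive dependence between the two AUs
print(P.sum())                  # ~1.0: a valid joint distribution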


Computer Vision and Pattern Recognition | 2017

Personalized Automatic Estimation of Self-Reported Pain Intensity from Facial Expressions

Daniel Lopez Martinez; Ognjen Rudovic; Rosalind W. Picard

Pain is a personal, subjective experience that is commonly evaluated through visual analog scales (VAS). While this is often convenient and useful, automatic pain detection systems can reduce pain score acquisition efforts in large-scale studies by estimating it directly from the participants' facial expressions. In this paper, we propose a novel two-stage learning approach for VAS estimation: first, our algorithm employs Recurrent Neural Networks (RNNs) to automatically estimate Prkachin and Solomon Pain Intensity (PSPI) levels from face images. The estimated scores are then fed into personalized Hidden Conditional Random Fields (HCRFs), which estimate the VAS provided by each person. Personalization of the model is performed using a newly introduced facial expressiveness score, unique to each person. To the best of our knowledge, this is the first approach to automatically estimate VAS from face images. We show the benefits of the proposed personalized approach over the traditional non-personalized approach on a benchmark dataset for pain analysis from face images.
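A minimal sketch of the two-stage pipeline follows, under assumed feature shapes. This is not the authors' code: a plain personalized regressor stands in for the HCRF second stage to keep the sketch short, and the expressiveness scalar is a made-up per-person input.

import torch
import torch.nn as nn

class PSPIEstimator(nn.Module):
    """Stage 1: per-frame PSPI scores from a sequence of face descriptors."""
    def __init__(self, feat_dim=128, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                # x: (batch, frames, feat_dim)
        h, _ = self.rnn(x)
        return self.head(h).squeeze(-1)  # (batch, frames) PSPI per frame

class VASRegressor(nn.Module):
    """Stage 2 stand-in: sequence-level VAS from PSPI stats + expressiveness."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 1)  # mean, max, std of PSPI + expressiveness

    def forward(self, pspi, expressiveness):  # pspi: (batch, frames)
        stats = torch.stack([pspi.mean(1), pspi.max(1).values,
                             pspi.std(1), expressiveness], dim=1)
        return self.fc(stats).squeeze(-1)      # (batch,) VAS estimates

stage1, stage2 = PSPIEstimator(), VASRegressor()
x = torch.randn(2, 100, 128)                       # 2 toy videos, 100 frames each
vas = stage2(stage1(x), torch.tensor([0.8, 1.2]))  # per-person expressiveness
print(vas.shape)                                   # torch.Size([2])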


Computer Vision and Pattern Recognition | 2016

Gaussian Process Domain Experts for Model Adaptation in Facial Behavior Analysis

Stefanos Eleftheriadis; Ognjen Rudovic; Marc Peter Deisenroth; Maja Pantic

We present a novel approach for supervised domain adaptation that is based upon the probabilistic framework of Gaussian processes (GPs). Specifically, we introduce domain-specific GPs as local experts for facial expression classification from face images. The adaptation of the classifier is facilitated in a probabilistic fashion by conditioning the target expert on multiple source experts. Furthermore, in contrast to existing adaptation approaches, we also learn a target expert solely from the available target data. A single, confident classifier is then obtained by combining the predictions from multiple experts based on their confidence. Learning of the model is efficient and requires no retraining/reweighting of the source classifiers. We evaluate the proposed approach on two publicly available datasets for multi-class (MultiPIE) and multi-label (DISFA) facial expression classification. To this end, we perform adaptation of two contextual factors: where (view) and who (subject). We show in our experiments that the proposed approach consistently outperforms both source and target classifiers, while using as few as 30 target examples. It also outperforms the state-of-the-art approaches for supervised domain adaptation.
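The confidence-based combination step can be sketched as follows (toy data, not the authors' code): source and target GP experts each return a predictive mean and variance, and predictions are fused by inverse-variance weighting so the most confident expert dominates at each test point.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def make_expert(X, y):
    return GaussianProcessRegressor(kernel=RBF(1.0), alpha=1e-2).fit(X, y)

# Two source domains and one small target domain (e.g., different subjects).
Xs1 = rng.uniform(-3, 3, (50, 1))
Xs2 = rng.uniform(-3, 3, (50, 1))
Xt = rng.uniform(-3, 3, (30, 1))   # only 30 target examples, as in the paper
experts = [make_expert(Xs1, np.sin(Xs1).ravel()),
           make_expert(Xs2, np.sin(Xs2 + 0.3).ravel()),
           make_expert(Xt, np.sin(Xt + 0.1).ravel())]

X_test = np.linspace(-3, 3, 7).reshape(-1, 1)
means, stds = zip(*(e.predict(X_test, return_std=True) for e in experts))
means, stds = np.array(means), np.array(stds)

w = 1.0 / (stds ** 2)                 # precision = confidence of each expert
w /= w.sum(axis=0, keepdims=True)
fused = (w * means).sum(axis=0)       # variance-weighted fused prediction
print(fused.round(2))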


Human-Robot Interaction | 2017

NAO-Dance Therapy for Children with ASD

Ryo Suzuki; Jaeryoung Lee; Ognjen Rudovic

Children with Autism Spectrum Disorder (ASD) often have a very short attention span, so autism therapy needs to be entertaining to sustain longer interactions. Robots can draw the children's attention, helping them focus on the therapy and learn social skills. In this paper, the NAO robot is used as part of a dance therapy for children with ASD. To explore its effectiveness, we compared three dance settings involving the NAO robot, a therapist, and/or an unfamiliar person. Our results indicate that a robot can be an effective education agent for children with ASD, in particular as part of dance therapy.


Computer Vision and Pattern Recognition | 2017

Deep Structured Learning for Facial Action Unit Intensity Estimation

Robert Walecki; Ognjen Rudovic; Vladimir Pavlovic; Björn W. Schuller; Maja Pantic

We consider the task of automated estimation of facial expression intensity. This involves estimation of multiple output variables (facial action units, AUs) that are structurally dependent. Their structure arises from statistically induced co-occurrence patterns of AU intensity levels. Modeling this structure is critical for improving the estimation performance; however, this performance is bounded by the quality of the input features extracted from face images. The goal of this paper is to model these structures and estimate complex feature representations simultaneously by combining conditional random field (CRF) encoded AU dependencies with deep learning. To this end, we propose a novel Copula CNN deep learning approach for modeling multivariate ordinal variables. Our model accounts for ordinal structure in output variables and their non-linear dependencies via copula functions modeled as cliques of a CRF. These are jointly optimized with deep CNN feature encoding layers using a newly introduced balanced batch iterative training algorithm. We demonstrate the effectiveness of our approach on the task of AU intensity estimation on two benchmark datasets. We show that joint learning of the deep features and the target output structure results in significant performance gains compared to existing structured deep models and deep models for analysis of facial expressions.
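The balanced-batch ingredient can be sketched independently of the copula CRF layers. Below is a minimal, assumed illustration (not the paper's algorithm): batches are drawn with per-example weights inversely proportional to the frequency of each intensity level, so rare high intensities appear as often as the dominant level 0.

import torch
from torch.utils.data import TensorDataset, DataLoader, WeightedRandomSampler

# Toy data: 1000 face-feature vectors with intensity labels 0..5,
# where high intensities are rare (as in real AU data).
feats = torch.randn(1000, 64)
labels = torch.multinomial(torch.tensor([0.5, 0.2, 0.15, 0.08, 0.05, 0.02]),
                           1000, replacement=True)

counts = torch.bincount(labels, minlength=6).float()
weights = 1.0 / counts[labels]          # rare levels get larger sampling weight
sampler = WeightedRandomSampler(weights, num_samples=len(labels),
                                replacement=True)
loader = DataLoader(TensorDataset(feats, labels), batch_size=64,
                    sampler=sampler)

xb, yb = next(iter(loader))
print(torch.bincount(yb, minlength=6))  # roughly uniform over intensity levels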


IEEE Transactions on Image Processing | 2017

Gaussian Process Domain Experts for Modeling of Facial Affect

Stefanos Eleftheriadis; Ognjen Rudovic; Marc Peter Deisenroth; Maja Pantic

Most existing models for facial behavior analysis rely on generic classifiers, which fail to generalize well to previously unseen data. This is because of inherent differences in the source (training) and target (test) data, mainly caused by variation in subjects' facial morphology, camera views, and so on. All of these account for different contexts in which target and source data are recorded, and thus may adversely affect the performance of the models learned solely from source data. In this paper, we exploit the notion of domain adaptation and propose a data-efficient approach to adapt already learned classifiers to new unseen contexts. Specifically, we build upon the probabilistic framework of Gaussian processes (GPs), and introduce domain-specific GP experts (e.g., for each subject). The model adaptation is facilitated in a probabilistic fashion, by conditioning the target expert on the predictions from multiple source experts. We further exploit the predictive variance of each expert to define an optimal weighting during inference. We evaluate the proposed model on three publicly available data sets for multi-class (MultiPIE) and multi-label (DISFA, FERA2015) facial expression analysis by performing adaptation of two contextual factors: “where” (view) and “who” (subject). In our experiments, the proposed approach consistently outperforms: 1) both source and target classifiers, while using a small number of target examples during the adaptation, and 2) related state-of-the-art approaches for supervised domain adaptation.


Machine Learning and Data Mining in Pattern Recognition | 2018

A Mixture of Personalized Experts for Human Affect Estimation

Michael Feffer; Ognjen Rudovic; Rosalind W. Picard

We investigate the personalization of deep convolutional neural networks for facial expression analysis from still images. While prior work has focused on population-based (“one-size-fits-all”) approaches, we formulate and construct personalized models via a mixture-of-experts and supervised domain adaptation approach, showing that this improves greatly upon non-personalized models. Our experiments demonstrate the ability of the personalized model to quickly and effectively adapt to limited amounts of target data. We also provide a novel training methodology and architecture for creating personalized machine learning models for more effective analysis of emotional state.
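A minimal sketch of a mixture-of-experts head (an assumed architecture, not the paper's exact model): a gating network soft-assigns each input to expert branches, e.g., one per subject cluster, and the prediction is the gate-weighted sum of the expert outputs.

import torch
import torch.nn as nn

class MixtureOfExperts(nn.Module):
    def __init__(self, feat_dim=128, n_experts=4, n_classes=7):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Linear(feat_dim, n_classes) for _ in range(n_experts)])
        self.gate = nn.Linear(feat_dim, n_experts)

    def forward(self, x):                                   # x: (batch, feat_dim)
        gates = torch.softmax(self.gate(x), dim=-1)         # (batch, E)
        outs = torch.stack([e(x) for e in self.experts], 1) # (batch, E, C)
        return (gates.unsqueeze(-1) * outs).sum(1)          # (batch, C)

model = MixtureOfExperts()
logits = model(torch.randn(8, 128))  # 8 face-feature vectors (e.g., CNN output)
print(logits.shape)                  # torch.Size([8, 7]) expression logits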


arXiv: Robotics | 2018

Personalized machine learning for robot perception of affect and engagement in autism therapy

Ognjen Rudovic; Jaeryoung Lee; Miles Dai; Björn W. Schuller; Rosalind W. Picard

Personalized machine learning enables robot perception of children’s affective states and engagement during robot-assisted autism therapy. Robots have the potential to facilitate future therapies for children on the autism spectrum. However, existing robots are limited in their ability to automatically perceive and respond to human affect, which is necessary for establishing and maintaining engaging interactions. Their inference challenge is made even harder by the fact that many individuals with autism have atypical and unusually diverse styles of expressing their affective-cognitive states. To tackle the heterogeneity in children with autism, we used the latest advances in deep learning to formulate a personalized machine learning (ML) framework for automatic perception of the children’s affective states and engagement during robot-assisted autism therapy. Instead of using the traditional one-size-fits-all ML approach, we personalized our framework to each child using their contextual information (demographics and behavioral assessment scores) and individual characteristics. We evaluated this framework on a multimodal (audio, video, and autonomic physiology) data set of 35 children (ages 3 to 13) with autism, from two cultures (Asia and Europe), and achieved an average agreement (intraclass correlation) of ~60% with human experts in the estimation of affect and engagement, also outperforming nonpersonalized ML solutions. These results demonstrate the feasibility of robot perception of affect and engagement in children with autism and have implications for the design of future autism therapies.
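One common way to realize such personalization, shown here as a hedged sketch rather than the paper's exact network, is to freeze a backbone trained on all children and fine-tune a small per-child head that also sees the child's contextual features (e.g., demographics and behavioral scores); all dimensions below are made-up toy values.

import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(256, 128), nn.ReLU())  # shared multimodal encoder
for p in backbone.parameters():
    p.requires_grad = False             # freeze after population-level training

class ChildHead(nn.Module):
    """Per-child layer: fused features + contextual info -> affect/engagement."""
    def __init__(self, feat_dim=128, ctx_dim=8, n_outputs=3):
        super().__init__()
        self.fc = nn.Linear(feat_dim + ctx_dim, n_outputs)

    def forward(self, feats, ctx):
        return self.fc(torch.cat([backbone(feats), ctx], dim=-1))

head = ChildHead()                                 # one head per child
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
feats, ctx = torch.randn(16, 256), torch.randn(16, 8)
target = torch.randn(16, 3)                        # e.g., valence, arousal, engagement
loss = nn.functional.mse_loss(head(feats, ctx), target)
loss.backward()
opt.step()                                         # adapts only this child's head
print(loss.item())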


Computer Vision and Pattern Recognition | 2017

DeepSpace: Mood-Based Image Texture Generation for Virtual Reality from Music

Misha Sra; Prashanth Vijayaraghavan; Ognjen Rudovic; Pattie Maes; Deb Roy

Affective virtual spaces are of interest for many VR applications in areas of wellbeing, art, education, and entertainment. Creating content for virtual environments is a laborious task involving multiple skills like 3D modeling, texturing, animation, lighting, and programming. One way to facilitate content creation is to automate sub-processes like the assignment of textures and materials within virtual environments. To this end, we introduce the DeepSpace approach that automatically creates and applies image textures to objects in procedurally created 3D scenes. The main novelty of our DeepSpace approach is that it uses music to automatically create kaleidoscopic textures for virtual environments designed to elicit emotional responses in users. Specifically, DeepSpace exploits the modeling power of deep neural networks, which have shown great performance in image generation tasks, to achieve mood-based image generation. Our study results indicate that the virtual environments created by DeepSpace elicit positive emotions and achieve high presence scores.
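The overall pipeline shape can be caricatured as follows (purely illustrative: the paper uses deep generative networks for the texture stage, whereas this toy stand-in is procedural, and the music-to-mood mapping here is made up): a mood estimate derived from music features conditions a kaleidoscopic texture, with valence mapped to color and arousal to pattern frequency.

import numpy as np

def mood_from_audio(energy, tempo):
    """Toy stand-in for a learned music-to-mood model: valence, arousal in [0, 1]."""
    return np.clip(energy, 0, 1), np.clip(tempo / 200.0, 0, 1)

def kaleidoscope_texture(valence, arousal, size=256, folds=8):
    """Radially symmetric texture whose appearance is conditioned on the mood."""
    y, x = np.mgrid[-1:1:size * 1j, -1:1:size * 1j]
    r, theta = np.hypot(x, y), np.arctan2(y, x)
    theta = np.mod(theta, 2 * np.pi / folds)   # kaleidoscope fold symmetry
    freq = 4 + 20 * arousal                    # higher arousal -> busier pattern
    pattern = np.sin(freq * r + 6 * theta)
    hue = valence * np.ones_like(pattern)      # valence -> color channel
    return np.stack([hue, 0.5 + 0.5 * pattern, 1 - r.clip(0, 1)], axis=-1)

v, a = mood_from_audio(energy=0.7, tempo=140)
tex = kaleidoscope_texture(v, a)
print(tex.shape)   # (256, 256, 3) RGB-like texture to map onto scene objects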


IEEE Transactions on Affective Computing | 2017

A Copula Ordinal Regression Framework for Joint Estimation of Facial Action Unit Intensity

Robert Walecki; Ognjen Rudovic; Vladimir Pavlovic; Maja Pantic

Joint modeling of the intensity of multiple facial action units (AUs) from face images is challenging due to the large number of AUs (30+) and their intensity levels (6). This is in part due to the lack of suitable models that can efficiently handle such a large number of outputs/classes simultaneously, but also due to the lack of suitable data to train the models on. For this reason, the majority of the methods resort to independent classifiers for the AU intensity. This is suboptimal for at least two reasons: the facial appearance of some AUs changes depending on the intensity of other AUs, and some AUs co-occur more often than others. To this end, we propose the Copula Ordinal Regression framework for modeling multivariate ordinal variables. Our model accounts for the ordinal structure of the AU intensity levels and their non-linear dependencies via copula functions.

Collaboration


Dive into Ognjen Rudovic's collaborations.

Top Co-Authors

Maja Pantic
Imperial College London

Rosalind W. Picard
Massachusetts Institute of Technology

Bahram Parvin
Lawrence Berkeley National Laboratory

Darko Koracin
Desert Research Institute