Publication


Featured research published by Mohammed E. Hoque.


Ubiquitous Computing | 2013

MACH: my automated conversation coach

Mohammed E. Hoque; Matthieu Courgeon; Jean-Claude Martin; Bilge Mutlu; Rosalind W. Picard

MACH--My Automated Conversation coacH--is a novel system that provides ubiquitous access to social skills training. The system includes a virtual agent that reads facial expressions, speech, and prosody and responds with verbal and nonverbal behaviors in real time. This paper presents an application of MACH in the context of training for job interviews. During the training, MACH asks interview questions, automatically mimics certain behaviors issued by the user, and exhibits appropriate nonverbal behaviors. Following the interaction, MACH provides visual feedback on the user's performance. The development of this application draws on data from 28 interview sessions involving employment-seeking students and career counselors. The effectiveness of MACH was assessed through a weeklong trial with 90 MIT undergraduates. Students who interacted with MACH were rated by human experts to have improved in overall interview performance, while the ratings of students in control groups did not improve. Post-experiment interviews indicate that participants found the interview experience informative about their behaviors and expressed interest in using MACH in the future.
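The real-time behavior described above (sense the user each frame, mirror selected behaviors after a short delay, and log observations for post-session feedback) can be pictured as a simple loop. The sketch below is an assumed illustration only, not the MACH implementation; sense_user and agent_act are hypothetical placeholders.

```python
# A minimal, assumed sketch of a sense-mirror-log loop for a conversation
# coach. All functions here are hypothetical stand-ins, not MACH code.
import time
from collections import deque

def sense_user():
    """Hypothetical stand-in for face/prosody sensing (smile level, nod, speaking)."""
    return {"smile": 0.8, "nod": False, "speaking": True}

def agent_act(behavior):
    """Hypothetical stand-in that would drive the virtual agent's animation/speech."""
    print("agent:", behavior)

MIMIC_DELAY_S = 0.1   # mirror the user slightly later so it feels responsive, not robotic
pending = deque()     # behaviors waiting to be mirrored back
log = []              # per-frame observations for end-of-session visual feedback

for _ in range(10):                               # a few frames of the loop, for illustration
    now = time.monotonic()
    obs = sense_user()
    log.append(obs)

    if obs["nod"] or obs["smile"] > 0.5:          # behaviors worth mirroring
        pending.append((now + MIMIC_DELAY_S, "mirror smile/nod"))

    while pending and pending[0][0] <= now:       # act once the delay has elapsed
        agent_act(pending.popleft()[1])

    time.sleep(0.03)                              # roughly 30 fps sensing cadence

print("frames logged for feedback:", len(log))
```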


IEEE Transactions on Affective Computing | 2012

Exploring Temporal Patterns in Classifying Frustrated and Delighted Smiles

Mohammed E. Hoque; Daniel McDuff; Rosalind W. Picard

We create two experimental situations to elicit two affective states: frustration and delight. In the first experiment, participants were asked to recall situations while expressing either delight or frustration, while the second experiment tried to elicit these states naturally through a frustrating experience and through a delightful video. There were two significant differences in the nature of the acted versus natural occurrences of expressions. First, the acted instances were much easier for the computer to classify. Second, in 90 percent of the acted cases, participants did not smile when frustrated, whereas in 90 percent of the natural cases, participants smiled during the frustrating interaction, despite self-reporting significant frustration with the experience. As a follow-up study, we developed an automated system to distinguish between naturally occurring spontaneous smiles under frustrating and delightful stimuli by exploring their temporal patterns, given video of both. We extracted local and global features related to human smile dynamics. Next, we evaluated and compared two variants of Support Vector Machines (SVM), Hidden Markov Models (HMM), and Hidden-state Conditional Random Fields (HCRF) for binary classification. While human classification of the smile videos under frustrating stimuli was below chance, an accuracy of 92 percent in distinguishing smiles under frustrating and delightful stimuli was obtained using a dynamic SVM classifier.
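As a rough illustration of this kind of classification setup, the sketch below trains an SVM on global temporal features of smile-intensity tracks. The feature set and the synthetic data are placeholders standing in for the paper's extracted smile dynamics; this is not the authors' code.

```python
# A minimal sketch: classify smile-intensity time series as "delighted" vs
# "frustrated" from simple global temporal features, using an SVM.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def global_smile_features(intensity):
    """Summarize a smile-intensity track (0..1 per frame) with global statistics."""
    d = np.diff(intensity)
    return np.array([
        intensity.mean(),                 # average intensity
        intensity.max(),                  # peak intensity
        (intensity > 0.5).mean(),         # fraction of frames smiling strongly
        d[d > 0].sum(),                   # total rise (onset energy)
        -d[d < 0].sum(),                  # total decay (offset energy)
    ])

def synth(kind, n=60):
    """Synthetic stand-in tracks: delighted smiles ramp up smoothly,
    frustrated smiles are shorter and choppier."""
    t = np.linspace(0, 1, n)
    base = np.clip(np.sin(np.pi * t), 0, 1) if kind == "delight" else \
           np.clip(0.6 * np.sin(4 * np.pi * t), 0, 1)
    return np.clip(base + 0.05 * rng.standard_normal(n), 0, 1)

X = np.array([global_smile_features(synth(k))
              for k in ["delight"] * 40 + ["frustration"] * 40])
y = np.array([1] * 40 + [0] * 40)

clf = SVC(kernel="rbf", C=1.0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```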


Intelligent User Interfaces | 2015

Rhema: A Real-Time In-Situ Intelligent Interface to Help People with Public Speaking

M. Iftekhar Tanveer; Emy Lin; Mohammed E. Hoque

A large number of people rate public speaking as their top fear. What if these individuals were given an intelligent interface that provides live feedback on their speaking skills? In this paper, we present Rhema, an intelligent user interface for Google Glass to help people with public speaking. The interface automatically detects the speaker's volume and speaking rate in real time and provides feedback during the actual delivery of speech. While designing the interface, we experimented with two different strategies of information delivery: 1) continuous streams of information, and 2) sparse delivery of recommendations. We evaluated our interface with 30 native English speakers. Each participant presented three speeches (avg. duration 3 minutes) with the two feedback strategies (continuous, sparse) and a baseline (no feedback) in a random order. The participants were significantly more pleased (p < 0.05) with their speech when using the sparse feedback strategy than with the continuous strategy or no feedback.
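The two quantities Rhema feeds back, loudness and speaking rate, can be approximated from the microphone signal as sketched below. This is an assumed approximation (RMS energy plus energy-peak counting), not the interface's actual implementation.

```python
# A rough sketch of per-frame loudness (RMS energy) and a crude
# speaking-rate estimate from syllable-like energy bursts.
import numpy as np
from scipy.signal import find_peaks

def frame_rms(audio, sr, frame_ms=50):
    """RMS energy per frame; a proxy for how loudly the speaker is talking."""
    hop = int(sr * frame_ms / 1000)
    n = len(audio) // hop
    frames = audio[: n * hop].reshape(n, hop)
    return np.sqrt((frames ** 2).mean(axis=1))

def speaking_rate(audio, sr):
    """Very rough syllables-per-second estimate: count peaks in the smoothed
    energy envelope (each energy burst roughly corresponds to one syllable)."""
    env = frame_rms(audio, sr, frame_ms=20)
    env = np.convolve(env, np.ones(5) / 5, mode="same")   # smooth the envelope
    peaks, _ = find_peaks(env, height=env.mean(), distance=5)
    return len(peaks) / (len(audio) / sr)

# Example on a synthetic 3-second buffer (stand-in for a Glass mic stream).
sr = 16_000
audio = np.random.default_rng(1).standard_normal(3 * sr) * 0.1
print("mean RMS:", frame_rms(audio, sr).mean())
print("approx. syllables/sec:", speaking_rate(audio, sr))
```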


IEEE International Conference on Automatic Face and Gesture Recognition | 2015

Automated prediction and analysis of job interview performance: The role of what you say and how you say it

Iftekhar Naim; M. Iftekhar Tanveer; Daniel Gildea; Mohammed E. Hoque

Ever wondered why you were rejected for a job despite being a qualified candidate? What went wrong? In this paper, we provide a computational framework to quantify human behavior in the context of job interviews. We build a model by analyzing 138 recorded interview videos (total duration of 10.5 hours) of 69 internship-seeking students from the Massachusetts Institute of Technology (MIT) as they spoke with professional career counselors. Our automated analysis includes facial expressions (e.g., smiles, head gestures), language (e.g., word counts, topic modeling), and prosodic information (e.g., pitch, intonation, pauses) of the interviewees. We derive the ground truth labels by averaging over the ratings of 9 independent judges. Our framework automatically predicts the ratings for interview traits such as excitement, friendliness, and engagement with correlation coefficients of 0.73 or higher, and quantifies the relative importance of prosody, language, and facial expressions. According to our framework, it is recommended to speak more fluently, use fewer filler words, speak as “we” (vs. “I”), use more unique words, and smile more.
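A hedged sketch of this kind of prediction setup: concatenate prosodic, lexical, and facial features per interview, regress the averaged judge ratings, and report the correlation between predicted and actual ratings. The features, regressor, and data below are illustrative placeholders, not the authors' released pipeline.

```python
# Multimodal regression of interview ratings, with correlation as the metric.
import numpy as np
from scipy.stats import pearsonr
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_interviews = 138

# Placeholder multimodal features (real ones: pitch/pauses, word counts/topics,
# smiles/head gestures), concatenated into one vector per interview.
prosody = rng.standard_normal((n_interviews, 10))
language = rng.standard_normal((n_interviews, 20))
facial = rng.standard_normal((n_interviews, 5))
X = np.hstack([prosody, language, facial])

# Ground truth: the average of several judges' ratings, simulated here as a
# noisy linear function of the features.
w = rng.standard_normal(X.shape[1])
y = X @ w + 0.5 * rng.standard_normal(n_interviews)

pred = cross_val_predict(SVR(kernel="linear", C=1.0), X, y, cv=5)
r, _ = pearsonr(y, pred)
print(f"correlation between predicted and judged ratings: r = {r:.2f}")
```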


Intelligent Virtual Agents | 2006

Robust recognition of emotion from speech

Mohammed E. Hoque; Mohammed Yeasin; Max M. Louwerse

This paper presents robust recognition of a subset of emotions by animated agents from salient spoken words. To develop and evaluate the model for each emotion in the chosen subset, both prosodic and acoustic features were used to extract the intonational patterns and correlates of emotion from speech samples. The computed features were projected using a combination of linear projection techniques for a compact and clustered representation of features. The projected features were used to build models of emotions using a set of classifiers organized in a hierarchical fashion. The performance of the models was obtained using a number of classifiers from the WEKA machine learning toolbox. Empirical analysis indicated that the lexical information computed from both the prosodic and acoustic features at the word level yielded robust classification of emotions.
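The two main ideas, a linear projection to a compact feature space followed by a hierarchy of classifiers, can be sketched as follows. PCA and logistic regression here are stand-ins for the paper's projection techniques and WEKA classifiers; the emotion subset and data are assumed purely for illustration.

```python
# Sketch: project word-level prosodic/acoustic features, then classify with a
# two-level hierarchy (coarse arousal split, then a per-group emotion model).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
EMOTIONS = ["anger", "delight", "boredom", "sadness"]   # illustrative subset
HIGH_AROUSAL = {"anger", "delight"}

# Placeholder word-level features (e.g., pitch statistics, energy, durations).
X = rng.standard_normal((400, 30))
y = rng.choice(EMOTIONS, size=400)
high = np.isin(y, list(HIGH_AROUSAL))
X[high] += 1.0                       # give the coarse split some signal

proj = PCA(n_components=10).fit(X)   # compact linear projection
Z = proj.transform(X)

# Level 1: high- vs low-arousal. Level 2: one classifier per arousal group.
coarse = LogisticRegression(max_iter=1000).fit(Z, high)
fine = {
    True: LogisticRegression(max_iter=1000).fit(Z[high], y[high]),
    False: LogisticRegression(max_iter=1000).fit(Z[~high], y[~high]),
}

def predict(z):
    grp = bool(coarse.predict(z.reshape(1, -1))[0])
    return fine[grp].predict(z.reshape(1, -1))[0]

print("example prediction:", predict(Z[0]))
```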


Intelligent Virtual Agents | 2009

When Human Coders (and Machines) Disagree on the Meaning of Facial Affect in Spontaneous Videos

Mohammed E. Hoque; Rana el Kaliouby; Rosalind W. Picard

This paper describes the challenges of getting ground truth affective labels for spontaneous video, and presents implications for systems such as virtual agents that have automated facial analysis capabilities. We first present a dataset from an intelligent tutoring application and describe the most prevalent approach to labeling such data. We then present an alternative labeling approach, which closely models how the majority of automated facial analysis systems are designed. We show that while participants, peers, and trained judges report high inter-rater agreement on expressions of delight, confusion, flow, frustration, boredom, surprise, and neutral when shown the entire 30 minutes of video for each participant, inter-rater agreement drops below chance when human coders are asked to watch and label short 8-second clips for the same set of labels. We also perform discriminative analysis of facial action units for each affective state represented in the clips. The results emphasize that human coders rely heavily on factors such as familiarity with the person and the context of the interaction to correctly infer a person's affective state; without this information, the reliability of both humans and machines in attributing affective labels to spontaneous facial-head movements drops significantly.
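For reference, inter-rater agreement of the kind discussed here is typically measured with a chance-corrected statistic such as Cohen's kappa. The small example below uses fabricated labels purely to show the computation; it is not data from the paper.

```python
# Cohen's kappa between two coders labeling short clips: corrects the raw
# agreement rate for the agreement expected by chance.
from sklearn.metrics import cohen_kappa_score

states = ["delight", "confusion", "flow", "frustration", "boredom", "surprise", "neutral"]

# Fabricated labels for eight 8-second clips, one row per coder.
coder_a = ["delight", "confusion", "flow", "neutral", "boredom", "frustration", "neutral", "surprise"]
coder_b = ["delight", "flow",      "flow", "boredom", "neutral", "frustration", "confusion", "surprise"]

kappa = cohen_kappa_score(coder_a, coder_b, labels=states)
print(f"Cohen's kappa = {kappa:.2f}")   # 0 ~ chance-level agreement, 1 = perfect agreement
```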


Human Factors in Computing Systems | 2009

Lessons from participatory design with adolescents on the autism spectrum

Miriam Madsen; Rana el Kaliouby; Micah Eckhardt; Mohammed E. Hoque; Matthew S. Goodwin; Rosalind W. Picard

Participatory user interface design with adolescent users on the autism spectrum presents a number of unique challenges and opportunities. Through our work developing a system to help autistic adolescents learn to recognize facial expressions, we have learned valuable lessons about software and hardware design issues for this population. These lessons may also be helpful in assimilating iterative user input to customize technology for other populations with special needs.


IEEE Computer | 2014

Rich Nonverbal Sensing Technology for Automated Social Skills Training

Mohammed E. Hoque; Rosalind W. Picard

Automated nonverbal sensing and feedback technologies, such as My Automated Conversation coacH (MACH), can provide a personalized means to better understand, evaluate, and improve human social interaction--for both practical and therapeutic purposes, and to advance future communications research. The first Web extra at http://youtu.be/l3ztu9shfMg discusses My Automated Conversation coacH (MACH), a system for people to practice social interactions in face-to-face scenarios. MACH consists of a 3D character that can see, hear, and make its own decisions in real time. The second Web extra at http://youtu.be/krdwB8bfXLQ discusses software developed at MIT that can be used to help people practice their interpersonal skills until they feel more comfortable with situations such as a job interview or a first date. The software, called MACH (short for My Automated Conversation coacH), uses a computer-generated onscreen face, along with facial, speech, and behavior analysis and synthesis software to simulate face-to-face conversations. It then provides users with feedback on their interactions.


Intelligent User Interfaces | 2016

AutoManner: An Automated Interface for Making Public Speakers Aware of Their Mannerisms

M. Iftekhar Tanveer; Ru Zhao; Kezhen Chen; Zoe Tiet; Mohammed E. Hoque

Many individuals exhibit unconscious body movements called mannerisms while speaking. These repeated movements often distract the audience when not relevant to the verbal context. We present an intelligent interface that can automatically extract human gestures using Microsoft Kinect to make speakers aware of their mannerisms. We use a sparsity-based algorithm, Shift Invariant Sparse Coding, to automatically extract the patterns of body movements. These patterns are displayed in an interface with a subtle question-and-answer-based feedback scheme that draws attention to the speaker's body language. Our formal evaluation with 27 participants shows that the users became aware of their body language after using the system. In addition, when independent observers annotated the accuracy of the algorithm for every extracted pattern, we found that the patterns extracted by our algorithm are significantly (p < 0.001) more accurate than random selection. This provides strong evidence that the algorithm is able to extract human-interpretable body movement patterns. An interactive demo of AutoManner is available at http://tinyurl.com/AutoManner.
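To convey the intuition behind shift-invariant sparse coding, the sketch below recovers the sparse onsets of a repeated movement template by correlation and thresholding. It is a much-simplified stand-in under stated assumptions: the real algorithm learns the templates and their activations jointly from the skeletal data.

```python
# Simplified illustration of the shift-invariant idea: a body-movement signal
# is modeled as a short template placed sparsely at shifted positions, and we
# recover the activation positions by correlation + thresholding.
import numpy as np

rng = np.random.default_rng(0)

# A short "mannerism" template (e.g., a hand flick), repeated sparsely in time.
template = np.hanning(20)
signal = np.zeros(500)
true_onsets = [50, 180, 340, 420]
for t in true_onsets:
    signal[t : t + 20] += template
signal += 0.05 * rng.standard_normal(500)

# Correlate the template against the signal and keep strong, well-separated
# detections: the sparse activation pattern of the repeated movement.
corr = np.correlate(signal, template, mode="valid") / np.linalg.norm(template) ** 2
candidates = np.flatnonzero(corr > 0.7)

onsets = []
for c in candidates:                       # keep one onset per well-separated cluster
    if not onsets or c - onsets[-1] > len(template):
        onsets.append(int(c))

print("recovered onsets:", onsets)         # close to true_onsets above
```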


Conference on Computers and Accessibility | 2008

Analysis of speech properties of neurotypicals and individuals diagnosed with autism and Down syndrome

Mohammed E. Hoque

Many individuals diagnosed with autism and Down syndrome have difficulties producing intelligible speech. Systematic analysis of their voice parameters could lead to a better understanding of the specific challenges they face in achieving proper speech production. In this study, 100 minutes of speech data from natural conversations between neurotypicals and individuals diagnosed with autism or Down syndrome were used. Analysis of their voice parameters yielded new findings across a variety of speech measures. An immediate extension of this work would be to customize this technology so that participants can visualize and control their speech parameters in real time and receive live feedback.
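The kind of per-speaker voice-parameter summaries such an analysis might compute can be sketched as below (frame energy, zero-crossing rate, pause proportion). The measures and thresholds are illustrative assumptions, not those used in the study.

```python
# Hedged sketch: summarize a conversation segment with a few simple
# voice parameters computed from fixed-length frames.
import numpy as np

def voice_parameters(audio, sr, frame_ms=25):
    hop = int(sr * frame_ms / 1000)
    n = len(audio) // hop
    frames = audio[: n * hop].reshape(n, hop)

    energy = np.sqrt((frames ** 2).mean(axis=1))                 # loudness proxy
    zcr = (np.diff(np.sign(frames), axis=1) != 0).mean(axis=1)   # noisiness proxy
    silence = energy < 0.1 * energy.max()                        # crude pause detector

    return {
        "mean_energy": float(energy.mean()),
        "energy_variability": float(energy.std()),
        "mean_zcr": float(zcr.mean()),
        "pause_proportion": float(silence.mean()),
    }

# Example on a synthetic buffer standing in for one conversation segment.
sr = 16_000
audio = np.random.default_rng(2).standard_normal(10 * sr) * 0.1
print(voice_parameters(audio, sr))
```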

Collaboration


Dive into Mohammed E. Hoque's collaborations.

Top Co-Authors

Rosalind W. Picard (Massachusetts Institute of Technology)
Ru Zhao (University of Rochester)
Taylan Sen (University of Rochester)
Rana el Kaliouby (Massachusetts Institute of Technology)
Kamrul Hasan (University of Rochester)