Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Matthew P. Black is active.

Publication


Featured research published by Matthew P. Black.


Multimedia Signal Processing | 2007

A System for Technology Based Assessment of Language and Literacy in Young Children: The Role of Multiple Information Sources

Abeer Alwan; Yijian Bai; Matthew P. Black; Larry Casey; Matteo Gerosa; Markus Iseli; Barbara Jones; Abe Kazemzadeh; Sungbok Lee; Shrikanth Narayanan; Patti Price; Joseph Tepperman; Shizhen Wang

This paper describes the design and realization of an automatic system for assessing and evaluating the language and literacy skills of young children. The system was developed in the context of the TBALL (Technology-Based Assessment of Language and Literacy) project and aims at automatically assessing the English literacy skills of both native speakers of American English and Mexican-American children in grades K-2. The automatic assessments were carried out using appropriate speech recognition and understanding techniques. In this paper, we describe the system with a focus on the role of the multiple sources of information at our disposal. We present the content of the assessment system, discuss issues in creating a child-friendly interface, and explain how to provide suitable feedback to teachers. In addition, we discuss the different assessment modules and the algorithms used for speech analysis.


Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge | 2014

Multimodal Prediction of Affective Dimensions and Depression in Human-Computer Interactions

Rahul Gupta; Nikolaos Malandrakis; Bo Xiao; Tanaya Guha; Maarten Van Segbroeck; Matthew P. Black; Alexandros Potamianos; Shrikanth Narayanan

Depression is one of the most common mood disorders. Technology has the potential to assist in screening and treating people with depression by robustly modeling and tracking the complex behavioral cues associated with the disorder (e.g., speech, language, facial expressions, head movement, body language). Similarly, robust affect recognition is another challenge which stands to benefit from modeling such cues. The Audio/Visual Emotion Challenge (AVEC) aims toward understanding the two phenomena and modeling their correlation with observable cues across several modalities. In this paper, we use multimodal signal processing methodologies to address the two problems using data from human-computer interactions. We develop separate systems for predicting depression levels and affective dimensions, experimenting with several methods for combining the multimodal information. The proposed depression prediction system uses a feature selection approach based on audio, visual, and linguistic cues to predict depression scores for each session. Similarly, we use multiple systems trained on audio and visual cues to predict the affective dimensions in continuous time. Our affect recognition system accounts for context during the frame-wise inference and performs a linear fusion of outcomes from the audio-visual systems. For both problems, our proposed systems outperform the video-feature-based baseline systems. As part of this work, we analyze the role played by each modality in predicting the target variable and provide analytical insights.
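The affect system described above combines audio and visual predictions through a linear fusion step. As a rough illustration of that idea, here is a minimal Python sketch of frame-wise linear fusion with a moving-average smoother standing in for context modeling; the weights, window size, and function names are assumptions, not values from the paper.

```python
# Minimal sketch of frame-wise linear fusion of two continuous-time
# prediction tracks. Weights, window size, and names are illustrative.
import numpy as np

def linear_fusion(pred_audio, pred_video, w_audio=0.6, w_video=0.4):
    """Weighted frame-wise combination of two modality predictions."""
    return w_audio * np.asarray(pred_audio) + w_video * np.asarray(pred_video)

def smooth(pred, window=25):
    """Moving average, a crude stand-in for contextual modeling."""
    kernel = np.ones(window) / window
    return np.convolve(pred, kernel, mode="same")

# Example with two hypothetical 1000-frame arousal tracks.
rng = np.random.default_rng(0)
fused = smooth(linear_fusion(rng.random(1000), rng.random(1000)))
```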


Journal of Speech, Language, and Hearing Research | 2014

The Psychologist as an Interlocutor in Autism Spectrum Disorder Assessment: Insights From a Study of Spontaneous Prosody

Daniel Bone; Chi-Chun Lee; Matthew P. Black; Marian E. Williams; Sungbok Lee; Pat Levitt; Shrikanth Narayanan

PURPOSE: The purpose of this study was to examine relationships between prosodic speech cues and autism spectrum disorder (ASD) severity, hypothesizing a mutually interactive relationship between the speech characteristics of the psychologist and the child. The authors objectively quantified acoustic-prosodic cues of the psychologist and of the child with ASD during spontaneous interaction, establishing a methodology for future large-sample analysis. METHOD: Speech acoustic-prosodic features were semiautomatically derived from segments of semistructured interviews (Autism Diagnostic Observation Schedule, ADOS; Lord, Rutter, DiLavore, & Risi, 1999; Lord et al., 2012) with 28 children who had previously been diagnosed with ASD. Prosody was quantified in terms of intonation, volume, rate, and voice quality. Research hypotheses were tested via correlation as well as hierarchical and predictive regression between ADOS severity and prosodic cues. RESULTS: Automatically extracted speech features demonstrated prosodic characteristics of dyadic interactions. As rated ASD severity increased, both the psychologist and the child demonstrated effects for turn-end pitch slope, and both spoke with atypical voice quality. The psychologist's acoustic cues predicted the child's symptom severity better than did the child's acoustic cues. CONCLUSION: The psychologist, acting as evaluator and interlocutor, was shown to adjust his or her behavior in predictable ways based on the child's social-communicative impairments. The results support future study of speech prosody of both interaction partners during spontaneous conversation, while using automatic computational methods that allow for scalable analysis on much larger corpora.
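Turn-end pitch slope is one of the intonation cues the study reports effects for. A hedged sketch of how such a cue could be computed from an f0 contour follows; the 200 ms window and the plain least-squares line fit are illustrative assumptions, not the authors' exact procedure.

```python
# Hedged sketch: turn-end pitch slope from an f0 contour. The 200 ms
# window and least-squares line fit are assumptions for illustration.
import numpy as np

def turn_end_pitch_slope(f0, times, window_s=0.2):
    """Slope (Hz/s) of a line fit to voiced f0 samples in the final
    `window_s` seconds of a speaker turn; None if too few voiced frames.
    f0 is in Hz with 0 marking unvoiced frames; times are in seconds."""
    f0, times = np.asarray(f0, float), np.asarray(times, float)
    mask = (times >= times[-1] - window_s) & (f0 > 0)
    if mask.sum() < 3:
        return None  # not enough voiced frames for a stable fit
    slope, _ = np.polyfit(times[mask], f0[mask], deg=1)
    return slope

# Example: a falling contour over the last 0.2 s of a turn.
t = np.arange(0, 1.0, 0.01)
print(turn_end_pitch_slope(200 - 50 * t, t))  # approx. -50 Hz/s
```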


International Conference on Multimedia and Expo | 2011

Rachel: Design of an emotionally targeted interactive agent for children with autism

Emily Mower; Matthew P. Black; Elisa Flores; Marian E. Williams; Shrikanth Narayanan

Increasingly, multimodal human-computer interactive tools are leveraged in both autism research and therapies. Embodied conversational agents (ECAs) are employed to facilitate the collection of socio-emotional interactive data from children with autism. In this paper we present an overview of the Rachel system developed at the University of Southern California. The Rachel ECA is designed to elicit and analyze complex, structured, and naturalistic interactions and to encourage affective and social behavior. The pilot studies suggest that this tool can be used to effectively elicit social conversational behavior. This paper presents a description of the multimodal human-computer interaction system and an overview of the collected data. Future work includes utilizing signal processing techniques to provide a quantitative description of the interaction patterns.


Affective Computing and Intelligent Interaction | 2011

"That's aggravating, very aggravating": Is it possible to classify behaviors in couple interactions using automatically derived lexical features?

Panayiotis G. Georgiou; Matthew P. Black; Adam C. Lammert; Brian R. Baucom; Shrikanth Narayanan

Psychology is often grounded in observational studies of human interaction behavior, and hence in human perception and judgment. There are many practical and theoretical challenges in observational practice. Technology holds the promise of mitigating some of these difficulties by assisting in the evaluation of higher-level human behavior. In this work we attempt to address two questions: (1) Does the lexical channel contain the information necessary for such an evaluation? And if so, (2) can such information be captured by a noisy automated transcription process? We utilize a large corpus of couple interaction data, collected in the context of a longitudinal study of couple therapy. In the original study, each spouse was manually evaluated with several session-level behavioral codes (e.g., level of acceptance toward the other spouse). Our results show that both of our research questions can be answered positively and encourage future research into such assistive observational technologies.
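To make the lexical-channel idea concrete, here is a small illustrative pipeline: n-gram features from (possibly noisy, ASR-derived) transcripts feeding a session-level behavior classifier. The scikit-learn components and the toy data are my substitutions, not the models used in the paper.

```python
# Illustrative lexical pipeline: n-gram features from transcripts
# feeding a session-level behavior-code classifier. Models and toy
# data are substitutions, not the paper's exact setup.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

transcripts = [
    "you never listen to me and it is aggravating",
    "i appreciate that you tried to help with this",
]
labels = ["low_acceptance", "high_acceptance"]  # session-level codes

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(transcripts, labels)
print(model.predict(["that's aggravating, very aggravating"]))
```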


Journal of Autism and Developmental Disorders | 2015

Applying Machine Learning to Facilitate Autism Diagnostics: Pitfalls and Promises

Daniel Bone; Matthew S. Goodwin; Matthew P. Black; Chi-Chun Lee; Kartik Audhkhasi; Shrikanth Narayanan

Machine learning has immense potential to enhance diagnostic and intervention research in the behavioral sciences, and may be especially useful in investigations involving the highly prevalent and heterogeneous syndrome of autism spectrum disorder. However, use of machine learning in the absence of clinical domain expertise can be tenuous and lead to misinformed conclusions. To illustrate this concern, the current paper critically evaluates and attempts to reproduce results from two studies (Wall et al. in Transl Psychiatry 2(4):e100, 2012a; PloS One 7(8), 2012b) that claim to drastically reduce time to diagnose autism using machine learning. Our failure to generate comparable findings to those reported by Wall and colleagues using larger and more balanced data underscores several conceptual and methodological problems associated with these studies. We conclude with proposed best-practices when using machine learning in autism research, and highlight some especially promising areas for collaborative work at the intersection of computational and behavioral science.
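One safeguard this line of critique points to is evaluating with subject-level cross-validation on class-balanced data, so that a classifier cannot exploit per-child leakage or base-rate skew. The sketch below illustrates that setup with hypothetical data; it is not a reproduction of either study's analysis.

```python
# Hypothetical-data sketch of subject-level cross-validation:
# GroupKFold keeps every instance from a child in a single fold, so
# accuracy cannot be inflated by per-child leakage.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))           # item-level features (hypothetical)
y = rng.integers(0, 2, size=200)         # diagnosis labels, roughly balanced
child_ids = np.repeat(np.arange(50), 4)  # four instances per child

scores = cross_val_score(RandomForestClassifier(random_state=0), X, y,
                         cv=GroupKFold(n_splits=5), groups=child_ids)
print(scores.mean())  # chance-level here, since the data are random
```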


Proceedings of the 2011 Joint ACM Workshop on Human Gesture and Behavior Understanding | 2011

Behavioral signal processing for understanding (distressed) dyadic interactions: some recent developments

Panayiotis G. Georgiou; Matthew P. Black; Shrikanth Narayanan

The expression and experience of human behavior manifestations are complex and are characterized by individual and contextual heterogeneity. Many domains rely on interpreting behavior -- especially behavior that is distressed or atypical -- through the available signals, both overt (e.g., audio-visual data) and covert (e.g., heart rate). This paper describes recent developments in behavioral signal processing aimed at understanding dyadic interactions in couple therapy.


Speech Communication | 2009

Assessment of emerging reading skills in young native speakers and language learners

Patti Price; Joseph Tepperman; Markus Iseli; Thao Duong; Matthew P. Black; Shizhen Wang; Christy Boscardin; P. David Pearson; Shrikanth Narayanan; Abeer Alwan

To automate assessments of beginning readers, especially those still learning English, we have investigated the types of knowledge sources that teachers use and have tried to incorporate them into an automated system. We describe a set of speech recognition and verification experiments and compare teacher scores with automatic scores in order to decide when a novel pronunciation is best viewed as a reading error or as dialect variation. Since no one classroom teacher is expected to be familiar with as many dialect systems as might occur in an urban classroom, making progress in automated assessments in this area can improve the consistency and fairness of reading assessment. We found that automatic methods performed best when the acoustic models were trained on both native and non-native speech, and argue that this training condition is necessary for automatic reading assessment since a child's reading ability is not directly observable in one utterance. We also found assessment of emerging reading skills in young children to be an area ripe for more research.
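The core decision the system faces, whether a novel pronunciation is a reading error or acceptable dialect variation, can be illustrated with a toy verification rule. The acoustic scores here are assumed to come from an external forced-alignment step, and the threshold is arbitrary; only the comparison logic is shown.

```python
# Toy verification rule: accept a reading if any listed pronunciation
# variant (canonical or dialectal) clears a score threshold. Scores are
# assumed to be mean per-frame log-likelihoods from forced alignment;
# the threshold is arbitrary.
def verify_reading(scores_by_variant, threshold=-4.0):
    best = max(scores_by_variant, key=scores_by_variant.get)
    return scores_by_variant[best] >= threshold, best

# "this" read with a dialectal d/dh substitution still passes.
print(verify_reading({"dh ih s": -5.1, "d ih s": -3.2}))
```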


Affective Computing and Intelligent Interaction | 2011

Affective state recognition in married couples' interactions using PCA-based vocal entrainment measures with multiple instance learning

Chi-Chun Lee; Athanasios Katsamanis; Matthew P. Black; Brian R. Baucom; Panayiotis G. Georgiou; Shrikanth Narayanan

Recently there has been an increase in efforts in Behavioral Signal Processing (BSP), which aims to bring quantitative analysis using signal processing techniques to the domain of observational coding. Currently, observational coding in fields such as psychology is based on subjective expert coding of abstract human interaction dynamics. In this work, we use a Multiple Instance Learning (MIL) framework, a saliency-based prediction model, with a signal-driven vocal entrainment measure as the feature to predict the affective state of a spouse in problem-solving interactions. We generate 18 MIL classifiers to capture the variable-length saliency of vocal entrainment, and use a cross-validation scheme with maximum accuracy and mutual information as the metrics to select the best-performing classifier for each testing couple. This method obtains a recognition accuracy of 53.93%, a 2.14% (4.13% relative) improvement over a baseline model using a Support Vector Machine. Furthermore, this MIL-based framework has potential for identifying meaningful regions of interest for further detailed analysis of married couples' interactions.
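As a hedged sketch of the core idea behind a PCA-based entrainment measure: fit a principal subspace to one spouse's turn-level vocal features and ask how much of the other spouse's feature variance that subspace explains. The exact formulation in the paper differs; this shows only the flavor of the computation, with hypothetical data.

```python
# Sketch of a PCA-based entrainment score: the fraction of spouse B's
# feature variance captured by spouse A's principal subspace. The
# paper's exact formulation differs; this is the core idea only.
import numpy as np
from sklearn.decomposition import PCA

def entrainment_score(feats_a, feats_b, n_components=5):
    """feats_*: (frames x dims) vocal features, e.g. f0/energy/MFCCs."""
    pca = PCA(n_components=n_components).fit(feats_a)
    b_centered = feats_b - pca.mean_
    projected = b_centered @ pca.components_.T @ pca.components_
    return (projected ** 2).sum() / (b_centered ** 2).sum()

rng = np.random.default_rng(1)
print(entrainment_score(rng.normal(size=(300, 13)),
                        rng.normal(size=(280, 13))))
```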


Computer Speech & Language | 2014

Intoxicated speech detection: A fusion framework with speaker-normalized hierarchical functionals and GMM supervectors

Daniel Bone; Ming Li; Matthew P. Black; Shrikanth Narayanan

Segmental and suprasegmental speech signal modulations offer information about paralinguistic content such as affect, age and gender, pathology, and speaker state. Speaker state encompasses medium-term, temporary physiological phenomena influenced by internal or external biochemical actions (e.g., sleepiness, alcohol intoxication). Perceptual and computational research indicates that detecting speaker state from speech is a challenging task. In this paper, we present a system constructed with multiple representations of prosodic and spectral features that provided the best result at the Intoxication Subchallenge of Interspeech 2011 on the Alcohol Language Corpus. We discuss the details of each classifier and show that fusion improves performance. We additionally address the question of how best to construct a speaker state detection system in terms of robust and practical marginalization of associated variability, such as through modeling speakers, utterance type, gender, and utterance length. As is the case in human perception, speaker normalization provides significant improvements to our system. We show that a held-out set of baseline (sober) data can be used to achieve gains comparable to those of other speaker normalization techniques. Our fused frame-level statistic-functional systems, fused GMM systems, and final combined system achieve unweighted average recalls (UARs) of 69.7%, 65.1%, and 68.8%, respectively, on the test set. With matched-prompt training, results are more consistent with those on the development set, with UARs of 70.4%, 66.2%, and 71.4%, respectively. The combined system improves over the Challenge baseline by 5.5% absolute (8.4% relative), also improving upon our previously best result.
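The speaker-normalization result suggests a simple recipe: z-normalize each speaker's features with statistics estimated only on their held-out sober (baseline) recordings. A minimal sketch follows; the data layout and function name are assumptions, not the paper's implementation.

```python
# Minimal sketch of baseline-data speaker normalization: z-normalize
# each speaker's features with mean/std estimated only on their
# held-out sober recordings. Data layout is an assumption.
import numpy as np

def baseline_znorm(features_by_speaker, sober_by_speaker):
    """Both arguments: dict speaker_id -> (n_frames x n_dims) array."""
    normalized = {}
    for spk, feats in features_by_speaker.items():
        mu = sober_by_speaker[spk].mean(axis=0)
        sigma = sober_by_speaker[spk].std(axis=0) + 1e-8  # guard zeros
        normalized[spk] = (feats - mu) / sigma
    return normalized
```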

Collaboration


Dive into Matthew P. Black's collaborations.

Top Co-Authors

Shrikanth Narayanan, University of Southern California
Panayiotis G. Georgiou, University of Southern California
Chi-Chun Lee, National Tsing Hua University
Sungbok Lee, University of Southern California
Daniel Bone, University of Southern California
Joseph Tepperman, University of Southern California
Abe Kazemzadeh, University of Southern California