Katherine Forbes-Riley
University of Pittsburgh
Publications
Featured research published by Katherine Forbes-Riley.
Meeting of the Association for Computational Linguistics | 2004
Diane J. Litman; Katherine Forbes-Riley
We examine the utility of speech and lexical features for predicting student emotions in computer-human spoken tutoring dialogues. We first annotate student turns for negative, neutral, positive and mixed emotions. We then extract acoustic-prosodic features from the speech signal, and lexical items from the transcribed or recognized speech. We compare the results of machine learning experiments using these features alone or in combination to predict various categorizations of the annotated student emotions. Our best results yield a 19-36% relative improvement in error reduction over a baseline. Finally, we compare our results with emotion prediction in human-human tutoring dialogues.
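A minimal sketch of this kind of experiment, using scikit-learn rather than the authors' original feature extraction and learner; the turn texts, prosodic values, and labels below are invented for illustration:

```python
# Illustrative sketch: predict student emotion from lexical and
# acoustic-prosodic features. Data, feature names, and learner are
# assumptions, not the authors' corpus or exact setup.
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# One row per student turn: transcribed or recognized text.
texts = [
    "the force is zero",
    "um i guess it accelerates",
    "i do not know",
    "yes gravity pulls it down",
    "maybe the velocity stays constant",
    "i am not sure about that",
]
# Acoustic-prosodic measures per turn, e.g. mean pitch (Hz), mean RMS
# energy, turn duration (s) -- placeholders for real signal features.
prosody = np.array([
    [190.0, 0.61, 1.2],
    [228.0, 0.42, 2.9],
    [205.0, 0.35, 1.0],
    [186.0, 0.70, 1.4],
    [240.0, 0.44, 2.5],
    [232.0, 0.38, 2.2],
])
labels = ["neutral", "negative", "negative", "neutral", "neutral", "negative"]

# Lexical features: bag of words over the turn text.
lexical = CountVectorizer().fit_transform(texts)

# Combine lexical and acoustic-prosodic features into one matrix.
features = hstack([lexical, csr_matrix(prosody)]).tocsr()

# Real experiments would normalize features and use a larger corpus.
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, features, labels, cv=3)
print("mean cross-validation accuracy:", scores.mean())
```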
Artificial Intelligence in Education | 2006
Diane J. Litman; Carolyn Penstein Rosé; Katherine Forbes-Riley; Kurt VanLehn; Dumisizwe Bhembe; Scott Silliman
While human tutors typically interact with students using spoken dialogue, most computer dialogue tutors are text-based. We have conducted 2 experiments comparing typed and spoken tutoring dialogues, one in a human-human scenario, and another in a human-computer scenario. In both experiments, we compared spoken versus typed tutoring for learning gains and time on task, and also measured the correlations of learning gains with dialogue features. Our main results are that changing the modality from text to speech caused large differences in the learning gains, time and superficial dialogue characteristics of human tutoring, but for computer tutoring it made less difference.
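The correlation step can be illustrated with a simple Pearson test; the per-student values here are invented, and the original analyses may use other statistics:

```python
# Sketch of correlating a dialogue feature with learning gain across
# students; values are invented for illustration.
from scipy.stats import pearsonr

# Per-student values: e.g. total student words in the dialogue, and
# normalized learning gain from pretest to posttest.
student_words = [220, 340, 180, 410, 290, 150, 380, 260]
learning_gain = [0.21, 0.35, 0.12, 0.44, 0.30, 0.10, 0.38, 0.25]

r, p = pearsonr(student_words, learning_gain)
print(f"r = {r:.2f}, p = {p:.3f}")
```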
Speech Communication | 2006
Diane J. Litman; Katherine Forbes-Riley
While human tutors respond to both what a student says and to how the student says it, most tutorial dialogue systems cannot detect the student emotions and attitudes underlying an utterance. We present an empirical study investigating the feasibility of recognizing student state in two corpora of spoken tutoring dialogues, one with a human tutor, and one with a computer tutor. We first annotate student turns for negative, neutral and positive student states in both corpora. We then automatically extract acoustic–prosodic features from the student speech, and lexical items from the transcribed or recognized speech. We compare the results of machine learning experiments using these features alone, in combination, and with student and task dependent features, to predict student states. We also compare our results across human–human and human–computer spoken tutoring dialogues. Our results show significant improvements in prediction accuracy over relevant baselines, and provide a first step towards enhancing our intelligent tutoring spoken dialogue system to automatically recognize and adapt to student states.
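A sketch of the feature-set comparison described above (alone, in combination), with synthetic data standing in for the real acoustic-prosodic and lexical features:

```python
# Compare feature sets for student-state prediction via cross-validation.
# Synthetic data only; real features would come from the speech signal
# and (recognized) transcripts, plus student- and task-dependent values.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 60
acoustic = rng.normal(size=(n, 3))          # e.g. pitch, energy, duration
lexical = rng.integers(0, 2, size=(n, 20))  # e.g. binary word indicators
y = rng.integers(0, 2, size=n)              # negative vs. non-negative state

feature_sets = {
    "acoustic-prosodic": acoustic,
    "lexical": lexical,
    "combined": np.hstack([acoustic, lexical]),
}
for name, X in feature_sets.items():
    acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
    print(f"{name}: {acc:.2f}")
```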
Speech Communication | 2011
Katherine Forbes-Riley; Diane J. Litman
We evaluate the performance of a spoken dialogue system that provides substantive dynamic responses to automatically detected user affective states. We then present a detailed system error analysis that reveals challenges for real-time affect detection and adaptation. This research is situated in the tutoring domain, where the user is a student and the spoken dialogue system is a tutor. Our adaptive system detects uncertainty in each student turn via a model that combines a machine learning approach with hedging phrase heuristics; the learned model uses acoustic-prosodic and lexical features extracted from the speech signal, as well as dialogue features. The adaptive system varies its content based on the automatic uncertainty and correctness labels for each turn. Our controlled experimental evaluation shows that the adaptive system yields higher global performance than two non-adaptive control systems, but the difference is only significant for a subset of students. Our system error analysis indicates that noisy affect labeling is a major performance bottleneck, yielding fewer adaptations than expected and thus lower performance than expected. However, the percentage of adaptation received correlates with higher performance across all students. Moreover, when uncertainty is accurately recognized and adapted to, local performance is significantly improved.
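A minimal sketch of such a two-part uncertainty detector; the detect_uncertainty helper, phrase list, and threshold are illustrative assumptions, not the system's actual components:

```python
# Two-part uncertainty detection: a learned model's score over turn
# features, combined with a hedging-phrase heuristic that can override it.
# Phrase list and threshold are invented for illustration.
HEDGE_PHRASES = ("i guess", "i think", "maybe", "not sure", "probably")

def detect_uncertainty(turn_text: str, model_score: float,
                       threshold: float = 0.5) -> bool:
    """Label a student turn uncertain if the transcript contains an
    explicit hedging phrase, or if the classifier score is high enough."""
    text = turn_text.lower()
    if any(phrase in text for phrase in HEDGE_PHRASES):
        return True
    return model_score >= threshold

# model_score stands in for a classifier's posterior computed from
# acoustic-prosodic, lexical, and dialogue features.
print(detect_uncertainty("um i guess it is zero", model_score=0.2))  # True
print(detect_uncertainty("the force is zero", model_score=0.7))      # True
print(detect_uncertainty("the force is zero", model_score=0.3))      # False
```

The heuristic complements the learned model by catching explicit lexical hedges even when the acoustic evidence is weak; the system's adaptation logic would then condition its response on this uncertainty label together with the turn's correctness label.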
User Modeling and User-Adapted Interaction | 2008
Katherine Forbes-Riley; Mihai Rotaru; Diane J. Litman
We hypothesize that student affect is a useful predictor of spoken dialogue system performance, relative to other parameters. We test this hypothesis in the context of our spoken dialogue tutoring system, where student learning is the primary performance metric. We first present our system and corpora, which have been annotated with several student affective states, student correctness and discourse structure. We then discuss unigram and bigram parameters derived from these annotations. The unigram parameters represent each annotation type individually, as well as system-generic features. The bigram parameters represent annotation combinations, including student state sequences and student states in the discourse structure context. We then use these parameters to build learning models. First, we build simple models based on correlations between each of our parameters and learning. Our results suggest that our affect parameters are among our most useful predictors of learning, particularly in specific discourse structure contexts. Next, we use the PARADISE framework (multiple linear regression) to build complex learning models containing only the most useful subset of parameters. Our approach is a value-added one; we perform a number of model-building experiments, both with and without including our affect parameters, and then compare the performance of the models on the training and the test sets. Our results show that when included as inputs, our affect parameters are selected as predictors in most models, and many of these models show high generalizability in testing. Our results also show that overall, the affect-included models significantly outperform the affect-excluded models.
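A sketch of the value-added comparison using ordinary least squares on synthetic data; PARADISE itself applies stepwise multiple linear regression over real corpus parameters:

```python
# Value-added comparison: fit linear models of learning with and without
# affect parameters and compare generalization on held-out data.
# All data here is synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 80
generic = rng.normal(size=(n, 3))  # e.g. turn counts, time on task
affect = rng.normal(size=(n, 2))   # e.g. % uncertain turns, state bigrams
gain = 0.4 * affect[:, 0] + 0.2 * generic[:, 1] + rng.normal(0, 0.5, n)

for name, X in [("affect-excluded", generic),
                ("affect-included", np.hstack([generic, affect]))]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, gain, random_state=0)
    model = LinearRegression().fit(X_tr, y_tr)
    print(f"{name}: test R^2 = {model.score(X_te, y_te):.2f}")
```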
Natural Language Engineering | 2006
Diane J. Litman; Katherine Forbes-Riley
We examine correlations between dialogue behaviors and learning in tutoring, using two corpora of spoken tutoring dialogues: a human-human corpus and a human-computer corpus. To formalize the notion of dialogue behavior, we manually annotate our data using a tagset of student and tutor dialogue acts relative to the tutoring domain. A unigram analysis of our annotated data shows that student learning correlates both with the tutor's dialogue acts and with the students' dialogue acts. A bigram analysis shows that student learning also correlates with joint patterns of tutor and student dialogue acts. In particular, our human-computer results show that the presence of student utterances that display reasoning (whether correct or incorrect), as well as the presence of reasoning questions asked by the computer tutor, both positively correlate with learning. Our human-human results show that student introductions of a new concept into the dialogue positively correlate with learning, but student attempts at deeper reasoning (particularly when incorrect), and the human tutor's attempts to direct the dialogue, both negatively correlate with learning. These results suggest that while the use of dialogue act n-grams is a promising method for examining correlations between dialogue behavior and learning, specific findings can differ in human versus computer tutoring, with the latter better motivating adaptive strategies for implementation.
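A minimal sketch of the bigram analysis; the dialogue-act tags, sequences, and gains below are invented, not the paper's tagset or data:

```python
# Count a dialogue-act bigram's frequency per dialogue, then correlate
# that frequency with per-student learning gain. Toy data only.
from scipy.stats import pearsonr

dialogues = [  # one act sequence per student (T = tutor, S = student)
    ["T:question", "S:reasoning", "T:feedback", "S:shallow"],
    ["T:question", "S:shallow", "T:feedback", "S:reasoning"],
    ["T:question", "S:reasoning", "T:question", "S:reasoning"],
    ["T:question", "S:shallow", "T:feedback", "S:shallow"],
]
gains = [0.30, 0.22, 0.41, 0.15]

def bigram_rate(acts, pattern):
    """Fraction of adjacent act pairs matching the given bigram."""
    bigrams = list(zip(acts, acts[1:]))
    return bigrams.count(pattern) / len(bigrams)

rates = [bigram_rate(d, ("T:question", "S:reasoning")) for d in dialogues]
r, p = pearsonr(rates, gains)
print(f"question->reasoning rate vs. gain: r = {r:.2f}, p = {p:.3f}")
```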
Language and Technology Conference | 2006
Katherine Forbes-Riley; Diane J. Litman
We investigate using the PARADISE framework to develop predictive models of system performance in our spoken dialogue tutoring system. We represent performance with two metrics: user satisfaction and student learning. We train and test predictive models of these metrics in our tutoring system corpora. We predict user satisfaction with 2 parameter types: 1) system-generic, and 2) tutoring-specific. To predict student learning, we also use a third type: 3) user affect. Although generic parameters are useful predictors of user satisfaction in other PARADISE applications, overall our parameters produce less useful user satisfaction models in our system. However, generic and tutoring-specific parameters do produce useful models of student learning in our system. User affect parameters can increase the usefulness of these models.
Affective Computing and Intelligent Interaction | 2007
Katherine Forbes-Riley; Diane J. Litman
We use a χ² analysis on our spoken dialogue tutoring corpus to investigate dependencies between uncertain student answers and 9 dialogue acts the human tutor uses in his responses to these answers. Our results show significant dependencies between the tutor's use of some dialogue acts and the uncertainty expressed in the prior student answer, even after factoring out the answer's (in)correctness. Identification and analysis of these dependencies is part of our empirical approach to developing an adaptive version of our spoken dialogue tutoring system that responds to student affective states as well as to student correctness.
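A sketch of this kind of dependency test using scipy's chi2_contingency; the counts and act labels are invented:

```python
# Chi-squared test of dependence between prior-turn (un)certainty and
# the tutor's response act. Counts are invented; correctness can be
# factored out by running the test within, e.g., correct answers only.
import numpy as np
from scipy.stats import chi2_contingency

# Rows: student answer was uncertain / certain.
# Columns: tutor response act, e.g. Bottom-Out, Hint, Restatement.
table = np.array([
    [30, 42, 12],  # after uncertain answers
    [55, 20, 40],  # after certain answers
])
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.4f}")
```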
North American Chapter of the Association for Computational Linguistics | 2007
Katherine Forbes-Riley; Mihai Rotaru; Diane J. Litman; Joel R. Tetreault
We use a χ² analysis to investigate the context dependency of student affect in our computer tutoring dialogues, targeting uncertainty in student answers in 3 automatically monitorable contexts. Our results show significant dependencies between uncertain answers and specific contexts. Identification and analysis of these dependencies is our first step in developing an adaptive version of our dialogue system.
Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2014
Diane J. Litman; Katherine Forbes-Riley
We present an evaluation of a spoken dialogue system that detects and adapts to user disengagement and uncertainty in real-time. We compare this version of our system to a version that adapts to only user disengagement, and to a version that ignores user disengagement and uncertainty entirely. We find a significant increase in task success when comparing both affect-adaptive versions of our system to our non-adaptive baseline, but only for male users.