
Publication


Featured research published by Gabriel Skantze.


Meeting of the Association for Computational Linguistics | 2009

Incremental Dialogue Processing in a Micro-Domain

Gabriel Skantze; David Schlangen

This paper describes a fully incremental dialogue system that can engage in dialogues in a simple domain, number dictation. Because it uses incremental speech recognition and prosodic analysis, the system can give rapid feedback as the user is speaking, with a very short latency of around 200ms. Because it uses incremental speech synthesis and self-monitoring, the system can react to feedback from the user as the system is speaking. A comparative evaluation shows that naive users preferred this system over a non-incremental version, and that it was perceived as more human-like.


Speech Communication | 2005

Exploring human error recovery strategies: Implications for spoken dialogue systems

Gabriel Skantze

In this study, an explorative experiment was conducted in which subjects were asked to give route directions to each other in a simulated campus (similar to Map Task). In order to elicit error handling strategies, a speech recogniser was used to corrupt the speech in one direction. This way, data could be collected on how the subjects might recover from speech recognition errors. This method for studying error handling has the advantages that the level of understanding is transparent to the analyser, and the errors that occur are similar to errors in spoken dialogue systems. The results show that when subjects face speech recognition problems, a common strategy is to ask task-related questions that confirm their hypothesis about the situation instead of signalling non-understanding. Compared to other strategies, such as asking for a repetition, this strategy leads to better understanding of subsequent utterances, whereas signalling non-understanding leads to decreased experience of task success.


COST'11: Proceedings of the 2011 International Conference on Cognitive Behavioural Systems | 2011

Furhat: a back-projected human-like robot head for multiparty human-machine interaction

Samer Al Moubayed; Jonas Beskow; Gabriel Skantze; Björn Granström

In this chapter, we first present a summary of findings from two previous studies on the limitations of using flat displays with embodied conversational agents (ECAs) in the context of face-to-face human-agent interaction. We then motivate the need for a three-dimensional display of faces to guarantee accurate delivery of gaze and directional movements, and present Furhat, a novel, simple, highly effective, and human-like back-projected robot head that utilizes computer animation to deliver facial movements and is equipped with a pan-tilt neck. After presenting a detailed summary of why and how Furhat was built, we discuss the advantages of using optically projected animated agents for interaction. We discuss using such agents in terms of situatedness, environment, context awareness, and social, human-like face-to-face interaction with robots, where subtle nonverbal and social facial signals can be communicated. At the end of the chapter, we present a recent application of Furhat as a multimodal multiparty interaction system that was presented at the London Science Museum as part of a robot festival. We conclude by discussing future developments, applications and opportunities of this technology.


International Conference on Multimodal Interfaces | 2012

IrisTK: a statechart-based toolkit for multi-party face-to-face interaction

Gabriel Skantze; Samer Al Moubayed

In this paper, we present IrisTK - a toolkit for rapid development of real-time systems for multi-party face-to-face interaction. The toolkit consists of a message passing system, a set of modules for multi-modal input and output, and a dialog authoring language based on the notion of statecharts. The toolkit has been applied to a large scale study in a public museum setting, where the back-projected robot head Furhat interacted with the visitors in multi-party dialog.
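The statechart idea behind such a dialog authoring language can be illustrated with a minimal sketch. This is a hypothetical Python rendering of the concept, not IrisTK's actual API; the class names and event names (e.g. `sense.user.enter`) are assumptions for illustration:

```python
# Hypothetical sketch of statechart-style dialog authoring (not IrisTK's
# actual authoring language): each state maps incoming event names to
# handlers that may produce output and return the next state's name.

class State:
    def __init__(self, name, handlers):
        self.name = name
        self.handlers = handlers  # event name -> handler(payload) -> next state name

class Flow:
    def __init__(self, states, initial):
        self.states = {s.name: s for s in states}
        self.current = initial

    def on_event(self, event, payload=None):
        # Events with no handler in the current state are simply ignored.
        handler = self.states[self.current].handlers.get(event)
        if handler:
            self.current = handler(payload)

# A toy greeting flow for a multi-party museum setting.
def on_user_enters(user):
    print(f"say: Hello, {user}!")
    return "Dialog"

def on_user_leaves(user):
    print(f"say: Goodbye, {user}!")
    return "Idle"

flow = Flow(
    [State("Idle", {"sense.user.enter": on_user_enters}),
     State("Dialog", {"sense.user.leave": on_user_leaves})],
    initial="Idle",
)
flow.on_event("sense.user.enter", "visitor")  # prints "say: Hello, visitor!"
```

Separating the event-passing machinery from per-state handlers is what makes statecharts attractive for multi-party interaction: the same sensing events can drive different behaviour depending on the current dialog state.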


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2008

Galatea: A Discourse Modeller Supporting Concept-Level Error Handling in Spoken Dialogue Systems

Gabriel Skantze

In this chapter, a discourse modeller for conversational spoken dialogue systems, called GALATEA, is presented. Apart from handling the resolution of ellipses and anaphora, it tracks the “grounding status” of concepts that are mentioned during the discourse, i.e., information about who said what when. This grounding information also contains concept confidence scores that are derived from the speech recogniser word confidence scores. The discourse model may then be used for concept-level error handling, i.e., grounding of concepts, fragmentary clarification requests, and detection of erroneous concepts in the model at later stages in the dialogue. An evaluation of GALATEA, used in a complete spoken dialogue system with naive users, is also presented.
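The bookkeeping described here can be sketched as a small data structure. This is an illustrative Python sketch under assumed names and thresholds, not GALATEA's actual implementation:

```python
# Hypothetical sketch of concept-level grounding bookkeeping (not GALATEA's
# actual data structures): each concept records who said it, when, and a
# confidence score derived from speech recogniser word confidences.

from dataclasses import dataclass

@dataclass
class Concept:
    name: str
    value: str
    speaker: str       # "user" or "system"
    turn: int          # turn index at which the concept was mentioned
    confidence: float  # derived from ASR word confidence scores

class DiscourseModel:
    def __init__(self, threshold=0.7):  # threshold value is an assumption
        self.concepts = {}
        self.threshold = threshold

    def add(self, concept):
        self.concepts[concept.name] = concept

    def needs_clarification(self):
        # User concepts below the confidence threshold are candidates for
        # fragmentary clarification requests ("Did you say five?").
        return [c for c in self.concepts.values()
                if c.speaker == "user" and c.confidence < self.threshold]

model = DiscourseModel()
model.add(Concept("street", "Main Street", "user", 1, 0.9))
model.add(Concept("number", "5", "user", 2, 0.4))
print([c.name for c in model.needs_clarification()])  # ['number']
```

Keeping the grounding status per concept, rather than per utterance, is what allows the system to clarify only the uncertain fragment while accepting the rest.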


Speech Communication | 2014

Turn-taking, feedback and joint attention in situated human–robot interaction

Gabriel Skantze; Anna Hjalmarsson; Catharine Oertel

In this paper, we present a study where a robot instructs a human on how to draw a route on a map. The human and robot are seated face-to-face with the map placed on the table between them. The user’s and the robot’s gaze can thus serve several simultaneous functions: as cues to joint attention, turn-taking, level of understanding and task progression. We have compared this face-to-face setting with a setting where the robot employs a random gaze behaviour, as well as a voice-only setting where the robot is hidden behind a paper board. In addition to this, we have also manipulated turn-taking cues such as completeness and filled pauses in the robot’s speech. By analysing the participants’ subjective rating, task completion, verbal responses, gaze behaviour, and drawing activity, we show that the users indeed benefit from the robot’s gaze when talking about landmarks, and that the robot’s verbal and gaze behaviour has a strong effect on the users’ turn-taking behaviour. We also present an analysis of the users’ gaze and lexical and prosodic realisation of feedback after the robot instructions, and show that these cues reveal whether the user has yet executed the previous instruction, as well as the user’s level of uncertainty.


Computer Speech & Language | 2013

Towards incremental speech generation in conversational systems

Gabriel Skantze; Anna Hjalmarsson

This paper presents a model of incremental speech generation in practical conversational systems. The model allows a conversational system to incrementally interpret spoken input, while simultaneously planning, realising and self-monitoring the system response. If these processes are time consuming and result in a response delay, the system can automatically produce hesitations to retain the floor. While speaking, the system utilises hidden and overt self-corrections to accommodate revisions in the system. The model has been implemented in a general dialogue system framework. Using this framework, we have implemented a conversational game application. A Wizard-of-Oz experiment is presented, where the automatic speech recognizer is replaced by a Wizard who transcribes the spoken input. In this setting, the incremental model allows the system to start speaking while the user’s utterance is being transcribed. In comparison to a non-incremental version of the same system, the incremental version has a shorter response time and is perceived as more efficient by the users.
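The floor-retention idea can be sketched in a few lines: if response planning exceeds a latency budget, emit a filler and keep planning. This is an illustrative sketch under assumed names and timings, not the paper's implementation:

```python
# Illustrative sketch (not the paper's actual system) of producing a
# hesitation when response planning exceeds a latency budget, so the
# system retains the floor while the full response is still being built.

import time

def respond(planner, speak, budget=0.5):
    """planner() returns None until the response is ready; speak() outputs text."""
    deadline = time.monotonic() + budget
    while (response := planner()) is None:
        if time.monotonic() >= deadline:
            speak("ehm...")  # filler to retain the floor
            deadline = time.monotonic() + budget  # reset budget after each filler
        time.sleep(0.01)
    speak(response)

# Simulate a planner that needs ~0.2 s with a 0.05 s latency budget.
spoken = []
ready_at = time.monotonic() + 0.2
respond(lambda: "take the red piece" if time.monotonic() >= ready_at else None,
        spoken.append, budget=0.05)
# spoken now ends with the response, preceded by one or more fillers
```

In a real incremental system the planner would also be able to revise units it has already committed to, which is where the hidden and overt self-corrections described above come in.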


International Conference on Social Robotics | 2013

Head Pose Patterns in Multiparty Human-Robot Team-Building Interactions

Martin Johansson; Gabriel Skantze; Joakim Gustafson

We present a data collection setup for exploring turn-taking in three-party human-robot interaction involving objects competing for attention. The collected corpus comprises 78 minutes across four interactions. Using automated techniques to record head pose and speech patterns, we analyze head pose patterns in turn-transitions. We find that the introduction of objects makes addressee identification based on head pose more challenging. The symmetrical setup also allows us to compare human-human to human-robot behavior within the same interaction. We argue that this symmetry can be used to assess to what extent the system exhibits human-like behavior.


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2009

Attention and Interaction Control in a Human-Human-Computer Dialogue Setting

Gabriel Skantze; Joakim Gustafson

This paper presents a simple, yet effective model for managing attention and interaction control in multimodal spoken dialogue systems. The model allows the user to switch attention between the system and other humans, and the system to stop and resume speaking. An evaluation in a tutoring setting shows that the users’ attention can be effectively monitored using head pose tracking, and that this is a more reliable method than using push-to-talk.
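A minimal version of such head-pose-based attention monitoring can be sketched as a yaw-angle threshold. This is a hypothetical sketch, not the paper's actual method; the angle values and names are assumptions:

```python
# Hypothetical sketch of attention monitoring via head pose (not the
# paper's actual model): treat the user as attending to the system when
# their head yaw stays within a tolerance of the display direction.

SCREEN_YAW = 0.0   # degrees; assumed direction of the system's display
TOLERANCE = 20.0   # degrees; assumed attention threshold

def attending_to_system(head_yaw):
    return abs(head_yaw - SCREEN_YAW) <= TOLERANCE

def attention_segments(yaw_samples):
    """Label each head-pose sample as system-directed or not."""
    return ["system" if attending_to_system(y) else "other" for y in yaw_samples]

print(attention_segments([2.0, 15.0, 45.0, -10.0]))  # ['system', 'system', 'other', 'system']
```

A real system would smooth these labels over time before using them for interaction control, so that a brief glance away does not interrupt the system mid-utterance.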


International Journal of Humanoid Robotics | 2013

The Furhat Back-Projected Humanoid Head: Lip Reading, Gaze and Multi-Party Interaction

Samer Al Moubayed; Gabriel Skantze; Jonas Beskow

In this paper, we present Furhat - a back-projected human-like robot head using state-of-the-art facial animation. Three experiments are presented where we investigate how the head might facilitate human-robot face-to-face interaction. First, we investigate how the animated lips increase the intelligibility of the spoken output, and compare this to an animated agent presented on a flat screen, as well as to a human face. Second, we investigate the accuracy of the perception of Furhat's gaze in a setting typical for situated interaction, where Furhat and a human are sitting around a table. The accuracy of the perception of Furhat's gaze is measured depending on eye design, head movement and viewing angle. Third, we investigate the turn-taking accuracy of Furhat in a multi-party interactive setting, as compared to an animated agent on a flat screen. We conclude with some observations from a public setting at a museum, where Furhat interacted with thousands of visitors in a multi-party interaction.

Collaboration


Dive into Gabriel Skantze's collaborations.

Top Co-Authors

- Joakim Gustafson (Royal Institute of Technology)
- Jonas Beskow (Royal Institute of Technology)
- Jens Edlund (Royal Institute of Technology)
- Martin Johansson (Royal Institute of Technology)
- Raveesh Meena (Royal Institute of Technology)
- Catharine Oertel (Royal Institute of Technology)
- Anna Hjalmarsson (Royal Institute of Technology)
- Björn Granström (Royal Institute of Technology)