Publication


Featured research published by Anna Hjalmarsson.


Speech Communication | 2008

Towards human-like spoken dialogue systems

Jens Edlund; Joakim Gustafson; Mattias Heldner; Anna Hjalmarsson

This paper presents an overview of methods that can be used to collect and analyse data on user responses to spoken dialogue system components intended to increase human-likeness, and to evaluate how well the components succeed in reaching that goal. Wizard-of-Oz variations, human-human data manipulation, and micro-domains are discussed in this context, as is the use of third-party reviewers to get a measure of the degree of human-likeness. We also present the two-way mimicry target, a model for measuring how well a human-computer dialogue mimics or replicates some aspect of human-human dialogue, including human flaws and inconsistencies. Although we have added a measure of innovation, none of the techniques is new in its entirety. Taken together and described from a human-likeness perspective, however, they form a set of tools that may widen the path towards human-like spoken dialogue systems.


Speech Communication | 2009

Embodied conversational agents in computer assisted language learning

Preben Wik; Anna Hjalmarsson

This paper describes two systems using embodied conversational agents (ECAs) for language learning. The first system, called Ville, is a virtual language teacher for vocabulary and pronunciation training. The second system, a dialogue system called DEAL, is a role-playing game for practicing conversational skills. Whereas DEAL acts as a conversational partner with the objective of creating and keeping an interesting dialogue, Ville takes the role of a teacher who guides, encourages and gives feedback to the students.


Speech Communication | 2014

Turn-taking, feedback and joint attention in situated human–robot interaction

Gabriel Skantze; Anna Hjalmarsson; Catharine Oertel

In this paper, we present a study where a robot instructs a human on how to draw a route on a map. The human and robot are seated face-to-face with the map placed on the table between them. The user’s and the robot’s gaze can thus serve several simultaneous functions: as cues to joint attention, turn-taking, level of understanding and task progression. We have compared this face-to-face setting with a setting where the robot employs a random gaze behaviour, as well as a voice-only setting where the robot is hidden behind a paper board. In addition to this, we have also manipulated turn-taking cues such as completeness and filled pauses in the robot’s speech. By analysing the participants’ subjective rating, task completion, verbal responses, gaze behaviour, and drawing activity, we show that the users indeed benefit from the robot’s gaze when talking about landmarks, and that the robot’s verbal and gaze behaviour has a strong effect on the users’ turn-taking behaviour. We also present an analysis of the users’ gaze and lexical and prosodic realisation of feedback after the robot instructions, and show that these cues reveal whether the user has yet executed the previous instruction, as well as the user’s level of uncertainty.


Computer Speech & Language | 2013

Towards incremental speech generation in conversational systems

Gabriel Skantze; Anna Hjalmarsson

This paper presents a model of incremental speech generation in practical conversational systems. The model allows a conversational system to incrementally interpret spoken input, while simultaneously planning, realising and self-monitoring the system response. If these processes are time-consuming and result in a response delay, the system can automatically produce hesitations to retain the floor. While speaking, the system utilises hidden and overt self-corrections to accommodate revisions in the system. The model has been implemented in a general dialogue system framework. Using this framework, we have implemented a conversational game application. A Wizard-of-Oz experiment is presented, where the automatic speech recognizer is replaced by a Wizard who transcribes the spoken input. In this setting, the incremental model allows the system to start speaking while the user's utterance is being transcribed. In comparison to a non-incremental version of the same system, the incremental version has a shorter response time and is perceived as more efficient by the users.
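The floor-keeping idea described in the abstract, producing a hesitation when response planning is delayed, can be sketched in a few lines. This is an assumption-level illustration, not the authors' implementation: the filler token, timeout, and planner function below are all invented for the example.

```python
import queue
import threading
import time

def incremental_respond(plan_response, timeout=0.6, filler="ehm..."):
    """Sketch: emit a filled pause whenever the response plan is late.

    plan_response: a (possibly slow) function returning the planned utterance.
    """
    result = queue.Queue()
    # Plan the response in the background while we may already be speaking.
    threading.Thread(target=lambda: result.put(plan_response()), daemon=True).start()
    spoken = []
    while True:
        try:
            # Wait briefly for the planner; if it delivers in time, finish the turn.
            spoken.append(result.get(timeout=timeout))
            return " ".join(spoken)
        except queue.Empty:
            spoken.append(filler)  # hesitate to retain the floor

def slow_plan():
    time.sleep(1.0)  # simulated planning delay
    return "the red building on the left"

print(incremental_respond(slow_plan))
```

With a 1.0 s planning delay and a 0.6 s patience threshold, the system produces one filler before the planned utterance, keeping the turn while the real content is still being prepared.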


Conference on Future Play | 2007

DEAL: dialogue management in SCXML for believable game characters

Jenny Brusk; Torbjörn Lager; Anna Hjalmarsson; Preben Wik

In order for game characters to be believable, they must appear to possess qualities such as emotions, the ability to learn and adapt as well as being able to communicate in natural language. With this paper we aim to contribute to the development of believable non-player characters (NPCs) in games, by presenting a method for managing NPC dialogues. We have selected the trade scenario as an example setting since it offers a well-known and limited domain common in games that support ownership, such as role-playing games. We have developed a dialogue manager in State Chart XML, a newly introduced W3C standard, as part of DEAL --- a research platform for exploring the challenges and potential benefits of combining elements from computer games, dialogue systems and language learning.


Annual Meeting of the Special Interest Group on Discourse and Dialogue | 2008

Speaking without knowing what to say... or when to end

Anna Hjalmarsson

Humans produce speech incrementally and on-line as the dialogue progresses using information from several different sources in parallel. A dialogue system that generates output in a stepwise manner and not in preplanned syntactically correct sentences needs to signal how new dialogue contributions relate to previous discourse. This paper describes a data collection which is the foundation for an effort towards more human-like language generation in DEAL, a spoken dialogue system developed at KTH. Two annotators labelled cue phrases in the corpus with high inter-annotator agreement (kappa coefficient 0.82).
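The reported inter-annotator agreement is a Cohen's kappa of 0.82, which corrects raw agreement for agreement expected by chance. As a minimal sketch of the statistic (the label sequences below are invented, not the DEAL corpus data):

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two annotators' label sequences of equal length."""
    assert len(a) == len(b)
    n = len(a)
    # Observed agreement: fraction of items both annotators labelled identically.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[label] * cb[label] for label in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical cue-phrase annotations from two annotators
ann1 = ["cue", "cue", "other", "cue", "other", "other"]
ann2 = ["cue", "cue", "other", "other", "other", "other"]
print(round(cohens_kappa(ann1, ann2), 2))  # -> 0.67
```

On this toy sample the raw agreement is 5/6, but after removing chance agreement the kappa drops to about 0.67; a kappa of 0.82 over a full corpus indicates substantial agreement.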


Computers in the Human Interaction Loop | 2009

Multimodal Interaction Control

Jonas Beskow; Rolf Carlson; Jens Edlund; Björn Granström; Mattias Heldner; Anna Hjalmarsson; Gabriel Skantze

No matter how well hidden our systems are and how well they do their magic unnoticed in the background, there are times when direct interaction between system and human is a necessity. As long as the interaction can take place unobtrusively and without techno-clutter, this is desirable. It is hard to picture a means of interaction less obtrusive and techno-cluttered than spoken communication on human terms. Spoken face-to-face communication is the most intuitive and robust form of communication between humans imaginable. In order to exploit such human spoken communication to its full potential as an interface between human and machine, we need a much better understanding of how the more human-like aspects of spoken communication work.


Perception and Interactive Technologies | 2008

Human-Likeness in Utterance Generation: Effects of Variability

Anna Hjalmarsson; Jens Edlund

There are compelling reasons to endow dialogue systems with human-like conversational abilities, which require modelling of aspects of human behaviour. This paper examines the value of using human behaviour as a target for system behaviour through a study making use of a simulation method. Two versions of system behaviour are compared: a replica of a human speaker's behaviour and a constrained version with less variability. The version based on human behaviour is rated more human-like, polite and intelligent.


International Conference on Acoustics, Speech, and Signal Processing | 2015

An information-theoretic framework for automated discovery of prosodic cues to conversational structure

Kornel Laskowski; Anna Hjalmarsson

Interaction timing in conversation exhibits myriad variabilities, yet it is patently not random. However, identifying consistencies is a manually labor-intensive effort, and findings have been limited. We propose a conditional mutual information measure of the influence of prosodic features, which can be computed for any conversation at any instant, with only a speech/non-speech segmentation as its requirement. We evaluate the methodology on two segmental features: energy and speaking rate. Results indicate that energy, the less controversial of the two, is in fact better on average at predicting conversational structure. We also explore the temporal evolution of model “surprise”, which permits identifying instants where each feature's influence is operative. The method corroborates earlier findings, and appears capable of large-scale data-driven discovery in future research.
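The information-theoretic core of this approach can be illustrated with a plug-in estimate of plain (unconditional) mutual information between a discretized prosodic feature and a conversational-structure outcome; the paper's conditional variant adds a conditioning context, which is omitted here for brevity. The data below are invented for illustration, not from the paper.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed joint outcomes: p(x,y) * log2( p(x,y) / (p(x)p(y)) )
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# Hypothetical data: binned energy level vs. whether the speaker holds the turn
energy = ["hi", "hi", "lo", "lo", "hi", "lo", "hi", "lo"]
turn   = ["hold", "hold", "yield", "yield", "hold", "yield", "hold", "yield"]
print(mutual_information(energy, turn))  # -> 1.0 (energy fully predicts the outcome)
```

In this toy sample the feature perfectly predicts the binary outcome, so the estimate reaches its 1-bit maximum; in real conversational data the measure quantifies how much a feature such as energy reduces uncertainty about upcoming structure.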


Speech in Mobile Environments, Call Centers and Clinics: Views of Speech Industry Leaders | 2010

Computational Approaches to Modeling Speaker State in the Medical Domain

Julia Hirschberg; Anna Hjalmarsson; Noémie Elhadad

Recently, researchers in computer science and engineering have begun to explore the possibility of finding speech-based correlates of various medical conditions using automatic, computational methods. If such language cues can be identified and quantified automatically, this information can be used to support diagnosis and treatment of medical conditions in clinical settings and to further fundamental research in understanding cognition. This chapter reviews computational approaches that explore communicative patterns of patients who suffer from medical conditions such as depression, autism spectrum disorders, schizophrenia, and cancer. There are two main approaches discussed: research that explores features extracted from the acoustic signal and research that focuses on lexical and semantic features. We also present some applied research that uses computational methods to develop assistive technologies. In the final sections we discuss issues related to and the future of this emerging field of research.

Collaboration


Top co-authors of Anna Hjalmarsson:

Jens Edlund (Royal Institute of Technology)
Gabriel Skantze (Royal Institute of Technology)
Preben Wik (Royal Institute of Technology)
Catharine Oertel (Royal Institute of Technology)
David House (Royal Institute of Technology)
Jonas Beskow (Royal Institute of Technology)
Kornel Laskowski (Carnegie Mellon University)
Joakim Gustafson (Royal Institute of Technology)