Publication


Featured research published by Katsuya Takanashi.


International Conference on Multimodal Interfaces (ICMI) | 2013

Context-based conversational hand gesture classification in narrative interaction

Shogo Okada; Mayumi Bono; Katsuya Takanashi; Yasuyuki Sumi; Katsumi Nitta

Communicative hand gestures play important roles in face-to-face conversations. How these gestures are used varies across individuals; even when two speakers narrate the same story, they do not always use the same hand gesture (movement, position, and motion trajectory) to describe the same scene. In this paper, we propose a framework for the classification of communicative gestures in small group interactions. Instead of observing hand positions and motion trajectories, we focus on how many times the hands are held during a gesture and how long a speaker sustains a hand stroke. In addition, to model communicative gesture patterns, we use nonverbal features of the participants addressed by the gestures. Using pattern recognition techniques, we extract features of the gesture phases defined by Kendon (2004) and nonverbal patterns co-occurring with gestures, i.e., the utterance, head gesture, and head direction of each participant. In the experiments, we collect eight group narrative interaction datasets to evaluate classification performance. The results show that gesture phase features and the nonverbal features of other participants improve the performance of discriminating communicative gestures used in narrative speech from other gestures by 4% to 16%.
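
The abstract names the modeling ingredients (gesture phase features such as hold counts and stroke durations, plus co-occurring nonverbal features of other participants) but not the implementation. As a minimal sketch in Python, assuming invented feature names and toy data rather than the authors' setup, such a classifier might look like this:

```python
# Hypothetical sketch: classify gestures as "communicative" vs. "other"
# from gesture-phase and co-occurring nonverbal features. All feature
# names and values below are illustrative, not from the paper's data.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Each row: [num_holds, stroke_duration_sec, overlaps_utterance,
#            listener_head_nods, listener_gaze_at_speaker_ratio]
X = np.array([
    [2, 1.8, 1, 3, 0.9],  # communicative (label 1)
    [0, 0.4, 0, 0, 0.2],  # other (label 0)
    [3, 2.5, 1, 2, 0.8],
    [1, 0.6, 0, 1, 0.3],
    [2, 2.0, 1, 4, 0.7],
    [0, 0.3, 0, 0, 0.1],
])
y = np.array([1, 0, 1, 0, 1, 0])

clf = SVC(kernel="rbf", gamma="scale")
print("CV accuracy:", cross_val_score(clf, X, y, cv=3).mean())
```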


International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2007

Automatic Detection of Sentence and Clause Units using Local Syntactic Dependency

Tatsuya Kawahara; Masahiro Saikou; Katsuya Takanashi

For robust detection of sentence and clause units in spontaneous speech such as lectures and meetings, we propose a novel cascaded chunking strategy that incorporates syntactic and semantic information. Applying general syntactic parsing to spontaneous speech is difficult because of ill-formed sentences and disfluencies, especially in erroneous transcripts generated by ASR systems. Therefore, we focus on the local syntactic dependency of adjacent words and phrases, and train binary classifiers based on SVMs (support vector machines) for this purpose. An experimental evaluation using spontaneous talks from the CSJ (Corpus of Spontaneous Japanese) demonstrates that the proposed dependency analysis can be performed robustly and is effective for clause/sentence unit detection in ASR outputs.
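
As a rough sketch of the cascaded chunking idea (not the authors' implementation), a binary classifier can judge, for each pair of adjacent phrases, whether the left phrase depends on the right one or a clause/sentence boundary falls between them, using only local features. The feature names and training examples below are invented:

```python
# Hypothetical sketch: a binary SVM over local features of adjacent
# phrases decides boundary (1) vs. continuing dependency (0).
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC

def boundary_features(left, right):
    """Local features around a candidate boundary between adjacent phrases."""
    return {
        "left_last_pos": left["last_pos"],
        "right_first_pos": right["first_pos"],
        "left_is_predicate": left["is_predicate"],
        "long_pause": left["pause_sec"] > 0.2,
    }

# Tiny invented training set.
pairs = [
    ({"last_pos": "verb", "is_predicate": True, "pause_sec": 0.5},
     {"first_pos": "noun"}, 1),
    ({"last_pos": "noun", "is_predicate": False, "pause_sec": 0.0},
     {"first_pos": "particle"}, 0),
    ({"last_pos": "verb", "is_predicate": True, "pause_sec": 0.4},
     {"first_pos": "conjunction"}, 1),
    ({"last_pos": "adjective", "is_predicate": False, "pause_sec": 0.1},
     {"first_pos": "noun"}, 0),
]

vec = DictVectorizer()
X = vec.fit_transform([boundary_features(l, r) for l, r, _ in pairs])
y = [label for _, _, label in pairs]
clf = LinearSVC().fit(X, y)

test = boundary_features({"last_pos": "verb", "is_predicate": True,
                          "pause_sec": 0.6}, {"first_pos": "noun"})
print(clf.predict(vec.transform([test])))  # expect [1]: boundary
```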


Conference of the International Speech Communication Association (INTERSPEECH) | 2016

Prediction and Generation of Backchannel Form for Attentive Listening Systems

Tatsuya Kawahara; Takashi Yamaguchi; Koji Inoue; Katsuya Takanashi; Nigel Ward

In human-human dialogue, especially in attentive listening such as counseling, backchannels are important not only for smooth communication but also for establishing rapport. Despite several studies on when to backchannel, most current spoken dialogue systems generate the same pattern of backchannels, giving a monotonous impression to users. In this work, we investigate the generation of a variety of backchannel forms according to the dialogue context. We first show the feasibility of choosing appropriate backchannel forms based on machine learning, and the synergy of using linguistic and prosodic features. For generation of backchannels, a framework based on a set of binary classifiers is adopted to effectively make a "not-to-generate" decision. The proposed model achieved better prediction accuracy than a baseline that always outputs the same backchannel form and another that randomly generates backchannels. Finally, evaluations by human subjects demonstrate that the proposed method generates backchannels as naturally as human choices, giving impressions of understanding and empathy.
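
A minimal sketch of the "set of binary classifiers" scheme with a not-to-generate decision: one classifier per backchannel form, and no backchannel when none is sufficiently confident. The forms, features, threshold, and training data below are invented stand-ins, not the authors' setup:

```python
# Hypothetical sketch: per-form binary classifiers plus a confidence
# threshold that yields a "not-to-generate" decision.
import numpy as np
from sklearn.linear_model import LogisticRegression

FORMS = ["un", "un-un", "hee", "sou-desu-ka"]  # illustrative Japanese forms

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 6))           # stand-in linguistic+prosodic features
Y = rng.integers(0, 2, size=(80, 4))   # per-form binary labels (random here)

classifiers = [LogisticRegression().fit(X, Y[:, i]) for i in range(len(FORMS))]

def choose_backchannel(x, threshold=0.6):
    probs = [c.predict_proba(x.reshape(1, -1))[0, 1] for c in classifiers]
    best = int(np.argmax(probs))
    if probs[best] < threshold:
        return None  # the "not-to-generate" decision
    return FORMS[best]

print(choose_backchannel(rng.normal(size=6)))
```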


International Conference on Speech Database and Assessments (Oriental COCOSDA) | 2011

Annotation of Japanese response tokens and preliminary analysis on their distribution in three-party conversations

Yasuharu Den; Nao Yoshida; Katsuya Takanashi; Hanae Koiso

In this paper, we propose a new annotation scheme for Japanese response tokens (RTs), which is based on strict and consistent procedures. Our scheme consists of two-stage annotation, in which RTs are first identified and classified according to their forms and then further sub-classified based on their sequential positions. Six forms are included in our class of RTs: i) responsive interjections, ii) expressive interjections, iii) lexical reactive expressions, iv) repetitions, v) completions, and vi) assessments. Some of them bear an additional tag according to their sequential position in the discourse: i) first pair parts, ii) second pair parts, iii) sequence-closing thirds, iv) other responding turns, and v) unclassifiable positions. We apply our scheme to annotate a Japanese three-party conversation corpus, and present the results of a preliminary analysis on the distribution of RTs in the corpus.
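
The two-stage scheme maps naturally onto a small data structure: a form tag assigned first, then an optional sequential-position tag. A minimal sketch in Python, with tag names following the categories above (the example token is invented):

```python
# Sketch of the two-stage RT annotation: form first, position second.
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Form(Enum):
    RESPONSIVE_INTERJECTION = "responsive interjection"
    EXPRESSIVE_INTERJECTION = "expressive interjection"
    LEXICAL_REACTIVE_EXPRESSION = "lexical reactive expression"
    REPETITION = "repetition"
    COMPLETION = "completion"
    ASSESSMENT = "assessment"

class Position(Enum):
    FIRST_PAIR_PART = "first pair part"
    SECOND_PAIR_PART = "second pair part"
    SEQUENCE_CLOSING_THIRD = "sequence-closing third"
    OTHER_RESPONDING_TURN = "other responding turn"
    UNCLASSIFIABLE = "unclassifiable"

@dataclass
class ResponseToken:
    text: str
    start_sec: float
    form: Form
    position: Optional[Position] = None  # second-stage tag, where applicable

rt = ResponseToken("un", 12.34, Form.RESPONSIVE_INTERJECTION,
                   Position.SECOND_PAIR_PART)
print(rt)
```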


Proceedings of the Workshop on Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction | 2016

Annotation and analysis of listener's engagement based on multi-modal behaviors

Koji Inoue; Divesh Lala; Shizuka Nakamura; Katsuya Takanashi; Tatsuya Kawahara

We address the annotation of engagement in the context of human-machine interaction. Engagement represents how interested a user is in the current interaction and how willing they are to continue it. The conversational data used in the annotation work is a human-robot interaction corpus in which a human subject talks with the android ERICA, which is remotely operated by another human subject. The annotation was done by multiple third-party annotators, whose task was to detect the time points at which the level of engagement becomes high. The annotation results indicate agreement among the annotators, although the number of annotated points differs among them. We also find that the level of engagement is related to turn-taking behaviors. Furthermore, we conducted interviews with the annotators to reveal the behaviors used to signal a high level of engagement. The results suggest that laughing, backchannels, and nodding are related to the level of engagement.
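
Since the task marks time points at which engagement becomes high, agreement between annotators can be checked by matching points within a tolerance window. The abstract does not specify the agreement measure, so this is only a plausible sketch with invented time points:

```python
# Hypothetical sketch: count annotated time points (seconds) from one
# annotator that have a counterpart from another within `tol` seconds.
def matched_points(a, b, tol=2.0):
    return sum(any(abs(t - u) <= tol for u in b) for t in a)

annotator1 = [10.2, 35.0, 61.5, 90.1]
annotator2 = [11.0, 62.3, 120.4]

hits = matched_points(annotator1, annotator2)
print(f"{hits}/{len(annotator1)} of annotator1's points matched")
```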


APSIPA Transactions on Signal and Information Processing | 2016

Multi-modal sensing and analysis of poster conversations with smart posterboard

Tatsuya Kawahara; Takuma Iwatate; Koji Inoue; Soichiro Hayashi; Hiromasa Yoshimoto; Katsuya Takanashi

Conversations in poster sessions at academic events, referred to as poster conversations, pose interesting and challenging problems in multi-modal signal and information processing. We have developed a smart posterboard for multi-modal recording and analysis of poster conversations. The smart posterboard has multiple sensing devices to record poster conversations, so that we can review who came to the poster and what kinds of questions or comments they made. The conversation analysis incorporates face and eye-gaze tracking for effective speaker diarization. It is demonstrated that eye-gaze information is useful for predicting turn-taking and also for improving speaker diarization. Moreover, high-level indexing of the interest and comprehension level of the audience is explored based on multi-modal behaviors during the conversation. This is realized by predicting the audience's speech acts, such as questions and reactive tokens.
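
As an illustrative sketch (not the authors' algorithm) of how eye-gaze can assist diarization: since listeners tend to look at the current speaker, a per-person speaking score can blend an audio voice-activity score with how many others gaze at that person. The names and weighting below are invented:

```python
# Hypothetical gaze-assisted diarization score. vad_scores maps each
# person to an audio voice-activity score in [0, 1]; gaze_targets maps
# each person to the person they are currently looking at.
def speaker_scores(vad_scores, gaze_targets, gaze_weight=0.3):
    n_others = max(len(gaze_targets) - 1, 1)
    scores = {}
    for person, vad in vad_scores.items():
        gazed_at = sum(1 for p, tgt in gaze_targets.items()
                       if p != person and tgt == person)
        scores[person] = (1 - gaze_weight) * vad + gaze_weight * gazed_at / n_others
    return scores

vad = {"presenter": 0.7, "visitor_a": 0.4, "visitor_b": 0.1}
gaze = {"presenter": "visitor_a", "visitor_a": "presenter",
        "visitor_b": "presenter"}
scores = speaker_scores(vad, gaze)
print("likely speaker:", max(scores, key=scores.get))
```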


International Symposium on Artificial Intelligence | 2011

Designing a Future Space in Real Spaces: Transforming the Heterogeneous Representations of a “Not Yet Existing” Object

Katsuya Takanashi; Takeshi Hiramoto

Based on ethnography in a science museum, this article analyzes a single case of successive group problem-solving interactions for designing an exhibition as a future space. In the process of multimodal interaction across multiple spaces, heterogeneous resources in real spaces became representations of a not-yet-existing object and were continuously transformed with reference to a framework for collaborative problem-solving.


IWSDS | 2019

A Conversational Dialogue Manager for the Humanoid Robot ERICA

Pierrick Milhorat; Divesh Lala; Koji Inoue; Tianyu Zhao; Masanari Ishida; Katsuya Takanashi; Shizuka Nakamura; Tatsuya Kawahara

We present a dialogue system for the conversational robot ERICA. Our goal is for ERICA to engage in more human-like conversation, rather than being a simple question-answering robot. Our dialogue manager integrates question-answering with a statement response component, which generates dialogue by asking about focused words detected in the user's utterance, and a proactive initiator, which generates dialogue based on events detected by ERICA. We evaluate the statement response component and find that it produces coherent responses to a majority of user utterances taken from a human-machine dialogue corpus. An initial study with real users also shows that it reduces the number of fallback utterances by half. Our system is beneficial for producing mixed-initiative conversation.
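
A minimal sketch of the component layout described above: question-answering is tried first, then the statement response built around a detected focus word, then the proactive initiator, with a fallback only when all decline. The component internals here are invented placeholders, not ERICA's actual modules:

```python
# Hypothetical dialogue-manager skeleton with prioritized components.
def answer_question(utterance):
    # Placeholder QA component: only "answers" questions.
    return "It opened in 2001." if utterance.endswith("?") else None

def detect_focus_word(utterance):
    # Placeholder focus detector: pick the longest word.
    words = [w.strip(".,!?") for w in utterance.split()]
    return max(words, key=len) if words else None

def statement_response(utterance):
    focus = detect_focus_word(utterance)
    return f"Tell me more about {focus}." if focus else None

def proactive_initiator(events):
    return "Shall we change topics?" if "silence" in events else None

def respond(utterance, events=()):
    for component in (lambda: answer_question(utterance),
                      lambda: statement_response(utterance),
                      lambda: proactive_initiator(events)):
        reply = component()
        if reply:
            return reply
    return "I see."  # fallback utterance

print(respond("I visited the science museum yesterday."))
```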


Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) | 2015

Synchrony in prosodic and linguistic features between backchannels and preceding utterances in attentive listening

Tatsuya Kawahara; Takashi Yamaguchi; Miki Uesato; Koichiro Yoshino; Katsuya Takanashi

In human-human dialogue, especially in attentive listening such as counseling, backchannels play an important role. Appropriately coordinated backchannels not only make communication smooth but also help establish rapport. Using collected counseling dialogue, we investigate whether and how synchrony is expressed in the prosodic and linguistic features of backchannels with respect to the preceding speaker's utterances. First, we identify correlation patterns according to the type of backchannel and the prosodic features: larger correlations are observed for reactive tokens than for acknowledging tokens, and for power features than for pitch features. Next, we investigate the relationship between the morphological complexity of backchannels and the syntactic complexity of the preceding clause/sentence unit. The results can be useful for generating a variety of backchannels adaptive to the speaker's utterances.
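
The synchrony measurement amounts to correlating a prosodic feature of each backchannel with the same feature of the immediately preceding utterance. A minimal sketch with invented power values (the paper's actual feature extraction is not reproduced here):

```python
# Hypothetical sketch: correlate utterance power with the power of the
# following backchannel; a large r suggests prosodic synchrony.
from scipy.stats import pearsonr

preceding_power = [62.1, 58.4, 70.3, 65.0, 55.2, 68.7]    # dB per utterance
backchannel_power = [60.5, 57.0, 69.1, 63.2, 56.0, 66.9]  # dB per backchannel

r, p = pearsonr(preceding_power, backchannel_power)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```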


International Conference on Universal Access in Human-Computer Interaction | 2014

The Practice of Showing ‘Who I am’: A Multimodal Analysis of Encounters between Science Communicator and Visitors at Science Museum

Mayumi Bono; Hiroaki Ogata; Katsuya Takanashi; Ayami Joh

In this paper, we aim to contribute to the design of future technologies used in science museums, where there is no explicit, pre-determined relationship regarding knowledge between Science Communicators (SCs) and visitors. We illustrate the practice of interaction between them, focusing especially on social encounters. Starting in October 2012, we conducted a field study at the National Museum of Emerging Science and Innovation (Miraikan) in Japan. Based on multimodal analysis, we examine various activities, focusing on how expert SCs communicate about science: how they begin interactions with visitors, how they maintain them, and how they conclude them.

Collaboration


Dive into Katsuya Takanashi's collaborations.

Top Co-Authors

Mayumi Bono (National Institute of Informatics)

Mika Enomoto (Tokyo University of Technology)

Nao Yoshida (Tokyo University of Technology)