Publication


Featured research published by Shogo Okada.


international conference on intelligent robots and systems | 2009

Unsupervised simultaneous learning of gestures, actions and their associations for Human-Robot Interaction

Yasser F. O. Mohammad; Toyoaki Nishida; Shogo Okada

Human-Robot Interaction using free hand gestures is gaining importance as more untrained users operate robots in home and office environments. To be operated by free hand gestures, the robot needs to solve three problems: gesture (command) detection, action generation (related to the task domain), and association between gestures and actions.


international symposium on neural networks | 2011

Online incremental clustering with distance metric learning for high dimensional data

Shogo Okada; Toyoaki Nishida

In this paper, we present a novel incremental clustering algorithm that assigns a set of observations to clusters and learns the distance metric iteratively in an incremental manner. The proposed algorithm, SOINN-AML, builds on the Self-Organizing Incremental Neural Network (SOINN; Shen et al., 2006), which represents the distribution of unlabeled data and reports a reasonable number of clusters. SOINN adopts a competitive Hebbian rule for each input signal, and the distance between nodes is measured using the Euclidean distance. Such algorithms rely on the distance metric defined over the input data patterns. Distance Metric Learning (DML) learns a distance metric for a high-dimensional input space that preserves the distance relations among the training data, but DML is not performed on the input space in SOINN-based approaches. SOINN-AML learns the metric of the input space using the Adaptive Distance Metric Learning (AML) algorithm, one of the DML algorithms, and improves the incremental clustering performance of SOINN by optimizing the distance metric when the input space is high dimensional. In the experiments, we evaluate the performance on two artificial datasets, seven real datasets from the UCI repository, and three real image datasets. The proposed algorithm outperforms conventional algorithms, including SOINN (Shen et al., 2006) and Enhanced SOINN (Shen et al., 2007); the improvement in clustering accuracy (NMI) is between 0.03 and 0.13 compared with state-of-the-art SOINN-based approaches.
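
As a rough illustration of the idea (not the published SOINN-AML implementation), the sketch below runs a toy incremental clusterer in which incoming points are assigned to the nearest prototype node under a learned diagonal metric, and a new node is created when no existing node is close enough; all names, thresholds, and the metric-update rule are illustrative assumptions.

```python
import numpy as np

class IncrementalMetricClusterer:
    """Toy incremental clusterer: nearest-node assignment under a learned
    diagonal metric. Loosely inspired by SOINN-style clustering with
    distance metric learning; NOT the published SOINN-AML algorithm."""

    def __init__(self, new_node_threshold=2.0, lr=0.1):
        self.threshold = new_node_threshold   # distance above which a new node is created
        self.lr = lr                           # learning rate for moving the winning node
        self.nodes = []                        # node prototype vectors
        self.weights = None                    # per-dimension metric weights (learned)

    def _dist(self, x, node):
        # weighted Euclidean distance under the current diagonal metric
        return np.sqrt(np.sum(self.weights * (x - node) ** 2))

    def partial_fit(self, x):
        x = np.asarray(x, dtype=float)
        if self.weights is None:
            self.weights = np.ones_like(x)
        if not self.nodes:
            self.nodes.append(x.copy())
            return 0
        dists = [self._dist(x, n) for n in self.nodes]
        winner = int(np.argmin(dists))
        if dists[winner] > self.threshold:
            self.nodes.append(x.copy())        # signal is far from all nodes: new cluster
            return len(self.nodes) - 1
        # move the winner toward the input (competitive learning step)
        self.nodes[winner] += self.lr * (x - self.nodes[winner])
        # crude metric update: down-weight dimensions with large residuals
        residual = (x - self.nodes[winner]) ** 2
        self.weights = 0.9 * self.weights + 0.1 / (1.0 + residual)
        return winner

clusterer = IncrementalMetricClusterer(new_node_threshold=1.5)
stream = np.vstack([np.random.randn(50, 5), np.random.randn(50, 5) + 5.0])
labels = [clusterer.partial_fit(x) for x in stream]
print(len(clusterer.nodes), "clusters discovered")
```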


international conference on pattern recognition | 2008

Motion recognition based on Dynamic-Time Warping method with Self-Organizing Incremental Neural Network

Shogo Okada; Osamu Hasegawa

This paper presents an approach (SOINN-DTW) for motion (gesture) recognition that is based on the Self-Organizing Incremental Neural Network (SOINN) and Dynamic Time Warping (DTW). Using SOINN's ability to eliminate noise in the input data and to represent the input distribution, the SOINN-DTW method approximates the output distribution of each state in a self-organizing manner according to the input data. The proposed SOINN-DTW method enhances the Stochastic Dynamic Time Warping method (Nakagawa, 1986). Results of experiments show that SOINN-DTW outperforms HMM, CRF, and HCRF on motion data.
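
Plain Dynamic Time Warping, which the method builds on, can be sketched in a few lines; the snippet below computes a standard DTW distance between two motion feature sequences and uses it for nearest-template gesture classification. It is a baseline illustration only, not the stochastic, SOINN-enhanced variant proposed in the paper, and the templates and query data are made-up stand-ins.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Plain DTW distance between two sequences of feature vectors
    (frames x dims). The paper's SOINN-DTW replaces the frame cost with
    a SOINN-derived output distribution; this is only the baseline."""
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])  # local frame cost
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# nearest-template classification of a query motion against labeled templates
templates = {"wave": np.random.randn(40, 3), "circle": np.random.randn(55, 3)}
query = np.random.randn(47, 3)
pred = min(templates, key=lambda k: dtw_distance(query, templates[k]))
print("predicted gesture:", pred)
```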


international conference on multimodal interfaces | 2014

Predicting Influential Statements in Group Discussions using Speech and Head Motion Information

Fumio Nihei; Yukiko I. Nakano; Yuki Hayashi; Hung-Hsuan Huang; Shogo Okada

Group discussions are widely used for generating new ideas and forming decisions as a group. Exerting social influence on other members by facilitating the discussion is therefore assumed to be an important part of discussion skill. This study focuses on influential statements that affect the discussion flow and are highly related to facilitation, and aims to establish a model that predicts influential statements in group discussions. First, we collected a multimodal corpus using two different group discussion tasks: in-basket and case-study. Based on schemes for analyzing arguments, each utterance was annotated as being influential or not. Then, we created classification models for predicting influential utterances using prosodic features as well as attention and head motion information from the speaker and the other members of the group. In our model evaluation, we discovered that the assessment of each participant's discussion facilitation skill by experienced observers correlated highly with the number of influential utterances by that participant. This suggests that the proposed model can predict influential statements with considerable accuracy, and that the prediction results can be a good predictor of facilitators in group discussions.
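
A minimal sketch of the modeling step might look like the following: a binary classifier trained on per-utterance multimodal features and evaluated with cross-validation. The feature set, classifier choice, and data below are stand-ins, not the authors' actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical per-utterance features: e.g. mean F0, speech rate, speaker head
# motion energy, listeners' attention ratio. Values here are random stand-ins.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))          # 300 utterances x 6 multimodal features
y = rng.integers(0, 2, size=300)       # 1 = annotated as influential, 0 = not

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print("cross-validated accuracy:", scores.mean())
```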


Proceedings of the 2010 workshop on Eye gaze in intelligent human machine interaction | 2010

Autonomous development of gaze control for natural human-robot interaction

Yasser F. O. Mohammad; Shogo Okada; Toyoaki Nishida

Gaze behavior is one of the most important nonverbal behaviors during human-human close encounters. For this reason, many researchers in natural human-robot interaction focus on developing robots that can achieve human-like gaze behavior. Many approaches have been proposed to achieve this natural gaze behavior based on accurate analysis of human behaviors during natural interactions. One limitation of most available approaches is that the behavior is hardwired into the robot, and learning techniques, if used at all, only adjust the parameters of that behavior. In this paper we propose and evaluate a different approach in which the robot learns natural gaze behavior by watching natural interactions between humans. The proposed approach uses the LiEICA architecture developed by the authors and is completely unsupervised, which leads to grounded behavior. We compare the resulting gaze controller with a state-of-the-art gaze controller that achieved human-like behavior, and show that the proposed approach leads to more natural gaze behavior according to subjective evaluations by the participants.


international conference on multimodal interfaces | 2015

Personality Trait Classification via Co-Occurrent Multiparty Multimodal Event Discovery

Shogo Okada; Oya Aran; Daniel Gatica-Perez

This paper proposes a novel feature extraction framework for inferring personality traits and emergent leadership from multi-party multimodal conversation. The proposed framework represents multimodal features as the combination of each participant's nonverbal activity and the group activity. This feature representation makes it possible to compare, in a metric space, the nonverbal patterns extracted from participants of different groups. It captures how the target member produces nonverbal behavior observed in the group (e.g., the member speaks while all members move their bodies), and it can be applied to any kind of multiparty conversation task. Frequently co-occurring events are discovered from the multimodal sequences using graph clustering. The proposed framework is applied to the ELEA corpus, an audiovisual dataset collected from group meetings. We evaluate the framework on the binary classification of 10 personality traits. Experimental results show that the model trained with co-occurrence features obtains higher accuracy than previous related work for 8 out of 10 traits. In addition, the co-occurrence features improve the accuracy by 2% to 17%.
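
The co-occurrence mining step can be illustrated roughly as follows: binarized nonverbal event streams are turned into a weighted co-occurrence graph whose communities act as candidate co-occurrent multimodal events. The event names, the data, and the particular community-detection routine (networkx's greedy modularity) are assumptions for illustration, not the paper's exact discovery procedure.

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Binarized nonverbal event streams per frame (1 = event active). The event
# names and values are made-up stand-ins for the multimodal sequences above.
rng = np.random.default_rng(1)
events = {
    "A_speaking": rng.integers(0, 2, 500),
    "B_nodding": rng.integers(0, 2, 500),
    "C_body_motion": rng.integers(0, 2, 500),
    "A_head_motion": rng.integers(0, 2, 500),
}

# Co-occurrence graph: edge weight = number of frames where both events fire.
G = nx.Graph()
names = list(events)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        w = int(np.sum(events[a] & events[b]))
        if w > 0:
            G.add_edge(a, b, weight=w)

# Graph clustering: groups of events that tend to fire together become
# candidate co-occurrent "multimodal events".
for community in greedy_modularity_communities(G, weight="weight"):
    print(sorted(community))
```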


international conference on multimodal interfaces | 2016

Estimating communication skills using dialogue acts and nonverbal features in multiple discussion datasets

Shogo Okada; Yoshihiko Ohtake; Yukiko I. Nakano; Yuki Hayashi; Hung-Hsuan Huang; Yutaka Takase; Katsumi Nitta

This paper focuses on the computational analysis of the individual communication skills of participants in a group. The analysis tackles the problem from three novel angles. First, we extracted features from dialogue act labels that capture how each participant communicates with the others. Second, the communication skill of each participant was assessed by 21 external raters with experience in human resource management, in order to obtain reliable skill scores. Third, we used the MATRICS corpus, which includes three types of group discussion datasets, to analyze the influence of situational variability across discussion types. We developed a regression model to infer the communication skill score using multimodal features, including linguistic and nonverbal features: prosody, speaking turns, and head activity. The experimental results show that the multimodal fusion model with feature selection achieved the best accuracy, with an R2 of 0.74 for the communication skill score. A feature analysis of the models revealed task-dependent and task-independent features that contribute to the prediction performance.
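
A minimal sketch of such a regression pipeline, assuming stand-in features and synthetic skill scores rather than the MATRICS data, could combine univariate feature selection with a ridge regressor and report cross-validated R2:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Stand-in multimodal features per participant (prosody, speaking turns,
# head activity, dialogue-act counts) and synthetic rater-averaged scores.
rng = np.random.default_rng(2)
X = rng.normal(size=(90, 40))       # 90 participants x 40 fused features
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=90)

# Feature selection followed by regularized linear regression.
model = make_pipeline(SelectKBest(f_regression, k=10), Ridge(alpha=1.0))
r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
print("cross-validated R^2:", r2.mean())
```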


international conference on multimodal interfaces | 2013

Context-based conversational hand gesture classification in narrative interaction

Shogo Okada; Mayumi Bono; Katsuya Takanashi; Yasuyuki Sumi; Katsumi Nitta

Communicative hand gestures play important roles in face-to-face conversations. The use of these gestures varies among individuals; even when two speakers narrate the same story, they do not always use the same hand gesture (movement, position, and motion trajectory) to describe the same scene. In this paper, we propose a framework for the classification of communicative gestures in small-group interactions. We focus on how many times the hands are held during a gesture and how long a speaker continues a hand stroke, instead of observing hand positions and motion trajectories. In addition, to model communicative gesture patterns, we use nonverbal features of the participants addressed by the gestures. We extract features of the gesture phases defined by Kendon (2004) and the nonverbal patterns co-occurring with gestures, i.e., the utterances, head gestures, and head direction of each participant, using pattern recognition techniques. In the experiments, we collected eight group narrative interaction datasets to evaluate the classification performance. The experimental results show that gesture phase features and the nonverbal features of the other participants improve the performance of discriminating communicative gestures used in narrative speech from other gestures by 4% to 16%.
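
As a toy illustration of gesture phase features (not Kendon's coding scheme or the paper's feature extractor), the snippet below derives hold counts and stroke duration from a one-dimensional hand-speed signal; the threshold and segmentation rule are assumptions.

```python
import numpy as np

def gesture_phase_features(hand_speed, hold_threshold=0.05):
    """Toy gesture-phase features from a 1-D hand-speed signal (per frame):
    number of hold segments and total stroke duration. Illustrative only."""
    holding = hand_speed < hold_threshold           # frames with a (nearly) still hand
    # count transitions into a hold segment
    n_holds = int(np.sum(holding[1:] & ~holding[:-1]) + (1 if holding[0] else 0))
    stroke_frames = int(np.sum(~holding))           # frames spent moving
    return {"n_holds": n_holds, "stroke_frames": stroke_frames}

speed = np.abs(np.sin(np.linspace(0, 6 * np.pi, 300))) * 0.3   # synthetic hand speed
print(gesture_phase_features(speed))
```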


AI & Society | 2010

Incremental learning of gestures for human–robot interaction

Shogo Okada; Yoichi Kobayashi; Satoshi Ishibashi; Toyoaki Nishida

For a robot to cohabit with people, it should be able to learn people’s nonverbal social behavior from experience. In this paper, we propose a novel machine learning method for recognizing gestures used in interaction and communication. Our method enables robots to learn gestures incrementally during human–robot interaction in an unsupervised manner, and it allows the user to leave the number and types of gestures undefined prior to learning. The proposed method (HB-SOINN) is based on a self-organizing incremental neural network and the hidden Markov model. We have added an interactive learning mechanism to HB-SOINN to prevent a single cluster from failing as a result of polysemy, i.e., being assigned more than one meaning. For example, the sentence “Keep on going left slowly” carries three meanings: “keep on” (1), “going left” (2), and “slowly” (3). We experimentally tested the clustering performance of the proposed method on data obtained by measuring gestures with a motion capture device. The results show that the classification performance of HB-SOINN exceeds that of conventional clustering approaches. In addition, we have found that the interactive learning function improves the learning performance of HB-SOINN.
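
A rough sketch of combining HMMs with clustering for gesture sequences is shown below: one small Gaussian HMM is fitted per sequence, and sequences are then clustered by their cross-likelihoods. It uses batch agglomerative clustering rather than the incremental SOINN of HB-SOINN, and the data and model sizes are made up.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM
from sklearn.cluster import AgglomerativeClustering

# Hypothetical gesture sequences (frames x joint features) from motion capture.
rng = np.random.default_rng(3)
sequences = [rng.normal(size=(int(rng.integers(30, 60)), 6)) for _ in range(12)]

# Fit one small HMM per sequence, then describe every sequence by its
# per-frame log-likelihood under every HMM; similar dynamics end up close.
models = []
for seq in sequences:
    m = GaussianHMM(n_components=3, covariance_type="diag", n_iter=20)
    m.fit(seq)
    models.append(m)
feats = np.array([[m.score(seq) / len(seq) for m in models] for seq in sequences])

# Batch clustering stands in here for the incremental clustering of HB-SOINN.
labels = AgglomerativeClustering(n_clusters=3).fit_predict(feats)
print(labels)
```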


international conference on industrial, engineering and other applications of applied intelligent systems | 2013

Semi-supervised latent Dirichlet allocation for multi-label text classification

Youwei Lu; Shogo Okada; Katsumi Nitta

This paper proposes a semi-supervised latent Dirichlet allocation (ssLDA) method, which differs from existing supervised topic models for multi-label classification in two main aspects. First, both labeled and unlabeled data are used in ssLDA to train the model, which is important for reducing the cost of manual labeling, especially when obtaining a fully labeled dataset is difficult. Second, ssLDA provides a more flexible training scheme that allows two ways of label assignment, whereas existing topic-model-based methods usually focus on only one of them: (1) a document-level assignment of labels to a document; and (2) word-level correspondences between words and labels within a document. Our experimental results indicate that ssLDA has an advantage over other methods in implementation flexibility and can outperform them in terms of multi-label classification performance.
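
The labeled-plus-unlabeled training idea can be illustrated, very loosely, with standard components: topics are fitted on all documents (labeled and unlabeled), and a multi-label classifier is trained on the topic proportions of the labeled subset only. This is not ssLDA itself, just a toy analogue using scikit-learn's unsupervised LDA; the corpus and labels are made up.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression

# Tiny corpus: only the first two documents carry (multi-)labels; the rest are unlabeled.
docs = [
    "robot gesture learning interaction",
    "topic model text classification",
    "gesture recognition motion capture robot",
    "latent dirichlet allocation topic inference text",
]
labels = np.array([[1, 0], [0, 1]])     # multi-label targets for the labeled docs

X = CountVectorizer().fit_transform(docs)
# Topics are learned from labeled AND unlabeled documents...
theta = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(X)
# ...but the multi-label classifier sees only the labeled subset.
clf = OneVsRestClassifier(LogisticRegression()).fit(theta[:2], labels)
print(clf.predict(theta[2:]))
```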

Collaboration


Dive into Shogo Okada's collaborations.

Top Co-Authors

Katsumi Nitta
Tokyo Institute of Technology

Osamu Hasegawa
Tokyo Institute of Technology

Yasuyuki Sumi
Future University Hakodate

Yuki Hayashi
Osaka Prefecture University

Koji Kamei
Nippon Telegraph and Telephone