Natsuki Oka
Kyoto Institute of Technology
Publication
Featured research published by Natsuki Oka.
Speech Communication | 2013
Xiang Zuo; Taisuke Sumii; Naoto Iwahashi; Mikio Nakano; Kotaro Funakoshi; Natsuki Oka
This paper presents a method called Interactive Phoneme Update (IPU) that enables users to teach systems the pronunciation (phoneme sequences) of words in the course of speech interaction. Using the method, users can correct mis-recognized phoneme sequences by repeatedly making correction utterances according to the system responses. The method has two novel features: (1) word-segment-based correction, which allows users to use word segments to locate mis-recognized phonemes based on open-begin-end dynamic programming matching and generalized posterior probability, and (2) history-based correction, which utilizes the phoneme sequences that were recognized and corrected previously in the course of interactive learning of each word. Experimental results show that the proposed IPU method reduces the error rate by a factor of three compared with a previously proposed maximum-likelihood-based method.
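As a rough illustration of the open-begin-end matching mentioned above, the following Python sketch aligns a short word segment against a longer phoneme sequence, letting the match begin and end anywhere in the longer sequence. The function name, the unit edit costs, and the example phonemes are assumptions made for illustration. The paper's own cost function and its use of generalized posterior probability are not reproduced here.

```python
def open_begin_end_match(segment, sequence):
    """Find where `segment` best matches inside `sequence`.

    Returns (start, end, cost): sequence[start:end] is the matched span and
    cost is its edit distance to the segment (unit costs, an assumption).
    """
    n = len(sequence)
    # Row 0: an empty segment matches at any position for free (open begin).
    prev_cost = [0] * (n + 1)
    prev_start = list(range(n + 1))
    for i in range(1, len(segment) + 1):
        cur_cost = [i] + [0] * n
        cur_start = [0] * (n + 1)
        for j in range(1, n + 1):
            mismatch = 0 if segment[i - 1] == sequence[j - 1] else 1
            sub = prev_cost[j - 1] + mismatch   # match or substitute
            skip_seg = prev_cost[j] + 1         # segment phoneme left unmatched
            skip_seq = cur_cost[j - 1] + 1      # extra phoneme in the sequence
            best = min(sub, skip_seg, skip_seq)
            cur_cost[j] = best
            if best == sub:
                cur_start[j] = prev_start[j - 1]
            elif best == skip_seg:
                cur_start[j] = prev_start[j]
            else:
                cur_start[j] = cur_start[j - 1]
        prev_cost, prev_start = cur_cost, cur_start
    # Open end: the match may stop at any position in the sequence.
    end = min(range(n + 1), key=lambda j: prev_cost[j])
    return prev_start[end], end, prev_cost[end]


# Hypothetical phoneme strings: the segment occurs at the end of the sequence.
recognized = ["t", "o", "k", "y", "o", "t", "a", "w", "a"]
correction = ["t", "a", "w", "a"]
start, end, cost = open_begin_end_match(correction, recognized)
```

In this toy example the segment is found exactly, so the matched span is `recognized[5:9]` with cost 0. If a phoneme had been mis-recognized, the span would still be located but with a nonzero cost.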
Neural Networks | 2000
Makoto Hirahara; Natsuki Oka; Toshiki Kindo
The introduction of a hierarchical memory structure into a cascade associative memory model for storing hierarchically correlated patterns remarkably improves the storage capacity and the size of the basins of attraction. A learning algorithm groups descendants (second-level patterns) according to their ancestors (first-level ones), and organizes the memory structure in a weight matrix where the groups are memorized separately. The weight matrix thus takes the form of a pile of covariance matrices, each of which is responsible for recalling only the descendants of one ancestor. Put simply, the model is a multiplex associative memory. The recalling process proceeds as follows. The model first recalls the ancestor of a target descendant. Then, dynamics with a dynamic threshold combine the ancestor and the weight matrix to activate the covariance matrix for recalling only the descendants of that ancestor. This mechanism suppresses the cross-talk noise generated by the descendants of the other ancestors, which enhances the recalling ability.
International Conference on Acoustics, Speech, and Signal Processing | 2010
Xiang Zuo; Naoto Iwahashi; Ryo Taguchi; Shigeki Matsuda; Komei Sugiura; Kotaro Funakoshi; Mikio Nakano; Natsuki Oka
In this paper, we propose a novel method to detect robot-directed (RD) speech that adopts the Multimodal Semantic Confidence (MSC) measure. The MSC measure is used to decide whether the speech can be interpreted as a feasible action under the current physical situation in an object manipulation task. This measure is calculated by integrating speech, image, and motion confidence measures with weightings that are optimized by logistic regression. Experimental results show that, compared with a baseline method that uses speech confidence only, MSC achieved an absolute increase of 5% for clean speech and 12% for noisy speech in terms of average maximum F-measure.
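The weighted integration described above can be sketched in Python as a logistic combination of three confidence scores, with the weights fitted by logistic regression. This is only a minimal sketch: the function names, the toy confidence values, the learning rate, and the 0.5 decision threshold are assumptions for illustration, and the paper's actual confidence measures and training data are not reproduced.

```python
import math


def msc(confidences, weights, bias):
    """Logistic combination of (speech, image, motion) confidence scores."""
    z = bias + sum(w * c for w, c in zip(weights, confidences))
    return 1.0 / (1.0 + math.exp(-z))


def fit(samples, labels, lr=0.5, epochs=2000):
    """Fit the weights and bias by batch gradient descent on the log loss."""
    weights, bias = [0.0, 0.0, 0.0], 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0, 0.0], 0.0
        for x, y in zip(samples, labels):
            err = msc(x, weights, bias) - y
            grad_b += err
            for k in range(3):
                grad_w[k] += err * x[k]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


# Hypothetical toy data: label 1 = robot-directed, 0 = not robot-directed.
samples = [(0.9, 0.8, 0.7), (0.8, 0.9, 0.9), (0.7, 0.6, 0.8),
           (0.2, 0.3, 0.1), (0.1, 0.2, 0.3), (0.3, 0.1, 0.2)]
labels = [1, 1, 1, 0, 0, 0]
weights, bias = fit(samples, labels)
is_robot_directed = [msc(x, weights, bias) > 0.5 for x in samples]
```

An utterance would then be treated as robot-directed when its combined score exceeds the chosen threshold. In practice the threshold would be tuned on held-out data rather than fixed at 0.5.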
Transactions of The Society for Computer Simulation International | 1997
Makoto Hirahara; Natsuki Oka; Toshiki Kindo
H. Gutfreund (Neural networks with hierarchically correlated patterns. Physical Review A, 37, 570–577, 1988) has proposed a model for storing hierarchically correlated patterns in which ancestor patterns are correlated with descendant ones. However, the model has a small storage capacity. Furthermore, ancestors must be given in the learning phase, and the value of the parameter on which the capacity and the basins of attraction strongly depend must be determined. To overcome these problems, we present a model (CASM) consisting of a first associative memory that forms ancestors from their descendants and a second associative memory that stores sparse difference patterns, which carry only the differences between the formed ancestors and the descendants. To evaluate the performance of CASM, extensive simulations are carried out. The results show that the capacity increases with the correlation between the ancestors and the descendants, and is as large as that of sparsely encoded associative memory. The basins of attraction become larger with decreasing correlation, and do not depend on loading level.
Human-Agent Interaction | 2014
Kasumi Abe; Chie Hieida; Muhammad Attamimi; Takayuki Nagai; Takayuki Shimotomai; Takashi Omori; Natsuki Oka
It is difficult to design robotic playmates for introverted children. Therefore, we examined how a robot should play with such shy children. In this study, we hypothesized and tested an effective play strategy for building a good relationship with shy children. We conducted an experiment with 5- to 6-year-old children and a humanoid robot teleoperated by a preschool teacher. We developed a valid play strategy for shy children.
Neural Networks | 2000
Makoto Hirahara; Natsuki Oka; Toshiki Kindo
In conventional models for storing hierarchically correlated patterns, correlations between ancestors (first-level patterns) and their descendants (second-level ones) are assumed to be uniform, so that the descendants are distributed around their ancestors at equal distances. However, this assumption might be unnatural. We believe that objects are encoded into patterns in a way that preserves the similarity between them. In this case, descendants are distributed around their ancestors at various distances, so the assumption is invalid and the conventional models become inapplicable. To overcome this, we propose a model, CASM3, for storing hierarchically correlated patterns with various correlations. In CASM3, critical load levels vary with the descendants, and become higher with increasing correlations. An increase in load level successively destroys the memories of the descendants in descending order of their correlations. The size of the basins of attraction depends on the range of the correlations, and becomes larger as the correlation range is shifted toward lower levels.
International Symposium on Neural Networks | 1991
Natsuki Oka
Human intelligence has been modeled in two ways: modeling based on central symbolic processing, and modeling based on distributed subsymbolic processing. This paper points out the limitations of these two approaches, and proposes a hybrid cognitive model of central symbolic processing on the conscious level, and distributed subsymbolic processing on the unconscious level (C/U model). The advantages of the C/U model are clarified by explaining various functions realized by the model: multistage knowledge retrieval, recognition and inference with situated knowledge, inductive learning, and creative inference. Those functions are realized by utilizing the close interaction between the two levels. Finally, this paper describes an implementation method that utilizes the characteristics of a parallel logic programming language, and also describes a knowledge acquisition system necessary for building practical hybrid systems.
International Conference on Human System Interactions | 2011
Motoyuki Ozeki; Yasuhiro Kashiwagi; Mariko Inoue; Natsuki Oka
A novel visual attention model based on a particle filter is proposed. It is (1) a model that also has a filter-type feature, (2) a compact model independent of the high-level processes, and (3) a unitary model that naturally integrates top-down modulation and bottom-up processes. These features allow the model to be applied simply to robots and to be easily understood by the developers. In this paper, we first briefly discuss human visual attention, computational models for bottom-up attention, and attentional metaphors. We then describe the proposed model and its top-down control interface. Finally, three experiments demonstrate the potential of the proposed model as an attentional metaphor and top-down attention control interface.
Robot and Human Interactive Communication | 2010
Xiang Zuo; Naoto Iwahashi; Ryo Taguchi; Kotaro Funakoshi; Mikio Nakano; Shigeki Matsuda; Komei Sugiura; Natsuki Oka
In this paper, we propose a novel method for a robot to detect robot-directed speech, that is, to distinguish speech that users speak to a robot from speech that users speak to other people or to themselves. The originality of this work is the introduction of a multimodal semantic confidence (MSC) measure, which is used for domain classification of input speech based on the decision on whether the speech can be interpreted as a feasible action under the current physical situation in an object manipulation task. This measure is calculated by integrating speech, object, and motion confidence with weightings that are optimized by logistic regression. Then we integrate this measure with gaze tracking and conduct experiments under conditions of natural human-robot interaction. Experimental results show that the proposed method achieves a high performance of 94% and 96% in average recall and precision rates, respectively, for robot-directed speech detection.
Proceedings of the 2007 Workshop on Multimodal Interfaces in Semantic Interaction | 2007
Kazuaki Tanaka; Xiang Zuo; Yasuaki Sagano; Natsuki Oka
In the future, robots will become common in our daily lives. To use a robot more efficiently, it is desirable that the robot be able to learn. However, a human teaching process for robot learning in a real environment usually takes a very long time. We therefore believe that the robot should learn from implicit information contained in natural human behavior. We focus on the lack of utterance as a kind of implicit information, and argue that, in a robot navigation context, the lack of utterance should be interpreted as a positive evaluation of the ongoing action; we call this the No News Criterion. In this paper, we propose an efficient command learning algorithm based on the No News Criterion, and demonstrate its effectiveness in a human-robot interaction experiment in a real environment.
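The idea of treating silence as positive feedback can be sketched in Python as a small value-table learner. This is a minimal sketch and not the algorithm from the paper: the class name, the action names, the step size, and the assumption that any utterance during an action counts as negative feedback are all hypothetical choices made for illustration.

```python
from collections import defaultdict


class NoNewsLearner:
    """Learns a command-to-action mapping from the presence or absence of speech."""

    def __init__(self, actions, step=0.5):
        self.actions = list(actions)
        self.step = step
        self.value = defaultdict(float)  # (command, action) -> estimated value

    def choose(self, command):
        # Greedy choice; ties are broken by the order of self.actions.
        return max(self.actions, key=lambda a: self.value[(command, a)])

    def update(self, command, action, human_spoke):
        # No news is good news: silence is rewarded, speech is penalized
        # (the penalty for speech is an assumption of this sketch).
        reward = -1.0 if human_spoke else 1.0
        key = (command, action)
        self.value[key] += self.step * (reward - self.value[key])


# Simulated teacher (hypothetical): speaks only when the action is wrong.
intended = {"right": "turn_right"}
learner = NoNewsLearner(["forward", "turn_left", "turn_right"])
for _ in range(5):
    action = learner.choose("right")
    learner.update("right", action, human_spoke=(action != intended["right"]))
learned_action = learner.choose("right")
```

In this simulation the learner tries each wrong action once, is interrupted each time, and then settles on the action that the teacher lets pass in silence.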