
Publication


Featured research published by Naoto Iwahashi.


Intelligent Robots and Systems | 2007

Multimodal object categorization by a robot

Tomoaki Nakamura; Takayuki Nagai; Naoto Iwahashi

In this paper, unsupervised object categorization by robots is examined. We propose an unsupervised multimodal categorization method based on audio-visual and haptic information. The robot uses its physical embodiment to grasp and observe an object from various viewpoints, as well as to listen to the sound it makes during the observation. The proposed categorization method is an extension of probabilistic latent semantic analysis (pLSA), a statistical technique. At the same time, the proposed method provides a probabilistic framework for inferring object properties from limited observations. The validity of the proposed method is shown through experimental results.
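
Since the method builds on pLSA, the following minimal sketch (not the authors' implementation) shows how EM over a matrix of concatenated audio-visual-haptic feature counts yields unsupervised category assignments; the toy data and dimensions are placeholders.

```python
# Minimal pLSA-style EM over multimodal bag-of-features counts
# (visual, audio, and haptic histograms concatenated per object).
import numpy as np

def plsa(counts, n_categories, n_iters=100, rng=np.random.default_rng(0)):
    """counts: (n_objects, n_features) matrix of multimodal feature counts."""
    n_obj, n_feat = counts.shape
    p_z_given_d = rng.dirichlet(np.ones(n_categories), size=n_obj)   # P(z|d)
    p_w_given_z = rng.dirichlet(np.ones(n_feat), size=n_categories)  # P(w|z)
    for _ in range(n_iters):
        # E-step: responsibility P(z|d,w) for every object/feature pair
        joint = p_z_given_d[:, :, None] * p_w_given_z[None, :, :]    # (d, z, w)
        resp = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate both distributions from expected counts
        expected = counts[:, None, :] * resp                         # (d, z, w)
        p_z_given_d = expected.sum(axis=2)
        p_z_given_d /= p_z_given_d.sum(axis=1, keepdims=True) + 1e-12
        p_w_given_z = expected.sum(axis=0)
        p_w_given_z /= p_w_given_z.sum(axis=1, keepdims=True) + 1e-12
    return p_z_given_d, p_w_given_z

# Toy usage: 20 objects, 50-dimensional concatenated multimodal histogram.
X = np.random.default_rng(1).integers(0, 5, size=(20, 50))
p_cat, p_feat = plsa(X, n_categories=3)
print(p_cat.argmax(axis=1))  # unsupervised category assignment per object
```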


Intelligent Robots and Systems | 2009

Grounding of word meanings in multimodal concepts using LDA

Tomoaki Nakamura; Takayuki Nagai; Naoto Iwahashi

In this paper, we propose an LDA-based framework for multimodal categorization and word grounding by robots. The robot uses its physical embodiment to grasp and observe an object from various viewpoints, as well as to listen to the sound it makes during the observation period. This multimodal information is used to categorize objects and form multimodal concepts. At the same time, the words acquired during the observation period are connected to the related concepts using multimodal LDA. We also provide a relevance measure that encodes the degree of connection between words and modalities. The proposed algorithm is implemented on a robot platform and experiments are carried out to evaluate it. We also demonstrate a simple conversation between a user and the robot based on the learned model.
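
As an illustration of the grounding idea, the hedged sketch below treats each object as a document of concatenated visual, audio, haptic, and word counts, fits standard LDA from scikit-learn in place of the paper's multimodal LDA, and computes an assumed word-modality relevance score (cosine similarity of per-topic profiles) as a stand-in for the paper's relevance measure.

```python
# Sketch of grounding words in multimodal concepts with LDA: the relevance
# score is an illustrative proxy, not the paper's measure, and the data is random.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
n_vis, n_aud, n_hap, n_word = 30, 20, 10, 15
X = rng.integers(0, 5, size=(40, n_vis + n_aud + n_hap + n_word))

lda = LatentDirichletAllocation(n_components=4, random_state=0).fit(X)
phi = lda.components_ / lda.components_.sum(axis=1, keepdims=True)  # P(feature|topic)

slices = {"visual": slice(0, n_vis),
          "audio": slice(n_vis, n_vis + n_aud),
          "haptic": slice(n_vis + n_aud, n_vis + n_aud + n_hap)}
words = slice(n_vis + n_aud + n_hap, None)

def relevance(word_idx, modality):
    """Cosine similarity between a word's topic profile and a modality's
    aggregated topic profile -- an assumed proxy for word-modality relevance."""
    w = phi[:, words][:, word_idx]
    m = phi[:, slices[modality]].sum(axis=1)
    return float(w @ m / (np.linalg.norm(w) * np.linalg.norm(m) + 1e-12))

print(relevance(0, "visual"), relevance(0, "haptic"))
```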


Information Sciences | 2003

Language acquisition through a human-robot interface by combining speech, visual, and behavioral information

Naoto Iwahashi

This paper describes new language-processing methods suitable for human-robot interfaces. These methods enable a robot to learn linguistic knowledge from scratch in an unsupervised way. The learning is done through statistical optimization in the process of human-robot communication, combining speech, visual, and behavioral information in a probabilistic framework. The linguistic knowledge learned includes speech units such as phonemes, a lexicon, and grammar, and is represented by a graphical model that includes hidden Markov models. In experiments, a robot was eventually able to understand utterances according to the given situation and act appropriately.
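
The abstract does not give the model equations, so the sketch below only illustrates, under assumed placeholder scoring functions, the general idea of choosing the interpretation of an utterance that maximizes a combined speech, visual, and behavioral log-score; none of the function names or weights come from the paper.

```python
# Illustrative sketch (not the paper's implementation): an utterance is understood
# by choosing the interpretation that maximizes a joint score combining speech,
# visual, and behavioral log-likelihoods. The component scores are placeholders.

def speech_score(utterance, word_seq):
    # stand-in for an HMM acoustic/lexical log-likelihood log P(utterance | words)
    return -0.1 * abs(len(utterance.split()) - len(word_seq))

def visual_score(word_seq, scene_objects):
    # stand-in for log P(referent | words, scene): reward words naming visible objects
    return sum(0.0 if w in scene_objects else -1.0 for w in word_seq)

def behavior_score(word_seq, context):
    # stand-in for log P(action | words, context), e.g. from learned motion models
    return 0.0 if context.get("last_action") in word_seq else -0.5

def understand(utterance, candidates, scene_objects, context):
    return max(candidates,
               key=lambda ws: speech_score(utterance, ws)
                            + visual_score(ws, scene_objects)
                            + behavior_score(ws, context))

print(understand("put the box on the red ball",
                 [("put", "box", "ball"), ("put", "cup", "table")],
                 scene_objects={"box", "ball"},
                 context={"last_action": "put"}))
```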


Intelligent Robots and Systems | 2011

Autonomous acquisition of multimodal information for online object concept formation by robots

Takaya Araki; Tomoaki Nakamura; Takayuki Nagai; Kotaro Funakoshi; Mikio Nakano; Naoto Iwahashi

This paper proposes a robot that acquires multimodal information, i.e., auditory, visual, and haptic information, in a fully autonomous way using its embodiment. We also propose an online algorithm for multimodal categorization based on the acquired multimodal information and words, which are partially given by human users. The proposed framework makes it possible for the robot to learn object concepts naturally during everyday operation, in conjunction with a small amount of linguistic information from human users. To obtain multimodal information, the robot detects an object on a flat surface; it then grasps and shakes the object to gain haptic and auditory information. For visual information, the robot uses a small handheld observation table, so that it can control the viewpoints from which the object is observed. As for multimodal concept formation, the multimodal LDA using Gibbs sampling is extended to an online version in this paper. The proposed algorithms are implemented on a real robot and tested using real everyday objects to show the validity of the proposed system.
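
As a rough illustration of the online extension (not the authors' online MLDA), the sketch below keeps global topic-feature statistics and updates them with a single-pass collapsed Gibbs sweep for each newly acquired object histogram, so past objects never need to be revisited; the class name and hyperparameters are assumptions.

```python
# Hedged sketch of online categorization: global topic statistics are updated
# incrementally as each newly grasped object arrives.
import numpy as np

class OnlineCategorizer:
    def __init__(self, n_topics, n_features, alpha=1.0, beta=0.1, seed=0):
        self.K, self.alpha, self.beta = n_topics, alpha, beta
        self.n_kw = np.zeros((n_topics, n_features))  # topic-feature counts
        self.n_k = np.zeros(n_topics)                 # topic totals
        self.rng = np.random.default_rng(seed)

    def update(self, feature_counts, n_sweeps=20):
        """feature_counts: multimodal histogram (visual+audio+haptic) of one object."""
        tokens = np.repeat(np.arange(len(feature_counts)), feature_counts)
        z = self.rng.integers(0, self.K, size=len(tokens))
        doc_nk = np.bincount(z, minlength=self.K).astype(float)
        for _ in range(n_sweeps):
            for i, w in enumerate(tokens):
                doc_nk[z[i]] -= 1
                p = (doc_nk + self.alpha) * \
                    (self.n_kw[:, w] + self.beta) / (self.n_k + self.beta * self.n_kw.shape[1])
                z[i] = self.rng.choice(self.K, p=p / p.sum())
                doc_nk[z[i]] += 1
        # commit this object's assignments to the global (online) statistics
        np.add.at(self.n_kw, (z, tokens), 1)
        self.n_k += doc_nk
        return int(doc_nk.argmax())  # most probable category for this object

cat = OnlineCategorizer(n_topics=3, n_features=60)
stream = np.random.default_rng(1).integers(0, 4, size=(10, 60))
print([cat.update(obj) for obj in stream])  # categories assigned object by object
```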


Advanced Robotics | 2016

Symbol emergence in robotics: a survey

Tadahiro Taniguchi; Takayuki Nagai; Tomoaki Nakamura; Naoto Iwahashi; Tetsuya Ogata; Hideki Asoh

Humans can learn a language through physical interaction with their environment and semiotic communication with other people. It is very important to obtain a computational understanding of how humans form symbol systems and obtain semiotic skills through their autonomous mental development. Recently, many studies have been conducted on the construction of robotic systems and machine learning methods that can learn a language through embodied multimodal interaction with their environment and other systems. Understanding human social interactions and developing a robot that can smoothly communicate with human users in the long term require an understanding of the dynamics of symbol systems. The embodied cognition and social interaction of participants gradually alter a symbol system in a constructive manner. In this paper, we introduce a field of research called symbol emergence in robotics (SER). SER represents a constructive approach towards a symbol emergence system. The symbol emergence system is socially self-organized through both semiotic communication and physical interaction among autonomous cognitive developmental agents, i.e., humans and developmental robots. Specifically, we describe state-of-the-art research topics concerning SER, such as multimodal categorization, word discovery, and double articulation analysis. These enable robots to discover words and their embodied meanings from raw sensory-motor information, including visual information, haptic information, auditory information, and acoustic speech signals, in a totally unsupervised manner. Finally, we suggest future directions for research in SER.


Speech Communication | 2013

Correcting phoneme recognition errors in learning word pronunciation through speech interaction

Xiang Zuo; Taisuke Sumii; Naoto Iwahashi; Mikio Nakano; Kotaro Funakoshi; Natsuki Oka

This paper presents a method called Interactive Phoneme Update (IPU) that enables users to teach systems the pronunciation (phoneme sequences) of words in the course of speech interaction. Using the method, users can correct mis-recognized phoneme sequences by repeatedly making correction utterances in response to the system's outputs. The original contributions of this method are: (1) word-segment-based correction, which allows users to use word segments to locate mis-recognized phonemes based on open-begin-end dynamic programming matching and generalized posterior probability, and (2) history-based correction, which utilizes the phoneme sequences that were recognized and corrected previously in the course of the interactive learning of each word. Experimental results show that the proposed IPU method reduces the error rate by a factor of three over a previously proposed maximum-likelihood-based method.
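
The sketch below illustrates only the word-segment-based correction step under simplified assumptions: a correction segment is located within the stored phoneme sequence by open-begin-end DP matching with unit edit costs, and the best-matching span is replaced. The generalized-posterior-probability confidence weighting and the history-based correction from the paper are omitted, and the example phoneme strings are invented.

```python
# Open-begin-end DP matching: find the span of `stored` that best matches
# `segment` (free start and end on the stored side), then substitute it.

def locate_segment(stored, segment, sub=1, ins=1, dele=1):
    """Return (start, end) of the span in `stored` that best matches `segment`."""
    n, m = len(stored), len(segment)
    INF = float("inf")
    # dp[i][j]: best cost of aligning segment[:j] ending at stored position i
    dp = [[0 if j == 0 else INF for j in range(m + 1)] for _ in range(n + 1)]
    back = [[0] * (m + 1) for _ in range(n + 1)]   # start position of the match
    for i in range(n + 1):
        back[i][0] = i                              # open begin: match may start anywhere
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cand = [
                (dp[i - 1][j - 1] + (0 if stored[i - 1] == segment[j - 1] else sub), back[i - 1][j - 1]),
                (dp[i - 1][j] + dele, back[i - 1][j]),
                (dp[i][j - 1] + ins, back[i][j - 1]),
            ]
            dp[i][j], back[i][j] = min(cand)
    end = min(range(n + 1), key=lambda i: dp[i][m])  # open end
    return back[end][m], end

def apply_correction(stored, correction_segment):
    s, e = locate_segment(stored, correction_segment)
    return stored[:s] + correction_segment + stored[e:]

# e.g. the system stored /o r a n j i/ but the user's correction segment is /r e n j i/
stored = ["o", "r", "a", "n", "j", "i"]
print(apply_correction(stored, ["r", "e", "n", "j", "i"]))  # -> o r e n j i
```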


Intelligent Robots and Systems | 2012

Online learning of concepts and words using multimodal LDA and hierarchical Pitman-Yor Language Model

Takaya Araki; Tomoaki Nakamura; Takayuki Nagai; Shogo Nagasaka; Tadahiro Taniguchi; Naoto Iwahashi

In this paper, we propose an online algorithm for multimodal categorization based on autonomously acquired multimodal information and partial words given by human users. For multimodal concept formation, multimodal latent Dirichlet allocation (MLDA) using Gibbs sampling is extended to an online version. We introduce a particle filter, which significantly improves the performance of the online MLDA, to keep track of good models among various models with different parameters. We also introduce an unsupervised word segmentation method based on the hierarchical Pitman-Yor language model (HPYLM). Since the HPYLM requires no predefined lexicon, we can build a robot system that learns concepts and words in a completely unsupervised manner. The proposed algorithms are implemented on a real robot and tested using real everyday objects to show the validity of the proposed system.
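
The following is a hedged sketch of the particle-filter component only: several model hypotheses with different hyperparameters are reweighted by how well they predict each incoming object and then resampled, so good models survive. The per-particle model here is a deliberately simple online multinomial mixture rather than MLDA, and the HPYLM-based word segmentation is not shown.

```python
# Particle filter over competing online category models (simplified stand-in).
import numpy as np

rng = np.random.default_rng(0)
N_PART, N_CAT, N_FEAT = 30, 3, 40

def new_particle():
    return {"beta": rng.uniform(0.05, 1.0),                  # per-particle hyperparameter
            "counts": np.zeros((N_CAT, N_FEAT))}             # category-feature counts

def log_pred(particle, x):
    """Mixture predictive log-likelihood of one object's feature histogram x."""
    theta = particle["counts"] + particle["beta"]
    theta /= theta.sum(axis=1, keepdims=True)
    per_cat = x @ np.log(theta).T                             # log p(x | category)
    return np.log(np.mean(np.exp(per_cat - per_cat.max()))) + per_cat.max()

def assimilate(particle, x):
    theta = particle["counts"] + particle["beta"]
    theta /= theta.sum(axis=1, keepdims=True)
    k = int((x @ np.log(theta).T).argmax())                   # hard-assign to best category
    particle["counts"][k] += x

particles = [new_particle() for _ in range(N_PART)]
stream = rng.integers(0, 4, size=(15, N_FEAT))                # incoming object histograms
for x in stream:
    logw = np.array([log_pred(p, x) for p in particles])
    w = np.exp(logw - logw.max()); w /= w.sum()
    idx = rng.choice(N_PART, size=N_PART, p=w)                # resample good models
    particles = [{"beta": particles[i]["beta"],
                  "counts": particles[i]["counts"].copy()} for i in idx]
    for p in particles:
        assimilate(p, x)
print(sorted(round(p["beta"], 2) for p in particles)[:5])     # surviving hyperparameters
```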


Intelligent Robots and Systems | 2011

Multimodal categorization by hierarchical Dirichlet process

Tomoaki Nakamura; Takayuki Nagai; Naoto Iwahashi

In this paper, we propose a nonparametric Bayesian framework for categorizing multimodal sensory signals, such as audio, visual, and haptic information, by robots. The robot uses its physical embodiment to grasp and observe an object from various viewpoints, as well as to listen to the sound it makes during the observation. The multimodal information enables the robot to form human-like object categories that are the basis of intelligence. The proposed method is an extension of the hierarchical Dirichlet process (HDP), a nonparametric Bayesian model, to a multimodal HDP (MHDP). The MHDP can estimate the number of categories, whereas a parametric model, e.g. LDA-based categorization, requires the number to be specified in advance. As this is an unsupervised learning method, a human user does not need to give any correct labels to the robot, and it can classify objects autonomously. At the same time, the proposed method provides a probabilistic framework for inferring object properties from limited observations. The validity of the proposed method is shown through experimental results.
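
The defining property of the nonparametric model, that the number of categories is inferred rather than fixed, can be illustrated with an off-the-shelf Dirichlet-process Gaussian mixture; the sketch below is a stand-in for the paper's MHDP over multimodal count data, not a reimplementation of it, and the toy feature dimensions are assumptions.

```python
# Dirichlet-process Gaussian mixture on concatenated multimodal feature vectors:
# the number of categories actually used is inferred from the data.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Toy data: three latent object categories in a 12-dim multimodal feature space.
centers = rng.normal(scale=5.0, size=(3, 12))
X = np.vstack([c + rng.normal(size=(30, 12)) for c in centers])

dpgmm = BayesianGaussianMixture(
    n_components=10,                                    # upper bound, not the answer
    weight_concentration_prior_type="dirichlet_process",
    random_state=0).fit(X)

labels = dpgmm.predict(X)
print("categories actually used:", len(np.unique(labels)))  # typically recovers ~3
```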


Advanced Robotics | 2011

Learning, Generation and Recognition of Motions by Reference-Point-Dependent Probabilistic Models

Komei Sugiura; Naoto Iwahashi; Hideki Kashioka; Satoshi Nakamura

This paper presents a novel method for learning object manipulation such as rotating an object or placing one object on another. In this method, motions are learned using reference-point-dependent probabilistic models, which can be used for both the generation and recognition of motions. The method estimates (i) the reference point, (ii) the intrinsic coordinate system type, i.e., the type of coordinate system intrinsic to a motion, and (iii) the parameters of the probabilistic model of the motion defined in that intrinsic coordinate system. Motion trajectories are modeled by a hidden Markov model (HMM), and an HMM-based method using static and dynamic features is used for trajectory generation. The method was evaluated in physical experiments in terms of motion generation and recognition. In the experiments, users demonstrated the manipulation of puppets and toys so that the motions could be learned. A recognition accuracy of 90% was obtained for a test set of motions performed by three subjects. Furthermore, the results showed that appropriate motions were generated even when the object placement was changed.
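
The sketch below illustrates one ingredient under simplifying assumptions: each demonstration is re-expressed relative to a candidate reference object and fitted with a Gaussian HMM (via the hmmlearn package), and the candidate under which the demonstrations become mutually consistent, i.e. the one with the highest likelihood, is selected. The intrinsic coordinate-system estimation and HMM-based generation from static and dynamic features described in the paper are omitted, and the toy scenes are invented.

```python
# Reference-point selection by HMM likelihood over relative trajectories.
# Requires `pip install hmmlearn`.
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(0)

def make_scene():
    """One demo: two candidate objects at random positions; the hand moves to object A."""
    obj_a, obj_b = rng.uniform(-5, 5, size=2), rng.uniform(-5, 5, size=2)
    start = rng.uniform(-5, 5, size=2)
    t = np.linspace(0.0, 1.0, 40)[:, None]
    traj = start + (obj_a - start) * t + rng.normal(scale=0.05, size=(40, 2))
    return traj, {"A": obj_a, "B": obj_b}

demos = [make_scene() for _ in range(6)]
lengths = [len(traj) for traj, _ in demos]

def score_hypothesis(which):
    """Log-likelihood of an HMM fitted to trajectories expressed relative to candidate `which`."""
    X = np.vstack([traj - objects[which] for traj, objects in demos])
    model = hmm.GaussianHMM(n_components=3, covariance_type="diag",
                            n_iter=30, random_state=0)
    model.fit(X, lengths)
    return model.score(X, lengths)

print({which: round(score_hypothesis(which), 1) for which in ("A", "B")})
# The correct reference ("A") is expected to score higher.
```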


International Conference on Robotics and Automation | 2011

Bag of multimodal LDA models for concept formation

Tomoaki Nakamura; Takayuki Nagai; Naoto Iwahashi

In this paper, a novel framework for multimodal categorization using a bag of multimodal LDA models is proposed. The main issue tackled in this paper is the granularity of categories: categories are not fixed but vary according to context. Selective attention is the key to modeling this granularity, which motivates us to introduce various sets of weights on the perceptual information. Obviously, as the weights change, the categories vary. In the proposed model, various sets of weights and model structures are assumed, and multimodal LDA-based categorization is carried out many times, resulting in a variety of models. To make the categories (concepts) useful for inference, significant models should be selected; this selection is carried out through interaction between the robot and the user. The selected models enable the robot to infer unobserved properties of an object. For example, the robot can infer audio information from appearance alone. Furthermore, the robot can describe the appearance of an object using suitable words, thanks to the connection between words and perceptual information. The proposed algorithm is implemented on a robot platform and a preliminary experiment is carried out to validate it.
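
As a rough illustration of the bag-of-models idea, the sketch below trains several LDA models under different random modality weights (standing in for selective attention) and keeps the few judged best by a placeholder selection score; the actual method selects models through interaction with the user and uses multimodal LDA rather than scikit-learn's LDA, so every function and constant here is an assumption.

```python
# A bag of differently-attended category models, followed by model selection.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
n_vis, n_aud, n_hap = 25, 15, 10
X = rng.integers(0, 5, size=(40, n_vis + n_aud + n_hap))
modality_slices = [slice(0, n_vis), slice(n_vis, n_vis + n_aud),
                   slice(n_vis + n_aud, n_vis + n_aud + n_hap)]

def weighted_counts(X, weights):
    """Scale each modality's histogram by its attention weight."""
    Xw = X.astype(float).copy()
    for s, w in zip(modality_slices, weights):
        Xw[:, s] *= w
    return np.rint(Xw).astype(int)

bag = []
for _ in range(8):                                    # a bag of models
    weights = rng.dirichlet(np.ones(3)) * 3.0         # random attention over modalities
    lda = LatentDirichletAllocation(n_components=4, random_state=0)
    labels = lda.fit_transform(weighted_counts(X, weights)).argmax(axis=1)
    bag.append({"weights": weights, "model": lda, "labels": labels})

def selection_score(entry):
    """Placeholder for interactive model selection: prefer balanced category usage."""
    _, counts = np.unique(entry["labels"], return_counts=True)
    return counts.min() / counts.max() if len(counts) > 1 else 0.0

selected = sorted(bag, key=selection_score, reverse=True)[:3]
print([np.round(e["weights"], 2) for e in selected])   # attention weights of kept models
```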

Collaboration


Dive into Naoto Iwahashi's collaborations.

Top Co-Authors


Takayuki Nagai

University of Electro-Communications


Tomoaki Nakamura

University of Electro-Communications


Ryo Taguchi

Nagoya Institute of Technology


Xiang Zuo

Kyoto Institute of Technology


Satoshi Nakamura

Nara Institute of Science and Technology


Hideki Kashioka

National Institute of Information and Communications Technology


Takaya Araki

University of Electro-Communications
