Eiji Uchibe
Osaka University
Publication
Featured research published by Eiji Uchibe.
Artificial Intelligence | 1999
Minoru Asada; Eiji Uchibe; Koh Hosoda
In this paper, we first discuss the meaning of physical embodiment and the complexity of the environment in the context of multi-agent learning. We then propose a vision-based reinforcement learning method that acquires cooperative behaviors in a dynamic environment. We use the robot soccer game initiated by RoboCup (Kitano et al., 1997) to illustrate the effectiveness of our method. Each agent works with other team members to achieve a common goal against opponents. Our method estimates the relationships between a learner's behaviors and those of other agents in the environment through interactions (observations and actions) using a technique from system identification. In order to identify the model of each agent, Akaike's Information Criterion is applied to the results of Canonical Variate Analysis to clarify the relationship between the observed data in terms of actions and future observations. Next, reinforcement learning based on the estimated state vectors is performed to obtain the optimal behavior policy. The proposed method is applied to a soccer playing situation. The method successfully models a rolling ball and other moving agents and acquires the learner's behaviors. Computer simulations and real experiments are shown and a discussion is given.
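The order-selection step described above can be pictured with a minimal sketch. The snippet below selects a state dimension by Akaike's Information Criterion, assuming an ordinary least-squares predictor in place of the full Canonical Variate Analysis; the function names (`select_state_dimension`, `aic_for_order`) and data layout are illustrative, not taken from the paper.

```python
import numpy as np

def fit_linear_predictor(X, Y):
    """Least-squares fit of Y from X; returns weights and residual variance."""
    W, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ W
    sigma2 = float(np.mean(resid ** 2))
    return W, sigma2

def aic_for_order(past, future, n):
    """AIC of a linear predictor that keeps only the n most significant inputs."""
    X = past[:, :n]                         # first n columns: most significant components
    W, sigma2 = fit_linear_predictor(X, future)
    T = future.shape[0]
    k = W.size                              # number of estimated parameters
    return T * np.log(sigma2 + 1e-12) + 2 * k

def select_state_dimension(past, future, max_order):
    """Pick the state dimension (model order) that minimizes AIC."""
    scores = [aic_for_order(past, future, n) for n in range(1, max_order + 1)]
    return int(np.argmin(scores)) + 1
```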
Intelligent Robots and Systems | 1996
Eiji Uchibe; Minoru Asada; Koh Hosoda
Coordination of multiple behaviors independently obtained by a reinforcement learning method is one of the key issues in scaling the method to larger and more complex robot learning tasks. Direct combination of all the state spaces for individual modules (subtasks) requires enormous learning time and causes hidden states. This paper presents a modular learning method which coordinates multiple behaviors while taking account of a trade-off between learning time and performance. First, in order to reduce the learning time, the whole state space is classified into two categories based on the action values separately obtained by Q-learning: the area where one of the learned behaviors is directly applicable (the no-more-learning area), and the area where learning is necessary due to competition among multiple behaviors (the re-learning area). Second, hidden states are detected by model fitting to the learned action values based on the information criterion. Finally, the initial action values in the re-learning area are adjusted so that they are consistent with the values in the no-more-learning area. The method is applied to one-to-one soccer-playing robots. Computer simulation and real robot experiments are given to show the validity of the proposed method.
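As a rough sketch of the classification step, the snippet below splits a shared state space into no-more-learning and re-learning areas from independently learned Q tables. The agreement test and max-based initialization are simplifications for illustration only, not the paper's exact criterion.

```python
import numpy as np

def coordinate_modules(q_modules):
    """q_modules: list of Q tables, one per subtask, each of shape
    (n_states, n_actions), learned independently by Q-learning.
    Returns a mask of states that still need learning and an initial Q table
    for the coordinating policy that is consistent with the learned modules."""
    greedy = np.stack([q.argmax(axis=1) for q in q_modules])   # (n_modules, n_states)
    # states where every module already agrees on the best action: no more learning
    agree = np.all(greedy == greedy[0], axis=0)
    relearn_mask = ~agree                                      # competing behaviors: re-learn
    q_init = np.max(np.stack(q_modules), axis=0)               # consistent initial values
    return relearn_mask, q_init
```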
Intelligent Robots and Systems | 1994
Minoru Asada; Eiji Uchibe; Shoichi Noda; Sukoya Tawaratsumida; Koh Hosoda
A method is proposed which accomplishes a whole task consisting of plural subtasks by coordinating multiple behaviors acquired by vision-based reinforcement learning. First, individual behaviors which achieve the corresponding subtasks are independently acquired by Q-learning, a widely used reinforcement learning method. Each learned behavior can be represented by an action-value function in terms of the state of the environment and the robot action. Next, three kinds of coordination of multiple behaviors are considered: simple summation of different action-value functions, switching action-value functions according to situations, and learning with previously obtained action-value functions as initial values of a new action-value function. A task of shooting a ball into the goal while avoiding collisions with an enemy is examined. The task can be decomposed into a ball-shooting subtask and a collision-avoiding subtask. These subtasks should be accomplished simultaneously, but they are not independent of each other.
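The three coordination schemes listed above can be sketched directly on the learned Q tables. This is a minimal illustration assuming tabular action-value functions for the shooting and avoiding subtasks; the function names and the `avoid_states` mask are hypothetical.

```python
import numpy as np

def combine_by_summation(q_shoot, q_avoid):
    """Simple summation of the two action-value functions."""
    return q_shoot + q_avoid

def combine_by_switching(q_shoot, q_avoid, avoid_states):
    """Use the avoidance values in states flagged as dangerous,
    and the shooting values everywhere else."""
    q = q_shoot.copy()
    q[avoid_states] = q_avoid[avoid_states]
    return q

def initial_values_for_relearning(q_shoot, q_avoid):
    """Start a new round of Q-learning from the combined table instead of zeros."""
    return q_shoot + q_avoid
```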
International Conference on Robotics and Automation | 2001
Eiji Uchibe; Tatsunori Kato; Minoru Asada; Koh Hosoda
It is necessary to coordinate multiple tasks in order to cope with larger-scale and more complicated tasks; however, it is very hard to accomplish multiple tasks at the same time. This paper proposes a method to resolve conflicts between task modules through the process of their execution. Based on the proposed method, the robot can select an appropriate module according to its priority. In addition, we apply the module conflict resolution to a multiagent environment; consequently, multiple tasks are automatically allocated to the multiple robots. A soccer game is selected as a task example to show the validity of the proposed method. Real experiments are shown, and a discussion is given.
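Priority-based module selection can be pictured with a very small sketch. The module interface below (`applicable`, `priority`) is hypothetical and not taken from the paper; it only illustrates choosing the highest-priority module that is executable in the current state.

```python
def select_module(modules, state):
    """Pick the applicable task module with the highest priority.
    Each module is assumed to expose applicable(state) and a numeric priority."""
    runnable = [m for m in modules if m.applicable(state)]
    return max(runnable, key=lambda m: m.priority) if runnable else None
```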
Intelligent Robots and Systems | 1998
Eiji Uchibe; Masateru Nakamura; Minoru Asada
Co-evolution has been receiving increased attention as a method for multi-agent simultaneous learning. This paper discusses how multiple robots can acquire cooperative behaviors through co-evolutionary processes. As an example task, a simplified soccer game with three learning robots is selected, and a genetic programming method is applied to an individual population corresponding to each robot so as to obtain cooperative and competitive behaviors. The complexity of the problem is twofold: co-evolution for cooperative behaviors needs exact synchronization of mutual evolutions, and three-robot co-evolution requires carefully designed environment setups that may gradually change from simpler to more complicated situations. Simulation results are shown, and a discussion is given.
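A generic co-evolution loop of the kind described here might look as follows. This is a sketch, not the paper's algorithm: the `evaluate`, `select`, and `vary` callbacks stand in for the soccer simulation and the genetic programming operators, and pairing each individual with randomly sampled partners from the other populations is one common evaluation scheme.

```python
import random

def coevolve(populations, evaluate, select, vary, generations=100):
    """populations: one list of candidate GP programs per robot.
    evaluate(team) returns a list of fitness values, one per team member.
    Each individual is scored by playing together with partners sampled from
    the other populations, so all populations evolve simultaneously."""
    for _ in range(generations):
        next_populations = []
        for i, pop in enumerate(populations):
            fitnesses = []
            for individual in pop:
                team = [individual if j == i else random.choice(populations[j])
                        for j in range(len(populations))]
                fitnesses.append(evaluate(team)[i])
            parents = select(pop, fitnesses)
            next_populations.append(vary(parents))   # GP crossover / mutation
        populations = next_populations
    return populations
```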
Robot Soccer World Cup | 1999
Eiji Uchibe; Masateru Nakamura; Minoru Asada
Co-evolution has recently been receiving increased attention as a method for multi-agent simultaneous learning. This paper discusses how multiple robots can acquire cooperative behaviors through co-evolutionary processes. As an example task, a simplified soccer game with three learning robots is selected, and a GP (genetic programming) method is applied to an individual population corresponding to each robot so as to obtain cooperative and competitive behaviors through evolutionary processes. The complexity of the problem is twofold: co-evolution for cooperative behaviors needs exact synchronization of mutual evolutions, and three-robot co-evolution requires carefully designed environment setups that may gradually change from simpler to more complicated situations, so that the robots can obtain cooperative and competitive behaviors simultaneously over a wide range of the search space. Simulation results are shown, and a discussion is given.
International Conference on Robotics and Automation | 1998
Eiji Uchibe; Minoru Asada; Koh Hosoda
This paper discusses how a robot can develop its state vector according to the complexity of the interactions with its environment. A method for controlling the complexity is proposed for a vision-based mobile robot whose task is to shoot a ball into a goal while avoiding collisions with a goalkeeper. First, we provide the most difficult situation (the goalkeeper moving at its maximum speed with a ball-chasing behavior), and the robot estimates the full set of state vectors, with the order of the major vector components, by a method of system identification. The environmental complexity is defined in terms of the speed of the goalkeeper, while the complexity of the state vector is the number of its dimensions. As the speed of the goalkeeper increases, the dimension of the state vector is increased, taking into account a trade-off between the size of the state space (the dimension) and the learning time. Simulations are shown, and other issues for complexity control are discussed.
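The dimension-growing idea can be sketched as a simple loop that adds state-vector components until prediction is good enough at the current environmental complexity. The `predict_error` routine and the threshold are hypothetical stand-ins for the paper's identification-based criterion.

```python
def choose_state_dimension(components, goalie_speed, predict_error, threshold):
    """components: state-vector components ordered by significance, as returned
    by the system identification step.
    predict_error(components, goalie_speed): hypothetical routine measuring
    prediction error at the current environmental complexity.
    Dimensions are added one at a time until the error is acceptable, trading
    state-space size (and hence learning time) against predictive accuracy."""
    for dim in range(1, len(components) + 1):
        if predict_error(components[:dim], goalie_speed) <= threshold:
            return dim
    return len(components)
```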
Systems, Man and Cybernetics | 1999
Eiji Uchibe; Minoru Asada
A vector-valued reward function is discussed in the context of multiple behavior coordination, especially in a dynamically changing multiagent environment. Unlike the traditional weighted sum of several reward functions, we define a vector-valued value function which evaluates the current action strategy by introducing a discount matrix to integrate several reward functions. Owing to this extension of the value function, the learning robot can appropriately estimate the multiple future rewards from the environment without suffering from the weighting problem. The proposed method is applied to a simplified soccer game. Computer simulations are shown and a discussion is given.
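A tabular update for a vector-valued value function with a matrix discount can be sketched as below. The shapes and the greedy criterion (summing the value components to pick the next action) are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def vector_q_update(Q, s, a, rewards, s_next, Gamma, alpha=0.1):
    """Q: array of shape (n_states, n_actions, n_rewards); each state-action
    pair holds a vector of values, one component per reward function.
    rewards: vector of the several reward signals received at this step.
    Gamma: (n_rewards, n_rewards) discount matrix coupling the reward channels."""
    # illustrative greedy choice: pick the next action by the sum of value components
    a_next = int(np.argmax(Q[s_next].sum(axis=1)))
    target = rewards + Gamma @ Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])
    return Q
```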
Robot Soccer World Cup | 1998
Sho'ji Suzuki; Yasutake Takahashi; Eiji Uchibe; Masateru Nakamura; Chizuko Mishima; Hiroshi Ishizuka; Tatsunori Kato; Minoru Asada
The authors have applied reinforcement learning methods to real robot tasks in several respects. We selected soccer skills as tasks for a vision-based mobile robot. In this paper, we explain two of our methods: (1) learning a shooting behavior, and (2) learning to shoot while avoiding an opponent. These behaviors were obtained by a robot in simulation and tested in a real environment at RoboCup-97. We discuss current limitations and future work along with the results of RoboCup-97.
Robot Soccer World Cup | 2000
Eiji Uchibe; Minoru Asada
A vector-valued reward function is discussed in the context of multiple behavior coordination, especially in a dynamically changing multiagent environment. Unlike the traditional weighted sum of several reward functions, we define a vector-valued value function which evaluates the current action strategy by introducing a discount matrix to integrate several reward functions. Owing to this extension of the value function, the learning robot can appropriately estimate the multiple future rewards from the environment without suffering from the weighting problem. The proposed method is applied to a simplified soccer game. Computer simulations are shown and a discussion is given.