Takashi Komeda | Researchain

Publication

Featured research published by Takashi Komeda.

In-Tech, ISBN 978-8-902613-55-4 | 2009

Solving POMDPs with Automatic Discovery of Subgoals

Le Tien Dung; Takashi Komeda; Motoki Takagi

Reinforcement Learning (RL) is the problem faced by an agent that must learn behavior through trial-and-error interactions with a dynamic environment (Kaelbling et al., 1996). At any time step, the environment is assumed to be in one state. In Markov Decision Processes (MDPs), all states are fully observable, so the agent can choose a good action based only on the current sensory observation. In Partially Observable Markov Decision Processes (POMDPs), any state can be a hidden state in which the agent doesn’t have sufficient sensory observation, and the agent must remember past sensations to select a good action. Q-learning is the most popular algorithm for learning from delayed reinforcement in MDPs, and RL with a Recurrent Neural Network (RNN) can solve deep POMDPs. Several methods have been proposed to speed up learning in MDPs by creating useful subgoals (Girgin et al., 2006), (McGovern & Barto, 2001), (Menache et al., 2002), (Simsek & Barto, 2005). Subgoals are states that have a high reward gradient, that are visited frequently on successful trajectories but not on unsuccessful ones, or that lie between densely connected regions of the state space. In MDPs, to attain a subgoal, we can use a plain table-based policy, named a skill. These useful skills are then treated as options or macro actions in RL (Barto & Mahadevan, 2003), (McGovern & Barto, 2001), (Menache et al., 2002), (Girgin et al., 2006), (Simsek & Barto, 2005), (Sutton et al., 1999). For example, an option named “going to the door” helps a robot to move from any random position in the hall to one of two doors. However, it is difficult to apply this approach directly to RL when an RNN is used to predict Q values. Simply adding one more unit to the output layer to predict Q values for an option doesn’t work, because updating any connection’s weight will affect all previous Q values and because it is easy to lose the Q values when the option can’t be executed for a long time.
In this chapter, a method named Reinforcement Learning using Automatic Discovery of Subgoals is presented that follows this approach, but in POMDPs. We can reuse existing algorithms to discover subgoals. To obtain a skill, a new policy using an RNN is trained by experience replay. Once useful skills are obtained by RNNs, these learned RNNs are integrated into the main RNN as experts in RL. Experimental results on two problems, the E maze problem and the virtual office problem, show that the proposed method enables an agent to acquire a policy as good as the policy acquired by RL with an RNN, with better learning performance.
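
One of the subgoal criteria mentioned in the abstract, states visited frequently on successful trajectories but not on unsuccessful ones, can be illustrated with a minimal Python sketch. This is not the chapter's algorithm: the scoring rule (difference in visit fractions), the function names `subgoal_scores` and `pick_subgoals`, the `threshold` value, and the toy state names are all assumptions made for illustration.

```python
from collections import Counter


def subgoal_scores(successful, unsuccessful):
    """Score each state seen on a successful trajectory.

    Score = (fraction of successful trajectories visiting the state)
          - (fraction of unsuccessful trajectories visiting the state).
    This simple difference is an illustrative assumption, not the
    method used in the chapter.
    """
    def visit_fraction(trajectories):
        counts = Counter()
        for trajectory in trajectories:
            # Count each state at most once per trajectory.
            counts.update(set(trajectory))
        total = max(len(trajectories), 1)
        return {state: n / total for state, n in counts.items()}

    pos = visit_fraction(successful)
    neg = visit_fraction(unsuccessful)
    return {state: pos[state] - neg.get(state, 0.0) for state in pos}


def pick_subgoals(successful, unsuccessful, exclude=(), threshold=0.75):
    """Return states whose score reaches the (hypothetical) threshold."""
    scores = subgoal_scores(successful, unsuccessful)
    return sorted(
        state
        for state, score in scores.items()
        if score >= threshold and state not in exclude
    )


# Toy trajectories with made-up state names.
successful = [
    ["hall", "door", "office", "goal"],
    ["hall", "corner", "door", "goal"],
]
unsuccessful = [
    ["hall", "corner", "hall"],
    ["hall", "corner", "wall"],
]

# The final goal itself is excluded so that only intermediate states remain.
subgoals = pick_subgoals(successful, unsuccessful, exclude={"goal"})
print(subgoals)
```

In this toy data, "hall" appears on every trajectory and therefore scores zero, while "door" appears only on successful ones, which matches the “going to the door” option described in the abstract.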

WIT Transactions on Biomedicine and Health | 2009

Development of a training system for interventional radiology

Masaru Ide; Y. Fujii; B. Fujioka; Takashi Komeda; Hiroyuki Koyama; Shin-ichiro Yamamoto; Makoto Mohri; P. Beomonte Zobel

The objective of the study reported here was to develop a master-slave system for catheter-guided vascular surgery conducted by interventional radiology. By using a master-slave system, the surgeon is not exposed to X-rays during the operation because the master tool managed by the operator is located away from the slave tool, which is near the patient. The system must provide vivid realism to the surgeon, particularly with regard to force information, because this surgery is performed in three dimensions while the surgeon watches a two-dimensional monitor. In this study, we developed a training system for a catheter guide in order to upgrade the surgeon’s skills because it is difficult to upgrade a master-slave system without training. The system consists of a human interface device as the master tool, a control box, and a simulator. This training simulator is for the master-slave system that we developed. The master tool has a force display function using an electrorheological fluid. Two advantages of the fluid actuator are that it can be used without force feedback control and that it is mechanically safe, as the surgeon does not experience any accidental force. Open-loop control is used to achieve a simple mechanism and algorithm. Preliminary experimental results indicated that the output force achieved correlated with that sent from the PC. Three surgeons evaluated this training system under a variety of conditions. The operation of the master tool is simple. The thrust and rotation movements of the catheter can be handled instinctively and without complicated instructions. In addition, accurate force display, response, and stability were achieved with the electrorheological fluid. In the future, the training will need a realistic depiction of interventional radiology, and the system will need to provide accurate readings for aspiration and blood flow.

The Proceedings of the JSME Symposium on Welfare Engineering | 2006

3D1-02 Modeling and controlling a gait training system utilizing a biarticular muscle model

Mikael Roos; Shinichiro Yamamoto; Erik Fransén; Örjan Ekeberg; Takashi Komeda; Tasuku Miyoshi

For the rehabilitation of people unable to perform a normal gait pattern, a pneumatically operated gait training system with a biarticular muscle model utilizing rubbertuators has been developed. The pneumatics and the biarticular characteristics make the system difficult to control. In this paper, machine learning techniques have been used in an attempt to design a control system for the pneumatic gait training system. Preliminary results indicate that inverse plant modeling using Artificial Neural Networks (ANN) might be a successful approach.

Advanced Robotics | 1986

Development of a four-wheeled mobile robot system for bedridden patients

Hiroyasu Funakubo; Tsuneshi Isomura; Takashi Komeda; Yukio Inuzuka

In Japan the number of severely crippled and bedridden patients who have to be completely taken care of with regard to daily routine is presently increasing for several reasons, and in addition the percentage of old-aged people in the population is growing. Providing care for these persons is becoming an extremely serious social and economic problem. With the aim of reducing the difficulties faced by handicapped people and those responsible for their care and nursing, we have undertaken research on the development of a four-wheeled mobile robot system to provide assistance in the daily activities of the bedridden and handicapped. The characteristics of this system are (1) front-wheel power steering and independently directly driven rear wheels, (2) one pair of manipulators which have nine degrees of freedom mounted on the four-wheeled mobile device, and (3) a hierarchical control system with one 16-bit and several 8-bit microcomputers. We utilize a teaching-playback method to develop control programs for s...

Archive | 1988