Network


Latest external collaboration at the country level.

Hotspot


Dive into the research topics where Kentarou Kurashige is active.

Publication


Featured research published by Kentarou Kurashige.


International Symposium on Micro-NanoMechatronics and Human Science | 2013

The proposal for deciding effective action using prediction of internal robot state based on internal state and action

Masashi Sugimoto; Kentarou Kurashige

For a robot working in a complicated environment, it is virtually impossible to predict every possible situation and to pre-program a suitable reaction pattern for each one, because the robot is required to act differently in different situations. Furthermore, a robot should adapt to different environments by deciding its course of action according to the situation, beyond pre-registered commands, in a manner similar to humans. However, hardware and limited computational resources impose a physical limitation: the robot needs time to decide its course of action. In this context, if a robot can select the most appropriate action quickly, it can cancel the time delay caused by these limitations. Moreover, depending on the action a robot takes, its future internal state can vary without bound, so the internal state and the action the robot will adopt must be predicted simultaneously. The purpose of this research is to compensate for the current action, making it appropriate, by using the actions the robot will take at the next time step and beyond. To achieve this, we first perform advance prediction using Online SVR as a learner; this predictor estimates the robot's next internal state and, applied repeatedly, the internal state in the more distant future. Second, we determine the future action with an optimal feedback controller driven by the predicted internal state, i.e., the robot's next action to be taken. In this paper, we design the controller using a linear quadratic regulator (LQR) and use it to determine the action. This paper presents the results of these studies and discusses methods that allow the robot to decide its desirable behavior quickly, using the predicted state. As an application example, we used a two-wheeled inverted pendulum and compared the results of the proposed method with the actual response of the inverted-posture control task.
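
The control half of this pipeline can be made concrete with a short sketch. The following is a minimal LQR example, assuming a linearized two-wheeled inverted pendulum model; the A, B, Q, and R matrices are illustrative placeholders, not the parameters used in the paper.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative linearized inverted-pendulum model (assumed, not from the paper):
# state x = [position, velocity, tilt angle, angular velocity].
A = np.array([[0.0, 1.0,  0.0, 0.0],
              [0.0, 0.0, -1.0, 0.0],
              [0.0, 0.0,  0.0, 1.0],
              [0.0, 0.0, 15.0, 0.0]])
B = np.array([[0.0], [1.0], [0.0], [-2.0]])
Q = np.diag([1.0, 1.0, 10.0, 1.0])  # state cost: penalize tilt most heavily
R = np.array([[0.1]])               # control-effort cost

# Solve the continuous-time algebraic Riccati equation and form the LQR gain.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

def control(predicted_state):
    """Optimal feedback u = -Kx, applied to the *predicted* internal state."""
    return -K @ predicted_state
```

Feeding the controller the predicted state rather than the measured one is what lets the method compensate for the decision-time delay.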


International Symposium on Micro-NanoMechatronics and Human Science | 2014

Real-time sequentially decision for optimal action using prediction of the state-action pair

Masashi Sugimoto; Kentarou Kurashige

We previously reported an approach that predicts the changes in the state and action of a robot. In this paper, we extend this approach and attempt to apply the action to be taken in the future to the current action. To achieve this, we apply the previously proposed prediction of the state-action pair, which predicts the robot's state and action in the distant future by using the predicted state repeatedly; in this way we obtain the actions the robot will take in the future. In addition, we consider that the state and the action of the robot change continuously. We therefore propose a method that predicts the state and the action every time the robot decides on an action, and by using this method we obtain a compensated current action. This paper presents the results of these studies and discusses methods that allow the robot to decide its desirable behavior quickly, using the predicted state combined with an optimal control method.
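
The core of the state-action-pair prediction is repeated one-step prediction fed back on itself. A minimal sketch, assuming a one-step predictor f already trained (for example with Online SVR); the function names are hypothetical:

```python
def predict_horizon(f, state, action, horizon):
    """Roll a one-step predictor forward to obtain future state-action pairs.

    f(state, action) -> (next_state, next_action) is an assumed interface;
    each prediction is fed back in as the input for the next step.
    """
    trajectory = []
    for _ in range(horizon):
        state, action = f(state, action)
        trajectory.append((state, action))
    return trajectory
```

Re-running this roll-out each time the robot decides on an action is what makes the decision sequential and real-time.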


Congress on Evolutionary Computation | 2015

The proposal for real-time sequential-decision for optimal action using flexible-weight coefficient based on the state-action pair

Masashi Sugimoto; Kentarou Kurashige

For a robot that works in a dynamic environment, the ability to autonomously cope with changes in the environment is important. In this paper, we propose an approach to predict the changes of the state and action of the robot. Further, to extend this approach, we attempt to apply the action to be taken in the future to the current action. This method predicts the robot's state and action for the distant future using the state that the robot adopts repeatedly. By using this method, we can predict the actions that the robot will take in the future. In addition, we consider that the state and the action of the robot change continuously and mutually. In this paper, we propose a method that predicts the state and the action each time the robot decides to perform an action. In particular, we focus on how to define the weight coefficients using the characteristics of the future prediction results. By using this method, we obtain the compensated current action. This paper presents the results of our study and discusses methods that allow the robot to decide its desirable behavior quickly, using state prediction and optimal control methods.
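
How the predicted future actions are folded back into the current action depends on the weight coefficients. A minimal sketch under an assumed exponential-decay weighting; the paper instead derives flexible coefficients from the characteristics of the prediction results:

```python
import numpy as np

def compensated_action(current_action, future_actions, decay=0.5):
    """Blend the current action with predicted future actions.

    current_action: 1-D action vector; future_actions: list of 1-D vectors.
    The decay rule below is an illustrative assumption, not the paper's.
    """
    actions = np.vstack([current_action] + list(future_actions))
    weights = decay ** np.arange(len(actions))  # nearer predictions weigh more
    weights /= weights.sum()                    # normalize to sum to 1
    return weights @ actions                    # weighted average action
```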


Computational Intelligence in Robotics and Automation | 2007

A simple rule how to make a reward for learning with human interaction

Kentarou Kurashige

Various learning methods have been adapted for experimental robots. We can produce movement in a robot by giving it teaching signals, but defining how to give those signals is generally a heavy burden on the operator, who must anticipate the task and environment and define a function accordingly. Here the author aims to create teaching signals automatically for each task and environment. This paper suggests a simple rule, independent of any information about the task or environment, for creating such signals: a situation that occurs often is a good situation. The author adopts reinforcement learning as the learning method and a small humanoid robot as the application, shows how a reward is created by applying the rule, and shows that the robot can learn and produce movement.
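
The rule itself, "a situation that occurs often is a good situation," is simple enough to sketch. A hedged example that rewards states in proportion to their visit frequency; the discretization and scaling are assumptions, not the paper's exact formulation:

```python
from collections import Counter

visit_counts = Counter()

def reward(state):
    """Self-generated reward: frequently visited situations score higher.

    The rounding-based discretization below is an illustrative assumption.
    """
    key = tuple(round(s, 1) for s in state)
    visit_counts[key] += 1
    total = sum(visit_counts.values())
    return visit_counts[key] / total  # relative visit frequency in (0, 1]
```

Because the rule uses only visit statistics, it needs no task- or environment-specific information, which is the point of the paper.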


International Conference on Intelligent Robotics and Applications | 2016

A Study on the Deciding an Action Based on the Future Probabilistic Distribution

Masashi Sugimoto; Kentarou Kurashige

When a robot operates in a real environment, its behavior becomes probabilistic because of slight transitions in the robot's state and errors in the action taken at each time step. We have previously reported the prediction of the state-action pair, a method that links the robot's state and action to its future state and action, and on that basis we have proposed a method that decides on the action the robot tends to take in the future. In this paper, we introduce a statistical approach into the prediction of the state-action pair: the proposed method calculates the existence probability of the future state and action according to a normal distribution and uses the action the robot tends to take in the future to decide the current action.
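
A minimal sketch of the statistical step, assuming the predicted future state-action samples are summarized by per-dimension independent Gaussians (an assumption for illustration; the abstract only says a normal distribution is used):

```python
import numpy as np
from scipy.stats import norm

def existence_probability(candidate, predicted_samples):
    """Density of a candidate state-action vector under a Gaussian fit.

    predicted_samples: array of shape (n_samples, n_dims) of predictions.
    Independence across dimensions is an illustrative assumption.
    """
    samples = np.asarray(predicted_samples, dtype=float)
    mu = samples.mean(axis=0)
    sigma = samples.std(axis=0) + 1e-9  # avoid zero scale
    return float(np.prod(norm.pdf(candidate, loc=mu, scale=sigma)))
```

Candidates with higher existence probability correspond to the actions the robot tends to take and are favored when deciding the current action.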


2014 IEEE Symposium on Robotic Intelligence in Informationally Structured Space (RiiSS) | 2014

Self-generation of reward in reinforcement learning by universal rules of interaction with the external environment

Kentarou Kurashige; Kaoru Nikaido

Various studies related to machine learning have been performed. In this study, we focus on reinforcement learning, one of the methods used in machine learning. In conventional reinforcement learning, designing the reward function is difficult: it is a complex and laborious task that requires expert knowledge. In previous studies, the robot learned from external sources, not autonomously. To solve this problem, we propose a method of robot learning through interactions with humans using sensor input, in which the reward is also generated through those interactions. The method does not require additional tasks to be performed by the human, so the user needs no expert knowledge and anyone can teach the robot. Our experiment confirmed that robot learning is possible with the proposed method.
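
The abstract does not spell out the reward rule, so the following is only a heavily hedged sketch of the idea of generating reward from interaction sensor input rather than from a hand-designed function; the mapping is purely hypothetical:

```python
def interaction_reward(touch_sensor_values):
    """Turn raw interaction sensor readings into a scalar reward.

    Treating stronger human contact as more positive is an assumed rule
    for illustration only; the paper defines its own universal rules.
    """
    if not touch_sensor_values:
        return 0.0
    return sum(touch_sensor_values) / len(touch_sensor_values)
```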


2013 IEEE Workshop on Robotic Intelligence in Informationally Structured Space (RiiSS) | 2013

Proposal of learning method which selects objectives based on the state

Hironori Miura; Kentarou Kurashige

Reinforcement learning (RL) is one of the methods for robot action learning. RL is formulated as the maximization of a single reward; however, in most practical problems, multiple objectives need to be considered, so multi-objective optimization is necessary. We focus on the required objectives, which depend on the state of the robot, and propose a multi-objective optimization over them. If there is more than one required objective, multi-objective optimization is performed based on the priority of each objective. In this paper, we give two objectives to a robot and perform simulation experiments, and we demonstrate the validity of the proposed system using the simulation results.
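
A minimal sketch of selecting the required objectives from the robot's state and combining them by priority; the objective functions, predicates, and weights are illustrative placeholders, not the paper's definitions:

```python
def combined_reward(state, objectives, is_required, priority):
    """Priority-weighted reward over the objectives required in this state.

    objectives:  dict name -> reward function of the state
    is_required: dict name -> predicate deciding if the objective is active
    priority:    dict name -> weight used when several objectives are active
    """
    active = [name for name in objectives if is_required[name](state)]
    if not active:
        return 0.0
    total = sum(priority[name] for name in active)
    return sum(priority[name] * objectives[name](state) for name in active) / total
```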


International Symposium on Micro-NanoMechatronics and Human Science | 2012

Reduction of state space on reinforcement learning by sensor selection

Yasutaka Kishima; Kentarou Kurashige

Much research has been conducted on applying reinforcement learning to robots, where learning time is a major concern. In reinforcement learning, information from sensors is projected onto a state space; a robot learns the correspondence between each state and action in the state space and determines the best correspondence. When the state space expands with the number of sensors, the number of correspondences the robot must learn increases, and finding the best one becomes time consuming. In this study, we focus on the importance of each sensor for a particular task. The sensors relevant to a task differ from task to task, and a robot does not need to use all of its installed sensors to perform a given task; the state space should consist only of the sensors essential to it. Using such a state space, a robot can learn correspondences faster than with a state space built from all installed sensors. We therefore propose a relatively fast learning system in which a robot autonomously selects the sensors essential to a task and constructs a state space from those sensors alone. We define the importance of a sensor for a task as the correlation coefficient between the sensor's value and the reward in reinforcement learning. The robot judges the importance of its sensors from this correlation, the state space is reduced accordingly, and the robot can efficiently learn correspondences in the reduced space. We confirm the effectiveness of our proposed system through a simulation.
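
The selection rule is concrete: a sensor's importance is the correlation between its readings and the reward. A minimal sketch, where the threshold is an illustrative assumption:

```python
import numpy as np

def select_important_sensors(sensor_log, reward_log, threshold=0.3):
    """Return indices of sensors whose values correlate with the reward.

    sensor_log: array of shape (timesteps, n_sensors)
    reward_log: array of shape (timesteps,)
    """
    sensor_log = np.asarray(sensor_log, dtype=float)
    reward_log = np.asarray(reward_log, dtype=float)
    keep = []
    for i in range(sensor_log.shape[1]):
        r = np.corrcoef(sensor_log[:, i], reward_log)[0, 1]
        if not np.isnan(r) and abs(r) >= threshold:  # constant sensors give NaN
            keep.append(i)
    return keep
```

The state space is then built only from the kept sensors, which shrinks the number of state-action correspondences the robot must learn.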


International Journal of Artificial Life Research | 2017

A Study of Predicting Ability in State-Action Pair Prediction: Adaptability to an Almost-Periodic Disturbance

Kentarou Kurashige; Masashi Sugimoto; Naoya Iwamoto; Robert W. Johnston; Keizo Kanazawa; Yukinori Misaki

When a robot makes an action decision based on a future prediction, it must know the properties of the disturbance signals coming from the outside environment. These properties cannot be described simply: a disturbance may be a non-periodic function, a nonlinear time-varying function, or an almost-periodic function. In robot control, the sampling rate affects how a disturbance signal is described, in particular its apparent frequency and amplitude; if the sampling rate used to acquire a disturbance signal is wrong, the action taken will be far from appropriate for the disturbance's actual properties. In general, future prediction using machine learning is based on tendencies obtained through past training or learning, and an optimal action is determined uniquely from the properties of the disturbance. However, the learning time increases in proportion to the amount of training data, and in the worst case no usable tendency can be found by prediction. In this paper, we focus on prediction for almost-periodic disturbances and consider situations where such disturbance signals occur. From this perspective, we propose a method that identifies the frequency of an almost-periodic function from the frequency of the disturbance using the Fourier transform, nearest-neighbor one-step-ahead forecasts, and the Nyquist-Shannon sampling theorem.
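
The frequency-identification step can be sketched with an FFT peak search; by the Nyquist-Shannon theorem the sampling rate must exceed twice the highest disturbance frequency of interest. The example signal is illustrative:

```python
import numpy as np

def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the strongest spectral peak."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))  # drop the DC term
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

# Example: a noisy almost-periodic disturbance near 3 Hz, sampled at 100 Hz
# (comfortably above the 6 Hz minimum required by the sampling theorem).
t = np.arange(0.0, 10.0, 0.01)
disturbance = np.sin(2 * np.pi * 3.0 * t) + 0.1 * np.random.randn(len(t))
print(dominant_frequency(disturbance, fs=100.0))  # approximately 3.0
```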


International Conference on Intelligent Robotics and Applications | 2016

Action Learning to Single Robot Using MARL with Repeated Consultation: Realization of Repeated Consultation Interruption for the Adaptation to Environmental Change

Yoh Takada; Kentarou Kurashige

We have proposed multi-agent reinforcement learning with repeated consultation (MARLRC), a multi-agent reinforcement learning method in which agents can select a concerted action. In MARLRC, agents select a virtual action and share it with the other agents several times within one decision by the robot. In this study, we focused on the problem that MARLRC does not take into account environments in which time constraints may apply to decision-making. As an approach to this problem, we considered determining the amount of time that can be used for decision-making and making the decision within that predetermined time. We introduce a method for MARLRC to reach a decision within the predetermined time by interrupting the repeated consultation.
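
Interrupting the consultation under a time budget amounts to an anytime loop. A minimal sketch; the agent interface (initial_action, consult) is hypothetical, standing in for MARLRC's actual virtual-action exchange:

```python
import time

def decide_with_deadline(agents, state, budget_sec):
    """Repeat consultation rounds until the time budget runs out.

    Each round, every agent revises its virtual action given the others';
    whatever concerted choice exists at the deadline is committed.
    """
    actions = [agent.initial_action(state) for agent in agents]
    deadline = time.monotonic() + budget_sec
    while time.monotonic() < deadline:
        actions = [agent.consult(state, actions) for agent in agents]
    return actions
```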

Collaboration


Dive into Kentarou Kurashige's collaboration.

Top Co-Authors

Masashi Sugimoto, Muroran Institute of Technology
Kaoru Nikaido, Muroran Institute of Technology
Takuya Masaki, Muroran Institute of Technology
Rainer Knauf, Technische Universität Ilmenau
Yasutaka Kishima, Muroran Institute of Technology
Eriko Sakurai, Bunri University of Hospitality
Yoshiki Miyazaki, Muroran Institute of Technology