Sadayoshi Mikami | Researchain

Publication

Featured research published by Sadayoshi Mikami.

world congress on computational intelligence | 1994

Genetic reinforcement learning for cooperative traffic signal control

Sadayoshi Mikami; Yukinori Kakazu

Optimization of a group of traffic signals over an area is a large, multi-agent-type real-time planning problem without a precise reference model being given. To do this planning, each signal should learn not only to acquire its control plans individually through reinforcement learning, but also to cooperate with other signals. These two objectives (distributed learning of agents and cooperation among agents) conflict with each other, and a method that blends these two objectives together is required. In the method proposed in this paper, these two objectives correspond to localized reinforcement learning and global combinatorial optimization, respectively, and the method thus achieves cooperation in the long term without interfering with each agent's autonomy. The outline of the idea is as follows: each agent performs reinforcement learning and reports its cumulative performance evaluation, and combinatorial optimization is simultaneously carried out to find appropriate parameters for long-term learning that maximize the total profit of the signals (agents).

intelligent vehicles symposium | 1993

Self-organized Control Of Traffic Signals Through Genetic Reinforcement Learning

Sadayoshi Mikami; Yukinori Kakazu

This paper introduces a learning approach for traffic signal control. Reinforcement learning is intended to optimize the traffic flow around crossroads, while genetic algorithms are intended to introduce a global optimization criterion to each of the local learning processes. It is shown that the combination of reinforcement learning and the genetic algorithm exhibits good performance for dense traffic conditions.

intelligent autonomous systems | 1999

Characterization of biological internal dynamics by the synchronization of coupled chaotic system

Akihiro Yamaguchi; Hirotaka Watanabe; Sadayoshi Mikami; Mitsuo Wada

We investigate a method for characterizing the biological internal dynamics that underlie a biological signal. From the measured time series of the pulsation of a human finger's capillary vessels, we extract the internal dynamics in the form of a vector field by using an embedding method. In order to characterize and recognize such dynamic structure, we construct an artificial dynamics of the measured time series in a digital computer. We then characterize the similarity of internal dynamics by the synchronized response of the artificial dynamics to the target internal dynamics. In this paper, we show the construction method of an artificial dynamics of the measured time series and introduce three types of coupled system between the original dynamics and the artificial one. The effect of coupling is analyzed by cross-correlation.
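
The abstract does not say which embedding method is used. As a minimal sketch, assuming the common time-delay embedding, a scalar series can be turned into state vectors as below. The function name `delay_embed`, the parameters `dim` and `lag`, and the sample values are illustrative assumptions, not taken from the paper.

```python
def delay_embed(series, dim, lag):
    """Build delay vectors (x[t], x[t+lag], ..., x[t+(dim-1)*lag]).

    Assumption: plain time-delay embedding; the paper's exact
    embedding procedure is not given in the abstract.
    """
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    n_vectors = len(series) - (dim - 1) * lag
    return [
        tuple(series[t + k * lag] for k in range(dim))
        for t in range(max(n_vectors, 0))
    ]


# Hypothetical pulsation samples, for illustration only.
pulse = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0, 1.0]
vectors = delay_embed(pulse, dim=3, lag=2)
```

Each tuple in `vectors` is one reconstructed state; a vector field can then be estimated from the differences between consecutive states.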

ieee international conference on evolutionary computation | 1996

Combining reinforcement learning with GA to find co-ordinated control rules for multi-agent system

Sadayoshi Mikami; Mitsuo Wada; Yukinori Kakazu

In a multi-agent application, it is necessary to find co-ordinated control rules that maximise a global objective function. To establish co-ordination, real-time synchronous communication is normally assumed. However, in many practical applications communication is limited to asynchronous and highly delayed methods. This paper proposes a method to search for co-ordinated plans under limited information exchange. Our approach is to combine on-line local optimisation by reinforcement learning (RL) with asynchronous global combinatorial optimisation by genetic algorithms (GA). The GA search modifies RL's search direction to find a co-ordinated plan, whereas RL tries to obtain that plan in real time. Information about which direction RL should search is passed through long-term (not instantaneous) communication. The direction is given by a state compression mapping. This is therefore a Lamarckian-type GA that inherits acquired knowledge from RL. The performance of the proposed method is shown using a seesaw balancing problem as a test bed.

european conference on machine learning | 1995

Co-operative Reinforcement Learning By Payoff Filters (Extended Abstract)

Sadayoshi Mikami; Yukinori Kakazu; Terence C. Fogarty

This paper proposes an extension of Reinforcement Learning (RL) to acquire co-operation among agents. The idea is to learn a filtered payoff that reflects a global objective function but does not require mass communication among agents. It is shown that the acquisition of two typical co-operation tasks is realised by preparing simple filter functions: an averaging filter for co-operative tasks and an enhancement filter for deadlock prevention tasks. The performance of these systems was tested through computer simulations of the n-person prisoner's dilemma and a traffic control problem.
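
The abstract names an averaging filter but does not define it. One plausible reading, sketched here under that assumption, is that each agent learns from the mean of its own payoff and the payoffs reported by a few neighbours. The function name `averaging_filter` and the payoff values are hypothetical.

```python
def averaging_filter(own_payoff, neighbour_payoffs):
    """Return the mean of an agent's payoff and its neighbours' payoffs.

    Assumption: an unweighted mean over a small neighbourhood; the
    paper's actual filter function may differ.
    """
    values = [own_payoff] + list(neighbour_payoffs)
    return sum(values) / len(values)


# A defector earns 5.0 while two co-operating neighbours earn 0.0 and 1.0.
raw_payoff = 5.0
filtered_payoff = averaging_filter(raw_payoff, [0.0, 1.0])
```

In this illustration the filtered payoff (2.0) is lower than the raw payoff (5.0), so an action that hurts neighbours is reinforced less strongly than it would be by the raw payoff alone.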

artificial intelligence and the simulation of behaviour | 1995

Broadcast Based Fitness Sharing GA for Conflict Resolution Among Autonomous Robots

Sadayoshi Mikami; Yukinori Kakazu; Terence C. Fogarty

This paper proposes a distributed GA with which autonomous agents learn to achieve co-operative action. Our objective is to develop a learning system that would make real-world heterogeneous agents feasible with the minimum amount of communication hardware. With such real-world agents, two constraints make it difficult to estimate the global payoff. First, the communication bandwidth between the agents is small, which prohibits the gathering of fitness values from all the agents. Second, local fitness values are always evaluated a long time after a conflict between agents has taken place, so some agents may be far away by then and no longer able to exchange local payoffs in order to calculate the estimated global payoff. To overcome these difficulties, we have developed a polarity-based broadcast fitness sharing method for physically distributed populations. Instead of waiting for an exact local payoff, an estimated local payoff is exchanged whenever a conflict takes place. We found that a specific filter function gives a good estimate of global fitness values in conflict resolution tasks. Our results from simulations of a bump-avoidance task for multiple mobile robots show that the method yields a notable performance improvement.

12th International Symposium on Automation and Robotics in Construction | 1995

Adaptive gait acquisition using multi-agent learning for wall climbing robots

Larry Bull; Terence C. Fogarty; Sadayoshi Mikami; James G Thomas

In this paper we present work in progress to examine the use of two machine learning techniques to determine the gait of a wall climbing robot. We describe the use of the genetic algorithm and then that of the reinforcement learning technique Q-learning, within a multiple-agent framework, for this task. We assert that there is one agent responsible for the control of each leg of the robot, where each agent is represented by a rule-based controller. It is shown that it is possible to use these techniques to control the gait of the basic robot.
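
For readers unfamiliar with Q-learning, the standard tabular update rule is sketched below. The state and action names (`leg_down`, `lift`, and so on), the reward values, and the learning parameters are illustrative assumptions; the paper's rule-based controllers are not reproduced here.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply one standard tabular Q-learning update and return the new value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Hypothetical single-leg agent with two actions.
q = defaultdict(float)
actions = ["lift", "hold"]
first = q_update(q, "leg_down", "lift", 1.0, "leg_up", actions)
second = q_update(q, "leg_up", "hold", 0.0, "leg_down", actions)
```

In a multiple-agent setting such as the one described, each leg's agent would keep its own table `q` and apply this update independently.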

robot and human interactive communication | 1997

Extraction of a dynamic structure of biological signals as stochastic automata for a human-machine interface

Akihiro Yamaguchi; H. Watanabe; Sadayoshi Mikami; Mitsuo Wada

Concerning a human-machine interface, we investigate a method to extract the dynamical structure from biological signals. Applying the ε-machine reconstruction technique to the measured time series of the pulsation of a human finger's capillary vessels, we extract the macroscopic structure of its dynamics using a stochastic automaton. In terms of stochastic automata, we evaluate the graph complexity and the identification performance of the constructed automaton. To measure the pulsation of a capillary vessel, we use a reflection type pick-up consisting of an infrared sensor unit and an infrared emitter.

systems man and cybernetics | 1999

Identification of biological internal dynamics using a structure of unstable periodic orbits

Akihiro Yamaguchi; Sadayoshi Mikami; Mitsuo Wada

We identify the pulsation dynamics of blood flow by comparing them with previously extracted periodic orbits. We show an algorithm for the extraction of periodic orbits from observed dynamics. The identification of the dynamics was performed by calculating the distance between the trajectory of the target dynamics and the extracted periodic orbits. We evaluated its performance by the ratio of time during which the target dynamics are closest to each periodic orbit.
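
The identification step can be sketched as follows, assuming Euclidean distance in the embedded state space and orbits stored as lists of points. Neither choice is stated in the abstract, and the names `time_ratios`, `orbits`, and `trajectory` are hypothetical.

```python
import math


def time_ratios(trajectory, orbits):
    """Return, per orbit, the fraction of trajectory points nearest to it.

    Assumption: the distance from a point to an orbit is the minimum
    Euclidean distance to any sampled point on that orbit.
    """
    if not trajectory:
        raise ValueError("trajectory must contain at least one point")
    counts = {name: 0 for name in orbits}
    for point in trajectory:
        nearest = min(
            orbits,
            key=lambda name: min(math.dist(point, q) for q in orbits[name]),
        )
        counts[nearest] += 1
    return {name: count / len(trajectory) for name, count in counts.items()}


# Two made-up periodic orbits and a short target trajectory.
orbits = {"A": [(0.0, 0.0), (1.0, 0.0)], "B": [(5.0, 5.0), (6.0, 5.0)]}
trajectory = [(0.1, 0.0), (0.9, 0.2), (5.2, 5.1), (0.5, 0.5)]
ratios = time_ratios(trajectory, orbits)
```

With uniformly sampled trajectories, the fraction of points nearest to an orbit approximates the fraction of time spent near it.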

robot and human interactive communication | 1996

A realization of socially adaptive robots by competitive reinforcement learning

T. Nakayama; Sadayoshi Mikami; Mitsuo Wada

This paper proposes an extension of reinforcement learning that lets each robot learn a conflict-free strategy and that avoids the state explosion problem. The key idea is to divide the state-action learner in a robot into a set of discrete learning units and let them compete with each other so that task differentiation is easily achieved. In the proposed architecture, a robot decides on an action by choosing an internal learner. The criterion for selecting an internal learner is the utility vector. We applied this architecture to computer simulations of a seesaw balancing problem and let the robots adjust the utility vector to differentiate their behavior from each other.
