Publication


Featured research published by Ruoying Sun.


IEEE Region 10 Conference | 2002

Q-MAP: a novel multicast routing method in wireless ad hoc networks with multiagent reinforcement learning

Ruoying Sun; Shoji Tatsumi; Gang Zhao

Multicast plays an important role in ad hoc networks, and multicast algorithms aim to direct traffic from sources to receivers while maximizing some measure of network performance, combining the processes of routing and resource reservation. This paper discusses current literature on multicast routing in mobile ad hoc networks. Further, by investigating swarm-based routing methods and applications of multiagent reinforcement learning, it analyses the possibility and merit of adopting reinforcement learning in a multicast routing protocol for wireless ad hoc networks. Based on the above, the paper presents a novel multicast routing method, the Q-MAP algorithm, which ensures reliable resource reservation in wireless mobile ad hoc networks. The features and efficiency of the Q-MAP multicast routing method are also illustrated.
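
As a rough illustration of Q-learning-based next-hop selection for multicast routing (the abstract does not give the Q-MAP update rule itself), the sketch below keeps a per-(receiver, neighbor) Q table at each node and backs it up from a downstream neighbor's estimate. The class name, the reward convention and the parameters are hypothetical, not taken from the paper.

```python
import random
from collections import defaultdict

class QRoutingTable:
    """Hypothetical per-node table of Q(receiver, next_hop) values, in the
    spirit of Q-learning-based routing (not the actual Q-MAP rule)."""

    def __init__(self, neighbors, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.neighbors = list(neighbors)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)            # (receiver, next_hop) -> value

    def choose_next_hop(self, receiver):
        # Epsilon-greedy choice among neighbors for the given multicast receiver.
        if random.random() < self.epsilon:
            return random.choice(self.neighbors)
        return max(self.neighbors, key=lambda n: self.q[(receiver, n)])

    def update(self, receiver, next_hop, reward, neighbor_best_q):
        # Standard one-step Q-learning backup using the neighbor's best estimate
        # for the same receiver; the reward could reflect a successful resource
        # reservation on that link (an assumption for illustration).
        old = self.q[(receiver, next_hop)]
        target = reward + self.gamma * neighbor_best_q
        self.q[(receiver, next_hop)] = old + self.alpha * (target - old)

# Minimal usage: a node with three neighbors routing toward receiver "R1".
table = QRoutingTable(neighbors=["A", "B", "C"])
hop = table.choose_next_hop("R1")
table.update("R1", hop, reward=1.0, neighbor_best_q=0.0)
print(hop, table.q[("R1", hop)])
```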


Systems, Man and Cybernetics | 2001

Multiagent reinforcement learning method with an improved ant colony system

Ruoying Sun; Shoji Tatsumi; Gang Zhao

Multiagent reinforcement learning has gained increasing attention in recent years. The authors discuss coordination means for sharing episodes and sharing policies in the field of multiagent reinforcement learning. From the point of view of reinforcement learning, we analyse the performance of indirect-media communication among multiple agents in the ant colony system (ACS), an efficient method that uses pheromones to solve optimization problems. Based on the above, we propose the Q-ACS method, which modifies the global updating rule in ACS so that learning agents share better episodes, benefiting from the exploitation of accumulated knowledge. Meanwhile, taking visit counts into account, we propose the T-ACS method, which presents a state-transition policy so that learning agents share better policies, benefiting from biased exploration. To demonstrate the coordination performance of the learning agents in our methods, we conducted experiments on an optimization problem, the traveling salesman problem. Comparisons among ACS, Q-ACS and T-ACS show that the improved methods are efficient for solving the optimization problem.
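
For reference, the standard ACS global pheromone updating rule that Q-ACS is described as modifying looks roughly like the following; the tiny TSP instance and the bookkeeping around it are illustrative assumptions, and the Q-ACS/T-ACS modifications themselves are not reproduced.

```python
import itertools

def tour_length(tour, dist):
    """Length of a closed tour over city indices."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

def acs_global_update(pheromone, best_tour, best_length, rho=0.1):
    """Standard ACS global updating rule: only edges of the best tour so far
    are reinforced, tau <- (1 - rho) * tau + rho * (1 / L_best)."""
    deposit = 1.0 / best_length
    for a, b in zip(best_tour, best_tour[1:] + best_tour[:1]):
        pheromone[a][b] = (1.0 - rho) * pheromone[a][b] + rho * deposit
        pheromone[b][a] = pheromone[a][b]        # symmetric TSP
    return pheromone

# Tiny 4-city example with made-up distances; the best tour is found by
# brute force just to have something to reinforce.
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 8],
        [10, 4, 8, 0]]
n = len(dist)
pheromone = [[1.0] * n for _ in range(n)]
best = min(itertools.permutations(range(n)),
           key=lambda t: tour_length(list(t), dist))
pheromone = acs_global_update(pheromone, list(best),
                              tour_length(list(best), dist))
print(best, pheromone[best[0]][best[1]])
```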


Systems, Man and Cybernetics | 2002

Application of multiagent reinforcement learning to multicast routing in wireless ad hoc networks ensuring resource reservation

Ruoying Sun; Shoji Tatsumi; Gang Zhao

Mobile Ad hoc Networks (MANETs) are self-organized wireless networks in which each mobile node acts as a router and forwards packets on behalf of other nodes. Multicast routing is becoming an important networking service in MANETs, whose objective is to find optimal routes from a source node to all multicast destinations and to use network resources effectively. This paper investigates the possibility and merit of applying Reinforcement Learning (RL) to multicast routing in MANETs. Taking advantage of multiagent RL, it proposes a novel multicast routing algorithm, the Q-MAP method, which ensures resource allocation and bounded delay in mobile ad hoc wireless networks. Further, the paper analyses the convergence and rationality of the Q-MAP method from the point of view of RL, and verifies the efficacy of the proposed method through simulations of route creation.
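
As a minimal sketch of the kind of route-creation simulation the abstract mentions, the function below greedily follows learned next-hop preferences toward a receiver and accepts the route only if its cumulative delay stays within a bound. The topology, the preference table and the acceptance test are illustrative assumptions, not the actual Q-MAP join procedure.

```python
def create_route(graph, prefs, source, receiver, delay_bound):
    """Greedy route creation toward a single multicast receiver: at each node
    follow the neighbor with the highest learned preference for that receiver,
    and accept the route only if the cumulative delay stays within the bound."""
    node, route, delay, visited = source, [source], 0.0, {source}
    while node != receiver:
        candidates = [n for n in graph[node] if n not in visited]
        if not candidates:
            return None                      # dead end: route creation fails
        node = max(candidates, key=lambda n: prefs.get((n, receiver), 0.0))
        delay += graph[route[-1]][node]
        if delay > delay_bound:
            return None                      # violates the delay bound
        route.append(node)
        visited.add(node)
    return route, delay

# Small example topology: link delays in milliseconds (made-up values).
graph = {
    "S": {"A": 5, "B": 3},
    "A": {"S": 5, "R": 4},
    "B": {"S": 3, "R": 9},
    "R": {"A": 4, "B": 9},
}
prefs = {("A", "R"): 0.8, ("B", "R"): 0.4}   # learned next-hop preferences
print(create_route(graph, prefs, "S", "R", delay_bound=10.0))
```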


Systems, Man and Cybernetics | 2000

Convergence of the Q-ae learning under deterministic MDPs and its efficiency under the stochastic environment

Gang Zhao; Ruoying Sun; Shoji Tatsumi

Reinforcement learning (RL) is an efficient method for solving Markov decision processes (MDPs) without any a priori knowledge about the environment. Q-learning is a representative RL method. Although it is guaranteed to derive the optimal policy, Q-learning needs numerous trials to learn it. Using a feature of the Q values, this paper presents an accelerated RL method, Q-ae learning. Further, utilizing the dynamic programming principle, the paper proves that Q-ae learning converges to the optimal policy under deterministic MDPs. Analytical and simulation results illustrate the efficiency of Q-ae learning under both deterministic and stochastic MDPs.
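
For background, a plain tabular Q-learning backup under a deterministic MDP is sketched below; determinism is what allows a learning rate of 1, so the update reduces to Q(s,a) = r + gamma * max_a' Q(s',a'). The corridor environment is a made-up stand-in, and the Q-ae acceleration itself is not reproduced.

```python
import random

def q_learning_deterministic(transition, reward, states, actions,
                             episodes=200, gamma=0.9, epsilon=0.1, goal=None):
    """Plain tabular Q-learning for a deterministic MDP, with learning rate 1:
    Q(s,a) = r + gamma * max_a' Q(s',a')."""
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        s = random.choice(states)
        for _ in range(50):                          # cap episode length
            if s == goal:
                break
            a = (random.choice(actions) if random.random() < epsilon
                 else max(actions, key=lambda a_: Q[(s, a_)]))
            s2 = transition(s, a)
            Q[(s, a)] = reward(s, a, s2) + gamma * max(Q[(s2, b)] for b in actions)
            s = s2
    return Q

# Illustrative 1-D corridor: states 0..4, goal at 4, actions move left/right.
states, actions, goal = list(range(5)), ["left", "right"], 4
step = lambda s, a: min(4, s + 1) if a == "right" else max(0, s - 1)
r = lambda s, a, s2: 1.0 if s2 == goal else 0.0
Q = q_learning_deterministic(step, r, states, actions, goal=goal)
print(max(actions, key=lambda a: Q[(3, a)]))         # expected: "right"
```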


Systems, Man and Cybernetics | 1999

RTP-Q: a reinforcement learning system with an active exploration planning structure for enhancing the convergence rate

Gang Zhao; Shoji Tatsumi; Ruoying Sun

In this paper, we propose an active exploration planning method for the prioritized sweeping reinforcement learning system that makes an agent explore an environment efficiently. To plan active exploration behavior, and considering the feature of the estimated values in the primitive learning system of our structure, we propose an exploration planning method that fully uses the learned model, plans active exploration actions, and simplifies parameter setting. The proposed system uses the learned model efficiently not only for computing estimates but also for realizing active exploration of the environment. Comparison experiments with different methods on navigation tasks demonstrate the efficiency of the proposed method.
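
As background for the prioritized sweeping machinery the paper builds on, the following is a minimal sketch of the standard priority-queue backup loop over a learned deterministic model; the data structures are illustrative assumptions, and the active exploration planning added by RTP-Q is not shown.

```python
import heapq

def prioritized_sweeping(Q, model, predecessors, queue, actions,
                         n_backups=10, gamma=0.9, theta=1e-3):
    """One planning phase of standard prioritized sweeping over a learned
    deterministic model: pop the highest-priority state-action pair, back it
    up, then queue predecessors whose value change would be large enough."""
    for _ in range(n_backups):
        if not queue:
            break
        _, (s, a) = heapq.heappop(queue)
        r, s2 = model[(s, a)]                          # deterministic model
        Q[(s, a)] = r + gamma * max(Q.get((s2, b), 0.0) for b in actions)
        for (sp, ap) in predecessors.get(s, []):       # states leading into s
            rp, _ = model[(sp, ap)]
            priority = abs(rp + gamma * max(Q.get((s, b), 0.0) for b in actions)
                           - Q.get((sp, ap), 0.0))
            if priority > theta:
                heapq.heappush(queue, (-priority, (sp, ap)))
    return Q

# Tiny two-state example: from state 0, action "go" reaches rewarding state 1.
actions = ["go"]
model = {(0, "go"): (0.0, 1), (1, "go"): (1.0, 1)}
predecessors = {1: [(0, "go")]}
queue = [(-1.0, (1, "go"))]                            # seed with a surprise
print(prioritized_sweeping({}, model, predecessors, queue, actions))
```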


International Conference on Robotics and Automation | 1999

A heuristic Q-learning architecture for fully exploring a world and deriving an optimal policy by model-based planning

Gang Zhao; Shoji Tatsumi; Ruoying Sun

For solving Markov decision processes with incomplete information in robot learning tasks, model-based algorithms make effective use of gathered data but usually require heavy computation. Dyna-Q is an architecture that uses experiences to build a model and simultaneously uses the model to adjust the policy; however, it does not help an agent explore an environment actively. In this paper, we present the Exa-Q architecture, which learns a model and plans with the learned model to help the reinforcement learning agent explore an environment actively and improve the reinforcement function estimate. As a result, the Exa-Q architecture can fully identify an environment and speed up the learning rate for deriving the optimal policy. Experimental results demonstrate that the proposed method is efficient.
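
Since the abstract positions Exa-Q against Dyna-Q, a minimal sketch of the standard Dyna-Q loop (direct Q-learning plus simulated backups from a learned model) is included below for reference. The environment interface and the corridor example are assumptions, and Exa-Q's active exploration additions are not reproduced.

```python
import random

def dyna_q(env_step, start_state, states, actions, episodes=50,
           planning_steps=5, alpha=0.5, gamma=0.95, epsilon=0.1):
    """Standard Dyna-Q: after each real transition, do a Q-learning backup,
    record the transition in a model, and replay `planning_steps` simulated
    backups drawn from the model. `env_step(s, a) -> (reward, next_state,
    done)` is an assumed interface."""
    Q = {(s, a): 0.0 for s in states for a in actions}
    model = {}                                   # (s, a) -> (r, s', done)
    for _ in range(episodes):
        s = start_state
        for _ in range(100):
            a = (random.choice(actions) if random.random() < epsilon
                 else max(actions, key=lambda a_: Q[(s, a_)]))
            r, s2, done = env_step(s, a)
            target = r + (0.0 if done else gamma * max(Q[(s2, b)] for b in actions))
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            model[(s, a)] = (r, s2, done)
            for _ in range(planning_steps):      # planning from the model
                (ps, pa), (pr, ps2, pdone) = random.choice(list(model.items()))
                ptarget = pr + (0.0 if pdone else
                                gamma * max(Q[(ps2, b)] for b in actions))
                Q[(ps, pa)] += alpha * (ptarget - Q[(ps, pa)])
            if done:
                break
            s = s2
    return Q

# Illustrative 1-D corridor: reach state 4 from state 0 by moving right.
states, actions = list(range(5)), ["left", "right"]
def env_step(s, a):
    s2 = min(4, s + 1) if a == "right" else max(0, s - 1)
    return (1.0, s2, True) if s2 == 4 else (0.0, s2, False)
Q = dyna_q(env_step, 0, states, actions)
print(max(actions, key=lambda a: Q[(2, a)]))     # expected: "right"
```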


IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | 1999

RTP-Q: A Reinforcement Learning System with Time Constraints Exploration Planning for Accelerating the Learning Rate

Gang Zhao; Shoji Tatsumi; Ruoying Sun


Journal of the Japanese Society for Artificial Intelligence | 1999

An Accelerated k-Certainty Exploration Method

Gang Zhao; Shoji Tatsumi; Ruoying Sun


Systems, Man and Cybernetics | 2003

Q-ac: multiagent reinforcement learning with perception-conversion action

Ruoying Sun; Shoji Tatsumi; Gang Zhao


Journal of the Japanese Society for Artificial Intelligence | 1999

Q-ee Learning: A Novel Q-Learning Method with Exploitation and Exploration

Gang Zhao; Shoji Tatsumi; Ruoying Sun

Collaboration


Dive into Ruoying Sun's collaboration.

Top Co-Authors


Gang Zhao

Osaka City University
