Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Tomoki Hamagami is active.

Publication


Featured research published by Tomoki Hamagami.


Systems, Man and Cybernetics | 2004

Development of intelligent wheelchair acquiring autonomous, cooperative, and collaborative behavior

Tomoki Hamagami; Hironori Hirata

An intelligent wheelchair (IWC) prototype system, ACCoMo, is developed to aid safe indoor mobility for physically challenged people. ACCoMo, as an agent, can acquire autonomous, cooperative, and collaborative behavior. The autonomous behavior realizes safe and effective movement by observing the local real environment. The cooperative behavior emerges dynamically from interactions with other ACCoMo agents. The collaborative behavior aims to assist user operations and provides functions for connecting to various ubiquitous devices. These behaviors are acquired through the learning and evolution of intelligent ACCoMo agents via their experience in real or virtual environments. Experiments in real-world environments show that the agent can acquire these intelligent behaviors on ACCoMo.


IEEE/WIC/ACM International Conference on Intelligent Agent Technology | 2003

Method of crowd simulation by using multiagent on cellular automata

Tomoki Hamagami; Hironori Hirata

This paper presents a new simulation method for crowd behavior that uses a two-layer model consisting of a multiagent (MA) framework and cellular automata (CA). The features of this method are as follows. (1) Complicated crowd behavior emerges from the autonomous actions of agents. (2) The autonomous action process is separated from the restriction of physical interference. Using a simulation system implementing the two-layer model, crowd behavior simulations are realized. In particular, collision cases of counter-flowing crowds are analyzed in detail, and interesting results are found. (1) A homogeneous agent crowd tends to produce whirlpools, waves, and blanks, and walks slowly. (2) A heterogeneous agent crowd forms lines and then flows efficiently. Experimental results show that combining MA and CA is effective for easily realizing complicated crowd behavior in various environments.
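The two-layer idea, agents proposing moves autonomously while the cellular grid resolves physical interference, can be sketched as follows. This is a minimal illustration under assumed details (grid size, agent fields, and the conflict rule are all inventions for demonstration, not the paper's model):

```python
import random

random.seed(0)
GRID_W, GRID_H = 10, 5

def step(agents):
    """One step of a two-layer crowd model.

    Layer 1 (multiagent): each agent autonomously proposes its next cell.
    Layer 2 (cellular automata): the grid resolves physical interference by
    allowing at most one agent per cell.
    """
    occupied = {tuple(a["pos"]) for a in agents}
    proposals = {}
    for a in agents:
        x, y = a["pos"]
        nx = min(max(x + a["dx"], 0), GRID_W - 1)
        proposals.setdefault((nx, y), []).append(a)
    for cell, movers in proposals.items():
        if cell in occupied:                 # target cell blocked: nobody enters
            continue
        winner = random.choice(movers)       # CA layer arbitrates the conflict
        occupied.discard(tuple(winner["pos"]))
        winner["pos"] = list(cell)
        occupied.add(cell)

# two counter-flowing crowds: one walks right, the other walks left
agents = ([{"pos": [0, y], "dx": +1} for y in range(GRID_H)] +
          [{"pos": [GRID_W - 1, y], "dx": -1} for y in range(GRID_H)])
for _ in range(20):
    step(agents)
```

Because the CA layer never lets two agents enter the same cell, head-on collisions produce exactly the local blocking from which the paper's crowd patterns emerge.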


Systems, Man and Cybernetics | 2003

An adjustment method of the number of states of Q-learning segmenting state space adaptively

Tomoki Hamagami; Hironori Hirata

This paper presents a method to partition a continuous state space for the purpose of realizing autonomous agent behavior. The basic idea of this partitioning technique is derived from QLASS (Q-learning with adaptive state segmentation), which is a simple and effective technique. In segmentation by QLASS, the discrete state space is constructed as a Voronoi diagram generated by a finite set of points called generators, so the state space is intuitively easy to understand. However, QLASS has a problem: the algorithm generates too many segments during learning, so an agent using QLASS cannot learn appropriate actions efficiently. To overcome this problem, an adjustment method for the number of states is proposed, which restricts or boosts the partitioning using the eligibility and temperature parameter of each segment. Experimental results show that this adjustment method can partition the state space suitably according to not only the characteristics of the environment but also its dynamic changes.
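As a rough illustration of Voronoi-based state segmentation for Q-learning, the sketch below runs tabular Q-learning on a 1-D continuous toy task and adds a generator wherever the TD error is large, with a cap standing in for the paper's adjustment of the number of states. The task, thresholds, and split rule are assumptions for demonstration, not QLASS or the paper's exact method:

```python
import random

random.seed(1)
ACTIONS = [-0.1, +0.1]            # step left / right on a 1-D line
ALPHA, GAMMA, MAX_GENERATORS = 0.5, 0.9, 50

generators = [0.0, 1.0]           # Voronoi generators = discrete states
Q = {0: [0.0, 0.0], 1: [0.0, 0.0]}

def segment(s):
    """Discrete state = index of the nearest generator (a 1-D Voronoi cell)."""
    return min(range(len(generators)), key=lambda i: abs(s - generators[i]))

def adjust(s, td_error, threshold=0.4):
    """Crude stand-in for the adaptive adjustment: refine the partition where
    the TD error is large, and cap the total number of segments."""
    if abs(td_error) > threshold and len(generators) < MAX_GENERATORS:
        generators.append(s)
        Q[len(generators) - 1] = [0.0, 0.0]

# toy task: start at a random point, reward 1 for reaching s >= 0.95
for _ in range(200):
    s = random.random()
    for _ in range(50):
        seg = segment(s)
        if random.random() < 0.5:
            a = random.randrange(2)          # explore
        else:
            vals = Q[seg]                    # greedy with random tie-break
            a = random.choice([i for i in range(2) if vals[i] == max(vals)])
        s2 = min(max(s + ACTIONS[a], 0.0), 1.0)
        r = 1.0 if s2 >= 0.95 else 0.0
        td = r + GAMMA * max(Q[segment(s2)]) - Q[seg][a]
        Q[seg][a] += ALPHA * td
        adjust(s, td)
        s = s2
        if r:
            break
```

The cap plays the role of restricting over-segmentation; without it, every large TD error would spawn a segment, which is exactly the QLASS problem the paper addresses.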


Systems, Man and Cybernetics | 2007

A new genetic algorithm with diploid chromosomes by using probability decoding for non-stationary function optimization

Manabu Kominami; Tomoki Hamagami

This paper proposes a new diploid operation technique with probability for non-stationary function optimization. The advantage of the technique over previous diploid genetic algorithms (diploid GAs) is that one genotype is transformed into many phenotypes with probability. The technique gives genes a probabilistic representation of dominance and can keep a diversity of individuals. The experimental results show that the technique can adapt to severe environmental changes where previous diploid GAs cannot. It is shown that the technique is able to find optimum solutions with high probability and to make a trade-off between diversity and convergence.
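The core trick, one genotype decoding probabilistically into many phenotypes, can be sketched with a toy non-stationary bit-matching task. The real-valued allele pairs, the averaging decode rule, the mutation scheme, and the mid-run target flip are illustrative assumptions, not the paper's operators:

```python
import random

random.seed(2)
N_BITS, POP, GENS = 12, 30, 120

def decode(ind):
    """Probability decoding: each locus holds two real alleles; the phenotype
    bit is drawn as 1 with probability equal to the allele average, so one
    genotype maps to many phenotypes."""
    return [1 if random.random() < (a + b) / 2 else 0 for a, b in ind]

def fitness(phen, target):
    return sum(p == t for p, t in zip(phen, target))

def mutate(ind, rate=0.1):
    return [tuple(min(1.0, max(0.0, al + random.uniform(-0.3, 0.3)))
                  if random.random() < rate else al for al in locus)
            for locus in ind]

target = [1] * N_BITS
pop = [[(random.random(), random.random()) for _ in range(N_BITS)]
       for _ in range(POP)]

for g in range(GENS):
    if g == GENS // 2:                 # non-stationary environment:
        target = [0] * N_BITS          # the optimum flips abruptly
    scored = sorted(pop, key=lambda i: fitness(decode(i), target), reverse=True)
    elite = scored[:POP // 2]
    pop = elite + [mutate(random.choice(elite)) for _ in range(POP - len(elite))]

best = max(fitness(decode(i), target) for i in pop)
```

Because a genotype with alleles near 0.5 keeps sampling both phenotype bits, the population retains the latent diversity that lets it recover after the target flips.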


Systems, Man and Cybernetics | 2006

Complex-Valued Reinforcement Learning

Tomoki Hamagami; Takashi Shibuya; Shingo Shimada

A new reinforcement learning algorithm with complex-valued functions is proposed. The algorithm is inspired by complex-valued neural networks, which introduce complex numbers representing phase and amplitude into a conventional neural network. The strong advantage of using complex values in reinforcement learning is that the state-action function in a time series can be easily extended. In particular, by considering the coherence of each complex value, the proposed learning algorithm can represent the context of agent behavior. This extension compensates for the perceptual aliasing problem and provides for the intelligent behavior of mobile robots in the real world. The complex-valued functions are applied to the conventional reinforcement learning algorithms Q-learning and profit sharing. These algorithms are evaluated on simple maze problems and a bar-carrying task involving perceptual aliasing. Simulation experiments show that the new algorithm can efficiently solve perceptual aliasing.
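One way to picture the idea is a tabular Q-learner whose values are complex numbers and whose internal reference phase advances every step, so the same observation stores different values in different temporal contexts. This is a loose sketch of the phase-as-context idea on an assumed aliased corridor, not the authors' exact update rule:

```python
import cmath
import random

random.seed(3)
ALPHA, GAMMA, OMEGA = 0.3, 0.9, cmath.pi / 4   # OMEGA: reference phase advance
ACTIONS = [0, 1]                                # 0: step left, 1: step right
# corridor 0..4 with goal at 4; cells 1 and 3 are perceptually aliased ("A")
OBS = {0: "S", 1: "A", 2: "B", 3: "A", 4: "G"}

Q = {}
def q(o, a):
    return Q.get((o, a), 0j)

def choose(o, phase, eps=0.3):
    if random.random() < eps:
        return random.choice(ACTIONS)
    # coherence-based selection: project each complex Q-value onto the
    # agent's internal reference phase and pick the best-aligned action
    scores = [(q(o, a) * cmath.exp(-1j * phase)).real for a in ACTIONS]
    return random.choice([a for a in ACTIONS if scores[a] == max(scores)])

for _ in range(500):
    pos, phase = 0, 0.0
    for _ in range(20):
        o = OBS[pos]
        a = choose(o, phase)
        pos = max(0, min(4, pos + (1 if a == 1 else -1)))
        r = 1.0 if pos == 4 else 0.0
        target = r + GAMMA * max(abs(q(OBS[pos], b)) for b in ACTIONS)
        # write the scalar target back at the current reference phase, so the
        # same observation can hold context-dependent values
        Q[(o, a)] = q(o, a) + ALPHA * (target * cmath.exp(1j * phase) - q(o, a))
        phase += OMEGA
        if pos == 4:
            break
```

The aliased cells 1 and 3 are visited at different internal phases, so their updates land at different angles in the complex plane instead of destructively overwriting one scalar.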


Systems, Man and Cybernetics | 2005

State space partitioning and clustering with sensor alignment for autonomous robots

Tomoki Hamagami; Hironori Hirata

The goal of this study is to develop an intelligent wheelchair (IWC) that acquires autonomous, cooperative, and collaborative behavior. An important part of achieving this goal is that the agent controlling the IWC can realize suitable state partitioning of a new environment so as to develop a behavior policy without human intervention. In particular, domestic robots such as IWCs, which have low-cost, simple discrete range sensors, have difficulty learning the environment efficiently, because deviations in the sensor configuration of each robot cause a serious problem: learned policies cannot be shared among robots. To overcome this problem, a new method of alignment and clustering of the sensor space is proposed. The alignment process reconfigures the sensors on a concentric circle using the spot-turn behavior of IWC robots in order to compensate for the deviation of the configuration. The clustering process realizes a self-organized state space with a simple vector partitioning method. The simulation results show that the method gives the IWC agent the ability to autonomously partition the environment and efficiently construct the state space.
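The alignment step can be illustrated with circular cross-correlation: during a spot turn, a sensor ring just rotates the same range profile, so a mounting offset shows up as a cyclic shift. The 8-sensor profile and range values below are made up for illustration; the paper's sensors and alignment procedure differ in detail:

```python
def spot_turn_offset(ref, obs):
    """Estimate a robot's sensor-ring misalignment (in sensor indices) by
    circularly cross-correlating its range profile with a reference robot's
    profile; a spot turn only rotates the profile, so the best-correlated
    shift recovers the mounting offset."""
    n = len(ref)
    def score(shift):
        return sum(ref[i] * obs[(i + shift) % n] for i in range(n))
    return max(range(n), key=score)

# hypothetical 8-sensor range profile of the same environment
ref = [1.0, 2.0, 4.0, 3.0, 1.5, 0.5, 0.8, 2.5]
# a second robot whose sensor ring is mounted 3 slots off (profile rotated)
obs = ref[-3:] + ref[:-3]
offset = spot_turn_offset(ref, obs)
aligned = obs[offset:] + obs[:offset]   # now comparable with ref
```

Once every robot's readings are rotated into a common frame like this, a single clustered state space (the paper's vector partitioning step) can be shared across robots.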


Systems, Man and Cybernetics | 2002

Reinforcement learning to compensate for perceptual aliasing using dynamic additional parameter: motivational value

Tomoki Hamagami; Seiichi Koakutsu; Hironori Hirata

Abstract: In this paper, we present a new reinforcement learning approach that compensates for the perceptual aliasing problem by varying policies depending on the behavior context. For this approach, a motivational value (M-value) is introduced as a parameter that temporarily emphasizes specific future action selection probabilities according to the context. In the learning phase, a Q-value update error linked with the current state-action pair is memorized as an M-value linked with past visited experiences. In the control phase, to motivate the next action, an agent awakens the M-values linked with the current state and memorized in past experiences. By combining M-values with Q-values, even if an agent observes the same sensory inputs in different states, the agent can generate different action selection policies according to the context. The advantage of the proposed approach is that a learning/control system reflecting the difference of context can be realized easily, while saving computational memory, by a simple extension of general reinforcement learning: Q-learning. In order to investigate the validity of the proposed method, we apply it to a maze problem containing the perceptual aliasing problem and compare it with general Q-learning. The results of the maze experiment show that the proposed approach works effectively in non-Markov decision process environments involving perceptual aliasing problems. Keywords: reinforcement learning, Q-learning, POMDPs, perceptual aliasing.
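A loose reading of the learning/control split might look like the sketch below: a standard Q-learner whose TD error is also written into an M-table keyed by the recent context (here, just the previous observation), and whose action values are Q plus the awakened M. The aliased corridor, the one-step context, and the M update rule are all assumptions for illustration, not the paper's algorithm:

```python
import random

random.seed(4)
ALPHA, GAMMA, DECAY = 0.3, 0.9, 0.6
ACTIONS = [0, 1]                      # 0: step left, 1: step right
# corridor with two aliased cells: positions 1 and 3 both look like "A"
OBS = {0: "S", 1: "A", 2: "B", 3: "A", 4: "G"}

Q, M = {}, {}

def value(o, a, context):
    # control phase: M-values linked to the recent context are awakened
    # and added to Q, so identical observations can yield different policies
    return Q.get((o, a), 0.0) + sum(M.get(((p, o), a), 0.0) for p in context)

def choose(o, context, eps=0.3):
    if random.random() < eps:
        return random.choice(ACTIONS)
    best = max(value(o, a, context) for a in ACTIONS)
    return random.choice([a for a in ACTIONS if value(o, a, context) == best])

for _ in range(500):
    pos, context = 0, []
    for _ in range(20):
        o = OBS[pos]
        a = choose(o, context[-1:])           # context = previous observation
        pos = max(0, min(4, pos + (1 if a == 1 else -1)))
        r = 1.0 if pos == 4 else 0.0
        delta = (r + GAMMA * max(Q.get((OBS[pos], b), 0.0) for b in ACTIONS)
                 - Q.get((o, a), 0.0))
        Q[(o, a)] = Q.get((o, a), 0.0) + ALPHA * delta
        # learning phase: the update error is also memorized as an M-value
        # linked to the past experience (previous obs, current obs, action)
        for p in context[-1:]:
            key = ((p, o), a)
            M[key] = (1 - ALPHA) * M.get(key, 0.0) + ALPHA * DECAY * delta
        context.append(o)
        if pos == 4:
            break
```

Only one extra table indexed by short context is kept, which is the memory-saving point the abstract makes relative to full history-based methods.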


Genetic and Evolutionary Computation Conference | 2017

Theoretical XCS parameter settings of learning accurate classifiers

Masaya Nakata; Will N. Browne; Tomoki Hamagami; Keiki Takadama

XCS is the most popular type of Learning Classifier System, but setting optimum parameter values is more of an art than a science. Early theoretical work required the impractical assumption that classifier parameters had fully converged over infinite update times. The aim of this work is to derive a theoretical condition that mathematically guarantees XCS identifies maximally accurate classifiers, such that subsequent deletion methods can be used optimally, in as few updates as possible. Consequently, our theory provides a universally usable setup guide for three important parameter settings: the learning rate, the accuracy update, and the threshold for subsumption deletion. XCS with our best parameter settings solves the 70-bit multiplexer problem with only 21% of the instances that the standard XCS setup needs. On a highly class-imbalanced multiplexer problem with inaccurate classifiers having more than 99.99% classification accuracy, our theory enables XCS to identify only 100% accurate classifiers as accurate and thus obtain optimal performance.
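For context, the quantities the three parameters govern are the classifier's prediction-error update and the standard XCS accuracy function; the sketch below uses the commonly published forms and default-style values, not the settings derived in this paper:

```python
def xcs_accuracy(eps, eps0=10.0, alpha=0.1, nu=5.0):
    """Standard XCS accuracy function: a classifier whose prediction error
    eps is below the accuracy criterion eps0 counts as fully accurate;
    above it, accuracy falls off as a power law scaled by alpha."""
    return 1.0 if eps < eps0 else alpha * (eps / eps0) ** -nu

def update_error(eps, payoff_error, beta=0.2):
    # Widrow-Hoff update with learning rate beta; the paper's theory asks how
    # beta, the accuracy update, and the subsumption threshold must relate so
    # that maximally accurate classifiers are identified in few updates
    return eps + beta * (payoff_error - eps)
```

With beta too small, eps of a truly accurate classifier decays toward eps0 slowly, so accuracy-based subsumption fires late; that coupling is what the derived condition pins down.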


IEEE International Conference on Intelligent Systems and Control | 2015

Examination of skill-based learning by inverse reinforcement learning using evolutionary process

Hiroaki Tsunekawa; Takuo Suzuki; Tomoki Hamagami

In this paper, we propose skill-based learning by inverse reinforcement learning using an evolutionary process. Reinforcement learning requires a large amount of time and, depending on the learning targets, convergence is not guaranteed. In addition, if the learning targets are not known clearly, an appropriate reward cannot be defined, which makes learning difficult. The sub-goal method and inverse reinforcement learning are effective for these problems: they address the large time requirement and the difficulty of finding an appropriate reward, respectively. However, when there is interference between behavior rules, learning is not achieved efficiently by the sub-goal method. Therefore, in this study, the process of learning each behavior rule simultaneously is generated with an evolutionary process, and intermediate reward functions are obtained by inverse reinforcement learning of that process. The target behavior is achieved by using these reward functions. This proposed method is called skill-based learning. Finally, the effectiveness of skill-based learning is confirmed by an experiment on a driving task.


Mobile Networks and Applications | 2012

A Method for Classifying Packets into Network Flows Based on GHSOM

Hongbo Shi; Tomoki Hamagami; Haoyuan Xu; Ping Yu; Yonghe Wu

Recently, various applications and services are used on the Internet. Load balancing the increasing network traffic in real time can improve network quality, so flow control technologies have become much more important than before. Our research proposes an intelligent network flow identification method based on the neural network algorithm GHSOM. In this paper, we suggest utilizing the structural classification of GHSOM to train on packet properties such as timestamp, source, and destination. Based on our proposed normalization, IP network flows can be formed autonomously during the learning process. The combined use of the new normalization with GHSOM can divide a flow into several sub-IP flows. This paper indicates that a flow consists of several sub-IP flows, and a sub-IP flow consists of several IP packets. The experiments show that IP packets can be divided properly into flow and sub-IP flow classes. Furthermore, repeated jumbo sub-IP flows can be used to discover communication errors or abnormal attacks.
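The pipeline of normalizing packet properties and letting a self-organizing map group packets into flow classes can be sketched as follows. The feature choices, scaling constants, and packet data are hypothetical, and a flat SOM stands in for GHSOM's growing hierarchy:

```python
import random
import zlib

random.seed(5)

def normalize(pkt):
    """Hypothetical normalization of raw packet properties into a fixed
    numeric vector: scaled timestamp, hashed source/destination, size."""
    return [pkt["ts"] / 100.0,
            (zlib.crc32(pkt["src"].encode()) % 1000) / 1000.0,
            (zlib.crc32(pkt["dst"].encode()) % 1000) / 1000.0,
            pkt["len"] / 1500.0]

def train_som(data, n_units=4, epochs=50, lr=0.3):
    # flat self-organizing map: pull the best-matching unit toward each vector
    units = [[random.random() for _ in data[0]] for _ in range(n_units)]
    for e in range(epochs):
        rate = lr * (1 - e / epochs)
        for v in data:
            bmu = min(units, key=lambda u: sum((a - b) ** 2 for a, b in zip(u, v)))
            for i, x in enumerate(v):
                bmu[i] += rate * (x - bmu[i])
    return units

def classify(units, v):
    return min(range(len(units)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(units[i], v)))

packets = ([{"ts": t, "src": "10.0.0.1", "dst": "10.0.0.2", "len": 1500}
            for t in range(0, 50, 5)] +        # a jumbo bulk-transfer flow
           [{"ts": t, "src": "10.0.0.3", "dst": "10.0.0.4", "len": 60}
            for t in range(1, 50, 5)])         # a small interactive flow
vecs = [normalize(p) for p in packets]
units = train_som(vecs)
labels = [classify(units, v) for v in vecs]
```

In GHSOM, a unit that attracts too heterogeneous a set of packets would spawn a child map, which is how a flow is further split into sub-IP flows.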

Collaboration


Dive into Tomoki Hamagami's collaboration.

Top Co-Authors

Fumiya Hamatsu, Yokohama National University
Hongbo Shi, Yokohama National University
Masaya Nakata, Yokohama National University
Haoyuan Xu, Yokohama National University