Daisuke Uragami | Researchain

Publication

Featured research published by Daisuke Uragami.

BioSystems | 2014

Cognitively inspired reinforcement learning architecture and its application to giant-swing motion control

Daisuke Uragami; Tatsuji Takahashi; Yoshiki Matsuo

Many algorithms and methods in artificial intelligence and machine learning were inspired by human cognition. As a mechanism to handle the exploration-exploitation dilemma in reinforcement learning, the loosely symmetric (LS) value function, which models the causal intuition of humans, was proposed (Shinohara et al., 2007). LS shows the highest correlation with causal induction by humans, and it has also been reported to work effectively in multi-armed bandit problems, which form the simplest class of tasks representing the dilemma. However, the scope of application of LS was limited to reinforcement learning problems that have K actions and only one state (K-armed bandit problems). This study proposes the LS-Q learning architecture, which can deal with general reinforcement learning tasks with multiple states and delayed reward. We tested the learning performance of the new architecture in giant-swing robot motion learning, where the uncertainty and unknown-ness of the environment are large. In the test, neither ready-made internal models nor function approximation of the state space was provided. The simulations showed that while the ordinary Q-learning agent does not reach giant-swing motion because of stagnant loops (local optima with low rewards), LS-Q escapes such loops and acquires giant-swing. We confirmed that the smaller the number of states, that is, the more coarse-grained the division of states and the more incomplete the state observation, the better LS-Q performs in comparison with Q-learning. We also showed that the high performance of LS-Q depends comparatively little on parameter tuning and learning time. This suggests that the proposed method inspired by human cognition works adaptively in real environments.
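For orientation, the "ordinary Q-learning" baseline that the abstract compares against can be sketched as below. This is a minimal tabular sketch of the standard algorithm, not the LS-Q architecture itself, which the abstract does not specify. The table layout, the parameter values (`alpha`, `gamma`, `epsilon`) and the toy two-state table are assumptions for illustration only.

```python
import random

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One ordinary Q-learning step on a table q[state][action]."""
    best_next = max(q[next_state])  # greedy value of the successor state
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]

def epsilon_greedy(q, state, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    values = q[state]
    return max(range(len(values)), key=values.__getitem__)

# Hypothetical toy table: 2 coarse-grained states, 2 actions each.
q = {0: [0.0, 0.0], 1: [0.0, 1.0]}
q_update(q, state=0, action=1, reward=0.5, next_state=1)
```

With a fixed, small `epsilon`, an agent of this kind explores only through occasional random actions; the stagnant loops described in the abstract are the failure mode that the LS-based value function is reported to escape.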

International Conference on Mechatronics and Automation | 2011

The efficacy of symmetric cognitive biases in robotic motion learning

Daisuke Uragami; Tatsuji Takahashi; Hisham Alsubeheen; Akinori Sekiguchi; Yoshiki Matsuo

We propose an application of human-like decision-making to robotic motion learning. Humans are known to have illogical symmetric cognitive biases that induce “if p then q” and “if not q then not p” from “if q then p.” The loosely symmetric Shinohara model quantitatively represents these tendencies (Shinohara et al. 2007). Previous studies by one of the authors have revealed that an agent using the model as its action value function performs very well in n-armed bandit problems because of the illogical biases. In this study, we apply the model to reinforcement learning with the Q-learning algorithm. Testing the model on a simulated giant-swing robot, we have confirmed its efficacy in increasing convergence speed and avoiding local optima.

BioSystems | 2016

Robotic action acquisition with cognitive biases in coarse-grained state space

Daisuke Uragami; Yu Kohno; Tatsuji Takahashi

Some of the authors have previously proposed a cognitively inspired reinforcement learning architecture (LS-Q) that mimics cognitive biases in humans. LS-Q adaptively learns under uniform, coarse-grained state division and performs well without parameter tuning in a giant-swing robot task. However, these results were shown only in simulations. In this study, we test the validity of LS-Q implemented in a robot in a real environment. In addition, we analyze the learning process to elucidate the mechanism by which LS-Q adaptively learns in a partially observable environment. We argue that LS-Q may be a versatile reinforcement learning architecture that, despite its simplicity, is easily applicable and does not require well-prepared settings.

Soft Computing | 2012

Presynaptic inhibition balances the trade-off between differential sensitivity and reproducibility

Hiroyuki Ohta; Daisuke Uragami; Yasuhiro Nishida; James C. Houk

Neural adaptation processes need to balance the ability to distinguish similar patterns (differential sensitivity) against the ability to learn new patterns while keeping old ones intact (incremental learning). In this paper, we used a striatal medium spiny neuron (MSN) model to assess the effects of presynaptic inhibition on the trade-off between differential sensitivity and incremental learning. Differential sensitivity is positively correlated with the strength of long-term depression (LTD) because LTD emphasizes pattern differences by eliminating weak anti-causal responses. However, strong LTD interferes with incremental learning. We assumed that presynaptic lateral inhibition could emphasize the pattern differences without losing the previously formed synaptic weight pattern. We then applied a spike-timing-dependent plasticity (STDP) rule to a simulation of the corticostriatal pathway with and without presynaptic inhibition of the MSNs. The results confirmed that the trade-off problem could be overcome by presynaptic lateral inhibition.
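As a reminder of what an STDP rule looks like, here is a minimal sketch of the common pair-based form with exponential windows. The abstract does not state which STDP variant or parameters the paper used, so the function name `stdp_dw`, the amplitudes and the time constants below are all assumed for illustration.

```python
import math

def stdp_dw(dt_ms, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair, with dt_ms = t_post - t_pre.

    Causal pairs (pre before post, dt_ms > 0) are potentiated (LTP);
    anti-causal pairs (dt_ms < 0) are depressed (LTD).
    All parameter values are assumptions, not taken from the paper.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0

# A causal pair strengthens the synapse; an anti-causal pair weakens it.
ltp = stdp_dw(10.0)
ltd = stdp_dw(-10.0)
```

In this form, raising `a_minus` corresponds to stronger LTD, which is the knob the abstract links to higher differential sensitivity at the cost of incremental learning.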

International Conference of Numerical Analysis and Applied Mathematics (ICNAAM 2016) | 2017

Analysis of human body motion by Lattice theory

Daisuke Uragami; Yurika Suzuki

We propose a method of applying lattice algebra to the analysis of multivariate time series data. We measured the body motion of a Shorinji-Kempo kata using acceleration sensors at five positions on the body. The proposed analysis was applied to the time series data obtained from the sensors. As a result, a correlation was observed between the skill levels and the number of elements of the lattice generated using the time series data. We observed that highly skilled subjects executed more complex motions; thus, we consider the number of lattice elements as an index of complexity of the space–time pattern.

Society of Instrument and Control Engineers of Japan | 2015

A study on effect of two-arch structure of foot for biped robots

Akinori Sekiguchi; Tatsuya Morimoto; Yoshiki Matsuo; Daisuke Uragami

The arch structure of the human foot plays important roles in bipedal locomotion, such as impact absorption. In this paper, a two-arch structure is introduced into the foot model of a biped robot. Using a flat foot model, a one-arch foot model and the two-arch foot model, simulations of robot motion are performed with ODE, and the effects of the arch structure are verified. When the inner arch was softer than the outer arch, as in humans, a robot motion in which the COG returns toward the inner side was observed. It was verified that the property of this returning motion was determined by the difference in elasticity between the inner arch and the outer arch, and that the property of motion in the frontal direction was determined by the average of the elasticities.

Soft Computing | 2012

Balancing between incremental learning and generalization by structured mutual inhibition: Shifting away from the question of whether there is a grandmother cell and toward conceptualization

Daisuke Uragami; Hiroyuki Ohta

Distributed connectionist networks have no facility for incremental learning, but they have the advantage of being able to generalize. In contrast, winner-take-all networks are suitable for incremental learning but lack the ability to generalize. In this paper, we use an abstract model to assess the trade-off between incremental learning and generalization abilities, and we propose a new model to solve this dilemma. To formulate and analyze the trade-off, we have defined a network that emulates both the connectionist network and the winner-take-all network. It does this through a single parameter that specifies the range of lateral inhibition and varies continuously and gradually between the distributed and winner-take-all networks. By using structured mutual inhibition instead of simple lateral inhibition, the network is able to balance the ability to learn incrementally with the ability to generalize. We also analyze the behavior of the proposed mechanisms using Formal Concept Analysis, which reveals that the network can form concepts that are defined by firing patterns in network subgroups. Based on these results, we propose that longstanding perspectives on this underlying dilemma in connectionism should be shifted and that the tradeoff problem needs to be solved through a new conceptualization.
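To illustrate the Formal Concept Analysis step mentioned above, the sketch below enumerates the formal concepts of a small binary context. It is a generic textbook construction, not the authors' analysis code. Treating input patterns as objects and firing units as attributes is an assumption made here for illustration, and the context `firing` is hypothetical data.

```python
def concept_intents(context, attributes):
    """Return every concept intent of a binary context.

    context maps each object to the set of attributes it has. The intents
    are exactly the intersections of object attribute sets, plus the full
    attribute set (the intent of the empty extent).
    """
    intents = {frozenset(attributes)}
    for row in context.values():
        row = frozenset(row)
        # Close the current family under intersection with this row.
        intents |= {row & intent for intent in intents}
    return intents

def extent(context, intent):
    """Objects whose attribute set contains every attribute in intent."""
    return {obj for obj, row in context.items() if intent <= set(row)}

# Hypothetical context: which units (a, b, c) fire for each input pattern.
firing = {"p1": {"a", "b"}, "p2": {"b", "c"}, "p3": {"b"}}
intents = concept_intents(firing, {"a", "b", "c"})
shared_by_all = extent(firing, frozenset({"b"}))
```

Each intent paired with its extent is one concept; ordering the concepts by inclusion of extents gives the concept lattice, whose elements correspond to groups of patterns that share a common set of firing units.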

Transactions of The Japanese Society for Artificial Intelligence | 2016