Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Masahiko Haruno is active.

Publication


Featured research published by Masahiko Haruno.


Neural Computation | 2001

MOSAIC Model for Sensorimotor Learning and Control

Masahiko Haruno; Daniel M. Wolpert; Mitsuo Kawato

Humans demonstrate a remarkable ability to generate accurate and appropriate motor behavior under many different and often uncertain environmental conditions. We previously proposed a new modular architecture, the modular selection and identification for control (MOSAIC) model, for motor learning and control based on multiple pairs of forward (predictor) and inverse (controller) models. The architecture simultaneously learns the multiple inverse models necessary for control as well as how to select the set of inverse models appropriate for a given environment. It combines both feedforward and feedback sensorimotor information so that the controllers can be selected both prior to movement and subsequently during movement. This article extends and evaluates the MOSAIC architecture in the following respects. First, learning in the architecture was implemented by both the original gradient-descent method and the expectation-maximization (EM) algorithm; unlike gradient descent, the newly derived EM algorithm is robust to the initial starting conditions and learning parameters. Second, simulations of an object manipulation task demonstrate that the architecture can learn to manipulate multiple objects and switch between them appropriately. Moreover, after learning, the model shows generalization to novel objects whose dynamics lie within the polyhedra of already learned dynamics. Finally, when each of the dynamics is associated with a particular object shape, the model is able to select the appropriate controller before movement execution. When presented with a novel shape-dynamics pairing, inappropriate activation of modules is observed, followed by on-line correction.
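The selection mechanism at the core of MOSAIC can be summarized in a few lines. The sketch below is a minimal illustration, not the authors' implementation: each module's forward model predicts the current state, a softmax over the prediction errors yields the responsibility signal, and the motor command is the responsibility-weighted blend of the inverse-model outputs. The function shapes and the likelihood width sigma are our assumptions.

```python
import numpy as np

def mosaic_step(x_prev, u_prev, x_now, x_target,
                forward_models, inverse_models, sigma=0.1):
    """One control step of a toy MOSAIC sketch (illustrative only).

    Each module pairs a forward model (predictor) with an inverse model
    (controller); responsibilities are a softmax over forward-model
    prediction errors and weight the controllers' outputs.
    """
    # How well did each forward model predict the observed state?
    predictions = np.array([f(x_prev, u_prev) for f in forward_models])
    errors = (predictions - x_now) ** 2

    # Responsibility signal: soft competition between modules.
    likelihood = np.exp(-errors / (2 * sigma ** 2))
    responsibility = likelihood / likelihood.sum()

    # Blend the inverse models' motor commands by responsibility.
    commands = np.array([g(x_now, x_target) for g in inverse_models])
    return float(responsibility @ commands), responsibility

# Hypothetical modules tuned to a light and a heavy object.
fms = [lambda x, u: x + 0.5 * u, lambda x, u: x + 0.1 * u]
ims = [lambda x, xt: 2.0 * (xt - x), lambda x, xt: 10.0 * (xt - x)]
u, r = mosaic_step(0.0, 1.0, 0.1, 1.0, fms, ims)
print(u, r)  # module 1 (heavy object) wins the responsibility
```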


The Journal of Neuroscience | 2004

A neural correlate of reward-based behavioral learning in caudate nucleus: a functional magnetic resonance imaging study of a stochastic decision task

Masahiko Haruno; Tomoe Kuroda; Kenji Doya; Keisuke Toyama; Minoru Kimura; Kazuyuki Samejima; Hiroshi Imamizu; Mitsuo Kawato

Humans can acquire appropriate behaviors that maximize rewards on a trial-and-error basis. Recent electrophysiological and imaging studies have demonstrated that neural activity in the midbrain and ventral striatum encodes the error of reward prediction. However, it is yet to be examined whether the striatum is the main locus of reward-based behavioral learning. To address this, we conducted functional magnetic resonance imaging (fMRI) of a stochastic decision task involving monetary rewards, in which subjects had to learn behaviors involving different task difficulties that were controlled by probability. We performed a correlation analysis of the fMRI data using explanatory variables derived from the subjects' behaviors. We found that activity in the caudate nucleus was correlated with short-term reward and, furthermore, paralleled the magnitude of a subject's behavioral change during learning. In addition, we confirmed that this parallelism between learning and activity in the caudate nucleus is robustly maintained even when we vary task difficulty by controlling the probability. These findings suggest that the caudate nucleus is one of the main loci for reward-based behavioral learning.
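The correlation analysis can be illustrated with a toy version of one such explanatory variable. The sketch below runs on synthetic data and assumes an exponentially weighted estimate of recent reward as the "short-term reward" regressor; the paper's exact explanatory variables may differ.

```python
import numpy as np

def short_term_reward(rewards, alpha=0.3):
    """Exponentially weighted estimate of recent reward: one plausible
    'short-term reward' regressor (assumption, not the paper's exact
    variable)."""
    v = np.zeros(len(rewards))
    for t in range(1, len(rewards)):
        v[t] = v[t - 1] + alpha * (rewards[t - 1] - v[t - 1])
    return v

rng = np.random.default_rng(0)
rewards = rng.binomial(1, 0.7, size=200)           # stochastic task outcomes
regressor = short_term_reward(rewards)
voxel = 0.8 * regressor + rng.normal(0, 0.5, 200)  # synthetic BOLD time series
print(np.corrcoef(regressor, voxel)[0, 1])         # the correlation analysis
```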


International Congress Series | 2003

Hierarchical MOSAIC for movement generation

Masahiko Haruno; Daniel M. Wolpert; Mitsuo Kawato

Hierarchy plays a key role in human motor control and learning. We can generate a variety of structured motor sequences, such as writing or speech, and learn to combine elemental actions in novel orders. We previously proposed the Modular Selection and Identification for Control (MOSAIC) model to explain the remarkable ability animals show in motor learning, adaptation and behavioral switching. In this paper, we extend this to a hierarchical MOSAIC (HMOSAIC). Each layer of HMOSAIC consists of a MOSAIC, which is a set of paired control and predictive models. The higher-level MOSAIC receives two inputs: an abstract (symbolic) desired trajectory and the posterior probabilities of its subordinate level, which represent which modules are playing a crucial role in the lower level under the current behavioral situation. The higher control model generates, as a motor command, prior probabilities for the lower-level modules, and therefore prioritizes which lower-level modules should be selected. In contrast, the higher predictive model learns to estimate the posterior probability at the next time step. The outputs from controllers, as well as the learning of both predictors and controllers, are weighted by the precision of the prediction. We first show that this bidirectional architecture provides a general framework capable of hierarchical motor learning, that is, chunking of movement patterns. Then, we discuss the similarities between the HMOSAIC architecture and the closed cerebro–cerebellar loop circuits recently found by Middleton and Strick (Trends in Neurosciences 21 (1998) 367). In our view, modules in one layer are involved in similar functions and are assumed to be implemented by one of the cerebro–cerebellar loop circuits. These layers are then connected to each other by bidirectional information flows within the cerebral cortex.
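The bidirectional selection step at the heart of HMOSAIC can be sketched compactly. In the toy code below (our notation, not the authors' implementation), the higher level's motor command is a prior over lower-level modules, each lower module's forward-model error supplies a likelihood, and the normalized product is the posterior responsibility that both weights control and is passed back up.

```python
import numpy as np

def lower_level_posterior(prior, sq_errors, sigma=0.1):
    """Bidirectional selection step of a toy HMOSAIC sketch.

    prior: top-down module priors from the higher level's controller.
    sq_errors: squared prediction errors of the lower-level forward
    models. The normalized product (posterior responsibility) weights
    control at the lower level and is reported back to the higher level.
    """
    likelihood = np.exp(-np.asarray(sq_errors) / (2 * sigma ** 2))
    post = np.asarray(prior) * likelihood
    return post / post.sum()

prior = np.array([0.7, 0.2, 0.1])       # top-down: higher level favors module 0
sq_errors = np.array([0.5, 0.01, 0.4])  # bottom-up: module 1 predicts best
print(lower_level_posterior(prior, sq_errors))  # posterior selects module 1
```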


Journal of Neurophysiology | 2010

Motor memory and local minimization of error and effort, not global optimization, determine motor behavior.

Gowrishankar Ganesh; Masahiko Haruno; Mitsuo Kawato; Etienne Burdet

Many real-life tasks that require impedance control to minimize motion error are characterized by multiple solutions: the task can be performed either by co-contracting muscle groups, which requires a large effort, or, conversely, by relaxing the muscles. However, human motor optimization studies have focused on tasks that are always satisfied by increasing impedance and that are characterized by a single error-effort optimum. To investigate motor optimization in the presence of multiple solutions, and hence multiple optima, we introduce a novel paradigm that lets subjects repetitively (but inconspicuously) use different solutions, allowing us to observe how exploration of multiple solutions affects their motor behavior. The results show that the behavior is largely influenced by motor memory, with subjects tending to involuntarily repeat a recent suboptimal task-satisfying solution even after sufficient experience of the optimal solution. This suggests that the CNS does not optimize co-activation tasks globally but determines motor behavior in a tradeoff of motor memory, error, and effort minimization.
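The tradeoff probed here can be written as a composite cost. A hedged sketch (the weights w_e and w_u and all notation are ours, not the paper's): the claim is that the CNS descends J locally from a remembered command u_k rather than jumping to the global minimizer.

```latex
% Error-effort cost with a local (gradient) update from the remembered
% command u_k; weights and notation are illustrative assumptions.
\[
  J(u) \;=\; w_e \int \lVert e(t;u) \rVert^{2}\,dt
        \;+\; w_u \int \lVert u(t) \rVert^{2}\,dt ,
  \qquad
  u_{k+1} \;=\; u_k \;-\; \eta\,\nabla J(u_k)
\]
```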


Proceedings of the National Academy of Sciences of the United States of America | 2011

Dopamine neurons learn to encode the long-term value of multiple future rewards

Kazuki Enomoto; Naoyuki Matsumoto; Sadamu Nakai; Takemasa Satoh; Tatsuo K. Sato; Yasumasa Ueda; Hitoshi Inokawa; Masahiko Haruno; Minoru Kimura

Midbrain dopamine neurons signal reward value, their prediction error, and the salience of events. If they play a critical role in achieving specific distant goals, long-term future rewards should also be encoded as suggested in reinforcement learning theories. Here, we address this experimentally untested issue. We recorded 185 dopamine neurons in three monkeys that performed a multistep choice task in which they explored a reward target among alternatives and then exploited that knowledge to receive one or two additional rewards by choosing the same target in a set of subsequent trials. An analysis of anticipatory licking for reward water indicated that the monkeys did not anticipate an immediately expected reward in individual trials; rather, they anticipated the sum of immediate and multiple future rewards. In accordance with this behavioral observation, the dopamine responses to the start cues and reinforcer beeps reflected the expected values of the multiple future rewards and their errors, respectively. More specifically, when monkeys learned the multistep choice task over the course of several weeks, the responses of dopamine neurons encoded the sum of the immediate and expected multiple future rewards. The dopamine responses were quantitatively predicted by theoretical descriptions of the value function with time discounting in reinforcement learning. These findings demonstrate that dopamine neurons learn to encode the long-term value of multiple future rewards with distant rewards discounted.
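The "value function with time discounting" that the dopamine responses were tested against is the standard reinforcement-learning form; in our notation, with discount factor gamma:

```latex
% Long-term value of the immediate and multiple future rewards r, with
% per-step discount factor gamma (standard reinforcement-learning form).
\[
  V(t) \;=\; E\!\left[\sum_{k=0}^{\infty} \gamma^{k}\, r(t+k)\right],
  \qquad 0 < \gamma < 1
\]
```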


International Conference on Robotics and Automation | 2010

Biomimetic motor behavior for simultaneous adaptation of force, impedance and trajectory in interaction tasks

Gowrishankar Ganesh; Alin Albu-Schäffer; Masahiko Haruno; Mitsuo Kawato; Etienne Burdet

Interaction of a robot with dynamic environments requires continuous adaptation of force and impedance, which is generally not available in current robot systems. In contrast, humans learn novel task dynamics with appropriate force and impedance through the concurrent minimization of error and energy, and exhibit the ability to modify movement trajectory to comply with obstacles and minimize forces. This article develops a similar automatic motor behavior for a robot and reports experiments with a one-degree-of-freedom system. In a postural control task, the robot automatically adapts torque to counter a slow disturbance and shifts to increasing its stiffness when the disturbance increases in frequency. In the presence of rigid obstacles, it refrains from increasing force excessively and relaxes gradually to follow the obstacle, but comes back to the desired state when the obstacle is removed. A trajectory tracking task demonstrates that the robot is able to adapt to different loads during motion. On introduction of a new load, it increases its stiffness to adapt to the load quickly, and then relaxes once the adaptation is complete. Furthermore, in the presence of an obstacle, the robot adjusts its trajectory to go around it.
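The qualitative behavior described (countering a sustained load with feedforward torque, stiffening only while error persists, relaxing afterward) can emerge from concurrent error- and effort-driven adaptation. The following toy one-degree-of-freedom simulation is our illustrative sketch with made-up gains, not the paper's controller.

```python
def adapt(tau, K, e, alpha=0.5, beta=2.0, gamma=0.05):
    """One adaptation step for feedforward torque tau and stiffness K.

    Illustrative rule (gains are assumptions): both terms grow with
    tracking error e and decay toward zero effort, so the controller
    relaxes once the error vanishes.
    """
    tau += alpha * e - gamma * tau   # learn to counter the load
    K += beta * abs(e) - gamma * K   # stiffen only while erring
    return tau, max(K, 0.0)

tau, K = 0.0, 1.0
for step in range(300):
    load = 0.3 if step < 150 else 0.0  # constant load, then removed
    e = (load - tau) / (1.0 + K)       # residual postural error (toy statics)
    tau, K = adapt(tau, K, e)
print(round(tau, 3), round(K, 3))      # torque decays back; stiffness relaxes
```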


The Journal of Neuroscience | 2013

Reward Prediction Error Signal Enhanced by Striatum–Amygdala Interaction Explains the Acceleration of Probabilistic Reward Learning by Emotion

Noriya Watanabe; Masamichi Sakagami; Masahiko Haruno

Learning does not depend on rationality alone, because real-life learning cannot be isolated from emotional or social factors. It is therefore intriguing to determine how emotion changes learning, and to identify which neural substrates underlie this interaction. Here, we show that the task-independent presentation of an emotional face before a reward-predicting cue increases the speed of cue–reward association learning in human subjects compared with trials in which a neutral face is presented. This phenomenon was attributable to an increase in the learning rate, which regulates the impact of reward prediction errors. In parallel with these behavioral findings, functional magnetic resonance imaging demonstrated that presentation of an emotional face enhanced the reward prediction error (RPE) signal in the ventral striatum. In addition, we found a functional link between this enhanced RPE signal and increased activity in the amygdala following presentation of an emotional face. Thus, this study revealed an acceleration of cue–reward association learning by emotion and underscored the role of striatum–amygdala interactions in the modulation of reward prediction errors by emotion.
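The behavioral account, a learning rate that is larger on emotional-face trials so the same RPE drives a bigger update, reduces to a one-line change in a standard delta rule. The alpha values below are illustrative, not the paper's estimates.

```python
import numpy as np

def learn(rewards, emotional, alpha_neutral=0.2, alpha_emotional=0.4):
    """Cue-reward learning with an emotion-scaled learning rate
    (alpha values are illustrative assumptions)."""
    v, values = 0.0, []
    for r, emo in zip(rewards, emotional):
        alpha = alpha_emotional if emo else alpha_neutral
        rpe = r - v                # reward prediction error
        v += alpha * rpe           # larger alpha -> bigger update per RPE
        values.append(v)
    return np.array(values)

rng = np.random.default_rng(1)
rewards = rng.binomial(1, 0.8, 100)
fast = learn(rewards, np.ones(100, bool))   # emotional-face trials
slow = learn(rewards, np.zeros(100, bool))  # neutral-face trials
print(fast[4], slow[4])  # emotion condition acquires the association faster
```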


Journal of Cognitive Neuroscience | 2014

Activity in the nucleus accumbens and amygdala underlies individual differences in prosocial and individualistic economic choices

Masahiko Haruno; Minoru Kimura; Chris Frith

Much decision-making requires balancing benefits to the self with benefits to the group. There are marked individual differences in this balance, such that individualists tend to favor themselves whereas prosocials tend to favor the group. Understanding the mechanisms underlying this difference has important implications for society and its institutions. Using behavioral and fMRI data collected during performance of the ultimatum game, we show that individual differences in social preferences for resource allocation, so-called "social value orientation," are linked with activity in the nucleus accumbens and amygdala elicited by inequity, rather than with activity in the insula, ACC, and dorsolateral pFC. Importantly, the presence of cognitive load made prosocials behave more prosocially and individualists more individualistically, suggesting that social value orientation is driven more by intuition than by reflection. In parallel, activity in the nucleus accumbens and amygdala in response to inequity tracked this behavioral pattern of prosocials and individualists. In addition, we conducted an impunity game experiment with different participants, in which they could not punish unfair behavior, and found that the inequity-correlated activity seen in prosocials during the ultimatum game disappeared. This result suggests that the accumbens and amygdala activity of prosocials encodes "outcome-oriented emotion" designed to change situations (i.e., to achieve equity or punish). Together, our results suggest a pivotal contribution of the nucleus accumbens and amygdala to individual differences in sociality.


NeuroImage | 2008

Sparse linear regression for reconstructing muscle activity from human cortical fMRI.

Gowrishankar Ganesh; Etienne Burdet; Masahiko Haruno; Mitsuo Kawato

In humans, it is generally not possible to use invasive techniques to identify brain activity corresponding to the activity of individual muscles. Further, it is believed that the spatial resolution of non-invasive brain imaging modalities is not sufficient to isolate neural activity related to individual muscles. However, this study shows that it is possible to reconstruct muscle activity from functional magnetic resonance imaging (fMRI). We simultaneously recorded surface electromyography (EMG) from two antagonist muscles and motor cortical activity using fMRI, during an isometric task requiring both reciprocal activation and co-activation of the wrist muscles. Bayesian sparse regression was used to identify the parameters of a linear mapping from the fMRI activity in areas 4 (M1) and 6 (premotor cortex, SMA) to EMG, and to reconstruct muscle activity in an independent test data set. The mapping obtained by the sparse regression algorithm showed significantly better generalization than those obtained from algorithms commonly used in decoding, i.e., support vector machines and least-squares regression. The two voxel sets corresponding to the activity of the antagonist muscles were intermingled but disjoint. They were distributed over a wide area of premotor cortex and M1 and were not limited to regions generally associated with wrist control. These results show that brain activity measured by fMRI in humans can be used to predict individual muscle activity through Bayesian linear models, and that our algorithm provides a novel, non-invasive tool to investigate the brain mechanisms involved in motor control and learning in humans.
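The decoding pipeline can be sketched on synthetic data. Automatic relevance determination (ARD) regression, as below, is one standard sparse Bayesian linear model; the authors' algorithm may differ in detail, and the data here are simulated, with only a few "voxels" truly driving the EMG.

```python
import numpy as np
from sklearn.linear_model import ARDRegression

# Synthetic stand-in for the fMRI-to-EMG decoding problem.
rng = np.random.default_rng(0)
n_scans, n_voxels = 240, 120
X = rng.normal(size=(n_scans, n_voxels))        # voxel time series
w_true = np.zeros(n_voxels)
w_true[:8] = rng.normal(size=8)                 # sparse true mapping
emg = X @ w_true + rng.normal(0, 0.5, n_scans)  # surface EMG of one muscle

# Fit a sparse Bayesian linear map, then test generalization on held-out scans.
model = ARDRegression().fit(X[:160], emg[:160])
r = np.corrcoef(model.predict(X[160:]), emg[160:])[0, 1]
print(f"held-out correlation: {r:.2f}")
```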


The Journal of Neuroscience | 2009

Activity in the Superior Temporal Sulcus Highlights Learning Competence in an Interaction Game

Masahiko Haruno; Mitsuo Kawato

During behavioral adaptation through interaction with human and nonhuman agents, marked individual differences are seen in both real-life situations and games. However, the underlying neural mechanism is not well understood. We conducted a neuroimaging experiment in which subjects maximized monetary rewards by learning in a prisoner's dilemma game with two computer agents: agent A, a tit-for-tat player who repeats the subject's previous action, and agent B, a simple stochastic cooperator oblivious to the subject's action. Approximately 1/3 of the subjects (group I) learned optimally in relation to both A and B, while another 1/3 (group II) did so only for B. Postexperiment interviews indicated that group I exploited the agent strategies more often than group II. Significant differences in learning-related brain activity between the two groups were found only in the superior temporal sulcus (STS), for both A and B. Furthermore, the learning performance of each group I subject was predictable based on this STS activity, but not that of the group II subjects. This differential activity could not be attributed to a behavioral difference, since it persisted in relation to agent B, for which the two groups behaved similarly. In sharp contrast, the brain structures for reward processing were recruited similarly by both groups. These results suggest that the STS provides knowledge of other agents' strategies for associating action with reward, and highlights learning competence during interactive reinforcement learning.
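The two computer agents and the learning problem they pose can be made concrete with a toy simulation. The sketch below (payoffs and parameters are illustrative assumptions) pits a myopic action-value learner against agent A (tit-for-tat) and agent B (a fixed stochastic cooperator).

```python
import numpy as np

def play(agent, n_trials=500, alpha=0.1, beta=3.0, seed=0):
    """Myopic Q-learning in an iterated prisoner's dilemma.

    'tit_for_tat' repeats the subject's previous action (agent A);
    'stochastic' cooperates with a fixed probability regardless of the
    subject (agent B). Payoffs follow a standard PD matrix (assumption).
    """
    rng = np.random.default_rng(seed)
    payoff = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
    q = {"C": 0.0, "D": 0.0}
    prev, total = "C", 0
    for _ in range(n_trials):
        other = prev if agent == "tit_for_tat" else ("C" if rng.random() < 0.7 else "D")
        p_c = 1.0 / (1.0 + np.exp(-beta * (q["C"] - q["D"])))  # softmax choice
        act = "C" if rng.random() < p_c else "D"
        r = payoff[(act, other)]
        q[act] += alpha * (r - q[act])   # reward-based value update
        prev, total = act, total + r
    return total

print(play("tit_for_tat"), play("stochastic"))
```

In this toy run the myopic learner converges on defection against both agents. Against agent B that is in fact the reward-maximizing choice, but against agent A it earns far less than sustained cooperation would, illustrating why optimal play against tit-for-tat requires knowledge of the other agent's strategy rather than reward feedback alone.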

Collaboration


Dive into Masahiko Haruno's collaborations.

Top Co-Authors
Chris Frith

Wellcome Trust Centre for Neuroimaging


Kenji Doya

Okinawa Institute of Science and Technology


Hiroyuki Nakahara

RIKEN Brain Science Institute
