Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Yasumasa Ueda is active.

Publication


Featured research published by Yasumasa Ueda.


Science | 2005

Representation of Action-Specific Reward Values in the Striatum

Kazuyuki Samejima; Yasumasa Ueda; Kenji Doya; Minoru Kimura

The estimation of the reward an action will yield is critical in decision-making. To elucidate the role of the basal ganglia in this process, we recorded striatal neurons of monkeys who chose between left and right handle turns, based on the estimated reward probabilities of the actions. During a delay period before the choices, the activity of more than one-third of striatal projection neurons was selective to the values of one of the two actions. Fewer neurons were tuned to relative values or action choice. These results suggest representation of action values in the striatum, which can guide action selection in the basal ganglia circuit.
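The action-value coding described above maps onto the standard reinforcement-learning idea of keeping a separate value estimate per action and updating it from the reward prediction error. A minimal sketch of that update (the learning rate and reward values are illustrative, not parameters from the study):

```python
# Minimal action-value update for a two-action choice task: each action
# (left/right handle turn) keeps its own value estimate, updated from the
# reward prediction error after each trial's outcome.
# alpha is an illustrative learning rate, not taken from the paper.

def update_action_value(q, action, reward, alpha=0.1):
    """Return a new dict of action values after one trial's outcome."""
    q = dict(q)
    q[action] += alpha * (reward - q[action])  # prediction-error update
    return q

q = {"left": 0.0, "right": 0.0}
q = update_action_value(q, "left", reward=1.0)  # left rewarded
q = update_action_value(q, "left", reward=0.0)  # left unrewarded
```

Under this scheme the value of each action tracks its recent reward probability, which is the quantity the delay-period striatal activity was found to be selective for.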


Proceedings of the National Academy of Sciences of the United States of America | 2011

Dopamine neurons learn to encode the long-term value of multiple future rewards

Kazuki Enomoto; Naoyuki Matsumoto; Sadamu Nakai; Takemasa Satoh; Tatsuo K. Sato; Yasumasa Ueda; Hitoshi Inokawa; Masahiko Haruno; Minoru Kimura

Midbrain dopamine neurons signal reward value, their prediction error, and the salience of events. If they play a critical role in achieving specific distant goals, long-term future rewards should also be encoded as suggested in reinforcement learning theories. Here, we address this experimentally untested issue. We recorded 185 dopamine neurons in three monkeys that performed a multistep choice task in which they explored a reward target among alternatives and then exploited that knowledge to receive one or two additional rewards by choosing the same target in a set of subsequent trials. An analysis of anticipatory licking for reward water indicated that the monkeys did not anticipate an immediately expected reward in individual trials; rather, they anticipated the sum of immediate and multiple future rewards. In accordance with this behavioral observation, the dopamine responses to the start cues and reinforcer beeps reflected the expected values of the multiple future rewards and their errors, respectively. More specifically, when monkeys learned the multistep choice task over the course of several weeks, the responses of dopamine neurons encoded the sum of the immediate and expected multiple future rewards. The dopamine responses were quantitatively predicted by theoretical descriptions of the value function with time discounting in reinforcement learning. These findings demonstrate that dopamine neurons learn to encode the long-term value of multiple future rewards with distant rewards discounted.
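The "value function with time discounting" referenced above is the standard reinforcement-learning quantity V = Σ γ^k r_k, the sum of future rewards discounted by how many steps away they are. A small illustrative computation (the discount factor γ is an arbitrary choice here, not the fitted value from the study):

```python
# Long-term value as the discounted sum of multiple future rewards,
# V = sum_k gamma**k * r_k, as in standard reinforcement learning.
# gamma below is illustrative; the paper fits discounting to neural data.

def discounted_value(future_rewards, gamma=0.7):
    return sum((gamma ** k) * r for k, r in enumerate(future_rewards))

# Two equal upcoming rewards: the more distant one contributes less.
v = discounted_value([1.0, 1.0])  # 1.0 + 0.7 = 1.7
```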


Experimental Brain Research | 2011

Inactivation of the putamen selectively impairs reward history-based action selection

Manabu Muranishi; Hitoshi Inokawa; Hiroshi Yamada; Yasumasa Ueda; Naoyuki Matsumoto; Masanori Nakagawa; Minoru Kimura

Behavioral decisions and actions are directed to achieve specific goals and to obtain rewards and escape punishments. Previous studies involving the recording of neuronal activity suggest the involvement of the cerebral cortex, basal ganglia, and midbrain dopamine system in these processes. The value signal of the action options is represented in the striatum, updated by reward prediction errors, and used for selecting higher-value actions. However, it remains unclear whether dysfunction of the striatum leads to impairment of value-based action selection. The present study examined the effect of inactivation of the putamen via local injection of the GABAA receptor agonist muscimol in monkeys engaged in a manual reward-based multi-step choice task. The monkeys first searched for a reward target among three alternatives, based on the previous one or two choices and their outcomes, and obtained a large reward; they then earned an additional reward by choosing the last rewarded target. Inactivation of the putamen impaired the ability of monkeys to make optimal choices during the third trial, in which they were required to choose a target different from those selected in the two previous trials by updating the values of the three options. The monkeys normally changed options if the last choice resulted in a small reward (lose-shift) and stayed with the last choice if it resulted in a large reward (win-stay). Task start time and movement time during individual trials became longer after putamen inactivation. However, the monkeys could still control their motivation level depending on the reward value of individual trial types both before and after putamen inactivation. These results support the view that the putamen is involved selectively and critically in neuronal circuits for reward history-based action selection.
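The win-stay/lose-shift pattern described above can be written as a simple rule on the previous trial's outcome. A sketch, with the target labels and the "untried" bookkeeping introduced purely for illustration:

```python
# Win-stay / lose-shift over three targets: repeat the last choice after a
# large reward, otherwise shift to a target not yet tried in this search.
# Target names and the `untried` list are hypothetical bookkeeping for
# illustration, not structures from the study.

def next_choice(last_choice, last_reward_large, untried):
    """Pick the next target given the previous trial's outcome."""
    if last_reward_large:
        return last_choice   # win-stay: repeat the rewarded choice
    return untried[0]        # lose-shift: try an unexplored option

choice = next_choice("A", last_reward_large=False, untried=["B", "C"])
```

In the paper's terms, putamen inactivation spared this simple one-back rule but impaired third-trial choices, which require tracking the values of all three options across the two preceding outcomes.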


European Journal of Neuroscience | 2011

Neuronal basis for evaluating selected action in the primate striatum

Hiroshi Yamada; Hitoshi Inokawa; Naoyuki Matsumoto; Yasumasa Ueda; Minoru Kimura

Humans and animals optimize their behavior by evaluating outcomes of individual actions and predicting how much reward the actions will yield. While the estimated values of actions guide choice behavior, the choices are also governed by other behavioral norms, such as rules and strategies. Values, rules and strategies are represented in neuronal activity, and the striatum is one of the best qualified brain loci where these signals meet. To understand the role of the striatum in value- and strategy-based decision-making, we recorded striatal neurons in macaque monkeys performing a behavioral task in which they searched for a reward target by trial-and-error among three alternatives, earned a reward for a target choice, and then earned additional rewards for choosing the same target. This task allowed us to examine whether and how values of targets and strategy, defined as negative-then-search and positive-then-repeat (or win-stay-lose-switch), are represented in the striatum. Large subsets of striatal neurons encoded positive and negative outcome feedback of individual decisions and actions. Once monkeys made a choice, signals related to chosen actions, their values, and search- or repeat-type actions increased and persisted until the outcome feedback appeared. Subsets of neurons exhibited a tonic increase in activity after search- and repeat-choices following negative and positive feedback in the preceding trials, consistent with the task strategy the monkeys adopted. These activity profiles, as a heterogeneous representation of decision variables, may underlie part of the process for reinforcement- and strategy-based evaluation of selected actions in the striatum.


Journal of Neurophysiology | 2013

Coding of the long-term value of multiple future rewards in the primate striatum

Hiroshi Yamada; Hitoshi Inokawa; Naoyuki Matsumoto; Yasumasa Ueda; Kazuki Enomoto; Minoru Kimura

Decisions maximizing benefits involve a tradeoff between the quantity of a reward and the cost of elapsed time until an animal receives it. The estimation of long-term reward values is critical to attain the most desirable outcomes over a certain period of time. Reinforcement learning theories have established algorithms to estimate the long-term reward values of multiple future rewards in which the values of future rewards are discounted as a function of how many steps of choices are necessary to achieve them. Here, we report that presumed striatal projection neurons represent the long-term values of multiple future rewards estimated by a standard reinforcement learning model while monkeys are engaged in a series of trial-and-error choices and adaptive decisions for multiple rewards. We found that the magnitude of activity of a subset of neurons was positively correlated with the long-term reward values, and that of another subset of neurons was negatively correlated throughout the entire decision-making process in individual trials: from the start of the task trial, estimation of the values and their comparison among alternatives, choice execution, and evaluation of the received rewards. An idiosyncratic finding was that neurons showing negative correlations represented reward values in the near future (high discounting), while neurons showing positive correlations represented reward values not only in the near future, but also in the far future (low discounting). These findings provide a new insight that long-term value signals are embedded in two subsets of striatal neurons as high and low discounting of multiple future rewards.
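The high- versus low-discounting distinction above comes down to how steeply the discount factor γ shrinks the contribution of distant rewards. A small comparison, with γ values chosen arbitrarily for illustration:

```python
# Compare long-term values under steep (high) vs shallow (low) discounting.
# With high discounting only near-future rewards retain value; with low
# discounting, distant rewards still contribute substantially.
# The gamma values are illustrative, not fitted parameters from the study.

def discounted_value(rewards, gamma):
    return sum((gamma ** k) * r for k, r in enumerate(rewards))

rewards = [0.0, 0.0, 1.0]  # a single reward two steps in the future
v_high_discount = discounted_value(rewards, gamma=0.3)  # 0.3**2 = 0.09
v_low_discount = discounted_value(rewards, gamma=0.9)   # 0.9**2 = 0.81
```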


Neuroscience Research | 2007

Action value in the striatum and reinforcement-learning model of cortico-basal ganglia network

Kazuyuki Samejima; Yasumasa Ueda; Kenji Doya; Minoru Kimura

Delay eyeblink conditioning is one of the most extensively characterized paradigms for motor learning and memory. Cumulative evidence indicates an essential role for the cerebellum in this paradigm. The site and mechanisms underlying memory formation, however, are under debate. The anterior interpositus nucleus (AIN) is a proposed site for the association memory. To unravel the mechanisms underlying memory formation, we systematically examined transcripts that change during conditioning in mice. There were two groups of genes with distinct spatial and temporal characteristics. Representative EARLY gene expression peaked on 1 d of training in broad cerebellar areas in both the paired and unpaired conditioning groups. Representative LATE gene expression was selectively increased in the AIN of the 7-d paired group, but not the unpaired group. These data fit well with the two-stage learning theory, which proposes emotional and motor learning phases, and support the role for the AIN as a memory site. Using EARLY genes as molecular probes, we are now investigating the role of emotional learning in eyeblink conditioning.


Frontiers in Neuroanatomy | 2017

Distinct Functions of the Primate Putamen Direct and Indirect Pathways in Adaptive Outcome-Based Action Selection

Yasumasa Ueda; Ko Yamanaka; Atsushi Noritake; Kazuki Enomoto; Naoyuki Matsumoto; Hiroshi Yamada; Kazuyuki Samejima; Hitoshi Inokawa; Yukiko Hori; Kae Nakamura; Minoru Kimura

Cortico-basal ganglia circuits are critical regulators of reward-based decision making. Reinforcement learning models posit that action reward value is encoded by the firing activity of striatal medium spiny neurons (MSNs) and updated upon changing reinforcement contingencies by dopamine (DA) signaling to these neurons. However, it remains unclear how the anatomically distinct direct and indirect pathways through the basal ganglia are involved in updating action reward value under changing contingencies. MSNs of the direct pathway predominantly express DA D1 receptors and those of the indirect pathway predominantly D2 receptors, so we tested for distinct functions in behavioral adaptation by injecting D1 and D2 receptor antagonists into the putamen of two macaque monkeys performing a free choice task for probabilistic reward. In this task, monkeys turned a handle toward either a left or right target depending on an asymmetrically assigned probability of large reward. Reward probabilities of left and right targets changed after 30–150 trials, so the monkeys were required to learn the higher-value target choice based on action–outcome history. In the control condition, the monkeys showed stable selection of the higher-value target (the one more likely to yield a large reward) and kept choosing the higher-value target regardless of less frequent small reward outcomes. The monkeys also made flexible changes of selection away from the high-value target when two or three small reward outcomes occurred randomly in succession. DA D1 antagonist injection significantly increased the probability of the monkey switching to the alternate target in response to successive small reward outcomes. Conversely, D2 antagonist injection significantly decreased the switching probability.
These results suggest distinct functions of D1 and D2 receptor-mediated signaling processes in action selection based on action–outcome history, with D1 receptor-mediated signaling promoting the stable choice of higher-value targets and D2 receptor-mediated signaling promoting a switch in action away from small reward outcomes. Therefore, direct and indirect pathways appear to have complementary functions in maintaining optimal goal-directed action selection and updating action value, which are dependent on D1 and D2 DA receptor signaling.
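Value-guided choice with occasional switching, as in the free choice task above, is commonly modeled with a softmax rule over the two action values, where the inverse temperature controls how strongly the higher-value target is preferred. A sketch with illustrative numbers (none of the values or parameters are fitted quantities from the study):

```python
import math

# Softmax choice between left and right action values. A higher inverse
# temperature (beta) makes choices stick to the higher-value target;
# a lower beta makes switching after small-reward outcomes more likely,
# loosely analogous to the D1/D2 antagonist effects described above.
# All values and beta settings here are illustrative.

def p_choose_left(q_left, q_right, beta=3.0):
    """Probability of choosing the left target under a softmax rule."""
    return 1.0 / (1.0 + math.exp(-beta * (q_left - q_right)))

p_high_beta = p_choose_left(0.9, 0.5, beta=5.0)  # sticky, near-deterministic
p_low_beta = p_choose_left(0.9, 0.5, beta=1.0)   # more exploratory
```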


Neuroscience Research | 2010

Roles of cholinergic signal transmission in the striatum in adaptive shift of task strategy

Hitoshi Inokawa; Hiroshi Yamada; Naoyuki Matsumoto; Yasumasa Ueda; Minoru Kimura

Reward-induced burst firing of dopaminergic neurons has mainly been studied in the primate midbrain. Voltammetry enables high-speed detection of actual dopamine release in the projection area, but to date, it has been recorded only in rodents. In this study, novel diamond microelectrodes were applied for high-speed dopamine detection in behaving primate brains. Dopamine was detected with constant-voltage amperometry, holding microelectrodes with the surface of boron-doped diamond (BDD) at +600 mV against Ag/AgCl reference electrodes. During Pavlovian cue-reward trials, a sharp response to a reward cue was detected in the caudate of Japanese monkeys. This method allows the measurement of actual dopamine release in specific target areas of the brain, which will expand the knowledge of dopamine neurotransmission obtained by unit recordings.


Neuroscience Research | 2009

Neuronal activity of CM thalamus during reward-bias task of monkey

Ko Yamanaka; Yukiko Hori; Yasumasa Ueda; Minoru Kimura

Maturation of motor function is essential in postnatal development of mammals. Execution of voluntary movement relies on the basal ganglia neuronal circuitry, especially direct and indirect pathways from the striatal matrix compartment. However, maturation of two antagonistic pathways is poorly understood. We hence visualized single striatofugal neurons of the matrix compartment in postnatal developing rats with membrane-targeted GFP by intrastriatal injection of a recombinant Sindbis virus. At postnatal day (P) 4, both direct and indirect pathway neurons had short dendrites and scarcely branched axons. During P8-12, axon collaterals around somata and in target nuclei became obvious and dendrites elongated. At P16, dendritic and axonal arbors seemed almost mature, except that dendritic spines were rare. During P24–P32 dendrites were covered with dense spines. These findings together lay out a timetable for basal ganglia circuit maturation, and suggest concurrent development of the direct and indirect pathways.


Archive | 2002

Involvement of the Basal Ganglia and Dopamine System in Learning and Execution of Goal-Directed Behavior

Minoru Kimura; Naoyuki Matsumoto; Yasumasa Ueda; Takemasa Satoh; Takafumi Minamimoto; Hiroshi Yamada

To perform any task of our own volition, we execute multiple movements in a specific order based on the likelihood of obtaining a successful outcome. Neurons in the supplementary motor area (SMA), pre-SMA and those in the basal ganglia encode the temporal order, or the sequence of movements used in the tasks (Mushiake and Strick, 1995; Kermadi and Joseph, 1995; Nakamura et al., 1998; Shima and Tanji, 1998, 2000). These neurons must play a crucial role in the mechanisms of planning and execution of temporally organized multiple movements, or actions.

Collaboration


Dive into Yasumasa Ueda's collaborations.

Top Co-Authors

Naoyuki Matsumoto
Kyoto Prefectural University of Medicine

Hitoshi Inokawa
Kyoto Prefectural University of Medicine

Kenji Doya
Okinawa Institute of Science and Technology

Kazuki Enomoto
Kyoto Prefectural University of Medicine

Takafumi Minamimoto
National Institute of Radiological Sciences