
Publication


Featured research published by Kazuyuki Samejima.


Science | 2005

Representation of Action-Specific Reward Values in the Striatum

Kazuyuki Samejima; Yasumasa Ueda; Kenji Doya; Minoru Kimura

The estimation of the reward an action will yield is critical in decision-making. To elucidate the role of the basal ganglia in this process, we recorded striatal neurons of monkeys who chose between left and right handle turns, based on the estimated reward probabilities of the actions. During a delay period before the choices, the activity of more than one-third of striatal projection neurons was selective to the values of one of the two actions. Fewer neurons were tuned to relative values or action choice. These results suggest representation of action values in the striatum, which can guide action selection in the basal ganglia circuit.
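As an illustrative sketch (not the paper's analysis), action values of the kind described above can be learned with a simple delta rule over the two actions; the learning rate, exploration rate, and reward probabilities below are assumed for illustration:

```python
import random

# Each action (left/right handle turn) has one value; the chosen
# action's value is moved toward the observed reward (delta rule).
ALPHA = 0.2                                  # assumed learning rate
EPSILON = 0.1                                # assumed exploration rate
REWARD_PROB = {"left": 0.9, "right": 0.5}    # hypothetical reward probabilities

def update(q, action, reward, alpha=ALPHA):
    """Move the chosen action's value toward the observed reward."""
    q[action] += alpha * (reward - q[action])

q = {"left": 0.0, "right": 0.0}
random.seed(0)
for _ in range(1000):
    if random.random() < EPSILON:
        action = random.choice(["left", "right"])
    else:
        action = max(q, key=q.get)           # greedy choice
    reward = 1.0 if random.random() < REWARD_PROB[action] else 0.0
    update(q, action, reward)

# Each learned value approximates its action's reward probability.
print(q)
```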


Neural Computation | 2002

Multiple model-based reinforcement learning

Kenji Doya; Kazuyuki Samejima; Ken-ichi Katagiri; Mitsuo Kawato

We propose a modular reinforcement learning architecture for nonlinear, nonstationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic idea is to decompose a complex task into multiple domains in space and time based on the predictability of the environmental dynamics. The system is composed of multiple modules, each of which consists of a state prediction model and a reinforcement learning controller. The responsibility signal, which is given by the softmax function of the prediction errors, is used to weight the outputs of multiple modules, as well as to gate the learning of the prediction models and the reinforcement learning controllers. We formulate MMRL for both discrete-time, finite-state case and continuous-time, continuous-state case. The performance of MMRL was demonstrated for discrete case in a nonstationary hunting task in a grid world and for continuous case in a nonlinear, nonstationary control task of swinging up a pendulum with variable physical parameters.
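The responsibility signal described above can be sketched as a softmax over module prediction errors; the Gaussian scaling of the errors and the sigma value below are assumptions for illustration, not the paper's exact formulation:

```python
import math

# Each module's weight is a softmax over its (negative squared)
# prediction error, so the module that best predicts the current
# dynamics dominates the output and receives most of the learning.
SIGMA = 0.5  # assumed scale parameter controlling softmax sharpness

def responsibilities(prediction_errors, sigma=SIGMA):
    scores = [math.exp(-e ** 2 / (2 * sigma ** 2)) for e in prediction_errors]
    total = sum(scores)
    return [s / total for s in scores]

# Module 0 predicts well (small error), module 1 predicts poorly.
lam = responsibilities([0.1, 1.0])
print(lam)  # module 0 receives most of the weight
```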


The Journal of Neuroscience | 2004

A neural correlate of reward-based behavioral learning in caudate nucleus: A functional magnetic resonance imaging study of a stochastic decision task

Masahiko Haruno; Tomoe Kuroda; Kenji Doya; Keisuke Toyama; Minoru Kimura; Kazuyuki Samejima; Hiroshi Imamizu; Mitsuo Kawato

Humans can acquire appropriate behaviors that maximize rewards on a trial-and-error basis. Recent electrophysiological and imaging studies have demonstrated that neural activity in the midbrain and ventral striatum encodes the error of reward prediction. However, it is yet to be examined whether the striatum is the main locus of reward-based behavioral learning. To address this, we conducted functional magnetic resonance imaging (fMRI) of a stochastic decision task involving monetary rewards, in which subjects had to learn behaviors involving different task difficulties that were controlled by probability. We performed a correlation analysis of fMRI data by using the explanatory variables derived from subject behaviors. We found that activity in the caudate nucleus was correlated with short-term reward and, furthermore, paralleled the magnitude of a subject's behavioral change during learning. In addition, we confirmed that this parallelism between learning and activity in the caudate nucleus is robustly maintained even when we vary task difficulty by controlling the probability. These findings suggest that the caudate nucleus is one of the main loci for reward-based behavioral learning.


Proceedings of the National Academy of Sciences of the United States of America | 2010

Neural correlates of cognitive dissonance and choice-induced preference change

Keise Izuma; Madoka Matsumoto; Kou Murayama; Kazuyuki Samejima; Norihiro Sadato; Kenji Matsumoto

According to many modern economic theories, actions simply reflect an individual's preferences, whereas a psychological phenomenon called “cognitive dissonance” claims that actions can also create preference. Cognitive dissonance theory states that after making a difficult choice between two equally preferred items, the act of rejecting a favorite item induces an uncomfortable feeling (cognitive dissonance), which in turn motivates individuals to change their preferences to match their prior decision (i.e., reducing preference for rejected items). Recently, however, Chen and Risen [Chen K, Risen J (2010) J Pers Soc Psychol 99:573–594] pointed out a serious methodological problem, which casts doubt on the very existence of this choice-induced preference change as studied over the past 50 y. Here, using a proper control condition and two measures of preferences (self-report and brain activity), we found that the mere act of making a choice can change self-reported preference as well as its neural representation (i.e., striatum activity), thus providing strong evidence for choice-induced preference change. Furthermore, our data indicate that the anterior cingulate cortex and dorsolateral prefrontal cortex tracked the degree of cognitive dissonance on a trial-by-trial basis. Our findings provide important insights into the neural basis of how actions can alter an individual's preferences.


Annals of the New York Academy of Sciences | 2007

Multiple Representations of Belief States and Action Values in Corticobasal Ganglia Loops

Kazuyuki Samejima; Kenji Doya

Reward-related neural activities have been found in a variety of cortical and subcortical areas by neurophysiological and neuroimaging experiments. Here we present a unified view on how three subloops of the corticobasal ganglia network are involved in reward prediction and action selection using different types of information. The motor/premotor-posterior striatum loop is specialized for action-based value representation and movement selection. The orbitofrontal-ventral striatum loop is specialized for object-based value representation and target selection. The lateral prefrontal-anterior striatum loop is specialized for context-based value representation and context estimation. Furthermore, the medial prefrontal cortex (MPFC) coordinates these multiple value representations and actions at different levels of the hierarchy by monitoring the error in predictions.


Current Opinion in Neurobiology | 2007

Efficient reinforcement learning: computational theories, neuroscience and robotics

Mitsuo Kawato; Kazuyuki Samejima

Reinforcement learning algorithms have provided some of the most influential computational theories for behavioral learning that depends on reward and penalty. After briefly reviewing supporting experimental data, this paper tackles three difficult theoretical issues that remain to be explored. First, plain reinforcement learning is much too slow to be considered a plausible brain model. Second, although the temporal-difference error has an important role both in theory and in experiments, how to compute it remains an enigma. Third, the function of all brain areas, including the cerebral cortex, cerebellum, brainstem and basal ganglia, seems to necessitate a new computational framework. Computational studies that emphasize meta-parameters, hierarchy, modularity and supervised learning to resolve these issues are reviewed here, together with the related experimental data.
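The temporal-difference error discussed above has a standard textbook form, sketched here with an assumed discount factor and learning rate:

```python
# TD error: delta = r + gamma * V(s') - V(s), the quantity whose
# neural computation the review identifies as an open question.
GAMMA = 0.9  # assumed discount factor

def td_error(reward, v_next, v_current, gamma=GAMMA):
    return reward + gamma * v_next - v_current

V = {"s0": 0.0, "s1": 0.0}  # value estimates for two states
alpha = 0.5                 # assumed learning rate

# One transition s0 -> s1 yielding reward 1 updates V(s0).
delta = td_error(1.0, V["s1"], V["s0"])
V["s0"] += alpha * delta
print(delta, V["s0"])  # prints: 1.0 0.5
```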


Neural Networks | 1999

Adaptive internal state space construction method for reinforcement learning of a real-world agent

Kazuyuki Samejima; Takashi Omori

One of the difficulties encountered in applying reinforcement learning to real-world problems is the construction of a discrete state space from a continuous sensory input signal. In the absence of a priori knowledge about the task, a straightforward approach to this problem is to discretize the input space into a grid and to use a lookup table. However, this method suffers from the curse of dimensionality. Some studies use continuous function approximators such as neural networks instead of lookup tables. However, when global basis functions such as sigmoid functions are used, convergence cannot be guaranteed. To overcome this problem, we propose a method in which local basis functions are incrementally assigned depending on the task requirement. Initially, only one basis function is allocated over the entire space. The basis function is divided according to the statistical properties of the locally weighted temporal difference error (TD error) of the value function. We applied this method to an autonomous robot collision avoidance problem and evaluated the validity of the algorithm in simulation. The proposed algorithm, which we call the adaptive basis division (ABD) algorithm, achieved the task using fewer basis functions than conventional methods. Moreover, we applied the method to a goal-directed navigation problem of a real mobile robot. The action strategy was learned using a database of sensor data, and it was then used for navigation of a real machine. The robot reached the goal using fewer internal states than with conventional methods.
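A highly simplified 1-D sketch of the division idea follows; the paper's criterion uses locally weighted TD-error statistics, whereas the plain variance threshold below is an assumed stand-in for illustration:

```python
import statistics

# Start with one region covering the whole input space; split a region
# when the TD errors observed inside it vary too much, refining the
# state representation only where the task demands it.
THRESHOLD = 0.1  # assumed variance threshold for splitting

def maybe_split(regions, errors_by_region):
    new_regions = []
    for (lo, hi) in regions:
        errs = errors_by_region.get((lo, hi), [])
        if len(errs) > 1 and statistics.variance(errs) > THRESHOLD:
            mid = (lo + hi) / 2
            new_regions += [(lo, mid), (mid, hi)]  # divide this region
        else:
            new_regions.append((lo, hi))           # keep it as is
    return new_regions

regions = [(0.0, 1.0)]
regions = maybe_split(regions, {(0.0, 1.0): [0.0, 1.0, -1.0]})
print(regions)  # the single high-variance region is split in two
```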


Journal of Behavior Therapy and Experimental Psychiatry | 2012

Effects of depression on reward-based decision making and variability of action in probabilistic learning

Yoshihiko Kunisato; Yasumasa Okamoto; Kazutaka Ueda; Keiichi Onoda; Go Okada; Shinpei Yoshimura; Shinichi Suzuki; Kazuyuki Samejima; Shigeto Yamawaki

Background and objectives: Depression is characterized by low reward sensitivity in behavioral studies applying signal detection theory. We examined deficits in reward-based decision making in depressed participants during a probabilistic learning task, and used a reinforcement learning model to examine learning parameters during the task. Methods: Thirty-six nonclinical undergraduates completed a probabilistic selection task. Participants were divided into depressed and non-depressed groups based on Center for Epidemiologic Studies-Depression (CES-D) cut scores. We then fitted a reinforcement learning model to each participant's behavioral data. Results: Depressed participants showed a reward-based decision-making deficit and higher levels of the learning parameter τ, which modulates variability of action selection, compared to non-depressed participants. Highly variable action selection is more random and characterized by difficulty selecting a specific course of action. Conclusion: These results suggest that depression is characterized by deficits in reward-based decision making as well as high variability in action selection.
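The role of a temperature-like parameter such as τ can be sketched with a softmax choice rule; this is an illustration of the general mechanism, not the specific model fitted in the study:

```python
import math

# Softmax action selection: higher tau flattens the choice
# probabilities, making selection more variable (random), which is
# the behavioral signature attributed to depressed participants.
def softmax_policy(values, tau):
    exps = [math.exp(v / tau) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

values = [1.0, 0.0]                          # hypothetical action values
low_tau = softmax_policy(values, tau=0.2)    # nearly deterministic
high_tau = softmax_policy(values, tau=5.0)   # close to a coin flip
print(low_tau[0], high_tau[0])
```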


NeuroImage | 2012

Changing the structure of complex visuo-motor sequences selectively activates the fronto-parietal network.

Chandrasekhar V. S. Pammi; Krishna P. Miyapuram; Ahmed; Kazuyuki Samejima; Raju S. Bapi; Kenji Doya

Previous brain imaging studies investigating motor sequence complexity have mainly examined the effect of increasing the length of pre-learned sequences. The novel contribution of this research is that we varied the structure of complex visuo-motor sequences along two different dimensions using the m×n paradigm. The complexity of sequences is increased from 12 movements (organized as a 2×6 task) to 24 movements (organized as 4×6 and 2×12 tasks). Behavioral results indicate that although the success rate attained was similar across the two complex tasks (2×12 and 4×6), a greater decrease in response times was observed for the 2×12 compared to the 4×6 condition at an intermediate learning stage. This decrease is possibly related to successful chunking across sets in the 2×12 task. In line with this, we observed a selective activation of the fronto-parietal network. Shifts of activation were observed from the ventral to dorsal prefrontal, lateral to medial premotor, and inferior to superior parietal cortex from the early to intermediate learning stage, concomitant with an increase in hyperset length. We suggest that these selective activations and shifts in activity during complex sequence learning are possibly related to chunking of motor sequences.


Journal of Bioscience and Bioengineering | 2012

Improvement of neuronal cell adhesiveness on parylene with oxygen plasma treatment

Takayuki Hoshino; Itsuro Saito; Reo Kometani; Kazuyuki Samejima; Shinji Matsui; Takafumi Suzuki; Kunihiko Mabuchi; Yasuhiro X. Kato

We improved the adhesiveness of a neuron-like cell line, PC12, on a Parylene-C surface by O2 plasma treatment, which changes the surface from hydrophobic to hydrophilic. Neural cell adhesiveness on the plasma-treated Parylene-C was more than twenty times greater than on non-treated Parylene-C, and close to that on a conventional polystyrene tissue-culture dish.

Collaboration


Top co-authors of Kazuyuki Samejima:

Kenji Doya (Okinawa Institute of Science and Technology)
Jiro Okuda (Kyoto Sangyo University)
Mitsuo Kawato (Nara Institute of Science and Technology)
Yasumasa Ueda (Kyoto Prefectural University of Medicine)
Go Okada (Hiroshima University)