Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Samuel J. Gershman is active.

Publication


Featured research published by Samuel J. Gershman.


Neuron | 2011

Model-Based Influences on Humans' Choices and Striatal Prediction Errors

Nathaniel D. Daw; Samuel J. Gershman; Ben Seymour; Peter Dayan; R. J. Dolan

The mesostriatal dopamine system is prominently implicated in model-free reinforcement learning, with fMRI BOLD signals in ventral striatum notably covarying with model-free prediction errors. However, latent learning and devaluation studies show that behavior also shows hallmarks of model-based planning, and the interaction between model-based and model-free values, prediction errors and preferences is underexplored. We designed a multistep decision task in which model-based and model-free influences on human choice behavior could be distinguished. By showing that choices reflected both influences we could then test the purity of the ventral striatal BOLD signal as a model-free report. Contrary to expectations, the signal reflected both model-free and model-based predictions in proportions matching those that best explained choice behavior. These results challenge the notion of a separate model-free learner and suggest a more integrated computational architecture for high-level human decision-making.
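
The mixture of model-based and model-free value signals described above can be written down compactly. Below is a minimal sketch in Python, not the authors' analysis code; the transition matrix, learning rate, and mixing weight are illustrative assumptions.

```python
import numpy as np

# A minimal sketch (not the authors' analysis code) of a hybrid value signal on a
# two-stage task: model-based and model-free values are mixed with a weight w, and the
# prediction error is computed against the mixed value. All parameter values are illustrative.

n_first = 2                                  # first-stage actions
n_second = 2                                 # second-stage states
T = np.array([[0.7, 0.3],                    # assumed transition probabilities P(state2 | action1)
              [0.3, 0.7]])

q_mf = np.zeros(n_first)                     # cached (model-free) first-stage values
q_stage2 = np.zeros(n_second)                # learned second-stage values
alpha, w = 0.3, 0.5                          # learning rate and model-based weight (illustrative)

def hybrid_prediction_error(action, state2, reward):
    """Compute the mixed prediction error for one trial and update the cached values."""
    q_mb = T @ q_stage2                      # model-based values: expected second-stage value
    q_hybrid = w * q_mb + (1 - w) * q_mf     # weighted mixture of the two systems' values
    delta = reward - q_hybrid[action]        # prediction error against the mixed value
    q_stage2[state2] += alpha * (reward - q_stage2[state2])
    q_mf[action] += alpha * (reward - q_mf[action])
    return delta

# Toy trial: take first-stage action 0, reach second-stage state 1, receive reward 1.
print(hybrid_prediction_error(0, 1, 1.0))
```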


Psychological Science | 2013

The Curse of Planning: Dissecting Multiple Reinforcement-Learning Systems by Taxing the Central Executive

A. Ross Otto; Samuel J. Gershman; Arthur B. Markman; Nathaniel D. Daw

A number of accounts of human and animal behavior posit the operation of parallel and competing valuation systems in the control of choice behavior. In these accounts, a flexible but computationally expensive model-based reinforcement-learning system has been contrasted with a less flexible but more efficient model-free reinforcement-learning system. The factors governing which system controls behavior—and under what circumstances—are still unclear. Following the hypothesis that model-based reinforcement learning requires cognitive resources, we demonstrated that having human decision makers perform a demanding secondary task engenders increased reliance on a model-free reinforcement-learning strategy. Further, we showed that, across trials, people negotiate the trade-off between the two systems dynamically as a function of concurrent executive-function demands, and people’s choice latencies reflect the computational expenses of the strategy they employ. These results demonstrate that competition between multiple learning systems can be controlled on a trial-by-trial basis by modulating the availability of cognitive resources.


Psychological Review | 2010

Context, Learning, and Extinction.

Samuel J. Gershman; David M. Blei; Yael Niv

A. Redish et al. (2007) proposed a reinforcement learning model of context-dependent learning and extinction in conditioning experiments, using the idea of state classification to categorize new observations into states. In the current article, the authors propose an interpretation of this idea in terms of normative statistical inference. They focus on renewal and latent inhibition, 2 conditioning paradigms in which contextual manipulations have been studied extensively, and show that online Bayesian inference within a model that assumes an unbounded number of latent causes can characterize a diverse set of behavioral results from such manipulations, some of which pose problems for the model of Redish et al. Moreover, in both paradigms, context dependence is absent in younger animals, or if hippocampal lesions are made prior to training. The authors suggest an explanation in terms of a restricted capacity to infer new causes.
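
A one-step sketch of this "old cause vs. new cause" computation is given below. It is not the published model; the concentration parameter and likelihood values are illustrative assumptions, and only the core inference step is shown: a Chinese-restaurant-process prior over latent causes combined with the likelihood of the current observation under each cause.

```python
import numpy as np

# A single-observation sketch of latent-cause assignment (illustrative parameters, not the
# published model): the prior over latent causes follows a Chinese restaurant process and is
# combined with the likelihood of the current observation under each cause.

def latent_cause_posterior(counts, likelihoods, alpha=0.5):
    """Posterior over latent causes for a single observation.

    counts      -- observations previously assigned to each existing cause
    likelihoods -- likelihood of the current observation under each existing cause,
                   with the final entry giving the likelihood under a brand-new cause
    alpha       -- CRP concentration (higher values make new causes easier to infer)
    """
    prior = np.append(np.asarray(counts, dtype=float), alpha)   # CRP prior, unnormalized
    post = prior * np.asarray(likelihoods, dtype=float)
    return post / post.sum()

# Example: two existing causes (say, an acquisition context and an extinction context). A
# context shift lowers the likelihood under both old causes, so posterior probability shifts
# toward a new cause despite the prior favoring familiar ones.
print(latent_cause_posterior(counts=[10, 5], likelihoods=[0.01, 0.01, 0.3], alpha=0.5))
```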


Current Opinion in Neurobiology | 2010

Learning latent structure: carving nature at its joints

Samuel J. Gershman; Yael Niv

Reinforcement learning (RL) algorithms provide powerful explanations for simple learning and decision-making behaviors and the functions of their underlying neural substrates. Unfortunately, in real-world situations that involve many stimuli and actions, these algorithms learn pitifully slowly, exposing their inferiority in comparison to animal and human learning. Here we suggest that one reason for this discrepancy is that humans and animals take advantage of structure that is inherent in real-world tasks to simplify the learning problem. We survey an emerging literature on structure learning--using experience to infer the structure of a task--and how this can be of service to RL, with an emphasis on structure in perception and action.


Journal of Experimental Psychology: General | 2014

Retrospective Revaluation in Sequential Decision Making: A Tale of Two Systems

Samuel J. Gershman; Arthur B. Markman; A. Ross Otto

Recent computational theories of decision making in humans and animals have portrayed 2 systems locked in a battle for control of behavior. One system--variously termed model-free or habitual--favors actions that have previously led to reward, whereas a second--called the model-based or goal-directed system--favors actions that causally lead to reward according to the agent's internal model of the environment. Some evidence suggests that control can be shifted between these systems using neural or behavioral manipulations, but other evidence suggests that the systems are more intertwined than a competitive account would imply. In 4 behavioral experiments, using a retrospective revaluation design and a cognitive load manipulation, we show that human decisions are more consistent with a cooperative architecture in which the model-free system controls behavior, whereas the model-based system trains the model-free system by replaying and simulating experience.
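
A DYNA-style architecture is one standard way to implement the idea that the model-based system trains the model-free system by replay. The sketch below is an assumption about how such a scheme might look in code, with made-up state/action sizes and parameters; it is not the experiments' fitted model.

```python
import numpy as np

# A DYNA-style sketch of the cooperative architecture described above: a model-free Q-table
# controls choice, while a learned model of the environment replays simulated transitions to
# train that same table offline. State/action sizes and parameters are assumptions.

rng = np.random.default_rng(1)
n_states, n_actions = 3, 2
Q = np.zeros((n_states, n_actions))          # model-free values that control behavior
model = {}                                   # learned model of the world: (s, a) -> (s', r)
alpha, gamma = 0.2, 0.95

def real_step(s, a, s_next, r):
    """Learn from a real transition and store it in the internal model."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    model[(s, a)] = (s_next, r)

def replay(n_sim=20):
    """The model-based system trains the model-free values by simulating stored experience."""
    for _ in range(n_sim):
        (s, a), (s_next, r) = list(model.items())[rng.integers(len(model))]
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

real_step(0, 1, 2, 1.0)                      # one real experience...
replay()                                     # ...followed by offline replay of the stored model
print(Q)
```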


Neural Computation | 2012

Multistability and perceptual inference

Samuel J. Gershman; Edward Vul; Joshua B. Tenenbaum

Ambiguous images present a challenge to the visual system: How can uncertainty about the causes of visual inputs be represented when there are multiple equally plausible causes? A Bayesian ideal observer should represent uncertainty in the form of a posterior probability distribution over causes. However, in many real-world situations, computing this distribution is intractable and requires some form of approximation. We argue that the visual system approximates the posterior over underlying causes with a set of samples and that this approximation strategy produces perceptual multistability—stochastic alternation between percepts in consciousness. Under our analysis, multistability arises from a dynamic sample-generating process that explores the posterior through stochastic diffusion, implementing a rational form of approximate Bayesian inference known as Markov chain Monte Carlo (MCMC). We examine in detail the most extensively studied form of multistability, binocular rivalry, showing how a variety of experimental phenomena—gamma-like stochastic switching, patchy percepts, fusion, and traveling waves—can be understood in terms of MCMC sampling over simple graphical models of the underlying perceptual tasks. We conjecture that the stochastic nature of spiking neurons may lend itself to implementing sample-based posterior approximations in the brain.
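
To make the sampling idea concrete, here is a minimal Metropolis sketch over two equally plausible interpretations. It illustrates the general MCMC account rather than the paper's graphical models of rivalry; the posterior and proposal rate are assumed.

```python
import numpy as np

# A minimal Metropolis sketch: the chain's state is the currently dominant interpretation of
# an ambiguous image, and stochastic switching between states plays the role of perceptual
# alternation. The posterior over interpretations and the proposal rate are assumptions.

rng = np.random.default_rng(2)
log_post = np.log(np.array([0.5, 0.5]))        # two equally plausible interpretations

def mcmc_percepts(n_steps=2000, flip_prob=0.05):
    """Run a Metropolis chain over the two percepts and return the visited states."""
    state, trace = 0, []
    for _ in range(n_steps):
        proposal = 1 - state if rng.random() < flip_prob else state      # local, symmetric proposal
        if np.log(rng.random()) < log_post[proposal] - log_post[state]:  # Metropolis acceptance
            state = proposal
        trace.append(state)
    return np.array(trace)

trace = mcmc_percepts()
print(f"time on percept 1: {trace.mean():.2f}, switches: {np.sum(np.diff(trace) != 0)}")
```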


Psychonomic Bulletin & Review | 2011

Human memory reconsolidation can be explained using the temporal context model

Per B. Sederberg; Samuel J. Gershman; Sean M. Polyn; Kenneth A. Norman

Recent work by Hupbach, Gomez, Hardt, and Nadel (Learning & Memory, 14, 47–53, 2007) and Hupbach, Gomez, and Nadel (Memory, 17, 502–510, 2009) suggests that episodic memory for a previously studied list can be updated to include new items, if participants are reminded of the earlier list just prior to learning a new list. The key finding from the Hupbach studies was an asymmetric pattern of intrusions, whereby participants intruded numerous items from the second list when trying to recall the first list, but not vice versa. Hupbach et al. (2007, 2009) explained this pattern in terms of a cellular reconsolidation process, whereby first-list memory is rendered labile by the reminder and the labile memory is then updated to include items from the second list. Here, we show that the temporal context model of memory, which lacks a cellular reconsolidation process, can account for the asymmetric intrusion effect, using well-established principles of contextual reinstatement and item–context binding.
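
The core mechanism the model relies on, a slowly drifting temporal context that gets bound to studied items and partially reinstated by a reminder, can be sketched as follows. The dimensionality, drift rate, and item representations below are illustrative assumptions, not the fitted model.

```python
import numpy as np

# A sketch of temporal-context-model-style drift: context is a slowly evolving vector, each
# studied item is bound to the context active when it was studied, and a reminder partially
# reinstates the old context so new items get bound to a blend of old and new context.

def update_context(context, item_input, beta=0.4):
    """Drift the context vector toward the current item's input."""
    rho = np.sqrt(1 - beta ** 2)             # keeps context near unit length for orthogonal inputs
    new_context = rho * context + beta * item_input
    return new_context / np.linalg.norm(new_context)

dim = 8
context = np.eye(dim)[0]                     # arbitrary starting context
for i in range(1, 4):                        # study three items; context drifts with each one
    context = update_context(context, np.eye(dim)[i])
print(np.round(context, 3))
```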


Learning & Behavior | 2012

Exploring a latent cause theory of classical conditioning.

Samuel J. Gershman; Yael Niv

We frame behavior in classical conditioning experiments as the product of normative statistical inference. According to this theory, animals learn an internal model of their environment from experience. The basic building blocks of this internal model are latent causes—explanatory constructs inferred by the animal that partition observations into coherent clusters. Generalization of conditioned responding from one cue to another arises from the animal’s inference that the cues were generated by the same latent cause. Through a wide range of simulations, we demonstrate where the theory succeeds and where it fails as a general account of classical conditioning.
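
The generalization claim can be stated in one line. A minimal sketch with assumed numbers, not the model's actual inference over cues:

```python
# With a CRP prior and no other evidence, the probability that a new cue B joins cue A's
# latent cause (when that is the only existing cause) is n_A / (n_A + alpha), and conditioned
# responding generalizes in proportion to that probability. All numbers are assumptions.
n_A, alpha = 10, 0.5                   # observations assigned to A's cause; CRP concentration
p_same_cause = n_A / (n_A + alpha)     # prior probability that B shares A's latent cause
response_to_A = 1.0                    # conditioned response acquired to cue A (arbitrary units)
print("generalized response to B:", round(p_same_cause * response_to_A, 3))
```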


Neuropsychologia | 2013

Moderate levels of activation lead to forgetting in the think/no-think paradigm

Greg Detre; Annamalai Natarajan; Samuel J. Gershman; Kenneth A. Norman

Using the think/no-think paradigm (Anderson & Green, 2001), researchers have found that suppressing retrieval of a memory (in the presence of a strong retrieval cue) can make it harder to retrieve that memory on a subsequent test. This effect has been replicated numerous times, but the size of the effect is highly variable. Also, it is unclear from a neural mechanistic standpoint why preventing recall of a memory now should impair your ability to recall that memory later. Here, we address both of these puzzles using the idea, derived from computational modeling and studies of synaptic plasticity, that the function relating memory activation to learning is U-shaped, such that moderate levels of memory activation lead to weakening of the memory and higher levels of activation lead to strengthening. According to this view, forgetting effects in the think/no-think paradigm occur when the suppressed item activates moderately during the suppression attempt, leading to weakening; the effect is variable because sometimes the suppressed item activates strongly (leading to strengthening) and sometimes it does not activate at all (in which case no learning takes place). To test this hypothesis, we ran a think/no-think experiment where participants learned word-picture pairs; we used pattern classifiers, applied to fMRI data, to measure how strongly the picture associates were activating when participants were trying not to retrieve these associates, and we used a novel Bayesian curve-fitting procedure to relate this covert neural measure of retrieval to performance on a later memory test. In keeping with our hypothesis, the curve-fitting procedure revealed a nonmonotonic relationship between memory activation (as measured by the classifier) and subsequent memory, whereby moderate levels of activation of the to-be-suppressed item led to diminished performance on the final memory test, and higher levels of activation led to enhanced performance on the final test.
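
The U-shaped (nonmonotonic) learning function at the heart of this account can be illustrated with a toy functional form. The shape below is an assumption chosen only to show the qualitative pattern, not the curve fitted in the paper.

```python
import numpy as np

# A toy nonmonotonic ("U-shaped") plasticity function: roughly no change when the memory is
# inactive, weakening at moderate activation, strengthening at high activation. The specific
# shape and parameters are assumptions for illustration.

def weight_change(activation, dip_center=0.5, dip_width=0.15):
    """Map memory activation (0..1) to a learning signal."""
    weakening = -np.exp(-((activation - dip_center) ** 2) / (2 * dip_width ** 2))
    strengthening = 4.0 * np.clip(activation - 0.6, 0.0, None)
    return weakening + strengthening

for a in [0.0, 0.25, 0.5, 0.75, 1.0]:
    print(f"activation {a:.2f} -> weight change {weight_change(a):+.2f}")
```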


Journal of Cognitive Neuroscience | 2013

Neural and Psychological Maturation of Decision-making in Adolescence and Young Adulthood

Anastasia Christakou; Samuel J. Gershman; Yael Niv; Andrew Simmons; Mick Brammer; Katya Rubia

We examined the maturation of decision-making from early adolescence to mid-adulthood using fMRI of a variant of the Iowa gambling task. We have previously shown that performance in this task relies on sensitivity to accumulating negative outcomes in ventromedial PFC and dorsolateral PFC. Here, we further formalize outcome evaluation (as driven by prediction errors [PE], using a reinforcement learning model) and examine its development. Task performance improved significantly during adolescence, stabilizing in adulthood. Performance relied on greater impact of negative compared with positive PEs, the relative impact of which matured from adolescence into adulthood. Adolescents also showed increased exploratory behavior, expressed as a propensity to shift responding between options independently of outcome quality, whereas adults showed no systematic shifting patterns. The correlation between PE representation and improved performance strengthened with age for activation in ventral and dorsal PFC, ventral striatum, and temporal and parietal cortices. There was a medial-lateral distinction in the prefrontal substrates of effective PE utilization between adults and adolescents: Increased utilization of negative PEs, a hallmark of successful performance in the task, was associated with increased activation in ventromedial PFC in adults, but decreased activation in ventrolateral PFC and striatum in adolescents. These results suggest that adults and adolescents engage qualitatively distinct neural and psychological processes during decision-making, the development of which is not exclusively dependent on reward-processing maturation.
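
The asymmetry between positive and negative prediction errors can be expressed as a pair of learning rates in a simple value update. The sketch below uses hypothetical rates and outcomes and is not the model fitted to the imaging data.

```python
# A simple value update with separate learning rates for positive and negative prediction
# errors, so that negative outcomes weigh more heavily on the learned value than positive
# ones. Rates and the outcome sequence are assumptions for illustration.

def asymmetric_update(value, reward, alpha_pos=0.1, alpha_neg=0.4):
    """Update a single option's value, weighting losses more heavily than gains."""
    delta = reward - value                   # prediction error
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return value + alpha * delta

v = 0.0
for r in [1.0, 1.0, -1.0, 1.0, -1.0]:        # toy outcome sequence for one option
    v = asymmetric_update(v, r)
    print(f"outcome {r:+.0f} -> value {v:.3f}")
```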

Collaboration


Dive into Samuel J. Gershman's collaboration network.

Top Co-Authors

Yael Niv
Princeton University

Arthur B. Markman
University of Texas at Austin

Edward Vul
University of California

Joshua B. Tenenbaum
Massachusetts Institute of Technology