Nathaniel D. Daw
Princeton University
Publications
Featured research published by Nathaniel D. Daw.
Nature Neuroscience | 2005
Nathaniel D. Daw; Yael Niv; Peter Dayan
A broad range of neural and behavioral data suggests that the brain contains multiple systems for behavioral choice, including one associated with prefrontal cortex and another with dorsolateral striatum. However, such a surfeit of control raises an additional choice problem: how to arbitrate between the systems when they disagree. Here, we consider dual-action choice systems from a normative perspective, using the computational theory of reinforcement learning. We identify a key trade-off pitting computational simplicity against the flexible and statistically efficient use of experience. The trade-off is realized in a competition between the dorsolateral striatal and prefrontal systems. We suggest a Bayesian principle of arbitration between them according to uncertainty, so each controller is deployed when it should be most accurate. This provides a unifying account of a wealth of experimental evidence about the factors favoring dominance by either system.
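As an editorial illustration of the uncertainty-based arbitration idea, here is a minimal Python sketch; the variances, values, and winner-take-all rule below are assumptions for illustration, not the paper's full Bayesian formulation.

```python
import numpy as np

# Minimal sketch of uncertainty-based arbitration (illustrative; the
# paper's Bayesian treatment is richer). Each controller reports action
# values plus an uncertainty; control goes to whichever system is
# currently more certain, so each is deployed where it should be most
# accurate.

def arbitrate(q_mb, var_mb, q_mf, var_mf):
    """Return the values of the more certain controller."""
    return (q_mb, "model-based") if var_mb < var_mf else (q_mf, "model-free")

# Early in training the flexible prefrontal (model-based) system tends
# to be more certain; after extended training the cached striatal
# (model-free) values win out. All numbers below are made up.
q_mb, var_mb = np.array([0.8, 0.2]), 0.05   # tree-search estimates
q_mf, var_mf = np.array([0.6, 0.4]), 0.20   # cached TD estimates
print(arbitrate(q_mb, var_mb, q_mf, var_mf)[1])  # -> "model-based"
```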
Psychopharmacology | 2007
Yael Niv; Nathaniel D. Daw; Daphna Joel; Peter Dayan
Rationale: Dopamine neurotransmission has long been known to exert a powerful influence over the vigor, strength, or rate of responding. However, there exists no clear understanding of the computational foundation for this effect; predominant accounts of dopamine’s computational function focus on a role for phasic dopamine in controlling the discrete selection between different actions and have nothing to say about response vigor or indeed the free-operant tasks in which it is typically measured.
Objectives: We seek to accommodate free-operant behavioral tasks within the realm of models of optimal control and thereby capture how dopaminergic and motivational manipulations affect response vigor.
Methods: We construct an average reward reinforcement learning model in which subjects choose both which action to perform and also the latency with which to perform it. Optimal control balances the costs of acting quickly against the benefits of getting reward earlier and thereby chooses a best response latency.
Results: In this framework, the long-run average rate of reward plays a key role as an opportunity cost and mediates motivational influences on rates and vigor of responding. We review evidence suggesting that the average reward rate is reported by tonic levels of dopamine, putatively in the nucleus accumbens.
Conclusions: Our extension of reinforcement learning models to free-operant tasks unites psychologically and computationally inspired ideas about the role of tonic dopamine in striatum, explaining from a normative point of view why higher levels of dopamine might be associated with more vigorous responding.
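A schematic version of this latency trade-off (the hyperbolic vigor cost below is a simplifying assumption; the paper's model is more general): acting with latency $\tau$ incurs a vigor cost $C_v/\tau$ plus an opportunity cost $\bar{R}\,\tau$ of reward forgone while not acting, so

$$\mathrm{cost}(\tau) = \frac{C_v}{\tau} + \bar{R}\,\tau, \qquad \frac{d\,\mathrm{cost}}{d\tau} = -\frac{C_v}{\tau^{2}} + \bar{R} = 0 \;\Rightarrow\; \tau^{*} = \sqrt{C_v/\bar{R}}.$$

A higher average reward rate $\bar{R}$ (on this account, reported by tonic dopamine) therefore shortens the optimal latency, i.e., speeds responding.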
Neural Networks | 2002
Nathaniel D. Daw; Sham M. Kakade; Peter Dayan
Anatomical and pharmacological evidence suggests that the dorsal raphe serotonin system and the ventral tegmental and substantia nigra dopamine system may act as mutual opponents. In the light of the temporal difference model of the involvement of the dopamine system in reward learning, we consider three aspects of motivational opponency involving dopamine and serotonin. We suggest that a tonic serotonergic signal reports the long-run average reward rate as part of an average-case reinforcement learning model; that a tonic dopaminergic signal reports the long-run average punishment rate in a similar context; and finally speculate that a phasic serotonin signal might report an ongoing prediction error for future punishment.
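The average-case learning rule at issue can be sketched in a few lines of Python (an illustrative TD(0) variant; the learning rates and state coding are assumed, not the paper's):

```python
import numpy as np

# Average-reward TD(0) sketch: the prediction error subtracts the
# long-run average reward rate rho -- the quantity the tonic signals
# discussed above are hypothesized to report.
alpha, alpha_rho = 0.1, 0.01   # learning rates (assumed values)
V = np.zeros(5)                # values for a toy 5-state world
rho = 0.0                      # running estimate of average reward rate

def td_step(s, r, s_next):
    """One average-case TD update; returns the prediction error."""
    global rho
    delta = r - rho + V[s_next] - V[s]
    V[s] += alpha * delta
    rho += alpha_rho * delta
    return delta
```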
Neuron | 2011
Nathaniel D. Daw; Samuel J. Gershman; Ben Seymour; Peter Dayan; R. J. Dolan
The mesostriatal dopamine system is prominently implicated in model-free reinforcement learning, with fMRI BOLD signals in ventral striatum notably covarying with model-free prediction errors. However, latent learning and devaluation studies show that behavior also shows hallmarks of model-based planning, and the interaction between model-based and model-free values, prediction errors and preferences is underexplored. We designed a multistep decision task in which model-based and model-free influences on human choice behavior could be distinguished. By showing that choices reflected both influences we could then test the purity of the ventral striatal BOLD signal as a model-free report. Contrary to expectations, the signal reflected both model-free and model-based predictions in proportions matching those that best explained choice behavior. These results challenge the notion of a separate model-free learner and suggest a more integrated computational architecture for high-level human decision-making.
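The hybrid valuation implied by these results can be sketched as follows (variable names, the fixed weight, and parameter values are ours for illustration; the paper fits the weighting to behavior):

```python
import numpy as np

# Hybrid model-based / model-free valuation: choices (and, per the
# paper, ventral striatal BOLD) reflect a weighted mixture of the two
# systems' values rather than the model-free term alone.

def hybrid_values(q_mb, q_mf, w):
    """w = 1: purely model-based; w = 0: purely model-free."""
    return w * q_mb + (1.0 - w) * q_mf

def softmax_choice(q, beta, rng):
    """Sample an action with softmax probabilities at inverse temperature beta."""
    p = np.exp(beta * q - np.max(beta * q))
    return rng.choice(len(q), p=p / p.sum())

rng = np.random.default_rng(0)
q = hybrid_values(np.array([0.7, 0.3]), np.array([0.4, 0.6]), w=0.5)
print(softmax_choice(q, beta=3.0, rng=rng))
```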
The Journal of Neuroscience | 2007
Ben Seymour; Nathaniel D. Daw; Peter Dayan; Tania Singer; R. J. Dolan
Studies on human monetary prediction and decision making emphasize the role of the striatum in encoding prediction errors for financial reward. However, less is known about how the brain encodes financial loss. Using Pavlovian conditioning of visual cues to outcomes that simultaneously incorporate the chance of financial reward and loss, we show that striatal activation reflects positively signed prediction errors for both. Furthermore, we show functional segregation within the striatum, with more anterior regions showing relative selectivity for rewards and more posterior regions for losses. These findings mirror the anteroposterior valence-specific gradient reported in rodents and endorse the role of the striatum in aversive motivational learning about financial losses, illustrating functional and anatomical consistencies with primary aversive outcomes such as pain.
The Journal of Neuroscience | 2007
Tom Schonberg; Nathaniel D. Daw; Daphna Joel; John P. O'Doherty
The computational framework of reinforcement learning has been used to advance our understanding of the neural mechanisms underlying reward learning and decision-making behavior. It is known that humans vary widely in their performance in decision-making tasks. Here, we used a simple four-armed bandit task in which subjects are almost evenly split into two groups on the basis of their performance: those who do learn to favor choice of the optimal action and those who do not. Using models of reinforcement learning, we sought to determine the neural basis of these intrinsic differences in performance by scanning both groups with functional magnetic resonance imaging. We scanned 29 subjects while they performed the reward-based decision-making task. Our results suggest that these two groups differ markedly in the degree to which reinforcement learning signals in the striatum are engaged during task performance. While the learners showed robust prediction error signals in both the ventral and dorsal striatum during learning, the nonlearner group showed a marked absence of such signals. Moreover, the magnitude of prediction error signals in a region of dorsal striatum correlated significantly with a measure of behavioral performance across all subjects. These findings support a crucial role of prediction error signals, likely originating from dopaminergic midbrain neurons, in enabling learning of action selection preferences on the basis of obtained rewards. Thus, spontaneously observed individual differences in decision-making performance demonstrate the suggested dependence of this type of learning on the functional integrity of the dopaminergic striatal system in humans.
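A generic Q-learning model of such a bandit task, for orientation (a sketch: the payoff probabilities and parameters are invented, and the paper's fitting procedure differs):

```python
import numpy as np

# Q-learning on a four-armed bandit; delta is the trial-by-trial
# prediction error used as an fMRI regressor in studies of this kind.
rng = np.random.default_rng(0)
p_reward = np.array([0.8, 0.4, 0.3, 0.2])   # assumed payoff probabilities
alpha, beta = 0.2, 4.0                      # learning rate, inverse temperature
Q = np.zeros(4)

for t in range(200):
    p = np.exp(beta * Q - np.max(beta * Q))
    p /= p.sum()
    a = rng.choice(4, p=p)                  # softmax action selection
    r = float(rng.random() < p_reward[a])   # probabilistic reward
    delta = r - Q[a]                        # reward prediction error
    Q[a] += alpha * delta

print(Q.round(2))   # a "learner" ends up valuing the best arm most
```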
Brain | 2009
Nikoletta Bódi; Szabolcs Kéri; Helga Nagy; Ahmed A. Moustafa; Catherine E. Myers; Nathaniel D. Daw; György Dibó; Annamária Takáts; Dániel Bereczki; Mark A. Gluck
Parkinson's disease is characterized by the degeneration of dopaminergic pathways projecting to the striatum. These pathways are implicated in reward prediction. In this study, we investigated reward and punishment processing in young, never-medicated Parkinson's disease patients, recently medicated patients receiving the dopamine receptor agonists pramipexole and ropinirole, and healthy controls. The never-medicated patients were also re-evaluated after 12 weeks of treatment with dopamine agonists. Reward and punishment processing was assessed by a feedback-based probabilistic classification task. Personality characteristics were measured by the Temperament and Character Inventory. Results revealed that never-medicated patients with Parkinson's disease showed selective deficits on reward processing and novelty seeking, which were remediated by dopamine agonists. These medications disrupted punishment processing. In addition, dopamine agonists increased the correlation between reward processing and novelty seeking, whereas these drugs decreased the correlation between punishment processing and harm avoidance. Our finding that dopamine agonist administration in young patients with Parkinson's disease resulted in increased novelty seeking, enhanced reward processing, and decreased punishment processing may shed light on the cognitive and personality bases of the impulse control disorders, which arise as side-effects of dopamine agonist therapy in some Parkinson's disease patients.
Trends in Cognitive Sciences | 2006
Aaron C. Courville; Nathaniel D. Daw; David S. Touretzky
The recent flowering of Bayesian approaches invites the re-examination of classic issues in behavior, even in areas as venerable as Pavlovian conditioning. A statistical account can offer a new, principled interpretation of behavior, and previous experiments and theories can inform many unexplored aspects of the Bayesian enterprise. Here we consider one such issue: the finding that surprising events provoke animals to learn faster. We suggest that, in a statistical account of conditioning, surprise signals environmental change, and therefore uncertainty and the need for new learning. We discuss inference in a world that changes and show how experimental results involving surprise can be interpreted from this perspective, and also how, thus understood, these phenomena help constrain statistical theories of animal and human learning.
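One concrete way to cash out the surprise-change-uncertainty link is a scalar Kalman filter (an illustrative sketch, not the authors' specific model): the effective learning rate is the Kalman gain, which grows with posterior uncertainty, so inferring change after a surprising outcome speeds subsequent learning.

```python
# Scalar Kalman filter tracking a drifting quantity; the gain k acts
# as the learning rate. Treating a surprising observation as evidence
# of environmental change (extra process noise) inflates uncertainty
# and hence the gain on later trials. All parameter values are assumed.
mu, var = 0.0, 1.0     # posterior mean and variance
q, r = 0.01, 1.0       # process noise, observation noise

def observe(y, change_boost=0.0):
    global mu, var
    var += q + change_boost      # diffusion; more if change is suspected
    k = var / (var + r)          # Kalman gain = effective learning rate
    mu += k * (y - mu)
    var *= 1.0 - k
    return k

print(round(observe(1.0), 3))                    # ordinary gain
print(round(observe(5.0, change_boost=2.0), 3))  # surprise -> larger gain
```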
Cognitive, Affective, & Behavioral Neuroscience | 2008
Peter Dayan; Nathaniel D. Daw
Decision making is a core competence for animals and humans acting and surviving in environments they only partially comprehend, gaining rewards and punishments for their troubles. Decision-theoretic concepts permeate experiments and computational models in ethology, psychology, and neuroscience. Here, we review a well-known, coherent Bayesian approach to decision making, showing how it unifies issues in Markovian decision problems, signal detection psychophysics, sequential sampling, and optimal exploration, and discussing paradigmatic psychological and neural examples of each problem. We discuss computational issues concerning what subjects know about their task and how ambitious they are in seeking optimal solutions; we address algorithmic topics concerning model-based and model-free methods for making choices; and we highlight key aspects of the neural implementation of decision making.
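As one worked example from this family, the sequential sampling problem has a classical Bayesian solution in the sequential probability ratio test; a minimal sketch follows (Gaussian observations; all parameters are assumed):

```python
import numpy as np

# SPRT: accumulate the log-likelihood ratio of two hypotheses about
# the mean of noisy samples and respond when it crosses a bound --
# the normative core of sequential-sampling models of decision making.
rng = np.random.default_rng(1)

def sprt(mu_true, mu0=-0.5, mu1=0.5, sigma=1.0, bound=3.0):
    """Sample until the log-likelihood ratio hits +/- bound."""
    llr, t = 0.0, 0
    while abs(llr) < bound:
        x = rng.normal(mu_true, sigma)
        llr += (x * (mu1 - mu0) - 0.5 * (mu1**2 - mu0**2)) / sigma**2
        t += 1
    return ("H1" if llr > 0 else "H0"), t

print(sprt(0.5))   # typically chooses H1 after a handful of samples
```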
Neuropsychopharmacology | 2011
Roshan Cools; Kae Nakamura; Nathaniel D. Daw
Serotonin, like dopamine (DA), has long been implicated in adaptive behavior, including decision making and reinforcement learning. However, although the two neuromodulators are tightly related and have a similar degree of functional importance, compared with DA, we have a much less specific understanding about the mechanisms by which serotonin affects behavior. Here, we draw on recent work on computational models of dopaminergic function to suggest a framework by which many of the seemingly diverse functions associated with both DA and serotonin—comprising both affective and activational ones, as well as a number of other functions not overtly related to either—can be seen as consequences of a single root mechanism.