Elliot Andrew Ludvig
University of Warwick
Publications
Featured research published by Elliot Andrew Ludvig.
Neural Computation | 2008
Elliot Andrew Ludvig; Richard S. Sutton; E. James Kehoe
The phasic firing of dopamine neurons has been theorized to encode a reward-prediction error as formalized by the temporal-difference (TD) algorithm in reinforcement learning. Most TD models of dopamine have assumed a stimulus representation, known as the complete serial compound, in which each moment in a trial is distinctly represented. We introduce a more realistic temporal stimulus representation for the TD model. In our model, all external stimuli, including rewards, spawn a series of internal microstimuli, which grow weaker and more diffuse over time. These microstimuli are used by the TD learning algorithm to generate predictions of future reward. This new stimulus representation injects temporal generalization into the TD model and enhances correspondence between model and data in several experiments, including those in which rewards are omitted or received early. This improved fit mostly derives from the absence of large negative errors in the new model, suggesting that dopamine alone can encode the full range of TD errors in these situations.
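A minimal sketch of the microstimulus scheme described above, assuming an exponentially decaying memory trace and Gaussian basis functions of the trace height. The decay rate, number of microstimuli, and width below are illustrative placeholders rather than the published values, and only CS-triggered microstimuli are modeled here (in the paper, rewards spawn microstimuli as well):

```python
import numpy as np

def microstimuli(t, n=10, decay=0.985, sigma=0.08):
    """Microstimulus levels t time steps after stimulus onset.

    The stimulus launches a trace that decays exponentially; each
    microstimulus is a Gaussian basis function of the trace height,
    scaled by the trace, so later microstimuli are weaker and more
    diffuse in time.
    """
    trace = decay ** t
    centers = (np.arange(n) + 1.0) / n          # spread over trace heights
    return trace * np.exp(-(trace - centers) ** 2 / (2 * sigma ** 2))

# TD(lambda) with a linear value function over the microstimuli
n, gamma, lam, alpha = 10, 0.97, 0.95, 0.05     # illustrative parameters
w = np.zeros(n)
trial_len, reward_time = 100, 50                # reward 50 steps after CS onset

for trial in range(500):
    e = np.zeros(n)                             # eligibility trace
    for t in range(trial_len - 1):
        x_now, x_next = microstimuli(t, n), microstimuli(t + 1, n)
        r = 1.0 if t + 1 == reward_time else 0.0
        delta = r + gamma * (w @ x_next) - (w @ x_now)   # TD error
        e = gamma * lam * e + x_now
        w += alpha * delta * e
```

Because neighboring microstimuli overlap, a weight update at one moment generalizes to nearby moments, which is the temporal generalization the abstract refers to.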
PLOS ONE | 2011
Elliot Andrew Ludvig; Marcia L. Spetch
When faced with risky decisions, people tend to be risk averse for gains and risk seeking for losses (the reflection effect). Studies examining this risk-sensitive decision making, however, typically ask people directly what they would do in hypothetical choice scenarios. A recent flurry of studies has shown that when these risky decisions include rare outcomes, people make different choices for explicitly described probabilities than for experienced probabilistic outcomes. Specifically, rare outcomes are overweighted when described and underweighted when experienced. In two experiments, we examined risk-sensitive decision making when the risky option had two equally probable (50%) outcomes. For experience-based decisions, there was a reversal of the reflection effect with greater risk seeking for gains than for losses, as compared to description-based decisions. This fundamental difference between experienced and described choices cannot be explained by the weighting of rare events and suggests a separate subjective utility curve for experience.
Brain Research | 2010
Fuat Balcı; Elliot Andrew Ludvig; Ron Abner; Xiaoxi Zhuang; Patrick Poon; Daniela Brunner
We examined interval timing in mice that underexpress the dopamine transporter (DAT) and have chronically higher levels of extracellular dopamine (Zhuang et al., 2001). The dopaminergic system has been proposed as a neural substrate for an internal clock, with transient elevations of dopaminergic activity producing underestimation of temporal intervals. A group of DAT knockdown (KD) and littermate wild type (WT) mice were tested with a dual peak procedure. Mice obtained reinforcement by pressing one of two levers after a fixed amount of time (30 or 45 s) had elapsed since lever extension. Only one lever was available at a time, and each lever was associated with a single duration. On occasional probe trials, the DAT KD mice began responding earlier in the interval than WT mice, but showed maximal responding and terminated responding around the same time as the WT mice. Administration of raclopride (0.2, 0.6, and 1.2 mg/kg), a D2 antagonist, eliminated most of the differences between DAT KD and WT mice, suggesting that the effects of chronic DAT downregulation on interval timing were mediated by the D2 receptors. Another cohort of DAT KD mice was trained on a visual attention task, and no deficits were observed, confirming that the changes in timed behavior were not attentionally mediated. Our data are consistent with the view that tonic dopamine affects the sensitivity of an organism to external reward signals, and that this increased motivation for reward in DAT KD mice lowers the threshold for initiating responding in a timing task.
Timing & Time Perception | 2013
Patrick Simen; Francois Rivest; Elliot Andrew Ludvig; Fuat Balcı; Peter R. Killeen
Pacemaker-accumulator (PA) systems have been the most popular kind of timing model in the half-century since their introduction by Treisman (1963). Many alternative timing models predicated on different assumptions have since been designed, though the dominant PA model during this period, Gibbon and Church's Scalar Expectancy Theory (SET), invokes most of them. As in Treisman, SET's implementation assumes a fixed-rate clock-pulse generator and encodes durations by storing average pulse counts; unlike Treisman's model, SET's decision process invokes Weber's law of magnitude comparison to account for timescale-invariant temporal precision in animal behavior. This is one way to deal with the 'Poisson timing' issue, in which relative temporal precision counterfactually increases for longer durations in a simplified version of Treisman's model. First, we review the fact that this problem does not afflict Treisman's model itself, owing to a key assumption not shared by SET. Second, we develop a contrasting PA model, an extension of Killeen and Fetterman's Behavioral Theory of Timing, that accumulates Poisson pulses up to a fixed criterion level, with pulse rates adapting to time different intervals. Like Treisman's model, this time-adaptive, opponent Poisson, drift-diffusion model accounts for timescale invariance without first assuming Weber's law. It also makes new predictions about response times and learning speed and connects interval timing to the popular drift-diffusion model of perceptual decision making. With at least three different routes to timescale invariance, the PA model family can provide a more compelling account of timed behavior than may be generally appreciated.
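A rough simulation of the time-adaptive, opponent Poisson, drift-diffusion idea, as one might sketch it: two Poisson pulse trains are differenced and accumulated to a fixed threshold, with both rates scaled by a learned drift A = theta/target. The function name and every parameter value here are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def timed_response(target, theta=100.0, beta=1.5, dt=0.01):
    """One trial of a time-adaptive opponent Poisson accumulator.

    Two Poisson pulse trains are differenced and accumulated until the
    fixed threshold `theta` is crossed. Both rates scale with the
    learned net drift A = theta / target, so longer intervals are timed
    with proportionally slower (and proportionally noisier) clocks.
    """
    A = theta / target                          # learned net pulse rate
    rate_plus, rate_minus = beta * A, (beta - 1.0) * A
    x, t = 0.0, 0.0
    while x < theta:
        x += rng.poisson(rate_plus * dt) - rng.poisson(rate_minus * dt)
        t += dt
    return t

# Mean response time tracks the target; the coefficient of variation
# stays roughly constant across durations (timescale invariance).
for target in (5.0, 15.0, 45.0):
    times = np.array([timed_response(target) for _ in range(200)])
    print(target, round(times.mean(), 2), round(times.std() / times.mean(), 3))
```

Because both the mean drift and the pulse-count variance scale inversely with the target duration, the standard deviation of the threshold-crossing time grows in proportion to its mean, and the printed coefficient of variation stays near sqrt((2*beta - 1)/theta) for all three targets: timescale invariance without first assuming Weber's law.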
Learning & Behavior | 2012
Elliot Andrew Ludvig; Richard S. Sutton; E. James Kehoe
The temporal-difference (TD) algorithm from reinforcement learning provides a simple method for incrementally learning predictions of upcoming events. Applied to classical conditioning, TD models suppose that animals learn a real-time prediction of the unconditioned stimulus (US) on the basis of all available conditioned stimuli (CSs). In the TD model, similar to other error-correction models, learning is driven by prediction errors—the difference between the change in US prediction and the actual US. With the TD model, however, learning occurs continuously from moment to moment and is not artificially constrained to occur in trials. Accordingly, a key feature of any TD model is the assumption about the representation of a CS on a moment-to-moment basis. Here, we evaluate the performance of the TD model with a heretofore unexplored range of classical conditioning tasks. To do so, we consider three stimulus representations that vary in their degree of temporal generalization and evaluate how the representation influences the performance of the TD model on these conditioning tasks.
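The representations at issue can be pictured as different feature vectors handed to one and the same TD(lambda) update. A hedged sketch of the two extremes follows (function names and sizes are illustrative; the microstimulus representation, which sits between them, is sketched after the 2008 abstract above):

```python
import numpy as np

def presence(t, cs_duration):
    """A single feature that is on for the whole CS: maximal temporal
    generalization, since every moment of the CS looks the same."""
    return np.array([1.0 if 0 <= t < cs_duration else 0.0])

def complete_serial_compound(t, trial_len):
    """One feature per time step: each moment is distinct, so there is
    no temporal generalization at all."""
    x = np.zeros(trial_len)
    if 0 <= t < trial_len:
        x[t] = 1.0
    return x

print(presence(3, cs_duration=10))               # [1.]
print(complete_serial_compound(3, trial_len=5))  # [0. 0. 0. 1. 0.]

# Whatever representation supplies x_t, the same TD(lambda) rule applies:
#   delta_t = r_{t+1} + gamma * (w @ x_{t+1}) - (w @ x_t)   # prediction error
#   e_t     = gamma * lam * e_{t-1} + x_t                   # eligibility trace
#   w      += alpha * delta_t * e_t                         # weight update
```

The presence cue generalizes maximally across time within a CS, the complete serial compound not at all, and how closely the TD model matches the conditioning data depends heavily on this choice.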
Frontiers in Computational Neuroscience | 2014
Samuel J. Gershman; Ahmed A. Moustafa; Elliot Andrew Ludvig
Reinforcement learning (RL) models have been influential in understanding many aspects of basal ganglia function, from reward prediction to action selection. Time plays an important role in these models, but there is still no theoretical consensus about what kind of time representation is used by the basal ganglia. We review several theoretical accounts and their supporting evidence. We then discuss the relationship between RL models and the timing mechanisms that have been attributed to the basal ganglia. We hypothesize that a single computational system may underlie both RL and interval timing (the perception of duration in the range of seconds to hours). This hypothesis, which extends earlier models by incorporating a time-sensitive action selection mechanism, may have important implications for understanding disorders like Parkinson's disease in which both decision making and timing are impaired.
Psychonomic Bulletin & Review | 2014
Christopher R. Madan; Elliot Andrew Ludvig; Marcia L. Spetch
When making decisions on the basis of past experiences, people must rely on their memories. Human memory has many well-known biases, including the tendency to better remember highly salient events. We propose an extreme-outcome rule, whereby this memory bias leads people to overweight the largest gains and largest losses, leading to more risk seeking for relative gains than for relative losses. To test this rule, in two experiments, people repeatedly chose between fixed and risky options, where the risky option led equiprobably to more or less than did the fixed option. As was predicted, people were more risk seeking for relative gains than for relative losses. In subsequent memory tests, people tended to recall the extreme outcome first and also judged the extreme outcome as having occurred more frequently. Across individuals, risk preferences in the risky-choice task correlated with these memory biases. This extreme-outcome rule presents a novel mechanism through which memory influences decision making.
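One way to make the extreme-outcome rule concrete is as a memory-weighted average in which the outcome furthest from zero receives extra weight. This is a hypothetical formalization for illustration only; risky_value and extra_weight are invented names, and the paper tests the rule behaviorally rather than with any particular formula:

```python
import numpy as np

def risky_value(outcomes, extra_weight=0.25):
    """Subjective value of a risky option under an extreme-outcome bias.

    Hypothetical weighting scheme: all experienced outcomes share weight
    equally, except that the outcome furthest from zero receives an
    extra chunk of probability mass, mimicking its memory advantage.
    """
    outcomes = np.asarray(outcomes, dtype=float)
    w = np.full(len(outcomes), (1.0 - extra_weight) / len(outcomes))
    w[np.argmax(np.abs(outcomes))] += extra_weight
    return float(w @ outcomes)

# Gains: risky option pays 40 or 0 (mean 20) vs. a fixed 20.
print(risky_value([40, 0]))    # 25.0 > 20   -> risk seeking for gains
# Losses: risky option costs -40 or 0 (mean -20) vs. a fixed -20.
print(risky_value([-40, 0]))   # -25.0 < -20 -> risk aversion for losses
```

Under equal weighting, the risky and fixed options would be worth the same on average; the extra weight on the extreme outcome pushes the risky option above the fixed one for gains (25 vs. 20) but below it for losses (-25 vs. -20), reproducing the asymmetry the abstract reports.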
Psychopharmacology | 2008
Fuat Balcı; Elliot Andrew Ludvig; Jacqueline M. Gibson; Brian D. Allen; Krystal M. Frank; Bryan J. Kapustinski; Thomas E. Fedolak; Daniela Brunner
Rationale: Timing deficits are characteristic of developmental and neurodegenerative disorders that are accompanied by cognitive impairment. A prominent theory of this interval timing posits an internal clock whose pace is modulated by the neurotransmitter dopamine. Objectives: We tested two hypotheses about the pharmacology of interval timing in mice: (1) that general cognitive enhancers should increase, and cognitive disruptors should decrease, temporal precision and (2) that acutely elevated dopamine should speed this internal clock and produce overestimation of elapsing time. Materials and methods: C3H mice were tested in the peak procedure, a timing task, following acute administration of two putative cognitive enhancers (atomoxetine and physostigmine), two cognitive disruptors (scopolamine and chlordiazepoxide [CDP]), or two dopamine agonists (d-amphetamine and methamphetamine). Results: The first hypothesis received strong support: temporal precision worsened with both cognitive disruptors but improved with both cognitive enhancers. The two dopamine agonists, however, produced underestimation of elapsing time, congruent with the slowing of an internal clock and inconsistent with a dopamine-driven clock. Conclusion: Our results suggest that interval timing has potential as an assay for generalized cognitive performance and that the dopamine-clock hypothesis needs further refinement.
Behavioral Neuroscience | 2008
E. James Kehoe; Elliot Andrew Ludvig; Joanne E. Dudeney; James Neufeld; Richard S. Sutton
A trial-by-trial, subject-by-subject analysis was conducted to determine whether generation of the conditioned response (CR) occurs on a continuous or all-or-none basis. Three groups of rabbits were trained on different partial reinforcement schedules with the conditioned stimulus presented alone on 10%, 30%, or 50%, respectively, of all trials. Plots of each rabbit's nictitating membrane movements revealed that their magnitude rose in a continuous fashion. Response growth during acquisition followed a sigmoidal curve, and the timing of CR-sized movements was largely stable throughout the experiment. The results are discussed with respect to alternative models of CR generation.
Journal of the Experimental Analysis of Behavior | 2016
Margaret A. McDevitt; Roger Dunn; Marcia L. Spetch; Elliot Andrew Ludvig
Pigeons and other animals sometimes deviate from optimal choice behavior when given informative signals for delayed outcomes. For example, when pigeons are given a choice between an alternative that always leads to food after a delay and an alternative that leads to food only half of the time after a delay, preference changes dramatically depending on whether the stimuli during the delays are correlated with (signal) the outcomes or not. With signaled outcomes, pigeons show a much greater preference for the suboptimal alternative than with unsignaled outcomes. Key variables and research findings related to this phenomenon are reviewed, including the effects of durations of the choice and delay periods, probability of reinforcement, and gaps in the signal. We interpret the available evidence as reflecting a preference induced by signals for good news in a context of uncertainty. Other explanations are briefly summarized and compared.