Publication


Featured research published by Carlos Diuk.


Neuron | 2011

A Neural Signature of Hierarchical Reinforcement Learning

José Ribas-Fernandes; Alec Solway; Carlos Diuk; Joseph McGuire; Andrew G. Barto; Yael Niv; Matthew Botvinick

Human behavior displays hierarchical structure: simple actions cohere into subtask sequences, which work together to accomplish overall task goals. Although the neural substrates of such hierarchy have been the target of increasing research, they remain poorly understood. We propose that the computations supporting hierarchical behavior may relate to those in hierarchical reinforcement learning (HRL), a machine-learning framework that extends reinforcement-learning mechanisms into hierarchical domains. To test this, we leveraged a distinctive prediction arising from HRL. In ordinary reinforcement learning, reward prediction errors are computed when there is an unanticipated change in the prospects for accomplishing overall task goals. HRL entails that prediction errors should also occur in relation to task subgoals. In three neuroimaging studies we observed neural responses consistent with such subgoal-related reward prediction errors, within structures previously implicated in reinforcement learning. The results reported support the relevance of HRL to the neural processes underlying hierarchical behavior.
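
To make the prediction concrete: in standard temporal-difference learning the reward prediction error compares the reward received plus the value of the new state against the value of the old state, whereas an HRL scheme adds a second error signal tied to the currently active subtask and its pseudo-reward. A schematic rendering (the notation below is mine, not the paper's):

    \delta^{\text{task}}_t = r_{t+1} + \gamma V(s_{t+1}) - V(s_t)
    \delta^{\text{subgoal}}_t = \tilde{r}_{t+1} + \gamma V_o(s_{t+1}) - V_o(s_t)

Here \tilde{r} is the pseudo-reward delivered when the active subtask o reaches its subgoal and V_o is that subtask's own value function, so the second error can fire at a subgoal even when nothing has changed about the prospects for overall task reward.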


International Conference on Machine Learning | 2008

An object-oriented representation for efficient reinforcement learning

Carlos Diuk; Andre Cohen; Michael L. Littman

Rich representations in reinforcement learning have been studied for the purpose of enabling generalization and making learning feasible in large state spaces. We introduce Object-Oriented MDPs (OO-MDPs), a representation based on objects and their interactions, which is a natural way of modeling environments and offers important generalization opportunities. We introduce a learning algorithm for deterministic OO-MDPs and prove a polynomial bound on its sample complexity. We illustrate the performance gains of our representation and algorithm in the well-known Taxi domain, plus a real-life videogame.
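
To make the flavor of the representation concrete, here is a minimal, hypothetical sketch of a Taxi-like state written as objects with attributes, a relation between objects, and a deterministic condition-effect rule (names such as Obj, touch_north, and move_north are my own illustration, not the paper's formalism or code):

    # Objects carry a class label and a dictionary of attributes.
    from dataclasses import dataclass, field

    @dataclass
    class Obj:
        cls: str                                   # e.g. "taxi", "passenger", "wall"
        attrs: dict = field(default_factory=dict)  # attribute name -> value

    def touch_north(o1: Obj, o2: Obj) -> bool:
        """Relation: o2 sits directly north of o1."""
        return o1.attrs["x"] == o2.attrs["x"] and o1.attrs["y"] + 1 == o2.attrs["y"]

    # A state is just a collection of objects.
    state = [
        Obj("taxi", {"x": 2, "y": 1}),
        Obj("passenger", {"x": 0, "y": 3, "in_taxi": False}),
        Obj("wall", {"x": 2, "y": 2}),
    ]

    # A deterministic transition rule: a condition over relations, then an effect.
    def move_north(state):
        taxi = next(o for o in state if o.cls == "taxi")
        blocked = any(o.cls == "wall" and touch_north(taxi, o) for o in state)
        if not blocked:              # condition
            taxi.attrs["y"] += 1     # effect on a single attribute
        return state

    move_north(state)
    print(next(o for o in state if o.cls == "taxi").attrs)  # unchanged: a wall is directly north

The generalization opportunity comes from stating the rule over relations between objects rather than absolute coordinates, so the same condition-effect rule applies wherever the relation holds.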


International Conference on Machine Learning | 2009

The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning

Carlos Diuk; Lihong Li; Bethany R. Leffler

The purpose of this paper is three-fold. First, we formalize and study a problem of learning probabilistic concepts in the recently proposed KWIK framework. We give details of an algorithm, known as the Adaptive k-Meteorologists Algorithm, analyze its sample-complexity upper bound, and give a matching lower bound. Second, this algorithm is used to create a new reinforcement-learning algorithm for factored-state problems that enjoys significant improvement over the previous state-of-the-art algorithm. Finally, we apply the Adaptive k-Meteorologists Algorithm to remove a limiting assumption in an existing reinforcement-learning algorithm. The effectiveness of our approaches is demonstrated empirically in a couple of benchmark domains as well as a robotics navigation problem.
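
As a rough illustration of the setting only (a hedged sketch: the class name, the agreement threshold, and the elimination rule are my own and do not reproduce the paper's algorithm or its sample-complexity analysis), each "meteorologist" is a candidate probabilistic predictor; the learner answers only when the surviving candidates agree, and uses observed outcomes to score and discard poor candidates:

    class AdaptiveKMeteorologists:
        def __init__(self, k, eps=0.1, tolerance=5.0):
            self.eps = eps                  # agreement threshold for answering
            self.tolerance = tolerance      # score gap that eliminates a candidate
            self.alive = set(range(k))
            self.sq_error = [0.0] * k       # cumulative squared error per candidate

        def predict(self, forecasts):
            """forecasts[i] is candidate i's probability of the outcome; return a
            probability, or None meaning "I don't know" (the KWIK response)."""
            ps = [forecasts[i] for i in self.alive]
            if max(ps) - min(ps) <= self.eps:
                return sum(ps) / len(ps)    # candidates agree: safe to answer
            return None                     # disagreement: ask for the true outcome

        def observe(self, forecasts, outcome):
            """Score candidates against the observed outcome (0 or 1) and drop
            any that have fallen far behind the current best."""
            for i in self.alive:
                self.sq_error[i] += (forecasts[i] - outcome) ** 2
            best = min(self.sq_error[i] for i in self.alive)
            self.alive = {i for i in self.alive
                          if self.sq_error[i] - best <= self.tolerance}

In the reinforcement-learning application described in the abstract, each candidate corresponds to a hypothesized set of inputs for predicting part of the next state, so discarding meteorologists doubles as structure learning and feature selection.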


The Journal of Neuroscience | 2013

Hierarchical Learning Induces Two Simultaneous, But Separable, Prediction Errors in Human Basal Ganglia

Carlos Diuk; Karin Tsai; Jonathan D. Wallis; Matthew Botvinick; Yael Niv

Studies suggest that dopaminergic neurons report a unitary, global reward prediction error signal. However, learning in complex real-life tasks, in particular tasks that show hierarchical structure, requires multiple prediction errors that may coincide in time. We used functional neuroimaging to measure prediction error signals in humans performing such a hierarchical task involving simultaneous, uncorrelated prediction errors. Analysis of signals in a priori anatomical regions of interest in the ventral striatum and the ventral tegmental area indeed evidenced two simultaneous, but separable, prediction error signals corresponding to the two levels of hierarchy in the task. This result suggests that suitably designed tasks may reveal a more intricate pattern of firing in dopaminergic neurons. Moreover, the need for downstream separation of these signals implies possible limitations on the number of different task levels that we can learn about simultaneously.
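
As a toy numerical illustration of a single moment carrying two distinct prediction errors (the states, the fixed value tables, and the pseudo-reward convention below are assumptions of mine, not the study's task or analysis code):

    gamma = 1.0
    V_task = {"start": 0.5, "subgoal_reached": 0.8, "goal": 1.0}  # top-level values
    V_sub = {"start": 0.6}   # the active subtask's values; taken as zero once it terminates

    def prediction_errors(s, s_next, reward, pseudo_reward):
        delta_task = reward + gamma * V_task[s_next] - V_task[s]
        delta_sub = pseudo_reward + gamma * V_sub.get(s_next, 0.0) - V_sub[s]
        return delta_task, delta_sub

    # Reaching the subgoal: no task-level reward yet, but a pseudo-reward of 1.
    print(prediction_errors("start", "subgoal_reached", reward=0.0, pseudo_reward=1.0))
    # -> roughly (0.3, 0.4): two co-occurring errors driven by different quantities,
    #    which is why the analysis has to separate them downstream.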


PLOS Computational Biology | 2014

Optimal behavioral hierarchy

Alec Solway; Carlos Diuk; Natalia Córdova; Debbie Yee; Andrew G. Barto; Yael Niv; Matthew Botvinick

Human behavior has long been recognized to display hierarchical structure: actions fit together into subtasks, which cohere into extended goal-directed activities. Arranging actions hierarchically has well established benefits, allowing behaviors to be represented efficiently by the brain, and allowing solutions to new tasks to be discovered easily. However, these payoffs depend on the particular way in which actions are organized into a hierarchy, the specific way in which tasks are carved up into subtasks. We provide a mathematical account for what makes some hierarchies better than others, an account that allows an optimal hierarchy to be identified for any set of tasks. We then present results from four behavioral experiments, suggesting that human learners spontaneously discover optimal action hierarchies.
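
The paper develops the formal account itself; as a deliberately loose illustration of the intuition only (the scoring rule, the function name, and the toy tasks below are my own, not the paper's model), one can score a candidate hierarchy by how compactly it lets a whole set of task solutions be described once its subroutines are defined:

    def description_length(hierarchy, task_solutions):
        """Crude MDL-style score: cost of defining each subroutine once, plus the
        number of steps needed to write each solution when any occurrence of a
        subroutine's action sequence counts as a single call."""
        cost = sum(len(body) for body in hierarchy.values())
        for sol in task_solutions:
            i, steps = 0, 0
            while i < len(sol):
                for body in hierarchy.values():
                    if sol[i:i + len(body)] == body:
                        i += len(body)      # one subroutine call covers this span
                        break
                else:
                    i += 1                  # a bare primitive action
                steps += 1
            cost += steps
        return cost

    tasks = [list("abcxy"), list("abcz"), list("wabc")]
    print(description_length({}, tasks))                    # flat: 13
    print(description_length({"get": list("abc")}, tasks))  # with a reusable subtask: 10

Under a score of this kind the best decomposition depends on the entire task set, which reflects the paper's central point: the payoff of a hierarchy hinges on exactly how tasks are carved into subtasks.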


Autonomous Agents and Multi-Agent Systems | 2006

A hierarchical approach to efficient reinforcement learning in deterministic domains

Carlos Diuk; Alexander L. Strehl; Michael L. Littman

Factored representations, model-based learning, and hierarchies are well-studied techniques for improving the learning efficiency of reinforcement-learning algorithms in large-scale state spaces. We bring these three ideas together in a new algorithm. Our algorithm tackles two open problems from the reinforcement-learning literature, and provides a solution to those problems in deterministic domains. First, it shows how models can improve learning speed in the hierarchy-based MaxQ framework without disrupting opportunities for state abstraction. Second, we show how hierarchies can augment existing factored exploration algorithms to achieve not only low sample complexity for learning, but provably efficient planning as well. We illustrate the resulting performance gains in example domains. We prove polynomial bounds on the computational effort needed to attain near optimal performance within the hierarchy.
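
For readers unfamiliar with MaxQ, here is a minimal sketch of the kind of task hierarchy involved, using the classic Taxi decomposition (the dictionary encoding, the termination predicates, and the available_subtasks helper are illustrative assumptions of mine, not the paper's algorithm or code):

    taxi_hierarchy = {
        "Root":     {"children": ["Get", "Put"],
                     "done": lambda s: s["passenger"] == "delivered"},
        "Get":      {"children": ["Navigate", "Pickup"],
                     "done": lambda s: s["passenger"] in ("in_taxi", "delivered")},
        "Put":      {"children": ["Navigate", "Dropoff"],
                     "done": lambda s: s["passenger"] != "in_taxi"},
        "Navigate": {"children": ["North", "South", "East", "West"],
                     "done": lambda s: s["taxi"] == s["target"]},
        # North, South, East, West, Pickup, Dropoff are primitive actions.
    }

    def available_subtasks(task, state):
        """A task may only invoke children that have not terminated; this is one
        way a hierarchy prunes what a learner has to explore."""
        children = taxi_hierarchy[task]["children"]
        return [c for c in children
                if c not in taxi_hierarchy or not taxi_hierarchy[c]["done"](state)]

    state = {"passenger": "waiting", "taxi": (0, 0), "target": (3, 3)}
    print(available_subtasks("Root", state))   # -> ['Get']; Put is not yet applicable

Restricting choices this way is part of what makes hierarchical exploration cheaper than flat exploration; part of the paper's contribution is to combine it with learned factored models so that planning within the hierarchy is provably efficient as well.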


Frontiers in Integrative Neuroscience | 2012

A quantitative philology of introspection

Carlos Diuk; Diego Fernández Slezak; Iván Raskovsky; Mariano Sigman; Guillermo A. Cecchi

The cultural evolution of introspective thought has been recognized to undergo a drastic change during the middle of the first millennium BC. This period, known as the “Axial Age,” saw the birth of religions and philosophies still alive in modern culture, as well as the transition from orality to literacy—which led to the hypothesis of a link between introspection and literacy. Here we set out to examine the evolution of introspection in the Axial Age, studying the cultural record of the Greco-Roman and Judeo-Christian literary traditions. Using a statistical measure of semantic similarity, we identify a single “arrow of time” in the Old and New Testaments of the Bible, and a more complex non-monotonic dynamics in the Greco-Roman tradition reflecting the rise and fall of the respective societies. A comparable analysis of the twentieth century cultural record shows a steady increase in the incidence of introspective topics, punctuated by abrupt declines during and preceding the First and Second World Wars. Our results show that (a) it is possible to devise a consistent metric to quantify the history of a high-level concept such as introspection, cementing the path for a new quantitative philology and (b) to the extent that it is captured in the cultural record, the increased ability of human thought for self-reflection that the Axial Age brought about is still heavily determined by societal contingencies beyond the orality-literacy nexus.
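
As a schematic illustration of the kind of measurement involved (a toy, hedged sketch: the seed words, the bag-of-words cosine, and the two-line "corpus" are mine, and the paper's actual measure of semantic similarity is computed over the full cultural record rather than raw word overlap):

    import math
    from collections import Counter

    SEED = ["think", "feel", "believe", "remember", "imagine", "doubt"]

    def cosine(u, v):
        dot = sum(u[w] * v[w] for w in set(u) | set(v))
        norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
        return dot / norm if norm else 0.0

    def introspection_score(text):
        """Similarity between a text and a small introspection-related vocabulary."""
        return cosine(Counter(text.lower().split()), Counter(SEED))

    corpus_by_period = {
        "early": "the king led the army to the city and took the city",
        "late":  "i think and i feel that i remember the city and doubt my heart",
    }
    for period, text in corpus_by_period.items():
        print(period, round(introspection_score(text), 3))

Scoring texts this way, period by period, is what turns a high-level concept like introspection into a trajectory that can be compared across traditions and centuries.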


Computational and Robotic Models of the Hierarchical Organization of Behavior | 2013

Divide and Conquer: Hierarchical Reinforcement Learning and Task Decomposition in Humans

Carlos Diuk; Anna C. Schapiro; Natalia Córdova; José Ribas-Fernandes; Yael Niv; Matthew Botvinick

The field of computational reinforcement learning (RL) has proved extremely useful in research on human and animal behavior and brain function. However, the simple forms of RL considered in most empirical research do not scale well, making their relevance to complex, real-world behavior unclear. In computational RL, one strategy for addressing the scaling problem is to introduce hierarchical structure, an approach that has intriguing parallels with human behavior. We have begun to investigate the potential relevance of hierarchical RL (HRL) to human and animal behavior and brain function. In the present chapter, we first review two results that show the existence of neural correlates to key predictions from HRL. Then, we focus on one aspect of this work, which deals with the question of how action hierarchies are initially established. Work in HRL suggests that hierarchy learning is accomplished by identifying useful subgoal states, and that this might in turn be accomplished through a structural analysis of the given task domain. We review results from a set of behavioral and neuroimaging experiments, in which we have investigated the relevance of these ideas to human learning and decision making.
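
One common flavor of such structural analysis in the HRL literature (a hedged sketch, not necessarily the chapter's exact method) is to flag bottleneck states of the task's transition graph as candidate subgoals; a doorway connecting two rooms, for instance, scores highest on betweenness centrality in this toy gridworld of my own:

    import networkx as nx

    G = nx.Graph()
    rooms = [[(x, y) for x in range(3) for y in range(3)],     # left room
             [(x, y) for x in range(4, 7) for y in range(3)]]  # right room
    for room in rooms:
        for (x, y) in room:
            for dx, dy in [(1, 0), (0, 1)]:
                if (x + dx, y + dy) in room:
                    G.add_edge((x, y), (x + dx, y + dy))

    doorway = (3, 1)                 # the single passage linking the two rooms
    for y in range(3):
        G.add_edge((2, y), doorway)
        G.add_edge(doorway, (4, y))

    centrality = nx.betweenness_centrality(G)
    print(max(centrality, key=centrality.get))   # -> (3, 1), the doorway

Reaching such a state is useful regardless of which room the eventual goal is in, which is what makes it a natural subgoal around which to build a subtask.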


National Conference on Artificial Intelligence | 2007

Efficient structure learning in factored-state MDPs

Alexander L. Strehl; Carlos Diuk; Michael L. Littman


Uncertainty in Artificial Intelligence | 2009

Exploring compact reinforcement-learning representations with linear regression

Thomas J. Walsh; István Szita; Carlos Diuk; Michael L. Littman

Collaboration


Dive into Carlos Diuk's collaborations.

Top Co-Authors

Yael Niv

Princeton University

Andrew G. Barto

University of Massachusetts Amherst