Publication


Featured research published by Jaldert O. Rombouts.


PLOS Computational Biology | 2015

How attention can create synaptic tags for the learning of working memories in sequential tasks

Jaldert O. Rombouts; Sander M. Bohte; Pieter R. Roelfsema

Intelligence is our ability to learn appropriate responses to new stimuli and situations. Neurons in association cortex are thought to be essential for this ability. During learning, these neurons become tuned to relevant features and start to represent them with persistent activity during memory delays. This learning process is not well understood. Here we develop a biologically plausible learning scheme that explains how trial-and-error learning induces neuronal selectivity and working memory representations for task-relevant information. We propose that the response selection stage sends attentional feedback signals to earlier processing levels, forming synaptic tags at those connections responsible for the stimulus-response mapping. Globally released neuromodulators then interact with tagged synapses to determine their plasticity. The resulting learning rule endows neural networks with the capacity to create new working memory representations of task-relevant information as persistent activity. It is remarkably generic: it explains how association neurons learn to store task-relevant information for linear as well as non-linear stimulus-response mappings, how they become tuned to category boundaries or analog variables, depending on the task demands, and how they learn to integrate probabilistic evidence for perceptual decisions.
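The two-factor rule the abstract describes can be sketched compactly: attentional feedback creates decaying synaptic tags, and a globally broadcast reward-prediction error converts tags into weight changes. The sketch below is a minimal illustration of that idea, not the paper's code; the variable names, decay constant, and learning rate are assumptions, and the full model also includes memory units and an action-selection stage that are omitted here.

    import numpy as np

    def tag_and_modulate(tags, w, pre, feedback, delta, decay=0.8, lr=0.1):
        # Attentional feedback from the response-selection stage marks the
        # synapses that contributed to the chosen response (a Hebbian
        # pre x feedback term); older tags decay geometrically (assumed rate).
        tags = decay * tags + np.outer(feedback, pre)
        # A globally released neuromodulator broadcasts the same scalar
        # reward-prediction error (delta) to every synapse; only tagged
        # synapses change their strength.
        w = w + lr * delta * tags
        return tags, w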


Visual Cognition | 2015

A learning rule that explains how rewards teach attention

Jaldert O. Rombouts; Sander M. Bohte; Julio C. Martinez-Trujillo; Pieter R. Roelfsema

Many theories propose that top-down attentional signals control processing in sensory cortices by modulating neural activity. But who controls the controller? Here we investigate how a biologically plausible neural reinforcement learning scheme can create higher order representations and top-down attentional signals. The learning scheme trains neural networks using two factors that gate Hebbian plasticity: (1) an attentional feedback signal from the response-selection stage to earlier processing levels; and (2) a globally available neuromodulator that encodes the reward prediction error. We demonstrate how the neural network learns to direct attention to one of two coloured stimuli that are arranged in a rank-order. Like monkeys trained on this task, the network develops units that are tuned to the rank-order of the colours and it generalizes this newly learned rule to previously unseen colour combinations. These results provide new insight into how individuals can learn to control attention as a function of reward contingency.
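How the "controller" itself could be implemented can be illustrated with the response-selection step: only the winning output unit sends attentional feedback, so the first plasticity factor is automatically confined to the pathway that produced the action. This is a hypothetical sketch with invented names, not the paper's implementation:

    import numpy as np

    def select_and_gate(q_values, w_out, hidden, epsilon=0.05):
        # Epsilon-greedy selection over the action-value (output) units.
        if np.random.rand() < epsilon:
            action = np.random.randint(len(q_values))
        else:
            action = int(np.argmax(q_values))
        # Only the selected unit propagates attentional feedback to the
        # hidden layer, here assumed to travel through the same weights
        # that the unit reads out from.
        feedback = w_out[action] * hidden
        return action, feedback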


International Conference on Artificial Neural Networks | 2012

Biologically plausible multi-dimensional reinforcement learning in neural networks

Jaldert O. Rombouts; Arjen van Ooyen; Pieter R. Roelfsema; Sander M. Bohte

How does the brain learn to map multi-dimensional sensory inputs to multi-dimensional motor outputs when it can only observe single rewards for the coordinated outputs of the whole network of neurons that make up the brain? We introduce Multi-AGREL, a novel, biologically plausible multi-layer neural network model for multi-dimensional reinforcement learning. We demonstrate that Multi-AGREL can learn non-linear mappings from inputs to multi-dimensional outputs by using only scalar reward feedback. We further show that in Multi-AGREL, the changes in the connection weights follow the gradient that minimizes global prediction error, and that all information required for synaptic plasticity is locally present.
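The claim that a single scalar reward suffices for multi-dimensional outputs can be made concrete: each output dimension selects one unit, the selected units' predictions are combined into a joint action-value, and the single resulting error is all the global information the synapses receive. A toy sketch, with the additive combination rule being an assumption:

    def joint_prediction_error(reward, q_per_dim, selected):
        # q_per_dim[d] holds the value predictions of output dimension d;
        # selected[d] is the index of the unit chosen in that dimension.
        joint_q = sum(q_per_dim[d][a] for d, a in enumerate(selected))
        # One scalar prediction error for the whole coordinated action.
        return reward - joint_q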


BMC Neuroscience | 2013

Biologically plausible reinforcement learning of continuous actions

Jaldert O. Rombouts; Pieter R. Roelfsema; Sander M. Bohte

Humans and animals have the ability to perform very precise movements to obtain rewards. For instance, it is no problem at all to pick up a mug of coffee from your desk while you are working. Unfortunately, it is unknown how exactly the non-linear mapping between sensory inputs (e.g. your mug on the retina) and the correct motor actions (e.g. a set of joint angles) is learned by the brain. Here we show how a biologically plausible learning scheme can learn to perform non-linear transformations from sensory inputs to continuous actions based on reinforcement learning. To arrive at our novel scheme, we built on the idea of attention-gated reinforcement learning (AGREL) [1], a biologically plausible learning scheme that explains how networks of neurons can learn to perform non-linear transformations from sensory inputs to discrete actions (e.g. pressing a button) based on reinforcement learning [2]. We recently showed that the AGREL learning scheme can be generalized to perform multiple simultaneous discrete actions [3], and we now show how this scheme can be further generalized to continuous action spaces. The key idea is that motor areas have feedback connections to earlier processing layers, which inform the network about the selected action. Synaptic plasticity is constrained to those synapses that were involved in the decision, and it follows a simple Hebbian rule which is gated by a globally available neuromodulatory signal that codes reward prediction errors. In our novel scheme, motor units are situated in a population coding layer that encodes the outcome of the decision process as a bump of activations [4]. This contrasts with our earlier work, where single motor units code for actions [1,3]. We show that the synaptic updates perform stochastic gradient descent on the prediction error that results from the combined action-value prediction of all the motor units that encoded the decision. Unlike other reinforcement-learning-based approaches, e.g. [5], our reinforcement learning rule is powerful enough to learn tasks that require non-linear transformations. The distribution of population centers in the motor layer can also be automatically adapted to task demands, yielding more representational power when actions need to be precise. We show that the novel scheme can learn to perform non-linear transformations from sensory inputs to motor outputs in a variety of direct reward tasks. The model can explain how visuomotor coordinate transforms might be learned by reinforcement learning instead of semi-supervised learning as used in [6]. It might also explain how humans learn to weigh the accuracy of their movement against the potential rewards and punishments for making inaccurate movements, as in the visually guided movement task described in [7].
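The population-coded motor layer lends itself to a standard population-vector readout, in which the continuous action is the activity-weighted average of the motor units' preferred values. The following toy sketch illustrates the idea; the unit centres and bump width are made up, and the paper's decoding stage may differ:

    import numpy as np

    def population_decode(activity, centers):
        # Read the continuous action out of the bump of activation as the
        # activity-weighted mean of the units' preferred action values.
        return (activity / activity.sum()) @ centers

    centers = np.linspace(0.0, 1.0, 11)                      # preferred values of 11 motor units
    activity = np.exp(-0.5 * ((centers - 0.62) / 0.1) ** 2)  # bump of activation near 0.62
    print(population_decode(activity, centers))              # decodes to roughly 0.62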


BMC Neuroscience | 2011

How attention and reinforcers jointly optimize the associations between sensory representations, working memory and motor programs

Jaldert O. Rombouts; Sander M. Bohte; Pieter R. Roelfsema

Almost all animal behaviors can be seen as sequences of actions towards achieving certain goals. How the association cortices learn to link sensory stimuli to a correct sequence of motor responses is not well understood, especially when only a correct sequence of responses is rewarded. We present a biologically plausible neuronal network model that can be trained to perform a large variety of tasks when only stimuli and reward contingencies are varied. The model’s aim is to learn action values in a feedforward neuronal network, and we present mechanisms to overcome the structural and temporal credit assignment problems. The temporal credit assignment problem is solved by a form of Q-learning [1]. The structural credit assignment problem is solved by a form of ‘attentional’ feedback from motor cortex to association cortex that delineates the units that should change connectivity to improve behavior [2]. Moreover, the model has a new mechanism to store traces of relevant sensory stimuli in working memory. During learning, the sensory stimuli, in combination with traces of previous stimuli in working memory, become associated with a unique set of action values. Learning in the model is biologically realistic as model units have Hebbian plasticity that is gated by two factors [2]. Firstly, reinforcers or increases in reward expectancy cause the global release of neuromodulatory signals that inform all synapses of the network whether the outcome of a trial was better or worse than expected [3]. Selective attention is the second factor that gates plasticity. Attentional feedback highlights the chain of neurons between sensory and motor cortex responsible for the selected action. Only neurons that are causally linked to the action receive attentional feedback and change the strength of their connections. Selective attention thereby solves the structural credit assignment problem. The resulting learning rule is a form of AGREL [2], which was previously shown to be on average equivalent to error-backpropagation in classification tasks with immediate reward. The present generalization of the learning scheme is based on temporal difference learning, and it can train multilayer feedforward networks to perform delayed reward tasks with multiple epochs that require multiple behavioral responses. Importantly, the generalization MQ-AGREL learns to store in working memory information that is relevant at a later stage during a task. This memory is maintained by persistent activity of units at the intermediate network layers. We show that MQ-AGREL can be trained on many tasks that are used in neurophysiology, including (1) (delayed) saccade-antisaccade tasks; (2) categorization tasks; and (3) probabilistic classification tasks. As a result of training, neurons at intermediate levels of the network acquire visual and memory responses that resemble the tuning of neurons in association areas of the cerebral cortex of animals trained on these same tasks. We conclude that MQ-AGREL is a powerful and biologically realistic learning rule that accounts for learning in delayed reward tasks that involve non-linear mappings from sensory stimuli and working memory onto motor responses.
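The Q-learning component that solves the temporal credit assignment problem reduces to a temporal-difference error, which can then be broadcast as the neuromodulatory factor. A one-line sketch; the SARSA-style form and the discount value are assumptions, not necessarily the exact update used in the abstract:

    def td_error(reward, q_next, q_current, gamma=0.9):
        # Difference between the new and old estimates of the action value;
        # positive when the outcome was better than expected. This scalar
        # plays the role of the globally released neuromodulatory signal.
        return reward + gamma * q_next - q_current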


Neural Information Processing Systems | 2012

Neurally Plausible Reinforcement Learning of Working Memory Tasks

Jaldert O. Rombouts; Pieter R. Roelfsema; Sander M. Bohte


Neural Information Processing Systems | 2010

Fractionally Predictive Spiking Neurons

Jaldert O. Rombouts; Sander M. Bohte


European Symposium on Artificial Neural Networks | 2014

Learning resets of neural working memory

Jaldert O. Rombouts; Pieter R. Roelfsema; Sander M. Bohte


ERCIM News | 2012

Learning to Recall

Jaldert O. Rombouts; Pieter R. Roelfsema; Sander M. Bohte
