
Publications


Featured research published by Andrei A. Rusu.


Nature | 2015

Human-level control through deep reinforcement learning

Volodymyr Mnih; Koray Kavukcuoglu; David Silver; Andrei A. Rusu; Joel Veness; Marc G. Bellemare; Alex Graves; Martin A. Riedmiller; Andreas K. Fidjeland; Georg Ostrovski; Stig Petersen; Charles Beattie; Amir Sadik; Ioannis Antonoglou; Helen King; Dharshan Kumaran; Daan Wierstra; Shane Legg; Demis Hassabis

The theory of reinforcement learning provides a normative account, deeply rooted in psychological and neuroscientific perspectives on animal behaviour, of how agents may optimize their control of an environment. To use reinforcement learning successfully in situations approaching real-world complexity, however, agents are confronted with a difficult task: they must derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations. Remarkably, humans and other animals seem to solve this problem through a harmonious combination of reinforcement learning and hierarchical sensory processing systems, the former evidenced by a wealth of neural data revealing notable parallels between the phasic signals emitted by dopaminergic neurons and temporal difference reinforcement learning algorithms. While reinforcement learning agents have achieved some successes in a variety of domains, their applicability has previously been limited to domains in which useful features can be handcrafted, or to domains with fully observed, low-dimensional state spaces. Here we use recent advances in training deep neural networks to develop a novel artificial agent, termed a deep Q-network, that can learn successful policies directly from high-dimensional sensory inputs using end-to-end reinforcement learning. We tested this agent on the challenging domain of classic Atari 2600 games. We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters. This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
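
For orientation, below is a minimal tabular sketch of the learning rule at the heart of the deep Q-network. The real agent uses a convolutional network over raw Atari pixels; only the key ingredients are kept here (experience replay, a periodically frozen target network, and the temporal-difference target), and every size and hyperparameter is illustrative, not the paper's.

```python
import numpy as np

# Tabular stand-in for the DQN learning rule; all sizes are illustrative.
rng = np.random.default_rng(0)
n_states, n_actions, gamma, lr = 16, 4, 0.99, 0.1

Q = np.zeros((n_states, n_actions))   # online Q-values
Q_target = Q.copy()                   # frozen target copy

# Toy replay buffer of (state, action, reward, next_state) transitions.
replay = [(rng.integers(n_states), rng.integers(n_actions),
           rng.standard_normal(), rng.integers(n_states))
          for _ in range(1000)]

for step in range(5000):
    s, a, r, s2 = replay[rng.integers(len(replay))]   # sample from replay
    y = r + gamma * Q_target[s2].max()                # TD target from frozen net
    Q[s, a] += lr * (y - Q[s, a])                     # step on (y - Q(s, a))^2
    if step % 500 == 0:
        Q_target = Q.copy()                           # periodic target sync
```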


Proceedings of the National Academy of Sciences of the United States of America | 2017

Overcoming catastrophic forgetting in neural networks

James Kirkpatrick; Razvan Pascanu; Neil C. Rabinowitz; Joel Veness; Guillaume Desjardins; Andrei A. Rusu; Kieran Milan; John Quan; Tiago Ramalho; Agnieszka Grabska-Barwinska; Demis Hassabis; Claudia Clopath; Dharshan Kumaran; Raia Hadsell

Significance: Deep neural networks are currently the most successful machine-learning technique for solving a variety of tasks, including language translation, image classification, and image generation. One weakness of such models is that, unlike humans, they are unable to learn multiple tasks sequentially. In this work we propose a practical solution for training such models sequentially by protecting the weights important for previous tasks. This approach, inspired by synaptic consolidation in neuroscience, enables state-of-the-art results on multiple reinforcement learning problems experienced sequentially.

The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Until now neural networks have not been capable of this, and it has been widely thought that catastrophic forgetting is an inevitable feature of connectionist models. We show that it is possible to overcome this limitation and train networks that can maintain expertise on tasks that they have not experienced for a long time. Our approach remembers old tasks by selectively slowing down learning on the weights important for those tasks. We demonstrate that our approach is scalable and effective by solving a set of classification tasks based on a hand-written digit dataset and by learning several Atari 2600 games sequentially.
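
The "selective slowing" described above takes the form of a quadratic penalty on parameter changes. A sketch of the loss when learning a task B after a task A, following the paper's published formulation:

```latex
% EWC loss when learning task B after task A (elastic weight consolidation):
%   L_B     : loss on the new task B
%   F_i     : diagonal Fisher information for parameter i, estimated on task A
%   theta*_A,i : parameter value learned for task A
%   lambda  : how strongly old tasks are protected
\mathcal{L}(\theta) = \mathcal{L}_B(\theta)
    + \sum_i \frac{\lambda}{2}\, F_i \left(\theta_i - \theta^{*}_{A,i}\right)^2
```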


International Conference on Evolvable Systems | 2010

HyperNEAT for locomotion control in modular robots

Evert Haasdijk; Andrei A. Rusu; A. E. Eiben

In an application where autonomous robots can amalgamate spontaneously into arbitrary organisms, the individual robots cannot know a priori at which location in an organism they will end up. If the organism is to be controlled autonomously by the constituent robots, an evolutionary algorithm that evolves the controllers can only develop a single genome that will have to suffice for every individual robot. However, the robots should show different behaviour depending on their position in an organism, meaning their phenotype should be different depending on their location. In this paper, we demonstrate a solution for this problem using the HyperNEAT generative encoding technique with differentiated genome expression. We develop controllers for organism locomotion with obstacle avoidance as a proof of concept. Finally, we identify promising directions for further research.
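
As a rough illustration of the idea of a single genome expressed differently by location: in HyperNEAT, weights are produced by an evolved CPPN queried with substrate coordinates, and here the module's position in the organism is added as an extra input so that one shared genome yields a different controller per module. The function body and all sizes below are invented for illustration, not the evolved network from the paper.

```python
import math

def cppn(x1, y1, x2, y2, module_pos):
    """Toy stand-in for an evolved CPPN: maps the substrate coordinates of a
    source/target neuron pair, plus the module's position in the organism,
    to a connection weight. HyperNEAT evolves this function's structure."""
    return math.sin(3 * (x1 - x2)) * math.cos(2 * (y1 - y2) + module_pos)

def build_controller(module_pos, n_in=4, n_out=2):
    """One shared genome (the CPPN) expresses a different weight matrix for
    each module, depending on where that module sits in the organism."""
    inputs = [(i / (n_in - 1), 0.0) for i in range(n_in)]     # substrate row 0
    outputs = [(j / (n_out - 1), 1.0) for j in range(n_out)]  # substrate row 1
    return [[cppn(xi, yi, xo, yo, module_pos)
             for (xi, yi) in inputs] for (xo, yo) in outputs]

# Same genome, different phenotypes: controllers differ by module location.
front = build_controller(module_pos=0.0)
rear = build_controller(module_pos=1.0)
```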


Science | 2018

Neural scene representation and rendering

S. M. Ali Eslami; Danilo Jimenez Rezende; Frederic Besse; Fabio Viola; Ari S. Morcos; Marta Garnelo; Avraham Ruderman; Andrei A. Rusu; Ivo Danihelka; Karol Gregor; David P. Reichert; Lars Buesing; Theophane Weber; Oriol Vinyals; Dan Rosenbaum; Neil C. Rabinowitz; Helen King; Chloe Hillier; Matt Botvinick; Daan Wierstra; Koray Kavukcuoglu; Demis Hassabis

A scene-internalizing computer program: To train a computer to “recognize” elements of a scene supplied by its visual sensors, computer scientists typically use millions of images painstakingly labeled by humans. Eslami et al. developed an artificial vision system, dubbed the Generative Query Network (GQN), that has no need for such labeled data. Instead, the GQN first uses images taken from different viewpoints to create an abstract description of the scene, learning its essentials. Next, on the basis of this representation, the network predicts what the scene would look like from a new, arbitrary viewpoint. In short, a computer vision system predicts how a 3D scene looks from any viewpoint after just a few 2D views from other viewpoints.

Scene representation—the process of converting visual sensory data into concise descriptions—is a requirement for intelligent behavior. Recent work has shown that neural networks excel at this task when provided with large, labeled datasets. However, removing the reliance on human labeling remains an important open problem. To this end, we introduce the Generative Query Network (GQN), a framework within which machines learn to represent scenes using only their own sensors. The GQN takes as input images of a scene taken from different viewpoints, constructs an internal representation, and uses this representation to predict the appearance of that scene from previously unobserved viewpoints. The GQN demonstrates representation learning without human labels or domain knowledge, paving the way toward machines that autonomously learn to understand the world around them.
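
A minimal sketch of the GQN's data flow, under stated assumptions: the paper's convolutional representation network and recurrent latent renderer are replaced here by fixed random linear maps, and all dimensions (including the 7-number viewpoint encoding) are illustrative. Only the summed, order-invariant aggregation of context views and the conditioning of the generator on (representation, query viewpoint) reflect the described design.

```python
import numpy as np

rng = np.random.default_rng(0)
IMG, VP, R = 8 * 8 * 3, 7, 64   # image pixels, viewpoint dims, representation size
W_enc = rng.standard_normal((R, IMG + VP)) * 0.01   # stand-in encoder weights
W_dec = rng.standard_normal((IMG, R + VP)) * 0.01   # stand-in decoder weights

def represent(image, viewpoint):
    """Stand-in for the representation network (a conv encoder in the paper)."""
    return W_enc @ np.concatenate([image.ravel(), viewpoint])

def gqn_forward(context, query_viewpoint):
    # Context representations are summed, so the aggregation is invariant to
    # the number and order of observed views.
    r = sum(represent(img, vp) for img, vp in context)
    # The paper's generator is a recurrent latent renderer; a single linear
    # map stands in for it here, conditioned on (r, query viewpoint).
    return (W_dec @ np.concatenate([r, query_viewpoint])).reshape(8, 8, 3)

context = [(rng.random((8, 8, 3)), rng.standard_normal(VP)) for _ in range(3)]
predicted = gqn_forward(context, query_viewpoint=rng.standard_normal(VP))
```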


Proceedings of the National Academy of Sciences of the United States of America | 2018

Reply to Huszár: The elastic weight consolidation penalty is empirically valid

James Kirkpatrick; Razvan Pascanu; Neil C. Rabinowitz; Joel Veness; Guillaume Desjardins; Andrei A. Rusu; Kieran Milan; John Quan; Tiago Ramalho; Agnieszka Grabska-Barwinska; Demis Hassabis; Claudia Clopath; Dharshan Kumaran; Raia Hadsell

In our recent work on elastic weight consolidation (EWC) (1) we show that forgetting in neural networks can be alleviated by using a quadratic penalty whose derivation was inspired by Bayesian evidence accumulation. In his letter (2), Dr. Huszár provides an alternative form for this penalty by following the standard work on expectation propagation using the Laplace approximation (3). He correctly argues that in cases when more than two tasks are undertaken, the two forms of the penalty are different. Dr. Huszár also shows that for a toy linear regression problem his expression appears to be better. We would like to thank Dr. Huszár for pointing out …
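
For context, the disagreement only surfaces with three or more tasks. A sketch of the two penalty forms at issue, with notation and subscripts mine rather than either paper's:

```latex
% EWC as published: one quadratic anchor per previous task t < T
\mathcal{L}(\theta) = \mathcal{L}_T(\theta)
    + \sum_{t<T} \frac{\lambda}{2} \sum_i F_{t,i}\,\left(\theta_i - \theta^{*}_{t,i}\right)^2

% Laplace-propagation form argued for in the letter: a single anchor at the
% most recent solution, with the Fisher terms accumulated across tasks
\mathcal{L}(\theta) = \mathcal{L}_T(\theta)
    + \frac{\lambda}{2} \sum_i \Big(\sum_{t<T} F_{t,i}\Big)\left(\theta_i - \theta^{*}_{T-1,i}\right)^2
```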


Cerebral Cortex | 2014

Imagine All the People: How the Brain Creates and Uses Personality Models to Predict Behavior

Demis Hassabis; R. Nathan Spreng; Andrei A. Rusu; Clifford A. Robbins; Raymond A. Mar; Daniel L. Schacter


arXiv: Neural and Evolutionary Computing | 2017

PathNet: Evolution Channels Gradient Descent in Super Neural Networks

Chrisantha Fernando; Dylan Banarse; Charles Blundell; Yori Zwols; David Ha; Andrei A. Rusu; Alexander Pritzel; Daan Wierstra


arXiv: Robotics | 2016

Sim-to-Real Robot Learning from Pixels with Progressive Nets

Andrei A. Rusu; Matej Vecerik; Thomas Rothörl; Nicolas Heess; Razvan Pascanu; Raia Hadsell



International Conference on Machine Learning | 2017

DARLA: Improving Zero-Shot Transfer in Reinforcement Learning

Irina Higgins; Arka Pal; Andrei A. Rusu; Loic Matthey; Christopher P. Burgess; Alexander Pritzel; Matthew Botvinick; Charles Blundell; Alexander Lerchner

Collaboration


Dive into Andrei A. Rusu's collaborations.

Top Co-Authors


Daan Wierstra

Dalle Molle Institute for Artificial Intelligence Research


A. E. Eiben

VU University Amsterdam
