Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Guillaume Desjardins is active.

Publication


Featured research published by Guillaume Desjardins.


Proceedings of the National Academy of Sciences of the United States of America | 2017

Overcoming catastrophic forgetting in neural networks

James Kirkpatrick; Razvan Pascanu; Neil C. Rabinowitz; Joel Veness; Guillaume Desjardins; Andrei A. Rusu; Kieran Milan; John Quan; Tiago Ramalho; Agnieszka Grabska-Barwinska; Demis Hassabis; Claudia Clopath; Dharshan Kumaran; Raia Hadsell

Significance Deep neural networks are currently the most successful machine-learning technique for solving a variety of tasks, including language translation, image classification, and image generation. One weakness of such models is that, unlike humans, they are unable to learn multiple tasks sequentially. In this work we propose a practical solution to train such models sequentially by protecting the weights important for previous tasks. This approach, inspired by synaptic consolidation in neuroscience, enables state of the art results on multiple reinforcement learning problems experienced sequentially. The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Until now neural networks have not been capable of this and it has been widely thought that catastrophic forgetting is an inevitable feature of connectionist models. We show that it is possible to overcome this limitation and train networks that can maintain expertise on tasks that they have not experienced for a long time. Our approach remembers old tasks by selectively slowing down learning on the weights important for those tasks. We demonstrate our approach is scalable and effective by solving a set of classification tasks based on a hand-written digit dataset and by learning several Atari 2600 games sequentially.
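
As a rough illustration of the idea (not the authors' exact implementation; the function and variable names below are ours), the consolidation term can be written as a quadratic penalty that weights each parameter's deviation from its post-task value by a per-parameter importance estimate, such as a diagonal Fisher information:

    import numpy as np

    def ewc_penalty(theta, theta_old, importance, lam=1.0):
        """Quadratic consolidation penalty: slows learning on parameters
        deemed important for a previously learned task.

        theta      -- current parameter vector
        theta_old  -- parameters after training on the previous task
        importance -- per-parameter importance weights, e.g. a diagonal
                      Fisher information estimate
        lam        -- strength of the consolidation term
        """
        return 0.5 * lam * np.sum(importance * (theta - theta_old) ** 2)

    # Training on a new task then minimizes
    #   new_task_loss(theta) + ewc_penalty(theta, theta_old, importance)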


International Conference on Multimodal Interfaces | 2013

Combining modality specific deep neural networks for emotion recognition in video

Samira Ebrahimi Kahou; Chris Pal; Xavier Bouthillier; Pierre Froumenty; Caglar Gulcehre; Roland Memisevic; Pascal Vincent; Aaron C. Courville; Yoshua Bengio; Raul Chandias Ferrari; Mehdi Mirza; Sébastien Jean; Pierre-Luc Carrier; Yann N. Dauphin; Nicolas Boulanger-Lewandowski; Abhishek Aggarwal; Jeremie Zumer; Pascal Lamblin; Jean-Philippe Raymond; Guillaume Desjardins; Razvan Pascanu; David Warde-Farley; Atousa Torabi; Arjun Sharma; Emmanuel Bengio; Myriam Côté; Kishore Reddy Konda; Zhenzhou Wu

In this paper we present the techniques used for the University of Montréal's team submissions to the 2013 Emotion Recognition in the Wild Challenge. The challenge is to classify the emotions expressed by the primary human subject in short video clips extracted from feature-length movies. This involves the analysis of video clips of acted scenes lasting approximately one to two seconds, including the audio track, which may contain human voices as well as background music. Our approach combines multiple deep neural networks for different data modalities, including: (1) a deep convolutional neural network for the analysis of facial expressions within video frames; (2) a deep belief net to capture audio information; (3) a deep autoencoder to model the spatio-temporal information produced by the human actions depicted within the entire scene; and (4) a shallow network architecture focused on extracted features of the mouth of the primary human subject in the scene. We discuss each of these techniques, their performance characteristics, and different strategies to aggregate their predictions. Our best single model was a convolutional neural network trained to predict emotions from static frames using two large data sets, the Toronto Face Database and our own set of face images harvested from Google image search, followed by a per-frame aggregation strategy that used the challenge training data. This yielded a test set accuracy of 35.58%. Using our best strategy for aggregating our top-performing models into a single predictor we were able to produce an accuracy of 41.03% on the challenge test set. These compare favorably to the challenge baseline test set accuracy of 27.56%.
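
A hedged sketch of the kind of aggregation described above (the exact strategies and weights are tuned on the challenge data and are not reproduced here; all names below are illustrative): per-frame class probabilities from the frame-level model are averaged into a clip-level prediction, and predictions from the modality-specific models are then combined by a weighted average.

    import numpy as np

    def aggregate_frames(frame_probs):
        """Average per-frame class probabilities (n_frames x n_classes)
        into a single clip-level prediction."""
        return np.asarray(frame_probs).mean(axis=0)

    def combine_models(model_probs, weights=None):
        """Weighted average of clip-level predictions from several
        modality-specific models (n_models x n_classes)."""
        model_probs = np.asarray(model_probs)
        if weights is None:
            weights = np.ones(len(model_probs))
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()
        return (weights[:, None] * model_probs).sum(axis=0)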


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2014

The Spike-and-Slab RBM and Extensions to Discrete and Sparse Data Distributions

Aaron C. Courville; Guillaume Desjardins; James Bergstra; Yoshua Bengio

The spike-and-slab restricted Boltzmann machine (ssRBM) is defined to have both a real-valued “slab” variable and a binary “spike” variable associated with each unit in the hidden layer. The model uses its slab variables to model the conditional covariance of the observation, which is thought to be important in capturing the statistical properties of natural images. In this paper, we present the canonical ssRBM framework together with some extensions. These extensions highlight the flexibility of the spike-and-slab RBM as a platform for exploring more sophisticated probabilistic models of high-dimensional data in general and natural image data in particular. Here, we introduce the subspace-ssRBM, focused on the task of learning invariant features. We highlight the behaviour of the ssRBM and its extensions through experiments with the MNIST digit recognition task and the CIFAR-10 object classification task.
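
As a simplified sketch of the structure described above (not necessarily the paper's full parameterization), an energy over real-valued visibles v, binary spikes h_i, and real-valued slabs s_i can be written as:

    E(v, s, h) = -\sum_i h_i s_i \, v^\top W_i
                 + \tfrac{1}{2} v^\top \Lambda v
                 + \tfrac{1}{2} \sum_i \alpha_i s_i^2
                 - \sum_i b_i h_i

Conditioned on h, integrating out the slab variables leaves a Gaussian over v whose covariance depends on which spikes are active; this is the sense in which the slab variables let the model capture conditional covariance.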


Proceedings of the National Academy of Sciences of the United States of America | 2018

Reply to Huszár: The elastic weight consolidation penalty is empirically valid

James Kirkpatrick; Razvan Pascanu; Neil C. Rabinowitz; Joel Veness; Guillaume Desjardins; Andrei A. Rusu; Kieran Milan; John Quan; Tiago Ramalho; Agnieszka Grabska-Barwinska; Demis Hassabis; Claudia Clopath; Dharshan Kumaran; Raia Hadsell

In our recent work on elastic weight consolidation (EWC) (1) we show that forgetting in neural networks can be alleviated by using a quadratic penalty whose derivation was inspired by Bayesian evidence accumulation. In his letter (2), Dr. Huszár provides an alternative form for this penalty by following the standard work on expectation propagation using the Laplace approximation (3). He correctly argues that in cases when more than two tasks are undertaken the two forms of the penalty are different. Dr. Huszár also shows that for a toy linear regression problem his expression appears to be better. We would like to thank Dr. Huszár for pointing out …
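
As a rough sketch of the distinction at issue (our notation, not the letter's): after tasks 1, ..., T-1, the EWC objective as used in (1) anchors a separate quadratic term at each earlier task's solution, whereas the expectation-propagation/Laplace form accumulates the Fisher terms into a single penalty anchored at the latest solution:

    L_{\mathrm{EWC}}(\theta) = L_T(\theta) + \frac{\lambda}{2} \sum_{t<T} \sum_i F_{t,i}\,(\theta_i - \theta^{*}_{t,i})^2

    L_{\mathrm{EP}}(\theta)  = L_T(\theta) + \frac{1}{2} \sum_i \Big( \lambda \sum_{t<T} F_{t,i} \Big)\,(\theta_i - \theta^{*}_{T-1,i})^2

For two tasks the two forms coincide; only with three or more tasks do they differ, which is the case discussed in this exchange.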


Proceedings of the 9th Python in Science Conference | 2010

Theano: A CPU and GPU Math Compiler in Python

James Bergstra; Olivier Breuleux; Frédéric Bastien; Pascal Lamblin; Razvan Pascanu; Guillaume Desjardins; Joseph P. Turian; David Warde-Farley; Yoshua Bengio
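
For context, a minimal sketch of the define-then-compile workflow the title refers to, assuming a working Theano installation (the expression below is arbitrary):

    import theano
    import theano.tensor as T

    # Declare symbolic inputs.
    x = T.dmatrix('x')
    y = T.dmatrix('y')

    # Build a symbolic expression; theano.function compiles it to
    # optimized CPU (or GPU) code.
    z = T.exp(x) + y ** 2
    f = theano.function([x, y], z)

    print(f([[0.0, 1.0]], [[2.0, 3.0]]))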


Archive | 2012

Theano: Deep Learning on GPUs with Python

James Bergstra; Frédéric Bastien; Olivier Breuleux; Pascal Lamblin; Razvan Pascanu; Olivier Delalleau; Guillaume Desjardins; David Warde-Farley; Ian J. Goodfellow; Arnaud Bergeron; Yoshua Bengio


International Conference on Machine Learning | 2012

Unsupervised and Transfer Learning Challenge: a Deep Learning Approach

Grégoire Mesnil; Yann N. Dauphin; Xavier Glorot; Salah Rifai; Yoshua Bengio; Ian J. Goodfellow; Erick Lavoie; Xavier Muller; Guillaume Desjardins; David Warde-Farley; Pascal Vincent; Aaron C. Courville; James Bergstra


International Conference on Artificial Intelligence and Statistics | 2010

Tempered Markov Chain Monte Carlo for training of Restricted Boltzmann Machines

Guillaume Desjardins; Aaron C. Courville; Yoshua Bengio; Pascal Vincent; Olivier Delalleau
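
For context, a hedged sketch of the generic parallel-tempering swap step that tempered MCMC builds on (illustrative only, not the paper's exact training procedure; chain states and energies are kept as plain Python lists):

    import numpy as np

    def swap_probability(energy_i, energy_j, beta_i, beta_j):
        """Metropolis acceptance probability for exchanging the states of
        two chains running at inverse temperatures beta_i and beta_j."""
        return min(1.0, float(np.exp((beta_i - beta_j) * (energy_i - energy_j))))

    def propose_swaps(states, energies, betas, rng=None):
        """Propose swaps between adjacent tempered chains.

        states, energies -- Python lists, one entry per chain
        betas            -- inverse temperatures, ordered coldest first
        """
        rng = np.random.default_rng() if rng is None else rng
        for k in range(len(betas) - 1):
            if rng.random() < swap_probability(energies[k], energies[k + 1],
                                               betas[k], betas[k + 1]):
                states[k], states[k + 1] = states[k + 1], states[k]
                energies[k], energies[k + 1] = energies[k + 1], energies[k]
        return states, energies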


Archive | 2010

Parallel Tempering for Training of Restricted Boltzmann Machines

Guillaume Desjardins; Aaron C. Courville; Yoshua Bengio; Pascal Vincent; Olivier Delalleau


International Symposium/Conference on Music Information Retrieval | 2009

Steerable Playlist Generation by Learning Song Similarity from Radio Station Playlists

François Maillet; Douglas Eck; Guillaume Desjardins; Paul Lamere

Collaboration


Dive into Guillaume Desjardins's collaborations.

Top Co-Authors

Yoshua Bengio
Université de Montréal

James Bergstra
Université de Montréal

Pascal Vincent
Université de Montréal

Pascal Lamblin
Université de Montréal