Yuhuai Wu
University of Toronto
Publications
Featured research published by Yuhuai Wu.
Neural Computation | 2017
Yoshua Bengio; Thomas Mesnard; Asja Fischer; Saizheng Zhang; Yuhuai Wu
We show that Langevin Markov chain Monte Carlo inference in an energy-based model with latent variables has the property that the early steps of inference, starting from a stationary point, correspond to propagating error gradients into internal layers, similar to backpropagation. The backpropagated error is with respect to output units that have received an outside driving force pushing them away from the stationary point. Backpropagated error gradients correspond to temporal derivatives with respect to the activation of hidden units. These lead to a weight update proportional to the product of the presynaptic firing rate and the temporal rate of change of the postsynaptic firing rate. Simulations and a theoretical argument suggest that this rate-based update rule is consistent with those associated with spike-timing-dependent plasticity. The ideas presented in this article could be an element of a theory for explaining how brains perform credit assignment in deep hierarchies as efficiently as backpropagation does, with neural computation corresponding to both approximate inference in continuous-valued latent variables and error backpropagation, at the same time.
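As a rough sketch of the rate-based rule summarized above (the symbols here, with \rho(s_i) the presynaptic firing rate, \rho(s_j) the postsynaptic firing rate, and W_{ij} the connecting weight, are illustrative notation rather than a quotation of the paper's):

\Delta W_{ij} \;\propto\; \rho(s_i)\,\frac{d\,\rho(s_j)}{dt}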
Archive | 2017
Yoshua Bengio; Thomas Mesnard; Asja Fischer; Saizheng Zhang; Yuhuai Wu
We introduce a weight update formula that is expressed only in terms of firing rates and their derivatives and that results in changes consistent with those associated with spike-timing-dependent plasticity (STDP) rules and biological observations, even though the explicit timing of spikes is not needed. The new rule changes a synaptic weight in proportion to the product of the presynaptic firing rate and the temporal rate of change of activity on the postsynaptic side. These quantities are interesting for studying theoretical explanations of synaptic changes from a machine learning perspective. In particular, if neural dynamics moved neural activity towards reducing some objective function, then this STDP rule would correspond to stochastic gradient descent on that objective function.
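A toy, discrete-time sketch of the rule described above: the weight change is proportional to the presynaptic firing rate times the temporal rate of change of the postsynaptic firing rate. All names here (stdp_like_update, lr, dt) are illustrative and are not the paper's notation or code.

import numpy as np

def stdp_like_update(w, pre_rate, post_rate_prev, post_rate_now, dt=1.0, lr=0.01):
    # Finite-difference estimate of the postsynaptic rate of change, d(rho_post)/dt.
    post_rate_dot = (post_rate_now - post_rate_prev) / dt
    # delta_w[j, i] is proportional to pre_rate[i] * d(post_rate[j])/dt.
    return w + lr * np.outer(post_rate_dot, pre_rate)

# Toy usage on random firing rates (4 presynaptic units, 3 postsynaptic units).
w = np.zeros((3, 4))
pre = np.random.rand(4)
post_prev, post_now = np.random.rand(3), np.random.rand(3)
w = stdp_like_update(w, pre, post_prev, post_now)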
International Conference on Learning Representations | 2017
Yuhuai Wu; Yuri Burda; Ruslan Salakhutdinov; Roger B. Grosse
Neural Information Processing Systems | 2016
Yuhuai Wu; Saizheng Zhang; Ying Zhang; Yoshua Bengio; Ruslan Salakhutdinov
Neural Information Processing Systems | 2017
Yuhuai Wu; Elman Mansimov; Roger B. Grosse; Shun Liao; Jimmy Ba
Neural Information Processing Systems | 2016
Saizheng Zhang; Yuhuai Wu; Tong Che; Zhouhan Lin; Roland Memisevic; Ruslan Salakhutdinov; Yoshua Bengio
International Conference on Learning Representations | 2018
Will Grathwohl; Dami Choi; Yuhuai Wu; Geoffrey Roeder; David K. Duvenaud
Neural Information Processing Systems | 2017
Geoffrey Roeder; Yuhuai Wu; David K. Duvenaud
arXiv: Neural and Evolutionary Computing | 2015
Yoshua Bengio; Thomas Mesnard; Asja Fischer; Saizheng Zhang; Yuhuai Wu
Neural Information Processing Systems | 2016
Behnam Neyshabur; Yuhuai Wu; Ruslan Salakhutdinov; Nathan Srebro