Volodymyr Mnih
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Volodymyr Mnih.
Nature | 2015
Volodymyr Mnih; Koray Kavukcuoglu; David Silver; Andrei A. Rusu; Joel Veness; Marc G. Bellemare; Alex Graves; Martin A. Riedmiller; Andreas K. Fidjeland; Georg Ostrovski; Stig Petersen; Charles Beattie; Amir Sadik; Ioannis Antonoglou; Helen King; Dharshan Kumaran; Daan Wierstra; Shane Legg; Demis Hassabis
The theory of reinforcement learning provides a normative account, deeply rooted in psychological and neuroscientific perspectives on animal behaviour, of how agents may optimize their control of an environment. To use reinforcement learning successfully in situations approaching real-world complexity, however, agents are confronted with a difficult task: they must derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations. Remarkably, humans and other animals seem to solve this problem through a harmonious combination of reinforcement learning and hierarchical sensory processing systems, the former evidenced by a wealth of neural data revealing notable parallels between the phasic signals emitted by dopaminergic neurons and temporal difference reinforcement learning algorithms. While reinforcement learning agents have achieved some successes in a variety of domains, their applicability has previously been limited to domains in which useful features can be handcrafted, or to domains with fully observed, low-dimensional state spaces. Here we use recent advances in training deep neural networks to develop a novel artificial agent, termed a deep Q-network, that can learn successful policies directly from high-dimensional sensory inputs using end-to-end reinforcement learning. We tested this agent on the challenging domain of classic Atari 2600 games. We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters. This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
european conference on computer vision | 2010
Volodymyr Mnih; Geoffrey E. Hinton
Reliably extracting information from aerial imagery is a difficult problem with many practical applications. One specific case of this problem is the task of automatically detecting roads. This task is a difficult vision problem because of occlusions, shadows, and a wide variety of non-road objects. Despite 30 years of work on automatic road detection, no automatic or semi-automatic road detection system is currently on the market and no published method has been shown to work reliably on large datasets of urban imagery. We propose detecting roads using a neural network with millions of trainable weights which looks at a much larger context than was used in previous attempts at learning the task. The network is trained on massive amounts of data using a consumer GPU. We demonstrate that predictive performance can be substantially improved by initializing the feature detectors using recently developed unsupervised learning methods as well as by taking advantage of the local spatial coherence of the output labels.We show that our method works reliably on two challenging urban datasets that are an order of magnitude larger than what was used to evaluate previous approaches.
computer vision and pattern recognition | 2011
Marc'Aurelio Ranzato; Joshua Susskind; Volodymyr Mnih; Geoffrey E. Hinton
The most popular way to use probabilistic models in vision is first to extract some descriptors of small image patches or object parts using well-engineered features, and then to use statistical learning tools to model the dependencies among these features and eventual labels. Learning probabilistic models directly on the raw pixel values has proved to be much more difficult and is typically only used for regularizing discriminative methods. In this work, we use one of the best, pixel-level, generative models of natural images–a gated MRF–as the lowest level of a deep belief network (DBN) that has several hidden layers. We show that the resulting DBN is very good at coping with occlusion when predicting expression categories from face images, and it can produce features that perform comparably to SIFT descriptors for discriminating different types of scene. The generative ability of the model also makes it easy to see what information is captured and what is lost at each level of representation.
international conference on machine learning | 2008
Volodymyr Mnih; Csaba Szepesvári; Jean-Yves Audibert
Sampling is a popular way of scaling up machine learning algorithms to large datasets. The question often is how many samples are needed. Adaptive stopping algorithms monitor the performance in an online fashion and they can stop early, saving valuable resources. We consider problems where probabilistic guarantees are desired and demonstrate how recently-introduced empirical Bernstein bounds can be used to design stopping rules that are efficient. We provide upper bounds on the sample complexity of the new rules, as well as empirical results on model selection and boosting in the filtering setting.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013
Marc'Aurelio Ranzato; Volodymyr Mnih; Joshua Susskind; Geoffrey E. Hinton
This paper describes a Markov Random Field for real-valued image modeling that has two sets of latent variables. One set is used to gate the interactions between all pairs of pixels, while the second set determines the mean intensities of each pixel. This is a powerful model with a conditional distribution over the input that is Gaussian, with both mean and covariance determined by the configuration of latent variables, which is unlike previous models that were restricted to using Gaussians with either a fixed mean or a diagonal covariance matrix. Thanks to the increased flexibility, this gated MRF can generate more realistic samples after training on an unconstrained distribution of high-resolution natural images. Furthermore, the latent variables of the model can be inferred efficiently and can be used as very effective descriptors in recognition tasks. Both generation and discrimination drastically improve as layers of binary latent variables are added to the model, yielding a hierarchical model called a Deep Belief Network.
Journal of Field Robotics | 2006
Xuming He; Richard S. Zemel; Volodymyr Mnih
We propose an approach to building topological maps of environments based on image sequences. The central idea is to use manifold constraints to find representative feature prototypes, so that images can be related to each other, and thereby to camera poses in the environment. Our topological map is built incrementally, performing well after only a few visits to a location. We compare our method to several other approaches to representing images. During tests on novel images from the same environment, our method attains the highest accuracy in finding images depicting similar camera poses, including generalizing across considerable seasonal variations.
arXiv: Learning | 2013
Volodymyr Mnih; Koray Kavukcuoglu; David Silver; Alex Graves; Ioannis Antonoglou; Daan Wierstra; Martin A. Riedmiller
international conference on machine learning | 2016
Volodymyr Mnih; Adrià Puigdomènech Badia; Mehdi Mirza; Alex Graves; Tim Harley; Timothy P. Lillicrap; David Silver; Koray Kavukcuoglu
neural information processing systems | 2014
Volodymyr Mnih; Nicolas Heess; Alex Graves; Koray Kavukcuoglu
international conference on learning representations | 2015
Jimmy Ba; Volodymyr Mnih; Koray Kavukcuoglu