
Publications


Featured research published by Ian Osband.


arXiv: Learning | 2018

A Tutorial on Thompson Sampling

Daniel Russo; Benjamin Van Roy; Abbas Kazerouni; Ian Osband; Zheng Wen

Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing to accumulate new information that may improve future performance. The algorithm addresses a broad range of problems in a computationally efficient manner and is therefore enjoying wide use. This tutorial covers the algorithm and its application, illustrating concepts through a range of examples, including Bernoulli bandit problems, shortest path problems, product recommendation, assortment, active learning with neural networks, and reinforcement learning in Markov decision processes. Most of these problems involve complex information structures, where information revealed by taking an action informs beliefs about other actions. We will also discuss when and why Thompson sampling is or is not effective and relations to alternative algorithms.
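The Bernoulli bandit case mentioned in the abstract can be sketched in a few lines: maintain a Beta posterior per arm, sample a success probability from each posterior, and play the arm with the highest sample. This is a minimal illustration, not code from the tutorial; the function name and parameters are illustrative.

```python
import random


def thompson_sampling_bernoulli(true_probs, n_rounds, seed=0):
    """Thompson sampling for a Bernoulli bandit with Beta(1, 1) priors.

    true_probs: unknown success probability of each arm (used only to
    simulate rewards); returns total reward and posterior counts.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    alpha = [1] * k  # posterior Beta alpha (successes + 1)
    beta = [1] * k   # posterior Beta beta (failures + 1)
    total_reward = 0
    for _ in range(n_rounds):
        # Sample a plausible success rate for each arm from its posterior,
        # then act greedily with respect to the samples.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_probs[arm] else 0
        total_reward += reward
        # Update the chosen arm's posterior with the observed reward.
        if reward:
            alpha[arm] += 1
        else:
            beta[arm] += 1
    return total_reward, alpha, beta
```

Because arms are sampled in proportion to the posterior probability that they are optimal, exploration tapers off automatically as the posterior concentrates on the best arm.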


Neural Information Processing Systems | 2016

Deep Exploration via Bootstrapped DQN

Ian Osband; Charles Blundell; Alexander Pritzel; Benjamin Van Roy


International Conference on Machine Learning | 2016

Generalization and exploration via randomized value functions

Ian Osband; Benjamin Van Roy; Zheng Wen


Neural Information Processing Systems | 2013

(More) Efficient Reinforcement Learning via Posterior Sampling

Ian Osband; Daniel Russo; Benjamin Van Roy


International Conference on Learning Representations | 2018

Noisy Networks For Exploration

Meire Fortunato; Mohammad Gheshlaghi Azar; Jacob Menick; Matteo Hessel; Ian Osband; Alex Graves; Volodymyr Mnih; Rémi Munos; Demis Hassabis; Olivier Pietquin; Charles Blundell; Shane Legg


National Conference on Artificial Intelligence | 2018

Deep Q-learning from Demonstrations

Todd Hester; Matej Vecerik; Olivier Pietquin; Marc Lanctot; Tom Schaul; Dan Horgan; John Quan; Andrew Sendonaris; Gabriel Dulac-Arnold; Ian Osband; John Agapiou; Joel Z. Leibo; Audrunas Gruslys


International Conference on Machine Learning | 2017

Why is Posterior Sampling Better than Optimism for Reinforcement Learning?

Ian Osband; Benjamin Van Roy


Neural Information Processing Systems | 2014

Model-based Reinforcement Learning and the Eluder Dimension

Ian Osband; Benjamin Van Roy


Neural Information Processing Systems | 2014

Near-optimal Reinforcement Learning in Factored MDPs

Ian Osband; Benjamin Van Roy


arXiv: Machine Learning | 2015

Bootstrapped Thompson Sampling and Deep Exploration

Ian Osband; Benjamin Van Roy

Collaboration


Ian Osband's frequent collaborators.

Top Co-Authors


Olivier Pietquin

Institut Universitaire de France


Todd Hester

University of Texas at Austin
