
Publications


Featured research published by Ian Osband.


arXiv: Learning | 2018

A Tutorial on Thompson Sampling

Daniel Russo; Benjamin Van Roy; Abbas Kazerouni; Ian Osband; Zheng Wen

Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing to accumulate new information that may improve future performance. The algorithm addresses a broad range of problems in a computationally efficient manner and is therefore enjoying wide use. This tutorial covers the algorithm and its application, illustrating concepts through a range of examples, including Bernoulli bandit problems, shortest path problems, product recommendation, assortment, active learning with neural networks, and reinforcement learning in Markov decision processes. Most of these problems involve complex information structures, where information revealed by taking an action informs beliefs about other actions. We will also discuss when and why Thompson sampling is or is not effective and relations to alternative algorithms.
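The Bernoulli bandit case mentioned in the abstract can be sketched in a few lines: maintain a Beta posterior per arm, sample a success probability from each posterior, and play the arm with the highest sample. This is a minimal illustration, not code from the tutorial; the function name and parameters are illustrative.

```python
import random


def thompson_sampling_bernoulli(true_probs, n_rounds, seed=0):
    """Thompson sampling for a Bernoulli bandit with Beta(1, 1) priors.

    true_probs: unknown success probability of each arm (used only to
    simulate rewards); returns total reward and posterior counts.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    alpha = [1] * k  # posterior Beta alpha (successes + 1)
    beta = [1] * k   # posterior Beta beta (failures + 1)
    total_reward = 0
    for _ in range(n_rounds):
        # Sample a plausible success rate for each arm from its posterior,
        # then act greedily with respect to the samples.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_probs[arm] else 0
        total_reward += reward
        # Update the chosen arm's posterior with the observed reward.
        if reward:
            alpha[arm] += 1
        else:
            beta[arm] += 1
    return total_reward, alpha, beta
```

Because arms are sampled in proportion to the posterior probability that they are optimal, exploration tapers off automatically as the posterior concentrates on the best arm.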


Neural Information Processing Systems | 2016

Deep Exploration via Bootstrapped DQN

Ian Osband; Charles Blundell; Alexander Pritzel; Benjamin Van Roy


International Conference on Machine Learning | 2016

Generalization and exploration via randomized value functions

Ian Osband; Benjamin Van Roy; Zheng Wen


Neural Information Processing Systems | 2013

(More) Efficient Reinforcement Learning via Posterior Sampling

Ian Osband; Daniel Russo; Benjamin Van Roy


International Conference on Learning Representations | 2018

Noisy Networks For Exploration

Meire Fortunato; Mohammad Gheshlaghi Azar; Jacob Menick; Matteo Hessel; Ian Osband; Alex Graves; Volodymyr Mnih; Rémi Munos; Demis Hassabis; Olivier Pietquin; Charles Blundell; Shane Legg


National Conference on Artificial Intelligence | 2018

Deep Q-learning from Demonstrations

Todd Hester; Matej Vecerik; Olivier Pietquin; Marc Lanctot; Tom Schaul; Dan Horgan; John Quan; Andrew Sendonaris; Gabriel Dulac-Arnold; Ian Osband; John Agapiou; Joel Z. Leibo; Audrunas Gruslys


International Conference on Machine Learning | 2017

Why is Posterior Sampling Better than Optimism for Reinforcement Learning?

Ian Osband; Benjamin Van Roy


Neural Information Processing Systems | 2014

Model-based Reinforcement Learning and the Eluder Dimension

Ian Osband; Benjamin Van Roy


Neural Information Processing Systems | 2014

Near-optimal Reinforcement Learning in Factored MDPs

Ian Osband; Benjamin Van Roy


arXiv: Machine Learning | 2015

Bootstrapped Thompson Sampling and Deep Exploration

Ian Osband; Benjamin Van Roy

Collaboration


Ian Osband's frequent collaborators.

Top Co-Authors


Olivier Pietquin

Institut Universitaire de France


Todd Hester

University of Texas at Austin
