Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where David S. Leslie is active.

Publication


Featured research published by David S. Leslie.


Siam Journal on Control and Optimization | 2005

Individual Q-Learning in Normal Form Games

David S. Leslie; Edmund J. Collins

The single-agent multi-armed bandit problem can be solved by an agent that learns the values of each action using reinforcement learning. However, the multi-agent version of the problem, the iterated normal form game, presents a more complex challenge, since the rewards available to each agent depend on the strategies of the others. We consider the behaviour of value-based learning agents in this situation, and show that such agents cannot generally play at a Nash equilibrium, although if smooth best responses are used, a Nash distribution can be reached. We introduce a particular value-based learning algorithm, which we call individual Q-learning, and use stochastic approximation to study its asymptotic behaviour, showing that strategies converge to a Nash distribution almost surely in 2-player zero-sum games and 2-player partnership games. Player-dependent learning rates are then considered, and it is shown that this extension converges in some games for which many algorithms, including the basic algorithm initially considered, fail to converge.
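The abstract above describes value-based agents that play smooth best responses to their own reward estimates. Below is a minimal sketch of two such agents in matching pennies. This is not the paper's exact individual Q-learning update (which reweights updates by action probabilities); the payoff matrix, temperature and step-size schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Matching pennies: a 2-player zero-sum game whose only Nash equilibrium
# is the mixed strategy (0.5, 0.5) for both players.
# payoff[a1][a2] is player 1's reward; player 2 receives the negative.
payoff = np.array([[1.0, -1.0], [-1.0, 1.0]])

tau = 0.1                          # smoothing temperature of the best response
Q = [np.zeros(2), np.zeros(2)]     # each player's estimated action values

def smooth_best_response(q, tau):
    """Boltzmann (logit) choice probabilities over a player's own actions."""
    z = np.exp((q - q.max()) / tau)
    return z / z.sum()

for n in range(1, 50001):
    alpha = n ** -0.6              # decreasing stochastic-approximation step size
    a1 = rng.choice(2, p=smooth_best_response(Q[0], tau))
    a2 = rng.choice(2, p=smooth_best_response(Q[1], tau))
    r1 = payoff[a1, a2]
    # Each player updates only the value of the action it actually played,
    # using nothing but its own received reward.
    Q[0][a1] += alpha * (r1 - Q[0][a1])
    Q[1][a2] += alpha * (-r1 - Q[1][a2])

print(smooth_best_response(Q[0], tau))  # mixed strategy implied by the learned values
```

With the smoothed (rather than exact) best response, the fixed points are Nash distributions rather than Nash equilibria, which is why the temperature `tau` appears explicitly.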


Games and Economic Behavior | 2006

Generalised weakened fictitious play

David S. Leslie; Edmund J. Collins

A general class of adaptive processes in games is developed, which significantly generalises weakened fictitious play [Van der Genugten, B., 2000. A weakened form of fictitious play in two-person zero-sum games. Int. Game Theory Rev. 2, 307–328] and includes several interesting fictitious-play-like processes as special cases. The general model is rigorously analysed using the best response differential inclusion, and shown to converge in games with the fictitious play property. Furthermore, a new actor–critic process is introduced, in which the only information given to a player is the reward received as a result of selecting an action—a player need not even know they are playing a game. It is shown that this results in a generalised weakened fictitious play process, and can therefore be considered as a first step towards explaining how players might learn to play Nash equilibrium strategies without having any knowledge of the game, or even that they are playing a game.
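The actor–critic process described above uses only the received reward. The toy sketch below follows that spirit in a 2x2 partnership game; the step-size schedules, clipping, temperature and payoffs are ad hoc assumptions, not the paper's exact process.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two-player partnership (identical-interest) game: both players
# receive reward[a1][a2], and neither observes anything else.
reward = np.array([[1.0, 0.0], [0.0, 1.0]])

def logit_br(q, tau=0.1):
    """Smoothed best response to estimated action values."""
    z = np.exp((q - q.max()) / tau)
    return z / z.sum()

pi = [np.full(2, 0.5), np.full(2, 0.5)]   # actors: mixed strategies
Q = [np.zeros(2), np.zeros(2)]            # critics: per-action reward estimates

for n in range(1, 20001):
    a = [int(rng.choice(2, p=pi[i])) for i in range(2)]
    r = reward[a[0], a[1]]                # the only signal a player observes
    for i in range(2):
        # Critic update for the played action; the 1/pi importance weight
        # compensates for how rarely that action is chosen (clipped at 1
        # to keep the update stable).
        step = min(1.0, (n + 1) ** -0.6 / pi[i][a[i]])
        Q[i][a[i]] += step * (r - Q[i][a[i]])
        # Actor drifts on a slower timescale toward the smoothed best response.
        beta = (n + 1) ** -0.9
        pi[i] = (1 - beta) * pi[i] + beta * logit_br(Q[i])

print(pi[0], pi[1])   # both strategies should concentrate on the same action
```

Because the actor tracks a smoothed best response to slowly varying value estimates, the strategy process behaves like a generalised weakened fictitious play, which is the paper's route to convergence in games with the fictitious play property.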


Knowledge Engineering Review | 2011

Review: a unifying framework for iterative approximate best-response algorithms for distributed constraint optimization problems

Archie C. Chapman; Alex Rogers; Nicholas R. Jennings; David S. Leslie

Distributed constraint optimization problems (DCOPs) are important in many areas of computer science and optimization. In a DCOP, each variable is controlled by one of many autonomous agents, who together have the joint goal of maximizing a global objective function. A wide variety of techniques have been explored to solve such problems, and here we focus on one of the main families, namely iterative approximate best-response algorithms used as local search algorithms for DCOPs. We define these algorithms as those in which, at each iteration, agents communicate only the states of the variables under their control to their neighbours on the constraint graph, and reason about their next state based on the messages received from their neighbours. These algorithms include the distributed stochastic algorithm and stochastic coordination algorithms, the maximum-gain messaging algorithms, the families of fictitious play and adaptive play algorithms, and algorithms that use regret-based heuristics. This family of algorithms is commonly employed in real-world systems, as they can be used in domains where communication is difficult or costly, where it is appropriate to trade timeliness off against optimality, or where hardware limitations render complete or more computationally intensive algorithms unusable. However, until now, no overarching framework has existed for analyzing this broad family of algorithms, resulting in similar and overlapping work being published independently in several different literatures. The main contribution of this paper, then, is the development of a unified analytical framework for studying such algorithms. This framework is built on our insight that when formulated as non-cooperative games, DCOPs form a subset of the class of potential games.
This result allows us to use game-theoretic methods to prove convergence properties of iterative approximate best-response algorithms developed in the computer science literature (which also shows that such algorithms can be applied to the more general problem of finding Nash equilibria in potential games) and, conversely, allows us to show that many game-theoretic algorithms can be used to solve DCOPs. In so doing, our framework helps system designers by making clear the pros and cons of, and the synergies between, the various iterative approximate best-response DCOP algorithm components.
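As a concrete instance of the family surveyed, here is a sketch of the distributed stochastic algorithm on a toy graph-colouring DCOP: each agent sees only its neighbours' last states and moves to its best response with a damping probability. The graph, colour set and activation probability are invented for illustration.

```python
import random

random.seed(0)

# Toy DCOP: graph colouring. Each agent controls one variable (its colour)
# and is constrained to differ from its neighbours; the global objective
# is the number of satisfied constraints.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
neighbours = {i: [] for i in range(4)}
for u, v in edges:
    neighbours[u].append(v)
    neighbours[v].append(u)

colours = {i: random.choice([0, 1, 2]) for i in range(4)}

def local_gain(agent, colour, state):
    """This agent's satisfied constraints if it were to pick `colour`."""
    return sum(1 for nb in neighbours[agent] if state[nb] != colour)

p = 0.7  # activation probability: damps simultaneous moves to avoid thrashing
for _ in range(100):
    snapshot = dict(colours)  # agents observe only neighbours' last states
    for agent in colours:
        best = max([0, 1, 2], key=lambda c: local_gain(agent, c, snapshot))
        improves = local_gain(agent, best, snapshot) > local_gain(agent, snapshot[agent], snapshot)
        if improves and random.random() < p:
            colours[agent] = best

conflicts = sum(1 for u, v in edges if colours[u] == colours[v])
print(colours, "conflicts:", conflicts)
```

Because graph colouring is a potential game (each local improvement raises the count of satisfied constraints), the potential-game viewpoint in the paper is exactly what guarantees such best-response dynamics settle rather than cycle.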


Statistics and Computing | 2007

A general approach to heteroscedastic linear regression

David S. Leslie; Robert Kohn; David J. Nott

Our article presents a general treatment of the linear regression model, in which the error distribution is modelled nonparametrically and the error variances may be heteroscedastic, thus eliminating the need to transform the dependent variable in many data sets. The mean and variance components of the model may be either parametric or nonparametric, with parsimony achieved through variable selection and model averaging. A Bayesian approach is used for inference with priors that are data-based so that estimation can be carried out automatically with minimal input by the user. A Dirichlet process mixture prior is used to model the error distribution nonparametrically; when there are no regressors in the model, the method reduces to Bayesian density estimation, and we show that in this case the estimator compares favourably with a well-regarded plug-in density estimator. We also consider a method for checking the fit of the full model. The methodology is applied to a number of simulated and real examples and is shown to work well.
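The Bayesian nonparametric machinery above (Dirichlet process mixture priors, variable selection, model averaging) does not fit in a few lines; as background only, this sketch shows the heteroscedasticity problem the paper targets, handled by a classical two-stage weighted least squares (feasible GLS). This is not the authors' method, and the simulated data are invented.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated data whose noise level grows with x: sd(y|x) = 0.5 + x.
n = 500
x = rng.uniform(0, 2, n)
y = 1.0 + 2.0 * x + (0.5 + x) * rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])

# Step 1: ordinary least squares, ignoring the heteroscedasticity.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Step 2: model the error variance as a function of x by regressing
# log squared residuals on the same design.
resid = y - X @ beta_ols
gamma, *_ = np.linalg.lstsq(X, np.log(resid**2 + 1e-12), rcond=None)
w = np.exp(-(X @ gamma))            # estimated precision for each observation

# Step 3: weighted least squares with the estimated variance function,
# solving (X' W X) beta = X' W y.
Xw = X * w[:, None]
beta_wls = np.linalg.solve(X.T @ Xw, Xw.T @ y)
print(beta_wls)
```

The two-stage estimator recovers coefficients close to the true (1, 2) without transforming the response, which is the practical point the abstract makes against routine variance-stabilising transformations.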


Siam Journal on Control and Optimization | 2013

Convergent Learning Algorithms for Unknown Reward Games

Archie C. Chapman; David S. Leslie; Alex Rogers; Nicholas R. Jennings

In this paper, we address the problem of convergence to Nash equilibria in games with rewards that are initially unknown and must be estimated over time from noisy observations. These games arise in many real-world applications, whenever rewards for actions cannot be prespecified and must be learned online, but standard results in game theory do not consider such settings. For this problem, we derive a multiagent version of Q-learning to estimate the reward functions using novel forms of the ε-greedy learning policy. Using these …


Drug and Alcohol Dependence | 2012

Respondent driven sampling and community structure in a population of injecting drug users, Bristol, UK.

Harriet L. Mills; Caroline Colijn; Peter Vickerman; David S. Leslie; Vivian Hope; Matthew Hickman


Journal of the Royal Society Interface | 2013

Context-dependent decision-making: a simple Bayesian model.

Kevin Lloyd; David S. Leslie


Journal of Artificial Intelligence Research | 2008

On similarities between inference in game theory and machine learning

Iead Rezek; David S. Leslie; Steven Reece; S. Roberts; Alex Rogers; Rajdeep K. Dash; Nicholas R. Jennings


Electronic Journal of Statistics | 2008

Generalised linear mixed model analysis via sequential Monte Carlo sampling

Yanan Fan; David S. Leslie; M. P. Wand


Journal of Economic Theory | 2014

Stochastic Fictitious Play with Continuous Action Sets

Steven Perkins; David S. Leslie
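The truncated abstract of "Convergent Learning Algorithms for Unknown Reward Games" above describes agents estimating unknown rewards from noisy observations with Q-learning and ε-greedy exploration. A toy sketch of that setting, with invented payoffs and a simple decaying-ε schedule rather than the paper's novel policies:

```python
import numpy as np

rng = np.random.default_rng(7)

# Two agents in an identical-interest game whose mean rewards are unknown
# to them; every play returns only a noisy sample of the joint reward.
mean_reward = np.array([[2.0, 0.0], [0.0, 1.0]])
Q = [np.zeros(2), np.zeros(2)]          # per-agent action-value estimates
counts = [np.zeros(2), np.zeros(2)]

for n in range(1, 5001):
    eps = min(1.0, 10.0 / n)            # decaying epsilon-greedy exploration
    acts = []
    for i in range(2):
        if rng.random() < eps:
            acts.append(int(rng.integers(2)))
        else:
            acts.append(int(np.argmax(Q[i])))
    # Noisy observation of the shared reward for the joint action.
    r = mean_reward[acts[0], acts[1]] + 0.5 * rng.standard_normal()
    for i in range(2):
        counts[i][acts[i]] += 1
        # Running average of observed rewards for the played action.
        Q[i][acts[i]] += (r - Q[i][acts[i]]) / counts[i][acts[i]]

print([int(np.argmax(q)) for q in Q])   # joint action the agents settle on
```

The decaying ε keeps exploration frequent enough for reward estimates to be consistent while letting play concentrate, which is the tension the paper's policies are designed to manage.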

Collaboration


Dive into David S. Leslie's collaborations.

Top Co-Authors

Panayotis Mertikopoulos

Centre national de la recherche scientifique
