Publication


Featured research published by David H. Wolpert.


Archive | 2002

The Supervised Learning No-Free-Lunch Theorems

David H. Wolpert

This paper reviews the supervised learning versions of the no-free-lunch theorems in a simplified form. It also discusses the significance of those theorems, and their relation to other aspects of supervised learning.
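
The uniform-average claim at the heart of these theorems can be checked by brute force on a toy domain. The sketch below is an illustration constructed for this listing, not code from the paper: it enumerates every binary target function on a four-point input space and shows that a learner and its exact opposite achieve the same off-training-set error once that error is averaged over all targets. The learner names and the tiny domain are assumptions.

```python
# Minimal NFL check on a toy domain (illustrative assumptions throughout).
from itertools import product

X = [0, 1, 2, 3]       # whole input space
train_X = [0, 1]       # fixed training inputs
test_X = [2, 3]        # off-training-set inputs

def majority_learner(train_labels):
    # predict the most common training label everywhere (ties go to 1)
    guess = 1 if sum(train_labels) * 2 >= len(train_labels) else 0
    return lambda x: guess

def anti_majority_learner(train_labels):
    # predict the opposite of the majority label everywhere
    guess = 1 if sum(train_labels) * 2 >= len(train_labels) else 0
    return lambda x: 1 - guess

def avg_ots_error(learner):
    errs = []
    for f in product([0, 1], repeat=len(X)):   # all 16 binary target functions
        h = learner([f[x] for x in train_X])
        errs.append(sum(h(x) != f[x] for x in test_X) / len(test_X))
    return sum(errs) / len(errs)

print(avg_ots_error(majority_learner))       # 0.5
print(avg_ots_error(anti_majority_learner))  # also 0.5: identical OTS average
```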


Advances in Complex Systems | 2001

OPTIMAL PAYOFF FUNCTIONS FOR MEMBERS OF COLLECTIVES

David H. Wolpert; Kagan Tumer

We consider the problem of designing (perhaps massively distributed) collectives of computational processes to maximize a provided world utility function. We consider this problem when the behavior of each process in the collective can be cast as striving to maximize its own payoff utility function. For such cases the central design issue is how to initialize/update those payoff utility functions of the individual processes so as to induce behavior of the entire collective having good values of the world utility. Traditional team game approaches to this problem simply assign to each process the world utility as its payoff utility function. In previous work we used the Collective Intelligence (COIN) framework to derive a better choice of payoff utility functions, one that results in world utility performance up to orders of magnitude superior to that ensuing from the use of the team game utility. In this paper, we extend these results using a novel mathematical framework. Under that new framework we review the derivation of the general class of payoff utility functions that both (i) are easy for the individual processes to try to maximize, and (ii) have the property that if good values of them are achieved, then we are assured a high value of world utility. These are the Aristocrat Utility and a new variant of the Wonderful Life Utility that was introduced in the previous COIN work. We demonstrate experimentally that using these new utility functions can result in significantly improved performance over that of previously investigated COIN payoff utilities, over and above those previous utilities' superiority to the conventional team game utility. These results also illustrate the substantial superiority of these payoff functions to perhaps the most natural version of the economics technique of endogenizing externalities.
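
For readers unfamiliar with the Wonderful Life Utility, the difference structure the abstract refers to can be sketched in a few lines: agent i's payoff is the world utility minus the world utility recomputed with agent i clamped to a null action. The toy congestion-style world utility and the clamp value below are illustrative assumptions, not the paper's experimental setup.

```python
# Hedged sketch of a Wonderful-Life-style difference payoff:
# WLU_i(z) = G(z) - G(z with agent i clamped to a null action).

def world_utility(actions, capacity=3):
    # toy congestion-style G: attendance is rewarded up to a capacity,
    # then penalized beyond it
    attendance = sum(actions)
    return attendance if attendance <= capacity else 2 * capacity - attendance

def wonderful_life_utility(actions, i, clamp=0):
    clamped = list(actions)
    clamped[i] = clamp                       # remove agent i's effect
    return world_utility(actions) - world_utility(clamped)

actions = [1, 1, 1, 1, 0]                    # 0/1 = stay home / attend
for i in range(len(actions)):
    print(i, wonderful_life_utility(actions, i))
```

Each agent's payoff is now its marginal contribution to the world utility, which is what makes it both easy to learn and aligned with the global objective.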


IEEE Transactions on Evolutionary Computation | 2005

Coevolutionary free lunches

David H. Wolpert; William G. Macready

Recent work on the foundational underpinnings of black-box optimization has begun to uncover a rich mathematical structure. In particular, it is now known that an inner product between the optimization algorithm and the distribution of optimization problems likely to be encountered fixes the distribution over likely performances in running that algorithm. One ramification of this is the No Free Lunch (NFL) theorems, which state that any two algorithms are equivalent when their performance is averaged across all possible problems. This highlights the need for exploiting problem-specific knowledge to achieve better than random performance. In this paper, we present a general framework covering most optimization scenarios. In addition to the optimization scenarios addressed in the NFL results, this framework covers multiarmed bandit problems and evolution of multiple coevolving players. As a particular instance of the latter, it covers self-play problems. In these problems, the set of players works together to produce a champion, who then engages one or more antagonists in a subsequent multiplayer game. In contrast to the traditional optimization case where the NFL results hold, we show that in self-play there are free lunches: in coevolution some algorithms have better performance than other algorithms, averaged across all possible problems. However, in the typical coevolutionary scenarios encountered in biology, where there is no champion, the NFL theorems still hold.
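
For context, the static-optimization NFL result that this abstract builds on is usually stated as follows (notation as in Wolpert and Macready's 1997 formulation; reproduced here as background, not as a result of this paper):

```latex
% d_m^y is the sample of cost values seen after m distinct evaluations,
% the sum ranges over all cost functions f : X -> Y, and a_1, a_2 are any
% two black-box search algorithms.
\[
  \sum_{f} P\left(d_m^y \mid f, m, a_1\right)
  \;=\;
  \sum_{f} P\left(d_m^y \mid f, m, a_2\right)
\]
```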


arXiv: Statistical Mechanics | 2006

Information Theory ― The Bridge Connecting Bounded Rational Game Theory and Statistical Physics

David H. Wolpert

A long-running difficulty with conventional game theory has been how to modify it to accommodate the bounded rationality of all real-world players. A recurring issue in statistical physics is how best to approximate joint probability distributions with decoupled (and therefore far more tractable) distributions. This paper shows that the same information theoretic mathematical structure, known as Product Distribution (PD) theory, addresses both issues. In this, PD theory not only provides a principled formulation of bounded rationality and a set of new types of mean field theory in statistical physics; it also shows that those topics are fundamentally one and the same.
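
The product-distribution idea can be illustrated with a minimal mean-field style iteration: each agent's mixed strategy is repeatedly reset to a Boltzmann distribution over its own moves, evaluated against the other agents' current mixtures. The 2x2 shared cost, the temperature, and the damping below are assumptions chosen only to keep the sketch short; this is a schematic of the structure PD theory studies, not the paper's derivation.

```python
# Mean-field / product-distribution sketch for two agents with binary moves.
import numpy as np

G = np.array([[0.0, 2.0],
              [2.0, 1.0]])        # shared cost G(x1, x2) to be minimized
beta = 2.0                         # inverse temperature (degree of rationality)
q1 = np.array([0.5, 0.5])          # agent 1's mixed strategy
q2 = np.array([0.5, 0.5])          # agent 2's mixed strategy

for _ in range(200):
    e1 = G @ q2                    # expected cost of each of agent 1's moves
    e2 = G.T @ q1                  # expected cost of each of agent 2's moves
    new_q1 = np.exp(-beta * e1); new_q1 /= new_q1.sum()
    new_q2 = np.exp(-beta * e2); new_q2 /= new_q2.sum()
    # damped update toward the Boltzmann best response
    q1, q2 = 0.5 * q1 + 0.5 * new_q1, 0.5 * q2 + 0.5 * new_q2

print(q1, q2)   # product distribution approximating the coupled Boltzmann joint
```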


Archive | 2004

Collectives and the Design of Complex Systems

Kagan Tumer; David H. Wolpert

Contents:
A survey of collectives
Theory of collective intelligence
On learnable mechanism design
Asynchronous learning in decentralized environments
Competition between adaptive agents
Managing catastrophic changes in a collective
Effects of inter-agent communications on the collective
Man and superman: human limitations, innovation, and emergence in resource competition
Design principles for the distributed control of modular self-reconfigurable robots
Two paradigms for the design of artificial collectives
Efficiency and equity in collective systems of interacting heterogeneous agents
Selection in coevolutionary algorithms and the inverse problem
Dynamics of large autonomous computational systems
Index


EPL | 2000

Collective intelligence for control of distributed dynamical systems

David H. Wolpert; Kevin R. Wheeler; Kagan Tumer

We consider the El Farol bar problem, also known as the minority game (W. B. Arthur, The American Economic Review, 84 (1994) 406; D. Challet and Y. C. Zhang, Physica A, 256 (1998) 514). We view it as an instance of the general problem of how to configure the nodal elements of a distributed dynamical system so that they do not work at cross purposes, in that their collective dynamics avoids frustration and thereby achieves a provided global goal. We summarize a mathematical theory for such configuration applicable when (as in the bar problem) the global goal can be expressed as minimizing a global energy function and the nodes can be expressed as minimizers of local free energy functions. We show that a system designed with that theory performs nearly optimally for the bar problem.
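
A bare-bones version of the bar problem is easy to simulate. In the sketch below the reward shape, the learning rule, and all parameters are assumptions made for illustration; the paper's actual configuration uses the local free energy machinery summarized in the abstract.

```python
# Toy El Farol bar simulation with a difference-style learning signal.
import math, random

N, CAPACITY, ROUNDS = 40, 24, 3000
LR = 0.01

def world_reward(attendance):
    # global objective: attendance is good, overcrowding past CAPACITY is not
    return attendance * math.exp(-attendance / CAPACITY)

probs = [0.5] * N                    # each agent's probability of attending

for _ in range(ROUNDS):
    go = [random.random() < p for p in probs]
    attendance = sum(go)
    for i in range(N):
        if go[i]:
            # did my attending raise or lower the world reward?
            delta = world_reward(attendance) - world_reward(attendance - 1)
            probs[i] += LR * delta
        else:
            # would the world have been better off had I attended?
            delta = world_reward(attendance) - world_reward(attendance + 1)
            probs[i] -= LR * delta
        probs[i] = min(1.0, max(0.0, probs[i]))

print("final-round attendance:", sum(go),
      "(world reward peaks at", CAPACITY, "attendees)")
```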


Machine Learning | 1999

An Efficient Method To Estimate Bagging's Generalization Error

David H. Wolpert; William G. Macready

Bagging (Breiman, 1994a) is a technique that tries to improve a learning algorithm's performance by using bootstrap replicates of the training set (Efron & Tibshirani, 1993; Efron, 1979). The computational requirements for estimating the resultant generalization error on a test set by means of cross-validation are often prohibitive: for leave-one-out cross-validation one needs to train the underlying algorithm on the order of mν times, where m is the size of the training set and ν is the number of replicates. This paper presents several techniques for estimating the generalization error of a bagged learning algorithm without invoking yet more training of the underlying learning algorithm (beyond that of the bagging itself), as is required by cross-validation-based estimation. These techniques all exploit the bias-variance decomposition (Geman, Bienenstock & Doursat, 1992; Wolpert, 1996). The best of our estimators also exploits stacking (Wolpert, 1992). In a set of experiments reported here, it was found to be more accurate than both the alternative cross-validation-based estimator of the bagged algorithm's error and the cross-validation-based estimator of the underlying algorithm's error. This improvement was particularly pronounced for small test sets. This suggests a novel justification for using bagging: more accurate estimation of the generalization error than is possible without bagging.
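
As a rough illustration of estimating a bagged model's error without extra training runs, the sketch below reuses the bootstrap replicates themselves in an out-of-bag fashion. This is not the paper's estimator (the paper's techniques build on the bias-variance decomposition and stacking); it is only a related, commonly used idea, with an arbitrary synthetic dataset and base learner.

```python
# Out-of-bag style estimate of a bagged regressor's generalization error.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
rng = np.random.default_rng(0)
n, n_bags = len(y), 50

preds = np.full((n_bags, n), np.nan)        # out-of-bag predictions per replicate
for b in range(n_bags):
    idx = rng.integers(0, n, n)             # bootstrap replicate of the training set
    oob = np.setdiff1d(np.arange(n), idx)   # points this replicate never saw
    model = DecisionTreeRegressor(random_state=b).fit(X[idx], y[idx])
    preds[b, oob] = model.predict(X[oob])

oob_mean = np.nanmean(preds, axis=0)        # bagged prediction from OOB members only
mask = ~np.isnan(oob_mean)
print("OOB estimate of squared error:",
      np.mean((y[mask] - oob_mean[mask]) ** 2))
```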


Adaptive Agents and Multi-Agent Systems | 1999

General principles of learning-based multi-agent systems

David H. Wolpert; Kevin R. Wheeler; Kagan Tumer

We consider the problem of how to design large decentralized multiagent systems (MAS’s) in an automated fashion, with little or no hand-tuning. Our approach has each agent run a reinforcement learning algorithm. This converts the problem into one of how to automatically set/update the reward functions for each of the agents so that the global goal is achieved. In particular we do not want the agents to “work at cross-purposes” as far as the global goal is concerned. We use the term artificial COllective INtelligence (COIN) to refer to systems that embody solutions to this problem. In this paper we present a summary of a mathematical framework for COINs. We then investigate the real-world applicability of the core concepts of that framework via two computer experiments: we show that our COINs perform near optimally in a difficult variant of Arthur’s bar problem [1] (and in particular avoid the tragedy of the commons for that problem), and we also illustrate optimal performance for our COINs in the leader-follower problem.
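
The per-agent learner the abstract presupposes can be as simple as an epsilon-greedy value estimator over a small action set, steered entirely by whatever private reward the designer supplies. The class below is a generic stand-in, not the algorithm used in the COIN experiments; the placeholder reward in the usage snippet is likewise an assumption.

```python
# Generic per-agent reinforcement learner (illustrative stand-in only).
import random

class EpsilonGreedyAgent:
    def __init__(self, n_actions, epsilon=0.1, lr=0.1):
        self.values = [0.0] * n_actions      # running estimate of each action's reward
        self.epsilon, self.lr = epsilon, lr

    def act(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.values))                      # explore
        return max(range(len(self.values)), key=self.values.__getitem__)   # exploit

    def learn(self, action, reward):
        # move the estimate toward the observed reward
        self.values[action] += self.lr * (reward - self.values[action])

# usage: the designer supplies the private reward (e.g. a difference utility);
# the collective behavior emerges from many such agents learning in parallel
agent = EpsilonGreedyAgent(n_actions=2)
for _ in range(1000):
    a = agent.act()
    r = 1.0 if a == 1 else 0.2               # placeholder private reward
    agent.learn(a, r)
print(agent.values)
```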


Machine Learning | 1999

Linearly Combining Density Estimators via Stacking

Padhraic Smyth; David H. Wolpert

This paper presents experimental results, with both real and artificial data, on combining unsupervised learning algorithms using stacking. Specifically, stacking is used to form a linear combination of finite mixture model and kernel density estimators for non-parametric multivariate density estimation. The method outperforms other strategies such as choosing the single best model based on cross-validation, combining with uniform weights, and even using the single best model chosen by “cheating” and examining the test set. We also investigate (1) how the utility of stacking changes when one of the models being combined is the model that generated the data, (2) how the stacking coefficients of the models compare to the relative frequencies with which cross-validation chooses among the models, (3) visualization of combined “effective” kernels, and (4) the sensitivity of stacking to overfitting as model complexity increases.
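
The stacking step itself reduces to learning non-negative combination weights that maximize held-out likelihood over a fixed set of fitted density estimators. The sketch below uses an arbitrary synthetic dataset, two off-the-shelf estimators, and a plain EM update for the weights; these choices are assumptions for illustration, not the paper's experimental protocol.

```python
# Stacking two density estimators via held-out likelihood.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 300)])[:, None]
rng.shuffle(data)
train, held_out = data[:400], data[400:]

models = [GaussianMixture(n_components=2, random_state=0).fit(train),
          KernelDensity(bandwidth=0.5).fit(train)]
# per-model densities on the held-out split (rows: points, columns: models)
dens = np.column_stack([np.exp(m.score_samples(held_out)) for m in models])

weights = np.full(len(models), 1.0 / len(models))
for _ in range(100):                         # EM over the fixed component densities
    resp = weights * dens
    resp /= resp.sum(axis=1, keepdims=True)
    weights = resp.mean(axis=0)

print("stacking weights:", weights)          # linear combination of the estimators
```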


Journal of Artificial Intelligence Research | 2002

Collective intelligence, data routing and Braess' paradox

David H. Wolpert; Kagan Tumer

We consider the problem of designing the utility functions of the utility-maximizing agents in a multi-agent system (MAS) so that they work synergistically to maximize a global utility. The particular problem domain we explore is the control of network routing by placing agents on all the routers in the network. Conventional approaches to this task have the agents all use the Ideal Shortest Path routing Algorithm (ISPA). We demonstrate that in many cases, due to the side-effects of one agent's actions on another agent's performance, having agents use ISPAs is suboptimal as far as global aggregate cost is concerned, even when they are only used to route infinitesimally small amounts of traffic. The utility functions of the individual agents are not aligned with the global utility, intuitively speaking. As a particular example of this we present an instance of Braess' paradox, in which adding new links to a network whose agents all use the ISPA results in a decrease in overall throughput. We also demonstrate that load-balancing, in which the agents' decisions are collectively made to optimize the global cost incurred by all traffic currently being routed, is suboptimal as far as global cost averaged across time is concerned. This is also due to side-effects, in this case of current routing decisions on future traffic. The mathematics of Collective Intelligence (COIN) is concerned precisely with the issue of avoiding such deleterious side-effects in multi-agent systems, both over time and space. We present key concepts from that mathematics and use them to derive an algorithm whose ideal version should have better performance than that of having all agents use the ISPA, even in the infinitesimal limit. We present experiments verifying this, and also showing that a machine-learning-based version of this COIN algorithm, in which costs are only imprecisely estimated via empirical means (a version potentially applicable in the real world), also outperforms the ISPA, despite having access to less information than does the ISPA. In particular, this COIN algorithm almost always avoids Braess' paradox.
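
The textbook Braess construction (not one of the networks studied in the paper) already shows the effect the abstract describes: one unit of selfishly routed traffic, two symmetric routes, and a free shortcut whose addition raises everyone's equilibrium cost.

```python
# Classic Braess example: adding a free link raises the selfish-equilibrium cost.

def equilibrium_cost_without_shortcut():
    # routes: s -> a -> t with link costs (x, 1) and s -> b -> t with costs (1, x),
    # where x is the fraction of traffic on that variable-cost link.
    # By symmetry traffic splits 50/50, so each driver pays 0.5 + 1.
    x = 0.5
    return x + 1

def equilibrium_cost_with_shortcut():
    # a free a -> b link makes s -> a -> b -> t dominant for every driver
    # (x <= 1 on both variable links), so all traffic piles onto it: cost x + 0 + x.
    x = 1.0
    return x + 0 + x

print(equilibrium_cost_without_shortcut())  # 1.5
print(equilibrium_cost_with_shortcut())     # 2.0: extra capacity hurt everyone
```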

Collaboration


Dive into David H. Wolpert's collaborations.

Top Co-Authors

Kagan Tumer
Oregon State University

Charlie E. M. Strauss
Los Alamos National Laboratory

Padhraic Smyth
University of California