Publications


Featured research published by Brian Tanner.


Symposium on Abstraction, Reformulation and Approximation | 2005

Hierarchical heuristic search revisited

Robert C. Holte; Jeffery Grajkowski; Brian Tanner

Pattern databases enable difficult search problems to be solved very quickly, but are large and time-consuming to build. They are therefore best suited to situations where many problem instances must be solved, and less than ideal when only a few are. This paper examines a technique, hierarchical heuristic search, designed especially for the latter situation. The key idea is to compute, on demand, only those pattern database entries needed to solve a given problem instance. Our experiments show that Hierarchical IDA* can solve individual problems very quickly, up to two orders of magnitude faster than the time required to build an entire high-performance pattern database.
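
As a rough illustration of the key idea, here is a minimal Python sketch of computing a pattern database entry on demand: the heuristic solves the abstracted problem only when a value is actually requested, and caches the result. The names (abstract_fn, abstract_successors, abstract_goal) and the use of uniform-cost search in the abstract space are illustrative assumptions for this sketch; the paper's Hierarchical IDA* instead nests IDA* searches across a hierarchy of abstractions.

    from heapq import heappop, heappush

    pdb_cache = {}  # abstract state -> exact distance to the abstract goal

    def heuristic(state, abstract_fn, abstract_successors, abstract_goal):
        # Map the concrete state into the abstract space; states are
        # assumed hashable and orderable (e.g., tuples of ints).
        start = abstract_fn(state)
        if start in pdb_cache:
            return pdb_cache[start]
        # Solve the abstract problem on demand (uniform-cost search).
        frontier = [(0, start)]
        dist = {start: 0}
        while frontier:
            g, s = heappop(frontier)
            if g > dist.get(s, g):
                continue  # stale queue entry
            if s == abstract_goal:
                pdb_cache[start] = g  # cache only the entry we needed
                return g
            for nbr, cost in abstract_successors(s):
                if nbr not in dist or g + cost < dist[nbr]:
                    dist[nbr] = g + cost
                    heappush(frontier, (g + cost, nbr))
        return float("inf")  # abstract goal is unreachable

Because only the entries reached while solving this instance are ever computed, the cost scales with the single problem rather than with the size of the whole abstract space.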


International Conference on Machine Learning | 2005

TD(λ) networks: temporal-difference networks with eligibility traces

Brian Tanner; Richard S. Sutton

Temporal-difference (TD) networks have been introduced as a formalism for expressing and learning grounded world knowledge in a predictive form (Sutton & Tanner, 2005). Like conventional TD(0) methods, the learning algorithm for TD networks uses 1-step backups to train prediction units about future events. In conventional TD learning, the TD(λ) algorithm is often used to do more general multi-step backups of future predictions. In our work, we introduce a generalization of the 1-step TD network specification that is based on the TD(λ) learning algorithm, creating TD(λ) networks. We present experimental results that show TD(λ) networks can learn solutions in more complex environments than TD networks. We also show that in problems that can be solved by TD networks, TD(λ) networks generally learn solutions much faster than their 1-step counterparts. Finally, we present an analysis of our algorithm that shows that the computational cost of TD(λ) networks is only slightly more than that of TD networks.
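
As background for readers unfamiliar with eligibility traces, here is a minimal Python sketch of conventional tabular TD(λ) with accumulating traces, the multi-step backup machinery that this paper lifts from state values to TD-network predictions. The function and variable names are illustrative, not the paper's code.

    import numpy as np

    def td_lambda_update(V, trajectory, alpha=0.1, gamma=0.99, lam=0.9):
        # V: value estimate per state (1-D float array indexed by state id;
        #    terminal states are assumed to be held at zero).
        # trajectory: list of (state, reward, next_state) transitions.
        z = np.zeros_like(V)                      # eligibility trace per state
        for s, r, s_next in trajectory:
            delta = r + gamma * V[s_next] - V[s]  # 1-step TD error
            z *= gamma * lam                      # decay every trace
            z[s] += 1.0                           # mark s as recently visited
            V += alpha * delta * z                # credit all traced states
        return V

Setting lam=0 recovers the 1-step TD(0) backup, which mirrors how the paper's TD(λ) networks generalize the original 1-step TD network specification.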


IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning | 2011

Protecting against evaluation overfitting in empirical reinforcement learning

Shimon Whiteson; Brian Tanner; Matthew E. Taylor; Peter Stone

Empirical evaluations play an important role in machine learning. However, the usefulness of any evaluation depends on the empirical methodology employed. Designing good empirical methodologies is difficult in part because agents can overfit test evaluations and thereby obtain misleadingly high scores. We argue that reinforcement learning is particularly vulnerable to environment overfitting and propose as a remedy generalized methodologies, in which evaluations are based on multiple environments sampled from a distribution. In addition, we consider how to summarize performance when scores from different environments may not have commensurate values. Finally, we present proof-of-concept results demonstrating how these methodologies can validate an intuitively useful range-adaptive tile coding method.
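
A minimal sketch, under assumptions, of what a generalized evaluation might look like: each agent is scored on many environments drawn from a distribution, and performance is summarized with per-environment ranks so that environments with incommensurate score scales cannot dominate the summary. sample_env, evaluate, and the agents are hypothetical stand-ins, and the rank-based summary is just one illustrative answer to the commensurability problem the abstract raises.

    import random

    def generalized_evaluation(agents, sample_env, evaluate, n_envs=30, seed=0):
        # Summarize each agent by its mean rank over n_envs sampled
        # environments; a lower mean rank means better performance
        # across the distribution.
        rng = random.Random(seed)
        mean_rank = {agent: 0.0 for agent in agents}
        for _ in range(n_envs):
            env = sample_env(rng)  # draw a fresh test environment
            scores = {agent: evaluate(agent, env) for agent in agents}
            ordered = sorted(agents, key=lambda a: scores[a], reverse=True)
            for rank, agent in enumerate(ordered, start=1):  # 1 = best here
                mean_rank[agent] += rank / n_envs
        return mean_rank

Because each evaluation draws fresh environments from the distribution, an agent tuned to the quirks of any single test environment gains no advantage, which is the protection against environment overfitting the paper argues for.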


Journal of Machine Learning Research | 2009

RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments

Brian Tanner; Adam White


Neural Information Processing Systems | 2004

Temporal-Difference Networks

Richard S. Sutton; Brian Tanner


AI Magazine | 2010

The Reinforcement Learning Competitions

Shimon Whiteson; Brian Tanner; Adam White


International Joint Conference on Artificial Intelligence | 2005

Using predictive representations to improve generalization in reinforcement learning

Eddie Rafols; Mark B. Ring; Richard S. Sutton; Brian Tanner


International Joint Conference on Artificial Intelligence | 2005

Temporal-difference networks with history

Brian Tanner; Richard S. Sutton


Archive | 2009

Generalized Domains for Empirical Evaluations in Reinforcement Learning

Shimon Whiteson; Brian Tanner; Matthew E. Taylor; Peter Stone


National Conference on Artificial Intelligence | 2004

Dynamic coalition formation in robotic soccer

John Anderson; Brian Tanner; Jacky Baltes

Collaboration


Dive into Brian Tanner's collaborations.

Top Co-Authors

Matthew E. Taylor

Washington State University


Peter Stone

University of Texas at Austin


Anna Koop

University of Alberta
