Brian Tanner
University of Alberta
Publications
Featured research published by Brian Tanner.
Symposium on Abstraction, Reformulation and Approximation | 2005
Robert C. Holte; Jeffery Grajkowski; Brian Tanner
Pattern databases enable difficult search problems to be solved very quickly, but are large and time-consuming to build. They are therefore best suited to situations where many problem instances are to be solved, and less than ideal when only a few instances are to be solved. This paper examines a technique, hierarchical heuristic search, designed especially for the latter situation. The key idea is to compute, on demand, only those pattern database entries needed to solve a given problem instance. Our experiments show that Hierarchical IDA* can solve individual problems very quickly, up to two orders of magnitude faster than the time required to build an entire high-performance pattern database.
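The on-demand idea can be illustrated with a small sketch: instead of enumerating a full pattern database up front, the heuristic solves the abstracted instance lazily the first time an entry is requested and caches it, and plain IDA* consumes those entries. This is only a sketch of the general technique; the 2x3 sliding-tile domain and the "keep tiles 1 and 2" abstraction are illustrative assumptions, not taken from the paper.

```python
# Sketch: pattern-database entries computed on demand (lazily, with caching)
# rather than precomputed, used as the heuristic inside plain IDA*.
from collections import deque
from functools import lru_cache

WIDTH, HEIGHT = 3, 2
GOAL = (1, 2, 3, 4, 5, 0)          # 0 is the blank

def neighbours(state):
    """Yield states reachable by sliding a tile into the blank."""
    i = state.index(0)
    r, c = divmod(i, WIDTH)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < HEIGHT and 0 <= nc < WIDTH:
            j = nr * WIDTH + nc
            s = list(state)
            s[i], s[j] = s[j], s[i]
            yield tuple(s)

PATTERN = {1, 2}                    # tiles kept by the (assumed) abstraction

def abstract(state):
    """Map a concrete state to its pattern-database key."""
    return tuple(t if t in PATTERN or t == 0 else -1 for t in state)

@lru_cache(maxsize=None)
def pdb_entry(abs_state):
    """On-demand PDB entry: breadth-first search in the abstract space,
    computed only when first requested, then cached."""
    goal = abstract(GOAL)
    frontier, seen = deque([(abs_state, 0)]), {abs_state}
    while frontier:
        s, g = frontier.popleft()
        if s == goal:
            return g
        for n in (abstract(x) for x in neighbours(s)):
            if n not in seen:
                seen.add(n)
                frontier.append((n, g + 1))
    return float("inf")

def heuristic(state):
    return pdb_entry(abstract(state))

def ida_star(start):
    """Plain IDA* using the lazily built pattern database as its heuristic."""
    def dfs(state, g, bound, path):
        f = g + heuristic(state)
        if f > bound:
            return f
        if state == GOAL:
            return "FOUND"
        minimum = float("inf")
        for n in neighbours(state):
            if n in path:
                continue
            t = dfs(n, g + 1, bound, path | {n})
            if t == "FOUND":
                return "FOUND"
            minimum = min(minimum, t)
        return minimum
    bound = heuristic(start)
    while True:
        t = dfs(start, 0, bound, {start})
        if t == "FOUND":
            return bound
        bound = t

if __name__ == "__main__":
    start = (1, 5, 2, 4, 0, 3)      # a scrambled instance, three moves from the goal
    print("solution length:", ida_star(start))
```

Because entries are memoized, only the abstract states actually touched while solving this one instance are ever expanded, which is the situation the paper targets.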
International Conference on Machine Learning | 2005
Brian Tanner; Richard S. Sutton
Temporal-difference (TD) networks have been introduced as a formalism for expressing and learning grounded world knowledge in a predictive form (Sutton & Tanner, 2005). Like conventional TD(0) methods, the learning algorithm for TD networks uses 1-step backups to train prediction units about future events. In conventional TD learning, the TD(λ) algorithm is often used to do more general multi-step backups of future predictions. In our work, we introduce a generalization of the 1-step TD network specification that is based on the TD(λ) learning algorithm, creating TD(λ) networks. We present experimental results that show TD(λ) networks can learn solutions in more complex environments than TD networks. We also show that in problems that can be solved by TD networks, TD(λ) networks generally learn solutions much faster than their 1-step counterparts. Finally, we present an analysis of our algorithm that shows that the computational cost of TD(λ) networks is only slightly more than that of TD networks.
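As an illustration of the multi-step backups referred to above, the sketch below implements standard TD(λ) with accumulating eligibility traces for a single linear prediction unit. It is not the TD-network learning algorithm itself, and the random feature and reward sequences are assumed purely for demonstration.

```python
# Sketch: standard TD(lambda) with accumulating eligibility traces for one
# linear prediction unit (the multi-step backup mechanism, not TD networks).
import numpy as np

def td_lambda(features, rewards, alpha=0.1, gamma=0.9, lam=0.8):
    """Run one episode of TD(lambda) over feature vectors and rewards;
    return the learned weight vector."""
    n = features.shape[1]
    w = np.zeros(n)                 # linear value-function weights
    e = np.zeros(n)                 # eligibility trace
    for t in range(len(rewards)):
        x, x_next = features[t], features[t + 1]
        # 1-step TD error, as in TD(0)
        delta = rewards[t] + gamma * (w @ x_next) - (w @ x)
        # the trace spreads the error over recently visited features,
        # which is what gives TD(lambda) its multi-step backups
        e = gamma * lam * e + x
        w += alpha * delta * e
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, n = 20, 5
    features = rng.integers(0, 2, size=(T + 1, n)).astype(float)
    rewards = rng.normal(size=T)
    print(td_lambda(features, rewards))
```

With lam=0 the trace reduces to the current feature vector and the update collapses back to the 1-step TD(0) backup used by the original TD networks.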
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning | 2011
Shimon Whiteson; Brian Tanner; Matthew E. Taylor; Peter Stone
Empirical evaluations play an important role in machine learning. However, the usefulness of any evaluation depends on the empirical methodology employed. Designing good empirical methodologies is difficult in part because agents can overfit test evaluations and thereby obtain misleadingly high scores. We argue that reinforcement learning is particularly vulnerable to environment overfitting and propose as a remedy generalized methodologies, in which evaluations are based on multiple environments sampled from a distribution. In addition, we consider how to summarize performance when scores from different environments may not have commensurate values. Finally, we present proof-of-concept results demonstrating how these methodologies can validate an intuitively useful range-adaptive tile coding method.
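A minimal sketch of a generalized evaluation in this spirit: each agent is scored on many environments sampled from a distribution, and performance is summarized by per-environment rank, since raw returns from different environments need not be on commensurate scales. The environment generator and the toy agents below are hypothetical stand-ins, not the paper's benchmark.

```python
# Sketch: evaluate agents over environments sampled from a distribution and
# summarize with mean rank (rank 1 = best) rather than raw, incommensurate scores.
import random

def sample_environment(rng):
    """Stand-in for drawing an environment from a distribution:
    here, just a random target the agent must guess."""
    return rng.uniform(-1.0, 1.0)

def evaluate(agent, env):
    """Stand-in evaluation: negative squared error as the 'return'."""
    return -(agent(env) - env) ** 2

def generalized_evaluation(agents, n_envs=100, seed=0):
    rng = random.Random(seed)
    envs = [sample_environment(rng) for _ in range(n_envs)]
    mean_rank = {name: 0.0 for name in agents}
    for env in envs:
        ordered = sorted(agents, key=lambda name: evaluate(agents[name], env),
                         reverse=True)
        for rank, name in enumerate(ordered, start=1):
            mean_rank[name] += rank / n_envs
    return mean_rank

if __name__ == "__main__":
    agents = {
        "always-zero": lambda env: 0.0,
        "half-target": lambda env: 0.5 * env,   # hypothetical better agent
    }
    print(generalized_evaluation(agents))
```

Because agents are tuned without access to the sampled test environments, scoring well here requires generalizing across the distribution rather than overfitting any single environment.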
Journal of Machine Learning Research | 2009
Brian Tanner; Adam White
Neural Information Processing Systems | 2004
Richard S. Sutton; Brian Tanner
AI Magazine | 2010
Shimon Whiteson; Brian Tanner; Adam White
International Joint Conference on Artificial Intelligence | 2005
Eddie Rafols; Mark B. Ring; Richard S. Sutton; Brian Tanner
International Joint Conference on Artificial Intelligence | 2005
Brian Tanner; Richard S. Sutton
Archive | 2009
Shimon Whiteson; Brian Tanner; Matthew E. Taylor; Peter Stone
National Conference on Artificial Intelligence | 2004
John Anderson; Brian Tanner; Jacky Baltes