Leemon C. Baird
Wright Laboratory
Publication
Featured research published by Leemon C. Baird.
Adaptive Behavior | 1993
Leemon C. Baird; A. Harry Klopf
An associative control process (ACP) network is a learning control system that can reproduce a variety of animal learning results from classical and instrumental conditioning experiments (Klopf, Morgan, & Weaver, 1993; see also the article, A Hierarchical Network of Control Systems that Learn). The ACP networks proposed and tested by Klopf, Morgan, and Weaver are not guaranteed, however, to learn optimal policies for maximizing reinforcement. Optimal behavior is guaranteed for a reinforcement learning system such as Q-learning (Watkins, 1989), but simple Q-learning is incapable of reproducing the animal learning results that ACP networks reproduce. We propose two new models that reproduce the animal learning results and are provably optimal. The first model, the modified ACP network, embodies the smallest number of changes necessary to the ACP network to guarantee that optimal policies will be learned while still reproducing the animal learning results. The second model, the single-layer ACP network, embodies the smallest number of changes necessary to Q-learning to guarantee that it reproduces the animal learning results while still learning optimal policies. We also propose a hierarchical network architecture within which several reinforcement learning systems (e.g., Q-learning systems, single-layer ACP networks, or any other learning controller) can be combined in a hierarchy. We implement the hierarchical network architecture by combining four of the single-layer ACP networks to form a controller for a standard inverted pendulum dynamic control problem. The hierarchical controller is shown to learn more reliably and more than an order of magnitude faster than either the single-layer ACP network or the Barto, Sutton, and Anderson (1983) learning controller for the benchmark problem.
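For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of tabular one-step Q-learning (Watkins, 1989). It is not the ACP network or either of the paper's proposed models. The toy chain environment, the function names `q_learning` and `greedy_policy`, and all parameter values are assumptions chosen purely for illustration.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular one-step Q-learning on a hypothetical toy chain.

    States are 0..n_states-1; action 0 moves left (staying put at state 0),
    action 1 moves right. Reaching the last state yields reward 1 and ends
    the episode. None of this comes from the paper; it is illustration only.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # The terminal state contributes no future value.
            future = 0.0 if s_next == goal else gamma * max(q[s_next])
            # One-step Q-learning update toward the bootstrapped target.
            q[s][a] += alpha * (r + future - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [0 if row[0] > row[1] else 1 for row in q[:-1]]


if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table))
```

On this toy chain the greedy policy settles on moving right in every state, which is the optimal policy for the assumed reward structure.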
Archive | 2001
Leemon C. Baird; Mance E. Harmon; R. Reed Young; James E. Armstrong
International Conference on Machine Learning | 1996
Cesar Bandera; Francisco J. Vico; José Manuel Bravo; Mance E. Harmon; Leemon C. Baird
IEEE Transactions on Reliability | 1996
Mance E. Harmon; Leemon C. Baird
Neural Information Processing Systems | 1994
Mance E. Harmon; Leemon C. Baird; A. Harry Klopf
Archive | 2007
Leemon C. Baird; Mance E. Harmon; John Kelly Hughes
Archive | 2015
Mance E. Harmon; Leemon C. Baird; David Chase; David Waite
Simulation of Adaptive Behavior | 1993
Leemon C. Baird; A. Harry Klopf
Archive | 2008
Leemon C. Baird; John Kelly Hughes
Archive | 1996
Leemon C. Baird; Mance E. Harmon; A. Harry Klopf