George G. Lendaris
Portland State University
Publications
Featured research published by George G. Lendaris.
systems man and cybernetics | 2002
John J. Murray; Chadwick J. Cox; George G. Lendaris; Richard Saeks
Unlike the many soft computing applications where it suffices to achieve a good approximation most of the time, a control system must be stable all of the time. As such, if one desires to learn a control law in real-time, a fusion of soft computing techniques to learn the appropriate control law with hard computing techniques to maintain the stability constraint and guarantee convergence is required. The objective of the paper is to describe an adaptive dynamic programming algorithm (ADPA) which fuses soft computing techniques to learn the optimal cost (or return) functional for a stabilizable nonlinear system with unknown dynamics and hard computing techniques to verify the stability and convergence of the algorithm. Specifically, the algorithm is initialized with a (stabilizing) cost functional and the system is run with the corresponding control law (defined by the Hamilton-Jacobi-Bellman equation), with the resultant state trajectories used to update the cost functional in a soft computing mode. Hard computing techniques are then used to show that this process is globally convergent with stepwise stability to the optimal cost functional/control law pair for an (unknown) input affine system with an input quadratic performance measure (modulo the appropriate technical conditions). Three specific implementations of the ADPA are developed: 1) for the linear case, 2) for the nonlinear case using a locally quadratic approximation to the cost functional, and 3) for the nonlinear case using a radial basis function approximation of the cost functional. The implementations are illustrated by applications to flight control.
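For the linear case, the evaluate-then-improve loop described above reduces to classical policy iteration on a linear-quadratic regulator: solve a Lyapunov equation for the cost of the current stabilizing gain, then improve the gain via the Hamilton-Jacobi-Bellman stationarity condition. The sketch below assumes known dynamics and a toy double-integrator plant (the paper's setting is unknown dynamics learned online, which this offline illustration omits):

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

# Double-integrator plant (assumed here for illustration only).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)   # state cost
R = np.eye(1)   # input cost

K = np.array([[1.0, 2.0]])  # initial *stabilizing* gain: eigs of A - B@K are {-1, -1}
for _ in range(10):
    Ac = A - B @ K
    # Policy evaluation: solve Ac^T P + P Ac = -(Q + K^T R K) for the cost of gain K
    P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
    # Policy improvement: K = R^{-1} B^T P (HJB stationarity condition)
    K = np.linalg.solve(R, B.T @ P)
```

The iteration converges to the cost matrix of the algebraic Riccati equation, i.e. to the optimal cost functional/control law pair for this plant.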
international symposium on neural networks | 1997
George G. Lendaris; C. Paintz
This paper discusses strategies for, and details of, training procedures for the dual heuristic programming (DHP) methodology. This and other approximate dynamic programming approaches have been discussed in the literature, all being members of the adaptive critic design family. The paper suggests and investigates several alternative procedures and compares their performance with respect to convergence speed and quality of the resulting controller design. One modification is to introduce a second copy of the critic network (criticNN 2) for making the desired-output calculations, with criticNN 2 updated differently than criticNN 1: the idea is to provide the desired outputs from a stable platform during an epoch while adapting criticNN 1. Then, at the end of the epoch, criticNN 2 is made identical to the then-current adapted state of criticNN 1, and a new epoch starts. In this way, both criticNN 1 and the actionNN can be trained simultaneously online during each epoch, with faster overall convergence than the older approach. The measures used suggest that a better controller design (the actionNN) results.
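The two-copy schedule above can be sketched on a deliberately tiny problem: a scalar critic with one weight learning the cost-to-go of the stable plant x' = 0.9x with utility U = x², whose true value function is V(x) = x²/(1 - 0.81). All numeric values here are assumed toy choices, not from the paper; the point is only the update schedule, in which the frozen copy supplies targets for a whole epoch and is then overwritten:

```python
import numpy as np

rng = np.random.default_rng(0)
LR = 0.1                           # learning rate (assumed toy value)
c_true = 1.0 / (1.0 - 0.81)        # true V(x) = c x^2 solves c = 1 + 0.81 c

c, c_frozen = 0.0, 0.0             # criticNN 1 weight, and its criticNN 2 copy
for epoch in range(60):
    for _ in range(200):
        x = rng.uniform(-1.0, 1.0)
        x2 = x * x
        # Desired output comes from the *frozen* copy: U(x) + V_frozen(0.9 x)
        target = x2 + c_frozen * (0.9 * x) ** 2
        c -= LR * (c * x2 - target) * x2   # gradient step on criticNN 1 only
    c_frozen = c                   # end of epoch: copy criticNN 1 into criticNN 2
```

Because the targets do not move while criticNN 1 adapts, each epoch is an ordinary regression problem, which is what makes the simultaneous critic/action training stable.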
Technological Forecasting and Social Change | 1979
Harold A. Linstone; George G. Lendaris; Steven D. Rogers; Wayne W. Wakeland; Mark Williams
Structural modeling (SM) techniques are a set of geometric, semi-quantitative tools that can assist in organizing a technology assessment (TA), developing a rough overview of it, and analyzing various component problems. In this project about 100 SM techniques were identified and seven were tested in detail: ISM, ELECTRE, SPIN, KSIM, QSIM, IMPACT, and XIMP. Guidelines were developed to help the assessor in the choice and proper use of such tools.
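Of the seven techniques listed, ISM (interpretive structural modeling) has the simplest core computation: take a binary matrix of direct influences among system elements, form its transitive closure (the reachability matrix), and peel off hierarchy levels. A minimal sketch with an assumed four-element toy structure (not data from the study):

```python
import numpy as np

# Direct-influence (adjacency) matrix for four hypothetical elements:
# 0 -> 1 -> 2 and 3 -> 2
adj = np.zeros((4, 4), dtype=bool)
adj[0, 1] = adj[1, 2] = adj[3, 2] = True

def reachability(adj):
    """Warshall transitive closure, with each element reaching itself."""
    R = adj | np.eye(adj.shape[0], dtype=bool)
    for k in range(adj.shape[0]):
        R = R | (R[:, [k]] & R[[k], :])
    return R

def ism_levels(R):
    """Peel off ISM levels: an element is top-level when every element it
    still reaches also reaches it back (reachability set within antecedent set)."""
    remaining, levels = set(range(R.shape[0])), []
    while remaining:
        top = {i for i in remaining
               if all(R[j, i] for j in remaining if R[i, j])}
        levels.append(top)
        remaining -= top
    return levels
```

For the toy structure, element 2 (influenced by everything) forms the top level, then {1, 3}, then {0}, giving the rough hierarchical overview that ISM contributes to a TA.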
international symposium on neural networks | 2000
George G. Lendaris; Larry J. Schultz; Thaddeus T. Shannon
Selected adaptive critic (AC) methods are known to be capable of designing (approximately) optimal control policies for nonlinear plants. The present research focuses on an AC method known as dual heuristic programming (DHP). In particular, it is seen as useful to explore correspondences between the form of a utility function and the resulting controllers designed by the DHP method. Based on the task of designing a steering controller for a 2-axle, terrestrial, autonomous vehicle, the present paper employs a pair of critics to divide the labor of training the controller. Improvements in convergence of the training process are realized in this way. The controllers designed by the DHP method are reasonably robust, and demonstrate good performance on disturbances they were not trained on: encountering a patch of ice during a steering maneuver, and encountering a wind gust perpendicular to the direction of travel.
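The division of labor between two critics rests on a linearity property: if the utility splits into components (say, a position-error term and a heading-error term), the cost-to-go, and hence the state gradient that a DHP critic estimates, splits the same way, so each critic can be trained on one component and the results summed. A toy check of that linearity on an assumed stable linear closed loop (not the paper's vehicle model):

```python
import numpy as np

# Stable closed-loop dynamics x_{k+1} = A x_k (assumed toy matrix);
# two utility components, standing in for position vs heading error.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
Q_pos  = np.diag([1.0, 0.0])
Q_head = np.diag([0.0, 1.0])

def cost_to_go(A, Q, n_terms=300):
    """P such that J(x0) = x0^T P x0 for utility x^T Q x:  P = sum_k (A^k)^T Q A^k."""
    P, M = np.zeros_like(Q), np.eye(A.shape[0])
    for _ in range(n_terms):
        P += M.T @ Q @ M
        M = A @ M
    return P

P_split = cost_to_go(A, Q_pos) + cost_to_go(A, Q_head)   # two "critics" summed
P_total = cost_to_go(A, Q_pos + Q_head)                  # single critic
```

Since the DHP critic target is the gradient lambda(x) = 2 P x, equality of the two P matrices means the summed pair of critics trains toward the same quantity as a single critic on the full utility.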
systems man and cybernetics | 2008
F. L. Lewis; Derong Liu; George G. Lendaris
The 18 papers in this special issue focus on adaptive dynamic programming and reinforcement learning in feedback control.
systems man and cybernetics | 2003
Stephen Shervais; Thaddeus T. Shannon; George G. Lendaris
A set of neural networks is employed to develop control policies that are better than fixed, theoretically optimal policies, when applied to a combined physical inventory and distribution system in a nonstationary demand environment. Specifically, we show that model-based adaptive critic approximate dynamic programming techniques can be used with systems characterized by discrete valued states and controls. The control policies embodied by the trained neural networks outperformed the best, fixed policies (found by either linear programming or genetic algorithms) in a high-penalty cost environment with time-varying demand.
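The discrete-state, discrete-control setting the paper addresses can be illustrated with exact tabular dynamic programming on a toy single-echelon inventory problem, with assumed costs and demand (the paper's contribution is precisely that neural adaptive critics scale to combined inventory/distribution systems where a full table like this is infeasible):

```python
import numpy as np

CAP, GAMMA, HOLD, SHORT = 10, 0.95, 1.0, 9.0   # assumed toy parameters
demand = [0, 1, 2, 3]                           # equally likely demand each period

V = np.zeros(CAP + 1)
for _ in range(500):                            # value iteration over discrete states
    V_new = np.empty_like(V)
    policy = np.zeros(CAP + 1, dtype=int)
    for s in range(CAP + 1):                    # s: on-hand stock
        best = None
        for a in range(CAP - s + 1):            # a: discrete order quantity
            y = s + a                           # stock after ordering
            q = 0.0
            for d in demand:
                left = max(y - d, 0)            # lost-sales dynamics
                q += (HOLD * left + SHORT * max(d - y, 0) + GAMMA * V[left]) / len(demand)
            if best is None or q < best:
                best, policy[s] = q, a
        V_new[s] = best
    V = V_new

# With no fixed ordering cost, the optimal policy is order-up-to (base-stock):
ups = {s + a for s, a in enumerate(policy) if a > 0}
```

With these costs the critical ratio 9/(9+1) = 0.9 puts the base-stock level at 3, and the computed policy orders up to exactly that level from every state below it.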
systems man and cybernetics | 1999
George G. Lendaris; Karl Mathia; Richard Saeks
It is shown that a Hopfield neural network (with linear transfer functions) augmented by an additional feedforward layer can be used to compute the Moore-Penrose generalized inverse of a matrix. The resultant augmented linear Hopfield network can be used to solve an arbitrary set of linear equations or, alternatively, to solve a constrained least squares optimization problem. Applications in signal processing and robotics are considered. In the former case the augmented linear Hopfield network is used to estimate the structured noise component of a signal and adjust the parameters of an appropriate filter on-line, whereas in the latter case it is used to implement an on-line solution to the inverse kinematics problem.
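The paper's network is an analog recurrent circuit, but the principle it exploits can be sketched in discrete time: a linear Hopfield network relaxes along the gradient of an energy function, here E(x) = ½‖Ax − b‖², whose equilibrium is the least-squares solution given by the Moore-Penrose inverse. A minimal sketch of that relaxation (a plain gradient iteration standing in for the analog dynamics):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))        # overdetermined system; full column rank a.s.
b = rng.normal(size=5)

# Discrete-time relaxation of the energy E(x) = 0.5 * ||A x - b||^2; the
# fixed point is the constrained least-squares solution x* = pinv(A) @ b.
x = np.zeros(3)
eta = 1.0 / np.linalg.norm(A, 2) ** 2   # step size below the stability bound
for _ in range(20000):
    x -= eta * A.T @ (A @ x - b)

x_pinv = np.linalg.pinv(A) @ b     # reference answer for comparison
```

The same iteration run continuously on-line is what lets the applications mentioned above (filter adaptation, inverse kinematics) track a changing A or b without re-inverting anything.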
international symposium on neural networks | 2009
George G. Lendaris
Some three decades ago, certain computational intelligence methods of reinforcement learning were recognized as implementing an approximation of Bellman's dynamic programming method, which is known in the controls community as an important tool for designing optimal control policies for nonlinear plants and for sequential decision making. Significant theoretical and practical developments have occurred within this arena, mostly in the past decade, with the methodology now usually referred to as Adaptive Dynamic Programming (ADP). The objective of this paper is to provide a retrospective of selected threads of such developments. In addition, a commentary is offered concerning the present status of ADP, and threads for future research and development within the controls field are suggested.
ieee international conference on fuzzy systems | 2000
Thaddeus T. Shannon; George G. Lendaris
We show the applicability of the dual heuristic programming (DHP) method of approximate dynamic programming to parameter tuning of a fuzzy control system. DHP and related techniques have been developed in the neurocontrol context but can be equally productive when used with fuzzy controllers or neuro-fuzzy hybrids. We demonstrate this technique on a highly nonlinear 2nd order plant proposed by Sanner and Slotine (1992). Throughout our example application, we take advantage of the Takagi-Sugeno model framework to initialize our tunable parameters with reasonable problem specific values, a practice difficult to perform when applying DHP to neurocontrol.
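The Takagi-Sugeno structure referred to above makes the tunable parameters explicit: each rule fires with a membership-based strength and contributes a consequent that is linear in the state, and the control is their normalized weighted sum. A minimal single-input sketch with assumed membership centers and gains (the gains are the parameters DHP would tune; none of these numbers are from the paper):

```python
import numpy as np

centers = np.array([-1.0, 1.0])    # rule membership centers (assumed)
sigma = 0.5                        # membership width (assumed)
gains = np.array([2.0, -2.0])      # per-rule linear consequent gains: the tunables

def ts_control(x):
    """Takagi-Sugeno inference: normalized firing-strength-weighted sum of
    linear rule consequents u_i = gains[i] * x."""
    w = np.exp(-0.5 * ((x - centers) / sigma) ** 2)   # Gaussian firing strengths
    u_rules = gains * x
    return np.sum(w * u_rules) / np.sum(w)
```

Because each gain has a clear local meaning ("how to act near this operating region"), problem-specific initial values are easy to write down, which is the initialization advantage the abstract contrasts with neurocontrol.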
systems man and cybernetics | 2008
George G. Lendaris
Two distinguishing features of humanlike control vis-a-vis current technological control are the ability to make use of experience while selecting a control policy for distinct situations and the ability to do so faster and faster as more experience is gained (in contrast to current technological implementations, which slow down as more knowledge is stored). The notions of context and context discernment are important to understanding this human ability. Whereas methods known as adaptive control and learning control focus on modifying the design of a controller as changes in context occur, experience-based (EB) control entails selecting a previously designed controller that is appropriate to the current situation. Developing the EB approach entails a shift of the technologist's focus "up a level": away from designing individual (optimal) controllers to developing online algorithms that efficiently and effectively select designs from a repository of existing controller solutions. A key component of the notions presented here is that of a higher-level learning algorithm. This is a new application of reinforcement learning and, in particular, approximate dynamic programming, with its focus shifted to the posited higher level; it is employed here with very promising results. The authors' hope for this paper is to inspire and guide future work in this promising area.
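The "select rather than redesign" shift can be caricatured in a few lines: keep a repository of controllers indexed by the context each was designed for, discern the current context from a short observation window, and pick the nearest stored design. This toy replaces the paper's higher-level learning algorithm with plain nearest-neighbor lookup on a scalar plant x_{t+1} = a·x_t + u_t (all values assumed for illustration):

```python
import numpy as np

# Repository of pre-designed controllers, indexed by the plant context (pole a)
# each was designed for; gains place the closed-loop pole a - k at 0.5.
repository = {a: a - 0.5 for a in (0.8, 1.2, 1.6)}   # context a -> gain k

def discern_context(xs):
    """Estimate the open-loop pole a from a short run of x_{t+1} = a x_t."""
    xs = np.asarray(xs)
    return np.sum(xs[:-1] * xs[1:]) / np.sum(xs[:-1] ** 2)

def select_controller(xs):
    """Context discernment followed by selection from the repository."""
    a_hat = discern_context(xs)
    nearest = min(repository, key=lambda a: abs(a - a_hat))
    return repository[nearest]
```

Selection is a table lookup, so it gets no slower as the repository grows; the paper's point is that the *learning* of what to select can itself be done with reinforcement learning / approximate dynamic programming at this higher level.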