
Publication


Featured research published by Draguna Vrabie.


IEEE Circuits and Systems Magazine | 2009

Reinforcement learning and adaptive dynamic programming for feedback control

Frank L. Lewis; Draguna Vrabie

Living organisms learn by acting on their environment, observing the resulting reward stimulus, and adjusting their actions accordingly to improve the reward. This action-based or reinforcement learning can capture notions of optimal behavior occurring in natural systems. We describe mathematical formulations for reinforcement learning and a practical implementation method known as adaptive dynamic programming. These give us insight into the design of controllers for man-made engineered systems that both learn and exhibit optimal behavior.
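As background (standard formulations for this literature, not quoted from the article), the mathematical setting centers on the value of a policy $\pi$ for a discrete-time system $x_{k+1} = F(x_k, u_k)$ with stage reward $r$:

$$V^{\pi}(x_k) = \sum_{i=k}^{\infty} \gamma^{\,i-k} r\big(x_i, \pi(x_i)\big), \qquad V^{\pi}(x_k) = r\big(x_k, \pi(x_k)\big) + \gamma V^{\pi}(x_{k+1}).$$

Adaptive dynamic programming interleaves approximate solution of this Bellman equation with the greedy improvement $\pi'(x_k) = \arg\max_u \big[ r(x_k, u) + \gamma V^{\pi}(F(x_k, u)) \big]$, evaluating both from measured data rather than from a model.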


Automatica | 2009

Brief paper: Adaptive optimal control for continuous-time linear systems based on policy iteration

Draguna Vrabie; Octavian Pastravanu; Murad Abu-Khalaf; Frank L. Lewis

In this paper we propose a new scheme based on adaptive critics for finding online the state feedback, infinite horizon, optimal control solution of linear continuous-time systems using only partial knowledge regarding the system dynamics. In other words, the algorithm solves online an algebraic Riccati equation without knowing the internal dynamics model of the system. Being based on a policy iteration technique, the algorithm alternates between the policy evaluation and policy update steps until an update of the control policy no longer improves the system performance. The result is a direct adaptive control algorithm which converges to the optimal control solution without using an explicit, a priori obtained, model of the system internal dynamics. The effectiveness of the algorithm is demonstrated by finding the optimal load-frequency controller for a power system.
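For the linear-quadratic case treated here, the evaluation/update loop is the online counterpart of Kleinman's classical model-based iteration for the algebraic Riccati equation. A minimal sketch of that model-based analogue follows (matrices chosen purely for illustration); the paper's algorithm replaces the Lyapunov solve with least squares over measured trajectory data, which is what removes the dependence on A:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

# Model-based analogue (Kleinman's iteration) of the paper's online
# policy iteration: each policy-evaluation step solves a Lyapunov
# equation, each policy-update step computes K = R^{-1} B^T P.
# The online scheme evaluates the policy from trajectory data instead,
# so A is never used there. Matrices below are illustrative.
A = np.array([[0.0, 1.0], [-1.0, -2.0]])   # Hurwitz, so K = 0 stabilizes
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

K = np.zeros((1, 2))                        # initial stabilizing gain
for _ in range(10):
    Ac = A - B @ K                          # closed loop under current policy
    # Policy evaluation: Ac^T P + P Ac + Q + K^T R K = 0
    P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
    K = np.linalg.solve(R, B.T @ P)         # policy update

print(np.allclose(P, solve_continuous_are(A, B, Q, R)))  # True
```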


IEEE Control Systems Magazine | 2012

Reinforcement Learning and Feedback Control: Using Natural Decision Methods to Design Optimal Adaptive Controllers

Frank L. Lewis; Draguna Vrabie; Kyriakos G. Vamvoudakis

This article describes the use of principles of reinforcement learning to design feedback controllers for discrete- and continuous-time dynamical systems that combine features of adaptive control and optimal control. Adaptive control [1], [2] and optimal control [3] represent different philosophies for designing feedback controllers. Optimal controllers are normally designed offline by solving Hamilton-Jacobi-Bellman (HJB) equations, for example, the Riccati equation, using complete knowledge of the system dynamics. Determining optimal control policies for nonlinear systems requires the offline solution of nonlinear HJB equations, which are often difficult or impossible to solve. By contrast, adaptive controllers learn online to control unknown systems using data measured in real time along the system trajectories. Adaptive controllers are not usually designed to be optimal in the sense of minimizing user-prescribed performance functions. Indirect adaptive controllers use system identification techniques to first identify the system parameters and then use the obtained model to solve optimal design equations [1]. Adaptive controllers may satisfy certain inverse optimality conditions [4].
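For reference, the offline design equations mentioned above take the following standard forms (background, not quoted from the article). For input-affine dynamics $\dot{x} = f(x) + g(x)u$ with cost rate $Q(x) + u^{T} R u$, the HJB equation and the resulting optimal policy are

$$0 = \min_{u}\Big[\, Q(x) + u^{T} R u + \nabla V^{*}(x)^{T}\big(f(x) + g(x)u\big) \Big], \qquad u^{*}(x) = -\tfrac{1}{2} R^{-1} g(x)^{T} \nabla V^{*}(x),$$

and in the linear-quadratic special case $V^{*}(x) = x^{T} P x$, where $P$ solves the algebraic Riccati equation $A^{T} P + P A + Q - P B R^{-1} B^{T} P = 0$.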


Neural Networks | 2009

2009 Special Issue: Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems

Draguna Vrabie; Frank L. Lewis

In this paper we present, in a continuous-time framework, an online approach to direct adaptive optimal control with infinite horizon cost for nonlinear systems. The algorithm converges online to the optimal control solution without knowledge of the internal system dynamics. Closed-loop dynamic stability is guaranteed throughout. The algorithm is based on a reinforcement learning scheme, namely policy iteration, and makes use of neural networks, in an Actor/Critic structure, to parametrically represent the control policy and the performance of the control system. The two neural networks are trained to express the optimal controller and the optimal cost function which describes the infinite horizon control performance. Convergence of the algorithm is proven under the realistic assumption that the two neural networks do not provide perfect representations for the nonlinear control and cost functions. The result is a hybrid control structure which involves a continuous-time controller and a supervisory adaptation structure which operates based on data sampled from the plant and from the continuous-time performance dynamics. Such a control structure is unlike any standard form of controller previously seen in the literature. Simulation results, obtained for two second-order nonlinear systems, are provided.
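To make the actor/critic mechanics concrete, here is a minimal sketch shrunk to a scalar plant, with a one-term polynomial critic V(x) = w*x^2 standing in for the paper's critic neural network; the plant, cost, and basis are illustrative assumptions, and only the structure (critic fit from integral cost data, then greedy actor update) mirrors the method:

```python
import numpy as np

# Scalar actor/critic policy iteration. Plant dx/dt = a*x + u and cost
# x^2 + u^2 are illustrative; the critic update uses only sampled
# trajectory data, never the value of `a_true`.
a_true = -1.0            # internal dynamics, unknown to the learner
dt, T = 0.001, 0.05      # Euler step and reinforcement interval

def rollout(x0, k):
    """Integrate the plant under u = -k*x for time T, accumulating cost."""
    x, cost = x0, 0.0
    for _ in range(int(T / dt)):
        u = -k * x
        cost += (x**2 + u**2) * dt
        x += (a_true * x + u) * dt
    return x, cost

k = 0.0                                    # initial stabilizing gain
for _ in range(8):
    # Critic (policy evaluation): fit w to the integral Bellman equation
    # w*phi(x(t)) = integral cost + w*phi(x(t+T)),  with phi(x) = x^2
    X, y = [], []
    for x0 in np.linspace(-2.0, 2.0, 20):  # excitation over the state space
        xT, c = rollout(x0, k)
        X.append(x0**2 - xT**2)
        y.append(c)
    w = np.linalg.lstsq(np.array(X)[:, None], np.array(y), rcond=None)[0][0]
    # Actor (policy update): u = -0.5 * R^{-1} g dV/dx = -w*x
    k = w

print(k)   # approaches sqrt(2) - 1, the LQR-optimal gain for this plant
```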


Mediterranean Conference on Control and Automation | 2009

Adaptive optimal controllers based on Generalized Policy Iteration in a continuous-time framework

Draguna Vrabie; Kyriakos G. Vamvoudakis; Frank L. Lewis

In this paper we present two adaptive algorithms which solve the continuous-time optimal control problem for nonlinear, input-affine, time-invariant systems. Both algorithms were developed based on the Generalized Policy Iteration technique and involve adaptation of two neural network structures, namely an Actor, which provides the control signal, and a Critic, which evaluates the control performance. Despite the similarities, the two adaptive algorithms differ in the manner in which adaptation takes place, the knowledge of the system dynamics they require, and the formulation of the persistence-of-excitation requirement. The main difference is that one algorithm uses sequential adaptation of the actor and critic structures, i.e., while one is trained the other is kept constant, whereas for the second algorithm the two neural networks are trained synchronously in a continuous-time fashion. The two algorithms are described in detail and proofs of convergence are provided. Simulation results of applying the two algorithms to find the optimal state feedback controller of a nonlinear system are also presented.


2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning | 2007

Continuous-Time ADP for Linear Systems with Partially Unknown Dynamics

Draguna Vrabie; Murad Abu-Khalaf; Frank L. Lewis; Youyi Wang

Approximate dynamic programming has been formulated and applied mainly to discrete-time systems. Expressing the ADP concept for continuous-time systems raises difficult issues related to sampling time and system-model knowledge requirements. This paper presents a novel online adaptive critic (AC) scheme, based on approximate dynamic programming (ADP), that solves the infinite horizon optimal control problem for continuous-time dynamical systems, thus bringing together concepts from the fields of computational intelligence and control theory. Only partial knowledge about the system model is used, as knowledge about the plant internal dynamics is not needed. The method is thus useful for determining the optimal controller for plants with partially unknown dynamics. It is shown that the proposed iterative ADP algorithm is in fact a quasi-Newton method for solving the underlying algebraic Riccati equation (ARE) of the optimal control problem. An initial gain that determines a stabilizing control policy is not required. In control-theory terms, the paper develops a direct adaptive control algorithm that obtains the optimal control solution without knowing the system A matrix.
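For context, the quasi-Newton claim can be read against the classical Newton picture (standard background, summarized rather than taken from the paper). With the Riccati operator

$$\mathrm{Ric}(P) = A^{T} P + P A + Q - P B R^{-1} B^{T} P,$$

Newton's method for $\mathrm{Ric}(P) = 0$ is Kleinman's policy iteration: with $K_k = R^{-1} B^{T} P_k$, each step solves the Lyapunov equation $(A - B K_k)^{T} P_{k+1} + P_{k+1}(A - B K_k) + Q + K_k^{T} R K_k = 0$. The online scheme approximates such steps from trajectory data without forming $A$, hence its quasi-Newton character.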


Conference on Decision and Control | 2008

Adaptive optimal control algorithm for continuous-time nonlinear systems based on policy iteration

Draguna Vrabie; Frank L. Lewis

In this paper we develop a new online adaptive control scheme, for partially unknown nonlinear systems, which converges to the optimal state feedback control solution for nonlinear systems that are affine in the inputs. The derivation of the optimal adaptive control algorithm is presented in a continuous-time framework. The optimal control solution is obtained in a direct fashion, without system identification. The algorithm is an online approach to policy iteration, based on an adaptive critic structure, to find an approximate solution to the state feedback, infinite-horizon, optimal control problem.
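The identity that removes the need for the internal dynamics is the integral form of the policy-evaluation (Bellman) equation along the measured closed-loop trajectory, the standard relation for this family of methods: for a policy $\mu$ and running cost $r$,

$$V^{\mu}\big(x(t)\big) = \int_{t}^{t+T} r\big(x(\tau), \mu(x(\tau))\big)\, d\tau + V^{\mu}\big(x(t+T)\big),$$

which involves only sampled states and accumulated cost, so $f(x)$ never enters the critic update.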


Conference on Decision and Control | 2011

Parameter estimation of a building system model and impact of estimation error on closed-loop performance

Sorin Bengea; Veronica Adetola; Keunmo Kang; Michael J. Liba; Draguna Vrabie; Robert R. Bitmead; Satish Narayanan

Predictive-control methods have recently been employed for demand-response control of building and district-level HVAC systems. Such approaches rely on models and parameter estimates to meet comfort constraints and to achieve the theoretical system-efficiency gains. In this paper we present a methodology that establishes achievable targets for control-model parameter estimation errors based on closed-loop performance sensitivity. The control algorithm is designed as a Model Predictive Controller (MPC) that uses perturbed building-model parameters. We perform simulations to estimate the dependency of energy cost and constraint-violation time on the magnitude of these perturbations. The simulation results are used to define targets for the parameter estimation errors, which in turn are applied to specify the excitation characteristics and the model structure used for identification. We design a parameter estimator and perform Monte Carlo simulations for a model that includes sensor noise and load uncertainty. The distributions of the estimation errors are used to demonstrate that the established targets are met.
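The sensitivity-study loop itself is simple to sketch. Below is a minimal, purely illustrative version: a first-order zone model with time constant tau stands in for the building model, and a certainty-equivalence heating law stands in for the MPC; only the structure of the Monte-Carlo perturbation study mirrors the paper.

```python
import numpy as np

# Illustrative Monte-Carlo sensitivity study: perturb the controller's
# estimate of a building-model parameter and record the effect on
# closed-loop energy cost and comfort violations. The zone model and
# heating law are hypothetical stand-ins, not the paper's MPC.
rng = np.random.default_rng(0)
dt, hours = 0.1, 24.0                      # simulation step and horizon (h)
tau_true, T_set, T_out = 4.0, 21.0, 5.0    # time constant (h), degC, degC

def closed_loop(tau_est):
    """Simulate one day with heating sized from the estimated model."""
    T, energy, bad_hours = 15.0, 0.0, 0.0
    for _ in range(int(hours / dt)):
        # feedforward from the (possibly wrong) model + feedback correction
        u = max(0.0, (T_set - T_out) / tau_est + 2.0 * (T_set - T))
        T += dt * ((T_out - T) / tau_true + u)
        energy += u * dt
        bad_hours += dt * (abs(T - T_set) > 0.5)   # comfort-band violation
    return energy, bad_hours

for rel_err in (0.0, 0.1, 0.3):            # estimation-error magnitudes
    runs = []
    for _ in range(200):
        tau_est = max(0.5, tau_true * (1.0 + rel_err * rng.standard_normal()))
        runs.append(closed_loop(tau_est))
    energy, bad = np.mean(runs, axis=0)
    print(f"rel. error {rel_err:.1f}: energy {energy:7.2f}, "
          f"hours outside band {bad:5.2f}")
```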


International Symposium on Neural Networks | 2009

Generalized Policy Iteration for continuous-time systems

Draguna Vrabie; Frank L. Lewis

In this paper we present a unified point of view on the Approximate Dynamic Programming (ADP) algorithms developed in recent years for continuous-time (CT) systems. We introduce, in a continuous-time formulation, Generalized Policy Iteration (GPI), and show that it represents a spectrum of algorithms with the exact Policy Iteration (PI) algorithm at one end and the Value Iteration (VI) algorithm at the other. In the middle of the spectrum we formulate, for the first time for CT systems, the Optimistic Policy Iteration (OPI) algorithm. We introduce the GPI starting from a new formulation of the PI algorithm in which the value function at the policy-evaluation step is itself solved for iteratively. The GPI algorithm is implemented on an Actor/Critic structure. The results allow implementation of a family of adaptive controllers which converge online to the solution of the optimal control problem without knowing or identifying the internal dynamics of the system. Simulation results are provided to verify convergence to the optimal control solution.
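The spectrum can be indexed by the number $j$ of incomplete value-update sweeps taken per policy update. In the integral notation used in this line of work (notation assumed here, not quoted from the paper), one GPI cycle performs, for $i = 0, \dots, j-1$,

$$V_{i+1}\big(x(t)\big) = \int_{t}^{t+T} r\big(x(\tau), \mu_k(x(\tau))\big)\, d\tau + V_{i}\big(x(t+T)\big), \qquad \mu_{k+1}(x) = -\tfrac{1}{2} R^{-1} g(x)^{T} \nabla V_{j}(x).$$

Taking $j \to \infty$ solves the policy-evaluation equation exactly and recovers PI; $j = 1$ gives a VI-like update; the finite $j > 1$ middle ground is OPI.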


IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning | 2009

Online policy iteration based algorithms to solve the continuous-time infinite horizon optimal control problem

Kyriakos G. Vamvoudakis; Draguna Vrabie; Frank L. Lewis

In this paper we discuss two online algorithms, based on policy iteration, for learning the continuous-time (CT) optimal control solution for nonlinear systems with infinite horizon quadratic cost.

Collaboration


Dive into Draguna Vrabie's collaboration.

Top Co-Authors

Frank L. Lewis

University of Texas at Arlington

Vassilis L. Syrmos

University of Hawaii at Manoa

Murad Abu-Khalaf

University of Texas at Arlington

Asma Al-Tamimi

University of Texas at Arlington

Daniel S. Levine

University of Texas at Arlington
