Murad Abu-Khalaf
University of Texas at Arlington
Publications
Featured research published by Murad Abu-Khalaf.
Automatica | 2005
Murad Abu-Khalaf; Frank L. Lewis
The Hamilton-Jacobi-Bellman (HJB) equation corresponding to constrained control is formulated using a suitable nonquadratic functional. It is shown that the constrained optimal control law has the largest region of asymptotic stability (RAS). The value function of this HJB equation is obtained by solving for a sequence of cost functions satisfying a sequence of Lyapunov equations (LEs). A neural network is used to approximate the cost function associated with each LE using the method of least squares on a well-defined region of attraction of an initial stabilizing controller. As the order of the neural network is increased, the least-squares solution of the HJB equation converges uniformly to the exact solution of the inherently nonlinear HJB equation associated with the saturating control inputs. The result is a nearly optimal constrained state feedback controller that has been tuned a priori off-line.
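As a rough illustration of the successive-approximation scheme described in this abstract, the sketch below applies it to an assumed scalar example (not taken from the paper): the value function is approximated by an even polynomial basis, each Lyapunov equation is imposed in a least-squares sense over a training region, and the saturated policy is recovered from the value gradient through the nonquadratic penalty.

```python
import numpy as np

# Minimal sketch of successive approximation for a constrained-input HJB on an
# assumed scalar example (dynamics, weights, and basis are illustrative only).
f = lambda x: -x + x**3                  # assumed drift
g = lambda x: 1.0                        # assumed input gain, |u| <= 1
Q, R = 1.0, 1.0                          # state and control weights

phi  = lambda x: np.array([x**2, x**4, x**6])        # even polynomial value basis
dphi = lambda x: np.array([2*x, 4*x**3, 6*x**5])     # its gradient

xs = np.linspace(-0.9, 0.9, 200)         # training region inside the RAS of u = 0

# Nonquadratic control penalty W(u) = 2R * integral_0^u atanh(v) dv for |u| < 1
W = lambda v: 2*R*(v*np.arctanh(v) + 0.5*np.log(1 - v**2))
# Saturated policy recovered from the value gradient
u_pol = lambda x, c: -np.tanh(0.5/R * g(x) * (dphi(x) @ c))

c = np.zeros(3)                          # value weights; u_pol(., c=0) = 0 is stabilizing here
for _ in range(20):                      # successive approximation on the Lyapunov equations
    u = np.array([u_pol(x, c) for x in xs])
    A = np.array([dphi(x) * (f(x) + g(x)*u[i]) for i, x in enumerate(xs)])
    b = np.array([-(Q*x**2 + W(u[i])) for i, x in enumerate(xs)])
    c, *_ = np.linalg.lstsq(A, b, rcond=None)        # least-squares value update
print("value weights:", c)
```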
IEEE Transactions on Systems, Man, and Cybernetics | 2008
Asma Al-Tamimi; Frank L. Lewis; Murad Abu-Khalaf
Convergence of the value-iteration-based heuristic dynamic programming (HDP) algorithm is proven in the case of general nonlinear systems. That is, it is shown that HDP converges to the optimal control and the optimal value function that solves the Hamilton-Jacobi-Bellman equation appearing in infinite-horizon discrete-time (DT) nonlinear optimal control. It is assumed that, at each iteration, the value and action update equations can be exactly solved. Two standard neural networks (NNs) are used: a critic NN approximates the value function, and an action NN approximates the optimal control policy. It is stressed that this approach allows the implementation of HDP without knowing the internal dynamics of the system. The exact-solution assumption holds for some classes of nonlinear systems and, in particular, for the DT linear quadratic regulator (LQR), where the action is linear, the value is quadratic in the state, and the NNs have zero approximation error. It is stressed that, for the LQR, HDP may be implemented without knowing the system A matrix by using two NNs. This fact is not generally appreciated in the folklore of HDP for the DT LQR, where typically only one critic NN is used.
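For the DT LQR special case mentioned in this abstract, the HDP value iteration reduces to a Riccati difference recursion. The short sketch below (with assumed A, B, Q, R, chosen only for illustration) checks that recursion against the ARE solution; note that, unlike this model-based check, the two-network implementation described in the paper does not use A explicitly.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Sketch of value iteration (HDP) for the DT LQR special case (assumed matrices).
A = np.array([[0.9, 0.2], [0.0, 1.05]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P = np.zeros((2, 2))                       # V_0 = 0
for _ in range(500):                       # V_{k+1}(x) = min_u [x'Qx + u'Ru + V_k(Ax + Bu)]
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # greedy (action) update
    P = Q + A.T @ P @ (A - B @ K)                        # critic (value) update
print("gap to ARE solution:", np.linalg.norm(P - solve_discrete_are(A, B, Q, R)))
```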
Automatica | 2009
Draguna Vrabie; Octavian Pastravanu; Murad Abu-Khalaf; Frank L. Lewis
In this paper we propose a new scheme based on adaptive critics for finding online the state-feedback, infinite-horizon, optimal control solution of linear continuous-time systems using only partial knowledge of the system dynamics. In other words, the algorithm solves an algebraic Riccati equation online without knowing the internal dynamics model of the system. Being based on a policy iteration technique, the algorithm alternates between policy evaluation and policy update steps until an update of the control policy no longer improves the system performance. The result is a direct adaptive control algorithm that converges to the optimal control solution without using an explicit, a priori obtained, model of the system internal dynamics. The effectiveness of the algorithm is demonstrated by finding the optimal load-frequency controller for a power system.
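A minimal sketch of the idea, under assumed dynamics and weights (not the power-system example from the paper): the value of the current policy is identified by least squares from data collected over short time intervals, so the policy-evaluation step never uses the internal dynamics matrix A; the model below is used only to simulate the plant and to check the result against the ARE.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Data-based policy iteration sketch for a CT LQR problem (assumed example).
A = np.array([[0.0, 1.0], [-1.0, 1.0]])    # used only to simulate the plant
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])

def simulate(x, K, T=0.1, dt=1e-3):
    """Integrate the closed loop u = -Kx for T seconds; return final state and accrued cost."""
    cost = 0.0
    for _ in range(int(T/dt)):
        u = -K @ x
        cost += float(x @ Q @ x + u @ R @ u) * dt
        x = x + dt * (A @ x + B @ u)
    return x, cost

quad = lambda x: np.array([x[0]**2, 2*x[0]*x[1], x[1]**2])   # basis for x' P x

K = np.array([[1.0, 2.0]])                 # assumed initial stabilizing gain
for _ in range(10):
    Phi, y = [], []
    for _ in range(20):                    # data from several short trajectories
        x0 = np.random.uniform(-1, 1, size=2)
        x1, c = simulate(x0, K)
        Phi.append(quad(x0) - quad(x1))    # V(x0) - V(x1) equals the integrated cost
        y.append(c)
    p, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    P = np.array([[p[0], p[1]], [p[1], p[2]]])
    K = np.linalg.solve(R, B.T @ P)        # policy improvement (uses B only, not A)
print("gap to ARE solution:", np.linalg.norm(P - solve_continuous_are(A, B, Q, R)))
```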
European Control Conference | 2007
Asma Al-Tamimi; Frank L. Lewis; Murad Abu-Khalaf
In this paper, the optimal strategies for discrete-time linear quadratic zero-sum games related to the H-infinity optimal control problem are solved in forward time without knowing the system dynamical matrices. The idea is to solve for an action-dependent value function Q(x,u,w) of the zero-sum game instead of solving for the state-dependent value function V(x), which satisfies a corresponding game algebraic Riccati equation (GARE). Since the state and action spaces are continuous, two action networks and one critic network are used and are adaptively tuned in forward time using adaptive critic methods. The result is a Q-learning approximate dynamic programming, model-free approach that solves the zero-sum game forward in time. It is shown that the critic converges to the game value function and the action networks converge to the Nash equilibrium of the game, and proofs of convergence of the algorithm are given. It is proven that the algorithm is, in effect, a model-free iterative algorithm for solving the GARE of the linear quadratic discrete-time zero-sum game. The effectiveness of this method is shown by performing an H-infinity control autopilot design for an F-16 aircraft.
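As a model-based reference for what the Q-learning scheme computes, the sketch below iterates the underlying GARE value recursion on an assumed discrete-time example (matrices and the attenuation level gamma are illustrative only); the paper's contribution is obtaining the same solution forward in time without knowing A, B, or E, by tuning an action-dependent Q-function instead.

```python
import numpy as np

# Value recursion for a DT linear quadratic zero-sum game (assumed example).
A = np.array([[0.5, 0.1], [0.0, 0.4]])
B = np.array([[0.0], [1.0]])             # control input matrix
E = np.array([[0.1], [0.0]])             # disturbance input matrix
Q, R, gamma = np.eye(2), np.array([[1.0]]), 1.5

P = np.zeros((2, 2))
for _ in range(200):
    M = np.block([[R + B.T @ P @ B,                 B.T @ P @ E],
                  [E.T @ P @ B,  E.T @ P @ E - gamma**2 * np.eye(1)]])
    N = np.vstack([B.T @ P @ A, E.T @ P @ A])
    P = Q + A.T @ P @ A - np.hstack([A.T @ P @ B, A.T @ P @ E]) @ np.linalg.solve(M, N)

KL = -np.linalg.solve(M, N)              # stacked saddle-point gains: [u; w] = KL @ x
print("control gain:", KL[0], "\ndisturbance gain:", KL[1])
```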
IEEE Transactions on Neural Networks | 2008
Murad Abu-Khalaf; Frank L. Lewis; Jie Huang
In this paper, neural networks are used along with two-player policy iterations to solve for the feedback strategies of a continuous-time zero-sum game that appears in the L2-gain optimal control (suboptimal H∞ control) of nonlinear systems affine in the input, with the control policy subject to saturation constraints. The result is a closed-form representation, on a prescribed compact set chosen a priori, of the feedback strategies and the value function that solves the associated Hamilton-Jacobi-Isaacs (HJI) equation. Closed-loop stability, L2-gain disturbance attenuation of the saturated neural network control feedback strategy, and uniform convergence results are proven. Finally, this approach is applied to the rotational/translational actuator (RTAC) nonlinear benchmark problem under actuator saturation, offering guaranteed stability and disturbance attenuation.
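A very small sketch of the two-loop structure on an assumed scalar example (not the RTAC benchmark): an inner loop updates the disturbance policy, an outer loop updates the saturated control policy, and each value update is a least-squares polynomial fit over a prescribed compact set.

```python
import numpy as np

# Two-player policy iteration sketch for a constrained-input HJI equation on an
# assumed scalar example; dynamics, weights, basis, and gamma are illustrative.
f, g, k = lambda x: -x, 1.0, 0.5         # xdot = f(x) + g*u + k*w, with |u| <= 1
Q, R, gamma = 1.0, 1.0, 2.0              # state/control weights, L2-gain level

phi  = lambda x: np.array([x**2, x**4, x**6])
dphi = lambda x: np.array([2*x, 4*x**3, 6*x**5])
xs = np.linspace(-0.8, 0.8, 200)         # prescribed compact set

W = lambda v: 2*R*(v*np.arctanh(v) + 0.5*np.log(1 - v**2))     # saturation penalty
u_pol = lambda x, c: -np.tanh(0.5/R * g * (dphi(x) @ c))        # saturated control policy
w_pol = lambda x, c: 0.5/gamma**2 * k * (dphi(x) @ c)           # disturbance policy

c = np.zeros(3)                          # value weights; u_pol(., 0) = 0 is stabilizing for f
for _ in range(8):                       # outer loop: control policy update
    u = np.array([u_pol(x, c) for x in xs])
    for _ in range(8):                   # inner loop: disturbance policy update
        w = np.array([w_pol(x, c) for x in xs])
        A = np.array([dphi(x) * (f(x) + g*u[i] + k*w[i]) for i, x in enumerate(xs)])
        b = np.array([-(Q*x**2 + W(u[i]) - gamma**2*w[i]**2) for i, x in enumerate(xs)])
        c, *_ = np.linalg.lstsq(A, b, rcond=None)
print("value weights:", c)
```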
Automatica | 2007
Tao Cheng; Frank L. Lewis; Murad Abu-Khalaf
We consider the use of neural networks and Hamilton-Jacobi-Bellman equations for obtaining fixed-final-time optimal control laws for nonlinear systems affine in the input. The method is based on Kronecker matrix methods along with neural network approximation over a compact set to solve a time-varying Hamilton-Jacobi-Bellman equation. The result is a neural network feedback controller that has time-varying coefficients found by a priori offline tuning. Convergence results are shown. The results of this paper are demonstrated on two examples.
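For intuition, in the LQR special case the time-varying value coefficients are exactly the entries of P(t) from a differential Riccati equation integrated backward from the final time. The sketch below (assumed A, B, Q, R, terminal weight, and horizon) computes those coefficients and the associated time-varying feedback gains offline, analogous to the a priori tuned time-varying NN weights described above.

```python
import numpy as np

# Finite-horizon LQR sketch: backward sweep of the differential Riccati equation.
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q, R, QT = np.eye(2), np.array([[1.0]]), 5*np.eye(2)
T, dt = 2.0, 1e-3
steps = int(T/dt)

P = QT.copy()                            # terminal condition V(x, T) = x' QT x
gains = []
for _ in range(steps):                   # -dP/dt = A'P + PA - P B R^{-1} B' P + Q
    K = np.linalg.solve(R, B.T @ P)      # time-varying feedback gain u = -K(t) x
    gains.append(K.copy())
    Pdot = A.T @ P + P @ A - P @ B @ K + Q
    P = P + dt * Pdot                    # stepping backward in time
gains.reverse()                          # reorder so gains run forward in time, from t ~ 0 to t = T
print("gain at t ~ 0:", gains[0])
```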
Automatica | 2007
Jyotirmay Gadewadikar; Frank L. Lewis; Lihua Xie; Vladimír Kučera; Murad Abu-Khalaf
This paper presents a simplified parameterization of all H∞ static state-feedback controllers in terms of a single algebraic Riccati equation and a free parameter matrix. As a special case, necessary and sufficient conditions for the existence of a static output-feedback gain are given. An efficient computational algorithm is given and its correctness proven. No initial stabilizing output-feedback gain is needed. The technique is used to design an H∞ lateral-directional command augmentation system for the F-16 aircraft.
IEEE Transactions on Neural Networks | 2007
Tao Cheng; Frank L. Lewis; Murad Abu-Khalaf
In this paper, fixed-final-time constrained optimal control laws using neural networks (NNs) to solve Hamilton-Jacobi-Bellman (HJB) equations for general constrained nonlinear systems affine in the input are proposed. An NN is used to approximate the time-varying cost function using the method of least squares on a predefined region. The result is a nearly optimal, constrained NN feedback controller that has time-varying coefficients found by a priori offline tuning. Convergence results are shown. The results of this paper are demonstrated on two examples, including a nonholonomic system.
Journal of Guidance Control and Dynamics | 2006
Jyotirmay Gadewadikar; Frank L. Lewis; Murad Abu-Khalaf
Necessary and sufficient conditions are presented for static output-feedback control of linear time-invariant systems using the H∞ approach. Simplified conditions are derived which only require the solution of two coupled matrix design equations. It is shown that the static output-feedback H∞ solution does not generally yield a well-defined saddle point for the zero-sum differential game; conditions are given under which it does. This paper also proposes a numerically efficient solution algorithm for the coupled design equations to determine the output-feedback gain. A major contribution is that an initial stabilizing gain is not needed. An F-16 normal acceleration design example is solved to illustrate the power of the proposed approach.
2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning | 2007
Draguna Vrabie; Murad Abu-Khalaf; Frank L. Lewis; Youyi Wang
Approximate dynamic programming (ADP) has been formulated and applied mainly to discrete-time systems. Expressing the ADP concept for continuous-time systems raises difficult issues related to sampling time and system model knowledge requirements. This paper presents a novel online adaptive critic (AC) scheme, based on ADP, that solves the infinite-horizon optimal control problem for continuous-time dynamical systems, thus bringing together concepts from the fields of computational intelligence and control theory. Only partial knowledge about the system model is used, as knowledge about the plant internal dynamics is not needed. The method is thus useful for determining the optimal controller for plants with partially unknown dynamics. It is shown that the proposed iterative ADP algorithm is in fact a quasi-Newton method for solving the underlying algebraic Riccati equation (ARE) of the optimal control problem. An initial gain that determines a stabilizing control policy is not required. In control theory terms, this paper develops a direct adaptive control algorithm for obtaining the optimal control solution without knowing the system A matrix.
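A model-based sketch of the Newton-type iteration on the ARE that the abstract refers to, with assumed A, B, Q, R and an assumed initial stabilizing gain: each pass evaluates the current policy through a Lyapunov equation and then improves the gain. In the online scheme described above, the evaluation step is instead performed from measured data (so A is not needed) and, per the abstract, an initial stabilizing gain is not required; the reference sketch below does start from one.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

# Newton/Kleinman-type policy iteration on the continuous-time ARE
# (assumed example matrices; used here only as a model-based reference).
A = np.array([[0.0, 1.0], [-1.0, 2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

K = np.array([[0.0, 5.0]])                                   # assumed initial stabilizing gain
for _ in range(10):
    Ak = A - B @ K                                           # closed loop under current policy
    P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))  # policy evaluation (Lyapunov equation)
    K = np.linalg.solve(R, B.T @ P)                          # policy improvement
print("gap to ARE solution:", np.linalg.norm(P - solve_continuous_are(A, B, Q, R)))
```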