Unconditionnally stable scheme for Riccati equation
aa r X i v : . [ m a t h . NA ] J a n Unconditionnally stable schemefor Riccati equation
François Dubois a and Abdelkader Saïdi b a Conservatoire National des Arts et Métiers15 rue Marat, F-78 210 Saint Cyr l’Ecole, France. b Institut de Recherche Mathématique AvancéeUniversité Louis Pasteur7 Rue René-Descartes, 67084 Strasbourg Cedex, France.
July 2000 ∗ Abstract.
In this contribution we present a numerical scheme for the resolution of matrixRiccati equation used in control problems. The scheme is unconditionnally stable and the solutionis definite positive at each time step of the resolution. We prove the convergence in the scalarcase and present several numerical experiments for classical test cases.
Keywords : control problems, ordinary differential equations, stability.
AMS classification : 34H05, 49K15, 65L20, 93C15. ∗ Published,
ESAIM: Proceedings François Dubois and Abdelkader Saïdi
1) Introduction
We study the optimal control of a differential linear system(1) d y d t = A y + B v , where the state variable y ( t ) belongs to IR n and the control function v ( • ) takes its valuesin IR m , with n and m beeing given integers. Matrix A is composed by n lines and n columns and matrix B contains n lines and m columns. Both matrices A and B areindependent of time. With the ordinary differential equation (1) is associated an initialcondition(2) y (0) = y with y given in IR n and the solution of system (1)(2) is parametrized by the function v ( • ) :The control problem consists of finding the minimum u ( • ) of some quadratic functional J ( • ) :(3) J ( u ( • )) ≤ J ( v ( • )) , ∀ v ( • ) . The functional J ( • ) depends on the control variable function v ( • ) , is defined by thehorizon T > , the symmetric semi-definite positive n by n constant matrix Q and thesymmetric definite positive m by m constant matrix R . We set classically :(4) J ( v ( • )) = 12 Z T ( Qy ( t ) , y ( t )) d t + 12 Z T ( Rv ( t ) , v ( t )) d t . • Problem (1)(2)(3)(4) is a classical linear quadratic mathematical modelling of dy-namical systems in automatics (see e.g.
Lewis [Le86]). When the control function v ( • ) is supposed to be square integrable ( v ( • ) ∈ L (]0; T [ , IR m ) ) then the control problem(1)(2)(3)(4) has a unique solution u ( • ) ∈ L (]0; T [ , IR m ) (see for instance Lions [Li68]).When there is no constraint on the control variable the minimum u ( • ) of the functional J ( v ) is characterized by the condition:(5) d J ( u ) • w = 0 , ∀ w ∈ L (]0 , T [ , IR m ) , which is not obvious to compute directly. • When we introduce the differential equation (1) as a constraint between y ( • ) and v ( • ) ,the associated Lagrange multiplyer p ( • ) is a function of time and is classically named theadjoint variable. Research of a minimum for J ( • ) (condition (5)) can be rewritten in theform of research of a saddle point and the evolution equation for the adjoint variable isclassical (see e.g. Lewis [Le86]):(6) d p d t + A t p + Q y = 0 , with a final condition at t = T ,(7) p ( T ) = 0 and the optimal control in terms of the adjoint state p ( • ) takes the form:(8) R u ( t ) + B t p ( t ) = 0 . nconditionnally stable scheme for Riccati equation • We observe that the differential system (1)(6) together with the initial condition (2)and the final condition (7) is coupled through the optimality condition (8). In practice, weneed a linear feedback function of the state variable y ( t ) instead of the adjoint variable p ( t ) . Because adjoint state p ( • ) depends linearily on state variable y ( • ) we can set: p ( t ) = X ( T − t ) • y ( t ) , ≤ t ≤ T , with a symmetric n by n matrix X ( • ) which is positive definite. The final condition (7)is realized for each value y ( T ) , then we have the following condition:(9) X (0) = 0 . We set K = BR − B t ; we remark that matrix K is symmetric positive definite, we replacethe control u ( t ) by its value obtained in relation (8) and we deduce after elementaryalgebra the evolution equation for the transition matrix X ( • ) :(10) d X d t − (cid:0) XA + A t X (cid:1) + X K X − Q = 0 , which defines the Riccati equation associated with the control problem (1)(2)(3)(4). • In this paper we study the numerical approximation of differential system (9)(10).Recall that the given matrices Q and K are n × n symmetric matrices, with Q semi-definite positive and K positive definite; the matrix A is an n by n matrix without anyother condition and the unknown matrix X ( t ) is symmetric. We have the followingproperty (see e.g. Lewis [Le86]).
Proposition 1. The solution of Riccati equation is positive definite.
Let K , Q , A be given n × n matrices with K , Q symmetric, Q positive and K definitepositive. Let X ( • ) be the solution of the Riccati differential equation (10) with initialcondition (9). Then X ( t ) is well defined for any t ≥ , is symmetric and for each t > , X ( t ) is definite positive and tends to a definite positive matrix X ∞ as t tends to infinity: X ( t ) −→ X ∞ if t −→ ∞ . Matrix X ∞ is the unique positive symmetric matrix which issolution of the so-called algebraic Riccati equation: − ( XA + A t X ) + XKX − Q = 0 . • As a consequence of this proposition it is usefull to simplify the feedback commandlaw (8) by the associated limit command obtained by taking t −→ ∞ , that is:(11) v ( t ) = − R − B t X ∞ y ( t ) , and the differential system (1) (11) is stable (see e.g. [Le86]). The practical computationof matrix X ∞ by direct methods is not obvious and we refer e.g. to Laub [La79]. Ifwe wish to compute directly a numerical solution of instationnary Riccati equation (10)classical methods for ordinary differential equations like e.g. the forward Euler method t ( X j +1 − X j ) + X j K X j − ( At X j + X j A ) − Q = 0 , François Dubois and Abdelkader Saïdi or Runge Kutta method fail to maintain positivity of the iterate X j +1 at the order ( j + 1) :(12) ( X j +1 x , x ) > , ∀ x ∈ IR n , x = 0 , if X j is positive definite and if time step ∆ t > is not small enough (see e.g. Dieci andEirola [DE96]). Morever, there is to our best knowledge no simple way to determine apriori if time step ∆ t > is compatible or not with condition (12). • In the following, we propose a method for numerical integration of Riccati equation(10) which maintains condition (12) for each time step ∆ t > . We present in secondsection the simple case of scalar Riccati equation and present the numerical scheme andits principal properties of the general case in section 3. We describe several numericalexperiments in section 4.
2) Scalar Riccati equation • When the unknown is a scalar variable, we write Riccati equation in the followingform:(13) d x d t + k x − a x − q = 0 , with(14) k > , q ≥ , and an initial condition:(15) x (0) = d, d ≥ . We approach the ordinary differential equation (13) with a finite difference scheme ofthe type proposed by Baraille [Ba91] for hypersonic chemical kinetics and independentlywith the “family method” proposed by Cariolle [Ca79] and studied by Miellou [Mi84]. Wesuppose that time step ∆ t is strictly positive. The idea is to write the approximation x j +1 at time step ( j + 1)∆ t as a rational fraction of x j with positive coefficients. Wedecompose first the real number a into positive and negative parts : a = a + − a − ; a + = max(0; a ) ≥ , a − = max(0; − a ) ≥ , a + a − = 0 and factorize the product x intothe very simple form: (cid:0) x (cid:1) j +1 / = x j x j +1 . Definition 1. Numerical scheme in the scalar case.
For resolution of the scalar differential equation (13), we define our numerical scheme bythe following relation:(16) x j +1 − x j ∆ t + k x j x j +1 − a + x j + 2 a − x j +1 − q = 0 . • The scheme (16) is implicit because some linear equation has to be solved to compute x j +1 when x j is supposed to be given. In the case of our scheme this equation is linearand the solution x j +1 is obtained from scheme (16) by the homographic relation:(17) x j +1 = (cid:0) a + ∆ t (cid:1) x j + q ∆ tk ∆ t x j + (1 + 2 a − ∆ t ) . nconditionnally stable scheme for Riccati equation Proposition 2. Algebraic properties of the scalar homographic scheme.
Let ( x j ) j ∈ IN be the sequence defined by initial condition : x = x (0) = d and recurrencerelation (17). Then sequence ( x j ) j ∈ IN is globally defined and remains positive for eachtime step: x j > , ∀ j ∈ IN , ∀ ∆ t > . If ∆ t > is chosen such that:(18) | a | ∆ t − k q ∆ t = 0 , then ( x j ) j ∈ IN converges towards the positive solution x ∗ of the “algebraic Riccati equation” k x − a x − q = 0 and(19) x ∗ = 1 k (cid:16) a + p a + kq (cid:17) . • In the exceptional case where ∆ t > is chosen such that (18) is not satisfied, thenthe sequence ( x j ) j ∈ IN is equal to the constant a + ∆ tk ∆ t for j ≥ and the scheme (17)cannot be used for the approximation of Riccati equation (13). Theorem 1. Convergence of the scalar scheme.
We suppose that the data k, a, q of Riccati equation satisfy (14) and (18) and that thedatum d of condition (15) is relatively closed to x ∗ , i.e. :(20) − k τ + η ≤ d − x ∗ ≤ C , where C is some given strictly positive constant ( C > , x ∗ calculated according torelation (19) is the limit in time of the Riccati equation, τ is defined from data k, a, q by: τ = 12 p a + kq , and η is some constant chosen such that(21) < η < k τ . • We denote by x ( t ; d ) the solution of differential equation (13) with initial condition(15). Let ( x j (∆ t ; d ∆ )) ( j ∈ IN) be the solution of the numerical scheme defined at therelation (17) and let d ∆ be the initial condition: x (∆ t ; d ∆ ) = d ∆ . We suppose that the numerical initial condition d ∆ > satisfies a condition analogousto (20): − k τ + η ≤ d ∆ − x ∗ ≤ C , with C and η > equal to the constant introduced in (20) and satisfying (21). • Then the approximated value ( x j (∆ t ; d ∆ )) j ∈ IN is arbitrarily closed to the exact value x ( j ∆ t ; d ) for each j as ∆ t −→ and d ∆ −→ d . More precisely, if a = 0 we have thefollowing estimate for the error at time equal to j ∆ t : | x ( j ∆ t ; d ) − x j (∆ t ; d ∆ ) |≤ A (∆ t + | d − d ∆ | ) , ∀ j ∈ IN , < ∆ t ≤ B François Dubois and Abdelkader Saïdi with constants
A > , B > , depending on data k, a, q, η but independent on time step ∆ t > and iteration j . • If a = 0 , the scheme is second order accurate in the following sense: | x ( j ∆ t ; d ) − x j (∆ t ; d ∆ ) | ≤ A (∆ t + | d − d ∆ | ) , ∀ j ∈ IN , < ∆ t ≤ B with constants A et B independent on time step ∆ t and iteration j .A direct application of the Lax theorem for numerical scheme associated to ordinary dif-ferential equations is not straightforward because both Riccati equation and the numericalscheme are nonlinear. Our proof is detailed in [DS2k].
3) Matrix Riccati equation
In order to define a numerical scheme to solve the Riccati differential equation (10) withinitial condition (9) we first introduce a strictly positive real number, which is chosenpositive in such a way that the real matrix [ µ I − ( A + A t)] is definite positive:(22)
12 ( µ x , x ) − ( A x , x ) > , ∀ x = 0 . Then we introduce the definite positive matrix M wich depends on µ and matrix A : M = 12 µ I − A .
The numerical scheme is then defined by analogy with relation (16). We have the followingdecomposition :(23) A = A + − A − with A + = µ I , A − = M, µ > , M positive definite. Taking as an explicit part thepositive contribution A + of the decomposition (23) of matrix A and in the implicit partthe negative contribution A − = M of the decomposition (23), we get(24) t ( X j +1 − X j ) + 12 ( X j KX j +1 + X j +1 KX j ) ++ ( M t X j +1 + X j +1 M ) = µX j + Q .
The numerical solution given by the scheme X j +1 at time step j + 1 is then defined asa solution of Lyapunov matrix equation with matrix X as unknown: S t j X + X S j = Y j with(25) S j = 12 I + ∆ t KX j + ∆ t M and(26) Y j = X j + µ ∆ t X j + ∆ t Q . We notice that S j is a (non necessarily symmetric) positive matrix and that Y j is asymmetric definite positive matrix if it is the case for X j . nconditionnally stable scheme for Riccati equation Definition 2. Symmetric matrices.
Let n be an integer greater or equal to . We define by S n (IR) , (respectively S + n (IR) , S + ∗ n (IR) ) the linear space (respectively the closed cone, the open cone) of symmetric-matrices (respectively symmetric positive and symmetric definite positive matrices). Thefollowing inclusions S + ∗ n (IR) ⊂ S + n (IR) ⊂ S n (IR) are natural. Proposition 3. Property of the Lyapunov equation.
Let S be a matrix which is not necessary symmetric, such that the associated quadraticform: IR n ∋ x ( x, Sx ) ∈ IR , is strictly positive i.e. S + S t ∈ S + ∗ n (IR) . Then the application ϕ defined by :(27) S n (IR) ∋ X ϕ ( X ) = S t X + X S ∈ S n (IR) , is a one to one bijective application on the space S n (IR) of real symmetric matrices oforder n . Morever, if matrix ϕ ( X ) is positive (respectively definite positive) then thematrix X is also positive (respectively definite positive): if ϕ ( X ) ∈ S + n (IR) , then X ∈ S + n (IR) . • The numerical scheme has been written as an equation with unknown X = X j +1 which takes the form: ϕ j ( X ) = Y j with ϕ j given by a relation of the type (27) with thehelp of matrix S j defined in (25) and a datum matrix Y j defined by relation (26). Thenwe have the following propositions. Proposition 4. Homographic scheme computes a definite positive matrix.
The matrix X j defined by numerical scheme (24) with the initial condition X = 0 ispositive for each time step ∆ t > : X j ∈ S + n (IR) , ∀ j ≥ . If there exists some integer m such that X m belongs to the open cone S + ∗ n (IR) , thenmatrix X m + j belongs to the open cone S + ∗ n (IR) for each j . Proposition 5. Monotonicity.
Under the condition (cid:0) KX ∞ + X ∞ K (cid:1) < (cid:0) µ + 1∆ t (cid:1) I , the scheme (24) is monotone and we have more precisely :(28) (cid:16) ≤ X j ≤ X ∞ (cid:17) = ⇒ (cid:16) ≤ X j ≤ X j +1 ≤ X ∞ (cid:17) . François Dubois and Abdelkader Saïdi
4) First numerical experiments4-1 Square root function • The first example studied is the resolution of the equation :(29) d X d t + X − Q = 0 , X (0) = 0 with n = 2 , A = 0 , K = I and matrix Q equal to(30) Q = 12 (cid:18) −
11 1 (cid:19) (cid:18) (cid:19) (cid:18) − (cid:19) . • We have tested our numerical scheme for fixed value ∆ t = 1 / and different valuesof parameter µ : µ = 0 . , − , +6 . For small values of parameter µ, the behaviourof the scheme does not change between µ = 0 . and µ = 10 − . Figures 1 to 4 showthe evolution with time of the eigenvalues of matrix X j and the convergence is achievedto the square root of matrix Q. For large value of parameter µ ( µ = 10 +6 ) , we loosecompletely consistency of the scheme (see figures 5 and 6). Figures 1 and 2. Square root function test.
Two first eigenvalues of numerical solution ( µ = 0 . ). Figures 3 and 4. Square root function test.
Two first eigenvalues of numerical solution ( µ = 10 − ). nconditionnally stable scheme for Riccati equation Figures 5 and 6. Square root function test.
Two first eigenvalues of numerical solution ( µ = 10 +6 ). • The second exemple is the classical harmonic oscillator. Dynamical system y ( t ) isgoverned by the second order differential equation with command v ( t ) :(31) d y ( t )d t + 2 δ d y ( t )d t + ω y ( t ) = b v ( t ) . This equation is written as a first order system of differential equations :(32) Y = y ( t )d y ( t )d t ! , d Y d t = (cid:18) − ω − δ (cid:19) Y ( t ) + (cid:18) b v ( t ) (cid:19) . In this case, we have tested the stability of the scheme for fixed value of parameter µ ( µ = 0 . and different values of time step ∆ t and coefficients of matrix R inside thecost function of relation (4): R = (cid:18) α α (cid:19) . • We have chosen three sets of parameters : α = ∆ t = 1 / (reference experiment,figures 7 and 8), α = 10 − , ∆ t = 1 / (very small value for α , figures 9 and 10) and α = 1 / , ∆ t = 100 (too large value for time step, figures 11 and 12). Note thatfor the last set of parameters, classical explicit schemes fail to give any answer. As inprevious test case, we have represented the two eigenvalues of discrete matrix solution X j as time is increasing. On reference experiment (figures 7 and 8), we have convergenceof the solution to the solution of algebraic Riccati equation. If control parameter α ischosen too small, the first eigenvalue of Riccati matrix oscillates during the first timesteps but reach finally the correct values of limit matrix, the solution of algebraic Riccatiequation. If time step is too large, we still have stability but we loose also monotonicity.Nevertheless, convergence is achieved as in previous case.0 François Dubois and Abdelkader Saïdi
Figures 7 and 8. Harmonic oscillator.
Two first eigenvalues of numerical solution ( µ = 0 . , α = 0 . , ∆ t = 0 . ). Figures 9 and 10. Harmonic oscillator.
Two first eigenvalues of numerical solution ( µ = 0 . , α = 10 − , ∆ t = 0 . ). Figures 11 and 12. Harmonic oscillator.
Two first eigenvalues of numerical solution ( µ = 0 . , α = 0 . , ∆ t = 100 ). nconditionnally stable scheme for Riccati equation
5) Conclusion
We have proposed a numerical scheme for the resolution of the matrix Riccati equation.The scheme is implicit, unconditionnaly stable, needs to use only one scalar parameterand to solve a linear system of equations for each time step. This scheme is convergentin the scalar case and has good monotonicity properties in the matrix case. Our firstnumerical experiments show stability and robustness when various parameters have largevariations. Situations where classical explicit schemes fail to give a solution compatiblewith the property that solution of Riccati equation is a definite positive matrix have beencomputed. We expect to prove convergence in the matrix case and we will present in[DS2k] experiments on realistic test models such as a string of vehicles and the discretizedwave equation.
Acknowledgments
The authors thank Marius Tucsnak for helpfull comments on the first draft of this paper.
References [Ba91] R. Baraille. Développement de schémas numériques adaptés à l’hydrodynamique,
Thèse de l’Université Bordeaux 1 , décembre 1991.[Ca79] D. Cariolle. Modèle unidimentionnel de chimie de l’ozone, Internal note,
Etablisse-ment d’Etudes et de Recherches Météorologiques , Paris 1979.[DE96] L. Dieci, T. Eirola. Preserving monotonicity in the numerical solution of Riccatidifferential equations,
Numer. Math., vol .
74, p. 35-47, 1996.[DS2k] F. Dubois, A. Saïdi. Homographic scheme for Riccati equation,
CNAM-IAT In-ternal Report number 338-2K, 29 august 2000, and
IRMA Research Report number2000-32, Université Louis Pasteur, Strasbourg, september 2000, hal-00554484, 2011.[La79] A.J. Laub. A Schur Method for Solving Algebraic Riccati Equations,
IEEE Trans.Aut. Control , vol . AC-24, p. 913-921, 1979.[Le86] F.L. Lewis.
Optimal Control , J. Wiley-Interscience, New York, 1986.[Li68] J.L. Lions.
Contrôle optimal des systèmes gouvernés par des équations aux dérivéespartielles , Dunod, Paris, 1968.[Mi84] J.C. Miellou. Existence globale pour une classe de systèmes paraboliques semi-linéaires modélisant le problème de la stratosphère : la méthode de la fonctionagrégée,