Linear Quadratic Optimal Control and Stabilization for Discrete-time Markov Jump Linear Systems
Chunyan Han^a, Hongdan Li^b, Wei Wang^b, Huanshui Zhang^{b,*}

^a School of Electrical Engineering, University of Jinan, Jinan, Shandong 250022, China
^b School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
Abstract
This paper investigates the optimal control and stabilization problems for linear discrete-time Markov jump systems. The finite-horizon optimal controller is considered in the general case where the input weighting matrix in the performance index is only required to be positive semi-definite. A necessary and sufficient condition for the existence of the finite-horizon optimal controller is given explicitly in terms of a set of coupled difference Riccati equations (CDRE). One of the key techniques is to solve the forward and backward stochastic difference equations (FBSDEs) obtained from the maximum principle. For the infinite-horizon case, we establish a necessary and sufficient condition for stabilization of the Markov jump linear system in the mean square sense: the system is stabilizable under the optimal controller if and only if the associated coupled algebraic Riccati equation (CARE) admits a unique positive solution. The optimal controller and the optimal cost function for the infinite-horizon case are also expressed explicitly.
Keywords: optimal control, stabilization, Markov jump linear system
Over the last decades there has been a steadily rising level of activity on linear systems subject to abrupt changes in their structure. The case in which the jumps are modeled by a Markov chain is referred to as Markov jump linear systems (MJLS) and has lately received a great deal of attention in the literature. Applications of these models can be found, for instance, in robotics tracking and estimation, communication networks, flight systems, etc. We refer to the books [9], [20] and the references therein for a general overview of control and filtering problems for MJLS. The study of the discrete-time Markovian jump linear quadratic (JLQ) control problem can be traced back (at least) to the work of Blair and Sworder [1] for the finite-time horizon case.
This work is supported by the National Natural Science Foundation of China (Nos. 61473134, 61573220, 61120106011, 61573221). *Corresponding author: Huanshui Zhang. Email: [email protected].

Transition probabilities in x- and u-dependent form were also considered. In [8], a necessary and sufficient condition was presented for the existence of a positive semi-definite solution of the coupled algebraic Riccati-like equations occurring in infinite-horizon JLQ problems; however, the existence of the optimal controller and the stability of the closed-loop system were not discussed. In [9], both finite-horizon and infinite-horizon optimal control were considered: the optimal controller was derived from a set of coupled difference Riccati equations (CDRE) for the former problem, and from the mean square stabilizing solution of a set of coupled algebraic Riccati equations (CARE) for the latter. Existence of the finite-horizon optimal controller was guaranteed by the positive definiteness of the input weighting matrix (a sufficient condition), and the mean square stabilizing solution of the infinite-horizon problem was derived under the assumption that a stabilizing solution to the CARE exists. Subsequently, the results of [9] were extended to constrained quadratic control problems [10] and to finite-horizon output feedback quadratic optimal control problems [11]. In [12], a new detectability concept (weak detectability) for discrete-time MJLS was presented, which supplies a sufficient condition for mean square stability of the infinite-horizon linear quadratic controlled system. Weak detectability retrieves the idea that each non-observed state corresponds to a stable mode of the system, and it was shown in [12] that mean square detectability implies weak detectability. Later, Costa and do Val [13] summarized the available results and gave a proposition on mean square stabilizability of the system.
Under the assumption that the system is weakly detectable, the system is mean square stabilizable if and only if there exists a positive semi-definite solution to the CARE, and a method for seeking the stabilizing solution of the CARE was supplied in [13]. Further, [22] studied the state-feedback JLQ control problem for discrete-time Markov jump linear systems in which the Markov chain takes values in a general Borel space. It was shown that the solution of the JLQ optimal control problem is obtained in terms of the positive semi-definite solution of M coupled algebraic Riccati equations, and sufficient conditions for the existence and uniqueness of this positive semi-definite solution were obtained based on the concepts of stochastic stabilizability and stochastic detectability [9]. The output feedback JLQ optimal control problem for this type of systems was studied in [23]. Continuous-time versions of the jump linear quadratic control problem were solved for finite-time horizons by Sworder [15] and Wonham [16]: Sworder used a stochastic maximum principle to obtain his result, while Wonham used dynamic programming. Wonham also solved the infinite-time horizon version of this control problem and derived a set of sufficient conditions for the existence of a unique, finite steady-state solution. Mariton [17] considered a discounted cost version of the problem, where a controller that ensures stability in all forms was obtained; however, the paper used the unstated hypothesis that the diagonal entries of the generator of the jump process are equal. A discussion of this result appears in [18]. Mariton and Bertrand [19] also considered an output feedback version of the JLQ problem; necessary optimality conditions were derived and two computational algorithms proposed, but it was not possible to demonstrate their convergence.
In [7], a necessary and sufficient condition for stochastic stabilizability of the system was provided, where stochastic stabilizability means boundedness of the infinite-horizon performance index. Under the prerequisite that the optimal solution of the infinite-horizon JLQ problem exists, a necessary and sufficient stabilizing condition for the JLQ solution was provided in [14] via the definition of weak detectability. In [21], a stochastic maximum principle was provided for finite-horizon optimal control problems of continuous-time forward-backward Markovian regime-switching systems; the control system was described by forward-backward stochastic differential equations modulated by continuous-time, finite-state Markov chains, and necessary and sufficient conditions for the optimal control were obtained. In summary, the aforementioned works have greatly advanced finite- and infinite-horizon optimal control theory. For discrete-time finite-horizon optimal control problems for MJLS, sufficient conditions for the existence of the optimal controller are guaranteed by the positive definiteness of the input penalty matrix R or of a set of matrix expressions, but no necessary conditions were presented. For the discrete-time infinite-horizon optimal stabilization problem, necessary conditions [9], [10], [11], sufficient conditions [4], [12], and necessary and sufficient conditions [3], [13] for the existence of optimal stabilizing controllers were provided. It should be pointed out that the necessary and sufficient conditions supplied in [3] are not easily tested, since they require the simultaneous solution of coupled matrix equations containing infinite sums, and the conditions developed in [13], being based on the concept of weak detectability, are likewise not easy to test.
To find an easily checkable necessary and sufficient condition for mean square stabilizability of the system within the linear optimal control framework is therefore still an interesting problem. In recent works, Zhang et al. [24]-[26] considered the linear quadratic regulation (LQR) and stabilization problems for multiplicative-noise systems with input delays. A necessary and sufficient condition for the existence of the finite-horizon LQR controller was established, and an explicit solution was given by solving the forward and backward stochastic differential/difference equations (FBSDEs) obtained from the maximum principle (MP). Inspired by the results developed in [24]-[26], we propose a new MP-based approach to the LQR and stabilization problems for MJLS. Compared with multiplicative-noise systems, jump parameter systems are more complicated because of the correlation of the jump parameters at adjacent times. We assume that the state variable and the jump parameter are available to the controller. In the second part of this paper we address the finite-horizon LQR problem for discrete-time MJLS, where the input penalty matrix R is only required to be positive semi-definite. This greatly relaxes the constraint imposed in previous control problems, which forms the first contribution of this paper. We first extend the stochastic maximum principle of [24] to jump parameter systems and develop a new forward-backward Markov jump difference equation (FBMJDE). By solving the FBMJDE, a necessary and sufficient condition for the existence of the optimal controller is given, together with an explicit analytical expression for the optimal controller. In the third part, a necessary and sufficient condition for the stabilizing solution of the infinite-horizon optimal control problem is provided, the optimal constant-gain controller is expressed explicitly, and the finite value of the infinite-horizon performance index is given.
A special case is also considered in Corollary 1. Compared with existing results [9], [12], [13], the stabilization condition in Corollary 1 is easy to test: no precondition needs to be verified, and it only requires determining the existence of a positive definite solution to a set of CARE. The stochastic maximum principle and the explicit relationship between the optimal costate and the system state explored in this paper play an important role in the derivation of the results.

Notations: Throughout this paper, $\mathbb{R}^n$ denotes the $n$-dimensional Euclidean space and $\mathbb{R}^{m\times n}$ denotes the normed linear space of all $m\times n$ real matrices. For $L\in\mathbb{R}^{n\times n}$, $L'$ stands for the transpose of $L$. As usual, $L\ge 0$ ($L>0$) means that the symmetric matrix $L\in\mathbb{R}^{n\times n}$ is positive semi-definite (positive definite).

We consider in this section the finite-horizon optimal control problem for the Markov jump linear system (MJLS) when the state variable $x(k)$ and the jump variable $\theta(k)$ are available to the controller. On the stochastic basis $(\Omega,\mathcal{G},\mathcal{G}_k,\mathrm{P})$, consider the MJLS
$$x(k+1)=A_{\theta(k)}(k)x(k)+B_{\theta(k)}(k)u(k), \qquad (1)$$
where $x(k)\in\mathbb{R}^n$ is the state and $u(k)\in\mathbb{R}^m$ is the control input. Here $\theta(k)$ is a discrete-time Markov chain with finite state space $\{1,2,\cdots,L\}$ and transition probabilities $\lambda_{i,j}=\mathrm{P}(\theta(k+1)=j\,|\,\theta(k)=i)$, $i,j=1,2,\cdots,L$. We set $\pi_i(k)=\mathrm{P}(\theta(k)=i)$, $i=1,2,\cdots,L$, while $A_i(k),B_i(k)$, $i=1,\cdots,L$, are matrices of appropriate dimensions. The initial value $x(0)$ is known, and we assume that $\theta(k)$ is independent of $x(0)$.

The quadratic cost associated with system (1) under an admissible control law $u=(u(0),\cdots,u(N))$ is given by
$$J_N=\mathrm{E}\Big[\sum_{k=0}^{N}x(k)'Q_{\theta(k)}(k)x(k)+\sum_{k=0}^{N}u(k)'R_{\theta(k)}(k)u(k)+x(N+1)'P_{\theta(N+1)}(N+1)x(N+1)\Big], \qquad (2)$$
where $N>0$, $x(N+1)$ is the terminal state, $P_j(N+1)$ ($j=1,\cdots,L$) reflects the penalty on the terminal state, and the weighting matrices satisfy $R_i(k)\ge 0$ and $Q_i(k)\ge 0$, $i=1,\cdots,L$. The controller is required to obey the causality constraint, i.e., $u(k)$ must be of the form
$$u(k)=f_k\big(\theta(k),x(k),\cdots,x(0),u(k-1),\cdots,u(0)\big)$$
for some function $f_k(\cdot)$. This means that $u(k)$ must be $\mathcal{G}_k$-measurable, where $\mathcal{G}_k=\sigma\{\theta(t);\,t=0,\cdots,k\}$. The linear quadratic regulation (LQR) problem for the Markov jump parameter system can thus be stated as follows:

Problem 1: Find a $\mathcal{G}_k$-measurable $u(k)$ such that (2) is minimized, subject to (1).

Remark 1
For brevity, we will omit the time argument in the system and penalty matrices in the following discussions, denoting $A_{\theta(k)}(k)$, $B_{\theta(k)}(k)$, $Q_{\theta(k)}(k)$, and $R_{\theta(k)}(k)$ as $A_{\theta(k)}$, $B_{\theta(k)}$, $Q_{\theta(k)}$, and $R_{\theta(k)}$, respectively. This does not affect the final results.

Remark 2
Some results have been obtained for finite-horizon optimal control of discrete-time MJLS. When the system is described by linear difference equations and the penalty matrix $R_{\theta(k)}$ in the performance index is positive definite, the formalism of dynamic programming may be applied to advantage [1], [9]. However, only sufficient conditions guaranteeing the existence of the optimal controller were given in [1], [9], and no necessary and sufficient condition is available for the general case $R_{\theta(k)}\ge 0$. In this paper we consider this general case. One objective of this paper is to extend work on linear systems with white Gaussian noise to MJLS; to do this it is expedient to derive an algorithm similar to the stochastic maximum principle developed in [24]. The use of the maximum principle for MJLS supplies a necessary and sufficient condition for the existence of the finite-horizon optimal controller in the general case.

In what follows, we derive the optimal control by employing the stochastic maximum principle: the necessary and sufficient condition for the existence of the optimal controller is proposed, and an explicit solution for the optimal controller is given. Due to the dependence of $\theta(k)$ on its past values, a new version of the maximum principle for the LQR problem needs to be established, which can be viewed as a generalization of the result for multiplicative-noise systems [24].

Lemma 1. Consider the linear system (1) and the performance index (2). If the LQR problem $\min J_N$ is solvable, then the optimal $\mathcal{G}_k$-measurable control $u(k)$ satisfies
$$0=\mathrm{E}\big[B'_{\theta(k)}\eta_k+R_{\theta(k)}u(k)\,\big|\,\mathcal{G}_k\big],\quad k=0,\cdots,N, \qquad (3)$$
where the costate $\eta_k$ satisfies
$$\eta_N=\mathrm{E}\big[P_{\theta(N+1)}x(N+1)\,\big|\,\mathcal{G}_N\big], \qquad (4)$$
$$\eta_{k-1}=\mathrm{E}\big[A'_{\theta(k)}\eta_k+Q_{\theta(k)}x(k)\,\big|\,\mathcal{G}_{k-1}\big],\quad k=0,\cdots,N. \qquad (5)$$

Proof. Denote $N$ as the final control horizon.
It is known that $u(k)$ is $\mathcal{G}_k$-measurable. Consider an increment $\mathrm{d}u(k)$ of the control variable and the corresponding variation of the performance index (2):
$$\mathrm{d}J_N=\mathrm{E}\Big[2x(N+1)'P_{\theta(N+1)}\mathrm{d}x(N+1)+2\sum_{k=0}^{N}x(k)'Q_{\theta(k)}\mathrm{d}x(k)+2\sum_{k=0}^{N}u(k)'R_{\theta(k)}\mathrm{d}u(k)\Big], \qquad (6)$$
where the state variation obeys $\mathrm{d}x(k+1)=F_x(k,0)\,\mathrm{d}x(0)+\sum_{i=0}^{k}F_x(k,i+1)B_{\theta(i)}\mathrm{d}u(i)$ with the state transition matrices
$$F_x(k,i)=A_{\theta(k)}\cdots A_{\theta(i)},\quad i=0,\cdots,k,\qquad F_x(k,k+1)=I. \qquad (7)$$
Since we only consider the increment of $J_N$ caused by the increments $\mathrm{d}u(i)$, the initial state $x(0)$ is fixed and $\mathrm{d}x(0)=0$. Therefore,
$$\mathrm{d}J_N=2\mathrm{E}\Big\{\big[x(N+1)'P_{\theta(N+1)}F_x(N,N+1)B_{\theta(N)}+u(N)'R_{\theta(N)}\big]\mathrm{d}u(N)+\sum_{i=0}^{N-1}\Big[x(N+1)'P_{\theta(N+1)}F_x(N,i+1)B_{\theta(i)}+u(i)'R_{\theta(i)}+\sum_{k=i+1}^{N}x(k)'Q_{\theta(k)}F_x(k-1,i+1)B_{\theta(i)}\Big]\mathrm{d}u(i)\Big\}. \qquad (8)$$
Define
$$\eta_i=\mathrm{E}\Big\{\sum_{k=i+1}^{N}F'_x(k-1,i+1)Q_{\theta(k)}x(k)+F'_x(N,i+1)P_{\theta(N+1)}x(N+1)\,\Big|\,\mathcal{G}_i\Big\}; \qquad (9)$$
then
$$\eta_{i-1}=\mathrm{E}\Big\{\sum_{k=i}^{N}F'_x(k-1,i)Q_{\theta(k)}x(k)+F'_x(N,i)P_{\theta(N+1)}x(N+1)\,\Big|\,\mathcal{G}_{i-1}\Big\}=\mathrm{E}\big\{Q_{\theta(i)}x(i)+A'_{\theta(i)}\eta_i\,\big|\,\mathcal{G}_{i-1}\big\}.$$
Based on (9), we deduce from (8) that
$$\mathrm{d}J_N=2\mathrm{E}\Big\{\sum_{i=0}^{N}\mathrm{E}\big[\eta'_iB_{\theta(i)}+u(i)'R_{\theta(i)}\,\big|\,\mathcal{G}_i\big]\mathrm{d}u(i)\Big\}. \qquad (10)$$
It follows from (10) that the necessary condition for a minimum is
$$\mathrm{E}\big[\eta'_iB_{\theta(i)}+u(i)'R_{\theta(i)}\,\big|\,\mathcal{G}_i\big]=0,\quad i=0,\cdots,N. \qquad (11)$$
This completes the proof.

In what follows, we derive the analytic solution of the LQR problem and give the necessary and sufficient condition for the existence of the optimal controller.

Theorem 1
Problem 1 has a unique solution if and only if the coupled difference equations
$$\Upsilon_i(k)=B'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(k+1)\Big)B_i+R_i, \qquad (12)$$
$$M_i(k)=B'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(k+1)\Big)A_i, \qquad (13)$$
$$P_i(k)=A'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(k+1)\Big)A_i+Q_i-M_i(k)'\Upsilon_i(k)^{-1}M_i(k), \qquad (14)$$
are well defined for $k=N,\cdots,0$, $i=1,\cdots,L$, that is, all $\Upsilon_i(k)$, $k=N,\cdots,0$, $i=1,\cdots,L$, are invertible. If this condition is satisfied, the optimal control is given analytically by
$$u(k)=-\Upsilon_i(k)^{-1}M_i(k)x(k),\quad \theta(k)=i, \qquad (15)$$
for $k=N,\cdots,0$. The corresponding optimal performance index is given by
$$J^*_N=\mathrm{E}\big[x(0)'P_{\theta(0)}(0)x(0)\big]. \qquad (16)$$
The solution of the FBMJDE (3)-(5), i.e., the relationship between $\eta_{k-1}$ and $x(k)$, is given by
$$\eta_{k-1}=\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(k)\Big)x(k),\quad \theta(k-1)=i. \qquad (17)$$

Proof. "Necessity": Assume that Problem 1 has a unique solution. By induction, we prove that $\Upsilon_i(k)$ in (12) is invertible for all $k=N,\cdots,0$, $i=1,\cdots,L$, and that $u(k)$ satisfies (15). Define
$$J(k)\triangleq\mathrm{E}\Big\{\sum_{i=k}^{N}\big(x(i)'Q_{\theta(i)}x(i)+u(i)'R_{\theta(i)}u(i)\big)+x(N+1)'P_{\theta(N+1)}(N+1)x(N+1)\,\Big|\,\mathcal{G}_k\Big\} \qquad (18)$$
for $k=N,\cdots,0$. For $k=N$, (18) becomes
$$J(N)=\mathrm{E}\big\{x(N)'Q_{\theta(N)}x(N)+u(N)'R_{\theta(N)}u(N)+x(N+1)'P_{\theta(N+1)}(N+1)x(N+1)\,\big|\,\mathcal{G}_N\big\}. \qquad (19)$$
Based on (1), $J(N)$ can be represented as a quadratic function of $x(N)$ and $u(N)$. The uniqueness of the optimal controller $u(N)$ indicates that the quadratic term in $u(N)$ is positive for any nonzero $u(N)$. Letting $x(N)=0$ and substituting (1) into (19), we have, for $\theta(N)=i$,
$$J(N)=\mathrm{E}\big\{u(N)'\big(R_{\theta(N)}+B'_{\theta(N)}P_{\theta(N+1)}(N+1)B_{\theta(N)}\big)u(N)\,\big|\,\mathcal{G}_N\big\}=u(N)'\Big[R_i+B'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(N+1)\Big)B_i\Big]u(N)=u(N)'\Upsilon_i(N)u(N)>0, \qquad (20)$$
where
$$\Upsilon_i(N)=R_i+B'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(N+1)\Big)B_i. \qquad (21)$$
It follows that $\Upsilon_i(N)>0$. Next, $u(N)$ is to be calculated. Applying (1), (3) and (4), we obtain
$$0=\mathrm{E}\big[B'_{\theta(N)}\eta_N+R_{\theta(N)}u(N)\,\big|\,\mathcal{G}_N\big]=\mathrm{E}\big[B'_{\theta(N)}\mathrm{E}\big[P_{\theta(N+1)}(N+1)\big(A_{\theta(N)}x(N)+B_{\theta(N)}u(N)\big)\,\big|\,\mathcal{G}_N\big]+R_{\theta(N)}u(N)\,\big|\,\mathcal{G}_N\big]=B'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(N+1)\Big)A_ix(N)+\Big[B'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(N+1)\Big)B_i+R_i\Big]u(N).$$
It follows from the above equation that
$$u(N)=-\Upsilon_i(N)^{-1}M_i(N)x(N),\quad \theta(N)=i, \qquad (22)$$
where $\Upsilon_i(N)$ is as in (21) and
$$M_i(N)=B'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(N+1)\Big)A_i. \qquad (23)$$
We next show that $\eta_{N-1}$ has the form (17). In view of (1), (5), and (22), with $\theta(N)=i$ and $\theta(N-1)=s$, one obtains
$$\eta_{N-1}=\mathrm{E}\big\{A'_{\theta(N)}\eta_N+Q_{\theta(N)}x(N)\,\big|\,\mathcal{G}_{N-1}\big\}=\mathrm{E}\Big\{\Big[A'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(N+1)\Big)A_i+Q_i\Big]x(N)+A'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(N+1)\Big)B_iu(N)\,\Big|\,\mathcal{G}_{N-1}\Big\}=\mathrm{E}\Big\{\Big[A'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(N+1)\Big)A_i+Q_i-M_i(N)'\Upsilon_i(N)^{-1}M_i(N)\Big]x(N)\,\Big|\,\mathcal{G}_{N-1}\Big\}=\Big(\sum_{i=1}^{L}\lambda_{s,i}P_i(N)\Big)x(N),\quad s=1,\cdots,L,$$
where
$$P_i(N)=A'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(N+1)\Big)A_i+Q_i-M_i(N)'\Upsilon_i(N)^{-1}M_i(N).$$
To proceed with the induction, take any $n$ with $1\le n\le N$ and assume that $\Upsilon_i(k)$ ($i=1,\cdots,L$) is invertible and that the optimal controller $u(k)$ and the optimal costate $\eta_{k-1}$ are given by (15) and (17) for all $k\ge n+1$. We now show that these conditions also hold for $k=n$. Following the derivation of $\Upsilon_i(N)$ ($i=1,\cdots,L$) and letting $x(n)=0$, we check the quadratic term of $u(n)$ in $J(n)$.
In view of (1), (3), and (5), for $k\ge n+1$ we have
$$\mathrm{E}\big\{x(k)'\eta_{k-1}-x(k+1)'\eta_k\,\big|\,\mathcal{G}_{n+1}\big\}=\mathrm{E}\big\{x(k)'\mathrm{E}\big[Q_{\theta(k)}x(k)+A'_{\theta(k)}\eta_k\,\big|\,\mathcal{G}_{k-1}\big]-\big[A_{\theta(k)}x(k)+B_{\theta(k)}u(k)\big]'\eta_k\,\big|\,\mathcal{G}_{n+1}\big\}=\mathrm{E}\big\{x(k)'Q_{\theta(k)}x(k)-u(k)'B'_{\theta(k)}\eta_k\,\big|\,\mathcal{G}_{n+1}\big\}=\mathrm{E}\big\{x(k)'Q_{\theta(k)}x(k)+u(k)'R_{\theta(k)}u(k)\,\big|\,\mathcal{G}_{n+1}\big\},$$
where the last equality uses (3). Summing from $k=n+1$ to $k=N$ on both sides of the above equation, we obtain
$$\mathrm{E}\big\{x(n+1)'\eta_n-x(N+1)'\eta_N\,\big|\,\mathcal{G}_{n+1}\big\}=\sum_{k=n+1}^{N}\mathrm{E}\big\{x(k)'Q_{\theta(k)}x(k)+u(k)'R_{\theta(k)}u(k)\,\big|\,\mathcal{G}_{n+1}\big\}. \qquad (24)$$
Using (24) and (4), and recalling that $x(n)=0$,
$$J(n)=\mathrm{E}\big\{x(n)'Q_{\theta(n)}x(n)+u(n)'R_{\theta(n)}u(n)+\mathrm{E}\big[x(n+1)'\eta_n\,\big|\,\mathcal{G}_{n+1}\big]\,\big|\,\mathcal{G}_n\big\}=\mathrm{E}\big\{u(n)'R_{\theta(n)}u(n)+u(n)'B'_{\theta(n)}\eta_n\,\big|\,\mathcal{G}_n\big\}. \qquad (25)$$
Note that, by the induction hypothesis (17) and (1), with $\theta(n)=i$,
$$\eta_n=\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(n+1)\Big)x(n+1)=\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(n+1)\Big)\big[A_ix(n)+B_iu(n)\big]. \qquad (26)$$
Substituting (26) into (25) with $x(n)=0$, we deduce that
$$J(n)=u(n)'\Big[B'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(n+1)\Big)B_i+R_i\Big]u(n)=u(n)'\Upsilon_i(n)u(n). \qquad (27)$$
The uniqueness of the optimal controller implies that $J(n)$ must be positive for any $u(n)\ne 0$, so $\Upsilon_i(n)>0$, $i=1,\cdots,L$. To derive the optimal controller $u(n)$, plugging (26) into (3) yields
$$0=\mathrm{E}\big\{B'_i\eta_n+R_iu(n)\,\big|\,\mathcal{G}_n\big\}=M_i(n)x(n)+\Upsilon_i(n)u(n),$$
so that
$$u(n)=-\Upsilon_i(n)^{-1}M_i(n)x(n), \qquad (28)$$
where $\Upsilon_i(n)=B'_i\big(\sum_{j=1}^{L}\lambda_{i,j}P_j(n+1)\big)B_i+R_i$ and $M_i(n)=B'_i\big(\sum_{j=1}^{L}\lambda_{i,j}P_j(n+1)\big)A_i$. We now show that $\eta_{n-1}$ is of the form (17). In terms of (5), (26) and (28), with $\theta(n-1)=s$,
$$\eta_{n-1}=\mathrm{E}\big\{Q_{\theta(n)}x(n)+A'_{\theta(n)}\eta_n\,\big|\,\mathcal{G}_{n-1}\big\}=\sum_{i=1}^{L}\lambda_{s,i}\Big[Q_i+A'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(n+1)\Big)A_i-M_i(n)'\Upsilon_i(n)^{-1}M_i(n)\Big]x(n)=\Big(\sum_{i=1}^{L}\lambda_{s,i}P_i(n)\Big)x(n),$$
where
$$P_i(n)=Q_i+A'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P_j(n+1)\Big)A_i-M_i(n)'\Upsilon_i(n)^{-1}M_i(n).$$
This completes the induction.

"Sufficiency": If $\Upsilon_i(k)>0$, $i=1,\cdots,L$, $k=N,\cdots,0$, we prove that Problem 1 has a unique solution. Define
$$V_N(k,x(k))\triangleq\mathrm{E}\big[x(k)'P_{\theta(k)}(k)x(k)\big]. \qquad (29)$$
Applying (29) and (12)-(14), we deduce that, with $\theta(k)=i$,
$$V_N(k,x(k))-V_N(k+1,x(k+1))=\mathrm{E}\big\{x(k)'Q_ix(k)+u(k)'R_iu(k)-\big[u(k)+\Upsilon_i(k)^{-1}M_i(k)x(k)\big]'\Upsilon_i(k)\big[u(k)+\Upsilon_i(k)^{-1}M_i(k)x(k)\big]\big\}. \qquad (30)$$
Summing from $k=0$ to $k=N$ on both sides of (30), the performance index (2) can be rewritten as
$$J_N=\mathrm{E}\Big\{x(0)'P_{\theta(0)}(0)x(0)+\sum_{k=0}^{N}\big[u(k)+\Upsilon_i(k)^{-1}M_i(k)x(k)\big]'\Upsilon_i(k)\big[u(k)+\Upsilon_i(k)^{-1}M_i(k)x(k)\big]\Big\}.$$
Since $\Upsilon_i(k)>0$, $i=1,\cdots,L$, Problem 1 has a unique solution, and the optimal controller is given by $u(k)=-\Upsilon_i(k)^{-1}M_i(k)x(k)$. The corresponding optimal performance index is $J^*_N=\mathrm{E}\big[x(0)'P_{\theta(0)}(0)x(0)\big]$. This completes the proof.
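The backward recursion (12)-(14) and the feedback (15) can be sketched numerically as follows. This is a minimal illustration only: the two-mode system data are assumptions made up for the example, not taken from the paper.

```python
import numpy as np

# Illustrative two-mode MJLS (all matrices below are assumptions for the demo).
A = [np.array([[1.1, 0.2], [0.0, 0.9]]), np.array([[0.8, 0.0], [0.1, 1.0]])]
B = [np.array([[1.0], [0.0]]), np.array([[0.0], [1.0]])]
Q = [np.eye(2), 2 * np.eye(2)]
R = [np.array([[1.0]]), np.array([[0.5]])]
Lam = np.array([[0.9, 0.1], [0.3, 0.7]])  # lambda_{i,j} = P(theta(k+1)=j | theta(k)=i)
L = 2

def cdre(N, P_terminal):
    """Run (12)-(14) backward; return P[k][i] and the gains K[k][i] = Ups_i(k)^{-1} M_i(k).

    np.linalg.solve raises LinAlgError if some Upsilon_i(k) is singular, i.e. the
    solvability condition of Theorem 1 fails."""
    P = [None] * (N + 2)
    P[N + 1] = [Pt.copy() for Pt in P_terminal]
    K = [None] * (N + 1)
    for k in range(N, -1, -1):
        P[k], K[k] = [], []
        for i in range(L):
            S = sum(Lam[i, j] * P[k + 1][j] for j in range(L))  # E[P_{theta(k+1)} | theta(k)=i]
            Ups = B[i].T @ S @ B[i] + R[i]                      # (12)
            M = B[i].T @ S @ A[i]                               # (13)
            Ki = np.linalg.solve(Ups, M)                        # Upsilon_i(k)^{-1} M_i(k)
            P[k].append(A[i].T @ S @ A[i] + Q[i] - M.T @ Ki)    # (14)
            K[k].append(Ki)
    return P, K

P, K = cdre(20, [np.zeros((2, 2))] * L)
# Optimal feedback (15): u(k) = -K[k][theta(k)] @ x(k);
# optimal cost (16): E[x(0)' P[0][theta(0)] x(0)].
```

Note the design choice: each `P[k][i]` couples all modes through the conditional mean `S`, which is exactly how (12)-(14) differ from the standard single-mode Riccati recursion.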
Remark 3
A necessary and sufficient condition for the solvability of the discrete-time Markov jump LQR problem is given in Theorem 1, requiring only that the input penalty matrix $R$ be positive semi-definite. Existing results usually consider the case where $R$ is positive definite [1], [9], [10], [11] or where a set of matrix expressions is positive definite [3], [4], and give only sufficient conditions for the existence of the LQR controller. In this paper, an analytical solution to the forward-backward Markov jump parameter difference equation associated with the optimal control is presented, which forms the basis on which we supply the necessary and sufficient condition for the existence of the optimal LQR controller for MJLS.

For the infinite-horizon quadratic optimal control problem analyzed in this section, we consider a time-invariant version of the model (1). We are interested in minimizing the infinite-horizon cost function
$$J=\mathrm{E}\Big\{\sum_{k=0}^{\infty}\big[x(k)'Q_{\theta(k)}x(k)+u(k)'R_{\theta(k)}u(k)\big]\Big\}. \qquad (31)$$

Definition 1
We say that the Markov jump linear system (1) with $u(k)=0$ is mean square stable (MSS) if for any initial condition $x(0)$ and $\theta(0)$ there holds $\lim_{k\to\infty}\mathrm{E}\big(x(k)'x(k)\big)=0$.

Definition 2
We say that system (1) is mean square stabilizable if there is a $\mathcal{G}_k$-measurable controller $u(k)=F_{\theta(k)}x(k)$ satisfying $\lim_{k\to\infty}\mathrm{E}\big[u(k)'u(k)\big]=0$ such that system (1) is asymptotically mean square stable.

Definition 3
The MJLS
$$x(k+1)=A_{\theta(k)}x(k),\qquad y(k)=C_{\theta(k)}x(k) \qquad (32)$$
is said to be exactly observable if, for any $N$,
$$y(k)=0\ \text{a.s.}\ \ \forall\,0\le k\le N\ \Longrightarrow\ x(0)=0.$$
Denote $A=(A_1,\cdots,A_L)$, $B=(B_1,\cdots,B_L)$, and $C=(C_1,\cdots,C_L)$. For brevity, we say that the pair $(A,B)$ is mean square stabilizable if system (1) is mean square stabilizable, and that the pair $(C,A)$ is exactly observable if system (32) is exactly observable.
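Definition 1 can be checked empirically by averaging sample paths of the autonomous system. In this minimal sketch the two modes are contractive matrices chosen as assumptions so that the second moment is guaranteed to decay:

```python
import numpy as np

rng = np.random.default_rng(1)
Lam = np.array([[0.9, 0.1], [0.3, 0.7]])  # transition probabilities lambda_{i,j}

def mss_estimate(A, x0, theta0, horizon, n_paths=500):
    """Monte Carlo estimate of E[x(k)'x(k)] at k = horizon for x(k+1) = A_{theta(k)} x(k).

    Definition 1 (mean square stability) asks this quantity to vanish as the
    horizon grows, for every initial condition."""
    total = 0.0
    for _ in range(n_paths):
        x, i = x0.copy(), theta0
        for _ in range(horizon):
            x = A[i] @ x
            i = rng.choice(len(Lam), p=Lam[i])  # sample the next mode of the chain
        total += float(x @ x)
    return total / n_paths

# Both modes strictly contractive (an assumption), hence the system is MSS.
A_stable = [0.5 * np.eye(2), 0.6 * np.eye(2)]
m10 = mss_estimate(A_stable, np.ones(2), 0, 10)
m30 = mss_estimate(A_stable, np.ones(2), 0, 30)
```

Such a simulation is of course only evidence, not a proof; the exact criterion is the spectral condition underlying Definitions 1-2.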
Problem 2: Find a $\mathcal{G}_k$-measurable controller $u(k)=F_{\theta(k)}x(k)$, $k\ge 0$, such that the closed-loop system is asymptotically stable in the mean square sense and the corresponding cost function (31) is minimized.
Assumption 1. $R_i$ ($i=1,\cdots,L$) is positive definite; $Q_i$ ($i=1,\cdots,L$) is positive semi-definite, that is, $Q_i=C'_iC_i$ for some matrix $C_i$ ($i=1,\cdots,L$).

Assumption 2. $(C,A)$ is exactly observable.

For clarity, we rewrite $\Upsilon_i(k)$, $P_i(k)$, and $M_i(k)$ ($i=1,\cdots,L$) in (12)-(14) as $\Upsilon^N_i(k)$, $P^N_i(k)$, and $M^N_i(k)$ ($i=1,\cdots,L$). Without loss of generality, we set the terminal weighting matrices $P^N_j(N+1)$ ($j=1,\cdots,L$) in the cost function to zero. Define the following coupled algebraic Riccati equations:
$$P_i=A'_i\Big(\sum_{j=1}^{L}\lambda_{ij}P_j\Big)A_i+Q_i-A'_i\Big(\sum_{j=1}^{L}\lambda_{ij}P_j\Big)B_i\Big[B'_i\Big(\sum_{j=1}^{L}\lambda_{ij}P_j\Big)B_i+R_i\Big]^{-1}B'_i\Big(\sum_{j=1}^{L}\lambda_{ij}P_j\Big)A_i,\quad i=1,2,\cdots,L, \qquad (33)$$
$$\Upsilon_i=B'_i\Big(\sum_{j=1}^{L}\lambda_{ij}P_j\Big)B_i+R_i,\quad i=1,2,\cdots,L, \qquad (34)$$
$$M_i=B'_i\Big(\sum_{j=1}^{L}\lambda_{ij}P_j\Big)A_i,\quad i=1,2,\cdots,L. \qquad (35)$$

Lemma 2
For any $N\ge 0$, $P^N_i(k)\ge 0$.

Proof: From (34) and (35), with $K^N_i(k)=-[\Upsilon^N_i(k)]^{-1}M^N_i(k)$, we have
$$-[M^N_i(k)]'[\Upsilon^N_i(k)]^{-1}M^N_i(k)=[M^N_i(k)]'K^N_i(k)+[K^N_i(k)]'M^N_i(k)+[K^N_i(k)]'\Upsilon^N_i(k)K^N_i(k).$$
In view of (14), it follows that
$$P^N_i(k)=A'_i\Big(\sum_{j=1}^{L}\lambda_{i,j}P^N_j(k+1)\Big)A_i+Q_i+[M^N_i(k)]'K^N_i(k)+[K^N_i(k)]'M^N_i(k)+[K^N_i(k)]'\Upsilon^N_i(k)K^N_i(k)=Q_i+[K^N_i(k)]'R_iK^N_i(k)+\big[A_i+B_iK^N_i(k)\big]'\Big(\sum_{j=1}^{L}\lambda_{i,j}P^N_j(k+1)\Big)\big[A_i+B_iK^N_i(k)\big].$$
In view of the terminal condition $P^N_j(N+1)=0$ and $Q_i\ge 0$, we obtain by backward induction that $P^N_i(k)\ge 0$ for $0\le k\le N$.

Lemma 3. When $R_i>0$ ($i=1,\cdots,L$), Problem 1 has a unique solution.

Proof: From Lemma 2, the expression (34), and $R_i>0$, it is easy to obtain that $\Upsilon^N_i(k)\ge R_i>0$. By Theorem 1, Problem 1 then has a unique solution.
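The completed-square identity used in the proof of Lemma 2 can be verified numerically. The matrices below are randomly generated assumptions standing in for $A_i$, $B_i$, $Q_i$, $R_i$ and the coupled term $S=\sum_j\lambda_{ij}P_j(k+1)$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Check:  A'SA + Q - M' Ups^{-1} M  ==  Q + K'RK + (A+BK)' S (A+BK),
# where K = -Ups^{-1} M, Ups = B'SB + R, M = B'SA (Lemma 2's rewriting of (14)).
n, m = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
G = rng.standard_normal((n, n))
S = G @ G.T            # any positive semi-definite "coupled" matrix
Q = np.eye(n)
R = np.eye(m)

Ups = B.T @ S @ B + R
M = B.T @ S @ A
K = -np.linalg.solve(Ups, M)

lhs = A.T @ S @ A + Q - M.T @ np.linalg.solve(Ups, M)
rhs = Q + K.T @ R @ K + (A + B @ K).T @ S @ (A + B @ K)
```

Since $Q_i\ge 0$, $R_i>0$ and the middle term of the right-hand side is a congruence with $S\ge 0$, nonnegativity of $P^N_i(k)$ propagates backward from the zero terminal condition, which is the content of Lemma 2.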
Theorem 2
Under Assumptions 1 and 2, if the system (1) is mean square stabilizable, then for any $k\ge 0$, $P^N_i(k)$ is convergent as $N\to\infty$, i.e., $\lim_{N\to\infty}P^N_i(k)=P_i$, where $P_i$ satisfies (33)-(35); furthermore, $P_i>0$.

Proof. First, we show that $P^N_{\theta(0)}(0)$ is nondecreasing in $N$. Since $J_N\le J_{N+1}$, we have $J^*_N\le J^*_{N+1}$ for any initial value $x(0)$. From (16), we obtain
$$\mathrm{E}\big[x(0)'P^N_{\theta(0)}(0)x(0)\big]\le\mathrm{E}\big[x(0)'P^{N+1}_{\theta(0)}(0)x(0)\big].$$
In view of the arbitrariness of $x(0)$, this implies $P^N_{\theta(0)}(0)\le P^{N+1}_{\theta(0)}(0)$, i.e., $P^N_{\theta(0)}(0)$ is nondecreasing in $N$.

Next, we show the boundedness of $P^N_{\theta(0)}(0)$. When system (1) is stabilizable in the mean square sense, there exists $u(k)=F_{\theta(k)}x(k)$ satisfying $\lim_{k\to\infty}\mathrm{E}\big(x(k)'x(k)\big)=0$. Hence
$$J^*_N\le J=\mathrm{E}\Big[\sum_{k=0}^{\infty}x(k)'\big(Q_{\theta(k)}+F'_{\theta(k)}R_{\theta(k)}F_{\theta(k)}\big)x(k)\Big]\le\lambda_0\,\mathrm{E}\Big[\sum_{k=0}^{\infty}x(k)'x(k)\Big]\le\lambda_0\,c\,\mathrm{E}\big[x(0)'x(0)\big],$$
where $\lambda_0$ denotes the maximum eigenvalue of $Q_{\theta(k)}+F'_{\theta(k)}R_{\theta(k)}F_{\theta(k)}$ over the modes and $c$ is a positive constant. This implies $\mathrm{E}\big[x(0)'P^N_{\theta(0)}(0)x(0)\big]\le\lambda_0\,c\,\mathrm{E}\big[x(0)'x(0)\big]$, i.e., $P^N_{\theta(0)}(0)\le\lambda_0\,c\,I$, so $P^N_{\theta(0)}(0)$ is bounded. Together with the monotonicity, $P^N_{\theta(0)}(0)$ is convergent. Note that the quantities in (12)-(14) are time-invariant in $N$ owing to the choice $P_j(N+1)=0$ ($j=1,\cdots,L$), so
$$\lim_{N\to\infty}P^N_i(k)=\lim_{N\to\infty}P^{N-k}_i(0)=P_i,\quad i=1,\cdots,L,$$
and at the same time $\lim_{N\to\infty}\Upsilon^N_i(k)=\Upsilon_i$, $\lim_{N\to\infty}M^N_i(k)=M_i$.

We now show that $P_i>0$. Since $J^*_N=\mathrm{E}\big[x(0)'P^N_{\theta(0)}(0)x(0)\big]\ge 0$ for any $x(0)$, we have $P^N_{\theta(0)}(0)\ge 0$. We next show that there exists a positive integer $N_0$ such that $P^{N_0}_{\theta(0)}(0)>0$. If not, for every $N$ the set
$$Z_N=\big\{x\in\mathbb{R}^n:\ x\ne 0,\ x'P^N_{\theta(0)}(0)x=0\big\}$$
is nonempty. The monotonicity of $P^N_{\theta(0)}(0)$ implies that $x'P^{N+1}_{\theta(0)}(0)x=0$ yields $x'P^N_{\theta(0)}(0)x=0$, i.e., $Z_{N+1}\subseteq Z_N$. As each $Z_N$ is a nonempty subset of the finite-dimensional space $\mathbb{R}^n$,
$$1\le\cdots\le\dim(Z_2)\le\dim(Z_1)\le\dim(Z_0)\le n.$$
Therefore there exists a positive integer $N_0$ such that for any $N>N_0$ we have $\dim(Z_N)=\dim(Z_{N_0})$, i.e., $Z_N=Z_{N_0}$, and furthermore $\bigcap_{N\ge 0}Z_N=Z_{N_0}\ne\emptyset$. Thus there exists a nonzero $x_0\in Z_{N_0}$ such that $x_0'P^N_{\theta(0)}(0)x_0=0$ for all $N$. Now let $x(0)=x_0$; then
$$J^*_N=\mathrm{E}\Big[\sum_{k=0}^{N}x^*(k)'Q_{\theta(k)}x^*(k)+u^*(k)'R_{\theta(k)}u^*(k)\Big]=\mathrm{E}\big[x_0'P^N_{\theta(0)}(0)x_0\big]=0.$$
Noting that $Q_{\theta(k)}\ge 0$ and $R_{\theta(k)}>0$, we have $u^*(k)=0$ and $C_{\theta(k)}x^*(k)=0$; that is,
$$x^*(k+1)=A_{\theta(k)}x^*(k),\qquad C_{\theta(k)}x^*(k)=0.$$
By the exact observability of $(C,A)$, this implies $x_0=0$, which contradicts $x_0\ne 0$. Hence there must exist $N_0$ such that $P^{N_0}_{\theta(0)}(0)>0$, and therefore
$$P_i=\lim_{N\to\infty}P^N_i(k)\ge P^{N_0}_i(k)>0.$$

Theorem 3
Under Assumptions 1 and 2, the system (1) is mean square stabilizable if and only if there exists a unique solution to (33)-(35) such that $P_i > 0$, $i = 1, 2, \cdots, L$. In this case, the controller
$$u(k) = -\Upsilon_{\theta(k)}^{-1}M_{\theta(k)}x(k), \quad k \ge 0, \qquad (36)$$
stabilizes (1) in the mean square sense and minimizes the cost function (31). The optimal cost is given by
$$J^* = \mathrm{E}\{x(0)'P_{\theta(0)}x(0)\}. \qquad (37)$$

Proof. Sufficiency: Assume $P_i$ $(i = 1, 2, \cdots, L)$ is a solution to (33) such that $P_i > 0$. Firstly, we show that (1) is mean square stabilizable with the controller (36). For this purpose, define the Lyapunov function candidate
$$V(k, x(k)) = \mathrm{E}[x(k)'P_{\theta(k)}x(k)]. \qquad (38)$$
We first prove the convergence of $V(k, x(k))$. Employing (1) and (33)-(35) yields
$$
\begin{aligned}
&V(k, x(k)) - V(k+1, x(k+1)) \\
&= \mathrm{E}\{x(k)'P_{\theta(k)}x(k) - x(k+1)'P_{\theta(k+1)}x(k+1)\} \\
&= \mathrm{E}\{x(k)'P_{\theta(k)}x(k) - [A_{\theta(k)}x(k) + B_{\theta(k)}u(k)]'P_{\theta(k+1)}[A_{\theta(k)}x(k) + B_{\theta(k)}u(k)]\} \\
&= \mathrm{E}\{x(k)'Q_{\theta(k)}x(k) + u(k)'R_{\theta(k)}u(k) \\
&\qquad - [u(k) + \Upsilon_{\theta(k)}^{-1}M_{\theta(k)}x(k)]'\Upsilon_{\theta(k)}[u(k) + \Upsilon_{\theta(k)}^{-1}M_{\theta(k)}x(k)]\} \qquad (39) \\
&= \mathrm{E}\{x(k)'Q_{\theta(k)}x(k) + u(k)'R_{\theta(k)}u(k)\} \ge 0, \quad k \ge 0, \qquad (40)
\end{aligned}
$$
where $u(k) = -\Upsilon_{\theta(k)}^{-1}M_{\theta(k)}x(k)$ for $k \ge 0$, so $V(k, x(k))$ decreases with respect to $k$. From Theorem 2, we know that
$$V(k, x(k)) = \mathrm{E}\{x(k)'P_{\theta(k)}x(k)\} \ge 0, \qquad (41)$$
which means that $V(k, x(k))$ is bounded, and thus convergent.

Now let $m$ be any nonnegative integer. Summing (40) from $k = m$ to $k = m+N$ and letting $m \to \infty$ yields
$$\lim_{m\to\infty} \sum_{k=m}^{m+N} \mathrm{E}[x(k)'Q_{\theta(k)}x(k) + u(k)'R_{\theta(k)}u(k)] = \lim_{m\to\infty}\big[V(m, x(m)) - V(m+N+1, x(m+N+1))\big] = 0, \qquad (42)$$
in which the last equality holds owing to the convergence of $V(k, x(k))$. Note that
$$\sum_{k=0}^{N} \mathrm{E}[x(k)'Q_{\theta(k)}x(k) + u(k)'R_{\theta(k)}u(k)] \ge \mathrm{E}\{x(0)'P^N_{\theta(0)}(0)x(0)\}. \qquad (43)$$
Via a time shift of length $m$, this leads to
$$\sum_{k=m}^{m+N} \mathrm{E}[x(k)'Q_{\theta(k)}x(k) + u(k)'R_{\theta(k)}u(k)] \ge \mathrm{E}\{x(m)'P^{N+m}_{\theta(m)}(m)x(m)\} = \mathrm{E}\{x(m)'P^N_{\theta(m)}(0)x(m)\} \ge 0,$$
and hence, by (42),
$$\lim_{m\to\infty} \mathrm{E}\{x(m)'P^N_{\theta(m)}(0)x(m)\} = 0, \quad \forall N \ge 0. \qquad (44)$$
In the proof of Theorem 2, we have shown that there exists $N_0$ such that $P^{N_0}_{\theta(0)}(0)$ is positive definite. Thus (44) implies that $\lim_{m\to\infty} \mathrm{E}[x(m)'x(m)] = 0$. Therefore, the controller (36) stabilizes (1) in the mean square sense.

Secondly, we show that the cost function (31) is minimized by (36). Summing (39) from $k = 0$ to $k = N$ yields
$$\mathrm{E}\Big\{\sum_{k=0}^{N}[x(k)'Q_{\theta(k)}x(k) + u(k)'R_{\theta(k)}u(k)]\Big\} = V(0, x(0)) - V(N+1, x(N+1)) + \sum_{k=0}^{N}\mathrm{E}\big\{[u(k) + \Upsilon_{\theta(k)}^{-1}M_{\theta(k)}x(k)]'\Upsilon_{\theta(k)}[u(k) + \Upsilon_{\theta(k)}^{-1}M_{\theta(k)}x(k)]\big\}, \qquad (45)$$
in which $V(0, x(0))$ and $V(N+1, x(N+1))$ are defined in (38). We next show that $\lim_{k\to\infty} V(k, x(k)) = 0$. Since we only consider controllers that stabilize system (1), we have $\lim_{k\to\infty}\mathrm{E}\{x(k)'P_{\theta(k)}x(k)\} = \lim_{k\to\infty}V(k, x(k)) = 0$. Letting $N \to \infty$ on both sides of (45), the cost function (31) is rewritten as
$$J = \mathrm{E}\{x(0)'P_{\theta(0)}x(0)\} + \sum_{k=0}^{\infty}\mathrm{E}\big\{[u(k) + \Upsilon_{\theta(k)}^{-1}M_{\theta(k)}x(k)]'\Upsilon_{\theta(k)}[u(k) + \Upsilon_{\theta(k)}^{-1}M_{\theta(k)}x(k)]\big\}. \qquad (46)$$
In view of the positive definiteness of $\Upsilon_i$, $i = 1, \cdots, L$, the controller minimizing (46) must be (36), and the corresponding optimal cost is (37). This completes the proof of sufficiency.

Necessity: Suppose the system (1) is mean square stabilizable. In Theorem 2, the existence of a solution to (33)-(35) satisfying $P_i > 0$ $(i = 1, 2, \cdots, L)$ has been verified; it remains to show uniqueness. Let $S_i$ $(i = 1, 2, \cdots, L)$ be another solution to (33)-(35) satisfying $S_i > 0$, i.e.,
$$S_i = A_i'\Big(\sum_{j=1}^{L}\lambda_{ij}S_j\Big)A_i + Q_i - A_i'\Big(\sum_{j=1}^{L}\lambda_{ij}S_j\Big)B_i\Big[B_i'\Big(\sum_{j=1}^{L}\lambda_{ij}S_j\Big)B_i + R_i\Big]^{-1}B_i'\Big(\sum_{j=1}^{L}\lambda_{ij}S_j\Big)A_i, \qquad (47)$$
$$\Delta_i = B_i'\Big(\sum_{j=1}^{L}\lambda_{ij}S_j\Big)B_i + R_i, \qquad (48)$$
$$\Pi_i = B_i'\Big(\sum_{j=1}^{L}\lambda_{ij}S_j\Big)A_i, \qquad (49)$$
for $i = 1, 2, \cdots, L$. In view of the sufficiency proof above, the optimal value of the cost function (31) is
$$J^* = \mathrm{E}\{x(0)'P_{\theta(0)}x(0)\} = \mathrm{E}\{x(0)'S_{\theta(0)}x(0)\}.$$
As $x(0)$ is arbitrary, the above equation implies that $P_i = S_i$, $i = 1, 2, \cdots, L$.
It then follows from (33)-(35) and (47)-(49) that $\Upsilon_i = \Delta_i$ and $M_i = \Pi_i$ $(i = 1, 2, \cdots, L)$. Thus the uniqueness has been proven, and the proof of necessity is complete.

If $Q_i = I_{n\times n}$ and $R_i = I_{m\times m}$, $i = 1, \cdots, L$, in (31), where $I_{n\times n}$ and $I_{m\times m}$ are the identity matrices of dimensions $n$ and $m$, respectively, the conditions of Assumption 1 and Assumption 2 hold automatically, and the performance index becomes
$$J = \mathrm{E}\Big\{\sum_{k=0}^{\infty}[x(k)'x(k) + u(k)'u(k)]\Big\}. \qquad (50)$$
The stabilization solution to the infinite-horizon problem (50) can then be stated as follows.

Corollary 1
The system (1) is mean square stabilizable if and only if there exists a unique solution to (33)-(35) such that $P_i > 0$, $i = 1, 2, \cdots, L$. In this case, the controller
$$u(k) = -\Upsilon_{\theta(k)}^{-1}M_{\theta(k)}x(k), \quad k \ge 0, \qquad (51)$$
stabilizes (1) in the mean square sense and minimizes the cost function (50). The optimal cost is given by
$$J^* = \mathrm{E}\{x(0)'P_{\theta(0)}x(0)\}. \qquad (52)$$

Remark 4
In [12], a new detectability concept (weak detectability) for discrete-time MJLS was presented, which supplies a sufficient condition for mean square stability of the infinite-horizon linear quadratic controlled system. The result can be summarized as follows: if the system is weakly detectable and there exists a positive semi-definite solution to the CARE, then the system with the optimal feedback gain is mean square stable. Necessary and sufficient conditions were further supplied in [13], which can be summarized as follows: under the assumption that the system is weakly detectable, the system is mean square stabilizable if and only if there exists a positive semi-definite solution to the CARE. Although the sufficient conditions of [12] and the necessary and sufficient conditions of [13] address the infinite-horizon stabilization problem, the computational test for weak detectability is not intuitive and is not easy to check. In Corollary 1, we give a necessary and sufficient condition for the stabilization of the system without additional prerequisites: one only needs to determine the existence of a positive definite solution to the CARE, which is easy to check.
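The positivity test above can be sketched numerically. The following Python sketch (an illustration under assumed data, not code from the paper) iterates the coupled Riccati recursion in the form of (47)-(49) for a two-mode system and then applies the test of Corollary 1: check whether each limit $P_i$ is positive definite. The matrices `A`, `B`, `Q`, `R` and the transition probabilities `Lam` below are illustrative assumptions.

```python
import numpy as np

def care_iterate(A, B, Q, R, Lam, iters=500):
    """Fixed-point iteration for the coupled algebraic Riccati equations.

    A, B, Q, R are lists indexed by mode i; Lam[i, j] are the transition
    probabilities of the Markov chain.  Returns the limits P_i together
    with Upsilon_i and M_i, following the structure of (47)-(49).
    """
    L = len(A)
    n = A[0].shape[0]
    P = [np.zeros((n, n)) for _ in range(L)]
    for _ in range(iters):
        # E_i(P) = sum_j lambda_ij P_j  (mode-coupled expectation)
        E = [sum(Lam[i, j] * P[j] for j in range(L)) for i in range(L)]
        P = [A[i].T @ E[i] @ A[i] + Q[i]
             - A[i].T @ E[i] @ B[i]
               @ np.linalg.inv(B[i].T @ E[i] @ B[i] + R[i])
               @ B[i].T @ E[i] @ A[i]
             for i in range(L)]
    E = [sum(Lam[i, j] * P[j] for j in range(L)) for i in range(L)]
    Ups = [B[i].T @ E[i] @ B[i] + R[i] for i in range(L)]
    M = [B[i].T @ E[i] @ A[i] for i in range(L)]
    return P, Ups, M

# Illustrative two-mode data (assumed, not taken from the paper's example).
A = [np.array([[0.5, 0.1], [0.0, 0.8]]), np.array([[0.9, 0.0], [0.2, 0.6]])]
B = [np.array([[1.0], [0.0]]), np.array([[0.0], [1.0]])]
Q = [np.eye(2), np.eye(2)]
R = [np.eye(1), np.eye(1)]
Lam = np.array([[0.7, 0.3], [0.4, 0.6]])

P, Ups, M = care_iterate(A, B, Q, R, Lam)
# Corollary 1's test: each P_i must be positive definite.
stabilizable = all(np.linalg.eigvalsh(Pi).min() > 0 for Pi in P)
print(stabilizable)  # prints True for this data
```

The corresponding stabilizing gains would be $F_i = -\Upsilon_i^{-1} M_i$, i.e. `-np.linalg.inv(Ups[i]) @ M[i]` per mode.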
In this section, we present a simple example to illustrate the previous theoretical results. Consider a second-order dynamic system (1) with the performance index (2), where $\theta(k)$ is a Markov chain taking values in the finite set $\{1, 2\}$ with given transition probabilities and initial distribution, the system matrices are $A_1, A_2 \in \mathbb{R}^{2\times 2}$ and $B_1, B_2 \in \mathbb{R}^{2\times 1}$, the weighting matrices are $Q_1, Q_2 \ge 0$ with $R_1 = 1$, $R_2 = 1$, and the initial state is $x(0) = [5\ 5]'$. In this example, the time horizon is set to $N = 20$, and the terminal penalty matrices are $P_1 = I$, $P_2 = I$. Without loss of generality, we run 50 Monte Carlo simulations from $k = 0$ to $20$. The simulation results are as follows. Fig. 1 shows a sample path of the Markov chain $\theta(k) \in \{1, 2\}$. The Riccati coefficients of the matrices $P_i(k)$ $(i = 1, 2)$, obtained using MATLAB, are shown in Fig. 2. The optimal states are plotted in Fig. 3, and the optimal control is shown in Fig. 4.
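A Monte Carlo experiment of this kind can be sketched as follows. Since the example's numerical matrices are fixed in the paper, the data below (`A`, `B`, `Q`, `R`, the gains `F`, and the transition matrix `Lam`) are placeholders; the sketch only shows the simulation loop: sample a Markov chain path $\theta(k)$, apply a mode-dependent feedback $u(k) = F_{\theta(k)} x(k)$, and average the accumulated cost over the runs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder two-mode data (the paper's numerical values are not reproduced here).
A = [np.array([[0.5, 0.2], [-0.1, 0.7]]), np.array([[0.8, 0.0], [0.3, 0.4]])]
B = [np.array([[1.0], [0.0]]), np.array([[0.0], [1.0]])]
Q = [np.eye(2), np.eye(2)]
R = [np.eye(1), np.eye(1)]
F = [np.array([[-0.3, -0.1]]), np.array([[-0.2, -0.3]])]  # assumed stabilizing gains
Lam = np.array([[0.7, 0.3], [0.4, 0.6]])                  # transition probabilities
x0 = np.array([5.0, 5.0])
N, runs = 20, 50

costs = []
for _ in range(runs):
    theta = 0                                  # initial mode
    x = x0.copy()
    cost = 0.0
    for k in range(N + 1):
        u = F[theta] @ x                       # mode-dependent state feedback
        cost += x @ Q[theta] @ x + u @ R[theta] @ u
        x = A[theta] @ x + B[theta] @ u        # state update along the sampled path
        theta = rng.choice(2, p=Lam[theta])    # sample the next mode
    costs.append(float(cost))

print(np.mean(costs))
```

Averaging over the 50 runs approximates the expected cost; with the optimal gains $F_i = -\Upsilon_i^{-1} M_i$ this average would approach $\mathrm{E}\{x(0)'P_{\theta(0)}x(0)\}$.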
Figure 1: One path of the Markov chain $\theta(k) \in \{1, 2\}$.

Figure 2: The Riccati coefficients of the matrices $P_i(k)$ $(i = 1, 2)$.

Figure 3: The optimal state trajectories.

Figure 4: The optimal control.

This paper has addressed the finite-horizon and infinite-horizon optimal control problems for the MJLS. In the former, a general situation has been considered, and a necessary and sufficient condition for the existence of the optimal controller has been proposed for the first time. We have then proposed a necessary and sufficient condition for the mean square stabilizability of the MJLS: to verify the existence of a stabilizing solution, one only needs to prove the positiveness of the solution to the corresponding CARE, a condition that is easily checkable. As far as we know, no such condition has been given for the mean square stabilizability of the MJLS before.
References

[1] W. P. Blair and D. D. Sworder, "Feedback control of a class of linear discrete systems with jump parameters and quadratic cost criteria," Int. J. Contr., vol. 21, no. 5, pp. 833-841, 1975.
[2] J. D. Birdwell, D. A. Castanon, and M. Athans, "On reliable control system designs," IEEE Trans. Syst., Man, Cybern., vol. SMC-16, pp. 703-710, 1986.
[3] H. J. Chizeck, A. S. Willsky, and D. Castanon, "Discrete-time Markovian-jump linear quadratic optimal control," Int. J. Contr., vol. 43, no. 1, pp. 213-231, 1986.
[4] Y. Ji and H. J. Chizeck, "Controllability, observability and discrete time Markovian jump linear quadratic control," Int. J. Contr., vol. 48, no. 2, pp. 481-498, 1988.
[5] Y. Ji and H. J. Chizeck, "Optimal quadratic control of discrete-time jump linear systems with separately controlled transition probabilities," Int. J. Contr., vol. 49, no. 2, pp. 481-491, 1989.
[6] Y. Ji and H. J. Chizeck, "Bounded sample path control of discrete-time jump linear systems," IEEE Trans. Syst., Man, Cybern., vol. 19, pp. 277-284, 1989.
[7] Y. Ji and H. J. Chizeck, "Controllability, stabilizability, and continuous-time Markovian jump linear quadratic control," IEEE Trans. on Automatic Control, vol. 35, no. 7, pp. 777-788, 1990.
[8] H. Abou-Kandil, G. Freiling, and G. Jank, "On the solution of discrete-time Markovian linear quadratic control problems," Automatica, vol. 31, no. 5, pp. 765-768, 1995.
[9] O. L. V. Costa, M. D. Fragoso, and R. P. Marques, Discrete-Time Markov Jump Linear Systems. New York: Springer-Verlag, 2005.
[10] O. L. V. Costa, E. O. Assumpção Filho, E. K. Boukas, and R. P. Marques, "Constrained quadratic state feedback control of discrete-time Markovian jump linear systems," Automatica, vol. 35, pp. 617-626, 1999.
[11] O. L. V. Costa and E. F. Tuesta, "Finite horizon quadratic optimal control and a separation principle for Markovian jump linear systems," IEEE Trans. on Automatic Control, vol. 48, no. 10, pp. 1836-1842, 2003.
[12] E. F. Costa and J. B. R. do Val, "On the detectability and observability of discrete-time Markov jump linear systems," Systems and Control Letters, vol. 44, pp. 135-145, 2001.
[13] E. F. Costa and J. B. R. do Val, "Weak detectability and the linear-quadratic control problem of discrete-time Markov jump linear systems," Int. J. Contr., vol. 75, no. 16/17, pp. 1282-1292, 2002.
[14] J. B. R. do Val and E. F. Costa, "Stabilizability and positiveness of solutions of the jump linear quadratic problem and the coupled algebraic Riccati equation," IEEE Trans. on Automatic Control, vol. 50, no. 5, pp. 691-695, 2005.
[15] D. D. Sworder, "Feedback control of a class of linear systems with jump parameters," IEEE Trans. on Automatic Control, vol. AC-14, pp. 9-14, 1969.
[16] W. M. Wonham, "Random differential equations in control theory," in Probabilistic Methods in Applied Mathematics, vol. 2, A. T. Bharucha-Reid, Ed. New York: Academic, 1971.
[17] M. Mariton and P. Bertrand, "Robust jump linear quadratic control: A mode stabilizing solution," IEEE Trans. on Automatic Control, vol. AC-30, pp. 1145-1147, 1985.
[18] W. E. Hopkins, Jr., "Comments on 'Robust jump linear quadratic control: A mode stabilizing solution,'" and M. Mariton and P. Bertrand, "Authors' reply," IEEE Trans. on Automatic Control, vol. AC-31, pp. 1079-1081, 1986.
[19] M. Mariton and P. Bertrand, "Output feedback for a class of linear systems with stochastic jumping parameters," IEEE Trans. on Automatic Control, vol. AC-30, pp. 898-900, 1985.
[20] M. Mariton, Jump Linear Systems in Automatic Control. New York: Marcel Dekker, 1990.
[21] R. Tao and Z. Wu, "Maximum principle for optimal control problems of forward-backward regime-switching system and applications," Systems and Control Letters, vol. 61, pp. 911-917, 2012.
[22] O. L. V. Costa and D. Z. Figueiredo, "LQ control of discrete-time jump systems with Markov chain in a general Borel space," IEEE Trans. on Automatic Control, vol. 60, no. 9, pp. 2530-2535, 2015.
[23] O. L. V. Costa and D. Z. Figueiredo, "Quadratic control with partial information for discrete-time jump systems with the Markov chain in a general Borel space," Automatica, vol. 66, pp. 73-84, 2016.
[24] H. Zhang, H. Wang, and L. Li, "Adapted and causal maximum principle and analytical solution to optimal control for stochastic multiplicative-noise systems with multiple input-delays," in Proceedings of the 51st IEEE Conference on Decision and Control, 2012, pp. 2122-2127.
[25] H. Zhang, L. Li, J. Xu, and M. Fu, "Linear quadratic regulation and stabilization of discrete-time systems with delay and multiplicative noise," IEEE Trans. Autom. Control, vol. 60, no. 10, pp. 2599-2613, 2015.
[26] L. Li and H. Zhang, "Stabilization of discrete-time systems with multiplicative noise and multiple delays in the control variable,"