Analysis of the Optimization Landscape of Linear Quadratic Gaussian (LQG) Control
Yang Zheng∗, Yujie Tang∗, and Na Li
School of Engineering and Applied Sciences, Harvard University
February 2021
Abstract
This paper revisits the classical Linear Quadratic Gaussian (LQG) control problem from a modern optimization perspective. We analyze two aspects of the optimization landscape of the LQG problem: 1) connectivity of the set of stabilizing controllers C_n; and 2) structure of stationary points. It is known that similarity transformations do not change the input-output behavior of a dynamical controller or the LQG cost. This inherent symmetry under similarity transformations makes the landscape of LQG very rich. We show that 1) the set of stabilizing controllers C_n has at most two path-connected components, and they are diffeomorphic under a mapping defined by a similarity transformation; 2) there may exist many strictly suboptimal stationary points of the LQG cost function over C_n, and these stationary points are always non-minimal; 3) all minimal stationary points are globally optimal, and they are identical up to a similarity transformation. These results shed some light on the performance analysis of direct policy gradient methods for solving the LQG problem.

1 Introduction

As one of the most fundamental optimal control problems, Linear Quadratic Gaussian (LQG) control has been studied for decades. Many structural properties of the LQG problem have been established in the literature, such as the existence of the optimal controller, the separation principle of the controller structure, and the lack of guaranteed stability margins of closed-loop LQG systems [1, 2, 3]. Despite the non-convexity of the LQG problem, a globally optimal controller can be found by solving two algebraic Riccati equations [1], or a convex semidefinite program based on a change of variables [4, 5]. While extensive results on LQG have been obtained in classical control, its optimization landscape is less studied, i.e., viewing the LQG cost as a function of the controller parameters and studying its analytical and geometrical properties.
On the other hand, recent advances in reinforcement learning (RL) have revealed that the landscape analysis of another benchmark optimal control problem, the linear quadratic regulator (LQR), can lead to fruitful and profound results, especially for model-free controller synthesis [6, 7, 8, 9, 10, 11, 12]. For instance, it is shown that the set of static stabilizing feedback gains for LQR is connected, and that the LQR cost function is coercive and enjoys an interesting property of gradient dominance [6, 13]. These properties are fundamental to establish convergence guarantees for gradient-based algorithms for solving LQR and their model-free extensions for RL [7, 8]. We note that recent studies have also contributed to establishing performance guarantees of model-based RL techniques for LQR (see, e.g., [14, 15]) as well as LQG control [16, 17, 18, 19].

This paper aims to analyze the optimization landscape of the LQG problem. Unlike LQR, which deals with fully observed linear systems whose optimal solution is a static feedback policy, the LQG problem concerns partially observed linear systems driven by additive Gaussian noise, and its optimal controller is no longer static. We need to search over dynamical controllers for LQG problems. This makes its optimization landscape richer and yet much more complicated than that of LQR. Indeed, the set of stabilizing static state feedback policies is connected, but the set of stabilizing static output feedback policies can be highly disconnected [20]. The connectivity of stabilizing dynamical output feedback policies, i.e., the feasible region of LQG control, remains unclear.

∗ Y. Zheng and Y. Tang contributed to this work equally. This work is supported by NSF CAREER 1553407, the AFOSR Young Investigator Program, and the ONR Young Investigator Program. Emails: [email protected]; [email protected]; [email protected].
Furthermore, LQG has a natural symmetry structure induced by similarity transformations, which do not change the input-output behavior of dynamical controllers; this is not the case for LQR. Some recent studies [21, 22, 23, 24, 25] have demonstrated that symmetry properties play a key role in rendering a large class of non-convex optimization problems in machine learning tractable; see also [26] for a recent review. For the LQG problem, we can expect the inherent symmetry under similarity transformations to bring some important properties to its non-convex optimization landscape. We also note that the notion of minimal controllers (a.k.a. controllable and observable controllers; see Appendix A.1) is a unique feature in controller synthesis of partially observed dynamical systems, making the optimization landscape of LQG distinct from many machine learning problems.

In this paper, we view the classical LQG problem from a modern optimization perspective and study two aspects of its optimization landscape. First, we characterize the connectivity of the feasible region of the LQG problem, i.e., the set of strictly proper stabilizing dynamical controllers, denoted by C_n (n is the state dimension). We prove that C_n can be disconnected, but has at most two path-connected components (Theorem 3.1) that are diffeomorphic under a similarity transformation (Theorem 3.2). This brings positive news to gradient-based local search algorithms for the LQG problem, since it makes no difference to search over either path-connected component even if C_n is disconnected.
We further present a sufficient condition under which C_n is always connected, and this condition becomes necessary for LQG problems with a single input or a single output (Theorem 3.3). The sufficient condition naturally holds for any open-loop stable system; thus its set of strictly proper stabilizing dynamical controllers is always connected (Corollary 3.1).

Second, we investigate structural properties of the stationary points of the LQG cost function. When characterizing the stationary points, the notion of minimal controllers plays an important role. By exploiting the symmetry induced by similarity transformations, we show that the LQG cost is very likely to have many strictly suboptimal stationary points, and these stationary points are always non-minimal (Theorem 4.1). For LQG with an open-loop stable plant, we explicitly construct a family of non-minimal stationary points and further establish a criterion for checking whether the corresponding Hessian is indefinite or vanishing (Theorem 4.2). In contrast, we prove that all minimal stationary points are globally optimal to the LQG problem (Theorem 4.3), and form a submanifold of dimension n that has two path-connected components (Proposition 4.1). These minimal stationary points are identical up to similarity transformations. This result implies that if local search iterates converge to a stationary point that corresponds to a controllable and observable controller, then the algorithm has found a globally optimal solution to the LQG problem (Corollary 4.2). Finally, we construct an example showing that the second-order shape of the LQG cost function can be ill-behaved around a minimal stationary point, in the sense that its Hessian has a huge condition number (see Theorem 4.4).

Optimization landscape of LQR
The classical linear quadratic regulator (LQR), one of the most well-studied optimal control problems, has re-attracted increasing interest [6, 14, 7, 11, 27, 28] in the study of RL techniques for control systems. For model-free policy optimization methods, the optimization landscape of LQR is essential to establish their performance guarantees. In [6, 7, 8], it is shown that both continuous-time and discrete-time LQR problems enjoy the gradient dominance property, and that model-free gradient-based algorithms converge to the optimal LQR controller under mild conditions. The authors in [12] have examined the optimization landscape of a class of risk-sensitive state-feedback control problems and the convergence of the corresponding policy optimization methods. Furthermore, it is shown in [29] that a class of finite-horizon output-feedback linear quadratic control problems also satisfies the gradient dominance property. Some recent studies have examined the connectivity of stabilizing static output feedback policies [20, 13, 30]. It is shown in [20] that the set of stabilizing static output feedback policies can be highly disconnected, which poses a significant challenge for decentralized LQR problems. For general decentralized LQR, policy optimization methods can only be guaranteed to reach some stationary points [10].

We note that many landscape properties of LQR are derived using classical control tools [8, 12, 29, 30]. Our work leverages ideas from classical control tools [1, 4, 5] to analyze the optimization landscape of the LQG problem.
Reinforcement learning for LQG and controller parameterization
Recent studies have also started to investigate LQG with unknown dynamics, including offline robust control [16, 17, 18] and online adaptive control [19, 31, 32]. The line of studies on offline robust control first estimates a system model as well as a bound on the estimation error (see, e.g., [16, 33, 34]), and then designs a robust LQG controller that stabilizes the plant against model uncertainty. For online adaptive control, the recent work [19] has introduced an online gradient descent algorithm to update LQG controller parameters with sub-linear regret; see [31, 32] for further developments. For both lines of work, a convex reformulation of the LQG problem is essential for algorithm design as well as performance analysis. For example, the works [19, 31, 32] employ the classical Youla parameterization [35], while the works [17, 18] adopt the recent system-level parameterization (SLP) [36] and input-output parameterization (IOP) [37], respectively. The Youla parameterization, SLP, and IOP recast the LQG problem into equivalent convex formulations in the frequency domain [38], but they all rely on the underlying system dynamics explicitly. Thus, a system identification procedure is required a priori in [16, 17, 18, 19], and these methods are all model-based.

In this work, we consider a natural model-free controller parameterization for the LQG problem in the state-space domain. This parameterization does not depend on the system dynamics explicitly but leads to a non-convex formulation. Our results contribute to the understanding of this non-convex optimization landscape, which sheds light on the performance analysis of model-free RL methods for solving LQG control.
Non-convex optimization with symmetry
Recent works [26, 23] have revealed the significance of symmetry properties in understanding the geometry of many non-convex optimization problems in machine learning. For example, the phase retrieval [21] and low-rank matrix factorization [22, 23] problems have rotational symmetries, while sparse dictionary learning [24] and tensor decomposition [25] exhibit discrete symmetries; see [26] for a recent survey. These symmetries enable identifying the local curvature of stationary points, and contribute to the tractability of the associated non-convex optimization problems. In this paper, we highlight a new symmetry defined by similarity transformations in the LQG problem. This symmetry appears only in dynamical output-feedback controller synthesis. In addition, the notion of minimal controllers is unique to control problems, making the study of the landscape of LQG distinct from other machine learning problems [21, 22, 24, 25, 26].
The rest of this paper is organized as follows. Section 2 presents the problem statement of Linear Quadratic Gaussian (LQG) control. We introduce our main results on the connectivity of stabilizing controllers in Section 3, and present our main results on the structure of stationary points of LQG problems in Section 4. Some numerical results on gradient descent algorithms for LQG are reported in Section 5. We conclude the paper in Section 6. The appendices contain preliminaries in control and differential geometry, proofs of auxiliary results, a connectivity result for proper stabilizing controllers, and analogous results for discrete-time systems.
Notations
We use R and N to denote the set of real and natural numbers, respectively. The set of k × k real symmetric matrices is denoted by S^k, and the determinant of a square matrix M is denoted by det M. We denote the set of k × k real invertible matrices by GL_k = { T ∈ R^{k×k} | det T ≠ 0 }. Given a matrix M ∈ R^{k×k}, M^T denotes the transpose of M, and ‖M‖_F denotes the Frobenius norm of M. For any M_1, M_2 ∈ S^k, we use M_1 ≺ M_2 and M_2 ≻ M_1 to mean that M_2 − M_1 is positive definite, and use M_1 ⪯ M_2 and M_2 ⪰ M_1 to mean that M_2 − M_1 is positive semidefinite. We use I_k to denote the k × k identity matrix and 0_{k×k} to denote the k × k zero matrix; we sometimes omit their subscripts if the dimensions can be inferred from the context.

2 Problem Statement

In this section, we first introduce the Linear Quadratic Gaussian control problem, and then present the problem statement of our work.
2.1 Linear Quadratic Gaussian Control

Consider a continuous-time linear dynamical system

    ẋ(t) = A x(t) + B u(t) + w(t),
    y(t) = C x(t) + v(t),                                        (1)

where x(t) ∈ R^n represents the vector of state variables, u(t) ∈ R^m the vector of control inputs, y(t) ∈ R^p the vector of measured outputs available for feedback, and w(t) ∈ R^n, v(t) ∈ R^p are the process and measurement noises at time t. It is assumed that w(t) and v(t) are white Gaussian noises with intensity matrices W ⪰ 0 and V ≻ 0. For notational simplicity, we will drop the argument t when it is clear from the context. We only consider the continuous-time case in the main text; the results for discrete-time systems can be found in Appendix D.

The LQG optimal control problem is formulated as

    min_{u(t)}  J := lim_{T→∞} (1/T) E[ ∫_0^T ( x^T Q x + u^T R u ) dt ]
    subject to (1),                                              (2)

where Q ⪰ 0 and R ≻ 0. In (2), the input u(t) is allowed to depend on all past observations y(τ) with τ < t. Throughout the paper, we make the following standard assumption of minimal systems in the sense of Kalman (see Appendix A.1 for a review of these notions).

Assumption 1. (A, B) and (A, W^{1/2}) are controllable, and (C, A) and (Q^{1/2}, A) are observable.

Unlike the linear quadratic regulator (LQR) problem, static feedback policies in general do not achieve optimal values of the cost function, and we need to consider a class of dynamical controllers of the form

    ξ̇(t) = A_K ξ(t) + B_K y(t),
    u(t) = C_K ξ(t),                                             (3)

where ξ(t) ∈ R^q is the internal state of the controller, and A_K, B_K, C_K are matrices of proper dimensions that specify the dynamics of the controller. We refer to the dimension q of the internal state ξ as the order of the dynamical controller (3). A dynamical controller is called a full-order dynamical controller if its order is the same as the system dimension, i.e., q = n; if q < n, we call (3) a reduced-order or lower-order controller.
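To make the setup concrete, the following sketch instantiates (1)-(3) on a small hypothetical plant and computes a full-order stabilizing controller via the classical Riccati-based construction reviewed in the next subsection (all numerical values are illustrative and not from the paper; scipy is assumed available):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative plant data for (1) (hypothetical values).
A = np.array([[0.5, 1.0], [0.0, -1.0]])   # open-loop unstable (eigenvalue 0.5)
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q = np.eye(2); R = np.eye(1)              # state and input weights in (2)
W = np.eye(2); V = np.eye(1)              # process and measurement noise intensities

# Filtering and control algebraic Riccati equations.
P = solve_continuous_are(A.T, C.T, W, V)  # A P + P A^T - P C^T V^{-1} C P + W = 0
S = solve_continuous_are(A, B, Q, R)      # A^T S + S A - S B R^{-1} B^T S + Q = 0
L = P @ C.T @ np.linalg.inv(V)            # Kalman gain
K = np.linalg.inv(R) @ B.T @ S            # feedback gain

# Full-order controller: A_K = A - B K - L C, B_K = L, C_K = -K.
AK, BK, CK = A - B @ K - L @ C, L, -K

# Closed-loop matrix; internal stability = all eigenvalues in the open left half-plane.
Acl = np.block([[A, B @ CK], [BK @ C, AK]])
assert np.all(np.linalg.eigvals(Acl).real < 0)
```

By the separation principle, the closed-loop spectrum is the union of the spectra of A − BK and A − LC, which the final assertion confirms numerically.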
We shall see later that it is unnecessary to consider dynamical controllers with order beyond the system dimension n.

The LQG problem (2) admits the celebrated separation principle and has a closed-form solution obtained by solving two algebraic Riccati equations [1, Theorem 14.7]. Indeed, the optimal solution to (2) is u(t) = −K ξ(t), where K is a fixed m × n matrix and ξ(t) is the state estimate produced by the Kalman filter. Precisely, the optimal controller is given by

    ξ̇ = (A − BK) ξ + L (y − C ξ),
    u = −K ξ.                                                    (4)

In (4), the matrix L is called the Kalman gain, computed as L = P C^T V^{−1}, where P is the unique positive semidefinite solution (see, e.g., [1, Corollary 13.8]) to

    A P + P A^T − P C^T V^{−1} C P + W = 0,                      (5a)

and the matrix K is called the feedback gain, computed as K = R^{−1} B^T S, where S is the unique positive semidefinite solution to

    A^T S + S A − S B R^{−1} B^T S + Q = 0.                      (5b)

We can see that the optimal LQG controller (4) can be written in the form of (3) with

    A_K = A − BK − LC,  B_K = L,  C_K = −K.                      (6)

Thus, the solution from the Riccati equations (5) is always full-order, i.e., q = n. We note that two dynamical controllers with the same transfer function K(s) = C_K (sI − A_K)^{−1} B_K lead to the same LQG cost. In general, the optimal LQG controller is only unique in the frequency domain [1, Theorem 14.7] but not unique in the state-space domain (3); any similarity transformation of (6) leads to another optimal solution that achieves the globally minimum cost. This is a well-known fact and can be verified easily; see Lemma 4.1.

2.2 Parameterization of Dynamical Controllers and the LQG Cost Function

The controller (4) explicitly depends on the plant's parameters
A, B, C, and it may not be straightforward to compute (4) if
A, B, and C are not available. Recently, model-free policy gradient methods have been applied to a range of control problems, such as LQR in discrete time [6] and continuous time [8], the finite-horizon discrete-time LQG problem [29], and state-feedback risk-sensitive control [12]. These methods view classical control problems from a modern optimization perspective and directly optimize control policies based on system observations, without explicit knowledge of the underlying model. To avoid the explicit dependence on the model parameters A, B, C, we consider the class of dynamical controllers in (3), parameterized by (A_K, B_K, C_K). As we will see later, this allows us to view LQG (2) from a model-free optimization perspective.

In order to formulate the cost in (2) as a function of the parameterized dynamical controller (A_K, B_K, C_K), we first need to specify its domain. By combining (3) with (1), we get the closed-loop system

    d/dt [x; ξ] = [A, B C_K; B_K C, A_K] [x; ξ] + [I, 0; 0, B_K] [w; v],
    [y; u] = [C, 0; 0, C_K] [x; ξ] + [v; 0].                     (7)

It is known from classical control theory [1, Chapter 13] that under Assumption 1, the LQG cost is finite if the closed-loop matrix

    [A, B C_K; B_K C, A_K] = [A, 0; 0, 0] + [B, 0; 0, I] [D_K, C_K; B_K, A_K] [C, 0; 0, I]   (8)

is stable [1], i.e., the real parts of all its eigenvalues are negative; dynamical controllers satisfying this condition are said to internally stabilize the plant (1). Furthermore, it is a known fact in control theory that the optimal controller (6) obtained by solving the Riccati equations internally stabilizes the plant. We therefore parameterize the set of stabilizing controllers of order q ∈ N by

    C_q := { K = [D_K, C_K; B_K, A_K] ∈ R^{(m+q)×(p+q)} | D_K = 0_{m×p}, (8) is stable },   (9)

and let J_q(K) : C_q → R denote the function that maps a parameterized dynamical controller in C_q to its corresponding LQG cost for each q ∈ N. It can be shown that the set of full-order stabilizing controllers C_n is nonempty as long as Assumption 1 holds [1], and since it also contains the optimal LQG controller for (2), we will mainly focus on the set of full-order stabilizing controllers C_n in this paper. We will abbreviate J_n(K) as J(K) when no confusion occurs.

The following lemma shows that the set C_q can be treated as an open set when it is nonempty. This is a direct consequence of the fact that the Routh–Hurwitz stability criterion returns a set of strict polynomial inequalities in terms of the elements of (A_K, B_K, C_K).

Lemma 2.1.
Let q ≥ 0 be such that C_q is nonempty. Then, C_q is an open subset of the linear space

    V_q := { [D_K, C_K; B_K, A_K] ∈ R^{(m+q)×(p+q)} | D_K = 0_{m×p} }.   (10)

We explicitly include the zero matrix D_K in the definition of C_q, which corresponds to the set of strictly proper dynamical controllers. If we allow D_K to be non-zero, we have a proper dynamical controller; see Appendix C. In this definition, when q = 0, we have C_q = {0_{m×p}} if the plant (1) is open-loop stable, and C_q = ∅ otherwise. In (9), for notational simplicity, we have lumped the controller parameters into a single matrix; it should be interpreted as a dynamical controller, represented by (3). Note that this definition allows us to apply block-wise matrix operations; see, e.g., (14).

The following two lemmas characterize the LQG cost function J_q. These results are known in the literature; we provide a short proof in Appendix B.1 for completeness.

Lemma 2.2.
Fix q ∈ N such that C_q ≠ ∅. Given K ∈ C_q, we have

    J_q(K) = tr( [Q, 0; 0, C_K^T R C_K] X_K ) = tr( [W, 0; 0, B_K V B_K^T] Y_K ),   (11)

where X_K and Y_K are the unique positive semidefinite solutions to the following Lyapunov equations:

    [A, B C_K; B_K C, A_K] X_K + X_K [A, B C_K; B_K C, A_K]^T + [W, 0; 0, B_K V B_K^T] = 0,   (12a)
    [A, B C_K; B_K C, A_K]^T Y_K + Y_K [A, B C_K; B_K C, A_K] + [Q, 0; 0, C_K^T R C_K] = 0.   (12b)

Lemma 2.3.
Fix q ∈ N such that C_q ≠ ∅. Then, J_q is a real analytic function on C_q.

Now, given the dimension n of the plant's state variable, the LQG problem (2) can be reformulated as a constrained optimization problem:

    min_K  J_n(K)  subject to  K ∈ C_n.                          (13)

After reformulating the LQG problem (2) into (13), it is possible to estimate the gradient of J_n(K) from system trajectories, and one may further derive model-free policy gradient algorithms to find a solution to (13). To characterize the performance of policy gradient algorithms, it is necessary to understand the landscape of (13). It is well known that C_n is in general non-convex. Lemma 2.3 indicates that J_n is a real analytic function. However, little is known about their further geometrical and analytical properties, especially those that are fundamental for establishing convergence of gradient-based algorithms. In this paper, we focus on the following two topics concerning the set C_n and the LQG cost function J_n:

1) The connectivity of C_n and its implications, which will be studied in Section 3. Connectivity of stabilizing controllers has received increasing attention, but most recent results focus on state-feedback controllers or static output-feedback controllers [6, 8, 13, 20]. It is known that the set of stabilizing state-feedback policies is in general non-convex but connected, and this connectivity is fundamental for gradient-based local search algorithms to find a good solution. It is also known that the set of stabilizing static output-feedback policies can be highly disconnected (there exist cases with an exponential number of connected components [20]). The connectivity of the set of dynamical controllers C_n, however, is unknown and has not been discussed before in the literature.

2) The structure of the stationary points and the global optimum of J_n, which will be studied in Section 4.
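For a given K ∈ C_n, the cost J_n(K) in (13) can be evaluated by solving the Lyapunov equation (12a), and a crude policy gradient can be approximated by finite differences. A minimal sketch on a hypothetical scalar plant (the controller values are also hypothetical):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical scalar plant and weights (n = m = p = 1).
A = B = C = np.array([[1.0]])
Q = R = W = V = np.array([[1.0]])

def lqg_cost(AK, BK, CK):
    """Evaluate J(K) via (11)-(12a): J = tr(diag(Q, C_K^T R C_K) X_K)."""
    Acl = np.block([[A, B @ CK], [BK @ C, AK]])
    assert np.all(np.linalg.eigvals(Acl).real < 0), "K must be stabilizing"
    Wcl = np.block([[W, np.zeros((1, 1))], [np.zeros((1, 1)), BK @ V @ BK.T]])
    X = solve_continuous_lyapunov(Acl, -Wcl)        # Acl X + X Acl^T + Wcl = 0
    Qcl = np.block([[Q, np.zeros((1, 1))], [np.zeros((1, 1)), CK.T @ R @ CK]])
    return float(np.trace(Qcl @ X))

# A (non-optimal) stabilizing controller theta = (A_K, B_K, C_K).
theta = np.array([-5.0, 2.0, -3.0])
def J(t):
    return lqg_cost(*(np.array([[v]]) for v in t))

# Forward-difference gradient and one small gradient step.
eps = 1e-6
grad = np.array([(J(theta + eps * e) - J(theta)) / eps for e in np.eye(3)])
assert J(theta - 1e-3 * grad) < J(theta)            # a small step along -grad decreases the cost
```

This is exactly the model-based surrogate of a policy gradient iteration; a model-free method would replace the Lyapunov solve by cost estimates from sampled trajectories.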
A classical result in control is that the solution to the LQR problem is unique under mild technical assumptions, which is an important fact in establishing the gradient dominance property of the LQR cost function [6, 8]. In addition, it has been recently shown that a class of output-feedback controller design problems in finite-time horizon also has a unique stationary point [29]. However, it is expected that the stationary points of the LQG problem (13) are not unique, due to the non-uniqueness of globally optimal solutions in the state-space domain. We aim to reveal further structural properties of stationary points of the LQG problem (13).

3 Connectivity of the Set of Stabilizing Controllers
In this section, we characterize the connectivity of the set of full-order stabilizing controllers C_n. We first have the following observation.

Lemma 3.1.
Under Assumption 1, the set C_n is non-empty, unbounded, and can be non-convex.

Proof. It is a well-known fact in control theory that C_n ≠ ∅ under Assumption 1. In particular, any pole assignment algorithm, or solving the Riccati equations (5a) and (5b), can find a feasible point in C_n. To show the unboundedness of C_n, we introduce the following set

    S_n = { K = [0, C_K; B_K, A_K] ∈ R^{(m+n)×(p+n)} | A_K = A − BK − LC, B_K = L, C_K = −K, A − BK and A − LC are stable }.

It has been established in classical control theory that S_n ⊂ C_n [1, Chapter 3.5], and the set { K | A − BK is stable } is unbounded (see, e.g., [13, Observation 3.6]). Thus, the set S_n is unbounded, and so is C_n. Non-convexity of C_n is also known and can be illustrated by the explicit counterexample in Example 1.

Example 1 (Non-convexity of stabilizing controllers). Consider a dynamical system (1) with A = 1, B = 1, C = 1. The set of stabilizing controllers C_n = C_1 is given by

    C_1 = { K = [0, C_K; B_K, A_K] ∈ R^{2×2} | [A, B C_K; B_K C, A_K] is stable }.

It is easy to verify that the dynamical controllers

    K^(1) = [0, −5; 1, −2],    K^(2) = [0, 5; −1, −2]

internally stabilize the plant and thus belong to C_1. However, their average K̂ = (K^(1) + K^(2))/2 = [0, 0; 0, −2] fails to stabilize the plant.

We first introduce the notion of similarity transformation, which has been widely used in control theory. Given q ≥ 1 such that C_q ≠ ∅, we define the mapping T_q : GL_q × C_q → C_q that represents similarity transformations on C_q by

    T_q(T, K) := [I_m, 0; 0, T] [D_K, C_K; B_K, A_K] [I_p, 0; 0, T]^{−1} = [D_K, C_K T^{−1}; T B_K, T A_K T^{−1}].   (14)

It is not hard to verify that for any invertible matrix T ∈ GL_q and K ∈ C_q, T_q(T, K) is indeed a stabilizing controller of order q and thus lies in C_q. We can also check that T_q is infinitely differentiable on GL_q × C_q, and that

    T_q(T_1, T_q(T_2, K)) = T_q(T_1 T_2, K)                      (15)

for any invertible T_1, T_2 ∈ GL_q. This implies that for any fixed T ∈ GL_q, the map K ↦ T_q(T, K) admits an inverse given by K ↦ T_q(T^{−1}, K). Therefore, we have the following result (see Appendix A.3 for a review of manifolds and diffeomorphisms).

Lemma 3.2.
Given q ≥ 1 such that C_q ≠ ∅, for any invertible matrix T ∈ GL_q, the map K ↦ T_q(T, K) is a diffeomorphism from C_q to itself.

Our main technical results in this section concern the path-connectivity of C_n. Recall that T_n(T, K) is defined by (14). For notational simplicity, for any fixed T ∈ GL_n, we let T_T : C_n → C_n denote the mapping given by

    T_T(K) := T_n(T, K) = [D_K, C_K T^{−1}; T B_K, T A_K T^{−1}].

We are now ready to present the main technical results.
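Lemma 3.2 and the underlying symmetry can be checked numerically: applying a similarity transformation (14) to a stabilizing controller changes its state-space matrices but leaves both the closed-loop spectrum and the LQG cost unchanged. A sketch with hypothetical plant and controller data, evaluating the cost via the Lyapunov equation (12a):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical plant (n = 2) and weights.
A = np.array([[0.0, 1.0], [-1.0, -1.0]])
B = np.array([[0.0], [1.0]]); C = np.array([[1.0, 0.0]])
Q = np.eye(2); R = np.eye(1); W = np.eye(2); V = np.eye(1)

def cost_and_spectrum(AK, BK, CK):
    Acl = np.block([[A, B @ CK], [BK @ C, AK]])
    assert np.all(np.linalg.eigvals(Acl).real < 0)
    Wcl = np.block([[W, np.zeros((2, 2))], [np.zeros((2, 2)), BK @ V @ BK.T]])
    X = solve_continuous_lyapunov(Acl, -Wcl)        # Lyapunov equation (12a)
    Qcl = np.block([[Q, np.zeros((2, 2))], [np.zeros((2, 2)), CK.T @ R @ CK]])
    return float(np.trace(Qcl @ X)), np.sort(np.linalg.eigvals(Acl))

# A (hypothetical) stabilizing full-order controller.
AK = np.array([[-3.0, 1.0], [0.0, -2.0]])
BK = np.array([[0.1], [0.1]]); CK = np.array([[-0.1, -0.1]])

# Similarity transformation (14): K -> (T A_K T^{-1}, T B_K, C_K T^{-1}).
T = np.array([[2.0, 1.0], [0.0, -1.0]])             # any invertible T (here det T < 0)
Ti = np.linalg.inv(T)
J1, eig1 = cost_and_spectrum(AK, BK, CK)
J2, eig2 = cost_and_spectrum(T @ AK @ Ti, T @ BK, CK @ Ti)
assert np.isclose(J1, J2)                           # the LQG cost is invariant
assert np.allclose(eig1, eig2)                      # the closed-loop spectrum is invariant
```

Since det T < 0 here, this transformation is exactly the kind of map T_T that, by Theorem 3.2 below, exchanges the two path-connected components of C_n when they exist.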
Theorem 3.1.
Under Assumption 1, the set C_n has at most two path-connected components.

Theorem 3.2. If C_n has two path-connected components C_n^(1) and C_n^(2), then C_n^(1) and C_n^(2) are diffeomorphic under the mapping T_T, for any invertible matrix T ∈ R^{n×n} with det T < 0.

Theorem 3.2 shows that even if C_n has two path-connected components, there exists a linear bijection, defined by a similarity transformation T_T, between these two components. In the following theorem, we present a sufficient condition under which C_n is path-connected. This condition becomes necessary for the class of dynamical systems with a single input or a single output.

Theorem 3.3.
Under Assumption 1, the following statements are true.

1) C_n is path-connected if there exists a reduced-order stabilizing controller, i.e., C_{n−1} ≠ ∅.

2) Suppose the plant (1) is single-input or single-output, i.e., m = 1 or p = 1. Then the set C_n is path-connected if and only if C_{n−1} ≠ ∅.

One main idea in our proofs is based on a classical change of variables for dynamical controllers (see, e.g., [5]). We adopt this change of variables to construct a set with a convex projection and a surjective mapping from that set to C_n; path-connectivity results then generally follow from the fact that a convex set is path-connected. The potential disconnectedness of C_n comes from the fact that the set of real invertible matrices GL_n = { Π ∈ R^{n×n} | det Π ≠ 0 } has two path-connected components [39]: GL_n^+ = { Π ∈ R^{n×n} | det Π > 0 } and GL_n^- = { Π ∈ R^{n×n} | det Π < 0 }. The full proofs are technically involved, and we postpone them to Sections 3.2-3.4.

Here, we note that for any open-loop unstable first-order dynamical system, i.e., n = 1 and A > 0 in (1), it is easy to see that there exist no reduced-order stabilizing controllers, i.e., C_{n−1} = ∅. Thus, Theorem 3.3 indicates that its associated set of stabilizing controllers C_n is not path-connected. We provide an explicit single-input single-output (SISO) example below.

Example 2 (Disconnectivity of stabilizing controllers). Consider the dynamical system in Example 1: A = 1, B = 1, C = 1.

Figure 1: The set of stabilizing controllers C_1 for Examples 2 and 3: (a) for Example 2, the set C_1 given by (16) has two path-connected components; (b) for Example 3, the set C_1 given by (17) is path-connected.

Since the system is open-loop unstable and has state dimension n = 1, we know C_{n−1} = ∅. Thus, Theorem 3.3 indicates that its associated set of stabilizing controllers C_n is not path-connected. Indeed, using the Routh–Hurwitz stability criterion, it is straightforward to derive that

    C_1 = { K = [0, C_K; B_K, A_K] ∈ R^{2×2} | [A, B C_K; B_K C, A_K] is stable }
        = { K = [0, C_K; B_K, A_K] ∈ R^{2×2} | A_K < −1, B_K C_K < A_K }.   (16)

This set has two path-connected components: C_1 = C_1^+ ∪ C_1^-, with C_1^+ ∩ C_1^- = ∅, where

    C_1^+ := { K = [0, C_K; B_K, A_K] ∈ R^{2×2} | A_K < −1, B_K C_K < A_K, B_K > 0 },
    C_1^- := { K = [0, C_K; B_K, A_K] ∈ R^{2×2} | A_K < −1, B_K C_K < A_K, B_K < 0 }.

In addition, as expected from Theorem 3.2, it is easy to verify that C_1^+ and C_1^- are homeomorphic under the mapping T_T, for any T < 0. Figure 1a illustrates the region of the set C_1 in (16). In Appendix B.3, we present a nontrivial second-order SISO system for which C_{n−1} = ∅ and C_n is disconnected.

Theorem 3.3 also suggests the following corollary.

Corollary 3.1.
Given any open-loop stable dynamical system (1), i.e., A is stable, the set C_n is path-connected.

Proof. Since the dynamical system (1) is open-loop stable, we have

    K = [0_{m×p}, 0_{m×(n−1)}; 0_{(n−1)×p}, −I_{n−1}] ∈ C_{n−1},

and thus C_{n−1} ≠ ∅. By Theorem 3.3, C_n is path-connected.

Example 3 (Stabilizing controllers for an open-loop stable system). Consider an open-loop stable dynamical system (1) with A = −1, B = 1, C = 1. Since it is open-loop stable, Corollary 3.1 indicates that its associated set of stabilizing controllers C_n is path-connected. Using the Routh–Hurwitz stability criterion, it is straightforward to derive that

    C_1 = { K = [0, C_K; B_K, A_K] ∈ R^{2×2} | A_K < 1, B_K C_K < −A_K }.   (17)

This set is path-connected, as illustrated in Figure 1b.

Before presenting the technical proofs, we note that the controllers in C_n defined in (9) are always strictly proper, which is sufficient for the LQG problem (2). For closed-loop stability, we can also consider proper dynamical controllers. We provide this discussion in Appendix C: unlike C_n, which might be disconnected, the set of proper stabilizing dynamical controllers is always connected (see Theorem C.1).

Remark 1. Motivated by the success of data-driven RL techniques, some recent studies have revisited the classical LQR problem from a modern optimization perspective and designed policy gradient algorithms [6, 8, 12]. The connectivity of the feasible region (i.e., the set of stabilizing controllers) is important for local search algorithms (e.g., policy gradient) since they typically cannot jump between different connected components. It is known that the set of stabilizing static state-feedback policies { K ∈ R^{m×n} | A − BK is stable } is connected [13], and this is one important factor in justifying the performance of the algorithms in [6, 8, 12].
On the other hand, the set of stabilizing static output-feedback policies $\{D_K \in \mathbb{R}^{m\times p} \mid A - BD_KC \text{ is stable}\}$ can be highly disconnected [20], posing a significant challenge for local search algorithms. In Theorems 3.1, 3.2 and 3.3, we have shown that the set of stabilizing controllers $\mathcal{C}_n$ in the LQG problem has at most two path-connected components, and that these components are diffeomorphic to each other under a particular similarity transformation. Since a similarity transformation does not change the input-output behavior of a controller (see Appendix A.1), it makes no difference to search over either path-connected component of $\mathcal{C}_n$, even if $\mathcal{C}_n$ is not path-connected. This brings positive news for gradient-based local search algorithms for the LQG problem.

The following Lyapunov stability criterion [40] plays a central role in our proof: a square real matrix $M$ is stable if and only if the Lyapunov inequality $MP + PM^{\top} \prec 0$ admits a positive definite solution $P \succ 0$. The analysis of the path-connectivity of $\mathcal{C}_n$ is similar to the analysis of the connectivity of the set of stabilizing static state-feedback policies: we first adopt a classical change of variables that has been used for developing convex reformulations of controller synthesis problems, and then the path-connectivity results follow from the fact that a convex set is path-connected; see Remark 2 for details.

Remark 2. The path-connectivity of the set of stabilizing static state-feedback policies $\{K \in \mathbb{R}^{m\times n} \mid A - BK \text{ is stable}\}$ is easy to show:
$$\begin{aligned} \{K \in \mathbb{R}^{m\times n} \mid A - BK \text{ is stable}\} &= \{K \in \mathbb{R}^{m\times n} \mid \exists P \succ 0,\ P(A - BK)^{\top} + (A - BK)P \prec 0\} \\ &= \{K \in \mathbb{R}^{m\times n} \mid \exists P \succ 0,\ PA^{\top} - L^{\top}B^{\top} + AP - BL \prec 0,\ L = KP\} \\ &= \{K = LP^{-1} \in \mathbb{R}^{m\times n} \mid \exists P \succ 0,\ PA^{\top} - L^{\top}B^{\top} + AP - BL \prec 0\}. \end{aligned} \quad (18)$$
Since the set
$$\{(P, L) \mid P \succ 0,\ PA^{\top} - L^{\top}B^{\top} + AP - BL \prec 0\} \quad (19)$$
is convex and the map $K = LP^{-1}$ is continuous for the elements in (19), we know that $\{K \in \mathbb{R}^{m\times n} \mid A - BK \text{ is stable}\}$ is path-connected.
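The Lyapunov criterion and the change of variables in (18)-(19) can be illustrated with a small numerical sketch (Python, with numpy and scipy assumed; the random plant and the choice $B = I$ are hypothetical, used only so that a stabilizing gain is available in closed form):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))
B = np.eye(n)                  # B = I for convenience (hypothetical plant data)
K = A + np.eye(n)              # then A - BK = -I is certainly stable

# Lyapunov criterion: A - BK is stable iff (A-BK)P + P(A-BK)^T = -I has P > 0.
Acl = A - B @ K
P = solve_continuous_lyapunov(Acl, -np.eye(n))   # here P = I/2
assert np.min(np.linalg.eigvalsh((P + P.T) / 2)) > 0

# Change of variables (18)-(19): (P, L) with L = K P satisfies the convex
# constraint (19), and K is recovered continuously as K = L P^{-1}.
L = K @ P
lmi = P @ A.T - L.T @ B.T + A @ P - B @ L        # equals Acl P + P Acl^T = -I
assert np.max(np.linalg.eigvalsh((lmi + lmi.T) / 2)) < 0
assert np.allclose(K, L @ np.linalg.inv(P))
```

The same convexity of (19) is what makes the path between any two stabilizing gains easy to construct.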
The second equivalence in (18) utilizes the well-known change of variables $K = LP^{-1}$. This trick is essential for deriving convex reformulations for state-feedback controller synthesis in various setups [40]; we note that it has also been used in [13, 8]. The main strategy in the proof of Theorem 3.1 is similar to (18), but we need a more complicated change of variables for dynamical controllers in the state-space domain [5]. To see the difficulty, applying the Lyapunov stability criterion leads to
$$\begin{bmatrix} A + BD_KC & BC_K \\ B_KC & A_K \end{bmatrix} \text{ is stable} \iff \exists P \succ 0,\ P\begin{bmatrix} A + BD_KC & BC_K \\ B_KC & A_K \end{bmatrix}^{\top} + \begin{bmatrix} A + BD_KC & BC_K \\ B_KC & A_K \end{bmatrix}P \prec 0, \quad (20)$$
where the coupling between the auxiliary variable $P$ and the controller parameters $A_K, B_K, C_K, D_K$ is much more involved.

In our proof, we adopt the change of variables presented in [5]. Given the system dynamics $(A, B, C)$ in (1), we first introduce the convex set
$$\mathcal{F}_n := \left\{ (X, Y, M, G, H, F) \,\middle|\, \begin{array}{l} X, Y \in \mathbb{S}^n,\ M \in \mathbb{R}^{n\times n},\ G = 0_{m\times p},\ H \in \mathbb{R}^{n\times p},\ F \in \mathbb{R}^{m\times n}, \\[2pt] \begin{bmatrix} X & I \\ I & Y \end{bmatrix} \succ 0,\ \begin{bmatrix} AX + BF & A + BGC \\ M & YA + HC \end{bmatrix} + \begin{bmatrix} AX + BF & A + BGC \\ M & YA + HC \end{bmatrix}^{\top} \prec 0 \end{array} \right\}, \quad (21)$$
and the extended set
$$\mathcal{G}_n := \left\{ Z = (X, Y, M, G, H, F, \Pi, \Xi) \,\middle|\, (X, Y, M, G, H, F) \in \mathcal{F}_n,\ \Pi, \Xi \in \mathbb{R}^{n\times n},\ \Xi\Pi = I - YX \right\}. \quad (22)$$
We shall later see that there exists a continuous surjective map from $\mathcal{G}_n$ to $\mathcal{C}_n$, and the path-connectivity of the convex set $\mathcal{F}_n$ plays a key role in analyzing the path-connected components of $\mathcal{C}_n$. Before proceeding, we note the following observation for each element in $\mathcal{G}_n$.

Lemma 3.3.
For any $(X, Y, M, G, H, F, \Pi, \Xi) \in \mathcal{G}_n$, the matrices $\Pi$ and $\Xi$ are always invertible, and consequently the block triangular matrices
$$\begin{bmatrix} I & 0 \\ YB & \Xi \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} I & CX \\ 0 & \Pi \end{bmatrix}$$
are invertible.

(We explicitly include the matrix $D_K$ in the Lyapunov inequality (20): $D_K = 0$ corresponds to strictly proper controllers and $D_K \neq 0$ corresponds to proper controllers; see Appendix C. Similarly, we explicitly include the zero matrix $G$ in the definition of $\mathcal{F}_n$; its purpose will become clear when studying the set of proper stabilizing controllers in Appendix C.)

Proof. By definition, for all $(X, Y, M, G, H, F, \Pi, \Xi) \in \mathcal{G}_n$, we have $\begin{bmatrix} X & I \\ I & Y \end{bmatrix} \succ 0$, implying that
$$\det(YX - I) = \det X \det\left( Y - X^{-1} \right) = \det \begin{bmatrix} X & I \\ I & Y \end{bmatrix} > 0.$$
Thus $\det(\Xi)\det(\Pi) = \det(I - YX) \neq 0$, indicating that $\Pi$ and $\Xi$ are both invertible. The invertibility of the two block triangular matrices is then straightforward. $\square$

We now define a mapping from $\mathcal{G}_n$ to a subset of $\mathbb{R}^{(m+n)\times(p+n)}$.

Definition 1 (Change of variables via a nonlinear mapping). For each $Z = (X, Y, M, G, H, F, \Pi, \Xi) \in \mathcal{G}_n$, let
$$\Phi(Z) = \begin{bmatrix} \Phi_D(Z) & \Phi_C(Z) \\ \Phi_B(Z) & \Phi_A(Z) \end{bmatrix} := \begin{bmatrix} I & 0 \\ YB & \Xi \end{bmatrix}^{-1} \begin{bmatrix} G & F \\ H & M - YAX \end{bmatrix} \begin{bmatrix} I & CX \\ 0 & \Pi \end{bmatrix}^{-1}. \quad (23)$$
It is easy to see that $\Phi_D(Z) \equiv G \equiv 0$ for $Z \in \mathcal{G}_n$. We point out that the mapping (23) is derived from the change of variables presented in [5], which is essential for obtaining equivalent convex reformulations of a range of output-feedback controller synthesis problems, including $\mathcal{H}_\infty$ and $\mathcal{H}_2$ optimal control. The following result builds an explicit connection between $\mathcal{G}_n$ and $\mathcal{C}_n$ via the mapping $\Phi$; its proof is provided in Appendix B.2.

Proposition 3.1.
The mapping $\Phi$ in (23) is a continuous and surjective mapping from $\mathcal{G}_n$ to $\mathcal{C}_n$.

After establishing the continuous surjection from $\mathcal{G}_n$ to $\mathcal{C}_n$, it is clear that we can study the path-connectivity of $\mathcal{C}_n$ via the path-connectivity of $\mathcal{G}_n$: any continuous path in $\mathcal{G}_n$ is mapped to a continuous path in $\mathcal{C}_n$, and thus any path-connected component of $\mathcal{G}_n$ has a path-connected image under the mapping $\Phi$. Consequently, the number of path-connected components of $\mathcal{C}_n$ is no more than the number of path-connected components of $\mathcal{G}_n$.

We now proceed to provide results on the path-connectivity of the set $\mathcal{G}_n$.

Proposition 3.2.
The set $\mathcal{G}_n$ has two path-connected components, given by
$$\mathcal{G}_n^{+} = \{ (X, Y, M, G, H, F, \Pi, \Xi) \in \mathcal{G}_n \mid \det\Pi > 0 \}, \qquad \mathcal{G}_n^{-} = \{ (X, Y, M, G, H, F, \Pi, \Xi) \in \mathcal{G}_n \mid \det\Pi < 0 \}.$$

Proof.
First, the convexity of $\mathcal{F}_n$ implies that $\mathcal{F}_n$ is path-connected. We then note that the set of real invertible matrices $\mathrm{GL}_n = \{\Pi \in \mathbb{R}^{n\times n} \mid \det\Pi \neq 0\}$ has two path-connected components [39],
$$\mathrm{GL}_n^{+} = \{\Pi \in \mathbb{R}^{n\times n} \mid \det\Pi > 0\}, \qquad \mathrm{GL}_n^{-} = \{\Pi \in \mathbb{R}^{n\times n} \mid \det\Pi < 0\}.$$
Therefore the Cartesian product $\mathcal{F}_n \times \mathrm{GL}_n$ has two path-connected components. Finally, it is not hard to verify that the mapping
$$(X, Y, M, G, H, F, \Pi) \mapsto \left( X, Y, M, G, H, F, \Pi, (I - YX)\Pi^{-1} \right)$$
is a continuous bijection from $\mathcal{F}_n \times \mathrm{GL}_n$ to $\mathcal{G}_n$. Therefore $\mathcal{G}_n$ also has two path-connected components, and their expressions are evident. $\square$

Proposition 3.2 then implies that $\mathcal{C}_n$ has at most two path-connected components. Precisely, upon defining
$$\mathcal{C}_n^{+} = \Phi(\mathcal{G}_n^{+}), \qquad \mathcal{C}_n^{-} = \Phi(\mathcal{G}_n^{-}),$$
the two path-connected components of $\mathcal{C}_n$ are just given by $\mathcal{C}_n^{+}$ and $\mathcal{C}_n^{-}$, if $\mathcal{C}_n$ is not path-connected. This completes the proof of Theorem 3.1.

3.3 Proof of Theorem 3.2

In the previous subsection, we have shown that $\mathcal{C}_n^{+}$ and $\mathcal{C}_n^{-}$ are the two path-connected components if $\mathcal{C}_n$ is not connected. In order to prove Theorem 3.2, it suffices to show that, regardless of the path-connectivity of $\mathcal{C}_n$, for any $T \in \mathbb{R}^{n\times n}$ with $\det T < 0$, the mapping $\mathcal{T}_T$ restricted to $\mathcal{C}_n^{+}$ gives a diffeomorphism from $\mathcal{C}_n^{+}$ to $\mathcal{C}_n^{-}$. Since $\mathcal{T}_T$ is a diffeomorphism from $\mathcal{C}_n$ to itself with inverse $\mathcal{T}_{T^{-1}}$, and $\mathcal{C}_n^{+}$ and $\mathcal{C}_n^{-}$ are two open subsets of $\mathcal{C}_n$, to complete the proof we only need to show that $\mathcal{T}_T(\mathcal{C}_n^{+}) \subseteq \mathcal{C}_n^{-}$ and $\mathcal{T}_{T^{-1}}(\mathcal{C}_n^{-}) \subseteq \mathcal{C}_n^{+}$ when $\det T < 0$.

Consider an arbitrary point $K = \begin{bmatrix} 0 & C_K \\ B_K & A_K \end{bmatrix} \in \mathcal{C}_n^{+}$. By the definition of $\mathcal{C}_n^{+}$, there exists $Z = (X, Y, M, G, H, F, \Pi, \Xi) \in \mathcal{G}_n^{+}$ such that $\Phi(Z) = K$. Now let
$$\hat\Pi = T\Pi, \qquad \hat\Xi = \Xi T^{-1}, \qquad \hat Z = (X, Y, M, G, H, F, \hat\Pi, \hat\Xi).$$
It is not difficult to verify that $\hat Z \in \mathcal{G}_n$. Since $\det\hat\Pi = \det T \cdot \det\Pi < 0$, we have $\hat Z \in \mathcal{G}_n^{-}$.
Then,
$$\Phi(\hat Z) = \begin{bmatrix} \Phi_D(\hat Z) & \Phi_C(\hat Z) \\ \Phi_B(\hat Z) & \Phi_A(\hat Z) \end{bmatrix} = \begin{bmatrix} I & 0 \\ YB & \hat\Xi \end{bmatrix}^{-1} \begin{bmatrix} G & F \\ H & M - YAX \end{bmatrix} \begin{bmatrix} I & CX \\ 0 & \hat\Pi \end{bmatrix}^{-1} = \begin{bmatrix} I & 0 \\ 0 & T \end{bmatrix} \begin{bmatrix} I & 0 \\ YB & \Xi \end{bmatrix}^{-1} \begin{bmatrix} G & F \\ H & M - YAX \end{bmatrix} \begin{bmatrix} I & CX \\ 0 & \Pi \end{bmatrix}^{-1} \begin{bmatrix} I & 0 \\ 0 & T^{-1} \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & T \end{bmatrix} \begin{bmatrix} 0 & C_K \\ B_K & A_K \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & T^{-1} \end{bmatrix} = \begin{bmatrix} 0 & C_KT^{-1} \\ TB_K & TA_KT^{-1} \end{bmatrix} = \mathcal{T}_T(K),$$
which implies that $\mathcal{T}_T(K) \in \Phi(\mathcal{G}_n^{-}) = \mathcal{C}_n^{-}$ and consequently $\mathcal{T}_T(\mathcal{C}_n^{+}) \subseteq \mathcal{C}_n^{-}$. The proof of $\mathcal{T}_{T^{-1}}(\mathcal{C}_n^{-}) \subseteq \mathcal{C}_n^{+}$ is similar, by noting that $\det T^{-1} < 0$ if and only if $\det T < 0$. $\square$

We first show that the non-emptiness of $\mathcal{C}_{n-1}$ implies the path-connectivity of $\mathcal{C}_n$. Indeed, suppose there exists $\tilde K = \begin{bmatrix} 0 & \tilde C_K \\ \tilde B_K & \tilde A_K \end{bmatrix} \in \mathcal{C}_{n-1}$. Then it can be augmented to a full-order controller in $\mathcal{C}_n$ by
$$K = \begin{bmatrix} 0 & \tilde C_K & 0 \\ \tilde B_K & \tilde A_K & 0 \\ 0 & 0 & -1 \end{bmatrix}.$$
Now define the similarity transformation matrix
$$T = \begin{bmatrix} I_{n-1} & 0 \\ 0 & -1 \end{bmatrix}.$$
By the proof of Theorem 3.2, we can see that $K \in \mathcal{C}_n^{\pm}$ implies $\mathcal{T}_T(K) \in \mathcal{C}_n^{\mp}$. On the other hand, we can directly check that $\mathcal{T}_T(K) = K$. Therefore we have $K \in \mathcal{C}_n^{+} \cap \mathcal{C}_n^{-}$, indicating that $\mathcal{C}_n^{+} \cap \mathcal{C}_n^{-}$ is nonempty. Consequently, $\mathcal{C}_n$ is path-connected.

We then carry out the analysis for the case where the plant is single-input or single-output. The goal is to find a reduced-order controller in $\mathcal{C}_{n-1}$ when $\mathcal{C}_n$ is connected. Here we only prove the single-output case; the single-input case can be proved similarly, i.e., using the observability matrix or the duality between controllability and observability.

Let $T$ be any real $n\times n$ matrix with $\det T < 0$. Let $K^{(0)} \in \mathcal{C}_n$ be arbitrary, and let $K^{(1)} = \mathcal{T}_T(K^{(0)})$. If $\mathcal{C}_n$ is path-connected, then there exists a continuous path
$$K(t) = \begin{bmatrix} 0 & C_K(t) \\ B_K(t) & A_K(t) \end{bmatrix}, \qquad t \in [0, 1],$$
in $\mathcal{C}_n$ such that $K(0) = K^{(0)}$ and $K(1) = K^{(1)}$. Now for each $t \in [0, 1]$, let $\mathcal{C}(t)$ denote the controllability matrix of $(A_K(t), B_K(t))$, i.e.,
$$\mathcal{C}(t) = \begin{bmatrix} B_K(t) & A_K(t)B_K(t) & \cdots & A_K(t)^{n-1}B_K(t) \end{bmatrix} \in \mathbb{R}^{n\times n},$$
where $\mathcal{C}(t)$ is square because the plant is single-output (i.e., the controller is single-input). We then have $\mathcal{C}(1) = T\,\mathcal{C}(0)$, and thus $\det\mathcal{C}(1) \cdot \det\mathcal{C}(0) < 0$. On the other hand, $\det\mathcal{C}(t)$ is a continuous function of $t \in [0, 1]$. Therefore $\det\mathcal{C}(\tau) = 0$ for some $\tau \in (0, 1)$, implying that $(A_K(\tau), B_K(\tau))$ is not controllable. This indicates that the transfer function $C_K(\tau)(sI_n - A_K(\tau))^{-1}B_K(\tau)$ can be realized by a state-space representation of dimension at most $n - 1$ (see Appendix A.1), and consequently $\mathcal{C}_{n-1} \neq \varnothing$. $\square$

We have shown that the set of stabilizing controllers $\mathcal{C}_n$ might be disconnected, and that the potential disconnectedness does no harm to gradient-based local search algorithms.
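As a numerical sanity check of this discussion, the two components of the set (16) in Example 2, and the component-swapping similarity transformation with $T < 0$, can be verified by brute force (a sketch; numpy assumed, with samples too close to the boundary of (16) skipped to avoid floating-point ambiguity):

```python
import numpy as np

def stable(M):
    return np.max(np.linalg.eigvals(M).real) < 0

A, B, C = 1.0, 1.0, 1.0   # open-loop unstable scalar plant of Example 2

def in_C1(CK, BK, AK):
    # Condition (16): A_K < -1 and B_K C_K < A_K.
    return AK < -1 and BK * CK < AK

rng = np.random.default_rng(1)
for _ in range(500):
    CK, BK, AK = rng.uniform(-4, 4, size=3)
    if abs(AK + 1) < 1e-3 or abs(BK * CK - AK) < 1e-3:
        continue  # skip samples near the boundary of (16)
    Acl = np.array([[A, B * CK], [BK * C, AK]])
    assert stable(Acl) == in_C1(CK, BK, AK)

# The components C_1^+ (B_K > 0) and C_1^- (B_K < 0) are exchanged by the
# similarity transformation with T < 0: (C_K, B_K, A_K) -> (C_K/T, T B_K, A_K).
CK, BK, AK = -3.0, 1.0, -2.0      # a point in C_1^+
T = -2.0
assert in_C1(CK, BK, AK) and BK > 0
assert in_C1(CK / T, T * BK, AK) and T * BK < 0
```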
In this section, we proceed to characterize the stationary points of the cost function in the LQG problem (2), which is another important factor in establishing the convergence of gradient-based algorithms. Section 4.1 discusses the invariance of the LQG cost $J_q$ under similarity transformations and its implications. Section 4.2 shows how to compute the gradient and the Hessian of the LQG cost $J_q$. In Section 4.3, some results on non-minimal stationary points are provided. We characterize the minimal stationary points of LQG over $\mathcal{C}_n$ in Section 4.4. Finally, in Section 4.5, we discuss the second-order behavior of $J_n(K)$ around its minimal stationary points.

4.1 Invariance of the LQG Cost under Similarity Transformations

As shown in Lemma 3.2, the similarity transformation $\mathcal{T}_q(T, \cdot)$ is a diffeomorphism from $\mathcal{C}_q$ to itself for any invertible matrix $T \in \mathrm{GL}_q$. Together with (15), we can see that the set of similarity transformations is a group that is isomorphic to $\mathrm{GL}_q$. We can therefore define the orbit of $K \in \mathcal{C}_q$ by
$$\mathcal{O}_K := \{ \mathcal{T}_q(T, K) \mid T \in \mathrm{GL}_q \}.$$
It is known that the LQG cost is invariant under similarity transformations, and thus is constant over the orbit $\mathcal{O}_K$ for any $K \in \mathcal{C}_q$.

Lemma 4.1.
Let $q \geq 1$ be such that $\mathcal{C}_q \neq \varnothing$. Then we have $J_q(K) = J_q(\mathcal{T}_q(T, K))$ for any $K \in \mathcal{C}_q$ and any invertible matrix $T \in \mathrm{GL}_q$.

Proof. Given any $K \in \mathcal{C}_q$ and any invertible $T \in \mathbb{R}^{q\times q}$, we know that $\mathcal{T}_q(T, K) \in \mathcal{C}_q$. Thus, the Lyapunov equation (12a) admits a unique positive semidefinite solution for each of $K$ and $\mathcal{T}_q(T, K)$ (see Lemma A.1). Suppose that the solution of (12a) for $K$ is $X_K$. Then, it is not difficult to verify that the unique solution of (12a) for $\mathcal{T}_q(T, K)$ is
$$\begin{bmatrix} I & 0 \\ 0 & T \end{bmatrix} X_K \begin{bmatrix} I & 0 \\ 0 & T \end{bmatrix}^{\top}.$$
Therefore, we have
$$J_q(\mathcal{T}_q(T, K)) = \mathrm{tr}\left( \begin{bmatrix} Q & 0 \\ 0 & (C_KT^{-1})^{\top}RC_KT^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & T \end{bmatrix} X_K \begin{bmatrix} I & 0 \\ 0 & T \end{bmatrix}^{\top} \right) = \mathrm{tr}\left( \begin{bmatrix} Q & 0 \\ 0 & C_K^{\top}RC_K \end{bmatrix} X_K \right) = J_q(K),$$
where the second identity applies the trace property $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ for $A, B$ with compatible dimensions. $\square$

The following proposition shows that every orbit $\mathcal{O}_K$ corresponding to a controllable and observable controller is of dimension $q^2$ and has two path-connected components. The proof is given in Appendix B.6.

Proposition 4.1.
Suppose $K \in \mathcal{C}_q$ represents a controllable and observable controller. Then the orbit $\mathcal{O}_K$ is a submanifold of $\mathcal{C}_q$ of dimension $q^2$, and it has two path-connected components, given by
$$\mathcal{O}_K^{+} = \{ \mathcal{T}_q(T, K) \mid T \in \mathrm{GL}_q,\ \det T > 0 \}, \qquad \mathcal{O}_K^{-} = \{ \mathcal{T}_q(T, K) \mid T \in \mathrm{GL}_q,\ \det T < 0 \}.$$

From Lemma 4.1 and Proposition 4.1, one interesting consequence is that, given a globally optimal LQG controller $K^* \in \mathcal{C}_n$, any controller in the following orbit is globally optimal:
$$\mathcal{O}_{K^*} := \{ \mathcal{T}_n(T, K^*) \mid T \in \mathrm{GL}_n \}.$$

Figure 2: Non-isolated and disconnected globally optimal LQG controllers. In both cases, we set $Q = 1$, $R = 1$, $V = 1$, $W = 1$. (a) LQG cost for the open-loop unstable SISO system in Example 2 when fixing $A_K = -1 - 2\sqrt{2}$, for which the set of globally optimal points $\{(B_K, C_K) \mid B_K = (1+\sqrt{2})T,\ C_K = -(1+\sqrt{2})/T,\ T \neq 0\}$ has two connected components. (b) LQG cost for the open-loop stable SISO system in Example 3 when fixing $A_K = 1 - 2\sqrt{2}$, for which the set of globally optimal points $\{(B_K, C_K) \mid B_K = (\sqrt{2}-1)T,\ C_K = (1-\sqrt{2})/T,\ T \neq 0\}$ has two connected components.

If $K^*$ is minimal (i.e., controllable and observable), the orbit $\mathcal{O}_{K^*}$ is a submanifold in $\mathcal{V}_n$ of dimension $n^2$, and it has two path-connected components. Figure 2 demonstrates the orbit of globally optimal LQG controllers for an open-loop unstable system and an open-loop stable system, showing that the set of globally optimal LQG controllers is non-isolated and disconnected in $\mathcal{C}_n$.

Proposition 4.1 guarantees that for any controllable and observable $K \in \mathcal{C}_q$, the orbit $\mathcal{O}_K$ is a submanifold of dimension $q^2$ in $\mathcal{C}_q$, which allows us to define the tangent space of $\mathcal{O}_K$. For each minimal $K \in \mathcal{C}_q$, we use $T\mathcal{O}_K$ to denote the tangent space of $\mathcal{O}_K$ at $K$, and treat it as a subspace of $\mathcal{V}_q$; recall that $\mathcal{V}_q$ is defined by (10). The dimension of $T\mathcal{O}_K$ is then $\dim T\mathcal{O}_K = \dim \mathcal{O}_K = q^2$. We denote the orthogonal complement of $T\mathcal{O}_K$ in $\mathcal{V}_q$ by $T\mathcal{O}_K^{\perp}$. The following proposition characterizes the tangent space $T\mathcal{O}_K$ and its orthogonal complement $T\mathcal{O}_K^{\perp}$ at a minimal controller $K \in \mathcal{C}_q$.

Proposition 4.2.
Let $K \in \mathcal{C}_q$ represent a controllable and observable controller. Then
$$T\mathcal{O}_K = \left\{ \begin{bmatrix} 0 & -C_KH \\ HB_K & HA_K - A_KH \end{bmatrix} \,\middle|\, H \in \mathbb{R}^{q\times q} \right\}, \qquad T\mathcal{O}_K^{\perp} = \left\{ \Delta = \begin{bmatrix} 0 & \Delta_{C_K} \\ \Delta_{B_K} & \Delta_{A_K} \end{bmatrix} \in \mathcal{V}_q \,\middle|\, \Delta_{A_K}A_K^{\top} - A_K^{\top}\Delta_{A_K} + \Delta_{B_K}B_K^{\top} - C_K^{\top}\Delta_{C_K} = 0 \right\}.$$

Proof.
(See Appendix A.3 for the definition of tangent spaces. A visualization of a manifold $\mathcal{M}$ and its tangent space $T_x\mathcal{M}$ at a point $x \in \mathcal{M}$ is provided in Figure 3.)

Figure 3: A graphical illustration of a manifold $\mathcal{M}$ and its tangent space $T_x\mathcal{M}$ at some point $x \in \mathcal{M}$. Here $\gamma(t)$ is an arbitrary $C^\infty$ curve in $\mathcal{M}$ that passes through $x$, and $v$ is the tangent vector of $\gamma(t)$ at $x$. The tangent space $T_x\mathcal{M}$ consists of all such vectors $v$.

Let $H \in \mathbb{R}^{q\times q}$ be arbitrary. Then for sufficiently small $\epsilon$, we have
$$\mathcal{T}_q(I + \epsilon H, K) = \begin{bmatrix} 0 & C_K(I + \epsilon H)^{-1} \\ (I + \epsilon H)B_K & (I + \epsilon H)A_K(I + \epsilon H)^{-1} \end{bmatrix} = K + \epsilon \begin{bmatrix} 0 & -C_KH \\ HB_K & HA_K - A_KH \end{bmatrix} + o(\epsilon),$$
implying that the tangent map of $\mathcal{T}_q(\cdot, K)$ at the identity is given by
$$H \mapsto \begin{bmatrix} 0 & -C_KH \\ HB_K & HA_K - A_KH \end{bmatrix}.$$
Then, since $\mathcal{T}_q(\cdot, K)$ is a diffeomorphism from $\mathrm{GL}_q$ to $\mathcal{O}_K$, the tangent map of $\mathcal{T}_q(\cdot, K)$ at the identity is an isomorphism from $\mathbb{R}^{q\times q}$ (the tangent space of $\mathrm{GL}_q$ at the identity) to the tangent space $T\mathcal{O}_K$. Thus
$$T\mathcal{O}_K = \left\{ \begin{bmatrix} 0 & -C_KH \\ HB_K & HA_K - A_KH \end{bmatrix} \,\middle|\, H \in \mathbb{R}^{q\times q} \right\}.$$
Then the orthogonal complement $T\mathcal{O}_K^{\perp}$ is given by
$$\begin{aligned} T\mathcal{O}_K^{\perp} &= \left\{ \Delta \in \mathcal{V}_q \,\middle|\, \mathrm{tr}(U^{\top}\Delta) = 0 \text{ for all } U \in T\mathcal{O}_K \right\} \\ &= \left\{ \Delta = \begin{bmatrix} 0 & \Delta_{C_K} \\ \Delta_{B_K} & \Delta_{A_K} \end{bmatrix} \in \mathcal{V}_q \,\middle|\, \mathrm{tr}\left( \begin{bmatrix} 0 & -C_KH \\ HB_K & HA_K - A_KH \end{bmatrix}^{\top} \Delta \right) = 0,\ \forall H \in \mathbb{R}^{q\times q} \right\} \\ &= \left\{ \Delta = \begin{bmatrix} 0 & \Delta_{C_K} \\ \Delta_{B_K} & \Delta_{A_K} \end{bmatrix} \in \mathcal{V}_q \,\middle|\, \mathrm{tr}\left( H^{\top}\left( \Delta_{A_K}A_K^{\top} - A_K^{\top}\Delta_{A_K} + \Delta_{B_K}B_K^{\top} - C_K^{\top}\Delta_{C_K} \right) \right) = 0,\ \forall H \in \mathbb{R}^{q\times q} \right\} \\ &= \left\{ \Delta = \begin{bmatrix} 0 & \Delta_{C_K} \\ \Delta_{B_K} & \Delta_{A_K} \end{bmatrix} \in \mathcal{V}_q \,\middle|\, \Delta_{A_K}A_K^{\top} - A_K^{\top}\Delta_{A_K} + \Delta_{B_K}B_K^{\top} - C_K^{\top}\Delta_{C_K} = 0 \right\}. \end{aligned}$$
This completes the proof. $\square$

We conclude this subsection by noting that the LQG cost function $J_q(K)$ is not coercive, in the sense that there might exist sequences of stabilizing controllers $K_j \in \mathcal{C}_q$ with $\lim_{j\to\infty} K_j = \hat K \in \partial\mathcal{C}_q$ such that $\lim_{j\to\infty} J_q(K_j) < \infty$, as well as sequences of stabilizing controllers $K_j \in \mathcal{C}_q$ with $\lim_{j\to\infty} \|K_j\|_F = \infty$ such that $\lim_{j\to\infty} J_q(K_j) < \infty$. The latter fact is easy to see from Proposition 4.1, since the orbit $\mathcal{O}_K$ can be unbounded and $J_q(K)$ is constant over the orbit. The following example shows that the LQG cost might converge to a finite value even when the controller $K$ goes to the boundary of $\mathcal{C}_q$.

Example 4 (Non-coercivity of the LQG cost). Consider the open-loop stable SISO system in Example 3, and fix $Q = 1$, $R = 1$, $V = 1$, $W = 1$ in the LQG formulation. The set of full-order stabilizing controllers $\mathcal{C}_1$ is given in (17). We consider the stabilizing controller
$$K_\epsilon = \begin{bmatrix} 0 & \epsilon \\ \epsilon & -\epsilon \end{bmatrix} \in \mathcal{C}_1, \qquad \epsilon \in (0, 1).$$
It is not hard to see that $\lim_{\epsilon\to 0} K_\epsilon \in \partial\mathcal{C}_1$. By solving the Lyapunov equation (12a), we get the unique solution
$$X_{K_\epsilon} = \begin{bmatrix} \dfrac{1+\epsilon^3}{2(1-\epsilon^2)} & \dfrac{\epsilon}{2(1-\epsilon)} \\[4pt] \dfrac{\epsilon}{2(1-\epsilon)} & \dfrac{\epsilon(2-\epsilon)}{2(1-\epsilon)} \end{bmatrix},$$
and the corresponding LQG cost
$$J_1(K_\epsilon) = \frac{1 + 3\epsilon^3 + \epsilon^4 - \epsilon^5}{2(1-\epsilon^2)}.$$
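This closed form can be cross-checked numerically against the Lyapunov-equation definition of the cost (a sketch; numpy and scipy assumed):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def lqg_cost(eps):
    # Plant of Example 3 (A = -1, B = C = 1) with Q = R = V = W = 1 and the
    # controller K_eps given by C_K = B_K = eps, A_K = -eps.
    Acl = np.array([[-1.0, eps], [eps, -eps]])   # [[A, B*C_K], [B_K*C, A_K]]
    Wcl = np.diag([1.0, eps**2])                 # diag(W, B_K V B_K^T)
    Qcl = np.diag([1.0, eps**2])                 # diag(Q, C_K^T R C_K)
    X = solve_continuous_lyapunov(Acl, -Wcl)     # solves Acl X + X Acl^T = -Wcl
    return np.trace(Qcl @ X)

for eps in (0.5, 0.1, 0.01):
    closed_form = (1 + 3 * eps**3 + eps**4 - eps**5) / (2 * (1 - eps**2))
    assert abs(lqg_cost(eps) - closed_form) < 1e-10

# Non-coercivity: as eps -> 0, K_eps tends to the boundary of C_1 while the
# cost stays bounded and approaches the open-loop cost 1/2.
assert abs(lqg_cost(1e-4) - 0.5) < 1e-3
```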
Therefore, we have $\lim_{\epsilon\to 0} J_1(K_\epsilon) = 1/2$, while $\lim_{\epsilon\to 0} K_\epsilon \in \partial\mathcal{C}_1$.

The following lemma gives a closed-form expression for the gradient of the LQG cost function $J_q$; its proof is given in Appendix B.4.

Lemma 4.2 (Gradient of the LQG cost $J_q$). Fix $q \geq 1$ such that $\mathcal{C}_q \neq \varnothing$. For every $K = \begin{bmatrix} 0 & C_K \\ B_K & A_K \end{bmatrix} \in \mathcal{C}_q$, the gradient of $J_q(K)$ is given by
$$\nabla J_q(K) = \begin{bmatrix} 0 & \dfrac{\partial J_q(K)}{\partial C_K} \\[4pt] \dfrac{\partial J_q(K)}{\partial B_K} & \dfrac{\partial J_q(K)}{\partial A_K} \end{bmatrix},$$
with
$$\frac{\partial J_q(K)}{\partial A_K} = 2\left( Y_{12}^{\top}X_{12} + Y_{22}X_{22} \right), \quad (24a)$$
$$\frac{\partial J_q(K)}{\partial B_K} = 2\left( Y_{22}B_KV + Y_{22}X_{12}^{\top}C^{\top} + Y_{12}^{\top}X_{11}C^{\top} \right), \quad (24b)$$
$$\frac{\partial J_q(K)}{\partial C_K} = 2\left( RC_KX_{22} + B^{\top}Y_{11}X_{12} + B^{\top}Y_{12}X_{22} \right), \quad (24c)$$
where $X_K$ and $Y_K$, partitioned as
$$X_K = \begin{bmatrix} X_{11} & X_{12} \\ X_{12}^{\top} & X_{22} \end{bmatrix}, \qquad Y_K = \begin{bmatrix} Y_{11} & Y_{12} \\ Y_{12}^{\top} & Y_{22} \end{bmatrix}, \quad (25)$$
are the unique positive semidefinite solutions to (12a) and (12b), respectively.

We next consider the Hessian of $J_q(K)$. Let $K$ be any controller in $\mathcal{C}_q$, and let $\mathrm{Hess}_K : \mathcal{V}_q \times \mathcal{V}_q \to \mathbb{R}$ denote the bilinear form of the Hessian of $J_q$ at $K$, so that for any $\Delta \in \mathcal{V}_q$, we have
$$J_q(K + \Delta) = J_q(K) + \mathrm{tr}\left( \nabla J_q(K)^{\top}\Delta \right) + \frac{1}{2}\mathrm{Hess}_K(\Delta, \Delta) + o\left( \|\Delta\|_F^2 \right), \qquad \|\Delta\|_F \to 0.$$
Obviously, $\mathrm{Hess}_K$ is symmetric in the sense that $\mathrm{Hess}_K(x, y) = \mathrm{Hess}_K(y, x)$ for all $x, y \in \mathcal{V}_q$. The following lemma shows how to compute $\mathrm{Hess}_K(\Delta, \Delta)$ for any $\Delta \in \mathcal{V}_q$ by solving three Lyapunov equations; its proof is given in Appendix B.4.

Lemma 4.3.
Fix $q \geq 1$ such that $\mathcal{C}_q \neq \varnothing$, and let $K = \begin{bmatrix} 0 & C_K \\ B_K & A_K \end{bmatrix} \in \mathcal{C}_q$. Then for any $\Delta = \begin{bmatrix} 0 & \Delta_{C_K} \\ \Delta_{B_K} & \Delta_{A_K} \end{bmatrix} \in \mathcal{V}_q$, we have
$$\mathrm{Hess}_K(\Delta, \Delta) = 2\,\mathrm{tr}\left( 2\begin{bmatrix} 0 & B\Delta_{C_K} \\ \Delta_{B_K}C & \Delta_{A_K} \end{bmatrix} X'_{K,\Delta}\, Y_K + 2\begin{bmatrix} 0 & 0 \\ 0 & C_K^{\top}R\Delta_{C_K} \end{bmatrix} X'_{K,\Delta} + \begin{bmatrix} 0 & 0 \\ 0 & \Delta_{B_K}V\Delta_{B_K}^{\top} \end{bmatrix} Y_K + \begin{bmatrix} 0 & 0 \\ 0 & \Delta_{C_K}^{\top}R\Delta_{C_K} \end{bmatrix} X_K \right),$$
where $X_K$ and $Y_K$ are the solutions to the Lyapunov equations (12a) and (12b), and $X'_{K,\Delta} \in \mathbb{R}^{(n+q)\times(n+q)}$ is the solution to the Lyapunov equation
$$\begin{bmatrix} A & BC_K \\ B_KC & A_K \end{bmatrix} X'_{K,\Delta} + X'_{K,\Delta} \begin{bmatrix} A & BC_K \\ B_KC & A_K \end{bmatrix}^{\top} + M(X_K, \Delta) = 0, \quad (26)$$
with
$$M(X_K, \Delta) := \begin{bmatrix} 0 & B\Delta_{C_K} \\ \Delta_{B_K}C & \Delta_{A_K} \end{bmatrix} X_K + X_K \begin{bmatrix} 0 & B\Delta_{C_K} \\ \Delta_{B_K}C & \Delta_{A_K} \end{bmatrix}^{\top} + \begin{bmatrix} 0 & 0 \\ 0 & B_KV\Delta_{B_K}^{\top} + \Delta_{B_K}VB_K^{\top} \end{bmatrix}.$$

From Lemma 4.3, one can further compute
$\mathrm{Hess}_K(\Delta_1, \Delta_2)$ for any $\Delta_1, \Delta_2 \in \mathcal{V}_q$ via the polarization identity
$$\mathrm{Hess}_K(\Delta_1, \Delta_2) = \frac{1}{4}\left( \mathrm{Hess}_K(\Delta_1 + \Delta_2, \Delta_1 + \Delta_2) - \mathrm{Hess}_K(\Delta_1 - \Delta_2, \Delta_1 - \Delta_2) \right) = \frac{1}{2}\left( \mathrm{Hess}_K(\Delta_1 + \Delta_2, \Delta_1 + \Delta_2) - \mathrm{Hess}_K(\Delta_1, \Delta_1) - \mathrm{Hess}_K(\Delta_2, \Delta_2) \right).$$

In this part, we show that the LQG cost $J_n(K)$ over the set of full-order stabilizing controllers $\mathcal{C}_n$ may have many non-minimal stationary points, which might be strict saddle points. We first investigate the gradient of $J_q(K)$ under similarity transformations. Given any $T \in \mathrm{GL}_q$, recall the definition of the similarity transformation $\mathcal{T}_q(T, K)$ in (14). The following lemma gives an explicit relationship between the gradients of $J_q(\cdot)$ at $K$ and at $\mathcal{T}_q(T, K)$.

Lemma 4.4.
Let $K = \begin{bmatrix} 0 & C_K \\ B_K & A_K \end{bmatrix} \in \mathcal{C}_q$ be arbitrary. For any $T \in \mathrm{GL}_q$, we have
$$\nabla J_q \big|_{\mathcal{T}_q(T,K)} = \begin{bmatrix} I_m & 0 \\ 0 & T^{-\top} \end{bmatrix} \cdot \nabla J_q \big|_{K} \cdot \begin{bmatrix} I_p & 0 \\ 0 & T^{\top} \end{bmatrix}. \quad (27)$$

Proof. Let $\Delta \in \mathcal{V}_q$ be arbitrary. We have
$$\begin{aligned} J_q(\mathcal{T}_q(T, K + \Delta)) - J_q(\mathcal{T}_q(T, K)) &= J_q(\mathcal{T}_q(T, K) + \mathcal{T}_q(T, \Delta)) - J_q(\mathcal{T}_q(T, K)) \\ &= \mathrm{tr}\left[ \left( \nabla J_q \big|_{\mathcal{T}_q(T,K)} \right)^{\top} \mathcal{T}_q(T, \Delta) \right] + o(\|\Delta\|) \\ &= \mathrm{tr}\left[ \left( \nabla J_q \big|_{\mathcal{T}_q(T,K)} \right)^{\top} \begin{bmatrix} I_m & 0 \\ 0 & T \end{bmatrix} \Delta \begin{bmatrix} I_p & 0 \\ 0 & T^{-1} \end{bmatrix} \right] + o(\|\Delta\|) \\ &= \mathrm{tr}\left[ \left( \begin{bmatrix} I_m & 0 \\ 0 & T \end{bmatrix}^{\top} \nabla J_q \big|_{\mathcal{T}_q(T,K)} \begin{bmatrix} I_p & 0 \\ 0 & T^{-1} \end{bmatrix}^{\top} \right)^{\top} \Delta \right] + o(\|\Delta\|). \end{aligned}$$
On the other hand, Lemma 4.1 shows that the LQG cost is invariant under similarity transformations. Thus, we have
$$J_q(\mathcal{T}_q(T, K + \Delta)) - J_q(\mathcal{T}_q(T, K)) = J_q(K + \Delta) - J_q(K) = \mathrm{tr}\left[ \left( \nabla J_q \big|_{K} \right)^{\top} \Delta \right] + o(\|\Delta\|).$$
By comparing the two equations, we get
$$\nabla J_q \big|_{K} = \begin{bmatrix} I_m & 0 \\ 0 & T \end{bmatrix}^{\top} \nabla J_q \big|_{\mathcal{T}_q(T,K)} \begin{bmatrix} I_p & 0 \\ 0 & T^{-1} \end{bmatrix}^{\top},$$
which then leads to the relationship (27). $\square$

As expected, a direct consequence of Lemma 4.4 is that, if $K \in \mathcal{C}_q$ is a stationary point of $J_q$, then any controller in the orbit $\mathcal{O}_K$ is also a stationary point of $J_q$. In addition, Lemma 4.4 allows us to establish an interesting result: any stationary point of $J_q$ can be lifted to stationary points of $J_{q+q'}$ for any $q' > 0$ with the same objective value.

Theorem 4.1.
Let $q \geq 1$ be arbitrary. Suppose there exists $K^\star = \begin{bmatrix} 0 & C_K^\star \\ B_K^\star & A_K^\star \end{bmatrix} \in \mathcal{C}_q$ such that $\nabla J_q(K^\star) = 0$. Then for any $q' \geq 1$ and any stable $\Lambda \in \mathbb{R}^{q'\times q'}$, the controller
$$\tilde K^\star = \begin{bmatrix} 0 & C_K^\star & 0 \\ B_K^\star & A_K^\star & 0 \\ 0 & 0 & \Lambda \end{bmatrix} \in \mathcal{C}_{q+q'} \quad (28)$$
is a stationary point of $J_{q+q'}$ over $\mathcal{C}_{q+q'}$ satisfying $J_{q+q'}(\tilde K^\star) = J_q(K^\star)$.

Proof. Since $K^\star \in \mathcal{C}_q$, we have $\tilde K^\star \in \mathcal{C}_{q+q'}$ by construction. It is straightforward to verify that $\mathcal{T}_{q+q'}(T, \tilde K^\star) = \tilde K^\star$ with
$$T = \begin{bmatrix} I_q & 0 \\ 0 & -I_{q'} \end{bmatrix}.$$
Therefore, by Lemma 4.4, we have
$$\nabla J_{q+q'} \big|_{\tilde K^\star} = \nabla J_{q+q'} \big|_{\mathcal{T}_{q+q'}(T, \tilde K^\star)} = \begin{bmatrix} I_{m+q} & 0 \\ 0 & -I_{q'} \end{bmatrix} \cdot \nabla J_{q+q'} \big|_{\tilde K^\star} \cdot \begin{bmatrix} I_{p+q} & 0 \\ 0 & -I_{q'} \end{bmatrix},$$
which implies that, except for the bottom-right $q'\times q'$ block, the last $q'$ rows and the last $q'$ columns of $\nabla J_{q+q'} \big|_{\tilde K^\star}$ are zero. On the other hand, it can be checked that
$$J_{q+q'}\left( \begin{bmatrix} K & 0 \\ 0 & \Lambda \end{bmatrix} \right) = J_q(K), \qquad \forall K \in \mathcal{C}_q,$$
and since $\nabla J_q(K^\star) = 0$, we can see that the upper-left $(m+q)\times(p+q)$ block of $\nabla J_{q+q'} \big|_{\tilde K^\star}$ is equal to zero. Then, from Lemma 2.2, it is not difficult to verify that the value $J_{q+q'}(\tilde K^\star)$ is independent of the $q'\times q'$ stable matrix $\Lambda$, and thus the bottom-right $q'\times q'$ block of $\nabla J_{q+q'} \big|_{\tilde K^\star}$ is also zero. We can now see that $\nabla J_{q+q'} \big|_{\tilde K^\star} = 0$. This completes the proof. $\square$

Theorem 4.1 indicates that from any stationary point of $J_q$ over the lower-order stabilizing controllers in $\mathcal{C}_q$, we can construct a family of stationary points of $J_{q+q'}$ over the higher-order stabilizing controllers in $\mathcal{C}_{q+q'}$. Moreover, the stationary points constructed by (28) are neither controllable nor observable. This indicates that, if the globally optimal controller of $J_n$ is controllable and observable, and if the problem $\min_{K\in\mathcal{C}_q} J_q(K)$ has a solution for some $q < n$, then there exist many strictly suboptimal stationary points of $J_n$ over $\mathcal{C}_n$.

The following theorem explicitly constructs a family of stationary points of $J_n$ for an open-loop stable plant, and also provides a criterion for checking whether the corresponding Hessian is indefinite or vanishing.

Theorem 4.2.
Suppose the plant (1) is open-loop stable. Let $\Lambda \in \mathbb{R}^{n\times n}$ be stable, and let
$$K^\star = \begin{bmatrix} 0 & 0 \\ 0 & \Lambda \end{bmatrix}.$$
Then $K^\star$ is a stationary point of $J_n(K)$ over $K \in \mathcal{C}_n$, and the corresponding Hessian $\mathrm{Hess}_{K^\star}$ is either indefinite or zero. Furthermore, suppose $\Lambda$ is diagonalizable, and let $\mathrm{eig}(-\Lambda)$ denote the set of (distinct) eigenvalues of $-\Lambda$. Let $X_{\mathrm{op}}$ and $Y_{\mathrm{op}}$ be the solutions to the Lyapunov equations
$$AX_{\mathrm{op}} + X_{\mathrm{op}}A^{\top} + W = 0, \qquad A^{\top}Y_{\mathrm{op}} + Y_{\mathrm{op}}A + Q = 0, \quad (29)$$
and let
$$\mathcal{Z} = \left\{ s \in \mathbb{C} \,\middle|\, CX_{\mathrm{op}}\left( sI - A^{\top} \right)^{-1} Y_{\mathrm{op}}B = 0 \right\}. \quad (30)$$
Then, the Hessian of $J_n$ at $K^\star$ is indefinite if and only if $\mathrm{eig}(-\Lambda) \not\subseteq \mathcal{Z}$; the Hessian of $J_n$ at $K^\star$ is zero if and only if $\mathrm{eig}(-\Lambda) \subseteq \mathcal{Z}$.

The fact that $K^\star = \begin{bmatrix} 0 & 0 \\ 0 & \Lambda \end{bmatrix}$ is a stationary point can be proved similarly as in Theorem 4.1. Regarding the properties of the Hessian, we exploit its bilinearity and use Lemma 4.3 for a direct calculation. In particular, the Lyapunov equations (12a) and (12b) reduce to (29), and the transfer function in (30) appears when we solve the third Lyapunov equation (26). The detailed proof is provided in Appendix B.7.

Theorem 4.2 constructs a family of non-minimal strict saddle points, or stationary points with vanishing Hessian, for LQG with open-loop stable systems. We now present two explicit examples illustrating the Hessian of $J_q(K)$ at non-minimal stationary points.

Example 5 (Strict saddle point). Consider the open-loop stable SISO system in Example 3. We choose $Q = R = 1$, $W = V = 1$ for the LQG formulation. By Theorem 4.2, given any $a < 0$, the controller
$$K^\star = \begin{bmatrix} 0 & 0 \\ 0 & a \end{bmatrix} \in \mathbb{R}^{2\times 2}$$
is a stationary point of $J_1(K)$ over the set of full-order stabilizing controllers $\mathcal{C}_1$. Furthermore, it can be checked that
$$CX_{\mathrm{op}}\left( sI - A^{\top} \right)^{-1} Y_{\mathrm{op}}B = \frac{1}{4(s+1)},$$
which never vanishes, so $\mathcal{Z} = \varnothing$. Therefore the Hessian of $J_1$ at $K^\star$ is indefinite by Theorem 4.2, indicating that $K^\star$ is a strict saddle point [41].
Indeed, using (11), we can directly compute the LQG cost and obtain
$$J_1\left( \begin{bmatrix} 0 & C_K \\ B_K & A_K \end{bmatrix} \right) = \frac{A_K^2 - A_K\left( 1 + B_K^2C_K^2 \right) - B_KC_K\left( 1 - 3B_KC_K + B_K^2C_K^2 \right)}{2(A_K - 1)(A_K + B_KC_K)}.$$
The Hessian at $K^\star$, in the coordinates $(A_K, B_K, C_K)$, can then be represented as
$$\begin{bmatrix} \dfrac{\partial^2 J_1}{\partial A_K^2} & \dfrac{\partial^2 J_1}{\partial A_K\partial B_K} & \dfrac{\partial^2 J_1}{\partial A_K\partial C_K} \\[4pt] \dfrac{\partial^2 J_1}{\partial B_K\partial A_K} & \dfrac{\partial^2 J_1}{\partial B_K^2} & \dfrac{\partial^2 J_1}{\partial B_K\partial C_K} \\[4pt] \dfrac{\partial^2 J_1}{\partial C_K\partial A_K} & \dfrac{\partial^2 J_1}{\partial C_K\partial B_K} & \dfrac{\partial^2 J_1}{\partial C_K^2} \end{bmatrix}_{K^\star} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & \dfrac{1}{2(1-a)} \\ 0 & \dfrac{1}{2(1-a)} & 0 \end{bmatrix},$$
which has eigenvalues $0$ and $\pm\dfrac{1}{2(1-a)}$.

Example 6 (Stationary point with vanishing Hessian). Consider a second-order SISO system (1) with $V = 1$ and $R = 1$, whose data $(A, B, C, W, Q)$ are chosen such that
$$CX_{\mathrm{op}}\left( sI - A^{\top} \right)^{-1} Y_{\mathrm{op}}B = \frac{5(s-1)}{(s+1)(s+2)},$$
so that $\mathcal{Z} = \{1\}$. By Theorem 4.2, the point
$$K^\star = \begin{bmatrix} 0 & 0 \\ 0 & \Lambda \end{bmatrix}, \qquad \Lambda = -I_2,$$
is a stationary point of $J_n$ with a vanishing Hessian, since $\mathrm{eig}(-\Lambda) = \{1\} \subseteq \mathcal{Z}$. In Figure 4, we plot the graph of the function $t \mapsto J_n(K^\star + t\Delta)$ for a suitable direction $\Delta \in \mathcal{V}_2$. Figure 4 suggests that $K^\star$ is a saddle point of $J_n$ with a vanishing Hessian but non-vanishing third-order partial derivatives.

Remark. Some recent studies have shown that many gradient-based algorithms can automatically escape strict saddle points under mild conditions [41, 42]. However, Example 6 shows that the LQG cost function $J(K)$ may have non-strict saddle points, and further analysis is required to examine whether gradient-based methods can also escape such stationary points.

Figure 4: The function $t \mapsto J_n(K^\star + t\Delta)$ for Example 6.

As discussed in Theorems 4.1 and 4.2, there may exist many non-minimal stationary points of $J_n$ that are not globally optimal.
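Using the closed-form cost of Example 5, the stationarity of $K^\star$ and the indefiniteness of its Hessian can also be confirmed by finite differences (a sketch; numpy assumed, and the step size $h$ is an arbitrary choice):

```python
import numpy as np

def J(ck, bk, ak):
    # Closed-form LQG cost from Example 5 (plant A = -1, B = C = 1,
    # with Q = R = 1 and W = V = 1).
    s = bk * ck
    num = ak**2 - ak * (1 + s**2) - s * (1 - 3 * s + s**2)
    return num / (2 * (ak - 1) * (ak + s))

a = -1.0                       # any a < 0 gives the stationary point [0 0; 0 a]
x0 = np.array([0.0, 0.0, a])   # coordinates (C_K, B_K, A_K)
h = 1e-5

def grad(x):
    g = np.zeros(3)
    for i in range(3):
        e = np.zeros(3); e[i] = h
        g[i] = (J(*(x + e)) - J(*(x - e))) / (2 * h)
    return g

def hess(x):
    H = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            ei = np.zeros(3); ei[i] = h
            ej = np.zeros(3); ej[j] = h
            H[i, j] = (J(*(x + ei + ej)) - J(*(x + ei - ej))
                       - J(*(x - ei + ej)) + J(*(x - ei - ej))) / (4 * h**2)
    return H

assert np.linalg.norm(grad(x0)) < 1e-8     # K* is a stationary point
eigs = np.linalg.eigvalsh(hess(x0))
assert eigs[0] < -0.1 and eigs[-1] > 0.1   # indefinite Hessian: strict saddle
```

For $a = -1$ the nonzero Hessian eigenvalues are $\pm 1/4$, matching $\pm 1/(2(1-a))$.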
In this section, we aim to show that all minimal stationary points are globally optimal for the LQG problem (2). Recall that $K = \begin{bmatrix} 0 & C_K \\ B_K & A_K \end{bmatrix} \in \mathcal{C}_q$ is minimal if it represents a controllable and observable controller. The gradient computation in Lemma 4.2 works for both minimal and non-minimal stabilizing controllers in $\mathcal{C}_q$. For a minimal stabilizing controller $K$, we further have the following result (see Appendix B.5 for a proof).

Lemma 4.5.
Fix $q \in \mathbb{N}$ such that $\mathcal{C}_q \neq \varnothing$, and let $K \in \mathcal{C}_q$ be minimal. Under Assumption 1, the solutions $X_K$ and $Y_K$ to (12a) and (12b) are positive definite.

By setting the gradient (24) equal to zero, i.e.,
$$\frac{\partial J_n(K)}{\partial A_K} = 0, \qquad \frac{\partial J_n(K)}{\partial B_K} = 0, \qquad \frac{\partial J_n(K)}{\partial C_K} = 0, \quad (31)$$
we can characterize the stationary points of the LQG problem (13). In particular, we have closed-form expressions for the full-order minimal stationary points $K \in \mathcal{C}_n$, which turn out to be globally optimal. This result is formally summarized below.

Theorem 4.3.
Under Assumption 1, all minimal stationary points $K \in \mathcal{C}_n$ of the LQG problem (13) are globally optimal, and they are of the form
$$A_K = T(A - BK - LC)T^{-1}, \qquad B_K = -TL, \qquad C_K = KT^{-1}, \quad (32)$$
where $T \in \mathbb{R}^{n\times n}$ is an invertible matrix, and
$$K = R^{-1}B^{\top}S, \qquad L = PC^{\top}V^{-1}, \quad (33)$$
with $P$ and $S$ being the unique positive definite solutions to the Riccati equations (5a) and (5b).

Theorem 4.3 can be viewed as a special case of [1, Theorem 20.6] and [43, Section II], which present first-order necessary conditions for optimal reduced-order controllers $K \in \mathcal{C}_q$. Following the analysis in [1, Chapter 20], we present an adapted proof of Theorem 4.3 here.

Proof. Consider a stationary point $K = \begin{bmatrix} 0 & C_K \\ B_K & A_K \end{bmatrix} \in \mathcal{C}_n$ at which the gradient (24) vanishes. If the controller $K$ is minimal, we know by Lemma 4.5 that the solutions $X_K$ and $Y_K$ to (12a) and (12b) are unique and positive definite. Upon partitioning $X_K$ and $Y_K$ as in (25), by the Schur complement, the following matrices are well-defined and positive definite:
$$P := X_{11} - X_{12}X_{22}^{-1}X_{12}^{\top} \succ 0, \qquad S := Y_{11} - Y_{12}Y_{22}^{-1}Y_{12}^{\top} \succ 0. \quad (34)$$
We further define $T := Y_{22}^{-1}Y_{12}^{\top}$. By (24a), we know that the matrix $T$ is invertible, with $T^{-1} = -X_{12}X_{22}^{-1}$.

Now, setting $\partial J_n(K)/\partial B_K = 0$ in (24b), we have
$$B_K = -\left( X_{12}^{\top} + Y_{22}^{-1}Y_{12}^{\top}X_{11} \right) C^{\top}V^{-1} = -\left( X_{12}^{\top} + TX_{11} \right) C^{\top}V^{-1} = -T\left( X_{11} - X_{12}X_{22}^{-1}X_{12}^{\top} \right) C^{\top}V^{-1} = -TPC^{\top}V^{-1}. \quad (35)$$
Similarly, from (24c), we have
$$C_K = -R^{-1}B^{\top}\left( Y_{11}X_{12}X_{22}^{-1} + Y_{12} \right) = R^{-1}B^{\top}ST^{-1}. \quad (36)$$
Furthermore, since $X_K$ is the solution to the Lyapunov equation (12a), plugging in the blocks of $X_K$ gives
$$AX_{11} + X_{11}A^{\top} + BC_KX_{12}^{\top} + X_{12}C_K^{\top}B^{\top} + W = 0, \quad (37a)$$
$$AX_{12} + BC_KX_{22} + X_{11}C^{\top}B_K^{\top} + X_{12}A_K^{\top} = 0, \quad (37b)$$
$$A_KX_{22} + X_{22}A_K^{\top} + B_KCX_{12} + X_{12}^{\top}C^{\top}B_K^{\top} + B_KVB_K^{\top} = 0. \quad (37c)$$
Now, computing (37c) $+\ T \times$ (37b) leads to
$$A_KX_{22} + X_{22}A_K^{\top} + B_KCX_{12} + X_{12}^{\top}C^{\top}B_K^{\top} + B_KVB_K^{\top} + T\left( AX_{12} + BC_KX_{22} + X_{11}C^{\top}B_K^{\top} + X_{12}A_K^{\top} \right) = 0,$$
which, after substituting (35) and (36), is the same as
$$A_KX_{22} + X_{22}A_K^{\top} - TPC^{\top}V^{-1}CX_{12} - X_{12}^{\top}C^{\top}V^{-1}CPT^{\top} + TPC^{\top}V^{-1}CPT^{\top} + T\left( AX_{12} + BR^{-1}B^{\top}ST^{-1}X_{22} - X_{11}C^{\top}V^{-1}CPT^{\top} + X_{12}A_K^{\top} \right) = 0.$$
By the definition of T, we have TX_{12} = -X_2, so the terms X_2A_K^T and TX_{12}A_K^T cancel, and the equation above becomes
$$A_KX_2 - TPC^{\mathsf T}V^{-1}CX_{12} - X_{12}^{\mathsf T}C^{\mathsf T}V^{-1}CPT^{\mathsf T} + TPC^{\mathsf T}V^{-1}CPT^{\mathsf T} + T\left(AX_{12} + BR^{-1}B^{\mathsf T}ST^{-1}X_2 - X_1C^{\mathsf T}V^{-1}CPT^{\mathsf T}\right) = 0,$$
leading to
$$\begin{aligned} A_K &= TPC^{\mathsf T}V^{-1}CX_{12}X_2^{-1} + X_{12}^{\mathsf T}C^{\mathsf T}V^{-1}CPT^{\mathsf T}X_2^{-1} - TPC^{\mathsf T}V^{-1}CPT^{\mathsf T}X_2^{-1} \\ &\quad - T\left(AX_{12} + BR^{-1}B^{\mathsf T}ST^{-1}X_2 - X_1C^{\mathsf T}V^{-1}CPT^{\mathsf T}\right)X_2^{-1} \\ &= T\left(A - PC^{\mathsf T}V^{-1}C - BR^{-1}B^{\mathsf T}S\right)T^{-1}. \end{aligned} \qquad (38)$$

From (35), (36) and (38), upon defining K and L as in (33), it is easy to see that the stationary points are of the form (32). It remains to prove that P and S defined in (34) are the unique positive definite solutions to the Riccati equations (5a) and (5b).

We multiply (37c) by T^{-1} on the left and by T^{-T} on the right; noting that B_K = -TPC^T V^{-1} and T^{-1} = -X_{12}X_2^{-1}, we get
$$X_{12}X_2^{-1}A_KX_{12}^{\mathsf T} + X_{12}A_K^{\mathsf T}X_2^{-1}X_{12}^{\mathsf T} + PC^{\mathsf T}V^{-1}CX_{12}X_2^{-1}X_{12}^{\mathsf T} + X_{12}X_2^{-1}X_{12}^{\mathsf T}C^{\mathsf T}V^{-1}CP + PC^{\mathsf T}V^{-1}CP = 0.$$
Since P = X_1 - X_{12}X_2^{-1}X_{12}^T, we further get
$$X_{12}X_2^{-1}A_KX_{12}^{\mathsf T} + X_{12}A_K^{\mathsf T}X_2^{-1}X_{12}^{\mathsf T} + PC^{\mathsf T}V^{-1}CX_1 + X_1C^{\mathsf T}V^{-1}CP - PC^{\mathsf T}V^{-1}CP = 0. \qquad (39)$$
Next, we multiply (37b) by -T^{-T} = X_2^{-1}X_{12}^T on the right and get
$$AX_{12}X_2^{-1}X_{12}^{\mathsf T} + BC_KX_{12}^{\mathsf T} + X_1C^{\mathsf T}V^{-1}CP + X_{12}A_K^{\mathsf T}X_2^{-1}X_{12}^{\mathsf T} = 0.$$
Plugging this equality (together with its transpose) into (39), we get
$$-AX_{12}X_2^{-1}X_{12}^{\mathsf T} - BC_KX_{12}^{\mathsf T} - X_{12}X_2^{-1}X_{12}^{\mathsf T}A^{\mathsf T} - X_{12}C_K^{\mathsf T}B^{\mathsf T} - PC^{\mathsf T}V^{-1}CP = 0.$$
Then, plugging the above equality into (37a), we get
$$A\left(X_1 - X_{12}X_2^{-1}X_{12}^{\mathsf T}\right) + \left(X_1 - X_{12}X_2^{-1}X_{12}^{\mathsf T}\right)A^{\mathsf T} - PC^{\mathsf T}V^{-1}CP + W = 0,$$
and since P = X_1 - X_{12}X_2^{-1}X_{12}^T, we see that P satisfies the Riccati equation (5a). Through similar steps, we can derive from (12b) that S satisfies the Riccati equation (5b).

Finally, from classical control theory [1, Theorem 14.7], a globally optimal controller for the LQG problem (13) is given by (6), and any similarity transformation leads to another equivalent controller with the same LQG cost. Therefore, any minimal stationary point, which is of the form (32), is globally optimal. □

The results in Theorem 4.3 indicate that if the LQG problem (13) has a globally optimal solution in C_n that is also minimal, then the globally optimal controller is unique in C_n after taking a quotient with respect to similarity transformations. This is expected from the classical result that the globally optimal LQG controller is unique in the frequency domain [1, Theorem 14.7].

We note that minimality of the stationary points is required in the proof of Theorem 4.3, as it guarantees that the matrices in (34) are well-defined and that the solutions (35) and (36) are unique. Theorem 4.3 allows us to establish the following corollaries.

Corollary 4.1.
The following statements are true:

1) If J_n(K) has a minimal stationary point in C_n, then all of its non-minimal stationary points K ∈ C_n are strictly suboptimal.

2) If J_n(K) has a non-minimal stationary point in C_n that is globally optimal, then all stationary points K ∈ C_n of J_n(K) are non-minimal.

We have already seen LQG instances with non-minimal stationary points that are strictly suboptimal in Example 5 and Example 6. It should be noted that, even with Assumption 1, the LQG problem (13) might have no minimal stationary points, i.e., all solutions K of (31) may be non-minimal; this happens when the controller obtained from the Riccati equations (5) is not minimal.

Example 7 (Non-minimal globally optimal controllers). Here we give an example from [44] whose optimal LQG controller does not have a minimal realization in C_n. Consider the linear system (1) with

A = [ − 11 0 ], B = [ ], C = [ − ], W = [ − − ], V = 1,

and let the LQG cost be defined by Q = [ ], R = 1. This LQG problem satisfies Assumption 1. The positive definite solutions to the Riccati equations (5) are given by

P = [ ], S = [ ],

and the globally optimal controller is given by

A_K = [ − − ], B_K = L = [ − ], C_K = −K = [ − ]. (40)

It is not hard to see that (C_K, A_K) is not observable. Therefore, the controller obtained from the Riccati equations is not minimal in this example. Consequently, by Corollary 4.1, all stationary points of J_n are non-minimal for this example.

In this case, the globally optimal controllers in C_n are not all connected by similarity transformations. For example, it can be verified that the following two non-minimal controllers are both globally optimal:

K_1 = [ − − − − ], K_2 = [ − − − ],

but there exists no similarity transformation between K_1 and K_2, since [ − − ] and [ − − ] have different sets of eigenvalues (recall that a similarity transformation does not change eigenvalues).

If a sequence of gradient iterates converges to a point, Theorem 4.3 also allows us to check whether the limit point is a globally optimal solution to the LQG problem.

Corollary 4.2.
Consider a gradient descent algorithm K_{t+1} = K_t − α_t ∇J_n(K_t) for the LQG problem (13), where α_t is a step size. Suppose the iterates K_t converge to a point K*, i.e., lim_{t→∞} K_t = K*. If K* is a controllable and observable controller, then it is globally optimal.

Remark. Corollary 4.2 proposes checking the controllability and observability of K* to verify global optimality when the gradient descent iterates converge to K*. In practice, the limit K* cannot be computed directly, and one tentative approach for checking its controllability (observability) is to check whether the smallest singular value of the controllability (observability) matrix of the last iterate K_T is sufficiently bounded away from zero. A rigorous justification of this approach will be of interest for future work.

Remark. Note that Corollary 4.2 does not discuss under what conditions the gradient descent iterates converge. The results in [45] guarantee that if the cost function is analytic over the whole Euclidean space, then gradient descent with step sizes satisfying the Wolfe conditions will either converge to a stationary point or diverge to infinity. In our case, however, the cost function J_n(K) is only analytic over a subset C_n ⊂ V_n. Furthermore, J_n(K) is not coercive, as shown in Example 4. Whether gradient descent with properly chosen step sizes converges to a stationary point of J_n(K) requires further investigation.

J_n(K) at Minimal Stationary Points

Finally, we turn to characterizing the second-order behavior of J_n around a globally optimal controller K*. Throughout this subsection, we will assume that K* is controllable and observable. We focus on the eigenvalues and eigenspaces of the Hessian Hess_{K*}. The null space of Hess_{K*} is
$$\operatorname{null}\operatorname{Hess}_{K^*} = \{\, x \in \mathcal{V}_n \mid \operatorname{Hess}_{K^*}(x, y) = 0, \ \forall y \in \mathcal{V}_n \,\}.$$
The following lemma shows that the tangent space
T O_{K*} is a subspace of the null space of Hess_{K*}; this is a direct corollary of [23, Theorem 2].

Lemma 4.6. Suppose K* is controllable and observable. Then T O_{K*} ⊆ null Hess_{K*}.

This lemma can be viewed as a local version of Lemma 4.1, indicating the invariance of J_n along the orbit O_K. Consequently, the dimension of the null space of Hess_{K*} is at least n². On the other hand, we also have the following result.

Lemma 4.7.
Suppose K* is controllable and observable, and let Δ ∈ T O_{K*}^⊥ with Δ ≠ 0. Then for all sufficiently small t > 0, J_n(K* + tΔ) − J_n(K*) > 0.

Proof.
We prove by contradiction. Suppose that for every sufficiently small δ > 0 there exists t ∈ (0, δ) such that J_n(K* + tΔ) = J_n(K*). Then we can find a positive sequence (t_j)_{j≥1} such that t_j → 0 and J_n(K* + t_jΔ) = J_n(K*). Denote K_j = K* + t_jΔ. Since Δ is orthogonal to T O_{K*}, there must exist some j ≥ 1 such that K_j ∉ O_{K*}. By [1, Theorem 3.17], the transfer function of K_j is then different from the transfer function of K*. Then, by the uniqueness of the transfer-function solution to the LQG problem, K_j cannot be a global minimum of J_n, contradicting J_n(K_j) = J_n(K*). □

Combining the observations from Lemmas 4.6 and 4.7, we can see that, while the Hessian Hess_{K*} is degenerate and its null space contains the nontrivial subspace T O_{K*}, the degeneracy associated with T O_{K*} does not cause much trouble for optimizing J_n: the directions in T O_{K*} correspond to similarity transformations that lead to other globally optimal controllers, while along the directions orthogonal to T O_{K*} the optimal controller of J_n is locally unique.

We are therefore interested in the behavior of Hess_{K*} restricted to the subspace T O_{K*}^⊥. Specifically, we let rcond_{K*} denote the reciprocal condition number of Hess_{K*} restricted to the subspace T O_{K*}^⊥, i.e.,
$$\operatorname{rcond}_{K^*} := \frac{\min_{\Delta \perp T\mathcal{O}_{K^*}} \operatorname{Hess}_{K^*}(\Delta, \Delta)/\|\Delta\|_F^2}{\max_{\Delta \perp T\mathcal{O}_{K^*}} \operatorname{Hess}_{K^*}(\Delta, \Delta)/\|\Delta\|_F^2}. \qquad (41)$$
Intuitively, if rcond_{K*} is bounded away from zero, then we can expect gradient-based methods to achieve good local convergence behavior for optimizing J_n. However, we give an explicit example below showing that rcond_{K*} can be arbitrarily bad even if the original plant seems entirely normal.

Example 8. Let ε > 0 be arbitrary, and let

A = 32[ − − − ε ], B = [ 1 ; 1 + ε ], C = [ ], and Q = [ ], W = [ ε ε ε ], V = R = 1.

For this plant, the positive definite solutions to the Riccati equations (5) are given by P = [ ε ], S = [ ε ], and we have

K = R^{-1}B^T S = [ ], L = PC^T V^{-1} = [ 1 ; 1 + ε ].

The optimal controller K* is then given by
$$\mathsf{K}^* = \begin{bmatrix} 0 & -K \\ L & A - BK - LC \end{bmatrix} = \left[\; − \;\; − \;\; − \;\; − \;\; 2\,1 + ε\; −\, ε )\; −\,(1 + ε )\;\right].$$
It can be checked that the optimal controller provided by the Riccati equations is controllable and observable when ε ≠ 0. In Theorem 4.4 below, we provide an asymptotic upper bound on the reciprocal condition number rcond_{K*}. We also provide numerical results on Hess_{K*} for a range of ε values in Figure 5. It can be seen that the upper bound (42c) on rcond_{K*} is on the order of O(ε), indicating that rcond_{K*} degrades rapidly as ε approaches zero. Moreover, it can be numerically checked via Lemma 4.2 that, even for moderately small ε, the reciprocal condition number rcond_{K*} is already very small. On the other hand, for the same ε, the resulting plant's parameters, as well as its controllability and observability matrices [B AB] and [C ; CA], seem entirely normal.

Theorem 4.4.
Consider the LQG problem in Example 8, and let ε > 0 be arbitrary. Let

Δ_1 = [ − / / 2 0 1 / − / ], Δ_2 = [ − / − / / / ].

Then, as ε → 0, we have
$$\operatorname{Hess}_{K^*}(\Delta_1, \Delta_1) = \frac{d^2 J(K^* + t\Delta_1)}{dt^2}\bigg|_{t=0} = 37000\,\varepsilon + o(\varepsilon), \qquad \operatorname{Hess}_{K^*}(\Delta_2, \Delta_2) = \frac{d^2 J(K^* + t\Delta_2)}{dt^2}\bigg|_{t=0} = 680343 + o(1),$$
and ‖Proj_{T O_{K*}}[Δ_1]‖_F = O(ε).

Figure 5: Numerical results on the behavior of Hess_{K*} in Example 8. (a) The minimum eigenvalue of Hess_{K*} restricted to T O_{K*}^⊥, the value of Hess_{K*}(Δ_1, Δ_1), and the asymptotic upper bound given by (42a). (b) The maximum eigenvalue of Hess_{K*} restricted to T O_{K*}^⊥, the value of Hess_{K*}(Δ_2, Δ_2), and the asymptotic lower bound given by (42b). (c) The reciprocal condition number rcond_{K*} and its asymptotic upper bound given by (42c).

Consequently, as ε → 0,
$$\min_{\Delta \perp T\mathcal{O}_{K^*}} \frac{\operatorname{Hess}_{K^*}(\Delta, \Delta)}{\|\Delta\|_F^2} \le \frac{\operatorname{Hess}_{K^*}(\Delta_1, \Delta_1)}{\left(\|\Delta_1\|_F - \|\operatorname{Proj}_{T\mathcal{O}_{K^*}}[\Delta_1]\|_F\right)^2} = 37000\,\varepsilon + o(\varepsilon), \qquad (42a)$$
$$\max_{\Delta \perp T\mathcal{O}_{K^*}} \frac{\operatorname{Hess}_{K^*}(\Delta, \Delta)}{\|\Delta\|_F^2} \ge \frac{\operatorname{Hess}_{K^*}(\Delta_2, \Delta_2)}{\|\Delta_2\|_F^2} = 680343 + o(1), \qquad (42b)$$
and the reciprocal condition number of Hess_{K*} restricted to T O_{K*}^⊥ can be upper bounded by
$$\operatorname{rcond}_{K^*} \le \frac{37000}{680343}\,\varepsilon + o(\varepsilon) \approx 5.4 \times 10^{-2} \cdot \varepsilon + o(\varepsilon). \qquad (42c)$$

The proof of Theorem 4.4 is based on a direct but tedious calculation of the Hessian via Lemma 4.3; the details are provided in Appendix B.8. The observations in Example 8 suggest that, if we apply the vanilla gradient descent algorithm to the optimization problem (13), it may take a large number of iterations for the iterates to converge to a globally optimal controller, even for certain LQG problems that appear entirely normal.

Remark. Due to the symmetry induced by similarity transformations, the landscape of LQG shares some similarities with the landscapes of non-convex machine learning problems with rotational symmetries, such as phase retrieval and matrix factorization [21, 23, 26]. For example, the stationary points of these non-convex problems are non-isolated, and the tangent space of the orbit associated with the symmetry group is a subspace of the null space of the Hessian (see Lemma 4.6).
On the other hand, for phase retrieval [21] and matrix factorization [23], the classification of all stationary points, as well as their local curvatures (Hessians), seems to be relatively well understood, while many open questions remain regarding the stationary points of LQG, such as the existence of local optimizers that are not globally optimal, and whether all non-globally-optimal stationary points have the form (28) up to similarity transformations. Finally, in addition to the apparent algebraic complications of LQG and control-theoretic notions such as minimal controllers, the non-compactness of the group of similarity transformations may also render the landscape of LQG distinct from that of non-convex machine learning problems with rotational symmetries.
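As a quick numerical illustration of this similarity-transformation symmetry, the following Python sketch evaluates the LQG cost J_n(K) = tr(diag(Q, C_K^T R C_K) X_K), where X_K solves the closed-loop Lyapunov equation, and checks that the cost is constant along the orbit O_K. All numerical values below (plant, controller, and transformation T) are illustrative placeholders, not taken from the paper's examples.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def lqg_cost(A, B, C, Ak, Bk, Ck, Q, R, W, V):
    """LQG cost J = tr(Qcl X), where X solves Acl X + X Acl^T + Wcl = 0,
    with Acl the closed-loop matrix, Wcl = diag(W, Bk V Bk^T),
    and Qcl = diag(Q, Ck^T R Ck)."""
    n, q = A.shape[0], Ak.shape[0]
    Acl = np.block([[A, B @ Ck], [Bk @ C, Ak]])
    Wcl = np.block([[W, np.zeros((n, q))], [np.zeros((q, n)), Bk @ V @ Bk.T]])
    Qcl = np.block([[Q, np.zeros((n, q))], [np.zeros((q, n)), Ck.T @ R @ Ck]])
    # scipy solves a x + x a^T = rhs, so pass rhs = -Wcl.
    X = solve_continuous_lyapunov(Acl, -Wcl)
    return np.trace(Qcl @ X)

# Illustrative stable plant and stabilizing dynamic controller.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q, R = np.eye(2), np.eye(1)
W, V = np.eye(2), np.eye(1)
Ak = np.array([[-4.0, 1.0], [-3.0, -5.0]])
Bk = np.array([[0.1], [0.05]])
Ck = np.array([[-0.05, -0.02]])

T = np.array([[2.0, 1.0], [0.5, 3.0]])      # any invertible similarity transformation
Ti = np.linalg.inv(T)
J1 = lqg_cost(A, B, C, Ak, Bk, Ck, Q, R, W, V)
J2 = lqg_cost(A, B, C, T @ Ak @ Ti, T @ Bk, Ck @ Ti, Q, R, W, V)
print(abs(J1 - J2) < 1e-8 * abs(J1))        # the cost is orbit-invariant
```

This invariance is exactly the source of the Hessian degeneracy along T O_{K*} described in Lemma 4.6: moving along the orbit changes the state-space realization but not the cost.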
Numerical experiments
We have illustrated our main technical results on the connectivity of stabilizing controllers and on stationary points through Examples 1–8. Here, we present some numerical experiments to demonstrate the empirical performance of gradient descent algorithms for solving the LQG problem (13). The scripts for all experiments can be downloaded from https://github.com/zhengy09/LQG_gradient .

A vanilla gradient descent algorithm for solving (13) is as follows. Given an initial stabilizing controller K_0 ∈ C_n, we update the controller for t = 0, 1, 2, ...:
$$A_{K,t+1} = A_{K,t} - s_t \frac{\partial J(\mathsf K)}{\partial A_K}\bigg|_{\mathsf K_t}, \qquad B_{K,t+1} = B_{K,t} - s_t \frac{\partial J(\mathsf K)}{\partial B_K}\bigg|_{\mathsf K_t}, \qquad C_{K,t+1} = C_{K,t} - s_t \frac{\partial J(\mathsf K)}{\partial C_K}\bigg|_{\mathsf K_t}, \qquad (43)$$
where the gradient is obtained using (24), until the gradient satisfies ‖∇J(K_t)‖_F ≤ ε or the iteration count reaches the maximum number t_max. In our simulations, the step size s_t in (43) is determined by the Armijo rule [46, Chapter 1.3]: set s_t = 1 and repeat s_t = βs_t until
$$J(\mathsf K_t) - J(\mathsf K_{t+1}) \ge \alpha s_t \|\nabla J(\mathsf K_t)\|_F^2,$$
where α ∈ (0, 1) and β ∈ (0, 1) are fixed constants.

For numerical comparison, we can also reduce the number of controller parameters by considering a controller canonical form. In particular, for any SISO controller, the controllable canonical form of K is
$$A_K = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -b_0 & -b_1 & -b_2 & \cdots & -b_{n-1} \end{bmatrix}, \qquad B_K = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}, \qquad C_K = \begin{bmatrix} a_0 & a_1 & a_2 & \cdots & a_{n-1} \end{bmatrix}. \qquad (44)$$
We then only update the controller parameters a_i, b_i, i = 0, ..., n−1, using the corresponding partial gradient in (43). It is clear that the set of stabilizing controllable controllers is a subset of C_n, but we note that the connectivity of the set of stabilizing controllable controllers is unclear and cannot be deduced from the results in Section 3. Here, we further remark a few facts [1, Chapter 3]:

• The controller K in (44) is not necessarily minimal, since it may be unobservable. Thus, the parameterization (44) is able to capture some non-minimal globally optimal controllers, e.g., those of the LQG problem in Example 7.

• For any controllable SISO controller K, there is a unique similarity transformation such that T_T(K) is in the form of (44). Conversely, given K in the form of (44), all the controllers in the orbit O_K are controllable.

• By Theorem 4.3, if the LQG problem (13) for SISO systems has a minimal stationary point, then it admits a unique globally optimal controller in the form of (44).

In our experiments, we set a maximum iteration number t_max and a stopping tolerance ε on the gradient norm. To investigate the influence of initial stabilizing controllers on the convergence performance of gradient descent algorithms, we used two different initialization strategies:

1) Random initialization: We used a pole placement method to get an initial stabilizing controller K_0, where the closed-loop poles were chosen randomly from an interval on the negative real axis.

2) Initialization around a globally optimal point: We also considered initialization around the globally optimal controller from the Riccati equations, i.e.,
$$A_{K,0} \sim \mathcal{N}(A_K^\star, \delta I), \qquad B_{K,0} \sim \mathcal{N}(B_K^\star, \delta I), \qquad C_{K,0} \sim \mathcal{N}(C_K^\star, \delta I),$$
where (A_K^⋆, B_K^⋆, C_K^⋆) is the optimal LQG controller (6) from solving the Riccati equations, and δ > 0 is a small constant.

Throughout this section, we denote the vanilla gradient descent algorithm (43) by Vanilla GD A, and call gradient descent over the controllable canonical form (44) Vanilla GD B. We first consider two examples for which
Vanilla GD B has good empirical convergence performance. The first one is the famous Doyle's LQG example from [3]:
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad W = 5\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad V = 1, \qquad (45a)$$
with performance weights
$$Q = 5\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \qquad R = 1. \qquad (45b)$$
The globally optimal LQG controller from the Riccati equations is
$$A_K = \begin{bmatrix} -4 & 1 \\ -10 & -4 \end{bmatrix}, \qquad B_K = \begin{bmatrix} 5 \\ 5 \end{bmatrix}, \qquad C_K = \begin{bmatrix} -5 & -5 \end{bmatrix}, \qquad (46)$$
and its corresponding LQG cost is J⋆ = 750. The system (45) is open-loop unstable, so we chose an initial stabilizing controller using pole placement, where the poles were randomly selected from an interval on the negative real axis in our simulations. The results are shown in Figure 6. For this LQG instance, Vanilla GD B over the controllable canonical form has better convergence performance than Vanilla GD A. In particular, Vanilla GD A did not converge within the maximum number of iterations, and the final iterate of Vanilla GD A has a nonzero gradient. In contrast, for different initial points, Vanilla GD B converged to the following solution (up to two decimal places):
$$A_K = \begin{bmatrix} 0 & 1 \\ -26.00 & -8.00 \end{bmatrix}, \qquad B_K = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad C_K = \begin{bmatrix} 25.00 & -50.00 \end{bmatrix}. \qquad (47)$$
The controller (47) from Vanilla GD B is minimal, and its gradient is close to zero (a stationary point). By Corollary 4.2, it is reasonable to conclude that this controller is globally optimal. Indeed, (47) is identical to (46) via the similarity transformation defined by
$$T = \begin{bmatrix} 25 & 5 \\ -30 & 5 \end{bmatrix}.$$
By Lemma 4.3, we can also compute the Hessian of J(K) at (47), whose minimum eigenvalue is strictly positive when restricted to the subspace T O_{K*}^⊥.

Our second numerical experiment is carried out on the LQG instance in Example 7, for which a globally optimal controller from the Riccati equations is non-minimal, as shown in (40). The initial controllers were randomly chosen by pole placement with poles on the negative real axis. Similar to the first numerical experiment, Vanilla GD A did not converge within the maximum number of iterations, while Vanilla GD B converged to stationary points (the gradient reached the stopping criterion); see Figure 7.

Figure 6: Convergence performance of gradient descent algorithms for Doyle's example in (45) with four different random initializations K_0. (a) Vanilla GD A; (b) Vanilla GD B.

Figure 7: Convergence performance of gradient descent algorithms for Example 7 with four different random initializations K_0. (a) Vanilla GD A; (b) Vanilla GD B.

In this case, the controllers from Vanilla GD B are not minimal, and they have different state-space representations, two of which are

A_{K,1} = [ 0 1 ; − . − . ], B_{K,1} = [ 0 ; 1 ], C_{K,1} = [ − . − . ], (48a)

A_{K,2} = [ 0 1 ; − . − . ], B_{K,2} = [ 0 ; 1 ], C_{K,2} = [ − . − . ]. (48b)

Our theoretical results (Theorem 4.1 and Corollary 4.2) cannot certify whether the controllers (48) from Vanilla GD B are globally optimal. However, after pole-zero cancellation, we can check that the controllers (48) correspond to the same transfer function as (40), a first-order transfer function with a pole at s = −3. Also, we numerically checked that the Hessian of J(K) at the controllers (48) and (40) has minimum eigenvalue zero over the subspace T O_{K*}^⊥.

Figure 8:
Convergence performance of gradient descent algorithms for Example 6 with different initialization strategies. In each subfigure, the left panel shows results using random initialization, and the right panel shows results using initialization around a globally optimal point. (a) Vanilla GD A; (b) Vanilla GD B.

Figure 9: Convergence performance of gradient descent algorithms for Example 8 (with a small fixed ε) with four different initializations K_0. In each subfigure, the left panel shows results using random initialization, and the right panel shows results using initialization around a globally optimal point. (a) Vanilla GD A; (b) Vanilla GD B.

Here, we present two LQG examples for which Vanilla GD B over the controllable canonical form seems to get stuck around some points when using random initialization. We first consider the LQG instance in Example 6, for which we have shown that there exist stationary points with vanishing Hessian (see Figure 4). Note that this LQG problem has a minimal globally optimal controller, so it admits a unique globally optimal controller in the form of (44). However, as shown in Figure 8, with random initialization, Vanilla GD B over the controllable canonical form seems to get stuck around different points; Vanilla GD A does make steady improvement on the LQG cost function, but it still failed to converge within the maximum number of iterations. When using the initialization around a globally optimal point, the convergence performance of both Vanilla GD A and Vanilla GD B improved significantly, and both of them reached the stopping criterion within one hundred iterations. We note that the random initialization actually started from a point with a smaller LQG cost than the other initialization.

Our final numerical experiment is carried out for the LQG instance in Example 8, where we chose a small fixed ε. The results are shown in Figure 9. Both Vanilla GD A and Vanilla GD B failed to converge within the maximum number of iterations, and they seem to get stuck for very many iterations around different points that are not globally optimal. Similar to the previous case, using the initialization around a globally optimal point greatly improved the convergence performance of Vanilla GD A and Vanilla GD B, and both of them reached the stopping criterion within a few hundred iterations.

These two LQG cases show that initialization has a great impact on the performance of gradient algorithms for solving general LQG problems. We also note that, for the LQG cases we tested, gradient descent algorithms can reduce the LQG cost quickly in the early iterations, but might get stuck in some region for many iterations.
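For reference, the Armijo-backtracking gradient descent used in this section can be sketched generically as follows. The objective here is a toy strongly convex quadratic standing in for J_n(K), and the constants α = 0.3, β = 0.5 are placeholder choices, not the values used in the paper's experiments.

```python
import numpy as np

def armijo_gd(f, grad, x0, alpha=0.3, beta=0.5, tol=1e-8, max_iter=1000):
    """Gradient descent with the Armijo rule: start from s = 1 and shrink
    s <- beta * s until f(x) - f(x - s*g) >= alpha * s * ||g||^2."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:       # stopping criterion on the gradient
            break
        s = 1.0
        while f(x) - f(x - s * g) < alpha * s * np.dot(g, g):
            s *= beta
        x = x - s * g
    return x

# Toy objective f(x) = 0.5 x^T H x with a positive definite H.
H = np.array([[3.0, 1.0], [1.0, 2.0]])
f = lambda x: 0.5 * x @ H @ x
grad = lambda x: H @ x
x_final = armijo_gd(f, grad, np.array([5.0, -4.0]))
print(np.linalg.norm(x_final) < 1e-6)  # iterates reach the unique minimizer 0
```

In the LQG setting, f and grad would be replaced by J_n(K) and the gradient (24), with the extra caveat that the backtracking must also reject steps that leave the feasible set C_n, since J_n is only defined over stabilizing controllers.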
In this paper, we have characterized the connectivity of the set of stabilizing controllers C_n and provided some structural properties of the LQG cost function. These results reveal rich yet complicated optimization landscape properties of the LQG problem. Ongoing work includes establishing convergence conditions for gradient descent algorithms and investigating whether local search algorithms can escape saddle points of the LQG problem. We note that the optimization landscape of LQG also depends on the parameterization of dynamical controllers; it will be interesting to look into the LQG problem when parameterizing controllers in a canonical form. Finally, our analysis reveals that minimal stationary points in C_n are always globally optimal, and it would also be interesting to investigate the existence of minimal stationary points for the LQG problem.

References

[1] Kemin Zhou, John C. Doyle, and Keith Glover.
Robust and Optimal Control. Prentice Hall, 1996.

[2] Dimitri P. Bertsekas. Dynamic Programming and Optimal Control, volume 1. Athena Scientific, Belmont, MA, 1995.

[3] John C. Doyle. Guaranteed margins for LQG regulators. IEEE Transactions on Automatic Control, 23(4):756–757, 1978.

[4] Pascal Gahinet and Pierre Apkarian. A linear matrix inequality approach to H∞ control. International Journal of Robust and Nonlinear Control, 4(4):421–448, 1994.

[5] Carsten Scherer, Pascal Gahinet, and Mahmoud Chilali. Multiobjective output-feedback control via LMI optimization. IEEE Transactions on Automatic Control, 42(7):896–911, 1997.

[6] Maryam Fazel, Rong Ge, Sham Kakade, and Mehran Mesbahi. Global convergence of policy gradient methods for the linear quadratic regulator. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 1467–1476. PMLR, 2018.

[7] Dhruv Malik, Ashwin Pananjady, Kush Bhatia, Koulik Khamaru, Peter Bartlett, and Martin Wainwright. Derivative-free methods for policy optimization: Guarantees for linear quadratic systems. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 2916–2925. PMLR, 2019.

[8] Hesameddin Mohammadi, Armin Zare, Mahdi Soltanolkotabi, and Mihailo R. Jovanović. Convergence and sample complexity of gradient methods for the model-free linear quadratic regulator problem. arXiv preprint arXiv:1912.11899, 2019.

[9] Stephen Tu and Benjamin Recht. The gap between model-based and model-free methods on the linear quadratic regulator: An asymptotic viewpoint. In Conference on Learning Theory, pages 3036–3083, 2019.

[10] Yingying Li, Yujie Tang, Runyu Zhang, and Na Li. Distributed reinforcement learning for decentralized linear quadratic control: A derivative-free policy optimization approach. arXiv preprint arXiv:1912.09135, 2019.

[11] Jack Umenberger, Mina Ferizbegovic, Thomas B. Schön, and Håkan Hjalmarsson. Robust exploration in linear quadratic reinforcement learning. In Advances in Neural Information Processing Systems, pages 15336–15346, 2019.

[12] Kaiqing Zhang, Bin Hu, and Tamer Basar. Policy optimization for H2 linear control with H∞ robustness guarantee: Implicit regularization and global convergence. arXiv preprint arXiv:1910.09496, 2019.

[13] Jingjing Bu, Afshin Mesbahi, and Mehran Mesbahi. On topological and metrical properties of stabilizing feedback gains: the MIMO case. arXiv preprint arXiv:1904.02737, 2019.

[14] Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, and Stephen Tu. On the sample complexity of the linear quadratic regulator. Foundations of Computational Mathematics, pages 1–47, 2019.

[15] Feicheng Wang and Lucas Janson. Exact asymptotics for linear quadratic adaptive control. arXiv preprint arXiv:2011.01364, 2020.

[16] Stephen Tu, Ross Boczar, Andrew Packard, and Benjamin Recht. Non-asymptotic analysis of robust control from coarse-grained identification. arXiv preprint arXiv:1707.04791, 2017.

[17] Ross Boczar, Nikolai Matni, and Benjamin Recht. Finite-data performance guarantees for the output-feedback control of an unknown system. In , pages 2994–2999. IEEE, 2018.

[18] Yang Zheng, Luca Furieri, Maryam Kamgarpour, and Na Li. Sample complexity of linear quadratic Gaussian (LQG) control for output feedback systems. arXiv preprint arXiv:2011.09929, 2020.

[19] Max Simchowitz, Karan Singh, and Elad Hazan. Improper learning for non-stochastic control. arXiv preprint arXiv:2001.09254, 2020.

[20] Han Feng and Javad Lavaei. Connectivity properties of the set of stabilizing static decentralized controllers. SIAM Journal on Control and Optimization, 58(5):2790–2820, 2020.

[21] Ju Sun, Qing Qu, and John Wright. A geometric analysis of phase retrieval. Foundations of Computational Mathematics, 18(5):1131–1198, 2018.

[22] Yuejie Chi, Yue M. Lu, and Yuxin Chen. Nonconvex optimization meets low-rank matrix factorization: An overview. IEEE Transactions on Signal Processing, 67(20):5239–5269, 2019.

[23] Xingguo Li, Junwei Lu, Raman Arora, Jarvis Haupt, Han Liu, Zhaoran Wang, and Tuo Zhao. Symmetry, saddle points, and global optimization landscape of nonconvex matrix factorization. IEEE Transactions on Information Theory, 65(6):3489–3514, 2019.

[24] Qing Qu, Yuexiang Zhai, Xiao Li, Yuqian Zhang, and Zhihui Zhu. Analysis of the optimization landscapes for overcomplete representation learning. arXiv preprint arXiv:1912.02427, 2019.

[25] Rong Ge and Tengyu Ma. On the optimization landscape of tensor decompositions. In Advances in Neural Information Processing Systems, pages 3653–3663, 2017.

[26] Yuqian Zhang, Qing Qu, and John Wright. From symmetry to geometry: Tractable nonconvex problems. arXiv preprint arXiv:2007.06753, 2020.

[27] Stephen Tu and Benjamin Recht. The gap between model-based and model-free methods on the linear quadratic regulator: An asymptotic viewpoint. In Conference on Learning Theory, pages 3036–3083, 2019.

[28] Benjamin Recht. A tour of reinforcement learning: The view from continuous control. Annual Review of Control, Robotics, and Autonomous Systems, 2:253–279, 2019.

[29] Luca Furieri, Yang Zheng, and Maryam Kamgarpour. Learning the globally optimal distributed LQ regulator. In Learning for Dynamics and Control, pages 287–297, 2020.

[30] Ilyas Fatkhullin and Boris Polyak. Optimizing static linear feedback: Gradient method. arXiv preprint arXiv:2004.09875, 2020.

[31] Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, and Anima Anandkumar. Logarithmic regret bound in partially observable linear dynamical systems. arXiv preprint arXiv:2003.11227, 2020.

[32] Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, and Anima Anandkumar. Regret bound of adaptive control in linear quadratic Gaussian (LQG) systems. arXiv preprint arXiv:2003.05999, 2020.

[33] Samet Oymak and Necmiye Ozay. Non-asymptotic identification of LTI systems from a single trajectory. In , pages 5655–5661. IEEE, 2019.

[34] Yang Zheng and Na Li. Non-asymptotic identification of linear dynamical systems using multiple trajectories. arXiv preprint arXiv:2009.00739, 2020.

[35] Dante Youla, Hamid Jabr, and Joseph Bongiorno, Jr. Modern Wiener–Hopf design of optimal controllers–Part II: The multivariable case. IEEE Transactions on Automatic Control, 21(3):319–338, 1976.

[36] Yuh-Shyang Wang, Nikolai Matni, and John C. Doyle. A system-level approach to controller synthesis. IEEE Transactions on Automatic Control, 64(10):4079–4093, 2019.

[37] Luca Furieri, Yang Zheng, Antonis Papachristodoulou, and Maryam Kamgarpour. An input–output parametrization of stabilizing controllers: amidst Youla and system level synthesis. IEEE Control Systems Letters, 3(4):1014–1019, 2019.

[38] Y. Zheng, L. Furieri, A. Papachristodoulou, N. Li, and M. Kamgarpour. On the equivalence of Youla, system-level and input–output parameterizations. IEEE Transactions on Automatic Control, 66(1):413–420, 2021.

[39] John M. Lee. Introduction to Smooth Manifolds. Springer Science & Business Media, 2nd edition, 2013.

[40] Stephen Boyd, Laurent El Ghaoui, Eric Feron, and Venkataramanan Balakrishnan. Linear Matrix Inequalities in System and Control Theory. Society for Industrial and Applied Mathematics, 1994.

[41] Jason D. Lee, Ioannis Panageas, Georgios Piliouras, Max Simchowitz, Michael I. Jordan, and Benjamin Recht. First-order methods almost always avoid strict saddle points. Mathematical Programming, 176(1-2):311–337, 2019.

[42] Chi Jin, Rong Ge, Praneeth Netrapalli, Sham M. Kakade, and Michael I. Jordan. How to escape saddle points efficiently. In Doina Precup and Yee Whye Teh, editors, Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 1724–1732, 2017.

[43] David Hyland and Dennis Bernstein. The optimal projection equations for fixed-order dynamic compensation. IEEE Transactions on Automatic Control, 29(11):1034–1037, 1984.

[44] A. Yousuff and R. Skelton. A note on balanced controller reduction. IEEE Transactions on Automatic Control, 29(3):254–257, 1984.

[45] Pierre-Antoine Absil, Robert Mahony, and Benjamin Andrews. Convergence of the iterates of descent methods for analytic cost functions. SIAM Journal on Optimization, 16(2):531–547, 2005.

[46] Dimitri P. Bertsekas. Nonlinear Programming. Belmont, MA: Athena Scientific, 1997.

[47] John Milnor and David W. Weaver. Topology from the Differentiable Viewpoint. Princeton University Press, 1997.

[48] Deane Montgomery and Leo Zippin. Topological Transformation Groups. Courier Dover Publications, 2018.

[49] F. Brasch and J. R. Pearson. Pole placement using dynamic compensators. IEEE Transactions on Automatic Control, 15(1):34–43, 1970.

[50] Eliahu Ibrahim Jury. Theory and Application of the z-Transform Method. 1964.

[51] Lee H. Keel and Shankar P. Bhattacharyya. A new proof of the Jury test. Automatica, 35(2):251–258, 1999.

Appendix
This appendix is divided into four parts:
• Appendix A presents some preliminaries in control theory and differential geometry;
• Appendix B presents auxiliary proofs/results for continuous-time systems;
• Appendix C presents the connectivity results for proper stabilizing controllers;
• Appendix D presents analogous landscape results for the LQG problem in discrete time.
A Fundamentals of Control Theory and Differential Geometry
To keep the paper self-contained, this section reviews some fundamental notions from control theory (see [1, Chapter 3] for more details), as well as some basic notions from differential geometry [39, 47].
A.1 Controllability, Observability, and Minimal Systems
Consider a dynamical system, parameterized by $(A, B, C, D) \in \mathbb{R}^{n \times n} \times \mathbb{R}^{n \times m} \times \mathbb{R}^{p \times n} \times \mathbb{R}^{p \times m}$, as follows:
$$\dot{x} = Ax + Bu, \qquad y = Cx + Du. \tag{A.1}$$
The system (A.1) is called controllable if the controllability matrix has full row rank,
$$\operatorname{rank}\left(\begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix}\right) = n,$$
and observable if the observability matrix has full column rank,
$$\operatorname{rank}\begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix} = n.$$
The input-output behavior of (A.1) can be equivalently described in the frequency domain by the transfer function
$$G(s) = C(sI - A)^{-1}B + D. \tag{A.2}$$
It is easy to verify that the transfer function $G(s)$ is invariant under any similarity transformation of the state-space model, i.e., under $(TAT^{-1}, TB, CT^{-1}, D)$ for any invertible $T$.

System (A.1) is called minimal if and only if it is both controllable and observable. This notion of "minimal" is justified by the following interpretation: if system (A.1) is not minimal, then there exists another state-space model with a smaller state dimension $\hat{n} < n$,
$$\dot{\hat{x}} = \hat{A}\hat{x} + \hat{B}u, \qquad y = \hat{C}\hat{x} + Du,$$
such that the input-output behavior is the same as that of (A.1), i.e., $G(s) = \hat{C}(sI - \hat{A})^{-1}\hat{B} + D$. In this paper, we use the notions of "minimal controller" and "controllable and observable controller" interchangeably. The following theorem shows that minimal realizations of a transfer matrix are identical up to a similarity transformation.

Theorem A.1 ([1, Theorem 3.17]). Given a real rational transfer matrix $G(s)$, suppose that $(A_1, B_1, C_1, D_1)$ and $(A_2, B_2, C_2, D_2)$ are two minimal state-space realizations of $G(s)$. Then, there exists a unique invertible matrix $T$ such that $A_2 = TA_1T^{-1}$, $B_2 = TB_1$, $C_2 = C_1T^{-1}$, and $D_2 = D_1$.

Finally, the system (A.1) is proper in the sense that the degree of the numerator in (A.2) does not exceed the degree of its denominator, and it is strictly proper if $D = 0$.

A.2 Lyapunov Equations
Given a real matrix $A \in \mathbb{R}^{n \times n}$ and a symmetric matrix $Q \in \mathbb{S}^n$, we consider the following Lyapunov equation:
$$A^TX + XA + Q = 0. \tag{A.3}$$
Its vectorized version is
$$(I_n \otimes A^T + A^T \otimes I_n)\operatorname{vec}(X) = -\operatorname{vec}(Q), \tag{A.4}$$
where we use $\otimes$ to denote the Kronecker product. It can be shown that if $A$ is stable, then $I_n \otimes A^T + A^T \otimes I_n$ is invertible, and thus from (A.4), the Lyapunov equation (A.3) admits a unique solution for any matrix $Q$. Furthermore, we have the following results on the positive semidefiniteness of the solution $X$.

Lemma A.1 ([1, Lemma 3.18]). Consider the Lyapunov equation (A.3). Assuming that $A$ is stable, the following statements hold.
• The unique solution is $X = \int_0^\infty e^{A^Tt}Qe^{At}\,dt$.
• $X \succ 0$ if $Q \succ 0$, and $X \succeq 0$ if $Q \succeq 0$.
• If $Q \succeq 0$, then $X \succ 0$ if and only if $(Q^{1/2}, A)$ is observable.

Given the solution to the Lyapunov equation (A.3), there also exist converse results that establish the stability of the matrix $A$; see [1, Lemma 3.19].

A.3 Manifolds and Lie Groups
We adopt the following definitions for manifolds in Euclidean spaces. We refer to [39, 47] for more details of these definitions and related results.
Definition 2 ($C^\infty$ maps and diffeomorphism). Let $E$ and $F$ be two real Euclidean spaces, and let $X \subseteq E$ and $Y \subseteq F$ be subsets of $E$ and $F$ respectively. We say that a map $\phi: X \to Y$ is $C^\infty$ if for any $p \in X$, there exist an open neighborhood $U$ of $p$ in $E$ and an infinitely differentiable function $\tilde{\phi}: U \to F$ that coincides with $\phi$ on $U \cap X$. We say that a $C^\infty$ map $\phi: X \to Y$ is a diffeomorphism from $X$ to $Y$ if $\phi$ has an inverse map $\phi^{-1}: Y \to X$ that is $C^\infty$. We say that $X$ and $Y$ are diffeomorphic if there exists a diffeomorphism from $X$ to $Y$.

Definition 3 (Manifold and submanifold). Let $E$ be a real Euclidean space. A subset $\mathcal{M} \subseteq E$ is said to be a $C^\infty$ manifold of dimension $k$ in $E$ if for any $p \in \mathcal{M}$, there exists an open neighborhood $U$ of $p$ in $E$ such that $U \cap \mathcal{M}$ is diffeomorphic to some open subset of $\mathbb{R}^k$.
Let $\mathcal{M} \subseteq E$ be a $C^\infty$ manifold in the real Euclidean space $E$. A subset $\mathcal{N} \subseteq \mathcal{M}$ is said to be a $C^\infty$ (embedded) submanifold of $\mathcal{M}$ if it is a manifold in the real Euclidean space $E$.

Definition 4 (Tangent space). Let $\mathcal{M} \subseteq E$ be a $C^\infty$ manifold in a real Euclidean space $E$. Given $x \in \mathcal{M}$, we say that $v \in E$ is a tangent vector of $\mathcal{M}$ at $x$ if there exists a $C^\infty$ curve $\gamma: (-1, 1) \to \mathcal{M}$ with $\gamma(0) = x$ and $v = \gamma'(0)$. The set of tangent vectors of $\mathcal{M}$ at $x$ is called the tangent space of $\mathcal{M}$ at $x$, which we denote by $T_x\mathcal{M}$.
It is a known fact in differential geometry that the dimension of the tangent space is equal to the dimension of the manifold.

Definition 5 (Tangent map). Let $\mathcal{M} \subseteq E$ and $\mathcal{N} \subseteq F$ be two $C^\infty$ manifolds in real Euclidean spaces $E$ and $F$ respectively, and let $\phi: \mathcal{M} \to \mathcal{N}$ be a $C^\infty$ map. For any $x \in \mathcal{M}$, the tangent map of $\phi$ at $x$ is the linear map $d\phi_x: T_x\mathcal{M} \to T_{\phi(x)}\mathcal{N}$ defined by
$$d\phi_x(\gamma'(0)) = \left.\frac{d(\phi \circ \gamma(t))}{dt}\right|_{t=0}$$
for any $C^\infty$ curve $\gamma: (-1, 1) \to \mathcal{M}$ with $\gamma(0) = x$.
It is known in differential geometry that, if $\phi: \mathcal{M} \to \mathcal{N}$ is a diffeomorphism, then $d\phi_x$ is an isomorphism (a bijective linear map) from $T_x\mathcal{M}$ to $T_{\phi(x)}\mathcal{N}$.

Definition 6 (Lie group). A $C^\infty$ manifold $G$ is said to be a Lie group if there exists a $C^\infty$ binary operation $\cdot: G \times G \to G$ such that the following group axioms are satisfied:
1) associativity: $(x \cdot y) \cdot z = x \cdot (y \cdot z)$ for all $x, y, z \in G$;
2) identity: there exists $e \in G$ such that $e \cdot x = x \cdot e = x$ for all $x \in G$;
3) inverse: for all $x \in G$ there exists a unique $x^{-1} \in G$ such that $x \cdot x^{-1} = x^{-1} \cdot x = e$;
and moreover, the inversion $x \mapsto x^{-1}$ is a $C^\infty$ map from $G$ to $G$.
In this paper, we extensively use the Lie group $\mathrm{GL}_q$, the set of $q \times q$ real invertible matrices together with ordinary matrix multiplication. $\mathrm{GL}_q$ is a $q^2$-dimensional manifold on which the group operations of multiplication and inversion are smooth maps.

Definition 7 (Lie group action). Let $\mathcal{M}$ be a $C^\infty$ manifold, and let $G$ be a Lie group with identity $e \in G$. We say that a $C^\infty$ map $\mathcal{T}: G \times \mathcal{M} \to \mathcal{M}$ gives a (left) Lie group action if $\mathcal{T}(e, x) = x$ and $\mathcal{T}(u \cdot v, x) = \mathcal{T}(u, \mathcal{T}(v, x))$ for all $x \in \mathcal{M}$ and $u, v \in G$.
As an example, the similarity transformation $\mathcal{T}_q(T, K)$ defined in (14) gives a Lie group action of $\mathrm{GL}_q$ on $\mathcal{C}_q$.

B Auxiliary Results for Continuous-time Systems
This section presents some auxiliary proofs/results for continuous-time systems.
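Several of the proofs below reduce to solving continuous-time Lyapunov equations numerically. The following minimal sketch (assuming `numpy`; the function name is our own) solves (A.3) through its vectorized form (A.4) and checks the positive-definiteness claim of Lemma A.1:

```python
import numpy as np

def lyap_obs(a, q):
    """Solve A^T X + X A + Q = 0 via the vectorized form (A.4)."""
    n = a.shape[0]
    lhs = np.kron(np.eye(n), a.T) + np.kron(a.T, np.eye(n))
    # vec() stacks columns, i.e. column-major ("F") order in numpy
    x = np.linalg.solve(lhs, -q.flatten(order="F"))
    return x.reshape((n, n), order="F")

A = np.array([[-1.0, 1.0],
              [0.0, -2.0]])   # stable: eigenvalues -1 and -2
Q = np.eye(2)

X = lyap_obs(A, Q)
residual = np.abs(A.T @ X + X @ A + Q).max()
assert residual < 1e-12                    # (A.3) is satisfied
assert np.linalg.eigvalsh(X).min() > 0     # X > 0 since Q > 0 (Lemma A.1)
```

The same Kronecker-product construction reappears in the proof of Lemma 2.3 below, where the invertibility of the vectorized operator follows from the stability of the closed-loop matrix.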
B.1 Proofs of Lemmas 2.2 and 2.3
We first prove the LQG cost formulation in Lemma 2.2. Given a stabilizing controller $K \in \mathcal{C}_q$, the closed-loop system is shown in (7). Since the controller $K$ internally stabilizes the plant, the closed-loop matrix
$$A_{\mathrm{cl},K} := \begin{bmatrix} A & BC_K \\ B_KC & A_K \end{bmatrix}$$
is stable, and the state variable $(x(t), \xi(t))$ is a Gaussian process with mean satisfying
$$\lim_{t \to \infty} \mathbb{E}\begin{bmatrix} x(t) \\ \xi(t) \end{bmatrix} = 0,$$
and covariance satisfying
$$\lim_{t \to \infty} \mathbb{E}\left(\begin{bmatrix} x(t) \\ \xi(t) \end{bmatrix}\begin{bmatrix} x(t) \\ \xi(t) \end{bmatrix}^T\right) = \lim_{t \to \infty} \int_0^t e^{A_{\mathrm{cl},K}(t-\tau)}\begin{bmatrix} W & 0 \\ 0 & B_KVB_K^T \end{bmatrix}e^{A_{\mathrm{cl},K}^T(t-\tau)}\,d\tau = \int_0^\infty e^{A_{\mathrm{cl},K}t}\begin{bmatrix} W & 0 \\ 0 & B_KVB_K^T \end{bmatrix}e^{A_{\mathrm{cl},K}^Tt}\,dt. \tag{B.1}$$
By Lemma A.1, the last expression in (B.1) is the same as the unique solution $X_K$ to the Lyapunov equation (12a). Therefore, the corresponding LQG cost is given by
$$\begin{aligned} J_q &:= \lim_{T \to \infty}\frac{1}{T}\mathbb{E}\left[\int_0^T\left(x^TQx + u^TRu\right)dt\right] = \lim_{t \to \infty}\mathbb{E}\left(\begin{bmatrix} x \\ \xi \end{bmatrix}^T\begin{bmatrix} Q & 0 \\ 0 & C_K^TRC_K \end{bmatrix}\begin{bmatrix} x \\ \xi \end{bmatrix}\right) \\ &= \lim_{t \to \infty}\mathbb{E}\operatorname{tr}\left(\begin{bmatrix} Q & 0 \\ 0 & C_K^TRC_K \end{bmatrix}\begin{bmatrix} x \\ \xi \end{bmatrix}\begin{bmatrix} x \\ \xi \end{bmatrix}^T\right) = \operatorname{tr}\left(\begin{bmatrix} Q & 0 \\ 0 & C_K^TRC_K \end{bmatrix}\lim_{t \to \infty}\mathbb{E}\left(\begin{bmatrix} x \\ \xi \end{bmatrix}\begin{bmatrix} x \\ \xi \end{bmatrix}^T\right)\right) = \operatorname{tr}\left(\begin{bmatrix} Q & 0 \\ 0 & C_K^TRC_K \end{bmatrix}X_K\right). \end{aligned}$$
The other expression of the LQG cost in Lemma 2.2 follows from the Lyapunov equation (12b) by duality between the controllability Gramian and the observability Gramian.

We now proceed to prove Lemma 2.3. First, upon vectorizing the Lyapunov equation (12a), we have
$$(I_{n+q} \otimes A_{\mathrm{cl},K} + A_{\mathrm{cl},K} \otimes I_{n+q})\operatorname{vec}(X_K) = -\operatorname{vec}\left(\begin{bmatrix} W & 0 \\ 0 & B_KVB_K^T \end{bmatrix}\right).$$
Since $A_{\mathrm{cl},K}$ is stable, we know that $I_{n+q} \otimes A_{\mathrm{cl},K} + A_{\mathrm{cl},K} \otimes I_{n+q}$ is invertible, and thus we have
$$\operatorname{vec}(X_K) = -(I_{n+q} \otimes A_{\mathrm{cl},K} + A_{\mathrm{cl},K} \otimes I_{n+q})^{-1}\operatorname{vec}\left(\begin{bmatrix} W & 0 \\ 0 & B_KVB_K^T \end{bmatrix}\right).$$
It is not difficult to see (e.g., by Cramer's rule) that each element of $(I_{n+q} \otimes A_{\mathrm{cl},K} + A_{\mathrm{cl},K} \otimes I_{n+q})^{-1}$ is a rational function of the elements of $K$.
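The two equivalent expressions of the LQG cost, via the solution $X_K$ of (12a) and, by duality, via the solution $Y_K$ of (12b), can be cross-checked numerically. The sketch below uses a toy one-state plant and one-state stabilizing controller of our own choosing (not data from the paper):

```python
import numpy as np

def lyap(a, m):
    # solve a X + X a^T + m = 0 via vectorization, as in the proof of Lemma 2.3
    n = a.shape[0]
    lhs = np.kron(np.eye(n), a) + np.kron(a, np.eye(n))
    return np.linalg.solve(lhs, -m.flatten(order="F")).reshape((n, n), order="F")

# plant (A, B, C) and dynamic controller (Ak, Bk, Ck), all 1-dimensional
A, B, C = np.array([[0.0]]), np.array([[1.0]]), np.array([[1.0]])
Ak, Bk, Ck = np.array([[-2.0]]), np.array([[1.0]]), np.array([[-1.0]])
Q, R = 2.0 * np.eye(1), np.eye(1)   # state / input weights
W, V = np.eye(1), 3.0 * np.eye(1)   # process / measurement noise covariances

Acl = np.block([[A, B @ Ck], [Bk @ C, Ak]])   # eigenvalues -1, -1: stabilizing
z = np.zeros((1, 1))
Wcl = np.block([[W, z], [z, Bk @ V @ Bk.T]])
Qcl = np.block([[Q, z], [z, Ck.T @ R @ Ck]])

Xk = lyap(Acl, Wcl)        # Lyapunov equation (12a)
Yk = lyap(Acl.T, Qcl)      # Lyapunov equation (12b)
J_via_X = np.trace(Qcl @ Xk)
J_via_Y = np.trace(Wcl @ Yk)
assert abs(J_via_X - J_via_Y) < 1e-10   # the two cost expressions agree
assert abs(J_via_X - 5.0) < 1e-10       # equals 5 for this particular data
```

The agreement of `J_via_X` and `J_via_Y` is exactly the controllability/observability Gramian duality invoked above.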
Therefore, the LQG cost function
$$J_q(K) = \operatorname{tr}\left(\begin{bmatrix} Q & 0 \\ 0 & C_K^TRC_K \end{bmatrix}X_K\right)$$
is a rational function of the elements of $K$, and hence real analytic.

B.2 Proof of Proposition 3.1
It is straightforward to see that $\Phi(\cdot)$ is continuous, since each element of $\Phi(Z)$ is a rational function of the elements of $Z$ (a ratio of two polynomials). To show that $\Phi$ is a mapping onto $\mathcal{C}_n$, we need to prove the following statements:
1) For all $K \in \mathcal{C}_n$, there exists $Z = (X, Y, M, G, H, F, \Pi, \Xi) \in \mathcal{G}_n$ such that $\Phi(Z) = K$.
2) For all $Z = (X, Y, M, G, H, F, \Pi, \Xi) \in \mathcal{G}_n$, we have $\Phi(Z) \in \mathcal{C}_n$.

To show the first statement, let $K = \begin{bmatrix} D_K & C_K \\ B_K & A_K \end{bmatrix} \in \mathcal{C}_n$ be arbitrary. By definition we have $D_K = 0$, and the stability of the matrix $\begin{bmatrix} A + BD_KC & BC_K \\ B_KC & A_K \end{bmatrix}$ implies that the Lyapunov inequality
$$\begin{bmatrix} A + BD_KC & BC_K \\ B_KC & A_K \end{bmatrix}\begin{bmatrix} X & \Pi^T \\ \Pi & \hat{X} \end{bmatrix} + \begin{bmatrix} X & \Pi^T \\ \Pi & \hat{X} \end{bmatrix}\begin{bmatrix} A + BD_KC & BC_K \\ B_KC & A_K \end{bmatrix}^T \prec 0 \tag{B.2}$$
has a solution $\begin{bmatrix} X & \Pi^T \\ \Pi & \hat{X} \end{bmatrix} \succ 0$. Without loss of generality we may assume that $\det\Pi \neq 0$ (otherwise we can add a small perturbation to $\Pi$ to make it invertible while still preserving the inequality (B.2)). Upon defining
$$\begin{bmatrix} Y & \Xi \\ \Xi^T & \hat{Y} \end{bmatrix} := \begin{bmatrix} X & \Pi^T \\ \Pi & \hat{X} \end{bmatrix}^{-1}, \qquad T := \begin{bmatrix} X & \Pi^T \\ \Pi & \hat{X} \end{bmatrix}^{-1}\begin{bmatrix} X & I \\ \Pi & 0 \end{bmatrix} = \begin{bmatrix} I & Y \\ 0 & \Xi^T \end{bmatrix},$$
we can verify that
$$YX + \Xi\Pi = I, \qquad T^T\begin{bmatrix} X & \Pi^T \\ \Pi & \hat{X} \end{bmatrix}T = \begin{bmatrix} X & I \\ I & Y \end{bmatrix} \succ 0. \tag{B.3}$$
Upon letting
$$\begin{aligned} M &= Y(A + BD_KC)X + \Xi B_KCX + YBC_K\Pi + \Xi A_K\Pi, \\ G &= D_K, \\ H &= YBD_K + \Xi B_K, \\ F &= D_KCX + C_K\Pi, \end{aligned} \tag{B.4}$$
we can also verify that
$$T^T\begin{bmatrix} A + BD_KC & BC_K \\ B_KC & A_K \end{bmatrix}\begin{bmatrix} X & \Pi^T \\ \Pi & \hat{X} \end{bmatrix}T = \begin{bmatrix} AX + BF & A + BGC \\ M & YA + HC \end{bmatrix}. \tag{B.5}$$
Combining (B.5) with (B.2) and (B.3), we see that $Z = (X, Y, M, G, H, F, \Pi, \Xi) \in \mathcal{G}_n$ by the definition of $\mathcal{G}_n$. Note that the change of variables (B.4) can be compactly represented as
$$\begin{bmatrix} G & F \\ H & M \end{bmatrix} = \begin{bmatrix} I & 0 \\ YB & \Xi \end{bmatrix}\begin{bmatrix} D_K & C_K \\ B_K & A_K \end{bmatrix}\begin{bmatrix} I & CX \\ 0 & \Pi \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & YAX \end{bmatrix},$$
and with the guarantee in Lemma 3.3, we see that
$$\begin{bmatrix} D_K & C_K \\ B_K & A_K \end{bmatrix} = \begin{bmatrix} I & 0 \\ YB & \Xi \end{bmatrix}^{-1}\begin{bmatrix} G & F \\ H & M - YAX \end{bmatrix}\begin{bmatrix} I & CX \\ 0 & \Pi \end{bmatrix}^{-1} = \begin{bmatrix} \Phi_D(Z) & \Phi_C(Z) \\ \Phi_B(Z) & \Phi_A(Z) \end{bmatrix} = \Phi(Z).$$

We then prove the second statement. Let $Z = (X, Y, M, G, H, F, \Pi, \Xi) \in \mathcal{G}_n$ be arbitrary. Let $\hat{X} = \Pi(X - Y^{-1})^{-1}\Pi^T$. It is straightforward to see that $\hat{X} \succ 0$ and
$$\begin{bmatrix} X & \Pi^T \\ \Pi & \hat{X} \end{bmatrix}\begin{bmatrix} I & Y \\ 0 & \Xi^T \end{bmatrix} = \begin{bmatrix} X & XY + \Pi^T\Xi^T \\ \Pi & \Pi Y + \hat{X}\Xi^T \end{bmatrix} = \begin{bmatrix} X & I \\ \Pi & 0 \end{bmatrix},$$
where we used $XY + \Pi^T\Xi^T = I$ and
$$\Pi Y + \hat{X}\Xi^T = \Pi Y + \Pi(X - Y^{-1})^{-1}\Pi^T\Xi^T = \Pi Y - \Pi(X - Y^{-1})^{-1}(XY - I) = \Pi Y - \Pi(X - Y^{-1})^{-1}(X - Y^{-1})Y = 0.$$
We also have
$$\begin{bmatrix} G & F \\ H & M \end{bmatrix} = \begin{bmatrix} I & 0 \\ YB & \Xi \end{bmatrix}\begin{bmatrix} \Phi_D(Z) & \Phi_C(Z) \\ \Phi_B(Z) & \Phi_A(Z) \end{bmatrix}\begin{bmatrix} I & CX \\ 0 & \Pi \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & YAX \end{bmatrix}$$
from the definition of $\Phi$. Similarly to the derivation of the equality (B.5), we can show that
$$\begin{bmatrix} AX + BF & A + BGC \\ M & YA + HC \end{bmatrix} = \begin{bmatrix} I & Y \\ 0 & \Xi^T \end{bmatrix}^T\begin{bmatrix} A + BGC & B\Phi_C(Z) \\ \Phi_B(Z)C & \Phi_A(Z) \end{bmatrix}\begin{bmatrix} X & \Pi^T \\ \Pi & \hat{X} \end{bmatrix}\begin{bmatrix} I & Y \\ 0 & \Xi^T \end{bmatrix}.$$
Then from the definition of $\mathcal{G}_n$, we can further get
$$\begin{bmatrix} A + BGC & B\Phi_C(Z) \\ \Phi_B(Z)C & \Phi_A(Z) \end{bmatrix}\begin{bmatrix} X & \Pi^T \\ \Pi & \hat{X} \end{bmatrix} + \begin{bmatrix} X & \Pi^T \\ \Pi & \hat{X} \end{bmatrix}\begin{bmatrix} A + BGC & B\Phi_C(Z) \\ \Phi_B(Z)C & \Phi_A(Z) \end{bmatrix}^T \prec 0,$$
and since $X - \Pi^T\hat{X}^{-1}\Pi = Y^{-1} \succ 0$, the matrix $\begin{bmatrix} X & \Pi^T \\ \Pi & \hat{X} \end{bmatrix}$ is positive definite. We can now see that $\begin{bmatrix} A & B\Phi_C(Z) \\ \Phi_B(Z)C & \Phi_A(Z) \end{bmatrix}$ (note that $G = \Phi_D(Z) = 0$) satisfies the Lyapunov inequality and thus is stable, meaning that $\Phi(Z) \in \mathcal{C}_n$.

B.3 A Second-Order SISO System for Which $\mathcal{C}_n$ Is Not Path-Connected
Consider a second-order SISO plant with
$$A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad C = \begin{bmatrix} 0 & 1 \end{bmatrix}. \tag{B.6}$$
For this case, any reduced-order controller in $\mathcal{C}_1$ can be parameterized by $K = \begin{bmatrix} 0 & C_K \\ B_K & A_K \end{bmatrix}$ for some $A_K, B_K, C_K \in \mathbb{R}$. We now show that the closed-loop matrix (8), given by
$$\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & C_K \\ 0 & B_K & A_K \end{bmatrix},$$
is not stable for any $A_K, B_K, C_K \in \mathbb{R}$, implying that $\mathcal{C}_1 = \varnothing$. Indeed, by the Routh–Hurwitz criterion, the characteristic polynomial
$$\det\left(\lambda I - \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & C_K \\ 0 & B_K & A_K \end{bmatrix}\right) = \lambda^3 - A_K\lambda^2 - (B_KC_K + 1)\lambda + A_K$$
has all roots in the open left half plane if and only if
$$-A_K > 0, \qquad A_K > 0, \qquad A_K(B_KC_K + 1) > A_K,$$
and the first two conditions contradict each other. Since $\mathcal{C}_1 = \varnothing$ and the plant is SISO, $\mathcal{C}_2$ is not path-connected by Theorem 3.3.

We can also directly prove the disconnectivity of $\mathcal{C}_n$ in this example. The set $\mathcal{C}_n = \mathcal{C}_2$ for (B.6) can be written as
$$\mathcal{C}_2 = \left\{\begin{bmatrix} 0 & C_{K,1} & C_{K,2} \\ B_{K,1} & A_{K,11} & A_{K,12} \\ B_{K,2} & A_{K,21} & A_{K,22} \end{bmatrix} \in \mathbb{R}^{3 \times 3}\;\middle|\;\begin{bmatrix} A & BC_K \\ B_KC & A_K \end{bmatrix}\text{ is stable}\right\}.$$
Obviously $B_K$ cannot be zero for any stabilizing controller in $\mathcal{C}_2$. Since for any $B_K \in \mathbb{R}^2\setminus\{0\}$ there exists $T \in \mathbb{R}^{2 \times 2}$ with $\det T > 0$ such that $TB_K = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$, by the path-connectivity of the set $\{T \in \mathbb{R}^{2 \times 2} : \det T > 0\}$ [39], we can see that $\mathcal{C}_2$ is path-connected if and only if the set
$$S = \left\{\hat{K} = \begin{bmatrix} 0 & C_{K,1} & C_{K,2} \\ 0 & A_{K,11} & A_{K,12} \\ 1 & A_{K,21} & A_{K,22} \end{bmatrix}\;\middle|\;\begin{bmatrix} A & BC_K \\ B_KC & A_K \end{bmatrix}\text{ is stable with }B_K = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\}$$
is path-connected.

The Routh–Hurwitz stability criterion gives an equivalent description of the set $S$ as
$$S = \left\{\hat{K}\;\middle|\;p_1(\hat{K}) > 0,\;p_2(\hat{K}) > 0,\;p_3(\hat{K}) > 0,\;p_4(\hat{K}) > 0\right\},$$
where, writing the closed-loop characteristic polynomial as $\lambda^4 + p_1\lambda^3 + a_2\lambda^2 + p_2\lambda + p_4$ with $a_2(\hat{K}) = A_{K,11}A_{K,22} - A_{K,12}A_{K,21} - C_{K,2} - 1$,
$$\begin{aligned} p_1(\hat{K}) &= -A_{K,11} - A_{K,22}, \\ p_2(\hat{K}) &= A_{K,11} + A_{K,22} + A_{K,11}C_{K,2} - A_{K,12}C_{K,1}, \\ p_3(\hat{K}) &= (A_{K,11} + A_{K,22})^2(A_{K,11}A_{K,22} - A_{K,12}A_{K,21}) \\ &\quad - p_2(\hat{K})\left[(A_{K,11} + A_{K,22})(A_{K,11}A_{K,22} - A_{K,12}A_{K,21} - C_{K,2}) + A_{K,11}C_{K,2} - A_{K,12}C_{K,1}\right], \\ p_4(\hat{K}) &= -A_{K,11}A_{K,22} + A_{K,12}A_{K,21}; \end{aligned}$$
note that $p_3(\hat{K}) = p_2(\hat{K})\big(p_1(\hat{K})a_2(\hat{K}) - p_2(\hat{K})\big) - p_1(\hat{K})^2p_4(\hat{K})$.

We first show that $A_{K,12} \neq 0$ for any $\hat{K} \in S$. Indeed, if $A_{K,12} = 0$, we then have
$$p_2(\hat{K}) = A_{K,11} + A_{K,22} + A_{K,11}C_{K,2}, \qquad p_4(\hat{K}) = -A_{K,11}A_{K,22}.$$
From $p_1(\hat{K}) > 0$ and $p_2(\hat{K}) > 0$ we get $A_{K,11}C_{K,2} > -(A_{K,11} + A_{K,22}) > 0$, so $A_{K,11}$ and $C_{K,2}$ have the same sign; together with $p_4(\hat{K}) > 0$, i.e., $A_{K,11}A_{K,22} < 0$, this gives $A_{K,22}C_{K,2} < 0$. A direct computation then yields
$$p_1(\hat{K})a_2(\hat{K}) - p_2(\hat{K}) = -(A_{K,11} + A_{K,22})A_{K,11}A_{K,22} + A_{K,22}C_{K,2} < 0,$$
since both terms are negative, and hence $p_3(\hat{K}) = p_2(\hat{K})\big(p_1(\hat{K})a_2(\hat{K}) - p_2(\hat{K})\big) - p_1(\hat{K})^2p_4(\hat{K}) < 0$, contradicting $p_3(\hat{K}) > 0$. Thus $A_{K,12} \neq 0$ for any $\hat{K} \in S$.

On the other hand, $S$ contains points with both signs of $A_{K,12}$: for instance, one can check via the conditions above that
$$\hat{K}^{(1)} = \begin{bmatrix} 0 & -2 & -5 \\ 0 & 0 & 1 \\ 1 & 1 & -1 \end{bmatrix}, \qquad \hat{K}^{(2)} = \begin{bmatrix} 0 & 2 & -5 \\ 0 & 0 & -1 \\ 1 & -1 & -1 \end{bmatrix}$$
are both in $S$ (both yield the closed-loop characteristic polynomial $\lambda^4 + \lambda^3 + 3\lambda^2 + \lambda + 1$, which is stable). Now we see that $S$ is not path-connected, since any continuous path connecting $\hat{K}^{(1)}$ and $\hat{K}^{(2)}$ must pass through a point with $A_{K,12} = 0$. Consequently, the set $\mathcal{C}_2$ is not path-connected for this example.

B.4 The Gradient and the Hessian of $J_q(K)$

We first introduce the following lemma.
Lemma B.1.
Suppose $M: (-\delta, \delta) \to \mathbb{R}^{k \times k}$ and $G: (-\delta, \delta) \to \mathbb{S}^k$ are two infinitely differentiable matrix-valued functions for some $\delta > 0$ and $k \in \mathbb{N}\setminus\{0\}$, and suppose $M(t)$ is stable for all $t \in (-\delta, \delta)$. Let $X(t)$ denote the solution to the following Lyapunov equation:
$$M(t)X(t) + X(t)M(t)^T + G(t) = 0.$$
Then $X(t)$ is infinitely differentiable over $t \in (-\delta, \delta)$, and its $j$-th order derivative at $t = 0$, denoted by $X^{(j)}(0)$, is the solution to the following Lyapunov equation:
$$M(0)X^{(j)}(0) + X^{(j)}(0)M(0)^T + \left(\sum_{i=1}^j\frac{j!}{i!(j-i)!}\left(M^{(i)}(0)X^{(j-i)}(0) + X^{(j-i)}(0)M^{(i)}(0)^T\right) + G^{(j)}(0)\right) = 0. \tag{B.7}$$

Proof of Lemma B.1.
The differentiability of $X(t)$ follows from the observation that the unique solution to the Lyapunov equation can be written as
$$\operatorname{vec}(X(t)) = -(I_k \otimes M(t) + M(t) \otimes I_k)^{-1}\operatorname{vec}(G(t)).$$
Since $M(t)$, $G(t)$ and $X(t)$ are infinitely differentiable, they admit Taylor expansions around $t = 0$ given by
$$M(t) = \sum_{j=0}^a\frac{t^j}{j!}M^{(j)}(0) + o(t^a), \qquad G(t) = \sum_{j=0}^a\frac{t^j}{j!}G^{(j)}(0) + o(t^a), \qquad X(t) = \sum_{j=0}^a\frac{t^j}{j!}X^{(j)}(0) + o(t^a)$$
for any $a \in \mathbb{N}$. By plugging these Taylor expansions into the original Lyapunov equation, after some algebraic manipulations, we can show that
$$\sum_{j=0}^a t^j\left[\sum_{i=0}^j\frac{1}{i!(j-i)!}\left(M^{(i)}(0)X^{(j-i)}(0) + X^{(j-i)}(0)M^{(i)}(0)^T\right) + \frac{1}{j!}G^{(j)}(0)\right] + o(t^a) = 0.$$
Matching the coefficient of each power $t^j$, we get
$$\sum_{i=0}^j\frac{1}{i!(j-i)!}\left(M^{(i)}(0)X^{(j-i)}(0) + X^{(j-i)}(0)M^{(i)}(0)^T\right) + \frac{1}{j!}G^{(j)}(0) = 0,$$
which, after multiplying by $j!$, is the same as (B.7). Thus, $X^{(j)}(0)$ is the solution to the Lyapunov equation (B.7).

Given any stabilizing controller $K \in \mathcal{C}_q$, we denote the closed-loop matrix as
$$A_{\mathrm{cl},K} = \begin{bmatrix} A & BC_K \\ B_KC & A_K \end{bmatrix} = \begin{bmatrix} A & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} B & 0 \\ 0 & I \end{bmatrix}K\begin{bmatrix} C & 0 \\ 0 & I \end{bmatrix},$$
and recall that the LQG cost is given by
$$J_q(K) = \operatorname{tr}\left(\begin{bmatrix} Q & 0 \\ 0 & C_K^TRC_K \end{bmatrix}X_K\right),$$
where $X_K$ is the unique positive semidefinite solution to the Lyapunov equation (12a).

Consider an arbitrary direction $\Delta = \begin{bmatrix} 0 & \Delta_{C_K} \\ \Delta_{B_K} & \Delta_{A_K} \end{bmatrix} \in \mathcal{V}_q$. For sufficiently small $t > 0$ such that $K + t\Delta \in \mathcal{C}_q$, the corresponding closed-loop matrix is
$$A_{\mathrm{cl},K+t\Delta} = A_{\mathrm{cl},K} + t\begin{bmatrix} B & 0 \\ 0 & I \end{bmatrix}\Delta\begin{bmatrix} C & 0 \\ 0 & I \end{bmatrix},$$
and we let $X_{K,\Delta}(t)$ denote the solution to the Lyapunov equation (12a) with closed-loop matrix $A_{\mathrm{cl},K+t\Delta}$, i.e.,
$$A_{\mathrm{cl},K+t\Delta}X_{K,\Delta}(t) + X_{K,\Delta}(t)A_{\mathrm{cl},K+t\Delta}^T + \begin{bmatrix} W & 0 \\ 0 & (B_K + t\Delta_{B_K})V(B_K + t\Delta_{B_K})^T \end{bmatrix} = 0. \tag{B.8}$$
By Lemma B.1, we see that $X_{K,\Delta}(t)$ admits a Taylor expansion of the form
$$X_{K,\Delta}(t) = X_K + t\,X_{K,\Delta}'(0) + \frac{t^2}{2}X_{K,\Delta}''(0) + o(t^2), \tag{B.9}$$
and the derivatives $X_{K,\Delta}'(0)$ and $X_{K,\Delta}''(0)$ are the solutions to the following Lyapunov equations:
$$A_{\mathrm{cl},K}X_{K,\Delta}'(0) + X_{K,\Delta}'(0)A_{\mathrm{cl},K}^T + M(X_K, \Delta) = 0, \tag{B.10}$$
$$A_{\mathrm{cl},K}X_{K,\Delta}''(0) + X_{K,\Delta}''(0)A_{\mathrm{cl},K}^T + 2M\left(X_{K,\Delta}'(0), \Delta\right) = 0, \tag{B.11}$$
where
$$M(X_K, \Delta) := \begin{bmatrix} B & 0 \\ 0 & I \end{bmatrix}\Delta\begin{bmatrix} C & 0 \\ 0 & I \end{bmatrix}X_K + X_K\begin{bmatrix} C & 0 \\ 0 & I \end{bmatrix}^T\Delta^T\begin{bmatrix} B & 0 \\ 0 & I \end{bmatrix}^T + \begin{bmatrix} 0 & 0 \\ 0 & B_KV\Delta_{B_K}^T + \Delta_{B_K}VB_K^T \end{bmatrix},$$
$$M\left(X_{K,\Delta}'(0), \Delta\right) := \begin{bmatrix} B & 0 \\ 0 & I \end{bmatrix}\Delta\begin{bmatrix} C & 0 \\ 0 & I \end{bmatrix}X_{K,\Delta}'(0) + X_{K,\Delta}'(0)\begin{bmatrix} C & 0 \\ 0 & I \end{bmatrix}^T\Delta^T\begin{bmatrix} B & 0 \\ 0 & I \end{bmatrix}^T + \begin{bmatrix} 0 & 0 \\ 0 & \Delta_{B_K}V\Delta_{B_K}^T \end{bmatrix}.$$
Expanding $J_q(K + t\Delta)$, we get
$$\begin{aligned} J_q(K + t\Delta) &= \operatorname{tr}\left(\begin{bmatrix} Q & 0 \\ 0 & (C_K + t\Delta_{C_K})^TR(C_K + t\Delta_{C_K}) \end{bmatrix}X_{K,\Delta}(t)\right) \\ &= J_q(K) + t \cdot \operatorname{tr}\left(\begin{bmatrix} Q & 0 \\ 0 & C_K^TRC_K \end{bmatrix}X_{K,\Delta}'(0) + \begin{bmatrix} 0 & 0 \\ 0 & C_K^TR\Delta_{C_K} + \Delta_{C_K}^TRC_K \end{bmatrix}X_K\right) \\ &\quad + \frac{t^2}{2} \cdot \operatorname{tr}\left(\begin{bmatrix} Q & 0 \\ 0 & C_K^TRC_K \end{bmatrix}X_{K,\Delta}''(0) + 2\begin{bmatrix} 0 & 0 \\ 0 & C_K^TR\Delta_{C_K} + \Delta_{C_K}^TRC_K \end{bmatrix}X_{K,\Delta}'(0) + 2\begin{bmatrix} 0 & 0 \\ 0 & \Delta_{C_K}^TR\Delta_{C_K} \end{bmatrix}X_K\right) + o(t^2), \end{aligned}$$
from which we can directly recognize $\frac{dJ_q(K+t\Delta)}{dt}\big|_{t=0}$ and $\frac{d^2J_q(K+t\Delta)}{dt^2}\big|_{t=0}$.

Now suppose $X$ is the solution to the Lyapunov equation $A_{\mathrm{cl},K}X + XA_{\mathrm{cl},K}^T + M = 0$ for some $M \in \mathbb{S}^{n+q}$. Then, by Lemma A.1, we have
$$X = \int_0^{+\infty}e^{A_{\mathrm{cl},K}s}Me^{A_{\mathrm{cl},K}^Ts}\,ds,$$
and consequently
$$\operatorname{tr}\left(\begin{bmatrix} Q & 0 \\ 0 & C_K^TRC_K \end{bmatrix}X\right) = \int_0^{+\infty}\operatorname{tr}\left(\begin{bmatrix} Q & 0 \\ 0 & C_K^TRC_K \end{bmatrix}e^{A_{\mathrm{cl},K}s}Me^{A_{\mathrm{cl},K}^Ts}\right)ds = \int_0^{+\infty}\operatorname{tr}\left(e^{A_{\mathrm{cl},K}^Ts}\begin{bmatrix} Q & 0 \\ 0 & C_K^TRC_K \end{bmatrix}e^{A_{\mathrm{cl},K}s}M\right)ds = \operatorname{tr}(Y_KM),$$
in which we recall that $Y_K$ is the unique positive semidefinite solution to the Lyapunov equation (12b). Therefore the first-order derivative can be alternatively given by
$$\begin{aligned} \left.\frac{dJ_q(K+t\Delta)}{dt}\right|_{t=0} &= \operatorname{tr}\left(Y_KM(X_K, \Delta) + \begin{bmatrix} 0 & 0 \\ 0 & C_K^TR\Delta_{C_K} + \Delta_{C_K}^TRC_K \end{bmatrix}X_K\right) \\ &= 2\operatorname{tr}\left(\left(\begin{bmatrix} 0 & RC_K \\ 0 & 0 \end{bmatrix}X_K\begin{bmatrix} 0 & 0 \\ 0 & I \end{bmatrix} + \begin{bmatrix} B & 0 \\ 0 & I \end{bmatrix}^TY_KX_K\begin{bmatrix} C & 0 \\ 0 & I \end{bmatrix}^T + \begin{bmatrix} 0 & 0 \\ 0 & I \end{bmatrix}Y_K\begin{bmatrix} 0 & 0 \\ B_KV & 0 \end{bmatrix}\right)^T\Delta\right). \end{aligned}$$
One can readily recognize the gradient $\nabla J_q(K)$ by noticing that
$$\left.\frac{dJ_q(K+t\Delta)}{dt}\right|_{t=0} = \operatorname{tr}\left(\nabla J_q(K)^T\Delta\right).$$
Upon partitioning $X_K$ and $Y_K$ as in (25), a few simple calculations lead to the gradient formula of $J_q(K)$ in (24).
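The first-order expression $\frac{dJ_q(K+t\Delta)}{dt}\big|_{t=0} = \operatorname{tr}(Y_K M(X_K,\Delta)) + \operatorname{tr}(\operatorname{blkdiag}(0, C_K^TR\Delta_{C_K} + \Delta_{C_K}^TRC_K)X_K)$ can be checked against a finite difference. The plant and controller below are our own toy choices, not data from the paper:

```python
import numpy as np

def lyap(a, m):
    # solve a X + X a^T + m = 0 via vectorization
    n = a.shape[0]
    lhs = np.kron(np.eye(n), a) + np.kron(a, np.eye(n))
    return np.linalg.solve(lhs, -m.flatten(order="F")).reshape((n, n), order="F")

def blkdiag(m1, m2):
    # 2x2 block diagonal for the 1x1 blocks used in this toy example
    zz = np.zeros((1, 1))
    return np.block([[m1, zz], [zz, m2]])

# 1-state plant and 1-state stabilizing controller
A, B, C = np.array([[0.0]]), np.array([[2.0]]), np.array([[1.0]])
Q, R, W, V = 2.0 * np.eye(1), np.eye(1), np.eye(1), 3.0 * np.eye(1)
Ak, Bk, Ck = np.array([[-2.0]]), np.array([[1.0]]), np.array([[-1.0]])
DB, DC = blkdiag(B, np.eye(1)), blkdiag(C, np.eye(1))
A0 = blkdiag(A, np.zeros((1, 1)))
z = np.zeros((1, 1))

def J(ak, bk, ck):
    acl = A0 + DB @ np.block([[z, ck], [bk, ak]]) @ DC
    x = lyap(acl, blkdiag(W, bk @ V @ bk.T))
    return np.trace(blkdiag(Q, ck.T @ R @ ck) @ x)

# direction Delta = [[0, dC], [dB, dA]]
dA, dB_, dC = np.array([[0.1]]), np.array([[-0.2]]), np.array([[0.3]])
Delta = np.block([[z, dC], [dB_, dA]])

Acl = A0 + DB @ np.block([[z, Ck], [Bk, Ak]]) @ DC   # eigenvalues -1 +/- i
Xk = lyap(Acl, blkdiag(W, Bk @ V @ Bk.T))
Yk = lyap(Acl.T, blkdiag(Q, Ck.T @ R @ Ck))
M1 = DB @ Delta @ DC @ Xk + Xk @ DC.T @ Delta.T @ DB.T \
     + blkdiag(z, Bk @ V @ dB_.T + dB_ @ V @ Bk.T)
analytic = np.trace(Yk @ M1) \
           + np.trace(blkdiag(z, Ck.T @ R @ dC + dC.T @ R @ Ck) @ Xk)

t = 1e-6
fd = (J(Ak + t * dA, Bk + t * dB_, Ck + t * dC)
      - J(Ak - t * dA, Bk - t * dB_, Ck - t * dC)) / (2 * t)
assert abs(fd - analytic) < 1e-6
```

The same scaffolding extends to any dimensions once `blkdiag` is generalized; here everything is scalar-block for readability.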
Similarly, we can show that the second-order derivative can be alternatively given by
$$\begin{aligned} \left.\frac{d^2J_q(K+t\Delta)}{dt^2}\right|_{t=0} &= 2\operatorname{tr}\left(Y_KM\left(X_{K,\Delta}'(0), \Delta\right) + \begin{bmatrix} 0 & 0 \\ 0 & C_K^TR\Delta_{C_K} + \Delta_{C_K}^TRC_K \end{bmatrix}X_{K,\Delta}'(0) + \begin{bmatrix} 0 & 0 \\ 0 & \Delta_{C_K}^TR\Delta_{C_K} \end{bmatrix}X_K\right) \\ &= 2\operatorname{tr}\left(2\begin{bmatrix} B & 0 \\ 0 & I \end{bmatrix}\Delta\begin{bmatrix} C & 0 \\ 0 & I \end{bmatrix}X_{K,\Delta}'(0)Y_K + 2\begin{bmatrix} 0 & 0 \\ 0 & C_K^TR\Delta_{C_K} \end{bmatrix}X_{K,\Delta}'(0) + \begin{bmatrix} 0 & 0 \\ 0 & \Delta_{B_K}V\Delta_{B_K}^T \end{bmatrix}Y_K + \begin{bmatrix} 0 & 0 \\ 0 & \Delta_{C_K}^TR\Delta_{C_K} \end{bmatrix}X_K\right). \end{aligned}$$
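The second-order expression can be validated the same way with a second-order central difference; again the data are toy choices of our own, and $X'(0)$ is obtained from the Lyapunov equation (B.10):

```python
import numpy as np

def lyap(a, m):
    # solve a X + X a^T + m = 0 via vectorization
    n = a.shape[0]
    lhs = np.kron(np.eye(n), a) + np.kron(a, np.eye(n))
    return np.linalg.solve(lhs, -m.flatten(order="F")).reshape((n, n), order="F")

def blkdiag(m1, m2):
    zz = np.zeros((1, 1))
    return np.block([[m1, zz], [zz, m2]])

A, B, C = np.array([[0.0]]), np.array([[2.0]]), np.array([[1.0]])
Q, R, W, V = 2.0 * np.eye(1), np.eye(1), np.eye(1), 3.0 * np.eye(1)
Ak, Bk, Ck = np.array([[-2.0]]), np.array([[1.0]]), np.array([[-1.0]])
DB, DC = blkdiag(B, np.eye(1)), blkdiag(C, np.eye(1))
A0 = blkdiag(A, np.zeros((1, 1)))
z = np.zeros((1, 1))

def J(ak, bk, ck):
    acl = A0 + DB @ np.block([[z, ck], [bk, ak]]) @ DC
    x = lyap(acl, blkdiag(W, bk @ V @ bk.T))
    return np.trace(blkdiag(Q, ck.T @ R @ ck) @ x)

dA, dB_, dC = np.array([[0.1]]), np.array([[-0.2]]), np.array([[0.3]])
Delta = np.block([[z, dC], [dB_, dA]])
Acl = A0 + DB @ np.block([[z, Ck], [Bk, Ak]]) @ DC
Xk = lyap(Acl, blkdiag(W, Bk @ V @ Bk.T))
Yk = lyap(Acl.T, blkdiag(Q, Ck.T @ R @ Ck))

M1 = DB @ Delta @ DC @ Xk + Xk @ DC.T @ Delta.T @ DB.T \
     + blkdiag(z, Bk @ V @ dB_.T + dB_ @ V @ Bk.T)
Xp = lyap(Acl, M1)   # X'(0) from (B.10)
M2 = DB @ Delta @ DC @ Xp + Xp @ DC.T @ Delta.T @ DB.T + blkdiag(z, dB_ @ V @ dB_.T)
analytic2 = 2 * (np.trace(Yk @ M2)
                 + np.trace(blkdiag(z, Ck.T @ R @ dC + dC.T @ R @ Ck) @ Xp)
                 + np.trace(blkdiag(z, dC.T @ R @ dC) @ Xk))

t = 1e-3
fd2 = (J(Ak + t * dA, Bk + t * dB_, Ck + t * dC) - 2 * J(Ak, Bk, Ck)
       + J(Ak - t * dA, Bk - t * dB_, Ck - t * dC)) / t**2
assert abs(fd2 - analytic2) < 1e-3
```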
Hess K : V q × V q → R denote the bilinear form of the Hessian of J q at K ∈ C q .Then one can compute Hess K (∆ , ∆ ) for any ∆ , ∆ ∈ V q by noting that Hess K (∆ , ∆ ) = 14 (Hess K (∆ + ∆ , ∆ + ∆ ) − Hess K (∆ − ∆ , ∆ − ∆ )) , and that Hess K (∆ , ∆) = d J q ( K + t ∆) dt (cid:12)(cid:12)(cid:12)(cid:12) t =0 for any ∆ ∈ V q . B.5 Proof of Lemma 4.5
By Lemma A.1, given a stable matrix $A$, if $(C, A)$ is observable, then the solution $L$ to the Lyapunov equation
$$A^TL + LA + C^TC = 0$$
is positive definite. Therefore, we only need to prove that the pair
$$\left(\begin{bmatrix} Q^{1/2} & 0 \\ 0 & R^{1/2}C_K \end{bmatrix}, \begin{bmatrix} A & BC_K \\ B_KC & A_K \end{bmatrix}\right)$$
is observable, and this is equivalent to showing that the eigenvalues of the matrix
$$\begin{bmatrix} A & BC_K \\ B_KC & A_K \end{bmatrix} + \begin{bmatrix} L_1 & L_2 \\ L_3 & L_4 \end{bmatrix}\begin{bmatrix} Q^{1/2} & 0 \\ 0 & R^{1/2}C_K \end{bmatrix} = \begin{bmatrix} A + L_1Q^{1/2} & BC_K + L_2R^{1/2}C_K \\ B_KC + L_3Q^{1/2} & A_K + L_4R^{1/2}C_K \end{bmatrix}$$
can be arbitrarily assigned by choosing $L_1, L_2, L_3, L_4$. This is true by choosing $L_2 = -BR^{-1/2}$, which makes the matrix block lower triangular, and observing that the eigenvalues of $A + L_1Q^{1/2}$ and $A_K + L_4R^{1/2}C_K$ can then be arbitrarily assigned, since $(Q^{1/2}, A)$ and $(C_K, A_K)$ are both observable (note that $(R^{1/2}C_K, A_K)$ is observable because $R \succ 0$).
Thus, by Lemma A.1, the solution $Y_K$ to (12b) is positive definite. Similarly, we can prove that $X_K$ is positive definite.

B.6 Proof of Proposition 4.1

We have already seen that $\mathcal{T}_q$ gives a smooth Lie group action of $\mathrm{GL}_q$ on $\mathcal{C}_q$. We first show that the isotropy group of $K$ under this group action, defined by
$$\{T \in \mathrm{GL}_q \mid \mathcal{T}_q(T, K) = K\},$$
is a trivial group containing only the identity matrix. Let $T \in \mathrm{GL}_q$ satisfy $\mathcal{T}_q(T, K) = K$, i.e.,
$$\begin{bmatrix} 0 & C_KT^{-1} \\ TB_K & TA_KT^{-1} \end{bmatrix} = \begin{bmatrix} 0 & C_K \\ B_K & A_K \end{bmatrix}.$$
Then we have
$$TA_K = A_KT, \qquad \text{and consequently} \qquad TA_K^{j+1}B_K = A_KTA_K^jB_K.$$
By mathematical induction, starting from $TB_K = B_K$, we can see that $TA_K^jB_K = A_K^jB_K$ for all $j = 0, \ldots, q-1$, indicating that every column vector of $A_K^jB_K$ is an eigenvector of $T$ with eigenvalue $1$. On the other hand, the controllability of $K$ implies that the column vectors of the matrix
$$\begin{bmatrix} B_K & A_KB_K & \cdots & A_K^{q-1}B_K \end{bmatrix}$$
span the whole space $\mathbb{R}^q$. Therefore $\mathbb{R}^q$ is a subspace of the eigenspace of $T$ associated with eigenvalue $1$, meaning that $T$ is just the identity matrix.

Since the isotropy group $\{T \in \mathrm{GL}_q \mid \mathcal{T}_q(T, K) = K\}$ only contains the identity, by [39, Proposition 7.26], the mapping $T \mapsto \mathcal{T}_q(T, K)$ is an immersion and the orbit $\mathcal{O}_K$ is an immersed submanifold.

We then prove that $\mathcal{O}_K$ is closed under the original topology of $\mathcal{C}_q$. Suppose $(T_j)_{j=1}^\infty$ is a sequence in $\mathrm{GL}_q$ such that
$$\mathcal{T}_q(T_j, K) = \begin{bmatrix} 0 & C_KT_j^{-1} \\ T_jB_K & T_jA_KT_j^{-1} \end{bmatrix} \to \begin{bmatrix} 0 & \tilde{C}_K \\ \tilde{B}_K & \tilde{A}_K \end{bmatrix} = \tilde{K}, \qquad j \to \infty.$$
Let $G(s)$ be the transfer function of $K$, i.e., $G(s) = C_K(sI - A_K)^{-1}B_K$. We notice that for any $j \geq 1$, the matrix $sI - T_jA_KT_j^{-1}$ is invertible if and only if $sI - A_K$ is invertible. Thus for any fixed $s \in \mathbb{C}$ such that $sI - A_K$ is invertible, we have
$$\lim_{j \to \infty}C_KT_j^{-1}\left(sI - T_jA_KT_j^{-1}\right)^{-1}T_jB_K = \tilde{C}_K(sI - \tilde{A}_K)^{-1}\tilde{B}_K.$$
On the other hand, we simply have
$$C_KT_j^{-1}\left(sI - T_jA_KT_j^{-1}\right)^{-1}T_jB_K = C_K(sI - A_K)^{-1}B_K = G(s).$$
This shows that the transfer function of $\tilde{K}$ agrees with $G(s)$ for any $s \in \mathbb{C}$ such that $sI - A_K$ is invertible, and thus is equal to $G(s)$. On the other hand, the controllability and observability of $K \in \mathcal{C}_q$ indicate that the transfer function $G(s)$ has order $q$, and so any two state-space realizations of $G(s)$ with order $q$ are similarity transformations of each other (see Theorem A.1). In other words, there exists $\tilde{T} \in \mathrm{GL}_q$ such that
$$\tilde{K} = \begin{bmatrix} 0 & \tilde{C}_K \\ \tilde{B}_K & \tilde{A}_K \end{bmatrix} = \begin{bmatrix} 0 & C_K\tilde{T}^{-1} \\ \tilde{T}B_K & \tilde{T}A_K\tilde{T}^{-1} \end{bmatrix} = \mathcal{T}_q(\tilde{T}, K),$$
and hence $\tilde{K} \in \mathcal{O}_K$. We can now conclude that $\mathcal{O}_K$ is a closed subset of $\mathcal{C}_q$. As a consequence of the closedness of $\mathcal{O}_K$, the set $\mathcal{O}_K$ equipped with the subspace topology induced from $\mathcal{C}_q$ is a locally compact Hausdorff space.

Now, by combining the above results and applying [48, Theorem 2.13], we can conclude that the mapping $T \mapsto \mathcal{T}_q(T, K)$ is a homeomorphism from $\mathrm{GL}_q$ to $\mathcal{O}_K$. Therefore, the mapping $T \mapsto \mathcal{T}_q(T, K)$ is a diffeomorphism from $\mathrm{GL}_q$ to $\mathcal{O}_K$, and $\mathcal{O}_K$ is an embedded submanifold of $\mathcal{C}_q$ with dimension given by
$$\dim\mathcal{O}_K = \dim\mathrm{GL}_q = q^2.$$
Finally, the two path-connected components of $\mathcal{O}_K$ are immediate.

B.7 Proof of Theorem 4.2
We first show that $K^\star$ is a stationary point of $J_n(K)$ over $K \in \mathcal{C}_n$. Since $\mathcal{T}_n(-I_n, K^\star) = K^\star$, by Lemma 4.4, we have
$$\nabla J_n|_{K^\star} = \nabla J_n|_{\mathcal{T}_n(-I_n, K^\star)} = \begin{bmatrix} I_m & 0 \\ 0 & -I_n \end{bmatrix} \cdot \nabla J_n|_{K^\star} \cdot \begin{bmatrix} I_p & 0 \\ 0 & -I_n \end{bmatrix}.$$
This equality implies that, excluding the bottom right $n \times n$ block, the last $n$ rows and the last $n$ columns of $\nabla J_n|_{K^\star}$ are zero. On the other hand, it is not hard to see that $J_n(K^\star)$ does not depend on the choice of $\Lambda$ as long as $\Lambda$ is stable. Therefore the bottom right $n \times n$ block of $\nabla J_n|_{K^\star}$ is zero. We can now see that $\nabla J_n|_{K^\star} = 0$, showing that $K^\star$ is a stationary point of $J_n$.

Let $\Delta = \begin{bmatrix} 0 & \Delta_{C_K} \\ \Delta_{B_K} & \Delta_{A_K} \end{bmatrix} \in \mathcal{V}_n$ be arbitrary, and let
$$\Delta^{(1)} = \begin{bmatrix} 0 & \Delta_{C_K} \\ 0 & 0 \end{bmatrix}, \qquad \Delta^{(2)} = \begin{bmatrix} 0 & 0 \\ \Delta_{B_K} & 0 \end{bmatrix}, \qquad \Delta^{(3)} = \begin{bmatrix} 0 & 0 \\ 0 & \Delta_{A_K} \end{bmatrix}.$$
By the bilinearity of the Hessian, we have
$$\operatorname{Hess}_{K^\star}(\Delta, \Delta) = \sum_{1 \leq i,j \leq 3}\operatorname{Hess}_{K^\star}\left(\Delta^{(i)}, \Delta^{(j)}\right) = \operatorname{Hess}_{K^\star}\left(\Delta^{(1)} + \Delta^{(2)}, \Delta^{(1)} + \Delta^{(2)}\right),$$
since the terms involving $\Delta^{(3)}$ vanish. If $\operatorname{Hess}_{K^\star}(\Delta, \Delta) = 0$ for all $\Delta \in \mathcal{V}_n$, then the Hessian $\operatorname{Hess}_{K^\star}$ is obviously zero. Otherwise, $\operatorname{Hess}_{K^\star}(\Delta, \Delta) \neq 0$ for some $\Delta \in \mathcal{V}_n$, which implies that
$$\operatorname{Hess}_{K^\star}\left(\Delta^{(1)}, \Delta^{(2)}\right) = \frac{1}{2}\left(\operatorname{Hess}_{K^\star}\left(\Delta^{(1)} + \Delta^{(2)}, \Delta^{(1)} + \Delta^{(2)}\right) - \operatorname{Hess}_{K^\star}\left(\Delta^{(1)}, \Delta^{(1)}\right) - \operatorname{Hess}_{K^\star}\left(\Delta^{(2)}, \Delta^{(2)}\right)\right) = \frac{1}{2}\operatorname{Hess}_{K^\star}(\Delta, \Delta) \neq 0.$$
Note that $\Delta^{(1)}$ and $\Delta^{(2)}$ are linearly independent (otherwise $\operatorname{Hess}_{K^\star}(\Delta^{(1)}, \Delta^{(2)})$ would be zero). Together with $\operatorname{Hess}_{K^\star}(\Delta^{(i)}, \Delta^{(i)}) = 0$ for $i = 1, 2$, we see that $\operatorname{Hess}_{K^\star}$ must be indefinite (a symmetric matrix having a $2 \times 2$ principal submatrix with zero diagonal entries and nonzero off-diagonal entries must be indefinite).

Now we proceed to the situation where $\Lambda$ is diagonalizable. We will use $e_i^{(k)}$ to denote the $k$-dimensional vector whose $i$-th entry is $1$ and whose other entries are zero.

Part I: $\operatorname{eig}(-\Lambda) \not\subseteq Z \implies$ the Hessian is indefinite. Let $\lambda \in \operatorname{eig}(-\Lambda)\setminus Z$. Since $\lambda \notin Z$, there exist some $i, j$ such that
$$G(\lambda) := e_i^{(p)T}CX_{\mathrm{op}}\left(\lambda I - A^T\right)^{-1}Y_{\mathrm{op}}Be_j^{(m)} \neq 0.$$
We consider three situations:

1) $\lambda$ is real. In this case, let $T$ be a real invertible matrix such that
$$T\Lambda T^{-1} = \begin{bmatrix} -\lambda & 0 \\ 0 & * \end{bmatrix}.$$
Let $\Delta^{(1)}, \Delta^{(2)} \in \mathcal{V}_n$ be given by
$$\Delta^{(1)} = \begin{bmatrix} 0 & \Delta_{C_K}^{(1)} \\ 0 & 0 \end{bmatrix}, \qquad \Delta^{(2)} = \begin{bmatrix} 0 & 0 \\ \Delta_{B_K}^{(2)} & 0 \end{bmatrix}, \qquad \text{where } \Delta_{C_K}^{(1)} = e_j^{(m)}e_1^{(n)T}T, \quad \Delta_{B_K}^{(2)} = T^{-1}e_1^{(n)}e_i^{(p)T}.$$
Then it is not hard to see that $J_n(K^\star + t\Delta^{(1)}) = J_n(K^\star + t\Delta^{(2)}) = J_n(K^\star)$ for any sufficiently small $t$, indicating that
$$\operatorname{Hess}_{K^\star}\left(\Delta^{(1)}, \Delta^{(1)}\right) = \operatorname{Hess}_{K^\star}\left(\Delta^{(2)}, \Delta^{(2)}\right) = 0.$$
On the other hand, the unique solutions to the Lyapunov equations (12a) and (12b) at $K^\star$ are
$$X_{K^\star} = \begin{bmatrix} X_{\mathrm{op}} & 0 \\ 0 & 0 \end{bmatrix}, \qquad Y_{K^\star} = \begin{bmatrix} Y_{\mathrm{op}} & 0 \\ 0 & 0 \end{bmatrix}.$$
By Lemma 4.3, we can see that
$$\operatorname{Hess}_{K^\star}\left(\Delta^{(1)} + \Delta^{(2)}, \Delta^{(1)} + \Delta^{(2)}\right) = 4\operatorname{tr}\left(\begin{bmatrix} 0 & B\Delta_{C_K}^{(1)} \\ \Delta_{B_K}^{(2)}C & 0 \end{bmatrix}X_{K^\star,\Delta^{(1)}+\Delta^{(2)}}'\begin{bmatrix} Y_{\mathrm{op}} & 0 \\ 0 & 0 \end{bmatrix}\right),$$
where $X_{K^\star,\Delta^{(1)}+\Delta^{(2)}}'$ is the solution to the following Lyapunov equation:
$$\begin{bmatrix} A & 0 \\ 0 & \Lambda \end{bmatrix}X_{K^\star,\Delta^{(1)}+\Delta^{(2)}}' + X_{K^\star,\Delta^{(1)}+\Delta^{(2)}}'\begin{bmatrix} A & 0 \\ 0 & \Lambda \end{bmatrix}^T + \begin{bmatrix} 0 & B\Delta_{C_K}^{(1)} \\ \Delta_{B_K}^{(2)}C & 0 \end{bmatrix}\begin{bmatrix} X_{\mathrm{op}} & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} X_{\mathrm{op}} & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 0 & B\Delta_{C_K}^{(1)} \\ \Delta_{B_K}^{(2)}C & 0 \end{bmatrix}^T = 0.$$
Since
$$\begin{bmatrix} 0 & B\Delta_{C_K}^{(1)} \\ \Delta_{B_K}^{(2)}C & 0 \end{bmatrix}\begin{bmatrix} X_{\mathrm{op}} & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} X_{\mathrm{op}} & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 0 & B\Delta_{C_K}^{(1)} \\ \Delta_{B_K}^{(2)}C & 0 \end{bmatrix}^T = \begin{bmatrix} 0 & X_{\mathrm{op}}C^T\Delta_{B_K}^{(2)T} \\ \Delta_{B_K}^{(2)}CX_{\mathrm{op}} & 0 \end{bmatrix},$$
the matrix $X_{K^\star,\Delta^{(1)}+\Delta^{(2)}}'$ can be represented by
$$X_{K^\star,\Delta^{(1)}+\Delta^{(2)}}' = \int_0^{+\infty}\begin{bmatrix} e^{As} & 0 \\ 0 & e^{\Lambda s} \end{bmatrix}\begin{bmatrix} 0 & X_{\mathrm{op}}C^T\Delta_{B_K}^{(2)T} \\ \Delta_{B_K}^{(2)}CX_{\mathrm{op}} & 0 \end{bmatrix}\begin{bmatrix} e^{A^Ts} & 0 \\ 0 & e^{\Lambda^Ts} \end{bmatrix}ds = \int_0^{+\infty}\begin{bmatrix} 0 & e^{As}X_{\mathrm{op}}C^T\Delta_{B_K}^{(2)T}e^{\Lambda^Ts} \\ e^{\Lambda s}\Delta_{B_K}^{(2)}CX_{\mathrm{op}}e^{A^Ts} & 0 \end{bmatrix}ds.$$
Therefore
$$\operatorname{Hess}_{K^\star}\left(\Delta^{(1)} + \Delta^{(2)}, \Delta^{(1)} + \Delta^{(2)}\right) = 4\int_0^{+\infty}\operatorname{tr}\left(B\Delta_{C_K}^{(1)}e^{\Lambda s}\Delta_{B_K}^{(2)}CX_{\mathrm{op}}e^{A^Ts}Y_{\mathrm{op}}\right)ds.$$
By the construction of $\Delta_{C_K}^{(1)}$ and $\Delta_{B_K}^{(2)}$, we can see that
$$\Delta_{C_K}^{(1)}e^{\Lambda s}\Delta_{B_K}^{(2)} = e^{-\lambda s}e_j^{(m)}e_i^{(p)T}.$$
Thus
$$
\operatorname{Hess}_{K^\star}(\Delta^{(1)}+\Delta^{(2)},\Delta^{(1)}+\Delta^{(2)})
=4\int_0^{+\infty} e^{(p)\mathsf T}_i CX_{\mathrm{op}}e^{(A^{\mathsf T}-\lambda I)s}Y_{\mathrm{op}}B e^{(m)}_j\,ds
=4\,e^{(p)\mathsf T}_i CX_{\mathrm{op}}\big(\lambda I-A^{\mathsf T}\big)^{-1}Y_{\mathrm{op}}B e^{(m)}_j
=4G(\lambda),
$$
which is nonzero by assumption. Consequently,
$$
\operatorname{Hess}_{K^\star}(\Delta^{(1)},\Delta^{(2)})
=\tfrac{1}{2}\Big(\operatorname{Hess}_{K^\star}(\Delta^{(1)}{+}\Delta^{(2)},\Delta^{(1)}{+}\Delta^{(2)})-\operatorname{Hess}_{K^\star}(\Delta^{(1)},\Delta^{(1)})-\operatorname{Hess}_{K^\star}(\Delta^{(2)},\Delta^{(2)})\Big)=2G(\lambda)\neq 0.
$$
Together with $\operatorname{Hess}_{K^\star}(\Delta^{(1)},\Delta^{(1)})=\operatorname{Hess}_{K^\star}(\Delta^{(2)},\Delta^{(2)})=0$, we can see that neither $\operatorname{Hess}_{K^\star}$ nor $-\operatorname{Hess}_{K^\star}$ can be positive semidefinite. Thus $\operatorname{Hess}_{K^\star}$ has at least one positive eigenvalue and one negative eigenvalue.

2) $\lambda=\lambda_{\mathrm{re}}+\mathrm{i}\lambda_{\mathrm{im}}$ is not real, and $G(\lambda)$ is not purely imaginary. In this case, since $\Lambda$ is real, the complex conjugate of $\lambda$, which we denote by $\bar\lambda$, is also an element of $\operatorname{eig}(-\Lambda)$. We can find a real invertible matrix $T$ such that
$$T\Lambda T^{-1}=\begin{bmatrix}-\lambda_{\mathrm{re}} & -\lambda_{\mathrm{im}} & 0\\ \lambda_{\mathrm{im}} & -\lambda_{\mathrm{re}} & 0\\ 0 & 0 & *\end{bmatrix}.$$
We still let $\Delta^{(1)},\Delta^{(2)}\in\mathcal{V}_n$ be given by
$$\Delta^{(1)}=\begin{bmatrix}0 & \Delta^{(1)}_{C_K}\\ 0&0\end{bmatrix},\qquad \Delta^{(2)}=\begin{bmatrix}0&0\\ \Delta^{(2)}_{B_K}&0\end{bmatrix},\qquad \Delta^{(1)}_{C_K}=e^{(m)}_j e^{(n)\mathsf T}_1 T,\quad \Delta^{(2)}_{B_K}=T^{-1}e^{(n)}_1 e^{(p)\mathsf T}_i.$$
Then, similarly to the previous situation, we have $\operatorname{Hess}_{K^\star}(\Delta^{(1)},\Delta^{(1)})=\operatorname{Hess}_{K^\star}(\Delta^{(2)},\Delta^{(2)})=0$, and
$$\operatorname{Hess}_{K^\star}(\Delta^{(1)}+\Delta^{(2)},\Delta^{(1)}+\Delta^{(2)})=4\int_0^{+\infty}\operatorname{tr}\left(B\Delta^{(1)}_{C_K}e^{\Lambda s}\Delta^{(2)}_{B_K}CX_{\mathrm{op}}e^{A^{\mathsf T}s}Y_{\mathrm{op}}\right)ds.$$
By the construction of $\Delta^{(1)}_{C_K}$ and $\Delta^{(2)}_{B_K}$, we have
$$\Delta^{(1)}_{C_K}e^{\Lambda s}\Delta^{(2)}_{B_K}=e^{-\lambda_{\mathrm{re}}s}\cos(\lambda_{\mathrm{im}}s)\,e^{(m)}_j e^{(p)\mathsf T}_i=\frac{e^{-\lambda s}+e^{-\bar\lambda s}}{2}\,e^{(m)}_j e^{(p)\mathsf T}_i,$$
and therefore
$$\operatorname{Hess}_{K^\star}(\Delta^{(1)}+\Delta^{(2)},\Delta^{(1)}+\Delta^{(2)})
=2\left(\int_0^{+\infty}e^{(p)\mathsf T}_i CX_{\mathrm{op}}e^{(A^{\mathsf T}-\lambda I)s}Y_{\mathrm{op}}Be^{(m)}_j\,ds+\int_0^{+\infty}e^{(p)\mathsf T}_i CX_{\mathrm{op}}e^{(A^{\mathsf T}-\bar\lambda I)s}Y_{\mathrm{op}}Be^{(m)}_j\,ds\right)
=2\big(G(\lambda)+G(\bar\lambda)\big),$$
and since $G(\lambda)$ is not purely imaginary, $G(\lambda)+G(\bar\lambda)=2\operatorname{Re}G(\lambda)\neq 0$, so $\operatorname{Hess}_{K^\star}(\Delta^{(1)}{+}\Delta^{(2)},\Delta^{(1)}{+}\Delta^{(2)})\neq 0$.
Consequently, $\operatorname{Hess}_{K^\star}(\Delta^{(1)},\Delta^{(2)})\neq 0$, and together with the fact that $\operatorname{Hess}_{K^\star}(\Delta^{(1)},\Delta^{(1)})=\operatorname{Hess}_{K^\star}(\Delta^{(2)},\Delta^{(2)})=0$, we can conclude that $\operatorname{Hess}_{K^\star}$ has at least one positive eigenvalue and one negative eigenvalue.

3) $\lambda=\lambda_{\mathrm{re}}+\mathrm{i}\lambda_{\mathrm{im}}$ is not real, and $G(\lambda)$ is purely imaginary. In this case, we can still find a real invertible matrix $T$ such that
$$T\Lambda T^{-1}=\begin{bmatrix}-\lambda_{\mathrm{re}} & -\lambda_{\mathrm{im}} & 0\\ \lambda_{\mathrm{im}} & -\lambda_{\mathrm{re}} & 0\\ 0&0&*\end{bmatrix}.$$
We now let $\Delta^{(1)},\Delta^{(2)}\in\mathcal{V}_n$ be given by
$$\Delta^{(1)}=\begin{bmatrix}0&\Delta^{(1)}_{C_K}\\0&0\end{bmatrix},\qquad \Delta^{(2)}=\begin{bmatrix}0&0\\\Delta^{(2)}_{B_K}&0\end{bmatrix},\qquad \Delta^{(1)}_{C_K}=e^{(m)}_j e^{(n)\mathsf T}_1 T,\quad \Delta^{(2)}_{B_K}=T^{-1}e^{(n)}_2 e^{(p)\mathsf T}_i.$$
Again we have $\operatorname{Hess}_{K^\star}(\Delta^{(1)},\Delta^{(1)})=\operatorname{Hess}_{K^\star}(\Delta^{(2)},\Delta^{(2)})=0$, and
$$\operatorname{Hess}_{K^\star}(\Delta^{(1)}+\Delta^{(2)},\Delta^{(1)}+\Delta^{(2)})=4\int_0^{+\infty}\operatorname{tr}\big(B\Delta^{(1)}_{C_K}e^{\Lambda s}\Delta^{(2)}_{B_K}CX_{\mathrm{op}}e^{A^{\mathsf T}s}Y_{\mathrm{op}}\big)ds.$$
By the construction of $\Delta^{(1)}_{C_K}$ and $\Delta^{(2)}_{B_K}$, we have
$$\Delta^{(1)}_{C_K}e^{\Lambda s}\Delta^{(2)}_{B_K}=e^{-\lambda_{\mathrm{re}}s}\sin(-\lambda_{\mathrm{im}}s)\,e^{(m)}_j e^{(p)\mathsf T}_i=\frac{e^{-\lambda s}-e^{-\bar\lambda s}}{2\mathrm{i}}\,e^{(m)}_j e^{(p)\mathsf T}_i,$$
and therefore
$$\operatorname{Hess}_{K^\star}(\Delta^{(1)}+\Delta^{(2)},\Delta^{(1)}+\Delta^{(2)})=\frac{2}{\mathrm{i}}\left(\int_0^{+\infty}e^{(p)\mathsf T}_iCX_{\mathrm{op}}e^{(A^{\mathsf T}-\lambda I)s}Y_{\mathrm{op}}Be^{(m)}_j\,ds-\int_0^{+\infty}e^{(p)\mathsf T}_iCX_{\mathrm{op}}e^{(A^{\mathsf T}-\bar\lambda I)s}Y_{\mathrm{op}}Be^{(m)}_j\,ds\right)=\frac{2}{\mathrm{i}}\big(G(\lambda)-G(\bar\lambda)\big),$$
and since $G(\lambda)$ has a nonzero imaginary part, $G(\lambda)-G(\bar\lambda)=2\mathrm{i}\operatorname{Im}G(\lambda)\neq0$, so $\operatorname{Hess}_{K^\star}(\Delta^{(1)}+\Delta^{(2)},\Delta^{(1)}+\Delta^{(2)})\neq 0$. Consequently, $\operatorname{Hess}_{K^\star}(\Delta^{(1)},\Delta^{(2)})\neq 0$, and together with the fact that $\operatorname{Hess}_{K^\star}(\Delta^{(1)},\Delta^{(1)})=\operatorname{Hess}_{K^\star}(\Delta^{(2)},\Delta^{(2)})=0$, we can conclude that $\operatorname{Hess}_{K^\star}$ has at least one positive eigenvalue and one negative eigenvalue.

Part II: $\operatorname{eig}(-\Lambda)\subseteq Z\implies$ the Hessian is zero. In this part, we will show that $\operatorname{Hess}_{K^\star}(\Delta,\Delta)=0$ for any $\Delta\in\mathcal{V}_n$. Let
$$\Delta=\begin{bmatrix}0&\Delta_{C_K}\\\Delta_{B_K}&\Delta_{A_K}\end{bmatrix}\in\mathcal{V}_n$$
be arbitrary, and let
$$\Delta^{(1)}=\begin{bmatrix}0&\Delta_{C_K}\\0&0\end{bmatrix},\qquad \Delta^{(2)}=\begin{bmatrix}0&0\\\Delta_{B_K}&0\end{bmatrix},\qquad \Delta^{(3)}=\begin{bmatrix}0&0\\0&\Delta_{A_K}\end{bmatrix}.$$
We have already shown that $\operatorname{Hess}_{K^\star}(\Delta,\Delta)=\operatorname{Hess}_{K^\star}(\Delta^{(1)}+\Delta^{(2)},\Delta^{(1)}+\Delta^{(2)})$. Let $T$ be an invertible $n\times n$ (complex) matrix that diagonalizes $\Lambda$ as
$$T\Lambda T^{-1}=\begin{bmatrix}-\lambda_1 & & \\ & \ddots & \\ & & -\lambda_n\end{bmatrix}.$$
Define
$$U_{ik}=e^{(m)}_i e^{(n)\mathsf T}_k T,\qquad V_{jk}=T^{-1}e^{(n)}_k e^{(p)\mathsf T}_j$$
for $1\le i\le m$, $1\le j\le p$ and $1\le k\le n$. It is not hard to see that $\{U_{ik}\mid 1\le i\le m,\ 1\le k\le n\}$ forms a basis of $\mathbb{C}^{m\times n}$, and $\{V_{jk}\mid 1\le j\le p,\ 1\le k\le n\}$ forms a basis of $\mathbb{C}^{n\times p}$. Therefore $\Delta_{C_K}$ and $\Delta_{B_K}$ can be expanded as
$$\Delta_{C_K}=\sum_{1\le i\le m}\sum_{1\le k\le n}\alpha_{ik}U_{ik},\qquad \Delta_{B_K}=\sum_{1\le j\le p}\sum_{1\le k\le n}\beta_{jk}V_{jk}.$$
By similar derivations as in Case 1, we can get
$$\operatorname{Hess}_{K^\star}(\Delta^{(1)}+\Delta^{(2)},\Delta^{(1)}+\Delta^{(2)})=4\int_0^{+\infty}\operatorname{tr}\big(B\Delta_{C_K}e^{\Lambda s}\Delta_{B_K}CX_{\mathrm{op}}e^{A^{\mathsf T}s}Y_{\mathrm{op}}\big)ds.$$
Then, since
$$\Delta_{C_K}e^{\Lambda s}\Delta_{B_K}=\sum_{i,j,k,k'}\alpha_{ik}\beta_{jk'}\,U_{ik}e^{\Lambda s}V_{jk'}
=\sum_{i,j,k,k'}\alpha_{ik}\beta_{jk'}\,e^{(m)}_i e^{(n)\mathsf T}_k\begin{bmatrix}e^{-\lambda_1 s}&&\\&\ddots&\\&&e^{-\lambda_n s}\end{bmatrix}e^{(n)}_{k'}e^{(p)\mathsf T}_j
=\sum_{i,j,k}\alpha_{ik}\beta_{jk}\,e^{-\lambda_k s}\,e^{(m)}_i e^{(p)\mathsf T}_j,$$
we have
$$\operatorname{Hess}_{K^\star}(\Delta^{(1)}+\Delta^{(2)},\Delta^{(1)}+\Delta^{(2)})
=\sum_{i,j,k}4\alpha_{ik}\beta_{jk}\int_0^{+\infty}e^{(p)\mathsf T}_jCX_{\mathrm{op}}e^{(A^{\mathsf T}-\lambda_k I)s}Y_{\mathrm{op}}Be^{(m)}_i\,ds
=\sum_{i,j,k}4\alpha_{ik}\beta_{jk}\,e^{(p)\mathsf T}_jCX_{\mathrm{op}}\big(\lambda_k I-A^{\mathsf T}\big)^{-1}Y_{\mathrm{op}}Be^{(m)}_i.$$
Since $\operatorname{eig}(-\Lambda)\setminus Z=\emptyset$, we can see that $CX_{\mathrm{op}}\big(\lambda_k I-A^{\mathsf T}\big)^{-1}Y_{\mathrm{op}}B=0$ for every $1\le k\le n$. Therefore
$$\operatorname{Hess}_{K^\star}(\Delta,\Delta)=\operatorname{Hess}_{K^\star}(\Delta^{(1)}+\Delta^{(2)},\Delta^{(1)}+\Delta^{(2)})=0,$$
which completes the proof.
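The linear-algebra fact used in Part I, that a symmetric matrix with a $2\times 2$ principal submatrix having zero diagonal entries and a nonzero off-diagonal entry must be indefinite, follows from eigenvalue interlacing and is easy to check numerically. A minimal sketch (the matrix entries below are hypothetical, chosen only for illustration):

```python
import numpy as np

# Symmetric matrix whose leading 2x2 principal submatrix is [[0, c], [c, 0]]
# with c != 0. That submatrix has eigenvalues +c and -c, so by Cauchy
# interlacing the full matrix has eigenvalues of both signs: indefinite.
c = 2.0
H = np.array([[0.0,  c,   0.3],
              [c,    0.0, -0.1],
              [0.3, -0.1,  1.0]])   # hypothetical symmetric "Hessian"
eigs = np.linalg.eigvalsh(H)
assert eigs.min() < 0 < eigs.max()  # curvature of both signs
```

This is exactly the situation produced by $\operatorname{Hess}_{K^\star}(\Delta^{(i)},\Delta^{(i)})=0$ together with $\operatorname{Hess}_{K^\star}(\Delta^{(1)},\Delta^{(2)})\neq 0$.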
B.8 Proof of Theorem 4.4

For each $\epsilon>0$, let the closed-loop system matrix be denoted by $A_{\mathrm{cl}}(\epsilon)$, and let
$$M_{W,V}(\epsilon)=\begin{bmatrix}W&0\\0&B^*_KV B^{*\mathsf T}_K\end{bmatrix},\qquad
M_{Q,R}(\epsilon)=\begin{bmatrix}Q&0\\0&C^{*\mathsf T}_KRC^*_K\end{bmatrix},$$
whose entries are determined by the controller $K^*$ constructed in Theorem 4.4 and depend polynomially on $\epsilon$. Let $X_{K^*}(\epsilon)$ and $Y_{K^*}(\epsilon)$ denote the solutions to the Lyapunov equations
$$A_{\mathrm{cl}}(\epsilon)X_{K^*}(\epsilon)+X_{K^*}(\epsilon)A_{\mathrm{cl}}(\epsilon)^{\mathsf T}+M_{W,V}(\epsilon)=0,\qquad
A_{\mathrm{cl}}(\epsilon)^{\mathsf T}Y_{K^*}(\epsilon)+Y_{K^*}(\epsilon)A_{\mathrm{cl}}(\epsilon)+M_{Q,R}(\epsilon)=0.$$
By Lemma B.1, we can compute the Taylor expansions of $X_{K^*}(\epsilon)$ and $Y_{K^*}(\epsilon)$ around $\epsilon=0$, which take the form
$$X_{K^*}(\epsilon)=X^{(0)}+X^{(1)}\epsilon+o(\epsilon),\qquad
Y_{K^*}(\epsilon)=Y^{(0)}+Y^{(1)}\epsilon+Y^{(2)}\epsilon^2+o(\epsilon^2),$$
with explicitly computable constant coefficient matrices. Next, we let $M^{(0)}_1(\epsilon)$ denote the matrix $M(X_{K^*},\Delta^{(0)})$ in Lemma 4.3 associated with the perturbation direction $\Delta^{(0)}$, and let $X'^{(0)}_{K^*}(\epsilon)$ denote the solution to the Lyapunov equation
$$A_{\mathrm{cl}}(\epsilon)X'^{(0)}_{K^*}(\epsilon)+X'^{(0)}_{K^*}(\epsilon)A_{\mathrm{cl}}(\epsilon)^{\mathsf T}+M^{(0)}_1(\epsilon)=0.$$
Then, similarly by Lemma B.1, we can compute the Taylor expansion of $X'^{(0)}_{K^*}(\epsilon)$ around $\epsilon=0$. By Lemma 4.3, we then have
$$\operatorname{Hess}_{K^*}(\Delta^{(0)},\Delta^{(0)})=4\operatorname{tr}\Big(M^{(0)}X'^{(0)}_{K^*}(\epsilon)\,Y_{K^*}(\epsilon)\Big)=\tfrac{3}{7000}\,\epsilon+o(\epsilon).$$
Similarly, to compute the leading term of the Taylor expansion of $\operatorname{Hess}_{K^*}(\Delta^{(1)},\Delta^{(1)})$, we let $M^{(1)}_1(\epsilon)$ denote the matrix $M(X_{K^*},\Delta^{(1)})$ in Lemma 4.3, and let $X'^{(1)}_{K^*}(\epsilon)$ denote the solution to the Lyapunov equation
$$A_{\mathrm{cl}}(\epsilon)X'^{(1)}_{K^*}(\epsilon)+X'^{(1)}_{K^*}(\epsilon)A_{\mathrm{cl}}(\epsilon)^{\mathsf T}+M^{(1)}_1(\epsilon)=0.$$
Then, by Lemma B.1, $X'^{(1)}_{K^*}(\epsilon)$ admits a Taylor expansion of the form $X'^{(1)}_{K^*}(\epsilon)=\tfrac{1}{686}\bar X^{(1)}+o(1)$ for a constant matrix $\bar X^{(1)}$, and then by Lemma 4.3 we can show that
$$\operatorname{Hess}_{K^*}(\Delta^{(1)},\Delta^{(1)})=\tfrac{680}{343}+o(1).$$
It remains to show that $\big\|\operatorname{Proj}_{T\mathcal{O}_{K^*}}[\Delta^{(0)}]\big\|_F=O(\epsilon)$. It can be shown that for any $\Delta=\begin{bmatrix}0&\Delta_{C_K}\\\Delta_{B_K}&\Delta_{A_K}\end{bmatrix}$, we have
$$\Delta_{A_K}A^{*\mathsf T}_K-A^{*\mathsf T}_K\Delta_{A_K}+\Delta_{B_K}B^{*\mathsf T}_K-C^{*\mathsf T}_K\Delta_{C_K}=0
\iff
\begin{bmatrix}A^*_K\otimes I_n-I_n\otimes A^{*\mathsf T}_K & B^*_K\otimes I_n & -I_n\otimes C^{*\mathsf T}_K\end{bmatrix}
\begin{bmatrix}\operatorname{vec}(\Delta_{A_K})\\\operatorname{vec}(\Delta_{B_K})\\\operatorname{vec}(\Delta_{C_K})\end{bmatrix}=0.$$
Denote
$$M=\begin{bmatrix}A^*_K\otimes I_n-I_n\otimes A^{*\mathsf T}_K & B^*_K\otimes I_n & -I_n\otimes C^{*\mathsf T}_K\end{bmatrix},$$
whose entries are polynomial in $\epsilon$. Since for $\epsilon>0$ we have $\dim T\mathcal{O}_{K^*}=n^2=4$, we can see that
$$\operatorname{rank}M=n^2+nm+np-\dim\ker M=8-\dim T\mathcal{O}_{K^*}=4.$$
By Proposition 4.2, we can obtain $\|\operatorname{Proj}_{T\mathcal{O}_{K^*}}[\Delta^{(0)}]\|_F$ by computing $\big\|M^{\mathsf T}(MM^{\mathsf T})^{-1}Mv\big\|$, where $v$ stacks the vectorized entries of $\Delta^{(0)}$. The matrix $MM^{\mathsf T}$ is a $4\times 4$ matrix whose entries are polynomials in $\epsilon$, and a direct computation shows that
$$(MM^{\mathsf T})^{-1}Mv=\frac{1}{432+840\epsilon+1122\epsilon^2+702\epsilon^3+249\epsilon^4}\,w(\epsilon)=w_0+O(\epsilon)$$
for a vector $w(\epsilon)$ with polynomial entries and a constant vector $w_0$ satisfying $M^{\mathsf T}w_0=O(\epsilon)$. Consequently,
$$M^{\mathsf T}(MM^{\mathsf T})^{-1}Mv=O(\epsilon),$$
which completes the proof.

C Connectivity of the Set of Proper Stabilizing Controllers

We present connectivity results for the set of proper stabilizing controllers. The dynamical controller in (3) is strictly proper, as it does not contain a direct feedback term from the output measurement. We note that the optimal solution to the LQG problem (2) is always strictly proper. For closed-loop stability, we can also consider a proper dynamical controller of the form
$$\dot\xi(t)=A_K\xi(t)+B_Ky(t),\qquad u(t)=C_K\xi(t)+D_Ky(t),\tag{C.1}$$
parameterized by four matrices $A_K$, $B_K$, $C_K$, $D_K$ with compatible dimensions.
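Whether a proper controller of the form (C.1) is stabilizing can be checked directly from the eigenvalues of the corresponding closed-loop matrix. A minimal numerical sketch (the scalar plant $A=1$, $B=C=1$ and all controller values below are hypothetical illustrations, not taken from the paper's examples):

```python
import numpy as np

# Hypothetical scalar plant: A = 1 (open-loop unstable), B = C = 1.
A, B, C = 1.0, 1.0, 1.0
# Hypothetical proper controller (C.1) with a nonzero direct term D_K.
AK, BK, CK, DK = -1.0, 1.0, 1.0, -3.0

# Closed-loop matrix [[A + B*D_K*C, B*C_K], [B_K*C, A_K]]
Acl = np.array([[A + B * DK * C, B * CK],
                [BK * C, AK]])
eigs = np.linalg.eigvals(Acl)
assert np.all(eigs.real < 0)   # Hurwitz: the proper controller stabilizes
```

Here $A_{\mathrm{cl}}=\begin{bmatrix}-2&1\\1&-1\end{bmatrix}$ has negative trace and positive determinant, so both eigenvalues have negative real parts.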
Similarly, we define the set of proper stabilizing controllers as
$$\hat{\mathcal{C}}_q:=\left\{K=\begin{bmatrix}D_K&C_K\\B_K&A_K\end{bmatrix}\in\mathbb{R}^{(m+q)\times(p+q)}\,\middle|\,\begin{bmatrix}A+BD_KC&BC_K\\B_KC&A_K\end{bmatrix}\text{ is stable}\right\}.\tag{C.2}$$
By this definition, we always have $\mathcal{C}_q\subseteq\hat{\mathcal{C}}_q$, which is consistent with the fact that the set of strictly proper stabilizing controllers is a subset of the set of proper stabilizing controllers. We note, however, that for every $K\in\hat{\mathcal{C}}_q$ with $D_K\neq0$, the resulting LQG cost $J(K)$ in (2) is infinite, even though $K$ internally stabilizes the plant.

Similar to Lemma 3.1, the observation in Lemma C.1 is immediate. Unlike $\mathcal{C}_n$, which might have two path-connected components, $\hat{\mathcal{C}}_n$ is always path-connected, as stated in Theorem C.1.

Lemma C.1. Under Assumption 1, the set $\hat{\mathcal{C}}_n$ is non-empty, open, unbounded and non-convex.

Theorem C.1. Under Assumption 1, $\hat{\mathcal{C}}_n$ is always path-connected.

The proof of Theorem C.1 is almost identical to that of Theorem 3.1. By replacing the constraint $G=0_{m\times p}$ with $G\in\mathbb{R}^{m\times p}$ in the definitions of $\mathcal{F}_n$, $\mathcal{G}_n$ and $\Phi(\cdot)$, it is not difficult to verify that the results in Proposition 3.1 and Proposition 3.2 still hold for $\hat{\mathcal{C}}_n$. While $\mathcal{C}_{n-1}$ might be empty, we always have $\hat{\mathcal{C}}_{n-1}\neq\emptyset$ under Assumption 1 [49]. By adapting the proof of Theorem 3.3, Theorem C.1 follows.

Example 9 (Connectivity of proper stabilizing controllers). Consider the linear system (B.6) in Appendix B.3. We have shown that $\mathcal{C}_{n-1}=\emptyset$ for strictly proper reduced-order dynamical controllers. Here, it is easy to verify that a proper reduced-order dynamical controller (e.g., with $A_K=1$, $C_K=2$ and appropriately chosen negative values of $B_K$ and $D_K$) stabilizes (B.6), i.e., the eigenvalues of the closed-loop matrix $\begin{bmatrix}A+BD_KC&BC_K\\B_KC&A_K\end{bmatrix}$ all have negative real parts, indicating that $\hat{\mathcal{C}}_{n-1}\neq\emptyset$. Thus, $\hat{\mathcal{C}}_n$ is path-connected. Similarly, one can verify that the set of proper stabilizing controllers for the system in Example 2 is path-connected.
Indeed, using the Routh–Hurwitz stability criterion, we can derive that
$$\hat{\mathcal{C}}_1=\left\{K=\begin{bmatrix}D_K&C_K\\B_K&A_K\end{bmatrix}\in\mathbb{R}^{2\times2}\,\middle|\,\begin{bmatrix}A+BD_KC&BC_K\\B_KC&A_K\end{bmatrix}\text{ is stable}\right\}
=\left\{K=\begin{bmatrix}D_K&C_K\\B_K&A_K\end{bmatrix}\in\mathbb{R}^{2\times2}\,\middle|\,A_K+D_K<-1,\ B_KC_K<A_K+A_KD_K\right\}.$$
This set is path-connected.

D Results for Discrete-Time Systems

In this section, we discuss some landscape properties of the discrete-time LQG problem. As we will see, most results are analogous to the continuous-time case. We will slightly abuse notation and adopt the same symbols for both the continuous-time and discrete-time cases. Consider a discrete-time partially observed LTI system
$$x_{t+1}=Ax_t+Bu_t+w_t,\qquad y_t=Cx_t+v_t,\tag{D.1}$$
where $x_t\in\mathbb{R}^n$, $u_t\in\mathbb{R}^m$, $y_t\in\mathbb{R}^p$ are the system state, input, and output measurement at time $t$, and $w_t\sim\mathcal{N}(0,W)$, $v_t\sim\mathcal{N}(0,V)$ are Gaussian process and measurement noises, respectively. It is assumed that the covariance matrices satisfy $W\succeq0$, $V\succ0$. Given performance weight matrices $Q\succeq0$, $R\succ0$, the discrete-time LQG problem is defined as
$$\min_{u_0,u_1,\ldots}\ \lim_{T\to\infty}\mathbb{E}\left[\frac{1}{T}\sum_{t=1}^{T}\big(x^{\mathsf T}_tQx_t+u^{\mathsf T}_tRu_t\big)\right]\quad\text{subject to (D.1)}.\tag{D.2}$$
The control input $u_t$ at time $t$ is allowed to depend on the history $\mathcal{H}_t:=(u_0,\ldots,u_{t-1},y_0,\ldots,y_{t-1})$. We make the following standard assumption.

Assumption 2. $(A,B)$ and $(A,W^{1/2})$ are controllable, and $(C,A)$ and $(Q^{1/2},A)$ are observable.

Under Assumption 2, the optimal solution to (D.2) is a dynamical controller, given by
$$\xi_{t+1}=A\xi_t+Bu_t+L(y_t-C\xi_t),\qquad u_t=-K\xi_t.\tag{D.3}$$
The one-step delay in $y_t$ is a standard assumption that simplifies the Kalman filtering expressions. We leave the case where the history contains the current measurement $y_t$ for future discussions.
The matrix $L$ is called the Kalman gain, computed as $L=APC^{\mathsf T}(CPC^{\mathsf T}+V)^{-1}$, where $P$ is the unique positive semidefinite solution to
$$P=APA^{\mathsf T}-APC^{\mathsf T}(CPC^{\mathsf T}+V)^{-1}CPA^{\mathsf T}+W,\tag{D.4}$$
and the matrix $K$ is called the LQR feedback gain, computed as $K=(B^{\mathsf T}SB+R)^{-1}B^{\mathsf T}SA$, where $S$ is the unique positive semidefinite solution to
$$S=A^{\mathsf T}SA-A^{\mathsf T}SB(B^{\mathsf T}SB+R)^{-1}B^{\mathsf T}SA+Q.\tag{D.5}$$

D.1 Controller Parameterization and the LQG Cost Function

Similar to the continuous-time case, we consider the following parameterization of dynamical controllers
$$\xi_{t+1}=A_K\xi_t+B_Ky_t,\qquad u_t=C_K\xi_t,\tag{D.6}$$
where $\xi_t\in\mathbb{R}^q$ is the controller state at time $t$, and $(A_K,B_K,C_K)\in\mathbb{R}^{q\times q}\times\mathbb{R}^{q\times p}\times\mathbb{R}^{m\times q}$ specify the controller dynamics. The optimal LQG controller (D.3) can be written in the form of (D.6), where the controller state has dimension $q=n$, and
$$A_K=A-BK-LC,\qquad B_K=L,\qquad C_K=-K.$$
Combining (D.6) with (D.1) leads to the closed-loop system
$$\begin{bmatrix}x_{t+1}\\\xi_{t+1}\end{bmatrix}=\begin{bmatrix}A&BC_K\\B_KC&A_K\end{bmatrix}\begin{bmatrix}x_t\\\xi_t\end{bmatrix}+\begin{bmatrix}I&0\\0&B_K\end{bmatrix}\begin{bmatrix}w_t\\v_t\end{bmatrix},\qquad
\begin{bmatrix}y_t\\u_t\end{bmatrix}=\begin{bmatrix}C&0\\0&C_K\end{bmatrix}\begin{bmatrix}x_t\\\xi_t\end{bmatrix}+\begin{bmatrix}v_t\\0\end{bmatrix}.$$
The set of stabilizing controllers of order $q\in\mathbb{N}$ is defined as
$$\mathcal{C}_q:=\left\{K=\begin{bmatrix}0_{m\times p}&C_K\\B_K&A_K\end{bmatrix}\in\mathbb{R}^{(m+q)\times(p+q)}\,\middle|\,\rho\!\left(\begin{bmatrix}A&BC_K\\B_KC&A_K\end{bmatrix}\right)<1\right\},\tag{D.7}$$
where $\rho(\cdot)$ denotes the spectral radius of a square matrix. Let $J_q(K):\mathcal{C}_q\to\mathbb{R}$ denote the function that maps a parameterized dynamical controller in $\mathcal{C}_q$ to its corresponding LQG cost, for each $q\in\mathbb{N}$. Analogous to the continuous-time case, we have the following two lemmas characterizing the LQG cost function $J_q$.

Lemma D.1. Fix $q\in\mathbb{N}$ such that $\mathcal{C}_q\neq\emptyset$.
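The gains $L$ and $K$ in (D.4)–(D.5) can be computed with standard discrete-time Riccati solvers. A sketch using `scipy.linalg.solve_discrete_are` on a small illustrative system (all numeric values are assumptions, not data from the paper); the estimation Riccati equation (D.4) is the dual of (D.5) with $(A,B,Q,R)$ replaced by $(A^{\mathsf T},C^{\mathsf T},W,V)$:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Illustrative 2-state system (hypothetical values).
A = np.array([[1.1, 0.3], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
W = np.eye(2); V = np.eye(1)       # noise covariances
Q = np.eye(2); R = np.eye(1)       # performance weights

# LQR gain from (D.5): S = A'SA - A'SB(B'SB + R)^{-1}B'SA + Q
S = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(B.T @ S @ B + R, B.T @ S @ A)

# Kalman gain from (D.4): the dual Riccati equation with (A', C', W, V)
P = solve_discrete_are(A.T, C.T, W, V)
L = A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + V)

# Controller (D.3)/(D.6): A_K = A - BK - LC, B_K = L, C_K = -K
A_cl = np.block([[A, B @ (-K)], [L @ C, A - B @ K - L @ C]])
assert max(abs(np.linalg.eigvals(A_cl))) < 1.0   # spectral radius < 1
```

By the separation principle, the closed-loop eigenvalues are those of $A-BK$ together with those of $A-LC$, so the assertion is guaranteed whenever the Riccati solutions are stabilizing.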
Given $K\in\mathcal{C}_q$, we have
$$J_q(K)=\operatorname{tr}\left(\begin{bmatrix}Q&0\\0&C^{\mathsf T}_KRC_K\end{bmatrix}X_K\right)=\operatorname{tr}\left(\begin{bmatrix}W&0\\0&B_KVB^{\mathsf T}_K\end{bmatrix}Y_K\right),\tag{D.8}$$
where $X_K$ and $Y_K$ are the unique positive semidefinite solutions to the following Lyapunov equations:
$$X_K=\begin{bmatrix}A&BC_K\\B_KC&A_K\end{bmatrix}X_K\begin{bmatrix}A&BC_K\\B_KC&A_K\end{bmatrix}^{\mathsf T}+\begin{bmatrix}W&0\\0&B_KVB^{\mathsf T}_K\end{bmatrix},\tag{D.9a}$$
$$Y_K=\begin{bmatrix}A&BC_K\\B_KC&A_K\end{bmatrix}^{\mathsf T}Y_K\begin{bmatrix}A&BC_K\\B_KC&A_K\end{bmatrix}+\begin{bmatrix}Q&0\\0&C^{\mathsf T}_KRC_K\end{bmatrix}.\tag{D.9b}$$

Lemma D.2. Fix $q\in\mathbb{N}$ such that $\mathcal{C}_q\neq\emptyset$. Then $J_q$ is a real analytic function on $\mathcal{C}_q$.

The LQG cost function being real analytic is a direct consequence of Lemma D.1, and the proof is identical to that of Lemma 2.3. Given the dimension $n$ of the plant's state variable, the discrete-time LQG problem (D.2) can be reformulated into a constrained optimization problem:
$$\min_K\ J_n(K)\quad\text{subject to}\quad K\in\mathcal{C}_n.\tag{D.10}$$

D.2 Connectivity of the Feasible Region $\mathcal{C}_n$

We now characterize the connectivity of the set of full-order stabilizing controllers $\mathcal{C}_n$.

Lemma D.3. Under Assumption 2, the set $\mathcal{C}_n$ in (D.7) is non-empty, unbounded, and can be non-convex.

Example 10 (Non-convexity of stabilizing controllers). Consider an open-loop unstable dynamical system (D.1) with $A=1.1$, $B=1$, $C=1$. The set of stabilizing controllers $\mathcal{C}_n=\mathcal{C}_1$ is given by
$$\mathcal{C}_1=\left\{K=\begin{bmatrix}0&C_K\\B_K&A_K\end{bmatrix}\in\mathbb{R}^{2\times2}\,\middle|\,\begin{bmatrix}1.1&C_K\\B_K&A_K\end{bmatrix}\text{ is stable}\right\}.$$
It is easy to verify that there exist dynamical controllers $K^{(1)},K^{(2)}\in\mathcal{C}_1$ that both stabilize the plant, while their average $\hat K=\tfrac{1}{2}\big(K^{(1)}+K^{(2)}\big)$ fails to stabilize the plant.

The notion of similarity transformation in the discrete-time case remains the same as in the continuous-time case. We thus use the same mapping $\mathscr{T}_q:\mathrm{GL}_q\times\mathcal{C}_q\to\mathcal{C}_q$, defined in (14), to represent similarity transformations on $\mathcal{C}_q$.
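The non-convexity phenomenon in Example 10 is easy to reproduce numerically. The particular controller entries below are hypothetical (chosen for $A=1.1$, $B=C=1$; they are not the entries used in the example), but they exhibit exactly the same behavior: two Schur-stabilizing controllers whose average places the unstable mode $1.1$ back on the closed-loop diagonal:

```python
import numpy as np

A, B, C = 1.1, 1.0, 1.0

def rho(K):
    # Spectral radius of the closed-loop matrix [[A, B*C_K], [B_K*C, A_K]]
    CK, BK, AK = K
    M = np.array([[A, B * CK], [BK * C, AK]])
    return max(abs(np.linalg.eigvals(M)))

K1 = ( 1.0, -1.0, -0.5)   # (C_K, B_K, A_K), hypothetical
K2 = (-1.0,  1.0, -0.5)
K_avg = tuple((a + b) / 2 for a, b in zip(K1, K2))

assert rho(K1) < 1 and rho(K2) < 1   # both stabilize
assert rho(K_avg) > 1                # the average does not: C_1 is non-convex
```

For the average, $B_K=C_K=0$, so the closed-loop matrix is diagonal with the unstable entry $1.1$.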
It is not difficult to see that Lemma 3.2 also holds in discrete time. Indeed, all the connectivity results in Theorems 3.1 to 3.3 have their counterparts in the discrete-time case.

Theorem D.1. Under Assumption 2, the set $\mathcal{C}_n$ in (D.7) has at most two path-connected components.

Theorem D.2. If $\mathcal{C}_n$ in (D.7) has two path-connected components $\mathcal{C}^{(1)}_n$ and $\mathcal{C}^{(2)}_n$, then $\mathcal{C}^{(1)}_n$ and $\mathcal{C}^{(2)}_n$ are diffeomorphic under the mapping $\mathscr{T}_T$, for any invertible matrix $T\in\mathbb{R}^{n\times n}$ with $\det T<0$.

Theorem D.3. Under Assumption 2, the following statements are true.
1) $\mathcal{C}_n$ is path-connected if there exists a reduced-order stabilizing controller, i.e., $\mathcal{C}_{n-1}\neq\emptyset$.
2) Suppose the plant (D.1) is single-input or single-output, i.e., $m=1$ or $p=1$. Then the set $\mathcal{C}_n$ is path-connected if and only if $\mathcal{C}_{n-1}\neq\emptyset$.

Example 11 (Disconnectivity of stabilizing controllers in discrete time). Consider the discrete-time dynamical system in Example 10: $A=1.1$, $B=1$, $C=1$. Since it is open-loop unstable and its state has dimension $n=1$, we know $\mathcal{C}_{n-1}=\emptyset$. Thus, Theorem D.3 indicates that its associated set of stabilizing controllers $\mathcal{C}_n$ is not path-connected. Indeed, using the Jury stability criterion [50, 51] (the discrete-time analogue of the Routh–Hurwitz stability criterion), we can derive that
$$\mathcal{C}_1=\left\{K=\begin{bmatrix}0&C_K\\B_K&A_K\end{bmatrix}\in\mathbb{R}^{2\times2}\,\middle|\,\rho\!\left(\begin{bmatrix}1.1&C_K\\B_K&A_K\end{bmatrix}\right)<1\right\}
=\left\{K=\begin{bmatrix}0&C_K\\B_K&A_K\end{bmatrix}\in\mathbb{R}^{2\times2}\,\middle|\,1.1A_K-1<B_KC_K<1.1A_K+1-|1.1+A_K|\right\}.$$
Note that we should have $1.1A_K-1<1.1A_K+1-|1.1+A_K|$ to guarantee $\mathcal{C}_1\neq\emptyset$, which gives $-3.1<A_K<0.9$. Furthermore, it is not difficult to verify that
$$1.1A_K+1-|1.1+A_K|<0,\qquad\forall A_K\in(-3.1,\,0.9),$$
so that $B_KC_K<0$ for every element of $\mathcal{C}_1$.
One can then verify that the set $\mathcal{C}_1$ has two path-connected components: $\mathcal{C}_1=\mathcal{C}^{+}_1\cup\mathcal{C}^{-}_1$ with $\mathcal{C}^{+}_1\cap\mathcal{C}^{-}_1=\emptyset$, where
$$\mathcal{C}^{+}_1:=\left\{K=\begin{bmatrix}0&C_K\\B_K&A_K\end{bmatrix}\in\mathbb{R}^{2\times2}\,\middle|\,1.1A_K-1<B_KC_K<1.1A_K+1-|1.1+A_K|,\ B_K>0\right\},$$
$$\mathcal{C}^{-}_1:=\left\{K=\begin{bmatrix}0&C_K\\B_K&A_K\end{bmatrix}\in\mathbb{R}^{2\times2}\,\middle|\,1.1A_K-1<B_KC_K<1.1A_K+1-|1.1+A_K|,\ B_K<0\right\}.$$
In addition, as expected from Theorem D.2, it can be verified that $\mathcal{C}^{+}_1$ and $\mathcal{C}^{-}_1$ are homeomorphic under the mapping $\mathscr{T}_T$ for any $T<0$.

D.2.1 Proofs of Theorems D.1 to D.3

The proof ideas are almost identical to those for the continuous-time case, and we highlight the main steps here. We utilize the following discrete-time Lyapunov stability criterion: given a square matrix $M\in\mathbb{R}^{n\times n}$, we have $\rho(M)<1$ if and only if the following LMI is feasible:
$$M^{\mathsf T}XM-X\prec0,\qquad X\succ0.\tag{D.11}$$
Upon defining $P=X^{-1}$ and using the Schur complement, the discrete-time Lyapunov LMI (D.11) is equivalent to
$$\begin{bmatrix}P&MP\\PM^{\mathsf T}&P\end{bmatrix}\succ0.\tag{D.12}$$
Now, we can use the same change of variables, defined in (23), to prove Theorem D.1. Given the system dynamics $(A,B,C)$ in (D.1), we first introduce the following convex set
$$\mathcal{F}_n:=\left\{(X,Y,M,G,H,F)\,\middle|\,X,Y\in\mathbb{S}^n,\ M\in\mathbb{R}^{n\times n},\ G=0_{m\times p},\ H\in\mathbb{R}^{n\times p},\ F\in\mathbb{R}^{m\times n},\ \begin{bmatrix}\begin{bmatrix}X&I\\I&Y\end{bmatrix}&\begin{bmatrix}AX+BF&A+BGC\\M&YA+HC\end{bmatrix}\\[3pt]\begin{bmatrix}AX+BF&A+BGC\\M&YA+HC\end{bmatrix}^{\mathsf T}&\begin{bmatrix}X&I\\I&Y\end{bmatrix}\end{bmatrix}\succ0\right\},$$
$$\mathcal{G}_n:=\left\{Z=(X,Y,M,G,H,F,\Pi,\Xi)\,\middle|\,(X,Y,M,G,H,F)\in\mathcal{F}_n,\ \Pi,\Xi\in\mathbb{R}^{n\times n},\ \Xi\Pi=I-YX\right\}.$$
We then see that there exists a continuous surjective map from $\mathcal{G}_n$ to $\mathcal{C}_n$, as summarized in the following proposition. The proof follows the idea of Proposition 3.1, and we omit the details here.

Proposition D.1. The mapping $\Phi$ in (23) is a continuous and surjective mapping from $\mathcal{G}_n$ to $\mathcal{C}_n$.
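The disconnectedness in Example 11 can be probed numerically: since $B_KC_K<0$ throughout $\mathcal{C}_1$, any continuous path between a controller with $B_K>0$ and one with $B_K<0$ must cross the hyperplane $B_K=0$, where the closed-loop matrix becomes triangular with the unstable eigenvalue $1.1$. A sketch along a straight segment (one particular path; the two endpoint controllers are hypothetical):

```python
import numpy as np

A = 1.1  # with B = C = 1

def rho(CK, BK, AK):
    M = np.array([[A, CK], [BK, AK]])
    return max(abs(np.linalg.eigvals(M)))

# Two stabilizing controllers on opposite sides of B_K = 0 (hypothetical).
K_plus  = (-1.0,  1.0, -0.5)   # (C_K, B_K, A_K) with B_K > 0
K_minus = ( 1.0, -1.0, -0.5)   # B_K < 0
assert rho(*K_plus) < 1 and rho(*K_minus) < 1

# On the segment between them, stability fails where B_K = 0 (t = 0.5):
# the closed loop is then triangular with eigenvalue A = 1.1.
for t in np.linspace(0.0, 1.0, 101):
    K = tuple((1 - t) * a + t * b for a, b in zip(K_plus, K_minus))
    if abs(K[1]) < 1e-12:
        assert rho(*K) >= 1.1 - 1e-9
```

This illustrates why $\mathcal{C}^{+}_1$ and $\mathcal{C}^{-}_1$ cannot be joined by a straight line; the full disconnectedness claim is the Jury-criterion argument above.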
The rest of the proof of Theorem D.1 is identical to that of Theorem 3.1. Meanwhile, the proofs of Theorems D.2 and D.3 follow the same arguments as those of Theorems 3.2 and 3.3.

D.3 Stationary Points of the LQG Cost Function

It is not difficult to see that Lemma 4.1 and Proposition 4.1 hold for the discrete-time LQG cost function $J_q(K)$ as well. Thus, the globally optimal solutions to the discrete-time LQG problem (D.10) are non-isolated and disconnected in $\mathcal{C}_n$. Here, we present formulas for computing the gradient and Hessian of the LQG cost function in (D.8), and briefly discuss the non-minimal and minimal stationary points of (D.10).

D.3.1 Gradient and Hessian of the Discrete-Time LQG Cost Function

The following two lemmas give closed forms for the gradient and Hessian of the discrete-time LQG cost function $J_q$.

Lemma D.4 (Gradient of LQG cost $J_q$). Fix $q\ge1$ such that $\mathcal{C}_q\neq\emptyset$. For every $K=\begin{bmatrix}0&C_K\\B_K&A_K\end{bmatrix}\in\mathcal{C}_q$, the gradient of $J_q(K)$ is given by
$$\nabla J_q(K)=\begin{bmatrix}0&\dfrac{\partial J_q(K)}{\partial C_K}\\[6pt]\dfrac{\partial J_q(K)}{\partial B_K}&\dfrac{\partial J_q(K)}{\partial A_K}\end{bmatrix},$$
with
$$\frac{\partial J_q(K)}{\partial A_K}=2\Big(Y^{\mathsf T}_{12}(AX_{12}+BC_KX_{22})+Y_{22}A_KX_{22}+Y_{22}B_KCX_{12}\Big),\tag{D.13a}$$
$$\frac{\partial J_q(K)}{\partial B_K}=2\Big(Y^{\mathsf T}_{12}(AX_{11}+BC_KX^{\mathsf T}_{12})C^{\mathsf T}+Y_{22}A_KX^{\mathsf T}_{12}C^{\mathsf T}+Y_{22}B_K(CX_{11}C^{\mathsf T}+V)\Big),\tag{D.13b}$$
$$\frac{\partial J_q(K)}{\partial C_K}=2\Big(B^{\mathsf T}Y_{12}(A_KX_{22}+B_KCX_{12})+B^{\mathsf T}Y_{11}AX_{12}+(B^{\mathsf T}Y_{11}B+R)C_KX_{22}\Big),\tag{D.13c}$$
where $X_K$ and $Y_K$, partitioned as
$$X_K=\begin{bmatrix}X_{11}&X_{12}\\X^{\mathsf T}_{12}&X_{22}\end{bmatrix},\qquad Y_K=\begin{bmatrix}Y_{11}&Y_{12}\\Y^{\mathsf T}_{12}&Y_{22}\end{bmatrix},\tag{D.14}$$
are the unique positive semidefinite solutions to (D.9a) and (D.9b), respectively.

Lemma D.5. Fix $q\ge1$ such that $\mathcal{C}_q\neq\emptyset$, and let $K=\begin{bmatrix}0&C_K\\B_K&A_K\end{bmatrix}\in\mathcal{C}_q$.
Then for any $\Delta=\begin{bmatrix}0&\Delta_{C_K}\\\Delta_{B_K}&\Delta_{A_K}\end{bmatrix}\in\mathcal{V}_q$, we have
$$\operatorname{Hess}_K(\Delta,\Delta)=2\operatorname{tr}\Bigg(2\begin{bmatrix}0&B\Delta_{C_K}\\\Delta_{B_K}C&\Delta_{A_K}\end{bmatrix}\hat X_{K,\Delta}A^{\mathsf T}_{\mathrm{cl},K}Y_K+\begin{bmatrix}0&B\Delta_{C_K}\\\Delta_{B_K}C&\Delta_{A_K}\end{bmatrix}X_K\begin{bmatrix}0&B\Delta_{C_K}\\\Delta_{B_K}C&\Delta_{A_K}\end{bmatrix}^{\mathsf T}Y_K+2\begin{bmatrix}0&0\\0&C^{\mathsf T}_KR\Delta_{C_K}\end{bmatrix}\hat X_{K,\Delta}+\begin{bmatrix}0&0\\0&\Delta_{B_K}V\Delta^{\mathsf T}_{B_K}\end{bmatrix}Y_K+\begin{bmatrix}0&0\\0&\Delta^{\mathsf T}_{C_K}R\Delta_{C_K}\end{bmatrix}X_K\Bigg),$$
where $X_K$ and $Y_K$ are the solutions to the Lyapunov equations (D.9a) and (D.9b), and $\hat X_{K,\Delta}\in\mathbb{R}^{(n+q)\times(n+q)}$ is the solution to the following Lyapunov equation:
$$\hat X_{K,\Delta}=A_{\mathrm{cl},K}\hat X_{K,\Delta}A^{\mathsf T}_{\mathrm{cl},K}+M_1,$$
with $A_{\mathrm{cl},K}:=\begin{bmatrix}A&BC_K\\B_KC&A_K\end{bmatrix}$ and
$$M_1:=\begin{bmatrix}0&B\Delta_{C_K}\\\Delta_{B_K}C&\Delta_{A_K}\end{bmatrix}X_KA^{\mathsf T}_{\mathrm{cl},K}+A_{\mathrm{cl},K}X_K\begin{bmatrix}0&B\Delta_{C_K}\\\Delta_{B_K}C&\Delta_{A_K}\end{bmatrix}^{\mathsf T}+\begin{bmatrix}0&0\\0&B_KV\Delta^{\mathsf T}_{B_K}+\Delta_{B_K}VB^{\mathsf T}_K\end{bmatrix}.$$

For the proofs of Lemma D.4 and Lemma D.5, we first need the following lemma.

Lemma D.6. Let $M:(-\delta,\delta)\to\mathbb{R}^{k\times k}$ and $G:(-\delta,\delta)\to\mathbb{S}^k$ be two infinitely differentiable matrix-valued functions for some $\delta>0$ and $k\in\mathbb{N}\setminus\{0\}$. Suppose $\rho(M(t))<1$ for all $t\in(-\delta,\delta)$, and let $X(t)$ denote the solution to the following Lyapunov equation:
$$X(t)=M(t)X(t)M(t)^{\mathsf T}+G(t).$$
Then $X(t)$ is infinitely differentiable on $(-\delta,\delta)$, and its first-order and second-order derivatives at $t=0$, denoted by $\dot X(0)$ and $\ddot X(0)$, are the solutions to the following Lyapunov equations:
$$\dot X(0)=M(0)\dot X(0)M(0)^{\mathsf T}+\Big(M(0)X(0)\dot M(0)^{\mathsf T}+\dot M(0)X(0)M(0)^{\mathsf T}+\dot G(0)\Big),$$
$$\ddot X(0)=M(0)\ddot X(0)M(0)^{\mathsf T}+\Big(\ddot M(0)X(0)M(0)^{\mathsf T}+M(0)X(0)\ddot M(0)^{\mathsf T}+2\dot M(0)\dot X(0)M(0)^{\mathsf T}+2M(0)\dot X(0)\dot M(0)^{\mathsf T}+2\dot M(0)X(0)\dot M(0)^{\mathsf T}+\ddot G(0)\Big).\tag{D.15}$$

Proof.
The differentiability of $X(t)$ follows from the observation that the unique solution to the Lyapunov equation can be written as
$$\operatorname{vec}(X(t))=\big(I-M(t)\otimes M(t)\big)^{-1}\operatorname{vec}(G(t)),$$
where we have applied the fact that $I-M(t)\otimes M(t)$ is invertible thanks to $\rho(M(t))<1$. Since $M(t)$, $G(t)$ and $X(t)$ are infinitely differentiable, they admit Taylor expansions around $t=0$:
$$M(t)=M(0)+t\dot M(0)+\tfrac{t^2}{2}\ddot M(0)+o(t^2),\qquad G(t)=G(0)+t\dot G(0)+\tfrac{t^2}{2}\ddot G(0)+o(t^2),\qquad X(t)=X(0)+t\dot X(0)+\tfrac{t^2}{2}\ddot X(0)+o(t^2).$$
Substituting these expansions into the Lyapunov equation and collecting powers of $t$ gives
$$-X(0)+M(0)X(0)M(0)^{\mathsf T}+G(0)
+t\Big({-\dot X(0)}+M(0)\dot X(0)M(0)^{\mathsf T}+\dot M(0)X(0)M(0)^{\mathsf T}+M(0)X(0)\dot M(0)^{\mathsf T}+\dot G(0)\Big)
+\tfrac{t^2}{2}\Big({-\ddot X(0)}+M(0)\ddot X(0)M(0)^{\mathsf T}+\ddot M(0)X(0)M(0)^{\mathsf T}+M(0)X(0)\ddot M(0)^{\mathsf T}+2\dot M(0)\dot X(0)M(0)^{\mathsf T}+2M(0)\dot X(0)\dot M(0)^{\mathsf T}+2\dot M(0)X(0)\dot M(0)^{\mathsf T}+\ddot G(0)\Big)+o(t^2)=0.$$
The equation above holds for all sufficiently small $t$; therefore (D.15) holds true.

Recall that the discrete-time LQG cost is given by
$$J_q(K)=\operatorname{tr}\left(\begin{bmatrix}Q&0\\0&C^{\mathsf T}_KRC_K\end{bmatrix}X_K\right),$$
where $X_K$ is the unique positive semidefinite solution to the Lyapunov equation (D.9a). Consider an arbitrary direction $\Delta=\begin{bmatrix}0&\Delta_{C_K}\\\Delta_{B_K}&\Delta_{A_K}\end{bmatrix}\in\mathcal{V}_q$. For sufficiently small $t>0$ such that $K+t\Delta\in\mathcal{C}_q$, the corresponding closed-loop matrix is
$$A_{\mathrm{cl},K+t\Delta}=A_{\mathrm{cl},K}+t\begin{bmatrix}B&0\\0&I\end{bmatrix}\Delta\begin{bmatrix}C&0\\0&I\end{bmatrix},$$
and we let $X_{K,\Delta}(t)$ denote the solution to the Lyapunov equation (D.9a) with closed-loop matrix $A_{\mathrm{cl},K+t\Delta}$, i.e.,
$$X_{K,\Delta}(t)=\left(A_{\mathrm{cl},K}+t\begin{bmatrix}B&0\\0&I\end{bmatrix}\Delta\begin{bmatrix}C&0\\0&I\end{bmatrix}\right)X_{K,\Delta}(t)\left(A_{\mathrm{cl},K}+t\begin{bmatrix}B&0\\0&I\end{bmatrix}\Delta\begin{bmatrix}C&0\\0&I\end{bmatrix}\right)^{\mathsf T}+\begin{bmatrix}W&0\\0&(B_K+t\Delta_{B_K})V(B_K+t\Delta_{B_K})^{\mathsf T}\end{bmatrix}.$$
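The first-order formula of Lemma D.6 can be validated against a finite difference before proceeding; a numerical sketch with small illustrative matrices (all data are assumptions, with $M_0$ chosen Schur stable):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

rng = np.random.default_rng(0)
k = 3
M0 = np.array([[0.5, 0.2, 0.0],     # rho(M0) < 1 by Gershgorin's theorem
               [0.0, -0.3, 0.1],
               [0.1, 0.0, 0.4]])
Md = rng.standard_normal((k, k))                     # Mdot(0)
G0r = rng.standard_normal((k, k)); G0 = G0r @ G0r.T  # G(0), symmetric
Gdr = rng.standard_normal((k, k)); Gd = Gdr + Gdr.T  # Gdot(0), symmetric

def X_of(t):
    # Solve X = M(t) X M(t)^T + G(t) with M(t) = M0 + t*Md, G(t) = G0 + t*Gd
    return solve_discrete_lyapunov(M0 + t * Md, G0 + t * Gd)

X0 = X_of(0.0)
# Lemma D.6: Xdot(0) = M0 Xdot(0) M0^T + (M0 X0 Md^T + Md X0 M0^T + Gd)
Xdot = solve_discrete_lyapunov(M0, M0 @ X0 @ Md.T + Md @ X0 @ M0.T + Gd)

h = 1e-5
fd = (X_of(h) - X_of(-h)) / (2 * h)   # central finite difference
assert np.allclose(fd, Xdot, atol=1e-4)
```

Here `solve_discrete_lyapunov(a, q)` solves $aXa^{\mathsf T}-X+q=0$, which is exactly the fixed-point form $X=aXa^{\mathsf T}+q$ used in the lemma.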
By Lemma D.6, we see that $X_{K,\Delta}(t)$ admits a Taylor expansion of the form
$$X_{K,\Delta}(t)=X_K+t\,\dot X_{K,\Delta}(0)+\tfrac{t^2}{2}\,\ddot X_{K,\Delta}(0)+o(t^2),\tag{D.16}$$
and the derivatives $\dot X_{K,\Delta}(0)$ and $\ddot X_{K,\Delta}(0)$ are the solutions to the following Lyapunov equations:
$$\dot X_{K,\Delta}(0)=A_{\mathrm{cl},K}\dot X_{K,\Delta}(0)A^{\mathsf T}_{\mathrm{cl},K}+M_1,\qquad
\ddot X_{K,\Delta}(0)=A_{\mathrm{cl},K}\ddot X_{K,\Delta}(0)A^{\mathsf T}_{\mathrm{cl},K}+2M_2,$$
where, writing $\tilde\Delta:=\begin{bmatrix}B&0\\0&I\end{bmatrix}\Delta\begin{bmatrix}C&0\\0&I\end{bmatrix}$ for brevity,
$$M_1:=\tilde\Delta X_KA^{\mathsf T}_{\mathrm{cl},K}+A_{\mathrm{cl},K}X_K\tilde\Delta^{\mathsf T}+\begin{bmatrix}0&0\\0&B_KV\Delta^{\mathsf T}_{B_K}+\Delta_{B_K}VB^{\mathsf T}_K\end{bmatrix},$$
$$M_2:=\tilde\Delta\dot X_{K,\Delta}(0)A^{\mathsf T}_{\mathrm{cl},K}+A_{\mathrm{cl},K}\dot X_{K,\Delta}(0)\tilde\Delta^{\mathsf T}+\tilde\Delta X_K\tilde\Delta^{\mathsf T}+\begin{bmatrix}0&0\\0&\Delta_{B_K}V\Delta^{\mathsf T}_{B_K}\end{bmatrix}.$$
By plugging the Taylor expansion (D.16) into the expression (D.8) for $J_q(K)$, we get
$$J_q(K+t\Delta)=\operatorname{tr}\left(\begin{bmatrix}Q&0\\0&(C_K+t\Delta_{C_K})^{\mathsf T}R(C_K+t\Delta_{C_K})\end{bmatrix}X_{K,\Delta}(t)\right)
=J_q(K)+t\operatorname{tr}\left(\begin{bmatrix}Q&0\\0&C^{\mathsf T}_KRC_K\end{bmatrix}\dot X_{K,\Delta}(0)+\begin{bmatrix}0&0\\0&C^{\mathsf T}_KR\Delta_{C_K}+\Delta^{\mathsf T}_{C_K}RC_K\end{bmatrix}X_K\right)
+\tfrac{t^2}{2}\operatorname{tr}\Bigg(\begin{bmatrix}Q&0\\0&C^{\mathsf T}_KRC_K\end{bmatrix}\ddot X_{K,\Delta}(0)+2\begin{bmatrix}0&0\\0&C^{\mathsf T}_KR\Delta_{C_K}+\Delta^{\mathsf T}_{C_K}RC_K\end{bmatrix}\dot X_{K,\Delta}(0)+2\begin{bmatrix}0&0\\0&\Delta^{\mathsf T}_{C_K}R\Delta_{C_K}\end{bmatrix}X_K\Bigg)+o(t^2),$$
from which we can directly recognize $\frac{dJ_q(K+t\Delta)}{dt}\big|_{t=0}$ and $\frac{d^2J_q(K+t\Delta)}{dt^2}\big|_{t=0}$. Now suppose $X$ is the solution to the Lyapunov equation
$$X=A_{\mathrm{cl},K}XA^{\mathsf T}_{\mathrm{cl},K}+M$$
for some $M\in\mathbb{S}^{n+q}$.
Similar to Lemma A.1, it is known that the unique solution to the Lyapunov equation above is
$$X=\sum_{k=0}^{\infty}A^k_{\mathrm{cl},K}\,M\,\big(A^k_{\mathrm{cl},K}\big)^{\mathsf T},$$
and consequently
$$\operatorname{tr}\left(\begin{bmatrix}Q&0\\0&C^{\mathsf T}_KRC_K\end{bmatrix}X\right)=\sum_{k=0}^{\infty}\operatorname{tr}\left(\begin{bmatrix}Q&0\\0&C^{\mathsf T}_KRC_K\end{bmatrix}A^k_{\mathrm{cl},K}\,M\,\big(A^k_{\mathrm{cl},K}\big)^{\mathsf T}\right)=\sum_{k=0}^{\infty}\operatorname{tr}\left(\big(A^k_{\mathrm{cl},K}\big)^{\mathsf T}\begin{bmatrix}Q&0\\0&C^{\mathsf T}_KRC_K\end{bmatrix}A^k_{\mathrm{cl},K}\,M\right)=\operatorname{tr}(Y_KM),$$
in which we recall that $Y_K$ is the unique positive semidefinite solution to the Lyapunov equation (D.9b). Therefore, the first-order derivative can alternatively be given by
$$\frac{dJ_q(K+t\Delta)}{dt}\bigg|_{t=0}=\operatorname{tr}\left(Y_KM_1+\begin{bmatrix}0&0\\0&C^{\mathsf T}_KR\Delta_{C_K}+\Delta^{\mathsf T}_{C_K}RC_K\end{bmatrix}X_K\right)
=2\operatorname{tr}\left(\left(\begin{bmatrix}0&RC_K\\0&0\end{bmatrix}X_K\begin{bmatrix}C&0\\0&I\end{bmatrix}^{\mathsf T}+\begin{bmatrix}B&0\\0&I\end{bmatrix}^{\mathsf T}Y_KA_{\mathrm{cl},K}X_K\begin{bmatrix}C&0\\0&I\end{bmatrix}^{\mathsf T}+\begin{bmatrix}B&0\\0&I\end{bmatrix}^{\mathsf T}Y_K\begin{bmatrix}0&0\\B_KV&0\end{bmatrix}\right)^{\mathsf T}\Delta\right).$$
One can readily recognize the gradient $\nabla J_q(K)$ by noticing that
$$\frac{dJ_q(K+t\Delta)}{dt}\bigg|_{t=0}=\operatorname{tr}\big(\nabla J_q(K)^{\mathsf T}\Delta\big).$$
Upon partitioning $X_K$ and $Y_K$ as in (D.14), a few calculations lead to the gradient formula of $J_q(K)$ in (D.13). Similarly, we can show that the second-order derivative can alternatively be given by
$$\frac{d^2J_q(K+t\Delta)}{dt^2}\bigg|_{t=0}
=2\operatorname{tr}\left(Y_KM_2+\begin{bmatrix}0&0\\0&C^{\mathsf T}_KR\Delta_{C_K}+\Delta^{\mathsf T}_{C_K}RC_K\end{bmatrix}\dot X_{K,\Delta}(0)+\begin{bmatrix}0&0\\0&\Delta^{\mathsf T}_{C_K}R\Delta_{C_K}\end{bmatrix}X_K\right)
=2\operatorname{tr}\Bigg(2\tilde\Delta\dot X_{K,\Delta}(0)A^{\mathsf T}_{\mathrm{cl},K}Y_K+\tilde\Delta X_K\tilde\Delta^{\mathsf T}Y_K+2\begin{bmatrix}0&0\\0&C^{\mathsf T}_KR\Delta_{C_K}\end{bmatrix}\dot X_{K,\Delta}(0)+\begin{bmatrix}0&0\\0&\Delta_{B_K}V\Delta^{\mathsf T}_{B_K}\end{bmatrix}Y_K+\begin{bmatrix}0&0\\0&\Delta^{\mathsf T}_{C_K}R\Delta_{C_K}\end{bmatrix}X_K\Bigg),$$
where $\tilde\Delta=\begin{bmatrix}B&0\\0&I\end{bmatrix}\Delta\begin{bmatrix}C&0\\0&I\end{bmatrix}$. This finishes the proofs of Lemmas D.4 and D.5.
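The adjoint identity $\operatorname{tr}(Q_{\mathrm{blk}}X)=\operatorname{tr}(Y M)$ used above, with $X$ and $Y$ the solutions of the primal and dual discrete Lyapunov equations, is easy to confirm numerically (the test matrices below are assumptions):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

rng = np.random.default_rng(1)
Acl = np.array([[0.5, 0.2], [-0.1, 0.3]])           # Schur-stable matrix
Mr = rng.standard_normal((2, 2)); M = Mr + Mr.T     # symmetric M
Qr = rng.standard_normal((2, 2)); Qblk = Qr @ Qr.T  # PSD weight

X = solve_discrete_lyapunov(Acl, M)        # X = Acl X Acl^T + M
Y = solve_discrete_lyapunov(Acl.T, Qblk)   # Y = Acl^T Y Acl + Qblk

# tr(Qblk X) = sum_k tr(Qblk Acl^k M (Acl^k)^T) = tr(Y M)
assert np.isclose(np.trace(Qblk @ X), np.trace(Y @ M))
```

The identity holds by swapping the order of the trace and the convergent series, exactly as in the derivation above.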
D.3.2 Non-minimal Stationary Points

Since Lemma 4.4 and Theorem 4.1 are direct consequences of the similarity transformation $\mathscr{T}_q(T,K)$, these two results also naturally hold for the discrete-time LQG cost function. This suggests that the discrete-time LQG cost $J_n(K)$ over the set of full-order stabilizing controllers $\mathcal{C}_n$ is likely to have non-minimal stationary points that are strict saddle points. One may further establish results similar to Theorem 4.2 for the discrete-time LQG cost $J_n(K)$.

D.3.3 Minimal Stationary Points Are Globally Optimal

For minimal stationary points, we have the following result.

Theorem D.4. Under Assumption 2, all minimal stationary points $K\in\mathcal{C}_n$ of the discrete-time LQG problem (D.10) are globally optimal, and they are of the form
$$A_K=T(A-BK-LC)T^{-1},\qquad B_K=-TL,\qquad C_K=KT^{-1},\tag{D.17}$$
where $T\in\mathbb{R}^{n\times n}$ is an invertible matrix, and
$$K=(B^{\mathsf T}SB+R)^{-1}B^{\mathsf T}SA,\qquad L=APC^{\mathsf T}(CPC^{\mathsf T}+V)^{-1},\tag{D.18}$$
with $P$ and $S$ being the unique positive definite solutions to the Riccati equations (D.4) and (D.5).

Proof. Consider a stationary point $K=\begin{bmatrix}0&C_K\\B_K&A_K\end{bmatrix}\in\mathcal{C}_n$ at which the gradient (D.13) vanishes. If the controller $K$ is minimal, then, similar to Lemma 4.5, we can show that the solutions $X_K$ and $Y_K$ to (D.9a) and (D.9b) are unique and positive definite. Upon partitioning $X_K$ and $Y_K$ as in (D.14), by the Schur complement, the following matrices are well-defined and positive definite:
$$P:=X_{11}-X_{12}X^{-1}_{22}X^{\mathsf T}_{12}\succ0,\qquad S:=Y_{11}-Y_{12}Y^{-1}_{22}Y^{\mathsf T}_{12}\succ0.\tag{D.19}$$
Now, setting $\frac{\partial J_n(K)}{\partial A_K}=0$ in (D.13a), we have
$$A_K=-\big(Y^{-1}_{22}Y^{\mathsf T}_{12}AX_{12}X^{-1}_{22}+Y^{-1}_{22}Y^{\mathsf T}_{12}BC_K+B_KCX_{12}X^{-1}_{22}\big).$$
Similarly, at a stationary point, some algebraic manipulations of (D.13b) and (D.13c) lead to
$$B_K=-Y^{-1}_{22}Y^{\mathsf T}_{12}APC^{\mathsf T}(V+CPC^{\mathsf T})^{-1},\qquad
C_K=-(R+B^{\mathsf T}SB)^{-1}B^{\mathsf T}SAX_{12}X^{-1}_{22}.$$
We now define
$$T:=Y^{-1}_{22}Y^{\mathsf T}_{12}.$$
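The family (D.17) traces out similarity transformations of a single controller, so every member yields the same closed-loop behavior and the same LQG cost. A numerical sketch of this invariance, reusing Riccati-based gains on an illustrative system (all numeric values are assumptions):

```python
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

A = np.array([[1.1, 0.3], [0.0, 0.9]])
B = np.array([[0.0], [1.0]]); C = np.array([[1.0, 0.0]])
W = np.eye(2); V = np.eye(1); Q = np.eye(2); R = np.eye(1)

S = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(B.T @ S @ B + R, B.T @ S @ A)   # LQR gain (D.18)
P = solve_discrete_are(A.T, C.T, W, V)
L = A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + V)    # Kalman gain (D.18)

def lqg_cost(T):
    # Controller (D.17) for an invertible T; cost via (D.8) and (D.9a)
    Ti = np.linalg.inv(T)
    AK = T @ (A - B @ K - L @ C) @ Ti
    BK = -T @ L
    CK = K @ Ti
    Acl = np.block([[A, B @ CK], [BK @ C, AK]])
    Wblk = np.block([[W, np.zeros((2, 2))],
                     [np.zeros((2, 2)), BK @ V @ BK.T]])
    Qblk = np.block([[Q, np.zeros((2, 2))],
                     [np.zeros((2, 2)), CK.T @ R @ CK]])
    X = solve_discrete_lyapunov(Acl, Wblk)
    return np.trace(Qblk @ X)

T_rand = np.array([[2.0, 1.0], [0.5, 1.5]])   # arbitrary invertible T
assert np.isclose(lqg_cost(np.eye(2)), lqg_cost(T_rand))
```

This is the symmetry responsible for the non-uniqueness of the globally optimal controllers in Theorem D.4.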
(D.20)

To show that all minimal stationary points are identical up to a similarity transformation and are of the form (D.17) and (D.18), it remains to prove that:

1) the matrix $T$ in (D.20) is invertible and $T^{-1}=-X_{12}X^{-1}_{22}$.