Horizon-independent Preconditioner Design for Linear Predictive Control
aa r X i v : . [ m a t h . O C ] O c t Horizon-independent Preconditioner Design forLinear Predictive Control
Ian McInerney, Eric C. Kerrigan, George A. Constantinides
Abstract —First-order optimization solvers, such as the FastGradient Method, are increasingly being used to solve ModelPredictive Control problems in resource-constrained environ-ments. Unfortunately, the convergence rate of these solversis significantly affected by the conditioning of the problemdata, with ill-conditioned problems requiring a large numberof iterations. To reduce the number of iterations required, wepresent a simple method for computing a horizon-independentpreconditioning matrix for the Hessian of the condensed problem.The preconditioner is based on the block Toeplitz structure of theHessian. Horizon-independence allows one to use only the pre-dicted system and cost matrices to compute the preconditioner,instead of the full Hessian. The proposed preconditioner hasequivalent performance to an optimal preconditioner, producingup to a 6x speedup for the Fast Gradient Method in our numericalexamples. Additionally, we derive horizon-independent spectralbounds for the Hessian in terms of the transfer function of thepredicted system, and show how these can be used to computea novel horizon-independent bound on the condition number forthe preconditioned Hessian.
Index Terms —model predictive control (MPC), optimal con-trol, preconditioning, fast gradient method (FGM), constrainedLQR
I. I
NTRODUCTION
Model Predictive Control (MPC) is an optimal controlmethod that aims to optimize the closed-loop performance of acontrolled system by solving an optimization problem at eachsampling instant to implicitly compute the next control input.MPC is swiftly becoming a popular choice for the controlof complicated systems with operational constraints, due toits explicit handling of constraints and recent advances inreal-time optimization algorithms, allowing it to be deployedon resource-constrained systems such as internet-of-thingsdevices [1].A common formulation of MPC is the Constrained LinearQuadratic Regulator (CLQR), which is an extension of thewell known LQR optimal controller to handle state and inputconstraints for a linear system with a quadratic objectivefunction. The CLQR formulation leads to an optimizationproblem that is a convex Quadratic Program (QP), whichcan be solved efficiently using many different optimizationalgorithms, such as interior-point methods, active-set methods,or other gradient-based methods. The efficient implementation
The support of the EPSRC Centre for Doctoral Training in High Per-formance Embedded and Distributed Systems (HiPEDS, Grant ReferenceEP/L016796/1) is gratefully acknowledged.The authors are with the Department of Electrical & Electronic Engi-neering, Imperial College London, SW7 2AZ, U.K. E.C. Kerrigan is alsowith the Department of Aeronautics. email: [email protected],[email protected], [email protected] of QP solvers on resource constrained embedded systems hasbeen an active area of research, with results showing that aCLQR controller can be implemented on embedded processorsand Field Programmable Gate Arrays (FPGAs) for systemswith sampling times on the order of milliseconds or less [2],[3].When implementing a CLQR controller, there are threetypes of iterative methods that are commonly used: interior-point methods, active-set methods and first-order methods.Each of these types of methods are affected by the con-ditioning of the problem, with poorly conditioned problemsrequiring more iterations to find the optimal solution. Toovercome the ill-conditioning of the problem and improvethe convergence rate of the algorithm, implementations utilizepreconditioning techniques on the problem data. In the MPCliterature, many different preconditioning schemes have beendeveloped for interior-point methods [4], active-set methods[5], and gradient-based methods [6]–[8].In this work we focus on first-order methods, which arecommonly preconditioned using matrices generated by solvingsemidefinite programs (SDPs). In [8], an SDP formulation thatminimizes the resulting maximal eigenvalue of the precondi-tioned matrix by embedding the Hessian matrix into a linearmatrix inequality constraint was proposed. This formulationis solvable using existing interior-point methods for SDPs,but the embedding of the Hessian matrix into the constraintsmeans the preconditioner must be recomputed if the horizonlength changes and that the SDP problem size grows with thehorizon length.In this work we present a new preconditioner for thecondensed CLQR formulation, with a focus on applying itto first-order gradient methods, such as the Fast GradientMethod (FGM). This preconditioner is horizon-independentand is computed using only matrices with the number of statesand inputs as their dimensions. Our proposed preconditionerprovides performance equivalent to the existing SDP precon-ditioners, while providing a reduction in the computationaleffort required to compute the preconditioner. Additionally,we exploit the block Toeplitz structure of the CLQR problem’scondensed Hessian to derive tight horizon-independent boundson the extremal eigenvalues and condition number of both theoriginal and preconditioned Hessian.We begin in Section III by deriving the theoretical frame-work to compute the bounds on the extremal eigenvalues andcondition number of the Hessian. We extend this analysisto the case of preconditioned matrices in Section IV-A, andthen propose a new preconditioner for the CLQR problem inSection IV-B. Finally, we present numerical examples in Sec- tion V comparing the proposed preconditioner against existingpreconditioners, and showing its effect on the convergence rateof the Fast Gradient Method applied to three different systems.
A. Notation A ′ and A ∗ denote the transpose and conjugate-transposeof the matrix A , respectively. A ⊗ B represents the Kroneckerproduct of matrix A with matrix B . For an infinite-dimensionalblock Toeplitz matrix T , T n represents the truncated versionof T after n diagonals (where n is a positive integer) toform an n × n matrix. Let λ ≤ · · · ≤ λ k be the realeigenvalues of a Hermitian matrix in sorted order, with theset of all eigenvalues denoted by λ . The p-norm is denotedby k◦k p , with k A k the matrix spectral norm, and k A k F theFrobenius norm. The condition number of a matrix is definedas κ ( A ) := k A k k A − k . The set T := { z ∈ C : | z | = 1 } isthe complex unit circle. L ∞ is the space of matrix-valued essentially boundedfunctions (i.e. matrix-valued functions that are measurableand have a finite Frobenius norm almost everywhere on theirdomain, see [9, §2]). ˜ C π is the space of continuous π -periodic functions inside L ∞ . Definition 1.
Let P T ( · ) ∈ ˜ C π be a function that maps T → C n × n , we define the extreme eigenvalues of P T ( · ) as λ min ( P T ) := inf z ∈ T λ ( P T ( z )) , λ max ( P T ) := sup z ∈ T λ n ( P T ( z )) , and the condition number of P T ( · ) as κ ( P T ) := λ max ( P T ) λ min ( P T ) . Definition 2.
Let T n be the n × n truncation of an infinitematrix T . We define the extrema of the spectrum of T as λ min ( T ) := lim n →∞ λ ( T n ) , λ max ( T ) := lim n →∞ λ n ( T n ) . II. P
RELIMINARIES
A. CLQR Formulation
In this work, we examine the Constrained Linear QuadraticRegulator (CLQR) formulation of the MPC problem, whichcan be written as the following constrained quadratic program-ming problemmin u,x x ′ N P x N + 12 N − X k =0 (cid:20) x k u k (cid:21) ′ (cid:20) Q R (cid:21) (cid:20) x k u k (cid:21) (1a)s.t. x k +1 = Ax k + Bu k , k = 0 , . . . , N − x = ˆ x (1b) E u u k + E x x k ≤ c, k = 0 , . . . , N − (1c)where N is the horizon length, x k ∈ R n are the states, and u k ∈ R m are the inputs at sample instant k . A ∈ R n × n and B ∈ R n × m are the state-space matrices describing thediscrete-time system G s , and ˆ x ∈ R n is the current measuredsystem state. E u ∈ R l × m and E x ∈ R l × n are the stageconstraint coefficient matrices, and the vector c ∈ R l isthe vector of bounds for the stage constraints. The matrices Q = Q ′ ∈ R n × n , R = R ′ ∈ R m × m and P = P ′ ∈ R n × n are the weighting matrices for the system states, inputs and finalstates, respectively. The weighting matrices are chosen suchthat P , Q and R are positive definite.This problem can be condensed by removing the statevariables from (1) to leave only the control inputs in the vector u := (cid:2) u ′ u ′ · · · u ′ N − (cid:3) ′ . The optimization problem is then theinequality-constrained problemmin u u ′ Hu + ˆ x ′ Φ ′ u (2a)s.t. Gu ≤ F ˆ x + g (2b)with H := Γ ′ ¯ Q Γ + ¯ R , ¯ R := I N ⊗ R , ¯ Q := (cid:20) I N − ⊗ Q P (cid:21) , Γ := B AB B A B AB B ... . . . ... A N − B A N − B A N − B · · · B . (3)The choice of terminal weighting matrix P in problem (1)can be crucial to the stability and performance of the closed-loop controller. The simplest choice is to set P = Q , sothat there is no difference in the final state weighting versusthe other states. This choice allows for simple formation ofthe problem matrices, but may not guarantee stability for theclosed-loop problem.Instead, a possible choice for P is to choose it to be thesolution to the Discrete Algebraic Riccati Equation (DARE) P = A ′ P A + Q − A ′ P B ( B ′ P B + R ) − B ′ P A, (4)where Q and R are the cost matrices from Problem (1), and A and B are the system matrices. Choosing P in this wayapproximates the cost function value for the time after thehorizon ends, and allows for the derivation of closed-loopstability guarantees for the controller [10]. B. Numerically Robust CLQR Formulation
When implementing a CLQR controller for unstable sys-tems, the condensed formulation (2) becomes numericallyunstable as the horizon length increases. This is due to the factthat unstable systems will have | λ max ( A ) | > , and takingrepeated powers of A to form Γ will then cause the matrixentries to grow unbounded. This then leads to the conditionnumber of H growing unbounded with the horizon length.To overcome this, a modification to (2) was proposed in [11]where instead of using the actual system A for computingthe prediction matrix and optimal control, a pre-stabilizedsystem A − BK is used instead. This pre-stabilized systemwould guarantee that the prediction matrix entries do notgrow unbounded with the horizon length, leading to betterconditioning of the optimization problem.To formulate the pre-stabilized problem, a new system G c is formed by setting u k = − Kx k + v k where v ∈ R m is anew input so that G c := { x k +1 = ( A − BK ) x k + Bv k , (5) where the controller K is chosen so that | λ max ( A − BK ) | < .There are several ways to compute a K that meets this criteria,however we focus on K chosen as the unconstrained infinite-horizon LQR controller computed using Q and R from (1a).The computations are then done using the input space v := (cid:2) v ′ v ′ · · · v ′ N − (cid:3) ′ , turning the condensed optimizationproblem (2) into v ∗ := argmin v v ′ H c v + ˆ x ′ Φ ′ c v (6a)s.t. G c v ≤ F c ˆ x + g (6b)with A c = A − BK , Q c = Q + K ′ RK , H c := Γ ′ c ¯ Q c Γ c + ¯ R , ¯ Q c := (cid:20) I N − ⊗ Q c P (cid:21) , Γ c := B · · · A c B B A c B A c B B ... . . . ... A N − c B A N − c B A N − c B · · · B . (7)The input applied to the original system is then u = − K ˆ x + v ∗ (ˆ x ) . The constraint matrices F c and G c in Problem (6) areformed by modifying F and G from Problem (2) to use thepre-stabilized system, with full details given in [11].III. S PECTRAL P ROPERTIES
In order to effectively analyze and derive our closed-formpreconditioner, we must first derive some spectral propertiesof the matrices used in the CLQR problem, specifically theHessian and prediction matrix. Similar results to these werereported in [12] and [13, Sect. 11], but our analysis applies toany positive definite Q matrix, does not require a special rear-rangement of the Hessian to handle non-Schur-stable systems,and is built upon the principles of Toeplitz theory instead ofFourier theory. A. Matrix Symbol for the Prediction Matrix
We start by analyzing the prediction matrix Γ c and notethat its diagonals are constant blocks, which means that Γ c isa truncated block Toeplitz matrix. A finite-dimensional blockToeplitz matrix with blocks of size m × n is said to begenerated by a matrix-valued function mapping T → C m × n ,which is called its matrix symbol and represents an infinite-dimensional block Toeplitz matrix. The matrix Γ c can then beviewed as a truncation to N block diagonals of the infinite-dimensional Toeplitz matrix Γ which is represented by thematrix symbol P Γ c ( · ) .The diagonal blocks of the matrix give the spectral coeffi-cients of the matrix symbol, so the symbol can be representedas a Fourier series with the coefficients given by the matrixblocks. For the original prediction matrix Γ , the truncatedFourier series that only uses the blocks over the horizon lengthis given by N − X i =0 A i Bz − i , ∀ z ∈ T . (8) As the horizon length increases, this series is only guaranteedto converge to a finite matrix symbol when the system G withmatrices A and B is Schur-stable.To form a convergent Fourier series no matter the system,we introduce a stabilizing linear state-feedback controller u k = − Kx k + v k to the prediction, as described in Sec-tion II-B. This leads to a convergent Fourier series for theprediction matrix Γ c of the new controlled system, and thefinite matrix symbol given in Lemma 1. Lemma 1.
Let K ∈ R m × n be a linear state-feedbackcontrol matrix used to form the pre-stabilized system (5) . Theprediction matrix Γ c then has the matrix symbol P Γ c ∈ ˜ C π with P Γ c ( z ) := z ( zI − ( A − BK )) − B = z G c ( z ) , ∀ z ∈ T , where G c ( · ) is the transfer function matrix for the system G c .Proof. The matrix symbol for the block Toeplitz matrix Γ c isderived using the infinite-dimensional extension of Γ c , called Γ . This extension is formed by noting the pattern in the blocksof (7), and then extrapolating to get (where is the maindiagonal and positive blocks are below the diagonal) ˜Γ i := ( if i < , ( A − BK ) i B if i ≥ . (9)For a block Toeplitz matrix, the matrix symbol is the trigono-metric polynomial of its Fourier series. The blocks definedby (9) are the spectral coefficients of the infinite-dimensionalmatrix Γ , so the Fourier series for Γ is then ∞ X i =0 z − i ( A − BK ) i B. The constant B matrix can then be extracted from the sum-mation, leaving ∞ X i =0 ( A − BK ) i z − i . Since K was designed to make ( A − BK ) Schur-stable,this summation becomes a convergent Neumann series thatconverges to z ( zI − ( A − BK )) − [14, §3.4]. With the B matrix right-multiplying the summation, the result is thetransfer function matrix for the time-shifted system z G c ( z ) .Finally, note that the spectral coefficients of ˜Γ are absolutelysummable, so P Γ c is in the Wiener class, meaning that P Γ c ∈ L ∞ and is continuous and π -periodic, leading to P Γ c ∈ ˜ C π .Note that at this point there is no restriction on the typeof linear state-feedback controller used to pre-stabilize thesystem - any controller that results in a Schur-stable closed-loop system can be used. A convenient choice for K thoughis the infinite-horizon LQR controller designed using thecost matrices in (1). That LQR feedback law can be readilycomputed if P is chosen to be the solution to the DARE,which is a convenient choice, since it can provide closed-loopstability guarantees for the predictive controller [10].If the system G is Schur-stable to begin with, then there isno need for a pre-stabilizing controller K and the matrices Γ and Γ c are the same. The matrix symbol for Γ c can also besimplified as shown in Corollary 1. Corollary 1.
If the system G is Schur-stable, then with K =0 the prediction matrix Γ has a convergent Fourier series,producing the matrix symbol P Γ ∈ ˜ C π with P Γ ( z ) := z ( zI − A ) − B = z G ( z ) , ∀ z ∈ T , where G ( · ) is the transfer function matrix for the system G .B. Matrix Symbol for the Hessian The Hessian of the MPC problem formulation in (6) can besplit into three distinct parts H c := H Q + H R + H P (10)where H Q , H R and H P are the parts that contain the matrices Q , R and P , respectively. Slightly different analysis must bedone depending on the choice of P , and in this work we focuson the cases when P = Q and P is the solution to the DAREfor the infinite-dimensional unconstrained LQR of problem. P is the same as Q : Choosing P = Q for (1) allowsthe term H P to be consolidated into H Q , giving H c = Γ ∗ c ¯ Q c Γ c + ¯ R, (11) ¯ Q c = I N ⊗ ( Q + K ′ RK ) . (12)Analysis of the resulting matrix H c , reveals that H c is alsoblock Toeplitz with the matrix symbol given in Lemma 2. Lemma 2.
Let P = Q and P Γ c be the matrix symbol for Γ c from Lemma 1. The matrix H c is then a block Toeplitz matrixwith the matrix symbol P H c ∈ ˜ C π , where P H c ( z ) := P ∗ Γ c ( z ) ( Q + K ′ RK ) P Γ c ( z ) + R, ∀ z ∈ T . Proof.
Since ¯ R is block diagonal with the same entry on everyblock, ¯ R is a block Toeplitz matrix with symbol P ¯ R ( z ) := R .Using the assumption that P = Q , we can see that the newstate weighting matrix given in (12) is block diagonal withthe same entry in each block, making ¯ Q c a block Toeplitzmatrix with the symbol P ¯ Q c ( z ) := Q + K ′ RK . Since Γ c isa lower-triangular block matrix and Γ ∗ c is an upper-triangularblock matrix, the product Γ ∗ c ˜ Q Γ c is block Toeplitz with matrixsymbol P ∗ Γ c P ¯ Q c P Γ c [15, Lemma 4.5]. Additionally, blockToeplitz structure is preserved over the addition of two blockToeplitz matrices, meaning matrix H c is then block Toeplitz,with the symbol given in the statement.It is important to note that the product of block Toeplitz ma-trices is not guaranteed to be block Toeplitz except in certainspecial cases, while the addition of multiple block Toeplitzmatrices with compatible block sizes is always guaranteedto produce a block Toeplitz result. In this case, the lower-triangular structure of Γ c and the block Toeplitz structure of ¯ Q c implies that the product Γ ∗ c ˜ Q Γ c is one of the special caseswhere the multiplication of the three block Toeplitz matrices ofcompatible block sizes is block Toeplitz. Choosing an arbitrary P with P = Q will cause ¯ Q c to no longer be block Toeplitz, sothe multiplication will not necessarily produce a block Toeplitzmatrix and the Hessian will not be block Toeplitz, in general. P is the solution to the DARE: Choosing P as thesolution to the DARE (4) causes ¯ Q c to not be block Toeplitz,since ¯ Q c will then have a different matrix in the lower-rightcorner than the rest of the main diagonal. This means thatthe analysis based on the multiplication of structured blockToeplitz matrices used in the proof of Lemma 2 no longercan be applied. However, the resulting H c matrix is still blockToeplitz due to the fact that P will represent the cost of thecontroller applied after the horizon ends. Proposition 1. If P is the solution to the DARE (4) and K is the infinite-horizon LQR controller for G , then H c is blockToeplitz and has the same matrix symbol as the case when P = Q given in Lemma 2.Proof. Using the matrix splitting (10), the Hessian can bedecomposed into three terms, with H Q and H P given by (13)and (14), respectively, and H R = ¯ R . We start by examiningthe first diagonal term of H Q + H P , N − X i =0 B ′ ( A ic ) ′ Q c A ic B + B ′ ( A Nc ) ′ P A Nc B. (15)Since P is the solution to the DARE (4), P is also the solutionto the Lyapunov equation ( A − BK ) ′ P ( A − BK ) + Q + K ′ RK = P when K is the infinite-time LQR controller. This means that P can also be expressed as P = ∞ X i =0 (( A − BK ) ′ ) i ( Q + K ′ RK )( A − BK ) i , transforming (15) into N − X i =0 B ′ ( A ic ) ′ Q c A ic B + B ′ ( A Nc ) ′ ∞ X i =0 ( A ′ c ) i Q c A ic ! A Nc B. The ( A Nc ) ′ and A Nc terms on the right summation can beconsolidated into the summation, offsetting its starting pointto be i = N instead of . This means the right summation issimply continuing the left summation to infinity, allowing thetwo to be consolidated into ∞ X i =0 B ′ ( A ic ) ′ Q c A ic B. (16)The same analysis can be performed on the other terms onthe main diagonal, which only differ by where the left sum-mation ends and the right summation is offset to. Therefore,the main diagonal of the matrix sum H Q + H P is composed ofblocks with all the same terms. A similar analysis can be doneon all diagonals above and below the main diagonal, showingthat they are also composed of blocks with all the same termsdown the diagonal.Since all the diagonals are composed of the same blocksdown their length, the matrix sum H Q + H P is block Toeplitz,and the resulting Hessian H c is block Toeplitz as well, since H R is already known to be block Toeplitz.To construct the matrix symbol for H c when P is thesolution to the DARE, we examine the elements in the matrix H Q := P N − i =0 B ′ ( A ic ) ′ Q c A ic B P N − i =0 B ′ ( A i +1 c ) ′ Q c A ic B P N − i =0 B ′ ( A i +2 c ) ′ Q c A ic B · · · B ′ ( A N − c ) ′ Q c B P N − i =0 B ′ ( A ic ) ′ Q c A i +1 c B P N − i =0 B ′ ( A ic ) ′ Q c A ic B P N − i =0 B ′ ( A i +1 c ) ′ Q c A ic B · · · B ′ ( A N − c ) ′ Q c B P N − i =0 B ′ ( A ic ) ′ Q c A i +2 c B P N − i =0 B ′ ( A ic ) ′ Q c A i +1 c B P N − i =0 B ′ ( A ic ) ′ Q c A ic B · · · B ′ ( A N − c ) ′ Q c B ... ... ... . . . ... B ′ Q c A N − c B B ′ Q c A N − c B B ′ Q c A N − c B · · · B ′ Q c B (13) H P := B ′ ( A Nc ) ′ P A Nc B B ′ ( A Nc ) ′ P A N − c B B ′ ( A Nc ) ′ P A N − c B · · · B ′ ( A Nc ) ′ P A c BB ′ ( A N − c ) ′ P A Nc B B ′ ( A N − c ) ′ P A N − c B B ′ ( A N − c ) ′ P A N − c B · · · B ′ ( A N − c ) ′ P A c BB ′ ( A N − c ) ′ P A Nc B B ′ ( A N − c ) ′ P A N − c B B ′ ( A N − c ) ′ P A N − c B · · · B ′ ( A N − c ) ′ P A c B ... ... ... . . . ... B ′ A ′ c P A Nc B B ′ A ′ c P A N − c B B ′ A ′ c P A N − c B · · · B ′ A ′ c P A c B (14) H Q + H P and how they relate to the case when P = Q . Notethat when P = Q , the individual elements of the matrix H Q have a summation that terminates at the horizon length. Sincethe matrix symbol is based on the infinite-dimensional matrix,if the matrix H Q is extrapolated to a horizon of infinity, thesummations in H Q will all terminate at infinity. Therefore,the sum H Q + H P will have the same blocks as the infinite-dimensional ˜ H Q when P = Q , so the Hessians for the caseswhen P = Q and P is the solution to the DARE will bothhave the same matrix symbol.
3) Simplification when G is Schur-stable: When G is Schur-stable and the results in Corollary 1 are used to simplify thematrix symbol of the prediction matrix, the results given inLemma 2 and Proposition 1 can be simplified as well. Corollary 2.
If the system G is Schur-stable, then with K =0 and P = Q or P the solution to the discrete Lyapunovequation A ′ P A + Q = P , the Hessian H has the matrixsymbol P H ∈ ˜ C π with P H ( z ) := P ∗ Γ ( z ) Q P Γ ( z ) + R, ∀ z ∈ T . where G ( · ) is the transfer function matrix for the system G . Note that the matrix symbol in Corollary 2 has the sameform as the matrix symbol in Lemma 2, however the choiceof the matrix P differs. In the Schur-stable case, in order forthe results of Proposition 1 to hold, the terminal cost mustbe based on the solution to the discrete Lyapunov equationinstead of the DARE. C. Spectral Bounds for the Hessian
One useful property of block Toeplitz matrices is that theeigenvalue spectrum for any finite-dimensional truncation ofthe infinite-dimensional block Toeplitz matrix is containedwithin the extremal eigenvalues of its matrix symbol. Thismeans that the minimum and maximum eigenvalues of thematrix H c can be bounded by analyzing the matrix symbol P H c , since H c is a finite-dimensional truncation of the infinite-dimensional matrix H c to N blocks. Theorem 1.
Let H c be the condensed Hessian for a predictionhorizon of length N that is block Toeplitz with matrix symbol P H c given in Lemma 2, then the following hold: (a) λ min ( P H c ) ≤ λ ( H c ) ≤ λ max ( P H c ) (b) lim N →∞ κ ( H c ) = κ ( P H c ) Proof. (a) The spectrum of a finite-dimensional truncation of a blockToeplitz matrix with a symbol in ˜ C π is bounded by theextremes of the spectrum of its symbol [15, Theorem 4.4].(b) Note that H c is a Hermitian matrix, which means thatit is also normal [16, §4.1]. Since it is both normal andpositive semi-definite, the singular values are the sameas the eigenvalues [17, §3.1], resulting in the conditionnumber becoming κ ( H c ) = λ n ( H c ) λ ( H c ) . Taking the limit ofboth sides in conjunction with the spectral bounds frompart (a) gives lim N →∞ κ ( H c ) = κ ( P H c ) . Remark 1.
Proposition 1 means that the bounds computedfrom Theorem 1 using Lemma 2 hold for both the case when P = Q and the case when P is the solution to the DARE. Essentially, these results say that the spectrum for thecondensed Hessian in these cases will always be containedinside the interval defined by the maximum and minimumeigenvalues of the matrix symbol in Lemma 2. Additionally,as N → ∞ the extremal eigenvalues of H c will convergeasymptotically to the maximum and minimum eigenvalues ofits symbol. IV. P RECONDITIONING
The spectral results presented in Sections III-B and III-Ccan be readily extended to analyze the case of a preconditionedHessian, as well as to help design new preconditioners.
A. Analysis of the Preconditioned Hessian
For simplicity of discussion, we focus on the case when H c is symmetrically preconditioned as L − N H c ( L − N ) ′ with ablock-diagonal preconditioner L N that has N copies of theblock L on its diagonal, thus guaranteeing that the precondi-tioned matrix is block Toeplitz. This case is most appropriatefor first-order methods, since it guarantees that the structure ofthe feasible set is preserved over the preconditioning operation and that the preconditioned Hessian is symmetric [8]. Resultscan also be derived for non-block-diagonal preconditionersusing [9, Thm 4.3] with M − H c where M := L N L ′ N , butwe do not discuss this extension.Since the preconditioner matrix L N is block-diagonal withonly L on its main diagonal, the matrix symbol for L N issimply L . The spectral bounds in Section III-C can then beextended to the preconditioned matrix H L by simply replacing P H c in Theorem 1(b) with P H L given by P H L := ¯ L P H c ¯ L ′ , (17)where ¯ L := L − . This analysis requires L N to be blockToeplitz, which may not be the case for preconditioners com-puted using the entire Hessian matrix, unless such a constraintis added to their computation. B. Preconditioner Design
There is a rich literature of preconditioners for Toeplitz andcirculant matrices, with a focus on designing the precondi-tioners independent of the size of the matrix (see [18] andreferences therein). These existing ideas can be applied to theblock Toeplitz structure of the Hessian in the CLQR problemin order to design preconditioners to use when solving theoptimization problem.One of the first circulant preconditioners was proposed byGilbert Strang in [19]. This preconditioner was originally pro-posed for preconditioning iterative conjugate gradient meth-ods, and is formed by simply copying the central diagonalsof the Toeplitz matrix into the preconditioning matrix andwrapping them around to form a circulant matrix. Strang’spreconditioner can be naturally extended to the block Toeplitzcase by simply copying the individual blocks into the precon-ditioning matrix and wrapping them around to form a blockcirculant matrix.In the case of the CLQR problem with a block diagonalpreconditioning matrix, we can explicitly compute the blockon the diagonal of the preconditioning matrix without formingthe entire Hessian, as shown in Theorem 2.
Theorem 2.
Let H c be the Hessian from (6) formed bychoosing K as the infinite-horizon LQR controller for G , with P the solution to the DARE (4) . The matrix H c can be symmet-rically preconditioned as L − N H c ( L − N ) ′ , with L N = I N ⊗ L and the blocks L given by the lower-triangular Choleskydecomposition of M with M := B ′ P B + R. (18) Proof.
Based on the work in [19], a Circulant preconditioningmatrix W for the matrix V will have entries that are obtainedby copying the central diagonals of V and wrapping themaround to form a circulant matrix. For example, for the matrix V = V V V V V V − V V V V V − V − V V V V − V − V − V V V − V − V − V − V , the block circulant preconditioning matrix will be W = V V V V − V − V − V V V V − V − V − V V V V V − V − V V V V V − V − V . Since we wish for the preconditioner to be block diagonal, weonly need to focus on computing the diagonal block V forthe CQLR problem.For the block Toeplitz Hessian H c , the main diagonal blockof the infinite dimensional block Toeplitz matrix is (15), whichwhen including the H R component becomes ∞ X i =0 B ′ ( A ik ) ′ Q k A i B + R. Since A k is Schur-stable and K is the LQR controller, thisinfinite sum converges to the solution of the DARE, makingthe diagonal block V = B ′ P B + R. Strang’s preconditioner is designed to act as a left precon-ditioner (e.g. W − V ), so since we want a symmetric precon-ditioner we apply the lower-triangular Cholesky factorizationto M , resulting in L . Remark 2.
The preconditioning matrix L in Theorem 2 canalso be formed by taking the matrix square-root of M , whichcould allow for L to have the same structure as M if thesquare root operation is structure preserving. Note that the matrices M and L will have dimension m × m ,and that the full preconditioning matrix L N is formed bysimply repeating L down the diagonal N times, so changingthe horizon length means simply adding or removing L blocks from the diagonal of L N . Additionally, all the matricesinvolved in computing M have dimensions on the order of m and n , and have no relation to the horizon length. This is incontrast to SDP-based preconditioner design techniques suchas [8], which require the full Hessian to be placed inside thesemidefinite optimization problem.V. N UMERICAL E XPERIMENTS
In this section we present numerical examples showing thespectral properties computed using the results of Section III,and also the effect of applying the preconditioner from Sec-tion IV-B to the CLQR problem for three systems.
A. Example Systems1) Schur-stable system:
The first example system we useis the Schur-stable discrete-time system with four states andtwo inputs given in [20] with state equation and cost matrices x k +1 = . − . . . . − . . . . . . . . . . . x k + . . . . . . . . u k ,Q = diag (10 , , , , R = diag (10 , . We constrain the inputs of the system to be | u k | ≤ . .
2) Inverted pendulum:
The next example system we use isa linearized inverted pendulum described by the continuous-time dynamics ˙ x = g l − b x + l u, with g = 9 . , b = 1 , and l = 0 . . The system was dis-cretized using a zero order hold with a sampling time of 0.02s,resulting in an unstable discrete-time system. The CLQR prob-lem used the cost matrices Q = diag (1000 , , , , R = 10 , and the input was constrained to be | u | ≤ .
3) Distillation column:
The final example system we use isa binary distillation column with 11 states and 3 inputs fromthe IFAC control benchmarks [21, Problem 90-01]. The systemwas discretized using a zero order hold with a sampling timeof 0.01s, resulting in a Schur-stable discrete-time system. TheCLQR problem used the cost matrices Q = diag (10 , , , , , , , , , , ,R = diag (10 , , , and the inputs were constrained to be | u | ≤ . , | u | ≤ . , | u | ≤ . . B. Matrix Conditioning
The first numerical results we present show the theoreticalresults from Theorem 1 when applied to the example systems.It can be seen in Figures 2b and 1b that the limits presentedin Theorem 1(b) and Corollary 2, respectively, are attainedas the horizon length increases. Note though that the rate atwhich the condition number of the finite-dimensional Hessianapproaches the bound is system-dependent, with the Schur-stable example system converging within 0.01% of the boundby N = 40 , while the inverted pendulum system requires N = 225 to converge to within 0.01%.As described in Section IV-A, the condition number boundof Theorem 1(b) can be used to analyze the preconditionedHessian, which is shown in Figures 1b and 2b. The boundwas computed for the proposed preconditioner by finding L using Theorem 2, then substituting (17) into Theorem 1(b).Note that the bound computed using L from Theorem 2 doesnot hold for the SDP preconditioner from [8], since L will bedifferent between the two preconditioners. Additionally, theSDP preconditioner does not guarantee that L N will be blockToeplitz, so Theorem 1 cannot be used to compute horizon-independent limits when it is used.Comparing the proposed preconditioner against the existingSDP preconditioner, by examining the condition number of thepreconditioned Hessian, shows that the two preconditionersproduce nearly identical condition numbers, as can be seen inFigures 1b and 2b. C. Algorithm Performance
To test the effect of the preconditioners on the actualperformance of a first-order method solving the CLQR prob-lem, the Fast Gradient Method (FGM) was implemented in
Horizon Length N κ ( H ) H with P = QH with P = DLYAP( A, Q ) Bound from Corollary 2 (a) Original Hessian
Horizon Length N κ ( H L ) SDP [8]Proposed (Theorem 2)Bound from Section IV-A (b) Preconditioned HessianFig. 1. Condition number versus the horizon length of the condensed Hessianfor the Schur-stable system.
Julia using the constant step-size scheme and gradient mapstopping criteria from [22]. The inequality constraints wereimplemented through projection operations, with the non-prestabilized problems being simple projections onto a boxand the prestabilized problems requiring a more complexprojection algorithm. In these examples, the projections forthe prestabilized problems were computed by solving theprojection QP directly; however, in an embedded applicationother techniques, such as Dykstra’s method [23] or an explicitQP [24], can be used instead. All CLQR problems were setupwith a horizon length of 10, and a desired error of − forthe FGM.We present two sets of results in this section, with Table Ishowing the condition number of the Hessian with a horizonof 10 and Table II showing the number of iterations requiredfor the FGM to converge to ǫ = 10 − .Both the SDP preconditioner and our proposed precondi-tioner decrease the condition number by 66% for the Schur-stable system and the already Schur-stable non-prestabilizeddistillation column. This translates to a 2.1x speedup forthe Schur-stable system and a 3.6x speedup for the non-prestabilized distillation column. When the SDP precondi-tioner is applied to the non-prestabilized inverted pendulum,it has no discernable effect on the iterations required for theFGM to converge, keeping the number of iterations requiredat 51. Note that the proposed preconditioner is not defined forthe non-prestabilized inverted pendulum, since the system is Horizon Length N κ ( H ) H with P = QH with P = DARE( A, B, Q, R ) Bound from Theorem 1(b) (a) Original Hessian
Horizon Length N κ ( H L ) Un-preconditionedSDP [8]Proposed (Theorem 2)Bound from Section IV-A (b) Preconditioned HessianFig. 2. Condition number versus the horizon length of the pre-stabilizedcondensed Hessian for the inverted pendulum system. unstable and H does not have a convergent matrix symbol.A large decrease in iterations is caused by applying aprestabilizing controller to the unstable inverted pendulum,which results in a 3.9x speedup. Prestabilizing the alreadySchur-stable distillation column produces marginal benefit,with the condition number decreasing by only 0.5% andrequiring one extra iteration.While applying the prestabilizing controller to the Schur-stable distillation column produces marginal benefit, applyingthe preconditioner to the prestabilized distillation columnproduces a speedup of 6x, a larger speedup than just precon-ditioning the non-prestabilized system on its own.VI. C ONCLUSIONS
In this work we presented a preconditioner for the con-densed CLQR problem capable of producing up to a 6xspeedup for the Fast Gradient Method in our numerical ex-amples. This preconditioner is simple to compute and usesonly matrices with dimensions on the order of the number ofstates and inputs in its computations, as opposed to existingSDP preconditioners that require the full Hessian matrix fortheir computations. Additionally, we derived results relatingthe extrema of the spectrum for the condensed Hessian to theextrema of the spectrum for a complex-valued matrix symbolformed using the weighting matrices and the system’s transferfunction. We also showed that these results can be used to
TABLE IC
ONDITION NUMBER OF THE CONDENSED H ESSIAN FOR VARIOUSPRECONDITIONERS WITH N = 10 . System None SDP [8] Proposed (Thm. 2)
Schur-stable [20] 8.776 2.922 2.933Inverted pendulum(non-prestabilized) 42.512 42.468 N/A Inverted pendulum(LQR prestabilized) 3.508 3.489 3.508Distillation column [21](non-prestabilized) 3.020 1.006 1.006Distillation column [21](LQR prestabilized) 3.004 1.001 1.001 Preconditioner not computableTABLE III
TERATIONS REQUIRED FOR COLD - START CONVERGENCE OF THE F AST G RADIENT M ETHOD TO ǫ =10 − F OR VARIOUS PRECONDITIONERSWITH N = 10 . System None SDP [8] Proposed (Thm. 2)
Schur-stable [20] 19 9 9Inverted pendulum(non-prestabilized) 51 51 N/A Inverted pendulum(LQR prestabilized) 13 12 13Distillation column [21](non-prestabilized) 11 3 3Distillation column [21](LQR prestabilized) 12 2 2 Preconditioner not computable estimate the condition number of the Hessian when using ourproposed preconditioner.When applied to several example systems, the proposed pre-conditioner and an SDP preconditioner achieved comparablereductions in the number of FGM iterations needed to solve theCLQR problem. The prestabilization necessary for computingthe proposed preconditioner also reduces the number of FGMiterations required to solve the CLQR problem on its own, butat the expense of turning the constraint sets into more complexshapes.The derivation and examples in this work focused on pre-conditioning the primal QP for the CLQR problem, howeverit is also common to use a dual form of (2) with gradientalgorithms such as the Dual Gradient Projection or Dual FastGradient Method. The preconditioner defined in Theorem 2could be extended to also handle the dual problem by usingthe diagonal blocks of the dual Hessian in the preconditonerinstead. The theoretical bounds for the preconditioned Hessianderived in Section IV-A may not be easily extended though,since we have not investigated if the dual Hessian possessesa block Toeplitz structure like the primal Hessian.This work also highlighted the relationship between thespectrum of the transfer function and the spectrum of the con-densed Hessian by showing that the spectrum of the predictedsystem directly affects the spectrum of the Hessian. We alsoshowed that the numerical prestabilization controller K has adirect effect on the spectrum of the Hessian, so preconditioningcould also be achieved through a careful choice of K instead ofapplying a separate preconditioner. Future work could exploredeveloping a preconditioner based on loop-shaping of thesystem to reduce the Hessian’s condition number. R EFERENCES[1] S. Lucia, M. K¨ogel, P. Zometa, D. E. Quevedo, and R. Findeisen,“Predictive control, embedded cyberphysical systems and systems ofsystems - A perspective,”
Annual Reviews in Control , vol. 41, pp. 193–207, 2016.[2] P. Zometa, M. Kogel, T. Faulwasser, and R. Findeisen, “Implementationaspects of model predictive control for embedded systems,” in . IEEE, 2012, p. 1205–1210.[3] J. L. Jerez, P. J. Goulart, S. Richter, G. A. Constantinides, E. C. Kerrigan,and M. Morari, “Embedded Online Optimization for Model PredictiveControl at Megahertz Rates,”
IEEE Transactions on Automatic Control ,vol. 59, no. 12, pp. 3238–3251, 2014.[4] A. Malyshev, R. Quirynen, and A. Knyazev, “Preconditioning ofconjugate gradient iterations in interior point MPC method,”
IFAC-PapersOnLine , vol. 51, no. 50, pp. 394–399, 2018.[5] R. Quirynen, A. Knyazev, and S. D. Cairano, “Block Structured Precon-ditioning within an Active-Set Method for Real-Time Optimal Control,”in . Limassol, Cyprus: IEEE,2018, pp. 1154–1159.[6] A. Guiggiani, P. Patrinos, and A. Bemporad, “Fixed-Point Implemen-tation of a Proximal Newton Method for Embedded Model PredictiveControl,”
Proceedings of the 19th IFAC World Congress , pp. 2921–2926,2014.[7] P. Giselsson and S. Boyd, “Preconditioning in Fast Dual GradientMethods,” in .Los Angelas, CA: IEEE, 2014, pp. 5040–5045.[8] S. Richter, C. N. Jones, and M. Morari, “Computational ComplexityCertification for Real-Time MPC With Input Constraints Based onthe Fast Gradient Method,”
IEEE Transactions on Automatic Control ,vol. 57, no. 6, pp. 1391–1403, 2012.[9] M. Miranda and P. Tilli, “Asymptotic spectra of Hermitian block Toeplitzmatrices and preconditioning results,”
SIAM Journal on Matrix Analysisand Applications , vol. 21, no. 3, pp. 867–881, 2000.[10] D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. Scokaert, “Con-strained model predictive control: Stability and optimality,”
Automatica ,vol. 36, no. 6, pp. 789–814, 2000. [11] J. A. Rossiter, B. Kouvaritakis, and M. J. Rice, “A NumericallyRobust State-Space Approach to Stable-Predictive Control Strategies,”
Automatica , vol. 34, no. 1, pp. 65–73, 1998.[12] O. J. Rojas and G. C. Goodwin, “On the asymptotic properties of theHessian in discrete-time linear quadratic control,” in . Boston, MA: IEEE, 2004, pp. 239–244.[13] G. C. Goodwin, M. M. Seron, and J. A. De Dona,
Constrained Controland Estimation - An Optimisation Based Approach . London, UK:Springer, 2005.[14] K. B. Peterson and M. S. Pederson, “The Matrix Cookbook,” 2012.[15] J. Guti´errez-Guti´errez and P. M. Crespo, “Block Toeplitz Matrices:Asymptotic Results and Applications,”
Foundations and Trends in Com-munications and Information Theory , vol. 8, no. 3, pp. 179–257, 2012.[16] R. A. Horn and C. R. Johnson,
Matrix Analysis , 2nd ed. Cambridge,UK: Cambridge University Press, 2013.[17] ——,
Topics in Matrix Analysis . Cambridge, UK: Cambridge Univer-sity Press, 1994.[18] R. H. Chan and X.-Q. Jin,
An Introduction to Iterative Toeplitz Solvers .SIAM, 2007.[19] G. Strang, “A proposal for toeplitz matrix calculations,”
Studies inApplied Mathematics , vol. 74, no. 2, p. 171–176, 1986.[20] C. N. Jones and M. Morari, “The Double Description Method forthe Approximation of Explicit MPC Control Laws,” in . Cancun, Mexico: IEEE,2008, pp. 4724–4730.[21] “Benchmark Problems for Control System Design: Report of the IFACTheory Committee,” IFAC, Tech. Rep., 1990. [Online]. Available:https://taskforce.ifac-control.org/industry-committee/reference-materials/benchmark-problems-for-control-system-design/view[22] S. Richter, “Computational complexity certification of gradient methodsfor real-time model predictive control,” Doctoral Thesis, ETH Zurich,2012.[23] N. Gaffke and R. Mathar, “A cyclic projection algorithm via duality,”
Metrika , vol. 36, no. 1, pp. 29–54, 1989.[24] S. Merkli, J. Jerez, A. Domahidi, R. S. Smith, and M. Morari, “CircuitGeneration for Efficient Projection onto Polyhedral Sets in First-OrderMethods,” in2015 European Control Conference (ECC)