Computing delay Lyapunov matrices and H2 norms for large-scale problems
CCOMPUTING DELAY LYAPUNOV MATRICES AND H NORMSFOR LARGE-SCALE PROBLEMS
WIM MICHIELS ∗ AND
BIN ZHOU † Abstract.
A delay Lyapunov matrix corresponding to an exponentially stable system of lineartime-invariant delay differential equations can be characterized as the solution of a boundary valueproblem involving a matrix valued delay differential equation. This boundary value problem can beseen as a natural generalization of the classical Lyapunov matrix equation. Lyapunov matrices playan important role in constructing Lyapunov functionals and in H optimal control. In this paperwe present a general approach for computing delay Lyapunov matrices and H norms for systemswith multiple discrete delays, whose applicability extends towards problems where the matrices arelarge and sparse, and the associated positive semidefinite matrix (the “right-hand side” for thestandard Lyapunov equation), has a low rank. The problems addressed are challenging, becausebesides that the boundary value problem is matrix valued with a structure that much harder toexploit than in the delay-free case, its solution is in the generic situation non-smooth. In contractto existing methods that are based on solving the boundary value problem directly, our methodis grounded in solving standard Lyapunov equations of increased dimensions. It combines severalingredients: i) a spectral discretization of the system of delay equations, ii) a targeted similaritytransformation which induces a desired structure and sparsity pattern and, at the same time, favorsaccurate low rank solutions of the corresponding Lyapunov equation, and iii) a Krylov method forlarge-scale matrix Lyapunov equations. The structure of the problem is exploited in such a way thatthe final algorithm does not involve a preliminary discretization step, and provides a fully dynamicconstruction of approximations of increasing rank. Interpretations in terms of a projection methoddirectly applied to a standard linear infinite-dimensional system equivalent to the original time-delay system are also given. Throughout the paper two didactic examples are used to illustrate theproperties of the problem, the challenges and methodological choices, while numerical experimentsare presented at the end to illustrate the effectiveness of the algorithm. Key words. delay system, Lyapunov matrix equations, Krylov method
1. Introduction.
We consider a linear system with multiple discrete delays,˙ x ( t ) = A x ( t ) + m (cid:88) i =1 A i x ( t − τ i ) , (1.1)where x ( t ) ∈ R n is the state variable at time t , A i ∈ R n × n arethesystemmatrices and τ i , i = 1 , . . . , m , represent time-delays, ordered such that0 < τ < · · · < τ m . Throughout the paper we assume that the zero solution of (1.1) is exponentially stable,or equivalently, that all its characteristic roots, i.e., the solutions of equationdet (cid:32) λI − A − m (cid:88) i =1 A i e − λτ i (cid:33) = 0 , are confined to the open left half plane [18, 19]. The fundamental solution of (1.1),which we denote by K : R → R n × n , is defined as the function satisfying ˙ K ( t ) = A K ( t ) + (cid:80) mi =1 A i K ( t − τ i ) , for almost all t ≥ ,K (0) = I,K ( t ) = 0 , for t < . (1.2) ∗ KU Leuven, Department of Computer Science, Heverlee, Belgium, { [email protected] † Harbin Institute of Technology, Center for Control Theory and Guidance Technology, Harbin,China { [email protected] a r X i v : . [ m a t h . NA ] A ug he delay Lyapunov matrix for (1.1), associated with a positive semidefinite matrix,whose rank revealing decomposition reads as BB T , where B ∈ R n × r is of full rank r ,is defined as a function P : R → R n × n such that P ( t ) = (cid:90) ∞ K ( s ) BB T K T ( s + t ) ds. (1.3)Following from the exponential stability condition of (1.1), the delay Lyapunov matrixcan be characterized as the unique solution of the matrix valued “boundary” valueproblem ˙ P ( t ) = P ( t ) A T + (cid:80) mk =1 P ( t − τ k ) A Tk , t ≥ ,P ( − t ) = P T ( t ) , − BB T = P (0) A T + A P (0) + (cid:80) mk =1 (cid:0) P ( − τ k ) A Tk + A k P ( τ k ) (cid:1) , (1.4)see [13]. There is also a dual formulation: with a positive semi-definite ( n × n )-matrix C T C we can associate Lyapunov matrix Q ( t ) = (cid:90) ∞ K T ( s ) C T CK ( s + t ) ds, which corresponds to the unique solution of ˙ Q ( t ) = Q ( t ) A + (cid:80) mk =1 Q ( t − τ k ) A k , t ≥ ,Q ( − t ) = Q T ( t ) , − C T C = Q (0) A + A T Q (0) + (cid:80) mk =1 (cid:0) Q ( − τ k ) A k + A Tk Q ( τ k ) (cid:1) . (1.5)Note that in the delay-free case the third equation in (1.4) and the third one in (1.5)reduce to a standard pair of primal and dual Lyapunov matrix equations.The delay Lyapunov matrix is a building block for the construction of Lyapunovfunctionals of complete type, which are associated with necessary and sufficient sta-bility conditions, see [13] for an excellent review. It should be remarked that in theliterature on complete type Lyapunov functionals, the Lyapunov matrix is usuallydenoted by U ( t ), which corresponds to Q ( t ) in our adopted notation. Another com-ment is that in several works the Lyapunov matrix is alternatively defined directlyas the solution of boundary value problem (1.4) or (1.5). In this way it can also bedefined for an exponentially unstable system (provided the delay systems has no pairof eigenvalues ( λ , λ ) such that λ + λ = 0, see [14, 13]), at the price that the afore-mentioned connection with the fundamental solution is lost. The Lyapunov matrixalso plays a major role in the characterization of the H norm for system (cid:26) ˙ x ( t ) = A x ( t ) + (cid:80) mi =1 A i x ( t − τ i ) + Bu ( t ) ,y ( t ) = Cx ( t ) , (1.6)where u ∈ C r is the input, y ∈ C s is the output and B ∈ R n × r , respectively C ∈ R s × n are the input, respectively and out matrix of the model. 
The transfer function of thesystem (1.6) is given byΥ( s ) = C (cid:32) sI − A − m (cid:88) i =1 A i e − sτ i (cid:33) − B. (1.7)The H norm of Υ is defined in the frequency domain as (cid:107) Υ (cid:107) = 12 π (cid:18)(cid:90) ∞−∞ Tr (Υ ∗ ( ıω )Υ( ıω )) dω (cid:19) , hile an equivalent definition in the time-domain is given by (cid:107) Υ (cid:107) = (cid:18)(cid:90) ∞ Tr (cid:0) h T ( t ) h ( t ) (cid:1) dt (cid:19) , with h the impulse response. The following proposition expresses the H norm interms of the delay Lyapunov matrix, whose proof trivially follows from the identity h ( t ) = CK ( t ) B. (1.8) Proposition 1.1. [12, Theorem 1] The H norm of (1.7) satisfies (cid:107) Υ (cid:107) = Tr (cid:0) CP (0) C T (cid:1) , (1.9) where P ( t ) is the delay Lyapunov matrix associated with matrix BB T . The aim of this paper is to present a novel method for computing delay Lyapunovmatrices and H norms with the following properties: • it is generally applicable, in the sense that there are no restrictions on thenumber and values of the delays, and the delay Lyapunov matrix can be easilycomputed or extended a posteriori beyond the interval [ − τ m , τ m ]; • the number of operations scales favorably with respect to the dimension n ofthe system matrices, particularly if the matrices are sparse, targeting (dis-cretizations of) partial differential equations (PDEs) with delay, provided thatthe rank r of B is small compared to n In the description of the results we restrict ourselves to the computation of Lyapunovmatrix P ( t ), since Q ( t ) can be obtained from Lyapunov matrix P ( t ) associated witha “transposed” system, inferred from the substitutions A k ← A Tk , k = 0 , . . . , m and B ← C T . The latter directly follows from a comparison between (1.4) and (1.5).The characterization (1.4) provides a natural way to compute the Lyapunov ma-trix, and the H norm via formula (1.9). However, there are major challenges. First,when making the leap from ordinary to delay differential equations, the algebraicLyapunov matrix equation is replaced by a matrix-valued boundary value problemwith delay. Second, bringing the equation in triangular form using a Schur decom-position, which forms the basis of the celebrated Bartels-Stewart algorithm for thematrix Lyapunov equation, is no longer possible. Third, it has been shown in [12]that function R (cid:51) t (cid:55)→ P ( t ) may be non-smooth. The function is continuous, but itmay be not be differentiable at t = 0. On the interval [0 , ∞ ), to which we restrict inthis paper because of the second condition in (1.4), it is continuously differentiable,yet the second derivative might be discontinuous at t = τ i , i = 1 , . . . , m , as we shallillustrate in the next section.In the present literature two approaches for solving (1.4) can be identified. Thefirst one, the so-called direct approach, is based on approximating the solution on aninterval by a matrix polynomial or a piecewise matrix polynomial and, besides impos-ing the boundary conditions and continuity requirements, determining the coefficientsby collocation conditions for the differential equation, see, e.g., [8, 12]. With N thenumber of collocation points, this results in a linear system of equations in O ( n N )variables. The convergence of the obtained approximations to the solution as a func-tion of N might be slowed down by the lack of smoothness of the solution discussedabove, see [12] for a detailed analysis. The second approach can be interpreted asa shooting method. 
It is applicable only if the time-delays are commensurate, i.e., i = n i h for some h > n i ∈ N , i = 1 , . . . , m , and it exploits that, in this case,the solution of (1.4) is piecewise smooth (more precisely, smooth on intervals of form( ih, ( i + 1) h ) , i ∈ Z ). Then (1.4) can be reformulated as a standard boundary valueproblem for an ordinary differential equation of dimensions 2 n n m on the interval[0 , h ]. For the latter boundary value problem, the transition between starting andend time can be determined explicitly in the form of the action of a matrix exponen-tial (the so-called semi-analytic approach [13, 6]) or by a numerical time-integrationscheme [11].The common factor that leads to a poor scalability of the above vectorizationbased approaches with respect to the dimension n is that they rely on solving a sys-tem of equations in n variables, possibly multiplied with a large factor, hence whenusing a direct solver the number of elementary operations amounts to O ( n ) opera-tions. To the best of the authors’ knowledge the only available method that allowsto address large-scale problems is the one presented in [11] for the single delay case,which falls under the umbrella of shooting methods, with the transition map deter-mined by time-integration. The key idea behind this approach, which has been shownto be effective for problems with n up to ≈ O ( n ) operations. This ap-proach is complementary to the presented approach, which has the distinctive featurethat it grounded in solving standard Lypunov matrix equations.In Section 2 we present a spectral discretization of equation (1.6) into an or-dinary equation of dimensions ( N + 1) n , where N determines the resolution of thediscretization. This allows us to obtain approximations of delay Lyapunov matricesand H norms from solving standard Lyapunov matrix equations. We also show howusing a transformation a favorable structure can be imposed. The main results areobtained in Section 3, where among others projections on Krylov spaces are usedto approximate the solutions of these Lyapunov equations, resulting in a dynamicconstruction of Lyapunov matrix approximations. Note that Krylov methods consti-tute an established approach for solving large scale matrix Lyapunov equations, see,e.g., [22, 21, 5] and the references therein. We will show that several methodologi-cal choices can be made in such a way that the overall algorithm does not dependany more on parameter N (the only condition is that it is sufficiently large with re-spect to the number of iterations for building the Krylov space). This property isat the basis of an interpretation of in terms of a projection method applied directlyto a linear infinite-dimensional system equivalent to the original delay system. Inthis sense the algorithm complements the set of “discretizaton free” algorithms forsolving nonlinear eigenvalue problems and associated problems in [9, 10]. Numericalexperiments are reported in Section 4 and some concluding remark are formulatedin Section 5. Preliminary results regarding the computation of H norms have beenpresented in [20].
2. Finite-dimensional approximation.
In Section 2.1 we outline how to dis-cretize (1.6) (and as a consequence (1.1)) using a spectral method [24], resulting ina system described by ordinary differential equations. For sake of conciseness, thederivation is slightly different from [2], in the sense that the connection of (1.6) withan abstract infinite-dimensional linear system is not explicitly made. Subsequently,we outline how an approximation of the delay Lyapunov matrix can be obtained fromthis discretization. In Section 2.2 we discuss and illustrate properties of the obtained pproximations. In Section 2.3 we refermulate the expressions for the delay Lyapunovapproximations in a form that is more suitable for the application of a Krylov method. Given a positive integer N , we consider a meshΩ N of N + 1 distinct points in the interval [ − τ m , N = { θ N,i , i = 1 , . . . , N + 1 } , (2.1)where − τ m ≤ θ N, < . . . < θ N,N < θ
N,N +1 = 0 . Throughout the paper we choose the nonzero mesh points as scaled and shifted zerosof the Chebyshev polynomial of the second kind and order N , i.e. the mesh pointsare specified as θ N,i = τ m α N,i − , α N,i = − cos πiN + 1 , i = 1 , . . . , N + 1 . (2.2)Denoting with l N,k the Lagrange polynomials corresponding to Ω N , i.e., realvalued polynomials of degree N satisfying l N,k ( θ N,i ) = (cid:26) i = k, i (cid:54) = k, and letting x k , k = 1 , . . . , N + 1 functions from R to R n , we approximate the “pieceof trajectory” x ( t + θ ) , θ ∈ [ − τ m ,
0] as follows, x ( t + θ ) ≈ N +1 (cid:88) k =1 l N,k ( θ ) x k ( t ) , θ ∈ [ − τ m , , (2.3)which induces on its term the approximation x ( t ) ≈ x ( t + θ N, ) , ... x N ( t ) ≈ x ( t + θ N,N ) ,x N +1 ( t ) ≈ x ( t ) . (2.4)Along a solution of (1.1), x is differentiable almost everywhere, hence for almost all t ≥ , θ ∈ [ − τ m ,
0] we can express ∂x ( t + θ ) ∂t = ∂x ( t + θ ) ∂θ . Requiring that the right-hand side of (2.3) satisfies this identity for (collocation points) θ N, , . . . , θ N,N brings us to the equations˙ x i ( t ) = N +1 (cid:88) k =1 ˙ l N,k ( θ i ) x k ( t ) , i = 1 , . . . , N. (2.5)Next, substituting the right-hand side of (2.3) into (1.6) yields (cid:40) ˙ x N +1 ( t ) = A x N +1 ( t ) + (cid:16)(cid:80) mi =1 (cid:80) N +1 k =1 A i l N,k ( − τ i ) (cid:17) x k ( t ) + Bu ( t ) ,y ( t ) = Cx N +1 ( t ) . (2.6) etting z ( t ) = [ x T ( t ) · · · x TN +1 ( t )] T ∈ R ( N +1) n × , Equations (2.5) and (2.6) canbe written as (cid:26) ˙ z ( t ) = A N z ( t ) + B N u ( t ) ,y ( t ) = C N z ( t ) , (2.7)where A N = d , . . . d ,N +1 ... ... d N, . . . d N,N +1 a . . . a N +1 , B N = ⊗ B,C N = [0 · · · ⊗ C (2.8)and (cid:26) d i,k = ˙ l N,k ( θ N,i ) I n , i ∈ { , . . . , N } , k ∈ { , . . . , N + 1 } ,a k = A l N,k (0) + (cid:80) mi =1 A i l N,k ( − τ i ) , k ∈ { , . . . , N + 1 } . The advantage of approximation (2.7) is that it is in the form of a standard statespace representation, for which many analysis and control design techniques exist.We refer to [26] where (2.7) is at the basis of a design method for fixed-order H optimal controller.According to (2.4), it is natural to relate the initial condition in the definitionof the fundamental solution K , see (1.2), with initial condition z (0) = E N of (2.7),where E N = [0 · · · T ⊗ I n . This allows us to approximate the fundamental matrix K ( t ) by K N ( t ), defined as K N ( t ) = E TN e A N t E N , (2.9)which by (1.3) leads us on its turn to an approximation P N of P , P N ( t ) = (cid:82) ∞ K N ( s ) BB T K ( s + t ) ds = (cid:82) ∞ E TN e A N s B N B TN e A TN ( s + t ) E N ds. (2.10)Similarly, we can approximate Υ in (1.7) by the transfer function of (2.7), given byΥ N ( s ) = C N ( sI − A N ) − B N . (2.11)The following proposition provides a computational expression for P N in terms of aLyapunov matrix equation. The arguments in the proof are well known but we includethem to make the paper self contained. Proposition 2.1.
If matrix A N is Hurwitz, we can express P N ( t ) = E TN P N e A TN t E N , (2.12) where P N satisfies the Lyapunov equation A N P N + P N A TN + B N B TN = 0 . (2.13) roof . We can write (2.10) as P N ( t ) = E TN ˜ P N e A TN t E N , where˜ P N = (cid:90) ∞ e A N s B N B TN e A TN s ds. We have A N ˜ P N + ˜ P N A TN = (cid:82) ∞ dds (cid:16) e A N s B N B TN e A TN s (cid:17) ds = − B N B TN , the latter following from the Hurwitz property of A N . Since for the same reason thesolution to Lyapunov equation (2.13) uniquely exists, we conclude ˜ P N = P N .The next proposition expresses that the approximation (2.10) of the delay Lya-punov matrix, and the approximation of the transfer function, are consistent withrespect to property (2.2). Proposition 2.2.
Function (2.10) and transfer function (2.11) satisfy (cid:107) Υ N (cid:107) = tr (cid:0) C P N (0) C T (cid:1) . (2.14) We discuss properties of approximations (2.7) and (2.10) whichare instrumental to the developments in the next sections, and which further shed alight on the difficulty of the problem of computing the delay Lyapunov matrix. Theyare illustrated by means of the didactic example˙ x ( t ) = 12 x ( t ) − x ( t −
1) + u ( t ) , y ( t ) = x ( t ) . (2.15) Time domain.
The function t (cid:55)→ K ( t ) is in general not analytic on (0 , ∞ ), dueto the propagation of the discontinuity at t = 0. If function K has a discontinuity inits k -th derivative ( k = 0 for a discontinuity in the function) at some time ˆ t ≥
0, thenthe function has, in the generic case, a discontinuity in its ( k + 1)-th derivative attime instants ˆ t + τ i , i = 1 , . . . , m . The increase of regularity is called the smoothingproperty of solutions [7].Via definition (1.3) the non-smoothness of K propagates to the function t ≥ (cid:55)→ P ( t ) (we restrict to non-negative t because of the so-called symmetry property P ( − t ) = P ( t ) T ). In Section 4 of [12] it has been shown that the function P is ingeneral not infinitely many times differentiable for t ∈ S, where S = { (cid:126)τ · (cid:126)z : (cid:126)z ∈ Z m , (cid:126)τ · (cid:126)z > } , where (cid:126)τ = ( τ , . . . , τ m ) and (cid:126)z = ( z , . . . , z m ). In the commensurate delay case, where (cid:126)τ = h(cid:126)n with n ∈ N m and gcd( (cid:126)n ) = 1, we have S = { kh : k = 0 , , , . . . } . In caseof non-commensurate delays, set S is dense in [0 , ∞ ). In both cases, function P iscontinuous, ˙ P are continuous on (0 , ∞ ), while ¨ P is continuous for all t ∈ (0 , ∞ ) exceptfor t = τ i , i ∈ { , . . . , m } , but still of bounded variation. For more details we referto [12]. As an illustration we plot the functions K and P , corresponding to (2.15), inFigure 2.1.In Figure 2.2 we plot for system (2.15) the normalized errorsmax t ∈ [0 , t max ] | P ( t ) − P N ( t ) | max t ∈ [0 , t max ] | P ( t ) | (2.16) t -2-1012 K ( t ) t -4-20246 P ( t ) Fig. 2.1 . Plot of functions K and P for system (2.15). The circles correspond to time-instantswhere the function is not infinitely many times differentiable. Function K (function P ) exhibits adiscontinuity in its k -th derivative ( ( k + 1) -th derivative) at t = k , for all k ∈ N . for t max = 2 and | P (0) − P N (0) || P (0) | , (2.17)as a function of N . Note that, as B = C = 1 for system (2.15), expression (2.17)corresponds to the normalized error on the squared H norm if the latter is approxi-mated by (cid:107) Υ N (cid:107) , see (2.14). We observe the following rates of convergence: O (cid:0) N − (cid:1) for (2.16), and O (cid:0) N − (cid:1) for (2.17). In all other experiments we observed the samerates of convergence.The seemingly slow convergence, O (cid:0) N − (cid:1) for the maximum error of P on acompact interval, is expected in view of the smoothness properties of function P . Aswe have seen, P has discontinuities in its second derivative at t = τ i , i = 1 , . . . , m (with ¨ P of bounded variation), while function P N , defined by (2.10), is analytic on R .Thus, we are approximating a non-smooth function by a series of smooth functions.Note that we would obtain the same rate of convergence when approximating P on an interval by a series of polynomials interpolating in a Chebyshev mesh [23,Theorem 7.2]. As P is analytic in the interval (0 , τ ) and we only consider nonnegative t , the convergence rate is better at t = 0. We refer to [26], where an extensiveargumentation for the rate O ( N − ) for the H norm approximation induced by (cid:107) Υ N (cid:107) is given. We recall that the lack of smoothness of P also affects solution schemes basedon solving the boundary value problem (1.4) directly [12]. Frequency domain.
With the choice of the Chebyshev mesh (2.2) the asymptoticconvergence of the individual eigenvalues of A N to corresponding characteristic rootsis fast. More specifically, in [2] it is proven that spectral accuracy (approximation error O ( N − N )) is obtained. An additional property of using mesh (2.2) for discretizing(1.6), observed in extensive numerical experiments, is that the eigenvalues of A N ,which have not yet converged to corresponding characteristic roots of (1.1), are veryoften located to the left of the eigenvalues that have already converged, which isimportant with respect to preservation of stability. These properties are illustrated forsystem (2.15) in Figure 2.3. Finally, since the effect of the spectral discretization canbe interpreted in terms of a rational approximation of functions λ → exp( − λτ i ) , i =1 , . . . , m around zero, see [28], convergence is almost invariably reached first for thesmallest characteristic roots in modulus if N is gradually increased. Due to the N -10 -8 -6 -4 Fig. 2.2 . Normalized error (2.16) (blue curve) and (2.17) (green curve) as a function of N for system (2.15). The dashed lines indicate the rates O (cid:0) N − (cid:1) and O (cid:0) N − (cid:1) . -100 -80 -60 -40 -20 0 (s) -200-150-100-50050100150200 ( s ) -6 -5 -4 -3 -2 -1 0 (s) -80-60-40-20020406080 ( s ) Fig. 2.3 . (left) all eigenvalues of A N , corresponding to system (2.15), for N = 30 (blackcircles). (right) Zoom of the right-part of the spectrum of A N , supplemented with the characteristicroots of the delay equation (blue stars). Its null solution is exponentially stable, with rightmostcharacteristic roots − . ± . j . characteristic shape on the spectrum of delay equation (exhibiting infinite root chainsextending in the left half plane, along which the imaginary part grows exponentially asa function of the real part, see [27] for a detailed description), the rightmost, stabilitydetermining roots, are typically among the smallest characteristic roots.With respect to the approximation of the transfer function, the following momentmatching property is proven in [17], which is in fact independent of the choice of themesh points in (2.1). Proposition 2.3.
The transfer functions (1.7) and (2.11) satisfy, d i Υ N ( s ) ds i (cid:12)(cid:12)(cid:12)(cid:12) s =0 = d i Υ( s ) ds i (cid:12)(cid:12)(cid:12)(cid:12) s =0 , i = 0 , . . . , N, (2.18) and d i Υ N ( s − ) ds i (cid:12)(cid:12)(cid:12)(cid:12) s =0 = d i Υ( s − ) ds i (cid:12)(cid:12)(cid:12)(cid:12) s =0 , i = 0 , , (2.19) -2 -1 | ( j | , | N ( i ) | -15 -10 -5 | ( i )- N ( i ) | Fig. 2.4 . (left) Modulus of the transfer function of (2.15) (blue curve) and the correspondingapproximation (2.11) for N = 5 (red curve), evaluated on the imaginary axis, i.e. for s = ıω, ω ≥ .(right) Approximation error on the imaginary axis for N = 5 (red curve), N = 30 (blue curve) and N = 100 (black curve). that is, the moments of Υ( s ) and Υ N ( s ) at zero match up to the N th moment, andthe moments at infinity match up to the first moment. By Property (2.18), which corresponds to Hermite interpolation at s = 0, theregion in the complex plane where the approximation is accurate extends from theorigin as N is increased, consistently with the convergence behavior of characteristicroot approximations sketched in the right pane of Figure 2.3 . At the same time,the asymptotic delay rate of the transfer function for ω → ∞ , which is described by CB/ω , is captured by property (2.19). Note that higher-order moments of (1.7) atinfinity are not well defined, which is related to the property that s = ∞ is an essential singularity of Υ. As a consequence, the overall approximation error is mainly due toa mismatch in the mid-frequency range. This is illustrated in Figure 2.4, where wecompare the transfer function of (2.15) and its approximation of form (2.11).The right pane in Figure 2.4 gives a complementary explanation, besides thesmoothness properties of the function t (cid:55)→ P ( t ), why the convergence of P N (0) to P (0)has exhibits a low rate of convergence O ( N − ), compared to the spectral convergenceof the eigenvalues of A N : unlike an individual pole and the H ∞ norm, the H norm is a global characteristic of the transfer function, in the sense that an accurate computationinvolves approximating the transfer function well over whole the imaginary axis. The following main the-orem reformulates expressions (2.12)-(2.13) in terms of a matrix G N similar to A − N ,giving the Lyapunov equation a favorable structure that will be exploited by thealgorithms presented in Section 3. Theorem 2.4.
Assume that A N is Hurwitz and let G N = Σ − N Π N , (2.20) here Π N = τ m τ m τ m τ m · · · · · · τ m − − − . . . . . . . . . N − − N − N ⊗ I (2.21) and Σ N = R R · · · R N I n . . . I n , (2.22) with R i = A T i (1) + m (cid:88) k =1 A k T i (cid:18) − τ k τ m + 1 (cid:19) , i = 0 , . . . , N and T i the Chebyshev polynomial of the first kind and order i, i = 0 , , , . . . . More-over, let H N = R − (cid:0) I − τ m R (cid:1) R − B τ m R − B ... (2.23) and F N = [ R R · · · R N ] . (2.24) Then we can express P N in (2.12) as P N ( t ) = F N Q N e G − TN t F TN (2.25) where Q N satisfies the Lyapunov equation G N Q N + Q N G TN + H N H TN = 0 . (2.26) Moreover, system (2.7) is equivalent to (cid:26) G N ˙ η ( t ) = η ( t ) + H N u ( t ) ,y ( t ) = CF N η ( t ) , (2.27) and we can express Υ N ( s ) = CF N ( sG N − I ) − H N . (2.28) roof . In [17, Section 3.1] it has been shown that A N = ( S N ⊗ I ) G − N ( S − N ⊗ I ) , (2.29)where matrix S N ∈ R ( N +1) × ( N +1) maps coefficients of a polynomial of degree N inthe Chebyshev basis (cid:26) T i (cid:18) tτ m + 1 (cid:19) : i = 0 , . . . , N (cid:27) (2.30)onto the corresponding coefficients in the Lagrange basis, { l N,i ( t ) : i = 1 , . . . , N + 1 } , defined on the mesh (2.2).Substituting (2.29) into (2.10) yields P N ( t ) = (cid:90) ∞ E TN ( S N ⊗ I ) e G − N s ( S − N ⊗ I ) B N B TN ( S − TN ⊗ I ) e G − TN ( s + t ) ( S TN ⊗ I ) E N ds. (2.31)In the proof of Theorem 3.2 of [17] it has been shown that( S − N ⊗ I ) B N = c N ⊗ B, E TN ( S N ⊗ I ) = TN ⊗ I, (2.32)with c N = (cid:26) N +1 [0 1 0 1 · · · T ⊗ B, N odd , N +1 [ · · · T ⊗ B, N even , and N = [1 1 · · · T . Using these expressions, as well as the identity e G − N t = G − N e G − N t G N , we can write (2.31) as P N ( t ) = (cid:90) ∞ ( TN ⊗ I ) G − N e G − N s G N ( c N ⊗ B )( c TN ⊗ B T ) G TN e G − TN ( s + t ) G − TN ( N ⊗ I ) ds. (2.33)A straightforward computation shows that( TN ⊗ I ) G − N = F N , G N ( c N ⊗ B ) = ˆ H N , with ˆ H N = R − B . As a consequence, we can write P N ( t ) = F N (cid:18)(cid:90) ∞ e G − N s ˆ H N ˆ H TN e G − TN s ds (cid:19) e G − TN t F TN . (2.34) enoting the integral in (2.34) by ˆ Q N , we can express the latter, relying on theassumption that A N and G − N are Hurwitz, as the solution of the Lyapunov equation G − N ˆ Q N + ˆ Q N G − TN + ˆ H N ˆ H TN = 0 . Pre-multiplying this equations with G N and post-multiplying with G TN yieldsˆ Q N G TN + G N ˆ Q N + G N ˆ H N ˆ H TN G TN = 0 . Since we have G N ˆ H N = H N , it follows that ˆ Q N = Q N , where Q N uniquely solves(2.26). Hence, (2.34) corresponds to (2.25) and (2.26).Finally,expression (2.28) constitutes the assertion of Theorem 3.2 of [17].Matrices Σ N and Π N have a sparse structure that can be exploited. In whatfollows a key role will be played by the following property. Proposition 2.5.
Assume that N , N ∈ N with N < N . Then the matrices Σ N , Π N , F N , H N in Theorem 2.4 are submatrices of Σ N , Π N , F N , H N .
3. A dynamic subspace method.
The price to pay for the discretization ofthe delay equation and the standard state space representation (2.7), which on theirturn led us to delay Lyapunov matrix approximations in explicit form, namely (2.12)-(2.13) and (2.25)-(2.26), is an increase of dimension from n to ( N + 1) n . At the sametime relatively of high value of N are expected for an accurate approximation, asmotivated in Section 2.2.If N n is large and matrix B N B TN , respectively H N H TN , has low rank (in the senseof r << N n ), computing a low-rank approximation of P N , respectively Q N , may bebeneficial. In this section we construct an approximation inferred from the projectionof the Lyapunov equation on a Krylov space of dimension kr . Before we presentthe construction in Sections 3.2-3.4, we use another didactic example to motivateimportant methodological choices regarding i) the relation between parameters N and k , ii) the choice of the Krylov space, and iii) the system matrix / Lyapunov equationto be projected on this space. We discuss some implementation aspects in Section3.5 and conclude with an interpretation in terms of projecting an infinite-dimensionsystem linear in Section 3.6.The main contributions are contained in Sections 3.4-3.6. The Arnoldi process ofSection 3.2 and the construction of the reduced model in Sections 3.3 extend resultspresented in [9, 17] to the multiple-input setting.Since the technical derivations involve many steps, we included Figure 3.3 at theend of the section in order to keep an overview of the main steps and correspondingnotations. We consider system ˙ x ( t ) = − . − .
03 0 . . − . − . − .
06 0 . − . x ( t ) + − . − . − . − . − . − . . . . x ( t − u ( t ) , y ( t ) = (cid:2) (cid:3) x ( t ) . (3.1)For N = 50 , ,
150 and 200 we computed matrices P N and Q N , solving Lyapunovequations (2.13) and (2.26). We display in Figure 3.1 (above) their ordered singularvalues, normalized such that the leading singular value equals to one. We also show,
50 100 150 200 250 i -15 -10 -5 i ( P N ) / ( P N ) N=150 N=200N=100N=50 0 50 100 150 200 250 i -15 -10 -5 i ( Q N ) / ( Q N ) N=200N=100 N=150N=50
50 100 150 200 250 300 N || P N || , || Q N || Fig. 3.1 . (above) Normalized eigenvalues ( λ i ( · ) denoting the i -th, eigenvalue in decreasingorder) of matrix P N and Q N , computed for system (3.1). (below) Spectral norm of P N (red) and Q N (blue) as a function of N . in the lower figure, the leading singular value of both matrices as a function of N . Thisexperiment indicates that the solution of Lyapunov equation (2.26), inferred from therepresentation (2.27), is more amendable for a low-rank approximation.Concerning the input-output behavior, Proposition 2.3 expresses that functionsΥ and Υ N match N + 1 moments at zero and two at infinity. To have these matchingmoments carried over by a projection of (2.7) on a right Krylov space, one needs ingeneral a subspace of dimension N +2. At the same time, if more than N +1 momentsat zero are preserved by the projection, or more than two at infinity, the highest ordermoments won’t match anymore with those of the original transfer function (1.7). Thiscan be interpreted as an instance of “over-fitting” in the sense that particularities ofthe discretization (2.7) are captured by the projection, which are not present in theoriginal delay equation and related to the discretization error. Similar conclusionscan be made from the experiment related to the upper right pane of Figure 3.1. Ona compact interval for index i , the eigenvalue functions of Q N uniformly converge for N → ∞ to the limit function indicated in black color, which is related to the original(non-discretized) delay equation (we come back to this in Section 3.6). Importantto observe is that, for a given value of N , less than N singular values are related tothe limit behavior. This indicates that, at least for a best rank- k approximation of Q N , the choice k > N could lead to a similar instance of over-fitting. All the aboveelments motivate us to assure N being sufficiently large, such that the dimension of eigenvalue index -15 -10 -5 ab s o l u t e e rr o r N=30 eigenvalue index -15 -10 -5 ab s o l u t e e rr o r N=60
Fig. 3.2 . Absolute error on the smallest characteristic roots of (3.1), obtained from (3.3)(blue) and (3.4) (red). In the left pane we consider N = k = 30 , in the right pane N = k = 60 . the subspace k satisfies k ≤ N (3.2)and, preferably k << N .The typical spectrum distribution of delay equations, with rightmost character-istic roots close to the origin, the properties of the spectral discretization, illustratedin Figure 2.3, and the above reasoning with respect to matching moments, suggest tobuild a Krylov space using matrix A − N , K k ( A − N , B N ) = span (cid:110) B N , A − N B N , . . . , A − ( k − N B N (cid:111) . Letting the columns of V N,k be an orthogonal basis for this Krylov space, we depictin Figure 3.2 the approximation error on the smallest characteristic roots for (3.1),obtained as the reciprocal of the eigenvalues of V TN,k A − N V N,k , (3.3)and as the eigenvalues of V TN,k A N V N,k , (3.4)for N = 30, N = 60 and in both cases k = N (such that (3.2) is taken into account).The plots illustrates a property observed in many experiments, that it is beneficial toproject matrix A − N on the Krylov space, compared to projecting A N . This observa-tion can be explained by a better separation of the targeted characteristic roots afteran inversion of the spectrum.The preference for building a Krylov space for A − N and for projecting this matrix,the property that G N is similar to A − N , and, last but not least, the typically fasterdecay of singular values of Q N than those of P N naturally lead us to the representation(2.25)-(2.28) of the discretized system and associated approximation of the delayLyapunov matrix. In addition, matrices Σ N and Π N have a sparse structure thatcan be exploited. In particular, the property expressed in Proposition 2.5 along withcondition (3.2) will allow us to ultimately arrive at a method that does not rely onan a-priori choice of critical parameter N , similar to the infinite-Arnoldi method foreigenvalue computations [9]. .2. Dynamic construction of a Krylov space. We fix integer k and as-sume N large enough such that (3.2) holds. We consider the block Krylov space K k ( G N , b ) = span { b, G N b, . . . , G k − N b } , (3.5)where b is a block vector of size ( N + 1) n × r , having the structure b = [ x T · · · T , (3.6)with x ∈ R n × r to be specified in Section 3.3. The block Arnoldi algorithm builds theKrylov sequence, block vector by block vector, where these vectors are orthogonalized.Due to the special structure of b and the fact that G N is a block Hessenberg matrix,whose blocks have size n × n , the block vectors G N b, . . . , G k − N b only have their first 2 n ,3 n, . . . , kn block rows different from zero. Moreover, in computing the matrix vectorproducts with (3.6), only sub-matrices of G N are needed. Hence, in the computationof the Krylov space, we can restrict to storing only the nonzero part of the blockvectors and using the relevant part of G N . This leads us to the following procedure.1. Apply Algorithm 1 for computing a basis of K k ( G k − , [ x T · · · T ). There weuse notation common for Arnoldi iterations: we let H i ∈ R ( i +1) r × ri denotethe constructed rectangular block Hessenberg matrix and H i ∈ R ri × ri thecorresponding i × i upper blocks.2. 
A basis for K k ( G N , [ x T · · · T ) (3.7)is spanned by the columns of V N,k = (cid:2) V Tk · · · (cid:3) T ∈ R ( N +1) n × kr , (3.8)while, due to the structure of G N , expression H k = V TN,k G N V N,k , holds, i.e., H k can be considered as an orthogonal projection of G N on a k -dimensional Krylov subspace, for any N satisfying (3.2). We now arrive atthe derivation of an approximation of Υ N ( s ), defined by (2.11) or, equivalently, (2.28),having a prescribed order kr , once again under the condition that (3.2) is satisfied. Forthis we construct the Krylov space (3.7) and project matrices F N , G N , H N , definedin Theorem 2.4, on this Krylov space. An orthogonal projection yields the followingapproximation of Υ N ( s ): Υ k ( s ) = F k ( s G k − I ) − H k , (3.9)where F k = CF N V N,k = CF k − V k , G k = V TN,k G N V N,k = H k , H k = V TN,k H N = V Tk H k − , (3.10)matrix V k and H k refer to the output of Algorithm 1 and V N,k is given by (3.8). Thematrices of the reduced model (3.9) do not depend on N . Furthermore, matrices F k lgorithm 1 A structure exploiting block Arnoldi algorithm
Require: x ∈ R n × r of full column rank, number of iterations k Let x = Q ˜ R be the reduced QR factorization of x . Set V = Q and let H be the empty matrix for i = 1 , , . . . , k do Let W i = G i (cid:20) Q i − (cid:21) Compute H i = [ V Ti W i and then ˆ W i = W i − (cid:20) V i (cid:21) H i (orthogonalization) Compute ˆ W i = Q i ˜ R i as the reduced QR factorization of ˆ W i (normalization) Let H i = (cid:20) H i − H i R i (cid:21) ∈ R ( i +1) r × ir Expand V i into V i +1 = (cid:20) V i Q i (cid:21) end forOutput: matrix V k , whose columns are an orthogonal basis for K k ( G k − , [ x T · · · T ), H k , H k , satisfying H k = V Tk G k − V k .and H k are submatrices of F k +1 and H k +1 . Therefore, they can be constructed in adynamic way when doing iterations of Algorithm 1, as is the case with the Hessenbergmatrix G k = H k .With a particular choice of the vector x in (3.7), the transfer function (3.9)satisfies the following moment matching property with the (original) transfer function(1.7) of the time-delay system (1.6). Proposition 3.1. [17, Theorem 11] Let
N, k ∈ N with N ≥ k ≥ and let theKrylov space (3.7) be constructed from x = R − B. Then transfer function (3.9) satisfies d i Υ k ( s ) ds i (cid:12)(cid:12)(cid:12)(cid:12) s =0 = d i Υ( s ) ds i (cid:12)(cid:12)(cid:12)(cid:12) s =0 , i = 0 , . . . , k − and d i Υ k ( s − ) ds i (cid:12)(cid:12)(cid:12)(cid:12) s =0 = d i Υ( s − ) ds i (cid:12)(cid:12)(cid:12)(cid:12) s =0 , i = 0 , . (3.12)Note that Proposition 3.1 concerns the matching of moments with the transferfunction of the original delay system (1.6). This is due to to the property that themoments, preserved by projection of the discretized system, are precisely matchingmoments between the discretized system and the delay system, by Proposition 2.3. The eval-uation of P N ( t ), defined by (2.25), relies on solving Lyapunov equation (2.26). Anestablished way to solve large-scale Lyapunov equations consists of computing a low-rank approximation obtained from the projection of the Lyapunov equation on aKrylov space, see, e.g., [21] and the references therein. o determine an appropriate Krylov space, it is useful to express Q N in terms ofmatrix exponentials, Q N = (cid:90) ∞ e G − N s (cid:0) G − N H N (cid:1) (cid:0) H TN G − TN (cid:1) e G − TN s ds. (3.13)Hence, a low rank approximation of Q N can be induced by approximating the actionof e G − N t on vector(s) ( G − N H N ) in a low-dimensional space. This motivates us toinclude G − N H N = R − B in the Krylov space. Furthermore, since the rightmost characteristic roots of a delayequation are typically very well approximated by the dominant eigenvalues of G N (equivalently, the smallest eigenvalues of A N in modulus), while the largest eigen-values of A N have no correspondence with characteristic roots (see the arguments inSection 2.2 and the illustration in Figure 2.3), approximating the dominant eigenspaceof G N should be favored, which brings us once again to Krylov space (3.7) with start-ing vector x = R − B .Replacing Q N in (2.26) by V N,k Q k V TN,k and requiring the residual to be orthogonalwith respect to the Krylov space, we arrive at the projected Lyapunov equation G k Q k + Q k G Tk + H k H Tk = 0 . (3.14)Hence, under assumption that G k is invertible we can approximate Q N ≈ V N,k Q k V TN,k = (cid:82) ∞ V N,k e s G − k ( G − k H k ) ( H Tk G − Tk ) e s G − Tk V TN,k ds. (3.15)Let us now compare approximation (3.15) with expression (3.13). By construction ofthe Krylov space we have G − N H N = V N,k β for some matrix β of appropriate dimensions. As a consequence, H N = G N V N,k β ⇒ H k = G k β. Thus, the approximation of Q N as in (3.15) can be interpreted in terms of the ap-proximation e tG − N ( G − N H N ) = e tG − N ( V N,k ) β ≈ V N,k e t G − k β. (3.16)Substituting the right-hand side of (3.15) into (2.25) we get P N ( t ) ≈ F N V N,k Q k V TN,k e G − TN t F TN = F N V N,k Q k (cid:16) e tG − N V N,k (cid:17) T F TN . (3.17)To approximate e tG − N V N,k we use the same principle underlying (3.16). More pre-cisely, we build a Krylov space, span (cid:8) V N,k , G N V N,k , . . . , G kN V N,k (cid:9) . Since the columns f V N,k already span a Krylov space, this can be done by doing k more iterations ofAlgorithm 1, provided condition (3.2) on N is strengthened to2 k ≤ N. (3.18)It results in a basis V N, k such that V N,k = V N, k (cid:20) I (cid:21) , hence, we can approximate (cid:16) e tG − N V N,k (cid:17) ≈ V N, k e t G − k (cid:20) I (cid:21) . 
(3.19)Finally, combining (3.17) and (3.19) we arrive at the following approximation of P N ( t ) and thus of the Lyapunov matrix P ( t ), P k ( t ) = [ R R · · · R k − ] V k Q k [ I e t G − T k V T k R T R T ... R T k − , (3.20)where Q k satisfies (3.14). This brings us to Algorithm 2. Algorithm 2
Construction of a (uniformly) low-rank approximation of the the delayLyapunov matrix
Require: B ∈ R n × r of full column rank, parameter k determining number of Arnoldiiterations Set x = R − B and perform 2 k iterations of Algorithm 1, resultingin V k and G k = H k ; set G k = (cid:2) I kr (cid:3) G k (cid:20) I kr (cid:21) . Construct matrices H k = V Tk H k − and L k = [ R R · · · R k − ] V k . Solve Lyapunov equation (3.14) for Q k . Output: matrices L k , Q k , G k from which P k can be constructedaccording to (3.20).Finally we note that the low-order approximation (3.9) of transfer function Υ andthe approximation (3.20) of Lyapunov matrix P ( t ) of rank smaller or equal to kr arestill consistent, in view of Proposition 1.9 and Proposition 2.2. Proposition 3.2.
We can express (cid:107) Υ k (cid:107) = Tr (cid:0) C P k (0) C T (cid:1) Proof . From (3.20) we directly haveTr (cid:0) C P k (0) C T (cid:1) = Tr (cid:16) CF N V N,k Q k V TN,k F TN C T (cid:17) = Tr (cid:0) F k Q k F Tk (cid:1) The latter expression, combined with (3.14), characterize the H norm of Υ k . Algorithm 2is fully dynamic, in the sense that by increasing iteration count k , matrices V k , G k , L k , etc., only need to be extended or updated, hence, the iteration can be resumed f the accuracy is deemed insufficient. If k is not chosen a-priori, this brings us todiscuss stopping criteria. The most reliable approach consists of testing the residualfor boundary value problem (1.4) at a set of time-instants in the interval under consid-eration. Substituting (3.20) in (1.4) and letting the columns of W k be on orthogonalbasis for the column space of [ L k A L k · · · A m L k ] , every term in the equations hasits column, respectively row range contained in those of W k , respectively W Tk . As aconsequence the Euclidean norm of the residual at a given time-instant can be ex-pressed in terms of the residual for a boundary value problem where the size of thematrices is determined by the rank of W k .The construction of matrix W k however introduces a significant additional com-putational cost. To our experience a good indicator of convergence consists of deter-mining the residual for Lyapunov equation (2.26). Recall that G N Q N is approximatedby G N V N,k Q k V TN,k = V N,k +1 H k [ Q k V TN,k +1 . At the same time we have H N = V N,k V TN,k H N = V N,k H k = V N,k +1 (cid:20) H k (cid:21) . Since the columns of V N,k +1 are orthogonal, the residual of (2.26), R N,k satisfies (cid:107) R N,k (cid:107) = (cid:13)(cid:13)(cid:13)(cid:13) H k [ Q k
0] + (cid:20) Q Tk (cid:21) H Tk + (cid:20) H k (cid:21) (cid:2) H Tk (cid:3)(cid:13)(cid:13)(cid:13)(cid:13) . (3.21) Note that the residual norm can be expressed in terms of projected matrices and isindependent of N .What concerns the computational complexity, the core of Algorithm 2 consists ofdoing 2 k iterations of Algorithm 1. Expressed in terms of operations on vectors oflength n , the computational complexity is as follows:number of backward solves: 2 rk ,number of matrix vector products: O ( rk ),number of scalar products (orthogonalization): O ( r k ).It is important to point out that all backwards solves are with the same matrix ( R ),inherent to an Arnoldi type algorithm. Hence, the first step in our implementationconsists of computing a (sparse) LU factorization of matrix R = (cid:80) mi =0 A i . For theremaining steps of Algoirthm 2 the dominant cost in most cases consists of solvingLyapunov equation (3.14) for Q k , whose complexity is described by O ( r k ) oper-ations for the adopted Bartels-Stewart algorithm. In addition, our implementationfully exploits the property that, due to the special structure of G k and the startingvector of the Arnoldi iteration, V k can be represented in the form V k = ( I k ⊗ W k ) v , v , · · · v ,k v , . . . v ,k ... . . . . . . ...0 · · · v k,k , (3.22)where both factors are orthogonal matrices, matrix W k has dimensions n × s with s ≤ kr and v i,j ∈ R s × r , i, j = 1 , . . . , k . Furthermore, both factors can be dynamicallyconstructed. These properties are fundamental in the so-called tensor infinite Arnoldimethod and CORK framework (COmpact Rational Krylov algorithms) for nonlinear igenvalue problems [25, 10], on their turn generalizing [1] for quadratic eigenvalueproblems. We refer to these references for more details on representation (3.22). Ob-viously, for large n its use leads to a significant reduction in the memory requirements,but it is also beneficial in terms of computational complexity, as argued in [10]. The spectral discretization in Section 2, resulting in a finite-dimensionalapproximation of dimension ( N + 1) n , played a major role in the technical derivationof Algorithm 2. However, eventually the role of parameter N is marginal: • the execution of Algorithm 2 (and Algorithm 1 on which it relies), as well asthe discussed stopping criteria, do not rely on a choice of N ; • the algorithms are dynamic in the sense that the iterative processes can alwaysbe resumed; • Proposition 3.1 connects moments of transfer functions Υ k and Υ directly.As a matter of fact it is only implicitly assumed that N is sufficiently large (suchthat (3.18) holds). A limit argument, for N → ∞ , provides some intuition for theexistence of an interpretation of Algorithm 2 as an algorithm acting on an infinite-dimensional linear system equivalent to (1.6). This is also suggested by Figure 3.1,where the singular value functions of Q N uniformly converge on compact intervalsto the limit function displayed in black color. In what follows we make a connectionwith an infinite-dimensional linear system concrete.We reconsider system (1.6) and define v ( θ, t ) = x ( t + θ ) , θ ∈ [ − τ m , , t ≥ . Solutions of (1.6), starting at t = 0, are continuous for t ≥
0, and they satisfy theadvection PDE (cid:26) ∂v∂t ( θ, t ) − ∂v∂θ ( θ, t ) = 0 , θ ∈ [ − τ m , , t ≥ τ m , ∂v∂t (0 , t ) = A v (0 , t ) + (cid:80) mi =1 A i v ( − τ i , t ) + Bu ( t ) , t ≥ τ m , (3.23)see [15]. Let us represent v ( θ, t ) in a Chebyshev series in variable θ on the interval[ − τ m , v ( θ, t ) = (cid:80) ∞ j =0 c j ( t ) T j (cid:16) θτ m + 1 (cid:17) , θ ∈ [ − τ m , . The second equation in (3.23) then becomes (cid:80) ∞ j =0 ˙ c j ( t ) = A (cid:16)(cid:80) ∞ j =0 c j ( t ) (cid:17) + (cid:80) mi =1 A i (cid:16)(cid:80) ∞ j =0 c j ( t ) T j (cid:16) − τ i τ m + 1 (cid:17)(cid:17) = (cid:80) ∞ j =0 c j ( t ) (cid:16) A + (cid:80) mi =1 A i T j (cid:16) − τ i τ m + 1 (cid:17)(cid:17) . (3.24)In the same way the first equation in (3.23) becomes ∞ (cid:88) j =0 ˙ c j ( t ) T j (cid:18) θτ m + 1 (cid:19) = ∞ (cid:88) j =1 c j ( t ) 2 jτ m U j − (cid:18) θτ m + 1 (cid:19) , (3.25)where we employed the property˙ T j +1 ( θ ) = ( j + 1) U j ( θ ) , with U j the Chebyshev polynomial of the second kind and order j , for j ≥ or j ≥ T j (cid:18) θτ m + 1 (cid:19) = 12 (cid:18) U j (cid:18) θτ m + 1 (cid:19) − U j − (cid:18) θτ m + 1 (cid:19)(cid:19) in (3.25), as well as T (cid:18) θτ m + 1 (cid:19) = 12 U (cid:18) θτ m + 1 (cid:19) , T (cid:18) θτ m + 1 (cid:19) = U (cid:18) θτ m + 1 (cid:19) . Multiplying subsequently left and right hand side of (3.25) with U i − (cid:18) θτ m + 1 (cid:19) (cid:115) − (cid:18) θτ m + 1 (cid:19) , taking the integral in θ from − τ m to zero, and considering the orthogonality propertiesof Chebyshev polynomials of the second kind, we arrive at˙ c ( t ) − ˙ c ( t ) = τ m c , ( ˙ c i − ( t ) − ˙ c i +1 ( t )) = iτ m c i , i ≥ . (3.26)Letting c = (cid:2) c T c T · · · (cid:3) T , e = [1 0 · · · ] T and = [1 1 · · · ] T , differential equations(3.25) and (3.26) can be written as (cid:26) Π ∞ ˙ c ( t ) = Σ ∞ c ( t ) + ( e ⊗ B ) u ( t ) ,y ( t ) = (cid:0) T ⊗ C (cid:1) c ( t ) , (3.27)with Π ∞ = τ m τ m τ m τ m · · · · · · · · · − − − . . . . . . . . . ⊗ I (3.28) and Σ ∞ = R R · · · I n . . . . (3.29)System (3.27)-(3.29) can be interpreted as alternative representation of (3.23),and of the original delay equation (1.6). At the same time, system (2.27), obtainedafter a spectral discretization and at the basis of the approach spelled out in theprevious sections, is equivalent to (cid:26) Π N ˙ c N ( t ) = Σ N c N ( t ) + ( N ⊗ B ) u ( t ) ,y ( t ) = ( TN ⊗ C ) x ( t ) (3.30)since G N Σ − N ( N ⊗ B ) = H N , ( TN ⊗ C ) G − N = F N . System (3.30) can be obtained from (3.27) by truncating the state to the first N + 1 block components (or, equivalently applying a Galerkin projection on the range ystem Delay Lyapunov matrix(approximation) Variable of standardLyapunov matrixequation A , . . . , A m , B , Cτ , . . . , τ m P ( t ) A N , B N , C N ∼ CF N , G N , H N dim = ( N + 1) n P N ( t ) P N Q N F k , G k , H k dim = kr P k ( t ) rank ≤ kr Q k dim = kr Spectral DiscretizationProjection on Krylov Space (2 k ≤ N ) ∼ Π ∞ ,Σ ∞ ,( e ⊗ B ), ( T ⊗ C )Projection Fig. 3.3 . Overview of different steps in the derivations and corresponding notations. If onlythe H norm needs to be approximated, a Kyrlov space of dimension k such that k ≤ N is sufficient.With the relation between k and N satisfied, the delay Lyapunov matrix and H norm approxima-tions, obtained after projection of the discretized system, do not depend on the value of N , only on k . This leads to the interpretation spelled out in Section 3.6 and illustrated with the curves arrows. 
of (cid:0) [ I n ( N +1) · · · T (cid:1) . Algorithms 1-2 only rely on the use of submatrices of Σ N , Π N ,at top-left position (recall definition (2.20) of G N ), which are on their turn “subma-trices” of Π ∞ and Σ ∞ . Therefore, these algorithms can be interpreted as applied toinfinite-dimensional system (3.27) directly. We note that, for case of approximatingcharacteristic roots by the reciprocal of eigenvalues of G k , a related interpretation ofAlgorithm 1 is given in [9], in terms of an operator eigenvalue problem.Finally, an overview of the developments throughout the Sections 2-3 is given byFigure 3.3
4. Experiments.
We first consider the model for a heat exchanger describedin [27], for which the controller (based on a combination of static state feedback andproportional integral (PI) control) has been determined by optimizing the spectralabscissa using the method of [16]. The closed-loop system is described by a delayequation of form (1.6) with n = 5 state variables and m = 7 delays. The non-zeroelements of matrices A i , i = 0 , . . . ,
7, are specified as in the following table, A (2 ,
1) : , (2 ,
2) : − , (3 ,
3) : − (5 ,
4) : − A (4 ,
3) : 0 . A (1 ,
1) : − . A (4 ,
4) : − . A (2 ,
4) : A (1 ,
1) : − . , (1 ,
2) : − . , (1 ,
3) : − . ,
4) : − . , (1 ,
5) : 0 . A (3 ,
2) : 0 . A (1 ,
2) : 0 . , (4.1) hile input matrix B , output matrix C and the delay values are given by B = . , C = I, τ τ τ τ τ τ τ = . . . . . (4.2)In Figure 4.1 we plot the normalized error on the Lyapunov matrix,max t ∈ [0 , t max ] | P ( t ) − P k ( t ) | max t ∈ [0 , t max ] | P ( t ) | , (4.3)for t max = 50 as a function of k , computed using Algorithm 2. We also show thenormalized error on the H norm, |(cid:107) Υ (cid:107) − (cid:107) Υ k (cid:107) |(cid:107) Υ (cid:107) . (4.4)Finally, the evolution of selected elements of the Lyapunov matrix P ( t ) is shown inFigure 4.2.Even though the dimension n is small, the advantage of using a projection methodis significant. To illustrate this, when choosing k = 100 the application of Algorithm 2involves the solution of a matrix Lyapunov equation of dimension 100 × H norm approximation smaller than 2 10 − , see Figure 4.1. Atthe same time, when discritizing the delay equation into an ordinary equation as inSection 2, with N = 19, and computing an H norm approximation via (2.14), onealso has to solve a Lyapunov equation of size 100 × − . The underlying reason is that the former approach can be interpreted in termsof a much more accurate discretization with N >
99 points, followed by 100 steps ofan Arnoldi iteration (see Figure 3.3).For the second and third example we consider models described by partial differ-ential equations (PDE) ∂v ( x, t ) ∂t = ∂ v ( x, t ) ∂x − x v ( x, t −
1) (4.5)and ∂v ( x, t ) ∂t = ∂ v ( x, t ) ∂x − x ) v ( x, t ) + 2 sin( x ) v ( π − x, t − , (4.6)with in both cases v (0 , t ) = v ( π, t ) = 0. The equations, which are variants of examplesin [3], can be interpreted as heat equations describing in the temperature in a rod,controlled with distributed delayed feedback. In (4.5) the feedback is proportionaland localized, in (4.6) it of Pyragas type and non-localized. We discretize differentialequations (4.5)-(4.6) in space using central differences. For (4.6), for instance, thisresulting a systems of the form (1.1) with matrices A = (cid:18) n − π (cid:19) − − − − − k -10 -8 -6 -4 R e l a t i v e e rr o r ~N -3 ~N -2 Fig. 4.1 . Normalized errors (4.3) with t max = 50 (blue curve) and (4.4) (green curve) as afunction of k for system (1.6) with matrices and delays (4.1)-(4.2). The dashed lines indicate therates O (cid:0) k − (cid:1) and O (cid:0) k − (cid:1) . t -0.02-0.0100.010.020.03 P ( t ) P P P Fig. 4.2 . Some elements of P ( t ) as a function of t for system (1.6) with matrices and delays(4.1)-(4.2). and A = 2∆ − . Here A , A ∈ R n × n , and ∆ is a diagonal matrix containing theelements of the vector (cid:16) , sin (cid:16) n − π (cid:17) , · · · , sin (cid:16) n − n − π (cid:17) , (cid:17) on its diagonal, while ∆ − is the anti-diagonal vector based on the same vector. For both (4.5) and (4.6) we wetake n = 10000 and output matrix C = (1 , , . . . , / (cid:107) (1 , . . . , (cid:107) , i.e., the output isthe average temperature of the rod. We further assume B = C T .In Figure 4.3 we display the normalized error (4.3) on the Lyapunov matrix forthe interval [0 , t max ] = [0 , , as well as the normalized error on the associated H k -6 -5 -4 -3 -2 -1 R e l a t i v e e rr o r H normLyapunov matrix Fig. 4.3 . Normalized errors (4.3), with t max = 3 , and (4.4) as a function of k , with matricesobtained form the spatial discretization of (4.5) (blue curves) and (4.6) (green curves), such that n = 10000 . norm approximation, as a function of k . To shed a light on the computation time,for system (4.6) and k = 100 the computation time for the delay Lyapunov matrix,respectively H norm , was 42 seconds, respectively 4 . kr >> n, which is natural if n is small as for the first presented example, indicate an asymptoticrate of O ( k ), respectively O ( k ) for the H norm, respectively the delay Lyapunovmatrix approximation. These rates are similar to those obtained by the spectraldiscretization in Section 2 (as a function on N ), hence, the projection step does notresult in a slowing down of the asymptotic convergence rate (recall the arguments inSection 2.2 where the rates are, among others, related to the lack of smoothness of P ( · )), even though it is highly advantageous from the point of view of computationalcomplexity. Some intuition behind this observation is given by Theorems 2.3 and3.1: by construction precisely the matching moment between Υ and Υ N carry overto the projected transfer function Υ k . In experiments with very large n , we have kr << n for a realistic range for k values as in the second and third example, and theobserved decay rate is slower, which is illustrated by a comparison between Figure 4.3and Figure 4.1. A possible explanation is that unlike the previous case a low-rankapproximation of Lyapunov matrix P ( t ) ∈ R n × n is enforced by construction. 
As can be seen from (3.20), only V_k needs to be available to evaluate P_k(t) at t = 0.

Inherent to the projection approach, the efficiency of the computational method depends on whether or not accurate low-rank approximations exist, whose determining factors are not well understood, and on the projected system matrix G_k being stable, i.e., on the projection preserving stability (this is the case for most problems, and it was an important consideration in the methodological choices, but not always: a counterexample is the second example in [11] for n = 1023, where spurious roots are observed in the right half plane). The latter is not necessarily a strong limitation for the H_2 norm computation, since the L_2 norm of the low-order, projected transfer function Γ_k can still be computed using techniques other than solving the Lyapunov equation directly. All these issues, and related fixes, are subjects for further investigation.
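As a concrete illustration of the last two points, the sketch below is our own; the names Bk and Ck for the projected input and output matrices are assumptions (only G_k appears in the text). It first certifies stability from the spectrum of G_k and then evaluates the H_2 norm of the projected, delay-free system via the standard trace formula, i.e., by solving a small standard Lyapunov equation.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def h2_norm_projected(Gk, Bk, Ck):
    # Stability certificate: all eigenvalues of the projected system matrix
    # must lie in the open left half plane for the H2 norm to be defined.
    if np.linalg.eigvals(Gk).real.max() >= 0.0:
        raise ValueError("Gk is unstable; fall back to an L2 computation "
                         "of the projected transfer function.")
    # scipy convention: solve_continuous_lyapunov(A, Q) solves A X + X A^H = Q,
    # so passing Q = -Bk Bk^T yields the controllability Gramian P with
    # Gk P + P Gk^T + Bk Bk^T = 0.
    P = solve_continuous_lyapunov(Gk, -Bk @ Bk.T)
    # ||Gamma_k||_{H2}^2 = trace(Ck P Ck^T)
    return float(np.sqrt(np.trace(Ck @ P @ Ck.T)))
```

When the stability check fails, the L_2 norm of Γ_k can still be evaluated, for instance by quadrature of ‖Γ_k(iω)‖_F^2 along the imaginary axis, which is one instance of the alternative techniques mentioned above.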
5. Concluding remarks.
A novel algorithm for computing delay Lyapunov matrices and H_2 norms has been presented, which is the first algorithm that is generally applicable to linear time-delay systems with multiple delays and at the same time exhibits favorable scaling properties with respect to the dimension n (the examples with n = 10000 in Section 4 indicate the potential of the approach). Furthermore, the algorithm is dynamic in nature, in the sense that the computations can be resumed if the accuracy is judged insufficient. The algorithm results in approximations of the delay Lyapunov matrix in an explicit form, given by (3.20).

Computing delay Lyapunov matrices raises many challenges and complications compared to solving classical Lyapunov matrix equations (making the leap from an algebraic equation to the matrix-valued boundary value problem (1.4), with a non-smooth solution). At the same time the research is still in an initial phase: to the best of our knowledge, only two methods applicable to large problems are available at the moment, the presented one and the one of [11], which are fundamentally different. Therefore we hope that the methodology, results and observations trigger further research on the topic.

Finally, we come back to the assumption of exponential stability of (1.6). It implies that computing the Lyapunov matrix with the presented method (when the matrix is alternatively defined as the solution of (1.4) and not via the fundamental solution) is not useful in the context of verifying recent stability conditions, which are expressed precisely in terms of the delay Lyapunov matrix (see, e.g., [4] and the references therein). Yet, the overall algorithm starts with iterations of Algorithm 1, which corresponds to the Infinite Arnoldi algorithm [9] for eigenvalue computations and which does not require an exponentially stable system. Consequently, from the output of the first step, more precisely from the spectrum of G_k, we directly obtain a certificate of whether or not the system is exponentially stable.

Acknowledgements.
The first author thanks V. L. Kharitonov for an invitation to give a talk in a session on Lyapunov matrices at the 14th IFAC Workshop on Time-Delay Systems, which was the starting point of this work. The research was supported by the project C14/17/072 of the KU Leuven Research Council, by the project G0A5317N of the Research Foundation - Flanders (FWO - Vlaanderen), and by the project UCoCoS, funded by the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 675080.
REFERENCES

[1] Z. Bai and Y. Su. SOAR: A second-order Arnoldi method for the solution of the quadratic eigenvalue problem. SIAM Journal on Matrix Analysis and Applications, 26(3):640–659, 2005.
[2] D. Breda, S. Maset, and R. Vermiglio. Pseudospectral differencing methods for characteristic roots of delay differential equations. SIAM Journal on Scientific Computing, 27(2):482–495, 2005.
[3] D. Breda, S. Maset, and R. Vermiglio. Numerical approximation of characteristic values of partial retarded functional differential equations. Numerische Mathematik, 113(2):181–242, 2009.
[4] S. Cuvas and S. Mondié. Necessary stability conditions for delay systems with multiple pointwise and distributed delays. IEEE Transactions on Automatic Control, 61(7):1987–1994, 2016.
[5] V. Druskin, L. Knizhnerman, and V. Simoncini. Analysis of the rational Krylov subspace and ADI methods for solving the Lyapunov equation. SIAM Journal on Numerical Analysis, 49:1875–1898, 2011.
[6] M. A. Gomez, A. Egorov, S. Mondié, and W. Michiels. Optimization of the H_2 norm for time-delay systems, with application to control design and model approximation. IEEE Transactions on Automatic Control, 2018. In press.
[7] J. K. Hale and S. M. Verduyn Lunel. Introduction to functional differential equations, volume 99 of Applied Mathematical Sciences. Springer-Verlag, New York, 1993.
[8] E. Huesca, S. Mondié, and O. Santos. Polynomial approximations of the Lyapunov matrix of a class of time delay systems. In Proceedings of the 8th IFAC Workshop on Time-Delay Systems, pages 261–266, 2009.
[9] E. Jarlebring, K. Meerbergen, and W. Michiels. A Krylov method for the delay eigenvalue problem. SIAM Journal on Scientific Computing, 32(6):3278–3300, 2010.
[10] E. Jarlebring, G. Mele, and O. Runborg. The waveguide eigenvalue problem and the tensor infinite Arnoldi method. SIAM Journal on Scientific Computing, 39:A1062–A1088, 2017.
[11] E. Jarlebring and F. Poloni. Iterative methods for the delay Lyapunov equation with T-Sylvester preconditioning. Applied Numerical Mathematics, 2018. In press.
[12] E. Jarlebring, J. Vanbiervliet, and W. Michiels. Characterizing and computing the H_2 norm of time-delay systems by solving the delay Lyapunov equation. IEEE Transactions on Automatic Control, 56(4):814–825, 2011.
[13] V. L. Kharitonov. Time-delay systems. Lyapunov functionals and matrices. Birkhäuser, 2013.
[14] V. L. Kharitonov and E. Plischke. Lyapunov matrices for time-delay systems. Systems and Control Letters, 55(9):697–706, 2006.
[15] M. Krstic. Delay compensation for nonlinear, adaptive and PDE systems. Birkhäuser, 2007.
[16] W. Michiels. Spectrum based stability analysis and stabilization of systems described by delay differential algebraic equations. IET Control Theory and Applications, 5(16):1829–1842, 2011.
[17] W. Michiels, E. Jarlebring, and K. Meerbergen. Krylov based model order reduction of time-delay systems. SIAM Journal on Matrix Analysis and Applications, 32(4):1399–1421, 2011.
[18] W. Michiels and S.-I. Niculescu. Stability, Control, and Computation for Time-Delay Systems. An Eigenvalue Based Approach. SIAM, 2nd edition, 2014.
[19] S.-I. Niculescu. Delay effects on stability. A robust control approach, volume 269 of Lecture Notes in Control and Information Sciences. Springer-Verlag, 2001.
[20] J. Peeters and W. Michiels. Computing the H_2 norm of large-scale time-delay systems. In Proceedings of the IFAC Joint Conference, pages 1–6, Grenoble, France, 2012.
[21] V. Simoncini. A new iterative method for solving large-scale Lyapunov matrix equations. SIAM Journal on Scientific Computing, 29(3):1268–1288, 2007.
[22] V. Simoncini. Computational methods for linear matrix equations. SIAM Review, 58(3):377–441, 2016.
[23] L. N. Trefethen. Approximation theory and approximation practice. SIAM, 2013.
[24] L. N. Trefethen. Spectral methods in MATLAB, volume 10 of Software, Environments, and Tools. SIAM, 2000.
[25] R. Van Beeumen, K. Meerbergen, and W. Michiels. Computing a partial Schur factorization of nonlinear eigenvalue problems using the infinite Arnoldi method. SIAM Journal on Matrix Analysis and Applications, 36(2):820–838, 2015.
[26] J. Vanbiervliet, W. Michiels, and E. Jarlebring. Using spectral discretization for the optimal H_2 design of time-delay systems. International Journal of Control, 84(2):228–241, 2011.
[27] T. Vyhlídal and P. Zítek. Mapping based algorithm for large-scale computation of quasi-polynomial zeros. IEEE Transactions on Automatic Control, 54(1):171–177, 2009.
[28] Z. Wu and W. Michiels. Reliably computing all characteristic roots of delay differential equations in a given right half plane.