Transfer function interpolation remainder formula of rational Krylov subspace methods
TRANSFER FUNCTION INTERPOLATION REMAINDER FORMULA OF RATIONAL KRYLOV SUBSPACE METHODS∗

YIDING LIN†

Abstract.
Rational Krylov subspace projection methods are among the most successful methods in MOR, mainly because derivatives of the approximate and original transfer functions coincide up to some order at the interpolation points. This is the well-known moments matching result. However, little is known about the behavior at points far from the interpolation points. In this paper, we obtain an explicit expression of the error which involves the shifts and the Ritz values. The advantage of our result over the known moments matching theory is, to some extent, similar to that of the Lagrange type remainder formula over the Peano type remainder formula in Taylor's theorem. Besides the proof, we also provide three explanations of the error formula. One explanation shows that, in the Gauss-Christoffel quadrature sense, the error is the Gauss quadrature remainder when the Gauss quadrature formula is applied to the resolvent function. By using the error formula, we propose some greedy algorithms for the interpolatory $H_\infty$ norm MOR.

Key words. transfer function, rational Krylov subspace, Ritz value, $H_\infty$ model order reduction

AMS subject classifications.
1. Introduction.
We are interested in the model error of model order reduction (MOR) [61] by rational Krylov subspace projection methods. The original single-input-single-output (SISO) dynamical system is

  $\frac{dx(t)}{dt} = Ax(t) + bu(t)$, $y(t) = c^H x(t)$,

with $A \in \mathbb{C}^{n\times n}$ and $b, c \in \mathbb{C}^{n\times 1}$. With the rational Krylov subspaces $\mathrm{span}(V)$ and $\mathrm{span}(W)$, we obtain the reduced system

  $W^H V \frac{d\tilde{x}(t)}{dt} = W^H A V \tilde{x}(t) + W^H b\, u(t)$, $\tilde{y}(t) = c^H V \tilde{x}(t)$.

Their transfer functions are

  $h(z) = c^H (zI - A)^{-1} b$, $\tilde{h}(z) = c^H V (z W^H V - W^H A V)^{-1} W^H b$.

The model error can be measured by some norm of $e(z) := h(z) - \tilde{h}(z)$ [10, Section 7.2.3]. The well-known moments matching result says $h^{(k)}(z_0) = \tilde{h}^{(k)}(z_0)$ for $k$ up to some order. This implies $e(z) = o((z - z_0)^k)$ near $z_0$, which is essentially the Peano type remainder in Taylor's theorem. When we do analysis on a wide region $Z (\ni z_0)$, this Peano type remainder is not satisfactory. We are curious about the behavior of $e(z)$ when $z$ is far away from $z_0$. In a word, this paper answers the following question: since $\tilde{h}(z)$ is the interpolating function of $h(z)$, what is the interpolation remainder?

The Krylov subspace projection method is one of the mainstream methods in MOR. There are plenty of references related to this topic. We mainly review the references which are closely related to the (tangential) interpolation property. This type of method was first set up by Skelton et al. [17, 67, 68]. Grimme combined it with the rational Krylov subspaces [33]. Gallivan, Vandendorpe and Van Dooren have a series of works here [27-29]. A recent book [3] by Antoulas, Beattie and Gugercin summarizes the latest theories and algorithms. Other material can be found in [1, 4, 7, 11, 34, 45, 61] and references therein.

∗ Version of January 20, 2021.
† School of Economic Mathematics, Southwestern University of Finance and Economics, Chengdu, China (
[email protected]). The work is supported by the National Natural Science Foundation of China (NSFC-11526166).
In 1997, Grimme [33] already noticed that the model error can be expressed as $e(z) = r_c^H (zI - A)^{-1} r_b$. This fact is used in later references (e.g. [5, 6, 22, 23, 55]). Our work begins by expanding $r_b$ and $r_c$ with respect to the Ritz values and the shifts. Expressions of the residual $r$ of rational Krylov subspace methods are provided by [9, 19, 36, 41, 64]. The known expressions of the residual are of Galerkin type (one-sided projection). When we generalize them for dealing with MOR, we replace the Galerkin projection by the Petrov-Galerkin (two-sided) projection, which is the common manner in MOR. Thus, we obtain an expression of $e(z)$.

Note that the Ritz values are the quadrature nodes of a Gauss-Christoffel quadrature. This motivates our work on finding the relation between the error formula and the Gauss quadrature. The book [46, Chapter 3] by Liesen and Strakoš reveals the related theory for Hermitian $A$ and $b = c$. After some derivations, we observe that the error formula is actually the Gauss quadrature remainder, when the Gauss quadrature formula is applied to the resolvent function $H(\lambda, z) = 1/(z - \lambda)$.

With the information from the previous proofs, we also notice that the error formula is actually the interpolation remainder when the Hermite formula is applied to the resolvent function with respect to the variables $z$ and $\lambda$. Thus, we obtain the second and third explanations. Because of these explanations, we also call $e(z)$ the transfer function interpolation remainder.

How the error formula relates to the other parts of MOR deserves further research. One of the most interesting problems is the interpolatory $H_\infty$ norm MOR. By using the remainder formula, we can transform the $H_\infty$ norm MOR into the approximation of operator functions. The latter topic is already well established by Güttel [35-37]. His approach is based on an estimation of the Walsh-Hermite formula.
When our problem involves the resolvent function $H(\lambda, z) = 1/(z - \lambda)$, it is quite special, since the Cauchy integral in the Walsh-Hermite formula can be computed explicitly. After looking more closely at the error formula, we get some approximations which can be computed from reduced problems. Then, our greedy algorithms are designed and compared with other MOR algorithms. Numerical experiments show that their errors in the $H_\infty$ norm behave similarly to that of IRKA (which computes an $H_2$ norm MOR), but our algorithms need much less CPU time.

The paper is structured as follows. In Section 2, we present our expression of the error $e(z)$ with one proof and another three explanations. After noticing that the remainder of the $l$-th order two-sided projection MOR has a form similar to that of the $2l$-th order one-sided projection MOR, we also provide the remainder formula of the $2l$-th order one-sided projection MOR in Section 3. In Section 4, we discuss the interpolatory $H_\infty$ norm MOR and give some approximations of $e(z)$. In Section 5, we propose some greedy algorithms by modifying [19]. In Section 6, we do numerical testing on benchmark problems. We list our main contributions in Section 7.

Notation: The standard Krylov subspace is denoted by

  $\mathrm{Krylov}(A, b, l) := \mathrm{span}(b, Ab, \ldots, A^{l-1} b)$.  (1.1)

The shift sets for the left and right rational Krylov subspaces are respectively denoted by $T = \{t_j\}_{j=1}^{k_c}$ and $S = \{s_j\}_{j=1}^{k_b}$. Hence, the rational Krylov subspaces are written as

  $\mathrm{RK}(A, b, S, k_b) := \mathrm{span}\{(A - s_1 I)^{-1} b,\ (A - s_2 I)^{-1}(A - s_1 I)^{-1} b,\ \ldots,\ \prod_{j=1}^{k_b} (A - s_j I)^{-1} b\}$,
  $\mathrm{RK}(A^H, c, T, k_c) := \mathrm{span}\{(A - t_1 I)^{-H} c,\ (A - t_2 I)^{-H}(A - t_1 I)^{-H} c,\ \ldots,\ \prod_{j=1}^{k_c} (A - t_j I)^{-H} c\}$.  (1.2)

$P_m(\lambda)$ denotes the set of polynomials of degree less than or equal to $m$. Given $\varphi(\lambda)$, the symbol $P_m(\lambda)/\varphi(\lambda)$ denotes the corresponding set of rational functions; the numerator of an element is a polynomial of degree less than or equal to $m$. The symbol $Q_{l-1,l}(\lambda)$ also denotes a set of rational functions, whose elements have numerators of degree $l-1$ and denominators of degree $l$. Write $\iota = \sqrt{-1}$. Let $\mathrm{eig}(A) = \{\lambda_i(A)\}_{i=1}^n$ denote the set of $A$'s eigenvalues. The field of values of $A$ is defined as $W(A) = \{x^H A x : x \in \mathbb{C}^n,\ x^H x = 1\}$. The uniform norm of a function on a set $\Sigma$ is defined by $\|f\|_\Sigma := \sup_{\lambda \in \Sigma} |f(\lambda)|$ [16]. The operation $\mathrm{orth}(V)$ returns an orthonormal basis matrix of $\mathrm{span}(V)$. Unless otherwise specified, the norm $\|\cdot\|$ is an abbreviation of $\|\cdot\|_2$. Whenever possible, Matlab notation will be used.
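As an illustration of the product-type basis in (1.2), the sketch below (with arbitrary stand-in data; the matrix sizes and the shift values in `S` are placeholders) generates the basis by successive shifted solves. For distinct shifts, the same subspace is spanned by the partial-fraction vectors $(A - s_j I)^{-1} b$, which a rank check confirms:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 7, 3
I = np.eye(n)
A = rng.standard_normal((n, n))
b = rng.standard_normal((n, 1))
S = [0.9, -1.3, 2.2]  # distinct shifts (placeholder values)

# Product-type basis of RK(A, b, S, k) as in (1.2):
# v_j = (A - s_j I)^{-1} ... (A - s_1 I)^{-1} b
cols, v = [], b
for s in S:
    v = np.linalg.solve(A - s * I, v)
    cols.append(v)
Vprod = np.hstack(cols)

# Partial-fraction basis [(A - s_1 I)^{-1} b, ..., (A - s_k I)^{-1} b];
# for distinct shifts it spans the same k-dimensional subspace.
Vpf = np.hstack([np.linalg.solve(A - s * I, b) for s in S])
print(np.linalg.matrix_rank(np.hstack([Vprod, Vpf])))  # rank k, not 2k
```

The stacked matrix has 2k columns but rank k, so the two bases span the same subspace.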
2. The error formula by two-sided projection.
We first define the combined Krylov (CK) subspaces:

  $\mathrm{CK}(A, b, k_b, m_b) := \mathrm{RK}(A, b, S, k_b) \cup \mathrm{Krylov}(A, b, m_b)$,
  $\mathrm{CK}(A^H, c, k_c, m_c) := \mathrm{RK}(A^H, c, T, k_c) \cup \mathrm{Krylov}(A^H, c, m_c)$,  (2.1)

where the (rational) Krylov subspaces are defined by (1.1) and (1.2). With the notations $l := k_b + m_b = k_c + m_c$ and

  $\varphi(z) := \prod_{j=1}^{k_b} (z - s_j)$, $\psi(z) := \prod_{j=1}^{k_c} (z - t_j)$,  (2.2)

we observe that (cf. Section 2.2.1)

  $\mathrm{CK}(A, b, k_b, m_b) = \mathrm{Krylov}(A, \varphi(A)^{-1} b, l)$, $\mathrm{CK}(A^H, c, k_c, m_c) = \mathrm{Krylov}(A^H, \psi(A)^{-H} c, l)$.

Theorem 2.1.
Let $V$ and $W$ satisfy $\mathrm{span}(V) = \mathrm{CK}(A, b, k_b, m_b)$, $\mathrm{span}(W) = \mathrm{CK}(A^H, c, k_c, m_c)$ defined by (2.1). Suppose that the Lanczos biorthogonalization procedure of $[A, \varphi(A)^{-1} b, \psi(A)^{-H} c]$ does not break down until the $(l+1)$-th iteration. Let $\Lambda(\lambda) := \prod_{i=1}^{l} (\lambda - \lambda_i)$ be the monic characteristic polynomial of $(W^H V)^{-1} W^H A V$. With (2.2), write

  $g_b(\lambda) := \frac{\Lambda(\lambda)}{\varphi(\lambda)}$, $g_c(\lambda) := \frac{\Lambda(\lambda)}{\psi(\lambda)}$.  (2.3)

Then, it holds that

  $e(z) = h(z) - \tilde{h}(z) = c^H (zI - A)^{-1} b - c^H V (z W^H V - W^H A V)^{-1} W^H b = \frac{1}{g_b(z) g_c(z)}\, c^H g_c(A) (zI - A)^{-1} g_b(A) b$.

The proof is divided into three steps: Theorem 2.5, Theorem 2.12 and the final proof in Section 2.3. Furthermore, we have Theorem 2.19 for the generalization to the descriptor system. The proof is constructive, using a Lanczos biorthogonalization procedure. The assumption involved ensures that $W^H V$ is nonsingular. Now, we give some remarks immediately.

1. The combined Krylov subspaces span the same subspaces as [27, Definition 11]. Our bases in the definition (2.1) are product type bases, with no consideration of the multiplicity of the shifts. This type of basis is a truncation of the one used in [44, Theorem 6.1]. These bases are one of various bases of the rational Krylov subspace. By the expressions of (rational) polynomials (e.g. [36, Lemma 4.2(d)]), they are easily seen to span the same subspace.

2. Note that $k_b + m_b = k_c + m_c$, but $k_b$ does not need to be equal to $k_c$. We set up our rational subspaces as general as possible, so that we can include the majority of the interesting subspaces. The theorem is a conclusive result of the interpolation property. It directly gives rise to the moments matching results (e.g. [1, Propositions 11.7, 11.8, 11.10, 11.11]). From the form of $e(z)$ in Theorem 2.1, it is easy to check that $e(z)$ perfectly satisfies [27, Definition 16] (an equivalent condition of moments matching).

3.
It is observed that Krylov subspace type methods cannot maintain the stability of the system. By Theorem 2.1, we easily see the reason: the error formula is independent of the stability of $A$. Specifically, the condition of Theorem 2.1 does not require $\mathrm{eig}(A)$ to lie in the left half plane. Thus, we cannot expect this two-sided projection method to maintain stability, unless we add more constraints. Note that there exist some other applications which do not involve the stability of $A$ (e.g. [14, (3.4)] [49, Theorem 3.1]).
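Theorem 2.1 can be checked numerically. The sketch below (random stand-in data; the sizes, shifts and evaluation point are placeholders) builds $\mathrm{CK}(A, b, 2, 1)$ and $\mathrm{CK}(A^H, c, 2, 1)$ as the standard Krylov subspaces of the modified starting vectors, orthonormalizes them, and compares $e(z)$ with the right-hand side of the theorem:

```python
import numpy as np

rng = np.random.default_rng(2)
n, l = 8, 3
I = np.eye(n)
A = rng.standard_normal((n, n))
b = rng.standard_normal((n, 1))
c = rng.standard_normal((n, 1))

# k_b = k_c = 2, m_b = m_c = 1, hence l = 3 (illustrative choices).
S, T = [1.1, -0.6], [0.4, 2.0]
phiA = (A - S[0] * I) @ (A - S[1] * I)          # phi(A)
psiA = (A - T[0] * I) @ (A - T[1] * I)          # psi(A)

# CK(A,b,k_b,m_b) = Krylov(A, phi(A)^{-1} b, l); similarly on the left.
wb = np.linalg.solve(phiA, b)
wc = np.linalg.solve(psiA.conj().T, c)
V = np.linalg.qr(np.hstack([wb, A @ wb, A @ (A @ wb)]))[0]
W = np.linalg.qr(np.hstack([wc, A.conj().T @ wc, A.conj().T @ (A.conj().T @ wc)]))[0]

WV, WAV = W.conj().T @ V, W.conj().T @ A @ V
lam = np.linalg.eigvals(np.linalg.solve(WV, WAV))        # Ritz values

z = 4.5
h = (c.conj().T @ np.linalg.solve(z * I - A, b)).item()
ht = (c.conj().T @ V @ np.linalg.solve(z * WV - WAV, W.conj().T @ b)).item()

# Right-hand side of Theorem 2.1: g_b = Lambda/phi, g_c = Lambda/psi.
LamA = np.eye(n, dtype=complex)
for li in lam:
    LamA = LamA @ (A - li * I)
gbz = np.prod(z - lam) / np.prod([z - s for s in S])
gcz = np.prod(z - lam) / np.prod([z - t for t in T])
gbA = LamA @ np.linalg.inv(phiA)
gcA = LamA @ np.linalg.inv(psiA)
rhs = (c.conj().T @ gcA @ np.linalg.solve(z * I - A, gbA @ b)).item() / (gbz * gcz)
print(abs(h - ht - rhs))   # small, up to roundoff
```

The discrepancy is at the level of roundoff, which also illustrates remark 1: the formula only sees the subspaces (through the Ritz values), not the particular bases.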
Step 1:
Special subspaces and special bases.
We first prove a special case of Theorem 2.1, where $l = m_b = m_c =: m$ and $k_b = k_c = 0$. The main result of this step is Theorem 2.5.
Our proof uses the bases from the Lanczos biorthogonalization procedure of $(A, b, c)$ (e.g. [60, Section 7.1]). Thus, we need the assumption that the procedure does not break down. The conditions for successfully executing the procedure are well researched (a recent work is [56]). The relations from the Lanczos biorthogonalization procedure of $(A, b, c)$ are summarized by:

  $v_1 = b$, $w_1 = c$, $c^H b = 1$,
  $A V_m = V_{m+1} \underline{T}_m$, $\mathrm{span}(V_m) = \mathrm{Krylov}(A, b, m)$, $\mathrm{span}(V_{m+1}) = \mathrm{Krylov}(A, b, m+1)$,
  $A^H W_m = W_{m+1} \underline{K}_m$, $\mathrm{span}(W_m) = \mathrm{Krylov}(A^H, c, m)$, $\mathrm{span}(W_{m+1}) = \mathrm{Krylov}(A^H, c, m+1)$,
  $W_m^H V_m = I$, $W_{m+1}^H V_{m+1} = I$, $W_m^H A V_m = T_m$, $V_m^H A^H W_m = K_m$, $T_m = K_m^H$,

  $\underline{T}_m = \begin{bmatrix} \alpha_1 & \beta_2 & & \\ \gamma_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_m \\ & & \gamma_m & \alpha_m \\ & & & \gamma_{m+1} \end{bmatrix} = \begin{bmatrix} T_m \\ \gamma_{m+1} e_m^H \end{bmatrix}$.  (2.4)

Lemma 2.2.
Let $\tau_j(\lambda) = a_{jj} \lambda^j + a_{j,j-1} \lambda^{j-1} + \cdots + a_{j1} \lambda + a_{j0}$ be a polynomial of degree $j$. It is easy to see that $\tau_j(A) b \in \mathrm{Krylov}(A, b, j+1)$ and $\tau_j(A) b \notin \mathrm{Krylov}(A, b, j)$. With the basis $V$ in (2.4), we have

  $\tau_j(A) b = V_m \tau_j(T_m) e_1$, $j < m$,  (2.5)
  $\tau_m(A) b = V_m \tau_m(T_m) e_1 + a_{mm} \zeta_m v_{m+1}$, $j = m$,  (2.6)

where $\zeta_m = \gamma_{m+1} \cdots \gamma_3 \gamma_2$.

Lemma 2.3.
Suppose relations (2.4) hold. Let $r$ be a nonzero vector which is in $\mathrm{Krylov}(A, b, m+1)$ and orthogonal to $\mathrm{Krylov}(A^H, c, m)$. Then $r = \rho \Lambda_m(A) b$, where $\rho \neq 0$ and $\Lambda_m(\lambda) = \det(\lambda I - T_m)$ is the monic characteristic polynomial of $T_m$.

The above lemmas are generalizations of [31, Lemma 3.1, Lemma 3.2] or [54, Lemma 2.1, Corollary 2.1]. Since the proofs are quite similar, we do not provide the details.
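The relations (2.4) can be reproduced numerically. The following bare-bones sketch of the Lanczos biorthogonalization procedure uses one common scaling convention, $\gamma_{j+1} = \sqrt{|\delta|}$ and $\beta_{j+1} = \delta/\gamma_{j+1}$; it is only an illustration, with no breakdown handling and no rebiorthogonalization:

```python
import numpy as np

def lanczos_biorth(A, b, c, m):
    """Two-sided Lanczos of (A, b, c), with b rescaled so that c^H b = 1.
    Returns V, W with W^H V = I and W^H A V tridiagonal (cf. (2.4)).
    Sketch only: no breakdown handling, no rebiorthogonalization."""
    n = A.shape[0]
    b = b / (c.conj().T @ b).item()              # enforce w_1^H v_1 = 1
    V = np.zeros((n, m + 1)); W = np.zeros((n, m + 1))
    V[:, 0] = b.ravel(); W[:, 0] = c.ravel()
    beta = gamma = 0.0                           # previous beta_j, gamma_j
    for j in range(m):
        alpha = W[:, j].conj() @ A @ V[:, j]
        vt = A @ V[:, j] - alpha * V[:, j] - beta * V[:, j - 1]
        wt = A.conj().T @ W[:, j] - np.conj(alpha) * W[:, j] \
             - np.conj(gamma) * W[:, j - 1]
        delta = wt.conj() @ vt                   # breakdown if delta = 0
        gamma = np.sqrt(abs(delta))              # new gamma_{j+1}
        beta = delta / gamma                     # new beta_{j+1}
        V[:, j + 1] = vt / gamma
        W[:, j + 1] = wt / np.conj(beta)
    return V[:, :m], W[:, :m]

rng = np.random.default_rng(7)
n, m = 10, 4
A = rng.standard_normal((n, n))
b = rng.standard_normal((n, 1)); c = rng.standard_normal((n, 1))
V, W = lanczos_biorth(A, b, c, m)
T = W.conj().T @ A @ V
print(np.linalg.norm(W.conj().T @ V - np.eye(m)))  # biorthogonality: small
print(np.linalg.norm(np.triu(T, 2)) + np.linalg.norm(np.tril(T, -2)))  # tridiagonal: small
```

For generic random data the biorthogonality and tridiagonality hold to roundoff after a few steps; in finite precision and for larger m they degrade, which is why practical codes rebiorthogonalize.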
Similar to the case of linear equations [60, Chapter 5], we require the residual to be orthogonal to a given subspace. With the notations

  $r_b := b - (zI - A) x_b$, $r_c := c - (zI - A)^H x_c$,  (2.7)

we have the (Petrov-Galerkin) projection conditions:

  $x_b \in \mathrm{Krylov}(A, b, m) = \mathrm{span}(V_m)$, s.t. $r_b \perp \mathrm{Krylov}(A^H, c, m) = \mathrm{span}(W_m)$,

and

  $x_c \in \mathrm{Krylov}(A^H, c, m) = \mathrm{span}(W_m)$, s.t. $r_c \perp \mathrm{Krylov}(A, b, m) = \mathrm{span}(V_m)$.

Thus, we get $\tilde{h}(z) = x_c^H b = c^H x_b$, where

  $x_b = V (zI - W^H A V)^{-1} W^H b$, $x_c = W (zI - W^H A V)^{-H} V^H c$.

With (2.7), we easily observe that for any complex $z$ it holds that

  $r_b \in \mathrm{Krylov}(A, b, m+1) = \mathrm{span}(V_{m+1})$, $r_c \in \mathrm{Krylov}(A^H, c, m+1) = \mathrm{span}(W_{m+1})$.

Hence, there are polynomial expressions

  $x_b = \chi_b(A, z) b$, $\chi_b(\lambda, z) \in P_{m-1}(\lambda)$,
  $r_b = \gamma_b(A, z) b$, $\gamma_b(\lambda, z) \in P_m(\lambda)$.

Their relations are given by

  $r_b = b - (zI - A) x_b$, $\gamma_b(\lambda, z) = 1 - (z - \lambda) \chi_b(\lambda, z)$, $\chi_b(\lambda, z) = \frac{1 - \gamma_b(\lambda, z)}{z - \lambda}$.  (2.8)

We easily find that $\gamma_b(\lambda, z)$ satisfies a constraint condition:

  $\gamma_b(z, z) = 1$.  (2.9)

Similar to [54, (3.8)] and [31, Theorem 3.1], we obtain the polynomial formulas for the residuals.

Lemma 2.4.
Suppose relations (2.4) hold. Then, it holds that

  $r_b = \gamma_b(A, z) b$, $r_c^H = c^H \gamma_c(A, z)$, $\gamma_b(\lambda, z) = \gamma_c(\lambda, z) = \frac{\Lambda(\lambda)}{\Lambda(z)}$,

where $\Lambda(\lambda) := \prod_{i=1}^m (\lambda - \lambda_i)$ is the monic characteristic polynomial of $T_m = W^H A V$.

Proof. We observe that $r_b \in \mathrm{Krylov}(A, b, m+1)$ and $r_b \perp \mathrm{Krylov}(A^H, c, m)$. By Lemma 2.3, we directly obtain $r_b = \rho \Lambda(A) b$, where $\Lambda(\lambda)$ is the monic characteristic polynomial of $T_m = W^H A V$. Obviously, when $z$ varies, $\rho$ changes. So, we can consider $\rho$ as a function of the variable $z$. Therefore, we get $\gamma_b(\lambda, z) = \rho(z) \Lambda(\lambda)$. By the constraint condition (2.9), we directly obtain $\rho(z) = 1/\Lambda(z)$ and then $\gamma_b(\lambda, z) = \Lambda(\lambda)/\Lambda(z)$. Similarly, we get the result about $r_c$.

Theorem 2.5.
Suppose the Lanczos biorthogonalization procedure of $(A, b, c)$ does not break down until the $(m+1)$-th iteration. The relations of the formed matrices are given by (2.4). Then, it holds that

  $h(z) - \tilde{h}(z) = c^H (zI - A)^{-1} b - c^H V (z W^H V - W^H A V)^{-1} W^H b$
  $\qquad = c^H (zI - A)^{-1} b - c^H V (zI - W^H A V)^{-1} W^H b = \frac{1}{[\Lambda(z)]^2}\, c^H (zI - A)^{-1} [\Lambda(A)]^2 b$,

where $\Lambda(\lambda) := \prod_{i=1}^m (\lambda - \lambda_i)$ is the monic characteristic polynomial of $T_m = W^H A V$.

Proof. Note that

  $r_b = b - (zI - A) x_b$, $(zI - A)^{-1} b - x_b = (zI - A)^{-1} r_b$, $c^H (zI - A)^{-1} b - c^H x_b = c^H (zI - A)^{-1} r_b$;

similarly, $c^H (zI - A)^{-1} = r_c^H (zI - A)^{-1} + x_c^H$. Since $x_c \in \mathrm{Krylov}(A^H, c, m)$ and $r_b \perp \mathrm{Krylov}(A^H, c, m)$, we know $x_c^H r_b = 0$. By Lemma 2.4, we obtain

  $e(z) = h(z) - \tilde{h}(z) = c^H (zI - A)^{-1} b - c^H x_b = c^H (zI - A)^{-1} r_b = [r_c^H (zI - A)^{-1} + x_c^H] r_b$
  $\qquad = r_c^H (zI - A)^{-1} r_b = c^H \frac{\Lambda(A)}{\Lambda(z)} (zI - A)^{-1} \frac{\Lambda(A)}{\Lambda(z)} b = \frac{1}{[\Lambda(z)]^2}\, c^H (zI - A)^{-1} [\Lambda(A)]^2 b$.
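Both the intermediate identity $e(z) = r_c^H (zI - A)^{-1} r_b$ and the final formula of Theorem 2.5 can be verified numerically. The sketch below uses random stand-in data and orthonormal (rather than biorthogonal) bases of the same Krylov subspaces, which is legitimate since the quantities involved are basis-independent (cf. Section 2.3):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 9, 4
I = np.eye(n)
A = rng.standard_normal((n, n)); A /= np.linalg.norm(A, 2)   # scaled for conditioning
b = rng.standard_normal((n, 1))
c = rng.standard_normal((n, 1))

# Orthonormal bases of Krylov(A, b, m) and Krylov(A^H, c, m).
V = np.linalg.qr(np.hstack([np.linalg.matrix_power(A, j) @ b for j in range(m)]))[0]
W = np.linalg.qr(np.hstack([np.linalg.matrix_power(A.conj().T, j) @ c for j in range(m)]))[0]

WV, WAV = W.conj().T @ V, W.conj().T @ A @ V
z = 2.1
h = (c.conj().T @ np.linalg.solve(z * I - A, b)).item()
ht = (c.conj().T @ V @ np.linalg.solve(z * WV - WAV, W.conj().T @ b)).item()

# Grimme's identity e(z) = r_c^H (zI - A)^{-1} r_b ...
x_b = V @ np.linalg.solve(z * WV - WAV, W.conj().T @ b)
x_c = W @ np.linalg.solve((z * WV - WAV).conj().T, V.conj().T @ c)
r_b = b - (z * I - A) @ x_b
r_c = c - (z * I - A).conj().T @ x_c
e_res = (r_c.conj().T @ np.linalg.solve(z * I - A, r_b)).item()

# ... and e(z) = Lambda(z)^{-2} c^H (zI - A)^{-1} Lambda(A)^2 b with the Ritz values.
lam = np.linalg.eigvals(np.linalg.solve(WV, WAV))
LamA = np.eye(n, dtype=complex)
for li in lam:
    LamA = LamA @ (A - li * I)
e_fmla = (c.conj().T @ np.linalg.solve(z * I - A, LamA @ (LamA @ b))).item() \
         / np.prod(z - lam) ** 2
print(abs(h - ht - e_res), abs(h - ht - e_fmla))   # both small, up to roundoff
```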
Remark 2.6.
The relation (2.4) requires $c^H b = 1$, which is not the common case in practice. However, this does not actually influence the result of Theorem 2.5. The coefficient $m_1 := c^H b$ is the first of the moments $c^H A^i b$ [56, (2.1)]. Note that $h(z) = c^H (zI - A)^{-1} b = m_1\, c^H (zI - A)^{-1} (b/m_1)$. After applying Theorem 2.5 to $(A, b/m_1, c)$ and multiplying the obtained result by $m_1$, we again get Theorem 2.5 without the assumption $c^H b = 1$.

The relation $e(z) = r_c^H (zI - A)^{-1} r_b$ is known in Grimme's PhD thesis [33, Theorem 5.1]. The traditional result shows that $e(z) = O(1/z^{2m+1})$ (e.g. [1, Section 11.2.1] [46, (3.3.26)]). Now, we explicitly reveal what it exactly is.

Since the Ritz values are the quadrature nodes of a Gauss-Christoffel quadrature, we are motivated to find the relation between the error formula and the moments matching. The book [46, Chapter 3] by Liesen and Strakoš clearly reveals the relations among the symmetric Lanczos procedure, moments matching, orthogonal polynomials, continued fractions and Gauss quadrature, for Hermitian $A$ and $b = c$. Recently, Strakoš and his co-workers have a series of works generalizing it to the Lanczos biorthogonalization procedure [56-59, 65].

We first define a linear functional [56]:

  $I(F) := c^H F(A) b$.  (2.10)

If $A$ is Hermitian positive definite and $b = c$, then it can be expressed as an integral [46, Page 136]. With the bases from (2.4), we have the Gauss-Christoffel quadrature formula $I(F) = I_G(F) + E(F)$, i.e.,

  $c^H F(A) b = c^H V F(T_m) W^H b + I\big([\Lambda(\lambda)]^2 F[\lambda_1, \lambda_1, \lambda_2, \lambda_2, \ldots, \lambda_m, \lambda_m, \lambda]\big)$.  (2.11)

We express the Gauss-Christoffel quadrature formula in the matrix form $I_G(F) = c^H V F(T_m) W^H b = m_1 e_1^H F(T_m) e_1$. A Gauss quadrature formula obviously has the common form $I_G(F) = \sum_{j=1}^m \omega_j F(\lambda_j)$ (cf. Remark 2.16), where $\omega_j$ is the quadrature weight and $\lambda_j$ is the quadrature node.
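The exactness of the quadrature on low-degree polynomials is the moments matching, which can be checked directly. The sketch below (random stand-in data) uses the basis-independent form $c^H A^j b = c^H V\, A_r^j\, (W^H V)^{-1} W^H b$ with $A_r = (W^H V)^{-1} W^H A V$, valid for $j = 0, \ldots, 2m-1$:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 10, 4
A = rng.standard_normal((n, n)); A /= np.linalg.norm(A, 2)   # scaled for conditioning
b = rng.standard_normal((n, 1))
c = rng.standard_normal((n, 1))

V = np.linalg.qr(np.hstack([np.linalg.matrix_power(A, j) @ b for j in range(m)]))[0]
W = np.linalg.qr(np.hstack([np.linalg.matrix_power(A.conj().T, j) @ c for j in range(m)]))[0]
Ar = np.linalg.solve(W.conj().T @ V, W.conj().T @ A @ V)
br = np.linalg.solve(W.conj().T @ V, W.conj().T @ b)
cr = V.conj().T @ c

# The first 2m moments of the full and reduced models agree.
gaps = [abs((c.conj().T @ np.linalg.matrix_power(A, j) @ b).item()
            - (cr.conj().T @ np.linalg.matrix_power(Ar, j) @ br).item())
        for j in range(2 * m)]
print(max(gaps))   # small, up to roundoff
```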
Actually, it also has a continued fraction form (e.g. [46, Section 3.3] [1, Example 11.5]). An easy way to see that the matrix form is a Gauss type quadrature formula is as follows.

Proposition 2.7. [24, Theorem 2]
Suppose relations (2.4) hold. Then, it holds that $c^H P(A) b = e_1^H P(T_m) e_1$ for any $P(\lambda) \in P_{2m-1}$.

The relations (2.4) require $m_1 = c^H b = 1$. If $m_1 \neq 1$, then the treatment is the same as in Remark 2.6. Proposition 2.7 gives the characteristic property of a Gauss type quadrature formula: obviously, it holds that $I(P) = I_G(P)$ for any $P(\lambda) \in P_{2m-1}$, and taking $P(\lambda) = \lambda^j$ $(j = 0, 1, \ldots, 2m-1)$ recovers the moments matching. Note that the remainder in (2.11) is of divided difference type (valid for non-Hermitian $A$ and $b \neq c$) instead of the common Lagrange type remainder involving high derivatives and an unknown $\xi$ (e.g. [46, (3.2.21)] [13, Section 4.3]).

Remark 2.8.
If all of the $\lambda_i$ are real, then the divided difference remainder can be transformed into the Lagrange type remainder:

  $F[\lambda_1, \lambda_1, \lambda_2, \lambda_2, \ldots, \lambda_m, \lambda_m, \lambda] = \frac{F^{(2m)}(\xi)}{(2m)!}$.  (2.12)

This is a well known result (e.g. [13, Exercise 3.3.20]). We can prove it by repeatedly utilizing the Lagrange mean value theorem. However, the Lagrange mean value theorem cannot be directly generalized to the complex field. Thus, we cannot write (2.12) for complex $\lambda_i$ directly. Note that in the complex field, Lagrange interpolating polynomials, Newton interpolating polynomials and Hermite interpolating polynomials still hold with the divided difference remainder or the Hermite remainder (cf. Lemma 2.17).

Now, we apply the quadrature to the resolvent function $H(\lambda, z) = 1/(z - \lambda)$. Substituting $H(\lambda, z) = 1/(z - \lambda)$ into $F(\lambda)$ in (2.11) and noticing $W^H b = V^H c = e_1$, $W^H A V = T_m$, we obtain

  $c^H (zI - A)^{-1} b = c^H V (zI - W^H A V)^{-1} W^H b + E(H)$,

or

  $e(z) = h(z) - \tilde{h}(z) = E(H) = I\big([\Lambda(\lambda)]^2 H[\lambda_1, \lambda_1, \lambda_2, \lambda_2, \ldots, \lambda_m, \lambda_m, \lambda]\big)$.  (2.13)

Lemma 2.9.
Set $H(\lambda) = 1/(z - \lambda)$. It holds that $H[\lambda_1, \lambda_2, \ldots, \lambda_m] = 1/\prod_{i=1}^m (z - \lambda_i)$ for distinct $\lambda_i$ $(i = 1, 2, \ldots, m)$ with $m \geq 1$.

Proof. By using the recursive definition of the divided differences (e.g. [13, (3.8)]), we can prove it by mathematical induction.

For $H(\lambda) = 1/(z - \lambda)$, we can easily check that

  $H[\lambda_1, \lambda_1, \lambda_2, \lambda_2, \ldots, \lambda_m, \lambda_m, \lambda] = \frac{1}{(z - \lambda)\,[\prod_{i=1}^m (z - \lambda_i)]^2}$.  (2.14)

Note that the term $[\Lambda(z)]^2 = [\prod_{i=1}^m (z - \lambda_i)]^2$ is independent of $\lambda$. After we apply the definition (2.10) to (2.13) and (2.14), we obtain an expression of $e(z)$ which is exactly the same as the result of Theorem 2.5. In conclusion, the error formula of $e(z)$ in Theorem 2.5 is the remainder of the Gauss quadrature (2.11), when the Gauss quadrature is applied to the resolvent function $H(\lambda, z) = 1/(z - \lambda)$.

In the above discussion, we pass over many closely related terminologies, such as the Jacobi matrix, (formal) orthogonal polynomials and continued fractions. Their relations are well described in [46, Chapter 3]. For more details of the Gauss quadrature, we refer the readers to [18, 24, 30, 50, 62] and references therein.

Step 2:
General subspaces and special bases.
The main result of this step is Theorem 2.12.
We now treat the general subspaces $\mathrm{CK}(A, b, k_b, m_b)$ and $\mathrm{CK}(A^H, c, k_c, m_c)$. A rational Krylov subspace can be obtained as a standard Krylov subspace with a special starting vector (e.g. [36, Lemma 4.2(a)]). With (2.1), we have

  $\mathrm{CK}(A, b, k_b, m_b) = \mathrm{Krylov}(A, \varphi(A)^{-1} b, l)$, $\mathrm{CK}(A, b, k_b, m_b + 1) = \mathrm{Krylov}(A, \varphi(A)^{-1} b, l+1)$,
  $\mathrm{CK}(A^H, c, k_c, m_c) = \mathrm{Krylov}(A^H, \psi(A)^{-H} c, l)$, $\mathrm{CK}(A^H, c, k_c, m_c + 1) = \mathrm{Krylov}(A^H, \psi(A)^{-H} c, l+1)$.  (2.15)

We use the Lanczos biorthogonalization procedure of $A$, $\varphi(A)^{-1} b$ and $\psi(A)^{-H} c$ to form the biorthogonal bases of $\mathrm{Krylov}(A, \varphi(A)^{-1} b, l+1)$ and $\mathrm{Krylov}(A^H, \psi(A)^{-H} c, l+1)$. The relations are summarized as follows:

  $\hat{v}_1 = \varphi(A)^{-1} b$, $\hat{w}_1 = \psi(A)^{-H} c$, $\hat{w}_1^H \hat{v}_1 = 1$,
  $A \hat{V}_l = \hat{V}_{l+1} \underline{\hat{T}}_l$, $\mathrm{span}(\hat{V}_l) = \mathrm{Krylov}(A, \varphi(A)^{-1} b, l)$, $\mathrm{span}(\hat{V}_{l+1}) = \mathrm{Krylov}(A, \varphi(A)^{-1} b, l+1)$,
  $A^H \hat{W}_l = \hat{W}_{l+1} \underline{\hat{K}}_l$, $\mathrm{span}(\hat{W}_l) = \mathrm{Krylov}(A^H, \psi(A)^{-H} c, l)$, $\mathrm{span}(\hat{W}_{l+1}) = \mathrm{Krylov}(A^H, \psi(A)^{-H} c, l+1)$,
  $\hat{W}_l^H \hat{V}_l = I$, $\hat{W}_{l+1}^H \hat{V}_{l+1} = I$, $\hat{W}_l^H A \hat{V}_l = \hat{T}_l$, $\hat{V}_l^H A^H \hat{W}_l = \hat{K}_l$, $\hat{T}_l = \hat{K}_l^H$,

  $\underline{\hat{T}}_l = \begin{bmatrix} \alpha_1 & \beta_2 & & \\ \gamma_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_l \\ & & \gamma_l & \alpha_l \\ & & & \gamma_{l+1} \end{bmatrix} = \begin{bmatrix} \hat{T}_l \\ \gamma_{l+1} e_l^H \end{bmatrix}$.  (2.16)

Like what is done in Section 2.1.2, we introduce the same notations for $x_b$, $x_c$, $r_b$, $r_c$ and their rational polynomial expressions. The two-sided projection (Petrov-Galerkin) conditions are:

  $x_b \in \mathrm{CK}(A, b, k_b, m_b)$, $r_b = b - (zI - A) x_b \in \mathrm{CK}(A, b, k_b, m_b + 1)$, s.t. $r_b \perp \mathrm{CK}(A^H, c, k_c, m_c)$,  (2.17)

and

  $x_c \in \mathrm{CK}(A^H, c, k_c, m_c)$, $r_c = c - (zI - A)^H x_c \in \mathrm{CK}(A^H, c, k_c, m_c + 1)$, s.t. $r_c \perp \mathrm{CK}(A, b, k_b, m_b)$.

Thus, we get $\tilde{h}(z) = x_c^H b = c^H x_b$, where

  $x_b = \hat{V} (zI - \hat{W}^H A \hat{V})^{-1} \hat{W}^H b$, $x_c = \hat{W} (zI - \hat{W}^H A \hat{V})^{-H} \hat{V}^H c$.

By (2.15), the rational polynomial expressions are

  $x_b = \chi_b(A, z) b$, $\chi_b(\lambda, z) \in P_{l-1}(\lambda)/\varphi(\lambda)$,
  $r_b = \gamma_b(A, z) b$, $\gamma_b(\lambda, z) \in P_l(\lambda)/\varphi(\lambda)$.

Their relations are given by

  $r_b = b - (zI - A) x_b$, $\gamma_b(\lambda, z) = 1 - (z - \lambda) \chi_b(\lambda, z)$, $\chi_b(\lambda, z) = \frac{1 - \gamma_b(\lambda, z)}{z - \lambda}$.  (2.18)

Thus, we still have the constraint condition $\gamma_b(z, z) = 1$. We first give the expressions of $r_b$ and $r_c$.

Lemma 2.10.
Suppose relations (2.16) hold. Set $g_b(z) := \Lambda(z)/\varphi(z)$ and $g_c(z) := \Lambda(z)/\psi(z)$, where $\Lambda(\lambda)$ is the monic characteristic polynomial of $\hat{T}_l = \hat{W}^H A \hat{V}$. Then, it holds that

  $\gamma_b(\lambda, z) = \frac{g_b(\lambda)}{g_b(z)}$, $r_b = \gamma_b(A, z) b$, $\gamma_c(\lambda, z) = \frac{g_c(\lambda)}{g_c(z)}$, $r_c^H = c^H \gamma_c(A, z)$.

Proof. Translating (2.17) into the standard Krylov subspace language by (2.15), we obtain

  $x_b \in \mathrm{Krylov}(A, \varphi(A)^{-1} b, l)$, $r_b = b - (zI - A) x_b \in \mathrm{Krylov}(A, \varphi(A)^{-1} b, l+1)$, s.t. $r_b \perp \mathrm{Krylov}(A^H, \psi(A)^{-H} c, l)$.

By Lemma 2.3, we directly obtain $r_b = \rho \Lambda(A) [\varphi(A)^{-1} b] = \rho\, g_b(A) b$, where $\Lambda(\lambda)$ is the monic characteristic polynomial of $\hat{T}_l$. When $z$ varies, the coefficient $\rho$ changes. So, we can consider $\rho$ as a function of the variable $z$. With the new notation $\rho(z)$, we obtain $\gamma_b(\lambda, z) = \rho(z) \frac{\Lambda(\lambda)}{\varphi(\lambda)} = \rho(z) g_b(\lambda)$. Because of the constraint condition $\gamma_b(z, z) = 1$, we obtain $\rho(z) = 1/g_b(z)$. Hence, $\gamma_b(\lambda, z) = g_b(\lambda)/g_b(z)$. Similarly, we obtain the result about $r_c$.

Remark 2.11.
By relation (2.18), we obtain a new expression of $\tilde{h}(z) = c^H x_b = x_c^H b$:

  $\tilde{h}(z) = c^H \chi_b(A, z) b = c^H \chi_c(A, z) b$, $\chi_b(\lambda, z) = \frac{1 - \gamma_b(\lambda, z)}{z - \lambda}$, $\chi_c(\lambda, z) = \frac{1 - \gamma_c(\lambda, z)}{z - \lambda}$.

Theorem 2.12.
Suppose the Lanczos biorthogonalization procedure of $[A, \varphi(A)^{-1} b, \psi(A)^{-H} c]$ does not break down until the $(l+1)$-th iteration. The formed matrices satisfy (2.16). Then, it holds that

  $h(z) - \tilde{h}(z) = c^H (zI - A)^{-1} b - c^H \hat{V} (zI - \hat{W}^H A \hat{V})^{-1} \hat{W}^H b = r_c^H (zI - A)^{-1} r_b = \frac{1}{g_b(z) g_c(z)}\, c^H g_c(A) (zI - A)^{-1} g_b(A) b$,

where $\Lambda(\lambda)$ is the monic characteristic polynomial of $\hat{T}_l = \hat{W}^H A \hat{V}$.

The proof is straightforward by substituting the result of Lemma 2.10 into $e(z) = r_c^H (zI - A)^{-1} r_b$. The assumption $c^H \psi(A)^{-1} \varphi(A)^{-1} b = 1$ in (2.16) also does not influence the result; the discussion is almost the same as the one about $c^H b = 1$ for Theorem 2.5, stated in Remark 2.6. In a word, we obtain the error formula by using the special bases $\hat{V}$ and $\hat{W}$ of the general subspaces $\mathrm{CK}(A, b, k_b, m_b)$ and $\mathrm{CK}(A^H, c, k_c, m_c)$.

We give another proof of Theorem 2.12 from the viewpoint of Gauss quadrature, like what we do in Section 2.1.3 for Theorem 2.5.
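The weighted moment matching that drives this second proof can also be checked numerically. The sketch below (random stand-in data; sizes and shifts are placeholders) verifies, in the basis-independent form with $A_r = (W^H V)^{-1} W^H A V$, that $c^H \psi(A)^{-1} A^j \varphi(A)^{-1} b$ equals its reduced counterpart for $j \le 2l - 1$ (cf. Propositions 2.13 and 2.14 below):

```python
import numpy as np

rng = np.random.default_rng(6)
n, l = 9, 3
I = np.eye(n)
A = rng.standard_normal((n, n))
b = rng.standard_normal((n, 1))
c = rng.standard_normal((n, 1))

S, T = [1.1, -0.6], [0.4, 2.0]   # k_b = k_c = 2, m_b = m_c = 1 (placeholders)
phiA = (A - S[0] * I) @ (A - S[1] * I)
psiA = (A - T[0] * I) @ (A - T[1] * I)
wb = np.linalg.solve(phiA, b)
wc = np.linalg.solve(psiA.conj().T, c)
V = np.linalg.qr(np.hstack([wb, A @ wb, A @ (A @ wb)]))[0]
W = np.linalg.qr(np.hstack([wc, A.conj().T @ wc, A.conj().T @ (A.conj().T @ wc)]))[0]

WV = W.conj().T @ V
Ar = np.linalg.solve(WV, W.conj().T @ A @ V)
br = np.linalg.solve(WV, W.conj().T @ b)
Il = np.eye(l)
phiAr = (Ar - S[0] * Il) @ (Ar - S[1] * Il)
psiAr = (Ar - T[0] * Il) @ (Ar - T[1] * Il)

gaps = []
for j in range(2 * l):   # Q(lambda) = lambda^j / (phi(lambda) psi(lambda))
    full = (c.conj().T @ np.linalg.solve(psiA, np.linalg.matrix_power(A, j))
            @ np.linalg.solve(phiA, b)).item()
    red = ((V.conj().T @ c).conj().T @ np.linalg.solve(psiAr, np.linalg.matrix_power(Ar, j))
           @ np.linalg.solve(phiAr, br)).item()
    gaps.append(abs(full - red))
print(max(gaps))   # small, up to roundoff
```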
Proposition 2.13.
Suppose relations (2.16) hold. Then, it holds that $c^H \psi(A)^{-1} P(A) \varphi(A)^{-1} b = e_1^H P(\hat{T}_l) e_1$ for any $P(\lambda) \in P_{2l-1}$.

Proof. By (2.16), the related matrices are formed by the Lanczos biorthogonalization procedure of $A$, $\varphi(A)^{-1} b$ and $\psi(A)^{-H} c$. Thus, the proof ends by directly applying Proposition 2.7 to $A$, $\varphi(A)^{-1} b$ and $\psi(A)^{-H} c$. If $\varphi(\lambda) = \psi(\lambda) = 1$ (i.e. $k_b = k_c = 0$), then the result is actually Proposition 2.7 ([24, Theorem 2]).

Proposition 2.14.
Suppose relations (2.16) hold. Then, it holds that $c^H Q(A) b = c^H \hat{V} Q(\hat{W}^H A \hat{V}) \hat{W}^H b$ for any $Q(\lambda) \in \frac{P_{2l-1}(\lambda)}{\varphi(\lambda)\psi(\lambda)}$.

Proof. Similar to Lemma 2.2, for $\grave{\tau}_j(\lambda) := \grave{a}_{jj} \lambda^j + \cdots + \grave{a}_{j1} \lambda + \grave{a}_{j0}$, we obtain

  $\grave{\tau}_j(A) \varphi(A)^{-1} b = \hat{V}_l \grave{\tau}_j(\hat{T}_l) e_1$, $j < l$,
  $\grave{\tau}_l(A) \varphi(A)^{-1} b = \hat{V}_l \grave{\tau}_l(\hat{T}_l) e_1 + \grave{a}_{ll} \grave{\zeta}_l \hat{v}_{l+1}$, $j = l$, where $\grave{\zeta}_l := \gamma_{l+1} \cdots \gamma_3 \gamma_2$.

Setting $\grave{\tau}_j(\lambda) = \varphi_{k_b}(\lambda)$, we obtain

  $\varphi_{k_b}(A) \varphi_{k_b}(A)^{-1} b = \hat{V}_l \varphi(\hat{T}_l) e_1$, if $k_b < l$,
  $\varphi_{k_b}(A) \varphi_{k_b}(A)^{-1} b = \hat{V}_l \varphi(\hat{T}_l) e_1 + \grave{a}_{ll} \grave{\zeta}_l \hat{v}_{l+1}$, if $k_b = l$, $m_b = 0$.

Note that $\hat{W}_l^H \hat{V}_l = I$, $\hat{W}_l^H \hat{v}_{l+1} = 0$ and $\hat{W}^H \varphi(A)^{-1} b = e_1$. No matter whether $k_b < l$ is true, we obtain

  $\hat{W}^H b = \hat{W}^H \varphi(A) \varphi(A)^{-1} b = \varphi(\hat{T}_l) e_1 = \varphi(\hat{T}_l) \hat{W}^H \varphi(A)^{-1} b$, i.e., $\varphi(\hat{T}_l)^{-1} \hat{W}^H b = \hat{W}^H \varphi(A)^{-1} b = e_1$.  (2.19)

Similarly, we obtain

  $e_1 = \psi(\hat{K}_l)^{-1} \hat{V}^H c$, $e_1^H = c^H \hat{V} \psi(\hat{K}_l)^{-H} = c^H \hat{V} \psi(\hat{T}_l)^{-1}$.  (2.20)

The proof ends by rewriting Proposition 2.13 with (2.19) and (2.20).

Now, we start to explain the error formula. For simple presentation, we assume that $\hat{m}_1 = 1$, where $\hat{m}_1 := c^H \psi(A)^{-1} \varphi(A)^{-1} b$. If $\hat{m}_1 \neq 1$, then a discussion similar to Remark 2.6 can be applied. We define the linear functional with a weighted term:

  $\hat{I}(F) := c^H \psi(A)^{-1} F(A) \varphi(A)^{-1} b$.  (2.21)
With the bases from (2.16), we have the Gauss-Christoffel quadrature formula $\hat{I}(F) = \hat{I}_G(F) + \hat{E}(F)$, i.e.,

  $c^H \psi(A)^{-1} F(A) \varphi(A)^{-1} b = e_1^H F(\hat{T}_l) e_1 + \hat{I}\big([\Lambda(\lambda)]^2 F[\lambda_1, \lambda_1, \lambda_2, \lambda_2, \ldots, \lambda_l, \lambda_l, \lambda]\big)$.  (2.22)

By Proposition 2.13, we observe that $\hat{I}(P) = \hat{I}_G(P)$ holds for any $P(\lambda) \in P_{2l-1}$. Thus, we know $\hat{I}_G(F) = e_1^H F(\hat{T}_l) e_1$ is a Gauss-Christoffel quadrature formula.

Then, we proceed with the function $\hat{H}(\lambda, z) = \varphi(\lambda)\psi(\lambda)/(z - \lambda)$. After substituting $\hat{H}(\lambda, z)$ into $F(\lambda)$ in (2.22), we obtain

  $c^H (zI - A)^{-1} b = e_1^H \psi(\hat{T}_l) (zI - \hat{T}_l)^{-1} \varphi(\hat{T}_l) e_1 + \hat{E}(\hat{H}) = c^H \hat{V} (zI - \hat{W}^H A \hat{V})^{-1} \hat{W}^H b + \hat{E}(\hat{H})$,

where $\varphi(\hat{T}_l) e_1 = \hat{W}^H b$ and $e_1^H \psi(\hat{T}_l) = c^H \hat{V}$ are already stated in (2.19) and (2.20). Hence,

  $e(z) = h(z) - \tilde{h}(z) = \hat{E}(\hat{H}) = \hat{I}\big([\Lambda(\lambda)]^2 \hat{H}[\lambda_1, \lambda_1, \lambda_2, \lambda_2, \ldots, \lambda_l, \lambda_l, \lambda]\big)$.  (2.23)

Lemma 2.15.
Set $\widetilde{M}(\lambda) = [\prod_{i=1}^n (\lambda - t_i)]/(z - \lambda)$. For distinct $\lambda_i$ $(i = 1, 2, \ldots, m)$ with $m \geq n + 1$, it holds that

  $\widetilde{M}[\lambda_1, \lambda_2, \ldots, \lambda_m] = \frac{\prod_{i=1}^n (z - t_i)}{\prod_{i=1}^m (z - \lambda_i)}$.

Proof. Separate every factor in the numerator of $\widetilde{M}(\lambda)$ as $(\lambda - t_i) = (\lambda - z) + (z - t_i)$ and expand the product. Then, we obtain

  $\widetilde{M}(\lambda) = \underbrace{-(\lambda - z)^{n-1} + a_{n-2} (\lambda - z)^{n-2} + \cdots + a_1 (\lambda - z) + a_0}_{P_{n-1}(\lambda)} + \underbrace{\frac{\prod_{i=1}^n (z - t_i)}{z - \lambda}}_{R(\lambda)}$,

where $a_0, a_1, \ldots, a_{n-2}$ are independent of $\lambda$.

The term $P_{n-1}[\lambda_1, \lambda_2, \ldots, \lambda_m]$ is the coefficient of the highest degree term of the interpolating polynomial. Now, the interpolating polynomial is of degree $m - 1$, while the interpolated polynomial $P_{n-1}(\lambda)$ is only of degree $n - 1$. If the condition $m \geq n$ holds, then the interpolating polynomial happens to be $P_{n-1}(\lambda)$ itself. Thus, we obtain $P_{n-1}[\lambda_1, \lambda_2, \ldots, \lambda_m] = 0$ from $m \geq n + 1$. Here, we do not use the well known relation between the divided differences and the high order derivatives (e.g. [13, Exercise 3.3.20]); the reason is described in Remark 2.8. Otherwise, we could directly obtain $P_{n-1}[\lambda_1, \lambda_2, \ldots, \lambda_m] = P_{n-1}^{(m-1)}(\xi)/(m-1)! = 0$. The numerator of $R(\lambda)$ is independent of $\lambda$, so we can consider $R(\lambda)$'s numerator as a coefficient. Together with Lemma 2.9, we obtain $R[\lambda_1, \lambda_2, \ldots, \lambda_m] = [\prod_{i=1}^n (z - t_i)]/[\prod_{i=1}^m (z - \lambda_i)]$. Thus, the proof ends by noticing $\widetilde{M}[\lambda_1, \lambda_2, \ldots, \lambda_m] = P_{n-1}[\lambda_1, \lambda_2, \ldots, \lambda_m] + R[\lambda_1, \lambda_2, \ldots, \lambda_m]$.

For $\hat{H}(\lambda, z) = \varphi(\lambda)\psi(\lambda)/(z - \lambda)$, by Lemma 2.15, we can easily get

  $\hat{H}[\lambda_1, \lambda_1, \lambda_2, \lambda_2, \ldots, \lambda_l, \lambda_l, \lambda] = \left\{\frac{\varphi(z)\psi(z)}{[\Lambda(z)]^2}\right\} \frac{1}{z - \lambda}$.  (2.24)

Substituting (2.24) into (2.23), using the definition (2.21) and noticing that the term in the braces is independent of $\lambda$, we obtain an expression of $e(z)$ which is the same as the result of Theorem 2.12.

In conclusion, we use a weighted linear functional (2.22) for the function $\hat{H}(\lambda, z) = \varphi(\lambda)\psi(\lambda)/(z - \lambda)$. The error formula is the Gauss quadrature remainder, when the Gauss quadrature (2.22) is applied to $\hat{H}(\lambda, z)$. The difference between Section 2.1.3 and here is that we use different weight terms for the linear functionals. Thus, the shifts $t_i$, $s_i$ in $\varphi(\lambda)$, $\psi(\lambda)$ change the linear functional from (2.10) to (2.22), like the role of the preconditioner $M$ in the Preconditioned Conjugate Gradient (PCG) method; [60, Section 9.2] clearly reveals how $M$ changes the inner product in PCG.

Next, we provide the common form of the Gauss quadrature, which is used in Section 2.4.2.

Remark 2.16.
To give a simple expression, we assume the $\lambda_i$ $(i = 1, 2, \ldots, l)$ are distinct here. Otherwise, the Gauss quadrature formula includes higher order derivative terms [56, Section 6]. Let $\hat{L}_i(\lambda)$ be the Lagrange basis functions on the nodes $\lambda_i$:

  $\hat{L}_i(\lambda) = \prod_{j=1, j \neq i}^{l} \frac{\lambda - \lambda_j}{\lambda_i - \lambda_j}$.  (2.25)

Then, the Lagrange interpolating polynomial of $\hat{H}(\lambda, z) = \varphi(\lambda)\psi(\lambda)/(z - \lambda)$ is $\hat{P}_{l-1}(\lambda) = \sum_{i=1}^l \hat{H}(\lambda_i) \hat{L}_i(\lambda)$. Hence, $\tilde{h}(z) = c^H \hat{V} (zI - \hat{W}^H A \hat{V})^{-1} \hat{W}^H b = e_1^H \psi(\hat{T}_l) (zI - \hat{T}_l)^{-1} \varphi(\hat{T}_l) e_1$ can also be expressed as $\tilde{h}(z) = \sum_{i=1}^l \hat{I}(\hat{L}_i(\lambda)) \hat{H}(\lambda_i)$, where $\hat{I}(\hat{L}_i(\lambda))$ is actually the Gauss quadrature weight.

Step 3:
General subspaces and general bases.
We give the final proof of Theorem 2.1.
Proof. Let $\hat{V}$ and $\hat{W}$ be the special bases used in Theorem 2.12. They satisfy $\mathrm{span}(\hat{V}_l) = \mathrm{CK}(A, b, k_b, m_b)$, $\mathrm{span}(\hat{W}_l) = \mathrm{CK}(A^H, c, k_c, m_c)$ and $\hat{W}^H \hat{V} = I$. Since they span the same subspaces as $\mathrm{span}(V)$ and $\mathrm{span}(W)$ do, we obtain the relations $V = \hat{V} R$ and $W = \hat{W} S$, where both $R$ and $S$ are nonsingular transformation matrices. Then, it holds that

  $\tilde{h}(z) = c^H V (z W^H V - W^H A V)^{-1} W^H b = c^H \hat{V} R\, (z S^H \hat{W}^H \hat{V} R - S^H \hat{W}^H A \hat{V} R)^{-1} S^H \hat{W}^H b$
  $\qquad = c^H \hat{V} (z \hat{W}^H \hat{V} - \hat{W}^H A \hat{V})^{-1} \hat{W}^H b = c^H \hat{V} (zI - \hat{W}^H A \hat{V})^{-1} \hat{W}^H b$.

By Theorem 2.12, we obtain

  $h(z) - \tilde{h}(z) = \frac{1}{g_b(z) g_c(z)}\, c^H g_c(A) (zI - A)^{-1} g_b(A) b$,

where the $\lambda_i$ are the eigenvalues of $\hat{W}^H A \hat{V}$.
In this formula, the only things related to the bases V̂ and Ŵ are the eigenvalues. Let us see how to relate these eigenvalues to the bases V and W. Obviously, we have eig(Ŵ^H A V̂) = eig(S^{-H} W^H A V R^{-1}) = eig((S^H R)^{-1} W^H A V) = eig((W^H V)^{-1} W^H A V).

Researchers already use rational biorthogonal bases to design MOR [5, 6, 22, 23]. Since the error formula is independent of the bases, we obviously prefer V and W to be orthonormal in practical implementation, that is, V_l^H V_l = I and W_l^H W_l = I, where span(V_l) = CK(A, b, k_b, m_b) and span(W_l) = CK(A^H, c, k_c, m_c). To obtain the orthonormal bases we can apply an Arnoldi-like procedure to the basis (2.1). If the shift sets T and S are given, then a more convenient approach is to call A Rational Krylov Toolbox for MATLAB [12]. For s_i ≠ s_j, the partial fraction identity

(λ − s_i)^{-1}(λ − s_j)^{-1} = (s_i − s_j)^{-1}[(λ − s_i)^{-1} − (λ − s_j)^{-1}]

shows that span{(λ − s_i)^{-1}, (λ − s_i)^{-1}(λ − s_j)^{-1}} = span{(λ − s_i)^{-1}, (λ − s_j)^{-1}}. Thus, for distinct shifts s_i, we can obtain the orthonormal basis by orthogonalizing [(A − s_1 I)^{-1} b, (A − s_2 I)^{-1} b, …]. It is more convenient than orthogonalizing [(A − s_1 I)^{-1} b, (A − s_2 I)^{-1}(A − s_1 I)^{-1} b, …] as before. Using the new notation

G_two(λ) := g_b(λ) g_c(λ) = [Λ(λ)/ϕ(λ)] · [Λ(λ)/ψ(λ)] = [Λ(λ)]^2 / [ϕ(λ)ψ(λ)], (2.26)

the error formula is simplified as

e(z) = h(z) − h̃(z) = [1/G_two(z)] c^H (zI − A)^{-1} G_two(A) b. (2.27)
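The simplified formula (2.27) can be checked numerically. The following minimal Python sketch is not from the paper: the matrix, vectors, shifts and evaluation point are all hypothetical toy data, and for simplicity the left shifts are taken equal to the right shifts (t_i = s_i) so that ψ = ϕ. It builds two-sided rational Krylov bases, evaluates the error h(z) − h̃(z) directly, and compares it with the right-hand side of (2.27).

```python
import numpy as np

rng = np.random.default_rng(0)
n, l = 8, 3
# toy real data (hypothetical): A with spectrum near 1..8, random b, c
A = np.diag(np.arange(1.0, n + 1)) + 0.1 * rng.standard_normal((n, n))
b = rng.standard_normal(n)
c = rng.standard_normal(n)
I = np.eye(n)

s = np.array([10.0, 12.0, 15.0])  # right shifts (phi); chosen away from eig(A)
t = s.copy()                      # left shifts (psi); t = s for simplicity

# rational Krylov bases: distinct shifts, so single resolvent applications suffice
V = np.linalg.qr(np.column_stack([np.linalg.solve(A - sj * I, b) for sj in s]))[0]
W = np.linalg.qr(np.column_stack([np.linalg.solve(A.T - tj * I, c) for tj in t]))[0]

T = np.linalg.solve(W.T @ V, W.T @ A @ V)   # (W^H V)^{-1} W^H A V
lam = np.linalg.eigvals(T)                  # Ritz values (roots of Lambda)

def poly_mat(roots, M):
    # evaluate the monic polynomial with the given roots at the matrix M
    P = np.eye(n, dtype=complex)
    for r in roots:
        P = P @ (M - r * np.eye(n))
    return P

z = 2.0j  # evaluation point on the imaginary axis (hypothetical)
e_direct = c @ np.linalg.solve(z * I - A, b) \
         - c @ V @ np.linalg.solve(z * (W.T @ V) - W.T @ A @ V, W.T @ b)

# G_two(lambda) = Lambda(lambda)^2 / (phi(lambda) psi(lambda)), cf. (2.26)
G_z = np.prod(z - lam) ** 2 / (np.prod(z - s) * np.prod(z - t))
GA = poly_mat(lam, A) @ poly_mat(lam, A)    # Lambda(A)^2
for sj in np.concatenate([s, t]):
    GA = GA @ np.linalg.inv(A - sj * np.eye(n, dtype=complex))
e_formula = (1.0 / G_z) * (c @ np.linalg.solve(z * I - A, GA @ b))

print(abs(e_direct - e_formula) / abs(e_direct))
```

With t = s the reduced model also Hermite-interpolates h at each shift, so evaluating both transfer functions at s_1 should give matching values up to rounding.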
The Hermite formula is the Cauchy integral form of the interpolation remainder [43, (8)] [26, Page 59]. After transforming the polynomial to the rational polynomial [36, Page 33], we are able to obtain an error formula of rational polynomial interpolation, which is called the Walsh-Hermite formula in [37, Page 19].
Lemma 2.17. (Hermite)
Suppose the boundary Γ of Σ consists of finitely many rectifiable Jordan curves with positive orientation relative to Σ, and suppose M(z) is analytic in Σ and continuous in Σ ∪ Γ. Suppose the interpolation conditions hold, i.e., M(α_i) = P_{k-1}(α_i) for α_i ∈ Σ (i = 1, 2, …, k). Write π_k^α(z) = ∏_{i=1}^{k} (z − α_i). Then, it holds that

M(z) − P_{k-1}(z) = [1/(2πι)] ∮_Γ [π_k^α(z)/π_k^α(ζ)] · M(ζ)/(ζ − z) dζ.
Let Q_{k-1,k}(z) ∈ Q_{k-1,k}(z) be a rational interpolating polynomial of N(z) with poles β_i and interpolating nodes α_i. Write

G_k^{α,β}(z) := ∏_{i=1}^{k} (z − α_i)/(z − β_i) = [(z − α_1)(z − α_2) ⋯ (z − α_k)] / [(z − β_1)(z − β_2) ⋯ (z − β_k)].

Suppose the boundary Γ of Σ consists of finitely many rectifiable Jordan curves with positive orientation relative to Σ, and suppose N(z) is analytic in Σ and continuous in Σ ∪ Γ. If the interpolating nodes α_i are in Σ, then it holds that

N(z) − Q_{k-1,k}(z) = [1/(2πι)] ∮_Γ [G_k^{α,β}(z)/G_k^{α,β}(ζ)] · N(ζ)/(ζ − z) dζ.

Proof. Let π_k^β(z) := ∏_{i=1}^{k} (z − β_i), Ñ(z) := N(z) π_k^β(z) and P_{k-1}(z) := Q_{k-1,k}(z) π_k^β(z). By the rational interpolation condition, we obtain Ñ(α_i) = P_{k-1}(α_i) for α_i ∈ Σ (i = 1, 2, …, k). Thus, Lemma 2.17 applies:

N(z) π_k^β(z) − P_{k-1}(z) = Ñ(z) − P_{k-1}(z) = [1/(2πι)] ∮_Γ [π_k^α(z)/π_k^α(ζ)] · Ñ(ζ)/(ζ − z) dζ = [1/(2πι)] ∮_Γ [π_k^α(z)/π_k^α(ζ)] · N(ζ) π_k^β(ζ)/(ζ − z) dζ. (2.28)

The proof is completed after dividing (2.28) by π_k^β(z).

In the next two subsections, we apply Lemma 2.18 to the resolvent function H(λ, z) = 1/(z − λ) with respect to the variables z and λ, respectively. Note that the interpolation nodes and poles of these two cases are opposite. The key step in the two proofs is the interchange of the problem's resolvent function H(λ, z) and the resolvent function in the Cauchy integral. Therefore, if other problems do not have a resolvent function term inside, we may not expect to acquire an explicit error formula.

With respect to z. To make use of Lemma 2.18, we substitute h(z) ∈ Q_{n-1,n}(z) and h̃(z) ∈ Q_{2l-1,2l}(z) into N(z) and Q_{k-1,k}(z), respectively. For simplicity, we assume k_b = k_c = l and all of the shifts t_i, s_i (from ϕ(λ), ψ(λ)) are finite. Suppose there exists a region Σ satisfying eig(A) ∪ {λ_i} ⊆ Σ and {s_i, t_i} ⊆ Σ^-.
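The contour-integral remainder of Lemma 2.17 (to which Lemma 2.18 reduces when all poles are at infinity) is easy to verify numerically. The sketch below is purely illustrative: the analytic function, the nodes and the circular contour are arbitrary choices, not data from the paper. It compares the directly computed Lagrange remainder with a trapezoid discretization of the Cauchy integral, which converges geometrically for this periodic analytic integrand.

```python
import numpy as np

M = np.exp                            # analytic target function (hypothetical)
alpha = np.array([-0.5, 0.0, 0.5])    # interpolation nodes inside the disk |z| < 2
z = 0.2                               # evaluation point inside the contour

def lagrange(x):
    # Lagrange interpolating polynomial of M on the nodes alpha
    return sum(M(a) * np.prod([(x - nj) / (a - nj) for nj in alpha if nj != a])
               for a in alpha)

pi_a = lambda x: np.prod(x - alpha)   # nodal polynomial pi_k^alpha

# direct remainder M(z) - P_{k-1}(z)
rem = M(z) - lagrange(z)

# Hermite integral: (1/(2*pi*i)) oint pi(z)/pi(zeta) * M(zeta)/(zeta - z) dzeta
theta = np.linspace(0.0, 2 * np.pi, 801)[:-1]   # equispaced rule on |zeta| = 2
zeta = 2.0 * np.exp(1j * theta)
dzeta = 2.0j * np.exp(1j * theta)               # d zeta / d theta
integrand = pi_a(z) / np.array([pi_a(w) for w in zeta]) * M(zeta) / (zeta - z) * dzeta
integral = integrand.mean() / 1j                # (1/(2 pi i)) * (2 pi / N) * sum

print(abs(rem - integral))
```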
Now, we know h̃(z) is a rational interpolating polynomial of h(z) with poles λ_i (doubled) and interpolating nodes t_i, s_i. Thus, we substitute G_k^{α,β}(z) = 1/G_two(z) into Lemma 2.18, where G_two(z) is from (2.26). It is easy to check that h(z) is analytic on Σ^-. By Lemma 2.18, for z ∈ Σ^-, it holds that

h(z) − h̃(z) = [1/(2πι)] ∮_{Γ^-} [G_k^{α,β}(z)/G_k^{α,β}(ζ)] · h(ζ)/(ζ − z) dζ
= [1/(2πι)] ∮_{Γ^-} [G_two(ζ)/G_two(z)] · c^H (ζI − A)^{-1} b/(ζ − z) dζ
= [1/G_two(z)] · [1/(2πι)] c^H [∮_{Γ^-} G_two(ζ)/(ζ − z) (ζI − A)^{-1} dζ] b
= [1/G_two(z)] · [1/(2πι)] c^H [∮_{Γ} G_two(ζ)/(z − ζ) (ζI − A)^{-1} dζ] b
= [1/G_two(z)] c^H G_two(A)(zI − A)^{-1} b.

The last equality holds because G_two(ζ)/(z − ζ), as a function of ζ, is analytic on Σ ⊇ eig(A) ∪ {λ_i}.

With respect to λ. This explanation is closely related to the first explanation, which uses the Gauss quadrature. We substitute Ĥ(λ) = ϕ(λ)ψ(λ)/(z − λ) into Ñ(z) in the proof of Lemma 2.18; now the involved variable is λ. For simplicity, we assume there exists a set Σ satisfying eig(A) ∪ {λ_i} ⊆ Σ and {s_i, t_i} ⊆ Σ^-. To have a simple interpolating polynomial expression, we assume all of the {λ_i} are distinct.

Let P̃_{2l-1}(λ) ∈ P_{2l-1}(λ) be the Hermite interpolating polynomial of Ĥ(λ) on the interpolating nodes λ_i (doubled):

Ĥ(λ_i) = P̃_{2l-1}(λ_i), Ĥ′(λ_i) = P̃′_{2l-1}(λ_i), i = 1, 2, …, l.

Its explicit expression is well known (e.g. [13, Theorem 3.9]):

P̃_{2l-1}(λ) = Σ_{i=1}^{l} Ĥ(λ_i) {[L̂_i(λ)]^2 [1 − 2(λ − λ_i) L̂′_i(λ_i)]} + Σ_{i=1}^{l} Ĥ′(λ_i) [(λ − λ_i)(L̂_i(λ))^2], (2.29)

where L̂_i(λ) is the Lagrange basis polynomial (2.25).
By Lemma 2.17, we obtain

Ĥ(λ) − P̃_{2l-1}(λ) = [1/(2πι)] ∮_Γ {[Λ(λ)]^2/[Λ(ζ)]^2} · Ĥ(ζ)/(ζ − λ) dζ
= {[Λ(λ)]^2/(2πι)} ∮_Γ [1/[Λ(ζ)]^2] · ϕ(ζ)ψ(ζ)/[(z − ζ)(ζ − λ)] dζ
= {[Λ(λ)]^2/(2πι)} ∮_{Γ^-} {ϕ(ζ)ψ(ζ)/([Λ(ζ)]^2 (ζ − λ))} · 1/(ζ − z) dζ
= {[Λ(λ)]^2/[Λ(z)]^2} · ϕ(z)ψ(z)/(z − λ). (2.30)

The Cauchy integral formula is used at the last equality, since the function in braces, as a function of ζ, is analytic on Σ^-. This remainder formula is a scalar identity, which can also be proved in other ways. Actually, we have

Ĥ(λ) − P̃_{2l-1}(λ) = Ĥ[λ_1, λ_1, λ_2, λ_2, …, λ_l, λ_l, λ][Λ(λ)]^2, (2.31)

where Ĥ[λ_1, λ_1, λ_2, λ_2, …, λ_l, λ_l, λ] is discussed in (2.24). To obtain the expression of e(z), we now have two approaches: 1. Directly apply the linear functional with a weighted term (2.21) to (2.31). 2. Divide (2.31) by ϕ(λ)ψ(λ), and then use the linear functional (2.10).

Finally, we answer the question: How does c^H ψ(A)^{-1} P̃_{2l-1}(A) ϕ(A)^{-1} b become h̃(z) = c^H V̂ (zI − Ŵ^H A V̂)^{-1} Ŵ^H b? By Proposition 2.14, we obtain c^H ψ(A)^{-1} P̃_{2l-1}(A) ϕ(A)^{-1} b = c^H V̂ P̃_{2l-1}(Ŵ^H A V̂) Ŵ^H b. In (2.29), the term [(λ − λ_i)(L̂_i(λ))^2] obviously has the factor Λ(λ) in its numerator. By the Cayley-Hamilton theorem, these terms become zero when we put T̂ = Ŵ^H A V̂ inside. Thus, we obtain P̃_{2l-1}(T̂) = P̂_{l-1}(T̂), where P̂_{l-1}(λ) only needs to satisfy the interpolation condition Ĥ(λ_i) = P̂_{l-1}(λ_i), i = 1, 2, …, l. Actually, we have discussed this P̂_{l-1}(λ) in Remark 2.16.

In conclusion, the relation between the third explanation (this subsection) and the first explanation (Section 2.2.3) is the same as the one between interpolation polynomial theory and Gauss quadrature theory; the latter is well known.

The case E ≠ I. We discuss how h(z) = c^H (zE − A)^{-1} b is approximated by h̃(z) = c^H V (z W^H E V − W^H A V)^{-1} W^H b.
Obviously, we have h(z) = c^H [zI − (E^{-1}A)]^{-1} (E^{-1}b). Substituting it into Theorem 2.1, we obtain the error formula of the descriptor model:

h̃(z) = c^H V (z W̃^H V − W̃^H E^{-1}A V)^{-1} W̃^H (E^{-1}b)
= c^H V [z (E^{-H} W̃)^H E V − (E^{-H} W̃)^H A V]^{-1} (E^{-H} W̃)^H b
= c^H V (z W^H E V − W^H A V)^{-1} W^H b,

where V and W̃ are set up by (E^{-1}A, E^{-1}b, c) in Theorem 2.1. Actually, we use W = E^{-H} W̃ instead of W̃. Theorem 2.19.
Let span(V) = CK(E^{-1}A, E^{-1}b, k_b, m_b) and span(W) = CK((AE^{-1})^H, E^{-H}c, k_c, m_c) with k_b + m_b = k_c + m_c =: l. Suppose that the Lanczos biorthogonalization procedure of A, [∏_{j=2}^{k_b} (A − s_j E)^{-1} E](A − s_1 E)^{-1} b and [∏_{j=2}^{k_c} (A − t_j E)^{-H} E^H](A − t_1 E)^{-H} c does not break down until the (l + 1)th iteration. With notations (2.2) and (2.3), we have

e(z) = h(z) − h̃(z) = c^H (zE − A)^{-1} b − c^H V (z W^H E V − W^H A V)^{-1} W^H b
= [1/(g_b(z) g_c(z))] c^H g_c(E^{-1}A)(zI − E^{-1}A)^{-1} g_b(E^{-1}A) E^{-1} b
= [1/(g_b(z) g_c(z))] c^H g_c(E^{-1}A) g_b(E^{-1}A)(zE − A)^{-1} b, (2.32)

where Λ(λ) is the monic characteristic polynomial of (W^H E V)^{-1} W^H A V.
3. One-sided Galerkin projection.
Obviously, we can use known methods to approximate H(A)b. It is a function of matrices, which is studied in [38, Chapter 13] and [36, 37, 39]. After pre-multiplying it by c^H, we shall obtain an approximation of the transfer function. The basis is obtained by

CK(E^{-1}A, E^{-1}b, k, m) := RK(E^{-1}A, E^{-1}b, S, k) ∪ Krylov(E^{-1}A, E^{-1}b, m),
span(V_l) = CK(E^{-1}A, E^{-1}b, k, m) = Krylov(E^{-1}A, ϕ_k(E^{-1}A)^{-1} E^{-1} b, l),
V^H V = I, ϕ_k(λ) = ∏_{j=1}^{k} (λ − s_j), k + m = 2l. (3.1)

By summarizing [9, (2.1)] and [19, (2.2)], we obtain the error of the one-sided projection method. Theorem 3.1.
Suppose the dimension of
CK(E^{-1}A, E^{-1}b, k, m) is l. With the notations of (3.1), we have

h(z) − h̆(z) = c^H (zE − A)^{-1} b − c^H V (z V^H E V − V^H A V)^{-1} V^H b = [1/G_one(z)] c^H G_one(E^{-1}A)(zE − A)^{-1} b,

where

G_one(λ) := V_one(λ)/ϕ(λ), (3.2)

with V_one(λ) = ∏_{i=1}^{l} (λ − λ_i) being the monic characteristic polynomial of (V^H E V)^{-1} V^H A V.

We observe that G_one(λ) has a form similar to (2.26) in the two-sided projection. The main difference is that the order of the reduced system by the two-sided projection is l, while the one-sided projection needs 2l. In comparison with numerical quadrature, this is quite comprehensible: to acquire a higher degree of accuracy, one approach is to use Gauss quadrature (the two-sided projection method, cf. Section 2.2.3), while another approach is to use a higher-degree interpolating polynomial (the one-sided projection method).
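The one-sided formula admits the same kind of numerical sanity check as the two-sided one. The sketch below is a hypothetical toy setup, assuming E = I and a purely rational space (m = 0, k = l, distinct shifts), so that G_one(λ) = ∏_{i=1}^{l}(λ − λ_i)/∏_{j=1}^{l}(λ − s_j); none of the data comes from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, l = 8, 3
# toy real data (hypothetical)
A = np.diag(np.arange(1.0, n + 1)) + 0.1 * rng.standard_normal((n, n))
b = rng.standard_normal(n)
c = rng.standard_normal(n)
I = np.eye(n)

s = np.array([10.0, 12.0, 15.0])   # shifts of phi (k = l, m = 0)
V = np.linalg.qr(np.column_stack(
        [np.linalg.solve(A - sj * I, b) for sj in s]))[0]

T = V.T @ A @ V                    # one-sided Galerkin: W = V, V^H V = I
lam = np.linalg.eigvals(T)         # Ritz values, roots of the char. polynomial

z = 2.0j
e_direct = c @ np.linalg.solve(z * I - A, b) \
         - c @ V @ np.linalg.solve(z * np.eye(l) - T, V.T @ b)

# G_one(lambda) = prod(lambda - lam_i) / prod(lambda - s_j)
G_z = np.prod(z - lam) / np.prod(z - s)
GA = np.eye(n, dtype=complex)
for li in lam:
    GA = GA @ (A - li * np.eye(n))
for sj in s:
    GA = GA @ np.linalg.inv(A - sj * np.eye(n, dtype=complex))
e_formula = (1.0 / G_z) * (c @ np.linalg.solve(z * I - A, GA @ b))

print(abs(e_direct - e_formula) / abs(e_direct))
```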
4. Interpolatory H∞ norm MOR. In this section, we suppose A is c-stable, i.e., eig(A) lie in the open left half plane. The norm ‖·‖_{H∞} is defined on the H∞ space; this space requires the matrix function to be analytic and bounded in the open right half plane. With h(z) = c^H (zI − A)^{-1} b, we have ‖h‖_{H∞} = sup_{z ∈ iR ∪ {∞}} |c^H (zI − A)^{-1} b|. MOR in the H∞ norm sense is to solve

‖h(z) − ĥ_*(z)‖_{H∞} = min_{dim(ĥ) = l} ‖h(z) − ĥ(z)‖_{H∞}, (4.1)

where

h(z) = c^H (zE − A)^{-1} b, ĥ(z) = c_l^H (zE_l − A_l)^{-1} b_l + d_l. (4.2)

The H∞ norm MOR is clearly described in [21, Section 1]. More theories and algorithms can be found in [32, 40] and the references therein. Note that balanced truncation methods [51, 52] provide an H∞ norm estimation of the model error by using the Hankel singular values [1, Chapter 7]. We start to discuss the interpolatory H∞ norm MOR. We require ĥ(z) in (4.1) to be h̃(z), an interpolatory transfer function: the reduced transfer function h̃(z) is a rational interpolating polynomial of h(z), i.e., it can be formed by a rational Krylov subspace projection method. For simplicity, we set d_l = 0. The rationale is as follows. If E is singular, then d_l in (4.2) cannot be taken as zero. With the assumption det(E) ≠ 0 and det(E_l) ≠ 0, we observe that lim_{z→∞} |h(z)| = 0 and lim_{z→∞} |h̃(z)| = 0; thus, we get lim_{l→∞} |d_l| = 0. When we seek the optimal shifts for h̃(z), we obviously require the shifts to fully influence the error e(z). Thus, we set k_b = k_c = l.
In conclusion, we shall discuss the following problem:

min_{s_1,…,s_l, t_1,…,t_l} sup_{z ∈ iR ∪ {∞}} |e(z)| = min_{s_1,…,s_l, t_1,…,t_l} sup_{z ∈ iR ∪ {∞}} |h(z) − h̃(z)|
= min_{s_1,…,s_l, t_1,…,t_l} sup_{z ∈ iR ∪ {∞}} |1/G(z)| · |c^H G(E^{-1}A)(zE − A)^{-1} b|, (4.3)

where G(z) is either G_two(z) in (2.26) or G_one(z) in (3.2). Lemma 4.1. ([36, Theorem 4.9])
Let f be analytic in a neighborhood of W(A), and let Σ ⊇ W(A). There holds ‖f(A)‖ ≤ C‖f‖_Σ with a constant C ≤ 11.08.

For W(E^{-1}A) ⊆ Σ, it is easy to check that |c^H G(E^{-1}A)(zE − A)^{-1} b| ≤ D̃ ‖G(E^{-1}A)‖ ≤ D ‖G‖_Σ = D sup_{λ∈Σ} |G(λ)|. Thus, problem (4.3) is approximately solved by

min_{s_1,…,s_l, t_1,…,t_l} [ sup_{λ∈Σ} |G(λ)| / inf_{z ∈ iR ∪ {∞}} |G(z)| ]. (4.4)

The problem (4.4) is closely related to the generalized Zolotarev problem [37, (17)], which can be solved by logarithmic potential theory [66, Chapter 12]. Güttel's PhD thesis work [36, 37] clearly states how to handle this type of problem in a more general setting; we give a quite short description here. We abbreviate H(λ, z) = 1/(z − λ) to H(λ). Thus, h(z) = c^H H(A) b. We use the special bases Ŵ, V̂ from Section 2.2; thus, h̃(z) = c^H V̂ H(Ŵ^H A V̂) Ŵ^H b. We discuss the function H(λ) on the region Σ ⊃ [W(A) ∪ W(Ŵ^H A V̂)]. The counterpart of [36, Lemma 4.6] [37, Lemma 3.1] is Proposition 2.14. Thus, similarly to [36, Theorem 4.10] [37, Corollary 3.4], we obtain:
Proposition 4.2. (Near-optimality)
Suppose relations (2.16) hold. Suppose that H(λ) = 1/(z − λ) is analytic in a neighborhood of a compact set Σ ⊇ [W(A) ∪ W(Ŵ^H A V̂)]. Then the approximation c^H V̂ (zI − Ŵ^H A V̂)^{-1} Ŵ^H b satisfies

|c^H (zI − A)^{-1} b − c^H V̂ (zI − Ŵ^H A V̂)^{-1} Ŵ^H b| ≤ C (1 + ‖V̂‖ ‖Ŵ‖) ‖b‖ ‖c‖ min_{Q(λ) ∈ P_{2l-1}(λ)/[ϕ(λ)ψ(λ)]} ‖H(λ) − Q(λ)‖_Σ, (4.5)

with a constant C ≤ 11.08.
Proof. By Proposition 2.14, we have c^H Q(A) b = c^H V̂ Q(Ŵ^H A V̂) Ŵ^H b for Q(λ) ∈ P_{2l-1}(λ)/[ϕ(λ)ψ(λ)]. Then

|c^H (zI − A)^{-1} b − c^H V̂ (zI − Ŵ^H A V̂)^{-1} Ŵ^H b| = |c^H H(A) b − c^H V̂ H(Ŵ^H A V̂) Ŵ^H b|
= |c^H H(A) b − c^H Q(A) b − [c^H V̂ H(Ŵ^H A V̂) Ŵ^H b − c^H V̂ Q(Ŵ^H A V̂) Ŵ^H b]|
≤ ‖b‖ ‖c‖ (‖H(A) − Q(A)‖ + ‖V̂‖ ‖H(Ŵ^H A V̂) − Q(Ŵ^H A V̂)‖ ‖Ŵ‖)
≤ C ‖b‖ ‖c‖ (‖H(λ) − Q(λ)‖_{W(A)} + ‖V̂‖ ‖H(λ) − Q(λ)‖_{W(Ŵ^H A V̂)} ‖Ŵ‖)
≤ C (1 + ‖V̂‖ ‖Ŵ‖) ‖b‖ ‖c‖ ‖H(λ) − Q(λ)‖_Σ.

Letting the numerator of Q(λ) be arbitrary, we obtain the minimization.

Based on an estimation of the Hermite-Walsh formula [36, Page 33], by using some special shifts, a bound on min_{Q(λ) ∈ P_{2l-1}(λ)/[ϕ(λ)ψ(λ)]} ‖H(λ) − Q(λ)‖_Σ can be obtained; see [36, (7.5)] [37, (18)] [43, Theorem 6 (Walsh)]. The obtained shifts are asymptotically optimal and are used for forming rational Krylov subspaces. Practical approaches to compute the shifts are discussed in [36, Page 95] [37, Page 12]. Our problem involves the resolvent function H(λ) = 1/(z − λ) and a parameter z; some uniform results are summarized in [37, Section 4.1]. This theory is for the approximation of a general function, and is based on an estimation of the Walsh-Hermite formula. However, for our problem, the Cauchy integral in the Walsh-Hermite formula is computed successfully: the term is expressed explicitly when the involved function is the resolvent function (cf. (2.30) and (2.31)). This explicit error formula makes our problem easier.

Approximations of e(z). The full computation of e(z) is expensive, so we need some approximations: Remark 4.3.
We have the following approximations for |e(z)|. With

e(z) = h(z) − h̃(z) = [1/G(z)] c^H G(E^{-1}A)(zE − A)^{-1} b,

Approximation 1: |e(z)| ≈ C/|G(z)|,
Approximation 2: |e(z)| ≤ [1/|G(z)|] ‖c^H G(E^{-1}A)‖ ‖(zE − A)^{-1} b‖ ≈ [C/|G(z)|] ‖(z W^H E V − W^H A V)^{-1} W^H b‖,
Approximation 3: |e(z)| ≤ [1/|G(z)|] ‖c^H‖ ‖G(E^{-1}A)‖ ‖(zE − A)^{-1} b‖ ≈ [C̃/|G(z)|] ‖c^H‖ ‖(zE − A)^{-1} b‖ ≈ [C/|G(z)|] |c^H V (z W^H E V − W^H A V)^{-1} W^H b|.

The term G(z) is a scalar formula, which implies that it can be computed easily. The algorithm in [19] actually utilizes Approximation 1 (cf. Algorithm 1). If all of the Ritz values λ_i are real, then the constant C is closely related to the interpolation remainder, which is expressed by high order derivatives and the unknown ξ. The remainders (2.23) and (2.31) are expressed by divided differences; after turning them into high order derivatives by (2.12), we shall observe the coefficient C. Besides 1/G(z), there is still z left in e(z), which implies that Approximation 1 can be improved. Note that the obtained s_i and λ_i are independent of z. To approximate c^H G(E^{-1}A)(zE − A)^{-1} b, we work out Approximations 2 and 3, which can be computed in the reduced problems. To solve problem (4.3), we use an approach different from (4.4) with the logarithmic potential theory: we actually ignore the term ‖G(E^{-1}A)‖ and pay more attention to ‖(zE − A)^{-1} b‖. The former is independent
Adaptive Rational Krylov Subspace Method (
ARKSM ) [19]
Input: A, E ∈ C^{n×n}, B ∈ C^{n×m}, C ∈ C^{p×n} and D ∈ C^{p×m};
Input: s_min, s_max, l_max.
s_1 = s_min, v_1 = (A − s_1 E)^{-1} b; s_2 = s_max, v_2 = (A − s_2 E)^{-1} b; V = [v_1, v_2]; V = orth(V);
for l = 2, …, l_max do
  Update V^H A V, V^H E V, V^H b;
  Get Ritz values λ_i ∈ eig((V^H E V)^{-1} V^H A V);
  Determine ∂Ξ = boundary of the convex hull of {−λ_1, …, −λ_l, s_min, s_max};
  Choose the next shift s_{l+1}:

  s_{l+1} = arg max_{z ∈ ∂Ξ} | ∏_{j=1}^{l} (z − s_j) / ∏_{j=1}^{l} (z − λ_j) |; (5.1)

  v_{l+1} = (A − s_{l+1} E)^{-1} b; V = orth([V, v_{l+1}]);
end for
Output: V; shifts s_l (l = 1, 2, …, l_max); h̆(z) = c^H V (z V^H E V − V^H A V)^{-1} V^H b ≈ h(z).

of z. When the previous shifts are known, the Ritz values are also fixed. We directly consider ‖G(E^{-1}A)‖ as a constant. Meanwhile, in logarithmic potential theory, ‖G(E^{-1}A)‖ ≈ D sup_{λ∈Σ} |G(λ)| is used. In our algorithms, we also consider the influence of ‖(zE − A)^{-1} b‖. In testing examples, we observe that algorithms based on Approximations 2 and 3 behave better than the one based on Approximation 1. In conclusion, the greedy algorithm is used here: the next shift is obtained by choosing the maximal point of the error. Moreover, we claim that these obtained shifts are distinct, because any obtained shift satisfies e(s_j) = 0, which implies it cannot be the maximal point of the error. This also implies that we only need to orthogonalize [V, (A − s_i E)^{-1} b] instead of orthogonalizing [V, (A − s_i E)^{-1} v], where v is the last vector of V.
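Once the used shifts and the current Ritz values are known, the selection rule (5.1) is a one-line discrete maximization. A minimal sketch, with hypothetical shifts, Ritz values and a discretized candidate set standing in for ∂Ξ:

```python
import numpy as np

def next_shift(shifts, ritz, candidates):
    """Greedy ARKSM-style selection: maximize |prod(z - s_j)| / |prod(z - lam_j)|
    over a discrete candidate set (a sketch of rule (5.1))."""
    vals = [abs(np.prod(z - np.asarray(shifts))) /
            abs(np.prod(z - np.asarray(ritz))) for z in candidates]
    return candidates[int(np.argmax(vals))]

shifts = [1.0, 5.0]                   # shifts already used (hypothetical)
ritz = [-2.0, -4.0]                   # current Ritz values (hypothetical)
cand = np.linspace(0.5, 10.0, 200)    # discretized boundary of the mirror region
s_next = next_shift(shifts, ritz, cand)
print(s_next)
```

The objective vanishes at every previously used shift, which is the discrete counterpart of the observation above that e(s_j) = 0 keeps the selected shifts distinct.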
5. Greedy algorithms for MOR.
Based on different error estimations, there exist many adaptive algorithms for MOR [5, 6, 22, 23, 55]. Our algorithm uses the obtained error formula by modifying the algorithm in [19].
ARKSM. In paper [19], the shifts are required to be real numbers or conjugate pairs. For simplicity, we do not impose this constraint (see Algorithm 1). The first two shifts are acquired by

s_min = eigs(-A,E,1,'sm'), s_max = eigs(-A,E,1,'lm'). (5.2)

Other shifts are selected on the boundary of Ξ, where Ξ is a mirror region of the Ritz values. If s_i = −λ_i, then we shall get H_2 norm optimal MOR [34]. By the maximum principle on Ξ, the shifts are picked on the boundary ∂Ξ. Another explanation can be found in [37, Section 4.1].
Two-sided algorithms.
Our greedy two-sided algorithm is Algorithm 2. Since E^{-1}A is c-stable, the shifts are selected in the right half plane to make (A − s_i E) nonsingular. The boundary of the right half plane is the imaginary axis. Moreover, we are studying H∞ norm MOR, which is defined on the imaginary axis. Thus, we shall choose the shifts on the imaginary axis; choosing shifts on the imaginary axis is not new (e.g. [22, 23]). In practice, our shifts are selected from Z(α, β, k):

Z(α, β, k) = [−ι × logspace(α, β, k), 0, ι × logspace(α, β, k)]. (5.4)
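A small helper reproducing the candidate set (5.4) with NumPy. The origin entry in the middle is our reading of the formula (the extracted text leaves it ambiguous, though it is consistent with the 2k + 1 solves counted for AAA later); the parameters below are arbitrary.

```python
import numpy as np

def shift_candidates(alpha, beta, k):
    """Candidate set (5.4): logarithmically spaced points on the imaginary
    axis, mirrored about the origin, plus the origin itself."""
    g = np.logspace(alpha, beta, k)   # same semantics as MATLAB logspace
    return np.concatenate([-1j * g, [0.0], 1j * g])

Z = shift_candidates(-1, 3, 5)   # hypothetical parameters
print(Z)
```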
Algorithm 2
Two-sided greedy rational Krylov subspace method for MOR
Input: A, E ∈ C^{n×n}, B ∈ C^{n×m}, C ∈ C^{p×n} and D ∈ C^{p×m};
Input: a, b, k_two, s_max, l_max.
s_1 = |s_max| ι/2, v_1 = (A − s_1 E)^{-1} b; t_1 = −|s_max| ι/2, w_1 = (A − t_1 E)^{-H} c;
V = orth(v_1); W = orth(w_1);
Determine the set for choosing shifts: Z = Z(a, b, k_two) from (5.4);
for l = 1, …, l_max do
  Update W^H A V, W^H E V, W^H b, c^H V;
  Get Ritz values λ_i ∈ eig((W^H E V)^{-1} W^H A V); Λ(z) := ∏_{i=1}^{l} (z − λ_i);
  Set new symbols: ϕ_l(z) := ∏_{i=1}^{l} (z − s_i); ψ_l(z) := ∏_{i=1}^{l} (z − t_i);
  Choose the next shift s_{l+1}:

  Option 1: s_{l+1} = arg max_{z∈Z} |ϕ_l(z) ψ_l(z)/[Λ(z)]^2|,
  Option 2: s_{l+1} = arg max_{z∈Z} |ϕ_l(z) ψ_l(z)/[Λ(z)]^2| · ‖(z W^H E V − W^H A V)^{-1} W^H b‖,
  Option 3: s_{l+1} = arg max_{z∈Z} |ϕ_l(z) ψ_l(z)/[Λ(z)]^2| · |c^H V (z W^H E V − W^H A V)^{-1} W^H b|; (5.3)

  Choose the next shift t_{l+1}: t_{l+1} = conj(s_{l+1});
  v_{l+1} = (A − s_{l+1} E)^{-1} b; w_{l+1} = (A − t_{l+1} E)^{-H} c;
  V = orth([V, v_{l+1}]); W = orth([W, w_{l+1}]);
end for
Output: V, W, h̃(z) = c^H V (z W^H E V − W^H A V)^{-1} W^H b ≈ h(z).

For the two-sided algorithm, G(z) is G_two(z) in (2.26). At the l-th iteration, the shifts s_j, t_j (j = 1, 2, …, l) are computed; then the Ritz values are also acquired. It is important that the variable z in e(z) is actually independent of the obtained shifts and Ritz values. Thus, the standard greedy algorithm is quite appropriate here. In (5.3), we provide three options for the next shift, which correspond to the approximations of e(z) in Remark 4.3. Numerical testing shows that Options 2 and 3 behave better than Option 1. The computation of (5.3) is not expensive: the computation of ϕ_l(z), ψ_l(z) and Λ(z) only involves scalar operations.
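The objectives in (5.3) only involve scalar products and an l × l solve, so sweeping a candidate set is cheap. A sketch with hypothetical reduced data, returning the Option 1 objective when no reduced matrices are passed and the Option 2 objective otherwise:

```python
import numpy as np

def option_objectives(z, s, t, lam, WEV=None, WAV=None, Wb=None):
    """Objectives in (5.3) at a candidate point z (illustrative sketch).
    Option 1 uses only the scalar factor |phi(z) psi(z) / Lambda(z)^2|;
    Option 2 additionally weights by the norm of the reduced solve."""
    scalar = abs(np.prod(z - s) * np.prod(z - t) / np.prod(z - lam) ** 2)
    if WEV is None:
        return scalar                              # Option 1
    y = np.linalg.solve(z * WEV - WAV, Wb)         # small reduced system
    return scalar * np.linalg.norm(y)              # Option 2

# toy reduced data (hypothetical): l = 2
s = np.array([1.0 + 1j, 2.0 - 1j])
t = np.conj(s)
lam = np.array([-1.0, -3.0])
WEV, WAV, Wb = np.eye(2), np.diag([-1.0, -3.0]), np.ones(2)

cand = 1j * np.logspace(-1, 2, 50)   # candidates on the imaginary axis
best = max(cand, key=lambda z: option_objectives(z, s, t, lam, WEV, WAV, Wb))
print(best)
```

Since only z varies while the reduced pencil stays fixed, in practice one would factor the pencil once before the sweep, which is what the generalized Hessenberg trick mentioned below accomplishes.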
For Options 2 and 3, we need to solve small order linear systems (z W^H E V − W^H A V)^{-1} W^H b. All of the coefficient matrices come from the same pencil, while only z varies in Z. Thus, we are able to use hess(W^H EV, W^H AV) and linsolve(..., opts.UHESS=1) to reduce CPU time. It is a problem how to choose t_{l+1} for the left subspace, since we need two new shifts for adding one order. We make an easy choice by t_{l+1} = conj(s_{l+1}). There are other options, such as

Option 2: t_{l+1} = arg max_{z∈Z} |ϕ_l(z) ψ_l(z)(z − s_{l+1})/[Λ(z)]^2|,
Option 3: t_{l+1} = arg max_{z∈Z} |ϕ_l(z) ψ_l(z)(z − s_{l+1})/[Λ(z)]^2| · ‖(z W^H E V − W^H A V)^{-1} W^H b‖,
Option 4: t_{l+1} = arg max_{z∈Z} |ϕ_l(z) ψ_l(z)(z − s_{l+1})/[Λ(z)]^2| · |c^H V (z W^H E V − W^H A V)^{-1} W^H b|.

Numerical testing does not show that these options behave better than t_{l+1} = conj(s_{l+1}). We estimate the H∞ norm error by the max error:

‖e(z)‖_∞ = max_{z ∈ Z_e} |h(z) − h̃(z)|, (5.5)

where Z_e also has the form (5.4). Our greedy algorithms actually do not need to compute it; it is only used for the comparisons with other algorithms. Computing the max error costs much computation. It is not like the case of matrix equations, whose residual norms can be computed in the reduced problems [47, 48].
6. Numerical experiments.
All experiments are carried out in MATLAB 2016a on a notebook (64-bit) with an Intel CPU i7-5500U and 8GB memory. Any data involving random numbers are fixed by setting rand('state',0) or randn('state',0). Note that eigs uses a random vector as the initial vector; before we use eigs for computing (5.2), we also set rand('state',0). We compare our algorithms with ARKSM (Algorithm 1),
AAA and
IRKA. Algorithm
AAA can be applied to MOR [2, 20, 53]. It solves a min-max problem; thus, it approximately solves the H∞ norm MOR problem. The algorithm gets samples on the imaginary axis [53, Section 6.9]. The greedy idea is also used, and thus the selected samples are nested. In our testing, the optional sample set Z_A has the same form as Z(α, β, k). If Z_A = Z_e, then AAA outputs the error directly; this is actually an attractive advantage of AAA. H_2 norm MOR is accomplished by IRKA [8, 34]. It mainly has two disadvantages: it converges slowly and the optimal shifts are not nested. The advantage is that it has an optimization property in the H_2 norm sense. IRKA actually does not need to compute the model error in either the H_2 or the H∞ norm, since it is automatically optimized in the H_2 norm sense. Here, we still compute its max error (5.5) as a rough standard for the other algorithms. The initial shifts of IRKA are the output shifts of ARKSM. We stop
IRKA if the iteration number is larger than 100 or if the relative change ‖S_j − S_{j-1}‖/‖S_j‖ falls below a tolerance, where S_j is the sorted shift vector at the j-th iteration. In all testings, we use the MATLAB backslash to do the inversion (A − s_i E)^{-1} b. In MATLAB, directly using backslash to compute (A − s_i E)^{-1} b and (A − s_i E)^{-H} c is usually faster than saving the LU decomposition factors of (A − s_i E) and solving triangular linear systems; thus, we use the former in IRKA. For large scale problems, the computation of all of the algorithms concentrates on the linear solves. To get an l order MOR, we count the number of linear solves for every algorithm. AAA needs to "get samples" on the imaginary axis: it computes h(z) for z ∈ Z_A = Z(α_A, β_A, k_A); thus, it needs (2k_A + 1) linear solves. IRKA needs 2 j_max l linear solves, where j_max is the final iteration number when IRKA stops.
ARKSM needs l linear solves, while our two-sided algorithm needs 2l linear solves. Note that ARKSM is a one-sided type algorithm.
Example 1 (small problems):
The testing examples are from the SLICOT benchmark problems [15]. We set b=B(:,1) and c=C(1,:)' if the model problems have multiple inputs and multiple outputs. We set Z_e = Z_A of the form (5.4) for AAA, and use Z of the same form for the two-sided algorithms. The error pictures of two problems are shown in Fig. 6.1. The CPU times of 40 order MOR are listed in Table 6.1. Except for H_2 norm MOR (IRKA), the other algorithms have nested shifts; thus, in Table 6.1, we only compute the H_2 norm MOR for l = 40. The other codes for making Fig. 6.1 and Table 6.1 do not have big differences. Example 2 (large scale problems):
L10000 and
L10648 are from papers [19] and [63] respectively, with E = I. They are acquired from the explicit discretization of partial differential equations. The rest are from the Oberwolfach collection [42]. The other information is summarized in the first part of Table 6.2. We set Z_A = Z_e and Z of the form (5.4). The reported CPU times of AAA include the one of "get samples". Note that the other algorithms actually do not need to "get samples"; we still "get samples" for computing the max error (5.5), so that all of the algorithms can be compared in the same sense.
Table 6.1: 40 order MOR of small problems. The test matrices are beam, CDplayer, eady, fom, iss and random, of sizes 348, 120, 598, 1006, 270 and 200, respectively. For each problem the table reports the CPU times (in seconds) of Get Samples, AAA, ARKSM, two-sided (O1), two-sided (O2) and H_2 (IRKA), followed by the max errors ‖e(z)‖_∞ of AAA, ARKSM, two-sided (O1), two-sided (O2) and H_2 (IRKA). "Get Samples" accounts for the CPU times of computing h(z) for z ∈ Z = Z_e. As stated in Fig. 6.1, the data of AAA only involves l = 29 for fom and l = 31 for random. Here, we do not compute H_2 norm MOR (IRKA) for l < 40. The initial shifts of IRKA are from the output shifts of Algorithm ARKSM. "H_2 (IRKA)" accounts for the CPU times of IRKA when it stops.
Fig. 6.1: Behaviors of different MOR for small problems (max error ‖e(z)‖_∞ versus order l for AAA, H_2 (IRKA), ARKSM and two-sided (O2)): (a) eady, (b) random.

Although we input l = 40 into AAA, it stops at l = 31 for random. 2. In Fig. 6.2, the error pictures of the two-sided (O2) algorithm are close to those of H_2 norm MOR and AAA. Since H_2 norm MOR has H_2 norm optimality and AAA's effectiveness has already been tested on many examples [53], we can say that our greedy algorithms for solving problem (4.3) behave well. Meanwhile, our two-sided (O2) algorithm spends much less time than
IRKA and
AAA in large scaleproblems.3. Both H norm MOR and our two-sided algorithm are two-sided type projection method. For 40 order, ransfer function interpolation remainder formula of rational Krylov subspace methods order l m a x e rr o r k e ( z ) k ∞ -15 -10 -5 AAAH (IRKA)ARKSMtwo-sided(O2) (a) flow v0 order l m a x e rr o r k e ( z ) k ∞ -15 -10 -5 AAAH (IRKA)ARKSMtwo-sided(O2) (b) rail20209 they actually have 80 shifts. ARKSM is actually an one-sided type projection method. For 40 order, itonly has 40 shifts. In Section 3, we find the errors of l order two-sided and 2 l order one-sided methodsare quite similar. Thus, we think 20 order two-sided algorithm will have the similar precision like 40order onesided algorithm. Some of error pictures in Fig. 6.1 and 6.2 roughly show this fact, e.g., eady , random and rail20209 . Concretely speaking, the max error of l = 40 in ARKSM approximatelyequals to the one of l = 20 in two-sided algorithm or H ( IRKA ) .
7. Conclusion.
We have the following contributions: 1. We obtain the explicit formula for the model error of rational Krylov subspace methods. It is a conclusive form of the moments matching results, and our result about the error is intrinsic. From the three explanations, we know the error is a Gauss quadrature remainder or a Hermite formula remainder. Whether a quadrature formula has a remainder formula or not is important to the method, and so is whether an interpolation formula has one. Our explicit error formula is helpful for researchers to analyze the MOR error when various rational Krylov subspaces are used. 2. We find that the error is the remainder of Gauss quadrature when the Gauss quadrature is applied to the resolvent function H(λ, z) = 1/(z − λ). This also explains why an l order two-sided projection method has an expression similar to a 2l order one-sided projection method: the l order two-sided projection method has (2l − 1) degree of precision, and so does the 2l order one-sided projection method. 3. We discover that the error is also the remainder when the Hermite formula is applied to the resolvent function with respect to the variables z and λ. 4. We transform the interpolatory H∞ norm MOR into the approximation of a matrix function; thus, the logarithmic potential theory can be used. Since the error of the rational interpolation of the resolvent function can be expressed explicitly, the approach to solving interpolatory H∞ norm MOR should be reconsidered. 5. By using the error formula, we propose a greedy two-sided projection method for interpolatory H∞ norm MOR.

We would like to mention another two points, whose phenomena are known. 1. The error formula is independent of the stability of A; thus, the two-sided projection method cannot maintain the stability of the system. 2. The final error formula is independent of the bases; for numerical stability, researchers prefer orthonormal bases in both the left and right subspaces.

It is a problem how to generalize the error formula to the multi-input-multi-output system. Once concise and explicit expressions of the residuals can be obtained, the error formula of the full interpolation [3,
Table 6.2: 40th order MOR of large scale problems

Matrix     L10000     L10648      flow v0   flow v0.5   rail5177   rail20209
Size       10000      10648       9669      9669        5177       20209
Symmetry   No         No          Yes       No          Yes        Yes
b          ones(n,1)  randn(n,1)  B(:,1)    B(:,1)      B(:,6)     B(:,6)
c          ones(n,1)  randn(n,1)  C(4,:)'   C(1,:)'     C(2,:)'    C(2,:)'

[The remaining rows of Table 6.2 (the parameters α, β; the CPU times of "Get Samples", AAA, ARKSM, two-sided (O1), two-sided (O2), and H∞ (IRKA); the errors ‖e(z)‖∞ of AAA, ARKSM, two-sided (O1), two-sided (O2), and H∞ (IRKA); and the IRKA iteration data) are not recoverable from this copy.]

The vectors b and c of rail20209 are stated in [34, Section 5.3]. By (5.2), 10^α is selected as a lower bound of |s_min|, and 10^β as an upper bound of |s_max|. "Get Samples" records the CPU times of evaluating h(z) for z ∈ Z_A = Z_e. Here, we do not compute the H∞ norm MOR (IRKA) for l < 40. The initial shifts of IRKA are taken from the output shifts of Algorithm ARKSM.
Section 3.3] can be obtained from e(z) = R_C^H (zI − A)^{-1} R_B. [25, Theorem 2.9] describes some properties of the residuals, but a more concise expression is still to be found. Another attempt is to use tangential interpolation. When the left and right tangential directions {c_i}_{i=1}^l and {b_i}_{i=1}^l are fixed by c = c_i and b = b_i (e.g. [3, Theorem 3.3.1, Theorem 3.3.2]), we can easily rewrite the error formula so that it only involves A, Bb and Cc. For different left and right tangential directions, how to write the error formula in a concise expression remains a problem.

Acknowledgments.
The author deeply thanks Valeria Simoncini, Ren-Cang Li, and Shengxin Zhu for their insightful comments.
REFERENCES

[1] A. C. Antoulas, Approximation of Large-Scale Dynamical Systems, vol. 6 of Advances in Design and Control, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2005.
[2] A. C. Antoulas and B. D. O. Anderson, On the scalar rational interpolation problem, IMA Journal of Mathematical Control and Information, 3 (1986), pp. 61–88.
[3] A. C. Antoulas, C. A. Beattie, and S. Gugercin, Interpolatory Methods for Model Reduction, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2020.
[4] Z. Bai and Y. Su, Dimension reduction of large-scale second-order dynamical systems via a second-order Arnoldi method, SIAM J. Sci. Comput., 26 (2005), pp. 1692–1709.
[5] H. Barkouki, A. H. Bentbib, and K. Jbilou, An adaptive rational block Lanczos-type algorithm for model reduction of large scale dynamical systems, J. Sci. Comput., 67 (2016), pp. 221–236.
[6] H. Barkouki, A. H. Bentbib, and K. Jbilou, A matrix rational Lanczos method for model reduction in large-scale first- and second-order dynamical systems, Numer. Linear Algebra Appl., 24 (2017), pp. e2077, 13.
[7] U. Baur, P. Benner, and L. Feng, Model order reduction for linear and nonlinear systems: a system-theoretic perspective, Arch. Comput. Methods Eng., 21 (2014), pp. 331–358.
[8] C. Beattie, Z. Drmač, and S. Gugercin, Revisiting IRKA: connections with pole placement and backward stability, Vietnam J. Math., 48 (2020), pp. 963–985.
[9] B. Beckermann, An error analysis for rational Galerkin projection applied to the Sylvester equation, SIAM J. Numer. Anal., 49 (2011), pp. 2430–2450.
[10] P. Benner, A. Cohen, M. Ohlberger, and K. Willcox, eds., Model Reduction and Approximation: Theory and Algorithms, vol. 15 of Computational Science & Engineering, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2017.
[11] P. Benner, S. Gugercin, and K. Willcox, A survey of projection-based model reduction methods for parametric dynamical systems, SIAM Rev., 57 (2015), pp. 483–531.
[12] M. Berljafa, S. Elsworth, and S. Güttel, A Rational Krylov Toolbox for MATLAB, (2014).
[13] R. L. Burden and J. D. Faires, Numerical Analysis, 9th ed., Prindle, Weber & Schmidt, Boston, MA, 2010.
[14] C. Cartis, N. I. M. Gould, and P. L. Toint, Adaptive cubic regularisation methods for unconstrained optimization. Part I: motivation, convergence and numerical results, Math. Program., 127 (2011), pp. 245–295.
[15] Y. Chahlaoui and P. Van Dooren, Benchmark examples for model reduction of linear time-invariant dynamical systems, in Dimension Reduction of Large-Scale Systems, vol. 45 of Lect. Notes Comput. Sci. Eng., Springer, Berlin, 2005, pp. 379–392.
[16] M. Crouzeix, Numerical range and functional calculus in Hilbert space, J. Funct. Anal., 244 (2007), pp. 668–690.
[17] C. de Villemagne and R. E. Skelton, Model reductions using a projection formulation, Internat. J. Control, 46 (1987), pp. 2141–2169.
[18] K. Deckers, Orthogonal Rational Functions: Quadrature, Recurrence and Rational Krylov, PhD thesis, 2009.
[19] V. Druskin and V. Simoncini, Adaptive rational Krylov subspaces for large-scale dynamical systems, Systems Control Lett., 60 (2011), pp. 546–560.
[20] S.-I. Filip, Y. Nakatsukasa, L. N. Trefethen, and B. Beckermann, Rational minimax approximation via adaptive barycentric representations, SIAM J. Sci. Comput., 40 (2018), pp. A2427–A2455.
[21] G. Flagg, C. A. Beattie, and S. Gugercin, Interpolatory H∞ model reduction, Systems Control Lett., 62 (2013), pp. 567–574.
[22] M. Frangos and I. M. Jaimoukha, Adaptive rational Krylov algorithms for model reduction, in 2007 European Control Conference (ECC), 2007, pp. 4179–4186.
[23] M. Frangos and I. M. Jaimoukha, Adaptive rational interpolation: Arnoldi and Lanczos-like equations, Eur. J. Control, 14 (2008), pp. 342–354.
[24] R. W. Freund and M. Hochbruck, Gauss quadratures associated with the Arnoldi process and the Lanczos algorithm, Springer Netherlands, Dordrecht, 1993, pp. 377–380.
[25] A. Frommer, K. Lund, and D. B. Szyld, Block Krylov subspace methods for functions of matrices II: modified block FOM, SIAM J. Matrix Anal. Appl., 41 (2020), pp. 804–837.
[26] D. Gaier, Lectures on Complex Approximation, vol. 188, Springer, 1987.
[27] K. Gallivan, A. Vandendorpe, and P. Van Dooren, Model reduction via truncation: an interpolation point of view, Linear Algebra Appl., 375 (2003), pp. 115–134.
[28] K. Gallivan, A. Vandendorpe, and P. Van Dooren, Sylvester equations and projection-based model reduction, in Proceedings of the International Conference on Linear Algebra and Arithmetic (Rabat, 2001), vol. 162, 2004, pp. 213–229.
[29] K. Gallivan, A. Vandendorpe, and P. Van Dooren, Model reduction of MIMO systems via tangential interpolation, SIAM J. Matrix Anal. Appl., 26 (2004/05), pp. 328–349.
[30] G. H. Golub and G. Meurant, Matrices, Moments and Quadrature with Applications, Princeton Series in Applied Mathematics, Princeton University Press, Princeton, NJ, 2010.
[31] S. Goossens and D. Roose, Ritz and harmonic Ritz values and the convergence of FOM and GMRES, Numer. Linear Algebra Appl., 6 (1999), pp. 281–293.
[32] K. M. Grigoriadis, Optimal H∞ model reduction via linear matrix inequalities: continuous- and discrete-time cases, Systems Control Lett., 26 (1995), pp. 321–333.
[33] E. Grimme, Krylov Projection Methods for Model Reduction, PhD thesis, University of Illinois, 1997.
[34] S. Gugercin, A. C. Antoulas, and C. Beattie, H2 model reduction for large-scale linear dynamical systems, SIAM J. Matrix Anal. Appl., 30 (2008), pp. 609–638.
[35] S. Güttel, Convergence estimates of Krylov subspace methods for the approximation of matrix functions using tools from potential theory, (2006).
[36] S. Güttel, Rational Krylov Methods for Operator Functions, PhD thesis, TU Bergakademie Freiberg, Germany, 2010.
[37] S. Güttel, Rational Krylov approximation of matrix functions: numerical methods and optimal pole selection, GAMM-Mitt., 36 (2013), pp. 8–31.
[38] N. J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2008.
[39] C. Jagels and L. Reichel, The extended Krylov subspace method and orthogonal Laurent polynomials, Linear Algebra Appl., 431 (2009), pp. 441–458.
[40] D. Kavranoğlu and M. Bettayeb, Characterization of the solution to the optimal H∞ model reduction problem, Systems Control Lett., 20 (1993), pp. 99–107.
[41] L. Knizhnerman, V. Druskin, and M. Zaslavsky, On optimal convergence rate of the rational Krylov subspace reduction for electromagnetic problems in unbounded domains, SIAM J. Numer. Anal., 47 (2009), pp. 953–971.
[42] J. G. Korvink and E. B. Rudnyi, Oberwolfach benchmark collection, in Dimension Reduction of Large-Scale Systems, P. Benner, D. C. Sorensen, and V. Mehrmann, eds., Springer, Berlin, Heidelberg, 2005, pp. 311–315.
[43] E. Levin and E. B. Saff, Potential theoretic tools in polynomial and rational approximation, in Harmonic Analysis and Rational Approximation, vol. 327 of Lect. Notes Control Inf. Sci., Springer, Berlin, 2006, pp. 71–94.
[44] J.-R. Li and J. White, Low rank solution of Lyapunov equations, SIAM J. Matrix Anal. Appl., 24 (2002), pp. 260–280.
[45] R.-C. Li and Z. Bai, Structure-preserving model reduction using a Krylov subspace projection formulation, Commun. Math. Sci., 3 (2005), pp. 179–199.
[46] J. Liesen and Z. Strakoš, Krylov Subspace Methods: Principles and Analysis, Numerical Mathematics and Scientific Computation, Oxford University Press, Oxford, 2013.
[47] Y. Lin and V. Simoncini, Minimal residual methods for large scale Lyapunov equations, Appl. Numer. Math., 72 (2013), pp. 52–71.
[48] Y. Lin and V. Simoncini, A new subspace iteration method for the algebraic Riccati equation, Numer. Linear Algebra Appl., 22 (2015), pp. 26–47.
[49] Y. Lin, X. Wang, and L. Zhang, Solving symmetric and positive definite second-order cone linear complementarity problem by a rational Krylov subspace method, arXiv, (2020).
[50] G. López Lagomasino, L. Reichel, and L. Wunderlich, Matrices, moments, and rational quadrature, Linear Algebra Appl., 429 (2008), pp. 2540–2554.
[51] B. C. Moore, Principal component analysis in linear systems: controllability, observability, and model reduction, IEEE Trans. Automat. Control, 26 (1981), pp. 17–32.
[52] C. T. Mullis and R. A. Roberts, Synthesis of minimum roundoff noise fixed point digital filters, IEEE Trans. Circuits and Systems, CAS-23 (1976), pp. 551–562.
[53] Y. Nakatsukasa, O. Sète, and L. N. Trefethen, The AAA algorithm for rational approximation, SIAM J. Sci. Comput., 40 (2018), pp. A1494–A1522.
[54] C. C. Paige, B. N. Parlett, and H. A. van der Vorst, Approximate solutions and eigenvalue bounds from Krylov subspaces, Numer. Linear Algebra Appl., 2 (1995), pp. 115–133.
[55] H. K. F. Panzer, Model Order Reduction by Krylov Subspace Methods with Global Error Bounds and Automatic Choice of Parameters, PhD thesis, 2014.
[56] S. Pozza and M. S. Pranić, The Gauss quadrature for general linear functionals, Lanczos algorithm, and minimal partial realization, 2019.
[57] S. Pozza, M. S. Pranić, and Z. Strakoš, Gauss quadrature for quasi-definite linear functionals, IMA J. Numer. Anal., 37 (2017), pp. 1468–1495.
[58] S. Pozza, M. S. Pranić, and Z. Strakoš, The Lanczos algorithm and complex Gauss quadrature, Electron. Trans. Numer. Anal., 50 (2018), pp. 1–19.
[59] S. Pozza and Z. Strakoš, Algebraic description of the finite Stieltjes moment problem, Linear Algebra Appl., 561 (2019), pp. 207–227.
[60] Y. Saad, Iterative Methods for Sparse Linear Systems, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, second ed., 2003.
[61] W. H. A. Schilders, H. A. van der Vorst, and J. Rommes, eds., Model Order Reduction: Theory, Research Aspects and Applications, vol. 13 of Mathematics in Industry, Springer-Verlag, Berlin, 2008. European Consortium for Mathematics in Industry (Berlin).
[62] M. Schweitzer, A two-sided short-recurrence extended Krylov subspace method for nonsymmetric matrices and its relation to rational moment matching, Numer. Algorithms, 76 (2017), pp. 1–31.
[63] V. Simoncini, A new iterative method for solving large-scale Lyapunov matrix equations, SIAM J. Sci. Comput., 29 (2007), pp. 1268–1288.
[64] V. Simoncini, Analysis of the rational Krylov subspace projection method for large-scale algebraic Riccati equations, SIAM J. Matrix Anal. Appl., 37 (2016), pp. 1655–1674.
[65] Z. Strakoš, Model reduction using the Vorobyev moment problem, Numer. Algorithms, 51 (2009), pp. 363–379.
[66] L. N. Trefethen, Approximation Theory and Approximation Practice, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2013.
[67] A. Yousuff and R. E. Skelton, Covariance equivalent realizations with application to model reduction of large-scale systems, in Control and Dynamic Systems, Vol. 22, Academic Press, Orlando, FL, 1985, pp. 273–348.
[68]