A RATIONAL EVEN-IRA ALGORITHM FOR THE SOLUTION OF T-EVEN POLYNOMIAL EIGENVALUE PROBLEMS∗

PETER BENNER†, HEIKE FASSBENDER‡, AND PHILIP SALTENBERGER‡

Abstract.
In this work we present a rational Krylov subspace method for solving real large-scale polynomial eigenvalue problems with T-even (that is, symmetric/skew-symmetric) structure. Our method is based on the Even-IRA algorithm [24]. To preserve the structure, a sparse T-even linearization from the class of block minimal bases pencils is applied, see [9]. Due to this linearization, the Krylov basis vectors can be computed in a cheap way. Based on the ideas developed in [3], a rational decomposition is derived so that our method explicitly allows for changes of the shift during the iteration. This leads to a method that is able to compute parts of the spectrum of a T-even matrix polynomial in a fast and reliable way.

Key words. polynomial eigenvalue problem, symmetric/skew-symmetric matrix polynomial, structure-preserving linearization, Krylov subspace method, rational Krylov decomposition
AMS subject classifications.
1. Introduction.
Eigenvalue problems are ubiquitous in engineering, physics, mechanics and many more scientific disciplines. Moreover, they lie at the heart of numerical linear algebra. As eigenproblems stemming from real-world applications are often subject to physical constraints and side conditions, they frequently and naturally inherit structure. For instance, mechanical vibration systems are usually described by symmetric mass, damping and stiffness matrices, see [21]. Optimal control problems often involve Hamiltonian/skew-Hamiltonian matrix pencils [22]. But, after all, which features and properties single out faithful numerical algorithms for structured problems from universal methods? In the first place, the occurrence of structure can be utilized to speed up algorithms and reduce memory requirements. This originates from the deeper focus on the true nature of the problem compared to standard methods. In addition to that, the adequate exploitation of structure is beneficial (and sometimes indispensable) for the reliability of an algorithm. Indeed, a proper numerical treatment of structure will often produce more accurate and physically meaningful results. Consequently, it seems reasonable to design tailor-made algorithms instead of addressing structured problems without any care by standard means. We present an algorithm for real, T-even polynomial eigenvalue problems of large scale that takes into account all the aforementioned aspects. The method we propose is an implicitly-restarted rational Krylov-Schur approach based on the Even-IRA algorithm introduced in [24] (see also [12]). In contrast to the
Even-IRA algorithm and motivated by [3], our approach explicitly allows for changes of the shift parameter during the iteration. This leads to a flexible and adjustable rational Krylov algorithm.

There exist various major applications, including the vibration of gyroscopic systems and control theory, that lead to T-even polynomial eigenproblems of large size, see e.g. [24, 7] and the references therein.

∗Version of September 12, 2020.
†Max-Planck-Institute for Dynamics of Complex Technical Systems Magdeburg ([email protected]).
‡Institute for Numerical Analysis, TU Braunschweig ([email protected], [email protected]).

A matrix polynomial P(λ) is an element
from R^{m×n}[λ], i.e.,

(1.1)  P(λ) = \sum_{k=0}^{ℓ} P_k λ^k = P_ℓ λ^ℓ + P_{ℓ-1} λ^{ℓ-1} + ··· + P_1 λ + P_0

with matrices P_j ∈ R^{m×n}. The degree deg(P) of P(λ) is the largest index j with P_j ≠ 0. Often, we write P(λ) as a matrix with polynomial entries, i.e., as an element from R[λ]^{m×n}. Here, we are mostly interested in square matrix polynomials P(λ) ∈ R[λ]^{n×n} with some particular structure in their matrix coefficients. We call P(λ) ∈ R^{n×n}[λ] as in (1.1) T-even if P_j = P_j^T holds whenever j is even and P_j = -P_j^T holds otherwise. Equivalently, P(λ)^T = P(-λ). Eigenvalue/eigenvector pairs (µ, x) ∈ C × C^n of P(λ) are characterized by the relation P(µ)x = 0. To find eigenvalues of P(λ), it is a common approach to turn P(λ) into a matrix polynomial L_P(λ) = λX + Y of degree one (e.g. the Frobenius companion form, [20]) by linearization. Then, the eigenvalues of P(λ) and L_P(λ) coincide and the generalized eigenproblem corresponding to the linearization L_P(λ) may be solved by, e.g., the standard QZ algorithm, cf. [25]. However, solving a structured (i.e. T-even) eigenvalue problem via the QZ algorithm and the Frobenius companion form is not conducive in the light of the problem's nature and structure.

In particular, the spectrum of T-even matrix polynomials has a Hamiltonian structure, that is, it is symmetric with respect to both the real and the imaginary axis. The algorithm we present takes care of this fact in two different ways. On the one hand, the linearization L_P(λ) = λX + Y of P(λ) we consider is a symmetric/skew-symmetric matrix pencil (i.e. Y = Y^T, X = -X^T). In particular, L_P(λ) itself is T-even and so it naturally preserves the Hamiltonian spectral structure of P(λ).
On the other hand, for any ζ ∈ C outside the spectrum of P(λ), we consider the special shift-and-invert transformation

L_P(ζ) = ζX + Y  ↦  K(ζ) := L_P(ζ)^{-T} X L_P(ζ)^{-1} X

as proposed in [22, 24]. Each eigenvalue pair (+µ, -µ) of L_P(λ) is transformed to only one eigenvalue θ = (µ² - ζ²)^{-1} of K(ζ). Consequently, K(ζ) preserves eigenvalue pairings and the spectral symmetry inherent to the problem is respected.

The foundation of our method is the Even-IRA algorithm from [24]. This method is a sophisticated variant of the Krylov-Schur algorithm (see Stewart [29]) applied to K(ζ) for some appropriately chosen shift parameter ζ and a T-even linearization L_P(λ) for P(λ). Rather than applying a (structure-preserving) symplectic Lanczos process as in [4, 5], our approach is related to the ideas established for the SHIRA algorithm in [22] (see also [3]). To define K(ζ), we take L_P(λ) = λX + Y to be a special linearization from the class of block minimal bases pencils, see [9]. (Matrix polynomials of degree one are often called matrix pencils.) Due to the structure and sparsity of L_P(λ), the computation of matrix-vector products K(ζ)x can be realized implicitly without ever forming K(ζ) at all. Moreover, linear systems with L_P(ζ) and L_P(ζ)^T (that arise in Arnoldi-like processes from matrix-vector products involving K(ζ)) can be solved implicitly through systems involving only P(ζ) and P(-ζ). Accordingly, the complexity of computing K(ζ)x is reduced by a significant amount since the size of P(ζ) is substantially smaller than the size of L_P(ζ). For the same reason, memory requirements (e.g. for storing matrix decompositions) can be decreased. These advantages of L_P(λ) over other linearizations carry over to the
Even-IRA algorithm. However, as the
Even-IRA algorithm does not allow for changes of the shift ζ during the iteration, this feature is incorporated in our method. Based on [3] and [26], our rational Even-IRA algorithm permits shift adjustments during the iteration without discarding the information that has been accumulated so far. Retaining the advantageous computational aspects, this endows our approach with more flexibility. In consequence, the rational
Even-IRA algorithm we present is a new reliable, flexible and fast numerical method with reasonable costs.

This work is structured as follows:
1. The basic definitions regarding matrix polynomials and their eigenvalues are presented in Section 2. We introduce the concept of linearization and show how a T-even linearization can be constructed.
2. In Section 3, we briefly review the Even-IRA algorithm from [24]. It is the basis of our rational method to compute eigenvalues of T-even matrix polynomials in a structure-preserving way.
3. We show in Section 4 how the matrix-vector multiplications involved in the Even-IRA algorithm can be carried out in a very efficient and implicit way. This is possible without forming the corresponding large-scale matrix at all.
4. Section 5 is dedicated to the rational Arnoldi decomposition. We show how a rational decomposition can be invoked for the Even-IRA algorithm and how it is applied in a useful fashion for our purpose.
5. We introduce the rational Even-IRA algorithm in Section 6. We discuss the Krylov-Schur restart procedure in detail and also address the issue of infinite eigenvalues to guarantee a stable convergence of the algorithm.
6. Some numerical examples are given in Section 7. We also illustrate how the shift strategy influences the algorithm's success.
Some conclusions are given in Section 8.
2. Definitions of matrix polynomials and notation.
Recall that a matrix polynomial P(λ) ∈ R[λ]^{n×n} as in (1.1) is said to be T-even if

(2.1)  P(λ)^T = \sum_{k=0}^{ℓ} λ^k P_k^T = P(-λ)

holds. In turn, T-odd matrix polynomials are characterized by P(λ)^T = -P(-λ). As mentioned in Section 1, it is easily seen that P(λ) in (2.1) is T-even if and only if P_k^T = P_k holds for all matrix coefficients P_k with even index k ≥ 0 while P_k^T = -P_k holds whenever the index k is odd. The converse is true for T-odd matrix polynomials. Both structures have already been analyzed in [16, Sec. 6]. The classes of regular, singular and unimodular matrix polynomials are defined as follows:
1. A matrix polynomial P(λ) ∈ R[λ]^{n×n} is called regular if det(P(λ)) is not the zero polynomial and singular otherwise (notice that det(P(λ)) ∈ R[λ]).
2. A matrix polynomial Q(λ) ∈ R[λ]^{n×n} is called unimodular if det(Q(λ)) is a nonzero constant.
Let P(λ) ∈ R[λ]^{n×n} be regular. We call µ ∈ C a (finite) eigenvalue of P(λ) if

P(µ) = \sum_{k=0}^{ℓ} µ^k P_k ∈ C^{n×n}

is a singular matrix. Thus, µ ∈ C is an eigenvalue of P(λ) if and only if det(P(µ)) = 0. Therefore, the set of all finite eigenvalues of P(λ) coincides with the roots of det(P(λ)) ∈ R[λ] [20, Sec. 2]. The algebraic multiplicity of µ is defined as the multiplicity of µ as a root of det(P(λ)). In addition, if µ ∈ C is some eigenvalue of P(λ), the corresponding nullspace null(P(µ)) is called the eigenspace for µ. Its dimension is referred to as the geometric multiplicity of µ.

We define for any d ≥ deg(P)

rev_d P(λ) := λ^d P(λ^{-1}).

Then rev_d P(λ) is again a matrix polynomial, i.e. rev_d P(λ) ∈ R[λ]^{n×n}. It is called the d-reversal corresponding to P(λ) [20, Def. 2.2]. In case d = deg(P), we call rev P(λ) := rev_d P(λ) the reversal of P(λ). It is easily verified that the finite eigenvalues of rev P(λ) are the reciprocals of the eigenvalues of P(λ). In accordance with this observation, we call ∞ an eigenvalue of P(λ) if zero is an eigenvalue of rev P(λ). The algebraic and geometric multiplicities of the eigenvalue ∞ are defined in terms of rev P(λ) and its finite eigenvalue µ = 0 [20, Def. 2.3]. The set of all eigenvalues of P(λ) ∈ R[λ]^{n×n} is called the spectrum of P(λ) and is denoted by σ(P).

The following important property is intrinsic to the eigenvalues of T-even matrix polynomials P(λ):

Proposition 2.1.
The spectrum σ(P) of a real, T-even matrix polynomial P(λ) has a Hamiltonian structure. That is, σ(P) is symmetric with respect to both the real and the imaginary axis.

Two matrix polynomials S(λ), P(λ) ∈ R[λ]^{n×n} are called unimodularly equivalent if there exist unimodular matrix polynomials U(λ), V(λ) ∈ R[λ]^{n×n} such that S(λ) = U(λ)P(λ)V(λ) holds. Linearizations for matrix polynomials are defined by unimodular equivalence as follows:

Definition 2.2.
Let P(λ) ∈ R[λ]^{n×n}.
(i) Any matrix polynomial L(λ) = λX + Y that can be expressed as

(2.2)  L(λ) = U(λ) (I_s ⊕ P(λ)) V(λ) ∈ R[λ]^{(n+s)×(n+s)}

for two unimodular matrix polynomials U(λ), V(λ) of size (n+s) × (n+s) and some s ∈ N is called a linearization for P(λ).
(ii) Assume deg(P) = k. A linearization L(λ) for P(λ) as in (2.2) is called a strong linearization whenever rev L(λ) is a linearization for rev_k P(λ) = rev P(λ), too.

Notice that unimodular matrix polynomials do not have any finite eigenvalues. Therefore, any linearization L(λ) as in (2.2) of P(λ) has the same finite eigenvalues (with the same algebraic and geometric multiplicities) as P(λ) [9]. Furthermore, if L(λ) is strong, the same holds for the eigenvalue ∞ in case ∞ ∈ σ(P).

The problem of finding linearizations for matrix polynomials P(λ) ∈ R[λ]^{n×n} has been addressed in, e.g., [20, 28]. Particular research has been done on conditioning [13], structure-preservation [14] and nonstandard polynomial bases [11, 17, 18]. In [9], a new class of linearizations was introduced (so-called block minimal bases linearizations) that has recently attracted much attention. The linearization we present in Theorem 2.5 will belong to this class.

Due to Proposition 2.1, we are particularly interested in T-even linearizations (whenever P(λ) ∈ R[λ]^{n×n} is T-even) to preserve the symmetries inherent to the spectrum of P(λ). The structure of the T-even linearization L_P(λ) we define in (2.5) depends on the parity of deg(P) (which can be even or odd). Thus we define M_P(λ) in Definition 2.3 depending on the degree of P(λ) to treat both cases in Theorem 2.5 in a common framework. Here and hereafter, we use the notation ⟨x, y⟩ to represent the scalar product x^T y ∈ R of two vectors x and y and ⊕ to denote the direct sum of matrices, i.e. A ⊕ B = diag(A, B) for any A, B ∈ R^{n×n}.

Definition 2.3.
Assume P(λ) ∈ R[λ]^{n×n} is given as in (2.1).
(a) If deg(P) is odd, we define

(2.3)  M_P(λ) := ⊕_{k=0}^{ℓ-1} (-1)^{k+1} (λ P_{d-2k} + P_{d-2k-1}) ∈ R[λ]^{ℓn×ℓn}

with d = deg(P) and ℓ = (d+1)/2.
(b) If deg(P) is even, we define M_P(λ) as in (2.3) above with d = deg(P) + 1, ℓ = (d+1)/2 and P_d := 0_{n×n}.

Notice that (λP_{d-2k} + P_{d-2k-1})^T = -λP_{d-2k} + P_{d-2k-1} holds for all summands in (2.3) regardless of the parity of deg(P). That means M_P(λ)^T = M_P(-λ), so M_P(λ) is always T-even. With the definition Λ_k(λ) := [λ^k  λ^{k-1}  ···  λ  1] ∈ R[λ]^{1×(k+1)} for any k ≥ 1, we make the following important observation.
Remark 2.4. For M_P(λ) defined for P(λ) as in (2.3), it can be verified by a direct calculation that

(Λ_{ℓ-1}(-λ) ⊗ I_n) M_P(λ) (Λ_{ℓ-1}(λ)^T ⊗ I_n) = P(λ)

holds. This property will be exploited in Section 4.

Let P(λ) ∈ R[λ]^{n×n} be T-even. With the use of M_P(λ) and

(2.4)  L_k(λ) := [ 1  -λ              ]
                 [     1   -λ         ]
                 [         ⋱     ⋱    ]
                 [              1  -λ ]  ∈ R[λ]^{k×(k+1)},  k ≥ 1,

we present a structure-preserving, i.e. T-even, linearization L_P(λ) for P(λ) in the following Theorem 2.5. It is a block minimal bases pencil (as introduced in [9]) and was already used in [12]. In particular, Theorem 3.3 in [9] applies to the matrix pencil L_P(λ) defined in (2.5) below and confirms that it is in fact a linearization for P(λ).

Theorem 2.5.
Let P(λ) ∈ R[λ]^{n×n} be T-even. Then the matrix pencil

(2.5)  L_P(λ) := [ M_P(λ)             L_{ℓ-1}(-λ)^T ⊗ I_n ]
                 [ L_{ℓ-1}(λ) ⊗ I_n   0                   ]  ∈ R[λ]^{dn×dn}

defined for P(λ) with M_P(λ), d and ℓ given as in Definition 2.3 and L_{ℓ-1}(λ) as introduced in (2.4) is T-even. Moreover, L_P(λ) is a strong linearization for P(λ) if deg(P) is odd and a linearization for P(λ) if deg(P) is even.

Due to the linearization property, the matrix pencil L_P(λ) ∈ R[λ]^{dn×dn} defined in (2.5) has the same finite eigenvalues as P(λ) ∈ R[λ]^{n×n} (from whose matrix coefficients it is defined). Moreover, since L_P(λ) is T-even whenever P(λ) is T-even, we call L_P(λ) a structure-preserving linearization for P(λ). To illustrate the form of L_P(λ) consider the following example.
Example 2.6. The pencil L_P(λ) defined for a T-even matrix polynomial P(λ) ∈ R[λ]^{n×n} has a very sparse and clear structure. This is illustrated below for P(λ) = \sum_{k=0}^{7} λ^k P_k of degree seven. Writing L_P(λ) in the form λX + Y for two 7n × 7n matrices X and Y we have

(2.6)  L_P(λ) = λX + Y  with

X = [ -P_7    0      0      0     0    0    0   ]
    [  0      P_5    0      0     I_n  0    0   ]
    [  0      0     -P_3    0     0    I_n  0   ]
    [  0      0      0      P_1   0    0    I_n ]
    [  0     -I_n    0      0     0    0    0   ]
    [  0      0     -I_n    0     0    0    0   ]
    [  0      0      0     -I_n   0    0    0   ]

Y = [ -P_6    0      0      0     I_n  0    0   ]
    [  0      P_4    0      0     0    I_n  0   ]
    [  0      0     -P_2    0     0    0    I_n ]
    [  0      0      0      P_0   0    0    0   ]
    [  I_n    0      0      0     0    0    0   ]
    [  0      I_n    0      0     0    0    0   ]
    [  0      0      I_n    0     0    0    0   ]

Since P(λ) was assumed to be T-even, it is seen directly that Y is symmetric while X is skew-symmetric. In addition, notice that, if P(λ) was only of degree six, L_P(λ) as defined in (2.5) would be as in (2.6) with P_7 = 0.

In general, determining σ(P) for a matrix polynomial P(λ) is sometimes referred to as the polynomial eigenvalue problem (PEP). If P(λ) = λX + Y is a matrix pencil, the term generalized eigenvalue problem (GEP) is often used. A common way to solve a PEP corresponding to P(λ) ∈ R[λ]^{n×n} is to compute σ(P) (or just a part of it) through a linearization L(λ) for P(λ) using a method for GEPs. Notice that the size of L(λ) is usually much larger than the size of P(λ) (depending on the degree of P(λ)). Thus it is often appropriate (or even necessary) not to compute all eigenvalues of L(λ) but only some (e.g. in a predefined area of the complex plane). For such purposes Krylov subspace methods are among the most appropriate algorithms (cf. [1] for an overview of different Krylov subspace algorithms). Hereby, the area where eigenvalues are to be found is controlled via a shift parameter ζ ∈ C. With some abuse of terminology, a Krylov subspace method can be called rational if it admits changes of this shift parameter during its iteration, see [26, 27].

All subsequent investigations mainly aim for the construction of a rational Krylov subspace algorithm to determine eigenvalues of L_P(λ) defined as in (2.5) for some given T-even matrix polynomial P(λ) ∈ R[λ]^{n×n}.
Hereby, the T -even structure of L P ( λ ) is exploited to preserve the spectral symmetries.
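The construction above is easy to experiment with. The following sketch (Python with NumPy; a randomly generated degree-three T-even polynomial, so d = 3 and ℓ = 2, with the sign pattern of M_P(λ) as in Example 2.6; all concrete values are illustrative assumptions, not data from this paper) builds L_P(λ) = λX + Y and checks the claims of this section: Y is symmetric, X is skew-symmetric, the identity of Remark 2.4 holds, and the eigenvalues occur in ± matching pairs (Proposition 2.1).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2  # even coefficient size, so the skew-symmetric leading coefficient is nonsingular

sym  = lambda A: (A + A.T) / 2   # symmetric part
skew = lambda A: (A - A.T) / 2   # skew-symmetric part

# T-even polynomial P(l) = P0 + l*P1 + l^2*P2 + l^3*P3:
# even-index coefficients symmetric, odd-index coefficients skew-symmetric
P0, P2 = sym(rng.standard_normal((n, n))),  sym(rng.standard_normal((n, n)))
P1, P3 = skew(rng.standard_normal((n, n))), skew(rng.standard_normal((n, n)))
P = lambda l: P0 + l*P1 + l**2*P2 + l**3*P3

I, Z = np.eye(n), np.zeros((n, n))
# d = 3, ell = 2: M_P(l) = -(l*P3 + P2) (+) (l*P1 + P0), and L_1(l) = [1  -l],
# assembled into L_P(l) = l*X + Y as in (2.5)
X = np.block([[-P3, Z, Z], [Z, P1, I], [Z, -I, Z]])
Y = np.block([[-P2, Z, I], [Z, P0, Z], [I, Z, Z]])
assert np.allclose(Y, Y.T) and np.allclose(X, -X.T)        # T-even pencil

# Remark 2.4 checked at the sample point l = 0.7
l = 0.7
M = np.block([[-(l*P3 + P2), Z], [Z, l*P1 + P0]])
assert np.allclose(np.hstack([-l*I, I]) @ M @ np.vstack([l*I, I]), P(l))

# eigenvalues of L_P: det(mu*X + Y) = 0  <=>  mu in -sigma(X^{-1}Y)
mus = np.linalg.eigvals(-np.linalg.solve(X, Y))
# each mu is a root of det P (smallest singular value of P(mu) vanishes) ...
assert all(np.linalg.svd(P(mu), compute_uv=False)[-1] < 1e-6 for mu in mus)
# ... and the spectrum is symmetric with respect to the origin (Proposition 2.1)
assert all(np.abs(mus + mu).min() < 1e-6 for mu in mus)
```

Since deg(P) = 3 is odd here, Theorem 2.5 even guarantees a strong linearization for this example.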
3. The Even-IRA algorithm.
According to Proposition 2.1, the spectrum of a (real) T-even matrix polynomial is symmetric with respect to the real and imaginary axis. Numerical algorithms respecting this spectral symmetry will in general be more accurate than standard methods [19]. In addition, numerical methods that ignore the special structure may often produce (physically) less meaningful results [21]. Therefore, our focus in the development of a reliable eigensolver for T-even polynomial eigenvalue problems is twofold: on the one hand on the application of a structure-preserving linearization (see Theorem 2.5) and on the other hand on a method that profitably exploits this structure. One method taking the T-even structure into account is the Even-IRA algorithm presented in [24]. It belongs to the class of structure-preserving methods for T-even generalized eigenvalue problems. Other methods for solving T-even polynomial eigenvalue problems can be found in, e.g., [2, 22].

The Even-IRA algorithm is designed to determine a part of the spectrum of a regular T-even matrix pencil G(λ) = λX + Y ∈ R[λ]^{m×m} close to a predefined target in the complex plane. To preserve the Hamiltonian eigenvalue structure, a special spectral transformation is applied that preserves the ± matching pairs of eigenvalues. In particular, whenever G(λ) = λX + Y is regular and T-even, i.e. X = -X^T and Y = Y^T holds, and some shift ζ ∉ σ(G) is given in a region of the complex plane where eigenvalues are to be found, then in [24] the transformation

(3.1)  G(ζ) = ζX + Y  ↦  K(ζ) = G(ζ)^{-T} X G(ζ)^{-1} X ∈ C^{m×m}

is considered. Notice that a similar spectral transformation already appeared in [4, 22, 31] in the context of skew-Hamiltonian/Hamiltonian eigenvalue problems and the symplectic Lanczos process. Whenever G(µ)x = 0 holds for some µ ∈ C and x ∈ C^m, it is easily confirmed that K(ζ)x = θx follows, where θ = (µ² - ζ²)^{-1}.
Thus, any two finite eigenvalues µ and -µ of G(λ) are mapped to the same eigenvalue θ ∈ σ(K(ζ)). Due to this fact, ± matching pairs of eigenvalues are preserved. On the other hand, all eigenvalues of K(ζ) are necessarily of even multiplicity. Notice the following two important facts:
• Whenever some eigenvalue θ ∈ σ(K(ζ)) has been found, it gives rise to a ± matching pair of two eigenvalues of G(λ), namely

(3.2)  µ = \sqrt{(1/θ) + ζ²}  and  µ̂ = -\sqrt{(1/θ) + ζ²}.

• The matrix K(ζ) from (3.1) will in general be complex but remains real whenever ζ ∈ R or ζ ∈ iR. In case ζ = a + bi with nonzero real and imaginary parts, a slightly different spectral transformation can be considered, see [24, Rem. 2.1], to stay in real arithmetic.

In [24] the authors suggest to apply the implicitly restarted Krylov-Schur method [29] to the matrix K(ζ) in (3.1) to find some, say s ∈ N, eigenvalues of G(λ). That is, if v_1, ..., v_s is an orthonormal basis of the Krylov space

(3.3)  K_s(K(ζ), x) = span{x, K(ζ)x, K(ζ)²x, ..., K(ζ)^{s-1}x}

for some x ∈ R^m (computed by the Arnoldi method, see [1]) and V = [v_1 ··· v_s] ∈ R^{m×s}, in general some of the eigenvalues of K(ζ) of largest magnitude are well approximated by some of the s eigenvalues of V^T K(ζ) V. This process can now be (implicitly) restarted using the Krylov-Schur restart strategy [29, Sec. 3] until all s eigenvalues of V^T K(ζ) V serve as good approximations to eigenvalues of K(ζ). This approach is called the Even-IRA algorithm (details on the practical implementation of the algorithm can be found in [24, Sec. 4]). Additional information on how eigenvectors may be captured can be found in [24, p. 4074ff].

The basis of our algorithm is the
Even-IRA algorithm. As this method is designed for T-even matrix pencils, it cannot be used directly for T-even matrix polynomials P(λ) of degree > 1. To solve the PEP for P(λ), we apply the Even-IRA algorithm to the structure-preserving linearization L_P(λ) from (2.5). The sparse block structure of L_P(λ) turns out to be very beneficial for the computation of matrix-vector products K(ζ)x (which are necessary to build the Krylov space, see (3.3)). In fact, we show in Section 4 that K(ζ)x can be computed in a cheap and reliable way without ever forming K(ζ) and L_P(ζ) explicitly. In Section 5, we will modify the Even-IRA algorithm so that it is able to handle changes of the shift parameter ζ during the Arnoldi iteration and the restart process. This makes it possible to accelerate convergence or to control/change the regions in the complex plane where eigenvalues are to be found.
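The action of the transformation (3.1) on a single eigenvalue, and the reverse transformation (3.2), can be illustrated with plain complex arithmetic (the values of µ and ζ below are arbitrary illustrative choices):

```python
import cmath

zeta = 0.5        # real shift, so K(zeta) stays real
mu = 1.2 + 0.8j   # a hypothetical finite eigenvalue of G(lambda)

# both members of the +/- matching pair are mapped to theta = (mu^2 - zeta^2)^(-1)
theta = 1.0 / (mu**2 - zeta**2)
assert abs(1.0 / ((-mu)**2 - zeta**2) - theta) < 1e-14

# reverse transformation (3.2): theta gives back the +/- matching pair
mu_rec = cmath.sqrt(1.0 / theta + zeta**2)
assert min(abs(mu_rec - mu), abs(mu_rec + mu)) < 1e-12

# eigenvalues close to +/-zeta are mapped to eigenvalues of large magnitude,
# which is what makes this shift-and-invert strategy target eigenvalues near the shift
assert abs(1.0 / (0.51**2 - zeta**2)) > abs(1.0 / (5.0**2 - zeta**2))
```

Note that a Krylov method applied to K(ζ) therefore finds the eigenvalues of largest magnitude first, which correspond exactly to the eigenvalues of G(λ) closest to ±ζ.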
4. The efficient computation of matrix-vector products K(ζ)x. Assume that P(λ) ∈ R[λ]^{n×n} is some T-even matrix polynomial and let L_P(λ) ∈ R[λ]^{dn×dn} be defined as in (2.5). Recall that P(λ) and L_P(λ) share the same finite eigenvalues. As outlined in Theorem 2.5, L_P(λ) is T-even, so the Even-IRA algorithm can be applied to L_P(λ) to determine a part of the finite spectrum of P(λ). In consideration of large-scale problems, the sparsity and structure of L_P(λ) can be exploited to significantly increase the computational speed in calculating K(ζ)v. This effective computational approach is explained in detail in this section (see also [12] and [28, Sec. 5.2]).

Remark 4.1. As we consider real T-even matrix polynomials, matrix-vector multiplications K(ζ)v will only involve real vectors v ∈ R^{dn} in all subsequent sections (even if ζ and K(ζ) are complex). However, the technique to perform matrix-vector multiplications is valid even if v ∈ C^{dn}, so we give a general treatment here.

To begin, assume ζ ∈ C is not contained in σ(P) and let v ∈ C^{dn} be given. Moreover, let L_P(λ) = λX + Y as in (2.5). Explicitly, the matrix-vector product K(ζ)v can be written as

(4.1)  K(ζ)v = (L_P(ζ)^{-T} X L_P(ζ)^{-1} X) v.

Actually, (4.1) can be evaluated using four consecutive matrix-vector multiplications. The matrix-vector products with X, where X ∈ R^{dn×dn}, can be evaluated directly and quickly by exploiting the sparsity of X. Moreover, as X has a clear and determined block structure, a matrix-vector multiplication Xv can entirely be carried out implicitly, that is, without forming X at all, on its nonzero n × n blocks. The matrix-vector products with L_P(ζ)^{-1} and L_P(ζ)^{-T} = L_P(-ζ)^{-1} can be realized by solving linear systems with L_P(ζ) and L_P(-ζ), respectively. However, the size of both matrices is dn × dn and, therefore, can be rather large.
Fortunately, a linear-system solve with L_P(ζ) can be traced back to solely n × n computations. The solution of a linear system with L_P(ζ) can essentially be reduced to the solution of a linear system involving P(ζ) ∈ C^{n×n}. This provides an economic approach for the determination of these products since, for instance, the computational cost of an LU decomposition for L_P(ζ) is within O(d³n³) while it is only O(n³) for P(ζ) if no sparsity patterns are taken into account. For sparse matrices the cost is about O(d · nz) and O(nz), respectively, where nz denotes the number of nonzero entries. Moreover, the storage requirements for the LU factors of P(ζ) are way below those for the LU factors of L_P(ζ).

(Recall that the size of L_P(λ) is, depending on the degree of P(λ), much larger than the size of P(λ). Moreover, a major part of the computational cost raised by a Krylov subspace method such as the Even-IRA algorithm usually comes from the computation of the matrix-vector products that form the Krylov space, see (3.3). Thus, to achieve a reasonable efficiency of our method, it is necessary to guarantee the fast and cheap computation of matrix-vector products K(ζ)v, where K(ζ) ∈ C^{dn×dn} is the matrix defined for L_P(λ) in (3.1), ζ ∈ C is some shift not contained in the spectrum of P(λ), and v ∈ C^{dn}.)

Assume that L_P(ζ)^{-1}x is to be computed, i.e. the linear system

(4.2)  L_P(ζ) y = [ M_P(ζ)             L_{ℓ-1}(-ζ)^T ⊗ I_n ] [ y_1 ]  =  [ x_1 ]  = x
                  [ L_{ℓ-1}(ζ) ⊗ I_n   0                   ] [ y_2 ]     [ x_2 ]

needs to be solved for some given
The structure of L P ( ζ ) reveals that (4.2) can be rewrittenas a system of two equations for the unknown vectors y and y as M P ( ζ ) y + (cid:0) L ℓ − ( − ζ ) T ⊗ I n (cid:1) y = x and(4.3) ( L ℓ − ( ζ ) ⊗ I n ) y = x . (4.4)Notice that (4.4) is an underdetermined system with L ℓ − ( ζ ) ⊗ I n ∈ C ( ℓ − n × ℓn .Moreover, rank( L ℓ − ( ζ ) ⊗ I n )) = ( ℓ − n holds regardless of the choice of ζ . Therefore,the nullspace of L ℓ − ( ζ ) ⊗ I n is always n -dimensional and easily determined since (cid:0) L ℓ ( ζ ) ⊗ I n (cid:1)(cid:0) Λ ℓ ( ζ ) T ⊗ I n (cid:1) = 0 ( ℓ − n × n . Therefore we have null( L ℓ ( ζ ) ⊗ I n ) = { (Λ ℓ ( ζ ) T ⊗ I n ) r ; r ∈ C n } . Consequently, anysolution y for (4.4) has the form y = b y + (Λ ℓ ( ζ ) T ⊗ I n ) r where b y ∈ C ℓn solves(4.4) and r ∈ C n is arbitrary (i.e. (Λ ℓ ( ζ ) T ⊗ I n ) r is a solution to the homogenoussystem corresponding to (4.4)). In fact, once some particular solution b y has beenfound, there exists some unique r ⋆ such that y ⋆ = b y + (Λ ℓ ( ζ ) T ⊗ I n ) r ⋆ ∈ C ℓn . Withthis characterization of y ⋆ at hand, it follows from (4.3) that(4.5) M P ( ζ ) (cid:0)b y + (Λ ℓ ( ζ ) T ⊗ I n ) r ⋆ (cid:1) + (cid:0) L ℓ ( − ζ ) T ⊗ I n (cid:1) y ⋆ = x holds. Multiplying (4.5) by Λ ℓ ( − ζ ) ⊗ I n from the left eliminates the second term since(Λ ℓ ( − ζ ) ⊗ I n )( L ℓ ( − ζ ) T ⊗ I n ) = 0. After some reordering we obtain from (4.5)(4.6) (cid:0) Λ ℓ ( − ζ ) ⊗ I n (cid:1) M P ( ζ ) (cid:0) Λ ℓ ( ζ ) T ⊗ I n (cid:1)| {z } = P ( ζ ) r ⋆ = x − (cid:0) Λ ℓ ( − ζ ) ⊗ I n (cid:1) M P ( ζ ) b y . Notice that the left-hand-side of (4.6) simplifies to P ( ζ ) r ⋆ in accordance with Remark2.4. In addition, as P ( ζ ) is nonsingular, r ⋆ is the unique solution of (4.6). 
Thus, in other words, for any fixed particular solution ŷ_1 of (4.4), the unique solution r⋆ of the n × n linear system

(4.7)  P(ζ) r = (Λ_{ℓ-1}(-ζ) ⊗ I_n)(x_1 - M_P(ζ) ŷ_1)

determines the first part y⋆_1 = ŷ_1 + (Λ_{ℓ-1}(ζ)^T ⊗ I_n) r⋆ ∈ C^{ℓn} of the solution vector y⋆. Once y⋆_1 has been found, y⋆_2 ∈ C^{(ℓ-1)n} will be the unique solution of the overdetermined system

(4.8)  (L_{ℓ-1}(-ζ)^T ⊗ I_n) y_2 = x_1 - M_P(ζ) y⋆_1

since (4.3) and (4.4) are satisfied if and only if y_1 = y⋆_1 and y_2 = y⋆_2. The computation of a particular solution ŷ_1 of (4.4) and of the solution y⋆_2 of (4.8) can be carried out by backward and forward substitution, respectively, and both require O(ℓn) flops. In particular:
1. A solution ŷ_1 ∈ C^{ℓn} for (4.4), i.e.

(4.9)  [ I_n  -ζI_n                     ] [ ŷ_{1,1}   ]   [ x_{2,1}   ]
       [      I_n    -ζI_n              ] [ ŷ_{1,2}   ]   [ x_{2,2}   ]
       [             ⋱       ⋱          ] [    ⋮      ] = [    ⋮      ]
       [                    I_n   -ζI_n ] [ ŷ_{1,ℓ-1} ]   [ x_{2,ℓ-1} ]
                                          [ ŷ_{1,ℓ}   ]

with ŷ_{1,k}, x_{2,k} ∈ C^n, can be found by backward substitution. If ŷ_1 and x_2 are partitioned as in (4.9) and ŷ_{1,ℓ} = 0 is chosen, then ŷ_{1,1}, ..., ŷ_{1,ℓ-1} are uniquely determined through the recurrence relation

ŷ_{1,k} = x_{2,k} + ζ ŷ_{1,k+1}   for k = ℓ-1, ..., 1.
2. The unique solution y⋆_2 of (4.8), i.e.

(4.10)  [ I_n                 ] [ y_{2,1}   ]
        [ ζI_n  I_n           ] [ y_{2,2}   ]
        [       ⋱      ⋱      ] [    ⋮      ]  =  x_1 - M_P(ζ) y⋆_1  =:  [ w_1; w_2; … ; w_ℓ ]
        [           ζI_n  I_n ] [ y_{2,ℓ-1} ]
        [                ζI_n ]

with y_{2,k}, w_k ∈ C^n, can be found by forward substitution. If y⋆_2 is partitioned as y_2 in (4.10), then y⋆_{2,1} = w_1 and y⋆_{2,2}, ..., y⋆_{2,ℓ-1} are uniquely determined by the recurrence

y⋆_{2,k} = w_k - ζ y⋆_{2,k-1}   for k = 2, ..., ℓ-1.

Remark 4.2. In contrast to general overdetermined systems, which usually need not have a solution, there is a unique solution y⋆_2 for (4.10) since we assumed L_P(ζ)y = x to be uniquely solvable.

For the determination of matrix-vector products L_P(ζ)^{-T}v, the T-even structure of L_P(ζ) can be exploited. In particular, L_P(ζ)^{-T} = (L_P(ζ)^T)^{-1} = L_P(-ζ)^{-1}. In order to find L_P(-ζ)^{-1}v, the same approach as above can be used involving -ζ instead of ζ. In particular, in (4.7) the matrix P(-ζ) instead of P(ζ) will show up. If an LU decomposition P(ζ) = LU has been computed to solve the linear system with P(ζ) in (4.7), this factorization can be reused to solve the system with P(-ζ) since P(-ζ) = P(ζ)^T = U^T L^T.

In conclusion, the procedure described in this section presents an efficient way to calculate matrix-vector products of the form (4.1). Whenever d ≪ n, the complexity of the overall method is dominated by the cost of the LU decomposition of P(ζ), which is O(nz) or O(n³) depending on whether P(ζ) is sparse or not.
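The whole procedure (one backward substitution, one n × n solve with P(ζ), one forward substitution) can be sketched as follows for the smallest nontrivial case d = 3, ℓ = 2, where each substitution collapses to a single block step. The T-even polynomial and the right-hand side are random illustrative data, and the sign convention of Example 2.6 for M_P(λ) is assumed:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2
sym  = lambda A: (A + A.T) / 2
skew = lambda A: (A - A.T) / 2
P0, P2 = sym(rng.standard_normal((n, n))),  sym(rng.standard_normal((n, n)))
P1, P3 = skew(rng.standard_normal((n, n))), skew(rng.standard_normal((n, n)))
P = lambda l: P0 + l*P1 + l**2*P2 + l**3*P3

I, Z = np.eye(n), np.zeros((n, n))
zeta = 0.7
M = np.block([[-(zeta*P3 + P2), Z], [Z, zeta*P1 + P0]])   # M_P(zeta), ell = 2
LP = np.block([[M, np.vstack([I, zeta*I])],               # [ M_P(z)       L_1(-z)^T x I ]
               [np.hstack([I, -zeta*I]), Z]])             # [ L_1(z) x I   0             ]

x = rng.standard_normal(3 * n)
x1, x2 = x[:2*n], x[2*n:]

# 1) particular solution of (4.4) by backward substitution (last block set to zero)
y1_hat = np.concatenate([x2, np.zeros(n)])
# 2) the only n x n solve of the whole procedure, cf. (4.7)
r = np.linalg.solve(P(zeta), np.hstack([-zeta*I, I]) @ (x1 - M @ y1_hat))
y1 = y1_hat + np.concatenate([zeta*r, r])   # y1_hat + (Lambda_1(zeta)^T x I) r
# 3) forward substitution for (4.8)
w = x1 - M @ y1
y2 = w[:n]

# the implicitly computed vector solves the full dn x dn system (4.2)
assert np.allclose(LP @ np.concatenate([y1, y2]), x)
```

A matrix-vector product with K(ζ) then consists of two such solves (with ζ and -ζ) and two sparse multiplications with X, without K(ζ) or L_P(ζ) ever being formed.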
5. The rational Arnoldi decomposition.
Let P(λ) ∈ R[λ]^{n×n} be some T-even matrix polynomial and let L_P(λ) = λX + Y ∈ R[λ]^{dn×dn} and

(5.1)  K(ζ) = L_P(ζ)^{-T} X L_P(ζ)^{-1} X = L_P(-ζ)^{-1} X L_P(ζ)^{-1} X,   ζ ∉ σ(P),

be defined for P(λ) as in (2.5) and (3.1), respectively.

Recall from Section 3 that K(ζ) as in (5.1) stays real whenever ζ is real or purely imaginary. We will assume for the moment that either of them holds to stay within real arithmetic. So, let v_1 ∈ R^{dn} be some normalized vector and suppose that (for instance as part of the Even-IRA algorithm) m ∈ N steps of the Arnoldi process (cf. [1, Alg. 7.3]) have been performed for K(ζ). That is, we arrive at an Arnoldi decomposition for K(ζ) ∈ R^{dn×dn} of the form

(5.2)  K(ζ) V_m = V_m T_m + t_{m+1,m} v_{m+1} e_m^T = V_{m+1} \underline{T}_m

where V_{m+1} = [v_1 ··· v_{m+1}] = [V_m  v_{m+1}] ∈ R^{dn×(m+1)}. The following statements hold for the vectors and matrices involved in (5.2):
1. The columns v_1, ..., v_{m+1} ∈ R^{dn} of V_{m+1} form an orthonormal basis of the Krylov space K_{m+1}(K(ζ), v_1) where v_1 is the starting vector of the Arnoldi iteration.
2. The matrices T_m = [t_{i,j}] ∈ R^{m×m} and \underline{T}_m := I_{m+1,m} T_m + t_{m+1,m} e_{m+1} e_m^T ∈ R^{(m+1)×m} have upper-Hessenberg structure, where e_m denotes the m-th unit vector in R^m and I_{m+1,m} ∈ R^{(m+1)×m} consists of the first m columns of the identity matrix I_{m+1}.

The eigenvalues of T_m ∈ R^{m×m} are called Ritz values of K(ζ) with respect to K_{m+1}(K(ζ), v_1). According to the Rayleigh-Ritz principle (cf. [8, Sec. 7]), these values are used as approximations to the eigenvalues of K(ζ) by the Even-IRA algorithm (cf. [24, pp. 4074ff], see also Section 3). Recall that the spectral transformation µ ↦ (µ² - ζ²)^{-1} corresponding to the transformation (3.1) causes eigenvalues µ of L_P(λ) close to ±ζ to be mapped to eigenvalues of large magnitude. In the first place, these will be well approximated by eigenvalues of T_m.
Starting with the decomposition (5.2), the Even-IRA algorithm performs several Krylov-Schur restarts (see [29]) until convergence of the desired number of eigenvalues is observed. As soon as t_{m+1,m} in (5.2) becomes zero, K(ζ)V_m = V_m T_m holds and all eigenvalues of T_m are exact eigenvalues of K(ζ). Finally, the reverse transformation (3.2) reveals eigenvalues of L_P(λ) close to ζ.

Now notice that K(ζ) in (5.1) is nonsingular if and only if X is nonsingular. Therefore, assuming X to be nonsingular, we have

(5.3)  K(ζ)^{-1} = X^{-1} L_P(ζ) X^{-1} L_P(−ζ) = X^{-1}(ζX + Y) X^{-1}(−ζX + Y) = −ζ² I_{dn} + (X^{-1}Y)² = (X^{-1}Y)² − ζ² I_{dn},

so that K(ζ) = ((X^{-1}Y)² − ζ² I_{dn})^{-1}. For all further considerations we let G := (X^{-1}Y)² whenever it exists, so that K(ζ) = (G − ζ² I_{dn})^{-1}.

Now, whenever X ∈ R^{dn×dn} is nonsingular and G exists, (5.3) can be taken into account and (5.2) may be rewritten in terms of G as

(5.4)  G V_{m+1} T̲_m = V_{m+1} H̲_m,

where H̲_m := ζ² T̲_m + I_{m+1,m} ∈ R^{(m+1)×m} is again of upper-Hessenberg form. A decomposition of the form (5.4) is called a generalized rational Arnoldi decomposition for G in [6, (1.1)], so we adopt this terminology here. When working with K(ζ) = (G − ζ² I_{dn})^{-1} we will mainly consider rational decompositions as in (5.4) instead of Arnoldi decompositions as in (5.2) from now on. Regarding (5.4), the eigenvalues of the matrix pencil λT_m + H_m ∈ R[λ]^{m×m} (where T_m and H_m are given by the first m rows of T̲_m and H̲_m, respectively) will in general be good approximations of the eigenvalues of G, see [27].
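These identities can be sanity-checked numerically. The NumPy sketch below uses a random pencil λX + Y as a stand-in for an actual T-even linearization (the algebraic identities used here do not require the T-even structure): it verifies that K(ζ) equals ((X^{-1}Y)² − ζ²I)^{-1}, runs m Arnoldi steps for K(ζ), and confirms the rational decomposition (5.4):

```python
import numpy as np

rng = np.random.default_rng(0)
dn, m, zeta = 12, 4, 0.9
X = rng.standard_normal((dn, dn)) + dn * np.eye(dn)   # assumed nonsingular
Y = rng.standard_normal((dn, dn))

# K(zeta) via (5.1) and via (5.3): both agree when X is nonsingular
K = np.linalg.solve(-zeta * X + Y, X) @ np.linalg.solve(zeta * X + Y, X)
G = np.linalg.matrix_power(np.linalg.solve(X, Y), 2)  # G = (X^{-1}Y)^2
assert np.allclose(K, np.linalg.inv(G - zeta**2 * np.eye(dn)))

# m steps of the Arnoldi process for K(zeta), cf. (5.2)
V = np.zeros((dn, m + 1))
T = np.zeros((m + 1, m))
v = rng.standard_normal(dn)
V[:, 0] = v / np.linalg.norm(v)
for j in range(m):
    w = K @ V[:, j]
    for i in range(j + 1):            # Gram-Schmidt orthogonalization
        T[i, j] = w @ V[:, i]
        w -= T[i, j] * V[:, i]
    T[j + 1, j] = np.linalg.norm(w)
    V[:, j + 1] = w / T[j + 1, j]

H = zeta**2 * T + np.eye(m + 1, m)    # the Hessenberg matrix of (5.4)
assert np.allclose(K @ V[:, :m], V @ T)   # Arnoldi relation (5.2)
assert np.allclose(G @ V @ T, V @ H)      # rational decomposition (5.4)
```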
Certainly, the eigenvalues of G are the squares of the eigenvalues of L_P(λ), so the square roots +√θ and −√θ of eigenvalues θ ∈ C found from λT_m + H_m approximate eigenvalues of L_P(λ) and, in turn, of P(λ). It is a crucial observation regarding the rational Even-IRA algorithm presented in Section 6 that this relationship holds even if X^{-1} (and hence G) does not exist.

Now it is important to notice that, in (5.4), only T̲_m and H̲_m directly depend on ζ, but G does not. In comparison to the Arnoldi decomposition (5.2), where K(ζ) appears on the left-hand side and explicitly depends on ζ, this enables us to extend the decomposition (5.4) while changing the shift ζ to some newly chosen value ξ ∈ C. This is not possible for the standard Arnoldi decomposition (5.2) and cannot be realized in the Even-IRA algorithm. In particular, instead of calculating and orthogonalizing K(ζ)v_{m+1} to extend (5.4) (as in the Arnoldi iteration), we may use the vector K(ξ)v_{m+1} for some new shift ξ.
Remark. For the moment we assume that G ∈ R^{dn×dn} exists, since this will be helpful to illustrate the forthcoming computations. This assumption will be dropped in the next section, as the regularity of X is actually not required to perform the rational Even-IRA algorithm outlined in Section 6.

In Section 5.1 we show how the rational Arnoldi decomposition (5.4) can be extended to increase the dimension of the underlying Krylov space (spanned by the columns of V_{m+1}). To this end, we distinguish between the case where either ξ ∈ R or ξ ∈ iR holds (only in these cases ξ² and K(ξ) are real) and the case where ξ = a + bi is complex with nonzero real and imaginary parts (which implies that K(ξ) is non-real). In the second case we may still remain in real arithmetic if the real and imaginary parts of K(ξ)v_{m+1} are considered separately. (Notice that the authors of [24] deal with complex shifts in another way, by changing the definition of K(ζ), see [24, Rem. 2.1].)

Assume we are given a decomposition as in (5.4) obtained from m steps of the Arnoldi iteration applied to K(ζ) for some shift ζ ∉ σ(L_P). Now let ξ ∉ σ(L_P) be some new shift parameter. First assume that either ξ ∈ R or ξ ∈ iR holds, so K(ξ) is a real matrix and K(ξ)v_{m+1} is a real vector of size dn. The Gram-Schmidt orthogonalization of K(ξ)v_{m+1} against v_1, …, v_{m+1} yields

(5.5)  ṽ_{m+2} = K(ξ)v_{m+1} − [v_1 ⋯ v_{m+1}] [t_{1,m+1}, …, t_{m+1,m+1}]^T,

where t_{i,m+1} = ⟨K(ξ)v_{m+1}, v_i⟩, i = 1, …, m+1, and v_{m+2} = (t_{m+2,m+1})^{-1} ṽ_{m+2} with t_{m+2,m+1} = ‖ṽ_{m+2}‖. Then (5.5) can be rearranged to K(ξ)v_{m+1} = V_{m+2} t_{m+1}, where V_{m+2} := [V_{m+1} v_{m+2}] ∈ R^{dn×(m+2)} and t_{m+1} = [t_{k,m+1}]_{k=1}^{m+2} ∈ R^{m+2}. Putting the expression K(ξ) = (G − ξ² I_{dn})^{-1} from (5.3) to use, we obtain

(5.6)  G V_{m+2} t_{m+1} = v_{m+1} + ξ² V_{m+2} t_{m+1}.

The relation established in (5.6) can now easily be incorporated into the decomposition (5.4) by defining

(5.7)  T̲_{m+1} := [ [T̲_m; 0]  t_{m+1} ] ∈ R^{(m+2)×(m+1)},

i.e., T̲_m is padded with a zero row and t_{m+1} is appended as the last column, and

(5.8)  H̲_{m+1} := [ [H̲_m; 0]  e_{m+1} + ξ² t_{m+1} ] ∈ R^{(m+2)×(m+1)},

where e_{m+1} denotes the (m+1)-st unit vector in R^{m+2}. This gives a new decomposition G V_{m+2} T̲_{m+1} = V_{m+2} H̲_{m+1} of the same structure as (5.4).

To illustrate, suppose some v_1 ∈ R^{dn} with ‖v_1‖ = 1 is given and a single extension step is performed. The computations in (5.5), (5.7) and (5.8) yield

(5.9)  V_2 = [v_1 v_2] ∈ R^{dn×2},  T̲_1 = [t_{1,1}; t_{2,1}] ∈ R^{2×1},  H̲_1 = [h_{1,1}; h_{2,1}] ∈ R^{2×1},

so that G V_2 T̲_1 = V_2 H̲_1 holds. Now there exists a Givens rotation F ∈ R^{2×2} such that the second entry of F T̲_1 is zero. Redefining T̲_1 as F T̲_1, V_2 as V_2 F^T = [v_1 v_2] and H̲_1 as F H̲_1, we have computed a new, equivalent decomposition G V_2 T̲_1 = V_2 H̲_1. Notice that the left-hand side can also be expressed as G V_1 T_1, where V_1 = [v_1] consists only of the first column of V_2 and T_1 = [t_{1,1}], where t_{1,1} is the first entry of T̲_1. In particular, T_1 is now (trivially) an upper-triangular matrix.

For the further extension of the decomposition G V_1 T_1 = V_2 H̲_1, it is appropriate and feasible to keep T_k ∈ R^{k×k}, k ≥ 2, in upper-triangular form (instead of upper-Hessenberg form) throughout, while the upper-Hessenberg structure of H̲_k is preserved. This can be achieved by applying a special bulge-chasing step after every extension. We describe this procedure in general for a given decomposition of the above form of size m ≥ 1, i.e.

(5.10)  G V_m T_m = V_{m+1} H̲_m,

where V_{m+1} ∈ R^{dn×(m+1)} has orthonormal columns v_1, …, v_{m+1}, T_m ∈ R^{m×m} is upper-triangular and H̲_m ∈ R^{(m+1)×m} has upper-Hessenberg structure. Let the scalars t_{k,m+1} ∈ R, k = 1, …, m+2, and the vector v_{m+2} ∈ R^{dn} be computed as in (5.5) and let V̂_{m+2} := [V_{m+1} v_{m+2}]. The matrices in (5.7) and (5.8) now have the special form

(5.11)  T̂_{m+1} := [ [T_m; 0; 0]  t_{m+1} ] ∈ R^{(m+2)×(m+1)}

(T_m padded with two zero rows) and

(5.12)  Ĥ_{m+1} := [ [H̲_m; 0]  e_{m+1} + ξ² t_{m+1} ] ∈ R^{(m+2)×(m+1)}.

From V̂_{m+2} and the matrices in (5.11) and (5.12) we may now recover the structures from (5.10), i.e., V_{m+2} ∈ R^{dn×(m+2)} with orthonormal columns, T_{m+1} ∈ R^{(m+1)×(m+1)} of upper-triangular form and H̲_{m+1} ∈ R^{(m+2)×(m+1)} with upper-Hessenberg structure, so that G V_{m+1} T_{m+1} = V_{m+2} H̲_{m+1} holds. In fact, two orthogonal matrices Q ∈ R^{(m+2)×(m+2)} and Z ∈ R^{(m+1)×(m+1)} may be found such that H̲_{m+1} := Q Ĥ_{m+1} Z ∈ R^{(m+2)×(m+1)} and

(5.13)  Q T̂_{m+1} Z =: [ T_{m+1} ; 0 ⋯ 0 ]

with T_{m+1} ∈ R^{(m+1)×(m+1)} upper-triangular. Finally, defining V_{m+2} := V̂_{m+2} Q^T, we obtain the new decomposition

G V_{m+1} T_{m+1} = V_{m+2} H̲_{m+1}.

This equation is of the same form as (5.10), except that V_{m+1} has been extended by one column (which corresponds to the extension of the underlying Krylov space by one dimension) and that T_{m+1} and H̲_{m+1} have increased in size by one. The matrices Q and Z can be set up as a product of Givens rotations by the bulge-chasing process described in Algorithm 5.1 for a real or purely imaginary shift.

Algorithm 5.1
Bulge-Chasing Procedure (real/imaginary shift)
1. At first, a Givens rotation is applied to T̂_{m+1} (from the left) on rows m+2 and m+1 to eliminate t_{m+2,m+1}. Applying this transformation to Ĥ_{m+1} introduces a bulge in the position (m+2, m). This bulge can be eliminated by applying a Givens rotation (from the right) to Ĥ_{m+1} acting on columns m and m+1. A bulge will now show up in the position (m+1, m) in T̂_{m+1}.
2. The bulge in the position (m+1, m) in T̂_{m+1} created in step 1 can be eliminated by a Givens rotation applied (from the left) on rows m and m+1 of T̂_{m+1}. This introduces a new bulge in Ĥ_{m+1} at the position (m+1, m−1), which is eliminated by a Givens rotation (from the right) applied to Ĥ_{m+1} acting on the columns m−1 and m. In consequence, a new bulge will appear in T̂_{m+1} in the position (m, m−1).
3. The elimination process described in steps 1 and 2 continues in the same manner until the bulge in T̂_{m+1} is chased off the top-left corner.

Next, we discuss the case where ξ ∈ C has nonzero real and imaginary parts. Let us begin directly with a real rational Arnoldi decomposition as in (5.10). As K(ξ) ∈ C^{dn×dn} is now a complex matrix, the resulting vector K(ξ)v_{m+1} will also be complex (although v_{m+1} is still real). To remain in real arithmetic, we decompose K(ξ)v_{m+1} as Re(K(ξ)v_{m+1}) + i·Im(K(ξ)v_{m+1}) into its real and imaginary parts. Now we apply the Gram-Schmidt process to both vectors one after the other. That is, for Re(K(ξ)v_{m+1}) we obtain, analogously to (5.5),

(5.14)  ṽ_{m+2} = Re(K(ξ)v_{m+1}) − [v_1 ⋯ v_{m+1}] [t_{1,m+1}, …, t_{m+1,m+1}]^T

with t_{i,m+1} = ⟨Re(K(ξ)v_{m+1}), v_i⟩, and set v_{m+2} := (t_{m+2,m+1})^{-1} ṽ_{m+2} with t_{m+2,m+1} = ‖ṽ_{m+2}‖. Having computed v_{m+2}, we may now orthogonalize Im(K(ξ)v_{m+1}) against v_1, …, v_{m+2} to obtain

(5.15)  ṽ_{m+3} = Im(K(ξ)v_{m+1}) − [v_1 ⋯ v_{m+2}] [t_{1,m+2}, …, t_{m+2,m+2}]^T

with t_{i,m+2} = ⟨Im(K(ξ)v_{m+1}), v_i⟩. Again we define v_{m+3} := (t_{m+3,m+2})^{-1} ṽ_{m+3}, where t_{m+3,m+2} = ‖ṽ_{m+3}‖. Now we set V̂_{m+3} = [V_{m+1} v_{m+2} v_{m+3}],

t_{m+1} := [t_{1,m+1}, …, t_{m+2,m+1}, 0]^T ∈ R^{m+3}  and  t_{m+2} := [t_{1,m+2}, …, t_{m+2,m+2}, t_{m+3,m+2}]^T ∈ R^{m+3}.

From (5.14) and (5.15) we obtain Re(K(ξ)v_{m+1}) = V̂_{m+3} t_{m+1} and Im(K(ξ)v_{m+1}) = V̂_{m+3} t_{m+2}, so that K(ξ)v_{m+1} = V̂_{m+3}(t_{m+1} + i t_{m+2}) follows. Putting again K(ξ) = (G − ξ² I_{dn})^{-1} from (5.3) to use, we get

(5.16)  G V̂_{m+3}(t_{m+1} + i t_{m+2}) = V̂_{m+3}(e_{m+1} + ξ² t_{m+1} + i ξ² t_{m+2}),

where e_{m+1} denotes the (m+1)-st unit vector in R^{m+3}. Furthermore, splitting ξ² as ξ² = ρ + ηi with ρ := Re(ξ²) and η := Im(ξ²), (5.16) yields

G V̂_{m+3}(t_{m+1} + i t_{m+2}) = V̂_{m+3}[ e_{m+1} + ρ t_{m+1} − η t_{m+2} + i(η t_{m+1} + ρ t_{m+2}) ]

and, decomposing this once more into its real and imaginary parts, we arrive at

(5.17)  G V̂_{m+3} t_{m+1} = V̂_{m+3}(e_{m+1} + ρ t_{m+1} − η t_{m+2})  and
(5.18)  G V̂_{m+3} t_{m+2} = V̂_{m+3}(η t_{m+1} + ρ t_{m+2}).

The two relations (5.17) and (5.18) can now be incorporated into the decomposition (5.10). To this end, we define

(5.19)  T̂_{m+2} := [ [T_m; 0; 0; 0]  t_{m+1}  t_{m+2} ] ∈ R^{(m+3)×(m+2)}

and

(5.20)  Ĥ_{m+2} := [ [H̲_m; 0; 0]  e_{m+1} + ρ t_{m+1} − η t_{m+2}  η t_{m+1} + ρ t_{m+2} ] ∈ R^{(m+3)×(m+2)}.
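The real/imaginary splitting behind (5.17) and (5.18) is easy to verify numerically. In the sketch below, G is a random stand-in for (X^{-1}Y)², the vector w plays the role of K(ξ)v_{m+1}, and its real and imaginary parts stand in for the combinations V̂ t_{m+1} and V̂ t_{m+2}:

```python
import numpy as np

rng = np.random.default_rng(4)
dn = 10
G = rng.standard_normal((dn, dn))        # stands in for (X^{-1}Y)^2
xi = 0.8 + 0.5j                          # shift with nonzero real and imaginary part
rho, eta = (xi**2).real, (xi**2).imag    # xi^2 = rho + eta*i

v = rng.standard_normal(dn)              # v_{m+1}, still real
w = np.linalg.solve(G - xi**2 * np.eye(dn), v)   # K(xi) v_{m+1}, complex
wr, wi = w.real, w.imag

# real/imaginary split of G w = v + xi^2 w, cf. (5.17)/(5.18)
assert np.allclose(G @ wr, v + rho * wr - eta * wi)
assert np.allclose(G @ wi, eta * wr + rho * wi)
```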
From V̂_{m+3} and the matrices in (5.19) and (5.20) we may again recover the structures from (5.10), that is, G V_{m+2} T_{m+2} = V_{m+3} H̲_{m+2}, where V_{m+3} ∈ R^{dn×(m+3)} has orthonormal columns, T_{m+2} ∈ R^{(m+2)×(m+2)} is upper-triangular and H̲_{m+2} ∈ R^{(m+3)×(m+2)} has upper-Hessenberg form. As before, a special bulge-chasing procedure is appropriate to determine two orthogonal matrices Q ∈ R^{(m+3)×(m+3)} and Z ∈ R^{(m+2)×(m+2)} such that H̲_{m+2} := Q Ĥ_{m+2} Z ∈ R^{(m+3)×(m+2)} and

(5.21)  Q T̂_{m+2} Z = [ T_{m+2} ; 0 ⋯ 0 ],

where T_{m+2} ∈ R^{(m+2)×(m+2)} is upper-triangular. The matrices Q and Z can be set up as a product of Givens rotations by Algorithm 5.2. With V_{m+3} := V̂_{m+3} Q^T we obtain the desired decomposition.

Algorithm 5.2
Bulge-Chasing Procedure (complex shift)
1. At first, a Givens rotation is applied to T̂_{m+2} (from the left) on rows m+1 and m+2 to eliminate t_{m+2,m+1}. Subsequently, another Givens rotation is applied to the resulting matrix on rows m+2 and m+3 to eliminate t_{m+3,m+2}. Applying both transformations to Ĥ_{m+2} introduces bulges in the positions (m+2, m) and (m+3, m). We now apply two Givens rotations (from the right) to Ĥ_{m+2}, acting on columns m and m+1 to eliminate the element in position (m+3, m) and, subsequently, acting on columns m+1 and m+2 to eliminate the element in position (m+3, m+1). Two new bulges will show up in the positions (m+1, m) and (m+2, m+1) in T̂_{m+2}. The additional element in the (m+2, m) position of Ĥ_{m+2} remains in its position and is eliminated in step 2.
2. The bulges in the positions (m+1, m) and (m+2, m+1) in T̂_{m+2} created in step 1 can be eliminated by two Givens rotations applied (from the left) on rows m and m+1 (to eliminate the bulge in position (m+1, m)) and on rows m+1 and m+2 (to eliminate the bulge in position (m+2, m+1)) of T̂_{m+2}. This introduces new nonzero elements in Ĥ_{m+2} at the positions (m+1, m−1) and (m+2, m−1). These are eliminated by two Givens rotations (from the right) applied to Ĥ_{m+2}, acting on the columns m−1 and m (to eliminate the element in (m+2, m−1)) and on the columns m and m+1 (to eliminate the element in (m+2, m)). Notice that the (m+2, m)-element we eliminate now is the one that remained from step 1. Now new bulges will appear in T̂_{m+2} in the positions (m, m−1) and (m+1, m). The additional element in the (m+1, m−1) position of Ĥ_{m+2} remains in its position and is eliminated in the next step.
3. The elimination process described in steps 1 and 2 continues in the same manner until the bulges in T̂_{m+2} are chased off the top-left corner.

Starting with some v_1 ∈ R^{dn}, ‖v_1‖ = 1, the previously described procedures are appropriate to construct and extend a rational Arnoldi decomposition of the form (5.10). In each run, a new shift parameter ξ ∉ σ(L_P) can be chosen. Per iteration, the decomposition grows in size by one if ξ is real or purely imaginary, and by two otherwise.

Now recall that, whenever X is singular, G and, consequently, a decomposition of the form (5.10) do not exist. Nevertheless, the vectors K(ξ)v_{m+1} and v_{m+2} can still be computed, and T_m, H̲_m can be extended as in (5.11) and (5.12) if ξ is real or purely imaginary. The bulge-chasing procedure from Algorithm 5.1 applies and recovers the matrix structures from (5.10). If ξ is not real or purely imaginary, K(ξ)v_{m+1} can be split into its real and imaginary parts and the calculations in (5.14) and (5.15) can be carried out as described above. The extension of T_m and H̲_m works as explained in (5.19) and (5.20), and the bulge-chasing procedure from Algorithm 5.2 recovers the upper-triangular and upper-Hessenberg structures. In conclusion, for any m ≥ 1, the matrix pencil λT_m + H_m ∈ R[λ]^{m×m} can be formed even if G cannot. Moreover, its eigenvalues can be used to approximate the eigenvalues in σ(L_P) as before. We will permanently drop the assumption that X needs to be nonsingular and that G needs to exist from now on. In other words, we explicitly allow L_P(λ) to have eigenvalues at infinity. Therefore, the following derivations will mostly deal only with the matrices V_k, T_k and H̲_k as in (5.10) instead of the decomposition G V_m T_m = V_{m+1} H̲_m. These matrices and their modifications in the upcoming section should always be understood in the context of a rational Arnoldi decomposition as in (5.10) whenever such a decomposition exists. We have summarized the method to generate (or extend) a rational Arnoldi decomposition in Algorithm 5.3.
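One pass of the expansion loop for a real shift (the Gram-Schmidt step (5.5), the extended matrices (5.11)/(5.12), and the bulge chasing of Algorithm 5.1) can be sketched densely in NumPy as follows. This is an illustrative simplification: G is formed and inverted explicitly via a dense solve, whereas the actual algorithm applies K(ξ) implicitly through the linearization (Section 4); the function and variable names are ours, not the paper's:

```python
import numpy as np

def rot(a, b):
    """Givens pair (c, s) with [[c, s], [-s, c]] @ [a, b] = [r, 0]."""
    r = np.hypot(a, b)
    return (1.0, 0.0) if r == 0 else (a / r, b / r)

def extend_and_chase(G, V, T, H, xi):
    """One real-shift pass following (5.5) and (5.10)-(5.13): extend the
    decomposition G V[:, :m] T = V H by the direction K(xi) v_{m+1}, then
    chase the bulge so that T stays upper-triangular and H upper-Hessenberg."""
    dn, m1 = V.shape                               # m1 = m + 1
    w = np.linalg.solve(G - xi**2 * np.eye(dn), V[:, -1])   # K(xi) v_{m+1}
    t = np.array([w @ V[:, i] for i in range(m1)])          # Gram-Schmidt (5.5)
    w = w - V @ t
    tnew = np.linalg.norm(w)
    V = np.column_stack([V, w / tnew])             # extended basis
    t = np.append(t, tnew)                         # new last column of hat T
    q = m1                                         # hat T is (q+1) x q
    Tb = np.zeros((q + 1, q)); Tb[:m1 - 1, :q - 1] = T; Tb[:, -1] = t
    Hb = np.zeros((q + 1, q)); Hb[:m1, :q - 1] = H
    Hb[:, -1] = xi**2 * t
    Hb[m1 - 1, -1] += 1.0                          # e_{m+1} + xi^2 t, cf. (5.6)
    for j in range(q - 1, -1, -1):                 # bulge chasing (Algorithm 5.1)
        c, s = rot(Tb[j, j], Tb[j + 1, j])         # left: eliminate Tb[j+1, j]
        Gl = np.array([[c, s], [-s, c]])
        Tb[j:j + 2] = Gl @ Tb[j:j + 2]
        Hb[j:j + 2] = Gl @ Hb[j:j + 2]
        V[:, j:j + 2] = V[:, j:j + 2] @ Gl.T       # V absorbs the left rotations
        if j == 0:
            break
        c, s = rot(Hb[j + 1, j], Hb[j + 1, j - 1]) # right: chase bulge Hb[j+1, j-1]
        Zr = np.array([[c, s], [-s, c]])
        Tb[:, j - 1:j + 1] = Tb[:, j - 1:j + 1] @ Zr
        Hb[:, j - 1:j + 1] = Hb[:, j - 1:j + 1] @ Zr
    return V, Tb[:q], Hb
```

Iterating this function with a (possibly different) real shift per step mimics the real-shift branch of Algorithm 5.3: the returned T stays upper-triangular, H stays upper-Hessenberg, and G V_m T_m = V_{m+1} H̲_m is maintained.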
6. The rational Even-IRA algorithm.
Let P(λ) ∈ R[λ]^{n×n} be some T-even matrix polynomial and let L_P(λ) = λX + Y ∈ R[λ]^{dn×dn} and

K(ζ) = L_P(ζ)^{-T} X L_P(ζ)^{-1} X = L_P(−ζ)^{-1} X L_P(ζ)^{-1} X,  ζ ∉ σ(P),

be defined for P(λ) as in (2.5) and (3.1), respectively. The rational Even-IRA algorithm presented in this section is a method that unifies the Krylov-Schur restart strategy [29] with the spectral transformation K(ζ) (see Section 3 and [24, 22, 31]) and the shift flexibility offered by the rational Arnoldi process [27, 6, 26]. The sparse and structured form of the linearization L_P(λ) (see Theorem 2.5) is exploited for evaluating matrix-vector products with K(ζ) implicitly and efficiently, without ever forming K(ζ) at all (see Section 4). Hence, the memory requirement of the method is essentially that of storing the given matrix polynomial and the vectors from the current search space. In a nutshell, this approach yields a powerful Krylov subspace algorithm for the computation of some eigenvalues of T-even polynomial eigenvalue problems.

The rational Even-IRA algorithm presented next consists of several phases. In the initialization phase (Section 6.1) a rational Arnoldi decomposition is constructed, which is the start and end point of each Krylov-Schur cycle. In the expansion phase (Section 6.2) the size of this decomposition is increased. After the expansion, a QZ decomposition is applied (Section 6.3) to identify eigenvalues that have converged during the current run and are to be locked (Section 6.4). To initialize the algorithm's next cycle, the decomposition is truncated (Section 6.5) and the upper-triangular and upper-Hessenberg forms of the matrices are recovered (Section 6.6). The next iteration then begins with the expansion phase. We now describe the different phases in detail. Let M ∈ N be the number of desired eigenvalues of P(λ) (of L_P(λ), respectively). The very first run of the algorithm begins with the initialization phase.
That is, matrices

(6.1)  V_{M+1} ∈ R^{dn×(M+1)},  T_M ∈ R^{M×M},  H̲_M = [H_M; B_M] ∈ R^{(M+1)×M},
Algorithm 5.3
Rational Arnoldi Expansion
Input: The linearization L_P(λ) = λX + Y ∈ R[λ]^{dn×dn} for a T-even matrix polynomial P(λ) ∈ R[λ]^{n×n} defined in (2.5). A matrix V_{k+1} = [v_1 ⋯ v_{k+1}] ∈ R^{dn×(k+1)} with orthonormal columns, T_k ∈ R^{k×k} upper-triangular and H̲_k ∈ R^{(k+1)×k} in upper-Hessenberg form satisfying G V_k T_k = V_{k+1} H̲_k if G exists. In case k = 0, we set T_0 = [ ] and H̲_0 := [ ]. A number m ∈ N, m > k.
Output: Matrices V_{m+1} = [v_1 ⋯ v_{m+1}] ∈ R^{dn×(m+1)} with orthonormal columns, T_m ∈ R^{m×m} upper-triangular and H̲_m ∈ R^{(m+1)×m} in upper-Hessenberg form that satisfy (5.10) in case G exists.
for j = k+1, …, m do
  pick a shift ζ_j ∈ C
  compute w := K(ζ_j)v_j = (L_P(ζ_j)^{-T} X L_P(ζ_j)^{-1} X)v_j using Section 4
  if ζ_j ∈ R or ζ_j ∈ iR then
    orthogonalize w against v_1, …, v_j and obtain t_{1,j}, …, t_{j,j} ∈ R as in (5.5)
    set v_{j+1} and obtain t_{j+1,j} ∈ R
    form T̂_j ∈ R^{(j+1)×j} and Ĥ_j ∈ R^{(j+1)×j} as in (5.11) and (5.12)
    set V̂_{j+1} = [v_1 ⋯ v_{j+1}]
    apply the bulge-chasing procedure described in Algorithm 5.1 to determine orthogonal matrices Q ∈ R^{(j+1)×(j+1)} and Z ∈ R^{j×j} such that
      ⊲ Q T̂_j Z is upper-triangular with zeros in its last row and
      ⊲ Q Ĥ_j Z =: H̲_j has upper-Hessenberg structure
    define T_j to be the first j rows of Q T̂_j Z
    define V_{j+1} := V̂_{j+1} Q^T
  else
    orthogonalize Re(w) against v_1, …, v_j and obtain t_{1,j}, …, t_{j,j} ∈ R as in (5.14)
    set v_{j+1} and obtain t_{j+1,j} ∈ R
    orthogonalize Im(w) against v_1, …, v_j, v_{j+1} and obtain t_{1,j+1}, …, t_{j+1,j+1} ∈ R as in (5.15)
    set v_{j+2} and obtain t_{j+2,j+1} ∈ R
    form T̂_{j+1} ∈ R^{(j+2)×(j+1)} and Ĥ_{j+1} ∈ R^{(j+2)×(j+1)} as in (5.19) and (5.20)
    set V̂_{j+2} = [v_1 ⋯ v_{j+1} v_{j+2}]
    apply the bulge-chasing procedure described in Algorithm 5.2 to determine orthogonal matrices Q ∈ R^{(j+2)×(j+2)} and Z ∈ R^{(j+1)×(j+1)} such that
      ⊲ Q T̂_{j+1} Z is upper-triangular with zeros in its last row and
      ⊲ Q Ĥ_{j+1} Z =: H̲_{j+1} has upper-Hessenberg structure
    define T_{j+1} to be the first j+1 rows of Q T̂_{j+1} Z
    define V_{j+2} := V̂_{j+2} Q^T
  end if
end for

where B_M = h_{M+1,M} e_M^T for some scalar h_{M+1,M} ∈ R, are computed by Algorithm 5.3. Note that the columns of V_{M+1} are orthonormal, T_M is upper-triangular and H_M has upper-Hessenberg form. In case X is nonsingular, G V_M T_M = V_{M+1} H̲_M holds. If P(λ) has no eigenvalues at infinity, Algorithm 5.3 may be initialized with V_1 = [v_1] (arbitrary and normalized), T_0 = [ ] and H̲_0 = [ ]. In the presence of infinite eigenvalues, a different initialization should be chosen, see Section 6.7.
A cycle of the rational Krylov-Schur algorithm begins and ends with matrices of the form (6.1). Now suppose that, at some stage of the algorithm, s ∈ N eigenvalues have already converged. Assume these have been locked so that they are located in the top-left s × s corners of T_M and H_M (of course, at the algorithm's first run, s = 0).

The first step of the algorithm is the expansion phase, where the above matrices are extended up to a size m > M. This is achieved by performing m − M additional steps of Algorithm 5.3 with the input matrices from (6.1). We call m − M the extension size of the algorithm. We obtain matrices

(6.2)  V_{m+1} ∈ R^{dn×(m+1)},  T_m ∈ R^{m×m},  H̲_m = [H_m; B_m] ∈ R^{(m+1)×m},

where B_m = h_{m+1,m} e_m^T, the columns v_1, …, v_{m+1} ∈ R^{dn} of V_{m+1} = [V_m v_{m+1}] are orthonormal, T_m is upper-triangular and H_m has upper-Hessenberg structure. We partition the matrices in (6.2) in accordance with the number s of locked eigenvalues as

(6.3)  V_m = [V_s V],  T_m = [T_s T′; 0 T],  H_m = [H_s H′; 0 H]  and  B_m = [B_s B],

where T_s, H_s ∈ R^{s×s}, B_s^T = [0 ⋯ 0]^T ∈ R^s and B^T = [0 ⋯ 0 h_{m+1,m}]^T ∈ R^k, where we have set k := m − s, implying T, H ∈ R^{k×k}. Moreover, recall that the eigenvalues of the matrix pair λT_s + H_s are the ones we assumed to be locked.

We may now enter the decomposition-and-reordering phase of the algorithm. To this end, we first compute a QZ decomposition of the matrix pair (T, H). For this purpose, orthogonal matrices Q_1, Z_1 ∈ R^{k×k} can be determined so that Q_1^T T Z_1 = T⋆ ∈ R^{k×k} remains upper-triangular while Q_1^T H Z_1 = H⋆ ∈ R^{k×k} becomes quasi-upper-triangular (with solely 1×1 and 2×2 blocks on its diagonal). Eigenvalue reordering can then be applied to λT⋆ + H⋆ to move unwanted eigenvalues of λT⋆ + H⋆ into the trailing part of its generalized Schur decomposition. That is, two additional orthogonal transformations Q_2, Z_2 ∈ R^{k×k} can be found so that unwanted eigenvalues of λT⋆ + H⋆ move to the south-east corners of the matrices T⋄ := Q_2^T T⋆ Z_2 ∈ R^{k×k} and H⋄ := Q_2^T H⋆ Z_2 ∈ R^{k×k}. Thereby, the matrices T⋄ ∈ R^{k×k} and H⋄ ∈ R^{k×k} stay upper-triangular and quasi-upper-triangular, respectively. Finally, defining Q^T := Q_2^T Q_1^T and Z := Z_1 Z_2, we update (6.3) as follows:

(6.4)  V̂_m = [V_s VQ],  T̂_m = [T_s T′Z; 0 T⋄],  Ĥ_m = [H_s H′Z; 0 H⋄]  and  B̂_m = [B_s B⋄]

with B⋄ := BZ. Notice that B⋄ will now, in general, be a full vector.

With (6.4), the inspection-of-convergence phase of the algorithm begins. That is, the leading components of B⋄ are inspected for convergence, and eigenvalues are locked whenever convergence has taken place. Let H⋄ = [h_{i,j}]_{i,j}, T⋄ = [t_{i,j}]_{i,j} with 1 ≤ i, j ≤ k, and let B⋄ = [b_{s+1} ⋯ b_m]. Starting with r = 1, we check the following two cases:
(a) Whenever h_{r+1,r} = 0 and |b_{s+r}| is below a given tolerance tol, we consider the corresponding eigenvalue h_{r,r}/t_{r,r} as converged. The element b_{s+r} is set to zero and the number r of converged eigenvalues in the current run is increased by one.
(b) Whenever h_{r+1,r} ≠ 0 but ‖[b_{s+r} b_{s+r+1}]‖ is below the given tolerance tol, we consider the pair of complex conjugate eigenvalues corresponding to the 2×2 block

λ [ t_{r,r} t_{r,r+1} ; 0 t_{r+1,r+1} ] + [ h_{r,r} h_{r,r+1} ; h_{r+1,r} h_{r+1,r+1} ]

as converged. The elements b_{s+r} and b_{s+r+1} are both set to zero.
Finally, the number r of converged eigenvalues in the current run is increased by two.

We repeat the locking of eigenvalues as long as (a) or (b) reveals convergence. Once no further convergence is observed, notice that B̂_m := [0 ⋯ 0 b_{s+r+1} ⋯ b_m] (where r is now the total number of eigenvalues locked during the current run). The new total number of converged eigenvalues is s⋆ = s + r. If s⋆ ≥ M (the number of desired eigenvalues), we are done. Otherwise the matrices are truncated to prepare a restart of the algorithm.

If s⋆ < M, the size of the matrices in (6.4) is now decreased to M in the truncation phase to initialize a restart of the process. In particular, let V̂_M ∈ R^{dn×M} be the first M columns of V̂_m and V̂_{M+1} = [V̂_M v_{m+1}], where v_{m+1} denotes the last column of V_{m+1} in (6.2) (note that v_{m+1} has not been touched in all steps up to this point). Moreover, denote the top-left M × M submatrices of T̂_m and Ĥ_m by T̂_M and Ĥ_M, respectively, and the vector obtained from the first M components of B̂_m by B̂_M, i.e., B̂_M = [0 ⋯ 0 b_{s⋆+1} ⋯ b_M]. In the form (6.2) we have

V̂_{M+1} ∈ R^{dn×(M+1)},  T̂_M ∈ R^{M×M},  Ĥ̲_M := [Ĥ_M; B̂_M] ∈ R^{(M+1)×M}.

Notice that Ĥ̲_M will not have Hessenberg structure at this stage of the algorithm since B̂_M will have more nonzero elements than just b_M.

Remark. Care must be taken if the (M+1, M)-element of Ĥ_m is nonzero. In this case a 2×2 block would be cut apart, so M should be decreased by one.

Analogously to (6.3) and (6.4) we may now partition V̂_M, T̂_M and Ĥ_M according to the new number s⋆ of locked and converged Ritz values. This highlights the active part of the decomposition and separates it from the locked part (which does not need to be touched again).
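The QZ decomposition and the eigenvalue reordering used in the decomposition-and-reordering phase map directly to library routines. The SciPy sketch below uses `ordqz` with the built-in 'iuc' selection (eigenvalues inside the unit circle) merely as a stand-in for the algorithm's wanted/unwanted splitting; the matrices are random, not from an actual run:

```python
import numpy as np
from scipy.linalg import ordqz

rng = np.random.default_rng(2)
k = 6
T = np.triu(rng.standard_normal((k, k)), 1) + np.diag(2.0 + rng.random(k))
H = rng.standard_normal((k, k))

# QZ with reordering of the pair (H, T): the selected eigenvalues are moved
# to the leading part of the quasi-triangular pair
HH, TT, alpha, beta, Q, Z = ordqz(H, T, sort='iuc', output='real')

assert np.allclose(Q @ HH @ Z.T, H)
assert np.allclose(Q @ TT @ Z.T, T)
flags = list(np.abs(alpha / beta) < 1)
assert flags == sorted(flags, reverse=True)   # selected eigenvalues come first
```

A callable `sort` argument allows arbitrary selection criteria, e.g. proximity to a target region of the complex plane.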
In particular, we partition

(6.5)  V̂_M = [V_{s⋆} V◦],  T̂_M = [T_{s⋆} T″; 0 T◦],  Ĥ_M = [H_{s⋆} H″; 0 H◦],  B̂_M = [B_{s⋆} B◦],

where V_{s⋆} ∈ R^{dn×s⋆}, T_{s⋆}, H_{s⋆} ∈ R^{s⋆×s⋆} and B_{s⋆}^T = [0 ⋯ 0]^T ∈ R^{s⋆}. Recall that B◦ is in general a full vector with all nonzero entries. Set k⋆ = M − s⋆ so that T◦, H◦ ∈ R^{k⋆×k⋆}.

Our next goal is to transform the matrices in (6.5) back to a decomposition of the form (6.1) in the recovery phase. That is, we determine orthogonal matrices Q, Z ∈ R^{k⋆×k⋆} such that Q^T T◦ Z =: T ∈ R^{k⋆×k⋆} is still upper-triangular, Q^T H◦ Z =: H ∈ R^{k⋆×k⋆} remains in upper-Hessenberg form and B◦Z =: B = h_{M+1,M} e_{k⋆}^T for some scalar h_{M+1,M} ∈ R. Then we update (6.5) to obtain

(6.6)  V_M = [V_{s⋆} V◦Q],  T_M = [T_{s⋆} T″Z; 0 T],  H_M = [H_{s⋆} H″Z; 0 H],  B_M = [B_{s⋆} B]

and with V_{M+1} = [V_M v_{m+1}] we are back with matrices

(6.7)  V_{M+1} ∈ R^{dn×(M+1)},  T_M ∈ R^{M×M},  H̲_M = [H_M; B_M] ∈ R^{(M+1)×M}

as in (6.1), where B_M = h_{M+1,M} e_M^T. The next cycle of the algorithm then begins with the expansion phase as described in Section 6.2. The recovery phase can be carried out by the bulge-chasing process described in Section 6.9.

The overall goal of this algorithm is to achieve B_M = [0 ⋯ 0]. In this case, the M eigenvalues of λT_M + H_M are exact eigenvalues of L_P(λ) and, in turn, their plus/minus square roots are exact eigenvalues of P(λ).

A matrix polynomial P(λ) might have eigenvalues at infinity (see Section 2). The rational Even-IRA algorithm will eventually detect infinite eigenvalues, i.e., in computations in real arithmetic, eigenvalues of very large magnitude might be found. This is detrimental to the algorithm's performance since (i) the detection of very large eigenvalues is, in this case, a wrong result, and (ii) the convergence results after the detection of such an eigenvalue are of unsatisfying accuracy. Therefore, it seems reasonable to eliminate a priori any possibility of convergence to infinity. This guarantees good performance throughout and reliable results.

Assume P(λ) = Σ_{k=0}^{d} P_k λ^k ∈ R[λ]^{n×n} is of degree d. The eigenvectors corresponding to the eigenvalue µ = ∞ are the null vectors of P_d. These can be found by solving the n × n linear system P_d x = 0 with an appropriate method. These vectors can now be used to initialize our algorithm so that convergence for the eigenvalue infinity has already taken place.
For this purpose, let dim(null(P_d)) = t and let {v_1, v_2, …, v_t} ⊂ R^n be some orthonormal basis of null(P_d). We define

V_{t+1} = [v_1 ⋯ v_t v_{t+1}] ∈ R^{dn×(t+1)},  H̲_t = [I_t; 0 ⋯ 0] ∈ R^{(t+1)×t},  and  T_t = 0_{t×t} ∈ R^{t×t},

where the basis vectors v_1, …, v_t are embedded into R^{dn} by padding with zeros. The vector v_{t+1} ∈ R^{dn} can be chosen arbitrarily such that the columns of V_{t+1} are orthonormal. From here on, we start the initialization phase of the rational Even-IRA algorithm described in Section 6 with V_{t+1}, T_t and H̲_t in Algorithm 5.3. Moreover, right from this point on, we define the number s of converged eigenvalues to be t. In other words, with this initialization, convergence to infinity and locking have already occurred before the algorithm actually starts. The algorithm will not reveal further eigenvalues at infinity. The number of desired eigenvalues has to be increased from M to t + M.
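A minimal sketch of this initialization is given below. It computes an orthonormal null-space basis of P_d via the SVD; the embedding of the basis into the first block of R^{dn} is an assumption of this sketch (the exact block position depends on the linearization L_P), and the arbitrary extra column v_{t+1} is omitted:

```python
import numpy as np

def init_for_infinite_eigs(Pd, d, tol=1e-12):
    """Lock the t = dim(null(P_d)) infinite eigenvalues before the iteration:
    returns V (null basis, zero-padded to R^{dn}), T_t = 0 and
    the (t+1) x t matrix [I_t; 0].  Illustrative, names are ours."""
    n = Pd.shape[0]
    U, s, Vt = np.linalg.svd(Pd)
    t = int(np.sum(s <= tol * max(s[0], 1.0)))
    N = Vt[n - t:].T                      # orthonormal basis of null(P_d)
    V = np.zeros((d * n, t))
    V[:n, :] = N                          # assumed embedding: first block
    T = np.zeros((t, t))                  # T_t = 0_{t x t}
    H = np.vstack([np.eye(t), np.zeros((1, t))])   # [I_t; 0]
    return V, T, H
```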
The appropriate choice of shifts is a delicate problem that often depends on user-specified priorities. According to Section 4, a matrix-vector multiplication with K(ζ) ∈ C^{dn×dn} essentially reduces to a system solve with P(ζ) (and P(ζ)^T). If an LU decomposition of P(ζ) is computed, it can be reused as long as the shift does not change. On the other hand, once a change of shift has taken place, a new decomposition has to be computed for the subsequent iterations (until the next change). Hence, in order to keep the algorithm effective, shift changes should not be applied too often. On the other hand, the choice of a good new shift is likely to increase the convergence speed. Two general shift strategies are given below.
(a) Assume the current run of the rational Even-IRA algorithm revealed convergence and (in total) s⋆ eigenvalues are locked; this corresponds to the situation (6.5). Let T◦ = [t◦_{i,j}]_{i,j}, H◦ = [h◦_{i,j}]_{i,j}, B◦ = [b_{s⋆+1} ⋯ b_M] and consider the case h◦_{2,1} = 0. In particular, assuming G exists and regarding V̂_{s⋆+1} (the matrix consisting of the first s⋆+1 columns of V̂_M) and T̂_{s⋆+1}, Ĥ_{s⋆+1} (the (s⋆+1)×(s⋆+1) principal submatrices of T̂_M and Ĥ_M, respectively), we have, in view of (6.5),

G V̂_{s⋆+1} T̂_{s⋆+1} = V̂_{s⋆+1} Ĥ_{s⋆+1} + b_{s⋆+1} v_{m+1} e_{s⋆+1}^T.

In other words,

(6.8)  ‖G V̂_{s⋆+1} T̂_{s⋆+1} − V̂_{s⋆+1} Ĥ_{s⋆+1}‖ = |b_{s⋆+1}|

because ‖v_{m+1}‖ = 1. Therefore, the absolute value of the first entry b_{s⋆+1} of B◦ displays the first residual which was not below the given tolerance tol since, otherwise, the corresponding eigenvalue ξ := h◦_{1,1}/t◦_{1,1} of λT◦ + H◦ located in the top-left 1×1 block would have been locked. Hence, ξ may serve as a good approximation to the next eigenvalue that is about to converge, and ξ might now be chosen as the next shift parameter.
Analogously, whenever h ◦ , = 0, an eigenvalue of the 2 × λT ◦ + H ◦ can be chosen asa new shift.(b) The shift-strategy from (a) can be modified so that the next shift parameteris chosen as ξ = h ◦ , /t ◦ , only if the corresponding residual | b s ⋆ +1 | is above agiven tolerance. In particular, if | b s ⋆ +1 | is already very small, a change of shiftis probably not necessary since the algorithm seems to be “on the right way” toreveal the next convergence soon (e.g. within the next cycle). However, | b s ⋆ +1 | being above a given tolerance might indicate that the current shift is not headingoff to reveal further convergence in the near future. Thus, changing the shift couldbe an appropriate means to speed up convergence in such a situation.Certainly, other shift strategies beside (a) and (b) above and mixtures of both areconceivable. In particular, if one is interested in eigenvalues in a particular region ofthe complex plane, the shift should, of course, be chosen appropriately. We now consider the recovery phase of the rational
Even-IRA algorithm in detail. Therefore, reconsider the matrices obtained in (6.5). We now show how to construct two orthogonal matrices Q, Z ∈ R^{k⋆×k⋆} such that Q^T T◦ Q = T ∈ R^{k⋆×k⋆} remains upper triangular, Q^T H◦ Z = H ∈ R^{k⋆×k⋆} has upper Hessenberg form and B◦ Z = [0 ··· 0 h_{M+1,M}] is a vector of zeros except for some scalar h_{M+1,M} ∈ R in the last position. The matrices Q and Z are the products of a sequence of Givens rotations that constitute our bulge-chasing procedure. Hereby, a Givens rotation Q̃^T ∈ R^{k⋆×k⋆} from the left acts on two rows i and j (with 1 ≤ i, j ≤ k⋆) of T◦ and H◦. Each transformation Q̃^T needs to be applied via Q̃ to the columns i and j of V◦, too. This is implicitly understood in all the following derivations. A Givens rotation Z̃ ∈ R^{k⋆×k⋆} from the right acts on two columns i and j, 1 ≤ i, j ≤ k⋆, of T◦, H◦ and B◦. These transformations do not influence the matrix V◦. Now let B◦ = [b_1 ··· b_{k⋆}]. The bulge-chasing process proceeds as follows:
(a) We apply a Givens rotation Z̃ from the right on the first two columns to eliminate b_1 using b_2. This introduces a bulge in the position (3,1) in H◦ and in the position (2,1) in T◦. A rotation Q̃^T acting on rows one and two from the left can be used to eliminate the bulge in T◦. The bulge on the second subdiagonal in H◦ remains in its position. Now the first element in B◦ is zero and the analogous process can be used to eliminate the second element in B◦. As before, the new bulge in the position (4,2) in H◦ (i.e., on the second subdiagonal of H◦) remains in its position.
(b) The third element in B◦ can be eliminated as in (a) above and two new elements in the positions (4,3) in T◦ and (5,3) in H◦ show up. As in (a), this bulge in H◦ is accepted for the moment. However, with the elimination of the bulge in T◦ by a Givens rotation from the left on rows three and four, an additional bulge in H◦ will appear in the position (4,1) (i.e., on the third subdiagonal of H◦). This bulge can be eliminated by a Givens rotation from the right on the first and second columns of H◦, introducing again a bulge in the (2,1) position in T◦. A rotation applied to the first two rows from the left is used to eliminate the bulge in T◦.
(c) The process from (b) now continues for all t > 3. That is, the elimination of the t-th entry in B◦ is achieved by a Givens rotation from the right on columns t and t+1. Consequently, bulges appear in the positions (t+2, t) in H◦ and (t+1, t) in T◦. The elimination of the bulge in T◦ by a Givens rotation from the left introduces an additional bulge in the position (t+1, t−2) in H◦. This bulge is chased off the top-left corner of T◦ and H◦ by applying Givens rotations alternatingly from the left and right.
If the bulge-chasing process described in (a) to (c) is completely carried out, in the end T◦ is still of upper-triangular form, B◦ = [0 ··· 0 h_{M+1,M}] and H◦ is a matrix that now has two full subdiagonals (i.e., all entries below the second subdiagonal of H◦ are zero). Now the transformation process can be continued and the second subdiagonal of H◦ can be eliminated from the lower-right corner to the top-left corner. A standard bulge chase (Givens rotations alternatingly from the left and right) is adequate to achieve this. It is important to note that no Givens rotation is required that touches the last column of T◦, H◦ and B◦. Therefore, B◦ remains as it is and we obtain the desired form.
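The first elimination step of (a) can be made concrete with explicit 2×2 Givens rotations. The following Python/NumPy sketch (the helper names are ours) eliminates b_1 by a rotation from the right, which creates the bulges at positions (2,1) in T◦ and (3,1) in H◦, and then removes the bulge in T◦ by a rotation from the left, after which the bulge on the second subdiagonal of H◦ indeed remains:

```python
import numpy as np

def rot(c, s):
    """2x2 Givens rotation [[c, s], [-s, c]]."""
    return np.array([[c, s], [-s, c]])

def elim_first(a, b):
    """Return (c, s) with [a, b] @ rot(c, s) = [0, r]."""
    r = np.hypot(a, b)
    return (b / r, a / r) if r else (1.0, 0.0)

def elim_lower(x, y):
    """Return (c, s) with rot(c, s).T @ [x, y]^T = [r, 0]^T."""
    r = np.hypot(x, y)
    return (x / r, -y / r) if r else (1.0, 0.0)

rng = np.random.default_rng(0)
k = 5
T = np.triu(rng.standard_normal((k, k)))         # upper triangular
H = np.triu(rng.standard_normal((k, k)), -1)     # upper Hessenberg
B = rng.standard_normal(k)                       # row vector [b_1 ... b_k]

# Givens rotation from the right on columns one and two eliminates b_1 ...
G = rot(*elim_first(B[0], B[1]))
T[:, 0:2] = T[:, 0:2] @ G
H[:, 0:2] = H[:, 0:2] @ G
B[0:2] = B[0:2] @ G
# ... and introduces bulges at (2,1) in T and (3,1) in H.

# Givens rotation from the left on rows one and two removes the bulge in T;
# it leaves row three untouched, so the bulge at (3,1) in H remains.
G = rot(*elim_lower(T[0, 0], T[1, 0]))
T[0:2, :] = G.T @ T[0:2, :]
H[0:2, :] = G.T @ H[0:2, :]
```

The left rotation only mixes rows one and two, which is why the fill-in on the second subdiagonal of H has to be tolerated and chased later.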
7. Numerical experiments.
In this section, we briefly describe the results of two numerical experiments to give a proof of concept for the algorithm described in the previous section. To this end, we set up a basic implementation of the rational Even-IRA algorithm in MATLAB R2020a and compared our results to those found with the MATLAB function polyeig. As the degree of P(λ) is even in both examples, L_P(λ) was constructed as in (2.5) with M_P(λ) from Definition 2.3 (b). We initialize the algorithm as explained in Section 6.7. In contrast to the computation of eigenvalues with polyeig, the rational Even-IRA algorithm is designed to find only a few eigenvalues of a matrix polynomial. Therefore, a comparison of the computational times for both algorithms seems inappropriate here.
Our first example is taken from [23], see also butterfly in [7]. Here, the matrix polynomial P(λ) = Σ_{j=0}^{4} P_j λ^j under consideration is of degree four. The matrix coefficients are built from several Kronecker products as follows: we set m = 10 and n = m² = 100. Let N denote the m × m nilpotent Jordan matrix with ones on the first subdiagonal and define P̃_0 = (1/6)(4 I_m + N + N^T), P̃_1 = N − N^T, P̃_2 = −(2 I_m − N − N^T), P̃_3 = P̃_1 and P̃_4 = −P̃_0. Moreover, we set P_j = c_{j1} I_m ⊗ P̃_j + c_{j2} P̃_j ⊗ I_m with positive constants c_{j1}, c_{j2} (the default values of butterfly in [7]), so that P(λ) = Σ_{j=0}^{4} P_j λ^j has size 100 × 100. An eigenvalue is considered as converged if its corresponding residual becomes less than the prescribed tolerance. With a suitably chosen initial shift ζ ∈ C, the rational Even-IRA finds the eigenvalues displayed in Figure 1 (left plot) in 18 iterations. Compared to the computation with the MATLAB function polyeig, we observe agreement in both the real and imaginary parts of the computed values to at least the first ten decimal places. In this experiment, the shift was changed once, see Figure 1, so an LU decomposition of P(ζ) had to be computed twice (see also the discussion subsequent to Remark 4.2).
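For reference, the construction can be sketched as follows (Python/NumPy; the scaling of P̃_0 follows our reading of butterfly in [7], and the constants c_{j1}, c_{j2} below are placeholders, not the actual parameter values). The final assertions illustrate the T-even structure: even-index coefficients are symmetric, odd-index ones skew-symmetric:

```python
import numpy as np

m = 10
N = np.diag(np.ones(m - 1), -1)          # m x m nilpotent Jordan block
I = np.eye(m)

# Building blocks: symmetric for even index, skew-symmetric for odd index.
Pt = [(1 / 6) * (4 * I + N + N.T),       # ~P_0 (scaling as in butterfly [7])
      N - N.T,                           # ~P_1
      -(2 * I - N - N.T)]                # ~P_2
Pt += [Pt[1], -Pt[0]]                    # ~P_3 = ~P_1 and ~P_4 = -~P_0

c = np.ones((5, 2))                      # placeholder constants c_{j1}, c_{j2} > 0

# P_j = c_{j1} (I_m (x) ~P_j) + c_{j2} (~P_j (x) I_m), each of size m^2 x m^2.
P = [c[j, 0] * np.kron(I, Pt[j]) + c[j, 1] * np.kron(Pt[j], I) for j in range(5)]

# T-even structure: P(-lambda)^T = P(lambda).
for j in range(5):
    assert np.allclose(P[j].T, (-1) ** j * P[j])
```

Note that the structure is independent of the (positive) constants: Kronecker products of symmetric (skew-symmetric) blocks with the identity stay symmetric (skew-symmetric).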
Remark. The spectral transformation from L_P(ζ) = ζX + Y to K(ζ) = L_P(ζ)^{-T} X L_P(ζ)^{-1} X preserves ± matching pairs of eigenvalues (+µ, −µ) as both are mapped to the same eigenvalue θ = (µ² − ζ²)^{-1} (recall Section 3). Therefore, each eigenvalue of K(ζ) has even multiplicity. In exact arithmetic, multiple eigenvalues will not be captured (see [24, Sec. 3]) by the Arnoldi iteration. However, as round-off may eventually create them, the authors of the Even-IRA algorithm suggest an additional X-orthogonalization of the Krylov basis, see [24, Lem. 2.3]. Requiring that the basis of the underlying Krylov space is X-orthogonal (that is, ⟨v_i, Xv_j⟩ = 0 for all i ≠ j) hinders the algorithm from finding multiple copies of the same eigenvalue. The X-orthogonalization procedure suggested in [24] cannot be directly applied here. The reason for this is that the rational Even-IRA algorithm as outlined in Section 6 handles complex shifts differently in comparison to [24].
For our second example we chose the model of a rolling tire, see [10] or [4, Sec. 4.2.2]. Here P(λ) = λ²M + λG + K, where M, G, K are of size 2697 × 2697. M and K are symmetric whereas G is skew-symmetric. Moreover, M and K are positive definite, which implies that P(λ) has eigenvalues exclusively on the imaginary axis (see [21, Sec. 1]). Those vary widely in magnitude. Here we intend to find the eigenvalues of smallest magnitude. To this end, we consider rev P(λ) = λ²K + λG + M since the eigenvalues of rev P(λ) of largest magnitude correspond via their reciprocals to the eigenvalues of P(λ) of smallest magnitude. We have applied the rational Even-IRA to rev P(λ) with the same parameters as in the previous example, a small purely imaginary initial shift and the shift strategy from Section 6.8 to find 14 eigenvalues of rev P(λ) of largest magnitude. The eigenvalues of rev P(λ) computed with the rational Even-IRA algorithm and polyeig are displayed in Figure 1 (right plot). Eight restarts have been performed.
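The reversal argument can be confirmed on a small gyroscopic toy system. The sketch below (Python/NumPy, with a hypothetical size n = 6 instead of 2697) computes the eigenvalues of P(λ) = λ²M + λG + K and of rev P(λ) = λ²K + λG + M via companion linearizations and checks that the two spectra are reciprocal to each other:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n))
M = A @ A.T + n * np.eye(n)              # symmetric positive definite mass matrix
A = rng.standard_normal((n, n))
K = A @ A.T + n * np.eye(n)              # symmetric positive definite stiffness matrix
G = rng.standard_normal((n, n))
G = G - G.T                              # skew-symmetric gyroscopic matrix

def quad_eigs(M2, G1, K0):
    """Eigenvalues of lambda^2*M2 + lambda*G1 + K0 via a companion linearization."""
    n = M2.shape[0]
    C = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-np.linalg.solve(M2, K0), -np.linalg.solve(M2, G1)]])
    return np.linalg.eigvals(C)

mu = quad_eigs(M, G, K)                  # eigenvalues of P(lambda), purely imaginary
mu_rev = quad_eigs(K, G, M)              # eigenvalues of rev P(lambda)

# Every eigenvalue of rev P is the reciprocal of an eigenvalue of P, so the
# largest-magnitude eigenvalues of rev P yield the smallest-magnitude ones of P.
for val in mu:
    assert np.min(np.abs(1.0 / val - mu_rev)) < 1e-6
```

Since K is positive definite, zero is not an eigenvalue of P(λ), so taking reciprocals is always well defined here.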
The imaginary parts of the eigenvalues found by the MATLAB function polyeig and those values on the imaginary axis found by the rational
Even-IRA algorithm coincide to at least ten significant digits. Clearly, in contrast to polyeig, the rational
Even-IRA algorithm anticipates the fact that all eigenvalues are located on the imaginary axis.
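The preservation of ± matching eigenvalue pairs noted in the remark above can also be verified directly on a small random T-even pencil λX + Y (Python/NumPy sketch with hypothetical dimensions; K(ζ) is formed explicitly here, which one would never do for large problems):

```python
import numpy as np

rng = np.random.default_rng(42)
k = 8                                    # even size, so a random skew-symmetric X is nonsingular
X = rng.standard_normal((k, k))
X = X - X.T                              # skew-symmetric
Y = rng.standard_normal((k, k))
Y = Y + Y.T                              # symmetric, so lambda*X + Y is T-even
zeta = 0.3

L = zeta * X + Y                         # the pencil evaluated at the shift
K = np.linalg.solve(L.T, X @ np.linalg.solve(L, X))   # K(zeta)

mu = np.linalg.eigvals(np.linalg.solve(-X, Y))   # pencil eigenvalues: (mu*X + Y)v = 0
theta = np.linalg.eigvals(K)

# mu and -mu are both mapped to theta = 1/(mu^2 - zeta^2), so every
# eigenvalue of K(zeta) occurs with even multiplicity.
for val in mu:
    assert np.min(np.abs(1.0 / (val ** 2 - zeta ** 2) - theta)) < 1e-6
```

A short calculation behind the assertion: if (µX + Y)v = 0, then (ζX + Y)v = (ζ − µ)Xv and (Y − ζX)v = −(µ + ζ)Xv, so K(ζ)v = (µ² − ζ²)^{-1} v.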
Fig. 1. Left: 24 eigenvalues found by the rational
Even-IRA algorithm (red stars) for the example butterfly in [7]. Blue circles indicate the eigenvalues computed via polyeig. Due to the preservation of ± matching eigenvalue pairs, only twelve eigenvalues were required to be computed by the rational Even-IRA algorithm. Crosses indicate the two shifts that have been used. Right: 14 eigenvalues (in the upper half plane) found by the rational
Even-IRA algorithm (red stars) for rev P(λ) with P(λ) = λ²M + λG + K for a gyroscopic system (cf. [4, Sec. 4.2]). Blue circles correspond to eigenvalues of rev P(λ) computed via polyeig. In contrast to polyeig, the rational Even-IRA algorithm recognizes the fact that the eigenvalues are all located on the imaginary axis.
Remark. We also tried to compute the eigenvalues of P(λ) with smallest magnitude directly, i.e., with P(λ) instead of rev P(λ) being used. The shift often increased during the algorithm's run and tended towards eigenvalues of larger magnitude. Thus, according to our experiments, the basic shift strategy from Section 6.8 is not appropriate in this situation and a more sophisticated strategy has to be used.
In conclusion, the overall success of our algorithm depends to a large extent on the chosen shift strategy. The method described in Section 6.8 works well if one is interested in accelerating convergence. However, if certain areas of the complex plane are to be "scanned" for eigenvalues, a more subtle shift technique is needed. This is not discussed further here.
8. Conclusions.
In this work we have presented a method to compute parts of the spectrum of a T-even matrix polynomial. We developed our algorithm on the basis of the Even-IRA algorithm from [24], which is a method for computing a few eigenvalues of a T-even (i.e., symmetric/skew-symmetric) matrix pencil, and the ideas developed in [3] on the rational SHIRA algorithm. Given a T-even matrix polynomial P(λ), we introduced a special linearization L_P(λ) to preserve its T-even structure. We showed that the specific block structure and sparsity of L_P(λ) = λX + Y enable us to solve systems L_P(ζ)x = y in an efficient way. We applied this technique to accelerate the computation of matrix-vector products with the matrix K(ζ) = L_P(ζ)^{-T} X L_P(ζ)^{-1} X to build the underlying Krylov space. An eigenvalue θ of K(ζ) gives rise to a ± matching pair of eigenvalues +√(1/θ + ζ²) and −√(1/θ + ζ²) of L_P(λ). As suggested in [24], we used this spectral transformation (i.e., the matrix K(ζ)) and the implicitly restarted Krylov-Schur algorithm to find eigenvalues of K(ζ). Moreover, we modified the Even-IRA algorithm and turned it into a rational method that is able to handle changes of the shift parameter during the iteration.
A question for future work that is naturally related to our algorithm is the existence of a compact representation of the Krylov basis similar to the one developed
in [30] for other types of linearizations (which are not of the same form as L_P(λ)). This would be an appropriate means to decrease the cost of storing the Krylov basis vectors. Acknowledgments.
Work on this manuscript started when all three authors visited the Courant Institute of New York University. We would like to give special thanks to our host Michael Overton, who made this research stay possible!
REFERENCES[1]
Bai, Z., Demmel, J., Dongarra, J., Ruhe, A. and Van der Vorst, H., Templates for the Solution of Algebraic Eigenvalue Problems, Society for Industrial and Applied Mathematics, Philadelphia, 2000.[2]
Bassour, M. , Hamiltonian polynomial eigenvalue problems , Journal of Applied Mathematicsand Physics, 8 (2020), pp. 609 – 619.[3]
Benner, P. and Effenberger, C. , A rational SHIRA method for the Hamiltonian eigenvalueproblem , Taiwanese Journal of Mathematics, 14 (2010), pp. 805 – 823.[4]
Benner, P., Fassbender, H. and Stoll, M. , Solving large-scale quadratic eigenvalue problemswith Hamiltonian eigenstructure using a structure-preserving Krylov subspace method ,Electronic Transactions on Numerical Analysis, 29 (2008), pp. 212–229.[5]
Benner, P., Fassbender, H. and Stoll, M. , A Hamiltonian Krylov-Schur-type method basedon the symplectic Lanczos process , Linear Algebra and its Applications, 435 (2011), pp. 578– 600.[6]
Berljafa, M. and Güttel, S., Generalized rational Krylov decompositions with an application to rational approximation, SIAM Journal on Matrix Analysis and Applications, 36 (2015), pp. 894–916.[7]
Betcke, T., Higham, N. J., Mehrmann, V. and Tisseur, F. , NLEVP: A collection of non-linear eigenvalue problems , ACM Transactions on Mathematical Software, 39 (2011).[8]
Demmel, J. W. , Applied Numerical Linear Algebra , Society for Industrial and Applied Math-ematics, Philadelphia, 1997.[9]
Dopico, F. M., Lawrence, P. W., Pérez, J. and Van Dooren, P., Block Kronecker linearizations of matrix polynomials and their backward errors, Numerische Mathematik, 140 (2018), pp. 373–426.[10]
Elssel, K. and Voss, H. , Reducing huge gyroscopic eigenproblems by automated multi-levelsubstructuring , Archive of Applied Mechanics, 76 (2006), pp. 171 – 179.[11]
Fassbender, H. and Saltenberger, P. , On vector spaces of linearizations for matrix poly-nomials in orthogonal bases , Linear Algebra and its Applications, 525 (2017), pp. 59 –83.[12]
Fassbender, H. and Saltenberger, P. , On a modification of the EVEN-IRA algorithm for thesolution of T-even polynomial eigenvalue problems , Proceedings in Applied Mathematicsand Mechanics (PAMM), (2018).[13]
Higham, N. J., Mackey, D. S. and Tisseur, F. , The conditioning of linearizations of matrixpolynomials , SIAM Journal on Matrix Analysis and Applications, 28 (2006), pp. 1005 –1028.[14]
Higham, N. J., Mackey, D. S., Mackey, N. and Tisseur, F. , Symmetric linearizations formatrix polynomials , SIAM Journal on Matrix Analysis and Applications, 29 (2006), pp. 143– 159.[15]
Kågström, B., A direct method for reordering eigenvalues in the generalized real Schur form of a regular matrix pair (A, B), In M. S. Moonen, G. H. Golub, and B. L. R. De Moor, editors, Linear Algebra for Large Scale and Real-Time Applications, Kluwer Academic Publishers, Amsterdam, (1993), pp. 195–218.[16]
Mackey, D. S., Structured linearizations for matrix polynomials, MIMS EPrint 2006.68, Manchester Institute for Mathematical Sciences, Manchester, 2006.[17]
Mackey, D. S. and Perovic, V. , Linearizations of matrix polynomials in Bernstein bases ,Linear Algebra and its Applications, 501 (2016), pp. 162 – 197.[18]
Mackey, D. S. and Perovic, V. , Linearizations of matrix polynomials in Newton bases , LinearAlgebra and its Applications, 556 (2018), pp. 1 – 45.[19]
Mackey, D. S., Mackey, N., Mehl, C. and Mehrmann, V. , Structured polynomial eigenvalueproblems: good vibrations from good linearizations , SIAM Journal on Matrix Analysis andApplications, 28 (2006), pp. 1029–1051.[20]
Mackey, D. S., Mackey, N., Mehl, C. and Mehrmann, V. , Vector spaces of linearizations
for matrix polynomials, SIAM Journal on Matrix Analysis and Applications, 28 (2006), pp. 971–1004.[21] Meerbergen, K. and Tisseur, F., The quadratic eigenvalue problem, SIAM Review, 43 (2001), pp. 235–286.[22]
Mehrmann, V. and Watkins, D., Structure-preserving methods for computing eigenpairs of large sparse skew-Hamiltonian/Hamiltonian pencils, SIAM Journal on Scientific Computing, 22 (2001), pp. 1905–1925.[23]
Mehrmann, V. and Watkins, D. , Polynomial eigenvalue problems with Hamiltonian structure ,Electronic Transactions on Numerical Analysis, 13 (2002), pp. 106 – 118.[24]
Mehrmann, V., Schröder, C. and Simoncini, V., An implicitly-restarted Krylov subspace method for real symmetric/skew-symmetric eigenproblems, Linear Algebra and its Applications, 436 (2012), pp. 4070–4087.[25]
Moler, C. B. and Stewart, G. W. , An algorithm for generalized matrix eigenvalue problems ,SIAM Journal on Numerical Analysis, 10 (1973), pp. 241–256.[26]
Ruhe, A. , Rational Krylov sequence methods for eigenvalue computation , Linear Algebra andits Applications, 58 (1984), pp. 391–405.[27]
Ruhe, A. , Rational Krylov: a practical algorithm for large sparse nonsymmetric matrix pencils ,SIAM Journal on Scientific Computing, 19 (1998), pp. 1535–1551.[28]
Saltenberger, P. , On different concepts for the linearization of matrix polynomials andcanonical decompositions of structured matrices with respect to indefinite sesquilinearforms , Logos Verlag, Berlin, 2019.[29]
Stewart, G. W. , A Krylov-Schur algorithm for large eigenproblems , SIAM Journal on MatrixAnalysis and Applications, 23 (2002), pp. 601–614.[30]
Van Beeumen, R., Meerbergen, K. and Michiels, W., Compact rational Krylov methods for nonlinear eigenvalue problems, SIAM Journal on Matrix Analysis and Applications, 36 (2015), pp. 820–838.