A residual concept for Krylov subspace evaluation of the φ matrix function
Mike A. Botchev, Leonid A. Knizhnerman, Eugene E. Tyrtyshnikov
AA RESIDUAL CONCEPT FOR KRYLOV SUBSPACE EVALUATIONOF THE ϕ MATRIX FUNCTION
M.A. BOTCHEV ∗ , L.A. KNIZHNERMAN † , AND
E.E. TYRTYSHNIKOV ‡ Abstract.
An efficient Krylov subspace algorithm for computing actions of the ϕ matrix functionfor large matrices is proposed. This matrix function is widely used in exponential time integration,Markov chains and network analysis and many other applications. Our algorithm is based on areliable residual based stopping criterion and a new efficient restarting procedure. For matriceswith numerical range in the stable complex half plane, we analyze residual convergence and provethat the restarted method is guaranteed to converge for any Krylov subspace dimension. Numericaltests demonstrate efficiency of our approach for solving large scale evolution problems resulting fromdiscretized in space time-dependent PDEs, in particular, diffusion and convection–diffusion problems. Key words. phi matrix function, Krylov subspace methods, exponential time integration,matrix exponential, restarting, exponential residual
AMS subject classifications.
1. Introduction.
Exponential time integration is an actively developing re-search field [41], with numerous successful applications in challenging large scalecomputations, such as elastic wave equations [43], wind farm simulations [30], powerdelivery network analysis [69], large circuit analysis [70] vector finite element dis-cretizations of Maxwell’s equations [12] and photonic crystal modeling [9, 11].In this paper we propose a Krylov subspace method for solving initial-value prob-lem (IVP)(1.1) y (cid:48) ( t ) = − Ay ( t ) + g, y (0) = v, t (cid:62) , where the matrix A ∈ R n × n and vectors g, v ∈ R n ( g (cid:54) = 0) are given and the problemsize n is supposed to be large, i.e., n (cid:29)
1. Using the variation of constants formula(see, e.g., [45, Chapter 2.3]) y ( t ) = exp( − tA ) v + (cid:20)(cid:90) t exp( − ( t − s ) A ) ds (cid:21) g, t (cid:62) , we can write y ( t ) in the form(1.2) y ( t ) = v + tϕ ( − tA )( g − Av ) , t (cid:62) . Here ϕ ( − tA )( g − Av ) is a matrix-vector product with the matrix function ϕ ( − tA )and ϕ is defined as(1.3) ϕ ( z ) ≡ e z − z , z ∈ C , where we set ϕ (0) = 1 so that ϕ is an entire function. Thus, the matrix function ϕ provides the exact solution (1.2) to system (1.1). In other words, solving IVP (1.1) is ∗ Corresponding author. Keldysh Institute of Applied Mathematics, Russian Academy of Sciences,Miusskaya Sq. 4, Moscow 125047, Russia, [email protected]. † Mathematical Modelling Department, Central Geophysical Expedition, Narodnogo OpolcheniyaSt., 38, Bldg. 3, Moscow 123298, Russia, [email protected]. ‡ Marchuk Institute of Numerical Mathematics, Russian Academy of Sciences, Gubkin St. 8, Mos-cow 119333, Russia, [email protected]. Work of this author is supported by the Rus-sian Science Foundation grant No. 19-11-00338. 1 a r X i v : . [ m a t h . NA ] O c t quivalent to evaluating the matrix function ϕ times a vector in (1.2). The methodfor solving (1.1) we propose in this paper is based on an efficient Krylov subspaceevaluation of the matrix function ϕ ( − tA ) in (1.2).For homogeneous ODEs (ordinary differential equations), i.e., for g = 0, the exactsolution of (1.1) takes the well-known form y ( t ) = exp( − tA ) v . Therefore, computingmatrix-vector products with the matrix function ϕ and the matrix exponential is animportant task in exponential time integration [41], occurring independently as well aswithin higher-order ODE Krylov subspace methods [42, 32] or splitting schemes [36, 8].For large problems a direct evaluation of the matrix function based on dense linearalgebra techniques [50, 37] is out of question. Therefore, the matrix-vector productswith the matrix function are approximately computed, rather than the matrix functionitself. Krylov subspace methods have been a successful tool for doing this since theend of the eighties; we list chronologically [54, 65, 23, 46, 57, 24, 39]. This progress innumerical linear algebra has triggered advances in numerical time integration methods(see, e.g., [31, 40, 18]) and a revival of the exponential time integrators [41], whichhave been developed since the sixties [19, 49, 48].In this work we are interested in solving large scale problems (1.1) which result,for instance, within a method of lines, when spatially discretized PDEs (partial dif-ferential equations) have to be efficiently solved in time. This means that A doesnot have to be symmetric and is often far from a normal matrix. Furthermore, fineand nonuniform grids as well as realistic coefficient values in the PDE may lead toa stiffness in the problem, i.e., the eigenvalues of A may significantly vary in magni-tude [29]. Taking into account unavoidable space discretization errors, error producedby a time integrator does not have to be extremely small in these problems. Usuallyabsolute time errors in the range [10 − , − ] suffice and this is the range we aim here.As we see in numerical tests, delivering accuracies in this range can be a challengefor existing exponential solvers. 
The proposed method appears to carry out this tasksuccessfully.Krylov subspace exponential integrators are good candidates for solving largescale problems with sparse nonnormal matrices A , such as space-discretized PDEs.Indeed, these solvers have sufficient stability properties and often lead to an efficientcompromise between standard implicit time integrators (which may occur too expen-sive in large scale applications) and explicit stabilized solvers. Moreover, as Krylovsubspace methods can significantly profit from eigenvalue clustering [67], their appli-cation for solving stiff problems can be successful [33, 20, 16, 15] because stiffnesstypically means a large discrepancy in eigenvalues. Other approaches, not employingKrylov subspaces, for evaluating matrix exponential and related matrix functions oflarge sparse matrices include the Chebyshev polynomials, scaling and squaring withTaylor or Pad´e approximations and contour integrals [64, 21, 59, 17, 4].Our approach for computing ϕ matrix-vector products is essentially based ontwo ingredients presented in this paper: (i) a residual notion for ϕ matrix functionsand (ii) a restarting, i.e., an algorithm aiming to preserve convergence while keepingthe Krylov dimension k bounded, k (cid:54) k max [25]. The residual notion leads to areliable stopping criterion and facilitates the proposed restarting. Restarting is oftenindispensable for solving large scale problems because the size n of the matrix A tends to be extremely large in real life applications. Thus, storing k max Krylov basisvectors of size n can be prohibitively expensive for n (cid:29) k max is moderatelylarge. Therefore, in the tests we restrict k max by 30. Numerous restarting algorithmsfor Krylov subspace evaluations of the matrix exponential have been proposed, forinstance [2, 18, 26, 52, 10]. Restarting for general matrix functions include [27, 35]. wo restarting techniques have been designed specifically for the ϕ matrix functions,namely [60] and [53]. They both are based on error estimates and choose a timeinterval length t such that solving (1.1) by k max Krylov steps yields a sufficientlysmall error. In the numerical experiments presented below we show that our methodperforms favorably compared to the codes based on these techniques.Furthermore, we provide convergence analysis of the method, obtaining upperbounds for the residual norm with no asymptotic estimates involved, and show thatthe restarted algorithm is guaranteed to converge for any restart length, i.e., for anyKrylov dimension k max .The structure of this paper is as follows. In Section 2 we describe details of theKrylov subspace approximation and introduce a residual concept for the ϕ matrixfunction. In Section 3 some observations on the residual behavior are given andthe restarting algorithm, based on the observed behavior, is formulated. Section 4is devoted to residual convergence analysis for both symmetric and nonsymmetricmatrices. Based on this analysis, in Section 5 we show convergence of the restartedalgorithm. Section 6 presents numerical experiments. Finally, some conclusions aredrawn in Section 7.
2. Krylov subspace evaluations of the ϕ matrix function.2.1. Residual notion for the ϕ matrix function. Throughout the paper weassume that(2.1) Re( z ∗ Az ) (cid:62) , ∀ z ∈ C n , where the function Re( · ) is the real part of its argument, and, unless reported other-wise, (cid:107) · (cid:107) denotes the vector or matrix 2-norm.To keep our presentation simpler, we first assume that the initial value vectorin (1.1) is zero: v = 0. In this case relation (1.2) takes a form(1.2 (cid:48) ) y ( t ) = tϕ ( − tA ) g, t (cid:62) , and, thus, to solve (1.1) we have to evaluate the action of the matrix function ϕ ( − tA )on the vector g . Let upper Hessenberg matrix H k ∈ R ( k +1) × k and matrix V k +1 = (cid:2) v v . . . v k v k +1 (cid:3) ∈ R n × ( k +1) , v = g/ (cid:107) g (cid:107) , with orthonormal columns v j , be the matrices which result after carrying out k steps ofthe Arnoldi process. These matrices satisfy the Arnoldi decomposition [34, 55, 67, 58](2.2) AV k = V k +1 H k = V k H k + h k +1 ,k v k +1 e Tk . Here H k = V Tk AV k ∈ R k × k consists of the first k rows of H k and the columns v , . . . , v k span the Krylov subspace K k ( A, g ) = span( g, Ag, . . . , A k − g ) . A standard way to compute a Krylov subspace approximation y k ( t ) to y ( t ) in (1.2 (cid:48) )is to set [67, Chapter 11](2.3) y k ( t ) := V k tϕ ( − tH k )( βe ) , with β = (cid:107) g (cid:107) and e = [1 , , . . . , T ∈ R k . This expression for y k ( t ) can be derivedby projecting the IVP (1.1) onto the Krylov subspace K k ( A, g ) = colspan( V k ) in the alerkin sense. Indeed, let us define a residual r k ( t ) of y k ( t ) with respect to (1.1),i.e., let(2.4) r k ( t ) ≡ − Ay k ( t ) + g − y (cid:48) k ( t ) . Now consider y k ( t ) ≈ y ( t ) , y k ( t ) = V k u ( t )such that its residual is kept orthogonal to the Krylov subspace: V Tk r k ( t ) = 0. Since g = βV k e , this orthogonality condition leads to a projected IVP in u ( t ) : R → R k (2.5) u (cid:48) ( t ) = − H k u ( t ) + βe , u (0) = 0 , where u (0) = 0 follows from the assumption y (0) = 0. With u ( t ) = tϕ ( − tH k )( βe ),we see that V k u ( t ) yields (2.3).Now it is easy to observe that the residual (2.4) is readily computable in thecourse of the Arnoldi process:(2.6) r k ( t ) (2.3) = − AV k u ( t ) + g − V k u (cid:48) ( t ) (2.5) = − AV k u ( t ) + g − V k ( − H k y ( t ) + βe )= − AV k u ( t ) + g + V k H k u ( t ) − g (2.2) = − h k +1 ,k v k +1 e Tk u ( t ) , where the bracketed numbers above the equality signs indicate the relations used inthe derivation.In case the initial value vector v in (1.1) is not zero, we introduce a function˜ y ( t ) = y ( t ) − v which satisfies an IVP˜ y (cid:48) ( t ) = − A ˜ y ( t ) + ˜ g, ˜ y (0) = 0 , t (cid:62) , with ˜ g = g − Av . Using (1.2) for this transformed problem we see that ˜ y ( t ) = tϕ ( − tA )˜ g and, thus, a Krylov subspace approximation ˜ y k ( t ) to ˜ y ( t ), as well as its residual ˜ r k ( t )(cf. (2.4)), can be computed as discussed above. Our Krylov subspace approximationto y ( t ) is then y k ( t ) = ˜ y k ( t ) + v . For this approximation we introduce the residual r k ( t ) in the same way as for the case v = 0: see (2.4). 
Then we have r k ( t ) = − Ay k ( t ) + g − y (cid:48) k ( t ) = − A ˜ y k ( t ) − Av + g − y (cid:48) k ( t ) = − A ˜ y k ( t ) + ˜ g − ˜ y (cid:48) k ( t ) = ˜ r k ( t ) , i.e., the residual r k ( t ) coincides with the residual ˜ r k ( t ) for the transformed solution˜ y k ( t ) and, hence, is also easily computable.Note that the error (cid:15) k ( t ) ≡ y ( t ) − y k ( t ) of the Krylov approximation y k ( t ) satisfiesan IVP(2.7) (cid:15) (cid:48) k ( t ) = − A(cid:15) k ( t ) + r k ( t ) , (cid:15) k (0) = 0 . This shows that the introduced residual can be seen as the backward error for theapproximate solution y k ( t ). This is similar to the well known property of the linearsystem residual: if ˜ x approximates exact solution x of a linear system Ax = b thenfor its residual ˜ r = b − A ˜ x we have A ( x − ˜ x ) = ˜ r . Using (2.1),(2.7) we can bound theerror norm as [10, Lemma 4.1](2.8) (cid:107) (cid:15) k ( t ) (cid:107) (cid:54) tϕ ( − tω ) max s ∈ [0 ,t ] (cid:107) r k ( s ) (cid:107) , where ω (cid:62) .2. ϕ via the matrix exponential, two approaches. If A is nonsingular,a naive approach to compute y := ϕ ( − A ) g for a given g would be to solve v from Av = − g and evaluate y := exp( − A ) v − v . This means that we replace (1.1) byan equivalent IVP with the initial value v and a zero source term. Comparing thisapproach with Krylov subspace evaluation of y := ϕ ( − A ) g shows that both Krylovsubspace methods (for y := exp( − A ) v − v and for y := ϕ ( − A ) g ) require approximatelythe same number of Krylov steps. However, the additional computational costs forsolving Av = − g are not paid off. Hence, unless the systems with A can be solved ata negligible cost, computing actions of ϕ by the matrix exponential actions appearsto be inefficient. Of course, this approach can not be used for a general g in case A issingular.We remark that quite a different and successful approach to compute the ϕ matrixfunction via a matrix exponential is presented in [57, Proposition 2.1] (see also [4]). Forcomputing ϕ ( − tA ) v , A ∈ R n × n , it amounts to evaluating the last column of exp( − t (cid:98) A ),where (cid:98) A ∈ R ( n +1) × ( n +1) is a matrix obtained by augmenting A with v as the lastcolumn and zeros as the last row. In the context of large scale problems and Krylovsubspace methods this approach has a drawback that eventual (skew) symmetry of A is lost in (cid:98) A . We use this approach for the evaluating u ( t ) = tϕ ( − tH k )( βe ) to solvethe small scale projected IVP (2.5), see Section 6.
3. A restarting procedure for ϕ Krylov subspace evaluation.3.1. The residual norm as a function of time.
For real matrices A it is notdifficult to see that property (2.1) is equivalent to x T Ax (cid:62) , ∀ x ∈ R n , which means that the symmetric part ( A + A T ) of A is nonnegative definite. Denote(3.1) ω ≡ min (cid:54) = x ∈ R n x T Axx T x (cid:62) . Then it holds (see, e.g., [45, Section I.2.3])(3.2) (cid:107) exp( − tA ) (cid:107) (cid:54) e − tω , ∀ t (cid:62) . A similar estimate holds for the matrix H k = V Tk AV k produced by the Arnoldi process(see (2.2)). Indeed, let(3.3) ω k ≡ min (cid:54) = y ∈ R k y T H k yy T y . Then ω k = min (cid:54) = y ∈ R k ( V k y ) T A ( V k y ) y T y (cid:62) ω (cid:62) , so that, similarly to (3.2),(3.4) (cid:107) exp( − tH k ) (cid:107) (cid:54) e − tω k , ∀ t (cid:62) . Exploiting the estimate [41, Proof of Lemma 2.4] (cid:107) ϕ ( − tA ) (cid:107) (cid:54) (cid:90) (cid:107) exp( − t (1 − θ ) A ) (cid:107) dθ, ounding the norm under the integral by e − t (1 − θ ) ω and evaluating the integral, weobtain a useful inequality(3.5) (cid:107) ϕ ( − tA ) (cid:107) (cid:54) ϕ ( − tω ) . Then a similar relation holds also for H k :(3.6) (cid:107) ϕ ( − tH k ) (cid:107) (cid:54) ϕ ( − tω k ) . Proposition
Let t > , g ∈ R n and let A ∈ R n × n be such that (2.1) holds.If y k ( t ) is the Krylov subspace approximation (2.3) to the vector y ( t ) = tϕ ( − tA ) g then for its residual r k ( t ) , defined by (2.4) , (2.6) , holds (3.7) (cid:107) r k ( t ) (cid:107) (cid:54) βth k +1 ,k ϕ ( − tω k ) = βω k h k +1 ,k (1 − e − tω k ) , for ω k > ,βth k +1 ,k , for ω k = 0 , where ω k is defined by (3.3) .Proof. From (2.5) we have u ( t ) = tϕ ( − tH k )( βe ), so that (cid:107) u ( t ) (cid:107) (cid:54) βt (cid:107) ϕ ( − tH k ) (cid:107) (cid:54) βtϕ ( − tω k ) = βω k (1 − e − tω k ) , for ω k > ,βt, for ω k = 0 , where (3.6) is used. Relation (2.6) implies (cid:107) r k ( t ) (cid:107) = h k +1 ,k | e Tk u ( t ) | (cid:54) h k +1 ,k (cid:107) u ( t ) (cid:107) , where h k +1 ,k (cid:62) (cid:107) u ( t ) (cid:107) , gives the required bound (3.7).Note that (3.7) exhibits solely the time behavior of the residual r k ( t ) as a functionof time for a fixed k . A more detailed analysis of the residual dependence on k andon t is given below in Section 4. The method for evaluatingthe ϕ matrix functions we propose in this paper adopts the so-called residual-time(RT) restarting strategy. It is proposed for the matrix exponential evaluations in ourrecent paper [13]. The RT strategy is essentially based on a notion of the matrixexponential residual defined, for an approximation w k ( t ) ≈ w ( t ) = exp( − tA ) v , as [18,22, 10] − Aw k ( t ) − w (cid:48) k ( t ) . The RT restarting is based on the observation that the residual is easily computableand the residual norm tends to be a monotonically increasing function of time, whichexplains the name “residual-time” restarting. The ϕ residual (2.4) exhibits the samebehavior (see relation (2.6) and Proposition 3.1 above). Therefore the RT restartingapproach can be readily adopted for the ϕ evaluations.The RT restarting approach is quite simple. We solve (1.1) by performing k =1 , , . . . , k max steps of the Arnoldi process (see Section 2.1). We stop as soon as theresidual norm max s ∈ [0 ,t ] (cid:107) r k ( s ) (cid:107) is small enough. If, after k max carried out steps, thisis not the case we find a time moment δ such that the residual norm (cid:107) r k max ( s ) (cid:107) issufficiently small for s ∈ [0 , δ ]. Then we set in (1.1) v := y k ( δ ), decrease the time Given: A ∈ R n × n , g (cid:54) = 0 , v ∈ R n , t > , k max and tol > convergence := false, tol := tol /t while not(convergence) β := (cid:107) g − Av (cid:107) , v := β ( g − Av ) for k = 1 , . . . , k max carry out step k of Arnoldi process:compute v k +1 and h ,k , ..., h k +1 ,k if max s ∈ [0 ,t ] (cid:107) r k ( s ) (cid:107) (cid:54) tol then ( ∗ ) convergence := truebreak (leave the for loop)elseif k = k max % -------- restart after k max stepsfind largest δ such that max s ∈ [0 ,δ ] (cid:107) r k ( s ) (cid:107) (cid:54) tol ( ◦ ) u := δϕ ( − δH k ) e v := v + V k ( βu ) t := t − δ break (leave the for loop)end ifend forend while u := tϕ ( − tH k ) e , y k := v + V k ( βu ) Fig. 3.1 . Description of the RT (residual–time) restarting algorithm for the ϕ matrix functionevaluations. The algorithm computes Krylov subspace approximation y k ( t ) ≈ v + ϕ ( − tA )( g − Av ) such that for its residual r k ( t ) holds max s ∈ [0 ,t ] (cid:107) r k ( s ) (cid:107) (cid:54) tol . interval t := t − δ and start the Arnoldi process for this modified IVP again. 
This RTrestarting for the ϕ matrix function evaluations is outlined in Figure 3.1.To monitor the values of the residual norm in the algorithm (see the algorithmlines marked with ( ∗ ), ( ◦ )), we compute the exact solution u ( s ) of the projectedIVP (2.5) at time moments s = ∆ t , 2∆ t , 3∆ t , . . . . This is done by recursion(3.8) u ( s + ∆ t ) = E k u ( s ) + ∆ td k ,E k = exp( − ∆ tH k ) , d k = βϕ ( − ∆ tH k ) e , where the matrix E k and vector d k are computed once. The value of ∆ t is chosen inlines ( ∗ ) and ( ◦ ) differently:(3.9) ∆ t := t/ , to estimate max s ∈ [0 ,t ] (cid:107) r k ( s ) (cid:107) , line ( ∗ ) ,t p , with smallest p = 0 , , . . . suchthat (cid:107) r k (∆ t ) (cid:107) (cid:54) tol , line ( ◦ ) . For estimating max s ∈ [0 ,t ] (cid:107) r k ( s ) (cid:107) in line ( ∗ ) of the algorithm, ∆ t does not have to bevery small. This is because (cid:107) r k ( s ) (cid:107) usually achieves its maximum at the end point s = t and the value (cid:107) r k ( t ) (cid:107) determines the quality of the approximation y k ( t ) ≈ y ( t ).We emphasize that we compute the values of u ( s ) exactly, and the computationaccuracy does not depend on ∆ t as is the case in a regular time integration scheme.To determine the restart interval length δ (line ( ◦ ) in Figure 3.1), we proceedaccording to formula (3.9). More specifically, ∆ t is initially set to t/ (cid:107) r k (∆ t ) (cid:107) (cid:54) ol , we trace the values (cid:107) r k (∆ t ) (cid:107) , (cid:107) r k (2∆ t ) (cid:107) , . . . by recursion (3.8), until they exceedthe tolerance value. Then we set δ to the largest p ∆ t for which (cid:107) r k ( p ∆ t ) (cid:107) (cid:54) tol .Otherwise, if (cid:107) r k (∆ t ) (cid:107) > tol for ∆ t = t/ t until (cid:107) r k (∆ t ) (cid:107) (cid:54) tol and then set δ := ∆ t .
4. Estimates on the residual in terms of the Faber series.
For simplifi-cation of formulae we assume in this section that β = 1.Faber series as a means to investigate convergence of Arnoldi’s method wereintroduced in [46]; see also [6]. Denote by W ( A ) be the numerical range of matrix A and let Φ j be the Faber polynomials [63] for the compact W ( A ). Since, for a fixed t , ϕ ( − tz ) is an entire function of z , we can consider the Faber series decomposition(4.1) f ( z ; t ) ≡ tϕ ( − tz ) = 1 − e − tz z = + ∞ (cid:88) j =0 f j ( t )Φ j ( z ) , z ∈ W ( A ) , t (cid:62) , where t is considered as a parameter.The following assertion is an analogue of [14, Proposition 1]. Proposition
The residual (2.4) satisfies the inequality (4.2) (cid:107) r k ( t ) (cid:107) (cid:54) h k +1 ,k + ∞ (cid:88) j = k − | f j ( t ) | (cid:54) (cid:107) A (cid:107) + ∞ (cid:88) j = k − | f j ( t ) | . Proof.
The superexponential convergence in j and the smoothness of f j ( t ) in t enable one to differentiate series (4.1) in t . Decomposition (4.1) implies that theapproximant (2.3) can be represented as(4.3) y k ( t ) = V k f ( H k ; t ) e = V k + ∞ (cid:88) j =0 f j ( t )Φ j ( H k ) e . From (4.1) we see that(4.4) e − tz = 1 − zf ( z ; t ) = 1 − + ∞ (cid:88) j =0 f j ( t ) z Φ j ( z ) . Therefore, since ∂e − tz ∂t = − ze − tz , we obtain, differentiating (4.4), − ze − tz = ∂∂t − + ∞ (cid:88) j =0 f j ( t ) z Φ j ( z ) , − z − + ∞ (cid:88) j =0 f j ( t ) z Φ j ( z ) = − + ∞ (cid:88) j =0 f (cid:48) j ( t ) z Φ j ( z ) , − + ∞ (cid:88) j =0 f j ( t ) z Φ j ( z ) = + ∞ (cid:88) j =0 f (cid:48) j ( t )Φ j ( z ) , − + ∞ (cid:88) j =0 (cid:2) f (cid:48) j ( t ) + zf j ( t ) (cid:3) Φ j ( z ) = 0 . (4.5) xploiting equality deg Φ j = j and relation (2.4), (4.3), (2.2), (4.5) with H k substi-tuted for z , we get r k ( t ) = − ( V k H k + h k +1 ,k v k +1 e T k ) + ∞ (cid:88) j =0 f j ( t )Φ j ( H k ) e + g − V k + ∞ (cid:88) j =0 f (cid:48) j ( t )Φ j ( H k ) e = − V k + ∞ (cid:88) j =0 (cid:2) f (cid:48) j ( t ) I k + H k f j ( t ) (cid:3) Φ j ( H k ) e + V k e − h k +1 ,k v k +1 e T k + ∞ (cid:88) j =0 f j ( t )Φ j ( H k ) e = − h k +1 ,k v k +1 e T k + ∞ (cid:88) j = k − f j ( t )Φ j ( H k ) e , where I k denotes the k × k identity matrix. The bound (cid:107) Φ j ( H k ) (cid:107) (cid:54) Remark Comparing estimate (4.2) and the estimate on the error (cid:107) y ( t ) − y k ( t ) (cid:107) (cid:54) + ∞ (cid:88) j = k | f j ( t ) | (see [6, Theorem 3.2]) shows that the two upper bounds mainly differ in coefficients.Thus, we can conjecture that the error and the residual behave similarly to each other. We assume here that A T = A and the spectralinterval of A is contained in the segment [0 , c ], c >
0. The change of variable x = ( z − c ) /c translates f ( z ; t ) into the function g ( x ; t ) ≡ − e − ct e − ctx c ( x + 1) , and we wish to estimate its Chebyshev coefficients in x , because in this case theFaber series essentially reduces to the Chebyshev series up to a linear variable scaling.Following [56], we denote by a m [ h ] the m th Chebyshev coefficient of a function h defined for x ∈ [ − ,
1] as h ( x ) = + ∞ (cid:88) m =0 (cid:48) a m [ h ] T m ( x ) , where T m is the m th first kind Chebyshev polynomial and the prime (cid:48) means that theterm at m = 0 is divided by two. Lemma
For k (cid:62) the inequality (4.6) + ∞ (cid:88) m = k − (cid:12)(cid:12)(cid:12)(cid:12) a m (cid:20) − e − ct e − ctx c ( x + 1) (cid:21)(cid:12)(cid:12)(cid:12)(cid:12) (cid:54) e − ct c + ∞ (cid:88) ν = k ( ν − k + 2)( ν − k + 1) I ν ( ct ) . holds with Bessel functions I ν .Proof. With formulae [56, Ch. II, §
11, (10), (54)], we have e − ctx = 2 + ∞ (cid:88) j =0 (cid:48) I j ( − ct ) T j ( x ) = 2 + ∞ (cid:88) j =0 (cid:48) ( − j I j ( ct ) T j ( x ) . hen, with δ i,j denoting the Kronecker delta, we obtain a m (cid:20) − e − ct e − ctx c (cid:21) = 2 δ m, − e − ct a m [ e − ctx ] c = 2 δ m, − − m e − ct I m ( ct ) c , which in view of [56, Ch. II, §
9, Theorem 9.4, (42)] gives a m (cid:20) − e − ct e − ctx c ( x + 1) (cid:21) = 2 + ∞ (cid:88) l =0 ( − l (cid:18) l + 11 (cid:19) a m + l +1 (cid:20) − e − ct e − ctx c (cid:21) = 4 c + ∞ (cid:88) l =0 ( − l ( l + 1)[ δ m + l +1 , − e − ct ( − m + l +1 I m + l +1 ( ct )]= 4 c ( − m + ∞ (cid:88) l =0 ( l + 1) e − ct I m + l +1 ( ct ) , whence + ∞ (cid:88) m = k − (cid:12)(cid:12)(cid:12)(cid:12) a m (cid:20) − e − ct e − ctx c ( x + 1) (cid:21)(cid:12)(cid:12)(cid:12)(cid:12) (cid:54) e − ct c + ∞ (cid:88) m = k − ∞ (cid:88) l =0 ( l + 1) I m + l +1 ( ct )= 4 e − ct c + ∞ (cid:88) ν = k (cid:34) ν − k (cid:88) l =0 ( l + 1) (cid:35) I ν ( ct ) = 4 e − ct c + ∞ (cid:88) ν = k ( ν − k + 2)( ν − k + 1)2 I ν ( ct )= 2 e − ct c + ∞ (cid:88) ν = k ( ν − k + 2)( ν − k + 1) I ν ( ct ) . Proposition
For k (cid:62) (cid:107) r k ( t ) (cid:107) (cid:54) e − ct + ∞ (cid:88) ν = k ( ν − k + 2)( ν − k + 1) I ν ( ct ) . Proof.
Gathering (4.2), the definition of c , (4.6) and the fact that Φ [ − , ,ν = 2 T ν , ν (cid:62) §
1, (21)]), derive (cid:107) r k ( t ) (cid:107) (cid:54) c + ∞ (cid:88) j = k − | f j ( t ) | = 2 c + ∞ (cid:88) j = k − (cid:12)(cid:12)(cid:12)(cid:12) a j (cid:20) − e − ct e − ctx c ( x + 1) (cid:21)(cid:12)(cid:12)(cid:12)(cid:12) (cid:54) e − ct + ∞ (cid:88) ν = k ( ν − k + 2)( ν − k + 1) I ν ( ct ) . A similar use of Bessel functions I k can be found in [23, § ν − k + 2)( ν − k + 1) as a complication. Note that the upper bound (4.7) can beeasily computed with a high accuracy by truncating the infinite sum (cid:80) + ∞ ν = k to (cid:80) Mν = k where M can be estimated as discussed in [23, § Lemma
Let µ ∈ N and w (cid:62) . Then the Bessel function I µ obeys the in-equality (4.8) e − w I µ ( w ) (cid:54) w µ (2 µ − . roof. Using formulae [1, 9.6.18, 6.1.8, 6.1.12], derive e − w I µ ( w ) (cid:54) ( w/ µ √ π Γ( µ + 1 / (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) π (cid:90) e w cos θ sin µ θ dθ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:54) e − w ( w/ µ √ π Γ( µ + 1 / πe w = √ πw µ µ · (2 µ − µ · √ π = w µ (2 µ − . Proposition k of order at least ct in the symmetriccase). If (4.9) k (cid:62) max { ct, } , then (4.10) (cid:107) r k ( t ) (cid:107) (cid:54) ct ) k (2 k − Proof.
Exploiting assumption (4.9) and the inequality (see [51, Proposition 1]) I ν +1 ( w ) < I ν ( w )1 + νw , w > , ν (cid:62) − , obtain for ν (cid:62) k ( ν − k + 3)( ν − k + 2) I ν +1 ( ct )( ν − k + 2)( ν − k + 1) I ν ( ct ) = (cid:18) ν − k + 1 (cid:19) · I ν +1 ( ct ) I ν ( ct ) <
31 + νct (cid:54)
31 + ctct (cid:54) , so that, in view of (4.8), e − ct + ∞ (cid:88) ν = k ( ν − k + 2)( ν − k + 1) I ν ( ct ) (cid:54) e − ct · I k ( ct ) (cid:54) ct ) k (2 k − . It remains to substitute the above bound into (4.7).Proposition 4.5 is an analogue of [23, Theorem 5]. Bound (4.10) would be expo-nentially worse, if we used the Taylor series instead of the Chebyshev one; this clarifiesthe role of Faber series.In Figure 4.1 we plot the graph of the right-hand side of (4.7) for the values c and t as occur in the numerical test presented in Section 6.1. Since (cid:107) A (cid:107) (cid:54) (cid:107) A (cid:107) (cid:107) A (cid:107) ∞ (see, e.g., [44, Chapter 5.6]) and (cid:107) A (cid:107) = (cid:107) A (cid:107) ∞ for A = A T , we can estimate c by (cid:107) A (cid:107) /
2. This leads to the c values reported in caption of Figure 4.1. The t valuesused in the plots (and reported in the caption) are taken to be approximately theaverage time interval length ¯ δ per restart, i.e., t := ¯ δ , obtained from data in Table 6.1as the total time interval length divided by the number of restarts. In the figure wealso plot the actual residual norm values computed with the regular Lanczos process.From Figure 4.1 we see that the obtained upper bound is far from being sharp.This is certainly expected. Although (4.7) is a true upper bound and contains noasymptotic estimates, it does not reflect the sensibility of the Krylov subspace methods
50 100 15010 -15 -13 -11 -09 -07 -05 -03 -01 Krylov dimension k test 1, 40 x 40 x 40 grid c=501.1359, t=0.8||r k || 0 20 40 60 80 100 120 14010 -15 -13 -11 -09 -07 -05 -03 -01 Krylov dimension k test 1, 60 x 60 x 60 grid c=98.9898, t=3.0||r k || Fig. 4.1 . The solid lines are error upper bound (4.7) for c = 501 . , t = ¯ δ = 0 . (left plot)and c = 98 . , t = ¯ δ = 3 . (right plot). The values correspond to A being discretized anisotropicelliptic operator described in Section 6.1, for × × (left plot) and × × (right plot)finite volume grids, respectively. The dashed lines are the actual scaled residual norm values (cid:107) r k (cid:107) /β . with respect to the discrete structure of the spectrum. To confirm this and to showthat the upper bound in (4.7) can be sharp, in Figure 4.2 we plot the upper boundalong with the actual residual norm values for the following tridiagonal A ∈ R n × n and g ∈ R n :(4.11) A = / √ . . . / √ / / , g = . It can be shown that (cid:107) A (cid:107) ≈ n , hence, we take n = 1000 and c = 1. Theresidual values are computed with the regular Lanczos process. As we see from theplots, the upper bound (4.7) is sharp. -15 -13 -11 -09 -07 -05 -03 -01 Krylov dimension k c=1.0, t=1.0||r k || 0 10 20 30 4010 -15 -13 -11 -09 -07 -05 -03 -01 Krylov dimension k c=1.0, t=20||r k || Fig. 4.2 . The solid lines are error upper bound (4.7) for c = 1 , t = 1 (left plot) and c = 1 , t = 20 (right plot). The values correspond to A defined in (4.11) . The dashed lines are the actualscaled residual norm values (cid:107) r k (cid:107) /β obtained for g defined in (4.11) . .3. A weakly non-symmetric case. Proposition
Let thenumerical range W ( A ) be enlcosed by an ellipse E c,ρ with foci and c with the sumof semiaxes ρc , c > , ρ > . Then for k (cid:62) (cid:107) r k ( t ) (cid:107) (cid:54) e − ct + ∞ (cid:88) ν = k ( ν − k + 2)( ν − k + 1) ρ ν I ν ( ct ) . Proof.
The proof is a slight generalization of that of proposition 4.3; the lattercorresponds to the limit case ρ = 1. One should exploit the relation (see [63, Ch. II, §
1, (25)]) between the Faber polynomials for E c,ρ and the Faber polynomials for thefocal segment [0 , c ] (actually, the shifted Chebyshev polynomials).Similarly to (4.7), upper bound (4.12) is easily computable by truncating theinfinite series in (4.12).In Figure 4.3 we plot the right-hand side of (4.12) for the matrices of the secondand third test problems, see Sections 6.2 and 6.3. In each of these two test problemsseveral grids and corresponding parameter settings are used, and the two matricesfor the plots in Figure 4.3 are built for the grids 202 ×
202 (the second test) and512 ×
512 (the third test), respectively. The t values are approximate average timeinterval length ¯ δ per restart, computed as the total time interval length divided bythe number of restarts. To obtain the ellipse parameters appearing in the bound, weapproximate the numerical ranges of the matrices A by the numerical ranges of theirKrylov projections H k = V Tk AV k , see (2.2). This is done for a sufficiently large k such that the exterior eigenvalues of A are well approximated by the eigenvalues of H k . Thus, the plots in Figure 4.3 are produced for approximate spectral data but dosuffice for illustrative purposes. In the figure the actual residual norm is also plotted. Krylov dimension k -15 -13 -11 -9 -7 -5 -3 -1 test 2: complex diffusion c=4.0398e07,t=1.25e-05, =1.0142||r k || Krylov dimension k -15 -13 -11 -9 -7 -5 -3 -1 test 3: dominating convection c=0.49949,t=3750, =1.0937||r k || Fig. 4.3 . The solid lines are error upper bound (4.12) for c = . , t = ¯ δ = , ρ = 1 . (left plot) and c = 0 . , t = ¯ δ = 3750 , ρ = 1 . (right plot). The values correspondto the matrices A of the second test problem (see Section 6.2, grid × ) and the third testproblem (see Section 6.3, grid × ), respectively. The dashed lines are the actual scaledresidual norm values (cid:107) r k (cid:107) /β .
5. Convergence of the restarted method.
Proposition
Let the RT restarted Krylov subspace method be used to com-pute an approximation to the solution y ( t ) , t < ∞ , of IVP (1.1) and let tol be he residual tolerance used at each restart. Then for any restart length k max the RTrestarted Krylov subspace method stops after a finite number J of restarts and theerror of its approximate solution y k ( t ) is bounded by t · tol , (cid:107) y ( t ) − y k ( t ) (cid:107) (cid:54) t · tol , with t being the length of the time interval on which problem (1.1) is solved.Proof. Based on Propositions 3.1, 4.3, 4.6, we know that at each restart j a δ j > tol on thetime interval of length δ j . Hence, since the time interval length t is finite and isreduced at each restart by δ j , a finite number J of restarts will suffice. The timeinterval after j restarts is reduced from [0 , t ] to [0 , t − ( δ + · · · + δ j )] and (cid:80) Jj =1 δ j = t .To simplify the notation in the proof, by y j ( s ), s ∈ [0 , δ j ] we denote the Krylovsubspace solution which is obtained, with k (cid:54) k max steps of the Arnoldi process, atrestart j ( k < k max is possible only for j = J ). We also denote the residual of y j ( s )by r j ( s ). With this notation, y j ( s ) approximates the exact solution y ( (cid:80) j − i =1 δ i + s )for s ∈ [0 , δ j ] with residual r j ( s ) such thatmax s ∈ [0 ,δ j ] (cid:107) r j ( s ) (cid:107) (cid:54) tol . Then the approximate solution at the first restart is y ( δ ) = v + δ ϕ ( − δ A )( g − Av ) + (cid:15) ( δ ) , where (cid:15) ( s ) is the error of the approximation y ( s ) ≈ y ( s ). Using (2.8) we can bound (cid:107) (cid:15) (cid:107) as (cid:107) (cid:15) ( δ ) (cid:107) (cid:54) δ ϕ ( − δ ω ) max s ∈ [0 ,δ ] (cid:107) r ( s ) (cid:107) (cid:54) δ ϕ ( − δ ω ) · tol (cid:54) δ · tol , where ω (cid:62) (cid:15) j = (cid:15) j ( δ j ) and y j = y j ( δ j ). Then at restart 2 we have y = y + (cid:15) + δ ϕ ( − δ A )( g − A ( y + (cid:15) )) + (cid:15) = y + δ ϕ ( − δ A )( g − Ay ) (cid:124) (cid:123)(cid:122) (cid:125) exact solution + δ ϕ ( − δ A )( − A(cid:15) ) + (cid:15) + (cid:15) (cid:124) (cid:123)(cid:122) (cid:125) ε , where we denote all the error terms by ε . Now note that δ ϕ ( − δ A )( − A(cid:15) ) + (cid:15) = exp( − δ A ) (cid:15) , and, hence, ε = exp( − δ A ) (cid:15) + (cid:15) , (cid:107) ε (cid:107) (cid:54) (cid:107) (cid:15) + (cid:15) (cid:107) (cid:54) ( δ + δ ) · tol . Here we bound (cid:15) in the same way as (cid:15) and use estimate (3.2). Continuing in thesame way, we obtain at the last restart Jε J = exp( − ( t − δ ) A ) (cid:15) + exp( − ( t − δ − δ ) A ) (cid:15) + . . . + exp − ( t − J − (cid:88) j =1 δ j ) A (cid:15) J − + (cid:15) J , (cid:107) ε J (cid:107) (cid:54) (cid:107) (cid:15) (cid:107) + · · · + (cid:107) (cid:15) J (cid:107) (cid:54) ( δ + · · · + δ J ) · tol = t · tol .
20 40 60 80 100 120 140 160 180
Krylov dimension k -15 -13 -11 -9 -7 -5 -3 -1 test 1, 40 x 40 x 40 grid t=0.1t=0.2t=0.4t=0.8 Fig. 5.1 . Residual upper bound (4.7) for c = 501 . (the first test problem, × × grid)and different time interval lengths t . The number of restarts J appearing in Proposition 5, can be estimated from theresidual upper bounds (4.7),(4.10),(4.12). This can be done, for example, using plotsas shown in Figure 5.1. There, for the test problem described in Section 6.1, we plotthe bound (4.7) for different time interval lengths t . For instance, if k max ≈
30 andthe requested tolerance is in the range [10 − , − ], then the restart interval length δ should be approximately 0 . t = 1000 used in the test, J ≈ / . J . Indeed, as we see from Table 6.1, the actual number of restarts is roughly fourtimes smaller.
6. Numerical experiments.
In the experiments presented here we test ourmethod to compute actions of ϕ ( − tA ) implemented with the proposed RT restarting.We compare our code phiRT with two other well known Krylov subspace codes forevaluating ϕ ( − tA ), namely the phiv function of the EXPOKIT package [60] and the phip function presented in [53] and publicly available online . We do not include therecently proposed code phipm [53] in the comparisons because this code uses a variableKrylov dimension up to 100. Using such a high Krylov dimension in the large scaleproblems is not always feasible, and in the tests we restrict the Krylov dimension forall the three solvers by 30.All the tests are carried out on a Linux laptop with four AMD Ryzen 3 3200UCPUs (clock rate 2.6 GHz) and 8 Gb memory, in Octave (version 4.2.2) with noparallelization. Our implementation of the Arnoldi and Lanczos processes does notinclude Krylov basis vector reorthogonalization and no serious orthogonality loss hasbeen observed. In the numerical tests the Krylov subspace matrices are computed bythe Lanczos process as soon as A = A T , otherwise the Arnoldi process is employed.To calculate matrix-vector products ϕ ( − tH k ) x for the small projected Krylovsubspace matrix H k in our algorithm (cf. (3.8)), we employ relation [57, Proposi-tion 2.1] exp( (cid:20) − tH k x (cid:21) ) = (cid:20) exp( − tH k ) ϕ ( − tH k ) x (cid:21) , computing the matrix exponential at the left hand side and, thus, reducing com-putations with ϕ to computations with the matrix exponential (see also [4]). Tocompute the matrix exponentials exp( − tH k ) here and in (3.8) the standard built-in ∼ jitse/software.html15 unction expm of Octave is used. It is based on Ward’s scaling and squaring methodcombined with diagonal Pad´e approximation and is certainly not the best methodavailable, see [50, 38]. For the moderate tolerance range [10 − , − ] we aim in thetests, we have observed no accuracy problems with this Octave implementation. Wehave chosen to stick to the built-in function expm to have our Octave code fully Mat-lab portable. Note that Matlab’s implementation of expm [3] is seen as more reliablebut is not publicly available. For higher accuracy requirements we could switch, forinstance, to the phipade function of the EXPINT package [7].In all the tests the reported achieved error is measured for each of the three Krylovsubspace solvers as (cid:107) y k ( t ) − y ref ( t ) (cid:107)(cid:107) y ref ( t ) (cid:107) , where y k ( t ) is the solution of the Krylov subspace solver at final time t and y ref ( t ) isa reference solution computed with a high tolerance (reported for each test below) bythe phiv function. Since the reference solution y ref ( t ) is affected by the same spacediscretization error as y k ( t ), (cid:107) y k ( t ) − y ref ( t ) (cid:107) is a good indicator of the time error. In this test wesolve IVP (1.1) with a symmetric matrix A . It is a 3D (three-dimensional) anisotropicelliptic operator 10 ∂ xx + 10 ∂ yy + ∂ zz with periodic boundary conditions defined inthe spatial domain [0 , × [0 , × [0 ,
1] and discretized by a fourth order accurate finitevolume discretization described in [68]. The source vector g contains the grid valuesof either of the functionssin(2 πx ) sin(2 πy ) sin(2 πz ) + x (1 − x ) y (1 − y ) z (1 − z ) , e − x − . +( y − . +( z − . )) + x (1 − x ) y (1 − y ) z (1 − z ) . We use two different source vectors g to make sure that Krylov subspace methods donot profit from a particular choice of the source vector. In this test the initial valuevector v in (1.1) is set to zero.In the test we use uniform 40 × ×
40 and 60 × ×
60 grids, for which theproblem size is n = 64 000 and n = 216 000, respectively. For these grids the matrix A has its norms (cid:107) A (cid:107) = (cid:107) A (cid:107) ∞ approximately equal to 1002 .
27 and 197 .
98, respectively.These values give the estimations (cid:107) A (cid:107) / c used in Figure 4.1. The final timeis t = 1000 and the reference solution is computed by function phiv with tolerance tol = .The results for this test are presented in Table 6.1 and in Figure 6.1, wherefor each of the solvers the values of achieved accuracy, the number of matrix-vectormultiplications (matvecs) and the CPU times are given. These values are reported forthe same tolerance . Exploring performance of the solvers for different tolerancevalues leads to plots in Figure 6.1. As can be seen in the plots, on the coarser gridfunction phiv fails to produce results with an accuracy less stringent than ≈ − .Its least accurate result (corresponding to the upper left end of the dashed curve)is obtained for tolerance . An attempt to run the code with a less stringenttolerance yields an error message. Furthermore, the phip solver does not producethe results within the required accuracy range [10 − , − ] and appears to be moreexpensive for this test than our phiRT . For both spatial grids the phip solver exhibitsan abrupt change in the delivered accuracy with gradually varying requested tolerance,indicated in the Figure by segments AB and CD . Thus, we see that both phiv and phip solvers have difficulties with delivering results within the desired accuracy range[10 − , − ]. able 6.1 Results for the fourth order finite-volume discretized anisotropic heat equation method(Krylov dim.), delivered matvecs /requested tolerance error CPU time, s40 × ×
40 grid, sine g phiv (10), phip (10), phiRT (10), phiv (30), phip (30), phiRT (30), × ×
40 grid, Gaussian g phiv (30), phip (30), phiRT (30), × ×
60 grid, sine g phiv (30), phip (30), phiRT (30), -13 -12 -11 -10 -09 -08 -07 -06 -05 -04 -03 -02 -01 a cc u r a cy A BphivphipphiRT 100 200 300 400 50010 -13 -12 -11 -10 -09 -08 -07 -06 -05 -04 -03 -02 -01 CPU time a cc u r a cy A B phivphipphiRT5000 10000 1500010 -13 -12 -11 -10 -09 -08 -07 -06 -05 -04 -03 -02 -01 a cc u r a cy C D phivphipphiRT 100 200 300 400 500 600 70010 -13 -12 -11 -10 -09 -08 -07 -06 -05 -04 -03 -02 -01 CPU time a cc u r a cy C D phivphipphiRT
Fig. 6.1 . The fourth order finite-volume discretized anisotropic heat equation. Relative errorversus number of the matrix-vector products (left) and versus the CPU time (right) for the phiv (dashed line), phip (dash-dotted line) and phiRT (solid line). The restart length is 30 for all thethree solvers. The upper left end of the phiv curve at the top plots is obtained for tolerance and running phiv with tolerance fails with an error message. The points A , B , C , and D onthe phip curve are obtained for the tolerances , , , and , respectively. − − − y x -1 -0.5 0 0.5 1-1-0.500.51 x y velocity field Fig. 6.2 . Diffusion coefficients D , ( x, y ) (left) and velocity field [ v ( x, y ) , v ( x, y )] (right). Theregion where D , = 10 occupies the square [ − . , . × [ − . , . and is surrounded by the wallwhere D , = 10 − which is . wide and has a . broad slit. We now consider a nonsymmetric matrix A resulting from a standard fivepoint central-difference discretization of the following convection–diffusion operator: L [ u ] = − ( D u x ) x − ( D u y ) y + Pe (cid:18)
12 ( v u x + v u y ) + 12 (( v u ) x + ( v u ) y ) (cid:19) , where ( x, y ) ∈ [ − , × [ − ,
1] and the homogeneous Dirichlet conditions are imposedon the boundary. Here the first derivatives representing the convective terms arewritten in a special way. If the velocity field is divergence-free, i.e., if ( v ) x + ( v ) y =0, then the central-difference discretization of the convective terms yields a skew-symmetric matrix [47]. This physically desirable property [68] guarantees that thefirst derivatives do not contribute to the symmetric part of the matrix which is thecase for diffusive upwind discretizations.The diffusion coefficients D , ( x, y ) are defined as shown at the left plot in Fig-ure 6.2. This choice is inspired by [66, test 4]. The velocity field, taken as v ( x, y ) = y (1 − x ) , v ( x, y ) = x ( y − , is the recirculating wind adopted from the IFISS package [62], see the right plot inFigure 6.2. The initial value vector v , cf. (1.1), has all its entries set to 0 .
01. For v = 0 the solution may become unstable due to appearance of small negative solutionvalues (recall that central difference discretizations are used for both diffusion andconvection terms). The components of the source vector g are the values of function1000e − x + y ) on the finite difference grid. Uniform 202 × ×
402 and 802 ×
802 grids areused in the experiments, so that the problem size is n = 40 000, n = 160 000 and n = 640 000, respectively. On the first two grids the Peclet number Pe is set to 10,whereas Pe = 20 on the finest grid. For these Peclet values and all the three gridsthe ratio of the norms of the skew-symmetric and symmetric parts of the matrices,measured as the maximal row and column norms, is of order 10 − , i.e., the matrices are ig. 6.3 . Solution of finite-difference discretized convection–diffusion problem at time t =5 × − (left) and t = 10 − (right), × grid, Pe = 10 . At the right plot a correspondenceto the values of the diffusion coefficients D , is clearly visible, with a leakage through the slit, seeFigure 6.2. weakly nonsymmetric. Solution samples of this test problem can be seen in Figure 6.3.The reference solution is computed by the phiv function run with tolerance .The results of the comparison runs are presented in Table 6.2. To see clearlythe dependence of the attained accuracy on the computational work and CPU time,we also present plots in Figure 6.4. We see that phiv tends to deliver a much moreaccurate solution than is requested by given tolerance. We also see that for thecoarsest 202 ×
202 grid our solver is more efficient than phip in terms of matrix–vector multiplications but requires more CPU time. This effect disappears on the finergrids. There, for the moderate tolerance values we are interested in, phiRT clearlyoutperforms the other solvers in terms of CPU time and matrix–vector multiplications.
In this test IVP (1.1) is solved where A and g are produced by theIFISS software package [62, 61, 28]. More precisely, A results from a finite-elementdiscretization of the following two-dimensional convection–diffusion problem:(6.1) − ν ∇ u + (cid:126)w · ∇ u = 0 , u = u ( x, y ) , ( x, y ) ∈ [ − , × [ − , , where ν is viscosity parameter and Dirichlet boundary conditions u = 0 . y = 1, where u ( x,
1) = 12 + 12 (1 − x ) . The initial value vector v in (1.1) is set to zero in this test.Discretization of (6.1) in IFISS by bilinear quadrilateral ( Q ) finite elements withthe streamline upwind Petrov–Galerkin (SUPG) stabilization leads to a linear system Au = g , where g enforces the Dirichlet boundary conditions. Thus, the IVP we solveis a nonstationary version of the stationary discretized convection–diffusion problem.We use uniform 512 ×
512 and 1024 × ν = 1 / ν = 1 / ν min (cid:26) h x cos α , h y sin α (cid:27) (cid:107) (cid:126)w (cid:107) , α = arctan w y w x , is ≈
25, where h x,y and w x,y are respectively the element sizes and the componentsof the velocity (cid:126)w . able 6.2 Results for the finite-difference discretized convection–diffusion test problem. method(Krylov dim.), delivered matvecs /requested tolerance error CPU time, s202 ×
202 grid, Pe = 10, t = 10 − phiv (30), phiv (30), phip (30), phip (30), phiRT (30), phiRT (30), ×
402 grid, Pe = 10, t = 10 − phiv (30), phiv (30), phip (30), phip (30), phiRT (30), phiRT (30), phiv (20), phip (20), phiRT (20), phiRT (20), ×
802 grid, Pe = 20, t = 5 · − phiv (30), phip (30), phiRT (30), phiv (20), phip (20), phiRT (20), (cid:107) A + A T (cid:107) = 1, (cid:107) A − A T (cid:107) ≈ . ×
512 grid and (cid:107) A + A T (cid:107) = 1, (cid:107) A − A T (cid:107) ≈ . × t = 2 · . For this t the solution y ( t )of IVP (1.1) still differs significantly from the stationary solution A − g to which y ( t )converges for growing t , see Figure 6.5. The reference solution is computed by phiv with tolerance .The results are presented in Tables 6.3 and 6.4 and in Figure 6.6. We see that phiRT appears to be more efficient for this test and that obtaining a time error withinthe desired tolerance range [10 − , − ] can be problematic with phiv and phip . Onboth grids the phip solver demonstrates a remarkable “all-or-nothing” behavior: onthe 1024 × to yields adrop in the delivered accuracy from to . This last accuracy valueis not plotted in Figure 6.6 as it is too low for a satisfactory PDE solution.
7. Conclusions.
We have proposed a Krylov subspace method for computingmatrix-vector products with the ϕ matrix function. It is based on the residual no-tion (2.4), which is essentially used in our algorithm both as a stopping criterion andfor efficient restarting. Our analysis provides true upper bounds for the residual, withno asymptotic estimates, and shows convergence of the restarted method for an arbi-
000 10000 15000 20000 2500010 -11 -10 -09 -08 -07 -06 -05 -04 -03 -02 a cc u r a cy
402 x 402 grid, Pe = 10 phivphipphi-RT
50 100 150 200 250 300 350 40010 -11 -10 -09 -08 -07 -06 -05 -04 -03 -02 CPU time a cc u r a cy
402 x 402 grid, Pe = 10 phivphipphi-RT -12 -11 -10 -09 -08 -07 -06 -05 -04 -03 -02 a cc u r a cy
802 x 802 grid, Pe = 20 phivphipphi-RT
100 200 300 400 50010 -12 -11 -10 -09 -08 -07 -06 -05 -04 -03 -02 CPU time a cc u r a cy
802 x 802 grid, Pe = 20 phivphipphi-RT
Fig. 6.4 . Finite-difference discretized convection–diffusion problem. Relative error versus num-ber of the matrix-vector products (left) and versus the CPU time (right) for phiv (dashed line), phip (dash-dotted line) and phiRT (solid line). Top plots: × grid, Pe = 10 , t = 10 − , bottomplots: × grid, Pe = 20 , t = 5 · − . The restart length is 30 for all the three solvers.The loosest tolerance value for phiv and phip solvers is . Hence, it is problematic to obtaincheaper, less accurate results with these solvers. Fig. 6.5 . Solution of the IFISS convection-dominated problem y ( t ) at t = 2 · (left) and thestationary solution A − g (right) on the × grid. trary final time interval [0 , t ], t < ∞ , and for any restart length (i.e., Krylov subspacedimension).Since the residual can be seen as the backward error, the residual based stoppingcriterion and residual-time (RT) restarting provide a reliable error control. This isconfirmed by the performance of the method: in all the test runs it either outperforms able 6.3 Results for the finite-element discretized convection-dominated convection–diffusion test prob-lem, ν = 1 / , × grid method(Krylov dim.), delivered matvecs /requested tolerance error CPU time, s phiv (30),
160 / 5.6 phiv (30),
224 / 7.7 phip (30),
31 / 1.2 phip (30),
210 / 7.7 phip (30),
180 / 6.7 phiRT (30),
89 / 3.3 phiRT (30),
160 / 5.7 phiv (10),
216 / 4.3 phip (10),
320 / 7.7 phiRT (10),
166 / 4.1
50 100 150 20010 -9 -8 -7 -6 -5 -4 -3 -2 -1 a cc u r a cy
512 x 512 grid phivphipphi-RT 0 2 4 6 810 -9 -8 -7 -6 -5 -4 -3 -2 -1 CPU time a cc u r a cy
512 x 512 grid phivphipphi-RT0 20 40 60 80 100 120 14010 -11 -10 -09 -08 -07 -06 -05 -04 -03 -02 -01 a cc u r a cy phivphipphi-RT 0 5 10 15 2010 -11 -10 -09 -08 -07 -06 -05 -04 -03 -02 -01 CPU time a cc u r a cy phivphipphi-RT Fig. 6.6 . The finite-element discretized convection-dominated convection–diffusion test prob-lem. Relative error versus number of the matrix-vector products (left) and versus the CPU time(right) for phiv (dashed line), phip (dash-dotted line) and phiRT (solid line). The restart lengthis 30 for all the three solvers. Perturbing requested tolerance from to for phip yieldsa change in delivered accuracy from to , see the two upper left points of thedash-dotted line in the upper plots, × grid. A similar effect is observed for × grid. able 6.4 Results for the finite-element discretized convection-dominated convection–diffusion test prob-lem, ν = 1 / , × grid method(Krylov dim.), delivered matvecs /requested tolerance error CPU time, s phiv (30),
128 / 18.1 phiv (30),
128 / 18.2 phip (30), phip (30),
60 / 9.0 phip (30),
90 / 13.8 phiRT (30), phiRT (30), phiRT (30),
19 / 2.5 phiRT (30),
28 / 4.2 phiv (10),
60 / 5.2 phiv (10),
96 / 8.2 phip (10), phip (10),
100 / 9.9 phiRT (10), phiRT (10), phiRT (10),
45 / 4.4the phiv and phip solvers or delivers a comparable efficiency. More importantly,in all the test runs our method appears to deliver a monotone dependence of theobtained accuracy on the required tolerance. The phiv and phip solvers have in somecases difficulties with delivering error in the desired moderate PDE tolerance range[10 − , − ]; they tend to produce more accurate and more expensive results thanrequested.Thus, the presented residual based Krylov subspace method for evaluating matrix-vector products with the ϕ matrix function appears to be a promising approach usefulin practical computations. REFERENCES[1]
M. Abramowitz and I. A. Stegun , Handbook of mathematical functions with formulas,graphs, and mathematical tables , vol. 55 of National Bureau of Standards Applied Math-ematics Series, For sale by the Superintendent of Documents, U.S. Government PrintingOffice, Washington, D.C., 1964.[2]
M. Afanasjew, M. Eiermann, O. G. Ernst, and S. G¨uttel , Implementation of a restartedKrylov subspace method for the evaluation of matrix functions , Linear Algebra Appl., 429(2008), pp. 2293–2314.[3]
A. H. Al-Mohy and N. J. Higham , A new scaling and squaring algorithm for the matrixexponential , SIAM Journal on Matrix Analysis and Applications, 31 (2010), pp. 970–989,https://doi.org/10.1137/09074721X.[4]
A. H. Al-Mohy and N. J. Higham , Computing the action of the matrix exponential, withan application to exponential integrators , SIAM J. Sci. Comput., 33 (2011), pp. 488–511.http://dx.doi.org/10.1137/100788860.[5]
B. Beckermann , Image num´erique, GMRES et polynˆomes de Faber , C. R. Acad. Sci. Paris:Ser. I, 340 (2005), pp. 855–860.[6]
B. Beckermann and L. Reichel , Error estimation and evaluation of matrix functions via theFaber transform , SIAM J. Num. Anal., 47 (2009), pp. 3849–3883.[7]
H. Berland, B. Skaflestad, and W. M. Wright , EXPINT—a MATLAB package for ex- onential integrators M. Botchev, I. Farag´o, and R. Horv´ath , Application of operator splitting to theMaxwell equations including a source term , Applied Numerical Mathematics, 59(2009), pp. 522–541, https://doi.org/DOI:10.1016/j.apnum.2008.03.031, http://dx.doi.org/10.1016/j.apnum.2008.03.031.[9]
M. A. Botchev , Krylov subspace exponential time domain solution of Maxwell’s equations inphotonic crystal modeling , J. Comput. Appl. Math., 293 (2016), pp. 24–30. http://dx.doi.org/10.1016/j.cam.2015.04.022.[10]
M. A. Botchev, V. Grimm, and M. Hochbruck , Residual, restarting and Richardson it-eration for the matrix exponential , SIAM J. Sci. Comput., 35 (2013), pp. A1376–A1397.http://dx.doi.org/10.1137/110820191.[11]
M. A. Botchev, A. M. Hanse, and R. Uppu , Exponential Krylov time integration for modelingmulti-frequency optical response with monochromatic sources , Journal of Computationaland Applied Mathematics, 340 (2018), pp. 474–485. https://doi.org/10.1016/j.cam.2017.12.014.[12]
M. A. Botchev, D. Harutyunyan, and J. J. W. van der Vegt , The Gautschi time steppingscheme for edge finite element discretizations of the Maxwell equations , J. Comput. Phys.,216 (2006), pp. 654–686. http://dx.doi.org/10.1016/j.jcp.2006.01.014.[13]
M. A. Botchev and L. A. Knizhnerman , ART: Adaptive residual-time restarting for Krylovsubspace matrix exponential evaluations , J. Comput. Appl. Math., 364 (2020), p. 112311.https://doi.org/10.1016/j.cam.2019.06.027.[14]
M. A. Botchev and L. A. Knizhnerman , ART: Adaptive residual-time restarting for Krylovsubspace matrix exponential evaluations , J. Comput. Appl. Math., 364 (2020). https://doi.org/10.1016/j.cam.2019.06.027.[15]
M. A. Botchev, G. L. G. Sleijpen, and H. A. van der Vorst , Stability control for approx-imate implicit time stepping schemes with minimum residual iterations , Appl. Numer.Math., 31 (1999), pp. 239–253.[16]
[16] G. D. Byrne, A. C. Hindmarsh, and P. N. Brown, VODPK, large non-stiff or stiff ordinary differential equation initial-value problem solver.
[17] M. Caliari and A. Ostermann, Implementation of exponential Rosenbrock-type integrators, Appl. Numer. Math., 59 (2009), pp. 568–581, https://doi.org/10.1016/j.apnum.2008.03.021.
[18] E. Celledoni and I. Moret, A Krylov projection method for systems of ODEs, Appl. Numer. Math., 24 (1997), pp. 365–378, https://doi.org/10.1016/S0168-9274(97)00033-0.
[19] J. Certaine, The solution of ordinary differential equations with large time constants, in Mathematical Methods for Digital Computers, A. Ralston and H. S. Wilf, eds., Wiley, New York, 1960, pp. 128–132.
[20] T. F. Chan and K. R. Jackson, The use of iterative linear-equation solvers in codes for large systems of stiff IVPs for ODEs, SIAM J. Sci. Stat. Comput., 7 (1986), pp. 378–417.
[21] H. De Raedt, K. Michielsen, J. S. Kole, and M. T. Figge, One-step finite-difference time-domain algorithm to solve the Maxwell equations, Phys. Rev. E, 67 (2003), p. 056706.
[22] V. L. Druskin, A. Greenbaum, and L. A. Knizhnerman, Using nonorthogonal Lanczos vectors in the computation of matrix functions, SIAM J. Sci. Comput., 19 (1998), pp. 38–54, https://doi.org/10.1137/S1064827596303661.
[23] V. L. Druskin and L. A. Knizhnerman, Two polynomial methods of calculating functions of symmetric matrices, U.S.S.R. Comput. Maths. Math. Phys., 29 (1989), pp. 112–121.
[24] V. L. Druskin and L. A. Knizhnerman, Krylov subspace approximations of eigenpairs and matrix functions in exact and computer arithmetic, Numer. Lin. Alg. Appl., 2 (1995), pp. 205–217.
[25] M. Eiermann and O. G. Ernst, A restarted Krylov subspace method for the evaluation of matrix functions, SIAM J. Numer. Anal., 44 (2006), pp. 2481–2504.
[26] M. Eiermann, O. G. Ernst, and S. Güttel, Deflated restarting for matrix functions, SIAM J. Matrix Anal. Appl., 32 (2011), pp. 621–641.
[27] M. Eiermann, O. G. Ernst, and S. Güttel, Deflated restarting for matrix functions, SIAM J. Matrix Anal. Appl., 32 (2011), pp. 621–641.
[28] H. C. Elman, A. Ramage, and D. J. Silvester, IFISS: A computational laboratory for investigating incompressible flow problems, SIAM Rev., 56 (2014), pp. 261–273.
[29] R. P. Fedorenko, Stiff systems of ordinary differential equations, in Numerical Methods and Applications, CRC, Boca Raton, FL, 1994, pp. 117–154.
[30] X. Fu, C. Wang, P. Li, and L. Wang, Exponential integration algorithm for large-scale wind farm simulation with Krylov subspace acceleration, Applied Energy, 254 (2019), p. 113692, https://doi.org/10.1016/j.apenergy.2019.113692.
[31] E. Gallopoulos and Y. Saad, Efficient solution of parabolic equations by Krylov approximation methods, SIAM J. Sci. Statist. Comput., 13 (1992), pp. 1236–1264, http://dx.doi.org/10.1137/0913071.
[32] S. Gaudreault, G. Rainwater, and M. Tokman, KIOPS: A fast adaptive Krylov subspace solver for exponential integrators, J. Comput. Phys., 372 (2018), pp. 236–255, https://doi.org/10.1016/j.jcp.2018.06.026.
[33] C. W. Gear and Y. Saad, Iterative solution of linear equations in ODE codes, SIAM J. Sci. Stat. Comput., 4 (1983), pp. 583–601.
[34] G. H. Golub and C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore and London, third ed., 1996.
[35] S. Güttel, A. Frommer, and M. Schweitzer, Efficient and stable Arnoldi restarts for matrix functions based on quadrature, SIAM J. Matrix Anal. Appl., 35 (2014), pp. 661–683.
[36] E. Hansen and A. Ostermann, High-order splitting schemes for semilinear evolution equations, BIT Numerical Mathematics, 56 (2016), pp. 1303–1316.
[37] N. J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008.
[38] N. J. Higham and A. H. Al-Mohy, Computing matrix functions, Acta Numer., 19 (2010), pp. 159–208, https://doi.org/10.1017/S0962492910000036.
[39] M. Hochbruck and C. Lubich, On Krylov subspace approximations to the matrix exponential operator, SIAM J. Numer. Anal., 34 (1997), pp. 1911–1925.
[40] M. Hochbruck, C. Lubich, and H. Selhofer, Exponential integrators for large systems of differential equations, SIAM J. Sci. Comput., 19 (1998), pp. 1552–1574.
[41] M. Hochbruck and A. Ostermann, Exponential integrators, Acta Numer., 19 (2010), pp. 209–286, https://doi.org/10.1017/S0962492910000048.
[42] M. Hochbruck, A. Ostermann, and J. Schweitzer, Exponential Rosenbrock-type methods, SIAM J. Numer. Anal., 47 (2008/09), pp. 786–803, https://doi.org/10.1137/080717717.
[43] M. Hochbruck, T. Pažur, A. Schulz, E. Thawinan, and C. Wieners, Efficient time integration for discontinuous Galerkin approximations of linear wave equations, ZAMM, 95 (2015), pp. 237–259, http://dx.doi.org/10.1002/zamm.201300306.
[44] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, 1986.
[45] W. Hundsdorfer and J. G. Verwer, Numerical Solution of Time-Dependent Advection-Diffusion-Reaction Equations, Springer Verlag, 2003.
[46] L. A. Knizhnerman, Calculation of functions of unsymmetric matrices using Arnoldi’s method, U.S.S.R. Comput. Maths. Math. Phys., 31 (1991), pp. 1–9.
[47] L. A. Krukier, Implicit difference schemes and an iterative method for solving them for a certain class of systems of quasi-linear equations, Sov. Math., 23 (1979), pp. 43–55. Translation from Izv. Vyssh. Uchebn. Zaved., Mat. 1979, No. 7(206), 41–52 (1979).
[48] J. D. Lawson, Generalized Runge-Kutta processes for stable systems with large Lipschitz constants, SIAM J. Numer. Anal., 4 (1967), pp. 372–380.
[49] J. Legras, Résolution numérique des grands systèmes différentiels linéaires, Numer. Math., 8 (1966), pp. 14–28.
[50] C. B. Moler and C. F. Van Loan, Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later, SIAM Rev., 45 (2003), pp. 3–49.
[51] I. Nåsell, Inequalities for modified Bessel functions, Math. Comp., 28 (1974), pp. 253–256, https://doi.org/10.2307/2005831.
[52] J. Niehoff, Projektionsverfahren zur Approximation von Matrixfunktionen mit Anwendungen auf die Implementierung exponentieller Integratoren, PhD thesis, Mathematisch-Naturwissenschaftliche Fakultät der Heinrich-Heine-Universität Düsseldorf, December 2006.
[53] J. Niesen and W. M. Wright, Algorithm 919: A Krylov subspace algorithm for evaluating the ϕ-functions appearing in exponential integrators, ACM Trans. Math. Softw., 38 (2012), pp. 22:1–22:19, https://doi.org/10.1145/2168773.2168781.
[54] T. J. Park and J. C. Light, Unitary quantum time evolution by iterative Lanczos reduction, J. Chem. Phys., 85 (1986), pp. 5870–5876.
[55] B. N. Parlett, The Symmetric Eigenvalue Problem, SIAM, 1998.
[56] S. Paszkowski, Computational Applications of Chebyshev Polynomials and Series, Nauka, 1983. Russian, translated from Polish.
[57] Y. Saad, Analysis of some Krylov subspace approximations to the matrix exponential operator, SIAM J. Numer. Anal., 29 (1992), pp. 209–228.
[58] Y. Saad, Iterative Methods for Sparse Linear Systems, ∼saad/books.html.
[59] T. Schmelzer and L. N. Trefethen, Evaluating matrix functions for exponential integrators via Carathéodory-Fejér approximation and contour integrals, Electron. Trans. Numer. Anal., 29 (2007/08), pp. 1–18.
[60] R. B. Sidje, Expokit: A software package for computing matrix exponentials.
[61] D. J. Silvester, H. C. Elman, and A. Ramage, Algorithm 886: IFISS, a Matlab toolbox for modelling incompressible flow, ACM Trans. Math. Softw., 33 (2007).
[62] D. J. Silvester, H. C. Elman, and A. Ramage, Incompressible flow & iterative solver software.
[63] P. K. Suetin, Series of Faber Polynomials, CRC Press, 1998.
[64] H. Tal-Ezer, Spectral methods in time for parabolic problems, SIAM J. Numer. Anal., 26 (1989), pp. 1–11.
[65] H. A. van der Vorst, An iterative solution method for solving f(A)x = b, using Krylov subspace information obtained for the symmetric positive definite matrix A, J. Comput. Appl. Math., 18 (1987), pp. 249–263.
[66] H. A. van der Vorst, BiCGSTAB: a fast and smoothly converging variant of BiCG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 13 (1992), pp. 631–644.
[67] H. A. van der Vorst, Iterative Krylov Methods for Large Linear Systems, Cambridge University Press, 2003.
[68] R. W. C. P. Verstappen and A. E. P. Veldman, Symmetry-preserving discretization of turbulent flow, J. Comput. Phys., 187 (2003), pp. 343–368, http://dx.doi.org/10.1016/S0021-9991(03)00126-8.
[69] X. Wang, P. Chen, and C. Cheng, Stability and convergency exploration of matrix exponential integration on power delivery network transient simulation, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, (2019), pp. 1–1, https://doi.org/10.1109/TCAD.2019.2954473.
[70]