Convergence analysis of multi-level spectral deferred corrections
Gitte Kremling, Robert Speck
The spectral deferred correction (SDC) method is a class of iterative solvers for ordinary differential equations (ODEs). It can be interpreted as a preconditioned Picard iteration for the collocation problem. The convergence of this method is well known: for suitable problems it gains one order per iteration, up to the order of the quadrature rule underlying the collocation problem. This appealing feature enables an easy construction of flexible, high-order accurate methods for ODEs. A variation of SDC are multi-level spectral deferred corrections (MLSDC). Here, iterations are performed on a hierarchy of levels and an FAS correction term, as in nonlinear multigrid methods, couples solutions on different levels. While there are several numerical examples which show its capabilities and efficiency, a theoretical convergence proof is still missing. This paper addresses this issue. A proof of the convergence of MLSDC, including the determination of the convergence rate in the time-step size, will be given and the results of the theoretical analysis will be numerically demonstrated. It turns out that there are restrictions for the advantages of this method over SDC regarding the convergence rate.
The original spectral deferred correction (SDC) method for solving ordinary differential equations (ODEs), a variant of the defect and deferred correction methods developed in the 1960s [7, 11, 26, 32], was first introduced in [12] and then subsequently improved, e.g. in [18, 20, 24, 25]. It relies on a discretization of the initial value problem in terms of a collocation problem which is then iteratively solved using a preconditioned fixed-point iteration. The iterative structure of SDC has been proven to provide many opportunities for algorithmic and mathematical improvements. These include the option of using Newton-Krylov schemes such as the Newton-GMRES method to solve the resulting preconditioned nonlinear systems, leading to the so-called Krylov deferred correction methods [20, 21]. Various semi-implicit and multi-implicit formulations of the method have been explored [15, 24, 25, 5, 23]. In the last decade, SDC has been applied e.g. to gas dynamics and incompressible or reactive flows [6, 25] as well as to fast-wave slow-wave problems [27] or particle dynamics [36]. The generalized integral deferred correction framework includes further variations of SDC, where the used discretization approach is not limited to collocation methods [9, 10]. Moreover, the SDC approach was used to derive efficient parallel-in-time solvers addressing the needs of modern high-performance computing architectures [13, 29].

Here, we will focus on the multi-level extension of SDC, namely multi-level spectral deferred corrections (MLSDC), which was introduced in [3]. It uses a multigrid-like approach to solve the collocation problem with SDC iterations (now called "sweeps" in this context) being performed on the individual levels. The solutions on the different levels are then coupled using the Full Approximation Scheme (FAS) coming from nonlinear multigrid methods.
This variation was designed to improve the efficiency of the method by shifting some of the work to coarser, less expensive levels. While there are several numerical examples which show the correctness and efficiency of MLSDC [31, 14, 17], a theoretical proof of its convergence is still missing. The convergence of SDC, however, was already extensively examined [8, 15, 18, 20, 37, 33]. It could be shown that, under certain conditions, the method gains one order per iteration up to the accuracy of the solution of the collocation problem. The aim of this work is to prove statements on the convergence behavior of MLSDC using concepts and ideas similar to those used in the convergence proofs for SDC, in particular the one presented in [33].

To this end, we first review SDC along with one of its existing convergence proofs, forming the basis for the following convergence analysis of MLSDC. Then, MLSDC is described and a first convergence theorem is provided. The theorem specifically states that MLSDC behaves at least as well as SDC does. Since this result contradicts our intuitive expectation that the multi-level extension should be more efficient than the original method, we will again examine the convergence proof in greater detail, now for a specific choice of transfer operators between the different levels. As a result, a second theorem on the convergence of MLSDC will be derived, describing an improved behavior of the method if particular conditions are fulfilled. More specifically, we will provide theoretical guidelines for parameter choices in practical applications of MLSDC in order to achieve this improved efficiency. Finally, the theoretical results will be verified by numerical examples.

In the following, SDC is presented as a preconditioned Picard iteration for the collocation problem. The approach and notations used are substantially based on [3, 4] and references therein.
First, the collocation problem for a generic initial value problem is explained. Then, SDC is described as a solver for this problem and compact notations are introduced. Finally, an existing theorem on the convergence of SDC, including its proof, is presented.
Consider the following autonomous initial value problem (IVP)
$$u'(t) = f(u(t)), \quad t \in [t_0, t_1], \qquad u(t_0) = u_0, \qquad (1)$$
with $u(t), u_0 \in \mathbb{C}^N$ and $f : \mathbb{C}^N \to \mathbb{C}^N$, $N \in \mathbb{N}$. To guarantee the existence and uniqueness of the solution, $f$ is required to be Lipschitz continuous. Since a high-order method shall be used, $f$ is additionally assumed to be sufficiently smooth.

The IVP can be written as
$$u(t) = u_0 + \int_{t_0}^{t} f(u(s))\, ds, \quad t \in [t_0, t_1],$$
and choosing $M$ quadrature nodes $\tau_1, \dots, \tau_M$ within the time interval such that $t_0 \le \tau_1 < \tau_2 < \dots < \tau_M = t_1$, the integral is approximated using a spectral quadrature rule like Gauß-Radau. This approach results in the discretized system of equations
$$u_m = u_0 + \Delta t \sum_{j=1}^{M} q_{m,j}\, f(u_j), \quad m = 1, \dots, M, \qquad (2)$$
where $u_m \approx u(\tau_m)$, $\Delta t = t_1 - t_0$ denotes the time-step size and the $q_{m,j}$ represent the quadrature weights for the unit interval with
$$q_{m,j} = \frac{1}{\Delta t} \int_{t_0}^{\tau_m} l_j(s)\, ds.$$
Here, $l_j$ represents the $j$-th Lagrange polynomial corresponding to the set of nodes $(\tau_m)_{1 \le m \le M}$. We can combine these $M$ equations into the following system of linear or non-linear equations, defining the collocation problem:
$$C(U) := (I_{MN} - \Delta t (Q \otimes I_N) F)(U) = U_0, \qquad (3)$$
where $U := (u_1, u_2, \dots, u_M)^T \in \mathbb{C}^{MN}$, $U_0 := (u_0, u_0, \dots, u_0)^T \in \mathbb{C}^{MN}$, $Q := (q_{m,j})_{1 \le m,j \le M}$ is the matrix gathering the quadrature weights, the vector function $F$ is given by $F(U) := (f(u_1), f(u_2), \dots, f(u_M))^T$ and $I_{MN}$, $I_N$ are the identity matrices of dimensions $MN$ and $N$.

As described above, the solution $U$ of the collocation problem approximates the solution of the initial value problem (1). With this in mind, the following theorem, referring to [16, Thm. 7.10], provides a statement on its order of accuracy.

Theorem 2.1
The solution $U = (u_1, u_2, \dots, u_M)^T \in \mathbb{C}^{MN}$ of the collocation problem defined by equation (3) approximates the solution $u$ of the IVP (1) at the collocation nodes. In particular, for $\bar{U} := (u(\tau_1), \dots, u(\tau_M))^T$ the following error estimate applies:
$$\|\bar{U} - U\|_\infty \le C \Delta t^{M+1} \|u\|_{M+1},$$
where $C$ is independent of $\Delta t$, $M$ denotes the number of nodes and $\|u\|_{M+1}$ represents the maximum norm of $u^{(M+1)}$, the $(M+1)$-th derivative of $u$.

Interpreting the collocation problem as a discretization method with discretization parameter $n := \Delta t^{-1}$, the theorem shows that the discrete approximation $U$ defined by the collocation problem converges with order $M+1$ to the solution $u$ of the corresponding IVP.

Since the system of equations (3) defining the collocation problem is naturally dense, as the matrix $Q$ gathering the quadrature weights is fully populated, a direct solution is not advisable, in particular if the right-hand side of the ODE is non-linear. An iterative method to solve the problem is SDC.

The standard Picard iteration for the collocation problem (3) is given by
$$U^{(k+1)} = U^{(k)} + (U_0 - C(U^{(k)})) = U_0 + \Delta t (Q \otimes I_N) F(U^{(k)}). \qquad (4)$$
As this method only converges for very small step sizes $\Delta t$, using a preconditioner to increase range and speed of convergence is reasonable. The SDC-type preconditioners are defined by
$$P(U) = (I_{MN} - \Delta t (Q_\Delta \otimes I_N) F)(U),$$
where the matrix $Q_\Delta = (q^\Delta_{m,j})_{1 \le m,j \le M} \approx Q$ is formed by the use of a simpler quadrature rule. In particular, $Q_\Delta$ is typically a lower triangular matrix, such that solving the system can be easily done by forward substitution.

Common choices for $Q_\Delta$ include the matrix
$$Q_\Delta = \frac{1}{\Delta t} \begin{pmatrix} \Delta\tau_1 & & & \\ \Delta\tau_1 & \Delta\tau_2 & & \\ \vdots & & \ddots & \\ \Delta\tau_1 & \Delta\tau_2 & \cdots & \Delta\tau_M \end{pmatrix}$$
with $\Delta\tau_m = \tau_m - \tau_{m-1}$ for $m = 2, \dots, M$ and $\Delta\tau_1 = \tau_1 - t_0$, representing the right-sided rectangle rule.
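For illustration, both matrices can be assembled numerically for any set of nodes. The sketch below (our own Python code with hypothetical helper names; equidistant nodes are used here instead of the Gauß-Radau nodes of the paper) builds $Q$ by integrating the Lagrange polynomials and $Q_\Delta$ from the right-sided rectangle rule. Since both quadrature rules integrate constants exactly, each row of $Q$ and of $Q_\Delta$ sums to $\tau_m$:

```python
import numpy as np

def lagrange_poly(nodes, j):
    """j-th Lagrange polynomial l_j for the given nodes, as an np.poly1d."""
    p = np.poly1d([1.0])
    for m, tau in enumerate(nodes):
        if m != j:
            p *= np.poly1d([1.0, -tau]) / (nodes[j] - tau)
    return p

def collocation_matrices(nodes):
    """Q with q_{m,j} = int_0^{tau_m} l_j(s) ds (unit interval, t_0 = 0) and
    the lower-triangular Q_Delta of the right-sided rectangle rule."""
    M = len(nodes)
    Q = np.zeros((M, M))
    for j in range(M):
        antider = lagrange_poly(nodes, j).integ()
        Q[:, j] = antider(nodes) - antider(0.0)
    dtau = np.diff(np.concatenate(([0.0], nodes)))  # dtau_1 = tau_1 - t_0
    Q_delta = np.tril(np.tile(dtau, (M, 1)))
    return Q, Q_delta

nodes = np.array([1 / 3, 2 / 3, 1.0])   # equidistant, tau_M = t_1 = 1
Q, Q_delta = collocation_matrices(nodes)
# Both rules integrate constants exactly, so each row sums to tau_m.
print(Q.sum(axis=1), Q_delta.sum(axis=1))
```

Note that $Q$ is fully populated while $Q_\Delta$ is lower triangular, which is exactly what makes the preconditioned system solvable by forward substitution.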
Similarly, the left-sided rectangle rule [27] or a part of the LU decomposition of the matrix $Q$ [35] are chosen. The theoretical considerations in the next chapters do not rely on a specific matrix $Q_\Delta$. However, in the numerical examples the right-sided rectangle rule as given above is used.

By the use of such an operator to precondition the Picard iteration (4), the following iterative method for solving the collocation problem is obtained:
$$(I_{MN} - \Delta t (Q_\Delta \otimes I_N) F)(U^{(k+1)}) = U_0 + \Delta t ((Q - Q_\Delta) \otimes I_N) F(U^{(k)}), \qquad (5)$$
which constitutes the SDC iteration [12, 20]. Written down line-by-line, this formulation recovers the original SDC notation given in [12]. A more implicit formulation is given by
$$U^{(k+1)} = U_0 + \Delta t (Q_\Delta \otimes I_N) F(U^{(k+1)}) + \Delta t ((Q - Q_\Delta) \otimes I_N) F(U^{(k)}) \qquad (6)$$
and this will be used for the following convergence considerations.

There already exist several approaches proving the convergence of SDC, particularly those presented in [8, 15, 18, 20, 37, 33]. Here, we will focus on the idea of the proof from [33] as it uses the previously introduced matrix formulation of SDC, needed for an appropriate adaptation for a convergence proof of MLSDC, and, simultaneously, provides a general result for linear and non-linear initial value problems. We will review the idea of this proof in some detail to introduce the notation and the key ideas. This is followed by a concise discussion on stability and convergence of SDC in the sense of one-step ODE solvers.

The approach in [33] relies on a split of the local truncation error (LTE). The key concept used in the proof is a property of the operators $QF(U)$ and $Q_\Delta F(U)$, respectively, which can be interpreted as a kind of extended Lipschitz continuity. It is presented in the following lemma using the previously introduced notations. For reasons of readability, the sizes of the identity matrices are no longer denoted here.
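To give a concrete impression of the sweep (6) before proceeding, the following sketch (our own Python illustration, using equidistant nodes on the unit interval instead of Gauß-Radau for simplicity) applies SDC to the scalar linear test problem $u' = \lambda u$, for which each sweep reduces to solving a lower-triangular linear system:

```python
import numpy as np

def collocation_matrices(nodes):
    """Quadrature matrix Q (integrated Lagrange polynomials, unit interval,
    t_0 = 0) and the lower-triangular Q_Delta of the right-sided rectangle rule."""
    M = len(nodes)
    Q = np.zeros((M, M))
    for j in range(M):
        p = np.poly1d([1.0])
        for m, tau in enumerate(nodes):
            if m != j:
                p *= np.poly1d([1.0, -tau]) / (nodes[j] - tau)
        antider = p.integ()
        Q[:, j] = antider(nodes) - antider(0.0)
    dtau = np.diff(np.concatenate(([0.0], nodes)))
    return Q, np.tril(np.tile(dtau, (M, 1)))

lam, dt, u0 = -1.0, 0.1, 1.0                     # test problem u' = lam * u
nodes = np.array([1 / 3, 2 / 3, 1.0])
M = len(nodes)
Q, QD = collocation_matrices(nodes)
I, U0 = np.eye(M), np.full(M, u0)
U_coll = np.linalg.solve(I - dt * lam * Q, U0)   # collocation solution of (3)

U = U0.copy()                # initial guess: spread u0 to all collocation nodes
errs = []
for k in range(6):
    # one sweep of eq. (6): (I - dt*lam*QD) U^{k+1} = U0 + dt*lam*(Q - QD) U^{k}
    U = np.linalg.solve(I - dt * lam * QD, U0 + dt * lam * (Q - QD) @ U)
    errs.append(np.max(np.abs(U_coll - U)))
print(errs)
```

For $\lambda = -1$ and $\Delta t = 0.1$ the error with respect to the collocation solution shrinks by roughly one to two orders of magnitude per sweep, consistent with the $O(\Delta t)$ contraction established below.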
Lemma 2.2  If $f : \mathbb{C}^N \to \mathbb{C}^N$ is Lipschitz continuous, the following estimates apply:
$$\|\Delta t (Q \otimes I)(F(U) - F(V))\|_\infty \le C_1 \Delta t \|U - V\|_\infty,$$
$$\|\Delta t (Q_\Delta \otimes I)(F(U) - F(V))\|_\infty \le C_2 \Delta t \|U - V\|_\infty,$$
where the constants $C_1$ and $C_2$ depend on the Lipschitz constant $L$, but are independent of $\Delta t$ and $U, V \in \mathbb{C}^{MN}$.

Proof
This can be shown directly using the definition of the maximum norm, the Lipschitz continuity of $f$ and the compatibility between the maximum absolute row sum norm for matrices and the maximum norm for vectors.

Remark 2.3
For a system of ODEs stemming from a discretized PDE, the constants $C_1$ and $C_2$ may depend on the spatial resolution given by some grid spacing $\Delta x$, because the Lipschitz constant of $f$ may depend on it. In this case we have $C_1 = C_1(\Delta x^{-d})$, $C_2 = C_2(\Delta x^{-d})$ for $d \in \mathbb{N}$. For example, using second-order finite differences in space for the heat equation results in the ODE system $u' = Au$ with a matrix $A$ whose norm is in $O(\Delta x^{-2})$, i.e. $d = 2$ in this case. This has to be kept in mind for most of the upcoming results and we will address this point separately in remarks where appropriate. This will be particularly relevant for the convergence results in section 3.2, where the spatial discretization plays a key role. Note, however, that this is a rather pessimistic estimate. When focusing on spatial operators with more restrictive properties (e.g. linearity) or when using a specific matrix $Q_\Delta$, the convergence results can be improved substantially, both in terms of constants and time-step size restrictions. For SDC, this has already been done, see e.g. [35].

The following theorem provides a convergence statement for SDC using the presented lemma in the proof.
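Before stating the theorem, the scaling described in the remark can be checked numerically. The sketch below (our own code) assembles the standard second-order finite-difference Laplacian on $(0,1)$ with homogeneous Dirichlet boundary conditions and verifies that the Lipschitz constant of $f(u) = Au$ in the maximum norm, i.e. the maximum absolute row sum $\|A\|_\infty = 4/\Delta x^2$, grows like $\Delta x^{-2}$, so $d = 2$:

```python
import numpy as np

def fd_laplacian(n):
    """1D second-order FD Laplacian with n interior points on (0,1),
    homogeneous Dirichlet boundaries, grid spacing dx = 1/(n+1)."""
    dx = 1.0 / (n + 1)
    A = (np.diag(-2.0 * np.ones(n))
         + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1)) / dx**2
    return A, dx

# The Lipschitz constant of f(u) = A u in the maximum norm is the maximum
# absolute row sum: (|-2| + |1| + |1|) / dx^2 = 4 / dx^2 for interior rows.
for n in (9, 19, 39):
    A, dx = fd_laplacian(n)
    L = np.linalg.norm(A, np.inf)
    print(f"dx = {dx:.4f}, L = {L:.1f}, L * dx^2 = {L * dx**2:.1f}")
```

Halving $\Delta x$ quadruples the Lipschitz constant, which is exactly the $\Delta x^{-d}$-dependence, with $d = 2$, that the later time-step restrictions have to absorb.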
Theorem 2.4
Consider a generic initial value problem like (1) with a Lipschitz-continuous function $f$ on the right-hand side.

If the step size $\Delta t$ is sufficiently small, SDC converges linearly to the solution $U$ of the collocation problem with a convergence rate in $O(\Delta t)$, i.e. the following estimate for the error of the $k$-th iterate $U^{(k)}$ of SDC compared to the solution of the collocation problem is valid:
$$\|U - U^{(k)}\|_\infty \le C \Delta t \|U - U^{(k-1)}\|_\infty, \qquad (7)$$
where the constant $C$ is independent of $\Delta t$.

If, additionally, the solution $u$ of the initial value problem is $(M+1)$-times continuously differentiable, the LTE of SDC compared to the solution $\bar{U}$ of the ODE can be bounded by
$$\|\bar{U} - U^{(k)}\|_\infty \le C_3 \Delta t^{k+k_0} \|u\|_{k_0+1} + C_4 \Delta t^{M+1} \|u\|_{M+1} = O(\Delta t^{\min(k+k_0,\, M+1)}), \qquad (8)$$
where the constants $C_3$ and $C_4$ are independent of $\Delta t$, $k_0$ denotes the approximation order of the initial guess $U^{(0)}$ and $\|u\|_p$ is defined by $\|u^{(p)}\|_\infty$.

Proof
We again closely follow [33] here. According to the definition of the collocation problem (3) and an SDC iteration (6), it follows that
$$\|U - U^{(k)}\|_\infty = \|\Delta t (Q \otimes I)(F(U) - F(U^{(k-1)})) + \Delta t (Q_\Delta \otimes I)(F(U^{(k-1)}) - F(U^{(k)}))\|_\infty.$$
Together with the triangle inequality and lemma 2.2, we obtain
$$\|U - U^{(k)}\|_\infty \le C_1 \Delta t \|U - U^{(k-1)}\|_\infty + C_2 \Delta t \|U^{(k-1)} - U^{(k)}\|_\infty.$$
Applying the triangle inequality again yields
$$\|U - U^{(k)}\|_\infty \le \tilde{C}_1 \Delta t \|U - U^{(k-1)}\|_\infty + C_2 \Delta t \|U - U^{(k)}\|_\infty,$$
where here and in the following, we use variables of the form $\tilde{C}_i$ to denote temporarily arising constants. We continue by subtracting $C_2 \Delta t \|U - U^{(k)}\|_\infty$ from both sides and dividing by $1 - C_2 \Delta t$, which results in
$$\|U - U^{(k)}\|_\infty \le \frac{\tilde{C}_1}{1 - C_2 \Delta t} \Delta t \|U - U^{(k-1)}\|_\infty.$$
If the step size is sufficiently small, in particular
$$C_2 \Delta t < 1, \qquad (9)$$
the following estimate is valid:
$$\frac{\tilde{C}_1}{1 - C_2 \Delta t} \le C, \qquad (10)$$
which concludes the proof for equation (7).

Continuing with recursive insertion, we get
$$\|U - U^{(k)}\|_\infty \le \tilde{C}_2 \Delta t^k \|U - U^{(0)}\|_\infty.$$
Since $U^{(0)}$ is assumed to be an approximation of $k_0$-th order, we further know that
$$\|\bar{U} - U^{(0)}\|_\infty \le \tilde{C}_3 \Delta t^{k_0} \|u\|_{k_0+1}.$$
This estimate together with the triangle inequality and the error estimate for the solution of the collocation problem stated in theorem 2.1 yields
$$\|U - U^{(k)}\|_\infty \le \tilde{C}_2 \Delta t^k \left( \|\bar{U} - U\|_\infty + \|\bar{U} - U^{(0)}\|_\infty \right) \le \tilde{C}_4 \Delta t^{M+1+k} \|u\|_{M+1} + C_3 \Delta t^{k+k_0} \|u\|_{k_0+1}. \qquad (11)$$
Altogether, it follows that
$$\|\bar{U} - U^{(k)}\|_\infty \le \|\bar{U} - U\|_\infty + \|U - U^{(k)}\|_\infty \le C \Delta t^{M+1} \|u\|_{M+1} + \tilde{C}_4 \Delta t^{M+1+k} \|u\|_{M+1} + C_3 \Delta t^{k+k_0} \|u\|_{k_0+1} = (C + \tilde{C}_4 \Delta t^k)\, \Delta t^{M+1} \|u\|_{M+1} + C_3 \Delta t^{k+k_0} \|u\|_{k_0+1}.$$
Since the step size $\Delta t$ is assumed to be sufficiently small, i.e. bounded above, the following estimate is valid:
$$C + \tilde{C}_4 \Delta t^k \le C_4,$$
which finally concludes the proof for equation (8).

Remark 2.5
Note that if the right-hand side of the ODE comes from a discretized PDE with a given spatial resolution, no additional restriction is posed. In this case, $C_2 = C_2(\Delta x^{-d}) = \tilde{C}_1 \Delta x^{-d}$ for some constant $\tilde{C}_1$, so that the condition (9) becomes $\tilde{C}_1 \Delta t < \Delta x^d$. Since we did not specify $Q_\Delta$ here, the SDC iterations can be explicit or implicit and it is natural to obtain such a restriction on the time-step size. Similar restrictions can be found in other convergence results for SDC, see e.g. [8, 15, 20]. Condition (10) then translates to
$$C \ge \frac{\tilde{C}_2}{1 - C_2 \Delta t} = \frac{\tilde{C}_2}{1 - \tilde{C}_1 \Delta x^{-d} \Delta t} = \frac{\tilde{C}_2 \Delta x^d}{\Delta x^d - \tilde{C}_1 \Delta t}$$
for some constant $\tilde{C}_2$. Assuming a fixed distance $\delta$ between $\Delta x^d$ and $\tilde{C}_1 \Delta t$, i.e. $\Delta x^d - \tilde{C}_1 \Delta t = \delta$, we can write $C = C(\delta^{-1})$ to indicate the dependence of $C$ on this distance and not on $\Delta x^d$ or $\Delta t$ alone. Thus, $C_2$ does not pose an additional restriction to the convergence of SDC. Note that for $C_3$ we have $C_3 = C_3(\delta^{-(k+1)})$, so that the choice of $\delta$ can increase the constant in front of the $\Delta t^{k+k_0}$-term quite substantially, but it does not affect the $\Delta t^{M+1}$-term coming from the collocation problem itself.

The theorem can be read as a convergence statement for SDC. In particular, the first estimate (7) shows that SDC, interpreted as an iterative method to solve the collocation problem, converges linearly to the solution of the collocation problem with a convergence rate of $O(\Delta t)$ if $C \Delta t < 1$. The second part of the theorem, equation (8), shows that SDC, in the sense of a discretization method, converges with order $\min(k+k_0, M+1)$ to the solution of the initial value problem. In other words, the method gains one order per iteration, limited by the selected number of nodes used for discretization.

We can now immediately extend this result by looking at the right endpoint of the single time interval (which, in our case, is equal to the last collocation node). There, the convergence rate is limited not by the number of collocation nodes, but by the order of the quadrature.

Corollary 2.6
Consider a generic initial value problem like (1) with a Lipschitz-continuous function $f$ on the right-hand side. Furthermore, let the solution $u$ of the initial value problem be $(2M)$-times continuously differentiable.

Then, if the step size $\Delta t$ is sufficiently small, the error of the $k$-th iterate of SDC, defined by equation (6), at the last collocation node, $u^{(k)}_M$, compared to the exact value at this point, $u(\tau_M)$, can be bounded by
$$\|u(\tau_M) - u^{(k)}_M\|_\infty \le C_1 \Delta t^{2M} \max(\|u\|_{2M}, \|u\|_{M+1}) + C_2 \Delta t^{k+k_0} \max(\|u\|_{k_0+1}, \|u\|_{M+1}) \qquad (12)$$
$$= O(\Delta t^{\min(k+k_0,\, 2M)}),$$
where the constants $C_1$ and $C_2$ are independent of $\Delta t$, $k_0$ denotes the approximation order of the initial guess $U^{(0)}$ and $\|u\|_p$ is defined by $\|u^{(p)}\|_\infty$.

Proof
The proof mainly relies on the interpretation of the solution of the collocation problem evaluated at the last node $\tau_M$ as the result of a Radau method with $M$ stages. With this in mind, the well-known convergence, or in this case rather consistency, order of Radau methods yields the estimate [34]
$$\|u(\tau_M) - u_M\|_\infty \le \tilde{C}_1 \Delta t^{2M} \|u\|_{2M},$$
where $\tilde{C}_1$ is independent of $\Delta t$. Here and in the following, temporarily arising constants will again be denoted by symbols like $\tilde{C}_i$. However, they are separately defined and thus do not correspond to the ones used in previous proofs.

To use this estimate, we first apply the triangle inequality to the left-hand side of equation (12), in particular
$$\|u(\tau_M) - u^{(k)}_M\|_\infty \le \|u(\tau_M) - u_M\|_\infty + \|u_M - u^{(k)}_M\|_\infty. \qquad (13)$$
Then, with the definition of the vector $R := (0, \dots, 0, 1) \in \mathbb{R}^{1 \times M}$ which, multiplied with another vector, only captures its last value, the second term on the right-hand side of the above equation can be transformed to
$$\|u_M - u^{(k)}_M\|_\infty = \|RU - RU^{(k)}\|_\infty \le \|R\|_\infty \|U - U^{(k)}\|_\infty = \|U - U^{(k)}\|_\infty \le \tilde{C}_2 \Delta t^{M+1+k} \|u\|_{M+1} + C_3 \Delta t^{k+k_0} \|u\|_{k_0+1},$$
where the last estimate comes from equation (11) in the proof of theorem 2.4. Finally, by inserting all these results in equation (13), we obtain
$$\|u(\tau_M) - u^{(k)}_M\|_\infty \le \tilde{C}_1 \Delta t^{2M} \|u\|_{2M} + \tilde{C}_2 \Delta t^{M+1+k} \|u\|_{M+1} + C_3 \Delta t^{k+k_0} \|u\|_{k_0+1}.$$
Note that the leading order of this bound is essentially independent of the summand corresponding to $\Delta t^{M+1+k}$. For $\Delta t$ small enough, this can be seen by a case analysis for $k$.
For $k+1 \ge M$, the considered summand is dominated by $\Delta t^{2M} \ge \Delta t^{M+1+k}$ and thus can be disregarded in terms of leading-order analysis. In the other case, i.e. for $k+1 < M$, the considered summand will, however, be greater than the one of order $2M$. Therefore, we will instead compare it to $\Delta t^{k+k_0}$ in this case. Since the number of collocation nodes $M$ is usually chosen to be greater than the approximation order $k_0$ of the initial guess, the relation $\Delta t^{k+k_0} \ge \Delta t^{M+1+k}$ applies and hence $k+k_0$ will be the leading order for $k+1 < M$. Thus, the considered summand $\Delta t^{M+1+k}$ is again dominated by another term and can be disregarded concerning the overall asymptotic behavior. These considerations consequently lead to the following estimates:
$$\|u(\tau_M) - u^{(k)}_M\|_\infty \le \tilde{C}_3 \Delta t^{2M} \max(\|u\|_{2M}, \|u\|_{M+1}) + C_3 \Delta t^{k+k_0} \|u\|_{k_0+1}$$
for $k+1 \ge M$ and
$$\|u(\tau_M) - u^{(k)}_M\|_\infty \le \tilde{C}_1 \Delta t^{2M} \|u\|_{2M} + \tilde{C}_4 \Delta t^{k+k_0} \max(\|u\|_{k_0+1}, \|u\|_{M+1})$$
for $k+1 < M$, which can be combined to
$$\|u(\tau_M) - u^{(k)}_M\|_\infty \le C_1 \Delta t^{2M} \max(\|u\|_{2M}, \|u\|_{M+1}) + C_2 \Delta t^{k+k_0} \max(\|u\|_{k_0+1}, \|u\|_{M+1}),$$
concluding the proof.

With this corollary, it can be concluded that SDC, in the sense of a single-step method to solve ODEs, is consistent of order $\min(k+k_0-1,\, 2M-1)$.

Theorem 2.7
Consider a generic initial value problem like (1) with a Lipschitz-continuous function $f$ on the right-hand side. If the step size $\Delta t$ is sufficiently small and an appropriate initial guess is used, the SDC method, defined by equation (5), is stable.

Proof
As usual for single-step methods, we will prove the Lipschitz continuity of the increment function of SDC in order to prove the stability of the method.

A general single-step method is defined by the formula
$$u_{n+1} = u_n + \Delta t\, \phi(u_n),$$
where $u_n$ denotes the approximation at the time step $t_n$ and $\phi(u_n)$ is the increment function. Our aim now is to identify the specific increment function $\phi$ corresponding to SDC and, subsequently, to show its Lipschitz continuity, i.e. prove the validity of $|\phi(u_n) - \phi(v_n)| \le L_\phi |u_n - v_n|$.

First, note that the $k$-th iterate of SDC $u^{(k)}_m$ at an arbitrary collocation node $\tau_m$ ($1 \le m \le M$) can be written as $u^{(k)}_m = u_n + r^{(k)}_m$ with
$$r^{(k)}_m := \Delta t \left( Q_\Delta F(U_n + r^{(k)}) \right)_m + \Delta t \left( (Q - Q_\Delta) F(U_n + r^{(k-1)}) \right)_m,$$
$r^{(k)} := (r^{(k)}_1, \dots, r^{(k)}_M)^T$ and $U_n = (u_n, \dots, u_n)^T$ for $k \ge 1$, where $(\cdot)_m$ denotes the $m$-th line of the vectors. Consequently, the corresponding approximation at the time step $t_{n+1} = \tau_M$ can be written as
$$u_{n+1} = u^{(k)}_M = u_n + r^{(k)}_M =: u_n + \Delta t\, \phi^{(k)}(u_n)$$
with $\phi^{(k)}(u_n) = \frac{1}{\Delta t} r^{(k)}_M$. Hence, we have found an appropriate, albeit implicit, definition for the increment function $\phi^{(k)}$ of SDC.

As a second step, we investigate the Lipschitz continuity of this function. For that, we start by noting that
$$|\phi^{(k)}(u_n) - \phi^{(k)}(v_n)| = \frac{1}{\Delta t} |r^{(k)}_M - s^{(k)}_M| \le \frac{1}{\Delta t} \|r^{(k)} - s^{(k)}\|_\infty, \qquad (14)$$
where the $r$-terms belong to $u_n$ and the $s$-terms to $v_n$. Now, we will further analyze the term $\|r^{(k)} - s^{(k)}\|$. Inserting the corresponding definitions and applying the triangle inequality, it follows that
$$\|r^{(k)} - s^{(k)}\| \le \|\Delta t Q_\Delta (F(U_n + r^{(k)}) - F(V_n + s^{(k)}))\| + \|\Delta t (Q - Q_\Delta)(F(U_n + r^{(k-1)}) - F(V_n + s^{(k-1)}))\|.$$
The use of lemma 2.2 and a reapplication of the triangle inequality further yield
$$\|r^{(k)} - s^{(k)}\| \le \tilde{C}_1 \Delta t \left( |u_n - v_n| + \|r^{(k)} - s^{(k)}\| + \|r^{(k-1)} - s^{(k-1)}\| \right).$$
Continuing with the same trick as in the proof of theorem 2.4, namely subtracting $\tilde{C}_1 \Delta t \|r^{(k)} - s^{(k)}\|$ and subsequently dividing by $1 - \tilde{C}_1 \Delta t$, we get
$$\|r^{(k)} - s^{(k)}\| \le \frac{\tilde{C}_1}{1 - \tilde{C}_1 \Delta t} \Delta t \left( |u_n - v_n| + \|r^{(k-1)} - s^{(k-1)}\| \right).$$
If the step size $\Delta t$ is sufficiently small, the following estimate applies:
$$\frac{\tilde{C}_1}{1 - \tilde{C}_1 \Delta t} \le \tilde{C}_2,$$
and a subsequent iterative insertion further yields
$$\|r^{(k)} - s^{(k)}\| \le \tilde{C}_3 \sum_{l=1}^{k} \Delta t^l\, |u_n - v_n| + \tilde{C}_3 \Delta t^k \|r^{(0)} - s^{(0)}\|.$$
Inserting this result in equation (14) above, it finally follows that
$$|\phi^{(k)}(u_n) - \phi^{(k)}(v_n)| \le C \sum_{l=0}^{k-1} \Delta t^l\, |u_n - v_n| + C \Delta t^{k-1} \|r^{(0)} - s^{(0)}\|.$$
The value of $\|r^{(0)} - s^{(0)}\|$ depends on the initial guess for the SDC iterations. If, for example, the value at the last time step is used as the initial guess for all collocation nodes, i.e. $U^{(0)} = U_n$ and $V^{(0)} = V_n$, we get $\|r^{(0)} - s^{(0)}\| = 0$. If, by contrast, the initial guess is chosen to be zero, it follows that $\|r^{(0)} - s^{(0)}\| = |u_n - v_n|$. Both variants, however, guarantee that $|\phi^{(k)}(u_n) - \phi^{(k)}(v_n)| \le C |u_n - v_n|$, which was to be shown.

Remark 2.8
Note that the assumed upper bound for the step size $\Delta t$ in the previous theorem is the same as the one in corollary 2.6, describing the consistency of SDC. Hence, there is no additional restriction for the convergence of the method.

Together with the last theorem, corollary 2.6 can be extended towards a convergence theorem for SDC regarded in the context of single-step methods to solve ODEs. Specifically, the proven stability of the method allows a direct transfer of the order of consistency to the order of convergence. Consequently, it follows that SDC, in the sense of a single-step method, converges with order $\min(k+k_0-1,\, 2M-1)$.

Multi-level SDC (MLSDC) is a method that uses a multigrid-like approach to solve the collocation problem (3). It is an extension of SDC in which the iterations, now called "sweeps" in this context, are computed on a hierarchy of levels and the individual solutions are coupled in the same manner as used in the full approximation scheme (FAS) for non-linear multigrid methods.

The different levels are typically created by using discretizations of various resolutions. In this paper, only the two-level algorithm is considered. For this purpose, let $\Omega_h$ denote the fine level and $\Omega_H$ the coarse one. Then, $U_h$ denotes the discretized vector on $\Omega_h$. Furthermore, $C_h$, $F_h$ and $Q_h$ are the discretizations of the operators and the quadrature matrix. Likewise, $U_H$, $C_H$, $F_H$ and $Q_H$ represent the corresponding values for the discretization parameter $H$.

Here, we will consider two coarsening strategies. The first one is a re-discretization in time of the collocation problem, i.e. a reduction of collocation nodes. The second possibility, only applicable if a partial differential equation has to be solved, is a re-discretization in space, i.e.
the use of fewer variables for the conversion into an ODE.

Since it is necessary to perform computations on different levels, a method to transfer vectors between the individual levels is needed. For this purpose, let $I_H^h$ denote the operator that transfers a vector from the coarse level $\Omega_H$ to the fine level $\Omega_h$. This operator is called the interpolation operator. $I_h^H$, on the other hand, shall represent the operator for the reverse direction. It is called the restriction operator. Both operators together are called transfer operators.

In detail, the MLSDC two-level algorithm consists of these four steps:

(1) Compute the $\tau$-correction as the difference between coarse and fine level:
$$\tau = C_H(I_h^H U_h^{(k)}) - I_h^H C_h(U_h^{(k)}) = I_h^H(\Delta t\, Q_h F_h(U_h^{(k)})) - \Delta t\, Q_H F_H(I_h^H U_h^{(k)}). \qquad (15)$$

(2) Perform an SDC sweep to approximate the solution of the modified collocation problem on the coarse level,
$$C_H(U_H) = U_{0,H} + \tau \quad \text{on } \Omega_H, \qquad (16)$$
beginning with $I_h^H U_h^{(k)}$:
$$U_H^{(k+1/2)} = U_{0,H} + \tau + \Delta t\, Q_{\Delta,H} F_H(U_H^{(k+1/2)}) + \Delta t\, (Q_H - Q_{\Delta,H}) F_H(I_h^H U_h^{(k)}). \qquad (17)$$

(3) Compute the coarse level correction:
$$U_h^{(k+1/2)} = U_h^{(k)} + I_H^h \left( U_H^{(k+1/2)} - I_h^H U_h^{(k)} \right). \qquad (18)$$

(4) Perform an SDC sweep to approximate the solution of the original collocation problem $C_h(U_h) = U_{0,h}$ on $\Omega_h$, beginning with $U_h^{(k+1/2)}$:
$$U_h^{(k+1)} = U_{0,h} + \Delta t\, Q_{\Delta,h} F_h(U_h^{(k+1)}) + \Delta t\, (Q_h - Q_{\Delta,h}) F_h(U_h^{(k+1/2)}). \qquad (19)$$

Note that for better readability, the enlargements of the matrices $Q$ and $Q_\Delta$ by the Kronecker product with the identity matrix are no longer indicated.

Here, we will extend the existing convergence proof for SDC, as presented in theorem 2.4, to prove the convergence of its multi-level extension MLSDC. The following theorem provides an appropriate convergence statement. In the proof, we use very similar ideas as in the one for the convergence of SDC.
Theorem 3.1
Consider a generic initial value problem like (1) with a Lipschitz-continuous function $f$ on the right-hand side.

If the step size $\Delta t$ is sufficiently small, MLSDC converges linearly to the solution of the collocation problem with a convergence rate in $O(\Delta t)$, i.e. the following estimate for the error of the $k$-th iterate $U_h^{(k)}$ of MLSDC compared to the solution of the collocation problem $U_h$ is valid:
$$\|U_h - U_h^{(k)}\|_\infty \le C \Delta t \|U_h - U_h^{(k-1)}\|_\infty, \qquad (20)$$
where the constant $C$ is independent of $\Delta t$.

If, additionally, the solution $u$ of the initial value problem is $(M+1)$-times continuously differentiable, the LTE of MLSDC compared to the solution of the ODE can be bounded by
$$\|\bar{U} - U_h^{(k)}\|_\infty \le C_3 \Delta t^{k+k_0} \|u\|_{k_0+1} + C_4 \Delta t^{M+1} \|u\|_{M+1} \qquad (21)$$
$$= O(\Delta t^{\min(k+k_0,\, M+1)}), \qquad (22)$$
where the constants $C_3$ and $C_4$ are independent of $\Delta t$, $k_0$ denotes the approximation order of the initial guess $U^{(0)}$ and $\|u\|_p$ is defined by $\|u^{(p)}\|_\infty$.

Proof
For better readability, the maximum norm $\|\cdot\|_\infty$ is denoted by the simple norm $\|\cdot\|$ within this proof. Besides, we consider $U_h^{(k+1)}$ instead of $U_h^{(k)}$ here, in order to enable consistent references to the definition of the MLSDC algorithm above.

As the last step of an MLSDC iteration, in particular equation (19), corresponds to an SDC iteration, we can use theorem 2.4 to get an initial error estimate. Keeping in mind that the SDC iteration is based on $U_h^{(k+\frac12)}$ as initial guess here, the application of the mentioned theorem yields the estimate

$$\|U_h - U_h^{(k+1)}\| \le C_1\,\Delta t\,\|U_h - U_h^{(k+\frac12)}\|, \tag{23}$$

if the step size $\Delta t$ is sufficiently small.

Now, the expression on the right-hand side of this inequality will be examined further. In this context, the definition of an MLSDC iteration, in particular equation (18), yields

$$\|U_h - U_h^{(k+\frac12)}\| = \Big\|U_h - U_h^{(k)} - I_H^h\big(U_H^{(k+\frac12)} - I_h^H U_h^{(k)}\big)\Big\| = \Big\|U_h - U_h^{(k)} - I_H^h\big(U_H^{(k+\frac12)} + U_H - U_H - I_h^H U_h^{(k)}\big)\Big\| = \Big\|\big(I - I_H^h I_h^H\big)\big(U_h - U_h^{(k)}\big) + I_H^h\big(U_H - U_H^{(k+\frac12)}\big)\Big\|,$$

where in the last step we used the identity $U_H = I_h^H U_h$, which applies as a consequence of the $\tau$-correction stemming from the usage of FAS. We get

$$\|U_h - U_h^{(k+\frac12)}\| \le \big\|\big(I - I_H^h I_h^H\big)\big(U_h - U_h^{(k)}\big)\big\| + \big\|I_H^h\big(U_H - U_H^{(k+\frac12)}\big)\big\| \tag{24}$$
$$\le \tilde C_1\,\|U_h - U_h^{(k)}\| + \tilde C_2\,\|U_H - U_H^{(k+\frac12)}\|, \tag{25}$$

where here and in the following, temporarily arising constants are again denoted by symbols like $\tilde C_i$.

Next, we investigate the newly emerged term, the second summand of equation (25). Inserting the corresponding definitions, namely equations (16) and (17), together with the application of the triangle inequality and lemma 2.2, yields

$$\|U_H - U_H^{(k+\frac12)}\| = \Big\|\Delta t\,Q_H\big(F_H(U_H) - F_H(I_h^H U_h^{(k)})\big) + \Delta t\,Q_{\Delta,H}\big(F_H(I_h^H U_h^{(k)}) - F_H(U_H^{(k+\frac12)})\big)\Big\| \le C_{1,H}\,\Delta t\,\|U_H - I_h^H U_h^{(k)}\| + C_{2,H}\,\Delta t\,\|I_h^H U_h^{(k)} - U_H^{(k+\frac12)}\| \le \tilde C_3\,\Delta t\,\|U_H - I_h^H U_h^{(k)}\| + C_{2,H}\,\Delta t\,\|U_H - U_H^{(k+\frac12)}\|.$$

Subtracting $C_{2,H}\,\Delta t\,\|U_H - U_H^{(k+\frac12)}\|$ from both sides and dividing by $1 - C_{2,H}\,\Delta t$ results in

$$\|U_H - U_H^{(k+\frac12)}\| \le \frac{\tilde C_3}{1 - C_{2,H}\,\Delta t}\,\Delta t\,\|U_H - I_h^H U_h^{(k)}\|.$$

With the same argumentation as above, given a sufficiently small step size, it follows that

$$\|U_H - U_H^{(k+\frac12)}\| \le \tilde C_4\,\Delta t\,\|U_H - I_h^H U_h^{(k)}\| \le \tilde C_5\,\Delta t\,\|U_h - U_h^{(k)}\|, \tag{26}$$

where in the last step we used the identity $U_H = I_h^H U_h$.

By inserting equation (25) and this result subsequently into equation (23), we obtain

$$\|U_h - U_h^{(k+1)}\| \le C_1\,\Delta t\,\|U_h - U_h^{(k+\frac12)}\| \le C_1\,\Delta t\,\big(\tilde C_1\,\|U_h - U_h^{(k)}\| + \tilde C_2\,\|U_H - U_H^{(k+\frac12)}\|\big) \le \tilde C_6\,\Delta t\,\|U_h - U_h^{(k)}\| + \tilde C_7\,\Delta t\,\big(\tilde C_5\,\Delta t\,\|U_h - U_h^{(k)}\|\big) = \big(\tilde C_6 + \tilde C_8\,\Delta t\big)\,\Delta t\,\|U_h - U_h^{(k)}\|.$$

Since the step size $\Delta t$ is assumed to be sufficiently small, i.e., bounded above, the estimate $\tilde C_6 + \tilde C_8\,\Delta t \le C$ is valid, which concludes the proof of equation (20).

The proof of equation (21) is similar to the one of equation (8) in theorem 2.4, using the previous result.

Remark 3.2
As before, if the Lipschitz constant of $f$ depends on the spatial resolution, then the constants $\tilde C_5$ and $C_{2,H}$ do as well. In (26), the constant $\tilde C_4$ then only depends on the original condition $C_{2,H}\,\Delta t < 1$, which is just the condition we know from SDC, but with a scaled constant if spatial coarsening is applied. As in remark 2.5, we can write $\tilde C_4 = \tilde C_4(\delta^{-1})$. This only affects $C_1$ in equation (21), where now $C_1 = C_1(\delta^{-(k+k_0)})$.

As theorem 2.4, this theorem can also be read as a convergence statement: it shows that MLSDC, interpreted as an iterative method for solving the collocation problem, converges linearly with a convergence rate in $\mathcal{O}(\Delta t)$ if $C\,\Delta t < 1$. Moreover, the second part of the theorem shows that MLSDC, in the sense of a discretization method for ODEs, converges with order $\min(k+k_0,\,M+1)$.

Remark 3.3
The results regarding consistency and stability of SDC, namely corollary 2.6 and theorem 2.7, can easily be adapted to MLSDC. Analogous to SDC, it can be proven that the error at the last collocation node is bounded by $\mathcal{O}(\Delta t^{\,\min(k+k_0,\,2M)})$ and that the increment function of MLSDC is Lipschitz continuous for $\Delta t$ small enough.

Although theorem 3.1 is the first general convergence theorem for MLSDC, its statement is rather disappointing: it merely establishes that MLSDC converges at least as fast as SDC, although more work is done per iteration.

A deeper look into the proof of the theorem gives an idea of the cause for this rather unexpectedly low convergence order. In particular, it is the estimate leading to equation (25) which is responsible for this issue. This equation implies that

$$\|U_h - U_h^{(k+\frac12)}\| \le C\,\|U_h - U_h^{(k)}\|,$$

which essentially means that the additional iteration on the coarse level does not gain any additional order in $\Delta t$ compared to the previously computed iterate $U_h^{(k)}$ on the fine level. More specifically, it is the estimate

$$\big\|\big(I - I_H^h I_h^H\big)\big(U_h - U_h^{(k)}\big)\big\|_\infty \le C\,\|U_h - U_h^{(k)}\|_\infty$$

in equation (25) which leads to this result.

Thus, a possibly superior behavior of MLSDC seems to depend on the magnitude of $\|(I - I_H^h I_h^H)e_h\|_\infty$ with $e_h = U_h - U_h^{(k)} \in \mathbb{C}^{MN}$, which describes the difference between an original vector and the one resulting from restricting and interpolating it. Consequently, this term can be interpreted as the quality of the approximation on the coarse level or, respectively, the accuracy loss it causes. In the following, this term will be examined in detail, resulting in a new theorem for the convergence of MLSDC with a higher convergence order but additional assumptions which have to be met.
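The size of this restrict-then-interpolate defect can be probed directly with a toy setup. The sketch below uses assumptions for illustration only: a periodic one-dimensional grid, injection as restriction and piecewise-linear interpolation (i.e., $p = 2$); all function names are hypothetical.

```python
import numpy as np

def restrict_injection(u_f):
    """Injection: keep every second fine-grid value (even fine grid size)."""
    return u_f[::2]

def interpolate_linear(u_c):
    """Piecewise Lagrange interpolation of order p = 2 (linear) on a
    periodic grid: midpoints are averages of their two coarse neighbors."""
    n_c = len(u_c)
    u_f = np.zeros(2 * n_c)
    u_f[::2] = u_c
    u_f[1::2] = 0.5 * (u_c + np.roll(u_c, -1))
    return u_f

def coarse_grid_defect(e_f):
    """Max-norm of (I - I_H^h I_h^H) e_h for the two operators above."""
    return np.max(np.abs(e_f - interpolate_linear(restrict_injection(e_f))))
```

For a smooth error vector such as a single sine mode, the defect then scales like $\Delta x_H^2$ (halving the grid spacing reduces it by a factor of about four), while for a rough, random error it stagnates at $\mathcal{O}(\|e_h\|_\infty)$ — precisely the two regimes the following analysis separates.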
For this purpose, we will mainly focus on a specific coarsening strategy here, in particular coarsening in space. The differences arising from coarsening in time will be discussed at the end of this section. Moreover, we will focus on particular methods used for the transfer operators. For $I_H^h$, we consider a piecewise Lagrange interpolation of order $p$. This means that instead of using all $N_H$ available values to approximate the value at a particular point $x_i \in \Omega_h$, only its $p$ neighbors are taken into account for this purpose. Hence, $I_H^h$ corresponds to the application of a $p$-th order Lagrange interpolation for each point. For the restriction operator $I_h^H$, on the other hand, we consider simple injection. Thereby, we can mainly focus on the interpolation order and disregard the restriction order.

The following lemma provides an appropriate estimate for the considered term $\|(I - I_H^h I_h^H)e_h\|_\infty$.

Lemma 3.4
Let $E := (E_m)_{1\le m\le M}$ denote the remainders of the truncated inverse discrete Fourier transform of $U_h - U_h^{(k)}$, i.e.,

$$E_m := \sum_{\ell=N_0}^{N-1} |c_{m,\ell}|, \qquad m = 1,\ldots,M,$$

for some cutoff index $N_0 \le N$ and $c_{m,\ell} \in \mathbb{C}$ being the Fourier coefficients. Then, the following estimate for the error is valid:

$$\big\|\big(I - I_H^h I_h^H\big)\big(U_h - U_h^{(k)}\big)\big\|_\infty \le \big(C_1\,\Delta x^p + C_2(E)\big)\,\|U_h - U_h^{(k)}\|_\infty,$$

where $I \equiv I_{MN}$ denotes the identity matrix of size $MN$, $I_H^h$ is the piecewise spatial Lagrange interpolation of order $p$ and $I_h^H$ denotes the injection operator. Furthermore, $\Delta x \equiv \Delta x_H$ is defined as the spatial resolution on the coarse level $\Omega_H$ of MLSDC.

Proof
First of all, we need to introduce some definitions. For ease of notation, the considered error vector $U_h - U_h^{(k)}$ will be denoted by $e_h$ within the proof. In detail, the following definition is used:

$$e_{m,n} = e_{h,n}(\tau_m) = u_{h,n}(\tau_m) - u_{h,n}^{(k)}(\tau_m), \qquad \forall\, m = 1,\ldots,M,\ n = 1,\ldots,N,$$

where $m$ denotes the temporal index identifying the particular collocation node $\tau_m$ and $n$ represents the spatial index referring to the discretized points $(x_n)_{1\le n\le N}$ within the considered spatial interval $[0, S]$. Additionally, we assume the spatial steps to be equidistant here.

Another definition needed for the proof is $g_{m,p}(x)$. It denotes the Lagrange interpolation polynomial of order $p$ for the restricted vector $I_h^H e_h$ at each point in time $\tau_m$, $m = 1,\ldots,M$. As the two levels $\Omega_h$ and $\Omega_H$ differ in their spatial resolution, the transfer operators are applied along the spatial axis and can thus be considered separately for each component $e_h(\tau_m)$. More specifically, the restriction operator, corresponding to simple injection according to the assumptions, omits several values of $e_h(\tau_m) \in \mathbb{C}^N$, resulting in $(I_h^H e_h)(\tau_m) \in \mathbb{C}^{N_H}$, where $N_H$ denotes the number of degrees of freedom on the coarse level. The subsequent application of the interpolation operator to this vector then leads to the $M$ interpolating polynomials $(g_{m,p}(x))_{1\le m\le M}$, with $p$ referring to their order of accuracy.

Having introduced these notations, the considered term can be written as

$$\big(I - I_H^h I_h^H\big)\big(U_h - U_h^{(k)}\big) = \big(I - I_H^h I_h^H\big)e_h = \big(e_{m,n} - g_{m,p}(x_n)\big)_{1\le m\le M,\ 1\le n\le N},$$

so that we can focus on $|e_{m,n} - g_{m,p}(x_n)|$.

Since $g_{m,p}(x)$ partially interpolates the points $(e_{m,n})_{1\le n\le N}$, it seems very reasonable to use the general error estimate of Lagrange interpolation to determine an estimate for the considered term. However, there is a crucial issue: the corresponding error bound, generally given in [2, 19] by

$$\max_{x\in[a,b]} |f(x) - g_p(x)| \le \frac{\Delta x^p}{4p}\,\big|f^{(p)}(\xi)\big|, \qquad \xi \in [a,b], \tag{27}$$

apparently depends on the function $f$ from which the interpolation points are obtained. In our case, namely $|e_{m,n} - g_{m,p}(x_n)|$, we do not have a specific function directly available to which the error components $e_{m,n}$ correspond.

To fill this gap, we will now derive an appropriate function for this purpose, using a continuous extension of the inverse discrete Fourier transform (iDFT). The iDFT along the spatial axis for the points $e_{m,n}$ is given by

$$e_{m,n} = \frac{1}{\sqrt N}\sum_{\ell=0}^{N-1} c_{m,\ell}\,\exp\!\left(i\,\frac{2\pi}{N}(n-1)\,\ell\right), \qquad m = 1,\ldots,M,\ n = 1,\ldots,N,$$

with $c_{m,\ell}$ as Fourier coefficients and $i$ symbolizing the imaginary unit [22].

A continuous extension $\tilde e_m(x)$, $x\in[0,S]$, on the whole spatial interval can then be derived by enforcing $\tilde e_m(x_n) \overset{!}{=} e_{m,n}$ for all $m$ and $n$. With the transformation $x_n = \frac{S}{N}(n-1)$, i.e., $n = \frac{N}{S}x_n + 1$, implied by the equidistant spatial discretization, it follows that

$$\tilde e_m(x_n) = \frac{1}{\sqrt N}\sum_{\ell=0}^{N-1} c_{m,\ell}\,\exp\!\left(i\,\frac{2\pi}{S}\,\ell\, x_n\right)$$

and hence

$$\tilde e_m(x) := \frac{1}{\sqrt N}\sum_{\ell=0}^{N-1} c_{m,\ell}\,\exp\!\left(i\,\frac{2\pi}{S}\,\ell\, x\right), \qquad x\in[0,S],\ m = 1,\ldots,M.$$

Consequently, we have found a function describing the points $e_{m,n}$. Thus, an interpretation of $g_{m,p}(x)$ as interpolation polynomial of $p$ points stemming from $\tilde e_m(x)$ is possible and the error estimate for Lagrange interpolation, presented in equation (27), can now be applied. As a result, we get

$$|e_{m,n} - g_{m,p}(x_n)| = |\tilde e_m(x_n) - g_{m,p}(x_n)| \le \frac{\Delta x^p}{4p}\,\big|\tilde e_m^{(p)}(\xi)\big| \tag{28}$$

with $\xi\in[0,S]$. With the insertion of the definition of $\tilde e_m(x)$ and its $p$-th derivative, it follows that

$$|\tilde e_m(x_n) - g_{m,p}(x_n)| \le \frac{\Delta x^p}{4p}\left|\frac{1}{\sqrt N}\sum_{\ell=0}^{N-1} c_{m,\ell}\left(i\,\frac{2\pi}{S}\,\ell\right)^{p}\exp\!\left(i\,\frac{2\pi}{S}\,\ell\,\xi\right)\right| = \frac{1}{\sqrt N}\,\frac{\Delta x^p}{4p}\left(\frac{2\pi}{S}\right)^{p}\left|\sum_{\ell=0}^{N-1} c_{m,\ell}\,\ell^p\,\exp\!\left(i\,\frac{2\pi}{S}\,\ell\,\xi\right)\right|,$$

so that with $\tilde C(p) := \frac{1}{4p}\left(\frac{2\pi}{S}\right)^p$ we have

$$|\tilde e_m(x_n) - g_{m,p}(x_n)| \le \frac{1}{\sqrt N}\,\tilde C(p)\,\Delta x^p\sum_{\ell=0}^{N-1}\left|c_{m,\ell}\,\ell^p\,\exp\!\left(i\,\frac{2\pi}{S}\,\ell\,\xi\right)\right| \le \frac{1}{\sqrt N}\,\tilde C(p)\,\Delta x^p\sum_{\ell=0}^{N-1}|c_{m,\ell}|\,\ell^p.$$

We now choose $N_0 \le N$ such that

$$\sum_{\ell=0}^{N_0-1}|c_{m,\ell}| \ge \epsilon_m > 0 \qquad \forall\, m$$

for given $\epsilon_m$. Note that this is not possible if there exists an $m$ for which the coefficients $(c_{m,\ell})_{\ell=0,\ldots,N-1}$ are all zero. This would imply that for this particular $m$ the error $e_{m,n}$ equals zero for all $n = 1,\ldots,N$. However, the respective error $e_h(\tau_m)$ would then be zero and could simply be disregarded in our particular context, as it does not have an impact on the considered maximum norm. Therefore, the assumption does not lead to a loss of generality. Finally, we define the remainders of the sums as

$$E_m := \sum_{\ell=N_0}^{N-1}|c_{m,\ell}|, \qquad m = 1,\ldots,M.$$

Now, the sum in the previous estimate will be split at $N_0$, resulting in the following estimate:

$$|\tilde e_m(x_n) - g_{m,p}(x_n)| \le \frac{1}{\sqrt N}\,\tilde C(p)\,\Delta x^p\left(\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|\,\ell^p + \sum_{\ell=N_0}^{N-1}|c_{m,\ell}|\,\ell^p\right).$$

With the simple estimate $\ell \le N_0$ for the first sum and $\ell \le N$ for the second one, the following formulation, using the definition of the remainder $E_m$, is obtained:

$$|\tilde e_m(x_n) - g_{m,p}(x_n)| \le \frac{1}{\sqrt N}\,\tilde C(p)\,\Delta x^p\left(N_0^p\sum_{\ell=0}^{N_0-1}|c_{m,\ell}| + N^p\sum_{\ell=N_0}^{N-1}|c_{m,\ell}|\right) = \frac{1}{\sqrt N}\,\tilde C(p)\,\Delta x^p\left(N_0^p\sum_{\ell=0}^{N_0-1}|c_{m,\ell}| + N^p E_m\right).$$

Now, we will have a look at the norm of the whole vector $(I - I_H^h I_h^H)e_h$, but instead of the maximum norm, we first consider the squared 2-norm given by

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_2^2 = \sum_{n=1}^{N}\sum_{m=1}^{M}|\tilde e_m(x_n) - g_{m,p}(x_n)|^2.$$

By insertion of the previous estimate, it follows that

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_2^2 \le \sum_{n=1}^{N}\sum_{m=1}^{M}\left[\frac{1}{\sqrt N}\,\tilde C(p)\,\Delta x^p\left(N_0^p\sum_{\ell=0}^{N_0-1}|c_{m,\ell}| + N^p E_m\right)\right]^2.$$

Since the summands are independent of the running index $n$, the equation can be simplified to

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_2^2 \le \tilde C(p)^2\,\Delta x^{2p}\sum_{m=1}^{M}\left(N_0^p\sum_{\ell=0}^{N_0-1}|c_{m,\ell}| + N^p E_m\right)^2. \tag{29}$$

Now, each inner summand can be written as

$$\left(N_0^p\sum_{\ell=0}^{N_0-1}|c_{m,\ell}| + N^p E_m\right)^2 = N_0^{2p}\left(\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|\right)^2 + 2\,N_0^p N^p E_m\sum_{\ell=0}^{N_0-1}|c_{m,\ell}| + N^{2p}E_m^2 =: S_1 + S_2 + S_3. \tag{30}$$

While $S_1$ is an intended component (it contains the squared sum of the $c_{m,\ell}$, which we will need to get back to the norm of $e_h$), the other two summands $S_2$ and $S_3$ are inconvenient. Therefore, we will now eliminate them by searching for a $T(E_m)$ such that

$$S_1 + S_2 + S_3 \le S_1 + T(E_m)\left(\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|\right)^2 = \big(N_0^{2p} + T(E_m)\big)\left(\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|\right)^2. \tag{31}$$

This is true if

$$S_2 + S_3 - T(E_m)\left(\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|\right)^2 \le 0,$$

which in turn leads to

$$T(E_m) \ge \frac{2\,N_0^p N^p}{\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|}\,E_m + \frac{N^{2p}}{\big(\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|\big)^2}\,E_m^2,$$

after using the definitions of $S_2$ and $S_3$. Thus, for

$$T(E_m) := \frac{2\,N_0^p N^p}{\epsilon_m}\,E_m + \frac{N^{2p}}{\epsilon_m^2}\,E_m^2 \tag{32}$$

we can bound

$$S_1 + S_2 + S_3 \le \big(N_0^{2p} + T(E_m)\big)\left(\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|\right)^2.$$

Using the Cauchy–Schwarz inequality, we have

$$\left(\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|\right)^2 = \left(\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|\cdot 1\right)^2 \le \sum_{\ell=0}^{N_0-1}|c_{m,\ell}|^2 \cdot \sum_{\ell=0}^{N_0-1}1 = N_0\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|^2,$$

so that with (29), (30) and (31) we get

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_2^2 \le \tilde C(p)^2\,\Delta x^{2p}\,N_0\sum_{m=1}^{M}\big(N_0^{2p} + T(E_m)\big)\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|^2.$$

Further, it follows from Parseval's theorem [22] that

$$\sum_{\ell=0}^{N_0-1}|c_{m,\ell}|^2 \le \sum_{\ell=0}^{N-1}|c_{m,\ell}|^2 = \sum_{n=1}^{N}|e_{m,n}|^2$$

and thus

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_2^2 \le \tilde C(p)^2\,\Delta x^{2p}\,N_0\sum_{m=1}^{M}\big(N_0^{2p} + T(E_m)\big)\sum_{n=1}^{N}|e_{m,n}|^2 \le \tilde C(p)^2\,\Delta x^{2p}\,N_0\Big(N_0^{2p} + \max_{m=1,\ldots,M}T(E_m)\Big)\,\|e_h\|_2^2.$$

Since it is the maximum norm we are interested in and not the Euclidean one, an appropriate transformation is required, given by

$$\|x\|_\infty \le \|x\|_2 \le \sqrt n\,\|x\|_\infty \qquad \forall\, x\in\mathbb{C}^n,$$

so that

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_\infty \le \big\|\big(I - I_H^h I_h^H\big)e_h\big\|_2 \le \tilde C(p)\,\Delta x^p\sqrt{N_0\Big(N_0^{2p} + \max_{m=1,\ldots,M}T(E_m)\Big)}\,\|e_h\|_2 \le \tilde C(p)\,\Delta x^p\,\sqrt{N_0 M N}\,\sqrt{N_0^{2p} + \max_{m=1,\ldots,M}T(E_m)}\,\|e_h\|_\infty.$$

Using the triangle inequality for square roots, the sum can be split as

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_\infty \le \tilde C(p)\,\Delta x^p\,\sqrt{N_0 M N}\left(N_0^p + \sqrt{\max_{m=1,\ldots,M}T(E_m)}\right)\|e_h\|_\infty.$$

With the insertion of the definition of $T(E_m)$ presented in equation (32), we get

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_\infty \le \tilde C(p)\,\sqrt{N_0 M N}\,\Delta x^p\left(N_0^p + \sqrt{\max_{m=1,\ldots,M}\frac{2\,N_0^p N^p}{\epsilon_m}E_m + \frac{N^{2p}}{\epsilon_m^2}E_m^2}\right)\|e_h\|_\infty. \tag{33}$$

At first sight, it looks like $\Delta x^p$ is the dominating term in this equation. However, a closer look reveals that the root term is in $\mathcal{O}(N^p)$, which shifts the dominance of this summand towards the remainders $E_m$. To see this, we first transform the root term by extracting $N^{2p}$:

$$\sqrt{\max_{m}\frac{2\,N_0^p N^p}{\epsilon_m}E_m + \frac{N^{2p}}{\epsilon_m^2}E_m^2} \le \sqrt{\max_{m}N^{2p}\left(\frac{2\,N_0^p}{N^p\,\epsilon_m}E_m + \frac{E_m^2}{\epsilon_m^2}\right)} = N^p\sqrt{\max_{m}\frac{2\,N_0^p}{N^p\,\epsilon_m}E_m + \frac{E_m^2}{\epsilon_m^2}}.$$

Then, we consider the definition of the step size on the fine level, namely $\Delta x_h = \frac{S}{N}$, which allows the representation of $N^p$ as $\big(\frac{S}{\Delta x_h}\big)^p$. From this representation, it follows that

$$\Delta x^p\,N^p\sqrt{\max_{m}\frac{2\,N_0^p}{N^p\,\epsilon_m}E_m + \frac{E_m^2}{\epsilon_m^2}} \le S^p\left(\frac{\Delta x_H}{\Delta x_h}\right)^p\sqrt{\max_{m}\frac{2\,N_0^p}{N^p\,\epsilon_m}E_m + \frac{E_m^2}{\epsilon_m^2}}.$$

The insertion of this estimate into equation (33) finally leads to the overall result

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_\infty \le \tilde C(p)\,\sqrt{N_0 M N}\,N_0^p\,\Delta x^p\,\|e_h\|_\infty + \frac{(2\pi)^p}{4p}\,\sqrt{N_0 M N}\left(\frac{\Delta x_H}{\Delta x_h}\right)^p\sqrt{\max_{m}\frac{2\,N_0^p}{N^p\,\epsilon_m}E_m + \frac{E_m^2}{\epsilon_m^2}}\;\|e_h\|_\infty,$$

where we used the definition of $\tilde C(p)$ to eliminate $S^p$ in the second summand.

In this estimate, we can finally see that $\Delta x^p$ no longer dominates the second summand. It is replaced by the ratio between the step sizes on the coarse and on the fine level, which can be regarded as constant. As a result, the remainders $E_m$ are now the dominating quantity in this term. Hence, a formulation like

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_\infty \le C_1\,\Delta x^p\,\|e_h\|_\infty + C_2(E)\,\|e_h\|_\infty$$

with $E := (E_m)_{1\le m\le M}$ is reasonable and concludes the proof.

Remark 3.5
The splitting of the sum and the consideration of the vector of remainders $E$ is needed since the constant $C_1$ would otherwise depend on $N^p$. Considering the relation $\frac{S}{N} = \Delta x_h$, this would mean that the resulting bound would not depend on $\Delta x$ anymore. As a result, we would not obtain a better estimate than the simple one

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_\infty \le C\,\|e_h\|_\infty,$$

which was already used in the proof of theorem 3.1. By the applied split of the series, the term $N^p$ is replaced by $N_0^p$, which yields a more meaningful estimate as it keeps the dependence on the term $\Delta x^p$ while adding another one on the smoothness of the error.

Remark 3.6
Note that, ideally, the error has only a few low-frequency Fourier coefficients, i.e., $\tilde e_m(x)$ can be written as

$$\tilde e_m(x) := \frac{1}{\sqrt N}\sum_{\ell=0}^{N_0-1} c_{m,\ell}\,\exp\!\left(i\,\frac{2\pi}{S}\,\ell\, x\right), \qquad x\in[0,S],\ m = 1,\ldots,M,$$

using $N_0$ summands only. Then $E_m = 0$ and the estimate

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_\infty \le C_1\,\Delta x^p\,\|e_h\|_\infty + C_2(E)\,\|e_h\|_\infty$$

reduces to

$$\big\|\big(I - I_H^h I_h^H\big)e_h\big\|_\infty \le C_1\,\Delta x^p\,\|e_h\|_\infty.$$

The following theorem uses lemma 3.4 to extend theorem 3.1. In particular, the provided estimate for $\|(I - I_H^h I_h^H)(U_h - U_h^{(k)})\|_\infty$ is used in the corresponding proof, which results in a new convergence theorem for MLSDC.

Theorem 3.7
Consider a generic initial value problem like (1) with a Lipschitz-continuous function $f$ on the right-hand side. Furthermore, let the conditions of lemma 3.4 be met. Then, if the step size $\Delta t$ is sufficiently small, MLSDC converges linearly to the solution of the collocation problem with a convergence factor in $\mathcal{O}\big((\Delta x^p + C(E))\,\Delta t + \Delta t^2\big)$, i.e., the following estimate for the error is valid:

$$\|U_h - U_h^{(k)}\|_\infty \le \big(\big(C_1\,\Delta x^p + C_2(E)\big)\,\Delta t + C_3\,\Delta t^2\big)\,\|U_h - U_h^{(k-1)}\|_\infty, \tag{34}$$

where $\Delta x \equiv \Delta x_H$ is defined as the spatial resolution on the coarse level $\Omega_H$ of MLSDC and the constants $C_1$, $C_2(E)$ and $C_3$ are independent of $\Delta t$.

If, additionally, the solution $u$ of the initial value problem is $(M+1)$-times continuously differentiable, the LTE of MLSDC compared to the solution of the ODE can be bounded by

$$\|\bar U_h - U_h^{(k)}\|_\infty \le C_0\,\Delta t^{\,M+1}\,\|u\|_{M+1} + \sum_{l=0}^{k} C_{l+1}\,\big(\Delta x^p + C(E)\big)^{k-l}\,\Delta t^{\,k+k_0+l}\,\|u\|_{k+k_0+l}, \tag{35}$$

where the constants $C_0,\ldots,C_{k+1}$ are independent of $\Delta t$, $k_0$ denotes the approximation order of the initial guess $U_h^{(0)}$ and $\|u\|_p$ is defined by $\|u^{(p)}\|_\infty$.

Proof
The proof is similar to the one of theorem 3.1 but differs in the estimate used for $\|(I - I_H^h I_h^H)(U_h - U_h^{(k)})\|_\infty$: here, lemma 3.4 instead of the simple norm compatibility inequality is used for this purpose. Based on the estimates (23), (24) and (26), arising in the proof of the mentioned theorem, it follows that

$$\|U_h - U_h^{(k+1)}\|_\infty \le \tilde C\,\Delta t\,\big\|\big(I - I_H^h I_h^H\big)\big(U_h - U_h^{(k)}\big)\big\|_\infty + C_3\,\Delta t^2\,\|U_h - U_h^{(k)}\|_\infty.$$

As already mentioned, we will now apply lemma 3.4, namely

$$\big\|\big(I - I_H^h I_h^H\big)\big(U_h - U_h^{(k)}\big)\big\|_\infty \le \big(\hat C_1\,\Delta x^p + \hat C_2(E)\big)\,\|U_h - U_h^{(k)}\|_\infty,$$

which, with $C_1 := \tilde C\,\hat C_1$ and $C_2(E) := \tilde C\,\hat C_2(E)$, yields

$$\|U_h - U_h^{(k+1)}\|_\infty \le \big(C_1\,\Delta x^p + C_2(E)\big)\,\Delta t\,\|U_h - U_h^{(k)}\|_\infty + C_3\,\Delta t^2\,\|U_h - U_h^{(k)}\|_\infty = \big(\big(C_1\,\Delta x^p + C_2(E)\big)\,\Delta t + C_3\,\Delta t^2\big)\,\|U_h - U_h^{(k)}\|_\infty.$$

This concludes the proof of equation (34). The proof of equation (35) is again similar to the one of the second equation in theorem 2.4, using the previous result. Additionally, the binomial theorem is applied to simplify the arising term $\big(\big(C_1\,\Delta x^p + C_2(E)\big)\,\Delta t + C_3\,\Delta t^2\big)^k$.

Remark 3.8
It is here that a possible dependency of $f$'s Lipschitz constant on $\Delta x$ plays a key role. Similar to the observations before, we find that in this case equation (34) needs to be replaced by

$$\|U_h - U_h^{(k)}\|_\infty \le \big(\big(C_1\,\Delta x^p + C_2(E)\big)\,C(\delta^{-1})\,\Delta t + C_3(\delta^{-1})\,\Delta t^2\big)\,\|U_h - U_h^{(k-1)}\|_\infty,$$

where $\delta$ denotes the difference between spatial and temporal resolution (up to constants) and comes from the initial step-size restriction of SDC. The term $C_1\,\Delta x^p + C_2(E)$ itself does not depend on $\delta$, since it comes from the remainder of the interpolation estimate in lemma 3.4, where neither $f$ (and therefore its Lipschitz constant) nor the step size plays a role. As before, equation (35) has to be modified, now including the term $\delta^{-(k-l+1)}$ in the sum. The constant $C_0$ is still independent of $\delta$.

Remark 3.9
Similar to corollary 2.6, it can be proven that the order limit $M+1$ in theorem 3.7 can be replaced by $2M$ if only the error at the last collocation node is considered.

The theorem states that, under the named conditions, MLSDC converges linearly to the collocation solution with a convergence rate of $\mathcal{O}\big((\Delta x^p + C(E))\,\Delta t + \Delta t^2\big)$ if $\big(C_1\,\Delta x^p + C_2(E)\big)\,\Delta t + C_3\,\Delta t^2 < 1$. This means that if $\Delta x^p$ and the vector of remainders $E$ are sufficiently small, the error of MLSDC decreases by two orders of $\Delta t$ with each iteration, which indeed represents an improved convergence behavior compared to the one described in theorem 3.1. Otherwise, i.e., if $\Delta x^p$ and $E$ are not that small, it only decreases by one order in $\Delta t$, which is equivalent to the result of the previous theorem.

In the second equation of the theorem, it can be seen that, again, $\Delta x^p$ and $E$ are the crucial factors here. If they are small enough such that $\Delta t^{\,2k+k_0}$ is the leading order, MLSDC converges with order $\min(2k+k_0,\,M+1)$ and thus gains two orders per iteration. Otherwise, the convergence order is only $\min(k+k_0,\,M+1)$, i.e., the method gains one order of $\Delta t$ in each iteration.

Note that, as a result, it is advisable to use a high interpolation order $p$ and a small spatial step size $\Delta x$ on the coarse level in practical applications of MLSDC. This theoretical result matches the numerical observations described in [31]. In section 2.2.5 of that paper, it is mentioned that the convergence properties of MLSDC seem to be highly dependent on the interpolation order and spatial resolution used. Moreover, in the numerical examples considered there, a high resolution in space, i.e., a small $\Delta x$, led to a lower sensitivity to the interpolation order $p$. Our theoretical investigation provides an explanation for this behavior.

It seems reasonable to use a similar approach to determine the conditions for a higher convergence order of MLSDC if coarsening in time instead of space is used. Analogous to equation (27) in the proof of lemma 3.4, the Lagrange error estimate can be used for this purpose, resulting in the estimate

$$\big\|\big(I - I_H^h I_h^H\big)\big(U_h - U_h^{(k)}\big)\big\|_\infty \le \frac{\Delta \tau^p}{4p}\,\big\|e^{(p)}(t)\big\|_\infty, \tag{36}$$

where $I_H^h$ and $I_h^H$ now denote temporal transfer operators and $e(t)$ is defined as the continuous error of MLSDC compared to the collocation solution. In this case, however, the function $e(t)$ is implicitly known and thus does not have to be approximated by an iDFT. In particular, it is a polynomial of degree $M \equiv M_h$, as both the collocation solution $U$ and each iterate $U_h^{(k)}$ of MLSDC are polynomials of that degree, respectively. This can be seen by considering that $\Delta t\,QF(U)$ as well as $\Delta t\,Q_\Delta F(U)$ essentially represent a sum of integrals of Lagrange polynomials, which apparently results in a polynomial. Consequently, the $p$-th derivative of $e(t)$ is a polynomial of degree $M_h - p$. The maximal interpolation order $p$ is the number of collocation nodes $M_H$ on the coarse level. Here, we will assume that $p = M_H$, i.e., that the maximal interpolation order is used. Note that $e^{(p)}(t) \equiv 0$ for $p > M_h$ and hence $\|(I - I_H^h I_h^H)(U_h - U_h^{(k)})\|_\infty = 0$ for $M_H = M_h$, which is consistent with the expected behavior, as it means that no coarsening is used at all. As a conclusion, it can be said that, according to equation (36), the improved convergence behavior of MLSDC using coarsening in time depends on the used time step size $\Delta\tau = C\,\Delta t$ and the number of collocation nodes $M_H$ on the coarse level. Note that it also depends on the specific coefficients of $e^{(p)}(t)$. However, these are highly dependent on the right-hand side $f$ of the IVP and thus cannot be controlled by any method parameters.
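The polynomial argument for coarsening in time can be checked numerically. In the sketch below (illustrative node counts; Gauss–Legendre nodes scaled to $[0,1]$; the helper name is hypothetical), interpolating a polynomial "error" from $M_H$ coarse nodes back to $M_h$ fine nodes is exact whenever the polynomial degree stays below $M_H$, and produces a sizeable defect otherwise:

```python
import numpy as np

def time_interp_defect(M_h, M_H, coeffs):
    """Defect of interpolating a polynomial (the MLSDC error e(t) is one,
    given here by its coefficients, lowest order first) from M_H coarse
    collocation nodes back to M_h fine nodes on [0, 1]."""
    fine = 0.5 * (np.polynomial.legendre.leggauss(M_h)[0] + 1.0)
    coarse = 0.5 * (np.polynomial.legendre.leggauss(M_H)[0] + 1.0)
    e_fine = np.polynomial.polynomial.polyval(fine, coeffs)
    e_coarse = np.polynomial.polynomial.polyval(coarse, coeffs)
    # Lagrange interpolation from the coarse nodes, realized via an exact
    # degree-(M_H - 1) fit through the M_H coarse values
    c = np.polynomial.polynomial.polyfit(coarse, e_coarse, M_H - 1)
    return np.max(np.abs(e_fine - np.polynomial.polynomial.polyval(fine, c)))
```

For a degree-2 "error" and $M_H = 3$ the defect vanishes up to round-off, while a degree-4 "error" leaves an $\mathcal{O}(1)$ defect, in line with the discussion above.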
In summary, two convergence theorems for MLSDC were established in this section. While the first one, theorem 3.1, represents a general statement on the convergence of the method, the second one, theorem 3.7, provides theoretically established guidelines for the parameter choice in practical applications of MLSDC in order to achieve an improved convergence behavior of the method. In the next section, we will examine numerical examples of MLSDC to check whether the resulting errors match these theoretical predictions.
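As a practical companion to these guidelines, the smoothness condition entering through $C(E)$ can be monitored by inspecting the discrete Fourier coefficients of the current error or residual. A small sketch (hypothetical helper; unitary DFT normalization as in the proof of lemma 3.4):

```python
import numpy as np

def fourier_remainder(e, n0):
    """E-like quantity from lemma 3.4: sum of |c_l| over all modes with
    absolute frequency >= n0, using the unitary DFT of the error vector e."""
    n = len(e)
    c = np.fft.fft(e) / np.sqrt(n)
    freq = np.abs(np.fft.fftfreq(n, d=1.0 / n))  # integer frequencies
    return np.sum(np.abs(c[freq >= n0]))
```

A single low-frequency sine mode has a vanishing remainder beyond a small cutoff, whereas a random vector spreads its energy over all frequencies and yields a large remainder — the two situations the theory distinguishes.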
In this section, the convergence behavior of MLSDC, theoretically analyzed in the previous section, is verified by numerical examples. The method is applied to three different initial value problems and the results are compared to those of classical, single-level SDC. The key question here is whether the conditions derived in the previous sections (smoothness, high spatial/temporal resolution and high interpolation order) are actually sharp, i.e., whether MLSDC does indeed show only low-order convergence if any of these conditions is violated. The corresponding programs were written in Python using the pySDC code [28, 30].
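The first example below relies on the exact solution of the spatially discretized heat equation. As a self-contained sanity check (second-order centered finite differences with homogeneous Dirichlet boundaries; the parameter values used here are illustrative, not necessarily those of the experiments), one can verify that the discrete sine mode is an eigenvector of the discrete Laplacian, which is why the semi-discrete solution is available in closed form:

```python
import numpy as np

def laplacian_1d(n):
    """Second-order FD Laplacian with homogeneous Dirichlet BCs,
    n interior points on [0, 1], grid spacing dx = 1/(n + 1)."""
    dx = 1.0 / (n + 1)
    A = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1)) / dx**2
    return A, dx

def discrete_heat_solution(n, kappa, nu, t):
    """Exact solution of the semi-discrete heat equation for the initial
    value sin(kappa*pi*x): u(t) = sin(kappa*pi*x) * exp(-t*nu*rho)."""
    _, dx = laplacian_1d(n)
    x = dx * np.arange(1, n + 1)
    rho = (2.0 - 2.0 * np.cos(kappa * np.pi * dx)) / dx**2
    return np.sin(kappa * np.pi * x) * np.exp(-t * nu * rho)
```

With this, the errors of SDC and MLSDC can be measured against the exact semi-discrete solution rather than against a reference time integrator.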
The first numerical example is the one-dimensional heat equation defined by thefollowing initial value problem:(37) ∂∂ t u ( x , t ) = ν ∂ ∂ x u ( x , t ) , ∀ t ∈ [0 , t end ] , x ∈ [0 , u (0 , t ) = , u (1 , t ) = , u ( x , = u ( x ) , where u ( x , t ) represents the temperature at the location x and time t and ν > N degrees-of-freedom.As initial value a sine wave with frequency κ is selected, i.e. u ( x ) = sin( κπ x ). Underthese conditions, the analytical solution of the spatially discretized initial value problemis given by (cid:126) u ( t ) = sin( κπ(cid:126) x ) e − t νρ with ρ = ∆ x (2 − πν(cid:126) x )) onvergence analysis of multi-level spectral deferred corrections with (cid:126) x : = ( x n ) ≤ n ≤ N and an element-wise application of the trigonometric functions.For the tests, we choose κ = ν = . M = f is not a necessary condition in theorems 2.4, 3.1 or 3.7. In fact, f isjust assumed to be Lipschitz continuous. However, we will consider the heat equationhere since it is well studied and has a convenient exact solution needed to compute theerrors of SDC and MLSDC.The following tests are structured in a particular way: In the first one, we will adjustthe method parameters according to the results of theorem 3.7 to observe an improvedconvergence of MLSDC over SDC. More specifically, we will use a small spatial stepsize ∆ x , a high interpolation order p and try to generate smooth errors using smoothinitial guesses for the iteration. In a second step, we will then subsequently change theseparameters leading to a lower convergence order of MLSDC as described in theorem 3.1.Thereby, we will reveal the dependence of MLSDC’s convergence behavior on thoseparameters and simultaneously verify the general minimal achievable convergenceorder of the method. Altogether, this will confirm the theoretical results of the previoussection.For the first test, the number of degrees-of-freedom was set to N h =
255 on the fine and N H =
127 on the coarse level of MLSDC. This parameter particularly determines thespatial grid size ∆ x = N + . As discussed before, we use injection as restriction anda piecewise p -th order Lagrange interpolation as interpolation, for now with p = u ( x ) = sin(4 π x ) was spread across the differentnodes τ m to form the initial guess.An illustration of the corresponding numerical results is shown in figure 1c, with thereference SDC result in figure 1a. MLSDC was applied with different step sizes ∆ t and numbers of iterations k to the considered problem and the resulting errors wereplotted as points in the respective graphs. The drawn lines, on the other hand, representthe expected behavior, i.e. the predicted convergence orders of the method accordingto theorem 3.7. In particular, we assume that the terms ∆ x p and C ( E ) are sufficientlysmall such that ∆ t k + k is the leading order in the corresponding error estimation. As aresult, MLSDC is expected to gain two orders per iteration. In the figure, it can be seenthat all of the computed points nearly lie on the expected lines which always start atthe error resulting for the largest step size. Therefore, the numerical results match thetheoretical predictions.If, by contrast, the spatial grid size ∆ x is chosen to be significantly larger, in particularas large as on the fine and on the coarse level, the leading order in theorem 3.7, Gitte Kremling and Robert Speck − − − − ∆t10 − − − − − − − − e rr o r (a) SDC spread initial guess order k-1 2 − − − − ∆t (b) SDC random initial guess order k-12 − − − − ∆t10 − − − − − − − − e rr o r (c) MLSDC optimal parameters order 2k-1 2 − − − − ∆t (d) MLSDC coarser grid spacing ∆x order k-12 − − − − ∆t10 − − − − − − − − (e) MLSDC lower interpolation order p order k-1 2 − − − − ∆t (f) MLSDC random initial guess order k-1 k=1 k=2 k=3 k=4 k=5
Figure 1: Convergence behavior of SDC and MLSDC applied to the discretized one-dimensional heat equation with coarsening in space (for MLSDC) and different parameters

presenting an error estimation for MLSDC, changes to Δt^k. Hence, in this example, we expect MLSDC to only gain one order in Δt with each iteration, as SDC does and as described in the general convergence theorem 3.1. The corresponding numerical results, presented in figure 1d, confirm this prediction. For both methods, the error decreases by one order in Δt with each iteration.

Another possible modification of the first example is a decrease of the interpolation order p. Figure 1e shows the numerical results if this parameter is reduced. Again, the Δt^{2k}-term is no longer dominant, here due to the higher magnitude of Δx^p. Besides, it should be noted that the considered values of Δt are significantly smaller here. This is caused by the fact that MLSDC does not converge for greater values of this parameter, i.e. the upper bound for Δt, implicitly occurring in the assumptions of the respective theorem, seems to be lower here. The smaller step sizes Δt also entail overall smaller errors. As a result, the accuracy of the collocation solution is reached earlier, which explains the outliers in the considered plots.

The third necessary condition for the improved convergence of MLSDC is the magnitude of the remainders E_m or, in other words, the smoothness of the error. In this context, we now have a look at the changes which result from a more oscillatory initial guess. In particular, we assign random values to U^(0). The corresponding errors are shown in figure 1b for SDC and figure 1f for MLSDC. It can be seen that this change again results in a lower convergence order of MLSDC; in particular, it gains one order per iteration, as SDC does.
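The analytical solution of the semi-discretized problem stated above can be checked directly: sin(κπx⃗) is an eigenvector of the second-order finite-difference Laplacian. A minimal sketch in Python, where the value of ν is an illustrative assumption (the paper's choice is not reproduced here), while κ = 4 and N = 255 match the text:

```python
import numpy as np

# Assumed parameters: nu is illustrative; kappa = 4 and N = 255 match the text.
nu, kappa, N = 0.1, 4, 255
dx = 1.0 / (N + 1)
x = np.linspace(dx, 1.0 - dx, N)  # interior grid points of [0, 1]

# second-order centered finite-difference Laplacian with homogeneous Dirichlet BCs
A = (np.diag(-2.0 * np.ones(N)) + np.diag(np.ones(N - 1), 1)
     + np.diag(np.ones(N - 1), -1)) / dx**2

u0 = np.sin(kappa * np.pi * x)

# sin(kappa*pi*x) is an eigenvector of A with eigenvalue -rho, where
# rho = (2/dx^2) * (1 - cos(kappa*pi*dx)); hence u(t) = u0 * exp(-t*nu*rho).
rho = 2.0 / dx**2 * (1.0 - np.cos(kappa * np.pi * dx))
assert np.allclose(nu * (A @ u0), -nu * rho * u0)

def u_exact(t):
    """Exact solution of the semi-discrete heat equation at time t."""
    return u0 * np.exp(-t * nu * rho)
```

Errors of SDC and MLSDC iterates can then be measured against u_exact(t) without any time-discretization bias.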
Since the crucial term Δx^p is left unchanged this time, the result can only be attributed to a higher value of C(E) and thus to an insufficient smoothness of the error. This suggests that, for this problem type, a smooth initial guess is a sufficient condition for the smoothness of the error and thus for a low value of C(E).

The second test case is the non-linear, two-dimensional Allen-Cahn equation

(38) u_t = Δu + (1/ε²) u (1 − u²) on [−0.5, 0.5]² × [0, T], T > 0,
     u(x, 0) = u₀(x), x ∈ [−0.5, 0.5]²,

with periodic boundary conditions and scaling parameter ε > 0. We again use second-order finite differences in space and choose a sine wave in 2D as initial condition, i.e. u₀(x) = sin(κπx) sin(κπy). There is no analytical solution, neither for the continuous nor for the spatially discretized equations. Therefore, reported errors are computed against a numerically computed high-order reference solution. For the tests, the parameters κ and ε as well as the number of collocation nodes M are kept fixed, and the number of degrees-of-freedom was set to N_h = 128 on the fine and N_H = 64 on the coarse level of MLSDC. Transfer operators are the same as before.

Figure 2 shows the results of our tests for the Allen-Cahn equation for both SDC and MLSDC. The main conclusion here is the same as before: using fewer degrees-of-freedom (here N = 32 on the fine level instead of 128), a lower interpolation order p or a random initial guess reduces the observed convergence order of MLSDC to the one of SDC.

The third test case is the following two-dimensional ODE introduced in [1]:

(39) u̇ = (ẋ, ẏ)^T = (−y − λx(1 − x² − y²), x − λρy(1 − x² − y²))^T, ∀ t ∈ [0, t_end],
     u(0) = u₀,

where λ < 0 and ρ > 0 are fixed parameters and u₀ = (1, 0)^T. The analytical solution of this initial value problem is known. It is given by

u(t) = (x(t), y(t))^T = (cos(t), sin(t))^T, t ∈ [0, t_end].

The corresponding tests are structured in a similar way as before, but this time the ODE version of theorem 3.7, given by equation (36), is considered. So, first, appropriate parameters are used to reach second-order convergence of MLSDC, and then the sharpness of the implied conditions is tested. In particular, the improved convergence behavior of MLSDC is expected to depend on the time step size Δτ = CΔt, the (now temporal) interpolation order p and the smoothness of the error in time. In our tests, we always used the maximal interpolation order p = M_H, corresponding to the number
Figure 2: Convergence behavior of SDC and MLSDC applied to the discretized two-dimensional Allen-Cahn equation with coarsening in space (for MLSDC) and different parameters

of collocation nodes on the coarse level, since otherwise it was not possible to obtain second-order convergence at all. The number of nodes on the fine grid, M_h, was chosen accordingly. Figure 3 shows the results; again, a larger time step size Δt, a lower interpolation order p or a random initial guess leads to a reduced convergence order. Interestingly, the observed convergence orders here are k and 2k instead of k − 1 and 2k − 1. This behavior is probably related to the k-term in the estimates which stems from the initial guess of SDC and MLSDC. This explanation would also agree with the result that this additional order gets lost if a random initial guess is used (see figure 3b, f). Aside from that, it should be noted that the use of a lower interpolation order (figure 3e) led to a convergence order of k + 1, not the order k we would have expected. The reason for this is not clear but could be related to equation (35), which implies that all orders between k and 2k can potentially be reached. In any case, the result shows that the second-order convergence is lost if the interpolation order is decreased. Finally, we want to discuss the plot in figure 3d, resulting from the use of a larger time step size Δt. It can be seen that the data points do not perfectly agree with the predicted lines here. Apparently, the numerical results are often much better than expected. However, they do not reach order 2k and hence confirm our theory that the second-order convergence of MLSDC also depends on a small time step size. The deviations in the data are, in fact, not too surprising here, considering that the time step size is a very crucial parameter for the convergence of MLSDC in general. As described in theorems 3.1 and 3.7, Δt has to be small enough in order for MLSDC to converge at all.
For that reason, the possible testing scope for thetime step size is rather small, making it difficult to find appropriate parameters whereMLSDC converges exactly with order k .These artifacts shed some light on the “robustness” of the results, a fact that we wouldlike to share here: during the tests with all three examples, we saw that it is actuallyvery hard to get these more or less consistent results. All model and method parametershad to be chosen carefully in order to support the theory derived above so clearly. Inmany cases the results were much more inconsistent, showing e.g. convergence orderssomewhere between k and 2 k , changing convergence orders or stagnating results closeto machine precision or discretization errors. None of the tests we did contradicted our onvergence analysis of multi-level spectral deferred corrections − − − − ∆t10 − − − − − − − e rr o r (a) SDC spread initial guess order k 2 − − − − ∆t (b) SDC random initial guess order k-12 − − − − ∆t10 − − − − − − − e rr o r (c) MLSDC optimal parameters order 2k 2 − − − − ∆t (d) MLSDC larger time step size ∆t order k2 − − − − ∆t10 − − − − − − − (e) MLSDC lower interpolation order p order k+1 2 − − − − ∆t (f) MLSDC random initial guess order k-1k=1 k=2 k=3 k=4 k=5
Figure 3: Convergence behavior of SDC and MLSDC applied to the Auzinger problem with coarsening in the collocation nodes (for MLSDC) and different parameters

theoretical results, but they revealed that the bounds we obtained are indeed rather pessimistic.
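The convergence orders plotted in the figures are the slopes of the error curves over Δt. The measurement procedure can be sketched for the Auzinger problem (39); in this sketch a classical fourth-order Runge-Kutta method stands in for SDC/MLSDC, and the values of λ and ρ are illustrative assumptions (the paper's choices are not reproduced here):

```python
import numpy as np

lam, rho = -0.5, 2.0  # illustrative values, not the paper's choices

def f(u):
    # right-hand side of the Auzinger problem (39)
    x, y = u
    s = 1.0 - x**2 - y**2
    return np.array([-y - lam * x * s, x - lam * rho * y * s])

def rk4(u, dt, n):
    # classical fourth-order Runge-Kutta; stands in for SDC/MLSDC here
    for _ in range(n):
        k1 = f(u)
        k2 = f(u + 0.5 * dt * k1)
        k3 = f(u + 0.5 * dt * k2)
        k4 = f(u + dt * k3)
        u = u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return u

t_end = 1.0
dts = [2.0**-k for k in range(3, 7)]
errors = []
for dt in dts:
    u = rk4(np.array([1.0, 0.0]), dt, round(t_end / dt))
    errors.append(np.linalg.norm(u - np.array([np.cos(t_end), np.sin(t_end)])))

# observed order between successive halved step sizes: log2(e_i / e_{i+1})
orders = np.log2(np.array(errors[:-1]) / np.array(errors[1:]))
print(orders)  # slopes close to 4 for a fourth-order method
```

The same slope computation, applied to the SDC/MLSDC errors, produces the orders reported in figures 1 to 3.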
In this paper, we established two convergence theorems for multi-level spectral deferred correction (MLSDC) methods, using similar concepts and ideas as those presented in [33] for the proof of the convergence of SDC. In the first theorem, namely theorem 3.1, it was shown that with each iteration of MLSDC the error compared to the solution of the initial value problem decreases by at least one order of the chosen step size Δt, limited by the accuracy of the underlying collocation solution. The corresponding theorem only requires the operator on the right-hand side of the considered initial value problem to be Lipschitz continuous, not necessarily linear, and the chosen time step size Δt to be sufficiently small. Consequently, we found a first theoretical convergence result for MLSDC, proving that it converges at least as well as SDC does. However, we would expect, and numerical results already indicated, that the additional computations on the coarse level, more specifically the SDC iterations performed there, lead to an improved convergence behavior of the method.

For that reason, we analyzed the errors in greater detail, leading to a second theorem on the convergence of MLSDC, namely theorem 3.7. Here, we focused on a specific coarsening strategy and specific transfer operators. In particular, we considered MLSDC using coarsening in space with Lagrangian interpolation. Given these assumptions, we could prove that, if particular conditions are met, the method can even gain two orders of Δt in each iteration until the accuracy of the collocation problem is reached. This consequently led us to theoretically established guidelines for the parameter choice in practical applications of MLSDC in order to achieve the described improved convergence behavior. More specifically, the corresponding theorem says that, for this purpose, the spatial grid size on the coarse level has to be small, the interpolation order has to be high and the errors have to be smooth.
We presented numerical examples which confirm these theoretical results. In particular, it could be observed that changing one of those crucial parameters immediately led to a decrease in the order of accuracy. Essentially, it resulted in the convergence behavior described in the first presented theorem.

Besides, there are several open questions related to the presented work which have not yet been investigated. Three of them are briefly discussed here.

More information, better results.
The results presented here are quite generic. As a consequence, since we only assume Lipschitz continuity of the right-hand side of the ODE and do not pose conditions on the SDC preconditioner, both the constants and the step size restrictions are rather pessimistic. Using more knowledge of the right-hand side or the matrix Q_Δ will yield better results, as it already did for SDC. Since the goal of this paper is to establish a baseline for the convergence of MLSDC, exploiting this direction, especially with respect to the treatment of convergence in the stiff limit as done in [35] for SDC, is left for future work.

Smoothness of the error.
The second theorem, describing conditions for an improved convergence behavior of MLSDC, has a drawback regarding its practical significance. The way theorem 3.7 is currently proven requires a smooth error after its periodic extension. This condition of a smooth error does not always apply and is, in particular, not easy to control. Essentially, something like a smoothing property would be needed to ensure that the error always becomes smooth after enough iterations. Numerical results indicate that this property apparently does not hold for SDC, though [4]. In this context, however, it would be sufficient if we could at least control this condition, i.e. derive particular criteria for the parameters of the method which ensure that the errors are smooth. The numerical examples presented in section 4 particularly suggest that the selection of a smooth initial guess U^(0) results in smooth errors for U^(k), k ≥ 1, at least for a particular set of problems.
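In practice, the smoothness of an initial guess or of an error vector can at least be monitored, for instance via the decay of its discrete Fourier coefficients. A minimal sketch of such a diagnostic; the cutoff wavenumber, the grid size and the test vectors are arbitrary illustrative choices, not quantities from the theorems above:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 128
x = np.arange(N) / N

smooth = np.sin(2 * np.pi * x) + 0.1 * np.sin(6 * np.pi * x)  # smooth guess
noisy = rng.standard_normal(N)                                 # random guess

def high_freq_fraction(v, cutoff=8):
    """Fraction of spectral energy above a (hypothetical) cutoff wavenumber."""
    c = np.abs(np.fft.rfft(v))**2
    return c[cutoff:].sum() / c.sum()

# a smooth vector concentrates its energy in the lowest wavenumbers,
# a random one spreads it roughly uniformly across the spectrum
assert high_freq_fraction(smooth) < 1e-10
assert high_freq_fraction(noisy) > 0.3
```

A criterion of this kind could serve as an inexpensive indicator of whether the improved convergence regime of theorem 3.7 can be expected, although it does not replace the smoothness assumption in the proof.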
Other extensions of SDC.
Furthermore, one could try to adapt the presented convergence proofs of MLSDC to other extensions and variations of SDC, as for example the parallel-in-time method PFASST (Parallel Full Approximation Scheme in Space and Time) [13] or general semi-implicit and multi-implicit formulations of SDC (SISDC/MISDC) [24, 25]. Whereas an adaptation to SISDC and MISDC methods seems to be rather straightforward [8], we found that the application of similar concepts and ideas to prove the convergence of PFASST may involve some difficulties. In particular, the coupling of the different time steps, i.e. the use of the approximation at the endpoint of the last subinterval as the start point of the next one, could cause a problem in this context, since the corresponding operator is independent of Δt and would thus add a constant term to our estimations.

References

[1] W Auzinger, H Hofstätter, W Kreuzer, E Weinmüller, Modified defect correction algorithms for ODEs. Part I: General theory, Numerical Algorithms 36 (2004) 135–155
[2] S Bartels, Numerik 3x9, Springer Spektrum (2016)
[3] M Bolten, D Moser, R Speck, A multigrid perspective on the parallel full approximation scheme in space and time, Numerical Linear Algebra with Applications 24 (2017) e2110
[4] M Bolten, D Moser, R Speck, Asymptotic convergence of the parallel full approximation scheme in space and time for linear problems, Numerical Linear Algebra with Applications 25 (2018)
[5] A Bourlioux, A T Layton, M L Minion, High-order multi-implicit spectral deferred correction methods for problems of reactive flow, Journal of Computational Physics 189 (2003) 651–675
[6] E Bouzarth, M Minion, A multirate time integrator for regularized Stokeslets, Journal of Computational Physics 229 (2010) 4208–4224
[7] K Böhmer, P Hemker, H Stetter, The Defect Correction Approach, from "Defect Correction Methods: Theory and Applications", Springer, Berlin (1984) 1–32
[8] M Causley, D Seal, On the convergence of spectral deferred correction methods, Communications in Applied Mathematics and Computational Science 14 (2017)
[9] A Christlieb, B Ong, J M Qiu, Comments on high-order integrators embedded within integral deferred correction methods, Communications in Applied Mathematics and Computational Science 4 (2009) 27–56
[10] A Christlieb, B Ong, J M Qiu, Integral deferred correction methods constructed with high order Runge-Kutta integrators, Mathematics of Computation 79 (2010) 761–783
[11] J Daniel, V Pereyra, L Schumaker, Integrated Deferred Corrections for Initial Value Problems, Acta Cient. Venezolana 19 (1968) 128–135
[12] A Dutt, L Greengard, V Rokhlin, Spectral Deferred Correction Methods for Ordinary Differential Equations, BIT Numerical Mathematics 40 (2000) 241–266
[13] M Emmett, M Minion, Toward an efficient parallel in time method for partial differential equations, Communications in Applied Mathematics and Computational Science 7 (2012) 105–132
[14] M Emmett, E Motheau, W Zhang, M Minion, J B Bell, A fourth-order adaptive mesh refinement algorithm for the multicomponent, reacting compressible Navier-Stokes equations, Combustion Theory and Modelling 23 (2019) 592–625
[15] T Hagstrom, R Zhou, On the spectral deferred correction of splitting methods for initial value problems, Communications in Applied Mathematics and Computational Science 1 (2006) 169–205
[16] E Hairer, S Nørsett, G Wanner, Solving ordinary differential equations I: nonstiff problems, volume 8, Springer-Verlag (1993)
[17] F P Hamon, M Schreiber, M L Minion, Multi-level spectral deferred corrections scheme for the shallow water equations on the rotating sphere, Journal of Computational Physics 376 (2019) 435–454
[18] A Hansen, J Strain, Convergence theory for spectral deferred correction, University of California at Berkeley (2005)
[19] M Heath, Scientific Computing: An Introductory Survey, 2nd edition, SIAM (2018)
[20] J Huang, J Jia, M Minion, Accelerating the convergence of spectral deferred correction methods, Journal of Computational Physics 214 (2006) 633–656
[21] J Huang, J Jia, M Minion, Arbitrary order Krylov deferred correction methods for differential algebraic equations, Journal of Computational Physics 221 (2007) 739–760
[22] D Kammler, A First Course in Fourier Analysis, Cambridge University Press (2008)
[23] A T Layton, M L Minion, Conservative multi-implicit spectral deferred correction methods for reacting gas dynamics, Journal of Computational Physics 194 (2004) 697–715
[24] M Minion, Semi-implicit spectral deferred correction methods for ordinary differential equations, Communications in Mathematical Sciences 1 (2003) 471–500
[25] M Minion, Semi-implicit projection methods for incompressible flow based on spectral deferred corrections, Applied Numerical Mathematics 48 (2004) 369–387
[26] V Pereyra, Iterated deferred corrections for nonlinear operator equations, Numerische Mathematik 10 (1967) 316–323
[27] D Ruprecht, R Speck, Spectral Deferred Corrections with Fast-wave Slow-wave Splitting, SIAM Journal on Scientific Computing 38 (2016) A2535–A2557
[28] R Speck, pySDC (2017), available at http://parallel-in-time.org/pySDC/
[29] R Speck, Parallelizing spectral deferred corrections across the method, Computing and Visualization in Science 19 (2018) 75–83
[30] R Speck, Algorithm 997: pySDC, Prototyping Spectral Deferred Corrections, ACM Transactions on Mathematical Software 45 (2019)
[31] R Speck, et al., A multi-level spectral deferred correction method, BIT Numerical Mathematics 55 (2015) 843–867
[32] H Stetter, Economical global error estimation, from "Stiff differential systems", Springer-Verlag (1974) 245–258
[33] T Tang, H Xie, X Yin, High-Order Convergence of Spectral Deferred Correction Methods on General Quadrature Nodes, Journal of Scientific Computing (2013)
[34] G Wanner, E Hairer, Solving ordinary differential equations II: stiff and differential-algebraic problems, Springer-Verlag (1991)
[35] M Weiser, Faster SDC convergence on non-equidistant grids by DIRK sweeps, BIT Numerical Mathematics 55 (2015) 1219–1241
[36] M Winkel, R Speck, D Ruprecht, A high-order Boris integrator, Journal of Computational Physics 295 (2015) 456–474
[37] Y Xia, Y Xu, C Shu, Efficient time discretization for local discontinuous Galerkin methods, Discrete and Continuous Dynamical Systems 8 (2007) 677–693