A new convergence analysis of two-level hierarchical basis methods
XUEFENG XU
Abstract.
This paper is concerned with the convergence analysis of two-level hierarchical basis (TLHB) methods in a general setting, where the decomposition associated with two hierarchical component spaces is not required to be a direct sum. The TLHB scheme can be regarded as a combination of compatible relaxation and coarse-grid correction. Most of the previous works focus on the case of exact coarse solver, and the existing identity for the convergence factor of exact TLHB methods involves a tricky max-min problem. In this work, we present a new and purely algebraic analysis of TLHB methods, which gives a succinct identity for the convergence factor of exact TLHB methods. The new identity can be conveniently utilized to derive an optimal interpolation and analyze the influence of coarse space on the convergence factor. Moreover, we establish two-sided bounds for the convergence factor of TLHB methods with inexact coarse solver, which extend the existing TLHB theory.

1. Introduction
Multigrid is a typical multilevel iterative scheme, which has been proved to be a powerful solver (with linear or near-linear computational complexity) for a large class of linear systems that arise from discretized partial differential equations; see, e.g., [12, 8, 17, 19]. The fundamental module of multigrid is a two-grid scheme, which combines two complementary error-reduction processes: a smoothing (or relaxation) process and a coarse-grid correction process. The smoothing process is typically a simple iterative method such as the Jacobi and Gauss–Seidel iterations. Usually, it is efficient on high-frequency (i.e., oscillatory) error modes, while the low-frequency (i.e., smooth) part cannot be eliminated effectively. One way to capture the low-frequency error is to coarsen the underlying grid so that low-frequency modes on the initial fine grid appear high-frequency on a coarser grid. The low-frequency error will be further eliminated by a relaxation method on the coarse grid. The resulting correction can be interpolated back to the fine grid by an interpolation operator. Such a process is the so-called coarse-grid correction.

For a given initial guess $u_0 \in \mathbb{R}^n$, the smoothing iteration for solving $Au = f$ can be described as
(1.1) $u_{k+1} = u_k + M^{-1}(f - Au_k)$,
where $A \in \mathbb{R}^{n\times n}$ is symmetric positive definite (SPD) and $M \in \mathbb{R}^{n\times n}$ is a nonsingular smoother. In the classical two-grid analysis (see, e.g., [10, 11, 19]), $M$ is assumed to be $A$-convergent (i.e., $\|I - M^{-1}A\|_A < 1$), which is equivalent to the positive definiteness of $M + M^T - A$; see, e.g., [19, Proposition 3.8]. This assumption plays a crucial role in the theoretical analysis of two-grid methods.

Mathematics Subject Classification. Primary 65F08, 65F10, 65N55; Secondary 15A18.
Key words and phrases. Two-level methods, hierarchical basis, convergence analysis, inexact coarse solver.

As mentioned earlier, the global smoothing (1.1) has little effect on the low-frequency error in general. Alternatively, one can use a local smoother $M_s \in \mathbb{R}^{n_s\times n_s}$ ($n_s < n$) in the smoothing process, which is expected to focus on eliminating the high-frequency error modes. Compared with the global smoother $M$, one has more room to design the local smoother $M_s$, because its size is relatively small. Inspired by the compatible relaxation iterations in [10] (the idea of compatible relaxation originated with Brandt [6]), we perform the following smoothing iteration:
(1.2) $u_{k+1} = u_k + SM_s^{-1}S^T(f - Au_k)$,
where $S \in \mathbb{R}^{n\times n_s}$ is of full column rank and $M_s \in \mathbb{R}^{n_s\times n_s}$ is $(S^TAS)$-convergent (noting that $M_s$ is not restricted to $S^TMS$). Clearly, the iteration (1.2) will reduce to (1.1) if $n_s = n$ and $S = I_n$. Such a special case is not our focus here: two-grid theory has been well developed in the literature; see, e.g., [10, 11, 19, 24, 14, 16, 20, 21, 22]. Let $u_\ell \in \mathbb{R}^n$ be an approximation to the exact solution $u$ (e.g., $u_\ell$ is generated from (1.2)), and let $P \in \mathbb{R}^{n\times n_c}$ ($n_c < n$) be an interpolation matrix with full column rank. The (exact) coarse-grid correction can then be described as
(1.3) $u_{\ell+1} = u_\ell + P(P^TAP)^{-1}P^T(f - Au_\ell)$.
The two-level hierarchical basis (TLHB) scheme can be obtained by successively performing presmoothing, coarse-grid correction, and postsmoothing iterations (see Algorithm 1). Some pioneering works on TLHB methods can be found in [5, 2, 23, 4, 3, 18]. A basic assumption in the classical TLHB theory is that $(S \;\; P)$ is square and nonsingular (which entails that $n_c = n - n_s$). For example,
\[ S = \begin{pmatrix} I_{n_s} \\ 0 \end{pmatrix}, \qquad P = \begin{pmatrix} \ast \\ I_{n-n_s} \end{pmatrix}. \]
This assumption leads to the positive definiteness of the hierarchical basis matrix
(1.4) $\begin{pmatrix} S^T \\ P^T \end{pmatrix} A \begin{pmatrix} S & P \end{pmatrix} = \begin{pmatrix} S^TAS & S^TAP \\ P^TAS & P^TAP \end{pmatrix}$.
An important quantity involved in the analysis of multilevel methods is the so-called
Cauchy–Bunyakowski–Schwarz (C.B.S.) constant [9, 1, 11]. The C.B.S. constant associated with (1.4) is defined as
(1.5) $\gamma := \max\limits_{v_s \in \mathbb{R}^{n_s}\setminus\{0\},\, v_c \in \mathbb{R}^{n_c}\setminus\{0\}} \dfrac{v_s^TS^TAPv_c}{\sqrt{v_s^TS^TASv_s \cdot v_c^TP^TAPv_c}}$.
The positive definiteness of (1.4) implies that $\gamma \in [0,1)$; the constant $\gamma$ measures the abstract angle between $\mathrm{range}(S)$ and $\mathrm{range}(P)$.

Using a hierarchical expression for the inverse of the TLHB preconditioner (see (2.6)), one can easily verify that a necessary condition for TLHB convergence is
(1.6) $\mathrm{rank}\begin{pmatrix} S & P \end{pmatrix} = n$,
which is a foundation of TLHB analysis. In particular, in the case of exact coarse solver, the condition (1.6) is also sufficient for TLHB convergence. Obviously, (1.6) implies that $n_s + n_c \ge n$, i.e., $n_c \ge n - n_s$. If $S$ and $M_s$ are preselected, the classical setting $n_c = n - n_s$ is not the optimal one, at least from the perspective of convergence (see Theorem 3.7 and Remark 3.8). In addition, if $n_c > n - n_s$, then $\gamma$ happens to be $1$, which will trivialize some classical TLHB theories (see, e.g., [9, 1, 18, 11]). Under the condition (1.6), Falgout, Vassilevski, and Zikatanov [11, Theorem 4.1] established an identity for the convergence factor of exact TLHB methods. However, the identity (see also (2.15)) involves a tricky max-min problem: it is generally difficult to determine when the 'min' is attained, which limits the application of the identity.

In this paper, we derive a new and succinct identity (see (3.1)) for the convergence factor of exact TLHB methods under the condition (1.6). Our proof is not only novel but also much simpler than that in [11]. The new identity provides a straightforward approach to analyze the optimal interpolation (see Theorem 3.4) and the influence of $\mathrm{range}(P)$ on the convergence factor (see Theorem 3.7). In practice, the Galerkin coarse-grid system is often too costly to solve exactly, especially when its size is still large.
Instead, one can solve the system approximately as long as the convergence speed is satisfactory. Compared with the exact case, the convergence analysis of TLHB methods with inexact coarse solver is of more practical significance. Motivated by this observation, we establish two-sided bounds for the convergence factor of inexact TLHB methods, from which one can readily get the identity for the exact case.

The rest of this paper is organized as follows. In Section 2, we introduce some properties and convergence results of TLHB methods. In Section 3, we present a new identity for the convergence factor of exact TLHB methods, followed by some discussions on how the new identity can be used to analyze the optimal interpolation and the influence of $\mathrm{range}(P)$ on the convergence factor. In Section 4, we establish a systematic convergence theory for inexact TLHB methods. In Section 5, we give some concluding remarks.

2. Preliminaries
We start with some notation used in the subsequent discussions.
– $I_n$ denotes the $n \times n$ identity matrix (or $I$ when its size is clear from context).
– $\lambda_{\min}(\cdot)$, $\lambda_{\min}^{+}(\cdot)$, and $\lambda_{\max}(\cdot)$ stand for the smallest eigenvalue, the smallest positive eigenvalue, and the largest eigenvalue of a matrix, respectively.
– $\lambda_i(\cdot)$ denotes the $i$-th smallest eigenvalue of a matrix.
– $\lambda(\cdot)$ denotes the spectrum of a matrix.
– $\rho(\cdot)$ represents the spectral radius of a matrix.
– $\|\cdot\|_2$ denotes the spectral norm of a matrix.
– $\|\cdot\|_A$ denotes the energy norm induced by an SPD matrix $A \in \mathbb{R}^{n\times n}$: for any $v \in \mathbb{R}^n$, $\|v\|_A = \sqrt{v^TAv}$; for any $B \in \mathbb{R}^{n\times n}$, $\|B\|_A = \max_{v \in \mathbb{R}^n\setminus\{0\}} \frac{\|Bv\|_A}{\|v\|_A}$.

Our focus is on TLHB methods for solving the linear system
(2.1) $Au = f$,
where $A \in \mathbb{R}^{n\times n}$ is SPD, $u \in \mathbb{R}^n$, and $f \in \mathbb{R}^n$. Some basic assumptions involved in the analysis of TLHB methods are listed below.
• Let $S \in \mathbb{R}^{n\times n_s}$ and $P \in \mathbb{R}^{n\times n_c}$ be of full column rank, where $\max\{n_s, n_c\} < n \le n_s + n_c$.
• Assume that $(S \;\; P) \in \mathbb{R}^{n\times(n_s+n_c)}$ is of full row rank, or, equivalently, for any $v \in \mathbb{R}^n$, there exist $v_s \in \mathbb{R}^{n_s}$ and $v_c \in \mathbb{R}^{n_c}$ such that $v = Sv_s + Pv_c$.
• Let $M_s$ be an $n_s \times n_s$ nonsingular matrix such that $M_s + M_s^T - A_s$ is SPD, where $A_s := S^TAS$.
• Let $B_c \in \mathbb{R}^{n_c\times n_c}$ be an SPD approximation to $A_c$, where $A_c := P^TAP$ is the so-called Galerkin coarse-grid matrix.

With the above assumptions, the standard TLHB scheme for solving (2.1) can be described as Algorithm 1 ($u_0 \in \mathbb{R}^n$ is an initial guess). If $B_c = A_c$, then Algorithm 1 is called an exact TLHB method; otherwise, it is called an inexact TLHB method.
Algorithm 1 TLHB method
Presmoothing: $u \leftarrow u_0 + SM_s^{-1}S^T(f - Au_0)$
Restriction: $r_c \leftarrow P^T(f - Au)$
Coarse-grid correction: $e_c \leftarrow B_c^{-1}r_c$
Interpolation: $u \leftarrow u + Pe_c$
Postsmoothing: $u_{\rm TL} \leftarrow u + SM_s^{-T}S^T(f - Au)$

Remark 2.1. Due to $n_s < n$, it follows that
\[ \|I - SM_s^{-1}S^TA\|_A = 1, \]
which does not satisfy a conventional assumption in two-grid analysis, that is, the smoothing iteration is a contraction in the $A$-norm. Moreover, there is no nonsingular matrix $X \in \mathbb{R}^{n\times n}$ such that
\[ I - X^{-1}A = (I - SM_s^{-1}S^TA)(I - SM_s^{-T}S^TA). \]
Therefore, the classical two-grid theory is not applicable to Algorithm 1. Compared with the two-grid case, one has more room to design the local smoother $M_s$ instead of limiting it to simple types (e.g., the Jacobi or Gauss–Seidel type). For example, if $M_s$ is taken to be $A_s$, then
\[ I - SM_s^{-1}S^TA = I - SA_s^{-1}S^TA, \]
which is an $A$-orthogonal projection along (or parallel to) $\mathrm{range}(S)$ onto $\mathrm{null}(S^TA)$ and hence can remove the error components contained in $\mathrm{range}(S)$. If $\mathrm{range}(S)$ covers most of the high-frequency modes, then the smoothing iteration will eliminate the high-frequency error effectively.

From Algorithm 1, we have $u - u_{\rm TL} = \widetilde{E}_{\rm TL}(u - u_0)$ with
(2.2) $\widetilde{E}_{\rm TL} = (I - SM_s^{-T}S^TA)(I - PB_c^{-1}P^TA)(I - SM_s^{-1}S^TA)$,
which is called the iteration matrix (or error propagation matrix) of Algorithm 1. Define
(2.3) $\overline{M}_s := M_s(M_s + M_s^T - A_s)^{-1}M_s^T$.
Then, $\widetilde{E}_{\rm TL}$ can be expressed as
(2.4) $\widetilde{E}_{\rm TL} = I - \widetilde{B}_{\rm TL}^{-1}A$,
where
(2.5) $\widetilde{B}_{\rm TL}^{-1} = S\overline{M}_s^{-1}S^T + (I - SM_s^{-T}S^TA)PB_c^{-1}P^T(I - ASM_s^{-1}S^T)$.
Indeed, $\widetilde{B}_{\rm TL}^{-1}$ admits the following hierarchical expression:
(2.6) $\widetilde{B}_{\rm TL}^{-1} = \begin{pmatrix} S & P \end{pmatrix}\widehat{B}_{\rm TL}^{-1}\begin{pmatrix} S & P \end{pmatrix}^T$,
where
(2.7) $\widehat{B}_{\rm TL} = \begin{pmatrix} I & 0 \\ P^TASM_s^{-1} & I \end{pmatrix}\begin{pmatrix} \overline{M}_s & 0 \\ 0 & B_c \end{pmatrix}\begin{pmatrix} I & M_s^{-T}S^TAP \\ 0 & I \end{pmatrix}$.
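The identities (2.4)–(2.7) are easy to spot-check numerically. The following NumPy sketch is our own illustration (random data, a Gauss–Seidel-type local smoother $M_s = \mathrm{tril}(A_s)$, and an arbitrary SPD $B_c$ are our choices, not prescribed by the paper); it verifies that the iteration matrix (2.2) equals $I - \widetilde{B}_{\rm TL}^{-1}A$ with $\widetilde{B}_{\rm TL}^{-1}$ given by (2.5), and that the hierarchical factorization (2.6)–(2.7) holds:

```python
import numpy as np

rng = np.random.default_rng(0)
n, ns, nc = 12, 5, 8                      # note nc > n - ns: no direct sum required
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)               # SPD matrix
S = rng.standard_normal((n, ns))          # full column rank (almost surely)
P = rng.standard_normal((n, nc))
As = S.T @ A @ S
Ms = np.tril(As)                          # Gauss–Seidel-type smoother:
                                          # Ms + Ms.T - As = diag(As) is SPD
Bc = P.T @ A @ P + 0.1 * np.eye(nc)       # an inexact SPD coarse solver

I = np.eye(n)
Msi = np.linalg.inv(Ms)
MsiT = Msi.T
# (2.2): post-smoothing * coarse-grid correction * pre-smoothing
E = (I - S @ MsiT @ S.T @ A) @ (I - P @ np.linalg.inv(Bc) @ P.T @ A) \
    @ (I - S @ Msi @ S.T @ A)

# (2.3)/(2.5): E = I - Binv @ A
Mbar = Ms @ np.linalg.inv(Ms + Ms.T - As) @ Ms.T
Binv = S @ np.linalg.inv(Mbar) @ S.T \
     + (I - S @ MsiT @ S.T @ A) @ P @ np.linalg.inv(Bc) @ P.T @ (I - A @ S @ Msi @ S.T)
print(np.allclose(E, I - Binv @ A))       # True

# (2.6)-(2.7): Binv = (S P) Bhat^{-1} (S P)^T
SP = np.hstack([S, P])
Lf = np.block([[np.eye(ns), np.zeros((ns, nc))], [P.T @ A @ S @ Msi, np.eye(nc)]])
Df = np.block([[Mbar, np.zeros((ns, nc))], [np.zeros((nc, ns)), Bc]])
Uf = np.block([[np.eye(ns), MsiT @ S.T @ A @ P], [np.zeros((nc, ns)), np.eye(nc)]])
Bhat = Lf @ Df @ Uf
print(np.allclose(Binv, SP @ np.linalg.inv(Bhat) @ SP.T))   # True
```

Observe that here the columns of $(S \;\; P)$ are linearly dependent ($n_s + n_c > n$); the factorization is nevertheless valid, which is exactly the general setting of this paper.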
The matrix $\widetilde{B}_{\rm TL}$ is referred to as the TLHB preconditioner, whose positive definiteness follows from the positive definiteness of $\widehat{B}_{\rm TL}$ and $\mathrm{rank}(S \;\; P) = n$. According to (2.2) and (2.4), we deduce that
(2.8) $\|\widetilde{E}_{\rm TL}\|_A = \rho(\widetilde{E}_{\rm TL}) = \max\big\{\lambda_{\max}\big(\widetilde{B}_{\rm TL}^{-1}A\big) - 1,\; 1 - \lambda_{\min}\big(\widetilde{B}_{\rm TL}^{-1}A\big)\big\}$.
In particular, if $B_c = A_c$, then the iteration matrix is denoted by $E_{\rm TL}$, and
(2.9) $E_{\rm TL} = (I - SM_s^{-T}S^TA)(I - \Pi_A)(I - SM_s^{-1}S^TA)$,
where
(2.10) $\Pi_A := PA_c^{-1}P^TA$.
Similarly, we have
(2.11) $E_{\rm TL} = I - B_{\rm TL}^{-1}A$,
where
(2.12) $B_{\rm TL}^{-1} = S\overline{M}_s^{-1}S^T + (I - SM_s^{-T}S^TA)PA_c^{-1}P^T(I - ASM_s^{-1}S^T)$.
Note that $\Pi_A$ is an $A$-orthogonal projection. We then have
(2.13) $\|E_{\rm TL}\|_A = \lambda_{\max}(E_{\rm TL}) = 1 - \lambda_{\min}\big(B_{\rm TL}^{-1}A\big)$.
Based on the so-called saddle-point lemma [11, Lemma 3.1], Falgout, Vassilevski, and Zikatanov [11, Theorem 4.1] derived an identity for $\|E_{\rm TL}\|_A$, as described in the following theorem.

Theorem 2.2.
Define
(2.14) $\widetilde{M}_s := M_s^T(M_s + M_s^T - A_s)^{-1}M_s$.
The convergence factor of Algorithm 1 with $B_c = A_c$ can be characterized as
(2.15) $\|E_{\rm TL}\|_A = 1 - \dfrac{1}{K_{\rm TL}}$,
where
(2.16) $K_{\rm TL} = \max\limits_{v \in \mathrm{range}(I-\Pi_A)} \; \min\limits_{v_s:\, v = (I-\Pi_A)Sv_s} \dfrac{v_s^T\widetilde{M}_sv_s}{v^TAv}$.

The identity (2.15) is valid as long as $\mathrm{rank}(S \;\; P) = n$. In particular, if $(S \;\; P)$ is square and nonsingular, Falgout, Vassilevski, and Zikatanov [11, Corollary 4.1 and Theorem 4.2] further proved the following results.
Theorem 2.3. If $(S \;\; P)$ is an $n \times n$ nonsingular matrix, then
(2.17) $K_{\rm TL} = \max\limits_{v_s \in \mathbb{R}^{n_s}\setminus\{0\}} \dfrac{v_s^T\widetilde{M}_sv_s}{v_s^TS^TA(I-\Pi_A)Sv_s}$
and
(2.18) $K_{\rm TL} \le \dfrac{\lambda_{\max}(A_s^{-1}\widetilde{M}_s)}{1 - \gamma^2}$,
where $\gamma$ is defined by (1.5). In the case of inexact coarse solver, if $\lambda(B_c^{-1}A_c) \subset \big[\frac{1}{1+\delta},\, 1\big]$ with $\delta > 0$, then
(2.19) $\|\widetilde{E}_{\rm TL}\|_A \le 1 - \dfrac{1 - \gamma^2}{K_{\rm TL}(1 + \delta)}$.

3. Convergence analysis of exact TLHB methods
In this section, we present a new convergence analysis of Algorithm 1 with exact coarse solver (under the condition $\mathrm{rank}(S \;\; P) = n$), which gives a succinct identity for the convergence factor $\|E_{\rm TL}\|_A$. The new identity can be conveniently used to analyze the optimal interpolation and the influence of $\mathrm{range}(P)$ on $\|E_{\rm TL}\|_A$.

Observe that the identity (2.15) involves a tricky max-min problem. In general, it is difficult to determine when the 'min' is attained, which limits the application of (2.15). Furthermore, the proof of (2.15) provided in [11] is not very direct. The following theorem gives a new and succinct identity for $\|E_{\rm TL}\|_A$, whose proof is straightforward.

Theorem 3.1.
The convergence factor of Algorithm 1 with $B_c = A_c$ can be characterized as
(3.1) $\|E_{\rm TL}\|_A = 1 - \sigma_{\rm TL}$,
where
(3.2) $\sigma_{\rm TL} = \lambda_{\min}^{+}\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A)\big) = \lambda_{\min}^{+}\big(\widetilde{M}_s^{-1}S^TA(I - \Pi_A)S\big)$.

Proof.
By (2.9) and (2.11), we have
\[ B_{\rm TL}^{-1}A = I - (I - SM_s^{-T}S^TA)(I - \Pi_A)(I - SM_s^{-1}S^TA). \]
Then
\[ \lambda\big(B_{\rm TL}^{-1}A\big) = \lambda\big(I - (I - SM_s^{-1}S^TA)(I - SM_s^{-T}S^TA)(I - \Pi_A)\big) = \lambda\big(I - (I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = \lambda\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A) + \Pi_A\big). \]
Since $\Pi_A^2 = \Pi_A$ and $\mathrm{rank}(\Pi_A) = n_c$, there exists a nonsingular matrix $Y \in \mathbb{R}^{n\times n}$ such that
(3.3) $\Pi_A = Y^{-1}\begin{pmatrix} I_{n_c} & 0 \\ 0 & 0 \end{pmatrix}Y$.
Let
(3.4) $S\widetilde{M}_s^{-1}S^TA = Y^{-1}\begin{pmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22} \end{pmatrix}Y$,
where $Z_{ij} \in \mathbb{R}^{n_i\times n_j}$ with $n_1 = n_c$ and $n_2 = n - n_c$. Direct computations yield
(3.5) $S\widetilde{M}_s^{-1}S^TA(I - \Pi_A) = Y^{-1}\begin{pmatrix} 0 & Z_{12} \\ 0 & Z_{22} \end{pmatrix}Y$,
(3.6) $S\widetilde{M}_s^{-1}S^TA(I - \Pi_A) + \Pi_A = Y^{-1}\begin{pmatrix} I_{n_c} & Z_{12} \\ 0 & Z_{22} \end{pmatrix}Y$.
Note that $A^{\frac12}E_{\rm TL}A^{-\frac12}$ is symmetric positive semidefinite (SPSD) and $B_{\rm TL}$ is SPD. We then have
\[ \lambda\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A) + \Pi_A\big) = \lambda\big(B_{\rm TL}^{-1}A\big) \subset (0, 1], \]
which, together with (3.6), leads to $\lambda(Z_{22}) \subset (0, 1]$. Hence,
\[ \lambda_{\min}\big(B_{\rm TL}^{-1}A\big) = \lambda_{\min}\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A) + \Pi_A\big) = \lambda_{\min}(Z_{22}) = \lambda_{\min}^{+}\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A)\big), \]
where we have used the expressions (3.5) and (3.6). The identity (3.1) then follows immediately from (2.13). $\square$

Remark 3.2. According to the proof of Theorem 3.1, we deduce that
(3.7) $\lambda\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A)\big) = \{\underbrace{0, \ldots, 0}_{n_c},\, \nu_1, \ldots, \nu_{n-n_c}\}$,
where $0 < \nu_i \le 1$, $i = 1, \ldots, n - n_c$. Then
(3.8) $\lambda\big(\widetilde{M}_s^{-1}S^TA(I - \Pi_A)S\big) = \{\underbrace{0, \ldots, 0}_{n_s+n_c-n},\, \nu_1, \ldots, \nu_{n-n_c}\}$.
In particular, if $(S \;\; P)$ is an $n \times n$ nonsingular matrix, then
\[ \sigma_{\rm TL}^{-1} = \lambda_{\min}^{-1}\big(\widetilde{M}_s^{-1}S^TA(I - \Pi_A)S\big) = \bigg(\min_{v_s \in \mathbb{R}^{n_s}\setminus\{0\}} \frac{v_s^TS^TA(I - \Pi_A)Sv_s}{v_s^T\widetilde{M}_sv_s}\bigg)^{-1} = \max_{v_s \in \mathbb{R}^{n_s}\setminus\{0\}} \frac{v_s^T\widetilde{M}_sv_s}{v_s^TS^TA(I - \Pi_A)Sv_s}, \]
which gives the expression (2.17). If $M_s$ is further taken to be $A_s$, then
\[ \|E_{\rm TL}\|_A = 1 - \lambda_{\min}\big(A_s^{-1}S^TA(I - \Pi_A)S\big) = 1 - \lambda_{\min}\big(I - A_s^{-1}S^TAPA_c^{-1}P^TAS\big) = \lambda_{\max}\big(A_s^{-1}S^TAPA_c^{-1}P^TAS\big) = \big\|A_s^{-\frac12}S^TAPA_c^{-\frac12}\big\|_2^2 = \gamma^2, \]
where $\gamma$ is defined by (1.5).

The proof of Theorem 3.1 also yields a characterization for the spectrum of $E_{\rm TL}$, as described in the following corollary.
Corollary 3.3.
The spectrum of $E_{\rm TL}$ is given by
(3.9) $\lambda(E_{\rm TL}) = \{\underbrace{0, \ldots, 0}_{n_c},\, 1 - \nu_1, \ldots, 1 - \nu_{n-n_c}\}$,
where $\{\nu_i\}_{i=1}^{n-n_c}$ are the positive eigenvalues of $\widetilde{M}_s^{-1}S^TA(I - \Pi_A)S$.

Compared with (2.15), the identity (3.1) is more convenient for TLHB analysis. Of particular interest is an interpolation $P$ that minimizes the convergence factor $\|E_{\rm TL}\|_A$, provided that $S$ and $M_s$ are preselected. Using (3.1), we can derive the following optimal interpolation theory.

Theorem 3.4.
Let $\{(\mu_i, v_i)\}_{i=1}^{n}$ be the eigenpairs of $S\widetilde{M}_s^{-1}S^TA$, where $\mu_1 \le \mu_2 \le \cdots \le \mu_n$ and
\[ v_i^TAv_j = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \ne j. \end{cases} \]
Then
(3.10) $\|E_{\rm TL}\|_A \ge 1 - \mu_{n_c+1}$,
and the equality holds if $\mathrm{range}(P) = \mathrm{span}\{v_1, \ldots, v_{n_c}\}$.

Proof. Due to the facts that $\widetilde{M}_s - A_s$ is SPSD and that $S\widetilde{M}_s^{-1}S^TA$ has the same nonzero eigenvalues as $\widetilde{M}_s^{-1}A_s$, it follows that
\[ 0 = \mu_1 = \cdots = \mu_{n-n_s} < \mu_{n-n_s+1} \le \cdots \le \mu_n \le 1. \]
Let $V = (v_1, \ldots, v_n)$ and $U_1 = V^{-1}P(P^TV^{-T}V^{-1}P)^{-\frac12}$. It is easy to check that $V^TAV = I$ and that $U_1$ is an $n \times n_c$ matrix with orthonormal columns (i.e., $U_1^TU_1 = I_{n_c}$). Let $U_2$ be an $n \times (n - n_c)$ matrix such that $(U_1 \;\; U_2)$ is orthogonal. Then
\[ S\widetilde{M}_s^{-1}S^TA(I - \Pi_A) = S\widetilde{M}_s^{-1}S^TA(I - PA_c^{-1}P^TA) = S\widetilde{M}_s^{-1}S^TA(I - VU_1U_1^TV^TA) = S\widetilde{M}_s^{-1}S^TA(I - VU_1U_1^TV^{-1}) = S\widetilde{M}_s^{-1}S^TAV(I - U_1U_1^T)V^{-1} = V\Sigma U_2U_2^TV^{-1}, \]
where $\Sigma = \mathrm{diag}(0, \ldots, 0, \mu_{n-n_s+1}, \ldots, \mu_n)$. According to (3.7) and
\[ \Sigma U_2U_2^T = V^{-1}S\widetilde{M}_s^{-1}S^TA(I - \Pi_A)V, \]
we deduce that $\Sigma U_2U_2^T$ has $n - n_c$ positive eigenvalues. Since $(U_1 \;\; U_2)^T = (U_1 \;\; U_2)^{-1}$ and
\[ \begin{pmatrix} U_1^T \\ U_2^T \end{pmatrix}\Sigma U_2U_2^T\begin{pmatrix} U_1 & U_2 \end{pmatrix} = \begin{pmatrix} 0 & U_1^T\Sigma U_2 \\ 0 & U_2^T\Sigma U_2 \end{pmatrix}, \]
it follows that $U_2^T\Sigma U_2$ is positive definite. Hence,
\[ \sigma_{\rm TL} = \lambda_{\min}^{+}\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A)\big) = \lambda_{\min}^{+}(\Sigma U_2U_2^T) = \lambda_{\min}(U_2^T\Sigma U_2). \]
Using the Poincaré separation theorem (see, e.g., [13, Corollary 4.3.37]), we obtain
\[ \lambda_{\min}(U_2^T\Sigma U_2) = \lambda_1(U_2^T\Sigma U_2) \le \lambda_{n_c+1}(\Sigma) = \mu_{n_c+1}. \]
Consequently,
\[ \|E_{\rm TL}\|_A = 1 - \lambda_{\min}(U_2^T\Sigma U_2) \ge 1 - \mu_{n_c+1}. \]
In particular, if $\mathrm{range}(P) = \mathrm{span}\{v_1, \ldots, v_{n_c}\}$, then there exists a nonsingular matrix $P_0 \in \mathbb{R}^{n_c\times n_c}$ such that
\[ P = V\begin{pmatrix} P_0 \\ 0 \end{pmatrix}. \]
In this case,
\[ U_1 = \begin{pmatrix} P_0 \\ 0 \end{pmatrix}(P_0^TP_0)^{-\frac12}. \]
Then
\[ U_2U_2^T = I - U_1U_1^T = I - \begin{pmatrix} P_0 \\ 0 \end{pmatrix}(P_0^TP_0)^{-1}\begin{pmatrix} P_0^T & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & I_{n-n_c} \end{pmatrix}. \]
Hence,
\[ \|E_{\rm TL}\|_A = 1 - \lambda_{\min}^{+}(\Sigma U_2U_2^T) = 1 - \mu_{n_c+1}. \]
This completes the proof. $\square$
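The identity (3.1) and the optimality statement of Theorem 3.4 can be illustrated on a small random example. In the sketch below (our own illustration; the random SPD $A$ and the Gauss–Seidel-type choice $M_s = \mathrm{tril}(A_s)$ are assumptions, not from the paper), the $A$-orthonormal eigenpairs of $S\widetilde{M}_s^{-1}S^TA$ are obtained from the symmetric matrix $A^{1/2}S\widetilde{M}_s^{-1}S^TA^{1/2}$; the coarse space $\mathrm{span}\{v_1,\ldots,v_{n_c}\}$ attains the bound (3.10), while a random $P$ does not do better:

```python
import numpy as np

rng = np.random.default_rng(2)
n, ns, nc = 12, 5, 8
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)                      # SPD
S = rng.standard_normal((n, ns))
As = S.T @ A @ S
Ms = np.tril(As)                                 # Ms + Ms.T - As = diag(As) is SPD
Mt = Ms.T @ np.linalg.inv(Ms + Ms.T - As) @ Ms   # symmetrized smoother (2.14)

w, V = np.linalg.eigh(A)
Ah = V @ np.diag(np.sqrt(w)) @ V.T               # A^{1/2}
Ahi = np.linalg.inv(Ah)

def normE_A(P):
    """Energy norm of the exact TLHB iteration matrix (2.9)."""
    I = np.eye(n)
    Pi = P @ np.linalg.inv(P.T @ A @ P) @ P.T @ A
    E = (I - S @ np.linalg.inv(Ms).T @ S.T @ A) @ (I - Pi) \
        @ (I - S @ np.linalg.inv(Ms) @ S.T @ A)
    return np.linalg.norm(Ah @ E @ Ahi, 2)

def sigma_TL(P):
    """Smallest positive eigenvalue in (3.2)."""
    Pi = P @ np.linalg.inv(P.T @ A @ P) @ P.T @ A
    ev = np.sort(np.linalg.eigvals(np.linalg.inv(Mt) @ S.T @ A @ (np.eye(n) - Pi) @ S).real)
    return ev[ev > 1e-8].min()

P = rng.standard_normal((n, nc))
print(abs(normE_A(P) - (1 - sigma_TL(P))))       # ~ 0: the identity (3.1)

# A-orthonormal eigenpairs of S Mt^{-1} S^T A, ascending mu_1 <= ... <= mu_n
mu, W = np.linalg.eigh(Ah @ S @ np.linalg.inv(Mt) @ S.T @ Ah)
P_opt = (Ahi @ W)[:, :nc]                        # span{v_1, ..., v_nc}
print(abs(normE_A(P_opt) - (1 - mu[nc])))        # ~ 0: equality in (3.10)
print(normE_A(P) >= normE_A(P_opt) - 1e-9)       # True: optimality of P_opt
```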
Remark 3.5. As mentioned in [11, Page 483], $A_s$ happens to be well-conditioned in the classical TLHB methods, so it is not that impractical to take $M_s = A_s$. In such a case, the eigenvalues of $S\widetilde{M}_s^{-1}S^TA$ are
\[ 0 = \mu_1 = \cdots = \mu_{n-n_s} < \mu_{n-n_s+1} = \cdots = \mu_n = 1. \]
Then, $\mu_{n_c+1} = 1$ (since $n_c \ge n - n_s$), which gives the optimal convergence factor $0$.

Remark 3.6. Unlike the optimal interpolation theory for two-grid methods [20, 7], $S\widetilde{M}_s^{-1}S^T$ here is a singular matrix. That is, $(S\widetilde{M}_s^{-1}S^T)^{-1}$ is not well-defined and hence cannot induce an inner product in $\mathbb{R}^n$.

Besides the optimal interpolation analysis, the identity (3.1) is also convenient for analyzing the influence of $\mathrm{range}(P)$ on $\|E_{\rm TL}\|_A$.

Theorem 3.7.
Let $\widehat{P} \in \mathbb{R}^{n\times\widehat{n}_c}$ be of full column rank (with $n_c \le \widehat{n}_c < n$), and let $\mathrm{rank}(S \;\; \widehat{P}) = n$. If $\mathrm{range}(P) \subseteq \mathrm{range}(\widehat{P})$, then
(3.11) $\sigma_{\rm TL} \le \widehat{\sigma}_{\rm TL}$,
where $\sigma_{\rm TL}$ is given by (3.2) and
(3.12) $\widehat{\sigma}_{\rm TL} = \lambda_{\min}^{+}\big(S\widetilde{M}_s^{-1}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)\big)$.

Proof.
Since $\mathrm{range}(P) \subseteq \mathrm{range}(\widehat{P})$, there exists an $\widehat{n}_c \times n_c$ matrix $W$ such that
\[ P = \widehat{P}W. \]
Note that $W \in \mathbb{R}^{\widehat{n}_c\times n_c}$ is of full column rank. One can find an $\widehat{n}_c \times \widehat{n}_c$ nonsingular matrix $\widehat{W}$ such that
\[ W = \widehat{W}\begin{pmatrix} I_{n_c} \\ 0 \end{pmatrix}, \]
which yields
\[ P = (\widehat{P}\widehat{W})\begin{pmatrix} I_{n_c} \\ 0 \end{pmatrix}. \]
Hence, $\widehat{P}\widehat{W} = (P \;\; C)$ for some $C \in \mathbb{R}^{n\times(\widehat{n}_c-n_c)}$. Since the projection $\widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA$ is invariant under nonsingular column transformations of $\widehat{P}$, we have from (3.12) that
(3.13) $\widehat{\sigma}_{\rm TL} = \lambda_{\min}^{+}\big(S\widetilde{M}_s^{-1}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)\big)$,
where $\widehat{P} = (P \;\; C)$. In light of (3.2), (3.8), and (3.13), we have
\[ \sigma_{\rm TL} = \lambda_{\min}^{+}\big(\widetilde{M}_s^{-\frac12}S^TA(I - \Pi_A)S\widetilde{M}_s^{-\frac12}\big) = \lambda_{n_s+n_c-n+1}\big(\widetilde{M}_s^{-\frac12}S^TA(I - \Pi_A)S\widetilde{M}_s^{-\frac12}\big), \]
\[ \widehat{\sigma}_{\rm TL} = \lambda_{\min}^{+}\big(\widetilde{M}_s^{-\frac12}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)S\widetilde{M}_s^{-\frac12}\big) = \lambda_{n_s+\widehat{n}_c-n+1}\big(\widetilde{M}_s^{-\frac12}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)S\widetilde{M}_s^{-\frac12}\big). \]
Let
\[ D = \widetilde{M}_s^{-\frac12}S^TA(I - \Pi_A)S\widetilde{M}_s^{-\frac12} - \widetilde{M}_s^{-\frac12}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)S\widetilde{M}_s^{-\frac12}. \]
Then
\[ D = \widetilde{M}_s^{-\frac12}S^TA\big[\widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA - P(P^TAP)^{-1}P^TA\big]S\widetilde{M}_s^{-\frac12} = \widetilde{M}_s^{-\frac12}S^TA\bigg[\widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA - \widehat{P}\begin{pmatrix} (P^TAP)^{-1} & 0 \\ 0 & 0 \end{pmatrix}\widehat{P}^TA\bigg]S\widetilde{M}_s^{-\frac12} = \widetilde{M}_s^{-\frac12}S^TA\widehat{P}\bigg[(\widehat{P}^TA\widehat{P})^{-1} - \begin{pmatrix} (P^TAP)^{-1} & 0 \\ 0 & 0 \end{pmatrix}\bigg]\widehat{P}^TAS\widetilde{M}_s^{-\frac12}. \]
It is easy to verify that
\[ (\widehat{P}^TA\widehat{P})^{-1} - \begin{pmatrix} (P^TAP)^{-1} & 0 \\ 0 & 0 \end{pmatrix} \]
is an SPSD matrix of rank $\widehat{n}_c - n_c$ [15, Lemma 2.7]. Accordingly, $D$ is an SPSD matrix of rank at most $\widehat{n}_c - n_c$. Using [13, Corollary 4.3.5], we obtain
\[ \sigma_{\rm TL} = \lambda_{n_s+n_c-n+1}\big(\widetilde{M}_s^{-\frac12}S^TA(I - \Pi_A)S\widetilde{M}_s^{-\frac12}\big) = \lambda_{n_s+n_c-n+1}\big(\widetilde{M}_s^{-\frac12}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)S\widetilde{M}_s^{-\frac12} + D\big) \le \lambda_{n_s+n_c-n+1+\mathrm{rank}(D)}\big(\widetilde{M}_s^{-\frac12}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)S\widetilde{M}_s^{-\frac12}\big) \le \lambda_{n_s+\widehat{n}_c-n+1}\big(\widetilde{M}_s^{-\frac12}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)S\widetilde{M}_s^{-\frac12}\big) = \widehat{\sigma}_{\rm TL}. \]
This completes the proof. $\square$
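The monotonicity (3.11) can likewise be observed numerically: appending extra columns to $P$ enlarges the coarse space and can only increase $\sigma_{\rm TL}$. A minimal sketch (the random data and the smoother choice $M_s = \mathrm{tril}(A_s)$ are our own assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n, ns, nc = 12, 5, 8
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)
S = rng.standard_normal((n, ns))
As = S.T @ A @ S
Ms = np.tril(As)                                 # Ms + Ms.T - As = diag(As) is SPD
Mt = Ms.T @ np.linalg.inv(Ms + Ms.T - As) @ Ms

def sigma(P):
    # smallest positive eigenvalue of S Mt^{-1} S^T A (I - Pi_A), cf. (3.2)/(3.12)
    Pi = P @ np.linalg.inv(P.T @ A @ P) @ P.T @ A
    ev = np.sort(np.linalg.eigvals(S @ np.linalg.inv(Mt) @ S.T @ A @ (np.eye(n) - Pi)).real)
    return ev[ev > 1e-8].min()

P = rng.standard_normal((n, nc))
Phat = np.hstack([P, rng.standard_normal((n, 2))])  # range(P) ⊆ range(Phat), nc_hat = nc + 2 < n
print(sigma(P) <= sigma(Phat) + 1e-10)              # True: the inequality (3.11)
```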
Remark 3.8. According to (3.1) and (3.11), we deduce that $\|E_{\rm TL}\|_A$ decreases when increasing the number of columns in $P$ (i.e., $n_c$). In other words, $n_c$ cannot be very small in order to achieve a satisfactory convergence.

4. Convergence analysis of inexact TLHB methods
In practice, the Galerkin coarse-grid system is often too costly to solve exactly. Without essential loss of convergence speed, it is advisable to solve the problem approximately (one way is to apply Algorithm 1 recursively). In this section, we establish a new convergence theory for Algorithm 1 with inexact coarse solver under the condition $\mathrm{rank}(S \;\; P) = n$.
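Before turning to the theory, the effect of an inexact coarse solver is easy to observe numerically. In the sketch below (our own illustration; choosing $B_c = A_c + tRR^T \succeq A_c$ guarantees $\lambda(B_c^{-1}A_c) \subset (0, 1]$), the convergence factor $\|\widetilde{E}_{\rm TL}\|_A$ of (2.2) stays below $1$ but grows as the coarse solve becomes cruder:

```python
import numpy as np
from numpy.linalg import inv, eigh, norm

rng = np.random.default_rng(4)
n, ns, nc = 12, 5, 8
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)
S, P = rng.standard_normal((n, ns)), rng.standard_normal((n, nc))
As, Ac = S.T @ A @ S, P.T @ A @ P
Ms = np.tril(As)                        # Ms + Ms.T - As = diag(As) is SPD
I = np.eye(n)
w, V = eigh(A)
Ah = V @ np.diag(np.sqrt(w)) @ V.T      # A^{1/2}

def conv_factor(Bc):
    """||E~_TL||_A for the inexact coarse solver Bc, cf. (2.2)."""
    E = (I - S @ inv(Ms).T @ S.T @ A) @ (I - P @ inv(Bc) @ P.T @ A) \
        @ (I - S @ inv(Ms) @ S.T @ A)
    return norm(Ah @ E @ inv(Ah), 2)

R = rng.standard_normal((nc, nc))
fs = []
for t in [0.0, 0.1, 1.0, 10.0]:
    fs.append(conv_factor(Ac + t * R @ R.T))
    print(t, fs[-1])                    # factors are nondecreasing in t, all below 1
```

Here $t = 0$ is the exact TLHB method; the monotone growth reflects the ordering $B_c^{(1)} \preceq B_c^{(2)} \Rightarrow \|\widetilde{E}_{\rm TL}^{(1)}\|_A \le \|\widetilde{E}_{\rm TL}^{(2)}\|_A$, valid for this family with $B_c \succeq A_c$.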
We remark that the estimate (2.19) is only applicable for $n_c = n - n_s$. In fact, if $n_c > n - n_s$, then
\[ \begin{pmatrix} S^T \\ P^T \end{pmatrix}A\begin{pmatrix} S & P \end{pmatrix} = \begin{pmatrix} A_s & S^TAP \\ P^TAS & A_c \end{pmatrix} \]
is a singular SPSD matrix (its rank is $n < n_s + n_c$). Hence, the Schur complement $A_s - S^TAPA_c^{-1}P^TAS$ is SPSD and singular, which leads to the positive semidefiniteness and singularity of $I - A_s^{-\frac12}S^TAPA_c^{-1}P^TASA_s^{-\frac12}$. Then
\[ \gamma^2 = \lambda_{\max}\big(A_s^{-\frac12}S^TAPA_c^{-1}P^TASA_s^{-\frac12}\big) = 1. \]
Obviously, the upper bound in (2.19) is always $1$, no matter how close $B_c$ is to $A_c$. That is, the estimate (2.19) will be trivial if $n_c > n - n_s$.

In what follows, we establish two-sided bounds for the convergence factor $\|\widetilde{E}_{\rm TL}\|_A$ under the general condition $n_c \ge n - n_s$. The following identities on the extreme eigenvalues of $(I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)$ and $(I - S\widetilde{M}_s^{-1}S^TA)\Pi_A$ will be frequently used in the subsequent analysis.

Lemma 4.1.
The following eigenvalue identities hold:
(4.1a) $\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = 0$,
(4.1b) $\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = 1 - \sigma_{\rm TL}$,
(4.1c) $\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 0$,
(4.1d) $\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = \begin{cases} 1 - \lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A), & \text{if } r = n_c, \\ 1, & \text{if } r < n_c, \end{cases}$
where $r = \mathrm{rank}(S^TAP)$.

Proof. The positive semidefiniteness of $\widetilde{M}_s - A_s$ implies that $I - A^{\frac12}S\widetilde{M}_s^{-1}S^TA^{\frac12}$ is SPSD, which yields the positive semidefiniteness of $A^{-1} - S\widetilde{M}_s^{-1}S^T$. Then
\[ \lambda\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = \lambda\big((A^{-1} - S\widetilde{M}_s^{-1}S^T)A(I - \Pi_A)\big) = \lambda\big((A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A(I - \Pi_A)(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}\big) \subset [0, +\infty). \]
Similarly,
\[ \lambda\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) \subset [0, +\infty). \]
The identities (4.1a) and (4.1c) then follow immediately from the facts
\[ \det\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = 0 \quad \text{and} \quad \det\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 0. \]
Here, $\det(\cdot)$ denotes the determinant of a matrix. According to the proof of Theorem 3.1, it holds that
\[ \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = 1 - \lambda_{\min}\big(B_{\rm TL}^{-1}A\big) = 1 - \sigma_{\rm TL}, \]
which is exactly the identity (4.1b). By (3.3) and (3.4), we have
\[ S\widetilde{M}_s^{-1}S^TA\Pi_A = Y^{-1}\begin{pmatrix} Z_{11} & 0 \\ Z_{21} & 0 \end{pmatrix}Y, \qquad (I - S\widetilde{M}_s^{-1}S^TA)\Pi_A = Y^{-1}\begin{pmatrix} I_{n_c} - Z_{11} & 0 \\ -Z_{21} & 0 \end{pmatrix}Y. \]
Then
\[ \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \lambda_{\min}(Z_{11}). \]
• If $r = n_c$, then $\mathrm{rank}\big(\widetilde{M}_s^{-\frac12}S^TAPA_c^{-\frac12}\big) = n_c$, which leads to the positive definiteness of $A_c^{-\frac12}P^TAS\widetilde{M}_s^{-1}S^TAPA_c^{-\frac12}$. This implies that $S\widetilde{M}_s^{-1}S^TA\Pi_A$ has $n_c$ positive eigenvalues. Consequently, $\lambda_{\min}(Z_{11}) = \lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A)$.
• If $r < n_c$, we deduce from the above argument that $\lambda_{\min}(Z_{11}) = 0$.
Thus, the identity (4.1d) is proved. $\square$

To analyze the convergence of Algorithm 1, we need an important tool for eigenvalue analysis, i.e., the well-known
Weyl’s theorem (see, e.g., [13, Theorem 4.3.1]).
Lemma 4.2.
Let $H_1, H_2 \in \mathbb{C}^{n\times n}$ be Hermitian. Assume that the spectra of $H_1$, $H_2$, and $H_1 + H_2$ are $\{\lambda_i(H_1)\}_{i=1}^n$, $\{\lambda_i(H_2)\}_{i=1}^n$, and $\{\lambda_i(H_1 + H_2)\}_{i=1}^n$, respectively. Then, for each $k = 1, \ldots, n$, it holds that
\[ \lambda_{k-j+1}(H_1) + \lambda_j(H_2) \le \lambda_k(H_1 + H_2) \le \lambda_{k+\ell}(H_1) + \lambda_{n-\ell}(H_2) \]
for all $j = 1, \ldots, k$ and $\ell = 0, \ldots, n - k$. In particular, one has
(4.2a) $\lambda_{\min}(H_1 + H_2) \ge \lambda_{\min}(H_1) + \lambda_{\min}(H_2)$,
(4.2b) $\lambda_{\min}(H_1 + H_2) \le \min\{\lambda_{\min}(H_1) + \lambda_{\max}(H_2),\; \lambda_{\max}(H_1) + \lambda_{\min}(H_2)\}$,
(4.2c) $\lambda_{\max}(H_1 + H_2) \ge \max\{\lambda_{\max}(H_1) + \lambda_{\min}(H_2),\; \lambda_{\min}(H_1) + \lambda_{\max}(H_2)\}$,
(4.2d) $\lambda_{\max}(H_1 + H_2) \le \lambda_{\max}(H_1) + \lambda_{\max}(H_2)$.

Remark 4.3. Certainly, Weyl's theorem is applicable to real symmetric matrices. It is worth noting that this theorem can also be applied to the nonsymmetric matrix $(I - S\widetilde{M}_s^{-1}S^TA)(I - t\Pi_A)$ with a parameter $t$, which is based on the fact
\[ \lambda\big((I - S\widetilde{M}_s^{-1}S^TA)(I - t\Pi_A)\big) = \lambda\big((A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A(I - t\Pi_A)(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}\big). \]
One can first apply Weyl's theorem to the symmetric matrix
\[ (A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A(I - t\Pi_A)(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}, \]
and then transform the result into a form related to
\[ I - S\widetilde{M}_s^{-1}S^TA, \quad (I - S\widetilde{M}_s^{-1}S^TA)\Pi_A, \quad \text{or} \quad (I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A). \]
For example, if $t \ge 0$, using (4.2a), we obtain
\[ \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - t\Pi_A)\big) = \lambda_{\min}\big((A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A(I - t\Pi_A)(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}\big) \ge \lambda_{\min}\big((A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}\big) - t\lambda_{\max}\big((A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A\Pi_A(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}\big) = \lambda_{\min}(I - S\widetilde{M}_s^{-1}S^TA) - t\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big). \]
For the sake of brevity, such a trick will be implicitly used in the proof of the next theorem. Now, we are ready to present the new convergence theory for Algorithm 1 with inexact coarse solver.
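The identities of Lemma 4.1 can be checked numerically in both branches of (4.1d). The helper below is our own illustration (random data, $M_s = \mathrm{tril}(A_s)$); it uses $n_s \ge n_c$ to realize $r = n_c$ and $n_s < n_c$ to realize $r < n_c$:

```python
import numpy as np

def lemma41(n, ns, nc, seed):
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n, n))
    A = G @ G.T + n * np.eye(n)
    S, P = rng.standard_normal((n, ns)), rng.standard_normal((n, nc))
    As = S.T @ A @ S
    Ms = np.tril(As)                               # Ms + Ms.T - As = diag(As) is SPD
    Mt = Ms.T @ np.linalg.inv(Ms + Ms.T - As) @ Ms
    I = np.eye(n)
    X = S @ np.linalg.inv(Mt) @ S.T @ A            # S Mt^{-1} S^T A
    Pi = P @ np.linalg.inv(P.T @ A @ P) @ P.T @ A  # Pi_A, (2.10)
    eig = lambda M: np.sort(np.linalg.eigvals(M).real)
    e1 = eig((I - X) @ (I - Pi))                   # spectrum for (4.1a)/(4.1b)
    e2 = eig((I - X) @ Pi)                         # spectrum for (4.1c)/(4.1d)
    ev, evp = eig(X @ (I - Pi)), eig(X @ Pi)
    sigma = ev[ev > 1e-8].min()                    # sigma_TL, (3.2)
    lam = evp[evp > 1e-8].min()                    # lambda^+_min(X Pi_A)
    return e1, e2, sigma, lam, np.linalg.matrix_rank(S.T @ A @ P)

# r = rank(S^T A P) = nc (here ns = 7 >= nc = 6): first branch of (4.1d)
e1, e2, sg, lam, r = lemma41(10, 7, 6, seed=5)
print(r, e1.min(), e1.max() - (1 - sg), e2.min(), e2.max() - (1 - lam))

# r = ns = 4 < nc = 8: second branch, lambda_max((I - X) Pi_A) = 1
e1b, e2b, _, _, rb = lemma41(10, 4, 8, seed=6)
print(rb, e2b.max() - 1.0)
```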
Theorem 4.4.
Let
(4.3) $\alpha = \lambda_{\min}(B_c^{-1}A_c)$ and $\beta = \lambda_{\max}(B_c^{-1}A_c)$.
(i) If $\mathrm{rank}(S^TAP) = n_c$, then
(4.4) $L_1 \le \|\widetilde{E}_{\rm TL}\|_A \le U_1$,
where
\[ L_1 = \begin{cases} 1 - \min\big\{\beta - \beta\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A),\; \sigma_{\rm TL}\big\}, & \text{if } \beta \le 1, \\ 1 - \min\big\{\lambda_{\max}(\widetilde{M}_s^{-1}A_s),\; \beta\sigma_{\rm TL}\big\}, & \text{if } \alpha \le 1 < \beta, \\ \max\big\{\alpha - 1 - \alpha\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A),\; 1 - \lambda_{\max}(\widetilde{M}_s^{-1}A_s),\; (\alpha - 1)\big(1 - \lambda_{\max}(\widetilde{M}_s^{-1}A_s)\big),\; 1 - \beta\sigma_{\rm TL}\big\}, & \text{if } 1 < \alpha, \end{cases} \]
\[ U_1 = \begin{cases} 1 - \alpha\sigma_{\rm TL}, & \text{if } \beta \le 1, \\ \max\big\{1 - \alpha\sigma_{\rm TL},\; (\beta - 1)\big(1 - \lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A)\big)\big\}, & \text{if } \alpha \le 1 < \beta, \\ \max\big\{1 - \sigma_{\rm TL},\; (\beta - 1)\big(1 - \lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A)\big)\big\}, & \text{if } 1 < \alpha. \end{cases} \]
(ii) If $\mathrm{rank}(S^TAP) < n_c$, then
(4.5) $L_2 \le \|\widetilde{E}_{\rm TL}\|_A \le U_2$,
where
\[ L_2 = \begin{cases} 1 - \min\{\beta,\; \sigma_{\rm TL}\}, & \text{if } \beta \le 1, \\ 1 - \min\big\{\lambda_{\max}(\widetilde{M}_s^{-1}A_s),\; \beta\sigma_{\rm TL}\big\}, & \text{if } \alpha \le 1 < \beta, \\ \max\big\{\alpha - 1,\; 1 - \lambda_{\max}(\widetilde{M}_s^{-1}A_s),\; 1 - \beta\sigma_{\rm TL}\big\}, & \text{if } 1 < \alpha, \end{cases} \]
\[ U_2 = \begin{cases} 1 - \alpha\sigma_{\rm TL}, & \text{if } \beta \le 1, \\ \max\{1 - \alpha\sigma_{\rm TL},\; \beta - 1\}, & \text{if } \alpha \le 1 < \beta, \\ \max\{1 - \sigma_{\rm TL},\; \beta - 1\}, & \text{if } 1 < \alpha. \end{cases} \]

Proof.
By (2.2) and (2.4), we have
\[ \widetilde{B}_{\rm TL}^{-1}A = I - (I - SM_s^{-T}S^TA)(I - PB_c^{-1}P^TA)(I - SM_s^{-1}S^TA), \]
which yields
\[ \lambda\big(\widetilde{B}_{\rm TL}^{-1}A\big) = \lambda\big(I - (I - SM_s^{-1}S^TA)(I - SM_s^{-T}S^TA)(I - PB_c^{-1}P^TA)\big) = \lambda\big(I - (I - S\widetilde{M}_s^{-1}S^TA)(I - PB_c^{-1}P^TA)\big). \]
Then
\[ \lambda_{\max}\big(\widetilde{B}_{\rm TL}^{-1}A\big) = 1 - \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - PB_c^{-1}P^TA)\big), \qquad \lambda_{\min}\big(\widetilde{B}_{\rm TL}^{-1}A\big) = 1 - \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - PB_c^{-1}P^TA)\big). \]
Note that $(I - S\widetilde{M}_s^{-1}S^TA)(I - PB_c^{-1}P^TA)$ has the same spectrum as the symmetric matrix $(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A(I - PB_c^{-1}P^TA)(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}$. In view of (4.3), we have
(4.6a) $-a_1 \le \lambda_{\max}\big(\widetilde{B}_{\rm TL}^{-1}A\big) - 1 \le -a_2$,
(4.6b) $b_2 \le 1 - \lambda_{\min}\big(\widetilde{B}_{\rm TL}^{-1}A\big) \le b_1$,
where
\[ a_1 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \alpha\Pi_A)\big), \quad a_2 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \beta\Pi_A)\big), \]
\[ b_1 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \alpha\Pi_A)\big), \quad b_2 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \beta\Pi_A)\big). \]
According to (2.8), (4.6a), and (4.6b), we deduce that
(4.7) $\max\{-a_1,\; b_2\} \le \|\widetilde{E}_{\rm TL}\|_A \le \max\{-a_2,\; b_1\}$.
The remaining task is to establish the upper bounds for $a_1$ and $b_1$, as well as the lower bounds for $a_2$ and $b_2$.

Case 1: $\mathrm{rank}(S^TAP) = n_c$.

Subcase 1.1: $\beta \le 1$. By (4.2b), we have that
\[ a_1 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \alpha\Pi_A)\big) \le \lambda_{\min}(I - S\widetilde{M}_s^{-1}S^TA) - \alpha\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \lambda_{\max}(\widetilde{M}_s^{-1}A_s) \]
and
\[ a_1 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\big(I - \Pi_A + (1 - \alpha)\Pi_A\big)\big) \le \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) + (1 - \alpha)\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \alpha - (1 - \alpha)\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A), \]
where we have used the identities (4.1a), (4.1c), and (4.1d). We then have
(4.8) $a_1 \le 1 - \max\big\{\lambda_{\max}(\widetilde{M}_s^{-1}A_s),\; \alpha + (1 - \alpha)\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A)\big\}$.
Using (4.2a), we obtain
\[ a_2 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\big((1 - \beta)I + \beta(I - \Pi_A)\big)\big) \ge (1 - \beta)\lambda_{\min}(I - S\widetilde{M}_s^{-1}S^TA) + \beta\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big), \]
which, together with (4.1a), yields
(4.9) $a_2 \ge (1 - \beta)\big(1 - \lambda_{\max}(\widetilde{M}_s^{-1}A_s)\big)$.
By (4.2d), we have
\[ b_1 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\big((1 - \alpha)I + \alpha(I - \Pi_A)\big)\big) \le (1 - \alpha)\lambda_{\max}(I - S\widetilde{M}_s^{-1}S^TA) + \alpha\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big). \]
The above inequality, together with (4.1b), gives
(4.10) $b_1 \le 1 - \alpha\sigma_{\rm TL}$.
In light of (4.2c), we have that
\[ b_2 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \beta\Pi_A)\big) \ge \lambda_{\max}(I - S\widetilde{M}_s^{-1}S^TA) - \beta\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \beta + \beta\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A) \]
and
\[ b_2 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\big(I - \Pi_A + (1 - \beta)\Pi_A\big)\big) \ge \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) + (1 - \beta)\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \sigma_{\rm TL}, \]
where we have used the identities (4.1b)–(4.1d). Then
(4.11) $b_2 \ge 1 - \min\big\{\beta - \beta\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A),\; \sigma_{\rm TL}\big\}$.

Subcase 1.2: $\alpha \le 1 < \beta$. In this subcase, the inequalities (4.8) and (4.10) still hold. We next focus on the lower bounds for $a_2$ and $b_2$. Using (4.2a), we obtain
\[ a_2 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\big(I - \Pi_A + (1 - \beta)\Pi_A\big)\big) \ge \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) + (1 - \beta)\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big), \]
which, together with (4.1a) and (4.1d), leads to
(4.12) $a_2 \ge (1 - \beta)\big(1 - \lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A)\big)$.
By (4.1b), (4.1c), and (4.2c), we have that
\[ b_2 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \beta\Pi_A)\big) \ge \lambda_{\min}(I - S\widetilde{M}_s^{-1}S^TA) - \beta\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \lambda_{\max}(\widetilde{M}_s^{-1}A_s) \]
and
\[ b_2 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\big((1 - \beta)I + \beta(I - \Pi_A)\big)\big) \ge (1 - \beta)\lambda_{\max}(I - S\widetilde{M}_s^{-1}S^TA) + \beta\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = 1 - \beta\sigma_{\rm TL}. \]
Accordingly,
(4.13) $b_2 \ge 1 - \min\big\{\lambda_{\max}(\widetilde{M}_s^{-1}A_s),\; \beta\sigma_{\rm TL}\big\}$.

Subcase 1.3: $1 < \alpha$. In this subcase, the estimates (4.12) and (4.13) are still valid. We then consider the upper bounds for $a_1$ and $b_1$. In light of (4.1a), (4.1d), and (4.2b), we have that
\[ a_1 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \alpha\Pi_A)\big) \le \lambda_{\max}(I - S\widetilde{M}_s^{-1}S^TA) - \alpha\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \alpha + \alpha\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A) \]
and
\[ a_1 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\big((1 - \alpha)I + \alpha(I - \Pi_A)\big)\big) \le (1 - \alpha)\lambda_{\min}(I - S\widetilde{M}_s^{-1}S^TA) + \alpha\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = 1 - \alpha + (\alpha - 1)\lambda_{\max}(\widetilde{M}_s^{-1}A_s). \]
Hence,
(4.14) $a_1 \le 1 - \alpha + \min\big\{\alpha\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A),\; (\alpha - 1)\lambda_{\max}(\widetilde{M}_s^{-1}A_s)\big\}$.
Using (4.2d), we obtain
\[ b_1 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\big(I - \Pi_A + (1 - \alpha)\Pi_A\big)\big) \le \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) + (1 - \alpha)\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big), \]
which, together with (4.1b) and (4.1c), yields
(4.15) $b_1 \le 1 - \sigma_{\rm TL}$.
Combining (4.7)–(4.15), we can arrive at the estimate (4.4) immediately.
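As a quick sanity check of the Case 1 estimates, the following sketch (our own illustration; random data, $M_s = \mathrm{tril}(A_s)$, and $B_c \succeq A_c$ so that $\beta \le 1$) verifies the subcase $\beta \le 1$ of the bound (4.4) on an example with $\mathrm{rank}(S^TAP) = n_c$:

```python
import numpy as np
from numpy.linalg import inv, eigvals, eigh, norm

rng = np.random.default_rng(7)
n, ns, nc = 10, 7, 6                      # ns >= nc, so rank(S^T A P) = nc: case (i)
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)
S, P = rng.standard_normal((n, ns)), rng.standard_normal((n, nc))
As, Ac = S.T @ A @ S, P.T @ A @ P
Ms = np.tril(As)                          # Ms + Ms.T - As = diag(As) is SPD
Mt = Ms.T @ inv(Ms + Ms.T - As) @ Ms
I = np.eye(n)
Pi = P @ inv(Ac) @ P.T @ A
X = S @ inv(Mt) @ S.T @ A

R = rng.standard_normal((nc, nc))
Bc = Ac + 0.3 * R @ R.T                   # Bc >= Ac, hence beta <= 1
ab = np.sort(eigvals(inv(Bc) @ Ac).real)
alpha, beta = ab[0], ab[-1]

ev = np.sort(eigvals(X @ (I - Pi)).real)
sigma = ev[ev > 1e-8].min()               # sigma_TL, (3.2)
evp = np.sort(eigvals(X @ Pi).real)
lam = evp[evp > 1e-8].min()               # lambda^+_min(X Pi_A)

E = (I - S @ inv(Ms).T @ S.T @ A) @ (I - P @ inv(Bc) @ P.T @ A) \
    @ (I - S @ inv(Ms) @ S.T @ A)
w, V = eigh(A)
Ah = V @ np.diag(np.sqrt(w)) @ V.T
nE = norm(Ah @ E @ inv(Ah), 2)            # ||E~_TL||_A

L1 = 1 - min(beta - beta * lam, sigma)    # lower bound in (4.4), subcase beta <= 1
U1 = 1 - alpha * sigma                    # upper bound
print(L1 - 1e-10 <= nE <= U1 + 1e-10)     # True
```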
Case 2: $\operatorname{rank}(S^{T}AP)<n_{c}$. The detailed proof of Case 2 is omitted for the sake of conciseness. With the identities (4.1a)–(4.1c) and
\[
\lambda_{\max}\big((I-S\widetilde{M}^{-1}S^{T}A)\Pi_{A}\big)=1,
\]
one can prove the following inequalities similarly.

Subcase 2.1: $\beta\le 1$. It is easy to show that
\[
a \le 1-\max\big\{\lambda_{\max}(\widetilde{M}^{-1}A_{s}),\ \alpha\big\}, \tag{4.16}
\]
\[
a \ge (1-\beta)\big(1-\lambda_{\max}(\widetilde{M}^{-1}A_{s})\big), \tag{4.17}
\]
\[
b \le 1-\alpha\sigma_{\rm TL}, \tag{4.18}
\]
\[
b \ge 1-\min\{\beta,\ \sigma_{\rm TL}\}. \tag{4.19}
\]

Subcase 2.2: $\alpha\le 1<\beta$. In this subcase, the inequalities (4.16) and (4.18) still hold. In addition,
\[
a \ge 1-\beta, \tag{4.20}
\]
\[
b \ge 1-\min\big\{\lambda_{\max}(\widetilde{M}^{-1}A_{s}),\ \beta\sigma_{\rm TL}\big\}. \tag{4.21}
\]

Subcase 2.3: $1<\alpha$. In such a case, the inequalities (4.20) and (4.21) still hold. Furthermore, one has
\[
a \le 1-\alpha, \tag{4.22}
\]
\[
b \le 1-\sigma_{\rm TL}. \tag{4.23}
\]
The estimate (4.5) then follows directly from (4.7) and (4.16)–(4.23). This completes the proof. □

Remark. If $\operatorname{rank}(S^{T}AP)=n_{c}$, we get from (4.1b) and (4.1d) that
\[
2-\sigma_{\rm TL}-\lambda_{\min}^{+}(S\widetilde{M}^{-1}S^{T}A\Pi_{A})
=\lambda_{\max}\big((I-S\widetilde{M}^{-1}S^{T}A)(I-\Pi_{A})\big)+\lambda_{\max}\big((I-S\widetilde{M}^{-1}S^{T}A)\Pi_{A}\big)
\ge \lambda_{\max}(I-S\widetilde{M}^{-1}S^{T}A)=1,
\]
which yields
\[
1-\lambda_{\min}^{+}(S\widetilde{M}^{-1}S^{T}A\Pi_{A})\ge\sigma_{\rm TL}. \tag{4.24}
\]
With the relation (4.24), one can easily check that the estimate (4.4) reduces to (3.1) when $B_{c}=A_{c}$. Likewise, the estimate (4.5) will reduce to (3.1) if $B_{c}=A_{c}$.

Conclusions
In this paper, we present a purely algebraic convergence analysis of TLHB methods, provided that $(S\ \ P)$ is of full row rank ($S$ and $P$ correspond to the two hierarchical components). A new and succinct identity for the convergence factor of exact TLHB methods is derived, which can be conveniently used to analyze the optimal interpolation and the influence of $\operatorname{range}(P)$ on the convergence factor. Two-sided bounds for the convergence factor of inexact TLHB methods are also established, which provide a theoretical framework for the convergence analysis of multilevel methods (a multilevel method can be treated as an inexact two-level scheme). An interesting question in the inexact TLHB theory is how to design an efficient coarse solver, which serves as a motivation for developing new multilevel algorithms.

References
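As a concrete illustration of the two-level structure discussed above, the following sketch sets up a TLHB-type iteration for the 1D Laplacian: an exact correction on the fine hierarchical component $\operatorname{range}(S)$ (the subspace treated by relaxation) followed by an exact coarse-grid correction on $\operatorname{range}(P)$, and it measures the spectral radius of the error-propagation matrix. The grid size, the choice of $S$ as injection onto the fine-only nodes, linear interpolation $P$, and the exact subspace solves are all illustrative assumptions, not the paper's algorithm verbatim.

```python
import numpy as np

n = 31                                        # fine grid size (odd)
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Laplacian stencil

S = np.eye(n)[:, 0::2]                        # fine-only (hierarchical) nodes
nc = (n - 1) // 2
P = np.zeros((n, nc))                         # linear interpolation to coarse nodes
for j in range(nc):
    P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]   # coarse node sits at index 2j+1

# Exact correction on range(V): I - V (V^T A V)^{-1} V^T A is the
# A-orthogonal projection onto the A-orthogonal complement of range(V).
corr = lambda V: V @ np.linalg.solve(V.T @ A @ V, V.T @ A)

# Error propagation: subspace correction on range(S), then coarse correction.
E = (np.eye(n) - corr(P)) @ (np.eye(n) - corr(S))
rho = np.abs(np.linalg.eigvals(E)).max()
print("measured convergence factor ~", round(rho, 3))
assert rho < 1  # (S  P) has full rank here, so the scheme converges
```

Since $[S\ \ P]$ is nonsingular in this construction, the two exact subspace corrections together contract every error component, so the measured factor is strictly below one; replacing the exact coarse solve by an approximate $B_c$ is precisely the inexact setting the two-sided bounds address.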
1. O. Axelsson, Iterative Solution Methods, Cambridge University Press, Cambridge, 1994.
2. O. Axelsson and I. Gustafsson, Preconditioning and two-level multigrid methods of arbitrary degree of approximation, Math. Comp. (1983), 219–242.
3. R. E. Bank, Hierarchical bases and the finite element method, Acta Numer. (1996), 1–43.
4. R. E. Bank, T. F. Dupont, and H. Yserentant, The hierarchical basis multigrid method, Numer. Math. (1988), 427–458.
5. D. Braess, The contraction number of a multigrid method for solving the Poisson equation, Numer. Math. (1981), 387–404.
6. A. E. Brandt, General highly accurate algebraic coarsening, Electron. Trans. Numer. Anal. (2000), 1–20.
7. J. Brannick, F. Cao, K. Kahl, R. D. Falgout, and X. Hu, Optimal interpolation and compatible relaxation in classical algebraic multigrid, SIAM J. Sci. Comput. (2018), A1473–A1493.
8. W. L. Briggs, V. E. Henson, and S. F. McCormick, A Multigrid Tutorial, second ed., SIAM, Philadelphia, PA, 2000.
9. V. Eijkhout and P. S. Vassilevski, The role of the strengthened Cauchy–Buniakowskii–Schwarz inequality in multilevel methods, SIAM Rev. (1991), 405–419.
10. R. D. Falgout and P. S. Vassilevski, On generalizing the algebraic multigrid framework, SIAM J. Numer. Anal. (2004), 1669–1693.
11. R. D. Falgout, P. S. Vassilevski, and L. T. Zikatanov, On two-grid convergence estimates, Numer. Linear Algebra Appl. (2005), 471–494.
12. W. Hackbusch, Multi-Grid Methods and Applications, Springer-Verlag, Berlin, Heidelberg, 1985.
13. R. A. Horn and C. R. Johnson, Matrix Analysis, second ed., Cambridge University Press, Cambridge, 2013.
14. S. P. MacLachlan and L. N. Olson, Theoretical bounds for algebraic multigrid performance: review and analysis, Numer. Linear Algebra Appl. (2014), 194–220.
15. R. Nabben and C. Vuik, A comparison of deflation and coarse grid correction applied to porous media flow, SIAM J. Numer. Anal. (2004), 1631–1647.
16. Y. Notay, Algebraic theory of two-grid methods, Numer. Math. Theor. Meth. Appl. (2015), 168–198.
17. U. Trottenberg, C. W. Oosterlee, and A. Schüller, Multigrid, Academic Press, New York, 2001.
18. P. S. Vassilevski, On two ways of stabilizing the hierarchical basis multilevel methods, SIAM Rev. (1997), 18–53.
19. P. S. Vassilevski, Multilevel Block Factorization Preconditioners: Matrix-based Analysis and Algorithms for Solving Finite Element Equations, Springer-Verlag, New York, 2008.
20. J. Xu and L. T. Zikatanov, Algebraic multigrid methods, Acta Numer. (2017), 591–721.
21. X. Xu and C.-S. Zhang, On the ideal interpolation operator in algebraic multigrid methods, SIAM J. Numer. Anal. (2018), 1693–1710.
22. X. Xu and C.-S. Zhang, Convergence analysis of inexact two-grid methods: A theoretical framework, submitted.
23. H. Yserentant, On the multilevel splitting of finite element spaces, Numer. Math. (1986), 379–412.
24. L. T. Zikatanov, Two-sided bounds on the convergence rate of two-level methods, Numer. Linear Algebra Appl. (2008), 439–454.

Department of Mathematics, Purdue University, West Lafayette, IN 47907, USA
E-mail address: