A new convergence analysis of two-level hierarchical basis methods
XUEFENG XU
Abstract.
This paper is concerned with the convergence analysis of two-level hierarchical basis (TLHB) methods in a general setting, where the decomposition associated with two hierarchical component spaces is not required to be a direct sum. The TLHB scheme can be regarded as a combination of compatible relaxation and coarse-grid correction. Most of the previous works focus on the case of exact coarse solver, and the existing identity for the convergence factor of exact TLHB methods involves a tricky max-min problem. In this work, we present a new and purely algebraic analysis of TLHB methods, which gives a succinct identity for the convergence factor of exact TLHB methods. The new identity can be conveniently utilized to derive an optimal interpolation and analyze the influence of coarse space on the convergence factor. Moreover, we establish two-sided bounds for the convergence factor of TLHB methods with inexact coarse solver, which extend the existing TLHB theory.

1. Introduction
Multigrid is a typical multilevel iterative scheme, which has been proved to be a powerful solver (with linear or near-linear computational complexity) for a large class of linear systems that arise from discretized partial differential equations; see, e.g., [12, 8, 17, 19]. The fundamental module of multigrid is a two-grid scheme, which combines two complementary error-reduction processes: a smoothing (or relaxation) process and a coarse-grid correction process. The smoothing process is typically a simple iterative method such as the Jacobi and Gauss–Seidel iterations. Usually, it is efficient on high-frequency (i.e., oscillatory) error modes, while the low-frequency (i.e., smooth) part cannot be eliminated effectively. One way to capture the low-frequency error is to coarsen the underlying grid so that low-frequency modes on the initial fine grid appear high-frequency on a coarser grid. The low-frequency error will be further eliminated by a relaxation method on the coarse grid. The resulting correction can be interpolated back to the fine grid by an interpolation operator. Such a process is the so-called coarse-grid correction.

For a given initial guess $u_0 \in \mathbb{R}^n$, the smoothing iteration for solving $Au = f$ can be described as
(1.1) $u_{k+1} = u_k + M^{-1}(f - Au_k)$,
where $A \in \mathbb{R}^{n\times n}$ is symmetric positive definite (SPD) and $M \in \mathbb{R}^{n\times n}$ is a nonsingular smoother. In the classical two-grid analysis (see, e.g., [10, 11, 19]), $M$ is assumed to be $A$-convergent (i.e., $\|I - M^{-1}A\|_A < 1$), which is equivalent to the positive definiteness of $M + M^T - A$; see, e.g., [19, Proposition 3.8]. This assumption plays a crucial role in the theoretical analysis of two-grid methods.

Mathematics Subject Classification. Primary 65F08, 65F10, 65N55; Secondary 15A18.
Key words and phrases. Two-level methods, hierarchical basis, convergence analysis, inexact coarse solver.

As mentioned earlier, the global smoothing (1.1) has little effect on the low-frequency error in general. Alternatively, one can use a local smoother $M_s \in \mathbb{R}^{n_s\times n_s}$ ($n_s < n$) in the smoothing process, which is expected to focus on eliminating the high-frequency error modes. Compared with the global smoother $M$, one has more room to design the local smoother $M_s$, because its size is relatively small. Inspired by the compatible relaxation iterations in [10] (the idea of compatible relaxation originated with Brandt [6]), we perform the following smoothing iteration:
(1.2) $u_{k+1} = u_k + SM_s^{-1}S^T(f - Au_k)$,
where $S \in \mathbb{R}^{n\times n_s}$ is of full column rank and $M_s \in \mathbb{R}^{n_s\times n_s}$ is $(S^TAS)$-convergent (noting that $M_s$ is not restricted to $S^TMS$). Clearly, the iteration (1.2) will reduce to (1.1) if $n_s = n$ and $S = I_n$. Such a special case is not our focus here: two-grid theory has been well developed in the literature; see, e.g., [10, 11, 19, 24, 14, 16, 20, 21, 22]. Let $u_\ell \in \mathbb{R}^n$ be an approximation to the exact solution $u$ (e.g., $u_\ell$ is generated from (1.2)), and let $P \in \mathbb{R}^{n\times n_c}$ ($n_c < n$) be an interpolation matrix with full column rank. The (exact) coarse-grid correction can then be described as
(1.3) $u_{\ell+1} = u_\ell + P(P^TAP)^{-1}P^T(f - Au_\ell)$.
The two-level hierarchical basis (TLHB) scheme can be obtained by successively performing presmoothing, coarse-grid correction, and postsmoothing iterations (see Algorithm 1). Some pioneering works on TLHB methods can be found in [5, 2, 23, 4, 3, 18]. A basic assumption in the classical TLHB theory is that $(S \;\; P)$ is square and nonsingular (which entails that $n_c = n - n_s$). For example,
\[ S = \begin{pmatrix} I_{n_s} \\ 0 \end{pmatrix}, \qquad P = \begin{pmatrix} \ast \\ I_{n-n_s} \end{pmatrix}. \]
This assumption leads to the positive definiteness of the hierarchical basis matrix
(1.4) $\begin{pmatrix} S^T \\ P^T \end{pmatrix} A \begin{pmatrix} S & P \end{pmatrix} = \begin{pmatrix} S^TAS & S^TAP \\ P^TAS & P^TAP \end{pmatrix}$.
An important quantity involved in the analysis of multilevel methods is the so-called
Cauchy–Bunyakowski–Schwarz (C.B.S.) constant [9, 1, 11]. The C.B.S. constant associated with (1.4) is defined as
(1.5) $\gamma := \max\limits_{v_s \in \mathbb{R}^{n_s}\setminus\{0\},\, v_c \in \mathbb{R}^{n_c}\setminus\{0\}} \dfrac{v_s^TS^TAPv_c}{\sqrt{v_s^TS^TASv_s \cdot v_c^TP^TAPv_c}}$.
The positive definiteness of (1.4) implies that $\gamma \in [0,1)$; the constant $\gamma$ measures the abstract angle between $\mathrm{range}(S)$ and $\mathrm{range}(P)$.

Using a hierarchical expression for the inverse of the TLHB preconditioner (see (2.6)), one can easily verify that a necessary condition for TLHB convergence is
(1.6) $\mathrm{rank}\begin{pmatrix} S & P \end{pmatrix} = n$,
which is a foundation of TLHB analysis. In particular, in the case of exact coarse solver, the condition (1.6) is also sufficient for TLHB convergence. Obviously, (1.6) implies that $n_s + n_c \ge n$, i.e., $n_c \ge n - n_s$. If $S$ and $M_s$ are preselected, the classical setting $n_c = n - n_s$ is not the optimal one, at least from the perspective of convergence (see Theorem 3.7 and Remark 3.8). In addition, if $n_c > n - n_s$, then $\gamma$ happens to be $1$, which will trivialize some classical TLHB theories (see, e.g., [9, 1, 18, 11]). Under the condition (1.6), Falgout, Vassilevski, and Zikatanov [11, Theorem 4.1] established an identity for the convergence factor of exact TLHB methods. However, the identity (see also (2.15)) involves a tricky max-min problem: it is generally difficult to determine when the 'min' is attained, which limits the application of the identity.

In this paper, we derive a new and succinct identity (see (3.1)) for the convergence factor of exact TLHB methods under the condition (1.6). Our proof is not only novel but also much simpler than that in [11]. The new identity provides a straightforward approach to analyze the optimal interpolation (see Theorem 3.4) and the influence of $\mathrm{range}(P)$ on the convergence factor (see Theorem 3.7). In practice, the Galerkin coarse-grid system is often too costly to solve exactly, especially when its size is still large.
Instead, one can solve the system approximately as long as the convergence speed is satisfactory. Compared with the exact case, the convergence analysis of TLHB methods with inexact coarse solver is of more practical significance. Motivated by this observation, we establish two-sided bounds for the convergence factor of inexact TLHB methods, from which one can readily get the identity for the exact case.

The rest of this paper is organized as follows. In Section 2, we introduce some properties and convergence results of TLHB methods. In Section 3, we present a new identity for the convergence factor of exact TLHB methods, followed by some discussions on how the new identity can be used to analyze the optimal interpolation and the influence of $\mathrm{range}(P)$ on the convergence factor. In Section 4, we establish a systematic convergence theory for inexact TLHB methods. In Section 5, we give some concluding remarks.

2. Preliminaries
We start with some notation used in the subsequent discussions.
– $I_n$ denotes the $n \times n$ identity matrix (or $I$ when its size is clear from context).
– $\lambda_{\min}(\cdot)$, $\lambda_{\min}^{+}(\cdot)$, and $\lambda_{\max}(\cdot)$ stand for the smallest eigenvalue, the smallest positive eigenvalue, and the largest eigenvalue of a matrix, respectively.
– $\lambda_i(\cdot)$ denotes the $i$-th smallest eigenvalue of a matrix.
– $\lambda(\cdot)$ denotes the spectrum of a matrix.
– $\rho(\cdot)$ represents the spectral radius of a matrix.
– $\|\cdot\|_2$ denotes the spectral norm of a matrix.
– $\|\cdot\|_A$ denotes the energy norm induced by an SPD matrix $A \in \mathbb{R}^{n\times n}$: for any $v \in \mathbb{R}^n$, $\|v\|_A = \sqrt{v^TAv}$; for any $B \in \mathbb{R}^{n\times n}$, $\|B\|_A = \max_{v \in \mathbb{R}^n\setminus\{0\}} \frac{\|Bv\|_A}{\|v\|_A}$.

Our focus is on TLHB methods for solving the linear system
(2.1) $Au = f$,
where $A \in \mathbb{R}^{n\times n}$ is SPD, $u \in \mathbb{R}^n$, and $f \in \mathbb{R}^n$. Some basic assumptions involved in the analysis of TLHB methods are listed below.
• Let $S \in \mathbb{R}^{n\times n_s}$ and $P \in \mathbb{R}^{n\times n_c}$ be of full column rank, where $\max\{n_s, n_c\} < n \le n_s + n_c$.
• Assume that $(S \;\; P) \in \mathbb{R}^{n\times(n_s+n_c)}$ is of full row rank, or, equivalently, for any $v \in \mathbb{R}^n$, there exist $v_s \in \mathbb{R}^{n_s}$ and $v_c \in \mathbb{R}^{n_c}$ such that $v = Sv_s + Pv_c$.
• Let $M_s$ be an $n_s \times n_s$ nonsingular matrix such that $M_s + M_s^T - A_s$ is SPD, where $A_s := S^TAS$.
• Let $B_c \in \mathbb{R}^{n_c\times n_c}$ be an SPD approximation to $A_c$, where $A_c := P^TAP$ is the so-called Galerkin coarse-grid matrix.

With the above assumptions, the standard TLHB scheme for solving (2.1) can be described as Algorithm 1 ($u_0 \in \mathbb{R}^n$ is an initial guess). If $B_c = A_c$, then Algorithm 1 is called an exact TLHB method; otherwise, it is called an inexact TLHB method.
Algorithm 1 TLHB method
Presmoothing: $u \leftarrow u_0 + SM_s^{-1}S^T(f - Au_0)$
Restriction: $r_c \leftarrow P^T(f - Au)$
Coarse-grid correction: $e_c \leftarrow B_c^{-1}r_c$
Interpolation: $u \leftarrow u + Pe_c$
Postsmoothing: $u_{\rm TL} \leftarrow u + SM_s^{-T}S^T(f - Au)$

Remark 2.1. Due to $n_s < n$, it follows that
\[ \|I - SM_s^{-1}S^TA\|_A = 1, \]
which does not satisfy a conventional assumption in two-grid analysis, that is, the smoothing iteration is a contraction in the $A$-norm. Moreover, there is no nonsingular matrix $X \in \mathbb{R}^{n\times n}$ such that
\[ I - X^{-1}A = (I - SM_s^{-1}S^TA)(I - SM_s^{-T}S^TA). \]
Therefore, the classical two-grid theory is not applicable to Algorithm 1. Compared with the two-grid case, one has more room to design the local smoother $M_s$ instead of limiting it to simple types (e.g., the Jacobi or Gauss–Seidel type). For example, if $M_s$ is taken to be $A_s$, then
\[ I - SM_s^{-1}S^TA = I - SA_s^{-1}S^TA, \]
which is an $A$-orthogonal projection along (or parallel to) $\mathrm{range}(S)$ onto $\mathrm{null}(S^TA)$ and hence can remove the error components contained in $\mathrm{range}(S)$. If $\mathrm{range}(S)$ covers most of the high-frequency modes, then the smoothing iteration will eliminate the high-frequency error effectively.

From Algorithm 1, we have $u - u_{\rm TL} = \widetilde{E}_{\rm TL}(u - u_0)$ with
(2.2) $\widetilde{E}_{\rm TL} = (I - SM_s^{-T}S^TA)(I - PB_c^{-1}P^TA)(I - SM_s^{-1}S^TA)$,
which is called the iteration matrix (or error propagation matrix) of Algorithm 1. Define
(2.3) $\overline{M}_s := M_s(M_s + M_s^T - A_s)^{-1}M_s^T$.
Then, $\widetilde{E}_{\rm TL}$ can be expressed as
(2.4) $\widetilde{E}_{\rm TL} = I - \widetilde{B}_{\rm TL}^{-1}A$,
where
(2.5) $\widetilde{B}_{\rm TL}^{-1} = S\overline{M}_s^{-1}S^T + (I - SM_s^{-T}S^TA)PB_c^{-1}P^T(I - ASM_s^{-1}S^T)$.
Indeed, $\widetilde{B}_{\rm TL}^{-1}$ admits the following hierarchical expression:
(2.6) $\widetilde{B}_{\rm TL}^{-1} = \begin{pmatrix} S & P \end{pmatrix}\widehat{B}_{\rm TL}^{-1}\begin{pmatrix} S & P \end{pmatrix}^T$,
where
(2.7) $\widehat{B}_{\rm TL} = \begin{pmatrix} I & 0 \\ P^TASM_s^{-1} & I \end{pmatrix}\begin{pmatrix} \overline{M}_s & 0 \\ 0 & B_c \end{pmatrix}\begin{pmatrix} I & M_s^{-T}S^TAP \\ 0 & I \end{pmatrix}$.
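The identities (2.4)–(2.7) are easy to spot-check numerically. The following NumPy sketch is our own illustration (random data, a Gauss–Seidel-type local smoother $M_s = \mathrm{tril}(A_s)$, and an arbitrary SPD $B_c$ are our choices, not prescribed by the paper); it verifies that the iteration matrix (2.2) equals $I - \widetilde{B}_{\rm TL}^{-1}A$ with $\widetilde{B}_{\rm TL}^{-1}$ given by (2.5), and that the hierarchical factorization (2.6)–(2.7) holds:

```python
import numpy as np

rng = np.random.default_rng(0)
n, ns, nc = 12, 5, 8                      # note nc > n - ns: no direct sum required
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)               # SPD matrix
S = rng.standard_normal((n, ns))          # full column rank (almost surely)
P = rng.standard_normal((n, nc))
As = S.T @ A @ S
Ms = np.tril(As)                          # Gauss–Seidel-type smoother:
                                          # Ms + Ms.T - As = diag(As) is SPD
Bc = P.T @ A @ P + 0.1 * np.eye(nc)       # an inexact SPD coarse solver

I = np.eye(n)
Msi = np.linalg.inv(Ms)
MsiT = Msi.T
# (2.2): post-smoothing * coarse-grid correction * pre-smoothing
E = (I - S @ MsiT @ S.T @ A) @ (I - P @ np.linalg.inv(Bc) @ P.T @ A) \
    @ (I - S @ Msi @ S.T @ A)

# (2.3)/(2.5): E = I - Binv @ A
Mbar = Ms @ np.linalg.inv(Ms + Ms.T - As) @ Ms.T
Binv = S @ np.linalg.inv(Mbar) @ S.T \
     + (I - S @ MsiT @ S.T @ A) @ P @ np.linalg.inv(Bc) @ P.T @ (I - A @ S @ Msi @ S.T)
print(np.allclose(E, I - Binv @ A))       # True

# (2.6)-(2.7): Binv = (S P) Bhat^{-1} (S P)^T
SP = np.hstack([S, P])
Lf = np.block([[np.eye(ns), np.zeros((ns, nc))], [P.T @ A @ S @ Msi, np.eye(nc)]])
Df = np.block([[Mbar, np.zeros((ns, nc))], [np.zeros((nc, ns)), Bc]])
Uf = np.block([[np.eye(ns), MsiT @ S.T @ A @ P], [np.zeros((nc, ns)), np.eye(nc)]])
Bhat = Lf @ Df @ Uf
print(np.allclose(Binv, SP @ np.linalg.inv(Bhat) @ SP.T))   # True
```

Observe that here the columns of $(S \;\; P)$ are linearly dependent ($n_s + n_c > n$); the factorization is nevertheless valid, which is exactly the general setting of this paper.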
The matrix $\widetilde{B}_{\rm TL}$ is referred to as the TLHB preconditioner, whose positive definiteness follows from the positive definiteness of $\widehat{B}_{\rm TL}$ and $\mathrm{rank}(S \;\; P) = n$. According to (2.2) and (2.4), we deduce that
(2.8) $\|\widetilde{E}_{\rm TL}\|_A = \rho(\widetilde{E}_{\rm TL}) = \max\big\{\lambda_{\max}\big(\widetilde{B}_{\rm TL}^{-1}A\big) - 1,\; 1 - \lambda_{\min}\big(\widetilde{B}_{\rm TL}^{-1}A\big)\big\}$.
In particular, if $B_c = A_c$, then the iteration matrix is denoted by $E_{\rm TL}$, and
(2.9) $E_{\rm TL} = (I - SM_s^{-T}S^TA)(I - \Pi_A)(I - SM_s^{-1}S^TA)$,
where
(2.10) $\Pi_A := PA_c^{-1}P^TA$.
Similarly, we have
(2.11) $E_{\rm TL} = I - B_{\rm TL}^{-1}A$,
where
(2.12) $B_{\rm TL}^{-1} = S\overline{M}_s^{-1}S^T + (I - SM_s^{-T}S^TA)PA_c^{-1}P^T(I - ASM_s^{-1}S^T)$.
Note that $\Pi_A$ is an $A$-orthogonal projection. We then have
(2.13) $\|E_{\rm TL}\|_A = \lambda_{\max}(E_{\rm TL}) = 1 - \lambda_{\min}\big(B_{\rm TL}^{-1}A\big)$.
Based on the so-called saddle-point lemma [11, Lemma 3.1], Falgout, Vassilevski, and Zikatanov [11, Theorem 4.1] derived an identity for $\|E_{\rm TL}\|_A$, as described in the following theorem.

Theorem 2.2.
Define
(2.14) $\widetilde{M}_s := M_s^T(M_s + M_s^T - A_s)^{-1}M_s$.
The convergence factor of Algorithm 1 with $B_c = A_c$ can be characterized as
(2.15) $\|E_{\rm TL}\|_A = 1 - \dfrac{1}{K_{\rm TL}}$,
where
(2.16) $K_{\rm TL} = \max\limits_{v \in \mathrm{range}(I-\Pi_A)} \; \min\limits_{v_s:\, v = (I-\Pi_A)Sv_s} \dfrac{v_s^T\widetilde{M}_sv_s}{v^TAv}$.

The identity (2.15) is valid as long as $\mathrm{rank}(S \;\; P) = n$. In particular, if $(S \;\; P)$ is square and nonsingular, Falgout, Vassilevski, and Zikatanov [11, Corollary 4.1 and Theorem 4.2] further proved the following results.
Theorem 2.3. If $(S \;\; P)$ is an $n \times n$ nonsingular matrix, then
(2.17) $K_{\rm TL} = \max\limits_{v_s \in \mathbb{R}^{n_s}\setminus\{0\}} \dfrac{v_s^T\widetilde{M}_sv_s}{v_s^TS^TA(I-\Pi_A)Sv_s}$
and
(2.18) $K_{\rm TL} \le \dfrac{\lambda_{\max}(A_s^{-1}\widetilde{M}_s)}{1 - \gamma^2}$,
where $\gamma$ is defined by (1.5). In the case of inexact coarse solver, if $\lambda(B_c^{-1}A_c) \subset \big[\frac{1}{1+\delta},\, 1\big]$ with $\delta > 0$, then
(2.19) $\|\widetilde{E}_{\rm TL}\|_A \le 1 - \dfrac{1 - \gamma^2}{K_{\rm TL}(1 + \delta)}$.

3. Convergence analysis of exact TLHB methods
In this section, we present a new convergence analysis of Algorithm 1 with exact coarse solver (under the condition $\mathrm{rank}(S \;\; P) = n$), which gives a succinct identity for the convergence factor $\|E_{\rm TL}\|_A$. The new identity can be conveniently used to analyze the optimal interpolation and the influence of $\mathrm{range}(P)$ on $\|E_{\rm TL}\|_A$.

Observe that the identity (2.15) involves a tricky max-min problem. In general, it is difficult to determine when the 'min' is attained, which limits the application of (2.15). Furthermore, the proof of (2.15) provided in [11] is not very direct. The following theorem gives a new and succinct identity for $\|E_{\rm TL}\|_A$, whose proof is straightforward.

Theorem 3.1.
The convergence factor of Algorithm 1 with $B_c = A_c$ can be characterized as
(3.1) $\|E_{\rm TL}\|_A = 1 - \sigma_{\rm TL}$,
where
(3.2) $\sigma_{\rm TL} = \lambda_{\min}^{+}\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A)\big) = \lambda_{\min}^{+}\big(\widetilde{M}_s^{-1}S^TA(I - \Pi_A)S\big)$.

Proof.
By (2.9) and (2.11), we have
\[ B_{\rm TL}^{-1}A = I - (I - SM_s^{-T}S^TA)(I - \Pi_A)(I - SM_s^{-1}S^TA). \]
Then
\[ \lambda\big(B_{\rm TL}^{-1}A\big) = \lambda\big(I - (I - SM_s^{-1}S^TA)(I - SM_s^{-T}S^TA)(I - \Pi_A)\big) = \lambda\big(I - (I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = \lambda\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A) + \Pi_A\big). \]
Since $\Pi_A^2 = \Pi_A$ and $\mathrm{rank}(\Pi_A) = n_c$, there exists a nonsingular matrix $Y \in \mathbb{R}^{n\times n}$ such that
(3.3) $\Pi_A = Y^{-1}\begin{pmatrix} I_{n_c} & 0 \\ 0 & 0 \end{pmatrix}Y$.
Let
(3.4) $S\widetilde{M}_s^{-1}S^TA = Y^{-1}\begin{pmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22} \end{pmatrix}Y$,
where $Z_{ij} \in \mathbb{R}^{n_i\times n_j}$ with $n_1 = n_c$ and $n_2 = n - n_c$. Direct computations yield
(3.5) $S\widetilde{M}_s^{-1}S^TA(I - \Pi_A) = Y^{-1}\begin{pmatrix} 0 & Z_{12} \\ 0 & Z_{22} \end{pmatrix}Y$,
(3.6) $S\widetilde{M}_s^{-1}S^TA(I - \Pi_A) + \Pi_A = Y^{-1}\begin{pmatrix} I_{n_c} & Z_{12} \\ 0 & Z_{22} \end{pmatrix}Y$.
Note that $A^{\frac12}E_{\rm TL}A^{-\frac12}$ is symmetric positive semidefinite (SPSD) and $B_{\rm TL}$ is SPD. We then have
\[ \lambda\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A) + \Pi_A\big) = \lambda\big(B_{\rm TL}^{-1}A\big) \subset (0, 1], \]
which, together with (3.6), leads to $\lambda(Z_{22}) \subset (0, 1]$. Hence,
\[ \lambda_{\min}\big(B_{\rm TL}^{-1}A\big) = \lambda_{\min}\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A) + \Pi_A\big) = \lambda_{\min}(Z_{22}) = \lambda_{\min}^{+}\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A)\big), \]
where we have used the expressions (3.5) and (3.6). The identity (3.1) then follows immediately from (2.13). $\square$

Remark 3.2. According to the proof of Theorem 3.1, we deduce that
(3.7) $\lambda\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A)\big) = \{\underbrace{0, \ldots, 0}_{n_c},\, \nu_1, \ldots, \nu_{n-n_c}\}$,
where $0 < \nu_i \le 1$, $i = 1, \ldots, n - n_c$. Then
(3.8) $\lambda\big(\widetilde{M}_s^{-1}S^TA(I - \Pi_A)S\big) = \{\underbrace{0, \ldots, 0}_{n_s+n_c-n},\, \nu_1, \ldots, \nu_{n-n_c}\}$.
In particular, if $(S \;\; P)$ is an $n \times n$ nonsingular matrix, then
\[ \sigma_{\rm TL}^{-1} = \lambda_{\min}^{-1}\big(\widetilde{M}_s^{-1}S^TA(I - \Pi_A)S\big) = \bigg(\min_{v_s \in \mathbb{R}^{n_s}\setminus\{0\}} \frac{v_s^TS^TA(I - \Pi_A)Sv_s}{v_s^T\widetilde{M}_sv_s}\bigg)^{-1} = \max_{v_s \in \mathbb{R}^{n_s}\setminus\{0\}} \frac{v_s^T\widetilde{M}_sv_s}{v_s^TS^TA(I - \Pi_A)Sv_s}, \]
which gives the expression (2.17). If $M_s$ is further taken to be $A_s$, then
\[ \|E_{\rm TL}\|_A = 1 - \lambda_{\min}\big(A_s^{-1}S^TA(I - \Pi_A)S\big) = 1 - \lambda_{\min}\big(I - A_s^{-1}S^TAPA_c^{-1}P^TAS\big) = \lambda_{\max}\big(A_s^{-1}S^TAPA_c^{-1}P^TAS\big) = \big\|A_s^{-\frac12}S^TAPA_c^{-\frac12}\big\|_2^2 = \gamma^2, \]
where $\gamma$ is defined by (1.5).

The proof of Theorem 3.1 also yields a characterization for the spectrum of $E_{\rm TL}$, as described in the following corollary.
Corollary 3.3.
The spectrum of $E_{\rm TL}$ is given by
(3.9) $\lambda(E_{\rm TL}) = \{\underbrace{0, \ldots, 0}_{n_c},\, 1 - \nu_1, \ldots, 1 - \nu_{n-n_c}\}$,
where $\{\nu_i\}_{i=1}^{n-n_c}$ are the positive eigenvalues of $\widetilde{M}_s^{-1}S^TA(I - \Pi_A)S$.

Compared with (2.15), the identity (3.1) is more convenient for TLHB analysis. Of particular interest is an interpolation $P$ that minimizes the convergence factor $\|E_{\rm TL}\|_A$, provided that $S$ and $M_s$ are preselected. Using (3.1), we can derive the following optimal interpolation theory.

Theorem 3.4.
Let $\{(\mu_i, v_i)\}_{i=1}^{n}$ be the eigenpairs of $S\widetilde{M}_s^{-1}S^TA$, where $\mu_1 \le \mu_2 \le \cdots \le \mu_n$ and
\[ v_i^TAv_j = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \ne j. \end{cases} \]
Then
(3.10) $\|E_{\rm TL}\|_A \ge 1 - \mu_{n_c+1}$,
and the equality holds if $\mathrm{range}(P) = \mathrm{span}\{v_1, \ldots, v_{n_c}\}$.

Proof. Due to the facts that $\widetilde{M}_s - A_s$ is SPSD and that $S\widetilde{M}_s^{-1}S^TA$ has the same nonzero eigenvalues as $\widetilde{M}_s^{-1}A_s$, it follows that
\[ 0 = \mu_1 = \cdots = \mu_{n-n_s} < \mu_{n-n_s+1} \le \cdots \le \mu_n \le 1. \]
Let $V = (v_1, \ldots, v_n)$ and $U_1 = V^{-1}P(P^TV^{-T}V^{-1}P)^{-\frac12}$. It is easy to check that $V^TAV = I$ and that $U_1$ is an $n \times n_c$ matrix with orthonormal columns (i.e., $U_1^TU_1 = I_{n_c}$). Let $U_2$ be an $n \times (n - n_c)$ matrix such that $(U_1 \;\; U_2)$ is orthogonal. Then
\[ S\widetilde{M}_s^{-1}S^TA(I - \Pi_A) = S\widetilde{M}_s^{-1}S^TA(I - PA_c^{-1}P^TA) = S\widetilde{M}_s^{-1}S^TA(I - VU_1U_1^TV^TA) = S\widetilde{M}_s^{-1}S^TA(I - VU_1U_1^TV^{-1}) = S\widetilde{M}_s^{-1}S^TAV(I - U_1U_1^T)V^{-1} = V\Sigma U_2U_2^TV^{-1}, \]
where $\Sigma = \mathrm{diag}(0, \ldots, 0, \mu_{n-n_s+1}, \ldots, \mu_n)$. According to (3.7) and
\[ \Sigma U_2U_2^T = V^{-1}S\widetilde{M}_s^{-1}S^TA(I - \Pi_A)V, \]
we deduce that $\Sigma U_2U_2^T$ has $n - n_c$ positive eigenvalues. Since $(U_1 \;\; U_2)^T = (U_1 \;\; U_2)^{-1}$ and
\[ \begin{pmatrix} U_1^T \\ U_2^T \end{pmatrix}\Sigma U_2U_2^T\begin{pmatrix} U_1 & U_2 \end{pmatrix} = \begin{pmatrix} 0 & U_1^T\Sigma U_2 \\ 0 & U_2^T\Sigma U_2 \end{pmatrix}, \]
it follows that $U_2^T\Sigma U_2$ is positive definite. Hence,
\[ \sigma_{\rm TL} = \lambda_{\min}^{+}\big(S\widetilde{M}_s^{-1}S^TA(I - \Pi_A)\big) = \lambda_{\min}^{+}(\Sigma U_2U_2^T) = \lambda_{\min}(U_2^T\Sigma U_2). \]
Using the Poincaré separation theorem (see, e.g., [13, Corollary 4.3.37]), we obtain
\[ \lambda_{\min}(U_2^T\Sigma U_2) = \lambda_1(U_2^T\Sigma U_2) \le \lambda_{n_c+1}(\Sigma) = \mu_{n_c+1}. \]
Consequently,
\[ \|E_{\rm TL}\|_A = 1 - \lambda_{\min}(U_2^T\Sigma U_2) \ge 1 - \mu_{n_c+1}. \]
In particular, if $\mathrm{range}(P) = \mathrm{span}\{v_1, \ldots, v_{n_c}\}$, then there exists a nonsingular matrix $P_0 \in \mathbb{R}^{n_c\times n_c}$ such that
\[ P = V\begin{pmatrix} P_0 \\ 0 \end{pmatrix}. \]
In this case,
\[ U_1 = \begin{pmatrix} P_0 \\ 0 \end{pmatrix}(P_0^TP_0)^{-\frac12}. \]
Then
\[ U_2U_2^T = I - U_1U_1^T = I - \begin{pmatrix} P_0 \\ 0 \end{pmatrix}(P_0^TP_0)^{-1}\begin{pmatrix} P_0^T & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & I_{n-n_c} \end{pmatrix}. \]
Hence,
\[ \|E_{\rm TL}\|_A = 1 - \lambda_{\min}^{+}(\Sigma U_2U_2^T) = 1 - \mu_{n_c+1}. \]
This completes the proof. $\square$
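The identity (3.1) and the optimality statement of Theorem 3.4 can be illustrated on a small random example. In the sketch below (our own illustration; the random SPD $A$ and the Gauss–Seidel-type choice $M_s = \mathrm{tril}(A_s)$ are assumptions, not from the paper), the $A$-orthonormal eigenpairs of $S\widetilde{M}_s^{-1}S^TA$ are obtained from the symmetric matrix $A^{1/2}S\widetilde{M}_s^{-1}S^TA^{1/2}$; the coarse space $\mathrm{span}\{v_1,\ldots,v_{n_c}\}$ attains the bound (3.10), while a random $P$ does not do better:

```python
import numpy as np

rng = np.random.default_rng(2)
n, ns, nc = 12, 5, 8
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)                      # SPD
S = rng.standard_normal((n, ns))
As = S.T @ A @ S
Ms = np.tril(As)                                 # Ms + Ms.T - As = diag(As) is SPD
Mt = Ms.T @ np.linalg.inv(Ms + Ms.T - As) @ Ms   # symmetrized smoother (2.14)

w, V = np.linalg.eigh(A)
Ah = V @ np.diag(np.sqrt(w)) @ V.T               # A^{1/2}
Ahi = np.linalg.inv(Ah)

def normE_A(P):
    """Energy norm of the exact TLHB iteration matrix (2.9)."""
    I = np.eye(n)
    Pi = P @ np.linalg.inv(P.T @ A @ P) @ P.T @ A
    E = (I - S @ np.linalg.inv(Ms).T @ S.T @ A) @ (I - Pi) \
        @ (I - S @ np.linalg.inv(Ms) @ S.T @ A)
    return np.linalg.norm(Ah @ E @ Ahi, 2)

def sigma_TL(P):
    """Smallest positive eigenvalue in (3.2)."""
    Pi = P @ np.linalg.inv(P.T @ A @ P) @ P.T @ A
    ev = np.sort(np.linalg.eigvals(np.linalg.inv(Mt) @ S.T @ A @ (np.eye(n) - Pi) @ S).real)
    return ev[ev > 1e-8].min()

P = rng.standard_normal((n, nc))
print(abs(normE_A(P) - (1 - sigma_TL(P))))       # ~ 0: the identity (3.1)

# A-orthonormal eigenpairs of S Mt^{-1} S^T A, ascending mu_1 <= ... <= mu_n
mu, W = np.linalg.eigh(Ah @ S @ np.linalg.inv(Mt) @ S.T @ Ah)
P_opt = (Ahi @ W)[:, :nc]                        # span{v_1, ..., v_nc}
print(abs(normE_A(P_opt) - (1 - mu[nc])))        # ~ 0: equality in (3.10)
print(normE_A(P) >= normE_A(P_opt) - 1e-9)       # True: optimality of P_opt
```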
Remark 3.5. As mentioned in [11, Page 483], $A_s$ happens to be well-conditioned in the classical TLHB methods, so it is not that impractical to take $M_s = A_s$. In such a case, the eigenvalues of $S\widetilde{M}_s^{-1}S^TA$ are
\[ 0 = \mu_1 = \cdots = \mu_{n-n_s} < \mu_{n-n_s+1} = \cdots = \mu_n = 1. \]
Then, $\mu_{n_c+1} = 1$ (since $n_c \ge n - n_s$), which gives the optimal convergence factor $0$.

Remark 3.6. Unlike the optimal interpolation theory for two-grid methods [20, 7], $S\widetilde{M}_s^{-1}S^T$ here is a singular matrix. That is, $(S\widetilde{M}_s^{-1}S^T)^{-1}$ is not well-defined and hence cannot induce an inner product in $\mathbb{R}^n$.

Besides the optimal interpolation analysis, the identity (3.1) is also convenient for analyzing the influence of $\mathrm{range}(P)$ on $\|E_{\rm TL}\|_A$.

Theorem 3.7.
Let $\widehat{P} \in \mathbb{R}^{n\times\widehat{n}_c}$ be of full column rank (with $n_c \le \widehat{n}_c < n$), and let $\mathrm{rank}(S \;\; \widehat{P}) = n$. If $\mathrm{range}(P) \subseteq \mathrm{range}(\widehat{P})$, then
(3.11) $\sigma_{\rm TL} \le \widehat{\sigma}_{\rm TL}$,
where $\sigma_{\rm TL}$ is given by (3.2) and
(3.12) $\widehat{\sigma}_{\rm TL} = \lambda_{\min}^{+}\big(S\widetilde{M}_s^{-1}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)\big)$.

Proof.
Since $\mathrm{range}(P) \subseteq \mathrm{range}(\widehat{P})$, there exists an $\widehat{n}_c \times n_c$ matrix $W$ such that
\[ P = \widehat{P}W. \]
Note that $W \in \mathbb{R}^{\widehat{n}_c\times n_c}$ is of full column rank. One can find an $\widehat{n}_c \times \widehat{n}_c$ nonsingular matrix $\widehat{W}$ such that
\[ W = \widehat{W}\begin{pmatrix} I_{n_c} \\ 0 \end{pmatrix}, \]
which yields
\[ P = (\widehat{P}\widehat{W})\begin{pmatrix} I_{n_c} \\ 0 \end{pmatrix}. \]
Hence, $\widehat{P}\widehat{W} = (P \;\; C)$ for some $C \in \mathbb{R}^{n\times(\widehat{n}_c-n_c)}$. Since the projection $\widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA$ is invariant under nonsingular column transformations of $\widehat{P}$, we have from (3.12) that
(3.13) $\widehat{\sigma}_{\rm TL} = \lambda_{\min}^{+}\big(S\widetilde{M}_s^{-1}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)\big)$,
where $\widehat{P} = (P \;\; C)$. In light of (3.2), (3.8), and (3.13), we have
\[ \sigma_{\rm TL} = \lambda_{\min}^{+}\big(\widetilde{M}_s^{-\frac12}S^TA(I - \Pi_A)S\widetilde{M}_s^{-\frac12}\big) = \lambda_{n_s+n_c-n+1}\big(\widetilde{M}_s^{-\frac12}S^TA(I - \Pi_A)S\widetilde{M}_s^{-\frac12}\big), \]
\[ \widehat{\sigma}_{\rm TL} = \lambda_{\min}^{+}\big(\widetilde{M}_s^{-\frac12}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)S\widetilde{M}_s^{-\frac12}\big) = \lambda_{n_s+\widehat{n}_c-n+1}\big(\widetilde{M}_s^{-\frac12}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)S\widetilde{M}_s^{-\frac12}\big). \]
Let
\[ D = \widetilde{M}_s^{-\frac12}S^TA(I - \Pi_A)S\widetilde{M}_s^{-\frac12} - \widetilde{M}_s^{-\frac12}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)S\widetilde{M}_s^{-\frac12}. \]
Then
\[ D = \widetilde{M}_s^{-\frac12}S^TA\big[\widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA - P(P^TAP)^{-1}P^TA\big]S\widetilde{M}_s^{-\frac12} = \widetilde{M}_s^{-\frac12}S^TA\bigg[\widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA - \widehat{P}\begin{pmatrix} (P^TAP)^{-1} & 0 \\ 0 & 0 \end{pmatrix}\widehat{P}^TA\bigg]S\widetilde{M}_s^{-\frac12} = \widetilde{M}_s^{-\frac12}S^TA\widehat{P}\bigg[(\widehat{P}^TA\widehat{P})^{-1} - \begin{pmatrix} (P^TAP)^{-1} & 0 \\ 0 & 0 \end{pmatrix}\bigg]\widehat{P}^TAS\widetilde{M}_s^{-\frac12}. \]
It is easy to verify that
\[ (\widehat{P}^TA\widehat{P})^{-1} - \begin{pmatrix} (P^TAP)^{-1} & 0 \\ 0 & 0 \end{pmatrix} \]
is an SPSD matrix of rank $\widehat{n}_c - n_c$ [15, Lemma 2.7]. Accordingly, $D$ is an SPSD matrix of rank at most $\widehat{n}_c - n_c$. Using [13, Corollary 4.3.5], we obtain
\[ \sigma_{\rm TL} = \lambda_{n_s+n_c-n+1}\big(\widetilde{M}_s^{-\frac12}S^TA(I - \Pi_A)S\widetilde{M}_s^{-\frac12}\big) = \lambda_{n_s+n_c-n+1}\big(\widetilde{M}_s^{-\frac12}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)S\widetilde{M}_s^{-\frac12} + D\big) \le \lambda_{n_s+n_c-n+1+\mathrm{rank}(D)}\big(\widetilde{M}_s^{-\frac12}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)S\widetilde{M}_s^{-\frac12}\big) \le \lambda_{n_s+\widehat{n}_c-n+1}\big(\widetilde{M}_s^{-\frac12}S^TA\big(I - \widehat{P}(\widehat{P}^TA\widehat{P})^{-1}\widehat{P}^TA\big)S\widetilde{M}_s^{-\frac12}\big) = \widehat{\sigma}_{\rm TL}. \]
This completes the proof. $\square$
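The monotonicity (3.11) can likewise be observed numerically: appending extra columns to $P$ enlarges the coarse space and can only increase $\sigma_{\rm TL}$. A minimal sketch (the random data and the smoother choice $M_s = \mathrm{tril}(A_s)$ are our own assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n, ns, nc = 12, 5, 8
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)
S = rng.standard_normal((n, ns))
As = S.T @ A @ S
Ms = np.tril(As)                                 # Ms + Ms.T - As = diag(As) is SPD
Mt = Ms.T @ np.linalg.inv(Ms + Ms.T - As) @ Ms

def sigma(P):
    # smallest positive eigenvalue of S Mt^{-1} S^T A (I - Pi_A), cf. (3.2)/(3.12)
    Pi = P @ np.linalg.inv(P.T @ A @ P) @ P.T @ A
    ev = np.sort(np.linalg.eigvals(S @ np.linalg.inv(Mt) @ S.T @ A @ (np.eye(n) - Pi)).real)
    return ev[ev > 1e-8].min()

P = rng.standard_normal((n, nc))
Phat = np.hstack([P, rng.standard_normal((n, 2))])  # range(P) ⊆ range(Phat), nc_hat = nc + 2 < n
print(sigma(P) <= sigma(Phat) + 1e-10)              # True: the inequality (3.11)
```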
Remark 3.8. According to (3.1) and (3.11), we deduce that $\|E_{\rm TL}\|_A$ decreases when increasing the number of columns in $P$ (i.e., $n_c$). In other words, $n_c$ cannot be very small in order to achieve a satisfactory convergence.

4. Convergence analysis of inexact TLHB methods
In practice, the Galerkin coarse-grid system is often too costly to solve exactly. Without essential loss of convergence speed, it is advisable to solve the problem approximately (one way is to apply Algorithm 1 recursively). In this section, we establish a new convergence theory for Algorithm 1 with inexact coarse solver under the condition $\mathrm{rank}(S \;\; P) = n$.
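Before turning to the theory, the effect of an inexact coarse solver is easy to observe numerically. In the sketch below (our own illustration; choosing $B_c = A_c + tRR^T \succeq A_c$ guarantees $\lambda(B_c^{-1}A_c) \subset (0, 1]$), the convergence factor $\|\widetilde{E}_{\rm TL}\|_A$ of (2.2) stays below $1$ but grows as the coarse solve becomes cruder:

```python
import numpy as np
from numpy.linalg import inv, eigh, norm

rng = np.random.default_rng(4)
n, ns, nc = 12, 5, 8
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)
S, P = rng.standard_normal((n, ns)), rng.standard_normal((n, nc))
As, Ac = S.T @ A @ S, P.T @ A @ P
Ms = np.tril(As)                        # Ms + Ms.T - As = diag(As) is SPD
I = np.eye(n)
w, V = eigh(A)
Ah = V @ np.diag(np.sqrt(w)) @ V.T      # A^{1/2}

def conv_factor(Bc):
    """||E~_TL||_A for the inexact coarse solver Bc, cf. (2.2)."""
    E = (I - S @ inv(Ms).T @ S.T @ A) @ (I - P @ inv(Bc) @ P.T @ A) \
        @ (I - S @ inv(Ms) @ S.T @ A)
    return norm(Ah @ E @ inv(Ah), 2)

R = rng.standard_normal((nc, nc))
fs = []
for t in [0.0, 0.1, 1.0, 10.0]:
    fs.append(conv_factor(Ac + t * R @ R.T))
    print(t, fs[-1])                    # factors are nondecreasing in t, all below 1
```

Here $t = 0$ is the exact TLHB method; the monotone growth reflects the ordering $B_c^{(1)} \preceq B_c^{(2)} \Rightarrow \|\widetilde{E}_{\rm TL}^{(1)}\|_A \le \|\widetilde{E}_{\rm TL}^{(2)}\|_A$, valid for this family with $B_c \succeq A_c$.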
We remark that the estimate (2.19) is only applicable for $n_c = n - n_s$. In fact, if $n_c > n - n_s$, then
\[ \begin{pmatrix} S^T \\ P^T \end{pmatrix}A\begin{pmatrix} S & P \end{pmatrix} = \begin{pmatrix} A_s & S^TAP \\ P^TAS & A_c \end{pmatrix} \]
is a singular SPSD matrix (its rank is $n < n_s + n_c$). Hence, the Schur complement $A_s - S^TAPA_c^{-1}P^TAS$ is SPSD and singular, which leads to the positive semidefiniteness and singularity of $I - A_s^{-\frac12}S^TAPA_c^{-1}P^TASA_s^{-\frac12}$. Then
\[ \gamma^2 = \lambda_{\max}\big(A_s^{-\frac12}S^TAPA_c^{-1}P^TASA_s^{-\frac12}\big) = 1. \]
Obviously, the upper bound in (2.19) is always $1$, no matter how close $B_c$ is to $A_c$. That is, the estimate (2.19) will be trivial if $n_c > n - n_s$.

In what follows, we establish two-sided bounds for the convergence factor $\|\widetilde{E}_{\rm TL}\|_A$ under the general condition $n_c \ge n - n_s$. The following identities on the extreme eigenvalues of $(I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)$ and $(I - S\widetilde{M}_s^{-1}S^TA)\Pi_A$ will be frequently used in the subsequent analysis.

Lemma 4.1.
The following eigenvalue identities hold:
(4.1a) $\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = 0$,
(4.1b) $\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = 1 - \sigma_{\rm TL}$,
(4.1c) $\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 0$,
(4.1d) $\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = \begin{cases} 1 - \lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A), & \text{if } r = n_c, \\ 1, & \text{if } r < n_c, \end{cases}$
where $r = \mathrm{rank}(S^TAP)$.

Proof. The positive semidefiniteness of $\widetilde{M}_s - A_s$ implies that $I - A^{\frac12}S\widetilde{M}_s^{-1}S^TA^{\frac12}$ is SPSD, which yields the positive semidefiniteness of $A^{-1} - S\widetilde{M}_s^{-1}S^T$. Then
\[ \lambda\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = \lambda\big((A^{-1} - S\widetilde{M}_s^{-1}S^T)A(I - \Pi_A)\big) = \lambda\big((A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A(I - \Pi_A)(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}\big) \subset [0, +\infty). \]
Similarly,
\[ \lambda\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) \subset [0, +\infty). \]
The identities (4.1a) and (4.1c) then follow immediately from the facts
\[ \det\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = 0 \quad \text{and} \quad \det\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 0. \]
Here, $\det(\cdot)$ denotes the determinant of a matrix. According to the proof of Theorem 3.1, it holds that
\[ \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = 1 - \lambda_{\min}\big(B_{\rm TL}^{-1}A\big) = 1 - \sigma_{\rm TL}, \]
which is exactly the identity (4.1b). By (3.3) and (3.4), we have
\[ S\widetilde{M}_s^{-1}S^TA\Pi_A = Y^{-1}\begin{pmatrix} Z_{11} & 0 \\ Z_{21} & 0 \end{pmatrix}Y, \qquad (I - S\widetilde{M}_s^{-1}S^TA)\Pi_A = Y^{-1}\begin{pmatrix} I_{n_c} - Z_{11} & 0 \\ -Z_{21} & 0 \end{pmatrix}Y. \]
Then
\[ \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \lambda_{\min}(Z_{11}). \]
• If $r = n_c$, then $\mathrm{rank}\big(\widetilde{M}_s^{-\frac12}S^TAPA_c^{-\frac12}\big) = n_c$, which leads to the positive definiteness of $A_c^{-\frac12}P^TAS\widetilde{M}_s^{-1}S^TAPA_c^{-\frac12}$. This implies that $S\widetilde{M}_s^{-1}S^TA\Pi_A$ has $n_c$ positive eigenvalues. Consequently, $\lambda_{\min}(Z_{11}) = \lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A)$.
• If $r < n_c$, we deduce from the above argument that $\lambda_{\min}(Z_{11}) = 0$.
Thus, the identity (4.1d) is proved. $\square$

To analyze the convergence of Algorithm 1, we need an important tool for eigenvalue analysis, i.e., the well-known
Weyl’s theorem (see, e.g., [13, Theorem 4.3.1]).
Lemma 4.2.
Let $H_1, H_2 \in \mathbb{C}^{n\times n}$ be Hermitian. Assume that the spectra of $H_1$, $H_2$, and $H_1 + H_2$ are $\{\lambda_i(H_1)\}_{i=1}^n$, $\{\lambda_i(H_2)\}_{i=1}^n$, and $\{\lambda_i(H_1 + H_2)\}_{i=1}^n$, respectively. Then, for each $k = 1, \ldots, n$, it holds that
\[ \lambda_{k-j+1}(H_1) + \lambda_j(H_2) \le \lambda_k(H_1 + H_2) \le \lambda_{k+\ell}(H_1) + \lambda_{n-\ell}(H_2) \]
for all $j = 1, \ldots, k$ and $\ell = 0, \ldots, n - k$. In particular, one has
(4.2a) $\lambda_{\min}(H_1 + H_2) \ge \lambda_{\min}(H_1) + \lambda_{\min}(H_2)$,
(4.2b) $\lambda_{\min}(H_1 + H_2) \le \min\{\lambda_{\min}(H_1) + \lambda_{\max}(H_2),\; \lambda_{\max}(H_1) + \lambda_{\min}(H_2)\}$,
(4.2c) $\lambda_{\max}(H_1 + H_2) \ge \max\{\lambda_{\max}(H_1) + \lambda_{\min}(H_2),\; \lambda_{\min}(H_1) + \lambda_{\max}(H_2)\}$,
(4.2d) $\lambda_{\max}(H_1 + H_2) \le \lambda_{\max}(H_1) + \lambda_{\max}(H_2)$.

Remark 4.3. Certainly, Weyl's theorem is applicable to real symmetric matrices. It is worth noting that this theorem can also be applied to the nonsymmetric matrix $(I - S\widetilde{M}_s^{-1}S^TA)(I - t\Pi_A)$ with a parameter $t$, which is based on the fact
\[ \lambda\big((I - S\widetilde{M}_s^{-1}S^TA)(I - t\Pi_A)\big) = \lambda\big((A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A(I - t\Pi_A)(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}\big). \]
One can first apply Weyl's theorem to the symmetric matrix
\[ (A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A(I - t\Pi_A)(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}, \]
and then transform the result into a form related to
\[ I - S\widetilde{M}_s^{-1}S^TA, \quad (I - S\widetilde{M}_s^{-1}S^TA)\Pi_A, \quad \text{or} \quad (I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A). \]
For example, if $t \ge 0$, using (4.2a), we obtain
\[ \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - t\Pi_A)\big) = \lambda_{\min}\big((A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A(I - t\Pi_A)(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}\big) \ge \lambda_{\min}\big((A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}\big) - t\lambda_{\max}\big((A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A\Pi_A(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}\big) = \lambda_{\min}(I - S\widetilde{M}_s^{-1}S^TA) - t\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big). \]
For the sake of brevity, such a trick will be implicitly used in the proof of the next theorem. Now, we are ready to present the new convergence theory for Algorithm 1 with inexact coarse solver.
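The identities of Lemma 4.1 can be checked numerically in both branches of (4.1d). The helper below is our own illustration (random data, $M_s = \mathrm{tril}(A_s)$); it uses $n_s \ge n_c$ to realize $r = n_c$ and $n_s < n_c$ to realize $r < n_c$:

```python
import numpy as np

def lemma41(n, ns, nc, seed):
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n, n))
    A = G @ G.T + n * np.eye(n)
    S, P = rng.standard_normal((n, ns)), rng.standard_normal((n, nc))
    As = S.T @ A @ S
    Ms = np.tril(As)                               # Ms + Ms.T - As = diag(As) is SPD
    Mt = Ms.T @ np.linalg.inv(Ms + Ms.T - As) @ Ms
    I = np.eye(n)
    X = S @ np.linalg.inv(Mt) @ S.T @ A            # S Mt^{-1} S^T A
    Pi = P @ np.linalg.inv(P.T @ A @ P) @ P.T @ A  # Pi_A, (2.10)
    eig = lambda M: np.sort(np.linalg.eigvals(M).real)
    e1 = eig((I - X) @ (I - Pi))                   # spectrum for (4.1a)/(4.1b)
    e2 = eig((I - X) @ Pi)                         # spectrum for (4.1c)/(4.1d)
    ev, evp = eig(X @ (I - Pi)), eig(X @ Pi)
    sigma = ev[ev > 1e-8].min()                    # sigma_TL, (3.2)
    lam = evp[evp > 1e-8].min()                    # lambda^+_min(X Pi_A)
    return e1, e2, sigma, lam, np.linalg.matrix_rank(S.T @ A @ P)

# r = rank(S^T A P) = nc (here ns = 7 >= nc = 6): first branch of (4.1d)
e1, e2, sg, lam, r = lemma41(10, 7, 6, seed=5)
print(r, e1.min(), e1.max() - (1 - sg), e2.min(), e2.max() - (1 - lam))

# r = ns = 4 < nc = 8: second branch, lambda_max((I - X) Pi_A) = 1
e1b, e2b, _, _, rb = lemma41(10, 4, 8, seed=6)
print(rb, e2b.max() - 1.0)
```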
Theorem 4.4.
Let
(4.3) $\alpha = \lambda_{\min}(B_c^{-1}A_c)$ and $\beta = \lambda_{\max}(B_c^{-1}A_c)$.
(i) If $\mathrm{rank}(S^TAP) = n_c$, then
(4.4) $L_1 \le \|\widetilde{E}_{\rm TL}\|_A \le U_1$,
where
\[ L_1 = \begin{cases} 1 - \min\big\{\beta - \beta\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A),\; \sigma_{\rm TL}\big\}, & \text{if } \beta \le 1, \\ 1 - \min\big\{\lambda_{\max}(\widetilde{M}_s^{-1}A_s),\; \beta\sigma_{\rm TL}\big\}, & \text{if } \alpha \le 1 < \beta, \\ \max\big\{\alpha - 1 - \alpha\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A),\; 1 - \lambda_{\max}(\widetilde{M}_s^{-1}A_s),\; (\alpha - 1)\big(1 - \lambda_{\max}(\widetilde{M}_s^{-1}A_s)\big),\; 1 - \beta\sigma_{\rm TL}\big\}, & \text{if } 1 < \alpha, \end{cases} \]
\[ U_1 = \begin{cases} 1 - \alpha\sigma_{\rm TL}, & \text{if } \beta \le 1, \\ \max\big\{1 - \alpha\sigma_{\rm TL},\; (\beta - 1)\big(1 - \lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A)\big)\big\}, & \text{if } \alpha \le 1 < \beta, \\ \max\big\{1 - \sigma_{\rm TL},\; (\beta - 1)\big(1 - \lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A)\big)\big\}, & \text{if } 1 < \alpha. \end{cases} \]
(ii) If $\mathrm{rank}(S^TAP) < n_c$, then
(4.5) $L_2 \le \|\widetilde{E}_{\rm TL}\|_A \le U_2$,
where
\[ L_2 = \begin{cases} 1 - \min\{\beta,\; \sigma_{\rm TL}\}, & \text{if } \beta \le 1, \\ 1 - \min\big\{\lambda_{\max}(\widetilde{M}_s^{-1}A_s),\; \beta\sigma_{\rm TL}\big\}, & \text{if } \alpha \le 1 < \beta, \\ \max\big\{\alpha - 1,\; 1 - \lambda_{\max}(\widetilde{M}_s^{-1}A_s),\; 1 - \beta\sigma_{\rm TL}\big\}, & \text{if } 1 < \alpha, \end{cases} \]
\[ U_2 = \begin{cases} 1 - \alpha\sigma_{\rm TL}, & \text{if } \beta \le 1, \\ \max\{1 - \alpha\sigma_{\rm TL},\; \beta - 1\}, & \text{if } \alpha \le 1 < \beta, \\ \max\{1 - \sigma_{\rm TL},\; \beta - 1\}, & \text{if } 1 < \alpha. \end{cases} \]

Proof.
By (2.2) and (2.4), we have
\[ \widetilde{B}_{\rm TL}^{-1}A = I - (I - SM_s^{-T}S^TA)(I - PB_c^{-1}P^TA)(I - SM_s^{-1}S^TA), \]
which yields
\[ \lambda\big(\widetilde{B}_{\rm TL}^{-1}A\big) = \lambda\big(I - (I - SM_s^{-1}S^TA)(I - SM_s^{-T}S^TA)(I - PB_c^{-1}P^TA)\big) = \lambda\big(I - (I - S\widetilde{M}_s^{-1}S^TA)(I - PB_c^{-1}P^TA)\big). \]
Then
\[ \lambda_{\max}\big(\widetilde{B}_{\rm TL}^{-1}A\big) = 1 - \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - PB_c^{-1}P^TA)\big), \qquad \lambda_{\min}\big(\widetilde{B}_{\rm TL}^{-1}A\big) = 1 - \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - PB_c^{-1}P^TA)\big). \]
Note that $(I - S\widetilde{M}_s^{-1}S^TA)(I - PB_c^{-1}P^TA)$ has the same spectrum as the symmetric matrix $(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}A(I - PB_c^{-1}P^TA)(A^{-1} - S\widetilde{M}_s^{-1}S^T)^{\frac12}$. In view of (4.3), we have
(4.6a) $-a_1 \le \lambda_{\max}\big(\widetilde{B}_{\rm TL}^{-1}A\big) - 1 \le -a_2$,
(4.6b) $b_2 \le 1 - \lambda_{\min}\big(\widetilde{B}_{\rm TL}^{-1}A\big) \le b_1$,
where
\[ a_1 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \alpha\Pi_A)\big), \quad a_2 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \beta\Pi_A)\big), \]
\[ b_1 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \alpha\Pi_A)\big), \quad b_2 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \beta\Pi_A)\big). \]
According to (2.8), (4.6a), and (4.6b), we deduce that
(4.7) $\max\{-a_1,\; b_2\} \le \|\widetilde{E}_{\rm TL}\|_A \le \max\{-a_2,\; b_1\}$.
The remaining task is to establish the upper bounds for $a_1$ and $b_1$, as well as the lower bounds for $a_2$ and $b_2$.

Case 1: $\mathrm{rank}(S^TAP) = n_c$.

Subcase 1.1: $\beta \le 1$. By (4.2b), we have that
\[ a_1 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \alpha\Pi_A)\big) \le \lambda_{\min}(I - S\widetilde{M}_s^{-1}S^TA) - \alpha\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \lambda_{\max}(\widetilde{M}_s^{-1}A_s) \]
and
\[ a_1 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\big(I - \Pi_A + (1 - \alpha)\Pi_A\big)\big) \le \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) + (1 - \alpha)\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \alpha - (1 - \alpha)\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A), \]
where we have used the identities (4.1a), (4.1c), and (4.1d). We then have
(4.8) $a_1 \le 1 - \max\big\{\lambda_{\max}(\widetilde{M}_s^{-1}A_s),\; \alpha + (1 - \alpha)\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A)\big\}$.
Using (4.2a), we obtain
\[ a_2 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\big((1 - \beta)I + \beta(I - \Pi_A)\big)\big) \ge (1 - \beta)\lambda_{\min}(I - S\widetilde{M}_s^{-1}S^TA) + \beta\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big), \]
which, together with (4.1a), yields
(4.9) $a_2 \ge (1 - \beta)\big(1 - \lambda_{\max}(\widetilde{M}_s^{-1}A_s)\big)$.
By (4.2d), we have
\[ b_1 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\big((1 - \alpha)I + \alpha(I - \Pi_A)\big)\big) \le (1 - \alpha)\lambda_{\max}(I - S\widetilde{M}_s^{-1}S^TA) + \alpha\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big). \]
The above inequality, together with (4.1b), gives
(4.10) $b_1 \le 1 - \alpha\sigma_{\rm TL}$.
In light of (4.2c), we have that
\[ b_2 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \beta\Pi_A)\big) \ge \lambda_{\max}(I - S\widetilde{M}_s^{-1}S^TA) - \beta\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \beta + \beta\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A) \]
and
\[ b_2 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\big(I - \Pi_A + (1 - \beta)\Pi_A\big)\big) \ge \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) + (1 - \beta)\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \sigma_{\rm TL}, \]
where we have used the identities (4.1b)–(4.1d). Then
(4.11) $b_2 \ge 1 - \min\big\{\beta - \beta\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A),\; \sigma_{\rm TL}\big\}$.

Subcase 1.2: $\alpha \le 1 < \beta$. In this subcase, the inequalities (4.8) and (4.10) still hold. We next focus on the lower bounds for $a_2$ and $b_2$. Using (4.2a), we obtain
\[ a_2 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\big(I - \Pi_A + (1 - \beta)\Pi_A\big)\big) \ge \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) + (1 - \beta)\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big), \]
which, together with (4.1a) and (4.1d), leads to
(4.12) $a_2 \ge (1 - \beta)\big(1 - \lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A)\big)$.
By (4.1b), (4.1c), and (4.2c), we have that
\[ b_2 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \beta\Pi_A)\big) \ge \lambda_{\min}(I - S\widetilde{M}_s^{-1}S^TA) - \beta\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \lambda_{\max}(\widetilde{M}_s^{-1}A_s) \]
and
\[ b_2 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\big((1 - \beta)I + \beta(I - \Pi_A)\big)\big) \ge (1 - \beta)\lambda_{\max}(I - S\widetilde{M}_s^{-1}S^TA) + \beta\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = 1 - \beta\sigma_{\rm TL}. \]
Accordingly,
(4.13) $b_2 \ge 1 - \min\big\{\lambda_{\max}(\widetilde{M}_s^{-1}A_s),\; \beta\sigma_{\rm TL}\big\}$.

Subcase 1.3: $1 < \alpha$. In this subcase, the estimates (4.12) and (4.13) are still valid. We then consider the upper bounds for $a_1$ and $b_1$. In light of (4.1a), (4.1d), and (4.2b), we have that
\[ a_1 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \alpha\Pi_A)\big) \le \lambda_{\max}(I - S\widetilde{M}_s^{-1}S^TA) - \alpha\lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big) = 1 - \alpha + \alpha\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A) \]
and
\[ a_1 = \lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\big((1 - \alpha)I + \alpha(I - \Pi_A)\big)\big) \le (1 - \alpha)\lambda_{\min}(I - S\widetilde{M}_s^{-1}S^TA) + \alpha\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) = 1 - \alpha + (\alpha - 1)\lambda_{\max}(\widetilde{M}_s^{-1}A_s). \]
Hence,
(4.14) $a_1 \le 1 - \alpha + \min\big\{\alpha\lambda_{\min}^{+}(S\widetilde{M}_s^{-1}S^TA\Pi_A),\; (\alpha - 1)\lambda_{\max}(\widetilde{M}_s^{-1}A_s)\big\}$.
Using (4.2d), we obtain
\[ b_1 = \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)\big(I - \Pi_A + (1 - \alpha)\Pi_A\big)\big) \le \lambda_{\max}\big((I - S\widetilde{M}_s^{-1}S^TA)(I - \Pi_A)\big) + (1 - \alpha)\lambda_{\min}\big((I - S\widetilde{M}_s^{-1}S^TA)\Pi_A\big), \]
which, together with (4.1b) and (4.1c), yields
(4.15) $b_1 \le 1 - \sigma_{\rm TL}$.
Combining (4.7)–(4.15), we can arrive at the estimate (4.4) immediately.
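As a quick sanity check of the Case 1 estimates, the following sketch (our own illustration; random data, $M_s = \mathrm{tril}(A_s)$, and $B_c \succeq A_c$ so that $\beta \le 1$) verifies the subcase $\beta \le 1$ of the bound (4.4) on an example with $\mathrm{rank}(S^TAP) = n_c$:

```python
import numpy as np
from numpy.linalg import inv, eigvals, eigh, norm

rng = np.random.default_rng(7)
n, ns, nc = 10, 7, 6                      # ns >= nc, so rank(S^T A P) = nc: case (i)
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)
S, P = rng.standard_normal((n, ns)), rng.standard_normal((n, nc))
As, Ac = S.T @ A @ S, P.T @ A @ P
Ms = np.tril(As)                          # Ms + Ms.T - As = diag(As) is SPD
Mt = Ms.T @ inv(Ms + Ms.T - As) @ Ms
I = np.eye(n)
Pi = P @ inv(Ac) @ P.T @ A
X = S @ inv(Mt) @ S.T @ A

R = rng.standard_normal((nc, nc))
Bc = Ac + 0.3 * R @ R.T                   # Bc >= Ac, hence beta <= 1
ab = np.sort(eigvals(inv(Bc) @ Ac).real)
alpha, beta = ab[0], ab[-1]

ev = np.sort(eigvals(X @ (I - Pi)).real)
sigma = ev[ev > 1e-8].min()               # sigma_TL, (3.2)
evp = np.sort(eigvals(X @ Pi).real)
lam = evp[evp > 1e-8].min()               # lambda^+_min(X Pi_A)

E = (I - S @ inv(Ms).T @ S.T @ A) @ (I - P @ inv(Bc) @ P.T @ A) \
    @ (I - S @ inv(Ms) @ S.T @ A)
w, V = eigh(A)
Ah = V @ np.diag(np.sqrt(w)) @ V.T
nE = norm(Ah @ E @ inv(Ah), 2)            # ||E~_TL||_A

L1 = 1 - min(beta - beta * lam, sigma)    # lower bound in (4.4), subcase beta <= 1
U1 = 1 - alpha * sigma                    # upper bound
print(L1 - 1e-10 <= nE <= U1 + 1e-10)     # True
```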
Case 2: $\operatorname{rank}(S^{T}AP)<n_{c}$. The detailed proof of Case 2 is omitted for the sake of conciseness. With the identities (4.1a)–(4.1c) and
\[
\lambda_{\max}\big((I-S\widetilde{M}^{-1}S^{T}A)\Pi_{A}\big)=1,
\]
one can prove the following inequalities similarly.

Subcase 2.1: $\beta\le 1$. It is easy to show that
\[
a \le 1-\max\big\{\lambda_{\max}(\widetilde{M}^{-1}A_{s}),\ \alpha\big\}, \tag{4.16}
\]
\[
a \ge (1-\beta)\big(1-\lambda_{\max}(\widetilde{M}^{-1}A_{s})\big), \tag{4.17}
\]
\[
b \le 1-\alpha\sigma_{\rm TL}, \tag{4.18}
\]
\[
b \ge 1-\min\{\beta,\ \sigma_{\rm TL}\}. \tag{4.19}
\]

Subcase 2.2: $\alpha\le 1<\beta$. In this subcase, the inequalities (4.16) and (4.18) still hold. In addition,
\[
a \ge 1-\beta, \tag{4.20}
\]
\[
b \ge 1-\min\big\{\lambda_{\max}(\widetilde{M}^{-1}A_{s}),\ \beta\sigma_{\rm TL}\big\}. \tag{4.21}
\]

Subcase 2.3: $1<\alpha$. In such a case, the inequalities (4.20) and (4.21) still hold. Furthermore, one has
\[
a \le 1-\alpha, \tag{4.22}
\]
\[
b \le 1-\sigma_{\rm TL}. \tag{4.23}
\]
The estimate (4.5) then follows directly from (4.7) and (4.16)–(4.23). This completes the proof. □

Remark. If $\operatorname{rank}(S^{T}AP)=n_{c}$, we get from (4.1b) and (4.1d) that
\[
2-\sigma_{\rm TL}-\lambda_{\min}^{+}(S\widetilde{M}^{-1}S^{T}A\Pi_{A})
=\lambda_{\max}\big((I-S\widetilde{M}^{-1}S^{T}A)(I-\Pi_{A})\big)+\lambda_{\max}\big((I-S\widetilde{M}^{-1}S^{T}A)\Pi_{A}\big)
\ge \lambda_{\max}(I-S\widetilde{M}^{-1}S^{T}A)=1,
\]
which yields
\[
1-\lambda_{\min}^{+}(S\widetilde{M}^{-1}S^{T}A\Pi_{A})\ge\sigma_{\rm TL}. \tag{4.24}
\]
With the relation (4.24), one can easily check that the estimate (4.4) reduces to (3.1) when $B_{c}=A_{c}$. Likewise, the estimate (4.5) will reduce to (3.1) if $B_{c}=A_{c}$.

Conclusions
In this paper, we present a purely algebraic convergence analysis of TLHB methods, provided that $(S\ \ P)$ is of full row rank ($S$ and $P$ correspond to the two hierarchical components). A new and succinct identity for the convergence factor of exact TLHB methods is derived, which can be conveniently used to analyze the optimal interpolation and the influence of $\operatorname{range}(P)$ on the convergence factor. Two-sided bounds for the convergence factor of inexact TLHB methods are also established, which provide a theoretical framework for the convergence analysis of multilevel methods (a multilevel method can be treated as an inexact two-level scheme). An interesting question in the inexact TLHB theory is how to design an efficient coarse solver, which serves as a motivation for developing new multilevel algorithms.

References
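As a concrete illustration of the two-level structure discussed above, the following sketch sets up a TLHB-type iteration for the 1D Laplacian: an exact correction on the fine hierarchical component $\operatorname{range}(S)$ (the subspace treated by relaxation) followed by an exact coarse-grid correction on $\operatorname{range}(P)$, and it measures the spectral radius of the error-propagation matrix. The grid size, the choice of $S$ as injection onto the fine-only nodes, linear interpolation $P$, and the exact subspace solves are all illustrative assumptions, not the paper's algorithm verbatim.

```python
import numpy as np

n = 31                                        # fine grid size (odd)
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Laplacian stencil

S = np.eye(n)[:, 0::2]                        # fine-only (hierarchical) nodes
nc = (n - 1) // 2
P = np.zeros((n, nc))                         # linear interpolation to coarse nodes
for j in range(nc):
    P[2 * j:2 * j + 3, j] = [0.5, 1.0, 0.5]   # coarse node sits at index 2j+1

# Exact correction on range(V): I - V (V^T A V)^{-1} V^T A is the
# A-orthogonal projection onto the A-orthogonal complement of range(V).
corr = lambda V: V @ np.linalg.solve(V.T @ A @ V, V.T @ A)

# Error propagation: subspace correction on range(S), then coarse correction.
E = (np.eye(n) - corr(P)) @ (np.eye(n) - corr(S))
rho = np.abs(np.linalg.eigvals(E)).max()
print("measured convergence factor ~", round(rho, 3))
assert rho < 1  # (S  P) has full rank here, so the scheme converges
```

Since $[S\ \ P]$ is nonsingular in this construction, the two exact subspace corrections together contract every error component, so the measured factor is strictly below one; replacing the exact coarse solve by an approximate $B_c$ is precisely the inexact setting the two-sided bounds address.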
1. O. Axelsson, Iterative Solution Methods, Cambridge University Press, Cambridge, 1994.
2. O. Axelsson and I. Gustafsson, Preconditioning and two-level multigrid methods of arbitrary degree of approximation, Math. Comp. (1983), 219–242.
3. R. E. Bank, Hierarchical bases and the finite element method, Acta Numer. (1996), 1–43.
4. R. E. Bank, T. F. Dupont, and H. Yserentant, The hierarchical basis multigrid method, Numer. Math. (1988), 427–458.
5. D. Braess, The contraction number of a multigrid method for solving the Poisson equation, Numer. Math. (1981), 387–404.
6. A. E. Brandt, General highly accurate algebraic coarsening, Electron. Trans. Numer. Anal. (2000), 1–20.
7. J. Brannick, F. Cao, K. Kahl, R. D. Falgout, and X. Hu, Optimal interpolation and compatible relaxation in classical algebraic multigrid, SIAM J. Sci. Comput. (2018), A1473–A1493.
8. W. L. Briggs, V. E. Henson, and S. F. McCormick, A Multigrid Tutorial, second ed., SIAM, Philadelphia, PA, 2000.
9. V. Eijkhout and P. S. Vassilevski, The role of the strengthened Cauchy–Buniakowskii–Schwarz inequality in multilevel methods, SIAM Rev. (1991), 405–419.
10. R. D. Falgout and P. S. Vassilevski, On generalizing the algebraic multigrid framework, SIAM J. Numer. Anal. (2004), 1669–1693.
11. R. D. Falgout, P. S. Vassilevski, and L. T. Zikatanov, On two-grid convergence estimates, Numer. Linear Algebra Appl. (2005), 471–494.
12. W. Hackbusch, Multi-Grid Methods and Applications, Springer-Verlag, Berlin, Heidelberg, 1985.
13. R. A. Horn and C. R. Johnson, Matrix Analysis, second ed., Cambridge University Press, Cambridge, 2013.
14. S. P. MacLachlan and L. N. Olson, Theoretical bounds for algebraic multigrid performance: review and analysis, Numer. Linear Algebra Appl. (2014), 194–220.
15. R. Nabben and C. Vuik, A comparison of deflation and coarse grid correction applied to porous media flow, SIAM J. Numer. Anal. (2004), 1631–1647.
16. Y. Notay, Algebraic theory of two-grid methods, Numer. Math. Theor. Meth. Appl. (2015), 168–198.
17. U. Trottenberg, C. W. Oosterlee, and A. Schüller, Multigrid, Academic Press, New York, 2001.
18. P. S. Vassilevski, On two ways of stabilizing the hierarchical basis multilevel methods, SIAM Rev. (1997), 18–53.
19. P. S. Vassilevski, Multilevel Block Factorization Preconditioners: Matrix-based Analysis and Algorithms for Solving Finite Element Equations, Springer-Verlag, New York, 2008.
20. J. Xu and L. T. Zikatanov, Algebraic multigrid methods, Acta Numer. (2017), 591–721.
21. X. Xu and C.-S. Zhang, On the ideal interpolation operator in algebraic multigrid methods, SIAM J. Numer. Anal. (2018), 1693–1710.
22. X. Xu and C.-S. Zhang, Convergence analysis of inexact two-grid methods: A theoretical framework, submitted.
23. H. Yserentant, On the multilevel splitting of finite element spaces, Numer. Math. (1986), 379–412.
24. L. T. Zikatanov, Two-sided bounds on the convergence rate of two-level methods, Numer. Linear Algebra Appl. (2008), 439–454.

Department of Mathematics, Purdue University, West Lafayette, IN 47907, USA
E-mail address: