Block preconditioning of stochastic Galerkin problems: New two-sided guaranteed spectral bounds
Marie Kubínová and Ivana Pultarová

Funding: The work of M. Kubínová was supported by the Czech Academy of Sciences through the project L100861901 (Programme for promising human resources – postdocs) and by the Ministry of Education, Youth and Sports of the Czech Republic through the project LQ1602 (IT4Innovations excellence in science). The work of I. Pultarová was supported by the Grant Agency of the Czech Republic under contract No. 17-04150J. This version is dated August 20, 2019.

Affiliation: Faculty of Civil Engineering, Czech Technical University in Prague, Czech Republic, and College of Polytechnics Jihlava, Czech Republic ([email protected], http://mat.fsv.cvut.cz/ivana).

Abstract.
The paper focuses on the numerical solution of parametrized diffusion equations with a scalar parameter-dependent coefficient function by the stochastic (spectral) Galerkin method. We study preconditioning of the related discretized problems using preconditioners obtained by modifying the stochastic part of the partial differential equation. We present a simple but general approach for obtaining two-sided bounds to the spectrum of the resulting matrices, based on a particular splitting of the discretized operator. Using this tool and considering the stochastic approximation space formed by classical orthogonal polynomials, we obtain new spectral bounds depending solely on the properties of the coefficient function and the type of the approximation polynomials for several classes of block-diagonal preconditioners. These bounds are guaranteed and applicable to various distributions of parameters. Moreover, the conditions on the parameter-dependent coefficient function are only local, and therefore less restrictive than those usually assumed in the literature.
Key words. stochastic Galerkin method, diffusion problem, preconditioning, block-diagonal preconditioning, spectral bounds
AMS subject classifications.
1. Introduction.
Growing interest in uncertainty quantification of numerical solutions of partial differential equations stimulates new modifications of standard numerical methods. A popular choice for partial differential equations with parametrized or uncertain data is the stochastic Galerkin method [4, 36]. Similarly to deterministic problems, approximate solutions, which depend on physical and stochastic variables (parameters), are searched for in finite-dimensional subspaces of the original Hilbert space. More precisely, the approximate solutions are orthogonal projections of the exact solution to the finite-dimensional subspaces with respect to the energy inner product defined by the operator of the equation; see, e.g., [5, 9, 10, 24]. The approximation subspaces are considered in the form of a tensor product of a physical variable space (finite-element functions) and a stochastic variable space (polynomials); see, e.g., [4, 14]. The form and qualities of the system matrix $A$ of the discretized problem are determined by the structure of the uncertain data and the type of the finite-dimensional solution spaces. For special classes of parameters, it was shown, see, e.g., [24, 32], that certain block-diagonal matrices are spectrally equivalent to $A$ independently of the degree of polynomials and the number of random parameters, and thus they can be used for preconditioning. Having a good preconditioning method or, in other words, a good and feasible approximation of $A^{-1}$, we may also efficiently estimate a posteriori the energy norm of the error during iterative solution processes [1, 5, 6, 9, 17]. This estimate can be used in adaptive algorithms [5, 7, 8]. In practice, the matrix $A$ is never built explicitly, only matrix-vector products are evaluated [25].

In this paper, we focus on matrices arising in the discretized stochastic Galerkin method and present new guaranteed two-sided bounds to the spectra of the preconditioned matrices for several types of preconditioner. We consider only preconditioning with respect to the stochastic parts of problems, and thus we assume that a suitable preconditioning method or an efficient solver for the underlying deterministic problem is available; see, e.g., [12, 23, 33]. We formulate an idea of obtaining bounds to the spectrum of the preconditioned matrix from the spectra of small Gram matrices depending solely on the stochastic part of the approximation space. The motivation, however, comes from techniques and tools of the algebraic multilevel preconditioning introduced in [11, 2]. A similar idea was, in a simpler form, used already in [28, 29]. In the current paper, it is applied in a more general setting, and we believe that the derived technique may lead to an improvement of some other recently introduced estimates, such as [17, 23]. The derived technique is also applicable to systems in the form of a multi-term matrix equation (see [25, eq. (1.8)]).

The paper is organized as follows. In section 2, we briefly recall the stochastic Galerkin method and the structure of the matrices $A$ of the resulting systems of linear equations for the tensor product polynomials and complete polynomials. Since the structure of $A$ plays a crucial role in the analysis, theoretical considerations will be accompanied by illustrative examples throughout the paper. Section 3 formulates a general concept of proving spectral equivalence for a broad class of (not only) stochastic Galerkin preconditioners. In section 4, we apply this idea to preconditioners which are represented by a special type of block-diagonal (or Schur complement) approximations of $A$, and show how to obtain the spectral bounds of the preconditioned problems from the spectral bounds of small Gram matrices of the corresponding polynomial chaos. We also evaluate those bounds explicitly for the considered polynomial chaoses. Simple numerical examples demonstrating the obtained theoretical outcomes are presented at the end of the section.

Throughout the paper, we denote by $\kappa(M^{-1}A)$, where $A$ and $M$ are symmetric positive definite, the spectral condition number of $M^{-1}A$, i.e., the standard condition number of $M^{-1/2}AM^{-1/2}$ or, in other words, $\lambda_{\max}(M^{-1}A)/\lambda_{\min}(M^{-1}A)$. By $\mathbf{e}_i$ we denote the $i$-th column of the identity matrix, whose size follows from the context.
2. Stochastic Galerkin matrices.
Consider the variational problem of finding $u \in \widetilde V = H_0^1(D) \otimes L^2_\rho(\Gamma)$ such that

(2.1) $\int_\Gamma \int_D a(\mathbf{x},\xi)\, \nabla u(\mathbf{x},\xi) \cdot \nabla v(\mathbf{x},\xi)\, \rho(\xi)\, d\mathbf{x}\, d\xi = \int_\Gamma \int_D f(\mathbf{x})\, v(\mathbf{x},\xi)\, \rho(\xi)\, d\mathbf{x}\, d\xi$ for all $v \in \widetilde V$,

where $D \subset \mathbb{R}^d$ is a bounded polygonal domain, $d = 1, 2, 3$, $L^2_\rho(\Gamma)$ is a parametric measure space, $\Gamma \subset \mathbb{R}^K$, $\Gamma = \prod_{k=1}^K \Gamma_k$, $a \in L^\infty(D) \otimes L^\infty_\rho(\Gamma)$, and $f \in L^2(D)$. The gradient is applied only with respect to the (physical) variable $\mathbf{x} \in D$. Let $\xi = (\xi_1,\dots,\xi_K) \in \Gamma$, where $\xi_k \in \Gamma_k$ are outcomes of independent random variables with probability densities $\rho_k(\xi_k)$, $k = 1,\dots,K$.
The joint probability density is then $\rho = \prod_{k=1}^K \rho_k$. In the following, we consider $\rho_k$ defined on $\mathbb{R}$ such that $\rho_k(\xi_k) = 0$ outside $\Gamma_k$. Thus, instead of $\Gamma_k$ and $\Gamma$, we further write $\mathbb{R}$ and $\mathbb{R}^K$, respectively. For the convenience of notation, the probability densities are not normalized, see also Table 1, and we further refer to them as weights.

We assume $a(\mathbf{x},\xi)$ in the affine form

(2.2) $a(\mathbf{x},\xi) = a_0(\mathbf{x}) + \sum_{k=1}^K a_k(\mathbf{x})\, \xi_k$,

where $a_k \in L^\infty(D)$, $k = 1,\dots,K$. While it is usually assumed that there exist constants $a_{\min}$ and $a_{\max}$ such that

$0 < a_{\min} \le a(\mathbf{x},\xi) \le a_{\max} < \infty$ for a.a. $\mathbf{x} \in D$, $\xi \in \Gamma$,

in this paper we consider more general functions $a$. We will only require that the left-hand side of (2.1) defines an inner product on a finite-dimensional approximation space $V \subset \widetilde V$; see subsection 2.3. This will allow us to use random variables $\xi_k$ with unbounded images and still obtain positive definite system matrices. In other words, we can avoid truncation of supports of distribution functions or any other modification of them. Of course, under such a (weaker) condition on $a$, (2.1) may not be well-defined. In this paper we, however, focus only on the discretized problem obtained from (2.1); see also the discussion in [24].

We consider discretization using the tensor product space [3, 4, 14] of the form $V = V_{FE} \otimes P \subset \widetilde V$, where $V_{FE} \subset H_0^1(D)$ is an $N_{FE}$-dimensional space spanned by the finite-element (FE) functions $\phi_1,\dots,\phi_{N_{FE}}$, and $P$ is an $N_P$-dimensional space spanned by $K$-variate polynomials $\Psi_1,\dots,\Psi_{N_P}$ of the variables $\xi_1,\dots,\xi_K$. Denoting the basis functions $\phi_r \Psi_j$ of $V$ by a couple of coordinates $r = 1,\dots,N_{FE}$ and $j = 1,\dots,N_P$, we obtain the matrix $A$ of the system of linear equations of the discretized Galerkin problem (2.1) with elements

$A_{ri,sj} = \int_{\mathbb{R}^K} \int_D a(\mathbf{x},\xi)\, \nabla \phi_s(\mathbf{x}) \cdot \nabla \phi_r(\mathbf{x})\, \Psi_j(\xi) \Psi_i(\xi)\, \rho(\xi)\, d\mathbf{x}\, d\xi$
$\quad = \int_D a_0(\mathbf{x})\, \nabla \phi_s \cdot \nabla \phi_r\, d\mathbf{x} \int_{\mathbb{R}^K} \Psi_j \Psi_i\, \rho(\xi)\, d\xi + \sum_{k=1}^K \int_D a_k(\mathbf{x})\, \nabla \phi_s \cdot \nabla \phi_r\, d\mathbf{x} \int_{\mathbb{R}^K} \xi_k\, \Psi_j \Psi_i\, \rho(\xi)\, d\xi$
$\quad =: (F_0)_{rs} (G_0)_{ij} + \sum_{k=1}^K (F_k)_{rs} (G_k)_{ij}$,

where for $k = 0, 1, \dots, K$

(2.3) $(F_k)_{rs} = \int_D a_k(\mathbf{x})\, \nabla \phi_s(\mathbf{x}) \cdot \nabla \phi_r(\mathbf{x})\, d\mathbf{x}$ and $(G_k)_{ij} = \int_{\mathbb{R}^K} \xi_k\, \Psi_j(\xi) \Psi_i(\xi)\, \rho(\xi)\, d\xi$,

where we formally set $\xi_0 = 1$. If the numbering of the basis functions $\phi_r \Psi_i$ is anti-lexicographical, the structure of $A$ is

(2.4) $A = \sum_{k=0}^K G_k \otimes F_k$.

In other words, the matrix $A$ is composed of $N_P \times N_P$ blocks, each of size $N_{FE} \times N_{FE}$.

Example 2.1. Let $K = 1$, consider the uniform distribution $\rho(\xi) = \rho_1(\xi_1) = \chi_{[-1,1]}$, and let $\Psi_1(\xi_1)$, $\Psi_2(\xi_1)$ and $\Psi_3(\xi_1)$ be the normalized Legendre orthogonal polynomials of degrees 0, 1, and 2, see Table 1. Then $N_P = 3$ and

(2.5) $A = \begin{pmatrix} F_0 & \frac{1}{\sqrt 3} F_1 & 0 \\ \frac{1}{\sqrt 3} F_1 & F_0 & \frac{2}{\sqrt{15}} F_1 \\ 0 & \frac{2}{\sqrt{15}} F_1 & F_0 \end{pmatrix} = I \otimes F_0 + G_1 \otimes F_1$.

2.1. Approximation spaces. For approximation of the physical part of the solution, we use an $N_{FE}$-dimensional space $V_{FE}$. To approximate the stochastic part of the solution we use the $N_P$-dimensional space $P$ of $K$-variate polynomials $\Psi_j(\xi) = \prod_{k=1}^K \psi^{(k)}_{j_k}(\xi_k)$, $j = 1,\dots,N_P$. To simplify the notation, we assume that the parameters $\xi_1,\dots,\xi_K$ are identically distributed, i.e., $\rho_1 = \dots = \rho_K$. Thus, we omit the superscripts and subscripts $k$ in $\psi^{(k)}_j$ and $\rho_k$, respectively. The extension of the results to polynomial bases with different $\rho_1,\dots,\rho_K$ is straightforward.

In practice, sets of complete polynomials (C) or tensor product polynomials (TP) are usually used; see, e.g., [14, 24]. The set of the tensor product polynomials of degree at most $s_k - 1$ in $\xi_k$, $k = 1,\dots,K$, is defined as

$P^{TP}_{s_1,\dots,s_K} = \bigl\{ p(\xi) = \prod_{k=1}^K p_k(\xi_k);\ \deg(p_k) \le s_k - 1,\ k = 1,\dots,K \bigr\}$ and $N_P = \prod_{k=1}^K s_k$.

Let us denote by $V^{TP}_{s_1,\dots,s_K} = V_{FE} \otimes P^{TP}_{s_1,\dots,s_K}$ the corresponding approximation space of (2.1). The set of complete polynomials of maximum total degree $s - 1$ is defined as

$P^{C}_{s} = \bigl\{ p(\xi) = \prod_{k=1}^K p_k(\xi_k);\ \sum_{k=1}^K \deg(p_k) \le s - 1 \bigr\}$ and $N_P = \binom{K + s - 1}{K}$.

Let us denote by $V^{C}_{s} = V_{FE} \otimes P^{C}_{s}$ the corresponding approximation space of (2.1). For both $P^{TP}_{s_1,\dots,s_K}$ and $P^{C}_{s}$, the bases are usually constructed as products of $K$ classical orthogonal polynomials. More precisely, $\Psi_j(\xi) = \prod_{k=1}^K \psi_{j_k}(\xi_k)$, $j = 1,\dots,N_P$, where the $\psi_i$ are normalized orthogonal polynomials of degrees $i = 0, 1, 2, \dots$ with respect to the weight function $\rho$, i.e.,

$\int_{\mathbb{R}} \psi_i(\xi)\, \psi_j(\xi)\, \rho(\xi)\, d\xi = \delta_{ij}$.

The $N_{FE} \cdot N_P$ basis functions of the discretization space $V$ are then of the form

(2.6) $\phi_n(\mathbf{x})\, \Psi_j(\xi) = \phi_n(\mathbf{x})\, \psi_{j_1}(\xi_1) \cdots \psi_{j_K}(\xi_K)$.
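To make the Kronecker structure (2.4) concrete, the following minimal sketch (in Python with NumPy; the helper names and the small placeholder FE matrices are ours, not from the paper) builds the Legendre matrix $G_{s,1}$ from the recurrence coefficients of Table 1 and assembles $A = \sum_k G_k \otimes F_k$ for the setting of Example 2.1:

```python
import numpy as np

def legendre_jacobi(s):
    """Jacobi matrix G_{s,1} of the normalized Legendre polynomials:
    zero diagonal, off-diagonal entries sqrt(beta_n) with
    beta_n = n^2 / ((2n - 1)(2n + 1)), cf. Table 1 and (2.8)."""
    n = np.arange(1, s)
    offdiag = np.sqrt(n**2 / ((2*n - 1.0) * (2*n + 1.0)))
    return np.diag(offdiag, 1) + np.diag(offdiag, -1)

def assemble(G_list, F_list):
    """A = sum_k G_k kron F_k, cf. (2.4)."""
    return sum(np.kron(G, F) for G, F in zip(G_list, F_list))

# Example 2.1: K = 1, s = 3. F0 and F1 are hypothetical small FE
# matrices standing in for the true stiffness matrices.
F0 = np.array([[2.0, -1.0], [-1.0, 2.0]])
F1 = 0.3 * F0
G1 = legendre_jacobi(3)                  # off-diagonals 1/sqrt(3), 2/sqrt(15)
A = assemble([np.eye(3), G1], [F0, F1])  # 6 x 6, block structure of (2.5)
print(np.round(G1, 4))
```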
For the tensor product polynomials, we consider the anti-lexicographical ordering of the basis functions, i.e., the leftmost index ($n$) in (2.6) is changing the fastest, while the rightmost index ($j_K$) is changing the slowest. For the complete polynomials, we consider ordering by the total degree of the polynomials, going from the smallest to the largest.

Another popular choice of the basis functions of $P$ is a set of double orthogonal polynomials [3, 4, 14]. If we use the double orthogonal polynomials as a basis of $P$, the matrix $A$ becomes block-diagonal with diagonal blocks of size $N_{FE} \times N_{FE}$. Such a block-diagonal matrix $A$ can also be obtained by simultaneous diagonalization of all matrices $G_k$, see [14]. This diagonal structure of the resulting matrices seems favourable for practical computations. However, the double orthogonal polynomials cannot be used as a basis for complete polynomials [14]. Moreover, for this basis, we cannot obtain methods for a posteriori error estimation or adaptivity control in a straightforward way. In addition, to refine the space $P$, all diagonal blocks of the matrix $A$ must be recomputed. Therefore, in this paper, we only consider the classical orthogonal polynomials to construct the bases of $P^{C}_{s}$ or $P^{TP}_{s_1,\dots,s_K}$.

2.2. The matrices $G_k$. The form of the matrices $G_k$, $k = 0, 1, \dots, K$, in (2.3) depends on the choice of the basis of $P^{C}_{s}$ or $P^{TP}_{s_1,\dots,s_K}$ and will be important for our further analysis. As will be described later, the matrices $G_k$ can be constructed from (the elements of) a sequence of smaller $s \times s$ matrices

$(G_{s,j})_{l+1,m+1} \equiv \int_{\mathbb{R}} \xi^j\, \psi_l(\xi)\, \psi_m(\xi)\, \rho(\xi)\, d\xi$, $j = 0, 1$, $l, m = 0, 1, \dots, s-1$.

Let the normalized orthogonal polynomials satisfy the well-known three-term recurrence

(2.7) $\sqrt{\beta_{n+1}}\, \psi_{n+1}(\xi) = (\xi - \alpha_n)\, \psi_n(\xi) - \sqrt{\beta_n}\, \psi_{n-1}(\xi)$, $n = 0, 1, 2, \dots$, $\psi_{-1} \equiv 0$.

Then $G_{s,0} = I_s$, where $I_s$ is the $s \times s$ identity matrix, and $G_{s,1}$ has the form of the Jacobi matrix

(2.8) $G_{s,1} = \begin{pmatrix} \alpha_0 & \sqrt{\beta_1} & & \\ \sqrt{\beta_1} & \alpha_1 & \ddots & \\ & \ddots & \ddots & \sqrt{\beta_{s-1}} \\ & & \sqrt{\beta_{s-1}} & \alpha_{s-1} \end{pmatrix}$.

The eigenvalues of this matrix are given by the roots of the polynomial $\psi_s$, which are distinct and lie in the support of $\rho$; see, e.g., [15]. In Table 1, we list the classical orthogonal polynomials with symmetric statistical distribution considered here together with the weight function $\rho$ corresponding to the non-normalized probability density. Note that due to the symmetry, the diagonal entries of $G_{s,1}$ in (2.8) become trivially zero. These matrices will play a crucial role in deriving spectral bounds, see section 4.

Table 1: Wiener–Askey table: symmetric statistical distributions together with the corresponding polynomial chaos (classical orthogonal polynomials) and the three-term recurrence coefficients.

  statistical distribution | weight function | support | polynomial chaos | $\beta_n$ | $\alpha_n$
  Gaussian | $e^{-x^2/2}$ | $(-\infty,\infty)$ | Hermite | $n$ | $0$
  Beta | $(1-x^2)^{\gamma-1/2}$ | $[-1,1]$ | Gegenbauer | $\frac{n(n+2\gamma-1)}{(2n+2\gamma-2)(2n+2\gamma)}$ | $0$
  Beta, $\gamma = 1$ | $(1-x^2)^{1/2}$ | $[-1,1]$ | Chebyshev (2nd kind) | $\frac{1}{4}$ | $0$
  Beta, $\gamma = \frac12$ (uniform) | $1$ | $[-1,1]$ | Legendre | $\frac{n^2}{(2n-1)(2n+1)}$ | $0$

For the tensor product polynomials, the matrices $G_k$, $k = 0, 1, \dots, K$, are obtained as

(2.9)
$G_0 = G_{s_K,0} \otimes G_{s_{K-1},0} \otimes \cdots \otimes G_{s_2,0} \otimes G_{s_1,0}$,
$G_1 = G_{s_K,0} \otimes G_{s_{K-1},0} \otimes \cdots \otimes G_{s_2,0} \otimes G_{s_1,1}$,
$\vdots$
$G_K = G_{s_K,1} \otimes G_{s_{K-1},0} \otimes \cdots \otimes G_{s_2,0} \otimes G_{s_1,0}$,

see, e.g., [14, 26].

Example 2.2. Consider two uniformly distributed parameters $\xi_1$ and $\xi_2$ and Legendre polynomials with $s_1 = s_2 = 3$. Then $N_P = 9$ and the matrix $A$ has the form

(2.10) $A = \begin{pmatrix} B & \frac{1}{\sqrt 3} I_3 \otimes F_2 & 0 \\ \frac{1}{\sqrt 3} I_3 \otimes F_2 & B & \frac{2}{\sqrt{15}} I_3 \otimes F_2 \\ 0 & \frac{2}{\sqrt{15}} I_3 \otimes F_2 & B \end{pmatrix} = \sum_{k=0}^2 G_k \otimes F_k$, with $B = \begin{pmatrix} F_0 & \frac{1}{\sqrt 3} F_1 & 0 \\ \frac{1}{\sqrt 3} F_1 & F_0 & \frac{2}{\sqrt{15}} F_1 \\ 0 & \frac{2}{\sqrt{15}} F_1 & F_0 \end{pmatrix}$,

where the outer block structure corresponds to the changing degree of the approximation polynomials in the variable $\xi_2$.

For complete polynomials, the matrices $G_k$ lose the Kronecker product structure, since $P^{C}_{s}$ is not a tensor product space. However, since $P^{C}_{s} \subset P^{TP}_{s,s,\dots,s}$, each matrix $G_k$ is permutation-similar to a submatrix of the matrices in (2.9), [14, Lemma 3].

Example 2.3. Consider two uniformly distributed parameters $\xi_1$ and $\xi_2$ and complete Legendre polynomials with $s = 3$. Then $N_P = 6$ and the relevant submatrix of the tensor-product matrix (2.10) is

(2.11) $A = \begin{pmatrix} F_0 & \frac{1}{\sqrt 3} F_1 & 0 & \frac{1}{\sqrt 3} F_2 & 0 & 0 \\ \frac{1}{\sqrt 3} F_1 & F_0 & \frac{2}{\sqrt{15}} F_1 & 0 & \frac{1}{\sqrt 3} F_2 & 0 \\ 0 & \frac{2}{\sqrt{15}} F_1 & F_0 & 0 & 0 & 0 \\ \frac{1}{\sqrt 3} F_2 & 0 & 0 & F_0 & \frac{1}{\sqrt 3} F_1 & \frac{2}{\sqrt{15}} F_2 \\ 0 & \frac{1}{\sqrt 3} F_2 & 0 & \frac{1}{\sqrt 3} F_1 & F_0 & 0 \\ 0 & 0 & 0 & \frac{2}{\sqrt{15}} F_2 & 0 & F_0 \end{pmatrix}$.

Reordering the entries by the total degree of the corresponding polynomial, we obtain

$A = \begin{pmatrix} F_0 & \frac{1}{\sqrt 3} F_1 & \frac{1}{\sqrt 3} F_2 & 0 & 0 & 0 \\ \frac{1}{\sqrt 3} F_1 & F_0 & 0 & \frac{2}{\sqrt{15}} F_1 & \frac{1}{\sqrt 3} F_2 & 0 \\ \frac{1}{\sqrt 3} F_2 & 0 & F_0 & 0 & \frac{1}{\sqrt 3} F_1 & \frac{2}{\sqrt{15}} F_2 \\ 0 & \frac{2}{\sqrt{15}} F_1 & 0 & F_0 & 0 & 0 \\ 0 & \frac{1}{\sqrt 3} F_2 & \frac{1}{\sqrt 3} F_1 & 0 & F_0 & 0 \\ 0 & 0 & \frac{2}{\sqrt{15}} F_2 & 0 & 0 & F_0 \end{pmatrix} = \sum_{k=0}^2 G_k \otimes F_k$,

where the blocks corresponding to the total degrees 0, 1, and 2 are separated graphically.

2.3. Positive definiteness. The left-hand side of equation (2.1) defines the bilinear form $(\cdot,\cdot)_A$ on $\widetilde V$. We present sufficient conditions on the function $a$ under which $(\cdot,\cdot)_A$ becomes an inner product (called the energy inner product; see, e.g., [5, 9, 10]) on the finite-dimensional space $V$. To achieve positive definiteness of the bilinear form $(\cdot,\cdot)_A$, we need to assume some dominance of the deterministic part $a_0(\mathbf{x})$ over the stochastic parts $a_k(\mathbf{x})\xi_k$, $k = 1,\dots,K$. In this paper, we will assume that there exists a constant $\mu_0 > 0$ such that

(2.12) $\sum_{k=1}^K |a_k(\mathbf{x})| \le \mu_0\, a_0(\mathbf{x})$, for a.a. $\mathbf{x} \in D$,

where the particular choice of $\mu_0$ depends on the weight $\rho(\xi)$. For the Beta distribution on $[-1,1]$, we take $\mu_0 = 1$, while for the Gauss distribution, we take $\mu_0 = (2(s_1 + \dots + s_K - K))^{-1}$ for tensor product polynomials and $\mu_0 = (2s-2)^{-1}$ for complete polynomials. (Since the eigenvalues of the matrix $G_{s,1}$ are the zeros of the Hermite polynomial $\psi_s$ and thus lie in the interval $\bigl(-2\sqrt{s-1},\, 2\sqrt{s-1}\bigr)$ [34, p. 120], the eigenvalues of $(2s-2)^{-1} G_{s,1} + G_{s,0}$ are strictly positive.) Note that this choice of $\mu_0$ also trivially implies that $G_{s,0} + \mu_0 G_{s,1}$ is positive definite. For further discussion on bounds of Hermite and Legendre polynomials see, e.g., [24].

We emphasize that the assumption (2.12) is weaker than the classical assumption widely used to obtain spectral estimates, e.g.,

(2.13) $\sum_{k=1}^K \| a_k \|_{L^\infty(D)} \le \mu_{class}\, \operatorname*{ess\,inf}_{\mathbf{x} \in D} a_0(\mathbf{x})$

for the uniform distribution; see [13, 17, 23, 24, 35]. The main difference between (2.12) and (2.13) is that the former is considered point-wise, while the latter uses the norms of the $a_k$ over $D$. The condition (2.12) allows us to obtain not only more accurate two-sided guaranteed bounds to the spectra, but these bounds also apply to parameter distributions and functions $a_k$ for which no estimate could be obtained using the standard approach; see subsection 4.4. Assumption (2.12) is sufficient to achieve positive definiteness of $A$. In some applications, we can assume a stronger dominance of $a_0$, i.e.,

(2.14) $\sum_{k=1}^K |a_k(\mathbf{x})| \le \mu\, a_0(\mathbf{x})$, for a.a. $\mathbf{x} \in D$, $0 \le \mu \le \mu_0$.

The smaller the $\mu$, the more favourable spectral bounds of the matrices $A$ and of the preconditioned matrices $M^{-1}A$ are generally achieved. We will further assume that $\mu$ is the smallest number for which (2.14) is satisfied.
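The positive definiteness behind (2.12) is easy to probe numerically. A minimal sketch (Python/NumPy, our naming; the particular value of $\mu$ is for illustration only) builds the Hermite Jacobi matrix from Table 1 and checks that $G_{s,0} + \mu G_{s,1}$ is positive definite whenever $\mu\, \lambda_{\max}(G_{s,1}) < 1$:

```python
import numpy as np

def hermite_jacobi(s):
    """Jacobi matrix G_{s,1} of the normalized (probabilists') Hermite
    polynomials: beta_n = n, cf. Table 1."""
    offdiag = np.sqrt(np.arange(1.0, s))
    return np.diag(offdiag, 1) + np.diag(offdiag, -1)

s = 6
G1 = hermite_jacobi(s)
lam_max = np.max(np.linalg.eigvalsh(G1))   # largest zero of psi_s
mu = 0.9 / lam_max                         # illustrative admissible mu
eigs = np.linalg.eigvalsh(np.eye(s) + mu * G1)
print(lam_max, eigs.min() > 0)             # positive definiteness: True
```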
3. Proving spectral equivalence of inner products on $V$. We consider preconditioning methods based on inner products that are spectrally equivalent to the energy inner product $(\cdot,\cdot)_A$ on $V$, but are represented by matrices with more favourable non-zero structures such as, for example, block-diagonal matrices. We base our approach on a splitting of the inner products to subdomains (Lemma 3.3) and on a preconditioning of a tensor product matrix (Lemma 3.5).

Let $D$ be partitioned into arbitrary non-overlapping elements (subdomains) $\tau_j$, $j = 1,\dots,N_{elem}$. Consider the following decomposition of $A$ from (2.4):

$A = \sum_{k=0}^K G_k \otimes F_k = \sum_{j=1}^{N_{elem}} \sum_{k=0}^K G_k \otimes F^{(j)}_k =: \sum_{j=1}^{N_{elem}} A^{(j)}$,

where

$(F^{(j)}_k)_{rs} = \int_{\tau_j} a_k(\mathbf{x})\, \nabla \phi_s(\mathbf{x}) \cdot \nabla \phi_r(\mathbf{x})\, d\mathbf{x}$ and $A^{(j)} = \sum_{k=0}^K G_k \otimes F^{(j)}_k$.

Assumption 3.1. We assume that the functions $a_k(\mathbf{x})$, $k = 0, 1, \dots, K$ (and thus the function $a(\mathbf{x},\xi)$), are constant on every element (subdomain) $\tau_j$, $j = 1,\dots,N_{elem}$. We define

(3.1) $a^{(j)}_k \equiv a_k(\mathbf{x})$, $\mathbf{x} \in \tau_j$, $j = 1,\dots,N_{elem}$.

If the $a_k(\mathbf{x})$ were not constant on elements, we would assume a stronger, element-wise, dominance of $a_0(\mathbf{x})$ over the $a_k(\mathbf{x})$, i.e.,

$\sum_{k=1}^K \operatorname*{ess\,sup}_{\mathbf{x} \in \tau_j} |a_k(\mathbf{x})| \le \mu\, \operatorname*{ess\,inf}_{\mathbf{x} \in \tau_j} a_0(\mathbf{x})$, $j = 1,\dots,N_{elem}$,

instead of (2.14), which would result in a slight modification of the spectral estimates derived in subsequent sections. To simplify the presentation, we do not describe these modifications in more detail.

Using (3.1), we obtain

$(F^{(j)}_k)_{rs} = \int_{\tau_j} a_k(\mathbf{x})\, \nabla \phi_s \cdot \nabla \phi_r\, d\mathbf{x} = a^{(j)}_k \int_{\tau_j} \nabla \phi_s \cdot \nabla \phi_r\, d\mathbf{x} =: a^{(j)}_k (F^{(j)})_{rs}$.

Therefore, we can write

$A^{(j)} = \sum_{k=0}^K G_k \otimes F^{(j)}_k = \sum_{k=0}^K G_k \otimes a^{(j)}_k F^{(j)} = \Bigl( \sum_{k=0}^K a^{(j)}_k G_k \Bigr) \otimes F^{(j)}$,

which gives

(3.2) $A = \sum_{j=1}^{N_{elem}} A^{(j)} = \sum_{j=1}^{N_{elem}} \Bigl( \sum_{k=0}^K a^{(j)}_k G_k \Bigr) \otimes F^{(j)}$.

In other words, we obtained a decomposition of $A$ in which the dependence of the FE matrices on the coefficients $a_k(\mathbf{x})$ is compensated by splitting the operator to elements.

In this paper, we consider preconditioners corresponding to an inner product $(\cdot,\cdot)_M$ defined on $V$ whose matrix representation (with respect to the same basis) is of a form analogous to (3.2), in particular

(3.3) $M = \sum_{k=0}^K \widetilde G_k \otimes \widetilde F_k = \sum_{j=1}^{N_{elem}} \sum_{k=0}^K \widetilde G_k \otimes \widetilde F^{(j)}_k = \sum_{j=1}^{N_{elem}} \sum_{k=0}^K \widetilde G_k \otimes \widetilde a^{(j)}_k F^{(j)} = \sum_{j=1}^{N_{elem}} \Bigl( \sum_{k=0}^K \widetilde a^{(j)}_k \widetilde G_k \Bigr) \otimes F^{(j)} =: \sum_{j=1}^{N_{elem}} M^{(j)}$,

where $\widetilde a^{(j)}_k$ and $\widetilde G_k \in \mathbb{R}^{N_P \times N_P}$ are such that the matrices $M^{(j)}$ are positive semidefinite for all $j = 1,\dots,N_{elem}$, and the resulting matrix $M$ is positive definite.

Note that while formally the structure of $M$ is the same as that of the original matrix, special choices of $\widetilde G_k$, $k = 0,\dots,K$, can simplify the solves with $M$ greatly in comparison with the solves with $A$. We will see in section 4 that many of the preconditioners used in practice are indeed of the form (3.3). Recall that since the preconditioner only differs from $A$ in the stochastic part, we have to have an efficient solver for the underlying deterministic problem.

The following theorem shows that the spectral equivalence between $A$ and $M$ can be obtained from the spectral equivalence between $\sum_{k=0}^K a^{(j)}_k G_k$ and $\sum_{k=0}^K \widetilde a^{(j)}_k \widetilde G_k$ on each element $\tau_j$, $j = 1,\dots,N_{elem}$. The obtained spectral bounds do not depend on the type and the number of the FE basis functions.

Theorem 3.2.
Let the matrices $A$ and $M$ be defined by (3.2) and (3.3), respectively, and let $0 < c \le \bar c$ be such that

(3.4) $c\, \mathbf{v}^T \Bigl( \sum_{k=0}^K \widetilde a^{(j)}_k \widetilde G_k \Bigr) \mathbf{v} \le \mathbf{v}^T \Bigl( \sum_{k=0}^K a^{(j)}_k G_k \Bigr) \mathbf{v} \le \bar c\, \mathbf{v}^T \Bigl( \sum_{k=0}^K \widetilde a^{(j)}_k \widetilde G_k \Bigr) \mathbf{v}$,

for all $\mathbf{v} \in \mathbb{R}^{N_P}$, $j = 1,\dots,N_{elem}$. Then also

(3.5) $c\, \mathbf{v}^T M \mathbf{v} \le \mathbf{v}^T A \mathbf{v} \le \bar c\, \mathbf{v}^T M \mathbf{v}$, for all $\mathbf{v} \in \mathbb{R}^{N_{FE} N_P}$.

The proof of Theorem 3.2 is based on the two following lemmas.
Lemma 3.3.
Let $(\cdot,\cdot)_A$ and $(\cdot,\cdot)_M$ be two inner products on a Hilbert space $V$. Let the inner products be composed as

(3.6) $(u,v)_A = \sum_{j=1}^N (u,v)_{A,j}$, $(u,v)_M = \sum_{j=1}^N (u,v)_{M,j}$, $u, v \in V$,

where $(\cdot,\cdot)_{A,j}$ and $(\cdot,\cdot)_{M,j}$, $j = 1,\dots,N$, are positive semidefinite bilinear forms on $V$. Let there exist two positive real constants $c$ and $\bar c$ such that the induced seminorms are uniformly equivalent in the following sense:

(3.7) $c\, (u,u)_{M,j} \le (u,u)_{A,j} \le \bar c\, (u,u)_{M,j}$, for all $u \in V$, $j = 1,\dots,N$.

Then the induced (cumulative) norms are also equivalent with the same constants, i.e.,

$0 < c \le \frac{(u,u)_A}{(u,u)_M} \le \bar c$, for all $u \in V$, $u \ne 0$.

Proof.
The proof follows directly from

$c\,(u,u)_M \overset{(3.6)}{=} c \sum_{j=1}^N (u,u)_{M,j} \overset{(3.7)}{\le} \sum_{j=1}^N (u,u)_{A,j} \overset{(3.6)}{=} (u,u)_A \overset{(3.7)}{\le} \bar c \sum_{j=1}^N (u,u)_{M,j} \overset{(3.6)}{=} \bar c\,(u,u)_M$.

Remark 3.4. In matrix form, Lemma 3.3 reads: if $A = \sum_j A^{(j)}$, $M = \sum_j M^{(j)}$ and $c\, \mathbf{v}^T M^{(j)} \mathbf{v} \le \mathbf{v}^T A^{(j)} \mathbf{v} \le \bar c\, \mathbf{v}^T M^{(j)} \mathbf{v}$ for all $\mathbf{v}$ and $j$, then $c\, \mathbf{v}^T M \mathbf{v} \le \mathbf{v}^T A \mathbf{v} \le \bar c\, \mathbf{v}^T M \mathbf{v}$ for all $\mathbf{v}$.

Lemma 3.5.
Let $X, Y \in \mathbb{R}^{N_1 \times N_1}$ be symmetric positive definite and $Z \in \mathbb{R}^{N_2 \times N_2}$ be symmetric positive semidefinite. Let

$c\, \mathbf{v}^T Y \mathbf{v} \le \mathbf{v}^T X \mathbf{v} \le \bar c\, \mathbf{v}^T Y \mathbf{v}$, for all $\mathbf{v} \in \mathbb{R}^{N_1}$,

hold for some positive real constants $c$ and $\bar c$. Then also

$c\, \mathbf{u}^T (Y \otimes Z)\, \mathbf{u} \le \mathbf{u}^T (X \otimes Z)\, \mathbf{u} \le \bar c\, \mathbf{u}^T (Y \otimes Z)\, \mathbf{u}$, for all $\mathbf{u} \in \mathbb{R}^{N_1 N_2}$.

Proof. If $Z$ is invertible, then the proof follows directly from

(3.8) $(Y \otimes Z)^{-1} (X \otimes Z) = (Y^{-1} \otimes Z^{-1})(X \otimes Z) = (Y^{-1} X) \otimes (Z^{-1} Z) = (Y^{-1} X) \otimes I$,

see, e.g., [19, Section 13.3] or [26, Section 4.1], and the fact that the spectra satisfy $\sigma((Y^{-1}X) \otimes I) = \sigma(Y^{-1}X)$. If $Z$ is singular, then $X \otimes Z$ as well as $Y \otimes Z$ are invertible on $\mathcal{R}(I \otimes (Z^\dagger Z))$ and zero on $\mathcal{R}(I \otimes (I - Z^\dagger Z))$, where $\dagger$ denotes the Moore–Penrose pseudoinverse. Combined with (3.8), the proof follows directly.
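Lemma 3.5 is easy to probe numerically; a minimal sketch (Python/NumPy, with a random SPD pencil and a deliberately singular $Z$; all names are ours) verifies that the Rayleigh quotients of the pair $(X \otimes Z,\ Y \otimes Z)$ stay within the spectral bounds of the pair $(X, Y)$:

```python
import numpy as np
rng = np.random.default_rng(0)

# Random SPD matrices X, Y and a singular PSD matrix Z.
B = rng.standard_normal((4, 4)); X = B @ B.T + 4 * np.eye(4)
B = rng.standard_normal((4, 4)); Y = B @ B.T + 4 * np.eye(4)
C = rng.standard_normal((3, 2)); Z = C @ C.T        # rank 2, singular

# Spectral bounds c, cbar of the pencil (X, Y):
ev = np.linalg.eigvals(np.linalg.solve(Y, X)).real
c, cbar = ev.min(), ev.max()

# Rayleigh quotients of (X kron Z, Y kron Z) stay within [c, cbar]:
for _ in range(1000):
    u = rng.standard_normal(12)
    num = u @ np.kron(X, Z) @ u
    den = u @ np.kron(Y, Z) @ u
    assert den < 1e-12 or c - 1e-9 <= num / den <= cbar + 1e-9
print(c, cbar)
```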
We are now ready to prove Theorem 3.2.
Proof of Theorem 3.2. Under the assumption (2.12) and the assumption on the preconditioning matrix, the matrices $\sum_{k=0}^K a^{(j)}_k G_k$ and $\sum_{k=0}^K \widetilde a^{(j)}_k \widetilde G_k$, $j = 1,\dots,N_{elem}$, are positive definite, while the $F^{(j)}$, $j = 1,\dots,N_{elem}$, are positive semidefinite. From (3.4) and Lemma 3.5, we obtain uniform spectral equivalence between $A^{(j)}$ and $M^{(j)}$ on each element $\tau_j$. Applying Lemma 3.3 to the seminorms defined by $A^{(j)}$ and $M^{(j)}$ finishes the proof.

Let us demonstrate on an example how $c$ and $\bar c$ are obtained using Theorem 3.2.

Example 3.6. Let $A$ be the matrix from Example 2.1 and let the preconditioner $M$ be block-diagonal, i.e.,

$M = \sum_{k=0}^K \widetilde G_k \otimes F_k := G_0 \otimes F_0$.

For the element $\tau_j$, $j = 1,\dots,N_{elem}$, we have

$\sum_{k=0}^K a^{(j)}_k G_k = \begin{pmatrix} a^{(j)}_0 & \frac{1}{\sqrt 3} a^{(j)}_1 & 0 \\ \frac{1}{\sqrt 3} a^{(j)}_1 & a^{(j)}_0 & \frac{2}{\sqrt{15}} a^{(j)}_1 \\ 0 & \frac{2}{\sqrt{15}} a^{(j)}_1 & a^{(j)}_0 \end{pmatrix}$ and $\sum_{k=0}^K \widetilde a^{(j)}_k \widetilde G_k = \begin{pmatrix} a^{(j)}_0 & 0 & 0 \\ 0 & a^{(j)}_0 & 0 \\ 0 & 0 & a^{(j)}_0 \end{pmatrix}$.

For $|a^{(j)}_1| \le a^{(j)}_0$, $j = 1,\dots,N_{elem}$, it is easy to prove that these matrices are spectrally equivalent with $c = 1 - \frac{\sqrt{15}}{5}$ and $\bar c = 1 + \frac{\sqrt{15}}{5}$. Thus, using Theorem 3.2, also

$\Bigl( 1 - \frac{\sqrt{15}}{5} \Bigr)\, \mathbf{v}^T M \mathbf{v} \le \mathbf{v}^T A \mathbf{v} \le \Bigl( 1 + \frac{\sqrt{15}}{5} \Bigr)\, \mathbf{v}^T M \mathbf{v}$, for all $\mathbf{v} \in \mathbb{R}^{N_{FE} N_P}$,

and

(3.9) $\kappa(M^{-1}A) \le 4 + \sqrt{15} \approx 7.873$.

In other words, if $|a_1(\mathbf{x})| \le a_0(\mathbf{x})$, i.e., $\mu = 1$ in (2.14), the block-diagonal preconditioning of $A$ from (2.5) yields the condition number (3.9). If $|a_1(\mathbf{x})| \le \frac12 a_0(\mathbf{x})$, i.e., $\mu = \frac12$ in (2.14), we analogously get

(3.10) $\kappa(M^{-1}A) \le \frac{23 + 4\sqrt{15}}{17} \approx 2.264$.

Using the classical assumption (2.13) and not employing the information about the spectrum of $G_1$, we obtain for $|a_1(\mathbf{x})| \le \frac12 a_0(\mathbf{x})$ the estimate $\kappa(M^{-1}A) \le 3$, while for $|a_1(\mathbf{x})| \le a_0(\mathbf{x})$, the term $\kappa(M^{-1}A)$ cannot be bounded. In [24], one of the pioneering papers on spectral estimates of preconditioned stochastic Galerkin matrices, spectral bounds for specific $s$ and $K$ were derived, which for $K = 1$, $\mu = 1$ and $\mu = \frac12$ result in (3.9) and (3.10), respectively. These bounds, however, do not reflect possible local properties of the functions $a_k$, and thus for $K \ge 2$, they are still less accurate than our estimates, as will be shown in subsection 4.4.

Note that the approach for obtaining spectral equivalence of the stochastic Galerkin matrices presented in this section and summarized in Theorem 3.2 is independent of the choice of the approximation spaces. In the following section, we apply these results to the approximation spaces introduced in subsection 2.1 and to some inner products of the form (3.3) defined on them to obtain new spectral bounds of the related preconditioned system matrix $M^{-1}A$.
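Before moving on, the constants of Example 3.6 can be reproduced directly from the $3 \times 3$ Legendre Jacobi matrix; a minimal check (Python/NumPy, our naming) of (3.9) and (3.10):

```python
import numpy as np

# Legendre Jacobi matrix G_{3,1}, cf. (2.8) and Table 1.
G1 = np.array([[0.0, 1/np.sqrt(3), 0.0],
               [1/np.sqrt(3), 0.0, 2/np.sqrt(15)],
               [0.0, 2/np.sqrt(15), 0.0]])

for mu in (1.0, 0.5):
    # c and cbar are the extreme eigenvalues of I_3 + mu*G_{3,1}.
    ev = np.linalg.eigvalsh(np.eye(3) + mu * G1)
    c, cbar = ev.min(), ev.max()
    print(mu, c, cbar, cbar / c)
# mu = 1.0 : kappa bound = 4 + sqrt(15)        ~ 7.873
# mu = 0.5 : kappa bound = (23 + 4*sqrt(15))/17 ~ 2.264
```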
4. Preconditioning and spectral bounds.
In the following part, we present three inner products $(\cdot,\cdot)_M$ on $V$ and their matrix representations $M$ that can serve as preconditioners for $A$. For each of them, we compute the constants $c$ and $\bar c$ defined in (3.5), which bound the spectrum of the preconditioned matrices $M^{-1}A$. The inner products and the corresponding matrices $M$ presented in this section should only serve as examples. Other preconditioning matrices of the form (3.3) can be studied and the corresponding bounds to the resulting spectra can be derived analogously.

4.1. Mean-based preconditioning. Due to (2.14), we can approximate the original inner product by an inner product where the function $a(\mathbf{x},\xi)$ is substituted by $a_0(\mathbf{x})$, representing for centralized distributions of $\xi$ the mean value of $a(\mathbf{x},\xi)$. The mean-based inner product is defined as

$(u,v)_M = \int_{\mathbb{R}^K} \int_D a_0(\mathbf{x})\, \nabla u(\mathbf{x},\xi) \cdot \nabla v(\mathbf{x},\xi)\, \rho(\xi)\, d\mathbf{x}\, d\xi$.

The corresponding preconditioning matrix

(4.1) $M = G_0 \otimes F_0$

is then block-diagonal with diagonal blocks of size $N_{FE} \times N_{FE}$. This type of preconditioning can be used both for the complete and the tensor product polynomials; see, e.g., [24, 25, 30, 35]. We first derive the bounds $c$ and $\bar c$ for the tensor product polynomials. These bounds also apply to the complete polynomials because $V^{C}_{s} \subset V^{TP}_{s,\dots,s}$.

Example 4.1. For $K = 2$ and $s_1 = s_2 = 3$, cf. Examples 2.2 and 2.3, the non-zero structures of the mean-based preconditioners are

$M^{TP} = \operatorname{diag}(X, X, X, X, X, X, X, X, X)$, $M^{C} = \operatorname{diag}(X, X, X, X, X, X)$,

where $X$ stands for a non-zero block of size $N_{FE} \times N_{FE}$.
Lemma 4.2.
Under assumption (2.14) and for $M$ defined by (4.1), the constants $c$ and $\bar c$ defined in (3.5) can be obtained as

(4.2) $c = 1 - \mu\, \lambda_{\max}(G_{s,1})$ and $\bar c = 1 + \mu\, \lambda_{\max}(G_{s,1})$,

where $s = \max(s_1,\dots,s_K)$, and $\lambda_{\max}(G_{s,1})$ is the largest eigenvalue of $G_{s,1}$ defined in (2.8).

Proof. The proof consists of several rather straightforward steps, which we however prefer to give in full detail, since an analogous technique will be used in the proofs of some of the subsequent lemmas. Using Theorem 3.2, we only need to prove that for every $\mathbf{v} \in \mathbb{R}^{N_P}$, it holds that

(4.3) $c\, a^{(j)}_0\, \mathbf{v}^T G_0 \mathbf{v} \le \mathbf{v}^T \Bigl( a^{(j)}_0 G_0 + \sum_{k=1}^K a^{(j)}_k G_k \Bigr) \mathbf{v} \le \bar c\, a^{(j)}_0\, \mathbf{v}^T G_0 \mathbf{v}$, $j = 1,\dots,N_{elem}$.

Since $G_{s,0} = I_s$, equations (4.2) imply

$c\, \mathbf{v}^T G_{s,0} \mathbf{v} \le \mathbf{v}^T G_{s,0} \mathbf{v} \pm \mu\, \mathbf{v}^T G_{s,1} \mathbf{v} \le \bar c\, \mathbf{v}^T G_{s,0} \mathbf{v}$, $\mathbf{v} \in \mathbb{R}^s$.

Due to the interlacing property of the eigenvalues of Jacobi matrices, we immediately obtain

$c\, \mathbf{v}^T G_{s_k,0} \mathbf{v} \le \mathbf{v}^T G_{s_k,0} \mathbf{v} \pm \mu\, \mathbf{v}^T G_{s_k,1} \mathbf{v} \le \bar c\, \mathbf{v}^T G_{s_k,0} \mathbf{v}$, $\mathbf{v} \in \mathbb{R}^{s_k}$,

for all $s_k \le s$. Using the tensor structure (2.9) of the matrices $G_k$, we get

$c\, \mathbf{v}^T G_0 \mathbf{v} \le \mathbf{v}^T G_0 \mathbf{v} \pm \mu\, \mathbf{v}^T G_k \mathbf{v} \le \bar c\, \mathbf{v}^T G_0 \mathbf{v}$, $k = 1,\dots,K$, $\mathbf{v} \in \mathbb{R}^{N_P}$.

Multiplying by $|a^{(j)}_k|$, we obtain

$c\, |a^{(j)}_k|\, \mathbf{v}^T G_0 \mathbf{v} \le |a^{(j)}_k|\, \mathbf{v}^T G_0 \mathbf{v} + \mu\, a^{(j)}_k\, \mathbf{v}^T G_k \mathbf{v} \le \bar c\, |a^{(j)}_k|\, \mathbf{v}^T G_0 \mathbf{v}$, $k = 1,\dots,K$,

which, taking the sum over $k$, becomes

(4.4) $c\, \mathbf{v}^T \Bigl( \sum_{k=1}^K |a^{(j)}_k| G_0 \Bigr) \mathbf{v} \le \mathbf{v}^T \Bigl( \sum_{k=1}^K |a^{(j)}_k| G_0 + \mu \sum_{k=1}^K a^{(j)}_k G_k \Bigr) \mathbf{v} \le \bar c\, \mathbf{v}^T \Bigl( \sum_{k=1}^K |a^{(j)}_k| G_0 \Bigr) \mathbf{v}$.

Due to (2.14) and the fact that $c \le 1 \le \bar c$, it also holds that

(4.5) $c\, \mathbf{v}^T \Bigl( \mu a^{(j)}_0 - \sum_{k=1}^K |a^{(j)}_k| \Bigr) G_0 \mathbf{v} \le \mathbf{v}^T \Bigl( \mu a^{(j)}_0 - \sum_{k=1}^K |a^{(j)}_k| \Bigr) G_0 \mathbf{v} \le \bar c\, \mathbf{v}^T \Bigl( \mu a^{(j)}_0 - \sum_{k=1}^K |a^{(j)}_k| \Bigr) G_0 \mathbf{v}$.

Adding (4.4) and (4.5), we get

$c\, \mu\, a^{(j)}_0\, \mathbf{v}^T G_0 \mathbf{v} \le \mathbf{v}^T \Bigl( \mu a^{(j)}_0 G_0 + \mu \sum_{k=1}^K a^{(j)}_k G_k \Bigr) \mathbf{v} \le \bar c\, \mu\, a^{(j)}_0\, \mathbf{v}^T G_0 \mathbf{v}$.

Dividing by $\mu > 0$, we obtain the desired inequality (4.3).
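The bound of Lemma 4.2 can be evaluated with a few lines of code; a small sketch (Python/NumPy, our naming, with Legendre polynomials as an example) computes $\lambda_{\max}(G_{s,1})$ as the largest eigenvalue of the Jacobi matrix and returns the implied condition number bound $\bar c / c$:

```python
import numpy as np

def legendre_jacobi(s):
    """Jacobi matrix G_{s,1} of the normalized Legendre polynomials."""
    n = np.arange(1, s)
    offdiag = np.sqrt(n**2 / ((2*n - 1.0) * (2*n + 1.0)))
    return np.diag(offdiag, 1) + np.diag(offdiag, -1)

def mean_based_kappa_bound(s, mu):
    """(1 + mu*lam)/(1 - mu*lam) with lam = lambda_max(G_{s,1}), cf. (4.2)."""
    lam = np.max(np.linalg.eigvalsh(legendre_jacobi(s)))
    if mu * lam >= 1:
        raise ValueError("bound requires mu * lambda_max < 1")
    return (1 + mu * lam) / (1 - mu * lam)

print([round(mean_based_kappa_bound(s, 0.5), 3) for s in range(2, 7)])
```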
Since the eigenvalues of the Jacobi matrix $G_{s,1}$ are the roots of the polynomial $\psi_s$, the spectral bounds $c$ and $\bar c$ can be obtained directly from the maximal root of the polynomial $\psi_s$, which we denote by $\lambda_{\max}(\psi_s)$. Thanks to this relation, we can formulate the following corollary purely in terms of these extremal roots.

Corollary 4.3.
Let (2.14) be satisfied and let the matrix $M$ represent the mean-based preconditioning (4.1). Then

(4.6) $\kappa(M^{-1}A) \le \frac{1 + \mu\, \lambda_{\max}(\psi_s)}{1 - \mu\, \lambda_{\max}(\psi_s)}$,

where for tensor product polynomials, we define $s = \max(s_1,\dots,s_K)$.

Note that if $a_0$ does not significantly dominate the other terms in the sense of (2.14), we can expect the mean-based preconditioning to perform rather poorly, see also [24]. This is reflected in the bound (4.6) by the denominator $1 - \mu\, \lambda_{\max}(\psi_s)$ being close to zero.

Remark 4.4. Spectral bounds $0 < c \le \bar c$ can also be used for a posteriori estimates of the energy norm $\mathbf{e}^T A \mathbf{e}$ of the algebraic error. Let $\mathbf{e} := \mathbf{x} - \widetilde{\mathbf{x}}$ be the error of an inexact solution $\widetilde{\mathbf{x}}$ of the linear system $A\mathbf{x} = \mathbf{b}$ and let

$c\, \mathbf{v}^T M \mathbf{v} \le \mathbf{v}^T A \mathbf{v} \le \bar c\, \mathbf{v}^T M \mathbf{v}$

hold for all $\mathbf{v}$ of appropriate size. If solutions of systems with $M$ are easily accessible, then, due to $\mathbf{e}^T A \mathbf{e} = \mathbf{r}^T A^{-1} \mathbf{r}$, $\mathbf{r} := A\mathbf{e}$, the bounds to the error can be obtained efficiently as

$\frac{1}{\bar c}\, \mathbf{r}^T M^{-1} \mathbf{r} \le \mathbf{r}^T A^{-1} \mathbf{r} \le \frac{1}{c}\, \mathbf{r}^T M^{-1} \mathbf{r}$;

see also [1, Theorem 5.3].

4.2. Preconditioning by truncated expansion of $a$. Instead of the block-diagonal matrix (4.1), we can consider a block-diagonal matrix with larger blocks. This strategy can be advantageous especially if only a small number of parallel processes can be employed. If we consider the tensor product polynomials, we can obtain a new inner product by omitting the last term of the expansion (2.2) of $a(\mathbf{x},\xi)$, i.e.,

$(u,v)_M = \int_{\mathbb{R}^K} \int_D \Bigl( a_0(\mathbf{x}) + \sum_{k=1}^{K-1} a_k(\mathbf{x})\, \xi_k \Bigr)\, \nabla u(\mathbf{x},\xi) \cdot \nabla v(\mathbf{x},\xi)\, \rho(\xi)\, d\mathbf{x}\, d\xi$.

The corresponding preconditioning matrix

(4.7) $M = \sum_{k=0}^{K-1} G_k \otimes F_k$

is then block-diagonal with diagonal blocks of size $N_{FE} \cdot \prod_{k=1}^{K-1} s_k \times N_{FE} \cdot \prod_{k=1}^{K-1} s_k$.
Example 4.5. For $K = 2$ and $s_1 = s_2 = 3$ as in Example 2.2, the preconditioner (4.7) has the non-zero block structure

$M^{TP} = \operatorname{diag}(T, T, T)$ with $T = \begin{pmatrix} X & X & 0 \\ X & X & X \\ 0 & X & X \end{pmatrix}$,

where $X$ stands for a non-zero block of size $N_{FE} \times N_{FE}$.

Note that one can consider many other truncation schemes, where a various number of terms is truncated from the expansion (2.2). The mean-based preconditioning and the preconditioning (4.7) represent only two extreme cases of this strategy. After a proper reordering, either of the expansion (2.2) or of the resulting matrix $A$, we will again get a block-diagonal preconditioner $M$. The efficiency of omitting a particular term depends on the specific setting (the corresponding stochastic approximation space).

It is also possible to apply this technique to complete polynomials. However, if we use the natural ordering of complete polynomials, the preconditioning matrix will not have a block-diagonal form. If we consider complete polynomials ordered as in (2.11), the resulting bound will be the same as in the case of tensor-product polynomials.

Lemma 4.6.
Under assumption (2.14) and for $M$ defined by (4.7), the constants $c$ and $\bar c$ defined in (3.5) can be obtained as

(4.8) $c = 1 - \mu\, \lambda_{\max}(G_{s_K,1})$ and $\bar c = 1 + \mu\, \lambda_{\max}(G_{s_K,1})$,

where $\lambda_{\max}(G_{s_K,1})$ is the largest eigenvalue of $G_{s_K,1}$ defined in (2.8).

Proof. Using Theorem 3.2, we only need to prove for every $\mathbf{v} \in \mathbb{R}^{N_P}$

(4.9) $c\, \mathbf{v}^T \Bigl( \sum_{k=0}^{K-1} a^{(j)}_k G_k \Bigr) \mathbf{v} \le \mathbf{v}^T \Bigl( \sum_{k=0}^{K} a^{(j)}_k G_k \Bigr) \mathbf{v} \le \bar c\, \mathbf{v}^T \Bigl( \sum_{k=0}^{K-1} a^{(j)}_k G_k \Bigr) \mathbf{v}$, $j = 1,\dots,N_{elem}$.

Analogously to Lemma 4.2, equations (4.8) imply

(4.10) $c\, |a^{(j)}_K|\, \mathbf{v}^T G_0 \mathbf{v} \le |a^{(j)}_K|\, \mathbf{v}^T G_0 \mathbf{v} + \mu\, a^{(j)}_K\, \mathbf{v}^T G_K \mathbf{v} \le \bar c\, |a^{(j)}_K|\, \mathbf{v}^T G_0 \mathbf{v}$.

From the definition of $\mu$, we have

(4.11) $\mathbf{v}^T (G_0 \pm \mu\, G_k)\, \mathbf{v} \ge 0$, $k = 1,\dots,K$,

which, multiplying by $|a^{(j)}_k|$ and taking the sum over $k = 1,\dots,K-1$, yields

(4.12) $\mathbf{v}^T \Bigl( \sum_{k=1}^{K-1} |a^{(j)}_k| G_0 + \mu \sum_{k=1}^{K-1} a^{(j)}_k G_k \Bigr) \mathbf{v} \ge 0$.

Using $c \le 1 \le \bar c$, we get from (4.12)

(4.13) $c\, \mathbf{v}^T \Bigl( \sum_{k=1}^{K-1} |a^{(j)}_k| G_0 + \mu \sum_{k=1}^{K-1} a^{(j)}_k G_k \Bigr) \mathbf{v} \le \mathbf{v}^T \Bigl( \sum_{k=1}^{K-1} |a^{(j)}_k| G_0 + \mu \sum_{k=1}^{K-1} a^{(j)}_k G_k \Bigr) \mathbf{v} \le \bar c\, \mathbf{v}^T \Bigl( \sum_{k=1}^{K-1} |a^{(j)}_k| G_0 + \mu \sum_{k=1}^{K-1} a^{(j)}_k G_k \Bigr) \mathbf{v}$.

Adding (4.10), (4.13), and (4.5), we finally obtain

(4.14) $c\, \mathbf{v}^T \Bigl( \mu a^{(j)}_0 G_0 + \mu \sum_{k=1}^{K-1} a^{(j)}_k G_k \Bigr) \mathbf{v} \le \mathbf{v}^T \Bigl( \mu a^{(j)}_0 G_0 + \mu \sum_{k=1}^{K} a^{(j)}_k G_k \Bigr) \mathbf{v} \le \bar c\, \mathbf{v}^T \Bigl( \mu a^{(j)}_0 G_0 + \mu \sum_{k=1}^{K-1} a^{(j)}_k G_k \Bigr) \mathbf{v}$,

which, divided by $\mu$, yields the desired inequality (4.9).

Using this lemma, the spectral bounds $c$ and $\bar c$ can be obtained directly from the roots of the polynomial $\psi_{s_K}$, similarly as in the previous section.

Corollary 4.7.
Let (2.14) be satisfied and let the matrix $M$ represent the $(K-1)$-term expansion preconditioning (4.7). Then

$\kappa(M^{-1}A) \le \frac{1 + \mu\, \lambda_{\max}(\psi_{s_K})}{1 - \mu\, \lambda_{\max}(\psi_{s_K})}$.

4.3. Splitting-based preconditioning. Another inner product can be obtained by splitting the approximation space $V$ into two complementary subspaces, $V = U \oplus W$, $U \cap W = 0$, so that any $v \in V$ can be uniquely decomposed as $v = v_U + v_W$, $v_U \in U$, $v_W \in W$. Using this decomposition, we can define the new inner product component-wise, i.e.,

(4.15) $(u,v)_M = (u_U, v_U)_A + (u_W, v_W)_A$.

For the space $V^{TP}_{s_1,\dots,s_K}$ of the tensor product polynomials of the degrees $s_1 - 1, \dots, s_K - 1$, we can use the splitting $U = V^{TP}_{s_1,\dots,s_{K-1},s_K - 1}$ and $W$ such that $V^{TP}_{s_1,\dots,s_K} = U \oplus W$, i.e., $W$ contains the polynomials of $V^{TP}_{s_1,\dots,s_K}$ of degree exactly $s_K - 1$ in $\xi_K$. The corresponding preconditioning matrix $M$ has then a two-by-two block-diagonal form, see, e.g., [30, 32], and can be obtained as

(4.16) $M = \sum_{k=0}^{K-1} G_k \otimes F_k + \widetilde G_K \otimes F_K$, $\widetilde G_K = \widetilde G_{s_K,1} \otimes G_{s_{K-1},0} \otimes \cdots \otimes G_{s_2,0} \otimes G_{s_1,0}$,

where the matrix $\widetilde G_{s_K,1}$ is obtained from $G_{s_K,1}$ by annihilating the very last elements in both the sub- and super-diagonal. For distributions with $\alpha_n = 0$, $n = 0, 1, 2, \dots$, in the recurrence (2.7), this matrix satisfies

$\widetilde G_{s_K,1} = \begin{pmatrix} G_{s_K - 1,1} & 0 \\ 0 & 0 \end{pmatrix}$.

For the space $V^{C}_{s}$ of the complete polynomials of total degree at most $s - 1$, we can use the splitting $U = V^{C}_{s-1}$ and $W = W^{C}_{s}$, where $W^{C}_{s}$ is the span of the complete polynomials of total degree exactly $s - 1$. Similarly to the previous case, the corresponding preconditioning matrix $M$ has then a two-by-two block-diagonal form and can be obtained as

(4.17) $M = \sum_{k=0}^{K} \widetilde G_k \otimes F_k = G_0 \otimes F_0 + \sum_{k=1}^{K} \widetilde G_k \otimes F_k$,

where the matrices $\widetilde G_k$ coincide with $G_k$ up to the last sub- and super-diagonal blocks, which couple the polynomials of total degrees $s-2$ and $s-1$ and which are annihilated in $\widetilde G_k$.

If $U$ and $W$ are close to orthogonal, the preconditioning based on the splitting (4.15) enables us to estimate the error reduction when the approximation space $U$ is enriched by $W$ in the Galerkin method. This can be exploited in adaptive algorithms, where $W$ is sometimes called the 'detail' space; see Remark 4.13 and, e.g., [5, 7, 27].

Example 4.8. For $K = 2$ and $s_1 = s_2 = 3$ as in Examples 2.2 and 2.3, the preconditioners have the structures $M^{TP} = \operatorname{diag}(M_U, M_W)$, where $M_U$ is the leading $6 \times 6$-block principal submatrix of (2.10) and $M_W = B$ is its trailing $3 \times 3$-block diagonal part, and $M^{C} = \operatorname{diag}(M_U, M_W)$, where $M_U$ and $M_W$ are the leading $3 \times 3$-block and the trailing $3 \times 3$-block principal submatrices, respectively, of the reordered matrix in Example 2.3.

As we will see later in this section, the efficiency of the splitting-based preconditioner, both for the tensor-product and the complete polynomials, is determined by the spectral properties of the matrix

(4.18) $H^{\pm}_s = \Bigl( I_s \pm \mu \begin{pmatrix} G_{s-1,1} & 0 \\ 0 & 0 \end{pmatrix} \Bigr)^{-1} (I_s \pm \mu\, G_{s,1})$,

where either the $+$ or the $-$ sign is considered. We first investigate the spectrum of $H^{\pm}_s$. In the subsequent lemma, we link the spectral properties of $M^{-1}A$ and that of $H^{\pm}_s$.

Lemma 4.9.
At most two eigenvalues of $H^{\pm}_s$ defined in (4.18) are different from unity. These two eigenvalues do not depend on the sign considered and can be obtained as

$\lambda_{\min} = 1 - \sqrt{1 - d_s}$, $\lambda_{\max} = 1 + \sqrt{1 - d_s}$, where $d_s = \frac{1}{\mathbf{e}_s^T (I_s + \mu\, G_{s,1})^{-1} \mathbf{e}_s} < 1$.

Finally, for a fixed $s$, $d_s$ increases with $\mu \ge 0$ decreasing.

Proof. Since

(4.19) $(I_s \pm \mu\, G_{s,1}) = \Bigl( I_s \pm \mu \begin{pmatrix} G_{s-1,1} & 0 \\ 0 & 0 \end{pmatrix} \Bigr) \pm \mu \sqrt{\beta_{s-1}}\, (\mathbf{e}_{s-1} \mathbf{e}_s^T + \mathbf{e}_s \mathbf{e}_{s-1}^T)$,

i.e., the two matrices differ by a rank-2 matrix, the first statement follows directly. Further, using (4.19), we have

(4.20) $\Bigl( I_s \pm \mu \begin{pmatrix} G_{s-1,1} & 0 \\ 0 & 0 \end{pmatrix} \Bigr)^{-1} (I_s \pm \mu\, G_{s,1}) = I_s \pm \mu \sqrt{\beta_{s-1}} \Bigl( I_s \pm \mu \begin{pmatrix} G_{s-1,1} & 0 \\ 0 & 0 \end{pmatrix} \Bigr)^{-1} (\mathbf{e}_{s-1} \mathbf{e}_s^T + \mathbf{e}_s \mathbf{e}_{s-1}^T) = I_s \pm \mu \sqrt{\beta_{s-1}} \begin{pmatrix} (I_{s-1} \pm \mu\, G_{s-1,1})^{-1} & 0 \\ 0 & 1 \end{pmatrix} (\mathbf{e}_{s-1} \mathbf{e}_s^T + \mathbf{e}_s \mathbf{e}_{s-1}^T)$.

Therefore, to obtain the non-unit eigenvalues, it suffices to investigate the last two-by-two diagonal block of (4.20), which has the form

$\begin{pmatrix} 1 & \pm \mu \sqrt{\beta_{s-1}}\, \mathbf{e}_{s-1}^T (I_{s-1} \pm \mu\, G_{s-1,1})^{-1} \mathbf{e}_{s-1} \\ \pm \mu \sqrt{\beta_{s-1}} & 1 \end{pmatrix}$.

The eigenvalues therefore satisfy

$\lambda_{\min} = 1 - \sqrt{\mu^2 \beta_{s-1}\, \mathbf{e}_{s-1}^T (I_{s-1} \pm \mu\, G_{s-1,1})^{-1} \mathbf{e}_{s-1}}$, $\lambda_{\max} = 1 + \sqrt{\mu^2 \beta_{s-1}\, \mathbf{e}_{s-1}^T (I_{s-1} \pm \mu\, G_{s-1,1})^{-1} \mathbf{e}_{s-1}}$.

Defining $d_j^{-1} = \mathbf{e}_j^T (I_j + \mu\, G_{j,1})^{-1} \mathbf{e}_j$ and applying the recursive formula for the last element of the inverse of a symmetric tridiagonal matrix, see [22, Theorem 2.3], we get

(4.21) $d_1 = 1$, $d_j = 1 - \frac{\mu^2 \beta_{j-1}}{d_{j-1}}$, $j = 2, 3, \dots$,

and analogously for the $-$ sign, from which the second statement is obtained. The monotonicity of $d_s$ in $\mu$ follows from the discussion below, in particular (4.27).

Remark 4.10. Since the extreme eigenvalues of $H^{\pm}_s$ are independent of the sign, to simplify the notation, we further work with the matrix $H_s := H^{+}_s$.

Lemma 4.11.
Assume the tensor product polynomials. Under assumption (2.14) and for $M$ defined by (4.16), the constants $c$ and $\bar c$ defined in (3.5) can be obtained as

(4.22) $c = \lambda_{\min}(H_{s_K})$ and $\bar c = \lambda_{\max}(H_{s_K})$,

where $\lambda_{\min}(H_{s_K})$ and $\lambda_{\max}(H_{s_K})$ are the smallest and the largest eigenvalue, respectively, of $H_{s_K}$ defined in (4.18).
Proof.
Using Theorem 3.2, we only need to prove for every $\mathbf{v} \in \mathbb{R}^{N_P}$, $j = 1,\dots,N_{elem}$,

(4.23) $c\, \mathbf{v}^T \Bigl( \sum_{k=0}^{K-1} a^{(j)}_k G_k + a^{(j)}_K \widetilde G_K \Bigr) \mathbf{v} \le \mathbf{v}^T \Bigl( \sum_{k=0}^{K} a^{(j)}_k G_k \Bigr) \mathbf{v} \le \bar c\, \mathbf{v}^T \Bigl( \sum_{k=0}^{K-1} a^{(j)}_k G_k + a^{(j)}_K \widetilde G_K \Bigr) \mathbf{v}$.

From (4.22) and Remark 4.10, using the same technique as in Lemma 4.6, we get

$c\, \mathbf{v}^T \bigl( |a^{(j)}_K| G_0 + \mu\, a^{(j)}_K \widetilde G_K \bigr) \mathbf{v} \le \mathbf{v}^T \bigl( |a^{(j)}_K| G_0 + \mu\, a^{(j)}_K G_K \bigr) \mathbf{v} \le \bar c\, \mathbf{v}^T \bigl( |a^{(j)}_K| G_0 + \mu\, a^{(j)}_K \widetilde G_K \bigr) \mathbf{v}$.

Proceeding further as in (4.11)–(4.14) of the proof of Lemma 4.6, we obtain the desired inequality (4.23).
Lemma 4.12.
Assume the complete polynomials. Under assumption (2.14) and for $M$ defined by (4.17), the constants $c$ and $\bar c$ defined in (3.5) can be obtained as

$c = \min_{t=1,\dots,s} \lambda_{\min}(H_t)$ and $\bar c = \max_{t=1,\dots,s} \lambda_{\max}(H_t)$,

where $\lambda_{\min}(H_t)$ and $\lambda_{\max}(H_t)$ are the smallest and the largest eigenvalue, respectively, of $H_t$ defined in (4.18).

Proof. Using Theorem 3.2, we only need to prove for every $\mathbf{v} \in \mathbb{R}^{N_P}$

(4.24) $c\, \mathbf{v}^T \Bigl( \sum_{k=0}^K a^{(j)}_k \widetilde G_k \Bigr) \mathbf{v} \le \mathbf{v}^T \Bigl( \sum_{k=0}^K a^{(j)}_k G_k \Bigr) \mathbf{v} \le \bar c\, \mathbf{v}^T \Bigl( \sum_{k=0}^K a^{(j)}_k \widetilde G_k \Bigr) \mathbf{v}$, $j = 1,\dots,N_{elem}$.

It suffices to show

(4.25) $c\, \mathbf{v}^T (G_0 \pm \mu\, \widetilde G_k)\, \mathbf{v} \le \mathbf{v}^T (G_0 \pm \mu\, G_k)\, \mathbf{v} \le \bar c\, \mathbf{v}^T (G_0 \pm \mu\, \widetilde G_k)\, \mathbf{v}$, $k = 1,\dots,K$,

from which (4.24) can be obtained analogously as in Lemmas 4.2, 4.6, and 4.11. Since now the matrices do not have the tensor product form, to prove (4.25) we cannot proceed in the same way as in the previous lemmas. The constants $c$ and $\bar c$ are obtained as the extreme eigenvalues of the generalized eigenvalue problem

(4.26) $(I_{N^{C}_{s}} \pm \mu\, G_k)\, \mathbf{v} = \lambda\, (I_{N^{C}_{s}} \pm \mu\, \widetilde G_k)\, \mathbf{v}$,

where $N^{C}_{t} = \binom{K+t-1}{K}$ denotes the size of the basis of $K$-variate complete polynomials of degree at most $t-1$. Consider first $k = 1$. Let $G^{R1}_1$ and $\widetilde G^{R1}_1$ be the matrices obtained from $G_1$ and $\widetilde G_1$, respectively, by reordering their rows and columns in such a manner that the corresponding basis of $K$-variate orthonormal polynomials is ordered anti-lexicographically as, for example, in (2.11) (instead of the ordering according to the growing total degree, which is used so far). Then both $G^{R1}_1$ and $\widetilde G^{R1}_1$ become block-diagonal; the diagonal blocks of $I_{N^{C}_{s}} \pm \mu\, G^{R1}_1$ are the tridiagonal matrices $I_t \pm \mu\, G_{t,1}$ of variable sizes $t \times t$, $t \in \{1,\dots,s\}$. The corresponding diagonal blocks of $\widetilde G^{R1}_1$ are equal to those of $G^{R1}_1$ up to the last sub- and super-diagonal elements, which are annihilated in $\widetilde G^{R1}_1$. The number and the sizes of the diagonal blocks depend on $K$ and $s$. For $k \ne 1$, we analogously choose the ordering where the $k$-th index changes the fastest. Thus, using Remark 4.10, the eigenvalue problem (4.26) reduces to a number of independent eigenvalue problems with the matrices $H_t$ of sizes $t = 1,\dots,s$ defined in (4.18).

To obtain the actual spectral bounds $c$ and $\bar c$ in Lemma 4.11 and in Lemma 4.12, we need to evaluate $d_s^{-1} = \mathbf{e}_s^T (I_s + \mu\, G_{s,1})^{-1} \mathbf{e}_s$. This can be done recursively, using (4.21), but it may be advantageous to exploit the relation between Jacobi matrices and the Gauss–Christoffel quadrature, as we describe now. Defining

$f(t) = \frac{1}{1 + \mu t}$,

denoting by $J_s$ the reversal (anti-identity) matrix $J_s = (\mathbf{e}_s, \mathbf{e}_{s-1}, \dots, \mathbf{e}_1)$, and setting $\widehat G_{s,1} = J_s G_{s,1} J_s^{-1}$, we can rewrite

$d_s^{-1} = \mathbf{e}_s^T (I_s + \mu\, G_{s,1})^{-1} \mathbf{e}_s = \mathbf{e}_s^T f(G_{s,1})\, \mathbf{e}_s = \mathbf{e}_1^T J_s f(G_{s,1}) J_s^{-1} \mathbf{e}_1 = \mathbf{e}_1^T f(\widehat G_{s,1})\, \mathbf{e}_1$.

Since $\widehat G_{s,1}$ is again a Jacobi matrix, $\mathbf{e}_1^T f(\widehat G_{s,1})\, \mathbf{e}_1$ can be computed as the Gauss–Christoffel quadrature of the integral of $f$, i.e.,

$\mathbf{e}_1^T f(\widehat G_{s,1})\, \mathbf{e}_1 = \sum_{j=1}^s \widehat\omega^{(s)}_j f(\widehat\lambda^{(s)}_j) = \sum_{j=1}^s \frac{\widehat\omega^{(s)}_j}{1 + \mu\, \widehat\lambda^{(s)}_j}$,

where $\widehat\lambda^{(s)}_j$ and $\widehat\omega^{(s)}_j$ are the nodes and the weights, respectively, of the Gauss–Christoffel quadrature defined by $\widehat G_{s,1}$; see [20, Chap. 3] for a comprehensive overview of relations between Jacobi matrices, orthogonal polynomials, underlying distribution functions, and the Gauss–Christoffel quadrature.

Due to the symmetry of the considered distributions, the weights and nodes are also symmetric, and we can write

(4.27) $d_s^{-1} = \sum_{j=1}^s \frac{\widehat\omega^{(s)}_j}{1 + \mu\, \widehat\lambda^{(s)}_j} = \sum_{j=1}^s \frac{\widehat\omega^{(s)}_j}{1 - \mu^2 (\widehat\lambda^{(s)}_j)^2}$.

Since the weights of the Gauss–Christoffel quadrature are positive, we also obtain the monotonic dependence on $\mu$, meaning that the conditioning of $M^{-1}A$ improves with decreasing $\mu$. However, for a given $\mu$, one can have decreasing as well as increasing behavior in $s$. In the following part, we provide more explicit expressions for the weights and nodes for the considered approximation polynomials. It is well known that the nodes are the roots $\{\lambda^{(s)}_j\}$ of the highest-degree polynomial $\psi_s$. Moreover, the weights can be obtained from $\psi_s$ as well; see [20, p. 120].
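The quantity $d_s$ can thus be computed in two independent ways, which gives a useful cross-check; a minimal sketch (Python/NumPy, our naming, with Legendre coefficients as an example) compares the recursion (4.21) with the direct evaluation of $1/\bigl(\mathbf{e}_s^T (I_s + \mu G_{s,1})^{-1} \mathbf{e}_s\bigr)$:

```python
import numpy as np

def legendre_jacobi(s):
    """Jacobi matrix G_{s,1} of the normalized Legendre polynomials."""
    n = np.arange(1, s)
    offdiag = np.sqrt(n**2 / ((2*n - 1.0) * (2*n + 1.0)))
    return np.diag(offdiag, 1) + np.diag(offdiag, -1)

def d_recursive(s, mu):
    """Recursion (4.21): d_1 = 1, d_j = 1 - mu^2 * beta_{j-1} / d_{j-1}."""
    d = 1.0
    for j in range(2, s + 1):
        beta = (j - 1)**2 / ((2*j - 3.0) * (2*j - 1.0))  # Legendre beta_{j-1}
        d = 1.0 - mu**2 * beta / d
    return d

s, mu = 5, 0.8
T = np.eye(s) + mu * legendre_jacobi(s)
d_direct = 1.0 / np.linalg.inv(T)[-1, -1]   # 1 / (e_s^T T^{-1} e_s)
print(d_recursive(s, mu), d_direct)          # the two values agree
# Non-unit eigenvalues of H_s (Lemma 4.9): 1 -+ sqrt(1 - d_s).
```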
Gegenbauer polynomials.
For the Gegenbauer polynomials, the weights are given by

$\widehat\omega^{(s)}_j = \frac{2s + 2\gamma - 2}{s + 2\gamma - 1} \cdot \frac{1 - (\lambda^{(s)}_j)^2}{s}$,

yielding

$d_s^{-1} = \frac{2s + 2\gamma - 2}{(s + 2\gamma - 1)\, s} \sum_{j=1}^s \frac{1 - (\lambda^{(s)}_j)^2}{1 - \mu^2 (\lambda^{(s)}_j)^2}$,

which for $\mu = 1$ simplifies to

$d_s^{-1} = \frac{2s + 2\gamma - 2}{s + 2\gamma - 1}$.

Substituting $\gamma = 1$ and $\gamma = \frac12$, we obtain the spectral bounds of $H_s$ for the Chebyshev and Legendre polynomials, respectively.

Hermite polynomials.
For the Hermite polynomials, the weights are given by

$\widehat\omega^{(s)}_j = \frac{1}{s}$,

yielding

$d_s^{-1} = \frac{1}{s} \sum_{j=1}^s \frac{1}{1 - \mu^2 (\lambda^{(s)}_j)^2}$.

Remark 4.13. The constants $c$ and $\bar c$ can be used to estimate the strengthened Cauchy–Bunyakowski–Schwarz inequality constant $\gamma_{CBS} \in [0,1)$, i.e., the smallest $\gamma$ satisfying

$(v_U, v_W)_A \le \gamma\, \sqrt{(v_U, v_U)_A}\, \sqrt{(v_W, v_W)_A}$, $v_U \in U$, $v_W \in W$;

see, e.g., [2, 5, 18, 27]. In particular, it holds that

$\gamma_{CBS} \le \max\{\bar c - 1,\ 1 - c\}$.

The constant $\gamma_{CBS}$ can be used in the two-by-two block Gauss–Seidel preconditioning (also called the Schur-complement or multiplicative two-level preconditioning), with the resulting condition number bounded as

$\kappa(M^{-1}A) \le \frac{1}{1 - \gamma_{CBS}^2} \ \Bigl( \le \frac{1}{1 - (\bar c - 1)^2} = \frac{1}{d_t} \Bigr)$;

see, e.g., [2, Chapter 9], [18, Sections 2.2 and 2.3], [31, 32], and the examples in subsection 4.4 with the results summarized in Tables 5 and 6.

[Figure 1: Functions $a_k(\mathbf{x})$, $k = 1,\dots,K$, for the settings from Table 2; the three panels correspond to settings 1, 2, and 3.]

Further, if $u_U$ and $u_V$ represent the solutions of our problem in the spaces $U$ and $V$, respectively, then the 'solution improvement' $u_V - u_U$ can be estimated as

(4.28) $\| e_W \|_A \le \| u_V - u_U \|_A \le \frac{1}{\sqrt{1 - \gamma_{CBS}^2}}\, \| e_W \|_A$,

where $e_W$ is the solution of a certain problem restricted to the (small) space $W$; see, e.g., [1, Theorem 5.2]. Using (4.28), one can estimate $\| u_V - u_U \|_A$ in advance, without solving the (large) problem in $V$, which can be exploited in adaptivity. Due to the Galerkin orthogonality, we have $\| u - u_U \|_A^2 = \| u - u_V \|_A^2 + \| u_V - u_U \|_A^2$, where $u$ is the exact solution of (2.1) in $\widetilde V$. Thus (4.28) also provides an a posteriori lower bound of the energy norm of the error $u - u_U$ in $\widetilde V$; see, e.g., [1, 5, 6].

4.4. Numerical examples. In this section, we illustrate on some simple numerical examples how to apply the introduced theoretical tool. First, we consider a one-dimensional problem in $D = [0,1]$ with homogeneous Dirichlet boundary conditions, a uniform mesh with $N_{elem} = 30$ elements and with the nodes $x_0 = 0, \dots, x_{N_{elem}} = 1$, $K = 3$, and uniform distributions of $\xi_1$, $\xi_2$, and $\xi_3$ with images in $[-1,1]$. The approximation space is built of piecewise linear finite elements on $D$ and of $K$-variate Legendre polynomials. We define three different settings for the $a_k(x)$, see Table 2 and Figure 1, which are considered constant on every interval, $a_k(x) = a_k(x^c_j)$ on $(x_j, x_{j+1})$, where $x^c_j = (x_j + x_{j+1})/2$, $j = 0,\dots,N_{elem}-1$. We compute the corresponding $\mu$ and $\mu_{class}$ defined in (2.14) and (2.13), respectively, i.e.,

$\mu = \max_{j=1,\dots,N_{elem}} \sum_{k=1}^K |a_k(x^c_j)|$, $\mu_{class} = \sum_{k=1}^K \max_{j=1,\dots,N_{elem}} |a_k(x^c_j)|$.

[Table 2: Problem setting, one-dimensional problems: $a_0 \equiv 1$ in all settings; setting 1 uses smooth functions $a_k(x) \sim \sin(k\pi x)$, $k = 1, 2, 3$, with $\mu = 0.35$ and $\mu_{class} = 0.412$; settings 2 and 3 use functions $a_k$ proportional to the characteristic functions $\chi_{(0,1/3)}$, $\chi_{(1/3,2/3)}$, and $\chi_{(2/3,1)}$.]

We apply the mean-based preconditioning, see subsection 4.1, to each of the settings. We compute the true extreme eigenvalues of the resulting preconditioned matrices, the theoretical spectral bounds $c$ and $\bar c$ given by Lemma 4.2 and Corollary 4.3, and the classical spectral bounds

$c_{class} = 1 - \mu_{class}\, \lambda_{\max}(\psi_s)$, $\bar c_{class} = 1 + \mu_{class}\, \lambda_{\max}(\psi_s)$,

derived, e.g., in [24, Theorem 3.8]. The roots of the Legendre orthogonal polynomials are obtained from [21]. The results for complete polynomials are summarized in Table 3. For all considered settings, we obtained

$c_{class} \le c \le \lambda_{\min}(M^{-1}A) \le \lambda_{\max}(M^{-1}A) \le \bar c \le \bar c_{class}$.

Moreover, for the third setting, the classical bounds do not provide any useful information because $\mu_{class}\, \lambda_{\max}(\psi_s) > 1$, and thus $c_{class} < 0$.

[Table 3: Mean-based preconditioning and complete polynomials: a comparison of the new and the classical spectral bounds and of the extreme eigenvalues of the preconditioned matrix for the three settings from Table 2; columns $s-1$, $\kappa(A)$, $c_{class}$, $c$, $\lambda_{\min}(M^{-1}A)$, $\lambda_{\max}(M^{-1}A)$, $\bar c$, $\bar c_{class}$, $\bar c / c$, $\bar c_{class} / c_{class}$; rows for settings 1, 2, and 3.]

Second, we consider a two-dimensional problem in $D = [0,1]^2$ with homogeneous Dirichlet boundary conditions, a uniform mesh with $N_{elem} = 20^2 = 400$ elements, and uniform distributions of $\xi_k$, $k = 1,\dots,K$, with images in $[-1,1]$; the approximation space is again built of finite elements on $D$ and of $K$-variate Legendre polynomials. We define two different settings for the $a_k(\mathbf{x})$, $k = 0,\dots,K$, see Table 4, which are, analogously as before, considered constant on every element. For both settings, we investigate the spectral bounds for the splitting-based preconditioning (SB) and the two-by-two block Gauss–Seidel preconditioning (GS2). The results are shown in Tables 5 and 6. According to Remark 4.13, the upper bound of $\kappa(M^{-1}A)$ is obtained as $1/d_t$. The values of $t$ used in Lemma 4.11 for obtaining $c$ and $\bar c$ are also presented.

[Table 4: Problem setting, two-dimensional problems: $a_0 \equiv 1$ in both settings; setting 4 with $K = 3$ and functions $a_k$ built of $\sin(k\pi x_1)$-type terms, $\mu = 0.83$; setting 5 with a variable number of parameters $K$ and functions $a_k$ proportional to $\frac{1}{K}\sin((k+1)\pi x_1)$ and $\frac{1}{K}\sin((k+1)\pi x_2)$.]

[Table 5: Splitting-based and two-by-two block Gauss–Seidel preconditioning for complete polynomials; results for setting 4 (Table 4), $K = 3$, and increasing polynomial degree $s-1$; columns $\kappa(A)$, then $\kappa(M^{-1}A)$ and $\bar c / c$ for SB, then $\kappa(M^{-1}A)$, $1/d_t$, and $t$ for GS2.]

[Table 6: Splitting-based and two-by-two block Gauss–Seidel preconditioning for complete polynomials; results for setting 5 (Table 4), fixed polynomial degree $s-1$, and increasing $K$; columns $\kappa(A)$, then $\kappa(M^{-1}A)$ and $\bar c / c$ for SB, then $\kappa(M^{-1}A)$, $1/d_t$, $t$, and $\mu$ for GS2.]
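For completeness, a condensed version of the one-dimensional experiment is sketched below (Python/NumPy; $K = 1$ and the particular coefficient function are our simplifications for brevity, not the exact settings of Table 2). It assembles $A$ and the mean-based $M$ and compares the extreme eigenvalues of $M^{-1}A$ with the bounds of Lemma 4.2:

```python
import numpy as np

def fe_matrix(coef, n_elem):
    """P1 stiffness matrix on [0,1] for an element-wise constant coefficient."""
    h = 1.0 / n_elem
    F = np.zeros((n_elem - 1, n_elem - 1))
    for j in range(n_elem):                 # element j spans [x_j, x_{j+1}]
        for r in (j - 1, j):                # interior-node indices
            for t in (j - 1, j):
                if 0 <= r < n_elem - 1 and 0 <= t < n_elem - 1:
                    F[r, t] += (1.0 if r == t else -1.0) * coef[j] / h
    return F

def legendre_jacobi(s):
    n = np.arange(1, s)
    b = np.sqrt(n**2 / ((2*n - 1.0) * (2*n + 1.0)))
    return np.diag(b, 1) + np.diag(b, -1)

n_elem, s = 30, 4
xc = (np.arange(n_elem) + 0.5) / n_elem     # element midpoints
a0 = np.ones(n_elem)
a1 = 0.35 * np.sin(np.pi * xc)              # illustrative coefficient
A = np.kron(np.eye(s), fe_matrix(a0, n_elem)) \
  + np.kron(legendre_jacobi(s), fe_matrix(a1, n_elem))
M = np.kron(np.eye(s), fe_matrix(a0, n_elem))
ev = np.linalg.eigvals(np.linalg.solve(M, A)).real
mu = np.max(np.abs(a1) / a0)                # local dominance constant (2.14)
lam = np.max(np.linalg.eigvalsh(legendre_jacobi(s)))
print(ev.min(), ev.max(), 1 - mu * lam, 1 + mu * lam)  # spectrum within bounds
```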
5. Conclusion.
We introduced a new tool for obtaining guaranteed two-sided spectral bounds for discretized stochastic Galerkin problems preconditioned by a matrix with a modified stochastic part, using the spectral information of certain small matrices from which the large stochastic Galerkin matrix is constructed. Moreover, this analysis only requires a point-wise or local dominance of the deterministic part of the expansion of the parameter-dependent function $a(\mathbf{x},\xi)$, represented here by the parameter $\mu$, while the standard bounds are typically based on absolute global dominance. The derived estimates are therefore also applicable to problems where the global dominance is not achieved. We showed for three types of block-diagonal preconditioners, including less standard ones, how this technique is used to obtain spectral bounds depending solely on $\mu$ and the properties of the stochastic approximation space (here classical orthogonal polynomials). From the locality of $\mu$, it also follows that the obtained bounds are tighter than the classical ones. Similar ideas based on local properties of a preconditioner appear also in [16].

Acknowledgments.
The authors are grateful to Stefano Pozza for sharing his insight into the Gauss–Christoffel quadrature. We would also like to thank the anonymous referees for their comments and suggestions.
REFERENCES

[1] M. Ainsworth and J. T. Oden, A Posteriori Error Estimation in Finite Element Analysis, John Wiley & Sons, 2000.
[2] O. Axelsson, Iterative Solution Methods, Cambridge University Press, 1996.
[3] I. Babuška, F. Nobile, and R. Tempone, A stochastic collocation method for elliptic partial differential equations with random input data, SIAM Review, 52 (2010), pp. 317–355.
[4] I. Babuška, R. Tempone, and G. E. Zouraris, Galerkin finite element approximations of stochastic elliptic partial differential equations, SIAM Journal on Numerical Analysis, 42 (2004), pp. 800–825.
[5] A. Bespalov, C. E. Powell, and D. Silvester, Energy norm a posteriori error estimation for parametric operator equations, SIAM Journal on Scientific Computing, 36 (2014), pp. A339–A363.
[6] A. J. Crowder and C. E. Powell, CBS constants & their role in error estimation for stochastic Galerkin finite element methods, Journal of Scientific Computing, 77 (2018), pp. 1030–1054.
[7] A. J. Crowder, C. E. Powell, and A. Bespalov, Efficient adaptive multilevel stochastic Galerkin approximation using implicit a posteriori error estimation, arXiv eprint, arXiv:1806.05987, (2018).
[8] M. Eigel, C. J. Gittelson, C. Schwab, and E. Zander, Adaptive stochastic Galerkin FEM, Computer Methods in Applied Mechanics and Engineering, 270 (2014), pp. 247–269.
[9] M. Eigel and C. Merdon, Local equilibration error estimators for guaranteed error control in adaptive stochastic higher-order Galerkin finite element methods, SIAM/ASA Journal on Uncertainty Quantification, 4 (2016), pp. 1372–1397.
[10] M. Eigel, M. Pfeffer, and R. Schneider, Adaptive stochastic Galerkin FEM with hierarchical tensor representations, Numerische Mathematik, 136 (2017), pp. 765–803.
[11] V. Eijkhout and P. Vassilevski, The role of the strengthened CBS-inequality in multi-level methods, SIAM Review, 33 (1991), pp. 405–419.
[12] H. Elman and D. Furnival, Solving the stochastic steady-state diffusion problem using multigrid, IMA Journal of Numerical Analysis, 27 (2007), pp. 675–688.
[13] O. G. Ernst, C. E. Powell, D. J. Silvester, and E. Ullmann, Efficient solvers for a linear stochastic Galerkin mixed formulation of diffusion problems with random data, SIAM Journal on Scientific Computing, 31 (2009), pp. 1424–1447.
[14] O. G. Ernst and E. Ullmann, Stochastic Galerkin matrices, SIAM Journal on Matrix Analysis and Applications, 31 (2010), pp. 1848–1872.
[15] W. Gautschi, Orthogonal Polynomials: Computation and Approximation, Oxford University Press, 2004.
[16] T. Gergelits, K.-A. Mardal, B. F. Nielsen, and Z. Strakoš, Laplacian preconditioning of elliptic PDEs: Localization of the eigenvalues of the discretized operator, SIAM Journal on Numerical Analysis, (to appear).
[17] A. Khan, A. Bespalov, C. E. Powell, and D. J. Silvester, Robust a posteriori error estimation for stochastic Galerkin formulations of parameter-dependent linear elasticity equations, arXiv eprint, arXiv:1810.07440, (2018).
[18] J. Kraus and S. Margenov, Robust Algebraic Multilevel Methods and Algorithms, Radon Series on Computational and Applied Mathematics, vol. 5, Walter de Gruyter, Berlin, 2009.
[19] A. J. Laub, Matrix Analysis for Scientists and Engineers, SIAM, 2005.
[20] J. Liesen and Z. Strakoš, Krylov Subspace Methods, Numerical Mathematics and Scientific Computation, Oxford University Press, Oxford, 2013.
[21] A. N. Lowan, N. Davids, and A. Levenson, Table of the zeros of the Legendre polynomials of order 1–16 and the weight coefficients for Gauss' mechanical quadrature formula, Bull. Amer. Math. Soc., 48 (1942), pp. 739–743.
[22] G. Meurant, A review on the inverse of symmetric tridiagonal and block tridiagonal matrices, SIAM Journal on Matrix Analysis and Applications, 13 (1992), pp. 707–728.
[23] C. Müller, S. Ullmann, and J. Lang, A Bramble–Pasciak conjugate gradient method for discrete Stokes equations with random viscosity, arXiv eprint, arXiv:1801.01838, (2018).
[24] C. E. Powell and H. C. Elman, Block-diagonal preconditioning for spectral stochastic finite-element systems, IMA Journal of Numerical Analysis, 29 (2009), pp. 350–375.
[25] C. E. Powell, D. Silvester, and V. Simoncini, An efficient reduced basis solver for stochastic Galerkin matrix equations, SIAM Journal on Scientific Computing, 39 (2017), pp. A141–A163.
[26] C. E. Powell and E. Ullmann, Preconditioning stochastic Galerkin saddle point systems, SIAM Journal on Matrix Analysis and Applications, 31 (2010), pp. 2813–2840.
[27] I. Pultarová, Adaptive algorithm for stochastic Galerkin method, Applications of Mathematics, 60 (2015), pp. 551–571.
[28] I. Pultarová, Hierarchical preconditioning for the stochastic Galerkin method: Upper bounds to the strengthened CBS constants, Computers & Mathematics with Applications, 71 (2016), pp. 949–964.
[29] I. Pultarová, Block and multilevel preconditioning for stochastic Galerkin problems with lognormally distributed parameters and tensor product polynomials, International Journal for Uncertainty Quantification, 7 (2017), pp. 441–462.
[30] E. Rosseel and S. Vandewalle, Iterative solvers for the stochastic finite element method, SIAM Journal on Scientific Computing, 32 (2010), pp. 372–397.
[31] B. Sousedík and R. Ghanem, Truncated hierarchical preconditioning for the stochastic Galerkin FEM, International Journal for Uncertainty Quantification, 4 (2014), pp. 333–348.
[32] B. Sousedík, R. G. Ghanem, and E. T. Phipps, Hierarchical Schur complement preconditioner for the stochastic Galerkin finite element methods, Numerical Linear Algebra with Applications, 21 (2014), pp. 136–151.
[33] W. Subber and S. Loisel, Schwarz preconditioners for stochastic elliptic PDEs, Computer Methods in Applied Mechanics and Engineering, 272 (2014), pp. 34–57.
[34] G. Szegő, Orthogonal Polynomials, vol. 23, American Mathematical Society, 3rd ed., 1975.
[35] E. Ullmann, A Kronecker product preconditioner for stochastic Galerkin finite element discretizations, SIAM Journal on Scientific Computing, 32 (2010), pp. 923–946.
[36] D. Xiu and G. E. Karniadakis, The Wiener–Askey polynomial chaos for stochastic differential equations, SIAM Journal on Scientific Computing, 24 (2002), pp. 619–644.