A matrix-free isogeometric Galerkin method for Karhunen-Loève approximation of random fields using tensor product splines, tensor contraction and interpolation based quadrature
Michal Lukasz Mika, Thomas Joseph Robert Hughes, Dominik Schillinger, Peter Wriggers, René Rinke Hiemstra
Highlights
• Interpolation based quadrature of the weak form of the integral eigenvalue problem that is optimal in terms of the number of evaluation points and with cost independent of polynomial degree;
• Efficient formation of finite element arrays based on sum factorization of integrands defined on high-dimensional domains;
• Inexpensive reformulation of the generalized eigenvalue problem into an equivalent standard algebraic eigenvalue problem, which decreases computational cost significantly while improving conditioning;
• Formulation of a matrix-free and parallel matrix-vector product for iterative eigenvalue solvers that scales quadratically with the number of degrees of freedom of the interpolation space.

Michał Ł. Mika^a,∗, Thomas J.R. Hughes^b, Dominik Schillinger^a, Peter Wriggers^c, René R. Hiemstra^a

^a Institut für Baumechanik und Numerische Mechanik, Leibniz Universität Hannover
^b Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin
^c Institut für Kontinuumsmechanik, Leibniz Universität Hannover
Abstract
The Karhunen-Loève series expansion (KLE) decomposes a stochastic process into an infinite series of pairwise uncorrelated random variables and pairwise L²-orthogonal functions. For any given truncation order of the infinite series the basis is optimal in the sense that the total mean squared error is minimized. The orthogonal basis functions are determined as the solution of an eigenvalue problem corresponding to the homogeneous Fredholm integral equation of the second kind, which is computationally challenging for several reasons. Firstly, a Galerkin discretization requires numerical integration over a 2d-dimensional domain, where d, in this work, denotes the spatial dimension. Secondly, the main system matrix of the discretized weak form is dense. Consequently, the computational complexity of classical finite element formation and assembly procedures as well as the memory requirements of direct solution techniques become quickly intractable with increasing polynomial degree, number of elements and degrees of freedom. The objective of this work is to significantly reduce several of the computational bottlenecks associated with numerical solution of the KLE. We present a matrix-free solution strategy, which is embarrassingly parallel and scales favorably with problem size and polynomial degree. Our approach is based on (1) an interpolation based quadrature that minimizes the required number of quadrature points; (2) an inexpensive reformulation of the generalized eigenvalue problem into a standard eigenvalue problem; and (3) a matrix-free and parallel matrix-vector product for iterative eigenvalue solvers. Two higher-order three-dimensional benchmarks illustrate exceptional computational performance combined with high accuracy and robustness.

Keywords:
Matrix-free solver, Kronecker products, random fields, Fredholm integral eigenvalue problem, isogeometric analysis
1. Introduction
Most physical systems exhibit randomness, which, because of its lack of pattern or regularity, cannot be explicitly captured by deterministic mathematical models. The randomness may be due to the nature of the phenomenon itself, called aleatoric uncertainty, or due to a lack of knowledge about the system, referred to as epistemic uncertainty. In the latter case the uncertainty may be reduced by obtaining additional data about the system at hand. An example of an epistemic uncertainty encountered in engineering is the fluctuation of material properties throughout a body, which occurs due to the inhomogeneity of the

∗ Corresponding author
Email addresses: [email protected] (Michał Ł. Mika), [email protected] (Thomas J.R. Hughes), [email protected] (Dominik Schillinger), [email protected] (Peter Wriggers), [email protected] (René R. Hiemstra)
Preprint submitted to Computer Methods in Applied Mechanics and Engineering (CMAME), November 30, 2020

medium. Deterministic mechanical models typically feature empirically derived material parameters, such as material stiffness and yield stress, that are assumed constant throughout the body. Their value is typically determined as a statistical volumetric average over a large set of laboratory specimens. This idealized model of reality may be insufficient in e.g. structural risk or reliability analysis and prediction, which is concerned with probabilities of violation of safety limits or performance measures, respectively [48]. In this case the effects of uncertainty on the result of a computation need to be quantified.

Uncertainty in physical quantities that vary in space and/or time may be adequately modeled by stochastic processes or random fields [64]. This approach generalizes a deterministic system modeled by a partial differential equation to a stochastic system modeled by a stochastic partial differential equation or SPDE. Reliable predictions may be obtained by propagating uncertainties in input variables to those in the response. The main objective is to compute the response statistics, such as the mean and variance of the random solution field, or the probability that a set tolerance is exceeded. To compute these statistics it is necessary to discretize the SPDE, not only in space and time, but also in the stochastic dimensions. This can be a complicated task, not because of modeling randomness, but due to the curse of dimensionality. Every random variable contributes one dimension to the problem. Hence, it is important to keep their total to a minimum.

One of the relevant questions in stochastic analysis is how to represent random fields discretely, in a manner suitable for use in numerical computation.
The essential step is to break down the representation into a tractable number of mutually independent random variables, whose combination preserves the stochastic variability of the process [17, 33]. One representation that is of particular interest is the truncated Karhunen-Loève series expansion or KLE [32, 45]. The KLE decomposes a stochastic process into an infinite series of pairwise uncorrelated random variables and pairwise L²-orthogonal basis functions. Truncating the series expansion after M terms yields the best M-term linear approximation of the random field, in the sense that the total mean squared error is minimized [20]. The KLE is useful in practice when satisfactory accuracy is attained with no more than 20-30 terms [17, 63].

Computation of the truncated KL expansion requires the solution of a homogeneous Fredholm integral eigenvalue problem (IEVP) of the second kind. In general this is only possible numerically. The most popular numerical methods to solve IEVPs are the Nyström method, degenerate kernel methods, and the collocation and Galerkin methods [5, 37]. The Galerkin method is widely regarded as superior due to its approximation properties and solid theoretical foundation. Specifically, it can be shown that the eigenvalues converge monotonically towards the exact eigenvalues and, by construction, the modes preserve exactly the L² orthogonality property of the analytical mode shapes [17].

Efficient solution of the KLE using the Galerkin method is a computationally challenging task [17]. The main challenges are the following:

(i) A Galerkin discretization requires numerical integration over a 2d-dimensional domain, where d, in this work, denotes the spatial dimension.
The computational complexity of classical finite element formation and assembly procedures scales as O(N_e² · p^{3d}), where N_e is the global number of elements, p the polynomial degree and d the spatial dimension.

(ii) The main system matrix of the discretized weak form is dense and requires 8N² bytes of memory in double precision arithmetic, where N is the dimension of the trial space.

(iii) Numerical solution requires one sparse backsolve and one dense matrix-vector product of cost O(N²) in each iteration of the eigenvalue solver; thus the solution time of the numerical eigenvalue solver scales as O(N² · N_iter), where N_iter is the number of iterations required by the Lanczos solver.

Table 1 illustrates that explicit storage of the dense system matrix requires impracticable amounts of memory for problems involving more than 100K degrees of freedom. Hence, the computational complexity of classical finite element formation and assembly procedures as well as the memory requirements of direct solution techniques become quickly intractable with increasing polynomial degree, number of elements and degrees of freedom.

There has been a particular research effort devoted to alleviating the disadvantages of the Galerkin method. In [1, 22, 35] an approximation by (Kronecker product) hierarchical matrices is used to efficiently compute the dense matrices, as well as to reduce the memory requirements. These matrices are sparse and allow for matrix multiplication, addition and inversion in O(N log N) time (or, for Kronecker product hierarchical matrices, in O(N) time), where N is the number of degrees of freedom. The generalized Fast Multipole Method, which also scales with O(N log N), has been proposed in [61]. This method was shown not to yield significant speed-ups for p-version finite element methods and thus it is recommended for kernels of low regularity.
Wavelet Galerkin schemes [52] are also being used and can be coupled with compression techniques for boundary value problems [12], but have the disadvantage that the number of eigenmodes to be computed must be known in advance. The pivoted Cholesky decomposition [23] focuses on approximating discretized random fields with sufficiently fast decaying eigenvalues. In this case a truncation of the pivoted Cholesky decomposition of the covariance operator allows for an estimation of the eigenvalues in a post-processing step in O(M²N) time, where M is the truncation order of the Cholesky decomposition. One of the advantages of this method is the fact that the number of eigenmodes required for a certain accuracy of the random field discretization can be estimated in advance.

Table 1: Minimum memory required for storage of the main system matrix in the solution of the homogeneous Fredholm integral problem of the second kind assuming double-precision floating point arithmetic.
Number of degrees of freedom    10³      10⁴       10⁵      10⁶
Matrix storage                  8 MB     800 MB    80 GB    8 TB
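The entries of Table 1 follow directly from the 8N² bytes needed to store a dense N × N matrix in double precision; the following quick sanity check is our own illustration, not part of the paper:

```python
def dense_matrix_bytes(n):
    # A dense n x n matrix of 8-byte (double precision) floats.
    return 8 * n * n

# Reproduce the entries of Table 1.
for n in (10**3, 10**4, 10**5, 10**6):
    print(f"N = {n:>7}: {dense_matrix_bytes(n) / 10**6:,.0f} MB")
```

At N = 10⁶ this already amounts to 8 TB, which makes explicit storage hopeless on commodity hardware.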
Splines are piecewise polynomials with increased smoothness across element boundaries compared to classical finite elements. Traditionally, splines have been primarily used as shape functions in computer aided design. More recently, with the introduction of isogeometric analysis [27], splines have become more established as trial functions in finite element analysis. Although isogeometric analysis was originally introduced to improve the interoperability across several stages of the design to analysis process, it has proven its fidelity as an analysis technology. We refer to the monograph [10] and references contained therein for an exposition of isogeometric analysis applied to deterministic problems in structural and fluid mechanics.

More recently, spline based isogeometric analysis has found its way into the stochastic community. Stochastic methods have been proposed to quantify uncertainty due to material randomness in linear elasticity [41, 31], static analysis of plates [69], vibrational analysis of shells [44], static and dynamic structural analysis of random composite structures [15] and functionally graded plates [26, 42, 43]. In [71] a method is proposed to quantify the effect due to uncertainty in shape. Of these, the methods proposed in [41, 42, 43, 44] use isogeometric analysis within a spectral stochastic finite element framework [20], which is based on a KL expansion of random fields. The methods in [15, 26, 69] use perturbation series, of which [69] expands random fields in terms of the KLE. Standard polynomial chaos is used in [71], while the methods in [16], [31] and [56] discretize the stochastic dimensions in terms of splines.
In particular, in [16] tensor product B-splines are used to expand stochastic variables, [31] proposes a spline-dimensional decomposition (SDD) and [56] proposes a spline chaos expansion, thus extending generalized polynomial chaos [70].

To the best of our knowledge, [3] is the first work in which splines have been used to approximate the truncated KLE. In his work the author applies a degenerate kernel approximation based on tensor product spline interpolation at the Greville abscissae. More recently, in the spirit of isogeometric analysis, non-uniform rational B-splines (NURBS) have been used to approximate the KLE using the Galerkin method [55] and the collocation method [30]. These methods avoid the geometrical errors in the representation of CAD geometry typically made within the classical finite element method. The authors note that the use of splines in the geometry description as well as in the discretization of the spatial and stochastic dimensions could enable a “seamless uncertainty quantification pipeline”.

In the context of the present work we would like to highlight the superior spectral approximation properties of smooth splines as compared to classical C⁰ finite element shape functions. Several studies [9, 28, 29, 54] have investigated the spectral approximation properties of splines in eigenvalue problems corresponding to second and fourth order differential operators and have demonstrated that splines have improved robustness and accuracy per degree of freedom across virtually the entire range of modes. The numerical results for the Fredholm integral eigenvalue problem are no different, as corroborated by the results shown in Figure 1. It is precisely these properties that make splines appealing in the representation of random fields by means of the Karhunen-Loève expansion.
Figure 1: Normalized discrete eigenvalues corresponding to a univariate Fredholm integral eigenvalue problem with an exponential kernel (correlation length is one). Comparison of eigenvalues obtained by C¹ quadratic splines to C⁰ quadratic piecewise polynomials. Both methods employ a standard Galerkin projection based on full Gauss quadrature. The reference solution used to normalize the results is computed by the approach described in Remark 6.2.

We present a matrix-free isogeometric Galerkin method for Karhunen-Loève approximation of random fields by splines. Our solution methodology resolves several of the aforementioned computational bottlenecks associated with numerical solution of integral eigenvalue problems and enables solution of large-scale three-dimensional IEVPs on complex domains. Below we summarize our main contributions.
Conversion to a standard eigenvalue problem
We have chosen a specific trial space of rational spline functions whose Gramian matrix has a Kronecker product structure independent of the geometric mapping. This enables us to perform the backsolve, used to convert the IEVP to standard form, in O(N · N^{1/d}) time by utilizing standard linear algebra techniques from [21, 58].

Interpolation based quadrature

We present an interpolation based quadrature technique designed and optimized specifically for the variational formulation of the Fredholm integral equation. The approach integrates a rich target space of functions with a minimal number of quadrature points and outperforms existing competitive techniques in isogeometric analysis, such as quadrature by interpolation and table look-up [46, 51] and weighted quadrature [8, 25, 60]. The proposed interpolation based quadrature technique is inspired by a similar technique used within linear finite elements in [34, Chapter 3.1.3] and [35]. Instead, our approximation of the covariance function is based on higher order tensor product spline interpolation and resembles the kernel approximation made in [3]. Besides requiring as few quadrature points as possible, the interpolation based quadrature technique exposes Kronecker structure in the integral equations, reducing computational complexity significantly.
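To make the role of a Kronecker-structured Gramian concrete, here is a small NumPy sketch (our own toy matrices, not the paper's implementation) of the kind of backsolve involved: a system whose matrix is Z₂ ⊗ Z₁ is solved using only the two small factors, without ever forming the large matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spd(n):
    # Random symmetric positive definite matrix, standing in for a univariate Gramian.
    m = rng.standard_normal((n, n))
    return m @ m.T + n * np.eye(n)

n1, n2 = 4, 5
Z1, Z2 = random_spd(n1), random_spd(n2)
b = rng.standard_normal(n1 * n2)

# Solve (Z2 kron Z1) x = b via the vec trick: reshape b into an n2 x n1 array B,
# then X = Z2^{-1} B Z1^{-T} and x = vec(X). Only small systems are ever solved.
B = b.reshape(n2, n1)
X = np.linalg.solve(Z1, np.linalg.solve(Z2, B).T).T
x = X.reshape(-1)

# Check against the explicit Kronecker matrix (feasible only at this toy size).
print(np.allclose(np.kron(Z2, Z1) @ x, b))
```

For repeated solves one would factorize the small matrices once (e.g. by Cholesky) and reuse the factors, which is what yields the favorable backsolve complexity.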
Matrix-free solution methodology
We present a matrix-free solution methodology to avoid explicit storage of the dense system matrix associated with numerical computation of the KLE. The matrix-free solution methodology not only reduces the memory complexity from O(N²) to O(N), but also significantly reduces the solution time. This is achieved by integrating the matrix-free solver with the proposed interpolation based quadrature technique. The latter exposes Kronecker structure in the resulting discrete integral equation, thereby reducing formation costs to O(N · N^{1/d}) per iteration, leaving only the dense matrix-vector product associated with the Lanczos algorithm, which remains O(N²) per iteration. The integrative approach to quadrature and matrix-free solution techniques, exploiting Kronecker structure, is inspired by the matrix-free weighted quadrature method proposed recently in [60].

Open source implementation
We provide an open-source Python implementation of the described techniques that is available for download at https://github.com/m1ka05/tensiga. All benchmarks have been obtained using this implementation.
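As a toy illustration of the matrix-free idea (our own sketch, far simpler than the tensiga solver), an iterative eigensolver only ever needs the action of the operator; with a Kronecker-structured operator that action uses the small factors alone:

```python
import numpy as np

rng = np.random.default_rng(1)

def spd(n):
    # Small random symmetric positive semi-definite factor.
    m = rng.standard_normal((n, n))
    return m @ m.T

A1, A2 = spd(6), spd(7)

def matvec(x):
    # Action of (A2 kron A1) on x using only the 6x6 and 7x7 factors;
    # the 42 x 42 matrix is never formed or stored.
    X = x.reshape(7, 6)
    return (A2 @ X @ A1.T).reshape(-1)

# Plain power iteration for the dominant eigenvalue (a stand-in for Lanczos).
x = rng.standard_normal(42)
for _ in range(1000):
    x = matvec(x)
    x /= np.linalg.norm(x)
lam_max = x @ matvec(x)
print(lam_max)
```

A Lanczos solver consumes exactly this kind of matvec callback; nothing beyond the factors and a few work vectors needs to live in memory.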
In Section 2 we briefly review the necessary mathematical and algorithmic background with regard to Kronecker products, B-splines and NURBS. In Section 3 we present the Karhunen-Loève series expansion of random fields and the weak formulation of the corresponding Fredholm integral eigenvalue problem of the second kind. In Section 4 we introduce our methodology for numerical solution of the truncated KLE. This includes reformulation of the eigenvalue problem to standard form, interpolation based quadrature of the weak form of the Fredholm integral problem, and a matrix-free algorithm with low computational complexity and minimal memory requirements that is embarrassingly parallelizable. The computational complexity is described in more detail in Section 5, where we compare our method with usual formation and assembly techniques used for standard Galerkin methods from the literature. Finally, in Section 6, we present a one-dimensional numerical study and several three-dimensional high-order numerical examples. A conclusion and an outlook with recommendations for future work are given in Section 7.
2. Background and notation
This section introduces some of the machinery that is used throughout the paper. The presented solution methodology for the Fredholm integral equation relies heavily on the properties of Kronecker products in combination with multidimensional tensor contraction [21]. We briefly review the main properties used in this work and illustrate their use in the Kronecker matrix-vector product. The Kronecker structure of the involved matrices is a direct consequence of the chosen tensor product spline function spaces. We briefly introduce B-splines as a basis for polynomial splines and Non-Uniform Rational B-splines (NURBS) for smooth geometrical mappings. For additional details we refer the reader to standard reference books [10, 53].

2.1. Evaluation of the computational cost of an algorithm
The computational cost of the algorithms discussed in this work is evaluated in terms of floating point operations (flops). A single flop represents the amount of work required to perform one floating point addition, subtraction, multiplication or division [21]. Although the number of flops does not provide a complete assessment of the efficiency of an algorithm, it is widely used in the literature. Indeed, many other considerations such as cache-line efficiency and number of memory allocations can have a large impact on the performance of an algorithm. Typically, we are interested in the leading terms that dominate the computational cost of an algorithm and record the performance in terms of an order-of-magnitude estimate of the number of flops, written in Big-Oh notation as O(·).

Let A ∈ R^{m×n}, B ∈ R^{p×q} and C ∈ R^{s×t} denote real valued matrices. The Kronecker product A ⊗ B ∈ R^{mp×nq} is a matrix defined as

    A ⊗ B := [ A_{11}B  ···  A_{1n}B ]
             [    ⋮      ⋱      ⋮    ]
             [ A_{m1}B  ···  A_{mn}B ]                                        (1)

Kronecker products satisfy the following properties

    (A ⊗ B) ⊗ C = A ⊗ (B ⊗ C)          (associativity)                       (2a)
    (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD)       (mixed product property)              (2b)
    (A ⊗ B)^{-1} = A^{-1} ⊗ B^{-1}     (inverse of a Kronecker product)      (2c)
    (A ⊗ B)^T = A^T ⊗ B^T              (transpose of a Kronecker product)    (2d)

Let X ∈ R^{n_1×···×n_d} and Y ∈ R^{m_1×···×m_d} denote two d-dimensional arrays. Vectorization of X is a linear operation that maps X to a vector vec(X) ∈ R^{n_1···n_d} with entries

    vec(X)_i := X_{i_1...i_d},  where  i = i_1 + (i_2 − 1)n_1 + (i_3 − 1)n_1 n_2 + ... + (i_d − 1)n_1···n_{d−1}.   (3)

One recurring theme in this paper involving Kronecker matrices is efficient matrix-vector multiplication. Let D_k ∈ R^{m_k×n_k}, k = 1, ..., d, denote a set of d matrices with entries D^{(k)}_{i_k j_k}, i_k = 1, ..., m_k and j_k = 1, ..., n_k.
The matrix-vector product

    vec(Y) = (D_d ⊗ ··· ⊗ D_1) vec(X),     O(M · N) flops                     (4a)

can be written as a tensor contraction instead

    Y_{i_1···i_d} = Σ_{j_1···j_d} D^{(1)}_{i_1 j_1} ··· D^{(d)}_{i_d j_d} X_{j_1···j_d},     O(max(N · m_1, n_d · M)) flops   (4b)

Here N = n_1···n_d and M = m_1···m_d. The second approach scales nearly linearly with matrix size and significantly outperforms standard matrix-vector multiplication, which scales quadratically with the matrix size. In practice, highly optimized linear tensor algebra libraries can be used to perform the tensor contraction, such as the tensor algebra compiler (TACO) [36]. Our Python implementation uses NumPy's reshaping and matrix-matrix product routines, which call low-level BLAS routines. The implemented reshapes do not require any expensive and unnecessary data copies.

Consider a d-dimensional parametric domain D̂ = [0, 1]^d ⊂ R^d with local coordinates x̂ = (x̂_1, ..., x̂_d). Let (B_{i_k,p_k}(x̂_k), i_k = 1, ..., n_k) denote the univariate B-spline basis of polynomial degree p_k and dimension n_k, corresponding to the k-th parametric coordinate x̂_k. We consider multivariate B-splines as tensor products of univariate B-splines

    B_i(x̂) = ∏_{k=1}^{d} B_{i_k,p_k}(x̂_k),   i := (i_1, ..., i_d).           (5)

Here i ∈ I is a multi-index in the set I := {(i_1, ..., i_d) : 1 ≤ i_k ≤ n_k}. The collection of all multivariate B-spline basis functions spans the space

    B^h := span{ B_i(x̂) }_{i∈I}.                                             (6)

It is important to note that splines allow for increased continuity between polynomial elements as compared to classical C⁰-continuous finite element basis functions. This turns out to have significant impact on the spectral accuracy of the Galerkin method. This is evidenced by several studies [9, 28, 29, 54] and will be discussed in some detail in this work.
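The equivalence of (4a) and (4b) is easy to verify numerically; the following NumPy sketch (our own toy sizes) applies the factors through a single tensor contraction and compares against the explicit Kronecker matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
D1 = rng.standard_normal((3, 4))   # D_1 in R^{m1 x n1}
D2 = rng.standard_normal((5, 6))   # D_2 in R^{m2 x n2}
X = rng.standard_normal((4, 6))    # coefficient array, n1 x n2

# (4b): Y[i1, i2] = sum_{j1, j2} D1[i1, j1] * D2[i2, j2] * X[j1, j2]
Y = np.einsum('ij,kl,jl->ik', D1, D2, X)

# (4a): the same product via the explicit (m1*m2) x (n1*n2) Kronecker matrix.
# Column-major ('F') reshapes match the vectorization convention of equation (3).
y = np.kron(D2, D1) @ X.reshape(-1, order='F')

print(np.allclose(Y.reshape(-1, order='F'), y))
```

At realistic sizes the contraction route wins decisively: the Kronecker matrix is never materialized and the work is a sequence of small BLAS-level products.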
Let F : D̂ → D map a point x̂ from the parametric domain D̂ to a point x in the physical domain D. We assume that the map F and its inverse are smooth such that the Jacobian matrix [DF(x̂)]_{ij} := ∂F_i/∂x̂_j and its inverse are well-defined. In this work F is represented as a linear combination of Non-Uniform Rational B-splines (NURBS). NURBS are rational functions of B-splines that enable representation of common engineering shapes with conic sections, which cannot be represented by polynomial B-splines [53]. The discretization method presented in this work makes heavy use of the tensor product properties of the involved function spaces. Since NURBS do not have a tensor product structure, we use them only to represent the geometry and do not consider them as a basis for the function spaces.
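For intuition about the map F and its Jacobian, consider the simple polar map of the parametric unit square onto a quarter annulus, a shape that NURBS can represent exactly (here we just use the analytic map; radii and quadrature order are our own example values). Its Jacobian determinant is positive everywhere, and integrating it over the parametric domain recovers the physical area:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

r1, r2 = 1.0, 2.0   # inner and outer radius (example values)

def det_DF(x1, x2):
    # Jacobian determinant of F(x1, x2) = (r cos t, r sin t) with
    # r = r1 + x1 (r2 - r1) and t = x2 * pi / 2, mapping [0, 1]^2
    # onto a quarter annulus. Analytically det DF = r (r2 - r1) (pi / 2) > 0.
    r = r1 + x1 * (r2 - r1)
    return r * (r2 - r1) * np.pi / 2

# Gauss-Legendre quadrature shifted from [-1, 1] to [0, 1].
q, w = leggauss(8)
q, w = (q + 1) / 2, w / 2

area = sum(wi * wj * det_DF(a, b) for a, wi in zip(q, w) for b, wj in zip(q, w))
print(area, np.pi * (r2**2 - r1**2) / 4)   # both equal 3*pi/4 ~ 2.356
```

The same determinant factor is what converts integrals over D into integrals over D̂, which Section 3 exploits in the design of the trial space.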
3. Isogeometric Galerkin discretization of the Karhunen-Lo`eve series expansion
The Karhunen-Loève (KL) series expansion decomposes a stochastic process or field into an infinite linear combination of L²-orthogonal functions and uncorrelated stochastic random variables. In this section we present the probability theory underlying the KL expansion and discuss its discretization by means of the Galerkin method.

Consider a complete probability space (Θ, Σ, P). Here Θ denotes a sample set of random events, Σ is the σ-algebra of Borel subsets of Θ and P is a probability measure P : Σ → [0, 1]. A random field α(·, θ) : Θ → L²(D) on a bounded domain D ⊂ R^d is a collection of deterministic functions of x ∈ D, called realizations, that are indexed by events θ ∈ Θ. A subset of realizations α(·, Θ_s), Θ_s ∈ Σ, has a probability of occurrence of P(Θ_s).

Let E[·] denote the expectation operator corresponding to the probability measure P. Assuming α ∈ L²(D ×
Θ), its first and second order moments exist and are given by

    µ(x) := E[α(x, θ)]  and                                                   (7a)
    Γ(x, x′) := E[(α(x, θ) − µ(x))(α(x′, θ) − µ(x′))].                        (7b)

Here µ is called the mean or expected value of α over all possible realizations, and Γ : D × D → R is called its covariance function or kernel. By definition, the kernel is bounded, symmetric and positive semi-definite [20]. Because the kernel is square integrable, that is, Γ ∈ L²(D × D), it is in fact a Hilbert-Schmidt kernel, see [39].

A random field is stationary or homogeneous if its statistical properties do not vary as a function of the position x ∈ D. This implies that the covariance function can be written as a function of the difference x − x′. Furthermore, for isotropic random fields the statistical properties are invariant under rotations, which means the covariance is a function of the Euclidean distance ‖x − x′‖.

Figure 2: (a) The exponential covariance function Γ(x, x′) = σ² exp(−‖x − x′‖/(bL)) and (b) the squared exponential (Gaussian) covariance function Γ(x, x′) = σ² exp(−‖x − x′‖²/(bL)²) for different correlation lengths. Note the difference in the continuity of the two kernels at x = x′: the exponential kernel is C⁰, while the squared exponential kernel is C^∞ at x = x′.

Remark 3.1.
Although the Euclidean distance is widely used in the literature, its use is not always justified. In general, the geodesic distance, i.e. the shortest distance between points x and x′ along all paths contained in D, is the true measure of distance. The Euclidean distance can differ significantly from the geodesic distance, especially if the correlation length is relatively large and the domain is non-convex. The geodesic distance is, however, difficult and expensive to compute, which explains its limited use. In this work we also use the Euclidean distance measure and assume its choice is a reasonable one in the context of the applied numerical benchmark problems.

Figure 2 shows two common examples of covariance functions that correspond to stationary isotropic random fields: the exponential and the
Gaussian or squared exponential kernel. Important parameters that influence the locality of these correlation functions are the variance σ² and the correlation length bL. Here L denotes a characteristic length and b is a dimensionless factor.

The KL expansion of a random field α(·, θ) requires the solution of an integral eigenvalue problem. Consider the linear operator

    T : L²(D) → L²(D),   (Tφ)(x) := ∫_D Γ(x, x′) φ(x′) dx′.                   (8)

The operator T is compact. In fact, T is a Hilbert-Schmidt operator, since the covariance function is a Hilbert-Schmidt kernel. Furthermore, since the covariance function is positive semi-definite and symmetric [20], T is a self-adjoint positive semi-definite linear operator. The eigenfunctions {φ_i}_{i∈N} of T are defined by the homogeneous Fredholm integral eigenvalue problem of the second kind,

    T φ_i = λ_i φ_i,   φ_i ∈ L²(D)  for  i ∈ N.                               (9)

The important properties of the eigenpairs are (1) the normalized eigenfunctions {φ_i}_{i∈N} are orthonormal, that is, (φ_i, φ_j)_{L²(D)} = δ_{ij}, and thus form a basis for L²(D); and (2) the corresponding eigenvalues form a sequence λ_1 ≥ λ_2 ≥ ... ≥
0, which in general decays with increasing mode number. Because Γ in (7b) is symmetric and positive semi-definite, it possesses the spectral decomposition [11, 49]

    Γ(x, x′) = Σ_{i=1}^{∞} λ_i φ_i(x) φ_i(x′).                                (10)

With these definitions, the KL expansion of a random field α ∈ L²(D ×
Θ) is defined by the following series [32]

    α(x, θ) = µ(x) + Σ_{i=1}^{∞} √λ_i φ_i(x) ξ_i(θ),  where  ξ_i(θ) := (1/√λ_i) ∫_D (α(x, θ) − µ(x)) φ_i(x) dx.   (11)

While {φ_i}_{i∈N} are pairwise L²-orthogonal on D, the {ξ_i}_{i∈N} are pairwise uncorrelated zero-mean random variables [20]. For this reason the KL expansion is sometimes said to be bi-orthogonal.

In order to represent a random field in a discrete numerical computation it is necessary to discretize the continuous probability space. This can be achieved by truncating the KL expansion after M terms and thus reducing the dimension of the stochastic space to M uncorrelated random variables

    α̃_M(x, θ) = µ(x) + Σ_{i=1}^{M} √λ_i φ_i(x) ξ_i(θ).                       (12)

The mean of a random field is not affected by the discretization. The variance of the discretization, on the other hand, can be derived from the spectral decomposition in equation (10)

    E[(α̃_M(x, θ) − µ(x))²] = Σ_{i=1}^{M} λ_i φ_i(x)².                        (13)

The variance of the discretized random field converges uniformly in D and in L²(Θ, Σ, P) towards the true variance [55]

    lim_{M→∞} Σ_{i=1}^{M} λ_i φ_i(x)² = Γ(x, x).                              (14)

Furthermore, it can be shown that the KL expansion is optimal with respect to the global mean-squared error among all series expansions of truncation order M [20]. The variational formulation or weak form of the integral eigenvalue problem introduced in equation (9) states
Find {λ, φ} ∈ R⁺₀ × L²(D) such that

    ∫_D ( ∫_D Γ(x, x′) φ(x′) dx′ − λ φ(x) ) ψ(x) dx = 0   ∀ψ ∈ L²(D).         (15)

Confining the solution to the finite-dimensional subspace S^h ⊂ L²(D) we obtain the discrete variational formulation

Find {λ_h, φ_h} ∈ R⁺₀ × S^h such that

    ∫_D ( ∫_D Γ(x, x′) φ_h(x′) dx′ − λ_h φ_h(x) ) ψ_h(x) dx = 0   ∀ψ_h ∈ S^h.   (16)

This is the Galerkin method for the homogeneous Fredholm integral eigenvalue problem of the second kind [4, 55]. Within the trial space under consideration, the Galerkin method produces the best L² approximation of the analytical modes. The resulting discrete modes preserve exactly the L² orthogonality property of the analytical mode shapes. Furthermore, it can be shown that a variational treatment using the Galerkin method leads to eigenvalues that converge monotonically, under mesh refinement, towards the true eigenvalues [20].

The choice of the trial space S^h provides some freedom in the design of the Galerkin method. The recently proposed isogeometric Galerkin method for the KL expansion of random fields uses NURBS for the test and trial spaces [55]. This choice is motivated by the fact that the geometrical mapping is defined using NURBS and it is natural to remain within the isoparametric paradigm. This method shares the same technical challenges as all classical Galerkin methods applied to this class of problems [1, 55]: the formation and assembly costs, which have a time complexity of O(N_e² · p^{3d}), as well as the storage requirements, which have a space complexity of O(N²), become quickly intractable with increasing number of elements N_e, polynomial degree p, dimension d and number of degrees of freedom N.
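The structure of (16) and its reduction to matrix form can be illustrated with a deliberately simple 1D toy (our own construction: piecewise-linear hat functions, an exponential kernel sampled at the nodes as a crude degenerate-kernel stand-in, and example grid parameters):

```python
import numpy as np

n, Lc = 41, 0.5                 # number of nodes and correlation length (example values)
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

# Gramian (mass) matrix of hat functions on a uniform grid: tridiag(h/6, 4h/6, h/6).
Z = (np.diag(np.full(n, 4.0)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) * h / 6
Z[0, 0] = Z[-1, -1] = 2 * h / 6

# Replace the kernel by its nodal interpolant, so that A = Z K Z with
# K the kernel sampled at the nodes (a degenerate-kernel approximation).
K = np.exp(-np.abs(x[:, None] - x[None, :]) / Lc)

# Generalized EVP  (Z K Z) v = lam Z v  reduced to standard symmetric form:
# with Z = C C^T one has C^{-1} (Z K Z) C^{-T} = C^T K C.
C = np.linalg.cholesky(Z)
lam = np.linalg.eigvalsh(C.T @ K @ C)[::-1]

print(lam[:4])  # positive and rapidly decaying, as expected for a covariance operator
```

Even in this crude setting the Cholesky reduction keeps the problem symmetric, which is the same mechanism the method of this paper exploits at scale with a Kronecker-structured Gramian.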
A practical Galerkin method must address these difficulties in its design.

We abandon the isoparametric concept and choose a different space to represent the finite-dimensional solution. Our choice offers multiple computational advantages without sacrificing higher-order accuracy and robustness. We define the trial space for the Galerkin method as

S^h := span{ B_i(x̂) / √(det DF(x̂)) }_{i ∈ I}.  (17)

Because the geometrical mapping F is smooth and invertible, the Jacobian determinant is never singular, that is, det DF(x̂) > 0 for all x̂ ∈ D̂. Importantly, the functions are linearly independent due to the linear independence of B-splines. In general, however, these basis functions do not form a partition of unity. Instead, the characterizing property is that products of these functions are integral preserving, that is, they transform as volume forms

∫_D [B_i(x̂)/√(det DF(x̂))] [B_j(x̂)/√(det DF(x̂))] dx = ∫_D B_i(x̂) B_j(x̂) / det DF(x̂) dx = ∫_{D̂} B_i(x̂) B_j(x̂) dx̂.

After substituting the desired subspace for the test and trial functions and performing minor algebraic manipulations, the discretized Galerkin method results in a generalized algebraic eigenvalue problem

A v^h = λ^h Z v^h,  (18)

where the system matrices are formed by evaluating

A_ij = ∫_{D̂} ∫_{D̂′} Γ(x(x̂), x(x̂′)) [B_i(x̂)/√(det DF(x̂))] [B_j(x̂′)/√(det DF(x̂′))] det DF(x̂) det DF(x̂′) dx̂′ dx̂
     = ∫_{D̂} ∫_{D̂′} Γ(x(x̂), x(x̂′)) B_i(x̂) B_j(x̂′) √(det DF(x̂) det DF(x̂′)) dx̂′ dx̂  (19)

and

Z_ij = ∫_{D̂} [B_i(x̂)/√(det DF(x̂))] [B_j(x̂)/√(det DF(x̂))] det DF(x̂) dx̂ = ∫_{D̂} B_i(x̂) B_j(x̂) dx̂.
(20)

As a result of the chosen solution space, the mass matrix Z has a Kronecker structure and can be decomposed into k = 1, …, d univariate mass matrices

Z_k := [Z_{i_k j_k}] = ∫ B_{i_k, p_k}(x̂_k) B_{j_k, p_k}(x̂_k) dx̂_k,  i_k, j_k = 1, …, n_k.  (21)

The system mass matrix Z can then be written as

Z = Z_d ⊗ · · · ⊗ Z_1.  (22)

Instead of computing and storing the matrix Z, we precompute and store the matrices Z_k, k = 1, …, d. Furthermore, it is the Kronecker structure that allows us to inexpensively reformulate the generalized eigenvalue problem as a standard algebraic eigenvalue problem.

Remark 3.2.
In practice, Z_k in (21) is computed exactly up to machine precision using Gauss-Legendre numerical quadrature with p + 1 quadrature points per element, where p is the polynomial degree in component direction k. For alternative ways of computing integrals of piecewise polynomial products see [67]. Because the domain of integration is one-dimensional, the formation and assembly costs of O(n_e p³) as well as the storage costs of O(pn) bytes are negligible compared to the total solver costs. Here n_e is the number of univariate elements and n is the univariate number of degrees of freedom in component direction k.
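As an illustration of the remark above, the following sketch computes a univariate B-spline mass matrix Z_k with p + 1 Gauss-Legendre points per element. The degree, knot vector and the use of SciPy's B-spline evaluation are illustrative choices, not the paper's implementation.

```python
import numpy as np
from scipy.interpolate import BSpline

p = 3                                    # polynomial degree in direction k
breaks = np.linspace(0.0, 1.0, 6)        # 5 uniform elements on [0, 1]
t = np.r_[[breaks[0]] * p, breaks, [breaks[-1]] * p]   # open knot vector
n = len(t) - p - 1                       # univariate number of dofs

xi, wi = np.polynomial.legendre.leggauss(p + 1)        # rule on [-1, 1]
Z = np.zeros((n, n))
for a, b in zip(breaks[:-1], breaks[1:]):              # loop over elements
    x = 0.5 * (b - a) * xi + 0.5 * (a + b)             # map nodes to [a, b]
    w = 0.5 * (b - a) * wi
    B = BSpline.design_matrix(x, t, p).toarray()       # basis values at nodes
    Z += B.T @ (w[:, None] * B)          # exact: integrand has degree 2p

# Sanity check via the partition of unity: all entries of Z sum to the
# domain length, since sum_i B_i = 1.
assert np.isclose(Z.sum(), 1.0)
```

Since the integrand B_i B_j is a piecewise polynomial of degree 2p, the (p + 1)-point rule (exact up to degree 2p + 1) integrates it exactly, as stated in the remark.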
4. Efficient matrix-free solution strategy
There are two major challenges when applying the Galerkin method to discretize the homogeneous Fredholm integral eigenvalue problem in (9). Firstly, the variational formulation requires integration over a 2d-dimensional domain to evaluate the matrix entries in A. This leads to formation and assembly costs with complexity O(N_e p^{3d}), where N_e is the global number of elements, p the polynomial degree and d the spatial dimension. Secondly, because the matrix A is dense, it requires insurmountable memory storage for any practical problem of interest. Several techniques have been presented in the literature to deal with these challenges, for example approximation with low-rank matrices such as hierarchical matrices [1, 22, 35], or Fast Multipole Methods [61]. In this work we present a combination of four techniques to deal with the aforementioned challenges:

1. Reformulation of the generalized eigenvalue problem into an equivalent standard eigenvalue problem;
2. Interpolation based quadrature for variational formulations of integral equations;
3. Efficient formation of finite element arrays based on Kronecker matrix-vector products;
4. Formulation of a matrix-free and parallel matrix-vector product for the Lanczos algorithm.

The reformulation into a standard algebraic eigenvalue problem significantly reduces the computational cost and simultaneously improves conditioning. By exploiting the Kronecker structure of the right-hand-side mass matrix we can perform this reformulation with negligible overhead. The proposed non-standard quadrature technique, which we call interpolation based integration, is tailored to variational formulations of integral equations. The technique is optimal in the sense that few quadrature points are required while integrating exactly a rich space of tensor product functions on the 2d-dimensional domain D × D. Importantly, the technique lends itself to multidimensional tensor contraction due to the Kronecker structure of the involved matrices.
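Points 1 and 3 above can be made concrete with a small NumPy/SciPy sketch. The sizes are illustrative and random SPD matrices stand in for the univariate mass matrices Z_k and for the dense system matrix A; the point is that the generalized problem A v = λ Z v is turned into a standard one using only the small univariate Cholesky factors of Z = Z_d ⊗ · · · ⊗ Z_1.

```python
import numpy as np
from scipy.linalg import cholesky, eigh, solve_triangular

rng = np.random.default_rng(1)

def spd(n):
    # toy SPD stand-in (not a real mass matrix)
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

n1, n2 = 3, 4
Z1, Z2 = spd(n1), spd(n2)
Z = np.kron(Z2, Z1)                       # system mass matrix Z = Z2 (x) Z1
A = spd(n1 * n2)                          # dense stand-in for the KLE matrix

L1 = cholesky(Z1, lower=True)             # only the univariate factors are
L2 = cholesky(Z2, lower=True)             # ever factorized: O(n^3), not O(N^3)
L = np.kron(L2, L1)                       # Kronecker of lower-triangular
                                          # factors is the factor of Z

# A' = inv(L) A inv(L)^T -- standard symmetric eigenvalue problem
Ap = solve_triangular(L, solve_triangular(L, A, lower=True).T, lower=True).T
lam_std = np.linalg.eigvalsh(Ap)
lam_gen = eigh(A, Z, eigvals_only=True)   # generalized reference

assert np.allclose(np.sort(lam_std), np.sort(lam_gen))
```

The dense `np.kron` calls appear here only to verify the identity; in the actual method the Kronecker factors would be applied one direction at a time and the full L is never formed.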
This significantly speeds up the evaluation of integrals over high-dimensional domains and scales favorably with polynomial degree. Finally, all techniques are combined within a matrix-free evaluation scheme that is embarrassingly parallel and requires minimal memory storage. The formation and assembly costs of our approach are negligible compared to the remaining solver costs of the Lanczos eigenvalue solver, which are O(Ñ² · N_iter / N_thread). Here Ñ is the global number of degrees of freedom of the interpolation space, N_iter is the number of iterations of the eigensolver and N_thread is the number of simultaneous processes. In the following we discuss each of the proposed techniques in more detail.

4.1. Reformulation into a standard algebraic eigenvalue problem

Let us consider a Cholesky factorization of the mass matrix Z = LLᵀ and define a linear transformation of the eigenvectors v′_h := Lᵀ v_h. The generalized eigenvalue problem can then be rewritten (see [58, Chapter 9.2.2]) as a standard eigenvalue problem with unchanged eigenvalues corresponding to the new eigenvectors v′_h,

A′ v′_h = λ_h v′_h, where A′ := L⁻¹ A L⁻ᵀ.  (23)

It is expected that the new system matrix A′ has improved conditioning compared to A. The kernel is positive-definite and symmetric. In practice, it often tends quickly to zero with increasing distance ‖x − x′‖. In the limiting case, where Γ(x, x′) → δ(x, x′), the system matrix A → Z and hence the preconditioner would be ideal.

Although improved conditioning is beneficial, the main reason for the chosen transformation is efficiency. Solving a standard algebraic eigenvalue problem is much less expensive than solving a generalized eigenvalue problem, and the transformation itself is inexpensive. Using the Kronecker structure of Z and the properties (2b) and (2d) we may write

Z = Z_d ⊗ · · ·
⊗ Z_1 = L_d L_dᵀ ⊗ · · · ⊗ L_1 L_1ᵀ = (L_d ⊗ · · · ⊗ L_1)(L_d ⊗ · · · ⊗ L_1)ᵀ = L Lᵀ.  (24)

Here L_k L_kᵀ, k = 1, …, d, denote the Cholesky factorizations corresponding to the univariate mass matrices Z_k. Hence, instead of performing the Cholesky factorization of the complete system matrix Z ∈ ℝ^{N×N}, which is the standard procedure in most solvers for generalized algebraic eigenvalue problems, we merely need the Cholesky factorizations of Z_k ∈ ℝ^{n_k × n_k}, k = 1, …, d.

The factorization is precomputed once before it is used in the eigenvalue solver. The associated computational cost is reduced from O(N³) to O(n³) flops, where n = max(n_1, …, n_d) and N = n_1 · … · n_d. Subsequently, the cost of applying the factorization in a single iteration of the eigenvalue solver is reduced from O(N²) to O(n · N). Besides a reduction in computational cost, this approach significantly reduces the required memory storage.

4.2. Interpolation based quadrature

One of the most straightforward ways of improving efficiency is to design quadrature rules that require fewer evaluation points. This is especially true for the variational formulation of the Fredholm integral equation, which requires numerical integration over a 2d-dimensional domain. In practice, accurate and efficient quadrature rules are designed as follows. First, one chooses a space of functions T, called the target space for numerical quadrature, whose elements should be exactly integrated by the new quadrature rule. If this space is in some sense rich enough, then the error due to quadrature can be bounded by the discretization error, which is needed to show optimal rates of convergence of the numerical method (see [24]).
The next objective is to find a quadrature rule that requires as few points as possible to integrate all functions in T.

We present a non-standard quadrature technique that generalizes the approach to quadrature presented for linear finite elements in [34, 35] to higher-order splines. Our approach is tailored toward evaluating integrals found in variational formulations of integral equations and achieves a very low number of evaluation points while integrating exactly a rich space of functions. The target space T is chosen such that the quadrature scheme exactly evaluates the integral

∫_{D̂} ∫_{D̂′} G̃(x̂, x̂′) B_i(x̂) B_j(x̂′) dx̂′ dx̂,  G̃ ∈ B̃^h(D̂) ⊗ B̃^h(D̂′),  (25)

using Ñ points. Here B̃^h is another d-dimensional spline space that can be chosen independently of B^h, and Ñ is its dimension. In practice this space can be chosen to fit well with the integrand in the variational formulation of the integral equation. This provides additional flexibility to the quadrature scheme.

Because G̃ ∈ B̃^h(D̂) ⊗ B̃^h(D̂′) is a real-valued 2d-variate spline function, it can be expanded in terms of B-spline basis functions and real-valued coefficients {G̃_kl}_{k,l ∈ Ĩ} as

G̃(x̂, x̂′) := Σ_{k,l ∈ Ĩ} G̃_kl B̃_k(x̂) B̃_l(x̂′)  with  k, l ∈ Ĩ := {(i_1, …, i_d) : 1 ≤ i_k ≤ ñ_k}.

Comparing the multidimensional integrand in (19) with the one in (25), we may conclude that the degrees of freedom {G̃_kl}_{k,l ∈ Ĩ} should be chosen such that G̃ is a good approximation of the function G : D̂ × D̂ → ℝ⁺ defined as

G(x̂, x̂′) := Γ̂(x̂, x̂′) √(det DF(x̂) det DF(x̂′)).  (26)

Here Γ̂(x̂, x̂′) is the pull-back of the kernel Γ(x, x′) from the physical to the parametric space using the geometrical mapping F.

Remark 4.1.
To maintain optimal accuracy in numerical quadrature, locally, the smoothness of the interpolation space should not exceed the smoothness of the integrand in eq. (26). Indeed, Figure 3b shows that the error convergence due to quadrature of an exponential kernel, which features reduced regularity at x = x′, is suboptimal. Similarly, the reduced regularity due to the Jacobian determinant term needs to be taken into account when choosing the interpolation space. Being set in the framework of splines and isogeometric analysis, our method provides sufficient flexibility in enforcing the required smoothness.

The approximation G̃ can be estimated in different ways. We follow an approach similar to the degenerate kernel approximation in [3] and choose to collocate G at the Greville abscissae [13]. This approach is simple and combines high-order accuracy with a minimal number of evaluation points. We note that related ideas based on quasi-interpolation have been presented in [7, 18, 19] for formation and assembly of boundary integral equations and in [47, 51] for matrix assembly in Galerkin discretizations of PDEs.

Let B̃ = [B̃_ij] := [B̃_j(x̂_i)] denote the d-variate spline collocation matrix evaluated at the Greville abscissae x̂_i ∈ D̂, i ∈ Ĩ. The interpolation problem states

Find {G̃_kl}_{k,l ∈ Ĩ} such that Σ_{k,l ∈ Ĩ} G̃_kl B̃_k(x̂_i) B̃_l(x̂′_j) = G(x̂_i, x̂′_j)  ∀ i, j ∈ Ĩ.  (27)

This is equivalent to the matrix problem G = B̃ G̃ B̃ᵀ. Hence, the matrix of coefficients can be computed as G̃ = B̃⁻¹ G B̃⁻ᵀ. The computational cost of the interpolation can be significantly reduced from O(Ñ³) to O(ñ · Ñ²) flops, where ñ = max(ñ_1, …, ñ_d), by exploiting the Kronecker structure of B̃. We decompose B̃ into d univariate collocation matrices B̃_k, k = 1, …, d, and use property (2c) to write its inverse as

B̃⁻¹ = B̃_d⁻¹ ⊗ · · · ⊗ B̃_1⁻¹  with  B̃_k := [B̃_{i_k j_k}] = [B̃_{j_k, p̃_k}(x̂_{i_k})].
(28)

In practice we compute d LU factorizations, each corresponding to a univariate matrix B̃_k, to apply the inverse of B̃ to a vector. Note that this approach is similar to the one we took in equation (24) for the Cholesky factorization of Z.

4.3. Matrix formation

By substituting G in equation (19) with G̃ we can approximate the matrix A by a matrix Ã with entries

Ã_ij := ∫_{D̂} ∫_{D̂′} G̃(x̂, x̂′) B_i(x̂) B_j(x̂′) dx̂′ dx̂
      = Σ_{k,l ∈ Ĩ} G̃_kl ∫_{D̂} ∫_{D̂′} B̃_k(x̂) B̃_l(x̂′) B_i(x̂) B_j(x̂′) dx̂′ dx̂
      = Σ_{k,l ∈ Ĩ} G̃_kl ∫_{D̂} B̃_k(x̂) B_i(x̂) dx̂ ∫_{D̂′} B̃_l(x̂′) B_j(x̂′) dx̂′.

Hence, using the tensor product structure of the interpolation space, we have separated the 2d-dimensional integral into a product of two d-dimensional integrals. In matrix notation we may write Ã = Mᵀ G̃ M, or

Ã_ij = Σ_{k,l ∈ Ĩ} G̃_kl M_ki M_lj.  (29)

Here M := [M_ij] is a mass matrix

M_ij = ∫_{D̂} B̃_i(x̂) B_j(x̂) dx̂.  (30)

Similarly, as we did for the matrix Z in (22), we can exploit the Kronecker structure and decompose M into d univariate mass matrices M_k := [M_{i_k j_k}],

M = M_d ⊗ · · · ⊗ M_1  with  M_{i_k j_k} = ∫ B̃_{i_k, p̃_k}(x̂_k) B_{j_k, p_k}(x̂_k) dx̂_k.  (31)

As in the case of Z_k, k = 1, …, d, these univariate matrices are computed up to machine precision as discussed in Remark 3.2. The approximation error A − Ã is entirely due to the interpolation error G − G̃. Hence, an accurate approximation of G should result in an accurate approximation of A.

4.4. Matrix-free matrix-vector product

The interpolation based quadrature technique introduced in the previous section involves computation of the matrix of coefficients G̃ := [G̃_kl]. This matrix is dense and has Ñ² entries.
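The LU-based application of B̃⁻¹ described above can be sketched as follows, with d = 2 and toy collocation matrices of illustrative size: the inverse of B̃ = B̃_2 ⊗ B̃_1 is applied one tensor direction at a time using only the precomputed univariate factorizations, and the result is verified against a dense Kronecker solve.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(2)
n1, n2 = 4, 5
# toy stand-ins for the univariate collocation matrices (well-conditioned)
B1 = np.eye(n1) + 0.1 * rng.standard_normal((n1, n1))
B2 = np.eye(n2) + 0.1 * rng.standard_normal((n2, n2))
g = rng.standard_normal(n1 * n2)

ref = np.linalg.solve(np.kron(B2, B1), g)     # dense reference

lu1, lu2 = lu_factor(B1), lu_factor(B2)       # d univariate LU factors,
                                              # computed once and reused
G = g.reshape(n2, n1)                         # slowest index first
G = lu_solve(lu1, G.T)                        # solve along direction 1
G = lu_solve(lu2, G.T)                        # solve along direction 2
assert np.allclose(ref, G.reshape(-1))
```

Each univariate solve works on the corresponding tensor axis of the reshaped data, so the full Ñ × Ñ Kronecker matrix is never formed.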
Consequently, storing G̃ is just as inconvenient as storing Ã, and it quickly becomes intractable with problem size. In this section we propose a matrix-free evaluation of the matrix-vector product v′ ↦ Ã′v′ that does not require explicit access to the matrix G̃ or Ã. We have the following standard algebraic eigenvalue problem

Ã′ v′ = λ_h v′.  (32a)

Here, the system matrix Ã′ can be written as

Ã′ = L⁻¹ Mᵀ B̃⁻¹ J Γ J B̃⁻ᵀ M L⁻ᵀ,  (32b)

where J is a diagonal matrix with diagonal entries given by the square roots of the Jacobian determinants evaluated at the Greville abscissae, and B̃, M and L are all Kronecker product matrices. Consequently, a matrix-vector product with any of these matrices can be performed in close to linear time complexity. The matrix-vector product v′ ↦ Ã′v′ can be subdivided into the following operations:

Γ := [Γ̂(x̂_k, x̂′_l)]  (Evaluation of the kernel at the Greville abscissae)  (32c)
G = J Γ J  (Scaling of the kernel)  (32d)
G̃ = B̃⁻¹ G B̃⁻ᵀ  (Interpolation of the scaled kernel)  (32e)
Ã = Mᵀ G̃ M  (Evaluation of the integrals)  (32f)
Ã′ = L⁻¹ Ã L⁻ᵀ  (Application of the preconditioner)  (32g)

In the following subsection we present a matrix-free matrix-vector product that incorporates each of the above steps. Except for the diagonal matrix J, none of the above matrices are stored explicitly. Only the corresponding univariate matrices are stored and used in the Kronecker products, while the entries of Γ are computed on the fly. Let Γ̂_{k_1…k_d l_1…l_d} := Γ̂(x̂_{1,k_1}, …, x̂_{d,k_d}, x̂′_{1,l_1}, …, x̂′_{d,l_d}) ∈ ℝ^{ñ_1×···×ñ_d×ñ_1×···×ñ_d} denote the function values of the kernel evaluated at the tensor product grid of the Greville abscissae in D̂ × D̂. The proposed evaluation order of the matrix-free matrix-vector product is summarized in Algorithm 1.

Algorithm 1
Matrix-free evaluation of the matrix-vector product v′ ↦ Ã′v′

Input: v_{i_1…i_d} ∈ ℝ^{n_1×···×n_d}, J_{l_1…l_d} ∈ ℝ^{ñ_1×···×ñ_d}, B̃_{i_k j_k} ∈ ℝ^{ñ_k×ñ_k} and M_{l_k j_k} ∈ ℝ^{ñ_k×n_k}
Output: v′_{i_1…i_d} ∈ ℝ^{n_1×···×n_d}

1: V_{j_1…j_d} ← L⁻¹_{i_1 j_1} · · · L⁻¹_{i_d j_d} v_{i_1…i_d}   ▷ Preconditioning from the right
2: X_{k_1…k_d} ← M_{k_1 j_1} · · · M_{k_d j_d} V_{j_1…j_d}
3: Y_{l_1…l_d} ← B̃⁻¹_{k_1 l_1} · · · B̃⁻¹_{k_d l_d} X_{k_1…k_d}   ▷ Use the LU factorizations of B̃_k, k = 1, …, d
4: Y′_{l_1…l_d} ← J_{l_1…l_d} ⊙ Y_{l_1…l_d}
5: Z′_{k_1…k_d} ← Γ̂_{k_1…k_d l_1…l_d} Y′_{l_1…l_d}   ▷ Evaluate in parallel without forming Γ
6: Z_{k_1…k_d} ← J_{k_1…k_d} ⊙ Z′_{k_1…k_d}
7: Y_{j_1…j_d} ← B̃⁻¹_{j_1 k_1} · · · B̃⁻¹_{j_d k_d} Z_{k_1…k_d}   ▷ Use the LU factorizations of B̃_k, k = 1, …, d
8: V_{l_1…l_d} ← M_{j_1 l_1} · · · M_{j_d l_d} Y_{j_1…j_d}
9: v′_{i_1…i_d} ← L⁻¹_{i_1 l_1} · · · L⁻¹_{i_d l_d} V_{l_1…l_d}   ▷ Preconditioning from the left

The matrix-free matrix-vector product v′ ↦ Ã′v′ is evaluated in nine separate stages. Stage one applies back-substitution of the upper triangular matrix Lᵀ and exploits its Kronecker structure. Stage two applies a matrix-vector product with the matrix M and again exploits its Kronecker structure. Stage three applies back-substitution using the factorization of the interpolation matrix B̃. Again, the Kronecker structure is essential to reduce both the space and time complexity of the back-substitution. In stage four the coefficient vector is multiplied element-wise by the square root of the Jacobian determinant evaluated at the Greville abscissae. Here, element-wise multiplication is denoted by the ⊙ symbol. Stage five dominates the computational cost of Algorithm 1. This stage represents a dense matrix-vector product. To perform this step without explicitly forming the matrix Γ we compute its entries on the fly, one row at a time.
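A minimal one-dimensional sketch of this stage (illustrative exponential kernel, point set and sizes; the paper's implementation uses Numba and threads): the dense product z = Γy is formed one block of rows at a time, so the Ñ × Ñ matrix Γ is never stored and the blocks are independent units of parallel work.

```python
import numpy as np

b = 0.5                                    # illustrative correlation length
kernel = lambda x, xp: np.exp(-np.abs(x[:, None] - xp[None, :]) / b)

x = np.linspace(0.0, 1.0, 2000)            # stand-in for Greville abscissae
y = np.random.default_rng(3).standard_normal(x.size)

def matvec_on_the_fly(x, y, block=256):
    z = np.empty_like(y)
    for s in range(0, x.size, block):      # each block: O(block * N) work
        rows = kernel(x[s:s + block], x)   # form only `block` rows of Γ
        z[s:s + block] = rows @ y          # discard the rows afterwards
    return z

z = matvec_on_the_fly(x, y)
assert np.allclose(z, kernel(x, x) @ y)    # dense reference
```

Peak memory is O(block · Ñ) instead of O(Ñ²), and each row block can be dispatched to a separate thread.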
We compute products of the coefficient vector with several rows of Γ in parallel. Stages six to nine are equivalent to stages four to one, due to the symmetry of the operator.

Due to the iterative solution process, the matrix-vector product needs to be evaluated at each iteration. The number of iterations depends on the number of required eigenmodes, the conditioning of the algebraic eigenvalue problem and the efficiency of the eigensolver. In this work we use the standard implicitly restarted Lanczos method [21].

5. Computational complexity analysis

The goal of a time-complexity analysis is to obtain an estimate of the computational cost that scales linearly with time. This cost is expressed in terms of certain parameters that depend on the problem size, the dimension and the polynomial degree. For this purpose, let us introduce the following notation:

n — number of degrees of freedom of the trial space in one component direction;
ñ — number of degrees of freedom of the interpolation space in one component direction;
N := n^d — total number of degrees of freedom of the trial space;
Ñ := ñ^d — total number of degrees of freedom of the interpolation space;
N_e — total number of spatial elements in the trial space;
N_q — number of quadrature points in a standard quadrature loop;
N_iter — number of iterations of the matrix-free algorithm;
N_thread — number of simultaneous shared-memory processes in the matrix-vector product.

In the following we present the computational complexity of standard finite element procedures for higher-order finite elements. We use the tensor product structure of the high-dimensional space
D̂ × D̂ to minimize the involved computations. The general setting for this analysis is: (1) D̂ × D̂ has N_e elements; and (2) we assume a quadrature rule Q(f) := Σ_{k=1}^{N_q} w_k f(x_k), with 1 ≤ N_q ≤ (p+1)^d, to integrate the products on every d-dimensional element □^d in D̂.

The leading term in formation and assembly is determined by the cost of forming the element matrices. Consider the following element matrix

A^e_{ij} = ∫_{□^d} B_i(x̂) ∫_{□^d} Γ(x̂, x̂′) B_j(x̂′) dx̂′ dx̂ ≈ Σ_{k=1}^{N_q} w_k B_i(x̂_k) Σ_{l=1}^{N_q} w_l Γ(x̂_k, x̂′_l) B_j(x̂′_l) = Σ_{k=1}^{N_q} C_{ik} D_{kj},  with  D_{kj} = Σ_{l=1}^{N_q} w_l Γ(x̂_k, x̂′_l) B_j(x̂′_l),

with i, j ∈ I. We see that A^e can be formed by the matrix product of the matrices C ∈ ℝ^{(p+1)^d × N_q} and D ∈ ℝ^{N_q × (p+1)^d}. This matrix product costs O(N_q (p+1)^{2d}). The formation of C is negligible. The formation of D, on the other hand, is O(N_q² (p+1)^d). Since N_q ≤ (p+1)^d, the leading term is N_q (p+1)^{2d}. Hence, the total cost of forming one element matrix is O(N_q (p+1)^{2d}). In total we have to integrate over all N_e multidimensional elements of D̂ × D̂. With that, the total cost of forming A is O(N_e N_q (p+1)^{2d}). Using a Gauss-Legendre quadrature rule with (p+1) quadrature points in every coordinate direction gives in total N_q = (p+1)^d quadrature points, and we can expect a leading cost proportional to O(N_e (p+1)^{3d}).

Estimates presented in the previous subsection hold for classical hp finite element procedures employing a standard quadrature loop. Next we discuss the complexity of finite element procedures that employ sum factorization instead of a standard quadrature loop.
Sum factorization significantly speeds up the element array formation by exploiting the tensorial structure of both the finite element basis and the quadrature rules [2, 6, 50, 65, 68]. It is worth noting that, due to the structure of the integral operator, the sum factorization technique looks somewhat different than is standard in the hp finite element method.

The setting for this analysis is: (1) D̂ × D̂ has N_e rectangular elements; (2) we use a tensor product basis of polynomial degree p on every element; and (3) we use a tensor product of univariate Gauss-Legendre quadrature rules Q(f) := Σ_{k=1}^{p+1} ω_k f(x_k) to integrate the products on every d-dimensional element □^d in D̂. Consider the element matrix

A^e_{ij} = ∫_{□^d} B_i(x) ∫_{□^d} Γ(x, x′) B_j(x′) dx′ dx
         ≈ Σ_{k_1=1}^{p+1} B_{i_1,p}(x_{1,k_1}) Σ_{k_2=1}^{p+1} B_{i_2,p}(x_{2,k_2}) · · · Σ_{k_d=1}^{p+1} B_{i_d,p}(x_{d,k_d}) Σ_{l_1=1}^{p+1} B_{j_1,p}(x′_{1,l_1}) Σ_{l_2=1}^{p+1} B_{j_2,p}(x′_{2,l_2}) · · · Σ_{l_d=1}^{p+1} Γ(x_{1,k_1}, …, x_{d,k_d}, x′_{1,l_1}, …, x′_{d,l_d}) B_{j_d,p}(x′_{d,l_d}),

with i, j ∈ I. Sum factorization is essentially tensor contraction. The kernel evaluated at the grid of quadrature points is a tensor Γ_{k_1…k_d l_1…l_d} ∈ ℝ^{(p+1)×···×(p+1)} that is contracted with the matrices [B_{j_z,p}(x′_{z,l_z})] ∈ ℝ^{(p+1)×(p+1)}, z = 1, …, d, and subsequently with the matrices [B_{i_z,p}(x_{z,k_z})] ∈ ℝ^{(p+1)×(p+1)}, z = 1, …, d. The cost of every contraction is O((p+1)^{2d+1}) flops. In total there are 2d such tensor contractions. Hence, the element matrix formation cost for A^e is O(d(p+1)^{2d+1}) flops. With N_e elements, the leading cost of forming A is O(dN_e(p+1)^{2d+1}) flops.
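The contraction sequence can be sketched for a single element with d = 2, with random data standing in for the weighted basis values and the kernel samples (q = p + 1; all sizes illustrative). Each `tensordot` here touches q^{2d+1} = q^5 multiply-adds, while the single naive `einsum` reference touches q^{4d} = q^8 terms.

```python
import numpy as np

rng = np.random.default_rng(4)
q = 4                                    # q = p + 1 quadrature points/direction
B = rng.standard_normal((q, q))          # stand-in for weighted basis values
K = rng.standard_normal((q, q, q, q))    # kernel at the quadrature grid
                                         # with axes (k1, k2, l1, l2)

# Naive reference: one big 8-index sum
ref = np.einsum('ap,bq,cr,ds,pqrs->abcd', B, B, B, B, K)

# Sum factorization: 2d = 4 successive one-direction contractions.
# Each tensordot contracts the last axis of T with a basis matrix and
# prepends the new basis index, ending with axes (i1, i2, j1, j2).
T = K
for _ in range(4):
    T = np.tensordot(B, T, axes=([1], [3]))

assert np.allclose(T, ref)
```

The same data yields the same element matrix, but the work per element drops from O(q^{4d}) to O(d·q^{2d+1}).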
In order to analyze the computational complexity of the proposed solution strategy we must address each stage of the matrix-free matrix-vector product introduced in Algorithm 1. Let us consider the complexity of one iteration of the matrix-free algorithm.

Stage 1 has a cost of O(d n^{d+1}).
Stage 2 has a cost depending on the chosen projection space:
  for n > ñ the cost is O(d p n^d),
  for n = ñ the cost is O(d p n^d),
  for n < ñ the cost is O(d p ñ^d).
Stage 3 has a cost of O(d ñ^{d+1}).
Stage 4 has a cost of O(ñ^d).
Stage 5 has a cost of O(Ñ² / N_thread).

The remaining stages 6, 7, 8 and 9 have the same costs as stages 4, 3, 2 and 1, respectively. In stages 1 and 3 we assume that the Cholesky and LU factorizations of the univariate matrices of size n × n and ñ × ñ, respectively, have been precomputed and are available. Subsequently, the solver costs are O(n²) and O(ñ²) flops, respectively, for each application of a univariate factorization. The cost of a single iteration is typically dominated by stage 5, which does not depend on the polynomial degree p. Fortunately, this stage is embarrassingly parallel. Hence, the time complexity of the matrix-free algorithm is O(Ñ² N_iter / N_thread) flops.

Both matrix-free and non-matrix-free methods need to store the resulting eigenmodes. Storage of the results takes roughly 8·L·N bytes, where L is the number of eigenmodes that need to be computed. Additionally, the standard approach that stores the dense left-hand-side system matrix A ∈ ℝ^{N×N} requires storage of N² floating point numbers; using double precision floating point arithmetic the storage requirements are 8N² bytes. The additional storage of the matrix-free approach scales linearly with problem size, with an asymptotic leading term of roughly 2·8N bytes, again using double precision floating point arithmetic.
If stage 5 of the matrix-free algorithm is performed using shared-memory parallelism, then one can expect this to increase to (1 + N_thread)·8N bytes, where N_thread is the number of simultaneous processes. Consequently, the storage cost of the matrix-free approach is typically governed by the storage of the results and is thus optimal. Tables 2 and 3 summarize the leading terms in storage of the two alternatives.

Table 2: The leading terms in storage costs of a method that explicitly stores the main system matrix. Here L refers to the number of eigenmodes that need to be stored. The storage cost is dominated by storage of the system matrix.

Number of degrees of freedom | 10⁴ | 10⁵ | 10⁶
Results storage | L · 80 KB | L · 800 KB | L · 8 MB
System matrix storage | 800 MB | 80 GB | 8 TB

Table 3: The leading terms in storage costs of a matrix-free approach. Here L refers to the number of eigenmodes that need to be stored. The storage cost of the matrix-free method is typically dominated by the storage of the results.

Number of degrees of freedom | 10⁴ | 10⁵ | 10⁶
Results storage | L · 80 KB | L · 800 KB | L · 8 MB
Additional storage | 2 · 80 KB | 2 · 800 KB | 2 · 8 MB
6. Numerical results
In this section we present numerical results that illustrate the accuracy, robustness and computational efficiency of the proposed matrix-free isogeometric Galerkin method. A spectral study of the accuracy is performed in the one-dimensional case. This study gives insight into the accuracy attained by interpolation based quadrature of the covariance function and its effect on the approximation of the eigenvalue spectra. Next, several three-dimensional benchmarks illustrate the computational performance attained for a range of polynomial degrees and different refinement strategies of the interpolation as well as the solution space. In the first two three-dimensional examples, the error in the discrete linear operator introduced by the interpolation based quadrature is showcased in the 2-norm and the Frobenius norm. In the last two three-dimensional examples we study the effect of solution space refinement. All computations in the benchmark cases are performed entirely in a single process, without taking advantage of parallel execution, neither in the kernel evaluation itself nor in the linear algebra packages behind the implementation. The machine used in these study cases is a laptop equipped with an Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz and 2x16 GB of non-ECC DDR4 2666MHz RAM. We further provide a plot showcasing the scalability of the method for one of the examples. The Python implementation in this work relies heavily on the packages NumPy [66] and SciPy [62] for the linear algebra and solver functionalities. In order to achieve high performance, crucial parts of the code base are just-in-time compiled using the LLVM-based Python compiler Numba [38, 40].
6.1. One-dimensional study case

Consider a one-dimensional random field defined on the domain D = [0, 1] ⊂ ℝ. We investigate

(i) the relative L²(D) interpolation error of the kernel, G − G̃, with respect to uniform h-refinement enforcing C^{p−1} continuity across element boundaries. We consider the cases where G is the Gaussian and the exponential kernel with b = 0.1, a characteristic domain length L = 1 and a variance σ² = 1.

(ii) the normalized spectra corresponding to an exponential kernel with b = 1.

Remark 6.1.
Normalized spectra corresponding to a Gaussian kernel cannot be reliably computed across the full range of eigenvalues because the smallest eigenvalues quickly approach zero up to machine precision.

Kernel approximation
Figure 3a shows the convergence towards the Gaussian kernel for polynomial degrees 1 through 8. It appears that the even degrees p = {2, 4, 6, 8} perform relatively better than the preceding odd degrees p = {1, 3, 5, 7}. It is evident that higher-order interpolation of a smooth kernel leads to a higher-order convergence rate in the approximation. The smooth interpolation space is not as suitable for approximation of kernels that have low regularity. Approximation of the exponential kernel in Figure 3b, which is C⁰ along x = x′, shows that higher-order continuity of the basis does not lead to an increased convergence rate. This behavior is in agreement with convergence estimates for spline approximation of arbitrary smoothness [59].

Spectral approximation
Although the proposed method, in its current form, is best suited for smooth kernels like the Gaussian kernel, excellent approximation of the eigenvalues corresponding to non-smooth kernels is still possible. Figure 4 depicts the full spectrum corresponding to the exponential kernel with b = 1. The proposed Galerkin method using interpolation based quadrature (IBQ) is compared to the isogeometric Galerkin method proposed in [55] and a classical C⁰ finite element solution in the case of polynomial degree p = 2. The mesh size of the interpolation space h̃ is set to a fixed fraction of bL. The proposed method (IBQ) exhibits the same advantageous characteristics as the standard isogeometric Galerkin method [55] and shows no branching phenomena, in contrast to the C⁰-continuous finite element approximation. Due to their increased continuity across element boundaries, splines achieve higher accuracy per degree of freedom and increased robustness compared with classical C⁰ finite element methods. These results are in agreement with several other studies that have investigated spectral approximations corresponding to eigenvalue problems in structural mechanics [9, 28, 29, 54].
The reference solution is computed using a standard isogeometric Galerkin method [55] with fifty thousand degrees of freedom. The first twenty eigenvalues have been validated up to machine accuracy against the analytical approach described in [20, Ch. 2.3.3, pages 28-35]. The analytical computation of these eigenvalues involves solving for the roots of a complex equation and is for that reason avoided beyond the first twenty eigenvalues.

6.2. Random field with exponential kernel in a three-dimensional half-open cylindrical domain
In the first three-dimensional example we investigate a random field defined on a half-open cylindrical domain, as shown in Figure 5. We consider Gaussian and exponential kernels with a correlation length bL equal to half of the characteristic length L. The variance of the random field is σ² = 1. We note that the example with an exponential kernel is also studied in [55].

Example 1–1
In this example we consider the exponential kernel. Since this kernel is only C^0 along x = x′, there is no use in enforcing higher smoothness on the element boundaries of the interpolation space. Moreover, the coarse geometry is modeled by two elements with C^0 continuity between both elements; therefore, the Jacobian determinants will be discontinuous at the interface. Recalling Remark 4.1, in order to attain optimal accuracy of our interpolation based quadrature, we enforce a discontinuous interpolation space in the circumferential direction at that interface. Under these considerations, the chosen interpolation space employs quadratic B-splines and is C^1 on most element boundaries, except at the discontinuity, where it is C^{-1} in the circumferential direction. Since the system equations are not affected by the discontinuity, see equations (22) and (29), we choose the continuity of the solution space analogously to the example presented in [55] and use quadratic B-splines in each direction with C^1 continuity on the element boundaries. The solution space and the interpolation space meshes for each case are shown in Figure 6. The twenty largest eigenvalues together with information about the mesh and computational cost are tabulated in Table 4. The first nine eigenfunctions corresponding to the nine largest eigenvalues are visualized in Figure 7 by weighting each eigenfunction by the square root of the corresponding eigenvalue.

The presolution formation and assembly time includes the formation and assembly of the univariate mass matrices, the interpolation matrices and their factorizations, as well as the computation of the Jacobian determinants at the Greville abscissae. The results in Table 4 indicate that these setup costs are negligibly small compared to the total solution time of the Lanczos eigenvalue solver, which is dominated by the matrix-free evaluation of the matrix-vector product (Step 5 in Algorithm 1).
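The matrix-vector product that dominates the solver cost exploits the Kronecker structure of the univariate factors. As an illustrative sketch (not the tensiga implementation itself), the action of a Kronecker product (A₁ ⊗ … ⊗ A_d) on a vector can be applied by successive tensor contractions, without ever forming the full matrix:

```python
import numpy as np

def kron_matvec(factors, x):
    """Apply (A_1 kron A_2 kron ... kron A_d) @ x without forming the
    Kronecker product, via successive tensor contractions."""
    dims = [A.shape[1] for A in factors]
    X = x.reshape(dims)  # vector -> d-way tensor (C ordering)
    for k, A in enumerate(factors):
        # contract mode k with A, then move the new axis back into place
        X = np.moveaxis(np.tensordot(A, X, axes=(1, k)), 0, k)
    return X.ravel()

rng = np.random.default_rng(0)
A = [rng.standard_normal((n, n)) for n in (3, 4, 5)]
x = rng.standard_normal(3 * 4 * 5)
ref = np.kron(np.kron(A[0], A[1]), A[2]) @ x
assert np.allclose(kron_matvec(A, x), ref)
```

For d univariate factors of size n with N = n^d, this costs O(N · N^{1/d}) operations instead of O(N²) for the assembled matrix.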
The number of iterations of the Lanczos algorithm is equal in each case. The maximum resident memory increases in each case, but is negligibly small when compared to standard methods.

Figure 3: Normalized L² interpolation error in the one-dimensional study case for multiple polynomial degrees p. The error is given with respect to the mesh size h̃ of the interpolation space (bottom axis), as well as the mesh size of the interpolation space normalized by the correlation length (top axis). The convergence rates are approximately O(h̃^{p+1}) for the Gaussian kernel in (a), whereas the exponential kernel in (b) yields a reduced rate that is independent of the polynomial degree.

Figure 4: Ratio of the approximated eigenvalues to the reference eigenvalues over the full spectrum of 501 eigenmodes in the one-dimensional study case with an exponential kernel and a correlation length equal to the domain length. Compared for p = 2: the C^1 isogeometric Galerkin method, the C^1 isogeometric Galerkin method using IBQ, and the C^0 finite element method.
Figure 5: Half-open cylindrical geometry in the first three-dimensional benchmark, with r = 8, R = 10, H = 15, correlation length parameter b = 0.5 and characteristic length L = 10. The geometry is modeled as a single NURBS patch.
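The two covariance kernels used in these benchmarks can be sketched as follows. The exact normalization is an assumption here, namely G(x, x′) = σ² exp(−‖x − x′‖/(bL)) for the exponential kernel and G(x, x′) = σ² exp(−‖x − x′‖²/(bL)²) for the Gaussian kernel, with b = 0.5 and L = 10 as in Figure 5:

```python
import numpy as np

def exponential_kernel(x, y, sigma=1.0, corr_len=0.5 * 10.0):
    # Assumed form: sigma^2 * exp(-||x - x'|| / (b*L)); only C^0 along x = x'
    r = np.linalg.norm(np.asarray(x, float) - np.asarray(y, float))
    return sigma**2 * np.exp(-r / corr_len)

def gaussian_kernel(x, y, sigma=1.0, corr_len=0.5 * 10.0):
    # Assumed form: sigma^2 * exp(-||x - x'||^2 / (b*L)^2); smooth everywhere
    r = np.linalg.norm(np.asarray(x, float) - np.asarray(y, float))
    return sigma**2 * np.exp(-(r / corr_len)**2)

# At zero separation both kernels return the variance sigma^2 = 1
assert np.isclose(exponential_kernel([0, 0, 0], [0, 0, 0]), 1.0)
assert np.isclose(gaussian_kernel([1, 2, 3], [1, 2, 3]), 1.0)
```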
Figure 6: Meshes of the half-open cylindrical geometry illustrating the interpolation and solution spaces in gray and orange, respectively. The gray meshes from left to right correspond to cases one through three in Tables 4 and 6, which employ the first orange mesh in the corresponding solution space. Except for the first interpolation mesh, all meshes are nearly uniform in each parametric direction.

Figure 7: First nine normalized eigenfunctions weighted by the square root of the corresponding eigenvalues in Example 1–1. Results from benchmark case three.

Table 4: Enumeration of the twenty largest eigenvalues corresponding to the half-open cylinder problem with the exponential kernel in Example 1–1. The numerical eigenvalues have been computed by the proposed isogeometric Galerkin method employing interpolation based quadrature for the three different cases of solution and interpolation spaces depicted in Figure 6, as well as standard isogeometric Galerkin reference solutions computed with two different meshes. All computations were performed on a single core. The mesh and timing summary reads:

                                             Case 1    Case 2    Case 3     Galerkin 1*  Galerkin 2*
Interpolation space
  Number of elements                         256       840       1596       –            –
  Number of degrees of freedom               1980      8990      16770      –            –
  Mesh size                                  2.857     1.719     1.423      –            –
  Mesh size/correlation length               0.571     0.344     0.284      –            –
  Formation/assembly of univariate matrices  0.314 s   0.308 s   0.301 s    –            –
Solution space
  Number of elements                         256       256       256        256          3800
  Number of degrees of freedom               1050      1050      1050       1050         6642
  Mesh size                                  2.857     2.857     2.857      2.857        1.073
  Mesh size/correlation length               0.571     0.571     0.571      0.571        0.215
  Formation/assembly of system matrices      –         –         –          4.86 min     ~17 h
Summary
  Number of iterations                       63        63        63         63           63
  Maximum resident memory [GB]               0.438     0.441     0.441      0.464        1.566
  Solution time                              5.031 s   64.13 s   3.367 min  0.09 s       6.21 s
  Total time                                 5.345 s   64.44 s   3.372 min  4.86 min     ~17 h

* exact kernel, NURBS trial and test space, elementwise assembly

Table 5: Relative operator error with respect to the 2- and Frobenius-norm in Example 1–1. Operator A corresponds to equation (19) integrated on each element employing a Gauss-Legendre quadrature rule with p + 1 quadrature points in each direction. The approximation Ã of that operator follows from equation (29). Tabulated cases correspond to cases one through three in Example 1–1 with the exponential kernel.

The two rightmost columns summarize the results obtained by the isogeometric Galerkin method proposed in [55]. On the same mesh (Case 1) we observe a speed-up of roughly two orders of magnitude. This comparison might not be completely fair, because the full Gaussian quadrature performed in [55] is much more accurate than our interpolation based quadrature technique on the same mesh. Nonetheless, the obtained accuracy in the eigenvalues is convincing, as evidenced also in Table 5.
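The relative operator errors reported in Table 5 compare the fully integrated operator A with its interpolation-based approximation Ã in the 2-norm (largest singular value) and the Frobenius norm. For explicitly formed matrices they can be evaluated as in the following sketch; the matrices below are random stand-ins, not the actual operators:

```python
import numpy as np

def relative_operator_errors(A, A_tilde):
    """Relative error of an approximate operator in the 2-norm and the
    Frobenius norm, i.e. ||A - A~|| / ||A|| as reported in Table 5."""
    e2 = np.linalg.norm(A - A_tilde, 2) / np.linalg.norm(A, 2)
    eF = np.linalg.norm(A - A_tilde, 'fro') / np.linalg.norm(A, 'fro')
    return e2, eF

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 50))
A_tilde = A + 1e-6 * rng.standard_normal((50, 50))  # small perturbation
e2, eF = relative_operator_errors(A, A_tilde)
assert e2 < 1e-4 and eF < 1e-4
```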
Example 1–2
The second example employs the Gaussian covariance kernel with the same correlation length and variance as in Example 1–1. With that, we take advantage of higher smoothness across the element boundaries and, rather than performing h-refinement as in Example 1–1, we keep the interpolation mesh fixed (compare Case 1 in Example 1–1) and perform k-refinement for p = 2, 4, 8. Nonetheless, at the discontinuity in the coarse geometry model, we enforce C^{-1} continuity in the circumferential direction.

Table 6: Enumeration of the twenty largest eigenvalues corresponding to the half-open cylinder problem with the Gaussian kernel in Example 1–2. The numerical eigenvalues have been computed by the proposed isogeometric Galerkin method for polynomial degrees p = 2, 4, 8, as well as standard isogeometric Galerkin reference solutions. All computations were performed on a single core. The mesh and timing summary reads:
                                             p = 2     p = 4     p = 8      Galerkin 1*  Galerkin 2*
Interpolation space
  Number of elements                         256       256       256        –            –
  Number of degrees of freedom               1080      2400      6912       –            –
  Mesh size                                  2.857     2.857     2.857      –            –
  Mesh size/correlation length               0.571     0.571     0.571      –            –
  Formation/assembly of univariate matrices  0.299 s   0.299 s   0.301 s    –            –
Solution space
  Number of elements                         256       256       256        256          3800
  Number of degrees of freedom               1050      1050      1050       1050         6642
  Mesh size                                  2.857     2.857     2.857      2.857        1.073
  Mesh size/correlation length               0.571     0.571     0.571      0.571        0.215
  Formation/assembly of system matrices      –         –         –          5.01 min     16.97 h
Summary
  Number of iterations                       52        52        52         52           52
  Maximum resident memory [GB]               0.437     0.436     0.437      0.464        1.615
  Solution time                              0.817 s   3.412 s   27.95 s    0.10 s       5.61 s
  Total time                                 1.116 s   3.711 s   28.25 s    5.02 min     16.97 h

* exact kernel, NURBS trial and test space, elementwise assembly

As already discussed, it is evident that the proposed method performs better for smooth kernels, which reflects in the error of the discrete operator in Table 7. It is worth noting that the timings versus accuracy are in favour of k-refinement.

Figure 8: Hemispherical shell with a stiffener (α = 30°, β = 10°, γ = 360°, R = 100, L = 3, L = 4, t = 1 and t = 4). The geometry is modeled as a single NURBS patch.

Table 7: Relative operator error with respect to the 2- and Frobenius-norm in Example 1–2. Operator A corresponds to equation (19) integrated on each element employing a Gauss-Legendre quadrature rule with p + 1 quadrature points in each direction. The approximation Ã of that operator follows from equation (29). Tabulated cases correspond to Example 1–2 with the Gaussian kernel.

We consider the hemispherical shell with stiffener depicted in Figure 8. This three-dimensional shell structure is similar to the model published in [57] but has a slightly different stiffener profile. We use the Gaussian covariance function with a correlation length bL given as a fixed fraction of the characteristic domain length L. The two examples differ in the following way:

Example 2–1: The solution mesh is the same as the interpolation mesh.

Example 2–2: The solution mesh is twice as fine as the interpolation mesh in every component direction.

In both studies we use interpolation and solution meshes obtained by p-refinement of the geometrical model, followed by uniform h-refinement. The continuity of these spaces is thus C^0 at knots that are present in the initial coarse geometrical model and C^{p−1} at new knots introduced by h-refinement. Again, where the solution space is C^0, we enforce C^{-1} continuity in the interpolation space in order to achieve optimal accuracy per degree of freedom.

Example 2–1
Figures 9a and 9b depict the interpolation and solution meshes, respectively, that are used in this benchmark. As described, they are identical. The numerical results are summarized in Table 8. As in the previous three-dimensional benchmark problems, the presolution setup costs hardly contribute to the total cost of the method. Again, the main computational cost lies in the matrix-free matrix-vector product that is evaluated in each iteration of the Lanczos eigenvalue solver. Due to the increased number of degrees of freedom for higher polynomial degrees, the associated computational cost increases. Interestingly, the number of iterations is reduced in the case p = 16 as compared to the lower polynomial degrees.

Figure 9: (a) Interpolation space mesh in Example 2–1 and Example 2–2; (b) solution space mesh in Example 2–1; (c) solution space mesh in Example 2–2.

Table 8: Enumeration of the twenty largest eigenvalues corresponding to the hemispherical shell with stiffener problem with the Gaussian kernel in Example 2–1. The numerical eigenvalues have been computed by the proposed isogeometric Galerkin method employing interpolation based quadrature for the three cases using the solution and interpolation spaces depicted in Figures 9a and 9b, as well as a standard isogeometric Galerkin reference solution computed using the mesh depicted in Figure 9b. All computations were performed on a single core. The mesh and timing summary reads:
                                             p = 2     p = 6      p = 16    Galerkin 1*
Interpolation space
  Number of elements                         3864      3864       3864      –
  Number of degrees of freedom               10672     35424      189144    –
  Mesh size                                  7.992     7.992      7.992     –
  Mesh size/correlation length               0.091     0.091      0.091     –
  Formation/assembly of univariate matrices  0.354 s   0.427 s    3.530 s   –
Solution space
  Number of elements                         3864      3864       3864      3864
  Number of degrees of freedom               9612      32760      180090    9612
  Mesh size                                  7.992     7.992      7.992     7.992
  Mesh size/correlation length               0.091     0.091      0.091     0.091
  Formation/assembly of system matrices      –         –          –         17.29 h
Summary
  Number of iterations                       52        52         41        52
  Maximum resident memory [GB]               0.209     0.275      1.282     2.709
  Solution time                              70.28 s   12.33 min  5.00 h    13.16 s
  Total time                                 70.63 s   12.33 min  5.00 h    17.29 h

* exact kernel, NURBS trial and test space, elementwise assembly

Example 2–2
In the second benchmark the element size of the solution mesh is halved, see Figure 9c. The interpolation mesh is kept the same as in the first benchmark. The obtained results are presented in Table 9. By comparing these results with those in Table 8, it may be observed that the dimension of the solution space does not significantly affect the total solver costs. Indeed, the number of degrees of freedom is more than doubled, yet the timings stay more or less the same, independent of polynomial degree. The increased dimension of the solution space is reflected in the maximum resident memory, which has increased as compared to the results in Table 8. As witnessed in the previous benchmark, the higher order simulations required fewer iterations than the lower order ones.
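The eigenvalue computations in these benchmarks couple a Lanczos solver to a matrix-free matrix-vector product. A minimal sketch of this coupling uses SciPy's LinearOperator together with eigsh (an implicitly restarted Lanczos method); here a dense symmetric matrix stands in for the on-the-fly kernel evaluation of the actual method:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

# Symmetric positive definite stand-in for the (dense) system operator.
rng = np.random.default_rng(2)
B = rng.standard_normal((200, 200))
A = B @ B.T

def matvec(x):
    # In the actual method this step would evaluate the degenerate-kernel
    # factors on the fly instead of storing A explicitly.
    return A @ x

op = LinearOperator(shape=A.shape, matvec=matvec, dtype=np.float64)
# Twenty largest eigenvalues, computed without ever assembling op as a matrix
vals = eigsh(op, k=20, which='LM', return_eigenvectors=False)
assert np.allclose(np.sort(vals), np.sort(np.linalg.eigvalsh(A))[-20:])
```

The operator is only ever touched through `matvec`, which is what makes the matrix-free formulation possible.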
Remark 6.3.
Note that the flexibility in the mesh size of the interpolation space versus the trial space provides a mechanism by which the error due to quadrature and the error due to discretization can be controlled independently and effectively.
Table 9: Enumeration of the twenty largest eigenvalues corresponding to the hemispherical shell with stiffener problem with the Gaussian kernel in Example 2–2. The numerical eigenvalues have been computed by the proposed isogeometric Galerkin method employing interpolation based quadrature for the three cases using the solution and interpolation spaces depicted in Figures 9a and 9c, as well as a standard isogeometric Galerkin reference solution computed using the mesh depicted in Figure 9b. All computations were performed on a single core. The mesh and timing summary reads:

                                             p = 2     p = 6       p = 16    Galerkin 1*
Interpolation space
  Number of elements                         3864      3864        3864      –
  Number of degrees of freedom               10672     35424       189144    –
  Mesh size                                  7.992     7.992       7.992     –
  Mesh size/correlation length               0.091     0.091       0.091     –
  Formation/assembly of univariate matrices  0.367 s   1.229 s     23.21 s   –
Solution space
  Number of elements                         30912     30912       30912     3864
  Number of degrees of freedom               51900     117180      421360    9612
  Mesh size                                  4.019     4.019       4.019     7.992
  Mesh size/correlation length               0.046     0.046       0.046     0.091
  Formation/assembly of system matrices      –         –           –         17.29 h
Summary
  Number of iterations                       52        52          41        52
  Maximum resident memory [GB]               0.261     0.741       7.346     2.709
  Solution time                              70.05 s   12.433 min  4.850 h   13.16 s
  Total time                                 70.42 s   12.433 min  4.858 h   17.29 h

* exact kernel, NURBS trial and test space, elementwise assembly

Figure 10: First nine normalized eigenfunctions weighted by the square root of the corresponding eigenvalues in Example 2–2.

Remark 6.4. A relevant question in random field discretization is what mesh size is necessary to attain acceptable approximations. The mesh size should clearly depend on the correlation length bL. A rule of thumb, proposed in [14], is that the element size is approximately in the range from a half to a quarter of the given correlation length. Similar rules have been established by other authors, see [63] and references therein. Especially in a three-dimensional problem this may lead to a large number of degrees of freedom. Engineering models of practical interest are generally more complex than the models shown in this paper and may require millions of degrees of freedom.

Parallel execution

In order to provide comparable results in the benchmarks, all the computations have been performed sequentially in a single process. The proposed method is well suited for parallel execution, as discussed in Section 5.3. Figure 11 shows nearly optimal scaling with the number of cores for Example 2–2 in the case of p = 6. Note that only the kernel evaluation and the matrix-vector product in Step 5 of Algorithm 1 have been performed in parallel using shared-memory parallelism; all other steps, as well as the Lanczos eigensolver, are still run sequentially. In this particular case, 99.84% of the execution time of Algorithm 1 was spent in the parallel execution mode.

Figure 11: Scaling of the normalized execution time in Example 2–2 in the case of p = 6 with increasing numbers of parallel processes.
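The shared-memory parallelization of the matrix-vector product is embarrassingly parallel because disjoint blocks of output entries can be computed independently, without communication. The paper's implementation uses Numba [38]; the following simplified stand-in partitions the output rows of a dense matvec across a thread pool to illustrate the same idea:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_matvec(A, x, n_threads=4):
    """Embarrassingly parallel dense matvec: each worker computes a
    contiguous, disjoint block of output rows independently."""
    y = np.empty(A.shape[0])
    blocks = np.array_split(np.arange(A.shape[0]), n_threads)

    def work(rows):
        y[rows] = A[rows] @ x  # disjoint writes, so no synchronization needed

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(work, blocks))
    return y

rng = np.random.default_rng(3)
A = rng.standard_normal((1000, 300))
x = rng.standard_normal(300)
assert np.allclose(parallel_matvec(A, x), A @ x)
```

Because no worker reads or writes another worker's rows, the speed-up is limited only by memory bandwidth and the sequential remainder of the algorithm.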
7. Conclusion
This paper presented an efficient matrix-free Galerkin method for the Karhunen-Loève (KL) series expansion of random fields. The KL expansion requires the solution of a generalized eigenvalue problem corresponding to the homogeneous Fredholm integral eigenvalue problem of the second kind, which is computationally challenging for several reasons. Firstly, the Galerkin method requires numerical integration over a 2d-dimensional domain, where d, in this work, denotes the spatial dimension. Consequently, classical formation and assembly procedures have a time complexity that grows rapidly with the polynomial degree p and the number of elements N_e. Secondly, the main system matrix is dense and requires O(N²) bytes of storage, where N is the global number of degrees of freedom. This means that a discretization involving a hundred thousand degrees of freedom requires at least 80 GB of RAM to store the main system matrix in double precision. Hence, the computational complexity as well as the memory requirements of standard solution techniques quickly become intractable with increasing polynomial degree, problem size and spatial dimension.

We proposed an efficient solution methodology that significantly ameliorates the aforementioned computational challenges. Our approach is based on the following key ingredients:

1. A trial space of rational spline functions, whose Gramian or mass matrix has a Kronecker structure independent of the geometric mapping;
2. An inexpensive reformulation of the generalized algebraic eigenvalue problem into a standard algebraic eigenvalue problem;
3. A degenerate kernel approximation of the covariance function using smooth tensor product splines;
4. Formulation of an efficient matrix-free and parallel matrix-vector product for iterative eigenvalue solvers, which utilizes the Kronecker structure of the system matrices.

In Step 2 the reformulation to a standard eigenvalue problem significantly reduces the computational cost while improving conditioning. This can be done efficiently due to the Kronecker structure of the mass matrix, which is a result of the particular choice of trial space, see Step 1. In Step 3 the degenerate kernel approximation enables us to evaluate the resulting integrals exactly with a minimal number of evaluation points. Both steps involve matrices that are endowed with a Kronecker structure and can be performed matrix-free in O(N · N^{1/d}) time. The leading cost of the method is due to the Lanczos eigenvalue algorithm, which involves dense matrix-vector multiplications. As noted in Step 4, we perform this step matrix-free, by computing the necessary components on the fly, and in parallel in approximately O(N² · N_iter / N_threads) time. Here N_iter denotes the number of iterations of the eigensolver and N_threads is the number of simultaneous processes. Several three-dimensional benchmark problems involving non-trivial geometric mappings have illustrated the exceptional efficiency and effectiveness of the proposed solution methodology. In particular, we showed that the proposed methodology scales favorably with polynomial degree and works particularly well for smooth covariance functions, such as the Gaussian kernel. The Python implementation used to generate these results has been provided as open-source software and is available for download at https://github.com/m1ka05/tensiga.

In a follow-up study we plan to study the accuracy of the proposed solution methodology in depth. There are two sources of error: (1) a quadrature error due to approximation of the covariance function; and (2) a discretization error due to the finite-dimensional representation of the eigenmodes.
We will perform a priori as well as a posteriori error analysis and formulate criteria for bounding the error due to quadrature by the discretization error. Within the same context of accuracy and robustness, it is interesting to extend the spectral analysis results in [29] to generalized eigenvalue problems corresponding to Fredholm integral equations for different covariance functions as well as polynomial orders of the approximation.

We also plan to further improve the efficiency of the proposed method where possible. The proposed matrix-free algorithm lends itself to acceleration on graphics processing units (GPUs). Furthermore, exploiting particular structure (such as sparsity or symmetry) of the covariance function may lead to improved solver cost. For example, the hierarchical matrix method proposed in [35] performs the matrix-vector products in O(N log N) time by exploiting certain structure of the covariance function.

There are several other interesting avenues for future research. Some or all of the techniques proposed here could be applied to linear as well as non-linear Fredholm integral and integro-differential equations of the first as well as the second kind. While the proposed method is designed for smooth kernels, it would be interesting to develop similar methods that are tailored towards continuous kernels, such as the exponential kernel, or even singular kernels, which are typical in boundary integral equations; see [7, 18, 19] for similar ideas making use of quasi-interpolation.

Finally, we would like to mention that similar techniques can be applied in the context of the collocation method. The computational cost of such a method would be similar to that of the proposed Galerkin method.

Acknowledgments
M. Ł. Mika, R. R. Hiemstra and D. Schillinger gratefully acknowledge funding from the German Research Foundation through the DFG Emmy Noether Award SCH 1249/2-1. T. J. R. Hughes and R. R. Hiemstra were partially supported by the National Science Foundation Industry/University Cooperative Research Center (IUCRC) for Efficient Vehicles and Sustainable Transportation Systems (EV-STS), and the United States Army CCDC Ground Vehicle Systems Center (TARDEC/NSF Project
References

[1] Allaix, D. L., and Carbone, V. I. Karhunen-Loève decomposition of random fields based on a hierarchical matrix approach. International Journal for Numerical Methods in Engineering 94, 11 (June 2013), 1015–1036.
[2] Antolin, P., Buffa, A., Calabrò, F., Martinelli, M., and Sangalli, G. Efficient matrix computation for tensor-product isogeometric analysis: The use of sum factorization. Computer Methods in Applied Mechanics and Engineering 285 (Mar. 2015), 817–828.
[3] Arthur, D. W. The Solution of Fredholm Integral Equations Using Spline Functions. IMA Journal of Applied Mathematics 11, 2 (1973), 121–129.
[4] Atkinson, K. E. The Numerical Solution of Integral Equations of the Second Kind, 1st ed. Cambridge University Press, June 1997.
[5] Betz, W., Papaioannou, I., and Straub, D. Numerical methods for the discretization of random fields by means of the Karhunen–Loève expansion. Computer Methods in Applied Mechanics and Engineering 271 (Apr. 2014), 109–129.
[6] Bressan, A., and Takacs, S. Sum factorization techniques in Isogeometric Analysis. Computer Methods in Applied Mechanics and Engineering 352 (Aug. 2019), 437–460.
[7] Calabrò, F., Falini, A., Sampoli, M. L., and Sestini, A. Efficient quadrature rules based on spline quasi-interpolation for application to IGA-BEMs. Journal of Computational and Applied Mathematics 338 (Aug. 2018), 153–167.
[8] Calabrò, F., Sangalli, G., and Tani, M. Fast formation of isogeometric Galerkin matrices by weighted quadrature. Computer Methods in Applied Mechanics and Engineering 316 (Apr. 2017), 606–622.
[9] Cottrell, J., Hughes, T., and Reali, A. Studies of refinement and continuity in isogeometric structural analysis. Computer Methods in Applied Mechanics and Engineering 196, 41-44 (Sept. 2007), 4160–4183.
[10] Cottrell, J. A., Hughes, T. J. R., and Bazilevs, Y. Isogeometric Analysis: Toward Integration of CAD and FEA. Wiley, Chichester, West Sussex, U.K.; Hoboken, NJ, 2009.
[11] Courant, R., and Hilbert, D. Methods of Mathematical Physics, 1st ed. Wiley, Apr. 1989.
[12] Dahmen, W., Harbrecht, H., and Schneider, R. Compression Techniques for Boundary Integral Equations—Asymptotically Optimal Complexity Estimates. SIAM Journal on Numerical Analysis 43, 6 (Jan. 2006), 2251–2271.
[13] De Boor, C. A Practical Guide to Splines. No. 27 in Applied Mathematical Sciences. Springer-Verlag, New York, 1978.
[14] Der Kiureghian, A., and Ke, J.-B. The stochastic finite element method in structural reliability. Probabilistic Engineering Mechanics 3, 2 (June 1988), 83–91.
[15] Ding, C., Hu, X., Cui, X., Li, G., Cai, Y., and Tamma, K. K. Isogeometric generalized nth order perturbation-based stochastic method for exact geometric modeling of (composite) structures: Static and dynamic analysis with random material parameters. Computer Methods in Applied Mechanics and Engineering 346 (Apr. 2019), 1002–1024.
[16] Eckert, C., Beer, M., and Spanos, P. D. A polynomial chaos method for arbitrary random inputs using B-splines. Probabilistic Engineering Mechanics 60 (Apr. 2020), 103051.
[17] Eiermann, M., Ernst, O. G., and Ullmann, E. Computational aspects of the stochastic finite element method. Computing and Visualization in Science 10, 1 (Feb. 2007), 3–15.
[18] Falini, A., Giannelli, C., Kanduč, T., Sampoli, M. L., and Sestini, A. An adaptive IgA-BEM with hierarchical B-splines based on quasi-interpolation quadrature schemes. International Journal for Numerical Methods in Engineering 117, 10 (Mar. 2019), 1038–1058.
[19] Falini, A., and Kanduč, T. A Study on Spline Quasi-interpolation Based Quadrature Rules for the Isogeometric Galerkin BEM. In Advanced Methods for Geometric Modeling and Numerical Simulation, C. Giannelli and H. Speleers, Eds., vol. 35, Springer INdAM Series. Springer International Publishing, Cham, 2019, pp. 99–125.
[20] Ghanem, R. G., and Spanos, P. D. Stochastic Finite Elements: A Spectral Approach. Springer New York, New York, NY, 1991.
[21] Golub, G. H., and Van Loan, C. F. Matrix Computations, 3rd ed. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, 1996.
[22] Hackbusch, W., Khoromskij, B. N., and Tyrtyshnikov, E. E. Hierarchical Kronecker tensor-product approximations. Journal of Numerical Mathematics 13, 2 (Jan. 2005).
[23] Harbrecht, H., Peters, M., and Siebenmorgen, M. Efficient approximation of random fields for numerical applications. Numerical Linear Algebra with Applications 22, 4 (Aug. 2015), 596–617.
[24] Hiemstra, R. R., Calabrò, F., Schillinger, D., and Hughes, T. J. Optimal and reduced quadrature rules for tensor product and hierarchically refined splines in isogeometric analysis. Computer Methods in Applied Mechanics and Engineering 316 (2017), 966–1004. Special Issue on Isogeometric Analysis: Progress and Challenges.
[25] Hiemstra, R. R., Sangalli, G., Tani, M., Calabrò, F., and Hughes, T. J. Fast formation and assembly of finite element matrices with application to isogeometric linear elasticity. Computer Methods in Applied Mechanics and Engineering 355 (Oct. 2019), 234–260.
[26] Hien, T. D., and Noh, H.-C. Stochastic isogeometric analysis of free vibration of functionally graded plates considering material randomness. Computer Methods in Applied Mechanics and Engineering 318 (May 2017), 845–863.
[27] Hughes, T., Cottrell, J., and Bazilevs, Y. Isogeometric analysis: CAD, finite elements, NURBS, exact geometry and mesh refinement. Computer Methods in Applied Mechanics and Engineering 194, 39-41 (Oct. 2005), 4135–4195.
[28] Hughes, T., Reali, A., and Sangalli, G. Duality and unified analysis of discrete approximations in structural dynamics and wave propagation: Comparison of p-method finite elements with k-method NURBS. Computer Methods in Applied Mechanics and Engineering 197, 49-50 (Sept. 2008), 4104–4124.
[29] Hughes, T. J., Evans, J. A., and Reali, A. Finite element and NURBS approximations of eigenvalue, boundary-value, and initial-value problems. Computer Methods in Applied Mechanics and Engineering 272 (Apr. 2014), 290–320.
[30] Jahanbin, R., and Rahman, S. An isogeometric collocation method for efficient random field discretization. International Journal for Numerical Methods in Engineering 117, 3 (Jan. 2019), 344–369.
[31] Jahanbin, R., and Rahman, S. Stochastic isogeometric analysis in linear elasticity. Computer Methods in Applied Mechanics and Engineering 364 (June 2020), 112928.
[32] Karhunen, K. Über lineare Methoden in der Wahrscheinlichkeitsrechnung. Suomalaisen Tiedeakatemian toimituksia. Diss., University of Helsinki, Helsinki, 1947.
[33] Keese, A. A Review of Recent Developments in the Numerical Solution of Stochastic Partial Differential Equations (Stochastic Finite Elements). Braunschweig, Institut für Wissenschaftliches Rechnen (2003).
[34] Keese, A. Numerical Solution of Systems with Stochastic Uncertainties: A General Purpose Framework for Stochastic Finite Elements, June 2004.
[35] Khoromskij, B. N., Litvinenko, A., and Matthies, H. G. Application of hierarchical matrices for computing the Karhunen–Loève expansion. Computing 84, 1-2 (Apr. 2009), 49–67.
[36] Kjolstad, F., Kamil, S., Chou, S., Lugato, D., and Amarasinghe, S. The tensor algebra compiler. Proceedings of the ACM on Programming Languages 1, OOPSLA (Oct. 2017), 1–29.
[37] Kress, R. Linear Integral Equations, 3rd ed. No. 82 in Applied Mathematical Sciences. Springer, New York, 2014.
[38] Lam, S. K., Pitrou, A., and Seibert, S. Numba: a LLVM-based Python JIT compiler. In Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC - LLVM '15 (Austin, Texas, 2015), ACM Press, pp. 1–6.
[39] Lang, S. Real and Functional Analysis, 3rd ed. No. 142 in Graduate Texts in Mathematics. Springer-Verlag, New York, 2012.
[40] Lattner, C., and Adve, V. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In Proceedings of the International Symposium on Code Generation and Optimization: Feedback-Directed and Runtime Optimization (CGO '04, Palo Alto, California), IEEE Computer Society, USA, 2004, p. 75.
[41] Li, K., Gao, W., Wu, D., Song, C., and Chen, T. Spectral stochastic isogeometric analysis of linear elasticity. Computer Methods in Applied Mechanics and Engineering 332 (Apr. 2018), 157–190.
[42] Li, K., Wu, D., and Gao, W. Spectral stochastic isogeometric analysis for static response of FGM plate with material uncertainty. Thin-Walled Structures 132 (Nov. 2018), 504–521.
[43] Li, K., Wu, D., and Gao, W. Spectral stochastic isogeometric analysis for linear stability analysis of plate. Computer Methods in Applied Mechanics and Engineering 352 (Aug. 2019), 1–31.
[44] Li, K., Wu, D., Gao, W., and Song, C. Spectral stochastic isogeometric analysis of free vibration. Computer Methods in Applied Mechanics and Engineering 350 (June 2019), 1–27.
[45] Loève, M. Functions aléatoires du second ordre. In Processus stochastiques et mouvement Brownien. Gauthier-Villars, Paris (1948), 366–420.
[46] Mantzaflaris, A., and Jüttler, B. Integration by interpolation and look-up for Galerkin-based isogeometric analysis. Computer Methods in Applied Mechanics and Engineering 284 (Feb. 2015), 373–400.
[47] Mantzaflaris, A., and Jüttler, B. Integration by interpolation and look-up for Galerkin-based isogeometric analysis. Computer Methods in Applied Mechanics and Engineering 284 (Feb. 2015), 373–400.
[48] Melchers, R. E., and Beck, A. T., Eds. Structural Reliability Analysis and Prediction. John Wiley & Sons Ltd, Chichester, UK, Oct. 2017.
[49] Mercer, J. Functions of Positive and Negative Type, and their Connection with the Theory of Integral Equations. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 209, 441-458 (Jan. 1909), 415–446.
[50] Orszag, S. A. Spectral methods for problems in complex geometries. Journal of Computational Physics 37, 1 (Aug. 1980), 70–92.
[51] Pan, M., Jüttler, B., and Giust, A. Fast formation of isogeometric Galerkin matrices via integration by interpolation and look-up. Computer Methods in Applied Mechanics and Engineering 366 (July 2020), 113005.
[52] Phoon, K., Huang, S., and Quek, S. Implementation of Karhunen–Loeve expansion for simulation using a wavelet-Galerkin scheme. Probabilistic Engineering Mechanics 17, 3 (July 2002), 293–303.
[53] Piegl, L., and Tiller, W. The NURBS Book. Monographs in Visual Communications. Springer Berlin Heidelberg, Berlin, Heidelberg, 1995.
[54] Puzyrev, V., Deng, Q., and Calo, V. Spectral approximation properties of isogeometric analysis with variable continuity. Computer Methods in Applied Mechanics and Engineering 334 (June 2018), 22–39.
[55] Rahman, S. A Galerkin isogeometric method for Karhunen–Loève approximation of random fields. Computer Methods in Applied Mechanics and Engineering 338 (Aug. 2018), 533–561.
[56] Rahman, S. A Spline Chaos Expansion. SIAM/ASA Journal on Uncertainty Quantification 8, 1 (Jan. 2020), 27–57.
[57] Rank, E., Düster, A., Nübel, V., Preusch, K., and Bruhns, O. High order finite elements for shells. Computer Methods in Applied Mechanics and Engineering 194, 21-24 (June 2005), 2494–2512.
[58]
Saad, Y.
Numerical methods for large eigenvalue problems , rev. ed ed. No. 66 in Classics in applied mathematics. Societyfor Industrial and Applied Mathematics, Philadelphia, 2011.[59]
Sande, E., Manni, C., and Speleers, H.
Explicit error estimates for spline approximation of arbitrary smoothness inisogeometric analysis.
Numerische Mathematik 144 , 4 (Apr. 2020), 889–929.[60]
Sangalli, G., and Tani, M.
Matrix-free weighted quadrature for a computationally efficient isogeometric k -method.
Computer Methods in Applied Mechanics and Engineering 338 (Aug. 2018), 117–133.[61]
Schwab, C., and Todor, R. A.
Karhunen–Lo`eve approximation of random fields by generalized fast multipole methods.
Journal of Computational Physics 217 , 1 (Sept. 2006), 100–122.[62]
SciPy 1.0 Contributors, Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau,D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Millman,K. J., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., Carey, C. J., Polat, I., Feng, Y., Moore,E. W., VanderPlas, J., Laxalde, D., Perktold, J., Cimrman, R., Henriksen, I., Quintero, E. A., Harris, C. R.,Archibald, A. M., Ribeiro, A. H., Pedregosa, F., and van Mulbregt, P.
SciPy 1.0: fundamental algorithms forscientific computing in Python.
Nature Methods 17 , 3 (Mar. 2020), 261–272.[63]
Stefanou, G.
The stochastic finite element method: Past, present and future.
Computer Methods in Applied Mechanicsand Engineering 198 (2009), 1031–1051.[64]
Sudret, B., and Kuyreghian, A.
Stochastic finite element methods and reliability: a state-of-the-art report.
Berkeley,Department of Civil and Environmental Engineering, University of California, 2000.[65]
Tino Eibner, J. M. M.
Fast algorithms for setting up the stiffness matrix in hp-FEM: a comparison, 2005.[66] van der Walt, S., Colbert, S. C., and Varoquaux, G.
The NumPy Array: A Structure for Efficient NumericalComputation.
Computing in Science & Engineering 13 , 2 (Mar. 2011), 22–30.[67]
Vermeulen, A. H., Bartels, R. H., and Heppler, G. R.
Integrating Products of B-Splines.
SIAM Journal on Scientificand Statistical Computing 13 , 4 (July 1992), 1025–1038.[68]
Vos, P. E., Sherwin, S. J., and Kirby, R. M.
From h to p efficiently: Implementing finite and spectral/hp elementmethods to achieve optimal performance for low- and high-order discretisations.
Journal of Computational Physics 229 ,13 (July 2010), 5161–5181.[69]
Wang, W., Chen, G., Yang, D., and Kang, Z.
Stochastic isogeometric analysis method for plate structures with randomuncertainty.
Computer Aided Geometric Design 74 (Oct. 2019), 101772.[70]
Xiu, D., and Karniadakis, G. E.
The Wiener–Askey Polynomial Chaos for Stochastic Differential Equations.
SIAMJournal on Scientific Computing 24 , 2 (Jan. 2002), 619–644.[71]
Zhang, H., and Shibutani, T.
Development of stochastic isogeometric analysis (SIGA) method for uncertainty in shape.
International Journal for Numerical Methods in Engineering (Dec. 2018), nme.6008.(Dec. 2018), nme.6008.