Analysis of circulant embedding methods for sampling stationary random fields
Ivan G. Graham, Frances Y. Kuo, Dirk Nuyens, Rob Scheichl, Ian H. Sloan
AAnalysis of circulant embedding methodsfor sampling stationary random fields ∗ I.G. Graham † F.Y. Kuo ‡ D. Nuyens § R. Scheichl ¶ I.H. Sloan (cid:107)
March 21, 2018
Abstract
A standard problem in uncertainty quantification and in computationalstatistics is the sampling of stationary Gaussian random fields with givencovariance in a d -dimensional (physical) domain. In many applications itis sufficient to perform the sampling on a regular grid on a cube enclosingthe physical domain, in which case the corresponding covariance matrix isnested block Toeplitz. After extension to a nested block circulant matrix, thiscan be diagonalised by FFT – the “circulant embedding method”. Providedthe circulant matrix is positive definite, this provides a finite expansion ofthe field in terms of a deterministic basis, with coefficients given by i.i.d.standard normals. In this paper we prove, under mild conditions, that thepositive definiteness of the circulant matrix is always guaranteed, providedthe enclosing cube is sufficiently large. We examine in detail the case of theMat´ern covariance, and prove (for fixed correlation length) that, as h → ν / log h − ) times larger than the size of the physicaldomain. (Here h is the mesh spacing of the regular grid and ν the Mat´ernsmoothness parameter.) We show that the sampling cube can become smalleras the correlation length decreases when h and ν are fixed. Our results ∗ March 21, 2018. The authors acknowledge financial support from the Australian Re-search Council (FT130100655; DP150101770), the KU Leuven research fund (OT:3E130287;C3:3E150478), the Taiwanese National Center for Theoretical Sciences’ Mathematics Division,and the Statistical and Applied Mathematical Sciences Institute (SAMSI) under its 2017 Programon Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applied Mathematics. † Dept. of Mathematical Sciences, University of Bath, Bath BA2 7AY, UK([email protected]). ‡ School of Mathematics and Statistics, University of NSW, Sydney NSW 2052, Australia([email protected]). § Dept. of Computer Science, KU Leuven, Celestijnenlaan 200A, B-3001 Leuven, Belgium([email protected]). ¶ Dept. of Mathematical Sciences, University of Bath, Bath BA2 7AY, UK([email protected]). (cid:107)
School of Mathematics and Statistics, University of NSW, Sydney NSW 2052, Australia([email protected]). a r X i v : . [ m a t h . NA ] M a r re confirmed by numerical experiments. We prove several results about thedecay of the eigenvalues of the circulant matrix. These lead to the conjecture,verified by numerical experiment, that they decay with the same rate as theKarhunen–Lo`eve eigenvalues of the covariance operator. The method analysedhere complements the numerical experiments for uncertainty quantification inporous media problems in an earlier paper by the same authors in J. Comp.Physics . 230 (2011), pp. 3668–3694.
Keywords:
Gaussian Random Fields, Circulant Embedding, Statistical Homo-geneity, Mat´ern Covariance, Fast Fourier Transform, Analysis
AMS:
In recent years there has been a huge growth in interest in uncertainty quantification(UQ) for physical models involving partial differential equations. In this context,the forward problem of UQ consists of describing the statistics of outputs (solutions)of a PDE model, given statistical assumptions on its inputs (e.g., its coefficients).This has led to the widespread study of model PDE problems where the coefficientsare given as random fields. Various flavours of solution method (e.g., StochasticGalerkin or various sampling methods) have been formulated under the assumptionthat the random coefficient field has a separable expansion in physical/probabilityspace – for example a suitably truncated Karhunen–Lo´eve (KL) expansion, [10],[18].In its standard form, the KL expansion is in principle infinite, and requires thecomputation of the (infinitely many) eigenpairs of the integral operator with kernelgiven by the covariance of the field. It has to be truncated to be computable, evenwhen the field is only required at a finite set of physical spatial points. Howeverif it is known from the outset that the coefficient field is only required at a finiteset of physical points (as is the case in typical implementations of finite elementmethods for solving the PDE), then a different point of view emerges. The requiredfield is now a random vector; in the Gaussian case it is characterised by its meanand covariance matrix. A real, for example Cholesky, factorization of the covariancematrix provides a finite separable expansion of the random vector, with no need forany truncation.In the paper [11], the authors proposed a practical algorithm for solving a classof elliptic PDEs with coefficients given by statistically homogeneous lognormal ran-dom fields with low regularity. Lognormal random fields are commonly used inapplications, for example in hydrology (see, e.g., [19, 20] and the references there).The PDE was solved by piecewise linear finite elements on a uniform grid. Thestiffness matrix was obtained by an appropriate quadrature rule, with the field valuesat the quadrature points being obtained via a factorization of the covariance matrixusing the circulant embedding technique described below. The method was foundto be effective even for problems with high stochastic dimension, but [11] did notcontain a convergence analysis of the algorithm. The main purpose of the present2aper is to provide an analysis for the circulant embedding part of the algorithm of[11]. The analysis of the corresponding uncertainty quantification algorithm for thePDE is done in [12].We consider here the fast evaluation of a Gaussian random field Z ( x , ω ) withprescribed mean Z ( x ) and covariance r cov ( x , x (cid:48) ) := E [( Z ( x , · ) − Z ( x ))( Z ( x (cid:48) , · ) − Z ( x (cid:48) )] , (1.1)where the expectation is with respect to the Gaussian measure. Throughout we willassume that Z is stationary (see, e.g., [2, p. 24]), i.e., its covariance function satisfies r cov ( x , x (cid:48) ) = ρ ( x − x (cid:48) ) , (1.2)for some function ρ : R d → R . Note that we assume here that ρ is defined on all of R d , as it is in many applications, although strictly speaking we only need ρ to bedefined on a sufficiently large ball. Note also that (1.1) and (1.2) imply that ρ issymmetric, i.e., ρ ( x ) = ρ ( − x ) , for all x ∈ R d , (1.3)and that ρ is also positive semidefinite (see § ρ will begiven below. A particular case, to be discussed extensively, is the Mat´ern covariancedefined in Example 2.7 below.We shall consider the problem of evaluating Z ( x , ω ) at a uniform grid of M = ( m + 1) d points on the d -dimensional unit cube [0 , d , with integer m fixed, and with gridspacing h := 1 /m . (The extension to general tensor product grids is straightfor-ward and not discussed here.) Denoting the grid points by x , x , . . . , x M , we wishto obtain samples of the random vector: Z ( ω ) := ( Z ( x , ω ) , . . . , Z ( x M , ω )) (cid:62) . This is a Gaussian random vector with mean Z := ( Z ( x ) , . . . , Z ( x M )) (cid:62) and apositive semidefinite covariance matrix R = [ ρ ( x i − x j )] Mi,j =1 . (1.4)Because of its finite length, Z ( ω ) can be expressed exactly (but not uniquely) as alinear combination of a finite number of i.i.d. standard normals, i.e., as Z ( ω ) = B Y ( ω ) + Z , where Y ∼ N ( , I s × s ) . (1.5)for some real M × s matrix B with s ≥ M satisfying R = BB (cid:62) . (1.6)To verify this construction, simply note that (1.5) and (1.6) imply E [( Z − Z )( Z − Z ) (cid:62) ] = E [ B Y Y (cid:62) B (cid:62) ] = B E [ Y Y (cid:62) ] B (cid:62) = BB (cid:62) = R,
3o ensuring that (1.1) is satisfied on the discrete grid.In principle the factorization (1.6) could be computed via a Cholesky factoriza-tion (or even a spectral decomposition) of R with s = M , but this is likely to beprohibitively expensive, since R is large and dense. However, under appropriateordering of the indices, R is a nested block Toeplitz matrix and, as we will explainbelow, R can be embedded in a bigger s × s nested block circulant matrix whose spec-tral decomposition can be rapidly computed using FFT with O ( s log s ) complexity.A subtle but vital point is that while the covariance matrix R is automatically pos-itive semidefinite, the extension to a larger nested block circulant matrix may losedefiniteness, yet the nested block circulant matrix must be at least positive semidef-inite for the algorithm to work. Small deviations from positive semidefiniteness areacceptable if one is prepared to accept the incurred errors from omitted negativeeigenvalues. That error can be controlled via an a posteriori bound in terms of thenegative eigenvalues (see, e.g., [18, § R is as follows. (Fulldetails are in § , d in a larger cube [0 , (cid:96) ] d with side length (cid:96) = mh ≥ m ≥ m = 1 /h . Note that ρ isautomatically defined on [0 , (cid:96) ] d , since it is defined on all of R d . Then we constructa 2 (cid:96) -periodic even symmetric extension of ρ on [0 , (cid:96) ] d , called ρ ext , see (2.4) below,that coincides with ρ on [0 , (cid:96) ] d . The extended s × s matrix with s = (2 m ) d (1.7)is then obtained by the analogue of formula (1.4), with ρ replaced by ρ ext . InTheorem 2.3, we show, under quite general conditions, that if (cid:96) (equivalently m ) ischosen large enough, this extension is necessarily positive definite. The algorithmused in practice (Algorithm 1 in § (cid:96) cautiously through a sequenceof increments in m until positive definiteness is achieved.To know that the resulting algorithm is efficient, we need a lower bound on thevalue of (cid:96) needed to achieve positive definiteness. Our second set of theoreticalresults provides such bounds for the important Mat´ern class of covariance functions,defined in (2.21) below. In Theorem 2.9 we show that positive definiteness is alwaysachieved with (cid:96)/λ ≥ C + C ν / log (cid:0) max { λ/h , ν / } (cid:1) , (1.8)where λ is a parameter with the unit of length (the “correlation length”), ν < ∞ isthe Mat´ern smoothness parameter and C , C are constants independent of h , (cid:96), λ, ν and variance σ . Thus the required (cid:96) grows very slowly in 1 /h and can get smalleras the correlation length decreases. There is some growth as ν increases. However,the less smooth fields with ν small, and with small correlation length λ , are the onesoften found in applications – see the references in [11]. In Theorem 2.11 we discussthe same question in the Gaussian case ( ν = ∞ ). This theorem shows, for example,that if λ and h both decrease but λ/h is kept fixed (i.e., a fixed number of gridpoints per unit correlation length), then the minimum value of (cid:96) needed for positivedefiniteness can decrease linearly in λ . 4n additional benefit of the spectral decomposition obtained by applying theFFT to the circulant matrix is that it allows us to determine empirically whichvariables are the most important in the system. This gives an ordering of thevariables, which can be used to drive the design of the Quasi-Monte Carlo algorithms(see, e.g., [12]).While circulant embedding techniques are well-known in the computational statis-tics literature (e.g., [5, 6, 7, 17, 18]), there is relatively little theoretical analysis ofthis technique, the best existing references being Chan and Wood [5] and Dietrichand Newsam [7]. First, [5] provides a theorem in general dimension d identify-ing conditions on the covariance function which ensure that the circulant extension(which we shall describe below) is positive definite for some sufficiently large (cid:96) . Thecondition is similar to that provided in our Theorem 2.3 below, except that an ad-ditional assumption on the discrete Fourier transform (the “spectral density”) of ρ is required in [5]. In our work we require positivity of the continuous Fourier trans-form which is automatically satisfied via Bochner’s theorem, due to the fact that ρ is a covariance kernel. On the other hand Dietrich and Newsam [7] provide moredetailed information about the behaviour of the algorithm by restricting the theoryto a 1-dimensional domain. They show that the discrete Fourier transform of ρ willbe positive when r cov is convex, decreasing and non-negative. This automaticallyproves the success of the algorithm for certain covariance kernels. However, twocovariance kernels that are not covered by this theory are the “Whittle” covarianceand the Gaussian covariance. These belong to the Mat´ern class introduced in Ex-ample 2.7 below, with Mat´ern parameter ν = 1 and ν = ∞ respectively. Our theorycovers the whole Mat´ern family and also describes the behaviour of the embeddingalgorithm with respect to the parameters ν (Mat´ern parameter) and λ (correlationlength) as well as the mesh size h . Finally we note that [7] also describes embeddingstrategies which are more general than those which we describe and analyse here,but without theory.We may compare the present approach with other methods for the generationof Gaussian random fields. The principal limitations of the present approach arethat it requires the covariance function to be stationary, and generates the randomfield only on a uniform tensor product grid. The requirement of uniformity of thegrid is not very restrictive for PDE applications, as we show in [12] where we takethe interpolation error into account for the error analysis. In [9], the uniform gridrequirement is removed but at the price of approximating the covariance matrix byan H -matrix, and then obtaining an approximate square root of that H -matrix.The use of H -matrices to promote the efficient computation of KL eigenfunctions haspreviously been explored in [8, 16, 14]. In [21], fast multipole methods were promotedto achieve the same objective. In all these approaches, there is an inevitable needto truncate the KL series, but on the other hand there is no essential restriction tostationary covariance functions. We mention also the recent paper [3] in which arelated periodization procedure is applied (this time using a smooth cut-off function),facilitating wavelet expansions of Z .An interesting paper closely related to the present paper is [14], where the authorspropose and analyse an approximate, pivoted Cholesky factorisation followed by asmall eigendecomposition, which provides in many cases a very efficient and accurate5ay to approximate the random field. This approach does not require stationarity ora uniform grid, but for efficiency reasons the factorisation needs to be truncated after K (cid:28) M steps, where in the simplest case M is the number of grid points as in thepresent paper. For fixed K , the accuracy of the truncated expansion depends on thedecay of the KL eigenvalues and the cost is O ( K M ). When the KL eigenvalues donot decay sufficiently fast, this limits the possible accuracy or leads to prohibitivelyhigh costs.The structure of this paper is as follows. The circulant embedding algorithmis described in § R is given in § § § § There are potentially several ways of computing the factorization (1.6). Since R issymmetric positive definite, its Cholesky factorization R = LL (cid:62) yields (1.6), with s = M . Alternatively, we can use the spectral decomposition R = W Λ W (cid:62) , with W being the orthogonal matrix of eigenvectors, and Λ being the positive diagonalmatrix of eigenvalues, again giving (1.6) with s = M , this time with B = W Λ / .Given that the matrix R is large and dense, in both of these cases the factorizationwill generally be expensive. In the present paper we use a spectral decompositionof a circulant extension of R to obtain (1.6) with s > M – see Theorem 2.2 below. Recall that we began with a uniform grid of points on the d -dimensional unit cube[0 , d . The M = ( m + 1) d points { x i : i = 1 , . . . , M } are assumed to have spacing h := 1 /m along each coordinate direction. We then consider an enlarged cube[0 , (cid:96) ] d of edge length (cid:96) := mh ≥ , (2.1)with integer m ≥ m . We take the point of view that m is fixed (and hence so toois h ), while m is variable (and hence so too is (cid:96) ).We index the grid points on the unit cube by an integer vector k , writing x k := h k for k = ( k , . . . , k d ) ∈ { , . . . , m } d . (2.2)Then it is easy to see that (with analogous vector indexing for the rows and columns)the M × M covariance matrix R defined in (1.4) can be written as R k , k (cid:48) = ρ (cid:0) h ( k − k (cid:48) ) (cid:1) , k , k (cid:48) ∈ { , . . . , m } d . (2.3)6f the vectors k are enumerated in lexicographical ordering, then R is a nested blockToeplitz matrix where the number of nested levels is the physical dimension d . Weremark that all indexing of matrices and vectors by vector notation in this paper isto be considered in this way.In the following it will be convenient to extend the definition of the grid points(2.2) to an infinite grid, x k := h k for k ∈ Z d . Then, in order to define the extended matrix R ext , we define a 2 (cid:96) -periodic map on R by specifying its action on [0 , (cid:96) ]: ϕ ( x ) := (cid:40) x if 0 ≤ x ≤ (cid:96), (cid:96) − x if (cid:96) ≤ x < (cid:96). Now we apply this map componentwise and so define an extended version ρ ext of ρ as follows: ρ ext ( x ) := ρ ( ϕ ( x ) , . . . , ϕ ( x d )) , x ∈ R d . (2.4)Note that ρ ext is 2 (cid:96) -periodic in each coordinate direction and ρ ext ( x ) = ρ ( x ) when x ∈ [0 , (cid:96) ] d . (2.5)Then R ext is defined to be the s × s symmetric nested block circulant matrix with s = (2 m ) d , defined, analogously to (2.3), by R ext k , k (cid:48) = ρ ext (cid:0) h ( k − k (cid:48) ) (cid:1) , k , k (cid:48) ∈ { , . . . , m − } d . (2.6)Moreover, R is the submatrix of R ext in which the indices are constrained to lie inthe range k , k (cid:48) ∈ { , . . . , m } d .In the following it will be convenient to introduce the notation, defined for anyinteger m ≥ Z d m := { , . . . , m − } d , Z dm := {− m, . . . , m − } d . Then we have the following simple result.
Proposition 2.1 R ext has real eigenvalues Λ ext k and corresponding normalised eigen-vectors V k , given, for k ∈ Z d m by the formulae: Λ ext k = (cid:88) k (cid:48) ∈ Z dm ρ ( h k (cid:48) ) exp (cid:18) − π i k · k (cid:48) m (cid:19) , and ( V k ) κ = 1 √ s exp (cid:18) π i k · κ m (cid:19) , κ ∈ Z d m , (2.7) where s is given in (1.7) . Proof.
By (2.4) and the fact that ϕ is symmetric, we see that ρ ext is also symmetric,so the eigenvalues of R ext are real. To obtain the required formula for the eigenvalues,7e use (2.6) and the formula for the eigenvectors in (2.7) to write, for k , κ ∈ Z d m . (cid:0) R ext V k (cid:1) κ = (cid:88) k (cid:48) ∈ Z d m ρ ext ( h ( κ − k (cid:48) )) ( V k ) k (cid:48) = (cid:88) k (cid:48)(cid:48) ∈ κ − Z d m ρ ext ( h k (cid:48)(cid:48) ) ( V k ) κ − k (cid:48)(cid:48) = (cid:32) (cid:88) k (cid:48)(cid:48) ∈ κ − Z d m ρ ext ( h k (cid:48)(cid:48) ) exp (cid:18) − π i k · k (cid:48)(cid:48) m (cid:19) (cid:33) ( V k ) κ . Then (2.7) follows, on using the coordinatewise 2 m -periodicity of the summand inthe last equation, and also the extension property (2.5). (cid:50) It follows from the simple form of the eigenfunctions that the matrix R ext canbe diagonalised by FFT. The following version of the spectral decomposition the-orem, taken from [11], has the advantage that it allows the diagonalisation to beimplemented using only real FFT. Theorem 2.2 R ext has the spectral decomposition: R ext = Q ext Λ ext Q ext , where Λ ext is the diagonal matrix containing the eigenvalues of R ext , and Q ext = Re ( F ) + Im ( F ) is real symmetric, with F k , k (cid:48) = 1 √ s exp (cid:18) π i k (cid:48) · k m (cid:19) denoting the d -dimensional Fourier matrix. If the eigenvalues of R ext are all non-negative then the required B in (1.6) can be obtained by selecting M appropriaterows of B ext := Q ext (Λ ext ) / . (2.8)The use of FFT allows fast computation of the matrix-vector product B ext y forany vector y , which then yields B y needed for sampling the random field in (1.5).Our algorithm for obtaining a minimal positive definite R ext is given in Algorithm 1.Our algorithm for sampling an instance of the random field is given in Algorithm 2.Note that the normalisation used within the FFT routine differs among particularimplementations. Here, we assume the Fourier transform to be unitary.We can replace Step 1 of Algorithm 2 with a QMC point from [0 , s and mappedto R s elementwise by the inverse of the cumulative normal distribution function. Therelative size of the eigenvalues in Λ ext tells us the relative importance of the corre-sponding variables in the extended system, which helps to determine the orderingof the QMC variables. Algorithm 1
Input: d , m , and covariance function ρ .1. Set m = m .2. Calculate r , the first column of R ext in (2.6) .3. Calculate v , the vector of eigenvalues of R ext , by d -dimensional FFT on r . . If smallest eigenvalue < then increment m and go to Step 2.Output: m , v . Algorithm 2
Input: d , m , mean field Z , and m and v obtained by Algorithm 1.1. With s = (2 m ) d , sample an s -dimensional normal random vector y .2. Update y by elementwise multiplication with √ v .3. Set w to be the d -dimensional FFT of y .4. Update w by adding its real and imaginary parts.5. Obtain z by extracting the appropriate M = ( m + 1) d entries of w .6. Update z by adding Z .Output: z (or exp( z ) in the case of lognormal field). In the following subsection we shall show (under mild conditions) that Algo-rithm 1 will always terminate. Moreover we shall give (for the case of the Mat´erncovariance function) a detailed analysis of how (cid:96) depends on various parameters ofthe field. Then in §
3, we give an analysis of the decay rates of the eigenvalues of R ext compared with that of the the eigenvalues of the original Toeplitz matrix R and the KL eigenvalues of the underlying continuous field Z . We first note that by definition (1.1) and (1.2) it follows that for all N ≥
1, all pointsets t , . . . , t N ∈ R d , and all γ ∈ R NN (cid:88) i =1 N (cid:88) i (cid:48) =1 γ i γ i (cid:48) ρ ( t i − t i (cid:48) ) = E [( W − W ) ] ≥ , (2.9)where W ( ω ) = N (cid:88) i =1 γ i Z ( t i , ω )and W denotes its mean. This observation, together with (1.3), means that ρ is a symmetric positive semidefinite function (see, e.g., [22, Chapter 6]). If in addition,(2.9) is always positive for nonzero γ then ρ is called positive definite . In our laterapplications, ρ will always be positive definite.We use the following definition of the Fourier transform of an absolutely inte-grable function ρ : (cid:98) ρ ( ξ ) = (cid:90) R d ρ ( x ) exp( − π i ξ · x ) d x , ξ ∈ R d . (2.10)9f, in addition, ρ is continuous and (cid:98) ρ is absolutely integrable then the Fourier integraltheorem gives ρ ( x ) = (cid:90) R d (cid:98) ρ ( ξ ) exp(2 π i ξ · x ) d ξ , x ∈ R d . In this case, (cid:98) ρ is also continuous, and Bochner’s theorem (e.g., [18, Theorem 6.3])states that ρ is a positive definite function if and only if (cid:98) ρ is positive.The following theorem is the main result of this subsection. Theorem 2.3
Suppose that ρ ∈ L ( R d ) is a real-valued, symmetric positive definitefunction with the additional reflectional symmetry ρ ( x ) = ρ ( ± x , . . . , ± x d ) = ρ ( | x | , . . . , | x d | ) for all x ∈ R d , and suppose (cid:98) ρ ∈ L ( R d ) . Suppose also that for the given value of h = 1 /m wehave (cid:88) k ∈ Z d | ρ ( h k ) | < ∞ . (2.11) Then Algorithm 1 will always terminate with a finite value of m , and the resultingmatrix R ext will be positive definite. We prove Theorem 2.3 via a technical estimate – Lemma 2.4 – which provides anexplicit lower bound for the eigenvalues of R ext . This estimate proves the theorem,and moreover, allows us to obtain explicit lower bounds for the eigenvalues in thelemmas which follow. Lemma 2.4
Under the assumptions of Theorem 2.3, the eigenvalues Λ ext k of R ext all satisfy the estimate Λ ext k ≥ h d min ζ ∈ [ − , ] d (cid:88) r ∈ Z d (cid:98) ρ (cid:16) ζ + r h (cid:17) − (cid:88) k (cid:48) ∈ Z d \ Z dm | ρ ( h k (cid:48) ) | . (2.12) Proof.
By Proposition 2.1, we haveΛ ext k = (cid:88) k (cid:48) ∈ Z d ρ (cid:0) h k (cid:48) (cid:1) exp (cid:18) − π i k (cid:48) · k m (cid:19) − (cid:88) k (cid:48) ∈ Z d \ Z dm ρ (cid:0) h k (cid:48) (cid:1) exp (cid:18) − π i k (cid:48) · k m (cid:19) . (2.13)We obtain the lower bound on the first sum on the right-hand side of (2.13) using(A.3) of Theorem A.1 in the Appendix with h = h and ξ = k (cid:48) / (2 m ), and an upperbound on the second sum in the obvious way. (cid:50) Proof of Theorem 2.3.
Because of the assumed positive definiteness of ρ , it fol-lows fromBochner’s theorem that the Fourier transform (cid:98) ρ is positive, so the strict positiv-ity of the first term on the right-hand side of (2.12) follows from Theorem A.1. Thelower bound is also independent of m . For fixed h , (2.11) ensures that the tail sumin the second term on the right-hand side of (2.12) converges to zero as m → ∞ .Hence the result follows. (cid:50) .3 Isotropic covariance More detailed lower bounds on Λ ext k can be obtained under the additional assumptionthat the random field Z in (1.1) is isotropic , i.e., ρ ( x ) = κ ( (cid:107) x (cid:107) /λ ) , (2.14)where the parameter λ is a correlation length which will play a key role in Example2.7 and Theorem 2.9 below. In this case the Fourier transform (2.10) is given by (cid:98) ρ ( ξ ) = λ d (cid:98) κ d ( λ (cid:107) ξ (cid:107) ) , (2.15)where (cid:98) κ d ( r ) := 2 πr ( d − / (cid:90) ∞ κ ( t ) t d/ J ( d − / (2 π rt ) d t with r ≥ , (2.16)and J α denotes the Bessel function of order α . The right-hand side of (2.16) is the Hankel transform of κ (see, e.g., [18, Theorem 1.107]). The following lemma thenestimates each of the terms on the right-hand side of (2.12) to obtain an explicitlower bound on the eigenvalues of R ext . Lemma 2.5
Suppose ρ and (cid:98) ρ are given as in (2.14) – (2.16) . (i) If | κ | is a decreasing function on R + and r d − κ ( r ) ∈ L ( R + ) , then (with (cid:96) = mh ), (cid:88) k ∈ Z d \ Z dm | ρ ( h k ) | ≤ (3 d − d − ( h /λ ) d (cid:90) ∞ ( (cid:96) − h ) /λ r d − | κ ( r ) | d r . (ii) If (cid:98) κ d is positive and decreasing on R + and r d − (cid:98) κ d ( r ) ∈ L ( R + ) , then h d min ζ ∈ [ − , ] d (cid:88) r ∈ Z d (cid:98) ρ (cid:16) ζ + r h (cid:17) ≥ d d d/ − d − (cid:90) ∞ d / λ/ (2 h ) r d − (cid:98) κ d ( r ) d r . Proof.
We begin by deriving upper and lower bounds for some elementary se-quences.Suppose g is any positive and decreasing function satisfying r d − g ( r ) ∈ L ( R + ).For any integer m ≥ ∞ (cid:88) j = m +1 j d − g ( j ) ≤ ∞ (cid:88) j = m +1 (cid:90) jj − (cid:100) r (cid:101) d − g ( r ) d r ≤ (cid:90) ∞ m (2 r ) d − g ( r ) d r, (2.17)where the first inequality uses (cid:100) r (cid:101) = j for r ∈ ( j − , j ] and the second inequalityfollows from (cid:100) r (cid:101) ≤ r for r ≥
1. Similarly, we can obtain a lower bound for m ≥ ∞ (cid:88) j = m +1 j d − g ( j ) ≥ ∞ (cid:88) j = m +1 (cid:90) j +1 j (cid:98) r (cid:99) d − g ( r ) d r ≥ (cid:90) ∞ m +1 ( r/ d − g ( r ) d r. (2.18)11lso let us consider the set S d ( j ) := { k ∈ Z d : (cid:107) k (cid:107) ∞ = j } for any integer j ≥ S d ( j ) denoting the cardinality of this set, we have the bounds d d j d − ≤ S d ( j ) = (2 j + 1) d − (2 j − d = 2 d (cid:88) i =1 i odd (cid:18) di (cid:19) (2 j ) d − i ≤ (3 d − j d − , (2.19)where the lower bound is the i = 1 term in the binomial expansion and the upperbound comes from the estimate j d − i ≤ j d − for i ≥
1. The bounds are exact for d = 1 and d = 2, with S ( j ) = 2 and S ( j ) = 8 j , while for d = 3 we have S ( j ) = 2 + 24 j , and the lower and upper bounds given by (2.19) are 24 j and26 j respectively.Now, to prove (i), using (2.19) and (2.17) and (cid:107) k (cid:107) ≥ (cid:107) k (cid:107) ∞ , we can now write (cid:88) k ∈ Z d \ Z dm | κ ( h (cid:107) k (cid:107) /λ ) | ≤ ∞ (cid:88) j = m (cid:88) (cid:107) k (cid:107) ∞ = j | κ ( h (cid:107) k (cid:107) /λ ) | ≤ (3 d − ∞ (cid:88) j = m j d − | κ ( h j/λ ) |≤ (3 d − d − (cid:90) ∞ m − r d − | κ ( h r/λ ) | d r = (3 d − d − ( h /λ ) d (cid:90) ∞ ( (cid:96) − h ) /λ r d − | κ ( r ) | d r, with (cid:96) = mh , thus completing the proof of (i).To prove (ii), note first that, for ζ ∈ [ − , ] d and any r ∈ Z d with (cid:107) r (cid:107) ∞ = j , wehave (cid:107) ζ + r (cid:107) ≤ d / (cid:107) ζ + r (cid:107) ∞ ≤ d / (1 / j ). Using (2.15), (2.19) and (2.18) anddropping the r = term, we can write1 h d (cid:88) r ∈ Z d (cid:98) ρ (cid:16) ζ + r h (cid:17) = (cid:18) λh (cid:19) d (cid:88) r ∈ Z d (cid:98) κ d (cid:16) λ (cid:107) ζ + r (cid:107) h (cid:17) ≥ (cid:18) λh (cid:19) d ∞ (cid:88) j =1 (cid:88) (cid:107) r (cid:107) ∞ = j (cid:98) κ d (cid:16) λ (cid:107) ζ + r (cid:107) h (cid:17) ≥ d d ( h /λ ) d ∞ (cid:88) j =1 j d − (cid:98) κ d (cid:16) λ d / (1 / j ) h (cid:17) ≥ d ( h /λ ) d (cid:90) ∞ r d − (cid:98) κ d (cid:16) λd / (1 / r ) h (cid:17) d r = 2 d d/ − (cid:90) ∞ d / λ/ (2 h ) (cid:18) r − λd / h (cid:19) d − (cid:98) κ d ( r ) d r ≥ d d/ − (cid:90) ∞ d / λ/ (2 h ) (cid:18) r (cid:19) d − (cid:98) κ d ( r ) d r, where the last inequality follows from r − c ≥ r/ ⇔ r ≥ c , with c = d / λ/ (2 h ). (cid:50) Corollary 2.6
Under the assumptions of Lemma 2.5, R ext is positive definite if (cid:90) ∞ λd / / (2 h ) r d − (cid:98) κ d ( r ) d r > (3 d −
1) 3 d − d d/ − h /λ ) d (cid:90) ∞ ( (cid:96) − h ) /λ r d − | κ ( r ) | d r. (2.20) Proof.
We make use of Lemma 2.4. The fact that ρ ∈ L ( R d ) and (cid:98) ρ ∈ L ( R d )follow immediately from (2.14) and (2.16) respectively, and from the assumptions r d − κ ( r ) ∈ L ( R + ) and r d − (cid:98) κ d ( r ) ∈ L ( R + ). It then follows from Part (i) ofLemma 2.5 that the assumption (2.11) of Theorem 2.3 is satisfied. Since the symme-try assumption in the theorem is automatically satisfied by an isotropic covariance,the result now follows immediately from Lemma 2.4. (cid:50)
12o interpret Corollary 2.6, recall that λ is the correlation length, so λ is boundedabove (without loss of generality let us assume λ ≤ λ may approach 0. If h is chosen so that h /λ is a fixed constant, then (2.20) and integrability of r d − | κ ( r ) | ensures that positive definiteness is achieved for (cid:96) sufficiently large independently of λ and h . The condition “ h /λ constant” is a natural analogue to the requirement inoscillatory problems that the meshwidth should be proportional to the wavelength.However if (cid:96) and λ are fixed and h → (cid:96) needs to grow as h decreases in order to be sure of positive definiteness. This canbe answered fairly completely and satisfactorily in the case of the Mat´ern family. Example 2.7
The Mat´ern family of covariances are defined by ρ ( x ) = κ ( (cid:107) x (cid:107) /λ ) , where κ ( r ) = σ − ν Γ( ν ) ( √ ν r ) ν K ν (cid:16) √ ν r (cid:17) . (2.21)Here Γ is the gamma function and K ν is the modified Bessel function of the secondkind, σ is the variance, λ is the correlation length and ν > ν ≥ /
2, with the cases ν = 1 / ν = ∞ corresponding to the exponential and Gaussian covariances respectively, see,e.g., [13]. For this reason, we have restricted the analysis to the case ν ≥ /
2, seealso Remark 2.10.For this particular κ , the Hankel transform (cid:98) κ d (see (2.16)) is explicitly known,and so we have from (2.15)–(2.16) that (cid:98) ρ ( ξ ) = λ d (cid:98) κ d ( λ (cid:107) ξ (cid:107) ) , where (cid:98) κ d ( r ) = σ d π d/ (2 ν ) ν Γ( ν + d/ ν ) 1(2 ν + (2 πr ) ) ν + d/ . (2.22)(This can be obtained, for example, by some manipulation of the formula [18, p.264].)Note that for the Mat´ern case both ρ and (cid:98) ρ are positive, radial, decreasing functions.The following theorem shows that when ν and λ are fixed, (cid:96) needs to growwith order log h − in order to be sure of positive definiteness as h →
0. On theother hand if λ and h are fixed, (cid:96) needs to grow like ν / log ν as ν increases. Forgiven (cid:96) , h , the bound on (cid:96) gets smaller as λ decreases. This provides conditionsguaranteeing the termination of Algorithm 1. Notation 2.8
When discussing the Mat´ern case, we shall use the notation A (cid:46) B (equivalently B (cid:38) A ) to mean A/B is bounded above independently of (cid:96), h , λ and ν , and we write A ∼ B if A (cid:46) B and B (cid:46) A . Theorem 2.9
Consider the Mat´ern covariance family (2.21) , with smoothness pa-rameter ν satisfying / ≤ ν < ∞ and correlation length λ ≤ . Suppose h /λ ≤ e − . Then there exist positive constants C and C ≥ √ which may depend ondimension d but are independent of the other parameters (cid:96), h , λ, ν, σ , such that R ext is positive definite if (cid:96)/λ ≥ C + C ν / log (cid:0) max { λ/h , ν / } (cid:1) . (2.23)13 roof. For convenience we introduce the notationΨ = max { λ/h , ν / } . Aiming to verify (2.20), we note first that, since both κ and ˆ κ d depend linearlyon σ , we can without loss of generality set σ = 1. We shall obtain the followinglower bound on the left-hand side of (2.20): (cid:90) ∞ λd / / (2 h ) r d − (cid:98) κ d ( r ) d r (cid:38) ν ν + d/ − (4 π ) − ν Ψ − ν , (2.24)and (subject to assumption (2.23)), the following upper bound on the integral onthe right-hand side of (2.20): (cid:90) ∞ ( (cid:96) − h ) /λ r d − | κ ( r ) | d r (cid:46) ν ν d/ − exp (cid:18) − (cid:114) ν (cid:96) − h λ (cid:19) . (2.25)Since these estimates are rather technical, we defer their justification until theend of this proof. Thus, assuming (2.24) and (2.25) we see that there exists D > D independent of (cid:96), h , λ, ν , such that (2.20) holds ifexp (cid:18) − (cid:114) ν (cid:96) − h λ (cid:19) ≤ D (cid:16) ν π (cid:17) ν Ψ − ν . Taking logs and rearranging, this is equivalent to( (cid:96) − h ) /λ ≥ (cid:112) /ν log (1 /D ) + √ ν / log(20 π /ν ) + 2 √ ν / log Ψ . The first term on the right-hand side is positive only when
D <
1, in which case it issufficient to replace (cid:112) /ν by its maximum value 2 (given that ν ≥ / ν ≤ π , and in this case we have ν / log(20 π /ν ) ≤ √ π log(40 π ) . Taking into account that h /λ ≤ e − , we have thus demonstrated the sufficiency ofa condition of the form (2.23). We now complete the proof by proving the technicalestimates (2.24) and (2.25). Proof of estimate (2.24): Recalling that (cid:98) κ d is given by (2.22), we may use thefollowing elementary result to bound the ratio of gamma functions if d is even:Γ( ν + d/ ν ) = ( ν + d/ − . . . ( ν + 1) ν ≥ ν d/ . (2.26)If d is odd then we may useΓ( ν + d/ ν ) = ( ν + d/ − . . . ( ν + 3 / ν + 1 /
2) Γ( ν + 1 / ν ) (2.27) ≥ ν ( d − / Γ( ν + 1 / ν ) ≥ ν ( d − / ( ν − / / / ≥ ν ( d − / ν / / √ ν ≥ /
2, where in the penultimate step we use Kershaw’s inequality (see [15],equation (1.3)) Γ( x + 1)Γ( x + r ) > (cid:16) x + r (cid:17) − r for x > , < r < , with r = 1 / x = ν − /
2. (If ν = 1 / ν → /
2+ and using continuity.)On using this lower bound in (2.22), we obtain (cid:98) κ d ( r ) (cid:38) (2 ν ) ν ν d/ (2 ν ) − ( ν + d/ (1 + 2 π r /ν ) − ( ν + d/ ∼ (1 + 2 π r /ν ) − ( ν + d/ . Now (for convenience), we introduce the notation a = 3 √ d/ , (cid:101) Ψ = max { λa/h , (cid:112) ν/ π } . Then the left-hand side of (2.20) can be estimated from below by: (cid:90) ∞ λa/h r d − (cid:98) κ d ( r ) d r (cid:38) (cid:90) ∞ λa/h r d − (cid:0) π r /ν (cid:1) − ( ν + d/ d r ≥ (cid:90) ∞ (cid:101) Ψ r d − (cid:0) π r /ν (cid:1) − ( ν + d/ d r . Now, noting that when r ≥ (cid:101) Ψ we have 2 π r /ν ≥
1, and so (cid:90) ∞ λa/h r d − (cid:98) κ d ( r ) d r ≥ (cid:90) ∞ (cid:101) Ψ r d − (4 π r /ν ) − ( ν + d/ d r = (cid:16) ν π (cid:17) ν + d/ (cid:90) ∞ (cid:101) Ψ r − ν − d r ∼ ν ν + d/ − (4 π ) − ( ν + d/ (cid:101) Ψ − ν . which yields (2.24), since (cid:101) Ψ ∼ Ψ. Proof of estimate (2.25): It is sufficient to prove this estimate wth (cid:96) − h on eachside replaced by (cid:96) . By definition (2.21) and the change of variable r (cid:55)→ ( (cid:112) /ν ) r ,we have (cid:90) ∞ (cid:96)/λ r d − | κ ( r ) | d r ∼ ν ) (cid:16) ν (cid:17) ν + d/ (cid:90) ∞ √ /ν ( (cid:96)/λ ) r d + ν − | K ν ( νr ) | d r . (2.28)To obtain an estimate for the Bessel function on the right-hand side of (2.28),note first that by the hypotheses of this theorem, we have h /λ ≤ e − and ν ≥ / C > C > √
2, both not yet fixed. Then (cid:96)/λ ≥ C + C ν / ≥ C + 2 √ ν / > √ ν / = 4 (cid:112) ν/ , (2.29)and hence the range of integration in (2.28) is contained in [4 , ∞ ).The uniform asymptotic estimate for K ν ( νr ) given in [1, 9.7.8] implies that thereexists a ν ∗ < ∞ and a constant C such that, for all ν ≥ ν ∗ , | K ν ( νr ) | ≤ Cν − / exp( − νr ) r ∈ [4 , ∞ ) . (2.30)15e shall show that in fact such an inequality holds for all ν ≥ / r ≥ ν ≥ /
2, we introduce the quantity c ( ν ) := (cid:107) exp( z ) z / K ν ( z ) (cid:107) L ∞ (2 , ∞ ) , The continuity of K ν on [2 , ∞ ) and the order-dependent asymptotics of K ν [1, 9.7.2]ensure that c ( ν ) < ∞ for all ν < ∞ . It can also be shown, by appealing to theintegral representation of the modified Bessel function, that c ( ν ) is continuous withrespect to ν when ν ≥ /
2, from which we deduce that c ∗ := max [1 / ,ν ∗ ] c ( ν ) < ∞ . Now, for 1 / ≤ ν ≤ ν ∗ and r ≥ νr ≥ νr ) ν / | K ν ( νr ) | = r − / (cid:2) exp( νr )( νr ) / | K ν ( νr ) | (cid:3) ≤ c ( ν ) ≤ c ∗ . This shows that the estimate (2.30) holds uniformly for all r ≥ ν ≥ / (cid:90) ∞ (cid:96)/λ r d − | κ ( r ) | d r (cid:46) ν ) (cid:16) ν (cid:17) ν + d/ ν − / (cid:90) ∞ √ /ν ( (cid:96)/λ ) r d − ν e − νr d r . (2.31)Now note that (2.29) implies that (cid:96)/λ ≥ C ν / , and hence (cid:112) /ν ( (cid:96)/λ ) ≥ √ C ,so we can choose C a large enough constant independent of (cid:96), h , λ, ν, σ so that (cid:112) /ν ( (cid:96)/λ ) ≥
10. Then, using the elementary inequality xe ≤ e x which holds when x ≥
1, it follows that ( r/ e ≤ exp( r/
10) when r ≥
10. Hence for r ≥ (cid:112) /ν ( (cid:96)/λ ) ≥
10, we have r d − ν ≤ (cid:18) e (cid:19) d − ν exp (cid:18) d − ν r (cid:19) ≤ (cid:18) e (cid:19) d − ν e νr/ , where in the last step we used d − ≤ ≤ ν . Inserting this into the right-handside of (2.31), we get, after integration and some manipulation, (cid:90) ∞ (cid:96)/λ r d − | κ ( r ) | d r (cid:46) ν ) (cid:18) νe (cid:19) ν ν d/ − / exp (cid:18) − (cid:114) ν (cid:96)λ (cid:19) . Stirling’s formula implies that Γ( ν ) ν / e ν ∼ ν ν and the estimate (2.25) follows. (cid:50) Remark 2.10
The cases ν ≥ / ν several timesin a non-trivial way. First of all it is used in the demonstration that estimates (2.23),(2.24) and (2.25) are sufficient to ensure the positive definiteness of R ext . Then itis used in the proofs of each of the estimates (2.24) and (2.25). The extension ofTheorem 2.9 to ν ∈ (0 , /
2) remains an open question.To complete the theory of this section, we discuss the case ν = ∞ . In this case, ρ ( x ) = κ ( (cid:107) x (cid:107) /λ ) , with κ ( r ) = σ exp( − r / , (cid:98) ρ ( ξ ) = σ (2 π ) d/ λ d exp( − π λ (cid:107) ξ (cid:107) ) . For this (Gaussian) covariance it is well-known that the Karhunen-Lo´eve expan-sion converges exponentially (see, e.g., [21]), in which case it may be preferable tocompute realisations of the field Z via the KL expansion, rather than the methodproposed here. Nevertheless the existing analysis can be applied to this case as wenow show. Theorem 2.11
For the Gaussian covariance (i.e., the Mat´ern kernel with ν = ∞ ),there exists a constant B (depending only on spatial dimension d ) such that positivedefiniteness of R ext is guaranteed when (cid:96) ≥ λ max (cid:26) √ λh , B (cid:27) . Hence, if λ/h is fixed (i.e., a fixed number of grid points per unit correlation lengthis used), then the (cid:96) required for positive definiteness decreases linearly with decreasingcorrelation length λ , until the minimal value (cid:96) = 1 is reached. Proof.
As before, without loss of generality we set σ = 1 and we make use ofCorollary 2.6. The left-hand side of (2.20) is then(2 π ) d/ (cid:90) ∞ λd / / (2 h ) r d − exp( − π r ) d r = (2 π ) − d/ (cid:90) ∞ c ( λ/h ) r d − exp( − r /
2) d r, with c = 3 π √ d > (cid:90) ∞ ( (cid:96) − h ) /λ r d − exp( − r /
2) d r < c (cid:18) h λ (cid:19) d (cid:90) ∞ c λ/h r d − exp( − r /
2) d r, with c = 2 d (2 πd ) − d/ (3 d − − − d . We shall show that there exists B (dependingonly on d ) such that (cid:90) ∞ (cid:96)/λ r d − exp( − r /
2) d r < c (cid:18) h λ (cid:19) d (cid:90) ∞ c λ/h r d − exp( − r /
2) d r, (2.32)when (cid:96) ≥ λ max (cid:26) √ c λh , B (cid:27) , (2.33)and the statement of the theorem then follows, since h ≤ (cid:90) ∞ (cid:96)/λ r d − exp( − r /
2) d r ≤ exp( − (cid:96) / λ ) (cid:90) ∞ (cid:96)/λ r d − exp( − r /
4) d r = 2 d/ exp( − (cid:96) / λ ) (cid:90) ∞ (cid:96)/ ( √ λ ) r d − exp( − r /
2) d r . (2.34)17e now choose B (depending only on d ) to have the property thatexp( − x / x d ≤ c , when x ≥ B. (2.35)Then if (cid:96) satisfies (2.33), we have (cid:96)/λ ≥ B and (cid:96)/λ ≥ √ c λ/h > √ λ/h . (2.36)Thus, using (2.35) and (2.36), we have2 d/ exp( − (cid:96) / λ ) ≤ c (cid:16) √ λ/(cid:96) (cid:17) d < c ( h /λ ) d . Combining this with (2.34) and using the fact that (cid:96)/ ( √ λ ) ≥ c λ/h we obtain(2.32). (cid:50) In this section we return to the formula (1.5) for sampling the random field Z . Forthe circulant embedding method, using the notation introduced above, we have Z ( ω ) = (cid:88) k ∈ Z dm B k Y k ( ω ) + Z , where Y k are i.i.d. standard normals, B k are the columns of the matrix B , and inthis case B is taken to be the appropriate M rows of the matrix B ext , as describedin Theorem 2.2. Recalling (2.8) and noting that every entry in Re ( F ) + Im ( F ) isbounded by (cid:112) /s , it follows that (cid:107) B k (cid:107) ∞ ≤ (cid:114) ext k s , k ∈ Z dm , where Λ ext k are the eigenvalues of the matrix R ext .In Quasi-Monte Carlo (QMC) convergence theory (see for example [13], [12]) itis important to have good estimates for (cid:107) B k (cid:107) ∞ . More precisely, arranging the Λ ext k in non-increasing order, it is important to study the rate of decay of the resultingsequence. In order to obtain some insight into this question we first recall that theΛ ext k depend on both the regular mesh diameter h and the extension length (cid:96) orequivalently, on h and m (see (2.1)). Since m = (cid:96)h − , the dimension s given by(1.7) then grows if either h decreases or (cid:96) increases (or both). To indicate thedependence on these two parameters, in this section we write variouslyΛ ext k = Λ ext k ( h , (cid:96) ) = Λ ext s, k . In order to get insight into the asymptotic behaviour of Λ ext k ( h , (cid:96) ), we study firstthe spectrum of the continuous periodic covariance integral operator defined by R ext v ( x ) := (cid:90) [0 , (cid:96) ] d ρ ext ( x − x (cid:48) ) v ( x (cid:48) ) d x (cid:48) , x ∈ [0 , (cid:96) ] d , ρ ext is defined in (2.4). This operator is a continuous analogue of the matrix R ext defined in (2.6) and the eigenvalues of each are closely related as we shall discussbelow.The operator R ext is a compact operator on the space of 2 (cid:96) -periodic continuousfunctions on R d , and so it has a discrete spectrum with the only accumulation pointat the origin. Since R ext is a periodic convolution operator, it is easily verified thatits eigenvalues and (normalised) eigenfunctions (which depend on (cid:96) ) are λ ext k ( (cid:96) ) = (cid:90) [0 , (cid:96) ] d ρ ext ( x ) exp ( − π i ξ k · x ) d x = (cid:90) [ − (cid:96),(cid:96) ] d ρ ( x ) exp ( − π i ξ k · x ) d x (3.1)and v k ( x ) = (2 (cid:96) ) − d/ exp(2 π i ξ k · x ) , and where ξ k = k / (2 (cid:96) ) and k ∈ Z d . Here we used the fact that ρ ext is 2 (cid:96) -periodic and that ρ ext and ρ coincide on [ − (cid:96), (cid:96) ] d .The eigenvalues λ ext k ( (cid:96) ) are real but not necessarily positive, since ρ ext , unlike ρ , maynot be positive definite (for the definition see § Theorem 3.1
For fixed (cid:96) ≥ and fixed k ∈ Z dm , where m = (cid:96)/h , the matrixeigenvalues Λ ext k ( h , (cid:96) ) , weighted by h d , converge to λ ext k ( (cid:96) ) as h → : h d Λ ext k ( h , (cid:96) ) → λ ext k ( (cid:96) ) , as h → . Proof.
The formula for Λ ext k ( h , (cid:96) ) given by (2.7), when weighted by h d , can be seenas a rectangle rule approximation of the second integral defining λ ext k ( (cid:96) ) in (3.1),with grid spacing of h . Since the integrand in that integral is continuous, theconvergence claim holds. (cid:50) From now on we restrict attention to the Mat´ern case (Example 2.7). Our firstresult shows that λ ext k ( (cid:96) ) is exponentially close to the full range Fourier transform (cid:98) ρ ( ξ k ) (given in (2.10)), and this holds true uniformly in (cid:96) and k . Lemma 3.2
For the case of the Mat´ern kernel, we have (cid:12)(cid:12) λ ext k ( (cid:96) ) − (cid:98) ρ ( ξ k ) (cid:12)(cid:12) (cid:46) λ d ν ν d/ − exp (cid:18) − (cid:114) ν (cid:96)λ (cid:19) , for all (cid:96) ≥ and k ∈ Z dm . (3.2) Thus there exist positive constants C , C (independent of h , (cid:96), λ and ν ), such that,for all ε > and all k ∈ Z dm , (cid:12)(cid:12) λ ext k ( (cid:96) ) − (cid:98) ρ ( ξ k ) (cid:12)(cid:12) ≤ ε, when (cid:96)/λ ≥ C ν / + C ν − / log(1 /ε ) . (3.3) Proof.
From (3.1), (2.10) and (2.21), it is easy to see that (cid:12)(cid:12) λ ext k ( (cid:96) ) − (cid:98) ρ ( ξ k ) (cid:12)(cid:12) ≤ (cid:90) (cid:107) x (cid:107) ≥ (cid:96) | ρ ( x ) | d x ∼ (cid:90) ∞ (cid:96) r d − | κ ( r/λ ) | d r = λ d (cid:90) ∞ (cid:96)/λ r d − κ ( r ) d r, κ as in (2.21), and again we use Notation 2.8. Then, using (2.25), we obtain(3.2).Forcing the right-hand side of (3.2) to be bounded above by ε , rearranging andtaking logs, we obtain a sufficient condition of the form (cid:114) ν (cid:96)λ ≥ C ∗ + log(1 /ε ) + d log λ + ν log 5 + ( d/ −
1) log ν, for some parameter-independent constant C ∗ . Recalling that we assume λ ≤
1, andusing ν ≥ / ν ) /ν (cid:46)
1, the sufficiency condition in (3.3) follows. (cid:50)
With Lemma 3.2 in mind, we now discuss the asymptotic behaviour of (cid:98) ρ ( ξ k ), bymaking use of the analytic formula (2.22). In order to define an appropriate orderingwe make the following definition. Definition 3.3 (
Ordering of the integer lattice)
Since Z d is countable we canwrite its elements as a sequence { k ( j ) : j = 1 , , . . . } , such that k (1) = ∈ Z d andsuch that the sequence {(cid:107) k ( j ) (cid:107) : j = 1 , , . . . } is non-decreasing. This ordering isnot unique. Theorem 3.4
For the case of the Mat´ern kernel, we have < (cid:98) ρ ( ξ k ( j ) ) (cid:46) λ − ν ( ν(cid:96) ) ν + d/ j − (1+2 ν/d ) , j = 1 , , . . . . (3.4) Proof.
By (2.22) and the definition ξ k = k / (2 (cid:96) ), we have0 < (cid:98) ρ ( ξ k ) ∼ λ d ν ν Γ( ν + d/ ν ) (cid:0) ν + π λ (cid:107) k (cid:107) / (2 (cid:96) ) (cid:1) − ( ν + d/ . Now by (2.26) we have (for even d ), Γ( ν + d/ (cid:46) ν d/ Γ( ν ). Moreover the sameestimate also holds for d odd, as can be seen by employing the first equation in(2.27), and then the estimate Γ( ν + 1 / (cid:46) ν / Γ( ν ) (from [15, eq. (1.3)]). Hencewe have the upper bound (cid:98) ρ ( ξ k ) (cid:46) λ d ν ν + d/ (cid:0) ν + π λ (cid:107) k (cid:107) / (2 (cid:96) ) (cid:1) − ( ν + d/ (cid:46) λ d (cid:0) ν(cid:96) (cid:1) ν + d/ ( λ (cid:107) k (cid:107) ) − (2 ν + d ) = λ − ν ( ν(cid:96) ) ν + d/ (cid:107) k (cid:107) − (2 ν + d )2 . Now, to complete the proof of (3.4), we shall show that (cid:107) k ( j ) (cid:107) ∼ j /d . (3.5)To obtain (3.5), for each j , define the set T ( j ) := { k ∈ Z d : (cid:107) k (cid:107) ≤ (cid:107) k ( j ) (cid:107) } . Then T ( j ) is a superset and a subset of two cubes:[ − d − / (cid:107) k ( j ) (cid:107) , d − / (cid:107) k ( j ) (cid:107) ] d ⊆ T ( j ) ⊆ [ −(cid:107) k ( j ) (cid:107) , (cid:107) k ( j ) (cid:107) ] d . (The right inclusion follows from the definition of T ( j ). On the other hand if k liesin the left-most cube, then (cid:107) k (cid:107) ∞ ≤ d − / (cid:107) k ( j ) (cid:107) which implies (cid:107) k (cid:107) ≤ (cid:107) k ( j ) (cid:107) ,i.e., k ∈ T ( j ).) Then it follows that T ( j ) ∼ (cid:107) k ( j ) (cid:107) d . But also by definition T ( j )contains j − Z d . Hence (3.5) follows. (cid:50) R ext and provides an upper bound for λ k ( j ) ( (cid:96) ), explicit in the parameters h , (cid:96), λ and ν . Corollary 3.5
For the Mat´ern kernel, suppose (cid:96)/λ ≥ max { C , C ν / } + max { C ν / , C ν − / } log (cid:0) max { λ/h , ν / } (cid:1) . Then Λ ext k ( h , (cid:96) ) ≥ for all k ∈ Z dm and also λ ext k ( j ) ( (cid:96) ) ≤ min { h /λ, ν − / } + λ − ν ( ν(cid:96) ) ν + d/ j − (1+2 ν/d ) , j = 1 , , . . . (3.6) Discussion 3.6
In the paper [12] the authors of this paper analyse the convergenceof Quasi-Monte Carlo methods for uncertainty quantification of certain PDEs withrandom coefficients, realisations of which are computed using the circulant embeddingtechnique described here. A dimension-independent convergence estimate for theQMC method is proved there under the assumption that given d and covariancefunction ρ , there exist p < and C > such that for all integers m , with m chosenas in Algorithm 1 and s = (2 m ) d , we have (cid:88) k ∈ Z dm (cid:18) Λ ext s, k s (cid:19) p/ ≤ C . (3.7) If Λ ext s,j for j = 1 , . . . , s denotes the eigenvalues Λ ext s, k reordered so as to be nonincreas-ing, then a sufficient condition for this to hold is that for some β > and C > ,independent of j and s , there holds (cid:115) Λ ext s,j s ≤ Cj − β for j = 1 , . . . , s, (3.8) in which case (3.7) holds for p any number in the open interval (1 /β, . Based onthe known rate of decay of the analogous Karhunen-Lo`eve eigenvalues (e.g., [18],[13]) and the fact that the second term in (3.6) decays with the same rate (andbearing in mind the result of Theorem 3.1), we conjecture that Conjecture : For the Mat´ern kernel, (3.8) holds with β = 1 + 2 ν/d . (3.9) Numerical experiments in the next section support this conjecture. Thus we need toassume that ν > d/ in order to ensure that β > and p < . (If ν ≤ d/ thenthere is no predicted convergence rate for the QMC algorithm. A similar theoreticalbarrier was detected in [13]).The conjecture that (cid:113) Λ ext s,j /s ≤ Cj − (1+2 ν/d ) / remains a conjecture in spite of theresults in Lemma 3.2 and Theorem 3.4. That is because these results concern thecontinuous eigenvalues λ ext k ( (cid:96) ) and the result which connects these with the eigenval-ues Λ ext k ( h , (cid:96) ) (namely Theorem 3.1) does not provide enough information about the mall eigenvalues. Indeed it is well known, and demonstrated in the numerical ex-periments in the next section, that the small matrix eigenvalues depart significantlyfrom the corresponding eigenvalues λ ext k ( (cid:96) ) of the integral operator.A sufficient condition for (3.8) , and hence for (3.7) , is that (3.8) holds not forall eigenvalues but for some fixed fraction of the eigenvalues, say α , starting fromthe largest. Suppose, for example, that for some fixed α ∈ (0 , we have (cid:115) Λ ext s,j s ≤ Cj − β for j = 1 , . . . , (cid:100) αs (cid:101) . Then because the eigenvalues are ordered, we also have, for j = (cid:100) αs (cid:101) + 1 , . . . , s , (cid:115) Λ ext s,j s ≤ C (cid:100) αs (cid:101) − β = Cj − β (cid:18) j (cid:100) αs (cid:101) (cid:19) β ≤ Cα β j − β , so that the sufficiency condition (3.8) is satisfied for all j = 1 , . . . , s with an appro-priately larger value of the constant. In this section we perform numerical experiments illustrating the theoretical resultsgiven above, for the Mat´ern covariance in 2D and 3D. In all experiments we set thevariance σ = 1. In § § ext s,j and compareit to (3.9). We investigate experimentally the minimal value of (cid:96) needed to ensure that the ex-tended circulant matrix R ext is positive definite. Recall that Theorem 2.3 guaranteesthe existence of such an (cid:96) and Theorems 2.9 and 2.11 quantify its behaviour in theMat´ern case.The plots in Figure 1 show the behaviour of (cid:96) as a function of log h − = log m for various choices of d , ν < ∞ and λ . They clearly show that (cid:96) depends linearly onlog m , for m large enough. They also indicate that (cid:96) gets smaller if either λ or ν gets smaller, all in accordance with Theorem 2.9. For fixed d , h and λ , Theorem 2.9gives a lower bound on (cid:96) which grows like ν / . We observe this behaviour clearlyfor d = 2, but the growth with respect to ν is a bit slower for d = 3. (See Figure 2,which plots log (cid:96) against log ν .) The small triangle embedded in each graph indicates O ( ν / ) growth.Table 1 illustrates the result of Theorem 2.11 which covers the case ν = ∞ . Herewe tabulate the value of (cid:96) needed to ensure positive definiteness of R ext , when de-creasing λ and keeping m λ fixed at 8. Theorem 2.11 gives a bound which decreaseslinearly in λ , until the minimum (cid:96) = 1 is reached. This behaviour is exactly as22 m (cid:96) ( d = 2) (cid:96) ( d = 3)1 8 8 90 . . .
25 32 2 2 . .
125 64 1 1 . (cid:96) required to ensure positive defi-niteness of R ext in the case ν = ∞ , with λ decreasing and m λ = 8 fixed. m ‘ = m / m d = 2, λ = 0 . ν = 4 ν = 2 ν = 1 ν = 0 . m ‘ = m / m d = 2, λ = 0 . ν = 4 ν = 2 ν = 1 ν = 0 . m ‘ = m / m d = 2, λ = 0 . ν = 4 ν = 2 ν = 1 ν = 0 .
50 2 4 6 802468 log m ‘ = m / m d = 3, λ = 0 . ν = 4 ν = 2 ν = 1 ν = 0 . m ‘ = m / m d = 3, λ = 0 . ν = 4 ν = 2 ν = 1 ν = 0 . m ‘ = m / m d = 3, λ = 0 . ν = 4 ν = 2 ν = 1 ν = 0 . Figure 1: Illustration of Theorem 2.9: Graphs of the minimum needed value of (cid:96) toobtain positive definiteness against log h − = log m for different choices of d , λ and ν . The graphs show a linear relationship with lower values of (cid:96) for smaller λ and for smaller ν . 23 ν l og ‘ d = 2, m = 32 λ = 1 λ = 0 . λ = 0 . λ = 0 . − ν l og ‘ d = 2, m = 64 λ = 1 λ = 0 . λ = 0 . λ = 0 . − ν l og ‘ d = 2, m = 128 λ = 1 λ = 0 . λ = 0 . λ = 0 . − ν l og ‘ d = 3, m = 8 λ = 1 λ = 0 . λ = 0 . λ = 0 . − ν l og ‘ d = 3, m = 16 λ = 1 λ = 0 . λ = 0 . λ = 0 . − ν l og ‘ d = 3, m = 32 λ = 1 λ = 0 . λ = 0 . λ = 0 . Figure 2: Illustration of Theorem 2.9: Log-log graphs of the minimum needed valueof (cid:96) to obtain positive definiteness against ν for various choices of d , m and λ . Thesmall triangle depicts a gradient of 0 . R ext has many very small eigenvalues and we deemit to be positive definite when all eigenvalues are positive, ignoring those eigenvalueswhich are less than 10 − in modulus. Here we perform experiments to verify the decay conjecture (3.9). In Figures 3and 4 we present log-log plots of (cid:113) Λ ext s,j /s against the index j for various choices of d , ν and λ . For decreasing h (i.e., increasing m ) we determine the minimum (cid:96) toachieve positive definiteness and make the plot for each case. If the conjecture (3.9)holds, then we expect to see a polynomial decay of rate − (1 + 2 ν/d ) /
2, as shownin the slope triangle in each case. (The convergence may tail off eventually; notethat Theorem 3.1 does not provide an explicit convergence rate.) We see that thecomputations closely follow the prediction of (3.9). Moreover, to have the constant C in (3.7) for the minimal embedding to be absolutely bounded independent of h (and thus also independent of s , m and (cid:96) ) we observe the lines to get closer togetherfor increasing m .Thus, while the numerical evidence supports conjecture (3.9), it remains aninteresting open problem to prove it. 24 − − − − − − − − − − m = 8 m = 16 m = 32 m = 64 1 − / j of sorted eigenvalues q Λ e x t s , j / s d = 2, λ = 0 . ν = 4 m = 4, m = 9, ‘ = 2 . m = 8, m = 26, ‘ = 3 . m = 16, m = 71, ‘ = 4 . m = 32, m = 177, ‘ = 5 . − − − − − − − − − − m = 8 m = 16 m = 32 m = 64 1 − / j of sorted eigenvalues q Λ e x t s , j / s d = 2, λ = 0 . ν = 2 m = 4, m = 7, ‘ = 1 . m = 8, m = 21, ‘ = 2 . m = 16, m = 55, ‘ = 3 . m = 32, m = 134, ‘ = 4 . Figure 3: Loglog plot of the decay of the eigenvalues in case of the minimal embed-ding for d = 2, λ = 0 .
5, and ν = 4 (top) and ν = 2 (bottom). The expected decayrates (3.9) are marked by the slope triangles.25 m = 8 m = 16 m = 32 m = 641 / j of sorted eigenvalues q ⇤ e x t s , k ( j ) / s d = 3, = 0 . ⌫ = 2 m = 4, m = 9, ` = 2 . m = 8, m = 27, ` = 3 . m = 16, m = 69, ` = 4 . m = 32, m = 169, ` = 5 . m = 8 m = 16 m = 32 m = 641 / j of sorted eigenvalues q ⇤ e x t s , k ( j ) / s d = 3, = 0 . ⌫ = 1 m = 4, m = 9, ` = 2 . m = 8, m = 25, ` = 3 . m = 16, m = 63, ` = 3 . m = 32, m = 152, ` = 4 . Figure 4: Loglog plot of the decay of the eigenvalues in case of the minimal embed-ding for d = 3, λ = 0 . ν = 2. The expected decay rate (3.9) is marked by theslope triangle. 26 Sampling theorem
The following result is a d -dimensional version of what is commonly called a samplingtheorem. (See [4] for a 1-dimensional version.) Theorem A.1
Let ρ ∈ L ( R d ) be real-valued and symmetric, with (cid:98) ρ ∈ L ( R d ) .Suppose also that for some h > , (cid:88) k ∈ Z d | ρ ( h k ) | < ∞ . (A.1) Then (cid:88) k ∈ Z d ρ ( h k ) exp( − π i h k · ξ ) = 1 h d (cid:88) r ∈ Z d (cid:98) ρ (cid:16) ξ + r h (cid:17) for almost all ξ ∈ R d . (A.2) If (cid:98) ρ is everywhere positive, then for all ξ ∈ R d (cid:88) k ∈ Z d ρ ( h k ) exp( − π i h k · ξ ) ≥ essinf ζ ∈ [ − h , h ] d h d (cid:88) r ∈ Z d (cid:98) ρ (cid:16) ζ + r h (cid:17) ≥ max r ∈ Z d min ζ ∈ [ − , ] d h d (cid:98) ρ (cid:18) ζ + r h (cid:19) > . (A.3) Proof.
Note first that ρ and (cid:98) ρ are both continuous because of the assumed integra-bility of (cid:98) ρ and ρ respectively. Moreover (cid:98) ρ is real because of the assumed symmetryof ρ . The infinite sum on the left-hand side of (A.2) is a well-defined continuousfunction of ξ ∈ R d , because the convergence is absolute and uniform by virtue of(A.1).Now consider the right-hand side of (A.2), writing it for convenience as g h ( ξ ) := 1 h d (cid:88) r ∈ Z d (cid:98) ρ (cid:16) ξ + r h (cid:17) , ξ ∈ R d . We will show that the function so defined is locally integrable. It is also manifestly1 /h periodic in each coordinate direction, since g h ( ξ + q /h ) = g h ( ξ ) for all q ∈ Z d .To show the local integrability of g h we will show that the sum defining g h converges absolutely to an integrable function on [0 , /h ] d . Letting Ξ denote anyfinite subset of Z d , we have (cid:90) [0 , /h ] d (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:88) r ∈ Ξ (cid:98) ρ (cid:16) ξ + r h (cid:17) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) d ξ ≤ (cid:90) [0 , /h ] d (cid:88) r ∈ Ξ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:98) ρ (cid:16) ξ + r h (cid:17) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) d ξ = (cid:88) r ∈ Ξ (cid:90) [0 , /h ] d (cid:12)(cid:12)(cid:12)(cid:98) ρ (cid:16) ξ + r h (cid:17)(cid:12)(cid:12)(cid:12) d ξ = (cid:88) r ∈ Ξ (cid:90) (1+ r ) /hr /h · · · (cid:90) (1+ r d ) /hr d /h | (cid:98) ρ ( ξ ) | d ξ ≤ (cid:90) R d | (cid:98) ρ ( ξ ) | d ξ < ∞ , where the last step follows from the assumed integrability of (cid:98) ρ . It then follows fromthe dominated convergence theorem that g h is integrable on [0 , /h ] d .27e now show that the left-hand side of (A.2) is just the Fourier series of theintegrable (1 /h )-periodic function g h . The Fourier coefficients of g h are given by (cid:98) g h ( k ) = h d (cid:90) / (2 h ) − / (2 h ) · · · (cid:90) / (2 h ) − / (2 h ) (cid:32) h d (cid:88) r ∈ Z d (cid:98) ρ (cid:16) ξ + r h (cid:17)(cid:33) exp(2 π i h k · ξ ) d ξ , where the unconventional sign in the exponent is valid because g h is real and sym-metric. The order of integration and summation can be interchanged by a secondapplication of the dominated convergence argument, to give (cid:98) g h ( k ) = (cid:88) r ∈ Z d (cid:90) / (2 h ) − / (2 h ) · · · (cid:90) / (2 h ) − / (2 h ) (cid:98) ρ (cid:16) ξ + r h (cid:17) exp(2 π i h k · ξ ) d ξ = (cid:88) r ∈ Z d (cid:90) (1 / r ) /h ( − / r ) /h · · · (cid:90) (1 / r d ) /h ( − / r d ) /h (cid:98) ρ ( ξ ) exp(2 π i h k · ξ ) d ξ = (cid:90) R d (cid:98) ρ ( ξ ) exp(2 π i h k · ξ ) d ξ = ρ ( h k ) , where in the last step we used the fact that the inverse Fourier transform of (cid:98) ρ recovers ρ because of the integrability of ρ . Thus the left-hand side of (A.2) is justthe Fourier series of g h , as asserted. From this it follows that the integrable function g h is equal almost everywhere to to the continuous function on the left-hand side,so completing the proof of (A.2).If (cid:98) ρ is a positive function it follows that the left-hand side of (A.2) is boundedbelow by the essential infimum of g h ( ξ ) over ξ , which in turn can be bounded belowby retaining only the largest term in the sum, which is positive. (cid:50) References [1] M. Abramowitz and I.A. Stegun.
Handbook of Mathematical Functions , Dover,New York, 1965.[2] R.J. Adler,
The Geometry of Random Fields , Wiley, London, 1981.[3] M. Bachmayr, A. Cohen and G. Migliorati, Representations of Gaussian randomfields and approximation of elliptic PDEs with lognormal coefficients,
J. FourierAnal. Appl. , published online 29 March 2017.[4] D.C. Champeney,
A Handbook of Fourier Transforms , Cambridge UniversityPress, Cambridge, 1987.[5] G. Chan and A.T.A. Wood, Simulation of stationary Gaussian processes in[0 , d , J. Comput. Graph. Stat. , , 409-432, 1994.[6] G. Chan and A.T.A. Wood, Algorithm AS 312: An Algorithm for simulatingstationary Gaussian random fields, Appl. Stat. – J. Roy. St. C , , 171–181,1997. 287] C.R. Dietrich and G.H. Newsam, Fast and exact simulation of stationary Gaus-sian processes through circulant embedding of the covariance matrix, SIAM J.Sci. Comput. , , 1088–1107, 1997.[8] M. Eiermann, O.G. Ernst and E. Ullmann, Computational aspects of thestochastic finite element method, Comput. Visual. Sci. , (1), 3–15, 2007.[9] M. Feischl, F.Y. Kuo and I.H. Sloan, Fast random field generation with H -matrices, to appear in Numer. Math. , 2018.[10] R.G. Ghanem and P.D. Spanos,
Stochastic Finite Elements , Dover, 1991.[11] I.G. Graham, F.Y. Kuo, D. Nuyens, R. Scheichl and I.H. Sloan, Quasi-Montecarlo methods for elliptic PDEs with random coefficients and applications,
J.Comput. Phys. , , 3668-3694, 2011.[12] I.G. Graham, F.Y. Kuo, D. Nuyens, R. Scheichl and I.H. Sloan, Cir-culant embedding with QMC – analysis for elliptic PDEs with lognor-mal coefficients, Preprint arXiv:1710.09254 , 25 October 2017, available at https://arxiv.org/abs/1710.09254 [13] I.G. Graham, F.Y. Kuo, J.A. Nicholls, R. Scheichl, C. Schwab and I.H. Sloan,Quasi-Monte Carlo finite element methods for elliptic PDEs with log-normalrandom coefficients,
Numer. Math. , , 329–368, 2015.[14] H. Harbrecht, M. Peters and M. Siebenmorgen, Efficient approximation of ran-dom fields for numerical application, Numer. Linear Algebr. , , 596–617, 2015.[15] D. Kershaw, Some extensions of W. Gautschi’s inequatities for the gammafunction, Math. Comput. , , 607–611, 1983.[16] B. N. Khoromskij, A. Litvinenko and H. G. Matthies, Application of hierarchi-cal matrices for computing the Karhunen-Lo`eve expansion, Computing , (1-2),49–67, 2009.[17] D.P. Kroese and Z.I. Botev, Spatial Process Generation. In: V. Schmidt (Ed.). Lectures on Stochastic Geometry, Spatial Statistics and Random Fields , VolumeII: Analysis, Modeling and Simulation of Complex Structures, Springer-Verlag,Berlin, 2014.[18] G. Lord, C. Powell, T. Shardlow,
An Introduction to Computational StochasticPDEs , Cambridge University Press, Cambridge, 2014.[19] R.L. Naff, D.F. Haley, and E.A. Sudicky, High-resolution Monte Carlo sim-ulation of flow and conservative transport in heterogeneous porous media 1.Methodology and flow results,