Flexible Validity Conditions for the Multivariate Matérn Covariance in any Spatial Dimension and for any Number of Components
FFlexible Validity Conditions for the Multivariate Mat´ern Covariance in anySpatial Dimension and for any Number of Components
Xavier Emery a,b , Emilio Porcu c,d , Philip White e a Department of Mining Engineering, University of Chile, Avenida Beauchef 850, Santiago, Chile. b Advanced Mining Technology Center, University of Chile, Avenida Beauchef 850, Santiago, Chile. c Department of Mathematics, Khalifa University of Science and Technology, Abu Dhabi, The United Arab Emirates. d School of Computer Science and Statistics, Trinity College, Dublin, Ireland. e Department of Statistics, Brigham Young University, Provo, UT, USA.
Abstract
Flexible multivariate covariance models for spatial data are on demand. This paper addresses the problem of para-metric constraints for positive semidefiniteness of the multivariate Mat´ern model. Much attention has been given tothe bivariate case, while highly multivariate cases have been explored to a limited extent only. The existing condi-tions often imply severe restrictions on the upper bounds for the collocated correlation coe ffi cients, which makes themultivariate Mat´ern model appealing for the case of weak spatial cross-dependence only. We provide a collectionof validity conditions for the multivariate Mat´ern covariance model that allows for more flexible parameterizationsthan those currently available. We also prove that, in several cases, we can attain much higher upper bounds for thecollocated correlation coe ffi cients in comparison with our competitors. We conclude with a simple illustration on atrivariate geochemical dataset and show that our enlarged parametric space allows for a better fitting performance withrespect to oour competitors. Keywords:
Spatial cross-correlation, Mat´ern model, Multivariate covariance function, Vector random fields,Conditionally negative semidefinite matrices.
1. Introduction
This work is motivated by highly-multivariate spatial modeling problems in geosciences and mining engineeringapplications. For instance, in exploration geochemistry, it is of interest to map the concentrations of several tens tohundreds of elements, with the object of detecting concealed mineral deposits [7]. In mineral resources evaluation, thegrades of several elements of interest (main products, by-products, and contaminants) often need to be jointly predictedor simulated [24]. In geotechnics, the rock mass rating classification system is obtained as the combination of severalbasic parameters (uniaxial compressive strength, rock quality designation, spacing of discontinuities, condition ofdiscontinuities, groundwater conditions, and orientation of discontinuities) [25]. In geometallurgy, one deals withinformation on mineral proportions, grain sizes, rock density, texture, indices of fragmentation, abrasion, crushingor grinding of the rock, mass recoveries, and metallurgical recoveries, leading to tens or hundreds of variables to bejointly modeled [34]. There has been considerable e ff ort in the recent literature to find flexible multivariate modelsthat overcome the drawbacks of the well-known linear model of coregionalization (LMC) Yet, the LMC is still thedefault approach for multivariate spatial modeling within the above branches of applied sciences.Let p be a positive integer, and let Z = { Z ( s ) = ( Z ( s ) , . . . , Z p ( s )) (cid:62) : s ∈ R d } be a p -variate random field, where (cid:62) isthe transpose operator and each component Z i , i = , · · · , p , is a real-valued (scalar) random field. Z is second-order Preprint submitted to Journal of Multivariate Analysis January 13, 2021 a r X i v : . [ s t a t . M E ] J a n tationary if its expected value exists and is a constant vector in R d , and if the covariance between Z ( s ) and Z ( s (cid:48) )depends exclusively on h = s − s (cid:48) . This stationarity assumption is common in applications in order to facilitateinference on the spatial correlation structure of the random field from a set of scattered data. A stationary covariancefunction K : R d → R p × p is a positive semidefinite matrix-valued mapping whose elements are defined as K i j ( h ) = cov { Z i ( s ) , Z j ( s + h ) } , i , j = , · · · , p , and s , h ∈ R d . Showing positive definiteness of any candidate K is often achallenging task, see [16], [6] and references therein.[1] call the following construction principle multivariate parametric adaptation : let p be a positive integer. Let { K ( · ; λ ) : R d → R , λ ∈ R m } be a parametric family of isotropic univariate correlation functions ( K (0; λ ) =
1) indexedby a parameter vector λ = ( λ , . . . , λ m ) (cid:62) . Call λ i j = ( λ i j , , . . . , λ i j , m ) (cid:62) , i , j = , . . . , p , a set of parameter vectors in R m . Then, define K : R d → R p × p through K ( h ) = (cid:104) σ i j K ( h ; λ i j ) (cid:105) pi , j = , h ∈ R d , where σ i j is the covariance betweenthe i th and j th components of the p -variate random field at the same spatial location. Hereafter, we call the quantities σ i j and ρ i j = σ i j / √ σ ii σ j j the collocated covariance and collocated correlation coe ffi cient between both random fieldcomponents, respectively.The Mat´ern covariance function has been extremely popular in the last 50 years and has been used in many branchesof statistics [16]. Theoretical properties of the Mat´ern covariance function for both estimation and prediction underfixed domain asymptotics have been studied by [32], [36] and [5]. The Mat´ern model is so widely used because ofits flexible smoothness, in the mean square sense, and fractal properties of the resulting Gaussian random field. Also,the Mat´ern covariance function has many other popular covariance functions as special cases, obtained by fixing thesmoothness parameter: in particular, the exponential and first-order autoregressive models are special cases of theMat´ern model. The Gaussian model is a limiting case of a reparameterized version of the Mat´ern model. Finally,the Mat´ern covariance has a simple form for the related spectral density, and hence it is an appealing model from theperspective of spectral- [8, 14] and half-spectral modeling [20, 33] and spectral simulation [12, 22].[17] coupled the Mat´ern covariance function with the multivariate parametric adaptation principle to build the multi-variate Mat´ern covariance model. The validity conditions ( i.e. , positive semidefiniteness) are attained under restrictionon the parameters indexing the multivariate covariance function. The main focus in [17] is on necessary and su ffi cientconditions for the bivariate case ( p = ffi cient validity conditions for the p -variate Mat´ern modelwith p > The multivariate Mat´ern model is a breakthrough in the literature where the LMC has been central for decades [21].The LMC represents any component of a p -variate random field Z as a linear combination of latent, uncorrelatedscalar fields. [17] and [10] advocate for di ff erent modeling strategies because of the potential drawbacks of the LMC:the smoothness of any component of the multivariate field amounts to that of the roughest underlying scalar field.Moreover, the number of parameters can quickly become huge as the number of components increases.The multivariate Mat´ern model is certainly more flexible in that it allows for di ff erent smoothness for the componentsof the p -variate random field. Yet, such flexibility is limited by restrictions to the parameter space that ensure K ispositive semidefinite. Notably, in the bivariate case, one of the most important restrictions amounts to an upper boundfor the absolute value of the collocated correlation coe ffi cient, ρ . For p > p ( p − / ffi cients become overly severe.While the bivariate case has been characterized completely by [17], only su ffi cient conditions are available for the case p > d , where the random field is defined. Yet, these sets of conditions imply very severe restrictionsin terms of upper bound for the p ( p − / ffi cients. This can result in a very unrealisticmodel that works under the assumption of quasi-independence between the components of the p -variate random field,as such upper bounds are often close to zero. We explore this aspect in Section 5. The objective of this paper is to improve the definition of the validity region for the parameters of the multivariateMat´ern covariance model, by proposing new su ffi cient conditions that are broader or less restrictive than the currently2nown ones. We do not address any modification in the model itself or in its use in estimation, prediction or simulationstudies, for which we refer to the existing literature [3, 17]. In particular, examining the model complexity, comparingestimation approaches or analyzing the impact of the model parameters in prediction outputs are out of the scope ofthis work. The interest is to find novel conditions under which the p -variate Mat´ern model is positive semidefinite in R d for any given spatial dimension d and number of random field components p .Table 1 gives a comparative sketch of [17], [3], [11], and the findings provided in this paper. The table omits the case p =
2, insofar as [17] identified necessary and su ffi cient conditions for this case. For the case p >
2, we give severalconditions for the validity of the multivariate Mat´ern model, with some specific cases (last column), as well as themost general case (second column). 3 on t r i bu ti on by G e n e r a l c ond iti on s d e p e nd i ngon a m a t r i x - v a l u e dhyp e r p a r a m e t e r ψ G e n e r a l c ond iti on s d e p e nd i ngon ( ν , α , σ ) a nd , po ss i - b l y , a po s iti v e s ca l a r hyp e r p a r a m e t e r β o r δ S p ec i fi cc ond iti on s d e p e nd i ngon ( ν , α , σ ) w it h ν = ν o r α = α T h i s p a p e r ν ∈ B ; α − ∈ B ; σ α − d Γ ( ν + d / ) / Γ ( ν ) ≥ . ψ ≤ ; ν ≤ ; α ψ − ν ≤ ; ν ≤ ; ν / α ≤ ; σ / Γ ( ν ) α d ν ν + d / e xp ( − ν ) ≥ . ν = ν ; α − ≤ ; σ α − d ≥ . σ / Γ ( ν ) ψ ν + d / α ν e xp ( − ν ) ≥ . α = α ; ν ≤ ; σ / Γ ( ν ) ν ν + d / e xp ( − ν ) ≥ . ν ≤ ; α − β ν ≤ ; σ / Γ ( ν ) (cid:16) α / β (cid:17) ν e xp ( − ν ) ≥ . ν = ν ; α ≤ ; σ α ν ≥ . ν = ν ; α ≤ ; σ α (cid:98) d + + (cid:100) ν (cid:101)(cid:99) / ≥ . [ ] α = α ; ν ≤ ; σ Γ ( ν + d / ) / Γ ( ν ) ≥ . [ ] ν ij − ν ii + ν jj ) = δ ( − a ij ) w it h a ii = , a ij ∈ [ , ] ; [ a ij ] i , j ≥ ; α ≤ ; (cid:20) σ ij α ij δ + ν ii + ν jj Γ ( ν ij + d / ) Γ ( ν ii / + ν jj / + d / ) Γ ( ν ij ) (cid:21) i , j ≥ . [ ] ν ij = ( ν ii + ν jj ) / ; α = α ; (cid:20) σ ij Γ ( ν ii ) / Γ ( ν jj ) / Γ (( ν ii + ν jj + d ) / ) Γ ( ν ii + d / ) / Γ ( ν jj + d / ) / Γ (( ν ii + ν jj ) / ) (cid:21) i , j ≥ . T a b l e : A m a po f t h e r e s u lt s c on t a i n e d i n t h i s p a p e r , w it h c o m p a r i s on w it h ea r li e r lit e r a t u r e . H e r e , ≤ s t a nd s f o r i s c ond iti ona ll y n e ga ti ve d e fin it e , ≥ f o r i s po s iti ve s e m i d e fin it e a nd ∈ B f o r i s a B e r n s t e i n m a t r i x . A llt h e m a t r i xop e r a ti on s ( p r odu c t , d i v i s i on , i nv e r s e , po w e r a nd e xpon e n ti a l ) a r e t a k e n e l e m e n t - w i s e . T h e o t h e r s y m bo l s a r ec l a r i fi e d i n t h e p a p e r . ffi cient. Section 6 illustrates our findings through a geochemical dataset of three coregionalizedvariables. Section 7 concludes the paper with a short discussion. The proofs of the technical results are deferred toAppendix A to Appendix C, where we also include some other background material and lemmas.
2. Background
The isotropic Mat´ern correlation function in R d , d ≥
1, is given by k ( h ; α, ν ) = − ν Γ ( ν ) ( α (cid:107) h (cid:107) ) ν K ν ( α (cid:107) h (cid:107) ) , h ∈ R d , (1)with (cid:107) · (cid:107) the Euclidean norm, Γ the gamma function, K ν the modified Bessel function of the second kind, ν and α positive smoothness and scale parameters, respectively. The parameter ν indexes both mean square di ff erentiabilityand fractal dimension of a Gaussian random field having a Mat´ern correlation function.Being a correlation function and integrable in R d for all positive ν , the function k ( · , α, ν ) is the Fourier transform of apositive and bounded measure that is absolutely continuous with respect to the Lebesgue measure: k ( h ; α, ν ) = (cid:90) R d cos( (cid:104) h , ω (cid:105) ) (cid:101) k ( ω ; α, ν )d ω , h ∈ R d , (2)with (cid:104)· , ·(cid:105) being the inner product and (cid:101) k as the isotropic spectral density of the Mat´ern covariance, defined as [22] (cid:101) k ( ω ; α, ν ) = Γ ( ν + d / Γ ( ν ) α d π d / (cid:32) + (cid:107) ω (cid:107) α (cid:33) − ν − d / , ω ∈ R d . (3)We refer to (3) as the spectral representation of k . Another relevant fact is that the mapping t (cid:55)→ k ( √ t ; α, ν ) iscompletely monotonic on the positive real line. Hence, the mapping (1) can be written as a mixture of Gaussiancovariances of the form g ( h ; u ) = exp( − u (cid:107) h (cid:107) ), where 1 / u follows a gamma distribution with shape parameter ν andscale parameter α / k ( h ; α, ν ) = (cid:90) + ∞ g ( h ; u ) f ( u ; α, ν )d u , h ∈ R d , (4)where f ( u ; α, ν ) = Γ ( ν ) (cid:18) α (cid:19) ν u − ν − exp (cid:32) − α u (cid:33) , u > , is the inverse gamma probability density function with parameters ν and α / ν is an half-odd-integer, then k ( · ; α, ν ) factors into the product of a negative exponential function with a polynomialof degree ν − /
2. For instance, ν = / ν = / k ( h ; α, / = exp( − α (cid:107) h (cid:107) ) and k ( h ; α, / = exp( − α (cid:107) h (cid:107) )(1 + α (cid:107) h (cid:107) ), which are associated with Gaussian random fields being the continuous versions of a Orstein-Uhlembeck process and a first-order autoregressive process, respectively [5, 32]. Another relevant fact is that k ( h ; 2 (cid:112) βν, ν ) −−−−−→ ν → + ∞ g ( h ; β ) = exp( − β (cid:107) h (cid:107) ) , (5)with a uniform convergence on any compact set of R d . This convergence is established by showing that the spectral5ensity of k ( h ; 2 √ βν, ν ) tends pointwise to that of g ( h ; β ) [22] as ν tends to infinity: (cid:101) k ( ω ; 2 (cid:112) βν, ν ) −−−−−→ ν → + ∞ (cid:101) g ( ω ; β ) = πβ ) d / exp (cid:32) − (cid:107) ω (cid:107) β (cid:33) , ω ∈ R d . (6)Since the Gaussian correlation model is not exactly a special case of the Mat´ern correlation function, it will receive aseparate exposition in the paper. The results obtained with this model will further be used to obtain validity conditionsfor the multivariate Mat´ern model. The p -variate isotropic Mat´ern covariance model in R d is defined as [17] K ( h ; α , ν , σ ) = [ σ i j k ( h ; α i j , ν i j )] pi , j = , , h ∈ R d , (7)where α = [ α i j ] pi , j = and ν = [ ν i j ] pi , j = are symmetric matrices with positive entries (scale and shape parameters,respectively), while σ = [ σ i j ] pi , j = is a symmetric matrix (collocated covariance) with entries in R . As a particularcase, if ν = /
2, with as a matrix of ones, one obtains the p -variate exponential covariance. A direct application of(4) shows that K ( h ; α , ν , σ ) = (cid:90) + ∞ g ( h ; u ) σ (cid:12) F ( u ; α , ν )d u , (8)with (cid:12) denoting the Hadamard product and F ( u ; α , ν ) = [ f ( u ; α i j , ν i j )] pi , j = . Here, integration is intended as pointwise.A necessary and su ffi cient validity condition for K in (7) is that the spectral density matrix (cid:101) K ( ω ; α , ν , σ ) with entries (cid:101) K i j ( ω ) = σ i j (cid:101) k ( ω ; α i j , ν i j ) and (cid:101) k as defined in (3), is a positive semidefinite matrix for every fixed ω ∈ R d [2, with thereferences therein]. This result has been used to obtain necessary and su ffi cient validity conditions for the bivariateMat´ern model [17, theorem 3]. Yet, applying these validity conditions becomes prohibitive when p > p -variate Gaussian covariance model is defined as G ( h ; β , σ ) = [ σ i j g ( h ; β i j )] , h ∈ R d , (9)with β = [ β i j ] pi , j = a symmetric matrix with positive entries and σ = [ σ i j ] pi , j = a symmetric matrix with entries in R .This model is the limit of a properly rescaled multivariate Mat´ern covariance (7) when the shape parameters ν i j tendto infinity, as per (5). A p × p symmetric real-valued matrix A is said to be conditionally negative semidefinite if, for any vector λ in R p whose components add to zero, one has λ T A λ ≤
0. As an example of conditionally negative semidefinite matrices,for a positive integer d , a set of real numbers { η , . . . , η p } , a set of points in R d { s , . . . , s p } , and a variogram function γ : R d × R d → R + , the matrix [ η i j ] pi , j = , with generic entry η i j = η i + η j + γ ( s i , s j ) , (10)is conditionally negative semidefinite. Recall that a variogram γ : R d × R d → R is a mapping that represents half thevariance of the increments of a random field defined in R d [23]. If γ ( s , s ) ≡ ˜ γ ( s − s ), the variogram is said to beintrinsically stationary.A continuous function b : [0 , + ∞ ) → R such that b (0) < + ∞ is called completely monotonic if it is infinitelydi ff erentiable and ( − n b ( n ) ≥
0, for all n ∈ N , and where b ( n ) means the n -th derivative of b . Here, we abuse ofnotation when writing b (0) for b . A nonnegative and continuous function B : [0 , + ∞ ) → R is called a Bernstein6unction [26, 29] if its first derivative is completely monotonic. Arguments in [26] show that, for any Bernsteinfunction B , the mapping h (cid:55)→ ˜ γ ( h ) = B ( (cid:107) h (cid:107) θ ) , < θ ≤ , (11)is an intrinsically stationary variogram in R d for any positive integer d . In particular, for θ =
1, the matrix B = [ B i j ] pi , j = with generic entry B i j = B ( (cid:107) s i − s j (cid:107) ) is conditionally negative semidefinite. In the following, such a matrix B will bereferred to as a Bernstein matrix . Other well-known examples of conditionally negative semidefinite matrices are: • the zero ( ) and all-ones ( ) matrices; • the sum of two conditionally negative semidefinite matrices; • the product of a conditionally negative semidefinite matrix with a nonnegative constant; • the limit of a convergent sequence of conditionally negative semidefinite matrices; • the product of two Bernstein matrices.The second to fourth results stem from the fact that the set of conditionally negative semidefinite matrices is a convexcone and is closed under pointwise convergence. The last result is instead an exception, as normally the product oftwo Bernstein functions is not necessarily a Bernstein function; it stems from (11) with θ = t (cid:55)→ B ( t ) and t (cid:55)→ B ( t ) are Bernstein functions, so is t (cid:55)→ B ( √ t ) B ( √ t ) [26, 29, corollary 3.8].
3. Su ffi cient Validity Conditions for Parsimonious Models We start with the multivariate Gaussian covariance model.
Theorem 1.
The multivariate Gaussian covariance (9) is a valid model in R d , d ≥
1, if(1) [ β − i j ] pi , j = is conditionally negative semidefinite;(2) (cid:104) σ i j β − d / i j (cid:105) pi , j = is positive semidefinite. Remark 1.
A necessary condition for the element-wise inverse matrix [ β − i j ] pi , j = to be conditionally negative semidef-inite (Condition (1) in Theorem 1) is that [ β i j ] pi , j = is positive semidefinite (lemma 2 in Appendix). Likewise, anecessary condition for Condition (2) to hold is that [ σ i j ] pi , j = is positive semidefinite, insofar as it corresponds to thecovariance matrix between the components of the p -variate random field at the same spatial location.We now establish su ffi cient validity conditions for the multivariate Mat´ern model in which the direct and cross-covariances have the same shape parameter ν > Theorem 2.
The multivariate Mat´ern model K ( h ; α , ν , σ ) with ν > R d , d ≥
1, if, either(A) (1) [ α i j ] pi , j = is conditionally negative semidefinite;(2) (cid:34) σ i j α (cid:106) d + + (cid:100) ν (cid:101) (cid:107) i j (cid:35) pi , j = is positive semidefinite, with (cid:98)·(cid:99) and (cid:100)·(cid:101) the floor and ceil functions;or(B) there exists a matrix [ ψ i j ] pi , j = with positive entries such that:(1) [ ψ i j ] pi , j = is conditionally negative semidefinite;(2) [ α i j ψ i j ] pi , j = is conditionally negative semidefinite;(3) (cid:104) σ i j ψ ν + d / i j α ν i j (cid:105) pi , j = is positive semidefinite. Corollary 1.
The multivariate Mat´ern covariance K ( h ; α , ν , σ ) with ν > R d , d ≥
1, if, either7B.1) (1) [ α i j ] pi , j = is conditionally negative semidefinite;(2) (cid:104) σ i j α ν i j (cid:105) pi , j = is positive semidefinite;or(B.2) (1) [ α − i j ] pi , j = is conditionally negative semidefinite;(2) (cid:104) σ i j α − di j (cid:105) pi , j = is positive semidefinite. Remark 2.
Conditions (B.1) and (B.2) do not overlap unless all the α i j ’s are the same.
4. Multivariate Mat´ern Model: the General Case
In this section, we establish su ffi cient validity conditions for the multivariate Mat´ern model in which the direct andcross-covariances do not have the same shape and scale parameters ( ν and α not proportional to the all-ones matrix). Theorem 3.
The multivariate Mat´ern covariance (7) is a valid model in R d , d ≥
1, if, either(A) (1) [ ν i j ] pi , j = is a Bernstein matrix;(2) [ α − i j ] pi , j = is a Bernstein matrix;(3) (cid:20) σ ij Γ ( ν ij + d / α dij Γ ( ν ij ) (cid:21) pi , j = is positive semidefinite,or(B) there exists a matrix [ ψ i j ] pi , j = with positive entries such that:(1) [ ψ i j ] pi , j = is conditionally negative semidefinite;(2) [ ν i j ] pi , j = is conditionally negative semidefinite;(3) [ α i j ψ i j − ν i j ] pi , j = is conditionally negative semidefinite;(4) (cid:20) σ ij Γ ( ν ij ) ψ ν ij + d / i j α ν ij i j exp( − ν i j ) (cid:21) pi , j = is positive semidefinite.Conditions (B) above provide general su ffi cient validity conditions for the multivariate Mat´ern covariance, which de-pend on a conditionally negative semidefinite matrix ψ . They appear as a generalization of Conditions (B) of Theorem2, which correspond to the case ν = ν . Other cases are given in the next theorem and examples, which bring simplersu ffi cient conditions that only depend on the model parameters ( α , ν , σ ) and, possibly, one scalar hyperparameter β instead of a matrix-valued hyperparameter ψ . Theorem 4.
The multivariate Mat´ern covariance (7) is a valid model in R d , d ≥
1, if, either(A) for some positive scalar, β , one has(1) [ ν i j ] pi , j = is conditionally negative semidefinite;(2) [ α i j − βν i j ] pi , j = is conditionally negative semidefinite;(3) (cid:20) σ ij Γ ( ν ij ) (cid:18) α ij β (cid:19) ν ij exp( − ν i j ) (cid:21) pi , j = is positive semidefinite;or(B) (1) [ ν i j ] pi , j = is conditionally negative semidefinite;(2) [ ν i j α − i j ] pi , j = is conditionally negative semidefinite;83) (cid:20) σ ij Γ ( ν ij ) α − di j ν ν ij + d / i j exp( − ν i j ) (cid:21) pi , j = is positive semidefinite. Remark 3.
As the entries of ν get large, the matrix in Condition (B.3) becomes practically indistinguishable from ma-trix √ π (cid:104) σ i j α − di j ν ( d + / i j (cid:105) pi , j = , based on Stirling’s approximation of the gamma function. The positive semidefinitenessof the former will therefore hold if the latter is positive definite, provided that the entries of ν are large enough.A few particular cases of Conditions (B) of Theorem 4 follow. Example 1. If α = α with α >
0, the following su ffi cient conditions are obtained:(1) [ ν i j ] pi , j = is conditionally negative semidefinite;(2) (cid:20) σ ij Γ ( ν ij ) ν ν ij + d / i j exp( − ν i j ) (cid:21) pi , j = is positive semidefinite. Example 2. If ν = ν with ν >
0, the following su ffi cient conditions are obtained:(1) [ α − i j ] pi , j = is conditionally negative semidefinite;(2) (cid:104) σ i j α − di j (cid:105) pi , j = is positive semidefinite.These conditions are formally the same as Conditions (B.2) obtained in Corollary 1 and apply, for instance, tothe multivariate exponential model. Example 3. If ν = ν with ν > α i j = (cid:112) β i j ν , the following su ffi cient conditions are obtained:(1) [ β − i j ] pi , j = is conditionally negative semidefinite;(2) (cid:104) σ i j β − d / i j (cid:105) pi , j = is positive semidefinite.These conditions apply to the multivariate Gaussian model (9), which corresponds to the asymptotic case when ν tends to infinity (Eq. (5)). One finds the same conditions as in Theorem 1, which is therefore contained inTheorem 4, itself a particular case of Theorem 3.
5. Comparison with the Su ffi cient Conditions in Apanasovich et al. (2012) Conditions (A) in Theorem 4 bear resemblance to the su ffi cient conditions provided by Apanasovich et al. [3], specif-ically:(i) There exist δ ≥ a i j ] pi , j = such that ν i j = ν ii + ν j j + δ (1 − a i j ) , i , j = , · · · , p ; (12)(ii) [ α i j ] pi , j = is conditionally negative semidefinite;(iii) (cid:20) σ ij Γ ( ν ij + d / Γ ( ν ij ) Γ (( ν ii + ν jj + d ) / α δ + ν ii + ν jj i j (cid:21) pi , j = is positive semidefinite.In particular, any matrix [ ν i j ] pi , j = satisfying Condition (i) of Apanasovich et al. [3] also satisfies Condition (A.1) ofTheorem 4, but the reciprocal is not true. Indeed, Condition (i) corresponds to a particular case of (10), with γ abounded variogram that does not exceed its sill δ . Variograms with a hole e ff ect that take values greater than the sill,or variograms without a sill, provide examples of matrices [ ν i j ] pi , j = satisfying Condition (A.1) of Theorem 4 but notCondition (i) of Apanasovich et al. [3]; this is the case for the matrix with entries ν i j = + ( i − j ) [4, remark 2.5].Conversely, any matrix [ α i j ] pi , j = such that Conditions (A.1) and (A.2) of Theorem 4 hold also satisfies Condition (ii)9 igure 1: Upper bounds for ρ as a function of β for a = . a = .of Apanasovich et al. [3], but the reciprocal is not true. As for Condition (A.3), it does not depend on the spatialdimension d , as Condition (iii) of Apanasovich et al. [3] does.In the next subsections, the conditions of Theorems 3 and 4 and of Apanasovich et al. [3] are compared for someparticular examples of multivariate Mat´ern covariance models. Let 2 ν i j = ν ii + ν j j for i , j = , · · · , p . In this case, (12) gives δ =
0, so that Condition (iii) of Apanasovich et al. [3]reduces to the positive semidefiniteness of [ σ i j α ν ij i j / Γ ( ν i j )] pi , j = , as does Condition (A.3) of Theorem 4, if one accountsfor the fact that exp( − ν i j ) /β ν ij factorizes into the product of one term depending only on i and one term depending onlyon j . Furthermore, if [ α i j ] pi , j = is conditionally negative semidefinite, so is [ α i j − βν i j ] pi , j = for any β >
0. Accordingly,in the case when 2 ν i j = ν ii + ν j j , the conditions of Apanasovich et al. [3] coincide with the set (A) of Theorem 4 andno longer depend on the spatial dimension d . In the two-dimensional space ( d = p -variate Mat´ern covariance model with σ i j = i = j , ρ otherwise, ν i j = . i = j , 1 . α i j = . β if i = j , 1 . β + a otherwise, with ρ ≥ β > a ≥
0. Conditions(i) and (ii) of Apanasovich et al. [3] are satisfied with δ =
1, and so are Conditions (A.1) and (A.2) of Theorem4. Accordingly, one can calculate the maximum permissible value for the collocated correlation coe ffi cient ρ underConditions (iii) and (A.3), respectively, as a function of β , a and p .For a =
0, the maximum collocated correlation coe ffi cient does not depend on β and p . One finds ρ max = .
064 withCondition (iii) of Apanasovich et al. [3] and ρ max = .
523 with the Condition (A.3) of Theorem 4. For a = . a =
5, the maximum collocated correlation coe ffi cient is found not to depend on p , with a much (8 to 10 . ff erence between both sets of conditions increases even more when the spatial dimension d increases, becauseCondition (iii) of Apanasovich et al. [3] becomes stricter and stricter, while Condition (A.3) of Theorem 4 remainsunchanged. We consider the same spatial dimension ( d =
2) and parameterization for ν and σ as in the previous example, butmodify the scale parameters as follows: α i j = a > i = j , ln(1 + a ) otherwise. In such a case, Conditions10 igure 2: Upper bounds for ρ as a function of a , based on Gneiting’s conditions (valid for p = p ). (ii) of Apanasovich et al. [3] and (A.2) of Theorem 4 are not satisfied, since [ α i j ] pi , j = is not conditionally negativesemidefinite ( α i j < min( α ii , α j j )).However, Conditions (A.1) and (A.2) of Theorem 3 and Conditions (B.1) and (B.2) of Theorem 4 are satisfied withthe chosen parameters, so that both theorems can be used to derive an upper bound for the collocated correlationcoe ffi cient ρ . Figure 2 compares the maximal value for ρ leading to a positive semidefinite matrix when applyingConditions (A.3) of Theorem 3 and (B.3) of Theorem 4, for a ranging from 0 .
01 to 10. The upper bound found withthe former condition is undistinguishable from that derived from the necessary and su ffi cient conditions provided byGneiting et al. [17, theorem 3] in the bivariate setting. Gneiting’s conditions apply only for p =
2, whereas that ofTheorems 3 and 4 have been established for any positive integer p .
6. Trivariate Geochemical Data Illustration
The main goal of this section is to show that parameter estimation is feasible for the proposed parameterizations, andto show that fitting performances are competitive with respect to previous parameterizations proposed in earlier liter-ature. By no means we are interested in proposing innovative estimation techniques or comparing di ff erent existingtechniques.We consider a data set of geochemical concentrations measured at n = ff ects.11 llllllllllllllllllllllllllllllllll llllllll llllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllll llllll lllllllllllllllllllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllllllllllllllllllllllll llll lll llllll llllllllllllllllllllllllllllllllllll lllllllll llllllllllllllllll lllllll lllllllllllllll llllllllll lllllll llllllllll llllllll llllllllllll lllllllll llllllllll lllllllllll llllllll lllllllllllllllllll lllllll lllllllllllllllllllllllllll llllllllllllllllllllllllllllllllll lllllll lllllllllllll lllllllllllllllllllllllllllllllllllll lllllll lllllll llllllllllllllllllllllllll l lllllllllllllllllllllllllllll lll lllllllllllllllllllllllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllllllllll llllllllllllllllllllllll llllllllllllllllllllllllllllllllllllllllllllll l l l l l l l l l l l l l l l l l l ll l l l l l l l l lllllllllllllllllllllllllllllllllllllllllllllllllllllllllll l l l l l l lllllllllll lllll llllllllllll l l l l l ll l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l lllllllll lllllllllllllllllllllllllllllllllllllllllllllllll lllllll ll lllllllllllllllllllllllllllllllllllllllllllllllllllllllllll l l l l l l l l l l l l l l l l l l l l l l l l llllllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllll ll llllllllllllllllllll lllllll llllllllllllllll l l llllllll lllll l l l l l l l l l l l llllllllllll lllllll l l l l l lllllllllllll llll llllll l l l l −2−10123 llllllllllllllllllllllllllllllllll llllllll llllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllll llllll lllllllllllllllllllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllllllllllllllllllllllll llll lll llllll llllllllllllllllllllllllllllllllllll lllllllll llllllllllllllllll lllllll lllllllllllllll llllllllll lllllll llllllllll llllllll llllllllllll lllllllll llllllllll lllllllllll llllllll lllllllllllllllllll lllllll lllllllllllllllllllllllllll llllllllllllllllllllllllllllllllll lllllll lllllllllllll lllllllllllllllllllllllllllllllllllll lllllll lllllll llllllllllllllllllllllllll l lllllllllllllllllllllllllllll lll lllllllllllllllllllllllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllllllllll llllllllllllllllllllllll llllllllllllllllllllllllllllllllllllllllllllll l l l l l l l l l l l l l l l l l l ll l l l l l l l l lllllllllllllllllllllllllllllllllllllllllllllllllllllllllll l l l l l l lllllllllll lllll llllllllllll l l l l l ll l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l lllllllll lllllllllllllllllllllllllllllllllllllllllllllllll lllllll ll lllllllllllllllllllllllllllllllllllllllllllllllllllllllllll l l l l l l l l l l l l l l l l l l l l l l l l llllllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllll ll llllllllllllllllllll lllllll llllllllllllllll l l llllllll lllll l l l l l l l l l l l llllllllllll lllllll l l l l l lllllllllllll llll llllll l l l l −2−1012 llllllllllllllllllllllllllllllllll llllllll llllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllll llllll lllllllllllllllllllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllllllllllllllllllllllll llll lll llllll llllllllllllllllllllllllllllllllllll lllllllll llllllllllllllllll lllllll lllllllllllllll llllllllll lllllll llllllllll llllllll llllllllllll lllllllll llllllllll lllllllllll llllllll lllllllllllllllllll lllllll lllllllllllllllllllllllllll llllllllllllllllllllllllllllllllll lllllll lllllllllllll lllllllllllllllllllllllllllllllllllll lllllll lllllll llllllllllllllllllllllllll l lllllllllllllllllllllllllllll lll lllllllllllllllllllllllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllllllllllllllllllllllllllllllllll llllllllllllllllllllllllllllllllllll llllllllllllllllllllllll llllllllllllllllllllllllllllllllllllllllllllll l l l l l l l l l l l l l l l l l l ll l l l l l l l l lllllllllllllllllllllllllllllllllllllllllllllllllllllllllll l l l l l l lllllllllll lllll llllllllllll l l l l l ll l l l l l l l l l l l l l l l l l l l ll l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l lllllllll lllllllllllllllllllllllllllllllllllllllllllllllll lllllll ll lllllllllllllllllllllllllllllllllllllllllllllllllllllllllll l l l l l l l l l l l l l l l l l l l l l l l l llllllllllllllllllllllllllllllllllll lllllllllllllllllllllllllllllllllll ll llllllllllllllllllll lllllll llllllllllllllll l l llllllll lllll l l l l l l l l l l l llllllllllll lllllll l l l l l lllllllllllll llll llllll l l l l Figure 3:
Transformed concentrations of (left) bismuth (center) magnesium (right) tellurium. distance S e m i v a r i an c e − . − . − . − . .
200 400 600 800 1000 l l l l l l l l l l l l l l l l l l l l
Mg.Te . . . . . l l l l l l l l l l l l l l l l l l l l Bi.Te . . . . . . .
200 400 600 800 1000 l l l l l l l l l l l l l l l l l l l l Te − . − . − . . l l l l l l l l l l l l l l l l l l l l Mg.Bi . . . . . . . l l l l l l l l l l l l l l l l l l l l Bi . . . . . . l l l l l l l l l l l l l l l l l l l l Mg Figure 4:
Empirical binned direct and cross-variograms for the normal scores transforms of bismuth, magnesium, and tellurium concentrations.
As an exploratory tool, we fit exponential covariance models to the empirical direct and cross-variograms usingweighted least-squares, with weights equal to the ratio of the number of pairs over the squared distance. Let ˆ α and ˆ σ be the corresponding fitted scale and collocated covariance parameters. This data set is particularly interesting becauseit matches some, but never all, of the constraints for our proposed classes, [3] and [17]. We find that the entries of ˆ α di ff er by as much of a factor of four. We also find that ˆ α − is conditionally negative definite, ˆ α is not conditionallynegative definite, ˆ σ is positive definite, ˆ σ ˆ α is positive definite, and ˆ σ ˆ α − is not positive definite.We compare these attributes of the empirical fits to the constraints imposed by [17], [3], and Theorem 4 (B), Example2, under the assumption that ν = / [17]: α = α and σ ≥
0. Given the variability in ˆ α , the constant scale parameter α = α may be too restrictive.Within the constraints of [17], ˆ σ is positive definite. [3] or Theorem 4 (B): α ≤ σα ≥
0. This corresponds to δ = a i j = σ and ˆ α is positive semidefinite,matching constraint (iii) of [3] when ν = ν , where ν = / Theorem 4 (B) Example 2: α − ≤ σα − ≥
0. ˆ α − ≤
0, suggesting that our constraints may be preferable to[3] and [17]. On the other hand, the element-wise product of ˆ σ and ˆ α − is not positive semidefinite, meaningthat our constraints do not match the empirical direct and cross-variogram fits perfectly.Although each set of constraints fails to capture all of the empirical attributes of the data, our newly proposed con-straints o ff er practical advantages over the constraints of [3] and [17] for this dataset, as is now shown. We assumethat the transformed metal concentrations y ( s i ) at location s i ∈ R , i = , ..., n , are normally distributed with meanzero and Cov (cid:16) y ( s i ) , y ( s j ) (cid:17) = K (cid:32) s i − s j ; α , , σ (cid:33) + V I ( i = j ) , (13)where K ( s i − s j ; α , / , σ ) is defined in (7), V is a positive semidefinite nugget e ff ect, and I ( · ) is the indicator function.For each set of constraints, we fit each model in a Bayesian framework assuming a priori that α and σ are uniformlydistributed under the constraints discussed above. Because of identifiability challenges for σ and α [36], we alsoconsider models with α fixed on the configuration closest to ˆ α , given the appropriate constraints. We have six modelsusing the constraints discussed in Theorem 4 (B) Example 2, [17], and [3], as well as letting α be fixed or random.We emphasize that these models are the same except for the constraints on covariance parameters.We fit the model using adaptive Markov chain Monte Carlo (MCMC) [see 28, for a discussion on adaptive MCMC].We use a multivariate normal random walk for all parameters using the log-scale for positive parameters ( α , as wellas diagonals of σ and V ) and the raw scale for other parameters (o ff -diagonals of σ and V ). We tune the covariance ofthe multivariate normal proposal distribution using the empirical covariance of previous samples and adding a smallnumber to the diagonal of the empirical covariance [19]. We scale the proposal covariance to obtain an acceptancerate between 0.15 and 0.5. Using this approach, we run the MCMC sampler for 60,000 iterations, discarding the first30,000 iterations.We compare models using the deviance information criterion (DIC) [31]. In many settings, other information criteriaare better approximations for cross-validation than DIC [see, e.g., 35]; however, calculating the widely applicableinformation criterion (WAIC) requires arbitrary partitioning of the spatial domain [15]. In this case, the models aresimilar in form and complexity. The model comparison results are given in Table 2.For fixed α , our model from the expanded constraints presented in Theorem 4 (B) Example 2 provides the best modelin terms of DIC by about 11. For random or unknown α , our model from Theorem 4 (B) Example 2 is best by about10 in terms of DIC. For models with fixed α (1-3 in Table 2), the models have essentially the same size not accountingfor the imposed constraints. Therefore, di ff erences greater than 10 in the DIC and posterior mean deviance D arestriking as these di ff erences correspond to added model flexibility given by Theorem 4. For models with unknownor random α (4-6 in Table 2), the [17] model has fewer parameters because it has only one range parameter. Despitethis simplicity, it has the highest (worst) DIC. Although the model size under the [3] and Theorem 4 (B) Example 2constraints are the same, we observe di ff erences in DIC and posterior mean deviance greater than 10, emphasizing thebenefit of the more flexible Mat´ern models presented herein. In summary, this dataset provides a motivating examplewhere the more flexible constraints presented herein o ff er a better fit to the data than the constraints in [17] or [3].For the lowest DIC model, we plot the covariance and cross-covariances for the posterior mean as a function of thedistance in Figure 5. We thin the 30,000 post-burn posterior samples to 1000 samples and plot the corresponding directand cross-variograms in Figure 6. Three to four variograms align relatively well with the empirical variograms. Wesuggest two possible reasons why the variogram fits may not appear ideal. First, limitations on the joint identifiabilityof scale and collocated covariance parameters are well-studied for the Mat´ern covariance class [36]. That is, equal13 onstraints α Mean Deviance D Model Complexity p D DIC = D + p D Table 2:
Model comparison results. Constraints represent the parameter constraints for the three models. We compare models with fixed orunknown α . All components of DIC are provided: Mean Deviance D , Model Complexity p D , and DIC = D + p D . Lastly, we provide the posteriorprobability that the lowest DIC model has a higher log-likelihood than competing models. values for ασ yield equivalent Gaussian measures and interpolation. However, these non-identifiable combinations ofscale and collocated covariance parameters yield di ff erent variograms. Second, this dataset was selected to challengeeven our improved parametric constraints. As discussed, the element-wise product of ˆ σ and ˆ α − is not positivesemidefinite; therefore, this constraint from Theorem 4 (B) may limit how well the Mat´ern model fits the data. Thatsaid, our proposed constraints in Theorem 4 outperform the current state of the art. −0.40.00.40.8 0 500 1000 1500 distance c o v a r i an c e Variables
BiBi.TeMgMg.BiMg.TeTe
Figure 5:
Covariance and cross-covariances for the posterior mean as a function of the distance in meters. Vertical lines at the origin indicate anugget e ff ect. l l l l l l l l l l l l l l l l l l l . . . . . . Mg l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l − . − . − . − . − . − . − . Mg.Bi S e m i v a r i an c e l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l . . . . . . . Bi l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l − . − . − . − . . Mg.Te l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l . . . . . . Bi.Te
Distance l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l . . . . . . . Te l l l l l l l l l l l l l l l l l l l l Figure 6:
Empirical variograms with 1000 posterior variograms superimposed.
7. Conclusions
While [17] have completely characterized the multivariate Mat´ern model for the case p =
2, when p > ffi cientconditions proposed first by [17] and later on by [11] and [3] are overly restrictive. This paper has provided improvedconditions (Table 1) that allow for more flexible modeling using the multivariate Mat´ern covariance. In some cases,the conditions found for the case p > p = ffi cients. In particular, our examples show that one can attain muchhigher upper bounds for the collocated correlation coe ffi cient, resulting in more flexible versions of the multivariateMat´ern model. This is reflected in Section 6 where our parameterization was successful compared to our competitorson a trivariate dataset.Modeling multivariate spatial data sets has an obvious computational burden, in particular when p >
2, as the numberof parameters to be estimated becomes massive. An alternative to least-squares fitting is to proceed through Bayesianroutines coupled with MCMC techniques, as we have shown in Section 6.
Acknowledgements
This work was supported by the National Agency for Research and Development of Chile [grant CONICYT PIAAFB180004]. 15 ppendix A. Technical Lemmas with their Proofs
From a notation viewpoint, we recall that, in what follows, all the matrix operations (product, division, inverse, power,exponential, etc.) are taken element-wise.
Lemma 1.
Let A = [ a i j ] pi , j = be a symmetric real-valued matrix. The following assertions are equivalent [4, 27, 30]: • A is conditionally negative semidefinite; • exp( − t A ) is positive semidefinite for all t ≥ • [ a ip + a p j − a i j − a pp ] pi , j = is positive semidefinite. Lemma 2. If A = [ a i j ] pi , j = is a conditionally negative semidefinite matrix with positive entries, then [ a − i j ] pi , j = ispositive semidefinite and [ a µ i j ] pi , j = is conditionally negative semidefinite for any µ ∈ (0 , Lemma 3.
The multivariate Mat´ern covariance (7) is a valid model in R d , d ≥
1, if(1) ν is conditionally negative semidefinite;(2) K ( h ; α , ν + µ, Γ ( ν + µ ) (cid:12) Γ ( ν ) − (cid:12) σ ) is a valid model in R d for some positive constant µ , where the exponent − Γ for the gamma function. Proof.
The proof relies on the decomposition of the Mat´ern covariance with shape parameter ν as a scale mixture ofMat´ern covariances with shape parameter ν + µ [18, formula 6.592.12]: K ( h ; α , ν , σ ) = Γ ( ν + µ ) (cid:12) Γ ( ν ) − (cid:12) (cid:90) + ∞ t − ν − µ ( t − µ − K ( h √ t ; α , ν + µ, σ )d t , (A.1)where all operations are taken element-wise. Under Conditions (1) and (2) of Lemma 3, K ( h ; α , ν , σ ) appears as asum of valid multivariate Mat´ern covariance functions of the form K ( h √ t ; α , ν + µ, Γ ( ν + µ ) (cid:12) Γ ( ν ) − (cid:12) σ ) weightedby positive semidefinite matrices of the form t − ν − µ ( t − µ − (Lemma 1), hence, it is a valid covariance model. Corollary 2. If K ( h ; α , ν , σ ) is a valid model in R d for some ν >
0, then so is K ( h ; α , µ , σ ) for µ ∈ (0 , ν ). Proof.
This result stems from the fact that the constant matrix ν is conditionally negative semidefinite, hence itsatisfies Condition (1) of Lemma 3. Appendix B. Transitive Upgrading (Mont´ee) of a Covariance Function
The transitive upgrading, also known as “mont´ee” or Radon transform [23], of order 1 of an absolutely integrable real-valued function ϕ d defined in R d is the function ϕ d − of R d − obtained by integration of ϕ d along the last coordinateaxis: ϕ d − ( s , · · · , s d − ) = (cid:90) + ∞−∞ ϕ d ( s , · · · , s d − , s d ) ds d , ( s , · · · , s d − ) ∈ R d − . (B.1)The Fourier transforms of ϕ d and its mont´ee are related by (cid:101) ϕ d − ( ω , · · · , ω d − ) = (cid:101) ϕ d ( ω , · · · , ω d − , , ( ω , · · · , ω d − ) ∈ R d − . (B.2)The same relation applies to the Fourier transforms of continuous and absolutely integrable covariance functions:the mont´ee operator consists in cancelling out the last coordinate of the Fourier transform (spectral density) of the d -dimensional covariance to find out that of the ( d − i.e. , the mont´ee of an isotropic covariance defined in R d is an isotropic covariance in R d − whose spectral density has the same radial part as that of the original covariance [23]. For example, the mont´ee of theisotropic Mat´ern covariance k ( · ; α, ν ) as in Equation (1) in R d is, up to a multiplicative factor, the Mat´ern covariance k ( · ; α, ν + /
2) in R d − .By repeated application of the mont´ee of order 1, one obtains a mont´ee of any integer order. In particular, themont´ee of integer order ϑ of the isotropic Mat´ern covariance k ( · ; α, ν ) in R d is proportional to the Mat´ern covariance k ( · ; α, ν + ϑ/
2) in R d − ϑ , the spectral densities of both Mat´ern covariances having, up to a multiplicative factor, thesame radial parts (Eq. (3)). By using the formalism of Hankel transforms instead of that of Fourier transforms, it iseven possible to generalize the mont´ee to fractional orders [23]. Appendix C. Proofs
Proof of Theorem 1.
The matrix-valued spectral density of the p -variate Gaussian covariance (9) in R d is (Eq. (6)) (cid:101) G ( ω ; β , σ ) = π ) d / (cid:104) σ i j β − d / i j (cid:105) pi , j = (cid:12) (cid:34) exp (cid:32) − (cid:107) ω (cid:107) β i j (cid:33)(cid:35) pi , j = , ω ∈ R d . (C.1)Condition (1) and Lemma 1 ensure that the last matrix in the right-hand side member of (C.1) is positive semidefinite,while Condition (2) ensures the positive semidefiniteness of the first matrix. Accordingly, based on Schur’s producttheorem, the spectral density matrix (cid:101) G ( ω ; β , σ ) is positive semidefinite for any frequency vector ω in R d , which inturn ensures that the associated covariance model is valid. Proof of Theorem 2.
We start by proving (A). Owing to Corollary 2, it su ffi ces to establish the result for ν being aninteger or a half-integer. We propose a proof based on the representation of the exponential and Mat´ern covarianceas scale mixtures of compactly-supported Askey and Wendland covariances, respectively. The Wendland covariancewith range a > m ∈ N and its spectral density in R d are [9]: w ( h ; a , m , d ) = − ν − m Γ ( µ + Γ ( µ + m + (cid:32) − (cid:107) h (cid:107) a (cid:33) µ + m + × F (cid:32) µ , µ +
12 ; µ + m +
1; 1 − (cid:107) h (cid:107) a (cid:33) , h ∈ R d (C.2) (cid:101) w ( ω ; a , m , d ) = m a d Γ ( µ + Γ (cid:16) + d + m (cid:17) π ( d + / Γ ( µ + d + m + × F + d + m + d + m + µ , + d + m + µ − (cid:32) a (cid:107) ω (cid:107) (cid:33) , ω ∈ R d , (C.3)with µ = (cid:106) d + m + (cid:107) , ( · ) + the positive part function and F a hypergeometric function. The Askey covariancecorresponds to the particular case when m = ν is an integer or a half-integer and let us decompose the univariate exponential covariance in R d + ν − , d ≥
1, as a scale mixture of Askey covariances in R d + ν − with exponent µ = (cid:106) d + + ν (cid:107) : k ( h ; α, / = (cid:90) + ∞ w ( h ; t , , d + ν − φ ( t ; α, ν, d )d t , h ∈ R d + ν − , (C.4)that is, exp( − α (cid:107) h (cid:107) ) = (cid:90) + ∞(cid:107) h (cid:107) (cid:32) − (cid:107) h (cid:107) t (cid:33) µ φ ( t ; α, ν, d )d t , h ∈ R d + ν − . (C.5)17o determine φ , we di ff erentiate ( µ +
1) times under the integral sign, which is permissible owing to the dominatedconvergence theorem, to obtain φ ( t ; α, ν, d ) = α µ + t µ Γ ( µ +
1) exp( − α t ) , t > . (C.6)One recognizes the gamma probability density with shape parameter µ + α , a result that remindsthe representation in R of the exponential covariance as a gamma mixture of spherical covariances [13]. In terms ofspectral density, (C.6) translates into (cid:101) k ( ω ; α, / = (cid:90) + ∞ (cid:101) w ( ω ; t , , d + ν − φ ( t ; α, ν, d )d t , ω ∈ R d + ν − . (C.7)To generalize these results to the Mat´ern covariance of integer or half-integer parameter ν , let us consider a transitiveupgrading (“mont´ee”) of order 2 ν −
1, which provides a covariance model with the same radial spectral density in aspace whose dimension is reduced by 2 ν − i.e. , in R d . In particular, up to a multiplicativeconstant, the upgrading of k ( · ; α, /
2) and w ( · ; t , , d + ν − R d + ν − , yields k ( · ; α, ν ) and w ( · ; t , ν − / , d ), both defined in R d , respectively. Based on (C.7), the upgraded spectral densities in R d are found to be related by Γ ( ν ) α ν − Γ (1 / (cid:101) k ( ω ; α, ν ) = (cid:90) + ∞ t ν − ν − / (cid:101) w ( ω ; t , ν − / , d ) φ ( t ; α, ν, d )d t , ω ∈ R d , (C.8)and the covariance functions therefore satisfy the following identity: Γ ( ν ) α ν − Γ (1 / k ( h ; α, ν ) = (cid:90) + ∞ t ν − ν − / w ( h ; t , ν − / , d ) φ ( t ; α, ν, d )d t , h ∈ R d . (C.9)Based on this scale mixture representation, the multivariate Mat´ern covariance in R d with parameters α , ν and σ canbe written as K ( h ; α , ν , σ ) = σ (cid:12) (cid:90) + ∞ Γ (1 / α t ) ν + µ ν − / Γ ( ν ) Γ ( µ + t (cid:12) exp( − α t ) w ( h ; t , ν − / , d )d t , h ∈ R d , (C.10)where the exponent and exponential are taken element-wise. Under Condition (A.1) of Theorem 2, exp( − t α ) is pos-itive semidefinite (Lemma 1). If Condition (A.2) also holds, then σ (cid:12) α ν + µ (cid:12) exp( − α t ) is positive semidefinite forany t >
0, as element-wise product of positive semidefinite matrices. The matrix-valued Mat´ern covariance function K ( h ; α , ν , σ ) appears as a mixture of valid Wendland covariances in R d weighted by positive semidefinite matrices,thus it is a valid model in R d , which completes the proof for case (A).We now prove (B). The matrix-valued spectral density of the multivariate Mat´ern covariance K ( h ; α , ν , σ ) in R d can be written as (Eq. (3)) (cid:101) K ( ω ; α , ν , σ ) = Γ ( ν + d / Γ ( ν ) π d / (cid:104) σ i j α − di j (cid:105) pi , j = (cid:12) + (cid:107) ω (cid:107) α i j − ν − d / pi , j = = Γ ( ν + d / Γ ( ν ) π d / (cid:104) σ i j ψ ν + d / i j α ν i j (cid:105) pi , j = (cid:12) (cid:20)(cid:16) α i j ψ i j + (cid:107) ω (cid:107) ψ i j (cid:17) − ν − d / (cid:21) pi , j = , (C.11) ω ∈ R d , with [ ψ i j ] pi , j = an arbitrary matrix with positive entries.Under Conditions (B.1) and (B.2) of Theorem (2), (cid:34)(cid:16) α i j ψ i j + (cid:107) ω (cid:107) ψ i j (cid:17) ν + d / (cid:100) ν + d / (cid:101) (cid:35) pi , j = is conditionally negative semidefinite,as the sum of two conditionally negative semidefinite matrices raised to a power less than or equal to 1 (Lemma 2),and has positive entries. Its element-wise inverse raised to power (cid:100) ν + d / (cid:101) (last matrix in the second row of (C.11))18s therefore positive semidefinite, based on Lemma 2 and the fact that an integer power of a positive semidefinitematrix is still positive semidefinite [4, corollary 1.14]. If furthermore, Condition (B.3) of Theorem 2 holds, then thefirst matrix in the second row of (C.11)) is also positive semidefinite, and so is the matrix-valued spectral density (cid:101) K ( ω ; α , ν , σ ) owing to Schur’s product theorem. The positive semidefiniteness of the spectral density matrix for anyfrequency vector ω ∈ R d ensures the validity of the associated covariance function K ( h ; α , ν , σ ) in R d . Proof of Corollary 1.
The two sets of conditions are particular cases of Theorem 2 (Conditions (B)), with [ ψ i j ] pi , j = = and [ ψ i j ] pi , j = = [ α − i j ] pi , j = , respectively. Proof of Theorem 3.
We first prove (A). The matrix-valued spectral density of the multivariate Mat´ern covariance K ( h ; α , ν , σ ) in R d is (Eq. (3)) (cid:101) K ( ω ; α , ν , σ ) = σ i j Γ ( ν i j + d / π d / α di j Γ ( ν i j ) pi , j = (cid:12) exp − (cid:32) ν i j + d (cid:33) ln + (cid:107) ω (cid:107) α i j pi , j = , ω ∈ R d . (C.12)Since the composition of two Bernstein functions is still a Bernstein function and based on the fact that x (cid:55)→ ln(1 + x )is a Bernstein function [29, corollary 3.8 and chapter 16.4], the matrix (cid:32) ν i j + d (cid:33) ln + (cid:107) ω (cid:107) α i j pi , j = turns out to be the product of two Bernstein matrices under Conditions (A.1) and (A.2) of Theorem 3, hence it is con-ditionally negative semidefinite (see example following (11)). If Condition (A.3) also holds, then the spectral densitymatrix (cid:101) K ( ω ; α , ν , σ ) is positive semidefinite for any ω ∈ R d , based on Lemma 1 and Schur’s product theorem, and K ( h ; α , ν , σ ) is therefore a valid covariance function in R d .We now prove (B). Consider the p -variate Mat´ern covariance (7), in which each entry is a scale mixture of Gaus-sian covariances of the form (4). Let ψ = [ ψ i j ] pi , j = be a matrix with positive entries. The change of variable u = ψ − i j v in the scale mixture representation of the cross-covariance k i j ( h ) yields the following expression for the p -variateMat´ern covariance model, where the exponent − K ( h ; α , ν , σ ) = (cid:90) + ∞ g ( h ; ψ − v ) (cid:12) σ (cid:12) F ( ψ − v ; α , ν ) (cid:12) ψ − d v . (C.13)Based on Theorem 1, su ffi cient conditions for this model to be valid in R d are(1) [ ψ i j ] pi , j = is conditionally negative semidefinite;(2) R ( v ) = (cid:104) σ i j f ( ψ − i j v ; α i j , ν i j ) ψ − i j ( ψ − i j v ) − d / (cid:105) pi , j = is positive semidefinite for any v ≥ R ( v ) = v + d / σ i j ψ d / i j Γ ( ν i j ) α i j ψ i j v ν ij exp − α i j ψ i j v pi , j = = (4 t ( v )) + d / σ i j ψ d / i j Γ ( ν i j ) (cid:16) t ( v ) α i j ψ i j (cid:17) ν ij exp (cid:16) − t ( v ) α i j ψ i j (cid:17) pi , j = , with t ( v ) = v >
0. To prove that, under Conditions (B.2) to (B.4) of Theorem 3, this matrix is positive semidefinitefor any positive value of t ( v ), we distinguish two cases. 19 t ( v ) ≤
1. Up to a positive scalar factor, R ( v ) can be decomposed into the element-wise product of three matrices: R ( v ) = (4 t ( v )) + d / (cid:104) exp((ln t ( v ) + − t ( v )) ν i j ) (cid:105) pi , j = (cid:12) (cid:104) exp (cid:16) − t ( v ) (cid:16) α i j ψ i j − ν i j (cid:17)(cid:17)(cid:105) pi , j = (cid:12) σ i j ψ d / i j Γ ( ν i j ) (cid:16) α i j ψ i j (cid:17) ν ij exp( − ν i j ) pi , j = , (C.14)with ln t ( v ) + − t ( v ) < − t ( v ) <
0. Together with Lemma 1, Conditions (B.2), (B.3) and (B.4) of Theorem3 ensure the positive semidefiniteness of the first, second and third matrices in the second member of (C.14),respectively. Accordingly, R ( v ) is positive semidefinite based on Schur’s product theorem. • t ( v ) >
1. One has the following decomposition: R ( v ) = (4 t ( v )) + d / (cid:104) exp (cid:16) − (ln t ( v ) + (cid:16) α i j ψ i j − ν i j (cid:17)(cid:17)(cid:105) pi , j = (cid:12) (cid:104) exp (cid:16) (ln t ( v ) + − t ( v )) α i j ψ i j (cid:17)(cid:105) pi , j = (cid:12) σ i j ψ d / i j Γ ( ν i j ) (cid:16) α i j ψ i j (cid:17) ν ij exp( − ν i j ) pi , j = , (C.15)with − (ln t ( v ) + < t ( v ) + − t ( v ) <
0. The positive semidefiniteness of R ( v ) follows from Conditions(B.2), (B.3) and (B.4) of Theorem 3, Lemma 1 and Schur’s product theorem. In particular, one uses the factthat Conditions (B.2) and (B.3) imply the conditional negative semidefiniteness of [ α i j ψ i j ] pi , j = . Proof of Theorem 4.
Conditions (A) and (B) of Theorem 4 are particular cases of Theorem 3 with ψ i j = /β (A) and ψ i j = ν i j /α i j (B), which fulfill Conditions (B.1) and Conditions (B.3) of the latter theorem, respectively. References [1] Alegr´ıa, A., Porcu, E., Furrer, R., and Mateu, J. (2019). Covariance functions for multivariate Gaussian fields.
Stochastic EnvironmentalResearch Risk Assessment , 33:1593–1608.[2] Alonso-Malaver, C., Porcu, E., and Giraldo, R. (2015). Multivariate and multiradial Schoenberg measures with their dimension walks.
Journalof Multivariate Analysis , 133:251 – 265.[3] Apanasovich, T. V., Genton, M. G., and Sun, Y. (2012). A valid Mat´ern class of cross-covariance functions for multivariate random fields withany number of components.
Journal of the American Statistical Association , 107(497):180–193.[4] Berg, C., Christensen, J. P. R., and Ressel, P. (1984).
Harmonic Analysis on Semigroups: Theory of Positive Definite and Related Functions .Springer-Verlag.[5] Bevilacqua, M., Faouzi, T., Furrer, R., and Porcu, E. (2019). Estimation and prediction using Generalized Wendland covariance functionsunder fixed domain asymptotics.
Annals of Statistics , 47(2):828–856.[6] Bevilacqua, M., Hering, A. S., and Porcu, E. (2015). On the flexibility of multivariate covariance models.
Statist. Sci. , 30(2):167–169.[7] Castillo, P. I., Townley, B. K., Emery, X., Puig, ´A. F., and Deckart, K. (2015). Soil gas geochemical exploration in covered terrains of northernchile:data processing techniques and interpretation of contrast anomalies.
Geochemistry:Exploration, Environment, Analysis , 15(2-3):222–233.[8] Castruccio, S. and Stein, M. L. (2013). Global space–time models for climate ensembles.
The Annals of Applied Statistics , 7(3):1593–1611.[9] Chernih, A., Sloan, I. H., and Womersley, R. S. (2014). Wendland functions with increasing smoothness converge to a gaussian.
Advances inComputational Mathematics , 40(1):185–200.[10] Daley, D., Porcu, E., and Bevilacqua, M. (2015). Classes of compactly supported covariance functions for multivariate random fields.
Stochastic Environmental Research and Risk Assessment , 29(4):1249–1263.[11] Du, J., Leonenko, N., Ma, C., and Shu, H. (2012). Hyperbolic vector random fields with hyperbolic direct and cross covariance functions.
Stochastic Analysis and Applications , 30(4):662–674.[12] Emery, X., Arroyo, D., and Porcu, E. (2016). An improved spectral turning-bands algorithm for simulating stationary vector Gaussian randomfields.
Stochastic Environmental Research and Risk Assessment , 30(7):1863–1873.[13] Emery, X. and Lantu´ejoul, C. (2006). TBSIM: A computer program for conditional simulation of three-dimensional Gaussian random fieldsvia the turning bands method.
Computers & Geosciences , 32(10):1615–1628.[14] Fuentes, M. (2002). Spectral methods for nonstationary spatial processes.
Biometrika , 89(1):197–210.[15] Gelman, A., Hwang, J., and Vehtari, A. (2014). Understanding predictive information criteria for bayesian models.
Statistics and computing ,24(6):997–1016.[16] Genton, M. G. and Kleiber, W. (2015). Cross-covariance functions for multivariate geostatistics.
Statistical Science , 30(2):147–163.
17] Gneiting, T., Kleiber, W., and Schlather, M. (2010). Mat´ern cross-covariance functions for multivariate random fields.
Journal of the AmericanStatistical Association , 105:1167–1177.[18] Gradshteyn, I. and Ryzhik, I. (2007).
Table of Integrals, Series, and Products . Academic Press, Amsterdam, 7th. edition.[19] Haario, H., Saksman, E., and Tamminen, J. (2001). An adaptive Metropolis algorithm.
Bernoulli , 7(2):223–242.[20] Horrell, M. T. and Stein, M. L. (2017). Half-spectral space–time covariance models.
Spatial Statistics , 19:90–100.[21] Journel, A. and Huijbregts, C. (1978).
Mining Geostatistics . Academic Press, London.[22] Lantu´ejoul, C. (2002).
Geostatistical Simulation: Models and Algorithms . Springer.[23] Matheron, G. (1965).
Les Variables R´egionalis´ees et Leur Estimation . Masson, Paris.[24] Mery, N., Emery, X., C´aceres, A., Ribeiro, D., and Cunha, E. (2017). Geostatistical modeling of the geological uncertainty in an iron oredeposit.
Ore Geology Reviews , 88:336–351.[25] Pinheiro, M., Vallejos, J., Miranda, T., and Emery, X. (2016). Geostatistical simulation to map the spatial heterogeneity ofgeomechanicalparameters: a case study with rock mass rating.
Engineering Geology , 205:93–103.[26] Porcu, E. and Schilling, R. L. (2011). From Schoenberg to Pick–Nevanlinna: Toward a complete picture of the variogram class.
Bernoulli ,17(1):441–455.[27] Reams, R. (1999). Hadamard inverses, square roots and products of almost semidefinite matrices.
Linear Algebra and its Applications ,288:35–43.[28] Roberts, G. O. and Rosenthal, J. S. (2009). Examples of adaptive MCMC.
Journal of Computational and Graphical Statistics , 18(2):349–367.[29] Schilling, R., Song, R., and Vondrachek, Z. (2010).
Bernstein Functions . De Gruyter, Berlin.[30] Schoenberg, I.-J. (1938). Metric spaces and positive definite functions.
Transaction of the American Mathematical Society , 44(3):522–536.[31] Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and Van Der Linde, A. (2002). Bayesian measures of model complexity and fit.
Journal of theRoyal Statistical Society: Series B (statistical methodology) , 64(4):583–639.[32] Stein, M. L. (1999).
Interpolation of Spatial Data: Some Theory for Kriging . Springer.[33] Stein, M. L. (2005). Statistical methods for regular monitoring data.
Journal of the Royal Statistical Society: Series B (Statistical Methodol-ogy) , 67(5):667–687.[34] van den Boogaart, K. and Tolosana-Delgado, R. (2018). Predictive geometallurgy: An interdisciplinary key challenge for mathematicalgeosciences. In Daya Sagar, B., Cheng, Q., and Agterberg, F., editors,
Handbook of Mathematical Geosciences , pages 673–686. Springer,Cham: Switzerland.[35] Watanabe, S. (2010). Asymptotic equivalence of bayes cross validation and widely applicable information criterion in singular learningtheory.
Journal of Machine Learning Research , 11(Dec):3571–3594.[36] Zhang, H. (2004). Inconsistent estimation and asymptotically equal interpolations in model-based geostatistics.
Journal of the AmericanStatistical Association , 99(465):250–261., 99(465):250–261.