[PDF] Estimating covariance functions of multivariate skew-Gaussian random fields on the sphere

Abstract

This paper considers a multivariate spatial random field, with each component having univariate marginal distributions of the skew-Gaussian type. We assume that the field is defined spatially on the unit sphere embedded in \mathbb{R}^3, allowing for modeling data available over large portions of planet Earth. This model admits explicit expressions for the marginal and cross covariances. However, the n-dimensional distributions of the field are difficult to evaluate, because it requires the sum of 2^n terms involving the cumulative and probability density functions of a n-dimensional Gaussian distribution. Since in this case inference based on the full likelihood is computationally unfeasible, we propose a composite likelihood approach based on pairs of spatial observations. This last being possible thanks to the fact that we have a closed form expression for the bivariate distribution. We illustrate the effectiveness of the method through simulation experiments and the analysis of a real data set of minimum and maximum temperatures.

Full PDF

EEstimating covariance functions of multivariate skew-Gaussian randomﬁelds on the sphere

Alegr´ıa A. ∗ a , Caro S. b , Bevilacqua M. c , Porcu E. a , Clarke J. a a Departamento de Matem´atica, Universidad T´ecnica Federico Santa Mar´ıa, Valpara´ıso, Chile b Departamento de Matem´atica y Computaci´on, Universidad de Santiago, Santiago, Chile c Instituto de Estad´ıstica, Universidad de Valpara´ıso, Valpara´ıso, Chile

Abstract

This paper considers a multivariate spatial random ﬁeld, with each component having univariatemarginal distributions of the skew-Gaussian type. We assume that the ﬁeld is deﬁned spatially onthe unit sphere embedded in R , allowing for modeling data available over large portions of planetEarth. This model admits explicit expressions for the marginal and cross covariances. However,the n -dimensional distributions of the ﬁeld are diﬃcult to evaluate, because it requires the sum of2 n terms involving the cumulative and probability density functions of a n -dimensional Gaussiandistribution. Since in this case inference based on the full likelihood is computationally unfeasible,we propose a composite likelihood approach based on pairs of spatial observations. This last beingpossible thanks to the fact that we have a closed form expression for the bivariate distribution.We illustrate the eﬀectiveness of the method through simulation experiments and the analysis ofa real data set of minimum and maximum surface air temperatures. Keywords:

Composite likelihood, Geodesic distance, Global data

1. Introduction

The Gaussian assumption is an appealing option to model spatial data. First, the Gaussian dis-tribution is completely characterized by its ﬁrst two moments. Another interesting property is thetractability of the Gaussian distribution under linear combinations and conditioning. However, inmany geostatistical applications, including oceanography, environment and the study of naturalresources, the Gaussian framework is unrealistic, because the observed data have diﬀerent features,such as, positivity, skewness or heavy tails, among others.Transformations of Gaussian random ﬁelds (RFs) is the most common alternative to model non-Gaussian ﬁelds. Consider a spatial domain D and { Z ( s ) , s ∈ D} deﬁned as Z ( s ) = ϕ ( Y ( s )), where ϕ is a real-valued mapping and { Y ( s ) , s ∈ D} is Gaussian. Apparently the ﬁnite-dimensionaldistributions of Z depend on the choice of ϕ . In some cases, such mapping is one-to-one andadmits an inverse simplifying the analysis. In this class, we can highlight the log-Gaussian RFs, ∗ Corresponding author. Email: [email protected]

Preprint submitted to Elsevier July 27, 2017 a r X i v : . [ m a t h . S T ] J u l hich are generated as a particular example of the Box-Cox transformation (see De Oliveira et al.,1997). However, a one-to-one transformation is not appropriate in general. For instance, Stein(1992) considers a truncated Gaussian RF, taking ϕ as an indicator function, in order to modeldata sets with a given percentage of zeros (for instance, precipitations). On the other hand,wrapped-Gaussian RFs have been introduced in the literature for modeling directional spatialdata, arising in the study of wave and wind directions (see Jona-Lasinio et al., 2012). In addition,Xu and Genton (2016a) propose a ﬂexible class of ﬁelds named the Tukey g-and-h RFs.Another approach consists in taking advantage of the stochastic representations of random vari-ables. For instance, Ma (2009) considers a general approach to construct elliptically contouredRFs through mixtures of Gaussian ﬁelds. These models have an explicit covariance structure andallow a wide range of ﬁnite-dimensional distributions. Other related works have been developed byDu et al. (2012), Ma (2013a) and Ma (2013b) including Hyperbolic, K and Student’s t distributedﬁelds. Moreover, Kim and Mallick (2004), Gualtierotti (2005) and Allard and Naveau (2007) haveintroduced skew-Gaussian RFs for modeling data with skewed distributions. However, Minozzoand Ferracuti (2012) and Genton and Zhang (2012) show that all these models are not valid be-cause they cannot be identiﬁed with probability one using a single realization, i.e., in practice itis impossible to make inference on the basis of these models. Such results do not prevent theexistence of RFs having univariate marginal distributions belonging to a given family.In this paper, we consider multivariate stationary RFs, where each component has a univariatemarginal distribution of the skew-Gaussian type. We follow the representation proposed in theunivariate case by Zhang and El-Shaarawi (2010) and extend it to the multivariate case. This con-struction allows for modeling data with diﬀerent degrees of skewness as well as explicit expressionsfor the covariance function. Estimation methods for this model are still unexplored. Maximumlikelihood is certainly a useful tool, but it is impracticable, because the full likelihood does nothave a simple form. Indeed, if n is the number of observations, it can be explicitly expressed as thesum of 2 n terms depending on the probability density function (pdf) and the cumulative distribu-tion function (cdf) of the n -variate Gaussian distribution. Direct maximization of the likelihoodseems intractable from a computational and analytical point of view. Zhang and El-Shaarawi(2010) consider the EM algorithm to perform inference on the skew-Gaussian model. However, theiterations of the EM algorithm are diﬃcult to evaluate because each step requires Monte Carlosimulations of a non-trivial conditional expectation. On the other hand, composite likelihood (CL)is an estimation procedure (Lindsay, 1988; Varin et al., 2011; Cox and Reid, 2004) based on thelikelihood of marginal or conditional events. CL methods are an attractive option when the fulllikelihood is diﬃcult to write and/or when the data sets are large. This approach has been used inseveral spatial and space-time contexts, mainly in the Gaussian case (Vecchia, 1988; Curriero andLele, 1999; Stein et al., 2004; Bevilacqua et al., 2012; Bevilacqua and Gaetan, 2015; Bevilacquaet al., 2016). Outside the Gaussian scenario, Heagerty and Lele (1998) propose CL inference forbinary spatial data. Moreover, Padoan et al. (2010) and Sang and Genton (2014) have used CLmethods for the estimation of max-stable ﬁelds, whereas Alegr´ıa et al. (2016) consider a truncatedCL approach for wrapped-Gaussian ﬁelds.The implementation of the CL method on multivariate skew-Gaussian ﬁelds is still unexplored. Ourgoal consists in developing a CL approach based on pairs of observations for a multivariate skew-Gaussian RF. Our contribution provides a fast and accurate tool to make inference on skeweddata. The main ingredient of the pairwise CL method is the characterization of the bivariate2istributions of the RF, that is, we derive a closed form expression for the joint distributionbetween two correlated skew-Gaussian random variables (possibly with diﬀerent means, variancesand degrees of skewness).In addition, in order to work with data collected over the whole planet Earth, we consider thespatial domain as the unit sphere D = S := { s ∈ R : (cid:107) s (cid:107) = 1 } , where (cid:107) · (cid:107) denotes the Euclideandistance. We refer the reader to Marinucci and Peccati (2011) for a more detailed study aboutRFs on spheres. An important implication is that the covariance function depends on a diﬀerentmetric, called geodesic distance. In general, covariance models valid on Euclidean spaces are notvalid on the sphere, and we refer the reader to Gneiting (2013) for a overview of the problem.The paper is organized as follows. In Section 2, the multivariate skew-Gaussian RF is introduced.The bivariate distributions of the skew-Gaussian ﬁeld and the CL approach are discussed in Section3. In section 4, simulation experiments are developed. Section 5 contains a real application for abivariate data set of minimum and maximum surface air temperatures. Finally, Section 6 providesa brief discussion.

2. Skew-Gaussian RFs on the unit sphere

In this section, we introduce a skew-Gaussian model generated as a mixture of two latent Gaus-sian RFs. Such construction is based on the stochastic representation of skew-Gaussian randomvariables. Let X and Y be two independent standard Gaussian random variables and − ≤ δ ≤ Z = δ | X | + (cid:112) − δ Y (1)is called skew-Gaussian, with pdf 2 φ ( z )Φ( αz ), for z ∈ R , where α = δ/ √ − δ . Here, φ ( · ) andΦ( · ) denote the pdf and cdf of the standard Gaussian distribution. If δ >

0, we say that Z isright-skewed, whereas for δ < Z is left-skewed. Of course, for δ = 0 we have the Gaussiancase. For a detailed study of the skew-Gaussian distribution, we refer the reader to Azzalini (1985,1986), Azzalini and Dalla Valle (1996), Azzalini and Capitanio (1999), Arellano-Valle and Azzalini(2006) and Azzalini (2013).We work with the spatial counterpart of Equation (1). Let { X ( s ) = ( X ( s ) , . . . , X m ( s )) (cid:62) : s ∈ S } and { Y ( s ) = ( Y ( s ) , . . . , Y m ( s )) (cid:62) : s ∈ S } be two stationary multivariate Gaussian RFs deﬁnedon S . Here, m ∈ N denotes the number of components of the ﬁelds and (cid:62) is the transposeoperator. In addition, we assume that X ( s ) and Y ( s ) are independent, with components havingzero mean and unit variance. In the spherical framework, the covariances are given in terms of thegeodesic distance, deﬁned by θ := θ ( s , s ) = arccos( s (cid:62) s ) ∈ [0 , π ] , s , s ∈ S , which is the most natural metric on the spherical surface. Therefore, we suppose that there existstwo matrix-valued mappings r x , r y : [0 , π ] → R m × m such thatcov { X i ( s ) , X j ( s (cid:48) ) } = r xij ( θ ( s , s (cid:48) )) and cov { Y i ( s ) , Y j ( s (cid:48) ) } = r yij ( θ ( s , s (cid:48) )) , for all s , s (cid:48) ∈ S and i, j = 1 , . . . , m . In such case, we say that the covariance function is sphericallyisotropic. In the univariate case ( m = 1), Gneiting (2013) establishes that some classical covariances3uch as the Cauchy, Mat´ern, Askey and Spherical models, given in the classical literature in termsof Euclidean metrics, can be coupled with the geodesic distance under speciﬁc constraints for theparameters. Furthermore, Porcu et al. (2016) propose several covariance models for space-timeand multivariate RFs on spherical spatial domains.Next, we deﬁne a multivariate spatial RF with each component being skew-Gaussian distributed,according to Equation (1). It is a multivariate extension of the univariate skew-Gaussian ﬁeldproposed by Zhang and El-Shaarawi (2010). This model allows diﬀerent means, variances andskewness in the components of the RF. Deﬁnition 2.1.

A multivariate ﬁeld, { Z ( s ) = ( Z ( s ) , . . . , Z m ( s )) (cid:62) : s ∈ S } , with componentshaving skew-Gaussian marginal distributions, can be deﬁned through Z i ( s ) = µ i + η i | X i ( s ) | + σ i Y i ( s ) , s ∈ S , i = 1 , . . . , m, (2)where µ i , η i ∈ R and σ i ∈ R + . Note that ( Z i ( s ) − µ i ) / (cid:113) η i + σ i , for all s ∈ S , follows a skew-Gaussian distribution with pdf given by f Z i ( z ) = 2 φ ( z ) Φ (( η i /σ i ) z ) . Remark 2.1.

Recent literature considers the latent ﬁelds X i ( s ), i = 1 , . . . , m , as single randomvariables X i , being constants along the spatial domain. However, this approach has apparent iden-tiﬁability problems, since in practice we only work with one realization and there is no informationabout the variability of X i . Thus, this approach only produces a shift eﬀect in the model. Theseconsiderations are studied by Minozzo and Ferracuti (2012) and Genton and Zhang (2012).Direct application of the results given by Zhang and El-Shaarawi (2010) provides the followingproposition. Proposition 2.1.

The ﬁeld Z ( s ) deﬁned through Equation (2) is stationary with expectations E ( Z i ( s )) = µ i + η i (cid:114) π , s ∈ S , and covariances C ij ( θ ) := cov { Z i ( s ) , Z j ( s (cid:48) ) } = 2 η i η j π g ( r xij ( θ )) + σ i σ j r yij ( θ ) , (3)for all i, j = 1 , . . . , m , where θ = θ ( s , s (cid:48) ) and g ( t ) = √ − t + t arcsin( t ) −

1, for | t | ≤

3. Composite likelihood estimation

We ﬁrst introduce the CL approach from a general point of view. CL methods (Lindsay, 1988)are likelihood approximations for dealing with large data sets. In addition, in the last years,these techniques have been used to study data with intractable analytical expressions for the fulllikelihood. The objective function for CL methods is constructed through the likelihood of marginalor conditional events. Formally, let f ( z ; λ ) be the pdf of a n -dimensional random vector, where4 ∈ Λ ⊂ R p is an unknown parameter vector, and Λ is the parametric space. We denote by {A , ..., A K } a set of marginal or conditional events with associated likelihoods L k ( λ ; z ). Then,the objective function for the composite likelihood method is deﬁned as the weighted product L C ( λ ; z ) := K (cid:89) k =1 L k ( λ ; z ) w k , where the non-negative weights w k must be chosen according to an appropriate criterion. Inprinciple, the weights can improve the statistical and/or computational eﬃciency of the estimation.We use the notation (cid:96) k ( λ ; z ) = log L k ( λ ; z ), thus the log composite likelihood is given byCL( λ ; z ) := K (cid:88) k =1 (cid:96) k ( λ ; z ) w k . The maximum CL estimator is deﬁned as (cid:98) λ n = argmax λ ∈ Λ CL( λ ; z ) . By construction, the composite score ∇ CL( λ ) is an unbiased estimating equation, i.e., E ( ∇ CL( λ )) = ∈ R p . This is an appealing property of CL methods, since it is a ﬁrst order likelihood property.On the other hand, the second order properties are related to the Godambe information matrix,deﬁned as G n ( λ ) = H n ( λ ) J n ( λ ) − H n ( λ ) (cid:62) , where H n ( λ ) = − E ( ∇ CL( λ ; z )) and J n ( λ ) = E ( ∇ CL( λ ; z ) ∇ CL( λ ; z ) (cid:62) ). The inverse of G n ( λ ) isan approximation of the asymptotic variance of the CL estimator. Under increasing domain andregularity assumptions, CL estimates are consistent and asymptotically Gaussian. We now develop a CL method based on pairs of observations for the multivariate skew-GaussianRF. We consider the m -variate ﬁeld, { Z ( s ) = ( Z ( s ) , . . . , Z m ( s )) (cid:62) , s ∈ S } , deﬁned in (2), anda realization of Z ( s ) at n spatial locations, namely, ( Z ( s ) (cid:62) , . . . , Z ( s n ) (cid:62) ) (cid:62) . Then, we deﬁne allpossible pairs Z ijkl = ( Z i ( s k ) , Z j ( s l )) (cid:62) with associated log likelihood (cid:96) ijkl ( λ ), where λ ∈ R p is theparameter vector. Therefore, the corresponding log composite likelihood equation is deﬁned by(see Bevilacqua et al., 2016)CL( λ ) = m (cid:88) i =1 n − (cid:88) k =1 n (cid:88) l = k +1 ω iikl (cid:96) iikl ( λ ) + m − (cid:88) i =1 m (cid:88) j = i +1 n (cid:88) k =1 n (cid:88) l =1 ω ijkl (cid:96) ijkl ( λ ) . Note that the order of computation of the method is O{ mn ( n − / m ( m − n / } . Throughout,we consider a cut-oﬀ weight function (0/1 weights): w ijkl = 1 if θ ( s k , s l ) ≤ d ij , and 0 otherwise,for some cut-oﬀ distances d ij . This choice has apparent computational advantages. Moreover, itcan improve the eﬃciency as it has been shown in Joe and Lee (2009), Davis and Yau (2011) andBevilacqua et al. (2012). The intuition behind this approach is that the correlations between pairsof distant observations are often nearly zero. Therefore, using all possible pairs can generate a lossof eﬃciency, since redundant pairs can produce bias in the results.5rom now on, we use the notation Ω( r ) = (cid:18) rr (cid:19) . (4)The following result characterizes the pairwise distributions for the multivariate skew-GaussianRF. We suppose that η i (cid:54) = 0, for all i = 1 , . . . , m . The case with zero skewness is reduced to theGaussian scenario. Proposition 3.1.

Consider two sites s k , s l ∈ S and θ = θ ( s k , s l ). The log likelihood associatedto the pair Z ijkl = ( Z i ( s k ) , Z j ( s l )) (cid:62) in the multivariate skew-Gaussian model (2) is given by (cid:96) ijkl ( λ ) = log (cid:32) (cid:88) t =1 φ ( Z ijkl − µ ; A t )Φ ( L t ; B t ) (cid:33) (5)where φ ( y ; Σ) denotes the bivariate Gaussian density function with zero mean and covariancematrix Σ. Similarly, Φ ( l ; Σ) denotes the corresponding Gaussian cdf. Here, µ = ( µ i , µ j ) (cid:62) ,A t = Ω + Υ − Ω[( − t r xij ( θ )]Υ − ,B t = ([ΥΩ Υ] − + Ω[( − t r xij ( θ )] − ) − ,L t = (cid:20) I + ΥΩ ΥΩ[( − t r xij ( θ )] − (cid:21) − Υ( Z ijkl − µ ) , where Υ = diag { /η i , /η j } , I is the identity matrix of order (2 ×

2) andΩ = (cid:18) σ i σ i σ j σ i σ j σ j (cid:19) ◦ Ω( r yij ( θ )) , where ◦ denotes the Hadamard product.We have deduced a closed form expression for the bivariate distributions of the ﬁeld. Note thatthe correlation function r xij alternates its sign in each element of the sum. Evaluation of Equation(5) requires the numerical calculation of the bivariate Gaussian cdf. The proof of Proposition 3.1is deferred to Appendix A.

4. Simulation study

This section assesses through simulation experiments the statistical and computational performanceof the pairwise CL method. We pay attention to bivariate ( m = 2) skew-Gaussian RFs on S . We believe that there are no strong arguments to consider diﬀerent correlation structures r x ( · )and r y ( · ) for the latent RFs. For example, the smoothness of the skew-Gaussian RF is the sameas the smoothness of the roughest latent Gaussian ﬁeld. Moreover, if both latent correlations are6ompactly supported, thus the covariance generated has also compact support. We thus considerlatent ﬁelds belonging to the same parametric family of correlation functions, r xij ( θ ) = ρ xij r ( θ ; c xij ) , r yij ( θ ) = ρ yij r ( θ ; c yij ) , i, j = 1 , , θ ∈ [0 , π ] , where ρ xii = 1, ρ x = ρ x , | ρ x | ≤ c xij > ρ yij and c yij ), withmapping θ (cid:55)→ r ( θ ; c ) being any univariate correlation function on the sphere (see Gneiting, 2013).The particular choice of r ( · ; · ) produces additional restrictions on the parameters. Throughout, wework with r ( θ ; c ) = exp (cid:18) − θc (cid:19) , θ ∈ [0 , π ] , (6)or r ( θ ; c ) = (cid:18) − θc (cid:19) , θ ∈ [0 , π ] , (7)where c > a ) + = max { , a } . Mappings (6) and (7) are known asExponential and Askey models, respectively. The former decreases exponentially to zero and ittakes values less than 0 .

05 for θ > c , whereas the second is compactly supported, that is, itis identically equal to zero beyond the cut-oﬀ distance c . Explicit parametric conditions for thevalidity of the bivariate Exponential model are provided by Porcu et al. (2016). On the other hand,Appendix B illustrates a construction principle that justify the use of bivariate Askey models onspheres.An interesting property is that the collocated correlation coeﬃcient between the components ofa bivariate skew-Gaussian RF, C (0) / (cid:112) C (0) C (0), with C ( · ) deﬁned in (3), depends on themajority of the model parameters. Figure 1 shows the behavior of this coeﬃcient in terms of thelatent correlation coeﬃcients ρ xij and ρ yij , with σ = σ = 1 and under two diﬀerent settings for theskewness parameters: ( η , η ) = (1 ,

3) and ( η , η ) = (1 , − ρ := ρ x = ρ y , c ij := c xij = c yij and c = ( c + c ) /

2. The parameter vector is given by λ = ( σ , σ , η , η , c , c , ρ , µ , µ ) (cid:62) . In addition, this parameterization avoids identiﬁabilityproblems.Figure 2 shows the covariance C ( θ ), in Equation (3), generated from the latent correlation func-tions (6) and (7). The skew-Gaussian RF preserves the correlation shape of the latent ﬁelds.Figure 3 depicts a bivariate realization of a skew-Gaussian RF, over 15000 spatial locations, withlatent ﬁelds having Exponential correlations. We have simulated using Cholesky decompositionwith σ = 0 . σ = 0 . η = 2, η = 1, µ = µ = 0, ρ = 0 . c = c = 0 .

6. The skewnessof the simulated data is illustrated through the corresponding histograms.

We illustrate the saving of the pairwise CL method in terms of computational burden. All ex-periments were carried out on a 2.7 GHz processor with 8 GB of memory and the estimationprocedures were implemented coupling R functions and C routines. Table 1 provides the com-putational times (in seconds) in evaluating the weighted CL method, with cut-oﬀ distance equal7 igure 1: Collocated correlation coeﬃcient between the components of a bivariate skew-Gaussian RF in terms of ρ xij and ρ yij . We consider σ = σ = 1 and two scenarios for the skewness parameters: ( η , η ) = (1 ,

3) (left) and( η , η ) = (1 , −

3) (right).Figure 2: Covariance function associated to the skew-Gaussian RF, with latent correlations of Exponential (solidline) and Askey (dashed line) types.Figure 3: Bivariate simulation from the skew-Gaussian model with latent ﬁelds having Exponential correlationfunctions. Both variables are right-skewed and the empirical collocated correlation coeﬃcient is approximately 0.7. able 1: Time (in seconds) in evaluating CL method, with 0/1 weights, considering diﬀerent number of observationsand cut-oﬀ distances d ij = 0 . , . , . , Number of observations250 500 1000 2000 4000 8000 16000 d ij = 0 .

25 0.003 0.007 0.016 0.068 0.254 0.954 3.753 d ij = 0 . d ij = 0 .

75 0.007 0.017 0.050 0.205 0.796 3.020 12.024 d ij = 1 0.008 0.022 0.066 0.290 1.139 4.357 17.365to d ij = 0 . , . , . , i, j = 1 ,

2. These results show that CL has a moderatecomputational cost even for large data sets. Indeed, the most demanding part in the evaluation ofthe objective function is the repeated numerical calculation of the bivariate Gaussian cdf.We now study the statistical eﬃciency of the estimation method. We consider 289 spatial sitesin a grid on S , which is generated with 17 equispaced longitude and latitude points. We use thelatent correlation functions (6) and (7), with σ = σ = 1, µ = µ = 0, c = 0 . c = 0 .

25 and ρ = 0 .

5, under the following choices for the skewness parameters:(A) We set η = 1 and η = 2. In this case, both components are right-skewed and the collocatedcorrelation coeﬃcient between the components of the ﬁeld is approximately 0 . η = 1 and η = −

2. The ﬁrst component of the ﬁeld is right-skewed, whereas thesecond one is left-skewed, and the collocated correlation coeﬃcient between the componentsof the ﬁeld is approximately 0 . • Scenario (I) . Exponential model under choice (A). • Scenario (II) . Askey model under choice (A). • Scenario (III) . Exponential model under choice (B). • Scenario (IV) . Askey model under choice (B).For each scenario, we simulate 500 independent realizations from the bivariate skew-Gaussian RF.Then, we estimate the parameters using the weighted pairwise CL method. We set d ij = 0 . i, j = 1 ,

2. Figure 4 reports the boxplots of the CL estimates. All studiesshow the eﬀectiveness of our proposal. We have also applied CL estimation to other parametricmodels, such as, the Cauchy and Wendland correlation functions. For each case considered thepairwise CL performs well.Finally, we assess the performance of the pairwise CL method with increasing scale parameters c ij as well as increasing sample sizes n . For simplicity, all the subsequent experiments considera single scale parameter c ij = c , for all i, j = 1 ,

2. In this case, the parameter vector reducesto λ = ( σ , σ , η , η , c, ρ , µ , µ ) (cid:62) . We consider an Exponential latent correlation under thefollowing parametric setting: σ = σ = η = 1, η = 2, µ = µ = 0 and ρ = 0 .

5. Figure 5reports the centered boxplots of the CL estimates in three diﬀerent cases: c = 0 . , . , .

75. Theincrease of the parameter c imply that the spatial dependence will be strengthened, and it produces9 igure 4: Boxplots of the CL estimates for the bivariate skew-Gaussian RF, under Scenarios (I)-(IV).Figure 5: Centered boxplots of the CL estimates, for the bivariate skew-Gaussian RF, using an Exponential latentcorrelation and diﬀerent scale parameters: c = 0 . , . , . igure 6: Centered boxplots of the CL estimates, for the bivariate skew-Gaussian RF, using an Exponential latentcorrelation and diﬀerent sample sizes: n = 81 , , biased estimates of c and ρ . Our ﬁndings are consistent and add more evidence to the resultsreported in the previous literature (Zhang, 2004; Bevilacqua et al., 2012; Xu and Genton, 2016b).On the other hand, we set c = 0 .

15, and we consider increasing sample sizes: n = 81 , ,

5. A bivariate data set

We analyze a bivariate data set of Minimum (Variable 1) and Maximum (Variable 2) surface airtemperatures. The spatial variability of temperatures is crucial for modeling hydrological andagricultural phenomena. These data outputs come from the Community Climate System Model(CCSM4.0) (see Gent et al., 2011) provided by NCAR (National Center for Atmospheric Research)located at Boulder, CO, USA.We have monthly data over a grid of 2 . × . − η i = 0, for i = 1 ,

2. We haveconsidered the parameterization introduced in the previous sections, so that, the skew-Gaussian11 igure 7: Residuals of the Minimum (left) and Maximum (right) surface air temperatures in July of 2015. model has 9 parameters, whereas the Gaussian model has 7 parameters. The CL estimation iscarried out using only pairs of observations whose spatial distances are less than 1592.75 kilometers(equivalent to 0 .

25 radians on the unit sphere). Table 2 reports the CL estimates for the skew-Gaussian and Gaussian models. The units of the scale parameters are kilometers.The optimal values of the CL objective functions are given in Table 3. Note that the maximumCL value under the skew-Gaussian model is superior to the merely Gaussian model. It is clearthat the incorporation of skewness produces improvements in goodness-of-ﬁt. Figure 8 shows thehistograms of each variable and the ﬁtted skew-Gaussian and Gaussian density functions. In Figure9, the marginal and cross empirical semi-variograms are compared to the theoretical models.Finally, we compare both models in terms of their predictive performance. Since the covariancestructure of the skew-Gaussian ﬁeld is known explicitly, we use the classical best linear unbiasedpredictor (cokriging), which is optimal for the Gaussian model, in terms of mean squared error.However, it is not optimal for the skew-Gaussian RF. In spite of this, we will show that the skew-Gaussian model provides better predictive results. We use a drop-one prediction strategy andquantify the discrepancy between the real and predicted values through the root mean squaredprediction error (RMSPE) RMSPE = (cid:118)(cid:117)(cid:117)(cid:116) n (cid:88) i =1 n (cid:88) k =1 ( Z i ( s k ) − (cid:98) Z i ( s k )) and the Log-score (LSCORE)LSCORE = 12 n (cid:88) i =1 n (cid:88) k =1 (cid:34) log(2 π (cid:98) σ i ( s k ))2 + ( Z i ( s k ) − (cid:98) Z i ( s k )) (cid:98) σ i ( s k ) (cid:35) , where n is the number of spatial locations, (cid:98) Z i ( s k ) is the drop-one prediction of Z i ( s k ) at location s k and (cid:98) σ i ( s k ) is the drop-one prediction variance (see Zhang and Wang, 2010). Note that the12 igure 8: Histograms for the residuals of the Minimum (left) and Maximum (right) surface air temperatures, consid-ering observations with latitudes between −

30 and 30 degrees, and the ﬁtted skew-Gaussian (solid line) and Gaussian(dashed line) probability density functions. skew-Gaussian model generates better results since the mentioned indicators are smaller. In termsof RMSPE, the improvement in the prediction is approximately 3 .

6. Discussion

Building models for non-Gaussian RFs has become a major challenge and more eﬀorts should bedevoted to such constructions. In particular, it seems that the main diﬃculties arise when trying tobuild models that are statistically identiﬁable. Another major problem, on the other hand, comeswhen building the ﬁnite dimensional distributions, which are analytically intractable in most cases.This paper has provided an approach that allows to avoid the identiﬁability problem in multivariateskew-Gaussian RFs, and that permits to implement a CL approach.

Figure 9: Empirical semi-variograms versus ﬁtted semi-variograms, using Exponential latent correlations, for theskew-Gaussian (solid line) and Gaussian (dashed line) models. able 2: CL estimates for the skew-Gaussian and Gaussian RFs, using Exponential latent correlations. Scaleparameters are given in kilometers. (cid:98) σ (cid:99) σ (cid:98) η (cid:98) η (cid:98) c (cid:98) c (cid:98) ρ (cid:98) µ (cid:98) µ skew-Gaussian 0.223 0.179 0.487 0.769 4995.9 4867.1 0.819 0.250 0.079Gaussian 0.310 0.419 - - 3922.4 3393.1 0.743 0.635 0.674 Table 3: Prediction performance of the skew-Gaussian and Gaussian RFs, using Exponential latent correlations. − − . − Acknowledgments

Alfredo Alegr´ıa is supported by

Beca CONICYT-PCHA/Doctorado Nacional/2016-21160371 . MorenoBevilacqua is partially supported by

Proyecto Fondecyt 1160280 . Emilio Porcu is supported by

Proyecto Fondecyt Regular number 1130647 . Jorge Clarke is supported by

Proyecto Fondecyt Post-Doctorado number 3150506.

We also acknowledge the World Climate Research Programme’s Working Group on Coupled Mod-elling, which is responsible for Coupled Model Intercomparison Project (CMIP).

Appendix A. Pairwise distributions for the multivariate skew-Gaussian RF

Before we state the proof of Proposition 3.1, we need the following property of quadratic forms.

Lemma

Let A and B be two symmetric positive deﬁnite matrices of order ( n × n ) and x , a ∈ R n .Then, we have the following identity:( a − x ) (cid:62) A − ( a − x ) + x (cid:62) B − x = ( x − c ) (cid:62) ( A − + B − )( x − c ) + a (cid:62) ( A + B ) − a , (A.1)where c = ( A − + B − ) − A − a . 14 roof of Proposition 3.1 Consider W = ( | X | , | X | ) (cid:62) , V = ( Y , Y ) (cid:62) , µ = ( µ , µ ) (cid:62) , η =( η , η ) (cid:62) and σ = ( σ , σ ) (cid:62) , where ( X , X ) (cid:62) ∼ N ( , Ω( r x )) and ( Y , Y ) (cid:62) ∼ N ( , Ω( r y )) areindependent, with Ω( r ) as deﬁned in (4). Let Z = ( Z , Z ) (cid:62) deﬁned through Z = µ + η ◦ W + σ ◦ V , where ◦ denotes the Hadamard product. Therefore, the joint probability density function of Z isgiven by f Z ( z ) = (cid:90) R f Z | ( W = w ) ( z | w ) f W ( w )d w , (A.2)Here, f W ( w ) is the pdf of the random vector W and w = ( w , w ) (cid:62) . Note that the cdf of therandom vector W can be written as F W ( w , w ) = Φ ( w , w ; Ω( r x )) − Φ ( − w , w ; Ω( r x )) − Φ ( w , − w ; Ω( r x ))+Φ ( − w , − w ; Ω( r x ))Then, we can obtain the pdf of W , f W ( w , w ) = 2 (cid:18) φ ( w , w ; Ω( r x )) + φ ( − w , w ; Ω( r x )) (cid:19) , = 2 (cid:18) φ ( w , w ; Ω( r x )) + φ ( w , w ; Ω( − r x )) (cid:19) . On the other hand, f Z | W = w ( z | w ) is the pdf of the random vector Z | ( W = w ) ∼ N ( µ + η ◦ w ; Ω ),with Ω = (cid:18) σ σ σ σ σ σ (cid:19) ◦ Ω( r y ) . Therefore, evaluation of the integral (A.2) requires the characterization of an integral of the form I = (cid:90) R φ ( z − µ − η ◦ w ; Ω ) φ ( w ; Ω )d w , with Ω being Ω( r x ) or Ω( − r x ). Moreover, I can be written as I = | Υ | (cid:90) R φ (Υ( z − µ ) − w ; ΥΩ Υ) φ ( w ; Ω )d w where Υ = diag { /η , /η } . Thus, using Equation (A.1) we have I = | Υ | φ (cid:18) Υ( z − µ ); ΥΩ Υ + Ω (cid:19) (cid:90) R φ (cid:18) w − L ; ([ΥΩ Υ] − + Ω − ) − (cid:19) d w , = | Υ | φ (cid:18) Υ( z − µ ); ΥΩ Υ + Ω (cid:19) Φ ( L ; ([ΥΩ Υ] − + Ω − ) − )= φ (cid:18) z − µ ; Ω + Υ − Ω Υ − (cid:19) Φ ( L ; ([ΥΩ Υ] − + Ω − ) − )where L = (cid:20) I + ΥΩ ΥΩ − (cid:21) − Υ( z − µ ). 15 ppendix B. Multivariate Askey model on the sphere We consider a multivariate Askey model for the sphere S , deﬁned according to ρ ij r ( θ ; c ij ) = ρ ij (cid:18) − θc ij (cid:19) , θ ∈ [0 , π ] , i, j = 1 , . . . , m, which has been used through Sections 4 and 5, considering c ij = ( c ii + c jj ) /

2. We claim that sucha model is positive deﬁnite using the following mixture (see Daley et al., 2015) r ( θ ; c ij ) ∝ (cid:90) ∞ r ( θ ; ξ ) ξ r ( ξ ; c ij )d ξ, θ ∈ [0 , π ] . The results of Gneiting (2013) for the univariate Askey model, coupled with the conditions devel-oped by Daley et al. (2015) complete our assertion.

ReferencesReferences

Alegr´ıa, A., Bevilacqua, M., Porcu, E., 2016. Likelihood-based inference for multivariate space-time wrapped-Gaussian ﬁelds. Journal of Statistical Computation and Simulation 86 (13), 2583–2597.Allard, D., Naveau, P., 2007. A New Spatial Skew-Normal Random Field Model. Communications in Statistics -Theory and Methods 36 (9), 1821–1834.Arellano-Valle, R., Azzalini, A., 2006. On the uniﬁcation of families of skew-normal distributions. ScandinavianJournal of Statistics 33, 561–574.Azzalini, A., 1985. A class of distributions which includes the normal ones. Scandinavian Journal of Statistics 12 (2),171–178.Azzalini, A., 1986. Further results on a class of distributions which includes the normal ones. Statistica 46 (2),199–208.Azzalini, A., 2013. The skew-normal and related families. Vol. 3. Cambridge University Press.Azzalini, A., Capitanio, A., 1999. Statistical applications of the multivariate skew normal distribution. Journal ofthe Royal Statistical Society: Series B (Statistical Methodology) 61 (3), 579–602.Azzalini, A., Dalla Valle, A., 1996. The multivariate skew-normal distribution. Biometrika 83 (4), 715–726.Bevilacqua, M., Alegria, A., Velandia, D., Porcu, E., 2016. Composite likelihood inference for multivariate Gaussianrandom ﬁelds. Journal of Agricultural, Biological, and Environmental Statistics 21 (3), 448–469.Bevilacqua, M., Gaetan, C., 2015. Comparing composite likelihood methods based on pairs for spatial Gaussianrandom ﬁelds. Statistics and Computing 25 (5), 877–892.Bevilacqua, M., Gaetan, C., Mateu, J., Porcu, E., 2012. Estimating space and space-time covariance functions forlarge data sets: a weighted composite likelihood approach. Journal of the American Statistical Association 107,268–280.Cox, D., Reid, N., 2004. Miscellanea: A note on pseudolikelihood constructed from marginal densities. Biometrika91, 729–37.Curriero, F., Lele, S., 1999. A composite likelihood approach to semivariogram estimation. Journal of Agricultural,Biological and Environmental Statistics 4, 9–28.Daley, D., Porcu, E., Bevilacqua, M., 2015. Classes of compactly supported covariance functions for multivariaterandom ﬁelds. Stochastic Environmental Research and Risk Assessment 29 (4), 1249–1263.Davis, R., Yau, C.-Y., 2011. Comments on pairwise likelihood in time series models. Statistica Sinica 21, 255–277.De Oliveira, V., Kedem, B., Short, D., 1997. Bayesian prediction of transformed Gaussian random ﬁelds. Journal ofthe American Statistical Association 92, 1422–1433.Du, J., Leonenko, N., Ma, C., Shu, H., 2012. Hyperbolic vector random ﬁelds with hyperbolic direct and crosscovariance functions. Stochastic Analysis and Applications 30 (4), 662–674. ent, P., Danabasoglu, G., Donner, L., Holland, M., Hunke, E., Jayne, S., Lawrence, D., Neale, R., Rasch, P.,Vertenstein, M., Worley, P., Yang, Z.-L., Zhang, M., 2011. The Community Climate System Model Version 4.Journal of Climate 24 (19), 4973–4991.Genton, M. G., Zhang, H., 2012. Identiﬁability problems in some non-Gaussian spatial random ﬁelds. Chilean Journalof Statistics 3 (2).Gneiting, T., 2013. Strictly and non-strictly positive deﬁnite functions on spheres. Bernoulli 19 (4), 1327–1349.Gualtierotti, A., 2005. Skew-normal processes as models for random signals corrupted by Gaussian noise. InternationalJournal of Pure and Applied Mathematics 20, 109–142.Heagerty, P., Lele, S., 1998. A composite likelihood approach to binary spatial data. Journal of the AmericanStatistical Association 93, 1099–1111.Joe, H., Lee, Y., 2009. On weighting of bivariate margins in pairwise likelihood. Journal of Multivariate Analysis100, 670–685.Jona-Lasinio, G., Gelfand, A., Jona-Lasinio, M., 12 2012. Spatial analysis of wave direction data using wrappedGaussian processes. The Annals of Applied Statistics 6 (4), 1478–1498.Kim, H., Mallick, B., 2004. A bayesian prediction using the skew Gaussian distribution. Journal of Statistical Planningand Inference 120 (1-2), 85–101.Lindsay, B., 1988. Composite likelihood methods. Contemporary Mathematics 80, 221–239.Ma, C., 2009. Construction of non-Gaussian random ﬁelds with any given correlation structure. Journal of StatisticalPlanning and Inference 139, 780–787.Ma, C., 2013a. K-distributed vector random ﬁelds in space and time. Statistics & Probability Letters 83 (4), 1143–1150.Ma, C., 2013b. Student’s t vector random ﬁelds with power-law and log-law decaying direct and cross covariances.Stochastic Analysis and Applications 31 (1), 167–182.Marinucci, D., Peccati, G., 2011. Random ﬁelds on the sphere: representation, limit theorems and cosmologicalapplications. Vol. 389. Cambridge University Press.Minozzo, M., Ferracuti, L., 2012. On the existence of some skew-normal stationary processes. Chilean Journal ofStatistics 3, 157.Padoan, S., Ribatet, M., Sisson, S., 2010. Likelihood-based inference for max-stable processes. Journal of the Amer-ican Statistical Association 105, 263–277.Porcu, E., Bevilacqua, M., Genton, M., 2016. Spatio-Temporal Covariance and Cross-Covariance Functions of theGreat Circle Distance on a Sphere. Journal of the American Statistical Association 111 (514), 888–898.Sang, H., Genton, M. G., 2014. Tapered composite likelihood for spatial max-stable models. Spatial Statistics 8,86–103.Stein, M., 1992. Prediction and Inference for Truncated Spatial Data. Journal of Computational and GraphicalStatistics 1, 91–110.Stein, M., Chi, Z., Welty, L., 2004. Approximating likelihoods for large spatial data sets. Journal of the RoyalStatistical Society B 66, 275–296.Varin, C., Reid, N., Firth, D., 2011. An overview of composite likelihood methods. Statistica Sinica 21, 5–42.Vecchia, A., 1988. Estimation and model identiﬁcation for continuous spatial processes. Journal of the Royal Statis-tical Society B 50, 297–312.Xu, G., Genton, M. G., 2016a. Tukey g-and-h Random Fields. Journal of the American Statistical Association (Toappear).Xu, G., Genton, M. G., 2016b. Tukey max-stable processes for spatial extremes. Spatial Statistics 18, 431 – 443.Zhang, H., 2004. Inconsistent estimation and asymptotically equal interpolations in model-based geostatistics. Journalof the American Statistical Association 99 (465), 250–261.Zhang, H., El-Shaarawi, A., 2010. On spatial skew-Gaussian processes and applications. Environmetrics 21 (1), 33–47.Zhang, H., Wang, Y., 2010. Kriging and cross-validation for massive spatial data. Environmetrics 21 (3-4), 290–304.ent, P., Danabasoglu, G., Donner, L., Holland, M., Hunke, E., Jayne, S., Lawrence, D., Neale, R., Rasch, P.,Vertenstein, M., Worley, P., Yang, Z.-L., Zhang, M., 2011. The Community Climate System Model Version 4.Journal of Climate 24 (19), 4973–4991.Genton, M. G., Zhang, H., 2012. Identiﬁability problems in some non-Gaussian spatial random ﬁelds. Chilean Journalof Statistics 3 (2).Gneiting, T., 2013. Strictly and non-strictly positive deﬁnite functions on spheres. Bernoulli 19 (4), 1327–1349.Gualtierotti, A., 2005. Skew-normal processes as models for random signals corrupted by Gaussian noise. InternationalJournal of Pure and Applied Mathematics 20, 109–142.Heagerty, P., Lele, S., 1998. A composite likelihood approach to binary spatial data. Journal of the AmericanStatistical Association 93, 1099–1111.Joe, H., Lee, Y., 2009. On weighting of bivariate margins in pairwise likelihood. Journal of Multivariate Analysis100, 670–685.Jona-Lasinio, G., Gelfand, A., Jona-Lasinio, M., 12 2012. Spatial analysis of wave direction data using wrappedGaussian processes. The Annals of Applied Statistics 6 (4), 1478–1498.Kim, H., Mallick, B., 2004. A bayesian prediction using the skew Gaussian distribution. Journal of Statistical Planningand Inference 120 (1-2), 85–101.Lindsay, B., 1988. Composite likelihood methods. Contemporary Mathematics 80, 221–239.Ma, C., 2009. Construction of non-Gaussian random ﬁelds with any given correlation structure. Journal of StatisticalPlanning and Inference 139, 780–787.Ma, C., 2013a. K-distributed vector random ﬁelds in space and time. Statistics & Probability Letters 83 (4), 1143–1150.Ma, C., 2013b. Student’s t vector random ﬁelds with power-law and log-law decaying direct and cross covariances.Stochastic Analysis and Applications 31 (1), 167–182.Marinucci, D., Peccati, G., 2011. Random ﬁelds on the sphere: representation, limit theorems and cosmologicalapplications. Vol. 389. Cambridge University Press.Minozzo, M., Ferracuti, L., 2012. On the existence of some skew-normal stationary processes. Chilean Journal ofStatistics 3, 157.Padoan, S., Ribatet, M., Sisson, S., 2010. Likelihood-based inference for max-stable processes. Journal of the Amer-ican Statistical Association 105, 263–277.Porcu, E., Bevilacqua, M., Genton, M., 2016. Spatio-Temporal Covariance and Cross-Covariance Functions of theGreat Circle Distance on a Sphere. Journal of the American Statistical Association 111 (514), 888–898.Sang, H., Genton, M. G., 2014. Tapered composite likelihood for spatial max-stable models. Spatial Statistics 8,86–103.Stein, M., 1992. Prediction and Inference for Truncated Spatial Data. Journal of Computational and GraphicalStatistics 1, 91–110.Stein, M., Chi, Z., Welty, L., 2004. Approximating likelihoods for large spatial data sets. Journal of the RoyalStatistical Society B 66, 275–296.Varin, C., Reid, N., Firth, D., 2011. An overview of composite likelihood methods. Statistica Sinica 21, 5–42.Vecchia, A., 1988. Estimation and model identiﬁcation for continuous spatial processes. Journal of the Royal Statis-tical Society B 50, 297–312.Xu, G., Genton, M. G., 2016a. Tukey g-and-h Random Fields. Journal of the American Statistical Association (Toappear).Xu, G., Genton, M. G., 2016b. Tukey max-stable processes for spatial extremes. Spatial Statistics 18, 431 – 443.Zhang, H., 2004. Inconsistent estimation and asymptotically equal interpolations in model-based geostatistics. Journalof the American Statistical Association 99 (465), 250–261.Zhang, H., El-Shaarawi, A., 2010. On spatial skew-Gaussian processes and applications. Environmetrics 21 (1), 33–47.Zhang, H., Wang, Y., 2010. Kriging and cross-validation for massive spatial data. Environmetrics 21 (3-4), 290–304.