Estimating covariance functions of multivariate skew-Gaussian random fields on the sphere
Alfredo Alegría, Sandra Caro, Moreno Bevilacqua, Emilio Porcu, Jorge Clarke
EEstimating covariance functions of multivariate skew-Gaussian randomfields on the sphere
Alegr´ıa A. ∗ a , Caro S. b , Bevilacqua M. c , Porcu E. a , Clarke J. a a Departamento de Matem´atica, Universidad T´ecnica Federico Santa Mar´ıa, Valpara´ıso, Chile b Departamento de Matem´atica y Computaci´on, Universidad de Santiago, Santiago, Chile c Instituto de Estad´ıstica, Universidad de Valpara´ıso, Valpara´ıso, Chile
Abstract
This paper considers a multivariate spatial random field, with each component having univariatemarginal distributions of the skew-Gaussian type. We assume that the field is defined spatially onthe unit sphere embedded in R , allowing for modeling data available over large portions of planetEarth. This model admits explicit expressions for the marginal and cross covariances. However,the n -dimensional distributions of the field are difficult to evaluate, because it requires the sum of2 n terms involving the cumulative and probability density functions of a n -dimensional Gaussiandistribution. Since in this case inference based on the full likelihood is computationally unfeasible,we propose a composite likelihood approach based on pairs of spatial observations. This last beingpossible thanks to the fact that we have a closed form expression for the bivariate distribution.We illustrate the effectiveness of the method through simulation experiments and the analysis ofa real data set of minimum and maximum surface air temperatures. Keywords:
Composite likelihood, Geodesic distance, Global data
1. Introduction
The Gaussian assumption is an appealing option to model spatial data. First, the Gaussian dis-tribution is completely characterized by its first two moments. Another interesting property is thetractability of the Gaussian distribution under linear combinations and conditioning. However, inmany geostatistical applications, including oceanography, environment and the study of naturalresources, the Gaussian framework is unrealistic, because the observed data have different features,such as, positivity, skewness or heavy tails, among others.Transformations of Gaussian random fields (RFs) is the most common alternative to model non-Gaussian fields. Consider a spatial domain D and { Z ( s ) , s ∈ D} defined as Z ( s ) = ϕ ( Y ( s )), where ϕ is a real-valued mapping and { Y ( s ) , s ∈ D} is Gaussian. Apparently the finite-dimensionaldistributions of Z depend on the choice of ϕ . In some cases, such mapping is one-to-one andadmits an inverse simplifying the analysis. In this class, we can highlight the log-Gaussian RFs, ∗ Corresponding author. Email: [email protected]
Preprint submitted to Elsevier July 27, 2017 a r X i v : . [ m a t h . S T ] J u l hich are generated as a particular example of the Box-Cox transformation (see De Oliveira et al.,1997). However, a one-to-one transformation is not appropriate in general. For instance, Stein(1992) considers a truncated Gaussian RF, taking ϕ as an indicator function, in order to modeldata sets with a given percentage of zeros (for instance, precipitations). On the other hand,wrapped-Gaussian RFs have been introduced in the literature for modeling directional spatialdata, arising in the study of wave and wind directions (see Jona-Lasinio et al., 2012). In addition,Xu and Genton (2016a) propose a flexible class of fields named the Tukey g-and-h RFs.Another approach consists in taking advantage of the stochastic representations of random vari-ables. For instance, Ma (2009) considers a general approach to construct elliptically contouredRFs through mixtures of Gaussian fields. These models have an explicit covariance structure andallow a wide range of finite-dimensional distributions. Other related works have been developed byDu et al. (2012), Ma (2013a) and Ma (2013b) including Hyperbolic, K and Student’s t distributedfields. Moreover, Kim and Mallick (2004), Gualtierotti (2005) and Allard and Naveau (2007) haveintroduced skew-Gaussian RFs for modeling data with skewed distributions. However, Minozzoand Ferracuti (2012) and Genton and Zhang (2012) show that all these models are not valid be-cause they cannot be identified with probability one using a single realization, i.e., in practice itis impossible to make inference on the basis of these models. Such results do not prevent theexistence of RFs having univariate marginal distributions belonging to a given family.In this paper, we consider multivariate stationary RFs, where each component has a univariatemarginal distribution of the skew-Gaussian type. We follow the representation proposed in theunivariate case by Zhang and El-Shaarawi (2010) and extend it to the multivariate case. This con-struction allows for modeling data with different degrees of skewness as well as explicit expressionsfor the covariance function. Estimation methods for this model are still unexplored. Maximumlikelihood is certainly a useful tool, but it is impracticable, because the full likelihood does nothave a simple form. Indeed, if n is the number of observations, it can be explicitly expressed as thesum of 2 n terms depending on the probability density function (pdf) and the cumulative distribu-tion function (cdf) of the n -variate Gaussian distribution. Direct maximization of the likelihoodseems intractable from a computational and analytical point of view. Zhang and El-Shaarawi(2010) consider the EM algorithm to perform inference on the skew-Gaussian model. However, theiterations of the EM algorithm are difficult to evaluate because each step requires Monte Carlosimulations of a non-trivial conditional expectation. On the other hand, composite likelihood (CL)is an estimation procedure (Lindsay, 1988; Varin et al., 2011; Cox and Reid, 2004) based on thelikelihood of marginal or conditional events. CL methods are an attractive option when the fulllikelihood is difficult to write and/or when the data sets are large. This approach has been used inseveral spatial and space-time contexts, mainly in the Gaussian case (Vecchia, 1988; Curriero andLele, 1999; Stein et al., 2004; Bevilacqua et al., 2012; Bevilacqua and Gaetan, 2015; Bevilacquaet al., 2016). Outside the Gaussian scenario, Heagerty and Lele (1998) propose CL inference forbinary spatial data. Moreover, Padoan et al. (2010) and Sang and Genton (2014) have used CLmethods for the estimation of max-stable fields, whereas Alegr´ıa et al. (2016) consider a truncatedCL approach for wrapped-Gaussian fields.The implementation of the CL method on multivariate skew-Gaussian fields is still unexplored. Ourgoal consists in developing a CL approach based on pairs of observations for a multivariate skew-Gaussian RF. Our contribution provides a fast and accurate tool to make inference on skeweddata. The main ingredient of the pairwise CL method is the characterization of the bivariate2istributions of the RF, that is, we derive a closed form expression for the joint distributionbetween two correlated skew-Gaussian random variables (possibly with different means, variancesand degrees of skewness).In addition, in order to work with data collected over the whole planet Earth, we consider thespatial domain as the unit sphere D = S := { s ∈ R : (cid:107) s (cid:107) = 1 } , where (cid:107) · (cid:107) denotes the Euclideandistance. We refer the reader to Marinucci and Peccati (2011) for a more detailed study aboutRFs on spheres. An important implication is that the covariance function depends on a differentmetric, called geodesic distance. In general, covariance models valid on Euclidean spaces are notvalid on the sphere, and we refer the reader to Gneiting (2013) for a overview of the problem.The paper is organized as follows. In Section 2, the multivariate skew-Gaussian RF is introduced.The bivariate distributions of the skew-Gaussian field and the CL approach are discussed in Section3. In section 4, simulation experiments are developed. Section 5 contains a real application for abivariate data set of minimum and maximum surface air temperatures. Finally, Section 6 providesa brief discussion.
2. Skew-Gaussian RFs on the unit sphere
In this section, we introduce a skew-Gaussian model generated as a mixture of two latent Gaus-sian RFs. Such construction is based on the stochastic representation of skew-Gaussian randomvariables. Let X and Y be two independent standard Gaussian random variables and − ≤ δ ≤ Z = δ | X | + (cid:112) − δ Y (1)is called skew-Gaussian, with pdf 2 φ ( z )Φ( αz ), for z ∈ R , where α = δ/ √ − δ . Here, φ ( · ) andΦ( · ) denote the pdf and cdf of the standard Gaussian distribution. If δ >
0, we say that Z isright-skewed, whereas for δ < Z is left-skewed. Of course, for δ = 0 we have the Gaussiancase. For a detailed study of the skew-Gaussian distribution, we refer the reader to Azzalini (1985,1986), Azzalini and Dalla Valle (1996), Azzalini and Capitanio (1999), Arellano-Valle and Azzalini(2006) and Azzalini (2013).We work with the spatial counterpart of Equation (1). Let { X ( s ) = ( X ( s ) , . . . , X m ( s )) (cid:62) : s ∈ S } and { Y ( s ) = ( Y ( s ) , . . . , Y m ( s )) (cid:62) : s ∈ S } be two stationary multivariate Gaussian RFs definedon S . Here, m ∈ N denotes the number of components of the fields and (cid:62) is the transposeoperator. In addition, we assume that X ( s ) and Y ( s ) are independent, with components havingzero mean and unit variance. In the spherical framework, the covariances are given in terms of thegeodesic distance, defined by θ := θ ( s , s ) = arccos( s (cid:62) s ) ∈ [0 , π ] , s , s ∈ S , which is the most natural metric on the spherical surface. Therefore, we suppose that there existstwo matrix-valued mappings r x , r y : [0 , π ] → R m × m such thatcov { X i ( s ) , X j ( s (cid:48) ) } = r xij ( θ ( s , s (cid:48) )) and cov { Y i ( s ) , Y j ( s (cid:48) ) } = r yij ( θ ( s , s (cid:48) )) , for all s , s (cid:48) ∈ S and i, j = 1 , . . . , m . In such case, we say that the covariance function is sphericallyisotropic. In the univariate case ( m = 1), Gneiting (2013) establishes that some classical covariances3uch as the Cauchy, Mat´ern, Askey and Spherical models, given in the classical literature in termsof Euclidean metrics, can be coupled with the geodesic distance under specific constraints for theparameters. Furthermore, Porcu et al. (2016) propose several covariance models for space-timeand multivariate RFs on spherical spatial domains.Next, we define a multivariate spatial RF with each component being skew-Gaussian distributed,according to Equation (1). It is a multivariate extension of the univariate skew-Gaussian fieldproposed by Zhang and El-Shaarawi (2010). This model allows different means, variances andskewness in the components of the RF. Definition 2.1.
A multivariate field, { Z ( s ) = ( Z ( s ) , . . . , Z m ( s )) (cid:62) : s ∈ S } , with componentshaving skew-Gaussian marginal distributions, can be defined through Z i ( s ) = µ i + η i | X i ( s ) | + σ i Y i ( s ) , s ∈ S , i = 1 , . . . , m, (2)where µ i , η i ∈ R and σ i ∈ R + . Note that ( Z i ( s ) − µ i ) / (cid:113) η i + σ i , for all s ∈ S , follows a skew-Gaussian distribution with pdf given by f Z i ( z ) = 2 φ ( z ) Φ (( η i /σ i ) z ) . Remark 2.1.
Recent literature considers the latent fields X i ( s ), i = 1 , . . . , m , as single randomvariables X i , being constants along the spatial domain. However, this approach has apparent iden-tifiability problems, since in practice we only work with one realization and there is no informationabout the variability of X i . Thus, this approach only produces a shift effect in the model. Theseconsiderations are studied by Minozzo and Ferracuti (2012) and Genton and Zhang (2012).Direct application of the results given by Zhang and El-Shaarawi (2010) provides the followingproposition. Proposition 2.1.
The field Z ( s ) defined through Equation (2) is stationary with expectations E ( Z i ( s )) = µ i + η i (cid:114) π , s ∈ S , and covariances C ij ( θ ) := cov { Z i ( s ) , Z j ( s (cid:48) ) } = 2 η i η j π g ( r xij ( θ )) + σ i σ j r yij ( θ ) , (3)for all i, j = 1 , . . . , m , where θ = θ ( s , s (cid:48) ) and g ( t ) = √ − t + t arcsin( t ) −
1, for | t | ≤
3. Composite likelihood estimation
We first introduce the CL approach from a general point of view. CL methods (Lindsay, 1988)are likelihood approximations for dealing with large data sets. In addition, in the last years,these techniques have been used to study data with intractable analytical expressions for the fulllikelihood. The objective function for CL methods is constructed through the likelihood of marginalor conditional events. Formally, let f ( z ; λ ) be the pdf of a n -dimensional random vector, where4 ∈ Λ ⊂ R p is an unknown parameter vector, and Λ is the parametric space. We denote by {A , ..., A K } a set of marginal or conditional events with associated likelihoods L k ( λ ; z ). Then,the objective function for the composite likelihood method is defined as the weighted product L C ( λ ; z ) := K (cid:89) k =1 L k ( λ ; z ) w k , where the non-negative weights w k must be chosen according to an appropriate criterion. Inprinciple, the weights can improve the statistical and/or computational efficiency of the estimation.We use the notation (cid:96) k ( λ ; z ) = log L k ( λ ; z ), thus the log composite likelihood is given byCL( λ ; z ) := K (cid:88) k =1 (cid:96) k ( λ ; z ) w k . The maximum CL estimator is defined as (cid:98) λ n = argmax λ ∈ Λ CL( λ ; z ) . By construction, the composite score ∇ CL( λ ) is an unbiased estimating equation, i.e., E ( ∇ CL( λ )) = ∈ R p . This is an appealing property of CL methods, since it is a first order likelihood property.On the other hand, the second order properties are related to the Godambe information matrix,defined as G n ( λ ) = H n ( λ ) J n ( λ ) − H n ( λ ) (cid:62) , where H n ( λ ) = − E ( ∇ CL( λ ; z )) and J n ( λ ) = E ( ∇ CL( λ ; z ) ∇ CL( λ ; z ) (cid:62) ). The inverse of G n ( λ ) isan approximation of the asymptotic variance of the CL estimator. Under increasing domain andregularity assumptions, CL estimates are consistent and asymptotically Gaussian. We now develop a CL method based on pairs of observations for the multivariate skew-GaussianRF. We consider the m -variate field, { Z ( s ) = ( Z ( s ) , . . . , Z m ( s )) (cid:62) , s ∈ S } , defined in (2), anda realization of Z ( s ) at n spatial locations, namely, ( Z ( s ) (cid:62) , . . . , Z ( s n ) (cid:62) ) (cid:62) . Then, we define allpossible pairs Z ijkl = ( Z i ( s k ) , Z j ( s l )) (cid:62) with associated log likelihood (cid:96) ijkl ( λ ), where λ ∈ R p is theparameter vector. Therefore, the corresponding log composite likelihood equation is defined by(see Bevilacqua et al., 2016)CL( λ ) = m (cid:88) i =1 n − (cid:88) k =1 n (cid:88) l = k +1 ω iikl (cid:96) iikl ( λ ) + m − (cid:88) i =1 m (cid:88) j = i +1 n (cid:88) k =1 n (cid:88) l =1 ω ijkl (cid:96) ijkl ( λ ) . Note that the order of computation of the method is O{ mn ( n − / m ( m − n / } . Throughout,we consider a cut-off weight function (0/1 weights): w ijkl = 1 if θ ( s k , s l ) ≤ d ij , and 0 otherwise,for some cut-off distances d ij . This choice has apparent computational advantages. Moreover, itcan improve the efficiency as it has been shown in Joe and Lee (2009), Davis and Yau (2011) andBevilacqua et al. (2012). The intuition behind this approach is that the correlations between pairsof distant observations are often nearly zero. Therefore, using all possible pairs can generate a lossof efficiency, since redundant pairs can produce bias in the results.5rom now on, we use the notation Ω( r ) = (cid:18) rr (cid:19) . (4)The following result characterizes the pairwise distributions for the multivariate skew-GaussianRF. We suppose that η i (cid:54) = 0, for all i = 1 , . . . , m . The case with zero skewness is reduced to theGaussian scenario. Proposition 3.1.
Consider two sites s k , s l ∈ S and θ = θ ( s k , s l ). The log likelihood associatedto the pair Z ijkl = ( Z i ( s k ) , Z j ( s l )) (cid:62) in the multivariate skew-Gaussian model (2) is given by (cid:96) ijkl ( λ ) = log (cid:32) (cid:88) t =1 φ ( Z ijkl − µ ; A t )Φ ( L t ; B t ) (cid:33) (5)where φ ( y ; Σ) denotes the bivariate Gaussian density function with zero mean and covariancematrix Σ. Similarly, Φ ( l ; Σ) denotes the corresponding Gaussian cdf. Here, µ = ( µ i , µ j ) (cid:62) ,A t = Ω + Υ − Ω[( − t r xij ( θ )]Υ − ,B t = ([ΥΩ Υ] − + Ω[( − t r xij ( θ )] − ) − ,L t = (cid:20) I + ΥΩ ΥΩ[( − t r xij ( θ )] − (cid:21) − Υ( Z ijkl − µ ) , where Υ = diag { /η i , /η j } , I is the identity matrix of order (2 ×
2) andΩ = (cid:18) σ i σ i σ j σ i σ j σ j (cid:19) ◦ Ω( r yij ( θ )) , where ◦ denotes the Hadamard product.We have deduced a closed form expression for the bivariate distributions of the field. Note thatthe correlation function r xij alternates its sign in each element of the sum. Evaluation of Equation(5) requires the numerical calculation of the bivariate Gaussian cdf. The proof of Proposition 3.1is deferred to Appendix A.
4. Simulation study
This section assesses through simulation experiments the statistical and computational performanceof the pairwise CL method. We pay attention to bivariate ( m = 2) skew-Gaussian RFs on S . We believe that there are no strong arguments to consider different correlation structures r x ( · )and r y ( · ) for the latent RFs. For example, the smoothness of the skew-Gaussian RF is the sameas the smoothness of the roughest latent Gaussian field. Moreover, if both latent correlations are6ompactly supported, thus the covariance generated has also compact support. We thus considerlatent fields belonging to the same parametric family of correlation functions, r xij ( θ ) = ρ xij r ( θ ; c xij ) , r yij ( θ ) = ρ yij r ( θ ; c yij ) , i, j = 1 , , θ ∈ [0 , π ] , where ρ xii = 1, ρ x = ρ x , | ρ x | ≤ c xij > ρ yij and c yij ), withmapping θ (cid:55)→ r ( θ ; c ) being any univariate correlation function on the sphere (see Gneiting, 2013).The particular choice of r ( · ; · ) produces additional restrictions on the parameters. Throughout, wework with r ( θ ; c ) = exp (cid:18) − θc (cid:19) , θ ∈ [0 , π ] , (6)or r ( θ ; c ) = (cid:18) − θc (cid:19) , θ ∈ [0 , π ] , (7)where c > a ) + = max { , a } . Mappings (6) and (7) are known asExponential and Askey models, respectively. The former decreases exponentially to zero and ittakes values less than 0 .
05 for θ > c , whereas the second is compactly supported, that is, itis identically equal to zero beyond the cut-off distance c . Explicit parametric conditions for thevalidity of the bivariate Exponential model are provided by Porcu et al. (2016). On the other hand,Appendix B illustrates a construction principle that justify the use of bivariate Askey models onspheres.An interesting property is that the collocated correlation coefficient between the components ofa bivariate skew-Gaussian RF, C (0) / (cid:112) C (0) C (0), with C ( · ) defined in (3), depends on themajority of the model parameters. Figure 1 shows the behavior of this coefficient in terms of thelatent correlation coefficients ρ xij and ρ yij , with σ = σ = 1 and under two different settings for theskewness parameters: ( η , η ) = (1 ,
3) and ( η , η ) = (1 , − ρ := ρ x = ρ y , c ij := c xij = c yij and c = ( c + c ) /
2. The parameter vector is given by λ = ( σ , σ , η , η , c , c , ρ , µ , µ ) (cid:62) . In addition, this parameterization avoids identifiabilityproblems.Figure 2 shows the covariance C ( θ ), in Equation (3), generated from the latent correlation func-tions (6) and (7). The skew-Gaussian RF preserves the correlation shape of the latent fields.Figure 3 depicts a bivariate realization of a skew-Gaussian RF, over 15000 spatial locations, withlatent fields having Exponential correlations. We have simulated using Cholesky decompositionwith σ = 0 . σ = 0 . η = 2, η = 1, µ = µ = 0, ρ = 0 . c = c = 0 .
6. The skewnessof the simulated data is illustrated through the corresponding histograms.
We illustrate the saving of the pairwise CL method in terms of computational burden. All ex-periments were carried out on a 2.7 GHz processor with 8 GB of memory and the estimationprocedures were implemented coupling R functions and C routines. Table 1 provides the com-putational times (in seconds) in evaluating the weighted CL method, with cut-off distance equal7 igure 1: Collocated correlation coefficient between the components of a bivariate skew-Gaussian RF in terms of ρ xij and ρ yij . We consider σ = σ = 1 and two scenarios for the skewness parameters: ( η , η ) = (1 ,
3) (left) and( η , η ) = (1 , −
3) (right).Figure 2: Covariance function associated to the skew-Gaussian RF, with latent correlations of Exponential (solidline) and Askey (dashed line) types.Figure 3: Bivariate simulation from the skew-Gaussian model with latent fields having Exponential correlationfunctions. Both variables are right-skewed and the empirical collocated correlation coefficient is approximately 0.7. able 1: Time (in seconds) in evaluating CL method, with 0/1 weights, considering different number of observationsand cut-off distances d ij = 0 . , . , . , Number of observations250 500 1000 2000 4000 8000 16000 d ij = 0 .
25 0.003 0.007 0.016 0.068 0.254 0.954 3.753 d ij = 0 . d ij = 0 .
75 0.007 0.017 0.050 0.205 0.796 3.020 12.024 d ij = 1 0.008 0.022 0.066 0.290 1.139 4.357 17.365to d ij = 0 . , . , . , i, j = 1 ,
2. These results show that CL has a moderatecomputational cost even for large data sets. Indeed, the most demanding part in the evaluation ofthe objective function is the repeated numerical calculation of the bivariate Gaussian cdf.We now study the statistical efficiency of the estimation method. We consider 289 spatial sitesin a grid on S , which is generated with 17 equispaced longitude and latitude points. We use thelatent correlation functions (6) and (7), with σ = σ = 1, µ = µ = 0, c = 0 . c = 0 .
25 and ρ = 0 .
5, under the following choices for the skewness parameters:(A) We set η = 1 and η = 2. In this case, both components are right-skewed and the collocatedcorrelation coefficient between the components of the field is approximately 0 . η = 1 and η = −
2. The first component of the field is right-skewed, whereas thesecond one is left-skewed, and the collocated correlation coefficient between the componentsof the field is approximately 0 . • Scenario (I) . Exponential model under choice (A). • Scenario (II) . Askey model under choice (A). • Scenario (III) . Exponential model under choice (B). • Scenario (IV) . Askey model under choice (B).For each scenario, we simulate 500 independent realizations from the bivariate skew-Gaussian RF.Then, we estimate the parameters using the weighted pairwise CL method. We set d ij = 0 . i, j = 1 ,
2. Figure 4 reports the boxplots of the CL estimates. All studiesshow the effectiveness of our proposal. We have also applied CL estimation to other parametricmodels, such as, the Cauchy and Wendland correlation functions. For each case considered thepairwise CL performs well.Finally, we assess the performance of the pairwise CL method with increasing scale parameters c ij as well as increasing sample sizes n . For simplicity, all the subsequent experiments considera single scale parameter c ij = c , for all i, j = 1 ,
2. In this case, the parameter vector reducesto λ = ( σ , σ , η , η , c, ρ , µ , µ ) (cid:62) . We consider an Exponential latent correlation under thefollowing parametric setting: σ = σ = η = 1, η = 2, µ = µ = 0 and ρ = 0 .
5. Figure 5reports the centered boxplots of the CL estimates in three different cases: c = 0 . , . , .
75. Theincrease of the parameter c imply that the spatial dependence will be strengthened, and it produces9 igure 4: Boxplots of the CL estimates for the bivariate skew-Gaussian RF, under Scenarios (I)-(IV).Figure 5: Centered boxplots of the CL estimates, for the bivariate skew-Gaussian RF, using an Exponential latentcorrelation and different scale parameters: c = 0 . , . , . igure 6: Centered boxplots of the CL estimates, for the bivariate skew-Gaussian RF, using an Exponential latentcorrelation and different sample sizes: n = 81 , , biased estimates of c and ρ . Our findings are consistent and add more evidence to the resultsreported in the previous literature (Zhang, 2004; Bevilacqua et al., 2012; Xu and Genton, 2016b).On the other hand, we set c = 0 .
15, and we consider increasing sample sizes: n = 81 , ,
5. A bivariate data set
We analyze a bivariate data set of Minimum (Variable 1) and Maximum (Variable 2) surface airtemperatures. The spatial variability of temperatures is crucial for modeling hydrological andagricultural phenomena. These data outputs come from the Community Climate System Model(CCSM4.0) (see Gent et al., 2011) provided by NCAR (National Center for Atmospheric Research)located at Boulder, CO, USA.We have monthly data over a grid of 2 . × . − η i = 0, for i = 1 ,
2. We haveconsidered the parameterization introduced in the previous sections, so that, the skew-Gaussian11 igure 7: Residuals of the Minimum (left) and Maximum (right) surface air temperatures in July of 2015. model has 9 parameters, whereas the Gaussian model has 7 parameters. The CL estimation iscarried out using only pairs of observations whose spatial distances are less than 1592.75 kilometers(equivalent to 0 .
25 radians on the unit sphere). Table 2 reports the CL estimates for the skew-Gaussian and Gaussian models. The units of the scale parameters are kilometers.The optimal values of the CL objective functions are given in Table 3. Note that the maximumCL value under the skew-Gaussian model is superior to the merely Gaussian model. It is clearthat the incorporation of skewness produces improvements in goodness-of-fit. Figure 8 shows thehistograms of each variable and the fitted skew-Gaussian and Gaussian density functions. In Figure9, the marginal and cross empirical semi-variograms are compared to the theoretical models.Finally, we compare both models in terms of their predictive performance. Since the covariancestructure of the skew-Gaussian field is known explicitly, we use the classical best linear unbiasedpredictor (cokriging), which is optimal for the Gaussian model, in terms of mean squared error.However, it is not optimal for the skew-Gaussian RF. In spite of this, we will show that the skew-Gaussian model provides better predictive results. We use a drop-one prediction strategy andquantify the discrepancy between the real and predicted values through the root mean squaredprediction error (RMSPE) RMSPE = (cid:118)(cid:117)(cid:117)(cid:116) n (cid:88) i =1 n (cid:88) k =1 ( Z i ( s k ) − (cid:98) Z i ( s k )) and the Log-score (LSCORE)LSCORE = 12 n (cid:88) i =1 n (cid:88) k =1 (cid:34) log(2 π (cid:98) σ i ( s k ))2 + ( Z i ( s k ) − (cid:98) Z i ( s k )) (cid:98) σ i ( s k ) (cid:35) , where n is the number of spatial locations, (cid:98) Z i ( s k ) is the drop-one prediction of Z i ( s k ) at location s k and (cid:98) σ i ( s k ) is the drop-one prediction variance (see Zhang and Wang, 2010). Note that the12 igure 8: Histograms for the residuals of the Minimum (left) and Maximum (right) surface air temperatures, consid-ering observations with latitudes between −
30 and 30 degrees, and the fitted skew-Gaussian (solid line) and Gaussian(dashed line) probability density functions. skew-Gaussian model generates better results since the mentioned indicators are smaller. In termsof RMSPE, the improvement in the prediction is approximately 3 .
6. Discussion
Building models for non-Gaussian RFs has become a major challenge and more efforts should bedevoted to such constructions. In particular, it seems that the main difficulties arise when trying tobuild models that are statistically identifiable. Another major problem, on the other hand, comeswhen building the finite dimensional distributions, which are analytically intractable in most cases.This paper has provided an approach that allows to avoid the identifiability problem in multivariateskew-Gaussian RFs, and that permits to implement a CL approach.
Figure 9: Empirical semi-variograms versus fitted semi-variograms, using Exponential latent correlations, for theskew-Gaussian (solid line) and Gaussian (dashed line) models. able 2: CL estimates for the skew-Gaussian and Gaussian RFs, using Exponential latent correlations. Scaleparameters are given in kilometers. (cid:98) σ (cid:99) σ (cid:98) η (cid:98) η (cid:98) c (cid:98) c (cid:98) ρ (cid:98) µ (cid:98) µ skew-Gaussian 0.223 0.179 0.487 0.769 4995.9 4867.1 0.819 0.250 0.079Gaussian 0.310 0.419 - - 3922.4 3393.1 0.743 0.635 0.674 Table 3: Prediction performance of the skew-Gaussian and Gaussian RFs, using Exponential latent correlations. − − . − Acknowledgments
Alfredo Alegr´ıa is supported by
Beca CONICYT-PCHA/Doctorado Nacional/2016-21160371 . MorenoBevilacqua is partially supported by
Proyecto Fondecyt 1160280 . Emilio Porcu is supported by
Proyecto Fondecyt Regular number 1130647 . Jorge Clarke is supported by
Proyecto Fondecyt Post-Doctorado number 3150506.
We also acknowledge the World Climate Research Programme’s Working Group on Coupled Mod-elling, which is responsible for Coupled Model Intercomparison Project (CMIP).
Appendix A. Pairwise distributions for the multivariate skew-Gaussian RF
Before we state the proof of Proposition 3.1, we need the following property of quadratic forms.
Lemma
Let A and B be two symmetric positive definite matrices of order ( n × n ) and x , a ∈ R n .Then, we have the following identity:( a − x ) (cid:62) A − ( a − x ) + x (cid:62) B − x = ( x − c ) (cid:62) ( A − + B − )( x − c ) + a (cid:62) ( A + B ) − a , (A.1)where c = ( A − + B − ) − A − a . 14 roof of Proposition 3.1 Consider W = ( | X | , | X | ) (cid:62) , V = ( Y , Y ) (cid:62) , µ = ( µ , µ ) (cid:62) , η =( η , η ) (cid:62) and σ = ( σ , σ ) (cid:62) , where ( X , X ) (cid:62) ∼ N ( , Ω( r x )) and ( Y , Y ) (cid:62) ∼ N ( , Ω( r y )) areindependent, with Ω( r ) as defined in (4). Let Z = ( Z , Z ) (cid:62) defined through Z = µ + η ◦ W + σ ◦ V , where ◦ denotes the Hadamard product. Therefore, the joint probability density function of Z isgiven by f Z ( z ) = (cid:90) R f Z | ( W = w ) ( z | w ) f W ( w )d w , (A.2)Here, f W ( w ) is the pdf of the random vector W and w = ( w , w ) (cid:62) . Note that the cdf of therandom vector W can be written as F W ( w , w ) = Φ ( w , w ; Ω( r x )) − Φ ( − w , w ; Ω( r x )) − Φ ( w , − w ; Ω( r x ))+Φ ( − w , − w ; Ω( r x ))Then, we can obtain the pdf of W , f W ( w , w ) = 2 (cid:18) φ ( w , w ; Ω( r x )) + φ ( − w , w ; Ω( r x )) (cid:19) , = 2 (cid:18) φ ( w , w ; Ω( r x )) + φ ( w , w ; Ω( − r x )) (cid:19) . On the other hand, f Z | W = w ( z | w ) is the pdf of the random vector Z | ( W = w ) ∼ N ( µ + η ◦ w ; Ω ),with Ω = (cid:18) σ σ σ σ σ σ (cid:19) ◦ Ω( r y ) . Therefore, evaluation of the integral (A.2) requires the characterization of an integral of the form I = (cid:90) R φ ( z − µ − η ◦ w ; Ω ) φ ( w ; Ω )d w , with Ω being Ω( r x ) or Ω( − r x ). Moreover, I can be written as I = | Υ | (cid:90) R φ (Υ( z − µ ) − w ; ΥΩ Υ) φ ( w ; Ω )d w where Υ = diag { /η , /η } . Thus, using Equation (A.1) we have I = | Υ | φ (cid:18) Υ( z − µ ); ΥΩ Υ + Ω (cid:19) (cid:90) R φ (cid:18) w − L ; ([ΥΩ Υ] − + Ω − ) − (cid:19) d w , = | Υ | φ (cid:18) Υ( z − µ ); ΥΩ Υ + Ω (cid:19) Φ ( L ; ([ΥΩ Υ] − + Ω − ) − )= φ (cid:18) z − µ ; Ω + Υ − Ω Υ − (cid:19) Φ ( L ; ([ΥΩ Υ] − + Ω − ) − )where L = (cid:20) I + ΥΩ ΥΩ − (cid:21) − Υ( z − µ ). 15 ppendix B. Multivariate Askey model on the sphere We consider a multivariate Askey model for the sphere S , defined according to ρ ij r ( θ ; c ij ) = ρ ij (cid:18) − θc ij (cid:19) , θ ∈ [0 , π ] , i, j = 1 , . . . , m, which has been used through Sections 4 and 5, considering c ij = ( c ii + c jj ) /
2. We claim that sucha model is positive definite using the following mixture (see Daley et al., 2015) r ( θ ; c ij ) ∝ (cid:90) ∞ r ( θ ; ξ ) ξ r ( ξ ; c ij )d ξ, θ ∈ [0 , π ] . The results of Gneiting (2013) for the univariate Askey model, coupled with the conditions devel-oped by Daley et al. (2015) complete our assertion.
ReferencesReferences
Alegr´ıa, A., Bevilacqua, M., Porcu, E., 2016. Likelihood-based inference for multivariate space-time wrapped-Gaussian fields. Journal of Statistical Computation and Simulation 86 (13), 2583–2597.Allard, D., Naveau, P., 2007. A New Spatial Skew-Normal Random Field Model. Communications in Statistics -Theory and Methods 36 (9), 1821–1834.Arellano-Valle, R., Azzalini, A., 2006. On the unification of families of skew-normal distributions. ScandinavianJournal of Statistics 33, 561–574.Azzalini, A., 1985. A class of distributions which includes the normal ones. Scandinavian Journal of Statistics 12 (2),171–178.Azzalini, A., 1986. Further results on a class of distributions which includes the normal ones. Statistica 46 (2),199–208.Azzalini, A., 2013. The skew-normal and related families. Vol. 3. Cambridge University Press.Azzalini, A., Capitanio, A., 1999. Statistical applications of the multivariate skew normal distribution. Journal ofthe Royal Statistical Society: Series B (Statistical Methodology) 61 (3), 579–602.Azzalini, A., Dalla Valle, A., 1996. The multivariate skew-normal distribution. Biometrika 83 (4), 715–726.Bevilacqua, M., Alegria, A., Velandia, D., Porcu, E., 2016. Composite likelihood inference for multivariate Gaussianrandom fields. Journal of Agricultural, Biological, and Environmental Statistics 21 (3), 448–469.Bevilacqua, M., Gaetan, C., 2015. Comparing composite likelihood methods based on pairs for spatial Gaussianrandom fields. Statistics and Computing 25 (5), 877–892.Bevilacqua, M., Gaetan, C., Mateu, J., Porcu, E., 2012. Estimating space and space-time covariance functions forlarge data sets: a weighted composite likelihood approach. Journal of the American Statistical Association 107,268–280.Cox, D., Reid, N., 2004. Miscellanea: A note on pseudolikelihood constructed from marginal densities. Biometrika91, 729–37.Curriero, F., Lele, S., 1999. A composite likelihood approach to semivariogram estimation. Journal of Agricultural,Biological and Environmental Statistics 4, 9–28.Daley, D., Porcu, E., Bevilacqua, M., 2015. Classes of compactly supported covariance functions for multivariaterandom fields. Stochastic Environmental Research and Risk Assessment 29 (4), 1249–1263.Davis, R., Yau, C.-Y., 2011. Comments on pairwise likelihood in time series models. Statistica Sinica 21, 255–277.De Oliveira, V., Kedem, B., Short, D., 1997. Bayesian prediction of transformed Gaussian random fields. Journal ofthe American Statistical Association 92, 1422–1433.Du, J., Leonenko, N., Ma, C., Shu, H., 2012. Hyperbolic vector random fields with hyperbolic direct and crosscovariance functions. Stochastic Analysis and Applications 30 (4), 662–674. ent, P., Danabasoglu, G., Donner, L., Holland, M., Hunke, E., Jayne, S., Lawrence, D., Neale, R., Rasch, P.,Vertenstein, M., Worley, P., Yang, Z.-L., Zhang, M., 2011. The Community Climate System Model Version 4.Journal of Climate 24 (19), 4973–4991.Genton, M. G., Zhang, H., 2012. Identifiability problems in some non-Gaussian spatial random fields. Chilean Journalof Statistics 3 (2).Gneiting, T., 2013. Strictly and non-strictly positive definite functions on spheres. Bernoulli 19 (4), 1327–1349.Gualtierotti, A., 2005. Skew-normal processes as models for random signals corrupted by Gaussian noise. InternationalJournal of Pure and Applied Mathematics 20, 109–142.Heagerty, P., Lele, S., 1998. A composite likelihood approach to binary spatial data. Journal of the AmericanStatistical Association 93, 1099–1111.Joe, H., Lee, Y., 2009. On weighting of bivariate margins in pairwise likelihood. Journal of Multivariate Analysis100, 670–685.Jona-Lasinio, G., Gelfand, A., Jona-Lasinio, M., 12 2012. Spatial analysis of wave direction data using wrappedGaussian processes. The Annals of Applied Statistics 6 (4), 1478–1498.Kim, H., Mallick, B., 2004. A bayesian prediction using the skew Gaussian distribution. Journal of Statistical Planningand Inference 120 (1-2), 85–101.Lindsay, B., 1988. Composite likelihood methods. Contemporary Mathematics 80, 221–239.Ma, C., 2009. Construction of non-Gaussian random fields with any given correlation structure. Journal of StatisticalPlanning and Inference 139, 780–787.Ma, C., 2013a. K-distributed vector random fields in space and time. Statistics & Probability Letters 83 (4), 1143–1150.Ma, C., 2013b. Student’s t vector random fields with power-law and log-law decaying direct and cross covariances.Stochastic Analysis and Applications 31 (1), 167–182.Marinucci, D., Peccati, G., 2011. Random fields on the sphere: representation, limit theorems and cosmologicalapplications. Vol. 389. Cambridge University Press.Minozzo, M., Ferracuti, L., 2012. On the existence of some skew-normal stationary processes. Chilean Journal ofStatistics 3, 157.Padoan, S., Ribatet, M., Sisson, S., 2010. Likelihood-based inference for max-stable processes. Journal of the Amer-ican Statistical Association 105, 263–277.Porcu, E., Bevilacqua, M., Genton, M., 2016. Spatio-Temporal Covariance and Cross-Covariance Functions of theGreat Circle Distance on a Sphere. Journal of the American Statistical Association 111 (514), 888–898.Sang, H., Genton, M. G., 2014. Tapered composite likelihood for spatial max-stable models. Spatial Statistics 8,86–103.Stein, M., 1992. Prediction and Inference for Truncated Spatial Data. Journal of Computational and GraphicalStatistics 1, 91–110.Stein, M., Chi, Z., Welty, L., 2004. Approximating likelihoods for large spatial data sets. Journal of the RoyalStatistical Society B 66, 275–296.Varin, C., Reid, N., Firth, D., 2011. An overview of composite likelihood methods. Statistica Sinica 21, 5–42.Vecchia, A., 1988. Estimation and model identification for continuous spatial processes. Journal of the Royal Statis-tical Society B 50, 297–312.Xu, G., Genton, M. G., 2016a. Tukey g-and-h Random Fields. Journal of the American Statistical Association (Toappear).Xu, G., Genton, M. G., 2016b. Tukey max-stable processes for spatial extremes. Spatial Statistics 18, 431 – 443.Zhang, H., 2004. Inconsistent estimation and asymptotically equal interpolations in model-based geostatistics. Journalof the American Statistical Association 99 (465), 250–261.Zhang, H., El-Shaarawi, A., 2010. On spatial skew-Gaussian processes and applications. Environmetrics 21 (1), 33–47.Zhang, H., Wang, Y., 2010. Kriging and cross-validation for massive spatial data. Environmetrics 21 (3-4), 290–304.ent, P., Danabasoglu, G., Donner, L., Holland, M., Hunke, E., Jayne, S., Lawrence, D., Neale, R., Rasch, P.,Vertenstein, M., Worley, P., Yang, Z.-L., Zhang, M., 2011. The Community Climate System Model Version 4.Journal of Climate 24 (19), 4973–4991.Genton, M. G., Zhang, H., 2012. Identifiability problems in some non-Gaussian spatial random fields. Chilean Journalof Statistics 3 (2).Gneiting, T., 2013. Strictly and non-strictly positive definite functions on spheres. Bernoulli 19 (4), 1327–1349.Gualtierotti, A., 2005. Skew-normal processes as models for random signals corrupted by Gaussian noise. InternationalJournal of Pure and Applied Mathematics 20, 109–142.Heagerty, P., Lele, S., 1998. A composite likelihood approach to binary spatial data. Journal of the AmericanStatistical Association 93, 1099–1111.Joe, H., Lee, Y., 2009. On weighting of bivariate margins in pairwise likelihood. Journal of Multivariate Analysis100, 670–685.Jona-Lasinio, G., Gelfand, A., Jona-Lasinio, M., 12 2012. Spatial analysis of wave direction data using wrappedGaussian processes. The Annals of Applied Statistics 6 (4), 1478–1498.Kim, H., Mallick, B., 2004. A bayesian prediction using the skew Gaussian distribution. Journal of Statistical Planningand Inference 120 (1-2), 85–101.Lindsay, B., 1988. Composite likelihood methods. Contemporary Mathematics 80, 221–239.Ma, C., 2009. Construction of non-Gaussian random fields with any given correlation structure. Journal of StatisticalPlanning and Inference 139, 780–787.Ma, C., 2013a. K-distributed vector random fields in space and time. Statistics & Probability Letters 83 (4), 1143–1150.Ma, C., 2013b. Student’s t vector random fields with power-law and log-law decaying direct and cross covariances.Stochastic Analysis and Applications 31 (1), 167–182.Marinucci, D., Peccati, G., 2011. Random fields on the sphere: representation, limit theorems and cosmologicalapplications. Vol. 389. Cambridge University Press.Minozzo, M., Ferracuti, L., 2012. On the existence of some skew-normal stationary processes. Chilean Journal ofStatistics 3, 157.Padoan, S., Ribatet, M., Sisson, S., 2010. Likelihood-based inference for max-stable processes. Journal of the Amer-ican Statistical Association 105, 263–277.Porcu, E., Bevilacqua, M., Genton, M., 2016. Spatio-Temporal Covariance and Cross-Covariance Functions of theGreat Circle Distance on a Sphere. Journal of the American Statistical Association 111 (514), 888–898.Sang, H., Genton, M. G., 2014. Tapered composite likelihood for spatial max-stable models. Spatial Statistics 8,86–103.Stein, M., 1992. Prediction and Inference for Truncated Spatial Data. Journal of Computational and GraphicalStatistics 1, 91–110.Stein, M., Chi, Z., Welty, L., 2004. Approximating likelihoods for large spatial data sets. Journal of the RoyalStatistical Society B 66, 275–296.Varin, C., Reid, N., Firth, D., 2011. An overview of composite likelihood methods. Statistica Sinica 21, 5–42.Vecchia, A., 1988. Estimation and model identification for continuous spatial processes. Journal of the Royal Statis-tical Society B 50, 297–312.Xu, G., Genton, M. G., 2016a. Tukey g-and-h Random Fields. Journal of the American Statistical Association (Toappear).Xu, G., Genton, M. G., 2016b. Tukey max-stable processes for spatial extremes. Spatial Statistics 18, 431 – 443.Zhang, H., 2004. Inconsistent estimation and asymptotically equal interpolations in model-based geostatistics. Journalof the American Statistical Association 99 (465), 250–261.Zhang, H., El-Shaarawi, A., 2010. On spatial skew-Gaussian processes and applications. Environmetrics 21 (1), 33–47.Zhang, H., Wang, Y., 2010. Kriging and cross-validation for massive spatial data. Environmetrics 21 (3-4), 290–304.