[PDF] D-STEM v2: A Software for Modelling Functional Spatio-Temporal Data

Abstract

Functional spatio-temporal data naturally arise in many environmental and climate applications where data are collected in a three-dimensional space over time. The MATLAB D-STEM v1 software package was first introduced for modelling multivariate space-time data and has been recently extended to D-STEM v2 to handle functional data indexed across space and over time. This paper introduces the new modelling capabilities of D-STEM v2 as well as the complexity reduction techniques required when dealing with large data sets. Model estimation, validation and dynamic kriging are demonstrated in two case studies, one related to ground-level air quality data in Beijing, China, and the other one related to atmospheric profile data collected globally through radio sounding.

Full PDF

DD-STEM v2: A Software for Modelling FunctionalSpatio-Temporal Data

Yaqiong Wang

Peking University

Francesco Finazzi

University of Bergamo

Alessandro Fassò

University of Bergamo

Abstract

Functional spatio-temporal data naturally arise in many environmental and climate ap-plications where data are collected in a three-dimensional space over time. The

MATLAB

D-STEM v1 software package was ﬁrst introduced for modelling multivariate space-timedata and has been recently extended to

D-STEM v2 to handle functional data indexedacross space and over time. This paper introduces the new modelling capabilities of

D-STEM v2 as well as the complexity reduction techniques required when dealing with largedata sets. Model estimation, validation and dynamic kriging are demonstrated in two casestudies, one related to ground-level air quality data in Beijing, China, and the other onerelated to atmospheric proﬁle data collected globally through radio sounding.

Keywords : functional data analysis, 4D data, climate data, environmetrics, EM algorithm.

1. Introduction

With the increase of multidimensional data availability and modern computing power, sta-tistical models for spatial and spatio-temporal data are developing at a rapid pace. Hence,there is a need for stable and reliable, yet updated and eﬃcient, software packages. In thissection, we brieﬂy discuss multidimensional data in climate and environmental studies as wellas statistical software for space-time data.

Large multidimensional data sets often arise when climate and environmental phenomenaare observed at the global scale over extended periods. In climate studies, relevant physicalvariables are observed on a three-dimensional (3D) spherical shell (the atmosphere) whiletime is the fourth dimension. For instance, measurements are obtained by radiosondes ﬂy-ing from ground level up to the stratosphere (Fassò et al. et al. et al. × T data (4D for brevity), covariance func-tions deﬁned over the 4D support may be adopted. However, these covariance functions oftenhave a complex form (Porcu et al. a r X i v : . [ s t a t . M E ] J a n Modelling functional spatio-temporal data with D-STEM

Figure 1: Radio sounding data example. Each dot represents the spatial location of a mea-surement taken by a radiosonde. Dots of the same colour belong to the same radiosonde.(Pressure axis not in scale).making inferences, very large covariance matrices (though they may be sparse) are implied.In large climate and environmental applications, 4D data are rarely collected at high fre-quency in all spatial and temporal dimensions. Often, only one dimension is sampled athigh frequency while the remaining dimensions are sampled sparsely. Radiosonde data, forinstance, are sparse over the Earth’s sphere, but they are dense along the vertical dimension,providing atmospheric proﬁles. This suggests that handling all spatial dimensions equally(e.g. using a 3D covariance function) may not be the best option from a modelling or compu-tational perspective, and a data reduction technique may be useful instead. In this paper, thefunctional data analysis (FDA) approach (Ramsay and Silverman 2007) is adopted to modelthe relationship between measurements along the proﬁle, while the remaining dimensions arehandled following the classic spatio-temporal data modelling approach using only 2D spatialcovariance functions.

Various software programmes are available for considering data on a plane or in a two-dimensional (2D) Euclidean space. The choice is more restricted when considering mul-tidimensional or non-Euclidean spaces arising from atmospheric or remote sensing spatio-temporal data observed on the surface of a sphere and over time.For example, Figure 1 depicts the spatial locations of measurements collected globally in asingle day through radio sounding, as discussed in Section 5. Space is three-dimensional, andmeasurements are repeated over time at the same spatial locations over the Earth’s surfacebut at diﬀerent pressure values.The spBayes package (Finley et al. spacetime (Pebesma 2012) and gstat (Pebesma andHeuvelink 2016) packages does not explicitly address the multidimensional case, but, ac-cording to Gasch et al. (2015), both packages have some capabilities to handle the 3D × T. aqiong Wang, Francesco Finazzi, and Alessandro Fassò R package FRK (Zammit-Mangion 2018) handles spatial and spatio-temporal data bothon the Euclidean plane and on the surface of the sphere.

FRK implements a set of toolsfor data gridding and basis function computation, resulting in eﬃcient dimension reduction,allowing it to handle large satellite data sets (Cressie 2018). It is based on a spatio-temporalrandom eﬀects (SRE) model estimated by the expectation-maximisation (EM) algorithm. Re-cent extensions to

FRK include the use of multi-resolution basis functions (Tzeng and Huang2018).A second package based on SRE and the EM algorithm is

D-STEM v1 (Finazzi and Fassò2014). This package implements an eﬃcient state-space approach for handling the temporaldimension and a heterotopic multivariate response approach that is useful when correlatingheterogeneous networks (Fassò and Finazzi 2011; Calculli et al.

D-STEM v1 has been successfully used in various medium-to-large applications, proving thatthe EM algorithm implementation, being mainly based on closed-form iterations, is quitestable. These applications include air quality assessment in the metropolitan areas of Milan,Teheran and Beijing (Fassò 2013; Taghavi-Shahri et al. et al. et al. et al. et al. et al.

FRK and

D-STEM v1, leverage sparse variance–covariance ma-trices. Others exploit the sparsity of the precision matrix, thanks to a spatial Markovianassumption. This class includes the R packages LatticeKrig (Nychka et al.

INLA (Blangiardo et al. et al. et al. spBayes . Finally, the R pack-age laGP (Gramacy 2016), based on a machine learning approach, implements an eﬃcientnearest neighbour prediction-oriented method. Heaton et al. (2018) develop an interestingspatial prediction competition considering a large data set and involving the above-mentionedapproaches.We observe that, although some of the software packages mentioned above consider bothspace and time, to the best of our knowledge, none of them handles a spatio-temporal FDAapproach for data sets of the kind discussed in 1.1.In this paper, we present D-STEM v2, which is a

MATLAB package, extending

D-STEM v1. The new version introduces modelling of functional data indexed over space and time.Moreover, new complexity reduction techniques have been added for both model estimationand dynamic mapping, which are especially useful for large data sets.The rest of the paper is organised as follows. Section 2 introduces the methodology adopted

Modelling functional spatio-temporal data with D-STEM in this paper and, in particular, the data modelling approach and the complexity-reductiontechniques. Section 3 describes the

D-STEM v2 software in terms of the

MATLAB classesused to deﬁne the data structure, model ﬁtting and diagnostics and kriging. This is followedby an illustration of the software use through two case studies. The ﬁrst one, discussed inSection 4, considers high-frequency spatio-temporal ozone data in Beijing. The second one,in Section 5, considers modelling of global atmospheric temperature proﬁles and exploitsthe complexity-reduction capabilities of the new package. Finally, concluding remarks areprovided in Section 6.

2. Methodology

This section discusses the methodology behind the modelling and the complexity-reductiontechniques implemented in

D-STEM v2 when dealing with functional space-time data sets.Moreover, model estimation, validation and dynamic kriging are brieﬂy discussed.

Let s = ( s lat , s lon ) > be a generic spatial location on the Earth’s sphere, S , and t ∈ N a discrete time index. It is assumed that the function of interest, f ( s , h, t ), with domain H = [ h , h ] ⊂ R , can be observed at any ( s , t ) and h ∈ H through noisy measurements y ( s , h, t ) according to the following model: y ( s , h, t ) = f ( s , h, t ) + ε ( s , h, t ) , (1) f ( s , h, t ) = x ( s , h, t ) > β ( h ) + φ ( h ) > z ( s , t ) , (2) z ( s , t ) = Gz ( s , t −

1) + η ( s , t ) . (3)This model is referred to as the functional hidden dynamic geostatistical model (f-HDGM).In Equation (1), ε is a zero-mean Gaussian measurement error independent in space and timewith functional variance σ ε ( h ), implying that ε is heteroskedastic across the domain H . Thevariance is modelled as log( σ ε ( h )) = φ ( h ) > c ε , where φ ( h ) is a p × h , while c ε is a vector of coeﬃcientsto be estimated. In Equation (2), x ( s , h, t ) is a b × β ( h ) =( β ( h ) , ..., β b ( h )) > is the vector of functional parameters modelled as β j ( h ) = φ ( h ) > c β,j , j = 1 , ..., b, and c β = (cid:16) c > β, , ..., c > β,b (cid:17) > is a pb × z ( s , t ) is a p × G is a diagonal transition matrix with diagonal elements in the p × g . The innovation vector η is obtained from a multivariate Gaussian process thatis independent in time but correlated across space with matrix spatial covariance functiongiven by Γ ( s , s ; θ ) = diag (cid:0) v ρ ( s , s ; θ ) , ..., v p ρ ( s , s ; θ p ) (cid:1) , where v = ( v , ..., v p ) > is a vector of variances and ρ ( s , s ; θ j ) is a valid spatial correlationfunction for locations s , s ∈ S , parametrised by θ j , and θ = ( θ , ..., θ p ) > . The unknownmodel parameter vector is given by ψ = (cid:16) c > ε , c > β , g > , v > , θ > (cid:17) > . aqiong Wang, Francesco Finazzi, and Alessandro Fassò p -dimensional basis functions φ ( h ) are usedto model σ ε , β j and φ ( h ) > z ( s , t ) in Equations (1)-(3). In practice, D-STEM v2 allows one tospecify a diﬀerent number of basis functions for each model component. Also note that ε isnot a pure measurement error since it also accounts for model misspeciﬁcation. Finally, thecovariates x ( s , h, t ) are assumed to be known without error for any s , h and t , and thus theydo not need a basis function representation. Choosing basis functions essentially means choosing the basis type and the number of basisfunctions.

D-STEM v2 currently supports Fourier bases and B-spline bases. The formerguarantee that the function is periodic in the domain H , while the latter are not (in general)periodic but have higher ﬂexibility in describing functions with a complex shape. Whicheverbasis function type is adopted, the number p of basis functions must be ﬁxed before modelestimation. Usually, a high p implies a better model R , but over-ﬁtting may be an is-sue. Moreover, special care must be taken when choosing the number of basis functions for φ ( h ) > z ( s , t ). The classic FDA approach suggests ﬁxing a high number of basis functions andadopting penalisation to avoid over-ﬁtting. In our context, this is not viable since the covari-ance matrices involved in model estimation have dimension n p × n p . Since n is usuallylarge, a large p would make model estimation unfeasible, especially if the number of timepoints T is also high. When using B-spline basis, a small p implies that the location of knotsalong the domain H also matters and may aﬀect the model ﬁtting performance. Ideally, p and knot locations are chosen using a model validation technique (see 2.7) by trying diﬀerentcombinations of p and knot locations. If, due to time constraints, this is not possible, equallyspaced knots are a convenient option. The estimation of ψ and the latent space-time variable z ( s , t ) is based on the maximumlikelihood approach considering proﬁle data observed at spatial locations S = { s i , i = 1 , ..., n } and time points t = 1 , ..., T .At a speciﬁc location s i and time t , q i,t measurements are taken at points h s i ,t = (cid:16) h i, ,t , ..., h i,q i,t ,t (cid:17) > and collected in the vector y s i ,t = ( y ( s i , h i, ,t , t ) , ..., y ( s i , h i,q i,t ,t , t )) > , here called the observed proﬁle.Although D-STEM v2 allows for varying q i,t , for ease of notation, it is assumed here thatall proﬁles include exactly q measurements, although h s i ,t may be diﬀerent across proﬁles.Proﬁles observed at time t across spatial locations S are then stored in the nq × y t = ( y > s ,t , ..., y > s n ,t ) > . Applying model (1)-(3) to the deﬁned data above, we have the followingmatrix representation: y t = ˜ X t c β + Φ z ,t z t + ε t , z t = ˜ Gz t − + η t , where ˜ X t = X t Φ β ,t is a nq × bp matrix, with X t the matrix of covariates and Φ β ,t thebasis matrix for β . Φ z ,t is the nq × np basis matrix for the latent np × z t = Modelling functional spatio-temporal data with D-STEM ( z ( s , t ) > , ..., z ( s n , t ) > ) > . η t = ( η ( s , t ) > , ..., η ( s n , t ) > ) > is the np × ε t is the nq × G = I n ⊗ G is the np × np diagonal transition matrix.The complete-data likelihood function L ( ψ ; Y , Z ) can be written as L ( ψ ; Y , Z ) = L ( ψ z ; z ) T Y t =1 L ( ψ y ; y t | z t ) L ( ψ z ; z t | z t − ) , where Y = ( y , ..., y T ), Z = ( z , z , ..., z T ), ψ z = (cid:16) g > , v > , θ > (cid:17) > , ψ y = (cid:16) c > ε , c > β (cid:17) > , and z is the Gaussian initial vector with parameter ψ z . Maximum likelihood estimation is basedon an extension of the EM algorithm detailed in Calculli et al. (2015). The model parameterset ψ is initialised with starting values ψ h i and then updated at each iteration ι of the EMalgorithm.The algorithm terminates if any of the following conditions is satisﬁed:max l (cid:12)(cid:12)(cid:12) ψ h ι i l − ψ h ι − i l (cid:12)(cid:12)(cid:12) / (cid:12)(cid:12)(cid:12) ψ h ι i l (cid:12)(cid:12)(cid:12) < (cid:15) (cid:12)(cid:12)(cid:12) L ( ψ h ι i ; Y ) − L ( ψ h ι − i ; Y ) (cid:12)(cid:12)(cid:12) / (cid:12)(cid:12)(cid:12) L ( ψ h ι i ; Y ) (cid:12)(cid:12)(cid:12) < (cid:15) ,ι > ι ∗ , where ψ h ι i l is the generic element of ψ h ι i at the ι - th iteration, L ( ψ h ι i ; Y ) is the observed-datalikelihood function evaluated at ψ h ι i , 0 < (cid:15) (cid:28) < (cid:15) (cid:28) − ), while ι ∗ is a user-deﬁned positive integer number (e.g. 100) to limit the iterationsin the case of convergence failure of the EM algorithm.Note that S is not time-varying, which means that spatial locations are ﬁxed. This could bea limit in applications where spatial locations change for each t . On the other hand, missingproﬁles are allowed; that is, y s i ,t may be a vector of q missing values at some t . In the extremecase, a given spatial location s i has only one proﬁle over the entire period (if all the proﬁlesare missing, the spatial location can be dropped from the data set). Shumway and Stoﬀer(2017, p. 348) explains how the likelihood function of a state-space model changes in the caseof a missing observation vector and how the EM estimation formulas are derived. Missingdata handling in D-STEM v2 is based on the same approach.

At each iteration of the EM algorithm, the computational complexity of the E-step is O (cid:0) T n p (cid:1) ,which may be unfeasible if n is large. When necessary, D-STEM v2 allows one to use a parti-tioning approach (Stein 2013) for model estimation. The spatial locations S are divided into k partitions, and z t is partitioned conformably, namely, z t = (cid:16) z (1) > t , ..., z ( k ) > t (cid:17) > . Hence, thelikelihood function becomes T Y t =1 L ( ψ y ; y t | z t ) · k Y j =1 L (cid:16) ψ z ; z ( j )0 (cid:17) · k Y j =1 T Y t =1 L (cid:16) ψ z ; z ( j ) t | z ( j ) t − (cid:17) . From the EM algorithm point of view, this implies that the E-step is independently applied toeach partition, possibly in parallel. When all partitions are equal in size, the computationalcomplexity reduces to O (cid:0) T kr p (cid:1) , with r as the partition size. aqiong Wang, Francesco Finazzi, and Alessandro Fassò k , the k-means algorithm appliedto spatial coordinates provides a geographical partitioning of S . However, the number ofpoints in each partition is not controlled, and a heterogeneous partitioning may arise. If somesubsets are very large and others are small, the reduction in computational complexity givenabove is far from being achieved. This can easily happen, for example, when S is a globalnetwork constrained by continent shapes.For this reason, D-STEM v2 provides a heuristically modiﬁed k-means algorithm that en-courages partitions with similar numbers of elements. The algorithm optimises the followingobjective function: k X j =1 X s ∈S j d ( s , c j ) + λ k X j =1 (cid:18) r j − nk (cid:19) , (4)where λ ≥ S j ⊂ S is the set of coordinates in the j - th partition, d is the geodesic distanceon the sphere S and c j and r j are the centroid and the number of elements in the j - th partition, respectively.The second term in (4) accounts for the variability of the partition sizes and acts as a penal-isation for heterogeneous partitionings. Clearly, when λ = 0, the above-mentioned objectivefunction gives the classic k-means algorithm. For high values of λ , solutions with similarlysized partitions are favoured.Unfortunately, an optimality theory for this algorithm has not yet been developed, and thechoice of λ is left to the user. Nonetheless, it may be a useful tool to deﬁne a partitioning thatis appropriate for the application at hand with regard to computing time and geographicalproperties. The EM algorithm provides a point estimate of the parameter vector ψ but no uncertaintyinformation. Building on Shumway and Stoﬀer (2017, p. 408), D-STEM v2 estimates thevariance–covariance matrix Σ ψ ,T = V ( ψ | Y ), by means of the observed Fisher informationmatrix, I T , namely ˆΣ ψ ,T = ( I T ) − . To understand its computational cost, note that the information matrix given above may bewritten as a sum: I T = P Tt =1 i t .For large data sets, each matrix i t may be expensive to compute, and the total computationalcost is linear in T , provided missing data are evenly distributed in time. This results in a time-consuming task with a computational burden even higher than that for model estimation. Forthis reason, D-STEM v2 makes it possible to approximate ˆΣ ψ ,T using a truncated informationmatrix, namely: ˜Σ ψ ,t ∗ = ( Tt ∗ I t ∗ ) − , (5)which reduces the computational burden by a factor of 1 − t ∗ /T .Since ˜Σ ψ ,t ∗ → ˆΣ ψ ,T for t ∗ → T , the truncation time t ∗ is chosen to control the approximation Modelling functional spatio-temporal data with D-STEM error in ˆΣ ψ . In particular, t ∗ is the ﬁrst integer such that (cid:13)(cid:13)(cid:13) ˜Σ ψ ,t − ˜Σ ψ ,t − (cid:13)(cid:13)(cid:13) F (cid:13)(cid:13)(cid:13) ˜Σ ψ ,t (cid:13)(cid:13)(cid:13) F ≤ δ, (6)where k·k F is the Frobenius norm, and δ may be deﬁned by the user.Generally speaking, the behaviour of ˆΣ ψ ,T for large T and, hence, the behaviour of ˜Σ ψ ,t relays on stationarity and ergodicity of the underlying stochastic process; see, for example,Shumway and Stoﬀer (2017, Property P6.4) and references therein.To have operative guidance for the user, let us assume ﬁrst that no missing values are present,the information matrix is well-conditioned and the covariates have no isolated outliers orextreme trends. In this case, away from the borders t ∼ = 1 and t ∼ = T , the observed conditionalinformation i t has a relatively smooth stochastic behaviour, and the approximation in (5) isexpected to be satisfactory at the level deﬁned by δ . Conversely, if some data are missingat time t , the information i t is reduced accordingly. If the missing pattern is random overtime, this is not an issue. But, in the unfavourable case with a high percentage of missingdata mostly concentrated at the end the time series, t ∼ = T , the above approximation mayover-estimate the information and under-estimate the variances of the parameter estimates. In this paper, dynamic kriging refers to evaluating the following quantities:ˆ f ( s , h, t ) = E ˆ ψ ( f ( s , h, t ) | Y ) , (7) VAR (cid:16) ˆ f ( s , h, t ) (cid:17) = V ˆ ψ ( f ( s , h, t ) | Y ) , (8)for any s ∈ S , h ∈ H and t = 1 , ..., T . A common approach is to map the kriging estimateson a regular pixelation S ∗ = { s ∗ , ..., s ∗ m } . This may be a time-consuming task when m and/or n and/or T are large. To tackle this problem, D-STEM v2 allows one to exploit anearest-neighbour approach, where the conditioning term in Equations (7) and (8) is not Y ,but the data at the spatial locations S ∼ j , where S ∼ j ⊂ S is the set of the ˜ n (cid:28) n nearestspatial locations to s ∗ j . The use of the nearest-neighbour approach is justiﬁed by the so-calledscreening eﬀect. Even when the spatial correlation function exhibits long-range dependence,it can subsequently be assumed that y at spatial location s is nearly independent of spatiallydistant observations when conditioned on nearby observations (see Stein 2002; Furrer et al. D-STEM v2 performs kriging for blocks of pixels. To do this, S ∗ is partitioned in u blocks S ∗ = {S ∗ , ..., S ∗ u } , and kriging is done on each block S ∗ l , l = 1 , ..., u ,with u (cid:28) m controlled by the user. For each target block S ∗ l , the conditioning term inEquations (7) and (8) is given by the data observed at ˜ S l = S j ∈J l S ∼ j , J l = n j : s ∗ j ∈ S ∗ l o .Note that, if S ∗ l is dense and S is sparse (namely n (cid:28) m ), then ˜ S l is not much larger than S ∼ j since most of the spatial locations in S ∗ l tend to have the same neighbours S ∼ j . aqiong Wang, Francesco Finazzi, and Alessandro Fassò D-STEM v2 allows one to implement an out-of-sample validation by partitioning the originalspatial locations S into subsets S est and S val . Data at S est are used for model estimationwhile data at S val are used for validation. Once the model is estimated, the kriging formulain Equation (7) is used to predict at S val for all times t and heights h . The following validationmean squared errors are then computed M SE t = 1 P X s ∈S val X h ∈ h s ,t ( y ( s , h, t ) − ˆ y ( s , h, t )) ,M SE s = 1 P T X t =1 X h ∈ h s ,t ( y ( s , h, t ) − ˆ y ( s , h, t )) ,M SE h = 1 P T X t =1 X s ∈S val ( y ( s , h, t ) − ˆ y ( s , h, t )) , where ˆ y ( s , h, t ) is obtained from Equation (7), while P , P and P are the number of termsin each sum.When h s ,t varies across the proﬁles, D-STEM v2 provides a binned MSE by splitting thecontinuous domain H into B equally spaced intervals. Let H ∗ r be the set of observation pointsin the r - th interval, let n r be the corresponding observation number and let ¯ h r = n r P h ∈ H ∗ r h be the mean of points in b - th interval. Then, the M SE ¯ h r is computed by M SE ¯ h r = 1 P X h ∈ H ∗ r T X t =1 X s ∈S val ( y ( s , h, t ) − ˆ y ( s , h, t )) , where P is the total number of observations in the b - th interval.D-STEM v2 also provides the validation R with respect to time R t = 1 − M SE t VAR ( { y ( s , h, t ) , s ∈ S val , h ∈ h s ,t } ) . and the analogous validation R with respect to location s and h r .

3. Software

This section starts by brieﬂy describing the modelling capabilities of

D-STEM v2 inheritedby the previous version for dealing with spatio-temporal data sets. Then, it focuses on the

D-STEM v2 classes and methods, which implement estimation, validation and dynamic mappingof the model presented in Section 2. Although some of the classes are already available in

D-STEM v1, they are listed here for completeness.

D-STEM v1 implemented a substantial number of models. The dynamic coregionalisationmodel (DCM, Finazzi and Fassò 2014) and the hidden dynamic geostatistical model (HDGM,Calculli et al. et al. Modelling functional spatio-temporal data with D-STEM et al.

D-STEM v2 (available at github.com/graspa-group/d-stem ) provides the func-tional version of HDGM, denoted by f-HDGM, which handles modelling and mapping of func-tional space-time data, following the methodology of Section 2. For implementing f-HDGM,

D-STEM v2 relies on the

MATLAB version of the fda package (Ramsay et al.

D-STEM v2.

Two data formats are available to deﬁne observations for the f-HDGM. One is the internalformat used by the

D-STEM v2 classes, and the other one is the user format based on themore user-friendly table data type implemented in recent versions of

MATLAB . The latterpermits storing measurement proﬁles, covariate proﬁles, coordinates, timestamps and unitsof measure in a single object. The internal format is not discussed here.Considering a table in the user format, each row includes the proﬁles collected at a givenspatial location and time point. The column labels are deﬁned as follows: columns Y and Y_name are used for the dependent variable y and its name as a string ﬁeld, respectively; thecolumn with preﬁx X_h_ is used for the values of the domain h ; eventually, columns withpreﬁx X_beta_ are used for covariates x . These tables have only one column for y and onlyone column for h . Instead, we can have any number b (cid:62) X_coordinate and

Y_coordinate for spatial location s and column Time for the timestamp. Units of measure are stored in the

Properties.VariableUnits property of the table columns and used in outputs and plots. Units for

X_coordinate and

Y_coordinate can be deg for degrees, m for meters and km for kilometres. Geodetic distanceis used when the unit is deg ; otherwise, the Euclidean distance is used.At the table row corresponding to location s i and time t , the elements related to y and x arevectors with q i,t elements. Vectors related to y may include missing data ( NaN ). If y is entirelymissing for a given ( s , t ), the row must be removed from the table. Since spatial locations S are ﬁxed in time, and as their number n is determined by the number of unique coordinatesin the table, proﬁles observed at diﬀerent time points but the same spatial location s musthave the same coordinates. In D-STEM , a hierarchical structure of object classes and methods is used to handle data def-inition, model deﬁnition and estimation, validation, dynamic kriging and the related plottingcapabilities. The structure is schematically given below. Further details on the use of eachclass are given within the two case studies in this paper, while class constructors, methodsand property details can be obtained in

MATLAB using the command doc . aqiong Wang, Francesco Finazzi, and Alessandro Fassò Data handling

The stem_data class allows the user to deﬁne the data used in f-HDGM models, mainlythrough the following objects and methods.• Objects of stem_data – stem_modeltype : model type (DCM, HDGM, MBC, Emulator or f-HDGM); notethat model type is needed here because the data structure varies among the dif-ferent models; – stem_fda : basis functions speciﬁcation; – stem_validation (optional): deﬁnition of the learning and testing datasets formodel validation.• Methods and Properties of stem_data – kmeans_partitioning : data partitioning for parallel EM computations of Section2.4; this method is applied to a stem_data object, and its output is used by the EM_estimation method in the stem_model class below; – shape (optional): structure with geographical borders used for mapping.• Internal Objects of stem_data – stem_varset : observed data and covariates; – stem_gridlist : list of stem_grid objects∗ stem_grid : spatial locations coordinates; – stem_datestamp : temporal information.Interestingly, stem_misc.data_formatter is a helper method, which is useful for building stem_varset objects starting from data tables. Its class, stem_misc , is a miscellanea staticclass implementing other methods for various intermediate tasks not discussed here for brevity. Model building

The stem_model class is used to deﬁne, estimate, validate and output a f-HDGM, mainlythrough the following objects and methods.• Objects of stem_model – stem_data : deﬁned above; – stem_par : model parameters; – stem_EM_result : container of the estimation output, after EM_estimate ; – stem_validation_result (optional): container of validation output, availableonly if stem_data contains the stem_validation object; – stem_EM_options (optional): model estimation options; it is an input of the EM_estimate method below.2

Modelling functional spatio-temporal data with D-STEM • Methods of stem_model – EM_estimate : computation of parameter estimates; – set_varcov : computation of the estimated variance-covariance matrix; – plot_profile : plot of functional data; – print : print estimated model summary; – beta_Chi2_test : testing signiﬁcance of covariates; – plot_par : plot functional parameter; – plot_validation : plot MSE validation. Kriging

The kriging handling is implemented with two classes. The ﬁrst is the stem_krig class, whichimplements the kriging spatial interpolation.• Objects of stem_krig – stem_krig_data : mesh data for kriging; – stem_krig_options : kriging options;• Methods of stem_krig – kriging : computation of kriging, the output is a stem_krig_result object.The second is the stem_krig_result class, which stores the kriging output and implementsthe methods for plotting the kriging output.• Methods of stem_krig_result – surface_plot : mapping of kriging estimate and their standard deviation for ﬁxed h ; – profile_plot : method for plotting the kriging function and the variance-covariancematrix for a ﬁxed space and time.Although at ﬁrst reading the user could prefer a single object for both input and output ofthe kriging, these objects may be quite large, making the current approach more ﬂexible.

4. Case study on ozone data

This section illustrates how to make inferences on an f-HDGM for ground-level high-frequencyair quality data collected by a monitoring network. In particular, hourly ozone ( O , in µg/m )measured in Beijing, China, is considered. Ground-level O is an increasing public concern due to its essential role in air pollution andclimate change. In China, O has become one of the most severe air pollutants in recent years(Wang et al. aqiong Wang, Francesco Finazzi, and Alessandro Fassò O concentrations from 2015 to 2017 withrespect to temperature and ultraviolet radiation (UVB) across Beijing. Concentration andtemperature data are available at twelve monitoring stations (Figure 2). Hourly UVB data areobtained from the ERA-Interim product of the European Centre for Medium-Range WeatherForecasts (ECMWF) at a grid size of 0 . ◦ × . ◦ over the city.To describe the diurnal cycle of O , which peaks in the afternoon and reaches a minimumat night-time, the 24 hours of the day are used as domain H of the basis functions, whilethe time index t is on the daily scale. Moreover, due to the circularity of time, Fourier basisfunctions are adopted, which implies that β j ( h ), σ ε ( h ) are periodic functions.The measurement equation for O is y ( s , h, t ) = β ( h ) + x temp ( s , h, t ) β temp ( h ) + x uvb ( t ) β uvb ( h ) + φ ( h ) > z ( s , t ) + ε ( s , h, t ) , (9)where s is the generic spatial location, h ∈ [0 ,

24) is the time within the day expressed inhours and t = 1 , ..., β j ( h ), σ ε ( h ) and φ ( h ) > z ( s , t ) is chosen to be 5,5 and 7, respectively. This paragraph details the implementation of the

D-STEM v2 in three aspects: model es-timation, validation and kriging. Relevant scripts are demo_section4_model_estimate.m , demo_section4_validation.m and demo_section4_kriging.m , respectively, which are avail-able in the supplementary material. All the scripts can be executed by choosing the optionnumber from 1 to 3 in the demo_menu_user.m script.4 Modelling functional spatio-temporal data with D-STEM

Model estimation

This paragraph describes the demo_section4_model_estimate.m script devoted to the esti-mation of the model parameters and of their variance–covariance matrix.The data set needed to perform this case study is stored as a

MATLAB table in the userformat of Section 3.2 and named

Beijing_O3 . It can be loaded from the corresponding ﬁleas follows: load ../Data/Beijing_O3.mat;

In the

Beijing_O3 table, each row refers to a ﬁxed space-time point and gives a 24-elementhourly ozone proﬁle with the corresponding conformable covariates, which are: a constant,temperature and UVB.The following lines of code specify the model type and the basis functions, which are storedin an object of class stem_fda : o_modeltype = stem_modeltype('f-HDGM');input_fda.spline_type = 'Fourier';input_fda.spline_range = [0 24];input_fda.spline_nbasis_z = 7;input_fda.spline_nbasis_beta = 5;input_fda.spline_nbasis_sigma = 5;o_fda = stem_fda(input_fda); When using a Fourier basis, spline_nbasis_z must be set to a positive odd number. Mean-while, spline_nbasis_beta and/or spline_nbasis_sigma must be left empty, if β ( h ) ≡ β and/or σ ε ( h ) ≡ σ ε are constant functions.The next step is to deﬁne an object of class stem_data , which speciﬁes the model type andcontains the basis function object and the data from the Beijing_O3 table, transformed inthe internal data format. This is done using the intermediate input_data structure: input_data.stem_modeltype = o_modeltype;input_data.data_table = Beijing_O3;input_data.stem_fda = o_fda;o_data = stem_data(input_data);

Then, an object of class stem_model is created by using both information on data, stored inthe o_data object, and on parametrisation, contained in the stem_par object named o_par : o_par = stem_par(o_data,'exponential');o_model = stem_model(o_data, o_par); To facilitate visualisation, the method plot_profile of class stem_model shows the O proﬁledata at location ( lat0 , lon0 ), in the days between t_start and t_end (Figure 3): lat0 = 40; lon0 = 116;t_start = 880; t_end = 900;o_model.plot_profile(lat0, lon0, t_start, t_end); aqiong Wang, Francesco Finazzi, and Alessandro Fassò get_beta0 of class stem_model , which provides the starting values for β , and the method get_coe_log_sigma_eps0 for the case of a functional σ ε ( h ). Next, themethod set_initial_values of the o_model object is called to complete the initialisation ofmodel parameters: n_basis = o_fda.get_basis_number;o_par.beta = o_model.get_beta0();o_par.sigma_eps = o_model.get_coe_log_sigma_eps0();o_par.theta_z = ones(1, n_basis.z)*0.18;o_par.G = eye(n_basis.z)*0.5;o_par.v_z = eye(n_basis.z)*10;o_model.set_initial_values(o_par); Note that the theta_z parameter must be provided in the same unit of measure as the spatialcoordinates.Figure 3: O concentrations at location 39.92 latitude and 116.19 longitude for 21 daysbeginning on 29 May 2017. Left: each dot is a concentration measurement. The colour of thedot depicts the concentration. Right: each graph is a daily concentration proﬁle.Before model estimation, EM exiting conditions (cid:15) ( exit_toll_par ), (cid:15) ( exit_toll_loglike )and ι ∗ ( max_iterations ) introduced in Section 2.3 can be optionally deﬁned as follows: o_EM_options = stem_EM_options();o_EM_options.exit_toll_par = 0.0001;o_EM_options.exit_toll_loglike = 0.0001;o_EM_options.max_iterations = 200; Modelling functional spatio-temporal data with D-STEM χ statistic p valueConstant 136.33 0Temperature 14266.07 0UVB 2094.34 0Table 1: χ tests for signiﬁcance of covariates.Model estimation is started by calling the method EM_estimate of the o_model object, withthe optional o_EM_options object passed as an input argument. After model estimation, thevariance–covariance matrix of the estimated parameters is evaluated by calling the method set_varcov , with the optional approximation level δ of Equation (6) passed as an inputparameter. Finally, set_logL computes the observed data log-likelihood. o_model.EM_estimate(o_EM_options);delta = 0.001;o_model.set_varcov(delta);o_model.set_logL(); All the relevant estimation results are found in the internal stem_EM_result object, whichcan be accessed as a property of the o_model object as follows: o_model.plot_par;o_model.beta_Chi2_test;o_model.print;

Figure 4 is produced by calling the plot_par method and shows the estimated β ( h ), β temp ( h ), β uvb ( h ), and σ ε ( h ). Thanks to the use of a Fourier basis, the functions are periodic with aperiod of one day. In the plot of σ ε ( h ), the unexplained portion of O variance, σ ε ( h ), issmall during daylight hours, which is consistent with the results of Dohan and Masschelein(1987).When the conﬁdence bands of parplot contain zero, it may be useful to test the signiﬁcanceof the covariates. By calling the method beta_Chi2_test , the results of χ tests are obtained,and they are reported in Table 1. Although β uvb is close to 0 in the morning, all ﬁxed eﬀectsare highly signiﬁcant overall. The model output is shown in the MATLAB command windowby calling the print method.

Validation

This paragraph describes the script demo_section4_validation.m , which implements val-idation. Compared to the code in demo_section4_model_estimate.m , it only diﬀers inproviding an object of class stem_validation .To create the object called o_validation , the name of the validation variable is needed aswell as the indices of the validation stations. Moreover, if the size of the nearest neigh-bour set for each kriging site ( nn_size ) is not provided as the third input argument in the stem_validation class constructor,

D-STEM v2 uses all the remaining stations. For example,a validation data set with three stations is constructed as follows: aqiong Wang, Francesco Finazzi, and Alessandro Fassò β ( h ) , β temp ( h ) , β uvb ( h ) and σ (cid:15) ( h ), with 90% , ,

99% conﬁdencebands, respectively, shown through the diﬀerent shades.8

Modelling functional spatio-temporal data with D-STEM

S_val = [1,7,10];input_data.stem_validation = stem_validation('O3', S_val);

The validation statistics, computed by

EM_estimate , are saved in the internal object stem_-validation_result , which can be accessed as a property of the o_model object. The stem_validation_result object contains the estimated O residuals for the above-mentionedvalidation stations as well as the validation mean square errors and R , as deﬁned in Section2.7. Kriging

This paragraph describes the demo_section4_kriging.m script, which applies the approachof Section 2.6 to the estimated model to map the O concentrations over Beijing city.The ﬁrst step is to create an object of class stem_grid , which collects the information aboutthe regular grid of pixels B to be used for mapping. Then, an object of class stem_krig_data is created, where the o_krig_grid object is passed as an input argument: load ../Output/ozone_model;step_deg = 0.05;lat = 39.4:step_deg:41.1;lon = 115.4:step_deg:117.5;[lon_mat,lat_mat] = meshgrid(lon,lat);krig_coordinates = [lat_mat(:) lon_mat(:)];o_krig_grid = stem_grid(krig_coordinates,'deg','regular','pixel',...size(lat_mat),'square',0.05,0.05);o_krig_data = stem_krig_data(o_krig_grid); Two comments on the above lines follow. First, since the grid in the o_krig_grid object isregular, the dimensions of the grid ( size(lat_mat) , 35 × . ◦ × . ◦ . Second,the above step using the stem_krig_data constructor may appear redundant at ﬁrst glance.Indeed, it is needed for compatibility with other model types for which, in addition to the stem_grid object, other information is also necessary for the stem_krig_data constructor.Next, the stem_krig_options class provides some options for kriging. By default, theoutput is back-transformed in the original unit of measure if the observations have beenlog-transformed and/or standardised. The back_transform property enables handling this.Moreover, the no_varcov property must be set to 1 to avoid the time-consuming computationof the kriging variance. Eventually, the block_size property is used to deﬁne the number ofspatial locations in S ∗ l . o_krig_options = stem_krig_options();o_krig_options.back_transform = 0;o_krig_options.no_varcov = 0;o_krig_options.block_size = 30; After storing the map of Beijing boundaries into the o_model object, the latter is used with o_krig_data to create an object of class stem_krig . This and o_krig_options togethercontain all information for kriging, which is obtained by the corresponding kriging method: aqiong Wang, Francesco Finazzi, and Alessandro Fassò O concentrations and their standard deviation at 10:30 am ( h = 10 . o_model.stem_data.shape = shaperead('../Maps/Beijing_adm1.shp');o_krig = stem_krig(o_model, o_krig_data);o_krig_result = o_krig.kriging(o_krig_options); Note that this task may be time consuming for large grids. The kriging output savedin the o_krig_result object gives the latent process estimate z t and its variance. The surface_plot and profile_plot methods may be used to obtain and plot ˆ f ( s , h, t ) of Equa-tion (7). In this case, the user has to provide the corresponding covariate ( X_beta ) for thescale/vector h , time t or location s ( lon0 , lat0 ) of interest.Speciﬁcally, the surface_plot method is used to display the O map using h , t , X_beta asinput arguments. In the case of unavailable

X_beta , the mapping concerns the component φ ( h ) > z ( s , t ). Loaded from the homonym ﬁle, the array X_beta_t_100 refers to time t = 100and hour h = 10 . × ×

3. Maps of O concentrations and theirstandard deviation are shown in Figure 5. load ../Data/kriging/X_beta_t_100;t = 100;h = 10.5;[y_hat, diag_Var_y_hat] = o_krig_result.surface_plot(h, t, X_beta_t_100); On the other hand, the profile_plot method is used to display the O proﬁle at a givenspatial location s ( lon0 , lat0 ) and time t . Still, the proﬁle plot concerns the component φ ( h ) > z ( s , t ) if X_beta is not provided. After loading the

X_beta_h (dimension 25 ×

3) fromthe homonym ﬁle, this method represents the proﬁle of O concentrations and their variance–covariance matrix as in Figure 6: load ../Data/kriging/X_beta_h;h = 0:24;lon0 = 116.25;lat0 = 40.45; Modelling functional spatio-temporal data with D-STEM

Figure 6: O , ,

99% conﬁdence bands (diﬀerent shadings), andtheir variance–covariance at latitude 40.45, longitude 116.25 on 29 May 2017. t = 880;[y_hat, diag_Var_y_hat] = o_krig_result.profile_plot(h, lon0, lat0, ...t, X_beta_h);

Note that the prediction in Equation (7) and the variance in Equation (8) are stored in theoutput arguments y_hat , and diag_Var_y_hat , respectively.

5. Case study on climate data

In order to show the complexity-reduction capabilities of

D-STEM v2, a data set of temper-ature vertical proﬁles collected by the radiosondes of the Universal Radiosonde ObservationProgram (RAOB) is now considered. The proﬁles are observed over the Earth’s sphere, andthey are misaligned, that is, each proﬁle diﬀers in terms of the number of observations and al-titude above the ground of each observation. Additionally, the computation burden is higherdue to the higher number of spatial locations at which proﬁles are observed.

Radiosondes are routinely launched from stations all over the world to measure the stateof the upper troposphere and lower stratosphere. Data collected by radio sounding haveapplications in weather prediction and climate studies.Temperature data from 200 globally distributed stations collected daily during January 2015at 00:00 and 12:00 UTC are considered here. Each proﬁle consists of a given number of mea-surements taken at diﬀerent pressure levels. Since the weather balloon carrying the radiosondeusually explodes at an unpredictable altitude, the proﬁle measurements are misaligned acrossthe proﬁles and have diﬀerent pressure ranges. A functional data approach is natural in thiscase since the underlying temperature proﬁle can be seen as a continuous function sampledat some pressure levels. aqiong Wang, Francesco Finazzi, and Alessandro Fassò datapoints. When the full data set is used in climate studies, the number of data points grows toaround 10 . In this case, a recent server machine with multiple CPUs with at least 256 GBof RAM is required for model estimation and kriging.The focus of the case study is on the diﬀerence between the radiosonde measurement and theoutput of the ERA-Interim global atmospheric reanalysis model provided by ECMWF. Inparticular, the aim is to study the spatial structure of the this diﬀerence in 4D space, wherethe dimensions are latitude, longitude, altitude and time.The model for temperature y is as follows y ( s , h, t ) = x ERA ( s , h, t ) β ERA ( h ) + φ ( h ) > z ( s , t ) + ε ( s , h, t ) , where h ∈ [50 , hP a is the pressure level, while t = 1 , ...,

62 is a discrete time index forJanuary 2015. Figure 7 shows the temperature measurements at a given station, where 50and 925 hP a correspond approximately to 25 and 1.3 km, respectively.

This section details the software implementation of the case study described above as inscript demo_section5.m , which can be also executed in the demo_menu_user.m script. To avoidrepetition, only the relevant parts of the script that diﬀer from the case study of Section 4are reported and commented on here. In particular, data loading and instantiation of the stem_model object are not described.2

Modelling functional spatio-temporal data with D-STEM

Model estimation

The problem of vertical misalignment of the measurements is completely transparent to theuser and is handled by the internal stem_misc.data_formatter method when creating the stem_data object. Note that the dimension of the matrices in o_varset depends on q , themaximum number of measurements in each proﬁle. To prevent out-of-memory problems, it isadvisable to avoid data sets in which only a few proﬁles have a large number of measurements,which could result in large matrices in o_varset , with most of the elements set to NaN .B-spline bases are used, since, in this application, vertical proﬁles are not periodic with respectto the pressure domain. The corresponding object of class stem_fda is created in the followingway: spline_order = 2;rng_spline = [50,925];knots_number = 5;knots = linspace(rng_spline(1),rng_spline(2),knots_number);input_fda.spline_type = 'Bspline';input_fda.spline_order = spline_order;input_fda.spline_knots = knots;input_fda.spline_range = rng_spline;o_fda = stem_fda(input_fda);

Note that the knots are equally spaced along the functional range. In general, however,non-equally spaced knots can be provided, and each model component (i.e. σ ε , β j and φ ( h ) > z ( s , t )) can have a diﬀerent set of knots. This is obtained using spline_order and spline_knots with additional suﬃxes _sigma , _beta , _z .Although this data set is not large, the demo shows how to enable the partitioning discussed inSection 2.4. First, the spatial locations are partitioned using the modiﬁed k-means algorithm: k = 5;trials = 100;lambda = 5000;partitions = o_data.kmeans_partitioning(k, trials, lambda); where k is the number of partitions, trials is the number of times when the k-means algo-rithm is executed starting from randomised centroids and lambda is λ in Equation (4).At the end of the k-means algorithm, data are internally re-ordered for parallel computing.Model estimation is done after creating and setting an object of class stem_EM_options . Todo this, the output of the kmeans_globe method is passed to the partitioning_block_size property of the o_EM_options object. Additionally, for parallel computing, the number ofworkers must be set to a value higher than 1. In general, this could be any number up to thenumber of cores available on the machine. o_EM_options = stem_EM_options();o_EM_options.partitions = partitions;o_EM_options.workers = 2;o_model.EM_estimate(o_EM_options); aqiong Wang, Francesco Finazzi, and Alessandro Fassò h coloured by the number of observations n b , and (Right) the validation MSE with respect to time t .The three validation MSEs deﬁned in Section 2.7 are shown in Figure 8 and 9. To gener-ate these ﬁgures, the method plot_validation is called with vertical = 1, which provides“atmospheric proﬁle” plots with h on the vertical axis: vertical = 1;o_model.plot_validation(vertical); Kriging

Interpolation across space and over time is done as in Section 4.2.3. However, complexityreduction is enabled by adopting the nearest neighbour approach detailed in Section 2.6.To do this, a class constructor is ﬁrst called, where the block_size is used to deﬁne thenumber of spatial locations in S ∗ l , and then nn_size is used to deﬁne ˜ n . Additionally, setting o_krig_options.workers makes it possible to do the kriging over the u blocks in parallelusing up to the allocated number of workers: o_krig_options = stem_krig_options();o_krig_options.block_size = 150;o_krig_options.nn_size = 10;o_krig_options.workers = 2; Finally, kriging predictions and standard errors are mapped for a given h ∈ H and time t : h = 875.3;t = 12;o_krig_result.surface_plot(h, t); Modelling functional spatio-temporal data with D-STEM

Figure 9: Validation MSE for the thirty-three stations, where the stations used for estimationare marked with blue stars.Figure 10: φ ( h ) > ˆ z ( s , t ) at pressure 875 . hP a , and 12:00 am on 6 January 2015, where 200stations are shown as black stars. aqiong Wang, Francesco Finazzi, and Alessandro Fassò φ ( h ) > ˆ z ( s , t ) at pressure 875 . hP a , and 12:00 am on 06January 2015, where 200 stations are shown as black stars.Since covariates are not provided to the surface_plot method, the plots are on the compo-nent φ ( h ) > ˆ z ( s , t ), namely, the diﬀerence between RAOB and ERA-Interim and its standarddeviation. The output of the above code is depicted in Figures 10 and 11.

6. Concluding remarks

This paper introduced the package

D-STEM v2 through two case studies of spatio-temporalmodelling of functional data. It is shown that, in addition to maximum likelihood esti-mation, Hessian approximation and kriging for large data sets,

D-STEM v2 also developsseveral data-handling capabilities, allows for automatic construction of relevant objects andprovides graphical output. In particular, it provides high-quality global maps and two kindsof functional plotting: the traditional x–y plot and the vertical proﬁle plot, which is popular,for example, in atmospheric data analysis. In this regard, model validation and kriging arestraightforward.

D-STEM v2 ﬁlls a gap in functional geostatistics. In fact, although statistical methods forgeoreferenced functional data have been recently developed (e.g. Ignaccolo et al. (2014)),standard geostatistical packages do not consider functional data, especially in the spatio-temporal context.The successful use of

D-STEM v1 in a number of applications proved that the EM algorithmimplementation is quite stable. Now, due to improvements in computational eﬃciency, thenew

D-STEM v2 has the capability to handle large data sets. Moreover, thanks to the ap-proximated variance–covariance matrix, it is possible to compute standard errors for all modelparameters relatively fast and avoid the large number of iterations typically required by anMCMC approach for making inferences.However, a limit of the EM algorithm is its limited ﬂexibility to changes in the model equa-tions. Indeed, changes in parametrisation or latent variable structure usually require deriv-6

Modelling functional spatio-temporal data with D-STEM ing new closed-form estimation formulas and changing the software accordingly. Moreover,changes in covariance functions are not easy to handle.Computationally, the main limit of

D-STEM v2 is in the number p of basis functions that canbe handled. Even if partitioning is exploited in k blocks of size r , computational complexityis O (cid:0) T kr p (cid:1) , meaning that p cannot be large.Currently, the authors are working on a new version, which makes it possible to handle mul-tivariate functional space-time data and user-deﬁned spatial covariance functions, which willmake D-STEM v2 a valid and comprehensive alternative to the Gaussian process regressionmodels ( fitrgp ) implemented in the Statistics and Machine Learning Toolbox of

MATLAB . Acknowledgments

We would like to thank China’s National Key Research Special Program Grant (No.2016YFC0207701) and the Center for Statistical Science at Peking University.

References

Bivand RS, Gómez-Rubio V, Rue H (2015). “Spatial Data Analysis with R - INLA with SomeExtensions.”

Journal of Statistical Software , (20), 1–31.Blangiardo M, Cameletti M, Baio G, Rue H (2013). “Spatial and Spatio-Temporal Modelswith R - INLA .” Spatial and Spatio-Temporal Epidemiology , , 33–49.Calculli C, Fassò A, Finazzi F, Pollice A, Turnone A (2015). “Maximum Likelihood Estimationof the Multivariate Hidden Dynamic Geostatistical Model with Application to Air Qualityin Apulia, Italy.” Environmetrics , (6), 406–417.Cressie N (2018). “Mission CO2ntrol: A Statistical Scientist’s Role in Remote Sensing ofAtmospheric Carbon Dioxide.” Journal of the American Statistical Association , (521),152–168.Cressie N, Johannesson G (2008). “Fixed Rank Kriging for Very Large Spatial Data Sets.” Journal of the Royal Statistical Society B , (1), 209–226.Dohan JM, Masschelein WJ (1987). “The Photochemical Generation of Ozone: PresentState-of-the-Art.” Ozone: Science & Engineering , (4), 315–334.Fassò A (2013). “Statistical Assessment of Air Quality Interventions.” Stochastic En-vironmental Research and Risk Assessment , (7), 1651–1660. URL DOI:10.1007/s00477-013-0702-5 .Fassò A, Finazzi F (2011). “Maximum Likelihood Estimation of the Dynamic Coregionaliza-tion Model with Heterotopic Data.”

Environmetrics , (6), 735–748.Fassò A, Finazzi F, Ndongo F (2016). “European Population Exposure to Airborne PollutantsBased on a Multivariate Spatio-Temporal Model.” Journal of agricultural, biological, andenvironmental statistics , (3), 492–511. aqiong Wang, Francesco Finazzi, and Alessandro Fassò AtmosphericMeasurement Techniques , (6), 1803–1816.Finazzi F (2020). “Fulﬁlling the Information Need after an Earthquake: Statistical Modellingof Citizen Science Seismic Reports for Predicting Earthquake Parameters in Near Realtime.” Journal of the Royal Statistical Society: Series A , (3), 857–882.Finazzi F, Fassò A (2014). “ D-STEM : A Software for the Analysis and Mapping of Environ-mental Space-Time Variables.”

Journal of Statistical Software , (6), 1–29.Finazzi F, Fassò A, Madonna F, Negri I, Sun B, Rosoldi M (2018). “Statistical Harmoniza-tion and Uncertainty Assessment in the Comparison of Satellite and Radiosonde ClimateVariables.” arXiv preprint arXiv:1803.05835 .Finazzi F, Haggarty R, Miller C, Scott M, Fassò A (2015). “A Comparison of ClusteringApproaches for the Study of the Temporal Coherence of Multiple Time Series.” StochasticEnvironmental Research and Risk Assessment , (2), 463–475.Finazzi F, Napier Y, Scott M, Hills A, Cameletti M (2019). “A Statistical Emulator forMultivariate Model Outputs with Missing Values.” Atmospheric Environment , , 415 –422.Finazzi F, Scott EM, Fassò A (2013). “A Model-Based Framework for Air Quality Indices andPopulation Risk Evaluation, with an Application to the Analysis of Scottish Air QualityData.” Journal of the Royal Statistical Society: Series C , (2), 287–308.Finley A, Banerjee S, Gelfand A (2015). “ spBayes for Large Univariate and MultivariatePoint-Referenced Spatio-Temporal Data Models.” Journal of Statistical Software, Articles , (13), 1–28. ISSN 1548-7660.Furrer R, Genton MG, Nychka D (2006). “Covariance Tapering for Interpolation of LargeSpatial Datasets.” Journal of Computational and Graphical Statistics , (3), 502–523.Gasch CK, Hengl T, Gräler B, Meyer H, Magney TS, Brown DJ (2015). “Spatio-TemporalInterpolation of Soil Water, Temperature, and Electrical Conductivity in 3D+T: The CookAgronomy Farm Data Set.” Spatial Statistics , , 70–90.Gramacy RB (2016). “ laGP : Large-Scale Spatial Modeling via Local Approximate GaussianProcesses in R .” Journal of Statistical Software , (1), 1–46.Heaton MJ, Datta A, Finley AO, Furrer R, Guinness J, Guhaniyogi R, Gerber F, GramacyRB, Hammerling D, Katzfuss M, Lindgren F, Nychka DW, Sun F, Zammit-Mangion A(2018). “A Case Study Competition among Methods for Analyzing Large Spatial Data.” Journal of Agricultural, Biological and Environmental Statistics , pp. 1–28.Ignaccolo R, Mateu J, Giraldo R (2014). “Kriging with External Drift for Functional Data forAir Quality Monitoring.”

Stochastic Environmental Research and Risk Assessment , (5),1171–1186.Jurek M, Katzfuss M (2018). “Multi-Resolution Filters for Massive Spatio-Temporal Data.” arXiv preprint arXiv:1810.04200 .8 Modelling functional spatio-temporal data with D-STEM

Kahle D, Wickham H (2013). “ ggmap : Spatial visualization with ggplot2 .” The R Journal , (1), 144–161.Katzfuss M (2017). “A Multi-Resolution Approximation for Massive Spatial Datasets.” Jour-nal of the American Statistical Association , (517), 201–214.Lindgren F, Rue H (2015). “Bayesian Spatial Modelling with R - INLA .” Journal of StatisticalSoftware , (19), 1–25.Negri I, Fassò A, Mona L, Papagiannopoulos N, Madonna F (2018). “Modeling SpatiotemporalMismatch for Aerosol Proﬁles.” In Quantitative Methods in Environmental and ClimateResearch , pp. 63–83. Springer-Verlag.Nychka D, Bandyopadhyay S, Hammerling D, Lindgren F, Sain S (2015). “A Multireso-lution Gaussian Process Model for the Analysis of Large Spatial Datasets.”

Journal ofComputational and Graphical Statistics , (2), 579–599.Nychka D, Hammerling D, Sain S, Lenssen N (2016). LatticeKrig : Multiresolution Krig-ing Based on Markov Random Fields . University Corporation for Atmospheric Research,Boulder, CO, USA. R package version 7.0.Pebesma E (2012). “ spacetime : Spatio-temporal Data in R .” Journal of statistical software , (7), 1–30.Pebesma E, Heuvelink G (2016). “Spatio-temporal Interpolation Using gstat .” RFID Journal , (1), 204–218.Porcu E, Alegria A, Furrer R (2018). “Modeling Temporally Evolving and Spatially GloballyDependent Data.” International Statistical Review , (2), 344–377.Ramsay JO, Silverman BW (2007). Applied Functional Data Analysis: Methods and CaseStudies . Springer-Verlag.Ramsay JO, Wickham H, Graves S, Hooker G (2018). fda : Functional Data Analysis . R pack-age version 2.4.8, URL https://CRAN.R-project.org/package=fda .Rue H, Martino S, Blangiardo FL, Simpson D, Riebler A, Krainski ET (2014). INLA :Functions which Allow to Perform Full Bayesian Analysis of Latent Gaussian Models us-ing Integrated Nested Laplace Approximaxion . R package version 0.0-1404466487, URL .Shumway RH, Stoﬀer DS (2017). Time Series Analysis and Its Applications: with R Examples .Springer-Verlag.Stein ML (2002). “The Screening Eﬀect in Kriging.”

The Annals of Statistics , (1), 298–323.Stein ML (2013). “Statistical Properties of Covariance Tapers.” Journal of Computationaland Graphical Statistics , (4), 866–885.Taghavi-Shahri S, Fassò A, Mahaki B, Amin H (2019). “Concurrent Spatiotemporal DailyLand Use Regression Modeling and Missing Data Imputation of Fine Particulate Mat-ter Using Distributed Space-Time Expectation Maximization.” Atmospheric Environment , , 1–11. URL https://doi.org/10.1016/j.atmosenv.2019.117202 . aqiong Wang, Francesco Finazzi, and Alessandro Fassò Technometrics , (2), 198–208.Wan Y, Xu M, Huang H, Chen S (2020). “A Spatio-Temporal Model for the Analysis and Pre-diction of Fine Particulate Matter Concentration in Beijing.” Environmetrics , accepted .Wang T, Xue L, Brimblecombe P, Lam YF, Li L, Zhang L (2017). “Ozone Pollution in China:A Review of Concentrations, Meteorological Inﬂuences, Chemical Precursors, and Eﬀects.” Science of the Total Environment , , 1582–1596.Zammit-Mangion A (2018). FRK : Fixed Rank Kriging . R package version 0.2.2, URL https://CRAN.R-project.org/package=FRK . Aﬃliation:

Yaqiong WangGuanghua School of ManagementPeking UniversityYiheyuan road, 5100871 Peking, ChinaE-mail: [email protected]

Francesco FinazziDepartment of Management, Information and Production EngineeringUniversity of Bergamoviale Marconi, 524044 Dalmine (BG), ItalyE-mail: [email protected]

URL:

Alessandro FassòDepartment of Management, Information and Production EngineeringUniversity of Bergamoviale Marconi, 524044 Dalmine (BG), ItalyE-mail: [email protected]