Nonstationary, Nonparametric, Nonseparable Bayesian Spatio-Temporal Modeling Using Kernel Convolution of Order Based Dependent Dirichlet Process
Moumita Das and Sourabh Bhattacharya∗

Abstract
Spatio-temporal processes are important modeling tools for a variety of problems in environmental science, biological science, geographical science, etc. It is generally assumed that the underlying model is parametric, typically a Gaussian process, and that the covariance function is stationary and separable. That this structure need not always be realistic has been perceived by many researchers, and attempts have been made to construct nonparametric processes with covariance functions that are neither stationary nor separable. But, as we elucidate, some desirable and important spatio-temporal properties are not guaranteed by the existing approaches, thus calling for further innovative ideas.

In this paper, using kernel convolution of the order based dependent Dirichlet process (Griffin & Steel (2006)), we construct a nonstationary, nonseparable, nonparametric space-time process which, as we show, satisfies desirable properties and includes the stationary, separable, parametric processes as special cases. We also investigate the smoothness properties of our proposed model.

Since our model entails an infinite random series, for Bayesian model fitting purposes we must either truncate the series or, more appropriately, consider a random number of summands, which renders the model dimension a random variable. We attack the variable dimensionality problem using the novel Transdimensional Transformation based Markov Chain Monte Carlo (TTMCMC) methodology introduced by Das & Bhattacharya (2014b), which can update all the variables and also change dimensions in a single block using a single random variable drawn from some arbitrary density defined on a relevant support.

∗ Moumita Das is a PhD student and Sourabh Bhattacharya is an Assistant Professor in Bayesian and Interdisciplinary Research Unit, Indian Statistical Institute, 203, B. T. Road, Kolkata 700108. Corresponding e-mail: [email protected].
For the sake of completeness we also address the problem of truncating the infinite series by providing a uniform bound on the error incurred by truncation. We illustrate our model and the methodologies on a simulated data set, and also fit a real ozone data set. The results that we obtain from both studies are quite encouraging.

Keywords: Kernel convolution; Nonstationary; Nonseparable; Order based Dependent Dirichlet Process; Spatio-temporal data; Transdimensional Transformation based Markov Chain Monte Carlo.
1. INTRODUCTION

Recent years have witnessed a considerable amount of research on spatial and spatio-temporal modeling. Though it is common practice to assume that the underlying spatial or spatio-temporal process is a stationary and isotropic Gaussian process, in reality such assumptions seldom hold. This is because there may be local influences affecting the correlation structure of the random process. For instance, orographic effects influence the atmospheric transport of pollutants, and result in a correlation structure that depends on the individual spatial locations (Guttorp & Sampson (1994)). It is also restrictive to assume that the underlying process is Gaussian, and nonparametric processes seem more reasonable in realistic situations.

Realizing the limitations of stationary parametric processes (almost invariably Gaussian processes), researchers have come up with many novel ideas for constructing nonstationary and/or nonparametric processes. The first significant work in the framework of nonstationary parametric processes is by Sampson & Guttorp (1992), who proposed an approach based on spatial deformation. This work is followed up by Damian, Sampson & Guttorp (2001) and Schmidt & O'Hagan (2003), who provide the corresponding Bayesian generalizations. Nonstationarity has been induced in parametric space-time models by Haas (1995), who proposed a moving window regression residual kriging. A similar approach has been proposed by Nott & Dunsmuir (2002). Higdon (1998) (see also Higdon, Swall & Kern (1999), Higdon (2001)) proposed a kernel convolution approach for inducing nonstationarity in Gaussian processes. Similar approaches are also proposed by Fuentes & Smith (2001) and Fuentes (2002).
Approaches that attempt to model the underlying process as nonparametric, in addition to modeling the covariance structure as nonstationary, are more recent in comparison; the approach of Gelfand, Kottas & MacEachern (2005) based on Dirichlet processes (see, for example, Ferguson (1973), Ferguson (1974)) is the first in this regard; see Duan, Guindani & Gelfand (2007) for a generalization. Duan, Gelfand & Sirmans (2009) use stochastic differential equations to construct a nonstationary, non-Gaussian process. We discuss these proposals in some detail in Section 2.

Griffin & Steel (2006) (henceforth, GS) proposed the novel order-based dependent Dirichlet process (ODDP), which satisfies most desirable properties of nonparametric processes, and can be seen as a significant improvement over the other existing ideas. Indeed, our modeling idea also hinges upon the constructive idea of GS, but we attempt to incorporate further flexibility in our spatial/temporal/spatio-temporal model, in terms of nonstationarity and nonseparability, through our proposed kernel convolution based methodology.

The rest of our paper is structured as follows. In Section 2 we provide a brief overview of the existing approaches to construction of nonstationary, nonseparable space-time processes in both parametric and nonparametric frameworks, arguing that not all desirable properties are necessarily accounted for in these approaches. Such issues necessitate the development of new approaches to construction of nonstationary, nonparametric, nonseparable space-time models. In Section 3 we introduce our proposed space-time model based on kernel convolution of the ODDP and show that it satisfies the properties that are not guaranteed by the existing models. We investigate continuity and smoothness properties of our model in Section 4.
Since our proposed model involves a random infinite series, for model fitting one needs either to truncate the series or to assume a random number of summands and adopt variable dimensional Markov Chain Monte Carlo (MCMC) approaches. Although we adopt the latter framework for our applications, and implement the recently developed Transdimensional Transformation based Markov Chain Monte Carlo (TTMCMC) (Das & Bhattacharya (2014b)) for simulating from our variable dimensional model, for the sake of completeness we also investigate the truncation approach. Indeed, in Section 5 we consider the difference between the prior predictive models with and without truncation of the random infinite series, providing a bound that depends upon the truncation parameter. Thus, the truncation parameter can be chosen so that the bound falls below any desired level. In Section 6 we discuss the choice of suitable kernels, prior distributions, and the spatio-temporal domain that is relevant for computational purposes. We describe the joint posterior distribution associated with our model, and provide a brief discussion of TTMCMC, in Section 7. We detail a simulation study illustrating the performance of our model in Section 8, and follow it up with the analysis of a real ozone data set in Section 9. Finally, we summarize our contributions and provide concluding remarks in Section 10.

Proofs of our results and requisite details of TTMCMC, particularly in the context of our spatio-temporal model, are provided in the supplement Das & Bhattacharya (2014a), whose sections and algorithms have the prefix "S-" when referred to in this paper.

2. OVERVIEW OF OTHER AVAILABLE NONSTATIONARY APPROACHES

2.1 Parametric approaches

The deformation approaches of Sampson & Guttorp (1992), Damian et al. (2001), and Schmidt & O'Hagan (2003) are based on Gaussian processes. In these approaches replications of the data are necessary, which the authors relate to temporal independence of the data.
This also means that space-time data cannot be modeled using these approaches. Moreover, in the deformation-based approaches, model based theoretical correlations between random observations separated by large enough distances need not necessarily tend to zero. Sampson & Guttorp (1992) deal with a variogram of the following form:

Var(Y(s1, t) − Y(s2, t)) = f(‖d(s1) − d(s2)‖), (2.1)

for any s1, s2, t, where f is an appropriate monotone function and d is a one-to-one nonlinear mapping. The technique of Sampson & Guttorp (1992) involves appropriately approximating f by f̂ using the multidimensional scaling method, and obtaining a configuration of points {u1, ..., u_n} in a "deformed" space where the process is assumed isotropic. Then, using thin-plate splines, a nonlinear approximation of d, which we denote by d̂, is determined such that d̂(s_i) ≈ u_i, for i = 1, ..., n. Bayesian versions of the key idea have been described in Damian et al. (2001), who use random thin-plate splines, and Schmidt & O'Hagan (2003), who use a Gaussian process to implement the nonlinear transformation d. Rather than estimating f nonparametrically, both specify a parametric functional form from a valid class of such monotone functions.

As is clear, since a large difference ‖s1 − s2‖ does not imply that ‖d(s1) − d(s2)‖ is also large, the model based correlations between two widely separated observations need not necessarily tend to zero, in either of the aforementioned deformation-based approaches.

The kernel convolution approaches of Higdon et al. (1999), Higdon (2001), and Fuentes & Smith (2001) overcome some of the difficulties of the deformation approach. In these approaches data replication is not necessary, and for appropriate choices of the kernel, stationarity, nonstationarity, separability, and nonseparability can be achieved with respect to spatio-temporal data. In the approach of Higdon et al.
(1999) and Higdon (2001),

Y(x) = ∫ K(x, u) Z(u) du, (2.2)

where K is a kernel function and Z(·) is a white noise process. Then the covariance between Y(x1) and Y(x2) is given by

C(x1, x2) = ∫ K(x1, u) K(x2, u) du. (2.3)

In general, this does not depend upon x1 and x2 only through d = ‖x1 − x2‖, thus achieving nonstationarity. However, it is clear from the covariance structure (2.3) that C(x1, x2) does not generally tend to zero as d = ‖x1 − x2‖ → ∞. But for separable space-time processes related to representation (2.2) this property holds under the additional assumption of isotropy with respect to either space or time. We elaborate on this below.

Although representation (2.2) cannot achieve separability with respect to space and time, a modified representation of the following form does:

Y(s, t) = ∫∫ K1(s, u) K2(t, v) Z1(u) Z2(v) du dv. (2.4)

In (2.4), K1, K2 are two kernel functions, and Z1(·), Z2(·) are independent white noise processes. Now the covariance is given by

C((s1, t1), (s2, t2)) = ∫∫ K1(s1, u) K1(s2, u) K2(t1, v) K2(t2, v) du dv = C1(s1, s2) × C2(t1, t2), (2.5)

where

C1(s1, s2) = ∫ K1(s1, u) K1(s2, u) du, (2.6)

C2(t1, t2) = ∫ K2(t1, v) K2(t2, v) dv, (2.7)

exhibiting separability. Further assuming that either of C1 or C2 is isotropic, it follows that if the corresponding d1 = ‖s1 − s2‖ or d2 = |t1 − t2| tends to infinity, the covariance given by (2.5) tends to zero, even though the other of C1 or C2 is nonstationary. But if both C1 and C2 are nonstationary, then this result need not hold.

The approach of Fuentes & Smith (2001) comes close to solving the problem of zero covariance in the limit with large enough separation between observations, which we now explain.
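As a concrete illustration of the separable construction (2.4)–(2.7), the following minimal Python sketch approximates the factors C1 and C2 by numerical integration and checks that the product covariance decays to zero as the spatial separation grows. The Gaussian choices for K1 and K2, and all function names, are our own assumptions for illustration, not the paper's specification.

```python
import math

def k1(s, u, ell=1.0):
    # Spatial smoothing kernel K1: an isotropic Gaussian bump (our assumption).
    return math.exp(-0.5 * ((s - u) / ell) ** 2)

def k2(t, v, ell=1.0):
    # Temporal smoothing kernel K2, of the same Gaussian form.
    return math.exp(-0.5 * ((t - v) / ell) ** 2)

def c1(s1, s2, lo=-20.0, hi=20.0, n=4000):
    # C1(s1, s2) = int K1(s1, u) K1(s2, u) du, midpoint-rule approximation.
    h = (hi - lo) / n
    return h * sum(k1(s1, lo + (i + 0.5) * h) * k1(s2, lo + (i + 0.5) * h)
                   for i in range(n))

def c2(t1, t2, lo=-20.0, hi=20.0, n=4000):
    # C2(t1, t2) = int K2(t1, v) K2(t2, v) dv.
    h = (hi - lo) / n
    return h * sum(k2(t1, lo + (i + 0.5) * h) * k2(t2, lo + (i + 0.5) * h)
                   for i in range(n))

def c_sep(s1, t1, s2, t2):
    # Separable space-time covariance (2.5): the product C1 x C2.
    return c1(s1, s2) * c2(t1, t2)

# Because K1 here is isotropic, the spatial factor, and hence the product,
# vanishes as the spatial separation grows.
print(c_sep(0.0, 0.0, 0.0, 0.0), c_sep(0.0, 0.0, 10.0, 0.0))
```

With nonstationary (non-isotropic) choices of both kernels, this decay is no longer guaranteed, which is the point made in the text.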
They model the underlying process as

Y(x) = ∫ K(x − u) Z_{θ(u)}(x) du, (2.8)

where {Z_θ(x); x ∈ D} is a family of independent, stationary Gaussian processes indexed by θ, and the covariance of Z_{θ(u)} is given by

Cov(Z_{θ(u)}(x1), Z_{θ(u)}(x2)) = C_{θ(u)}(x1 − x2). (2.9)

Then the covariance between Y(x1) and Y(x2) is given by

C(x1, x2; θ) = ∫ K(x1 − u) K(x2 − u) C_{θ(u)}(x1 − x2) du. (2.10)

For practical purposes, Fuentes & Smith (2001) approximate Y(x) with

Ŷ(x) = (1/M) Σ_{m=1}^M K(x − u_m) Z_{θ(u_m)}(x), (2.11)

and C(x1, x2; θ) by

Ĉ(x1, x2; θ) = (1/M) Σ_{m=1}^M K(x1 − u_m) K(x2 − u_m) C_{θ(u_m)}(x1 − x2), (2.12)

where {u1, ..., u_M} can be thought of as a set of locations drawn independently from the domain D. Assuming that the family of independent Gaussian processes {Z_θ(x); x ∈ D} is also isotropic, it follows, using the fact that M is finite, that Ĉ(x1, x2; θ) → 0 as ‖x1 − x2‖ → ∞, since C_{θ(u_m)}(x1 − x2) → 0 for each m = 1, ..., M. However, this of course does not guarantee that Ĉ(x1, x2; θ) → 0 as M → ∞. That is, this does not necessarily imply that C(x1, x2; θ) → 0 as ‖x1 − x2‖ → ∞. For other available parametric approaches to nonstationarity we refer to the references provided in Chang et al. (2011).

2.2 Nonparametric approaches

Gelfand et al. (2005) seem to be the first to propose a nonstationary, nonparametric Bayesian model based on Dirichlet process mixing. They represent the random field Y_D = {Y(x); x ∈ D} as Σ_{ℓ=1}^∞ w_ℓ δ_{θ_{ℓ,D}}, where θ_{ℓ,D} = {θ_ℓ(x); x ∈ D} are realizations from a specified stationary Gaussian process, which we denote by G0, and w1 = V1, w_ℓ = V_ℓ ∏_{r=1}^{ℓ−1} (1 − V_r) for ℓ ≥ 2, with V_r iid ∼ Beta(1, α); r = 1, 2, .... Thus, a random process G is induced on the space of processes of Y_D, with G0 being the "central" process.
Gelfand et al. (2005) assume the space-time data Y_t = (Y(s1, t), ..., Y(s_n, t))′ to be time-independent for t = 1, ..., T, which is the same assumption of data replication used in the deformation-based approaches. The temporal-independence assumption allows Gelfand et al. (2005) to model the data as follows: for t = 1, ..., T, Y_t iid ∼ G^(n) and G^(n) ∼ DP(G0^(n)), where G^(n) and G0^(n) denote the n-variate distributions corresponding to the processes G and G0. The development leads to the following covariance structure: for any s1, s2, t,

Cov(Y(s1, t), Y(s2, t) | G) = Σ_{ℓ=1}^∞ w_ℓ θ_ℓ(s1) θ_ℓ(s2) − {Σ_{ℓ=1}^∞ w_ℓ θ_ℓ(s1)}{Σ_{ℓ=1}^∞ w_ℓ θ_ℓ(s2)}, (2.13)

which is nonstationary. However, marginalized over G, the covariance between Y(s1, t) and Y(s2, t) turns out to be stationary. Since, in Gelfand et al. (2005), the Bayesian inference on the data Y_1, ..., Y_T proceeds by integrating out G^(n), the entire flavour of nonstationarity is lost. Also, given G, (2.13) is nonstationary but need not necessarily converge to zero if ‖s1 − s2‖ → ∞.

Duan et al. (2007) attempt to generalize the model of Gelfand et al. (2005) by specifying

Pr{Y(x1) ∈ A1, ..., Y(x_n) ∈ A_n} = Σ_{i1=1}^∞ ··· Σ_{i_n=1}^∞ p_{i1,...,i_n} δ_{θ_{i1}(x1)}(A1) ··· δ_{θ_{i_n}(x_n)}(A_n), (2.14)

where the θ_j's are iid G0 as in Gelfand et al. (2005), and the probabilities {p_{i1,...,i_n} ≥ 0 : Σ_{i1=1}^∞ ··· Σ_{i_n=1}^∞ p_{i1,...,i_n} = 1} determine the site-specific joint selection probabilities, which also must satisfy simple constraints to ensure consistency. The resulting conditional covariance (conditional on G) and the marginal covariance are somewhat modified versions of those of Gelfand et al. (2005), but now even the marginal covariance is nonstationary.
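The stick-breaking weights w1 = V1, w_ℓ = V_ℓ ∏_{r<ℓ}(1 − V_r) appearing in the representations above (and reused, in permuted form, by the ODDP later in this paper) are easy to simulate directly. The following minimal sketch (function names are ours) draws a truncated sequence of weights and checks that they nearly exhaust the unit mass, since the leftover mass after n terms has expectation {α/(α+1)}^n:

```python
import random

def stick_breaking_weights(alpha, n_terms, rng):
    # w_1 = V_1, w_l = V_l * prod_{r<l} (1 - V_r), with V_r iid Beta(1, alpha).
    weights, remaining = [], 1.0
    for _ in range(n_terms):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights

rng = random.Random(7)
w = stick_breaking_weights(alpha=2.0, n_terms=500, rng=rng)
# The truncated weights nearly sum to one; the leftover mass decays
# geometrically in the truncation level.
print(sum(w))
```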
By choosing G0 to be an isotropic Gaussian process it can be ensured that the marginal covariance tends to zero as two observations become widely separated, but the same cannot be ensured for the conditional covariance. Moreover, replication of the data is necessary even for this generalized version of Gelfand et al. (2005), and modeling temporal dependence is precluded as before.

A nonstationary, nonseparable, non-Gaussian spatio-temporal process has been constructed by Duan et al. (2009) using discretized versions of stochastic differential equations, but again, the correlations between largely separated observations do not necessarily tend to zero under their model. Also, stationarity or separability cannot be derived as special cases of this approach.

A flexible approach using kernel convolution of Lévy random measures has been detailed in Wolpert, Clyde & Tu (2011), but even this approach does not guarantee that correlations tend to zero for large separations for arbitrarily chosen kernels.

In the next section we introduce our idea based on kernel convolution of the ODDP and show that it overcomes the issues faced by the traditional approaches to construction of flexible space-time models.

3. KERNEL CONVOLUTION OF ODDP

Before introducing our proposal, it is necessary to first provide an overview of the ODDP.

3.1 Overview of ODDP

In order to induce spatial dependence between observations at different locations, GS modify the nonparametric stick-breaking construction of Sethuraman (1994) in the following way: for each point x ∈ D, where D is some specified domain, they define the distribution

G_x = Σ_{i=1}^∞ p_i(x) δ_{θ_{π_i(x)}}, (3.1)

where p_i(x) = V_{π_i(x)} ∏_{j<i} (1 − V_{π_j(x)}), with V_k iid ∼ Beta(1, α) and θ_k iid ∼ G0. If π_i(x) = i for each x and i, then the Dirichlet process (DP) results at all locations.

GS construct π(x) in a way such that it is associated with the realization of a point process.
Specifically, they consider a stationary Poisson process Φ and a sequence of sets U(x) for x ∈ D, the latter determining the relevant region for the ordering purpose. In the case of purely spatial problems, if x ∈ D ⊂ R^d for d ≥ 1, then GS suggest U(x) = D for all x ∈ D as a suitable construction of U(x). For time series problems they suggest D = R and U(x) = (−∞, x]. When x = (s′, t)′, that is, when x consists of both spatial and temporal co-ordinates, for our modeling purpose we use U(x) = D × (−∞, t].

Letting {z1, z2, ...} denote a realization of the stationary Poisson point process, the ordering π(x) is chosen to satisfy

‖x − z_{π_1(x)}‖ < ‖x − z_{π_2(x)}‖ < ‖x − z_{π_3(x)}‖ < ···,

where ‖·‖ is a distance measure and z_{π_i(x)} ∈ Φ ∩ U(x). Thus, although the set of probabilities {p_i(x); i = 1, 2, ...} remains the same for all locations, they are randomly permuted. This random permutation, in turn, induces spatial dependence. Assuming a homogeneous Poisson point process with intensity λ, the ODDP is characterized by G0, α, and λ. We express the dependence of the ODDP on these parameters by writing ODDP(αG0, λ).

Assuming that data {y1, ..., y_n} are available at sites {x1, ..., x_n}, GS embed the ODDP in a hierarchical Bayesian model:

y_i ∼ f_{θ_i}(·); θ_i ∼ G_{x_i}, independently; G_{x_i} ∼ ODDP(αG0, λ). (3.2)

Note that the same theory can be extended to space-time situations with x = (s′, t)′, where s stands for the spatial location and t stands for the time point.

Next, we introduce our proposed idea of kernel convolution of the ODDP.

3.2 Kernel convolution of ODDP

We consider the following model for the data Y = {y1, ..., y_n} at locations/times {x_i = (s′_i, t_i)′; i = 1, ..., n}:

y_i = f(x_i) + ε_i, (3.3)

where ε_i iid ∼ N(0, σ²), for unknown σ². We represent the spatio-temporal process f(x) as a convolution of the ODDP G_x with a smoothing kernel K(x, ·):

f(x) = ∫ K(x, θ) dG_x(θ) = Σ_{i=1}^∞ K(x, θ_{π_i(x)}) p_i(x), for all x ∈ D ⊆ R^d, (3.4)

d (≥ 1) being the dimension of x. The following theorem, the proof of which is presented in Section S-1 of the supplement, gives an expression for the expectation of f(x).

Theorem 1
Let ∫ |K(x, θ)| dG0(θ) < ∞. Then ∫ |K(x, θ)| dG_x(θ) < ∞ with probability one, and

E(f(x)) = E ∫ K(x, θ) dG_x(θ) = ∫ K(x, θ) d E G_x(θ) = ∫ K(x, θ) dG0(θ) = E_{G0} K(x, θ).

Before deriving the covariance structure of f(·), we define the necessary notation following GS. Let T(x1, x2) = {k : there exist i, j such that π_i(x1) = π_j(x2) = k}. For k ∈ T(x1, x2) and l = 1, 2, we further define A_{lk} = {π_j(x_l) : j < i, where π_i(x_l) = k}, S_k = A_{1k} ∩ A_{2k} and S′_k = A_{1k} ∪ A_{2k} − S_k. Then the following theorem, the proof of which is deferred to Section S-2 of the supplement, provides an expression for the covariance structure of f(·), which will be our reference point for arguments regarding nonstationarity and other desirable spatial properties in comparison with the existing methods.

Theorem 2 If ∫ |K(x1, θ)| dG0(θ) < ∞ and ∫ |K(x1, θ) K(x2, θ)| dG0(θ) < ∞, then, for a fixed ordering at x1 and x2,

Cov(f(x1), f(x2)) = Cov_{G0}(K(x1, θ), K(x2, θ)) × [2 / {(α + 1)(α + 2)}] Σ_{k ∈ T(x1, x2)} {α/(α + 2)}^{#S_k} {α/(α + 1)}^{#S′_k}, (3.5)

where

Cov_{G0}(K(x1, θ), K(x2, θ)) = ∫ K(x1, θ) K(x2, θ) dG0(θ) − E_{G0}(K(x1, θ)) E_{G0}(K(x2, θ)). (3.6)

Corollary 3
It follows from the above theorem that, for i = 1, 2, if ∫ K²(x_i, θ) dG0(θ) < ∞, then

Var(f(x_i)) = Var_{G0}(K(x_i, θ)) / (α + 1), (3.7)

and

Corr(f(x1), f(x2)) = Corr_{G0}(K(x1, θ), K(x2, θ)) × Corr(G_{x1}, G_{x2}), (3.8)

where

Corr(G_{x1}, G_{x2}) = {2/(α + 2)} Σ_{k ∈ T(x1, x2)} {α/(α + 2)}^{#S_k} {α/(α + 1)}^{#S′_k}. (3.9)

The expression for the correlation in (3.9) has been obtained by GS.

The above results provide an expression for the correlation conditional on a fixed ordering. To obtain the unconditional correlation it is necessary to marginalize the conditional correlation over the point process Φ. Following GS we also modify the notation as follows: we now let T(x1, x2) = Φ ∩ U(x1) ∩ U(x2) and A_{ℓk} = A_ℓ(z_k), where A_ℓ(z) = {w ∈ Φ ∩ U(x_ℓ) : ‖w − x_ℓ‖ < ‖z − x_ℓ‖}, for z ∈ Φ ∩ U(x_ℓ). As already mentioned in Section 3.1, when x = (s′, t)′, we define U(x) = D × (−∞, t]. Also, for z ∈ T(x1, x2), we let S(z) = A_1(z) ∩ A_2(z) and S′(z) = A_1(z) ∪ A_2(z) − S(z), which imply that S(z) = {w ∈ T(x1, x2) : ‖w − x1‖ < ‖z − x1‖ and ‖w − x2‖ < ‖z − x2‖}. We further define, as in GS, S_{−z}(z) and S′_{−z}(z) to be the translations of S(z) and S′(z), respectively, by −z. Then, in the case where Φ is a stationary Poisson process with intensity λ, the refined Campbell theorem yields

Corr(f(x1), f(x2)) = Corr_{G0}(K(x1, θ), K(x2, θ)) × {2λ/(α + 2)} ∫_{U(x1) ∩ U(x2)} ∫ {α/(α + 2)}^{φ_{−z}(S_{−z}(z))} {α/(α + 1)}^{φ_{−z}(S′_{−z}(z))} P_0(dφ) dz. (3.10)

In (3.10), P_0(dφ) is the Palm distribution of Φ at the origin, and φ_{−z} is the realization of Φ translated by −z.
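To make the ordering construction and the conditional correlation (3.9) concrete, the following minimal Python sketch (all names ours; U(x) taken as the whole domain, as in the purely spatial case) simulates a homogeneous Poisson process, computes the distance-based orderings π(x) at two sites, forms the sets S_k and S′_k, and evaluates (3.9):

```python
import math
import random

def poisson_rv(mean, rng):
    # Knuth's method for a Poisson count (adequate for moderate means).
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def sample_poisson_points(intensity, region, rng):
    # Homogeneous Poisson process on a rectangle region = ((x0, x1), (y0, y1)).
    (x0, x1), (y0, y1) = region
    n = poisson_rv(intensity * (x1 - x0) * (y1 - y0), rng)
    return [(rng.uniform(x0, x1), rng.uniform(y0, y1)) for _ in range(n)]

def ordering(x, points):
    # pi(x): point indices sorted by distance from x; with U(x) = D every
    # point of the realization participates in the ordering.
    return sorted(range(len(points)), key=lambda k: math.dist(x, points[k]))

def conditional_corr_G(x1, x2, points, alpha):
    # Conditional correlation (3.9) for a fixed point-process realization:
    # {2/(alpha+2)} sum_k {alpha/(alpha+2)}^{#S_k} {alpha/(alpha+1)}^{#S'_k}.
    pi1, pi2 = ordering(x1, points), ordering(x2, points)
    total = 0.0
    for k in range(len(points)):
        a1 = set(pi1[:pi1.index(k)])   # points preceding k in the order at x1
        a2 = set(pi2[:pi2.index(k)])   # points preceding k in the order at x2
        s_k, sp_k = a1 & a2, (a1 | a2) - (a1 & a2)
        total += ((alpha / (alpha + 2.0)) ** len(s_k)
                  * (alpha / (alpha + 1.0)) ** len(sp_k))
    return 2.0 / (alpha + 2.0) * total

rng = random.Random(42)
pts = sample_poisson_points(0.3, ((-10.0, 10.0), (-10.0, 10.0)), rng)
alpha = 1.0
c_same = conditional_corr_G((-5.0, -5.0), (-5.0, -5.0), pts, alpha)
c_near = conditional_corr_G((-5.0, -5.0), (-4.0, -5.0), pts, alpha)
c_far = conditional_corr_G((-5.0, -5.0), (5.0, 5.0), pts, alpha)
print(c_same, c_near, c_far)
```

At coincident sites the two orderings agree, every S′_k is empty, and the sum telescopes to essentially one; as the sites separate, the orderings disagree more and the correlation falls off, which is the mechanism behind Theorem 4.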
Note also that the second factor of the above correlation is the unconditional correlation between G_{x1} and G_{x2} (see GS). The following theorem, the proof of which is provided in Section S-3 of the supplement, shows that the above correlation structure of our kernel convolution based ODDP satisfies desirable properties.

Theorem 4 Corr(f(x1), f(x2)) → 1 as ‖x1 − x2‖ → 0, and Corr(f(x1), f(x2)) → 0 as ‖x1 − x2‖ → ∞.

Under a stationary Poisson process assumption for Φ, and for the particular specifications of U(x) mentioned in Section 3.1, the calculations of GS show that the second factor depends upon x1 and x2 only through ‖x1 − x2‖, leading to isotropy of the process. There does not seem to exist any result analogous to the refined Campbell theorem in the context of nonstationary Poisson processes which might allow one to construct a nonstationary correlation structure in this case. The analytic form of the ODDP correlation structure need not be available for other constructions of U(x) either. Isotropy results even in the case of the more flexible Cox processes.

On the other hand, our kernel convolution idea neatly solves this problem of attainment of nonstationarity via the first factor of our correlation structure given in (3.10). Indeed, the kernel K(x, θ) can be chosen in the spirit of Higdon et al. (1999), for instance, such that Corr_{G0}(K(x1, θ), K(x2, θ)) does not depend upon x1 − x2 alone. In other words, by simply controlling the kernel we can ensure nonstationarity of our process f(·) even if the underlying ODDP is stationary or even isotropic. Of course, our process can be made stationary as well by choosing the kernel, say, in the spirit of Higdon (1998), and setting U(x) to be of the forms specified by GS when x consists of either only spatial co-ordinates or only the temporal co-ordinate.
When x = (s′, t)′, we set U(x) = D × (−∞, t], as already mentioned before.

We further note that our general space-time correlation structure given by (3.8) is nonseparable; that is, in general, Corr(f(s1, t1), f(s2, t2)) ≠ Corr_1(s1, s2) × Corr_2(t1, t2), where Corr_1 and Corr_2 are spatial and temporal correlation structures, respectively. However, if desired, separability can easily be induced by allowing the kernel to depend only upon the spatial location and the ordering π to depend only upon time, or vice versa. Specifically, letting K(x, θ) = K(s, θ) and π(x) = π(t), we obtain

Corr(f(s1, t1), f(s2, t2)) = Corr_{G0}(K(s1, θ), K(s2, θ)) × Corr(G_{t1}, G_{t2}), (3.11)

and letting K(x, θ) = K(t, θ) and π(x) = π(s), we obtain

Corr(f(s1, t1), f(s2, t2)) = Corr_{G0}(K(t1, θ), K(t2, θ)) × Corr(G_{s1}, G_{s2}). (3.12)

In contrast, under the ODDP approach of GS, it is clear from the correlation structure that Corr(G_{x1}, G_{x2}) ≠ Corr_1(s1, s2) × Corr_2(t1, t2), showing that separability cannot be enforced if desired.

Thus, following our approach it is easy to construct nonparametric covariance structures that are either stationary or nonstationary, and, in turn, either separable or nonseparable, as desired. This illustrates the considerable flexibility inherent in our approach, while satisfying at the same time the desirable conditions that the correlation between f(x1) and f(x2) tends to one or to zero according as the distance between x1 and x2 tends to zero or to infinity.

In the next section we investigate the continuity and smoothness properties of our proposed model.

4. CONTINUITY AND SMOOTHNESS PROPERTIES OF OUR MODEL

For stationary models, properties like continuity and smoothness can be quite generally characterized by the continuity and smoothness of the correlation function.
In particular, continuity and smoothness of stationary processes typically depend upon the behaviour of the correlation function at zero; see Yaglom (1987a) and Yaglom (1987b) for details. For nonstationary processes, however, such elegant theory is not available. Indeed, the structure of the correlation function itself may be difficult to get hold of, rendering it difficult to investigate the properties of the underlying nonstationary stochastic process. For our purpose, we utilize the notions of almost sure continuity, mean square continuity and mean square differentiability of stochastic processes (see, for example, Stein (1999), Banerjee & Gelfand (2003)) to study the properties of our nonstationary spatio-temporal process.

Definition 5
A process {X(x), x ∈ R^d} is L2-continuous at x0 if lim_{x → x0} E[X(x) − X(x0)]² = 0. Continuity in the L2 sense is also referred to as mean square continuity and will be denoted by X(x) →_{L2} X(x0).

Definition 6 A process {X(x), x ∈ R^d} is almost surely continuous at x0 if X(x) → X(x0) a.s. as x → x0. If the process is almost surely continuous for every x0 ∈ R^d, then the process is said to have continuous realizations.

Theorem 7
Assume the following conditions:

(A1) For all x and θ, |K(x, θ)| < M for some M < ∞.

(A2) Given any θ, K(x, θ) is a continuous function of x.

Then f(·) is both almost surely continuous and mean square continuous in the interior of ∩_{k=1}^∞ A_{k, i_k}, where A_{k, i_k} = {x : π_k(x) = i_k}, with, for each k = 1, 2, ..., i_k ∈ {1, 2, ...} and i_k ≠ i_{k′} for any k ≠ k′. On the other hand, f(·) is almost surely discontinuous at any point x ∈ ∩_{k=1}^∞ A_{k, i_k} lying on the boundary of A_{k, i_k}, for any i_k.

See Section S-4 for a proof of this result. Now we examine the mean square differentiability of our process.
Definition 8
A process {X(x), x ∈ R^d} is said to be mean square differentiable at x0 if, for any direction u, there exists a process L_{x0}(u), linear in u, such that X(x0 + u) = X(x0) + L_{x0}(u) + R(x0, u), where R(x0, u)/‖u‖ →_{L2} 0 as ‖u‖ → 0.

Theorem 9
Assume the following conditions:

(B1) For all x and θ, |K(x, θ)| < M for some M < ∞.

(B2) Given any θ, K(x, θ) is a continuously differentiable function of x.

Then f(·) is mean square differentiable in the interior of ∩_{k=1}^∞ A_{k, i_k}.

See Section S-5 for a proof of this theorem.

5. TRUNCATION OF THE INFINITE SUMMAND

Since our proposed model f(x) = Σ_{i=1}^∞ K(x, θ_{π_i(x)}) p_i(x) is an infinite (random) series, for model-fitting purposes it is necessary either to truncate the series to f_N(x) = Σ_{i=1}^N K(x, θ_{π_i(x)}) p_i(x), where N is to be determined, or to implement variable-dimensional Markov chain methods where N is considered a random variable, so that the number of parameters associated with f(x) is also a random variable.

Although we will describe and implement the novel variable dimensional Markov chain methodology introduced by Das & Bhattacharya (2014b), which the authors refer to as Transdimensional Transformation based Markov Chain Monte Carlo (TTMCMC), we first prove a theorem with respect to truncation of the infinite random series. Note that in the context of the traditional Dirichlet process characterized by Sethuraman's stick-breaking construction (Sethuraman (1994)), which involves an infinite random series, Ishwaran & James (2001) proposed a method of truncating the infinite series.

We now state our theorem on truncation, the proof of which is provided in Section S-6 of the supplement. But before stating the theorem it is necessary to define some required notation. Let

P_N(x_j) = Σ_{i=1}^N K(x_j, θ_i) p_i, and P(x_j) = Σ_{i=1}^∞ K(x_j, θ_i) p_i,

where N needs to be determined. Also let P_N = (P_N(x1), ..., P_N(x_n))′ and P = (P(x1), ..., P(x_n))′, and denote by Θ_N and Θ the sets of random quantities (V_i, θ_i) associated with P_N and P, respectively. We define the following marginal densities of the vector of observations y = (y1, ..., y_n)′:

m_N(y) = ∫_{Θ_N} [y | P_N][P_N] dΘ_N = ∫_Θ [y | P_N][P] dΘ,

m_∞(y) = ∫_Θ [y | P][P] dΘ.

Theorem 10
Under the assumption that sup_θ K(x_i, θ) ≤ M for i = 1, ..., n, where M > 0 is a finite constant, we have

∫_{R^n} |m_N(y) − m_∞(y)| dy ≤ M n {α/(α + 2)}^N + 2 √(2/π) M n {α/(α + 1)}^N.
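Since the bound decays geometrically in N, the truncation level can be chosen as the smallest N for which the bound falls below a desired tolerance. A minimal sketch (function names ours, and the constants follow our reading of the bound in Theorem 10, which should be checked against the supplement):

```python
import math

def truncation_bound(N, M, n, alpha):
    # L1 bound between the truncated and untruncated prior predictive
    # densities; the exact constants follow our reconstruction of Theorem 10.
    return (M * n * (alpha / (alpha + 2.0)) ** N
            + 2.0 * math.sqrt(2.0 / math.pi) * M * n * (alpha / (alpha + 1.0)) ** N)

def choose_truncation(M, n, alpha, eps):
    # Smallest truncation level N whose error bound falls below eps.
    N = 1
    while truncation_bound(N, M, n, alpha) > eps:
        N += 1
    return N

# Example: kernel bound M = 1, n = 100 observations, alpha = 1,
# tolerance 1e-3 on the prior predictive discrepancy.
N = choose_truncation(M=1.0, n=100, alpha=1.0, eps=1e-3)
print(N, truncation_bound(N, 1.0, 100, 1.0))
```

The geometric decay means N grows only logarithmically in 1/ε, so modest truncation levels already make the truncated and untruncated prior predictives practically indistinguishable.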
6. CHOICE OF KERNEL, PRIOR DISTRIBUTIONS AND COMPUTATIONAL REGION

The choice of the kernel K(·, ·) plays a crucial role in nonstationary spatio-temporal data analysis. For instance, if K(x, θ) = K(x − θ), then the correlation between Y(x₁) and Y(x₂) turns out to be a function of x₁ − x₂, thus inducing stationarity. For the purpose of nonstationarity, it is necessary to make the parameters of the kernel depend upon space and time. In the spatial context such nonstationary kernels are considered in Higdon et al. (1999). In this paper, we consider a nonstationary space-time kernel; for the spatial part of the kernel we essentially adopt the dependence structure proposed by Higdon et al. (1999), and for the temporal part we allow the relevant coefficient to be time varying, modeled by a stationary Gaussian process.

In particular, we consider the following kernel for our applications:

K(s, t, θ, τ) = exp{ −(1/2)(s − θ)' Σ(s) (s − θ) − δ(t)|t − τ| },

where Σ(s) is a 2 × 2 positive definite matrix depending upon the location s, and δ(t) > 0 for every time point t. We assume that log(δ(t)) is a zero-mean Gaussian process with covariance c_δ(t₁, t₂) = σ²_δ exp{−(t₁ − t₂)²/a_δ}. Following Higdon et al. (1999) we set

Σ(s) = ϕ R(s)' D(s) R(s), with R(s) = ( cos α(s)  sin α(s) ; −sin α(s)  cos α(s) ),

where D(s) is the diagonal matrix with entries

√A + ‖ψ(s)‖² π/(π + ‖ψ(s)‖²) and √A + ‖ψ(s)‖² π/(π − ‖ψ(s)‖²),

‖ψ(s)‖² = ψ₁(s)² + ψ₂(s)², and α(s) = tan⁻¹(ψ₂(s)/ψ₁(s)). We assume, following Higdon et al. (1999), that ψ₁(·) and ψ₂(·) are independent and identical zero-mean Gaussian processes with covariance c_ψ(s₁, s₂) = σ²_ψ exp{−‖s₁ − s₂‖²/b_ψ}. Following Higdon et al. (1999) we put uniform priors, with lower bound 3, on ϕ, a_δ, and b_ψ; we set σ_δ = σ_ψ = 1. As proposed by Higdon et al. (1999), we set A = 3.5. Since in our applications we center and scale the observed time points, for τ we specify the N(0, 1) prior.

6.1 Elicitation of hyperparameters of the underlying ODDP

Choice of G₀: In our applications, we center and scale each of the two components {s₁ᵢ; i = 1, ..., n} and {s₂ᵢ; i = 1, ..., n} of the available spatial locations {sᵢ = (s₁ᵢ, s₂ᵢ); i = 1, ..., n}. Consequently, choosing G₀ to be the bivariate normal distribution with both means zero, both variances equal to one, and correlation ρ appears to be reasonable. We estimate ρ by the empirical correlation between {s₁ᵢ; i = 1, ..., n} and {s₂ᵢ; i = 1, ..., n}.

Choice of the prior on α: For the choice of prior distributions of the parameters associated with the ODDP we follow Griffin & Steel (2004) and GS. In particular, we put the inverted Beta prior on α, given by

p(α) = n₀^η Γ(2η) α^{η−1} / { Γ(η)² (α + n₀)^{2η} },

where the hyperparameter n₀ is the prior median of α. Note that the prior variance exists if η > 2. This prior implies that α/(α + n₀) follows a Beta(η, η) distribution.

Choice of the prior on λ: Note that, for small α, only the first few elements of the stick-breaking representation are important, so fewer points from the underlying Poisson process are needed to induce the second factor of the correlation structure (3.10), which roughly depends upon the ratio λ/(α + 1) for U(x) = D (spatial problem) and U(x) = (−∞, x] (temporal problem); see GS for the details. Thus a relatively small value of λ suffices in such cases. Similarly, when α is larger, a larger λ is necessary to obtain the same correlation. Keeping these points in mind, we select the log-normal prior for λ with mean log(α) and variance b_λ, say. For our applications, we choose b_λ = 20, so that we obtain a reasonably vague prior.

6.2 Computational region

Following GS, we consider a truncated region for the point process Z which includes the range of the observed x. This truncated region has been referred to as the computational region by GS.
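The space-time kernel introduced at the beginning of this section can be sketched concretely as follows. Here a user-supplied 2 × 2 positive definite matrix stands in for Σ(s) and a positive scalar stands in for δ(t); the Gaussian-process constructions of Σ(s) and δ(t) themselves are not reproduced.

```python
import numpy as np

def space_time_kernel(s, t, theta, tau, Sigma_s, delta_t):
    """K(s, t, theta, tau) = exp{-(1/2)(s - theta)' Sigma(s) (s - theta)
                                 - delta(t) |t - tau|}.

    Sigma_s: a 2x2 positive definite matrix standing in for Sigma(s);
    delta_t: a positive scalar standing in for delta(t)."""
    d = np.asarray(s, dtype=float) - np.asarray(theta, dtype=float)
    return float(np.exp(-0.5 * d @ Sigma_s @ d - delta_t * abs(t - tau)))
```

The kernel equals 1 at (s, t) = (θ, τ) and decays in both the (anisotropic) spatial distance and the temporal separation, which is what makes the induced covariance nonstationary and nonseparable when Σ(s) and δ(t) vary.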
In particular, we choose a bounding box of the form (a₁, b₁) × (a₂, b₂) × ··· × (a_d, b_d) as the computational region, where aᵢ = d_{aᵢ} − r and bᵢ = d_{bᵢ} + r. Here d_{aᵢ} and d_{bᵢ} are the minimum and the maximum of x in dimension i, and

r = 2 ( Γ(d/2 + 1)(α + 1) log(1/ε) / (π^{d/2} λ) )^{1/d},

which corresponds to ε = exp{ −(λ/(α + 1)) (π^{d/2}/Γ(d/2 + 1)) (r/2)^d }. See GS for justification of these choices.

7. JOINT POSTERIOR AND A BRIEFING OF TTMCMC FOR UPDATING PARAMETERS IN OUR VARIABLE DIMENSIONAL MODELING FRAMEWORK

Let k denote the random number of summands in

f_k(x) = Σ_{i=1}^k K(x, θ_{π_i(x)}) p_i(x), for all x ∈ D ⊆ R^d.  (7.1)

Let V = (V₁, ..., V_k), z = (z₁, ..., z_k), θ = (θ₁, θ₂), with θ₁ = (θ₁₁, ..., θ₁ₖ) and θ₂ = (θ₂₁, ..., θ₂ₖ). Let also ψ₁ = (ψ₁(s₁), ..., ψ₁(s_n)), ψ₂ = (ψ₂(s₁), ..., ψ₂(s_n)) and δ = (δ(t₁), ..., δ(t_n)). The joint posterior is of the form

π(k, V, z, θ₁, θ₂, ψ₁, ψ₂, δ, τ, σ, α, λ, b_ψ, a_δ | Y) ∝ π(k) π(V, z, θ₁, θ₂ | k) π(ψ₁, ψ₂, δ) π(τ, ϕ, b_ψ, a_δ) π(σ) π(α) π(λ | α) × L(V, z, θ₁, θ₂, σ | k, Y),  (7.2)

where L(V, z, θ₁, θ₂, σ | k, Y) is the joint normal likelihood of V, z, θ₁, θ₂, σ under the model

yᵢ = f_k(xᵢ) + εᵢ; εᵢ iid ∼ N(0, σ²); i = 1, ..., n,  (7.3)

conditional on f_k(·).

For our applications, as the prior π(k) on k we assume the discrete uniform prior on a finite range; in our applications k never even reached 30. Under π(V, z, θ₁, θ₂ | k), Vᵢ iid ∼ Beta(1, α), i = 1, ..., k; z are realizations from the Poisson process with intensity λ; and for i = 1, ..., k, (θ₁ᵢ, θ₂ᵢ) iid ∼ G₀. Under π(ψ₁, ψ₂, δ), ψ₁, ψ₂, δ are independent Gaussian processes, as detailed in Section 6. The prior distribution of τ, ϕ, b_ψ, a_δ, denoted by π(τ, ϕ, b_ψ, a_δ), is already provided in Section 6.
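The computational-region calculation of Section 6.2 can be sketched as follows, assuming the reconstructed GS relation r = 2(Γ(d/2+1)(α+1)log(1/ε)/(π^{d/2}λ))^{1/d}; the inputs below are illustrative.

```python
import math

def computational_region(data_min, data_max, alpha, lam, eps):
    """Bounding box (a_i, b_i) extending the observed range by r per dimension.

    Assumes (reconstructed from GS) the relation
    r = 2 * (Gamma(d/2 + 1) * (alpha + 1) * log(1/eps)
             / (pi**(d/2) * lam))**(1/d)."""
    d = len(data_min)
    r = 2.0 * (math.gamma(d / 2 + 1) * (alpha + 1) * math.log(1.0 / eps)
               / (math.pi ** (d / 2) * lam)) ** (1.0 / d)
    a = [lo - r for lo in data_min]
    b = [hi + r for hi in data_max]
    return a, b, r
```

Smaller ε (or smaller λ relative to α + 1) enlarges r, so the box always contains the observed range with a margin that shrinks as the Poisson intensity grows.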
For the error standard deviation σ, the prior, denoted by π(σ), is the log-normal distribution with parameters 0 and 1, so that the mean and variance of σ are about 1.6 and 5, respectively. These quantities appear to be reasonable, and yielded adequate inference.

In order to obtain samples from the joint posterior (7.2), which involves the variable-dimensional f_k(·), we implement the TTMCMC methodology. In a nutshell, TTMCMC updates all the parameters, both fixed and variable dimensional, as well as the number of parameters of the underlying posterior distribution, in a single block, using simple deterministic transformations of some low-dimensional random variable drawn from some fixed, but low-dimensional, arbitrary distribution defined on some relevant support. The idea is an extension of Transformation based Markov Chain Monte Carlo (TMCMC), introduced by Dutta & Bhattacharya (2014) for updating high-dimensional parameters with known dimensionality in a single block using simple deterministic transformations of some low-dimensional (usually one-dimensional) random variable having arbitrary distribution on some relevant support. The strategy of updating high and variable dimensional parameters using very low-dimensional random variables clearly reduces dimensionality dramatically, thus greatly improving acceptance rate, mixing properties, and computational speed. In Section S-7 of the supplement we provide a detailed overview of TTMCMC and propose a general algorithm (Algorithm S-7.1) with certain advantages, and in Section S-8 of the supplement we specialize the algorithm to our spatio-temporal modeling set-up, providing full updating details (Algorithm S-8.1).

8. SIMULATION STUDY

To illustrate the performance of our model we first create a synthetic data generating process which is nonstationary and non-Gaussian. One popular method of creating such a process is the kernel convolution approach.
However, since we have developed our spatio-temporal model itself using the kernel convolution approach, it is perhaps desirable to obtain the synthetic data from some nonstationary, non-Gaussian process created using some approach independent of the kernel convolution method. In Section 8.1 we detail such an approach. Then we fit our proposed model to the data, pretending that the data-generating process is unknown.

8.1 A nonstationary non-Gaussian data generating process

Let X(·) denote a stationary Gaussian process with mean function µ(t, s) = β₀ + β₁t + β₂s(1) + β₃s(2), with s = (s(1), s(2)), and covariance function

A(i, j) = c((tᵢ, sᵢ), (tⱼ, sⱼ)) = exp{ −c₀ √((tᵢ − tⱼ)² + (sᵢ₁ − sⱼ₁)² + (sᵢ₂ − sⱼ₂)²) },

for any tᵢ, tⱼ, sᵢ = (sᵢ₁, sᵢ₂), sⱼ = (sⱼ₁, sⱼ₂), where c₀ > 0 is a fixed decay constant.

Let X = (X(t₁, s₁), ..., X(t_n, s_n))' denote observed data points from the Gaussian process X at the design points {(tᵢ, sᵢ); i = 1, ..., n}. Let t = (t₁, ..., t_n)' and S = (s₁', ..., s_n')'. Further, let us denote by A = (A(i, j); i = 1, ..., n; j = 1, ..., n) the covariance matrix, and let µ = (µ(t₁, s₁), ..., µ(t_n, s_n))'. Then the posterior process [X(·) | X] is a nonstationary Gaussian process with mean function µ_X(t, s) = µ(t, s) + A₁₂A₂₂⁻¹(X − µ) and variance A₁₁ − A₁₂A₂₂⁻¹A₂₁, where A₁₁, A₁₂, A₂₁, A₂₂ denote the blocks of the covariance matrix of (X(t, s), X')'.

Let the posterior nonstationary Gaussian process [X(·) | X] be denoted by X*(·). Now consider another process Y(·) with mean function µ*(t, s) = X*(t, s) and covariance function c_Y((tᵢ, sᵢ), (tⱼ, sⱼ)) = exp{−c₁|X*(tᵢ, sᵢ) − X*(tⱼ, sⱼ)|}, for a fixed constant c₁ > 0. Then marginally, Y(·) is a nonstationary non-Gaussian process.

For our illustration we will simulate the synthetic dataset from the process Y(·).
The algorithm for generating this synthetic data is provided next.

8.2 Algorithm for generating the synthetic data

We performed the following steps to simulate a nonstationary 95 × 1 data vector.

1. We generate one time point tᵢ from each interval (i − 1, i], for i = 1, ..., 100, giving 100 time points.

2. We store the time points in a vector which we denote by t = (t(1), ..., t(100))'.

3. Next we generate 100 random points of the form {sᵢ = (s(1, i), s(2, i)); i = 1, ..., 100} from [0, 50] × [0, 50] as locations. We store the locations in a 100 × 2 matrix S = (s₁', ..., s₁₀₀')'.

4. Then we randomly choose 5 time points from t and 5 locations from S and omit these points from t and S. Thus we obtain a new time vector of length 95, say t₁, and a new matrix of locations of order 95 × 2, say S₁. We store the omitted time points in a separate vector, t₂, and the locations in a separate matrix, S₂, for future use.

5. Next we calculate the covariance matrix A = (A(i, j)) of order 100 × 100 based on t and S, where the (i, j)-th element of A is given by A(i, j) = 1 if i = j, and A(i, j) = exp(−c₀ √((t(i) − t(j))² + (s(1, i) − s(1, j))² + (s(2, i) − s(2, j))²)) if i ≠ j, for a fixed decay constant c₀ > 0.

6. We partition the above covariance matrix A into four component matrices A₁₁, A₁₂, A₂₁ and A₂₂, where A₁₁ is the 5 × 5 covariance matrix based on t₂ and S₂; A₂₂ is the 95 × 95 covariance matrix based on t₁ and S₁ (the form of the (i, j)-th element being the same as for the matrix A, except that t, S are replaced with t₂, S₂ for A₁₁, and with t₁, S₁ for A₂₂); and A₁₂ = A₂₁' is the 5 × 95 cross-covariance matrix, whose (i, j)-th element is given by A₁₂(i, j) = exp(−c₀ √((t₂(i) − t₁(j))² + (s₂(1, i) − s₁(1, j))² + (s₂(2, i) − s₁(2, j))²)), for i = 1, ..., 5 and j = 1, ..., 95.

7. We simulate a 5 × 1 vector x₂ from a 5-variate normal distribution with mean µ₂ = D₂β and covariance matrix A₁₁, where β is a fixed coefficient vector and D₂ = (1 : t₂ : S₂) is the corresponding design matrix. Note that D₂ is a 5 × 4 matrix.

8. Using x₂ we simulate a 95 × 1 vector x₍₉₅|₅₎ from a conditional 95-variate normal distribution with mean µ₍₉₅|₅₎ = D₁β + A₂₁A₁₁⁻¹(x₂ − µ₂) and covariance Σ₍₉₅|₅₎ = A₂₂ − A₂₁A₁₁⁻¹A₁₂, where D₁ is obtained exactly as D₂, only t₂ and S₂ are replaced with t₁ and S₁, respectively.

9. The last step is to simulate a 95 × 1 vector y₍₉₅|x₍₉₅|₅₎₎ conditionally on x₍₉₅|₅₎. We simulate it from a 95-variate normal distribution with mean µ_y proportional to x₍₉₅|₅₎ and covariance matrix Σ_y whose (i, j)-th element equals 1 if i = j and exp(−c₁ |x₍₉₅|₅₎(i) − x₍₉₅|₅₎(j)|) if i ≠ j, for i, j = 1, ..., 95.

8.3 Results of fitting our model to the simulated data

Note that for this problem the number of parameters to be updated ranges between 300 and 400. Our TTMCMC based model implementation took 35 minutes to yield 900,000 realizations following a burn-in of 100,000. Quite encouragingly, TTMCMC exhibited satisfactory acceptance rate and mixing properties.
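A condensed sketch of the synthetic data-generating algorithm above is given below. The decay/scale constants c0, c1, c2 and the first three regression coefficients are hypothetical stand-ins (their values are elided in the source), and instead of deleting 5 random design points we simply condition on the first 5.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 100, 5               # total design points; points conditioned on
c0, c1, c2 = 0.1, 0.1, 0.1  # hypothetical stand-in constants

# Steps 1-3: one time point per unit interval; locations on [0, 50] x [0, 50].
t = rng.uniform(0, 1, n) + np.arange(n)
S = rng.uniform(0, 50, (n, 2))
D_full = np.column_stack([np.ones(n), t, S])   # design matrix (1 : t : S)

# Step 5: stationary covariance A(i, j) = exp(-c0 * space-time distance).
dist = np.sqrt((t[:, None] - t[None, :]) ** 2
               + ((S[:, None, :] - S[None, :, :]) ** 2).sum(-1))
A = np.exp(-c0 * dist)

# Step 6 (condensed): condition on the first m points instead of m random ones.
A11 = A[:m, :m]
A12 = A[:m, m:]
A22 = A[m:, m:]

# Step 7: simulate the m conditioning values.
beta = np.array([0.1, 0.1, 0.1, 0.02])  # only the last entry survives the source
mu = D_full @ beta
x2 = rng.multivariate_normal(mu[:m], A11)

# Step 8: posterior (nonstationary) Gaussian process X* at the other points.
mu_cond = mu[m:] + A12.T @ np.linalg.solve(A11, x2 - mu[:m])
Sig_cond = A22 - A12.T @ np.linalg.solve(A11, A12)
x_star = rng.multivariate_normal(mu_cond, Sig_cond)

# Step 9: non-Gaussian layer whose covariance depends on X* itself.
Sig_y = np.exp(-c1 * np.abs(x_star[:, None] - x_star[None, :]))
y = rng.multivariate_normal(c2 * x_star, Sig_y)
```

The exponential kernel exp(−c|u − v|) is positive definite on the real line, so Σ_y is a valid covariance matrix even though it is built from the realized values of X*, which is what makes the resulting y non-Gaussian marginally.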
We assess the predictive power of our model with the leave-one-out cross-validation method. All the 95 cases were included in the 95% highest posterior density regions of the corresponding leave-one-out posterior predictive densities. Figure 10.1 displays the posterior predictive densities of six randomly selected locations, along with the true values, the latter denoted by the vertical lines. Thus, satisfactory performance of our proposed model is indicated by the results, particularly given the fact that our model does not assume knowledge of the true, data-generating, parametric model.
Though our simulation mechanism is completely different from our proposed model, the simulated data do exhibit the pattern that the correlations are close to zero for two widely separated locations. Indeed, from the structure of the covariance matrix Σ₍₉₅|₅₎, it is easily seen that the (i, j)-th element (i ≠ j) of Σ₍₉₅|₅₎ is close to zero whenever the distance between t(i) and t(j) and/or sᵢ and sⱼ is large.

We calculate the posterior densities of the correlation for different pairs of locations. In forming the pairs, we select nearby locations, as well as locations which are widely separated, such that we obtain both high and low correlation values under the true, data-generating model. It is evident from Figure 10.2 that the true correlations, ranging from small to high values, lie well within their respective 95% credible intervals, vindicating reasonable performance of our model in terms of capturing the true correlation structure.

It is worth mentioning that we repeated the simulation study with 1000 simulated data points. In this case the dimension of our spatio-temporal model varies randomly between 3000 and 3100. Even in this challenging high-dimensional scenario our TTMCMC based model implementation exhibited excellent performance. We omit the details for the sake of brevity.

9. REAL DATA ANALYSIS

According to the Clean Air Act, a certain air quality is to be maintained to protect public health and to maintain a proper survival environment for animals and vegetation. As a measure of the quality of air, the Clean Air Act sets standard limits for important air pollutants such as ozone. For our real data analysis we use the ozone metric called the W126 metric, which is the annual maximum consecutive three-month running total of weighted sums of hourly concentrations observed between 8AM and 8PM on each day during the high ozone season of April through October.
Also, we have information on the Community Multiscale Air Quality (CMAQ) model outputs, which are highly correlated with the ozone level, so we use these CMAQ values as a covariate in our model.

9.1 Calculating the W126 metric

Let Q_l(s, t) denote the observed ozone concentration level in parts per million (ppm) units at location s at hour l on day t, for t = 1, ..., T and l = 1, ..., 12, where T = 214 days between April 1 and October 31 in a given year. The hours are the 12 daylight hours between 8AM and 7PM. The W126 metric for site s is calculated as follows.

The weighted hourly metric is calculated using the transformation:

U_l(s, t) = Q_l(s, t) × ( 1 / (1 + 4403 × exp(−126 × Q_l(s, t))) ).

This logistic transformation truncates values smaller than 0.05 ppm towards zero, but does not much alter the magnitude of values larger than 0.10 ppm.

The daily index is obtained from the 12 hourly weighted values in each day as

Z(s, t) = Σ_{l=1}^{12} U_l(s, t).

The monthly index is calculated from the daily indices by summing and then adjusting for the number of days in the month as follows:

M_j(s) = Σ_{t ∈ month j} Z(s, t), j = 1, ..., 7,

where the summation is over all the days t that fall within the calendar month j.

The three-month running totals are centered at the last month and are obtained as:

M̄_j(s) = Σ_{k=j−2}^{j} M_k(s), j = 3, ..., 7.

Finally, the annual W126 index value is calculated by:

Y(s) = max_{j=3,...,7} M̄_j(s).

The secondary ozone standard is met at a site s in a given year when the true value of Y(s) is less than 21 ppm-hours.

Corresponding to each observed ozone concentration Q_l(s, t) we have a CMAQ model output v_l(A, t), where the site s is contained in the unique grid cell A. Using the output v_l(A, t) and the above details, daily and annual indexes of CMAQ values, namely X(A, t) and X(A), are constructed.

We have the data on annual indexes of ozone values (W126) Y(s), and corresponding CMAQ X(A) values, for 76 locations in the US. We now fit our model to this real data set. Here we model the data on the log scale; we also use the log transformation of the CMAQ values. In other words, we consider

log(Y(sᵢ)) = α₀ + α₁ log(X(Aᵢ)) + f(sᵢ) + εᵢ, i = 1, ..., 76,

where α₀ and α₁ are regression coefficients, f(sᵢ) is an annual-level spatial random effect at location sᵢ, and εᵢ is an independent nugget effect with variance σ². Here f(sᵢ) is our proposed spatial model based on kernel convolution with ODDP. All the prior distributions are the same as mentioned before.
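The W126 construction described above can be sketched as follows; the month lengths assume a complete April–October record (214 days), and the within-month adjustment for missing days is omitted.

```python
import numpy as np

def w126(Q):
    """Annual W126 index from hourly ozone concentrations (ppm).

    Q: array of shape (214, 12) -- the 12 daylight hourly values for each
       of the 214 days of April-October (assumed complete)."""
    month_len = [30, 31, 30, 31, 31, 30, 31]           # April .. October
    U = Q / (1.0 + 4403.0 * np.exp(-126.0 * Q))        # weighted hourly values
    Z = U.sum(axis=1)                                  # daily indices Z(s, t)
    edges = np.concatenate([[0], np.cumsum(month_len)])
    M = np.array([Z[edges[j]:edges[j + 1]].sum() for j in range(7)])
    running = M[:-2] + M[1:-1] + M[2:]                 # three-month running totals
    return running.max()                               # annual W126 index
```

The sigmoidal weight leaves high concentrations nearly untouched (the weight at 0.10 ppm is about 0.985) while shrinking low concentrations essentially to zero, so a season of uniformly low ozone yields a negligible W126 value.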
For the additional parameters α₀ and α₁, we use vague zero-mean normal prior distributions, and for σ we use the log-normal prior with mean zero and variance 10.

As before, we assess the predictive power of the model using leave-one-out cross-validation. For all the locations, the true value of the ozone concentration lies within the 95% credible interval of the respective cross-validation posterior. This is summarized in the top panel of Figure 10.3, where the middle surface represents the observed data; the lower and the upper surfaces represent the lower and the upper 95% credible regions associated with the respective leave-one-out posterior predictive densities. The surface in the middle of the bottom panel shows the posterior medians, while the lower and the upper surfaces denote the 95% credible intervals as before. For the convenience of visually comparing the observed data and the posterior medians, we include Figure 10.4, which also contains the 95% credible intervals. The plots clearly show that our proposed model is quite adequate for the ozone data.

Posterior densities of correlations, for 6 pairs of sites, are shown in Figure 10.5. All of them assign high posterior probability to roughly the same range of high correlation values.

In our approach, the weights pᵢ of the ODDP and the kernels are indexed by θᵢ, where θᵢ iid ∼ G₀ for i = 1, 2, .... We have already obtained some sufficient conditions ensuring that our model converges in the L_p norm and the Besov semi-norm. These results make our proposed model a promising candidate for function estimation.

ACKNOWLEDGMENTS

We are grateful to Prof. Sujit Sahu for kindly providing us with the ozone data set and to Mr. Suman Guha for helpful discussions.

Supplementary Material

S-1.
PROOF OF THEOREM 1

Note that

∫ |K(x, θ)| dG_x(θ) = Σ_{i=1}^∞ |K(x, θ_{π_i(x)})| p_i,   (S-1.1)

so that the monotone convergence theorem yields

E( ∫ |K(x, θ)| dG_x(θ) ) = Σ_{i=1}^∞ E(|K(x, θ_{π_i(x)})|) E(p_i) = ∫ |K(x, θ)| dG₀(θ) Σ_{i=1}^∞ E(p_i) = ∫ |K(x, θ)| dG₀(θ) [since Σ_{i=1}^∞ p_i = 1].

Here we have used the fact that the θᵢ's and Vᵢ's are independent. Therefore ∫ |K(x, θ)| dG_x(θ) is finite with probability one, and hence

f(x) = ∫ K(x, θ) dG_x(θ) = Σ_{i=1}^∞ K(x, θ_{π_i(x)}) p_i(x)

is absolutely convergent with probability one. Since this series is bounded by (S-1.1), which is integrable, the dominated convergence theorem implies

E(f(x)) = Σ_{i=1}^∞ E(K(x, θ_{π_i(x)})) E(p_i) = ∫ K(x, θ) dG₀(θ). ∎

S-2. PROOF OF THEOREM 2

We have

f(x₁) f(x₂) = ∫ K(x₁, θ) dG_{x₁}(θ) ∫ K(x₂, θ) dG_{x₂}(θ) = Σᵢ Σⱼ K(x₁, θ_{π_i(x₁)}) K(x₂, θ_{π_j(x₂)}) p_i(x₁) p_j(x₂),   (S-2.1)

since both series are absolutely convergent with probability one. This is bounded in absolute value by

Σᵢ Σⱼ |K(x₁, θ_{π_i(x₁)}) K(x₂, θ_{π_j(x₂)})| p_i(x₁) p_j(x₂).
(S-2.2)

Now if this is an integrable random variable, we may take the expectation of (S-2.1) inside the summation signs and obtain

E(f(x₁) f(x₂)) = Σᵢ Σⱼ E(K(x₁, θ_{π_i(x₁)}) K(x₂, θ_{π_j(x₂)})) E(p_i(x₁) p_j(x₂)).

Splitting the double sum according to whether π_i(x₁) = π_j(x₂) or not, and noting that θ_{π_i(x₁)} and θ_{π_j(x₂)} are independent draws from G₀ when π_i(x₁) ≠ π_j(x₂) and are the same draw when π_i(x₁) = π_j(x₂), we obtain

E(f(x₁) f(x₂)) = E(K(x₁, θ)) E(K(x₂, θ)) Σ_{π_i(x₁) ≠ π_j(x₂)} E(p_i(x₁) p_j(x₂)) + E(K(x₁, θ) K(x₂, θ)) Σ_{π_i(x₁) = π_j(x₂)} E(p_i(x₁) p_j(x₂)).

Adding and subtracting E(K(x₁, θ)) E(K(x₂, θ)) Σ_{π_i(x₁) = π_j(x₂)} E(p_i(x₁) p_j(x₂)), and using Σ_{i,j} E(p_i(x₁) p_j(x₂)) = 1, this reduces to

E(f(x₁) f(x₂)) = Cov_{G₀}(K(x₁, θ), K(x₂, θ)) Σ_{π_i(x₁) = π_j(x₂)} E(p_i(x₁) p_j(x₂)) + E(K(x₁, θ)) E(K(x₂, θ)).

An analogous argument shows that (S-2.2) is integrable, since, by our assumptions,

∫ |K(x, θ)| dG₀(θ) < ∞ and ∫ |K(x₁, θ) K(x₂, θ)| dG₀(θ) < ∞.

To obtain the complete analytical expression for E(f(x₁) f(x₂)) we need to calculate Σ_{π_i(x₁) = π_j(x₂)} E(p_i(x₁) p_j(x₂)).
Define T(x₁, x₂) = {k : there exist i, j such that π_i(x₁) = π_j(x₂) = k}. For k ∈ T(x₁, x₂) we further define A_{lk} = {π_j(x_l) : j < i, where π_i(x_l) = k}, for l = 1, 2, S_k = A_{1k} ∩ A_{2k}, and S'_k = (A_{1k} ∪ A_{2k}) \ S_k. Then it can be easily shown that

Σ_{π_i(x₁) = π_j(x₂)} E(p_i(x₁) p_j(x₂)) = (2 / ((α + 1)(α + 2))) Σ_{k ∈ T(x₁,x₂)} (α/(α + 2))^{|S_k|} (α/(α + 1))^{|S'_k|}.

Hence

Cov(f(x₁), f(x₂)) = Cov_{G₀}(K(x₁, θ), K(x₂, θ)) × (2 / ((α + 1)(α + 2))) × Σ_{k ∈ T(x₁,x₂)} (α/(α + 2))^{|S_k|} (α/(α + 1))^{|S'_k|}. ∎

S-3. PROOF OF THEOREM 4

In this context, it is more convenient to deal with the notation used in the context of Theorem 2. Note that as ‖x₁ − x₂‖ → 0, |S_k| → k − 1 and |S'_k| → 0, and hence, for any realization of the point process,

Corr(G_{x₁}, G_{x₂}) = (2/(α + 2)) Σ_{k ∈ T(x₁,x₂)} (α/(α + 2))^{|S_k|} (α/(α + 1))^{|S'_k|} → (2/(α + 2)) Σ_{k=1}^∞ (α/(α + 2))^{k−1} = (2/(α + 2)) × (α + 2)/2 = 1.

Since Corr_{G₀}(K(x₁, θ), K(x₂, θ)) → 1 as ‖x₁ − x₂‖ → 0, it follows that Corr(f(x₁), f(x₂)) → 1 as ‖x₁ − x₂‖ → 0. On the other hand, as ‖x₁ − x₂‖ → ∞, T(x₁, x₂) is at most finite and |S'_k| → ∞. This implies that Corr(G_{x₁}, G_{x₂}) → 0 and Corr(f(x₁), f(x₂)) → 0. Hence, by the dominated convergence theorem it follows that the unconditional correlation between f(x₁) and f(x₂) goes to 1 and 0, respectively, as ‖x₁ − x₂‖ → 0 and ‖x₁ − x₂‖ → ∞. ∎

S-4. PROOF OF THEOREM 7

Since for each x, f(x) = Σ_{i=1}^∞ K(x, θ_{π_i(x)}) p_i(x), and each x must satisfy π_k(x) = i_k for every k = 1, 2, ..., where i_k ∈ {1, 2, ...} (i_k ≠ i_{k'} for any k ≠ k'), we must have x ∈ ∩_{k=1}^∞ A_{k i_k}. For simplicity, but without loss of generality, let i_k = k for k = 1, 2, .... Then for x ∈ ∩_{k=1}^∞ A_{k i_k} it holds that K(x, θ_{π_i(x)}) = K(x, θ_i) and p_i(x) = V_i ∏_{j<i} (1 − V_j).

Turning to mean square differentiability, under conditions (B1) and (B2) a first-order Taylor expansion gives f(x + u) − f(x) − u'∇f(x) = R(x, u), with R(x, u)/‖u‖ → 0 as ‖u‖ → 0. Hence, using the dominated convergence theorem it follows that

lim_{‖u‖→0} E[ (f(x + u) − f(x) − u'∇f(x)) / ‖u‖ ]² = lim_{‖u‖→0} E[ R(x, u)/‖u‖ ]² = 0.

Hence, f(·) is mean square differentiable in the interior of ∩_{k=1}^∞ A_{k i_k}. ∎

S-6. PROOF OF THEOREM 10

For our purpose, we first state and prove a lemma.
Lemma 11. Let p_k denote the random weights from an ODDP. For each positive integer N ≥ 1 and each positive integer r ≥ 1, let

T_N(r, α) = ( Σ_{k=N}^∞ p_k )^r, and U_N(r, α) = Σ_{k=N}^∞ p_k^r.

Then

E(T_N(r, α)) = (α/(α + r))^{N−1}, and E(U_N(r, α)) = (α/(α + r))^{N−1} Γ(r) Γ(α + 1)/Γ(α + r).

Proof.
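Lemma 11's formula for E(T_N(r, α)) can be checked by Monte Carlo, using the identity Σ_{k≥N} p_k = Π_{k<N}(1 − V_k) for stick-breaking weights with V_k ∼ Beta(1, α):

```python
import numpy as np

def exact_T(alpha, r, N):
    """Closed form from Lemma 11: E(T_N(r, alpha)) = (alpha/(alpha + r))**(N - 1)."""
    return (alpha / (alpha + r)) ** (N - 1)

def mc_T(alpha, r, N, n_sim=200_000, seed=0):
    """Monte Carlo estimate of E[(sum_{k >= N} p_k)^r] via the stick-breaking
    identity sum_{k >= N} p_k = prod_{k < N} (1 - V_k), V_k ~ Beta(1, alpha)."""
    rng = np.random.default_rng(seed)
    V = rng.beta(1.0, alpha, size=(n_sim, N - 1))
    return float(np.mean(np.prod(1.0 - V, axis=1) ** r))
```

The closed form follows from independence of the V_k's together with E(1 − V_k)^r = α/(α + r); the Monte Carlo estimate agrees with it to within simulation error.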
Let P be a specific random measure from an ODDP. Then

P(·) = V_{π₁} δ_{θ_{π₁}}(·) + (1 − V_{π₁}) ( V*_{π₁} δ_{θ*_{π₁}}(·) + (1 − V*_{π₁}) V*_{π₂} δ_{θ*_{π₂}}(·) + (1 − V*_{π₁})(1 − V*_{π₂}) V*_{π₃} δ_{θ*_{π₃}}(·) + ··· ),

where V*_{π_k} = V_{π_{k+1}} are independent Beta(1, α) random variables and θ*_{π_k} = θ_{π_{k+1}} are iid G₀. So we have

P(·) =_D V_{π₁} δ_{θ_{π₁}}(·) + (1 − V_{π₁}) P*(·),

where V_{π₁}, θ_{π₁} and P*(·) are independent and P*(·) is an ODDP. Similarly we can show that

U₁(r, α) =_D V_{π₁}^r + (1 − V_{π₁})^r U₁(r, α),

where on the right-hand side V_{π₁} and U₁(r, α) are mutually independent. Therefore, taking expectations,

E(U₁(r, α)) = Γ(r + 1)Γ(α + 1)/Γ(α + r + 1) + (α/(α + r)) E(U₁(r, α)),

so that

E(U₁(r, α)) = Γ(r)Γ(α + 1)/Γ(α + r).   (S-6.1)

Furthermore, for N ≥ 2, we have

U_N(r, α) =_D (1 − V_{π₁})^r ··· (1 − V_{π_{N−1}})^r U₁(r, α), so that E(U_N(r, α)) = ( ∏_{k=1}^{N−1} E(1 − V_{π_k})^r ) E(U₁(r, α)).

Then using (S-6.1) we have

E(U_N(r, α)) = (α/(α + r))^{N−1} Γ(r)Γ(α + 1)/Γ(α + r).

Now, similarly,

T_N(r, α) =_D (1 − V_{π₁})^r ··· (1 − V_{π_{N−1}})^r T₁(r, α) = (1 − V_{π₁})^r ··· (1 − V_{π_{N−1}})^r,

since T₁(r, α) = 1, so that E(T_N(r, α)) = (α/(α + r))^{N−1}. Hence the lemma is proved. ∎

We now proceed to the proof of Theorem 10. Note that

|m_N(y) − m_∞(y)| = | ∫_Θ [y | P_N][P] dΘ − ∫_Θ [y | P][P] dΘ | ≤ ∫_Θ |[y | P_N] − [y | P]| [P] dΘ.   (S-6.2)

Now we expand [y | P_N] around P using the multivariate Taylor series expansion up to the second order:

[y | P_N] = [y | P] + (∂[y | P_N]/∂P_N)|_{P_N = P} (P_N − P) + (1/2)(P_N − P)' (∂²[y | P_N]/∂P_N²)|_{P_N = P*} (P_N − P),

where P* lies between P and P_N, that is, ‖P − P*‖ ≤ ‖P − P_N‖. Noting that [y | P_N] and [y | P] are multivariate normal densities with means P_N and P, respectively, and variance σ²I, where I is the n × n identity matrix (for notational simplicity we take σ = 1 in what follows), we have

(∂[y | P_N]/∂P_N)|_{P_N = P} = [y | P](y − P)', and (∂²[y | P_N]/∂P_N²)|_{P_N = P*} = [y | P*](y − P*)(y − P*)' − [y | P*] I.

Substituting these into (S-6.2), integrating over y, and interchanging the order of integration using Fubini's theorem, we obtain

∫_{R^n} |m_N(y) − m_∞(y)| dy ≤ ∫ [ ∫_{R^n} [y | P] |(y − P)'(P_N − P)| dy + ∫_{R^n} (P_N − P)' { [y | P*](y − P*)(y − P*)' + [y | P*] I } (P_N − P) dy ] [P] dΘ = 2 E_Θ [ √(2/π) Σ_{i=1}^n |P_N(x_i) − P(x_i)| + Σ_{i=1}^n (P_N(x_i) − P(x_i))² ].   (S-6.3)

Now,

|P_N(x_i) − P(x_i)| ≤ M ( |p_N^N(x_i) − p_N^∞(x_i)| + Σ_{k=N+1}^∞ p_k(x_i) ),

where p_N^N(x_i) = 1 − Σ_{k=1}^{N−1} p_k(x_i) = 1 − Σ_{k=1}^{N−1} V_{π_k(x_i)} ∏_{j<k} (1 − V_{π_j(x_i)}) is the N-th weight after truncation and p_N^∞(x_i) is the corresponding untruncated weight. Applying Lemma 11 with r = 1 and r = 2 to the two terms of (S-6.3) now yields the bound of Theorem 10. ∎
10 if ζ i = 0 − ζ i = 1 . With this definition of ζ c , T ζ c can be interpreted as the conjugate of T ζ .Since 3 k values of ζ are possible, it is clear that T , via ζ , induces 3 k many types of ‘moves’of the forms { T ζ i ; i = 1 , . . . , k } on the state-space. Suppose now that there is a subset Y of D such that the sets T ζ i ( x , Y ) and T ζ j ( x , Y ) are disjoint for every ζ i (cid:54) = ζ j . In fact, Y denotes the support of the distribution g ( · ) from which (cid:15) is simulated.S-7.2 Illustration of TTMCMC with a simple exampleLet us now illustrate the main idea of TTMCMC informally using the additive transforma-tion. Although the example we illustrate TTMCMC with is borrowed from Das & Bhat-tacharya (2014 b ), the algorithm we now present is somewhat different from that of Das &Bhattacharya (2014 b ). Assume that the current state is x = ( x , x ) ∈ R . We first ran-domly select u = ( u , u , u ) ∼ M ultinomial ( w b , w d , w nc ), where w b , w d , w nc ( >
$u_3 = 0$) such that $w_b + w_d + w_{nc} = 1$ are the probabilities of the birth, death, and no-change moves, respectively. That is, if $u_1 = 1$ we increase the dimensionality from 2 to 3; if $u_2 = 1$ we decrease the dimensionality from 2 to 1; and if $u_3 = 1$ we keep the dimensionality unchanged. In the latter case, when the dimensionality is unchanged, the acceptance probability remains the same as in TMCMC, as provided in Algorithm 3.1 of Dutta & Bhattacharya (2014).

If $u_1 = 1$, we can increase the dimensionality by first selecting one of $x_1$ and $x_2$ with probability $1/2$ each. Assuming that $x_1$ has been selected, we then construct the move-type
$$T_{b,\zeta}(x,\epsilon) = (x_1 + a_1\epsilon,\; x_1 - a_1\epsilon,\; x_2 + \zeta a_2\epsilon) = \big(g_{1,\zeta=1}(x_1,\epsilon),\; g_{1,\zeta^c=-1}(x_1,\epsilon),\; g_{2,\zeta}(x_2,\epsilon)\big),$$
say. Here, as in TMCMC, we draw $\epsilon \sim g(\cdot)$, where $g(\cdot)$ is supported on the positive part of the real line, and draw $\zeta = 1$ with probability $p$ and $\zeta = -1$ with probability $q = 1 - p$. Note that the value $\zeta = 0$ is redundant for the additive transformation (see Dutta & Bhattacharya (2014) for the details) and so is omitted here. We re-label $x' = T_{b,\zeta}(x,\epsilon) = (x_1 + a_1\epsilon, x_1 - a_1\epsilon, x_2 + \zeta a_2\epsilon)$ as $(x'_1, x'_2, x'_3)$. Thus, $T_{b,\zeta}(x,\epsilon)$ increases the dimension from 2 to 3. We accept this birth move with probability
$$a_b(x,\epsilon) = \min\left\{1,\; \frac{w_d}{w_b} \times \frac{p^{I_{\{1\}}(\zeta^c)}\, q^{I_{\{-1\}}(\zeta^c)}}{p^{I_{\{1\}}(\zeta)}\, q^{I_{\{-1\}}(\zeta)}} \times \frac{\pi(x_1 + a_1\epsilon,\; x_1 - a_1\epsilon,\; x_2 + \zeta a_2\epsilon)}{\pi(x_1, x_2)} \times \left|\frac{\partial\, T_{b,\zeta}(x,\epsilon)}{\partial(x,\epsilon)}\right|\right\}. \quad \text{(S-7.1)}$$
In (S-7.1),
$$\left|\frac{\partial\, T_{b,\zeta}(x,\epsilon)}{\partial(x,\epsilon)}\right| = \left|\frac{\partial(x_1 + a_1\epsilon,\; x_1 - a_1\epsilon,\; x_2 + \zeta a_2\epsilon)}{\partial(x_1, x_2, \epsilon)}\right| = \left|\det\begin{pmatrix} 1 & 0 & a_1 \\ 1 & 0 & -a_1 \\ 0 & 1 & \zeta a_2 \end{pmatrix}\right| = 2a_1. \quad \text{(S-7.2)}$$
Now let us illustrate the problem of returning to $x = (x_1, x_2)\ (\in \mathbb{R}^2)$ from $T_{b,\zeta}(x,\epsilon) = (x_1 + a_1\epsilon, x_1 - a_1\epsilon, x_2 + \zeta a_2\epsilon)\ (\in \mathbb{R}^3)$. For our purpose, in this paper, we select one of the first two elements of $T_{b,\zeta}(x,\epsilon)$ with the same probability. Suppose that we select $x_1 + a_1\epsilon$ with probability $1/2$. We then deterministically choose its right-adjacent element $x_1 - a_1\epsilon$, and form the average $x^* = \big((x_1 + a_1\epsilon) + (x_1 - a_1\epsilon)\big)/2 = x_1$. For non-additive transformations we can consider the average of the backward moves of the selected element and its right-adjacent element. Even in this additive-transformation example, after simulating $\epsilon$ as before we can consider the respective backward moves of $x_1 + a_1\epsilon$ and $x_1 - a_1\epsilon$, both yielding $x_1$, and then take the average, denoted by $x^*$. For the remaining element $x_2 + \zeta a_2\epsilon$, we need to simulate $\zeta^c$ and then consider the move $(x_2 + \zeta a_2\epsilon) + \zeta^c a_2\epsilon = x_2$. Thus, we can return to $(x_1, x_2)$ using this strategy.

Letting $x' = (x'_1, x'_2, x'_3) = (x_1 + a_1\epsilon, x_1 - a_1\epsilon, x_2 + \zeta a_2\epsilon)$, and denoting the average involving the first two elements by $x^*$, the death move is then given by $x'' = T_{d,\zeta}(x',\epsilon) = (x^*,\; x'_3 + \zeta^c a_2\epsilon) = \big(\frac{x'_1 + x'_2}{2},\; x'_3 + \zeta^c a_2\epsilon\big)$. Now observe that for returning to $(x'_1, x'_2)$ from $x^*$, we must have $x^* + a_1\epsilon^* = x'_1$ and $x^* - a_1\epsilon^* = x'_2$, which yield $\epsilon^* = (x'_1 - x'_2)/(2a_1)$. Hence, the Jacobian associated with the death move in this case is given by
$$\left|\frac{\partial\big(T_{d,\zeta}(x',\epsilon), \epsilon^*, \epsilon\big)}{\partial(x',\epsilon)}\right| = \left|\frac{\partial\big(\frac{x'_1 + x'_2}{2},\; x'_3 + \zeta^c a_2\epsilon,\; \frac{x'_1 - x'_2}{2a_1},\; \epsilon\big)}{\partial(x'_1, x'_2, x'_3, \epsilon)}\right| = \left|\det\begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & 1 & \zeta^c a_2 \\ \tfrac{1}{2a_1} & -\tfrac{1}{2a_1} & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\right| = \frac{1}{2a_1}. \quad \text{(S-7.3)}$$
We accept this death move with probability
$$a_d(x'', \epsilon, \epsilon^*) = \min\left\{1,\; \frac{w_b}{w_d} \times \frac{P(\zeta^c)}{P(\zeta)} \times \frac{\pi(x'')}{\pi(x')} \times \left|\frac{\partial\big(T_{d,\zeta}(x',\epsilon), \epsilon^*, \epsilon\big)}{\partial(x',\epsilon)}\right|\right\} = \min\left\{1,\; \frac{w_b}{w_d} \times \frac{p^{I_{\{1\}}(\zeta^c)}\, q^{I_{\{-1\}}(\zeta^c)}}{p^{I_{\{1\}}(\zeta)}\, q^{I_{\{-1\}}(\zeta)}} \times \frac{\pi(x'')}{\pi(x')} \times \frac{1}{2a_1}\right\}.$$

In general, $x \in \mathbb{R}^{mk}$ may be of the form $(x_1, x_2, \ldots, x_m)$, where $x_\ell = (x_{\ell,1}, x_{\ell,2}, \ldots, x_{\ell,k})$ for $\ell = 1, 2, \ldots, m$, with $m \geq 1$. If the dimension of any $x_\ell$ is changed, then the dimensions of all other $x_{\ell'}$, $\ell' \neq \ell$, must also change accordingly. For instance, in our model, where we have summands with an unknown number of components and the $i$-th component is characterized by the parameters associated with ODDP $(\theta_{1i}, \theta_{2i}, V_i, z_i)$, when the dimension of the current $k$-dimensional vector of the location parameter of the central distribution $(\theta_{11}, \ldots, \theta_{1k})$ is increased by one, then one must simultaneously increase the dimension of the other set of the current $k$-dimensional location parameter $(\theta_{21}, \ldots, \theta_{2k})$, the $k$-dimensional vector of the associated point process $(z_1, \ldots, z_k)$, as well as the $k$-dimensional mass vector $(V_1, \ldots, V_k)$, by one.
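To make the two-dimensional example above concrete, the following Python sketch (our own illustration; the scales $a_1, a_2$, the choice $p = 1/2$, and the particular draw of $\epsilon$ are assumed placeholder values, not prescribed by the paper) implements the birth move $T_{b,\zeta}$ and the death move $T_{d,\zeta}$, and verifies numerically that the death move undoes the birth move and that the Jacobians (S-7.2) and (S-7.3) are reciprocals:

```python
import random

a1, a2 = 0.5, 0.7   # illustrative scaling constants (assumed, not from the paper)
p = 0.5             # P(zeta = 1); q = 1 - p

def birth(x1, x2, eps, zeta):
    """T_{b,zeta}: split x1 into two elements and additively move x2; R^2 -> R^3."""
    return (x1 + a1 * eps, x1 - a1 * eps, x2 + zeta * a2 * eps)

def death(x1p, x2p, x3p, eps, zeta_c):
    """T_{d,zeta}: average the first two elements, reverse-move the third; R^3 -> R^2."""
    x_star = (x1p + x2p) / 2.0
    return (x_star, x3p + zeta_c * a2 * eps)

# Jacobians derived in (S-7.2) and (S-7.3): they are reciprocals
J_birth = 2.0 * a1
J_death = 1.0 / (2.0 * a1)
assert abs(J_birth * J_death - 1.0) < 1e-12

# Check that the death move exactly undoes the birth move
random.seed(0)
eps = abs(random.gauss(0.0, 1.0))        # eps ~ g(.) supported on the positive half-line
zeta = 1 if random.random() < p else -1
x = (1.3, -0.4)
xp = birth(x[0], x[1], eps, zeta)
back = death(xp[0], xp[1], xp[2], eps, -zeta)   # zeta^c = -zeta reverses the additive move
assert abs(back[0] - x[0]) < 1e-12 and abs(back[1] - x[1]) < 1e-12
```

Since $2a_1 \times \frac{1}{2a_1} = 1$, the birth and death Jacobians cancel over a birth-then-death round trip, which is what makes the pair of moves mutually reversible.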
In Section S-7.3 we present a TTMCMC algorithm (Algorithm S-7.1) for situations of this kind, and show that detailed balance holds (irreducibility and aperiodicity hold by the same arguments provided in Das & Bhattacharya (2014b)). It is worth mentioning that although Das & Bhattacharya (2014b) provide a TTMCMC algorithm for these situations (Algorithm 5.1 of their paper), their algorithm is somewhat different from ours: for the death move, we select only one element randomly, then choose its right-adjacent element, take backward transformations of both, and finally take the average. Das & Bhattacharya (2014b), on the other hand, select two elements randomly without replacement. This difference between the algorithms is reflected in the acceptance ratios – our algorithm is slightly simpler in that the random selection probabilities do not appear in our acceptance ratio, unlike that of Das & Bhattacharya (2014b).

S-7.3 General TTMCMC algorithm for jumping more than one dimension at a time when several sets of parameters are related

Algorithm S-7.1
General TTMCMC algorithm for jumping $m$ dimensions with $m$ related sets of co-ordinates.
• Let the initial value be $x^{(0)} \in \mathbb{R}^{mk}$, where $k \geq 1$.
• For $t = 0, 1, 2, \ldots$
1. Generate $u = (u_1, u_2, u_3) \sim \text{Multinomial}(1;\, w_{b,k}, w_{d,k}, w_{nc,k})$.
2. If $u_1 = 1$ (increase dimension from $mk$ to $m(k+1)$), then
(a) Randomly select one co-ordinate from $x^{(t)}_1 = (x^{(t)}_{1,1}, \ldots, x^{(t)}_{1,k})$. Let $j$ denote the chosen co-ordinate.
(b) Generate $\boldsymbol{\epsilon}_m = (\epsilon_1, \ldots, \epsilon_m) \stackrel{iid}{\sim} g(\cdot)$ and, for $i \in \{1, \ldots, k\}\setminus\{j\}$, simulate $\zeta_{\ell,i} \sim \text{Multinomial}(1;\, p_{\ell,i}, q_{\ell,i}, 1 - p_{\ell,i} - q_{\ell,i})$ independently, for every $\ell = 1, \ldots, m$.
(c) Propose the birth move as follows: for each $\ell = 1, \ldots, m$, apply the transformation $x^{(t)}_{\ell,i} \rightarrow g_{i,\zeta_{\ell,i}}(x^{(t)}_{\ell,i}, \epsilon_\ell)$ for $i \in \{1, \ldots, k\}\setminus\{j\}$, and split $x^{(t)}_{\ell,j}$ into $g_{j,\zeta_{\ell,j}=1}(x^{(t)}_{\ell,j}, \epsilon_\ell)$ and $g_{j,\zeta^c_{\ell,j}=-1}(x^{(t)}_{\ell,j}, \epsilon_\ell)$. In other words, let $x' = T_{b,\zeta}(x^{(t)}, \boldsymbol{\epsilon}_m) = (x'_1, \ldots, x'_m)$ denote the complete birth move, where, for $\ell = 1, \ldots, m$,
$$x'_\ell = \big(g_{1,\zeta_{\ell,1}}(x^{(t)}_{\ell,1}, \epsilon_\ell), \ldots, g_{j-1,\zeta_{\ell,j-1}}(x^{(t)}_{\ell,j-1}, \epsilon_\ell),\; g_{j,\zeta_{\ell,j}=1}(x^{(t)}_{\ell,j}, \epsilon_\ell),\; g_{j,\zeta^c_{\ell,j}=-1}(x^{(t)}_{\ell,j}, \epsilon_\ell),\; g_{j+1,\zeta_{\ell,j+1}}(x^{(t)}_{\ell,j+1}, \epsilon_\ell), \ldots, g_{k,\zeta_{\ell,k}}(x^{(t)}_{\ell,k}, \epsilon_\ell)\big).$$
Re-label the $k+1$ elements of $x'_\ell$ as $(x'_{\ell,1}, x'_{\ell,2}, \ldots, x'_{\ell,k+1})$.
(d) Calculate the acceptance probability of the birth move $x'$:
$$a_b(x^{(t)}, \boldsymbol{\epsilon}_m) = \min\left\{1,\; \frac{w_{d,k+1}}{w_{b,k}} \times \frac{P^{(j)}(\zeta^c)}{P^{(j)}(\zeta)} \times \frac{\pi(x')}{\pi(x^{(t)})} \times \left|\frac{\partial\, T_{b,\zeta}(x^{(t)}, \boldsymbol{\epsilon}_m)}{\partial(x^{(t)}, \boldsymbol{\epsilon}_m)}\right|\right\},$$
where
$$P^{(j)}(\zeta) = \prod_{\ell=1}^{m} \prod_{i \in \{1,\ldots,k\}\setminus\{j\}} p_{\ell,i}^{I_{\{1\}}(\zeta_{\ell,i})}\, q_{\ell,i}^{I_{\{-1\}}(\zeta_{\ell,i})}, \quad \text{and} \quad P^{(j)}(\zeta^c) = \prod_{\ell=1}^{m} \prod_{i \in \{1,\ldots,k\}\setminus\{j\}} p_{\ell,i}^{I_{\{1\}}(\zeta^c_{\ell,i})}\, q_{\ell,i}^{I_{\{-1\}}(\zeta^c_{\ell,i})}.$$
(e) Set $x^{(t+1)} = x'$ with probability $a_b(x^{(t)}, \boldsymbol{\epsilon}_m)$, and $x^{(t+1)} = x^{(t)}$ with probability $1 - a_b(x^{(t)}, \boldsymbol{\epsilon}_m)$.
3. If $u_2 = 1$ (decrease dimension from $mk$ to $m(k-1)$, for $k \geq 2$), then
(a) Generate $\boldsymbol{\epsilon}_m = (\epsilon_1, \ldots, \epsilon_m) \stackrel{iid}{\sim} g(\cdot)$.
(b) Randomly select one co-ordinate (say, the $j$-th co-ordinate) from $x^{(t)}_1 = (x^{(t)}_{1,1}, \ldots, x^{(t)}_{1,k-1})$. For $\ell = 1, \ldots, m$, let
$$x^*_{\ell,j} = \Big(g_{j,\zeta^c_{\ell,j}=-1}(x^{(t)}_{\ell,j}, \epsilon_\ell) + g_{j+1,\zeta_{\ell,j+1}=1}(x^{(t)}_{\ell,j+1}, \epsilon_\ell)\Big)\Big/2;$$
replace the co-ordinate $x_{\ell,j}$ by the average $x^*_{\ell,j}$ and delete $x_{\ell,j+1}$.
(c) Simulate $\zeta$ by generating independently, for $\ell = 1, \ldots, m$ and for $i \in \{1, \ldots, k\}\setminus\{j, j+1\}$, $\zeta_{\ell,i} \sim \text{Multinomial}(1;\, p_{\ell,i}, q_{\ell,i}, 1 - p_{\ell,i} - q_{\ell,i})$.
(d) For $\ell = 1, \ldots, m$ and for $i \in \{1, \ldots, k\}\setminus\{j, j+1\}$, apply the transformation $x'_{\ell,i} = g_{i,\zeta_{\ell,i}}(x^{(t)}_{\ell,i}, \epsilon_\ell)$.
(e) Propose the death move $x' = T_{d,\zeta}(x^{(t)}, \boldsymbol{\epsilon}_m) = (x'_1, \ldots, x'_m)$, where, for $\ell = 1, \ldots, m$,
$$x'_\ell = \big(g_{1,\zeta_{\ell,1}}(x^{(t)}_{\ell,1}, \epsilon_\ell), \ldots, g_{j-1,\zeta_{\ell,j-1}}(x^{(t)}_{\ell,j-1}, \epsilon_\ell),\; x^*_{\ell,j},\; g_{j+2,\zeta_{\ell,j+2}}(x^{(t)}_{\ell,j+2}, \epsilon_\ell), \ldots, g_{k,\zeta_{\ell,k}}(x^{(t)}_{\ell,k}, \epsilon_\ell)\big).$$
Re-label the elements of $x'_\ell$ as $(x'_{\ell,1}, x'_{\ell,2}, \ldots, x'_{\ell,k-1})$.
(f) For $\ell = 1, \ldots, m$, solve for $\epsilon^*_\ell$ from the equations $g_{j,\zeta_{\ell,j}=1}(x^*_{\ell,j}, \epsilon^*_\ell) = x_{\ell,j}$ and $g_{j,\zeta^c_{\ell,j}=-1}(x^*_{\ell,j}, \epsilon^*_\ell) = x_{\ell,j+1}$, and express $\epsilon^*_\ell$ in terms of $x_{\ell,j}$ and $x_{\ell,j+1}$. Let $\boldsymbol{\epsilon}^*_m = (\epsilon^*_1, \ldots, \epsilon^*_m)$.
(g) Calculate the acceptance probability of the death move:
$$a_d(x^{(t)}, \boldsymbol{\epsilon}_m, \boldsymbol{\epsilon}^*_m) = \min\left\{1,\; \frac{w_{b,k-1}}{w_{d,k}} \times \frac{P^{(j,j+1)}(\zeta^c)}{P^{(j,j+1)}(\zeta)} \times \frac{\pi(x')}{\pi(x^{(t)})} \times \left|\frac{\partial\big(T_{d,\zeta}(x^{(t)}, \boldsymbol{\epsilon}_m), \boldsymbol{\epsilon}^*_m, \boldsymbol{\epsilon}_m\big)}{\partial(x^{(t)}, \boldsymbol{\epsilon}_m)}\right|\right\},$$
where
$$P^{(j,j+1)}(\zeta) = \prod_{\ell=1}^{m} \prod_{i \in \{1,\ldots,k\}\setminus\{j,j+1\}} p_{\ell,i}^{I_{\{1\}}(\zeta_{\ell,i})}\, q_{\ell,i}^{I_{\{-1\}}(\zeta_{\ell,i})}, \quad \text{and} \quad P^{(j,j+1)}(\zeta^c) = \prod_{\ell=1}^{m} \prod_{i \in \{1,\ldots,k\}\setminus\{j,j+1\}} p_{\ell,i}^{I_{\{1\}}(\zeta^c_{\ell,i})}\, q_{\ell,i}^{I_{\{-1\}}(\zeta^c_{\ell,i})}.$$
(h) Set $x^{(t+1)} = x'$ with probability $a_d(x^{(t)}, \boldsymbol{\epsilon}_m, \boldsymbol{\epsilon}^*_m)$, and $x^{(t+1)} = x^{(t)}$ with probability $1 - a_d(x^{(t)}, \boldsymbol{\epsilon}_m, \boldsymbol{\epsilon}^*_m)$.
4. If $u_3 = 1$ (dimension remains unchanged), then implement steps (1), (2), (3) of Algorithm 3.1 of Dutta & Bhattacharya (2014).
• End for
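As a concrete illustration of the birth step of Algorithm S-7.1, the Python sketch below builds the proposal $x'$ from $m$ related co-ordinate sets. It assumes, for simplicity, additive transformations $g_{i,\zeta}(x,\epsilon) = x + \zeta a_\ell \epsilon$ with one scale $a_\ell$ per set and $p_{\ell,i} = q_{\ell,i} = 1/2$ (so that the factor $P^{(j)}(\zeta^c)/P^{(j)}(\zeta)$ equals 1); the function and variable names are ours, not the paper's:

```python
import random

def birth_proposal(x, a, g=lambda: abs(random.gauss(0.0, 1.0))):
    """Additive birth move of Algorithm S-7.1 (sketch under simplifying assumptions).

    x : list of m lists, each of length k (the m related co-ordinate sets)
    a : list of m scaling constants, one per set (an assumption; the paper
        allows co-ordinate-specific scales)
    Returns the proposal (m lists of length k+1), plus (j, eps, zeta)
    needed to form the acceptance ratio.
    """
    m, k = len(x), len(x[0])
    j = random.randrange(k)                      # co-ordinate selected for splitting
    eps = [g() for _ in range(m)]                # eps_1, ..., eps_m iid ~ g(.)
    # zeta_{l,i} = +/-1 with probability 1/2 each for the non-split co-ordinates
    zeta = [[random.choice([1, -1]) for _ in range(k)] for _ in range(m)]
    x_new = []
    for l in range(m):
        row = []
        for i in range(k):
            if i == j:   # split x_{l,j} into two new co-ordinates
                row += [x[l][i] + a[l] * eps[l], x[l][i] - a[l] * eps[l]]
            else:        # additive transformation of the remaining co-ordinates
                row.append(x[l][i] + zeta[l][i] * a[l] * eps[l])
        x_new.append(row)
    return x_new, j, eps, zeta

random.seed(1)
x = [[0.0, 1.0, 2.0], [5.0, 6.0, 7.0]]           # m = 2 sets, k = 3
x_new, j, eps, zeta = birth_proposal(x, a=[0.1, 0.2])
assert all(len(row) == 4 for row in x_new)       # each set grew from k to k+1
```

Under these assumptions the acceptance probability of step 2(d) reduces to $\min\{1,\; (w_{d,k+1}/w_{b,k}) \times \pi(x')/\pi(x^{(t)}) \times \prod_{\ell=1}^m 2a_\ell\}$, mirroring the $2a_1$ Jacobian of the two-dimensional example.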
S-7.4 Detailed balance

To see that detailed balance is satisfied for the birth and death moves, note that, associated with the birth move, the probability of the transition $x\ (\in \mathbb{R}^{k}) \mapsto T_{b,\zeta}(x, \boldsymbol{\epsilon}_m)\ (\in \mathbb{R}^{k+m})$, with $k \geq m$, is given by:
$$\pi(x) \times \frac{1}{k} \times w_{b,k} \times \prod_{\ell=1}^{m} g(\epsilon_\ell) \times P^{(j)}(\zeta) \times \min\left\{1,\; \frac{w_{d,k+m}}{w_{b,k}} \times \frac{P^{(j)}(\zeta^c)}{P^{(j)}(\zeta)} \times \frac{\pi(T_{b,\zeta}(x, \boldsymbol{\epsilon}_m))}{\pi(x)} \times \left|\frac{\partial\, T_{b,\zeta}(x, \boldsymbol{\epsilon}_m)}{\partial(x, \boldsymbol{\epsilon}_m)}\right|\right\}$$
$$= \frac{1}{k} \times \prod_{\ell=1}^{m} g(\epsilon_\ell) \times \min\left\{\pi(x)\, w_{b,k}\, P^{(j)}(\zeta),\;\; w_{d,k+m}\, P^{(j)}(\zeta^c)\, \pi(T_{b,\zeta}(x, \boldsymbol{\epsilon}_m)) \left|\frac{\partial\, T_{b,\zeta}(x, \boldsymbol{\epsilon}_m)}{\partial(x, \boldsymbol{\epsilon}_m)}\right|\right\}. \quad \text{(S-7.4)}$$
Writing $x' = T_{b,\zeta}(x, \boldsymbol{\epsilon}_m)$, the transition probability of the reverse death move is given by:
$$\pi(x') \times w_{d,k+m} \times \frac{1}{k} \times \prod_{\ell=1}^{m} g(\epsilon_\ell) \times P^{(j)}(\zeta^c) \times \left|\frac{\partial\big(T^{-1}_{d,\zeta}(x', \boldsymbol{\epsilon}_m), \boldsymbol{\epsilon}^*_m, \boldsymbol{\epsilon}_m\big)}{\partial(x', \boldsymbol{\epsilon}_m)}\right| \times \min\left\{1,\; \frac{w_{b,k}}{w_{d,k+m}} \times \frac{P^{(j)}(\zeta)}{P^{(j)}(\zeta^c)} \times \frac{\pi(x)}{\pi(x')} \times \left|\frac{\partial\big(T_{d,\zeta}(x', \boldsymbol{\epsilon}_m), \boldsymbol{\epsilon}^*_m, \boldsymbol{\epsilon}_m\big)}{\partial(x', \boldsymbol{\epsilon}_m)}\right|\right\}$$
$$= \frac{1}{k} \times \prod_{\ell=1}^{m} g(\epsilon_\ell) \times \min\left\{\pi(x')\, w_{d,k+m}\, P^{(j)}(\zeta^c) \left|\frac{\partial\big(T^{-1}_{d,\zeta}(x', \boldsymbol{\epsilon}_m), \boldsymbol{\epsilon}^*_m, \boldsymbol{\epsilon}_m\big)}{\partial(x', \boldsymbol{\epsilon}_m)}\right|,\;\; w_{b,k}\, P^{(j)}(\zeta)\, \pi(x)\right\}. \quad \text{(S-7.5)}$$
Noting that
$$\left|\frac{\partial\big(T^{-1}_{d,\zeta}(x', \boldsymbol{\epsilon}_m), \boldsymbol{\epsilon}^*_m, \boldsymbol{\epsilon}_m\big)}{\partial(x', \boldsymbol{\epsilon}_m)}\right| = \left|\frac{\partial\, T_{b,\zeta}(x, \boldsymbol{\epsilon}_m)}{\partial(x, \boldsymbol{\epsilon}_m)}\right|,$$
it follows that (S-7.4) = (S-7.5), showing that detailed balance holds for the birth and the death moves.

S-8. TTMCMC ALGORITHM FOR OUR SPATIO-TEMPORAL MODEL

We now specialize the general TTMCMC algorithm (Algorithm S-7.1) provided in Section S-7.3 to our spatio-temporal context. For our spatio-temporal model, we need to update the variable dimensional mass parameter $V = (V_1, \ldots, V_k)$, the point process variables $z = (z_1, \ldots, z_k)$, the location parameters $\theta_1 = (\theta_{11}, \ldots, \theta_{1k})$, the other set of location parameters $\theta_2 = (\theta_{21}, \ldots, \theta_{2k})$, and the fixed dimensional parameters $\alpha$, $\lambda$, the error variance $\sigma^2$, and the parameters related to the kernel, namely, $\varphi$, $a_\delta$, $b_\psi$, $(\psi_1(s_1), \ldots, \psi_1(s_n))$, $(\psi_2(s_1), \ldots, \psi_2(s_n))$, $(\delta(t_1), \ldots, \delta(t_n))$, and $\tau$. For updating the variable dimensional parameters we use our proposed TTMCMC algorithm, and for the fixed dimensional parameters we use the TMCMC algorithm of Dutta & Bhattacharya (2014). We denote by $\xi = (V, z, \theta_1, \theta_2)$ the collection of all variable dimensional parameters, and by $\eta = (\varphi, a_\delta, b_\psi, \psi_1(s_1), \ldots, \psi_1(s_n), \psi_2(s_1), \ldots, \psi_2(s_n), \delta(t_1), \ldots, \delta(t_n), \tau, \alpha, \lambda)$ the collection of all fixed dimensional parameters. The detailed updating procedure is provided as Algorithm S-8.1.

Algorithm S-8.1
Detailed updating procedure of our spatio-temporal model
• Initialise the number of components $k$; let $k^{(0)}$ be the chosen initial value (we chose $k^{(0)} = 15$ as the initial value for our applications).
• Given $k = k^{(0)}$, let $\xi^{(0)}$ denote the initial value of $\xi$. Also, let $\eta^{(0)}$ denote the initial value of $\eta$.
• Since $z$ and $V$ are constrained random variables, we consider updating the reparameterized versions $V^* = \log(V)$ and $z^* = \log\big(\frac{z - a}{b - a}\big)$. After every iteration we invert the transformations to store the original variables $V$ and $z$. For convenience of presentation of our algorithm we slightly abuse notation by referring to $V^*$ and $z^*$ as $V$ and $z$, respectively.
• For $t = 0, 1, 2, \ldots$
1. Generate $u = (u_1, u_2, u_3) \sim \text{Multinomial}\big(1; \tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\big)$.
2. If $u_1 = 1$ (increase dimension from $k$ to $k+1$ for each of the variables $V$, $z$, $\theta_1$, $\theta_2$), then
(a) Randomly select one co-ordinate from $\{1, \ldots, k\}$. Let $j$ denote the chosen co-ordinate.
(b) Generate $\boldsymbol{\epsilon} = (\epsilon_1, \ldots, \epsilon_4) \stackrel{iid}{\sim} N(0,1)I_{\{\epsilon > 0\}}$ ($I_{\{\epsilon > 0\}}$ denoting the indicator function). For updating the variable dimensional parameters, simulate $\zeta^{(1)}_{\ell,i} = \pm 1$ with probability $1/2$ each, for $i \in \{1, \ldots, k\}\setminus\{j\}$ and $\ell = 1, \ldots, 4$; for updating the fixed one-dimensional parameters, simulate $\zeta^{(2)}_{\ell} = \pm 1$ with probability $1/2$ each, for $\ell = 5, \ldots, 11$; and for updating the fixed multi-dimensional parameters, simulate $\zeta^{(3)}_{\ell,i} = \pm 1$ with probability $1/2$ each, for $i \in \{1, \ldots, n\}$ and $\ell = 12, 13, 14$.
(c) Propose the birth move as follows. For $i \in \{1, \ldots, k\}\setminus\{j\}$, apply the additive transformations
$$V^{(t)}_i \rightarrow V^{(t)}_i + \zeta^{(1)}_{1,i} a_1 \epsilon_1, \quad z^{(t)}_i \rightarrow z^{(t)}_i + \zeta^{(1)}_{2,i} a_2 \epsilon_2, \quad \theta^{(t)}_{1i} \rightarrow \theta^{(t)}_{1i} + \zeta^{(1)}_{3,i} a_3 \epsilon_3, \quad \theta^{(t)}_{2i} \rightarrow \theta^{(t)}_{2i} + \zeta^{(1)}_{4,i} a_4 \epsilon_4,$$
and split $V^{(t)}_j$ into $V^{(t)}_j + a_1 \epsilon_1$ and $V^{(t)}_j - a_1 \epsilon_1$; $z^{(t)}_j$ into $z^{(t)}_j + a_2 \epsilon_2$ and $z^{(t)}_j - a_2 \epsilon_2$; $\theta^{(t)}_{1j}$ into $\theta^{(t)}_{1j} + a_3 \epsilon_3$ and $\theta^{(t)}_{1j} - a_3 \epsilon_3$; and $\theta^{(t)}_{2j}$ into $\theta^{(t)}_{2j} + a_4 \epsilon_4$ and $\theta^{(t)}_{2j} - a_4 \epsilon_4$. In other words, let $\xi' = T_{b,\zeta^{(1)}}(\xi^{(t)}, \boldsymbol{\epsilon}) = (V', z', \theta'_1, \theta'_2)$ denote the complete birth move, where
$$V' = \big(V^{(t)}_1 + \zeta^{(1)}_{1,1} a_1 \epsilon_1, \ldots, V^{(t)}_{j-1} + \zeta^{(1)}_{1,j-1} a_1 \epsilon_1,\; V^{(t)}_j + a_1 \epsilon_1,\; V^{(t)}_j - a_1 \epsilon_1, \ldots, V^{(t)}_k + \zeta^{(1)}_{1,k} a_1 \epsilon_1\big), \quad \text{(S-8.1)}$$
$$z' = \big(z^{(t)}_1 + \zeta^{(1)}_{2,1} a_2 \epsilon_2, \ldots, z^{(t)}_{j-1} + \zeta^{(1)}_{2,j-1} a_2 \epsilon_2,\; z^{(t)}_j + a_2 \epsilon_2,\; z^{(t)}_j - a_2 \epsilon_2, \ldots, z^{(t)}_k + \zeta^{(1)}_{2,k} a_2 \epsilon_2\big), \quad \text{(S-8.2)}$$
$$\theta'_1 = \big(\theta^{(t)}_{1,1} + \zeta^{(1)}_{3,1} a_3 \epsilon_3, \ldots, \theta^{(t)}_{1,j-1} + \zeta^{(1)}_{3,j-1} a_3 \epsilon_3,\; \theta^{(t)}_{1,j} + a_3 \epsilon_3,\; \theta^{(t)}_{1,j} - a_3 \epsilon_3, \ldots, \theta^{(t)}_{1,k} + \zeta^{(1)}_{3,k} a_3 \epsilon_3\big), \quad \text{(S-8.3)}$$
$$\theta'_2 = \big(\theta^{(t)}_{2,1} + \zeta^{(1)}_{4,1} a_4 \epsilon_4, \ldots, \theta^{(t)}_{2,j-1} + \zeta^{(1)}_{4,j-1} a_4 \epsilon_4,\; \theta^{(t)}_{2,j} + a_4 \epsilon_4,\; \theta^{(t)}_{2,j} - a_4 \epsilon_4, \ldots, \theta^{(t)}_{2,k} + \zeta^{(1)}_{4,k} a_4 \epsilon_4\big). \quad \text{(S-8.4)}$$
Re-label the $k+1$ elements of $V'$ as $(V'_1, V'_2, \ldots, V'_{k+1})$, of $z'$ as $(z'_1, z'_2, \ldots, z'_{k+1})$, of $\theta'_1$ as $(\theta'_{1,1}, \theta'_{1,2}, \ldots, \theta'_{1,k+1})$, and of $\theta'_2$ as $(\theta'_{2,1}, \theta'_{2,2}, \ldots, \theta'_{2,k+1})$.
(d) We apply the additive transformation based on a single $\epsilon$ to update all the fixed dimensional parameters $\eta$ as follows:
$$\varphi^{(t)} \rightarrow \varphi^{(t)} + \zeta^{(2)}_5 a_5 \epsilon, \quad a^{(t)}_\delta \rightarrow a^{(t)}_\delta + \zeta^{(2)}_6 a_6 \epsilon, \quad b^{(t)}_\psi \rightarrow b^{(t)}_\psi + \zeta^{(2)}_7 a_7 \epsilon, \quad \alpha^{(t)} \rightarrow \alpha^{(t)} + \zeta^{(2)}_8 a_8 \epsilon,$$
$$\lambda^{(t)} \rightarrow \lambda^{(t)} + \zeta^{(2)}_9 a_9 \epsilon, \quad \tau^{(t)} \rightarrow \tau^{(t)} + \zeta^{(2)}_{10} a_{10} \epsilon, \quad \sigma^{(t)} \rightarrow \sigma^{(t)} + \zeta^{(2)}_{11} a_{11} \epsilon,$$
$$\psi^{(t)}_1(s_i) \rightarrow \psi^{(t)}_1(s_i) + \zeta^{(3)}_{12,i} a_{12} \epsilon, \quad \psi^{(t)}_2(s_i) \rightarrow \psi^{(t)}_2(s_i) + \zeta^{(3)}_{13,i} a_{13} \epsilon, \quad \delta^{(t)}(t_i) \rightarrow \delta^{(t)}(t_i) + \zeta^{(3)}_{14,i} a_{14} \epsilon.$$
Let $\eta' = T_{b,\zeta^{(2)}}(\eta^{(t)}, \epsilon) = (\varphi', a'_\delta, b'_\psi, \psi'_1(s_1), \ldots, \psi'_1(s_n), \psi'_2(s_1), \ldots, \psi'_2(s_n), \delta'(t_1), \ldots, \delta'(t_n), \tau', \alpha', \lambda', \sigma')$ denote the complete move-type for the fixed dimensional parameters. In the above transformations the $a_i$'s are scaling constants to be chosen appropriately; see Das & Bhattacharya (2014b) and Dey & Bhattacharya (2014) for the details. In our applications we choose the scales on the basis of pilot runs of our TTMCMC algorithm.
(e) Calculate the acceptance probability:
$$a_{b,\zeta}(\xi^{(t)}, \eta^{(t)}, \boldsymbol{\epsilon}) = \min\left\{1,\; \frac{\pi(\xi', \eta')}{\pi(\xi^{(t)}, \eta^{(t)})} \left|\frac{\partial\, T_{b,\zeta^{(1)}}(\xi^{(t)}, \boldsymbol{\epsilon})}{\partial(\xi^{(t)}, \boldsymbol{\epsilon})}\right| \left|\frac{\partial\, T_{b,\zeta^{(2)}}(\eta^{(t)}, \epsilon)}{\partial(\eta^{(t)}, \epsilon)}\right|\right\},$$
where
$$\left|\frac{\partial\, T_{b,\zeta^{(1)}}(\xi^{(t)}, \boldsymbol{\epsilon})}{\partial(\xi^{(t)}, \boldsymbol{\epsilon})}\right| = 2^4 a_1 a_2 a_3 a_4 \quad \text{and} \quad \left|\frac{\partial\, T_{b,\zeta^{(2)}}(\eta^{(t)}, \epsilon)}{\partial(\eta^{(t)}, \epsilon)}\right| = 1.$$
(f) Set $(\xi^{(t+1)}, \eta^{(t+1)}) = (\xi', \eta')$ with probability $a_{b,\zeta}(\xi^{(t)}, \eta^{(t)}, \boldsymbol{\epsilon})$, and $(\xi^{(t+1)}, \eta^{(t+1)}) = (\xi^{(t)}, \eta^{(t)})$ with probability $1 - a_{b,\zeta}(\xi^{(t)}, \eta^{(t)}, \boldsymbol{\epsilon})$.
3. If $u_2 = 1$ (decrease dimension from $k$ to $k-1$ for each of the variables $V$, $z$, $\theta_1$, $\theta_2$), then
(a) Generate $\boldsymbol{\epsilon} = (\epsilon_1, \ldots, \epsilon_4) \stackrel{iid}{\sim} N(0,1)I_{\{\epsilon > 0\}}$.
(b) Randomly select one co-ordinate from $\{1, 2, \ldots, k-1\}$. Let $j$ be the selected co-ordinate. Then let $V^*_j = \big((V_j + a_1 \epsilon_1) + (V_{j+1} - a_1 \epsilon_1)\big)/2$; replace the co-ordinate $V_j$ by the average $V^*_j$ and delete $V_{j+1}$. Similarly, let $z^*_j = \big((z_j + a_2 \epsilon_2) + (z_{j+1} - a_2 \epsilon_2)\big)/2$; replace the co-ordinate $z_j$ by the average $z^*_j$ and delete $z_{j+1}$. Form $\theta^*_{1,j} = \big((\theta_{1,j} + a_3 \epsilon_3) + (\theta_{1,j+1} - a_3 \epsilon_3)\big)/2$ and replace the co-ordinate $\theta_{1,j}$ by the average $\theta^*_{1,j}$, deleting $\theta_{1,j+1}$; create $\theta^*_{2,j} = \big((\theta_{2,j} + a_4 \epsilon_4) + (\theta_{2,j+1} - a_4 \epsilon_4)\big)/2$ and replace the co-ordinate $\theta_{2,j}$ by the average $\theta^*_{2,j}$, deleting $\theta_{2,j+1}$.
(c) Simulate $\zeta$ similarly as in the case of the birth move.
(d) For the co-ordinates other than $j$ and $j+1$, apply the additive transformations
$$V^{(t)}_i \rightarrow V^{(t)}_i + \zeta^{(1)}_{1,i} a_1 \epsilon_1, \quad z^{(t)}_i \rightarrow z^{(t)}_i + \zeta^{(1)}_{2,i} a_2 \epsilon_2, \quad \theta^{(t)}_{1i} \rightarrow \theta^{(t)}_{1i} + \zeta^{(1)}_{3,i} a_3 \epsilon_3, \quad \theta^{(t)}_{2i} \rightarrow \theta^{(t)}_{2i} + \zeta^{(1)}_{4,i} a_4 \epsilon_4,$$
for $i \in \{1, \ldots, k\}\setminus\{j, j+1\}$.
(e) In other words, let $\xi' = T_{d,\zeta^{(1)}}(\xi^{(t)}, \boldsymbol{\epsilon}) = (V', z', \theta'_1, \theta'_2)$ denote the complete death move, where
$$V' = \big(V^{(t)}_1 + \zeta^{(1)}_{1,1} a_1 \epsilon_1, \ldots, V^{(t)}_{j-1} + \zeta^{(1)}_{1,j-1} a_1 \epsilon_1,\; V^*_j,\; V^{(t)}_{j+2} + \zeta^{(1)}_{1,j+2} a_1 \epsilon_1, \ldots, V^{(t)}_k + \zeta^{(1)}_{1,k} a_1 \epsilon_1\big), \quad \text{(S-8.5)}$$
$$z' = \big(z^{(t)}_1 + \zeta^{(1)}_{2,1} a_2 \epsilon_2, \ldots, z^{(t)}_{j-1} + \zeta^{(1)}_{2,j-1} a_2 \epsilon_2,\; z^*_j,\; z^{(t)}_{j+2} + \zeta^{(1)}_{2,j+2} a_2 \epsilon_2, \ldots, z^{(t)}_k + \zeta^{(1)}_{2,k} a_2 \epsilon_2\big), \quad \text{(S-8.6)}$$
$$\theta'_1 = \big(\theta^{(t)}_{1,1} + \zeta^{(1)}_{3,1} a_3 \epsilon_3, \ldots, \theta^{(t)}_{1,j-1} + \zeta^{(1)}_{3,j-1} a_3 \epsilon_3,\; \theta^*_{1,j},\; \theta^{(t)}_{1,j+2} + \zeta^{(1)}_{3,j+2} a_3 \epsilon_3, \ldots, \theta^{(t)}_{1,k} + \zeta^{(1)}_{3,k} a_3 \epsilon_3\big), \quad \text{(S-8.7)}$$
$$\theta'_2 = \big(\theta^{(t)}_{2,1} + \zeta^{(1)}_{4,1} a_4 \epsilon_4, \ldots, \theta^{(t)}_{2,j-1} + \zeta^{(1)}_{4,j-1} a_4 \epsilon_4,\; \theta^*_{2,j},\; \theta^{(t)}_{2,j+2} + \zeta^{(1)}_{4,j+2} a_4 \epsilon_4, \ldots, \theta^{(t)}_{2,k} + \zeta^{(1)}_{4,k} a_4 \epsilon_4\big). \quad \text{(S-8.8)}$$
Re-label the $k-1$ elements of $V'$ as $(V'_1, V'_2, \ldots, V'_{k-1})$, of $z'$ as $(z'_1, z'_2, \ldots, z'_{k-1})$, of $\theta'_1$ as $(\theta'_{1,1}, \theta'_{1,2}, \ldots, \theta'_{1,k-1})$, and of $\theta'_2$ as $(\theta'_{2,1}, \theta'_{2,2}, \ldots, \theta'_{2,k-1})$.
(f) Solve for $\epsilon^*_1$ from the equations $V^*_j + a_1 \epsilon^*_1 = V_j$ and $V^*_j - a_1 \epsilon^*_1 = V_{j+1}$, which yield $\epsilon^*_1 = \frac{V_j - V_{j+1}}{2a_1}$. Similarly, we have $\epsilon^*_2 = \frac{z_j - z_{j+1}}{2a_2}$, $\epsilon^*_3 = \frac{\theta_{1,j} - \theta_{1,j+1}}{2a_3}$, and $\epsilon^*_4 = \frac{\theta_{2,j} - \theta_{2,j+1}}{2a_4}$. Let $\boldsymbol{\epsilon}^* = (\epsilon^*_1, \ldots, \epsilon^*_4)$.
(g) For updating the fixed dimensional parameters $\eta = (\varphi, a_\delta, b_\psi, \psi_1(s_1), \ldots, \psi_1(s_n), \psi_2(s_1), \ldots, \psi_2(s_n), \delta(t_1), \ldots, \delta(t_n), \tau, \alpha, \lambda, \sigma)$, implement step 2(d).
(h) Calculate the acceptance probability of the death move:
$$a_{d,\zeta}(\xi^{(t)}, \eta^{(t)}, \boldsymbol{\epsilon}, \boldsymbol{\epsilon}^*) = \min\left\{1,\; \frac{\pi(\xi', \eta')}{\pi(\xi^{(t)}, \eta^{(t)})} \left|\frac{\partial\big(T_{d,\zeta^{(1)}}(\xi^{(t)}, \boldsymbol{\epsilon}), \boldsymbol{\epsilon}^*, \boldsymbol{\epsilon}\big)}{\partial(\xi^{(t)}, \boldsymbol{\epsilon})}\right| \left|\frac{\partial\, T_{d,\zeta^{(2)}}(\eta^{(t)}, \epsilon)}{\partial(\eta^{(t)}, \epsilon)}\right|\right\},$$
where
$$\left|\frac{\partial\big(T_{d,\zeta^{(1)}}(\xi^{(t)}, \boldsymbol{\epsilon}), \boldsymbol{\epsilon}^*, \boldsymbol{\epsilon}\big)}{\partial(\xi^{(t)}, \boldsymbol{\epsilon})}\right| = \frac{1}{2^4 a_1 a_2 a_3 a_4} \quad \text{and} \quad \left|\frac{\partial\, T_{d,\zeta^{(2)}}(\eta^{(t)}, \epsilon)}{\partial(\eta^{(t)}, \epsilon)}\right| = 1.$$
(i) Set $(\xi^{(t+1)}, \eta^{(t+1)}) = (\xi', \eta')$ with probability $a_{d,\zeta}(\xi^{(t)}, \eta^{(t)}, \boldsymbol{\epsilon}, \boldsymbol{\epsilon}^*)$, and $(\xi^{(t+1)}, \eta^{(t+1)}) = (\xi^{(t)}, \eta^{(t)})$ with probability $1 - a_{d,\zeta}(\xi^{(t)}, \eta^{(t)}, \boldsymbol{\epsilon}, \boldsymbol{\epsilon}^*)$.
4. If $u_3 = 1$ (dimension remains unchanged), then update $(\xi^{(t)}, \eta^{(t)})$ by implementing steps (1), (2), (3) of Algorithm 3.1 of Dutta & Bhattacharya (2014).
• End for
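The constrained-variable reparameterization used in Algorithm S-8.1 can be sketched as follows in Python (the bounds $a$ and $b$ of $z$ come from the model; the numerical values below are placeholders of our own). We follow the transformation exactly as stated in the algorithm, $z^* = \log\big(\frac{z-a}{b-a}\big)$, whose inverse is $z = a + (b-a)e^{z^*}$:

```python
import math

def to_unconstrained(V, z, a, b):
    """Map V > 0 and z in (a, b) via the transformations stated in Algorithm S-8.1."""
    V_star = [math.log(v) for v in V]
    z_star = [math.log((zi - a) / (b - a)) for zi in z]
    return V_star, z_star

def to_constrained(V_star, z_star, a, b):
    """Invert the transformations after each iteration to store V and z."""
    V = [math.exp(vs) for vs in V_star]
    z = [a + (b - a) * math.exp(zs) for zs in z_star]
    return V, z

# Round-trip check with placeholder bounds a = 0, b = 10
V0, z0 = [0.3, 1.7], [2.5, 9.1]
Vs, zs = to_unconstrained(V0, z0, 0.0, 10.0)
V1, z1 = to_constrained(Vs, zs, 0.0, 10.0)
assert all(abs(u - v) < 1e-12 for u, v in zip(V0, V1))
assert all(abs(u - v) < 1e-12 for u, v in zip(z0, z1))
```

Note that $\log\big(\frac{z-a}{b-a}\big)$ maps $(a, b)$ onto $(-\infty, 0)$ rather than onto the whole real line; a full logit, $\log\big(\frac{z-a}{b-z}\big)$, would remove the upper bound as well, but here we keep the form given in the text.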
Journal of Multivariate Analysis, 84, 85–100.
Chang, Y.-M., Hsu, N.-J., & Huang, H.-C. (2011), "Semiparametric Estimation and Selection for Nonstationary Spatial Covariance Functions,"
Journal of Computational and Graphical Statistics, 19, 117–139.
Damian, D., Sampson, P. D., & Guttorp, P. (2001), "Bayesian Estimation of Semi-parametric Non-stationary Spatial Covariance Structures,"
Environmetrics, 12, 161–178.
Das, M., & Bhattacharya, S. (2014a), "Supplement to "Nonstationary, Nonparametric, Nonseparable Bayesian Spatio-Temporal Modeling Using Kernel Convolution of Order Based Dependent Dirichlet Process"". Submitted.
Das, M., & Bhattacharya, S. (2014b), "Transdimensional Transformation Based Markov Chain Monte Carlo". Submitted. Available at http://arxiv.org/pdf/1403.5207.
Dey, K. K., & Bhattacharya, S. (2014), "On Optimal Scaling of Additive Transformation Based Markov Chain Monte Carlo". Submitted. Available at http://arxiv.org/abs/1307.1446.
Duan, J. A., Gelfand, A. E., & Sirmans, C. F. (2009), "Modeling Space-Time Data Using Stochastic Differential Equations," Bayesian Analysis, 4, 733–758.
Duan, J. A., Guindani, M., & Gelfand, A. E. (2007), "Generalized Spatial Dirichlet Process Models,"
Biometrika, 94, 809–825.
Dutta, S., & Bhattacharya, S. (2014), "Markov Chain Monte Carlo Based on Deterministic Transformations,"
Statistical Methodology, 16, 100–116. Also available at http://arxiv.org/abs/1106.5850; supplement available at http://arxiv.org/abs/1306.6684.
Ferguson, T. S. (1973), "A Bayesian Analysis of Some Nonparametric Problems,"
The Annals of Statistics, 1, 209–230.
Ferguson, T. S. (1974), "Prior Distributions on Spaces of Probability Measures,"
The Annals of Statistics, 2, 615–629.
Fuentes, M. (2002), "Spectral Methods for Nonstationary Spatial Processes,"
Biometrika, 89, 197–210.
Fuentes, M., & Smith, R. L. (2001), "A New Class of Nonstationary Spatial Models". Technical Report, Department of Statistics, North Carolina State University.
Gelfand, A. E., Kottas, A., & MacEachern, S. N. (2005), "Bayesian Nonparametric Spatial Modeling With Dirichlet Process Mixing,"
Journal of the American Statistical Association, 100, 1021–1035.
Griffin, J. E., & Steel, M. F. J. (2004), "Semiparametric Bayesian Inference for Stochastic Frontier Models,"
Journal of Econometrics, 123, 121–152.
Griffin, J. E., & Steel, M. F. J. (2006), "Order-Based Dependent Dirichlet Processes,"
Journal of the American Statistical Association, 101, 179–194.
Guttorp, P., & Sampson, P. D. (1994), "Methods for Estimating Heterogeneous Spatial Covariance Functions with Environmental Applications," in
Handbook of Statistics XII: Environmental Statistics, eds. G. P. Patil & C. R. Rao, Elsevier/North Holland, New York, pp. 663–690.
Haas, T. C. (1995), "Local Prediction of a Spatio-Temporal Process with an Application to Wet Sulfate Deposition,"
Journal of the American Statistical Association, 90, 1189–1199.
Higdon, D. (1998), "A Process-Convolution Approach to Modeling Temperatures in the North Atlantic Ocean,"
Environmental and Ecological Statistics, 5, 173–190.
Higdon, D. (2001), "Space and Space-Time Modeling Using Process Convolutions," in
Quantitative Methods for Current Environmental Issues, eds. C. W. Anderson, V. Barnett, P. C. Chatwin, & A. H. El-Shaarawi, Springer-Verlag, London, pp. 37–56.
Higdon, D., Swall, J., & Kern, J. (1999), "Non-Stationary Spatial Modeling," in
Bayesian Statistics 6, eds. J. M. Bernardo, J. O. Berger, A. P. Dawid, & A. F. M. Smith, Oxford University Press, Oxford, pp. 761–768.
Ishwaran, H., & James, L. F. (2001), "Gibbs Sampling Methods for Stick-Breaking Priors,"
Journal of the American Statistical Association, 96, 161–173.
Nott, D. J., & Dunsmuir, W. T. M. (2002), "Estimation of Nonstationary Spatial Covariance Structure,"
Biometrika, 89, 819–829.
Sampson, P. D., & Guttorp, P. (1992), "Nonparametric Estimation of Nonstationary Spatial Covariance Structure,"
Journal of the American Statistical Association, 87, 108–119.
Schmidt, A. M., & O'Hagan, A. (2003), "Bayesian Inference for Nonstationary Spatial Covariance Structure via Spatial Deformations,"
Journal of the Royal Statistical Society, Series B, 65, 743–758.
Sethuraman, J. (1994), "A Constructive Definition of Dirichlet Priors,"
Statistica Sinica, 4, 639–650.
Stein, M. L. (1999),
Interpolation of Spatial Data: Some Theory for Kriging, New York: Springer-Verlag.
Wolpert, R. L., Clyde, M. A., & Tu, C. (2011), "Stochastic Expansions Using Continuous Dictionaries: Lévy Adaptive Regression Kernels,"
Annals of Statistics (to appear).
Yaglom, A. M. (1987a), Correlation Theory of Stationary and Related Random Functions, Volume I: Basic Results, New York: Springer-Verlag.
Yaglom, A. M. (1987b), Correlation Theory of Stationary and Related Random Functions, Volume II: Supplementary Notes and References, New York: Springer-Verlag.

[Figure 10.1 panels (density plots; axis details omitted): posterior predictive densities of Y(s1,t1), Y(s12,t12), Y(s50,t50), Y(s53,t53), Y(s67,t67), and Y(s95,t95).]
Figure 10.1: Simulation study: Posterior predictive densities of Y(s, t) for 6 different location-time pairs – the corresponding true values are denoted by the vertical lines.

[Figure 10.2 panels (density plots; axis details omitted): correlations between Y(s1,t1) & Y(s2,t2); Y(s1,t1) & Y(s7,t7); Y(s1,t1) & Y(s95,t95); Y(s5,t5) & Y(s9,t9); Y(s5,t5) & Y(s12,t12); Y(s5,t5) & Y(s91,t91); Y(s21,t21) & Y(s23,t23); Y(s21,t21) & Y(s55,t55); Y(s21,t21) & Y(s92,t92); Y(s60,t60) & Y(s10,t10); Y(s60,t60) & Y(s69,t69); Y(s60,t60) & Y(s95,t95).]
Figure 10.2: Simulation study: Posterior densities of correlations for 12 different pairs of locations; the vertical lines indicate the true correlations.
[Figure 10.3 panels (surface plots; axis details omitted): spatial surfaces interpolating the data points and posterior predictive quantiles (top); spatial surfaces interpolating the median values and posterior predictive quantiles (bottom).]
Figure 10.3: Real data analysis: The top panel shows the surface plot of ozone concentrations (middle), with the lower and the upper 95% credible limits associated with the leave-one-out posterior predictive densities denoted by the lower and the upper surfaces, respectively. The bottom panel shows the surface plot of the posterior medians (middle) along with the lower and the upper 95% credible limits associated with the leave-one-out posterior predictive densities (lower and upper surfaces, respectively). The observed data points are indicated by '*'.
Figure 10.4: Real data analysis: Posterior predictive distributions summarized by the median (middle line) and the 95% credible intervals as a function of s. The observed data points are denoted by '*'.

[Further panels (density plots; axis details omitted): correlations between Y(s1) & Y(s5); Y(s4) & Y(s21); Y(s4) & Y(s75); Y(s25) & Y(s55); Y(s35) & Y(s45).]