[PDF] Orthogonal series estimation of the pair correlation function of a spatial point process

Abstract

The pair correlation function is a fundamental spatial point process characteristic that, given the intensity function, determines second order moments of the point process. Non-parametric estimation of the pair correlation function is a typical initial step of a statistical analysis of a spatial point pattern. Kernel estimators are popular but especially for clustered point patterns suffer from bias for small spatial lags. In this paper we introduce a new orthogonal series estimator. The new estimator is consistent and asymptotically normal according to our theoretical and simulation results. Our simulations further show that the new estimator can outperform the kernel estimators in particular for Poisson and clustered point processes.

Full PDF

OOrthogonal series estimation of the paircorrelation function of a spatial point process

Abdollah Jalilian , Yongtao Guan , and Rasmus Waagepetersen Department of Statistics, Razi University,Bagh-e-Abrisham,Kermanshah 67149-67346, Iran [email protected] Department of Management Science, University of Miami,CoralGables, Florida 33124-6544, U.S.A. [email protected] Department of Mathematical Sciences, Aalborg University,FredrikBajersvej 7G, DK-9220 Aalborg, Denmark [email protected]

April 3, 2018

Abstract

The pair correlation function is a fundamental spatial point process character-istic that, given the intensity function, determines second order moments of thepoint process. Non-parametric estimation of the pair correlation function is a typ-ical initial step of a statistical analysis of a spatial point pattern. Kernel estimatorsare popular but especially for clustered point patterns suffer from bias for smallspatial lags. In this paper we introduce a new orthogonal series estimator. Thenew estimator is consistent and asymptotically normal according to our theoreticaland simulation results. Our simulations further show that the new estimator canoutperform the kernel estimators in particular for Poisson and clustered point pro-cesses.

Keywords : Asymptotic normality; Consistency; Kernel estimator; Orthogonalseries estimator; Pair correlation function; Spatial point process.

The pair correlation function is commonly considered the most informative second-order summary statistic of a spatial point process (Stoyan and Stoyan, 1994; Møllerand Waagepetersen, 2003; Illian et al., 2008). Non-parametric estimates of the paircorrelation function are useful for assessing regularity or clustering of a spatial pointpattern and can moreover be used for inferring parametric models for spatial pointprocesses via minimum contrast estimation (Stoyan and Stoyan, 1996; Illian et al.,2008). Although alternatives exist (Yue and Loh, 2013), kernel estimation is the by farmost popular approach (Stoyan and Stoyan, 1994; Møller and Waagepetersen, 2003;Illian et al., 2008) which is closely related to kernel estimation of probability densities.Kernel estimation is computationally fast and works well except at small spatiallags. For spatial lags close to zero, kernel estimators suffer from strong bias, see e.g.the discussion at page 186 in Stoyan and Stoyan (1994), Example 4.7 in Møller and1 a r X i v : . [ m a t h . S T ] F e b aagepetersen (2003) and Section 7.6.2 in Baddeley et al. (2015). The bias is a majordrawback if one attempts to infer a parametric model from the non-parametric estimatesince the behavior near zero is important for determining the right parametric model(Jalilian et al., 2013).In this paper we adapt orthogonal series density estimators (see e.g. the reviewsin Hall, 1987; Efromovich, 2010) to the estimation of the pair correlation function.We derive unbiased estimators of the coefﬁcients in an orthogonal series expansion ofthe pair correlation function and propose a criterion for choosing a certain optimalsmoothing scheme. In the literature on orthogonal series estimation of probabilitydensities, the data are usually assumed to consist of indendent observations from theunknown target density. In our case the situation is more complicated as the data usedfor estimation consist of spatial lags between observed pairs of points. These lags areneither independent nor identically distributed and the sample of lags is biased due toedge effects. We establish consistency and asymptotic normality of our new orthogonalseries estimator and study its performance in a simulation study and an application toa tropical rain forest data set. We denote by X a point process on R d , d ≥ , that is, X is a locally ﬁnite randomsubset of R d . For B ⊆ R d , we let N ( B ) denote the random number of points in X ∩ B . That X is locally ﬁnite means that N ( B ) is ﬁnite almost surely whenever B is bounded. We assume that X has an intensity function ρ and a second-order jointintensity ρ (2) so that for bounded A, B ⊂ R d , E { N ( B ) } = (cid:90) B ρ ( u )d u, E { N ( A ) N ( B ) } = (cid:90) A ∩ B ρ ( u )d u + (cid:90) A (cid:90) B ρ (2) ( u, v )d u d v. (1)The pair correlation function g is deﬁned as g ( u, v ) = ρ (2) ( u, v ) / { ρ ( u ) ρ ( v ) } when-ever ρ ( u ) ρ ( v ) > (otherwise we deﬁne g ( u, v ) = 0 ). By (1),cov { N ( A ) , N ( B ) } = (cid:90) A ∩ B ρ ( u )d u + (cid:90) A (cid:90) B ρ ( u ) ρ ( v ) (cid:8) g ( v, u ) − (cid:9) d u d v for bounded A, B ⊂ R d . Hence, given the intensity function, g determines the covari-ances of count variables N ( A ) and N ( B ) . Further, for locations u, v ∈ R d , g ( u, v ) > ( < ) implies that the presence of a point at v yields an elevated (decreased) probabil-ity of observing yet another point in a small neighbourhood of u (e.g. Coeurjollyet al., 2016). In this paper we assume that g is isotropic, i.e. with an abuse of notation, g ( u, v ) = g ( (cid:107) v − u (cid:107) ) . Examples of pair correlation functions are shown in Figure 1. Suppose X is observed within a bounded observation window W ⊂ R d and let X W = X ∩ W . Let k b ( · ) be a kernel of the form k b ( r ) = k ( r/b ) /b , where k is a probabilitydensity and b > is the bandwidth. Then a kernel density estimator (Stoyan and2toyan, 1994; Baddeley et al., 2000) of g is ˆ g k ( r ; b ) = 1 sa d r d − (cid:54) = (cid:88) u,v ∈ X W k b ( r − (cid:107) v − u (cid:107) ) ρ ( u ) ρ ( v ) | W ∩ W v − u | , r ≥ , where sa d is the surface area of the unit sphere in R d , (cid:80) (cid:54) = denotes sum over all distinctpoints, / | W ∩ W h | , h ∈ R d , is the translation edge correction factor with W h = { u − h : u ∈ W } , and | A | is the volume (Lebesgue measure) of A ⊂ R d . Variationsof this include (Guan, 2007a) ˆ g d ( r ; b ) = 1 sa d (cid:54) = (cid:88) u,v ∈ X W k b ( r − (cid:107) v − u (cid:107) ) (cid:107) v − u (cid:107) d − ρ ( u ) ρ ( v ) | W ∩ W v − u | , r ≥ and the bias corrected estimator (Guan, 2007a) ˆ g c ( r ; b ) = ˆ g d ( r ; b ) /c ( r ; b ) , c ( r ; b ) = (cid:90) min { r,b }− b k b ( t )d t, assuming k has bounded support [ − , . Regarding the choice of kernel, Illian et al.(2008), p. 230, recommend to use the uniform kernel k ( r ) = ( | r | ≤ / , where ( · ) denotes the indicator function, but the Epanechnikov kernel k ( r ) = (3 / − r ) ( | r | ≤ is another common choice. The choice of the bandwidth b highly affectsthe bias and variance of the kernel estimator. In the planar ( d = 2 ) stationary case,Illian et al. (2008), p. 236, recommend b = 0 . / √ ˆ ρ based on practical experiencewhere ˆ ρ is an estimate of the constant intensity. The default in spatstat (Baddeleyet al., 2015), following Stoyan and Stoyan (1994), is to use the Epanechnikov kernelwith b = 0 . / √ ˆ ρ .Guan (2007b) and Guan (2007a) suggest to choose b by composite likelihood crossvalidation or by minimizing an estimate of the mean integrated squared error deﬁnedover some interval I as MISE (ˆ g m , w ) = sa d (cid:90) I E (cid:8) ˆ g m ( r ; b ) − g ( r ) (cid:9) w ( r − r min )d r, (2)where ˆ g m , m = k, d, c , is one of the aforementioned kernel estimators, w ≥ is aweight function and r min ≥ . With I = (0 , R ) , w ( r ) = r d − and r min = 0 , Guan(2007a) suggests to estimate the mean integrated squared error by M ( b ) = sa d (cid:90) R (cid:8) ˆ g m ( r ; b ) (cid:9) r d − d r − (cid:54) = (cid:88) u,v ∈ X W (cid:107) v − u (cid:107)≤ R ˆ g −{ u,v } m ( (cid:107) v − u (cid:107) ; b ) ρ ( u ) ρ ( v ) | W ∩ W v − u | , (3)where ˆ g −{ u,v } m , m = k, d, c , is deﬁned as ˆ g m but based on the reduced data ( X \{ u, v } ) ∩ W . Loh and Jang (2010) instead use a spatial bootstrap for estimating (2).We return to (3) in Section 5. 3 Orthogonal series estimation

For an

R > , the new orthogonal series estimator of g ( r ) , ≤ r min < r < r min + R ,is based on an orthogonal series expansion of g ( r ) on ( r min , r min + R ) : g ( r ) = ∞ (cid:88) k =1 θ k φ k ( r − r min ) , (4)where { φ k } k ≥ is an orthonormal basis of functions on (0 , R ) with respect to someweight function w ( r ) ≥ , r ∈ (0 , R ) . That is, (cid:82) R φ k ( r ) φ l ( r ) w ( r )d r = ( k = l ) andthe coefﬁcients in the expansion are given by θ k = (cid:82) R g ( r + r min ) φ k ( r ) w ( r )d r .For the cosine basis, w ( r ) = 1 and φ ( r ) = 1 / √ R , φ k ( r ) = (2 /R ) / cos { ( k − πr/R } , k ≥ . Another example is the Fourier-Bessel basis with w ( r ) = r d − and φ k ( r ) = 2 / J ν ( rα ν,k /R ) r − ν / { RJ ν +1 ( α ν,k ) } , k ≥ , where ν = ( d − / , J ν is the Bessel function of the ﬁrst kind of order ν , and { α ν,k } ∞ k =1 is the sequence ofsuccessive positive roots of J ν ( r ) .An estimator of g is obtained by replacing the θ k in (4) by unbiased estimatorsand truncating or smoothing the inﬁnite sum. A similar approach has a long history inthe context of non-parametric estimation of probability densities, see e.g. the review inEfromovich (2010). For θ k we propose the estimator ˆ θ k = 1 sa d (cid:54) = (cid:88) u,v ∈ X W r min < (cid:107) u − v (cid:107) where the presenceof two very close points in X W could imply division by a quantity close to zero. Theexpression for the variance of ˆ θ k given in Section S2 of the supplementary materialindeed shows that the variance is not ﬁnite unless g ( r ) w ( r − r min ) /r d − is boundedfor r min < r < r min + R . If r min > this is always satisﬁed for bounded g . If r min = 0 the condition is still satisﬁed in case of the Fourier-Bessel basis and bounded g . 4or the cosine basis w ( r ) = 1 so if r min = 0 we need the boundedness of g ( r ) /r d − . If X satisﬁes a hard core condition (i.e. two points in X cannot be closerthan some δ > ), this is trivially satisﬁed. Another example is a determinantal pointprocess (Lavancier et al., 2015) for which g ( r ) = 1 − c ( r ) for a correlation function c . The boundedness is then e.g. satisﬁed if c ( · ) is the Gaussian ( d ≤ ) or exponential( d ≤ ) correlation function. In practice, when using the cosine basis, we take r min tobe a small positive number to avoid issues with inﬁnite variances. The orthogonal series estimator (6) has the mean integrated squared error

MISE (cid:0) ˆ g o , w (cid:1) = sa d (cid:90) r min + Rr min E (cid:8) ˆ g o ( r ; b ) − g ( r ) (cid:9) w ( r − r min )d r = sa d ∞ (cid:88) k =1 E ( b k ˆ θ k − θ k ) = sa d ∞ (cid:88) k =1 (cid:2) b k E { (ˆ θ k ) } − b k θ k + θ k (cid:3) . (7)Each term in (7) is minimized with b k equal to (cf. Hall, 1987) b ∗ k = θ k E { (ˆ θ k ) } = θ k θ k + var (ˆ θ k ) , k ≥ , (8)leading to the minimal value sa d (cid:80) ∞ k =1 b ∗ k var (ˆ θ k ) of the mean integrated square error.Unfortunately, the b ∗ k are unknown.In practice we consider a parametric class of smoothing schemes b ( ψ ) . For practi-cal reasons we need a ﬁnite sum in (6) so one component in ψ will be a cut-off index K so that b k ( ψ ) = 0 when k > K . The simplest smoothing scheme is b k ( ψ ) = ( k ≤ K ) . A more reﬁned scheme is b k ( ψ ) = ( k ≤ K )ˆ b ∗ k where ˆ b ∗ k = (cid:98) θ k / (ˆ θ k ) is an estimate of the optimal smoothing coefﬁcient b ∗ k given in (8). Here (cid:98) θ k is anasymptotically unbiased estimator of θ k derived in Section 5. For these two smoothingschemes ψ = K . Adapting the scheme suggested by Wahba (1981), we also consider ψ = ( K, c , c ) , c > , c > , and b k ( ψ ) = ( k ≤ K ) / (1 + c k c ) . In practice wechoose the smoothing parameter ψ by minimizing an estimate of the mean integratedsquared error, see Section 5. g ( · ) − For large R , g ( r min + R ) is typically close to one. However, for the Fourier-Besselbasis, φ k ( R ) = 0 for all k ≥ which implies ˆ g o ( r min + R ) = 0 . Hence the estimatorcannot be consistent for r = r min + R and the convergence of the estimator for r ∈ ( r min , r min + R ) can be quite slow as the number of terms K in the estimator increases.In practice we obtain quicker convergence by applying the Fourier-Bessel expansionto g ( r ) − (cid:80) k ≥ ϑ k φ k ( r − r min ) so that the estimator becomes ˜ g o ( r ; b ) = 1 + (cid:80) ∞ k =1 b k ˆ ϑ k φ k ( r − r min ) where ˆ ϑ k = ˆ θ k − (cid:82) r min + R φ k ( r ) w ( r )d r is an estimator of ϑ k = (cid:82) R { g ( r + r min ) − } φ k ( r ) w ( r )d r . Note that var ( ˆ ϑ k ) = var (ˆ θ k ) and ˜ g o ( r ; b ) − E { ˜ g o ( r ; b ) } = ˆ g o ( r ; b ) − E { ˆ g o ( r ; b ) } . These identities imply that the results regardingconsistency and asymptotic normality established for ˆ g o ( r ; b ) in Section 4 are also validfor ˜ g o ( r ; b ) . 5 Consistency and asymptotic normality

To obtain asymptotic results we assume that X is observed through an increasing se-quence of observation windows W n . For ease of presentation we assume square obser-vation windows W n = × di =1 [ − na i , na i ] for some a i > , i = 1 , . . . , d . More generalsequences of windows can be used at the expense of more notation and assumptions.We also consider an associated sequence ψ n , n ≥ , of smoothing parameters sat-isfying conditions to be detailed in the following. We let ˆ θ k,n and ˆ g o,n denote theestimators of θ k and g obtained from X observed on W n . Thus ˆ θ k,n = 1 sa d | W n | (cid:54) = (cid:88) u,v ∈ X Wn v − u ∈ B Rr min φ k ( (cid:107) v − u (cid:107) − r min ) w ( (cid:107) v − u (cid:107) − r min ) ρ ( u ) ρ ( v ) (cid:107) v − u (cid:107) d − e n ( v − u ) ) , where B Rr min = { h ∈ R d | r min < (cid:107) h (cid:107) < r min + R } and e n ( h ) = | W n ∩ ( W n ) h | / | W n | . (9)Further, ˆ g o,n ( r ; b ) = K n (cid:88) k =1 b k ( ψ n )ˆ θ k,n φ k ( r − r min ) = 1 sa d | W n | (cid:54) = (cid:88) u,v ∈ X Wn v − u ∈ B Rr min w ( (cid:107) v − u (cid:107) ) ϕ n ( v − u, r ) ρ ( u ) ρ ( v ) (cid:107) v − u (cid:107) d − e n ( v − u ) | , where ϕ n ( h, r ) = K n (cid:88) k =1 b k ( ψ n ) φ k ( (cid:107) h (cid:107) − r min ) φ k ( r − r min ) . (10)In the results below we refer to higher order normalized joint intensities g ( k ) of X .Deﬁne the k ’th order joint intensity of X by the identity E  (cid:54) = (cid:88) u ,...,u k ∈ X ( u ∈ A , . . . , u k ∈ A k )  = (cid:90) A ×···× A k ρ ( k ) ( v , . . . , v k )d v · · · d v k for bounded subsets A i ⊂ R d , i = 1 , . . . , k , where the sum is over distinct u , . . . , u k .We then let g ( k ) ( v , . . . , v k ) = ρ ( k ) ( v , . . . , v k ) / { ρ ( v ) · · · ρ ( v k ) } and assume with anabuse of notation that the g ( k ) are translation invariant for k = 3 , , i.e. g ( k ) ( v , . . . , v k ) = g ( k ) ( v − v , . . . , v k − v ) . Consistency of the orthogonal series estimator can be established under fairly mildconditions following the approach in Hall (1987). We ﬁrst state some conditions thatensure (see Section S2 of the supplementary material) that var (ˆ θ k,n ) ≤ C / | W n | forsome < C < ∞ :V1 There exists < ρ min < ρ max < ∞ such that for all u ∈ R d , ρ min ≤ ρ ( u ) ≤ ρ max . 62 For any h, h , h ∈ B Rr min , g ( h ) w ( (cid:107) h (cid:107) − r min ) ≤ C (cid:107) h (cid:107) d − and g (3) ( h , h ) ≤ C for constants C , C < ∞ .V3 A constant C < ∞ can be found such that sup h ,h ∈ B Rr min (cid:82) R d (cid:12)(cid:12)(cid:12) g (4) ( h , h , h + h ) − g ( h ) g ( h ) (cid:12)(cid:12)(cid:12) d h ≤ C .The ﬁrst part of V2 is needed to ensure ﬁnite variances of the ˆ θ k,n and is discussed indetail in Section 3.2. The second part simply requires that g (3) is bounded. The con-dition V3 is a weak dependence condition which is also used for asymptotic normalityin Section 4.3 and for estimation of θ k in Section 5.Regarding the smoothing scheme, we assumeS1 B = sup k,ψ (cid:12)(cid:12) b k ( ψ ) (cid:12)(cid:12) < ∞ and for all ψ , (cid:80) ∞ k =1 (cid:12)(cid:12) b k ( ψ ) (cid:12)(cid:12) < ∞ .S2 ψ n → ψ ∗ for some ψ ∗ , and lim ψ → ψ ∗ max ≤ k ≤ m (cid:12)(cid:12) b k ( ψ ) − (cid:12)(cid:12) = 0 for all m ≥ .S3 | W n | − (cid:80) ∞ k =1 (cid:12)(cid:12) b k ( ψ n ) (cid:12)(cid:12) → .E.g. for the simplest smoothing scheme, ψ n = K n , ψ ∗ = ∞ and we assume K n / | W n | → . Assuming the above conditions we now verify that the mean integrated squared er-ror of ˆ g o,n tends to zero as n → ∞ . By (7), MISE (cid:0) ˆ g o,n , w (cid:1) / sa d = (cid:80) ∞ k =1 (cid:2) b k ( ψ n ) var (ˆ θ k )+ θ k { b k ( ψ n ) − } (cid:3) . By V1-V3 and S1 the right hand side is bounded by BC | W n | − ∞ (cid:88) k =1 (cid:12)(cid:12) b k ( ψ n ) (cid:12)(cid:12) + max ≤ k ≤ m θ k m (cid:88) k =1 ( b k ( ψ n ) − + ( B + 1) ∞ (cid:88) k = m +1 θ k . By Parseval’s identity, (cid:80) ∞ k =1 θ k < ∞ . The last term can thus be made arbitrarily smallby choosing m large enough. It also follows that θ k tends to zero as k → ∞ . Hence, byS2, the middle term can be made arbitrarily small by choosing n large enough for anychoice of m . Finally, the ﬁrst term can be made arbitrarily small by S3 and choosing n large enough. The estimators ˆ θ k,n as well as the estimator ˆ g o,n ( r ; b ) are of the form S n = 1 sa d | W n | (cid:54) = (cid:88) u,v ∈ X Wn v − u ∈ B Rr min f n ( v − u ) ρ ( u ) ρ ( v ) e n ( v − u ) (11)for a sequence of even functions f n : R d → R . We let τ n = | W n | var ( S n ) .To establish asymptotic normality of estimators of the form (11) we need certainmixing properties for X as in Waagepetersen and Guan (2009). The strong mixingcoefﬁcient for the point process X on R d is given by (Ivanoff, 1982; Politis et al.,1998) α X ( m ; a , a ) = sup (cid:8)(cid:12)(cid:12) pr ( E ∩ E ) − pr ( E ) pr ( E ) (cid:12)(cid:12) : E ∈ F X ( B ) , E ∈ F X ( B ) , | B | ≤ a , | B | ≤ a , D ( B , B ) ≥ m, B , B ∈ B ( R d ) (cid:9) , B ( R d ) denotes the Borel σ -ﬁeld on R d , F X ( B i ) is the σ -ﬁeld generated by X ∩ B i and D ( B , B ) = inf (cid:8) max ≤ i ≤ d | u i − v i | : u = ( u , . . . , u d ) ∈ B , v = ( v , . . . , v d ) ∈ B (cid:9) . To verify asymptotic normality we need the following assumptions as well as V1(the conditions V2 and V3 are not needed due to conditions N2 and N4 below):N1 The mixing coefﬁcient satisﬁes α X ( m ; ( s + 2 R ) d , ∞ ) = O ( m − d − ε ) for some s, ε > .N2 There exists a η > and L < ∞ such that g ( k ) ( h , . . . , h k − ) ≤ L for k = 2 , . . . , (cid:100) η (cid:101) ) and all h , . . . , h k − ∈ R d .N3 lim inf n →∞ τ n > .N4 There exists L < ∞ so that | f n ( h ) | ≤ L for all n ≥ and h ∈ B Rr min .The conditions N1-N3 are standard in the point process literature, see e.g. the discus-sions in Waagepetersen and Guan (2009) and Coeurjolly and Møller (2014). The condi-tion N3 is difﬁcult to verify and is usually left as an assumption, see Waagepetersen andGuan (2009), Coeurjolly and Møller (2014) and Dvoˇr´ak and Prokeˇsov´a (2016). How-ever, at least in the stationary case, and in case of estimation of ˆ θ k,n , the expression forvar (ˆ θ k,n ) in Section S2 of the supplementary material shows that τ n = | W n | var (ˆ θ k,n ) converges to a constant which supports the plausibility of condition N3. We discussN4 in further detail below when applying the general framework to ˆ θ k,n and ˆ g o,n . Thefollowing theorem is proved in Section S3 of the supplementary material. Theorem 1

Under conditions V1, N1-N4, τ − n | W n | / (cid:8) S n − E ( S n ) (cid:9) D −→ N (0 , . ˆ θ k,n and ˆ g o,n In case of estimation of θ k , ˆ θ k,n = S n with f n ( h ) = φ k ( (cid:107) h (cid:107) − r min ) w ( (cid:107) h (cid:107) − r min ) / (cid:107) h (cid:107) d − . The assumption N4 is then straightforwardly seen to hold in the caseof the Fourier-Bessel basis where | φ k ( r ) | ≤ | φ k (0) | and w ( r ) = r d − . For the cosinebasis, N4 does not hold in general and further assumptions are needed, cf. the discus-sion in Section 3.2. For simplicity we here just assume r min > . Thus we state thefollowing Corollary 1

Assume V1, N1-N4, and, in case of the cosine basis, that r min > . Then { var (ˆ θ k,n ) } − / (ˆ θ k,n − θ k ) D −→ N (0 , . For ˆ g o,n ( r ; b ) = S n , f n ( h ) = ϕ n ( h, r ) w ( (cid:107) h (cid:107) − r min ) (cid:107) h (cid:107) d − = w ( (cid:107) h (cid:107) − r min ) (cid:107) h (cid:107) d − K n (cid:88) k =1 b k ( ψ n ) φ k ( (cid:107) h (cid:107)− r min ) φ k ( r − r min ) , where ϕ n is deﬁned in (10). In this case, f n is typically not uniformly bounded sincethe number of not necessarily decreasing terms in the sum deﬁning ϕ n in (10) growswith n . We therefore introduce one more condition:85 There exist an ω > and M ω < ∞ so that K − ωn K n (cid:88) k =1 b k ( ψ n ) (cid:12)(cid:12) φ k ( r − r min ) φ k ( (cid:107) h (cid:107) − r min ) (cid:12)(cid:12) ≤ M ω for all h ∈ B Rr min .Given N5, we can simply rescale: ˜ S n := K − ωn S n and ˜ τ n := K − ωn τ n . Then, assuming lim inf n →∞ ˜ τ n > , Theorem 1 gives the asymptotic normality of ˜ τ − n | W n | / { ˜ S n − E ( ˜ S n ) } which is equal to τ − n | W n | / { S n − E ( S n ) } . Hence we obtain Corollary 2

Assume V1, N1-N2, N5 and lim inf n →∞ K − ωn τ n > . In case of thecosine basis, assume further r min > . Then for r ∈ ( r min , r min + R ) , τ − n | W n | / (cid:2) ˆ g o,n ( r ; b ) − E { ˆ g o,n ( r ; b ) } (cid:3) D −→ N (0 , . In case of the simple smoothing scheme b k ( ψ n ) = ( k ≤ K n ) , we take ω = 1 forthe cosine basis. For the Fourier-Bessel basis we take ω = 4 / when d = 1 and ω = d/ / when d > (see the derivations in Section S6 of the supplementarymaterial). In practice we choose K , and other parameters in the smoothing scheme b ( ψ ) , byminimizing an estimate of the mean integrated squared error. This is equivalent tominimizingsa d I ( ψ ) = MISE (ˆ g o , w ) − (cid:90) r min + Rr min (cid:8) g ( r ) − (cid:9) w ( r )d r = K (cid:88) k =1 (cid:2) b k ( ψ ) E { (ˆ θ k ) }− b k ( ψ ) θ k (cid:3) . (12)In practice we must replace (12) by an estimate. Deﬁne (cid:98) θ k as (cid:54) = (cid:88) u,v,u (cid:48) ,v (cid:48) ∈ X W v − u,v (cid:48) − u (cid:48) ∈ B Rr min φ k ( (cid:107) v − u (cid:107) − r min ) φ k ( (cid:107) v (cid:48) − u (cid:48) (cid:107) − r min ) w ( (cid:107) v − u (cid:107) − r min ) w ( (cid:107) v (cid:48) − u (cid:48) (cid:107) − r min ) sa d ρ ( u ) ρ ( v ) ρ ( u (cid:48) ) ρ ( v (cid:48) ) (cid:107) v − u (cid:107) d − (cid:107) v (cid:48) − u (cid:48) (cid:107) d − | W ∩ W v − u || W ∩ W v (cid:48) − u (cid:48) | . Then, referring to the set-up in Section 4 and assuming V3, lim n →∞ E ( (cid:100) θ k,n ) → (cid:40)(cid:90) R g ( r + r min ) φ k ( r ) w ( r )d r (cid:41) = θ k (see Section S4 of the supplementary material) and hence (cid:100) θ k,n is an asymptoticallyunbiased estimator of θ k . The estimator is obtained from (ˆ θ k ) by retaining only termswhere all four points u, v, u (cid:48) , v (cid:48) involved are distinct. In simulation studies, (cid:98) θ k had asmaller root mean squared error than (ˆ θ k ) for estimation of θ k .Thus ˆ I ( ψ ) = K (cid:88) k =1 (cid:8) b k ( ψ ) (ˆ θ k ) − b k ( ψ ) (cid:98) θ k (cid:9) (13)9 oisson Thomas VarGamma DPP0.00 0.04 0.08 0.12 0.00 0.04 0.08 0.12 0.00 0.04 0.08 0.12 0.00 0.04 0.08 0.120.000.250.500.751.0051015202.55.07.50.500.751.001.251.50 r g (r) Figure 1: Pair correlation functions for the point processes considered in thesimulation study.is an asymptotically unbiased estimator of (12). Moreover, (13) is equivalent to thefollowing slight modiﬁcation of Guan (2007a)’s criterion (3): (cid:90) r min + Rr min (cid:8) ˆ g o ( r ; b ) (cid:9) w ( r − r min )d r − sa d (cid:54) = (cid:88) u,v ∈ X W v − u ∈ B Rr min ˆ g −{ u,v } o ( (cid:107) v − u (cid:107) ; b ) w ( (cid:107) v − u (cid:107) − r min ) ρ ( u ) ρ ( v ) | W ∩ W v − u | . For the simple smoothing scheme b k ( K ) = ( k ≤ K ) , (13) reduces to ˆ I ( K ) = K (cid:88) k =1 (cid:8) (ˆ θ k ) − (cid:98) θ k (cid:9) = K (cid:88) k =1 (ˆ θ k ) (1 − b ∗ k ) , (14)where ˆ b ∗ k = (cid:98) θ k / (ˆ θ k ) is an estimator of b ∗ k in (8).In practice, uncertainties of ˆ θ k and (cid:98) θ k lead to numerical instabilities in the mini-mization of (13) with respect to ψ . To obtain a numerically stable procedure we ﬁrstdetermine K as ˆ K = inf { ≤ k ≤ K max : (ˆ θ k +1 ) − (cid:100) θ k +1 > } = inf { ≤ k ≤ K max : ˆ b ∗ k +1 < / } . (15)That is, ˆ K is the ﬁrst local minimum of (14) larger than 1 and smaller than an upperlimit K max which we chose to be 49 in the applications. This choice of K is also usedfor the reﬁned and the Wahba smoothing schemes. For the reﬁned smoothing schemewe thus let b k = ( k ≤ ˆ K )ˆ b ∗ k . For the Wahba smoothing scheme b k = ( k ≤ ˆ K ) / (1+ˆ c k ˆ c ) , where ˆ c and ˆ c minimize (cid:80) ˆ Kk =1 (cid:110) (ˆ θ k ) / (1 + c k c ) − (cid:98) θ k / (1 + c k c ) (cid:111) over c > and c > . We compare the performance of the orthogonal series estimators and the kernel esti-mators for data simulated on W = [0 , or W = [0 , from four point processeswith constant intensity ρ = 100 . More speciﬁcally, we consider n sim = 1000 re-alizations from a Poisson process, a Thomas process (parent intensity κ = 25 , dis-persion standard deviation ω = 0 . ), a Variance Gamma cluster process (parentintensity κ = 25 , shape parameter ν = − / , dispersion parameter ω = 0 . ,Jalilian et al., 2013), and a determinantal point process with pair correlation function g ( r ) = 1 − exp {− r/α ) } and α = 0 . . The pair correlation functions of thesepoint processes are shown in Figure 1. 10 l ll l l l l ll l l l l ll l l l l ll l l VarGamma(rmin, .025] VarGamma(rmin, R] DPP(rmin, .025] DPP(rmin, R]Poisson(rmin, .025] Poisson(rmin, R] Thomas(rmin, .025] Thomas(rmin, R]0.060 0.085 0.125 0.060 0.085 0.125 0.060 0.085 0.125 0.060 0.085 0.1250.060 0.085 0.125 0.060 0.085 0.125 0.060 0.085 0.125 0.060 0.085 0.1250.80.91.01.11.21.3−1.5−1.0−0.50.00.51.01.11.21.31.41.5−2−10101230.20.30.40.5012340.20.30.40.50.6 R l og r e l a t i v e e ff i c i en y w i t h r e s pe c t t o g ^ k Type simplerefinedWahba

Estimator l kernel dkernel cBesselcosine Figure 2: Plots of log relative efﬁciencies for small lags ( r min , . and alllags ( r min , R ] , R = 0 . , . , . , and W = [0 , . Black: kernel estima-tors. Blue and red: orthogonal series estimators with Bessel respectively cosinebasis. Lines serve to ease visual interpretation.For each realization, g ( r ) is estimated for r in ( r min , r min + R ) , with r min = 10 − and R = 0 . , . , . , using the kernel estimators ˆ g k ( r ; b ) , ˆ g d ( r ; b ) and ˆ g c ( r ; b ) or the orthogonal series estimator ˆ g o ( r ; b ) . The Epanechnikov kernel with bandwidth b = 0 . / √ ˆ ρ is used for ˆ g k ( r ; b ) and ˆ g d ( r ; b ) while the bandwidth of ˆ g c ( r ; b ) is cho-sen by minimizing Guan (2007a)’s estimate (3) of the mean integrated squared error.For the orthogonal series estimator, we consider both the cosine and the Fourier-Besselbases with simple, reﬁned or Wahba smoothing schemes. For the Fourier-Bessel basiswe use the modiﬁed orthogonal series estimator described in Section 3.4. The parame-ters for the smoothing scheme are chosen according to Section 5.From the simulations we estimate the mean integrated squared error (2) with w ( r ) =1 of each estimator ˆ g m , m = k, d, c, o , over the intervals [ r min , . (small spatiallags) and [ r min , r min + R ] (all lags). We consider the kernel estimator ˆ g k as the base-line estimator and compare any of the other estimators ˆ g with ˆ g k using the log relativeefﬁciency e I (ˆ g ) = log { (cid:91) MISE I (ˆ g k ) / (cid:91) MISE I (ˆ g ) } , where (cid:91) MISE I (ˆ g ) denotes the estimatedmean squared integrated error over the interval I for the estimator ˆ g . Thus e I (ˆ g ) > indicates that ˆ g outperforms ˆ g k on the interval I . Results for W= [0 , are summarizedin Figure 2.For all types of point processes, the orthogonal series estimators outperform ordoes as well as the kernel estimators both at small lags and over all lags. The detailedconclusions depend on whether the non-repulsive Poisson, Thomas and Var Gammaprocesses or the repulsive determinantal process are considered. Orthogonal-Besselwith reﬁned or Wahba smoothing is superior for Poisson, Thomas and Var Gammabut only better than ˆ g c for the determinantal point process. The performance of theorthogonal-cosine estimator is between or better than the performance of the kernelestimators for Poisson, Thomas and Var Gamma and is as good as the best kernelestimator for determinantal. Regarding the kernel estimators, ˆ g c is better than ˆ g d forPoisson, Thomas and Var Gamma and worse than ˆ g d for determinantal. The above con-clusions are stable over the three R values considered. For W = [0 , (see Figure S111able 1: Monte Carlo mean, standard error, skewness (S) and kurtosis (K) of ˆ g o ( r ) using the Bessel basis with the simple smoothing scheme in case of theThomas process on observation windows W = [0 , , W = [0 , and W =[0 , . r g ( r ) ˆ E { ˆ g o ( r ) } [ ˆ var { ˆ g o ( r ) } ] / ˆ S { ˆ g o ( r ) } ˆ K { ˆ g o ( r ) } W W W W W W ˆ g c is similar to the orthogonal series estimators. For determinantaland W = [0 , , ˆ g c is better than orthogonal-Bessel-reﬁned/Wahba but still inferiorto orthogonal-Bessel-simple and orthogonal-cosine. Figures S2 and S3 in the supple-mentary material give a more detailed insight in the bias and variance properties for ˆ g k , ˆ g c , and the orthogonal series estimators with simple smoothing scheme. Table S1in the supplementary material shows that the selected K in general increases when theobservation window is enlargened, as required for the asymptotic results. The generalconclusion, taking into account the simulation results for all four types of point pro-cesses, is that the best overall performance is obtained with orthogonal-Bessel-simple,orthogonal-cosine-reﬁned or orthogonal-cosine-Wahba.To supplement our theoretical results in Section 4 we consider the distribution ofthe simulated ˆ g o ( r ; b ) for r = 0 . and r = 0 . in case of the Thomas processand using the Fourier-Bessel basis with the simple smoothing scheme. In addition to W = [0 , and W = [0 , , also W = [0 , is considered. The mean, standarderror, skewness and kurtosis of ˆ g o ( r ) are given in Table 1 while histograms of theestimates are shown in Figure S3. The standard error of ˆ g o ( r ; b ) scales as | W | / inaccordance with our theoretical results. Also the bias decreases and the distributions ofthe estimates become increasingly normal as | W | increases. We consider point patterns of locations of

Acalypha diversifolia (528 trees),

Lon-chocarpus heptaphyllus (836 trees) and

Capparis frondosa (3299 trees) species in the1995 census for the m × m Barro Colorado Island plot (Hubbell and Fos-ter, 1983; Condit, 1998). To estimate the intensity function of each species, we use alog-linear regression model depending on soil condition (contents of copper, mineral-ized nitrogen, potassium and phosphorus and soil acidity) and topographical (elevation,slope gradient, multiresolution index of valley bottom ﬂatness, ncoming mean solar ra-diation and the topographic wetness index) variables. The regression parameters areestimated using the quasi-likelihood approach in Guan et al. (2015). The point patternsand ﬁtted intensity functions are shown in Figure S5 in the supplementary material.The pair correlation function of each species is then estimated using the bias cor-rected kernel estimator ˆ g c ( r ; b ) with b determined by minimizing (3) and the orthogonal12 calypha diversifolia r g ^ ( r ) - kernel: g^ c ( r ; b^ = ) Bessel: K^ = , refinedcosine: K^ = , refined Lonchocarpus heptaphyllus . . . . . r g ^ ( r ) - kernel: g^ c ( r ; b^ = ) Bessel: K^ = , refinedcosine: K^ = , refined Capparis frondosa r g ^ ( r ) - kernel: g^ c ( r ; b^ = ) Bessel: K^ = , refinedcosine: K^ = , refinedcosine: K = , refined Figure 3: Estimated pair correlation functions for tropical rain forest trees.series estimator ˆ g o ( r ; b ) with both Fourier-Bessel and cosine basis, reﬁned smoothingscheme and the optimal cut-offs ˆ K obtained from (15); see Figure 3.For Lonchocarpus the three estimates are quite similar while for

Acalypha and

Capparis the estimates deviate markedly for small lags and then become similar forlags greater than respectively 2 and 8 meters. For

Capparis and the cosine basis, thenumber of selected coefﬁcients coincides with the chosen upper limit 49 for the numberof coefﬁcients. The cosine estimate displays oscillations which appear to be artefactsof using high frequency components of the cosine basis. The function (14) decreasesvery slowly after K = 7 so we also tried the cosine estimate with K = 7 which givesa more reasonable estimate. Acknowledgement

Rasmus Waagepetersen is supported by the Danish Council for Independent Research— Natural Sciences, grant ”Mathematical and Statistical Analysis of Spatial Data”,and by the ”Centre for Stochastic Geometry and Advanced Bioimaging”, funded bythe Villum Foundation.

Supplementary material

Supplementary material includes proofs of consistency and asymptotic normality re-sults and details of the simulation study and data analysis.

References

Baddeley, A., Rubak, E., and Turner, R. (2015).

Spatial Point Patterns: Methodologyand Applications with R . Chapman & Hall/CRC Interdisciplinary Statistics. Chap-man & Hall/CRC, Boca Raton.Baddeley, A. J., Møller, J., and Waagepetersen, R. (2000). Non- and semi-parametricestimation of interaction in inhomogeneous point patterns.

Statistica Neerlandica ,54:329–350.Coeurjolly, J.-f. and Møller, J. (2014). Variational approach for spatial point processintensity estimation.

Bernoulli , 20(3):1097–1125.13oeurjolly, J.-F., Møller, J., and Waagepetersen, R. (2016). A tutorial on Palm distri-butions for spatial point processes.

International Statistical Review . to appear.Condit, R. (1998).

Tropical forest census plots . Springer-Verlag and R. G. LandesCompany, Berlin, Germany and Georgetown, Texas.Dvoˇr´ak, J. and Prokeˇsov´a, M. (2016). Asymptotic properties of the minimum contrastestimators for projections of inhomogeneous space-time shot-noise cox processes.

Applications of Mathematics , 61(4):387–411.Efromovich, S. (2010). Orthogonal series density estimation.

Wiley InterdisciplinaryReviews: Computational Statistics , 2(4):467–476.Guan, Y. (2007a). A least-squares cross-validation bandwidth selection approachin pair correlation function estimations.

Statistics and Probability Letters ,77(18):1722–1729.Guan, Y. (2007b). A composite likelihood cross-validation approach in selecting band-width for the estimation of the pair correlation function.

Scandinavian Journal ofStatistics , 34(2):336–346.Guan, Y., Jalilian, A., and Waagepetersen, R. (2015). Quasi-likelihood for spatial pointprocesses.

Journal of the Royal Statistical Society. Series B: Statistical Methodology ,77(3):677–697.Hall, P. (1987). Cross-validation and the smoothing of orthogonal series density esti-mators.

Journal of Multivariate Analysis , 21(2):189–206.Hubbell, S. P. and Foster, R. B. (1983). Diversity of canopy trees in a Neotropical forestand implications for the conservation of tropical trees. In Sutton, S. L., Whitmore,T. C., and Chadwick, A. C., editors,

Tropical Rain Forest: Ecology and Manage-ment. , pages 25–41. Blackwell Scientiﬁc Publications, Oxford.Illian, J., Penttinen, A., Stoyan, H., and Stoyan, D. (2008).

Statistical Analysis andModelling of Spatial Point Patterns , volume 76. Wiley, London.Ivanoff, G. (1982). Central limit theorems for point processes.

Stochastic Processesand their Applications , 12(2):171–186.Jalilian, A., Guan, Y., and Waagepetersen, R. (2013). Decomposition of variance forspatial Cox processes.

Scandinavian Journal of Statistics , 40(1):119–137.Lavancier, F., Møller, J., and Rubak, E. (2015). Determinantal point process modelsand statistical inference.

Journal of the Royal Statistical Society: Series B (StatisticalMethodology) , 77:853–877.Loh, J. M. and Jang, W. (2010). Estimating a cosmological mass bias parameter withbootstrap bandwidth selection.

Journal of the Royal Statistical Society: Series C(Applied Statistics) , 59(5):761–779.Møller, J. and Waagepetersen, R. P. (2003).

Statistical inference and simulation forspatial point processes . Chapman and Hall/CRC, Boca Raton.Politis, D. N., Paparoditis, E., and Romano, J. P. (1998). Large sample inference for ir-regularly space dependent opservations based on subsampling.

Sankhya: The IndianJournal of Statistics , 60(2):274–292. 14toyan, D. and Stoyan, H. (1994).

Fractals, Random Shapes and Point Fields: Methodsof Geometrical Statistics . Wiley.Stoyan, D. and Stoyan, H. (1996). Estimating pair correlation functions of planarcluster processes.

Biometrical Journal , 38(3):259–271.Waagepetersen, R. and Guan, Y. (2009). Two-step estimation for inhomogeneous spa-tial point processes.

Journal of the Royal Statistical Society: Series B (StatisticalMethodology) , 71(3):685–702.Wahba, G. (1981). Data-based optimal smoothing of orthogonal series density esti-mates.

Annals of Statistics , 9:146–l56.Yue, Y. R. and Loh, J. M. (2013). Bayesian nonparametric estimation of pair correla-tion function for inhomogeneous spatial point processes.