Some recent developments in statistics for spatial point patterns
Jesper Møller and Rasmus Waagepetersen
Department of Mathematical Sciences, Aalborg University
Fredrik Bajers Vej 7G, DK-9220 Aalborg, Denmark
email: [email protected], [email protected]
September 6, 2016
Abstract
This paper reviews developments in statistics for spatial point processes obtained within roughly the last decade. These developments include new classes of spatial point process models such as determinantal point processes, models incorporating both regularity and aggregation, and models where points are randomly distributed around latent geometric structures. Regarding parametric inference, the main focus is on various types of estimating functions derived from so-called innovation measures. Optimality of such estimating functions is discussed as well as computational issues. Maximum likelihood inference for determinantal point processes and Bayesian inference are briefly considered too. Concerning non-parametric inference, we consider extensions of functional summary statistics to the case of inhomogeneous point processes as well as new approaches to simulation based inference.
Keywords: determinantal point process, estimating function, functional summary statistic, latent geometric structure, regularity and aggregation
A spatial point pattern data set is a finite collection of points specifying the locations of some 'events' observed within a given spatial region W (the 'observation window'). Often W is a k-dimensional compact subset of the k-dimensional Euclidean space R^k, with k = 2 or k = 3 in most cases, while other but less studied examples are manifolds such as W = S^d, the d-dimensional unit sphere in R^k with d = k − 1. Figures 1-2 show some examples, which are discussed in Section 1.2.

A spatial point process is a stochastic model for a spatial point pattern data set. To adjust for the effect of unobserved events, or if W is very large, or simply for convenience when constructing models, the spatial point process may be defined on a possibly unbounded region S containing W. It may then consist of infinitely many events. Sometimes the events are of different types, leading to multivariate spatial point processes. These are special cases of marked spatial point processes, where some extra information about each event is collected. For example, in case of the locations of trees, each mark may specify the tree's species or its diameter at breast height. For dimensions d ≥ 2, there is no natural ordering on the points, but sometimes an extension to a spatio-temporal point process is considered, where the direction of time usually plays a particular role.

Today spatial point pattern analysis is widespread in many fields of science and a rapid development of new statistical models and methods takes place. Recent textbooks on statistics for spatial point processes include Diggle (2003), Møller & Waagepetersen (2004), Illian et al. (2008), and Chiu et al. (2013). Chapter 4 of Gelfand et al. (2010) collects various reviews on statistics for spatial point processes. Moreover, Baddeley et al. (2015) provides a very accessible account for applied statisticians, where the exposition is closely integrated with analyses performed using the authors' spatstat R package (Baddeley & Turner, 2005). This package is comprehensive, flexible, and the main software for spatial point pattern analysis.
The left panel in Figure 1 shows locations of so-called vesicles in a microscopic image of a slice of a synapse from a rat. The observation window W has a non-standard form, being the planar region defined by the outer curve representing the membrane of the synapse minus the region defined by the inner curve representing the extent of a mitochondrion. This is a part of a large data set collected to study whether stress affects the spatial distribution (e.g. the regularity) of vesicles, see Khanmohammadi et al. (2014) for further details. The vesicles data are at the microscopic scale (the enclosing rectangle of W has dimension 250 × 400 nm) with just 16 points that form a regular pattern, at least locally. This is in contrast to the data set in the right panel. This data set contains 2640 clustered locations of Psychotria horizontalis trees in the 1000 × 500 m Barro Colorado Island research plot. Ecologists study such data sets for several species to investigate hypotheses of biodiversity, see e.g. Hubbell & Foster (1983), Condit et al. (1996), and Condit (1998).

The left panel in Figure 2 shows the locations of 623 pyramidal cells from the Brodmann area 4 of the grey matter of the human brain. According to the minicolumn hypothesis (Mountcastle, 1957) such pyramidal brain cells should have a columnar arrangement perpendicular to the pial surface of the brain, and this should be highly pronounced in Brodmann area 4. However, this hypothesis has been much debated, see Rafati et al. (2016) and the references therein. It is not easy to see by eye whether there is a columnar arrangement, and statistical techniques for detecting this have therefore been developed (Møller et al., 2015b). The right panel shows the sky positions on the celestial sphere of 10,611 nearby galaxies catalogued in the Revised New General Catalogue and
Figure 1: Examples of planar point pattern data sets. Left: locations of vesicles in a slice of a synapse from a rat. Inner polygon shows extent of a mitochondrion where vesicles do not occur. Right: locations of Psychotria trees in the Barro Colorado Island research plot. See Section 1.2 for further details.
Figure 2: Examples of point pattern data sets in 3D space and on the sphere. Left: locations of pyramidal brain cells. Right: sky positions of galaxies projected onto the globe. See Section 1.2 for further details.

Index Catalogue, where the Milky Way obstructs the view and the point pattern is inhomogeneous and exhibits clustering, see Lawrence et al. (2016). Statistical models and tools for point processes on the sphere have recently been developed (Robeson et al., 2014; Lawrence et al., 2016; Møller & Rubak, 2016).

This paper is mainly confined to a review of recent spatial point process models and methods for analyzing spatial point pattern data sets, i.e. without marks or times included and with no ordering on the points. As we have a broad audience in mind, technical details are suppressed. The paper is to some extent a follow-up to our discussion paper Møller & Waagepetersen (2007), which gives a concise and non-technical introduction to the theory of statistical spatial point pattern analysis, making analogies with generalized linear models and random effect models. Thus we mainly focus on developments after the publication of Møller & Waagepetersen (2007) but make no pretense of a complete literature review.

Research in statistical inference for spatial point patterns revolves around the following three main topics:

• development of flexible models for point patterns exhibiting clustering, regularity, trends depending on spatial covariates, and combinations of these features;

• methodology for fitting parametric spatial point process models using moment relations or likelihood based methods, often implemented using Markov chain Monte Carlo (MCMC) methods;

• development of non-parametric summary statistics for assessing spatial interaction and validating fitted models.

After providing a brief introduction to basic theoretical concepts in Section 2, our exposition follows the above categorization.
While much research on modelling has focused on Poisson, Cox, cluster, and Gibbs models (briefly reviewed in Section 3.1), our modelling Section 3 focuses on the classes of determinantal and permanental point processes. The potential of these models for statistical applications has only recently been explored. In addition, models characterized by incorporating both clustering and regularity and involving latent geometric structures are reviewed. Section 4 primarily considers estimating function inference, where estimating functions are obtained from so-called innovation measures. Maximum likelihood inference for determinantal point processes and Bayesian inference for spatial point processes are reviewed too. Section 5 is concerned with functional summary statistics for non-stationary point processes and new developments in simulation-based inference based on such summary statistics. Finally, Section 6 discusses further recent developments and perspectives for future research.

This section specifies our set-up for spatial point processes. For extensions to multiple and marked point processes, and for mathematical details, in particular measure theoretical details, the reader may e.g. consult Møller & Waagepetersen (2004). For instance, below S and B are always Borel sets and h is a measurable function, but we do not emphasize such matters.

Let S ⊆ R^k be the state space for some events, where typically k = 2 or k = 3, and often but not always S is of the same dimension, cf. Section 1.1. By a spatial point process on S we mean a locally finite random subset X ⊂ S. Letting X_B = X ∩ B for B ⊆ S, locally finite means that the count N(B) = #X_B is finite for any bounded B ⊆ S.
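In computational terms, a finite point pattern is just a list of coordinates, and the counts N(B) are obtained by masking; a minimal sketch (the toy pattern and boxes are our own illustrative choices):

```python
import numpy as np

# A finite point pattern in the window W = [0,1]^2, stored as an (n, 2) array.
x = np.array([[0.10, 0.20], [0.50, 0.50], [0.52, 0.48], [0.90, 0.70]])

def count(points, lo, hi):
    """N(B) for the axis-aligned box B = [lo[0], hi[0]] x [lo[1], hi[1]]."""
    inside = np.all((points >= lo) & (points <= hi), axis=1)
    return int(inside.sum())

n_W = count(x, (0.0, 0.0), (1.0, 1.0))  # N(W): all four points
n_B = count(x, (0.4, 0.4), (0.6, 0.6))  # N(B): the two points near (0.5, 0.5)
```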
Then the distribution of X can be defined by two equivalent approaches: by the finite dimensional distributions of the counts; or, for any bounded B ⊆ S, by specifying first the distribution of N(B) and second the distribution of the events in X_B conditional on knowing N(B).

When S = R^k and the distribution of X is invariant under translations in R^k, respectively rotations about the origin in R^k, we say that X is stationary, respectively isotropic. Similarly, when d = k − 1, S = S^d is the d-dimensional sphere, and the distribution of X is invariant under rotations about the origin in R^k, we say that X is isotropic.

For n = 1, 2, ... and pairwise disjoint bounded sets B_1, ..., B_n ⊆ S, we define the nth order factorial moment measure μ^(n)(B) of the product set B = B_1 × ... × B_n as the expected value of ∏_{i=1}^n N(B_i). This extends by standard methods to a measure μ^(n) on S^n by setting

  ∫_{S^n} h(u_1, ..., u_n) dμ^(n)(u_1, ..., u_n) = E Σ^{≠}_{u_1, ..., u_n ∈ X} h(u_1, ..., u_n)

for non-negative functions h defined on S^n. Here ≠ over the summation sign means that u_1, ..., u_n are pairwise distinct.

For simplicity and specificity, unless otherwise stated, we assume that S is k-dimensional and μ^(n) has a density ρ^(n) with respect to Lebesgue measure on (R^k)^n restricted to S^n, so that

  E Σ^{≠}_{u_1, ..., u_n ∈ X} h(u_1, ..., u_n) = ∫_{S^n} h(u_1, ..., u_n) ρ^(n)(u_1, ..., u_n) du_1 ... du_n.

Then ρ^(n) is called the nth order joint intensity function. Note that this function is uniquely defined except for a Lebesgue nullset. For pairwise distinct u_1, ..., u_n ∈ S, we may interpret ρ^(n)(u_1, ..., u_n) du_1 ... du_n as the probability of observing a point in each of n infinitesimally small volumes du_1, ..., du_n containing u_1, ..., u_n, respectively. If instead e.g. S = S^d, then everywhere above we should replace Lebesgue measure with d-dimensional surface measure on S^d.

The cases n = 1, 2 are particularly important: μ = μ^(1) is the intensity measure, ρ = ρ^(1) is the intensity function, and the pair correlation function (pcf) is defined for u, v ∈ S and u ≠ v by g(u, v) = ρ^(2)(u, v) / {ρ(u)ρ(v)} if ρ(u)ρ(v) >
0, and g(u, v) = 0 otherwise. The pcf is thus a normalized version of ρ^(2) and it is usually easier to interpret: roughly speaking, the case g(u, v) = 1 corresponds to that 'u and v appear independently of each other', the case g(u, v) > 1 to 'attraction between u and v' or 'aggregation e.g. due to covariates', and the case g(u, v) < 1 to 'repulsion between u and v' or 'regularity due to the model in question' (as exemplified later in (h) in Section 3.2.1). Note that this interpretation of g(u, v) may only be meaningful when u and v are sufficiently close. Since the diagonal in S × S has zero Lebesgue measure, the definition of g(u, u) can be arbitrary. Depending on the particular model there is often a natural choice as explained later.

If for each u ∈ S, π(u) ∈ [0, 1] is a given number, then an independent π-thinned process X_th is obtained by independently retaining each point u ∈ X with probability π(u). It follows that X_th has nth order joint intensity function ρ^(n)_th(u_1, ..., u_n) = π(u_1) ... π(u_n) ρ^(n)(u_1, ..., u_n), and the pcf is the same for the two processes.

When S = R^k, second order intensity-reweighted stationarity means that g(u, v) = g(u − v) is invariant under translations in R^k. For example, stationarity of X implies second order intensity-reweighted stationarity of both X and X_th. Baddeley et al. (2000) discuss several other examples of second order intensity-reweighted stationary point processes. If instead S = S^d, then second order intensity-reweighted isotropy means that g(u, v) = g(d(u, v)) is invariant under rotations, where d(u, v) is the great circle distance.

We sometimes refer to so-called Palm distributions. The reduced Palm distribution of X given a location u ∈ S can be regarded as the conditional distribution of X \ {u} given that X has a point at u. We write X^!_u for a spatial point process distributed according to the reduced Palm distribution of X given u. In the stationary case, X^!_u is distributed as X^!_o translated by u, where o denotes the origin, and so we refer to E h(X^!_o) as the conditional expectation of h with respect to the further points in X given a typical point of X. Coeurjolly et al. (2015b) give an introduction to Palm distributions intended for a statistical audience.

A useful approach to understanding many models and methods is to consider a subdivision S = ∪_{i=1}^m C_i into m ≥ 1 cells C_i and approximate the spatial point process by a random field of presence-absence binary variables X_i = 1[X ∩ C_i ≠ ∅], where 1[A] denotes the indicator function of an event A. That
Thatis, X i = 1 if at least one point is present in the cell C i and X i = 0 otherwise.By the previously mentioned interpretation, the joint intensities determine thedistribution of the binary variables X i associated with a subdivision of S intoinfinitesimally small cells C i . It is also possible to reverse the binary random fieldpoint of view. A Poisson process (Section 3.1) for example, can be viewed as alimit of binary random fields of independent binary random variables associatedwith a sequence of subdivisions for which the cell sizes tend to zero. In Section 3.1 we give a very brief review of Poisson, cluster, Cox, and Gibbspoint process models which are covered extensively in the existing literature onspatial point processes, cf. Section 1.1. This is followed by more detailed reviewsof recent contributions to the modelling of spatial point processes.
Assume that ρ is an integrable non-negative function on S so that μ(B) = ∫_B ρ(u) du is finite for all bounded B ⊆ S. Then X is a Poisson process with intensity function ρ provided that N(B) is Poisson distributed with mean μ(B) for any bounded B ⊆ S and, conditionally on N(B) = n, the n points in X_B are iid on B with density function proportional to ρ. The Poisson process has further independence properties:

• For any disjoint subsets B_1, ..., B_k, k > 1, of S, X_{B_1}, ..., X_{B_k} are independent.

• A Poisson process can be viewed as a limit of binary random fields {X_i}_{i=1}^m associated with increasingly fine subdivisions of S into cells C_i of volumes |C_i|, where the X_i are independent with P(X_i = 1) = ρ(u_i)|C_i| for a representative point u_i ∈ C_i.

• For any n ≥ 1, ρ^(n)(u_1, ..., u_n) = ∏_{i=1}^n ρ(u_i).

By the latter property, g(u, v) = 1 for any pairwise distinct u, v ∈ S with ρ(u)ρ(v) >
0, which motivates defining g(u, u) = 1 on the diagonal for a Poisson process.

Most spatial point pattern data exhibit clustering or regularity that does not comply with the independence properties of a Poisson process. The Poisson process is nevertheless important as a reference process for no spatial interaction. Even more importantly, the Poisson process is the basis for obtaining more flexible models, either by specification of probability densities with respect to the Poisson process (Section 3.1.1) or by explicit constructions using Poisson processes as building blocks (Section 3.1.2).

The term
Gibbs point processes is in practice used as a broad term for finite point processes specified by a density and their extensions to infinite spatial point processes obtained by considering limits for so-called local specifications, see Møller & Waagepetersen (2004) and the references therein. For simplicity we just consider the finite case below and review various concepts needed later.

Denote by Z a finite Poisson process on S with intensity function ρ, i.e. μ(S) < ∞. The process Z is used as a reference process. If S is bounded, Z is often taken to be the unit rate Poisson process with ρ(·) = 1. Denote by N the set of locally finite point configurations of S. Suppose that f is a non-negative function on N satisfying E f(Z) = 1. A spatial point process X then has density f on N (is absolutely continuous) with respect to (the distribution of) Z provided

  E h(X) = E[h(Z) f(Z)]

for non-negative functions h on N. Here, by the definition of a Poisson process,

  E h(Z) = Σ_{n=0}^∞ (exp[−μ(S)] / n!) ∫_S ... ∫_S h({x_1, ..., x_n}) ρ(x_1) ... ρ(x_n) dx_1 ... dx_n.   (1)

In the following we consider several examples of point process densities.

If X is a finite Poisson process with intensity function α satisfying ρ(u) > 0 whenever α(u) >
0, then X has density

  f(x) = exp[ μ(S) − ∫_S α(u) du ] ∏_{u ∈ x} { α(u) / ρ(u) },   x ∈ N,   (2)

with respect to Z. If the integral involving α in (2) cannot be evaluated analytically, it is typically easy to approximate it by numerical integration, rendering maximum likelihood inference feasible for parametric models of Poisson processes.

To obtain models with interaction between points, densities are specified in terms of an interaction function φ(·) ≥ 0:

  f(x) = ∏_{y ⊆ x} φ(y).

Note that f(x) is proportional to h(x) = ∏_{y ⊆ x: y ≠ ∅} φ(y), so E f(Z) = 1 is equivalent to

  φ(∅) = 1 / E h(Z) = { E ∏_{y ⊆ Z: y ≠ ∅} φ(y) }^{−1}.

Thus, φ(∅) is the normalizing constant that upon multiplication turns h into a probability density. When specifying an interaction function it is important to check that the expectation above defining the normalizing constant is in fact finite. The expectation is typically not available in closed form and must be approximated e.g. by computationally intensive MCMC methods. Often pairwise-only interaction processes are considered, in which case φ(y) = 1 whenever the cardinality of y is greater than two. The Strauss process is the well-known example where, for distinct u, v ∈ S, φ({u}) = φ({v}) = β > 0 and φ({u, v}) = γ^{1[‖u − v‖ ≤ R]} for 0 < γ ≤ 1 and R >
0. Simulated realizations of the Strauss process for various parameter settings can be found in Møller (1999).

The joint intensities of a spatial point process with density f with respect to Z can be expressed as

  ρ^(n)(u_1, ..., u_n) = E f({u_1, ..., u_n} ∪ Z)

for pairwise distinct u_1, ..., u_n. The expectation on the right hand side is in general not analytically tractable. Fast numerical approximations of the first two joint intensities are developed in Baddeley & Nair (2012a,b).

Suppose f is hereditary, meaning that f(x) > 0 implies f(y) > 0 whenever y ⊆ x. Then the n-point conditional intensity is defined by

  λ(u_1, ..., u_n, x) = f({u_1, ..., u_n} ∪ x) / f(x)

for pairwise distinct u_1, ..., u_n and finite point configurations x with x ∩ {u_1, ..., u_n} = ∅. The conditional intensities e.g. play an important role in the definition of innovation measures for Gibbs point processes, see Section 4.1.

3.1.2 Cox and cluster processes

A large class of models for random spatial aggregation of points is obtained by replacing the deterministic intensity function in a Poisson process by a random function. More precisely, let Λ = {Λ(u)}_{u ∈ S} be a non-negative random field such that Λ is locally integrable almost surely (that is, for bounded B ⊆ S, ∫_B Λ(u) du exists and is finite with probability one). Then X is a Cox process provided that, conditional on Λ = λ, X is a Poisson process with intensity function λ. The joint intensities are simply given by

  ρ^(n)(u_1, ..., u_n) = E ∏_{i=1}^n Λ(u_i)

for n ≥ 1 and u_1, ..., u_n ∈ S. For a log Gaussian Cox process, i.e. when log Λ is a Gaussian process, ρ^(n) is just given by the Laplace transform of a multivariate normal distribution (Møller et al., 1998).

To obtain clustered point patterns, one way is to consider superpositions of a countable number of localized point processes.
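A minimal sketch of one such superposition is the well-known (modified) Thomas process, where offspring are scattered around Poisson parents (parameter values and names here are our own illustrative choices, and edge effects are ignored):

```python
import numpy as np

def thomas(kappa, mu, sigma, rng):
    """Modified Thomas process on [0,1]^2: Poisson(kappa) uniform parents,
    Poisson(mu) offspring per parent, each offspring displaced from its
    parent by independent N(0, sigma^2 I) noise. Edge effects are ignored."""
    parents = rng.uniform(size=(rng.poisson(kappa), 2))
    clusters = [p + sigma * rng.standard_normal((rng.poisson(mu), 2))
                for p in parents]
    return np.concatenate(clusters) if clusters else np.empty((0, 2))

rng = np.random.default_rng(7)
x = thomas(kappa=25.0, mu=10.0, sigma=0.02, rng=rng)
# The intensity of the superposition is kappa * mu (here 250), edge effects aside.
```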
Accordingly, a Poisson cluster process is a union X = ∪_{v ∈ Φ} X_v of typically finite spatial point processes X_v on S, indexed by a parent Poisson process Φ on S or perhaps some other space. For convenience, conditional on Φ, the X_v are often assumed to be finite Poisson processes, each with intensity function ρ_v, in which case X can also be viewed as a Cox process with random intensity function Λ(u) = Σ_{v ∈ Φ} ρ_v(u). Shot-noise Cox processes are a specific example of this, where Φ is a Poisson process on R^d × ]0, ∞[ and, for v = (c, γ) ∈ Φ, ρ_v(u) = γ k(c, u), where k(c, ·) is a probability density on S (Møller, 2003). Simulated realizations of log Gaussian and Poisson cluster processes can be found in Chapter 5 in Møller & Waagepetersen (2004).

At the expense of analytical tractability, Poisson parent processes can be replaced e.g. by regular parent processes (Van Lieshout & Baddeley, 2002; McKeague & Loizeaux, 2002; Møller & Torrisi, 2005), and the Poisson distribution for cluster size could be replaced by other integer valued distributions. Jalilian et al. (2013) introduce a flexible class of shot-noise Cox processes with kernel function k given by the density of normal-variance mixture distributions. For this class of models, the pair correlation function is given by g(u, v) = 1 + c(u, v), where c(u, v) is a covariance function given by a scaled normal-variance mixture density. This includes e.g. Matérn and Cauchy covariance functions. Most examples of Cox process models are second order intensity-reweighted stationary, with an isotropic pcf. Møller & Toftager (2014) study Cox process models with an elliptical pcf, including shot noise Cox processes and log Gaussian Cox processes.

The density of a finite Cox process with respect to a finite Poisson process Z is

  f(x) = E f(x | Λ),

where f(· | λ) is the conditional density of X given Λ = λ with respect to Z. The expectation can rarely be computed analytically.
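As an illustration of why Monte Carlo enters, consider a toy Cox process on S = [0,1]^2 whose random intensity is a constant Gamma variable, i.e. a mixed Poisson process (the mixing distribution is our own choice). Conditional on Λ ≡ g, the density with respect to the unit rate Poisson process is f(x | g) = exp(1 − g) g^{n(x)}, and f(x) = E f(x | Λ) can be approximated by averaging over draws of Λ:

```python
import numpy as np

rng = np.random.default_rng(3)

def cox_density(n_points, n_mc=100_000):
    """Monte Carlo approximation of f(x) = E f(x | Lambda) for a toy Cox
    process on S = [0,1]^2 with constant random intensity G ~ Gamma(2, 1):
    f(x | g) = exp(1 - g) * g**n(x) w.r.t. the unit rate Poisson process."""
    g = rng.gamma(shape=2.0, scale=1.0, size=n_mc)
    return float(np.mean(np.exp(1.0 - g) * g**n_points))

f3 = cox_density(3)
# In this toy case the expectation is also available in closed form,
# E[exp(1 - G) G^n] = e * (n + 1)! / 2**(n + 2), against which the
# Monte Carlo estimate can be checked.
```

For realistic Cox models such as the log Gaussian Cox process, no such closed form exists, which is what makes Monte Carlo approximation of the likelihood unavoidable.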
Likelihood based inference for Cox processes therefore typically requires numerical (including Monte Carlo) approximations of the likelihood function.

Macchi (1975) introduced two interesting models motivated by fermions and bosons in quantum mechanics, viz. determinantal and permanental point processes.
3.2.1 Determinantal point processes

Determinantal point processes (DPPs) are of interest because of their applications in mathematical physics, combinatorics, random-matrix theory, machine learning, and spatial statistics (see Lavancier et al., 2015, and the references therein). For DPPs on R^k, rather flexible parametric models can be constructed, and likelihood and moment based inference procedures apply (Lavancier et al., 2014, 2015). DPPs on the sphere are discussed from a statistical perspective in Møller et al. (2015a) and Møller & Rubak (2016).

A DPP is defined by a function C : S × S → ℂ as follows: X is a DPP with kernel C if for any n = 1, 2, . . ., X has a non-negative nth order joint intensity function given by

ρ^(n)(u_1, . . . , u_n) = det(C(u_i, u_j))_{i,j=1,...,n} for all u_1, . . . , u_n ∈ S, (3)

where det(C(u_i, u_j))_{i,j=1,...,n} is the determinant of the n × n matrix with (i, j)th entry C(u_i, u_j). Then we write X ∼ DPP(C).

Before discussing the existence of this process, several points are in order when we assume X ∼ DPP(C) exists (in fact it is then unique).

(a) The kernel can be complex, since this becomes convenient when considering simulation of DPPs as discussed below. In statistical models, it is usually real.

(b) A Poisson process on S with intensity function ρ is the special case of a DPP where C(u, u) = ρ(u) and C(u, v) = 0 for u ≠ v.

(c) The intensity function is ρ(u) = C(u, u), and the pcf is g(u, v) = 1 − |C(u, v)|²/{C(u, u) C(v, v)}. Thus g(u, u) = 0.

(d) By (3), an independent π-thinning of X ∼ DPP(C) results in X_th ∼ DPP(C_th), where C_th(u, v) = √(π(u) π(v)) C(u, v). Thus, if C(u, v) = C(u − v) is stationary, X_th is second-order intensity-reweighted stationary (on the sphere this holds provided C(u, v) = C(d(u, v)) is isotropic).

(e) In particular, for any B ⊆ S, X_B is a DPP on B, with the restriction of C to B × B as its kernel.
This is in contrast to Gibbs point processes. For example, the restriction of a Strauss process to a smaller region is not a Strauss process, and its density is intractable even up to a constant of proportionality.

(f) A smooth one-to-one transformation of X results in a new DPP, cf. Lavancier et al. (2014).

(g) By (3) and since ρ^(n) is non-negative, C has to be positive semi-definite. In fact, in most work on DPPs, the kernel is assumed to be Hermitian, i.e. C is a complex covariance function. In the sequel, we make this assumption.

(h) By (c) this implies that g ≤
1, which usually is a strict inequality, i.e. there is 'regularity at any scale'. In fact, an even stronger result holds: for all u_1, . . . , u_n ∈ S we have ρ^(n)(u_1, . . . , u_n) ≤ ρ(u_1) · · · ρ(u_n), with equality if and only if X is a Poisson process with intensity function ρ.

The existence of X ∼ DPP(C) relies on a spectral representation of the kernel: Consider C restricted to any compact set B ⊆ S. Under mild conditions (e.g. continuity of C is sufficient, cf. Lavancier et al., 2014, 2015), for any u, v ∈ B,

C(u, v) = Σ_{i=1}^∞ λ_i φ_i(u) φ_i(v)*, (4)

where z* denotes the complex conjugate of z ∈ ℂ, λ_i is the eigenvalue corresponding to the eigenfunction φ_i, and ∫_B φ_i(u) φ_j(u)* du = 1[i = j]. Then existence is equivalent to all λ_i ≤
1, cf. Lavancier et al. (2014, 2015) and the references therein. If S = R^k and C(u, v) = C(u − v) is stationary, then under mild conditions existence is equivalent to the spectral density of C being at most 1 (Lavancier et al., 2014, 2015). If S = S^d and X is isotropic, the eigenfunctions are given by spherical harmonics, see Møller et al. (2015a).

Lavancier et al. (2014, 2015) conclude that DPPs offer relatively flexible models for repulsiveness, although less flexible than Gibbs point processes; e.g. DPPs cannot be as repulsive as Gibbs hard-core point processes can be. This is due to the bound on the spectrum of C needed for the existence of X ∼ DPP(C): the bound implies a trade-off between a large intensity and a strong degree of interaction.

The spectral decomposition allows a closed form expression for the likelihood when X is observed within a compact region, but in practice the likelihood needs to be approximated as discussed later in Section 4.4. Furthermore, N(B) is distributed as Σ_{i=1}^∞ N_i, where N_1, N_2, . . . are independent Bernoulli variables with parameters λ_1, λ_2, . . . . Simulation of N_1, N_2, . . . is easy, and conditional on N_1, N_2, . . ., the events in X_B can be generated one by one using a fast exact simulation procedure based on the fact that X_B has a density which is proportional to the (Σ_{i=1}^∞ N_i)th order intensity function for a DPP with kernel Σ_{i=1}^∞ N_i φ_i(u) φ_i(v)*. See Hough et al. (2006) and Lavancier et al. (2014, 2015).

To summarize, in comparison to Gibbs point processes, DPPs possess many appealing properties: DPPs can be easily simulated; their moments and the likelihood are tractable; a DPP restricted to a smaller region is again a DPP; an independent thinning of a DPP or a smooth one-to-one transformation of a DPP is again a DPP; and the distribution of the number of events within a compact region is known.
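The determinant formula (3) and the Bernoulli representation of N(B) are easy to check numerically. The following sketch uses an illustrative truncated Fourier kernel on B = [0, 1] (the frequencies, eigenvalues and window are assumptions made for this example, not a model from the text); it verifies property (h) and that E N(B) equals the sum of the eigenvalues:

```python
import numpy as np

# Illustrative truncated kernel on B = [0, 1] (an assumption for this sketch):
# C(u, v) = sum_j lam_j * phi_j(u) * conj(phi_j(v)), with the Fourier basis
# phi_j(u) = exp(2*pi*1j*j*u) and eigenvalues lam_j < 1 (needed for existence).
freqs = np.arange(-10, 11)
lam = 0.8 * np.exp(-0.1 * freqs**2)

def C(u, v):
    """Kernel of the DPP; Hermitian since lam is real."""
    return np.sum(lam * np.exp(2j * np.pi * freqs * (u - v)))

def rho_n(points):
    """n-th order joint intensity: the determinant in (3)."""
    M = np.array([[C(u, v) for v in points] for u in points])
    return np.linalg.det(M).real

# (c) and (h): rho(u) = C(u, u), and rho^(2)(u, v) <= rho(u) * rho(v).
assert rho_n([0.2, 0.25]) <= rho_n([0.2]) * rho_n([0.25])

# N(B) is a sum of independent Bernoulli(lam_i) variables, so E N(B) = sum_i lam_i.
rng = np.random.default_rng(0)
counts = (rng.random((100_000, lam.size)) < lam).sum(axis=1)
print(counts.mean())  # Monte Carlo estimate of E N(B), close to lam.sum()
```

The same eigendecomposition is the starting point of the exact simulation procedure mentioned above.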
3.2.2 Permanental point processes and extensions

Given a complex covariance function C : S × S → ℂ, X is a permanental point process (PPP) with kernel C if for any n = 1, 2, . . ., X has a non-negative nth order intensity function as in (3) but with the determinant replaced by the permanent

per(C(u_i, u_j))_{i,j=1,...,n} = Σ_{(σ_1,...,σ_n)} Π_{i=1}^n C(u_i, u_{σ_i}),

where the sum is over all permutations (σ_1, . . . , σ_n) of 1, . . . , n. In fact, X is then a Cox process driven by |Z|², where Z is a zero-mean complex Gaussian process with covariance function C. The intensity function is ρ(u) = C(u, u) and the pcf is g(u, v) = 1 + |C(u, v)|²/{C(u, u) C(v, v)}. This implies that g ≥ 1. Even when X is restricted to a compact set, PPPs have not yet been used much in applications, probably because computing permanents is computationally infeasible even for relatively small matrices. Since 1 ≤ g ≤
2, PPPs may be a rather restrictive class of models for aggregation. Nevertheless, we think PPPs deserve to be investigated more because of their attractive moment properties.

Determinantal and permanental point processes can be extended by replacing the determinant or permanent with a weighted determinant or permanent, see McCullagh & Møller (2006). This extension includes the case of a Cox process driven by Σ_{i=1}^k Y_i², where Y_1, Y_2, . . . are independent zero-mean real Gaussian processes with a common real covariance function C.

In the classical spatial point process literature, spatial point processes are often classified into three main cases, viz. complete spatial randomness (i.e. the Poisson process), regularity (e.g. DPPs and most Gibbs point processes), and aggregation (e.g. Cox processes). This can be too simplistic, and often we need a model with aggregation on the large scale and regularity on the small scale.

Gibbs point processes with an inhomogeneous first-order potential and inhibitive pairwise interactions are models for deterministic aggregation combined with inhibition at the small scale. Alternatively, in order to introduce inhomogeneity, homogeneous regular Gibbs processes can be subjected to location dependent thinning, smooth transformation, or local scaling (Baddeley et al., 2000; Jensen & Nielsen, 2000; Hahn et al., 2003). Another possibility is to consider a homogeneous Gibbs point process with a well-chosen higher-order potential that incorporates inhibition at small scales and attraction at large scales. A famous example is the Lennard-Jones pair-potential (Ruelle, 1969). Another specific potential of this type can be found in Goldstein et al. (2015). Disadvantages of Gibbs point process models, or models derived from those, are that densities and joint intensities are intractable and that simulation requires elaborate MCMC methods, cf.
Section 3.1.1.

In the following we focus on two modeling strategies to obtain random aggregation as well as random local regularity. In Andersen & Hahn (2015) the starting point is an (aggregated) Cox process which is subjected to dependent thinning to create local regularity. Conversely, Lavancier & Møller (2016) begin with a (regular) determinantal point process and use randomly spatially varying thinning to obtain aggregation.

Specifically, Andersen & Hahn (2015) apply Matérn type II dependent thinning to shot-noise Cox processes (Section 3.1.2). For a given point pattern and a specified distance h, Matérn II thinning acts by first attaching random positive marks ('arrival times') to each point. Subsequently a point is removed if it has a neighbour within distance h and with a smaller mark (i.e. the neighbour arrived earlier). For a given location u, Andersen & Hahn (2015) define the retention probability at u as the ratio between the intensities of the thinned process and the original process at u. They derive expressions for the retention probabilities in terms of integrals that, however, must be evaluated using numerical integration. Simple approximate formulae for the intensity function and the second-order joint intensity of the thinned process are also provided and used for fitting of parametric models to data sets of cell centres in microscopy images of bone marrow tissue.

On the other hand, Lavancier & Møller (2016) consider a spatial point process Y on S and a random field Π = {Π(x) : x ∈ S} of retention probabilities independent of Y. Conditional on Π, X is then an independent thinning of Y with retention probabilities 0 ≤ Π(u) ≤ 1, u ∈ S. In Stoyan (1979) and Chiu et al. (2013), X is called an interrupted point process (which is a good terminology when each Π(x) is either 0 or 1).
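As a concrete illustration of the Matérn II mechanism just described, here is a minimal sketch (not Andersen & Hahn's implementation; the input pattern, window and distance h are arbitrary choices made for this example). A by-product of the construction is that the retained pattern is hard-core at distance h, which the last lines check:

```python
import numpy as np

def matern_II_thin(points, h, rng):
    """Matérn type II dependent thinning: attach i.i.d. uniform 'arrival time'
    marks and delete every point having a neighbour within distance h with a
    smaller mark (i.e. a neighbour that arrived earlier)."""
    marks = rng.random(len(points))
    keep = np.empty(len(points), dtype=bool)
    for i, p in enumerate(points):
        d = np.linalg.norm(points - p, axis=1)
        neighbours = (d <= h) & (d > 0)
        keep[i] = not np.any(marks[neighbours] < marks[i])
    return points[keep]

# Thin an arbitrary binomial pattern on the unit square (illustrative input).
rng = np.random.default_rng(1)
X = rng.random((500, 2))
Xth = matern_II_thin(X, h=0.05, rng=rng)

# The thinned pattern is hard-core: all interpoint distances exceed h.
D = np.linalg.norm(Xth[:, None] - Xth[None, :], axis=-1)
np.fill_diagonal(D, np.inf)
assert D.min() > 0.05
print(len(X), len(Xth))  # the thinning reduces the number of points
```

The hard-core property holds because, for any R-close pair, the later-arriving point is always deleted.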
Simple relations between the intensity and pair correlation functions of Y and X are then available:

ρ_X(u) = q(u) ρ_Y(u), q(u) = E Π(u), u ∈ S, (5)

and (setting 0/0 = 0)

g_X(u, v) = M(u, v) g_Y(u, v), M(u, v) = E[Π(u) Π(v)] / {E[Π(u)] E[Π(v)]}, u, v ∈ S. (6)

For example, if g_Y ≤ 1 and M > 1, we can obtain a g_X which is smaller/larger than 1 on the small/large scale. In the cases where Y is a determinantal point process or a Matérn hard core model of type I or II (Matérn, 1986) and Π is based on a transformed Gaussian process or the characteristic function of a Boolean model of balls, Lavancier & Møller (2016) derive explicit formulae for ρ_X and g_X. For such models, simulation of (Y, Π, X) restricted to a bounded region is furthermore straightforward. Conditional simulation of Π given X, Y or both is more complicated and discussed in Lavancier & Møller (2016).

3.4 Latent geometric structures

Spatial point processes in the neighbourhood of a reference structure are frequently observed. Often the reference structure has a kind of linear structure. A diverse set of examples at very different scales are gold coins near Roman roads (pp. 226–229 in Hodder & Orton, 2013), copper deposits in the neighbourhood of lineaments (Berman, 1986; Baddeley & Turner, 2006; Illian et al., 2008), galaxies at the boundary of cosmic voids (Icke & Van de Weygaert, 1987; Van de Weygaert & Icke, 1989; Van de Weygaert, 1994), pores at the boundary of grains (Karlsson & Liljeborg, 1994; Skare et al., 2007), animal latrines near territorial boundaries (Blackwell, 2001; Blackwell & Møller, 2003; Skare et al., 2007), linear rows of mines (Walsch & Raftery, 2002), specific tree species along rivers in a rain forest (Valencia et al., 2004), and brain cells with a columnar structure (cf. Figure 2). In many cases the reference structure is not easy to recognize.
The objective of the statistical analysis of point patterns of this type may either be to describe the distribution of the point pattern, or to reconstruct the reference structure, or perhaps both.

For example, as in Blackwell (2001), Blackwell & Møller (2003), and Skare et al. (2007), the unknown reference structure may be modelled by a Voronoi tessellation, and the points of an unknown point process on the edges of this tessellation are randomly disturbed to produce an observed point pattern with points around the edges. Such complex models are analyzed in a Bayesian MCMC setting with priors on the spatial point process of nuclei generating the Voronoi tessellation, the point process on the edges, and the model parameters. Typically, for tractability, the spatial point process priors are Poisson processes, in which case the likelihood for the observed point pattern is described by a Cox process where the driving intensity is specified by the nuclei and the model parameters. The posterior then just concerns the Voronoi tessellation and the model parameters, and a major element of the MCMC algorithm is the reconstruction of the Voronoi tessellation after a proposed local change of the tessellation.

The construction above is in Møller & Rasmussen (2015) extended to construct models for cluster point processes within territories modelled by the Voronoi cells, where conditional on the territories/cells, the clusters are independent Poisson processes whose points may be aggregated around or away from the nuclei and along or away from the boundaries of the cells.

To model a columnar structure for the pyramidal cell data set in Figure 2, a hierarchical construction starting with a Poisson line process L is used in Møller et al. (2015b): Consider on each line l_i of L a Poisson process Y_i, and let X_i be obtained by random displacements of the points in Y_i. Then X is the superposition of all the X_i. For this model Møller et al.
(2015b) discuss moment results and simulation procedures based on the fact that X becomes a Cox process which is closely related to a shot-noise Cox process where the centre process is replaced by L. Simulation of X within a bounded observation window needs to take care of edge effects, and conditional simulation of its random intensity (when X is viewed as a Cox process) requires MCMC techniques; for details, see Møller et al. (2015b). In the case where all the lines are parallel and their direction is known (this is a reasonable assumption for the pyramidal brain cell data set), the pcf is tractable and useful for inference.

In this section we assume that X is observed within a bounded set W ⊆ S and that a parametric model has been specified for the distribution of X, or alternatively just for the joint intensities ρ^(n) for some n ≥ 1 (typically n = 1, 2). The parameter is denoted θ and assumed to be real of dimension p, and we write ρ^(n)(·; θ), λ^(n)(·; θ) etc. to stress the dependence on θ.

In the case of a Gibbs process where the likelihood is specified, maximum likelihood inference and Bayesian inference are in general difficult due to the complicated normalizing constant. The Poisson process is one exception, where the modest computational challenge is the evaluation of an integral over the observation window W, cf. (2). For Cox and cluster processes, the likelihood function is also very complicated, involving expectations with respect to the unobserved random intensity function or the unobserved parent points. Approximate maximum likelihood or Bayesian inference may be implemented using Monte Carlo methods or Laplace approximations (see e.g. Møller & Waagepetersen, 2004, 2007, for reviews).
Some comments on maximum likelihood inference for determinantal point processes are given in Section 4.4, while Section 4.5 discusses some points regarding Bayesian inference.

Due to the computational obstacles related to likelihood based inference, much interest has focused on establishing computationally easier approaches for spatial point processes with specified conditional intensity in the case of Gibbs processes and specified intensity and pair correlation functions in the case of Cox and cluster processes. This includes Takacs-Fiksel and pseudolikelihood estimation for Gibbs point processes (e.g. Besag, 1977; Fiksel, 1984; Takacs, 1986; Jensen & Møller, 1991; Diggle et al., 1994; Billiot, 1997; Baddeley & Turner, 2000; Billiot et al., 2008; Coeurjolly et al., 2012) and various types of minimum contrast, composite likelihood, Palm likelihood and estimating functions for Cox and cluster processes and DPPs (e.g. Schoenberg, 2005; Guan, 2006; Waagepetersen, 2007; Tanaka et al., 2008; Waagepetersen & Guan, 2009; Dvořák & Prokešová, 2012; Prokešová & Jensen, 2013; Prokešová et al., 2014; Guan et al., 2015; Zhuang, 2015; Lavancier et al., 2014, 2015). As discussed in Sections 4.1-4.3, all of these methods belong to a common framework of estimating functions based on signed innovation measures (Baddeley et al., 2005; Waagepetersen, 2005; Zhuang, 2006).

4.1 Innovation measures and estimating functions

For a spatial point process with nth order joint intensity ρ^(n), we define the nth order innovation measure as

I^(n)(B_1 × · · · × B_n) = Σ^{≠}_{u_1,...,u_n ∈ X} 1[u_1 ∈ B_1, . . . , u_n ∈ B_n] − ∫_{B_1×···×B_n} ρ^(n)(u_1, . . . , u_n) du_1 · · · du_n

for B_i ⊆ S, i = 1, . . . , n, where Σ^{≠} means that the sum is over pairwise distinct u_1, . . . , u_n. By definition of the factorial moment measure, I^(n)(B_1 × · · · × B_n) has expectation zero.
For Gibbs point processes the factorial moment measure is not known in closed form, and it is more convenient to define conditional innovation measures in terms of the nth order conditional intensity: for F a set of locally finite point configurations and B_i as above,

I^(n)(B_1 × · · · × B_n × F | X) = Σ^{≠}_{u_1,...,u_n ∈ X} 1[u_1 ∈ B_1, . . . , u_n ∈ B_n, X \ {u_1, . . . , u_n} ∈ F] − ∫_{B_1×···×B_n} 1[X ∈ F] λ^(n)(u_1, . . . , u_n, X) du_1 · · · du_n,

which has expectation zero by the Georgii-Nguyen-Zessin formula (Georgii, 1976; Nguyen & Zessin, 1979).

Introducing the dependence on θ as well as p-dimensional weight functions h(·; θ) on S^n or on S^n × N, we then obtain classes of nth order unbiased estimating functions

e^(n)_h(θ) = ∫_{W^n} h(u_1, . . . , u_n; θ) I^(n)(du_1 · · · du_n; θ) = Σ^{≠}_{u_1,...,u_n ∈ X_W} h(u_1, . . . , u_n; θ) − ∫_{W^n} h(u_1, . . . , u_n; θ) ρ^(n)(u_1, . . . , u_n; θ) du_1 · · · du_n (7)

or

e^{tf,(n)}_h(θ) = ∫_{W^n × N} h(u_1, . . . , u_n, z; θ) I^(n)(du_1 · · · du_n dz | X; θ) = Σ^{≠}_{u_1,...,u_n ∈ X_W} h(u_1, . . . , u_n, X \ {u_1, . . . , u_n}; θ) − ∫_{W^n} h(u_1, . . . , u_n, X; θ) λ^(n)(u_1, . . . , u_n, X; θ) du_1 · · · du_n. (8)

Here unbiasedness means that E e^(n)_h(θ) = E e^{tf,(n)}_h(θ) = 0, with the expectations calculated under θ. We will use the term Takacs-Fiksel estimating function for an estimating function of the type (8). For Takacs-Fiksel estimating functions we may account for edge effects by letting W be an eroded observation window, see e.g. Møller & Waagepetersen (2004).

In practice estimating functions of the form (7) are mainly considered in the first (n = 1) or second order (n = 2) cases and can be motivated by composite likelihood arguments.
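As a small numerical illustration of (7) with n = 1, the sketch below solves the estimating equation e^(1)_h(θ) = 0 with h = d log ρ/dθ (the Poisson score) by Newton's method, for an assumed log-linear intensity ρ(u; θ) = exp(θ_0 + θ_1 u) on W = [0, 1]; the model, parameter values and quadrature rule are illustrative choices, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed log-linear intensity rho(u; theta) = exp(theta0 + theta1*u) on W = [0, 1].
theta_true = np.array([5.0, 1.0])

# Simulate the inhomogeneous Poisson pattern by thinning a homogeneous one.
rho_max = np.exp(theta_true.sum())                  # max of rho on W (theta1 > 0)
cand = rng.random(rng.poisson(rho_max))
x = cand[rng.random(cand.size) < np.exp(theta_true[0] + theta_true[1] * cand) / rho_max]

# e^(1)_h(theta) = sum_{u in x} h(u) - int_W h(u; theta) rho(u; theta) du, with
# h(u; theta) = d log rho / d theta = (1, u): the Poisson score.
M = 2000
grid = (np.arange(M) + 0.5) / M                     # midpoint quadrature nodes on W
H = np.vstack([np.ones(M), grid])                   # h evaluated on the grid

def score(theta):
    rho = np.exp(theta[0] + theta[1] * grid)
    return np.array([x.size, x.sum()]) - (H * rho).sum(axis=1) / M

def fisher(theta):                                  # minus the Jacobian of the score
    rho = np.exp(theta[0] + theta[1] * grid)
    return (H[:, None, :] * H[None, :, :] * rho).sum(axis=2) / M

theta = np.array([np.log(x.size), 0.0])             # start at the homogeneous fit
for _ in range(25):                                 # Newton iterations for score = 0
    theta = theta + np.linalg.solve(fisher(theta), score(theta))

print(theta)  # the fitted (theta0, theta1)
```

For a genuinely Poisson model this is maximum likelihood; for a non-Poisson model with the same intensity function, the very same equation remains unbiased, which is the point of the composite likelihood interpretation below.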
The first notion of composite likelihood for spatial point processes is due to Guan (2006), who defined a bivariate probability density f(u, v; θ) ∝ ρ^(2)(u, v; θ), u, v ∈ W, and considered the log composite likelihood Σ^{≠}_{u,v ∈ X_W} log f(u, v; θ). Below we discuss and compare other approaches to composite likelihood for spatial point processes based on the infinitesimal interpretation of joint intensity functions.

Letting h(u; θ) = d log ρ(u; θ)/dθ in (7) in the case n = 1, the score of the Poisson log likelihood is recovered. If the underlying spatial point process is not Poisson, the estimating function can be interpreted as a composite likelihood score for the binary indicators X_i of presence of points in cells of an infinitesimal partition of the observation window mentioned in the end of Section 2 (Schoenberg, 2005; Waagepetersen, 2007; Guan & Loh, 2007; Møller & Waagepetersen, 2007). Likewise, in case n = 2 with h(u, v; θ) = 1[‖u − v‖ ≤ R] d log ρ^(2)(u, v; θ)/dθ for some tuning parameter R >
0, the score of a second order composite likelihood

Σ^{≠}_{u,v ∈ X_W} 1[‖u − v‖ ≤ R] (d/dθ) log ρ^(2)(u, v; θ) − ∫_{W²} 1[‖u − v‖ ≤ R] (d/dθ) ρ^(2)(u, v; θ) du dv (9)

is obtained, this time for binary indicators X_i X_j of simultaneous occurrence of points in R-close pairs of distinct cells C_i and C_j of the aforementioned partition (Waagepetersen, 2007; Møller & Waagepetersen, 2007). Returning to Guan (2006)'s composite likelihood, the associated score is

Σ^{≠}_{u,v ∈ X_W} 1[‖u − v‖ ≤ R] (d/dθ) log ρ^(2)(u, v; θ) − Σ^{≠}_{u,v ∈ X_W} [∫_{W²} 1[‖u − v‖ ≤ R] (d/dθ) ρ^(2)(u, v; θ) du dv] / [∫_{W²} 1[‖u − v‖ ≤ R] ρ^(2)(u, v; θ) du dv]. (10)

The estimating functions (9) and (10) are closely related since their first terms agree. Moreover, the expectation of the last term in (10) coincides with the last term in (9), so that (10) is also unbiased.

The so-called Palm likelihood (Tanaka et al., 2008) is based on composite likelihood arguments too. For a fixed point u ∈ R^d, the reduced Palm spatial point process X^!_u has intensity function ρ(v | u; θ) = ρ^(2)(u, v; θ)/ρ(u; θ). Assuming that X is stationary with known constant intensity, the unbiased score (Prokešová & Jensen, 2013) of the Palm likelihood is

Σ_{u ∈ X_W} Σ_{v ∈ X \ u} (d/dθ) log ρ(v | u; θ) 1[‖v − u‖ ≤ R] − N(X_W) ∫_{‖v‖ ≤ R} (d/dθ) ρ(v | o; θ) dv. (11)

Obviously, considering the first term in (11), the Palm likelihood is closely related to the two other types of second order composite likelihoods.

Asymptotic results for the various types of composite likelihoods are provided in Guan (2006), Waagepetersen (2007), Guan & Loh (2007), Waagepetersen & Guan (2009), and Prokešová & Jensen (2013). It is not known which type of composite likelihood is most efficient.
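To make (9) concrete, consider the simplest possible case, a homogeneous Poisson model with ρ^(2)(u, v; θ) = exp(2θ), θ = log ρ, on the unit square (an illustrative choice, not an example from the text). Then h(u, v; θ) = 2·1[‖u − v‖ ≤ R], and setting (9) to zero gives the closed form exp(2θ̂) = P/A_R, where P is the number of ordered R-close pairs of distinct points and A_R = ∫_{W²} 1[‖u − v‖ ≤ R] du dv:

```python
import numpy as np

rng = np.random.default_rng(3)
rho_true, R = 200.0, 0.1

# Homogeneous Poisson pattern on the unit square (illustrative data).
x = rng.random((rng.poisson(rho_true), 2))

# P = number of ordered pairs of distinct points at distance <= R.
D = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
P = int(np.sum((D <= R) & (D > 0)))

# A_R = int_{W^2} 1[||u - v|| <= R] du dv for the unit square (exact for R <= 1,
# from the distribution of the distance between two uniform points in a square).
A_R = np.pi * R**2 - 8 * R**3 / 3 + R**4 / 2

# Closed-form root of the second order composite likelihood score (9).
rho_hat = np.sqrt(P / A_R)
print(rho_hat)  # recovers rho_true up to pair-count noise
```

For non-trivial models of ρ^(2) no closed form is available and (9) is solved numerically, but the structure (observed pair statistic minus its model expectation) is the same.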
In any case, none of them are optimal, see Section 4.2.2.

4.2.2 Quasi-likelihood

While the estimating functions described in the previous section have a nice motivation in terms of composite likelihoods, they are in general not optimal. Optimality can be achieved following the route of quasi-likelihood.

For an m-dimensional random vector Y with mean vector µ depending on a p-dimensional parameter θ, a class of unbiased estimating functions is given by A^T(Y − µ) for m × p matrices A. This can be viewed as a matrix-vector analogue of the estimating function (7) in case n = 1, with Y − µ and A corresponding to the innovation process I^(n) and the weight function h, respectively. The optimal choice of A is the solution of V A = D, where D is the m × p matrix of partial derivatives dµ_i/dθ_j and V is the covariance matrix of Y. This choice of A yields the so-called quasi-likelihood score (e.g. Heyde, 1997).

Guan et al. (2015) generalize the concept of quasi-likelihood to the case of spatial point processes. Considering the class of first order estimating functions (7), they identify the optimal weight function h as the solution of a Fredholm integral equation

h + T h = (d/dθ) log ρ(·; θ)

where T is a certain integral operator depending on the pair correlation function of the spatial point process. They further establish asymptotic properties of the spatial point process quasi-likelihood and also provide an efficient numerical implementation of the method.

Guan et al. (2016) take the quasi-likelihood idea further and consider a second order quasi-likelihood, giving the optimal choices of h_1 and h_2 for the linear combination

e^{(1)+(2)}_{h_1,h_2}(θ) = e^(1)_{h_1}(θ) + e^(2)_{h_2}(θ).

The optimal h_1 and h_2 now depend also on third and fourth order joint densities. However, in the stationary case, Guan et al.
(2016) show that restricting attention to constant functions h_1 and functions h_2 with h_2(u, v) only depending on v − u does not lead to a loss of efficiency asymptotically and moreover simplifies computations greatly.

Given a set of estimating functions e_1(θ), . . . , e_m(θ), a related topic is to obtain a combined estimating function as an optimal linear combination e(θ) = Σ_{i=1}^m A_i e_i(θ) for p × p matrices A_i. The weight matrices A_i can be determined according to quasi-likelihood arguments, and this approach is considered in Deng et al. (2015) for estimation of parameters in stationary point processes. Similarly, Lavancier & Rochet (2016) consider how to obtain an estimate as an optimal linear combination θ̂ = Σ_{i=1}^m w_i θ̂_i given a collection of estimates θ̂_1, . . . , θ̂_m of the same parameter θ.

4.2.3 Case-control methods

Consider (7) with n = 1 and h(u) = d log ρ(u; θ)/dθ. Suppose that in addition to X another spatial point process Z of intensity α is available, where X and Z are almost surely disjoint. Then

Σ_{u ∈ X ∪ Z} [dρ(u; θ)/dθ] / {ρ(u; θ) + α(u)} (12)

is an unbiased estimate of the last integral in (7). The estimating function obtained by replacing the integral by the unbiased estimate (12) is the derivative of

Σ_{u ∈ X} log [ρ(u; θ)/{ρ(u; θ) + α(u)}] + Σ_{u ∈ Z} log [α(u)/{ρ(u; θ) + α(u)}] (13)

which is the limit of log conditional likelihoods based on the conditional distribution of binary indicators X_i given X_i + Z_i = 1, where the Z_i are presence/absence indicators for Z. Thus, in epidemiological terminology, X and Z play the role of case and control processes.

Diggle & Rowlingson (1994) introduced (13) in the case where Z is a spatial point process representing a background population and X is a spatial point process of disease cases with intensity function ρ(u; θ) = f(u; θ) α(u).
In this setup α is assumed to be unknown but cancels out in both terms of (13). Waagepetersen (2008) considers the case where Z is a user generated dummy point process generated with the sole purpose of approximating the integral. Guan et al. (2008) generalize (13) to the case of second order joint intensities. Diggle et al. (2010), Huang et al. (2014), and Chang et al. (2015) further exploit and expand the case-control methodology for spatial point processes in epidemiological applications with multiple sources of data.

4.3 Pseudo-likelihood

Estimating functions of the type (8) have so far mainly been considered when n = 1, in which case the by far most popular weight function is h(u, x) = d log λ(u, x; θ)/dθ, leading to the pseudo-likelihood score (e.g. Besag, 1977; Jensen & Møller, 1991; Baddeley & Turner, 2000; Billiot et al., 2008). A review of other choices of weight functions is provided in Coeurjolly et al. (2012), who also introduce weight functions designed to lead to computationally quick Takacs-Fiksel estimating functions in case of Strauss and quermass point processes (Kendall et al., 1999). In this section we focus on the pseudo-likelihood approach.

As for the Poisson likelihood, the computational issue with the pseudo-likelihood score is the evaluation of the integral in (8). Baddeley & Turner (2000) suggested a computationally efficient numerical quadrature approximation based on the Berman & Turner (1992) device. Formally, the approximated pseudo-likelihood takes the form of a Poisson regression, so that it can be easily implemented using standard statistical software for generalized linear models. One problem with the Berman-Turner approach is that the approximate pseudo-likelihood score is not unbiased, which can lead to a strongly biased estimate unless a large number of quadrature points is used.

As an alternative to the Berman-Turner device, Baddeley et al.
(2014b) proposed an unbiased approximation of the pseudo-likelihood score following the case-control approach in Section 4.2.3 with ρ(u; θ) replaced by λ(u, X) in (13). The approximated unbiased score is

Σ_{u ∈ X} [d log λ(u, X; θ)/dθ] α(u)/{λ(u, X; θ) + α(u)} − Σ_{u ∈ Z} [d log λ(u, X; θ)/dθ] λ(u, X; θ)/{λ(u, X; θ) + α(u)} (14)

where Z is a user generated spatial point process of random quadrature points independent of X and with intensity function α. In the common case where λ(u, X; θ) is of log-linear form, (14) is formally equivalent to the score of a logistic regression with probabilities of the form λ(u, X; θ)/{λ(u, X; θ) + α(u)}, where −log α(u) plays the role of a known offset. Thus also (14) is very easily implemented using standard software for generalized linear models and is available in spatstat. Baddeley et al. (2014b) establish asymptotic consistency and normality of estimates obtained using (14). The use of random quadrature points Z adds additional estimation error compared with the exact pseudo-likelihood. However, in terms of mean square error, (14) outperforms the Berman-Turner implementation of the pseudo-likelihood due to less bias.

4.4 Maximum likelihood inference for determinantal point processes

We now discuss maximum likelihood inference for a determinantal point process X ∼ DPP(C) restricted to a compact space S and based on a realization x = {x_1, . . . , x_n} ⊂ S. We can make this assumption without loss of generality, cf. (e) in Section 3.2.1. We assume that the eigenvalues of C as given in (4) satisfy λ_i < 1, i = 1, 2, . . . . Then X has a density

f(x) = exp(|S| − D) det(C̃(x_i, x_j))_{i,j=1,...,n}

with respect to the unit rate Poisson process on S, where

D = −log P(X = ∅) = −Σ_{i=1}^∞ log(1 − λ_i)

and C̃ is of the same form as C in (4) but with λ_i replaced by λ̃_i = λ_i/(1 − λ_i). See Lavancier et al.
(2014) and the references therein.

In general, when dealing with DPPs on R^k, the eigenfunctions in (4) are unknown and (4) needs to be replaced by an approximation. When S is rectangular and the (unrestricted) kernel is stationary, i.e., C(x, y) = C(x − y), Lavancier et al. (2014, 2015) discussed an approximation based on the Fourier basis (used as eigenfunctions), the Fourier transform of C on R^k, a Fourier series expansion of C on S, and a periodic extension of C outside S. This leads to infinite series approximations C̃_app and D_app of C̃ and D, respectively. In practice, a truncation of the infinite series is needed, leading to C̃_{app,N} and D_{app,N}, say, where N relates to the number of terms in the truncation and is increased until the approximate MLE stabilizes. On the other hand, if S = S^d and X is isotropic, the spectral representation (4) is explicitly given in terms of known spherical harmonics, and so we do not need an approximation, except that a truncation is still used (see Møller & Rubak, 2016, for details).

A simulation study reported in Lavancier et al. (2014) shows that approximate likelihood inference based on replacing C̃ by C̃_{app,N} works well in practice. As discussed in Lavancier et al. (2014), in case of stationary isotropic DPPs with a kernel of the form C(x, y) = ρ R(‖x − y‖; θ), where (ρ, θ) is the unknown parameter, the maximum likelihood estimate of the intensity ρ is well approximated by the usual non-parametric estimate ρ̂ = n/|S|. Further, Lavancier et al. (2014, 2015) discuss approximate MLE, model comparison, and likelihood ratio tests for a number of examples of specific data sets.
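The flavour of Lavancier et al.'s Fourier approximation can be conveyed in a few lines. For an assumed stationary Gaussian kernel C(x, y) = ρ exp(−‖x − y‖²/α²) on S = [0, 1]² (the parameter values are illustrative), the approximate eigenvalues are the Fourier transform of C evaluated at integer frequencies; the sketch checks the existence bound (spectral density at most 1) and that the eigenvalues sum to the expected number of points:

```python
import numpy as np

# Assumed stationary Gaussian kernel on S = [0, 1]^2 (parameters illustrative):
# C(x, y) = rho * exp(-||x - y||^2 / alpha^2), with Fourier transform
# phi(w) = rho * (alpha*sqrt(pi))^2 * exp(-pi^2 * alpha^2 * ||w||^2).
rho, alpha = 100.0, 0.04

# Existence of the DPP requires the spectral density to be at most 1,
# i.e. rho <= (alpha*sqrt(pi))^(-2): the intensity/repulsion trade-off.
assert rho * (alpha * np.sqrt(np.pi))**2 <= 1

# Approximate eigenvalues: phi evaluated at integer frequencies (truncated at N = 50).
k = np.arange(-50, 51)
K1, K2 = np.meshgrid(k, k)
lam = rho * (alpha * np.sqrt(np.pi))**2 * np.exp(-np.pi**2 * alpha**2 * (K1**2 + K2**2))

print(lam.max(), lam.sum())  # all eigenvalues < 1; the sum approximates E N(S) = rho
```

Increasing the truncation N until the approximate MLE stabilizes is exactly the practical recipe described above.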
They conclude that estimates obtained by moment based methods as in Section 4.2 give similar results as those based on MLE, though they are somewhat less efficient. For non-stationary parametric DPP models, where the intensity depends on covariates while the pair correlation function is stationary, Lavancier et al. (2014, 2015) use the easier approach of minimum contrast estimation.

4.5 Bayesian inference

Bayesian inference for parametric Poisson, Matérn III hard core (Huber & Wolpert, 2009; Møller et al., 2010), Cox, and cluster processes has been reviewed in Møller & Waagepetersen (2007) and Guttorp & Thorarinsdottir (2012). For Poisson and Matérn III models the likelihood is known. This is also the case for Cox and cluster processes provided the latent random function or the cluster centres are included among the unknown parameters. Various hybrid (or Metropolis within Gibbs) algorithms for posterior simulation apply, where e.g. in the case of cluster processes the main ingredient is often a kind of birth-death-move Metropolis-Hastings algorithm (Geyer & Møller, 1994; Møller & Waagepetersen, 2004; Huber, 2011). However, their implementations and a careful output analysis can be cumbersome, and relatively large computation times may be required. For a LGCP with a Matérn covariance function for the underlying Gaussian process, an alternative to MCMC is based on integrated nested Laplace approximations (INLA) (see 'Bayesian Computing with INLA: A Review' in this volume) provided certain restrictions are imposed on the smoothness parameter (Rue et al., 2009; Lindgren et al., 2011; Illian et al., 2012). Taylor & Diggle (2012), however, question whether INLA is always both significantly faster and more accurate than MCMC.

For a Gibbs point process, the unknown normalizing constant in the likelihood causes computational problems when calculating the Hastings ratio for MCMC posterior simulations; Murray et al. (2006) call this 'MCMC for doubly-intractable distributions'. Møller et al.
(2006) offer a solution based on an MCMC auxiliary variable method which involves perfect simulation (many references to perfect simulation for spatial point processes can be found in Huber, 2015). In its simplest form, the auxiliary variable method may have low acceptance probabilities, but more elaborate methods were used in Berthelsen & Møller (2006) for pairwise interaction point processes, assuming that the first-order term is a shot noise process and that the interaction function for a pair of points depends only on the distance between the two points and is a piecewise linear function. The exchange algorithm of Murray et al. (2006) is a modification of the auxiliary variable method which is simpler to use.

Guttorp & Thorarinsdottir (2012) discuss how to use Bayes factors for model selection problems, e.g. when considering a cluster process with different models for the dispersion distribution. The reversible jump MCMC algorithm (Green, 1995) is then used when proposing a jump from one dispersion distribution to another.
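As a toy illustration of the exchange algorithm (not a point process implementation; the model, prior, and function names here are ours), consider a single Poisson observation x with rate θ, pretending the normalizing constant e^{−θ} is unknown, so that only the unnormalized density q(x|θ) = θ^x/x! and exact simulation from the model are available:

```python
import math, random

def poisson_draw(lam, rng):
    # Exact Poisson sampling (Knuth's method); stands in for the perfect
    # simulation step that the auxiliary variable method requires.
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def exchange_sampler(x, n_iter, seed=0):
    # Exchange algorithm: propose theta', draw auxiliary y ~ p(.|theta'),
    # and accept with ratio q(x|theta') q(y|theta) pi(theta')
    #                    / [ q(x|theta) q(y|theta') pi(theta) ],
    # in which the unknown normalizing constants cancel.
    rng = random.Random(seed)
    theta, samples = 1.0, []
    for _ in range(n_iter):
        prop = abs(theta + rng.gauss(0.0, 0.5))  # random walk, reflected at 0
        y = poisson_draw(prop, rng)
        # log ratio with q(x|t) = t^x / x! and an Exp(1) prior pi(t) = e^(-t)
        log_ratio = ((x - y) * (math.log(prop) - math.log(theta))
                     - (prop - theta))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            theta = prop
        samples.append(theta)
    return samples
```

With x = 5 and the Exp(1) prior, the true posterior is Gamma(6, 2) with mean 3, which the chain's average should approach. For a Gibbs point process the auxiliary draw would be a perfect simulation of the process itself, which is exactly where the computational burden lies.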
Classical functional summary statistics such as Ripley's K-function, the empty space function F, the nearest-neighbour function G, and the related J-function (see e.g. Møller & Waagepetersen, 2004) play a major role in exploratory analysis of spatial point patterns and validation of fitted models. For a stationary point process, if b(o, t) denotes the ball with center o and radius t, ρK(t) = E#{X^{!o} ∩ b(o, t)} is the expected number of further points in X within distance t of a typical point in X, while F(t) = P(X ∩ b(o, t) ≠ ∅) and G(t) = P(X^{!o} ∩ b(o, t) ≠ ∅) are the probabilities of observing at least one point within distance t of, respectively, an arbitrary fixed point in space or a typical point in X. The J-function is defined as J(t) = [1 − G(t)]/[1 − F(t)] for t with F(t) < 1. Section 5.1 discusses some new summaries and Section 5.2 the use of envelopes and Monte Carlo tests.

5.1 New functional summaries
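Before turning to the generalizations, the classical summaries defined above can be illustrated with a minimal sketch (hypothetical helper names, no edge correction): for a homogeneous Poisson process F = G, so J ≡ 1, whereas clustering pushes J below 1 and regularity pushes it above 1.

```python
import math, random

def estimate_F_G_J(points, t, grid_n=50):
    # Naive estimators on the unit square, with no edge correction.
    def nearest(p, others):
        return min(math.dist(p, q) for q in others)
    # G: share of points with a further point within distance t
    G = sum(nearest(p, [q for q in points if q is not p]) <= t
            for p in points) / len(points)
    # F: share of grid test locations with a point within distance t
    grid = [((i + 0.5) / grid_n, (j + 0.5) / grid_n)
            for i in range(grid_n) for j in range(grid_n)]
    F = sum(nearest(g, points) <= t for g in grid) / len(grid)
    return F, G, (1 - G) / (1 - F)

rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(500)]  # binomial stand-in
F, G, J = estimate_F_G_J(pts, t=0.02)   # J should be close to 1 here
```

A binomial process (fixed number of independent uniform points) stands in for the Poisson process here; in practice edge-corrected estimators, such as those in spatstat, should be used instead of these naive ones.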
Baddeley et al. (2000) introduce the notion of second order intensity-reweighted stationary point processes and extend the definition of the K-function to such spatial point processes. They briefly discuss ways of generalizing F and G to non-stationary point processes, but the practical applicability of these generalizations was restricted to Poisson processes. Van Lieshout (2011) is more successful in generalizing F, G, and J. She defines the concept of intensity-reweighted moment stationary (IRMS) point processes, meaning that the so-called n-point correlation functions are translation invariant (whereby in particular an IRMS point process is also second order intensity-reweighted stationary). In the stationary case, F, G, and J can be expressed in terms of series representations involving joint intensities of all orders (provided the series are convergent), and Van Lieshout (2011) extends these series representations to the case of IRMS point processes. Van Lieshout (2011) further discusses specific examples of IRMS point processes and non-parametric estimation of the generalized summary statistics.

To detect anisotropy in cases with linear structures such as in Figure 2 (left panel), Møller et al. (2015b) introduce, in the stationary and the second order intensity-reweighted stationary cases, a functional summary statistic K_e(r, t). This is called the cylindrical K-function since it is a directional K-function whose structuring element is a cylinder C_e(r, t) centered at o with direction e (a unit vector), radius r, and height 2t. In the stationary case, ρK_e(r, t) = E#{X^{!o} ∩ C_e(r, t)} is the expected number of further points within the cylinder given that X has a point at o. Choosing different directions and sizes of the cylinder, Rafati et al. (2016) and Møller et al.
(2015b) demonstrate that a non-parametric estimate of K_e(r, t) is useful for detecting preferred directions and columnar structures in 2D and 3D spatial point pattern data sets.

For isotropic point processes on S^d, Robeson et al. (2014) studied Ripley's K-function, while Lawrence et al. (2016) and Møller & Rubak (2016) independently introduced empty space and nearest-neighbour functions F and G (and hence also a J-function). The interpretations are similar to those for stationary point processes on R^k but use great circle distance on the sphere (the focus in these papers is on the case d = 2, but most of the theory easily extends to the general case of dimension d = 1, 2, . . .). In the case of second order intensity-reweighted isotropy, Lawrence et al. (2016) and Møller & Rubak (2016) also study the inhomogeneous K-function. While Møller & Rubak (2016) provide the technical details on how to define Palm distributions for point processes on S^d and illustrate the application of functional summary statistics for DPPs on the sphere, Lawrence et al. (2016) deal with edge effects and a cluster point process on the sphere used for modelling the data set in Figure 2 (right panel).

5.2 Envelopes and Monte Carlo tests

A functional summary statistic such as K(r) contains information from different spatial scales. Usually we consider a graphical representation of a non-parametric estimator of e.g. K together with a simulation envelope which expresses the variability of this estimator. Below we discuss how formal statistical tests may be constructed from such an envelope.

Suppose we have observed a realization of a spatial point process X, setting X_1 = X, and then simulated spatial point process realizations X_2, . . . , X_{m+1} under a claimed model for X, so that the joint distribution of X_1, . . . , X_{m+1} is exchangeable under the model. For i = 1, . . . , m + 1, let T̂_i(r) = T̂(r, X_i) (r >
0) denote an estimator of a functional summary statistic such as
F, G, J, K, or their inhomogeneous versions. For each value of r > 0 and for k = 1, 2, . . ., we define a pointwise envelope with lower and upper bounds T̂^(k)_low(r) and T̂^(k)_up(r) given by the kth smallest and largest values of T̂_2(r), . . . , T̂_{m+1}(r). Then T̂_1(r) falls outside the interval between T̂^(k)_low(r) and T̂^(k)_up(r) with probability α := 2k/(m + 1), or in case of ties with approximate probability α. Thus a plot of the pointwise envelopes over a range of r-values enables us to carry out, for each fixed value of r, a Monte Carlo test where at (approximate) level α we reject the claimed model if T̂_1(r) is outside the envelope. If relevant, this may be replaced by a one-sided Monte Carlo test (see e.g. Baddeley et al., 2015, Section 10.7.5).

Several authors have warned against over-interpretation of the pointwise envelope and the corresponding Monte Carlo test. In practice, the specified model for the data is an estimated model under a composite hypothesis, and so the Monte Carlo test is strictly speaking invalid but usually conservative. A global envelope test can be obtained by rejecting the claimed model if T̂_1(r) is not always inside the pointwise envelopes for a given finite selection of r-values. However, Ripley (1977) noticed that due to multiple testing this gives an unknown probability of committing a type I error which is larger than the level α associated with each of the pointwise tests. See also Loosmore & Ford (2006) and Baddeley et al. (2015, Section 10.7.2).

To solve the multiple testing problem, ways of constructing p-values corresponding to a global envelope with lower and upper bounds (T̂_low(r))_{r∈I} and (T̂_up(r))_{r∈I}, for a given finite set I ⊂ (0, ∞) (approximating a predefined interval of r-values) and number m of simulations, have been suggested in Myllymäki et al. (2015); see also Baddeley et al. (2015, Chapter 10).
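The pointwise envelope construction just described, together with the extreme ranks underlying the rank envelope test, can be sketched as follows; the tie-breaking and p-value conventions follow one common variant and the function names are ours, so see Myllymäki et al. (2015) for the authoritative definitions.

```python
def pointwise_envelope(sim_curves, k):
    # k-th smallest and k-th largest simulated value at each r; under the
    # claimed model the observed curve falls outside at a fixed r with
    # probability 2k / (m + 1), where m = len(sim_curves).
    cols = list(zip(*sim_curves))
    low = [sorted(c)[k - 1] for c in cols]
    up = [sorted(c)[-k] for c in cols]
    return low, up

def extreme_ranks(curves):
    # R_i = largest k such that curve i stays inside the k-th pointwise
    # envelope of all m + 1 curves (row 0 = observed) for every r.
    cols = list(zip(*curves))
    return [min(min(sum(v <= x for v in col), sum(v >= x for v in col))
                for x, col in zip(curve, cols))
            for curve in curves]

def rank_test_p_values(curves):
    # Liberal and conservative Monte Carlo p-values from the tied ranks;
    # exact conventions vary between implementations.
    ranks = extreme_ranks(curves)
    m1 = len(curves)
    p_minus = (sum(r < ranks[0] for r in ranks[1:]) + 1) / m1  # liberal
    p_plus = sum(r <= ranks[0] for r in ranks) / m1            # conservative
    return p_minus, p_plus

# A tiny worked example: the observed curve (row 0) is extreme everywhere.
curves = [[10, 10, 10], [1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
low, up = pointwise_envelope(curves[1:], k=1)   # [1, 2, 3] and [4, 5, 6]
```

In the example the observed curve lies strictly above the k = 1 envelope at every r, so its extreme rank is 1 and both p-values are small relative to the central curves.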
In one approach, a so-called rank envelope test is based on extreme ranks R_1, . . . , R_{m+1}, where R_i is the largest k such that T̂_i(r) is within T̂^(k)_low(r) and T̂^(k)_up(r) for all r ∈ I, cf. Myllymäki et al. (2015). They notice that the extreme ranks are tied and show how a liberal/lower p-value p− and a conservative/upper p-value p+ can be constructed. For a given probability α, liberal/conservative refers to the test obtained by rejecting if respectively p− or p+ is less than or equal to α. Myllymäki et al. (2015) further define the 100(1 − α)% rank envelope by the bounds T̂^(k_α)_low(r) and T̂^(k_α)_up(r) for r ∈ I, where k_α is the largest k such that #{i : R_i < k} ≤ α(m + 1). They show that rejecting due to p+ ≤ α is equivalent to T̂_1(r) being strictly outside the global rank envelope for at least one r ∈ I. They also show how to use a more comprehensive summary of the pointwise ranks, namely so-called rank counts. Regarding the number of simulations, Myllymäki et al. (2015) recommend using at least m = 2500 when α = 5%.

Finally, the approach in Dao & Genton (2014) to reduce or eliminate the problem of conservatism has been adapted in Myllymäki et al. (2015) and Baddeley et al. (2015, Section 10.10). Software for the methods in Myllymäki et al. (2015) is provided as an R package spptest at https://github.com/myllym/spptest, and it will be available in spatstat.

We conclude with a brief discussion of some other recent and future developments and some open problems.

In practice 'spatial' often refers to two dimensions; however, 3D point pattern data sets are becoming more common, e.g. as illustrated in connection with the pyramidal brain cell data set (Figure 2) and in Khanmohammadi et al. (2015), who consider 3D marked point pattern data obtained from focused ion beam scanning electron microscopy images.
Most methods for analyzing spatial point pattern data sets are, from a theoretical point of view, applicable in any spatial dimension. From a practical point of view, however, the 3D case poses additional computational challenges. This may again require developments of the existing theory to ensure practically applicable methods of analysis.

Due to space constraints we have not covered developments for marked and multivariate spatial point processes. Research in multivariate point processes has to a large extent focused on the bivariate or trivariate case (e.g. Baddeley et al., 2014a), but in ecology large data sets with the locations of thousands of trees for hundreds of species are collected. Such data sets call for research in methods for analysing highly multivariate point patterns. Steps in this direction are taken in Jalilian et al. (2015) and Waagepetersen et al. (2016). However, considerable challenges remain in order to obtain practically applicable statistical models and methods for point patterns with hundreds of types of points.

We have also not covered point processes defined on a network of lines, e.g. in connection with road accidents or spines on the dendrite network of a neuron (Baddeley et al., 2014a, 2015). Here statistical models and methods should take into account the network geometry, as addressed in Ang et al. (2012), Okabe & Sugihara (2012), and the references therein. Assuming that the pair correlation function only depends on shortest path distance, Ang et al. (2012) show how to define the counterpart of Ripley's K-function. There is a lack of models with this property apart from the Poisson process and another specific model studied in Ang et al. (2012). Currently, together with Ethan Anderes and Jakob G.
Rasmussen, the first author of the present paper is developing covariance functions on linear networks which only depend on shortest path distance; thereby DPPs and LGCPs may be constructed on such networks so that the pair correlation function only depends on shortest path distance.

As discussed in Section 3.4, many observed spatial point patterns contain points placed roughly on line segments. However, in some cases the exact mechanism responsible for the formation of the lines is unknown. For instance, Møller & Rasmussen (2012) model linear structures for the locations of bronze age graves in Denmark and mountain tops in Spain without introducing a latent line segment process. Instead they use sequential point process models, i.e. models where the points are ordered (Van Lieshout, 2006a,b). Since the ordering is not known, it is treated as a latent variable.

Estimating functions for spatial point processes can also be derived using a variational approach inspired by the variational estimators for Markov random fields developed by Almeida & Gidas (1993). Baddeley & Dereudre (2013) do this for Gibbs point processes, while Coeurjolly & Møller (2014) consider another variational estimator for the intensity function. In comparison with the estimation procedures in Section 4.2.1, the variational estimators are exact and explicit, quicker to use, and very simple to implement. However, inference is effectively conditional on the observed number of points, and so the intensity parameter cannot be estimated. In particular, the variational estimator in Coeurjolly & Møller (2014) is simple to use, but there may be a loss in efficiency since it does not take into account spatial correlation or interaction. Baddeley & Dereudre (2013) and Coeurjolly & Møller (2014) establish strong consistency and asymptotic normality of their variational estimators, and they also discuss finite sample properties in comparison to the estimation methods in Section 4.2.1.

Coeurjolly et al.
(2015a) follow ideas in Guan et al. (2015) to construct a weight function that in certain situations outperforms the weight function for the pseudo-likelihood. However, the optimal weight function in (8) is still not known. To the best of our knowledge, MLE for inhomogeneous DPPs has not yet been studied. Seemingly, it also remains to consider MLE for the case S = S^d.

While spatstat supports a wealth of inference procedures based on frequentist methods, including most of those discussed in Sections 4-5, spatstat is so far of very limited use for Bayesian inference. Instead, various software packages for Bayesian analysis have been developed in connection with specific Poisson, cluster, Cox, and Gibbs point process models. Moreover, as far as we know, Bayesian analysis for DPPs is another unexplored area.

DISCLOSURE STATEMENT
The authors are not aware of any affiliations, memberships, funding, or financialholdings that might be perceived as affecting the objectivity of this review.
ACKNOWLEDGMENTS
Supported by the Danish Council for Independent Research — Natural Sciences, grant 12-124675, "Mathematical and Statistical Analysis of Spatial Data", and by the "Centre for Stochastic Geometry and Advanced Bioimaging", funded by grant 8721 from the Villum Foundation.
References
Almeida MP, Gidas B. 1993. A variational method for estimating the parameters of MRF from complete or incomplete data.
Annals of Applied Probability
Spatial Statistics
To appear
Ang QW, Baddeley A, Nair G. 2012. Geometrically corrected second order analysis of events on a linear network, with applications to ecology and criminology.
Scandinavian Journal of Statistics
Bernoulli
Journal of the Royal Statistical Society: Series C (Applied Statistics)
Stat
Electronic Journal of Statistics
Australian and New Zealand Journal of Statistics
R package for analyzing spatial point patterns. Journal of Statistical Software, ISSN: 1548-7660
Baddeley A, Turner R. 2006. Modelling spatial point patterns in R. In
Case Studies in Spatial Point Pattern Modelling, number 185 in Lecture Notes in Statistics, eds. A Baddeley, P Gregori, J Mateu, R Stoica, D Stoyan. New York: Springer-Verlag, 23–74
Baddeley AJ, Coeurjolly JF, Rubak E, Waagepetersen R. 2014b. Logistic regression for spatial Gibbs point processes.
Biometrika
Statistica Neerlandica
Journal of the Royal Statistical Society, Series B
Applied Statistics
Applied Statistics
Australian and New Zealand Journal of Statistics
Bulletin of the International Statistical Institute
Statistics
Electronic Journal of Statistics
Biometrics
Advances in Applied Probability
Biometrics
Scandinavian Journal of Statistics
Bernoulli
Journal of Tropical Ecology
Journal of Computational and Graphical Statistics
Biometrika
International Statistical Review
Journal of the American Statistical Association
Journal of the Royal Statistical Society, Series A (Statistics in Society)
Kybernetika: 1007–1026
Fiksel T. 1984. Estimation of parameterized pair potentials of marked and non-marked Gibbsian point processes.
Elektronische Informationsverarbeitung und Kybernetik
Communications in Mathematical Physics
Scandinavian Journal of Statistics
Biometrics
To appear
Green P. 1995. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination.
Biometrika
Journal of the American Statistical Association
Journal of the American Statistical Association
Journal of the American Statistical Association
Journal of the Royal Statistical Society, Series B
Advances and Challenges in Space-time Modelling of Natural Events, eds. E Porcu, J Montero, M Schlather, Lecture Notes in Statistics. Springer: Berlin Heidelberg, 79–102
Hahn U, Jensen EBV, van Lieshout MNM, Nielsen LS. 2003. Inhomogeneous spatial point processes by location-dependent scaling.
Advances in Applied Probability
Probability Surveys
Journal of the American Statistical Association
Tropical Rain Forest: Ecology and Management, eds. SL Sutton, TC Whitmore, AC Chadwick. Oxford: Blackwell Scientific Publications, 25–41
Huber M. 2011. Spatial point processes. In
Handbook of MCMC, eds. S Brooks, A Gelman, G Jones, X Meng. Chapman & Hall/CRC Press, 227–252
Huber M. 2015. Perfect simulation. Chapman & Hall/CRC, Boca Raton
Huber M, Wolpert R. 2009. Likelihood-based inference for Matérn type-III repulsive point processes.
Advances in Applied Probability
Astronomy and Astrophysics
The Annals of Applied Statistics
Biometrics
Scandinavian Journal of Statistics
Bernoulli
Annals of Applied Probability
Journal of Microscopy
Advances in Applied Probability
Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB '14, Newport Beach, California, USA, September 20-23, 2014
Frontiers in Neuroanatomy
Scandinavian Journal of Statistics
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Computational Statistics and Data Analysis
Stat
Journal of the Royal Statistical Society, Series B
G or K point pattern spatial statistics. Ecology
Advances in Applied Probability
Advances in Applied Probability
Spatial Cluster Modelling, eds. AB Lawson, D Denison. Boca Raton, Florida: Chapman & Hall/CRC Press, 87–107
Møller J. 1999. Markov chain Monte Carlo and spatial point processes. In
Stochastic Geometry: Likelihood and Computation, eds. OE Barndorff-Nielsen, WS Kendall, MNM van Lieshout. Boca Raton: Monographs on Statistics and Applied Probability 80, Chapman and Hall/CRC, 141–172
Møller J. 2003. Shot noise Cox processes.
Advances in Applied Probability
Stochastic Processes and their Applications
Biometrika
Stochastic Environmental Research and Risk Assessment
Scandinavian Journal of Statistics
K-function and Poisson line cluster point processes. CSGB Research Reports 3, Centre for Stochastic Geometry and Advanced Bioimaging. Preprint on arXiv: 1503.07423. To appear in Biometrika
Møller J, Syversveen AR, Waagepetersen RP. 1998. Log Gaussian Cox processes. Scandinavian Journal of Statistics
Scandinavian Journal of Statistics
Advances in Applied Probability
Scandinavian Journal of Statistics
Journal of Neurophysiology
Proceedings of the 22nd Annual Conference on Uncertainty in Artificial Intelligence (UAI-06). AUAI Press, 359–366
Myllymäki M, Mrkvicka T, Grabarnik P, Seijo H, Hahn U. 2015. Global envelope tests for spatial processes. Preprint on arXiv: 1307.0239v4
Nguyen X, Zessin H. 1979. Integral and differential characterizations of Gibbs processes.
Mathematische Nachrichten
Annals of the Institute of Statistical Mathematics
Journal of Microscopy
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Spatial Statistics
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Journal of Statistical Planning and Inference
Statistics and Computing
Biometrical Journal
Statistics
Biometrical Journal
Journal of Statistical Computation and Simulation
Journal of Ecology
Astronomy and Astrophysics
Astronomy and Astrophysics
Proceedings Prague Stochastics 2006, eds. M Hušková, M Janžura. Matfyzpress, Prague, 215–224
Van Lieshout MNM. 2006b. Markovianity in space and time. In
Dynamics & Stochastics: Festschrift in Honour of M.S. Keane, eds. D Denteneer, F Den Hollander, E Verbitskiy. Institute for Mathematical Statistics, Beachwood, 154–168
Van Lieshout MNM. 2011. A J-function for inhomogeneous point processes. Statistica Neerlandica
Spatial Cluster Modelling, eds. AB Lawson, D Denison. Boca Raton, Florida: Chapman & Hall/CRC Press, 61–86
Waagepetersen R. 2005. Discussion of 'Residual analysis for spatial point processes'.
Journal of the Royal Statistical Society, Series B
Biometrics
Biometrika
Journal of the Royal Statistical Society, Series B
Journal of the Royal Statistical Society: Series C (Applied Statistics)
Technometrics
Journal of the Royal Statistical Society, Series B