[PDF] An Interaction Neyman-Scott Point Process Model for Coronavirus Disease-19

Abstract

With rapid transmission, the coronavirus disease 2019 (COVID-19) has led to over 2 million deaths worldwide, posing significant societal challenges. Understanding the spatial patterns of patient visits and detecting the local spreading events are crucial to controlling disease outbreaks. We analyze highly detailed COVID-19 contact tracing data collected from Seoul, which provides a unique opportunity to understand the mechanism of patient visit occurrence. Analyzing contact tracing data is challenging because patient visits show strong clustering patterns while clusters of events may have complex interaction behavior. To account for such behaviors, we develop a novel interaction Neyman-Scott process that regards the observed patient visit events as offsprings generated from a parent spreading event. Inference for such models is complicated since the likelihood involves intractable normalizing functions. To address this issue, we embed an auxiliary variable algorithm into our Markov chain Monte Carlo. We fit our model to several simulated and real data examples under different outbreak scenarios and show that our method can describe spatial patterns of patient visits well. We also provide visualization tools that can inform public health interventions for infectious diseases such as social distancing.

Full PDF

AAn Interaction Neyman-Scott Point ProcessModel for Coronavirus Disease-19

Jaewoo Park , Won Chang , and Boseung Choi Department of Statistics and Data Science, Yonsei University Department of Applied Statistics, Yonsei University Division of Statistics and Data Science, University of Cincinnati Division of Big Data Science, Korea UniversityFebruary 8, 2021

Abstract

With rapid transmission, the coronavirus disease 2019 (COVID-19) has led to over 2million deaths worldwide, posing signiﬁcant societal challenges. Understanding the spatialpatterns of patient visits and detecting the local spreading events are crucial to controllingdisease outbreaks. We analyze highly detailed COVID-19 contact tracing data collectedfrom Seoul, which provides a unique opportunity to understand the mechanism of patientvisit occurrence. Analyzing contact tracing data is challenging because patient visits showstrong clustering patterns while clusters of events may have complex interaction behavior.To account for such behaviors, we develop a novel interaction Neyman-Scott process thatregards the observed patient visit events as oﬀsprings generated from a parent spreadingevent. Inference for such models is complicated since the likelihood involves intractablenormalizing functions. To address this issue, we embed an auxiliary variable algorithminto our Markov chain Monte Carlo. We ﬁt our model to several simulated and real dataexamples under diﬀerent outbreak scenarios and show that our method can describe spatialpatterns of patient visits well. We also provide visualization tools that can inform public a r X i v : . [ s t a t . A P ] F e b ealth interventions for infectious diseases such as social distancing. Keywords: infectious disease; doubly-intractable distributions; cluster point process; Bayesianhierarchical model; Markov chain Monte Carlo

Caused by the transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2),the coronavirus disease 2019 (COVID-19) was ﬁrst reported in December 2019 in Wuhan, Hubeiprovidence, China (WHO, 2020). By February 2021, there have been 100 million conﬁrmed casesof COVID-19, with more than 2 million deaths. The disease spreads more quickly than inﬂuenza,mainly through close contact with infected people (CDC, 2020). Human-to-human transmissionis most common for COVID-19, primarily via respiratory droplets or aerosols from an infectedperson. Thus contact tracing, which records the travel paths of conﬁrmed patients in detail, isa highly eﬀective disease control measure. Spatial point process models for contact tracing datacan provide useful insights into the mechanism of patient visits. Here, we propose a new pointprocess model for studying the probabilistic mechanism of patient visits. Our model can provideuseful epidemiological information such as a warning system for local hotspots.Here we analyze the locations visited by conﬁrmed patients in Seoul, South Korea. Thedata set contains full contact histories of conﬁrmed patients, and is hence a unique source forpoint process modeling. Contact tracing data for most other countries contain only partial orimperfect tracing records. We regard each visit in the contact tracing data as an event in thepoint process model. Since such events are spatially clustered, a natural way to model this isthe Neyman-Scott point process (Neyman and Scott, 1952), which considers observed events asoﬀsprings generated around unobserved parents. In the epidemiological context, we can regard aparent point as a cluster center of disease (i.e., spreading event), and oﬀsprings are correspondingclusters (i.e., patient visits).Several computational methods have been developed for inference for point processes, includ-ing a minimum contrast method (Diggle, 2013) and pseudolikelihood approximation (Guan, 2006,Diggle et al., 2010). However, such approaches are sensitive to the choice of tuning parametersand may not be accurate in the presence of strong spatial dependence among points. Through2imulation studies, Mrkviˇcka et al. (2014) reports that Bayesian estimation is the most precisefor the Neyman-Scott point process. Furthermore, we can easily adjust priors based on epidemi-ological knowledge. Therefore, the Bayesian framework can be practical for ﬁtting hierarchicalpoint process models.A simple Neyman-Scott process has limited applicability here because it assumes an indepen-dent parent process. Local spreading events (parents) may have complex interaction behavior,leading to an intractable likelihood function; this is common in many spatial point process mod-els. Several extensions for Neyman-Scott processes have been developed. Waagepetersen (2007),Mrkviˇcka et al. (2014), Mrkviˇcka and Soubeyrand (2017) study inference for inhomogeneousNeyman-Scott point processes. Yau and Loh (2012) considers a special case where the parentprocess exhibits repulsion. Albert-Green et al. (2019) generalizes the Neyman–Scott process byallowing the parents to follow the log-Gaussian Cox process (Møller et al., 1998). However, theseexisting models may not be appropriate for our COVID-19 data because of the complex inter-actions of local spreading events. Unobserved spreading events attract each other at mid-rangedistances because other spreading events likely to appear in nearby communities. At the sametime, spreading events repel each other at small distances to avoid merging. Otherwise, onespreading event would become oﬀsprings to others.In this manuscript, we propose a novel parametric approach that exhibits interaction betweenunobserved spreading events. This new approach can infer model parameters that describespatial patterns of spreading events. Our model allows us to detect unobserved spreading events,thereby providing a practical warning system for coronavirus hotspots. Recently, Goldstein et al.(2015) proposes an attraction-repulsion point process model by extending the Strauss process(Strauss, 1975) that only allows repulsion behavior. Russell et al. (2016) uses such a processto model diﬀerent types of interactions among animals, including herding behavior and collisionavoidance. In this manuscript, we develop an interaction Neyman-Scott point process to describethis behavior. We incorporate the attraction-repulsion (Goldstein et al., 2015) process into theparent process. Our model includes an intractable normalizing function, posing computationaland inferential challenges. To address this, we adopt a novel auxiliary variable Markov chainMonte Carlo (MCMC) (Møller et al., 2006, Liang, 2010), that can avoid the direct evaluation ofintractable normalizing functions. 3ote that susceptible-infected-recovered (SIR) models (Kermack and McKendrick, 1927, Di-etz, 1967) are also popular for modeling infectious disease dynamics. There is a vast literature onmodeling the COVID-19 data based on SIR type models (Choi and Ki, 2020, Fanelli and Piazza,2020, Anastassopoulou et al., 2020, Barlow and Weinstein, 2020). These models are typicallybased on daily counts of infection, death, and recovery. Rather than modeling conﬁrmed cases,as in SIR models, we ﬁt our point process model to locations visited by conﬁrmed patients fromour contact tracing data. Our approach can provide epidemiological insights for social distancingin daily life by using the full information of visiting history.The remainder of this paper is as follows. In Section 2, we describe the COVID-19 contacttracing data in Seoul, South Korea. In Section 3, we introduce a new interaction Neyman-Scott process model and discuss its computational and inferential challenges. We describe ourMCMC algorithm for this model and provide implementation details. In Section 4, we apply ourmethods to both simulated and real COVID-19 data sets. We show that our methods can detectthe sources of spreading events and describe the spatial patterns of patient visits. Furthermore,we provide a disease risk map that can give important epidemiological interpretations. In Section5, we conclude the paper with a discussion and summary.

In this section, we provide the background of our contact tracing data and summarize the resultsfrom exploratory data analysis.

We study the COVID-19 contacting tracing data in Seoul. During the early stages of the pan-demic in South Korea, Daegu and the North Gyeongsang (NGO) province were the central areasof the disease outbreak. However, the regional government only provided data on daily newinfections, deaths, and recoveries, rather than disclosing full contact tracing information. On theother hand, conﬁrmed COVID-19 patients in Seoul were perfectly under control, with their fullcontact information recorded. When a patient is conﬁrmed to be infected by a screening clinic, heor she is immediately quarantined. In addition, the local government tracks the patient’s moving4

ATID Date time Address Trans Description Latitude LongitudeXXX 0307 Chunhodaero 145, Seoul car Screening Clinic 959371 1952879XXX 0302 Gyungheedaero 7-1,Seoul other Caf´e 960469 1954862XXX 0302 HanChunro 58, Seoul car House 962047 1956113XXX 0301 Hoegiro 19, Seoul other Caf´e 960541 1954855XXX 0301 Hoegiro 18, Seoul route Restaurant 960500 1954704XXX 0301 Imunro 37, Seoul route Caf´e 960812 1954692XXX 0301 HanChunro 58, Seoul other House 962047 1956113XXX 0229 Shinimunro 40, Seoul route Hosipital 961735 1956027XXX 0229 Shinimunro 24, Seoul route Pharmacy 961718 1956025XXX 0229 Hoegiro 21, Seoul route Restaurant 960558 1954887XXX 0229 Imunro 37, Seoul car Caf´e 960812 1954692XXX 0229 Hwigyeongro 2, Seoul route Caf´e 961181 1955099XXX 0229 HanChunro 58, Seoul other House 962047 1956113

Table 1: Movement path of a conﬁrmed patient. For conﬁdentiality, we do not report the patient’sID (PATID).paths and posts the routes on local government websites. This information is legally disclosedfor two weeks and then archived. This code follows the “Response Guidelines to Prevent theSpread of COVID-19 (local government)” (KDCA, 2020). The contact tracing data sets havebeen collected directly from these websites.Table 1 provides examples of contact tracing data for a patient. For each individual, theinformation for travel route (transportation, places visited, coordinates) has been recorded. Forexample, the patient in Table 1, visited 13 places from February 29 to March 7, 2020. This patientwas conﬁrmed infected at a screening clinic in Seoul and immediately quarantined. Figure 1illustrates a graphical representation of the travel routes of ﬁve patients, including the patientin Table 1. To study the spatial patterns of patient visits, we regard each visit (coordinate) asa realization of the point process.In this paper, at any given time point, we ﬁt a point process model based on contact tracingdata accumulated over the last two weeks. According to the basic data analysis for the earlystages of the COVID-19 spread in South Korea and China, two weeks is the mean period ofrecovery from infection (Ki, 2020, Choi and Ki, 2020). Two weeks is also the period for epidemicinvestigation conducted by the local health authority for all conﬁrmed cases. Therefore, when weconstruct a warning system for COVID-19 at a certain time, modeling patient visits in the lasttwo weeks would be most useful. As illustration examples, we ﬁt our model for non-overlapping8 diﬀerent time periods from February 20th to June 11th in 2020. Our goal here is to examine5igure 1: Travel routes of ﬁve conﬁrmed patients in the Dondaemoon-gu area. Each colorrepresents a patient’s path.how our model works and provide important epidemiological insights for these diﬀerent timeperiods with various disease spreading patterns. In particular, in the main document we focuson three consecutive time periods that end at March 19th, April 2nd, and April 15th which canbe considered as severe, moderate, and mild outbreak cases, respectively in terms of overall visitnumbers. The results for other time periods are provided in the supplementary material.

Spatial point processes provide a natural solution to model spatial patterns for locations visitedby conﬁrmed patients. Here, we provide the motivation for a new interaction point process modelwith some exploratory data analysis. Let X = { x , · · · , x n } be a realization of point process overthe bounded spatial domain S ∈ R . The pair correlation function (PCF) is useful for exploring6 Locations visited by confirmed

PCF g ( r ) g^ Ripley ( r ) g Pois ( r ) Figure 2: The left panel shows the locations visited by patients (March 6 - March 19, 2020). Theright panel shows the estimated PCF from the observation. The red line indicates theoreticalPCF under complete spatial randomness.point process observations (Stoyan and Stoyan, 1994). The PCF is deﬁned as J ( r ) = 12 πr K (cid:48) ( r ) , K ( r ) = |S| n E (cid:104) (cid:88) i (cid:54) = j (cid:107) x i − x j (cid:107)≤ r (cid:105) , where |S| is the area of the spatial domain. Here, Ripley’s K ( r ) is the expected density of pointswithin distance r . Under complete spatial randomness, K ( r ) = πr , which results in J ( r ) = 1. J ( r ) > r (attraction) while J ( r ) < r (repulsion).Figure 2 illustrates an example of COVID-19 data and their PCF. We observe that patientvisits are spatially clustered ( J ( r ) > X = ∪ c ∈ C X c , where X c is the oﬀspring (clusters) and C is the parent (cluster centers). Givenparent process C , oﬀspring X c , c ∈ C follows an independent Poisson process with intensity αk ( u − c, ω ). Here, ω controls the spread of oﬀsprings around their parent, and α determines the7xpected number of oﬀsprings per each cluster. With the Gaussian kernel k ( u − c, ω ) with mean c and variance ω , X is called the Thomas process (Thomas, 1949). The basic Neyman-Scottprocess models the unobserved parent process C as a simple Poisson process.The Neyman-Scott process is appropriate for modeling COVID-19 data because the locationsvisited by conﬁrmed patients, X c , are clustered around the unobserved spreading event C . Fromthis, we can detect the local spreading events (cluster center of patient visits) of COVID-19.However, local spreading events C may have complex interactions. At mid-range distances r ,spreading events tend to clump together because other spreading events likely to exist in thenearby region. At small r , spreading events tend to remain apart; otherwise, they would becomeoﬀsprings of other spreading events, and hence two clusters can be ‘merged.’ The basic Neyman-Scott process cannot describe such behavior because the basic model considers an independentPoisson process for C . In the following section, we add another layer of complexity to thebasic Neyman-Scott process. This new model can provide epidemiological interpretation forunderstanding spreading events for COVID-19. In this section, we propose a new Neyman-Scott process by incorporating interaction behaviorinto C . This new parametric approach can detect unobserved spreading events of COVID-19 anddescribe their patterns. Based on this, we can provide a warning system for higher risk regions. Consider the realization of Neyman-Scott point process X = { x , · · · , x n } in the bounded plane S ⊂ R (Seoul domain). This indicates the observed locations visited by the conﬁrmed patients.Let C = { c , · · · , c m } be their unobserved parent process. In our context, each c i is the locationof a spreading event in a local community. Note that Seoul has a two-level hierarchical admin-istrative structure, Gu-Dong: Gu consists of multiple Dongs and each Dong has average size of1 .

427 km . We focus on modeling spreading events with a size of Dong because this coincideswith most citizens’ life radius. From this, we can provide a warning system for distancing indaily life. Spreading events tend to remain apart (repulsion) at small distances to avoid merging;8therwise, they would become oﬀsprings to each other. At mid-range distances, they tend toclump together (attraction), because other spreading events likely to appear in nearby areas.Spreading events become independent at suﬃciently large distances. To describe such patterns,we model the parent in the Neyman-Scott process as an attraction-repulsion interaction pointprocess (Goldstein et al., 2015). The locations of unobserved spreading events (parent process)is modeled as f ( C | κ, θ , θ ) = κ m (cid:81) mi =1 exp (cid:110) min (cid:16)(cid:80) i (cid:54) = j log ( φ ( D i,j )) , (cid:17)(cid:111) Z ( κ, θ , θ ) , (1)where the interaction function is φ ( D ) =  θ − (cid:16) √ θ θ ( D − θ ) (cid:17) < D ≤ D . D − D )) D > D (2)Here, D , D are set to make the interaction function φ ( D ) continuously diﬀerentiable (Gold-stein et al., 2015). The interaction function φ ( D ) is determined by the distance between twoparent points c i , c j . There are three parameters { κ, θ , θ } in this model: κ controls the overallintensity of the parent process and { θ , θ } control the shape of the interaction function φ ( D ). θ gives the peak value of φ , whereas θ gives the location of the peak value. When the dis-tance between spreading events D ij becomes too small, φ is less than 1, which means spreadingevents show repulsion behavior. On the other hand, as D ij becomes larger, φ is increased, whichmeans spreading events show attraction behavior at mid-range distance. Such attractions be-come weaker as the distance between the spreading events becomes larger. From this, (1) candescribe the attraction-repulsion behavior of local spreading events.Given the local spreading events, the locations visited by the conﬁrmed patients can bemodeled as f ( X | C , α, ω ) = exp (cid:16) |S| − (cid:90) S m (cid:88) i =1 αk ( u − c i , ω ) du (cid:17) n (cid:89) j =1 m (cid:88) i =1 αk ( x j − c i , ω ) . (3)We use a symmetric Gaussian kernel with a center at each spreading event c i and variance ω .9his results in a higher intensity of patient visits around the spreading events. In our context, α controls the expected number of conﬁrmed patients for each spreading event, and ω controlsthe levels of spreading events activity. The proposed model in the previous section has a hierarchical structure. Unobserved localspreading events (parent process) follow the spatial interaction process deﬁned in (1) and (2).Observed patient visits (oﬀsprings) are distributed around the unobserved parents with Gaussiankernels. The Bayesian framework is useful for such hierarchical models in that it can easilyquantify the model parameters’ uncertainty using MCMC. With priors p ( Θ ), the joint posteriordistribution is π ( Θ , C | X ) ∝ f ( X | C , α, ω ) f ( C | κ, θ , θ ) p ( Θ ) , Θ = { α, ω, κ, θ , θ } . (4)Although (4) is a natural way to construct a joint posterior distribution, there are computationaland inferential challenges to ﬁt such a model due to the intractable normalizing functions. Themodel for spreading events in (1) can be written as f ( C | κ, θ , θ ) = h ( C | κ, θ , θ ) Z ( κ, θ , θ ) = κ m (cid:81) mi =1 exp (cid:110) min (cid:16)(cid:80) i (cid:54) = j log ( φ ( D i,j )) , (cid:17)(cid:111) Z ( κ, θ , θ ) . (5)Here, the calculation of normalizing function Z ( κ, θ , θ ) requires integration over the spatialdomain S , which is infeasible. Inference for such models is challenging because an intractable Z ( κ, θ , θ ) is a function of model parameters. The resulting posterior (4) is referred to as adoubly-intractable distribution (Murray et al., 2006).Several computational methods have been proposed to address this issue. By assuming con-ditional independence of points, Besag (1974) proposed the pseudolikelihood that does not haveintractable normalizing functions. Due to its convenience, pseudolikelihood approximations areoften used in many point process applications (e.g., Diggle et al., 2010, Tamayo-Uria et al.,2014). However, it is well known that such approximations are unreliable when there is strongdependence among points (Mrkviˇcka et al., 2014). Geyer and Thompson (1992) propose the-10retically justiﬁed Monte Carlo maximum likelihood methods, which maximize a Monte Carloapproximation to the likelihood. However, this has limited applicability because the algorithmrequires gradients for h ( C | κ, θ , θ ), which is not available for interaction point process modelssuch as the one deﬁned above. Bayesian approaches can be an alternative for such cases. SeveralBayesian methods have been proposed for doubly-intractable distribution (see Park and Haran(2018) for a comprehensive review). Among current approaches, Park and Haran (2018) reportsthat double Metropolis-Hastings (DMH) is the most practical method for point process models.DMH can avoid the calculation of Z ( κ, θ , θ ) and alleviate memory issues, which can be seriousfor adaptive algorithms (Atchade et al., 2008, Liang et al., 2016). In the following section, weformulate DMH that can carry out Bayesian inference for an interaction Neyman-Scott pointprocess model. Here, we describe an MCMC algorithm for the attraction-repulsion Neyman-Scott point processmodel. Consider the model parameters Θ ( t ) = { α ( t ) , ω ( t ) , κ ( t ) , θ ( t )1 , θ ( t )2 } and latent parent process C ( t ) in (4) at the t -th iteration. We update the parameters successively. Oﬀspring parameters α ( t +1) , ω ( t +1) can be updated from α ( t +1) , ω ( t +1) ∼ f ( X | C ( t ) , α ( t ) , ω ( t ) ) p ( α ( t ) , ω ( t ) )with a joint prior density p ( α ( t ) , ω ( t ) ). We can update parent parameters from κ ( t +1) , θ ( t +1)1 , θ ( t +1)2 ∼ f ( C ( t ) | κ ( t ) , θ ( t )1 , θ ( t )2 ) p ( κ ( t ) , θ ( t )1 , θ ( t )2 )with a joint prior density p ( κ ( t ) , θ ( t )1 , θ ( t )2 ) (see Section 4.1 for how we choose the priors in ourproblem). Compared to updating oﬀspring parameters, updating parent parameters is chal-lenging due to the intractable normalizing function in (5). Intractable Z ( κ, θ , θ ) leads to the11ntractable acceptance probability in MCMC as α = min  h ( C ( t ) | κ (cid:48) ,θ (cid:48) ,θ (cid:48) ) Z ( κ (cid:48) ,θ (cid:48) ,θ (cid:48) ) p ( κ (cid:48) , θ (cid:48) , θ (cid:48) ) q ( κ (cid:48) , θ (cid:48) , θ (cid:48) | κ ( t ) , θ ( t )1 , θ ( t )2 ) h ( C ( t ) | κ ( t ) ,θ ( t )1 ,θ ( t )2 ) Z ( κ ( t )1 ,θ ( t )1 ,θ ( t )2 ) p ( κ ( t ) , θ ( t )1 , θ ( t )2 ) q ( κ ( t ) , θ ( t )1 , θ ( t )2 | κ (cid:48) , θ (cid:48) , θ (cid:48) ) ,  (6)for the proposed parameters κ (cid:48) , θ (cid:48) , and θ (cid:48) . To avoid direct evaluation of the intractable nor-malizing functions in α , we incorporate double Metropolis-Hastings (DMH) (Liang, 2010) intoour MCMC algorithm. Instead of evaluating the intractable normalizing functions in the α ,DMH sampler generates an auxiliary variable A from (1) via a birth-death MCMC (Moller andWaagepetersen, 2003). We provide details of the birth-death MCMC in the supplementary ma-terial. Note that the auxiliary variable A is from the same probability model as C and can beregarded as synthetic point process data. The acceptance probability of DMH is now given as α = min (cid:40) h ( C ( t ) | κ (cid:48) , θ (cid:48) , θ (cid:48) ) h ( A | κ ( t ) , θ ( t )1 , θ ( t )2 ) p ( κ (cid:48) , θ (cid:48) , θ (cid:48) ) q ( κ (cid:48) , θ (cid:48) , θ (cid:48) | κ ( t ) , θ ( t )1 , θ ( t )2 ) h ( C ( t ) | κ ( t ) , θ ( t )1 , θ ( t )2 ) h ( A | κ (cid:48) , θ (cid:48) , θ (cid:48) ) p ( κ ( t ) , θ ( t )1 , θ ( t )2 ) q ( κ ( t ) , θ ( t )1 , θ ( t )2 | κ (cid:48) , θ (cid:48) , θ (cid:48) ) , (cid:41) . (7)This does not include intractable terms because the introduction of A modiﬁes the original densi-ties in (6) by multiplying h ( A | κ ( t ) , θ ( t )1 , θ ( t )2 ) /Z ( κ ( t ) , θ ( t )1 , θ ( t )2 ) into the numerator and h ( A | κ (cid:48) , θ (cid:48) , θ (cid:48) ) /Z ( κ (cid:48) , θ (cid:48) , θ (cid:48) )into the denominator. If the simulated auxiliary variable A resembles the parent process C , theproposed parameters κ (cid:48) , θ (cid:48) , θ (cid:48) are likely to be accepted.Finally, we obtain C ( t +1) from C ( t +1) ∼ f ( X | C ( t ) , α ( t +1) , ω ( t +1) ) f ( C ( t ) | κ ( t +1) , θ ( t +1)1 , θ ( t +1)2 )using a birth-death MCMC. Note that stationary distribution of the birth-death MCMC algo-rithm for updating C ( t +1) is diﬀerent from that of generating the auxiliary variable A in DMH.The stationary distribution for generating the auxiliary variable follows (1) (see the supplemen-tary material for details). Our MCMC algorithm is summarized in Algorithm 1.12 lgorithm 1 MCMC for an attraction-repulsion Neyman-Scott process

Given Θ ( t ) = { α ( t ) , ω ( t ) , κ ( t ) , θ ( t )1 , θ ( t )2 } and C ( t ) at t -th iteration.Part 1: Update oﬀspring parameters: α ( t ) , ω ( t ) . Propose α (cid:48) , ω (cid:48) ∼ q ( ·| α ( t ) , ω ( t ) ) and accept it with probability α = min (cid:26) f ( X | C ( t ) , α (cid:48) , ω (cid:48) ) p ( α (cid:48) , ω (cid:48) ) q ( α ( t ) , ω ( t ) | α (cid:48) , ω (cid:48) ) f ( X | C ( t ) , α ( t ) , ω ( t ) ) p ( α ( t ) , ω ( t ) ) q ( α (cid:48) , ω (cid:48) | α ( t ) , ω ( t ) ) , (cid:27) else reject (set α ( t +1) = α ( t ) , ω ( t +1) = ω ( t ) ). Part 2: Update parent parameters: κ ( t ) , θ ( t )1 , θ ( t )2 . Propose κ (cid:48) , θ (cid:48) , θ (cid:48) ∼ q ( ·| κ ( t ) , θ ( t )1 , θ ( t )2 ).Generate an auxiliary variable A from the probability model using birth-death MCMC: A ∼ f ( C ( t ) | κ (cid:48) , θ (cid:48) , θ (cid:48) ).Accept it with probability α = min (cid:40) h ( C ( t ) | κ (cid:48) , θ (cid:48) , θ (cid:48) ) h ( A | κ ( t ) , θ ( t )1 , θ ( t )2 ) p ( κ (cid:48) , θ (cid:48) , θ (cid:48) ) q ( κ (cid:48) , θ (cid:48) , θ (cid:48) | κ ( t ) , θ ( t )1 , θ ( t )2 ) h ( C ( t ) | κ ( t ) , θ ( t )1 , θ ( t )2 ) h ( A | κ (cid:48) , θ (cid:48) , θ (cid:48) ) p ( κ ( t ) , θ ( t )1 , θ ( t )2 ) q ( κ ( t ) , θ ( t )1 , θ ( t )2 | κ (cid:48) , θ (cid:48) , θ (cid:48) ) , (cid:41) else reject (set κ ( t +1) = κ ( t ) , θ ( t +1)1 = θ ( t )1 , θ ( t +1)2 = θ ( t )2 ). Part 3: Update unobserved parent process: C ( t ) .C ( t +1) ∼ f ( X | C ( t ) , α ( t +1) , ω ( t +1) ) f ( C ( t ) | κ ( t +1) , θ ( t +1)1 , θ ( t +1)2 ) using birth-death MCMC13 Application

In this section, we apply our approach to simulated and real data examples. The computer codefor this is implemented in R and C++ using the

Rcpp and

RcppArmadillo packages (Eddelbuettelet al., 2011).

To complete the posterior speciﬁcation we now explicitly deﬁne the prior distributions. InNeymann-Scott process, there is a strong dependence between α and ω . The same observedpattern can be regarded as points generated from a small number of parents, with each of themhas a large number of oﬀsprings (large α , large ω ), or they may come from a large number ofparents, with each of them has a small number of oﬀsprings (small α , small ω ). To avoid suchidentiﬁability issues, we need to set the upper and lower bounds for them using uniform priors(Møller and Waagepetersen, 2007, Kopeck`y and Mrkviˇcka, 2016).As pointed out in Section 3.1, we focus on modeling spreading events in Dong scale. To thisend we set lower and upper bound for ω based on the average size of a Dong. This encouragesthat patient visits from diﬀerent Dongs come from diﬀerent spreading events. Therefore, we setthe range for ω to [ (cid:112) |S| / , (cid:112) |S| / κ needs to be determined in consideration of the overall intensity and theoﬀspring intensity α . The overall intensity is around 6 × − , computed as dividing the averagenumber of patient visits within two weeks by the area of Seoul |S| . The average number of patientvisits is computed over the 8 time periods. Since κ controls for parent process intensity and α controls for average number of oﬀsprings per each parent, κ × α should be around the overallintensity 6 × − . By specifying the prior of κ to be uniform with range [1 × − , × − ], κ × α belongs to the range [3 × − , × − ] so that the resulting range can include the overallintensity 6 × − .Similar to ω , we set a uniform prior for θ whose upper and lower bounds are proportional14 ω κ × θ θ Scenario 1Truth 6 360 1.2 1.5 600Estimates 5.70 370.34 1.13 1.85 626.8195%HPD (4.79, 6.62) (356.27, 390.39) (0.39, 2.05) (1.25, 2.67) (500.02, 798.47)Scenario 2Truth 5 400 1.0 1.5 650Estimates 5.58 400.02 1.18 1.40 684.5995%HPD (4.72, 6.52) (373.65, 427.80) (0.60, 1.79) (1.00, 1.93) (500.02, 937.61)Scenario 3Truth 4 440 0.5 1.5 700Estimates 4.12 423.73 0.75 1.64 716.7195%HPD (3.20, 5.02) (387.68, 461.51) (0.34, 1.22) (1.03, 2.36) (500.03, 941.41)Table 2: Inference results for simulated data sets. 100,000 MCMC samples are generated, dis-carding the ﬁrst 50,000 as burn-in. Posterior mean estimates are reported for each parameter.The highest posterior density (HPD) is calculated using the coda package.to the average size of a Dong. We use a uniform prior with range [ (cid:112) |S| / , (cid:112) |S| /

25] so that θ gives the location of the peak value within a Dong. To allow for attraction, we set the prior for θ to be uniform with range [1 ,

3] as in Goldstein et al. (2015).

We now conduct a simulation study to validate our method. We ﬁrst simulate the parent process C from (1) with true parent parameters κ, θ and θ . We then simulate oﬀsprings X from (3)centered around each parent point with true oﬀspring parameters α and ω . We consider threediﬀerent scenarios here: In Scenario 1, we emulate COVID-19 data with a severe outbreak. Thishas numerous local spreading events ( m = 129) and each of them also generates a large numberof oﬀsprings. This results in n = 798 patient visits in total. In Scenario 2, we consider a moderateoutbreak. This has fewer spreading events ( m = 93) and each generates a moderate number ofoﬀsprings; the simulated patient visits consisted of n = 446 points. Finally, in Scenario 3, weconsider a mild outbreak in which we have the smallest number of spreading events ( m = 56)and patient visits ( n = 239).Table 2 indicates that the true parameter values used in all scenarios are estimated withreasonable accuracy. We also compare the true and recovered parent points in Figure 3. Weobtain the recovered parent points from the last iteration of the MCMC algorithm (i.e., C ( t ) in15arch 19th α ω κ × θ θ Estimates 5.42 357.40 1.33 1.72 547.4395%HPD (4.69, 6.11) (356.26, 359.63) (0.57, 2.06) (1.31, 2.21) (500.00, 638.49)April 2nd α ω κ × θ θ Estimates 4.61 357.44 1.00 1.92 553.2295%HPD (3.88, 5.25) (356.26, 359.74) (0.43, 1.71) (1.37, 2.58) (500.02, 646.51)April 15th α ω κ × θ θ Estimates 4.08 358.55 0.56 2.08 568.4095%HPD (3.24, 4.93) (356.26, 363.21) (0.25, 0.88) (1.34, 2.99) (500.01, 699.50)Table 3: Inference results for real data. 100,000 MCMC samples are generated, discarding theﬁrst 50,000 as burn-in.Algorithm 1). We observe that our method can detect spreading events well as there are goodagreements between true and ﬁtted points in all scenarios. Trace plots for number of parentsalso indicate that the MCMC chain from Algorithm 1 converges.

Parameter Interpretation

We now apply our method to the real observational data de-scribed in Section 2.1. The parameter estimation results for time periods that end at March19th, April 2nd, and April 15th are summarized in Table 3. These three consecutive periodscan be regarded as severe, moderate, and mild outbreaks, respectively, in terms of the numberof patient visits. Parameter estimation results for other time periods are provided in the supple-mentary material. The average number of oﬀspring points per each parent ( α ) and the intensityfor parent ( κ ) change as the overall number of visits changes; that is, more overall events leadto higher α and κ values. The width of the Gaussian kernel controlled by ω , on the other hand,stays similar. For April 15th, when the total number of visits is relatively small ( n = 190), α takes a lower value as well.The estimated values for the remaining two parameters θ and θ describe the interactionbetween parent points and show how clusters of the events are distributed in Seoul. The distancefor maximum interaction, determined by θ , increases as the number of parents reduces, reﬂectinga sparser distribution of the parent events. Nevertheless, from the θ estimates (which aresigniﬁcantly greater than 1), parent points in all periods show clear attraction at the locationgiven by θ (around 550m). This implies that diﬀerent event clusters tend to appear near each16ther rather than independently. Visualization for COVID-19 Risk

From the estimated values of α , κ , and ω , we can createan intensity map for patient visit events. Since local spreading events cause rapid communitytransmission, it is essential to avoid the higher risk locations in daily life. Understanding theprobabilistic mechanism occurring patient visits and detecting local spreading events (hotspots)are key to manage disease outbreaks. Let ˆ α and ˆ ω be the posterior mean of α and ω respectively.Given the parent points from the last iteration of the MCMC, denoted as ˆ c , · · · , ˆ c m , the spatialintensity function for oﬀspring points is given by g ( x ) = m (cid:88) i =1 αk ( x − ˆ c i , ˆ ω ) (8)for any x ∈ S . This map can be used to identify high-risk areas where the intensity exceeds acertain threshold set by administrative decision. Figure 5 illustrates the intensity map g ( x ) andhigh-risk areas deﬁned as g ( x ) / > .

427 km , which corresponds to roughly one patient per1 .

427 km each day. Here we divide g ( x ) by 14 to convert the intensity computed for two weeksinto the intensity for one day. The area 1 .

427 km is the average size of each Dong in Seoul, thesmallest administrative district.The hotspots shown in Figure 5 present diﬀerent types of spreading events. For example,the hotspots in Guro-gu (bottom left corner of the map) are caused by a group infection from acall center building around March. The event led to multiple detected conﬁrmed cases. On theother hand, the hotspots in Jungrang-gu (top right corner in the map) are mostly caused by asingle patient. This patient moved around multiple places in the local area, which resulted inmany close contacts. The former gained considerable media attention, and people became awareof the potential risk in the community. The latter, on the other hand, did not receive mediacoverage considering privacy protection, and the local community was largely unaware of therisk. Our approach may provide a way to warn local people without revealing information aboutthe speciﬁc contagious individual. From Figure 5, we observe that the overall intensity decreasesas time goes, indicating the outbreaks had been controlled around mid-April. In summary, thesevisualizations allow authorities to identify contagion risk hotspots and help them to set up public17ealth interventions while maintaining privacy protection. Visualization for Risk Boundaries

Our approach can also provide a way to generate “riskboundaries” within a short time period (such as one day), assuming that the event distributionwithin that period can be approximated by a model based on the previous two weeks. Suchboundaries are potentially useful for issuing public health advisories. The left column of Figure6 shows that the patient visits occurred on April 3rd mostly concentrated within the areas withhigh intensity g ( x ), given by the existing parent points (ˆ c , ..., ˆ c m ), but there are also pointslocated slightly away from high intensity areas. However, there are no points that are clearlyfar away from the existing high intensity areas either. Therefore, these points can be viewed asoﬀspring events from new parents that are created near the existing parents.The right column of Figure 6 shows the risk boundaries that consider both the occurrenceof new parent points and the range of oﬀspring densities created by them. Speciﬁcally, riskboundaries are represented as the orange circles, whose centers are the existing parent pointsˆ c , . . . , ˆ c m . The radii are given as θ + 1 . ω , the sum of the distance that maximizes theinteraction between the parent points ( θ ) and one side of the 95% interval for the oﬀspringdensity (1 . ω ). As Figure 6 shows, these risk boundaries indicate how far new events canreach given the existing parent points. The other periods also show qualitatively similar results(Figures 4, 5 in the supplementary material). In this manuscript, we proposed a novel interaction Neyman-Scott point process for modelingCOVID-19 data. Our model can be used to study the spatial patterns of patient visits and detectspreading events in local communities by considering location-based interactions. To address theintractable normalizing functions involved in the posterior, we embed DMH (Liang, 2010) intoour MCMC algorithm. We apply our approach to COVID-19 data sets pertaining to Seoul, anddraw meaningful epidemiological conclusions based on parameter estimates. Furthermore, weprovide easy-to-interpret disease risk maps from the ﬁtted results. Through simulation studies,we show that our method can provide reasonably accurate parameter estimates and detect true18preading events.Here, we focus on developing a new cluster point process model by regarding each patient visitas a realization from the process. Note that there are other spatial models that can be applied tocontact tracing data. Examples include dynamic models for movement tracks of animals (Russellet al., 2016) and spatial gradient methods for modeling the spread of invasive species (Goldsteinet al., 2019). Adopting such methods for studying COVID-19 contact tracing data could provideanother interesting epidemiological insight.Our method can be useful for planning public health interventions such as deciding the socialdistancing levels. Social distancing is necessary to prevent the spread of infectious diseases,especially for people in risk groups for severe coronavirus disease. However, raising the socialdistancing level could adversely aﬀect the local economy. Therefore, it is essential to measurethe degree of the COVID-19 risk: how dangerous is the current coronavirus outbreak? Basedon our approach, we can provide some guidelines for deciding the appropriate distancing levelsto minimize economic damage. We can quantify such risks based on model parameters. Forinstance, α and ω estimates provide the expected number of patient visits per spreading eventand the levels of spreading events activity (range of spread). From the θ estimate, we can expectthe probable location of a new spreading event. Using these parameter estimates, disease controlauthorities can decide the threshold for social distancing levels. Note that such interpretation isnot available from existing cluster point process approaches.Furthermore, our proposed approach has some advantages considering privacy protection.The disease risk maps allow authorities to alert local communities about disease hotspots withoutrevealing privacy-sensitive information. This is opposite to the current practice in South Korea,where local governments or health authorities release each individual’s travel history; althoughthe names of individuals are not revealed, the released information can be used to infer theidentity of each patient. The method and ideas developed in this manuscript are generallyapplicable to other infectious diseases as well. 19 upplementary Material The supplementary material available online provides details for birth-death MCMC algorithmsand ﬁtted results for other time periods.

Acknowledgement

Jaewoo Park was supported by the Yonsei University Research Fund of 2020-22-0501 and theNational Research Foundation of Korea (NRF-2020R1C1C1A0100386811). Boseung Choi wassupported by the National Research Foundation of Korea (2020R1F1A1A01066082). The OhioSuper Computer Center (OSC) provided part of computational resources for this study. Theauthors are grateful to Tom´aˇs Mrkviˇcka for providing useful sample codes and the anonymousreviewers for their careful reading and valuable comments. The authors are also grateful to KoreaSpatial Information & Community Co. Ltd. for their support with data and Sungjae Kim andEunjin Eom for data collection and organization.20

Scenario 1 x−coordinates y − c oo r d i na t e s TrueRecovered 0 10000 20000 30000 40000 50000

Trace plot of parents

Index CC d i m [ : + ] Scenario 2 x−coordinates y − c oo r d i na t e s TrueRecovered 0 10000 20000 30000 40000 50000

Trace plot of parents

Index CC d i m [ : + ] Scenario 3 x−coordinates y − c oo r d i na t e s TrueRecovered 0 10000 20000 30000 40000 50000

Trace plot of parents

Index CC d i m [ : + ] Figure 3: The left panel compares the true (circle) and recovered (triangle) parent points fordiﬀerent scenarios. The right panel illustrates the trace plots of parents.21

March 19th x−coordinates y − c oo r d i na t e s OffspringsParents 0 10000 20000 30000 40000 50000

Trace plot of parents

Index CC d i m [ : + , ] April 2nd x−coordinates y − c oo r d i na t e s OffspringsParents 0 10000 20000 30000 40000 50000

Trace plot of parents

Index CC d i m [ : + , ] April 15th x−coordinates y − c oo r d i na t e s OffspringsParents 0 10000 20000 30000 40000 50000

Trace plot of parents

Index CC d i m [ : + , ] Figure 4: The left panel compares the true (circle) and recovered (triangle) parent points for realdata examples. The right panel illustrates the trace plots of parents.22igure 5: (Left Column) Intensity map for patient visit events computed by the oﬀspring intensityfunction in (8). (Right Column) High risk area with the intensity greater than 1 patient / 1 day1.427 km , roughly meaning one patient per day within a “Dong”.23igure 6: The same intensity map as the left column in Figure 5 (left column) and the riskboundaries deﬁned as circles, whose centers are the existing sampled parent points ˆ c , . . . , ˆ c m and the radii are deﬁned as θ + 1 . ω (right Column). The black dots in both columns showevents observed on the day following the last date of each time interval.24 upplementary Material for An InteractionNeyman-Scott Point Process Model forCoronavirus Disease-19 Jaewoo Park, Won Chang, and Boseung Choi

A Birth-Death MCMC

Birth-death samplers (Moller and Waagepetersen, 2003) are often used to generate point pro-cesses given the model parameters. We propose adding a new point (birth), removing an existingpoint (death), or moving a point with equal probability. To generate an auxiliary variable inDMH, we use a birth-death sampler whose stationary distribution is f ( C | κ, θ , θ ). Algorithm 2summarizes this step. In our full MCMC algorithm (Algorithm 1 in the manuscript), we up-date C , given all the model parameters. Here, we use a birth-death sampler whose stationarydistribution is f ( X | C , α, ω ) f ( C | κ, θ , θ ). Algorithm 3 summarizes this step.25 lgorithm 2 Birth-death MCMC for generating the auxiliary variable in DMH

Given parent parameters κ, θ , θ and the point pattern C = { c , · · · , c m } .Birth step: add a point with probability / ξ uniformly over the spatial domain S : C + = C ∪ { ξ } Accept it with probability α = min (cid:26) f ( C + | κ, θ , θ ) |S| f ( C | κ, θ , θ )( m + 1) , (cid:27) Death step: remove a point with probability / η ∈ { c , · · · , c m } : C − = C \ { η } Accept it with probability α = min (cid:26) f ( C − | κ, θ , θ )( m − f ( C | κ, θ , θ ) |S| , (cid:27) Move step: move a point with probability / ξ and remove an existing point η : C (cid:48) = C ∪ { ξ } \ { η } Accept it with probability α = min (cid:40) f ( C (cid:48) | κ, θ , θ ) f ( C | κ, θ , θ ) , (cid:41) lgorithm 3 Birth-death MCMC for updating C in the full model Given parent parameters α, ω, κ, θ , θ and the point pattern C = { c , · · · , c m } .Birth step: add a point with probability / ξ uniformly over the spatial domain S : C + = C ∪ { ξ } Accept it with probability α = min (cid:26) f ( X | C + , α, ω ) f ( C + | κ, θ , θ ) |S| f ( X | C , α, ω ) f ( C | κ, θ , θ )( m + 1) , (cid:27) Death step: remove a point with probability / η ∈ { c , · · · , c m } : C − = C \ { η } Accept it with probability α = min (cid:26) f ( X | C − , α, ω ) f ( C − | κ, θ , θ )( m − f ( X | C , α, ω ) f ( C | κ, θ , θ ) |S| , (cid:27) Move step: move a point with probability / ξ and remove an existing point η : C (cid:48) = C ∪ { ξ } \ { η } Accept it with probability α = min (cid:40) f ( X | C (cid:48) , α, ω ) f ( C (cid:48) | κ, θ , θ ) f ( X | C , α, ω ) f ( C | κ, θ , θ ) , (cid:41) Analysis Results for Diﬀerent Timelines

March 5th α ω κ × θ θ Estimates 5.02 357.98 0.89 1.75 611.6795%HPD (4.40, 5.70) (356.26, 361.38) (0.41, 1.43) (1.25, 2.41) (500.02, 802.75)April 30th α ω κ × θ θ Estimates 3.19 563.40 0.17 1.66 724.7495%HPD (3.00, 3.58) (402.32, 744.51) (0.07, 0.29) (1.00, 2.65) (500.14, 966.40)May 14th α ω κ × θ θ Estimates 3.21 359.62 0.63 2.11 585.7895%HPD (3.00, 3.56) (356.26, 366.27) (0.27, 1.03) (1.43, 2.97) (500.00, 745.21)May 28th α ω κ × θ θ Estimates 5.13 357.17 1.34 1.66 538.4995%HPD (4.53, 5.79) (356.26, 358.93) (0.66, 2.25) (1.28, 2.13) (500.01, 607.51)June 11th α ω κ × θ θ Estimates 5.25 357.39 1.07 1.98 548.4595%HPD (4.50, 6.07) (356.27, 359.51) (0.37, 1.93) (1.41, 2.66) (500.02, 644.46)Table 4: Inference results for real data. 100,000 MCMC samples are generated, discarding theﬁrst 50,000 as burn-in. 28

March 5th x−coordinates y − c oo r d i na t e s OffspringsParents 940000 950000 960000 970000

April 30th x−coordinates y − c oo r d i na t e s OffspringsParents940000 950000 960000 970000

May 14th x−coordinates y − c oo r d i na t e s OffspringsParents 940000 950000 960000 970000

May 28th x−coordinates y − c oo r d i na t e s OffspringsParents940000 950000 960000 970000

June 11th x−coordinates y − c oo r d i na t e s OffspringsParents

Figure 7: Recovered parent points for the real data example. Circles indicate oﬀsprings andtriangles indicate recovered parent points. 29igure 8: (Left column) Intensity map of the patient visit events. (Right column) High-risk areawith intensity greater than 1 patient/1 day 1.427 km , which roughly translates to one patientper day within a “Dong”, the smallest administrative unit.30igure 9: (Left column) Intensity map of the patient visit events. (Right column) High-risk areawith intensity greater than 1 patient/1 day 1.427 km , which roughly translates to one patientper day within a “Dong”, the smallest administrative unit.31igure 10: The intensity map (left column) and risk boundaries deﬁned as circles, whose centersare the existing sampled parent points ˆ c , . . . , ˆ c m and the radii are deﬁned as θ + 1 . ω (rightcolumn). The black dots in both columns show the observed events during the day following thelast date of each time interval. 32igure 11: The intensity map (left column) and risk boundaries deﬁned as circles, whose centersare the existing sampled parent points ˆ c , . . . , ˆ c m , and the radii are deﬁned as θ + 1 . ω (rightcolumn). The black dots in both columns show the observed events during the day following thelast date of each time interval. 33 eferences Albert-Green, A., W. J. Braun, C. B. Dean, and C. Miller (2019). A hierarchical point processwith application to storm cell modelling.

Canadian Journal of Statistics 47 (1), 46–64.Anastassopoulou, C., L. Russo, A. Tsakris, and C. Siettos (2020). Data-based analysis, modellingand forecasting of the covid-19 outbreak.

PloS one 15 (3), e0230405.Atchade, Y., N. Lartillot, and C. P. Robert (2008). Bayesian computation for statistical modelswith intractable normalizing constants.

Brazilian Journal of Probability and Statistics 27 ,416–436.Barlow, N. S. and S. J. Weinstein (2020). Accurate closed-form solution of the sir epidemicmodel.

Physica D: Nonlinear Phenomena , 132540.Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems.

Journal ofthe Royal Statistical Society. Series B (Methodological) 36 , 192–236.CDC (2020). How COVID-19 spreads. .Choi, S. and M. Ki (2020). Estimating the reproductive number and the outbreak size of covid-19in korea.

Epidemiology and health 42 .Dietz, K. (1967). Epidemics and rumours: A survey.

Journal of the Royal Statistical Society.Series A (General) , 505–528.Diggle, P. J. (2013).

Statistical analysis of spatial and spatio-temporal point patterns . CRC press.Diggle, P. J., I. Kaimi, and R. Abellana (2010). Partial-likelihood analysis of spatio-temporalpoint-process data.

Biometrics 66 (2), 347–354.Eddelbuettel, D., R. Fran¸cois, J. Allaire, J. Chambers, D. Bates, and K. Ushey (2011). Rcpp:Seamless R and C++ integration.

Journal of Statistical Software 40 (8), 1–18.Fanelli, D. and F. Piazza (2020). Analysis and forecast of covid-19 spreading in china, italy andfrance.

Chaos, Solitons & Fractals 134 , 109761.34eyer, C. J. and E. A. Thompson (1992). Constrained Monte Carlo maximum likelihood fordependent data.

Journal of the Royal Statistical Society. Series B (Methodological) , 657–699.Goldstein, J., M. Haran, I. Simeonov, J. Fricks, and F. Chiaromonte (2015). An attraction-repulsion point process model for respiratory syncytial virus infections.

Biometrics 71 (2),376–385.Goldstein, J., J. Park, M. Haran, A. Liebhold, and O. N. Bjørnstad (2019). Quantifying spatio-temporal variation of invasion spread.

Proceedings of the Royal Society B 286 (1894), 20182294.Guan, Y. (2006). A composite likelihood approach in ﬁtting spatial point process models.

Journalof the American Statistical Association 101 (476), 1502–1512.KDCA (2020). Responseguidelines to prevent the spread of covid-19 (local government). http://ncov.mohw.go.kr/en/ .Kermack, W. O. and A. G. McKendrick (1927). A contribution to the mathematical theoryof epidemics.

Proceedings of the royal society of london. Series A, Containing papers of amathematical and physical character 115 (772), 700–721.Ki, M. (2020). Epidemiologic characteristics of early cases with 2019 novel coronavirus (2019-ncov) disease in korea.

Epidemiology and health 42 .Kopeck`y, J. and T. Mrkviˇcka (2016). On the Bayesian estimation for the stationary Neyman-Scott point processes.

Applications of Mathematics 61 (4), 503–514.Liang, F. (2010). A double Metropolis–Hastings sampler for spatial models with intractablenormalizing constants.

Journal of Statistical Computation and Simulation 80 (9), 1007–1022.Liang, F., I. H. Jin, Q. Song, and J. S. Liu (2016). An adaptive exchange algorithm for samplingfrom distributions with intractable normalizing constants.

Journal of the American StatisticalAssociation 111 , 377–393.Møller, J., A. N. Pettitt, R. Reeves, and K. K. Berthelsen (2006). An eﬃcient Markov chain MonteCarlo method for distributions with intractable normalising constants.

Biometrika 93 (2), 451–458. 35øller, J., A. R. Syversveen, and R. P. Waagepetersen (1998). Log Gaussian Cox processes.

Scandinavian journal of statistics 25 (3), 451–482.Moller, J. and R. P. Waagepetersen (2003).

Statistical inference and simulation for spatial pointprocesses . CRC Press.Møller, J. and R. P. Waagepetersen (2007). Modern statistics for spatial point processes.

Scan-dinavian Journal of Statistics 34 (4), 643–684.Mrkviˇcka, T., M. Muˇska, and J. Kubeˇcka (2014). Two step estimation for Neyman-Scott pointprocess with inhomogeneous cluster centers.

Statistics and Computing 24 (1), 91–100.Mrkviˇcka, T. and S. Soubeyrand (2017). On parameter estimation for doubly inhomogeneouscluster point processes.

Spatial statistics 20 , 191–205.Murray, I., Z. Ghahramani, and D. J. C. MacKay (2006). MCMC for doubly-intractable distribu-tions. In

Proceedings of the 22nd Annual Conference on Uncertainty in Artiﬁcial Intelligence(UAI-06) , pp. 359–366. AUAI Press.Neyman, J. and E. Scott (1952). A theory of the spatial distribution of galaxies.

The AstrophysicalJournal 116 , 144.Park, J. and M. Haran (2018). Bayesian inference in the presence of intractable normalizingfunctions.

Journal of the American Statistical Association 113 (523), 1372–1390.Russell, J. C., E. M. Hanks, and M. Haran (2016). Dynamic models of animal movementwith spatial point process interactions.

Journal of Agricultural, Biological, and EnvironmentalStatistics 21 (1), 22–40.Stoyan, D. and H. Stoyan (1994).

Fractals, random shapes, and point ﬁelds: methods of geomet-rical statistics , Volume 302. John Wiley & Sons Inc.Strauss, D. J. (1975). A model for clustering.

Biometrika 62 (2), 467–475.Tamayo-Uria, I., J. Mateu, and P. J. Diggle (2014). Modelling of the spatio-temporal distributionof rat sightings in an urban environment.

Spatial Statistics 9 , 192–206.36homas, M. (1949). A generalization of Poisson’s binomial limit for use in ecology.

Biometrika 36 (1/2), 18–25.Waagepetersen, R. P. (2007). An estimating function approach to inference for inhomogeneousNeyman–Scott processes.

Biometrics 63 (1), 252–258.WHO (2020). Novel Coronavirus - China. .Yau, C. Y. and J. M. Loh (2012). A generalization of the Neyman-Scott process.