Bernoulli 18(3), 2012, 1031–1041
DOI: 10.3150/10-BEJ350

Spatial birth–death swap chains
MARK HUBER
Department of Mathematics and Computer Science, 850 Columbia Ave., Claremont McKenna College, Claremont, CA 91711, USA. E-mail: [email protected]
Markov chains have long been used for generating random variates from spatial point processes. Broadly speaking, these chains fall into two categories: Metropolis–Hastings type chains running in discrete time and spatial birth–death chains running in continuous time. These birth–death chains only allow for removal of a point or addition of a point. In this paper it is shown that the addition of transitions where a point is moved from one location to another can aid in shortening the mixing time of the chain. Here the mixing time of the chain is analyzed through coupling, and use of the swap moves allows for analysis of a broader class of chains. Furthermore, these swap moves can be employed in perfect sampling algorithms via the dominated coupling from the past procedure of Kendall and Møller. This method can be applied to any pairwise interaction model with repulsion. In particular, an application to the Strauss process is developed in detail, and the swap chains are shown to be much faster than standard birth–death chains.
Keywords: birth–death process; coupling from the past; perfect simulation; spatial point processes; Strauss process; swap moves
1. Introduction
Spatial point processes are in wide use in statistical modeling (see [15] for an overview). Typically finite point processes are modeled as being absolutely continuous with respect to a Poisson point process. That is, they have a density f(x)/c where f(x) is an easily computable function but the normalizing constant c of the density is impractical to compute. A Monte Carlo algorithm gains information about f(x)/c by studying random variates drawn from the distribution the density describes.

To obtain these variates, a Markov chain is built whose stationary distribution matches the target distribution. Metropolis–Hastings chains run in discrete time (see [9]), and the spatial birth–death chain approach of Preston [21] runs in continuous time. In [6] problems were given where the Metropolis–Hastings approach is faster than Preston's.

The drawback of these Markov chain Monte Carlo methods is that unless the mixing time of the Markov chain is known, the quality of the variates is suspect. Heuristics such as the autocorrelation test can prove that a chain has not mixed, but cannot establish the positive claim that a chain has mixed.
Perfect simulation algorithms solve this problem. They generate samples exactly from the desired distribution without the need to know the mixing time of a Markov chain. Kendall [18] showed how the coupling from the past (CFTP) idea of Propp and Wilson [22] could be used together with a spatial birth and death chain to obtain samples from area interaction processes. Kendall and Møller [19] showed how this method could be extended to any locally stable point process using a method they called dominated CFTP. They also considered perfect sampling using Metropolis–Hastings chains, but restricted these chains to only adding or deleting a point at each step.

So [6] indicates that Metropolis–Hastings chains can beat continuous time chains, but [19] shows how to exactly sample using continuous time chains. The goal of this work is to introduce a new swap move to the continuous time chains that speeds up convergence, while still allowing for perfect simulation.

In Section 2 the theory behind spatial birth–death chains with the new swap move is developed, and an example of such a chain is given for the Strauss process. Section 3 reviews the use of dominated coupling from the past, and shows how the addition of swap moves fits into this protocol. Section 4 bounds the expected running time of the procedure for a restricted class of models.
2. Spatial point processes
Dyer and Greenhill [7] first introduced a swap move for hard core point processes in discrete spaces. In this section their method is extended to more general point processes. For ease of exposition, we consider here point processes that do not contain multiple points.

Let S be a separable measurable set, and λ be a diffuse measure on S (so λ({v}) = 0 for all v ∈ S) such that λ(S) < ∞. (Typically S is a bounded Borel set of R^d.) Then a Poisson point process is a finite subset of S chosen as follows. First, let N be a Poisson distributed random variable with parameter λ(S). Next, let X₁, ..., X_N be independently and identically distributed (i.i.d.) and drawn from the probability distribution λ(·)/λ(S). Then {X₁, ..., X_N} (called a configuration) is a draw from a Poisson point process with intensity measure λ(·) over S. Let µ be the distribution of the configuration and Ω the set of all possible configurations. More details of µ and Ω can be found in [4, 21].

As an example of data modeled using these types of processes, Harkness and Isham [12] studied locations of ant nests in a rectangular region R. With two types of ants, S = R × {1, 2} and λ is the product of Lebesgue measure and a measure on {1, 2}.

The processes considered here are absolutely continuous with respect to µ with density f satisfying a local stability condition (as in [19]):

(∃K > 0)(∀x ∈ Ω)(∀v ∈ S \ x)(f(x ∪ {v}) ≤ Kf(x)). (1)

Many point processes of interest meet this condition, including the area interaction process [2, 25], the Strauss process [17, 24] and the continuous random cluster model [11].

The development of the swap move given here follows the framework of Preston [21], who introduced the use of spatial birth–death chains for these problems. These chains are examples of jump processes, where at a given state x, the chain stays in the state for an exponential length of time with expected value given by 1/α(x).
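The two-step construction above (draw a Poisson number of points, then place the points i.i.d.) is easy to sketch in code. A minimal sketch, assuming S = [0, 1]² with λ equal to Lebesgue measure restricted to S; the function names are illustrative, not from the paper:

```python
import math
import random

def poisson_draw(mean, rng):
    """Poisson variate via products of uniforms (Knuth's method)."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def sample_poisson_pp(rng, intensity=1.0):
    """Poisson point process on S = [0,1]^2: the number of points is
    Poisson(lambda(S)) = Poisson(intensity), and given the number, the
    points are i.i.d. from lambda(.)/lambda(S) (here, uniform on S)."""
    n = poisson_draw(intensity, rng)
    return [(rng.random(), rng.random()) for _ in range(n)]
```

For an intensity measure other than Lebesgue, the uniform placement would be replaced by a draw from λ(·)/λ(S).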
The state then jumps to a new state using kernel K, so the probability that the new state is in A is K(x, A), independent of the past history (see [8], Chapter X, for the details of jump processes). In the Preston framework, the rate of births (addition of points to the configuration) and deaths (deletion of points from the configuration) depends only on the current state:

• There exists a non-negative measurable birth rate function b from Ω × S equipped with the standard product σ-field to R with the Borel σ-field. Call b(x, v) the birth rate at which point v is added to configuration x.

• There exists a non-negative measurable death rate function d from Ω × S equipped with the standard product σ-field to R with the Borel σ-field. Furthermore, w ∈ x ⇒ d(x, w) > 0 and w ∉ x ⇒ d(x, w) = 0. Then d(x, w) is the death rate at which a point w is removed from configuration x.

To this birth–death framework we now add a swap rate:

• There exists a non-negative measurable swap rate function s from Ω × S × S equipped with the standard product σ-field to R with the Borel σ-field. Furthermore, w ∉ x ⇒ s(x, w, v) = 0. So s(x, w, v) is the swap rate at which point w is removed and point v is added.

The birth, death, and swap rates are used to build a kernel K for the Markov chain as follows. For all A ∈ B, let

K_b(x, A) = ∫_{v∈S} b(x, v) 1(x ∪ {v} ∈ A) λ(dv).

When K_b(x, Ω) < ∞ for all x in Ω, the normalized birth kernel is K̂_b(x, A) = K_b(x, A)/K_b(x, Ω). Similarly,

K_d(x, A) = Σ_{w∈x} d(x, w) 1(x \ {w} ∈ A),

which always has a finite number of terms, and so K̂_d(x, A) = K_d(x, A)/K_d(x, Ω). The total rate of births is r_b(x) = ∫_{v∈S} b(x, v) λ(dv), and the total rate of deaths is r_d(x) = Σ_{v∈x} d(x, v).

For the swap kernel, set

K_s(x, A) = Σ_{w∈x} ∫_{v∈S} s(x, w, v) 1(x ∪ {v} \ {w} ∈ A) λ(dv).
When K_s(x, Ω) < ∞ for all x ∈ Ω, let

K̂_s(x, A) = K_s(x, A)/K_s(x, Ω),  r_s(x) = Σ_{w∈x} ∫_{v∈S} s(x, w, v) λ(dv). (2)

The overall rate at which the configuration changes is α(x) = r_b(x) + r_d(x) + r_s(x), and the overall kernel is

K(x, A) = K̂_b(x, A) r_b(x)/α(x) + K̂_d(x, A) r_d(x)/α(x) + K̂_s(x, A) r_s(x)/α(x). (3)

Harris recurrence guarantees that a Markov process has a unique invariant measure (see [1] for details of Harris recurrence in the continuous-time context). Kaspi and Mandelbaum [16] showed that a continuous-time chain is Harris recurrent if and only if there exists a non-zero σ-finite measure where X almost surely hits sets with positive measure.
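The jump process defined by α(x) and kernel (3) can be simulated directly: hold for an Exp(α(x)) amount of time, then pick a birth, death, or swap in proportion to r_b(x), r_d(x), r_s(x). A generic sketch of this loop; the rate and proposal functions here are illustrative placeholders, not the specific chains of this paper:

```python
import random

def run_jump_chain(x, rates, proposals, t_max, rng):
    """Simulate a jump process up to time t_max.  `rates` maps a move
    name to its total-rate function r_m(x); `proposals` maps a move name
    to a function producing the next state from the current one."""
    t = 0.0
    while True:
        totals = {m: r(x) for m, r in rates.items()}
        alpha = sum(totals.values())     # overall rate alpha(x)
        if alpha == 0.0:
            return x                     # no moves possible
        t += rng.expovariate(alpha)      # exponential holding time
        if t > t_max:
            return x
        u = rng.random() * alpha         # move m chosen w.p. r_m(x)/alpha(x)
        for move, r in totals.items():
            if u < r:
                x = proposals[move](x, rng)
                break
            u -= r
```

As a toy check, a birth–death chain on the number of points with constant birth rate b and unit per-point death rate has stationary distribution Poisson(b).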
In particular, for all the chains considered here, the death rate equals the number of points in the configuration, and the birth rate is bounded above by a constant. This forces the chain to visit the empty configuration infinitely often, making it Harris recurrent.

The detailed balance conditions (which imply f is invariant) for jump processes are f(x)α(x)K(x, dy)µ(dx) = f(y)α(y)K(y, dx)µ(dy). For moves from configurations with n points to those with n + 1 (or vice versa), the detailed balance conditions are satisfied [21, 23] when the rate of births balances the rate of deaths with respect to f. So

f(x)b(x, v) = f(x ∪ {v})d(x ∪ {v}, v). (4)

Swap moves stay inside the same dimensional space, and it is straightforward to show that reversibility for swap moves holds when

f(x)s(x, w, v) = f(x ∪ {v} \ {w})s(x ∪ {v} \ {w}, v, w). (5)

Kendall and Møller [19] describe how to create a jump process with stationary density f for locally stable processes. Briefly, their method works as follows. Two coupled chains will be run: the dominating chain with state D(t) at time t and the target chain with state X(t) at time t. It will always be true that X(t) ⊆ D(t). Each point w ∈ D(t) has death rate d(D(t), w) = 1. If a point dies that is also in X(t), it is removed from both X(t) and D(t). The rate of births for the dominating chain is r_b = Kλ(S), where K is the local stability constant in equation (1). If a birth occurs, a point v is chosen according to the probability measure λ(·)/λ(S). Then v is always added to D(t) to get the next dominating state, but is only added to X(t) with probability f(X(t) ∪ {v})/[Kf(X(t))]. Assume that each point v born in D(t) is marked with a uniform draw from [0, 1], and that v is added to X(t) if the mark falls below f(X(t) ∪ {v})/[Kf(X(t))]. Suppose X(0) ⊆ D(0).
Then since deaths are always accepted in both chains, but a birth in the dominating chain might not occur in the target chain, the dominating configuration will be a superset of the target configuration for all t ≥ 0. The swap move will be used when the blocking of a birth v can be linked to a single point w ∈ X(t). Consider an example.

Strauss model
In the Strauss model [17, 24], the density has a factor that is exponential in the number of pairs of points that lie within distance R of each other. Let ρ be a metric on S (usually Euclidean distance); then the density can be written

f_S(x) = Z(β₁, β₂, R)⁻¹ β₁^{|x|} β₂^{s(x)},  s(x) = Σ_{{v,v′}: v∈x, v′∈x\{v}} 1(ρ(v, v′) ≤ R), (6)

where Z(β₁, β₂, R) is the normalizing constant for the density. As noted in [17], in order for Z(β₁, β₂, R) to be finite (and hence for the density to exist) β₂ must be at most 1.

Let x be the state of the target chain, and suppose point v is born in the dominating chain. Call point w ∈ x a neighbor of v if ρ(v, w) ≤ R. The Strauss process is locally stable with K = β₁, so the chance of accepting v into x is f(x ∪ {v})/[Kf(x)] = β₂^{s(x,v)}, where s(x, v) = Σ_{v′∈x} 1(ρ(v′, v) ≤ R) is the number of neighbors of v in x.

Let Bern(p) denote the Bernoulli distribution with parameter p. One way to draw B ~ Bern(β₂^{s(x,v)}) is to draw B₁, ..., B_{s(x,v)} i.i.d. from Bern(β₂) and set B = B₁B₂···B_{s(x,v)}. (Here i.i.d. means the draws are independent and identically distributed.) When B_i = 0, say that the point indexed by i blocks the birth of v. Suppose that v is blocked by a single neighbor w. Then the swap move removes w, and allows the birth of v. Call this new configuration x′. The probability of swapping from x to x′ (given birth v) is β₂^{s(x,v)−1}(1 − β₂). This makes it straightforward to check that f(x)s(x, w, v) = f(x′)s(x′, v, w), so (5) is satisfied. To implement this swap move, simply mark each point v born in D(t) with an i.i.d. sequence of Bern(β₂) random variables.
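The mark-based acceptance and the swap move for the Strauss process can be sketched as follows. This is a simplified single-chain view (points in the unit square, Euclidean ρ, illustrative names), not the full coupled construction:

```python
import math
import random

def strauss_birth(x, v, beta2, R, rng):
    """Resolve a proposed birth at v for the Strauss process.  Each
    neighbor of v (a point of x within distance R) draws an independent
    Bernoulli(beta2) mark; a mark of 0 blocks the birth.  With no
    blockers the birth occurs; with exactly one blocker w, the swap
    move removes w and allows the birth; otherwise nothing changes."""
    neighbors = [w for w in x if math.dist(w, v) <= R]
    blockers = [w for w in neighbors if rng.random() >= beta2]
    if not blockers:
        return x + [v]                       # unblocked birth
    if len(blockers) == 1:
        w = blockers[0]                      # swap: w out, v in
        return [p for p in x if p != w] + [v]
    return list(x)                           # two or more blockers: no move
```

Note that each neighbor blocks independently with probability 1 − β₂, so the birth survives all marks with probability β₂^{s(x,v)}, matching the acceptance probability above.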
3. Perfect simulation by dominated CFTP
In the previous section it was shown how to couple a dominating chain and target chain using standard birth–death chains and the new birth–death swap chain. Here a further coupling is built that allows exact draws to be taken from the stationary distribution of the target chain using the dominated CFTP (dCFTP) method of Kendall and Møller [19].

Both X(t) and D(t) are time-reversible, so they can be run backwards in time as easily as forwards while maintaining the property that if X(0) ⊆ D(0), then X(t) ⊆ D(t) for all t ∈ (−∞, 0]. To these two chains are added a lower chain and upper chain, denoted L(t) and U(t), respectively. Suppose that these four chains have the sandwiching property that

L(t) ⊆ X(t) ⊆ U(t) ⊆ D(t) for all t ∈ (−∞, 0]. (7)

The process (L(t), U(t)) can also be thought of as a bounding process for X(t) (see [14]). Suppose X(0) is drawn from the stationary distribution. Then if L(0) = U(0), X(0) also equals the lower and upper chain, and the state they all equal is a draw from the stationary distribution. This is the idea behind CFTP.

For each positive integer N, a lower and upper chain can be created. Consider D(t) moving backward through time, and let τ_N denote the time where the Nth backward event occurs. Set L_N(τ_N) to the empty configuration, and U_N(τ_N) = D(τ_N).

Every time there is an event at time t (either a birth or death in the dominating process moving forwards in time) it is important to ensure that U_N(t) and L_N(t) continue to bound X(t) once the event updates the chain. That is, if a point v is added to the target chain state, it must also be added to the upper chain. If a point w is removed from the target chain state, it must also be removed from the lower chain. Such a coupling has the funneling property (see [3]). All the couplings used here have this important property. An induction argument shows that the funneling property implies L_N(0) ⊆ X(0) ⊆ U_N(0).
Note if L_N(0) = U_N(0), then X(0) is trapped between them and also equals this common value. This is the coupling part of CFTP.

The "from the past" part of CFTP works as follows. Suppose L_N(0) ≠ U_N(0). Then increase the value of N and try again. Let N′ > N. The first N events for the dominating process (looking backward in time from time 0) have already been generated, and these same events must be used in subsequent evaluations of the bounding process. Therefore, only N′ − N additional events need to be generated. Once these events have been generated, run L_{N′} and U_{N′} forward until L_{N′}(0) and U_{N′}(0) can be compared.

If L_N(0) = U_N(0) for some N, then L_{N′}(0) = U_{N′}(0) for all N′ > N as well, so it is not necessary to try every value of N. Propp and Wilson [22] noted that by doubling N at each step, the total number of checked events is at most twice the minimum number. The choice of N_initial is arbitrary, but L_N(0) cannot equal U_N(0) unless every point in D(τ_N) has died by time 0. For simplicity, here N_initial is set equal to the expected number of points in the dominating process at time 0, which is Kλ(S) (see [3] for a more advanced approach to choosing N_initial).

Kendall and Møller showed (Theorem 2.1 of [19]) that as long as the probability that D(t) visits the empty configuration in [0, t] goes to 1 as t goes to infinity, this procedure will terminate in finite time with probability 1. The resulting configuration L_N(0) = U_N(0) is a draw exactly from the target distribution.

Now consider the question: how should the lower and upper chains be updated for each event in the dominating process so the funneling property holds for the swap move? For a jump process A(t), let A(t−) denote the limit as ε goes to 0 of A(t − ε), that is, the state of the process right before time t. The bounding process needs to be updated if a point is born or dies at time t.
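The doubling scheme just described can be written generically. In this sketch, `gen_event(i)` generates the ith event backwards in time (events are stored so they can be reused) and `run_forward(events)` replays a prefix of the stored stream to produce the pair (L_N(0), U_N(0)); both names are illustrative placeholders:

```python
def dcftp(n_initial, gen_event, run_forward):
    """Dominated CFTP with doubling: if the bounds fail to meet after
    replaying N events, double N, generate only the new events, and
    replay the SAME event stream from further in the past."""
    events = []                  # events[i] is the (i+1)st event back in time
    n = n_initial
    while True:
        while len(events) < n:   # extend the stream further into the past
            events.append(gen_event(len(events)))
        lower, upper = run_forward(events[:n])
        if lower == upper:
            return lower         # coalescence: an exact draw
        n *= 2                   # "from the past": go back twice as far
```

Reusing the stored events is essential: generating fresh events on each attempt would bias the draw.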
The procedure followed is the same as given in [14]. If a point w ∈ X(t−) dies, it is removed from X(t), and so can be removed from both L_N(t) and U_N(t). Now suppose point v is born into the dominating chain at time t.

Case 1: Point v is blocked by at most one point w in U_N(t−). Then X(t−) ⊆ U_N(t−), and so if w ∈ X(t−), then w is swapped away by v, and if w ∉ X(t−), then v can be born. So either way X(t) = X(t−) \ {w} ∪ {v}, w is removed from U_N(t) (and L_N(t) if it is there also), and v is added to both L_N(t) and U_N(t).

Case 2: The point v is blocked by at least two points in L_N(t−). Then there are at least two blocking points in X(t−), so the birth does not occur in L_N(t), X(t) or U_N(t).

Case 3: The point v is blocked by at most one point in L_N(t−), and at least two points in U_N(t−). Then if X(t−) contains the two blocking points in U_N(t−), the swap does not occur, but if it only contains the single blocking point in L_N(t−), the swap does occur. The result is that the birth v must be added to U_N(t) (but not to L_N(t)) to ensure X(t) ⊆ U_N(t), and any blocking point in L_N(t−) must be removed from L_N(t).

Figure 1.
Running time of dCFTP for the Strauss model on S = [0, 1]² with λ Lebesgue measure.

Figure 1 shows the running time advantage gained by using the swap move. The times are measured in number of events generated by dominated CFTP (dCFTP). On the left are the raw number of times for the chain without the swap move and with the swap move. The plot on the right shows the ratio of these two times. Note that as β₁ gets larger, the speedup gained by using the swap move also increases.
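The three case updates above reduce to a single set-update rule for the bounding pair. A sketch with set-valued states, where `blockers_upper` is the set of points of the upper bound whose marks block the birth of v (names illustrative):

```python
def update_bounds(lower, upper, v, blockers_upper):
    """Update the lower/upper bounding sets for a birth v, where
    blockers_upper is the set of points of `upper` whose marks block v;
    the blockers seen by `lower` are those that also lie in `lower`.
    Returns the new (lower, upper) pair."""
    blockers_lower = blockers_upper & lower
    if len(blockers_upper) <= 1:
        # Case 1: any single blocker is swapped away, then v is born
        lower = (lower - blockers_upper) | {v}
        upper = (upper - blockers_upper) | {v}
    elif len(blockers_lower) >= 2:
        # Case 2: the birth fails in every chain between lower and upper
        pass
    else:
        # Case 3: v may or may not enter the target chain, so v joins
        # upper only, and a possible lower blocker leaves lower
        upper = upper | {v}
        lower = lower - blockers_lower
    return lower, upper
```

Each branch preserves the invariant lower ⊆ target ⊆ upper for any target chain consistent with the marks.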
4. Analyzing the running time
Consider how many events must be generated before the dominated coupling from the past procedure terminates, that is, before U_N(0) = L_N(0). Deaths in U_N(t) \ L_N(t) cause the bounding process to move together, while births can add a point to U_N(t) but not to L_N(t), and the swap move sometimes removes a point from L_N(t) but not U_N(t). Therefore, it is reasonable that the perfect simulation algorithm will run faster in situations where the birth rate is low.

In this section it is shown that, for perfect simulation of the Strauss process, the original no swap chain takes (with high probability) a small number of steps per perfect sample when β₁ and R are not too large, and β₂ is not too small. By creating a mixture of the swap chain and no swap chain, it is possible to improve this result so that it applies for values of β₁ that are twice as large as for the no swap chain.

The mixture works as follows: at each step, with probability p_swap, the swap move chain is used, while with probability 1 − p_swap, the original no swap chain is used. The best theoretical bound is achieved when p_swap = 1/4.

Theorem 4.1.
Suppose that N events are generated backwards in time and then run forward to get U_N(0) and L_N(0). Let B(v, R) be the area within distance R of v ∈ S, and let r = sup_{v∈S} λ(B(v, R)). If β₁(1 − β₂)r < 1, then for the chain without the swap move

P(U_N(0) ≠ L_N(0)) ≤ 2 exp(−0.09N) + β₁λ(S) exp(−N(1 − β₁(1 − β₂)r)/(4β₁λ(S))). (8)

If β₁(1 − β₂)r < 2, then for the chain where a swap is executed with probability 1/4,

P(U_N(0) ≠ L_N(0)) ≤ 2 exp(−0.09N) + β₁λ(S) exp(−N(1 − 0.5β₁(1 − β₂)r)/(4β₁λ(S))). (9)

Why is a bound of this form useful? Suppose P(U_N(0) ≠ L_N(0)) ≤ a exp(−bN), and let T be the number of events generated in a call of dCFTP. Then for T ≥ t, dCFTP must have failed on a run of length at least t/2. So

E[T] = Σ_{N=1}^∞ P(T ≥ N) ≤ Σ_{N=1}^{⌈(2/b) ln a⌉} 1 + Σ_{N=⌈(2/b) ln a⌉}^∞ a exp(−bN/2), (10)

which makes E[T] = O(ln a/b), and the mean running time O(β₁λ(S) ln(β₁λ(S))) for the no swap chain when β₁(1 − β₂)r < 1, and for the swap chain when β₁(1 − β₂)r < 2.

Proof of Theorem 4.1.
Recall U_N(τ_N) = D(τ_N), a Poisson spatial point process with parameter β₁λ(S). L_N(τ_N) is the empty configuration, and the bounding processes are run forward in time. Let Q(t) be the number of points in U_N(t) \ L_N(t). Then the chains have come together if and only if Q(0) = 0. Begin by considering the no swap chain.

Strauss no swap move. All individual death rates are 1, so the total rate of deaths of points counted by Q(t) is just Q(t). Call a death a good event since it reduces Q(t) by 1. For Q(t) to increase by 1 (call this a bad event), a birth must occur at v and be added to U_N(t) but not L_N(t). Let w be any point of U_N(t) \ L_N(t). Then for w to give rise to another such point, a point v must be born within distance R of w and the Bern(β₂) draw must be 0. The area surrounding w is at most r, and the Bernoulli draw acts as a thinning procedure in a Poisson process (see Appendix G of [20]). So the rate at which w creates new points counted by Q(t) is at most β₁(1 − β₂)r, and the overall rate of bad events is at most β₁(1 − β₂)rQ(t).

Suppose the rate of bad events is smaller than the rate of good events. The probability that one event occurs in the time interval from t to t + h is proportional to h, and the probability that n events occur is O(h^n). Hence

E[E[Q(t + h) | U(t), L(t)] − Q(t)] ≤ E[(Q(t)β₁(1 − β₂)r − Q(t))h + Σ_{i=2}^∞ iO(h^i)],

so

lim_{h→0} E[E[Q(t + h) | U(t), L(t)] − Q(t)]/h ≤ −E[Q(t)(1 − β₁(1 − β₂)r)].

Let q(t) = E[Q(t)], and let τ_N be the time of the Nth event moving backwards in time. Then q(τ_N) ≤ E[|D(τ_N)|] = β₁λ(S), so together with q′(t) ≤ −q(t)(1 − β₁(1 − β₂)r):

q(t) ≤ β₁λ(S) exp(−t(1 − β₁(1 − β₂)r)).

By Markov's inequality, P(Q(0) ≠ 0) = P(Q(0) ≥ 1) ≤ q(0). Now fix N, the number of events to run back in time, and set t = N/[4β₁λ(S)]. The chance Q(0) does not equal 0 starting at −t is at most

β₁λ(S) exp(−N(1 − β₁(1 − β₂)r)/[4β₁λ(S)]).

Using Chernoff bounds [5], it can be shown that for A ~ Pois(α), P(A > 2α) ≤ exp(−α(2 ln 2 − 1)). The number of births in t time units is Poisson with mean β₁λ(S)t = N/4, so the probability that more than N/2 of the events are births is at most exp(−(N/4)(2 ln 2 − 1)) ≤ exp(−0.09N). Both the times of the births and times of deaths (viewed individually) are Poisson processes with rate β₁λ(S), therefore the probability that either uses more than N/2 events is at most 2 exp(−0.09N). But if each process used at most N/2 events, then N events puts the user even farther back in time, and if coalescence occurs starting at −t, it will also occur starting at τ_N. Again using the union bound, the probability of failure is at most

2 exp(−0.09N) + β₁λ(S) exp(−N(1 − β₁(1 − β₂)r)/(4β₁λ(S))).

Strauss with swap move. Now consider what happens when p_swap > 0. The rate of good events (deaths) remains unchanged, but the rate of bad events changes. In Section 3, Case 1 leaves Q(t) unchanged or reduces it by 1, Case 2 leaves Q(t) unchanged, and Case 3 increases Q(t) by 1 or 2. To be precise, let A_L be the set of blocking points in L_N(t−), and A_U be the set of blocking points in U_N(t−). Then the situations that change Q(t) are:

Type   |A_U|        |A_L|   Q(t) − Q(t−), no swap   Q(t) − Q(t−), with swap
1      1            0       1                       −1
2      at least 2   1       0                       2
3      at least 2   0       1                       1

Let b₁ denote the area of the region where a birth is Type 1, with b₂ and b₃ defined similarly. Together, the rate of change of Q(t) from births is proportional to

b₁[(1 − p_swap) − p_swap] + b₂[2p_swap] + b₃[(1 − p_swap) + p_swap].

Any point born in the Type 3 region neighbors at least two points of U_N(t−) \ L_N(t−), and points born in the Type 1 or Type 2 regions neighbor at least one. Each such point has at most r area adjacent to it, so b₁ + b₂ + 2b₃ ≤ Q(t)r.
Figure 2. Running time of dCFTP for the Strauss model on S = [0, 1]² with β₁ = 50, λ Lebesgue measure, as p_swap runs from 0 to 1.

The variable p_swap can be set to any number from 0 to 1. Letting p_swap = 1/4 gives

(1/2)b₁ + (1/2)b₂ + b₃ ≤ (1/2)Q(t)r.

Recall the bad event rate when p_swap = 0 was bounded above by β₁(1 − β₂)rQ(t). With p_swap = 1/4, the bad event rate is bounded above by β₁(1 − β₂)rQ(t)/2, and this factor of two carries throughout the remainder of the proof to give (9). □
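The Poisson tail bound used in the proof, P(A > 2α) ≤ exp(−α(2 ln 2 − 1)) for A ~ Pois(α), is easy to check numerically against the exact tail. A quick sanity check, not part of the argument:

```python
import math

def poisson_tail_gt(k, mean):
    """Exact P(A > k) for A ~ Poisson(mean), via the complement."""
    term = math.exp(-mean)
    total = term                      # P(A = 0)
    for i in range(1, k + 1):
        term *= mean / i              # P(A = i) from P(A = i - 1)
        total += term
    return 1.0 - total

for alpha in [1, 5, 20, 50]:
    exact = poisson_tail_gt(2 * alpha, alpha)
    bound = math.exp(-alpha * (2 * math.log(2) - 1))
    assert exact <= bound
```

The bound is loose for small α (for α = 1 the exact tail is roughly an order of magnitude below it) but the exponential decay rate 2 ln 2 − 1 ≈ 0.386 is what drives the exp(−0.09N) term in the theorem.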
Häggström and Steif gave a result similar to the previous theorem for finitary codings for high noise Markov random fields [10], but their analysis involves moving backwards rather than forwards in time, and their result does not employ the swap move.

Figure 2 illustrates the mean run time for a fixed value of λ as the probability of a swap varies from p_swap = 0 up to p_swap = 1. The running time (as measured by generated iterations) decreases as the chance of swapping increases. This same phenomenon was noted for hard core gas models on graphs [13], and at present is unexplained by theory.
5. Conclusions
The regular birth–death chains only move when no point blocks the birth of a point in the dominating process. The birth–death swap chains move when at most one point blocks the birth of a point in the dominating process. This alone means that more moves are being taken, and helps to explain the improved analysis and improved performance when used for perfect sampling with dominated coupling from the past.
References

[1] Azéma, J., Kaplan-Duflo, M. and Revuz, D. (1967). Mesure invariante sur les classes récurrentes des processus de Markov. Z. Wahrsch. Verw. Gebiete.
[2] Baddeley, A.J. and van Lieshout, M.N.M. (1995). Area-interaction point processes. Ann. Inst. Statist. Math.
[3] Berthelsen, K.K. and Møller, J. (2002). A primer on perfect simulation for spatial point processes. Bull. Braz. Math. Soc. (N.S.)
[4] Carter, D.S. and Prenter, P.M. (1972). Exponential spaces and counting processes. Z. Wahrsch. Verw. Gebiete.
[5] Chernoff, H. (1952). A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Statistics.
[6] Clifford, P. and Nicholls, G. (1994). Comparison of birth-and-death and Metropolis–Hastings Markov chain Monte Carlo for the Strauss process. Unpublished manuscript.
[7] Dyer, M. and Greenhill, C. (2000). On Markov chains for independent sets. J. Algorithms.
[8] Feller, W. (1966). An Introduction to Probability Theory and Its Applications, Vol. II. New York: Wiley. MR0210154
[9] Geyer, C. (1999). Likelihood inference for spatial point processes. In Stochastic Geometry (Toulouse, 1996). Monogr. Statist. Appl. Probab.
[10] Häggström, O. and Steif, J.E. (2000). Propp–Wilson algorithms and finitary codings for high noise Markov random fields. Combin. Probab. Comput.
[11] Häggström, O., van Lieshout, M.C.N.M. and Møller, J. (1999). Characterization results and Markov chain Monte Carlo algorithms including exact simulation for some spatial point processes. Bernoulli.
[12] Harkness, R.D. and Isham, V. (1983). A bivariate spatial point pattern of ants' nests. Appl. Stat.
[13] Huber, M. (2000). A faster method for sampling independent sets. In Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms (San Francisco, CA, 2000).
[14] Huber, M. (2004). Perfect sampling using bounding chains. Ann. Appl. Probab.
[15] Illian, J., Penttinen, A., Stoyan, H. and Stoyan, D. (2008). Statistical Analysis and Modelling of Spatial Point Patterns. Chichester: Wiley. MR2384630
[16] Kaspi, H. and Mandelbaum, A. (1994). On Harris recurrence in continuous time. Math. Oper. Res.
[17] Kelly, F.P. and Ripley, B.D. (1976). A note on Strauss's model for clustering. Biometrika.
[18] Kendall, W.S. (1998). Perfect simulation for the area-interaction point process. In Probability Towards the Year 2000 (L. Accardi and C.C. Heyde, eds.) 218–234. New York: Springer.
[19] Kendall, W.S. and Møller, J. (2000). Perfect simulation using dominating processes on ordered spaces, with application to locally stable point processes. Adv. in Appl. Probab.
[20] Møller, J. and Waagepetersen, R.P. (2004). Statistical Inference and Simulation for Spatial Point Processes. Monographs on Statistics and Applied Probability. Boca Raton, FL: Chapman & Hall/CRC. MR2004226
[21] Preston, C.J. (1977). Spatial birth-and-death processes. Bull. Inst. Int. Stat.
[22] Propp, J.G. and Wilson, D.B. (1996). Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures Algorithms.
[23] Ripley, B.D. (1977). Modelling spatial patterns (with discussion). J. Roy. Statist. Soc. Ser. B.
[24] Strauss, D.J. (1975). A model for clustering. Biometrika.
[25] Widom, B. and Rowlinson, J.S. (1970). A new model for the study of liquid-vapor phase transition. J. Chem. Phys. 1670–1684.