Polymer Dynamics via Cliques: New Conditions for Approximations
Tobias Friedrich, Andreas Göbel, Martin S. Krejca, Marcus Pappik
Hasso Plattner Institute, University of Potsdam, Potsdam, Germany
{tobias.friedrich, andreas.goebel, martin.krejca, marcus.pappik}@hpi.de
Abstract
Abstract polymer models are systems of weighted objects, called polymers, equipped with an incompatibility relation. An important quantity associated with such models is the partition function, which is the weighted sum over all sets of compatible polymers. Various approximation problems reduce to approximating the partition function of a polymer model. Central to the existence of such approximation algorithms are weight conditions of the respective polymer model. Such conditions are derived either via complex analysis or via probabilistic arguments. We follow the latter path and establish a new condition, the clique dynamics condition, which is less restrictive than the ones in the literature. The clique dynamics condition implies rapid mixing of a Markov chain that utilizes cliques of incompatible polymers that naturally arise from the translation of algorithmic problems into polymer models. This leads to improved parameter ranges for several approximation algorithms, such as a factor of at least 2^{1/α} for the hard-core model on bipartite α-expanders.

Additionally, we apply our method to approximate the partition function of the multi-component hard-sphere model, a continuous model of spherical particles in Euclidean space. To this end, we define a discretization that allows us to bound the rate of convergence to the continuous model. To the best of our knowledge, this is the first algorithmic application of polymer models to a continuous geometric problem and the first rigorous computational result for hard-sphere mixtures.

Keywords:
Markov chain • partition function • Gibbs distribution • approximate counting • abstract polymer model

1 Introduction
Statistical physics models systems of interacting particles as probability distributions. This viewpoint explains a variety of real-world phenomena, including ferromagnetism [28], segregation [49], and real-world network generation [8]. A characteristic of such systems is that they undergo phase transitions depending on some external parameter. Such phase transitions have recently been linked with the tractability of computational tasks. These connections have led to a two-way exchange: tools from statistical physics are used to explain computational phenomena, and tools from computer science are used to explain physical phenomena. An established technique for investigating phase transitions in statistical physics, which involves translating the states of a spin system as perturbations from a ground state [16, Chapter 7], has recently been introduced to computer science as an algorithmic tool for computational tasks of spin systems [27].

To motivate the definition of the central mathematical object of this article, we give a high-level description of how to model a spin system in terms of perturbations from a ground state. Assume we want to study a q-state spin system on a graph G. The states of the spin system are usually mappings σ: V(G) → Q from the vertices of G to some finite set Q. Each such configuration σ has a weight w(σ) ∈ ℝ_{≥0}, and the sum of the weights of all configurations, Z = ∑_σ w(σ), is called the partition function. The probability distribution that characterizes our system gives µ(σ) = w(σ)/Z for each configuration σ. Let σ₀ be the ground state we use in this translation. Given a configuration σ, we identify the set of vertices D ⊆ V(G) where, for each v ∈ D, we have σ(v) ≠ σ₀(v). Observe that we can uniquely identify this configuration by a set Γ whose elements γ consist of a connected component of G[D] together with the restriction of σ to this component.
Furthermore, we assign a weight w_γ to each γ ∈ Γ such that ∏_{γ∈Γ} w_γ = w(σ)/w(σ₀). Thus, provided that all such sets of pairs Γ contain no two pairs γ, γ′ that are incompatible, i.e., Γ cannot be uniquely decoded to an assignment because, for example, γ and γ′ map the same vertex to a different element in Q, there is a bijection between the configurations σ and the sets Γ. Furthermore, the distribution µ is expressed as a distribution over the sets Γ, since it retains the property that the probability of Γ is proportional to its weight. Such a construction suggests the following definition.

A polymer model P = (C, w, ≁) is a tuple consisting of a non-empty, countable set C, a set w = {w_γ}_{γ∈C} of positive real weights, and a reflexive and symmetric relation ≁ ⊆ C × C. The elements γ ∈ C are called polymers. The relation ≁ is called the incompatibility relation and, for γ, γ′ ∈ C, we say that γ and γ′ are incompatible if γ ≁ γ′, and that they are compatible otherwise. In addition, we call a finite subset Γ ⊆ C a polymer family if and only if all polymers of Γ are pairwise compatible. Given a polymer model P, we let F(P) denote the set of all polymer families of P. Note that F(P) is countable. The partition function of P is defined to be

Z(P) = ∑_{Γ∈F(P)} ∏_{γ∈Γ} w_γ, (1)

which we require to be finite. Further, the Gibbs distribution of P is the probability distribution µ(P) over F(P) such that, for all Γ ∈ F(P),

µ(P)(Γ) = (∏_{γ∈Γ} w_γ) / Z(P). (2)

A helpful interpretation for understanding the definition of a polymer model is the following. Ignoring the reflexivity of ≁, we view the pair (C, ≁) as a graph, which we call the polymer graph.
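For intuition, on a tiny instance the partition function (1) and the Gibbs distribution (2) can be computed by brute-force enumeration of all compatible families. The following sketch uses an arbitrary toy weighting of our own choosing and is purely illustrative; the models considered later are far too large for this.

```python
from itertools import combinations

# Toy polymer model: four polymers with weights w_gamma and an
# incompatibility relation given as unordered pairs (reflexivity implicit).
weights = {0: 0.5, 1: 0.25, 2: 0.25, 3: 0.5}
incompatible = {(0, 1), (1, 2)}

def compatible(a, b):
    return a != b and (a, b) not in incompatible and (b, a) not in incompatible

def families(polymers):
    """Yield every set of pairwise compatible polymers, including the empty set."""
    for r in range(len(polymers) + 1):
        for subset in combinations(polymers, r):
            if all(compatible(a, b) for a, b in combinations(subset, 2)):
                yield subset

def weight(family):
    prod = 1.0
    for gamma in family:
        prod *= weights[gamma]
    return prod

# Equation (1): Z(P) is the sum of the family weights.
Z = sum(weight(fam) for fam in families(list(weights)))
# Equation (2): the Gibbs probability of a family is its weight divided by Z.
gibbs = {fam: weight(fam) / Z for fam in families(list(weights))}
```

Note that the empty family always contributes 1 to the sum, so Z(P) ≥ 1 for every polymer model.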
We observe that the families of F(P) correspond to the independent sets of (C, ≁). Thus, for the special case where w_γ = λ ∈ ℝ_{>0} for each γ ∈ C, the distribution µ is the hard-core model [48] on the polymer graph and Z(P) is the independence polynomial [44].

In this article, we consider the following two computational tasks:
(1) Approximately sampling from the Gibbs distribution of a polymer model, that is, returning a random family Γ from a distribution whose total variation distance from µ(P) is at most ε.
(2) Returning an estimate Z̃ such that (1 − ε)Z(P) ≤ Z̃ ≤ (1 + ε)Z(P).

There is an expanding list of results that utilize abstract polymer models to obtain efficient approximation and sampling algorithms for new parameter regimes of various spin systems on graphs. This line of research was initiated by Helmuth et al. [27], who used polymers to obtain polynomial-time approximation and sampling algorithms in a regime where the weight of the interactions of particles with an external field is low. Creative ways of translating a spin system into a polymer model utilize restrictions upon the input graph of a spin system in order to yield polynomial-time approximation algorithms for problems that are hard to approximate on general inputs. Such examples include spin systems on expander graphs [19, 30, 38], the hard-core model on unbalanced bipartite graphs [5], and the ferromagnetic Potts model on d-dimensional lattices [2]. Polymer models have also been used to approximate and sample edge spin systems (holant problems) at low temperatures [6].

Translating a spin system on a graph G with n vertices into an abstract polymer model commonly results in a polymer model that contains a number of polymers exponential in n, as can be observed in our earlier discussion of such a translation. Therefore, the approximation and sampling algorithms we are interested in have runtime polynomial in n. There are two main algorithmic approaches for such algorithms.
(i) Cluster expansion. This approach considers complex weights for the polymers and is based on the cluster expansion, an infinite series expansion of ln Z. The essential element for polynomial-time computation is a theorem of Kotecký and Preiss [36, Theorem 1], a condition for establishing absolute convergence of the cluster expansion. By satisfying the Kotecký–Preiss condition, one can truncate the cluster expansion to its most significant terms and obtain an ε-additive approximation for ln Z. Computing the significant terms of the cluster expansion can be achieved by enumerating connected induced subgraphs of the polymer graph of size up to log |C|. Using an algorithm of Patel and Regts [42], the enumeration takes polynomial time in terms of the input graph of the spin system. The ε-additive approximation of ln Z then yields an ε-approximation for Z. The runtime of this approach is commonly O(n^{log ∆}), where ∆ is the maximum degree of the input graph G for the spin system and n = |V(G)|. Approximating Z together with the self-reducibility of the obtained polymer model gives a sampling algorithm for µ(P), as shown by Helmuth et al. [27].

(ii) Markov chain Monte Carlo. The first to use the Markov chain Monte Carlo method on polymers were Chen et al. [7]. The idea of this method is to define a Markov chain with state space F(P) and with stationary distribution µ(P). The Markov chain requires the polymer model to have originated from a spin system on a graph G with n vertices. In each iteration, the chain samples a polymer γ with probability proportional to its weight w_γ and then adds or removes γ from its state if possible. When the mixing condition [7, Definition 1] is satisfied, it is shown that the Markov chain converges to µ(P) after O(n log n) many iterations. The mixing condition matches a convergence condition arising from an analysis by Fernández et al. [14] of another stochastic process of polymers on lattices.
An ε-approximate sampler for µ(P) can be obtained by simulating the Markov chain. The computational challenge for this approach is to sample the polymer γ in order to perform a transition of the Markov chain. As Chen et al. [7] show, this can be done in expected constant time provided the sampling condition [7, Definition 4] is satisfied. This results in an O(n log n) algorithm for sampling from the Gibbs distribution of a spin system. Using simulated annealing, Chen et al. show that this sampler can be converted to a randomized approximation scheme (FPRAS) for Z that runs in expected O(n² log n) time.

Comparison of the known conditions.
A number of conditions for the convergence of the cluster expansion have appeared in the literature, such as [11, 15, 36]. The condition of Fernández and Procacci [15] is the least restrictive among them, that is, the Kotecký–Preiss condition [36] and others appearing in the literature imply the Fernández–Procacci condition. Thus, using the Fernández–Procacci condition, one could potentially obtain approximation algorithms for broader parameter ranges than the ones obtained by using the Kotecký–Preiss condition [36]. However, the condition by Kotecký and Preiss [36] is convenient to apply in polymer models of vertex spin systems and comes with implications on the rate of convergence of the cluster expansion used in algorithmic settings. When compared to cluster expansion conditions (restricted to non-negative real weights), the mixing condition of Chen et al. [7] is less restrictive than the Kotecký–Preiss condition; however, it is incomparable with the condition of Fernández and Procacci [15]. Note that the sampling condition [7] is the most restrictive of the aforementioned conditions.
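For concreteness, conditions of this kind are straightforward to verify on finite polymer models. The sketch below checks the Kotecký–Preiss condition in its common weight-function form, i.e., that ∑_{γ′ incompatible with γ} w_{γ′} e^{f(γ′)} ≤ f(γ) holds for every polymer γ and some f: C → ℝ_{≥0}; both the toy instance and the choice of f are our own illustrative assumptions, not taken from the text.

```python
import math

def satisfies_kotecky_preiss(polymers, weights, incompatible, f):
    """Check sum over gamma' incompatible with gamma of
    w(gamma') * exp(f(gamma')) <= f(gamma) for every polymer gamma.
    Reflexivity: every polymer is incompatible with itself."""
    def clash(a, b):
        return a == b or (a, b) in incompatible or (b, a) in incompatible
    return all(
        sum(weights[h] * math.exp(f[h]) for h in polymers if clash(g, h)) <= f[g]
        for g in polymers
    )

# Example: three polymers in a path of incompatibilities, with f identically 1.
ok = satisfies_kotecky_preiss(
    [0, 1, 2], {0: 0.1, 1: 0.1, 2: 0.1}, {(0, 1), (1, 2)},
    {0: 1.0, 1: 1.0, 2: 1.0},
)
```

Raising the weights (say, to 0.5 each) makes the same instance fail the check, which illustrates how such conditions carve out low-weight parameter regimes.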
We study a new Markov chain (X_t)_{t∈ℕ} for abstract polymer models whose stationary distribution is µ(P). The dynamics of our Markov chain are based on a clique cover, that is, a set Λ = {Λ_i}_{i∈[m]} with ⋃_{i∈[m]} Λ_i = C such that the polymers in each clique Λ_i are pairwise incompatible. Observe that when we consider families of compatible polymers in Λ_i, they contain at most one polymer. At each step, our Markov chain chooses a clique Λ_i in Λ uniformly at random and samples a family in Λ_i according to the distribution µ|_{Λi}, defined as follows. For γ ∈ Λ_i, we have µ|_{Λi}({γ}) = w_γ/Z|_{Λi} and, for the empty set, µ|_{Λi}(∅) = 1/Z|_{Λi}, where Z|_{Λi} = 1 + ∑_{γ∈Λi} w_γ. If the family chosen is the empty family and X_t contains a polymer from Λ_i, then the chain removes this polymer. If the family chosen contains a polymer, then, if possible, the chain adds this polymer to its state. For a detailed description of our chain, please refer to Definition 3.

Our chain applies to any abstract polymer model, since we can always use the trivial clique cover, where each clique contains exactly one polymer. However, clique covers with a much smaller number of cliques arise naturally from the translation of spin systems into a polymer model. For example, the translation we discussed earlier in the introduction yields a clique cover with n cliques, one for each vertex in the original graph G. Since such cliques commonly have exponential size, our chain utilizes the fact that a family of compatible polymers may contain at most one polymer from each Λ_i. The chain in Chen et al. [7] also utilizes this fact, however, in a more restricted setting and with a different sampling distribution for each vertex-clique. An additional nice feature of our chain is that it coincides with the (spin) Glauber dynamics (cf. [13, insert/delete chain]) when considered with the trivial clique cover.
This comes from the choice of sampling from µ|_{Λi} for each clique chosen at each iteration. Central to our mixing time analysis for this chain is the following condition.

◮ Condition 1 (clique dynamics).
Let P = (C, w, ≁) be a polymer model, and let f: C → ℝ_{>0}. We say that P satisfies the clique dynamics condition with f if and only if, for all γ ∈ C, it holds that

∑_{γ′∈C: γ′≁γ, γ′≠γ} f(γ′) w_{γ′}/(1 + w_{γ′}) ≤ f(γ). ◭

We show that when the clique dynamics condition is satisfied, the mixing time of our Markov chain is polynomial in the number of cliques in our clique cover and logarithmic in the choice of the function f (Theorem 6). Note that our condition does not exclude polymer models with w_γ ≥ 1 for some polymers γ. When restricted to the setting of Chen et al. [7], the clique dynamics condition is implied by the mixing condition and is thus less restrictive. Involving the function f in our condition makes it easily comparable with the conditions for cluster expansion. As we discuss in Section 3.1, we show that the clique dynamics condition is more general than the Fernández–Procacci condition [15] for the cluster expansion, and consequently more general than the Kotecký–Preiss condition [36]. An interesting implication of our analysis is that cluster expansion conditions imply our condition for the mixing time of a Markov chain. To the best of our knowledge, this is the first such connection. We conjecture that our condition can be further generalized, as tightness results indicated by hardness of approximation are restricted to the special case of the hard-core model [1, 17, 18, 45]. No such results are known for polymer models.

To obtain the mixing time bound, we use coupling. The high-level idea of this technique is to define a potential δ expressing distances between the states of the Markov chain Z_t. If two (commonly correlated) copies (X_t, Y_t) of the Markov chain Z_t after k transitions result in states such that δ(X_k, Y_k) = 0, then k bounds the mixing time of the Markov chain Z_t. One way of using this method, known as path coupling, is to define the metric on only adjacent pairs of states; in our setting, these are polymer families Γ, Γ′ where Γ ∪ {γ} = Γ′ for some polymer γ. The metric δ is then extended to all pairs in the state space by considering a shortest path of adjacent pairs and summing their distances in terms of δ. To obtain a bound on the mixing time, we could then apply a theorem of Dyer and Greenhill [12]. The theorem requires showing that the distance between X_t and Y_t, when they are in adjacent states, does not increase in expectation, i.e., E[δ(X_{t+1}, Y_{t+1}) | X_t, Y_t] ≤ δ(X_t, Y_t). When the latter inequality is strict, the theorem implies a bound on the mixing time of the Markov chain Z_t that is logarithmic in D, the diameter of the metric space defined by δ. When the inequality is not strict, this results in a mixing time linear in D.

When interested in polymer models that come from a spin system on a graph G, the diameter D of the metric chosen on the state space of X_t can be exponential in n, as it depends on the choice of the function f appearing in the clique dynamics condition. We show that if E[δ(X_{t+1}, Y_{t+1}) | X_t, Y_t] < δ(X_t, Y_t) for some appropriately chosen δ, then it suffices to use the theorem of Dyer and Greenhill [12] to obtain the required mixing time bound. This is achieved if we assume that the inequality in the clique dynamics condition is strict. However, the condition then becomes incomparable with the Fernández–Procacci condition [15]. To make the conditions comparable, we revisit a theorem of Greenberg et al. [20] that gives a mixing time bound logarithmic in D when E[δ(X_{t+1}, Y_{t+1}) | X_t, Y_t] ≤ δ(X_t, Y_t). However, as we discuss in the appendix, the theorem is not applicable when one considers only adjacent pairs (X_t, Y_t) with respect to δ.
The theorem does hold, though, when one performs the analysis over all pairs of states of the Markov chain, as we show in Section 3, which gives a mixing time bound that is polynomial in the number of cliques of the polymer model, assuming the clique dynamics condition.

Our main algorithmic result uses our Markov chain and the bound on its mixing time to approximately sample from µ.

◮ Theorem 9.
Let P = (C, w, ≁) be a computationally feasible polymer model, let Λ be a polymer clique cover of P with size m, and let Z_max = max_{i∈[m]} {Z|_{Λi}}. Furthermore, assume that
(a) Z_max ∈ poly(m),
(b) P satisfies the clique dynamics condition for a function f such that, for all γ ∈ C, it holds that e^{−poly(m)} ≤ f(γ) ≤ e^{poly(m)}, and that
(c) for all i ∈ [m], we can sample from µ|_{Λi} in time poly(m).
Then, for all ε ∈ (0, 1], we can ε-approximately sample from µ in time poly(m/ε). ◭

Additionally, as we discuss in Section 4.2, we use self-reducibility on the clique cover and use the above theorem to obtain an ε-approximation algorithm for the partition function Z.

◮ Theorem 12.
Let P = (C, w, ≁) be a computationally feasible polymer model, and let Λ be a polymer clique cover of P with size m. Assume that P satisfies the conditions of Theorem 9. Then, for all ε ∈ (0, 1], there is a randomized ε-approximation of Z computable in time poly(m/ε). ◭

Since it is common for spin systems on graphs with n vertices to translate into polymer models with a clique cover of n cliques, the above theorems imply polynomial-time algorithms for their respective problems. Assumption (a) is trivially satisfied for the applications we consider and, furthermore, assumption (b) allows for a broad range in the choice of the function f appearing in the clique dynamics condition.

When we apply the above theorems to the spin systems previously studied in the literature, assumption (c) is not straightforward to satisfy, as the sizes of the cliques are commonly exponential in n = |V(G)|. Chen et al. [7] used the sampling condition in order to sample polymers in expected constant time. As we are interested in extending the parameter range while remaining in the realm of polynomial-time computations, we do not need to use such a restrictive condition. For this purpose, we introduce the clique truncation condition.

◮ Condition 24 (clique truncation).
Let P = (C, w, ≁) be a polymer model, let Λ be a polymer clique cover of P with size m, and let |·| be a size function for P. For all i ∈ [m], we say that Λ_i satisfies the clique truncation condition for a monotonically increasing, invertible function g: ℝ → ℝ_{>0} and a bound B ∈ ℝ_{>0} if and only if

∑_{γ∈Λi} g(|γ|) w_γ ≤ B. ◭

We show that when the clique truncation condition is satisfied, we can reduce the size of each clique to a polynomial in n by removing low-weight polymers from the polymer model. More precisely, Corollary 26 states that, for an ε-approximation, it is sufficient to consider only polymers γ with |γ| ≤ g^{−1}(Bm/ε). This allows us to use the algorithm of Patel and Regts [42] to sample from the Gibbs distribution of each clique by enumerating all polymers in the clique. In all our calculations, the parameter range restrictions imposed by the clique truncation condition are weaker than the ones imposed by the clique dynamics condition. As illustrated in Table 1, this leads to improved parameter ranges for spin systems previously studied in the literature (see Section 6.1 for a detailed discussion of the hard-core model on bipartite α-expanders).

We apply our results to the multi-component hard-sphere model, an inherently geometric model that is central to the analysis of the thermodynamics of liquids and liquid mixtures [4, 24]. It is a continuous model that studies the macroscopic behavior and distribution of spherical particles, assuming that the only interaction among the particles is the hard-core interaction, i.e., no two particles can occupy the same space. We are interested in the grand canonical ensemble of the hard-sphere model in a d-dimensional finite hypercube V = [0, ℓ)^d. We consider q different types of particles Q = {(r₁, p₁), …, (r_q, p_q)}, represented as d-dimensional spheres of radius r_i ∈ ℝ_{>0} and a chemical potential p_i ∈ ℝ. For each particle type i ∈ [q], the centers of spherical particles are distributed according to a Poisson point process of intensity e^{p_i} on V. The resulting distribution over all possible system states is characterized by the mixture of these point processes, conditioned on the fact that the particles with radii corresponding to their particle type are non-overlapping. We are interested in approximating the grand canonical partition function of the multi-component hard-sphere model, which is the normalizing constant of the corresponding probability density over the states of the system (see Section 5 for a formal definition).

¹A similar idea was used for the hard-core model on bipartite expanders in the first arXiv version of Chen et al. [7].

Table 1: Improvement on the parameter ranges of our technique for problems with known approximation algorithms. Note that for a fair comparison we refined the calculations of the bounds in [30] in a similar fashion as in Section 6.1.

Problem | Previous range | New range
Hard-core model on bipartite α-expanders | λ > (e∆)^{1/α} [30] | λ ≥ (0.5 e∆)^{1/α}
q-state Potts model on α-expanders | β > (4 + 2 ln(∆q))/α [30] | β ≥ (2 + ln(∆q))/α
Hard-core model on unbalanced bipartite graphs | 6∆_L∆_R λ_R ≤ (1 + λ_L)^{δ_R/∆_L} [5] | 3.3∆_L∆_R λ_R ≤ (1 + λ_L)^{δ_R/∆_L}
Perfect matching polynomial | z ≤ (√(6.9(∆ − 1)))^{−1} [6] | z ≤ (√(1.7(∆ − 1)))^{−1}

Related work
Most rigorous algorithmic results for the hard-sphere model are restricted to the special case of a single component, i.e., one type of particle. Note that the one-particle model has been used to obtain bounds for the optimal sphere-packing density [9, 10, 23, 29, 43]. This model carries historic weight, as in the seminal work of Metropolis et al. [39], the Monte Carlo method was introduced on a two-dimensional single-component hard-sphere model of 224 particles. Approximate-sampling Markov chain approaches have mainly focused on the canonical ensemble of the model, that is, the distribution defined over a fixed number of spheres [25, 29, 33]. Considering the grand canonical ensemble, exact sampling algorithms have appeared in the literature for the two-dimensional model without asymptotic runtime guarantees [34, 35, 41]. Guo and Jerrum [22] introduced an exact sampling algorithm for the grand canonical ensemble of the hard-sphere model in d dimensions. The model they consider consists of a single type of particle of radius 1 and chemical potential ln(λ/v_d), where v_d is the volume of the d-dimensional sphere of radius 1 and λ ∈ ℝ_{≥0}. The algorithm is based on rejection sampling with runtime in O(ℓ^d), using oracle access to a sampler from a continuous Poisson point process. The parameter regime for which their runtime guarantees apply is λ < 2^{−(d+1/2)}. Recently, Helmuth et al. [26] considered the single-center dynamics, a continuous-state-space Markov chain generalizing the Glauber dynamics, in order to study decay of correlations for the model. Their results show that, when λ < 2^{−(d−1)}, the single-center dynamics is rapidly mixing. Finally, we note that the hard-core model can be considered a discrete version of the mono-atomic grand canonical hard-sphere model.
Although tight approximation results for the hard-core model exist, it is not known how these results on a discrete graph topology can be mapped to the original hard-sphere model in continuous space.

Our results
We obtain an ε-approximation algorithm for the grand canonical partition function of the multi-component hard-sphere model (Theorem 19). We show that the runtime of our algorithm is polynomial in ℓ^d, the number of particle types q, and ε^{−1}. To our knowledge, this is the first rigorous algorithmic result to consider multiple components.

To approximate the grand canonical partition function, we consider a discretization of the continuous model where the sphere centers are only allowed to be on grid points. We show that the partition function of the continuous model is closely approximated by the partition function of the discrete model with a sufficient number of points (Lemma 17). This is essentially achieved by giving a lower bound on the rate of convergence of the two functions in terms of the number of grid points considered. This shows that we can obtain an ε-approximation for the continuous model via an ε-approximation for the discrete model. Thus, we define a polymer model for the discrete hard-sphere model in terms of perturbations from the empty state. Our polymers simply consist of a center position on the grid together with a type of particle that occupies it. Two polymers are incompatible if and only if the particles overlap. This translation yields polymer cliques consisting of d-dimensional subgrids. The number of such subgrids only depends on ℓ, the dimension, and the minimum particle radius of our system. Thus, the number of polymer cliques in the cover is independent of the number of grid points chosen to approximate the continuous model. Consequently, the mixing time of our Markov chain is independent of the number of grid points. Note that, for this application, sampling from the distribution of each clique does not require additional assumptions, such as the clique truncation condition.
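The discretization just described is easy to make concrete. The sketch below is purely illustrative, restricted to d = 2 and a hypothetical grid resolution of our choosing: it builds the polymers as pairs of a grid center and a particle type, with incompatibility given by overlapping spheres.

```python
def build_polymers(ell, points_per_unit, radii):
    """Polymers of the discretized model: a grid point of [0, ell)^2 serving
    as a sphere center, together with a particle type index into `radii`."""
    step = 1.0 / points_per_unit
    n = int(ell * points_per_unit)
    grid = [(i * step, j * step) for i in range(n) for j in range(n)]
    return [(center, t) for center in grid for t in range(len(radii))]

def incompatible(p, q, radii):
    """Two polymers clash iff their spheres overlap. This also makes the
    relation reflexive, as every sphere overlaps itself."""
    (x, s), (y, t) = p, q
    dist_sq = (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
    return dist_sq < (radii[s] + radii[t]) ** 2

# One particle type of radius 0.5 on a 2x2 grid of the unit square.
polymers = build_polymers(ell=1.0, points_per_unit=2, radii=[0.5])
```

Grouping polymers whose centers lie in a common subgrid of diameter below the minimum particle diameter yields pairwise incompatible sets, i.e., the polymer cliques described above; refining the grid changes the number of polymers but not the number of such cliques.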
Finally, we convert the sampler for the polymer model into an ε-approximation for the partition function of the discrete model, which translates into an ε-approximation for the grand canonical partition function of the continuous model.

We note that our approximation algorithm does not require access to a continuous sampler. As we show in Section 5.3, when we apply our algorithm to the model with the chemical potential considered by Guo and Jerrum [22] and Helmuth et al. [26], the parameter range we get for one particle is λ < 2^{−d}, improving the bound in Guo and Jerrum [22]. The rapid-mixing bound of λ < 2^{−(d−1)} from Helmuth et al. [26] is achieved via path coupling, using a refined potential that heavily exploits the symmetry of the single-particle model. This metric could be directly applied to the Glauber dynamics of the single-component discrete model (see [47]) and would yield an approximation algorithm for this range. However, it is not obvious whether such a potential can be applied to the multi-component model or to polymer models. Finally, note that a discretization process in the spirit of ours might be applicable to establishing new bounds for the correlation decay of the hard-sphere model using the correlation decay of the polymer model, as hinted in Helmuth et al. [26, Section 1.6].

1.3 Outline

The technical part of our article is structured as follows. We establish notation and introduce the tool for bounding the mixing time of our chain in Section 2. We define and analyze our Markov chain in Section 3. The algorithmic results are stated and proved in Section 4. We then apply our algorithms to the multi-component hard-sphere model in Section 5. In Section 6, we show how to efficiently sample polymers from their respective cliques, which we use to improve the parameter ranges of known algorithmic bounds on spin systems. Finally, in the appendix, we discuss why the theorem for bounding the mixing time of our chain in its original form does not apply.
2 Preliminaries

We denote the set of all natural numbers, including 0, by ℕ and the set of all real numbers by ℝ. For an n ∈ ℕ, let [n] denote the interval [1, n] ∩ ℕ. If the polymer model P is clear from context, we may drop the index and write F, Z, and µ instead of F(P), Z(P), and µ(P), respectively.

We base the transitions of our Markov chain for a polymer model (C, w, ≁) on restricted sets B ⊆ C. We define the set of all polymer families restricted to B to be F|_B = F ∩ 2^B. Further, we define the restricted partition function Z|_B to be equation (1) but with F(P) replaced by F|_B. Similarly, we define the restricted Gibbs distribution µ|_B to be a probability distribution over F|_B, i.e., equation (2) but with Z(P) replaced by Z|_B. Our restrictions are special sets of polymers, which we define next.

By definition, for a polymer model, a polymer family Γ cannot contain incompatible polymers. Thus, when considering a subset B ⊆ C where all polymers are pairwise incompatible, at most one polymer of B is in Γ. We call such a subset B a polymer clique. Last, for an m ∈ ℕ_{>0}, we call a set Λ = {Λ_i}_{i∈[m]} of polymer cliques a polymer clique cover if and only if ⋃_{i∈[m]} Λ_i = C, and we call m the size of Λ. Note that the elements of Λ need not be pairwise disjoint. Further note that, for each i ∈ [m], the partition function restricted to Λ_i boils down to

Z|_{Λi} = ∑_{Γ∈F|_{Λi}} ∏_{γ∈Γ} w_γ = 1 + ∑_{γ∈Λi} w_γ,

as the polymers of Λ_i are pairwise incompatible and thus each family of F|_{Λi} (except ∅) contains a single polymer. Similarly, the Gibbs distribution restricted to Λ_i simplifies to µ|_{Λi}(∅) = 1/Z|_{Λi} = 1/(1 + ∑_{γ∈Λi} w_γ) and, for each γ′ ∈ Λ_i, to µ|_{Λi}({γ′}) = w_{γ′}/Z|_{Λi} = w_{γ′}/(1 + ∑_{γ∈Λi} w_γ).

For a Markov chain M with a unique stationary distribution D and an ε ∈ (0, 1], let τ_M(ε) denote the mixing time of M (with error ε).
That is, τ_M(ε) denotes the first point in time t ∈ ℕ at which the total variation distance between D and the distribution of M at time t is at most ε. In order to bound the mixing time of our Markov chains, we use a theorem by Greenberg et al. [20, Theorem 3].

◮ Theorem 2 (coupling with exponential potential).
Let M be an ergodic Markov chain with state space Ω and with transition matrix P such that, for all x ∈ Ω, it holds that P(x, x) > 0. For d, D ∈ ℝ_{>0} with d ≤ D, let δ: Ω² → {0} ∪ [d, D] be such that δ(x, y) = 0 if and only if x = y. Assume that there is a coupling between the transitions of two copies (X_t)_{t∈ℕ} and (Y_t)_{t∈ℕ} of M such that, for all t ∈ ℕ and all x, y ∈ Ω, it holds that

E[δ(X_{t+1}, Y_{t+1}) | X_t = x, Y_t = y] ≤ δ(x, y). (3)

Furthermore, assume that there are κ, η ∈ (0, 1) such that, for the same coupling and all t ∈ ℕ and all x, y ∈ Ω with x ≠ y, it holds that

Pr[|δ(X_{t+1}, Y_{t+1}) − δ(x, y)| ≥ η δ(x, y) | X_t = x, Y_t = y] ≥ κ. (4)

Then, for all ε ∈ (0, 1], it holds that

τ_M(ε) ≤ ((ln(D/d) + O(1)) / (ln(1 + η)² κ)) ln(1/ε).

If ln(D/d) ∈ Ω(1), then this bound simplifies to

τ_M(ε) ∈ O((ln(D/d) / (ln(1 + η)² κ)) ln(1/ε)). ◭

We use the following formal notion of approximate sampling. Let ν be a probability distribution on a countable state space Ω. For ε ∈ (0, 1], we say that a distribution ξ on Ω is an ε-approximation of ν if and only if d_TV(ν, ξ) ≤ ε, where d_TV(·, ·) denotes the total variation distance. Further, we say that we can ε-approximately sample from ν if and only if we can sample from some distribution ξ such that ξ is an ε-approximation of ν.

We are also interested in approximating the partition function of polymer models, which we define as follows. For x ∈ ℝ_{>0} and ε ∈ (0, 1], we call a random variable X a randomized ε-approximation for x if and only if

Pr[(1 − ε)x ≤ X ≤ (1 + ε)x] ≥ 3/4.

Note that if x is the output to an algorithmic problem on some instance and independent samples of X can be obtained in polynomial time in the instance size and 1/ε, then this translates to the definition of an FPRAS.

3 Polymer dynamics
We analyze the following Markov chain for a polymer model with a polymer clique cover. ◮ Definition 3 (polymer clique dynamics).
Let P be a polymer model, and let Λ be a polymer clique cover of P with size m. We define M(P) to be a Markov chain with state space F. Let (X_t)_{t∈ℕ} denote a (random) sequence of states of M(P), where X_0 is arbitrary. Then, for all t ∈ ℕ, the transitions of M(P) are as follows:
1: choose i ∈ [m] uniformly at random;
2: choose Γ ∈ F|_{Λ_i} according to µ|_{Λ_i};
3: if Γ = ∅ then X_{t+1} = X_t \ Λ_i;
4: else if X_t ∪ Γ is a valid polymer family then X_{t+1} = X_t ∪ Γ;
5: else X_{t+1} = X_t. ◭

Given a polymer model P = (C, w, ≁) and a polymer Markov chain M(P), let P denote the transition matrix of M(P). That is, for all Γ, Γ′ ∈ F(P), the entry P(Γ, Γ′) denotes the probability to transition from state Γ to state Γ′ in a single step. Note that P is time-homogeneous and that, for all Γ, Γ′ ∈ F(P) with P(Γ, Γ′) > 0, the symmetric difference of Γ and Γ′ has a cardinality of at most 1, since the polymer families of a polymer clique are all empty or singletons. Further note that M(P) has a positive self-loop probability, as the polymers from a polymer clique are pairwise incompatible.

The transition probabilities of two neighboring states of M(P) follow a simple pattern. In order to ease notation, for all γ ∈ C, let z_γ = Σ_{i∈[m]: γ∈Λ_i} 1/Z|_{Λ_i}. For all Γ, Γ′ ∈ F(P) such that there is a γ ∈ C with γ ∉ Γ and Γ′ = Γ ∪ {γ}, it holds that

P(Γ, Γ′) = (1/m) Σ_{i∈[m]: γ∈Λ_i} µ|_{Λ_i}({γ}) = (1/m) Σ_{i∈[m]: γ∈Λ_i} w_γ/Z|_{Λ_i} = w_γ z_γ/m > 0 and
P(Γ′, Γ) = (1/m) Σ_{i∈[m]: γ∈Λ_i} µ|_{Λ_i}(∅) = (1/m) Σ_{i∈[m]: γ∈Λ_i} 1/Z|_{Λ_i} = z_γ/m > 0. (5)

We show that the polymer clique dynamics are suitable for sampling from the Gibbs distribution of a polymer model, since the distribution of the Markov chain converges to µ.

◮ Lemma 4.
Let P be a polymer model. The polymer Markov chain M(P) is ergodic withstationary distribution µ (P) . ◭ Proof.
First, note that
M(P) is irreducible, as there is a positive probability to go from any polymer family Γ ∈ F to the empty polymer family ∅ in a finite number of steps by consecutively removing each polymer γ ∈ Γ. Similarly, there is a positive probability to go from ∅ to any polymer family Γ′ ∈ F in a finite number of steps by consecutively adding all polymers γ′ ∈ Γ′.

We proceed by proving that µ(P), which we abbreviate as µ, is a stationary distribution of M(P). To this end, we show that
M(P) satisfies the detailed-balance condition with respect to µ. That is, for all Γ, Γ′ ∈ F, it holds that

µ(Γ) · P(Γ, Γ′) = µ(Γ′) · P(Γ′, Γ). (6)

Note that it is sufficient to check equation (6) for all pairs of states with a symmetric difference of exactly one polymer. Let Γ, Γ′ ∈ F, and assume without loss of generality that Γ′ = Γ ∪ {γ} for some polymer γ ∉ Γ. Note that, by equation (5), P(Γ, Γ′) = w_γ · P(Γ′, Γ). Further, by the definition of the Gibbs distribution, we have µ(Γ′) = w_γ · µ(Γ). Thus, we get

µ(Γ) · P(Γ, Γ′) = µ(Γ) · w_γ · P(Γ′, Γ) = µ(Γ′) · P(Γ′, Γ),

which shows that µ is a stationary distribution of M(P). Finally, we argue that
M(P) is ergodic. Note that an irreducible Markov chain has a stationary distribution if and only if it is positive recurrent. In addition, every state of
M(P) has a positive self-loop probability, which implies that the chain is aperiodic. This shows that
M(P) is ergodic and concludes the proof. □
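The detailed-balance argument can be replayed numerically. The following sketch uses a toy three-polymer model with assumed weights and cliques (not an instance from the paper): it builds the transition probabilities of equation (5) and verifies equation (6) for all pairs of states by exhaustive enumeration.

```python
from itertools import combinations

# Toy polymer model (assumed for illustration): three polymers, where 1 and 2
# are incompatible and hence share a clique. Cliques: L1 = {1, 2}, L2 = {3}.
w = {1: 0.4, 2: 0.7, 3: 0.2}           # polymer weights w_gamma
cliques = [{1, 2}, {3}]                # polymer clique cover, size m = 2
incompatible = {frozenset({1, 2})}     # pairs of distinct incompatible polymers
m = len(cliques)

def is_family(S):
    """A set of polymers is a valid family if it is pairwise compatible."""
    return all(frozenset(p) not in incompatible for p in combinations(S, 2))

families = [frozenset(S) for r in range(4) for S in combinations(w, r) if is_family(S)]

# z_gamma = sum over cliques containing gamma of 1 / Z|_{Lambda_i},
# where Z|_{Lambda_i} = 1 + sum of the weights in the clique.
Z_clique = [1 + sum(w[g] for g in L) for L in cliques]
z = {g: sum(1 / Z_clique[i] for i, L in enumerate(cliques) if g in L) for g in w}

def P(G, G2):
    """Transition probability between neighboring states, as in equation (5):
    adding gamma has probability w_gamma * z_gamma / m, removing it z_gamma / m."""
    diff = G ^ G2
    if len(diff) != 1:
        return 0.0
    (g,) = diff
    return w[g] * z[g] / m if g in G2 else z[g] / m

def mu(G):
    """Unnormalized Gibbs weight of a polymer family."""
    prod = 1.0
    for g in G:
        prod *= w[g]
    return prod

# Detailed balance: mu(G) * P(G, G') == mu(G') * P(G', G) for all pairs.
ok = all(abs(mu(G) * P(G, G2) - mu(G2) * P(G2, G)) < 1e-12
         for G in families for G2 in families)
print(ok)  # True
```

The check succeeds for any choice of positive weights, mirroring the proof: adding γ carries an extra factor w_γ in both P(Γ, Γ′) and µ(Γ′).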
Recall Condition 1 (clique dynamics) from the introduction. Assuming that the conditionholds, we obtain the following bound on the mixing time of
M(P) . ◮ Lemma 5.
Let P = (C, w, ≁) be a polymer model satisfying the clique dynamics condition with function f, and let Λ be a polymer clique cover of P with size m. Then, for all ε ∈ (0, 1], it holds that

τ_{M(P)}(ε) ∈ O( (m²/min_{γ∈C}{z_γ}) · ln( 2m · max_{γ∈C}{f(γ)/(z_γ(1 + w_γ))} / min_{γ∈C}{f(γ)/(z_γ(1 + w_γ))} ) · ln(1/ε) ). ◭

Proof.
We aim to apply Theorem 2, which requires us to define a potential δ. We do so by utilizing the function δ′: C → ℝ_{>0} with γ ↦ f(γ)/(z_γ(1 + w_γ)). Let ⊕ denote the symmetric set difference. For all Γ, Γ′ ∈ F, we define

δ(Γ, Γ′) = Σ_{γ∈Γ⊕Γ′} δ′(γ).

Note that δ(Γ, Γ′) only depends on the symmetric difference of Γ and Γ′ and that δ(Γ, Γ′) = 0 if and only if Γ ⊕ Γ′ = ∅, which only is the case when Γ = Γ′.

We continue by constructing a coupling between two copies of M(P), namely between (X_t)_{t∈ℕ} and (Y_t)_{t∈ℕ}. We couple these chains such that, for each transition,
• both choose the same index i ∈ [m] and
• both draw the same polymer family Γ_∆ ∈ F|_{Λ_i} from µ|_{Λ_i}.
This constitutes a valid coupling, as each chain transitions according to its desired marginal transition probabilities.

We now show for all t ∈ ℕ and Γ, Γ′ ∈ F that

E[δ(X_{t+1}, Y_{t+1}) | X_t = Γ, Y_t = Γ′] ≤ δ(Γ, Γ′).

Note that this trivially holds if Γ = Γ′, as the chains X and Y behave identically from then on. Thus, we are left with the case that Γ ≠ Γ′, which implies that |Γ ⊕ Γ′| ≥ 1. For all γ ∈ C, let N(γ) = {γ′ ∈ C | γ′ ≁ γ, γ′ ≠ γ} denote the neighborhood of γ. We extend this definition to arbitrary subsets of polymers B ⊆ C by N(B) = ∪_{γ∈B} N(γ).

Let ∆ = Γ ⊕ Γ′, and let γ ∈ ∆. Assume without loss of generality that γ ∈ Γ. By equation (5), with probability z_γ/m, the chain X removes γ and the chain Y remains in its state. Consequently, δ(X_{t+1}, Y_{t+1}) decreases by δ′(γ). Similarly, if γ ∈ ∆ \ N(∆), with probability w_γ z_γ/m, the chain Y adds γ and the chain X remains in its state. Again, δ(X_{t+1}, Y_{t+1}) decreases by δ′(γ).

Let δ⁻(Γ, Γ′) denote the expected (conditional) decrease of δ. By the observations above, we see that

δ⁻(Γ, Γ′) = Σ_{γ∈∆} δ′(γ) z_γ/m + Σ_{γ∈∆\N(∆)} δ′(γ) w_γ z_γ/m = Σ_{γ∈∆} δ′(γ) (z_γ/m)(1 + w_γ) − Σ_{γ∈∆∩N(∆)} δ′(γ) w_γ z_γ/m.
Moreover, δ increases whenever a polymer γ is added to only one of the two chains. This only occurs if γ ∈ N(∆) \ ∆ and has probability w_γ z_γ/m for each such polymer. Similarly to the expected decrease, we denote the expected increase by δ⁺(Γ, Γ′). We bound

δ⁺(Γ, Γ′) ≤ Σ_{γ∈N(∆)\∆} δ′(γ) w_γ z_γ/m = Σ_{γ∈N(∆)} δ′(γ) w_γ z_γ/m − Σ_{γ∈∆∩N(∆)} δ′(γ) w_γ z_γ/m
≤ Σ_{γ∈∆} Σ_{γ′∈N(γ)} δ′(γ′) w_{γ′} z_{γ′}/m − Σ_{γ∈∆∩N(∆)} δ′(γ) w_γ z_γ/m.

Together, we obtain

E[δ(X_{t+1}, Y_{t+1}) | X_t = Γ, Y_t = Γ′] = δ(Γ, Γ′) + δ⁺(Γ, Γ′) − δ⁻(Γ, Γ′)
≤ δ(Γ, Γ′) + Σ_{γ∈∆} Σ_{γ′∈N(γ)} δ′(γ′) w_{γ′} z_{γ′}/m − Σ_{γ∈∆} δ′(γ) (z_γ/m)(1 + w_γ)
= δ(Γ, Γ′) + Σ_{γ∈∆} ( Σ_{γ′∈N(γ)} δ′(γ′) w_{γ′} z_{γ′}/m − δ′(γ) (z_γ/m)(1 + w_γ) ).

We proceed by showing that, for each γ ∈ ∆, the respective summand in the sum above is at most zero. By the definition of δ′, we get

Σ_{γ′∈N(γ)} δ′(γ′) w_{γ′} z_{γ′}/m − δ′(γ) (z_γ/m)(1 + w_γ) = (1/m) ( Σ_{γ′∈N(γ)} f(γ′) w_{γ′}/(1 + w_{γ′}) − f(γ) ).
By the definition of N(γ) and since P satisfies the clique dynamics condition, we bound

(1/m) ( Σ_{γ′∈N(γ)} f(γ′) w_{γ′}/(1 + w_{γ′}) − f(γ) ) = (1/m) ( Σ_{γ′∈C: γ′≁γ, γ′≠γ} f(γ′) w_{γ′}/(1 + w_{γ′}) − f(γ) ) ≤ 0.

Consequently, we get that

E[δ(X_{t+1}, Y_{t+1}) | X_t = Γ, Y_t = Γ′] ≤ δ(Γ, Γ′).

We now show that there are values η, κ ∈ (0, 1) such that, for all t ∈ ℕ and all Γ, Γ′ ∈ F with Γ ≠ Γ′, it holds that

Pr[|δ(X_{t+1}, Y_{t+1}) − δ(Γ, Γ′)| ≥ ηδ(Γ, Γ′) | X_t = Γ, Y_t = Γ′] ≥ κ. (7)

Note that every polymer family in F has at most m polymers because it can have at most one polymer from each polymer clique. Thus, for ∆ = Γ ⊕ Γ′, we bound |∆| ≤ 2m. Consequently, there is at least one polymer γ ∈ ∆ such that δ′(γ) ≥ δ(Γ, Γ′)/(2m). Assume without loss of generality that γ ∈ Γ. With probability z_γ/m, chain X deletes γ and chain Y remains in its state, resulting in |δ(X_{t+1}, Y_{t+1}) − δ(Γ, Γ′)| ≥ δ′(γ) ≥ δ(Γ, Γ′)/(2m). Thus, equation (7) is true for η = 1/(2m) and κ = z_γ/m ≥ (min_{γ∈C}{z_γ})/m.

It remains to determine d, D ∈ ℝ_{>0} such that, for all Γ, Γ′ ∈ F with Γ ≠ Γ′, it holds that δ(Γ, Γ′) ∈ [d, D]. Let ∆ = Γ ⊕ Γ′, noting again that |∆| ≤ 2m. We choose

d = min_{γ∈C}{δ′(γ)} = min_{γ∈C}{f(γ)/(z_γ(1 + w_γ))} and D = 2m · max_{γ∈C}{δ′(γ)} = 2m · max_{γ∈C}{f(γ)/(z_γ(1 + w_γ))}.

Applying Theorem 2 and observing that ln(1 + 1/(2m)) ≥ 1/(4m) concludes the proof. □

Last, we combine Lemmas 4 and 5 and obtain the main result of this section.

◮ Theorem 6.
Let P = (C, w, ≁) be a polymer model, let Λ be a polymer clique cover of P with size m, and let Z_max = max_{i∈[m]}{Z|_{Λ_i}}. Further, assume that P satisfies the clique dynamics condition with function f, and let f_max = max_{γ∈C}{f(γ)} and f_min = min_{γ∈C}{f(γ)}. Then the Markov chain M(P) has the unique stationary distribution µ(P) and, for all ε ∈ (0, 1], it holds that

τ_{M(P)}(ε) ∈ O( m² Z_max · ln( m² Z_max² · f_max/f_min ) · ln(1/ε) ). ◭

Proof.
By Lemma 4, it follows that
M(P) has the unique stationary distribution µ(P). The bound on the mixing time follows from Lemma 5 and by observing that, for all γ ∈ C, it holds that 1/Z_max ≤ z_γ ≤ m and 1 ≤ 1 + w_γ ≤ Z_max. □

Note that, if a polymer model P satisfies, for all γ ∈ C and some f: C → ℝ_{>0}, that

Σ_{γ′∈C: γ′≁γ} f(γ′) w_{γ′} ≤ f(γ),

then it satisfies the clique dynamics condition for the same function f. Although the condition above is slightly more restrictive than the clique dynamics condition, it is more convenient to use for algorithmic applications. It can be seen as a weaker and more general version of the mixing condition by Chen et al. [7].

In order to set our clique dynamics condition in the context of existing conditions for absolute convergence of the cluster expansion, we compare it to the condition of Fernández and Procacci [15]. We choose it for comparison because it is, to the best of our knowledge, the least restrictive condition for absolute convergence of the cluster expansion of abstract polymer models. As Fernández and Procacci [15] show, their condition is an improvement over other known conditions, including the Dobrushin condition [11] and the Kotecký–Preiss condition [36].

◮ Definition 7 (Fernández and Procacci [15]).
Let P = (C, w, ≁) be a polymer model, and let N(γ) = {γ′ ∈ C | γ′ ≁ γ}. We say that P satisfies the Fernández–Procacci condition if and only if there is a function f: C → ℝ_{>0} such that, for all γ ∈ C, it holds that

Σ_{Γ∈F(P)|_{N(γ)}} Π_{γ′∈Γ} f(γ′) w_{γ′} ≤ f(γ). ◭

Note that we state the condition slightly differently from the version of the original authors to ease comparison. The original form is recovered by setting f: γ ↦ f′(γ)/w_γ for some function f′: C → ℝ_{>0}. Further, the original version allows f (or f′, respectively) to take the value 0. However, note that if f(γ) = 0 for some γ ∈ C, then the condition is trivially void because ∅ ∈ F(P)|_{N(γ)}, which lower-bounds the left-hand side of the inequality by 1.

The following statement shows how our clique dynamics condition relates to the Fernández–Procacci condition as given in Definition 7.

◮ Proposition 8.
If a polymer model P = (C, w, ≁) satisfies the Fernández–Procacci condition for a function f, then it also satisfies the clique dynamics condition for the same function. ◭

Proof.
Note that ∅ ∈ F(P)|_{N(γ)} and, for all γ′ ∈ C with γ′ ≁ γ, it holds that {γ′} ∈ F(P)|_{N(γ)}. Thus,

Σ_{γ′∈C: γ′≁γ} f(γ′) w_{γ′} < 1 + Σ_{γ′∈C: γ′≁γ} f(γ′) w_{γ′} ≤ Σ_{Γ∈F(P)|_{N(γ)}} Π_{γ′∈Γ} f(γ′) w_{γ′} ≤ f(γ).
As discussed above, this implies that P satisfies the clique dynamics condition. □

Note that Proposition 8 implies that if a polymer model satisfies the Fernández–Procacci condition for a function f, then Theorem 6 bounds the mixing time of the polymer Markov chain for any given clique cover. Further, Proposition 8 and its implied mixing time bounds for the polymer Markov chain carry over to all convergence conditions that are more restrictive than the Fernández–Procacci condition, such as the Dobrushin condition and the Kotecký–Preiss condition.

We now discuss how the polymer Markov chain M of a polymer model P with a clique cover of size m is used to approximate Z(P) in a randomized fashion. To this end, M is turned into an approximate sampler for P (Theorem 9). Then this sampler is applied in an algorithmic framework (Algorithm 1) that yields an ε-approximation of Z(P) (Theorem 12). Under certain assumptions, such as that the restricted partition function of each polymer clique is in poly(m), the approximation is computable in time poly(m/ε).

In order to discuss the computation time of operations on a polymer model rigorously, we need to make assumptions about the operations we consider and their computational cost. To this end, we say that a polymer model P = (C, w, ≁) with a polymer clique cover Λ of size m is computationally feasible if and only if all of the following operations can be performed in time poly(m):
(1) for all i ∈ [m], we can draw Λ_i uniformly at random,
(2) for all i ∈ [m] and all γ ∈ C, we can check whether γ ∈ Λ_i,
(3) for all γ, γ′ ∈ C, we can check whether γ ≁ γ′,
(4) for all γ ∈ C, we can compute w_γ.
In addition to the more complex operations above, we further assume that, for all γ ∈ C and all Γ ∈ F, we can compute Γ \ {γ} and Γ ∪ {γ}, and we can decide whether Γ = ∅ in time poly(m).

Please note that we do not use assumption (4) in this section and it could thus be dropped from the definition.
However, as we require it for our results in Section 6.1, where we consideralgorithmic applications of polymer models, we include it here. We show under what assumptions one can approximately sample from the Gibbs distributionof a computationally feasible polymer model in time polynomial in the size of the clique cover. ◮ Theorem 9.
Let P = (C, w, ≁) be a computationally feasible polymer model, let Λ be a polymer clique cover of P with size m, and let Z_max = max_{i∈[m]}{Z|_{Λ_i}}. Further, assume that
(a) Z_max ∈ poly(m),
(b) P satisfies the clique dynamics condition for a function f such that, for all γ ∈ C, it holds that e^{−poly(m)} ≤ f(γ) ≤ e^{poly(m)}, and
(c) for all i ∈ [m], we can sample from µ|_{Λ_i} in time poly(m).
Then, for all ε ∈ (0, 1], we can ε-approximately sample from µ in time poly(m/ε). ◭

Proof.
In order to sample from µ, we utilize the polymer Markov chain M(P) based on Λ. By Theorem 6, it holds that

τ_{M(P)}(ε) ∈ O( m² Z_max · ln( m² Z_max² · f_max/f_min ) · ln(1/ε) ).

Due to assumptions (a) and (b), it holds that τ_{M(P)}(ε) ∈ poly(m/ε). It remains to show that each step of M(P), as laid out in Definition 3, can be computed in time poly(m). To this end, let X_t denote the current state of M(P).

Because of assumptions (1) and (c), for all i ∈ [m], we can draw Λ_i uniformly at random and can sample Γ ∈ F|_{Λ_i} according to µ|_{Λ_i} in time poly(m). This covers lines 1 and 2.

Regarding line 3, note that we can check whether Γ = ∅ in time poly(m). Assume that Γ = ∅, and note that |X_t| ≤ m ∈ poly(m), as X_t contains at most one polymer per polymer clique. In order to compute X_t \ Λ_i, it suffices to iterate over every γ ∈ X_t and check whether γ ∈ Λ_i, which can be done in time poly(m) by assumption (2). Once we find a γ ∈ Λ_i, we remove it in time poly(m).

Regarding line 4, assume now that Γ = {γ} for some γ ∈ Λ_i. In order to decide whether X_t ∪ Γ is a valid polymer family, it is sufficient to iterate over all γ′ ∈ X_t and check whether any of them is incompatible with γ. By assumption (3), this can be done in time poly(m), which concludes the proof. □

By making a slightly stronger assumption about the polymer model, assumptions (a) and (b) of Theorem 9 are easily satisfied.

◮ Observation 10.
Recall from Section 3 that if P satisfies, for all γ ∈ C, the slightly more restrictive condition

Σ_{γ′∈C: γ′≁γ} f(γ′) w_{γ′} ≤ f(γ), (8)

then the clique dynamics condition is satisfied for the same function f. Thus, if equation (8) holds for an appropriate function f, assumption (b) also holds. Further, by setting γ to be the polymer in Λ_i that minimizes f, equation (8) implies that Z|_{Λ_i} ≤ 2, meaning that assumption (a) is trivially satisfied. ◭

By now, we mainly discussed conditions for approximately sampling from the Gibbs distribution. We now discuss how to turn this into a randomized approximation for the partition function. To this end, we apply self-reducibility [32]. However, note that the obvious way for applying
Algorithm 1: Randomized approximation of the partition function of a polymer model
Input: polymer model P = (C, w, ≁), polymer clique cover of P with size m, number of samples s ∈ ℕ_{>0}, sampling error ε_s ∈ (0, 1]
Output: ε-approximation of Z(P) according to Lemma 11
1: for i ∈ [m] do
2:   for j ∈ [s] do
3:     Γ^(j) ← ε_s-approximate sample from µ|_{K_i};
4:   σ̂_i ← (1/s) Σ_{j∈[s]} 1{Γ^(j) ∈ F|_{K_{i−1}}};
5: σ̂ ← Π_{i∈[m]} σ̂_i;
6: return 1/σ̂;

self-reducibility, namely based on single polymers, might take |C| reduction steps. This is not feasible in many algorithmic applications of polymer models.

To circumvent this problem, we propose a self-reducibility argument based on polymer cliques. By doing so, the number of reductions is bounded by the size of the clique cover that is used, thus adding no major overhead to the runtime of our proposed approximate sampling scheme. Besides this idea of applying self-reducibility based on cliques, most of our arguments are analogous to known applications, like in [31, Chapter 3].

We proceed by formalizing clique-based self-reducibility. Let P = (C, w, ≁) be a polymer model, and let Λ be a polymer clique cover of P with size m. We define a sequence of subsets of polymers (K_i)_{0≤i≤m} with K_0 = ∅ and, for i ∈ [m], with K_i = K_{i−1} ∪ Λ_i. Further, for all i ∈ [m], let σ_i = Z|_{K_{i−1}}/Z|_{K_i}. Note that Z|_{K_0} = 1 and Z|_{K_m} = Z. It holds that

Z = Π_{i∈[m]} Z|_{K_i}/Z|_{K_{i−1}} = ( Π_{i∈[m]} σ_i )^{−1}.

Hence, when approximating Z, it is sufficient to focus, for all i ∈ [m], on approximating σ_i. For all i ∈ [m], a similar relation holds with respect to the probability that a random Γ ∈ F|_{K_i} is already in F|_{K_{i−1}}. More formally, let i ∈ [m], and let Γ ∼ µ|_{K_i}. Note that

E[1{Γ ∈ F|_{K_{i−1}}}] = Σ_{Γ∈F|_{K_i}} µ|_{K_i}(Γ) · 1{Γ ∈ F|_{K_{i−1}}} = Σ_{Γ∈F|_{K_{i−1}}} µ|_{K_i}(Γ) = Z|_{K_{i−1}}/Z|_{K_i} = σ_i. (9)
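The telescoping identity Z = (Π_{i∈[m]} σ_i)^{−1} can be checked by exact enumeration on a small toy model (weights, cliques, and incompatibilities assumed for illustration, not taken from the paper):

```python
from itertools import combinations

# Toy polymer model: polymers 1 and 2 are incompatible and form a clique.
w = {1: 0.4, 2: 0.7, 3: 0.2}
incompatible = {frozenset({1, 2})}
cliques = [{1, 2}, {3}]  # clique cover: K_0 = {}, K_1 = {1,2}, K_2 = {1,2,3}

def Z_restricted(K):
    """Partition function over families that only use polymers from K."""
    total = 0.0
    for r in range(len(K) + 1):
        for S in combinations(sorted(K), r):
            if all(frozenset(p) not in incompatible for p in combinations(S, 2)):
                prod = 1.0
                for g in S:
                    prod *= w[g]
                total += prod
    return total

K = set()
product_of_sigmas = 1.0
for clique in cliques:
    Z_prev = Z_restricted(K)
    K |= clique
    product_of_sigmas *= Z_prev / Z_restricted(K)  # sigma_i = Z|K_{i-1} / Z|K_i

print(abs(1 / product_of_sigmas - Z_restricted(set(w))) < 1e-12)  # True
```

The ratios cancel pairwise, so the product equals Z|_{K_0}/Z|_{K_m} = 1/Z regardless of the clique cover chosen.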
We use these observations in order to obtain a randomized approximation of Z (Algorithm 1) by iteratively, for all i ∈ [m], approximating σ_i via sampling from µ|_{K_i}. The following result bounds, for all ε ∈ (0, 1], the number of samples s and the sampling error ε_s that are required by Algorithm 1 to obtain an ε-approximation of Z.

◮ Lemma 11.
Let P = (C, w, ≁) be a polymer model, let Λ be a polymer clique cover of P with size m, let Z_max = max_{i∈[m]}{Z|_{Λ_i}}, and let ε ∈ (0, 1]. Consider Algorithm 1 for P with s = ⌈256 Z_max m/ε²⌉ and ε_s = ε/(8 Z_max m). Then Algorithm 1 returns a randomized ε-approximation of Z. ◭

Proof. Let i ∈ [m]. We start by bounding σ_i. Note that Z|_{K_i} ≥ Z|_{K_{i−1}} and Z|_{K_i} ≤ Z|_{K_{i−1}} · Z|_{Λ_i}. Thus, 1/Z_max ≤ σ_i ≤ 1. First, we bound the error of E[σ̂] with respect to 1/Z. Second, we bound the absolute difference of σ̂ and E[σ̂]. Combining both errors concludes the proof.

Bounding E[σ̂]. Note that, for all i ∈ [m], it holds that σ_i − ε_s ≤ E[σ̂_i] ≤ σ_i + ε_s, since σ̂_i is the mean of ε_s-approximate samples. By the bounds on σ_i and our choice of ε_s, we get

(1 − ε/(8m)) σ_i ≤ E[σ̂_i] ≤ (1 + ε/(8m)) σ_i.

Recall that 1/Z = Π_{i∈[m]} σ_i. Further, since {σ̂_i}_{i∈[m]} are mutually independent, we have E[σ̂] = Π_{i∈[m]} E[σ̂_i]. Consequently, since, for all x ∈ [0, 1] and all k ∈ ℕ_{>0}, it holds that e^{−x/k} ≤ 1 − x/(k + 1) [31, Chapter 3], we obtain

e^{−ε/4}/Z ≤ (1 − ε/(8m))^m / Z ≤ E[σ̂] ≤ (1 + ε/(8m))^m / Z ≤ e^{ε/8}/Z ≤ e^{ε/4}/Z. (10)

Bounding the absolute difference of σ̂ and E[σ̂]. By Chebyshev's inequality, we get

Pr[|σ̂ − E[σ̂]| ≥ (ε/4) E[σ̂]] ≤ (16/ε²) · Var[σ̂]/E[σ̂]² = (16/ε²) ( E[σ̂²]/E[σ̂]² − 1 ).

Again, by the mutual independence of {σ̂_i}_{i∈[m]}, we have E[σ̂] = Π_{i∈[m]} E[σ̂_i] and E[σ̂²] = Π_{i∈[m]} E[σ̂_i²]. Thus,

Pr[|σ̂ − E[σ̂]| ≥ (ε/4) E[σ̂]] ≤ (16/ε²) ( Π_{i∈[m]} E[σ̂_i²]/E[σ̂_i]² − 1 ) = (16/ε²) ( Π_{i∈[m]} (1 + Var[σ̂_i]/E[σ̂_i]²) − 1 ).

For bounding the variance of σ̂_i, recall that σ̂_i = (1/s) Σ_{j∈[s]} 1{Γ^(j) ∈ F|_{K_{i−1}}}, where {Γ^(j)}_{j∈[s]} are independently drawn from an ε_s-approximation of µ|_{K_i}. We have

Var[σ̂_i] = (1/s²) Σ_{j∈[s]} Var[1{Γ^(j) ∈ F|_{K_{i−1}}}] = (1/s) E[σ̂_i] (1 − E[σ̂_i]).

Noting that E[σ̂_i] ≥ (1 − ε/(8m)) σ_i ≥ 1/(2 Z_max), we bound

Var[σ̂_i]/E[σ̂_i]² = 1/(s E[σ̂_i]) − 1/s ≤ 2 Z_max/s.

Hence, using that 1 + x ≤ e^x and that e^x − 1 ≤ 2x for all x ∈ [0, 1], we obtain

Pr[|σ̂ − E[σ̂]| ≥ (ε/4) E[σ̂]] ≤ (16/ε²) ( e^{2 Z_max m/s} − 1 ) ≤ 64 Z_max m/(ε² s) ≤ 1/4,

where the last two steps are due to our choice of s. Thus, with probability at least 3/4, it holds that

e^{−ε/2} E[σ̂] ≤ (1 − ε/4) E[σ̂] ≤ σ̂ ≤ (1 + ε/4) E[σ̂] ≤ e^{ε/4} E[σ̂]. (11)

Combining the results. Combining equations (10) and (11) yields that

(1 − ε)/Z ≤ e^{−3ε/4}/Z ≤ σ̂ ≤ e^{ε/2}/Z ≤ (1 + ε)/Z

with probability at least 3/4, which concludes the proof. □
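A minimal sketch of the estimator of Algorithm 1 on a toy polymer model follows. All weights and incompatibilities are assumed for illustration, and exact Gibbs samplers stand in for the ε_s-approximate samplers; σ_i is estimated as the fraction of samples from µ|_{K_i} that already lie in F|_{K_{i−1}}.

```python
import random
from itertools import combinations

# Toy polymer model (assumed values).
w = {1: 0.4, 2: 0.7, 3: 0.2, 4: 0.5}
incompatible = {frozenset({1, 2}), frozenset({3, 4}), frozenset({2, 3})}
cliques = [{1, 2}, {3, 4}]

def families(K):
    """All valid polymer families using only polymers from K."""
    out = []
    for r in range(len(K) + 1):
        for S in combinations(sorted(K), r):
            if all(frozenset(p) not in incompatible for p in combinations(S, 2)):
                out.append(frozenset(S))
    return out

def weight(S):
    prod = 1.0
    for g in S:
        prod *= w[g]
    return prod

def sample_gibbs(K, rng):
    """Exact sample from the restricted Gibbs distribution mu|_K."""
    fams = families(K)
    return rng.choices(fams, [weight(S) for S in fams])[0]

rng = random.Random(0)
s = 20000
K, sigma_hat = set(), 1.0
for clique in cliques:
    prev_families = set(families(K))
    K |= clique
    hits = sum(sample_gibbs(K, rng) in prev_families for _ in range(s))
    sigma_hat *= hits / s
Z_est = 1 / sigma_hat

# Exact value by enumeration, for comparison:
Z_exact = sum(weight(S) for S in families(set(w)))
print(abs(Z_est - Z_exact) / Z_exact < 0.1)  # close for this many samples
```

With exact samplers the estimate concentrates around Z by equation (9); in the actual algorithm the additional ε_s sampling bias is what Lemma 11 controls.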
Based on Algorithm 1 and Lemma 11, we now state our main theorem on the approximationof the partition function of an abstract polymer model. ◮ Theorem 12.
Let P = (C, w, ≁) be a computationally feasible polymer model, and let Λ be a polymer clique cover of P with size m. Assume that P satisfies the conditions of Theorem 9. Then, for all ε ∈ (0, 1], there is a randomized ε-approximation of Z computable in time poly(m/ε). ◭

Proof.
The statement follows from Lemma 11, choosing the parameters of Algorithm 1 accordingly. Note that Lemma 11 and Theorem 9 both assume that Z_max ∈ poly(m). This implies that s ∈ poly(m/ε) and that we can sample ε_s-approximately from µ in time poly(m/ε). Note that, for all i ∈ [m], the same holds for µ|_{K_i}, as this only requires the Markov chain to ignore some of the polymer cliques in each step. □

Hard-sphere model

We study the grand canonical ensemble of the hard-sphere model in a d-dimensional finite hypercube V = [0, ℓ)^d of side length ℓ ∈ ℝ_{>0}. In this model, particles are represented as d-dimensional balls; let v_d denote the volume of a d-dimensional unit ball. We consider a mixture of q ∈ ℕ_{>0} different types of particles Q = {(r_i, p_i)}_{i∈[q]}, each characterized by a radius r_i ∈ ℝ_{>0} and a contribution to the chemical potential p_i ∈ ℝ. In what follows, we assume for the maximum radius r_max = max_{i∈[q]}{r_i} that r_max ∈ O(ℓ), which means that the largest observed particles are asymptotically not larger than the considered spatial region V. Moreover, we assume that particles of the same type are indistinguishable, which is usually the case for the systems considered in statistical physics. That is, all placements of non-overlapping particles in V that are similar up to swapping two particles of the same type are considered as one and the same configuration.

As we briefly discussed in Section 1.2.3, a probabilistic interpretation of this model is that, for each particle type i ∈ [q], the centers of particles are distributed according to a Poisson point process of intensity e^{p_i} on V. The distribution of states in the ensemble is characterized by the mixture of these point processes conditioned on the particles not overlapping.

In this probabilistic sense, the grand canonical partition function is the normalizing constant of the corresponding probability density over all states of the system. For a given space V and
For a given space V and21 set of q particle types Q , it is formally defined as Z ( V , Q ) = + Õ k ∈ N > Õ k + ··· + k q = k © « Ö i ∈[ q ] e k i p i k i ! ª®¬ ∫ V k D r ( ) , ..., r ( k ) (cid:16) x ( ) , . . . , x ( k ) (cid:17) d ν d × k , where• k i represents the number of particles of type i ,• Í k + ··· + k q = k denotes the sum over all ( k i ) i ∈[ q ] ∈ N q such that Í i ∈[ q ] k i = k ,• for each i ∈ [ q ] , the factor 1 /( k i ! ) cancels the effect of double-counting placements thatare equal up to swapping particles of type i ,• r ( i ) assigns particle i its radius, i.e., r ( i ) = r j for j ∈ [ q ] such that Í a < j k a < i ≤ Í a ≤ j k a ,• D r ( ) , ..., r ( k ) : R d × k → { , } is 1 if and only if the particles with radii r ( ) , . . . , r ( k ) andcenters ( x ( ) , . . . , x ( k ) ) ∈ R d × k are non-overlapping, and• ν d × k is the Lebesgue measure on R d × k .Readers who are familiar with this model from physics might notice that we omitted theinfluence of the inverse temperature and the Boltzmann constant. We did this in order tosimplify notation. However, note that this can be included by scaling the chemical potentialsappropriately. A physicist’s version of this definition can, for example, be found in [50].In the following section, we propose a discrete version of the hard-sphere model and provesufficient conditions for approximating its partition function via a polymer representation(Lemma 15). Then, we show how the continuous model is mapped to the discrete model, and webound the speed of convergence with respect to the resolution of the discretization (Lemma 17).Based on that, we obtain rigorous computational results for the continuous model (Theorem 19)and demonstrate their application to a common form of chemical potential (Proposition 20). 
We discretize the hard-sphere model in the following sense: instead of allowing particles to take arbitrary positions in a continuous d-dimensional cube V, we restrict their centers to be at discrete grid points of a finite d-dimensional square lattice.

Formally, the discrete hard-sphere model in d dimensions is defined by a finite integer lattice G = [0, n)^d ∩ ℕ^d for n ∈ ℕ_{>0} and a set of q particle types Q = {(r_i, p_i)}_{i∈[q]}, again each characterized by a radius r_i ∈ ℝ_{>0} and a chemical potential p_i ∈ ℝ. As before, particles of the same type are assumed to be indistinguishable.

Analogously to the continuous model, the grand canonical partition function of the discrete hard-sphere model is defined by

Z(G, Q) = 1 + Σ_{k∈ℕ_{>0}} Σ_{k_1+···+k_q=k} ( Π_{i∈[q]} e^{k_i p_i}/k_i! ) Σ_{(x^(1),…,x^(k))∈G^k} D_{r^(1),…,r^(k)}(x^(1), …, x^(k)),

where
• k_i is the number of particles of type i ∈ [q],
• Σ_{k_1+···+k_q=k} denotes the sum over all (k_i)_{i∈[q]} ∈ ℕ^q such that Σ_{i∈[q]} k_i = k, and
• r^(i) and D_{r^(1),…,r^(k)}: ℝ^{d×k} → {0, 1} are defined as in the continuous case.

We continue by showing how we use polymer models to approximate the grand canonical partition function of the discrete hard-sphere model. For this, we use the following definition of a polymer representation.

◮ Definition 13 (polymer representation of the discrete hard-sphere model).
Given an instance of the discrete hard-sphere model (G, Q), we define its polymer representation to be the polymer model P = (C, w, ≁) such that
• each polymer γ ∈ C is defined by a tuple (x_γ, r_γ, p_γ) with x_γ ∈ G and (r_γ, p_γ) ∈ Q, and each such combination results in a polymer,
• two polymers γ, γ′ ∈ C are incompatible if and only if d(x_γ, x_{γ′}) < r_γ + r_{γ′}, and,
• for each polymer γ ∈ C, we set w_γ = e^{p_γ}.
Further, we might say that a polymer γ ∈ C with γ = (x_γ, r_γ, p_γ) is of type i ∈ [q] if (r_γ, p_γ) = (r_i, p_i), and at position x ∈ G if x_γ = x. ◭

The following lemma justifies using this polymer representation to approximate the grand canonical partition function of the discrete hard-sphere model.

◮ Lemma 14.
For an instance of the discrete hard-sphere model (G, Q) and its polymer representation P = (C, w, ≁) as in Definition 13, it holds that Z(P) = Z(G, Q). ◭

Proof.
Since ∅ contributes 1 to Z(P), it is sufficient to show

Σ_{Γ∈F: |Γ|≥1} Π_{γ∈Γ} w_γ = Σ_{k∈ℕ_{>0}} Σ_{k_1+···+k_q=k} ( Π_{i∈[q]} e^{k_i p_i}/k_i! ) Σ_{(x^(1),…,x^(k))∈G^k} D_{r^(1),…,r^(k)}(x^(1), …, x^(k)).

We start by rewriting the left-hand side in terms of the power set of the set of polymers:

Σ_{Γ∈F: |Γ|≥1} Π_{γ∈Γ} w_γ = Σ_{Γ∈2^C: |Γ|≥1} Π_{γ∈Γ} w_γ Π_{γ,γ′∈Γ: γ≠γ′} (1 − 1{γ ≁ γ′}) = Σ_{k∈ℕ_{>0}} Σ_{Γ∈2^C: |Γ|=k} Π_{γ∈Γ} w_γ Π_{γ,γ′∈Γ: γ≠γ′} (1 − 1{γ ≁ γ′}).

Let B_i ⊆ C for i ∈ [q] be the set of polymers of type i. Note that the sets B_i form a partition of C. Thus, we have

Σ_{k∈ℕ_{>0}} Σ_{Γ∈2^C: |Γ|=k} Π_{γ∈Γ} w_γ Π_{γ,γ′∈Γ: γ≠γ′} (1 − 1{γ ≁ γ′}) = Σ_{k∈ℕ_{>0}} Σ_{k_1+···+k_q=k} Σ_{Γ_1⊆B_1: |Γ_1|=k_1} … Σ_{Γ_q⊆B_q: |Γ_q|=k_q} Π_{γ∈∪_{i∈[q]}Γ_i} w_γ Π_{γ,γ′∈∪_{i∈[q]}Γ_i: γ≠γ′} (1 − 1{γ ≁ γ′}). (12)

Since the weight of each polymer of type i ∈ [q] is the same, namely e^{p_i}, we get

Π_{γ∈∪_{i∈[q]}Γ_i} w_γ = Π_{i∈[q]} e^{k_i p_i}.

Note that two polymers γ, γ′ ∈ B_i are equal if and only if x_γ = x_{γ′}, that is, there is a one-to-one correspondence between polymers in B_i and positions in G. Let K_i = Σ_{j∈[i−1]} k_j; note that K_1 = 0. We rewrite equation (12) as

Σ_{k∈ℕ_{>0}} Σ_{k_1+···+k_q=k} ( Π_{i∈[q]} e^{k_i p_i} ) Σ_{(x^(K_1+1),…,x^(K_1+k_1))∈G^{k_1}} (1/k_1!) … Σ_{(x^(K_q+1),…,x^(K_q+k_q))∈G^{k_q}} (1/k_q!) · Π_{i,j∈[q]} Π_{K_i+1≤a≤K_i+k_i, K_j+1≤b≤K_j+k_j, a≠b} 1{d(x^(a), x^(b)) ≥ r_i + r_j}. (13)

Last, using that

Π_{i,j∈[q]} Π_{K_i+1≤a≤K_i+k_i, K_j+1≤b≤K_j+k_j, a≠b} 1{d(x^(a), x^(b)) ≥ r_i + r_j} = D_{r^(1),…,r^(k)}(x^(1), …, x^(k)),

we conclude the proof by simplifying equation (13) to

Σ_{k∈ℕ_{>0}} Σ_{k_1+···+k_q=k} ( Π_{i∈[q]} e^{k_i p_i} ) Σ_{(x^(K_1+1),…,x^(K_1+k_1))∈G^{k_1}} (1/k_1!) … Σ_{(x^(K_q+1),…,x^(K_q+k_q))∈G^{k_q}} (1/k_q!) D_{r^(1),…,r^(k)}(x^(1), …, x^(k)) = Σ_{k∈ℕ_{>0}} Σ_{k_1+···+k_q=k} ( Π_{i∈[q]} e^{k_i p_i}/k_i! ) Σ_{(x^(1),…,x^(k))∈G^k} D_{r^(1),…,r^(k)}(x^(1), …, x^(k)). □

◮ Lemma 15.
Given an instance of the d-dimensional discrete hard-sphere model (G, Q) with G = [0, n)^d ∩ ℕ^d and q particle types Q = {(r_i, p_i)}_{i∈[q]}, let r_min = min_{i∈[q]}{r_i}, and let b_d(r) be an upper bound on the number of integer points in a d-dimensional sphere of radius r ∈ ℝ_{>0}, centered at the origin. Assume that, for all i ∈ [q], there is an h_i ∈ ℝ_{>0} with

exp(−(n/r_min)^d) ≤ h_i ≤ exp((n/r_min)^d)

such that, for all j ∈ [q], it holds that

Σ_{i∈[q]} b_d(r_i + r_j) e^{p_i} h_i ≤ h_j. (14)

Then, for each ε ∈ (0, 1], there is a randomized ε-approximation of Z(G, Q) computable in time

poly( (n√d/r_min)^d · (q + d ln(n)) / ε ). ◭

Proof.
By Lemma 14, it is sufficient to approximate the partition function Z(P) of the polymer representation P = (C, w, ≁) of (G, Q). We show that, by Theorem 9, we can sample efficiently from µ(P). Applying Theorem 12 afterward concludes the proof.

We start by fixing a polymer clique cover Λ of P and bounding its size. To this end, for a tuple (i_1, …, i_d) ∈ ℕ^d, let

H_{i_1,…,i_d} = { (x_1, …, x_d) ∈ G | ∀j ∈ [d]: i_j ⌊2 r_min/√d⌋ ≤ x_j < (i_j + 1) ⌊2 r_min/√d⌋ }.

In other words, we divide G into subcubes with side length at most 2 r_min/√d. Note that each pair of polymers γ, γ′ ∈ C with x_γ, x_{γ′} ∈ H_{i_1,…,i_d} is incompatible, as d(x_γ, x_{γ′}) < 2 r_min ≤ r_γ + r_{γ′}. We identify each polymer clique by a tuple (i_1, …, i_d) ∈ ℕ^d and set

Λ_{i_1,…,i_d} = { γ ∈ C | x_γ ∈ H_{i_1,…,i_d} }.

This results in |Λ| ∈ O( (n√d/r_min)^d ) polymer cliques, from which we can draw one uniformly at random by choosing d uniform integers, each of size O(n√d/r_min). Further, note that checking whether a polymer γ ∈ C is in a certain polymer clique can be done by checking whether x_γ is in the corresponding region of the grid; and checking γ ≁ γ′ is equivalent to comparing the Euclidean distance of the two polymers to the sum of their radii.

We now show that P satisfies the clique dynamics condition for an appropriate function f. To simplify this step, we use Observation 10. For each γ ∈ C of type i ∈ [q], we set f(γ) = h_i. Note that if a polymer γ′ ∈ C of type i is incompatible with γ, then d(x_{γ′}, x_γ) < r_{γ′} + r_γ = r_i + r_γ. The number of such polymers is bounded from above by b_d(r_i + r_γ). Thus, for each γ ∈ C, it holds that

Σ_{γ′∈C: γ′≁γ} w_{γ′} f(γ′) ≤ Σ_{i∈[q]} b_d(r_i + r_γ) e^{p_i} h_i.

Without loss of generality, let γ be of type j ∈ [q].
By equation (14),
\[ \sum_{i \in [q]} b_d(r_i + r_j)\, \mathrm{e}^{p_i} h_i \le h_j. \]
Because $r_\gamma = r_j$ and $f(\gamma) = h_j$, Observation 10 implies that assumptions (a) and (b) of Theorem 9 are satisfied.

It remains to show that we can sample from the Gibbs distribution of each polymer clique efficiently. For each $i \in \mathbb{N}^d$, let $H_i$ denote the region of the grid that corresponds to $\Lambda_i$. For all $\gamma \in \Lambda_i$ of type $j \in [q]$, it holds that
\[ \mu|_{\Lambda_i}(\{\gamma\}) = \frac{w_\gamma}{Z|_{\Lambda_i}} = \frac{\mathrm{e}^{p_j}}{Z|_{\Lambda_i}} \quad \text{with} \quad Z|_{\Lambda_i} = 1 + \sum_{\gamma \in \Lambda_i} w_\gamma = 1 + |H_i| \sum_{j' \in [q]} \mathrm{e}^{p_{j'}}, \]
where $|H_i|$ denotes the number of grid points in $H_i$. Note that $|H_i|$ can be calculated exactly in time $O(d \ln(n))$ knowing $r_{\min}$, $n$, and $d$. Thus, we can compute $Z|_{\Lambda_i}$ in time $O(q + d \ln(n))$. We sample from $\mu|_{\Lambda_i}$ as follows:
1. sample $x \in H_i$ uniformly at random,
2. sample $j \in [q]$ with probability proportional to $\mathrm{e}^{p_j}$, and
3. return $\emptyset$ with probability $1/Z|_{\Lambda_i}$ and $\{\gamma\}$ with $\gamma = (x, r_j, p_j)$ otherwise.
Note that step 1 needs time $O(d \ln(n))$ by drawing $d$ integers uniformly from the range that corresponds to $H_i$. In step 2, we enumerate in time $O(q)$. Further, we return $\emptyset$ with probability $1/Z|_{\Lambda_i}$ and, for each $\gamma \in \Lambda_i$ of type $j \in [q]$, we return $\{\gamma\}$ with probability
\[ \frac{Z|_{\Lambda_i} - 1}{Z|_{\Lambda_i}} \cdot \frac{1}{|H_i|} \cdot \frac{\mathrm{e}^{p_j}}{\sum_{j' \in [q]} \mathrm{e}^{p_{j'}}} = \frac{Z|_{\Lambda_i} - 1}{Z|_{\Lambda_i}} \cdot \frac{\mathrm{e}^{p_j}}{Z|_{\Lambda_i} - 1} = \frac{\mathrm{e}^{p_j}}{Z|_{\Lambda_i}}, \]
which results in the desired distribution $\mu|_{\Lambda_i}$. □

Note that the problem of getting an upper bound $b_d(r)$ on the number of integer points in a hypersphere is sometimes also referred to as the Gauss circle problem in $d$ dimensions. Tight asymptotic upper bounds on it remain an open mathematical problem. An overview of known bounds is, for example, given by Strömbergsson and Södergren [46]. In general, $b_d(r) = (4r)^d$ works as a crude bound if $r \ge 1/2$, as the sphere is contained in the cube $[-r, r]^d$, which holds at most $(2r + 1)^d \le (4r)^d$ integer points.
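As a quick empirical sanity check on bounds of this type, the following Python sketch brute-forces the number of integer points in a small sphere and compares it against a cube-based crude bound of the form $(4r)^d$ for $r \ge 1/2$; the concrete radii and dimensions are illustrative choices, not values from the paper.

```python
import math
from itertools import product

def integer_points_in_sphere(r: float, d: int) -> int:
    """Brute-force count of points of Z^d with Euclidean norm at most r."""
    bound = math.floor(r)
    return sum(
        1
        for x in product(range(-bound, bound + 1), repeat=d)
        if sum(c * c for c in x) <= r * r
    )

def crude_bound(r: float, d: int) -> float:
    """Cube-based bound: the sphere sits inside [-r, r]^d, which holds at
    most (2r + 1)^d <= (4r)^d integer points whenever r >= 1/2."""
    return (4 * r) ** d

# Small illustrative checks (not exhaustive).
for d in (1, 2, 3):
    for r in (0.5, 1.0, 2.5):
        assert integer_points_in_sphere(r, d) <= crude_bound(r, d)
```

For larger radii, the brute-force count quickly approaches the volume $v_d r^d$ of the ball, which is the effect exploited by Lemma 18 below.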
However, depending on the radius $r$ and the number of dimensions $d$, more sophisticated bounds are applicable.

5.2 Discretization method and results on the continuous model

We now show how our results on the discrete hard-sphere model relate to the continuous version. To do so, we start by defining a transformation from the continuous to the discrete model for a given resolution.

◮ Definition 16 (discretization of the continuous hard-sphere model).
Let $(V, Q)$ be an instance of the $d$-dimensional continuous hard-sphere model with $V = [0, \ell)^d$ and $q$ particle types $Q = \{(r_i, p_i)\}_{i \in [q]}$. Further, let $\rho \in \mathbb{R}_{>0}$ be such that $\rho \ell \in \mathbb{N}_{>0}$. The discretization of $(V, Q)$ with resolution $\rho$ is a $d$-dimensional discrete hard-sphere model $\big(G^{(\rho)}, Q^{(\rho)}\big)$ with
• $G^{(\rho)} = [0, \rho \ell)^d \cap \mathbb{N}^d$ and
• $Q^{(\rho)} = \big\{\big(r_i^{(\rho)}, p_i^{(\rho)}\big)\big\}_{i \in [q]}$, where $r_i^{(\rho)} = \rho r_i$ and $p_i^{(\rho)} = p_i - d \ln(\rho)$. ◭

The following lemma shows that, for sufficiently large resolutions $\rho$, the discretization can be seen as an approximation of the continuous hard-sphere model in terms of the grand canonical partition function.

◮ Lemma 17.
Let $(V, Q)$ be a continuous hard-sphere model with $V = [0, \ell)^d$ and $q$ particle types $Q = \{(r_i, p_i)\}_{i \in [q]}$, let $r_{\min} = \min_{i \in [q]}\{r_i\}$, and let $p_{\max} = \max_{i \in [q]}\{p_i\}$. For every resolution $\rho \ge 2\sqrt{d}$, it holds that
\[ Z(V, Q) \ge \left( 1 - \rho^{-1} \exp\!\left( \Theta\!\left( \left( \frac{\ell \sqrt{d}}{r_{\min}} \right)^d d \ln(\ell) + q \mathrm{e}^{p_{\max}} \right) \right) \right) \cdot Z\big(G^{(\rho)}, Q^{(\rho)}\big) \]
and
\[ Z(V, Q) \le \left( 1 + \rho^{-1} \exp\!\left( \Theta\!\left( \left( \frac{\ell \sqrt{d}}{r_{\min}} \right)^d d \ln(\ell) + q \mathrm{e}^{p_{\max}} \right) \right) \right) \cdot Z\big(G^{(\rho)}, Q^{(\rho)}\big). \] ◭

Proof.
We prove the lemma by bounding the additive error $\big| Z(V, Q) - Z\big(G^{(\rho)}, Q^{(\rho)}\big) \big|$. Because $Z(V, Q) \ge 1$, this directly results in the desired multiplicative bound.

In order to obtain an additive bound, we start by transforming $Z\big(G^{(\rho)}, Q^{(\rho)}\big)$ into a form that is more similar to that of $Z(V, Q)$. Note that, for each $k \in \mathbb{N}_{>0}$ and $(k_1, \dots, k_q) \in \mathbb{N}^q$ with $k_1 + \dots + k_q = k$, it holds that
\[ \prod_{i \in [q]} \frac{\mathrm{e}^{k_i p_i^{(\rho)}}}{k_i!} = \left(\frac{1}{\rho}\right)^{d \cdot k} \prod_{i \in [q]} \frac{\mathrm{e}^{k_i p_i}}{k_i!}. \]
Let $\varphi^{(\rho)} \colon G^{(\rho)} \to V$ with $(x_1, \dots, x_d) \mapsto \varphi^{(\rho)}(x) = (x_1/\rho, \dots, x_d/\rho)$. Note that, for all $x^{(i)}, x^{(j)} \in G^{(\rho)}$ with assigned radii $r^{(\rho)}_{(i)} = \rho r_{(i)}$ and $r^{(\rho)}_{(j)} = \rho r_{(j)}$, it holds that
\[ d\big(x^{(i)}, x^{(j)}\big) \ge r^{(\rho)}_{(i)} + r^{(\rho)}_{(j)} \;\leftrightarrow\; d\big(\varphi^{(\rho)}\big(x^{(i)}\big), \varphi^{(\rho)}\big(x^{(j)}\big)\big) \ge r_{(i)} + r_{(j)}. \]
Consequently,
\[ Z\big(G^{(\rho)}, Q^{(\rho)}\big) = 1 + \sum_{k \in \mathbb{N}_{>0}} \sum_{k_1 + \dots + k_q = k} \left( \prod_{i \in [q]} \frac{\mathrm{e}^{k_i p_i^{(\rho)}}}{k_i!} \right) \sum_{(x^{(1)}, \dots, x^{(k)}) \in (G^{(\rho)})^k} D_{r^{(\rho)}_{(1)}, \dots, r^{(\rho)}_{(k)}}\big(x^{(1)}, \dots, x^{(k)}\big) \]
\[ = 1 + \sum_{k \in \mathbb{N}_{>0}} \sum_{k_1 + \dots + k_q = k} \left( \prod_{i \in [q]} \frac{\mathrm{e}^{k_i p_i}}{k_i!} \right) \sum_{(x^{(1)}, \dots, x^{(k)}) \in (G^{(\rho)})^k} \left(\frac{1}{\rho}\right)^{d \cdot k} D_{r_{(1)}, \dots, r_{(k)}}\big(\varphi^{(\rho)}\big(x^{(1)}\big), \dots, \varphi^{(\rho)}\big(x^{(k)}\big)\big). \tag{15} \]
We continue by rewriting
\[ \sum_{(x^{(1)}, \dots, x^{(k)}) \in (G^{(\rho)})^k} \left(\frac{1}{\rho}\right)^{d \cdot k} D_{r_{(1)}, \dots, r_{(k)}}\big(\varphi^{(\rho)}\big(x^{(1)}\big), \dots, \varphi^{(\rho)}\big(x^{(k)}\big)\big) \]
for any fixed $k \in \mathbb{N}_{>0}$ and $k_1 + \dots + k_q = k$ as an integral over $V^k$. Let $\varphi^{(\rho)}\big(G^{(\rho)}\big) \subseteq V$ denote the image of $\varphi^{(\rho)}$, and let $\Phi^{(\rho)} \colon V \to \varphi^{(\rho)}\big(G^{(\rho)}\big)$ with
\[ (x_1, \dots, x_d) \mapsto \left( \frac{\lfloor \rho x_1 \rfloor}{\rho}, \dots, \frac{\lfloor \rho x_d \rfloor}{\rho} \right). \]
Further, for all $k \in \mathbb{N}_{>0}$ and all $\big(x^{(1)}, \dots, x^{(k)}\big) \in \big(\varphi^{(\rho)}\big(G^{(\rho)}\big)\big)^k$, let
\[ W^{(\rho)}_{x^{(1)}, \dots, x^{(k)}} = \left\{ \big(y^{(1)}, \dots, y^{(k)}\big) \in V^k \;\middle|\; \forall i \in [k] \colon \Phi^{(\rho)}\big(y^{(i)}\big) = x^{(i)} \right\} = \big(\Phi^{(\rho)}\big)^{-1}\big(x^{(1)}\big) \times \dots \times \big(\Phi^{(\rho)}\big)^{-1}\big(x^{(k)}\big). \]
Note that the sets $W^{(\rho)}_{x^{(1)}, \dots, x^{(k)}}$ partition $V^k$ into $(d \times k)$-dimensional hypercubes of side length $1/\rho$. Thus, for all $\big(x^{(1)}, \dots, x^{(k)}\big) \in \big(\varphi^{(\rho)}\big(G^{(\rho)}\big)\big)^k$, it holds that
\[ \nu_{d \times k}\big(W^{(\rho)}_{x^{(1)}, \dots, x^{(k)}}\big) = \left(\frac{1}{\rho}\right)^{d \cdot k}. \]
By this and by the definition of a Lebesgue integral for elementary functions, we obtain
\[ \sum_{(x^{(1)}, \dots, x^{(k)}) \in (G^{(\rho)})^k} \left(\frac{1}{\rho}\right)^{d \cdot k} D_{r_{(1)}, \dots, r_{(k)}}\big(\varphi^{(\rho)}\big(x^{(1)}\big), \dots, \varphi^{(\rho)}\big(x^{(k)}\big)\big) \]
\[ = \sum_{(x^{(1)}, \dots, x^{(k)}) \in (G^{(\rho)})^k} \nu_{d \times k}\Big(W^{(\rho)}_{\varphi^{(\rho)}(x^{(1)}), \dots, \varphi^{(\rho)}(x^{(k)})}\Big) \cdot D_{r_{(1)}, \dots, r_{(k)}}\big(\varphi^{(\rho)}\big(x^{(1)}\big), \dots, \varphi^{(\rho)}\big(x^{(k)}\big)\big) \]
\[ = \sum_{(x^{(1)}, \dots, x^{(k)}) \in (\varphi^{(\rho)}(G^{(\rho)}))^k} \nu_{d \times k}\big(W^{(\rho)}_{x^{(1)}, \dots, x^{(k)}}\big) \cdot D_{r_{(1)}, \dots, r_{(k)}}\big(x^{(1)}, \dots, x^{(k)}\big) \]
\[ = \int_{V^k} D_{r_{(1)}, \dots, r_{(k)}}\big(\Phi^{(\rho)}\big(x^{(1)}\big), \dots, \Phi^{(\rho)}\big(x^{(k)}\big)\big) \,\mathrm{d}\nu_{d \times k}. \]
Substituting this expression back into equation (15) yields
\[ Z\big(G^{(\rho)}, Q^{(\rho)}\big) = 1 + \sum_{k \in \mathbb{N}_{>0}} \sum_{k_1 + \dots + k_q = k} \left( \prod_{i \in [q]} \frac{\mathrm{e}^{k_i p_i}}{k_i!} \right) \int_{V^k} D_{r_{(1)}, \dots, r_{(k)}}\big(\Phi^{(\rho)}\big(x^{(1)}\big), \dots, \Phi^{(\rho)}\big(x^{(k)}\big)\big) \,\mathrm{d}\nu_{d \times k}. \]
We now express $\big| Z(V, Q) - Z\big(G^{(\rho)}, Q^{(\rho)}\big) \big|$ in terms of the absolute difference of the integrals for all $k \in \mathbb{N}_{>0}$ and all $k_1, \dots, k_q$. Note that the integrals only depend on the resulting assignment of $r_{(1)}, \dots, r_{(k)}$. We fix any set of radii $r_{(1)}, \dots, r_{(k)}$ and write $D$ for $D_{r_{(1)}, \dots, r_{(k)}}$ to simplify notation. We aim for a bound on
\[ \left| \int_{V^k} D\big(x^{(1)}, \dots, x^{(k)}\big) \,\mathrm{d}\nu_{d \times k} - \int_{V^k} D\big(\Phi^{(\rho)}\big(x^{(1)}\big), \dots, \Phi^{(\rho)}\big(x^{(k)}\big)\big) \,\mathrm{d}\nu_{d \times k} \right| \le \int_{V^k} \left| D\big(x^{(1)}, \dots, x^{(k)}\big) - D\big(\Phi^{(\rho)}\big(x^{(1)}\big), \dots, \Phi^{(\rho)}\big(x^{(k)}\big)\big) \right| \mathrm{d}\nu_{d \times k}. \]
Let $N^{(\rho)} \subseteq V^k$ be the set of all $\big(x^{(1)}, \dots, x^{(k)}\big)$ with
\[ D\big(x^{(1)}, \dots, x^{(k)}\big) \ne D\big(\Phi^{(\rho)}\big(x^{(1)}\big), \dots, \Phi^{(\rho)}\big(x^{(k)}\big)\big). \]
Note that $N^{(\rho)}$ actually also depends on the assigned radii. As $D$ is an indicator function, it holds that
\[ \int_{V^k} \left| D\big(x^{(1)}, \dots, x^{(k)}\big) - D\big(\Phi^{(\rho)}\big(x^{(1)}\big), \dots, \Phi^{(\rho)}\big(x^{(k)}\big)\big) \right| \mathrm{d}\nu_{d \times k} = \nu_{d \times k}\big(N^{(\rho)}\big). \]
We construct a superset of $N^{(\rho)}$, of which we calculate the Lebesgue measure. First, note that $N^{(\rho)} = \emptyset$ for $k = 1$, as in this case $D\big(x^{(1)}\big) = D\big(\Phi^{(\rho)}\big(x^{(1)}\big)\big) = 1$ for all $x^{(1)} \in V$. Further, let $K = \big(\ell \sqrt{d}/(2 r_{\min})\big)^d$. Note that, for all $k > K$, at least two particles have distance less than $2 r_{\min}$, meaning that such a configuration always has overlapping particles and $N^{(\rho)} = \emptyset$. We are left with considering $2 \le k \le K$.

We observe that, for all $\big(x^{(1)}, \dots, x^{(k)}\big) \in V^k$ such that
\[ D\big(x^{(1)}, \dots, x^{(k)}\big) \ne D\big(\Phi^{(\rho)}\big(x^{(1)}\big), \dots, \Phi^{(\rho)}\big(x^{(k)}\big)\big), \]
there is a pair of points $x^{(i)}, x^{(j)}$ for $i, j \in [k]$ such that $i \ne j$ and
\[ d\big(x^{(i)}, x^{(j)}\big) < r_{(i)} + r_{(j)} \le d\big(\Phi^{(\rho)}\big(x^{(i)}\big), \Phi^{(\rho)}\big(x^{(j)}\big)\big) \quad \text{or} \quad d\big(x^{(i)}, x^{(j)}\big) \ge r_{(i)} + r_{(j)} > d\big(\Phi^{(\rho)}\big(x^{(i)}\big), \Phi^{(\rho)}\big(x^{(j)}\big)\big). \]
As, for every point $x^{(i)} \in V$, it holds that $d\big(x^{(i)}, \Phi^{(\rho)}\big(x^{(i)}\big)\big) \le \sqrt{d}/\rho$, there is then a pair of points $x^{(i)}, x^{(j)}$ for $i, j \in [k]$ such that $i \ne j$ and
\[ \left| r_{(i)} + r_{(j)} - d\big(x^{(i)}, x^{(j)}\big) \right| \le \frac{2\sqrt{d}}{\rho}. \]
For all $i, j \in [k]$ with $i \ne j$, let $S^{(\rho)}_{i,j} \subseteq V^k$ be the set of points $\big(x^{(1)}, \dots, x^{(k)}\big) \in V^k$ for which this is the case. Then
\[ \nu_{d \times k}\big(N^{(\rho)}\big) \le \nu_{d \times k}\Bigg( \bigcup_{1 \le i < j \le k} S^{(\rho)}_{i,j} \Bigg) \le \sum_{1 \le i < j \le k} \nu_{d \times k}\big(S^{(\rho)}_{i,j}\big). \]
By Fubini's theorem, noting that the condition defining $S^{(\rho)}_{i,j}$ only depends on $x^{(i)}$ and $x^{(j)}$, we get
\[ \nu_{d \times k}\big(S^{(\rho)}_{i,j}\big) = \int_{V^k} \mathbb{1}\left\{ \left| r_{(i)} + r_{(j)} - d\big(x^{(i)}, x^{(j)}\big) \right| \le \frac{2\sqrt{d}}{\rho} \right\} \mathrm{d}\nu_{d \times k} = \ell^{d(k-2)} \int_{V^2} \mathbb{1}\left\{ \left| r_{(i)} + r_{(j)} - d\big(x^{(i)}, x^{(j)}\big) \right| \le \frac{2\sqrt{d}}{\rho} \right\} \mathrm{d}\nu_{d \times 2} \]
\[ \le \ell^{d(k-1)} \cdot v_d \cdot \left( \left( r_{(i)} + r_{(j)} + \frac{2\sqrt{d}}{\rho} \right)^d - \left( r_{(i)} + r_{(j)} - \frac{2\sqrt{d}}{\rho} \right)^d \right). \]
By the assumption $\rho \ge 2\sqrt{d}$, we have $2\sqrt{d}/\rho \le 1$, and the binomial theorem yields
\[ \left( r_{(i)} + r_{(j)} + \frac{2\sqrt{d}}{\rho} \right)^d - \left( r_{(i)} + r_{(j)} - \frac{2\sqrt{d}}{\rho} \right)^d = 2 \sum_{\substack{i' \in [d] \\ i' \text{ odd}}} \binom{d}{i'} \big(r_{(i)} + r_{(j)}\big)^{d - i'} \left( \frac{2\sqrt{d}}{\rho} \right)^{i'} \le \frac{4\sqrt{d}}{\rho} \sum_{\substack{i' \in [d] \\ i' \text{ odd}}} \binom{d}{i'} \big(r_{(i)} + r_{(j)}\big)^{d - i'} \le \frac{4\sqrt{d}}{\rho} \big(r_{(i)} + r_{(j)} + 1\big)^d. \]
Using this bound for $\nu_{d \times k}\big(S^{(\rho)}_{i,j}\big)$, we obtain
\[ \nu_{d \times k}\big(N^{(\rho)}\big) \le v_d \cdot \ell^{d(k-1)} \cdot \frac{4\sqrt{d}}{\rho} \cdot \sum_{1 \le i < j \le k} \big(r_{(i)} + r_{(j)} + 1\big)^d \le v_d \cdot \ell^{d(k-1)} \cdot \frac{4\sqrt{d}}{\rho} \cdot k^2 \cdot (2 r_{\max} + 1)^d, \]
where $r_{\max} = \max_{i \in [q]}\{r_i\}$. Thus, we get
\[ \big| Z(V, Q) - Z\big(G^{(\rho)}, Q^{(\rho)}\big) \big| \le \sum_{k=2}^{K} \sum_{k_1 + \dots + k_q = k} \left( \prod_{i \in [q]} \frac{\mathrm{e}^{k_i p_i}}{k_i!} \right) \nu_{d \times k}\big(N^{(\rho)}_{r_{(1)}, \dots, r_{(k)}}\big) \le \frac{1}{\rho} \sum_{k=2}^{K} \sum_{k_1 + \dots + k_q = k} \left( \prod_{i \in [q]} \frac{\mathrm{e}^{k_i p_i}}{k_i!} \right) \cdot v_d \cdot \ell^{d(k-1)} \cdot 4\sqrt{d} \cdot k^2 \cdot (2 r_{\max} + 1)^d. \]
We simplify the bound further by bounding
\[ \frac{1}{\rho} \sum_{k=2}^{K} \sum_{k_1 + \dots + k_q = k} \left( \prod_{i \in [q]} \frac{\mathrm{e}^{k_i p_i}}{k_i!} \right) \cdot v_d \cdot \ell^{d(k-1)} \cdot 4\sqrt{d} \cdot k^2 \cdot (2 r_{\max} + 1)^d \le \frac{1}{\rho} \cdot v_d \cdot \ell^{d(K-1)} \cdot 4\sqrt{d} \cdot K^2 \cdot (2 r_{\max} + 1)^d \sum_{k=2}^{K} \sum_{k_1 + \dots + k_q = k} \prod_{i \in [q]} \frac{\mathrm{e}^{k_i p_i}}{k_i!}. \]
Applying the multinomial theorem, we obtain
\[ \sum_{k=2}^{K} \sum_{k_1 + \dots + k_q = k} \prod_{i \in [q]} \frac{\mathrm{e}^{k_i p_i}}{k_i!} = \sum_{k=2}^{K} \frac{1}{k!} \big( \mathrm{e}^{p_1} + \dots + \mathrm{e}^{p_q} \big)^k \le \mathrm{e}^{q \mathrm{e}^{p_{\max}}}, \]
where the last inequality follows from the Taylor expansion of $\mathrm{e}^x$ at $0$. Overall, we bound
\[ \big| Z(V, Q) - Z\big(G^{(\rho)}, Q^{(\rho)}\big) \big| \le \frac{1}{\rho} \cdot v_d \cdot \ell^{d(K-1)} \cdot 4\sqrt{d} \cdot K^2 \cdot (2 r_{\max} + 1)^d \cdot \mathrm{e}^{q \mathrm{e}^{p_{\max}}} \le \frac{1}{\rho}\, \mathrm{e}^{\Theta(K d \ln(\ell) + d \ln(r_{\max} + 1) + q \mathrm{e}^{p_{\max}})}. \]
Recalling that we assume $r_{\max} \in O(\ell)$, as mentioned at the beginning of Section 5, we obtain
\[ \big| Z(V, Q) - Z\big(G^{(\rho)}, Q^{(\rho)}\big) \big| \le \rho^{-1} \exp\!\left( \Theta\!\left( \left( \frac{\ell \sqrt{d}}{r_{\min}} \right)^d d \ln(\ell) + q \mathrm{e}^{p_{\max}} \right) \right). \] □

Before we get to our main result for this section, it is useful to have a closer look at $b_d$. As we increase the resolution $\rho$ for our discretization, we scale the radii of the continuous model accordingly to $r_i^{(\rho)} = \rho r_i$. This causes the bound $b_d(\rho r_i)$ to converge to the volume of a sphere of radius $\rho r_i$. The following lemma gives a simple but sufficient bound on the speed of this convergence, which we use to include this effect in our approximation result for the continuous partition function.

◮ Lemma 18.
Let $\delta \in (0, 1]$, $r \in \mathbb{R}_{>0}$, and let $b_d(\rho r)$ denote the number of integer points in a sphere of radius $\rho r$. Then, for all $\rho \ge \big(2\sqrt{d}\big)^d/(\delta r)$, it holds that
\[ b_d(\rho r) \le (1 + \delta) \cdot v_d \cdot (\rho r)^d. \] ◭

Proof.
We start by considering a sphere of radius $\rho r + \sqrt{d}$, centered at the origin. Note that this enlarged sphere contains, for each grid point $(x_1, \dots, x_d)$ in the original sphere, the cubic region $[x_1, x_1 + 1] \times \dots \times [x_d, x_d + 1]$ of volume $1$. Thus, the volume of the enlarged sphere is a trivial upper bound on the number of grid points in the original sphere. Formally, we get
\[ b_d(\rho r) \le v_d \cdot \big(\rho r + \sqrt{d}\big)^d, \]
which we rewrite as
\[ v_d \cdot \big(\rho r + \sqrt{d}\big)^d = v_d \cdot (\rho r)^d + v_d \cdot \sum_{i \in [d]} \binom{d}{i} (\rho r)^{d-i} \sqrt{d}^{\,i}. \]
Further, note that for our choice of $\rho$ it holds that $\rho r \ge 1$. Thus, we get
\[ v_d \cdot (\rho r)^d + v_d \cdot \sum_{i \in [d]} \binom{d}{i} (\rho r)^{d-i} \sqrt{d}^{\,i} \le v_d \cdot (\rho r)^d + v_d \cdot (\rho r)^{d-1} \cdot 2^d \sqrt{d}^{\,d} = v_d \cdot (\rho r)^d \cdot \left( 1 + \frac{\big(2\sqrt{d}\big)^d}{\rho r} \right). \]
We conclude the proof by noting that $\big(2\sqrt{d}\big)^d/(\rho r) \le \delta$. □

We now prove our main statement for the approximation of the partition function of the continuous hard-sphere model. A crucial point of this proof is that the size of the polymer clique cover of the discrete model, for any fixed number of dimensions $d$, only depends on the ratio of $n$ to $r_{\min}$. Both are scaled equally for any resolution $\rho$, which means that the number of polymer cliques in the cover is independent of $\rho$. Thus, although the number of polymers grows with $\rho^d$, the number of polymer cliques and the mixing time of our Markov chain remain fixed, and we only have to argue that the integers that need to be drawn for running the Markov chain do not become too large.

◮ Theorem 19.
Let $(V, Q)$ be a continuous hard-sphere model with $V = [0, \ell)^d$ and $q$ particle types $Q = \{(r_i, p_i)\}_{i \in [q]}$, let $r_{\min} = \min_{i \in [q]}\{r_i\}$, and let $\delta \in (0, 1/r_{\min}]$. Assume that, for all $i \in [q]$, there is an $h_i \in \mathbb{R}_{>0}$ with
\[ \exp\!\left( -\left( \frac{\ell}{r_{\min}} \right)^d \right) \le h_i \le \exp\!\left( \left( \frac{\ell}{r_{\min}} \right)^d \right) \]
such that, for all $j \in [q]$, it holds that
\[ (1 + \delta)\, v_d \sum_{i \in [q]} (r_i + r_j)^d\, \mathrm{e}^{p_i} h_i \le h_j. \tag{16} \]
Then, for each $\varepsilon \in (0, 1]$, there is a randomized $\varepsilon$-approximation of $Z(V, Q)$ computable in time
\[ \operatorname{poly}\!\left( \left( \frac{\ell \sqrt{d}}{r_{\min}} \right)^d \frac{d q + d \ln(d \ell) + d \ln(1/\delta)}{\varepsilon} \right). \] ◭

Proof.
We aim to obtain an $\varepsilon'$-approximation of the discretization $Z\big(G^{(\rho)}, Q^{(\rho)}\big)$ via Lemma 15 for an appropriate $\rho$. To this end, by Lemma 17, choosing
\[ \rho \ge \frac{\big(2\sqrt{d}\big)^d}{\delta r_{\min}} \cdot \frac{\exp\!\Big( \Theta\!\Big( \big( \frac{\ell \sqrt{d}}{r_{\min}} \big)^d d \ln(\ell) + q \mathrm{e}^{p_{\max}} \Big) \Big)}{\varepsilon'} \eqqcolon \xi \]
and noting that $\rho \ge 2\sqrt{d}$ due to $\delta \le 1/r_{\min}$ and $\ell \ge 1$, we know that $Z\big(G^{(\rho)}, Q^{(\rho)}\big)$ is an $\varepsilon'$-approximation of $Z(V, Q)$.

We now show that the discretization $\big(G^{(\rho)}, Q^{(\rho)}\big)$ satisfies the conditions of Lemma 15. Let $r^{(\rho)}_{\min} = \min_{i \in [q]}\big\{r^{(\rho)}_i\big\} = \rho r_{\min}$, and let $n = \rho \ell$ be the number of grid points along each dimension of $G^{(\rho)}$. It is important to note that
\[ \frac{n}{r^{(\rho)}_{\min}} = \frac{\rho \ell}{\rho r_{\min}} = \frac{\ell}{r_{\min}}. \]
Consequently, for all $i \in [q]$, it holds that
\[ \exp\!\left( -\left( \frac{n}{r^{(\rho)}_{\min}} \right)^d \right) \le h_i \le \exp\!\left( \left( \frac{n}{r^{(\rho)}_{\min}} \right)^d \right) \;\leftrightarrow\; \exp\!\left( -\left( \frac{\ell}{r_{\min}} \right)^d \right) \le h_i \le \exp\!\left( \left( \frac{\ell}{r_{\min}} \right)^d \right). \]
Our choice of $\rho$ implies for all $i, j \in [q]$ that $\rho \ge \big(2\sqrt{d}\big)^d/\big(\delta \cdot (r_i + r_j)\big)$. By Lemma 18 and equation (16), we obtain
\[ \sum_{i \in [q]} b_d\big(r^{(\rho)}_i + r^{(\rho)}_j\big)\, \mathrm{e}^{p^{(\rho)}_i} h_i = \sum_{i \in [q]} b_d\big(\rho \cdot (r_i + r_j)\big)\, \frac{\mathrm{e}^{p_i}}{\rho^d}\, h_i \le (1 + \delta)\, v_d \sum_{i \in [q]} (r_i + r_j)^d\, \mathrm{e}^{p_i} h_i \le h_j. \]
Consequently, the conditions of Lemma 15 are satisfied, and it yields a runtime of
\[ \operatorname{poly}\!\left( \left( \frac{n \sqrt{d}}{r^{(\rho)}_{\min}} \right)^d \frac{q + d \ln(n)}{\varepsilon'} \right) = \operatorname{poly}\!\left( \left( \frac{\ell \sqrt{d}}{r_{\min}} \right)^d \frac{q + d \ln(\rho) + d \ln(\ell)}{\varepsilon'} \right) \]
for an $\varepsilon'$-approximation of $Z\big(G^{(\rho)}, Q^{(\rho)}\big)$. By choosing $\rho \in \Theta(\xi)$, we bound
\[ \ln(\rho) \in O\!\left( \ln\!\left( \frac{1}{\delta} \right) + \ln\!\left( \frac{1}{r_{\min}} \right) + d \ln(d) + d \ln(\ell) \left( \frac{\ell \sqrt{d}}{r_{\min}} \right)^d + q \mathrm{e}^{p_{\max}} \right). \]
Further, note that, for all $j \in [q]$, equation (16) implies $v_d \cdot (2 r_j)^d\, \mathrm{e}^{p_j} \le 1$, hence $\mathrm{e}^{p_{\max}} \in O\big( \big( \sqrt{d}/r_{\min} \big)^d \big)$, which leads to a runtime of
\[ \operatorname{poly}\!\left( \left( \frac{\ell \sqrt{d}}{r_{\min}} \right)^d \frac{d q + d \ln(d \ell) + d \ln(1/\delta)}{\varepsilon'} \right). \]
Last, by choosing $\varepsilon' \le \varepsilon/3$, we note that the $\varepsilon'$-approximation of $Z\big(G^{(\rho)}, Q^{(\rho)}\big)$, which itself is an $\varepsilon'$-approximation of $Z(V, Q)$, is an $\varepsilon$-approximation of $Z(V, Q)$, as $(1 + \varepsilon')^2 \le 1 + \varepsilon$ and $(1 - \varepsilon')^2 \ge 1 - \varepsilon$. This concludes the proof. □

$p_i = \ln\big(\lambda/\big(v_d \cdot r_i^d\big)\big)$

We demonstrate the application of Theorem 19 to a specific form of chemical potential. Namely, for each particle type $i \in [q]$, we choose the chemical potential $p_i = \ln\big(\lambda/\big(v_d \cdot r_i^d\big)\big)$, where the parameter $\lambda \in \mathbb{R}_{>0}$ represents some external condition. This is a straightforward generalization of the form of chemical potential that is commonly assumed in the single-component model and, for example, discussed by Guo and Jerrum [22] and Helmuth et al. [26]. The resulting grand canonical partition function takes the form
\[ Z(V, Q) = 1 + \sum_{k \in \mathbb{N}_{>0}} \sum_{k_1 + \dots + k_q = k} \left( \prod_{i \in [q]} \left( \frac{\lambda}{v_d \cdot r_i^d} \right)^{k_i} \cdot \frac{1}{k_i!} \right) \int_{V^k} D_{r_{(1)}, \dots, r_{(k)}}\big(x^{(1)}, \dots, x^{(k)}\big) \,\mathrm{d}\nu_{d \times k}. \]
The goal is to bound the range of $\lambda$ for which an efficient approximation of this partition function is obtained. Our result for this setting is the following statement.

◮ Proposition 20.
Let $(V, Q)$ be a continuous hard-sphere model with $V = [0, \ell)^d$ and $q$ particle types $Q = \{(r_i, p_i)\}_{i \in [q]}$, where $p_i = \ln\big(\lambda/\big(v_d \cdot r_i^d\big)\big)$ for some parameter $\lambda \in \mathbb{R}_{>0}$. Further, let $r_{\min} = \min_{i \in [q]}\{r_i\}$ and $r_{\max} = \max_{i \in [q]}\{r_i\}$, and let $r = r_{\max}/r_{\min}$. If, for some $\delta \in (0, 1]$, it holds that
\[ \lambda \le \frac{1}{1 + \delta} \cdot \left( 2^d + (q - 1) \cdot \left( \frac{r + 1}{\sqrt{r}} \right)^d \right)^{-1}, \]
then, for every $\varepsilon \in (0, 1]$, there is a randomized $\varepsilon$-approximation of $Z(V, Q)$ computable in time
\[ \operatorname{poly}\!\left( \left( \frac{\ell \sqrt{d}}{r_{\min}} \right)^d \frac{d q + d \ln(d \ell) + d \ln(1/\delta)}{\varepsilon} \right). \] ◭

Proof.
We aim to apply Theorem 19. To this end, for all $i \in [q]$, let $h_i = \big(\sqrt{r_i}\big)^d$. Thus, for all $j \in [q]$, it holds that
\[ (1 + \delta)\, v_d \sum_{i \in [q]} (r_i + r_j)^d\, \mathrm{e}^{p_i}\, \frac{h_i}{h_j} \le \left( 2^d + (q - 1) \cdot \left( \frac{r + 1}{\sqrt{r}} \right)^d \right)^{-1} \cdot \sum_{i \in [q]} \frac{(r_i + r_j)^d}{r_i^d} \left( \frac{r_i}{r_j} \right)^{d/2}. \]
Note that, for all $j \in [q]$, we have
\[ \sum_{i \in [q]} \frac{(r_i + r_j)^d}{r_i^d} \left( \frac{r_i}{r_j} \right)^{d/2} = \sum_{i \in [q]} \left( 1 + \frac{r_j}{r_i} \right)^d \left( \frac{r_i}{r_j} \right)^{d/2} = 2^d + \sum_{\substack{i \in [q] \\ i \ne j}} \left( \sqrt{\frac{r_i}{r_j}} + \sqrt{\frac{r_j}{r_i}} \right)^d \]
and that, for all $i, j \in [q]$ with $i \ne j$, it holds that
\[ \left( \sqrt{\frac{r_i}{r_j}} + \sqrt{\frac{r_j}{r_i}} \right)^d \le \left( \sqrt{r} + \frac{1}{\sqrt{r}} \right)^d = \left( \frac{r + 1}{\sqrt{r}} \right)^d. \]
Thus, we have
\[ \left( 2^d + (q - 1) \cdot \left( \frac{r + 1}{\sqrt{r}} \right)^d \right)^{-1} \cdot \sum_{i \in [q]} \frac{(r_i + r_j)^d}{r_i^d} \left( \frac{r_i}{r_j} \right)^{d/2} \le 1, \]
that is, condition (16) is satisfied. Applying Theorem 19 concludes the proof. □

6 Truncation of polymer cliques
In Section 4, we discuss under which assumptions the partition function of a polymer model $\mathcal{P}$ with polymer clique cover $\Lambda$ of size $m$ can be approximated in time polynomial in $m$ (Theorem 12). One of these assumptions requires being able to sample, for all $i \in [m]$, from $\mu|_{\Lambda_i}$ in time $\operatorname{poly}(m)$. Unfortunately, for many algorithmic problems, the number of polymer families of each polymer clique is large, and efficient sampling from $\mu|_{\Lambda_i}$ is non-trivial. However, as we only require approximate samples from $\mu|_{\Lambda_i}$, it is sufficient to ignore polymer families with low probabilities, that is, with low weight.

We formalize this concept rigorously by defining a size function for polymers. We aim to remove polymers of large size (low weight), which still yields a sufficient approximation of $\mu|_{\Lambda_i}$ (Lemma 23). As a consequence, we can still approximate $Z(\mathcal{P})$ in time polynomial in $m$ (Theorem 27). In Section 6.1, we showcase how this new theorem applies to bipartite $\alpha$-expanders with bounded degree.

◮ Definition 21 (size function).
Given a polymer model $(\mathcal{C}, w, \nsim)$, a size function is a function $|\cdot| \colon \mathcal{C} \to \mathbb{R}_{>0}$. For a fixed size function $|\cdot|$ and a polymer $\gamma \in \mathcal{C}$, we call $|\gamma|$ the size of $\gamma$. ◭

Given a size function, we truncate the polymer model to polymers of small size.

◮ Definition 22 (truncation).
Given a polymer model $(\mathcal{C}, w, \nsim)$ equipped with a fixed size function $|\cdot|$ and a set of polymers $B \subseteq \mathcal{C}$, for all $k \in \mathbb{R}$, we call $B^{\le k} = \{\gamma \in B \mid |\gamma| \le k\}$ the truncation of $B$ to size $k$. Further, we write $B^{> k} = B \setminus B^{\le k}$. ◭

Note that $B^{\le k}$ and $B^{> k}$ partition $B$, which implies $B^{\le k}, B^{> k} \subseteq \mathcal{C}$. Thus, we can apply our notions of restricted polymer families, partition function, and Gibbs distribution as stated in Section 2.1 to $B^{\le k}$ and $B^{> k}$ as well. The case $B = \mathcal{C}$ (i.e., we truncate the entire polymer model) plays a special role, which is why we use the shorter notation $\mathcal{F}^{\le k} = \mathcal{F}_{\mathcal{C}^{\le k}}$, $Z^{\le k} = Z_{\mathcal{C}^{\le k}}$, and $\mu^{\le k} = \mu_{\mathcal{C}^{\le k}}$. Analogously, we define $\mathcal{F}^{> k}$, $Z^{> k}$, and $\mu^{> k}$.

◮ Lemma 23 (truncation of polymer cliques).
Let $\mathcal{P} = (\mathcal{C}, w, \nsim)$ be a polymer model, let $\Lambda$ be a polymer clique cover of $\mathcal{P}$ with size $m$, and let $|\cdot|$ be a size function for $\mathcal{P}$. Assume that there are a $k \in \mathbb{R}$ and an $\varepsilon \in (0, 1)$ such that, for all $i \in [m]$, it holds that
\[ \sum_{\gamma \in \Lambda_i^{> k}} w_\gamma \le \frac{\varepsilon}{m}. \tag{17} \]
Then $\mathrm{e}^{-\varepsilon} \le Z^{\le k}/Z \le 1$ and $d_{\mathrm{TV}}\big(\mu, \mu^{\le k}\big) \le \varepsilon$. ◭

Proof.
We start by proving $\mathrm{e}^{-\varepsilon} \le Z^{\le k}/Z \le 1$. Since $Z^{\le k} \le Z$, as removing polymers does not increase the partition function, it remains to show that $Z \le \mathrm{e}^{\varepsilon} Z^{\le k}$.

We observe that $Z \le Z^{\le k} Z^{> k}$, with equality if and only if, for all $\Gamma \in \mathcal{F}^{\le k}$ and all $\Gamma' \in \mathcal{F}^{> k}$, it holds that $\Gamma \cup \Gamma' \in \mathcal{F}$. We proceed by showing that $Z^{> k} \le \mathrm{e}^{\varepsilon}$.

Note that $\mathcal{C}^{> k} = \bigcup_{i \in [m]} \Lambda_i^{> k}$ and that each polymer family in $\mathcal{F}^{> k}$ contains at most one polymer from each $\Lambda_i^{> k}$. Thus, we obtain
\[ Z^{> k} \le \prod_{i \in [m]} Z|_{\Lambda_i^{> k}} = \prod_{i \in [m]} \Bigg( 1 + \sum_{\gamma \in \Lambda_i^{> k}} w_\gamma \Bigg). \]
Due to equation (17), we get $Z^{> k} \le (1 + \varepsilon/m)^m \le \mathrm{e}^{\varepsilon}$, which proves the first claim.

Regarding the second claim, it suffices to see that
\[ d_{\mathrm{TV}}\big(\mu, \mu^{\le k}\big) = \frac{Z - Z^{\le k}}{Z} \le 1 - \mathrm{e}^{-\varepsilon} \le \varepsilon. \] □

Similar to how we required the clique dynamics condition for Theorem 12, we formalize the following condition for a polymer model with a size function.

◮ Condition 24 (clique truncation).
Let $\mathcal{P} = (\mathcal{C}, w, \nsim)$ be a polymer model, let $\Lambda$ be a polymer clique cover of $\mathcal{P}$ with size $m$, and let $|\cdot|$ be a size function for $\mathcal{P}$. For all $i \in [m]$, we say that $\Lambda_i$ satisfies the clique truncation condition for a monotonically increasing, invertible function $g \colon \mathbb{R} \to \mathbb{R}_{>0}$ and a bound $B \in \mathbb{R}_{>0}$ if and only if
\[ \sum_{\gamma \in \Lambda_i} g(|\gamma|)\, w_\gamma \le B. \] ◭

If the clique truncation condition is satisfied, then, by choosing a reasonable value $k$ for truncating a polymer model, only little overall weight is removed. That is, the truncated model represents a good approximation of the original.

◮ Lemma 25.
Let $\mathcal{P} = (\mathcal{C}, w, \nsim)$ be a polymer model, let $\Lambda$ be a polymer clique cover of $\mathcal{P}$ with size $m$, let $|\cdot|$ be a size function for $\mathcal{P}$, and let $i \in [m]$. Assume that $\Lambda_i$ satisfies the clique truncation condition for a function $g$ and a bound $B$. Then, for all $\varepsilon' \in (0, 1)$ and all $k \ge g^{-1}(B/\varepsilon')$, it holds that
\[ \sum_{\gamma \in \Lambda_i^{> k}} w_\gamma \le \varepsilon'. \] ◭

Proof.
Let $\varepsilon' \in (0, 1)$ and $k \ge g^{-1}(B/\varepsilon')$. Due to the clique truncation condition and the monotonicity of $g$, we observe that
\[ g(k) \sum_{\gamma \in \Lambda_i^{> k}} w_\gamma \le \sum_{\gamma \in \Lambda_i^{> k}} g(|\gamma|)\, w_\gamma \le \sum_{\gamma \in \Lambda_i} g(|\gamma|)\, w_\gamma \le B. \]
As $g$ is positive, dividing by $g(k)$ yields $\sum_{\gamma \in \Lambda_i^{> k}} w_\gamma \le B/g(k)$. Substituting our bound for $k$ and noting that $g$ is invertible, we conclude that
\[ \sum_{\gamma \in \Lambda_i^{> k}} w_\gamma \le \frac{B}{g\big(g^{-1}(B/\varepsilon')\big)} = \varepsilon'. \] □
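The argument of Lemma 25 is a Markov-type tail bound and is easy to check numerically. The sketch below uses a toy clique with hypothetical sizes and weights and the size function $g(x) = \mathrm{e}^x$; all concrete numbers are illustrative assumptions, not values from the paper.

```python
import math

# Toy clique: polymer j has size j and weight 4^(-j) (illustrative only).
sizes = list(range(1, 30))
weights = [4.0 ** (-j) for j in sizes]

# Size function g(x) = e^x is monotonically increasing and invertible.
g, g_inv = math.exp, math.log

# B certifies the clique truncation condition: sum of g(|gamma|) * w_gamma.
B = sum(g(s) * w for s, w in zip(sizes, weights))

def tail_weight(k: float) -> float:
    """Total weight of the polymers whose size exceeds k."""
    return sum(w for s, w in zip(sizes, weights) if s > k)

# Lemma 25: truncating at k = g^{-1}(B / eps') leaves tail weight <= eps'.
for eps in (0.5, 0.1, 0.01):
    k = g_inv(B / eps)
    assert tail_weight(k) <= eps
```

Note how the threshold $k$ grows only logarithmically in $1/\varepsilon'$ for this exponential choice of $g$, which is what later makes enumeration of the truncated cliques affordable.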
As a direct consequence of Lemma 25, we get that the partition function of the truncated model is a useful approximation of the original partition function.

◮ Corollary 26.
Let $\mathcal{P} = (\mathcal{C}, w, \nsim)$ be a polymer model, let $\Lambda$ be a polymer clique cover of $\mathcal{P}$ with size $m$, and let $|\cdot|$ be a size function for $\mathcal{P}$. Assume that there are $g \colon \mathbb{R} \to \mathbb{R}_{>0}$ and $B \in \mathbb{R}_{>0}$ such that, for all $i \in [m]$, the polymer clique $\Lambda_i$ satisfies the clique truncation condition for $g$ and $B$. Then, for all $\varepsilon \in (0, 1)$ and all $k \ge g^{-1}(Bm/\varepsilon)$, it holds that $\mathrm{e}^{-\varepsilon} \le Z^{\le k}/Z \le 1$ and $d_{\mathrm{TV}}\big(\mu, \mu^{\le k}\big) \le \varepsilon$. ◭

Proof.
The statement follows directly from Lemmas 23 and 25 by choosing $\varepsilon' = \varepsilon/m$. □

Using the truncated polymer model, we achieve an $\varepsilon$-approximation of the partition function of the original model that is computable in time $\operatorname{poly}(m/\varepsilon)$, similar to Theorem 12.

◮ Theorem 27.
Let $\mathcal{P} = (\mathcal{C}, w, \nsim)$ be a computationally feasible polymer model, let $\Lambda$ be a polymer clique cover of $\mathcal{P}$ with size $m$, and let $|\cdot|$ be a size function for $\mathcal{P}$. Further, let $Z_{\max} = \max_{i \in [m]}\{Z|_{\Lambda_i}\}$, and let $t(k)$ denote an upper bound, for all $i \in [m]$, on the time to enumerate $\Lambda_i^{\le k}$. Last, assume that
(a) $Z_{\max} \in \operatorname{poly}(m)$,
(b) $\mathcal{P}$ satisfies the clique dynamics condition for a function $f$ such that, for all $\gamma \in \mathcal{C}$, it holds that $\mathrm{e}^{-\operatorname{poly}(m)} \le f(\gamma) \le \mathrm{e}^{\operatorname{poly}(m)}$, and
(c) there are $g \colon \mathbb{R} \to \mathbb{R}_{>0}$ and $B \in \mathbb{R}_{>0}$ with $B \in \operatorname{poly}(m)$ and $t\big(g^{-1}(x)\big) \in \operatorname{poly}(x)$ (for all $x \in \mathbb{R}_{>0}$) such that, for all $i \in [m]$, $\Lambda_i$ satisfies the clique truncation condition for $g$ and $B$.
Then, for all $\varepsilon \in (0, 1]$, we can $\varepsilon$-approximately sample from $\mu$ in time $\operatorname{poly}(m/\varepsilon)$, and there is a randomized $\varepsilon$-approximation of $Z$ computable in time $\operatorname{poly}(m/\varepsilon)$. ◭

Note that Observation 10 applies to Theorem 27 as well. That is, by using more restrictive assumptions, assumptions (a) and (b) are satisfied. We proceed with proving the theorem.
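The sampling half of Theorem 27 can be summarized as a loop that repeatedly picks a clique and resamples its contribution from the (truncated) per-clique Gibbs distribution. The Python sketch below is a heavily simplified stand-in, not the paper's chain: the two cliques, their weights, and the assumption that polymers from distinct cliques are always compatible are all hypothetical.

```python
import random

# Toy polymer model: two cliques; within a clique all polymers are
# incompatible, across cliques everything is compatible (an assumption
# made here for simplicity).
cliques = [
    {"a": 0.5, "b": 0.25},  # clique 0 with weights w_gamma
    {"c": 1.0},             # clique 1
]

def sample_from_clique(i: int, rng: random.Random):
    """Sample from the Gibbs distribution restricted to clique i:
    return None (the empty set) with probability 1/Z_i, and polymer
    gamma with probability w_gamma / Z_i otherwise."""
    z = 1.0 + sum(cliques[i].values())
    u = rng.random() * z
    acc = 1.0  # probability mass of the empty configuration
    if u < acc:
        return None
    for gamma, w in cliques[i].items():
        acc += w
        if u < acc:
            return gamma
    return None  # numerical safety net

def step(state: set, rng: random.Random) -> set:
    """One schematic update: pick a clique uniformly at random and
    resample which of its polymers (if any) is present."""
    i = rng.randrange(len(cliques))
    new_state = {g for g in state if g not in cliques[i]}
    candidate = sample_from_clique(i, rng)
    if candidate is not None:
        new_state.add(candidate)
    return new_state

rng = random.Random(0)
state, hits, steps = set(), 0, 50_000
for _ in range(steps):
    state = step(state, rng)
    hits += "a" in state
# With independent cliques, the stationary marginal of "a" is
# w_a / (1 + w_a + w_b) = 0.5 / 1.75, roughly 0.286.
assert abs(hits / steps - 0.5 / 1.75) < 0.03
```

In the paper's setting, the update would additionally reject candidates that are incompatible with the rest of the state, and the per-clique sampler would ignore polymers of size larger than the truncation threshold $k$.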
Proof of Theorem 27.
As in the proof of Theorem 9, we consider the polymer Markov chain $\mathcal{M}(\mathcal{P})$. Further, let $k = g^{-1}(2Bm/\varepsilon)$, let $\mathcal{M}_k$ denote the polymer Markov chain on $\big(\mathcal{C}^{\le k}, w, \nsim\big)$, and let $P_k$ denote its transition matrix. We aim to run $\mathcal{M}_k$ for at least $t^* = \tau_{\mathcal{M}(\mathcal{P})}(\varepsilon/2)$ iterations, starting from $\emptyset \in \mathcal{F}^{\le k}$.

We prove that $d_{\mathrm{TV}}\big(\mu, P_k^{t^*}(\emptyset, \cdot)\big) \le \varepsilon$. By the triangle inequality, we obtain
\[ d_{\mathrm{TV}}\big(\mu, P_k^{t^*}(\emptyset, \cdot)\big) \le d_{\mathrm{TV}}\big(\mu, \mu^{\le k}\big) + d_{\mathrm{TV}}\big(\mu^{\le k}, P_k^{t^*}(\emptyset, \cdot)\big). \]
By our choice of $k$ and by Corollary 26 together with assumption (c), we get that $d_{\mathrm{TV}}\big(\mu, \mu^{\le k}\big) \le \varepsilon/2$. Further, note that truncation preserves the clique dynamics condition for the same function $f$ and does not increase any quantity that is used for bounding the mixing time. Thus, $\tau_{\mathcal{M}_k}(\varepsilon/2) \le \tau_{\mathcal{M}(\mathcal{P})}(\varepsilon/2) = t^*$, and we obtain $d_{\mathrm{TV}}\big(\mu^{\le k}, P_k^{t^*}(\emptyset, \cdot)\big) \le \varepsilon/2$.

It remains to show that the runtime is bounded by $\operatorname{poly}(m/\varepsilon)$. Analogously to the proof of Theorem 9, due to assumptions (a) and (b), we know that $\tau_{\mathcal{M}(\mathcal{P})}(\varepsilon/2) \in \operatorname{poly}(m/\varepsilon)$, which implies $\tau_{\mathcal{M}_k}(\varepsilon/2) \in \operatorname{poly}(m/\varepsilon)$. Also analogously, it holds that each step can be done in time $\operatorname{poly}(m)$, except for sampling, for all $i \in [m]$, from $\mu|_{\Lambda_i}$. However, note that, for all $i \in [m]$, we only need to sample from $\mu|_{\Lambda_i^{\le k}}$. We do so by enumerating $\Lambda_i^{\le k}$ in time $t(k)$. By our choice of $k$ and by assumption (c), this takes time at most
\[ t(k) = t\!\left( g^{-1}\!\left( \frac{2Bm}{\varepsilon} \right) \right) \in \operatorname{poly}\!\left( \frac{m}{\varepsilon} \right), \]
which proves that we can $\varepsilon$-approximately sample from $\mu$ in the desired runtime.

Showing that we can $\varepsilon$-approximate $Z$ in time $\operatorname{poly}(m/\varepsilon)$ is done analogously. By Corollary 26 and assumption (c), we know that for $k = g^{-1}(2Bm/\varepsilon)$ it holds that $\mathrm{e}^{-\varepsilon/2} \le Z^{\le k}/Z \le 1$, which implies $\mathrm{e}^{-\varepsilon/2} Z \le Z^{\le k} \le Z$. As argued above, the truncation of the polymer model $\mathcal{P}$ to this size $k$ satisfies the conditions of Theorem 9, where the sampling from each clique is done by ignoring polymers larger than $k$. Thus, by Theorem 12, we obtain an $(\varepsilon/2)$-approximation of $Z^{\le k}$ in time $\operatorname{poly}(2m/\varepsilon) = \operatorname{poly}(m/\varepsilon)$. Noting that, for $\varepsilon \le 1$, it holds that
\[ 1 - \varepsilon \le \left( 1 - \frac{\varepsilon}{2} \right) \mathrm{e}^{-\varepsilon/2} \quad \text{and} \quad 1 + \frac{\varepsilon}{2} \le 1 + \varepsilon \]
concludes the proof. □

In order to demonstrate how Theorem 27 improves known bounds for the algorithmic use of polymer models, we investigate the hard-core model for high fugacity $\lambda \in \mathbb{R}_{>0}$ on bipartite $\alpha$-expanders with bounded maximum degree $\Delta$. For a graph $G = (V, E)$ and an $S \subseteq V$, let $N_G(S)$ denote the set of all vertices that are adjacent to a vertex in $S$.

◮ Definition 28 (bipartite $\alpha$-expander). Let $G = (V, E)$ be a bipartite graph with partition $V = V_L \cup V_R$. For all $i \in \{L, R\}$, we call $S \subseteq V_i$ small if and only if $|S| \le |V_i|/2$. For all $\alpha \in (0, 1)$, graph $G$ is a bipartite $\alpha$-expander if and only if, for all small sets of vertices $S$, it holds that $|N_G(S)| \ge (1 + \alpha)|S|$. ◭

For any graph $G$, the hard-core partition function is a graph polynomial in a parameter $\lambda \in \mathbb{R}_{>0}$, called the fugacity. Let $\mathcal{I}_G$ be the set of all independent sets in $G$. The hard-core partition function for fugacity $\lambda$ is now formally defined as
\[ Z(G, \lambda) = \sum_{I \in \mathcal{I}_G} \lambda^{|I|}. \]
We approximate $Z(G, \lambda)$ in terms of the partition functions of two polymer models, constructed as proposed by Jenssen et al. [30]. For a bipartite $\alpha$-expander $G$ with bounded degree $\Delta$, we consider the graph $G^2$, which is the graph with vertices $V$ and an edge between $v, u \in V$ if $v$ and $u$ have distance at most $2$ in $G$. For all $i \in \{L, R\}$, we define a polymer model $\mathcal{P}^{(i)} = \big(\mathcal{C}^{(i)}, w^{(i)}, \nsim\big)$ as follows:
• each polymer $\gamma \in \mathcal{C}^{(i)}$ is defined by a non-empty set of vertices $\gamma \subseteq V_i$ such that $\gamma$ is small and induces a connected subgraph in $G^2$,
• for $\gamma \in \mathcal{C}^{(i)}$, let $w^{(i)}_\gamma = \lambda^{|\gamma|}/(1 + \lambda)^{|N_G(\gamma)|}$, and
• two polymers $\gamma, \gamma' \in \mathcal{C}^{(i)}$ are incompatible if and only if there are vertices $v \in \gamma$, $w \in \gamma'$ with graph distance at most $1$ in $G^2$.
To ease notation, for all $i \in \{L, R\}$, we write $\mu^{(i)}$ and $Z^{(i)}$ instead of $\mu\big(\mathcal{P}^{(i)}\big)$ and $Z\big(\mathcal{P}^{(i)}\big)$, respectively.

We use $\mathcal{P}^{(L)}$ and $\mathcal{P}^{(R)}$ for approximating the hard-core partition function of bipartite $\alpha$-expanders in the following sense.

◮ Lemma 29 ([30, Lemma ]). Given a bipartite $\alpha$-expander $G = (V_L \cup V_R, E)$ with $|V_L \cup V_R| = n$, let $Z(G, \lambda)$ denote its hard-core partition function with fugacity $\lambda \in \mathbb{R}_{>0}$, and let the polymer models $\mathcal{P}^{(L)}, \mathcal{P}^{(R)}$ be defined as above. For all $\lambda \ge \mathrm{e}/\alpha$, it holds that
\[ \big(1 - \mathrm{e}^{-n}\big) Z(G, \lambda) \le (1 + \lambda)^{|V_R|} Z^{(L)} + (1 + \lambda)^{|V_L|} Z^{(R)} \le \big(1 + \mathrm{e}^{-n}\big) Z(G, \lambda). \] ◭

To apply Theorem 27, we have to fix a polymer clique cover $\Lambda$ for each polymer model $\mathcal{P}^{(i)}$ with $i \in \{L, R\}$.
Based on the incompatibility relation, a natural choice is to define, for each v ∈ V_i, a clique Λ_v such that γ ∈ Λ_v if and only if v ∈ γ. As we need to verify the clique dynamics condition, it is useful to have a bound on the number of incompatible polymers, which the following lemma provides.

◮ Lemma 30 ([3]). For an undirected graph G = (V, E) with maximum degree ∆ and for all v ∈ V, the number of connected vertex-induced subgraphs that contain v and have at most k ∈ ℕ_{>0} vertices is bounded from above by e^k ∆^{k−1} / (k^{3/2} √π). ◭

Commonly, the bound (e∆)^{k−1}/2 is used instead, which holds for all k ≥ 2. Further, note that the original paper used a weaker bound, namely (e∆)^k. Although this bound holds for all k ∈ ℕ_{>0}, it yields a much worse dependency on ∆. For a fair comparison, we added the result of refined calculations for the approach by Jenssen et al. [30] to Table 1.

Note that the choice of the function f used in the clique dynamics condition is very sensitive to the bound on the number of subgraphs. For the bound stated in Lemma 30, it turns out that using f(γ) = |γ| yields the best bounds on λ (see the proof of Proposition 32 for details). With this choice of f, the condition that we identified in Observation 10 is similar to the mixing condition of Chen et al. [7, Definition 1], except that we do not require a strict inequality. Further, note that such a choice of f is not possible for the Kotecký–Preiss condition [36]. If purely exponential bounds on the number of subgraphs are used, the best results are usually obtained by setting f to take an exponential form. A detailed understanding of how to choose f might be of interest for applications to specific graph classes and other combinatorial structures.

In order to apply truncation, we further need a notion of size for polymers. An obvious choice is to set the size of a polymer γ to its cardinality |γ|. The following lemma then bounds the time for enumerating polymers in a clique up to some size k ∈ ℕ_{>0}.

◮ Lemma 31 ([42]). Let G = (V, E) be an undirected graph with maximum degree ∆, and let v ∈ V. There is an algorithm that enumerates all connected, vertex-induced subgraphs of G that contain v and have at most k ∈ ℕ_{>0} vertices in time e^{O(k log ∆)}. ◭

We now prove our bound on λ for an efficient approximation of the hard-core partition function on bipartite α-expanders. Most of the calculations are similar to those of Jenssen et al. [30], except that we use our newly obtained conditions.

◮ Proposition 32.
Let G = (V_L ∪ V_R, E) be a bipartite α-expander with |V_L ∪ V_R| = n and with maximum degree ∆ ∈ ℕ_{>0}. For λ ≥ max{(e∆²/0.66)^{1/α}, e/α} and for all ε ∈ (0, 1], there is an FPRAS for Z(G, λ) with runtime (n/ε)^{O(ln ∆)}. ◭

Proof. If ε ∈ O(e^{−n}), we compute Z(G, λ) by enumerating all independent sets. Since there are at most 2^n independent sets, which is polynomial in 1/ε, the statement then follows. It remains to analyze the case ε ∈ Ω(e^{−n}). To this end, assume that ε ≥ 4e^{−n}.

By Lemma 29, Z(G, λ) can be e^{−n}-approximated using Z^(L) and Z^(R). We aim for an ε/4-approximation of Z^(L) and Z^(R), each with failure probability at most 1 − √(3/4). Note that

1 − ε ≤ (1 − e^{−n})(1 − ε/4)² and (1 + e^{−n})(1 + ε/4)² ≤ 1 + ε.

Thus, with probability at least (√(3/4))² = 3/4, we obtain an ε-approximation of Z(G, λ). We can obtain the desired error probability of at most 1 − √(3/4) for Z^(L) and Z^(R) by taking the median of O(ln(1/(1 − √(3/4)))) = O(1) independent approximations with failure probability at most 1/4 each.

Let i ∈ {L, R}. In order to approximate Z^(i), we aim to apply Theorem 27. To this end, for all v ∈ V_i, we define a polymer clique Λ_v containing all polymers γ ∈ C^(i) with v ∈ γ. This results in a polymer clique cover of size n.

We proceed by proving that the polymer model satisfies the clique dynamics condition for f(γ) = |γ|. We use Observation 10 to simplify this step. This also implies that assumption (a) of Theorem 27 is satisfied. For any γ ∈ C^(i), we start by bounding the sum over the polymers γ′ ≁ γ by

Σ_{γ′ ∈ C^(i): γ′ ≁ γ} f(γ′) w_{γ′}^(i) ≤ Σ_{v ∈ N_{G²}(γ)} Σ_{γ′ ∈ Λ_v} f(γ′) w_{γ′}^(i) = Σ_{v ∈ N_{G²}(γ)} Σ_{k ∈ ℕ_{>0}} Σ_{γ′ ∈ Λ_v: |γ′| = k} f(γ′) w_{γ′}^(i).

Because G is a bipartite α-expander, for all γ ∈ C^(i), we have w_γ^(i) ≤ 1/λ^{α|γ|}. Further, note that the maximum degree of G² is bounded by ∆². By Lemma 30 and our definition of f, we obtain

Σ_{v ∈ N_{G²}(γ)} Σ_{k ∈ ℕ_{>0}} Σ_{γ′ ∈ Λ_v: |γ′| = k} f(γ′) w_{γ′}^(i) ≤ ∆²|γ| Σ_{k ∈ ℕ_{>0}} (e^k (∆²)^{k−1} / (k^{3/2} √π)) · k · λ^{−αk} = (|γ|/√π) Σ_{k ∈ ℕ_{>0}} (e∆²/λ^α)^k / √k.

For λ ≥ (e∆²/0.66)^{1/α}, we get

(|γ|/√π) Σ_{k ∈ ℕ_{>0}} (e∆²/λ^α)^k / √k ≤ (|γ|/√π) Σ_{k ∈ ℕ_{>0}} (0.66)^k / √k ≤ (|γ|/√π) · √π = f(γ).

It remains to show, for all v ∈ V_i, that Λ_v satisfies the clique truncation condition for a g: ℝ → ℝ_{>0} and a B ∈ ℝ_{>0}. To this end, for all γ ∈ C^(i), let the size of γ be |γ|, let g(|γ|) = e^{0.1|γ|}, and let B = 1. Analogously to our verification of the clique dynamics condition, we see, for all v ∈ V_i, that

Σ_{γ ∈ Λ_v} g(|γ|) w_γ^(i) ≤ (1/(∆²√π)) Σ_{k ∈ ℕ_{>0}} (e∆²/λ^α)^k e^{0.1k} / k^{3/2} = (1/(∆²√π)) Σ_{k ∈ ℕ_{>0}} (e^{1.1}∆²/λ^α)^k / k^{3/2}.

For λ ≥ (e∆²/0.66)^{1/α}, we get

(1/(∆²√π)) Σ_{k ∈ ℕ_{>0}} (e^{1.1}∆²/λ^α)^k / k^{3/2} ≤ (1/(∆²√π)) Σ_{k ∈ ℕ_{>0}} (0.66 e^{0.1})^k / k^{3/2} < (1/(∆²√π)) · 1.1 ≤ B.

Last, we bound the runtime of the FPRAS. By Lemma 31, we can enumerate each polymer clique up to size k in time t(k) ∈ e^{O(k log ∆)}. As g^{−1}: x ↦ 10 ln(x), we have t ∘ g^{−1}: x ↦ x^{O(ln ∆)}, which is polynomial for ∆ ∈ Θ(1). For the runtime bound, note that we truncate to size k = g^{−1}(n/ε). Thus, the time for computing each step of the polymer Markov chain is bounded by t(k) = (n/ε)^{O(ln ∆)}, which dominates the runtime. □

The results for the remaining applications in Table 1 are derived via similar calculations. For the Potts model on expander graphs and the hard-core model on unbalanced bipartite graphs, we use Lemmas 30 and 31 together with the same function f for the clique dynamics condition as in the proof of Proposition 32. For the perfect matching polynomial, we use the bounds for the number of polymers and for polymer enumeration that are stated by Casel et al. [6], and we choose f(γ) = e^{a|γ|} for a suitable constant a.

References

[1] Ivona Bezáková, Andreas Galanis, Leslie Ann Goldberg, and Daniel Stefankovic. “Inapproximability of the independent set polynomial in the complex plane.” In:
Proc. of STOC’18. 2018, pp. 1234–1240. doi: 10.1145/3188745.3188788 (see page 5).
[2] Christian Borgs, Jennifer T. Chayes, Tyler Helmuth, Will Perkins, and Prasad Tetali. “Efficient sampling and counting algorithms for the Potts model on Z^d at all temperatures.” In: Proc. of STOC’20. 2020, pp. 738–751. doi: 10.1145/3357713.3384271 (see page 3).
[3] Christian Borgs, Jennifer Chayes, Jeff Kahn, and László Lovász. “Left and right convergence of graphs with bounded degree.” In: Random Structures & Algorithms. doi: 10.1002/rsa.20414 (see page 40).
[4] Tomáš Boublik, Ivo Nezbeda, and Karel Hlavaty. Statistical thermodynamics of simple liquids and their mixtures. Fundamental Studies in Engineering. Elsevier, 1980. isbn: 9780444416995 (see page 7).
[5] Sarah Cannon and Will Perkins. “Counting independent sets in unbalanced bipartite graphs.” In: Proc. of SODA’20. 2020, pp. 1456–1466. doi: 10.1137/1.9781611975994.88 (see pages 3, 8).
[6] Katrin Casel, Philipp Fischbeck, Tobias Friedrich, Andreas Göbel, and J. A. Gregor Lagodzinski. “Zeros and approximations of holant polynomials on the complex plane.” In: CoRR abs/1905.03194 (2019). url: http://arxiv.org/abs/1905.03194 (see pages 3, 8, 42).
[7] Zongchen Chen, Andreas Galanis, Leslie Ann Goldberg, Will Perkins, James Stewart, and Eric Vigoda. “Fast algorithms at low temperatures via Markov chains.” In: Proc. of APPROX/RANDOM’19. 2019, 41:1–41:14. doi: 10.4230/LIPIcs.APPROX-RANDOM.2019.41 (see pages 4, 5, 7, 16, 40).
[8] Giulio Cimini, Tiziano Squartini, Fabio Saracco, Diego Garlaschelli, Andrea Gabrielli, and Guido Caldarelli. “The statistical physics of real-world networks.” In: Nature Reviews Physics. doi: 10.1038/s42254-018-0002-6 (see page 2).
[9] Henry Cohn. “A conceptual breakthrough in sphere packing.” In: Notices of the American Mathematical Society 64 (2017), pp. 102–115. doi: 10.1090/noti1474 (see page 8).
[10] Henry Cohn, Abhinav Kumar, Stephen Miller, Danylo Radchenko, and Maryna Viazovska. “The sphere packing problem in dimension 24.” In: Annals of Mathematics 185 (2016), pp. 1017–1033. doi: 10.4007/annals.2017.185.3.8 (see page 8).
[11] Roland L. Dobrushin. “Estimates of semi-invariants for the Ising model at low temperatures.” In: Translations of the American Mathematical Society–Series 2 177 (1996), pp. 59–82 (see pages 4, 16).
[12] Martin Dyer and Catherine Greenhill. “A more rapidly mixing Markov chain for graph colorings.” In: Random Structures & Algorithms. doi: 10.1002/(SICI)1098-2418(199810/12)13:3/4<285::AID-RSA6>3.0.CO;2-R (see pages 6, 47).
[13] Martin Dyer and Catherine Greenhill. “On Markov chains for independent sets.” In: Journal of Algorithms. doi: 10.1006/jagm.1999.1071 (see page 5).
[14] Roberto Fernández, Pablo A. Ferrari, and Nancy L. Garcia. “Loss network representation of Peierls contours.” In: Annals of Probability. doi: 10.1214/aop/1008956697 (see page 4).
[15] Roberto Fernández and Aldo Procacci. “Cluster expansion for abstract polymer models. New bounds from an old approach.” In: Communications in Mathematical Physics. doi: 10.1007/s00220-007-0279-2 (see pages 4–6, 16).
[16] Sacha Friedli and Yvan Velenik. Statistical Mechanics of Lattice Systems: A Concrete Mathematical Introduction. Cambridge University Press, 2017. isbn: 978-1-107-18482-4. doi: 10.1017/9781316882603 (see page 2).
[17] Andreas Galanis, Qi Ge, Daniel Stefankovic, Eric Vigoda, and Linji Yang. “Improved inapproximability results for counting independent sets in the hard-core model.” In: Random Structures and Algorithms. doi: 10.1002/rsa.20479 (see page 5).
[18] Andreas Galanis, Leslie Ann Goldberg, and Daniel Stefankovic. “Inapproximability of the independent set polynomial below the Shearer threshold.” In: Proc. of ICALP’17. Vol. 80. 2017, 28:1–28:13. doi: 10.4230/LIPIcs.ICALP.2017.28 (see page 5).
[19] Andreas Galanis, Leslie Ann Goldberg, and James Stewart. “Fast algorithms for general spin systems on bipartite expanders.” In: Proc. of MFCS’20. To appear. 2020. url: https://arxiv.org/abs/2004.13442 (see page 3).
[20] Sam Greenberg, Amanda Pascoe, and Dana Randall. “Sampling biased lattice configurations using exponential metrics.” In: Proc. of SODA’09. 2009, pp. 76–85. doi: 10.1137/1.9781611973068.9 (see pages 6, 11, 47, 48).
[21] Sam Greenberg, Dana Randall, and Amanda Pascoe Streib. “Sampling biased monotonic surfaces using exponential metrics.” In: CoRR abs/1704.07322 (2017). url: https://arxiv.org/abs/1704.07322 (see pages 48, 49).
[22] Heng Guo and Mark Jerrum. “Perfect simulation of the hard disks model by partial rejection sampling.” In: CoRR abs/1801.07342 (2018). url: https://arxiv.org/abs/1801.07342v2 (see pages 8, 9, 34).
[23] Thomas Hales. “A proof of the Kepler conjecture.” In: Annals of Mathematics 162 (2005), pp. 1065–1185. doi: 10.4007/annals.2005.162.1065 (see page 8).
[24] Jean-Pierre Hansen and Ian R. McDonald, eds. Theory of Simple Liquids. Fourth Edition. Academic Press, 2013. isbn: 978-0-12-387032-2. doi: 10.1016/B978-0-12-387032-2.00013-1 (see page 7).
[25] Thomas P. Hayes and Cristopher Moore. “Lower bounds on the critical density in the hard disk model via optimized metrics.” In: CoRR abs/1407.1930 (2014). url: http://arxiv.org/abs/1407.1930 (see page 8).
[26] Tyler Helmuth, Will Perkins, and Samantha Petti. “Correlation decay for hard spheres via Markov chains.” In: CoRR abs/2001.05323 (2020). url: https://arxiv.org/abs/2001.05323 (see pages 8, 9, 34).
[27] Tyler Helmuth, Will Perkins, and Guus Regts. “Algorithmic Pirogov–Sinai theory.” In: Proc. of STOC’19. 2019, pp. 1009–1020. doi: 10.1145/3313276.3316305 (see pages 2–4).
[28] Ernst Ising. “Contribution to the theory of ferromagnetism.” In: Zeitschrift für Physik. doi: 10.1007/BF02980577 (see page 2).
[29] Matthew Jenssen, Felix Joos, and Will Perkins. “On the hard sphere model and sphere packings in high dimensions.” In: Forum of Mathematics, Sigma. doi: 10.1017/fms.2018.25 (see page 8).
[30] Matthew Jenssen, Peter Keevash, and Will Perkins. “Algorithms for #BIS-hard problems on expander graphs.” In: Proc. of SODA’19. 2019, pp. 2235–2247. doi: 10.1137/1.9781611975482.135 (see pages 3, 8, 39–41).
[31] Mark Jerrum. Counting, sampling and integrating: algorithms and complexity. Springer Science & Business Media, 2003. doi: 10.1007/978-3-0348-8005-3 (see pages 19, 20).
[32] Mark Jerrum, Leslie G. Valiant, and Vijay V. Vazirani. “Random generation of combinatorial structures from a uniform distribution.” In: Theoretical Computer Science 43 (1986), pp. 169–188. doi: 10.1016/0304-3975(86)90174-X (see page 18).
[33] Ravi Kannan, Michael W. Mahoney, and Ravi Montenegro. “Rapid mixing of several Markov chains for a hard-core model.” In: Proc. of ISAAC’03. 2003, pp. 663–675. doi: 10.1007/978-3-540-24587-2_68 (see page 8).
[34] Wilfrid S. Kendall. “Perfect Simulation for the Area-Interaction Point Process.” In: Probability Towards 2000. Springer New York, 1998, pp. 218–234. isbn: 978-1-4612-2224-8. doi: 10.1007/978-1-4612-2224-8_13 (see page 8).
[35] Wilfrid S. Kendall and Jesper Møller. “Perfect simulation using dominating processes on ordered spaces, with application to locally stable point processes.” In: Advances in Applied Probability. doi: 10.1239/aap/1013540247 (see page 8).
[36] Roman Kotecký and David Preiss. “Cluster expansion for abstract polymer models.” In: Communications in Mathematical Physics. doi: 10.1007/BF01211762 (see pages 3–5, 16, 40).
[37] David A. Levin and Yuval Peres. Markov chains and mixing times. Vol. 107. American Mathematical Society, 2017. isbn: 978-1470429621 (see pages 47, 48).
[38] Chao Liao, Jiabao Lin, Pinyan Lu, and Zhenyu Mao. “Counting independent sets and colorings on random regular bipartite graphs.” In: CoRR abs/1903.07531 (2019). url: http://arxiv.org/abs/1903.07531 (see page 3).
[39] Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller. “Equation of state calculations by fast computing machines.” In: The Journal of Chemical Physics. doi: 10.1063/1.1699114 (see page 8).
[40] Michael Mitzenmacher and Eli Upfal. Probability and computing: randomization and probabilistic techniques in algorithms and data analysis. 2nd ed. Cambridge University Press, 2017. isbn: 110715488X (see page 49).
[41] Sarat Babu Moka, Sandeep Juneja, and Michel R. H. Mandjes. “Analysis of perfect sampling methods for hard-sphere models.” In: SIGMETRICS Performance Evaluation Review. doi: 10.1145/3199524.3199536 (see page 8).
[42] Viresh Patel and Guus Regts. “Deterministic polynomial-time approximation algorithms for partition functions and graph polynomials.” In: Electronic Notes in Discrete Mathematics 61 (2017), pp. 971–977. doi: 10.1016/j.endm.2017.07.061 (see pages 3, 7, 41).
[43] Will Perkins. “Birthday inequalities, repulsion, and hard spheres.” In: Proceedings of the American Mathematical Society 144 (2015), pp. 2635–2649. doi: 10.1090/proc/13028 (see page 8).
[44] Han Peters and Guus Regts. “On a conjecture of Sokal concerning roots of the independence polynomial.” In: Michigan Mathematical Journal. doi: 10.1307/mmj/1541667626 (see page 3).
[45] Allan Sly. “Computational transition at the uniqueness threshold.” In: Proc. of FOCS’10. 2010, pp. 287–296. doi: 10.1109/FOCS.2010.34 (see page 5).
[46] Andreas Strömbergsson and Anders Södergren. “On the generalized circle problem for a random lattice in large dimension.” In: Advances in Mathematics 345 (2019), pp. 1042–1074. doi: 10.1016/j.aim.2019.01.034 (see page 26).
[47] Eric Vigoda. “A note on the Glauber dynamics for sampling independent sets.” In: Electronic Journal of Combinatorics. doi: 10.37236/1552 (see page 9).
[48] Dror Weitz. “Counting independent sets up to the tree threshold.” In: Proc. of STOC’06. 2006, pp. 140–149. doi: 10.1145/1132516.1132538 (see page 3).
[49] H. Peyton Young. Individual strategy and social structure: An evolutionary theory of institutions. Princeton University Press, 1998. doi: 10.2307/j.ctv10h9d35 (see page 2).
[50] Ihor Yukhnovskii and Oksana Patsahan. “Grand canonical distribution for multicomponent system in the collective variables method.” In: Journal of Statistical Physics. doi: 10.1007/BF02179251 (see page 22).

Appendix
We discuss Theorem 2 in detail. First, we explain why the assumptions of the original theorem by Greenberg et al. [20, Theorem 3.3] are insufficient. With Example 33, we provide a counterexample. Last, we prove our version of the theorem.

Besides some minor generalizations, the most important difference between Theorem 2 and Theorem 3.3 of Greenberg et al. [20] is that we require the expected change of δ as well as the probability bound to hold for all pairs of states. In contrast, Greenberg et al. [20] claim that it is sufficient if these properties hold for neighboring states with respect to some adjacency structure. In what follows, we argue that this does not always suffice.

It is well known that couplings on adjacent states can be extended to all pairs of states such that the expected decrease of δ for adjacent states implies an expected decrease for all pairs of states [12]. However, a similar argument does not necessarily hold for bounds on the probability that δ changes by at least a certain amount. More precisely, it is possible to construct a Markov chain and a coupling such that

Pr[|δ(X_{t+1}, Y_{t+1}) − δ(x, y)| ≥ η δ(x, y) | X_t = x, Y_t = y] ≥ κ

holds for all pairs of adjacent states x, y ∈ Ω but not for all pairs of non-adjacent states. Thus, Theorem 3.3 of Greenberg et al. [20] would bound the mixing time of the chain in Example 33, which has a state space of size n, by O(ln(n) ln(1/ε)). This contradicts the lower bound of Ω(n ln(1/ε)) that results from the diameter of the state space [37, Chapter 7.1.2].

◮ Example 33. We consider a symmetric random walk on a cycle of length n ∈ ℕ_{>0} (i.e., Ω = {0} ∪ [n − 1]). In what follows, let all additions and subtractions be modulo n. In order to have the desired self-loop probability, we define the transitions P, for all x ∈ Ω, by P(x, x) = 1/2 and P(x, x + 1) = P(x, x − 1) = 1/4. Two states x, y ∈ Ω with x ≠ y are adjacent if and only if x = y + 1 or x = y − 1. We define δ to be the shortest-path distance in the cycle. Note that, for all x, y ∈ Ω with x ≠ y, it holds that δ(x, y) ∈ [1, ⌊n/2⌋].

Let (X_t)_{t∈ℕ} and (Y_t)_{t∈ℕ} be two copies of the chain (Ω, P), and let x, y ∈ Ω be adjacent. Without loss of generality, assume x = y + 1. For X_t = x, Y_t = y, we construct the following coupling:

• With probability 1/4, choose X_{t+1} = x and Y_{t+1} = x, resulting in δ(X_{t+1}, Y_{t+1}) = 0.
• With probability 1/4, choose X_{t+1} = y and Y_{t+1} = y, resulting in δ(X_{t+1}, Y_{t+1}) = 0.
• With probability 1/4, choose X_{t+1} = x and Y_{t+1} = y, resulting in δ(X_{t+1}, Y_{t+1}) = 1.
• With probability 1/4, choose X_{t+1} = x + 1 and Y_{t+1} = y − 1, resulting in δ(X_{t+1}, Y_{t+1}) = 3.

Note that E[δ(X_{t+1}, Y_{t+1}) | X_t = x, Y_t = y] = 1 = δ(x, y). For η = 0.999 and κ = 3/4, it holds that

Pr[|δ(X_{t+1}, Y_{t+1}) − δ(x, y)| ≥ η δ(x, y) | X_t = x, Y_t = y] ≥ κ.

Hence, the assumptions of Theorem 3.3 of Greenberg et al. [20] are satisfied for all pairs of adjacent states, and the theorem would imply a mixing time in O(ln(n) ln(1/ε)), which contradicts the linear lower bound stated by Levin and Peres [37, Chapter 7.1.2]. ◭

Note that Example 33 is not a counterexample for Theorem 2, as there are, for all η ∈ ω(1/n), non-adjacent states x, y ∈ Ω with

Pr[|δ(X_{t+1}, Y_{t+1}) − δ(x, y)| ≥ η δ(x, y) | X_t = x, Y_t = y] = 0.

Proof of our version
We closely follow the proof of Greenberg et al. [20]. Central to this is the following theorem, which we present in a slightly different fashion than Greenberg et al. [21, Lemma 3.5].

◮ Theorem 34. Let d, D ∈ ℝ with d ≤ D, let q ∈ [d, D], and let (S_t)_{t∈ℕ} be a stochastic process adapted to a filtration (F_t)_{t∈ℕ}. Further, let T = inf{t ∈ ℕ | S_t ≤ q}. Assume that, for all t ∈ ℕ, it holds that S_t · 1{t ≤ T} ∈ [d, D], that

E[S_{t+1} · 1{t < T} | F_t] ≤ S_t · 1{t < T},  (18)

and that there is a Q ∈ ℝ_{>0} such that

E[(S_{t+1} − S_t)² · 1{t < T} | S_t] ≥ Q · 1{t < T}.  (19)

Then

E[T] ≤ (E[(D − S_T)²] − E[(D − S_0)²]) / Q. ◭

Different to the original theorem by Greenberg et al. [21, Lemma 3.5], we include a filtration, indicator functions, and define the predicate of the stopping time via an inequality. Our reasons are as follows. The proof of Theorem 34 aims to apply the optional-stopping theorem for submartingales. A submartingale is, by definition, a stochastic process (Z_t)_{t∈ℕ} adapted to a filtration (F_t)_{t∈ℕ} such that, for all t ∈ ℕ, the expectation of Z_t is finite and E[Z_{t+1} | F_t] ≥ Z_t. It is important to note that the expectation E[Z_{t+1} | F_t] is itself a random variable and that the inequality E[Z_{t+1} | F_t] ≥ Z_t is stronger than E[Z_{t+1}] ≥ E[Z_t] (which follows by the law of total expectation). Hence, we require a filtration.

Second, the indicator functions make sure that equations (18) and (19) (and the boundedness of S) only have to hold as long as S did not stop. Afterward, they are trivially satisfied. This is important, as S is bounded from below by d and its expectation does not increase. Assume that we did not use indicator functions. If there is a t ∈ ℕ such that S_t = d, then S_{t+1} = d holds as well, as otherwise the inequality E[S_{t+1}] ≤ E[S_t] (which follows by the law of total expectation from equation (18)) does not hold. However, this implies that E[(S_{t+1} − S_t)² | F_t] = 0, which contradicts equation (19).

Last, the inequality S_t ≤ q in the definition of T is important, as S does not need to take on exactly q. If S never takes on the value q, equation (19) may eventually not hold, due to the same argument as in the previous paragraph. Using the inequality in the definition of T guarantees that E[T] is finite.

Note that our additional assumptions in Theorem 34 only fix issues in the proof of Greenberg et al. [21, Lemma 3.5]. The proof itself remains mostly unchanged.

While we state Theorem 34 in an elaborate fashion, we use it in a slightly different way in the following proof of Theorem 2. First, in order to ease notation, we ignore the indicator functions and check equations (18) and (19) for values of t ∈ ℕ such that t < T is true. Second, instead of using a filtration and calculating expectations that are random variables, such as E[S_{t+1} | F_t] (ignoring the indicator function), we use normal expectations but do so for every possible outcome of S_t. That is, we make sure that equations (18) and (19) are satisfied pointwise. Since we only consider countable state spaces, this approach is valid.

Proof of Theorem 2.
We aim to bound the expected time until δ hits 0 for the coupled copies (X_t)_{t∈ℕ} and (Y_t)_{t∈ℕ} of M and for all pairs of starting states x, y ∈ Ω. This results in a bound on the expected coupling time and, because M is ergodic, also bounds τ_M (see, for example, Chapter 11 by Mitzenmacher and Upfal [40] for a detailed discussion).

We start by defining a scaled potential δ′ such that, for all x, y ∈ Ω, it holds that δ′(x, y) = δ(x, y)/d. Note that δ′ takes values in {0} ∪ [1, D/d] and that, for all t ∈ ℕ, it holds that X_t = Y_t ↔ δ(X_t, Y_t) = 0 ↔ δ′(X_t, Y_t) = 0. Further, for all x, y ∈ Ω, by the linearity of expectation and by equation (3), it holds that

E[δ′(X_{t+1}, Y_{t+1}) | X_t = x, Y_t = y] = (1/d) E[δ(X_{t+1}, Y_{t+1}) | X_t = x, Y_t = y] ≤ (1/d) δ(x, y) = δ′(x, y),

and, by equation (4), that

Pr[|δ′(X_{t+1}, Y_{t+1}) − δ′(x, y)| ≥ η δ′(x, y) | X_t = x, Y_t = y] = Pr[|δ(X_{t+1}, Y_{t+1}) − δ(x, y)| ≥ η δ(x, y) | X_t = x, Y_t = y] ≥ κ.

We define the stochastic processes whose expected hitting times we bound as follows. For all x, y ∈ Ω, let (φ^{xy}_t)_{t∈ℕ}, where φ^{xy}_t = δ′(X_t, Y_t), given X_0 = x, Y_0 = y. Further, for all x ∈ {0} ∪ [1, D/d], let

ln̄(x) = x − 1 if x ∈ [0, 1), and ln̄(x) = ln(x) if x ∈ [1, D/d],

and, for all x, y ∈ Ω and all t ∈ ℕ, let ψ^{xy}_t = ln̄(φ^{xy}_t). Note that ψ^{xy}_t = −1 ↔ φ^{xy}_t = 0 ↔ X_t = Y_t, given X_0 = x, Y_0 = y. Thus, for all x, y ∈ Ω, we bound the expectation of

T_{x,y} = inf_{t∈ℕ} {ψ^{xy}_t ≤ −1}

from above.

We aim to apply Theorem 34, which requires showing, for all t ∈ ℕ with t < T_{x,y} and all s ∈ rng(ψ^{xy}_t), that E[ψ^{xy}_{t+1} | ψ^{xy}_t = s] ≤ s (equation (18)) and obtaining a lower bound on E[(ψ^{xy}_{t+1} − s)² | ψ^{xy}_t = s] (equation (19)) (as we discuss after Theorem 34).

Let t ∈ ℕ, and assume that t < T_{x,y}. Further, let s ∈ [0, ln(D/d)] such that ψ^{xy}_t = s. Note that ln̄ is a concave function and that φ^{xy}_t ≥ 1. By Jensen's inequality, we obtain

E[ψ^{xy}_{t+1} | ψ^{xy}_t = s] ≤ ln̄(E[φ^{xy}_{t+1} | ln̄(φ^{xy}_t) = s]) = ln̄(E[φ^{xy}_{t+1} | φ^{xy}_t = e^s]) ≤ ln̄(e^s) = s,

which shows equation (18).

Again, let t ∈ ℕ, and assume that t < T_{x,y}. Further, let s ∈ [0, ln(D/d)] such that ψ^{xy}_t = s. We proceed by bounding E[(ψ^{xy}_{t+1} − s)² | ψ^{xy}_t = s] from below. Let A be the event that ψ^{xy} jumps from ψ^{xy}_t = s ≥ 0 to ψ^{xy}_{t+1} = −1 (i.e., φ^{xy}_t ≥ 1 and φ^{xy}_{t+1} = 0). Note that the ergodicity of M implies that Pr[A] < 1, that is, Pr[Ā] > 0. By the law of total expectation, we obtain

E[(ψ^{xy}_{t+1} − s)² | ψ^{xy}_t = s] = E[(ψ^{xy}_{t+1} − s)² | ψ^{xy}_t = s, A] Pr[A] + E[(ψ^{xy}_{t+1} − s)² | ψ^{xy}_t = s, Ā] (1 − Pr[A]).

We lower-bound each term in the sum separately. Because of s ≥ 0, we have

E[(ψ^{xy}_{t+1} − s)² | ψ^{xy}_t = s, A] Pr[A] = (−1 − s)² · Pr[A] ≥ 1 · Pr[A].  (20)

Furthermore, because η > 0, by Markov's inequality, we get

E[(ψ^{xy}_{t+1} − s)² | ψ^{xy}_t = s, Ā] ≥ ln²(1 + η) Pr[(ψ^{xy}_{t+1} − s)² ≥ ln²(1 + η) | ψ^{xy}_t = s, Ā] = ln²(1 + η) Pr[|ψ^{xy}_{t+1} − s| ≥ ln(1 + η) | ψ^{xy}_t = s, Ā].

We decompose the probability on the right-hand side as

Pr[|ψ^{xy}_{t+1} − s| ≥ ln(1 + η) | ψ^{xy}_t = s, Ā] = Pr[ψ^{xy}_{t+1} − s ≥ ln(1 + η) | ψ^{xy}_t = s, Ā] + Pr[ψ^{xy}_{t+1} − s ≤ −ln(1 + η) | ψ^{xy}_t = s, Ā].

We rewrite the first of these probabilities as

Pr[ψ^{xy}_{t+1} − s ≥ ln(1 + η) | ψ^{xy}_t = s, Ā] = Pr[ln(φ^{xy}_{t+1}/e^s) ≥ ln(1 + η) | φ^{xy}_t = e^s, Ā] = Pr[φ^{xy}_{t+1}/e^s ≥ 1 + η | φ^{xy}_t = e^s, Ā] = Pr[φ^{xy}_{t+1} − e^s ≥ η e^s | φ^{xy}_t = e^s, Ā].  (21)

Since, for all x ∈ (0, 1), it holds that −ln(1 + x) ≥ ln(1 − x), we bound the second probability by

Pr[ψ^{xy}_{t+1} − s ≤ −ln(1 + η) | ψ^{xy}_t = s, Ā] ≥ Pr[ψ^{xy}_{t+1} − s ≤ ln(1 − η) | ψ^{xy}_t = s, Ā] = Pr[ln(φ^{xy}_{t+1}/e^s) ≤ ln(1 − η) | φ^{xy}_t = e^s, Ā] = Pr[φ^{xy}_{t+1}/e^s ≤ 1 − η | φ^{xy}_t = e^s, Ā] = Pr[φ^{xy}_{t+1} − e^s ≤ −η e^s | φ^{xy}_t = e^s, Ā].  (22)

Combining equations (21) and (22), we obtain

Pr[|ψ^{xy}_{t+1} − s| ≥ ln(1 + η) | ψ^{xy}_t = s, Ā] ≥ Pr[|φ^{xy}_{t+1} − e^s| ≥ η e^s | φ^{xy}_t = e^s, Ā].  (23)

For bounding the right-hand side of equation (23), assume that φ^{xy}_t = s′ ≥ 1. Consider the probability that φ^{xy} takes steps of size at least ηs′. By the law of total probability,

Pr[|φ^{xy}_{t+1} − s′| ≥ ηs′ | φ^{xy}_t = s′] = Pr[|φ^{xy}_{t+1} − s′| ≥ ηs′ | φ^{xy}_t = s′, A] Pr[A] + Pr[|φ^{xy}_{t+1} − s′| ≥ ηs′ | φ^{xy}_t = s′, Ā] (1 − Pr[A]).

Since A is the event to go from φ^{xy}_t ≥ 1 to φ^{xy}_{t+1} = 0, for all η ∈ (0, 1), it holds that

Pr[|φ^{xy}_{t+1} − s′| ≥ ηs′ | φ^{xy}_t = s′, A] = 1.

Thus and by equation (4), we obtain

Pr[|φ^{xy}_{t+1} − s′| ≥ ηs′ | φ^{xy}_t = s′, Ā] = (Pr[|φ^{xy}_{t+1} − s′| ≥ ηs′ | φ^{xy}_t = s′] − Pr[A]) / (1 − Pr[A]) ≥ (κ − Pr[A]) / (1 − Pr[A]).  (24)

By combining equations (23) and (24) with s′ = e^s, we get

E[(ψ^{xy}_{t+1} − s)² | ψ^{xy}_t = s, Ā] ≥ ln²(1 + η) Pr[|ψ^{xy}_{t+1} − s| ≥ ln(1 + η) | ψ^{xy}_t = s, Ā] ≥ ln²(1 + η) (κ − Pr[A]) / (1 − Pr[A]).  (25)

Last, we use equations (20) and (25) and that η < 1 implies ln(1 + η) ≤ ln(2) ≤ 1 to obtain

E[(ψ^{xy}_{t+1} − s)² | ψ^{xy}_t = s] ≥ 1 · Pr[A] + (1 − Pr[A]) · ln²(1 + η) (κ − Pr[A]) / (1 − Pr[A]) = Pr[A] + ln²(1 + η) κ − ln²(1 + η) Pr[A] ≥ ln²(1 + η) Pr[A] + ln²(1 + η) κ − ln²(1 + η) Pr[A] = ln²(1 + η) κ,

which shows equation (19).

By Theorem 34, for all x, y ∈ Ω, we get

E[T_{x,y}] ≤ ((ln(D/d) + 1)² − E[(ln(D/d) − ψ^{xy}_0)²]) / (ln²(1 + η) κ) ≤ (ln(D/d) + 1)² / (ln²(1 + η) κ).

This results in the desired mixing time bound of

τ_M(ε) ≤ ((ln(D/d) + 1)² / (ln²(1 + η) κ)) · ln(1/ε). □
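As a mechanical sanity check on Example 33 above, the following Python sketch (our own illustration; the value η = 0.999 and the four joint moves follow the example, everything else, including all names, is ours) verifies that the coupling at adjacent states has the correct marginals for the lazy walk, preserves the expected distance, and makes a relative jump of at least η with probability 3/4.

```python
from collections import Counter
from fractions import Fraction

def cycle_dist(a, b, n):
    """Shortest-path distance between a and b on the n-cycle."""
    d = (a - b) % n
    return min(d, n - d)

def check_example_33(n):
    """For the adjacent pair x = y + 1 on the n-cycle (n >= 6), check the
    coupling of Example 33: (i) both coordinates follow the lazy walk
    P(x, x) = 1/2, P(x, x +- 1) = 1/4; (ii) the expected distance after one
    step equals delta(x, y) = 1; (iii) the distance changes by at least
    eta * delta(x, y) with eta = 0.999 with probability 3/4."""
    x, y = 1, 0  # any adjacent pair behaves identically by symmetry
    quarter = Fraction(1, 4)
    # The four equally likely joint moves of the coupling.
    moves = [(x, x), (y, y), (x, y), ((x + 1) % n, (y - 1) % n)]

    # (i) marginal transition distributions
    marg_x, marg_y = Counter(), Counter()
    for nx, ny in moves:
        marg_x[nx] += quarter
        marg_y[ny] += quarter
    assert marg_x == {x: Fraction(1, 2), (x + 1) % n: quarter, (x - 1) % n: quarter}
    assert marg_y == {y: Fraction(1, 2), (y + 1) % n: quarter, (y - 1) % n: quarter}

    # (ii) expected distance is unchanged
    assert sum(quarter * cycle_dist(nx, ny, n) for nx, ny in moves) == 1

    # (iii) jumps of relative size at least eta = 0.999 have probability 3/4
    eta = Fraction(999, 1000)
    big_jumps = sum(quarter for nx, ny in moves
                    if abs(cycle_dist(nx, ny, n) - 1) >= eta)
    assert big_jumps == Fraction(3, 4)
    return True
```

Note that for a pair at distance ⌊n/2⌋, no single step of the chain can change δ by η⌊n/2⌋, since δ moves by at most 2 per step; this is exactly why the example violates the adjacency-only condition of Greenberg et al. while leaving Theorem 2 intact.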