Models of random subtrees of a graph
Luis Fredes† and Jean-François Marckert∗† — † Université Paris-Saclay, ∗ CNRS, LaBRI, Université Bordeaux
Abstract
Consider a connected graph G = (V, E) with N = |V| vertices. A subtree of G with size n is a tree which is a subgraph of G, with n vertices. When n = N, such a subtree is called a spanning tree. The main purpose of this paper is to explore the question of uniform sampling of a subtree of G, or of a subtree of G with a fixed number of nodes n, for some n ≤ N. We provide asymptotically exact simulation methods using Markov chains. We highlight the case of the uniform subtree of Z² with n nodes containing the origin (0, 0), for which Schramm asked several questions. We produce pictures, statistics, and some conjectures.
The second aim of the paper is to survey other models of random subtrees of a graph; among them, we discuss DLA models, first passage percolation, the uniform spanning tree and the minimum spanning tree. We also provide a number of new models, some statistics, and some conjectures.
Acknowledgments
We acknowledge support from ERC 740943 GeoBrown.
The very origins of this work are two open questions raised by Oded Schramm in the conference paper [38] of the International Congress of Mathematicians, Madrid, 2006, where he was one of the plenary speakers. His Section 2.5 is devoted to lattice trees. We take the liberty to copy it here, verbatim:
Section 2.5. Lattice trees. We now present an example of a discrete model where we suspect that perhaps conformal invariance might hold. However, we do not presently have a candidate for the scaling limit. Fix n ∈ N+, and consider the collection of all trees contained in the grid G that contain the origin and have n vertices. Select a tree T from this measure, uniformly at random.

Problem 2.8. What is the growth rate of the expected diameter of such a tree? If we rescale the tree so that the expected (or median) diameter is 1, is there a limit for the law of the tree as n → +∞? What are its geometric and topological properties? Can the limit be determined?

It would be good to be able to produce some pictures. However, we presently do not know how to sample from this measure.

Problem 2.9. Produce an efficient algorithm which samples lattice trees approximately uniformly, or prove that such an algorithm does not exist.

Excellent questions, for which no real advances have been published during these last 14 years. Before discussing the content of the present paper, let us fix some notations.

Convention and notations. A graph G is a pair (V, E), where V is the countable set of vertices, and E the multiset of edges. Each edge is a set of the form {a, b} where a and b are different vertices. The word multiset means that each edge {a, b} comes with a multiplicity, which is a positive integer. All the graphs G considered in the paper are connected. As usual, a subgraph G′ = (V′, E′) of G = (V, E) is a graph satisfying V′ ⊂ V and E′ ⊂ E.
We use the standard definitions of paths, cycles, connectivity and connected components, induced subgraphs (see e.g. [7]). A tree is a connected graph T = (V_T, E_T) with no cycle: it satisfies |E_T| = |V_T| − 1 (|S| stands for the cardinality of S). A subtree of G is a tree which is also a subgraph of G. A subtree T is said to be spanning if V_T = V.
Sometimes we will call E(G) and V(G) the edges and vertices of the graph G, respectively. We will also write →E(G) for the set of oriented edges, which contains (a, b) and (b, a) for each edge {a, b} ∈ E(G). For an oriented edge →e = (e_1, e_2) ∈ →E(G), we denote simply by e the unoriented version {e_1, e_2} ∈ E(G).
A rooted tree is a pair (T, r), where T is a tree and r ∈ V_T is a distinguished vertex, called the root. We orient the edges of a rooted tree toward its root; a node of T having no incoming edge (we say also having no child) is called a leaf. The set of leaves is denoted ∂T.
For a finite connected graph G and some positive integer n ≤ |V|, the notation Subtrees(G, n) stands for the set of subtrees of G with n vertices. 
For a vertex r ∈ V, let Subtrees•_r(G, n) be the subset of Subtrees(G, n) of trees which contain r (they can be seen as being rooted at r). We also define the set Subtrees(G) = ∪_n Subtrees(G, n) of all subtrees of G, and Subtrees•_r(G) = ∪_n Subtrees•_r(G, n), the set of those rooted at r.
For any finite set S, the uniform distribution on S is denoted Uniform(S).

Remark 1.
Most of the models presented in the paper can be defined on multigraphs (in which multiple edges are allowed), possibly having loops, at a small extra cost. For the sake of clarity, we focus only on the case of usual connected graphs, with simple edges and without any loops.
Content of the paper
Schramm's question is a particular case of the following more general question: let G = (V, E) be a finite connected graph.
Question [⋆]: Is there an efficient way to sample
Uniform(Subtrees(G, n)) or
Uniform(Subtrees•_r(G, n))?

Indeed, consider the discrete torus Torus(N) := (Z/NZ)² seen as a graph, with edges between pairs of nodes of the type (x, y) and (x, y+1 mod N), and between (x, y) and (x+1 mod N, y). Schramm's question about a way to sample
Uniform(Subtrees•_{(0,0)}(Z², n)) is equivalent to finding a way to sample Uniform(Subtrees•_{(0,0)}(Torus(n), n)), since the graphs Z² and the finite graph Torus(n) coincide locally: a subtree with n vertices has only n − 1 edges and hence cannot wind around Torus(n). Question [⋆] on a general graph leads us to investigate a lot of methods allowing to sample random trees embedded in a graph: for example, Markov chain simulations, combinatorial methods, acceptance/rejection methods relying on simple-to-sample models, the design of new models, partial "evaporation" of uniform spanning trees, etc. Our feeling is that writing in a single place all these considerations is important for the community; we then decided to write a kind of survey, containing references and open questions.

The paper is organised as follows:
• In Section 1.2 we give a small list of simple graphs on which the simulation of subtrees is easy; the paper is devoted to the other cases.
• In Section 2, we recall some facts concerning the spanning tree case n = |V|, which is already solved (Aldous–Broder and Wilson algorithms and other related considerations). The problem of sampling Uniform(Subtrees(G, n)) can be seen as a generalisation of the uniform spanning tree question, and then, discussing to some extent the spanning tree case is needed. Another reason is that it is natural to try to extract a subtree of a spanning tree to get a uniform element in
Uniform(Subtrees(G, n)) (instead of extracting this tree directly from the graph).
• The small Section 3 presents the combinatorics of subtrees of a graph: how to enumerate Subtrees(G, n)? How to use that to sample uniformly? The power of these considerations is limited to small graphs.
• In Section 4, we are interested in Markov chains taking their values in Subtrees(G, n). We propose 3 different models of ergodic Markov chains with uniform invariant distribution.
• In Section 5 we focus on the grid case, and on Oded Schramm's questions: using one of the Markov chains of Section 4, we made (approximate) simulations of uniform subtrees of the grid with n vertices (for n up to some thousands). We provide pictures, statistics and conjectures. These empirical results support the idea that a limiting distribution exists for rescaled trees, and that the limit is a continuous tree (macroscopic loops, or space filling phenomena could prevent the limit from being a tree, but these phenomena do not appear on simulations). Intuitive and partial justifications of the fact that the mixing time has been reached are given in Section 5.4, Fig. 3 and 4 and videos at [18].
• In Section 6, we provide several Markov chains with state space Subtrees(G), the set of subtrees of a graph (without fixing the size of the subtrees).
• In Section 7, we survey many models of random subtrees with n nodes of a graph (for most of them we provide simulation pictures, a description of the distribution and sometimes open questions).
– In Section 7.1, we introduce a new model of random subtree with n nodes: the pioneer tree, which generalizes the Aldous–Broder construction;
– in Section 7.2 we present a principle which prevents the possibility of constructing a uniform element of Subtrees(G, n) using "the last steps" of a simple Markov chain (and similar constructions);
– in Section 7.3, we give some models on Subtrees(G, n) inspired by the cycle popping construction of Wilson's algorithm (one random outgoing edge per node);
– in Section 7.4, we give a model of a distinguished connected component in a size biased forest;
– in Section 7.5, we discuss two ways to extract a random subtree with n nodes from a UST;
– in Section 7.6, we provide a model that we qualify of diffusion limited aggregation (DLA), which is defined on any graph (and coincides with the original model on Z²); in Section 7.7, the internal DLA;
– in Section 7.8, we propose several models of random trees defined on weighted graphs: among them, one uses Prim's algorithm, one Kruskal's, and another uses first passage percolation.
• Finally, in Section 8, we investigate the question of models of subtrees of a tree.
– In Section 8.2, we give an exact sampling method of a uniform subtree of a tree (a coupling from the past method);
– in Section 8.3 we propose several models of extraction of a subtree of size n relying on some model of leaf evaporation.

1.2 Simple cases and other questions

Sampling uniformly in
Subtrees(G, n) is easy for some families of graphs. Among others:
⋆ If G = K_N, the complete graph on N vertices, then a uniform element T of Subtrees•_1(K_N, n) is a uniform labelled tree on n vertices {x_1, ..., x_n}, where this set is itself a uniform subset of {1, ..., N} containing 1, with cardinality n. Hence, up to a relabelling of the vertices, T is a uniform Cayley tree of size n, also called a uniform labelled tree, and one can say that these trees are among the simplest and most studied models of random trees in the literature: they are moreover easy to sample (using Prüfer sequences, the bijection with parking sequences, the relation with additive coalescence, etc., see e.g. [1]).
⋆ If G is the cycle Z/NZ, a line graph (the graph with vertices 1 to N, with edges between i and i + 1), or some simple graphs such as {1, 2} × {1, ..., N} or {1, ..., k} × {1, ..., N} for k fixed, or any family of graphs on which some simple combinatorial decompositions can be performed easily, then one may find some ad hoc methods to sample Uniform(Subtrees(G, n)).
⋆ Finding a uniform subtree with n nodes of the infinite complete m-ary tree containing the root can also be done easily, since such a subtree can be seen as the set of internal nodes of a uniform m-ary tree with n internal nodes (that is, 1 + nm nodes): sampling such a uniform tree is an easy task with several known methods, linear in the tree size, since it is a model of "simple trees", which can also be seen as a Galton–Watson tree conditioned by its size.
⋆ Another research line is the study of uniform random subtrees of some families of random graphs. It turns out that in at least one case, the random generation is simple, as shown by the first author and A. Sepulveda [20]: the random generation of a uniform subtree t of size m in a random rooted quadrangulation with n faces can be done for any (n, m) with m < n + 1 in reasonable time. This comes from the existence of a one to one correspondence between, on one side, quadrangulations marked by a distinguished subtree, and on the other side, a pair formed by a quadrangulation with a simple boundary together with a planar tree (this bijection also works for more general face degrees).
⋆ On the probability that a random tree is spanning: S. Wagner [40] gives a lower bound on the probability that a randomly chosen tree is spanning (depending on a linear lower bound on the minimum degree of the nodes). Take t(G) a random unrooted tree uniform in Subtrees(G); then (Chin et al. [13])
P(t(K_n) is spanning) → e^{−1/e},  P(t(K_{n,n}) is spanning) → e^{−2/e}.
Given a finite connected graph G = (V, E), there are several methods to sample a UST of G. If the graph is small, the Tutte formula (see e.g. [5]) or the matrix tree theorem can be used to determine the probability of presence of a given edge of G in a UST, and to take this information into account recursively: they can be used to sample a UST (see for example Durfee et al. [16] and references therein).
If the graph is giant, faster methods are needed. Two famous algorithms, recalled in the two following sections, are the Aldous–Broder algorithm ([8, 1], see also Hu–Lyons–Tang [44] for a variant) and Wilson's algorithm [42] (see also Lyons & Peres [28, Section 4], Járai [22]); both are really efficient. Their expected running times for undirected graphs are O(τ_c) and O(τ) respectively, where τ_c and τ are the cover time and the mean hitting time of the simple random walk in G, respectively. Wilson's algorithm is the faster of the two, since the mean hitting time is always smaller than the cover time (but the Aldous–Broder algorithm is easier and faster to program!).

Figure 1:
Simulation of a UST of the torus (Z/NZ)² using Wilson's algorithm (on this picture, identify the right and left sides, and the top and bottom sides, to get the actual spanning tree).

We say that a Markov kernel M = (M_{a,b}, a, b ∈ V) is positive on a connected graph G = (V, E) if {a, b} ∈ E ⇔ M_{a,b} > 0. Denote by ρ = (ρ_v, v ∈ V) the unique stationary distribution of this kernel. Consider W = (W_k, k ≥ 0) a Markov chain with kernel M. Set
τ_k(W) = inf{ j : |{W_0, ..., W_j}| = k },  1 ≤ k ≤ |V|,
the first time the walk W hits k different points: hence τ_1 = 0, and the cover time is τ_{|V|}(W) (we will write τ_k instead of τ_k(W) when it is clear from the context).

Definition 2.
Denote by
FirstEntranceTree(W_0, ..., W_{τ_{|V|}(W)}) the rooted spanning tree with root W_0 and whose |V| − 1 edges are the oriented edges (W_{τ_k}, W_{τ_k − 1}), for 2 ≤ k ≤ |V|.

Theorem 3 ([8] and [1]). If M is positive and reversible with invariant distribution ρ, then
P[ FirstEntranceTree(W_0, ..., W_{τ_{|V|}}) = (t, r) | W_0 = r ] = Const. ∏_{e ∈ E(t,r)} M_e / ρ(r). (1)

Here, and elsewhere, for any rooted spanning tree (t, r), the edges E(t, r) of (t, r) are oriented toward the root r (so that if e = (e_1, e_2), then e_2 is the parent of e_1, and M_e := M_{e_1,e_2}). Proofs can be found in [8, 1, 28, 22].
As a consequence, if M is the Markov kernel corresponding to the simple random walk on G, that is M_{a,b} = 1_{{a,b} ∈ E} / deg(a), then the invariant distribution ρ_v is proportional to deg_G(v), so that (1) is independent of t, and FirstEntranceTree(W_0, ..., W_{τ_{|V|}}) is a UST, rooted at W_0 (which can be viewed as uniformly unrooted if desired).
(The cover time τ_c in this setting is the maximum expected time to visit all the vertices of the graph, where the maximum is taken over all starting points. The mean hitting time is defined as τ = ∑_{i,j} π(i) π(j) E_{i,j}, where π is the invariant distribution and E_{i,j} is the mean time to reach j starting from i. The first entrance tree is associated to a path, random or not.)
For a positive kernel M on G, there exists a unique invariant distribution ρ. Define ←M by
←M_{x,y} := ρ_y M_{y,x} / ρ_x, for all (x, y) ∈ V², (2)
so that ←M is simply the Markov kernel of the time reversal of a Markov chain with kernel M under its invariant distribution.

Theorem 4 ([19]). If M is positive on G, and W is a Markov chain with kernel M and invariant distribution ρ, then for any rooted spanning tree (t, r),
P[ FirstEntranceTree(W_0, ..., W_{τ_{|V|}}) = (t, r) | W_0 = r ] = Const. ∏_{e ∈ E(t,r)} ←M_e / ρ(r).
(3)

This theorem implies the Aldous–Broder result, since in the reversible case ←M = M. A new combinatorial proof, completely different from that of Aldous and Broder, is given in [19].

We refer to Lawler [26] for more information concerning loop erased random walks. Let M be a positive Markov kernel on a connected graph G = (V, E) and r ∈ V a distinguished node. For any starting point v ∈ V and non empty subset S of V, we denote by LERW_M[v, S] the distribution of a M-loop erased random walk starting at v and killed at its hitting time of S (meaning that, before erasure, the random walk is a Markov chain with kernel M).
Wilson's algorithm can be stated as follows: consider an ordering (v_1 = r, v_2, ..., v_{|V|}) of the vertices of V, and set T_1 as the initial tree reduced to the point v_1 = r. For any 2 ≤ i ≤ |V|, consider a loop erased random walk L_i with distribution LERW_M[v_i, T_{i−1}], starting at v_i and stopped at the vertex set of the current tree T_{i−1}. The tree T_i is the tree having as set of edges those of T_{i−1} union the set of steps of L_i (meaning that if L_i = (a_1, ..., a_m), the new edges are the (a_j, a_{j+1})). If v_i is already in T_{i−1}, there are no new edges. Denote by WilsonTree_r the final tree T_{|V|}. We have

Proposition 5 ([42]). For any positive kernel M, for any rooted spanning tree (t, x) of G,
P(WilsonTree_r = (t, x)) = Const. 1_{x=r} ∏_{e ∈ E(t,r)} M_e. (4)

See Wilson [42], Propp and Wilson [37], Járai [22], or Lawler [26]. When M_{a,b} = 1_{{a,b} ∈ E}/deg(a), then P(WilsonTree_r = (t, x)) = Const. 1_{x=r} / ∏_{v ≠ r} deg(v), which again does not depend on the tree t, so that again WilsonTree_r is a UST rooted at r.
This construction admits a companion description called cycle popping (detailed in [42], [37], [22]), which is more suitable for generalizations (see Section 7.3). Consider for each vertex v, different from r, a random outgoing edge ẽ_v, independent of the others, such that P(ẽ_v = (v, w)) = M_{v,w}, and call such an orientation O = (ẽ_v, v ∈ V \ {r}). It is simple to check that if the set of oriented edges in O forms a tree (is connected without any cycle), it is a spanning tree rooted at r, and the probability that this spanning tree is a given (t, r) is Const. ∏_{e ∈ E(t,r)} M_e. When the set of edges in O does not form a spanning tree, the connected component t(r) of r is a tree, and all other connected components contain a (unique) oriented cycle. The cycle popping algorithm [37, Sec. 6] consists in choosing a cycle (according to the user's preference) and re-sampling the outgoing edges for all the vertices on it; this operation is repeated until the resulting orientation does not contain any cycle, so that it corresponds to a tree T⋆. Wilson [42] proved that T⋆ has the same distribution as that given in (4) (better than that, he explains how the construction using the LERW is just a way to view/order the cycle popping).
As suggested by Theorem 4, if a coupling between the Aldous–Broder construction and that of Wilson could be found, then probably Wilson's algorithm should be run using the kernel ←M instead of M.

Open question 1.
Is it possible to couple the Wilson and Aldous–Broder constructions so that they depend on the same trajectories (and give the same results)?
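For concreteness, both constructions can be coded in a few lines. Here is a minimal Python sketch (ours, not the authors' code) of the Aldous–Broder first-entrance tree and of Wilson's algorithm, for the simple random walk on an undirected graph given by an adjacency dictionary; Wilson's walk is recorded through successor pointers, which performs the loop erasure implicitly:

```python
import random

def aldous_broder(adj, root):
    """First-entrance tree of a simple random walk started at root.
    adj: dict mapping each vertex to the list of its neighbours.
    Returns the set of tree edges (child, parent), oriented toward root."""
    seen, edges = {root}, set()
    w = root
    while len(seen) < len(adj):
        nxt = random.choice(adj[w])
        if nxt not in seen:          # first entrance: keep this edge
            seen.add(nxt)
            edges.add((nxt, w))
        w = nxt
    return edges

def wilson(adj, root):
    """Wilson's algorithm: successive loop-erased random walks."""
    in_tree, edges = {root}, set()
    for v in adj:
        if v in in_tree:
            continue
        succ, w = {}, v              # walk until the current tree is hit
        while w not in in_tree:
            succ[w] = random.choice(adj[w])
            w = succ[w]
        w = v                        # follow the loop-erased path
        while w not in in_tree:
            in_tree.add(w)
            edges.add((w, succ[w]))
            w = succ[w]
    return edges

# tiny example: the 3 x 3 grid
N = 3
adj = {(x, y): [(x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= x + dx < N and 0 <= y + dy < N]
       for x in range(N) for y in range(N)}
t = wilson(adj, (0, 0))
print(len(t))   # a spanning tree of the 9 vertices has 8 edges
```

In both outputs every vertex except the root carries exactly one outgoing edge, which is the rooted-tree encoding used throughout this section.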
Mixing the UST question with the configuration model?
The following question is open, to our knowledge, and seems particularly interesting: it is the question of sampling a UST with prescribed degrees.
Open question 2.
Given a connected graph G = (V, E) and some positive integers (d_u, u ∈ V) associated with the vertices of V, find an algorithm which produces a UST t of G conditioned by the event {deg_t(u) = d_u, u ∈ V}, when there exists such a spanning tree.

The existence of a spanning tree satisfying {deg_t(u) = d_u, u ∈ V} can be decided by an exhaustive approach or using the matrix tree theorem (put a weight u_i u_j on the edge between i and j, and extract the coefficient of ∏_j u_j^{d_j} in the determinant of the Laplacian matrix of the graph, deprived of line 1 and column 1: this coefficient gives the number of such spanning trees). The problem of the uniform generation of a Hamiltonian path (a path which visits each node exactly once) is equivalent to that of a spanning tree whose nodes all have degree 2, except for the two endpoints, which have degree 1. No efficient algorithm seems to be known for this task, and the decision problem of the existence of a Hamiltonian path is NP-complete (Karp [23]). When G is finite, Uniform(Subtrees(G, n)) can be sampled if one knows a way to sample uniformly in
Subtrees•_r(G, n) for all r ∈ V and if |Subtrees•_r(G, n)| is known for each r (or if these numbers are known to be equal for some reason, for example if a group acts transitively on the graph). Indeed, since the trees of Subtrees(G, n) have the same number of nodes, it suffices to first pick a random node r_0 according to the probability distribution proportional to (|Subtrees•_r(G, n)|, r ∈ V), and then, conditionally on r_0 = r, to pick a tree uniformly in Subtrees•_r(G, n) (this last step is explained in Section 3.1 below, and relies on what is explained in the sequel of this section).
It turns out that the sequence (|Subtrees(G, n)|, n ≥
1) can be computed using a decomposition similar to that of the Tutte formula [39]. The first part of what follows, which concerns a Tutte polynomial for subtrees of a graph, is present mutatis mutandis in [13, Prop. 4.4.] for unrooted subtrees.
Apart from their own interest, these algebraic considerations bring some additional insight and, possibly, potential methods to sample Uniform(Subtrees(G, n)) in some particular graphs: in general, the cost of the computation of the (|Subtrees(G, n)|, n ≥ 1) is important, and it can be carried out only on small graphs (or particular ones).
The Tutte recursion produces loops and multiple edges, and may disconnect the graph (if we allow bridge deletions, which is the case here). In this section, we then consider multigraphs G = (V, E), possibly disconnected, having possibly some loops (edges of the form {a, a} = {a}). Of course, the number of subtrees of a graph having some loops is unchanged by their removal. Since we deal with rooted subtrees of G, any part of the graph disconnected from the root of the tree can be ignored.
Consider a multigraph G = (V, E), and e an edge (possibly not in E). Recall the two classical operations, contraction and suppression of edges:
• The graph G \ e, obtained from the suppression of e, is the multigraph G′ = (V′, E′) coinciding with G = (V, E) except that a copy of the edge e is suppressed from E if any, and G′ = G otherwise.
• The graph
G.e, obtained from the contraction of e, is the multigraph G′ = (V′, E′) defined as follows: if e is a loop, then V′ = V and E′ is obtained from E by removing 1 from the multiplicity of the edge e; and if e is not a loop, say e = {a, b}, we define V′ = V \ {b}, and E′ from E by replacing every occurrence of b in an edge e′′ of E by a.
Consider the following polynomial
T_r(G) = ∑_{t ∈ Subtrees•_r(G)} x^{|E(t)|},
which is the generating function of the sequence (|Subtrees•_r(G, n)|, n ≥ 1): the coefficient of x^{n−1} counts the subtrees with n vertices containing r. If the connected component of r in G has a single vertex (for example, if G = ({r}, E) where E consists of k ≥ 0 copies of the loop {r, r}), then T_r(G) = 1. Notice that if an edge e = {a, b} is not included in the connected component of r, or if e is a loop, then T_r(G) = T_r(G \ e) = T_r(G.e).

Proposition 6.
Let G = (V, E) be a multigraph and r ∈ V. For any edge e ∈ E adjacent to r,
T_r(G) = x T_r(G.e) + T_r(G \ e). (5)

Remark 7. Removing or contracting edges adjacent to r reduces the number of edges, so that (5) indeed defines T_r(G) (using in fine that T_r(G′) = 1 when V(G′) = {r}).

Proof. Any tree counted in the left hand side either uses e or not.

This formula allows to make a connection with the Tutte formula, which has shown a tremendous importance in algebraic graph theory. However, its computation is at least linear in the number of subtrees, since each expansion in (5) can be seen as describing a subtree edge per edge: a contracted edge is in the subtree, while a deleted one is not.

Remark 8. The matrix tree theorem can also be used to enumerate the subtrees of size n of a given graph G, by considering one by one all the induced subgraphs with n vertices of G, and by summing their numbers of spanning trees. This gives C(|V|, n) different graphs, for each of which a determinant of size (n − 1) × (n − 1) has to be computed. This cannot be used in practice when C(|V|, n) is a bit large.

Formula (5) can be used to compute the first values of T_r(Torus(N)) for N from 1 to 4:
N = 1: 1
N = 2: 32x^3 + 12x^2 + 4x + 1
N = 3: 11664x^8 + 9408x^7 + 4074x^6 + 1308x^5 + 345x^4 + 80x^3 + 18x^2 + 4x + 1
N = 4: 42467328x^15 + 56597760x^14 + 39892832x^13 + 19618560x^12 + 7588872x^11 + 2461360x^10 + 698700x^9 + 178848x^8 + 42496x^7 + 9534x^6 + 2052x^5 + 425x^4 + 88x^3 + 18x^2 + 4x + 1.

Since ∑_{r ∈ V} T_r(G) counts each unrooted subtree with k edges k + 1 times, we have ∂/∂x (x T(G)) = ∑_{r ∈ V} T_r(G), with T(G) the generating function of subtrees of G counted by their size, T(G) = ∑_{t ∈ Subtrees(G)} x^{|E(t)|}. See [13, Prop. 3.1.] for some explicit polynomials in different classes of graphs.
This can be generalized to forests. A graph F = (V_F, E_F) is said to be a forest if its connected components are trees. Given r_1, ..., r_k distinct elements of V (for k ≥ 1), denote by Forests_{r_1,...,r_k}(G) the set of forests composed of k non intersecting trees t_1, ..., t_k, where for each i ∈ ⟦1, k⟧, r_i ∈ t_i. Define the multivariate generating function
F_{u⟦1,k⟧}(G) = ∑_{{t_1,...,t_k} ∈ Forests_{u⟦1,k⟧}(G)} ∏_{j=1}^{k} x_j^{|E(t_j)|}, (6)
of forests in Forests_{u_1,...,u_k}(G) counted according to the sizes of their connected components. Following the same idea as in Proposition 6, we obtain the following proposition.

Proposition 9.
For any edge e with only one endpoint u_j in u⟦1, k⟧, we have
F_{u⟦1,k⟧}(G) = F_{u⟦1,k⟧}(G \ e) + x_j F_{u⟦1,k⟧}(G.e). (7)

The expansion formula (5) (or (7)) provides a natural decomposition of the set of subtrees of G according to whether they contain a given edge e or not. To sample a random tree T under Uniform(Subtrees•_r(G, n)):
– choose an edge e adjacent to r,
– compute |Subtrees•_r(G \ e, n)| and |Subtrees•_r(G.e, n − 1)| (using the Tutte recursion),
– with probability |Subtrees•_r(G \ e, n)| / |Subtrees•_r(G, n)|, the tree T is chosen uniformly in Subtrees•_r(G \ e, n); otherwise define T as the tree having as edge set {e} union the edge set of a uniform random tree taken in Subtrees•_r(G.e, n − 1).
The same decomposition allows to sample a random tree T in Subtrees•_r(G) with probability proportional to x^{|E(t)|} for some fixed x > 0, that is P(T = t) = x^{|E(t)|} / T_r(G). In this case, it suffices to remove e with probability T_r(G \ e) / T_r(G) and to keep it with probability x T_r(G.e) / T_r(G), using the same procedure as above. Notice that as we consider/discard edges in the construction of the tree, the consecutive products simplify (telescopic products), up to the point where one has x^{|E(t)|} T_r(G′) / T_r(G), where G′ satisfies T_r(G′) = 1 as explained in Remark 7. In this case, when conditioning the size to be n, the sample is uniform in Subtrees•_r(G, n). For more on this method see [15, Section 3].
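On small multigraphs, recursion (5) is straightforward to implement. The following Python sketch (our own illustration, with hypothetical names) computes the coefficients of T_r(G), i.e. the numbers of rooted subtrees per edge count; edge multiplicities are encoded by repetitions in the edge list:

```python
def subtree_polynomial(edges, r):
    """Coefficients [c_0, c_1, ...] of T_r(G): c_k is the number of
    subtrees of G containing r with k edges (hence k + 1 vertices).
    Implements recursion (5): T_r(G) = x * T_r(G.e) + T_r(G minus e),
    for e a non-loop edge adjacent to r.
    edges: list of unordered pairs; repetitions encode multiplicities."""
    e = next((E for E in edges if r in E and E[0] != E[1]), None)
    if e is None:                # the component of r is reduced to {r}
        return [1]
    rest = list(edges)
    rest.remove(e)               # suppression: one copy of e removed
    other = e[1] if e[0] == r else e[0]
    contracted = []              # contraction: merge `other` into r
    for a, b in rest:
        a = r if a == other else a
        b = r if b == other else b
        if a != b:               # loops never contribute, drop them
            contracted.append((a, b))
    p = subtree_polynomial(contracted, r)   # will be multiplied by x
    q = subtree_polynomial(rest, r)
    out = [0] * max(len(p) + 1, len(q))
    for k, c in enumerate(p):
        out[k + 1] += c
    for k, c in enumerate(q):
        out[k] += c
    return out

# triangle K3 rooted at 1: T_1 = 1 + 2x + 3x^2 (3 spanning trees)
print(subtree_polynomial([(1, 2), (1, 3), (2, 3)], 1))   # [1, 2, 3]
# Torus(2) = (Z/2Z)^2: every adjacency is a double edge
torus2 = [(0, 1), (0, 1), (2, 3), (2, 3), (0, 2), (0, 2), (1, 3), (1, 3)]
print(subtree_polynomial(torus2, 0))                     # [1, 4, 12, 32]
```

The sampling procedure described above can then be implemented on top of these counts, removing or contracting an edge adjacent to r with the stated probabilities.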
A graph G can be represented in various ways in a computer. For example, if E is not too large, we can use V = {1, ..., n} and a triangular array (m_{a,b}(E), 1 ≤ a < b ≤ n), where m_{a,b} is the multiplicity of the edge {a, b} in E. For regular graphs such as Torus(N) or the complete graph, the edges do not need to be stored, since they can be recovered online.
Explicit programming of Markov chains (X_i, i ≥ 0) taking their values in Subtrees•_r(G, n) will often imply that, to construct X_{i+1}, some (sets of) edges and (sets of) vertices are removed from, or added to, X_i. In many cases a "sub-procedure" devoted to checking the tree property of a graph is needed (to design algorithms accepting or rejecting a modification of X_i).

Checking the tree property is doable, and has a cost. Some classical algorithms devoted to checking whether a subgraph g of a given graph G is a tree exist: in practice, they have a non negligible cost (however, at most linear in the size of g, if one neglects the access cost to the data).
• In all generality, if g is given "from scratch", checking if this graph is a tree can be done by performing a breadth first or depth first traversal [14, Sec. 22.2 and 22.3].
• If g has been obtained from a tree by the addition of a single edge and the removal of another one, then checking the tree property can be done as follows: if the edge from a to b has been removed, do the breadth first search from a and check if b is still accessible.
• When possible, it is preferable to work with rooted trees instead of unrooted ones. For the canonical orientation, in which edges are directed toward the root, all nodes but the root have exactly one outgoing edge. Assume that we want to add an oriented edge (u, v) (taken in →E) to the tree and remove, say, an edge (a, b). Adding (u, v) to (t, r) may:
– either make of u a new leaf, in which case it is easy to see if removing (a, b) preserves the tree property (in words, a needs to be a leaf, and v must be different from a);
– or create a (non-oriented) cycle. In this case, u will have two outgoing edges, which can be followed to find the cycle in an efficient way.
From here, it is easy to check if the edge (a, b) is on this cycle, which is a necessary and sufficient condition for the preservation of the tree property upon removal of (a, b) (something has to be adapted if the root r is involved in the modifications). The orientations of the edges lying on the cycle have to be modified to get the right orientation of the resulting rooted tree.
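To illustrate the two remarks above, here is a hedged Python sketch (names are ours): the neighbours of Torus(N) recovered online rather than stored, and a breadth first search test of the tree property for an arbitrary edge set, usable after a tentative modification of the current tree:

```python
from collections import deque

def torus_neighbours(v, N):
    """Neighbours of v = (x, y) in Torus(N) = (Z/NZ)^2, computed online."""
    x, y = v
    return [((x + 1) % N, y), ((x - 1) % N, y),
            (x, (y + 1) % N), (x, (y - 1) % N)]

def is_tree(edge_set):
    """A graph given by its edge set is a tree iff |E| = |V| - 1
    and it is connected (checked by breadth first search)."""
    if not edge_set:
        return True
    adj = {}
    for a, b in edge_set:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    if len(edge_set) != len(adj) - 1:
        return False
    start = next(iter(adj))
    seen, queue = {start}, deque([start])
    while queue:
        for w in adj[queue.popleft()]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(adj)

# a 3-vertex path in Torus(4) is a tree; closing the unit square is not
path = {((0, 0), (0, 1)), ((0, 1), (1, 1))}
square = path | {((1, 1), (1, 0)), ((1, 0), (0, 0))}
print(is_tree(path), is_tree(square))   # True False
```

The cost is linear in the size of the edge set, consistent with the discussion above; the incremental checks described in the bullet points avoid re-running this full traversal at every step.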
Subtrees ( G, n ) . A Markov chain with kernel K ( · , · ) on E × E issaid to be reversible with respect to a distribution ρ , if it satisfies the detailed balance equations : ρ i K ( i, j ) = ρ j K ( j, i ) for any i, j ∈ E. (8)In this case ρ is invariant for this Markov chain. When the kernel K is symmetric, i.e. K ( i, j ) = K ( j, i )for all i, j ∈ E , the Markov chain is reversible and the uniform measure on E is invariant. Uniform ( Subtrees ( G, n )) In what follows we present some dynamics on trees, seen as actions on their set of edges, each treebeing implicitly defined by its edge set. In the sequel t and t (cid:48) are two trees taken in Subtrees ( G, n ),for some n ≥
2, and G = ( V, E ). The number of edges of both t and t (cid:48) is n − the edge-exchange map. For G = ( V, E ), the map
Exchange is defined as
Exchange : Subtrees(G, n) × E × E → Subtrees(G, n)
(t, e, e′) ↦ t′ = Exchange(t, e, e′),
where:
• t′ is defined by E(t′) = E(t) ∪ {e} \ {e′} if this set of edges defines a tree,
• t′ = t otherwise.

Definition of the kernel K^(A): exchange the status of two edges of G. Suppose X_0 ∈ Subtrees(G, n) is given. To get X_1 ∼ K^(A)(X_0, ·), just set X_1 = Exchange(X_0, e_1, e_2), where (e_1, e_2) are two edges taken uniformly and independently in E.
Analysis: clearly aperiodic, irreducible and symmetric: Uniform(Subtrees(G, n)) is the unique invariant distribution. Ergodicity is ensured by the Perron–Frobenius theorem.
Drawbacks: If |E| is big compared to n, most of the transitions leave t unchanged, which results in a very long mixing time. When t is changed, checking the tree property is expensive for large n.

Definition of the kernel K^(B): Exchange the status of two edges adjacent to the current tree. Assume that X_0 = t ∈ Subtrees(G, n). To get X_1 ∼ K^(B)(X_0, ·), do the following. Take (−→e_1, −→e_2) two edges chosen as follows: take (u_1, v_1) two i.i.d. uniform random nodes of t, then choose a neighbour u′_1 of u_1 and a neighbour v′_1 of v_1, uniformly and independently. Set −→e_1 = (u_1, u′_1) and −→e_2 = (v_1, v′_1). Set X_1 = Exchange(t, −→e_1, −→e_2).

Analysis: A simple check shows that this kernel is also aperiodic and irreducible. The probability of a transition from t to t′ ≠ t is 1/(n² deg_G(u_1) deg_G(v_1)) if t′ can be attained via Exchange with this choice. Observe that the tree t′ obtained also has n nodes, and u_1 and v_1 are still among them. We then get the same probability, from t′, of choosing (v_1, u_1) (instead of (u_1, v_1)) and then (v′_1, u′_1) as neighbours, from which we see that K^(B)(t, t′) = K^(B)(t′, t); therefore its unique invariant distribution is Uniform(Subtrees(G, n)).
Drawbacks: Checking the tree property is expensive for large n. Again, we did not succeed in providing a coupling from the past, nor in getting a bound on the mixing time.

Remark 10. (i) An algorithm much faster to program exists, but it is less efficient, so that on large graphs it is likely to be insufficient; it is defined as follows: we add a condition to perform the exchange operation: if −→e_1 belongs to t, u′_1 is a leaf of t and v′_1 is outside t, then set X_1 = Exchange(t, −→e_2, −→e_1); else, take X_1 = t. With this variant, no cycles are created, and then the fact that the tree property is preserved is trivial. However, this variant does not modify the “core” of the tree, but just its boundary, so that it is less efficient at “forgetting” the initial structure.
(ii) Variants are available for all these kernels. For example, in K^(A) one can consider (−→e_1, −→e_2) drawn from any symmetric distribution with full support over −→E(G). In K^(B) one can take (u_1, v_1) chosen with any symmetric distribution with full support over V(G)².

Even the faster Markov chain K^(B) is slow to change the structure of a tree: for a general graph G, when n is a bit large, it is unlikely that both edges belong to the same cycle; hence, in general, most modifications will be exchanges of leaves and perimeter edges. The main idea of the next kernel is the following: when the first added edge creates a cycle, then force the second edge to be in this cycle! In order to construct a reversible kernel, we need the position of the second edge to be somehow reversible with respect to the first one: the second edge will be used to break the cycle; we then denote by BreakCycle(g, e)({e′}) the probability of choosing e′ “to break the cycle” when g is a graph with a unique cycle containing e.

Definition 11.
A kernel BreakCycle(·) is said to be reversible if, for any connected graph g with excess 1 and for each pair of edges {e, e′} in the cycle C of g,
BreakCycle(g, e)({e′}) = BreakCycle(g, e′)({e}). (9)

Examples of reversible BreakCycle kernels.
The main examples we have in mind do not depend completely on g, but just on the cycle it contains. Assume that g contains a cycle C with κ edges. Label the successive edges of the cycle (e_0, · · ·, e_{κ−1}) according to an arbitrary order of rotation around the cycle (such an order has to be fixed for each g with a cycle). Now BreakCycle has property (9) if
BreakCycle(g, e_i)({e_j}) = BreakCycle(g, e_j)({e_i}). Examples include: BreakCycle(g, e_i)({·}) given for each i by Uniform({e_k, k ∈ Z/κZ}), or by Uniform({e_k, k ∈ Z/κZ, k ≠ i}).

Definition of the kernel K^(C): If the added edge forms a cycle, then break the cycle. The user chooses a
BreakCycle kernel. Assume that X_0 = t; to get X_1 ∼ K^(C)(t, ·), do the following. Take a random oriented edge −→e = (u_1, u′_1), where u_1 is a uniform vertex and, conditionally on u_1, u′_1 is uniform among the neighbours of u_1.
• If the addition of −→e = (u_1, u′_1) to t creates a new leaf, then pick a second independent edge −→e′ = (v_1, v′_1) (with the same distribution). If −→e′ is a leaf edge of t and the removal of −→e′ from t ∪ {−→e} produces a tree t′, then take X_1 = t′; else take X_1 = t.
• If the addition of −→e = (u_1, u′_1) does not create a leaf and −→e is not an edge of t, then u′_1 belongs to t and adding −→e creates a cycle; we then define X_1 as the tree obtained by removing from t ∪ {e} one edge sampled from BreakCycle(t ∪ {e}, e) (where we write t ∪ {e} for the graph having as edge set the edge set of t together with e).

Analysis: It is irreducible and aperiodic. The chain is reversible if
BreakCycle(·) is.

Drawbacks: Checking the tree property in the rooted case is fast. Again, we did not succeed in providing a coupling-from-the-past realization of this kernel, nor in getting bounds on the mixing time. It is much faster than the other kernels in practice.
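As an illustration, here is a minimal Python sketch of one step in the spirit of K^(C), using the uniform BreakCycle kernel. The edge representation, the `sample_edge` interface and the simplified handling of the leaf sub-case are our own assumptions, not the authors' implementation:

```python
import random
from collections import defaultdict

def tree_path(edges, a, b):
    """Unique path from a to b in the tree with edge set `edges` (DFS)."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    prev, stack = {a: None}, [a]
    while stack:
        u = stack.pop()
        if u == b:
            break
        for w in adj[u]:
            if w not in prev:
                prev[w] = u
                stack.append(w)
    path, u = [], b
    while u is not None:
        path.append(u)
        u = prev[u]
    return path[::-1]                    # a, ..., b

def step_C(t, sample_edge, rng=random):
    """One step in the spirit of K^(C), with the uniform BreakCycle
    kernel: if the sampled edge (u, v) closes a cycle, remove a uniform
    edge of that cycle.  The leaf sub-case of the full definition is
    simplified here: we keep t unchanged when no cycle is created."""
    u, v = sample_edge(t)
    if (u, v) in t or (v, u) in t:
        return t
    nodes = {x for e in t for x in e}
    if u in nodes and v in nodes:        # adding (u, v) closes a cycle
        cyc = tree_path(t, u, v)         # u = c_0, ..., c_k = v
        cycle_edges = list(zip(cyc, cyc[1:])) + [(u, v)]
        e = rng.choice(cycle_edges)
        if e == (u, v):                  # breaking with the new edge itself
            return t
        return frozenset((set(t) | {(u, v)}) - {e, (e[1], e[0])})
    return t                             # leaf case skipped in this sketch
```

The key point this sketch reflects is that the removed edge is forced onto the created cycle, so a full tree-property check is never needed for this move.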
We say that t and t′ in Subtrees(Torus(N), n) are N-equivalent if they are equal up to a translation in Torus(N), and we let Subtrees(Torus(N), n)/∼ be the set of equivalence classes. The push-forward measure of Uniform(Subtrees(Torus(N), n)) by the canonical projection π_N is Uniform(Subtrees(Torus(N), n)/∼), since all classes have the same cardinality. Since the diameter of any tree with n nodes is smaller than n − 1, the previous discussion shows that the uniform distributions on Subtrees(Torus(N), n) and on Subtrees(Torus(N′), n) can be identified up to translation if N and N′ are both greater than or equal to n. When one wants to sample uniformly in Subtrees(Torus(N), n), it is then reasonable to work in Subtrees(Torus(n), n) (the smallest valid torus), or to work up to translation. Indeed, when one works with the kernel K^(C), the mixing time of the chain depends on the size of the torus: the larger the torus, the longer it takes to forget not only the shape of the initial tree but also its position. Observe that sampling in Subtrees•_{(0,0)}(Torus(n), n) and in Subtrees(Torus(n), n) are basically equivalent, since it is easy to get a sample of one of them from a sample of the other.

We programmed and ran the chain K^(C), with BreakCycle(t ∪ {e}, e) being the kernel which yields the suppression of a uniform edge of the cycle of t ∪ {e}. We made some statistics and videos to show the power and the limits of this kernel; in a few words, it can be used to sample a close-to-uniform tree with n nodes on Torus(N), for n up to say 8000 nodes in a few minutes, and n = 10000 in a few hours, for a code in C on a personal computer, starting from any distribution. Our program starts from a rectangle tree (see Fig. 2), which is a highly structured tree; we tried many Markov kernels with this kind of starting point, and only efficient Markov chains “forget” the initial distribution in a reasonable time.

Figure 2:
A rectangle tree of width W and height H.
Figure 3:
Markov chain started from a rectangle tree 40 × 40, with 1600 nodes, run on Torus(1000), and observed at time k × 200 million steps for the k-th picture. The total execution time is around 1 minute. The last tree is the result after 1.6G iterations. A film with 800 images of the 1.6G steps of the chain (2M steps between each image) is available at [18].
Figure 4:
Simulation as in Fig. 3, starting from a rectangle tree, run on Torus(1000), with a fixed number of steps of the chain between consecutive pictures. The last tree is the value of the chain after some billions of steps; the total execution time is around 1 hour. A film with 800 images of the steps of the chain is available at [18].

For any tree t in Subtrees(Torus(N(n)), n), define the Euclidean width and height w(t) and h(t) as, respectively, the number of columns and rows of the torus containing at least one vertex of t. A second variable of interest is the random graph distance D(t) = d_t(u_1, v_1) between two i.i.d. uniform nodes u_1 and v_1 of a (deterministic or random) tree t. The proportion of nodes in t with degree j is q_j(t) = |{u ∈ t : Degree(u) = j}| / |V(t)|. We conjecture the following (recall the discussion at the beginning of Section 5.1).
Conjecture 1.
For t_n taken uniformly in Subtrees(Torus(n), n):
(i) there exists α ∈ [0.64, 0.66] such that (w(t_n), h(t_n))/n^α converges in distribution to (w, h), a pair of non-trivial random variables with two almost surely positive marginals;
(ii) there exists β ∈ [3/4 − 0.01, 3/4 + 0.01] such that D(t_n)/n^β converges in distribution to D, where D is a real random variable, almost surely non-zero;
(iii) (q_j(t_n), 1 ≤ j ≤ 4) converges in probability to a constant vector (q_1, · · ·, q_4) (the degree statistics in (15) give approximate values for the q_j).

Remark 12.
The conjectured limiting proportions of nodes of each degree in (iii) are different from those of the UST in Z² (see [28, p. 112]). We made thousands of simulations of this chain, each of them running for billions of steps. These numbers of steps were decided “empirically”: starting from a rectangle tree, for example with size 1000 = 40 × 25, and performing hundreds of simulations with s steps suffices to compare some statistics, such as the width and the height, which are asymptotically the same (independently of the initial tree): in case of discordance of these statistics, s must be taken larger. The videos (available at [18]) give some clues that the mixing time has been reached (if one considers the trees up to translation), even if they are not formal proofs, of course. We then performed thousands of simulations (on a multicore PC):

Tree size                      1000      2500      5000      8100
Number of simulations          5039      5486      6111      5232
Initial rectangle tree shape   40 × 25   50 × 50   50 × 100  90 × 90
Number of steps                M         G         G         G          (10)

(hence, we made 5486 simulations of trees of size 2500, starting initially from a rectangle tree 50 × 50). For each sampled tree, ten pairs of nodes [(u_{2i−1}, u_{2i}), 1 ≤ i ≤
10] were chosen to compute the graph distances d_t(u_{2i−1}, u_{2i}), where u_{2i−1}, u_{2i} are independent and uniform in the vertex set of the tree t; this provides 10 numbers for each tree. These 10 values are dependent, as are the width and the height. Now, for each of the sampled trees, the exact number of nodes of each degree has been computed, which provides for each tree a proportion vector (q_i(t), 1 ≤ i ≤ 4).

Distance statistics
Number of nodes                  1000     2500     5000     8100
Empirical mean of the width      96.41    173.58   273.63   372.–
Empirical median of the width    –        171.00   269.00   367.00
Empirical mean of d(u_1, v_1)    95.68    189.60   317.92   457.–
Empirical median of d(u_1, v_1)  88.00    176.00   293.00   421.00     (11)

Suppose that a sequence of random variables (Y_n) is such that Y_n/n^γ converges in distribution to Z, for some γ > 0 and some non-trivial random variable Z; then it is expected that, for n and m both large, median(Y_n)/median(Y_m) should be close to (n/m)^γ. Assuming that we have a sample of i.i.d. copies of Y_n, (Y_n^{(i)}, 1 ≤ i ≤ N), we can define the empirical mean Ŷ_n = (Y_n^{(1)} + · · · + Y_n^{(N)})/N and the empirical median m̂edian(Y_n) = inf{x : |{j : Y_n^{(j)} ≤ x}| ≥ N/2}. This provides an estimator of γ, for which samples for two different values n and m are needed:
Est_median(γ) = log( m̂edian(Y_n) / m̂edian(Y_m) ) / log(n/m). (12)
By the same method, a second estimator using the empirical mean is
Est_mean(γ) = log( Ŷ_n / Ŷ_m ) / log(n/m). (13)
Finally, we introduce a last estimator of the exponent γ using the 9 empirical deciles (Dec_i(Y_n), 1 ≤ i ≤
9), where Dec_i(Y_n) = min{x : |{j : Y_n^{(j)} ≤ x}| ≥ Ni/10}. We then take as estimate of γ the value x that minimises the L¹ distance between the vectors m^x (Dec_i(Y_n), 1 ≤ i ≤ 9) and n^x (Dec_i(Y_m), 1 ≤ i ≤ 9):
Est_best-fit-decile(γ) = argmin ( x ↦ Σ_{i=1}^{9} | Dec_i(Y_n) m^x − Dec_i(Y_m) n^x | ),
for x in a bounded interval. Using pairs (n, m) gives the following estimates:

(n, m)               (1000, –)   (–, –)   (–, –)
α (mean)             0.642       0.657    0.–
α (median)           0.641       0.654    0.–
α (best fit decile)  0.640       0.656    0.–
β (mean)             0.746       0.746    0.–
β (median)           0.756       0.735    0.–
β (best fit decile)  0.744       0.748    0.753      (14)
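The three estimators (12), (13) and the best-fit-decile estimator can be sketched in a few lines of Python (the grid search over x and the search interval [0.1, 1] are our own choices, since the interval used by the authors is not recoverable here):

```python
import math
from statistics import mean, median

def est_mean(sample_n, sample_m, n, m):
    """Estimator (13): log-ratio of empirical means."""
    return math.log(mean(sample_n) / mean(sample_m)) / math.log(n / m)

def est_median(sample_n, sample_m, n, m):
    """Estimator (12): log-ratio of empirical medians."""
    return math.log(median(sample_n) / median(sample_m)) / math.log(n / m)

def deciles(sample):
    """The 9 empirical deciles Dec_i, i = 1..9."""
    s, N = sorted(sample), len(sample)
    return [s[math.ceil(N * i / 10) - 1] for i in range(1, 10)]

def est_best_fit_decile(sample_n, sample_m, n, m, step=1e-3):
    """Best-fit-decile estimator: minimise over x the L1 distance
    between the vectors m^x (Dec_i(Y_n))_i and n^x (Dec_i(Y_m))_i."""
    dn, dm = deciles(sample_n), deciles(sample_m)
    def loss(x):
        return sum(abs(a * m ** x - b * n ** x) for a, b in zip(dn, dm))
    grid = [0.1 + k * step for k in range(int(0.9 / step) + 1)]
    return min(grid, key=loss)
```

On synthetic data of the form Y_n = n^γ · U, all three estimators recover γ exactly (up to the grid step for the decile version), which is the sanity check behind the table above.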
Figure 5:
On the first line, the (interpolated) empirical cumulative distribution functions of w(t_n)/n^α for α being respectively 0.64, 0.65 and 0.66. On the second line, the (interpolated) empirical cumulative distribution functions of d_{t_n}(u_1, v_1)/n^β for β being respectively 0.74, 0.75 and 0.76. The 4 curves are so close that they are almost indiscernible (they are of course far from each other for other exponents).

(Footnote: the estimator argmin( x ↦ Σ_{i=1}^{9} |Dec_i(Y_n)/n^x − Dec_i(Y_m)/m^x| ) is not good, since the minimum is often reached at x = 1, for which all the terms inside the absolute values are small.)

Remark 13. In view of the simulations and the coincidence of the empirical cumulative distribution functions of d_{t_n}(u_1, v_1)/n^β presented in Fig. 5, it is tempting to conjecture that β = 3/4. For α, we thought that it could be 2/3, and we used a lot of computer work to produce large trees to test this, but finally larger sizes did not change much the outcome, and it seems that α should be smaller than 2/3.

Degree statistics
For a sample X = (X_1, · · ·, X_n), denote by m(X) the empirical mean and by s²(X) the sample variance:
m(X) = (X_1 + · · · + X_n)/n, s²(X) = ( Σ_{i=1}^{n} (X_i − m(X))² ) / (n − 1).

Number of nodes   1000      2500      5000      8100
m(q_1)            0.–       0.–       0.–       0.–
s²(q_1)           7E−05     3E−05     1E−05     9E−06
m(q_2)            0.–       0.–       0.–       0.–
s²(q_2)           2E−04     1E−04     5E−05     3E−05
m(q_3)            0.–       0.–       0.–       0.–
s²(q_3)           8E−05     3E−05     1E−05     1E−05
m(q_4)            0.–       0.–       0.–       0.–
s²(q_4)           1E−05     7E−06     3E−06     2E−06     (15)

Observe that the standard deviation is small and seems to go to zero fast.
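The statistics in (15) can be reproduced from a sampled tree with a few lines of Python (the edge-list representation of a tree is our own assumption; degree 4 is the maximal degree on the grid):

```python
from collections import Counter

def degree_proportions(edges, max_deg=4):
    """q_j(t): proportion of vertices of the tree t with degree j,
    for j = 1..max_deg."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    n = len(deg)                          # number of vertices of t
    by_degree = Counter(deg.values())
    return [by_degree.get(j, 0) / n for j in range(1, max_deg + 1)]

def mean_and_sample_variance(xs):
    """Empirical mean m(X) and sample variance s^2(X) as defined above."""
    n = len(xs)
    m = sum(xs) / n
    return m, sum((x - m) ** 2 for x in xs) / (n - 1)
```

Applying `degree_proportions` to each sampled tree and then `mean_and_sample_variance` to each coordinate across the sample yields the rows m(q_j) and s²(q_j) of table (15).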
Subtrees(G)

Here we will study some Markovian mechanisms with explicit invariant distributions, which allow sampling in Subtrees(G) = ∪_n Subtrees(G, n) (recall Section 1.1). In Section 8 we will turn our attention to the case where G is itself a tree, in which case a coupling from the past is possible. We define two versions of the function Remove, aiming at removing an edge e of a tree t, depending on whether we are dealing with rooted trees or not. For an oriented edge −→e, we denote by e its unoriented version. Unrooted version of the
Remove function:
Remove : Subtrees(G) × −→E −→ Subtrees(G)
(t, −→e) ↦ t′ = Remove(t, −→e)
where the direction of −→e is used only when t has a single edge:
• if E(t) = {e} and −→e = (v_1, v_2), then t′ = {v_1}, the tree reduced to the single node v_1,
• else (if |E(t)| > 1), if E(t) \ {e} is the edge set of a tree t⋆, set t′ = t⋆,
• otherwise, t′ = t.
Remove_r: it aims at removing an edge in a rooted tree (t, r) ∈ Subtrees•_r(G), while preserving r. Here, since the tree is rooted at r, r is never considered as a leaf. If (t, r) ∈ Subtrees•_r(G) and e = {e_1, e_2}, then, up to renaming the vertices, one may suppose that e_1 is the parent of e_2 in (t, r) (that is, e_1 is closer to r):
– if e_2 is not a leaf, then do nothing and set t′ = t,
– if e_2 is a leaf, then t′ is the tree with vertex set V(t′) = V(t) \ {e_2} and edge set E(t′) = E(t) \ {e} (r is preserved).
Define the function Add as
Add : Subtrees(G) × E −→ Subgraphs(G)
(t, e) ↦ g = Add(t, e)
where the graph g has edge set E(g) = E(t) ∪ {e} if e is adjacent to t, and g = t otherwise. Hence g is connected and may have at most one cycle, and in this case it contains e (Excess(g) ≤ 1). When Add has been used, a correction of the obtained graph is sometimes needed if one needs to output a tree, and we use again the
BreakCycle kernels.
Subtrees(G)

We present here a Markov kernel reminiscent of the discrete-time birth-and-death process. This famous model of Markov chains (Y_j, j ≥ 0) takes its values in N; its Markov kernel is parametrized by a sequence of triplets [(a_k, b_k, c_k), k ≥ 0] as follows:
P(Y_{j+1} = k + 1 | Y_j = k) = a_k, P(Y_{j+1} = k | Y_j = k) = b_k, P(Y_{j+1} = k − 1 | Y_j = k) = c_k,
with c_0 = 0. It is known that such a chain is positive recurrent if Σ_k Π_{j=1}^{k} (a_{j−1}/c_j) < +∞, in which case the invariant distribution is proportional to π_k = Π_{j=1}^{k} (a_{j−1}/c_j).
Consider a sequence of triplets [(p_i, q_i, r_i), 1 ≤ i ≤ |V|], indexed by the possible subtree sizes of G = (V, E), which will be used to try to “add”, “do nothing” or “remove” one edge of the current tree. As above, for all i, p_i + q_i + r_i = 1. For the moment we assume that
r_i > 0 for all i ∈ ⟦2, |V|⟧, and p_i > 0 for all i ∈ ⟦1, |V| − 1⟧.
We take a collection of kernels
BreakCycle(·) satisfying the reversibility condition (9).

Definition of the kernel K^(D). Assume X_0 = t ∈ Subtrees(G) (of any size). To define X_1 ∼ K^(D)(t, ·), proceed as follows. Pick, independently, a random oriented edge −→e ∼ Uniform(−→E(G)) and “a random choice c”, where P(c = +1) = p_{|t|}, P(c = 0) = q_{|t|}, P(c = −1) = r_{|t|}, which will be the respective probabilities to “try” to add e, to do nothing, and to remove −→e. Do:
• if c = +1, “try to add e”: consider g = Add(t, e). If g is a tree, set X_1 = g. If g has a cycle, then set X_1 = g \ {e′}, where e′ is chosen according to BreakCycle(g, e),
• if c = 0, do nothing and set X_1 = t,
• if c = −1, “try to remove −→e”: set X_1 = Remove(t, −→e).

Analysis: K^(D) is clearly aperiodic and irreducible. If t′ and t have the same number of edges and t′ ≠ t, then one can pass from t to t′ by a transition of this Markov chain only if E(t) and E(t′) are almost the same:
E(t) \ {e} = E(t′) \ {e′} for two edges (e, e′). (16)
In words, the edge e has been suppressed from E(t) and e′ added, to get t′. Hence
P(X_1 = t′ | X_0 = t) = p_{|t′|} × (1/|E|) × BreakCycle(g, e′)({e}) (17)
for g, the graph with edge set E(t) ∪ {e′} (because we need c = +1, to pick e′, and then to pick the right edge e on the cycle of g created by e′, with probability BreakCycle(g, e′)({e})). Since BreakCycle(·) has the reversibility property, in this case P(X_1 = t′ | X_0 = t) = P(X_1 = t | X_0 = t′).
Consider t ∈ Subtrees(G) such that 3 ≤ |t| < |V| and suppose that e ∈ E(t) is such that one endpoint of e is a leaf in t. Then the kernel satisfies
K^(D)(t, t \ {e}) = (1/|E|) r_{|t|}, K^(D)(t \ {e}, t) = (1/|E|) p_{|t|−1}, (18)
and again the case |t| = 2 provides a slight complication, in which case
K^(D)(t, t \ {e}) = (1/(2|E|)) r_{|t|}, K^(D)(t \ {e}, t) = (1/|E|) p_{|t|−1}. (19)

Proposition 14.
The Markov chain with kernel K^(D) is reversible and its unique invariant measure ρ on Subtrees(G) gives the same weight ν_n to each element of Subtrees(G, n), for all 1 ≤ n ≤ |V|, that is, ρ_t = ν_{|t|} for all t ∈ Subtrees(G). The sequence (ν_k, 1 ≤ k ≤ |V|) satisfies
ν_m = 2 ν_1 Π_{i=2}^{m} ( p_{i−1} / r_i ), for all m ∈ ⟦2, |V|⟧, (20)
and
Σ_{n=1}^{|V|} ν_n |Subtrees(G, n)| = 1. (21)
Hence, if t ∼ ρ, L(t | |t| = n) is the uniform distribution on Subtrees(G, n).

Remark 15.
In the Proposition, the sequence (ν_i) depends on G, and thus it should have been written (ν_i(G)) to make this dependence clearer.

Proof. First, by Perron–Frobenius, there is a unique invariant measure. Therefore, it is enough to show that the measure ρ on Subtrees(G) described in the proposition satisfies the detailed balance equations (8). For t and t′ = t \ {e} with |t| ≥ 2:
ν_{|t|} K^(D)(t, t \ {e}) = ν_{|t \ {e}|} K^(D)(t \ {e}, t). (22)
From (18) one sees that ν_{|t|} = ν_{|t|−1} p_{|t|−1} / r_{|t|} when |t| ≥ 3. Plugging (19) into (22), in the case |t| = 2, gives:
ν_2 (1/(2|E|)) r_2 = ν_1 (1/|E|) p_1 ⇔ ν_2 = 2 ν_1 p_1 / r_2.

Remark 16. ■
Tuning the sequence (p_i, q_i, r_i) allows one to favour a tree size, or an approximate tree size. ■ If q_i = 0 and r_i = p_i = 1/2 for all i, and if the function BreakCycle(·) has the reversibility property, then ν_{|t|} = 2ν_1 for all |t| ≥ 2, so that the distribution is uniform on Subtrees(G) (except for the tree reduced to a single node, which has a different weight).

A variant with a fixed root. One can turn K^(D) into a kernel K^(D)_r of a Markov chain taking its values in Subtrees•_r(G), where r is a fixed vertex of V. This version will play an important role for the exact sampling of a uniform subtree of a graph in Section 8. We define K^(D)_r by emphasising its differences with K^(D): to preserve r, use Remove_r instead of Remove, and instead of taking directed edges −→e in −→E(G), consider the unoriented ones, e in E(G). In this case, one can prove the following proposition by adapting the proof of Proposition 14.

Proposition 17.
The Markov chain with kernel K^(D)_r is reversible and its unique invariant measure ρ•_r on Subtrees•_r(G) gives the same weight ν_n to each element of Subtrees•_r(G, n), for all 1 ≤ n ≤ |V|, that is, ρ_t = ν_{|t|} for all t ∈ Subtrees•_r(G). The sequence (ν_k, 1 ≤ k ≤ |V|) satisfies
ν_m = ν_1 Π_{i=2}^{m} ( p_{i−1} / r_i ), for all m ∈ ⟦2, |V|⟧, (23)
and
Σ_{n=1}^{|V|} ν_n |Subtrees•_r(G, n)| = 1. (24)
Hence, if t ∼ ρ•_r, L(t | |t| = n) is the uniform distribution on Subtrees•_r(G, n). Compared to (20), (23) changes the factor 2 on the right-hand side into a 1.
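For a small graph whose subtree counts |Subtrees(G, n)| are known, the weights ν_n of Proposition 14 can be computed numerically; the following Python sketch uses our own interface (dicts indexed by size), and the triangle K_3, which has exactly 3 subtrees of each size 1, 2 and 3, as an example:

```python
def nu_weights(p, r, counts):
    """Weights (nu_n) of the invariant measure of K^(D), from
    nu_m = 2 nu_1 prod_{i=2}^m p_{i-1}/r_i   (20),
    normalised by sum_n nu_n |Subtrees(G, n)| = 1   (21).
    p[i], r[i] are the probabilities p_i, r_i;
    counts[n] = |Subtrees(G, n)|."""
    w = {1: 1.0}                       # unnormalised weights, nu_1 := 1
    prod = 1.0
    for m in range(2, max(counts) + 1):
        prod *= p[m - 1] / r[m]
        w[m] = 2.0 * prod
    Z = sum(w[n] * c for n, c in counts.items())   # normalisation (21)
    return {n: w[n] / Z for n in counts}

# Triangle K3: 3 subtrees of each size 1, 2 and 3.
half = {i: 0.5 for i in (1, 2, 3)}
nu = nu_weights(half, half, {1: 3, 2: 3, 3: 3})
```

With p_i = r_i = 1/2 (the case q_i = 0 of Remark 16), the unnormalised weights are 1, 2, 2, so Z = 3·1 + 3·2 + 3·2 = 15 and ν = (1/15, 2/15, 2/15): uniform on Subtrees(G) except for the single-node trees, as Remark 16 states.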
We propose in this part a kernel having a computable invariant distribution when all the vertices of G = (V, E) have the same degree D. This kernel is almost the same as the previous one (K^(D)) and its analysis is the same, but it mixes much faster: the idea is to pick edges adjacent to the current tree, instead of uniform edges in E(G).

Definition of the kernel K^(E): A fast kernel for regular graphs. Keep exactly the same definition as for kernel K^(D), except for the choice of the random edge −→e, for which we do the following instead. Assume that X_0 = t; pick a node u_1 uniformly at random in V(t), and then a random edge −→e = (u_1, u′_1) uniformly in the set of edges adjacent to u_1 (so that u_1 is the origin of this edge).

Analysis: Transitions between trees of the same size are handled as for K^(D). And it is direct to check that, for any t such that |t| ≥ 3 and e an edge such that t \ {e} is a tree (with one node less),
K^(E)(t, t \ {e}) = (1/|t|) ( 1/deg_G(u_1) + 1/deg_G(u′_1) ) r_{|t|},
K^(E)(t \ {e}, t) = (1/(|t| − 1)) (1/deg_G(u_1)) p_{|t|−1};
again the case |t| = 2 is special: if t′ = Remove(t, (u_1, u′_1)) is the tree reduced to u_1, then
K^(E)(t, t′) = (1/(|t| deg_G(u_1))) r_{|t|} = r_2 / (2 deg_G(u_1)),
K^(E)(t′, t) = (1/(|t′| deg_G(u_1))) p_{|t′|} = p_1 / deg_G(u_1),
since in this transition the directed edge (u_1, u′_1) needs to have the right direction.

Proposition 18. If the degree of all nodes in G is the same, then the Markov chain with kernel K^(E) is reversible and its unique invariant measure ρ on Subtrees(G) gives the same weight ν_n to each element of Subtrees(G, n), for all 1 ≤ n ≤ |V|, that is, ρ_t = ν_{|t|} for all t ∈ Subtrees(G). The sequence (ν_k, 1 ≤ k ≤ |V|) satisfies
ν_m = 2 ν_1 Π_{i=2}^{m} ( (p_{i−1}/(i − 1)) / (r_i/i) ), for 2 ≤ m ≤ |V|, (25)
and
Σ_{n=1}^{|V|} ν_n |Subtrees(G, n)| = 1. (26)
Hence, if t ∼ ρ, L(t | |t| = n) is the uniform distribution on Subtrees(G, n).

Remark 19.
Recall that the kernels K^(D) and K^(E) are defined using the sequence of triplets [(p_i, q_i, r_i), 1 ≤ i ≤ |V|]. The conditions (p_i > 0, 1 ≤ i < |V|) and (r_i > 0, 2 ≤ i ≤ |V|) are imposed so that they ensure the irreducibility of these chains on Subtrees(G). Now, assume that one takes X_0 according to some distribution ν with support in ∪_{n ∈ ⟦n_0, n_1⟧} Subtrees(G, n), where 1 ≤ n_0 < n_1 ≤ |V|. Assume that r_{n_0} = 0 and p_{n_1} = 0, and that r_k > 0 for k ∈ ⟦n_0 + 1, n_1⟧ and p_k > 0 for k ∈ ⟦n_0, n_1 − 1⟧. In this case, the Markov chain under consideration is irreducible on ∪_{n ∈ ⟦n_0, n_1⟧} Subtrees(G, n) (exercise left to the reader). We then have the same result for the invariant measure as in Proposition 18, between 1 and n_1 (instead of |V|) when n_0 = 1; and if n_0 > 1, the invariant distribution is given by
ν_m = ν_{n_0} Π_{i=n_0+1}^{m} ( (p_{i−1}/(i − 1)) / (r_i/i) ) for m ∈ ⟦n_0 + 1, n_1⟧. (27)
When n_0 < n_1, the irreducibility of the chain and (26) are easily adapted to the present case. If n_0 = n_1, then one can see that the vertex set V(X_0) of the initial tree X_0 cannot change: for each i, V(X_i) = V(X_0), so that this model is a Markov chain taking its values in the spanning trees of V(X_0), and this is not the problem we are interested in here (Wilson's or the Aldous–Broder algorithm are the right tools for that).

In the pursuit of an algorithm to sample a subtree of size n with distribution Uniform(Subtrees(G, n)), a first leading idea is that when n = N, the method sought must output a UST.
Since only two main methods are known, the Aldous–Broder algorithm and Wilson's algorithm, it is natural to search in two directions:
(a) try to generalize these methods to smaller subtree sizes,
(b) try to find a way to extract a subtree of size n of the UST with the right distribution.
In this section, we review several models of random subtrees in Subtrees(G, n) that are present in the literature, or that are new.
We take the same setting as in Section 2.1: G is a connected graph, M a positive Markov kernel on G, and W an M-Markov chain (we drop the condition of reversibility). Recall the definition of FirstEntranceTree(W_0, · · ·, W_{τ_{|V|}}) given in (2). The aim of the pioneer tree is to generalise the Aldous–Broder construction: instead of taking all the first entrance edges to all nodes (for an M-Markov chain under its stationary regime), which provides a tree with weight Π_e ←M_e as stated in Theorem 4, just keep the first n of them! To show that it shares, as the uniform spanning tree model does, a strong link with a tree-valued Markov chain, we will need to label its edges.

Definition of Model A:
The pioneer random tree. The n-pioneer tree PRT_n(W_i, 0 ≤ i ≤ τ_n) is the rooted edge-labelled tree (FirstEntranceTree(W_i, 0 ≤ i ≤ τ_n), L_n), where L_n gives the label k − 1 to the edge (W_{τ_k − 1}, W_{τ_k}), for all 2 ≤ k ≤ n. Hence, the vertex set of PRT_n(W_i, 0 ≤ i ≤ τ_n) is {W_0, · · ·, W_{τ_n}}, the set formed by the first n distinct vertices visited by W.

Definition 20.
Denote by
Subtrees^{•,L,↓}_r(G, n) the set of rooted edge-labelled trees ((t, r), ℓ) such that (t, r) belongs to Subtrees•_r(G, n), and such that the n − 1 labels associated with the edges form the set {1, · · ·, n − 1} and are decreasing on any injective path from a leaf to the root r.

A simple consequence of the construction is the following fact:
Lemma 21.
The Pioneer tree
PRT_n(W_0, · · ·, W_{τ_n}) belongs to Subtrees^{•,L,↓}_{W_0}(G, n), and
PRT_n ⊂ PRT_{n+1}, for any 1 ≤ n ≤ |V| − 1. (28)
Hence, for all n, PRT_n is an edge-labelled subtree of the global spanning tree PRT_{|V|} equipped with its edge-labels. In the same way as the Aldous–Broder
FirstEntranceTree(W_0, · · ·, W_{τ_{|V|}}) can be seen as the state, at time 0, of a spanning-tree-valued Markov chain started at time −∞ (this is the argument at the core of the Aldous–Broder proof), for any n the pioneer tree has a very similar property, for the following Markov chain taking its values in ∪_{r ∈ V} Subtrees^{•,L,↓}_r(G, n): again, n is any number in ⟦1, |V|⟧, so that the following construction comprises the spanning tree case, but not only that case.

Definition of the kernel K^(F): Add a random step and erase the oldest edge. Assume that at time 0, ((T_0, R_0), L_0) is a tree rooted at R_0, an element of Subtrees^{•,L,↓}_{R_0}(G, n). Under the kernel K^(F), ((T_1, R_1), L_1) is defined as follows:
• First, P(R_1 = v | R_0 = u) = ←M_{u,v}, which means that the roots (R_k, k ≥
0) perform a Markov chain with kernel ←M on G (observe the kernel used here).
• Consider the oriented edge e = (R_0, R_1) of G; R_1 will be the new root of the new tree T_1.
(a) If R_1 = R_0 (possible if there is a loop): in this case set ((T_1, R_1), L_1) = ((T_0, R_0), L_0).
(b) If R_1 is already in T_0, then adding the edge e = (R_0, R_1) to T_0 creates a cycle (possibly the small cycle R_0 → R_1 → R_0). To get T_1, add e to T_0, label e temporarily 0, record m the maximal label on the created cycle, and remove the edge with label m; finally, orient the remaining edges of the cycle toward R_1.
(c) Else, R_1 was not in T_0, so that if one adds the edge e = (R_0, R_1) to T_0, then R_1 is a new node. To get T_1, add the edge e to T_0, label e temporarily 0 and remove the edge e′ adjacent to the leaf with maximal label (the label m of e′ is n − 1).
(Footnote: an injective path w = (w_0, · · ·, w_m) is a path such that i, j ∈ ⟦0, m⟧, i ≠ j ⇒ w_i ≠ w_j.)
To define L_1 in both cases, keep the labels of all edges of L_0 that are > m, and add 1 to all the other labels (those in ⟦0, m − 1⟧, including the new one, labelled temporarily 0).

Proposition 22. (i) The labels L_1 are distinct and decreasing on each path toward the root, so that K^(F) is indeed a Markov kernel on ∪_{r ∈ V} Subtrees^{•,L,↓}_r(G, n).
(ii) If (X^{(n)}_j, j ≥ 0) is a Markov chain on ∪_{r ∈ V} Subtrees^{•,L,↓}_r(G, n) with kernel K^(F), then, for X^{(n−1)}_j the labelled tree obtained by removing the edge with largest label in X^{(n)}_j, the process (X^{(n−1)}_j, j ≥ 0) is a Markov chain on ∪_{r ∈ V} Subtrees^{•,L,↓}_r(G, n − 1) with kernel K^(F).
(iii) The Markov chain with kernel K^(F) is ergodic on ∪_{r ∈ V} Subtrees^{•,L,↓}_r(G, n), and its invariant distribution is the distribution of the pioneer tree PRT_n(W_i, 0 ≤ i ≤ τ_n) for W_0 following the invariant distribution ρ of M (with full support on ∪_{r ∈ V} Subtrees^{•,L,↓}_r(G, n)).

Hence, several points seem interesting: the consistency of the trees (
PRT_n, 1 ≤ n ≤ |V|), the fact that a labelling is needed to construct this coupling, the fact that the Aldous–Broder scheme to study the FirstEntranceTree can be applied here again, using a time-reversed chain under its stationary distribution, and also the fact that, forgetting their labels, all of them are subtrees of the original Aldous–Broder spanning tree.
Sketch of proof.
Giving all the details would take too much room; we give the main ideas only.
(i) The proof is done by inspection of the two cases (b) and (c) in the definition of K^(F).
(ii) Suppose that ((t_n, ℓ_n), r) and ((t_{n+1}, ℓ_{n+1}), r) are two edge-labelled trees with n and n + 1 nodes, such that ((t_n, ℓ_n), r) is obtained from ((t_{n+1}, ℓ_{n+1}), r) by the suppression of the edge with highest label n (we write Proj((t_{n+1}, ℓ_{n+1}), r) = ((t_n, ℓ_n), r)). When taking a step under the kernel K^(F), a new edge (r, r′) is added, and r′ becomes the new root: this addition gives different possible situations for t_n and for t_{n+1}:
– (A) r′ is not in t_{n+1} (nor in t_n),
– (B) r′ is in t_{n+1} but not in t_n.
In case (A), after applying (c) of the definition of K^(F), the two obtained trees ((t′_n, ℓ′_n), r′) and ((t′_{n+1}, ℓ′_{n+1}), r′) satisfy Proj((t′_{n+1}, ℓ′_{n+1}), r′) = ((t′_n, ℓ′_n), r′). In case (B), the cycle obtained by adding (r, r′) to t_{n+1} necessarily contains the edge with largest label of t_{n+1} (otherwise a cycle would also have been created by adding (r, r′) to t_n). From here the conclusion is simple.
(iii) The main idea consists in introducing a time reversal (as in the Aldous–Broder argument) and a second family of trees that we call LastExitTree. Any finite path (z_0, · · ·, z_m) on G can be used to define a rooted tree LastExitTree(z_0, · · ·, z_m), rooted at z_m, as follows: first, LastExitTree(z_0) is the tree reduced to its root z_0; from k = 0 to m − 1, LastExitTree(z_0, · · ·, z_{k+1}) is obtained from LastExitTree(z_0, · · ·, z_k) by the suppression of the outgoing edge from z_{k+1} (if any), by the addition of the edge (z_k, z_{k+1}), and by setting the root at z_{k+1}.
The set of nodes of LastExitTree(z_0, ..., z_m) is {z_0, ..., z_m}; if one denotes by ν_k = max{j : |{z_j, ..., z_m}| = k} the last time at which k nodes remain to be visited "in the future", then, for any k ∈ {1, ..., |{z_0, ..., z_m}|}, ν_k is the date of the visit of a node for the last time; hence, the tree LastExitTree(z_0, ..., z_m) has for edges

(z_{ν_k}, z_{ν_k + 1}), for k from |{z_0, ..., z_m}| down to 2.   (29)

In Definition 2, FirstEntranceTree is associated with a covering path; this definition can be extended to any path, covering or not. It is immediate to check that, for any path (w_0, ..., w_m) on G,

FirstEntranceTree(w_0, ..., w_m) = LastExitTree(w_m, ..., w_0).   (30)
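To make the two constructions concrete, here is a small Python sketch (ours, not from the paper): `last_exit_tree` builds LastExitTree(z_0, ..., z_m) edge by edge as in the definition above, `first_entrance_tree` keeps the entrance edge of each newly visited vertex, and the identity (30) is checked on random walk paths on a cycle (the cycle size and the seed are arbitrary illustration choices).

```python
import random

def last_exit_tree(path):
    """LastExitTree(z_0, ..., z_m), rooted at z_m, built edge by edge: at each
    step drop the outgoing edge of the new root z_{k+1} (if any) and add the
    edge (z_k, z_{k+1}). Returned as an undirected edge set."""
    out = {}                          # out[v] = head of the outgoing edge of v
    for k in range(len(path) - 1):
        zk, znext = path[k], path[k + 1]
        out.pop(znext, None)          # suppress the outgoing edge of the new root
        out[zk] = znext               # add the edge (z_k, z_{k+1})
    return {frozenset((a, b)) for a, b in out.items()}

def first_entrance_tree(path):
    """FirstEntranceTree(w_0, ..., w_m): keep, for each vertex first reached at
    some time t > 0, the entrance edge (w_{t-1}, w_t)."""
    seen, edges = {path[0]}, set()
    for t in range(1, len(path)):
        if path[t] not in seen:
            seen.add(path[t])
            edges.add(frozenset((path[t - 1], path[t])))
    return edges

# check identity (30) on random walk paths on the cycle Z/6Z
rng = random.Random(0)
for _ in range(100):
    w = [0]
    for _ in range(30):
        w.append((w[-1] + rng.choice((-1, 1))) % 6)
    assert first_entrance_tree(w) == last_exit_tree(w[::-1])
```

The check relies on the fact that the last exit edge of a vertex in the reversed path is the first entrance edge of the same vertex in the forward path, read backwards.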
Figure 6:
Remove oldest edge: simulation on Torus(1000) of a tree with 1000 and then 10000 edges (in the first case, 25M steps starting from a rectangle tree 40 × …, in the second case, 200M steps starting from a rectangle tree 100 × …).

Assume now that (X_k, k ∈ Z) is an M Markov chain and (Y_k, k ∈ Z) is a ←−M Markov chain, both of them taken under their invariant distribution. We start with the spanning tree case. There are three main ideas:

■
Construction of
LastExitTree following the "erase the oldest" dynamic. The process k ↦ LastExitTree(Y_i, i ≤ k) is a Markov process such that, from time k to k + 1, a new edge (Y_k, Y_{k+1}) is added and the outgoing edge e from Y_{k+1}, if any, is suppressed; in such a case, before the suppression, the addition of (Y_k, Y_{k+1}) created a cycle C. By induction on k, one can prove that the edge creation timestamps give an increasing labelling on any injective path to the root. We claim that the edge e was the "oldest" edge of C. This sentence has a meaning, since the date of creation of each edge is (Y_i, i ≤ k)-measurable: each edge corresponds to the last exit edge at a node. Therefore, the further from the root an edge of the LastExitTree is, the smaller its creation timestamp, and therefore the older it is. Hence, the edge (Y_k, Y_{k+1}) creates a cycle with a path going to Y_k, which is then a branch in the tree, so that the outgoing edge from Y_{k+1} is indeed the oldest in the cycle. Hence, up to the labels, the tree in the "erase the oldest edge" dynamics is the same as k ↦ LastExitTree(Y_i, i ≤ k).

■ Adding the "right" labels to the analysis.
Label the edges of
LastExitTree(Y_i, i ≤ k) by ℓ_k, the relative order in ⟦1, |{Y_0, ..., Y_k}| − 1⟧ of their creation timestamps; as in the preceding part, this is an increasing labelling on any injective path towards the root. We produce a reverse labelling ℓ↓_k of ℓ_k as follows:

ℓ↓_k(e) = |{Y_0, ..., Y_k}| − j,  if ℓ_k(e) = j, for j ∈ ⟦1, |{Y_0, ..., Y_k}| − 1⟧.

Under ℓ↓_k, the bigger the label, the smaller its timestamp, and therefore the older the edge. Now, in the spanning tree case, the chain "erase the oldest edge" and k ↦ (LastExitTree(Y_i, i ≤ k), ℓ↓_k) (for k large enough) can be identified under their stationary regime (this can be seen more easily by the time reversal argument that follows).

■ Time reversal application to obtain
Pioneer from
LastExitTree + labels:
The combinatorial property (30) allows one to see that if X is an M Markov chain and Y an ←−M Markov chain under their common stationary distribution ρ, then

PRT(X_i, 0 ≤ i ≤ τ_|V|) =(d) (LastExitTree(Y_i, i ≤ 0), ℓ↓_0).

To complete the proof for n < |V|, it suffices to use (28) and its counterpart for (LastExitTree(Y_i, i ≤ 0), ℓ↓_0): in words, one keeps, from the spanning tree process, the first n − 1 edges to get PRT(X_i, 0 ≤ i ≤ τ_n), and on the right-hand side, the tree (LastExitTree(Y_i, i ≤ 0), ℓ↓_0) restricted to the edges {e : ℓ↓_0(e) ≤ n − 1}, suitably normalized; this shows that, from a probabilistic perspective, the question is the following.

Open question 3. Give a description of the distribution of FirstEntranceTree(W_0, ..., W_{τ(n)}) conditionally on the vertex set {W_0, ..., W_{τ(n)}}.

For more information on the combinatorics behind this model, we send the reader to [19].

Subtrees(G, n)

After many tries to sample Uniform(Subtrees(G, n)) using Markov chains on the graph G itself, we finally understood why it is not possible in a "giant graph" (|V| ≫ n) if one uses only a Markov chain stopped before it visits an important part of the graph. We build the following minimal statement and proof, but the argument can be used to reject many constructions one may want to try. The main idea is the following: a simple random walk has a simple stationary measure ρ which only depends on the degree of each node (that is, ρ_u = deg(u) / Σ_{v ∈ V} deg(v)). Hence, a Markov chain taken under its invariant distribution is localized in the graph "proportionally to the degree of the starting node". The probability that a uniform tree in Subtrees(G, n) has vertex set V′ ⊂ V is proportional to the number of spanning trees in Induced_G(V′), which, roughly, can be thought to depend on the product of the node degrees in Induced_G(V′) rather than on their sum. Hence, the distribution of the support V′ has somehow nothing to do with ρ.
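The fact that the stationary measure ρ of the simple random walk weights each vertex proportionally to its degree can be checked numerically; the following sketch (our illustration, on an arbitrary 4-vertex graph) compares long-run occupation frequencies with ρ_u = deg(u)/Σ_v deg(v).

```python
import random
from collections import Counter

# a small connected graph with unequal degrees (arbitrary illustration)
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
rng = random.Random(1)
v, counts, steps = 0, Counter(), 200_000
for _ in range(steps):
    v = rng.choice(adj[v])        # simple random walk step
    counts[v] += 1

total_deg = sum(len(ns) for ns in adj.values())   # sum_v deg(v) = 10 here
for u in adj:
    rho_u = len(adj[u]) / total_deg               # rho_u = deg(u) / sum_v deg(v)
    assert abs(counts[u] / steps - rho_u) < 0.02  # occupation frequency ~ rho_u
```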
The following theorem formalizes this idea.

Theorem 23. Consider a simple random walk W = (W_k, k ∈ Z) on a graph G = (V, E) under its invariant distribution (meaning that, knowing W_i, W_{i+1} is uniform among the neighbours of W_i). Denote by →τ_n := inf{k ≥ 0 : |{W_0, ..., W_k}| = n} the first time the random walk has visited n points or, "the same thing" backward, ←τ_n := max{k ≤ 0 : |{W_k, ..., W_0}| = n}. In general, there does not exist any n-tree map F such that F(W_0, ..., W_{→τ_n}) is uniform on Subtrees(n, G) or on Subtrees•(n, G). The same statement holds for F(W_{←τ_n}, ..., W_0) instead.

The "in general" in the statement is important. The Aldous–Broder theorem asserts that when n = |V| the map F exists: it is FirstEntranceTree! The proof of Theorem 23 consists in exhibiting a family of graphs on which, for n small compared to G, it is not possible to extract from (W_0, ..., W_{a_n}) a uniform element of Subtrees(n, G), even for a_n large compared to τ_n.

Proof. Observe the graph of Fig. 7, formed by a line graph with m points appended to a ladder graph with m vertices.

Figure 7: A line graph of size m connected with a ladder of size m. On the second line, an example of tree where, at even abscissas, the two sites are occupied and, at odd abscissas, either the up or down site is occupied: a single tree spans these nodes (a word of size approx. n/2 on the alphabet "up", "down" encodes this tree).

Observe that there are: a unique vertex of degree 1 (one endpoint of the line graph), and vertices of degree 2 and of degree 3; so ρ gives respectively weight 1/(2m + 3m − 2), 2/(2m + 3m − 2) and 3/(2m + 3m − 2) to them. Suppose both parameters are of the type m = n × factor, with factor a large number. If one takes a point u in the line graph at least at distance n from the border of the line graph, this point belongs to n trees of size n.
A point v in the ladder, at least at distance n from its border, belongs to at least c × β^n trees of size n, for some c > 0 and β > 1; in particular, to at least 2^{n/2} trees. The conclusion: assume that, starting from ρ, one could discover, by a random walk visiting n different points of the graph, the researched tree of size n: a proportion of one third of the trees of size n should be on the line graph (since it represents 1/3 of the ρ-mass when factor → +∞), but we just said that this proportion is smaller than n/(n + 2^{n/2}).

K^(F): remove the youngest edge

Just replace maximal by minimal in the description of K^(F). This kernel has a tendency to destroy all edges, and provides a poor model of random trees (see Fig. 8).

Figure 8: Remove youngest edge: simulation on Torus(1000) of a tree with 1000 and then 10000 edges (in the first case, 1M steps starting from a rectangle tree 40 × …, in the second case, 10M steps starting from a rectangle tree 100 × …).

7.3 A model inspired by Wilson algorithm

Definition of Model B: The connected component of r in the "one outgoing edge per node" model. Let r ∈ V be a distinguished vertex; consider (e_u, u ∈ V \ {r}) a family of independent random directed edges, where e_u = (u, u′) and u′ is a uniform neighbour of u. Denote by t(r) the connected component of r: it is a tree.

For a general graph G, the support of the distribution of t(r) is not Subtrees•_r(G). For example, if G = Torus(n), each connected component of the complement of t(r) contains oriented cycles, and then these components cannot have size 1 (see a simulation in Fig. 9).

Given a tree t, recall that V_p(t) = {w ∈ V : d_G(w, V(t)) = 1} is the set of perimeter sites of t. For each w ∈ V_p(t), let p_t(w) = |{(w, u) ∈ E : u ∉ t}| / deg_G(w) be the probability that the outgoing edge from w does not touch t.
For any t ∈ Subtrees•_r(G),

P(t(r) = t) = (∏_{v ∈ t, v ≠ r} 1/deg_G(v)) (∏_{w ∈ V_p(t)} p_t(w)).   (31)

Figure 9: Simulation following 7.3 on Torus(200): 3536949 simulations were needed to get a tree of size at least 100. In fact, by chance, the tree had exactly size 100. It seems that a mean of around 5 million simulations is needed to reach this size. Simulating big trees by this method seems out of reach.

To get a model having full support in Subtrees•_r(G), let us allow vertices without outgoing edge.

Definition of Model C: At most one outgoing edge per node. Take a parameter q ∈ (0, 1] and consider a collection [B_v(q), v ∈ V \ {r}] of i.i.d. Bernoulli(q) random variables to label the vertices. Consider, for each vertex u in V \ {r} with B_u(q) = 1, a uniform random outgoing edge e_u = (u, u′), independent of the others (defined as in Model B). Again take t_q(r), the connected component of r.

It is simple to see that, for t ∈ Subtrees•_r(G),

P(t_q(r) = t) = (∏_{v ∈ t, v ≠ r} q/deg_G(v)) (∏_{w ∈ V_p(t)} ((1 − q) + q p_t(w))).

When q = 1 we recover the model (31) above, and for q ∈ (0, 1) the support of t_q(r) is the complete set Subtrees•_r(G). However, in practice, on Torus(n), this model produces very small trees, even smaller than in the case q = 1, for which getting a large tree is already rare (see Fig. 9).

Recall the definition of forest given in Section 3.
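Models B and C can be sketched as follows (our illustration; the torus side, the parameter q and the seed are arbitrary): every vertex u ≠ r keeps, with probability q, one uniform outgoing edge, and t_q(r) is the set of vertices whose forward orbit reaches r.

```python
import random

def t_q(n, r, q, rng):
    """Models B/C on Torus(n): every u != r keeps, with probability q, one
    uniform outgoing edge e_u = (u, u'); t_q(r) is the connected component
    of r, i.e. the vertices whose forward orbit ends at r (Model B: q = 1)."""
    def nbrs(u):
        x, y = u
        return [((x + 1) % n, y), ((x - 1) % n, y),
                (x, (y + 1) % n), (x, (y - 1) % n)]
    out = {}
    for x in range(n):
        for y in range(n):
            u = (x, y)
            if u != r and rng.random() < q:
                out[u] = rng.choice(nbrs(u))
    comp, grew = {r}, True
    while grew:             # a vertex joins when its out-edge points into comp
        grew = False
        for u, v in out.items():
            if u not in comp and v in comp:
                comp.add(u)
                grew = True
    return comp

rng = random.Random(2)
sizes = [len(t_q(20, (10, 10), 1.0, rng)) for _ in range(100)]
# in line with the text: the components obtained are typically very small
assert all(1 <= s <= 20 * 20 for s in sizes)
```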
A distribution on the set of spanning "subforests" f of a graph G is said to be size biased if P(f = (f_1, ..., f_k)) is proportional to ∏_{j=1}^k |f_j|, for any forest, with any number of components: roughly, this distribution favours the multiplicity of components of small sizes ≥ 2. To define the model, starting from G, it suffices to add a point to the vertex set, that is, to take V′ = V ∪ {z}, and to add an edge between z and all the elements of V, that is, to define E′ = E ∪ {{z, v}, v ∈ V}. Set G′ = (V′, E′).

Definition of Model D: The tree containing r in a size biased forest. Let T′ be a UST of G′ = (V′, E′), and consider the forest f = (f_1, ..., f_k) (for some k ≥ 1) with vertex set V and edge set E(f) = E(T′) ∩ E, that is, the edges of T′ not adjacent to z. The forest f is a size biased spanning forest (since each f_i can be connected by |f_i| different edges to z). Let t(r) be the connected component of f containing r.

For all t ∈ Subtrees•_r(G),

P(t(r) = (t, r)) = |t| × |SpanningTrees(G′ \ t)| / |SpanningTrees(G′)|,

where |SpanningTrees(G′)| is the number of spanning trees of G′, and |SpanningTrees(G′ \ t)| the number of spanning trees of G′ deprived of all the vertices of t.

Notice that here |SpanningTrees(G′ \ t)| can be computed using the matrix tree theorem, and then, if a bound on |SpanningTrees(G′ \ t)| is known for all subtrees t of size n, and is not too bad, a rejection method can be used to sample a uniform element of Subtrees•_r(G, n).

Analysis: The computation of a uniform spanning tree of G′ is fast, and can be done on huge graphs.
Drawbacks: The rejection method here is unlikely to work if the desired size n is far from 0 and |V|: in most graphs G, it produces some huge ratios between the weights |SpanningTrees(G′ \ t)| and |SpanningTrees(G′ \ t′)| for t, t′ ∈ Subtrees•_r(G, n). Besides, the evaluation of |SpanningTrees(G′ \ t)| by the matrix tree theorem also produces some difficulties if the graph is large, since the manipulation of huge integers is an issue.

Variant. A method to favour larger components is to use Wilson algorithm with some random walk (W^b_i, i ≥ 0) less likely to visit z. When W^b_i = z, the node W^b_{i+1} is uniform on V; otherwise, if W^b_i = v ∈ V, then W^b_{i+1} = z with probability p, and with probability 1 − p, W^b_{i+1} is a uniform neighbour of v in G. This construction induces a distribution on SpanningTrees(G′) proportional to

p^{Indegree(z)} ∏_{u ≠ r, f(u) ≠ z} (1 − p)/deg_G(u),

where Indegree(z) counts the number of steps with destination z in the construction and f(u) denotes the father of u in the final spanning tree of G′. This is valid when z is not chosen as the first point in Wilson's algorithm (otherwise some minor adaptations are needed). Hence, for a d-regular graph G, this is proportional to (d p/(1 − p))^{Indegree(z)}.

Lemma 24. Assume that G is d-regular, let T″ be the spanning tree of G′ constructed by the variant presented above, and let f = (f_1, ..., f_k) be the spanning forest of G (for some k ≥ 1) with vertex set V and edge set E(f) = E(T″) ∩ E. For any spanning forest (f_1, ..., f_{D+1}) of G, P(f = (f_1, ..., f_{D+1}) | {Indegree(z) = D}) is proportional to ∏_{j=1}^{D+1} |f_j|.
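The variant above can be sketched with Wilson's algorithm run on G′ (loop-erased walks until the current tree is hit); the code below is our illustration, with hypothetical parameters (torus side 30, p = 10⁻³).

```python
import random

def wilson_forest(n, p, rng):
    """Wilson's algorithm on G' = Torus(n) plus an extra vertex z linked to
    every vertex. The walk is the biased one above: from z it jumps to a
    uniform vertex; from v it jumps to z with probability p, else to a uniform
    neighbour. Returns the parent map of the spanning tree of G' (rooted at z)
    and Indegree(z), the number of tree edges pointing to z."""
    Z = 'z'
    def step(u):
        if u == Z:
            return (rng.randrange(n), rng.randrange(n))
        if rng.random() < p:
            return Z
        x, y = u
        return rng.choice([((x + 1) % n, y), ((x - 1) % n, y),
                           (x, (y + 1) % n), (x, (y - 1) % n)])
    parent = {Z: None}
    for v in [(x, y) for x in range(n) for y in range(n)]:
        if v in parent:
            continue
        path = [v]                     # walk until the current tree is hit...
        while path[-1] not in parent:
            w = step(path[-1])
            if w in path:              # ...erasing loops as they appear
                path = path[:path.index(w) + 1]
            else:
                path.append(w)
        for a, b in zip(path, path[1:]):
            parent[a] = b
    return parent, sum(1 for b in parent.values() if b == Z)

parent, D = wilson_forest(30, 1e-3, random.Random(3))
# deleting z from the tree leaves a spanning forest of the torus with D trees
assert D >= 1 and len(parent) == 30 * 30 + 1
```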
In practice, on the graph Torus(N), it is possible to adjust p so that the probability of the event {Indegree(z) = 1} is far from 0; by acceptance/rejection, it then gives a procedure to simulate f with Indegree(z) = 1, in words, a spanning forest containing two trees. It is also possible to condition by {Indegree(z) = 1, |t(r)| = n} (see Fig. 10).

For a tree t rooted at r with diameter d < N on Torus(N), call canonical embedding of (t, r), denoted Canonical(t), the tree in Z², rooted at 0, obtained by taking the translated tree t − r and projecting it in Z² (in the only reasonable way which preserves the edge orientations).

Conjecture 2. Conditionally on |t(r)| = n, the rescaled vertex sets Canonical(t) …

Figure 10: Left: Simulation with p = 10^{−…} on the Torus(200), conditioned by D = 1 and by the fact that the tree attached to r, "the center of the torus", has size in [19000, …], that is, approximately half of the total size (240 simulations were needed; the output size of t(r) is 20852). Right: p = 2 × 10^{−…}, on the Torus(1000), conditioned by D = 1 and |t(r)| ∈ [45000, …] (2553 simulations were needed before satisfying these conditions, with output |t(r)| = 52106).

A method which seems promising to obtain an element of Subtrees(G, n) with a prescribed distribution consists in first extracting a UST t of G, and then extracting, by a second (random) procedure, a subtree t′ of t. It turns out that getting a uniform element t′ of Subtrees(G, n) by this method seems really difficult (but possible, using inefficient rejection methods). Extracting a uniform node of t gives a uniform node of G, hence a uniform element of Subtrees(G, 1). If one tries to get a uniform element of Subtrees(G, 2), that is, a uniform edge of G, by a random choice of an edge of t, then one has to face that P(e ∈ E(t)) is a complex function of the graph G, as shown by Pemantle [33].
We were not able to find a method not relying on a standard rejection method using the explicit computation of P(e ∈ E(t)). However, the extraction of subtrees of a UST allows one to obtain some interesting models; we review some of them here, but additional ways to extract random subtrees from a tree are examined in Section 8. In Section 8.2, we will provide an algorithm to sample a uniform subtree of a given tree (or uniform conditionally on the size, with some adjustable parameter to favour a given mean size); it is tempting to use these algorithms on a uniform spanning tree UST of G. Let us just say something about the distribution of t(n) ∼ Uniform(Subtrees(UST, n)), whose support is Subtrees(G, n), since any tree of size n is a subtree of at least one spanning tree (see a simulation in Fig. 11). For all t ∈ Subtrees(G, n),

P(t(n) = t) ∝ Σ_{T ∈ SpanningTrees(G) : t ∈ Subtrees(T, n)} 1/|Subtrees(T, n)|.

Figure 11: Simulation of a uniform subtree of a rooted UST of Torus(300). In each picture the UST is sampled using Wilson's Algorithm. Left: p_i = 0.35 = 1 − r_i for all i ∈ ⟦1, |T| − 1⟧; the output is a tree of size …. Right: p_i = 0.36 = 1 − r_i for all i ∈ ⟦1, |T| − 1⟧; the output is a tree of size ….

Take a rooted UST (T, r) of G, with, as usual, its edges directed toward r. Consider a sequence (u_i, i ≥ 1) of i.i.d. uniform nodes on V \ {r}. Define the sequence of sets of trees (F_i : i ≥ 0) by F_0 = {T}, and, for i ≥ 1, F_i is obtained by the removal of the outgoing edge of u_i from F_{i−1} (which increases the number of trees by 1 if this edge is removed for the first time).
Let t_i(r) be the connected component of r in F_i, and set t…

Figure 12: Simulation of a UST on Torus(5000), seen as directed toward its root; removal of edges is done until t…

Let N(n) be a sequence of integers with lim sup n/N(n) < … (meaning that the diameter of t…). …the origin (0, 0) is distinguished (a model introduced by Witten & Sander [43] in 1981, on which very little is known; see Eberz-Wagner [17]). To define it, consider a sequence of simple random walks (W^k = (W^k_i, i ≥ 0), k ∈ N) "starting from W^k_0 = ∞" for all k ∈ N. The DLA is a sequence of clusters of vertices (D_i, i ∈ N) defined recursively as follows. Set D_0 = {r}. Assume D_{k−1} has been defined for k − 1 ≥ 0, and set D_k = D_{k−1} ∪ {u_k}, with u_k the first point at distance 1 from D_{k−1} visited by W^k.

In order to associate a tree to this dynamic, a simple modification is needed in order to build edges instead of points: start with dla_0 = r, considered as a tree reduced to its root. Instead of waiting for W^k to be at distance 1 from the current tree dla_{k−1}, wait till the hitting time τ_k = inf{m : W^k_m ∈ dla_{k−1}}, so that e_k = (W^k_{τ_k − 1}, W^k_{τ_k}) is the step allowing to reach dla_{k−1}. Define dla_k as the tree having as edge set the edge set of dla_{k−1} union {e_k}. This kind of construction can be performed on any graph:

Definition of Model E: The (finite graph) DLA tree. On a finite connected graph G = (V, E) with r ∈ V, the DLA sequence of trees (TDLA_r(j, G), 0 ≤ j ≤ |V|) is defined as explained above for (dla_k, k ≥ 0), with two simple modifications: the random walks are independent simple random walks on G which start at i.i.d. points (w_k, k ∈ N) chosen uniformly in V, and if a random walk W^k has its starting point W^k_0 in the current tree TDLA_r(k − 1, G), then (do nothing and) set TDLA_r(k, G) = TDLA_r(k − 1, G).
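Model E can be sketched as follows (our illustration on a small torus; the side length, number of walkers and seed are arbitrary):

```python
import random

def tdla(n, walkers, rng):
    """Model E sketch on Torus(n), grown from r = (0, 0): each walker starts
    at a uniform vertex; if it starts outside the current tree, it walks until
    it hits the tree, and the entering step e_k = (W_{tau-1}, W_tau) is added."""
    def nbrs(u):
        x, y = u
        return [((x + 1) % n, y), ((x - 1) % n, y),
                (x, (y + 1) % n), (x, (y - 1) % n)]
    r = (0, 0)
    in_tree, edges = {r}, []
    for _ in range(walkers):
        w = (rng.randrange(n), rng.randrange(n))
        if w in in_tree:
            continue                   # do nothing: TDLA_r(k, G) = TDLA_r(k-1, G)
        while w not in in_tree:
            prev, w = w, rng.choice(nbrs(w))
        in_tree.add(prev)              # the new vertex is W_{tau-1}
        edges.append((prev, w))
    return in_tree, edges

verts, edges = tdla(15, 60, random.Random(4))
assert len(verts) == len(edges) + 1    # it is a tree at every step
```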
Variants can be imagined from the initial representation, by choosing a connecting edge at random from u_k to the current cluster D_{k−1}.

Figure 13: DLA tree with k = 250 and k = 5000 vertices, built on Torus(1000). The initial particle is at (0, 0).

This way of defining the TDLA seems efficient to us, in the sense that it allows to define the TDLA on all graphs: for example, on the complete graph, it allows to construct uniform increasing trees (the labels along any path from a node to the root are increasing, and the node labels are exchangeable). The standard DLA tree dla_n would be defined on Z² using random walks starting from ∞, as explained above.

Conjecture 4. There exists C ∈ (1/…, …) such that, for any c > C, D_var(TDLA(n, Torus(n^c)), dla_n) → 0 as n → +∞, where D_var is the total variation distance.

The usual definition of dla_n in Z² appears to be a kind of limit of TDLA(n, Torus(N)) when N → +∞ (or of TDLA(n, [−N, N]²)), with the initial particle placed at 0, since, for N → +∞, the n starting points of the n random walks go to +∞ with N, and the topology of the graph far from 0 should not play an important role.

If one works on the square [0, N]² with the initial point at r = (0, 0), one gets the square DLA. The initial vertex is at a corner, and there are two parameters: the square side and the number of particles.

Of course, as for everyone who has seen these kinds of pictures, it is tempting to conjecture that, for a well chosen rescaling sequence a(n) going to +∞, then for N(n) going to ∞ sufficiently fast,

TDLA(n, Torus(N(n)))/a(n) →(d) TDLA_∞

for the Hausdorff topology on compact subsets of R², where TDLA_∞ is a.s. a (non trivial) continuum random tree embedded in R² (that is, a connected subset of R² where, between any two points x, y ∈ TDLA_∞, there is a single injective path γ (up to time parametrization) such that γ(0) = x, γ(1) = y and γ([0, 1]) ⊂ TDLA_∞). As can be guessed from Fig.
14, a convergence can still be conjectured for TDLA(n, [0, N(n)]²)/a(n) (probably for the same normalisation) to another continuum random tree, associated with [0, +∞)².

Figure 14: DLA tree TDLA(5000, [0, …]²) with initial vertex at (0, 0) (that is, defined on the square [0, …]², with root at (0, 0) and 5000 vertices).

We made some statistics in order to try to guess the critical exponents in the case of square DLA starting with a single vertex in a corner.

Tree size               5000  6000  7000  8000
Number of simulations  13671 13659 13645 13635   (32)

Again, we made two types of distance statistics, as in Section 5.3: the Euclidean width and height w(t) and h(t) (number of vertical resp. horizontal rows occupied), and the random graph distance D(t) = d_t(c_0, v) between, this time, the "root corner" and a random node in the tree. In order to make the statistics, for each simulation we used both values w(t) and h(t), and sampled 10 random nodes v for each dla. We use the same methods as in Section 5.3 to evaluate the most plausible values of α and β for which w(t_n)/n^α and d_t(c_0, v)/n^β would converge in distribution, in view of our samples. The square size is the same for all simulations (1000 × 1000).

Empirical mean of the width      ….93 190.31 208.36 225.…
Empirical median of the width    ….00 190.00 208.00 225.…
Empirical mean of d(c_0, v)    160.62 178.69 195.43 211.…
Empirical median of d(c_0, v)  166.00 185.00 202.00 218.00   (33)

(n, m)      (5000, …)   (…)    (…)
α (mean)      0.589   0.588   0.…
α (median)    0.578   0.587   0.…
α (…)         0.579   0.581   0.…
β (mean)      0.585   0.581   0.…
β (median)    0.594   0.570   0.…
β (…)         0.581   0.586   0.586   (34)

Figure 15: On the first line, (interpolated) empirical cumulative functions of w(t_n)/n^α for α being respectively 0.57, 0.58 and 0.59. On the second, (interpolated) empirical cumulative functions of d_{t_n}(c_0, v)/n^β for β being respectively 0.57, 0.58 and 0.59.

Conjecture 5. In the case of TDLA(n, [0, +∞)²), both w(t_n)/n^α and d_{t_n}(c_0, v)/n^β converge in distribution for α = β, for a certain α ∈ [0.…, 0.…].

In fact, since the points start inside the square, they are more likely to start "inside" the current cluster, which implies, probably, that our simulations produce results a bit smaller than the expected limiting values. In Lawler [25, Sec. 2.6] it is discussed that α is conjectured to be (d + 1)/(d² + 1) in dimension d.

This model has been introduced by Diaconis – Fulton [32] and is defined as follows. Consider a sequence of i.i.d. simple random walks (W^k, k ∈ N), all of them starting at the same vertex W^k_0 = r for k ∈ N. The internal DLA is a sequence of clusters of vertices (I_i, i ≥ 0) defined as follows. Set I_0 = {r}. Assume I_k has been defined, and define I_{k+1} = I_k ∪ {u_{k+1}}, where u_{k+1} is the first point in the complement of I_k hit by the random walk W^{k+1} (that is, let τ(k + 1) = inf{m : W^{k+1}_m ∉ I_k}; then u_{k+1} = W^{k+1}_{τ(k+1)}).

Definition of Model F: The internal DLA tree. Use the random walks defined above. Define the sequence of trees (T_k, k ∈ N) as follows: T_0 is the tree reduced to its root r.
To define T_{k+1} from T_k, add the edge e_{k+1} = (W^{k+1}_{τ(k+1)−1}, W^{k+1}_{τ(k+1)}) corresponding to the step of the random walk W^{k+1} reaching a node in the complement of the vertex set of T_k. Again, T_k is a tree with k + 1 vertices (simulations in Fig. 16). Much information is known on the cluster (see notably Lawler – Bramson – Griffeath [27] for a limit shape theorem).

Figure 16: Internal DLA tree with k = 200 and k = 2000 vertices, v_0 = (0, 0).

In this part, we assume some i.i.d. weights C = (C_e : e ∈ E(G)) associated with the edges, picked according to a non-atomic measure µ on (0, +∞). The induced random order σ of the edges is the (a.s. well defined) permutation satisfying

C_{e_σ(1)} < · · · < C_{e_σ(|E|)}.   (35)

The first two models given in this part are built using Prim [34] and Kruskal [24] algorithms, which extract the minimum spanning tree (MST) of a weighted graph (G, C). In fact, the MST is a function of the induced random order σ (this is a consequence of Prim's, Kruskal's and also Boruvka's algorithms).

Definition of Model G: The Prim component of a vertex. Build a sequence of trees (P_j, j = 1, ..., N), where P_j is a tree with j nodes, as follows: first, take P_1 = r, a fixed node. Assuming that P_j has been built, set P_{j+1} as the tree P_j together with the edge e of minimal weight between a node of P_j and a node out of P_j.

Since the weights are chosen according to an atomless measure, the sequence (P_j, j = 1, ..., N) is a.s. well defined.

Open question 4. Take a connected weighted graph (G, C) and r a fixed vertex of V. What is the distribution of P_n? Of the process (P_n)?

The case where G is the complete graph with n vertices is well understood, even at the level of the process (P_n) (see Aldous [2], and Broutin & Marckert [9] where the connection with the multiplicative coalescent is made).

Remark 25.
The probability P(P_n = t) is proportional to the number of induced permutation orders σ giving t. It is possible to find a description of these σ by first fixing the relative order of the edges of t (their Prim order); this provides a way to describe the σ_e of the perimeter edges e of t; however, the formula for P(P_n = t) thus obtained has a summation form which seems to be intractable.

Figure 17: Prim's algorithm applied on Torus(2000) with i.i.d. weights C_e ∼ Uniform(0, 1), stopped when the tree reached … vertices.

Define a sequence of graphs (K_i, i ≥ 0): K_0 = (V, ∅) is the graph with no edges and vertex set V, and K_i = (V, E_i) is the graph with edge set E_i = E_{i−1} ∪ {e_σ(i)} if this set of edges does not contain any cycle, or E_i = E_{i−1} otherwise. Stop the construction at the MST K, which is the first graph K_i containing N − 1 edges. Before that, each K_i is a forest; as time passes, its connected components merge. In particular, the connected component containing a given fixed vertex r ∈ V has a non-decreasing size.

Definition of Model H: The Kruskal component of a vertex. Stop the construction in Kruskal's algorithm when the cluster containing r has at least n edges. Denote by K_{size≥n}(r) the tree obtained (see Fig. 18).

For any t ∈ Subtrees•_r(G) with |t| ≥ n, let I_t := Induced_G({u : d_G(u, t) ≤ 1}) be the induced subgraph of G formed by the nodes at distance ≤ 1 from t. Each edge e ∈ E(I_t) is either an edge of the tree, a perimeter edge of t (meaning that e ∪ E(t) is the edge set of a tree), or a "cyclic edge", meaning that e ∪ E(t) is the edge set of a graph with a (unique) cycle, denoted C_t(e). Denote by P_t the set of perimeter edges and by In_t the set of cyclic edges.

Proposition 26. For any tree t ∈ ∪_{k ≥ n} Subtrees•_r(G, k),

P(K_{size≥n}(r) = t) = |S_t| / |E(I_t)|!
where S_t is the subset of the symmetric group S(E(I_t)) composed of the permutations σ that satisfy the following properties:
(a) σ_e ≥ σ_f, for all f ∈ C_t(e) and all e ∈ In_t,
(b) min{σ_f : f ∈ P_t} ≥ max{σ_e : e ∈ E(t)},
(c) if one removes the edge e of t with largest label σ(e), the connected component of t containing the root has size < n.

Proof. It is simple to see that the realisation of the event {K_{size≥n}(r) = t} depends only on the relative order of σ on E(I_t), which, for symmetry reasons, is a uniform order. Now, by definition, removing the last edge added to t must leave the connected component attached to the origin with a size < n (condition (c)). Condition (b) is also needed: without it, some perimeter edges of t would have been added before t is completed. Condition (a) translates the condition that an edge is not added if it forms a cycle: the other edges of the cycle need to have been added before.

A variant consists in rejecting K_{size≥n}(r) as long as its size is not exactly n (meaning that we reassign weights to all edges). Denote by K_{size=n}(r) the result obtained.

Figure 18: Kruskal's algorithm on the Torus(200) with a uniform random order of the edges, stopped when the tree containing the vertex v = (100, …) has size in [1000, …] (the output size is 1001); 20 simulations were needed before success (the size can jump over this interval during the construction process). The second construction is done on the Torus(500), and the algorithm is stopped when a tree with size in [5000, …] containing v = (250, …) is obtained by the same method with a uniform random order of the edges (four simulations were needed, with output size equal to 5007).

We made some statistics in order to try to guess critical exponents in the case of the Kruskal trees on the Torus(N).
In order to get an efficient way to test the creation of cycles, we turned the Kruskal forest into a forest of rooted trees as follows: at the beginning, all nodes are roots of trees reduced to a single vertex. At each time unit, a uniform vertex u_k and a uniform direction d_k (north, east, west, south) are chosen, independently of the other choices. Let v_k be the vertex at distance 1 from u_k on the torus such that (u_k, v_k) has direction d_k. The oriented edge (u_k, v_k) is then added to the "forest" if it does not create a cycle. In this case, in the rooted tree (t, r) that contained u_k, the edges from r to u_k are reoriented toward u_k, so that the new root of the new tree after this merging is the root of the tree that contained v_k beforehand. Hence, the component we are interested in is the rooted tree that contains a given node, chosen before the start of the simulation. Since the diameter of a tree is at most twice the largest distance to the root, we expect the critical exponent to be independent of the choice of the root.

We fix a value n and, in order not to lose too much waiting time for a realisation of a tree with size exactly n, we wait till n ≤ |K_{size≥n}| ≤ n(1 + 0.…). The simulations are done on the Torus(700), which is, in practice, large enough so that none of the thousands of simulations we did got a width or a height of this size.

Tree size               4000  6000  8000 10000
Number of simulations  10758 10679 10446 10365   (36)

Again, we made two types of distance statistics, as in Section 5.3: the Euclidean width and height w(t) and h(t) (number of vertical resp. horizontal rows occupied), and the random graph distance D(t) = d_t(r_0, v) between the root of the tree and a random node in the tree. In order to make the statistics, for each simulation we used both values w(t) and h(t), and sampled 10 random nodes v for each such tree.
We use the same method as in Section 5.3 to evaluate the most plausible values of α and β for which w(t_n)/n^α and d_{t_n}(r, v)/n^β would converge in distribution, in view of our samples.

Number of nodes                    4000     6000     8000     10000
Empirical mean of the width        112.96   140.07   163.51   183.…
Empirical median of the width      1….00    138.00   161.00   180.…
Empirical mean of d(r, v)          174.77   225.08   269.44   311.…
Empirical median of d(r, v)        165.00   213.00   254.00   295.00    (37)

(n, m)         (4000, …)   (…, …)   (…, …)
α (mean)       0.530       0.538    0.…
α (median)     0.537       0.536    0.…
α (…)          0.528       0.538    0.…
β (mean)       0.624       0.625    0.…
β (median)     0.630       0.612    0.…
β (…)          0.623       0.628    0.653    (38)

Figure 19: On the first line, the (interpolated) empirical cumulative distribution functions of w(t_n)/n^α for α equal to 0.52, 0.53 and 0.54, respectively. On the second line, those of d_{t_n}(r, v)/n^β for β equal to 0.63, 0.64 and 0.65, respectively.
Conjecture 6. w(t_n)/n^α converges in distribution for some α ∈ [0.…, 0.…].
The simulations suggest that either β exists but the sizes of the simulated trees are not large enough to estimate it, or that no such β exists (a correction term like (log n)^γ may be needed). However, the curves of Figure 19 show that the empirical cumulative distribution functions of d_{t_n}(r, v)/n^β are really close, for the simulated n, for some β, so that it can be guessed that d_{t_n}(r, v)/a_n should converge, for some sequence (a_n), to a non-trivial limit.
Definition of Model I: The minimal weighted subtree.
Consider the subtree t_n(µ) of G with n edges and minimal weight among those with n edges. The problem consisting in finding the minimal weighted subtree of size n is called the k-cardinality tree problem (see e.g. [12] and references therein): it is an NP-complete problem, and we gave up providing pictures for this model.
Remark 27. Here, the distribution of t_n(µ) depends on µ, not only on the relative order of the edges.
Other optimisation problems of this kind exist in the literature, for example the Steiner tree problem (which amounts to finding the tree with minimal weight connecting a subset of nodes U ⊂ V in a graph) and its numerous variants, for which the nodes are also weighted (the node-weighted Steiner tree problem [11]), or the edge-capacitated Steiner tree problem (see [4], in which additional constraints on the tree are added).
Again consider the same model of weighted graph (G, C) and a distinguished vertex r. Now, with each node w ≠ r, associate the (a.s. well defined) path L_w from r to w with minimal weight C(w) (the sum of the weights of the edges belonging to the path). The union of the paths T(ν) := ∪_{w≠r} L_w forms a.s. a tree (it is connected and acyclic with probability 1, since a cycle would imply that two different paths have the same weight).
Definition of Model J: The first passage percolation tree.
Denote by T_n(ν) the tree formed by the union of the paths from r to the n nodes (including r) with smallest weights.
On some simple graphs and distributions ν having a mean, it is easy to prove that T_n(ν) is not uniform. On Z^2, there exists a limit shape theorem (the Cox–Durrett shape theorem; see Auffinger et al. [3, Section 2] for this theorem and an overview of first passage percolation problems).
For a given tree T, the polynomial Φ_T(x) = Σ_{k≥0} x^k |Subtrees(T, k)| is called the subtree polynomial of T. Due to the decomposition of trees at their root, the computation of Φ_T is much less expensive than in the case of graphs, even in the weighted case (see Yan & Yeh [45]). This implies that uniform sampling of subtrees of a given size of a tree can be done exactly, in principle, even on giant trees, just by counting the number of subtrees of size k containing a given vertex, and making some decomposition (see also Brown and Mol [10] and references therein).
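Since the paths L_w are exactly the minimal-weight geodesics from r, the tree T_n(ν) of Model J is the shortest-path tree and can be grown with Dijkstra's algorithm stopped after n vertices are reached. A minimal sketch, with our own names, on an arbitrary adjacency structure:

```python
import heapq
import random

def fpp_tree(adj, weight, r, n):
    """First passage percolation tree: grow minimal-weight paths from r
    (Dijkstra's algorithm) and stop once n vertices are reached.
    Returns a parent map: the edges (u, parent[u]) form T_n."""
    dist = {r: 0.0}
    parent = {}
    heap = [(0.0, r, None)]           # (distance, vertex, predecessor)
    seen = set()
    while heap and len(seen) < n:
        d, u, p = heapq.heappop(heap)
        if u in seen:
            continue                  # stale entry
        seen.add(u)
        if p is not None:
            parent[u] = p             # the last edge of the geodesic r -> u
        for v in adj[u]:
            nd = d + weight[frozenset((u, v))]
            if v not in seen and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v, u))
    return parent
```

With continuous i.i.d. weights, ties have probability zero, so the minimal paths, and hence the tree, are a.s. unique.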
The absence of cycles in G simplifies the implementation of the Markov chains we introduced in Section 6: the BreakCycle(·) kernel is not needed anymore. Moreover, it is easy to define monotone kernels for the inclusion order, so that coupling from the past techniques become possible. In this section, G = (T, r) is a rooted tree (and we keep the notation T = (V, E)).
Figure 20: First passage percolation on the Torus(1000). On the first line, the C_e are i.i.d. uniform on [0, 1]. The two trees are built on the same environment, and have sizes 1000 and 10000. On the second line, the weights are distributed as 1/E, where E has the exponential distribution with parameter 1.
Recall the (rooted) kernel K^(D)_r on Subtrees•_r(G) introduced in Section 4.2, defined using a sequence of parameters [(p_i, q_i, r_i), 1 ≤ i ≤ |V|]. Here the graph is G = (T, r), so that the attempted addition of a new edge e to the current tree X can never create a cycle.
Consider the following condition:
Hypothesis M: p_1 ≤ p_2 ≤ ··· ≤ p_{|V(T)|−1}, r_2 ≥ ··· ≥ r_{|V(T)|}.
Since p_i + r_i + q_i = 1, it is also required that r_k + p_k ≤ 1. In words, the bigger the tree is, the faster it grows, and the smaller the tree is, the faster it shrinks.
Proposition 28. Consider a sequence [(p_i, q_i, r_i), 1 ≤ i ≤ |V|] satisfying Hypothesis M. For a tree T, consider t taken under the invariant distribution of K^(D)_r, for a fixed r ∈ T. We then have
P(t = t) = ν(T) ∏_{i=2}^{|t|} (p_{i−1}/r_i), for t ∈ Subtrees•_r(T),
where ν(T) denotes ν in Proposition 17 (and Rem. 15) applied to T. Recall that given T and n, L(t | |t| = n) is uniform in Subtrees•_r(T, n).
This is independent of (p_i) and (r_i), which leaves a degree of freedom in order to bias the size of t. Figure 11 shows some simulations obtained using this procedure on the torus.
Remark 29. A particular case is obtained when p_{i−1} = r_i for all i ∈ ⟦2, |V|⟧, since this reduces to the sampling of a uniform subtree of T.
Coupling from the past for K^(D)_r. We will show that under Hypothesis M, it is possible to couple the Markov chains under the kernel K^(D)_r so that the coupling is monotone for the inclusion partial order, where for t, t' ∈ Subtrees•_r(T) we say that t ⪯ t' if E(t) ⊂ E(t'). This partial order possesses as least element the tree t_min = {r} (reduced to its root), and as greatest element the complete tree t_max = T. For more information on coupling from the past when the state space possesses a partial order with a unique minimal and a unique maximal element, we refer to [36, 35].
The realization of the coupling is done along the following lines. First define a function
f : Subtrees•_r(T) × E(T) × [0, 1] → Subtrees•_r(T),
(t, e, v) ↦ Add(t, e) if v ≤ p_{|t|}; Remove_r(t, e) if v ≥ 1 − r_{|t|}; t if p_{|t|} < v < 1 − r_{|t|}.
Consider a realization of a sequence of i.i.d. vectors ((e_k, v_k) : k ∈ Z), where e_k ∼ Uniform(E(T)) is independent from v_k ∼ Uniform[0, 1]. Set f_k(·) = f(·, e_k, v_k), and for every pair of integers (k_1, k_2) such that k_1 < k_2, consider F_{k_1}^{k_2}(t) = f_{k_2} ∘ f_{k_2−1} ∘ ··· ∘ f_{k_1}(t).
For a reader not familiar with these considerations, there are two key points:
• firstly, for any t ∈ Subtrees•_r(T), the process (F_0^k(t), k ≥ 0) has the distribution of a Markov chain with kernel K^(D)_r with initial state the tree t;
• secondly, a natural coupling is provided, since the complete family [(F_0^k(t), k ≥ 0), t ∈ Subtrees•_r(T)] can be followed simultaneously, as all the chains are built using the same source of randomness.
Hypothesis M ensures the monotonicity of the chain: a direct consequence of this hypothesis and of the definition of f is that, for every (e, v) ∈ E(G) × [0, 1] and any t, t' ∈ Subtrees•_r(T),
t ⪯ t' ⇒ f(t, e, v) ⪯ f(t', e, v),
and therefore, for every k_1 ≤ k_2, F_{k_1}^{k_2}(t) ⪯ F_{k_1}^{k_2}(t') too. In particular, writing t_min = {r} and t_max = T for the minimal and maximal elements, for each tree t ∈ Subtrees•_r(G),
F_{k_1}^{k_2}(t_min) ⪯ F_{k_1}^{k_2}(t) ⪯ F_{k_1}^{k_2}(t_max).
Hence F_{k_1}^{k_2}(t_min) = F_{k_1}^{k_2}(t_max) iff F_{k_1}^{k_2}(t) is the same for all t ∈ Subtrees•_r(T).
We recall the monotone coupling from the past algorithm.
——————————
Monotone coupling from the past:
• iter := 2 (or another free parameter > 1)
• s := 1
• While F_{−s}^0(t_min) ≠ F_{−s}^0(t_max) do s := s × iter
• End while
• Return F_{−s}^0(t_min)
——————————
The backward chain (F_{−s}^0(·) : s ∈ N) is indirectly related to the forward chain (F_0^s(·) : s ∈ N); set
→τ = inf{s ≥ 0 : F_0^s(t_min) = F_0^s(t_max)} and ←τ = inf{s ≥ 0 : F_{−s}^0(t_min) = F_{−s}^0(t_max)},
the so-called forward and backward coupling times, respectively. As stated in [36, p.
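The scheme above can be sketched in a self-contained way on a toy rooted tree. The tree, the parameters (p_i, r_i) (chosen so as to satisfy Hypothesis M with p_k + r_k ≤ 1) and all names below are ours, for illustration only.

```python
import random

# Toy rooted tree: vertex 0 is the root; PARENT maps each other vertex to its parent.
PARENT = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
EDGES = list(PARENT.items())          # each edge is (child, parent)
NV = 1 + len(PARENT)
# Hypothetical parameters satisfying Hypothesis M: p nondecreasing, r nonincreasing.
P = {k: 0.20 + 0.05 * k for k in range(1, NV + 1)}
R = {k: 0.40 - 0.03 * k for k in range(1, NV + 1)}

def f(t, e, v):
    """One step of the kernel: t is the vertex set of a subtree containing root 0."""
    child, par = e
    k = len(t)
    if v <= P[k] and par in t and child not in t:
        return t | {child}            # Add(t, e): e is a perimeter edge of t
    if v >= 1 - R[k] and child in t and all(PARENT.get(w) != child for w in t):
        return t - {child}            # Remove_r(t, e): child is a leaf of t (never the root)
    return t

def cftp(rng):
    """Monotone coupling from the past: extend the moves further into the past
    (s doubling) until the chains started at time -s from the minimal state {0}
    and from the maximal state V coalesce at time 0."""
    moves = []                        # moves[i] drives the step at time -(i+1)
    s = 1
    while True:
        while len(moves) < s:
            moves.append((rng.choice(EDGES), rng.random()))
        lo, hi = {0}, set(range(NV))
        for i in range(s - 1, -1, -1):    # apply f_{-s}, ..., f_{-1} in time order
            e, v = moves[i]
            lo, hi = f(lo, e, v), f(hi, e, v)
        if lo == hi:
            return lo                 # an exact sample from the invariant law
        s *= 2
```

Note that when s is doubled, the already drawn moves keep their time indices: reusing the same randomness for the most recent steps is exactly what makes the returned state an exact sample.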
21], →τ and ←τ have the same distribution. We will give some bounds on the forward coupling time.
For each "time" s ≥ 0, define a colouring C_s = (C_s(w), w ∈ V(T)) of the vertex set of T as follows:
– if w ∈ V(F_0^s(t_min)), set C_s(w) = red,
– if w ∈ V(F_0^s(t_max)) \ V(F_0^s(t_min)), set C_s(w) = white,
– otherwise set C_s(w) = black,
where t_min = {r} and t_max = T. At time 0, C_0(w) = white for all nodes of T, except the root, which is red. For any s, the set of red vertices is the vertex set of the "minimal tree" F_0^s(t_min), while the vertices of the "maximal tree" F_0^s(t_max) form the union of the red and white nodes. The coupling time →τ coincides with the first time at which no white vertex is left.
(a) Intermediate phase (b) Merged state
Figure 21: A possible evolution of the coupled forward Markov chain, on a binary tree T.
For t ∈ Subtrees•_r(T), define the set of perimeter sites of t as V_p(t) = {v ∈ V : d_T(v, t) = 1}, and the set of leaves of t as V_ℓ(t) = {v ∈ V : v is a leaf of t}. Also define the maximal size of the perimeter set and the minimal size of the leaf set for a tree with k nodes as
V_p(k) = max{|V_p(t)| : t ∈ Subtrees•_r(T, k)} (39)
V_ℓ(k) = min{|V_ℓ(t)| : t ∈ Subtrees•_r(T, k)} (40)
Proposition 30. Suppose Hypothesis M holds. If, for all i ∈ ⟦2, |V(T)| − 1⟧, p_i/r_i ≤ c V_ℓ(i)/V_p(i), then
E(→τ) = E(←τ) ≤ (N − 1) Σ_{j=2}^N [1/(r_j V_ℓ(j))] · (c^{j−1} − 1)/(c − 1).
In particular, if T is a complete d-ary tree with height h, then with N = (d^{h+1} − 1)/(d − 1) vertices,
E(→τ) ≤ [d / (r_N (c − 1)(d − 1))] · [(c^N − c)/(c − 1) − (N − 1)]. (41)
Proof. For every t ∈ Subtrees•_r(T), denote by τ(t) = inf{s ≥ 0 : F_0^s(t) = t_min} the hitting time of the minimal tree t_min = {r} by the chain started from t. By the properties of the coupled forward Markov chain,
max{τ(t), t ∈ Subtrees•_r(T)} =: τ(T) ≥ →τ, a.s.
All along the proof, we write N instead of |V(T)|.
Using the Markov property, we get
[r_N |V_ℓ(T)| / (N − 1)] E(τ(T)) = 1 + [r_N / (N − 1)] Σ_{e ∈ V_ℓ(T)} E(τ(T \ {e})), (42)
and for t ∈ Subtrees•_r(T, k), for k ∈ ⟦2, N − 1⟧,
[p_k |V_p(t)| / (N − 1) + r_k |V_ℓ(t)| / (N − 1)] E(τ(t)) = 1 + [p_k / (N − 1)] Σ_{e ∈ V_p(t)} E(τ(t ∪ {e})) + [r_k / (N − 1)] Σ_{e ∈ V_ℓ(t)} E(τ(t \ {e})). (43)
Call E_k = max{E(τ(t)) : t ∈ Subtrees•_r(T, k)} and ∆_k = E_k − E_{k−1}. Notice that E_1 = 0. Bounding each term E(τ(T \ {e})) in the right hand side of (42) by E_{N−1}, and noticing that E_N = E(τ(T)), we obtain
∆_N ≤ (N − 1) / (r_N V_ℓ(N)). (44)
For k ∈ ⟦2, N − 1⟧, fix t_k one of the trees attaining E_k. Now, apply (43) to t_k and bound each E(τ(t ∪ {e})) and E(τ(t \ {e})) in the right hand side by E_{k+1} and E_{k−1}, respectively:
[p_k |V_p(t_k)| / (N − 1) + r_k |V_ℓ(t_k)| / (N − 1)] E_k ≤ 1 + [p_k |V_p(t_k)| / (N − 1)] E_{k+1} + [r_k |V_ℓ(t_k)| / (N − 1)] E_{k−1} (45)
⟹ [r_k |V_ℓ(t_k)| / (N − 1)] ∆_k ≤ [p_k |V_p(t_k)| / (N − 1)] ∆_{k+1} + 1. (46)
Therefore, for k ∈ ⟦2, N − 1⟧, using the definitions of V_p(k) and V_ℓ(k) and the hypothesis p_k/r_k ≤ c V_ℓ(k)/V_p(k), one obtains
∆_k ≤ [p_k |V_p(t_k)| / (r_k |V_ℓ(t_k)|)] ∆_{k+1} + (N − 1)/(r_k |V_ℓ(t_k)|) ≤ [p_k V_p(k)/(r_k V_ℓ(k))] ∆_{k+1} + (N − 1)/(r_k V_ℓ(k)) ≤ c ∆_{k+1} + (N − 1)/(r_k V_ℓ(k)). (47)
By repeatedly applying (47) and finally (44), one obtains that, for all k ∈ ⟦2, N⟧,
∆_k ≤ Σ_{j=k}^N c^{j−k} (N − 1)/(r_j V_ℓ(j)).
To conclude, notice that E(τ(T)) = E_N = Σ_{k=2}^N ∆_k, and therefore
E(→τ) ≤ E_N ≤ Σ_{k=2}^N Σ_{j=k}^N c^{j−k} (N − 1)/(r_j V_ℓ(j)) = Σ_{j=2}^N Σ_{k=2}^j c^{j−k} (N − 1)/(r_j V_ℓ(j)) ≤ (N − 1) Σ_{j=2}^N [1/(r_j V_ℓ(j))] · (c^{j−1} − 1)/(c − 1).
To conclude the second part, on the d-ary tree, we use that, by Hypothesis M, r_j is non-increasing, and that in the infinite d-ary tree one has V_ℓ(k) ≤ V_p(k)/d, since each leaf has exactly d perimeter sites; moreover, V_p(i) = (i + 1)(d − 1) − (d − 2) = i(d − 1) + 1 for d > 1.
Definition of Subtree of tree Model A: Uniform leaf evaporation.
Take a tree T with N nodes, and define (LeafEvaporation(T, k), 0 ≤ k ≤ N − 1) as follows: LeafEvaporation(T, 0) = T, and for k > 0, LeafEvaporation(T, k) is obtained by the removal of a uniform leaf of LeafEvaporation(T, k − 1) (so that k counts the number of evaporated edges).
Remark 31 (rooted versus unrooted case). There are two natural variants of this algorithm, depending on whether we work with an unrooted tree T, in which case all nodes of degree 1 are leaves, or with a rooted tree T = (T, r), in which case the root r is never considered as a leaf (this is the standard convention).
We consider the rooted case here: the root r is never considered as a leaf. Any history of leaf evaporation can be encoded by labelling the edges of the initial tree by the dates of evaporation of the leaves, from 1 to |T| − 1. For t ∈ Subtrees•_r(G, n), consider the set H[T, t] of labellings of the edges of T \ t by the integers between 1 and |T \ t|, such that the labels of the edges on any injective path from any leaf of T to t are increasing. The following result describes the law of the remaining tree after the evaporation of N − n leaves:
Proposition 32. For t ∈ Subtrees•_r(G, n),
P[LeafEvaporation(T, N − n) = t] = Σ_{h ∈ H[T,t]} ∏_{x=0}^{|T\t|−1} 1/|∂(T \ {e : h(e) ≤ x})|,
where ∂(·) denotes the set of leaves.
Proof. For each history, at each step, the probability to remove a given leaf is the inverse of the current number of leaves.
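Model A can be sketched as follows; the names are ours, and the root is never considered as a leaf. The list of current leaves is maintained with swap-and-pop so that each uniform draw is O(1).

```python
import random

def leaf_evaporation(parent, root, n, rng):
    """Uniform leaf evaporation: remove uniform leaves one by one (the root
    never counts as a leaf) until n vertices remain.
    `parent` maps each non-root vertex to its parent."""
    children = {v: set() for v in list(parent) + [root]}
    for v, p in parent.items():
        children[p].add(v)
    alive = set(children)
    leaves = [v for v in alive if v != root and not children[v]]
    while len(alive) > n:
        i = rng.randrange(len(leaves))       # uniform current leaf
        v = leaves[i]
        leaves[i] = leaves[-1]               # swap-and-pop removal
        leaves.pop()
        alive.discard(v)
        p = parent[v]
        children[p].discard(v)
        if p != root and not children[p]:
            leaves.append(p)                 # p just became a leaf
    return alive
```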
Figure 22: A tree extracted by the leaf evaporation algorithm with n = 10000 nodes, on a UST of the Torus(500) (left) and on the Torus(4000) (right).
Remark 33. Uniform edge evaporations of some classical families of (non-embedded) trees have been thoroughly studied, following an idea of Meir & Moon [30] in 1970. Many recent developments, under the name of "cut-tree", have been published, which aim at describing the tree structure of the fragmentation history (see e.g. Janson [21], Bertoin – Miermont [6]).
Definition of Subtree of tree Model B: Evaporation of the smallest leaf.
Consider a rooted tree T with N nodes in which the edges are equipped with i.i.d. weights taken under µ ∼ Uniform[0, 1] (only their relative order matters). Successively remove the leaf adjacent to the edge with smallest weight (among those adjacent to leaves). The set of leaves evolves as leaves are removed: this forms a sequence of trees T_N = T, ..., T_1 = {r}, where T_i has i nodes. Return T_n if the target size is n (see some simulations in Fig. 23).
Denote by (T, σ) a labelling of the edges of T by a uniform permutation of {1, ..., |E(T)|}.
Open question 5. Give a nice description of the distribution of the tree remaining after |T| − n leaf evaporations.
In fact, we have a description of the remaining tree distribution, but we feel that something deeper is hidden: the leaf evaporation depends only on the induced random order σ of the weighted edges,
Figure 23: A tree extracted by the removal of the edges with smallest labels, with n = 5000 nodes, on a UST of the Torus(500) (left) and on the Torus(200) (right).
as defined in (35).
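Model B can be sketched with a heap holding the edges currently adjacent to leaves; a vertex enters the heap only when it becomes a leaf, which preserves the "smallest weight among edges adjacent to leaves" rule. Names are ours; the edge above v is identified with v itself.

```python
import heapq
import random

def evaporate_smallest_leaf(parent, root, n, rng):
    """Model B: i.i.d. uniform weights on edges; repeatedly remove the leaf
    whose adjacent edge has the smallest weight, until n nodes remain."""
    children = {v: set() for v in list(parent) + [root]}
    for v, p in parent.items():
        children[p].add(v)
    weight = {v: rng.random() for v in parent}   # weight of the edge (v, parent[v])
    heap = [(weight[v], v) for v in children if v != root and not children[v]]
    heapq.heapify(heap)                          # current leaves, keyed by edge weight
    alive = set(children)
    while len(alive) > n and heap:
        _, v = heapq.heappop(heap)               # leaf with the smallest edge weight
        alive.discard(v)
        p = parent[v]
        children[p].discard(v)
        if p != root and not children[p]:
            heapq.heappush(heap, (weight[p], p))  # p just became a leaf
    return alive
```

Since only the relative order of the weights matters, drawing them uniformly on [0, 1] is equivalent to drawing a uniform permutation σ of the edges.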
Let us put a second label ℓ on each edge: its date of disappearance (starting from 1, ending at |T| − 1); after elimination, the labels are increasing from the leaves of T till they reach the single remaining node (or the subtree t obtained, if the leaf evaporation is stopped when a certain size is reached): call such a labelling a valid labelling. Now, the evaporation leads to t if the N − n smallest ℓ-labels are on the edges of T \ t.
Consider the map π which sends (T, σ) onto (T, ℓ), that is, which gives the elimination order of the edges of T. To describe the distribution of the remaining tree t, it suffices to be able to compute |π^{−1}(T, ℓ)| for any valid ℓ. We will see that this is somehow explicit:
Lemma 34. For ℓ valid, the elements of π^{−1}(T, ℓ) are the σ that satisfy: ℓ(e_1) < ℓ(e_2) implies σ(e_1) < σ(e_2), whenever e_2 is a leaf edge at time ℓ(e_1) − 1.
Proof. Take two edges e_1 and e_2 such that ℓ(e_1) < ℓ(e_2), so that e_1 is eliminated before e_2. Consider σ(e_1) and σ(e_2), the corresponding edge values. Now, consider T* = T \ {e : ℓ(e) < ℓ(e_1)}, the state of T just before the elimination of e_1 (that is, when all the edges with label smaller than ℓ(e_1) are removed). In T*, the edge e_2 is still present, so there are two cases:
– If e_2 is a leaf edge, then we must have σ(e_1) < σ(e_2);
– If e_2 is not a leaf edge, then σ(e_2) may be larger or smaller than σ(e_1).
For a given (T, ℓ), the cardinality of π^{−1}(T, ℓ) can be explicitly computed, but it produces an intricate formula, which needs to be summed over valid ℓ to compute P(T_n = t).
The next model looks similar, but it is different; it defines a tree-valued process with non-increasing size: it may or may not reach the target size n. Up to a change of time, it is independent of µ.
Definition of Subtree of tree Model C: Evaporation of the leaves with weight ≤ w.
Consider a rooted tree (T, r) with N nodes in which the edges are equipped with distinct weights, for example chosen independently according to µ, the uniform distribution on [0, 1]. For w ∈ [0, 1], consider the subtree T(w) of T obtained by removing the leaves with weight ≤ w (removing these leaves may create new leaves, at which the same procedure applies recursively). When r has degree 1, it is not considered as a leaf.
Of course, this model is a percolation model on the weighted graph. One has:
Proposition 35. For any tree t ∈ Subtrees•_r(G),
P_µ(T(w) = t) = µ(w, +∞)^{|∂t|} µ[0, w)^{|T\t|},
where |T \ t| is the number of nodes of T that are not in t (this is also the number of removed edges) and ∂t denotes the set of leaves of t.
Reducing the tree size progressively, one node at a time, so that a target size is reached for sure, is natural, and Models A and B are of this type. In the literature, one finds some works, [31] and [29], related to distributed algorithms, aiming to "elect" a node in a tree using leaf evaporation (this name does not appear there, however). We present here the general evaporation scheme defined in [29], which can be used to extract a subtree of a given size by stopping the process when this size is reached (this is not discussed in [31, 29]).
In the sequel, we denote by (T, w) an unrooted tree in which the nodes are weighted by w = (w_u, u ∈ V), some non-negative (possibly random) real numbers, the weights of the leaves being positive; some examples will be given afterwards. The algorithm uses a family of distributions µ(q, ·) on (0, +∞), one for each parameter q > 0.
Definition of Model K: Election-type evaporation.
At time 0, the leaves of T are active and the internal nodes are not. A leaf u with weight w_u evaporates after a random time with distribution µ(q_u, ·), independently from the others, where q_u = w_u.
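Model C is a straightforward recursive removal; a minimal sketch with our own names, indexing the weight of the edge below a vertex v by v itself:

```python
def threshold_evaporation(parent, root, w, weight):
    """Model C: recursively remove every leaf whose adjacent edge has weight
    <= w; the root never counts as a leaf.  Returns the vertex set of T(w)."""
    children = {v: set() for v in list(parent) + [root]}
    for v, p in parent.items():
        children[p].add(v)
    alive = set(children)
    # leaves that evaporate immediately
    stack = [v for v in alive if v != root and not children[v] and weight[v] <= w]
    while stack:
        v = stack.pop()
        alive.discard(v)
        p = parent[v]
        children[p].discard(v)
        # p may have become a leaf with a small enough edge weight
        if p != root and not children[p] and weight[p] <= w:
            stack.append(p)
    return alive
```

In agreement with Proposition 35, a subtree t survives exactly when every edge below a leaf of t has weight > w and every edge of T \ t has weight ≤ w.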
■ Upon evaporation, leaf u transmits its parameter q_u to its single neighbour v in the tree.
■ A node v with degree d becomes a leaf after the complete evaporation of d − 1 of its neighbouring subtrees (say, at global time τ_0). The node v has then received the parameters (q_{v_1}, ..., q_{v_{d−1}}) of these neighbours. It computes its own parameter
q_v = f(w_v, q_{v_1}, ..., q_{v_{d−1}}),
then generates a random variable τ(v) with distribution µ(q_v, ·); the node v will evaporate at global time τ_0 + τ(v) (hence, τ(v) is its remaining lifetime once it becomes active).
The function f is a parameter of the algorithm, as well as the initial weights (w_u, u ∈ V) and the family of distributions µ(·, ·); the parameters can even be used to store additional information, such as the complete geometry of the evaporated subtrees, as well as their lifetimes, for example.
In [31], the model is as follows: the initial weights are w_u = 1, for all u ∈ V. The map f is given by
f(w_v, q_{v_1}, ..., q_{v_{d−1}}) = w_v + q_{v_1} + ··· + q_{v_{d−1}} = 1 + q_{v_1} + ··· + q_{v_{d−1}},
meaning that a node adds to its own weight the weights transmitted by its eliminated neighbours (hence, when it becomes active, its weight is the size of the tree formed by v and the eliminated subtrees which were hanging from it); finally, the remaining lifetime of a node with parameter q is distributed as m_q ∼ Expo(q), the exponential distribution with parameter q.
The main result of [31] is the following: if one continues the elimination procedure till a single node u remains, then u is a uniform node of V.
Denote by Evaporation(T, n) the random tree obtained from this particular election-type evaporation process when only n nodes remain (for n ≤ |V(T)|). For a given t ∈ Subtrees(T, n), the graph T − t induced by the removal of the edges of t in T is a forest composed of n trees. For any v ∈ t, denote by ∆_v the tree of T − t attached to v. We have
Theorem 36.
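The election model of [31] can be simulated in an event-driven way: each node, upon activation, draws an absolute death time from its exponential clock. A sketch with our own names:

```python
import random

def elect(adj, rng):
    """Election by evaporation, as in [31]: every node has initial weight 1;
    an active leaf with parameter q lives an Expo(q) time, then evaporates and
    passes its parameter to its unique alive neighbour.
    Returns the last surviving node."""
    deg = {u: len(vs) for u, vs in adj.items()}
    received = {u: 0 for u in adj}        # sum of parameters transmitted to u
    alive = set(adj)
    clock = {}                            # absolute death times of active leaves
    for u in adj:
        if deg[u] == 1:
            clock[u] = rng.expovariate(1)     # initial leaves: q_u = w_u = 1
    while len(alive) > 1:
        u = min(clock, key=clock.get)     # next leaf to evaporate
        now = clock.pop(u)
        alive.discard(u)
        for v in adj[u]:
            if v in alive:
                received[v] += 1 + received[u]    # u transmits q_u = 1 + received
                deg[v] -= 1
                if deg[v] == 1 and len(alive) > 1:
                    q = 1 + received[v]           # v becomes an active leaf
                    clock[v] = now + rng.expovariate(q)
    return alive.pop()
```

On a 3-vertex path, the winner is (empirically) uniform over the three vertices, in agreement with the main result of [31].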
For any subtree t ∈ Subtrees(G, n),
P(Evaporation(T, n) = t) = [(|∂t| − 1)! (N − n)! / (|∂t| + N − n)!] Σ_{v ∈ ∂t} |∆_v|.
Remark 37. Notice that the last remaining edge is uniform, as well as the last remaining node (this case is stated in [31]); in the case where the minimum degree of the internal nodes of T is m, the uniformity also holds for all n ≤ m.
Proof. The evaporation process passes by t if, at a given moment, all the trees ∆_v have disappeared but their roots (their roots belong to t). It may be shown by induction that, for a given tree t' (whose root is never considered as a leaf), the time A_{t'} for the root to become active is distributed as M_{|t'|−1}, where, for all k, M_k is the maximum of k independent exponential random variables with parameter 1. Once the root of such a tree becomes active, it has the additional lifetime a_{t'} ∼ Expo(|t'|), which is independent of A_{t'}. The complete evaporation time E_{t'} of t', which includes the erasure of the root, is distributed as M_{|t'|}, since E_{t'} := A_{t'} + a_{t'}. Hence (A_{t'}, E_{t'}) is distributed as (M_{|t'|−1}, M_{|t'|−1} + a_{|t'|}), where the delay a_{|t'|} ∼ Expo(|t'|) is independent of M_{|t'|−1}. One has
P(A_{t'} < y < E_{t'}) = P(M_{|t'|−1} ≤ y ≤ M_{|t'|}) = e^{−y} (1 − e^{−y})^{|t'|−1}, (48)
P(A_{t'} < y) = P(M_{|t'|−1} ≤ y) = (1 − e^{−y})^{|t'|−1}. (49)
Now, a certificate that the evaporation process passes by t is as follows: the root of one of the ∆_v disappears at some time x at which all the other ∆_w have disappeared but their roots.
This gives
P(Evaporation(T, n) = t) = Σ_{v ∈ ∂t} ∫_0^∞ ∏_{u ∈ ∂t \ {v}} P(M_{|∆_u|−1} ≤ x ≤ M_{|∆_u|}) ∏_{u ∈ t \ ∂t} P(M_{|∆_u|−1} ≤ x) P(M_{|∆_v|} ∈ dx)
= Σ_{v ∈ ∂t} |∆_v| ∫_0^∞ e^{−|∂t| x} (1 − e^{−x})^{N−n} dx,
which suffices to conclude (the last equality comes from (48) and (49)).
Remark 38. In [29], much more general models of evaporation processes are designed, for which the law of the remaining tree can be computed; they can be turned into evaporation procedures and stopped when a given size is obtained. We do not pursue the description of these results here, since it is not clear for the moment that they are useful to target any important distribution.
The configurations represented in Fig. 24 allow us to reject a lot of algorithms relying on leaf evaporation on the UST to sample Uniform(Subtrees(G, n)): on this picture, the two blue subtrees induce the same subgraph of G; to get them after leaf evaporation, the procedure needs to destroy every green subtree before destroying any blue edge. But blue edges do not appear simultaneously in the two cases: 2 blue edges adjacent to leaves are present in the right hand side at the beginning, while in the left hand side, progressively, up to 5 blue leaves may be present at some (random) time during the evaporation.
Definition of Subtree of tree Model D: Removal of a uniform edge.
This model is often called "tree cutting" in the literature; take a tree T rooted at some node r, and successively remove an edge chosen uniformly among the remaining edges of T. Denote by T(k) the forest obtained from T by the removal of k edges, and by T_r(k) the connected component containing r. The sequence (T_r(k), 0 ≤ k ≤ |T| − 1) coincides, at its jump times, with the process (T*_r(w), 0 ≤ w ≤ 1) obtained, for i.i.d. uniform edge weights, by keeping the edges with weight ≥ w and taking the connected component of r.
Remark 39.
This model is also discussed in Section 7.5.2, and applied there in the case of a uniform spanning tree (which provides a second level of randomness).
Figure 24: Two trees, in blue, that can be obtained from the elimination of the same "green subtrees".
Figure 25: A tree extracted by election-type evaporation, halted when n = 10000 nodes remain, executed on a UST of the Torus(500) (left) and on a UST of the Torus(4000) (right).
Proposition 40. For any subtree t of T and any w ∈ [0, 1], denote by B(t) the set of edges of T \ t adjacent to t (the boundary of t in T). Then
P(T*_r(w) = t) = w^{|B(t)|} (1 − w)^{|E(t)|} and P(T_r(k) = t) = 1_{k ≥ |B(t)|} binom(|E(T)| − |E(t)| − |B(t)|, k − |B(t)|) / binom(|E(T)|, k).
The same formulas are valid for a general graph instead of a tree.
Proof. The first formula is easy: the edges of E(t) must still be present, and those of B(t) must have disappeared. For the second formula: since k edges have been suppressed, by symmetry they form a uniform subset of size k of E(T); the favourable cases are those for which this subset is the union of B(t) and a subset of size k − |B(t)| of E(T) \ (B(t) ∪ E(t)); the number of such subsets is given by the numerator of the second formula.
References
[23] R. M. Karp. Reducibility among combinatorial problems, pages 85–103. Springer US, Boston, MA, 1972.
[24] J. Kruskal. On the shortest spanning subtree of a graph and the traveling salesman problem. Proceedings of the American Mathematical Society, 7:48–50, 1956.
[25] G. F. Lawler. Intersections of Random Walks. Birkhäuser, 1996.
[26] G. F. Lawler. Loop-Erased Random Walk, pages 197–217. Birkhäuser Boston, Boston, MA, 1999.
[27] G. F. Lawler, M. Bramson, and D. Griffeath. Internal diffusion limited aggregation. The Annals of Probability, 20(4):2117–2140, 1992.
[28] R. Lyons and Y. Peres. Probability on Trees and Networks. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2017.
[29] J.-F. Marckert, N.
Saheb-Djahromi, and A. Zemmari. Election algorithms with random delays in trees. In Discrete Mathematics and Theoretical Computer Science, pages 611–622, 2009.
[30] A. Meir and J. W. Moon. Cutting down random trees. Journal of the Australian Mathematical Society, pages 313–324, 1970.
[31] Y. Métivier, N. Saheb-Djahromi, and A. Zemmari. Locally guided randomized elections in trees: The totally fair case. Information and Computation, 198(1):40–55, 2005.
[32] P. Diaconis and W. Fulton. A growth model, a game, an algebra, Lagrange inversion, and characteristic classes. Rend. Sem. Mat. Univ. Politec. Torino, 49(1):95–119, 1993.
[33] R. Pemantle. Uniform random spanning trees. arXiv math/0404099, 2004.
[34] R. C. Prim. Shortest connection networks and some generalizations. Bell System Technical Journal, 36(6):1389–1401, 1957.
[35] J. Propp and D. Wilson. Coupling from the past: a user's guide. Microsurveys in Discrete Probability, 41:181–192, 1998.
[36] J. G. Propp and D. B. Wilson. Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures & Algorithms, 9(1-2):223–252, 1996.
[37] J. G. Propp and D. B. Wilson. How to get a perfectly random sample from a generic Markov chain and generate a random spanning tree of a directed graph. Journal of Algorithms, 27(2):170–217, 1998.
[38] O. Schramm. Conformally invariant scaling limits: an overview and a collection of problems. In International Congress of Mathematicians. Vol. I, pages 513–543. Eur. Math. Soc., Zürich, 2007.
[39] W. T. Tutte. A contribution to the theory of chromatic polynomials. Canadian Journal of Mathematics, 6:80–91, 1954.
[40] S. Wagner. On the probability that a random subtree is spanning. arXiv:1910.07349, Oct. 2019.
[41] G. F. Lawler, O. Schramm, and W. Werner. Conformal invariance of planar loop-erased random walks and uniform spanning trees. The Annals of Probability, 32(1B):939–995, Jan. 2004.
[42] D. B. Wilson. Generating random spanning trees more quickly than the cover time.
In Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing, pages 296–303, 1996.
[43] T. A. Witten and L. M. Sander. Diffusion-limited aggregation. Physical Review B, 27:5686–5697, May 1983.
[44] Y. Hu, R. Lyons, and P. Tang. A reverse Aldous/Broder algorithm. arXiv:1907.10196, 2019.
[45] W. Yan and Y.-N. Yeh. Enumeration of subtrees of trees. Theoretical Computer Science, 369(1):256–268, 2006.
Contents (recoverable entries)
6.3 A fast kernel with computable invariant distribution for regular graphs . . . 20
7.2.1 Variant of the kernel K^(F): remove the youngest edge . . . 26
7.3 A model inspired by Wilson algorithm . . . 27
7.4 Subtree of a size biased forest . . . 27
7.5 Subtree extraction of the uniform spanning tree . . . 29
7.5.1 Uniform random subtree of the UST . . . 29
7.5.2 Model of evaporation of the edges of a UST . . . 30
7.6 DLA type model . . . 31
7.6.1 Few stats on square DLA . . . 33
7.7 Internal DLA . . . 34
7.8 Constructions on weighted graphs . . . 35
7.8.1 Prim component of the origin . . . 35
7.8.2 Kruskal algorithm . . . 36
7.8.3 Few stats on the Kruskal trees . . . 37
7.8.4 First passage percolation . . . 39