The dispersion time of random walks on finite graphs
Nicolas Rivera, Alexandre Stauffer, Thomas Sauerwald, John Sylvester
University of Cambridge, Cambridge, United Kingdom, [email protected]
University of Bath, Bath, United Kingdom, [email protected]
Abstract
We study two random processes on an n-vertex graph inspired by the internal diffusion limited aggregation (IDLA) model. In both processes n particles start from an arbitrary but fixed origin. Each particle performs a simple random walk until it first encounters an unoccupied vertex, at which point the vertex becomes occupied and the random walk terminates. In one of the processes, called Sequential-IDLA, only one particle moves until settling and only then does the next particle start, whereas in the second process, called Parallel-IDLA, all unsettled particles move simultaneously. Our main goal is to analyze the so-called dispersion time of these processes, which is the maximum number of steps performed by any of the n particles.

In order to compare the two processes, we develop a coupling which shows that the dispersion time of the Parallel-IDLA stochastically dominates that of the Sequential-IDLA; however, the total number of steps performed by all particles has the same distribution in both processes. This coupling also shows that the dispersion time of the Parallel-IDLA is bounded in expectation by that of the Sequential-IDLA up to a multiplicative log n factor. Moreover, we derive asymptotic upper and lower bounds on the dispersion time for several graph classes, such as cliques, cycles, binary trees, d-dimensional grids, hypercubes and expanders. Most of our bounds are tight up to a multiplicative constant.

Keywords:
Random Walks, Internal Diffusion Limited Aggregation, Dispersion.
∗ An extended abstract version of this paper appeared in The 31st ACM Symposium on Parallelism in Algorithms and Architectures, SPAA '19, pages 103-113, New York, NY, USA, 2019. ACM. [46]

The internal diffusion limited aggregation (IDLA) model, first introduced independently by Diaconis & Fulton [22] and Meakin & Deutch [42], is a protocol for recursively building a randomly growing subset (aggregate) of vertices of a graph. Initially, the aggregate consists of only one vertex, denoted as the origin, and we let a particle be settled at that vertex. Then, at each step, we start a new particle from the origin and let it perform a random walk until it visits a vertex not contained in the aggregate. At this point, we say that the new particle settles at that vertex, and the vertex is added to the aggregate. We then add a new particle at the origin, and iterate this procedure over and over again.

IDLA was introduced on the infinite lattice Z^d. Here we consider a finite connected n-vertex graph G. Note that after n particles have settled, the aggregate occupies the whole of G. During this time, each particle performs some number of random walk steps before it settles. Clearly, this number depends on the geometry of the aggregate when the particle started moving. We define the dispersion time as the largest number of random walk steps performed by any one of the n particles before reaching an unoccupied vertex.

We refer to the above protocol as Sequential-IDLA, in allusion to the fact that a particle cannot begin to move until the one before it settles. However, alternative scheduling protocols could be defined, in the sense that we could choose to add and move a new particle from the origin before the previous one has settled. In this way, there could be several unsettled particles moving at the same time, but they must abide by the rule that whenever an unoccupied vertex is visited, one particle must settle there. We call any process of this sort a dispersion process. We are interested in understanding the effect of different scheduling protocols on the dispersion time. In particular, consider the following protocol: start all n particles from the origin at time 0 (thus one of them will instantaneously settle at the origin). Then, all particles perform one random walk step simultaneously; if one or more particles jump to an unoccupied vertex, then one such particle settles there. Iterate this procedure until all particles have settled. We call this protocol the Parallel-IDLA.

Both dispersion processes can be regarded as a set of simple local protocols for resource allocation. Specifically, the sequential dispersion process is quite similar to a local-search based reallocation scheme from [13], where a job continues to reallocate itself to a neighbour with less load until it has found a local minimum. Furthermore, the parallel dispersion process is related to the "QoS Load Balancing" model [1], a particular instance of selfish load balancing (see also [9, 10] for similar protocols).
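Both protocols are easy to simulate directly. The following minimal sketch is our own illustration, not code from the paper; the adjacency-list graph representation and the function names are ours.

```python
import random

def sequential_idla(adj, origin, rng):
    """Sequential-IDLA: release particles one at a time from the origin;
    each performs a simple random walk until it reaches an unoccupied
    vertex, where it settles.  Returns the walk length of each particle."""
    occupied = set()
    lengths = []
    for _ in range(len(adj)):
        v, steps = origin, 0
        while v in occupied:
            v = rng.choice(adj[v])
            steps += 1
        occupied.add(v)
        lengths.append(steps)
    return lengths

def parallel_idla(adj, origin, rng):
    """Parallel-IDLA: all n particles start at the origin; one settles
    there at time 0.  In each round, every unsettled particle takes one
    step, and on each unoccupied vertex visited, one particle settles.
    Returns the number of rounds until all particles have settled."""
    unsettled = [origin] * (len(adj) - 1)  # one particle settles at the origin
    occupied = {origin}
    rounds = 0
    while unsettled:
        rounds += 1
        unsettled = [rng.choice(adj[p]) for p in unsettled]
        remaining, claimed = [], set()
        for p in unsettled:
            if p not in occupied and p not in claimed:
                claimed.add(p)  # first particle on a free vertex settles there
            else:
                remaining.append(p)
        occupied |= claimed
        unsettled = remaining
    return rounds

# Example: a cycle on 8 vertices with origin 0.
cycle = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
rng = random.Random(42)
seq_disp = max(sequential_idla(cycle, 0, rng))  # sequential dispersion time
par_disp = parallel_idla(cycle, 0, rng)         # parallel dispersion time
```

Averaging `seq_disp` and `par_disp` over many independent runs gives Monte Carlo estimates of the two expected dispersion times; on any single run neither quantity need exceed the other, since the comparison between the two processes is a statement about distributions.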
In the QoS model, tasks perform random walks in parallel and terminate only if they have found a resource on which the estimated processing time is acceptable according to some agent-specific threshold. Our dispersion processes can also be viewed as a spatial coordination game, where the goal is to achieve a state in which players are all making distinct choices. As mentioned in [4], such games serve as a model for the dynamics in location games or habitat selection of species.

Recall that the dispersion time is the maximum number of steps taken by any of the n particles in either IDLA process. For the complete graph K_n the Sequential-IDLA process has essentially the same dynamics as the famous coupon collector process, and the dispersion time corresponds to the longest wait between collecting successive (new) coupons. Thus the discrepancy between the Sequential and Parallel dispersion times for K_n measures the effect of parallelising the coupon collector process on the longest time between coupons. This motivates the study of the dispersion time on different networks, which we can view as a generalization of the coupon collector process. In the general setting we address the question: what is the cost of parallelising the IDLA process? Addressing this question requires us to determine, or at least estimate, the Parallel and Sequential dispersion times.

The total time taken by all walks, as opposed to the longest walk, is also natural to study for these models. Returning briefly to the complete graph, we see that the sum of the walk lengths in the Sequential-IDLA corresponds to the time to collect all coupons: this is what is typically studied for the coupon collector. Our couplings show that for any fixed graph the sum of all walk lengths, later denoted by W, is the same for Parallel and Sequential IDLA. From one perspective this motivates the study of W for general graphs; this is work in progress by the authors.
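On the complete graph this coupon-collector connection can be checked directly. The sketch below is our own illustration (not from the paper); it models one step of the walk on K_n as a jump to a uniformly random vertex among the other n - 1, so each particle's walk length is geometric in the current number of free vertices.

```python
import random

def kn_sequential_walk_lengths(n, rng):
    """Walk lengths of Sequential-IDLA on the complete graph K_n (origin 0).
    Each step moves to a uniform vertex among the other n - 1 vertices."""
    occupied = {0}   # the first particle settles at the origin immediately
    lengths = [0]
    for _ in range(n - 1):
        v, steps = 0, 0
        while v in occupied:
            u = rng.randrange(n - 1)
            v = u if u < v else u + 1  # uniform over the n - 1 other vertices
            steps += 1
        occupied.add(v)
        lengths.append(steps)
    return lengths

rng = random.Random(0)
lengths = kn_sequential_walk_lengths(200, rng)
total = sum(lengths)    # sum of all walk lengths: coupon-collector time
longest = max(lengths)  # dispersion time: longest gap between new coupons
```

Over many runs, `total` concentrates around the classical coupon-collector time of order n log n, while `longest` is dominated by the wait for the last few free vertices and grows only linearly in n.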
However, in this paper we are more interested in the discrepancies between the Sequential and Parallel processes, some of which are captured by the dispersion time.

Since in IDLA particles perform random walks, both dispersion processes can be regarded as a protocol for exploring and covering an unknown network. However, as opposed to previously studied models of covering a graph with multiple random walks [3, 8, 20], the lengths of the particles' trajectories may vary wildly in the dispersion process. This introduces strong correlations between different particles, a challenge which is not present in the cover time of multiple random walks.

1.1 Our Contributions

Let τ^v_seq(G) and τ^v_par(G) denote the dispersion time of Sequential-IDLA and Parallel-IDLA on G with origin v, respectively. The key question is how τ^v_seq(G) and τ^v_par(G) are related, and whether there is an ordering between them. We answer this question by developing a coupling, based on "cutting & pasting" particle trajectories, which we use to show the following result.

Theorem 1 (see Theorems 4.1 and 4.2). For any connected n-vertex graph G and v ∈ V(G),

τ^v_seq(G) ⪯ τ^v_par(G) (stochastic domination). Further, E[τ^v_par(G)] = O( E[τ^v_seq(G)] · log n ).

If instead we count the total number of jumps performed by all particles, then this quantity has the same distribution in both processes. Our work leaves whether E[τ^v_par(G)] = O( E[τ^v_seq(G)] ) as an open question. Note, however, that Theorem 5.1 demonstrates that already for the clique, the Parallel-IDLA is about 30 percent slower than the Sequential-IDLA.
Thus, we cannot have equality between the two processes, even though the path is an example where both processes have the same dispersion time up to lower order terms, see Theorem 5.4.

In Section 4 we introduce the continuous-time Uniform-IDLA (CTU-IDLA), a variant of the Parallel-IDLA where each particle moves at times given by its own rate-1 exponential clock until it settles. Denote its dispersion time by τ^v_c-unif(G), and let τ^v_c-seq(G) be the dispersion time of the Sequential-IDLA run with continuous-time random walks. We also consider running the parallel and sequential processes with lazy walks and let τ^v_L-par(G), τ^v_L-seq(G) denote their dispersion times.

Theorem 2 (see Theorems 4.3, 4.11 and 4.10). For any connected n-vertex graph G and v ∈ V(G),

τ^v_c-unif(G) = Θ( τ^v_par(G) ), τ^v_L-par(G) = Θ( τ^v_par(G) ), τ^v_c-seq(G) = Θ( τ^v_seq(G) ) and τ^v_L-seq(G) = Θ( τ^v_seq(G) ),

hold w.h.p. and in expectation.

We also consider general scheduling sequences satisfying a natural condition we call "index-repeating", which states that if the walks are not allowed to settle and the process continues forever, then no walk will ever stop moving. We can show that the greatest number of steps taken by a walk in the IDLA process under any index-repeating schedule is stochastically dominated by the same quantity in the Parallel process (Theorem 4.8). The intuition behind Parallel-IDLA being "slower" than Sequential-IDLA is that, due to competition between particles trying to settle concurrently, the lengths of particle trajectories in Parallel-IDLA vary more than in Sequential-IDLA.

Let t_seq(G) = max_{v∈V} E[τ^v_seq(G)] and t_par(G) = max_{v∈V} E[τ^v_par(G)] be the worst-case expected dispersion times over all possible origins/starting vertices in V.
Let t_hit(G) be the maximum, over all pairs of vertices v, w, of the expected hitting time of a random walk from v to w. We derive a basic but useful upper bound on the dispersion time in terms of the hitting time.

Theorem 3 (see Theorem 3.1, Corollary 3.2, Theorem 5.11, Proposition 5.20). Let G be any connected graph with n vertices. Then, for any vertex v ∈ V,

Pr[ τ^v_par(G) > 8 · t_hit(G) · log n ] ≤ n^{-2} and t_par(G) = O( t_hit(G) · log n ).

The same results also hold for τ^v_seq and t_seq. These results imply the following worst-case bounds:

• For any n-vertex graph, t_seq, t_par = O(n^3 log n).
• For any regular n-vertex graph, t_seq, t_par = O(n^2 log n).

Moreover, the lollipop graph and the cycle, respectively, are graphs matching the two bounds up to constant factors.

In view of the upper bound in Theorem 3, and based on the intuition that the last walk in the Sequential-IDLA should have a hard target to hit, one would expect that the worst-case hitting time provides at least an approximate lower bound on the dispersion time. This intuition turns out to be false in general, as evidenced by a certain class of bounded-degree trees (see Proposition 6.3) which exhibits a gap of almost √n between the hitting and dispersion times. We obtain some lower bounds based on the maximum degree ∆(G) and the mixing time t_mix.

Theorem 4 (see Theorems 3.7, 3.8 & 3.10). Let G be a connected n-vertex graph; then t_seq(G) = Ω(|E|/∆). If ∆ = O(|E|/n), then t_seq(G) = Ω(t_mix). For any tree T, we have t_seq(T) = Ω(n).

The first two bounds are tight and the third is known to be tight up to a log n factor. The upper bound in Theorem 3.1 matches the Matthews bound for the cover time up to a constant [37, Thm. 11.2]. While Theorem 3.1 is tight for the cycle, it turns out not to be tight for most "well-connected" graphs like expanders, high-dimensional grids and hypercubes.
Thus, as a general rule of thumb, for well-connected graphs the dispersion time is usually of order t_hit, while for poorly connected graphs it is usually of order t_hit · log n. The behaviour of this extra log factor potentially appearing in the dispersion time contrasts with that of the log factor which may appear in the cover time.

Let t_hit(π, S) denote the expected hitting time of S ⊆ V by a random walk started from stationarity. We provide a general framework for establishing bounds better than O(t_hit log n) by considering certain sums of hitting times of subsets of decreasing sizes.

Theorem 5 (see Theorems 3.3 and 3.6, and Corollary 3.5). For any connected n-vertex graph G,

t_par(G) ≤ 60 · Σ_{j=1}^{⌈log n⌉} ( t_mix + max_{S ⊆ V : |S| ≥ 2^{j-1}} t_hit(π, S) ),

where the walks in the IDLA process are lazy. Furthermore,

t_seq(G) ≤ 30 · max_{1 ≤ j ≤ ⌈log n⌉} { 2^j · ( t_mix + max_{S ⊆ V : |S| ≥ 2^{j-1}} t_hit(π, S) ) }.

Consequently, for any connected n-vertex almost-regular graph, t_par(G) = O( n/(1 − λ) ), where λ denotes the second largest eigenvalue of the transition matrix.

Neglecting constant factors, both upper bounds look comparable; however, it is not difficult to verify that the upper bound on t_seq is at most the upper bound on t_par, up to constants. Conversely, the gap between the two upper bounds can be shown to be at most O(log n). Note that both statements recover the basic O(t_hit · log n) upper bound, but as soon as there is a sufficient speed-up for hitting times of larger sets (and the mixing time is not too large), these bounds may give a bound of O(t_hit). We will see that this is indeed the case for several fundamental classes of graphs in Section 5, where we apply the previous bounds, and in particular Theorem 5.

In Section 5 we calculate the dispersion times in several fundamental networks.
Table 1 summarises our results and shows that we can determine the expected dispersion times up to multiplicative constant factors for all graphs apart from the 2-dimensional grid, where there is a discrepancy of order log n between the lower and upper bounds.

Graph family              | Cover time t_cov | Hitting time t_hit | Mixing time t_mix  | Dispersion time t_seq      | Dispersion time t_par
path                      | Θ(n^2)           | Θ(n^2)             | O(n^2)             | κ_p · n^2 log n            | κ_p · n^2 log n
cycle                     | Θ(n^2)           | Θ(n^2)             | O(n^2)             | Θ(n^2 log n)               | Θ(n^2 log n)
2-dimensional grid        | Θ(n log^2 n)     | Θ(n log n)         | Θ(n)               | Ω(n log n), O(n log^2 n)   | Ω(n log n), O(n log^2 n)
d-dimensional grid, d > 2 | Θ(n log n)       | Θ(n)               | Θ(n^{2/d})         | Θ(n)                       | Θ(n)
hypercube                 | Θ(n log n)       | Θ(n)               | Θ(log n log log n) | Θ(n)                       | Θ(n)
binary tree               | Θ(n log^2 n)     | Θ(n log n)         | Θ(n)               | Θ(n log^2 n)               | Θ(n log^2 n)
complete graph            | Θ(n log n)       | Θ(n)               | Θ(1)               | ~ κ_cc · n                 | ~ (π^2/6) · n
expanders                 | Θ(n log n)       | Θ(n)               | O(log n)           | Θ(n)                       | Θ(n)

Table 1: The last two columns summarize our results; the first three columns are shown for comparison. The constant κ_cc has an explicit formula given by Lemma 5.2. The constant κ_p is non-explicit, though specified in Section 5; simulations run by Nikolaus Howe (student) suggest an approximate value for it.

Closing the log n gap for the 2-dimensional grid remains an interesting open problem, which seems to require very detailed knowledge of the shape of the aggregate on a finite box/torus. As discussed in Section 1.3 below, this is a non-trivial problem even in the infinite setting. For the binary tree we show the dispersion time is Θ(n log^2 n) = Θ(t_hit · log n), see Theorem 5.14.

The first tool we invent to analyse these processes is the Cut & Paste bijection, which maps between the histories of IDLA processes. The bijection allows us to couple the dispersion times of the Parallel-IDLA to those of the Sequential-IDLA and other variants such as Uniform-IDLA (where at each step a random unsettled particle moves), as well as IDLA processes with lazy or continuous-time walks. Bounding dispersion times via these other variants is useful for avoiding issues such as periodicity or simultaneous arrivals at unoccupied vertices.
At a base level, the stochastic domination of τ^v_seq by τ^v_par means we can sandwich both quantities between a bound on τ^v_par from above and one on τ^v_seq from below. Another useful way to describe the dispersion time is in terms of hitting times of sets by multiple random walks. In particular, we present two different upper bounds on τ^v_par and τ^v_seq in terms of hitting times of sets. We also prove a lower bound on τ^v_seq by the mixing time; this comes from the relationship between the mixing time and the hitting time of large sets.

Although the Sequential and Parallel IDLA processes are closely related, the different sources of dependence arising from the contrasting scheduling protocols provide several challenges. In the Sequential-IDLA, interaction between the walkers comes via the configuration of vertices settled by the previous walks. This can make proving a tight lower bound on τ^v_seq tricky, and often some knowledge of the geometry of the aggregate after a certain time is helpful. What is needed are results reminiscent of the "shape theorems" discussed in Section 1.3 below. This requirement for detailed knowledge of the aggregate appears to be crucial in achieving a tight lower bound on τ^v_seq for the binary tree and the 2-dimensional grid. In comparison with the Sequential-IDLA, interactions are less passive in the Parallel-IDLA, as particles jostle to be the first to settle at a vertex. This interaction can increase the length of the longest walk, as is witnessed by the Cut & Paste bijection.

1.3 Related Work

As pointed out by Diaconis & Fulton [22], there are several mathematical reasons for studying IDLA, including using it to take a product of sets, a special case of the "smash product". The limit shape of the aggregate on Z^d was first studied by Lawler, Bramson and Griffeath [35], who showed that, after adding n particles and properly rescaling the aggregate by n^{1/d}, in the limit as n → ∞ this converges to a Euclidean ball.
There has been a series of improvements to this "shape theorem" of [35], obtained by bounding the rate of convergence to the Euclidean ball. The first refinement was made by Lawler [33], and the state of the art was achieved recently by two independent groups of authors [6, 5, 7, 30, 31]. Several authors have also proved shape theorems on other infinite graphs and groups, including combs, d-ary trees, non-amenable groups and Bernoulli percolation on Z^d [29, 12, 28, 47, 24]. In all of these cases the limit shape is always a ball with respect to the underlying graph metric. Limit shapes in Z^d for other variants of IDLA have also been established. These variations include using non-standard random walks, such as drifted [41] and cookie walks [45], or starting the walks from different positions [23]. The time for the process started with some initial aggregate to "forget" this starting state has also been studied [39, 48].

One model where interaction between particles prevents settling at a site is a two-type particle system called "Oil and Water", where particles of opposite types displace each other [16]. There have been some papers on models related to the parking function of a graph, where cars drive randomly around a graph searching for vacant spots [21, 27]. More commonly, however, interaction is directly between particles and not with the host graph, as in predator-prey/coalescing models [20]. The problem of uniformly distributing n non-communicating memoryless particles across n unoccupied sites has also been considered from a game-theoretic perspective [4].

Other models related to IDLA include rotor-router aggregation, chip firing, Abelian sandpile models and activated random walks [11, 38, 49]. Many of these interacting particle systems satisfy a so-called "least action principle" which is key to their analysis. Such a principle roughly states that the natural behavior of the system is in a sense optimal and, if the process is perturbed, then the outcome will have a higher energy.
One may try to find a least action principle for Sequential-IDLA by conjecturing that, if we allow a random walk sometimes not to settle when visiting an unoccupied vertex (thereby performing more random walk steps), this could only delay the dispersion time. However, we show in Proposition 6.2 that this is not the case. In particular, we give a graph for which the dispersion time decreases if one allows some particles to perform more random walk steps.

To the best of our knowledge, the dispersion time and IDLA on a finite graph have not been studied before. Moore and Machta consider running IDLA walks synchronously in parallel models of computation for the purposes of simulating the limit shape [43]; however, their results do not appear to overlap with ours. Simulating the process efficiently has also been studied more recently [26]. Thacker and Volkov [50] study a border-DLA-based growth model on finite graphs and investigate how long it takes until the aggregate grown from a fixed origin hits a fixed boundary.

Notation. Throughout, G = (V, E) will denote an undirected, unweighted, connected graph with n vertices. Let ∆(G) denote the maximum degree of G. We say that a graph G is almost-regular if the ratio between its maximum and minimum degree is bounded from above by a constant.

To recap, we let τ^v_par(G) denote the dispersion time of the Parallel-IDLA process on G started from v, that is, the first iteration at which every vertex hosts (exactly) one particle. Similarly, τ^v_seq(G) denotes the dispersion time of the Sequential-IDLA process on G started from v, that is, the longest time it takes a single particle to settle. Let t_seq(G) = max_{v∈V} E[τ^v_seq(G)] and t_par(G) = max_{v∈V} E[τ^v_par(G)]. We shall drop the dependence on G from our notation when the graph is clear from the context.

Further, let t_hit(u, v) = E[τ_hit(u, v)], where τ_hit(u, v) is the time for a random walk to reach v from u.
Let t_hit(G) := max_{u,v ∈ V(G)} t_hit(u, v). For a probability distribution µ on V and a set S ⊆ V, let t_hit(µ, S) denote the expected time for the walk started from µ to hit any vertex of S.

Thanks to our results relating lazy and non-lazy walks (Theorem 4.3), we can conveniently switch between the two models at the cost of a constant factor (under some mild additional conditions this factor is 2 + o(1)); thus walks may be assumed lazy. We use P to denote the transition matrix of the non-lazy walk (and P̃ = (I + P)/2 for the lazy walk), and p^t_{u,v} to denote the probability that a random walk goes from u to v in t steps (and p̃^t_{u,v} respectively for the lazy walk). We let

t_mix = min{ t ≥ 0 : max_{x∈V} Σ_{y∈V} | p̃^t_{x,y} − π(y) | ≤ 1/e }

denote the mixing time of G.

Some of the dispersion results in this paper hold in expectation, some hold w.h.p. (with probability 1 − o(1)) and others hold in both senses. One does not necessarily imply the other; in particular, Proposition 6.1 shows there are graphs where neither dispersion time concentrates.

Road Map.
The rest of this paper is organized as follows. We first present some general upper and lower bounds in Section 3, before turning to the more involved coupling proofs in Section 4. In Section 5 we apply the results from Sections 3 and 4 to specific networks, completing the results in Table 1; for some of these networks a more refined analysis is required. We conclude the paper in Section 7 with a summary of our results and some open problems.
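For small graphs, the quantities t_hit(G) and t_mix defined above can be computed exactly by direct linear algebra. The following sketch is our own illustration (function names are ours): it computes the maximum expected hitting time via the standard first-step linear system, and the mixing time by iterating the lazy transition matrix until the defining condition max_x Σ_y |p̃^t_{x,y} − π(y)| ≤ 1/e holds.

```python
import numpy as np

def transition_matrix(adj):
    """Row-stochastic transition matrix of the simple random walk."""
    n = len(adj)
    P = np.zeros((n, n))
    for v, nbrs in adj.items():
        for u in nbrs:
            P[v, u] = 1.0 / len(nbrs)
    return P

def max_hitting_time(P):
    """max_{u,v} E[time to reach v from u], via the linear system
    h(v) = 0 and h(u) = 1 + sum_w P[u,w] h(w) for u != v."""
    n = P.shape[0]
    best = 0.0
    for v in range(n):
        idx = [u for u in range(n) if u != v]
        A = np.eye(n - 1) - P[np.ix_(idx, idx)]
        h = np.linalg.solve(A, np.ones(n - 1))
        best = max(best, h.max())
    return best

def mixing_time(P_lazy, pi, eps=np.exp(-1.0)):
    """Smallest t with max_x sum_y |p_lazy^t(x,y) - pi(y)| <= 1/e."""
    n = P_lazy.shape[0]
    Q, t = np.eye(n), 0
    while np.abs(Q - pi).sum(axis=1).max() > eps:
        Q = Q @ P_lazy
        t += 1
    return t

# Example: the cycle on 8 vertices (max hitting time = n^2/4 = 16,
# attained at antipodal vertices).
n = 8
cycle = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
P = transition_matrix(cycle)
t_hit = max_hitting_time(P)
t_mix = mixing_time((np.eye(n) + P) / 2, np.full(n, 1.0 / n))
```

Plugging such exact values into the bounds of Section 3 gives concrete numerical estimates for small instances; for the asymptotic statements in Table 1 the closed-form expressions are of course used instead.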
The first upper bound we present holds for any graph and only requires knowledge of the maximum hitting time of a random walk between two vertices. Although this result can also be recovered from the more general Theorem 3.3, it serves as a good "warm-up".
Theorem 3.1.
Let G be any connected graph with n vertices. Then, for any v ∈ V,

Pr[ τ^v_par(G) > 8 · t_hit(G) · log n ] ≤ n^{-2} and t_par(G) = O( t_hit(G) · log n ).

The same results also hold for τ^v_seq and t_seq.

Proof. To begin, sample n random walks of length T = 8 t_hit(G) log n starting from the origin; then w.p. at least 1 − n^{-2} all of these walks have covered all the vertices of the graph. To see this, note that by Markov's inequality the probability that a single walk of length 2 t_hit visits u ∈ V is at least 1/2. Splitting T into 4 log n intervals of length 2 t_hit, it follows that u is visited in time T w.p. at least 1 − n^{-4}. Thus, by a union bound, in time T one walk covers the graph w.p. at least 1 − n^{-3}, and all walks cover the graph w.p. at least 1 − n^{-2}.

Now, we run the Parallel-IDLA process using the n sampled walks, so each particle follows a predetermined trajectory. Since all the n walks cover the graph, it follows that all the particles have settled by time T = 8 t_hit(G) log n with probability at least 1 − n^{-2}. To obtain the result in expectation, divide time into phases of 8 t_hit(G) log n steps; then the number of phases needed to finish the process is stochastically dominated by a geometric random variable of mean 1/(1 − n^{-2}), concluding that E[τ^v_par(G)] = O(t_hit(G) log n). Since this holds for any v ∈ V, it follows that t_par = O(t_hit(G) log n). The same results hold for τ^v_seq and t_seq due to Theorem 4.1.

This simple bound is actually tight in many cases, see Table 1. The next result is a simple consequence, yet it provides the correct asymptotic worst-case bounds for the dispersion time.

Corollary 3.2 (General quantitative bounds on graphs).

• For any n-vertex graph, t_seq, t_par = O(n^3 log n).
• For any regular n-vertex graph, t_seq, t_par = O(n^2 log n).

Proof. This follows from Theorem 3.1 and the bounds on t_hit in [40, Thm. 2.1].

Notice these bounds exceed the corresponding upper bounds on the cover time [2, Thm. 6.12, Thm. 6.15] by a log n factor.
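The first step of this proof is easy to check empirically. The sketch below is our own illustration (names are ours): it samples independent walks of length T = 8 · t_hit · log n on a cycle, where t_hit = n^2/4, and counts how many of them cover the graph, which is exactly the event driving the union bound.

```python
import math
import random

def covers(adj, origin, length, rng):
    """Does a single random walk of the given length visit every vertex?"""
    v, seen = origin, {origin}
    for _ in range(length):
        v = rng.choice(adj[v])
        seen.add(v)
    return len(seen) == len(adj)

# Cycle on n vertices: t_hit = n^2/4, so T = 8 * t_hit * log2(n).
n = 16
cycle = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
T = int(8 * (n * n / 4) * math.log2(n))  # 2048 steps for n = 16
rng = random.Random(0)
n_covering = sum(covers(cycle, 0, T, rng) for _ in range(n))
# With high probability all n sampled walks cover the cycle; running
# Parallel-IDLA along these fixed trajectories then settles every
# particle by time T, which is the argument used in the proof above.
```

Since the cover time of the n-cycle is only Θ(n^2), the length T is generous, and in simulation `n_covering` is almost always equal to n.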
Both bounds above are sharp up to a multiplicative constant, as witnessed by the lollipop graph and the cycle, respectively; see Proposition 5.20 and Theorem 5.11. In fact, for any fixed r < ∞ one can construct a family of r-regular graphs for which the second bound above is tight. For example, when r = 3 one can iteratively augment an even cycle by adding an edge between two vertices of degree two that are at distance two, to obtain a 3-regular graph with the same asymptotic dispersion time as the cycle.

In this section we achieve more refined bounds by considering hitting times of sets as opposed to vertices. To avoid periodicity-related issues, we assume the trajectory of each particle is a lazy random walk. As shown in Theorem 4.3, the parallel and sequential dispersion times with lazy walks are equivalent to their non-lazy counterparts up to constant factors; thus any results established for the dispersion time with lazy walks also apply to non-lazy walks (up to a constant factor) and vice versa. Define τ^v_par(G, k) to be the first time (from a worst-case start vertex) at which fewer than 2^{k-1} particles remain unsettled, and let t^k_par(G) = max_{v∈V} E[τ^v_par(G, k)] denote the worst-case expectation. Clearly τ^v_par(G, 1) = τ^v_par(G), which is the standard parallel dispersion time.

Theorem 3.3.
Consider the Parallel-IDLA process with lazy walks. Then, for any connected n-vertex graph and any k ≥ 1, we have

t^k_par(G) ≤ 60 · Σ_{j=k}^{⌈log n⌉} ( t_mix + max_{S ⊆ V : |S| ≥ 2^{j-1}} t_hit(π, S) ).

One consequence of this theorem is that, for k close to ⌈log n⌉, all but a constant fraction of the particles settle within O(t_mix) steps, provided max_{S ⊆ V : |S| ≥ n/4} t_hit(π, S) = O(t_mix).
Note that the upper bound can be estimated directly to be at most 60 ⌈log n⌉ · (t_mix + t_hit) = O(⌈log n⌉ · t_hit), so this bound is (up to a multiplicative constant) a refinement of Theorem 3.1.

Proof of Theorem 3.3. We divide the process into ⌈log n⌉ phases, which are labelled in reverse order ⌈log n⌉, ⌈log n⌉ − 1, . . . ,
1. Phase j starts as soon as the number of unsettled walks k satisfies k ∈ [2^{j-1}, 2^j). It could be the case that the number of unsettled walks more than halves in one step and phase j is skipped; for now assume this is not the case. Let t be the first time step at the beginning of phase j, and let S ⊆ V be the set of unoccupied vertices at time t, thus |S| = k. Consider k random walks moving independently and having no interaction with the unsettled vertices, and let τ_j be the (random) time by which every subset S′ of S of size at least k/2 has been hit by at least k/2 of these walks. We claim that τ_j stochastically dominates the length of phase j. Suppose the number of unsettled walks is still at least k/2 at time t + τ_j. Then there still exists a subset S′ of unoccupied vertices of size at least k/2 at time t + τ_j. We know that at least k/2 of the walks have hit S′. Thus all these walks must terminate earlier, as otherwise the vertices in S′ cannot all be unoccupied at step t + τ_j; however, in this case we have a contradiction to the assumption that at least k/2 walks are unsettled. It remains to bound E[τ_j] from above.

Consider first a fixed random walk and a fixed set S′ ⊆ S of size at least k/2. The probability that the walk does not hit S′ within 30 · (t_mix + t_hit(π, S′)) steps is at most (1/2)^6. This follows from the fact that after 5 t_mix time, with probability at least 1 − e^{-5}, we can couple the Markov chain with the stationary distribution (e.g. Lemma A.5 in [32]), and then, given that the coupling is successful, Markov's inequality gives us that with probability at most 1/5 the walk does not hit S′ within 5 t_hit(π, S′) further steps; thus, the probability we do not hit S′ in 5 · (t_mix + t_hit(π, S′)) steps is at most e^{-5} + (1 − e^{-5})(1/5) < 1/2, and hence after 6 time-intervals of length 5 · (t_mix + t_hit(π, S′)) the probability the walk has not hit S′ is at most (1/2)^6. Hence the probability that at least k/2 of the k walks do not hit the set S′ is at most

(k choose ⌈k/2⌉) · ( (1/2)^6 )^{k/2} ≤ 2^k · 2^{-3k}.

Taking the union bound over all (at most 2^k) subsets S′ of S of size at least k/2, it follows that the probability that there exists a subset of the unoccupied vertices of size at least k/2 which is missed by at least k/2 of the walks is at most

2^k · 2^k · 2^{-3k} = 2^{-k} ≤ 1/2.

Hence the expected time the process spends in phase j (assuming that we reach this phase and do not skip it) is at most

2 · 30 · ( t_mix + max_{S ⊆ V : |S| ≥ 2^{j-1}} t_hit(π, S) ).

Summing up these contributions from j = k to ⌈log n⌉ yields the result.

Corollary 3.5.
Let G be a connected n-vertex almost-regular graph. Then, t_par(G) = O( n/(1 − λ) ).

Proof. Lemma A.2 states that t_hit(v, S) = O( n (1 + ⌈log |S|⌉) / ((1 − λ)|S|) ) holds for any S ⊆ V and v ∈ V. We also have t_mix = O( log n / (1 − λ) ) by [37, Thm. 12.3]. Plugging these estimates into Theorem 3.3 yields

t_par(G) ≤ O(1) · Σ_{j=1}^{⌈log n⌉} ( log n / (1 − λ) + n · (1 + j) / ((1 − λ) 2^{j-1}) ) = O( n/(1 − λ) ).

The result follows since lazy and non-lazy dispersion times are equivalent up to a constant factor by Theorem 4.3.

Let us now turn to the sequential process, where we can derive a similar bound, which turns out to be slightly stronger.
Theorem 3.6.
For any n-vertex graph G, we have

t_seq(G) ≤ 30 · max_{1 ≤ j ≤ ⌈log n⌉} { 2^j · ( t_mix + max_{S ⊆ V : |S| ≥ 2^{j-1}} t_hit(π, S) ) }.

Proof. Since in the sequential process only one walk moves at a time, we can couple simple and lazy walks so that the dispersion time with simple walks is always at most that with lazy walks. Thus we can assume the walk is lazy. Fix a time τ to be determined later. Consider the (n − k)-th walk in the Sequential-IDLA, when there are still k unoccupied vertices. It was argued in the proof of Theorem 3.3 that the probability that the random walk does not hit a set S of size k within 5(t_mix + max_{S ⊆ V : |S| = k} t_hit(π, S)) time steps is at most 1/2, from any start vertex v. Denote

q(k) = ⌊ τ / ( 5 · (t_mix + max_{S ⊆ V : |S| = k} t_hit(π, S)) ) ⌋;

hence the probability that the random walk does not succeed within τ steps (assuming τ is large enough) is at most 2^{-q(k)}. Thus, by the union bound, the probability that at least one of the n walks does not succeed is at most Σ_{k=1}^n 2^{-q(k)}. By dividing the sum into ⌈log n⌉ buckets of sizes (at most) 1, 2, . . . , 2^m, . . . , 2^{⌈log n⌉}, and using monotonicity of hitting times, it follows that the above term is at most

Σ_{j=1}^{⌈log n⌉} 2^j · exp( − τ log 2 / ( 5 · (t_mix + max_{S ⊆ V : |S| ≥ 2^{j-1}} t_hit(π, S)) ) ).

Next observe that we need to ensure that for every j it holds that

τ ≥ 2^j · 5 · ( t_mix + max_{S ⊆ V : |S| ≥ 2^{j-1}} t_hit(π, S) ),

otherwise just a single addend above is larger than 1. However, if we just choose

τ := 3 · max_{1 ≤ j ≤ ⌈log n⌉} { 2^j · 5 · ( t_mix + max_{S ⊆ V : |S| ≥ 2^{j-1}} t_hit(π, S) ) },

then we see that the total sum in the union bound expression is at most 1/2, and we can conclude that with probability at least 1/2 none of the n walks takes more than τ steps. Repeating the argument, the probability that some walk takes more than mτ steps is at most 2^{-m}, which gives the result.

It can be checked that the bound of Theorem 3.6 is, up to constants, never worse than that of Theorem 3.3, and it can be better by up to a log n factor.

Bounds on the expected hitting time of sets can be obtained by analyzing return probabilities; in some situations these bounds are very tight. Since those bounds are more related to Markov chain properties than to the IDLA process, and in order to keep the analysis of the IDLA process as clean as possible, we do not provide them here but in Appendix A. These bounds can be applied in Theorems 3.3 and 3.6, but also directly to specific graph families.

Theorem 3.7.
Let $G$ be a connected $n$-vertex graph with maximum degree $\Delta$. Then $t_{seq}(G) = \Omega(|E|/\Delta)$. In particular, $\Omega(n)$ is a lower bound for almost-regular graphs.

Proof. We analyse the Sequential-IDLA process and lower bound the time it takes the last walk to find a free site.

Recall that for any pair of vertices $u,v \in V$, $t_{com}(u,v) = t_{hit}(u,v) + t_{hit}(v,u)$ is the commute time between $u$ and $v$. By [40, Cor. 2.5] there is an ordering of the $n$ vertices so that if $u$ precedes $v$, then $t_{hit}(u,v)\le t_{hit}(v,u)$. Let us take the last vertex $w$ in this ordering as the origin of the dispersion process, so that for any other vertex $v$ we have $t_{hit}(w,v)\ge t_{hit}(v,w)$. Hence for every vertex $v$,
$$t_{hit}(w,v) \ge \tfrac{1}{2}\, t_{com}(w,v).$$
Let $R(u,v)$ be the effective resistance between $u$ and $v$, and note that $R(w,v)\ge 1/\deg(w)\ge 1/\Delta$. Hence $t_{com}(w,v) = 2|E|\cdot R(w,v) = \Omega(|E|/\Delta)$ by the commute time identity [37, Prop. 10.6]. It follows that, in expectation, the last walk in the Sequential-IDLA takes $\Omega(|E|/\Delta)$ steps.

Theorem 5.1 shows this is tight up to constants when $G$ is the complete graph $K_n$. We also present a refined lower bound for trees.

Theorem 3.8.
Let $T$ be any $n$-vertex tree. Then $t_{seq}(T) \ge 2n-3$.

Proof. In an IDLA process started from any vertex of $T$, the last vertex to be settled must be a leaf. Call this last vertex $v$, and let $\{u,v\}$ be the unique edge connecting it to the rest of $T$. The expected time taken by the last walk to settle is thus at least the expected time $t_{hit}(u,v)$ to cross the edge $\{u,v\}$. The essential edge lemma [2, Lem. 5.1] states that $t_{hit}(u,v) = 2|A(u,v)| - 1$, where $A(u,v)$ is the component of $T$ containing $u$ after the removal of $\{u,v\}$. Since $|A(u,v)| \ge n-1$, the proof is complete.

Let $S_n$ be the $n$-vertex star and notice that $t_{seq}(S_n) = 2\,t_{seq}(K_n) = \Theta(n)$ by Theorem 5.1. This shows Theorem 3.8 is tight up to a small multiplicative constant.

Remark 3.9.
It would be natural to hope that the lower bound $t_{seq} = \Omega(t_{hit})$ should hold, since one would expect the vertices with largest hitting times to be explored later by the sequential process and thus to contribute to the dispersion time. A proposition proved later refutes this by exhibiting a graph where $t_{seq}$ is a $\mathrm{poly}(n)$-factor smaller than $t_{hit}$.

For a graph $G$, let $\Phi$ be the conductance of $G$, and let $\lambda$ and $t_{mix}$ be the second largest eigenvalue and the mixing time associated with the lazy random walk on $G$, respectively. The following lower bound is tight up to a $\log n$ factor, as witnessed by the cycle (Theorem 5.11).

Proposition 3.10.
Let $G$ be a graph satisfying $\Delta = O(|E|/n)$. Then there exists a $v\in V$ such that, w.h.p., $\Omega(n)$ walks in the Sequential-IDLA process from $v$ take time $\Omega(t_{mix})$ to settle; consequently,
$$t_{seq}(G) = \Omega(t_{mix}) = \Omega\Big(\frac{1}{1-\lambda}\Big) = \Omega\Big(\frac{1}{\Phi}\Big).$$

Proof.
By the characterization of mixing times by hitting times of large sets [44], for all reversible lazy random walks,
$$t_{mix} \le c\,\max_{u,A:\ \pi(A)\ge 1/2} t_{hit}(u,A), \qquad (1)$$
where $c<\infty$ is a universal constant, which we may assume to be greater than $1$. Let $u$ and $A$ be a vertex and a set that together attain the maximum above. Let $r := \Delta\cdot n/|E| < \infty$ and observe that, since $\pi(A)\ge 1/2$, we have $|A|\,\Delta \ge |E|$, so $|A| \ge |E|/\Delta = n/r \ge n/(3r)$. Consider now a simple random walk of length $\ell := t_{mix}/(120\cdot r\cdot c)$. For every vertex $v\in V$, let $p_v$ be the probability that a random walk starting from $v$ hits the set $A$ within $\ell$ steps. Note that there must be at least one vertex $w\in V$ such that $p_w < 1/(12\cdot r)$, since otherwise the expected time to hit $A$ would be less than $t_{mix}/(10\,c)$ for all vertices $v$, contradicting (1).

Let $X$ be the number of walks from $w$ that would take time less than $\ell$ to hit $A$ (if they were not allowed to settle before this). As these walks are independent, $X$ is distributed as $\mathrm{Bin}(n, p_w)$, thus $\mathbb{E}[X] < n/(12r)$ and $\Pr[X \ge n/(6r)] \le e^{-\Omega(n/r)}$ by the Chernoff bound. Thus, w.h.p., at least $(1 - 1/(6r))\,n$ of the $n$ walks take time at least $\ell$ to reach $A$; since at least $|A| \ge n/(3r)$ walks must settle in $A$, at least $n/(3r) - n/(6r) = n/(6r)$ walks settling in $A$ take time at least $\ell$. Hence $t_{seq} = \Omega(t_{mix})$, proving the first claim. Then, using the fact that $t_{mix} \ge \big(\frac{1}{1-\lambda} - 1\big)\log 2$ [37, Thm. 12.4] together with the fact that we need at least one step to mix, we obtain $t_{mix} = \Omega\big(\frac{1}{1-\lambda}\big)$. Cheeger's inequality [37, Thm. 13.14], which gives $1-\lambda = O(\Phi)$, completes the proof.

The following bound will be of use in Section 4: although rather weak, it holds w.h.p. for any start vertex. The proof appears in a different context [25], but we reproduce it here for completeness.

Lemma 3.11.
For any connected $n$-vertex graph $G$ and $v\in V(G)$, if $n$ is large enough, it holds that $\tau^v_{seq} > \frac{1}{14}\log n$ with probability at least $1 - e^{-\sqrt{n}}$.

Proof. Consider the following process: run $n$ independent random walks on $G$ starting from $v$, and stop them at time $L = c\log n$, with $c = 1/14$. Denote by $C$ the set of vertices that are hit by at least one of those random walks. A simple coupling argument shows that $\Pr[C \ne V] \le \Pr[\tau^v_{seq} > L]$, and thus it suffices to prove that $\Pr[C \ne V] \ge 1 - e^{-\sqrt{n}}$.

For each of the $n$ walks, denote by $C_i$ the set of vertices covered by the $i$-th walker in the first $L$ steps. Hence $C = \cup_{i=1}^n C_i$. Denote $U = \{u\in V : \Pr[u\in C_i] \ge \frac{2L}{n}\}$, and note $U$ is independent of $i$. Hence for any $i$,
$$L+1 \ge \mathbb{E}[|C_i|] = \sum_{u\in V} \Pr[u\in C_i] \ge \sum_{u\in U} \frac{2L}{n} = |U|\,\frac{2L}{n},$$
therefore $|U| \le \frac{n(L+1)}{2L} \le \frac{3n}{4}$. Denote by $D = V\setminus C$ the set of uncovered vertices; then
$$\mathbb{E}[|D|] \ge \sum_{u\in V\setminus U} \Pr[u\in D] \ge \sum_{u\in V\setminus U}\big(1-\Pr[u\in C_i]\big)^n \ge \frac{n}{4}\Big(1-\frac{2L}{n}\Big)^n \ge \frac{n}{4}\cdot e^{-2L/(1-2L/n)},$$
where in the last step we use the bound $e^{-x/(1-x)} \le 1-x$ for $|x| < 1$. Hence we deduce that
$$\mathbb{E}[|D|] \ge \frac{n}{4}\, n^{-2c(1+o(1))} \ge n^{1-3c}.$$
Finally, we prove that with probability at least $1 - e^{-\sqrt{n}}$ we have $|D| > 0$. Note that $|D|$ depends on the trajectories of the $n$ random walks, and changing one of them changes $|D|$ by at most $L+1$. Therefore, by the method of bounded differences,
$$\Pr\big[\,|D| - \mathbb{E}[|D|] < -\mathbb{E}[|D|]\,\big] \le \exp\Big(-\frac{\mathbb{E}[|D|]^2}{2n(L+1)^2}\Big) \le \exp\Big(-\frac{(n^{1-3c})^2}{2n(L+1)^2}\Big);$$
recalling $L = c\log n$, the right-hand side is at most
$$\exp\Big(-\frac{n^{1-6c}}{2(c\log n)^2}\,(1+o(1))\Big) \le \exp\big(-n^{1-7c}\big) = e^{-\sqrt{n}},$$
since $c = 1/14$. The result follows as $\mathbb{E}[|D|] \ge n^{11/14} > 0$.

Coupling and Stochastic Domination
In this section we shall prove the following stochastic domination using a coupling.
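The domination just announced can be observed empirically before it is proved. The sketch below is our own illustration, not the authors' code; it assumes adjacency-list input and uses the smallest-index tie-breaking rule from the definition of Parallel-IDLA. It runs both processes repeatedly on a small cycle and compares average dispersion times.

```python
import random

def sequential_dispersion(adj, origin, rng):
    """Longest walk length in one run of Sequential-IDLA."""
    occupied, longest = {origin}, 0
    for _ in range(len(adj) - 1):
        v, walked = origin, 0
        while True:
            v = rng.choice(adj[v])
            walked += 1
            if v not in occupied:
                occupied.add(v)
                break
        longest = max(longest, walked)
    return longest

def parallel_dispersion(adj, origin, rng):
    """Longest walk length in one run of Parallel-IDLA: all unsettled
    particles step in every round; ties are resolved in favour of the
    smallest index, so we process particles in index order per round."""
    occupied = {origin}
    pos = {i: origin for i in range(1, len(adj))}
    t = 0
    while pos:
        t += 1
        for i in sorted(pos):          # smallest index settles first
            pos[i] = rng.choice(adj[pos[i]])
            if pos[i] not in occupied:
                occupied.add(pos[i])
                del pos[i]
    return t

rng = random.Random(1)
adj = [[(i - 1) % 12, (i + 1) % 12] for i in range(12)]  # cycle C_12
seq = [sequential_dispersion(adj, 0, rng) for _ in range(200)]
par = [parallel_dispersion(adj, 0, rng) for _ in range(200)]
mean_seq, mean_par = sum(seq) / 200, sum(par) / 200
```

On typical runs `mean_par` exceeds `mean_seq`, consistent with Theorem 4.1; note the theorem asserts domination in distribution, which sample means can only illustrate.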
Theorem 4.1.
Let $G$ be a finite graph and $v\in V(G)$. Then $\tau^v_{seq}(G) \preceq \tau^v_{par}(G)$.

An immediate corollary of this is the relation $\mathbb{E}[\tau^v_{seq}(G)] \le \mathbb{E}[\tau^v_{par}(G)]$; we also prove the reverse inequality up to a $\log n$ factor.

Theorem 4.2.
Let $G$ be a finite graph and $v\in V(G)$. Then $\mathbb{E}[\tau^v_{par}(G)] = O\big(\mathbb{E}[\tau^v_{seq}(G)]\cdot\log n\big)$.

We define the lazy Sequential/Parallel-IDLA to be the Sequential/Parallel-IDLA with the particles moving according to a lazy (instead of simple) random walk. Let $\tau^v_{L-seq}(G)$ be the dispersion time of the lazy Sequential-IDLA on $G$ starting from $v$, and $\tau^v_{L-par}(G)$ the analogous quantity for the lazy Parallel-IDLA. The relation between the lazy and standard IDLA dispersion times is given in the following theorem.

Theorem 4.3.
Let $G$ and $v\in V(G)$. Then the following holds w.h.p. and in expectation:
$$\tau^v_{L-seq}(G) = \Theta\big(\tau^v_{seq}(G)\big) \quad\text{and}\quad \tau^v_{L-par}(G) = \Theta\big(\tau^v_{par}(G)\big).$$
Additionally, if there exists some $\ell = \omega(\log n)$ such that $\Pr[\tau^v_{par}(G) \le \ell] \le 1/\ell$, then
$$\tau^v_{L-seq}(G) = (2+o(1))\cdot\tau^v_{seq}(G) \quad\text{and}\quad \tau^v_{L-par}(G) = (2+o(1))\cdot\tau^v_{par}(G)$$
hold w.h.p. and in expectation.

The proofs of the above theorems are based on a coupling between the Sequential and Parallel-IDLA processes. To construct this coupling we view a (Parallel or Sequential) IDLA process on $G$ as an irregular 2-dimensional array $L$ where each element $L(i,j)\in V$. The array $L$ has $n$ rows, representing the $n$ particles. Column $t$ represents time $t$, and thus $L(i,t)$ is the vertex visited by walk $i$ at time $t$. We let $\rho_i$ denote the length of walk $i$; hence the index of row $i$ runs from $0$ to $\rho_i$. We denote by $I_L$ the set of all indices $(i,t)$ of the array $L$.

Given $(i,s),(j,t)\in I_L$, we say that $(i,s)$ is smaller than $(j,t)$ in sequential order, written $(i,s) <_S (j,t)$, if either $i < j$, or $i = j$ and $s < t$. Thus in sequential order the block $L$ is read as
$$L(1,0), L(1,1),\ldots,L(1,\rho_1),\ L(2,0),\ldots,L(2,\rho_2),\ \ldots,\ L(n,0),\ldots,L(n,\rho_n).$$
Likewise, we say that $(i,s)$ is smaller than $(j,t)$ in parallel order, written $(i,s) <_P (j,t)$, if either $s < t$, or $s = t$ and $i < j$. So in parallel order the block $L$ is read as
$$L(1,0), L(2,0),\ldots,L(n,0),\ L(1,1), L(2,1),\ldots,L(n,1),\ \ldots,\ L(1,r), L(2,r),\ldots,L(n,r),\ldots$$
where if $r > \rho_i$ then $L(i,r)$ is empty and is skipped.

Note that if $L$ is a block representing a Parallel or Sequential-IDLA, then the following property holds:
$$L(i,\rho_i) \ne L(j,\rho_j) \quad\text{for each pair } i \ne j.$$
(2)

If $L$ satisfies (2) then $\{L(i,\rho_i) : i\in[n]\} = V$ and the final element of each row is unique.

A block $L$ satisfying (2) represents a Sequential-IDLA process if and only if each row $i$ represents a path in $G$ from vertex $L(i,0) = v$ to $L(i,\rho_i)$ and, for all $(i,t)\in I_L$,
$$(i,t) \text{ is the first occurrence of vertex } L(i,t) \text{ in } L \text{ w.r.t. } <_S \iff t = \rho_i. \qquad (3)$$
This says that when $L$ is read in sequential order, the first time a new vertex is read it ends the current row. Similarly, a block $L$ satisfying (2) is a realization of a Parallel-IDLA process if and only if each row $i$ represents a path in $G$ from vertex $L(i,0) = v$ to $L(i,\rho_i)$ and, for all $(i,t)\in I_L$,
$$(i,t) \text{ is the first occurrence of vertex } L(i,t) \text{ in } L \text{ w.r.t. } <_P \iff t = \rho_i. \qquad (4)$$

For a 2-dimensional array $L$ we denote its total length (the work done) by $W(L)$; this is the total number of moves recorded by $L$, and thus $W(L) := \rho_1 + \cdots + \rho_n$. Let $\mathrm{Seq}^m_v$, respectively $\mathrm{Par}^m_v$, denote the set of all sequential, respectively parallel, blocks representing realizations of IDLA starting from $v$ with total length $W(L) = m$.

To build the coupling between Sequential and Parallel-IDLA we use a series of "Cut & Paste" transformations. For $(i,t)\in I_L$, define $CP_{(i,t)}(L)$ as the block constructed by taking $L$, cutting the cells $(i,t+1),\ldots,(i,\rho_i)$, and pasting them after the unique cell $(k,\rho_k)$ with $L(i,t) = L(k,\rho_k)$.

Example:
Represented below are $L$, a block on $V = \{1,2,3,4\}$, and $CP_{(4,1)}(L)$, the result of applying the cut & paste $CP_{(4,1)}$ to $L$:

L =
1
1 2
1 2 2 3
1 2 1 2 3 4

CP_{(4,1)}(L) =
1
1 2 1 2 3 4
1 2 2 3
1 2

Meanwhile $CP_{(1,0)}(L) = CP_{(2,1)}(L) = CP_{(3,3)}(L) = CP_{(4,5)}(L) = L$: a cut & paste applied at the final cell of a row leaves $L$ unchanged. Note that if $L$ satisfies property (2), then $L' = CP_{(i,t)}(L)$ also satisfies (2). Property (2) is an important invariant for our algorithms.

We propose two algorithms
StP and PtS, formally specified as Algorithms 1 and 2 below. The algorithm StP transforms a sequential process into a parallel one, and PtS transforms a parallel process into a sequential one. The key component of both algorithms is the cut & paste operation CP. Both algorithms work as follows: a pointer moves through the input array $L$ in a fixed order, and when the pointer sees a vertex label for the first time, the label is added to the set $S$ of seen vertices and a cut & paste transform CP is applied to $L$ at this position before the pointer continues. The difference is that in StP the pointer explores columns then rows (i.e. in parallel order $<_P$), whereas PtS reads rows then columns (i.e. in sequential order $<_S$). Broadly speaking, each algorithm tries to read the input array as if it were of the type specified by the output, and whenever the input fails to have this form it edits it using the cut & paste transform until it has the correct form.

Algorithm 1: Sequential to Parallel (StP)
Result: transforms a sequential array L into a parallel array
S ← ∅; t ← 0;
while |S| < n do
  for i = 1, ..., n do
    if (i,t) ∈ I_L and L(i,t) ∉ S then
      S ← S ∪ {L(i,t)}; L ← CP_(i,t)(L);
    end
  end
  t ← t + 1;
end
return L;

Algorithm 2: Parallel to Sequential (PtS)
Result: transforms a parallel array L into a sequential array
S ← ∅;
for i = 1, ..., n do
  t ← 0;
  while (i,t) ∈ I_L do
    if L(i,t) ∉ S then
      S ← S ∪ {L(i,t)}; L ← CP_(i,t)(L); exit (while);
    end
    t ← t + 1;
  end
end
return L;

The set $S = S(L,k)$ stores the distinct values of $L(i,j)$ observed after $k$ iterations of the innermost loop. The algorithms terminate once they have scanned the whole array, which is the first time $|S| = n$. Sometimes they may apply $CP_{(i,j)}$ with $j = \rho_i$; this leaves $L$ unchanged.

Lemma 4.4 (Correctness and bijectivity of Algorithms 1 & 2). The following holds:
• PtS is a bijection from $\mathrm{Par}^m_v$ to $\mathrm{Seq}^m_v$.
• StP is a bijection from $\mathrm{Seq}^m_v$ to $\mathrm{Par}^m_v$.

Proof. Observe that during the running of PtS and StP, Algorithms 1 & 2, the only changes made to the input array $L$ are a sequence of cut & paste transforms $CP_{i_1,t_1}, CP_{i_2,t_2},\ldots$. Since each cut & paste transform preserves property (2), it follows that PtS and StP preserve (2). Likewise, cutting & pasting preserves total length, and thus so do PtS and StP. Recall that the operator $CP_{(i,t)}$ cuts the random walk trajectory $(i,t+1),\ldots,(i,\rho_i)$ and pastes it onto the unique $(k,\rho_k)$ with $L(i,t) = L(k,\rho_k)$. Thus row $k$ in $L' = CP_{(i,t)}(L)$ is a valid path from vertex $L(k,0) = v$ to $L(i,\rho_i)$.

For PtS we must check that if $L\in\mathrm{Par}^m_v$, then $\mathrm{PtS}(L)\in \mathrm{Seq}^m_v$, i.e. $\mathrm{PtS}(L)$ satisfies (3). Recall that the PtS algorithm reads the input array $L$ in sequential order and, when a vertex label is seen for the first time at some position $(i,j)$, it applies the cut & paste transform $CP_{(i,j)}$ and the pointer moves to the next row. If $(i,j+1)$ is non-empty, then $CP_{(i,j)}$ pastes the remainder of row $i$ onto some row $i'$ with endpoint value $L(i,j)$. Observe that $i' > i$, since $(i,j)$ is the first occurrence of $L(i,j)$ in sequential order. Thus each vertex newly found w.r.t. $<_S$ forms an endpoint: the row is cut when the vertex is first discovered, and nothing can be pasted onto that row later by the algorithm. This proves that $\mathrm{PtS}(L)$ is a valid Sequential-IDLA block.

Likewise, for StP let $L\in\mathrm{Seq}^m_v$; we check that $\mathrm{StP}(L)$ satisfies (4). Suppose that, reading $L$ in parallel order, $(i,j)$ is the first occurrence of $L(i,j)$; StP applies $CP_{(i,j)}$ and continues to read the array in parallel order. Position $(i,j)$ is now fixed as the endpoint of row $i$, as no later cut & paste can alter this row: to paste something else onto row $i$ we would have to see vertex $L(i,j)$ for the first time (again) later in parallel order, which cannot happen.

For injectivity, let $F$ denote either of the maps PtS, StP, and let $L, L'$ be distinct arrays, both from $\mathrm{Par}^m_v$ or both from $\mathrm{Seq}^m_v$ respectively. Assume for a contradiction that $F(L) = F(L')$. Since $L\ne L'$, there is a first position $(i,j)$ at which they differ w.r.t. $<_S$ or $<_P$, i.e. $L(i,j) \ne L'(i,j)$. It cannot be the case that $L(i,j) = \emptyset$ and $L'(i,j) \ne \emptyset$, or vice versa, since otherwise the arrays would already differ at position $(i,j-1)$, which occurs before $(i,j)$ in either ordering. Consider the moment $(i,j)$ is the current position when $F$ is running on $L$ and on $L'$; as the arrays agree before $(i,j)$, the sets of seen labels agree, i.e. $S(t,L) = S(t,L')$. If $L(i,j)\notin S(t,L)$ and $L'(i,j)\notin S(t,L')$, then $CP_{(i,j)}$ is applied in both runs and the position $(i,j)$ is fixed in both arrays, i.e. $F(L)(i,j) = L(i,j) \ne L'(i,j) = F(L')(i,j)$, a contradiction. Similarly, if $L(i,j)\in S(t,L)$ and $L'(i,j)\in S(t,L')$, then no transform is applied and the positions are again fixed, giving the same contradiction. Otherwise the element at $(i,j)$ is seen in one array and not in the other, say $L(i,j)\notin S(t,L)$ and $L'(i,j)\in S(t,L')$; then the transform is applied in one run and not the other, but in both runs the cell $(i,j)$ is fixed from this point on, so once more $F(L)(i,j) = L(i,j) \ne L'(i,j) = F(L')(i,j)$, a contradiction.

For bijectivity: since $\mathrm{StP}:\mathrm{Seq}^m_v\to\mathrm{Par}^m_v$ and $\mathrm{PtS}:\mathrm{Par}^m_v\to\mathrm{Seq}^m_v$ are both injections and $\mathrm{Seq}^m_v, \mathrm{Par}^m_v$ are finite, it follows that $|\mathrm{Seq}^m_v| = |\mathrm{Par}^m_v|$. Thus StP and PtS are surjections.
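The first-occurrence characterizations (3) and (4) that drive this proof can be checked mechanically. The sketch below is our own 0-indexed rendering, not the authors' code; it tests whether a block is a valid sequential or parallel realization.

```python
def first_occurrences(rows, order):
    """Return the set of cells (i, t) that are the first occurrence of
    their vertex when the block is read in the given cell order."""
    seen, first = set(), set()
    for (i, t) in order:
        v = rows[i][t]
        if v not in seen:
            seen.add(v)
            first.add((i, t))
    return first

def cells_seq(rows):
    """All cells in sequential order <_S (rows, then columns)."""
    return [(i, t) for i in range(len(rows)) for t in range(len(rows[i]))]

def cells_par(rows):
    """All cells in parallel order <_P (columns, then rows)."""
    m = max(len(r) for r in rows)
    return [(i, t) for t in range(m) for i in range(len(rows)) if t < len(rows[i])]

def is_sequential(rows):
    """Property (3): first occurrences w.r.t. <_S are exactly the row ends."""
    ends = {(i, len(r) - 1) for i, r in enumerate(rows)}
    return first_occurrences(rows, cells_seq(rows)) == ends

def is_parallel(rows):
    """Property (4): first occurrences w.r.t. <_P are exactly the row ends."""
    ends = {(i, len(r) - 1) for i, r in enumerate(rows)}
    return first_occurrences(rows, cells_par(rows)) == ends

# The example block from the text satisfies both characterizations.
blk = [[1], [1, 2], [1, 2, 2, 3], [1, 2, 1, 2, 3, 4]]
```

A block can satisfy both, one, or neither characterization; the checkers make the distinction between the two reading orders explicit.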
Remark 4.5.
One can prove that StP and PtS are mutually inverse maps; we omit the proof, as we do not use this fact.
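To make the cut & paste machinery concrete, here is a compact 0-indexed sketch (our own illustration, not the authors' code) of CP together with both algorithms. On the worked example block it reproduces the $CP_{(4,1)}$ computation shown earlier, and on small instances PtS inverts StP, as suggested by Remark 4.5.

```python
def cp(rows, i, t):
    """Cut & Paste CP_(i,t): move cells (i, t+1..end) after the unique row
    whose final cell equals rows[i][t]; a no-op when (i, t) ends its row."""
    tail = rows[i][t + 1:]
    if not tail:
        return rows
    rows = [list(r) for r in rows]
    rows[i] = rows[i][:t + 1]
    k = next(j for j, r in enumerate(rows) if j != i and r[-1] == rows[i][t])
    rows[k] = rows[k] + tail
    return rows

def stp(rows):
    """StP (Algorithm 1): scan in parallel order (columns, then rows);
    cut & paste at every first occurrence of a vertex."""
    seen, t = set(), 0
    while len(seen) < len(rows):
        for i in range(len(rows)):
            if t < len(rows[i]) and rows[i][t] not in seen:
                seen.add(rows[i][t])
                rows = cp(rows, i, t)
        t += 1
    return rows

def pts(rows):
    """PtS (Algorithm 2): scan each row in sequential order; cut & paste
    at the row's first unseen vertex, then move to the next row."""
    seen = set()
    for i in range(len(rows)):
        for t in range(len(rows[i])):
            if rows[i][t] not in seen:
                seen.add(rows[i][t])
                rows = cp(rows, i, t)
                break
    return rows

# The worked example: the paper's 1-indexed CP_(4,1) is cp(L, 3, 1) here.
L = [[1], [1, 2], [1, 2, 2, 3], [1, 2, 1, 2, 3, 4]]
after = cp(L, 3, 1)
```

Note that `stp` reads columns-then-rows and `pts` rows-then-columns, mirroring the parallel and sequential orders defined above.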
Lemma 4.6.
Let $L\in \mathrm{Seq}^m_v$. Then $\max_{i\in[n]} \rho_i(L) \le \max_{i\in[n]} \rho_i(\mathrm{StP}(L))$.

Proof. Assume for a contradiction that $\max_{i\in[n]}\rho_i(L) > \max_{i\in[n]}\rho_i(\mathrm{StP}(L))$. This means that each row attaining maximum length in $L$ must have a section cut and pasted onto a row of shorter length by the StP algorithm. However, the StP algorithm runs in parallel order and cannot paste onto a cell it has already read. Thus any row suitable to receive the end of the current row must have its endpoint in the same column as the current cell, or in a column to its right. This cannot decrease the length of the longest row.

We now have what we need to prove that $\tau^v_{seq}(G) \preceq \tau^v_{par}(G)$ for any $G$ and $v\in V(G)$.

Proof of Theorem 4.1.
By Lemma 4.4, StP is a bijection between $\mathrm{Seq}^m_v$ and $\mathrm{Par}^m_v$. Thus we can pair every sequential process $L$ of total length $W(L) = m$ with a unique parallel process $L'$ of total length $W(L') = m$. Both $L$ and $L'$ visit the same vertices with the same frequencies, and thus the probability of each realization of total length $m$ is the same in either process. This implies that the total lengths of the two processes are identically distributed.

Lemma 4.6 states that, for such a pair, the longest row of $L'$ is at least as long as the longest row of $L$. Thus for any $k, m \ge 0$,
$$\Pr\Big[\max_{i\in[n]}\rho_i(L) \ge k \,\Big|\, W(L) = m\Big] \le \Pr\Big[\max_{i\in[n]}\rho_i(L') \ge k \,\Big|\, W(L') = m\Big].$$
This implies the result, since $\tau^v_{seq}(G)$ and $\tau^v_{par}(G)$ are given by the length of the longest row in the sequential and parallel processes respectively.

In the other direction, we now prove $\mathbb{E}[\tau^v_{par}] = O\big(\mathbb{E}[\tau^v_{seq}]\cdot\log n\big)$ for any $G$ and $v\in V$.

Proof of Theorem 4.2.
Let $L$ be a Parallel-IDLA block and let $\sigma$ be a random permutation of $\{1,\ldots,n\}$. Let $\sigma(L)$ be the block that results from permuting the rows of $L$ using $\sigma$. The block $\sigma(L)$ represents a Parallel-IDLA process where conflicts between particles are resolved by giving priority to the particle with least value of $\sigma(\mathrm{index})$ (instead of least index, as in the definition of Parallel-IDLA). For simplicity we fix $\sigma(1) = 1$. Note that $L$ and $\sigma(L)$ have the same rows, and thus the maximum row-length is the same in both blocks. We remark that PtS, Algorithm 2, still produces a valid sequential array even if the input is $\sigma(L)$ instead of $L$.

Let $L$ be an arbitrary parallel array and consider a run of PtS, Algorithm 2, on $\sigma(L)$ where we do not reveal $\sigma$ in advance. Instead, we reveal the permutation $\sigma$ row by row as PtS reads the array in sequential order (in other words, instead of running $\mathrm{PtS}(\sigma(L))$, we equivalently run $\mathrm{PtS}(L)$ but read the rows in random order, starting with row $1$ ($=\sigma(1)$) of $L$, and then rows $\sigma(2), \sigma(3),\ldots,\sigma(n)$; this amounts to replacing $i$ by $\sigma(i)$ in lines 1-5 of Algorithm 2). Note that the cut & paste operation is unaffected by not revealing the order of the rows: the cut & paste transform only pastes behind unread rows, independently of their location in the array $L$, and, what is more, by property (2) there is only one row where a cut section can be pasted.

Consider the longest row in the original block $L$ (chosen arbitrarily if there is more than one). We paint this row red and call its last cell $\xi$. During the run of $\mathrm{PtS}(L)$ the marked cell $\xi$ moves from row to row because of the cut & paste operations. Here is the key observation: if $\ell$ is the length of the original red row and $\xi$ moves no more than $N$ times, then the output array $\mathrm{PtS}(L)$ has a row of length at least $\ell/N$. This holds because the red row was partitioned into at most $N$ pieces, and thus one of the pieces has length at least $\ell/N$.

Let $i_k$ be the iteration (the number of rows read so far) at which PtS reads a row containing the marked cell $\xi$ for the $k$-th time. When we read a row containing $\xi$ for the first time, in iteration $i_1$, we may apply a cut & paste somewhere in this row (if not, we are done). If so, $\xi$ finds itself at the end of an unread row $x_2$ of $L$, which will be read in a (random) iteration $i_2$, i.e. $\sigma(i_2) = x_2$. Note $i_2$ is a uniformly random value in $\{i_1+1,\ldots,n\}$. In iteration $i_2$ we read the row containing the marked cell, and again the algorithm might cut this row and paste its end behind an unread row $x_3$, which will be read at some iteration $i_3$, again uniformly random in $\{i_2+1,\ldots,n\}$, and so on. Each time we make a cut & paste, the index $i_{j+1}$ of the recipient row lies in the latter half of the list $\{i_j+1,\ldots,n\}$ with probability $1/2$. Thus, since PtS works through this list in order, the expected length of the list of possible positions for the next value $i_{j+1}$ halves at every step. We cannot keep halving this list indefinitely: at some point either a row ended by $\xi$ is not cut, or $\xi$ is in the last row to be read (which is never cut). Thus, denoting by $X$ the (random) number of times we cut a row containing the marked cell $\xi$, Markov's inequality gives $X \le C\log n$ with probability at least $1/2$, for a suitable constant $C$. Let $\ell$ be the length of the longest row of $L$, and let $\ell'$ be the random variable given by the length of the longest row of $\mathrm{PtS}(L)$ under a random permutation $\sigma$. Conditional on cutting $L$'s longest row $X$ times, at least one row of length $\ell/X$ remains once the algorithm has terminated. Thus, given the block $L$ with longest row of length $\ell$,
$$\mathbb{E}[\ell' \mid L] \ge \mathbb{E}[\ell' \mid L,\ X \le C\log n]\cdot \tfrac12 \ge \frac{\ell}{2C\log n}.$$
Taking expectation over all blocks $L$ generated by a Parallel-IDLA with a random $\sigma$, we conclude the result.

Recall that in the Sequential-IDLA we run the walks one by one in order, and walk $i+1$ starts only after walk $i$ has settled, while in the Parallel-IDLA all particles walk simultaneously until they settle, breaking ties by settling the particle with smallest index. In either case we are interested in the longest walk. Another natural way to run the IDLA process is in uniform order: we choose a random unsettled particle and move it to a random neighbouring vertex, on which it settles if the vertex is unoccupied. We call this process the Uniform-IDLA; it can be seen as lying between the Sequential and Parallel-IDLA models. To sample from the Uniform-IDLA process, we first consider an infinite sequence $R = (R_t)$ where the $R_t$ are independent random variables sampled from $\{1,\ldots,n\}$. Then we run the Uniform-IDLA as follows. First, particle $1$ settles at the origin, so the origin is occupied.
Then, at each time-step $t \ge 1$, if particle $R_t$ is unsettled, it moves to a random neighbour; otherwise it stays in its current location. If that neighbour is not occupied, particle $R_t$ settles on it and the vertex becomes occupied.

Clearly, for some sequences $R$ the process may never terminate, for example $R = (1,1,1,\ldots)$. We say that a sequence $R$ on $n$ indices is index-repeating if for any index $i\in\{2,\ldots,n\}$ and any $T\ge 0$ there exists some $t\ge T$ such that $R_t = i$.

Remark 4.7.
For any fixed $n$ and any fixed distribution $D$ on the indices with full support, the sequence $R$ obtained by sampling indices independently according to $D$ is index-repeating almost surely.

Given an index-repeating ordering $R$, we can find a bijection between the Uniform-IDLA and the Parallel-IDLA. An $R$-block is defined in the same fashion as a parallel block, i.e. $L(i,j)$ represents the position of the $i$-th particle after $j$ jumps; additionally, we associate to every $(i,j)\in I_L$ an integer $T(i,j)$. This $T$ is called the timing array and is defined by $T(i,j) = t$ if $R_t = i$ for the $j$-th time, and $T(i,0) = 0$ for all particles $i$. Note that using the block and the timing array we can reconstruct the uniform process, as we have not only the paths but also the time-steps at which the particles moved. Whenever we speak of an $R$-block we shall assume that $R$ is index-repeating.

The bijection between an $R$-block and a parallel block is defined algorithmically in the same fashion as before. To transform an $R$-block into a parallel block we just apply StP, Algorithm 1, to the $R$-block, oblivious to $R$, since StP reads in parallel order. However, to transform a parallel block into an $R$-block, we must read the block in the order given by $T(i,t)$ (i.e. read the cell with the smallest value of $T(i,t)$, then the second smallest, and so on) and apply $CP_{(i,j)}$ whenever the vertex $L(i,j)$ is read for the first time. It is very important that, when applying the cut & paste operation, we move not only the cells containing a portion of the path but also the times $T(i,t)$ associated with those cells: if cell $(i,t)$ moves to $(j,s)$, then $T(j,s)$ gets the value of $T(i,t)$, while $T(i,t)$ is left undefined. Pseudo-code for the procedure we have just described is given in Algorithm 3.

Algorithm 3: Parallel to R-Uniform (PtU_R)
Result: transforms a parallel array L and an order sequence R into an R-Uniform array
S ← ∅;
C ← list of cells (i,j) ordered by T(i,j) in increasing order;
k ← 0;
while |S| < n do
  k ← k + 1;
  (i,j) ← C(k);
  if L(i,j) ∉ S then
    S ← S ∪ {L(i,j)}; L ← CP_(i,j)(L);
  end
end
return L;

Let $\mathrm{Unif}^m_{R,v}$ be the set of all Uniform-IDLA blocks with ordering $R$ starting from $v$ with total number of steps $m$. Then, using arguments similar to the sequential-parallel case, we obtain the following.

Theorem 4.8.
For any fixed index-repeating sequence $R$ on $n$ indices, there is a bijection between $\mathrm{Unif}^m_{R,v}$ and $\mathrm{Par}^m_v$. Moreover, the number of steps taken by the longest walk of the Uniform-IDLA is stochastically dominated by the number of steps taken by the longest walk of the Parallel-IDLA.

Proof. The bijection follows from the injectivity and correctness of StP and PtU$_R$ (as in Lemma 4.4). Then, as in the proof of Theorem 4.1, we run StP and apply Lemma 4.6. The lemma still applies, since StP is oblivious to the ordering of the input array.

Observe, however, that the dispersion time of the Uniform array is not determined purely by the number of steps in (i.e. the length of) the longest row, but by the values $T(i,j)$ of the timing array.

Continuous-Time Processes

In this section we consider continuous-time versions of the Sequential and Uniform-IDLA processes. By this we mean running these IDLA processes with random walks that jump at exponential rate 1. We shall need the following concentration result.
Lemma 4.9.
Let $X$ be a $\mathrm{Gamma}(n,\lambda)$ random variable and $Y = \sum_{i=1}^n Y_i$, where the $Y_i$ are independent $\mathrm{Geo}(p)$ random variables. Then for any $\delta > 0$ and $0 < \varepsilon < 1$:

(i) $\Pr\big[X \ge (1+\delta)\frac{n}{\lambda}\big] \le e^{-\delta n/2}$ and $\Pr\big[X \le (1-\varepsilon)\frac{n}{\lambda}\big] \le e^{-\varepsilon n/2}$.

(ii) $\Pr\big[Y \ge (1+\delta)\frac{n}{p}\big] \le e^{-\frac{\delta^2 n}{2(1+\delta)}}$ and $\Pr\big[Y \le (1-\varepsilon)\frac{n}{p}\big] \le e^{-\frac{\varepsilon^2 n}{2-4\varepsilon/3}}$.

Proof. Item (i): If $X$ is $\mathrm{Gamma}(n,\lambda)$, then $\mu := \mathbb{E}[X] = n/\lambda$ and $\mathbb{E}[e^{tX}] = (1-t/\lambda)^{-n}$ for all $t<\lambda$. Now, by Markov's inequality, for any $t<\lambda$,
$$\Pr[X \ge (1+\delta)\mu] \le e^{-t(1+\delta)n/\lambda}\,\mathbb{E}[e^{tX}] = e^{-t(1+\delta)n/\lambda}\,(1-t/\lambda)^{-n} \le e^{-t(1+\delta)n/\lambda}\, e^{tn/\lambda} \le e^{-\delta t n/\lambda}.$$
By considering $-X$ one can similarly show $\Pr[X \le (1-\varepsilon)\mu] \le e^{-\varepsilon t n/\lambda}$, provided $\varepsilon < 1$. Since $t<\lambda$ was arbitrary, the result follows by choosing $t = \lambda/2$.

Item (ii): Following [15], if $Y \ge k$ then there are fewer than $n$ successes in the first $k$ Bernoulli trials with success probability $p$. Thus $\Pr[Y \ge (1+\delta)n/p] \le \Pr[\mathrm{Bin}((1+\delta)n/p,\,p) < n]$, and
$$\Pr[\mathrm{Bin}((1+\delta)n/p,\,p) < n] = \Pr[\mathrm{Bin}((1+\delta)n/p,\,p) < (1+\delta)n - \delta n] \le e^{-\frac{\delta^2 n}{2(1+\delta)}},$$
by [17, Thm. 3.2]. Similarly, for the lower bound,
$$\Pr[Y \le (1-\varepsilon)n/p] \le \Pr[\mathrm{Bin}((1-\varepsilon)n/p,\,p) \ge n] \le e^{-\frac{\varepsilon^2 n^2}{2((1-\varepsilon)n + \varepsilon n/3)}} = e^{-\frac{\varepsilon^2 n}{2-4\varepsilon/3}}.$$

For the Sequential-IDLA it is easy to consider its continuous-time analogue, the ContSeq-IDLA: we simply let the random walks jump at the times of a Poisson process of intensity 1. Moreover, we can easily sample from the ContSeq-IDLA by sampling a standard (discrete-time) Sequential-IDLA and then inserting independent exponential times of mean 1 between the jumps. Let $\tau^v_{c-seq}(G)$ be the time taken by the slowest particle to settle in the ContSeq-IDLA started from $v\in V$.

Theorem 4.10.
Let G and v ∈ V ( G ) . Then(i) τ vc − seq ( G ) = Θ (cid:0) τ vseq ( G ) (cid:1) holds w.h.p. and in expectation.If in addition τ vseq ( G ) = ω (log n ) then(ii) τ vc − seq ( G ) = (1 + o (1)) · τ vseq ( G ) holds w.h.p. and in expectation. roof. We sample a ContSeq-IDLA by sampling L , a Sequential-IDLA, and a Exp(1) randomvariable for each walk step in L . Thus conditional on L walk i in the ContSeq-IDLA has lengthGamma( τ i , E ℓ be the event that L contains at least one row of length at least ℓ ≥ ρ ∗ being length of the longest row of our sampled Sequential-IDLA we can stochastically dominate the length of any walk in the ContSeq-IDLA by an indepen-dent Gamma( ρ ∗ ,
1) random variable. Observe that Pr (cid:2) τ vc − seq > ρ ∗ + δρ ∗ | ρ ∗ (cid:3) ≤ n · e − δρ ∗ / byLemma 4.9 (i). Thus conditional on E ℓ we can take δ = (4 log n ) /ℓ so w.p. 1 − o (1 /n ) at most4 log n extra steps are taken by any continuous walk. This gives Pr (cid:2) τ vc − seq ≤ (1 + 4 log( n ) /ℓ ) τ vseq (cid:3) ≥ − Pr [ ( E ℓ ) c ] − o (1 /n ) . (5)Let ρ ∗ ≥ L , and observe that Pr (cid:2) τ vc − seq > ρ ∗ + i log n | ρ ∗ (cid:3) ≤ ne − ( i log n ) / follows by taking δ = ( i log n ) /ρ ∗ in Lemma 4.9 (i). Thus E (cid:2) τ vc − seq | ρ ∗ (cid:3) ≤ ρ ∗ + 2 log n + (log n ) · ∞ X i =2 ne − ( i log n ) / = ρ ∗ + O (log n ) . (6)Observe that E [ ρ ∗ ] = E (cid:2) τ vseq (cid:3) as ρ ∗ is the longest row of L , thus E (cid:2) τ vc − seq (cid:3) ≤ E (cid:2) τ vseq (cid:3) + O (log n ).Note by definition that if Pr [ E ℓ ] = 1 − o (1) then E (cid:2) τ vseq (cid:3) ≥ (1 − o (1)) ℓ . Lemma 3.11 states thatfor any graph and any start vertex, Pr (cid:2) E (log n ) / (cid:3) = 1 − o (1). Thus the upper bound in (i) holdsin expectation. The upper bound in (ii) follows similarly assuming Pr (cid:2) E ω (log n ) (cid:3) = 1 − o (1). By(5) the upper bound in cases (i) & (ii) also hold w.h.p..For the lower bounds take one walk from L of the maximum length ρ ∗ and consider its lengthin the ContSeq-IDLA process. This has Gamma distribution Gamma( ρ ∗ ,
1) and thus Pr h τ vc − seq < ρ ∗ − √ ℓ (cid:12)(cid:12)(cid:12) ρ ∗ i · E ℓ ≤ e −√ ℓ/ by Lemma 4.9 (i). The w.h.p. lower bounds for (i) and (ii) follow by taking expectations of theabove, since in both cases ℓ = Ω(log n ) and Pr [ ( E ℓ ) c ] = o (1). This holds in expectation also.It is natural to consider the continuous-time version of the Uniform-IDLA, we call this theCTU-IDLA. In this process each particle has an exponential clock with rate 1. Then, as long as theparticle is not settled, when the clock rings the particle moves to a random neighbour and settles ifpossible. Note that this is equivalent to running the discrete-time Uniform-IDLA with a uniformlyrandom sequence R but waiting an amount of time distributed Exp(1 / ( n − R (recall particle 1 occupies the origin and R t takes values in { , . . . , n } ). Alternatively, wecan sample the CTU-IDLA by using PtU R , Algorithm 3. First, sample a L , a (discrete-time)Parallel-IDLA, then run Algorithm 3 but using a list C built from a timing array T populated asfollows: Set T ( i,
0) = 0 for each i. Then let T(i, j+1) = T(i, j) + X_{i,j}, where {X_{i,j}}_{i ∈ [n], j ∈ N} are independent Exp(1) random variables. We shall name this procedure PtU_C. This procedure can also be seen as running Algorithm 3, but instead of using the list C to choose the next cell (i, t) (line 1), each row of the block has an exponential clock of mean 1. When the clock of row i rings, the algorithm chooses the first unread cell of row i (if it exists), and proceeds with line 2. One can show this algorithm is (almost surely) correct due to the bijection between Unif^m_{R,v} and Par^m_v for a fixed ordering R established in Theorem 4.8 and Remark 4.7. Let τ^v_{c-unif} be the time it takes the CTU-IDLA started from v to settle all particles.

Theorem 4.11.
Let G be a connected graph and v ∈ V(G). Then
(i) τ^v_{c-unif}(G) = Θ(τ^v_par(G)) holds w.h.p. and in expectation. If in addition τ^v_par(G) = ω(log n), then
(ii) τ^v_{c-unif}(G) = (1 + o(1)) · τ^v_par(G) holds w.h.p. and in expectation.

Proof. We can sample a CTU-IDLA as described above by sampling L, a Parallel-IDLA, and running PtU_C on L. By Theorem 4.8 the longest row of PtU_C(L) is no longer than the longest row of L; thus, conditional on L, walk i in the CTU-IDLA has length Gamma(τ_i, 1). Let E_ℓ be the event that L contains at least one row of length at least ℓ ≥ 1. Take a row of L of the maximum length ρ* (assume the label of this row is i) and consider the action of PtU_C on the cells in this row. If no cut is made during the running of PtU_C(L), then the length of row i stochastically dominates a Gamma(ρ*, 1) random variable. Suppose instead that a Cut & Paste transform is applied to i at a cell containing vertex w and the remainder of this row is pasted onto row j. Although row j may have contained fewer cells before w than the number of steps taken by row i to reach w, the amount of time (with respect to the clocks) it takes particle j to reach w must be at least as long as the time for particle i to reach w (otherwise the Cut & Paste would not have been applied). Thus, conditional on this Cut & Paste, the length of row j stochastically dominates a Gamma(ρ*, 1) random variable. Now, as in Theorem 4.10,

Pr[τ^v_{c-unif} < τ^v_par − ℓ^{1/4}·√(τ^v_par) | E_ℓ] ≤ e^{−√ℓ/4}

by Lemma 4.9 (i). The lower bounds for (i) and (ii) follow since in both cases ℓ = Ω(log n).

Consider the lazy versions of the discrete-time Sequential- and Parallel-IDLA models, where with probability 1/2 a particle stays in place at each step. Let τ^v_{L-seq}(G), τ^v_{L-par}(G) be the number of steps needed to complete the lazy Sequential, respectively lazy Parallel, IDLA process started from v. Although we are mainly concerned with the simple random walk IDLA models, we would like to be able to switch to the lazy setting at times, as it allows us to use mixing time results. For the Sequential it is fairly clear that up to lower order terms the lazy Sequential is a factor of 2 slower than the simple Sequential; using the continuous-time Uniform-IDLA we can show this for the Parallel-IDLA as well.

Proof of Theorem 4.3.
We begin by proving the results for the sequential processes. Let E_ℓ be the event that L contains at least one row of length at least ℓ ≥ 1. We sample a lazy Sequential-IDLA by coupling with a simple Sequential-IDLA L and adding in lazy steps w.p. 1/2: each walk i with length τ_i in the sequential process has length Σ_{j=1}^{τ_i} Y_j in the coupled lazy process, where the Y_j are independent Geo(1/2) random variables. For the upper bound let ρ* be the length of L's longest row; then we have

Pr[τ^v_{L-seq} > 2ρ* + kρ* | ρ*] ≤ n · e^{−k²ρ*/(8(k+1))}

by Lemma 4.9 (ii). Similarly to Theorem 4.10, the w.h.p. upper bounds for τ^v_{L-seq} follow by conditioning on E_ℓ for the two cases of ℓ in the statement.

For upper bounds in expectation: if we condition on (E_ℓ)^c then 1 ≤ ρ* ≤ ℓ, and it follows that

E[τ^v_{L-seq} | (E_ℓ)^c] < 2ℓ + 6ℓ log n + (2ℓ log n) · Σ_{i=3}^∞ Pr[τ^v_{L-seq} > 2ℓ + 2iℓ log n] ≤ O(ℓ log n) + (2ℓ log n) · Σ_{i=3}^∞ e^{−i} = O(ℓ log n).

If ℓ ≥ (log n)/14 then by Lemma 4.9 (ii)

Pr[τ^v_{L-seq} > 2ρ* + √(iρ* log n) | ρ*] · 1_{E_ℓ} ≤ n · e^{−(i log n)/(8(√(i log n/ρ*) + 1))} ≤ n · e^{−(√i log n)/16},

thus E[τ^v_{L-seq} · 1_{E_ℓ} | ρ*] = 2ρ* + O(√(ρ* · log n)), similar to (6). By Jensen's (concave) inequality,

E[τ^v_{L-seq}] ≤ 2E[τ^v_seq] + O(√(E[τ^v_seq] · log n)) + O(ℓ · log n) · Pr[(E_ℓ)^c]. (7)

Thus for any graph G and v ∈ V, E[τ^v_{L-seq}] = O(E[τ^v_seq]) by Lemma 3.11. If Pr[(E_ℓ)^c] ≤ 1/ℓ for some ℓ = ω(log n), then E[τ^v_{L-seq}] ≤ (2 + o(1)) E[τ^v_seq] by (7).

For the w.h.p. lower bound: conditional on i being a walk of maximum length ρ* in L, walk i in the L-Seq-IDLA has length Σ_{j=1}^{ρ*} Y_j, where Y_j ~ Geo(1/2). Hence

Pr[τ^v_{L-seq} < 2ρ* − (log n) · √ρ* | ρ*] · 1_{E_ℓ} ≤ e^{−(log n)²/8} = o(1).

Thus, since Pr[(E_ℓ)^c] = o(1) for any G and v ∈ V by Lemma 3.11, taking expectations of the equations above yields τ^v_{L-seq} ≥ (2 − o(1)) τ^v_seq w.h.p.; thus this also holds in expectation.

We now prove the bounds for the Parallel processes; the proof technique is slightly different. First assume that for any G, v ∈ V and some ℓ = ω(log n) we have Pr[(E_ℓ)^c] = o(1). In this case we know that τ^v_par(G) = (1 + o(1)) τ^v_{c-unif}(G) w.h.p. and in expectation from Theorem 4.11. Consider the CTU-IDLA run with clocks of mean 2, and use τ^{v,2}_{c-unif}(G) to denote the dispersion time of this process. It is clear that we can couple the clocks of mean 1 and 2 to give τ^{v,2}_{c-unif}(G) = (2 + o(1)) τ^v_{c-unif}(G) w.h.p. and in expectation.
Note that sampling from this process is equivalent to sampling from the CTU-IDLA of mean 1, but ignoring each ring of the clock with probability 1/2. Let G̃ be the graph G with as many self-loops added at each vertex as it has neighbours; then τ^v_{c-unif}(G̃) has the same distribution as τ^{v,2}_{c-unif}(G), and likewise τ^v_par(G̃) and τ^v_{L-par}(G) are equidistributed. Theorem 4.11 applied to G̃ yields τ^{v,2}_{c-unif}(G) = (1 + o(1)) τ^v_{L-par}(G) w.h.p. and in expectation. Combining these relations yields

τ^v_{L-par}(G) = τ^{v,2}_{c-unif}(G) = (2 + o(1)) · τ^v_{c-unif}(G) = (2 + o(1)) · τ^v_par(G),

w.h.p. and in expectation. For a general graph G and v ∈ V the exact same argument works, however each of the equalities above then holds only up to a Θ(1) factor; the result follows.

In this section we determine the dispersion time for many well-known graph topologies.

5.1 The Complete Graph
We shall begin with the clique, as this is the simplest to analyse.
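Both processes are straightforward to simulate on the clique, since from any vertex one step of the walk is just a uniform choice among the other n − 1 vertices. The following minimal Python sketch (the function names and tie-breaking rule are ours, not from the paper) runs each process once from the origin 0 and returns its dispersion time together with the set of occupied vertices:

```python
import random

def sequential_idla_clique(n, rng):
    """Sequential-IDLA on K_n from origin 0.

    Returns (dispersion time = length of the longest walk, occupied set)."""
    occupied = {0}                      # the first particle settles at the origin
    longest = 0
    for _ in range(n - 1):              # remaining particles move one at a time
        pos, steps = 0, 0
        while pos in occupied:          # walk until an unoccupied vertex is found
            pos = (pos + 1 + rng.randrange(n - 1)) % n   # uniform *other* vertex
            steps += 1
        occupied.add(pos)
        longest = max(longest, steps)
    return longest, occupied

def parallel_idla_clique(n, rng):
    """Parallel-IDLA on K_n from origin 0: all unsettled particles move each round.

    Returns (dispersion time = number of rounds, occupied set)."""
    occupied = {0}
    walkers = [0] * (n - 1)             # current positions of the unsettled particles
    rounds = 0
    while walkers:
        rounds += 1
        moved = [(p + 1 + rng.randrange(n - 1)) % n for p in walkers]
        walkers = []
        for p in moved:
            if p not in occupied:       # settles; if two particles land on the same
                occupied.add(p)         # free vertex, the earlier one in the list wins
            else:
                walkers.append(p)
    return rounds, occupied
```

Note that the stochastic domination τ^v_seq ⪯ τ^v_par of Theorem 4.1 is a statement about a coupling of the two processes; two independent runs as above need not satisfy it pathwise.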
Theorem 5.1.
Let K_n be the complete graph on n vertices and let κ_cc be as in Lemma 5.2. Then

t_par(K_n) ∼ (π²/6) · n  and  t_seq(K_n) ∼ κ_cc · n,

where

κ_cc := Σ_{i=1}^∞ (−1)^{i+1} ( 2/(i(3i − 1)) + 2/(i(3i + 1)) ) ≈ 1.255.

Before proving the above we state a result needed to treat the Sequential-IDLA on cliques.
Lemma 5.2 ([14]). Let T := T_n be the maximum of n independent geometric random variables with parameters i/n for 1 ≤ i ≤ n. Then the limit lim_{n→∞} E[T]/n exists and is equal to κ_cc.

The constant κ_cc is related to the longest waiting time in the coupon collector process [14].

Proof of Theorem 5.1.
Instead of analyzing the parallel process, we analyze the continuous-time Uniform-IDLA process (CTU-IDLA), in which each particle has an exponential clock of rate 1 and moves every time its clock rings, until the particle settles. By Theorem 4.8 we have that the dispersion times of the Parallel-IDLA process and the CTU-IDLA process are asymptotically equal as long as the dispersion time of the Parallel-IDLA is ω(log n) w.p. 1 − o(1/log n). This property holds trivially: the last particle in the Sequential-IDLA takes a geometric time of mean n to settle, and this carries over to the Parallel-IDLA due to the stochastic domination τ^v_seq ⪯ τ^v_par of Theorem 4.1. The analysis of the CTU-IDLA is quite simple: since particles move in continuous time, no two particles settle at the same time. Suppose there are k unsettled particles; then the time needed until one of the k particles settles in one of the k unoccupied vertices is exponentially distributed with mean (n − 1)/k². Summing up from k = 1 to n − 1 gives n · Σ_{k≥1} k^{−2} = n · (π²/6 − o(1)).

For t_seq, the longest walk in the Sequential-IDLA on K_n is the longest waiting time in the Coupon Collector problem. This time is distributed as the maximum of n independent geometric random variables with parameters (n − i + 1)/n for 1 ≤ i ≤ n. The result follows from Lemma 5.2.

Remark 5.3.
Observe that κ_cc ≈ 1.255 while π²/6 ≈ 1.645, so the two constants are distinct.

Let P_n be the path with n vertices. Interestingly, the path provides an example where the sequential and parallel dispersion processes take the same time up to lower order terms.

Theorem 5.4.
Let M be the maximum of n independent random variables, each distributed as the hitting time of the vertex n by a random walk on P_n started from vertex 1. Then for the dispersion time,

t_seq(P_n) = (1 ± o(1)) · E[M] = t_par(P_n).

Proof.
In the following, we will denote by t_seq(m) the expected running time of the Sequential-IDLA on a path with m vertices, when the source is the endpoint labelled 1. Let Y_1, Y_2, …, Y_n be a collection of n independent random variables, each describing the hitting time of a random walk from endpoint 1 to vertex n − n/log n (thus Y_i ~ τ_hit(1, n − n/log n)). In particular, these random walks do not settle and are therefore completely independent.

The proof is based on the following chain of inequalities:

t_seq(n − n/log n) ≤(1) t_par(n − n/log n) ≤(2) E[max_{1 ≤ i ≤ n − n/log n} Y_i] ≤(3) (1 + o(1)) · E[max_{1 ≤ i ≤ n/log n} Y_i] ≤(4) (1 + o(1)) · t_seq(n),

and then finally

t_seq(n) ≤(5) (1 + o(1)) · t_seq(n − n/log n);

if all these inequalities hold, the claims of the theorem are established. Note that inequality (1) is a direct consequence of Theorem 4.1, and inequalities (2) and (4) follow directly from the definitions of the Parallel-IDLA and Sequential-IDLA, respectively. Thus it only remains to prove (3) and (5).

We first prove (3); in fact, for notational convenience we will establish the stronger claim

E[max_{1 ≤ i ≤ n} Y_i] ≤ (1 + o(1)) · E[max_{1 ≤ i ≤ n/log n} Y_i],

i.e., on the left hand side we take the maximum over n random variables instead of just n − n/log n. To simplify notation, define Ỹ := max_{1 ≤ i ≤ n/log n} Y_i and Y := max_{1 ≤ i ≤ n} Y_i. In order to prove that E[Ỹ] and E[Y] are close, consider a coupling where we first expose the values of the set {Y_1, Y_2, …, Y_n} and then assign those values through a random permutation. Next, denote by F the random variable counting the Y_j's which are larger than Ỹ; in symbols,

F := |{ n/log n < j ≤ n : Y_j > Ỹ }|.
Next note that for any λ ≥ 1,

Pr[F ≥ λ log n] ≤ Π_{i=1}^{n/log n} (1 − (λ log n)/(n − i + 1)) ≤ Π_{i=1}^{n/log n} exp(−(λ log n)/n) ≤ exp(−λ).

The first inequality holds by considering the probability that the random ordering does not "choose" any of the λ log n longest walks for one of the first n/log n walks; i.e., if we have chosen k values so far, none of them being one of the λ log n longest, then we choose a long walk next time w.p. at least (λ log n)/(n − k). Thus for λ = 2 log n, Pr[F ≥ 2 log² n] ≤ n^{−2}.

Consider now the gap between the (2 log² n)-th largest element of the values {Y_1, Y_2, …, Y_n} and the maximum. To this end, we will use the principle of deferred decisions and expose the n trajectories in parallel, stopping as soon as there are at most 2 log² n walks which have not hit the other endpoint. Hence suppose we order these values such that w.l.o.g. Y_1 ≤ Y_2 ≤ ⋯ ≤ Y_n. Then for any j ≥ n − 2 log² n, the random variable Y_j − Y_{n − 2 log² n} is stochastically dominated by one plus the hitting time from 1 to n, so in particular E[Y_j − Y_{n − 2 log² n}] = O(1 + n²). Furthermore, using the fact that from any start point a random walk reaches the vertex n within 2n² steps with probability at least 1/2, it follows that for any λ > 0,

Pr[Y_j − Y_{n − 2 log² n} > λ · 2n²] = O(2^{−λ}). (8)

Choosing λ = C log log n for some large constant C > 0, it follows by the union bound over the at most 2 log² n indices j ∈ F that

Pr[Y ≥ Ỹ + O(n² log log n) | F ≤ 2 log² n] ≤ n^{−1}.

To conclude, it follows by the union bound that w.p. at least 1 − 1/log n our coupling satisfies

Y − Ỹ ≤ C · n² log log n.

Otherwise, we still have E[Y − Ỹ | E] = O(n² log n) + O(n² log log n), where E denotes the event that any of the above probabilistic arguments fail. The result follows since Pr[E] = O(1/log n).

We now prove inequality (5). To this end we construct a coupling between the n walks in t_seq(n) and the n − n/log n walks in t_seq(n − n/log n). Consider the first n/log n random walks in the t_seq(n) setting. For each of them the expected time to settle is O(n²/log² n), and by an argument similar to (8), none of them will take more than O(n²) steps with probability 1 − n^{−ω(1)}. The trajectories of the next n − n/log n walks of t_seq(n) can be coupled with the ones in t_seq(n − n/log n): if a walk moves from vertex x to x + 1 in t_seq(n − n/log n), then the corresponding walk in t_seq(n) moves from x + n/log n to x + 1 + n/log n. The only difficulty arises when the walk in t_seq(n) is at a vertex between 1 and n/log n. To capture this, we consider so-called excursions, which are epochs in which the random walk is at such a vertex. Notice that the total number of steps taken as part of any excursion is at most the total number of visits to the vertices 1, 2, …, n/log n. However, the expected number of visits to any of these vertices is O(n log n) for a random walk of O(n² log n) steps, and thus by a standard Chernoff bound for random walks it follows that any of these vertices is visited at most O(n log n) times with probability at least 1 − n^{−2}. Thus by the union bound, the total number of steps spent in any excursion is at most O(n²) with probability at least 1 − n^{−2}.

To conclude, we have shown that with probability at least 1 − n^{−2} there is a coupling between τ_seq(n) and τ_seq(n − n/log n) such that

τ_seq(n) ≤ τ_seq(n − n/log n) + O(n²).

Note that we can verify whether this coupling holds by inspecting only the first O(n² log n) steps of the random walks. Thus even conditional on the coupling failing, we have t_seq(n) = O(n² log n). Since t_seq(n) = Ω(n² log n), it follows that for the expected values,

t_seq(n) ≤ (1 + o(1)) · t_seq(n − n/log n).

We call a graph an expander if 1 − λ = Ω(1), where λ is the second largest absolute eigenvalue.

Theorem 5.5.
Let G be an n-vertex almost-regular expander graph. Then t_seq(G), t_par(G) = Θ(n).

Proof. The lower bound for t_seq follows from Theorem 3.7. The upper bound on t_par(G) follows from Corollary 3.5. The result then follows since t_seq(G) ≤ t_par(G) by Theorem 4.1.

Remark 5.6.
In particular this result covers (w.h.p.) random d-regular graphs, for fixed d, and the binomial random graph G(n, p) above the connectivity threshold, i.e. when np ≥ c log(n) for c > 1.

The hypercube H_d, where n = 2^d, is the graph whose vertices are the binary strings of length d, two vertices being connected if their associated binary strings differ in exactly one digit. The hypercube is not an expander, since 1 − λ = 1/d = 1/log₂ n; however, we still achieve a linear bound.

Theorem 5.7.
Let H_d be the hypercube with n = 2^d vertices. Then t_seq(H_d), t_par(H_d) = Θ(n).

Proof. The lower bound for t_seq follows from Theorem 3.7. Due to Theorem 4.1 we only need to find an upper bound for t_par. As laziness only changes the dispersion time by a constant factor, we work with lazy walks. For the upper bound we seek to apply Theorem 3.3; however, unlike in Theorem 5.5, we shall use an argument based on return probabilities in H_d to bound hitting times rather than appealing to Lemma A.2. Also note that, since the sum in Theorem 4.1 only has O(log n) terms and by monotonicity of hitting times of sets, it will be sufficient to cover the case 1 ≤ |S| ≤ (log n)²/2. If we can prove that the hitting time is O(n/|S|) in this case then we are done. We divide time into epochs of length 2(log n)² and prove that the probability we hit S in one epoch is at least Ω((log n)²|S|/n). In the first (log n)² steps of an epoch we allow the walk to mix, ignoring whether the walk hits S; then with high probability we can couple our walk with the stationary distribution. In the second (log n)² steps of an epoch we observe whether the walk hits S. Let Z be the random variable which counts the number of visits to the set S in (log n)² steps. Then Pr_π[τ_S ≤ (log n)²] = Pr_π[Z ≥ 1] and

Pr_π[Z ≥ 1] = E_π[Z]/E_π[Z | Z ≥ 1] ≥ ((log n)²|S|/n) / max_{u ∈ S} Σ_{t=0}^{(log n)²} p̃^t_{u,S}.

Claim 5.8. For any set S and u ∈ S, if |S| ≤ (log n)²/2, then Σ_{t=0}^{(log n)²} p̃^t_{u,S} = O(1).

Thus by Claim 5.8 (proved later) Pr_π[τ_S ≤ (log n)²] = Ω((log n)²|S|/n), and so t_hit(π, S) ≤ O(max{n/|S|, n/log n}). Thus, as discussed earlier, the upper bound follows from Theorem 3.3. The proof of Claim 5.8 will make heavy use of [19, Lem. 7]; we paraphrase it here for convenience:

Lemma 5.9. Let W(i), i ≥ 0, be the lazy walk in H_d and T = (log n)². Then, for any v ∈ V,
(i) R_v := Σ_{t=0}^{T} p̃^t_{v,v} = 2 + 2/d + O(1/d²).
(ii) Suppose W(0) is at distance at least 2 from v (resp. at least 3 from v). The probability W visits Γ(v) within L = O(T log n) steps is P(2, L) = O(1/d) (resp. P(3, L) = O(1/d²)).
(iii) Let C ⊆ {v} ∪ N(v). For a walk starting from u ∈ C, let R_C denote the expected number of returns to C within T steps. Then, in the lazy walk, R_C = 2 + O(1/d).

Proof of Claim 5.8. Let C = {u} ∪ (Γ(u) ∩ S). Let R_2(u, t) (resp. R_{≥3}(u, t)) be the expected number of visits to a vertex at distance 2 (resp. distance ≥ 3) from u before time t. Then if T = (log n)² we have

Σ_{i=1}^{(log n)²} p̃^i_{u,S} ≤ R_C + (log n)² · R_2(u, T) + (log n)² · R_{≥3}(u, T), (9)

where the first term counts returns to u and to the portion of S in u's neighbourhood, the second term counts visits to members of S at distance 2, and the third to those at distance 3 or greater (where we recall that |S| ≤ (log n)²/2). Now R_2(u, T) ≤ P(2, T) · R_v, where v is any vertex, by transitivity of H_d. Thus by Lemma 5.9, R_2(u, T) ≤ O(1/d) · (2 + O(1/d)) = O(1/d). Similarly R_{≥3}(u, T) ≤ P(3, T) · R_v = O(1/d²). Note R_C = O(1) by Lemma 5.9 (iii). It follows from (9) that Σ_{i=1}^{T} p̃^i_{u,S} = O(1).

5.4 Tori and Grids

Let B(r) := {x ∈ Z^d : x_1² + ⋯ + x_d² ≤ r²} be the ball of radius r in Z^d.

Lemma 5.10.
Let d = 1, 2 be fixed. For any β > 0 there exists some C > 0 such that the random walk of length Ct log t from the origin in Z^d does not exit B(√t) with probability at least 1/t^β.

Proof. Let S_j be the position of a random walk at time j started from 0. For t > 0 let E_0 be the event {S_j ∈ B(√t/2) for all 0 ≤ j ≤ c_0 t − 1}. By the Central Limit Theorem [34], for all ε > 0 there is c_0 > 0 such that

Pr[ S_{c_0 t}/√(c_0 t) ∉ B(√t/(2√(c_0 t))) ] ≤ (1 + O(1/√t)) ∫_{R^d \ B(c)} e^{−|x|²/2}/(2π)^{d/2} dx ≤ ε.

Thus by the Reflection Principle [36, Prop. 1.6.2] the probability that a random walk stays within the ball B(√t/2) for c_0 t units of time is at least 1 − ε. For i ≥ 1 let E_i be the event

{S_j ∈ B(√t) for all i·c_0 t ≤ j ≤ (i+1)·c_0 t − 1} ∩ {S_{(i+1)·c_0 t − 1} ∈ B(√t/2)}.

By geometric considerations we see that Pr[E_{i+1} | E_i] ≥ (1 − ε)/2^d ≥ 1/(2^d + 1) for small enough c_0 > 0. Observe that {S_k ∈ B(√t) for all 0 ≤ k ≤ α c_0 t log t} ⊇ ∩_{i=0}^{α log t} E_i, for any α > 0. Thus for any fixed β > 0 and α ≤ β/log(2^d + 1) we have

Pr[S_k ∈ B(√t) for all 0 ≤ k ≤ α c_0 t log t] ≥ (1 − ε) · (1/(2^d + 1))^{α log t} ≥ 1/t^β.

The result follows by taking C = α c_0.
Theorem 5.11.
For the path/cycle, Θ(n² log n) steps are needed in expectation and with probability at least 1 − o(1).

Proof. The upper bound for either graph follows from Lemma 3.1. For the lower bound on the cycle: if at some time an interval [−a, b] has been settled around the origin, then by the gambler's ruin formula the end point closest to the origin receives the next particle with probability at least 1/2. Thus by a Chernoff bound, w.h.p. after 2n/3 particles have settled, the interval [−n/4, n/4] is occupied. Each of the remaining n/3 particles must exit B(n/4) in order to settle. Thus by Lemma 5.10 there is some C > 0 such that the probability that at least one of these walks takes longer than Cn² log(n) to exit B(n/4) is at least 1 − (1 − 1/n^β)^{n/3} = 1 − o(1). The result for the path follows by similarly considering a return to the origin as a change in parity for a walk on the cycle, and adding settled vertices to both ends simultaneously.

The next result does not settle the dispersion time on the two-dimensional grid, but improves on the trivial Ω(n) bound.

Proposition 5.12.
Let G be either the finite box [−⌊√n/2⌋, ⌊√n/2⌋]² ⊂ Z² in the two-dimensional grid, or the two-dimensional finite torus on n vertices. Then t_seq(G), t_par(G) = Ω(n log n).

Proof. We will prove the lower bound for t_seq only, since the corresponding lower bound for t_par follows from t_par ≥ t_seq.

Let A(t) denote the aggregate of the Sequential-IDLA once t particles have settled. Theorem 1 of [30] states that for each γ there exists an a = a(γ) < ∞ such that for all r,

Pr[ B(r − a log r) ⊆ A(πr²) ⊆ B(r + a log r) ] ≥ 1 − r^{−γ}. (10)

We can couple the process on G with the process on Z² up until the point t* when the first particle settles at a vertex on the boundary (or wraps around in the torus). By (10) we can condition on the aggregate A(t*) containing a ball of radius ⌊√n/2⌋ − a log n w.h.p., for some a < ∞. Thus the remaining n − t* > (1 − π/4)·n > n/5 particles must exit B(√n/3) before settling. Now by Lemma 5.10 the probability that at least one of these walks takes longer than Cn log n to do this is at least 1 − (1 − 1/n^β)^{n/5} = 1 − o(1). The result follows.

Theorem 5.13.
Let G be the d-dimensional torus/grid where d ≥ 3. Then t_seq(G), t_par(G) = Θ(n).

Proof. The lower bound for t_seq follows from Theorem 3.7. For the d-dimensional torus/grid we have the well-known bound p^t_{u,v} ≤ 1/n + O(t^{−d/2}). This estimate, applied in combination with Lemma A.2 to Theorem 3.3, implies a bound of O(n) on the dispersion time whenever d ≥ 3.

In this section we consider the binary tree T_n with n + 1 = 2^k, 2^{k−1} leaves and root r.

Theorem 5.14.
For the binary tree T_n with root r, we have τ^r_seq, τ^r_par = Θ(n (log n)²) w.h.p. and in expectation; consequently t_seq, t_par = Θ(n (log n)²).

Recall that the hitting time in the binary tree with n vertices is O(n log n). Thus, by Theorem 3.3, the dispersion time of the Parallel-IDLA process is O(n (log n)²) w.h.p. and in expectation. Thus to establish Theorem 5.14 it remains to show that the dispersion time of the Sequential-IDLA is at least Ω(n (log n)²) w.h.p., proving that t_seq = Θ(t_par) = Θ(n (log n)²), due to Theorem 4.1.

To prove the lower bound we show that the last poly(n) unoccupied vertices are clustered in such a way that one of the last poly(n) walks will have trouble finding the cluster. The first step of this strategy is to establish the following lemma, which is in some sense a shape theorem for the binary tree.

Lemma 5.15. Consider a complete binary tree with n = 2^k − 1 vertices and the root r being the source of the Sequential-IDLA. Let τ be the first time when one of the two sub-trees with 2^{k−1} − 1 vertices is completely filled, and fix 0 < ε < 1/4. Then with probability at least 1 − 4n^{−ε}, the other sub-tree still has at least n^ε/(3 log n) unoccupied vertices at time τ.

The lemma above allows us to show that after some time in the process all the remaining unsettled vertices are contained in a sub-tree at a significant distance from the root. The next lemma says that w.h.p. one of the remaining walks takes a long time to enter the sub-tree.
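For intuition, the Sequential-IDLA on a complete binary tree is easy to simulate with heap-style indexing (root 1, children of v are 2v and 2v + 1); the sketch below (our own naming, not from the paper) returns the length of the longest walk, i.e. τ^r_seq:

```python
import random

def sequential_idla_tree(k, rng):
    """Sequential-IDLA from the root of a complete binary tree with n = 2**k - 1 vertices.

    Returns (length of the longest walk, occupied set)."""
    n = 2 ** k - 1

    def neighbours(v):                  # heap indexing: parent v//2, children 2v, 2v+1
        nbrs = [v // 2] if v > 1 else []
        if 2 * v <= n:
            nbrs += [2 * v, 2 * v + 1]
        return nbrs

    occupied = {1}                      # the first particle settles at the root
    longest = 0
    for _ in range(n - 1):
        pos, steps = 1, 0
        while pos in occupied:          # walk until an unoccupied vertex is reached
            pos = rng.choice(neighbours(pos))
            steps += 1
        occupied.add(pos)
        longest = max(longest, steps)
    return longest, occupied
```

Every leaf lies at distance k − 1 from the root, so the longest walk is always at least k − 1; the argument of this section shows it is in fact Ω(n (log n)²) w.h.p.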
Lemma 5.16.
Let u be an arbitrary but fixed vertex at distance ε log n from the root, where 0 < ε ≤ 1 is some constant. For any given c > 0 there exists c′ > 0 such that a random walk of length c′εn (log n)² starting from the root r visits u with probability at most 1 − n^{−c}.

These two lemmas are the main technical component of this section, and are proved in Sections 5.5.1 and 5.5.2 respectively; first we shall prove Theorem 5.14.
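The quantity driving both lemmas is an escape probability: by the resistance formula used in Lemma 5.17 below, a walk started at the root r hits a fixed vertex u before returning to r with probability 1/(d(r)·R(r,u)) = 1/(2·dist(r,u)), since effective resistance in a tree equals graph distance. A quick Monte Carlo sanity check of this identity (helper names are ours):

```python
import random

def escape_prob_estimate(k, trials, rng):
    """Estimate Pr[walk from the root hits a fixed leaf before returning to the root]
    on a complete binary tree with 2**k - 1 vertices (heap indexing, root = 1).
    The resistance formula predicts exactly 1/(2*(k - 1))."""
    n = 2 ** k - 1
    target = 2 ** (k - 1)              # leftmost leaf, at distance k - 1 from the root
    hits = 0
    for _ in range(trials):
        pos = rng.choice([2, 3])       # first step: the root moves to one of its children
        while pos not in (1, target):  # stop on returning to the root or hitting the leaf
            nbrs = [pos // 2]
            if 2 * pos <= n:
                nbrs += [2 * pos, 2 * pos + 1]
            pos = rng.choice(nbrs)
        hits += (pos == target)
    return hits / trials
```

For k = 4 the prediction is 1/6 ≈ 0.167, and the estimate matches it to within Monte Carlo error.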
Proof of Theorem 5.14.
As mentioned above, it suffices to prove a w.h.p. lower bound on τ^r_seq. Suppose n = 2^k − 1 and let r denote the root of the binary tree. Let T_1, T_2, …, T_x with x = 2^{⌊k/2⌋} ≤ 2n^{1/2} be a labelling of all sub-trees whose root is at distance ⌊k/2⌋ from the root r. Note that each of those sub-trees has 2^{k−⌊k/2⌋} − 1 = (1 ± o(1)) · n^{1/2} vertices.

Observe that whenever a particle enters one of those sub-trees, we can imagine the filling process as an independent IDLA process on that sub-tree. Indeed, when a particle moves inside such a tree, it is moving as a random walk on the tree until it settles, and if the particle leaves the sub-tree from its root, we can imagine that we pause the process until a new particle arrives again, restarting the process. From the previous observation we can apply Lemma 5.15 above with ε = 1/8: it follows that whenever we fill one of the two sub-trees of some T_i, the other sub-tree still has at least Ω(n^{ε/2}/log n) ≥ n^δ unoccupied vertices, for a small constant δ > 0. By the union bound, the above happens for all the sub-trees T_1, …, T_x at the same time with probability at least 1 − n^{−Θ(1)}.

Now, consider all the sub-trees at distance ⌊k/2⌋ + 1 from the root, and suppose that just after settling the i-th particle, all but one of those sub-trees are filled. Without loss of generality, we assume that this sub-tree is a sub-tree of T_1, and denote the left and right sub-trees of T_1 by T' and T''. Additionally, suppose that T' is filled just after settling the i-th particle. We conclude that after settling the i-th particle, all trees T_2, …, T_x are filled, and since T' just became filled, we deduce that all the remaining unsettled vertices are located in T'' and there are at least n^δ of them. Choosing c = δ/2 in Lemma 5.16 gives that one of the M ≥ n^δ remaining walks takes longer than c′εn (log n)² to hit the sub-tree T'' with probability at least 1 − (1 − n^{−δ/2})^{n^δ} ≥ 1 − o(1). We conclude that it takes Ω(n (log n)²) steps to settle all particles w.h.p.

We begin with the following simple lemma, needed to prove Lemma 5.15.
Lemma 5.17.
In the binary tree T_n of height k, the probability that a fixed leaf u is visited before the walk returns to the root r is 1/(2(k − 1)).

Proof. The formula

Pr[a random walk from r hits u before returning to r] = (R(r, u) · d(r))^{−1}

can be found in [37, Prop. 9.5]. The result follows since the resistance R(r, u) in a tree is given by the graph distance, here k − 1, and the degree of the root is d(r) = 2.

Proof of Lemma 5.15.
We divide the binary tree into a root, and a left and a right sub-tree. To study the IDLA process, we consider the following algorithm. Consider an infinite sequence of (independent) random walks starting at the root of the left sub-tree; these walks finish when they hit the root of the original tree. We also consider an (independent) infinite sequence for the right sub-tree. To run the IDLA process, we start at the root of the binary tree and settle the first particle there. From the second particle on, each time a particle is at the root it moves to the left or right sub-tree with probability 1/2. The i-th time a particle moves to the left (right) sub-tree, it follows deterministically the i-th predetermined walk until it reaches an unoccupied vertex for the first time or returns to the root of the binary tree. The advantage of this procedure is that once we predetermine the infinite random walk sequences in the left and right sub-trees, we know the number of times particles need to move from the root to the left (respectively right) sub-tree in order to fill it. Let us call these quantities, the number of visits to each sub-tree required to fill it, L and R (for the left and right sub-trees). Note that L, R ≥ (n − 1)/2, because we need to move at least (n − 1)/2 particles into each sub-tree to fill it. Let S be the number of walks needed to cover the last n^ε/(3 log n) unoccupied vertices of the left (or right) sub-tree. Define the event E_1 = {S < n^ε}. We prove that E_1 occurs w.h.p.; indeed,

Pr[S ≥ n^ε] ≤ Pr[ Bin(n^ε, 1/(2 log n)) ≤ n^ε/(3 log n) ] ≤ exp(−n^ε/(72 log n)). (11)

The first inequality follows from Lemma 5.17 (hitting a leaf is harder than hitting a non-leaf in an excursion); for the second inequality we use Chernoff's bound. Therefore, with probability at least 1 − exp(−n^ε/(72 log n)), the last n^ε walks cover at least n^ε/(3 log n) unoccupied vertices of the left (or right) sub-tree.

From now on we assume all the walks in the left (right) sub-trees are predetermined. Let W_j be 1 if the j-th time a particle leaves the root it moves to the left sub-tree, and W_j = 0 otherwise. Denote L_i = Σ_{j=1}^{i} W_j and R_i = i − L_i. Define τ = min{i ≥ 1 : L_i = L or R_i = R}.

Claim 5.18.
For ε < 1/4 it holds that max{R − R_τ, L − L_τ} ≥ n^ε with probability at least 1 − 3n^{−ε}.

The proof of the claim is temporarily deferred. The claim above essentially tells us that when we fill one sub-tree, the other w.h.p. needs at least n^ε more walks to be filled. Denote E_2 = {max{R − R_τ, L − L_τ} ≥ n^ε}. Note that the statement of this lemma follows from proving that E_1 ∩ E_2 holds with probability at least 1 − 4n^{−ε}. By (11) and Claim 5.18 we have

Pr[(E_1 ∩ E_2)^c] ≤ Pr[E_1^c] + Pr[E_2^c] ≤ exp(−n^ε/(72 log n)) + 3n^{−ε} ≤ 4n^{−ε}.

Proof of Claim 5.18. Recall that after the i-th time a particle moves from the root to one of the sub-trees, we have R_i + L_i = i. Also, if such a particle moves to the left sub-tree then L_i = L_{i−1} + 1 and R_i = R_{i−1} (similarly if the particle moves to the right sub-tree). We can see the process as balls into 2 bins (a left and a right bin): at each round we allocate a ball to one of the bins at random. The process finishes when the left bin has L balls or when the right bin has R balls, but for convenience we allow the process to keep adding balls after such a point. We work with a continuous-time version of this process in which balls arrive to each bin following independent Poisson processes N_l(t) and N_r(t) of rate 1 for the left and right bin, respectively. Let τ_l (resp. τ_r) be the first time t such that N_l(t) ≥ L (resp. N_r(t) ≥ R). Consider the time τ_l = t and the load of the other bin, N_r(t). First, note that since L, R ≥ (n − 1)/2, the event {t ≤ n^{4ε}} occurs only with probability at most exp(−n^{Ω(1)}) (using a Chernoff bound), and therefore in the remainder of the proof we will assume t ≥ n^{4ε}. Note that for any τ_l = t, the load of the right bin is exactly a Poisson random variable with parameter t. For any integer x, Pr[Poi(t) = x] ≤ 1/√(2πt); thus, using the lower bound on t,

Pr[ |R − N_r(τ_l)| < n^ε | τ_l = t ] ≤ 2n^ε · (1/√(2πt)) ≤ n^{−ε}.

Analogous arguments work for τ_r and |L − N_l(τ_r)|. By the union bound the result holds.

Let c > 0 be fixed. Then, a random walk (X_t) of length n⌈c(k − 1)²⌉/3 on T_n starting from the root r visits an arbitrary but fixed leaf u w.p. at most 1 − e^{−c/2} · n^{−c/(2 log 2)}.

Proof. First note that by Lemma 5.17, a random walk does not visit the leaf u before the ⌈c(k − 1)²⌉-th return to the root with probability at least

(1 − 1/(2(k − 1)))^{⌈c(k−1)²⌉} ≥ e^{−ck/2 + c/4 − o(1)} ≥ e^{c/8} · (1/n)^{c/(2 log 2)},

where the second inequality is due to the fact that n + 1 = 2^k, i.e. k log 2 = log(n + 1). Consider now a random walk of length ℓ = dn⌈c(k − 1)²⌉ for some constant d >
0. We wish to show that w.h.p. there are not too many excursions (visits to the root r) during these ℓ steps. Let L be the set of leaves and r be the root. Let τ_A (resp. τ⁺_A) be the first hitting (resp. return) time of the vertex/set A by the random walk X_t. By [37, Prop. 9.5.] we have P[τ_L < τ⁺_r | X_0 = r] = 1/(2 ·
2) = 1/4 and P[τ_r < τ⁺_L | X_0 ∈ L] = 1/(2 · (n/2)) = 1/n. (12) To simplify the analysis we shall consider only the times at which the walk is at the root or at a leaf, reducing the tree to a two-state Markov chain. Indeed, we start the walk at the root and say it jumps to a leaf w.p. 1/
4, and once at a leaf it can jump to the root w.p. 1/n. To bound the number of visits to r from above we can assume that each attempt to get from r to L (or from L to r) takes at least 2 units of time. Thus we have at most ℓ/2 attempts to get from L to r, and the number of successes is dominated by a Binomial r.v. with parameters ℓ/2 and 1/n by (12). Thus we hit r from L at most t = (1 + 1/10) · ℓ/(2n) times w.p. 1 − n^{−ω(1)} by a Chernoff bound. Let R_i be the number of returns to r made by the random walk on its i-th trip to r before hitting L again, and note that R_i is geometrically distributed with parameter 1/4. Let Y(t) = Σ_{i=0}^{t} R_i be the number of returns to r during a random walk of length ℓ, conditional on t successful returns to r from L; then by Lemma 4.9 (ii), Pr[Y(t) > 9t/2] ≤ exp(−Ω(t)), which holds for any fixed d >
0. Then, by taking d = 1 /
3, with probability at least 1 − n^{−ω(1)} the number of returns to r (and excursions from r) is bounded by 9t/2 ≤ (9/2) · (11/10) · ℓ/(2n) < ⌈c(k − 1)⌉. Thus we have

Pr[τ_hit(r, u) > n⌈c(k − 1)⌉/3] ≥ Pr[X_t does not visit u in the first ⌈c(k − 1)⌉ excursions] − Pr[there are more than ⌈c(k − 1)⌉ excursions] ≥ e^{−c/2} · n^{−c/(2 log 2)} − n^{−ω(1)}.

The proof follows from noting that the above is greater than e^{−c} · n^{−c/(2 log 2)} for large n. Finally, we can now extend the result from the previous lemma to internal vertices and prove Lemma 5.16, a key lemma about the hitting time of clustered sets. Proof of Lemma 5.16.
Let T̃ be the top of the binary tree T, that is, the tree induced by all vertices that have distance at most ε log n from the root. Let L̃ be the set of leaves in T̃. By Lemma 5.19, we know that given c >
0, a random walk of length 2cεn^ε log n on T̃ does not visit a given vertex u ∈ L̃ with probability at least n^{−cε/(2 log 2)}. By the random walk Chernoff bound [18], the random walk on T̃ makes at least ν = cεn^ε log n visits to L̃ \ {u} with probability at least 1 − √(n^ε) · exp(−Ω(cεn^ε log n / t_mix(T̃))) = 1 − n^{−ω(1)}. We will couple the walk on T̃ to a longer walk on the tree T by allowing the walk to continue into the sub-trees pendant to L̃. Let S = Σ_{i=1}^{ν} V_i be the amount of time spent in the sub-trees pendant to L̃ by the coupled walk, where V_i is the amount of time spent in a pendant sub-tree before returning to L̃ for the i-th time. Now by (12), a random walk in T from l ∈ L̃ goes into the sub-tree pendant from l and does not return to l for at least n^{1−ε} steps with probability at least (2/3) · (1/4) · (1 − 1/n^{1−ε})^{n^{1−ε}} ∼ 1/(6e). Since the amounts of time spent by the walk in each sub-tree are identically distributed, S ≥ ν/(7e) · n^{1−ε} = cεn log n/(7e) with probability 1 − e^{−Ω(n^ε)} by a Chernoff bound. Combining the above, a walk of length cεn log n/(7e) on T hits u with probability at most 1 − e^{−c/2} · n^{−cε/(2 log 2)} − n^{−ω(1)} − e^{−Ω(n^ε)} ≤ 1 − n^{−cε/ log 2}.

5.6 The Lollipop

Let L_n be the lollipop graph, which consists of a ⌈n/2⌉-vertex clique K attached at a vertex v ∈ K by a single edge to the endpoint of a path P of length ⌊n/2⌋.

Proposition 5.20.
Let u ∈ K, u ≠ v. Then τ_u^seq(L_n) = Ω(n³ · log(n)) w.h.p.

Proof. Let w be the vertex half way down the path and E be the event that a walk from a vertex in K \ {v} hits w before returning to K \ {v}. For E to occur the walk must hit v, walk one step into the path and then hit w before returning to K \ {v}; thus

Pr[E] ≤ (2/n) · (2/n) · (4/n) · (1 − 2/n) ≤ 16/n³.

During the sequential process at least n/2 walks must hit w before settling, and by the time w is first hit the clique K is fully occupied w.h.p. Conditional on this we can lower bound τ_u^seq by the number of trials it takes for the longest of the last n/2 walks to hit w. For each walk such a trial is described by the event E, and thus the number of trials required by one walk dominates a Geo(16/n³) random variable. Hence we have Pr[walk i needs more than n³ log(n)/
36 trials] ≥ (1 − 16/n³)^{n³ log(n)/36} ≥ 1/√n. Thus the probability that all of the last n/2 walks need fewer than n³ log(n)/
36 trials is less than (1 − 1/√n)^{n/2} = o(1). The result follows.
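The tail bound at the heart of this argument is easy to evaluate numerically. The sketch below is illustrative only: the trial success probability 16/n³ and the deadline n³ log(n)/36 are our reading of the proof's constants, but the qualitative conclusion (the probability that every walk finishes early vanishes) does not depend on their exact values.

```python
import math

def prob_all_walks_fast(n, p=None, deadline=None):
    """P[every one of the last n/2 walks hits w within `deadline` trials],
    when each walk needs an independent Geometric(p) number of trials.
    Defaults (p = 16/n^3, deadline = n^3 log(n)/36) mirror the proof."""
    if p is None:
        p = 16 / n ** 3
    if deadline is None:
        deadline = n ** 3 * math.log(n) / 36
    one_walk_slow = (1 - p) ** deadline      # roughly n^(-4/9) >= 1/sqrt(n)
    return (1 - one_walk_slow) ** (n // 2)   # P[no walk misses the deadline]

# The probability that all walks finish early vanishes as n grows, so
# w.h.p. some walk needs Omega(n^3 log n) trials.
for n in (10 ** 2, 10 ** 3, 10 ** 4):
    print(n, prob_all_walks_fast(n))
```

Since the dispersion time is at least the number of trials of the slowest walk, this is exactly the o(1) estimate used in the last line of the proof.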
6 Counter Examples

In this section we present several graphs used throughout the paper as counterexamples.
We begin with two examples showing that the dispersion time does not always concentrate. Let G_1 be the clique+edge: this is K_n with an extra vertex v* attached by an edge to a vertex v ∈ K_n. Let G_2 be the clique+hub+edge: a single edge {v, v*} attached at a vertex v which is joined to h(n) − 1 vertices of K_{n−1}.

Proposition 6.1.
Let D_v(G) denote either τ_v^par(G) or τ_v^seq(G). Then there exist graphs G_1, the clique+edge, and G_2, the clique+hub+edge, and vertices u ∈ V(G_1), v ∈ V(G_2), such that Pr[D_u(G_1) ≤ O(E[D_u(G_1)]/n)] = Ω(1) and Pr[D_v(G_2) ≥ Ω(E[D_v(G_2)] · n/ log n)] = Ω(1/n).

Proof.
Let G_1 be the clique+edge. If the parallel or sequential process is started from v, then with probability (1 − 1/n)^n ≈ 1/e the vertex v* is not explored in the first step, and so the process takes Ω(n²) time, as one of the walks must choose to go back to v and then visit v*. However, with probability 1 − (1 − 1/n)^n ≈ 1 − 1/e one of the n walks hits v* in the first step, and then the process takes O(n), as is the case with K_n.

Let G_2 be the clique+hub+edge. If an IDLA process is started from v, then with probability at least 1 − (1 − 1/h(n))^n ≈ 1 − e^{−n/h(n)} there is a walker which visits v* in one step. The rest of the graph is essentially a clique and so the process takes O(n) time. With probability (1 − 1/h(n))^n · (1 − 1/n)^n ∼ e^{−n/h(n) − 1} every walker enters the graph K_{n−1} \ N(v), and so the process takes an additional Θ(n · h(n)) expected time to cover the graph. Thus E[D_v(G_2)] = Θ(n) · (1 − e^{−n/h(n)}) + Θ(n · h(n)) · e^{−n/h(n)}. Choosing h(n) = n/ log n yields E[D_v(G_2)] = Θ(n) and Pr[D_v(G_2) ≥ Ω(n²/ log n)] = Ω(1/n).

6.2 Least Action Principle

Continuing our discussion from Section 1.3, we shall show that a least action principle is violated by a stopping rule on G_1, the clique+edge defined earlier in this section. Let ξ^i_x = 1 iff the site x is vacant after the first i − 1 walks have settled, and let W(X) denote the index of walk X. The normal "first vacant site is settled" rule is then ρ = inf{t : ξ^{W(X)}_{X(t)} = 1}.

Proposition 6.2.
Define the following stopping rule on G_1:

ρ̃ = inf{t : (t ≥ 3n log(n) or X(t) = v*) and ξ^{W(X)}_{X(t)} = 1}.

Then the parallel or sequential process on G_1 stopped according to ρ̃ disperses in O(n log n) time, whereas with the standard stopping rule ρ we have t_seq(G_1) = Ω(n²).

Proof. The number of visits to the vertex v at the base of the extra edge {v, v*} by time 3n log n is greater than n log n with probability at least 1 − e^{−n} by Chernoff bounds. The probability that none of these walks hits v* is then at most (1 − 1/n)^{n log(n)} ≤ 1/n. So v* is covered by time 3n log n w.p. 1 − 1/n, and conditional on this the remaining walks settle in time O(n). If v* fails to be covered by time 3n log(n) then the process takes O(n²) expected time by Proposition 6.1, and the result follows.

For the standard stopping rule, an application of Theorem 3.3 shows that the number of walks is reduced to a sub-polynomial size k in sub-linear time with constant probability. The probability that one of these k random walks hits v* in two steps from V \ {v, v*} is 1/n². The probability that any of the last k − 1 walks visits v* before settling (which takes at most O(n) time) is o(1). Thus occupying v* is left to the last walk, which takes Ω(n²) time with constant probability.

The next proposition, mentioned in Remark 3.9, shows that t_hit fails as a lower bound for t_seq.

Proposition 6.3.
Fix 0 < ε < 1/2 and let T be the complete binary tree on n vertices with a path of length n^{1/2 − ε} attached to the root of the tree at one endpoint. Then t_seq(T) = O(n · log(n)) and t_hit(T) = Ω(n^{3/2 − ε}).

Proof.
Consider a complete binary tree with n vertices and attach a path of length k by an endpoint to the root, where 1 ≤ k = o(√n). Note that the maximum hitting time in T is Θ(n · max{k, log(n)}); this follows by the commute time identity [37, Prop. 10.6], since the effective resistance in a tree is given by graph distance. Considering now the dispersion time: regardless of the source vertex, the root gets at least Ω(n) visits from n different random walks before all vertices are settled in a binary tree. Every time a walk visits the root, it reaches the endpoint of the path with probability 1/k (and in this case, the time to reach the other endpoint is Θ(k²)). Hence, if we consider the Sequential-IDLA, the path of length k is completely covered before the last walk. The expected time for the last walk to settle is then at most the maximum hitting time in the binary tree, which is at most O(n log n), and with probability at least 1 − n^{−1}, that time is O(n log n). By stochastic domination, the time for the ℓ-th walk to settle, for any 1 ≤ ℓ ≤ n − 1, is O(n log n).

7 Conclusions
The aim of this project is to better understand IDLA processes on finite graphs. The main tool we developed to gain insight into the processes is the Cut & Paste bijection. This bijection allows us to study directly the effect of the different scheduling protocols on the random walk trajectories. We use this bijection to couple the various IDLA variants, allowing us to order or equate their dispersion times and to show that t_seq and t_par are equal up to a multiplicative factor of order log n. In addition to the qualitative information provided by the bijection, we also develop upper and lower bounds in terms of graph quantities such as the maximum degree, the number of edges, the mixing time, and the hitting times of vertices or sets by a single random walk. These bounds enable us to establish the correct asymptotic order of the dispersion time for the Parallel and Sequential processes on several natural networks. The bounds also provide some tight general bounds in terms of n. From our analysis of fundamental networks we conclude that for most natural graphs the dispersion time is of order t_hit or t_hit · log n; however, we present examples where this is very far from the truth.

As pointed out earlier, our results establish the correct asymptotic order of the dispersion time for most natural networks. The only exception is the 2d-grid, where the dispersion time is shown to be between Ω(n log n) and O(n log² n). The known shape theorems for the infinite lattice suggest that the correct order is n log n. This provides us with the first open problem.

Open Problem 1.
Determine the dispersion time of the 2d-grid/torus.

The second main open problem is whether the sequential and parallel dispersion times are always of the same order; we know of no graph where this does not hold, however it seems hard to prove.
Open Problem 2.
Is it true that for any graph G , t par ( G ) = O ( t seq ( G )) ? In order to prove this result, it might be useful to derive some general lower bounds on thedispersion time, which are in turn interesting and useful in their own right. In particular
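One can at least probe Open Problem 2 empirically. The following toy simulation (our own illustrative code, not from the paper) runs both processes on a small 2d-torus, so a single run also gives a feel for Open Problem 1; individual samples say nothing about worst-case order, but they make the two protocols concrete.

```python
import random

def neighbours(v, side):
    """The four torus neighbours of v = (x, y) on the side x side torus."""
    x, y = v
    return [((x + 1) % side, y), ((x - 1) % side, y),
            (x, (y + 1) % side), (x, (y - 1) % side)]

def t_seq(side, rng):
    """Dispersion time of Sequential-IDLA started from (0, 0)."""
    occupied, longest = {(0, 0)}, 0
    for _ in range(side * side - 1):
        v, steps = (0, 0), 0
        while v in occupied:                  # walk until the first vacant vertex
            v, steps = rng.choice(neighbours(v, side)), steps + 1
        occupied.add(v)
        longest = max(longest, steps)
    return longest

def t_par(side, rng):
    """Dispersion time of Parallel-IDLA: all unsettled particles move each round."""
    occupied = {(0, 0)}
    walkers, rounds = [(0, 0)] * (side * side - 1), 0
    while walkers:
        rounds += 1
        still_walking = []
        for v in walkers:
            v = rng.choice(neighbours(v, side))
            if v in occupied:
                still_walking.append(v)
            else:
                occupied.add(v)               # first particle at a vacant vertex settles
        walkers = still_walking
    return rounds

rng = random.Random(0)
print(t_seq(12, rng), t_par(12, rng))         # one sample of each dispersion time
```

Averaging many such runs over many graphs is the obvious next step; the coupling developed in this paper shows the parallel dispersion time stochastically dominates the sequential one, but whether t_par(G) = O(t_seq(G)) on every graph is exactly the open problem.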
Conjecture 7.1.
Let G be a connected n -vertex graph, then t seq ( G ) = Ω( n ) . The following conjecture is motivated by the idea that when you run
the StP algorithm, the random walk sections that are cut and pasted do not have to cover the graph. If true, this conjecture would resolve the open problem above for some classes of graphs.
Conjecture 7.2.
Let G be a connected n -vertex graph and t cov ( G ) be the cover time. Then t par ( G ) ≤ t seq ( G ) + t cov ( G ) . The counter example to concentration (Proposition 6.1) motivates the following open problem.
Open Problem 3.
What conditions must a graph satisfy for the dispersion time to concentratearound its expectation?
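As a concrete illustration of the phenomenon behind Open Problem 3, one can simulate the sequential process on the clique+edge graph G_1 and watch the two-regime behaviour of Proposition 6.1 appear. The code below is an illustrative sketch (the vertex labelling and the heaviness threshold are our own choices).

```python
import random

def dispersion_clique_edge(n, rng):
    """Sequential-IDLA on the clique+edge graph: vertices 0..n-1 form K_n,
    vertex n is the pendant v*, attached to v = 0.  All particles start at
    v; returns the dispersion time (steps taken by the longest walk)."""
    occupied, longest = set(), 0
    for _ in range(n + 1):
        pos, steps = 0, 0
        while pos in occupied:
            if pos == n:                  # at v*: its only neighbour is v
                pos = 0
            elif pos == 0:                # at v: n-1 clique neighbours plus v*
                pos = 1 + rng.randrange(n)
            else:                         # clique vertex: any other clique vertex
                u = rng.randrange(n - 1)
                pos = u if u < pos else u + 1
            steps += 1
        occupied.add(pos)
        longest = max(longest, steps)
    return longest

rng = random.Random(3)
n, trials = 40, 200
times = sorted(dispersion_clique_edge(n, rng) for _ in range(trials))
median = times[trials // 2]
heavy = sum(t > n * n // 4 for t in times) / trials
# Roughly a 1/e fraction of runs leave v* unexplored early and pay Theta(n^2).
print(median, times[-1], heavy)
```

The sample splits into a cluster of fast runs and a constant fraction of runs that are an order of magnitude slower, which is precisely why no concentration around the expectation is possible here.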
In forthcoming work we examine the total number of steps taken by an IDLA dispersion process and its relation to other graph properties. It might also be worth studying a version of the dispersion process where the origin is sampled uniformly at random for each particle.

Acknowledgements
A.S. is supported by the EPSRC Early Career Fellowship EP/N004566/1. N.R.,T.S. and J.S. aresupported by T.S.’ ERC Starting Grant 679660 (DYNAMIC MARCH).
References

[1] Heiner Ackermann, Simon Fischer, Martin Hoefer, and Marcel Schöngens. Distributed algorithms for QoS load balancing.
Distributed Computing , 23(5-6):321–330, 2011.[2] David Aldous and James Allen Fill. Reversible Markov chains and random walks on graphs,2002. Unfinished monograph, recompiled 2014.[3] Noga Alon, Chen Avin, Michal Kouck´y, Gady Kozma, Zvi Lotker, and Mark R. Tuttle. Manyrandom walks are faster than one.
Combin. Probab. Comput. , 20(4):481–502, 2011.[4] Steve Alpern and Diane J Reyniers. Spatial dispersion as a dynamic coordination problem.
Theory and decision , 53(1):29–59, 2002.[5] Amine Asselah and Alexandre Gaudilli`ere. From logarithmic to subdiffusive polynomial fluc-tuations for internal DLA and related growth models.
Ann. Probab. , 41(3A):1115–1159, 2013.[6] Amine Asselah and Alexandre Gaudilli`ere. Sublogarithmic fluctuations for internal DLA.
Ann.Probab. , 41(3A):1160–1179, 2013.[7] Amine Asselah and Alexandre Gaudilli`ere. Lower bounds on fluctuations for internal DLA.
Probab. Theory Related Fields , 158(1-2):39–53, 2014.[8] Chen Avin, Michal Kouck´y, and Zvi Lotker. Cover time and mixing time of random walks ondynamic graphs.
Random Structures & Algorithms , 52(4):576–596, 2018.[9] Petra Berenbrink, Tom Friedetzky, Leslie Ann Goldberg, Paul W. Goldberg, Zengjian Hu, andRussell A. Martin. Distributed selfish load balancing.
SIAM J. Comput. , 37(4):1163–1181,2007.[10] Petra Berenbrink, Martin Hoefer, and Thomas Sauerwald. Distributed selfish load balancingon networks.
ACM Trans. Algorithms , 11(1):2:1–2:29, 2014.[11] Anders Bj¨orner, L´aszl´o Lov´asz, and Peter W. Shor. Chip-firing games on graphs.
European J.Combin. , 12(4):283–291, 1991.[12] S´ebastien Blach`ere and Sara Brofferio. Internal diffusion limited aggregation on discrete groupshaving exponential growth.
Probab. Theory Related Fields , 137(3-4):323–343, 2007.[13] Paul Bogdan, Thomas Sauerwald, Alexandre Stauffer, and He Sun. Balls in bins via localsearch. In
Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on DiscreteAlgorithms , pages 16–34. SIAM, Philadelphia, PA, 2012.[14] Charlotte Brennan, J Kariv, and Arnold Knopfmacher. Longest waiting time in the couponcollectors problem.
British Journal of Mathematics & Computer Science, 8:330–336, 2015. [15] Daniel Brown. How I wasted too long finding a concentration inequality for sums of geometric variables. https://cs.uwaterloo.ca/~browndg/negbin.pdf. [16] Elisabetta Candellero, Shirshendu Ganguly, Christopher Hoffman, and Lionel Levine. Oil and water: a two-type internal aggregation model.
Ann. Probab. , 45(6A):4019–4070, 2017.[17] Fan Chung and Linyuan Lu. Concentration inequalities and martingale inequalities: a survey.
Internet Math. , 3(1):79–127, 2006.[18] Kai-Min Chung, Henry Lam, Zhenming Liu, and Michael Mitzenmacher. Chernoff-Hoeffdingbounds for Markov chains: generalized and simplified. In , volume 14 of
LIPIcs. Leibniz Int. Proc. Inform. ,pages 124–135. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2012.[19] Colin Cooper and Alan Frieze. A note on the vacant set of random walks on the hypercubeand other regular graphs of high degree.
Mosc. J. Comb. Number Theory , 4(4):21–44, 2014.[20] Colin Cooper, Alan Frieze, and Tomasz Radzik. Multiple random walks in random regulargraphs.
SIAM J. Discrete Math. , 23(4):1738–1761, 2010.[21] Michael Damron, Janko Gravner, Matthew Junge, Hanbaek Lyu, and David Sivakoff. Parkingon transitive unimodular graphs.
Ann. Appl. Probab. , 29(4):2089–2113, 2019.[22] P. Diaconis and W. Fulton. A growth model, a game, an algebra, Lagrange inversion, andcharacteristic classes.
Rend. Sem. Mat. Univ. Politec. Torino , 49(1):95–119 (1993), 1991.Commutative algebra and algebraic geometry, II (Italian) (Turin, 1990).[23] Hugo Duminil-Copin, Itai Benjamini, Gady Kozma, and Cyrille Lucas. Internal diffusion-limited aggregation with uniform starting points.
Preprint , arXiv:1707.03241, 2017.[24] Hugo Duminil-Copin, Cyrille Lucas, Ariel Yadin, and Amir Yehudayoff. Containing internaldiffusion limited aggregation.
Electron. Commun. Probab. , 18:no. 50, 8, 2013.[25] Robert Els¨asser and Thomas Sauerwald. Tight bounds for the cover time of multiple randomwalks.
Theoret. Comput. Sci. , 412(24):2623–2641, 2011.[26] Tobias Friedrich and Lionel Levine. Fast simulation of large-scale growth models.
Random Structures Algorithms, 42(2):185–213, 2013. [27] Christina Goldschmidt and Michał Przykucki. Parking on a random tree.
Combin. Probab.Comput. , 28(1):23–45, 2019.[28] Wilfried Huss. Internal diffusion-limited aggregation on non-amenable graphs.
Electron. Com-mun. Probab. , 13:272–279, 2008.[29] Wilfried Huss and Ecaterina Sava. Internal aggregation models on comb lattices.
Electron. J.Probab. , 17:no. 30, 21, 2012.[30] David Jerison, Lionel Levine, and Scott Sheffield. Logarithmic fluctuations for internal DLA.
J. Amer. Math. Soc. , 25(1):271–301, 2012.[31] David Jerison, Lionel Levine, and Scott Sheffield. Internal DLA in higher dimensions.
Electron.J. Probab. , 18:No. 98, 14, 2013.[32] Varun Kanade, Frederik Mallmann-Trenn, and Thomas Sauerwald. On coalescence time ingraphs: when is coalescing as fast as meeting? In
Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 956–965. SIAM, Philadelphia, PA, 2019. [33] Gregory F. Lawler. Subdiffusive fluctuations for internal diffusion limited aggregation.
Ann.Probab. , 23(1):71–86, 1995.[34] Gregory F. Lawler.
Intersections of random walks . Modern Birkh¨auser Classics.Birkh¨auser/Springer, New York, 2013. Reprint of the 1996 edition.[35] Gregory F. Lawler, Maury Bramson, and David Griffeath. Internal diffusion limited aggrega-tion.
Ann. Probab. , 20(4):2117–2140, 1992.[36] Gregory F. Lawler and Vlada Limic.
Random walk: a modern introduction , volume 123 of
Cambridge Studies in Advanced Mathematics . Cambridge University Press, Cambridge, 2010.[37] David A. Levin, Yuval Peres, and Elizabeth L. Wilmer.
Markov chains and mixing times .American Mathematical Society, Providence, RI, 2009. With a chapter by James G. Proppand David B. Wilson.[38] Lionel Levine and Yuval Peres. Laplacian growth, sandpiles, and scaling limits.
Bull. Amer.Math. Soc. (N.S.) , 54(3):355–382, 2017.[39] Lionel Levine and Vittoria Silvestri. How long does it take for Internal DLA to forget its initialprofile?
Probab. Theory Related Fields , 174(3-4):1219–1271, 2019.[40] L´aszl´o Lov´asz. Random walks on graphs: a survey. In
Combinatorics, Paul Erd˝os is eighty,Vol. 2 (Keszthely, 1993) , volume 2 of
Bolyai Soc. Math. Stud. , pages 353–397. J´anos BolyaiMath. Soc., Budapest, 1996.[41] Cyrille Lucas. The limiting shape for drifted internal diffusion limited aggregation is a trueheat ball.
Probab. Theory Related Fields , 159(1-2):197–235, 2014.[42] P. Meakin and J. M. Deutch. The formation of surfaces by diffusion-limited annihilation.
JChem Phys , 85:2320 – 2325, 1986.[43] Cristopher Moore and Jonathan Machta. Internal diffusion-limited aggregation: parallel algo-rithms and complexity.
J. Statist. Phys. , 99(3-4):661–690, 2000.[44] Yuval Peres and Perla Sousi. Mixing times are hitting times of large sets.
J. Theoret. Probab. ,28(2):488–519, 2015.[45] Olivier Raimond and Bruno Schapira. Internal DLA generated by cookie random walks on Z . Electron. Commun. Probab. , 16:482–490, 2011.[46] Nicol´as Rivera, Thomas Sauerwald, Alexandre Stauffer, and John Sylvester. The dispersiontime of random walks on finite graphs. In
The 31st ACM Symposium on Parallelism inAlgorithms and Architectures , SPAA ’19, pages 103–113, New York, NY, USA, 2019. ACM.[47] Eric Shellef. IDLA on the supercritical percolation cluster.
Electron. J. Probab. , 15:no. 24,723–740, 2010.[48] Vittoria Silvestri. Internal dla on cylinder graphs: fluctuations and mixing.
Preprint ,arXiv:1909.09893, 2019.[49] Alexandre Stauffer and Lorenzo Taggi. Critical density of activated random walks on transitivegraphs.
Ann. Probab. , 46(4):2190–2220, 07 2018.[50] Debleena Thacker and Stanislav Volkov. Border aggregation model.
Ann. Appl. Probab., 28(3):1604–1633, 2018.
A Bounds for Expected Hitting Times of Sets
We must first state a well known result.
Lemma A.1 (equation (12.11) of [37]). Consider a lazy random walk on a connected graph. Then p̃^t_{u,v} ≤ π(v) + √(d(v)/d(u)) · λ^t, where λ is the second largest eigenvalue of the associated transition matrix.

We can now prove a result which controls t_hit(π, S) by bounding short-term return probabilities.

Lemma A.2.
Let G be any regular graph and S be any subset of vertices. Then, for any v,

t_hit(v, S) ≤ (1 − e^{−1})^{−1} · 15n(1 + ⌈log |S|⌉) / ((1 − λ)|S|).

Furthermore, suppose that there exist constants
C > 0 and ε > 0 such that p^t_{u,w} ≤ π(w) + Ct^{−(1+ε)} for any pair of vertices u, w. Then for any v,

t_hit(v, S) ≤ (1 − e^{−1})^{−1} · (5C + 1) · n / |S|^{ε/(1+ε)}.

Both results above extend to almost-regular graphs at the expense of a multiplicative O(1) factor.

Proof. We begin by deriving the first bound. Let (X_t)_{t≥0} be a random walk starting from vertex v and let τ_S be the first time X_t hits the set S. We divide time into phases I_i of length 5τ, where τ = t_mix(1/e), i.e. I_i = {5(i − 1)τ, ..., 5iτ − 1}. We count the number of phases needed to reach the set S. Suppose that in phases 1, ..., i − 1 the walk did not hit S. During a phase I_i, we let the walk move for 4τ steps, ignoring whether or not it visits the set S, and we observe whether the walk visits S in the last τ time-steps of phase I_i. Then, independently of everything that happens before time-step 4τ + 5(i − 1)τ, with probability at least 1 − e^{−1} we can couple X_{4τ+5(i−1)τ} with the stationary distribution (e.g. Lemma A.5 in [32]), hence

Pr_v[τ_S ≤ 5iτ | τ_S ≥ 5(i − 1)τ] ≥ (1 − e^{−1}) Pr_π[τ_S ≤ τ].

To compute the latter probability we define the random variable Z = Σ_{t=0}^{τ−1} 1{X_t ∈ S}, which counts the number of visits to S. Then Pr_π[τ_S ≤ τ] = Pr_π[Z ≥ 1], and we use the trivial fact that Pr_π[Z ≥ 1] = E_π[Z] / E_π[Z | Z ≥ 1]. We have E_π[Z] = τ π(S) = τ|S|/n. Furthermore,

E_π[Z | Z ≥ 1] ≤ max_{u∈S} Σ_{t=0}^{τ} Σ_{v∈S} p^t_{u,v} ≤ Σ_{t=0}^{τ} min{1, Σ_{v∈S} (1/n + λ^t)}. (13)

The second inequality holds because p^t_{u,v} ≤ 1/n + λ^t for any u, v in a regular graph (Lemma A.1). By separating the sum from t = 0 to τ at t = ⌈log_λ(1/|S|)⌉ and applying −1/ log λ ≤ 1/(1 − λ), we have

E_π[Z | Z ≥ 1] ≤ ⌈log |S|⌉/(1 − λ) + τ · |S|/n + |S| · Σ_{t=⌈log_λ(1/|S|)⌉}^{τ} λ^t.

Finally, bounding the sum by a geometric series,

E_π[Z | Z ≥ 1] ≤ ⌈log |S|⌉/(1 − λ) + τ · |S|/n + 1/(1 − λ) ≤ 3⌈log |S|⌉/(1 − λ),

where we used that τ = t_mix(e^{−1}) ≤ log n/(1 − λ) by [37, (12.9)], and that |S| log n ≤ n log |S| for all |S| ≥ 2. Therefore, Pr_π[Z ≥