Balanced Allocation on Dynamic Hypergraphs
Catherine Greenhill †, Bernard Mans ‡, Ali Pourmiri §
April 24, 2020
Abstract
The balls-into-bins model randomly allocates $n$ sequential balls into $n$ bins, as follows: each ball selects a set $D$ of $d \ge 2$ bins, independently and uniformly at random, then the ball is allocated to a least-loaded bin from $D$ (ties broken randomly). The maximum load is the maximum number of balls in any bin. In 1999, Azar et al. showed that, provided ties are broken randomly, after $n$ balls have been placed the maximum load is $\log_d \log n + O(1)$, with high probability. We consider this popular paradigm in a dynamic environment where the bins are structured as a dynamic hypergraph. A dynamic hypergraph is a sequence of hypergraphs, say $H^{(t)}$, arriving over discrete times $t = 1, 2, \ldots$, such that the vertex set of each $H^{(t)}$ is the set of $n$ bins, but (hyper)edges may change over time. In our model, the $t$-th ball chooses an edge from $H^{(t)}$ uniformly at random, and then chooses a set $D$ of $d \ge 2$ random bins from the selected edge. The ball is allocated to a least-loaded bin from $D$, with ties broken randomly. We quantify the dynamicity of the model by introducing the notion of pair visibility, which measures the number of rounds in which a pair of bins appears within a (hyper)edge. We prove that if, for some $\varepsilon > 0$, a dynamic hypergraph has pair visibility at most $n^{1-\varepsilon}$, and some mild additional conditions hold, then with high probability the process has maximum load $O(\log_d \log n)$. Our proof is based on a variation of the witness tree technique, which is of independent interest. The model can also be seen as an adversarial model where an adversary decides the structure of the possible sets of $d$ bins available to each ball.

The standard balls-into-bins model is a process that randomly allocates $m$ sequential balls into $n$ bins, where each ball chooses a set $D$ of $d$ bins, independently and uniformly at random, then the ball is allocated to a least-loaded bin from $D$ (with ties broken randomly). When $m = n$ and $d = 1$, it is well known that at the end of the process the maximum number of balls in any bin, the maximum load, is $(1 + o(1)) \frac{\log n}{\log \log n}$, with high probability. Surprisingly, Azar et al. [2] showed that for this $d$-choice process with d ≥
2, provided ties are broken randomly, the maximum load decreases exponentially to $\log_d \log n + O(1)$. This phenomenon is known as the power of $d$ choices. The multiple-choice paradigm has been successfully applied to a wide range of problems, from nearby server selection and load-balanced file placement in distributed hash tables to the performance analysis of dictionary data structures (e.g., see [21]). In the classical setting, all $\binom{n}{d}$ sets of $d$ bins are available to each ball. However, in many realistic scenarios such as cache networks, peer-to-peer or cloud-based systems, the balls (requested files, jobs, items, ...) have to be allocated to bins

∗ The first author is supported by the Australian Research Council Discovery Project DP190100977. The second and third authors are supported by the Australian Research Council Discovery Project DP170102794.
† UNSW Sydney, Australia.
‡ Macquarie University, Sydney, Australia.
§ Macquarie University, Sydney, Australia.

[...] $d$-choice process, which also requires the study of non-uniform distributions over choices (e.g. [1, 6, 7, 11]). Hence in many settings, allowing all possibilities for the set $D$ of $d$ bins is costly, and may not be practical. This motivates the investigation of the effect of the distribution of the set $D$ on the maximum load. In this regard, Kenthapadi and Panigrahy [13] proposed balanced allocation on graphs, where bins form the vertices of a $\Delta$-regular graph and each ball chooses an edge of the graph uniformly at random. The ball is then placed in the endpoint of the selected edge with smaller load (ties are broken randomly). Kenthapadi and Panigrahy showed that the maximum load is $\Theta(\log \log n)$ if and only if $\Delta = n^{\Omega(1/\log \log n)}$. Here, the possibilities for the set $D$ (the two chosen bins) are restricted to the $n\Delta/2$ edges of the graph, whereas in the classical process with $d = 2$ the underlying graph is a complete graph (all $\binom{n}{2}$ edges present).
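As a rough illustration of the graph-based process just described, the following Python sketch places balls by choosing a uniformly random edge and sending the ball to the less loaded endpoint. The function names (`allocate_on_graph`, `cycle_edges`, `complete_edges`) and the choice of a cycle versus a complete graph are assumptions made for this example only; they do not come from the paper.

```python
import random


def allocate_on_graph(n, edges, num_balls, rng):
    """Place balls one at a time: pick a uniformly random edge, then put the
    ball in the endpoint with the smaller load (ties broken at random)."""
    load = [0] * n
    for _ in range(num_balls):
        u, v = rng.choice(edges)
        if load[u] < load[v] or (load[u] == load[v] and rng.random() < 0.5):
            load[u] += 1
        else:
            load[v] += 1
    return load


def cycle_edges(n):
    """Edges of the n-cycle: a sparse 2-regular graph (hypothetical example)."""
    return [(i, (i + 1) % n) for i in range(n)]


def complete_edges(n):
    """All n*(n-1)/2 pairs: choosing a uniform edge picks two distinct bins."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


if __name__ == "__main__":
    rng = random.Random(0)
    n = 200
    for name, edges in [("cycle", cycle_edges(n)), ("complete", complete_edges(n))]:
        load = allocate_on_graph(n, edges, n, rng)
        print(name, "max load:", max(load))
```

Running the script prints the maximum load for both graphs. A single small run is only indicative and says nothing about the asymptotic bounds quoted above.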
Following the study of balls-into-bins with related choices, Godfrey [12] utilized hypergraphs to model the structure of bins. In this model, each ball picks a random edge of a given hypergraph, where each edge contains $\Omega(\log n)$ bins and the hypergraph satisfies some mild conditions. Then, the ball is allocated to a least-loaded bin contained in the edge, with ties broken randomly. Godfrey showed that the maximum load is constant. Balanced allocation on graphs and hypergraphs has been further studied in [3, 4, 18, 19]. In the aforementioned works, either the underlying graph is fixed during the process or, in the hypergraph setting, the number $d$ of choices satisfies $d = \Omega(\log n)$. However, in many real-world systems the structure may change over time, and probing the load of $\Omega(\log n)$ bins might be a costly task. Seeking a more realistic model, this paper studies the $d$-choice process in dynamic graphs and hypergraphs, where $2 \le d = o(\log n)$.

Balanced allocation on dynamic hypergraphs can also be seen as an adversarial model, where the set $D$ of potential choices is proposed by an adversary (environment) whose goal is to increase the maximum load. Here we want to understand the conditions under which balanced allocation on dynamic (hyper)graphs still benefits from the power of $d$ choices.

We propose balanced allocation algorithms on different dynamic environments, namely dynamic graph and hypergraph models. In order to measure the dynamicity, we introduce the notion of pair visibility. For a pair $\{i, j\}$ of distinct vertices, the visibility of $\{i, j\}$, denoted by $vis(i, j)$, is the number of rounds $t \in \{1, \ldots, n\}$ such that $\{i, j\}$ is contained in a (hyper)edge at round $t$. (A more formal definition is given below.) When ball $i$ is placed into a bin, the height of ball $i$ is the number of balls that were allocated to the bin before ball $i$. We say that event $E_n$ holds with high probability (w.h.p.) if $\Pr[E_n] \ge 1 - n^{-c}$ for every constant $c > 0$.

Balanced Allocation on Dynamic Hypergraphs
Write $[n] = \{1, \ldots, n\}$ for the set of $n$ bins. A hypergraph $\mathcal{H} = ([n], \mathcal{E})$ is $s$-uniform if $|H| = s$ for every $H \in \mathcal{E}$. For every integer $n > 1$, let $s = s(n)$ be an integer such that $2 \le s \le n$. A dynamic $s$-uniform hypergraph, denoted by $(H^{(1)}, H^{(2)}, \ldots, H^{(n)})$, is a sequence of $s$-uniform hypergraphs $H^{(t)} = ([n], \mathcal{E}_t)$ with vertex set $[n]$. The edge sets $\mathcal{E}_t$ may change with $t$. A hypergraph is regular if every vertex is contained in the same number of edges.

In this paper, we are interested in the following properties which dynamic hypergraphs may satisfy. We refer to these properties as the balancedness, visibility, and size properties. The balancedness property is adapted from [3, 12].

Balancedness:
Let $H_t$ denote a randomly chosen edge from $\mathcal{E}_t$. If there exists a constant $\beta \ge 1$ such that $\Pr[i \in H_t] \le \beta s/n$ for every $1 \le t \le n$ and each bin $i \in [n]$, then the dynamic hypergraph $(H^{(1)}, \ldots, H^{(n)})$ is $\beta$-balanced. A dynamic hypergraph is balanced if it is $\beta$-balanced for some constant $\beta \ge 1$. Every regular hypergraph is 1-balanced.
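The balancedness condition can be checked mechanically for a single round. The sketch below is only an illustration under assumptions of our own: the edge set of one hypergraph $H^{(t)}$ is given as a list of Python sets over bins $0, \ldots, n-1$, and the helper name `balancedness` is hypothetical. It returns the smallest $\beta$ for which $\Pr[i \in H_t] \le \beta s/n$ holds for every bin; for a dynamic hypergraph one would take the maximum of this value over all rounds $t$.

```python
from collections import Counter
from fractions import Fraction


def balancedness(n, edges):
    """Return the smallest beta with Pr[i in H] <= beta * s / n for all bins i,
    where H is a uniformly random edge of the given s-uniform edge list."""
    sizes = {len(e) for e in edges}
    if len(sizes) != 1:
        raise ValueError("edge list is not uniform")
    s = sizes.pop()
    # counts[i] = number of edges containing bin i
    counts = Counter(v for e in edges for v in e)
    max_prob = Fraction(max(counts.values()), len(edges))
    return max_prob * n / s


if __name__ == "__main__":
    # Regular example: sliding windows of size 3 on 6 bins (each bin in 3 edges).
    windows = [{i, (i + 1) % 6, (i + 2) % 6} for i in range(6)]
    print(balancedness(6, windows))
    # Skewed example: bin 0 lies in every edge.
    star = [{0, 1}, {0, 2}, {0, 3}]
    print(balancedness(4, star))
```

Exact rational arithmetic (`Fraction`) is used so that a regular hypergraph yields exactly 1 instead of a floating-point approximation.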
Visibility:
For every pair of distinct vertices $\{i, j\} \subset [n]$, the visibility of $\{i, j\}$ is
$$vis(i, j) = |\{ t \in \{1, 2, \ldots, n\} \mid \{i, j\} \subset H \in \mathcal{E}_t \}|.$$
If there exists a constant $\varepsilon \in (0, 1)$ such that $vis(i, j) \le s\, n^{1-\varepsilon}$ for all pairs $\{i, j\} \subseteq [n]$ of distinct bins, then the dynamic hypergraph $(H^{(1)}, \ldots, H^{(n)})$ is $\varepsilon$-visible. A dynamic hypergraph satisfies the visibility property if it is $\varepsilon$-visible for some constant $\varepsilon \in (0, 1)$.

Size:
If $s = \Omega(\log n)$ and there exists a positive constant $c$ such that $|\mathcal{E}_t| \le n^{c}$ for every $t \ge 1$, then the dynamic hypergraph $(H^{(1)}, \ldots, H^{(n)})$ satisfies the $c$-size property. A dynamic hypergraph satisfies the size property if it satisfies the $c$-size property for some positive constant $c$.

Definition 1 (Balanced Allocation on Dynamic Hypergraphs). Suppose that $(H^{(1)}, \ldots, H^{(n)})$ is a dynamic $s$-uniform hypergraph and fix $d = d(n)$ with $2 \le d = o(\log n)$ and $d \le s$. The balanced allocation algorithm on $(H^{(1)}, \ldots, H^{(n)})$ proceeds in rounds ($t = 1, 2, \ldots, n$), sequentially allocating $n$ balls to $n$ bins. In round $t$, the $t$-th ball chooses an edge $H_t$ uniformly at random from $\mathcal{E}_t$, then it randomly chooses a set $D_t$ of $d$ bins from $H_t$ (without repetition) and allocates itself to a least-loaded bin from $D_t$, with ties broken randomly.

Theorem 2.
Let $(H^{(1)}, \ldots, H^{(n)})$ be a dynamic $s$-uniform hypergraph which satisfies the balancedness, $\varepsilon$-visibility and size properties. Fix $d = d(n)$ such that $2 \le d = o(\log n)$. There exists $m = \Theta(n)$ with $m \le n$ such that after the balanced allocation process on $(H^{(1)}, \ldots, H^{(n)})$ has allocated $m$ balls, the maximum load is $\log_d \log n + O(1/\varepsilon)$ with high probability. Moreover, for every fixed positive integer $\gamma$ with $\gamma m \le n$, after allocating $\gamma m$ balls the maximum load is at most $\gamma(\log_d \log n + O(1/\varepsilon))$, w.h.p.

Remark 3. In our result we only consider the case where $d = o(\log n)$, because when $d = \Omega(\log n)$ a constant upper bound is obtained by [12]. The size property is mainly assumed for technical reasons. For instance, $|\mathcal{E}_t| \le \mathrm{poly}(n)$ is not necessary. Roughly speaking, balanced allocation on a dynamic hypergraph with large $|\mathcal{E}_t|$ resembles the standard balls-into-bins process. So it might be possible that having more structural information about a dynamic hypergraph would enable us to extend our result to allow an arbitrary number of edges $|\mathcal{E}_t|$. Another possible extension of Theorem 2 would be to allow $s$ to be a function of $d$.

The proof of Theorem 2 is based on the witness tree technique (see [1, 11, 16, 20] for example). First, we define a certain structure corresponding to the allocation process and claim that the structure exists with very small probability (i.e., $n^{-O(1)}$). Second, we show deterministically that if the maximum load is higher than a certain threshold, then this structure must exist. Putting these together, we obtain an upper bound for the maximum load, with high probability. This approach is of independent interest and might be applied to the study of random hypergraphs. The proof is given in Section 2.

Finally, in the following theorem we show that $\varepsilon$-visibility can also lead to a lower bound on the maximum load achieved by the balanced allocation process on hypergraphs. This theorem is proved in Appendix A.

Theorem 4.
Let $s = s(n) = n^{\varepsilon}$, where $\varepsilon \in (0, 1)$ is an arbitrary small real number. There exists a dynamic $s$-uniform hypergraph, say $(H^{(1)}, \ldots, H^{(n)})$, which satisfies the balancedness condition and (trivially) satisfies the $\varepsilon$-visibility condition. Let $d \le s$ be any integer which is constant. Suppose that the balanced allocation process on $(H^{(1)}, \ldots, H^{(n)})$ has allocated $n$ balls. Then the maximum load is at least $\min\{\Omega(1/\varepsilon), \Omega(\log n/\log \log n)\}$ with high probability.

Balanced Allocation on Dynamic Graphs

A dynamic graph is a special case of a dynamic hypergraph, where $s = s(n) = 2$ for all $n$. Write $(G^{(1)}, \ldots, G^{(n)})$ to denote a dynamic graph, where $G^{(t)} = ([n], E_t)$ for $t = 1, 2, \ldots, n$. Theorem 2 does not cover the case of graphs ($s = 2$), due to the size property. We will prove a result on balanced allocation for regular dynamic graphs.

Definition 5 (Balanced Allocation on Dynamic Graphs). Suppose that $(G^{(1)}, \ldots, G^{(n)})$ is a regular dynamic graph on vertex set $[n]$. The balanced allocation algorithm on $(G^{(1)}, \ldots, G^{(n)})$ proceeds in rounds ($t = 1, \ldots, n$). In each round $t$, the $t$-th ball chooses an edge of $G^{(t)}$ uniformly at random, and the ball is then placed in the bin incident to the edge with the lesser load, with ties broken randomly.

Say that the dynamic graph is regular if $G^{(t)}$ is $\Delta_t$-regular for some positive integer $\Delta_t$ and all $t = 1, 2, \ldots, n$. For every pair of distinct bins $\{i, j\} \subset [n]$, we will assume that the visibility $vis(i, j)$ satisfies
$$vis(i, j) = |\{ t \in \{1, 2, \ldots, n\} \mid \{i, j\} \in E_t \}| \le n^{1-\varepsilon}$$
for some constant $\varepsilon \in (0, 1)$; we call this property $\varepsilon$-visibility.

Theorem 6.
Let $(G^{(1)}, \ldots, G^{(n)})$ be a regular dynamic graph which satisfies the $\varepsilon$-visibility condition, for some $\varepsilon \in (0, 1)$. Suppose that the balanced allocation process on $(G^{(1)}, \ldots, G^{(n)})$ has allocated $n$ balls. Then the maximum load is at most $\log_2 \log n + O(1/\varepsilon)$, with high probability.

The proof, which can be found in Section 3, is again based on the witness tree technique. We remark that Theorem 6 can be extended to the case where the dynamic graph is almost regular, meaning that the ratio of the maximum to the minimum degree of $G^{(t)}$ is bounded above by an absolute constant for $t = 1, \ldots, n$.

Dynamic Graphs and Hypergraphs with Low Pair Visibility
In order to show the ubiquity of the visibility condition, we describe some dynamic graphs with low pair visibility. One can easily construct a dynamic hypergraph from a dynamic graph by taking the $r$-neighborhood of each vertex of the $t$-th graph as a hyperedge in the $t$-th hypergraph, for $t = 1, \ldots, n$.

• Dynamic Cycle.
For $t = 1, \ldots, n$ define the edge set
$$E_t = \{ \{i, j\} \subset \{0, \ldots, n-1\} \mid j = i + \lceil t/\sqrt{n} \rceil \ (\mathrm{mod}\ n) \text{ or } i = j + \lceil t/\sqrt{n} \rceil \ (\mathrm{mod}\ n) \},$$
where calculations are performed modulo $n$ (that is, in the additive group $\mathbb{Z}_n$). In modular addition, for every pair $\{i, j\} \subset \{0, \ldots, n-1\}$, the equation $i = j + k \ (\mathrm{mod}\ n)$ has at most one solution $1 \le k \le \sqrt{n}$, and hence
$$vis(i, j) = |\{ t \in \{1, 2, \ldots, n\} \mid \{i, j\} \in E_t \}| \le \sqrt{n}.$$
Now $C^{(t)} = (\{0, 1, \ldots, n-1\}, E_t)$ is 2-regular, so it is either a Hamilton cycle or a union of two or more disjoint cycles (depending on whether the step $\lceil t/\sqrt{n} \rceil$ and $n$ are coprime). By Theorem 6, the maximum load attained by the algorithm on $\{C^{(t)}, t = 1, \ldots, n\}$ is at most $\log_2 \log n + O(1)$. The analysis of the balanced allocation algorithm on $\Delta$-regular graphs given by Kenthapadi and Panigrahy [13] showed that the balanced allocation process on arbitrary $\Delta$-regular graphs has maximum load $\Theta(\log \log n)$ only when $\Delta = n^{\Omega(1/\log \log n)}$. By contrast, here each $C^{(t)}$ has degree at most 2, but the visibility condition keeps the maximum load as low as in the standard two-choice process.

Remark 7. By Theorem 6, w.h.p., the balanced allocation process on the dynamic cycle achieves maximum load at most $\log_2 \log n + O(1)$. Since $|E_t| = n$ for $t = 1, \ldots, n$, each ball requires $\log n$ random bits. However, in the standard power-of-two-choices process, each ball chooses two independent random bins, which requires $2 \log n$ random bits. Therefore, the dynamic cycle can be used to halve the number of random bits required by the standard two-choice process.

• Dynamic Modular Hypergraph.
Suppose that n is a prime number and fix s = s ( n ) such thatlog n s n / . (Here n is large enough so that this range is non-empty.) For t = 1 , . . . , n ,let k t = ⌈√ n ⌉ + ⌈ tn / ⌉ and for each α ∈ Z n define H t ( α ) = { α + jk t (mod n ) | j = 0 , , . . . , s − } . Then H t ( α ) is a subset of Z n of size s , as n is prime. Now for each t = 1 , . . . , n we define thedynamic s -uniform hypergraph H ( t ) = ( Z n , E t ), where E t = { H t ( α ) | α ∈ Z n } . Then H ( t ) is s -regular, and hence 1-balanced, and it satisfies the 1-size property as |E t | = n . Suppose that { β , β } ⊂ H t ( α ) for some α ∈ Z n , with β = β . Then there exists j , j ∈ { , . . . , s − } suchthat β = α + j k t (mod n ) and β = α + j k t (mod n ). Thus, β − β = ( j − j ) k t (mod n ).Note that j , j must be distinct as β , β are distinct. Next suppose that k t = k t for some t , t ∈ { , . . . , n } , and take any j , j ∈ { , . . . , s − } . By definition of k t and working in Z ,we see that 1 | j k t − j k t | ( s − (cid:0) ⌈√ n ⌉ + ⌈ n / ⌉ (cid:1) < n, and it follows that j k t = j k t (mod n ) . (1)Finally, suppose that some distinct β , β satisfy { β , β } ⊂ H t ( α ) ∩ H t ( α ) where k t = k t .Then β − β = jk t (mod n ) for some j ∈ { , . . . , s − } , and β − β = j k t (mod n ) forsome j ∈ { , . . . , s − } , but this contradicts (1). Therefore, by definition of k t , for every { β , β } ⊂ Z n , we have vis ( β , β ) = |{ t ∈ { , , . . . , n } | { β , β } ⊂ H t ( α ) for some α ∈ Z n }| O ( n / ) . • Stationary Geometric Mobile Network.
Consider an R -dimensional torus Γ( n, R ), which is agraph whose vertex set is the Cartesian product of Z Rℓ = Z ℓ × . . . × Z ℓ , where ℓ = n /R ∈ Z ,and two vertices ( x , . . . , x R ) and ( y , . . . , y R ) are connected if for some j ∈ { , . . . , R } x j = y j ± n and for all i = j we have x i = y i . Let π be the stationary distributionof the following random walk on Γ( n, R ): at each step, the walker stays at the current vertexwith probability p , and otherwise chooses a neighbour randomly and moves to that neighbour.The transition probability from vertex u to a neighbouring vertex w is (1 − p ) / (2 R ), where 2 R is the degree of vertex u in Γ( n, R ). Now place n agents on vertices of Γ( n, R ) independently,each according to the distribution π . At each time step, each agent independently performs astep of the random walk described above (For random walks on a torus we refer the interestedreader to [15]). For every pair of distinct agents a and b , let d t ( a, b ) denote the Manhattandistance (in Γ) of the locations of a and b at time t . For a given r >
1, we define the communication graph process $\{G_r^{(t)} \mid t = 0, 1, \ldots\}$ over the set of agents, say $A$, so that for every $t \ge 0$, agents $a$ and $b$ are connected if and only if $d_t(a, b) \le r$. The model has been thoroughly studied when $R = 2$ in the context of information spreading [9]. We present the following result regarding the pair visibility of the communication graph process, proved in Appendix B.

Proposition 8. Fix $r = r(n) = n^{o(1)}$. Also let $\{G_r^{(t)} = (A, E_t) \mid t \le n\}$ be the communication graph process defined on an $R$-dimensional torus $\Gamma(n, R)$. Then there exists a constant $\varepsilon > 0$ such that for every pair of agents, say $\{a, b\} \subset A$,
$$vis(a, b) = |\{ t \in \{1, 2, \ldots, n\} \mid \{a, b\} \in E_t \}| = O(n^{1-\varepsilon}).$$

As we discussed, in the standard balls-into-bins process each ball picks a set of $d$ choices from $n$ bins, independently and uniformly at random. One of the first algorithms considering a different distribution over the bins is always-go-left, proposed by Vöcking [20]. In this algorithm, the bins are partitioned into $d$ groups of size $n/d$ and each ball picks one random bin from each group. The ball is then allocated to a least-loaded bin among the chosen bins, with ties broken in favor of the bin from the least-indexed group. The algorithm uses an exponentially smaller number of choices and achieves a maximum load of $\frac{\log \log n}{d \phi_d} + O(1)$, where $1 \le \phi_d \le 2$. In the geometric setting of Byers et al. [5], $n$ bins are placed uniformly at random on a geometric space. Then each ball, in turn, picks $d$ locations in the space. Corresponding to these $d$ locations, the ball probes the load of the $d$ bins that have the minimum distance from the locations. The ball then allocates itself to one of the $d$ bins with minimum load. In this scenario, the probability that a location close to a bin is chosen depends on the distribution of the other bins in the space, and hence there is not a uniform distribution over the potential choices.
Here, the authors showed that the maximum load is $\log_d \log n + O(1)$. Later on, Kenthapadi and Panigrahy [13] proposed a graphical balanced allocation in which bins are interconnected as an $s$-regular graph and each ball picks a random edge of the graph. It is then placed in the endpoint with the smaller load. This allocation algorithm results in a maximum load of $\log \log n + O\left(\frac{\log n}{\log(s/\log^4 n)}\right) + O(1)$. Godfrey [12] studied balanced allocation on hypergraphs where each ball probes the bins contained in a random edge of size $\Omega(\log n)$. In [3, 12], the balanced allocation process on hypergraphs was studied where the number of choices is $d = \Omega(\log n)$. The analysis involves the second moment method (Chernoff bounds), and the lower bound on $d$ is needed in order to achieve concentration. Hence it is unlikely that the techniques of [3, 12] can be extended to the range $d = o(\log n)$. Peres et al. [18] also considered balanced allocation on graphs where the number of balls $m$ can be much larger than $n$ (i.e., $m \gg n$) and the graph is not necessarily regular and dense. They established an upper bound of $O(\log n/\sigma)$ for the gap between the maximum and the minimum loaded bin after allocating $m$ balls, where $\sigma$ is the edge expansion of the graph. Bogdan et al. [4] studied a model where each ball picks a random vertex and performs a local search from the vertex to find a vertex with locally minimum load, where it is finally placed. They showed that when the graph is a constant-degree expander, the local search guarantees a maximum load of $\Theta(\log \log n)$. Pourmiri [19] replaced the local search by non-backtracking random walks of length $\ell = o(\log n)$ to sample the choices, after which the ball is allocated to a least-loaded bin. Provided the underlying graph has sufficiently large girth and $\ell$, he showed that the maximum load is a constant. In the context of hashing (e.g., [1, 11]), the authors apply witness graph techniques to analyze the maximum load in the balls-into-bins process where the bins are picked based on tabulation.

In this section we establish an upper bound for the maximum load attained by balanced allocation on hypergraphs (i.e., Theorem 2). In order to analyze the process, let us first define a conflict graph. We write $D_t$ for the set of $d$ bins chosen by the $t$-th ball, and sometimes refer to $D_t$ as the $d$-choice of the $t$-th ball. We will slightly abuse notation and write $D_u \cap D_t$ and $D_u \cup D_t$ to denote the set of common bins, and the union of bins, chosen by balls $u$ and $t$, respectively.

Definition 9 (Conflict Graph). For $m = 1, \ldots, n$, the conflict graph $C_m$ is a simple graph with vertex set $\{D_1, D_2, \ldots, D_m\}$. Vertices $D_u$ and $D_t$ are connected by an edge in $C_m$ if and only if $D_u \cap D_t \neq \emptyset$ (that is, the $d$-choices of the $t$-th ball and the $u$-th ball contain a common bin). We say a subgraph of $C_m$ with vertex set $\{D_{t_1}, \ldots, D_{t_k}\}$ is $c$-loaded if every bin in $D_{t_1} \cup D_{t_2} \cup \cdots \cup D_{t_k}$ has at least $c$ balls.

Our analysis will involve a useful combinatorial object, called an ordered tree. An ordered tree is a rooted tree, together with a specified ordering of the children of every vertex. Recall that $\frac{1}{k+1}\binom{2k}{k}$ is the $k$-th Catalan number, which counts numerous combinatorial objects, including the number of ways to form $k$ balanced pairs of parentheses. It is well known [17] that ordered trees with $k - 1$ edges are counted by the $(k-1)$-th Catalan number.

Proposition 10.
The number of $k$-vertex ordered trees is $\frac{1}{k}\binom{2k-2}{k-1} \le 4^{k-1}$.

More information regarding the enumeration of trees can be found in [14]. The following blue-red coloring will be very helpful in our analysis.
Definition 11 (Blue-red coloring) . Given m ∈ { , , . . . , n } , suppose that T ⊂ C m is a rooted andordered k -vertex tree contained in C m . Let the vertex set of T be { D t . . . , D t k } , where D t isthe root. Perform depth-first search starting from the root, respecting the specified order of eachvertex. For i = 1 , . . . , k , let D ( i ) ∈ { D t . . . , D t k } be the vertex which is the i -th visited vertex inthe depth-first search. Then D (1) = D t is the root. for j = 1 , . . . , k . We now define a blue-redcoloring col : { D (2) , . . . , D ( k ) } → { blue, red } as follows. For i = 2 , . . . , k , col ( D ( i )) = ( blue if | ( ∪ i − j =1 D ( j )) ∩ D ( i ) | = 1 , red if | ( ∪ i − j =1 D ( j )) ∩ D ( i ) | > . The following key lemma presents a upper bound for the probability that a certain tree can befound as a subgraph of C m . Lemma 12 (Key Lemma) . Let ( H (1) , . . . , H ( n ) ) be a dynamic s -uniform hypergraph which satisfiesthe β -balanced, ε -visibility and c -size properties. Suppose that c > is an arbitrary constant and k = C log n for some constant C > . There exists Θ( n ) m n such that the probability that C m contains a c -loaded k -vertex tree with r red vertices in its blue-red colouring is at most n c +3 exp { k log(2 βd ) − rε log( n ) / − c ( d − k − r − } . Moreover, with high probability, r = O (1 /ε ) . The proof, presented in Appendix C, involves an extension of the witness tree technique. Thismethod might be of independent interest in the study of random hypergraphs.We now explain how to recursively build a witness graph if there exists a bin whose load ishigher than a certain threshold. The minimum load of D t is the number of balls in the least-loadedbin in D t (the set of d choices of D t ). Clearly, if ball t is placed at height h then D t has minimumload at least h . Construction of the Witness Graph
Suppose that there exists a bin with load $\ell + c + 1$. Let $R$ be the $d$-choice corresponding to the ball at height $\ell + c$ in this bin. Then the minimum load of $R$ is at least $\ell + c$. We start building the witness tree in $C_m$ whose root is $R$. For every bin $i \in R$, consider the $\ell$ balls in bin $i$ at height $\ell + c - j$, for $j = 1, \ldots, \ell$, and let $D^{i}_{t_j}$ be the $d$-choice corresponding to the ball in bin $i$ with height $\ell + c - j$. These $\ell$ balls exist as the minimum load of $R$ is at least $\ell + c$. We refer to the set $\{D^{i}_{t_j} \mid i \in R,\ 1 \le j \le \ell\}$ as the set of children of $R$, where the minimum load of $D^{i}_{t_j}$ is at least $\ell + c - j - 1$. All children of $R$ are connected to $R$ in $C_m$. Order the children of $R$ arbitrarily, then blue-red colour the first level of the tree (the children of $R$). Recall that a vertex is colored blue if it shares only one bin with its predecessors in the ordering. So a blue $d$-choice contains $d - 1$ bins which do not appear in any of the preceding $d$-choices (with respect to depth-first search, respecting the fixed ordering). We call these $d - 1$ bins fresh.

Next, consider each blue vertex of the tree (if any), and recover the $d$-choices corresponding to balls that are placed in fresh bins with height at least $c$. Then, blue-red color the children of those $d$-choices, with respect to an arbitrary ordering. This recursion continues until either there are no balls remaining with height at least $c$, or there are no blue vertices. For $j = 1, \ldots, \ell$, let $f(\ell - j)$ denote the number of $d$-choices that the recursive construction gives, when the $d$-choice for the root has minimum load $\ell + c - j - 1$. Provided all vertices are colored blue, the recursive construction continues until no ball remains with height at least $c$. Therefore, a simple calculation shows that
$$f(\ell) \ge (d - 1)\bigl(f(\ell - 1) + f(\ell - 2) + \cdots + f(0) + 1\bigr),$$
where $f(0) = 1$. Solving this recurrence (with equality it gives $f(1) = 2(d-1)$ and $f(\ell) = d \cdot f(\ell - 1)$ for $\ell \ge 2$) shows that
$$f(\ell) \ge 2(d - 1)\, d^{\ell - 1} \ge d^{\ell}.$$

Proof of Theorem 2.
Let ( H (1) , . . . , H ( n ) ) be a dynamic hypergraph which satisfies the β -balanced, ε -visibility and c -size properties. By Lemma 12, there exists Θ( n ) = m n such that the followingholds with high probability: after m balls have been allocated by the balanced allocation process,if T ⊆ C m is a c -loaded tree with k vertices and T is blue-red coloured according to some arbitraryordering of the children of each vertex, then the number r of red vertices satisfies r = O (1 /ε ). Sowe are able to find a constant c > r < c · d .Now suppose that after allocating m balls, there is a ball at height ℓ + c + c + 1. This impliesthat there is a d -choice, denoted by R , whose minimum load is at least ℓ + c + c +1. Let us considerall balls placed in the bins contained in R with height at least ℓ + c + 1. Recover the corresponding d -choices for these balls, say D , D , . . . , D w , then colour them blue-red with respect to the root R and an arbitrary ordering of the children of each vertex. Since w > c · d , w.h.p., there are b > w − b red vertices. We now consider every blue vertex D t ∈ { D , D , . . . , D w } asa root and start the recursive construction of the witness graph. Assuming that the number of redvertices is strictly less than c · d < w , it follows that at least one recursive construction (with root D i ) does not produce any red vertex. Moreover, the recursion from D i gives a c -loaded tree withat least k = d ℓ vertices. We take ℓ = log d log n , so that k = log n . Another application of Lemma 12implies that a c -loaded k -vertex tree with no red vertices exists with probability at most n c +3 exp { k log(2 βd ) − c ( d − k − } exp (cid:8)(cid:0) c + 4 + 4 log(2 βd ) − c ( d − (cid:1) log n (cid:9) exp (cid:8)(cid:0) c + 4 + 4 log(4 β ) − c (cid:1) log n (cid:9) , using the fact that 2 d = o (log n ) and k = log n . 
Setting c to be a large enough positive constant,we conclude that with high probability the maximum load is at mostlog d log n + O (1) + c = log d log n + O (1 /ε ) , where c = O (1 /ε ). This proves the first statement of Theorem 2. The proof of the second statementis presented in Appendix D. In this section we show an upper bound for maximum load attained by the balanced allocationon regular dynamic graphs (i.e., Theorem 6). Suppose that the balanced allocation process has8llocated n balls to the dynamic regular graph ( G (1) , . . . , G ( n ) ). Define the conflict graph C n formedby the edges selected by the n balls. The vertex set of C n is the set [ n ] of bins, and the loads ofthese bins are updated during the process.Given a tree T which is a subgraph of C n , and vertices u , v of the tree, if { u, v } is an edge of C n then we say it is a cycle-producing edge with respect to the tree T . The name arises as addingthis edge to the tree would produce a cycle, which may be a 2-cycle if the edge { u, v } is alreadypresent in T . For a positive integer c >
0, a subgraph of $C_n$ is called $c$-loaded if each vertex (bin) contained in the subgraph has load at least $c$. The following proposition presents some properties of connected components of $C_n$.

Proposition 13.
Let $(G^{(1)}, \ldots, G^{(n)})$ be a regular dynamic graph on vertex set $[n]$ which is $\varepsilon$-visible. Let $C_n$ be the conflict graph obtained after allocating $n$ balls using the balanced allocation process. Then for every given constant $c > 0$, with probability at least $1 - n^{-c}$, every $12(c + 1)$-loaded connected component of $C_n$ contains strictly fewer than $\log n$ vertices. Moreover, the number of cycle-producing edges in the component is at most $2(c + 1)/\varepsilon$.

We will prove the proposition in Appendix F. We now explain how to recursively build a witness graph, provided there exists a bin whose load is higher than a certain threshold.
Construction of the Witness Graph
Let us start with a bin, say $r$, with $\ell + c$ balls. Clearly, if a ball is in bin $r$ at height $h$ then the other bin it chose, as part of the balanced allocation procedure, had load at least $h$. Starting from bin (vertex) $r$, let us recover all $\ell$ edges corresponding to the balls that were placed in $r$ with height at least $c$. Thus, the alternative bin choices have loads at least $\ell + c - 1, \ldots, c$, respectively. These $\ell$ bins are all neighbours of $r$ in $C_n$, and we refer to them as the children of $r$. Next, we recover the edges corresponding to balls placed in the children of $r$ at height at least $c$. Recursively, we continue until there is no ball remaining at height $c$ or more. For every $i = 1, \ldots, \ell$, let $f(\ell - i)$ denote the number of vertices generated by the recursive construction, starting with a bin which contains $\ell - i + c$ balls. Assume for the moment that, for each vertex with load at least $c$, the recursive procedure always produces distinct children. Then
$$f(\ell) \ge f(\ell - 1) + f(\ell - 2) + \cdots + f(0) + 1,$$
where $f(0) = 1$. A simple induction shows that $f(\ell) \ge 2^{\ell}$, since $1 + \sum_{i=0}^{\ell-1} 2^{i} = 2^{\ell}$. Thus, the recursive procedure gives a $c$-loaded tree with at least $2^{\ell}$ vertices, under the assumption that the children of each vertex considered by the recursion are all distinct.

We may now prove our main result on dynamic regular graphs.

Proof of Theorem 6.
We want to show that after $n$ balls have been allocated to the dynamic regular graph $(G^{(1)}, \ldots, G^{(n)})$, which satisfies the $\varepsilon$-visibility property, the maximum load is at most $\log_2 \log n + O(1/\varepsilon)$ with high probability.

Let $c > 0$ be a constant. By Proposition 13, with probability at least $1 - n^{-c}$, the number of cycle-producing edges in a given component of $C_n$ is at most $c_1 = 2(c + 1)/\varepsilon$. For a contradiction, suppose that there exists a bin, say $r$, which has at least $\ell + c_1 + c_2 + 1$ balls, where $c_2 = 12(c + 1)$. Consider $c_1 + 1$ balls in $r$ at height at least $\ell + c_2$. The children of $r$ in $C_n$ are the bins $r_1, r_2, \ldots, r_{c_1 + 1}$ (which might not be distinct), which were the alternative choices of these $c_1 + 1$ balls. Each of these children $r_i$ has load at least $\ell + c_2$. We start the recursive construction at each child $r_i$ of $r$. Assuming that this component of $C_n$ contains at most $c_1$ cycle-producing edges, it follows that for at least one child $r_i$ of $r$, the recursive procedure gives distinct children for each vertex which is a descendant of $r_i$. Hence we obtain a $c_2$-loaded tree which has $2^{\ell}$ vertices. Substituting $\ell = \log_2 \log n$ and applying the first statement of Proposition 13, we conclude that with probability at least $1 - n^{-c}$ such a structure does not exist in $C_n$. This contradiction shows that with high probability, the maximum load after $n$ balls have been allocated is at most $\log_2 \log n + O(1/\varepsilon)$.

References

[1] Anders Aamand, Mathias Bæk Tejs Knudsen, and Mikkel Thorup. Power of d choices with simple tabulation. In Ioannis Chatzigiannakis, Christos Kaklamanis, Dániel Marx, and Donald Sannella, editors, ICALP 2018, volume 107 of LIPIcs, pages 5:1–5:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.ICALP.2018.5.
[2] Yossi Azar, Andrei Z. Broder, Anna R. Karlin, and Eli Upfal. Balanced allocations.
SIAM J. Comput., 29(1):180–200, 1999.
[3] Petra Berenbrink, André Brinkmann, Tom Friedetzky, and Lars Nagel. Balls into bins with related random choices. J. Parallel Distrib. Comput., 72(2):246–253, 2012.
[4] Paul Bogdan, Thomas Sauerwald, Alexandre Stauffer, and He Sun. Balls into bins via local search. In Proc. 24th Symp. Discrete Algorithms (SODA), pages 16–34, 2013.
[5] John W. Byers, Jeffrey Considine, and Michael Mitzenmacher. Geometric generalizations of the power of two choices. In Proc. 16th Symp. Parallelism in Algorithms and Architectures (SPAA), pages 54–63, 2004.
[6] L. Elisa Celis, Omer Reingold, Gil Segev, and Udi Wieder. Balls and bins: Smaller hash families and faster evaluation. SIAM J. Comput., 42(3):1030–1050, 2013. doi:10.1137/120871626.
[7] Xue Chen. Derandomized balanced allocation. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, pages 2513–2526, 2019. doi:10.1137/1.9781611975482.154.
[8] Kai-Min Chung, Henry Lam, Zhenming Liu, and Michael Mitzenmacher. Chernoff–Hoeffding bounds for Markov chains: Generalized and simplified. Pages 124–135, 2012. doi:10.4230/LIPIcs.STACS.2012.124.
[9] Andrea E. F. Clementi, Angelo Monti, Francesco Pasquale, and Riccardo Silvestri. Information spreading in stationary Markovian evolving graphs. IEEE Trans. Parallel Distrib. Syst., 22(9):1425–1432, 2011. doi:10.1109/TPDS.2011.33.
[10] Xavier Dahan. Regular graphs of large girth and arbitrary degree. Combinatorica, 34(4):407–426, 2014.
[11] Søren Dahlgaard, Mathias Bæk Tejs Knudsen, Eva Rotenberg, and Mikkel Thorup. The power of two choices with simple tabulation. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '16, pages 1631–1642, USA, 2016. Society for Industrial and Applied Mathematics.
[12] Brighten Godfrey. Balls and bins with structure: balanced allocations on hypergraphs. In Proc. 19th Symp. Discrete Algorithms (SODA), pages 511–517, 2008.
[13] Krishnaram Kenthapadi and Rina Panigrahy. Balanced allocation on graphs. In Proc. 17th Symp. Discrete Algorithms (SODA), pages 434–443, 2006.
[14] Donald Knuth. The Art of Computer Programming, Vol. 1: Fundamental Algorithms. Addison-Wesley, third edition, 1997.
[15] David A. Levin, Yuval Peres, and Elizabeth L. Wilmer. Markov Chains and Mixing Times. American Mathematical Society, 2006.
[16] Michael Mitzenmacher, Andréa W. Richa, and Ramesh Sitaraman. The power of two random choices: A survey of techniques and results. In Handbook of Randomized Computing, pages 255–312. Kluwer, 2000.
[17] OEIS Foundation Inc. (2020). The On-Line Encyclopedia of Integer Sequences. http://oeis.org/A000108.
[18] Yuval Peres, Kunal Talwar, and Udi Wieder. Graphical balanced allocations and the (1 + β)-choice process. Random Struct. Algorithms, doi:10.1002/rsa.20558, 2014.
[19] Ali Pourmiri. Balanced allocation on graphs: A random walk approach. Random Struct. Algorithms, 55(4):980–1009, 2019. doi:10.1002/rsa.20875.
[20] Berthold Vöcking. How asymmetry helps load balancing. J. ACM, 50(4):568–589, 2003.
[21] Udi Wieder. Hashing, load balancing and multiple choice. Foundations and Trends in Theoretical Computer Science, 12(3-4):275–379, 2017. doi:10.1561/0400000070.

A Proof of Theorem 4
Proof.
Let $G = ([n], E)$ denote an $s$-regular graph that does not contain any 4-cycle, where $s = n^{\varepsilon}$. It is worth mentioning that there are several explicit families of $s$-regular graphs with girth $\log_s n$ (e.g., see [10]). For each $i \in [n]$, let $N(i)$ be the set of vertices adjacent to $i$. Also, let $H = ([n], \{N(i) : i = 1, \dots, n\})$ denote the hypergraph obtained from $G$. We consider the $s$-uniform dynamic hypergraph $(H, H, \dots, H)$. Clearly, for every $\{i, j\} \subset [n]$ we have that
\[ \mathrm{vis}(i, j) \leq \frac{n}{s} = n^{1-\varepsilon}. \]
Therefore, the dynamic hypergraph is $\varepsilon$-visible. Fix an integer $d$ such that $2 \leq d \leq s$ and $d$ is constant. Since $G$ does not contain any 4-cycle, we deduce that every $d$-subset of vertices appears in at most one hyperedge of $H$. Therefore, the probability that a given $d$-subset is chosen by any ball is $1/(n\binom{s}{d})$. Let $D = \{i_1, i_2, \dots, i_d\} \subset [n]$ be an arbitrary set of $d$ vertices contained in some hyperedge of $H$. Let $X(D, k)$ be an indicator random variable taking value one if at least $k$ balls choose $D$ and zero otherwise. Then we have that
\[ \Pr[X(D, k) = 1] = \binom{n}{k} \left( \frac{1}{n\binom{s}{d}} \right)^{k}. \]
Also let $Y_k = \sum_D X(D, k)$ denote the number of $d$-subsets that are chosen by at least $k$ balls. By linearity of expectation we have that
\[ E[Y_k] = \sum_D E[X(D, k)] = n\binom{s}{d}\binom{n}{k} \left( \frac{1}{n\binom{s}{d}} \right)^{k} \geq n \left( \frac{s^{-d}}{k} \right)^{k} = n \left( \frac{n^{-d\varepsilon}}{k} \right)^{k}, \qquad (2) \]
where the inequality follows from $\binom{n}{k} \geq (n/k)^k$ and $\binom{s}{d} < s^d$. In what follows we show that with high probability there exists $k$ such that $Y_k \geq 1$.

Suppose that $d\varepsilon = \Theta(1)$. Then, setting $k = 1$, there is a $d$-subset which is picked by at least one ball and hence $Y_1 \geq 1$. If $(\log\log n)/(3\log n) < d\varepsilon$ and $d\varepsilon = o(1)$, then by setting $k = 1/(6d\varepsilon)$ we have $k < (\log n)/(2\log\log n) < \log n$ and
\[ E[Y_k] \geq n k^{-k} n^{-kd\varepsilon} \geq n (\log n)^{-\log n/(2\log\log n)} n^{-1/6} = n^{1/3} = \omega(\log n). \]
Moreover, if $d\varepsilon \leq \log\log n/(3\log n)$, then by letting $k = \log n/(2\log\log n)$ we get that
\[ E[Y_k] \geq n k^{-k} n^{-kd\varepsilon} \geq n (\log n)^{-\log n/(2\log\log n)} n^{-1/6} = n^{1/3} = \omega(\log n). \]
Therefore, there exists $k = \min\{\Omega(1/\varepsilon), \Omega(\log n/\log\log n)\}$ so that $E[Y_k] = \omega(\log n)$. As the number of balls is $n$, it is easy to observe that for a given $k$, the random variables $X(D, k)$ are negatively correlated. Applying the Chernoff bound for negatively correlated random variables implies that
\[ \Pr[Y_k \leq E[Y_k]/2] \leq \exp(-E[Y_k]/8) = \exp(-\omega(\log n)). \]
It follows that there exists a $d$-subset $D$ which is chosen by at least $k$ balls and hence there is at least one bin in $D$ whose load is at least $k/d$.

B Proof of Proposition 8
In this section we prove Proposition 8. First we restate a useful theorem from [8].
Theorem 14. [8, Theorem 3]
Let $M$ be an ergodic Markov chain with finite state space $\Omega$ and stationary distribution $\pi$. Let $T = T(\varepsilon)$ be its $\varepsilon$-mixing time for $\varepsilon < 1/8$. Let $(Z_1, \dots, Z_t)$ denote a $t$-step random walk on $M$ starting from an initial distribution $\rho$ on $\Omega$ (that is, $Z_1$ is distributed according to $\rho$). For some positive constant $\mu$ and every $i \in [t]$, let $f_i : \Omega \to [0, 1]$ be a weight function at step $i$ such that the expected weight $E_\pi[f_i(v)] = \sum_{v \in \Omega} \pi(v) f_i(v)$ satisfies $E_\pi[f_i(v)] = \mu$ for all $i$. Define the total weight of the walk $(Z_1, \dots, Z_t)$ by $X = \sum_{i=1}^{t} f_i(Z_i)$. Write $\|\rho\|_\pi = \sqrt{\sum_{x \in \Omega} \rho_x^2/\pi_x}$. Then there exists some positive constant $c$ (independent of $\mu$ and $\varepsilon$) such that for all $\alpha > 0$,
1. $\Pr[X \geq (1 + \alpha)\mu t] \leq c \|\rho\|_\pi e^{-\alpha^2 \mu t/(72T)}$ for $\alpha \leq 1$.
2. $\Pr[X \geq (1 + \alpha)\mu t] \leq c \|\rho\|_\pi e^{-\alpha \mu t/(72T)}$ for $\alpha > 1$.
3. $\Pr[X \leq (1 - \alpha)\mu t] \leq c \|\rho\|_\pi e^{-\alpha^2 \mu t/(72T)}$ for $\alpha \leq 1$.

Proof of Proposition 8. Let $\Omega$ be the vertex set of the $R$-dimensional torus $\Gamma(n, R)$ and let $a$ and $b$ denote two arbitrary agents. By definition of the communication graph process, agents $a$ and $b$ are initially placed on two randomly chosen vertices of $\Gamma$, say $u_0$ and $v_0$. Note that $u_0$ and $v_0$ are independently chosen according to the stationary distribution $\pi$ of the random walk on $\Gamma(n, R)$. Now consider the trajectories of agents $a$ and $b$, which give two independent random walks $u_0, u_1, \dots$ and $v_0, v_1, \dots$ on $\Gamma(n, R)$. Defining $X_t = (u_t, v_t)$ for $t = 0, 1, \dots$ gives a finite, ergodic Markov chain with stationary distribution $(\pi, \pi)$ on $\Omega \times \Omega$. For every t ≥
0, define f ( X t ) = f ( u t , v t ) = ( d ( u t , v t ) r ,0 otherwise.where d ( · , · ) is the Manhattan distance for the given grid. Let u t and v t denote the projection ofthe random walks u t and v t onto the 1-dimensional torus Γ( n /R , n, R ). Then X t = ( u t , v t ) is an ergodicMarkov chain on Γ( n /R , f ( u t , v t ) = ( d ( u t , v t ) r ,0 otherwise.By the Manhattan distance property, if f ( u t , v t ) = 1 then f ( u t , v t ) = 1. Therefore, vis ( a, b ) = n X t =0 f ( X t ) n X t =0 f ( X t ) . Set δ = min { / , /R } . Let t be the first time when d ( u t , v t ) n δ . Consider a moving window W of length 2 n δ + 1, which contains the locations of u t and v t . At time t , the vertices covered by W are labelled in increasing order, with the leftmost vertex labelled − n δ and the rightmost vertexlabelled n δ −
1. The window W stays at its initial location as long as no agent hits a border of W (vertices labelled − n δ or n δ ), or the middle vertex of W (labelled 0). Let b be the first agent thathits a border or the centre of W . From this time on, b and W are coupled so that they both moveand/or stay, simultaneously. (If b moves left then W also moves left, for example.) Each time thewindow W moves, a vertex u ∈ Γ is no longer covered by W and a new vertex, w ∈ Γ , becomescovered by w . The new vertex w is assigned the label of vertex u . This process always labels thevertices covered by W by {− n δ − , . . . , n δ − } , and the movement of agent b over these labeledvertices simulates a random walk on the additive group Z n δ +1 . Define S = { t n | u t and v t ∈ W } . Assume that S = ∅ and define the chain Y t = ( u t , v t ) , t ∈ S . Then Y t can be considered asan ergodic Markov chain of length | S | n over Z n δ − , or equivalently, as a Markov chain on a(2 n δ + 1)-cycle. By the proposition assumption we have r = O ( n o (1) ) < n δ , and so vis ( a, b ) = n X t =0 f ( X t ) n X t =0 f ( X t ) X t ∈ S f ( Y t ) n X t =0 f ( Y t ) . The chain Y t converges to stationary distribution ( π, π ), where π is the uniform distribution of arandom walk on a (2 n δ + 1)-cycle. It follows that for all t = 0 , , . . . we have E ( π,π ) [ f ( Y t )] = µ =Θ( r/n δ ), independently of t . It is well-known [15] that the ε -mixing time of the random walk on a(2 n δ +1)-cycle is O ( n δ log(1 /ε )). If ρ is the initial distribution Y , then we have that || ρ || π O ( n δ ).Applying Theorem 14 implies that Pr " n X t =1 f ( Y t ) > µ · n = O ( n δ )e − Θ( rn − δ ) = n − ω (1) . Therefore, with probability 1 − n − ω (1) , vis ( a, b ) n X t =0 f ( Y t ) = O ( rn − δ ) = O ( n − δ + o (1) ) = O ( n − ε ) , taking ε = δ/
2, say. Taking the union bound over all pairs of agents completes the proof.
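The quantity bounded in this proof can also be explored numerically. The following Python sketch is an illustration under stated assumptions, not the construction used in the proof: it places two agents uniformly at random on a torus with `dims` coordinates of `side` points each, moves each agent by a lazy nearest-neighbour step, and counts the rounds in which their Manhattan distance is at most `r`. The laziness of the walk and the name `pair_visibility` are assumptions introduced here for illustration.

```python
import random


def pair_visibility(side, dims, r, rounds, rng):
    """Count rounds t in {0, ..., rounds} in which two independent walkers
    on a dims-dimensional torus with `side` points per coordinate are
    within Manhattan distance r of each other."""
    # Uniform starting points (the stationary distribution on a torus).
    u = [rng.randrange(side) for _ in range(dims)]
    v = [rng.randrange(side) for _ in range(dims)]

    def step(p):
        # Assumed lazy walk: stay put with probability 1/2, otherwise
        # move by +1 or -1 along one uniformly chosen coordinate.
        if rng.random() < 0.5:
            return
        axis = rng.randrange(dims)
        p[axis] = (p[axis] + rng.choice((-1, 1))) % side

    def torus_distance(a, b):
        # Manhattan distance with wrap-around in every coordinate.
        return sum(min((x - y) % side, (y - x) % side) for x, y in zip(a, b))

    visible = 0
    for _ in range(rounds + 1):
        if torus_distance(u, v) <= r:
            visible += 1
        step(u)
        step(v)
    return visible


if __name__ == "__main__":
    rng = random.Random(2020)
    print(pair_visibility(side=32, dims=2, r=2, rounds=1024, rng=rng))
```

Averaging the returned count over many independent runs gives an empirical estimate of the visibility of a fixed pair of agents, which can be compared against the bound $O(n^{1-\varepsilon})$ for small instances.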
C Appearance Probability of a Certain Structure
In this section we work towards a proof of Lemma 12. First we give a useful definition and prove some helpful results. The definition was introduced in [19].
Definition 15. Suppose that $\mathcal{A}$ is an allocation algorithm that sequentially allocates $n$ balls into $n$ bins according to some mechanism. For a given constant $\alpha > 0$, and for $m = \Theta(n)$ with $m \leq n$, we say that $\mathcal{A}$ is $(\alpha, m)$-uniform if for every ball $1 \leq t \leq m$ and every bin $i \in [n]$,
\[ \Pr[\text{ball } t \text{ is allocated to bin } i \text{ by } \mathcal{A} \mid \text{balls } 1, 2, \dots, t-1 \text{ have been allocated by } \mathcal{A}] \leq \frac{\alpha}{n}. \]
In the above definition, we condition on the allocations of balls $1, \dots, t-1$ made by $\mathcal{A}$.

The following result, proved in Appendix E, states that the balanced allocation process is uniform on dynamic hypergraphs.

Lemma 16 (Uniformity Lemma). Fix $d = d(n)$ with $d = o(\log n)$ and suppose that for some constant $\beta \geq 1$, the $s$-uniform dynamic hypergraph $(H^{(1)}, \dots, H^{(n)})$ satisfies the $\beta$-balanced and size properties, with $d \leq s$. Then there exists a constant $\alpha = \alpha(\beta)$, which depends only on $\beta$, and there exists $m = \Theta(n)$ with $m < n$, such that the balanced allocation process on $(H^{(1)}, \dots, H^{(n)})$ is $(\alpha, m)$-uniform. Specifically, we may take $\alpha = 44\beta$.

We are ready to prove Lemma 12.
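Definition 15 can be checked exactly on small instances. The sketch below is a hypothetical helper (the name `allocation_probabilities` and the toy hypergraph are not from the paper). Given the current loads and the edge set used in the next round, it computes the exact probability that the next ball is allocated to each bin, assuming the ball picks an edge uniformly at random, then a $d$-subset of that edge uniformly at random, and is placed in a least-loaded bin of the subset with ties broken uniformly. Multiplying the largest probability by $n$ gives the smallest $\alpha$ that works for that round and that history.

```python
from fractions import Fraction
from itertools import combinations


def allocation_probabilities(loads, edges, d):
    """Exact probability that the next ball lands in each bin.

    loads: current load of every bin (list of ints, indexed by bin).
    edges: hyperedges available in this round (iterables of bin indices).
    d:     number of bins sampled from the chosen edge.
    """
    probs = [Fraction(0)] * len(loads)
    for edge in edges:
        subsets = list(combinations(sorted(edge), d))
        # Probability of picking this edge and then this particular d-subset.
        weight = Fraction(1, len(edges) * len(subsets))
        for subset in subsets:
            lowest = min(loads[i] for i in subset)
            winners = [i for i in subset if loads[i] == lowest]
            # Ties among least-loaded bins are broken uniformly at random.
            for i in winners:
                probs[i] += weight / len(winners)
    return probs


if __name__ == "__main__":
    loads = [1, 0, 0, 0]
    edges = [(0, 1, 2), (1, 2, 3)]
    probs = allocation_probabilities(loads, edges, d=2)
    alpha = len(loads) * max(probs)  # smallest alpha valid for this history
    print(probs, alpha)
```

Exact rational arithmetic is used so that the probabilities sum to one without rounding error, which makes the check against the bound $\alpha/n$ unambiguous.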
Lemma 17 (Restatement of Lemma 12) . Fix d = d ( n ) with d = o (log n ) . Let ( H (1) , . . . , H ( n ) ) be a dynamic hypergraph which satisfies the β -balanced, ε -visibility and c -size properties. Supposethat c > β e is a sufficiently large constant, and let k = C log n for some constant C > . Thereexists Θ( n ) m n such that the probability that C m contains a c -loaded k -vertex tree is at most exp n k log(2 βd ) − c ( d − k − r −
1) + (cid:0) c + 3 − rε/ (cid:1) log( n ) o where r is the number of red vertices in the blue-red coloring of the tree. Moreover, with highprobability, if C m contains any such tree then r = O (1 /ε ) .Proof. Fix m = m ( n ) to equal the m provided by Lemma 16. There are at most 4 k ordered treeswith k vertices. (Proposition 10). Fix such a tree, say T , and label the vertices { , , . . . , k } suchthat vertex i is the i -th new vertex visited when performing depth-first search in T starting fromthe root, and respecting the given ordering. In particular, the root of T is vertex 1. Next, we willassign a d -choice to the root vertex of T , as a first step in describing trees which may be presentin the witness graph C m . Let x count the number of possible d -choices that can be assigned to theroot of T . Then x (cid:18) sd (cid:19) · (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) m [ t =1 E t (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) · m (cid:18) sd (cid:19) · n c +2 , where the last inequality follows from the size property and the inequality m n . Therefore,there are x possibilities for the root and hence there are at most 4 k · (cid:0) sd (cid:1) · n c +2 ordered trees withthe specified root. Fix an arbitrary d -choice D t as the root for T .Next we fix an arbitrary function col : { , . . . , k } → { blue , red } , that gives a blue-red coloringof 2 , . . . , k . In what follows we establish an upper bound for the probability that C m contains theblue-red colored tree T ⊂ C m , (according to Definition 11). Let q ( t ) be the probability that the t -th ball chooses the root of T (that is, that the d -choice made by the t -th ball corresponds to theroot of T ). Then m X t =1 q ( t ) m X t =1 (cid:0) sd (cid:1) n (cid:0) sd (cid:1) , (3)14ecause H contains (cid:0) sd (cid:1) distinct d -element sets for for every H ∈ E t . For every t = 2 , . . . 
, k , define q i ( t, col( i )) to be the probability that the t -th ball chooses the i -th vertex of the tree (i.e., i ) withcol( i ). If col( i ) is red then D t must share at least two bins with ∪ i − j =1 D t j , while if col( i ) is blue then D t only shares one bin with its parent. For every i = 2 , . . . , k , let us derive an upper bound on q i ( t, blue). Here, the i -th vertex share one bin with its parent in T , say D t j . Now D t j has d binsand by the balancedness property we get Pr (cid:2) D t j ∩ H t = ∅ (cid:3) X i ∈ D tj Pr [ i ∈ H t ] βdsn , where H t is the edge chosen by ball t from H ( t ) , uniformly at random. Suppose that for some a > | D t j ∩ H t | = a d . Then the total number of d -element subsets of H t which share onlyone bin with D t j is a (cid:0) s − ad − (cid:1) d (cid:0) s − d − (cid:1) . Thus, we get m X t =1 q i ( t, blue) m X t =1 βdsn · d (cid:0) s − d − (cid:1)(cid:0) sd (cid:1) = m X t =1 βd n βd , (4)because m n .Next, for every i = 2 , . . . , k , and every t = 2 , . . . , m , we need an upper bound on q i ( t, red). Ifthe i -th vertex of the tree is the set D t and is coloured red, then D t is a d -element set of bins whichshares at least two bins with ∪ i − j =1 D t j . One of these bins belongs to the (known) parent, and theother belongs to D t . . . , D t i − . So if U is the number of choices for this pair of bins, then U d · ( i − d kd . (5)Let { p , p . . . , p U } be the set of such pairs of bins. For J = 1 , . . . , U , write A ( p J , t ) for the eventthat the pair p J is contained in a randomly chosen edge of E t . Observe that if p J ⊂ D t then A ( p J , t )holds. 
Then, by the balancedness property we have Pr [ p J ⊂ D t ]= Pr [ p J ⊂ D t | A ( p J , t )] · Pr [ A ( p J , t )] Pr [ p J ⊂ H t ] · (cid:0) s − d − (cid:1)(cid:0) sd (cid:1) · Pr [ A ( p J , t )] Pr [ p J ∩ H t = ∅ ] · (cid:0) s − d − (cid:1)(cid:0) sd (cid:1) · Pr [ A ( p J , t )] βsn · (cid:0) s − d − (cid:1)(cid:0) sd (cid:1) · Pr [ A ( p J , t )] = 2 βd ( d − s − n Pr [ A ( p J , t )] , as (cid:0) s − d − (cid:1) is the number of d -element subsets of H t which contain the pair p J . Then q i ( t, red) U X J =1 βd ( d − s − n Pr [ A ( p J , t )] . Note that by (5) we have U kd and hence, m X t =1 q i ( t, red) U X J =1 n X t =1 βd ( d − s − n Pr [ A ( p J , t )] U X J =1 βd ( d − s − n vis ( p J ) βkd n ε . (6)15he final inequality follows from the visibility property, using the fact that d < s .Write col − (blue) for the set of blue vertices in T , and similarly for col − (red). Then | col − (red) | + | col − (blue) | = k − . Suppose that ( t , . . . , t k ) is the sequence of balls that are going to select vertices 1 , , . . . , k of T . Byapplying (3), (4) and (6), we find that the probability that the edges of the colored tree T appearsin C m at times ( t , . . . , t k ), and the corresponding sets D t , . . . , D t k consistent with chosen blue-redcoloring scheme, is at most X ( t ,...,t k ) ( q ( t ) k Y i =2 q i ( t i , col( i )) ) m X t =1 q ( t ) ! k Y m X t =1 q i ( t, col( i )) ! n (cid:0) sd (cid:1) Y i ∈ col − (blue) m X t =1 q i ( t, blue) Y i ∈ col − (red) m X t =1 q i ( t, red) n (cid:0) sd (cid:1) (cid:0) βd (cid:1) | col − (blue) | (cid:18) βkd n ε (cid:19) | col − (red) | nβ k d k (cid:0) sd (cid:1) (cid:18) kn ε (cid:19) | col − (red) | . (7)There are at most 2 k − coloring functions and 4 k poly( n ) (cid:0) sd (cid:1) rooted and ordered trees. 
So by theupper bound (7), together with the union bound over all colored ordered trees, we obtain Pr [ C m contains a valid blue-red colored k -vertex tree with r red vertices ] k k − · n c +2 (cid:18) sd (cid:19) · n β k d k (cid:0) sd (cid:1) (cid:18) kn ε (cid:19) r n c +3 · (2 βd ) k · n − rε/ exp (cid:0) k log(2 βd ) + ( c + 3 − rε/
2) log n (cid:1) , (8)using k = O (log n ) for the penultimate inequality.Let b = k − r − D s , . . . , D s b be the sorted list ofblue vertices such that s < s < · · · < s b . Then, by the definition of blue-red coloring, for every j = 1 , . . . , b we have | ( ∪ j − g =1 D s g ) ∩ D s j |
1. This implies that y = | ∪ kj =1 D t j | > | ∪ bj =1 D s j | > ( d − b = ( d − k − − r ) , since { s , . . . , s b } ⊆ { t , . . . , t k } . Applying Lemma 16 implies that the balanced allocation is ( α, m )-uniform, where α = 44 β , say. Hence for any c > β e , the probability that each bin in ∪ kj =1 D t j isallocated at least c balls (that is, the tree T is c -loaded) is at most (cid:18) mcy (cid:19) (cid:16) αyn (cid:17) cy (cid:18) e mcy (cid:19) cy (cid:16) αyn (cid:17) cy (cid:16) e αc (cid:17) cy e − c ( d − k − r − , where the last inequality follows from m n and the fact that c > α e . Since balls are independentfrom each other, we can multiply the above inequality by (8) to show that the probability that C m contains a c -loaded k -vertex tree with r red vertices is at mostexp n k log(2 βd ) − c ( d − k − r −
1) + (cid:0) c + 3 − rε/ (cid:1) log n o , (9)proving the first statement of the lemma. Finally, suppose that rε → ∞ as n → ∞ . Then theupper bound in (9) can be written asexp n(cid:0) βd ) − c ( d − (cid:1) k + O (log n ) + o ( r · log n ) − ( rε/
2) log n o exp n O (log n ) + o ( r · log n ) − ( rε/
2) log n o . rε → ∞ , this term dominates and the probability that C m contains a blue-red coloured treewith r red vertices tends to zero. Therefore, if such a tree is present in C m then r = O (1 /ε ) withhigh probability. This completes the proof. D Missing Part of Proof of Theorem 2
In order to prove the second statement of Theorem 2 we show the sub-additivity of the balanced allocation algorithm. We want to prove that for every constant integer $\gamma \geq 1$ with $\gamma m \leq n$, after allocating $\gamma m$ balls, the maximum load is at most $\gamma(\log_d \log n + O(1/\varepsilon))$, with high probability.

First assume that $2m \leq n$ and suppose that the algorithm has allocated $m$ balls to $H^{(t)}$, $t = 1, \dots, m$, and let $\ell^* \leq \log_d \log n + O(1)$ denote its maximum load. We now consider two independent balanced allocation algorithms, say $\mathcal{A}$ and $\mathcal{A}_0$, on two dynamic hypergraphs starting from step $m$. These dynamic hypergraphs are $(H^{(m)}, \dots, H^{(n)})$ and $(H_0^{(m)}, \dots, H_0^{(n)})$, where $H_0^{(t)}$ is an identical copy of $H^{(t)}$ for $t = m, \dots, n$. Moreover, we assume that in round $m$, all bins contained in $H_0^{(m)}$ have exactly $\ell^*$ balls.

Let us couple algorithm $\mathcal{A}$ on $H^{(t)}$ and algorithm $\mathcal{A}_0$ on $H_0^{(t)}$. Write $V = [n]$ for the set of $n$ bins, which is the ground set of both hypergraphs. The coupled process allocates a pair of balls to bins as follows: for $t = m+1, \dots, 2m$, the coupling chooses a one-to-one labeling function $\sigma_t : V \to \{1, 2, \dots, n\}$ uniformly at random, where $\{1, 2, \dots, n\}$ is a set of labels. Next, the coupling chooses $D_t$ randomly from $H^{(t)}$. Let $D'_t$ denote the same set of $d$ bins as $D_t$ in $H_0^{(t)}$. Algorithm $\mathcal{A}$ allocates ball $t+1$ to a least-loaded vertex of $D_t$, and algorithm $\mathcal{A}_0$ allocates ball $t+1$ to a least-loaded vertex of $D'_t$, with both algorithms breaking ties in favour of the vertex $v$ with the smallest load and minimum label $\sigma_t(v)$. Note that algorithm $\mathcal{A}$ is a faithful copy of the balanced allocation process on $(H^{(m)}, \dots, H^{(n)})$, and algorithm $\mathcal{A}_0$ is a faithful copy of the balanced allocation process on $(H_0^{(m)}, \dots, H_0^{(n)})$. (This follows as $\sigma_t$ is chosen uniformly at random.)

Let $X_i^t$ and $Y_i^t$, $m+1 \leq t \leq 2m$, denote the load of bin $i$ in $H^{(t)}$ and $H_0^{(t)}$, respectively. We prove by induction that for every integer $m \leq t \leq 2m$ and $i \in V$ we have
\[ X_i^t \leq Y_i^t. \qquad (10) \]
The inequality holds when $t = m$ by the assumption that $Y_i^m = \ell^*$ for every $i \in V$. Let us assume that Inequality (10) holds for every $t'$ with $m \leq t' \leq t < 2m$; we will show it for $t+1$. Let $i \in D_{t+1}$ and $j \in D'_{t+1}$ denote the vertices (bins) that receive a ball in step $t+1$. We now consider two cases:

• Case 1: $X_i^t < Y_i^t$. Since algorithm $\mathcal{A}$ allocated ball $t+1$ to bin $i$, it follows that $X_i^t + 1 = X_i^{t+1} \leq Y_i^t \leq Y_i^{t+1}$. So, Inequality (10) holds for $t+1$ and every bin $i \in V$.

• Case 2: $X_i^t = Y_i^t$. Since $D'_{t+1}$ is a copy of $D_{t+1}$, we have $j \in D_{t+1}$ and $i \in D'_{t+1}$. We know that no vertex (bin) in $D_{t+1}$ has smaller load than $i$, and no vertex (bin) in $D'_{t+1}$ has smaller load than $j$. Hence
\[ X_i^t \leq X_j^t \leq Y_j^t \leq Y_i^t, \]
where the middle inequality follows from the inductive hypothesis (10) for bin $j$. So by the assumption of this case we obtain $X_i^t = X_j^t = Y_j^t = Y_i^t$. If $i \neq j$ and $\sigma_{t+1}(j) < \sigma_{t+1}(i)$, then this contradicts the fact that ball $t+1$ is allocated to bin $i$ by algorithm $\mathcal{A}$. Similarly, if $\sigma_{t+1}(j) > \sigma_{t+1}(i)$, then this contradicts the fact that algorithm $\mathcal{A}_0$ allocated ball $t+1$ to bin $j$. Therefore $i = j$ and hence $X_i^{t+1} = X_i^t + 1 = Y_i^t + 1 = Y_i^{t+1}$.

Thus Inequality (10) holds at step $t+1$, which completes the induction. By applying the first part of the theorem, with high probability, using algorithm $\mathcal{A}_0$ to allocate $m$ balls to the dynamic hypergraph $(H_0^{(m)}, \dots, H_0^{(2m)})$ results in maximum load $\ell^* + \log_d \log n + O(1/\varepsilon) \leq 2(\log_d \log n + O(1/\varepsilon))$ in $H_0^{(2m)}$. Therefore, by Inequality (10), after using algorithm $\mathcal{A}$ to allocate $m$ balls to the dynamic hypergraph $(H^{(m)}, \dots, H^{(n)})$, with high probability the maximum load in $H^{(2m)}$ is at most $2(\log_d \log n + O(1/\varepsilon))$. Applying the union bound, we conclude that after allocating $\gamma m$ balls, where $\gamma m \leq n$, the maximum load is at most $\gamma(\log_d \log n + O(1/\varepsilon))$, with high probability.

E Proof of Lemma 16
Berenbrink et al. [3] proposed an allocation algorithm $\mathcal{B}$ such that for $t = 1, 2, \dots$, the $t$-th ball chooses an edge of $H^{(t)} = ([n], E_t)$ uniformly at random, say $H_t$. The ball is then allocated to an empty vertex (bin) of $H_t$, with ties broken randomly. If $H_t$ does not contain an empty bin then the process fails. The next lemma follows directly from [3, Lemmas 4, 5].

Lemma 18.
Suppose that the dynamic s -uniform hypergraph ( H (1) , . . . , H ( n ) ) satisfies the balanced-ness and size properties. There exists m = Θ( n ) such that with probability at least − n − , algorithm B successfully allocates m balls and there are at least s/ empty vertices in H t for t = 1 , . . . , m . We now apply the above result to show the same property holds for the balanced allocation onany dynamic hypergraph.
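The algorithm of Berenbrink et al. described before Lemma 18 is simple enough to simulate directly. The following Python sketch is a minimal illustration, assuming the dynamic hypergraph is given as a list with one edge list per round; the function name `algorithm_b` and this input format are assumptions made here and are not taken from [3].

```python
import random


def algorithm_b(edge_sequence, n, rng):
    """Simulate the empty-bin allocation process on n bins.

    edge_sequence[t] is the list of hyperedges available to ball t + 1.
    Returns (loads, placed), where placed is the number of balls
    allocated before the process failed or the sequence ended.
    """
    loads = [0] * n
    for placed, edges in enumerate(edge_sequence):
        edge = rng.choice(edges)  # edge chosen uniformly at random
        empty = [v for v in edge if loads[v] == 0]
        if not empty:
            # No empty bin in the chosen edge: the process fails.
            return loads, placed
        loads[rng.choice(empty)] += 1  # ties broken uniformly at random
    return loads, len(edge_sequence)


if __name__ == "__main__":
    rng = random.Random(7)
    rounds = [[(0, 1, 2, 3), (2, 3, 4, 5), (4, 5, 6, 7)]] * 4
    print(algorithm_b(rounds, n=8, rng=rng))
```

Because a ball is only ever placed in an empty bin, every load is 0 or 1 and the number of balls placed equals the number of occupied bins, which is what makes the count of empty bins in each edge the natural quantity to track.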
Lemma 19.
Fix d = d ( n ) with d = o (log n ) . Suppose that the dynamic s -uniform hypergraph ( H (1) , . . . , H ( n ) ) satisfies the balancedness and size properties. There exists m = Θ( n ) with m < n such that with probability at least − n − , the edge H t chosen by the t -th ball contains at least s/ empty vertices for t = 1 , . . . , m .Proof. We apply a coupling technique between the balanced allocation process on a dynamic hy-pergraph and B .Let us first consider an identical copy of the set of bins, called B . The coupled process sequen-tially allocates a ball to a pair of bins. In round t = 1 , . . . , m , the t -th ball chooses an edge of H ( t ) uniformly at random, say H t . Let H ′ t be the corresponding set of bins, chosen from B . Then thefirst ball is allocated to a bin, say i , contained in H t according the balanced allocation. If i ∈ H ′ t is empty then the second ball is allocated to bin i ∈ H ′ t as well. If i ∈ H ′ t is not empty then thesecond ball is allocated to an empty bin from H ′ t , with ties are broken randomly. If there is noempty bin in H ′ t then the coupling fails. Note that H t and H ′ t have the same set of bins but mayhave different loads. Observe that the coupled process allocates balls to bins from B according to B . Next we show that for t = 1 , . . . , m , Empty ( H t ) > Empty ( H ′ t ) , (11)where Empty ( H ) denotes the number of empty bins contained in H . For a contradiction, assumethat there is a first time t such that Empty ( H ′ t ) > Empty ( H t ). Then there is vertex i ∈ H ′ t which is empty, while i ∈ H t has a ball at height zero: this is ball t , say, where 1 t t . Thisimplies that the coupled process has allocated ball t to bin i ∈ H t , but it has not allocated anyball to bin i ∈ H ′ t , since i was empty until round t . This contradicts the definition of the coupledprocess. So Inequality (11) holds for t = 1 , . . . , m . Applying Lemma 18 yields that there exists m = Θ( n ) such that for t = 1 , . . . 
, $m$, $\mathrm{Empty}(H_t) \geq \mathrm{Empty}(H'_t) \geq s/2$.

Proof of Lemma 16. Fix $m = m(n)$ to equal the $m$ provided by Lemma 19. For $t = 1, \dots, m$, let $D_t$ be the $d$-element subset of $H_t$ that is chosen by the $t$-th ball. Define the indicator random variable $I_t$ as follows: $I_t := 1$ if $D_t$ contains at least $d/6$ empty vertices, and $I_t := 0$ otherwise. Fix a bin $i$ and then define $A(t, i)$ to be the event that the $t$-th ball is allocated to vertex $i$. (The first $t-1$ balls have already been allocated.) If $i \notin D_t$ then $\Pr[A(t, i)] = 0$. It follows that
\[ \Pr[A(t, i)] = \Pr[A(t, i) \mid i \in D_t \text{ and } I_t = 1] \cdot \Pr[i \in D_t \text{ and } I_t = 1] + \Pr[A(t, i) \mid i \in D_t \text{ and } I_t = 0] \cdot \Pr[i \in D_t \text{ and } I_t = 0]. \]
Now there are at least $d/6$ empty vertices in $D_t$ when $I_t = 1$, so
\[ \Pr[A(t, i) \mid i \in D_t \text{ and } I_t = 1] \leq 6/d. \]
It follows that
\[ \Pr[A(t, i)] \leq (6/d) \Pr[i \in D_t \text{ and } I_t = 1] + \Pr[i \in D_t \text{ and } I_t = 0] \leq (6/d) \Pr[i \in D_t] + \Pr[I_t = 0 \mid i \in D_t] \cdot \Pr[i \in D_t]. \qquad (12) \]
In order to have $i \in D_t$, first an edge containing $i$ must be selected, and then the chosen $d$-element subset of that edge must contain $i$. By the $\beta$-balancedness property,
\[ \Pr[i \in D_t] \leq \frac{\beta s}{n} \cdot \frac{\binom{s-1}{d-1}}{\binom{s}{d}} = \frac{\beta d}{n}. \]
Using the above inequality, we simplify Inequality (12) as follows:
\[ \Pr[A(t, i)] \leq \frac{6\beta}{n} + \frac{\beta d}{n} \Pr[I_t = 0 \mid i \in D_t]. \]
If $d \leq 6$ then $\Pr[A(t, i)] \leq 12\beta/n$. This completes the proof when $d \leq 6$. For the remainder of the proof we assume that $d \geq 7$, and prove that
\[ \Pr[I_t = 0 \mid i \in D_t] \leq \hat{c}/d \qquad (13) \]
for some absolute constant ĉ >
0. From this, we see that Pr [ A ( t, i )] α/n where α = β (6 + ˆ c ). As i was an arbitrary bin, this proves that the process is ( α, m )-uniform.Let F be the event that H t contains at least s/ t = 1 , . . . , m . ByLemma 19, we have Pr [ F ] > − n − . Then Pr [ I t = 0 | i ∈ D t ]= Pr [ I t = 0 | ( i ∈ D t ) and F ] · Pr [ F ] + Pr [ I t = 0 | ( i ∈ D t ) and ¬F ] · Pr [ ¬F ] Pr [ I t = 0 | ( i ∈ D t ) and F ] + n − Pr [ I t = 0 | ( i ∈ D t ) and F ] + 1 /d. (14)Let X be the random variable that counts the number of empty bins of a random ( d − H t \{ i } , conditioned on the event that “( i ∈ D t ) and F ” holds. Then X is a hypergeometric19andom variable with parameters ( s − , K, d − K is the number of empty bins containedin H t \ { i } . Thus E [ X ] = ( d − Ks − Var [ X ] ( d − Ks − d. Then E [ X ] > d/
3, since $K \geq s/2 - 1$ when $i \in D_t$ and $F$ holds (and using the size property $s = \Omega(\log n)$) and the fact that $d \geq 7$. Therefore
\[ \Pr[I_t = 0 \mid (i \in D_t) \text{ and } F] \leq \Pr[X < d/6] \leq \Pr[\,|X - E[X]| \geq E[X]/2\,] < \frac{4\,\mathrm{Var}[X]}{E[X]^2} \leq \frac{36 d}{d^2} = \frac{36}{d}, \]
using Chebyshev's inequality. Substituting the above upper bound in Inequality (14) establishes (13) with $\hat{c} = 38$, which completes the proof.

F Proof of Proposition 13
In this section we will prove two lemmas and then combine them to establish the proposition. The lemmas and their proofs are inspired by [13, Lemmas 2.1 and 2.2]. Recall that a subgraph of $C_n$ is $c_1$-loaded if every vertex (bin) in the subgraph has load at least $c_1$.

Lemma 20. Let $k$ be a positive integer and let $c_1 > 0$. The probability that the conflict graph $C_n$ contains a $c_1$-loaded connected component with $k$ vertices is at most
\[ n \cdot 8^{k} \cdot \left( \frac{2e}{c_1} \right)^{c_1 k}. \]
Moreover, by setting $c_1 = 12(c+1)$, we conclude that with probability at least $1 - n^{-c}$, the conflict graph $C_n$ does not contain a $c_1$-loaded tree with at least $\log n$ vertices.

Proof. A connected component in $C_n$ with $k$ vertices contains a spanning tree with $k$ vertices. By Proposition 10, there are at most $4^{k-1}$ ordered trees with $k$ vertices. For every ordered tree, we can choose its root in $n$ ways, as we have $n$ bins (vertices). Hence there are at most $n \cdot 4^{k-1}$ rooted and ordered trees. Let us fix an arbitrary ordered tree $T$ with a specified root. Also let $(t_1, \dots, t_{k-1})$ denote an arbitrary sequence of rounds, where $t_i \in \{1, \dots, n\}$ is the round when the $i$-th edge of the ordered tree $T$ is chosen. Notice that in an ordered tree with specified root, the $i$-th edge always connects the $i$-th child to its parent, and the parent is already known to us. Therefore, to build the tree, the $i$-th edge of the tree must be chosen from the edges of $G^{(t_i)}$ that are adjacent to the known parent. This implies that the algorithm chooses the $i$-th edge of $T$ in round $t_i$ with probability at most $\Delta_{t_i}/(n\Delta_{t_i}/2) = 2/n$. Since balls are independent from each other, the tree $T$ is constructed at the given times $(t_1, \dots, t_{k-1})$ with probability at most
\[ \left( \frac{2}{n} \right)^{k-1}. \qquad (15) \]
On the other hand, ball $t$ is allocated to a given bin with probability at most $\Delta_t/(n\Delta_t/2) = 2/n$. Therefore, the probability that $T$ is $c_1$-loaded is at most
\[ \binom{n}{c_1 k} \left( \frac{2k}{n} \right)^{c_1 k} \leq \left( \frac{en}{c_1 k} \right)^{c_1 k} \left( \frac{2k}{n} \right)^{c_1 k} = \left( \frac{2e}{c_1} \right)^{c_1 k}, \qquad (16) \]
where we used the fact that $\binom{n}{c_1 k} \leq \left( \frac{en}{c_1 k} \right)^{c_1 k}$. Since balls are independent, one can multiply (15) by (16) and derive an upper bound for the probability that $T$ is constructed at the given times and is $c_1$-loaded. Taking the union bound over all rooted ordered trees and time sequences gives
\[ n 4^{k-1} \sum_{(t_1, \dots, t_{k-1})} \left\{ \left( \frac{2}{n} \right)^{k-1} \left( \frac{2e}{c_1} \right)^{c_1 k} \right\} \leq n 4^{k-1} n^{k-1} \cdot \left\{ \left( \frac{2}{n} \right)^{k-1} \left( \frac{2e}{c_1} \right)^{c_1 k} \right\} = n 8^{k-1} \left( \frac{2e}{c_1} \right)^{c_1 k}, \]
proving the first statement of the lemma. By setting $c_1 = 12(c+1)$ and $k = \log n$ in the above formula, we infer that the probability that $C_n$ contains a $c_1$-loaded tree with $\log n$ vertices is at most
\[ n 8^{k-1} \left( \frac{2e}{c_1} \right)^{c_1 k} < n 2^{3k - 12(c+1)k} \leq n 2^{-ck - k} \leq n^{-c}, \]
completing the proof.

Lemma 21.