Distance between two random k-out digraphs, with and without preferential attachment
Applied Probability Trust (13 October 2018)
DISTANCE BETWEEN TWO RANDOM k-OUT DIGRAPHS, WITH AND WITHOUT PREFERENTIAL ATTACHMENT

NICHOLAS R. PETERSON,* The Ohio State University
BORIS PITTEL, ∗ The Ohio State University
Abstract
A random $k$-out mapping (digraph) on $[n]$ is generated by choosing $k$ random images of each vertex one at a time, subject to a "preferential attachment" rule: the current vertex selects an image $i$ with probability proportional to a given parameter $\alpha = \alpha(n)$ plus the number of times $i$ has already been selected. Intuitively, the larger $\alpha$ gets, the closer the resulting $k$-out mapping is to the uniformly random $k$-out mapping. We prove that $\alpha = \Theta(n^{1/2})$ is the threshold for $\alpha$ growing "fast enough" to make the random digraph approach the uniformly random digraph in terms of the total variation distance. We also determine an exact limit for this distance for $\alpha = \beta n^{1/2}$.

Keywords: random graphs; random digraphs; preferential attachment; uniform; $k$-out digraphs; total variation distance; local limit theorem

2010 Mathematics Subject Classification: Primary 05C80; Secondary 60C05, 60F05
1. Introduction
In the study of random graph/digraph processes, preferential attachment (or the popularity effect) refers broadly to processes in which edges or arcs are inserted one at a time, and vertices chosen as endpoints previously are more likely to be chosen going forward. These processes have become well known since Barabási and Albert [1] introduced the first such model to explain a "scale-free" vertex degree distribution observed empirically in various real-world networks. In this scheme the vertex set grows in time, each new vertex attaching itself randomly to existing vertices, with probabilities proportional to their current "popularity" (degree). The Barabási–Albert model was later formalized, and studied rigorously, in papers by Bollobás and Riordan [5] and Bollobás, Riordan, Spencer, and Tusnády [6]. For a single host vertex, the resulting graph (a non-uniform recursive tree) had been studied some years earlier; see Bergeron et al. [2], Mahmoud et al. [16], and Pittel [18], for instance. Since then, a wealth of preferential attachment graph models have been studied; see, for example, Buckley and Osthus [7], Bollobás et al. [4], and Deijfen [9].

* Postal address: 100 Math Tower; 231 W 18th Ave; Columbus, OH 43210; USA. The authors gratefully acknowledge support from NSF grant.

Recently Pittel [19] studied a graph process $\{G_\alpha(n,M)\}_{M=0}^{N}$, which is a "preferential attachment" counterpart of the Erdős–Rényi process $\{G(n,M)\}_{M=0}^{N}$ on a fixed vertex set $[n]$, $N := \binom{n}{2}$: given the current graph $G_\alpha(n,M)$, the next graph $G_\alpha(n,M+1)$ is obtained by adding a new edge; we choose to connect currently non-adjacent vertices $i$ and $j$ with probability proportional to $(d_i+\alpha)(d_j+\alpha)$, where $d_i, d_j$ are the degrees of vertices $i$ and $j$ in $G_\alpha(n,M)$. Clearly, $\{G(n,M)\}_{M=0}^{N}$ is the limiting case $\alpha = \infty$ of $\{G_\alpha(n,M)\}_{M=0}^{N}$. The main result in [19] is that w.h.p. $G_\alpha(n,M)$ develops a giant component when the average vertex degree $c := 2M/n$ exceeds $\alpha/(\alpha+1)$, and the giant component has size asymptotic to
\[ n\left[1 - \left(\frac{\alpha+c^*}{\alpha+c}\right)^{\alpha}\right], \]
where $c^* < \alpha/(\alpha+1)$ is a root of $c(\alpha+c)^{-\alpha} = c^*(\alpha+c^*)^{-\alpha}$. Notably, formally letting $\alpha = \infty$ in this result recovers the result of Erdős and Rényi [11]: the Erdős–Rényi process $\{G(n,M)\}$ develops a giant component when $c := 2M/n$ exceeds 1, and the giant component has size asymptotic to $n(1 - c^*/c)$, where $c^* < 1$ satisfies $c^*e^{-c^*} = ce^{-c}$.

Another model of this type is a preferential attachment model for random mappings, defined and studied by Hansen and Jaworski [13–15]. Let $\alpha > 0$, and say that each vertex in $[n]$ has initial weight $\alpha$. The vertices take turns choosing their images, starting with vertex 1; conditioned on the previous steps in the process, vertex $i$ chooses vertex $j$ as its image with probability proportional to the current weight of vertex $j$, which then increases by 1. Call the resulting mapping $M^\alpha_{n,1}$ (this is our notation, not that of Hansen and Jaworski). The constant $\alpha$ measures, essentially, the "independent-mindedness" of the vertices as they choose their images: the larger $\alpha$ is, the less impact previous choices have on future ones. Letting $\alpha \to \infty$, we recover the uniformly random mapping $[n] \to [n]$. Extending earlier results of Gertsbakh [12], Burtin [8], and Pittel [17] for the uniform mapping, Hansen and Jaworski [13] found the distributions of the sets of ultimate "successors" and "predecessors" of a given set in the random mapping $M^\alpha_{n,1}$.

Given this heuristic connection in the $\alpha = \infty$ case, it is natural to wonder: if we let $\alpha$ vary with $n$, and $\alpha \to \infty$ "fast enough" as $n \to \infty$, will $M^\alpha_{n,1}$ behave asymptotically like the uniform mapping? If so, how fast is fast enough? In [15], Hansen and Jaworski established asymptotic properties of $M^\alpha_{n,1}$ in the case where $\alpha n \to \infty$, and specializing their results for the parameters studied in [12], [8] and [17] in the case $\alpha \to \infty$ does reveal the "continuity at $\alpha = \infty$". At first glance, this might seem to indicate that $\alpha \to \infty$, however slowly, is enough to make $M^\alpha_{n,1}$ asymptotically uniform. However, we shall see in Section 5 that this is not the case. A rather simple parameter, the sum of squared in-degrees, is much more sensitive to the behavior of $\alpha$, and its asymptotic distribution is close to that for $\alpha = \infty$ only if $\alpha \gg n^{1/2}$.

In this paper, we generalize Hansen and Jaworski's preferential attachment random mapping model to a new setting: $k$-valued mappings. Specifically, we study the collection of digraphs on vertex set $[n]$ in which each vertex has out-degree $k$, and the out-arcs belonging to each vertex are labeled $1, 2, \ldots, k$. (Equivalently, these can be thought of as functional digraphs for mappings $[n] \to [n]^k$, where $[n]^k$ denotes the set of $k$-long vectors with coordinates in $[n]$.) We call our model $M^\alpha_{n,k}$, and the corresponding uniform model $M^\infty_{n,k}$; the case $k = 1$ corresponds exactly to the model of Hansen and Jaworski.

Measuring the distance between $M^\alpha_{n,k}$ and $M^\infty_{n,k}$ via the total variation distance, we prove that $\alpha = \Theta(\sqrt{n})$ (notably, much smaller than $n$) is the threshold for "fast enough" growth to ensure asymptotic uniformity of $M^\alpha_{n,k}$. We determine an exact limit for the distance in the case $\alpha = \beta\sqrt{n}$, where $\beta > 0$ is fixed, and we identify a parameter, the sum of squared in-degrees, that asymptotically accounts for the total variation distance between $M^\alpha_{n,k}$ and $M^\infty_{n,k}$.
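To make the model concrete, the arc-insertion scheme just described can be simulated directly. The following minimal sketch (the function name and dict-based output are ours, not the paper's) draws one copy of $M^\alpha_{n,k}$ by choosing each image with probability proportional to $\alpha$ plus the target's current in-degree.

```python
import random

def sample_k_out(n, k, alpha, seed=0):
    """Draw one preferential-attachment k-out multigraph on [n].

    Each of the kn arcs is inserted one at a time; a target j is chosen
    with probability proportional to alpha + (current in-degree of j).
    Returns images[v] = the ordered list of the k images of vertex v.
    """
    rng = random.Random(seed)
    weight = [alpha] * n                 # weight[j-1] = alpha + in-degree(j)
    images = {v: [] for v in range(1, n + 1)}
    for v in range(1, n + 1):            # the order of deciders does not change the law
        for _ in range(k):
            j = rng.choices(range(1, n + 1), weights=weight)[0]
            images[v].append(j)
            weight[j - 1] += 1           # the popularity effect
    return images

M = sample_k_out(n=6, k=2, alpha=1.0)
assert all(len(M[v]) == 2 for v in M)    # every vertex has out-degree k
```

Taking $\alpha$ very large makes all weights nearly identical, so the sample approaches the uniformly random mapping $[n] \to [n]^k$ — exactly the heuristic behind the notation $M^\infty_{n,k}$.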
2. Definition of the Model and Statement of Results
Let $n, k \in \mathbb{N}$ and $\alpha \in (0, \infty)$. Let $M^\alpha_{n,k}$ be the directed multigraph on vertex set $[n]$ generated via insertion of a $kn$-long sequence of out-arcs, $k$ arcs per vertex, starting with the empty digraph and each vertex having initial weight $\alpha$. At a generic step, choose (uniformly at random) a vertex with out-degree below $k$, and select its target vertex (image) with probability proportional to the target's current weight; then increase the weight of the chosen target by 1. After $kn$ steps, we arrive at a directed multigraph on vertex set $[n]$, in which each vertex has out-degree $k$ and its $k$ out-arcs are labeled chronologically $1, 2, \ldots, k$.

While this scheme is perhaps a natural digraph growth process, the distribution of the terminal digraph is the same for any random ordering of the decision makers, provided that it depends only on the current out-degrees. So, alternatively, we can consider the process consisting of $k$ rounds of the Hansen–Jaworski process [15], in which each round begins with the vertex weights accumulated during the previous rounds. Think of this as a committee of $n$ people undergoing $k$ rounds of voting for a chair, in which votes are made publicly and people are swayed by the total votes in earlier rounds. Given any $M : [n] \to [n]^k$, we find that
\[ P(M^\alpha_{n,k} = M) = \frac{\prod_{j=1}^{n} \alpha^{\overline{d_j}}}{(\alpha n)^{\overline{kn}}}, \qquad x^{\overline{y}} := x(x+1)\cdots(x+y-1), \tag{2.1} \]
where $(d_1, \ldots, d_n)$ is the in-degree sequence of $M$, including multiplicity.

Analogously to Hansen and Jaworski's model, we can view the uniformly random mapping $[n] \to [n]^k$ as the limiting case of $M^\alpha_{n,k}$ in which $\alpha = \infty$: keeping $n$ fixed and allowing $\alpha$ to grow without bound, the current in-degree of each vertex becomes negligible compared to $\alpha$, so that all of the weights are nearly identical. In light of this connection, we let $M^\infty_{n,k}$ denote the uniformly random mapping $[n] \to [n]^k$.

Our main result is as follows:

Theorem 1.
Let $d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k})$ denote the total variation distance between the measures on $\mathcal{M}_{n,k}$ (the collection of $k$-out maps on vertex set $[n]$) induced by $M^\alpha_{n,k}$ and $M^\infty_{n,k}$:
\[ d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) := \sup_{A \subseteq \mathcal{M}_{n,k}} \bigl|P(M^\alpha_{n,k} \in A) - P(M^\infty_{n,k} \in A)\bigr| = \frac{1}{2}\sum_{M \in \mathcal{M}_{n,k}} \bigl|P(M^\alpha_{n,k} = M) - P(M^\infty_{n,k} = M)\bigr|. \]
Let $\alpha = \alpha(n)$ and $n \to \infty$.

(i) If $\alpha/\sqrt{n} \to \infty$, then $d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) \to 0$.

(ii) If $\alpha \to \infty$ and $\alpha/\sqrt{n} \to 0$ as $n \to \infty$, then $d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) \to 1$.

(iii) If $\alpha = \beta\sqrt{n}$, $\beta \in (0, \infty)$ being fixed, then
\[ d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) \to \frac{1}{2}\,E\bigl|1 - \exp(-N)\bigr|; \]
here $N$ is a Gaussian random variable with $E[N] = \frac{k^2}{4\beta^2}$ and $\mathrm{Var}[N] = \frac{k^2}{2\beta^2}$.

Note that Theorem 1 gives us a very strong result in the case where $\alpha \gg \sqrt{n}$: namely, that the difference in the probability assigned to any event $A$ by the distributions of $M^\alpha_{n,k}$ and $M^\infty_{n,k}$ tends to 0 with $n$. The result for $\alpha \ll \sqrt{n}$, on the other hand, is much less powerful: it simply tells us that there is an event $A_n$ such that $P(M^\alpha_{n,k} \in A_n) \to 1$ while $P(M^\infty_{n,k} \in A_n) \to 0$. As the example of the number of connected components in $M^\alpha_{n,1}$ discussed in Section 1 shows, there exist natural events whose probabilities under $M^\alpha_{n,k}$ and $M^\infty_{n,k}$ are nearly the same. Still, Theorem 1(ii) is rather revealing: it tells us that $\alpha = \Theta(\sqrt{n})$ is truly the threshold for every parameter of the $k$-out mapping having the same distribution in the limit for the random mappings $M^\alpha_{n,k}$ and $M^\infty_{n,k}$.

Theorem 1 calls for finding a (hopefully natural) parameter $X$ of the $k$-out mapping such that the total variation distance $d_{TV}(X(M^\alpha_{n,k}), X(M^\infty_{n,k}))$ is asymptotic to $d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k})$. This $X$ is a parameter whose distribution is most sensitive to finiteness of $\alpha$, allowed to be infinite only in the limit. We found such a parameter for the critical $\alpha = \Theta(n^{1/2})$.

Theorem 2.
For a $k$-out mapping $M$, let $D(M) = (D_1(M), \ldots, D_n(M))$ denote the sequence of its in-degrees, and let $X(M) := \sum_i (D_i(M))^2$, the sum of squared in-degrees. Let $\alpha = \beta\sqrt{n}$, $\beta > 0$ fixed. Then
\[ \lim_{n\to\infty} d_{TV}\bigl(X(M^\alpha_{n,k}), X(M^\infty_{n,k})\bigr) = \frac{1}{2}\,E\bigl|1 - \exp(-N)\bigr|, \]
where $N$ is as in Theorem 1(iii).

Theorems 1 and 2 open an avenue for further study. For instance, suppose $\alpha = \Theta(n^\sigma)$, $\sigma \in (0, 1/2)$, so that $1 - d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) \to 0$. The questions are: how fast, and which parameter of $M$ is "in charge" of the convergence rate? Suppose $\sigma \in [1/s, 1/(s-1))$ for some $s > 2$, and let
\[ X = X(M) = \{X^{(t)}(M)\}_{t=2}^{s}, \qquad X^{(t)}(M) := \sum_i (D_i(M))^t. \]
Is it true that $1 - d_{TV}\bigl(X(M^\alpha_{n,k}), X(M^\infty_{n,k})\bigr) \sim 1 - d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k})$?
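The limit appearing in Theorems 1(iii) and 2 is easy to evaluate numerically. With the parameters of $N$ as reconstructed above ($E[N] = k^2/(4\beta^2)$, $\mathrm{Var}[N] = k^2/(2\beta^2)$, so that $E[e^{-N}] = 1$), a short calculation collapses $\frac12 E|1-e^{-N}|$ to $\operatorname{erf}(k/(4\beta))$. The sketch below (function name ours, not the paper's) checks a Monte Carlo average against that closed form, as a consistency test of the reconstruction.

```python
import math
import random

def limiting_tv(k, beta, samples=200_000, seed=1):
    """Monte Carlo estimate of (1/2) E|1 - exp(-N)|, where N is Gaussian
    with mean k^2/(4 beta^2) and variance k^2/(2 beta^2)."""
    rng = random.Random(seed)
    mean = k * k / (4 * beta * beta)
    sd = math.sqrt(2 * mean)             # Var[N] = 2 E[N], hence E[exp(-N)] = 1
    total = 0.0
    for _ in range(samples):
        total += abs(1.0 - math.exp(-rng.gauss(mean, sd)))
    return 0.5 * total / samples

# Since E[exp(-N)] = 1, the limit equals 2*Phi(sqrt(mean/2)) - 1 = erf(k/(4 beta)).
for k, beta in [(1, 1.0), (2, 1.0)]:
    assert abs(limiting_tv(k, beta) - math.erf(k / (4 * beta))) < 0.03
```

The closed form interpolates the two extremes of Theorem 1: as $\beta \to \infty$ it tends to 0 (the $\alpha \gg \sqrt{n}$ regime of part (i)), and as $\beta \to 0$ it tends to 1 (part (ii)).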
3. Preliminary Results

Definition 1.
Let $D^\alpha_n = (D^\alpha_{n,1}, \ldots, D^\alpha_{n,n})$ denote the in-degree sequence for $M^\alpha_{n,k}$, and let $D^\infty_n$ denote the in-degree sequence for $M^\infty_{n,k}$.

Note that $d = (d_1, \ldots, d_n)$ is an admissible in-degree sequence precisely when it satisfies $d_1, \ldots, d_n \geq 0$ and $d_1 + \cdots + d_n = kn$. (From now on, we use $d$ to denote a generic admissible sequence.) The number of mappings with in-degree sequence $d$ is precisely $\binom{kn}{d}$, which is shorthand for the multinomial coefficient
\[ \binom{kn}{d} := \binom{kn}{d_1, \ldots, d_n}. \]
The coordinates of $D^\alpha_n$ are interdependent, as $\sum_j D^\alpha_{n,j} = kn$. However, there are IID random variables that can be gainfully used to analyze these coordinates. For $k = 1$, it was proved in [14] that the in-degrees are (jointly) distributed as IID negative binomial variables, conditioned on summing to $n$; likewise, the in-degrees of the uniformly random mapping are distributed as IID Poisson variables, conditioned on summing to $n$. These results generalize to $M^\alpha_{n,k}$ and $M^\infty_{n,k}$:

Lemma 1. (In-degree sequence distributions.)
Let $d = (d_1, d_2, \ldots, d_n)$ be given.

(i) Let $Z_{n,1}, \ldots, Z_{n,n}$ be IID random variables with the generalized negative binomial distribution with shape parameter $\alpha$ and success probability $\frac{k}{\alpha+k}$:
\[ P(Z_{n,j} = d) = \frac{\alpha^{\overline{d}}}{d!}\left(\frac{\alpha}{\alpha+k}\right)^{\alpha}\left(\frac{k}{\alpha+k}\right)^{d}, \qquad d = 0, 1, 2, \ldots. \]
Let $Z_n = (Z_{n,1}, \ldots, Z_{n,n})$. Then $P(D^\alpha_n = d) = P(Z_n = d \mid Z_{n,1} + \cdots + Z_{n,n} = kn)$.

(ii) Let $Y_{n,1}, \ldots, Y_{n,n}$ be IID Poisson-distributed random variables with mean $k$. Let $Y_n = (Y_{n,1}, \ldots, Y_{n,n})$. Then $P(D^\infty_n = d) = P(Y_n = d \mid Y_{n,1} + \cdots + Y_{n,n} = kn)$.

Proof.
The probability generating function for $Z_{n,j}$ is
\[ f_Z(x) := E[x^{Z_{n,j}}] = \left(\frac{\alpha}{\alpha+k}\right)^{\alpha}\left(1 - \frac{kx}{\alpha+k}\right)^{-\alpha}. \tag{3.1} \]
It follows by independence of the $Z_{n,j}$ that
\[ P\Bigl(\sum_j Z_{n,j} = kn\Bigr) = [x^{kn}](f_Z(x))^n = \frac{(\alpha n)^{\overline{kn}}}{(kn)!}\left(\frac{\alpha}{\alpha+k}\right)^{\alpha n}\left(\frac{k}{\alpha+k}\right)^{kn}, \]
while independence and $\sum_j d_j = kn$ imply
\[ P(Z_n = d) = \left(\frac{\alpha}{\alpha+k}\right)^{n\alpha}\left(\frac{k}{\alpha+k}\right)^{kn}\prod_{j=1}^{n}\frac{\alpha^{\overline{d_j}}}{d_j!}. \]
Combining these yields
\[ P(Z_n = d \mid Z_{n,1} + \cdots + Z_{n,n} = kn) = \binom{kn}{d}\frac{\prod_{j=1}^{n}\alpha^{\overline{d_j}}}{(\alpha n)^{\overline{kn}}}. \]
The multinomial coefficient is precisely the number of $k$-out mappings on $[n]$ with in-degree sequence $d$, and so (2.1) implies
\[ P(D^\alpha_n = d) = \binom{kn}{d}\frac{\prod_{j=1}^{n}\alpha^{\overline{d_j}}}{(\alpha n)^{\overline{kn}}} = P\Bigl(Z_n = d \,\Bigm|\, \sum_j Z_{n,j} = kn\Bigr), \]
proving (i). The proof of (ii) proceeds in the same fashion.

We will find that the total variation distance we seek can be computed by focusing on the in-degree sequences of our mappings; to do so, we will need information about the moments of the in-degrees, and some results about concentration. With Lemma 1 in hand, the moments can be computed explicitly:

Corollary 1. (Moments.)
Let $n, k \in \mathbb{N}$ and $\alpha \in (0, \infty)$.

(i) The factorial moments and moments of $Z_{n,j}$ (defined as in Lemma 1(i)) are
\[ E[(Z_{n,j})_\ell] = \frac{\alpha^{\overline{\ell}}\, k^\ell}{\alpha^\ell} \quad\text{and}\quad E[Z_{n,j}^s] = \sum_{\ell=1}^{s}\left\{{s \atop \ell}\right\}\frac{\alpha^{\overline{\ell}}\, k^\ell}{\alpha^\ell}, \]
where $\left\{{s \atop \ell}\right\}$ is the Stirling partition number and $(a)_b = a(a-1)\cdots(a-(b-1))$ is the falling factorial.

(ii) The factorial moments of $D^\alpha_{n,j}$ and $D^\infty_{n,j}$ are, respectively,
\[ E[(D^\alpha_{n,j})_\ell] = \frac{\alpha^{\overline{\ell}}\,(kn)_\ell}{(\alpha n)^{\overline{\ell}}} \quad\text{and}\quad E[(D^\infty_{n,j})_\ell] = \frac{(kn)_\ell}{n^\ell}. \]

(iii) The moments of $D^\alpha_{n,j}$ and $D^\infty_{n,j}$ are, respectively,
\[ \mu_{s,\alpha} := E[(D^\alpha_{n,j})^s] = \sum_{\ell=1}^{s}\left\{{s\atop\ell}\right\}\frac{\alpha^{\overline{\ell}}(kn)_\ell}{(\alpha n)^{\overline{\ell}}} \quad\text{and}\quad \mu_{s,\infty} := E[(D^\infty_{n,j})^s] = \sum_{\ell=1}^{s}\left\{{s\atop\ell}\right\}\frac{(kn)_\ell}{n^\ell}. \]

(iv) For $i \neq j$, the mixed factorial moments for $D^\alpha_n$ and $D^\infty_n$ are, respectively,
\[ E[(D^\alpha_{n,i})_\ell (D^\alpha_{n,j})_m] = \frac{\alpha^{\overline{\ell}}\,\alpha^{\overline{m}}\,(kn)_{\ell+m}}{(\alpha n)^{\overline{\ell+m}}} \quad\text{and}\quad E[(D^\infty_{n,i})_\ell (D^\infty_{n,j})_m] = \frac{(kn)_{\ell+m}}{n^{\ell+m}}. \]

(v) If $\alpha = \alpha(n) \to \infty$ as $n \to \infty$, then
\[ \mu_{s,\alpha} = \mu_{s,\infty} + O\left(\frac{1}{\alpha}\right). \]

(vi) As $n \to \infty$,
\[ \mu_{s,\infty} = \sum_{\ell=1}^{s}\left\{{s\atop\ell}\right\} k^\ell + O\left(\frac{1}{n}\right). \]

Proof.
Let $Z_n = (Z_{n,1}, \ldots, Z_{n,n})$ and $Y_n = (Y_{n,1}, \ldots, Y_{n,n})$ be as in Lemma 1, and let $f_Z(x)$ be the probability generating function for $Z_{n,j}$ as computed in (3.1). Then $E[(Z_{n,j})_\ell] = f_Z^{(\ell)}(1)$, and the first formula in (i) follows. The proof of (i) is completed by the identity
\[ x^s = \sum_{\ell=1}^{s}\left\{{s\atop\ell}\right\}(x)_\ell. \tag{3.2} \]
Note that because $Z_{n,1}, \ldots, Z_{n,n}$ are IID,
\[ E[(D^\alpha_{n,j})_\ell] = \sum_{d=\ell}^{kn}(d)_\ell P(D^\alpha_{n,j} = d) = \sum_{d=\ell}^{kn}(d)_\ell\,\frac{P(Z_{n,1} = d)\,P\bigl(\sum_{j\geq 2} Z_{n,j} = kn-d\bigr)}{P\bigl(\sum_j Z_{n,j} = kn\bigr)} = \frac{[x^{kn}]\,(x^\ell f_Z^{(\ell)}(x))(f_Z(x))^{n-1}}{[x^{kn}](f_Z(x))^n} = \frac{\alpha^{\overline{\ell}}(kn)_\ell}{(\alpha n)^{\overline{\ell}}}. \]
Arguing similarly for the uniform mapping yields
\[ E[(D^\infty_{n,j})_\ell] = \frac{[x^{kn}]\,(x^\ell f_Y^{(\ell)}(x))(f_Y(x))^{n-1}}{[x^{kn}](f_Y(x))^n} = \frac{(kn)_\ell}{n^\ell}, \]
where $f_Y(x) = e^{k(x-1)}$ is the probability generating function for $Y_{n,j}$. This completes the proof of (ii). Claim (iii) follows from (ii) and the identity (3.2). For (iv): arguing as in the proof of (ii) leads to
\[ E[(D^\alpha_{n,i})_\ell(D^\alpha_{n,j})_m] = \frac{[x^{kn}]\,(x^\ell f_Z^{(\ell)}(x))(x^m f_Z^{(m)}(x))(f_Z(x))^{n-2}}{[x^{kn}](f_Z(x))^n} = \frac{\alpha^{\overline{\ell}}\alpha^{\overline{m}}(kn)_{\ell+m}}{(\alpha n)^{\overline{\ell+m}}}, \]
while
\[ E[(D^\infty_{n,i})_\ell(D^\infty_{n,j})_m] = \frac{[x^{kn}]\,(x^\ell f_Y^{(\ell)}(x))(x^m f_Y^{(m)}(x))(f_Y(x))^{n-2}}{[x^{kn}](f_Y(x))^n} = \frac{(kn)_{\ell+m}}{n^{\ell+m}}. \]
For (v), we need only notice that if $\alpha \to \infty$, then
\[ \mu_{s,\alpha} = \sum_{\ell=1}^{s}\left\{{s\atop\ell}\right\}\frac{\alpha^{\overline{\ell}}(kn)_\ell}{(\alpha n)^{\overline{\ell}}} = \sum_{\ell=1}^{s}\left\{{s\atop\ell}\right\}\frac{(kn)_\ell}{n^\ell}\left(1 + O\left(\frac{1}{\alpha}\right)\right) = \mu_{s,\infty} + O\left(\frac{1}{\alpha}\right), \]
since $\alpha^{\overline{\ell}}/\alpha^\ell = 1 + O(1/\alpha)$ and $(\alpha n)^{\overline{\ell}}/(\alpha n)^\ell = 1 + O(1/(\alpha n))$. Finally, (vi) is an immediate consequence of the expression for $\mu_{s,\infty}$ in (iii). This completes the proof.

The most important applications of Corollary 1 are listed in the following statement:

Corollary 2. (Concentration results.)
Suppose $\omega = \omega(n) \to \infty$ as $n \to \infty$, however slowly. Let $s \in \mathbb{N}$, $\mu_{s,\alpha} := E[(D^\alpha_{n,j})^s]$, and $\mu_{s,\infty} := E[(D^\infty_{n,j})^s]$.

(i) If $\alpha = \alpha(n)$ is bounded away from 0, then
\[ \lim_{n\to\infty} P\bigl(|(D^\alpha_{n,1})^s + \cdots + (D^\alpha_{n,n})^s - \mu_{s,\alpha}\, n| < \omega\sqrt{n}\bigr) = 1. \]

(ii) For the uniform map,
\[ \lim_{n\to\infty} P\bigl(|(D^\infty_{n,1})^s + \cdots + (D^\infty_{n,n})^s - \mu_{s,\infty}\, n| < \omega\sqrt{n}\bigr) = 1. \]

Proof.
Let us first consider (i). Note that
\[ E[(D^\alpha_{n,1})^s + \cdots + (D^\alpha_{n,n})^s] = nE[(D^\alpha_{n,1})^s] = \mu_{s,\alpha}\, n. \]
Further, the moments in Corollary 1 imply that
\[ E\Bigl[\Bigl(\sum_{j=1}^{n}(D^\alpha_{n,j})^s\Bigr)^2\Bigr] = n^2\, E[(D^\alpha_{n,1})^s(D^\alpha_{n,2})^s] + O(n). \tag{3.3} \]
Rewrite $(D^\alpha_{n,1})^s(D^\alpha_{n,2})^s$ in terms of falling factorials, to find
\[ E[(D^\alpha_{n,1})^s(D^\alpha_{n,2})^s] = E\Bigl[\Bigl(\sum_{\ell=1}^{s}\left\{{s\atop\ell}\right\}(D^\alpha_{n,1})_\ell\Bigr)\Bigl(\sum_{m=1}^{s}\left\{{s\atop m}\right\}(D^\alpha_{n,2})_m\Bigr)\Bigr] = \sum_{\ell=1}^{s}\sum_{m=1}^{s}\left\{{s\atop\ell}\right\}\left\{{s\atop m}\right\}\frac{\alpha^{\overline{\ell}}\alpha^{\overline{m}}(kn)_{\ell+m}}{(\alpha n)^{\overline{\ell+m}}} = \sum_{\ell=1}^{s}\sum_{m=1}^{s}\left\{{s\atop\ell}\right\}\left\{{s\atop m}\right\}\frac{\alpha^{\overline{\ell}}\alpha^{\overline{m}}k^{\ell+m}}{\alpha^{\ell+m}}\left(1 + O\left(\frac{1}{n}\right)\right). \]
For $\alpha$ bounded away from 0, the summands here are bounded, so that
\[ E[(D^\alpha_{n,1})^s(D^\alpha_{n,2})^s] = \sum_{\ell=1}^{s}\sum_{m=1}^{s}\left\{{s\atop\ell}\right\}\left\{{s\atop m}\right\}\frac{\alpha^{\overline{\ell}}k^\ell}{\alpha^\ell}\cdot\frac{\alpha^{\overline{m}}k^m}{\alpha^m} + O\left(\frac{1}{n}\right) = \Bigl(\sum_{\ell=1}^{s}\left\{{s\atop\ell}\right\}\frac{\alpha^{\overline{\ell}}k^\ell}{\alpha^\ell}\Bigr)^2 + O\left(\frac{1}{n}\right) = \Bigl(\sum_{\ell=1}^{s}\left\{{s\atop\ell}\right\}\frac{\alpha^{\overline{\ell}}(kn)_\ell}{(\alpha n)^{\overline{\ell}}} + O\left(\frac{1}{n}\right)\Bigr)^2 + O\left(\frac{1}{n}\right) = (\mu_{s,\alpha})^2 + O\left(\frac{1}{n}\right). \]
This, combined with (3.3), yields $\mathrm{Var}\bigl[\sum_{j=1}^{n}(D^\alpha_{n,j})^s\bigr] = O(n)$. Result (i) follows immediately via Chebyshev's inequality, and (ii) is proved similarly.

The last ingredients that we need before moving on to prove Theorem 1 are the following bounds on the rising factorial:

Lemma 2. (Rising factorial bounds.)
Suppose $a \in (0, \infty)$ and $b \in \mathbb{Z} \cap [0, a+1)$. Then:

(i) The rising factorial $a^{\overline{b}}$ satisfies
\[ \exp\left(-\frac{b^3}{6a^2}\right) \leq \frac{a^{\overline{b}}}{a^b\exp\left(\frac{b(b-1)}{2a}\right)} \leq 1. \]

(ii) The rising factorial $a^{\overline{b}}$ satisfies
\[ 1 \leq \frac{a^{\overline{b}}}{a^b\exp\left(\frac{b(b-1)}{2a} - \frac{b(b-1)(2b-1)}{12a^2}\right)} \leq \exp\left(\frac{b^4}{12a^3}\right). \]

Proof.
Write
\[ a^{\overline{b}} = a^b\exp\left(\sum_{j=0}^{b-1}\log\left(1+\frac{j}{a}\right)\right). \tag{3.4} \]
For $x \in (0,1)$, the Taylor series of $\log(1+x)$ has alternating terms which decrease in absolute value, so that $\log(1+x)$ is sandwiched between any two successive partial sums. From this, we get the bounds
\[ \frac{j}{a} - \frac{j^2}{2a^2} \leq \log\left(1+\frac{j}{a}\right) \leq \frac{j}{a}, \tag{3.5} \]
and
\[ \frac{j}{a} - \frac{j^2}{2a^2} \leq \log\left(1+\frac{j}{a}\right) \leq \frac{j}{a} - \frac{j^2}{2a^2} + \frac{j^3}{3a^3} \tag{3.6} \]
for $0 \leq j \leq b-1 < a$. It follows from (3.4) and (3.5) that
\[ \exp\left(-\frac{b^3}{6a^2}\right) \leq \exp\left(-\frac{b(b-1)(2b-1)}{12a^2}\right) \leq \frac{a^{\overline{b}}}{a^b\exp\left(\frac{b(b-1)}{2a}\right)} \leq 1, \]
which proves part (i). Part (ii) follows similarly from (3.4) and (3.6).
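Both sandwiches in Lemma 2, as reconstructed above, are elementary enough to check numerically; the following sketch (the helper name is ours) verifies them for a few pairs $(a, b)$ with $b < a + 1$.

```python
import math

def rising(a, b):
    """The rising factorial a(a+1)...(a+b-1)."""
    r = 1.0
    for j in range(b):
        r *= a + j
    return r

for a, b in [(10.0, 4), (50.0, 12), (7.5, 6)]:
    # Lemma 2(i): exp(-b^3/(6a^2)) <= a^{rising b} / (a^b e^{b(b-1)/(2a)}) <= 1
    r1 = rising(a, b) / (a**b * math.exp(b * (b - 1) / (2 * a)))
    assert math.exp(-b**3 / (6 * a**2)) <= r1 <= 1.0
    # Lemma 2(ii): 1 <= a^{rising b} / (a^b e^{b(b-1)/(2a) - b(b-1)(2b-1)/(12a^2)})
    #                <= exp(b^4/(12 a^3))
    r2 = rising(a, b) / (a**b * math.exp(b * (b - 1) / (2 * a)
                                         - b * (b - 1) * (2 * b - 1) / (12 * a**2)))
    assert 1.0 <= r2 <= math.exp(b**4 / (12 * a**3))
```

The second sandwich is visibly tighter: the extra $-b(b-1)(2b-1)/(12a^2)$ correction in the exponent is exactly the second-order term of $\sum_j \log(1+j/a)$.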
4. Proof of Theorem 1(i): The Case α ≫ √n

We are now ready to prove Theorem 1(i): that $d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) \to 0$ as $n \to \infty$ if $\alpha/\sqrt{n} \to \infty$. The first step is to rewrite the total variation distance in terms of in-degree sequences, using (2.1). Letting $\mathcal{M}_{n,k}(d)$ denote the collection of all $k$-out maps on $[n]$ with in-degree sequence $d$, we compute
\[ d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) = \frac{1}{2}\sum_d\sum_{M\in\mathcal{M}_{n,k}(d)}\bigl|P(M^\alpha_{n,k}=M)-P(M^\infty_{n,k}=M)\bigr| = \frac{1}{2}\sum_d\sum_{M\in\mathcal{M}_{n,k}(d)}\left|\frac{\prod_{j=1}^{n}\alpha^{\overline{d_j}}}{(\alpha n)^{\overline{kn}}}-\frac{1}{n^{kn}}\right| = \frac{1}{2}\sum_d\binom{kn}{d}\left|\frac{\prod_{j=1}^{n}\alpha^{\overline{d_j}}}{(\alpha n)^{\overline{kn}}}-\frac{1}{n^{kn}}\right|, \tag{4.1} \]
where the summation is over all admissible in-degree sequences $d = (d_1, \ldots, d_n)$: $d_i \geq 0$ for all $i$, and $d_1 + \cdots + d_n = kn$. Notice that the right side of (4.1) is precisely the total variation distance between $D^\alpha_n$ and $D^\infty_n$. We might have expected this: indeed, conditioned on the in-degree sequence, $M^\alpha_{n,k}$ and $M^\infty_{n,k}$ are both uniformly random.

Let us write $\alpha = \omega\sqrt{n}$; note that $\omega = \omega(n) \to \infty$ as $n \to \infty$. Let $B_n$ denote the collection of in-degree sequences $d = (d_1, \ldots, d_n)$ such that
\[ |d_1^s + \cdots + d_n^s - \mu_{s,\alpha}\, n| < \sqrt{\omega n}, \qquad s = 2, 3, \]
where as before $\mu_{s,\alpha} := E[(D^\alpha_{n,j})^s]$. We split the sum from (4.1) into major and minor contributions:
\[ \frac{1}{2}\sum_d\binom{kn}{d}\left|\frac{\prod_j\alpha^{\overline{d_j}}}{(\alpha n)^{\overline{kn}}}-\frac{1}{n^{kn}}\right| = \frac{1}{2}\sum_{d\in B_n}\binom{kn}{d}\left|\frac{\prod_j\alpha^{\overline{d_j}}}{(\alpha n)^{\overline{kn}}}-\frac{1}{n^{kn}}\right| + \Sigma_n. \]
For $\Sigma_n$, by the triangle inequality,
\[ \Sigma_n = \frac{1}{2}\sum_{d\notin B_n}\binom{kn}{d}\left|\frac{\prod_j\alpha^{\overline{d_j}}}{(\alpha n)^{\overline{kn}}}-\frac{1}{n^{kn}}\right| \leq \frac{1}{2}\sum_{d\notin B_n}\binom{kn}{d}\frac{\prod_j\alpha^{\overline{d_j}}}{(\alpha n)^{\overline{kn}}} + \frac{1}{2}\sum_{d\notin B_n}\binom{kn}{d}\frac{1}{n^{kn}} = \frac{1}{2}P(D^\alpha_n \notin B_n) + \frac{1}{2}P(D^\infty_n \notin B_n). \tag{4.2} \]
By Corollary 2(i), with its $\omega$ replaced by $\sqrt{\omega}$, $P(D^\alpha_n \notin B_n) \to 0$ as $n \to \infty$. By Corollary 1(v),
\[ |\mu_{s,\alpha} - \mu_{s,\infty}| = O\left(\frac{1}{\alpha}\right) = O\left(\frac{1}{\omega\sqrt{n}}\right), \qquad s \in \{2, 3\}, \]
so that
\[ |n\mu_{s,\alpha} - n\mu_{s,\infty}| = O\left(\frac{\sqrt{n}}{\omega}\right) = o(\sqrt{\omega n}). \]
It follows that $P(D^\infty_n \notin B_n) \to 0$, since by Corollary 2(ii), with its $\omega$ replaced by $\sqrt{\omega}$, we have
\[ P\Bigl(\Bigl|\sum_{j=1}^{n}(D^\infty_{n,j})^s - \mu_{s,\infty}\, n\Bigr| < \sqrt{\omega n} \text{ for } s = 2, 3\Bigr) \to 1, \qquad n \to \infty. \]
So, by (4.2), $\Sigma_n \to 0$ as $n \to \infty$, and we can focus on the sum over $d \in B_n$.

Applying the rising factorial estimate in Lemma 2(i), we find that
\[ (\alpha n)^{\overline{kn}} = (\alpha n)^{kn}\exp\left(\frac{kn(kn-1)}{2\alpha n}\right)\left(1 + O\left(\frac{(kn)^3}{(\alpha n)^2}\right)\right) = (\alpha n)^{kn}\exp\left(\frac{k^2\sqrt{n}}{2\omega}\right)\left(1 + O\left(\frac{1}{\omega^2}\right) + O\left(\frac{1}{\omega\sqrt{n}}\right)\right). \tag{4.3} \]
Using the same bounds for each factor $\alpha^{\overline{d_j}}$ shows that, uniformly over all $d$,
\[ \prod_{j=1}^{n}\alpha^{\overline{d_j}} = \alpha^{kn}\exp\left(\sum_{j=1}^{n}\frac{d_j(d_j-1)}{2\alpha} + O\left(\sum_{j=1}^{n}\frac{d_j^3}{\alpha^2}\right)\right). \tag{4.4} \]
Here, uniformly over $d \in B_n$,
\[ \sum_{j=1}^{n}\frac{d_j(d_j-1)}{2\alpha} = \frac{\mu_{2,\alpha}n - kn + O(\sqrt{\omega n})}{2\alpha}. \]
Further, Corollaries 1(v) and 1(vi) imply
\[ \mu_{2,\alpha}n = \mu_{2,\infty}n + O\left(\frac{n}{\alpha}\right) = (k^2+k)n + O(1) + O\left(\frac{n}{\alpha}\right), \]
so that
\[ \sum_{j=1}^{n}\frac{d_j(d_j-1)}{2\alpha} = \frac{k^2 n}{2\alpha} + O\left(\frac{1}{\omega\sqrt{n}}\right) + O\left(\frac{1}{\omega^2}\right) + O\left(\frac{1}{\sqrt{\omega}}\right) = \frac{k^2\sqrt{n}}{2\omega} + O\left(\frac{1}{\sqrt{\omega}}\right). \]
Such $d$ also satisfy
\[ \sum_{j=1}^{n}\frac{d_j^3}{\alpha^2} = \frac{\mu_{3,\alpha}n + O(\sqrt{\omega n})}{\alpha^2} = O\left(\frac{1}{\omega^2}\right). \]
So, returning to (4.4), we find that uniformly over $d \in B_n$,
\[ \prod_{j=1}^{n}\alpha^{\overline{d_j}} = \alpha^{kn}\exp\left(\frac{k^2\sqrt{n}}{2\omega}\right)\left(1 + O\left(\frac{1}{\sqrt{\omega}}\right)\right). \tag{4.5} \]
Therefore
\[ \frac{1}{2}\sum_{d\in B_n}\binom{kn}{d}\left|\frac{\prod_j\alpha^{\overline{d_j}}}{(\alpha n)^{\overline{kn}}} - \frac{1}{n^{kn}}\right| = O\left(\frac{1}{\sqrt{\omega}}\cdot\frac{1}{n^{kn}}\sum_d\binom{kn}{d}\right) = O\left(\frac{1}{\sqrt{\omega}}\right) \to 0, \]
as $n \to \infty$. This completes the proof.
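For very small $n$ and $k$, the sum (4.1) over in-degree sequences can be evaluated exactly, giving a brute-force illustration (not the paper's method) of the decay of the distance as $\alpha$ grows; the names below are ours.

```python
from itertools import product
from math import factorial

def rising(a, b):
    r = 1.0
    for j in range(b):
        r *= a + j
    return r

def tv_indegree(n, k, alpha):
    """Exact total variation distance between the in-degree sequence laws
    of M^alpha_{n,k} and M^infty_{n,k}, via the sum in (4.1)."""
    kn = k * n
    tv = 0.0
    for d in product(range(kn + 1), repeat=n):
        if sum(d) != kn:
            continue
        multi = factorial(kn)
        for dj in d:
            multi //= factorial(dj)          # multinomial coefficient (kn choose d)
        p_alpha = float(multi)
        for dj in d:
            p_alpha *= rising(alpha, dj)     # product of rising factorials, (2.1)
        p_alpha /= rising(alpha * n, kn)
        tv += abs(p_alpha - multi / n**kn)
    return tv / 2

d1, d10, d1000 = (tv_indegree(4, 1, a) for a in (1.0, 10.0, 1000.0))
assert d1000 < d10 < d1 < 1.0               # the distance decays as alpha grows
```

This also makes tangible why (4.1) suffices: conditioned on the in-degree sequence, both models are uniform, so the whole distance lives on the in-degree sequences.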
5. Proof of Theorem 1(ii): The Case α ≪ √n

Having seen that $d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) \to 0$ as $n \to \infty$ when $\alpha/\sqrt{n} \to \infty$, the natural question to ask is this: is this the best we can do? We now prove that it is, in the sense that $d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) \to 1$ as $n \to \infty$ if $\alpha \to \infty$ but $\alpha/\sqrt{n} \to 0$. Observe first that for any sequence of events $A_n \subseteq \mathcal{M}_{n,k}$,
\[ d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) = \sup_{A\subseteq\mathcal{M}_{n,k}}\bigl|P(M^\alpha_{n,k}\in A)-P(M^\infty_{n,k}\in A)\bigr| \geq \bigl|P(M^\alpha_{n,k}\in A_n)-P(M^\infty_{n,k}\in A_n)\bigr|. \]
As such, it is enough to find an event $A_n$ such that $P(M^\alpha_{n,k} \in A_n) \to 1$ while $P(M^\infty_{n,k} \in A_n) \to 0$. To that end, let us write $\alpha = \sqrt{n}/\omega$, and let $G_n$ denote the set of all admissible in-degree sequences $d = (d_1, \ldots, d_n)$ such that
\[ |d_1^2 + \cdots + d_n^2 - \mu_{2,\alpha}\, n| < \sqrt{\omega n}. \]
Then the event $A_n := \{M : d(M) \in G_n\}$ is precisely the event that we seek. Indeed, since $\omega \to \infty$ as $n \to \infty$, $P(D^\alpha_n \in G_n) \to 1$ as $n \to \infty$ by Corollary 2(i). On the other hand, Corollary 2(ii) states that
\[ P\bigl(|(D^\infty_{n,1})^2 + \cdots + (D^\infty_{n,n})^2 - \mu_{2,\infty}\, n| < \sqrt{\omega n}\bigr) \to 1, \qquad n \to \infty. \tag{5.1} \]
Here, by Corollary 1(iii),
\[ |\mu_{2,\alpha}n - \mu_{2,\infty}n| = \frac{k(kn-1)(n-1)}{\alpha n+1} \sim \frac{k^2 n}{\alpha} = k^2\omega\sqrt{n}, \]
whence $|\mu_{2,\alpha}n - \mu_{2,\infty}n| \geq 2\sqrt{\omega n}$ for $n$ sufficiently large, so that (5.1) implies that $P(D^\infty_n \in G_n) \to 0$ as $n \to \infty$. We conclude that
\[ 1 = \lim_{n\to\infty}\bigl|P(D^\alpha_n \in G_n) - P(D^\infty_n \in G_n)\bigr| \leq \lim_{n\to\infty} d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) \leq 1, \]
thus completing the proof.
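The separating statistic above can be made quantitative with the exact moments of Corollary 1: the mean of $\sum_j D_j^2$ under the two models differs by roughly $k^2 n/\alpha$, which dwarfs the $\sqrt{n}$-scale fluctuations precisely when $\alpha \ll \sqrt{n}$. A numerical sketch (function names ours):

```python
import math

def mu2_alpha(n, k, alpha):
    """E[(D^alpha_{n,j})^2] = k + alpha(alpha+1) kn(kn-1) / (alpha n (alpha n + 1)),
    from Corollary 1(iii) with s = 2."""
    kn = k * n
    return k + alpha * (alpha + 1) * kn * (kn - 1) / (alpha * n * (alpha * n + 1))

def mu2_inf(n, k):
    """E[(D^infty_{n,j})^2] = k + kn(kn-1)/n^2."""
    kn = k * n
    return k + kn * (kn - 1) / n**2

n, k = 10**6, 2
gap = lambda a: n * abs(mu2_alpha(n, k, a) - mu2_inf(n, k))  # shift of E[sum D_j^2]
assert gap(n**0.25) > 20 * math.sqrt(n)   # alpha << sqrt(n): shift is detectable
assert gap(n**0.75) < math.sqrt(n)        # alpha >> sqrt(n): shift drowns in noise
```

The two asserts are exactly the dichotomy of Theorem 1: below the $\sqrt{n}$ threshold the sum of squared in-degrees alone distinguishes the models, above it the shift is invisible at the fluctuation scale.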
6. Proof of Theorem 1(iii): The Case α = β√n

We have now seen that $d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) \to 0$ when $\alpha/\sqrt{n} \to \infty$, meaning that $M^\alpha_{n,k}$ is, in a strong sense, asymptotically uniform in this case. We have also seen that $d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) \to 1$ when $\alpha/\sqrt{n} \to 0$, so that, in the limit, the supports of $M^\alpha_{n,k}$ and $M^\infty_{n,k}$ in $\mathcal{M}_{n,k}$ are effectively disjoint. The natural follow-up question, of course, is this: what happens in between? We are now ready to prove Theorem 1(iii), which covers exactly this case. Let us start by giving a basic outline of the proof.

A slight manipulation of (4.1) gives us that
\[ d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) = \frac{1}{2}\sum_d\binom{kn}{d}\frac{\prod_{j=1}^{n}\alpha^{\overline{d_j}}}{(\alpha n)^{\overline{kn}}}\left|1 - \frac{(\alpha n)^{\overline{kn}}}{n^{kn}\prod_{j=1}^{n}\alpha^{\overline{d_j}}}\right|. \tag{6.1} \]
Note that this expresses the total variation distance as the expectation, with respect to $M^\alpha_{n,k}$, of the following quantity:
\[ \frac{1}{2}\left|1 - \frac{(\alpha n)^{\overline{kn}}}{n^{kn}\prod_{j=1}^{n}\alpha^{\overline{D^\alpha_{n,j}}}}\right|. \tag{6.2} \]
As in the proof of Theorem 1(i), we begin by splitting the sum in (6.1) into major and minor contributions of "good" and "bad" $d$, largely corresponding to two complementary ranges of $\sum_j d_j^2$. We will show that the contribution of bad $d$ is negligible; so our focus will be on good $d$. A sharp asymptotic analysis will show that, uniformly over those $d$,
\[ \left|1 - \frac{(\alpha n)^{\overline{kn}}}{n^{kn}\prod_{j=1}^{n}\alpha^{\overline{d_j}}}\right| \approx f(S_n(d)), \qquad S_n(d) := \frac{\sum_{j=1}^{n}d_j^2 - nE[(Z_{n,1})^2]}{\sqrt{n}}; \tag{6.3} \]
here $Z_n = (Z_{n,1}, \ldots, Z_{n,n})$ is as in Lemma 1(i) and
\[ f(x) := \left|1 - \exp\left(-\frac{k^2}{4\beta^2} - \frac{x}{2\beta}\right)\right|. \]
Since $E[(Z_{n,1})^2]$ and $E[(D^\alpha_{n,1})^2]$ are relatively close, (6.3) is a clear sign that a central limit theorem (CLT) for $(D^\alpha_{n,1})^2 + \cdots + (D^\alpha_{n,n})^2$ might be the key for asymptotic analysis of the contribution by good $d$.
In this setting, a CLT is indeed plausible: we know that $D^\alpha_n$ is distributed as $Z_n$ conditioned on $Z_{n,1} + \cdots + Z_{n,n} = kn$, a condition weak enough that any fixed, bounded set of coordinates of $D^\alpha_n$ are asymptotically independent. Still, the interdependence of the $D^\alpha_{n,j}$ is too strong to count on standard techniques, such as Fourier- and/or martingale-based approaches. If workable at all, the method of moments would have required sharp asymptotic estimates of the central moments of $(D^\alpha_{n,1})^2 + \cdots + (D^\alpha_{n,n})^2$, obtained from an extension of Corollary 1(iv) to higher-order mixed factorial moments: an exceedingly computational route which would hardly explain the intrinsic reasons for the CLT to hold.

So, instead, we recall the result of Lemma 1(i): the in-degree sequence $D^\alpha_n$ is distributed as $Z_n = (Z_{n,1}, \ldots, Z_{n,n})$ conditioned on $Z_{n,1} + \cdots + Z_{n,n} = kn$, where the $Z_{n,j}$ are IID negative binomial variables. In light of this, we consider the two-dimensional sum $\sum_j (n^{-1/2}Z_{n,j},\, n^{-1/2}Z_{n,j}^2)$ of independent 2-vectors, or, more specifically, the centered vector
\[ \mathbf{S}_n = (S_{n,1}, S_{n,2}) = \left(\frac{\sum_{j=1}^{n}Z_{n,j} - nE[Z_{n,1}]}{\sqrt{n}},\ \frac{\sum_{j=1}^{n}Z_{n,j}^2 - nE[Z_{n,1}^2]}{\sqrt{n}}\right). \]
Conditioned on $S_{n,1} = 0$, the second coordinate $S_{n,2}$ is distributed as $S_n(D^\alpha_n)$, $S_n(d)$ being defined in (6.3). Now we should certainly expect that $\mathbf{S}_n$ is asymptotically Gaussian (normal). However, just a CLT for $\mathbf{S}_n$ would not be enough, since $P(S_{n,1} = 0) = \Theta(n^{-1/2})$, making the conditioning event $\{S_{n,1} = 0\}$ way too intrusive to extract the limiting distribution of $S_n(D^\alpha_n)$ from the limiting cumulative distribution of $\mathbf{S}_n$. So, instead, we will have to prove a local central limit theorem (LCLT) for $\mathbf{S}_n$; this will immediately yield an LCLT (which in turn directly implies a CLT) for $S_n(D^\alpha_n)$.

Lemma 3. (LCLT for $\mathbf{S}_n$.) Suppose $\alpha = \alpha(n) < \infty$ for all $n$, but that $\alpha(n) \to \infty$ as $n \to \infty$. Then
\[ \lim_{n\to\infty}\sup_{\mathbf{x}\in\mathrm{Supp}(\mathbf{S}_n)}\left|\frac{n}{2}\,P(\mathbf{S}_n = \mathbf{x}) - \eta(\mathbf{x})\right| = 0, \]
where
\[ \eta(\mathbf{x}) := \frac{\exp(-\mathbf{x}\Sigma^{-1}\mathbf{x}^T/2)}{2\pi\sqrt{\det\Sigma}} \]
is the density function of a Gaussian random vector in $\mathbb{R}^2$ with mean $\mathbf{0}$ and the positive-definite covariance matrix $\Sigma$, with $\Sigma$ and its inverse $\Sigma^{-1}$ given by
\[ \Sigma = \begin{pmatrix} k & 2k^2+k \\ 2k^2+k & 4k^3+6k^2+k \end{pmatrix}, \qquad \Sigma^{-1} = \begin{pmatrix} 2 + 3k^{-1} + k^{-2}/2 & -k^{-1} - k^{-2}/2 \\ -k^{-1} - k^{-2}/2 & k^{-2}/2 \end{pmatrix}. \]

Proof.
As a template, we use the Fourier-based proof of the one-variable LCLT inDurrett [10, Section 2.5, Theorem 5.2], modifying it for vector-valued summands, anddealing with dependence on n (through dependence of α on n ) of the distributions of theindividual summands. Most of the relevant facts about multi-dimensional characteristicfunctions, such as inversion formulas, can be found in [3] by Bhattacharya and Rao.For ease of notation, define V n,j := ( Z n,j − E [ Z n,j ] , Z n,j − E [ Z n,j ]) , V n = X j V n,j ;so V n,j are IID, and S n = n − / V n . For t ∈ R , introduce the characteristic functions φ n ( t ) := E [ e i h t , V n,j i ] , Φ n ( t ) = E [ e i h t , S n i ] = ( φ n ( n − / t )) n . Let us show first that lim n →∞ Φ n ( t ) = e − tΣt T , t ∈ R ; (6.4)see Lemma 3 for Σ . This will prove the CLT for S n , namely: S n converges indistribution to a Gaussian 2-vector with mean and covariance matrix Σ .Each V n,j has mean , and the same covariance matrix, Σ n . By Corollary 1(iii), E (cid:2) k V n,j k (cid:3) = O (1) as n → ∞ . So, uniformly in n , φ n ( y ) = 1 − yΣ n y T / O (cid:0) | y | (cid:1) , y → . (6.5)Using Corollary 1(i) to compute Σ n , we find that k Σ n − Σ k = O ( α − ) →
0, as α = Θ( n / ). Thus, for every t ,Φ n ( t ) = (cid:0) φ n ( n − / t ) (cid:1) n = (cid:0) − n − tΣt T / O ( n − / ) (cid:1) n → e − tΣt T , istance Between Two Random k -Out Digraphs which proves (6.4).The CLT is a key ingredient in the proof of the LCLT for S n .The minimum additive subgroup L ⊆ R such that P ( V n,j ∈ x + L ) = 1 forsome x ∈ R is generated by a = (1 ,
1) and b = (1 , − Z n,j , Z n,j ) mustbe contained in the span of a and b , because m ≡ m (mod 2) for all non-negativeintegers m , and a , b are necessary because (0 , , (1 , , (2 , ∈ Supp( Z n,j , Z n,j ). Sincethe random V n,j are independent, the minimum subgroup for S n is L / √ n , the latticegenerated by a / √ n and b / √ n . So, by the inversion formula for lattice-distributedvariables (c.f. [3, p.230, eq. 21.28]), P ( S n = x ) = 14 π · n · Z F n e − i h t , x i Φ n ( t ) d t , x ∈ Supp ( S n ) , (6.6)where F n := { t = ( t , t ) : | t + t | < π √ n and | t − t | < π √ n } . Since the characteristic function e − tΣt T ∈ L ( R ), we also have η ( x ) = 14 π Z R e − i h t , x i e − tΣt T d t . (6.7)The triangle inequality and equations (6.6) and (6.7) imply (cid:12)(cid:12)(cid:12) n P ( S n = x ) − η ( x ) (cid:12)(cid:12)(cid:12) ≤ Z F n (cid:12)(cid:12)(cid:12) Φ n ( t ) − e − t tΣt T (cid:12)(cid:12)(cid:12) d t + Z R \F n e − tΣt T d t . (6.8)The right side of (6.8) does not depend on x . So, in order to establish uniformconvergence, it suffices to show that the right side of (6.8) tends to 0 as n → ∞ .That the integral over R \ F n tends to 0 is immediate: e − tΣt T is integrable, and F n increases to R as n → ∞ . Consider the integral over R \ F n . Note that, bythe CLT already proved, the integrand converges to zero pointwise. To prove that theintegral goes to zero as well, we consider t with small n − / | t | and the remaining t separately.We know that Σ n → Σ , and Σ is positive-definite. So there is a constant γ > n large enough, yΣ n y T ≥ γ | y | for all y ∈ R . Consequently, by theuniform estimate in (6.5), there is a constant δ ′ > n large enough and | y | ≤ δ ′ , | φ n ( y ) | ≤ − γ | y | ≤ e − γ | y | . Choose δ ∈ (0 , π ) so that t ∈ F (1) n implies | t | < δ ′ √ n , where F (1) n := { t : | t + t | < δ √ n and | t − t | < δ √ n } . 
Then for $\mathbf t \in \mathcal F^{(1)}_n$,
$$\Big|\Phi_n(\mathbf t) - e^{-\mathbf t\Sigma\mathbf t^T/2}\Big| \le |\Phi_n(\mathbf t)| + e^{-\mathbf t\Sigma\mathbf t^T/2} \le \big(e^{-\gamma|\mathbf t|^2/(4n)}\big)^n + e^{-\mathbf t\Sigma\mathbf t^T/2} = e^{-\gamma|\mathbf t|^2/4} + e^{-\mathbf t\Sigma\mathbf t^T/2},$$
which is integrable. So, by the Dominated Convergence Theorem,
$$\int_{\mathcal F^{(1)}_n}\Big|\Phi_n(\mathbf t) - e^{-\mathbf t\Sigma\mathbf t^T/2}\Big|\,d\mathbf t \to 0, \qquad n \to \infty.$$
For $\mathcal F^{(2)}_n := \mathcal F_n \setminus \mathcal F^{(1)}_n$, note that $V_{n,j}$ converges in distribution to $(Y - k,\, Y^2 - (k + k^2))$ as $n \to \infty$, where $Y$ is Poisson-distributed with parameter $k$. Therefore $\phi_n(\mathbf t)$ converges to the characteristic function $\phi^*(\mathbf t)$ of this limit, uniformly on compact sets in $\mathbb R^2$. By [3, Lemma 21.6], $|\phi^*(\mathbf y)| = 1$ if and only if $\mathbf y = (y_1, y_2)$ has $|y_1 + y_2|, |y_1 - y_2| \in 2\pi\mathbb Z$. We note that, uniformly for $\mathbf t \in \mathcal F^{(2)}_n$, $\mathbf t/\sqrt n$ is bounded away from all such points. So, by uniform continuity of $\phi^*$ on every compact set, there exists $\varepsilon \in (0,1)$ so that for all $n$ and for all $\mathbf t \in \mathcal F^{(2)}_n$, $|\phi^*(\mathbf t/\sqrt n)| \le 1 - 2\varepsilon$. Using again the fact that $\mathcal F^{(2)}_n/\sqrt n$ is contained in a compact set, we know that $\phi_n$ converges uniformly to $\phi^*$ on $\mathcal F^{(2)}_n/\sqrt n$. Thus $|\phi_n(\mathbf t/\sqrt n)| \le 1 - \varepsilon$ and $|\Phi_n(\mathbf t)| \le (1-\varepsilon)^n$ for $\mathbf t \in \mathcal F^{(2)}_n$ and $n$ sufficiently large. It then follows that for $n$ sufficiently large,
$$\int_{\mathcal F^{(2)}_n}\Big|\Phi_n(\mathbf t) - e^{-\mathbf t\Sigma\mathbf t^T/2}\Big|\,d\mathbf t \le (1-\varepsilon)^n\,\mathrm{Vol}(\mathcal F^{(2)}_n) + \int_{\mathcal F^{(2)}_n} e^{-\mathbf t\Sigma\mathbf t^T/2}\,d\mathbf t \to 0,$$
as $n \to \infty$, completing the proof.

With Lemma 3 in hand, we are ready to prove the desired CLT for the sum of squared in-degrees. In fact, the LCLT of Lemma 3 is strong enough to prove the corresponding LCLT for that sum, which directly implies the desired convergence in distribution.

Corollary 3. (LCLT for $\sum_j (D^\alpha_{n,j})^2$.) Suppose $\alpha = \alpha(n) < \infty$ for all $n$, but $\alpha(n) \to \infty$ as $n \to \infty$. Define
$$S_n = \frac{\sum_j (D^\alpha_{n,j})^2 - nE[(Z_{n,1})^2]}{\sqrt n}.$$
Then
$$\lim_{n\to\infty}\,\sup_{x\in\mathrm{Supp}(S_n)}\big|\sqrt n\,P(S_n = x) - \psi(x)\big| = 0,$$
where $\psi(x) := e^{-x^2/(4k^2)}/(k\sqrt\pi)$ is twice the density of a Gaussian random variable with mean $0$ and variance $2k^2$.

Proof. Let $\bar S_n$ denote the bivariate normalized sum defined in Lemma 3 (we add the bar to distinguish it from the scalar $S_n$ just defined). Then for any $x \in \mathrm{Supp}(S_n)$,
$$P(S_n = x) = P\Big(\bar S_n = (0, x)\,\Big|\,\sum_j Z_{n,j} = kn\Big) = \frac{P\big(\bar S_n = (0,x),\ \sum_j Z_{n,j} = kn\big)}{P\big(\sum_j Z_{n,j} = kn\big)} = \frac{P\big(\bar S_n = (0,x)\big)}{P\big(\sum_j Z_{n,j} = kn\big)}, \qquad (6.9)$$
since the condition $\sum_j Z_{n,j} = kn$ means precisely that the first coordinate of $\bar S_n$ is zero. As in the proof of Lemma 1,
$$P\Big(\sum_j Z_{n,j} = kn\Big) = \frac{(\alpha n)^{(kn)}}{(kn)!}\Big(\frac{\alpha}{\alpha + k}\Big)^{\alpha n}\Big(\frac{k}{\alpha + k}\Big)^{kn}.$$
We apply the identity $a^{(b)} = \Gamma(a + b)/\Gamma(a)$ and Stirling's approximation to obtain
$$P\Big(\sum_j Z_{n,j} = kn\Big) = \sqrt{\frac{\alpha}{2\pi k(\alpha + k)n}}\,\Big(1 + O\Big(\frac1n\Big)\Big) = \frac{1}{\sqrt{2\pi kn}}\Big(1 + O\Big(\frac1\alpha\Big) + O\Big(\frac1n\Big)\Big).$$
So, using (6.9) and Lemma 3, uniformly over $x \in \mathrm{Supp}(S_n)$,
$$\sqrt n\,P(S_n = x) = \big(1 + O(\alpha^{-1})\big)\sqrt{2\pi k}\cdot n\,P\big(\bar S_n = (0,x)\big) = \big(1 + O(\alpha^{-1})\big)\cdot 2\sqrt{2\pi k}\,\big(\eta(0,x) + o(1)\big) = \psi(x) + o(1)$$
(recall that $\bar S_n$ denotes the bivariate sum from Lemma 3), where
$$\psi(x) = 2\sqrt{2\pi k}\,\eta(0,x) = \frac{\sqrt{2\pi k}}{\pi\sqrt{\det\Sigma}}\,\exp\Big(-\frac{x^2(\Sigma^{-1})_{22}}{2}\Big) = \frac{e^{-x^2/(4k^2)}}{k\sqrt\pi},$$
using $\det\Sigma = 2k^3$ and $(\Sigma^{-1})_{22} = 1/(2k^2)$. So
$$\lim_{n\to\infty}\,\sup_{x\in\mathrm{Supp}(S_n)}\big|\sqrt n\,P(S_n = x) - \psi(x)\big| = 0,$$
which completes the proof.

Finally, we are ready to carry out the proof of Theorem 1(iii). We proceed via a series of claims, after making some initial definitions. Define
$$f(x) := \Big|1 - \exp\Big(-\frac{k^2}{2\beta^2} - \frac{x}{2\beta}\Big)\Big|.$$
Let $\omega = \omega(n)$ be such that $\omega \to \infty$ as $n \to \infty$ but $\omega/\sqrt n \to 0$. For $A > 0$, define the function $f_A$ by
$$f_A(x) := \begin{cases} f(x) & \text{if } x \ge -A,\\ f(-A) & \text{otherwise;}\end{cases}$$
so $f_A$ is bounded and continuous for any fixed $A$. Further, define
$$\mathcal A_n = \mathcal A_n(A) := \Big\{\mathbf d : \Big|\sum_{j=1}^n d_j^2 - nE[(D^\alpha_{n,1})^2]\Big| < A\sqrt n\Big\},$$
where $\mathbf d$ ranges over all valid in-degree sequences of $k$-out digraphs on $[n]$, and
$$\mathcal A'_n = \mathcal A'_n(A) := \Big\{\mathbf d \in \mathcal A_n : \Big|\sum_{j=1}^n d_j^s - nE[(D^\alpha_{n,1})^s]\Big| < \omega\sqrt n \text{ for } s = 3, 4\Big\}.$$
As in the proof of Theorem 1(i), we start from
$$d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) = \frac12\sum_{\mathbf d}\binom{kn}{\mathbf d}\Bigg|\frac{\prod_{j=1}^n \alpha^{(d_j)}}{(\alpha n)^{(kn)}} - \frac{1}{n^{kn}}\Bigg|,$$
where $a^{(b)}$ denotes the rising factorial.

Proposition 1.
The error committed in restricting $d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k})$ to in-degree sequences $\mathbf d \in \mathcal A'_n$ can be made small by choosing $A$ large. In particular, for $A > 2k^2/\beta$, there is a constant $C > 0$, independent of $A$, such that
$$\limsup_{n\to\infty}\,\sum_{\mathbf d \notin \mathcal A'_n}\binom{kn}{\mathbf d}\Bigg|\frac{\prod_{j=1}^n \alpha^{(d_j)}}{(\alpha n)^{(kn)}} - \frac{1}{n^{kn}}\Bigg| \le \frac{C}{(A - 2k^2/\beta)^2}.$$

Proof.
By the triangle inequality,
$$\sum_{\mathbf d \notin \mathcal A'_n}\binom{kn}{\mathbf d}\Bigg|\frac{\prod_{j=1}^n \alpha^{(d_j)}}{(\alpha n)^{(kn)}} - \frac{1}{n^{kn}}\Bigg| \le P(\mathbf D^\alpha_n \notin \mathcal A'_n) + P(\mathbf D^\infty_n \notin \mathcal A'_n).$$
By Corollary 2(i),
$$P\Big(\Big|\sum_{j=1}^n (D^\alpha_{n,j})^s - nE[(D^\alpha_{n,1})^s]\Big| \ge \omega\sqrt n\Big) \to 0, \qquad n \to \infty,$$
for $s = 3$ and $s = 4$, while a similar application of Chebyshev's inequality yields
$$P\Big(\Big|\sum_{j=1}^n (D^\alpha_{n,j})^2 - nE[(D^\alpha_{n,1})^2]\Big| \ge A\sqrt n\Big) \le \frac{\mathrm{Var}\big[\sum_{j=1}^n (D^\alpha_{n,j})^2\big]}{A^2 n} = O(A^{-2}).$$
The last two estimates combine to imply the existence of a constant $C_1 > 0$, independent of $A$, so that
$$\limsup_{n\to\infty} P(\mathbf D^\alpha_n \notin \mathcal A'_n) \le \frac{C_1}{A^2}. \qquad (6.10)$$
For $n$ sufficiently large, Corollary 1(ii) and $\alpha = \beta n^{1/2}$ imply that, for $s = 2, 3, 4$,
$$\big|E[(D^\alpha_{n,1})^s] - E[(D^\infty_{n,1})^s]\big| \le c_s n^{-1/2},$$
the $c_s$ being constants, with $c_2 = 2k^2/\beta$. By Chebyshev's inequality, for $s = 3$ and $s = 4$ we have
$$P\Big(\Big|\sum_{j=1}^n (D^\infty_{n,j})^s - nE[(D^\alpha_{n,1})^s]\Big| \ge \omega\sqrt n\Big) \le \frac{\mathrm{Var}\big[\sum_{j=1}^n (D^\infty_{n,j})^s\big]}{(\omega - c_s)^2 n} \to 0, \qquad n \to \infty,$$
while for some constant $C_2$ independent of $A$,
$$P\Big(\Big|\sum_{j=1}^n (D^\infty_{n,j})^2 - nE[(D^\alpha_{n,1})^2]\Big| \ge A\sqrt n\Big) \le \frac{\mathrm{Var}\big[\sum_{j=1}^n (D^\infty_{n,j})^2\big]}{(A - 2k^2/\beta)^2 n} \le \frac{C_2}{(A - 2k^2/\beta)^2}.$$
This bound and (6.10) combined imply that
$$\limsup_{n\to\infty} P(\mathbf D^\infty_n \notin \mathcal A'_n) \le \frac{C_1 + C_2}{(A - 2k^2/\beta)^2},$$
completing the proof.

Proposition 2. As $n \to \infty$,
$$\sum_{\mathbf d \in \mathcal A'_n}\binom{kn}{\mathbf d}\Bigg|\frac{\prod_{j=1}^n \alpha^{(d_j)}}{(\alpha n)^{(kn)}} - \frac{1}{n^{kn}}\Bigg| = E\big[f(S_n)\,\mathbb 1_{\{\mathbf D^\alpha_n \in \mathcal A'_n\}}\big] + O\Big(\frac{\omega}{\sqrt n}\Big).$$

Proof.
As in the proof of Theorem 1(i), we rewrite
$$\binom{kn}{\mathbf d}\Bigg|\frac{\prod_{j=1}^n \alpha^{(d_j)}}{(\alpha n)^{(kn)}} - \frac{1}{n^{kn}}\Bigg| = \binom{kn}{\mathbf d}\,\frac{\prod_{j=1}^n \alpha^{(d_j)}}{(\alpha n)^{(kn)}}\,\Bigg|1 - \frac{(\alpha n)^{(kn)}}{n^{kn}\prod_{j=1}^n \alpha^{(d_j)}}\Bigg|.$$
By Lemma 2(ii),
$$(\alpha n)^{(kn)} = (\alpha n)^{kn}\exp\Big(\frac{k^2\sqrt n}{2\beta} - \frac{k^3}{6\beta^2} + O\Big(\frac{1}{\sqrt n}\Big)\Big).$$
Applying Lemma 2(ii) to each $\alpha^{(d_j)}$, we obtain, uniformly over $\mathbf d \in \mathcal A'_n$,
$$\prod_{j=1}^n \alpha^{(d_j)} = \alpha^{kn}\exp\Bigg(\sum_{j=1}^n\Big(\frac{d_j(d_j-1)}{2\alpha} - \frac{d_j(d_j-1)(2d_j-1)}{12\alpha^2}\Big) + O\Big(\frac{\sum_j d_j^4}{\alpha^3}\Big)\Bigg).$$
Since $\sum_j d_j = kn$ and $\alpha = \beta\sqrt n$, the first-order term equals $\big(\sum_j d_j^2 - kn\big)/(2\beta\sqrt n)$; evaluating the remaining sums over $\mathbf d \in \mathcal A'_n$ with the help of the moments $\mu_{s,\alpha}$ from Corollary 1(iii), we arrive at
$$\prod_{j=1}^n \alpha^{(d_j)} = \alpha^{kn}\exp\Bigg(\frac{\sum_j d_j^2 - nE[(Z_{n,1})^2]}{2\beta\sqrt n} + \frac{k^2\sqrt n}{2\beta} + \frac{k^2}{2\beta^2} - \frac{k^3}{6\beta^2} + O\Big(\frac{\omega}{\sqrt n}\Big)\Bigg).$$
Combining these results yields
$$\frac{(\alpha n)^{(kn)}}{n^{kn}\prod_{j=1}^n \alpha^{(d_j)}} = \exp\Bigg(-\frac{k^2}{2\beta^2} - \frac{\sum_j d_j^2 - nE[(Z_{n,1})^2]}{2\beta\sqrt n} + O\Big(\frac{\omega}{\sqrt n}\Big)\Bigg). \qquad (6.11)$$
Thus, denoting $x(\mathbf d) = n^{-1/2}\big(\sum_j d_j^2 - nE[(Z_{n,1})^2]\big)$ and recalling the definition of $f$,
$$\Bigg|1 - \frac{(\alpha n)^{(kn)}}{n^{kn}\prod_j \alpha^{(d_j)}}\Bigg| = f(x(\mathbf d)) + O\big(e^{-k^2/(2\beta^2) - x(\mathbf d)/(2\beta)}\,\omega n^{-1/2}\big) = f(x(\mathbf d)) + O(\omega n^{-1/2}),$$
uniformly over $\mathbf d \in \mathcal A'_n$, because $|x(\mathbf d)|$ is bounded over such $\mathbf d$. It follows that
$$\sum_{\mathbf d \in \mathcal A'_n}\binom{kn}{\mathbf d}\,\frac{\prod_j \alpha^{(d_j)}}{(\alpha n)^{(kn)}}\,\Bigg|1 - \frac{(\alpha n)^{(kn)}}{n^{kn}\prod_j \alpha^{(d_j)}}\Bigg| = E\big[f(S_n)\,\mathbb 1_{\{\mathbf D^\alpha_n \in \mathcal A'_n\}}\big] + O\Big(\frac{\omega}{\sqrt n}\Big),$$
as claimed.

Proposition 3.
There is a constant $C$ (independent of $A$) such that
$$\limsup_{n\to\infty}\Big|E\big[f(S_n)\,\mathbb 1_{\{\mathbf D^\alpha_n \in \mathcal A'_n\}}\big] - E[f_A(S_n)]\Big| \le C\big(1 + e^{A/(2\beta)}\big)\,\frac{e^{-A^2/(4k^2)}}{A}.$$
Proof.
Consider the two expectations as sums over $\mathbf d$ and match the summands for the same $\mathbf d$. The matched summands are identical for $\mathbf d \in \mathcal A'_n$, and the summand from the first sum is $0$ for $\mathbf d \notin \mathcal A'_n$. So we only need to bound $E\big[f_A(S_n)\,\mathbb 1_{\{\mathbf D^\alpha_n \notin \mathcal A'_n\}}\big]$, where
$$\{\mathbf D^\alpha_n \notin \mathcal A'_n\} \subseteq \{\mathbf D^\alpha_n \in \mathcal A_n \setminus \mathcal A'_n\} \cup \{\mathbf D^\alpha_n \notin \mathcal A_n\}.$$
Since $|f_A(x)| \le 1 + e^{A/(2\beta)}$,
$$E\big[f_A(S_n)\,\mathbb 1_{\{\mathbf D^\alpha_n \in \mathcal A_n \setminus \mathcal A'_n\}}\big] \le \big(1 + e^{A/(2\beta)}\big)\,P(\mathbf D^\alpha_n \in \mathcal A_n \setminus \mathcal A'_n) = o(1), \qquad (6.12)$$
and, since $S_n \Rightarrow \mathcal N(0, 2k^2)$ as $n \to \infty$ by Corollary 3 and $A$ is fixed,
$$E\big[f_A(S_n)\,\mathbb 1_{\{\mathbf D^\alpha_n \notin \mathcal A_n\}}\big] \le \big(1 + e^{A/(2\beta)}\big)\,P(|S_n| \ge A) \to \big(1 + e^{A/(2\beta)}\big)\,P\big(|\mathcal N(0, 2k^2)| \ge A\big). \qquad (6.13)$$
Here, by the tail inequality for normal variables (cf. [10, Theorem 1.1.4]),
$$P\big(|\mathcal N(0, 2k^2)| \ge A\big) \le \frac{2k}{\sqrt\pi}\cdot\frac{e^{-A^2/(4k^2)}}{A}. \qquad (6.14)$$
Combining (6.12), (6.13) and (6.14) proves the claim.
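The normal tail bound in (6.14) is easy to confirm numerically. The following sketch is our own illustration, not part of the proof; the function names are ours, and the variance $2k^2$ is the one appearing in Corollary 3. It compares the exact two-sided tail of $\mathcal N(0, 2k^2)$, computed via the complementary error function, with the bound $(2k/\sqrt\pi)\,e^{-A^2/(4k^2)}/A$:

```python
import math

def normal_tail_two_sided(A, sigma2):
    """Exact P(|N(0, sigma2)| >= A), via the complementary error function."""
    return math.erfc(A / math.sqrt(2.0 * sigma2))

def tail_bound_614(A, k):
    """The right side of (6.14): (2k/sqrt(pi)) * exp(-A^2/(4k^2)) / A."""
    return (2.0 * k / math.sqrt(math.pi)) * math.exp(-A * A / (4.0 * k * k)) / A

# The bound holds for every A > 0; spot-check a grid of values.
for k in (1, 2, 5):
    for A in (1.0, 3.0, 10.0):
        assert normal_tail_two_sided(A, 2.0 * k * k) <= tail_bound_614(A, k)
```

The bound is tightest for large $A/k$; for $k = 1$, $A = 10$ the two sides agree to within about two percent, which is what makes (6.14) effective in the $A \to \infty$ limit below.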
Proposition 4. As $n \to \infty$, $E[f_A(S_n)] \to E\big[f_A(\mathcal N(0, 2k^2))\big]$.

Proof.
This is an immediate consequence of two facts: $f_A$ is bounded and continuous, and the CLT for $S_n$ holds.

We are now ready to put the pieces together:

Proof of Theorem 1(iii).
Combining Propositions 1-4, we find that for
$A > 2k^2/\beta$,
$$\limsup_{n\to\infty}\Big|d_{TV}(M^\alpha_{n,k}, M^\infty_{n,k}) - \frac12 E\big[f_A(\mathcal N(0, 2k^2))\big]\Big| \le_b \frac{1}{(A - 2k^2/\beta)^2} + \big(1 + e^{A/(2\beta)}\big)\frac{e^{-A^2/(4k^2)}}{A}, \qquad (6.15)$$
where "$\le_b$" means that the left side is bounded by a constant (independent of $A$) multiple of the right side. Now, letting $A \to \infty$, the right side of (6.15) tends to $0$. Further, the nonnegative $f_A(x)$ increases pointwise to $f(x)$ for all $x$, so that
$$\frac12 E\big[f_A(\mathcal N(0, 2k^2))\big] \to \frac12 E\big[f(\mathcal N(0, 2k^2))\big] = \frac12\,E\Bigg|1 - \exp\Bigg(-\mathcal N\Big(\frac{k^2}{2\beta^2},\ \frac{k^2}{2\beta^2}\Big)\Bigg)\Bigg|$$
as $A \to \infty$, proving the result.
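The limit just obtained is explicit enough to evaluate numerically. A simple Monte Carlo sketch (our own illustration; the function name and default parameters are ours) samples $G \sim \mathcal N(m, m)$ with $m = k^2/(2\beta^2)$ and averages $|1 - e^{-G}|$:

```python
import math
import random

def limit_tv(k, beta, samples=100_000, seed=1):
    """Monte Carlo estimate of (1/2) E|1 - exp(-G)|, G ~ N(m, m),
    with m = k^2/(2 beta^2): the limiting total variation distance above."""
    m = k * k / (2.0 * beta * beta)
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(samples):
        g = rng.gauss(m, math.sqrt(m))  # mean m, standard deviation sqrt(m)
        acc += abs(1.0 - math.exp(-g))
    return 0.5 * acc / samples
```

As the formula suggests, the estimate decreases toward $0$ as $\beta$ grows (the near-uniform regime) and approaches $1/2$ as $\beta \to 0$.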
7. Proof of Theorem 2
We begin by noting that the total variation distance we seek is
$$d_{TV}\Bigg(\sum_{j=1}^n (D^\alpha_{n,j})^2,\ \sum_{j=1}^n (D^\infty_{n,j})^2\Bigg) = \frac12\sum_{s=k^2 n}^{k^2 n^2}\Bigg|\sum_{d_1^2+\cdots+d_n^2 = s}\binom{kn}{\mathbf d}\Bigg(\frac{\prod_{j=1}^n \alpha^{(d_j)}}{(\alpha n)^{(kn)}} - \frac{1}{n^{kn}}\Bigg)\Bigg|$$
$$= \frac12\sum_{s=k^2 n}^{k^2 n^2}\Bigg|\sum_{d_1^2+\cdots+d_n^2 = s}\binom{kn}{\mathbf d}\,\frac{\prod_j \alpha^{(d_j)}}{(\alpha n)^{(kn)}}\Bigg(1 - \frac{(\alpha n)^{(kn)}}{n^{kn}\prod_j \alpha^{(d_j)}}\Bigg)\Bigg|.$$
This differs from the distance between $M^\alpha_{n,k}$ and $M^\infty_{n,k}$ only in the placement of the absolute values: were the absolute values inside the inner sum, the two distances would be exactly the same. So, our task is to show that the triangle inequality is asymptotically sharp. This requires finding analogs of Propositions 1 and 2; the rest of the proof of Theorem 1(iii) then transfers over directly.

Let $\mathcal A_n = \mathcal A_n(A)$ and $\mathcal A'_n = \mathcal A'_n(A)$ be as in Section 6; let $\mathcal A'_{n,s}$ be the set of all $\mathbf d \in \mathcal A'_n$ with $d_1^2+\cdots+d_n^2 = s$. If $\mathcal B_{n,s}$ denotes the set of all valid $\mathbf d$ with $d_1^2+\cdots+d_n^2 = s$ but $\mathbf d \notin \mathcal A'_n$, then the minor contribution to the total variation distance is
$$\sum_s\Bigg|\sum_{\mathbf d \in \mathcal B_{n,s}}\binom{kn}{\mathbf d}\,\frac{\prod_j \alpha^{(d_j)}}{(\alpha n)^{(kn)}}\Bigg(1 - \frac{(\alpha n)^{(kn)}}{n^{kn}\prod_j \alpha^{(d_j)}}\Bigg)\Bigg| \le \sum_{\mathbf d \notin \mathcal A'_n}\binom{kn}{\mathbf d}\,\frac{\prod_j \alpha^{(d_j)}}{(\alpha n)^{(kn)}}\Bigg|1 - \frac{(\alpha n)^{(kn)}}{n^{kn}\prod_j \alpha^{(d_j)}}\Bigg|;$$
thus, by Proposition 1,
$$\limsup_{n\to\infty}\sum_s\Bigg|\sum_{\mathbf d \in \mathcal B_{n,s}}\binom{kn}{\mathbf d}\,\frac{\prod_j \alpha^{(d_j)}}{(\alpha n)^{(kn)}}\Bigg(1 - \frac{(\alpha n)^{(kn)}}{n^{kn}\prod_j \alpha^{(d_j)}}\Bigg)\Bigg| \le \frac{C}{(A - 2k^2/\beta)^2}.$$
To handle the major contribution, note that, as in the proof of Proposition 2, and (6.11) in particular,
$$1 - \frac{(\alpha n)^{(kn)}}{n^{kn}\prod_{j=1}^n \alpha^{(d_j)}} = 1 - \exp\Big(-\frac{k^2}{2\beta^2} - \frac{x(\mathbf d)}{2\beta}\Big) + O(\omega n^{-1/2}) \qquad (7.1)$$
uniformly over $\mathbf d \in \mathcal A'_n$, where $x(\mathbf d) = n^{-1/2}\big(\sum_j d_j^2 - nE[(Z_{n,1})^2]\big)$.
Notably, $x(\mathbf d)$, and therefore the major contribution in (7.1), is determined entirely by $d_1^2+\cdots+d_n^2$. This allows us to write the major contribution as
$$\sum_s\Bigg|\sum_{\mathbf d \in \mathcal A'_{n,s}}\binom{kn}{\mathbf d}\,\frac{\prod_j \alpha^{(d_j)}}{(\alpha n)^{(kn)}}\Bigg(1 - \exp\Big(-\frac{k^2}{2\beta^2} - \frac{s - nE[(Z_{n,1})^2]}{2\beta\sqrt n}\Big)\Bigg)\Bigg| + O(\omega n^{-1/2}), \qquad (7.2)$$
where the outer summation is over $|s - nE[(Z_{n,1})^2]| < A\sqrt n$. Note that the signs of the terms in the inner summation have no dependence on $\mathbf d$ other than through $s$. So, for a given $s$, either all terms are positive or all terms are negative. So, there is no cancellation in this new form of the sum, and the triangle inequality is actually an equality. This allows us to rewrite (7.2) as
$$\sum_s\sum_{\mathbf d \in \mathcal A'_{n,s}}\binom{kn}{\mathbf d}\,\frac{\prod_j \alpha^{(d_j)}}{(\alpha n)^{(kn)}}\Bigg|1 - \exp\Big(-\frac{k^2}{2\beta^2} - \frac{s - nE[(Z_{n,1})^2]}{2\beta\sqrt n}\Big)\Bigg| + O(\omega n^{-1/2})$$
$$= \sum_{\mathbf d \in \mathcal A'_n}\binom{kn}{\mathbf d}\,\frac{\prod_j \alpha^{(d_j)}}{(\alpha n)^{(kn)}}\Bigg|1 - \exp\Big(-\frac{k^2}{2\beta^2} - \frac{x(\mathbf d)}{2\beta}\Big)\Bigg| + O(\omega n^{-1/2})$$
$$= E\big[f(S_n)\,\mathbb 1_{\{\mathbf D^\alpha_n \in \mathcal A'_n\}}\big] + O\Big(\frac{\omega}{\sqrt n}\Big).$$
This is precisely the expression in the conclusion of Proposition 2. From here, the rest of the proof of Theorem 1(iii) applies directly.
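The sign-coherence argument above reflects a general data-processing property of total variation: mapping both distributions through a statistic $T$ can only decrease the distance, with equality precisely when the difference of the two mass functions keeps a constant sign on every fiber of $T$. A small self-contained toy illustration (the two distributions below are ours, chosen for brevity; they are not the ones studied in the paper):

```python
from collections import defaultdict
from itertools import product

def tv(p, q):
    """Total variation distance between two finitely supported distributions,
    given as dicts mapping outcome -> probability."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)

def pushforward(p, T):
    """Distribution of T(X) when X ~ p."""
    out = defaultdict(float)
    for x, px in p.items():
        out[T(x)] += px
    return dict(out)

# Toy example: pairs (d1, d2) in {0,1,2}^2, with the statistic
# T(d) = d1^2 + d2^2 mirroring the sum of squared in-degrees.
states = list(product(range(3), repeat=2))
p = {d: 1.0 / 9.0 for d in states}                   # "uniform" toy model
q = {d: (1.0 + d[0] * d[1]) / 18.0 for d in states}  # toy "popularity" bias

def T(d):
    return d[0] ** 2 + d[1] ** 2

tv_full = tv(p, q)
tv_push = tv(pushforward(p, T), pushforward(q, T))
assert tv_push <= tv_full + 1e-12  # data-processing inequality for d_TV
```

In this particular toy example both distances equal $5/18$: the difference $p - q$ happens to keep a constant sign on each level set of $T$, so no cancellation occurs, exactly the phenomenon exploited in the proof above.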
8. Afterword
At the start of Section 2, we introduced two one-arc-at-a-time processes that terminate in the random $k$-out mapping $M^\alpha_{n,k}$ after $kn$ steps: in one, we place a fixed order on the out-arcs to be chosen, and choose their images in this order; in the other, at each step we choose a vertex uniformly at random from the currently unsaturated vertices, and choose its image. In this paper, our focus was on the total variation distance between the terminal snapshots $M^\alpha_{n,k}$ and $M^\infty_{n,k}$; a natural follow-up question is: how does the total variation distance between the two processes, for $\alpha = \alpha(n) < \infty$ and $\alpha = \infty$, depend on $\alpha(n)$? What is the threshold behavior for uniformity?

In the fixed-order case, it is unsurprising that the total variation distance between the processes exactly matches that between the terminal snapshots: if we know the throwing order and the terminal snapshot, then we know the entire process. We might suspect that the threshold for the randomly-ordered process is actually higher than $\alpha = \Theta(\sqrt n)$, as this process contains more information than the terminal snapshot; however, this is not the case, as a consequence of the fact that the terminal snapshot is independent of the throwing order.

Analysis of the total variation distance from a uniform distribution, between two terminal snapshots and/or two processes, could prove an interesting challenge for other preferential attachment models, such as the process $\{G_\alpha(n, M)\}$ studied by Pittel [19]. Since the distribution of $G_\alpha(n, M)$ is not directly accessible, a good first step might be the relaxed (multigraph) process $MG_\alpha(n, M)$. Our suspicion is that there are different thresholds for uniformity of the terminal snapshot and of the entire process, the threshold for the entire process being the larger.
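For readers who want to experiment with the questions raised here, the fixed-order one-arc-at-a-time process is straightforward to simulate. The sketch below is our own code (the function name and parameter choices are ours, not the paper's); it generates the in-degree sequence of the terminal snapshot $M^\alpha_{n,k}$ by inserting $kn$ arcs, each choosing its image with probability proportional to $\alpha$ plus the image's current in-degree:

```python
import random

def pref_k_out_in_degrees(n, k, alpha, rng=None):
    """Fixed-order one-arc-at-a-time process: kn arcs are inserted one by one,
    and each arc chooses its image j with probability proportional to
    alpha + (number of times j has been chosen so far).  Returns the
    in-degree sequence of the terminal snapshot."""
    rng = rng or random.Random()
    indeg = [0] * n
    for step in range(k * n):
        # Total weight is alpha*n plus the number of arcs already placed.
        r = rng.uniform(0.0, alpha * n + step)
        acc = 0.0
        for j in range(n):
            acc += alpha + indeg[j]
            if r <= acc:
                indeg[j] += 1
                break
        else:  # guard against floating-point round-off at the right edge
            indeg[-1] += 1
    return indeg

degs = pref_k_out_in_degrees(n=50, k=2, alpha=5.0, rng=random.Random(7))
assert sum(degs) == 2 * 50  # exactly kn arcs were placed
```

Comparing the empirical law of, say, $\sum_j D_j^2$ under a given $\alpha(n)$ with the $\alpha = \infty$ (uniform image) case gives a quick empirical sense of the $\alpha = \Theta(\sqrt n)$ threshold.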
We are grateful to Huseyin Acan, Dan Poole, and Chris Ross for many stimulating and probing discussions of this study during brainstorming meetings of the combinatorial probability working group at The Ohio State University. We would also like to thank the anonymous referee of this work for their helpful critical comments.
References

[1] Barabási, A.-L. and Albert, R. (1999). Emergence of scaling in random networks. Science.
[2] Bergeron, F., Flajolet, P. and Salvy, B. (1992). Varieties of increasing trees. In CAAP '92, ed. J.-C. Raoult, vol. 581 of Lecture Notes in Computer Science. Springer, Berlin Heidelberg, pp. 24-48.
[3] Bhattacharya, R. N. and Rao, R. (1976). Normal Approximation and Asymptotic Expansions. Wiley Series in Probability and Mathematical Statistics. Wiley.
[4] Bollobás, B., Borgs, C., Chayes, J. and Riordan, O. (2003). Directed scale-free graphs. In Proc. 14th ACM-SIAM Symposium on Discrete Algorithms, pp. 132-139.
[5] Bollobás, B. and Riordan, O. M. (2004). The diameter of a scale-free random graph. Combinatorica.
[6] Bollobás, B., Riordan, O. M., Spencer, J. and Tusnády, G. (2001). The degree sequence of a scale-free random graph process. Random Struct. Algorithms.
[7] Buckley, P. and Osthus, D. (2004). Popularity based random graph models leading to a scale-free degree sequence. Discrete Math.
[8] Burtin, Y. D. (1980). On a simple formula for random mappings and its applications. J. Appl. Probab.
[9] Deijfen, M. (2010). Random networks with preferential growth and vertex death. J. Appl. Probab.
[10] Durrett, R. (2005). Probability: Theory and Examples.
[11] Erdős, P. and Rényi, A. (1960). On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 5, 17-61.
[12] Gertsbakh, I. B. (1977). Epidemic process on a random graph: Some preliminary results. J. Appl. Probab.
[13] Hansen, J. C. and Jaworski, J. (2008). Local properties of random mappings with exchangeable in-degrees. Adv. in Appl. Probab.
[14] Hansen, J. C. and Jaworski, J. (2008). Random mappings with exchangeable in-degrees. Random Struct. Algorithms.
[15] Hansen, J. C. and Jaworski, J. (2009). A random mapping with preferential attachment. Random Struct. Algorithms.
[16] Mahmoud, H. M., Smythe, R. T. and Szymański, J. (1993). On the structure of random plane-oriented recursive trees and their branches. Random Struct. Algorithms.
[17] Pittel, B. (1983). On distributions related to transitive closures of random finite mappings. Ann. Probab.
[18] Pittel, B. (1994). Note on the heights of random recursive trees and random m-ary search trees. Random Struct. Algorithms.
[19] Pittel, B. (2010). On a random graph evolving by degrees. Advances in Math. 223.