Strong couplings for static locally tree-like random graphs
Mariana Olvera-Cravioto
University of North Carolina at Chapel Hill
Abstract
The goal of this paper is to provide a general purpose result for the coupling of exploration processes of random graphs, both undirected and directed, with their local weak limits when this limit is a marked Galton-Watson process. This class includes in particular the configuration model and the family of inhomogeneous random graphs with rank-1 kernel. Vertices in the graph are allowed to have attributes on a general separable metric space and can potentially influence the construction of the graph itself. The coupling holds for any fixed depth of a breadth-first exploration process.
Keywords:
Random graphs, complex networks, Galton-Watson processes, configuration model, inhomogeneous random graph, local weak convergence.
There is a growing literature of problems in physics, mathematics, computer science and operations research that are set up as processes, random or not, on large sparse graphs. The range of problems being studied is wide, and includes problems related to the classification, sorting, and ranking of large networks, as well as the analysis of Markov chains and interacting particle systems on graphs. Popular among the types of graphs used for these purposes are the locally tree-like random graph models, such as the configuration model and the inhomogeneous random graph family (which includes the classical Erdős-Rényi model). These random graph models are quite versatile in the types of graphs they can mimic, and have important mathematical properties that make their analysis tractable.

In particular, the mathematical tractability of locally tree-like random graphs comes from the fact that their local neighborhoods resemble trees. This property makes it easy to transfer questions about the process of interest on a graph to the often easier analysis of the process on the limiting tree. Mathematically, this transfer is enabled by the notion of local weak convergence [1, 2, 4, 16]. However, as is the case for many problems involving usual weak convergence of random variables, it is often desirable to construct the original set of random variables and their corresponding weak limits on the same probability space; in other words, to have a coupling. In addition, many problems studying processes on graphs require that we keep track of additional vertex attributes not usually included in the local weak limits, attributes that may not be discrete.
The results in this paper were designed to solve these two problems simultaneously, by providing a general purpose coupling between the exploration of the neighborhood of a uniformly chosen vertex in a locally tree-like graph and its local weak limit, including general vertex attributes that may indirectly influence the construction of the graph.

The main results focus only on the two families of random graph models that are known to converge, in the local weak sense, to a marked Galton-Watson process. It is worth mentioning that other locally tree-like graphs like the preferential attachment models do not fall into this category, since their local weak limits are continuous-time branching processes. In particular, we focus on random graphs constructed according to either a configuration model or any of the inhomogeneous random graph models with rank-1 kernels (see Sections 1.1 and 1.2 for the precise definitions). Our results include both undirected and directed graphs, and are given under minimal moment conditions. In particular, under our assumptions, it is possible for the offspring distribution in the limiting marked Galton-Watson process to have infinite mean, and in the directed case, for the limiting joint distribution of the in-degree and out-degree of a vertex to have infinite covariance.

Before describing the two families of random graph models for which our coupling theorems hold, we will introduce some definitions that will be used throughout the paper. We will use G(V_n, E_n) to denote a graph on the set of vertices V_n = {1, 2, . . . , n} and having edges on the set E_n. A directed edge from vertex i to vertex j is denoted by (i, j). If the graph is undirected, we simply ignore the direction and take E_n ⊆ {(i, j) : i, j ∈ V_n, i < j}. In the undirected case, we use D_i to denote the degree of vertex i, which corresponds to the number of adjacent neighbors of vertex i.
In the directed case, we use D_i^- to denote the in-degree of vertex i and D_i^+ to denote its out-degree; the in-degree counts the number of inbound neighbors while the out-degree counts the number of outbound ones. All our results are given in terms of the large graph limit, which corresponds to taking a sequence of graphs {G(V_n, E_n) : n ≥ 1} and taking the limit as |V_n| = n → ∞, where |A| denotes the cardinality of set A. Both the configuration model and the family of inhomogeneous random graphs are meant to model large static graphs, since there may be no relation between G(V_n, E_n) and G(V_m, E_m) for n ≥ m. Strong couplings for evolving graphs such as the preferential attachment models are a topic for future work.

The configuration model [5, 25] produces graphs from any prescribed (graphical) degree sequence. In the undirected version of this model, each vertex is assigned a number of stubs or half-edges equal to its target degree. Then, these half-edges are randomly paired to create edges in the graph. For an undirected configuration model (CM), we assume that each vertex i ∈ V_n is assigned an attribute vector a_i = (D_i, b_i), where D_i ∈ N is its degree, and b_i ∈ S′ encodes additional information about vertex i that does not directly affect the construction of the graph but may depend on D_i. For the sequence {D_i : 1 ≤ i ≤ n} to define the degree sequence of an undirected graph, we must have that

L_n := Σ_{i=1}^n D_i

be even. Note that this may require us to consider a double sequence {a_i^(n) : i ≥ 1, n ≥ 1} rather than a unique sequence, i.e., one where a_i^(n) ≠ a_i^(m) for n ≠ m.

Assuming that L_n is even, enumerate all the stubs, and pick one stub to pair; suppose the stub belongs to vertex i.
Next, choose uniformly at random one of the remaining L_n − 1 stubs; if the chosen stub belongs to vertex j, draw an edge between vertices i and j, and pick another stub to pair. In general, a stub being paired chooses uniformly at random from the set of unpaired stubs, then identifies the vertex to which the chosen stub belongs, and creates an edge between its vertex and the one to which the chosen stub belongs.

The directed version of the configuration model (DCM) is such that each vertex i ∈ V_n is assigned an attribute of the form a_i = (D_i^-, D_i^+, b_i) ∈ N² × S′. Similarly to the undirected case, D_i^- and D_i^+ denote the in-degree and the out-degree, respectively, of vertex i, and the b_i is allowed to depend on (D_i^-, D_i^+). The condition needed to ensure we can draw a graph is now:

L_n := Σ_{i=1}^n D_i^+ = Σ_{i=1}^n D_i^-,

which again may require us to consider a double sequence {a_i^(n) : i ≥ 1, n ≥ 1}.

As for the CM, we give to each vertex i a number D_i^- of inbound stubs and a number D_i^+ of outbound stubs. To construct the graph, we start by choosing an inbound (outbound) stub, say belonging to vertex i, and choose uniformly at random one of the L_n outbound (inbound) stubs. If the chosen stub belongs to vertex j, draw an edge from j to i (from i to j); then pick another inbound (outbound) stub to pair. In general, when pairing an inbound (outbound) stub, we pick uniformly at random from all the remaining unpaired outbound (inbound) stubs. If the stub being paired belongs to vertex i, and the one to which the chosen stub belongs is j, we draw a directed edge from j to i (from i to j).

We emphasize that both the CM and the DCM are in general multigraphs, that is, they can have self-loops and multiple edges (in the same direction) between a given pair of vertices. However, provided the pairing process does not create self-loops or multiple edges, the resulting graph is uniformly chosen among all graphs having the prescribed degree sequence.
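As a concrete illustration of the pairing procedure, the following sketch builds an undirected CM by shuffling the stub list and pairing adjacent entries, which is equivalent in distribution to the sequential uniform pairing described above. The function name `configuration_model` and the toy degree sequence are our own; as noted above, the result may contain self-loops and multi-edges.

```python
import random

def configuration_model(degrees, rng=None):
    """Pair stubs uniformly at random to build an undirected multigraph.

    `degrees` is the prescribed degree sequence; its sum L_n must be even.
    Returns a list of edges (i, j); self-loops and multi-edges may occur.
    """
    rng = rng or random.Random()
    if sum(degrees) % 2 != 0:
        raise ValueError("sum of degrees must be even")
    # One entry per stub, labeled by the vertex it belongs to.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    # A uniformly random pairing = shuffling the stub list and
    # pairing consecutive entries.
    rng.shuffle(stubs)
    return list(zip(stubs[::2], stubs[1::2]))

edges = configuration_model([3, 2, 2, 1], random.Random(0))
# Each vertex appears in exactly as many edge endpoints as its degree.
```

The shuffle-then-pair construction avoids the explicit sequential sampling loop but produces the same (uniform) distribution over pairings.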
It is well known that when the limiting degree distribution has finite second moments, the pairing process results in a simple graph with a probability that remains bounded away from zero even as the graph grows [25, 10].

We will use F_n = σ(a_i : 1 ≤ i ≤ n) to denote the sigma-algebra generated by the attribute sequence, which does not include the edge structure of the graph. To simplify the notation, we will use P_n(·) = P(· | F_n) and E_n[·] = E[· | F_n] to denote the conditional probability and conditional expectation, respectively, given F_n.

The second class of random graph models we consider is the family of inhomogeneous random graphs (digraphs), in which the presence of an edge is determined by the toss of a coin, independently of any other edge. This family includes the classical Erdős-Rényi graph [23, 17, 3, 18, 6, 15], but also several generalizations that allow the edge probabilities to depend on the two vertices being connected, e.g., the Chung-Lu model [11, 12, 13, 14, 20], the Norros-Reittu model (or Poissonian random graph) [21, 25, 24], and the generalized random graph [25, 8, 24], to name a few. Unlike the Erdős-Rényi model, these generalizations are capable of producing graphs with inhomogeneous degree sequences, and can mimic almost any degree distribution whose support is N (or N² in the directed case). This paper focuses only on inhomogeneous random graphs (digraphs) having rank-1 kernels (see [7]), which excludes models such as the stochastic block model.

To define an undirected inhomogeneous random graph (IR), assign to each vertex i ∈ V_n an attribute a_i = (W_i, b_i) ∈ R_+ × S′. The W_i will be used to determine how likely vertex i is to have neighbors, while the b_i can be used to include vertex characteristics that are not needed for the construction of the graph but that are allowed to depend on W_i.
If convenient, one can consider using a double sequence {a_i^(n) : i ≥ 1, n ≥ 1} as with the configuration model, but this is not as important since the sequence W_n := {W_i : 1 ≤ i ≤ n} does not need to satisfy any additional conditions in order for us to draw the graph.

We will use the same notation F_n = σ(a_i : 1 ≤ i ≤ n), as for the configuration model, to denote the sigma-algebra generated by the vertex attributes, as well as the notation for the corresponding conditional probability, P_n(·) = P(· | F_n), and expectation, E_n[·] = E[· | F_n].

For the IR, the edge probabilities are given by:

p_ij^(n) := P_n((i, j) ∈ E_n) = 1 ∧ (W_i W_j / (θn))(1 + φ_n(W_i, W_j)), 1 ≤ i < j ≤ n,

where −1 < φ_n(W_i, W_j) = φ(n, W_i, W_j, W_n) a.s. is a function that may depend on the entire sequence W_n, on the types of the vertices {i, j}, or exclusively on n, and 0 < θ < ∞ satisfies

(1/n) Σ_{i=1}^n W_i −→P θ, as n → ∞.

Here and in the sequel, x ∧ y = min{x, y} and x ∨ y = max{x, y}. Since the graph is to be simple by construction, p_ii^(n) ≡ 0 for all i ∈ V_n.

For the directed version, which we refer to as an inhomogeneous random digraph (IRD), the vertex attributes take the form a_i = (W_i^-, W_i^+, b_i) ∈ R_+² × S′. The parameter W_i^- controls the in-degree of vertex i, and W_i^+ its out-degree. If we write W_i = (W_i^-, W_i^+), the edge probabilities in the IRD are given by:

p_ij^(n) := P_n((i, j) ∈ E_n) = 1 ∧ (W_i^+ W_j^- / (θn))(1 + φ_n(W_i, W_j)), 1 ≤ i ≠ j ≤ n,

where −1 < φ_n(W_i, W_j) = φ(n, W_i, W_j, W_n) a.s. is a function that may depend on the entire sequence W_n := {W_i : 1 ≤ i ≤ n}, on the types of the vertices {i, j}, or exclusively on n, and 0 < θ < ∞ satisfies

(1/n) Σ_{i=1}^n (W_i^- + W_i^+) −→P θ, as n → ∞.

Since the graphs are again simple by construction, we have p_ii^(n) ≡ 0 for all i ∈ V_n.
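For concreteness, the undirected edge probabilities above (taking φ_n ≡ 0) can be sampled as in the following sketch; the helper name `rank_one_graph` and the toy weight sequence are assumptions made for illustration.

```python
import random

def rank_one_graph(W, theta, rng=None):
    """Sample an undirected inhomogeneous random graph with rank-1 kernel.

    Edge (i, j), i < j, is present independently with probability
    min(1, W[i] * W[j] / (theta * n)); here phi_n is taken to be 0.
    """
    rng = rng or random.Random()
    n = len(W)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = min(1.0, W[i] * W[j] / (theta * n))
            if rng.random() < p:  # independent coin toss per pair
                edges.append((i, j))
    return edges

W = [1.0, 2.0, 0.5, 3.0]
theta = sum(W) / len(W)  # theta is the limit of (1/n) * sum of W_i
edges = rank_one_graph(W, theta, random.Random(1))
```

Since only pairs with i < j are considered, the resulting graph is simple by construction, matching the convention p_ii^(n) ≡ 0.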
For an undirected graph constructed according to one of the two models (CM or IR), our main result shows that there exists a coupling between the breadth-first exploration of the component of a uniformly chosen vertex and that of the root node of a marked Galton-Watson process. Before we can state the theorem, we need to introduce some notation on the graph and describe the Galton-Watson process that describes its local weak limit.

Each vertex i in an undirected graph G(V_n, E_n) is given a vertex attribute of the form:

a_i = (D_i, b_i) if G(V_n, E_n) is a CM, or a_i = (W_i, b_i) if G(V_n, E_n) is an IR.

In addition, define for each vertex i its full mark:

X_i = (D_i, a_i),

where D_i is the degree of vertex i. We point out that the definition of X_i is redundant when the graph is a CM; however, it is not so if the graph is an IR. In both cases the vertex attributes are measurable with respect to F_n, while the full marks are not if the graph is an IR.

The main assumption needed for the coupling to hold is given in terms of the empirical measure for the vertex attributes, i.e.,

υ_n(·) = (1/n) Σ_{i=1}^n 1(a_i ∈ ·).   (2.1)

In order to state the assumption, recall that the state space for the vertex attributes, S′, is assumed to be a separable metric space under metric ρ′. Now define the metric

ρ(x, y) = |x_1 − y_1| + |x_2 − y_2| + ρ′(x_3, y_3), x = (x_1, x_2, x_3), y = (y_1, y_2, y_3),

on the space S := N × R_+ × S′, which makes S a separable metric space as well. Using ρ, and for any probability measures ν_n, µ_n on the conditional probability space (S, F_n, P_n), define the Wasserstein metric of order one

W_1(ν_n, µ_n) = inf { E_n[ρ(Ŷ, Y)] : law(Ŷ) = ν_n, law(Y) = µ_n }.

Assumption 2.1 (Undirected)
Let υ_n be defined according to (2.1), and suppose there exists a probability measure υ (different for each model) such that

W_1(υ_n, υ) −→P 0, as n → ∞.

In addition, assume that the following conditions hold:

A. In the CM, let (D, B) be distributed according to υ, and suppose there exists a non-random b ∈ S′ such that E[D + ρ′(B, b)] < ∞.

B. In the IR, let (W, B) be distributed according to υ, and suppose the following hold:

1. (1/n) Σ_{1 ≤ i ≠ j ≤ n} |p_ij^(n) − (r_ij^(n) ∧ 1)| −→P 0 as n → ∞, where r_ij^(n) = W_i W_j / (θn).

2. There exists a non-random b ∈ S′ such that E[W + ρ′(B, b)] < ∞.

We now describe the exploration process on G(V_n, E_n). To do this, let I ∈ V_n denote a uniformly chosen vertex in G(V_n, E_n); vertices are identified with their labels in {1, 2, . . . , n}. Define A_0 = {I}, and let A_k denote the set of vertices at hop distance k from I. Now write G_I^(k) to be the subgraph of G(V_n, E_n) consisting of the vertices in ∪_{r=0}^k A_r. We will also use the notation G_I^(k)(a) to refer to the graph G_I^(k) including all the attributes of its vertices.

Definition 2.2
We say that two graphs G(V, E) and G′(V′, E′) are isomorphic if there exists a bijection σ : V → V′ such that edge (i, j) ∈ E if and only if edge (σ(i), σ(j)) ∈ E′. If this is the case, we write G ≃ G′.

To describe the limit of G_I^(k) as n → ∞, we will construct a delayed marked Galton-Watson process, denoted T(A), using the measure υ in Assumption 2.1. The "delayed" refers to the fact that the root will, in general, have a different distribution than all other nodes in the tree.

To start, let U := ∪_{k=0}^∞ N_+^k denote the set of labels for nodes in a tree, with the convention that N_+^0 := {∅} contains the root. For a label i = (i_1, . . . , i_k) we write |i| = k to denote its length, and use (i, j) = (i_1, . . . , i_k, j) to denote the index concatenation operation.

The tree T is constructed as follows. Let {(N_i, A_i) : i ∈ U} denote a sequence of independent vectors in S, with {(N_i, A_i) : i ∈ U, i ≠ ∅} i.i.d. For any i ∈ U, the N_i will denote the number of offspring of node i, and A_i will denote its attribute (mark). As with the graph, we will use the notation T to denote the tree without its attributes. Let A_0 = {∅} and recursively define

A_k = {(i, j) : i ∈ A_{k−1}, 1 ≤ j ≤ N_i}, k ≥ 1,

to be the kth generation of T. To match the notation on the graph, we write

X_∅ = (N_∅, A_∅), and X_i = (N_i + 1, A_i), i ≠ ∅.

The marked tree is then given by T(A) = {X_i : i ∈ T}. We will denote T^(k) (T^(k)(A)) to be the restriction of T (T(A)) to its first k generations.

It only remains to identify the distribution of X_i, for both i = ∅ and i ≠ ∅, in terms of the probability measure υ in Assumption 2.1. For a CM, let A = (D, B) be distributed according to υ; then,

P(X_∅ ∈ ·) = P((D, A) ∈ ·),
P(X_i ∈ ·) = E[D 1((D, A) ∈ ·)] / E[D], i ≠ ∅.
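The size-biasing of the offspring distribution away from the root can be illustrated with a small simulation. The sketch below, with hypothetical helpers `size_biased` and `delayed_gw_tree`, generates the first k generations of a delayed Galton-Watson tree in which the root draws from the degree law D and every other node has (size-biased D) − 1 children, as in the CM limit above; marks are omitted for brevity.

```python
import random
from collections import deque

def size_biased(pmf):
    """Size-bias a pmf {k: p_k} with respect to k: q_k = k * p_k / E[D]."""
    mean = sum(k * p for k, p in pmf.items())
    return {k: k * p / mean for k, p in pmf.items() if k > 0}

def sample(pmf, rng):
    """Inverse-CDF sampling from a finite pmf."""
    u, acc = rng.random(), 0.0
    for k, p in sorted(pmf.items()):
        acc += p
        if u < acc:
            return k
    return max(pmf)

def delayed_gw_tree(pmf, depth, rng=None):
    """Generate a delayed Galton-Watson tree up to the given depth.

    The root has offspring distribution pmf (the degree law D); every
    other node has (size-biased D) - 1 children.  Returns a dict mapping
    each node label (a tuple, () for the root) to its offspring count.
    """
    rng = rng or random.Random()
    sb = size_biased(pmf)
    children = {(): sample(pmf, rng)}       # () labels the root
    queue = deque([((), 0)])
    while queue:
        node, d = queue.popleft()
        if d == depth:
            continue                         # do not expand past depth k
        for j in range(1, children[node] + 1):
            child = node + (j,)
            children[child] = sample(sb, rng) - 1
            queue.append((child, d + 1))
    return children

tree = delayed_gw_tree({1: 0.5, 3: 0.5}, depth=2, rng=random.Random(2))
```

Labels follow the Ulam-Harris convention used in the text: node (i, j) is the jth child of node i.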
For an IR, let A = (W, B) be distributed according to υ; then,

P(X_∅ ∈ ·) = P((D, A) ∈ ·),
P(X_i ∈ ·) = E[W 1((D + 1, A) ∈ ·)] / E[W], i ≠ ∅,

where D is a mixed Poisson random variable with mean W. Note that the distribution of X_i for i ≠ ∅ corresponds to a size-biased version of the distribution of X_∅ with respect to its first coordinate.

We are now ready to state the main coupling theorem for undirected graphs.

Theorem 2.3 Suppose G(V_n, E_n) is either a CM or an IR satisfying Assumption 2.1. Then, for G_I^(k)(a) the depth-k neighborhood of a uniformly chosen vertex I ∈ V_n, there exists a marked Galton-Watson tree T^(k)(A) restricted to its first k generations, whose root corresponds to vertex I, and such that for any fixed k ≥ 1,

P_n( G_I^(k)
≄ T^(k) ) −→P 0, as n → ∞,

and if we let σ(i) ∈ V_n denote the vertex in the graph corresponding to node i ∈ T^(k), then, for any ε > 0,

E_n[ρ(X_I, X_∅)] −→P 0 and P_n( ∩_{r=0}^k ∩_{i ∈ A_r} {ρ(X_σ(i), X_i) ≤ ε}, G_I^(k) ≃ T^(k) ) −→P 1, as n → ∞.

In the directed case, our main result will allow us to couple the breadth-first exploration of either the in-component or the out-component of a uniformly chosen vertex. Since the two cases are clearly symmetric, we state our results only for the in-component.

As with the undirected graph, each vertex i in the graph G(V_n, E_n) has an attribute:

a_i = (D_i^-, D_i^+, b_i) if G(V_n, E_n) is a DCM, or a_i = (W_i^-, W_i^+, b_i) if G(V_n, E_n) is an IRD.

The full mark of vertex i is now given by:

X_i = (D_i^-, D_i^+, a_i),

where D_i^- and D_i^+ are the in-degree and out-degree, respectively, of vertex i.

With some abuse of notation, we use again υ_n, as defined in (2.1), to denote the empirical measure for the vertex attributes. However, the state space for the full marks is now S := N² × R_+² × S′, equipped with the metric:

ρ(x, y) = |x_1 − y_1| + |x_2 − y_2| + |x_3 − y_3| + |x_4 − y_4| + ρ′(x_5, y_5),

for x = (x_1, x_2, x_3, x_4, x_5) and y = (y_1, y_2, y_3, y_4, y_5). The Wasserstein metric W_1 defined on the conditional probability space (S, F_n, P_n) remains the same after the adjustments made to S and ρ.

Assumption 3.1 (Directed)
Let υ_n be defined according to (2.1), and suppose there exists a probability measure υ (different for each model) such that

W_1(υ_n, υ) −→P 0, as n → ∞.

In addition, assume that the following conditions hold:

A. In the DCM, let (D^-, D^+, B) be distributed according to υ, and suppose there exists a non-random b ∈ S′ such that E[D^- + D^+ + ρ′(B, b)] < ∞.

B. In the IRD, let (W^-, W^+, B) be distributed according to υ, and suppose the following hold:

1. (1/n) Σ_{1 ≤ i ≠ j ≤ n} |p_ij^(n) − (r_ij^(n) ∧ 1)| −→P 0 as n → ∞, where r_ij^(n) = W_i^+ W_j^- / (θn).

2. There exists a non-random b ∈ S′ such that E[W^- + W^+ + ρ′(B, b)] < ∞.

Since we will state our result for the exploration of the in-component of a uniformly chosen vertex, the structure of the coupled tree will be determined by the vertices that we encounter during a breadth-first exploration. This exploration starts with a uniformly chosen vertex I ∈ V_n, which is used to create the set A_0 = {I}. It then follows all the inbound edges of I to discover all the vertices at inbound distance one from I, which become the set A_1. In general, to identify the vertices in the set A_k, we explore all the inbound edges of vertices in A_{k−1}. As we perform the exploration, we also discover the out-degrees of the vertices we have encountered; however, we do not follow any outbound edges. We then define G_I^(k) to be the subgraph of G(V_n, E_n) whose vertex set is ∪_{r=0}^k A_r and whose edges are those that are encountered during the breadth-first exploration we described. The notation G_I^(k)(a) will be used to refer to the graph G_I^(k) including the values of the full marks {X_i} for all of its vertices.

In the directed case, the limit of G_I^(k) is again a delayed marked Galton-Watson process, with the convention that all its edges are pointing towards the root.
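The level-by-level in-component exploration described above can be sketched as follows; the helper name `in_component_bfs` and the toy edge list are illustrative assumptions.

```python
from collections import deque, defaultdict

def in_component_bfs(edges, start, depth):
    """Breadth-first exploration of the depth-k in-component of `start`.

    Follows edges backwards: level k collects the new vertices with a
    directed path of length k *to* the start vertex, discovered level
    by level; outbound edges are never followed.
    """
    inbound = defaultdict(list)           # target -> list of sources
    for i, j in edges:                    # edge (i, j) points from i to j
        inbound[j].append(i)
    levels = [{start}]
    seen = {start}
    for _ in range(depth):
        nxt = set()
        for v in levels[-1]:
            for u in inbound[v]:
                if u not in seen:         # only newly discovered vertices
                    seen.add(u)
                    nxt.add(u)
        levels.append(nxt)
    return levels

edges = [(1, 0), (2, 1), (3, 1), (4, 2)]
levels = in_component_bfs(edges, 0, 2)
# levels[1] = {1}, levels[2] = {2, 3}; vertex 4 is at inbound distance 3
```

The returned `levels` list plays the role of the sets A_0, A_1, . . . , A_k in the text.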
We will denote the tree T(A) as before; however, it will be constructed using a sequence of independent vectors of the form {(N_i, D_i, A_i) : i ∈ U}, with {(N_i, D_i, A_i) : i ∈ U, i ≠ ∅} i.i.d. In other words, the full marks now take the form

X_i = (N_i, D_i, A_i), i ∈ U.

The construction of the tree T is done as in the undirected case using the {N_i : i ∈ U}, and the marked tree is given by T(A) = {X_i : i ∈ T}. The notation T^(k) (T^(k)(A)) refers again to the restriction of T (T(A)) to its first k generations.

The distributions of the full marks X_i for both i = ∅ and i ≠ ∅ are also different than in the undirected case. For a DCM, let A = (D^-, D^+, B) be distributed according to υ; then

P(X_∅ ∈ ·) = P((D^-, D^+, A) ∈ ·),
P(X_i ∈ ·) = E[D^+ 1((D^-, D^+, A) ∈ ·)] / E[D^+], i ≠ ∅.

For an IRD, let A = (W^-, W^+, B) be distributed according to υ; then

P(X_∅ ∈ ·) = P((D^-, D^+, A) ∈ ·),
P(X_i ∈ ·) = E[W^+ 1((D^-, D^+ + 1, A) ∈ ·)] / E[W^+], i ≠ ∅,

where D^- and D^+ are conditionally independent (given (W^-, W^+)) Poisson random variables with means cW^- and (1 − c)W^+, respectively, and c = E[W^+] / E[W^- + W^+]. Note that in this case, the distribution of X_i for i ≠ ∅ corresponds to a size-biased version of the distribution of X_∅ with respect to its second coordinate.

The following is our main coupling theorem for directed graphs.

Theorem 3.2
Suppose G(V_n, E_n) is either a DCM or an IRD satisfying Assumption 3.1. Then, for G_I^(k)(a) the depth-k neighborhood of a uniformly chosen vertex I ∈ V_n, there exists a marked Galton-Watson tree T^(k)(A) restricted to its first k generations, whose root corresponds to vertex I, and such that for any fixed k ≥ 1,

P_n( G_I^(k)
≄ T^(k) ) −→P 0, as n → ∞,

and if we let σ(i) ∈ V_n denote the vertex in the graph corresponding to node i ∈ T^(k), then, for any ε > 0,

E_n[ρ(X_I, X_∅)] −→P 0 and P_n( ∩_{r=0}^k ∩_{i ∈ A_r} {ρ(X_σ(i), X_i) ≤ ε}, G_I^(k) ≃ T^(k) ) −→P 1, as n → ∞.

The remainder of the paper contains the proofs of Theorem 2.3 and Theorem 3.2.
The proofs of Theorem 2.3 and Theorem 3.2 are based on an intermediate coupling between the breadth-first exploration of the graph G(V_n, E_n) and a delayed marked Galton-Watson process whose offspring distribution and marks still depend on the filtration F_n. The proofs of Theorem 2.3 and Theorem 3.2 will be done by stating and proving this intermediate coupling first, and then coupling the intermediate tree, which we will denote T̂^(k)(Â), with T^(k)(A). Interestingly, the coupling between G_I^(k) and T̂^(k)(Â) will be perfect, in the sense that the vertex/node marks in each of the two graphs will also be identical to each other.

To organize the exposition, we will separate the undirected case from the directed one. Once the intermediate coupling theorems are proved, the coupling between the two trees can be done indistinctly for the undirected and directed cases (on the trees, the direction of the edges is irrelevant).

As mentioned above, the main difference between the intermediate tree and the limiting one lies in the distribution of the marks. As before, we start with the construction of the possibly infinite tree T̂, which is done with the conditionally independent (given F_n) sequence of random vectors in S, {(N̂_i, Â_i) : i ∈ U}, with {(N̂_i, Â_i) : i ∈ U, i ≠ ∅} conditionally i.i.d. Let Â_0 = {∅} and recursively define

Â_k = {(i, j) : i ∈ Â_{k−1}, 1 ≤ j ≤ N̂_i}, k ≥ 1.

Next, define the full marks according to:

X̂_∅ = (N̂_∅, Â_∅) and X̂_i = (N̂_i + 1, Â_i), i ≠ ∅,

and set T̂(Â) = {X̂_i : i ∈ T̂}.

For a CM, the distribution of the full marks is given by:

P_n(X̂_∅ ∈ ·) = (1/n) Σ_{i=1}^n 1((D_i, a_i) ∈ ·),
P_n(X̂_i ∈ ·) = Σ_{i=1}^n (D_i / L_n) 1((D_i, a_i) ∈ ·), i ≠ ∅.

For the IR model, first let {b_n} be a sequence such that b_n −→P ∞ and b_n/√n −→P 0 as n → ∞, and use it to define

W̄_i = W_i ∧ b_n and Λ_n = Σ_{i=1}^n W̄_i.
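A minimal sketch of how the truncated weights enter the intermediate tree for the IR: a non-root offspring count is drawn by picking a vertex i with probability W̄_i/Λ_n and then sampling a Poisson variable with mean Λ_nW̄_i/(θn), in line with the mark distributions described below. The helper name `intermediate_marks`, the inverse-CDF Poisson sampler, and the choice b_n = n^{1/4} (one admissible sequence with b_n → ∞ and b_n/√n → 0) are assumptions for illustration.

```python
import math
import random

def intermediate_marks(W, theta, rng=None):
    """Sample one non-root offspring count in the intermediate IR tree.

    Truncates weights at b_n = n ** 0.25 (an assumed admissible choice),
    picks a vertex i with probability Wbar_i / Lambda_n, then draws
    Poisson(Lambda_n * Wbar_i / (theta * n)).
    """
    rng = rng or random.Random()
    n = len(W)
    b_n = n ** 0.25
    Wbar = [min(w, b_n) for w in W]
    Lam = sum(Wbar)
    # Choose vertex i proportionally to Wbar_i (size biasing).
    u, acc, i = rng.random() * Lam, 0.0, 0
    for i, w in enumerate(Wbar):
        acc += w
        if u < acc:
            break
    mean = Lam * Wbar[i] / (theta * n)
    # Inverse-CDF sampling of a Poisson(mean) random variable.
    u, k, p = rng.random(), 0, math.exp(-mean)
    cdf = p
    while u >= cdf:
        k += 1
        p *= mean / k
        cdf += p
    return k

W = [1.0, 2.0, 0.5, 3.0]
theta = sum(W) / len(W)
samples = [intermediate_marks(W, theta, random.Random(s)) for s in range(200)]
```

The truncation at b_n keeps the Poisson means bounded while leaving the empirical weight distribution asymptotically unchanged.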
The marks on the coupled marked Galton-Watson process are given by:

P_n(X̂_∅ ∈ ·) = (1/n) Σ_{i=1}^n P((D_i, a_i) ∈ · | a_i),
P_n(X̂_i ∈ ·) = Σ_{i=1}^n (W̄_i / Λ_n) P((D_i + 1, a_i) ∈ · | a_i), i ≠ ∅,

where, conditionally on a_i, D_i is a Poisson random variable with mean Λ_n W̄_i / (θn).

We will also need to extend our definition of an isomorphism for marked graphs.

Definition 4.1
A graph G(V, E) is called a vertex-weighted graph if each of its vertices has a mark (weight) assigned to it. We say that two vertex-weighted graphs G(V, E) and G′(V′, E′) are isomorphic if there exists a bijection σ : V → V′ such that edge (i, j) ∈ E if and only if edge (σ(i), σ(j)) ∈ E′, and in addition, the marks of i and σ(i) are the same. If this is the case, we write G ≃ G′.

The intermediate coupling theorem is given below.
Theorem 4.2
Suppose G(V_n, E_n) is either a CM or an IR satisfying Assumption 2.1. Then, for G_I^(k)(a) the depth-k neighborhood of a uniformly chosen vertex I ∈ V_n, there exists a marked Galton-Watson tree T̂^(k)(Â) restricted to its first k generations, whose root corresponds to vertex I, and such that for any fixed k ≥ 1,

P_n( G_I^(k)(a) ≄ T̂^(k)(Â) ) −→P 0, as n → ∞.

The proof of Theorem 4.2 is given separately for the two models being considered, the CM and the IR.

4.1.1 Coupling for the configuration model
To explore the neighborhood of depth k of vertex I ∈ G(V_n, E_n) we start by labeling the set of L_n stubs in such a way that stubs {1, . . . , D_1} belong to vertex 1, stubs {D_1 + 1, . . . , D_1 + D_2} belong to vertex 2, and in general, stubs {D_1 + · · · + D_{m−1} + 1, . . . , D_1 + · · · + D_m} belong to vertex m. For any k ≥ 0, define:

A_k = set of vertices in G(V_n, E_n) at distance k from vertex I.
J_k = set of stubs belonging to vertices in A_k.
V_k = ∪_{r=0}^k A_r.
Â_k = set of nodes in T̂ at distance k from the root ∅.
V̂_k = ∪_{r=0}^k Â_r.

To do a breadth-first exploration of G(V_n, E_n) we start by selecting vertex I uniformly at random. Next, let J_0 denote the set of stubs belonging to vertex I and set A_0 = {I}. For k ≥
1, Step k in the exploration will identify all the stubs belonging to nodes in A_k.

Step k, k ≥ 1:

a. Initialize the sets A_k = J_k = ∅.
b. For each vertex i ∈ A_{k−1}:
   i. For each of the unpaired stubs of vertex i:
      1) Pick an unpaired stub of vertex i and sample uniformly at random a stub from the L_n available. If the chosen stub is the stub currently being paired or if it had already been paired, sample again until an unpaired stub is sampled.
      2) If the chosen stub belongs to vertex j, draw an edge between vertices i and j using the chosen stub. If vertex j had not yet been discovered, add it to A_k and add all of its unpaired stubs to J_k.

The exploration terminates at Step k if J_k = ∅, at which point the component of I will have been fully explored.

To couple the construction of T̂, initialize Â_0 = {∅}, identify ∅ with vertex I in G(V_n, E_n), and set N̂_∅ = D_I, Â_∅ = a_I. For k ≥
1, Step k in the construction will identify all the nodes in A k byadding nodes in agreement with the exploration of the graph. Each node that is added to the treewill have a number of stubs equal to the total number of stubs of the corresponding vertex, minusone (the one being used to create te edge), regardless of whether some of those stubs may alreadyhave been paired. Step k , k ≥ : a. Initialize the set ˆ A k = ∅ . 11. For each node i = ( i , . . . , i k − ) ∈ ˆ A k − :i. For each 1 ≤ r ≤ ˆ N i :1) Pick a stub uniformly at random from the L n available.2) If the chosen stub belongs to vertex j , then add node ( i , r ) to ˆ A k and set ˆ N ( i ,r ) = D j − ˆA ( i ,r ) = a j .This process will end in Step k if ˆ N i = 0 for all i ∈ ˆ A k , or it may continue indefinitely. Definition 4.3
We say that the coupling breaks in generation τ = k if:

• The first time we have to resample a stub in step (b)(i)(1) occurs while exploring a stub belonging to a vertex in A_{k−1}; or
• If, given that the above has not happened, a stub belonging to a vertex in A_{k−1} is paired with a stub belonging to a previously encountered vertex (this vertex could be in either A_{k−1} or the current set A_k).

Note:
The exploration of the component of depth k of vertex I in G(V_n, E_n) and the construction of the first k generations of the tree T̂ will be identical provided τ > k.

Proof of Theorem 4.2 for the CM.
From the observation made above, it suffices to show that the exploration of the k-neighborhood of vertex I does not require us to resample any stub in step (b)(i)(1) nor samples a stub belonging to a vertex that had already been discovered. To compute the probability of successfully completing k generations in T̂ before the coupling breaks, write:

P_n( G_I^(k)(a) ≄ T̂^(k)(Â) ) ≤ P_n(τ ≤ k).

The coupling breaks the first time we draw a stub belonging to a vertex that has already been explored: either a stub already paired, or one that is unpaired but already attached to the graph. The number of paired stubs when exploring a vertex in A_{r−1} is smaller or equal than 2 Σ_{j=1}^r |A_j| + |J_r|, which corresponds to two stubs each for the vertices at distance at most r of I and the unpaired stubs belonging to nodes in J_r. Note that up to the moment that the coupling breaks, we have |A_j| = |Â_j| for all 0 ≤ j ≤ r, and |J_r| = |Â_{r+1}|, so the probability that we break the coupling while exploring a vertex in A_{r−1} is smaller or equal than

P_r := (2/L_n) Σ_{j=1}^{r+1} |Â_j| ≤ 2|V̂_{r+1}|/L_n, r ≥ 1.

It follows that for any a_n > 0,

P_n(τ ≤ k) ≤ P_n(τ ≤ k, |V̂_{k+1}| ≤ a_n) + P_n(|V̂_{k+1}| > a_n)
≤ Σ_{r=1}^k P_n(τ = r, |V̂_{r+1}| ≤ a_n) + P_n(|V̂_{k+1}| > a_n)
≤ Σ_{r=1}^k P_n( Bin(|Â_{r−1}|, P_r) ≥ 1, |V̂_{r+1}| ≤ a_n ) + P_n(|V̂_{k+1}| > a_n)
≤ Σ_{r=1}^k P_n( Bin(a_n, 2a_n/L_n) ≥
1 ) + P_n(|V̂_{k+1}| > a_n) ≤ Σ_{r=1}^k 2a_n²/L_n + P_n(|V̂_{k+1}| > a_n),

where Bin(n, p) represents a binomial random variable with parameters (n, p). Hence, we have

P_n( G_I^(k)(a) ≄ T̂^(k)(Â) ) ≤ P_n(τ ≤ k) ≤ 2k a_n²/L_n + P_n(|V̂_{k+1}| > a_n).

To analyze the last probability we use the first part of Theorem 4.7 to obtain that for any fixed k ≥ 1 there exists a tree T^(k) of depth k, whose distribution does not depend on F_n, such that

P_n( T̂^(k)
≄ T^(k) ) −→P 0, as n → ∞.

Let |A_k| denote the size of the kth generation of that tree, define |V_{k+1}| = Σ_{j=0}^{k+1} |A_j|, and note that

P_n( G_I^(k)(a) ≄ T̂^(k)(Â) ) ≤ 2k a_n²/L_n + P(|V_{k+1}| > a_n) + P_n( T̂^(k)
≄ T^(k) ).

Choosing a_n −→P ∞ so that a_n²/n −→P 0 as n → ∞, and observing that |V_{k+1}| < ∞ a.s., completes the proof.

We will couple the exploration of the component of vertex I ∈ G(V_n, E_n) with a marked multi-type Galton-Watson process with n types, one for each vertex in G(V_n, E_n). A node of type i ∈ {1, 2, . . . , n} in the tree will have a Poisson number of offspring of type j with mean:

q_ij^(n) = W̄_i W̄_j / (θn), 1 ≤ j ≤ n.

Similarly as in the case of the CM, define:

A_k = set of vertices in G(V_n, E_n) at distance k from vertex I.
V_k = ∪_{r=0}^k A_r.
Â_k = set of nodes in T̂ at distance k from the root ∅.
B̂_k = set of types of nodes in Â_k.
V̂_k = ∪_{r=0}^k Â_r.

We will again do a breadth-first exploration of G(V_n, E_n) starting from a uniformly chosen vertex I. To start, let {U_ij : i, j ≥ 1} be a sequence of i.i.d. Uniform[0,
1] random variables, independentof F n . We will use this sequence of i.i.d. uniforms to realize the Bernoulli random variables thatdetermine the presence/absence of edges in G ( V n , E n ). Set A = { I } and initialize the set J = ∅ ;the set J will keep track of the vertices that have been fully explored (all its potential edges realized),and will coincide with V k − at the end of Step k . Step k , k ≥ : a. Initialize the set A k = ∅ .b. For each vertex i ∈ A k − :i. Sample X ij = 1( U ij > − p ( n ) ij ) for each j ∈ { , , . . . , n } \ J .ii. If X ij = 1 draw an edge between vertices i and j and add vertex j to A k .iii. Add vertex i to set J .The exploration terminates at the end of Step k if A k = ∅ , at which point the component of I willhave been fully explored.To couple the construction of ˆ T initialize ˆ A = {∅} and identify ∅ with vertex I in G ( V n , E n ) asbefore; let ˆ B = { I } . To construct the tree, we will sample for a node of type i a Poisson numberof offspring of type j for each j ∈ { , . . . , n } . To do this, let G ( · ; λ ) be the cumulative distributionfunction of a Poisson random variable with mean λ , and let G − ( u ; λ ) = inf { x ∈ R : G ( x ; λ ) ≥ u } denote its pseudoinverse. In order to keep the tree coupled with the exploration of the graph wewill use the same sequence of i.i.d. uniform random variables used to sample the edges in the graph.Initialize the set ˆ J = ∅ , which will keep track of the types that have appeared and whose offspringhave been sampled. The precise construction is given below: Step k , k ≥ : a. Initialize the sets ˆ A k = ˆ B k = ∅ .b. For each node i = ( i , . . . , i k − ) ∈ ˆ A k − :i. If i has type t / ∈ ˆ J :1) For each type j ∈ { , . . . , n } \ ˆ J let Z tj = G − ( U tj ; q ( n ) tj ), and create Z tj children oftype j for node i . If Z tj ≥
1, create Z tj children of type j for node i , each with nodeattribute equal to a j , and add j to set ˆ B k .2) For each type j ∈ ˆ J sample Z ∗ tj ∼ Poisson( q ( n ) tj ), independently of the sequence { U ij : i, j ≥ } and any other random variables. If Z ∗ tj ≥ Z ∗ tj children oftype j for node i , each with attribute equal to a j .14) Randomly shuffle all the children created in steps (b)(i)(1) and (b)(i)(2) and givethem labels of the form ( i , j ), then add the labeled nodes to set ˆ A k . The nodeattributes will be denoted ˆA ( i ,j ) = a j . (The shuffling avoids the label from providinginformation about its type).4) Add type t to set ˆ J .ii. If i has type t ∈ ˆ J :1) For each type j ∈ { , . . . , n } sample Z ∗ tj ∼ Poisson( q ( n ) tj ), independently of thesequence { U ij : i, j ≥ } and any other random variables; create Z ∗ tj children of type j for node i , each with attribute equal to a j .2) Randomly shuffle all the children created in step (b)(ii)(1) and give them labels ofthe form ( i , j ), attributes ˆA ( i ,j ) = a j , and add the labeled nodes to set ˆ A k .This construction may continue indefinitely, or may terminate at the end of Step k if ˆ A k = ∅ . Definition 4.4
We say that the coupling breaks in generation τ = k if for some node in Â_{k−1} either:

• In step (b)(i)(1) we have Z_{tj} ≠ X_{tj} for some j ∈ {1, ..., n} \ Ĵ;
• In step (b)(i)(1) we have Z_{tj} ≥ 1 for some j ∈ (B̂_{k−1} ∪ B̂_k) \ Ĵ, in which case a cycle or self-loop is created; or,
• In step (b)(i)(2) we have Z*_{tj} ≥ 1 for some j ∈ Ĵ.

We start by proving the following preliminary result. Throughout this section, let

Δ_n := ∫_0^1 |F_n^{−1}(u) − F^{−1}(u)| du ≤ W_1(υ_n, υ),

where F_n(x) = n^{−1} Σ_{j=1}^n 1(W_j ≤ x) and F(x) = P(W ≤ x). We also use the notation X_n = O_P(x_n) as n → ∞ to mean that there exists a random variable Y_n such that |X_n| ≤ Y_n and Y_n/x_n →_P K for some finite constant K.

Lemma 4.5
For any 1 ≤ i ≤ n we have

P_n( max_{1≤j≤n, j≠i} |X_{ji} − Z_{ji}| ≥ 1 ) ≤ min{ 1, 1(W_i > b_n) + E_n(i) + W̄_i η_n },

where

E_n(i) = Σ_{1≤j≤n, j≠i} | p_{ji}^{(n)} − (r_{ji}^{(n)} ∧ 1) |,
η_n = ( Δ_n + g(b_n) + (b_n²/(θn))(Δ_n + E[W]) )/θ,

and g(x) = E[(W − x)⁺].

Proof.
Let R_{ij} = 1(U_{ij} > 1 − r_{ij}^{(n)}), with r_{ij}^{(n)} = W_i W_j/(θn). The union bound gives

P_n( max_{1≤j≤n, j≠i} |X_{ij} − Z_{ij}| ≥ 1 ) ≤ 1(W_i > b_n) + 1(W_i ≤ b_n) Σ_{1≤j≤n, j≠i} P_n(|X_{ij} − Z_{ij}| ≥ 1).

Moreover,

P_n(|X_{ij} − Z_{ij}| ≥ 1) = P_n(|X_{ij} − Z_{ij}| ≥ 1, |X_{ij} − R_{ij}| ≥ 1) + P_n(|X_{ij} − Z_{ij}| ≥ 1, |X_{ij} − R_{ij}| = 0)
  ≤ P_n(|X_{ij} − R_{ij}| ≥ 1) + P_n(|R_{ij} − Z_{ij}| ≥ 1).

The first probability can be computed to be

P_n(|X_{ij} − R_{ij}| ≥ 1) = | p_{ij}^{(n)} − (r_{ij}^{(n)} ∧ 1) |.

To analyze the probability involving R_{ij} and Z_{ij}, note that

P_n(|R_{ij} − Z_{ij}| ≥ 1) = P_n(R_{ij} = 0, Z_{ij} ≥ 1) + P_n(R_{ij} = 1, Z_{ij} = 0) + P_n(R_{ij} = 1, Z_{ij} ≥ 2)
  = ( 1 − (1 ∧ r_{ij}^{(n)}) − e^{−q_{ij}^{(n)}} )⁺ + ( e^{−q_{ij}^{(n)}} − 1 + (1 ∧ r_{ij}^{(n)}) )⁺ + min{ 1 − e^{−q_{ij}^{(n)}}(1 + q_{ij}^{(n)}), 1 ∧ r_{ij}^{(n)} }
  = | 1 − (1 ∧ r_{ij}^{(n)}) − e^{−q_{ij}^{(n)}} | + min{ 1 ∧ r_{ij}^{(n)}, e^{−q_{ij}^{(n)}}( e^{q_{ij}^{(n)}} − 1 − q_{ij}^{(n)} ) }.

Now use the inequalities e^{−x} ≥ 1 − x, e^{−x} − 1 + x ≤ x²/2, and e^{x} − 1 − x ≤ x² e^{x}/2 for x ≥ 0, to obtain that

P_n(|R_{ij} − Z_{ij}| ≥ 1) ≤ ( r_{ij}^{(n)} − q_{ij}^{(n)} ) + | 1 − q_{ij}^{(n)} − e^{−q_{ij}^{(n)}} | + e^{−q_{ij}^{(n)}}( e^{q_{ij}^{(n)}} − 1 − q_{ij}^{(n)} )
  = ( r_{ij}^{(n)} − q_{ij}^{(n)} ) + ( e^{−q_{ij}^{(n)}} − 1 + q_{ij}^{(n)} ) + e^{−q_{ij}^{(n)}}( e^{q_{ij}^{(n)}} − 1 − q_{ij}^{(n)} )
  ≤ ( r_{ij}^{(n)} − q_{ij}^{(n)} ) + ( q_{ij}^{(n)} )².

It follows that

1(W_i ≤ b_n) Σ_{1≤j≤n, j≠i} P_n(|X_{ij} − Z_{ij}| ≥ 1)
  ≤ 1(W_i ≤ b_n) Σ_{1≤j≤n, j≠i} ( | p_{ij}^{(n)} − (r_{ij}^{(n)} ∧ 1) | + ( r_{ij}^{(n)} − q_{ij}^{(n)} ) + ( q_{ij}^{(n)} )² )
  ≤ E_n(i) + Σ_{1≤j≤n, j≠i} W̄_i(W_j − W̄_j)/(θn) + ( W̄_i/(θn) )² Σ_{1≤j≤n, j≠i} ( W̄_j )²
  ≤ E_n(i) + (W̄_i/(θn)) Σ_{j=1}^n (W_j − b_n)⁺ + ( W̄_i )² b_n Λ_n/(θn)².

To further bound the second term, note that if we let W(n) denote a random variable distributed according to F_n and W a random variable distributed according to F, then

(1/n) Σ_{j=1}^n (W_j − b_n)⁺ = E_n[ (W(n) − b_n)⁺ ] ≤ Δ_n + E[ (W − b_n)⁺ ] = Δ_n + g(b_n).

For the third term, since W̄_i ≤ b_n,

( W̄_i )² b_n Λ_n/(θn)² ≤ (W̄_i b_n²/(θ²n)) E_n[ W(n) ] ≤ (W̄_i b_n²/(θ²n))( Δ_n + E[W] ).

We conclude that, for E_n(i) as defined in the statement of the lemma,

1(W_i ≤ b_n) Σ_{1≤j≤n, j≠i} P_n(|X_{ij} − Z_{ij}| ≥ 1) ≤ E_n(i) + (W̄_i/θ)( Δ_n + g(b_n) ) + (W̄_i b_n²/(θ²n))( Δ_n + E[W] ) ≤ E_n(i) + η_n W̄_i,

which in turn yields

P_n( max_{1≤j≤n, j≠i} |X_{ij} − Z_{ij}| ≥ 1 ) ≤ min{ 1, 1(W_i > b_n) + E_n(i) + η_n W̄_i }.

Proof of Theorem 4.2 for the IR.
We start by defining the following events:

F_i(I, J, L) = { max_{j∈I} |X_{ji} − Z_{ji}| = 0, Σ_{j∈J} Z*_{ji} + Σ_{j∈L} Z_{ji} = 0 },
B_i = current set B̂_{k−1} ∪ B̂_k at the time the neighbors of i ∈ A_{k−1} are explored,
J_i = current set J at the time the neighbors of i are explored,
H_k = ∩_{i∈A_{k−1}} F_i({1, ..., n} \ J_i, J_i, B_i \ J_i),
M_k = { |V̂_k| ≤ s_n }.

Next, note that

P_n(τ ≤ k) ≤ P_n(τ ≤ k, M_k) + P_n(M_k^c) ≤ Σ_{r=1}^k (1/n) Σ_{i=1}^n P_{n,i}(τ = r, M_r) + P_n(M_k^c),

where the last probability can be bounded using the first part of Theorem 4.7, as was done at the end of the proof of Theorem 4.2 for the CM. Specifically,

P_n(M_k^c) ≤ P(|V_{k+1}| > s_n) + P_n( T̂^{(k)} ≄ T^{(k)} ),

where |V_{k+1}| = Σ_{j=0}^{k+1} |A_j| < ∞ a.s. and the distribution of T^{(k)} does not depend on F_n.

Now note that for any r ≥ 1,

P_{n,i}(τ = r, M_r) = P_{n,i}( M_r ∩ ∩_{m=1}^{r−1} H_m ∩ H_r^c ),

where for r = 1 we interpret ∩_{m=1}^0 H_m = Ω. Let F_t denote the sigma-algebra that contains the history of the exploration process in the graph, as well as that of its coupled tree, up to the end of Step t of the graph exploration process. It follows that we can write

P_{n,i}(τ = r, M_r) = E_{n,i}[ 1( M_{r−1} ∩ ∩_{m=1}^{r−1} H_m ) P_n( M_r ∩ H_r^c | F_{r−1} ) ].

To analyze the conditional probability inside the expectation above, note that conditionally on F_{r−1} the set A_{r−1} is known, and recall that J = V_{r−2} at the beginning of Step r (for r = 1, J = ∅). Therefore, by the union bound and the independence among the edges, we have

P_n( M_r ∩ H_r^c | F_{r−1} ) = P_n( M_r ∩ ∪_{i∈A_{r−1}} F_i({1, ..., n} \ J_i, J_i, B_i \ J_i)^c | F_{r−1} )
  ≤ Σ_{i∈A_{r−1}} P_n( M_r ∩ F_i({1, ..., n} \ J_i, J_i, B_i \ J_i)^c | F_{r−1} )
  ≤ Σ_{i∈A_{r−1}} min{ 1, P_n( max_{j∈{1,...,n}\J_i} |X_{ji} − Z_{ji}| ≥ 1 | F_{r−1} ) + P_n( M_r ∩ { Σ_{j∈J_i} Z*_{ji} + Σ_{j∈B_i\J_i} Z_{ji} ≥ 1 } | F_{r−1} ) }.

Now use the independence of the edges from the rest of the exploration process and Lemma 4.5 to obtain that

P_n( max_{j∈{1,...,n}\J_i} |X_{ji} − Z_{ji}| ≥ 1 | F_{r−1} ) ≤ P_n( max_{1≤j≤n, j≠i} |X_{ji} − Z_{ji}| ≥ 1 ) ≤ 1(W_i > b_n) + E_n(i) + W̄_i η_n.
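The device that drives the entire coupling is that each edge indicator and its Poisson counterpart are generated from the same uniform random variable, X_{ji} = 1(U_{ji} > 1 − p) versus Z_{ji} = G^{−1}(U_{ji}; q), so the two can disagree only on a small set of values of the uniform. Below is a minimal sketch of this common-uniform construction; the function names are ours, not the paper's:

```python
from math import exp, factorial

def poisson_cdf(m, lam):
    """G(m; lam) = P(Poisson(lam) <= m)."""
    return sum(exp(-lam) * lam**j / factorial(j) for j in range(m + 1))

def poisson_inv(u, lam, max_m=1000):
    """Pseudoinverse G^{-1}(u; lam) = inf{m : G(m; lam) >= u}."""
    for m in range(max_m):
        if poisson_cdf(m, lam) >= u:
            return m
    return max_m

def coupled_pair(u, p, q):
    """Bernoulli edge indicator and Poisson offspring count driven by the SAME uniform u."""
    x = 1 if u > 1 - p else 0   # X_ij = 1(U > 1 - p): edge present in the graph
    z = poisson_inv(u, q)       # Z_ij = G^{-1}(U; q): offspring count in the tree
    return x, z
```

For p ≈ q both small, the two draws agree except on an event whose probability is of order |p − q| + q², which is exactly what Lemma 4.5 quantifies.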
Next, condition further on the exploration up to the moment we are about to explore the neighbors of i, and use the independence of the edges from the rest of the exploration process to obtain that

P_n( M_r ∩ { Σ_{j∈J_i} Z*_{ji} + Σ_{j∈B_i\J_i} Z_{ji} ≥ 1 } | F_{r−1} )
  ≤ E_n[ 1(|V̂_r| ≤ s_n)( 1 − e^{−Σ_{j∈B_i} q_{ji}^{(n)}} ) | F_{r−1} ]
  = E_n[ 1(|V̂_r| ≤ s_n)( 1 − e^{−(W̄_i/(θn)) Σ_{j∈B_i} W̄_j} ) | F_{r−1} ]
  ≤ E_n[ 1(|V̂_r| ≤ s_n)( 1 − e^{−(b_n W̄_i/(θn)) |B_i|} ) | F_{r−1} ]
  ≤ b_n W̄_i s_n/(θn),

where we used 1 − e^{−x} ≤ x for x ≥ 0 and |B_i| ≤ |V̂_r| ≤ s_n. It follows that

P_{n,i}(τ = r, M_r) ≤ E_{n,i}[ 1( M_{r−1} ∩ ∩_{m=1}^{r−1} H_m ) Σ_{j∈A_{r−1}} min{ 1, 1(W_j > b_n) + E_n(j) + W̄_j η_n + b_n W̄_j s_n/(θn) } ].

To analyze this remaining expectation we note that on the event ∩_{m=1}^{r−1} H_m the coupling has not yet broken, and therefore we can replace A_{r−1} with its tree counterpart Â_{r−1}. Also, note that by Lemma 3.4 in [?] the types of the nodes in each of the sets Â_k are independent of the types of their parents. We then identify the nodes in Â_{r−1} as {Y_1, ..., Y_{|Â_{r−1}|}}, where for any t ≥ 1,

P_n(Y_t = j) = W̄_j/Λ_n, j = 1, 2, ..., n.

It follows that

P_{n,i}(τ = r, M_r) ≤ E_{n,i}[ 1(M_{r−1}) Σ_{t=1}^{|Â_{r−1}|} min{ 1, 1(W_{Y_t} > b_n) + E_n(Y_t) + W̄_{Y_t} η_n + b_n W̄_{Y_t} s_n/(θn) } ]
  ≤ E_{n,i}[ Σ_{t=1}^{⌊s_n⌋} min{ 1, 1(W_{Y_t} > b_n) + E_n(Y_t) + W̄_{Y_t} η_n + b_n W̄_{Y_t} s_n/(θn) } ]
  ≤ Σ_{t=1}^{⌊s_n⌋} E_{n,i}[ min{ 1, 1(W_{Y_t} > b_n) + E_n(Y_t) + W̄_{Y_t} η_n + b_n W̄_{Y_t} s_n/(θn) } ]
  = ⌊s_n⌋ E_n[ min{ 1, 1(W_{Y_1} > b_n) + E_n(Y_1) + W̄_{Y_1} η_n + b_n W̄_{Y_1} s_n/(θn) } ].

To compute the last expectation, let (W(n), W) be constructed according to an optimal coupling of F_n and F.
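The identity P_n(Y_t = j) = W̄_j/Λ_n above says that, given the history, the type of a tree node is distributed proportionally to its truncated weight. A toy sketch of such a size-biased type sampler (the names are ours, not the paper's):

```python
import random

def size_biased_type_sampler(weights, b_n, seed=0):
    """Sample a type j with probability Wbar_j / Lambda_n, Wbar_j = min(W_j, b_n)."""
    rng = random.Random(seed)
    wbar = [min(w, b_n) for w in weights]  # truncated weights Wbar_j
    lam = sum(wbar)                        # Lambda_n
    probs = [w / lam for w in wbar]        # the type distribution

    def sample():
        u = rng.random() * lam             # inverse-transform over the cumulative weights
        acc = 0.0
        for j, w in enumerate(wbar):
            acc += w
            if u <= acc:
                return j
        return len(wbar) - 1

    return sample, probs

# toy example: weight 10.0 is truncated to b_n = 5.0 before normalizing
sampler, probs = size_biased_type_sampler([3.0, 1.0, 10.0], b_n=5.0)
```

With these toy weights the truncated vector is (3, 1, 5), so the type probabilities are (3/9, 1/9, 5/9); truncation at b_n keeps any single heavy vertex from dominating the type distribution.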
Let W̄(n) = W(n) ∧ b_n and let Ē_n := (1/n) Σ_{j=1}^n E_n(j). Then, for any c_n ≥ 1,

⌊s_n⌋ E_n[ min{ 1, 1(W_{Y_1} > b_n) + E_n(Y_1) + W̄_{Y_1} η_n + b_n W̄_{Y_1} s_n/(θn) } ]
  ≤ s_n Σ_{j=1}^n (W̄_j/Λ_n) min{ 1, 1(W_j > b_n) + E_n(j) + W̄_j η_n + b_n W̄_j s_n/(θn) }
  ≤ (s_n n/Λ_n)(1/n) Σ_{j=1}^n (W̄_j − c_n)⁺ + (s_n n/Λ_n)(1/n) Σ_{j=1}^n c_n( 1(W_j > b_n) + E_n(j) + W̄_j η_n + b_n W̄_j s_n/(θn) )
  = (s_n/E_n[W̄(n)])( E_n[ (W̄(n) − c_n)⁺ ] + c_n P_n( W(n) > b_n ) + c_n Ē_n + E_n[ W̄(n) ]( c_n η_n + c_n b_n s_n/(θn) ) )
  = O_P( s_n( g(c_n) + Δ_n + c_n P(W > b_n) + c_n Ē_n + c_n η_n + c_n b_n s_n/n ) ), n → ∞,

and since η_n = O_P( Δ_n + g(b_n) + b_n²/n ), we conclude that

P_{n,i}(τ = r, M_r) = O_P( s_n( g(c_n) + c_n Ē_n + c_n Δ_n + c_n g(b_n) + c_n(b_n² + b_n s_n)/n ) ), as n → ∞.

It now follows from the beginning of the proof that

P_n(τ ≤ k) ≤ O_P( k s_n( g(c_n) + c_n Ē_n + c_n Δ_n + c_n g(b_n) + c_n(b_n² + b_n s_n)/n ) ) + P( |V_{k+1}| > s_n ) + P_n( T̂^{(k)} ≄ T^{(k)} ),

as n → ∞. Since lim_{x→∞} g(x) = 0, choosing, for example, c_n = ( Ē_n + Δ_n + g(b_n) + b_n²/n )^{−1/2} and s_n = ( g(c_n) + c_n^{−1/2} )^{−1/2} proves the theorem.

The equivalent of Theorem 4.2 for directed graphs has already been proven, under conditions equivalent to those in Assumption 3.1, in [22] (Theorem 6.3) for the DCM, and in [19] (Theorem 3.7) for the IRD. Hence, we only need to describe the distribution of the intermediate tree and state the coupling theorem. The descriptions of the couplings follow, with some adjustments, those in Sections 4.1.1 and 4.1.2; the precise descriptions in the directed case can be found in [9] (Section 5.2) for the DCM and in [19] (Section 3.2.2) for the IRD.

In the directed case, the intermediate tree T̂ is constructed using a sequence of conditionally independent (given F_n) random vectors {(N̂_i, D̂_i, Â_i) : i ∈ U} in S, with {(N̂_i, D̂_i, Â_i) : i ∈ U, i ≠ ∅} conditionally i.i.d. The tree T̂ is constructed as in the undirected case using the {N̂_i}, with all edges pointing towards the root, and the full marks take the form

X̂_i = (N̂_i, D̂_i, Â_i), i ∈ U.

The marked tree is given by T̂(Â) = {X̂_i : i ∈ T̂}.

We now specify the distribution of the full marks, which in the case of a DCM is given by:

P_n( X̂_∅ ∈ · ) = (1/n) Σ_{i=1}^n 1( (D⁻_i, D⁺_i, a_i) ∈ · ),
P_n( X̂_i ∈ · ) = Σ_{i=1}^n (D⁺_i/L_n) 1( (D⁻_i, D⁺_i, a_i) ∈ · ), i ≠ ∅.

For the IRD model, first let {a_n} and {b_n} be sequences such that a_n ∧ b_n → ∞ and a_n b_n/n → 0 as n → ∞, and use them to define

W̄⁻_i = W⁻_i ∧ a_n, W̄⁺_i = W⁺_i ∧ b_n, Λ⁻_n = Σ_{i=1}^n W̄⁻_i and Λ⁺_n = Σ_{i=1}^n W̄⁺_i.
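As a quick illustration of these definitions, the following sketch computes the truncated weights, the normalizing constants Λ⁻_n and Λ⁺_n, and Poisson degree means of the form Λ⁺_n W̄⁻_i/(θn) and Λ⁻_n W̄⁺_i/(θn). The pairing of Λ⁺_n with W̄⁻_i (in-degree) and of Λ⁻_n with W̄⁺_i (out-degree) is an assumption of this sketch, and all names are ours:

```python
def directed_truncation(w_minus, w_plus, a_n, b_n, theta):
    """Truncated in/out weights, their totals, and the Poisson degree means
    for each type i in the directed (IRD) coupling."""
    n = len(w_minus)
    wm = [min(w, a_n) for w in w_minus]  # Wbar-_i = W-_i ^ a_n
    wp = [min(w, b_n) for w in w_plus]   # Wbar+_i = W+_i ^ b_n
    lam_minus = sum(wm)                  # Lambda-_n
    lam_plus = sum(wp)                   # Lambda+_n
    # assumed pairing: in-degree mean uses Wbar-_i, out-degree mean uses Wbar+_i
    mean_in = [lam_plus * w / (theta * n) for w in wm]
    mean_out = [lam_minus * w / (theta * n) for w in wp]
    return lam_minus, lam_plus, mean_in, mean_out
```

A consistency check built into this pairing: the total expected in-degree Σ_i Λ⁺_n W̄⁻_i/(θn) and the total expected out-degree Σ_i Λ⁻_n W̄⁺_i/(θn) both equal Λ⁻_n Λ⁺_n/(θn), as they must for edge counts to balance.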
The marks on the coupled marked Galton-Watson process are given by:

P_n( X̂_∅ ∈ · ) = (1/n) Σ_{i=1}^n P( (D⁻_i, D⁺_i, a_i) ∈ · | a_i ),
P_n( X̂_i ∈ · ) = Σ_{i=1}^n (W̄⁺_i/Λ⁺_n) P( (D⁻_i, D⁺_i + 1, a_i) ∈ · | a_i ), i ≠ ∅,

where, conditionally on a_i, D⁻_i and D⁺_i are independent Poisson random variables with means Λ⁺_n W̄⁻_i/(θn) and Λ⁻_n W̄⁺_i/(θn), respectively.

The intermediate coupling theorem for directed graphs is given below; it is a direct consequence of Theorem 6.3 in [22] and Theorem 3.7 in [19].

Theorem 4.6
Suppose G(V_n, E_n) is either a DCM or an IRD satisfying Assumption 3.1. Then, for G_I^{(k)}(a) the depth-k neighborhood of a uniformly chosen vertex I ∈ V_n, there exists a marked Galton-Watson tree T̂^{(k)}(Â), restricted to its first k generations, whose root corresponds to vertex I, and such that for any fixed k ≥ 1,

P_n( G_I^{(k)}(a) ≄ T̂^{(k)}(Â) ) →_P 0, n → ∞.

In view of Theorems 4.2 and 4.6, the proofs of the main theorems, Theorems 2.3 and 3.2, will be complete once we establish that, with high probability, the intermediate tree T̂^{(k)} is isomorphic to the limiting tree T^{(k)}, and that the node marks in the two trees are within ε distance of each other.

Note:
There is no need to consider the undirected and directed cases separately, since they differ only in the sample space for the full marks, X̂_i/X_i, which take values in S = N × R × S′ in the undirected case and S = N × N × R × R × S′ in the directed one. In the directed case, all edges in the trees point towards the root.

The coupling theorem between the two trees is the following. The proofs of the main theorems, Theorems 2.3 and 3.2, then follow directly by combining Theorems 4.2 and 4.7 in the undirected case, and Theorems 4.6 and 4.7 in the directed one.

Theorem 4.7
Under Assumption 2.1 or 3.1, as appropriate, there exists a coupling of T̂^{(k)}(Â) and T^{(k)}(A) such that

P_n( T̂^{(k)} ≄ T^{(k)} ) →_P 0, n → ∞,

and such that for any ε > 0,

E_n[ ρ(X̂_∅, X_∅) ] →_P 0 and P_n( ∩_{r=0}^k ∩_{i∈A_r} { ρ(X̂_i, X_i) ≤ ε }, T̂^{(k)} ≃ T^{(k)} ) →_P 1, n → ∞.

Before proving Theorem 4.7 we will need a couple of technical lemmas. The first of the two establishes the existence of couplings for the node attributes, whose distributions are given by

υ_n(·) = P_n( Â ∈ · ) = (1/n) Σ_{i=1}^n 1( a_i ∈ · ) and υ(·) = P( A ∈ · ),

where a_i = (D_i, b_i) in the CM and a_i = (W_i, b_i) in the IR, while in the directed case they take the form a_i = (D⁻_i, D⁺_i, b_i) in the DCM and a_i = (W⁻_i, W⁺_i, b_i) in the IRD. In the undirected case the size-biasing is done with respect to the first coordinate, while in the directed case it is done with respect to the second one. Specifically, the size-biased attributes in the undirected case take the form

P_n( Â_b ∈ · ) = L_n^{−1} Σ_{i=1}^n D_i 1( (D_i, b_i) ∈ · ) in the CM, and Λ_n^{−1} Σ_{i=1}^n W̄_i 1( (W_i, b_i) ∈ · ) in the IR,

and

P( A_b ∈ · ) = E[ D 1( (D, B) ∈ · ) ]/E[D] in the CM, and E[ W 1( (W, B) ∈ · ) ]/E[W] in the IR,

while in the directed case they take the form

P_n( Â_b ∈ · ) = L_n^{−1} Σ_{i=1}^n D⁺_i 1( (D⁻_i, D⁺_i, b_i) ∈ · ) in the DCM, and (Λ⁺_n)^{−1} Σ_{i=1}^n W̄⁺_i 1( (W⁻_i, W⁺_i, b_i) ∈ · ) in the IRD,

and

P( A_b ∈ · ) = E[ D⁺ 1( (D⁻, D⁺, B) ∈ · ) ]/E[D⁺] in the DCM, and E[ W⁺ 1( (W⁻, W⁺, B) ∈ · ) ]/E[W⁺] in the IRD.

For the undirected case, let ρ′′ be the metric on S′′ = [0, ∞) × S′ given by

ρ′′(x, y) = |x_1 − y_1| + ρ′(x_2, y_2), x = (x_1, x_2), y = (y_1, y_2),

and for the directed case let ρ′′ be the metric on S′′ = [0, ∞) × [0, ∞) × S′ given by

ρ′′(x, y) = |x_1 − y_1| + |x_2 − y_2| + ρ′(x_3, y_3), x = (x_1, x_2, x_3), y = (y_1, y_2, y_3).

Lemma 4.8
Under Assumption 2.1 or 3.1, as appropriate, there exist couplings (Â, A) and (Â_b, A_b), constructed on the same probability space (S′′, F_n, P_n), such that

E_n[ ρ′′(Â, A) ] →_P 0, ρ′′(Â, A) →_P 0 and ρ′′(Â_b, A_b) →_P 0, as n → ∞.

Proof.
Assumptions 2.1 and 3.1 state that W_1(υ_n, υ) →_P 0 as n → ∞, and by the properties of the Wasserstein metric (see Theorem 4.1 in [26]) there exists an optimal coupling (Â, A) such that

E_n[ ρ′′(Â, A) ] = W_1(υ_n, υ) →_P 0, and hence ρ′′(Â, A) →_P 0, as n → ∞.

For the biased versions, note that it suffices to prove the lemma for the undirected case, since a simple rearrangement of terms, a′_i = (D′_i, b′_i) := (D⁺_i, D⁻_i, b_i) or a′_i = (W′_i, b′_i) := (W⁺_i, W⁻_i, b_i), reduces the directed case to the undirected one. Throughout the remainder of the proof, write Â = (Ŷ, B̂) and A = (Y, B) to avoid having to separate the CM and IR cases.

Next, note that we only need to show that Â_b ⇒ A_b as n → ∞, where ⇒ denotes convergence in distribution, since then we can take the almost sure representation to obtain that ρ′′(Â_b, A_b) →_P 0. To this end, let f : S′′ → R be a bounded and continuous function, and let (Â, A) be the coupling from the beginning of the proof. Let Ỹ = Ŷ if the graph is a CM, or Ỹ = Ŷ ∧ b_n if it is an IR. Then,

| E_n[ f(Â_b) ] − E[ f(A_b) ] |
  = | (1/E_n[Ỹ]) E_n[ Ỹ f(Â) ] − (1/E[Y]) E[ Y f(A) ] |
  ≤ (1/E_n[Ỹ])( | E_n[ (Ỹ − Y) f(Â) ] | + | E_n[ Y( f(Â) − f(A) ) ] | ) + | 1/E_n[Ỹ] − 1/E[Y] | · | E[ Y f(A) ] |
  ≤ (1/E_n[Ỹ])( E_n[ |Ỹ − Y| ] sup_{a∈S′′} |f(a)| + | E_n[ Y( f(Â) − f(A) ) ] | ) + ( E_n[ |Ỹ − Y| ]/( E_n[Ỹ] E[Y] ) ) | E[ Y f(A) ] |
  ≤ (1/E_n[Ỹ])( 2 E_n[ |Ỹ − Y| ] sup_{a∈S′′} |f(a)| + E_n[ Y |f(Â) − f(A)| ] ).
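The change-of-measure identity used in the display above, E_n[f(Â_b)] = E_n[Ỹ f(Â)]/E_n[Ỹ], holds exactly for any finite weight vector, and can be checked directly. The following sketch (our names, a toy finite-n illustration) computes a size-biased expectation both ways:

```python
def sizebias_expectation(values, weights, f):
    """E[f(A_b)] for the size-biased law P(A_b = a_j) proportional to w_j,
    computed (i) directly from the size-biased pmf and
    (ii) via the change of measure E[Y f(A)] / E[Y] under the uniform empirical law."""
    total = sum(weights)
    # (i) direct: sum over the size-biased probability mass function
    direct = sum((w / total) * f(a) for a, w in zip(values, weights))
    # (ii) change of measure under the uniform law on {a_1, ..., a_n}
    n = len(values)
    change = (sum(w * f(a) for a, w in zip(values, weights)) / n) / (total / n)
    return direct, change
```

Both computations return the same number; the proof's content is that this finite-n identity passes to the limit when the empirical law converges in the Wasserstein sense.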
Since W_1(υ_n, υ) →_P 0 implies E_n[ |Ŷ − Y| ] →_P 0 as n → ∞, we have

E_n[ |Ỹ − Y| ] ≤ E_n[ |Ŷ − Y| ] + E[ Y 1(Y > b_n) ] →_P 0 and E_n[Ỹ] →_P E[Y], as n → ∞.

And by the dominated convergence theorem,

lim_{n→∞} E[ E_n[ Y |f(Â) − f(A)| ] ] = E[ lim_{n→∞} Y |f(Â) − f(A)| ] = 0.

Hence, E_n[ Y |f(Â) − f(A)| ] →_P 0 as n → ∞, and Â_b ⇒ A_b as required.

The second technical lemma relates the convergence of the attributes to that of the full marks.

Lemma 4.9
Suppose Assumption 2.1 or 3.1 holds, as appropriate, and let (Â, A) and (Â_b, A_b) be the couplings in Lemma 4.8. Then there exist couplings (X̂_∅, X_∅) and (X̂, X), constructed on the same probability space as (Â, A) and (Â_b, A_b), such that

E_n[ ρ(X̂_∅, X_∅) ] →_P 0, ρ(X̂_∅, X_∅) →_P 0 and ρ(X̂, X) →_P 0, as n → ∞.

Proof. For the two undirected models, CM and IR, write

Â = (Ŷ, B̂), A = (Y, B), Â_b = (Ŷ_b, B̂_b), A_b = (Y_b, B_b).

For the two directed models, DCM and IRD, write

Â = (Ŷ⁻, Ŷ⁺, B̂), A = (Y⁻, Y⁺, B), Â_b = (Ŷ⁻_b, Ŷ⁺_b, B̂_b), A_b = (Y⁻_b, Y⁺_b, B_b).

To obtain the statement of the lemma for the CM, simply set (X̂_∅, X_∅) = (Ŷ, Â, Y, A) and (X̂, X) = (Ŷ_b, Â_b, Y_b, A_b). Similarly, for the DCM set (X̂_∅, X_∅) = (Ŷ⁻, Ŷ⁺, Â, Y⁻, Y⁺, A) and (X̂, X) = (Ŷ⁻_b, Ŷ⁺_b, Â_b, Y⁻_b, Y⁺_b, A_b).

For the IR, construct (Ŝ, S) = ( Λ_n(Ŷ ∧ b_n)/(θn), Y ). Note that our assumptions imply that E_n[ |Ŝ − S| ] →_P 0 as n → ∞. Now let U ~ Uniform[0,1] be independent of (Ŝ, S), and take

(X̂_∅, X_∅) = ( G^{−1}(U; Ŝ), Â, G^{−1}(U; S), A ),

where G^{−1}(u; λ) = Σ_{m=1}^∞ m 1( G(m−1; λ) < u ≤ G(m; λ) ) is the generalized inverse of the Poisson distribution function with mean λ. Note that since G(m; λ) is decreasing in λ for all m ≥ 0, we have that Poi(λ) ≥_st Poi(μ) whenever λ ≥ μ, where ≥_st denotes the usual stochastic order and Poi(α) denotes a Poisson random variable with mean α. It follows that

E[ |G^{−1}(U; λ) − G^{−1}(U; μ)| ] = |λ − μ|,

which in turn implies that

E_n[ ρ(X̂_∅, X_∅) ] = E_n[ |Ŝ − S| + ρ′′(Â, A) ] →_P 0, n → ∞.

For the size-biased versions, set (Ŝ_b, S_b) = ( Λ_n(Ŷ_b ∧ b_n)/(θn), Y_b ), note that Lemma 4.8 gives |Ŝ_b − S_b| →_P 0 as n → ∞, and let

(X̂, X) = ( G^{−1}(U; Ŝ_b) + 1, Â_b, G^{−1}(U; S_b) + 1, A_b ).

Now use the continuity in λ of G^{−1}(u; λ) to obtain that

ρ(X̂, X) = | G^{−1}(U; Ŝ_b) − G^{−1}(U; S_b) | + ρ′′(Â_b, A_b) →_P 0, n → ∞.

The same steps also give the result for the IRD by setting

(Ŝ⁻, Ŝ⁺, S⁻, S⁺) = ( Λ⁺_n(Ŷ⁻ ∧ a_n)/(θn), Λ⁻_n(Ŷ⁺ ∧ b_n)/(θn), cY⁻, (1 − c)Y⁺ ),
(Ŝ⁻_b, Ŝ⁺_b, S⁻_b, S⁺_b) = ( Λ⁺_n(Ŷ⁻_b ∧ a_n)/(θn), Λ⁻_n(Ŷ⁺_b ∧ b_n)/(θn), cY⁻_b, (1 − c)Y⁺_b ),

where c = E[W⁺]/E[W⁻ + W⁺], and setting

(X̂_∅, X_∅) = ( G^{−1}(U; Ŝ⁻), G^{−1}(U′; Ŝ⁺) + 1, Â, G^{−1}(U; S⁻), G^{−1}(U′; S⁺) + 1, A ),
(X̂, X) = ( G^{−1}(U; Ŝ⁻_b), G^{−1}(U′; Ŝ⁺_b) + 1, Â_b, G^{−1}(U; S⁻_b), G^{−1}(U′; S⁺_b) + 1, A_b ),

for some U, U′ i.i.d. Uniform[0,1] and independent of F_n. This completes the proof.

Finally, we can give the proof of Theorem 4.7.

Proof of Theorem 4.7.
By Lemma 4.9 there exist couplings (X̂_∅, X_∅) and (X̂, X) such that

E_n[ ρ(X̂_∅, X_∅) ] →_P 0 and ρ(X̂, X) →_P 0, as n → ∞.

Now let {(X̂_i, X_i) : i ∈ U, i ≠ ∅} be i.i.d. copies of (X̂, X), independent of (X̂_∅, X_∅). Recall that N̂_i (N_i) can be determined from the first coordinate of X̂_i (X_i).

We will now use the sequence {(X̂_i, X_i) : i ∈ U} to construct both T̂(Â) and T(A), by determining their nodes according to the recursions

Â_k = { (i, j) : i ∈ Â_{k−1}, 1 ≤ j ≤ N̂_i } and A_k = { (i, j) : i ∈ A_{k−1}, 1 ≤ j ≤ N_i },

for k ≥ 1. Define

κ(ε) = inf{ k ≥ 0 : ρ(X̂_i, X_i) > ε for some i ∈ Â_k }.

Note that since N̂_i and N_i are integer-valued, for ε < 1 the inequality ρ(X̂_i, X_i) ≤ ε implies that N̂_i = N_i. It follows that for any x_n ≥ 1,

P_n( T̂^{(k)} ≃ T^{(k)} ) ≥ P_n( ∩_{r=0}^k ∩_{i∈A_r} { ρ(X̂_i, X_i) ≤ ε }, T̂^{(k)} ≃ T^{(k)} ) = P_n( κ(ε) > k )
  ≥ 1 − P_n( κ(ε) ≤ k, |V_k| ≤ x_n ) − P_n( |V_k| > x_n )
  = 1 − Σ_{r=0}^k P_n( κ(ε) = r, |V_k| ≤ x_n ) − P_n( |V_k| > x_n ),

where V_k = ∪_{r=0}^k A_r. To compute the last probabilities, note that P_n( κ(ε) = 0 ) ≤ ε^{−1} E_n[ ρ(X̂_∅, X_∅) ], and for r ≥ 1,

P_n( κ(ε) = r, |V_k| ≤ x_n ) ≤ P_n( ∪_{i∈A_r} { ρ(X̂_i, X_i) > ε }, |A_r| ≤ x_n )
  ≤ E_n[ 1(|A_r| ≤ x_n) Σ_{i∈A_r} 1( ρ(X̂_i, X_i) > ε ) ]
  = E_n[ 1(|A_r| ≤ x_n) |A_r| ] P_n( ρ(X̂, X) > ε )
  ≤ x_n P_n( ρ(X̂, X) > ε ),

where in the third step we used the independence of (X̂, X) from A_r. It follows that if we choose x_n = P_n( ρ(X̂, X) > ε )^{−1/2} →_P ∞, then

P_n( ∩_{r=0}^k ∩_{i∈A_r} { ρ(X̂_i, X_i) ≤ ε }, T̂^{(k)} ≃ T^{(k)} )
  ≥ 1 − ε^{−1} E_n[ ρ(X̂_∅, X_∅) ] − k x_n P_n( ρ(X̂, X) > ε ) − P_n( |V_k| > x_n )
  ≥ 1 − ε^{−1} E_n[ ρ(X̂_∅, X_∅) ] − k x_n^{−1} − P_n( |V_k| > x_n ) →_P 1,

as n → ∞. This completes the proof.

References

[1] David Aldous and Russell Lyons. Processes on unimodular random networks.
Electronic Journal of Probability, 12:1454–1508, 2007.
[2] David Aldous and J. Michael Steele. The objective method: probabilistic combinatorial optimization and local weak convergence. In Probability on Discrete Structures, pages 1–72. Springer, 2004.
[3] T. L. Austin, R. E. Fagen, W. F. Penney, and J. Riordan. The number of components in random linear graphs. Annals of Mathematical Statistics, 30:747–754, 1959.
[4] Itai Benjamini and Oded Schramm. Recurrence of distributional limits of finite planar graphs. In Selected Works of Oded Schramm, pages 533–545. Springer, 2011.
[5] B. Bollobás. A probabilistic proof of an asymptotic formula for the number of labelled regular graphs. European Journal of Combinatorics, 1:311–316, 1980.
[6] B. Bollobás. Random Graphs. Cambridge University Press, 2001.
[7] B. Bollobás, S. Janson, and O. Riordan. The phase transition in inhomogeneous random graphs. Random Structures & Algorithms, 31:3–122, 2007.
[8] T. Britton, M. Deijfen, and A. Martin-Löf. Generating simple random graphs with prescribed degree distribution. Journal of Statistical Physics, 124:1377–1397, 2006.
[9] N. Chen, N. Litvak, and M. Olvera-Cravioto. Generalized PageRank on directed configuration networks. Random Structures & Algorithms, 51(2):237–274, 2017.
[10] N. Chen and M. Olvera-Cravioto. Directed random graphs with given degree distributions. Stochastic Systems, 3:147–186, 2013.
[11] F. Chung and L. Lu. Connected components in random graphs with given expected degree sequences. Annals of Combinatorics, 6:125–145, 2002.
[12] F. Chung and L. Lu. The average distances in random graphs with given expected degrees. Proceedings of the National Academy of Sciences, 99:15879–15882, 2002.
[13] F. Chung and L. Lu. The volume of the giant component of a random graph with given expected degrees. SIAM Journal on Discrete Mathematics, 20:395–411, 2006.
[14] F. Chung and L. Lu. Complex Graphs and Networks, volume 107 of CBMS Regional Conference Series in Mathematics. 2006.
[15] R. Durrett. Random Graph Dynamics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2007.
[16] A. Garavaglia, R. van der Hofstad, and N. Litvak. Local weak convergence for PageRank. Annals of Applied Probability, 30(1):40–79, 2020.
[17] E. N. Gilbert. Random graphs. Annals of Mathematical Statistics, 30:1141–1144, 1959.
[18] S. Janson, T. Łuczak, and A. Ruciński. Random Graphs. Wiley-Interscience, 2000.
[19] J. Lee and M. Olvera-Cravioto. PageRank on inhomogeneous random digraphs. Stochastic Processes and their Applications, 130(4):1–57, 2020.
[20] L. Lu. Probabilistic Methods in Massive Graphs and Internet Computing. PhD thesis, University of California, San Diego, 2002.
[21] I. Norros and H. Reittu. On a conditionally Poissonian graph process. Advances in Applied Probability, 38:59–75, 2006.
[22] M. Olvera-Cravioto. PageRank's behavior under degree correlations. To appear in Annals of Applied Probability, 2019. ArXiv:1909.09744.
[23] P. Erdős and A. Rényi. On random graphs. Publicationes Mathematicae (Debrecen), 6:290–297, 1959.
[24] H. van den Esker, R. van der Hofstad, and G. Hooghiemstra. Universality for the distance in finite variance random graphs. Journal of Statistical Physics, 133:169–202, 2008.
[25] R. van der Hofstad. Random Graphs and Complex Networks. 2014.
[26] C. Villani. Optimal Transport: Old and New. Springer, 2009.