Large deviations for the degree structure in preferential attachment schemes
arXiv [math.PR]. The Annals of Applied Probability © Institute of Mathematical Statistics, 2013
By Jihyeok Choi and Sunder Sethuraman
Syracuse University and University of Arizona
Preferential attachment schemes, where the selection mechanism is linear and possibly time-dependent, are considered, and an infinite-dimensional large deviation principle for the sample path evolution of the empirical degree distribution is found by Dupuis–Ellis-type methods. Interestingly, the rate function, which can be evaluated, contains a term which accounts for the cost of assigning a fraction of the total degree to an "infinite" degree component, that is, when an atypical "condensation" effect occurs with respect to the degree structure. As a consequence of the large deviation results, a sample path a.s. law of large numbers for the degree distribution is deduced in terms of a coupled system of ODEs from which power law bounds for the limiting degree distribution are given. However, by analyzing the rate function, one can see that the process can deviate to a variety of atypical nonpower law distributions with finite cost, including distributions typically associated with sub- and superlinear selection models.
1. Introduction and results.
Preferential attachment processes are graph networks which evolve in time by linking at each time step a new node to a vertex in the existing graph with probability based on a selection function of the vertex's connectivity. Such schemes have a long history in various guises going back to [50] and [51]; cf. surveys [40, 49]. More recently, Barabási and Albert (BA) in [4] proposed that versions of these processes, where the selection function is an increasing function of the connectivity, may serve as models for growing real-world networks such as the world wide internet web, and types of social structures. For instance, in a "friend network," a newcomer may have a predilection to link or become friends with an individual with high connectivity, or in other words, one who already has many friends. An important property of
Received January 2012. Supported in part by NSF-09-06713 and NSA Q H982301010180.
AMS 2000 subject classifications.
Primary 60F10; secondary 05C80.
Key words and phrases.
Preferential attachment, random graphs, degree distribution, large deviations, time-dependent, law of large numbers, power law, condensation.
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Applied Probability, 2013, Vol. 23, No. 2, 722–763. This reprint differs from the original in pagination and typographic detail.
such reinforcing networks is that when the selection function is in a linear form, asymptotically as time grows, the proportions of nodes with degrees 1, 2, ..., k, ... converge to a power-law distribution ⟨q(k) : k ≥ 1⟩, where 0 < lim_{k↑∞} q(k) k^θ < ∞ for some θ > 0. We will say that a network with such a law of large numbers (LLN) property is "scale-free." Since it has been observed that the sampled empirical degree structure in many real-world networks has a "scale-free" form, such preferential attachment processes have become quite popular in several ways; see [1, 3, 5, 12, 14, 16, 17, 26, 30, 40, 43–45] and references therein.

To illustrate more clearly the possible phenomena, consider the following basic example.
Example 1.1.
Initially, at time n = 1, the network G_1 is composed of two vertices with a single (undirected) edge between them. At time n = 2, a new vertex is attached to one of the two vertices in G_1 with probability proportional to a function of its degree, to form the new network G_2. This scheme continues: more precisely, at time n + 1, a new node is linked to vertex x ∈ G_n with probability proportional to w(d_x(n)), that is, chance w(d_x(n)) / Σ_{y∈G_n} w(d_y(n)), where d_z(n) is the degree at time n of vertex z, and w = w(d) : N → R_+ is the selection function. In this way, since the initial graph is a tree, all later networks G_n for n ≥ 1 are trees.

Let Z_k(n) be the number of vertices in G_n with k links, Z_k(n) = #{y ∈ G_n : d_y(n) = k}. We now describe a trichotomy of growth behaviors corresponding to the strength and type of the selection function w [36].

First, when w is linear, say w(d) = d + α for α > −1, the system is "scale-free." As is well understood in the literature (cf. [30], Chapter 4), the mean values ⟨M_k(n) = E[Z_k(n)] : k ≥ 1⟩ satisfy rate equations in the time index n ≥ 1, from which one obtains lim_{n↑∞} M_k(n)/n = q(k) for k ≥ 1, where q is in power-law form with θ = 3 + α. Later, in [10], when α = 0, a concentration inequality was used to show convergence in probability, lim_{n↑∞} Z_k(n)/n = q(k) for k ≥ 1. We will call the α = 0 model the "classical BA process," as it was the model originally analyzed in [4]. Also, for all α > −1, Pólya urn/martingale ideas and embeddings into branching processes have given alternative proofs which yield a.s. convergence; see [2, 41, 48].

However, in the sublinear case, when w(d) = d^r for 0 < r < 1, although it was shown that a.s. lim_{n↑∞} Z_k(n)/n = q(k), this LLN limit q is not a power law, but in stretched exponential form [36, 48]: for k ≥ 1,

(1.1)   q(k) = (μ/k^r) Π_{j=1}^{k} (1 + μ/j^r)^{−1},

and μ is determined by 1 = Σ_{k=1}^{∞} Π_{j=1}^{k} (1 + μ/j^r)^{−1}. Asymptotically, log q(k) ∼ −(μ/(1 − r)) k^{1−r} as k ↑ ∞. On the other hand, when r = 0, the case of uniform attachment when an old vertex is selected uniformly, an a.s. LLN can also be similarly obtained where q is geometric, q(k) = 2^{−k} for k ≥ 1.

In the superlinear case, when w(d) = d^r for r > 1, "explosion" or a sort of "condensation effect" happens, in that in the limiting graph a random single vertex dominates in accumulating connections. In particular, the limiting graph is a tree where there is a single random vertex with an infinite number of children; all other vertices have bounded degree, and of these only a finite number have degree strictly larger than r/(r − 1). Moreover, an LLN in the mean, lim_{n↑∞} E Z_k(n)/n = q(k), is argued where q is degenerate in that q(1) = 1 but q(k) = 0 for k ≥ 2; cf. [36] and [30], Chapter 4. Such a limit implies, in the superlinear selection process, that most of the nodes at step n are leaves.

Since the work of Barabási and Albert [4], much effort has been devoted to understanding the degree and other structures in generalized versions of these graphs. A partial selection of this large literature includes: more on degree structure [23, 24, 31, 32, 34, 37, 38]; growth and location of the maximum degree [2, 21, 42]; spectral gap and cover time of a random walk on the graph [19, 39]; width and diameter [9, 22, 35]; graph limits [6, 8, 11, 47].

Connection between urns and degree structure.
If, however, one focuses only on the degree structure of the growing network, then it may be helpful to view the degree distribution evolution in terms of "balls-in-bins" or "Pólya urn" models. For instance, in the previous example, every new connection that a vertex gains can be represented by a new ball added to a corresponding urn in a collection of urns. More precisely, at time n = 1, there are two urns, each possessing one ball, in the initial collection U(0). At time j + 1, a new urn with one ball is included in the collection, and also one ball is added to an existing urn x ∈ U(j) with probability proportional to w(b_x), where b_x is the number of balls in urn x. Then, Z_k(n) translates to the number of urns in U(n) with k balls for k ≥ 1.

In the urn scheme of Chung, Handjani and Jungreis (CHJ) [15], at each time, with probability p, a new urn with one ball is created and added to the collection or, with probability 1 − p, a new ball is put in one of the existing urns x with probability proportional to (b_x)^r, where b_x is the number of balls in x. It was proved in [15], among other results, when r = 1 and p > 0, analogous to linear selection preferential attachment graphs, that the empirical urn size distribution converges to a power law with θ = 1 + (1 − p)^{−1}.

In this context, our purpose is to study a generalized preferential attachment process of urns, where at each time step a new urn is created and a new ball is added to it or to an existing urn according to a time-dependent linear selection function, which includes the evolving degree structure of the linear selection preferential attachment model discussed above, and also a version of the r = 1 CHJ urn model [15]. We defer to Section 1.1 the exact description of our scheme.

As mentioned in [28], understanding preferential attachment or urn systems where the selection function depends on time allows for more realistic models, given real-world networks are time-dependent. However, it appears most of the work on time-dependent schemes consists of rate equation formulations ([25], Section E), [36] and related work, in models where at each step a random number of links or balls may be added to the structure [2, 18].

Given this background, detailing the large deviation behavior of the degree distribution in time-dependent preferential attachment schemes is a natural problem which gives much understanding of typical and, in particular, atypical evolutions. We remark, even in the usual time-homogeneous models, large deviations of the degree structure is an open question.

Previous large deviation work in preferential attachment models has focused on one-dimensional objects, for instance, the number of leaves [13], or the degree growth of a single vertex with respect to dynamics where any vertex may attach to a newly added vertex with a small chance [21].
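To make the "balls-in-bins" view of Example 1.1 concrete, here is a minimal simulation sketch (illustrative code, not from the paper): two one-ball urns to start; at each step a new one-ball urn joins (the newcomer vertex, degree 1), and one more ball goes to an existing urn chosen with probability proportional to w(b_x).

```python
import random
from collections import Counter

def simulate_example_urns(n, w, seed=None):
    """Balls-in-bins view of Example 1.1: start with two urns of one ball
    each; at each step, add one ball to an existing urn chosen with
    probability proportional to w(b_x), then include a new one-ball urn."""
    rng = random.Random(seed)
    urns = [1, 1]
    for _ in range(n):
        weights = [w(b) for b in urns]
        total = sum(weights)
        u, acc = rng.random() * total, 0.0
        for i, wt in enumerate(weights):
            acc += wt
            if u <= acc:
                urns[i] += 1
                break
        urns.append(1)              # the newcomer's urn (a degree-1 vertex)
    return Counter(urns)            # Z_k(n): number of urns holding k balls
```

Running this with w(d) = d, w(d) = d^r for r < 1, or r > 1 illustrates the trichotomy of the example: a power-law-like histogram, a stretched-exponential-like one, or one dominated by a single huge urn, respectively.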
See references cited in [13] for large deviations work with respect to other types of random trees and balls-in-bins models.

Our main work in this article includes an infinite-dimensional sample path large deviation principle (LDP) for an array of empirical urn ball size distributions {⟨Z^n_k(j)/n : k ≥ 0⟩ : 0 ≤ j ≤ n}_{n≥1}, when the initial configuration, not necessarily fixed, satisfies a limit condition (Theorem 1.4). Here, Z^n_k(j) stands for the count of urns with k balls at time j in the nth row of the array. Part of these results is a finite-dimensional LDP with respect to the numbers of urns with less than d balls for d < ∞ (Theorem 1.2).

As an application of the large deviations results, we obtain an a.s. sample path LLN for the urn counts in terms of a system of coupled ODEs (Corollary 1.7), which, for homogeneous schemes, complements fixed time LLNs mentioned earlier, and gives a different way to derive them aside from the rate equation method mentioned in Example 1.1. Finally, the LLN limit trajectories are shown to have power law-type behavior in terms of bounds (Corollary 1.9), although the general behavior can interpolate between these bounds; see Figure 1.

Interestingly, the infinite-dimensional rate function I_∞ can be calculated on scaled urn ball size path distributions ξ = {⟨ξ_k(t) : k ≥ 0⟩ : 0 ≤ t ≤ 1}. Here, since in our model exactly one ball is added to the urn collection at each microscopic time, ξ_k(t)/(t + c) is the fraction of urns with size k at macroscopic time t ≥ 0, where c = Σ_{k≥0} ξ_k(0) is the initial mass, that is, the scaled initial number of urns. It is natural then to ask which trajectories ξ have finite cost, I_∞(ξ) < ∞.

It turns out "no mass can be lost," that is, all finite cost paths ξ are such that the proportions {ξ_k(t)/(t + c)}_{k≥0} form a probability distribution, Σ_{k≥0} ξ_k(t)/(t + c) ≡ 1. Also, a variety of nonpower law distributions can be achieved with finite rate at any time 0 < t ≤ 1, including the geometric and stretched exponential distributions discussed in Example 1.1.

Intriguingly, on the other hand, "some of the weight may be lost" in certain finite rate trajectories, that is, the scaled mean urn ball size of the system may satisfy a "weight loss" property at a time 0 < t ≤ 1, Σ_{k≥0} k ξ_k(t)/(t + c̃) < 1, where c̃ = Σ_{k≥0} k ξ_k(0) is the scaled initial total urn ball size, even though the pre-limit quantity equals 1 at all steps in the urn growth scheme. We dub a trajectory ξ with this "weight loss" property at some time 0 < t ≤ 1 "condensed." For example, consider the path where ξ_0(t) = t + c and ξ_k(t) ≡ 0 for k ≥ 1, so that Σ_k k ξ_k(t) ≡ 0. It turns out this path, associated with superlinear selection preferential attachment models (cf. Example 1.1), has finite cost.

Moreover, the rate function I_∞ contains a term which measures the cost of "condensation" when some of the flow of urn ball size in the scaling limit escapes toward urns with "infinite" size. In addition, we point out, at any time 0 ≤ t ≤ 1, LLN distributions arising from either sublinear or superlinear selection preferential attachment models may be achieved with finite cost. One might interpret that although the linear selection process is typically "scale-free," since it is between, in a sense, sublinear and superlinear selection models, its atypical degree distribution structure may include the typical behavior of its sub- and superlinear relatives. See Remark 1.5 and Example 1.6 for more details and discussion.

We also remark that the large deviations and other work are, with respect to the process, starting from either "small" or "large" initial configurations, that is, when the initial urn collection has o(n) balls (e.g., finite), or when the size of the collection is on order n, respectively. It appears these initial configurations, which enter into all result statements, have not been considered before, in general.

The main idea for the results is to extend a variational control problem/weak convergence approach of Dupuis and Ellis (cf. [29]) to establish finite-dimensional LDPs in the time-dependent setting. Then, a projective
For the LLN and power-law corollar-ies, a coupled system of ODEs, which governs the typical degree distributionevolution, is identified, and analyzed.To be concrete, we have focused upon models where the network is in-cremented by one urn and one ball each time, which include basic models.However, the methods here should be of use to analyze the large deviationsof the degree structure in other combinatorial models with different incre-ment structure: for instance, the evolving graph model discussed in [16],Chapter 3, where at each time with probability p a new vertex is preferen-tially attached to an old one, and with probability 1 − p , an edge is addedbetween two old nodes selected preferentially, and the BA graphs where,instead of only one vertex, m ≥ Model. Let p ( t ) : [0 , → [0 , 1] and β ( t ) : [0 , → [0 , ∞ ) be given func-tions. We define an urn configuration U as a finite collection of urns, eachurn x ∈ U containing a nonnegative number of balls b x . We now specify anevolving array { U n ( j ) : 0 ≤ j ≤ n } n ≥ of urn configurations by the followingtime-dependent iterative scheme. In the n th row of the array: • Start at step 0, with a given initial urn configuration U n (0). • At step j + 1 ≤ n , to form a new urn configuration U n ( j + 1), we firstcreate and include a new urn with no ball. Then:– with probability p ( j/n ), we place a new ball in this urn;– with probability 1 − p ( j/n ), we place a new ball in one of the other urns x ∈ U n ( j ) with probability b x + β ( j/n ) P y ∈ U n ( j ) ( b y + β ( j/n )) . We will call, for urn x ∈ U n ( j ), the term b x + β ( j/n ) as the “weight” of theurn at time j in the n th row of the process. Let now | U n ( j ) | and B n ( j ) = P x ∈ U n ( j ) b x be the total number of urns and balls in U n ( j ), respectively.Then, the number of urns | U n ( j ) | = | U n (0) | + j and the total number ofballs B n ( j ) = B n (0) + j . 
Also, the total weight of the configuration at time j is

s_n(j) := Σ_{y∈U^n(j)} (b_y + β(j/n)) = (1 + β(j/n)) j + B^n(0) + β(j/n)|U^n(0)|.

The above urn scheme, as discussed in the Introduction, may be viewed in terms of the evolving degree structure in a preferential attachment random graph process with time-dependent selection function w(d; j, n) = d + β(j/n). Here, the step of including a new empty urn and incrementing the number of balls in an old urn corresponds to an edge being placed between a new node, with degree 1, and an old vertex in the existing graph whose degree is consequently incremented. In particular, when p and β are in particular forms, we recover the following models:

(1) "Classical" BA process. When p(t) ≡ 0 and β(t) ≡ 1, the scheme is time-homogeneous. When the initial urn configuration consists of two empty urns, the probability of selecting an urn x with k ≥ 0 balls at time j is (k + 1)/(2(j + 1)), which matches the selection process in the evolution of the degree structure in the BA preferential attachment graph scheme, with selection function w(d) = d as discussed in Example 1.1; here urns with k ≥ 0 balls correspond to vertices with degree d = k + 1 ≥ 1.

(2) When p(t) ≡ 0 and β(t) ≡ β ≥ 0, again the scheme is time-homogeneous, and urns with k ≥ 0 balls correspond to vertices with degree k + 1 ≥ 1. However, now the weight of an urn with k balls is k + β, in a sense "offset" from the classical BA scheme. Correspondingly, the urn selection scheme is the same as the growth process of the degree structure in the preferential attachment model with selection function w(d) = d + α, with α = β − 1.

(3) When p(t) ≡ p and β(t) ≡ β ≥ 0, the evolution of the number of urns of size k ≥ 1 is comparable to that in the r = 1 CHJ model discussed in the Introduction. However, we note, in our model, an empty urn is added at each step with probability 1 − p, and these empty urns are kept track of in our scheme.
When β = 0, the dynamics of urns of size k ≥ 1 is the same as in the r = 1 CHJ model, since the empty urns have no weight, and once created, they cannot be selected to fill in later steps, and do not influence the structure of urns with k ≥ 1 balls.

For n ≥ 1, let Z^n_i(j) be the number of urns in the nth row of the urn array process with i ≥ 0 balls at time 0 ≤ j ≤ n and, for d ≥ 0, let Z̄^n_{d+1}(j) denote the number of urns with more than d balls at time 0 ≤ j ≤ n. These quantities satisfy

Σ_{i=0}^{d} Z^n_i(j) + Z̄^n_{d+1}(j) = |U^n(0)| + j,
Σ_{i=0}^{d} i Z^n_i(j) + (d + 1) Z̄^n_{d+1}(j) ≤ B^n(0) + j.

Define now vectors in R^{d+2},

f^d_0 := ⟨0, 1, 0, ..., 0⟩,
f^d_i := ⟨1, 0, ..., 0, −1, 1, 0, ..., 0⟩, where the −1 is in the (i + 1)th position, for 1 ≤ i ≤ d,
f^d_{d+1} := ⟨1, 0, ..., 0⟩.

For y = ⟨y_0, ..., y_d, y_{d+1}⟩ ∈ R^{d+2} and 0 ≤ i ≤ d + 1, denote

[y]_i := Σ_{l=0}^{i} y_l.

Note that

(1.2)   0 ≤ [f^d]_i ≤ 1 for 0 ≤ i ≤ d, [f^d]_{d+1} = 1 and 0 ≤ Σ_{i=0}^{d+1} (1 − [f^d]_i) ≤ 1.

Consider now the "truncated" degree distribution

{Z^{n,d}(j) := ⟨Z^n_0(j), ..., Z^n_d(j), Z̄^n_{d+1}(j)⟩ : 0 ≤ j ≤ n},

where Z̄^n_{d+1}(j) = Σ_{k≥d+1} Z^n_k(j) = j + |U^n(0)| − Σ_{k=0}^{d} Z^n_k(j), which forms a discrete-time Markov chain with initial state Z^{n,d}(0) corresponding to the initial urn configuration U^n(0) and one-step transition property: Z^{n,d}(j + 1) − Z^{n,d}(j) equals

f^d_0, with probability p(j/n) + (1 − p(j/n)) β(j/n) Z^n_0(j)/s_n(j),
f^d_i, with probability (1 − p(j/n)) (i + β(j/n)) Z^n_i(j)/s_n(j), for 1 ≤ i ≤ d,
f^d_{d+1}, with probability (1 − p(j/n)) (1 − Σ_{i=0}^{d} (i + β(j/n)) Z^n_i(j)/s_n(j)).

We also define the "full" degree distribution

{Z^{n,∞}(j) := ⟨Z^n_0(j), ..., Z^n_d(j), ...⟩ : 0 ≤ j ≤ n},

which is also a Markov chain on R^∞ with increments: Z^{n,∞}(j + 1) − Z^{n,∞}(j) equals

f^∞_0, with probability p(j/n) + (1 − p(j/n)) β(j/n) Z^n_0(j)/s_n(j),
f^∞_i, with probability (1 − p(j/n)) (i + β(j/n)) Z^n_i(j)/s_n(j), for i ≥ 1,

where f^∞_0 = ⟨0, 1, 0, ..., 0, ...⟩ and f^∞_i = ⟨1, 0, ..., 0, −1, 1, 0, ..., 0, ...⟩, with the "−1" being in the (i + 1)th place.

We will assume throughout the following initial condition, which ensures a LLN at time t = 0. With respect to constants c^n_i, c^n, c̃^n ≥ 0, for i ≥ 0, define

c^n_i := (1/n) Z^n_i(0),   c^n := Σ_{i≥0} c^n_i   and   c̃^n := Σ_{i≥0} i c^n_i.

(LIM) For constants c_i, c, c̃ ≥ 0, we have

c_i := lim_{n↑∞} c^n_i   and   c̃ := lim_{n↑∞} c̃^n = Σ_{i≥0} i c_i < ∞.

Consequently, c := lim_{n↑∞} c^n = Σ_{i≥0} c_i < ∞. In the previous sentence, the c^n limit follows from the uniform bound Σ_{i≥A} c^n_i ≤ A^{−1} Σ_{i≥0} i c^n_i → c̃/A. Define also

c̄_d := Σ_{i≥d+1} c_i   and   c^d := ⟨c_0, ..., c_d, c̄_d⟩.

We remark one can classify the initial configurations depending on whether c_i ≡ 0 or c_i > 0 for some i ≥ 0:

• (Small configuration) c_i ≡ 0 for all i ≥ 0. Here, the initial urn configurations are small in that their size is o(n). This is the case when the initial configurations do not depend on n, for instance.
• (Large configuration) c_i > 0 for some i ≥ 0. Here, the initial state is already a partly-developed configuration whose size is of order n.

We also note, when the initial urn configurations correspond to initial tree configurations in the corresponding preferential attachment process, some restrictions in the values of c_i arise. One may verify that a graph with n vertices with degrees d_1, ..., d_n is a tree exactly when Σ_{i=1}^{n} d_i = 2(n − 1). Since, in the nth row, the number of vertices equals n Σ_{k≥0} c^n_k, and the sum of degrees equals n Σ_{k≥0} (k + 1) c^n_k (recall the correspondence between urn sizes and degrees discussed in the Introduction), we have n Σ_{k≥0} (k + 1) c^n_k = 2(n Σ_{k≥0} c^n_k − 1), so that in the limit c̃ = c.

In addition, we note (LIM) specifies an initial limiting degree distribution which has full "weight," or in other words is not "condensed," that is, c̃ = lim_{n↑∞} c̃^n = Σ_{i≥0} i c_i.
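The one-step transition probabilities of the truncated chain Z^{n,d} written above can be sketched directly; illustrative code (function name and arguments are ours, not the paper's), taking a state consistent with the urn/ball counting identities.

```python
def truncated_step_probs(z, j, n, p, beta, U0, B0):
    """Probabilities of the increments f^d_0, ..., f^d_{d+1} for the truncated
    chain Z^{n,d}, given state z = (Z_0, ..., Z_d, Zbar_{d+1}) at step j.

    p, beta: functions on [0, 1]; U0, B0: initial urn and ball counts.
    Assumes z is consistent: sum(z) == U0 + j.
    """
    d = len(z) - 2
    t = j / n
    pt, bt = p(t), beta(t)
    s = (1 + bt) * j + B0 + bt * U0                       # total weight s_n(j)
    probs = [pt + (1 - pt) * bt * z[0] / s]               # f^d_0
    probs += [(1 - pt) * (i + bt) * z[i] / s for i in range(1, d + 1)]  # f^d_i
    probs.append((1 - pt) * (1 - sum((i + bt) * z[i] for i in range(d + 1)) / s))  # f^d_{d+1}
    return probs
```

Since the weights of the urns with at most d balls plus the tail weight exhaust s_n(j), the returned list always sums to 1.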
See Remark 1.8, however, for comments when the initial distribution is "condensed," that is, c̃ > Σ_{i≥0} i c_i.

Our results will be on the family of processes X^{n,d} = {X^{n,d}(t) : 0 ≤ t ≤ 1} and X^{n,∞} = {X^{n,∞}(t) : 0 ≤ t ≤ 1} obtained by linear interpolation of the discrete-time Markov chains (1/n) Z^{n,d}(j) and (1/n) Z^{n,∞}(j), respectively. For 0 ≤ t ≤ 1, let

X^{n,d}(t) := (1/n) Z^{n,d}(⌊nt⌋) + ((nt − ⌊nt⌋)/n) (Z^{n,d}(⌊nt⌋ + 1) − Z^{n,d}(⌊nt⌋)),
X^{n,∞}(t) := (1/n) Z^{n,∞}(⌊nt⌋) + ((nt − ⌊nt⌋)/n) (Z^{n,∞}(⌊nt⌋ + 1) − Z^{n,∞}(⌊nt⌋)).

The trajectories X^{n,d} lie in C([0,1], R^{d+2}), and are Lipschitz with constant at most 1, satisfying X^{n,d}(0) = (1/n) Z^{n,d}(0). On the other hand, the infinite distribution X^{n,∞} ∈ Π_{i=0}^{∞} C([0,1], R), considered with the product topology, where X^{n,∞}(0) = (1/n) Z^{n,∞}(0). In both cases, X^{n,d}(t) and X^{n,∞}(t) are not necessarily probabilities, because we do not normalize by the total mass; they are, however, finite distributions.

We now specify the assumptions on p(t) and β(t) used for the main results.

(ND) p and β are piecewise continuous and, for constants p_1, β_0 and β_1,

0 ≤ p(·) ≤ p_1 < 1,   0 < β_0 ≤ β(·) ≤ β_1 < ∞.

We discuss (ND) more in the remark after Theorem 1.2.

We note, throughout the article, that we use the conventions

(1.3)   0 log 0 = 0, 0/0 = 0 · (±∞) = 1/∞ = 0, ±1/0 = ±∞   and   E[X; A] = ∫_A X dP.

Results. We now recall the statement of a large deviation principle (LDP). A sequence {X_n} of random variables taking values in a complete separable metric space V satisfies the LDP with rate n and good rate function J : V → [0, ∞] if for each M < ∞, the level set {x ∈ V | J(x) ≤ M} is a compact subset of V, that is, J has compact level sets, and if the following two conditions hold:

(i) Large deviation upper bound. For each closed subset F of V,

lim sup_{n→∞} (1/n) log P{X_n ∈ F} ≤ − inf_{x∈F} J(x).

(ii) Large deviation lower bound. For each open subset G of V,

lim inf_{n→∞} (1/n) log P{X_n ∈ G} ≥ − inf_{x∈G} J(x).

For d ≥ 0, we now state the LDP for {X^{n,d}(t) : 0 ≤ t ≤ 1}. Define the function I_d : C([0,1], R^{d+2}) → [0, ∞] given by

I_d(ϕ) = ∫_0^1 [ (1 − [ϕ̇(t)]_0) log( (1 − [ϕ̇(t)]_0) / ( p(t) + (1 − p(t)) β(t) ϕ_0(t)/((1 + β(t))t + c̃ + cβ(t)) ) )

+ Σ_{i=1}^{d} (1 − [ϕ̇(t)]_i) log( (1 − [ϕ̇(t)]_i) / ( (1 − p(t)) (i + β(t)) ϕ_i(t)/((1 + β(t))t + c̃ + cβ(t)) ) )

+ (1 − Σ_{i=0}^{d} (1 − [ϕ̇(t)]_i)) log( (1 − Σ_{i=0}^{d} (1 − [ϕ̇(t)]_i)) / ( (1 − p(t))(1 − Σ_{i=0}^{d} (i + β(t)) ϕ_i(t)/((1 + β(t))t + c̃ + cβ(t))) ) ) ] dt,

when ϕ(0) = c^d, ϕ_i ≥ 0 and 0 ≤ [ϕ̇(t)]_i ≤ 1 for 0 ≤ i ≤ d, Σ_{i=0}^{d+1} ϕ̇_i(t) = 1, Σ_{i=0}^{d} (1 − [ϕ̇(t)]_i) = Σ_{i=0}^{d+1} i ϕ̇_i(t) ≤ 1 for a.e. t, and the integral converges; otherwise, I_d(ϕ) = ∞. It will turn out that I_d is convex and is a good rate function.

To explain the last condition in the definition of I_d, note that ϕ_{d+1}(t) represents the fraction of urns with size at least d + 1, so that (d + 1)ϕ_{d+1}(t) is the truncated fraction of balls in these urns. Since the process increments by one ball at each step, it makes sense to specify Σ_{i=0}^{d} (1 − [ϕ̇(t)]_i) = Σ_{i=0}^{d+1} i ϕ̇_i(t) ≤ 1, so that Σ_{i=0}^{d+1} i ϕ_i(t) ≤ t + c̃ if I_d(ϕ) < ∞.

The rate function can be understood as follows: in order for X^{n,d} to deviate to ϕ, at time t, the process should behave as if the increment probabilities v_i of f^d_i are such that the mean Σ_{i=0}^{d} v_i f^d_i + v_{d+1} f^d_{d+1} = ϕ̇. In the proof of Theorem 1.2, we show v_i = 1 − [ϕ̇]_i for 0 ≤ i ≤ d and v_{d+1} = 1 − Σ_{j=0}^{d} (1 − [ϕ̇]_j).
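The comparison of the two increment laws just described, the tilted probabilities v and the natural evolution probabilities u, is a relative entropy; the following illustrative sketch (our names, with p and β taken constant for simplicity) computes the natural increment law at a state and a generic relative entropy R(µ‖ν).

```python
import math

def relative_entropy(mu, nu):
    """R(mu || nu) = sum_i mu_i log(mu_i / nu_i), with the convention 0 log 0 = 0."""
    r = 0.0
    for m, v in zip(mu, nu):
        if m > 0.0:
            if v <= 0.0:
                return math.inf         # mu not absolutely continuous w.r.t. nu
            r += m * math.log(m / v)
    return r

def natural_increments(phi, t, p, beta, c=0.0, ctilde=0.0):
    """Natural increment probabilities u_0, ..., u_{d+1} at state
    phi = (phi_0, ..., phi_d, phi_{d+1}) and time t, for constant p, beta."""
    d = len(phi) - 2
    w = (1.0 + beta) * t + ctilde + c * beta        # scaled total weight
    u = [p + (1.0 - p) * beta * phi[0] / w]
    u += [(1.0 - p) * (i + beta) * phi[i] / w for i in range(1, d + 1)]
    u.append((1.0 - p) * (1.0 - sum((i + beta) * phi[i] for i in range(d + 1)) / w))
    return u
```

The integrand of I_d at time t is then R(v‖u) with v_i = 1 − [ϕ̇(t)]_i, matching the relative entropy interpretation discussed in the text.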
But the natural evolution increment probabilities u_i, given the process is in state ϕ(t), are

u_0 = p(t) + (1 − p(t)) β(t) ϕ_0(t)/((1 + β(t))t + c̃ + cβ(t)),
u_i = (1 − p(t)) (i + β(t)) ϕ_i(t)/((1 + β(t))t + c̃ + cβ(t)) for 1 ≤ i ≤ d, and
u_{d+1} = (1 − p(t)) (1 − Σ_{i=0}^{d} (i + β(t)) ϕ_i(t)/((1 + β(t))t + c̃ + cβ(t))).

Then I_d is the time integral of the relative entropies of these two increment probability measures. Recall, for probability measures µ and ν, that the relative entropy of µ with respect to ν is defined as

R(µ‖ν) := ∫ log(dµ/dν) dµ if µ ≪ ν, and R(µ‖ν) := ∞ otherwise.

Theorem 1.2 (Finite-dimensional LDP). The C([0,1], R^{d+2})-valued sequence {X^{n,d}} satisfies an LDP with rate n and convex, good rate function I_d.

Remark 1.3. We now make comments on the underlying assumption (ND) and obtain the rate function at the fixed time t = 1.

(A) The assumption (ND) specifies that the process considered is "nondegenerate" in some sense. (ND) does not cover some "boundary" cases: for instance, when p(t) ≡ 1, the process is deterministic in that at each time one places a new ball in a new urn. Also, when β(t) ≡ 0, urns without a ball have no weight; and, if in addition p(t) ≡ 0, then all new balls are placed into urns in the initial configuration. Although an LDP should hold in these and other "less degenerate" cases, the form of the rate function may differ in that some increments may not be possible.

On the other hand, assumption (ND) is natural with respect to the convergence estimates needed for the proof of the lower bound in the LDP. However, we point out the LDP upper bound holds without any of the boundedness assumptions on p(·) and β(·) in (ND).

Formally, when β(t) ≡ ∞, this is the case of "uniform," as opposed to preferential, selection of urns.
The limit lim_{β↑∞} I_d corresponds to the rate function for this type of dynamic.

(B) One recovers the LDP at a fixed time, say t = 1, by the contraction principle with respect to the continuous function F : C([0,1], R^{d+2}) → R^{d+2} defined by F(ϕ) = ϕ(1), so that F(X^{n,d}) = X^{n,d}(1) = (1/n) Z^{n,d}(n). Then Theorem 1.2 implies the LDP for (1/n) Z^{n,d}(n) with rate function given by the variational expression K(x) = inf{I_d(ϕ) : ϕ(0) = c^d, ϕ(1) = x}, which might be evaluated numerically; cf. [13] for calculations when d = 0.

We now extend the finite-dimensional LDP results to the infinite-dimensional case (d = ∞). Define, for ξ ∈ Π_{i=0}^{∞} C([0,1], R), the function

I_∞(ξ) = ∫_0^1 lim_{d→∞} [ (1 − [ξ̇(t)]_0) log( (1 − [ξ̇(t)]_0) / ( p(t) + (1 − p(t)) β(t) ξ_0(t)/((1 + β(t))t + c̃ + cβ(t)) ) )

+ Σ_{i=1}^{d} (1 − [ξ̇(t)]_i) log( (1 − [ξ̇(t)]_i) / ( (1 − p(t)) (i + β(t)) ξ_i(t)/((1 + β(t))t + c̃ + cβ(t)) ) )

+ (1 − Σ_{i=0}^{d} (1 − [ξ̇(t)]_i)) log( (1 − Σ_{i=0}^{d} (1 − [ξ̇(t)]_i)) / ( (1 − p(t))(1 − Σ_{i=0}^{d} (i + β(t)) ξ_i(t)/((1 + β(t))t + c̃ + cβ(t))) ) ) ] dt,

when ξ_i(0) = c_i, ξ_i(t) ≥ 0 and 0 ≤ [ξ̇(t)]_i ≤ 1 for i ≥ 0, (d/dt) Σ_{i=0}^{∞} ξ_i(t) = 1 and lim_d [Σ_{i=0}^{d} i ξ̇_i(t) + (d + 1)(1 − [ξ̇(t)]_d)] = Σ_{i=0}^{∞} (1 − [ξ̇(t)]_i) ≤ 1 for a.e. t, and the integral converges; otherwise I_∞(ξ) = ∞. It will turn out through a projective limit approach (cf. [20], Section 4.6) that I_∞ is well defined, convex and a good rate function, and that the integrand limit exists because the term in square brackets is increasing in d.

Theorem 1.4 (Infinite-dimensional LDP). The Π_{i=0}^{∞} C([0,1], R)-valued sequence {X^{n,∞}} satisfies an LDP with rate n and convex, good rate function I_∞.

Remark 1.5.
From the result, degree distributions not fully supported on the nonnegative integers, that is, when Σ_{i≥0} ϕ_i(t) < t + c or, in other words, when the distribution specifies a positive fraction of urns with an infinite number of balls, cannot be achieved with finite cost in the evolution process. This stabilization of the "mass" is understood as follows. The fraction of urns with size larger than A at time ⌊nt⌋ is bounded in terms of the fraction of balls in the system: Σ_{k≥A} Z^n_k(⌊nt⌋)/n ≤ A^{−1} Σ_{k≥0} k Z^n_k(⌊nt⌋)/n ≤ A^{−1}(⌊nt⌋/n + c̃^n) ≤ A^{−1}(1 + 2c̃) for all large n. Hence, for all realizations of the process, the fraction of infinite sized urns vanishes.

On the other hand, it seems some fraction of the total "weight" can indeed be lost in the evolution process with finite rate, that is, it may be possible to achieve a degree distribution at a time 0 < t ≤ 1 with Σ_{i≥0} i ξ_i(t) < t + c̃, although in the pre-limit Σ_{i=0}^{∞} i Z^n_i(⌊nt⌋)/n = ⌊nt⌋/n + c̃^n. The interpretation is that it is possible to put a positive fraction of the balls into a few very large urns with finite cost, a sort of "condensation" effect noticed in the limiting evolution when the selection function is superlinear, as mentioned in Example 1.1.

The last term in the integrand of the rate function, corresponding to the increment f^d_{d+1}, measures the cost of choosing urns with very large size. In the d ↑ ∞ limit, this last term may be viewed as the cost of "escape" of weight from urns with bounded size, or, in other words, the cost of the increment "⟨1, 0, ..., 0, ...⟩," which corresponds to a new empty urn being included and very large sized urns being incremented. Some "condensed" finite rate evolutions are discussed in Example 1.6. However, on the other hand, this type of "weight" loss or "condensation" cannot happen in the typical evolution; see Corollary 1.7.

Example 1.6.
Consider the "classical" BA model which follows the evolution of a random graph with preferential attachment selection function w(d) = d, noted in Example 1.1 and Section 1.1, which corresponds to the urn system when β(t) ≡ 1 and p(t) ≡ 0. Suppose that the initial configurations satisfy c_i = 0 for all i ≥ 0, so that c = c̃ = 0. Consider the linear path ξ(t) = tγ, where γ = ⟨γ_i : i ≥ 0⟩ and the constants γ_i ≥ 0 satisfy

Σ_{i≥0} γ_i = 1   and   Σ_{i≥0} i γ_i = Σ_{i≥0} (1 − [γ]_i) ≤ 1.

Since ξ(t) is linear in t, calculation of the rate I_∞(ξ) simplifies considerably, and one evaluates the limit of the last term in the integrand of I_∞(ξ) as the time-independent quantity

lim_{d↑∞} (1 − Σ_{i=0}^{d} (1 − [ξ̇(t)]_i)) log( (1 − Σ_{i=0}^{d} (1 − [ξ̇(t)]_i)) / (1 − (Σ_{i=0}^{d} (i + 1) ξ_i(t))/(2t)) ) = (1 − Σ_{i≥0} i γ_i) log 2,

which gives the cost of the "increment ⟨1, 0, ..., 0, ...⟩" when the dynamics attaches new vertices to very large hubs or places balls into already very large urns. This cost is positive if Σ_{i≥0} i γ_i < 1, and, as discussed in the remark above, corresponds to the cost of forming urns/nodes with very large size/degree in the evolution process, a "condensation" effect. It follows then

(1.4)   I_∞(ξ) = Σ_{i≥0} (1 − [γ]_i) log( (1 − [γ]_i) / ((i + 1)γ_i/2) ) + (1 − Σ_{i≥0} i γ_i) log 2.

In the case γ_0 = 1 and γ_i = 0 for i ≥ 1, one observes I_∞(ξ) = log 2, and one can associate a graph evolution to achieve this degree or size distribution. For instance, one may grow a "star" tree configuration where all new vertices connect to the same vertex, or all balls are put in the same urn.
If initially there are only two vertices with degree 1, or two empty urns, then the "star" configuration at the $n$th step has probability $2^{-n}$ of occurring. As the degree/size structure at time $n$ consists of $n$ leaves/empty urns and one vertex with degree $n$, or one urn with size $n-1$, one observes that the LLN limit for the degree/size sequence is $\xi(t) = t\gamma$, from which the rate evaluation follows.

As discussed in Example 1.1, this "condensed" configuration is the limit tree with respect to the superlinear selection function $w(d) = d^r$ for $r > 2$. Moreover, as noted in the Introduction, all preferential attachment evolutions with respect to superlinear selection functions $w(d) = d^r$, $r > 1$, condense in the limit, that is, $EZ_i(n)/n \to \gamma_i$, where $\gamma_0 = 1$ and $\gamma_i = 0$ for $i \ge 1$. When $\gamma$ is supported only on a finite number of indices, one sees that $I_\infty(\xi) < \infty$ exactly when there exists $i^* \ge 0$ such that $\gamma_i > 0$ for $i \le i^*$ and $\gamma_i = 0$ for $i > i^*$.

In particular, the "straight road" evolution, leading to trees where all nodes have degree 2 except for two leaves, or urn configurations consisting of single-ball urns except for two empty urns, has infinite cost: start with two vertices with degree 1, or two empty urns. At step $j+1$, connect a new vertex to one of the two leaves, or add an empty urn and place a ball in one of the two empty urns in the configuration formed at step $j$. This configuration at time $n$ has probability $1/n!$ of occurring, and in the LLN limit corresponds to $\xi(t) = t\gamma$, where $\gamma_0 = 0$, $\gamma_1 = 1$ and $\gamma_i = 0$ for $i \ge 2$, for which $I_\infty(\xi) = \infty$.

Even when no weight escapes, that is, $\sum_{i\ge 0} i\gamma_i = 1$, it may be noted that deviations to non-power-law urn size paths $\xi$ are possible with finite rate. For instance, when $\gamma_i = 2^{-(i+1)}$ for $i \ge 0$, $I_\infty(\xi) = -\sum_{i\ge 0} 2^{-(i+1)}\log\frac{i+1}{2}$.
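The $2^{-n}$ probability of the "star" evolution, and with it the rate $\log 2$, can be reproduced directly; the following is a small illustrative computation (function name ours):

```python
import math

def star_probability(n):
    """Probability that n successive new vertices all attach to the same hub
    under linear preferential attachment w(d) = d, starting from two
    vertices of degree 1 (total degree 2 + 2j after j steps)."""
    p, hub, total = 1.0, 1, 2
    for _ in range(n):
        p *= hub / total   # attach to the hub with prob deg(hub)/(total degree)
        hub += 1
        total += 2         # each new edge adds 2 to the total degree
    return p

n = 200
rate = -math.log(star_probability(n)) / n
print(rate, math.log(2))   # the empirical rate is exactly log 2 here
```

Each attachment to the hub has probability $(1+j)/(2+2j) = 1/2$, so $-\frac{1}{n}\log 2^{-n} = \log 2$, in agreement with the evaluation $I_\infty(\xi) = \log 2$.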
When $\gamma_i = q(i+1)$ for $i \ge 0$, with $q$ in the form of the stretched exponential in (1.1), the LLN limit for the degree distribution with respect to sublinear selection preferential attachment, a calculation verifies that $\sum_{i\ge 0} i\gamma_i = 1$ and also $I_\infty(\xi) < \infty$.

We now turn to the LLN behavior, which corresponds to the "zero-cost" trajectory. Consider the system of ODEs for $\varphi^d = \langle\varphi_0,\ldots,\varphi_{d+1}\rangle$, with initial condition $\varphi^d(0) = c^d$:
$$\dot\varphi_0(t) = 1 - p(t) - (1 - p(t))\frac{\beta(t)\varphi_0(t)}{(1+\beta(t))t + \tilde c + c\beta(t)},$$
$$\dot\varphi_1(t) = p(t) + (1 - p(t))\frac{\beta(t)\varphi_0(t)}{(1+\beta(t))t + \tilde c + c\beta(t)} - (1 - p(t))\frac{(1+\beta(t))\varphi_1(t)}{(1+\beta(t))t + \tilde c + c\beta(t)}, \tag{1.5}$$
$$\dot\varphi_i(t) = (1 - p(t))\frac{(i-1+\beta(t))\varphi_{i-1}(t)}{(1+\beta(t))t + \tilde c + c\beta(t)} - (1 - p(t))\frac{(i+\beta(t))\varphi_i(t)}{(1+\beta(t))t + \tilde c + c\beta(t)} \quad\text{for } 2 \le i \le d,$$
$$\dot\varphi_{d+1}(t) = 1 - \sum_{i=0}^d \dot\varphi_i(t).$$
Recall that a "Carathéodory" solution is an absolutely continuous function satisfying the ODEs for a.a. $t$, and the initial condition, or equivalently a function satisfying the integral equation associated to the ODEs. One can readily integrate ODEs (1.5) and find a Carathéodory solution $\zeta^d(t) = \langle\zeta_0(t),\ldots,\zeta_d(t),\bar\zeta_{d+1}(t)\rangle$ [see formula (4.1)], which is unique by the following result. One extends to the "$d = \infty$" setting by defining $\zeta^\infty(t) := \langle\zeta_0(t),\ldots,\zeta_d(t),\ldots\rangle$. We now state a LLN for $X^{n,d}$ and $X^{n,\infty}$ as a consequence of the LDP upper bound. As remarked in the Introduction, this LLN may also be obtained by rate equation formulations as in [36] and [30], Chapter 4.

Corollary 1.7 (LLN). For $d \ge 0$, $\zeta^d$ is the unique Carathéodory solution to ODEs (1.5) with the initial condition $\varphi^d(0) = c^d$, and also $I_d(\zeta^d) = 0$. Then, in the sup topology on $C([0,1],\mathbb{R}^{d+2})$, $X^{n,d}(\cdot) \to \zeta^d(\cdot)$ a.s.

Fig. 1. The thick curves are the (numerical) LLN ODE paths at times $t = 0.25, 0.5, 1$, computed with $p(t) \equiv 0$, $\beta(t) = 8$ for $t < 0.5$ and a different constant for $t \ge 0.5$, and $c_k \equiv 0$.
Dashed lines are straight lines whose slopes indicate the two bounding power laws. The plots use log–log scales.

As a consequence, we have in the product topology on $\prod_{i=0}^\infty C([0,1],\mathbb{R})$ that $X^{n,\infty}(\cdot) \to \zeta^\infty(\cdot)$ a.s. Moreover, $\sum_{i=0}^\infty \zeta_i(t) = t + c$ and $\sum_{i=0}^\infty i\zeta_i(t) = t + \tilde c$, and hence no "weight" is lost in the LLN limit.

Remark 1.8. The last equality, $\sum_{i\ge 0} i\zeta_i(t) = t + \tilde c$, requires the condition in (LIM) that the initial scaled degree distribution is not "condensed," that is, $\tilde c = \lim_{n\uparrow\infty}\tilde c_n = \sum_{i\ge 0} ic_i$. When the initial distribution is "condensed," that is, a strict Fatou limit $\tilde c = \lim_{n\uparrow\infty}\tilde c_n > \sum_{i\ge 0} ic_i$ occurs, the large deviation results Theorems 1.2, 1.4 and Corollary 1.7 (except for the last equality) still hold with the same notation and proofs. However, one can show, by similar arguments as for the proof of the last equality in Corollary 1.7, that the LLN trajectory $\zeta^\infty$ will now be "condensed," that is, $s(t) = \sum_{i\ge 0} i\zeta_i(t) < t + \tilde c$ for $t \ge 0$. Moreover, for a constant $C = C(c, \tilde c, p_1, \beta_0, \beta_1) > 0$, one can see for all large $t$ that
$$C\Biggl(\tilde c - \sum_{i\ge 0} ic_i\Biggr) t^{(1-p_1)/(1+\beta_1)} \;\le\; t + \tilde c - s(t) \;\le\; C^{-1}\Biggl(\tilde c - \sum_{i\ge 0} ic_i\Biggr) t^{1/(1+\beta_0)}.$$

We now consider the "scale-freeness" of $\zeta^\infty$. Although it seems difficult to control each $\zeta_i$, nevertheless $\zeta^\infty$ has "power law" behavior, in terms of bounds on $[\zeta^\infty]_i$. In general, it appears $\zeta^\infty$ can interpolate between the bounds (cf. Figure 1; as a curiosity, we note a figure with a similar "bend" is found in [33] with respect to Facebook social network data).

Corollary 1.9 (Power law). Assume $0 \le p_{\min} \le p(\cdot) \le p_1 =: p_{\max} < 1$, and $0 < \beta_0 =: \beta_{\min} \le \beta(\cdot) \le \beta_{\max} := \beta_1$. Then, $\zeta^\infty$ is bounded between two power laws:

For small configurations, for example, $c_k \equiv 0$, we have, for $i \ge 0$ and $t \ge 0$, $[\eta']_i t \le [\zeta^\infty(t)]_i \le [\eta]_i t$.
For large configurations, for example, $c_k > 0$ for some $k \ge 1$, we have, for $i \ge 0$, $[\eta']_i(t + o(1)) \le [\zeta^\infty(t)]_i \le [\eta]_i(t + o(1))$ as $t \uparrow \infty$. Here, with respect to positive constants $C, C'$ depending on $p$ and $\beta$,
$$\eta'_i := C' i^{-(1+\beta_{\min})/(1-p_{\min})}(1 + o(1)) \quad\text{and}\quad \eta_i := C i^{-(1+\beta_{\max})/(1-p_{\max})}(1 + o(1)).$$

The outline of the paper is as follows: in Sections 2 and 3, we prove the finite and infinite-dimensional LDPs, Theorems 1.2 and 1.4. In Section 4, we prove the law of large numbers (Corollary 1.7). Finally, in Section 5, we discuss power-law behavior (Corollary 1.9).

2. Proof of Theorem 1.2. We follow the method and notation of Dupuis and Ellis in [29]; see also [52]. Some steps are similar to those in [13], where the "leaves" in a more simplified graph scheme are considered. However, as many things differ in our model, in the upper bound, and especially in the lower bound proof, we present the full argument.

We now fix $0 \le d < \infty$ and equip $\mathbb{R}^{d+2}$ with the $L^1$-norm, denoted by $|\cdot|$. Recall, from assumption (LIM), $c^{n,d} = (c^n_0,\ldots,c^n_d,\bar c^{n,d}) := \frac{1}{n}Z^{n,d}(0) \to c^d$, where $\bar c^{n,d} = \sum_{i\ge d+1} c^n_i$. Denote $\vec\xi(n,t) := (p^n(t), \beta^n(t), \sigma^n(t))$, where
$$p^n(t) := p(\lfloor nt\rfloor/n), \qquad \beta^n(t) := \beta(\lfloor nt\rfloor/n), \qquad \sigma^n(t) := \frac{1}{n}s^n(\lfloor nt\rfloor) = (1+\beta^n(t))\frac{\lfloor nt\rfloor}{n} + \tilde c_n + c_n\beta^n(t).$$
Let
$$\sigma(t) := (1+\beta(t))t + \tilde c + c\beta(t), \qquad \vec\xi(t) := (p(t), \beta(t), \sigma(t)).$$
We note that, as $n \to \infty$, since $p(t)$ and $\beta(t)$ are piecewise continuous, $\vec\xi(n,t) \to \vec\xi(t)$ for almost all $t \in [0,1]$. In the remainder of the section, when the context is clear, we often drop the superscript $d$ to save on notation. Recall $X^n(j) := \frac{1}{n}Z^{n,d}(j)$, $X^n(0) = c^{n,d}$ and $X^n(j+1) = X^n(j) + \frac{1}{n}y^n_{X^n(j)}(j)$, where $y^n_x(j)$ has distribution $\rho_{\vec\xi(n,j/n),x}$. Here, for $x = \langle x_0, \ldots$
$\ldots, x_d, x_{d+1}\rangle \in \mathbb{R}^{d+2}$ such that $x_i \ge 0$ for $0 \le i \le d+1$, numbers $p' \in [0,1]$ and $\beta', \sigma' \ge 0$ with $\sum_{i=0}^{d+1}(i+\beta')x_i \le \sigma'$, and Borel $A \subset \mathbb{R}^{d+2}$,
$$\rho_{(p',\beta',\sigma'),x}(A) := \biggl(p' + (1-p')\frac{\beta' x_0}{\sigma'}\biggr)\delta_{f_0}(A) + \sum_{i=1}^d (1-p')\frac{(i+\beta')x_i}{\sigma'}\delta_{f_i}(A) + (1-p')\biggl(1 - \sum_{i=0}^d \frac{(i+\beta')x_i}{\sigma'}\biggr)\delta_{f_{d+1}}(A).$$
We note, when $\sigma' = 0$ and $x = \langle 0,\ldots,0\rangle$, by convention $0/0 := 0$ and
$$\rho_{(p',\beta',0),x}(A) := p'\delta_{f_0}(A) + (1-p')\delta_{f_{d+1}}(A).$$
From (1.2) and (LIM), for $A > 0$, the paths $X^n(t) = X^{n,d}(t)$, for all large $n$, belong to
$$\Gamma^{d,A} := \Biggl\{\varphi \in C([0,1],\mathbb{R}^{d+2}) \Bigm| |\varphi(0) - c^d| \le A,\ \varphi_i \text{ is Lipschitz with bound } 1,\ 0 \le [\dot\varphi(t)]_i \le 1 \text{ for } 0 \le i \le d+1, \text{ and } \sum_{i=0}^{d+1}\dot\varphi_i(t) = 1,\ \sum_{i=0}^{d+1} i\dot\varphi_i(t) = \sum_{i=0}^d (1 - [\dot\varphi(t)]_i) \le 1 \text{ for a.a. } t\Biggr\}. \tag{2.1}$$
Here, we equip $C([0,1],\mathbb{R}^{d+2})$ with the supremum norm. Let $h : C([0,1],\mathbb{R}^{d+2}) \to \mathbb{R}$ be a bounded continuous function. Let also
$$W^n := -\frac{1}{n}\log E\{\exp[-nh(X^n)]\}.$$
To prove Theorem 1.2, we need to establish Laplace principle upper and lower bounds (cf. [29], Section 1.2), namely the upper bound
$$\liminf_{n\to\infty} W^n \ge \inf_{\varphi \in C([0,1],\mathbb{R}^{d+2})}\{I_d(\varphi) + h(\varphi)\}$$
for a good rate function $I_d$, and the lower bound
$$\limsup_{n\to\infty} W^n \le \inf_{\varphi \in C([0,1],\mathbb{R}^{d+2})}\{I_d(\varphi) + h(\varphi)\}.$$
Given $X^n(0) = c^{n,d}$, define, for $1 \le j \le n$,
$$W^n(j,\{x_1,\ldots,x_j\}) := -\frac{1}{n}\log E\{\exp[-nh(X^n)] \mid X^n(1) = x_1, \ldots, X^n(j) = x_j\}$$
and $W^n := W^n(0,\varnothing) = -\frac{1}{n}\log E\{\exp[-nh(X^n)]\}$.

The Dupuis–Ellis method stems from the following discussion. From the Markov property, for $1 \le j \le n-1$,
$$e^{-nW^n(j,\{x_1,\ldots,x_j\})} = E\{e^{-nh(X^n)} \mid X^n(1) = x_1, \ldots, X^n(j) = x_j\}$$
$$= E\{E\{e^{-nh(X^n)} \mid X^n(1),\ldots,X^n(j+1)\} \mid X^n(1) = x_1, \ldots, X^n(j) = x_j\}$$
$$= E\{e^{-nW^n(j+1,\{X^n(1),\ldots,X^n(j),X^n(j+1)\})} \mid X^n(1) = x_1, \ldots$$
$\ldots, X^n(j) = x_j\}$
$$= \int_{\mathbb{R}^{d+2}} e^{-nW^n(j+1,\{x_1,\ldots,x_j,x_j + y/n\})}\,\rho_{\vec\xi(n,j/n),x_j}(dy).$$
Recall the definition of relative entropy near Theorem 1.2. Then, by the variational formula for relative entropy (cf. [29], Proposition 1.4.2), for $1 \le j \le n-1$,
$$W^n(j,\{x_1,\ldots,x_j\}) = -\frac{1}{n}\log\int_{\mathbb{R}^{d+2}} e^{-nW^n(j+1,\{x_1,\ldots,x_j,x_j+y/n\})}\,\rho_{\vec\xi(n,j/n),x_j}(dy)$$
$$= \inf_\mu\biggl\{\frac{1}{n}R(\mu \,\|\, \rho_{\vec\xi(n,j/n),x_j}) + \int_{\mathbb{R}^{d+2}} W^n\Bigl(j+1,\Bigl\{x_1,\ldots,x_j,x_j + \frac{1}{n}y\Bigr\}\Bigr)\mu(dy)\biggr\}.$$
We also have the terminal condition $W^n(n,\{x_1,\ldots,x_n\}) = h(x(\cdot))$, where $x(\cdot)$ is the linear interpolated path connecting $\{(j/n, x_j)\}_{0\le j\le n}$.

We may understand these dynamic programming equations and terminal conditions in terms of a particular stochastic control problem. Define:

(i) $L_j = (\mathbb{R}^{d+2})^j$, the state space on which $W^n(j,\cdot)$ is defined;
(ii) $U = \mathcal{P}(\mathbb{R}^{d+2})$, where $\mathcal{P}(B)$ is the space of probabilities on $B$, the control space over which the infimum is taken;
(iii) for $j = 0,\ldots,n-1$, "controls" $v^n_j(dy) = v^n_j(dy \mid x_1,\ldots,x_j)$, which are stochastic kernels on $\mathbb{R}^{d+2}$ given $(\mathbb{R}^{d+2})^j$;
(iv) $\{\bar X^n(j); 0 \le j \le n\}$, the "controlled" process, which is the adapted path satisfying $\bar X^n(0) = c^{n,d}$ and $\bar X^n(j+1) = \bar X^n(j) + \frac{1}{n}\bar Y^n(j)$ for $0 \le j \le n-1$, where $\bar Y^n(j)$, given $(\bar X^n(0),\ldots,\bar X^n(j))$, has distribution $v^n_j(\cdot)$ [i.e., $\bar P\{\bar Y^n(j) \in dy \mid \bar X^n(0),\ldots,\bar X^n(j)\} := v^n_j(dy \mid \bar X^n(0),\ldots,\bar X^n(j))$], and $\bar X^n(\cdot)$ is the piecewise linear interpolation of $(j/n, \bar X^n(j))$;
(v) "running costs" $C_j(v) = \frac{1}{n}R(v \| \rho)$ for $v \in \mathcal{P}(\mathbb{R}^{d+2})$; and
(vi) "terminal cost" equal to the function $h$.

Also, define, for $0 \le j \le n-1$, the minimal cost function $V^n(j,\{x_1, \ldots$
$\ldots, x_j\})$
$$= \inf_{\{v^n_i\}} \bar E_{j,x_1,\ldots,x_j}\Biggl\{\frac{1}{n}\sum_{i=j}^{n-1} R(v^n_i(\cdot) \,\|\, \rho_{\vec\xi(n,i/n),\bar X^n(i)}) + h(\bar X^n(\cdot))\Biggr\},$$
where $v^n_i(\cdot) = v^n_i(\cdot \mid \bar X^n(0),\ldots,\bar X^n(i))$, and the infimum is taken over all control sequences $\{v^n_i\}$. Here, $\bar E_{j,x_1,\ldots,x_j}$ denotes expectation, with respect to the adapted process $\bar X^n(\cdot)$ associated to $\{v^n_i\}$, conditioned on $\bar X^n(1) = x_1, \ldots, \bar X^n(j) = x_j$. The boundary conditions are $V^n(n,\{x_1,\ldots,x_n\}) = h(x(\cdot))$ and
$$V^n := V^n(0,\varnothing) = \inf_{\{v^n_j\}} \bar E\Biggl\{\frac{1}{n}\sum_{j=0}^{n-1} R(v^n_j(\cdot) \,\|\, \rho_{\vec\xi(n,j/n),\bar X^n(j)}) + h(\bar X^n(\cdot))\Biggr\}. \tag{2.2}$$
It turns out that $\{V^n(j,\{x_1,\ldots,x_j\}) : 0 \le j \le n\}$ also satisfies the dynamic programming equations and terminal condition, and since these equations have unique solutions (cf. [29], Section 3.2), we may conclude by [29], Corollary 5.2.1, that
$$W^n = -\frac{1}{n}\log E\{\exp[-nh(X^n(\cdot))]\} = V^n.$$

2.1. Upper bound. To prove the upper bound, it will be helpful to put the controls $\{v^n_j\}$ into continuous-time paths. Let $v^n(dy \mid t) := v^n_j(dy)$ for $t \in [j/n,(j+1)/n)$, $j = 0,\ldots,n-1$, and $v^n(dy \mid 1) := v^n_{n-1}$. Define
$$v^n(A \times B) := \int_B v^n(A \mid t)\,dt$$
for Borel $A \subset \mathbb{R}^{d+2}$ and $B \subset [0,1]$, and $\tilde X^n(t) := \bar X^n(j)$ for $t \in [j/n,(j+1)/n)$, $0 \le j \le n-1$, and $\tilde X^n(1) := \bar X^n(n-1)$. Then
$$W^n = V^n = \inf_{\{v^n_j\}} \bar E\biggl\{\int_0^1 R(v^n(\cdot \mid t) \,\|\, \rho_{\vec\xi(n,t),\tilde X^n(t)})\,dt + h(\bar X^n)\biggr\}.$$
Given that $\rho_{\vec\xi,x}$ is supported on $K := \{f_0, f_1, \ldots, f_{d+1}\}$, if $\{v^n_j\}$ is not supported on $K$, then $R(v^n \| \rho_{\vec\xi,x}) = \infty$. Since $|V^n| \le \|h\|_\infty < \infty$ and $K \subset \mathbb{R}^{d+2}$ is compact, for each $n$, there is $\{v^n_j\}$ supported on $K$ and corresponding $v^n(dy \times dt) = v^n(dy \mid t) \times dt$ such that, for $\varepsilon > 0$,
$$W^n + \varepsilon = V^n + \varepsilon \ge \bar E\biggl\{\int_0^1 R(v^n(\cdot \mid t) \,\|\, \rho_{\vec\xi(n,t),\tilde X^n(t)})\,dt + h(\bar X^n)\biggr\}. \tag{2.3}$$
Recall that $\bar X^n(\cdot)$ takes values in $\Gamma^{d,A}$.
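The variational formula for relative entropy behind this representation, $-\log\int e^{-h}\,d\rho = \inf_\mu\{R(\mu\|\rho) + \int h\,d\mu\}$, attained at the tilted law $\mu \propto e^{-h}\rho$, can be checked on a toy finite example (the three-point law and cost below are made up for illustration):

```python
import math

def rel_entropy(nu, rho):
    """R(nu || rho) = sum_i nu_i log(nu_i / rho_i), with 0 log 0 = 0."""
    return sum(a * math.log(a / b) for a, b in zip(nu, rho) if a > 0)

rho = [0.5, 0.3, 0.2]      # toy reference law on three points
h = [1.0, 0.2, 2.5]        # toy cost

z = sum(b * math.exp(-hi) for b, hi in zip(rho, h))
lhs = -math.log(z)
nu = [b * math.exp(-hi) / z for b, hi in zip(rho, h)]   # tilted minimizer
rhs = rel_entropy(nu, rho) + sum(a * hi for a, hi in zip(nu, h))
print(lhs, rhs)            # the two sides of the variational formula agree
```

Any other law, for example the uniform one, gives a strictly larger value of $R(\nu\|\rho) + \int h\,d\nu$.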
Since $\Gamma^{d,A}$ is compact, by applications of the Ascoli–Arzelà theorem, and $\{v^n_j\}$ is tight, by Prokhorov's theorem, given any subsequence of $\{v^n, \bar X^n\}$, there is a further subsubsequence, a probability space $(\bar\Omega, \bar{\mathcal F}, \bar P)$, a stochastic kernel $v$ on $K \times [0,1]$ given $\bar\Omega$, and a random variable $\bar X$ mapping $\bar\Omega$ into $\Gamma^{d,A}$ such that the subsubsequence converges in distribution to $(v, \bar X)$. In particular, since $\bar X^n(0) = c^{n,d} \to c^d$ as $n \to \infty$, we have that $\bar X$ [cf. (2.1)] belongs to $\Gamma^d := \Gamma^{d,0}$, those functions such that $\varphi(0) = c^d$. Then [29], Lemma 3.3.1, shows that $v$ is a subsequential weak limit of $v^n$, and there exists a stochastic kernel $v(dy \mid t, \omega)$ on $K$ given $[0,1] \times \bar\Omega$ such that $\bar P$-a.s. for $\omega \in \bar\Omega$,
$$v(A \times B \mid \omega) = \int_B v(A \mid t, \omega)\,dt.$$
Now, the same proof given for [29], Lemma 5.3.5, shows that $(v^n, \bar X^n, \tilde X^n)$ has a subsequential weak limit $(v, \bar X, \bar X)$, where the last coordinate is with respect to the Skorokhod space $D([0,1],\mathbb{R}^{d+2})$, and $\bar P$-a.s. for $t \in [0,1]$,
$$\bar X(t) = \int_{\mathbb{R}^{d+2}\times[0,t]} y\,v(dy \times ds) = \int_0^t\biggl(\int_K y\,v(dy \mid s)\biggr)ds, \qquad \dot{\bar X}(t) = \int_K y\,v(dy \mid t).$$
By Skorokhod's representation theorem, we may take it that $(v^n, \bar X^n, \tilde X^n)$ converges to $(v, \bar X, \bar X)$ a.s. In particular, $\bar X^n \to \bar X$ uniformly a.s., and as $\bar X$ is continuous, it follows that also $\tilde X^n \to \bar X$ uniformly a.s.; cf. [29], Theorem A.6.5.

Let $\lambda$ denote Lebesgue measure on $[0,1]$ and $\rho \times \lambda$ product measure on $K \times [0,1]$. Then
$$\int_0^1 R(v^n(\cdot \mid t) \,\|\, \rho_{\vec\xi(n,t),\tilde X^n(t)})\,dt = R(v^n(\cdot \mid t) \times \lambda(dt) \,\|\, \rho_{\vec\xi(n,t),\tilde X^n(t)} \times \lambda(dt)).$$
We now evaluate the limit inferior of $W^n$ using formula (2.3), along a subsequence as above:
$$\liminf_{n\to\infty} V^n + \varepsilon \ge \liminf_{n\to\infty} \bar E\biggl\{\int_0^1 R(v^n(\cdot \mid t) \,\|\, \rho_{\vec\xi(n,t),\tilde X^n(t)})\,dt + h(\bar X^n)\biggr\}$$
$$= \liminf_{n\to\infty} \bar E\{R(v^n(\cdot \mid t) \times \lambda(dt) \,\|\, \rho_{\vec\xi(n,t),\tilde X^n(t)} \times \lambda(dt)) + h(\bar X^n)\}$$
$$\ge \bar E\{R(v(\cdot \mid t) \times \lambda(dt) \,\|\, \rho_{\vec\xi(t),\bar X(t)} \times \lambda(dt)) + h(\bar X)\} = \bar E\biggl\{\int_0^1 R(v(\cdot \mid t) \,\|\, \rho_{\vec\xi(t),\bar X(t)})\,dt + h(\bar X)\biggr\}.$$
Note that we used Fatou's lemma in the second inequality, observing (i)–(iv):
(i) $v^n(dy \mid dt) \times \lambda(dt) \to v(dy \mid dt) \times \lambda(dt)$ a.s., as $v^n \Rightarrow v$ a.s.;
(ii) $\rho_{\vec\xi(n,t),\tilde X^n(t)} \Rightarrow \rho_{\vec\xi(t),\bar X(t)}$, as $\vec\xi(n,t) \to \vec\xi(t)$ for a.a. $t \in [0,1]$ and $\tilde X^n(t) \to \bar X(t)$ uniformly on $[0,1]$ a.s.;
(iii) $\liminf_{n\to\infty} R(v^n(dy \mid dt) \times \lambda(dt) \,\|\, \rho_{\vec\xi(n,t),\tilde X^n(t)} \times \lambda(dt)) \ge R(v(dy \mid dt) \times \lambda(dt) \,\|\, \rho_{\vec\xi(t),\bar X(t)} \times \lambda(dt))$ a.s., as $R$ is lower semicontinuous;
(iv) $h(\bar X^n) \to h(\bar X)$ a.s., as $h$ is continuous and $\bar X^n \to \bar X$ uniformly on $[0,1]$ a.s.
By [29], Lemma 3.3.3(c),
$$R(v(\cdot \mid t) \,\|\, \rho_{\vec\xi(t),\bar X(t)}) \ge L\biggl(\vec\xi(t), \bar X(t), \int_K z\,v(dz \mid t)\biggr),$$
where
$$L(\vec\xi(t), x, y) := \sup\biggl\{\langle\theta, y\rangle - \log\int_K \exp\langle\theta, z\rangle\,\rho_{\vec\xi(t),x}(dz) \Bigm| \theta \in \mathbb{R}^{d+2}\biggr\} = \inf\biggl\{R(\nu(\cdot \mid t) \,\|\, \rho_{\vec\xi(t),x}) \Bigm| \nu(\cdot \mid t) \in \mathcal{P}(K),\ \int_K z\,\nu(dz \mid t) = y\biggr\}.$$
We note, in this definition, the infimum is attained at some $\nu \in \mathcal{P}(K)$, as the relative entropy is convex and lower semicontinuous; cf. [29], Lemma 1.4.3(b). Since $\int z\,v(dz \mid t) = \dot{\bar X}(t)$, we have
$$\liminf_{n\to\infty} V^n \ge \bar E\biggl\{\int_0^1 L(\vec\xi(t), \bar X(t), \dot{\bar X}(t))\,dt + h(\bar X)\biggr\}.$$
As $\bar X \in \Gamma^d$, we have
$$\liminf_{n\to\infty} V^n \ge \inf_{\varphi \in \Gamma^d}\biggl\{\int_0^1 L(\vec\xi(t), \varphi(t), \dot\varphi(t))\,dt + h(\varphi)\biggr\}.$$
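As a concrete check on the entropy characterization of $L$, one can verify that $L$ vanishes along the LLN path of Corollary 1.7. The sketch below assumes the special case $p \equiv 0$, $\beta \equiv 1$, empty initial configuration, for which the self-similar solution of (1.5) is $\zeta_i(t) = \gamma_i t$ with $\gamma_0 = 2/3$ and $\gamma_i = \gamma_{i-1}\,i/(i+3)$ (our own illustrative computation, not from the paper):

```python
import math

d = 20
# gamma_i from the recursion; equivalently gamma_i = 4/((i+1)(i+2)(i+3))
g = [2.0 / 3.0]
for i in range(1, d + 1):
    g.append(g[-1] * i / (i + 3))

# masses of the mean-matching minimizer: nu(f_i) = 1 - [gamma]_i, plus the
# leftover mass on f_{d+1}
nu = [1.0 - sum(g[: i + 1]) for i in range(d + 1)]
nu.append(1.0 - sum(nu))

# rho-masses along the path: (i + beta) zeta_i(t) / sigma(t) = (i+1) gamma_i / 2,
# plus the remaining preferential weight on f_{d+1}
rho = [(i + 1) * g[i] / 2.0 for i in range(d + 1)]
rho.append(1.0 - sum(rho))

# relative entropy R(nu || rho): zero, since the two laws coincide
L = sum(a * math.log(a / b) for a, b in zip(nu, rho) if a > 0)
print(L)
```

Both laws carry the same masses $1 - [\gamma]_i = (i+1)\gamma_i/2 = 2/((i+2)(i+3))$, so the integrand vanishes, consistent with $I_d(\zeta^d) = 0$ in Corollary 1.7.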
For $\varphi \in \Gamma^d$, we can evaluate a unique minimizer $\nu(\cdot \mid t)$ in the definition of $L(\vec\xi(t), \varphi(t), \dot\varphi(t))$: recall that $[\dot\varphi(t)]_i := \sum_{l=0}^i \dot\varphi_l(t)$. Then, as $\sum_{i=0}^{d+1} f_i\,\nu(f_i \mid t) = \langle\dot\varphi_0(t),\ldots,\dot\varphi_{d+1}(t)\rangle$, a calculation gives
$$\nu(\dot\varphi(t) \mid t) = \sum_{i=0}^d (1 - [\dot\varphi(t)]_i)\delta_{f_i} + \Biggl(\sum_{i=0}^d [\dot\varphi(t)]_i - d\Biggr)\delta_{f_{d+1}}. \tag{2.4}$$
Hence,
$$L(\vec\xi(t), \varphi(t), \dot\varphi(t)) = R(\nu(\dot\varphi(t) \mid t) \,\|\, \rho_{\vec\xi(t),\varphi(t)})$$
$$= (1 - [\dot\varphi(t)]_0)\log\frac{1 - [\dot\varphi(t)]_0}{p(t) + (1-p(t))\frac{\beta(t)\varphi_0(t)}{(1+\beta(t))t + \tilde c + c\beta(t)}} \tag{2.5}$$
$$+ \sum_{i=1}^d (1 - [\dot\varphi(t)]_i)\log\frac{1 - [\dot\varphi(t)]_i}{(1-p(t))\frac{(i+\beta(t))\varphi_i(t)}{(1+\beta(t))t + \tilde c + c\beta(t)}} + \Biggl(1 - \sum_{i=0}^d (1 - [\dot\varphi(t)]_i)\Biggr)\log\frac{1 - \sum_{i=0}^d (1 - [\dot\varphi(t)]_i)}{(1-p(t))\bigl(1 - \sum_{i=0}^d \frac{(i+\beta(t))\varphi_i(t)}{(1+\beta(t))t + \tilde c + c\beta(t)}\bigr)},$$
interpreted under our conventions (1.3). Finally, define
$$I_d(\varphi) := \int_0^1 L(\vec\xi(t), \varphi(t), \dot\varphi(t))\,dt$$
when $\varphi \in \Gamma^d$, and $I_d(\varphi) = \infty$ otherwise. Since $L$ is convex, $I_d$ is convex. Also, $I_d$ has compact level sets by the proof of [29], Proposition 6.2.4, and so is a good rate function. Hence, the Laplace principle upper bound holds with respect to $I_d$.

We will need the following result for the proof of the lower bound in the next section.

Lemma 2.1. Let $\ell(t) = et + c^d$ be a linear function, where $e = (e_0, e_1, \ldots, e_{d+1})$ is such that $e_i > 0$ for $i \ge 0$, $\sum_{i=0}^{d+1} e_i = 1$, and $\sum_{i=0}^{d+1} ie_i \le 1$. Then, $I_d(\ell(t)) < \infty$.

Proof. Noting $\sum_{i=0}^d (1 - [e]_i) = \sum_{i=0}^{d+1} ie_i \le 1$, explicitly
$$I_d(\ell(t)) = \int_0^1\Biggl[(1 - [e]_0)\log\frac{1 - [e]_0}{p(t) + (1-p(t))\frac{\beta(t)(e_0 t + c_0)}{(1+\beta(t))t + \tilde c + c\beta(t)}} + \sum_{i=1}^d (1 - [e]_i)\log\frac{1 - [e]_i}{(1-p(t))\frac{(i+\beta(t))(e_i t + c_i)}{(1+\beta(t))t + \tilde c + c\beta(t)}} + \Biggl(1 - \sum_{i=0}^d (1 - [e]_i)\Biggr)$$
$$\log\frac{1 - \sum_{i=0}^d (1 - [e]_i)}{(1-p(t))\bigl(1 - \sum_{i=0}^d \frac{(i+\beta(t))(e_i t + c_i)}{(1+\beta(t))t + \tilde c + c\beta(t)}\bigr)}\Biggr]dt$$
is bounded under the bounds on $p, \beta$ in assumption (ND). $\Box$

2.2. Lower bound. Fix $h : C([0,1],\mathbb{R}^{d+2}) \to \mathbb{R}$, a bounded, continuous function, and $\varphi^* \in \Gamma^d$ such that $I_d(\varphi^*) < \infty$. To show the lower bound, it suffices to prove, for each $\varepsilon > 0$, that
$$\limsup_{n\to\infty} V^n \le I_d(\varphi^*) + h(\varphi^*) + 8\varepsilon. \tag{2.6}$$
The main idea of the argument is to construct from $\varphi^*$ a sequence of control measures suitable to evaluate formulas for $V^n$. Note, only in this "lower bound" subsection, to make several expressions simpler, we often take $c_{d+1} := \bar c_d$.

2.2.1. Step 1: Convex combination and regularization. Rather than work directly with $\varphi^*$, we consider a convex combination of paths with better regularity: for $0 \le \theta \le 1$, let
$$\varphi_\theta(t) = (1-\theta)\varphi^*(t) + \theta\ell(t),$$
where $\ell(t) = et + c^d$ is a linear function such that $e$ satisfies the assumptions of Lemma 2.1, say $e = (2^{-1}, 2^{-2}, \ldots, 2^{-(d+1)}, 2^{-(d+1)})$.

Lemma 2.2. As $\theta \downarrow 0$, we have $|I_d(\varphi_\theta) - I_d(\varphi^*)| \to 0$ and $|h(\varphi_\theta) - h(\varphi^*)| \to 0$.

Proof. By convexity of $I_d$, and finiteness of $I_d(\ell(t))$ from Lemma 2.1,
$$I_d(\varphi_\theta) \le (1-\theta)I_d(\varphi^*) + \theta I_d(\ell).$$
On the other hand, since $|\varphi_\theta(t) - \varphi^*(t)| = |\int_0^t (\dot\varphi_\theta - \dot\varphi^*)(s)\,ds| \le t\theta(d+2)$, we have $\|\varphi_\theta - \varphi^*\|_\infty \le \theta(d+2) \downarrow 0$; by lower semicontinuity of $I_d$, then, $\liminf_{\theta\downarrow 0} I_d(\varphi_\theta) \ge I_d(\varphi^*)$. Also, as $h$ is continuous, we have that $|h(\varphi_\theta) - h(\varphi^*)| \to 0$. $\Box$

Now, fix $\theta > 0$ such that $I_d(\varphi_\theta) \le I_d(\varphi^*) + \varepsilon$ and $h(\varphi_\theta) \le h(\varphi^*) + \varepsilon$. Next, for $\kappa \in \mathbb{N}$ and $t \in [0,1]$, let
$$\psi_\kappa(t) = \int_0^t \gamma_\kappa(s)\,ds + c^d, \tag{2.7}$$
where $\gamma_\kappa(t) = \kappa\int_{i/\kappa}^{(i+1)/\kappa}\dot\varphi_\theta(s)\,ds$ for $t \in [i/\kappa,(i+1)/\kappa)$, $0 \le i \le \kappa-1$, and $\gamma_\kappa(1) = \gamma_\kappa(1 - 1/\kappa)$. Note that $\psi_\kappa \in \Gamma^d$, and on $[i/\kappa,(i+1)/\kappa)$ for $0 \le i \le \kappa-1$, $\dot\psi_\kappa(t)$ equals the constant vector $\gamma_\kappa(i/\kappa)$.
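The regularization (2.7) averages the derivative over windows of width $1/\kappa$, producing a piecewise linear path that agrees with $\varphi_\theta$ at the grid points. A minimal scalar sketch of this mollification (our own illustration, not the paper's code):

```python
import numpy as np

def mollify(phi, kappa):
    """Build psi_kappa as in (2.7): the derivative of phi is replaced by its
    average over each window [i/kappa, (i+1)/kappa), so the result is
    piecewise linear and matches phi exactly at the grid points i/kappa."""
    grid = np.linspace(0.0, 1.0, kappa + 1)
    vals = phi(grid)
    slopes = kappa * np.diff(vals)   # the step function gamma_kappa

    def psi(t):
        i = min(int(t * kappa), kappa - 1)
        return vals[i] + (t - grid[i]) * slopes[i]

    return psi

phi = lambda t: np.sqrt(t)           # a hypothetical scalar test path
psi = mollify(phi, kappa=8)
print(psi(0.25), phi(0.25))          # grid point: the two values agree
```

As $\kappa \uparrow \infty$, $\|\psi_\kappa - \varphi_\theta\|_\infty \to 0$ and $\dot\psi_\kappa \to \dot\varphi_\theta$ a.e., which is what Lemma 2.4 exploits.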
In particular, $\dot\psi_\kappa$ is a step function.

Lemma 2.3. For $0 \le i \le d+1$ and $0 \le t \le 1$,
$$\psi_{\kappa,i}(t) \ge \theta(e_i t + c_i), \tag{2.8}$$
$$\sum_{i=0}^{d+1} i\dot\psi_{\kappa,i}(t) = \sum_{i=0}^d (1 - [\dot\psi_\kappa(t)]_i) \le 1 - \theta e_{d+1}. \tag{2.9}$$

Proof. These are properties of $\varphi_\theta$ inherited from properties of $\varphi^*, \ell \in \Gamma^d$, which are preserved with respect to (2.7). Indeed, for each $0 \le i \le d+1$,
$$\psi_{\kappa,i}(t) = \varphi_{\theta,i}(\lfloor t\kappa\rfloor/\kappa) + (t\kappa - \lfloor t\kappa\rfloor)(\varphi_{\theta,i}((\lfloor t\kappa\rfloor + 1)/\kappa) - \varphi_{\theta,i}(\lfloor t\kappa\rfloor/\kappa)) \ge \theta(e_i t + c_i).$$
Last, (2.9) follows: noting that $\sum_{i=0}^d (1 - [e]_i) = \sum_{i=0}^{d+1} ie_i = 1 - e_{d+1}$,
$$\sum_{i=0}^d (1 - [\dot\psi_\kappa(t)]_i) = \kappa\int_{l/\kappa}^{(l+1)/\kappa}\Biggl[(1-\theta)\sum_{i=0}^d (1 - [\dot\varphi^*(s)]_i) + \theta\sum_{i=0}^d (1 - [\dot\ell(s)]_i)\Biggr]ds \le 1 - \theta + \theta\sum_{i=0}^d (1 - [e]_i) = 1 - \theta e_{d+1}. \qquad\Box$$

Lemma 2.4. For large enough $\kappa$, we have
$$h(\psi_\kappa) \le h(\varphi^*) + 2\varepsilon \quad\text{and}\quad I_d(\psi_\kappa) \le I_d(\varphi^*) + 2\varepsilon. \tag{2.10}$$

Proof. Since $\lim_{\kappa\to\infty}\sup_{t\in[0,1]}|\varphi_\theta(t) - \psi_\kappa(t)| = 0$, the inequality with respect to $h$ follows from continuity of $h$ and choosing $\kappa$ in terms of $\theta$. We also note, by absolute continuity of $\varphi_\theta$, that a.s. in $t$,
$$\dot\psi_\kappa(t) = \gamma_\kappa(t) = \kappa\int_{\lfloor t\kappa\rfloor/\kappa}^{(\lfloor t\kappa\rfloor+1)/\kappa}\dot\varphi_\theta(s)\,ds \to \dot\varphi_\theta(t) \quad\text{as } \kappa\uparrow\infty.$$
Then, by the form of $L$ [cf. (2.5)], bounds in Lemma 2.3 and piecewise continuity and bounds on $p, \beta$ in assumption (ND), we have, as $\kappa\uparrow\infty$, that $L(\vec\xi(t), \psi_\kappa(t), \dot\psi_\kappa(t)) \to L(\vec\xi(t), \varphi_\theta(t), \dot\varphi_\theta(t))$ for almost all $t \in [0,1]$. We may uniformly bound $L(\vec\xi(t), \psi_\kappa(t), \dot\psi_\kappa(t))$ as follows: first, using $x\log x \le 0$ for $0 \le x \le 1$, bound
$$L(\vec\xi(t), \psi_\kappa(t), \dot\psi_\kappa(t)) \le -(1 - [\dot\psi_\kappa(t)]_0)\log\biggl(p(t) + (1-p(t))\frac{\beta(t)\psi_{\kappa,0}(t)}{(1+\beta(t))t + \tilde c + c\beta(t)}\biggr) - \sum_{i=1}^d (1 - [\dot\psi_\kappa(t)]_i)\log\biggl((1-p(t))\frac{(i+\beta(t))\psi_{\kappa,i}(t)}{(1+\beta(t))t + \tilde c + c\beta(t)}\biggr) - \Biggl(1 - \sum_{i=0}^d (1 - [\dot\psi_\kappa(t)]_i)\Biggr)$$
$$\times \log\biggl((1-p(t))\biggl(1 - \sum_{i=0}^d \frac{(i+\beta(t))\psi_{\kappa,i}(t)}{(1+\beta(t))t + \tilde c + c\beta(t)}\biggr)\biggr).$$
Now, as $0 \le [\dot\psi_\kappa]_i \le 1$ and $0 \le \sum_{i=0}^d (1 - [\dot\psi_\kappa]_i) \le 1$, we have the further upper bound, using Lemma 2.3,
$$-\log\biggl(p(t) + (1-p(t))\frac{\beta(t)\theta(e_0 t + c_0)}{(1+\beta(t))t + \tilde c + c\beta(t)}\biggr) - \sum_{i=1}^d \log\biggl((1-p(t))\frac{(i+\beta(t))\theta(e_i t + c_i)}{(1+\beta(t))t + \tilde c + c\beta(t)}\biggr) - \log\biggl((1-p(t))\frac{(d+1+\beta(t))\theta(e_{d+1}t + \bar c_d)}{(1+\beta(t))t + \tilde c + c\beta(t)}\biggr),$$
which is integrable on $[0,1]$, given the bounds on $p, \beta$ in assumption (ND). By dominated convergence, we obtain $\lim_\kappa I_d(\psi_\kappa) = I_d(\varphi_\theta)$, and therefore the other inequality with respect to $I_d$. $\Box$

Let now $\kappa$ be such that (2.10) holds. Finally, we modify $\psi_\kappa$ on the interval $[0,\delta]$, for a small enough $\delta > 0$. Define
$$t_i := \delta - \sum_{l=i}^d (\delta + [c^d]_l - [\psi_\kappa(\delta)]_l) \tag{2.11}$$
for $0 \le i \le d$, and $t_{d+1} := \delta$; set also $t_{-1} := 0$ and $t_{d+2} := t_{d+1}$. Let also
$$\psi^*(t) = \int_0^t \gamma^*(s)\,ds + c^d, \tag{2.12}$$
where
$$\gamma^*(t) = \begin{cases} f_{d+1}, & \text{when } 0 \le t < t_0,\\ f_i, & \text{when } t_i \le t < t_{i+1},\ 0 \le i \le d,\\ \gamma_\kappa(t), & \text{when } t \ge \delta.\end{cases}$$
Note $\gamma^*$ may not be defined at some endpoints, as possibly $t_i = t_{i+1}$ for some $i$. By inspection, $\psi^* \in \Gamma^d$. Also, $\dot\psi^*(t) = f_{d+1}$ when $0 \le t < t_0$ and $\dot\psi^*(t) = f_i$ when $t_i \le t < t_{i+1}$ for $0 \le i \le d$. Moreover, we have the following properties.

Lemma 2.5. We have $\psi^*(\delta) = \psi_\kappa(\delta)$ and $t_0 \ge \theta e_{d+1}\delta$. Also, $\psi^*_0(t) = t + c_0$, $\psi^*_j(t) = c_j$ for $1 \le j \le d+1$, when $0 \le t < t_0$, and
$$\psi^*_0(t) \ge \theta e_{d+1}\delta + c_0 \quad\text{when } t_0 < t < t_1, \qquad \psi^*_i(t) \ge \theta(e_i\delta + c_i) \quad\text{when } t_i < t < t_{i+1} \text{ and } 1 \le i \le d.$$

Proof. The lower bound for $t_0$ follows from the integration of both sides in (2.9) and the definition of $t_0$. Now, we note that $\dot\psi^*_0(t) = 0$ if $t_0 \le t \le t_1$, and 1 otherwise.
Also, note that, for $1 \le i \le d+1$, $\dot\psi^*_i(t) = 1$ if $t_{i-1} < t < t_i$, $\dot\psi^*_i(t) = -1$ if $t_i < t < t_{i+1}$, and $\dot\psi^*_i(t) = 0$ otherwise. Thus, noting (2.11),
$$\psi^*_0(\delta) = \int_0^\delta \gamma^*_0(s)\,ds + c_0 = \delta - (t_1 - t_0) + c_0 = \psi_{\kappa,0}(\delta)$$
and, for $1 \le i \le d+1$,
$$\psi^*_i(\delta) = \int_0^\delta \gamma^*_i(s)\,ds + c_i = (t_i - t_{i-1}) - (t_{i+1} - t_i) + c_i = \psi_{\kappa,i}(\delta),$$
which proves that $\psi^*(\delta) = \psi_\kappa(\delta)$. Since $\psi^*_0(t)$ is nondecreasing, for $t \ge t_0$, $\psi^*_0(t) \ge \psi^*_0(t_0) = t_0 + c_0 \ge \theta e_{d+1}\delta + c_0$. For $1 \le i \le d$, for $t_i < t < t_{i+1}$, $\psi^*_i(t)$ decreases to its final value $\psi_{\kappa,i}(\delta) \ge \theta(e_i\delta + c_i)$, by (2.8). $\Box$

2.2.2. Step 2: More properties of $\psi^*$. We now show the rate of $\psi^*$ up to time $\delta$ does not contribute too much.

Lemma 2.6. For small enough $\delta > 0$,
$$\int_0^\delta L(\vec\xi(t), \psi^*(t), \dot\psi^*(t))\,dt \le \varepsilon \quad\text{and}\quad \|\psi^* - \psi_\kappa\|_\infty < \varepsilon.$$
In particular, $h(\psi^*) \le h(\varphi^*) + 3\varepsilon$ and $I_d(\psi^*) \le I_d(\varphi^*) + 3\varepsilon$.

Proof. Writing, for $0 \le t \le \delta$, $L(\vec\xi(t), \psi^*(t), \dot\psi^*(t)) = R(\delta_{f_{d+1}} \| \rho_{\vec\xi(t),\psi^*(t)})\mathbf{1}(0 < t < t_0) + \sum_{i=0}^d R(\delta_{f_i} \| \rho_{\vec\xi(t),\psi^*(t)})\mathbf{1}(t_i < t < t_{i+1})$, we have
$$L(\vec\xi(t), \psi^*(t), \dot\psi^*(t)) = -\log\biggl((1-p(t))\biggl(1 - \sum_{l=0}^d \frac{(l+\beta(t))\psi^*_l(t)}{(1+\beta(t))t + \tilde c + c\beta(t)}\biggr)\biggr)\mathbf{1}(0 < t < t_0)$$
$$- \log\biggl(p(t) + (1-p(t))\frac{\beta(t)\psi^*_0(t)}{(1+\beta(t))t + \tilde c + c\beta(t)}\biggr)\mathbf{1}(t_0 < t < t_1) - \sum_{i=1}^d \log\biggl((1-p(t))\frac{(i+\beta(t))\psi^*_i(t)}{(1+\beta(t))t + \tilde c + c\beta(t)}\biggr)\mathbf{1}(t_i < t < t_{i+1}).$$
By Lemma 2.5 and the bounds on $p, \beta$ in assumption (ND), this expression is integrable for $0 \le t \le \delta$. (It would be bounded unless $\bar c_d = 0$ and $c_0 = 0$, in which case the first term in the expression involves $-\log t$.) Hence, the first statement follows for small $\delta > 0$. Also, the second statement holds as $\|\psi^* - \psi_\kappa\|_\infty = \sup_{0\le t<\delta}|\psi^* - \psi_\kappa| \le \delta(d+2)$. The last statement is a consequence now of (2.10). $\Box$

We will take such a $\delta > 0$ in the sequel.

Lemma 2.7.
We have
$$\lim_{n\to\infty}\ \sup_{0\le j\le n}\ \Biggl|\psi^*(j/n) - \frac{1}{n}\sum_{l=0}^{j-1}\dot\psi^*(l/n) - c^d\Biggr| = 0. \tag{2.13}$$
Also, for $j \ge \lfloor\delta n\rfloor$ and $0 \le i \le d+1$,
$$\frac{1}{n}\sum_{l=0}^{j-1}\dot\psi^*_i(l/n) + c_i \ge \theta\biggl(e_i\frac{j}{n} + c_i\biggr). \tag{2.14}$$

Proof. Since $\dot\psi^*$ is piecewise constant, when $l/n \le s \le (l+1)/n$, $|\dot\psi^*(s) - \dot\psi^*(l/n)| \neq 0$ for at most $\kappa$ subintervals [cf. (2.7) and (2.12)], and is also bounded by $2(d+2)$. Hence,
$$\Biggl|\psi^*(j/n) - \frac{1}{n}\sum_{l=0}^{j-1}\dot\psi^*(l/n) - c^d\Biggr| = \Biggl|\sum_{l=0}^{j-1}\int_{l/n}^{(l+1)/n}(\dot\psi^*(s) - \dot\psi^*(l/n))\,ds\Biggr| \le \frac{2(d+2)}{n}\kappa.$$
The last statement follows from (2.8). $\Box$

2.2.3. Step 3: Admissible control measures and convergence. We now build a sequence of controls based on $\psi^*$. Define $\nu = \nu(\dot\psi^*(j/n) \mid j/n)$ using (2.4), and
$$v^n_j(dy; x_1,\ldots,x_j) = \begin{cases} \nu(\dot\psi^*(j/n) \mid j/n), & \text{when } 0 \le j \le \lfloor\delta n\rfloor, \text{ or when } j \ge \lceil\delta n\rceil \text{ and } x_{j,i} \ge \frac{\theta}{2}(e_i\delta + c_i) \text{ for } 0 \le i \le d+1,\\ \rho_{\vec\xi(j/n),x_j}, & \text{otherwise.}\end{cases}$$
The reasoning behind this choice of controls is as follows: to bound the limit of the quantity in (2.2), using formula (2.5), by $I_d(\psi^*) + h(\psi^*)$, we would like to specify the controls in the form $\nu(\dot\psi^*(j/n) \mid j/n)$. Such a choice, as we will see, also ensures that the adapted sequence $\bar X^n(j)$ is close to $\psi^*(j/n)$. However, the adapted process, as it is random, may get too close to a boundary. When this happens, not often it turns out, to bound errors, we specify that the controls take the cost-free form of the natural evolution sequence.
Also, to get past this boundary layer initially, $\psi^*$ has been built as a step function, so that the adapted process must follow a deterministic trajectory up to time $\lfloor\delta n\rfloor$. Define $\bar X^n(0) = c^d$, and $\bar X^n(j+1) = \bar X^n(j) + \frac{1}{n}\bar Y^n(j)$ for $j \ge 0$, where
$$\bar P(\bar Y^n(j) \in dy \mid \bar X^n(0),\ldots,\bar X^n(j)) = v^n_j(dy; \bar X^n(0),\ldots,\bar X^n(j)).$$
Thus, for $j \ge 0$, $\bar X^n(j) = \frac{1}{n}\sum_{l=0}^{j-1}\bar Y^n(l) + c^d$. It will be useful later to note the total weight $\sum_{i=0}^{d+1}(i+\beta(j/n))\bar X^n_i(j) \le (j/n + \tilde c) + \beta(j/n)(j/n + c)$ and, for $0 \le j \le \lfloor\delta n\rfloor$, as mentioned, $\bar X^n(j)$ is deterministic and $\bar X^n(j) = \frac{1}{n}\sum_{l=0}^{j-1}\dot\psi^*(l/n) + c^d$.

Define now, for each $n \ge 1$, the martingale sequence, for $0 \le j \le n$,
$$M^n(j) := \frac{1}{n}\sum_{l=0}^{j-1}(\bar Y^n(l) - \bar E(\bar Y^n(l) \mid \bar X^n(l))) = \bar X^n(j) - \frac{1}{n}\sum_{l=0}^{j-1}\bar E(\bar Y^n(l) \mid \bar X^n(l)) - c^d.$$
Let
$$\tau_n := n \wedge \min\Bigl\{\lceil\delta n\rceil \le l \le n : \bar X^n_i(l) < \tfrac{\theta}{2}(e_i\delta + c_i) \text{ for some } 0 \le i \le d+1\Bigr\}.$$
Then, $\tau_n \ge \lceil\delta n\rceil$ is a stopping time, and the corresponding stopped process $\{M^n(j \wedge \tau_n)\}$ is also a martingale for $0 \le j \le n$. Let now
$$A_n := \Bigl\{\sup_{0\le j\le n}|M^n(j \wedge \tau_n)| > \theta e_{d+1} n^{-1/4}\Bigr\}.$$

Lemma 2.8. For $n \ge \delta^{-4}$, on the set $A^c_n$, we have $\tau_n = n$.

Proof. From the definition of $\{v^n_j\}$ and $\tau_n$, we have $\bar E(\bar Y^n(l) \mid \bar X^n(l)) = \dot\psi^*(l/n)$ for $0 \le l \le j \wedge \tau_n - 1$. Let $j \ge \lceil\delta n\rceil$. Then, on $A^c_n$, by (2.14), we have
$$\bar X^n_i(j \wedge \tau_n) \ge c_i + \frac{1}{n}\sum_{l=0}^{j\wedge\tau_n-1}\bar E(\bar Y^n_i(l) \mid \bar X^n(l)) - \theta e_{d+1} n^{-1/4} = c_i + \frac{1}{n}\sum_{l=0}^{j\wedge\tau_n-1}\dot\psi^*_i(l/n) - \theta e_{d+1} n^{-1/4} \ge \theta\biggl(e_i\frac{j\wedge\tau_n}{n} + c_i\biggr) - \theta e_{d+1} n^{-1/4} \ge \frac{\theta}{2}(e_i\delta + c_i).$$
Hence, $\tau_n = n$.
$\Box$

We now observe, by Doob's martingale inequality and bounds, in terms of constants $C = C_d$, that
$$\bar P[A_n] \le C n^{1/4}\,\bar E|M^n(n \wedge \tau_n)| = C n^{-3/4}\,\bar E\Biggl|\sum_{l=0}^{n\wedge\tau_n-1}(\bar Y^n(l) - \bar E(\bar Y^n(l) \mid \bar X^n(l)))\Biggr| \le C n^{-3/4}\,n^{1/2} = C n^{-1/4}. \tag{2.15}$$
We now state the following almost sure convergence.

Lemma 2.9. We have
$$\lim_{n\uparrow\infty}\ \sup_{0\le j\le n}\ \Biggl|\bar X^n(j) - \frac{1}{n}\sum_{l=0}^{j-1}\dot\psi^*(l/n) - c^d\Biggr| = 0 \quad\text{a.s.} \tag{2.16}$$

Proof. First, by (2.15) and the Borel–Cantelli lemma, $\bar P(\limsup A_n) = 0$. On the other hand, on the full measure set $\bigcup_{m\ge 1}\bigcap_{k\ge m} A^c_k$, since $\tau_n = n$ and $\bar E(\bar Y^n(l) \mid \bar X^n(l)) = \dot\psi^*(l/n)$ for $0 \le l \le n-1$ on $A^c_n$ by Lemma 2.8, the desired convergence holds. $\Box$

2.2.4. Step 4. We now argue the lower bound through representation (2.2). Recall the definition of $\vec\xi(\cdot)$ in the beginning of Section 2. The sum in (2.2) equals
$$\bar E\Biggl[\frac{1}{n}\sum_{j=0}^{n-1} R(v^n_j \,\|\, \rho_{\vec\xi(j/n),\bar X^n(j)})\Biggr] = \bar E\Biggl[\frac{1}{n}\sum_{j=0}^{\lfloor\delta n\rfloor} R(v^n_j \,\|\, \rho_{\vec\xi(j/n),\bar X^n(j)})\Biggr] + \bar E\Biggl[\frac{1}{n}\sum_{j=\lceil\delta n\rceil}^{n-1} R(v^n_j \,\|\, \rho_{\vec\xi(j/n),\bar X^n(j)}); A_n\Biggr] + \bar E\Biggl[\frac{1}{n}\sum_{j=\lceil\delta n\rceil}^{n-1} R(v^n_j \,\|\, \rho_{\vec\xi(j/n),\bar X^n(j)}); A^c_n\Biggr] =: A_1 + A_2 + A_3. \tag{2.17}$$

Step 4.1: $A_2$ in (2.17). Recall $\sigma(j/n) = (1+\beta(j/n))(j/n) + \tilde c + c\beta(j/n)$ and the "weight" bound on $\bar X^n(j)$ in the beginning of Step 3. For $\lceil\delta n\rceil \le j \le n-1$,
$$R(v^n_j \,\|\, \rho_{\vec\xi(j/n),\bar X^n(j)}) = R(\nu(\dot\psi^*(j/n)) \,\|\, \rho_{\vec\xi(j/n),\bar X^n(j)}) \times \mathbf{1}(\bar X^n_i(j) \ge (\theta/2)(e_i\delta + c_i) \text{ for } 0 \le i \le d+1).$$
Noting (2.5), this is bounded above, using $x\log x \le 0$ for $0 \le x \le 1$, by
$$\Biggl[-\biggl(1 - \Bigl[\dot\psi^*\Bigl(\tfrac{j}{n}\Bigr)\Bigr]_0\biggr)\log\biggl(p(j/n) + (1-p(j/n))\frac{\beta(j/n)\bar X^n_0(j)}{\sigma(j/n)}\biggr) - \sum_{i=1}^d \biggl(1 - \Bigl[\dot\psi^*\Bigl(\tfrac{j}{n}\Bigr)\Bigr]_i\biggr)\log\biggl((1-p(j/n))\frac{(i+\beta(j/n))\bar X^n_i(j)}{\sigma(j/n)}\biggr)$$
$$- \Biggl(\sum_{i=0}^d \Bigl[\dot\psi^*\Bigl(\tfrac{j}{n}\Bigr)\Bigr]_i - d\Biggr)\log\biggl((1-p(j/n))\biggl(1 - \sum_{i=0}^d \frac{(i+\beta(j/n))\bar X^n_i(j)}{\sigma(j/n)}\biggr)\biggr)\Biggr] \times \mathbf{1}(\bar X^n_i(j) \ge (\theta/2)(e_i\delta + c_i) \text{ for } 0 \le i \le d+1).$$
Given the bounds on $p, \beta$ in (ND), as $0 \le [\dot\psi^*]_i \le 1$, we have $d \le \sum_{i=0}^d [\dot\psi^*]_i \le d+1$, and
$$\sum_{i=0}^d (i+\beta(j/n))\bar X^n_i(j) \le \sigma(j/n) - (d+1+\beta(j/n))\bar X^n_{d+1}(j) \le \sigma(j/n) - (d+1+\beta(j/n))\cdot(\theta/2)(e_{d+1}\delta + c_{d+1}),$$
so the relative entropy is further bounded by a constant $C_d$. Thus, for large $n$,
$$A_2 \le C_d \cdot \bar P\Bigl[\sup_{0\le j\le n}|M^n(j\wedge\tau_n)| > \theta e_{d+1} n^{-1/4}\Bigr] \le \varepsilon. \tag{2.18}$$

Step 4.2: $A_1$ in (2.17). We recall, for $j \le \lfloor\delta n\rfloor$, that $\bar X^n(j) = \frac{1}{n}\sum_{l=0}^{j-1}\dot\psi^*(l/n) + c^d$ is deterministic. Also note, for $0 \le i \le d$, that $\dot\psi^*(t) = f_i$ on $t_i < t < t_{i+1}$, and $\dot\psi^*(t) = f_{d+1}$ on $0 = t_{-1} \le t \le t_0$ (cf. near Lemma 2.5). Thus, for $0 \le j \le \lfloor\delta n\rfloor$, denoting $f_{-1} = f_{d+1}$, we may write
$$R(v^n_j \,\|\, \rho_{\vec\xi(j/n),\bar X^n(j)}) = L\Biggl(\vec\xi\Bigl(\frac{j}{n}\Bigr), \frac{1}{n}\sum_{l=0}^{j-1}\dot\psi^*\Bigl(\frac{l}{n}\Bigr) + c^d, \dot\psi^*\Bigl(\frac{j}{n}\Bigr)\Biggr) = \sum_{i=-1}^d L\Biggl(\vec\xi\Bigl(\frac{j}{n}\Bigr), \sum_{l=-1}^{i-1}\frac{\lfloor t_{l+1}n\rfloor - \lfloor t_l n\rfloor}{n}f_l + \frac{j - \lfloor t_i n\rfloor}{n}f_i + c^d, f_i\Biggr) \times \mathbf{1}(\lfloor t_i n\rfloor \le j < \lfloor t_{i+1}n\rfloor),$$
where, for $i = -1$, the empty sum in the argument for $L$ vanishes.
Comparing with the proof of Lemma 2.6, this expression, given bounds on $p,\beta$ in (ND), is bounded for $0\le j\le\lfloor\delta n\rfloor$, except when $\bar c_d=0$ and $c=0$, in which case a "$-\log(j/n)$" term appears in the $i=-1$ summand. Since $-(1/n)\sum_{j=1}^{\lfloor\delta n\rfloor}\log(j/n)\le-\int_0^\delta\log(t)\,dt$, its contribution is still small. Hence,

$$A_1\le\epsilon(\delta)\quad\text{where }\epsilon(\delta)\to0\text{ as }\delta\downarrow0.\qquad(2.19)$$

Step 4.3: term $A_3$ in (2.17). For $n\ge\delta^{-1}$, by Lemma 2.8 and the definition of $L$ (2.5),

$$A_3\le\bar E\Big[\frac1n\sum_{j=\lceil\delta n\rceil}^{n-1}L\Big(\vec\xi(j/n),\bar X_n(j),\dot\psi^*(j/n)\Big);A_n^c\cap\{\tau_n=n\}\Big]
\le\bar E\Big[\int_\delta^1 L\Big(\vec\xi\Big(\frac{\lfloor nt\rfloor}{n}\Big),\bar X_n(\lfloor nt\rfloor),\dot\psi^*\Big(\frac{\lfloor nt\rfloor}{n}\Big)\Big)\,dt;A_n^c\cap B_n\Big],$$

where $B_n=\{\bar X_i^n(j)\ge(\theta/e_i)\delta+c_i\ \text{for }0\le i\le d+1,\ j\ge\lceil\delta n\rceil\}$. On the event $A_n^c\cap B_n$, $\bar E(\bar Y_n(l)\mid\bar X_n(l))=\dot\psi^*(l/n)$ for $l\ge0$, and so $|\bar X_n(\lfloor nt\rfloor)-\psi^*(\lfloor nt\rfloor/n)|\,\mathbf 1(A_n^c\cap B_n)\to0$. By the form of $L$ (2.5), as $\dot\psi^*$ is a step function, (2.8), and bounds and piecewise continuity of $p,\beta$ in (ND), we may bound as in Step 4.1 and observe

$$2C_d\ge\Big|L\Big(\vec\xi\Big(\frac{\lfloor nt\rfloor}{n}\Big),\bar X_n(\lfloor nt\rfloor),\dot\psi^*\Big(\frac{\lfloor nt\rfloor}{n}\Big)\Big)-L\big(\vec\xi(t),\psi^*(t),\dot\psi^*(t)\big)\Big|\,\mathbf 1(A_n^c\cap B_n)\to0$$

for each $t$ and each realization. Hence, by the bounded convergence theorem, with respect to $d\bar P\times\mathbf 1_{(\delta,1]}(t)\,dt$,

$$\limsup_{n\to\infty}A_3\le\int_\delta^1 L\big(\vec\xi(t),\psi^*(t),\dot\psi^*(t)\big)\,dt.\qquad(2.20)$$

2.2.5. Step 5. Finally, by (2.13) and (2.16), $\lim_{n\to\infty}h(\bar X_n(\cdot))=h(\psi^*(\cdot))$ a.s. in the sup topology, and by bounded convergence $\lim_{n\to\infty}\bar E[h(\bar X_n(\cdot))]=h(\psi^*(\cdot))$. We now combine all bounds to conclude the proof of (2.6).
By (2.2), bounds (2.18), (2.19), (2.20) and nonnegativity of $L$, we have

$$\limsup_{n\to\infty}V_n\le\limsup_{n\to\infty}\bar E\Big[\frac1n\sum_{j=0}^{n-1}R\big(v_j^n\,\big\|\,\rho_{\vec\xi(j/n),\bar X_n(j)}\big)+h(\bar X_n(\cdot))\Big]
\le\varepsilon+\epsilon(\delta)+\int_0^1 L\big(\vec\xi(t),\psi^*(t),\dot\psi^*(t)\big)\,dt+h(\psi^*).$$

Then, as $\varepsilon$ and $\delta$ are arbitrary, by Lemma 2.6, we obtain (2.6).

3. Proof of Theorem 1.4. The proof of Theorem 1.4 follows from the following two propositions, and is given below. We first recall the projective limit approach, following notation in [20], Section 4.6. Define, for $0\le i\le j$, $\mathcal Y_j=C([0,1],\mathbb R^{j+2})$ and $p_{ij}:\mathcal Y_j\to\mathcal Y_i$ by $\langle\varphi_0,\ldots,\varphi_{j+1}\rangle\mapsto\langle\varphi_0,\ldots,\varphi_i,\sum_{l=i+1}^{j+1}\varphi_l\rangle$. Also define $\varprojlim\mathcal Y_j\subset\prod_{i\ge0}\mathcal Y_i$ as the subset of elements $x=\langle x_0,x_1,\ldots\rangle$ such that $p_{ij}x_j=x_i$, equipped with the product topology. Let also $p_j:\varprojlim\mathcal Y_j\to\mathcal Y_j$ be the canonical projection, $p_jx=x_j$.

Since the $I_d$ are convex, good rate functions on $C([0,1],\mathbb R^{d+2})$, by the LDPs in Theorem 1.2 and [20], Theorem 4.6.1, we obtain the following proposition. Recall the notation in Theorem 1.2. For $n\ge1$, let $\mathcal X^{n,\infty}=\langle\mathcal X^{n,0},\mathcal X^{n,1},\ldots\rangle$.

Proposition 3.1. The sequence $\{\mathcal X^{n,\infty}\}\subset\varprojlim\mathcal Y_j$ satisfies an LDP with rate $n$ and convex, good rate function

$$J_\infty(\varphi)=\begin{cases}\sup_d\{I_d(p_d(\varphi))\},&\text{when }\varphi\in\varprojlim\mathcal Y_j,\\ \infty,&\text{otherwise.}\end{cases}$$

To establish Theorem 1.4, it remains to further identify $J_\infty$. Recall $\Gamma_d\subset C([0,1],\mathbb R^{d+2})$ are those elements $\varphi=\langle\varphi_0,\ldots,\varphi_d,\varphi_{d+1}\rangle$ such that: $\varphi(0)=c_d$, each $\varphi_i\ge0$, $0\le[\dot\varphi(t)]_i\le1$ for $0\le i\le d$, $\sum_{i=0}^{d+1}\dot\varphi_i(t)=1$, and $\sum_{i=0}^{d+1}i\dot\varphi_i(t)=\sum_{i=0}^d(1-[\dot\varphi(t)]_i)\le1$ for a.e. $t$. Let also $\Gamma^*\subset\varprojlim\mathcal Y_j$ be those elements $\varphi=\langle\varphi^0,\varphi^1,\ldots\rangle$ such that $\varphi^d\in\Gamma_d$ for $d\ge0$. Since $\{\Gamma_d\}_{d\ge0}$ are compact sets, it is a straightforward exercise to see that $\Gamma^*$ is compact.
Define $L^d(p_d(\varphi(t)))$ equal to

$$\big(1-[\dot\varphi^d(t)]_0\big)\log\frac{1-[\dot\varphi^d(t)]_0}{p(t)+(1-p(t))\frac{\beta(t)\varphi_0^d(t)}{(1+\beta(t))t+\tilde c+c\beta(t)}}
+\sum_{i=1}^d\big(1-[\dot\varphi^d(t)]_i\big)\log\frac{1-[\dot\varphi^d(t)]_i}{(1-p(t))\frac{(i+\beta(t))\varphi_i^d(t)}{(1+\beta(t))t+\tilde c+c\beta(t)}}
+\Big(1-\sum_{i=0}^d\big(1-[\dot\varphi^d(t)]_i\big)\Big)\log\frac{1-\sum_{i=0}^d(1-[\dot\varphi^d(t)]_i)}{(1-p(t))\big(1-\sum_{i=0}^d\frac{(i+\beta(t))\varphi_i^d(t)}{(1+\beta(t))t+\tilde c+c\beta(t)}\big)}.$$

Proposition 3.2. The rate function $J_\infty(\varphi)$ diverges when $\varphi\notin\Gamma^*$. However, for $\varphi\in\Gamma^*$, $\lim_{d\uparrow\infty}L^d(p_d(\varphi(t)))$ exists for almost all $t$, and we can evaluate

$$J_\infty(\varphi)=\int_0^1\lim_{d\uparrow\infty}L^d(p_d(\varphi(t)))\,dt.$$

Proof. First, from the definition, $J_\infty(\varphi)$ diverges unless $\varphi\in\Gamma^*$. Next, for $\varphi\in\Gamma^*$ and almost all $t$, we argue that

$$L^r(p_r(\varphi(t)))\le L^s(p_s(\varphi(t)))\quad\text{when }r<s.\qquad(3.1)$$

It will be enough to show, from the form of the rates, the following:

$$\Big(1-\sum_{i=0}^r\big(1-[\dot\varphi^s(t)]_i\big)\Big)\log\frac{1-\sum_{i=0}^r(1-[\dot\varphi^s(t)]_i)}{(1-p(t))\big(1-\sum_{i=0}^r\frac{(i+\beta(t))\varphi_i^s(t)}{(1+\beta(t))t+\tilde c+c\beta(t)}\big)}
\le\sum_{i=r+1}^s\big(1-[\dot\varphi^s(t)]_i\big)\log\frac{1-[\dot\varphi^s(t)]_i}{(1-p(t))\frac{(i+\beta(t))\varphi_i^s(t)}{(1+\beta(t))t+\tilde c+c\beta(t)}}
+\Big(1-\sum_{i=0}^s\big(1-[\dot\varphi^s(t)]_i\big)\Big)\log\frac{1-\sum_{i=0}^s(1-[\dot\varphi^s(t)]_i)}{(1-p(t))\big(1-\sum_{i=0}^s\frac{(i+\beta(t))\varphi_i^s(t)}{(1+\beta(t))t+\tilde c+c\beta(t)}\big)}.$$

Consider now $h(x)=x\log x$, which is convex for $x\ge0$. Under conventions (1.3), for nonnegative numbers $a_i$ and $b_i$, we have

$$\frac{\sum_{i=p}^q a_i}{\sum_{i=p}^q b_i}\log\frac{\sum_{i=p}^q a_i}{\sum_{i=p}^q b_i}
=h\Big(\frac{\sum_{i=p}^q a_i}{\sum_{i=p}^q b_i}\Big)
=h\Big(\sum_{i=p}^q\frac{b_i}{\sum_{l=p}^q b_l}\cdot\frac{a_i}{b_i}\Big)
\le\sum_{i=p}^q\frac{b_i}{\sum_{l=p}^q b_l}\,h\Big(\frac{a_i}{b_i}\Big)
=\frac{\sum_{i=p}^q a_i\log(a_i/b_i)}{\sum_{i=p}^q b_i}.$$
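The chain above is the standard log-sum inequality, obtained from Jensen's inequality for the convex function $h$. As a quick numerical sanity check of its two ends (a sketch over arbitrary positive sample values, not tied to the model):

```python
import math
import random

def log_sum_lhs(a, b):
    # Left end of the chain: (sum a_i / sum b_i) * log(sum a_i / sum b_i)
    r = sum(a) / sum(b)
    return r * math.log(r)

def log_sum_rhs(a, b):
    # Right end of the chain: sum a_i log(a_i / b_i) / sum b_i
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b)) / sum(b)

# Check the inequality on many random positive vectors.
random.seed(0)
for _ in range(1000):
    n = random.randint(2, 6)
    a = [random.uniform(0.01, 1.0) for _ in range(n)]
    b = [random.uniform(0.01, 1.0) for _ in range(n)]
    assert log_sum_lhs(a, b) <= log_sum_rhs(a, b) + 1e-12
```

For a single term ($p=q$) the two ends coincide, as the convexity step is then trivial.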
We now finish the proof of (3.1) by applying the last sequence, with $p=r+1$ and $q=s+1$, to

$$a_j=\begin{cases}1-[\dot\varphi^s(t)]_j,&\text{for }r+1\le j\le s,\\ 1-\sum_{i=0}^s\big(1-[\dot\varphi^s(t)]_i\big),&\text{for }j=s+1,\end{cases}$$

and

$$b_j=\begin{cases}(1-p(t))\dfrac{(j+\beta(t))\varphi_j^s(t)}{(1+\beta(t))t+\tilde c+c\beta(t)},&\text{for }r+1\le j\le s,\\ (1-p(t))\Big(1-\sum_{i=0}^s\dfrac{(i+\beta(t))\varphi_i^s(t)}{(1+\beta(t))t+\tilde c+c\beta(t)}\Big),&\text{for }j=s+1.\end{cases}$$

Finally, given that $L^d(p_d(\varphi(t)))\ge0$ and, by (3.1), is nondecreasing in $d$, the identification of $J_\infty$ in the display of the proposition follows from monotone convergence. □

Proof of Theorem 1.4. Let $\Gamma_\infty\subset\prod_{i\ge0}C([0,1],\mathbb R)$, endowed with the product topology, be those elements $\xi=\langle\xi_0,\xi_1,\ldots\rangle$ such that: $\xi_i(0)=c_i$, $\xi_i(t)\ge0$, $0\le[\dot\xi(t)]_i\le1$ for $i\ge0$, $\frac{d}{dt}\sum_{i\ge0}\xi_i(t)=1$ and $\lim_d[\sum_{i=0}^d i\dot\xi_i(t)+(d+1)(1-[\dot\xi(t)]_d)]=\sum_{i\ge0}(1-[\dot\xi(t)]_i)\le1$ for a.e. $t$.

We now show that $\Gamma_\infty$ and $\Gamma^*$ are homeomorphic. Hence, as $\Gamma^*$ is compact, $\Gamma_\infty$ would also be compact. (We note, one can see directly that $\Gamma_\infty$ is compact.) Define the map $F:\Gamma_\infty\to\Gamma^*$ by $F(\xi)=\langle\xi^0,\ldots,\xi^d,\ldots\rangle$ where $\xi^d=\langle\xi_0,\ldots,\xi_d,t+c-[\xi]_d\rangle\in\Gamma_d$. In verifying the last inclusion, note $\sum_{i=0}^d i\dot\xi_i(t)+(d+1)(1-[\dot\xi(t)]_d)=\sum_{i=0}^d(1-[\dot\xi(t)]_i)\le\sum_{i\ge0}(1-[\dot\xi(t)]_i)\le1$. We now argue that $F$ is a bi-continuous bijection.

Indeed, we first note that $F^{-1}:\Gamma^*\to\Gamma_\infty$ is given by $F^{-1}(\varphi)=\langle\varphi_0^0,\ldots,\varphi_d^d,\ldots\rangle$. In checking $F^{-1}(\varphi)\in\Gamma_\infty$, note for $\varphi\in\Gamma^*$ that $\lim_d\sum_{i=0}^d(1-\sum_{l=0}^i\dot\varphi_l^l(t))=\lim_d\sum_{i=0}^d(1-[\dot\varphi^d(t)]_i)\le1$. Then, by bounded convergence with respect to the last term in the previous series, $\lim_d(t+\sum_{i=0}^dc_i-\sum_{i=0}^d\varphi_i^i(t))=\lim_d(t+\sum_{i=0}^dc_i-[\varphi^d(t)]_d)=0$, and so $\sum_{i\ge0}\varphi_i^i(t)=t+c$. Finally, it is not difficult to see that $F$ and $F^{-1}$ are both continuous in the product topology. Now, $\mathcal X^{n,\infty}\in\Gamma^*$, $X^{n,\infty}\in\Gamma_\infty$, and $F(X^{n,\infty})=\mathcal X^{n,\infty}$ for $n\ge1$.
Hence, through the action of $F$, the LDP for $\mathcal X^{n,\infty}$ translates to the LDP for $X^{n,\infty}$. We now identify the rate function. Given Propositions 3.1 and 3.2, for a degree distribution $\xi\in\Gamma_\infty$, we identify its rate as $I_\infty(\xi)=J_\infty(F(\xi))$. Since $\Gamma_\infty$ is closed, and therefore distributions $\xi\notin\Gamma_\infty$ can never be attained by $X^{n,\infty}$, we set $I_\infty(\xi)=\infty$ in this case. Last, by properties of $F$, as $J_\infty$ is a convex, good rate function, one readily obtains that $I_\infty$ is also a convex, good rate function. □

4. Proof of Corollary 1.7. We verify some properties of $\zeta^d$ in the next lemmas and conclude the proof of Corollary 1.7 at the end of the section.

Lemma 4.1. The ODE (1.5) has a unique Carathéodory solution $\zeta^d$.

Proof. Any Carathéodory solution to ODE (1.5), given the assumption that $p,\beta$ are piecewise continuous, is piecewise continuously differentiable. Since the defining ODEs are linear, one can solve them, and so the solution is unique and given by $\zeta^d=\langle\zeta_0(t),\zeta_1(t),\ldots,\bar\zeta_{d+1}(t)\rangle$ where, for $t\in[0,1]$,

$$\zeta_0(t):=c_0M_0(0,t)+\int_0^t(1-p(s))M_0(s,t)\,ds,$$

$$\zeta_1(t):=c_1M_1(0,t)+\int_0^t\Big(p(s)+(1-p(s))\frac{\beta(s)\zeta_0(s)}{(1+\beta(s))s+\tilde c+c\beta(s)}\Big)M_1(s,t)\,ds,\qquad(4.1)$$

$$\zeta_i(t):=c_iM_i(0,t)+\int_0^t(1-p(s))\frac{(i-1+\beta(s))\zeta_{i-1}(s)}{(1+\beta(s))s+\tilde c+c\beta(s)}M_i(s,t)\,ds$$

for $2\le i\le d$, and

$$\bar\zeta_{d+1}(t):=t+c-\sum_{i=0}^d\zeta_i(t)=\bar c_d+\int_0^t(1-p(s))\frac{(d+\beta(s))\zeta_d(s)}{(1+\beta(s))s+\tilde c+c\beta(s)}\,ds.$$

Here, for $0\le i\le d$,

$$M_i(s,t):=\exp\Big[-\int_s^t(1-p(u))\frac{i+\beta(u)}{(1+\beta(u))u+\tilde c+c\beta(u)}\,du\Big].\qquad\Box$$

Lemma 4.2. We have $\zeta^d\in\Gamma_d$, and moreover

$$\sum_{i=0}^\infty\zeta_i(t)=t+c\qquad\text{and}\qquad\sum_{i=0}^\infty i\zeta_i(t)=t+\tilde c.$$

Proof.
First, from properties of the ODE system and the piecewise continuity assumption on $p,\beta$ in (ND): $\zeta_i\ge0$; $\zeta_i$ is Lipschitz with constant 1 and moreover piecewise continuously differentiable; $0\le[\dot\zeta(t)]_i\le1$ for $i\ge0$; and $\sum_{i=0}^d\zeta_i(t)+\bar\zeta_{d+1}(t)=t+c$ for $d\ge0$ and all $t$. We postpone proving $\sum_{i=0}^d(1-[\dot\zeta(t)]_i)\le1$ for $d\ge0$ and a.e. $t$, which would complete the argument to show $\zeta^d\in\Gamma_d$, until the end.

We now show $\sum_{i\ge0}\zeta_i(t)=t+c$. From the defining ODEs (1.5), for $N\ge1$,

$$1-\sum_{i=0}^N\dot\zeta_i(t)=(1-p(t))\frac{(N+\beta(t))\zeta_N(t)}{(1+\beta(t))t+\tilde c+c\beta(t)},$$

and hence

$$t+\sum_{i=0}^N\zeta_i(0)-\sum_{i=0}^N\zeta_i(t)=\int_0^t(1-p(s))\frac{(N+\beta(s))\zeta_N(s)}{(1+\beta(s))s+\tilde c+c\beta(s)}\,ds.\qquad(4.2)$$

We obtain, as the integrand on the right-hand side is nonnegative, that $\sum_{i=0}^N\zeta_i(t)\le t+\sum_{i=0}^Nc_i\le t+c$ for all $t\ge0$ and $N\ge1$, where $c=\sum_{i=0}^\infty c_i$. In particular, $\sum_{i\ge0}\int_0^t\frac{\zeta_i(s)}{s+c}\,ds\le t$. Also, the right-hand side of (4.2), after a calculation, is bounded above by $C(N+1)\int_0^t\frac{\zeta_N(s)}{s+c}\,ds$, for a constant $C$ given the bounds on $\beta$ in (ND). Hence, since by nonnegativity and (LIM) the right-hand side of (4.2) has a limit as $N\uparrow\infty$, this limit must vanish and $\sum_{i\ge0}\zeta_i(t)=t+c$.

Next, to establish $\sum_{i\ge0}i\zeta_i(t)=t+\tilde c$, again from the ODEs, for $N\ge1$,

$$\sum_{i=0}^N i\dot\zeta_i(t)=p(t)+(1-p(t))\frac{\sum_{i=0}^N(i+\beta(t))\zeta_i(t)}{(1+\beta(t))t+\tilde c+c\beta(t)}-(1-p(t))\frac{(N+1)(N+\beta(t))\zeta_N(t)}{(1+\beta(t))t+\tilde c+c\beta(t)}.\qquad(4.3)$$

From nonnegativity of $\zeta_i$ and $\sum_{i=0}^\infty\zeta_i=t+c$, we bound the right-hand side of (4.3) by $p(t)+(1-p(t))\frac{\sum_{i=0}^N i\zeta_i(t)+\beta(t)(t+c)}{(1+\beta(t))t+\tilde c+c\beta(t)}$. Let $s_N(t):=\sum_{i=0}^N i\zeta_i(t)$. Then,

$$\dot s_N(t)\le p(t)+(1-p(t))\frac{s_N(t)+\beta(t)(t+c)}{(1+\beta(t))t+\tilde c+c\beta(t)}.$$

Since $s_N(t)$ is piecewise continuously differentiable, we have, by Lemma 4.3, that $s_N(t)\le t+\tilde c$ for $t\ge0$ and $N\ge1$.
Hence, $\sum_{i=0}^\infty\int_0^t\frac{i\zeta_i(s)}{s+c}\,ds\le At$ since $\tilde c\le Ac$ for some $A>0$, where $\tilde c=\sum_{i\ge0}ic_i<\infty$. Now, integrating both sides of ODE (4.3), we have

$$\sum_{i=0}^N i\zeta_i(t)-\sum_{i=0}^N ic_i=\int_0^tp(s)\,ds+\int_0^t(1-p(s))\frac{\sum_{i=0}^N(i+\beta(s))\zeta_i(s)}{(1+\beta(s))s+\tilde c+c\beta(s)}\,ds-\int_0^t(1-p(s))\frac{(N+1)(N+\beta(s))\zeta_N(s)}{(1+\beta(s))s+\tilde c+c\beta(s)}\,ds.\qquad(4.4)$$

From nonnegativity, our estimates and (LIM), the last integral above has a limit as $N\uparrow\infty$. This last integral in (4.4) is bounded above by $C(N+1)\int_0^t\frac{N\zeta_N(s)}{s+c}\,ds$, for a constant $C$ given the bounds on $\beta$ in (ND), and hence its limit must vanish. Then, using $\sum_{i=0}^\infty\zeta_i(t)=t+c$, we see that $s(t)=\sum_{i\ge0}i\zeta_i(t)$ satisfies the ODE in Lemma 4.3, and therefore $s(t)=t+\tilde c$.

Finally, to finish the postponed verification, noting (4.2), we have

$$\sum_{i=0}^d\big(1-[\dot\zeta(t)]_i\big)=(1-p(t))\frac{s_d(t)+\beta(t)\sum_{i=0}^d\zeta_i(t)}{(1+\beta(t))t+\tilde c+c\beta(t)}\le\frac{t+\tilde c+\beta(t)(t+c)}{(1+\beta(t))t+\tilde c+c\beta(t)}=1.\qquad\Box$$

Lemma 4.3. The ODE $\dot f(t)=G(t,f(t))$ with

$$G(t,x)=p(t)+(1-p(t))\frac{x+\beta(t)(t+c)}{(1+\beta(t))t+\tilde c+c\beta(t)}$$

and initial condition $f(0)=\tilde c$ has unique Carathéodory solution $t+\tilde c$ for $t\ge0$. In addition, if $u(t)$ is piecewise continuously differentiable, $u(0)=u_0\le\tilde c$, and $\dot u(t)\le G(t,u(t))$, then $u(t)\le t+\tilde c$ for $t\ge0$.

Proof. Since the ODE is linear and, from the piecewise continuity assumption on $p,\beta$ in (ND), $f$ is piecewise continuously differentiable, we can solve uniquely

$$f(t)=\tilde c\exp\{B(0,t)\}+\int_0^t\Big[p(s)+(1-p(s))\frac{\beta(s)(s+c)}{(1+\beta(s))s+\tilde c+c\beta(s)}\Big]\exp\{B(s,t)\}\,ds,$$

where $B(q,r)=\int_q^r\frac{1-p(v)}{(1+\beta(v))v+\tilde c+c\beta(v)}\,dv$. Recall the convention $0\cdot\infty=0$, so when $c=0$ the first term $\tilde c\,e^{B(0,t)}=0$ vanishes. However, $t+\tilde c$ is a solution, and therefore $f(t)$ may be identified as desired. The second statement is obtained similarly.
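That $t+\tilde c$ solves the ODE of Lemma 4.3 amounts to the algebraic identity $G(t,t+\tilde c)=1$, since the denominator satisfies $(1+\beta(t))t+\tilde c+c\beta(t)=(t+\tilde c)+\beta(t)(t+c)$. A numerical sketch of this identity, with arbitrary sample values of $p,\beta,c,\tilde c$ (illustrative values only, not from the paper):

```python
import random

def G(t, x, p, beta, c, c_tilde):
    # Drift G(t, x) from Lemma 4.3, with p, beta frozen at constants
    denom = (1 + beta) * t + c_tilde + c * beta
    return p + (1 - p) * (x + beta * (t + c)) / denom

# At x = t + c_tilde the fraction equals 1, so G(t, t + c_tilde) = 1,
# matching d/dt (t + c_tilde) = 1.
random.seed(1)
for _ in range(1000):
    p = random.uniform(0.0, 0.99)
    beta = random.uniform(0.1, 5.0)
    c = random.uniform(0.1, 3.0)
    c_tilde = random.uniform(0.1, 3.0)
    t = random.uniform(0.0, 10.0)
    assert abs(G(t, t + c_tilde, p, beta, c, c_tilde) - 1.0) < 1e-9
```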
□

Proof of Corollary 1.7. Any root of $I_d$ must be a Carathéodory solution to ODE (1.5). Hence, by Lemmas 4.1 and 4.2, $\zeta^d\in\Gamma_d$ is the unique minimizer of $I_d$. The LLNs now follow from the LDP upper bound in Theorem 1.2 and the Borel–Cantelli lemma. Statements about "mass" and "weight" of $\zeta^\infty$ are proved in Lemma 4.2. □

5. Proof of Corollary 1.9. Since $[\zeta^\infty]_i=[\zeta^d]_i$ for $i\le d$, the proof follows from the next lemma. Define, for $o_1,o_2,o_3,o_4,o_5\ge0$, the ODEs $O(o_1,o_2,o_3,o_4,o_5)$: with initial condition $\varphi(0)=c_d$,

$$\dot\varphi_0(t)=1-o_1-(1-o_2)\frac{o_3}{1+o_4}\cdot\frac{\varphi_0(t)}{t+o_5},\qquad[\dot\varphi(t)]_i=1-(1-o_2)\frac{i+o_3}{1+o_4}\cdot\frac{\varphi_i(t)}{t+o_5}\quad\text{for }1\le i\le d.$$

One can check that $\chi(t)$ is the solution to $O(o_1,o_2,o_3,o_4,o_5)$ above for $0\le o_1,o_2\le1$, where

$$\chi_i(t)=b_i(t+o_5)+\sum_{\ell=0}^i a_{i,\ell}\Big(\frac{o_5}{t+o_5}\Big)^{(1-o_2)(\ell+o_3)/(1+o_4)}\quad\text{for }0\le i\le d.\qquad(5.1)$$

Here, the sequence $b_i=b_i(o_1,o_2,o_3,o_4,o_5)$ is defined by

$$b_0=\frac{1-o_1}{1+(1-o_2)o_3/(1+o_4)},\qquad b_1=\frac{o_1+(1-o_2)o_3b_0/(1+o_4)}{1+(1-o_2)(1+o_3)/(1+o_4)},$$

and, for $i\ge2$,

$$b_i=b_1\prod_{\ell=2}^i\frac{(1-o_2)(\ell-1+o_3)/(1+o_4)}{1+(1-o_2)(\ell+o_3)/(1+o_4)}=b_1\frac{\Gamma(2+o_3+(1+o_4)/(1-o_2))}{\Gamma(1+o_3)}\cdot\frac{\Gamma(i+o_3)}{\Gamma(i+1+o_3+(1+o_4)/(1-o_2))}\sim i^{-(1+(1+o_4)/(1-o_2))}.$$

The sequence $a_{i,\ell}=a_{i,\ell}(o_1,o_2,o_3,o_4,o_5)$ is given by $a_{0,0}=c_0-b_0o_5$, and, for $i\ge1$,

$$a_{i,\ell}=\frac{i-1+o_3}{i-\ell}\,a_{i-1,\ell}\quad\text{where }0\le\ell<i,\qquad\text{and}\qquad a_{i,i}=c_i-b_io_5-\sum_{\ell=0}^{i-1}a_{i,\ell}.$$

Recall now the assumption in Corollary 1.9: $0\le p_{\min}\le p(\cdot)\le p_{\max}<1$ and $0<\beta_{\min}\le\beta(\cdot)\le\beta_{\max}<\infty$.

Lemma 5.1. The systems $O(p_{\min},p_{\max},\beta_{\min},\beta_{\max},\max\{\tilde c,c\})$ and $O(p_{\max},p_{\min},\beta_{\max},\beta_{\min},\min\{\tilde c,c\})$ have respective unique solutions $\tilde\zeta$ and $\hat\zeta$. Then, for $0\le i\le d$ and $t\in[0,1]$, with respect to the zero-cost trajectory $\zeta^d(t)$ in Corollary 1.7 with initial condition $\zeta^d(0)=c_d$, we have

$$[\hat\zeta(t)]_i\le[\zeta^d(t)]_i\le[\tilde\zeta(t)]_i.$$

Proof.
The proof that $\tilde\zeta$ and $\hat\zeta$ are the unique solutions uses a similar argument to that in the proof of Lemma 4.1. We now establish the inequality in the display with respect to $\tilde\zeta$, as an analogous proof works for $\hat\zeta$. We use induction to see that $[\tilde\zeta]_i\ge[\zeta]_i$ for $0\le i\le d$.

Since $\tilde\zeta(0)=\zeta(0)=c_d$, from the ODEs $O(p_{\min},p_{\max},\beta_{\min},\beta_{\max},\max\{\tilde c,c\})$ and (1.5), we have

$$\dot{\tilde\zeta}_0(t)-\dot\zeta_0(t)\ge p(t)-p_{\min}+(1-p_{\max})\frac{\beta_{\min}(\zeta_0(t)-\tilde\zeta_0(t))}{(1+\beta_{\max})(t+\max\{\tilde c,c\})},\qquad(5.2)$$

$$[\dot{\tilde\zeta}(t)]_i-[\dot\zeta(t)]_i\ge(1-p_{\max})\frac{(i+\beta_{\min})(\zeta_i(t)-\tilde\zeta_i(t))}{(1+\beta_{\max})(t+\max\{\tilde c,c\})}.\qquad(5.3)$$

For $i=0$, suppose $\tilde\zeta_0(t)<\zeta_0(t)$ for some $t$. Then, by continuity, we may assume that $\tilde\zeta_0(t)<\zeta_0(t)$ for all $t\in(t_1,t_2]$ for some $0\le t_1<t_2\le1$, and $\tilde\zeta_0(t_1)=\zeta_0(t_1)$. We may further arrange $t_1,t_2$, from the piecewise continuity assumptions in (ND), so that $p,\beta$ are continuous on $(t_1,t_2)$. From the mean value theorem, we find a $t'\in(t_1,t_2)$ such that $\dot{\tilde\zeta}_0(t')<\dot\zeta_0(t')$, which contradicts the ODE (5.2), as it gives $\dot{\tilde\zeta}_0(t')-\dot\zeta_0(t')>0$. Therefore, $\tilde\zeta_0\ge\zeta_0$.

Now, for $1\le i\le d$, suppose $[\tilde\zeta(t)]_i<[\zeta(t)]_i$ for some $t$. By the induction hypothesis ($[\tilde\zeta(\cdot)]_{i-1}\ge[\zeta(\cdot)]_{i-1}$), we must have $\tilde\zeta_i(t)<\zeta_i(t)$. Since $[\tilde\zeta(\cdot)]_i$, $[\zeta(\cdot)]_i$, $\tilde\zeta_i(\cdot)$ and $\zeta_i(\cdot)$ are continuous functions, as for the case $i=0$, we may assume $[\tilde\zeta(t)]_i<[\zeta(t)]_i$ and $\tilde\zeta_i(t)<\zeta_i(t)$, with $p,\beta$ continuous, for all $t\in(t_1,t_2)$ for some $0\le t_1<t_2\le1$, and also $\tilde\zeta_i(t_1)=\zeta_i(t_1)$. By the mean value theorem applied to $[\tilde\zeta(t)]_i-[\zeta(t)]_i$, there is a $t'\in(t_1,t_2)$ such that $[\dot{\tilde\zeta}(t')]_i<[\dot\zeta(t')]_i$. But (5.3) gives $[\dot{\tilde\zeta}(t')]_i-[\dot\zeta(t')]_i>0$, a contradiction. Therefore $[\tilde\zeta]_i\ge[\zeta]_i$. □

Proof of Corollary 1.9. Given Lemma 5.1, we need only detail the solutions $\tilde\zeta$ and $\hat\zeta$ when the initial configuration is "small" and "large," respectively.
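The power-law decay of the coefficients $b_i$ preceding Lemma 5.1, $b_i\sim i^{-(1+(1+o_4)/(1-o_2))}$, is what produces the power law bounds of Corollary 1.9. A numerical sketch, iterating the recursion that defines $b_i$ and checking the log-log slope of the tail (the parameter values below are arbitrary illustrative choices, not from the paper):

```python
import math

def b_coeffs(o1, o2, o3, o4, n):
    # b_0, b_1 as defined, then b_i from the recursion
    # b_i * (1 + (1-o2)(i+o3)/(1+o4)) = (1-o2)(i-1+o3)/(1+o4) * b_{i-1}
    k = (1 - o2) / (1 + o4)
    b = [(1 - o1) / (1 + k * o3)]
    b.append((o1 + k * o3 * b[0]) / (1 + k * (1 + o3)))
    for i in range(2, n + 1):
        b.append(b[-1] * k * (i - 1 + o3) / (1 + k * (i + o3)))
    return b

o1, o2, o3, o4 = 0.1, 0.2, 1.0, 1.5   # sample p-type and beta-type bounds
b = b_coeffs(o1, o2, o3, o4, 4000)
gamma = 1 + (1 + o4) / (1 - o2)       # predicted tail exponent
slope = math.log(b[4000] / b[2000]) / math.log(2.0)
assert abs(slope + gamma) < 0.01      # log-log slope is approximately -gamma
```

Doubling the index from 2000 to 4000 multiplies $b_i$ by roughly $2^{-\gamma}$, consistent with the stated gamma-function asymptotics.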
To this end, when the initial configuration is "small" ($c_i\equiv0$), $\tilde\zeta$ and $\hat\zeta$ are linear, namely $\tilde\zeta_i(t)=\tilde b_it$ and $\hat\zeta_i(t)=\hat b_it$, where $\tilde b_i:=b_i(p_{\min},p_{\max},\beta_{\min},\beta_{\max},0)$ and $\hat b_i:=b_i(p_{\max},p_{\min},\beta_{\max},\beta_{\min},0)$ [cf. (5.1)]. On the other hand, when the initial configuration is "large" ($c_i>0$ for $0\le i\le d+1$), as $t\uparrow\infty$, $\tilde\zeta_i(t)=(\tilde b_i+o(1))t$ and $\hat\zeta_i(t)=(\hat b_i+o(1))t$. □

Acknowledgments. We thank the referees/editors for constructive comments which helped improve the exposition of the paper.

REFERENCES

[1] Albert, R. and Barabási, A.-L. (2002). Statistical mechanics of complex networks. Rev. Modern Phys.
[2] Athreya, K. B., Ghosh, A. P. and Sethuraman, S. (2008). Growth of preferential attachment random graphs via continuous-time branching processes. Proc. Indian Acad. Sci. Math. Sci.
[3] Barabási, A.-L. (2009). Scale-free networks: A decade and beyond. Science.
[4] Barabási, A.-L. and Albert, R. (1999). Emergence of scaling in random networks. Science.
[5] Barrat, A., Barthelemy, M., Pastor-Satorras, R. and Vespignani, A. (2004). The architecture of complex weighted networks. PNAS.
[6] Berger, N., Borgs, C., Chayes, J. and Saberi, A. (2009). A weak local limit for preferential attachment graphs. Preprint.
[7] Berger, N., Borgs, C., Chayes, J. T. and Saberi, A. (2005). On the spread of viruses on the internet. In Proceedings of the Sixteenth Annual ACM–SIAM Symposium on Discrete Algorithms.
[8] Bhamidi, S. (2007). Universal techniques to analyze preferential attachment trees: Global and local analysis. Preprint.
[9] Bollobás, B. and Riordan, O. (2004). The diameter of a scale-free random graph. Combinatorica.
[10] Bollobás, B., Riordan, O., Spencer, J. and Tusnády, G. (2001). The degree sequence of a scale-free random graph process. Random Structures Algorithms.
[11] Borgs, C., Chayes, J., Lovász, L., Sós, V. and Vesztergombi, K. (2011). Limits of randomly grown graph sequences.
European J. Combin.
[12] Bornholdt, S. and Schuster, H. G., eds. (2003). Handbook of Graphs and Networks: From the Genome to the Internet. Wiley–VCH, Weinheim. MR2016116
[13] Bryc, W., Minda, D. and Sethuraman, S. (2009). Large deviations for the leaves in some random trees. Adv. in Appl. Probab.
[14] Caldarelli, G. (2007). Scale-Free Networks: Complex Webs in Nature and Technology. Oxford Univ. Press, Oxford.
[15] Chung, F., Handjani, S. and Jungreis, D. (2003). Generalizations of Polya's urn problem. Ann. Comb.
[16] Chung, F. and Lu, L. (2006). Complex Graphs and Networks. CBMS Regional Conference Series in Mathematics. Amer. Math. Soc., Providence, RI. MR2248695
[17] Cohen, R. and Havlin, S. (2010). Complex Networks: Structure, Robustness and Function. Cambridge Univ. Press, Cambridge.
[18] Cooper, C. and Frieze, A. (2003). A general model of web graphs. Random Structures Algorithms.
[19] Cooper, C. and Frieze, A. (2007). The cover time of the preferential attachment graph. J. Combin. Theory Ser. B.
[20] Dembo, A. and Zeitouni, O. (1998). Large Deviations Techniques and Applications, 2nd ed. Applications of Mathematics (New York). Springer, New York. MR1619036
[21] Dereich, S. and Mörters, P. (2009). Random networks with sublinear preferential attachment: Degree evolutions. Electron. J. Probab.
[22] Dereich, S., Mönch, C. and Mörters, P. (2011). Typical distances in ultrasmall random networks. Available at arXiv:1102.5680v1.
[23] Dorogovtsev, S. N., Krapivsky, P. L. and Mendes, J. F. F. (2008). Transition from small to large world in growing networks. Europhys. Lett. EPL, Art. 30004, 5. MR2443959
[24] Dorogovtsev, S. N. and Mendes, J. F. F. (2000). Evolution of networks with aging of sites. Phys. Rev. E.
[25] Dorogovtsev, S. N. and Mendes, J. F. F. (2001). Scaling properties of scale-free evolving networks: Continuous approach. Phys. Rev. E.
[26] Dorogovtsev, S. N. and Mendes, J. F. F. (2003). Evolution of Networks: From Biological Nets to the Internet and WWW. Oxford Univ.
Press, Oxford. MR1993912
[27] Drinea, E., Enachescu, M. and Mitzenmacher, M. (2001). Variations on random graph models for the web. Harvard Technical Report TR-06-01.
[28] Drinea, E., Frieze, A. and Mitzenmacher, M. (2002). Balls and bins models with feedback. In Proc. of the 11th ACM–SIAM Symposium on Discrete Algorithms (SODA).
[29] Dupuis, P. and Ellis, R. S. (1997). A Weak Convergence Approach to the Theory of Large Deviations. Wiley, New York. MR1431744
[30] Durrett, R. (2007). Random Graph Dynamics. Cambridge Univ. Press, Cambridge. MR2271734
[31] Fortunato, S., Flammini, A. and Menczer, F. (2006). Scale-free network growth by ranking. Phys. Rev. Lett.
[32] Frieze, A., Vera, J. and Chakrabarti, S. (2006). The influence of search engines on preferential attachment. Internet Math.
[33] Gjoka, M., Kurant, M., Butts, C. T. and Markopoulou, A. (2011). Practical recommendations on crawling online social networks. IEEE Journal on Selected Areas in Communications.
[34] Janssen, J. and Prałat, P. (2010). Rank-based attachment leads to power law graphs. SIAM J. Discrete Math.
[35] Katona, Z. (2005). Width of a scale-free tree. J. Appl. Probab.
[36] Krapivsky, P. and Redner, S. (2001). Organization of growing random networks. Phys. Rev. E.
[37] Krapivsky, P. L. and Redner, S. (2002). Finiteness and fluctuations in growing networks. J. Phys. A.
[38] Krapivsky, P. L., Rodgers, G. J. and Redner, S. (2001). Degree distributions of growing networks. Phys. Rev. Lett.
[39] Mihail, M., Papadimitriou, C. and Saberi, A. (2006). On certain connectivity properties of the internet topology. J. Comput. System Sci.
[40] Mitzenmacher, M. (2004). A brief history of generative models for power law and lognormal distributions. Internet Math.
[41] Móri, T. F. (2002). On random trees. Studia Sci. Math. Hungar.
[42] Móri, T. F. (2005). The maximum degree of the Barabási–Albert random tree. Combin. Probab. Comput.
[43] Newman, M., Barabási, A.-L. and Watts, D. J., eds. (2006). The Structure and Dynamics of Networks. Princeton Univ. Press, Princeton, NJ.
MR2352222
[44] Newman, M. E. J. (2003). The structure and function of complex networks. SIAM Rev.
[45] Newman, M. E. J. (2010). Networks: An Introduction. Oxford Univ. Press, Oxford. MR2676073
[46] Oliveira, R. and Spencer, J. (2005). Connectivity transitions in networks with super-linear preferential attachment. Internet Math.
[47] Ráth, B. and Szakács, L. (2011). Multigraph limit of the dense configuration model and the preferential attachment graph. Preprint. Available at arXiv:1106.2058.
[48] Rudas, A., Tóth, B. and Valkó, B. (2007). Random trees and general branching processes. Random Structures Algorithms.
[49] Simkin, M. V. and Roychowdhury, V. P. (2011). Re-inventing Willis. Phys. Rep.
[50] Simon, H. A. (1955). On a class of skew distribution functions. Biometrika.
[51] Yule, G. U. (1924). A mathematical theory of evolution, based on the conclusions of Dr. J. C. Willis. Philos. Trans. Roy. Soc. London Ser. B.
[52] Zhang, J. X. and Dupuis, P. (2008). Large-deviation approximations for general occupancy models. Combin. Probab. Comput.

Department of Mathematics
Syracuse University
Syracuse, New York 13244
USA
E-mail: [email protected]