The dynamics of power laws: Fitness and aging in preferential attachment trees

Abstract

Continuous-time branching processes describe the evolution of a population whose individuals generate a random number of children according to a birth process. Such branching processes can be used to understand preferential attachment models in which the birth rates are linear functions. We are motivated by citation networks, where power-law citation counts are observed as well as aging in the citation patterns. To model this, we introduce fitness and age-dependence in these birth processes. The multiplicative fitness moderates the rate at which children are born, while the aging is integrable, so that individuals receive a finite number of children in their lifetime. We show the existence of a limiting degree distribution for such processes. In the preferential attachment case, where fitness and aging are absent, this limiting degree distribution is known to have power-law tails. We show that the limiting degree distribution has exponential tails for bounded fitnesses in the presence of integrable aging, while the power-law tail is restored when integrable aging is combined with fitness with unbounded support with at most exponential tails. In the absence of integrable aging, such processes are explosive.



Alessandro Garavaglia (a), Remco van der Hofstad (a), and Gerhard Woeginger (b)

(a) Department of Mathematics and Computer Science, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
(b) Department of Computer Science, RWTH Aachen, Ahornstrasse 55, D-52074 Aachen, Germany

Introduction

Preferential attachment models (PAMs) aim to describe dynamical networks. As for many real-world networks, PAMs present power-law degree distributions that arise directly from the dynamics, and are not artificially imposed as, for instance, in configuration models or inhomogeneous random graphs.

PAMs were first proposed by Albert and Barabási [1], who defined a random graph model where, at every discrete time step, a new vertex is added with one or more edges, which are attached to existing vertices with probability proportional to the degrees, i.e.,

$$\mathbb{P}(\text{vertex } (n+1) \text{ is attached to vertex } i \mid \text{graph at time } n) \propto D_i(n),$$

where $D_i(n)$ denotes the degree of a vertex $i \in \{1, \ldots, n\} = [n]$ at time $n$. In general, the dependence of the attachment probabilities on the degree can be through a preferential attachment function of the degree, also called preferential attachment weights. Such models are called PAMs with general weight function. According to the asymptotics of the weight function $w(\cdot)$, the limiting degree distribution of the graph can behave rather differently. There is an enormous body of literature showing that PAMs present power-law decay in the limiting degree distribution precisely when the weight function is affine, i.e., it is a constant plus a linear function. See e.g. [14, Chapter 8] and the references therein. In addition, these models show the so-called old-get-richer effect, meaning that the vertices of highest degrees are the vertices present early in the network formation. An extension of this model is called preferential attachment models with a random number of edges [8], where new vertices are added to the graph with a different number of edges according to a fixed distribution, and again power-law degree sequences arise. A generalization that also gives younger vertices the chance to have high degrees is given by PAMs with fitness as studied in [10], [9]. Borgs et al. [6] present a complete description of the limiting degree distribution of such models, with different regimes according to the distribution of the fitness, using generalized Pólya's urns. An interesting variant of a multi-type PAM is investigated in [21], where the author considers PAMs where fitnesses are not i.i.d. across the vertices, but are sampled according to distributions depending on the fitnesses of the ancestors.

This work is motivated by citation networks, where vertices denote papers and the directed edges correspond to citations. For such networks, other models using preferential attachment schemes and adaptations of them have been proposed, mainly in the physics literature. Aging effects, i.e., considering the age of a vertex in its likelihood to obtain children, have been extensively considered as the starting point to investigate their dynamics [25], [26], [11], [12], [7]. Here the idea is that old papers are less likely to be cited than new papers. Such aging has been observed in many citation network datasets and makes PAMs with weight functions depending only on the degree ill-suited for them. As mentioned above, such models could more aptly be called old-get-richer models, i.e., in general old vertices have the highest degrees. In citation networks, instead, papers with many citations appear all the time. Barabási, Wang and Song [24] investigate a model that incorporates these effects. On the basis of empirical data, they suggest a model where the aging function follows a lognormal distribution with paper-dependent parameters, and the preferential attachment function is the identity. In [24], the fitness function is estimated rather than the more classical approach where it is taken to be i.i.d.
Hazoglou, Kulkarni, Skiena and Dill [13] propose similar dynamics for citation evolution, but only considering the presence of aging and cumulative advantage, without fitness.

Tree models, arising when new vertices are added with only one edge, have been analyzed in [2], [3], [23], [22] and lead to continuous-time branching processes (CTBP). The degree distributions in tree models show the same qualitative behavior as in the non-tree setting, while their analysis is much simpler. Motivated by this and the wish to understand the qualitative behavior of PAMs with general aging and fitness, the starting point of our model is the CTBP or tree setting. Such processes have been intensively studied, due to their applications in other fields, such as biology. Detailed and rigorous analysis of CTBPs can be found in [4], [15], [18], [23], [2], [3], [5]. A CTBP consists of individuals, whose children are born according to certain birth processes, these processes being i.i.d. across the individuals in the population. The birth processes $(V_t)_{t \ge 0}$ are defined in terms of point or jump processes on $\mathbb{N}$ [15], [18], where the birth times of children are the jump times of the process, and the number of children of an individual at time $t \in \mathbb{R}^+$ is given by $V_t$.

In the literature, CTBPs are used as a technical tool to study PAMs [3], [23], [21]. Indeed, the CTBP at the $n$th birth time follows the same law as the PAM consisting of $n$ vertices. In [3], [23], the authors prove an embedding theorem between branching processes and preferential attachment trees, and give a description of the degree distribution in terms of the asymptotic behavior of the weight function $w(\cdot)$. In particular, a power-law degree distribution is present in the case of (asymptotically) linear weight functions [22]. In the sub-linear case, instead, the degree distribution is stretched-exponential, while in the super-linear case it collapses, in the sense that one of the first vertices will receive all the incoming new edges after a certain step [19]. Due to the apparent exponential growth of the number of nodes in citation networks, we view the continuous-time process as the real network, which deviates from the usual perspective.

[Figures 1 and 2: panels PS, EE and BT.]

Figure 3: Degree distribution for papers from 1984 over time (panels PS, EE and BT).

Because of its motivating role in this paper, let us now discuss the empirical properties of citation networks in more detail. We analyze the Web Of Science database, focusing on three different fields of science:

Probability and Statistics (PS),

Electrical Engineering (EE) and

Biotechnology and Applied Microbiology (BT). We first point out some characteristics of citation networks that we wish to replicate in our models.

Real-world citation networks possess five main characteristics:

(1) In Figure 1, we see that the number of scientific publications grows exponentially in time. While this is quite prominent in the data, it is unclear how this exponential growth arises. This could either be due to the fact that the number of journals that are listed in Web Of Science grows over time, or that journals contain more and more papers.

(2) In Figure 2, we notice that these datasets have empirical power-law citation distributions. Thus, most papers attract few citations, but the amount of variability in the number of citations is rather substantial. We are also interested in the dynamics of the citation distribution of the papers published in a given year, as time proceeds. This can be observed in Figure 3. We see a dynamical power law, meaning that at any time the degree distribution is close to a power law, but the exponent changes over time (and in fact decreases, which corresponds to heavier tails). When time grows quite large, the power-law exponent approaches a fixed value.

(3) In Figure 4, we see that the majority of papers stop receiving citations after some time, while a few others keep being cited for longer times. This inhomogeneity in the evolution of node degrees is not present in classical PAMs, where the degree of every fixed vertex grows as a positive power of the graph size. Figure 4 shows that the number of citations of papers published in the same year can be rather different, and the majority of papers actually stop receiving citations quite soon. In particular, after a first increase, the average increment of citations decreases over time (see Figure 5). We observe a difference in this aging effect between the PS dataset and the other two datasets, due to the fact that in PS, scientists tend to cite older papers than in EE or BT. Nevertheless, the average increment of citations received by papers in different years tends to decrease over time for all three datasets.

(4) Figure 6 shows the linear dependence between the past number of citations of a paper and the future ones. Each plot represents the average number of citations received by papers published in 1984 in the years 1993, 2006 and 2013 according to the initial number of citations in the same year. At least for low values of the starting number of citations, we see that the average number of citations received during a year grows linearly. This suggests that the attractiveness of a paper depends on the past number of citations through an affine function.

(5) A last characteristic that we observe is the lognormal distribution of the age of cited papers. In Figure 7, we plot the distribution of cited papers, looking at references made by papers in different years. We have used a 20-year time window in order to compare different citing years. Notice that this lognormal distribution seems to be very similar within different years, and the shape is similar over different fields.

Figure 4: Time evolution for the number of citations of samples of 20 randomly chosen papers from 1980 for PS and EE, and from 1982 for BT.

Figure 5: Average degree increment over a 20-year time window for papers published in different years. PS presents an aging effect different from EE and BT, showing that papers in PS receive citations longer than papers in EE and BT.

Figure 6: Linear dependence between past and future number of citations for papers from 1988 (panels PS, EE and BT; data and linear regression for 1992, 2005 and 2012).

Figure 7: Distribution of the age of cited papers for different citing years (panels PS, EE and BT).

Let us now explain how we translate the above empirical characteristics into our model. First, CTBPs grow exponentially over time, as observed in citation networks. Secondly, the aging present in citation networks, as seen both in Figures 4 and 5, suggests that citation rates become smaller for large times, in such a way that typical papers stop receiving citations at some (random) point in time. The hardest characteristic to explain is the power-law degree sequence. For this, we note that citations of papers are influenced by many external factors that affect the attractiveness of papers (the journal, the authors, the topic, ...). Since this cannot be quantified explicitly, we introduce another source of randomness in our birth processes that we call fitness. This appears in the form of multiplicative factors of the attractiveness of a paper, and for lack of better knowledge, we take these factors to be i.i.d. across papers, as often assumed in the literature. These assumptions are similar in spirit to the ones by Barabási et al. [24], which were also motivated by citation data, and we formalize and extend their results considerably. In particular, we give the precise conditions under which power-law citation counts are observed in this model.

Our main goal is to define CTBPs with both aging and random fitness that keep having a power-law decay in the in-degree distribution. Before discussing our model in detail in Section 2, we present the heuristic ideas behind it as well as the main results of this paper.

The crucial point of this work is to show that it is possible to obtain power-law degree distributions in preferential attachment trees where the birth process does not depend only on an asymptotically linear weight sequence, in the presence of integrable aging and fitness. Let us now briefly explain how these two effects change the behavior of the degree distribution.

Integrable aging and affine preferential attachment without fitness.

In the presence of aging but without fitness, we show that the aging effect substantially slows down the birth process. In the case of affine weights, aging destroys the power law of the stationary regime, generating a limiting distribution that consists of a power law with exponential truncation. We prove this under reasonable conditions on the underlying aging function (see Lemma 5.1).
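The "power law with exponential truncation" shape can already be seen in a simpler quantity than the limiting degree distribution, namely the lifetime number of children of one individual. The sketch below is an illustration under stated assumptions, not the paper's proof: it uses the time-change representation stated later in the paper (the aging process at time $t$ is distributed as the stationary process at time $G(t)$), together with the classical fact that the birth process with rates $f_k = ak + b$ started at $0$ has a negative binomial law at any fixed time. The aging function $g(t) = e^{-\lambda t}$, the parameter values and the function name are hypothetical choices.

```python
import math

def lifetime_offspring_pmf(k, a, b, G_inf):
    """P(V_s = k) at s = G_inf for the birth process with rates f_k = a*k + b.

    Classical negative binomial law with r = b/a and success probability
    exp(-a*s); computed in log-space to avoid overflow for large k.
    """
    r = b / a
    log_p = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
             - b * G_inf                                  # log of exp(-a*s)**r
             + k * math.log1p(-math.exp(-a * G_inf)))    # log of (1 - exp(-a*s))**k
    return math.exp(log_p)

# Hypothetical parameters: affine weights and g(t) = exp(-lam * t).
a, b, lam = 1.0, 2.0, 0.5
G_inf = 1.0 / lam  # total integral of g

pmf = [lifetime_offspring_pmf(k, a, b, G_inf) for k in range(400)]
total = sum(pmf)
mean = sum(k * p for k, p in enumerate(pmf))
decay = pmf[301] / pmf[300]  # ratio of consecutive probabilities far in the tail

print(total, mean, decay)
```

The probabilities have the form $\Gamma(k + b/a)/\Gamma(k+1)$ (a power-law factor) times $(1 - e^{-aG(\infty)})^k$ (a geometric factor), so the ratio of consecutive terms tends to a constant below one. This concerns the offspring of a single individual only; the limiting degree distribution $p_k$ requires the analysis referred to in the text.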

Integrable aging and super-linear preferential attachment without fitness.

Since the aging destroys the power law of the affine PA case, it is natural to ask whether the combination of integrable aging and super-linear weights restores the power-law limiting degree distribution. Theorem 2.3 states that this is not the case, as super-linear weights imply explosiveness of the branching process, which is clearly unrealistic in the setting of citation networks (here, we call a weight sequence $k \mapsto f_k$ super-linear when $\sum_{k \ge 1} 1/f_k < \infty$). This result is quite general, because it holds for any integrable aging function. Due to this, it is impossible to obtain power laws from super-linear preferential attachment weights. This suggests that (apart from slowly-varying functions), affine preferential attachment weights have the strongest possible growth, while maintaining exponential (and thus, in particular, non-explosive) growth.

Integrable aging and affine preferential attachment with unbounded fitness.

In the case of aging and fitness, the asymptotic behavior of the limiting degree distribution is rather involved. We estimate the asymptotic decay of the limiting degree distribution with affine weights in Proposition 5.5. With the example fitness classes analyzed in Section 5.3, we prove that power-law tails are possible in the setting of aging and fitness, at least when the fitness has roughly exponential tail. So far, PAMs with fitness required the support of the fitness distribution to be bounded. The addition of aging allows the support of the fitness distribution to be unbounded, a feature that seems reasonable to us in the context of citation networks. Indeed, the relative attractiveness of one paper compared to another one can be enormous, which is inconsistent with a bounded fitness distribution. While we do not know precisely what the necessary and sufficient conditions are on the aging and the fitness distribution to assure a power-law degree distribution, our results suggest that affine PA weights with integrable aging and fitnesses with at most an exponential tail in general do so, a feature that was not observed before.
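To see why an exponential fitness tail can turn the truncated law back into a power law, consider again the lifetime number of children of a single individual, which the paper later identifies as $V_{Y G(\infty)}$. The sketch below is a direct calculation under specific assumptions and is not the paper's Proposition 5.5: affine weights $f_k = ak + b$, fitness $Y$ exponentially distributed with rate $\theta$, and the classical negative binomial law of the affine birth process. Averaging that law over $Y$ gives a Beta integral, so that $q_k = \frac{\Gamma(k+r)}{\Gamma(r)} \, c \, \frac{\Gamma(c+r)}{\Gamma(c+r+k+1)}$ with $r = b/a$ and $c = \theta/(aG)$, which decays like $k^{-(1+c)}$. The function names and parameter values are hypothetical.

```python
import math

def log_q(k, a, b, theta, G):
    """log P(V_{Y*G} = k) for rates f_k = a*k + b and Y ~ Exp(theta).

    Closed form obtained by integrating the negative binomial law of V_{y*G}
    against the exponential density of Y (a Beta integral).
    """
    r = b / a
    c = theta / (a * G)
    return (math.lgamma(k + r) - math.lgamma(r) + math.log(c)
            + math.lgamma(c + r) - math.lgamma(c + r + k + 1))

def tail_exponent(a, b, theta, G, k1=10_000, k2=100_000):
    """Estimate tau in q_k ~ k^(-tau) from the log-log slope between k1 and k2."""
    slope = ((log_q(k2, a, b, theta, G) - log_q(k1, a, b, theta, G))
             / (math.log(k2) - math.log(k1)))
    return -slope

# Hypothetical parameters; G plays the role of G(infinity), or of G(t) at a fixed age t.
a, b, theta = 1.0, 1.0, 2.0
for G in (0.5, 1.0, 2.0):
    print(G, tail_exponent(a, b, theta, G), 1.0 + theta / (a * G))
```

Under these assumptions the estimated exponent matches $1 + \theta/(aG)$, and replacing $G(\infty)$ by $G(t)$ gives an exponent that decreases with the age $t$. With a bounded (for instance constant) fitness the same quantity has a geometric tail instead.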

Dynamical power laws.

In the case of fitness with exponential tails, we further observe that the number of citations of a paper of age $t$ has a power-law distribution with an exponent that depends on $t$. We call this a dynamical power law, and it is a possible explanation of the dynamical power laws observed in citation data (see Figure 3).

Universality.

An interesting and highly relevant observation in this paper is that the limiting degree distribution of preferential attachment trees with aging and fitness shows a high amount of universality. Indeed, for integrable aging functions, the dependence on the precise choice of the aging function seems to be minor, except for the total integral of the aging function. Further, the dependence on fitness is quite robust as well.

Our model and main results

In this paper we introduce the effect of aging and fitness in CTBP populations, giving rise to directed trees. Our model is motivated by the study of citation networks, which can be seen as directed graphs. Trees are the simplest case in which we can see the effects of aging and fitness. Previous work has shown that PAMs can be obtained from PA trees by collapsing, and their general degree structure can be quite well understood from those in trees. For example, PAMs with fixed out-degree $m \ge 2$ can be defined through a collapsing procedure, where a vertex in the multigraph is formed by $m \in \mathbb{N}$ vertices in the tree (see [14, Section 8.2]). In this case, the limiting degree distribution of the PAM preserves the structure of the tree case ([14, Section 8.4], [5, Section 5.7]). This explains the relevance of the tree case results for the study of the effect of aging and fitness in PAMs. It could be highly interesting to prove this rigorously.

CTBPs represent a population made of individuals producing children independently from each other, according to i.i.d. copies of a birth process on $\mathbb{N}$. We present the general theory of CTBPs in Section 3, where we define such processes in detail and we refer to general results that are used throughout the paper. In general, considering a birth process $(V_t)_{t \ge 0}$ on $\mathbb{N}$, every individual in the population has an i.i.d. copy of the process $(V_t)_{t \ge 0}$, and the number of children of individual $x$ at time $t$ is given by the value of the process $V^x_t$. We consider birth processes defined by a sequence of weights $(f_k)_{k \in \mathbb{N}}$ describing the birth rates. Here, the time between the $k$th and the $(k+1)$st jump is exponentially distributed with parameter $f_k$. The behavior of the whole population is determined by this sequence.

The fundamental theorem for the CTBPs that we study is Theorem 3.10 quoted in Section 3.
It states that, under some hypotheses on the birth process $(V_t)_{t \ge 0}$, the population grows exponentially in time, which nicely fits the exponential growth of scientific publications as indicated in Figure 1. Further, using a so-called random vertex characteristic as introduced in [15], a complete class of properties of the population can be described, such as the fraction of individuals having $k$ children, as we investigate in this paper. The two main properties are stated in Definitions 3.8 and 3.9, and are called the supercritical and Malthusian properties. These properties require that there exists a positive value $\alpha^*$ such that

$$\mathbb{E}\left[V_{T_{\alpha^*}}\right] = 1, \qquad \text{and} \qquad -\frac{d}{d\alpha}\,\mathbb{E}\left[V_{T_\alpha}\right]\Big|_{\alpha = \alpha^*} < \infty,$$

where $T_\alpha$ denotes an exponentially distributed random variable with rate $\alpha$ independent of the process $(V_t)_{t \ge 0}$. The unique value $\alpha^*$ that satisfies both conditions is called the Malthusian parameter, and it describes the exponential growth rate of the population size. The aim is to investigate the ratio

$$\frac{\text{number of individuals with } k \text{ children at time } t}{\text{size of the total population at time } t}.$$

According to Theorem 3.10, this ratio converges almost surely to a deterministic limiting value $p_k$. The sequence $(p_k)_{k \in \mathbb{N}}$, which we refer to as the limiting degree distribution of the CTBP (see Definition 3.12), is given by

$$p_k = \mathbb{E}\left[\mathbb{P}(V_u = k)_{u = T_{\alpha^*}}\right].$$

The starting idea of our model of citation networks is that, given the history of the process up to time $t$,

$$\text{the rate of an individual of age } t \text{ and } k \text{ children to generate a new child is } Y f_k\, g(t), \tag{2.1}$$

where $f_k$ is a non-decreasing PA function of the degree, $g$ is an integrable function of time, and $Y$ is a positive random variable called fitness. Therefore, the likelihood to generate children increases by having many children and/or a high fitness, while it is reduced by age.

Recalling Figure 6, we assume that the PA function $f$ is affine, so $f_k = ak + b$.
In terms of a PA scheme, this implies

$$\mathbb{P}(\text{a paper cites another with past } k \text{ citations} \mid \text{past}) \approx \frac{n(k)(ak + b)}{A},$$

where $n(k)$ denotes the number of papers with $k$ past citations, and $A$ is the normalization factor. Such behavior has already been observed by Redner [20] and Barabási et al. [16].

We assume throughout the paper that the aging function $g$ is integrable. In fact, we start from the fact that the age of cited papers is lognormally distributed (recall Figure 7). By normalizing such a distribution by the average increment in the number of citations of papers in the selected time window, we identify a universal function $g(t)$. Such a function can be approximated by a lognormal shape of the form

$$g(t) \approx c_1 e^{-c_2 (\log(t+1) - c_3)^2},$$

for $c_1$, $c_2$ and $c_3$ field-dependent parameters. In particular, from the procedure used to define $g(t)$, we observe that

$$g(t) \approx \frac{\text{number of references to year } t}{\text{number of papers of age } t} \cdot \frac{\text{total number of papers considered}}{\text{total number of references considered}},$$

which means in terms of PA mechanisms that

$$\mathbb{P}(\text{a paper cites another of age } t \mid \text{past}) \approx \frac{n(t)\, g(t)}{B},$$

where $B$ is the normalization factor, while this time $n(t)$ is the number of papers of age $t$. This suggests that the citing probability depends on age through a lognormal aging function $g(t)$, which is integrable. This is one of the main assumptions in our model, as we discuss in Section 1.2.

It is known from the literature ([22], [23], [2]) that CTBPs show power-law limiting degree distributions when the infinitesimal rates of jump depend only on a sequence $(f_k)_{k \in \mathbb{N}}$ that is asymptotically linear. Our main aim is to investigate whether power laws can also arise in branching processes that include aging and fitness. The results are organized as follows. In Section 2.2, we discuss the results for CTBPs with aging in the absence of fitness. In Section 2.3, we present the results with aging and fitness.
In Section 2.4, we specialize to fitness with distributions with exponential tails, where we show that the limiting degree distribution is a power law with a dynamic power-law exponent.

In this section, we focus on aging in PA trees in the absence of fitness. The aging process can then be viewed as a time-changed stationary birth process (see Definition 3.13). A stationary birth process is a stochastic process $(V_t)_{t \ge 0}$ such that, for $h$ small enough,

$$\mathbb{P}(V_{t+h} = k + 1 \mid V_t = k) = f_k h + o(h).$$

In general, we assume that $k \mapsto f_k$ is increasing. The affine case arises when $f_k = ak + b$ with $a, b > 0$. By our observations in Figure 6, as well as related works ([20], [16]), the affine case is a reasonable approximation for the attachment rates in citation networks.

For a stationary birth process $(V_t)_{t \ge 0}$, under the assumption that it is supercritical and Malthusian, the limiting degree distribution $(p_k)_{k \in \mathbb{N}}$ of the corresponding branching process is given by

$$p_k = \frac{\alpha^*}{\alpha^* + f_k} \prod_{i=0}^{k-1} \frac{f_i}{\alpha^* + f_i}. \tag{2.2}$$

For a more detailed description, we refer to Section 3.2. Branching processes defined by stationary processes (with no aging effect) have a so-called old-get-richer effect. As this is not what we observe in citation networks (recall Figure 4), we want to introduce aging in the reproduction process of individuals. The aging process arises by adding age-dependence in the infinitesimal transition probabilities:

Definition 2.1 (Aging birth processes). Consider a non-decreasing PA sequence $(f_k)_{k \in \mathbb{N}}$ of positive real numbers and an aging function $g : \mathbb{R}^+ \to \mathbb{R}^+$. We call a stochastic process $(N_t)_{t \ge 0}$ an aging birth process (without fitness) when
(1) $N_0 = 0$, and $N_t \in \mathbb{N}$ for all $t \in \mathbb{R}^+$;
(2) $N_t \le N_s$ for every $t \le s$;
(3) for fixed $k \in \mathbb{N}$ and $t \ge 0$, as $h \to 0$,

$$\mathbb{P}(N_{t+h} = k + 1 \mid N_t = k) = f_k\, g(t)\, h + o(h).$$

Aging processes are time-rescaled versions of the corresponding stationary process defined by the same sequence $(f_k)_{k \in \mathbb{N}}$. In particular, for any $t \ge 0$, $N_t$ has the same distribution as $V_{G(t)}$, where $G(t) = \int_0^t g(s)\, ds$. In general, we assume that the aging function is integrable, which means that $G(\infty) := \int_0^\infty g(s)\, ds < \infty$. This implies that the number of children of a single individual in its entire lifetime has distribution $V_{G(\infty)}$, which is finite in expectation. In terms of citation networks, this assumption is reasonable since we do not expect papers to receive an infinite number of citations ever (recall Figure 5). Instead, for the stationary process $(V_t)_{t \ge 0}$ in Definition 3.13, we have that $\mathbb{P}$-a.s. $V_t \to \infty$, so that also the aging process diverges $\mathbb{P}$-a.s. when $G(\infty) = \infty$.

Figure 8: Examples of stationary and aging limit degree distributions (two panels; curves: stationary process, aging process).

For aging processes, the main result is the following theorem, proven in Section 4. In its statement, we rely on the Laplace transform of a function. For a precise definition of this notion, we refer to Section 3:

Theorem 2.2 (Limiting distribution for aging branching processes). Consider an integrable aging function and a PA sequence $(f_k)_{k \in \mathbb{N}}$. Denote the corresponding aging birth process by $(N_t)_{t \ge 0}$. Then, assuming that $(N_t)_{t \ge 0}$ is supercritical and Malthusian, the limiting degree distribution of the branching process $\mathbf{N}$ defined by the birth process $(N_t)_{t \ge 0}$ is given by

$$p_k = \frac{\alpha^*}{\alpha^* + f_k \hat{L}^g(k, \alpha^*)} \prod_{i=0}^{k-1} \frac{f_i \hat{L}^g(i, \alpha^*)}{\alpha^* + f_i \hat{L}^g(i, \alpha^*)}, \tag{2.3}$$

where $\alpha^*$ is the Malthusian parameter of $\mathbf{N}$. Here, the sequence of coefficients $(\hat{L}^g(k, \alpha^*))_{k \in \mathbb{N}}$ appearing in (2.3) is given by

$$\hat{L}^g(k, \alpha^*) = \frac{\mathcal{L}(\mathbb{P}(N_\cdot = k)\, g(\cdot))(\alpha^*)}{\mathcal{L}(\mathbb{P}(N_\cdot = k))(\alpha^*)}, \tag{2.4}$$

where, for $h : \mathbb{R}^+ \to \mathbb{R}$, $\mathcal{L}(h(\cdot))(\alpha)$ denotes the Laplace transform of $h$. Further, considering a fixed individual in the branching population, the total number of children in its entire lifetime is distributed as $V_{G(\infty)}$, where $G(\infty)$ is the $L^1$-norm of $g$.

The limiting degree distribution maintains a product structure as in the stationary case (see (2.2) for comparison). Unfortunately, the analytic expression for the probability distribution $(p_k)_{k \in \mathbb{N}}$ in (2.3) given by the previous theorem is not explicit.
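As a numerical illustration of the stationary formula (2.2) and of the Malthusian condition $\mathbb{E}[V_{T_{\alpha^*}}] = 1$, the sketch below treats affine weights $f_k = ak + b$. It relies on the identity $\mathbb{E}[V_{T_\alpha}] = \sum_{k \ge 1} \prod_{i=0}^{k-1} f_i/(\alpha + f_i)$, which follows from the exponential waiting times between jumps; for affine weights this sum equals $b/(\alpha - a)$ when $\alpha > a$, so that $\alpha^* = a + b$. The truncation level, the bisection bounds, the parameter values and the function names are assumptions made for the example.

```python
import math

def expected_children(alpha, f, kmax=5000):
    """Truncated sum for E[V_{T_alpha}] = sum_{k>=1} prod_{i<k} f_i / (alpha + f_i)."""
    total, prod = 0.0, 1.0
    for k in range(kmax):
        prod *= f(k) / (alpha + f(k))  # now equals P(V_{T_alpha} >= k + 1)
        total += prod
    return total

def malthusian_parameter(f, lo=1e-9, hi=1.0, iters=60):
    """Solve E[V_{T_alpha}] = 1 by bisection (the expectation decreases in alpha)."""
    while expected_children(hi, f) > 1.0:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_children(mid, f) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def limiting_degree_distribution(alpha, f, kmax):
    """Evaluate (2.2): p_k = alpha/(alpha + f_k) * prod_{i<k} f_i/(alpha + f_i)."""
    p, prod = [], 1.0
    for k in range(kmax):
        p.append(alpha / (alpha + f(k)) * prod)
        prod *= f(k) / (alpha + f(k))
    return p

# Hypothetical affine weights.
a, b = 1.0, 2.0
f = lambda k: a * k + b

alpha_star = malthusian_parameter(f)
p = limiting_degree_distribution(alpha_star, f, 5000)
slope = (math.log(p[4000]) - math.log(p[1000])) / (math.log(4000) - math.log(1000))

print(alpha_star, sum(p), -slope)
```

For these weights the estimated log-log slope is close to $-(1 + \alpha^*/a)$, the classical power-law exponent of the affine preferential attachment tree. The aging formula (2.3) has the same product structure, but its coefficients $\hat{L}^g(k, \alpha^*)$ are not available in closed form, so this sketch does not cover it.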
In the stationary case, the form reduces to the simple expression in (2.2).

In general, the asymptotics of the coefficients $(\hat{L}^g(k, \alpha^*))_{k \in \mathbb{N}}$ is unclear, since it depends both on the aging function $g$ and on the PA weight sequence $(f_k)_{k \in \mathbb{N}}$ itself in an intricate way. In particular, we have no explicit expression for the ratio in (2.4), except in special cases. In this type of birth process, the cumulative advantage given by $(f_k)_{k \in \mathbb{N}}$ and the aging effect given by $g$ cannot be separated from each other.

Numerical examples in Figure 8 show how aging destroys the power-law degree distribution. In each of the two plots, the limiting degree distribution of a stationary process with affine PA weights gives a power-law degree distribution, while the process with two different integrable aging functions does not. In the examples we have used $g(t) = e^{-\lambda t}$ and $g(t) = (1 + t)^{-\lambda}$ for some $\lambda > 1$, and we observe the insensitivity of the limiting degree distribution with respect to $g$. The distribution given by (2.3) can be seen as the limiting degree distribution of a CTBP defined by preferential attachment weights $(f_k \hat{L}^g(k, \alpha^*))_{k \in \mathbb{N}}$. This suggests that $f_k \hat{L}^g(k, \alpha^*)$ is not asymptotically linear in $k$. In Section A.2, we investigate the two examples in Figure 8, showing that the limiting degree distribution has exponential tails, a fact that we know in general just as an upper bound (see Lemma 5.3).

In order to apply the general CTBP result in Theorem 3.10 below, we need to prove that an aging process $(N_t)_{t \ge 0}$ is supercritical and Malthusian. We show in Section 4 that, for an integrable aging function $g$, the corresponding process is supercritical if and only if

$$\lim_{t \to \infty} \mathbb{E}\left[V_{G(t)}\right] = \mathbb{E}\left[V_{G(\infty)}\right] > 1. \tag{2.5}$$

Condition (2.5) heuristically suggests that the process $(N_t)_{t \ge 0}$ has a Malthusian parameter if and only if the expected number of children in the entire lifetime of a fixed individual is larger than one, which seems quite reasonable. In particular, such a result follows from the fact that if $g$ is integrable, then the Laplace transform is always finite for every $\alpha > 0$. In other words, since $N_{T_{\alpha^*}}$ has the same distribution as $V_{G(T_{\alpha^*})}$, $\mathbb{E}[N_{T_{\alpha^*}}]$ is always bounded by $\mathbb{E}[V_{G(\infty)}]$. This implies that $G(\infty)$ cannot be too small, as otherwise the Malthusian parameter would not exist, and the CTBP would die out $\mathbb{P}$-a.s.

The aging effect obviously slows down the birth process, and makes the limiting degree distribution have exponential tails for affine preferential attachment weights. One may wonder whether the power-law degree distribution could be restored when $(f_k)_{k \in \mathbb{N}}$ grows super-linearly instead. Here, we say that a sequence of weights $(f_k)_{k \in \mathbb{N}}$ grows super-linearly when $\sum_{k \ge 1} 1/f_k < \infty$ (see Definition 3.16). In the super-linear case, however, the branching process is explosive, i.e., for every individual the probability of generating an infinite number of children in finite time is 1. In this situation, the Malthusian parameter does not exist, since the Laplace transform of the process is always infinite. One could ask whether, by using an integrable aging function, this explosive behavior is destroyed. The answer to this question is given by the following theorem:

Theorem 2.3 (Explosive aging branching processes for super-linear attachment weights). Consider a stationary process $(V_t)_{t \ge 0}$ defined by super-linear PA weights $(f_k)_{k \in \mathbb{N}}$. For any aging function $g$, the corresponding non-stationary process $(N_t)_{t \ge 0}$ is explosive.

The proof of Theorem 2.3 is rather simple, and is given in Section 4.2. We investigate the case of affine PA weights $f_k = ak + b$ in more detail in Section 5.1. Under a hypothesis on the regularity of the integrable aging function, in Proposition 5.2, we give the asymptotic behavior of the corresponding limiting degree distribution. In particular, as $k \to \infty$,

$$p_k = C_1 \frac{\Gamma(k + b/a)}{\Gamma(k + 1)}\, e^{-C_2 k}\, \mathcal{G}(k, g)\,(1 + o(1)),$$

for some positive constants $C_1$, $C_2$. The term $\mathcal{G}(k, g)$ is a function of $k$, the aging function $g$ and its derivative.
The precise behavior of this term depends crucially on the aging function. Apart from this, we notice that aging generates an exponential term in the distribution, which explains the two examples in Figure 8. In Section A.2, we prove that the two limiting degree distributions in Figure 8 indeed have exponential tails.

The analysis of birth processes becomes harder when we also consider fitness. First of all, we define the birth process with aging and fitness as follows:

Definition 2.4 (Aging birth process with fitness). Consider a birth process $(V_t)_{t \geq 0}$. Let $g : \mathbb{R}^+ \to \mathbb{R}^+$ be an aging function, and $Y$ a positive random variable. The process $M_t := V_{Y G(t)}$ is called a birth process with aging and fitness.

Definition 2.4 implies that the infinitesimal jump rates of the process $(M_t)_{t \geq 0}$ are as in (2.1), so that the birth probabilities of an individual depend on the PA weights, the age of the individual and on its fitness. Assuming that the process $(M_t)_{t \geq 0}$ is supercritical and Malthusian, we can prove the following theorem:

Theorem 2.5 (Limiting degree distribution for aging and fitness). Consider a process $(M_t)_{t \geq 0}$ with integrable aging function $g$, fitnesses that are i.i.d. across the population, and assume that it is supercritical and Malthusian with Malthusian parameter $\alpha^*$. Then, the limiting degree distribution for the corresponding branching process is given by
$$p_k = \mathbb{E}\left[ \frac{\alpha^*}{\alpha^* + f_k Y \hat{\mathcal{L}}(k, \alpha^*, Y)} \prod_{i=0}^{k-1} \frac{f_i Y \hat{\mathcal{L}}(i, \alpha^*, Y)}{\alpha^* + f_i Y \hat{\mathcal{L}}(i, \alpha^*, Y)} \right].$$
For a fixed individual, the distribution $(q_k)_{k \in \mathbb{N}}$ of the number of children it generates over its entire lifetime is given by
$$q_k = \mathbb{P}\left(V_{Y G(\infty)} = k\right).$$

Similarly to Theorem 2.2, the sequence $(\hat{\mathcal{L}}(k, \alpha^*, Y))_{k \in \mathbb{N}}$ is given by
$$\hat{\mathcal{L}}(k, \alpha^*, Y) = \left( \frac{\mathcal{L}\left(\mathbb{P}\left(V_{uG(\cdot)} = k\right) g(\cdot)\right)(\alpha^*)}{\mathcal{L}\left(\mathbb{P}\left(V_{uG(\cdot)} = k\right)\right)(\alpha^*)} \right)_{u = Y},$$
where again $\mathcal{L}(h(\cdot))(\alpha)$ denotes the Laplace transform of a function $h$. Notice that in this case, with the presence of the fitness $Y$, this sequence is no longer deterministic but random instead.
We still have the product structure for $(p_k)_{k \in \mathbb{N}}$ as in the stationary case, but now we have to average over the fitness distribution.

We point out that Theorem 2.2 is a particular case of Theorem 2.5, when we consider $Y \equiv 1$.

For affine PA weights $f_k = ak + b$, the limiting degree distribution satisfies, as $k \to \infty$,
$$p_k = \frac{\Gamma(k + b/a)}{\Gamma(b/a)\Gamma(k + 1)} \, \frac{2\pi}{\sqrt{\det(k H_k(t_k, s_k))}} \, e^{-k \Psi_k(t_k, s_k)} \, \mathbb{P}\left(N_1 \geq -t_k, N_2 \geq -s_k\right) (1 + o(1)),$$
where the function $\Psi_k(t, s)$ depends on the aging function, the density $\mu$ of the fitness and $k$. The point $(t_k, s_k)$ is the absolute minimum of $\Psi_k(t, s)$, $H_k(t, s)$ is the Hessian matrix of $\Psi_k(t, s)$, and $(N_1, N_2)$ is a bivariate normal vector with covariance matrix related to $H_k(t, s)$. We do not know the necessary and sufficient conditions for the existence of such a minimum $(t_k, s_k)$. However, in Section 5.3, we consider two examples where we can apply this result, and we show that it is possible to obtain power laws for them.

In the case of aging and fitness, the supercriticality condition in (2.5) is replaced by the analogous condition that
$$\mathbb{E}\left[V_{Y G(t)}\right] < \infty \ \text{ for every } t \geq 0 \quad \text{and} \quad \lim_{t \to \infty} \mathbb{E}\left[V_{Y G(t)}\right] > 1. \tag{2.6}$$
Borgs et al. [6] and Dereich [9], [10] prove results on stationary CTBPs with fitness. In these works, the authors investigate models with affine dependence on the degree and bounded fitness distributions. This is necessary since unbounded distributions with affine weights are explosive and thus do not have a Malthusian parameter. We refer to Section 3.3 for a more precise discussion of the conditions on fitness distributions.

In the case of integrable aging and fitness, it is possible to consider affine PA weights, even with unbounded fitness distributions, as exemplified by (2.6). In particular, for $f_k = ak + b$,
$$\mathbb{E}[V_t] = \frac{b}{a}\left(e^{at} - 1\right).$$
As a consequence, since $\mathbb{E}[V_{YG(t)}] = \frac{b}{a}\left(\mathbb{E}[e^{aYG(t)}] - 1\right)$, Condition (2.6) can be written as
$$\forall t \geq 0 \quad \mathbb{E}\left[e^{aYG(t)}\right] < \infty \quad \text{and} \quad \lim_{t \to \infty} \mathbb{E}\left[e^{aYG(t)}\right] > \frac{a + b}{b}. \tag{2.7}$$
The expected value $\mathbb{E}\left[e^{aYG(t)}\right]$ is the moment generating function of $Y$ evaluated at $aG(t)$. In particular, a necessary condition to have a Malthusian parameter is that the moment generating function is finite on the interval $[0, aG(\infty))$. As a consequence, denoting $\mathbb{E}[e^{sY}]$ by $\varphi_Y(s)$, we have effectively moved from the condition of having bounded distributions to the condition
$$\varphi_Y(x) < +\infty \ \text{ on } [0, aG(\infty)), \quad \text{and} \quad \lim_{x \to aG(\infty)} \varphi_Y(x) > \frac{a + b}{b}. \tag{2.8}$$
Condition (2.8) is weaker than assuming a bounded distribution for the fitness $Y$, which means we can consider a larger class of distributions for the aging and fitness birth processes. Particularly for citation networks, it seems reasonable to have unbounded fitnesses, as the relative popularity of papers varies substantially.

In Section 5.3 we introduce three different classes of fitness distributions, for which we give the asymptotics for the limiting degree distribution of the corresponding CTBP.

The first class is called heavy-tailed. Recalling (2.8), any distribution $Y$ in this class satisfies, for any $t > 0$,
$$\varphi_Y(t) = \mathbb{E}\left[e^{tY}\right] = +\infty. \tag{2.9}$$
These distributions have a tail that is thicker than exponential. For instance, power-law distributions belong to this first class. Similarly to unbounded distributions in the stationary regime, such distributions generate explosive birth processes, independent of the choice of the integrable aging function.

The second class is called sub-exponential. The density $\mu$ of a distribution $Y$ in this class satisfies
$$\forall \beta > 0, \quad \lim_{s \to +\infty} \mu(s) e^{\beta s} = 0. \tag{2.10}$$
An example of this class is the density $\mu(s) = C e^{-\theta s^{1 + \varepsilon}}$, for some $\varepsilon, C, \theta > 0$. For such a density, we show in Proposition 5.7 that the corresponding limiting degree distribution has a thinner tail than a power law.

The third class is called general-exponential. The density $\mu$ of a distribution $Y$ in this class is of the form
$$\mu(s) = C h(s) e^{-\theta s}, \tag{2.11}$$
where $h(s)$ is a twice differentiable function such that $h'(s)/h(s) \to 0$ and $h''(s)/h(s) \to 0$ as $s \to \infty$, and $C$ is a normalization constant. For instance, exponential and Gamma distributions belong to this class. From (2.8), we know that in order to obtain a non-explosive process, it is necessary to consider the exponential rate $\theta > aG(\infty)$. We will see that the limiting degree distribution obeys a power law when $\theta > aG(\infty)$, with tails becoming thinner when $\theta$ increases.

For a distribution in the general-exponential class, as proven in Proposition 5.6, the limiting degree distribution of the corresponding CTBP has a power-law term, with slowly-varying corrections given by the aging function $g$ and the function $h$. We do not state Propositions 5.6 and 5.7 here, as these need notation and results from Section 5.1. For this reason, we only state the result for the special case of purely exponential fitness distribution:

Corollary 2.6 (Exponential fitness distribution). Let the fitness $Y$ be exponentially distributed with parameter $\theta$, and let $g$ be an integrable aging function. Assume that the corresponding birth process $(M_t)_{t \geq 0}$ is supercritical and Malthusian. Then, the limiting degree distribution $(p_k)_{k \in \mathbb{N}}$ of the corresponding CTBP $\boldsymbol{M}$ is
$$p_k = \mathbb{E}\left[ \frac{\theta}{\theta + f_k G(T_{\alpha^*})} \prod_{i=0}^{k-1} \frac{f_i G(T_{\alpha^*})}{\theta + f_i G(T_{\alpha^*})} \right].$$
The distribution $(q_k)_{k \in \mathbb{N}}$ of the number of children of a fixed individual in its entire lifetime is given by
$$q_k = \frac{\theta}{\theta + G(\infty) f_k} \prod_{i=0}^{k-1} \frac{G(\infty) f_i}{\theta + G(\infty) f_i}.$$
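The lifetime distribution $(q_k)$ of Corollary 2.6 can be explored numerically. The following is a minimal sketch, assuming affine weights $f_k = ak + b$ and treating $G(\infty)$ as a given number; the function name `lifetime_degree_pmf` and the parameter values are illustrative choices, not taken from the paper. It uses only the product formula for $q_k$, and checks that the partial sums telescope, i.e. $\sum_{k \le K} q_k = 1 - \prod_{i \le K} \frac{G(\infty) f_i}{\theta + G(\infty) f_i}$.

```python
import math


def lifetime_degree_pmf(theta, g_inf, a, b, k_max):
    """Return (q_0..q_kmax, remaining product) for affine weights f_k = a*k + b.

    q_k = theta / (theta + G*f_k) * prod_{i<k} G*f_i / (theta + G*f_i),
    where G stands for G(infinity), passed in as the number g_inf.
    """
    pmf = []
    prod = 1.0  # running product over i < k
    for k in range(k_max + 1):
        gf = g_inf * (a * k + b)
        pmf.append(prod * theta / (theta + gf))
        prod *= gf / (theta + gf)
    # After the loop, prod is the product over i <= k_max.
    return pmf, prod


# Illustrative (hypothetical) parameters with theta > a * G(infinity).
pmf, remaining = lifetime_degree_pmf(theta=2.0, g_inf=1.0, a=1.0, b=1.0, k_max=2000)
print("partial sum + remaining product:", sum(pmf) + remaining)

# Crude tail-exponent estimate from the ratio q_k / q_{2k}.
estimate = math.log(pmf[1000] / pmf[2000], 2)
print("estimated exponent:", estimate, "vs 1 + theta/(a*G):", 1 + 2.0 / 1.0)
```

The ratio estimate is only a rough diagnostic; it approaches $1 + \theta/(aG(\infty))$ slowly as $k$ grows.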
Using exponential fitness makes the computation of the Laplace transform and the limiting degree distribution easier. We refer to Section 5.4 for the precise proof. In particular, the sequence defined in Corollary 2.6 is very similar to the limiting degree distribution of a stationary process with a bounded fitness. Let $(\xi^Y_t)_{t \geq 0}$ be a birth process with PA weights $(f_k)_{k \in \mathbb{N}}$ and fitness $Y$ with bounded support. As proved in [10, Corollary 2.8], and as we show in Section 3.3, the limiting degree distribution of the corresponding branching process, assuming that $(\xi^Y_t)_{t \geq 0}$ is supercritical and Malthusian, has the form
$$p_k = \mathbb{E}\left[ \frac{\alpha^*}{\alpha^* + Y f_k} \prod_{i=0}^{k-1} \frac{Y f_i}{\alpha^* + Y f_i} \right] = \mathbb{P}\left(\xi^Y_{T_{\alpha^*}} = k\right).$$
We notice the similarities with the limiting degree sequence given by Corollary 2.6. When $g$ is integrable, the random variable $G(T_{\alpha^*})$ has bounded support. In particular, we can rewrite the sequence of Corollary 2.6 as
$$p_k = \mathbb{P}\left(\xi^{G(T_{\alpha^*})}_{T_\theta} = k\right).$$
As a consequence, the limiting degree distribution of the process $(M_t)_{t \geq 0}$ equals that of a stationary process with fitness $G(T_{\alpha^*})$ and Malthusian parameter $\theta$.

In the case where $Y$ has exponential distribution and the PA weights are affine, we can also investigate the occurrence of dynamical power laws. In fact, with $(M_t)_{t \geq 0}$ such a process, the exponential distribution of $Y$ leads to
$$P_k[M](t) = \mathbb{P}(M_t = k) = \frac{\theta}{\theta + f_k G(t)} \prod_{i=0}^{k-1} \frac{f_i G(t)}{\theta + f_i G(t)} = \frac{\theta}{aG(t)} \, \frac{\Gamma((b + \theta)/(aG(t)))}{\Gamma(b/(aG(t)))} \, \frac{\Gamma(k + b/(aG(t)))}{\Gamma(k + b/(aG(t)) + 1 + \theta/(aG(t)))}. \tag{2.12}$$
Here, $M_t$ describes the number of children of an individual of age $t$. In other words, $(\mathbb{P}(M_t = k))_{k \in \mathbb{N}}$ is a distribution such that, as $k \to \infty$,
$$P_k[M](t) = \mathbb{P}(M_t = k) = k^{-(1 + \theta/(aG(t)))} (1 + o(1)).$$
This means that for every time $t \geq 0$, the random variable $M_t$ has a power-law distribution with exponent $\tau(t) = 1 + \theta/(aG(t)) > 2$. In particular, for every $t \geq 0$, $M_t$ has finite expectation. We call this a dynamical power law. This occurs not only in the case of pure exponential fitness, but in general for every distribution as in (2.11), as shown in Proposition 5.6 below.

Figure 9: Degree distribution for simulated processes with fitness, aging and affine weights (both panels are titled "Dynamical power-law"). We considered $a = 1$, $b = 3$ [...], $\theta = 2$ [...].

Further, we see that when $t \to \infty$, the dynamical power-law exponent coincides with the power-law exponent of the entire population. Indeed, the limiting degree distribution equals
$$p_k = \mathbb{E}\left[ \frac{\theta}{aG(T_{\alpha^*})} \, \frac{\Gamma(\theta/(aG(T_{\alpha^*})) + b/(aG(T_{\alpha^*})))}{\Gamma(b/(aG(T_{\alpha^*})))} \, \frac{\Gamma(k + b/(aG(T_{\alpha^*})))}{\Gamma(k + b/(aG(T_{\alpha^*})) + 1 + \theta/(aG(T_{\alpha^*})))} \right]. \tag{2.13}$$
In Figure 9, we show a numerical example of the dynamical power law for a process with exponential fitness distribution and affine weights. When time increases, the power-law exponent monotonically decreases to the limiting exponent $\tau \equiv \tau(\infty) > 2$, which means that the limiting distribution still has finite first moment. Note the similarity to the case of citation networks in Figure 3.

When $t \to \infty$, the power-law exponent converges, and also $M_t$ converges in distribution to a limiting random distribution $M_\infty$ given by
$$q_k = \mathbb{P}(M_\infty = k) = \frac{\theta}{aG(\infty)} \, \frac{\Gamma((b + \theta)/(aG(\infty)))}{\Gamma(b/(aG(\infty)))} \, \frac{\Gamma(k + b/(aG(\infty)))}{\Gamma(k + b/(aG(\infty)) + 1 + \theta/(aG(\infty)))}. \tag{2.14}$$
$M_\infty$ has a power-law distribution, where the power-law exponent is
$$\tau = \lim_{t \to \infty} \tau(t) = 1 + \theta/(aG(\infty)) > 2.$$

Figure 10: Example of limiting degree distribution for branching processes (curves: stationary process, aging process, aging and fitness process).

In particular, since $\tau > 2$, a fixed individual has finite expected number of children also in its entire lifetime, unlike the stationary case with affine weights. In terms of citation networks, this type of process predicts that papers do not receive an infinite number of citations after they are published (recall Figure 5).

Figure 8 shows the effect of aging on the stationary process with affine weights, where the power law is lost due to the aging effect. Thus, aging slows down the stationary process, and it is not possible to create the amount of high-degree vertices that are present in power-law distributions. Fitness can speed up the aging process to gain high-degree vertices, so that the power-law distribution is restored. This is shown in Figure 10, where aging is combined with exponential fitness for the same aging functions as in Figure 8.

In the stationary case, it is not possible to use unbounded distributions for the fitness to obtain a Malthusian process if the PA weights $(f_k)_{k \in \mathbb{N}}$ are affine. In fact, using unbounded distributions, the expected number of children at exponential time $T_\alpha$ is not finite for any $\alpha > 0$, i.e., the branching process is explosive. The aging effect allows us to relax the condition on the fitness, and the restriction to bounded distributions is relaxed to a condition on its moment generating function.
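To make the dynamical exponent $\tau(t) = 1 + \theta/(aG(t))$ and the moment-generating-function condition concrete, here is a small sketch for exponential fitness. It assumes the hypothetical aging function $g(s) = e^{-\lambda s}$, so that $G(t) = (1 - e^{-\lambda t})/\lambda$ and $G(\infty) = 1/\lambda$; this choice of $g$, the function names and the parameter values are illustrative assumptions, not taken from the paper. The expected lifetime offspring is computed from $\mathbb{E}[V_t] = \frac{b}{a}(e^{at} - 1)$ and the moment generating function $\varphi_Y(x) = \theta/(\theta - x)$ of an exponential random variable.

```python
import math


def integrated_aging(t, lam):
    """G(t) for the hypothetical aging function g(s) = exp(-lam * s)."""
    return (1.0 - math.exp(-lam * t)) / lam


def dynamic_exponent(t, theta, a, lam):
    """tau(t) = 1 + theta / (a * G(t)), defined for t > 0."""
    return 1.0 + theta / (a * integrated_aging(t, lam))


def expected_lifetime_children(theta, a, b, lam):
    """E[V_{Y G(inf)}] = (b/a) * (phi_Y(a * G(inf)) - 1) for Y ~ Exp(theta).

    Returns infinity when theta <= a * G(inf), where the moment generating
    function phi_Y(x) = theta / (theta - x) is not finite.
    """
    x = a / lam  # a * G(infinity)
    if theta <= x:
        return math.inf
    return (b / a) * (theta / (theta - x) - 1.0)


theta, a, b, lam = 2.0, 1.0, 3.0, 1.0  # illustrative values only
for t in (0.5, 1.0, 2.0, 5.0, 50.0):
    print(t, dynamic_exponent(t, theta, a, lam))
print("limit exponent:", 1.0 + theta * lam / a)
print("expected lifetime children:", expected_lifetime_children(theta, a, b, lam))
```

In this sketch the process is supercritical when the expected lifetime offspring exceeds one, which matches the second part of the condition on $\varphi_Y$.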

In this paper, we only consider the tree setting, which is clearly unrealistic for citation networks. However, the analysis of PAMs has shown that the qualitative features of the degree distribution for PAMs are identical to those in the tree setting. Proving this remains an open problem that we hope to address hereafter. Should this indeed be the case, then we could summarize our findings in the following simple way: the power-law tail distribution of PAMs is destroyed by integrable aging, and cannot be restored either by super-linear weights or by adding bounded fitnesses. However, it is restored by unbounded fitnesses with at most an exponential tail. Part of these results are example based, while we have general results proving that the limiting degree distribution exists.

Structure of the paper. The present paper is organized as follows. In Section 3, we quote general results on CTBPs, in particular Theorem 3.10 that we use throughout our proofs. In Section 3.2, we describe known properties of the stationary regime. In Section 3.3, we briefly discuss the Malthusian parameter, focusing on conditions on fitness distributions to obtain supercritical processes. In Section 4, we prove Theorems 2.3 and 2.5, and we show how Theorem 2.2 is a particular case of Theorem 2.5. In Section 5 we specialize to the case of affine PA function, giving precise asymptotics.

General theory of Continuous-Time Branching Processes

In this section we present the general theory of continuous-time branching processes (CTBPs). In such models, individuals produce children according to i.i.d. copies of the same birth process. We now define birth processes in terms of point processes:

Definition 3.1 (Point process). A point process $\xi$ is a random variable from a probability space $(\Omega, \mathcal{A}, \mathbb{P})$ to the space of integer-valued measures on $\mathbb{R}^+$.

A point process $\xi$ is defined by a sequence of positive real-valued random variables $(T_k)_{k \in \mathbb{N}}$. With abuse of notation, we can denote the density of the point process $\xi$ by
$$\xi(dt) = \sum_{k \in \mathbb{N}} \delta_{T_k}(dt),$$
where $\delta_x(dt)$ is the delta measure in $x$, and the random measure $\xi$ evaluated on $[0, t]$ as
$$\xi(t) = \xi([0, t]) = \sum_{k \in \mathbb{N}} \mathbb{1}_{[0, t]}(T_k).$$
We suppose throughout the paper that $T_k < T_{k+1}$ with probability 1 for every $k \in \mathbb{N}$.

Remark 3.2. Equivalently, considering a sequence $(T_k)_{k \in \mathbb{N}}$ (where $T_0 = 0$) of positive real-valued random variables, such that $T_k < T_{k+1}$ with probability 1, we can define $\xi(t) = \xi([0, t]) = k$ when $t \in [T_k, T_{k+1})$.

We will often define a point process from the jump-times sequence of an integer-valued process $(V_t)_{t \geq 0}$. For instance, consider $(V_t)_{t \geq 0}$ as a Poisson process, and denote $T_k = \inf\{t > 0 : V_t \geq k\}$. Then we can use the sequence $(T_k)_{k \in \mathbb{N}}$ to define a point process $\xi$. The point process defined from the jump times of a process $(V_t)_{t \geq 0}$ will be denoted by $\xi_V$.

We now introduce some notation before giving the definition of CTBP. We denote the set of individuals in the population using Ulam-Harris notation for trees. The set of individuals is
$$\mathcal{N} = \bigcup_{n \in \mathbb{N}} \mathbb{N}^n.$$
For $x \in \mathbb{N}^n$ and $k \in \mathbb{N}$ we denote the $k$-th child of $x$ by $xk \in \mathbb{N}^{n+1}$. This construction is well known, and has been used in other works on branching processes (see [15], [18], [23] for more details).

We are now ready to define our branching process:

Definition 3.3 (Continuous-time branching process). Given a point process $\xi$, we define the CTBP associated to $\xi$ as the pair of a probability space
$$(\Omega, \mathcal{A}, \mathbb{P}) = \prod_{x \in \mathcal{N}} (\Omega_x, \mathcal{A}_x, \mathbb{P}_x),$$
and an infinite set $(\xi^x)_{x \in \mathcal{N}}$ of i.i.d. copies of the process $\xi$. We will denote the branching process by $\boldsymbol{\xi}$.

Remark 3.4 (Point processes and their jump times). Throughout the paper, we will define point processes in terms of jump times of processes $(V_t)_{t \geq 0}$. In order to keep the notation light, we will denote branching processes defined by point processes given by jump times of the process $V_t$ by $\boldsymbol{V}$. To make it more clear, by $\boldsymbol{V}$ we denote a probability space as in Definition 3.3 and an infinite set of measures $(\xi^x_V)_{x \in \mathcal{N}}$, where $\xi_V$ is the point process defined by the process $V$.

According to Definition 3.3, a branching process is a pair of a probability space and a sequence of random measures.
It is possible though to define an evolution of the branching population. At time $t = 0$, our population consists only of the root, denoted by $\varnothing$. Every time $t$ an individual $x$ gives birth to its $k$-th child, i.e., $\xi^x(t) = k + 1$, assuming that $\xi^x(t-) = k$, we start the process $\xi^{xk}$. Formally:

Definition 3.5 (Population birth times). We define the sequence of birth times for the process $\boldsymbol{\xi}$ as $\tau^{\xi}_{\varnothing} = 0$, and for $x \in \mathcal{N}$,
$$\tau^{\xi}_{xk} = \tau^{\xi}_x + \inf\{s \geq 0 : \xi^x(s) \geq k\}.$$

In this way we have defined the set of individuals, their birth times and the processes according to which they reproduce. We still need a way to count how many individuals are alive at a certain time $t$.

Definition 3.6 (Random characteristic). A random characteristic is a real-valued process $\Phi : \Omega \times \mathbb{R} \to \mathbb{R}$ such that $\Phi(\omega, s) = 0$ for any $s < 0$, and $\Phi(\omega, s) = \Phi(s)$ is a deterministic bounded function for every $s \geq 0$ that only depends on $\omega$ through the birth time of the individual, as well as the birth process of its children.

An important example of a random characteristic is obtained by the function $\mathbb{1}_{\mathbb{R}^+}(s)$, which measures whether the individual has been born at time $s$. Another example is $\mathbb{1}_{\mathbb{R}^+}(s) \mathbb{1}_{\{k\}}(\xi)$, which measures whether the individual has been born or not at time $s$ and whether it has $k$ children presently.

For each individual $x \in \mathcal{N}$, $\Phi_x(\omega, s)$ denotes the value of $\Phi$ evaluated on the progeny of $x$, regarding $x$ as ancestor, when the age of $x$ is $s$. In other words, $\Phi_x(\omega, s)$ is the evaluation of $\Phi$ on the tree rooted at $x$, ignoring the rest of the population. If we do not specify the individual $x$, then we assume that $\Phi = \Phi_{\varnothing}$. We use random characteristics to describe the properties of the branching population.

Definition 3.7 (Evaluated branching processes). Consider a random characteristic $\Phi$ as in Definition 3.6. We define the evaluated branching process with respect to $\Phi$ at time $t \in \mathbb{R}^+$ as
$$\boldsymbol{\xi}^{\Phi}_t = \sum_{x \in \mathcal{N}} \Phi_x(t - \tau^{\xi}_x).$$
The meaning of the evaluated branching process is clear when we consider the random characteristic $\Phi(t) = \mathbb{1}_{\mathbb{R}^+}(t)$, for which
$$\boldsymbol{\xi}^{\mathbb{1}_{\mathbb{R}^+}}_t = \sum_{x \in \mathcal{N}} (\mathbb{1}_{\mathbb{R}^+})_x(t - \tau^{\xi}_x),$$
which is the number of $x \in \mathcal{N}$ such that $t - \tau^{\xi}_x \geq 0$, i.e., the total number of individuals already born up to time $t$. Another characteristic that we consider in this paper is, for $k \in \mathbb{N}$, $\Phi_k(t) = \mathbb{1}_{\{k\}}(\xi_t)$, for which
$$\boldsymbol{\xi}^{\Phi_k}_t = \sum_{x \in \mathcal{N}} \mathbb{1}_{\{k\}}\left(\xi^x_{t - \tau^{\xi}_x}\right)$$
is the number of individuals with $k$ children at time $t$.

As known from the literature, the properties of the branching process are determined by the behavior of the point process $\xi$. First of all, we need to introduce some notation. Consider a function $f : \mathbb{R}^+ \to \mathbb{R}$. We denote the Laplace transform of $f$ by
$$\mathcal{L}(f(\cdot))(\alpha) = \int_0^\infty e^{-\alpha t} f(t) \, dt.$$
With a slight abuse of notation, if $\mu$ is a positive measure on $\mathbb{R}^+$, then we denote
$$\mathcal{L}(\mu(d\cdot))(\alpha) = \int_0^\infty e^{-\alpha t} \mu(dt).$$
We use the Laplace transform to analyze the point process $\xi$:

Definition 3.8 (Supercritical property). Consider a point process $\xi$ on $\mathbb{R}^+$. We say $\xi$ is supercritical when there exists $\alpha^* > 0$ such that
$$\mathcal{L}(\mathbb{E}\xi(d\cdot))(\alpha^*) = \int_0^\infty e^{-\alpha^* t} \, \mathbb{E}\xi(dt) = \sum_{k \in \mathbb{N}} \mathbb{E}\left[ \int_0^\infty e^{-\alpha^* t} \delta_{T_k}(dt) \right] = \sum_{k \in \mathbb{N}} \mathbb{E}\left[ e^{-\alpha^* T_k} \right] = 1.$$
We call $\alpha^*$ the Malthusian parameter of the process $\xi$.

We point out that $\mathbb{E}\xi(d\cdot)$ is an abuse of notation to denote the density of the averaged measure $\mathbb{E}[\xi([0, t])]$. A second fundamental property for the analysis of branching processes is the following:

Definition 3.9 (Malthusian property). Consider a supercritical point process $\xi$, with Malthusian parameter $\alpha^*$. The process $\xi$ is Malthusian when
$$-\frac{d}{d\alpha} \left( \mathcal{L}(\mathbb{E}\xi(d\cdot)) \right)(\alpha) \Big|_{\alpha^*} = \int_0^\infty t e^{-\alpha^* t} \, \mathbb{E}\xi(dt) = \sum_{k \in \mathbb{N}} \mathbb{E}\left[ T_k e^{-\alpha^* T_k} \right] < \infty.$$
We denote
$$\tilde{\alpha} = \inf\{\alpha > 0 : \mathcal{L}(\mathbb{E}\xi(d\cdot))(\alpha) < \infty\}, \tag{3.1}$$
and we will also assume that the process satisfies the condition
$$\lim_{\alpha \searrow \tilde{\alpha}} \mathcal{L}(\mathbb{E}\xi(d\cdot))(\alpha) > 1. \tag{3.2}$$
Integrating by parts, it is possible to show that, for a point process $\xi_V$,
$$\mathcal{L}(\mathbb{E}\xi_V(d\cdot))(\alpha) = \mathbb{E}[V_{T_\alpha}],$$
where $T_\alpha$ is an exponentially distributed random variable with parameter $\alpha$, independent of the process $(V_t)_{t \geq 0}$. Heuristically, the Laplace transform of a point process $\xi_V$ is the expected number of children born at exponentially distributed time $T_\alpha$. In this case the Malthusian parameter is the exponential rate $\alpha^*$ such that at time $T_{\alpha^*}$, on average, exactly one child has been born.

These two conditions are required to prove the main result on branching processes that we rely upon:

Theorem 3.10 (Population exponential growth). Consider the point process $\xi$, and the corresponding branching process $\boldsymbol{\xi}$. Assume that $\xi$ is supercritical and Malthusian with parameter $\alpha^*$, and suppose that there exists $\bar{\alpha} < \alpha^*$ such that
$$\int_0^\infty e^{-\bar{\alpha} t} \, \mathbb{E}\xi(dt) < \infty.$$
Then
(1) there exists a random variable $\Theta$ such that as $t \to \infty$,
$$e^{-\alpha^* t} \boldsymbol{\xi}^{\mathbb{1}_{\mathbb{R}^+}}_t \xrightarrow{\ \mathbb{P}\text{-a.s.}\ } \Theta; \tag{3.3}$$
(2) for any two random characteristics $\Phi$ and $\Psi$,
$$\frac{\boldsymbol{\xi}^{\Phi}_t}{\boldsymbol{\xi}^{\Psi}_t} \xrightarrow{\ \mathbb{P}\text{-a.s.}\ } \frac{\mathcal{L}(\mathbb{E}[\Phi(\cdot)])(\alpha^*)}{\mathcal{L}(\mathbb{E}[\Psi(\cdot)])(\alpha^*)}. \tag{3.4}$$

This result is stated in [23, Theorem A], which is a weaker version of [18, Theorem 6.3]. Formula (3.3) implies that, $\mathbb{P}$-a.s., the population size grows exponentially with time. It is relevant though to give a description of the distribution of the random variable $\Theta$:

Theorem 3.11 (Positivity of $\Theta$). Under the hypothesis of Theorem 3.10, if
$$\mathbb{E}\left[ \mathcal{L}(\xi(d\cdot))(\alpha^*) \log^+\left( \mathcal{L}(\xi(d\cdot))(\alpha^*) \right) \right] < \infty, \tag{3.5}$$
then, on the event $\{\boldsymbol{\xi}^{\mathbb{1}_{\mathbb{R}^+}}_t \to \infty\}$, i.e., on the event that the branching population keeps growing in time, the random variable $\Theta$ in (3.3) is positive with probability 1, and $\mathbb{E}[\Theta] = 1$. Otherwise, $\Theta = 0$ with probability 1. Condition (3.5) is called the (xlogx) condition.
This result is proven in [15, Theorem 5.3], and it is the CTBP equivalent of the Kesten-Stigum theorem for Galton-Watson processes ([17, Theorem 1.1]).

Formula (3.4) says that the ratio between the evaluation of the branching process with two different characteristics converges $\mathbb{P}$-a.s. to a constant that depends only on the two characteristics involved. In particular, if we consider, for $k \in \mathbb{N}$,
$$\Phi(t) = \mathbb{1}_{\{k\}}(\xi_t), \quad \text{and} \quad \Psi(t) = \mathbb{1}_{\mathbb{R}^+}(t),$$
then Theorem 3.10 gives
$$\frac{\boldsymbol{\xi}^{\Phi}_t}{\boldsymbol{\xi}^{\mathbb{1}_{\mathbb{R}^+}}_t} \xrightarrow{\ \mathbb{P}\text{-a.s.}\ } \alpha^* \mathcal{L}(\mathbb{P}(\xi(\cdot) = k))(\alpha^*), \tag{3.6}$$
since $\mathcal{L}(\mathbb{E}[\mathbb{1}_{\mathbb{R}^+}(\cdot)])(\alpha^*) = 1/\alpha^*$. The ratio in the previous formula is the fraction of individuals with $k$ children in the whole population:

Definition 3.12 (Limiting degree distribution for CTBP). The sequence $(p_k)_{k \in \mathbb{N}}$, where
$$p_k = \alpha^* \mathcal{L}(\mathbb{P}(\xi(\cdot) = k))(\alpha^*) = \alpha^* \int_0^\infty e^{-\alpha^* t} \, \mathbb{P}(\xi(t) = k) \, dt,$$
is the limiting degree distribution for the branching process $\boldsymbol{\xi}$.

The aim of the following sections will be to study when point processes satisfy the conditions of Theorem 3.10, in order to analyze the limiting degree distribution in Definition 3.12.
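As a sanity check of Definition 3.12, one can take the simplest birth process, a homogeneous Poisson process with rate $\lambda$ (the example mentioned after Remark 3.2). Reading the sum in Definition 3.8 over $k \geq 1$ (an assumption about the indexing), $\sum_{k \geq 1} (\lambda/(\alpha + \lambda))^k = \lambda/\alpha$, so $\alpha^* = \lambda$ and the integral in Definition 3.12 evaluates to the geometric law $p_k = 2^{-(k+1)}$. The sketch below evaluates the integral with a plain trapezoidal rule; the function names, the truncation point `t_max` and the step count are illustrative choices.

```python
import math


def poisson_pmf(k, rate, t):
    """P(xi(t) = k) for a homogeneous Poisson process with the given rate."""
    return math.exp(-rate * t) * (rate * t) ** k / math.factorial(k)


def limiting_degree(k, rate, alpha, t_max=40.0, steps=40000):
    """Trapezoidal approximation of alpha * int_0^inf exp(-alpha t) P(xi(t)=k) dt.

    The integral is truncated at t_max, where the integrand is negligible
    for the parameter values used below.
    """
    h = t_max / steps
    total = 0.0
    for j in range(steps + 1):
        t = j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * alpha * math.exp(-alpha * t) * poisson_pmf(k, rate, t)
    return total * h


rate = 1.0
alpha_star = rate  # Malthusian parameter of the constant-rate process
for k in range(5):
    print(k, limiting_degree(k, rate, alpha_star), 2.0 ** (-(k + 1)))
```

The exponential tail obtained here is consistent with the statement in Section 3.2 that a power-law tail requires asymptotically linear rates.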

In this section we present the theory of birth processes that are stationary and have deterministic rates. This is relevant since the definition of aging processes starts with a stationary process. In particular, we give a description of the affine case, which plays a central role in the present work:

Definition 3.13 (Stationary non-fitness birth processes). Consider a non-decreasing sequence $(f_k)_{k \in \mathbb{N}}$ of positive real numbers. A stationary non-fitness birth process is a stochastic process $(V_t)_{t \geq 0}$ such that
(1) $V_0 = 0$, and $V_t \in \mathbb{N}$ for all $t \in \mathbb{R}^+$;
(2) $V_t \leq V_s$ for every $t \leq s$;
(3) for $h$ small enough,
$$\mathbb{P}(V_{t+h} = k + 1 \mid V_t = k) = f_k h + o(h), \quad \text{and for } j \geq 2, \quad \mathbb{P}(V_{t+h} = k + j \mid V_t = k) = o(h). \tag{3.7}$$
We denote the jump times by $(T_k)_{k \in \mathbb{N}}$, i.e., $T_k = \inf\{t \geq 0 : V_t \geq k\}$.

We denote the point process corresponding to $(V_t)_{t \geq 0}$ by $\xi_V$. In this case, $(V_t)_{t \geq 0}$ is an inhomogeneous Poisson process, and for every $k \in \mathbb{N}$, $T_{k+1} - T_k$ has exponential law with parameter $f_k$ independent of $(T_{h+1} - T_h)_{h=0}^{k-1}$. It is possible to show the following proposition:

Proposition 3.14 (Probabilities for $(V_t)_{t \geq 0}$). Consider a stationary non-fitness birth process $(V_t)_{t \geq 0}$. Denote, for every $k \in \mathbb{N}$, $\mathbb{P}(V_t = k) = P_k[V](t)$. Then
$$P_0[V](t) = \exp(-f_0 t), \tag{3.8}$$
and, for $k \geq 1$,
$$P_k[V](t) = f_{k-1} \exp(-f_k t) \int_0^t \exp(f_k x) P_{k-1}[V](x) \, dx. \tag{3.9}$$

For a proof, see [4, Chapter 3, Section 2]. From the jump times, it is easy to compute the explicit expression for the Laplace transform of $\xi_V$ as
$$\mathcal{L}(\mathbb{E}\xi_V(d\cdot))(\alpha) = \sum_{k \in \mathbb{N}} \mathbb{E}\left[ \int_0^\infty e^{-\alpha t} \delta_{T_k}(dt) \right] = \sum_{k \in \mathbb{N}} \mathbb{E}\left[ e^{-\alpha T_k} \right] = \sum_{k \in \mathbb{N}} \prod_{i=0}^{k-1} \frac{f_i}{\alpha + f_i},$$
since every $T_k$ can be seen as a sum of independent exponential random variables with parameters given by the sequence $(f_k)_{k \in \mathbb{N}}$.
Assuming now that $\xi_V$ is supercritical and Malthusian with parameter $\alpha^*$, we have the explicit expression for the limit distribution $(p_k)_{k \in \mathbb{N}}$, given by (2.2).

An analysis of the behavior of the limit distribution of branching processes is presented in [2] and [22], where the authors prove that $(p_k)_{k \in \mathbb{N}}$ has a power-law tail only if the sequence of rates $(f_k)_{k \in \mathbb{N}}$ is asymptotically linear with respect to $k$.

Proposition 3.15 (Characterization of stationary and linear process $V$). Consider the sequence $f_k = ak + b$. Then:
(1) for every $\alpha > a$,
$$\mathcal{L}(\mathbb{E}\xi_V(d\cdot))(\alpha) = \frac{\Gamma(\alpha/a + b/a)}{\Gamma(b/a)} \sum_{k \in \mathbb{N}} \frac{\Gamma(k + b/a)}{\Gamma(k + b/a + \alpha/a)} = \frac{b}{\alpha - a};$$
(2) the Malthusian parameter is $\alpha^* = a + b$, and $\tilde{\alpha} = a$, where $\tilde{\alpha}$ is defined as in (3.1);
(3) the derivative of the Laplace transform is $-b/(\alpha - a)^2$, which is finite whenever $\alpha > a$;
(4) the process $(V_t)_{t \geq 0}$ satisfies the (xlogx) condition (3.5).

Proof. The proof can be found in [23, Theorem 2], or [2, Theorem 2.6].

For affine PA weights $(f_k)_{k \in \mathbb{N}} = (ak + b)_{k \in \mathbb{N}}$, the Malthusian parameter $\alpha^*$ exists. Since $\alpha^* = a + b$, the limiting degree distribution of the branching process $\boldsymbol{V}$ is given by
$$p_k = (1 + b/a) \, \frac{\Gamma(1 + 2b/a)}{\Gamma(b/a)} \, \frac{\Gamma(k + b/a)}{\Gamma(k + b/a + 2 + b/a)}. \tag{3.10}$$
Notice that $p_k$ has a power-law decay with exponent $\tau = 2 + b/a$. Branching processes of this type are related to PAM, also called the Barabási-Albert model ([1]). This model shows the so-called old-get-richer effect. Clearly this is not true for real-world citation networks. In Figure 5, we notice that, on average, the increment of the citations received by old papers is smaller than the increment of younger papers. Rephrasing it, old papers tend to be cited less and less over time.

The existence of the Malthusian parameter is a necessary condition to have a branching process growing at exponential rate. In particular, the Malthusian parameter does not exist in two cases: when the process is subcritical and grows slower than exponential, or when it is explosive. In the first case, the branching population might either die out or grow indefinitely with positive probability, but slower than at exponential rate. In the second case, the population size explodes in finite time with probability one. In both cases, the behavior of the branching population is different from what we observe in citation networks (Figure 1). For this reason, we focus on supercritical processes, i.e., on the case where the Malthusian parameter exists.
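For affine weights, the statements of Proposition 3.15 and formula (3.10) can be checked numerically. The sketch below truncates the series $\sum_{k \geq 1} \prod_{i<k} f_i/(\alpha + f_i)$ (the sum is read over $k \geq 1$, an assumption about the indexing under which the closed form $b/(\alpha - a)$ holds), locates the Malthusian parameter by bisection, and compares the product form $\frac{\alpha^*}{\alpha^* + f_k}\prod_{i<k}\frac{f_i}{\alpha^* + f_i}$ of the limiting degree distribution with (3.10). All function names, the truncation level and the bisection bracket are illustrative choices; the bracket assumes that $b/a$ is not very small, so that the truncated series exceeds 1 near $\alpha = a$.

```python
import math


def laplace_transform(alpha, a, b, terms=20000):
    """Truncated sum over k = 1..terms of prod_{i<k} f_i / (alpha + f_i)."""
    total, prod = 0.0, 1.0
    for i in range(terms):
        f = a * i + b
        prod *= f / (alpha + f)
        total += prod
    return total


def malthusian_parameter(a, b, iters=40):
    """Bisection for the root of laplace_transform(alpha) = 1 on (a, a + 2b + 1)."""
    lo, hi = a, a + 2.0 * b + 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if laplace_transform(mid, a, b) > 1.0:
            lo = mid  # transform too large: the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


def p_product(k, alpha, a, b):
    """alpha / (alpha + f_k) * prod_{i<k} f_i / (alpha + f_i)."""
    prod = 1.0
    for i in range(k):
        f = a * i + b
        prod *= f / (alpha + f)
    return prod * alpha / (alpha + a * k + b)


def p_closed_form(k, a, b):
    """Formula (3.10), evaluated through log-gamma for numerical stability."""
    r = b / a
    log_p = (math.log(1.0 + r) + math.lgamma(1.0 + 2.0 * r) - math.lgamma(r)
             + math.lgamma(k + r) - math.lgamma(k + 2.0 + 2.0 * r))
    return math.exp(log_p)


a, b = 2.0, 3.0  # illustrative values only
print("alpha* (numerical):", malthusian_parameter(a, b), "vs a + b =", a + b)
for k in range(4):
    print(k, p_product(k, a + b, a, b), p_closed_form(k, a, b))
```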

Denote by $(V_t)_{t \geq 0}$ a stationary birth process defined by PA weights $(f_k)_{k \in \mathbb{N}}$. In general, we assume $f_k \to \infty$. Denote the sequence of jump times by $(T_k)_{k \in \mathbb{N}}$. As we quote in Section 3.2, the Laplace transform of a birth process $(V_t)_{t \geq 0}$ is given by
$$\mathcal{L}(\mathbb{E}V(d\cdot))(\alpha) = \mathbb{E}\left[ \sum_{k \in \mathbb{N}} e^{-\alpha T_k} \right] = \mathbb{E}[V_{T_\alpha}] = \sum_{k \in \mathbb{N}} \prod_{i=0}^{k-1} \frac{f_i}{\alpha + f_i}.$$
Such an expression comes from the fact that, in the stationary regime, $T_k$ is the sum of $k$ independent exponential random variables. We can write
$$\sum_{k \in \mathbb{N}} \exp\left( -\sum_{i=0}^{k-1} \log\left( 1 + \frac{\alpha}{f_i} \right) \right) = \sum_{k \in \mathbb{N}} \exp\left( -\alpha \sum_{i=0}^{k-1} \frac{1}{f_i} (1 + o(1)) \right).$$
The behavior of the Laplace transform depends on the asymptotic behavior of the PA weights. We now define the terminology we use:

Definition 3.16 (Superlinear PA weights). Consider a PA weight sequence $(f_k)_{k \in \mathbb{N}}$. We say that the PA weights are superlinear if $\sum_{i=0}^{\infty} 1/f_i < \infty$.

As a general example, consider $f_k = ak^q + b$, where $q > 0$. In this case, the sequence is affine when $q = 1$, superlinear when $q > 1$ and sublinear when $q < 1$. In the superlinear case, since $C = \sum_{i=0}^{\infty} 1/f_i < \infty$, we have
$$\sum_{k \in \mathbb{N}} \exp\left( -\alpha \sum_{i=0}^{k-1} \frac{1}{f_i} (1 + o(1)) \right) \geq \sum_{k \in \mathbb{N}} \exp(-\alpha C) = +\infty. \tag{3.11}$$
This holds for every $\alpha > 0$. As a consequence, the Laplace transform $\mathcal{L}(\mathbb{E}V(d\cdot))(\alpha)$ is always infinite, and there exists no Malthusian parameter. In particular, if we denote $T_\infty = \lim_{k \to \infty} T_k$, then $T_\infty < \infty$ a.s. This means that the birth process $(V_t)_{t \geq 0}$ explodes in a finite time.

When the weights are at most linear, the bound in (3.11) does not hold anymore. In fact, consider as an example affine weights $f_k = ak + b$. We have that $\sum_{i=0}^{k-1} 1/f_i = (1/a) \log k \, (1 + o(1))$. As a consequence, the Laplace transform can be written as
$$\sum_{k \in \mathbb{N}} \exp\left( -\frac{\alpha}{a} \log k \, (1 + o(1)) \right) = \sum_{k \in \mathbb{N}} k^{-\frac{\alpha}{a}(1 + o(1))}. \tag{3.12}$$
In this case, the Laplace transform is finite for $\alpha > a$. For the sublinear case, for which $\sum_{i=0}^{k-1} 1/f_i = C k^{1-q} (1 + o(1))$, we obtain
$$\sum_{k \in \mathbb{N}} \exp\left( -C \alpha k^{1-q} \right).$$
This sum is finite for any $\alpha > 0$.
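The three regimes for $f_k = ak^q + b$ can be seen directly from the partial sums $\sum_{i<k} 1/f_i$, which equal the expected jump times $\mathbb{E}[T_k]$ because the gaps $T_{i+1} - T_i$ are exponential with parameter $f_i$. The following sketch, with an illustrative function name and illustrative parameter values, shows that the sums converge for $q > 1$, grow like $(1/a)\log k$ for $q = 1$, and grow like $k^{1-q}$ for $q < 1$.

```python
import math


def inverse_weight_sum(q, a, b, k):
    """Partial sum over i = 0..k-1 of 1 / (a * i**q + b), i.e. E[T_k]."""
    return sum(1.0 / (a * i ** q + b) for i in range(k))


# Superlinear (q = 2): the partial sums stabilise, so E[T_infinity] is finite.
s4 = inverse_weight_sum(2.0, 1.0, 1.0, 10**4)
s5 = inverse_weight_sum(2.0, 1.0, 1.0, 10**5)
print("q = 2:", s4, s5)

# Affine (q = 1): each decade of k adds about log(10) / a.
a = 2.0
d = inverse_weight_sum(1.0, a, 1.0, 10**5) - inverse_weight_sum(1.0, a, 1.0, 10**4)
print("q = 1: increment per decade", d, "vs log(10)/a =", math.log(10.0) / a)

# Sublinear (q = 1/2): quadrupling k roughly doubles the sum (k**(1 - q) growth).
r = inverse_weight_sum(0.5, 1.0, 1.0, 4 * 10**4) / inverse_weight_sum(0.5, 1.0, 1.0, 10**4)
print("q = 1/2: ratio", r)
```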

Remark 3.17. Consider the process $(V_t)_{t \geq 0}$ defined by the sequence of PA weights $(f_k)_{k \in \mathbb{N}}$ as in Section 3.2. For $u \in \mathbb{R}^+$ we denote by $(V^u_t)_{t \geq 0}$ the process defined by the sequence $(u f_k)_{k \in \mathbb{N}}$. It is easy to show that
$$\mathcal{L}(\mathbb{E}\xi_{V^u}(d\cdot))(\alpha) = \mathcal{L}(\mathbb{E}\xi_V(d\cdot))(\alpha/u).$$
The behavior of the degree sequence of $(V^u_t)_{t \geq 0}$ is the same as that of the process $V_t$.

Remark 3.17 shows a sort of monotonicity of the Laplace transform with respect to the sequence $(f_k)_{k \in \mathbb{N}}$. This is very useful to describe the Laplace transform of a birth process with fitness, which we define now:

Definition 3.18 (Stationary fitness birth processes). Consider a birth process $(V_t)_{t \geq 0}$ defined by a sequence of weights $(f_k)_{k \in \mathbb{N}}$. Let $Y$ be a positive random variable. We call stationary fitness birth process the process $(V^Y_t)_{t \geq 0}$, defined by the random sequence of weights $(Y f_k)_{k \in \mathbb{N}}$, i.e., conditionally on $Y$,
$$\mathbb{P}\left( V^Y_{t+h} = k + 1 \mid V^Y_t = k, Y \right) = Y f_k h + o(h).$$

By Definition 3.18, it is obvious that the properties of the process $(V^Y_t)_{t \geq 0}$ are related to the properties of $(V_t)_{t \geq 0}$. Since we consider a random fitness $Y$ independent of the process $(V_t)_{t \geq 0}$, from Remark 3.17 it follows that
$$\mathcal{L}(\mathbb{E}V^Y(d\cdot))(\alpha) = \mathbb{E}\left[ \mathcal{L}(\mathbb{E}\xi_{V^u}(d\cdot))(\alpha)_{u = Y} \right] = \mathbb{E}\left[ \sum_{k \in \mathbb{N}} \prod_{i=0}^{k-1} \frac{Y f_i}{\alpha + Y f_i} \right]. \tag{3.13}$$
For affine weights the fitness distribution needs to be bounded, as discussed in Section 2.4. In this section we give a qualitative explanation of this fact. Consider the sum in the expectation in the right-hand term of (3.13). We can rewrite the sum as
$$\sum_{k \in \mathbb{N}} \prod_{i=0}^{k-1} \frac{Y f_i}{\alpha + Y f_i} = \sum_{k \in \mathbb{N}} \exp\left( -\sum_{i=0}^{k-1} \log\left( 1 + \frac{\alpha}{Y f_i} \right) \right) = \sum_{k \in \mathbb{N}} \exp\left( -\frac{\alpha}{Y} \sum_{i=0}^{k-1} \frac{1}{f_i} (1 + o(1)) \right). \tag{3.14}$$
The behavior depends sensitively on the asymptotic behavior of the PA weights.
In particular, a necessary condition for the existence of the Malthusian parameter is that the sum in (3.13) is finite on an interval of the type $(\tilde\alpha, +\infty)$. In other words, since the Laplace transform is a decreasing function (when finite), we need to prove the existence of a minimum value $\tilde\alpha$ such that it is finite for every $\alpha > \tilde\alpha$. Using (3.14) in (3.13), we just need to find a value $\alpha$ such that the right-hand side of (3.14) equals 1.

In the case of affine weights $f_k = ak + b$, we have $\sum_{i=0}^{k-1} 1/f_i = C\log k\,(1+o(1))$, for a constant $C$. As a consequence, (3.14) is equal to
\[ \mathbb{E}\Bigg[\sum_{k\in\mathbb{N}} \exp\Big(-C\frac{\alpha}{Y}\log k\Big)\Bigg] = \mathbb{E}\Bigg[\sum_{k\in\mathbb{N}} k^{-C\alpha/Y}\Bigg]. \tag{3.15} \]
The sum inside the last expectation is finite only on the event $\{Y < C\alpha\}$. If $Y$ has an unbounded distribution, then for every value of $\alpha > 0$, $\{Y \ge C\alpha\}$ is an event of positive probability. As a consequence, for every $\alpha > 0$, the Laplace transform of the birth process $(V^Y_t)_{t\ge 0}$ is infinite, which means there exists no Malthusian parameter.

This is why a bounded fitness distribution is necessary to have a Malthusian parameter using affine PA weights. The situation is different in the case of sublinear weights. For example, consider $f_k = (1+k)^q$, where $q\in(0,1)$. Then $\sum_{i=0}^{k-1} 1/f_i = Ck^{1-q}(1+o(1))$. Using this in (3.14), we obtain
\[ \mathbb{E}\Bigg[\sum_{k\in\mathbb{N}} \exp\Big(-C\frac{\alpha}{Y}k^{1-q}\Big)\Bigg]. \]
In this case, since both $\alpha$ and $Y$ are always positive, the last sum is finite with probability 1, and the expectation might be finite under appropriate moment assumptions on $Y$.

Assume now that the fitness $Y$ satisfies the necessary conditions, so that the process $(V^Y_t)_{t\ge 0}$ is supercritical and Malthusian with parameter $\alpha^*$. We can evaluate the limiting degree distribution. Conditioning on $Y$, the Laplace transform of $\mathbb{E}\xi_{V^Y}(dx)$ is
\[ \sum_{k\in\mathbb{N}} \prod_{i=0}^{k-1} \frac{Yf_i}{\alpha + Yf_i}, \]
so, as a consequence, the limiting degree distribution of the branching process is
\[ p_k = \mathbb{E}\Bigg[\frac{\alpha^*}{\alpha^* + Yf_k} \prod_{i=0}^{k-1} \frac{Yf_i}{\alpha^* + Yf_i}\Bigg]. \tag{3.16} \]
The right-hand side of (3.16) is similar to the distribution of the simpler case with no fitness given by (2.2). We still have a product structure for the limit distribution, but in the fitness case it has to be averaged over the fitness distribution. This result is similar to [10, Theorem 2.7, Corollary 2.8].

Considering affine weights $f_k = ak + b$, we can rewrite (3.16) as
\[ p_k = \mathbb{E}\Bigg[\frac{\Gamma((\alpha^* + b)/(aY))}{\Gamma(b/(aY))}\, \frac{\Gamma(k + b/(aY))}{\Gamma(k + b/(aY) + 1 + \alpha^*/(aY))}\Bigg]. \]
Asymptotically in $k$, the argument of the expectation in the previous expression is random with a power-law exponent $\tau(Y) = 1 + \alpha^*/(aY)$.
For example, averaging over the fitness distribution in this case, it is possible to obtain power laws with logarithmic corrections (see e.g. [5, Corollary 32]).

Existence of limiting distributions

In this section, we give the proof of Theorems 2.2, 2.3 and 2.5, proving that the branching processes defined in Section 2 do have a limiting degree distribution. As mentioned, we start by proving Theorem 2.5, and then explain how Theorem 2.2 follows as a special case.

Before proving the result, we need some remarks on the processes we consider. Birth processes with aging alone and with aging and fitness are defined in Definitions 2.1 and 2.4, respectively. Consider a process with aging and fitness $(M_t)_{t\ge 0}$ as in Definition 2.4. Let $(T_k)_{k\in\mathbb{N}}$ denote the sequence of birth times, i.e., $T_k = \inf\{t\ge 0 : M_t \ge k\}$. It is an immediate consequence of the definition that, for every $k\in\mathbb{N}$,
\[ \mathbb{P}(T_k \le t) = \mathbb{P}\big(\bar T_k \le YG(t)\big), \tag{4.1} \]
where $(\bar T_k)_{k\in\mathbb{N}}$ is the sequence of birth times of a stationary birth process $(V_t)_{t\ge 0}$ defined by the same PA function $f$. Consider then the sequence of functions $(P_k[V](t))_{k\in\mathbb{N}}$ associated with the stationary process $(V_t)_{t\ge 0}$ defined by the same sequence of weights $(f_k)_{k\in\mathbb{N}}$ (see Proposition 3.14). As a consequence, for every $k\in\mathbb{N}$, $\mathbb{P}(M_t = k) = \mathbb{E}[P_k[V](YG(t))]$, and the same holds for an aging process by considering $Y\equiv 1$. Formula (4.1) implies that the aging process is the stationary process with a deterministic time-change given by $G(t)$. A process with aging and fitness is the stationary process with a random time-change given by $YG(t)$.

Assume now that $g$ is integrable, i.e., $\lim_{t\to\infty} G(t) = G(\infty) < \infty$. Using (4.1) we can describe the limiting degree distribution $(q_k)_{k\in\mathbb{N}}$ of a fixed individual in the branching population, i.e., the distribution of $N_\infty$ (or $M_\infty$), the total number of children an individual generates in its entire lifetime. In fact, for every $k\in\mathbb{N}$,
\[ \lim_{t\to\infty} \mathbb{P}(N_t = k) = \lim_{t\to\infty} P_k[V](G(t)) = \mathbb{P}\big(V_{G(\infty)} = k\big), \tag{4.2} \]
which means that $N_\infty$ has the same distribution as $V_{G(\infty)}$. With fitness,
\[ \lim_{t\to\infty} \mathbb{P}(M_t = k) = \lim_{t\to\infty} \mathbb{E}[P_k[V](YG(t))] = \mathbb{P}\big(V_{YG(\infty)} = k\big). \]
For example, in the case of aging only, this is rather different from the stationary case, where the number of children of a fixed individual diverges as the individual gets old (see e.g. [2, Theorem 2.6]).
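The time-change in (4.1) and (4.2) lends itself to a direct simulation: the total number of children of an individual is the state of the stationary birth process at time Y G(∞). The sketch below is illustrative only and not taken from the paper. The function names are hypothetical, the weights are affine with a = b = 1, the aging function is assumed to be g(t) = e^{-t} so that G(∞) = 1, and there is no fitness (Y ≡ 1). The comparison value (b/a)(e^{aG(∞)} - 1) is the mean of the negative binomial law that a linear birth process started from 0 has at a fixed time.

```python
import math
import random


def sample_stationary_birth(f, horizon, rng):
    """State at time `horizon` of a pure birth process started at 0 with rate f(k) in state k."""
    t, k = 0.0, 0
    while True:
        t += rng.expovariate(f(k))   # exponential waiting time with rate f(k)
        if t > horizon:
            return k
        k += 1


def sample_total_children(f, G_inf, rng, fitness=1.0):
    """Time-changed process: N_inf has the law of V at time fitness * G(inf)."""
    return sample_stationary_birth(f, fitness * G_inf, rng)


a, b = 1.0, 1.0     # affine weights f_k = a*k + b (illustrative values)
G_inf = 1.0         # G(inf) for the assumed aging function g(t) = exp(-t)
rng = random.Random(12345)
n_samples = 20_000

samples = [sample_total_children(lambda k: a * k + b, G_inf, rng) for _ in range(n_samples)]
empirical_mean = sum(samples) / n_samples
exact_mean = (b / a) * (math.exp(a * G_inf) - 1.0)
print(empirical_mean, exact_mean)
```

Every sample is finite because the horizon G(∞) is finite and affine weights do not explode, in contrast with the stationary case where the count keeps growing with age.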

Birth processes with continuous aging effect and fitness are defined in Definition 2.4. We now identify conditions on the fitness distribution to have a Malthusian parameter:

Lemma 4.1 (Condition (2.6)). Consider a stationary process $(V_t)_{t\ge 0}$, an integrable aging function $g$ and a random fitness $Y$. Assume that $\mathbb{E}[V_t] < \infty$ for every $t\ge 0$. Then the process $(V_{YG(t)})_{t\ge 0}$ is supercritical if and only if Condition (2.6) holds, i.e.,
\[ \mathbb{E}\big[V_{YG(t)}\big] < \infty \ \text{ for every } t\ge 0 \quad\text{and}\quad \lim_{t\to\infty} \mathbb{E}\big[V_{YG(t)}\big] > 1. \]

Proof.

For the if part, we need to prove that
\[ \lim_{\alpha\to 0^+} \mathbb{E}\big[V_{YG(T_\alpha)}\big] > 1 \quad\text{and}\quad \lim_{\alpha\to\infty} \mathbb{E}\big[V_{YG(T_\alpha)}\big] = 0. \]
As before, $(\bar T_k)_{k\in\mathbb{N}}$ are the jump times of the process $(V_{G(t)})_{t\ge 0}$. Then
\[ \mathbb{E}\big[V_{YG(T_\alpha)}\big] = \sum_{k\in\mathbb{N}} \mathbb{E}\big[e^{-\alpha\bar T_k/Y}\big]. \]
When $\alpha\to 0$, we have $\mathbb{E}\big[e^{-\alpha\bar T_k/Y}\big] \to \mathbb{P}\big(\bar T_k/Y < \infty\big)$. Now,
\[ \sum_{k\in\mathbb{N}} \mathbb{P}\big(\bar T_k < \infty\big) = \lim_{t\to\infty} \sum_{k\in\mathbb{N}} \mathbb{P}\big(\bar T_k/Y \le t\big) = \lim_{t\to\infty} \mathbb{E}\big[V_{YG(t)}\big] > 1. \]
For $\alpha\to\infty$,
\[ \int_0^\infty \alpha e^{-\alpha t}\, \mathbb{E}\big[V_{YG(t)}\big]\,dt = \int_0^\infty e^{-u}\, \mathbb{E}\big[V_{YG(u/\alpha)}\big]\,du. \]
When $\alpha\to\infty$ we have $\mathbb{E}\big[V_{YG(u/\alpha)}\big] \to 0$. Then, fix $\alpha_0 > 0$ such that $\mathbb{E}\big[V_{YG(u/\alpha)}\big] < 1$ for every $\alpha > \alpha_0$. As a consequence, $e^{-u}\,\mathbb{E}\big[V_{YG(u/\alpha)}\big] \le e^{-u}$ for any $\alpha > \alpha_0$. By dominated convergence,
\[ \lim_{\alpha\to\infty} \int_0^\infty \alpha e^{-\alpha t}\, \mathbb{E}\big[V_{YG(t)}\big]\,dt = 0. \]
Now suppose Condition (2.6) does not hold. This means that $\mathbb{E}[V_{YG(t)}] = +\infty$ for some $t\in[0, G(\infty))$ or $\lim_{t\to\infty} \mathbb{E}[V_{YG(t)}] \le 1$. If the first condition holds, then there exists $t_0\in(0, aG(\infty))$ such that $\mathbb{E}\big[V_{YG(t)}\big] = +\infty$ for every $t\ge t_0$ (recall that $\mathbb{E}\big[V_{YG(t)}\big]$ is an increasing function of $t$). As a consequence, for every $\alpha > 0$, we have $\mathbb{E}\big[V_{YG(T_\alpha)}\big] = +\infty$, which means that the process is explosive. If the second condition holds, then for every α > Y than requiring it to be bounded.

Now, we want to investigate the degree distribution of the branching process, assuming that the process $(M_t)_{t\ge 0}$ is supercritical and Malthusian. Denote the Malthusian parameter by $\alpha^*$. The above allows us to complete the proof of Theorem 2.5:

Proof of Theorem 2.5.

We start from
\[ p_k = \mathbb{E}\big[P_k[V](YG(T_{\alpha^*}))\big]. \tag{4.3} \]
Conditioning on $Y$ and integrating by parts in the integral given by the expectation in (4.3) gives
\[ -f_kY \int_0^\infty e^{-\alpha^* t} P_k[V](YG(t))\, g(t)\,dt + f_{k-1}Y \int_0^\infty e^{-\alpha^* t} P_{k-1}[V](YG(t))\, g(t)\,dt. \]
Now, we define
\[ \hat{\mathcal{L}}(k, \alpha^*, Y) = \left(\frac{\mathcal{L}\big(\mathbb{P}(V_{uG(\cdot)} = k)\, g(\cdot)\big)(\alpha^*)}{\mathcal{L}\big(\mathbb{P}(V_{uG(\cdot)} = k)\big)(\alpha^*)}\right)_{u=Y}. \tag{4.4} \]
Notice that $(\hat{\mathcal{L}}(k, \alpha^*, Y))_{k\in\mathbb{N}}$ is a sequence of random variables. Multiplying both sides of the equation by $\alpha^*$, on the right-hand side we have
\[ -f_kY\, \hat{\mathcal{L}}(k, \alpha^*, Y)\, \mathbb{E}\big[P_k[V](uG(T_{\alpha^*}))\big]_{u=Y} + f_{k-1}Y\, \hat{\mathcal{L}}(k-1, \alpha^*, Y)\, \mathbb{E}\big[P_{k-1}[V](uG(T_{\alpha^*}))\big]_{u=Y}, \]
while on the left-hand side we have $\alpha^*\, \mathbb{E}\big[P_k[V](uG(T_{\alpha^*}))\big]_{u=Y}$. As a consequence,
\[ \mathbb{E}\big[P_k[V](uG(T_{\alpha^*}))\big]_{u=Y} = \frac{f_{k-1}Y\, \hat{\mathcal{L}}(k-1, \alpha^*, Y)}{\alpha^* + f_kY\, \hat{\mathcal{L}}(k, \alpha^*, Y)}\, \mathbb{E}\big[P_{k-1}[V](uG(T_{\alpha^*}))\big]_{u=Y}. \tag{4.5} \]
We start from $p_0$, which is given by
\[ \mathbb{E}\big[P_0[V](uG(T_{\alpha^*}))\big]_{u=Y} = \frac{\alpha^*}{\alpha^* + f_0Y\, \hat{\mathcal{L}}(0, \alpha^*, Y)}. \]
Recursively using (4.5) gives
\[ \mathbb{E}\big[P_k[V](uG(T_{\alpha^*}))\big]_{u=Y} = \frac{\alpha^*}{\alpha^* + f_kY\, \hat{\mathcal{L}}(k, \alpha^*, Y)} \prod_{i=0}^{k-1} \frac{f_iY\, \hat{\mathcal{L}}(i, \alpha^*, Y)}{\alpha^* + f_iY\, \hat{\mathcal{L}}(i, \alpha^*, Y)}. \]
Taking expectation on both sides gives
\[ p_k = \mathbb{E}\Bigg[\frac{\alpha^*}{\alpha^* + f_kY\, \hat{\mathcal{L}}(k, \alpha^*, Y)} \prod_{i=0}^{k-1} \frac{f_iY\, \hat{\mathcal{L}}(i, \alpha^*, Y)}{\alpha^* + f_iY\, \hat{\mathcal{L}}(i, \alpha^*, Y)}\Bigg]. \]
The sequence $(\hat{\mathcal{L}}(k, \alpha^*, Y))_{k\in\mathbb{N}}$ creates a relation among the sequence of weights, the aging function and the fitness distribution, so that these three ingredients are deeply related.

As mentioned, Theorem 2.2 follows immediately by considering $Y\equiv 1$. The proof is in fact the same, since we can express the probabilities $\mathbb{P}(N_t = k)$ as a function of the stationary process $(V_t)_{t\ge 0}$ defined by the same PA function $f$.

Condition (2.5) immediately follows from Condition (2.6). In fact, considering $Y\equiv 1$, Condition (2.6) becomes
\[ \mathbb{E}\big[V_{G(t)}\big] < \infty \ \text{ for every } t\ge 0 \quad\text{and}\quad \lim_{t\to\infty} \mathbb{E}\big[V_{G(t)}\big] > 1. \tag{4.6} \]
The first inequality is in general true for the type of stationary process we consider (for instance with $f$ affine). The second inequality is exactly Condition (2.5).

The expression of the sequence $(\hat{\mathcal{L}}^g(k, \alpha^*))_{k\in\mathbb{N}}$ is simpler than the general case given in (4.4). In fact, in (4.4), the sequence $(\hat{\mathcal{L}}(k, \alpha^*, Y))_{k\in\mathbb{N}}$ is a sequence of random variables. In the case of aging alone,
\[ \hat{\mathcal{L}}^g(k, \alpha^*) = \frac{\mathcal{L}\big(\mathbb{P}(V_{G(\cdot)} = k)\, g(\cdot)\big)(\alpha^*)}{\mathcal{L}\big(\mathbb{P}(V_{G(\cdot)} = k)\big)(\alpha^*)}, \]
which is a deterministic sequence.

Remark 4.2.

Notice that $\hat{\mathcal{L}}^g(k, \alpha^*) = 1$ when $g(t)\equiv 1$, so that $G(t) = t$ for every $t\in\mathbb{R}^+$ and there is no aging, and we retrieve the stationary process $(V_t)_{t\ge 0}$.

Unfortunately, the explicit expression of the coefficients $(\hat{\mathcal{L}}^g(k, \alpha^*))_{k\in\mathbb{N}}$ is not easy to find, even though they are deterministic.

Theorem 2.3, which states that even if $g$ is integrable, the aging does not affect the explosive behavior of a birth process with superlinear weights, is a direct consequence of (4.2):

Proof of Theorem 2.3. Consider a birth process $(V_t)_{t\ge 0}$, defined by a sequence of superlinear weights $(f_k)_{k\in\mathbb{N}}$ (in the sense of Definition 3.16), and an integrable aging function $g$. Then, for every $t > 0$,
\[ \mathbb{P}(N_t = \infty) = \mathbb{P}\big(V_{G(t)} = \infty\big) > 0. \]
Since this holds for every $t > 0$, the process $(N_t)_{t\ge 0}$ is explosive. As a consequence, for any $\alpha > 0$, $\mathbb{E}[N_{T_\alpha}] = \infty$, which means that there exists no Malthusian parameter.

Affine weights and adapted Laplace method

In this section, we consider affine PA weights, i.e., we consider $f_k = ak + b$. The main aim is to identify the asymptotic behavior of the limiting degree distribution of the branching process with aging. Consider a stationary process $(V_t)_{t\ge 0}$, where $f_k = ak + b$. Then, for any $t\ge 0$, it is possible to show by induction and the recursions in (3.8) and (3.9) that
\[ P_k[V](t) = \mathbb{P}(V_t = k) = \frac{1}{\Gamma(b/a)}\, \frac{\Gamma(k + b/a)}{\Gamma(k+1)}\, e^{-bt}\big(1 - e^{-at}\big)^k. \tag{5.1} \]
We omit the proof of (5.1). As a consequence, since the corresponding aging process is $(V_{G(t)})_{t\ge 0}$, the limiting degree distribution is given by
\[ p_k = \mathbb{P}\big(V_{G(T_{\alpha^*})} = k\big) = \frac{\Gamma(k + b/a)}{\Gamma(b/a)\Gamma(k+1)} \int_0^\infty \alpha^* e^{-\alpha^* t}\, e^{-bG(t)}\big(1 - e^{-aG(t)}\big)^k\,dt. \tag{5.2} \]
We can obtain an immediate upper bound for $p_k$. In fact,
\[ p_k = \frac{\Gamma(k + b/a)}{\Gamma(b/a)\Gamma(k+1)} \int_0^\infty \alpha^* e^{-\alpha^* t}\, e^{-bG(t)}\big(1 - e^{-aG(t)}\big)^k\,dt \le \frac{\Gamma(k + b/a)}{\Gamma(b/a)\Gamma(k+1)}\, \big(1 - e^{-aG(\infty)}\big)^k, \]
which implies that the distribution $(p_k)_{k\in\mathbb{N}}$ has at most an exponential tail. A more precise analysis is hard. Instead we give an asymptotic approximation, by adapting the Laplace method for integrals to our case.

The Laplace method states that, for a function $\Psi$ that is twice differentiable and has a unique absolute minimum $x_0\in(a, b)$, as $k\to\infty$,
\[ \int_a^b e^{-k\Psi(x)}\,dx = \sqrt{\frac{2\pi}{k\Psi''(x_0)}}\, e^{-k\Psi(x_0)}(1 + o(1)). \tag{5.3} \]
In this situation, the interval $[a, b]$ can be infinite. The idea behind this result is that, when $k\gg 1$, the main contribution to the integral comes from a neighborhood of the point $x_0$ where $e^{-k\Psi(x)}$ is maximized. The integral in (5.2) is not of this form, since it is not an integral of the type (5.3). Defining
\[ \Psi_k(t) := \frac{\alpha^*}{k}\,t + \frac{b}{k}\,G(t) - \log\big(1 - e^{-aG(t)}\big), \tag{5.4} \]
we can rewrite the integral in (5.2) as
\[ I(k) := \int_0^\infty \alpha^* e^{-k\Psi_k(t)}\,dt. \tag{5.5} \]
The derivative of the function $\Psi_k(t)$ is
\[ \Psi_k'(t) = \frac{\alpha^*}{k} + \frac{b}{k}\,g(t) - \frac{a\,g(t)\,e^{-aG(t)}}{1 - e^{-aG(t)}}. \tag{5.6} \]
In particular, if there exists a minimum $t_k$, then it depends on $k$. In this framework, we cannot directly apply the Laplace method. We now show that a result similar to (5.3) still applies to our case:

Lemma 5.1 (Adapted Laplace method 1).
Consider $\alpha, a, b > 0$. Let the integrable aging function $g$ be such that
(1) for every $t\ge 0$, $0 < g(t) \le A < \infty$;
(2) $g$ is differentiable on $\mathbb{R}^+$, and $g'$ is finite almost everywhere;
(3) there exists a positive constant $B < \infty$ such that $g(t)$ is decreasing for $t\ge B$;
(4) the solution $t_k$ of $\Psi_k'(t) = 0$, for $\Psi_k'(t)$ as in (5.6), is unique, and $g'(t_k) < 0$.
Then, for $\sigma_k^2 = (k\Psi_k''(t_k))^{-1}$, there exists a constant $C$ such that, as $k\to\infty$,
\[ I(k) = C\sqrt{2\pi\sigma_k^2}\; e^{-k\Psi_k(t_k)} \Big(\frac{1}{2} + \mathbb{P}\big(N(0, \sigma_k^2) \ge t_k\big)\Big)(1 + o(1)), \]
where $N(0, \sigma_k^2)$ denotes a normal distribution with zero mean and variance $\sigma_k^2$.

Since Lemma 5.1 is an adapted version of the classical Laplace method, we move the proof to Appendix B. We can use the result of Lemma 5.1 to prove:

Proposition 5.2 (Asymptotics - affine weights, aging, no fitness). Consider the affine PA weights $f_k = ak + b$, an integrable aging function $g$, and denote the limiting degree distribution of the corresponding branching process by $(p_k)_{k\in\mathbb{N}}$. Then, under the hypotheses of Lemma 5.1, there exists a constant $C > 0$ such that, as $k\to\infty$,
\[ p_k = \frac{\Gamma(k + b/a)}{\Gamma(k+1)} \Big(Cg(t_k) - \frac{g'(t_k)}{g(t_k)}\Big)^{1/2} e^{-\alpha^* t_k}\big(1 - e^{-aG(\infty)}\big)^k D_k(g)(1 + o(1)), \tag{5.7} \]
where
\[ D_k(g) = \frac{1}{2} + \frac{1}{2\sqrt{\pi}} \int_{-C_k(g)}^{C_k(g)} e^{-u^2}\,du, \qquad C_k(g) = t_k\Big(Cg(t_k) - \frac{g'(t_k)}{g(t_k)}\Big)^{1/2}. \]

In this section, we investigate the asymptotic behavior of the limiting degree distribution of a CTBP, in the case of affine PA weights. The method we use is analogous to that in Section 5.1.

We assume that the fitness $Y$ is absolutely continuous with respect to the Lebesgue measure, and we denote its density function by $\mu$. The limiting degree distribution of this type of branching process is given by
\[ p_k = \mathbb{P}\big(V_{YG(T_{\alpha^*})} = k\big) = \frac{\Gamma(k + b/a)}{\Gamma(b/a)\Gamma(k+1)} \int_{\mathbb{R}^+\times\mathbb{R}^+} \alpha^* e^{-\alpha^* t}\, \mu(s)\, e^{-bsG(t)}\big(1 - e^{-asG(t)}\big)^k\,ds\,dt. \tag{5.8} \]
We immediately see that the degree distribution has exponential tails when the fitness distribution is bounded:

Lemma 5.3 (Exponential tails for integrable aging and bounded fitnesses). When there exists $\gamma$ such that $\mu([0, \gamma]) = 1$, i.e., the fitness has bounded support, then
\[ p_k \le \frac{\Gamma(k + b/a)}{\Gamma(b/a)\Gamma(k+1)} \big(1 - e^{-a\gamma G(\infty)}\big)^k. \tag{5.9} \]
In particular, $p_k$ has exponential tails.

Proof. Obvious.

As in the situation with only aging, the explicit solution of the integral in (5.8) may be hard to find. We again have to adapt the Laplace method to estimate the asymptotic behavior of the integral.
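The bound (5.9) can be sanity-checked by evaluating the double integral in (5.8) numerically. The sketch below is an illustration under explicit assumptions that are not in the paper: the aging function is g(t) = e^{-λt}, the fitness is uniform on [0, γ], and `alpha` is an arbitrary positive placeholder rather than the true Malthusian parameter (the bound holds for any positive value). The function names are hypothetical. The substitution x = e^{-αt} maps the time integral onto (0, 1), so a plain midpoint rule suffices.

```python
import math


def log_coeff(k, a, b):
    """log of Gamma(k + b/a) / (Gamma(b/a) * Gamma(k + 1))."""
    r = b / a
    return math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)


def p_k_uniform_fitness(k, a, b, alpha, lam, gamma, n_t=400, n_s=200):
    """Midpoint-rule value of (5.8) for g(t) = exp(-lam*t) and Y ~ Uniform(0, gamma)."""
    total = 0.0
    for i in range(n_t):
        x = (i + 0.5) / n_t                    # x = exp(-alpha * t) in (0, 1)
        t = -math.log(x) / alpha
        G = (1.0 - math.exp(-lam * t)) / lam   # G(t) = int_0^t g(u) du
        for j in range(n_s):
            s = gamma * (j + 0.5) / n_s        # midpoint of the fitness grid
            total += math.exp(-b * s * G) * (1.0 - math.exp(-a * s * G)) ** k
    return math.exp(log_coeff(k, a, b)) * total / (n_t * n_s)


def bound_5_9(k, a, b, lam, gamma):
    """Right-hand side of (5.9) with G(inf) = 1/lam."""
    return math.exp(log_coeff(k, a, b)) * (1.0 - math.exp(-a * gamma / lam)) ** k


params = dict(a=1.0, b=1.0, lam=1.0, gamma=2.0)   # illustrative values only
results = []
for k in (1, 5, 20):
    p = p_k_uniform_fitness(k, alpha=1.5, **params)
    results.append((k, p, bound_5_9(k, **params)))
    print(results[-1])
```

Each integrand value is at most (1 - e^{-aγG(∞)})^k, so the inequality holds exactly for the quadrature sum, not only up to discretization error.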
We write
\[ I(k) := \int_{\mathbb{R}^+\times\mathbb{R}^+} e^{-k\Psi_k(t,s)}\,ds\,dt, \tag{5.10} \]
where
\[ \Psi_k(t, s) := \frac{\alpha^*}{k}\,t + \frac{b}{k}\,sG(t) - \frac{1}{k}\log\mu(s) - \log\big(1 - e^{-saG(t)}\big). \tag{5.11} \]
As before, we want to minimize the function $\Psi_k$. We state here the lemma:

Lemma 5.4 (Adapted Laplace method 2). Let $\Psi_k(t, s)$ be as in (5.11). Assume that
(1) $g$ satisfies the assumptions of Lemma 5.1;
(2) $\mu$ is twice differentiable on $\mathbb{R}^+$;
(3) there exists a constant $B' > 0$ such that, for every $s\ge B'$, $\mu$ is monotonically decreasing;
(4) $(t_k, s_k)$ is the unique point where both partial derivatives are zero;
(5) $(t_k, s_k)$ is the absolute minimum for $\Psi_k(t, s)$;
(6) the Hessian matrix $H_k(t_k, s_k)$ of $\Psi_k(t, s)$ evaluated at $(t_k, s_k)$ is positive definite.
Then,
\[ I(k) = e^{-k\Psi_k(t_k, s_k)}\, \frac{2\pi}{\sqrt{\det(kH_k(t_k, s_k))}}\, \mathbb{P}\big(N_1(k) \ge -t_k,\ N_2(k) \ge -s_k\big)(1 + o(1)), \]
where $(N_1(k), N_2(k)) := N(\mathbf{0}, (kH_k(t_k, s_k))^{-1})$ is a bivariate normal distributed vector and $\mathbf{0} = (0, 0)$.

The proof of Lemma 5.4 can be found in Appendix B.1. Using Lemma 5.4 we can describe the limiting degree distribution $(p_k)_{k\in\mathbb{N}}$:

Proposition 5.5 (Asymptotics - affine weights, aging, fitness). Consider affine PA weights $f_k = ak + b$, an integrable aging function $g$ and a fitness distribution density $\mu$. Assume that the corresponding branching process is supercritical and Malthusian. Under the hypotheses of Lemma 5.4, the limiting degree distribution $(p_k)_{k\in\mathbb{N}}$ of the corresponding CTBP satisfies
\[ p_k = \frac{k^{b/a - 1}}{\Gamma(b/a)}\, \frac{2\pi}{\sqrt{\det(kH_k(t_k, s_k))}}\, e^{-k\Psi_k(t_k, s_k)}\, \mathbb{P}\big(N_1 \ge -t_k,\ N_2 \ge -s_k\big)(1 + o(1)). \]

Proposition 5.5 in Section 5.2 gives the asymptotic behavior of the limiting degree distribution of a CTBP with integrable aging and fitness. Lemma 5.4 requires conditions under which the function $\Psi_k(t, s)$ as in (5.11) has a unique minimum point denoted by $(t_k, s_k)$.
In this section we consider the three different classes of fitness distributions that we have introduced in Section 2.4.

For the heavy-tailed class, i.e., for distributions with tail thicker than exponential, there is nothing to prove. In fact, (2.7) immediately implies that such distributions are explosive.

For the other two cases, we apply Proposition 5.5, giving the precise asymptotic behavior of the limiting degree distributions of the corresponding CTBPs. Propositions 5.6 and 5.7 contain the results on the general-exponential and sub-exponential classes, respectively. The proofs of these propositions are moved to Appendix C.

Proposition 5.6. Consider a general exponential fitness distribution as in (2.11). Let $(M_t)_{t\ge 0}$ be the corresponding birth process. Denote the unique minimum point of $\Psi_k(t, s)$ as in (5.11) by $(t_k, s_k)$. Then
(1) for every $t\ge 0$, $M_t$ has a dynamical power law with exponent $\tau(t) = 1 + \theta/(aG(t))$;
(2) the asymptotic behavior of the limiting degree distribution $(p_k)_{k\in\mathbb{N}}$ is given by
\[ p_k = e^{-\alpha^* t_k}\, h(s_k) \Big(\tilde C - \alpha^*\frac{g'(t_k)}{g(t_k)}\Big)^{-1/2} k^{-(1 + \theta/(aG(\infty)))}(1 + o(1)), \]
where the power-law term has exponent $\tau = 1 + \theta/(aG(\infty))$;
(3) the distribution $(q_k)_{k\in\mathbb{N}}$ of the total number of children of a fixed individual has a power-law behavior with exponent $\tau = 1 + \theta/(aG(\infty))$.

By (2.8) it is necessary to consider the exponential rate $\theta > aG(\infty)$ to obtain a non-explosive process. In particular, this implies that, for every $t\ge 0$, $\tau(t)$, as well as $\tau$, are strictly larger than 2. As a consequence, the three distributions $(P_k[M](t))_{k\in\mathbb{N}}$, $(p_k)_{k\in\mathbb{N}}$ and $(q_k)_{k\in\mathbb{N}}$ have finite first moment. Increasing the value of $\theta$ leads to power-law distributions with exponent larger than 3, and hence with finite variance.

A second observation is that, independently of the aging function $g$, the point $s_k$ is of order $\log k$. This has two consequences. First, the correction to the power law given by $h(s_k)$ is a power of $\log k$, since $h'(s)/h(s)\to 0$ as $s\to\infty$. Second, the power-law term $k^{-(1 + \theta/(aG(\infty)))}$ arises from $\mu(s_k)$. This means that the exponential term in the fitness distribution $\mu$ not only is necessary to obtain a non-explosive process, but also generates the power law.

The third observation is that the behavior of the three distributions $(P_k[M](t))_{k\in\mathbb{N}}$, $(p_k)_{k\in\mathbb{N}}$ and $(q_k)_{k\in\mathbb{N}}$ depends on the integrability of the aging function, but depends only marginally on its precise shape. The contribution of the aging function $g$ to the exponent of the power law is in fact given only by the value $G(\infty)$.
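For the pure exponential fitness with rate θ and affine weights, the exponent 1 + θ/(aG(∞)) of the total-children distribution can be observed numerically. The sketch below rests on a closed form that is our own computation and is not quoted from the paper: integrating the negative binomial law (5.1) at time sG(∞) against the density θe^{-θs} gives, via a Beta integral, q_k = c Γ(k + r) Γ(r + c) / (Γ(r) Γ(k + r + c + 1)) with r = b/a and c = θ/(aG(∞)). The function names and parameter values are illustrative assumptions.

```python
import math


def log_q_k(k, a, b, theta, G_inf):
    """log q_k for exponential(theta) fitness, affine weights a*k + b and total aging mass G_inf."""
    r = b / a
    c = theta / (a * G_inf)
    return (math.log(c) + math.lgamma(k + r) + math.lgamma(r + c)
            - math.lgamma(r) - math.lgamma(k + r + c + 1))


def local_exponent(k, **params):
    """Estimate of the power-law exponent from the ratio q_{2k} / q_k."""
    return -(log_q_k(2 * k, **params) - log_q_k(k, **params)) / math.log(2.0)


params = dict(a=1.0, b=1.0, theta=2.0, G_inf=1.0)   # theta > a * G_inf, as (2.8) requires
tau = 1.0 + params["theta"] / (params["a"] * params["G_inf"])

total_mass = sum(math.exp(log_q_k(k, **params)) for k in range(2001))
estimate = local_exponent(10_000, **params)
print(total_mass, estimate, tau)
```

With these values q_k reduces to 4/((k+1)(k+2)(k+3)), so the partial sums approach 1 and the local exponent approaches 3, in line with item (3) of Proposition 5.6.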
The other terms that depend directly on the shape of g are e − α ∗ t k and the ratio g (cid:48) ( t k ) /g ( t k ). The ratio g (cid:48) /g does not contribute for any function g whose decay isin between power law and exponential. The term e − α ∗ t k depends on the behavior of t k , that canbe seen as roughly g − (1 / log k ). For any function between power law and exponential, e − α ∗ t k isasymptotic to a power of log k .The last observation is that every distribution in the general exponential class shows a dynamicalpower law as for the pure exponential distribution, as shown in Section 5.4. The pure exponentialdistribution is a special case where we consider h ( s ) ≡

1. Interesting is the fact that τ actuallydoes not depend on the choice of h ( s ), but only on the exponential rate θ > aG ( ∞ ). In particular,Proposition 5.6 proves that the limiting degree distribution of the two examples in Figure 9 havepower-law decay.We move to the class of sub-exponential ﬁtness. We show that the power law is lost due to theabsence of a pure exponential term. We prove the result using densities of the form µ ( s ) = C e − s ε , (5.12)29 he dynamics of power laws for ε > C the normalization constant. The result is the following: Proposition 5.7.

Consider a sub-exponential ﬁtness distribution as in (5.12) . Let ( M t ) t ≥ be thecorresponding birth process. Denote the minimum point of Ψ k ( t, s ) as in (5.11) by ( t k , s k ) . Then(1) for every t ≥ , M t satisﬁes P ( M t = k ) = k − (log k ) − ε/ e − θ ( aG ( t ))1+ ε (log k ) ε (1 + o (1)); (2) the limiting degree distribution ( p k ) k ∈ N of the CTBP has asymptotic behavior given by p k = e − α ∗ t k k − (cid:18) C − s εk g (cid:48) ( t k ) g ( t k ) (cid:19) e − θ ( aG ( ∞ ))1+ ε (log k ) ε (1 + o (1)); (3) the distribution ( q k ) k ∈ N of the total number of children of a ﬁxed individual satisﬁes q k = k − (log k ) − ε/ e − θ ( aG ( ∞ ))1+ ε (log k ) ε (1 + o (1)) . In Proposition 5.7 the distributions ( P k [ M ]( t )) k ∈ N , ( p k ) k ∈ N and ( q k ) k ∈ N decay faster than apower law. This is due to the fact that a sub-exponential tail for the ﬁtness distribution does notallow the presence of suﬃciently many individuals in the branching population whose ﬁtness valueis suﬃciently high to restore the power law.In this case, we have that s k is roughly c log k − c log log k . Hence, as ﬁrst approximation, s k is still of logarithmic order. The power-law term is lost because there is no pure exponential termin the distribution µ . In fact, in this case µ ( s k ) generates the dominant term e − θ (log k ) ε . The case whenthe ﬁtness Y is exponentially distributed turns out to be simpler. In this section, denote the ﬁtnessby T θ , where θ is the parameter of the exponential distribution. First of all, we investigate theLaplace transform of the process. In fact, we can write E [ M T α ] = (cid:90) ∞ θ e − θs E (cid:2) V sG ( T α ) (cid:3) ds, which is the Laplace transform of the stationary process ( V sG ( T α ) ) s ≥ with bounded ﬁtness G ( T α )in θ . As a consequence, E [ M T α ] = (cid:88) k ∈ N E (cid:34) k − (cid:89) i =0 f i G ( T α ) θ + f i G ( T α ) (cid:35) . Suppose that there exists a Malthusian parameter α ∗ . 
This means that, for fixed $(f_k)_{k\in\mathbb{N}}$, $g$ and $\theta$, $\alpha^*$ is the unique value such that $\mathbb{E}\big[M_{T_{\alpha^*}}\big] = 1$. As a consequence, if we fix $(f_k)_{k\in\mathbb{N}}$, $g$ and $\alpha^*$, $\theta$ is the unique value such that
\[ \sum_{k\in\mathbb{N}} \mathbb{E}\Bigg[\prod_{i=0}^{k-1} \frac{f_iG(T_\alpha)}{\theta + f_iG(T_\alpha)}\Bigg] = 1. \]
Therefore $\theta$ is the Malthusian parameter of the process $(V_{sG(T_\alpha)})_{s\ge 0}$. We are now ready to prove Corollary 2.6:

Proof of Corollary 2.6. We can write $\mathbb{P}(M_t = k) = \mathbb{P}\big(V_{T_\theta G(t)} = k\big)$, which means that we have to evaluate the Laplace transform of $\mathbb{P}\big(V_{sG(t)} = k\big)$ in $\theta$. Using (5.1), the first part follows immediately by simple calculations. For the second part, we just need to take the limit as $t\to\infty$. For the sequence $(p_k)_{k\in\mathbb{N}}$, the result is immediate since $p_k = \mathbb{E}[P_k[M](T_{\alpha^*})]$.

The case of affine PA weights $f_k = ak + b$ is particularly nice. As already mentioned in Section 2, the process $(M_t)_{t\ge 0}$ has a power-law distribution at every $t\in\mathbb{R}^+$ and (2.12) follows immediately. Further, (2.13) and (2.14) follow directly.

A. Limiting distribution with aging effect, no fitness

In this section, we analyze the limiting degree distribution $(p_k)_{k\in\mathbb{N}}$ of CTBPs with aging but no fitness. In Section A.1 we prove the adapted Laplace method for the general asymptotic behavior of $p_k$. In Section A.2 we consider some examples of aging function $g$, giving the asymptotics for the corresponding distributions.

A.1 Proofs of Lemma 5.1 and Proposition 5.2.

Proof of Lemma 5.1.

First of all, we show that t k is actually a minimum. In fact,lim t → ddt Ψ k ( t ) = −∞ , and lim t →∞ ddt Ψ k ( t ) = αk > . As a consequence, t k is a minimum. Then,lim k →∞ g ( t k ) (cid:32) α ∗ k − e − aG ( ∞ ) a e − aG ( ∞ ) (cid:33) − = lim k →∞ g ( t k ) akα ∗ (e aG ( ∞ ) −

1) = 1 . (A.1)In particular, g ( t k ) is of order 1 /k . Then, since t k is the actual minimum, and g is monotonicallydecreasing for t ≥ B ,Ψ (cid:48)(cid:48) k ( t k ) = bk g (cid:48) ( t k ) + g ( t k ) a e − aG ( t k ) (2 − e − aG ( t k ) )(1 − e − aG ( t k ) ) − g (cid:48) ( t k ) a e − aG ( t k ) − e − aG ( t k ) > . (A.2)We use the fact that we are evaluating the second derivative in the point t k where the ﬁrst derivativeis zero. This means g ( t k ) a e − aG ( t k ) )1 − e − aG ( t k ) = αk + bk g ( t k ) . We use this in (A.2) to obtain k Ψ (cid:48)(cid:48) k ( t k ) = bg (cid:48) ( t k ) + g ( t k ) a (2 − e − aG ( t k ) )1 − e − aG ( t k ) ( α + bg ( t k )) − g (cid:48) ( t k ) g ( t k ) ( α + bg ( t k ))= g ( t k ) a (2 − e − aG ( t k ) )1 − e − aG ( t k ) ( α + bg ( t k )) − α g (cid:48) ( t k ) g ( t k ) . (A.3)Now, we use Taylor expansion around t k of Ψ k ( t ) in the integral in (5.5). Since we use the expansionaround t k , which is the minimum of Ψ k ( t ), the ﬁrst derivative of Ψ k is zero. As a consequence, wehave I ( k ) = (cid:90) ∞ e − k ( Ψ k ( t k )+ Ψ (cid:48)(cid:48) k ( t k )( t − t k ) + o (( t − t k ) ) ) dt. First of all, notice that the contribution of the terms with | t − t k | (cid:29) − k Ψ k ( t ) ≤ e − α ∗ t (1 − e − aG ( ∞ ) ) k , (A.4)which means that such terms are exponentially small, so we can ignore them. Now we make achange of variable u = t − t k . Then I ( k ) = (cid:90) ∞− t k e − k ( Ψ k ( t k )+ Ψ (cid:48)(cid:48) k ( t k ) u + o ( u ) ) du. In particular, since the term e − k Ψ k ( t k ) does not depend on u , we can write I ( k ) = e − k Ψ k ( t k ) (cid:90) ∞− t k e − k ( Ψ (cid:48)(cid:48) k ( t k ) u + o ( u ) ) du. he dynamics of power laws We use the notation k Ψ k ( t k ) = σ k , which means we can rewrite the integral ase − k Ψ k ( t k ) (cid:113) πσ k (cid:90) t k −∞ (cid:113) πσ k e − u σ k dt = e − k Ψ k ( t k ) (cid:113) πσ k P (cid:0) N (0 , σ k ) ≤ t k (cid:1) . 
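The last step above, rewriting a Gaussian integral over a half-line that starts at -t_k as a normal probability, can be verified numerically. This is a minimal sketch with arbitrary illustrative values for t_k and the standard deviation; the helper names are hypothetical, and the upper limit of the integral is truncated at a multiple of the standard deviation where the integrand is negligible.

```python
import math


def truncated_gaussian_integral(t_k, sigma, n=20_000, cutoff=12.0):
    """Midpoint rule for int_{-t_k}^{cutoff*sigma} exp(-u^2 / (2 sigma^2)) du."""
    lo, hi = -t_k, cutoff * sigma
    h = (hi - lo) / n
    return h * sum(
        math.exp(-((lo + (i + 0.5) * h) ** 2) / (2.0 * sigma ** 2)) for i in range(n)
    )


def normal_cdf(x, sigma):
    """P(N(0, sigma^2) <= x), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


t_k, sigma = 1.3, 0.7   # illustrative values, not derived from any aging function
lhs = truncated_gaussian_integral(t_k, sigma)
rhs = math.sqrt(2.0 * math.pi * sigma ** 2) * normal_cdf(t_k, sigma)
print(lhs, rhs)
```

Since the normal law is symmetric, the probability on the right always lies between 1/2 and 1 for non-negative t_k, so the truncation changes the classical Laplace constant by at most a factor of 2.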
Since the distribution N (0 , σ k ) is symmetric with respect to 0, for every k ∈ N , P (cid:0) N (0 , σ k ) ≤ t k (cid:1) = 12 (cid:34) √ π (cid:90) t k /σ k − t k /σ k e − u du (cid:35) . (A.5)The behavior of the above integral depends on the ratio t k /σ k , which is bounded between 0 and 1.As a consequence, the term P (cid:0) N (0 , σ k ) ≤ t k (cid:1) is bounded between 1 / Proof of Proposition 5.2.

Recall that σ k = ( k Ψ( t k ) (cid:48)(cid:48) ) − . Using (A.3), the fact that g is boundedalmost everywhere, and g (cid:48) ( t k ) <

0, we can write k Ψ( t k ) (cid:48)(cid:48) = α (cid:32) a (2 − e − aG ( ∞ ) )1 − e − aG ( ∞ ) g ( t k ) − g (cid:48) ( t k ) g ( t k ) (cid:33) (1 + o (1)) . (A.6)Notice that in (A.6) the terms g ( t k ) − g (cid:48) ( t k ) g ( t k ) are always strictly positive, since g ( t ) is decreasing and t k → ∞ as k → ∞ . As a consequence, we can replace the term (cid:113) π/σ k by (cid:16) Cg ( t k ) − g (cid:48) ( t k ) g ( t k ) (cid:17) / ,for C = a (2 − e − aG ( ∞ ) )1 − e − aG ( ∞ ) . We also have thate − k Ψ k ( t k ) = exp (cid:104) − α ∗ t k − bG ( t k ) + k log (cid:16) − e − aG ( t k ) (cid:17)(cid:105) = e − α ∗ t k (1 − e − aG ( ∞ ) ) k (1 + o (1)) , since G ( t k ) converges to G ( ∞ ). For the term in (A.5), it is easy to show that it is asymptotic to D k ( g ). This completes the proof. A.2 Examples of aging functions.

In this section, we analyze two examples of aging functions,in order to give examples of the limiting degree distribution of the branching process. We consideraﬃne weights f k = ak + b , and three diﬀerent aging functions: g ( t ) = e − λt , g ( t ) = (1 + t ) − λ , and g ( t ) = λ e − λ (log( t +1) − λ ) . We assume that in every case the aging function g is integrable, so we consider λ > λ > λ , λ , λ > g satisﬁes Condition (2.5) in order to have a supercritical process.We now apply (5.7) to these three examples, giving their asymptotics. In general, we approxi-mate t k with the solution of, for c = a e − aG ( ∞ ) − e − aG ( ∞ ) , α ∗ k + bk g ( t ) − c g ( t ) = 0 . (A.7)We start considering the exponential case g ( t ) = e − λt . In this case, from (A.7) we obtain that,ignoring constants, t k = log k (1 + o (1)) . (A.8)32 aravaglia, van der hofstad, woeginger As we expected, t k → ∞ . We now use (A.6), which gives a bound on σ k in (A.1) in terms of g andits derivatives. As a consequence, (cid:18) g ( t k ) − g (cid:48) ( t k ) g ( t k ) (cid:19) − / = (cid:16) e − λt k + λ (cid:17) / ∼ λ / (1 + o (1)) . Looking at e − k Ψ k ( t k ) , it is easy to compute that, with t k as in (A.8),exp (cid:104) − α ∗ log k + bG ( t k ) + k log(1 − e − aG ( t k ) ) (cid:105) = k − α ∗ (1 + o (1)) . Since t k /σ k → ∞ , then P (cid:0) N (0 , σ k ) ≤ t k (cid:1) →

1, so that p k = Γ( k + b/a )Γ( b/a ) 1Γ( k + 1) C k − α ∗ e − C k (1 + o (1)) , which means that p k has an exponential tail with power-law corrections.We now apply the same result to the power-law aging function, so g ( t ) = (1 + t ) − λ , and G ( t ) = λ − (1 + t ) − λ . In this case(1 + t k ) = (cid:18) α ∗ c k (cid:19) − /λ (1 + o (1)) . We use again (A.1), so (cid:18) g ( t k ) − g (cid:48) ( t k ) g ( t k ) (cid:19) = (cid:32) α ∗ c k + λ (cid:18) c kα ∗ (cid:19) /λ (cid:33) / ∼ k α ∗ / λ (1 + o (1)) . In conclusion, p k = Γ( k + b/a )Γ( b/a ) 1Γ( k + 1) k α ∗ / λ e − α ∗ (cid:16) α ∗ c k (cid:17) − /λ − C k (1 + o (1)) , which means that also in this case we have a power-law with exponential truncation.In the case of the lognormal aging function, (A.7) implies that[log( t k + 1) − λ ] ≈ + 1 λ log (cid:16) c α ∗ k (cid:17) . By (A.1) we can say that (cid:18) g ( t k ) − g (cid:48) ( t k ) g ( t k ) (cid:19) = (cid:18) λ log (cid:16) c α ∗ k (cid:17) + 2 λ log( t k + 1) t k + 1 (cid:19) (1 + o (1)) = λ log (cid:16) c α ∗ k (cid:17) (1 + o (1)) . We conclude then, for some constant C > p k = Γ( k + b/a )Γ( b/a ) 1Γ( k + 1) (cid:16) λ log (cid:16) c α ∗ k (cid:17)(cid:17) / e − α ∗ e (log( c α ∗ k ))1 / e − C k (1 + o (1)) . B. Limiting distribution with aging and fitness

In this section, we consider birth processes with aging and fitness. We prove Lemma 5.4, used in the proof of Proposition 5.5. Then we give examples of limiting degree distributions for different aging functions and exponentially distributed fitness.

B.1 Proofs of Lemma 5.4 and Proposition 5.5.

Proof of Lemma 5.4.

We again use a second-order Taylor expansion of the function $\Psi_k(t,s)$ centered at $(t_k, s_k)$, where the first-order partial derivatives are zero. As a consequence we write
\[ \exp\left[ -k\Psi_k(t,s) \right] = \exp\left[ -k\Psi_k(t_k,s_k) - \frac{1}{2}\, x^T \left( kH_k(t_k,s_k) \right) x + o(\|x\|^2) \right], \]
where
\[ x = \begin{bmatrix} t - t_k \\ s - s_k \end{bmatrix}, \qquad H_k(t_k,s_k) = \begin{bmatrix} \dfrac{\partial^2 \Psi_k}{\partial t^2}(t_k,s_k) & \dfrac{\partial^2 \Psi_k}{\partial s\,\partial t}(t_k,s_k) \\[2mm] \dfrac{\partial^2 \Psi_k}{\partial s\,\partial t}(t_k,s_k) & \dfrac{\partial^2 \Psi_k}{\partial s^2}(t_k,s_k) \end{bmatrix}. \]
As for the proof of Lemma 5.1, we start by showing that we can ignore the terms where $\|x\| \gg 1$. In fact,
\[ e^{-k\Psi_k(t,s)} \le \exp\left( -\alpha^* t - bsG(t) + \log(\mu(s)) \right). \]
Since $\mu$ is a probability density, $\mu(s) < 1$ for $s \gg 1$. As a consequence, $\log(\mu(s)) < 0$, which means that the above bound is exponentially decreasing whenever $t$ and $s$ are very large. As a consequence, we can ignore the contribution given by the terms where $|t - t_k| \gg 1$ and $|s - s_k| \gg 1$.

The term $e^{-k\Psi_k(t_k,s_k)}$ is independent of $t$ and $s$, so we do not consider it in the integral. Writing $u = t - t_k$ and $v = s - s_k$, we can write
\[ \int_{\mathbb{R}^+ \times \mathbb{R}^+} e^{-\frac{1}{2} x^T (kH_k(t_k,s_k))\, x}\, ds\, dt = \int_{-t_k}^{\infty} \int_{-s_k}^{\infty} e^{-\frac{1}{2} y^T (kH_k(t_k,s_k))\, y}\, du\, dv, \]
where this time $y^T = [u\ \ v]$. As a consequence,
\[ \int_{-t_k}^{\infty} \int_{-s_k}^{\infty} e^{-\frac{1}{2} y^T (kH_k(t_k,s_k))\, y}\, du\, dv = \frac{2\pi}{\sqrt{\det(kH_k(t_k,s_k))}}\, P\left( N_1(k) \ge -t_k,\ N_2(k) \ge -s_k \right), \tag{B.1} \]
provided that the covariance matrix $(kH_k(t_k,s_k))^{-1}$ is positive definite.

As a consequence, we can use (B.1) to obtain that, for the corresponding limiting degree distribution of the branching process $(p_k)_{k \in \mathbb{N}}$, as $k \to \infty$,
\[ p_k = \frac{\Gamma(k+b/a)}{\Gamma(b/a)} \frac{1}{\Gamma(k+1)}\, e^{-k\Psi_k(t_k,s_k)}\, \frac{2\pi}{\sqrt{\det(kH_k(t_k,s_k))}}\, P\left( N_1(k) \ge -t_k,\ N_2(k) \ge -s_k \right)(1+o(1)). \]
This result holds if the point $(t_k, s_k)$ is the absolute minimum of $\Psi_k$, and the Hessian matrix is positive definite at $(t_k, s_k)$.

B.2 The Hessian matrix of $\Psi_k(t,s)$.

First of all, we need to find a point $(t_k, s_k)$ which is the solution of the system
\[ \frac{\partial \Psi_k}{\partial t} = \frac{\alpha^*}{k} + \frac{b}{k}\, s\, g(t) - \frac{s\, a\, g(t)\, e^{-saG(t)}}{1 - e^{-saG(t)}} = 0, \tag{B.2} \]
\[ \frac{\partial \Psi_k}{\partial s} = \frac{b}{k}\, G(t) - \frac{1}{k} \frac{\mu'(s)}{\mu(s)} - \frac{aG(t)\, e^{-saG(t)}}{1 - e^{-saG(t)}} = 0. \tag{B.3} \]
Denote the solution by $(t_k, s_k)$. Then
\[ \begin{aligned} \frac{\partial^2 \Psi_k}{\partial t^2} &= \frac{b}{k}\, s\, g'(t_k) + g(t_k)^2 \frac{s^2 a^2 e^{-asG(t_k)}}{(1 - e^{-asG(t_k)})^2} - g'(t_k) \frac{a s\, e^{-asG(t_k)}}{1 - e^{-asG(t_k)}}, \\ \frac{\partial^2 \Psi_k}{\partial s^2} &= -\frac{1}{k} \frac{\mu''(s)\mu(s) - \mu'(s)^2}{\mu(s)^2} + \frac{a^2 G(t)^2 e^{-saG(t)}}{(1 - e^{-saG(t)})^2}, \\ \frac{\partial^2 \Psi_k}{\partial s\, \partial t} &= \frac{b}{k}\, g(t_k) + (1 - a s_k) \left( \frac{b}{k}\, G(t_k) - \frac{1}{k} \frac{\mu'(s_k)}{\mu(s_k)} \right) \left( \frac{\alpha^*}{k} + \frac{b}{k}\, s_k\, g(t_k) \right). \end{aligned} \tag{B.4} \]
From (B.2) and (B.3) we know
\[ \begin{aligned} \frac{\alpha^*}{k} + \frac{b}{k}\, s_k\, g(t_k) &= \frac{s_k\, a\, g(t_k)\, e^{-s_k aG(t_k)}}{1 - e^{-s_k aG(t_k)}}, \\ \frac{b}{k}\, G(t_k) - \frac{1}{k} \frac{\mu'(s_k)}{\mu(s_k)} &= \frac{aG(t_k)\, e^{-s_k aG(t_k)}}{1 - e^{-s_k aG(t_k)}}. \end{aligned} \tag{B.5} \]
Using (B.5) in the expressions for the second derivatives,
\[ \begin{aligned} \frac{\partial^2 \Psi_k}{\partial t^2} &= \frac{b}{k}\, s_k\, g'(t_k) + \frac{a s_k\, g(t_k)}{1 - e^{-a s_k G(t_k)}} \left( \frac{\alpha^*}{k} + \frac{b}{k}\, s_k\, g(t_k) \right) - \frac{g'(t_k)}{g(t_k)} \left( \frac{\alpha^*}{k} + \frac{b}{k}\, s_k\, g(t_k) \right) \\ &= \frac{a s_k\, g(t_k)}{1 - e^{-a s_k G(t_k)}} \left( \frac{\alpha^*}{k} + \frac{b}{k}\, s_k\, g(t_k) \right) - \frac{\alpha^*}{k} \frac{g'(t_k)}{g(t_k)}, \end{aligned} \tag{B.6} \]
\[ \begin{aligned} \frac{\partial^2 \Psi_k}{\partial s^2} &= -\frac{1}{k} \frac{\mu''(s_k)\mu(s_k) - \mu'(s_k)^2}{\mu(s_k)^2} + \frac{aG(t_k)}{1 - e^{-a s_k G(t_k)}} \left( \frac{b}{k}\, G(t_k) - \frac{1}{k} \frac{\mu'(s_k)}{\mu(s_k)} \right) \\ &= -\frac{1}{k} \frac{\mu''(s_k)}{\mu(s_k)} + \frac{1}{k} \left( \frac{\mu'(s_k)}{\mu(s_k)} \right)^2 - \frac{1}{k} \frac{\mu'(s_k)}{\mu(s_k)} \frac{aG(t_k)}{1 - e^{-a s_k G(t_k)}} + \frac{1}{k} \frac{abG(t_k)^2}{1 - e^{-a s_k G(t_k)}}. \end{aligned} \tag{B.7} \]
In conclusion, the matrix $kH_k(t_k, s_k)$ is given by
\[ \begin{aligned} (kH_k(t_k,s_k))_{1,1} &= \frac{a s_k\, g(t_k)}{1 - e^{-a s_k G(t_k)}} \left( \alpha^* + b s_k\, g(t_k) \right) - \alpha^* \frac{g'(t_k)}{g(t_k)}; \\ (kH_k(t_k,s_k))_{2,2} &= -\frac{\mu''(s_k)}{\mu(s_k)} + \left( \frac{\mu'(s_k)}{\mu(s_k)} \right)^2 - \frac{\mu'(s_k)}{\mu(s_k)} \frac{aG(t_k)}{1 - e^{-a s_k G(t_k)}} + \frac{abG(t_k)^2}{1 - e^{-a s_k G(t_k)}}; \\ (kH_k(t_k,s_k))_{1,2} &= b\, g(t_k) + (1 - a s_k) \left( \frac{b}{k}\, G(t_k) - \frac{1}{k} \frac{\mu'(s_k)}{\mu(s_k)} \right) \left( \alpha^* + b s_k\, g(t_k) \right). \end{aligned} \tag{B.8} \]
We point out that, solving (B.3) in terms of $s$, it follows that
\[ s = \frac{1}{aG(t)} \log\left( k\, \frac{aG(t)}{bG(t) - \frac{\mu'(s)}{\mu(s)}} \right). \tag{B.9} \]
As a consequence, combining (B.2) and (B.3),
\[ s_k\, g(t_k) = -\alpha^*\, G(t_k)\, \frac{\mu(s_k)}{\mu'(s_k)}. \tag{B.10} \]
We use (B.9), (B.10) and the expressions for the elements of the Hessian matrix given in (B.8) for the examples in Section B.3.
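As a numerical sanity check of the stationarity conditions, the following minimal sketch solves (B.2) and (B.3) for exponentially distributed fitness $\mu(s) = \theta e^{-\theta s}$ (so that $\mu'/\mu = -\theta$) and exponential aging $g(t) = e^{-\lambda t}$. Solving (B.3) exactly in $s$ gives a logarithm with argument $1 + k\,aG(t)/(bG(t)+\theta)$, which agrees with (B.9) up to a correction of order $1/k$; at the joint solution the relation $s_k g(t_k) = \alpha^* G(t_k)/\theta$ holds. All parameter values, the function names, and in particular the value used for $\alpha^*$ are assumptions chosen for illustration only; $\alpha^*$ is not computed here as the Malthusian parameter of the process.

```python
import math


def G_exp(t, lam):
    """Integrated exponential aging: G(t) = int_0^t exp(-lam*u) du."""
    return -math.expm1(-lam * t) / lam


def dpsi_ds(s, t, k, a, b, theta, lam):
    """Left-hand side of (B.3) for mu(s) = theta*exp(-theta*s), i.e. mu'/mu = -theta."""
    G = G_exp(t, lam)
    x = s * a * G
    return b * G / k + theta / k - a * G * math.exp(-x) / (-math.expm1(-x))


def dpsi_dt(s, t, k, a, b, lam, alpha_star):
    """Left-hand side of (B.2) for g(t) = exp(-lam*t)."""
    G, g = G_exp(t, lam), math.exp(-lam * t)
    x = s * a * G
    return alpha_star / k + b * s * g / k - s * a * g * math.exp(-x) / (-math.expm1(-x))


def bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def s_exact(t, k, a, b, theta, lam):
    """Exact root of (B.3) in s at fixed t: log(1 + k a G / (b G + theta)) / (a G)."""
    G = G_exp(t, lam)
    return math.log1p(k * a * G / (b * G + theta)) / (a * G)


def s_b9(t, k, a, b, theta, lam):
    """Large-k form (B.9) with mu'/mu = -theta."""
    G = G_exp(t, lam)
    return math.log(k * a * G / (b * G + theta)) / (a * G)


def stationary_point(k, a, b, theta, lam, alpha_star):
    """Joint solution of (B.2)-(B.3): find t with s(t) g(t) = alpha* G(t) / theta."""
    def h(t):
        return (s_exact(t, k, a, b, theta, lam) * math.exp(-lam * t)
                - alpha_star * G_exp(t, lam) / theta)
    t_k = bisect(h, 1e-6, 100.0)
    return t_k, s_exact(t_k, k, a, b, theta, lam)


if __name__ == "__main__":
    # Hypothetical parameters with a G(inf) < theta < (a + b) G(inf).
    a, b, theta, lam, k, alpha_star = 1.0, 1.0, 1.5, 1.0, 10**6, 0.5
    t_k, s_k = stationary_point(k, a, b, theta, lam, alpha_star)
    print("t_k =", t_k, " s_k =", s_k)
    print("(B.2) residual:", dpsi_dt(s_k, t_k, k, a, b, lam, alpha_star))
    print("(B.3) residual:", dpsi_ds(s_k, t_k, k, a, b, theta, lam))
    print("s_k from (B.9):", s_b9(t_k, k, a, b, theta, lam))
```

Bisection is used only because both one-dimensional equations change sign exactly once on the chosen intervals for these parameters; any standard root finder would serve equally well.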
We also use the formulas of this section in the proof of Propositions 5.6 and 5.7 given in Section C.

B.3 Examples of aging functions.

Here we give examples of limiting degree distributions. We consider the same three examples of aging functions we considered in Section A.2, so
\[ g(t) = e^{-\lambda t}, \qquad g(t) = (1+t)^{-\lambda}, \qquad g(t) = \lambda_1 e^{-\lambda_2(\log(t+1)-\lambda_3)^2}. \]
We consider exponentially distributed fitness, so $\mu(s) = \theta e^{-\theta s}$. In order to have a supercritical and Malthusian process, we can rewrite Condition (2.8) for exponentially distributed fitness as $aG(\infty) < \theta < (a+b)G(\infty)$.

In general, we identify the minimum point $(t_k, s_k)$, then use (C.13). For all three examples, replacing $G(t)$ by $G(\infty)$ and using (B.9), it holds that
\[ s_k \approx \frac{1}{aG(\infty)} \log\left( k\, \frac{aG(\infty)}{bG(\infty) + \theta} \right), \qquad \text{and} \qquad s_k\, g(t_k) \approx \alpha^* \frac{G(\infty)}{\theta}. \]
For the exponential aging function, using (B.10), it follows that $e^{-\lambda t_k} \approx 1/\log k$. In this case, since $g'(t)/g(t) = -\lambda$, the conclusion is that, ignoring the constants,
\[ p_k = k^{-(1+\lambda\theta/a)} (\log k)^{-\alpha^*/\lambda}(1+o(1)). \]
For the inverse-power aging function $t_k \approx (\log k)^{1/\lambda}$, which implies (ignoring again the constants) that
\[ p_k = k^{-(1+(\lambda-1)\theta/a)}\, e^{-\alpha^* (\log k)^{1/\lambda}}(1+o(1)), \]
where we recall that, for $g$ being integrable, $\lambda > 1$. For the lognormal case, $t_k \approx e^{(\log k)^{1/2}}$, which means that
\[ p_k = k^{-(1+\theta/(aG(\infty)))}\, e^{-\alpha^* e^{(\log k)^{1/2}}}(1+o(1)). \]

C. Proof of Propositions 5.6 and 5.7

In the present section, we prove Propositions 5.6 and 5.7. These proofs are applications of Proposition 5.5, and mainly consist of computations. In the proof of the two propositions, we often refer to Appendix B.2 for expressions regarding the Hessian matrix of $\Psi_k(t,s)$ as in (5.11).

C.1 Proof of Proposition 5.6.

We start by proving the existence of the dynamical power law. We already know that
\[ P(M_t = k) = \frac{\Gamma(k+b/a)}{\Gamma(b/a)\Gamma(k+1)} \int_0^\infty \mu(s)\, e^{-bsG(t)} \left( 1 - e^{-asG(t)} \right)^k ds. \tag{C.1} \]
We write
\[ J(k) = \int_0^\infty e^{-k\psi_k(s)}\, ds, \tag{C.2} \]
where
\[ \psi_k(s) = \frac{bG(t)}{k}\, s - \frac{1}{k} \log(\mu(s)) - \log\left( 1 - e^{-asG(t)} \right). \tag{C.3} \]
In order to give asymptotics on $J(k)$ as in (C.2), we can use a Laplace method similar to the one used in the proof of Lemma 5.1, but the analysis is simpler since in this case $\psi_k(s)$ is a function of only one variable. The idea is again to find a minimum point $s_k$ for $\psi_k(s)$, and to use a Taylor expansion inside the integral, so
\[ \psi_k(s) = \psi_k(s_k) + \frac{1}{2} \psi_k''(s_k)(s - s_k)^2 + o((s - s_k)^2). \]
We can ignore the contribution of the terms where $(s - s_k)^2 \gg 1$, since $e^{-k\psi_k(s)} \le e^{-bsG(t)}$, so that the error is at most exponentially small. As a consequence,
\[ J(k) = \sqrt{\frac{2\pi}{k\,\psi_k''(s_k)}}\, e^{-k\psi_k(s_k)}(1+o(1)). \tag{C.4} \]
The minimum $s_k$ is a solution of
\[ \frac{d\psi_k(s)}{ds} = \frac{bG(t)}{k} - \frac{1}{k} \frac{\mu'(s)}{\mu(s)} - \frac{aG(t)\, e^{-saG(t)}}{1 - e^{-asG(t)}} = 0. \tag{C.5} \]
In particular, $s_k$ satisfies the following equality, which is similar to (B.9):
\[ s_k = \frac{1}{aG(t)} \log\left( k\, \frac{aG(t)}{bG(t) - \mu'(s_k)/\mu(s_k)} \right). \tag{C.6} \]
When $\mu(s) = C h(s) e^{-\theta s}$,
\[ \frac{\mu'(s)}{\mu(s)} = \frac{h'(s) e^{-\theta s} - \theta h(s) e^{-\theta s}}{h(s) e^{-\theta s}} = -\theta \left( 1 - \frac{h'(s)}{\theta h(s)} \right) \approx -\theta. \tag{C.7} \]
In particular, this implies
\[ s_k = \frac{1}{aG(t)} \log\left( k\, \frac{aG(t)}{bG(t) + \theta} \right)(1+o(1)). \tag{C.8} \]
Similarly to the element $(kH_k(t_k,s_k))_{2,2}$ in (B.8),
\[ k \frac{d^2 \psi_k(s_k)}{ds^2} = -\frac{\mu''(s_k)}{\mu(s_k)} + \left( \frac{\mu'(s_k)}{\mu(s_k)} \right)^2 - \frac{\mu'(s_k)}{\mu(s_k)} \frac{aG(t)}{1 - e^{-a s_k G(t)}} + \frac{abG(t)^2}{1 - e^{-a s_k G(t)}}. \tag{C.9} \]
For the general exponential class, the ratio
\[ \frac{\mu''(s_k)}{\mu(s_k)} = \frac{h''(s_k)}{h(s_k)} - 2\theta \frac{h'(s_k)}{h(s_k)} + \theta^2. \]
As a consequence, $k \frac{d^2 \psi_k(s_k)}{ds^2}$ converges to a positive constant, which means that $s_k$ is an actual minimum. Then $J(k) = c\, e^{-k\psi_k(s_k)}(1+o(1))$. Using this in (C.1) and ignoring the constants,
\[ \begin{aligned} P(M_t = k) &= \frac{\Gamma(k+b/a)}{\Gamma(b/a)\Gamma(k+1)}\, e^{-s_k bG(t)} \mu(s_k)(1+o(1)) \\ &= k^{-1} k^{-b/a} k^{b/a}\, h(s_k)\, k^{-\theta/(aG(t))}(1+o(1)) = h(s_k)\, k^{-(1+\theta/(aG(t)))}(1+o(1)), \end{aligned} \tag{C.10} \]
which is a power-law distribution with exponent $\tau(t) = 1 + \theta/(aG(t))$, and minor corrections given by $h(s_k)$. This holds for every $t \ge 0$. In particular, considering $G(\infty)$ instead of $G(t)$, with the same argument we can also prove that the distribution of the total number of children obeys a power-law tail with exponent $\tau(\infty) = 1 + \theta/(aG(\infty))$.

We now prove the result on the limiting distribution $(p_k)_{k \in \mathbb{N}}$ of the CTBP, for which we directly apply Proposition 5.5, using the analysis of the Hessian matrix given in Section B.2. First of all, from (B.9) it follows that
\[ s_k = \frac{1}{aG(t_k)} \log\left( k\, \frac{aG(t_k)}{bG(t_k) + \theta} \right)(1+o(1)), \tag{C.11} \]
and by (B.10)
\[ s_k\, g(t_k) \xrightarrow{k \to \infty} \alpha^* \frac{G(\infty)}{\theta}. \tag{C.12} \]
For the Hessian matrix, using (C.11) and (C.12) in (B.8), for any integrable aging function $g$ we have
\[ (kH_k(t_k,s_k))_{2,2} = C + o(1) > 0, \qquad \text{and} \qquad (kH_k(t_k,s_k))_{1,2} = o(1), \]
but $(kH_k(t_k,s_k))_{1,1}$ behaves according to $g'(t_k)/g(t_k)$. If this ratio is bounded, then $(kH_k(t_k,s_k))_{1,1} = C' + o(1) > 0$, while $(kH_k(t_k,s_k))_{1,1} \to \infty$ whenever $g'(t_k)/g(t_k)$ diverges. In both cases, $(t_k, s_k)$ is a minimum. In particular, again ignoring the multiplicative constants and using (C.11) and (C.12) in the definition of $\Psi_k(t,s)$, the limiting degree distribution of the CTBP is asymptotic to
\[ k^{-(1+\theta/(aG(t_k)))}\, h(s_k)\, e^{-\alpha^* t_k} \left( \tilde{C} - \alpha^* \frac{g'(t_k)}{g(t_k)} \right)^{-1/2}, \tag{C.13} \]
where the term $\left( \tilde{C} - \alpha^* \frac{g'(t_k)}{g(t_k)} \right)^{-1/2}$, which comes from the determinant of the Hessian matrix, behaves differently according to the aging function. With this, the proof of Proposition 5.6 is complete.

C.2 Proof of Proposition 5.7.

This proof is identical to the proof of Proposition 5.6, but this time we consider a sub-exponential distribution. First, we start looking at the distribution of the birth process at a fixed time $t \ge 0$. We define $\psi_k(s)$ and $J(k)$ as in (C.3) and (C.2). We use again (C.1), so
\[ s_k = \frac{1}{aG(t)} \log\left( k\, \frac{aG(t)}{bG(t) - \mu'(s_k)/\mu(s_k)} \right). \]
In this case, we have
\[ \frac{\mu'(s)}{\mu(s)} = -\theta(1+\varepsilon) s^{\varepsilon}. \tag{C.14} \]
Then $s_k$ satisfies
\[ s_k = \frac{1}{aG(t)} \log\left( k\, \frac{aG(t)}{bG(t) + \theta(1+\varepsilon) s_k^{\varepsilon}} \right). \tag{C.15} \]
By substitution, it is easy to check that $s_k$ is approximately $c_1 \log k - c_2 \log\log k = c_1 \log k \left( 1 - \frac{c_2}{c_1} \frac{\log\log k}{\log k} \right)$, for some positive constants $c_1$ and $c_2$. This means that, to first order, $s_k$ is still of logarithmic order. Then,
\[ \frac{\mu''(s)}{\mu(s)} = \theta^2 (1+\varepsilon)^2 s^{2\varepsilon} - \theta(1+\varepsilon)\varepsilon s^{\varepsilon-1}. \tag{C.16} \]
Using (C.14) and (C.16), we can write
\[ \begin{aligned} k \frac{d^2 \psi_k(s_k)}{ds^2} &= \theta(1+\varepsilon)\varepsilon s_k^{\varepsilon-1} + \theta(1+\varepsilon) s_k^{\varepsilon} \frac{aG(t)}{1 - e^{-a s_k G(t)}} + \frac{abG(t)^2}{1 - e^{-a s_k G(t)}} \\ &= \theta(1+\varepsilon)\varepsilon s_k^{\varepsilon-1} + \left( \theta(1+\varepsilon) s_k^{\varepsilon} + bG(t) \right) \left( aG(t) + \frac{1}{k} \left( bG(t) + \theta(1+\varepsilon) s_k^{\varepsilon} \right) \right). \end{aligned} \tag{C.17} \]
The dominant term is $c_3 s_k^{\varepsilon}$, for some constant $c_3$. This means $k \frac{d^2 \psi_k(s_k)}{ds^2}$ is of order $(\log k)^{\varepsilon}$. Now,
\[ \begin{aligned} J(k) &= \left( k \frac{d^2 \psi_k(s_k)}{ds^2} \right)^{-1/2} C\, e^{-bG(t) s_k - \theta s_k^{1+\varepsilon} + k \log\left( 1 - e^{-aG(t) s_k} \right)}(1+o(1)) \\ &= (\log k)^{-\varepsilon/2}\, k^{-b/a}\, e^{-\theta (\log k)^{1+\varepsilon}}(1+o(1)). \end{aligned} \tag{C.18} \]
As a consequence,
\[ P(M_t = k) = k^{-1} (\log k)^{-\varepsilon/2}\, e^{-\theta (\log k)^{1+\varepsilon}}(1+o(1)), \tag{C.19} \]
which is not a power-law distribution. Again using similar arguments, we show that the limiting degree distribution of the CTBP does not show a power-law tail. In this case
\[ s_k = \frac{1}{aG(t_k)} \log\left( k\, \frac{aG(t_k)}{bG(t_k) + \theta(1+\varepsilon) s_k^{\varepsilon}} \right), \]
and
\[ s_k\, g(t_k) = \frac{\alpha^* G(t_k)}{\theta(1+\varepsilon) s_k^{\varepsilon}} = \frac{\alpha^* G(t_k)}{\log^{\varepsilon} k}(1+o(1)) \to 0. \]
The Hessian matrix elements are
\[ \begin{aligned} (kH_k(t_k,s_k))_{1,1} &= \frac{a\alpha^* G(t_k)}{s_k^{\varepsilon}} - a\alpha^* \frac{g'(t_k)}{g(t_k)} + o(1), \\ (kH_k(t_k,s_k))_{2,2} &= \theta(1+\varepsilon)\varepsilon s_k^{\varepsilon-1} + \theta(1+\varepsilon) s_k^{\varepsilon}\, aG(\infty) + abG(\infty)^2 + o(1), \\ (kH_k(t_k,s_k))_{1,2} &= o(1). \end{aligned} \tag{C.20} \]
This implies that
\[ \det\left( kH_k(t_k,s_k) \right) = C - s_k^{\varepsilon} \frac{g'(t_k)}{g(t_k)} + o(1) > 0. \tag{C.21} \]
As a consequence, $(t_k, s_k)$ is an actual minimum. Then using the definition of $\Psi_k(t,s)$,
\[ p_k = e^{-\alpha^* t_k}\, k^{-1+b/a}\, k^{-b/a}\, \mu(s_k) \sim e^{-\alpha^* t_k}\, k^{-1} \left( C - s_k^{\varepsilon} \frac{g'(t_k)}{g(t_k)} \right)^{-1/2} e^{-\frac{\theta}{(aG(\infty))^{1+\varepsilon}} (\log k)^{1+\varepsilon}}(1+o(1)). \tag{C.22} \]
This completes the proof.

Acknowledgments.

We are grateful to Nelly Litvak and Shankar Bhamidi for discussions on preferential attachment models and their applications, and Vincent Traag and Ludo Waltman from CWTS for discussions about citation networks as well as the use of Web of Science data. This work is supported in part by the Netherlands Organisation for Scientific Research (NWO) through the Gravitation Networks grant 024.002.003. The work of RvdH is further supported by the Netherlands Organisation for Scientific Research (NWO) through VICI grant 639.033.806.
