Scaling of degree correlations and the influence on diffusion in scale-free networks
arXiv [physics.soc-ph], Aug
Lazaros K. Gallos, Chaoming Song, and Hernán A. Makse
Levich Institute and Physics Department, City College of New York, New York, NY 10031, US
(Dated: October 29, 2018)

Connectivity correlations play an important role in the structure of scale-free networks. While several empirical studies exist, there is no general theoretical analysis that can explain the largely varying behavior of real networks. Here, we use scaling theory to quantify the degree of correlations in the particular case of networks with a power-law degree distribution. These networks are classified in terms of their correlation properties, revealing additional information on their structure. For instance, the studied social networks and the Internet at the router level are clustered around the line of random networks, implying a strongly connected core of hubs. On the contrary, some biological networks and the WWW exhibit strong anti-correlations. The present approach can be used to study robustness or diffusion, where we find that anti-correlations tend to accelerate the diffusion process.
PACS numbers: 89.75.Fb, 89.75.Da, 87.23.Ge
The topological structure of complex networks is largely determined by the way in which the constituent units are interconnected. Correlations in the connectivity of complex networks have been proved to be important and have been used to explain the functionality, robustness, stability, and structure of networks from Biology [1] and Sociology [2] to Computer Science [3]. A study of the correlation profile in a network of protein-protein interactions revealed that links from hubs to non-hub nodes are favored [1], a result with consequences for the stability and the modularity in biological networks. In the case of social networks, Newman has shown that most of them are assortative (i.e. hub-hub correlations dominate the system) [2], and Colizza et al. demonstrated the ‘rich-club’ phenomenon where all hubs tend to form a connected cluster [4]. As a result, social networks are more difficult to immunize and diseases can spread fast. Recently, it was also shown that hub anticorrelations, i.e. the tendency of the hubs not to be directly connected with each other, give rise to fractal networks [5], such as the undirected (symmetrized) WWW, the protein homology network [6] and other biological networks. On the contrary, when there is a large probability of direct hub connections the resulting networks, such as the Internet, the cond-mat co-authorship and other social networks, are non-fractals [7]. In this category falls also the random configuration model [8, 9].

An important topological feature of complex networks is the degree distribution $P(k)$, where $k$ is the number of links for a given node. Although the form of $P(k)$ has a direct influence on the network properties, it cannot convey all the information for the network structure. Thus, two networks can have the same distribution $P(k)$ but with completely different topologies, determined by the presence of degree correlations.
This structure can be captured by the probability $P(k_1,k_2)$ that two nodes of degree $k_1$ and $k_2$ are connected to each other, and by quantities derived from $P(k_1,k_2)$, such as the Pearson coefficient $r$, the average degree of nearest neighbors $k_{nn}$, etc.

Despite their importance, a general theoretical framework to describe and characterize degree correlations in scale-free networks is still work in progress. For an attempt to describe correlations using a master equation approach see [10]. Here, we find that the degree correlations in the studied scale-free networks can be characterized in terms of a correlation exponent $\epsilon$, which we calculate using a renormalization approach. This allows us to propose a classification of a set of dissimilar networks, according to the degree of correlations, into a small number of different classes in a “phase diagram”. For example, biological networks and the WWW are in the strong anti-correlations part of the diagram, while social networks and the Internet are clustered near the region of random networks. We show how we can use these ideas to explore more network properties, such as diffusion and robustness, which depend on the degree of correlations in the network.

We start by recalling the renormalization of a network under a scale transformation. The renormalization procedure tiles a network according to the box-covering algorithm [11], with the minimum number of boxes where the maximum distance in any box is less than $\ell_B$. Each box is subsequently replaced by a node, and links are established between these new ‘super-nodes’ if at least one node included in a box was connected to any node of the other box. These boxes are treated as the nodes of the renormalized network. Renormalization is a reliable method for determining how the network behaves at different length scales.
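The tiling and replacement steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the network is a dictionary mapping each node to the set of its neighbors, the boxes come from a simple greedy rule (not the minimal covering and not the MEMB method of [11]), and the function names `nodes_within`, `greedy_boxes` and `renormalize` are hypothetical.

```python
import random
from collections import deque


def nodes_within(adj, source, radius):
    """Breadth-first search: set of nodes at most `radius` hops from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not expand beyond the allowed radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def greedy_boxes(adj, radius, seed=0):
    """Cover the network with boxes (assumed greedy rule, not minimal).

    Each box holds the still-uncovered nodes within `radius` hops of a
    randomly chosen uncovered center, so any two nodes of a box are at
    most 2 * radius apart, i.e. closer than l_B = 2 * radius + 1.
    """
    order = list(adj)
    random.Random(seed).shuffle(order)
    box_of = {}
    next_box = 0
    for center in order:
        if center in box_of:
            continue
        for node in nodes_within(adj, center, radius):
            if node not in box_of:
                box_of[node] = next_box
        next_box += 1
    return box_of


def renormalize(adj, box_of):
    """Replace every box by a super-node and link two super-nodes when at
    least one original link joins their boxes (multiple links are merged)."""
    new_adj = {box: set() for box in set(box_of.values())}
    for u, neighbors in adj.items():
        for v in neighbors:
            if box_of[u] != box_of[v]:
                new_adj[box_of[u]].add(box_of[v])
                new_adj[box_of[v]].add(box_of[u])
    return new_adj


# Usage on a 6-node path 0-1-2-3-4-5 with l_B = 3 (radius 1).
path = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 5} for i in range(6)}
boxes = greedy_boxes(path, radius=1)
coarse = renormalize(path, boxes)
```

Applying `greedy_boxes` and `renormalize` repeatedly gives the network at successively larger length scales. A faithful reproduction of the procedure in the text would replace the greedy rule with a covering that minimizes the number of boxes.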
Self-similarity is then obtained if the network structure remains invariant under the renormalization.

Alternatively, we can retain multiple links between the boxes when we renormalize a network [12]. As we show below, though, this is not a strong effect, mainly due to preservation of the self-similar structure under renormalization. In contrast, during a random rewiring process degree correlations are destroyed, but bias is introduced if multiple links are forbidden.

FIG. 1: The joint degree distribution $P(k_1,k_2)$ of WWW (top row) and Internet at the router level (bottom row) before renormalization (left), after renormalization forbidding multiple links (center), and including multiple links (right).

We use renormalization and scaling theory to determine the form of $P(k_1,k_2)$. Since the self-similarity of a scale-free network requires the invariance of the degree distribution $P(k)$, a power-law distribution of $P(k) \sim k^{-\gamma}$, where $\gamma$ is the degree exponent, is the only form that can satisfy this condition [5]. Taking this idea one step further, it is interesting to clarify whether correlations between degrees, as expressed by the joint degree distribution $P(k_1,k_2)$, also remain invariant. In Fig. 1 we present an example of this distribution before and after renormalization for the WWW and the Internet at the router level (similar results are derived for other networks, as well). Allowing multiple links between boxes does not significantly modify the result. The statistical similarity of the corresponding plots suggests the invariance of $P(k_1,k_2)$. Accordingly, this suggests that the $k_1$ and $k_2$ dependence can be separated and the behavior of the tail of the joint degree distribution is:

$$P(k_1,k_2) \sim k_1^{-(\gamma-1)}\, k_2^{-\epsilon} \qquad (k_2 > k_1). \tag{1}$$

The value of the first exponent $\gamma - 1$ follows from $\int P(k_1,k_2)\,dk_2 = k_1 P(k_1) \sim k_1^{-(\gamma-1)}$. Equation (1) is also consistent with the known result for completely random networks

$$P(k_1,k_2) \sim k_1 P(k_1)\, k_2 P(k_2) \sim k_1^{1-\gamma}\, k_2^{1-\gamma}, \tag{2}$$

i.e.
the exponent $\epsilon$ for these networks is $\epsilon_{\rm rand} = \gamma - 1$. [...] $k$-dependence was $k^{1-\gamma}$, and for large-degree nodes, with a $k^{-\gamma}$ dependence. Using Eq. (1) we can see that integrating over $k_2$ for low-degree nodes ($k_2 > k_1$) we retrieve the $k_1^{1-\gamma}$ dependence. For the case of hubs, where integration is over $k_1$, the dependence on the degree is $k_2^{-\epsilon}$ and for the yeast protein interaction network we have calculated that $\epsilon = \gamma$ (see Fig. 3). These results are in agreement with the observed behavior in [1].

Next, we introduce a scale-invariant quantity $E_b(k)$ to simplify the estimation of $\epsilon$, even for small networks. We are motivated to introduce this quantity by asking whether a node is significantly linked to more connected nodes, i.e. a node considers another node as a ‘hub’ if its degree is much larger than its own. We define the ratio

$$E_b(k) \equiv \frac{\int_{bk}^{\infty} P(k|k')\,dk'}{\int_{bk}^{\infty} P(k')\,dk'}, \tag{3}$$

as the measure of a node’s preference to connect to neighbors with degree larger than $bk$ ($b$ is an arbitrary positive number, and large $b$ corresponds to the identification of the hubs) [13]. From the scaling of $E_b(k)$ with $k$, we are able to obtain the exponent $\epsilon$ in a simpler way than using $P(k_1,k_2)$, which presents more fluctuations than the average quantity $E_b(k)$. The conditional probability is $P(k|k') = P(k,k')/\int P(k,k')\,dk = P(k,k')/k'^{\,1-\gamma} = k^{-(\gamma-1)}\, k'^{-(1+\epsilon-\gamma)}$. We find for a scale-free distribution:

$$E_b(k) \sim k^{-(\epsilon-\gamma)}. \tag{4}$$

We have verified that the scaling of $E_b(k)$ remains invariant under renormalization. The same scaling exponents are recovered for the renormalized networks, even when multiple links are allowed between two boxes (Fig. 2 inset). In the latter case, the renormalized nodes have in general larger degrees, which means that deviations appear in nodes of smaller degree. Additionally, $\epsilon$ was found to be independent of the value of $b$.
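As a worked check of Eq. (4), the two integrals in Eq. (3) can be evaluated directly with the scaling forms quoted above. This is a sketch under assumptions that the text does not state: pure power laws with prefactors dropped, and $\epsilon > \gamma$ so that the numerator converges at its upper limit.

```latex
\begin{align*}
\int_{bk}^{\infty} P(k|k')\,dk'
  &\sim k^{-(\gamma-1)} \int_{bk}^{\infty} k'^{-(1+\epsilon-\gamma)}\,dk'
   = \frac{b^{-(\epsilon-\gamma)}}{\epsilon-\gamma}\;
     k^{-(\gamma-1)}\,k^{-(\epsilon-\gamma)}
   \sim k^{1-\epsilon}, \\
\int_{bk}^{\infty} P(k')\,dk'
  &\sim \int_{bk}^{\infty} k'^{-\gamma}\,dk'
   = \frac{b^{1-\gamma}}{\gamma-1}\;k^{1-\gamma}, \\
E_b(k)
  &\sim \frac{k^{1-\epsilon}}{k^{1-\gamma}}
   = k^{-(\epsilon-\gamma)} .
\end{align*}
```

In this sketch $b$ enters only through the prefactor, which is in line with the observation that the measured $\epsilon$ does not depend on $b$.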
We notice that other quantities derived from $P(k_1,k_2)$ may not be invariant under renormalization, such as $r$ or $k_{nn}$, and therefore are not suitable to distinguish fractal from non-fractal networks.

In Fig. 2 we present the behavior of $E_b(k)$ for the WWW, protein homology, Internet (router level) and cond-mat co-authorship networks. The existence of a scaling relation over a range of $k$, combined with the invariance of this curve, supports Eq. (4) and the form used for $P(k_1,k_2)$ in Eq. (1). The WWW and the protein homology network have been shown to have a fractal topology. The slope of $E_b(k)$ with $k$ is small or negative in these cases, with values of $\epsilon = 2.$[...] and $\epsilon = 2.4$, respectively. This behavior is in contrast with the two non-fractal networks in the figure, i.e. the Internet at the router level and the cond-mat co-authorship network, where $E_b(k)$ increases almost linearly with increasing $k$. For these networks we find that $\epsilon = 1.$[...] and $\epsilon = 1.6$, respectively.
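The estimation of $\epsilon$ from the slope of $E_b(k)$ can be sketched as follows. This is a minimal illustration, not the procedure used for Fig. 2: it assumes an undirected edge list without self-loops or repeated links, it reads the numerator of Eq. (3) as the fraction of neighbors of degree-$k$ nodes whose degree exceeds $bk$, it takes $\gamma$ as a separately known input, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def node_degrees(edges):
    """Degree of every node in an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def e_b(edges, b=3.0):
    """Empirical E_b(k) under the reading stated in the lead-in.

    Numerator: fraction of link ends leaving degree-k nodes that reach a
    neighbor of degree larger than b*k. Denominator: fraction of all
    nodes with degree larger than b*k. Degrees with an empty
    denominator are skipped.
    """
    deg = node_degrees(edges)
    n_nodes = len(deg)
    ends = defaultdict(int)  # link ends leaving nodes of degree k
    hits = defaultdict(int)  # those ends that reach a degree > b*k
    for u, v in edges:
        for node, neighbor in ((u, v), (v, u)):
            k = deg[node]
            ends[k] += 1
            if deg[neighbor] > b * k:
                hits[k] += 1
    result = {}
    for k in sorted(ends):
        tail = sum(1 for d in deg.values() if d > b * k) / n_nodes
        if tail > 0:
            result[k] = (hits[k] / ends[k]) / tail
    return result


def loglog_slope(points):
    """Least-squares slope of log(y) against log(x); needs at least two
    distinct positive x values."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def estimate_epsilon(edges, gamma, b=3.0):
    """Eq. (4) gives E_b(k) ~ k^(-(epsilon - gamma)), so epsilon = gamma - slope."""
    points = [(k, e) for k, e in e_b(edges, b).items() if e > 0]
    return gamma - loglog_slope(points)


# Usage: a star with one hub (node 0) and five leaves.
star = [(0, leaf) for leaf in range(1, 6)]
star_profile = e_b(star, b=3.0)
```

On real data the fit would normally be restricted to the range of $k$ where $E_b(k)$ follows a power law, and logarithmic binning of $k$ may reduce the noise at large degrees.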
FIG. 2: Plot of $E_b(k)$ versus $k$ for the WWW, protein homology, Internet at the router level and cond-mat network. Different topologies correspond to different scaling behavior with the degree $k$. Inset: Plot of $E_b(k)$ versus $k$ for a) the Internet, b) the renormalized Internet network without multiple links between two nodes, and c) the renormalized Internet network allowing multiple links between two nodes. The data have been vertically shifted in order to show the invariance. In b) and c) we use the MEMB method [11] with $r_B = 3$, and $b = 3$.

In order to interpret the values of $\epsilon$ we now turn to the renormalization scheme [5]. After renormalization the number of nodes $N$ in the network and the degree of a node $k$ scale with $\ell_B$ as power laws with fractal exponent $d_B$ and degree exponent $d_k$, respectively (we use a prime to describe quantities measured in the renormalized network):

$$N \to N' \sim \ell_B^{-d_B} N, \qquad k \to k' \sim \ell_B^{-d_k} k. \tag{5}$$

If $d_B$ and $d_k$ are finite, the network is fractal. If $d_B \to \infty$ and $d_k \to \infty$ (or equivalently the decay is exponential or faster) the network is not fractal.

After tiling the network with boxes of diameter $\ell_B$, each of these boxes has one unique local hub (i.e. the largest degree node in the box). Considering all possible pairs of boxes, we introduce the probability $E(\ell_B)$ that there exists a direct connection between the two hubs of any two boxes. We have shown (see e.g. Figs. 2e, 3d of Ref. [7]) that the probability $E$ scales with the length $\ell_B$ as

$$E(\ell_B) \sim \ell_B^{-d_e}. \tag{6}$$

Below we relate the exponent $\epsilon$ to the hub-hub repulsion through the hub correlation exponent $d_e$, which is crucial for fractality.

The conservation of links in the renormalized network leads to the expression

$$N P(k_1,k_2)\,dk_1\,dk_2 = E(\ell_B)\, N' P'(k_1',k_2')\,dk_1'\,dk_2'. \tag{7}$$

FIG. 3: Classification of scale-free networks [16]. We use the correlation exponent $\epsilon$ in order to quantify the degree of correlations and the fractality of a network, as a function of $\gamma$.
The line $\epsilon_{\rm rand} = \gamma - 1$ corresponds to random networks. The line $\epsilon = 2$ separates fractal ($\epsilon > 2$) from non-fractal networks ($\epsilon \le 2$). The line $\epsilon = \gamma$ describes a fractal tree [7]. The four schematics illustrate networks where hub correlations are stronger than in random networks (area I), weaker than random but non-fractal (area II), non-fractal according to the minimal model of [7] ($\epsilon = 2$), and fractal (area III).

Using Eqs. (1), (5), (6), and (7) we get the relation $\ell_B^{d_B}\,\ell_B^{d_e}\,\ell_B^{(3-\gamma-\epsilon)d_k} = 1$, which finally leads to

$$\epsilon = 2 + d_e/d_k = 2 + (\gamma - 1)\frac{d_e}{d_B}, \tag{8}$$

where we have substituted the value $\gamma = 1 + d_B/d_k$. This relation of $\epsilon$ with $d_e$ shows that correlations between the hubs of the boxes determine the correlations for all degrees, in accordance with the invariance under renormalization.

The direct determination of $\epsilon$ through the slope of $E_b(k)$ vs $k$ enables us to construct a ‘phase diagram’ in the plane $(\epsilon, \gamma)$, shown in Fig. 3. This plot classifies the studied networks into classes according to their degree of correlations, even though they correspond to dissimilar systems in biology, sociology or technology.

As shown in Eq. (2), the exponent for a random network corresponds to the random line $\epsilon_{\rm rand} = \gamma - 1$, which is verified in the plot for different $\gamma$ values of the configuration model. In random network models, correlations arise because links are selected for connecting with each other equiprobably, so that the probability of two hubs being connected is large [14]. Thus, networks that are close to the line $\epsilon_{\rm rand} = \gamma - 1$ [...] $\epsilon_{\rm rand} = \gamma - 1$ [...] $\epsilon_{\rm rand}$ quantifies how different from randomness the network structure is, in terms of degree correlations.

As $\epsilon$ increases from $\epsilon_{\rm rand}$, we expect that at some point the networks will become fractal, due to increased hub-hub anti-correlations. The point of emergent fractality is found through Eq. (8), where the borderline case of $d_B \to \infty$ yields $\epsilon = 2$. Indeed, we have verified via direct measurements of $d_B$ that all the networks above the line $\epsilon = 2$ in Fig. 3 are fractals.

Thus, starting from $\epsilon_{\rm rand}$ we can separate the phase space into areas where the hub correlations are stronger than in random models (area I) or weaker than that (areas II and III). The weak correlation areas II and III are further divided by the line $\epsilon = 2$, which determines whether the anticorrelations are strong enough to result in a fractal network (III) or not (II).

An immediate result from this diagram is the different position of the Internet at the router level compared to the AS level [15]. Although the degree distribution of these two networks is the same ($\gamma = 2.$[...]) [...] $\epsilon$ reveals that there are more hub-hub connections at the router level, similar to the case of a random network. Contrary to that, the AS level exhibits a structure with fewer correlations, deviating from that of a simple random model. This difference may hint at different design principles or requirements at varying levels of the Internet.

The above approach can be directly applied to explore many interesting properties, such as network robustness, synchronization, or diffusion processes. Until now, theoretical studies have been limited to using the uncorrelated version of $P(k'|k) \sim k' P(k')$. The introduction of Eq.
(1) enables us to substitute this form and generalize the problem for networks with known correlation exponents. For example, we can study the effect of correlations on diffusion by starting with the master equation for the density of particles $\rho(k,t)$ on nodes with degree $k$ at time $t$

$$\frac{d\rho(k,t)}{dt} = -\rho(k,t) + k \sum_{k'=k_{\min}}^{\infty} P(k'|k)\,\frac{\rho(k',t)}{k'}, \tag{9}$$

and substitute $P(k'|k)$ with a form derived through Eq. (1). The Laplace transform of the above equation leads to a Fredholm integral equation of the second kind with separable kernel that can be solved analytically. We define the quantity $K(t) = \langle k^x(t) \rangle^{1/x}$, where $x = \gamma - [\ldots] - \epsilon$, and $K(t)$ serves as a measure of the diffusing particles' preference to larger or smaller degrees $k$. The analytical result for $\kappa_x(t)$, defined as $\kappa_x(t) = \langle k^x \rangle(t)$, is $\kappa_x(t) = \kappa_\infty + (\kappa_0 - \kappa_\infty)\, e^{-ct}$, where $\kappa_\infty = (\gamma - [\ldots])/(\epsilon - [\ldots])$ and $\kappa_0 = (\gamma - [\ldots])/\epsilon$ are constants depending on $\gamma$ and $\epsilon$, while $c = (\epsilon - [\ldots])/((\gamma - [\ldots])(\epsilon - \gamma))$. The result for $K(t)$ is displayed in Fig. 4 for networks with $\gamma = 2.75$. The exponential convergence to the asymptotic steady-state configuration depends on the value of $c$, which increases with the exponent $\epsilon$. Networks that are close to the random case $\epsilon_{\rm rand} = \gamma - 1$ [...] $\epsilon$ value enhances anti-correlations in the network and the particles move faster, occupying smaller degree nodes on the average. We can infer, thus, that stronger anti-correlations tend to speed up the diffusion process. The mechanism behind this behavior is as follows: when the hubs are directly connected to each other the particles tend to remain localized in the neighborhood around these hubs, so that it takes longer for them to explore wider areas. On the contrary, when hub anti-correlations are important the particles spend most of their time in the intermediate areas which are formed by smaller degree nodes and which connect indirectly the hubs to each other.

FIG. 4: Influence of correlations on diffusion, for scale-free networks with $\gamma = 2.75$. The presented $\epsilon$ values correspond to different areas in Fig. 3. Decreasing hub-hub correlations (top to bottom) leads to faster convergence towards equilibrium, i.e. diffusion is accelerated.

We acknowledge valuable discussions with Shlomo Havlin and support from NSF grants.

[1] S. Maslov and K. Sneppen, Science , 910 (2002).
[2] M.E.J. Newman, Phys. Rev. Lett. , 208701 (2002).
[3] R. Pastor-Satorras, A. Vázquez, and A. Vespignani, Phys. Rev. Lett. , 258701 (2001).
[4] V. Colizza, A. Flammini, M.A. Serrano, and A. Vespignani, Nature Physics , 110 (2006).
[5] C. Song, S. Havlin, and H.A. Makse, Nature , 392 (2005).
[6] A.T. Adai, S.V. Date, S. Wieland, and E.M. Marcotte, J. Mol. Biol. , 179 (2004).
[7] C. Song, S. Havlin, and H.A. Makse, Nature Physics , 275 (2006).
[8] M. Molloy and B.A. Reed, Random Struct. Algorithms , 161 (1995).
[9] A.-L. Barabási and R. Albert, Science , 509 (1999).
[10] P. Krapivsky and S. Redner, Phys. Rev. E , 066123 (2001).
[11] C. Song, L.K. Gallos, S. Havlin, and H.A. Makse, J. Stat. Mech. P03006 (2007).
[12] S. Maslov, K. Sneppen, and A. Zaliznyak, Physica A , 529 (2004).
[13] Notice that we could have defined $E_b(k)$ without using the denominator in Eq. (3), i.e. $E_b(k) \equiv \int_{bk}^{\infty} P(k|k')\,dk'$, in which case we get $E_b(k) \sim k^{1-\epsilon}$. However, due to the large fluctuations occurring at the tails of the degree distribution in real-life networks, the calculation of $\epsilon$ through Eq. (3) proved to be more robust.
[14] M. Catanzaro, M. Boguna, and R. Pastor-Satorras, Phys. Rev. E 71