Scaling of load in communications networks
Onuttom Narayan and Iraj Saniee
Department of Physics, University of California, Santa Cruz, CA 95064 and Mathematics of Networks Department, Bell Laboratories, Alcatel-Lucent, 600 Mountain Avenue, Murray Hill, NJ 07974
(Dated: October 29, 2018)

We show that the load at each node in a preferential attachment network scales as a power of the degree of the node. For a network whose degree distribution is $p(k) \sim k^{-\gamma}$, we show that the load is $l(k) \sim k^{\eta}$ with $\eta = \gamma - 1$, implying that the probability distribution for the load is $p(l) \sim 1/l^2$ independent of $\gamma$. The results are obtained through scaling arguments supported by finite size scaling studies. They contradict earlier claims, but are in agreement with the exact solution for the special case of tree graphs. Results are also presented for real communications networks at the IP layer, using the latest available data. Our analysis of the data shows relatively poor power-law degree distributions as compared to the scaling of the load versus degree. This emphasizes the importance of the load in network analysis.
PACS numbers:
A variety of problems in fields ranging from the social sciences to biology to engineering deal with networks, for which a unified understanding has been sought [1, 2, 3]. One of the models commonly used is the preferential attachment (PA) model due to Barabasi and Albert [4] and its generalizations [5, 6, 7]. This model generates scale-free networks, in which the probability for a node to have a degree $k$ scales as $p(k) \sim k^{-\gamma}$. Depending on the parameters of the model, $\gamma$ can be varied continuously over the range $2 < \gamma \le \infty$. It has been argued that this model is appropriate to describe communications networks such as the Internet.

Among the various properties of PA networks that have been studied is the distribution of load (with uniform demand) at different nodes of the network. This is defined by assuming that one unit of traffic flows between each pair of nodes in the network along the shortest path connecting them [8]. (If there are multiple shortest paths between a pair of nodes, the traffic between them is divided equally among all the shortest paths [9].) In this setting, the amount of traffic flowing through a node is its load. Based on numerical simulations [10], it was claimed that the probability distribution for the load scales as $p(l) \sim 1/l^{\delta}$ with $\delta = 2.2$. Subsequently, data for networks in various different fields were presented, and it was argued [11] that there are two universality classes with $\delta = 2$ and $\delta = 2.2$. These claims were disputed [12] on the basis of further numerical simulations, which seemed to indicate that $\delta$ varies continuously with $\gamma$ and is therefore not universal. For the special case of PA networks that are trees, it was argued [13] and then proved [14] that $\delta = 2$.

In this paper, we show that the average load at nodes of degree $k$ scales as $l(k) \sim k^{\eta}$ with $\eta = \gamma - 1$ for any $\gamma$. (As $k$ is increased for fixed $N$, finite size effects are seen.)
If we assume that the distribution of load for fixed $k$ and $N$ does not have an anomalously large width, this implies that the exponent $\delta$ is universal, but is equal to 2, contradicting the earlier claims [10, 11, 12], and extending the exact result for PA trees. We also extend the analytical proof of Ref. [14] for tree graphs to show directly that $\eta = 2$, supporting the assumption that the distribution of $l$ for fixed $k$ is not anomalous. Our results are obtained by simple scaling arguments that are reinforced by finite size scaling studies. The deviations from this universal result that are observed [10, 12] are due to finite size scaling and subleading corrections to the asymptotic scaling form.

We also show results for load analyses on networks drawn from a recent database [15] of connectivity of communications networks at the IP layer. The data are collected with new measurement techniques, and find many more routers and links than earlier studies [16]. The results demonstrate that the scaling of the load $l(k) \sim k^{\eta}$ is much clearer than that of the more commonly studied degree distribution.

In the generalized PA model, a network grows one node at a time. Each node is born with $m$ undirected edges which are attached to preexisting nodes. The probability of attachment to a preexisting node of degree $k$ is proportional to $k + k_0$. Thus $k_0$ and $m$ are the parameters of the model, with $k_0 > -m$. For an infinite network, it can be shown that the probability of a randomly chosen node having a degree $k$ is

$$p_k \propto k^{-\gamma}, \qquad \gamma = 3 + k_0/m \tag{1}$$

for large $k$. For such a network, we assume (as verified later through numerical simulations) that the average load $l_N(k)$ at all the nodes of degree $k$ in a network of $N$ nodes has the scaling form

$$l_N(k) = N k^{\eta}\, \hat{l}(k/N^{\mu}) \tag{2}$$

where $\hat{l}(x)$ tends to a constant as $x \to 0$ and $\hat{l}(x) \to$ [...] as $x \to \infty$. The prefactor of $N$ is reasonable, since most of the load at nodes near the periphery of the network, for which $k \sim O(1)$, is due to traffic that starts or ends there, and is therefore $O(N)$.
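The growth rule and the load definition described above can be sketched in code. The following is a minimal illustration, not the procedure used for the simulations reported here: the function names, the choice of a complete graph on $m + 1$ nodes as the seed, and the redrawing of duplicate targets (so that no multi-edges appear) are all assumptions made for the example. The load is computed with a Brandes-style accumulation over breadth-first searches, counting one unit of traffic per unordered pair of nodes, splitting it equally among all shortest paths, and including the two endpoints of each pair.

```python
import random
from collections import defaultdict, deque


def generalized_pa_graph(n, m, k0, seed=None):
    """Grow a graph node by node; each new node brings m edges whose targets
    are drawn with probability proportional to (degree + k0)."""
    if k0 <= -m:
        raise ValueError("need k0 > -m so that every weight is positive")
    rng = random.Random(seed)
    # Assumption: the seed graph is a complete graph on m + 1 nodes.
    adj = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    for new in range(m + 1, n):
        nodes = list(adj)
        weights = [len(adj[v]) + k0 for v in nodes]
        targets = set()
        while len(targets) < m:  # assumption: redraw duplicates, no multi-edges
            targets.add(rng.choices(nodes, weights)[0])
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
    return adj


def load(adj):
    """Traffic through each node when every unordered pair of nodes exchanges
    one unit, split equally among all shortest paths (endpoints included)."""
    total = dict.fromkeys(adj, 0.0)
    for s in adj:
        dist = {s: 0}
        sigma = defaultdict(float)  # number of shortest paths from s
        sigma[s] = 1.0
        preds = defaultdict(list)   # predecessors on shortest paths from s
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(order, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                total[w] += delta[w] + 1.0  # +1: w is the far endpoint of (s, w)
        total[s] += len(order) - 1          # s is the near endpoint of each pair
    # Every unordered pair was visited from both ends, so halve the totals.
    return {v: x / 2.0 for v, x in total.items()}


def mean_load_by_degree(adj):
    """Average the load over all nodes that share the same degree."""
    loads = load(adj)
    sums, counts = defaultdict(float), defaultdict(int)
    for v, nbrs in adj.items():
        sums[len(nbrs)] += loads[v]
        counts[len(nbrs)] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}


if __name__ == "__main__":
    g = generalized_pa_graph(400, 1, 0, seed=0)
    for k, l in mean_load_by_degree(g).items():
        print(k, l / len(g))  # load divided by N, as a function of degree
```

For a tree ($m = 1$) every leaf carries only the traffic that starts or ends there, so its load in this sketch is exactly $N - 1$, which is the $O(N)$ behavior invoked for the prefactor in Eq. (2). The generator costs $O(N^2)$ and the load computation $O(N)$ breadth-first searches, so the sketch is only practical for small networks.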
The exponent $\eta$ can be found by noting that $\sum_k [N p_k]\, l_N(k)$ is the total traffic flowing in the network. Since $N^2$ units of traffic are generated in the graph, and the average geodesic length is $\sim \ln N$, the sum should scale as $\sim N^2 \ln N$ for large $N$. From Eqs. (1) and (2), this implies that

$$\eta = \gamma - 1. \tag{3}$$

To find the exponent $\mu$, we note that for a network of $N$ nodes, the maximum degree $k_{\max}$ that is achieved can be estimated by requiring $\left(1 - \sum_{k_{\max}}^{\infty} p_k\right)^N$ to be $\sim O(1)$. From Eq. (1), for large $k_{\max}$ this is equivalent to $\exp[-AN/k_{\max}^{\gamma - 1}] \sim O(1)$ with some constant $A$, from which $k_{\max} \sim N^{1/(\gamma - 1)}$. If we assume that all characteristic $k$'s scale with $N$ in the same way, we obtain

$$\mu = 1/(\gamma - 1). \tag{4}$$

For $k \ll N^{\mu}$, Eq. (2) implies that $l_N(k) \sim N k^{\eta}$. When combined with Eq. (1), we have $p(l)\,dl \sim dl/l^{\delta}$ with

$$\delta = \frac{\gamma - 1}{\eta} + 1 = 2 \tag{5}$$

where we have used Eq. (3).

Although these results are plausible, they are based on assumptions, most notably the scaling hypothesis of Eq. (2) itself. To check these assumptions, we turn to finite size scaling numerical simulations. Networks with $N$ ranging from 500 to 8000 or 16000 were generated for different values of $m$ and $k_0$. We considered the cases $k_0 = 0$ with $m = 1$, [...] and $m = 6$ with $k_0 = 1$, [...]. For each choice of $k_0$, $m$ and $N$, 100 graphs were generated, and $l_N(k)$ was calculated by averaging over all the nodes of degree $k$ in the 100 graphs. For the plots, it is more convenient to use the form $l_N(k) = N^2\, \tilde{l}(k/N^{\mu})$ instead of Eq. (2), which one obtains if one uses Eqs. (4) and (3).

Figure 1 shows the results for $m = 1$ and $k_0 = 0$, for which $\mu = 1/2$ [...] $l(k) \sim k^{[\ldots]}$, consistent with earlier results [13]. In view of the analytical results for PA tree graphs, this discrepancy must be attributed to finite size scaling effects which flatten the curve for large $k$ and presumably reduce the apparent value of $\eta$. The same discrepancy is seen for $k_0 = 0$ with $m = 2$, [...], 6; it is reasonable to attribute it to the same cause.

Figure 2 shows a similar scaling plot for $m = 6$ and $k_0 = 3$, for which $\mu = 2/5$. The scaling collapse and the fit to $l(k) \sim k^{2.5}$ are both very good. Figure 3 shows the corresponding plot for $m = 6$ and $k_0 = 5$. The scaling collapse is again very good, but finite size corrections for large $k$ now increase the slope of the curve. Thus the fit to the predicted form of $\sim k^{17/6}$ only works when $k <$ [...]
FIG. 1: Average traffic at nodes of degree $k$ as a function of $k$ for the PA model with $m = 1$, $k_0 = 0$. (Here and in the subsequent figures, [...] $N$.) A scaling collapse with the exponents from Eqs. (3) and (4) works reasonably well. However, a straight line with the predicted slope of 2.0 is shown and only fits the curve, if at all, for small $k$. Similar deviations are seen for $m = 2$, [...], $k_0 = 0$.
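As a quick check of the exponents used in the scaling plots, Eqs. (1), (3), (4) and (5) can be evaluated exactly for the three parameter sets shown in the figures. The short sketch below uses exact rational arithmetic; the function name `pa_exponents` is introduced here only for illustration.

```python
from fractions import Fraction


def pa_exponents(m, k0):
    """Return (gamma, eta, mu, delta) for the generalized PA model."""
    gamma = 3 + Fraction(k0, m)    # Eq. (1): degree exponent
    eta = gamma - 1                # Eq. (3): load versus degree
    mu = 1 / (gamma - 1)           # Eq. (4): scaling of the maximum degree
    delta = (gamma - 1) / eta + 1  # Eq. (5): load distribution exponent
    return gamma, eta, mu, delta


for m, k0 in [(1, 0), (6, 3), (6, 5)]:
    gamma, eta, mu, delta = pa_exponents(m, k0)
    print(f"m={m}, k0={k0}: gamma={gamma}, eta={eta}, mu={mu}, delta={delta}")
```

This gives $\eta = 2$, $\mu = 1/2$ for $m = 1$, $k_0 = 0$; $\eta = 5/2$, $\mu = 2/5$ for $m = 6$, $k_0 = 3$; and $\eta = 17/6$, $\mu = 6/17$ for $m = 6$, $k_0 = 5$, with $\delta = 2$ in every case.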
[Plot: Load/$N^2$ versus $k/N^{\mu}$ on logarithmic axes, with curves for $N = 2000$, 4000 and 8000.]
FIG. 2: Scaling plot of the average traffic at nodes of degree $k$ as a function of $k$ for the PA model with $m = 6$, $k_0 = 3$. The scaling collapse is very good, as is the fit to $\sim k^{5/2}$ in the scaling regime.

[...] for $m = 6$ and $k_0 = 1$, [...] $k_0 <$ [...], finite size corrections reduce the apparent $\eta$ for large $k$, while for $k_0 >$ [...], they increase the apparent $\eta$ for large $k$. This is consistent with Ref. [12].

For the tree graphs generated by the PA model with $m = 1$, $k_0 = 0$, the result $l(k) \sim k^2$ follows from $p(k) \sim 1/k^3$ and $p(l) \sim 1/l^2$ if we assume that $l(k)$ scales as a
[Plot: Load/$N^2$ versus $k/N^{\mu}$ on logarithmic axes, with curves for $N = 2000$, 4000, 8000 and 16000.]
FIG. 3: A plot similar to Figure 2 but with $k_0 = 5$. For $k \sim O(1)$, the individual curves pull away from the scaling form. For $k \sim O(N^{6/17})$, finite size effects cause the curves to bend upwards. Between these two regimes, the slope is consistent with $\sim k^{17/6}$ as predicted.

power of $k$ and that the distribution of $l$ for fixed $k$ is not anomalously broad. Although these are reasonable assumptions, it is not difficult to prove $l(k) \sim k^2$ directly. The probability that the node created at time $\tau$ will be attached to a preexisting node of degree $k$ is equal to $k/(2\tau - 2)$. Therefore the probability that a node created at time $\tau$ will have exactly $k$ nodes subsequently attached to it is

$$p_{k+1,N}(\tau) = \sum_{\tau < \tau_1 < \cdots < \tau_k}^{N} P(\tau + 1, \tau_1 - 1, 1)\, \frac{1}{2(\tau_1 - 1)}\, P(\tau_1 + 1, \tau_2 - 1, 2)\, \frac{2}{2(\tau_2 - 1)} \cdots P(\tau_{k-1} + 1, \tau_k - 1, k)\, \frac{k}{2(\tau_k - 1)}\, P(\tau_k + 1, N, k + 1) \tag{6}$$

where

$$P(\tau, \tau', m) = \left(1 - \frac{m}{2\tau - 2}\right)\left(1 - \frac{m}{2\tau}\right) \cdots \left(1 - \frac{m}{2\tau' - 2}\right). \tag{7}$$

Writing $P(\tau, \tau', m)$ as the exponential of an integral instead of a sum in the approximation $\tau \gg 1$, we have $P(\tau, \tau', m) = (\tau/\tau')^{m/2}$. With this, Eq. (6) simplifies to

$$p_{k+1,N}(\tau) \approx \sum_{\tau_1 = \tau}^{N} \cdots \sum_{\tau_k = \tau}^{N} \sqrt{\frac{\tau}{N}}\; \frac{1}{2\sqrt{\tau_1 N}} \cdots \frac{1}{2\sqrt{\tau_k N}} \tag{8}$$

where we have used the symmetry of the $\tau_i$'s to eliminate the restriction $\tau_1 < \tau_2 \cdots < \tau_k$. If the sums are replaced with integrals, $p_{k+1,N}(\tau) = \sqrt{\tau/N}\, [1 - \sqrt{\tau/N}]^k$ [17] and $p_k = (1/N) \sum_{\tau} p_{k,N}(\tau) \sim 4/[k(k+1)(k+2)]$ for large $N$.

If $n_1, n_2, \ldots n_k$ are the sizes of the $k$ subtrees descending from a node of degree $k + 1$, one can show [14] that for large $N$ the load at the node is proportional to $N \sum n_i$. If $n_i(t)$ is the size of the $i$'th subtree at time $t \ge \tau_i$, with $n_i(N) = n_i$, the probability that $n_i(t + 1) = n_i(t) + 1$ is $(2 n_i - 1)/(2t)$, with the initial condition $n_i(\tau_i) = 1$. Averaging over randomness for fixed $\tau_i$, the solution is $\langle n_i(t) \rangle = (t/\tau_i + 1)/2$. With the symmetrization of the previous paragraph,

$$l_N(k + 1; \tau) \propto N \Big\langle \sum_i n_i \Big\rangle = N k \langle n_i \rangle = \frac{kN}{1 - \sqrt{x}} \int_x^1 \frac{1 + x_i}{4\, x_i^{3/2}}\, dx_i \tag{9}$$

where $\tau_i = N x_i$, $\tau = N x$ and we have replaced sums with integrals. This yields

$$l_N(k + 1) \propto [\ldots] \propto N k^2 \tag{10}$$

where we have only kept the terms that are relevant for $N \gg k \gg 1$. The terms dropped with the $k \gg$ [...]
FIG. 4: Log-log plot of $N/(k\, l(k))$ as a function of $k$ for the Sprintlink network [15]. The dashed line is visually adjusted for best fit, and has a slope of $-2.3$, corresponding to $l(k) \sim k^{\eta}$ with $\eta = 1.3$. The plot also shows the degree distribution $p(k)$ and the cumulative distribution $P(k) = \sum_{k}^{\infty} p(k)$. The inset shows the cumulative load distribution $P(l) = \int_{l}^{\infty} p(l)\, dl$ and a straight line with slope -1.

We now compare with network data from the Rocketfuel database [15], which is the most recent, comprehensive and publicly available collection of measurements of the connectivity between nodes of communications networks at the IP layer. There are ten networks with 121 to 10214 routing nodes (from here on referred to as "routers") in the database. Since the data is insufficient to test scaling functions (in our simulations we worked with 100 networks for each $N$, with $500 < N \lesssim$ [...]
FIG. 5: Plot similar to Figure 4 but with results from all ten networks in the database combined.

[...] law with a slope of 1.3. The degree distribution itself is much more irregular, and it is difficult to say whether it is of the form $\sim k^{-\gamma}$ with $\gamma = \eta + 1$. The cumulative degree distribution is also shown in the figure; it is much smoother, but does not show clear power law behavior. The inset shows the cumulative load distribution and a straight line with the predicted slope of -1, which is equally unconvincing. Figure 5 is a similar figure with all ten networks in the database merged. The plots are straighter, and one could perhaps argue for a narrow power law regime in the cumulative degree distribution as is done in Ref. [15] or a regime with slope $-$[...]

[...] $p(k) \sim k^{-\gamma}$, the average traffic as a function of node degree scales as $l(k) \sim k^{\gamma - 1}$. This is equivalent to the statement that the probability distribution for the load scales as $p(l) \sim 1/l^2$ regardless of $\gamma$. Although the numerical simulations and analytical calculations are for a specific model, the result follows from the scaling assumption and the small-world phenomenon and is therefore more robust. The scaling $l(k) \sim k^{\eta}$ is also seen clearly in networks at the IP layer, and is in fact much better than the degree distribution which has attracted much more interest.

This work was supported by AFOSR grant FA9550-08-1-0064.

[1] R. Albert and A.-L. Barabasi, Rev. Mod. Phys. 74, 47 (2002).
[2] S.N. Dorogovtsev and J.F.F. Mendes, Adv. Phys. 51, 1079 (2002).
[3] M.E.J. Newman, SIAM Review 45, 167 (2003).
[4] A.-L. Barabasi and R. Albert, Science 286, 509 (1999).
[5] S.N. Dorogovtsev, J.F.F. Mendes and A.N. Samukhin, Phys. Rev. Lett. 85, 4633 (2000).
[6] S.N. Dorogovtsev and J.F.F. Mendes, Phys. Rev. E 63, 056125 (2001).
[7] P.L. Krapivsky, S. Redner and F. Leyvraz, Phys. Rev. Lett. 85, 4629 (2000).
[8] This notion of load is sometimes referred to as "betweenness centrality".
[9] We do not distinguish between geodesics that are partly overlapping and those that are completely distinct. Thus if there are three geodesics between a pair of nodes, with the first two overlapping for most of their lengths and the last one completely separate, the traffic along each would be 1/3, resulting in greater traffic over most of the overlapping geodesics.
[10] K-I. Goh, B. Kahng and D. Kim, Phys. Rev. Lett. 87, 278701 (2001).
[11] K-I. Goh, E. Oh, H. Jeong, B. Kahng and D. Kim, Proc. Natl. Acad. Sci. 99, 12583 (2002).
[12] M. Barthelemy, Phys. Rev. Lett. 91, 189803 (2003).
[13] G. Szabo, M. Alava and J. Kertesz, Phys. Rev. E 66, 026101 (2002).
[14] B. Bollobas and O. Riordan, Phys. Rev. E63