Entropy Rate of Diffusion Processes on Complex Networks
arXiv [cond-mat.stat-mech], Dec
Jesús Gómez-Gardeñes and Vito Latora
Scuola Superiore di Catania, Via S. Paolo 73, 95123 Catania, Italy
Institute for Biocomputation and Physics of Complex Systems (BIFI), University of Zaragoza, Zaragoza 50009, Spain
Dipartimento di Fisica e Astronomia, Università di Catania and INFN, Via S. Sofia 64, 95123 Catania, Italy
(Dated: October 22, 2018)

The concept of entropy rate for a dynamical process on a graph is introduced. We study diffusion processes where the node degrees are used as local information by the random walkers. We describe analytically and numerically how the degree heterogeneity and correlations affect the diffusion entropy rate. In addition, the entropy rate is used to characterize complex networks from the real world. Our results point out how to design optimal diffusion processes that maximize the entropy for a given network structure, providing a new theoretical tool with applications to social, technological and communication networks.
PACS numbers: 02.50.Ga,89.75.Fb,89.75.Hc
Entropy is a key concept in statistical thermodynamics [1], in the theory of dynamical systems [2], and in information theory [3]. In the realm of complex networks [4, 5], the entropy has been used as a measure to characterize properties of the topology, such as the degree distribution of a graph [6], or the shortest paths between pairs of nodes (with the main interest in quantifying the information associated with locating specific addresses [7], or with sending signals in the network [8]). Alternatively, various authors have studied the entropy associated with ensembles of graphs, and provided, via the application of the maximum entropy principle, the best prediction of network properties subject to the constraints imposed by a given set of observations [9, 10, 11]. An approach of this type plays the same role in networks as is played by the Boltzmann distribution in statistical thermodynamics [1].

The main theoretical and empirical interest in the study of complex networks is in understanding the relations between structure and function. Moreover, many of the interaction dynamics that take place in social, biological and technological systems can be analyzed in terms of diffusion processes on top of complex networks, e.g. data search and routing, information and disease spreading [4, 5]. It is therefore of utmost importance to relate the properties of a diffusion process to the structure of the underlying network.

In this Letter, we show how to associate an entropy rate to a diffusion process on a graph. In particular, we consider processes such as biased random walks on the graph that can be represented as ergodic Markov chains. In this context, the entropy rate is a quantity more similar to the Kolmogorov-Sinai entropy rate of a dynamical system [12, 13] than to the entropy of a statistical ensemble [1, 4].
Differently from the network entropies previously defined, the entropy rate of diffusion processes depends both on the dynamical process (the kind of bias in the random walker) and on the graph topology. We provide the analytical expression that describes the entropy rate in scale-free networks as a function of the bias in the walk, and of the degree distribution and correlations. We show how the values of the entropy rate can provide useful information to characterize diffusion processes in real-world networks. In particular, the maximum value of the entropy is found for different types of bias in the diffusion process, depending on the network structure.

Let us consider a connected undirected graph with $N$ nodes (labelled as $1, 2, \ldots, N$) and $K$ links, described by the adjacency matrix $A = \{a_{ij}\}$. We limit our discussion to diffusion processes on the graph that can be represented as Markov chains [3]. In particular, we consider the case of biased random walks in which, at each time step, the walker at node $i$ chooses one of the first neighbors of $i$, say $j$, with a probability proportional to the power $\alpha$ ($\alpha \in \mathbb{R}$) of the degree $k_j$. Such a biased random walk corresponds to a time-invariant (the rule does not change in time) Markov chain with a transition probability matrix $\Pi$, with elements:

$$\pi_{ji} = \frac{a_{ij} k_j^{\alpha}}{\sum_{l} a_{il} k_l^{\alpha}} . \tag{1}$$

Notice that $\Pi$ depends on both the graph topology and the kind of stochastic process we are considering. The exponent $\alpha$ allows us to tune the dependence of the diffusion process on the nodes' degree. When $\alpha \neq 0$ we are introducing in the random movement of the particle a bias towards high-degree ($\alpha > 0$) or low-degree ($\alpha < 0$) neighbors. On the other hand, when $\alpha = 0$ the standard (unbiased) random walk is recovered. Since the walker must move from a node to somewhere, we have $\sum_j \pi_{ji} = 1$, thus $\Pi$ is a stochastic matrix. If $w_i(t)$ is the probability that the random walker is at node $i$ at time $t$ (with $\sum_{i=1}^{N} w_i(t) = 1 \ \forall t$), then the probability $w_j(t+1)$ of being at node $j$ one step later is $w_j(t+1) = \sum_i \pi_{ji} w_i(t)$. Writing the probabilities $w_i(t)$ as an $N$-dimensional column vector $\mathbf{w}(t) = (w_1(t), w_2(t), \ldots, w_N(t))^{\top}$, the rule of the walk can be expressed in matrix form as $\mathbf{w}(t+1) = \Pi\, \mathbf{w}(t)$. In the case of an undirected and connected network, the Perron-Frobenius theorem [14] assures that the dynamics described by Eq. (1) is an ergodic Markov chain [3]. This means that the Markov chain has a unique stationary distribution $\mathbf{w}^*$, such that $\lim_{t \to \infty} \Pi^t \mathbf{w}(0) = \mathbf{w}^*$ for any initial distribution $\mathbf{w}(0)$. In other words, any initial distribution of the random walker over the nodes of the graph will converge, under the dynamics of Eq. (1), to the same distribution $\mathbf{w}^*$.

The dynamical properties of the above diffusion processes over the graph can be characterized by evaluating the entropy rate of the associated Markov chain that, in the case of an ergodic Markov chain, is given by [3]:

$$h = -\sum_{i,j} \pi_{ji}\, w^*_i \ln(\pi_{ji}) . \tag{2}$$

The value of $h$ measures how the entropy of the biased random walk grows with the number of hops. This means that we can practically represent the typical sequences of length $n$ generated by the diffusion process by using approximately $n \cdot h$ information units. In other words, $h$ measures the spreading of a set of independent random walkers, in terms of number of visited nodes.

To evaluate $h$ for a given graph we need to calculate the stationary probability distribution $\mathbf{w}^*$. For this purpose, we consider the probability $W_{i \to j}(t)$ of going from node $i$ to node $j$ in $t$ time steps,

$$W_{i \to j}(t) = \sum_{j_1, j_2, \ldots, j_{t-1}} \pi_{j_1 i}\, \pi_{j_2 j_1} \cdots \pi_{j\, j_{t-1}} . \tag{3}$$

Since the network is undirected we have $a_{ij} = a_{ji} \ \forall i, j$. Hence, the relation between the two probabilities $W_{i \to j}(t)$ and $W_{j \to i}(t)$ can be written as:

$$c_i k_i^{\alpha}\, W_{i \to j}(t) = c_j k_j^{\alpha}\, W_{j \to i}(t) , \tag{4}$$

where $c_i = \sum_j a_{ij} k_j^{\alpha}$. The above relation implies that for the stationary distribution $\mathbf{w}^*$ the equation $c_i k_i^{\alpha} w^*_j = c_j k_j^{\alpha} w^*_i$ holds, and hence $\mathbf{w}^*$ reads:

$$w^*_i = \frac{c_i k_i^{\alpha}}{\sum_l c_l k_l^{\alpha}} . \tag{5}$$

By plugging expressions (1) and (5) into the definition of entropy (2), we finally get a closed form for the entropy rate of degree-biased random walks on the graph:

$$h = -\frac{\sum_i k_i^{\alpha} \sum_j a_{ij} k_j^{\alpha} \ln(k_j^{\alpha}) - \sum_i k_i^{\alpha} c_i \ln(c_i)}{\sum_i c_i k_i^{\alpha}} . \tag{6}$$

We notice that $h$ depends on the kind of bias in the random walker and also on the graph topology. In the following, we first evaluate analytically the entropy rate of unbiased and biased random walks on scale-free (SF) graphs with a power-law degree distribution $P_k \sim k^{-\gamma}$, and $\gamma > 2$ [4, 5]. Then, we study the entropy in networks from the real world.

Unbiased Random Walks.-
In the particular case $\alpha = 0$, the transition probability reads $\pi_{ji} = a_{ij}/k_i$, and the stationary distribution is easily obtained as $w^*_i = k_i/(2K)$. Substituting this expression in Eq. (2) and changing the sum over node indexes into a sum over degree classes, we can write the entropy rate of an unbiased random walk on a network with degree distribution $P_k$ as:

$$h = \frac{N}{2K} \sum_k k P_k \ln(k) = \frac{\langle k \ln(k) \rangle}{\langle k \rangle} . \tag{7}$$
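As an illustration of Eqs. (1), (2), (5) and (7), the following minimal Python sketch builds the transition matrix of the $\alpha$-biased walk on a small toy graph, evaluates the stationary distribution from the closed form of Eq. (5), and computes the entropy rate from Eq. (2). The toy adjacency matrix `TOY_ADJ` and all function names are assumptions made for this example and do not come from the original work; the dense pure-Python representation is chosen for readability only and would not scale to large networks.

```python
import math

# Hypothetical toy graph (not from the paper): 5 nodes, 5 undirected links.
TOY_ADJ = [
    [0, 1, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]

def degrees(adj):
    """Degree k_i of every node of an undirected 0/1 adjacency matrix."""
    return [sum(row) for row in adj]

def bias_sums(adj, alpha):
    """c_i = sum_j a_ij k_j^alpha, the normalization appearing in Eq. (1)."""
    k, n = degrees(adj), len(adj)
    return [sum(k[j] ** alpha for j in range(n) if adj[i][j]) for i in range(n)]

def transition_matrix(adj, alpha):
    """pi[j][i] = a_ij k_j^alpha / c_i: probability of hopping from i to j, Eq. (1)."""
    k, c, n = degrees(adj), bias_sums(adj, alpha), len(adj)
    return [[adj[i][j] * k[j] ** alpha / c[i] for i in range(n)] for j in range(n)]

def stationary_distribution(adj, alpha):
    """w*_i = c_i k_i^alpha / sum_l c_l k_l^alpha, Eq. (5)."""
    k, c = degrees(adj), bias_sums(adj, alpha)
    weights = [c_i * k_i ** alpha for c_i, k_i in zip(c, k)]
    total = sum(weights)
    return [w / total for w in weights]

def entropy_rate(adj, alpha):
    """h = -sum_{i,j} pi_ji w*_i ln(pi_ji), Eq. (2); zero entries are skipped."""
    pi = transition_matrix(adj, alpha)
    w = stationary_distribution(adj, alpha)
    n = len(adj)
    return -sum(pi[j][i] * w[i] * math.log(pi[j][i])
                for i in range(n) for j in range(n) if pi[j][i] > 0)

def unbiased_entropy_rate(adj):
    """<k ln k> / <k>, the alpha = 0 result of Eq. (7)."""
    k = degrees(adj)
    return sum(k_i * math.log(k_i) for k_i in k) / sum(k)

# For alpha = 0 the general expression and Eq. (7) should agree.
print(entropy_rate(TOY_ADJ, 0.0), unbiased_entropy_rate(TOY_ADJ))
# Linear bias towards high-degree neighbors.
print(entropy_rate(TOY_ADJ, 1.0))
```

For larger graphs one would presumably switch to sparse neighbor lists; the formulas themselves are unchanged.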
FIG. 1: Entropy rate $h$ of unbiased random walks on SF networks with $N = 10^{[\ldots]}$ nodes as a function of the exponent $\gamma$ of the degree distribution (vertical axis: $h$; horizontal axis: $\gamma$). Numerical results for synthetic SF networks (circles) are compared with the two analytical curves corresponding to Eq. (8) (dashed line) and to the limit $N \to \infty$ (solid line).

In the case of SF networks of size $N$, the value of $h$ can be easily expressed as a function of $\gamma$ and $N$, taking into account that the maximum degree of the network is $k_{\max} \sim k_0 N^{1/(\gamma-1)}$, with $k_0$ being the minimum degree of a node. From Eq. (7), and approximating $k$ as a continuous variable, we get:

$$h(\gamma, N) = \ln(k_0) + \frac{1}{\gamma-2} + \frac{N^{\frac{2-\gamma}{\gamma-1}} \ln(N)}{(\gamma-1)\left(N^{\frac{2-\gamma}{\gamma-1}} - 1\right)} . \tag{8}$$

In the thermodynamic limit, the above expression diverges for SF networks when $\gamma \to 2$. Conversely, when $\gamma > 2$ SF networks have a finite entropy in the thermodynamic limit: $h(\gamma) = \ln(k_0) + \frac{1}{\gamma-2}$.

In order to check the analytical results we have constructed ensembles of SF networks with $N = 10^{[\ldots]}$ nodes and different values of $\gamma$. We have obtained numerically the stationary distribution $\mathbf{w}^*$, and computed the entropy directly from Eq. (2). The results, averaged over the ensemble of networks, are reported in Fig. 1 as a function of $\gamma$. We notice a good agreement between numerics and Eq. (8).

Biased Random Walks.-
Let us now concentrate on degree-biased diffusion ($\alpha \neq 0$). In this case, the entropy rate of Eq. (6) can be rewritten by changing again the sums over node indexes into sums over degree classes, as:

$$h = \frac{\sum_k k^{\alpha} P_k \left( C_k \ln(C_k) - \alpha k \sum_{k'} k'^{\alpha} P_{k',k} \ln(k') \right)}{\sum_k C_k k^{\alpha} P_k} , \tag{9}$$

where $C_k = k \sum_{k'} k'^{\alpha} P_{k',k}$, and $P_{k',k}$ is the conditional probability that a link from a node of degree $k$ ends in a node with degree $k'$. We notice that the entropy rate of biased random walks depends on the degree distribution of the network, $P_k$, and on the conditional probabilities $P_{k',k}$. In the particular case of a network with no degree-degree correlations we can write $P_{k',k} = k' P_{k'} / \langle k \rangle$, so that $C_k = k \langle k^{\alpha+1} \rangle / \langle k \rangle$, and the expression for the entropy reduces to:

$$h = (1-\alpha) \frac{\langle k^{\alpha+1} \ln(k) \rangle}{\langle k^{\alpha+1} \rangle} + \ln\left( \frac{\langle k^{\alpha+1} \rangle}{\langle k \rangle} \right) . \tag{10}$$

This expression only depends on the degree distribution of the network. For SF networks, we get in the continuum-degree approximation:

$$h(\gamma, \alpha, N) = \frac{1-\alpha}{\gamma-\alpha-2} + \frac{(1-\alpha)\, N^{\frac{\alpha+2-\gamma}{\gamma-1}} \ln(N)}{(\gamma-1)\left(N^{\frac{\alpha+2-\gamma}{\gamma-1}} - 1\right)} + \ln\left[ \frac{k_0 (\gamma-2) \left(N^{\frac{\alpha+2-\gamma}{\gamma-1}} - 1\right)}{(\gamma-\alpha-2) \left(N^{\frac{2-\gamma}{\gamma-1}} - 1\right)} \right] . \tag{11}$$

When $N \to \infty$, the entropy rate in SF networks with $\gamma < \alpha + 2$ diverges as $h \sim \ln(N)$. On the other hand, when $\gamma > \alpha + 2$, the entropy rate in the limit $N \to \infty$ is finite and equal to:

$$h(\gamma, \alpha) = \frac{1-\alpha}{\gamma-\alpha-2} + \ln\left[ \frac{k_0 (\gamma-2)}{\gamma-\alpha-2} \right] . \tag{12}$$

Such an expression, valid in the infinite-size limit, shows a monotonic growth of the entropy $h(\gamma, \alpha)$ with the degree bias $\alpha$, with $h$ tending to infinity as $\alpha \to (\gamma-2)^{-}$. More interestingly, the entropy rate in finite networks, Eq. (11), shows a single maximum at a value of $\alpha$ that depends on $\gamma$. This result is a consequence of the interplay between the diffusion process and the network topology.
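The presence of an optimal bias in a finite uncorrelated network can be explored with a short numerical sketch. The Python code below does not implement the continuum formula of Eq. (11); instead, as a simpler stand-in, it evaluates Eq. (10) with the empirical moments of a finite degree sequence and scans a grid of $\alpha$ values. The helper `powerlaw_degree_sequence`, the grid `ALPHAS`, and the parameter values ($N = 1000$, $\gamma = 3$, $k_0 = 2$) are illustrative assumptions, not choices made in the original work.

```python
import math

def powerlaw_degree_sequence(n, gamma, k_min):
    """Deterministic degree sequence with P_k ~ k^(-gamma) (illustrative helper).

    Degrees are the integer parts of evenly spaced quantiles of the continuous
    power law, so the largest degree scales like k_min * n^(1/(gamma-1)).
    """
    return [int(k_min * (1.0 - (i + 0.5) / n) ** (-1.0 / (gamma - 1.0)))
            for i in range(n)]

def entropy_rate_uncorrelated(degree_seq, alpha):
    """Eq. (10): (1-alpha) <k^(alpha+1) ln k>/<k^(alpha+1)> + ln(<k^(alpha+1)>/<k>)."""
    n = len(degree_seq)
    mean_k = sum(degree_seq) / n
    m = sum(k ** (alpha + 1.0) for k in degree_seq) / n
    m_log = sum(k ** (alpha + 1.0) * math.log(k) for k in degree_seq) / n
    return (1.0 - alpha) * m_log / m + math.log(m / mean_k)

def optimal_bias(degree_seq, alphas):
    """Grid search for the bias that maximizes Eq. (10) on a finite sequence."""
    return max(alphas, key=lambda a: entropy_rate_uncorrelated(degree_seq, a))

ALPHAS = [-2.0 + 0.1 * i for i in range(61)]   # grid of biases from -2 to 4
SEQ = powerlaw_degree_sequence(1000, 3.0, 2)   # assumed N, gamma and k_0
alpha_opt = optimal_bias(SEQ, ALPHAS)
print(alpha_opt, entropy_rate_uncorrelated(SEQ, alpha_opt))
```

Because the moments are taken over a finite, discrete degree sequence, the location of the maximum found this way need not coincide with the one predicted by Eq. (11).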
The existence of this maximum indicates that, for a given network, it is possible to maximize the entropy of the process by suitably tuning the bias $\alpha$ of the walker.

To check the above analytical expressions we have computed numerically the entropy rate of degree-biased random walkers on computer-generated uncorrelated SF networks, as we did for the unbiased case. In Fig. 2(a) we report the entropy rate as a function of the degree bias $\alpha$ for SF networks of size $N = 10^{[\ldots]}$. In Fig. 2(b) we show the scaling of $h$ with the system size $N$, in SF networks with $\gamma = 3$. In both cases Eq. (11) is in good agreement with the numerical results, reproducing the qualitative behavior of $h$ as a function of $\alpha$ (the global maximum of $h$ is well reproduced) and of $N$ (both the divergence of $h$, for $\alpha > \gamma - 2$, and the asymptotic finite value of $h$, when $\alpha < \gamma - 2$, are correctly reproduced) [15].

Real Networks.-
Up to now, we have focused on the entropy rate of biased random walks on SF networks. However, real networks are not perfectly scale-free and, more importantly, show additional important structural properties such as degree-degree correlations, motifs and community structures [4, 5]. We now propose to characterize a real network by studying different diffusion processes on top of it, and finding the optimal value of the bias that maximizes the entropy. As a reference system, we compare the entropy rate $h$ of $\alpha$-biased random walks on the network with the entropy rate $h_{\rm Rand}$ obtained, from Eq. (10), for a randomized version of the network with the same degree sequence as the real one [16]. For this purpose, we have analyzed the different networks reported in Table I, corresponding to three different functional classes where diffusion of data, rumors, viruses and diseases takes place, namely (i) transportation, (ii) technological/communication and (iii) social networks.

FIG. 2: (color online). (a) Entropy rate $h$, as a function of $\alpha$, for $\alpha$-biased random walks on SF networks with $N = 10^{[\ldots]}$ nodes and $\gamma = 2.6, 3.0, 3.4$. Symbols represent the values of $h$ found numerically, while the lines are the corresponding analytical predictions $h(\gamma, \alpha, N)$ of Eq. (11). (b) Entropy rate $h$ for $\alpha$-biased random walks on SF networks with $\gamma = 3$, as a function of the system size $N$ and for several values of $\alpha$ ($\alpha = -0.5, 0.0, 0.5, 1.2$). Again, symbols are the results of numerical simulations, while the lines correspond to Eq. (11).

In Fig. 3 we report, for six of the networks, the results obtained as a function of the bias parameter $\alpha$. Two different behaviors emerge clearly for $\alpha > 0$ [26], namely the entropy of the real network $h$ is either larger or smaller than $h_{\rm Rand}$ over the whole range of positive values of $\alpha$.
In Table I we summarize this result by reporting the ratio $h/h_{\rm Rand}$ for $\alpha = 1$ (linear bias). We found that social networks always have $h > h_{\rm Rand}$, while the other networks have $h < h_{\rm Rand}$, with the exception of Internet routers. This difference in the entropy rate has its roots mainly in the different types of degree-degree correlations of the network, and points out that assortativity facilitates the spread of the diffusion. The optimal degree bias $\alpha_{\rm opt}$ that produces the maximal entropy rate is also reported in Table I. The results indicate that for assortative networks (e.g. social networks) the maximal entropy rate is obtained with a super-linear diffusion, while for disassortative networks $\alpha_{\rm opt}$ is located in the sub-linear bias region.

TABLE I: Properties of the real networks analyzed. N is the number of nodes in the giant connected component, ⟨k⟩ is the average degree. The ratio of entropy rates, h/h_Rand, is reported for a linear (α = 1) degree-biased random walk. Finally we report the optimal value of α that maximizes h for each network.

network                  ref.   N        ⟨k⟩     h/h_Rand   α_opt
U.S. Airports            [17]   500      11.92   0.964      0.8
Internet routers         [18]   228263   2.80    1.191      1.7
Internet A.S.            [18]   1174     4.19    0.662      0.6
WWW                      [19]   325729   6.70    0.867      0.9
P2P                      [20]   79939    4.13    0.613      0.7
Sci. Coll. (cond-mat)    [21]   12722    6.28    1.091      1.5
Sci. Coll. (astro-ph)    [22]   13259    18.62   1.071      1.5
U.S. patents             [23]   230686   4.81    1.113      1.5
E-mail                   [24]   1133     9.62    1.019      1.2
P.G.P                    [25]   10680    4.56    1.176      1.3
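The two quantities reported in Table I can be sketched in a few lines of Python: the ratio $h/h_{\rm Rand}$, with $h$ computed on the graph itself from Eqs. (1), (2) and (5) and $h_{\rm Rand}$ from Eq. (10) using only the degree sequence, and a grid estimate of $\alpha_{\rm opt}$. The edge list `TOY_EDGES`, the grid `ALPHAS` and the function names are hypothetical and serve only as an example; none of the data sets of Table I is reproduced here.

```python
import math

def neighbors_from_edges(edges):
    """Neighbor sets of an undirected graph given as a list of (u, v) pairs."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return nbrs

def entropy_rate_graph(nbrs, alpha):
    """Entropy rate h of the alpha-biased walk on the graph, Eqs. (1), (2), (5)."""
    k = {i: len(js) for i, js in nbrs.items()}
    c = {i: sum(k[j] ** alpha for j in js) for i, js in nbrs.items()}
    z = sum(c[i] * k[i] ** alpha for i in nbrs)
    h = 0.0
    for i, js in nbrs.items():
        w_i = c[i] * k[i] ** alpha / z      # stationary probability of node i
        for j in js:
            p = k[j] ** alpha / c[i]        # hopping probability from i to j
            h -= w_i * p * math.log(p)
    return h

def entropy_rate_randomized(nbrs, alpha):
    """h_Rand from Eq. (10), using only the degree sequence of the graph."""
    ks = [len(js) for js in nbrs.values()]
    mean_k = sum(ks) / len(ks)
    m = sum(k ** (alpha + 1.0) for k in ks) / len(ks)
    m_log = sum(k ** (alpha + 1.0) * math.log(k) for k in ks) / len(ks)
    return (1.0 - alpha) * m_log / m + math.log(m / mean_k)

def entropy_ratio(nbrs, alpha):
    """Ratio h / h_Rand at a given bias alpha."""
    return entropy_rate_graph(nbrs, alpha) / entropy_rate_randomized(nbrs, alpha)

def optimal_bias(nbrs, alphas):
    """Grid estimate of the bias alpha_opt that maximizes h on the graph."""
    return max(alphas, key=lambda a: entropy_rate_graph(nbrs, a))

# Hypothetical toy graph: a 4-clique joined to a triangle by one link.
TOY_EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
             (3, 4), (4, 5), (5, 6), (4, 6)]
TOY = neighbors_from_edges(TOY_EDGES)
ALPHAS = [0.1 * i for i in range(-20, 41)]   # grid of biases from -2 to 4

print(entropy_ratio(TOY, 1.0))    # h / h_Rand for the linear bias
print(optimal_bias(TOY, ALPHAS))  # grid estimate of alpha_opt
```

At $\alpha = 0$ the ratio is equal to one for any connected graph, since both expressions reduce to $\langle k \ln(k) \rangle / \langle k \rangle$; deviations from one at $\alpha \neq 0$ reflect structure beyond the degree sequence.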
FIG. 3: Entropy rate $h$ for $\alpha$-biased random walks on six of the networks in Table I (filled circles). Such entropy rate is compared with that obtained for the randomized version of the network (full circles). Both entropy rates are shown as a function of the degree-bias parameter $\alpha$.

Summing up, in this Letter we have introduced the entropy rate of degree-biased random walks on networks, a measure that is particularly suited to capture the interplay between network structure and diffusion dynamics. We have studied the dependence of the entropy rate on the topology of synthetic and real networks, in particular on the heterogeneity of degree distributions and the nature of the degree-degree correlations. The results indicate how it is possible to tune the bias in the random walk in order to maximize the entropy rate on a given topology. The method introduced can find useful applications in cases where diffusion in complex networks is the mechanism at work, such as in the search for efficient algorithms for data search in the WWW, in the improvement of information dissemination in social networks, or in the design of large impact virus/antivirus spreading in computer networks. The approach adopted here can be easily generalized to other types of diffusion processes and to more general network topologies, such as weighted graphs and also, with some appropriate modifications, to directed and unconnected graphs.

The authors are indebted to M. Chavez and H. Touchette for many helpful discussions on the subject.

[1] R. Balian, From Microphysics to Macrophysics (Springer-Verlag, New York, 1991).
[2] C. Beck, F. Schlögl, Thermodynamics of Chaotic Systems (Cambridge University Press, Cambridge, 1993).
[3] T. M. Cover and J. A. Thomas, Elements of Information Theory (Wiley, 1991).
[4] M.E.J. Newman, SIAM Review , 167 (2003).
[5] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, D.-U. Hwang, Phys. Rep. , 175 (2006).
[6] R. Ferrer i Cancho and R. V. Solé, Lect. Notes in Physics , 114 (2003).
[7] M. Rosvall, A. Trusina, P. Minnhagen, and K. Sneppen, Phys. Rev. Lett. , 028701 (2005).
[8] A. Trusina, M. Rosvall, K. Sneppen, Phys. Rev. Lett.
[9] “The Entropy of randomized network ensembles”, arXiv:0708.0153, in press in Europhys. Lett.
[10] P. Minnhagen and S. Bernhardsson, Chaos , 026117 (2007).
[11] J. Park and M.E.J. Newman, Phys. Rev. E 70, 066117 (2004).
[12] A. N. Kolmogorov, Dokl. Akad. Nauk SSSR , 861 (1958); , 754 (1959).
[13] V. Latora, M. Baranger, Phys. Rev. Lett. , 520 (1999).
[14] F. R. Gantmacher, The Theory of Matrices, Volume 2 (Chelsea Publishing Company, New York, 1959).
[15] Note that the effect of the continuum-degree approximation made in the calculation becomes more evident as α grows, because for large α the discreteness effects in k are amplified.
[16] S. Maslov, K. Sneppen, Science , 910 (2002).
[17] V. Colizza, R. Pastor-Satorras and A. Vespignani, Nature Physics , 276 (2007).
[18] A. Vázquez, R. Pastor-Satorras and A. Vespignani, Lect. Notes in Physics , 425 (2004).
[19] R. Albert, H. Jeong and A.L. Barabási, Nature , 130 (1999).
[20] F. Wang, Y. Moreno and Y. Sun, Phys. Rev. E , 036123 (2006).
[21] M.E.J. Newman, Phys. Rev. E
[22] “Topological Analysis of Scientific Coauthorship Networks” (Univ. of Catania, Catania, 2006).
[23] B.H. Hall, A.B. Jaffe and M. Trajtenberg, NBER Working Paper 8498 (2001).
[24] R. Guimera, L. Danon, A. Diaz-Guilera, F. Giralt and A. Arenas, Phys. Rev. E , 065103(R) (2003).
[25] M. Boguñá, R. Pastor-Satorras, A. Diaz-Guilera and A. Arenas, Phys. Rev. E , 056122 (2004).
[26] Note that for α = 0 both the real network and the randomized ensemble have the same value of h.