Fractal and Transfractal Scale-Free Networks
Hernán D. Rozenfeld, Lazaros K. Gallos, Chaoming Song, Hernán A. Makse
Hernán D. Rozenfeld,∗ Lazaros K. Gallos,† Chaoming Song,‡ and Hernán A. Makse§
Levich Institute and Physics Department, City College of New York, New York, New York 10031, USA
Article Outline
Glossary and Notation
1. Definition of the Subject and Its Importance
2. Introduction
3. Fractality in Real-World Networks
4. Models: Deterministic Fractal and Transfractal Networks
5. Properties of Fractal and Transfractal Networks
6. Future Directions
7. APPENDIX: The Box Covering Algorithms
8. Bibliography
Glossary and Notation
Degree of a node
Number of edges incident to the node.
Scale-Free Network
Network that exhibits a wide (usually power-law) distribution of the degrees.
Small-World Network
Network for which the diameter increases logarithmically with the number of nodes.
Distance
The length (measured in number of links) of the shortest path between two nodes.
Box
Group of nodes. In a connected box there exists a path within the box between any pair of nodes; otherwise, the box is disconnected.
Box Diameter
The longest distance in a box.
DEFINITION OF THE SUBJECT AND ITS IMPORTANCE
The explosion in the study of complex networks during the last decade has offered a unique view into the structure and behavior of a wide range of systems, spanning many different disciplines [1]. The importance of complex networks lies mainly in their simplicity, since they can represent practically any system with interactions in a unified way by stripping complicated details and retaining the main features of the system. The resulting networks include only nodes, representing the interacting agents, and links, representing interactions. The term 'interactions' is used loosely to describe any possible way that causes two nodes to form a link. Examples can be real physical links, such as the wires connecting computers in the Internet or roads connecting cities, or alternatively they may be virtual links, such as links in WWW homepages or acquaintances in societies, where there is no physical medium actually connecting the nodes.

The field was pioneered by the famous mathematician P. Erdős many decades ago, when he greatly advanced graph theory [2]. The theory of networks would perhaps have remained a problem of mathematical beauty, had it not been for the discovery that a huge number of everyday life systems share many common features and can thus be described through a unified theory.
The remarkable diversity of these systems incorporates artificial or man-made technological networks such as the Internet and the World Wide Web (WWW), social networks such as networks of social acquaintances or sexual contacts, biological networks of natural origin, such as the network of protein interactions of Yeast [1, 3], and a rich variety of other systems, such as the proximity of words in literature [4], items that are bought by the same people [5], or the way modules are connected to create a piece of software, among many others.

The advances in our understanding of networks, combined with the increasing availability of many databases, allow us to analyze and gain deeper insight into the main characteristics of these complex systems. A large number of complex networks share the scale-free property [1, 6], indicating the presence of a few highly connected nodes (usually called hubs) and a large number of nodes with small degree. This feature alone has a great impact on the analysis of complex networks and has introduced a new way of understanding these systems. This property carries important implications for many everyday life problems, such as the way a disease spreads in communities of individuals, or the resilience and tolerance of networks under random and intentional attacks [7, 8, 9, 10, 11].

Although the scale-free property holds an undisputed importance, it has been shown to not completely determine the global structure of networks [12]. In fact, two networks that obey the same distribution of the degrees may dramatically differ in other fundamental structural properties, such as in correlations between degrees or in the average distance between nodes. Another fundamental property, which is the focus of this article, is the presence of self-similarity or fractality. In simpler terms, we want to know whether a subsection of the network looks much the same as the whole [13, 14, 15, 16].
Although for regular fractal objects the distinction between self-similarity and fractality is absent, in network theory we can distinguish the two terms: in a fractal network the number of boxes of a given size that are needed to completely cover the network scales with the box size as a power law, while a self-similar network is defined as a network whose degree distribution remains invariant under renormalization of the network (details on the renormalization process will be provided later). This essential result allows us to better understand the origin of important structural properties of networks such as the power-law degree distribution [17, 18, 19].
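The box-covering definition of fractality just given can be made concrete with a short sketch. The greedy strategy below (placing each node into the first compatible box) is only one possible heuristic for estimating the minimal number of boxes of a given size; the function names and the adjacency-dictionary encoding are illustrative, not part of the original text.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def greedy_box_count(adj, lb):
    """Greedy estimate of the number of boxes of size lb needed to
    cover the network: any two nodes in a box are at distance < lb."""
    dist = {u: bfs_distances(adj, u) for u in adj}
    boxes = []
    for u in adj:
        # place u in the first box whose members are all closer than lb
        for box in boxes:
            if all(dist[u].get(v, float("inf")) < lb for v in box):
                box.add(u)
                break
        else:
            boxes.append({u})
    return len(boxes)
```

On an 8-node path, for example, this greedy count gives 4 boxes of size 2 and 2 boxes of size 4, halving as the box size doubles, as expected for a one-dimensional chain.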
INTRODUCTION
Self-similarity is a property of fractal structures, a concept introduced by Mandelbrot and one of the fundamental mathematical results of the 20th century [14, 15, 20]. The importance of fractal geometry stems from the fact that these structures were recognized in numerous examples in Nature: from the coexistence of liquid and gas at the critical point of evaporation of water [21, 22, 23], to snowflakes, to the tortuous coastline of the Norwegian fjords, to the behavior of many complex systems such as economic data, or the complex patterns of human agglomeration [14, 15].

Typically, real-world scale-free networks exhibit the small-world property [1], which implies that the number of nodes increases exponentially with the diameter of the network, rather than the power-law behavior expected for self-similar structures. For this reason complex networks were believed not to be length-scale invariant or self-similar.

In 2005, C. Song, S. Havlin and H. Makse presented an approach to analyzing complex networks that reveals their self-similarity [17]. This result is achieved by the application of a renormalization procedure which coarse-grains the system into boxes containing nodes within a given size [17, 24]. As a result, a power-law relation between the number of boxes needed to cover the network and the size of the box is found, defining a finite self-similar exponent. These fundamental properties, which are shown for the WWW, cellular and protein-protein interaction networks, help to understand the emergence of the scale-free property in complex networks. They suggest a common self-organization dynamics of diverse networks at different scales into a critical state, and in turn bring together previously unrelated fields: the statistical physics of complex networks with renormalization group, fractals and critical phenomena.
FRACTALITY IN REAL-WORLD NETWORKS
The study of real complex networks has revealed that many of them share some fundamental common properties. Of great importance is the form of the degree distribution for these networks, which is unexpectedly wide. This means that the degree of a node may assume values that span many decades. Thus, although the majority of nodes have a relatively small degree, there is a finite probability that a few nodes will have degree of the order of thousands or even millions. Networks that exhibit such a wide distribution P(k) are known as scale-free networks, where the term refers to the absence of a characteristic scale in the degree k. This distribution very often obeys a power-law form with a degree exponent γ, usually in the range 2 < γ < 3:

P(k) ∼ k^{−γ}.    (1)

FIG. 1: Representation of the Protein Interaction Network of Yeast. The colors show different subgroups of proteins that participate in different functionality classes [3].

A more generic property, which is usually inherent in scale-free networks but applies equally well to other types of networks, such as Erdős-Rényi random graphs, is the small-world feature. Originally discovered in sociological studies [26], it is the generalization of the famous 'six degrees of separation' and refers to the very small network diameter. Indeed, in small-world networks a very small number of steps is required to reach a given node starting from any other node. Mathematically this is expressed by the slow (logarithmic) increase of the average diameter of the network, ℓ̄, with the total number of nodes N, ℓ̄ ∼ ln N, where ℓ is the shortest distance between two nodes and defines the distance metric in complex networks [2, 25, 27, 28], namely,

N ∼ e^{ℓ̄/ℓ₀},    (2)

where ℓ₀ is a characteristic length. These network characteristics have been shown to apply in many empirical studies of diverse systems [1, 6, 25].
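Both properties are easy to observe numerically. The sketch below grows a network by preferential attachment in the spirit of the Barabási-Albert model and estimates the average distance by breadth-first search; the sizes, random seed, and helper names are arbitrary choices for the demonstration, not part of the original analysis.

```python
import random
from collections import deque

def barabasi_albert(n, m, seed=0):
    """Preferential attachment: each new node links to m existing
    nodes, chosen proportionally to their current degree."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    pool = []  # every node appears once per unit of its degree
    for i in range(m + 1):        # small fully connected core
        for j in range(i):
            adj[i].add(j); adj[j].add(i)
            pool += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:   # degree-biased target choice
            targets.add(rng.choice(pool))
        for t in targets:
            adj[new].add(t); adj[t].add(new)
            pool += [new, t]
    return adj

def mean_distance(adj, n_sources, seed=0):
    """Average shortest-path length from a sample of source nodes."""
    rng = random.Random(seed)
    nodes = list(adj)
    total = pairs = 0
    for _ in range(n_sources):
        src = rng.choice(nodes)
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs
```

For a few thousand nodes the maximum degree lands far above the typical degree (the hubs of the scale-free property), while the mean distance stays of order a few steps, in line with ℓ̄ ∼ ln N.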
The simple knowledge that a network has the scale-free and/or small-world property already enables us to qualitatively recognize many of its basic properties. However, structures that have the same degree exponents may still differ in other aspects [12]. For example, a question of fundamental importance is whether scale-free networks are also self-similar or fractals. The illustrations of scale-free networks (see e.g. Figs. 1 and 2b) seem to resemble traditional fractal objects. Despite this similarity, Eq. (2) definitely appears to contradict a basic property of fractality: the fast increase of the diameter with the system size. Moreover, a fractal object should be self-similar or invariant under a scale transformation, which is again not clear in the case of scale-free networks, where the scale has a necessarily limited range. So, how is it even possible that fractal scale-free networks exist? In the following, we will see how these seemingly contradictory aspects can be reconciled.

Fractality and Self-Similarity
The classical theory of self-similarity requires a power-law relation between the number of nodes N and the diameter of a fractal object ℓ [13, 16]. The fractal dimension can be calculated using either box-counting or cluster-growing techniques [14]. In the first method the network is covered with N_B boxes of linear size ℓ_B. The fractal dimension or box dimension d_B is then given by [15]:

N_B ∼ ℓ_B^{−d_B}.    (3)

In the second method, instead of covering the network with boxes, a random seed node is chosen and a cluster of nodes centered at the seed is grown, so that the nodes in the cluster are separated from the seed by a maximum distance ℓ. The procedure is then repeated by choosing many seed nodes at random, and the average "mass" of the resulting clusters, ⟨M_c⟩ (defined as the number of nodes in the cluster), is calculated as a function of ℓ to obtain the following scaling:

⟨M_c⟩ ∼ ℓ^{d_f},    (4)

defining the fractal cluster dimension d_f [15]. If we use Eq. (4) for a small-world network, then Eq. (2) readily implies that d_f = ∞. In other words, these networks cannot be characterized by a finite fractal dimension, and should be regarded as infinite-dimensional objects. If this were true, though, local properties in a part of the network would not be able to represent the whole system. Still, it is also well established that the scale-free nature is similar in different parts of the network. Moreover, a graphical representation of real-world networks allows us to see that those systems seem to be built by attaching (following some rule) copies of themselves.

The answer lies in the inherent inhomogeneity of the network. In the classical case of a homogeneous system (such as a fractal percolation cluster) the degree distribution is very narrow and the two methods described above are fully equivalent, because of this local neighborhood invariance. Indeed, all boxes in the box-covering method are statistically similar to each other, as well as to the boxes grown when using the cluster-growing technique, so that Eq. (4) can be derived from Eq. (3) and d_B = d_f.

In inhomogeneous systems, though, the local environment can vary significantly. In this case, Eqs. (3) and (4) are no longer equivalent. If we focus on the box-covering technique then we want to cover the entire network with the minimum possible number of boxes N_B(ℓ_B), where the distance between any two nodes that belong in a box is smaller than ℓ_B. An example is shown in Fig. 2a using a simple 8-node network. After we repeat this procedure for different values of ℓ_B we can plot N_B vs ℓ_B. When this method is applied to real-world networks such as the WWW, the networks of protein interactions of H. sapiens and E. coli [29, 30] and several cellular networks [31, 32], they follow Eq. (3) with a clear power law, indicating the fractal nature of these systems (Figs. 3a, 3b, 3c). On the other hand, when the method is applied to other real-world networks, such as the Internet [33] or the Barabási-Albert network [34], they do not satisfy Eq. (3), which manifests that these networks are not fractal.

The reason behind the discrepancy in the fractality of homogeneous and inhomogeneous systems can be better clarified by studying the mass of the boxes. For a given ℓ_B value, the average mass of a box ⟨M_B(ℓ_B)⟩ is

⟨M_B(ℓ_B)⟩ ≡ N/N_B(ℓ_B) ∼ ℓ_B^{d_B},    (5)

as also verified in Fig. 3 for several real-world networks. On the other hand, the average performed in the cluster-growing method (averaging over single boxes without tiling the system) gives rise to an exponential growth of the mass

⟨M_c(ℓ)⟩ ∼ e^{ℓ/ℓ₀},    (6)

in accordance with the small-world effect, Eq. (2). Correspondingly, the probability distribution of the mass of the boxes M_B using box-covering is very broad, while the cluster-growing technique leads to a narrow probability distribution of M_c.

The topology of scale-free networks is dominated by several highly connected hubs (the nodes with the largest degree), implying that most of the nodes are connected to the hubs via one or very few steps. Therefore, the average performed in the cluster-growing method is biased; the hubs are overrepresented in Eq. (6) since almost every node is a neighbor of a hub, and there is always a very large probability of including the same hubs in all clusters. On the other hand, the box-covering method is a global tiling of the system providing a flat average over all the nodes, i.e. each part of the network is covered with an equal probability. Once a hub (or any node) is covered, it cannot be covered again.

In conclusion, we can state that the two dominant methods that are routinely used for calculations of fractality and give rise to Eqs.
(3) and (4) are not equivalent in scale-free networks, but rather highlight different aspects: box covering reveals the self-similarity, while cluster growing reveals the small-world effect. The apparent contradiction is due to the hubs being used many times in the latter method.

The apparent contradiction between the small-world and fractal properties, as expressed through Eqs. (2) and (3), can be explained as follows. Scale-free networks can be classified into three groups: (i) pure fractal, (ii) pure small-world and (iii) a mixture between fractal and small-world. (i) A fractal network satisfies Eq. (3) at all scales, meaning that for any value of ℓ_B the number of boxes always follows a power law (examples are shown in Figs. 3a, 3b, 3c). (ii) When a network is a pure small world, it never satisfies Eq. (3). Instead, N_B follows an exponential decay with ℓ_B and the network cannot be regarded as fractal. Figs. 3d and 3e show two examples of pure small-world networks. (iii) In the case of a mixture between fractal and small-world, Eq. (3) is satisfied up to some cut-off value of ℓ_B, above which the fractality breaks down and the small-world property emerges. The small-world property is reflected in the plot of N_B vs. ℓ_B as an exponential cut-off for large ℓ_B.

We can also understand the coexistence of the small-world property and fractality through a more intuitive approach. In a pure fractal network the length of a path between any pair of nodes scales as a power law with the number of nodes in the network. Therefore, the diameter L also follows a power law, L ∼ N^{1/d_B}. If one adds a few shortcuts (links between randomly chosen nodes), many paths in the network are drastically shortened and the small-world property emerges, as L ∼ ln N. In spite of this fact, for shorter scales, ℓ_B ≪ L, the network still behaves as a fractal. In this sense, we can say that globally the network is small-world, but locally (at short scales) the network behaves as a fractal. As more shortcuts are added, the cut-off in a plot of N_B vs. ℓ_B appears at smaller ℓ_B, until the network becomes a pure small world, for which all path lengths increase logarithmically with N.

The reasons why certain networks have evolved towards a fractal or non-fractal structure will be described later, together with models and examples that provide additional insight into the processes involved.

FIG. 2: The renormalization procedure for complex networks. a, Demonstration of the method for different ℓ_B and different stages in a network demo. The first column depicts the original network. The system is tiled with boxes of size ℓ_B (different colors correspond to different boxes). All nodes in a box are connected by a minimum distance smaller than the given ℓ_B. For instance, in the case of ℓ_B = 2, one identifies four boxes which contain the nodes depicted in red, orange, white, and blue, each containing 3, 2, 1, and 2 nodes, respectively. Then each box is replaced by a single node; two renormalized nodes are connected if there is at least one link between the unrenormalized boxes. Thus we obtain the network shown in the second column. The resulting number of boxes needed to tile the network, N_B(ℓ_B), is plotted in Fig. 4 versus ℓ_B to obtain d_B as in Eq. (3). The renormalization procedure is applied again and repeated until the network is reduced to a single node (third and fourth columns for different ℓ_B). b, Three stages in the renormalization scheme applied to the entire WWW. We fix the box size to ℓ_B = 3 and apply the renormalization for four stages. This corresponds, for instance, to the sequence for the network demo depicted in the second row of part a of this figure. We color the nodes in the web according to the boxes to which they belong.

FIG. 3: Self-similar scaling in complex networks. a, Upper panel: Log-log plot of N_B vs ℓ_B revealing the self-similarity of the WWW according to Eq. (3). Lower panel: The scaling of s(ℓ_B) vs. ℓ_B according to Eq. (9). b, Same as (a) but for two protein interaction networks: H. sapiens and E. coli. Results are analogous to (b) but with different scaling exponents. c, Same as (a) for the cellular networks of A. fulgidus, E. coli and C. elegans. d, Internet. Log-log plot of N_B(ℓ_B). The solid line shows that the Internet [33] is not a fractal network since it does not follow the power-law relation of Eq. (5). e, Same as (d) for the Barabási-Albert model network [34] with m = 3 and m = 5.

FIG. 4: Invariance of the degree distribution of the WWW under the renormalization for different box sizes, ℓ_B. We show the data collapse of the degree distributions demonstrating the self-similarity at different scales. The inset shows the scaling of k' = s(ℓ_B) k for different ℓ_B, from which we obtain the scaling factor s(ℓ_B). Moreover, renormalization for a fixed box size (ℓ_B = 3) is applied until the network is reduced to a few nodes. It was found that P(k) is invariant under these multiple renormalization procedures.

Renormalization
Renormalization is one of the most important techniques in modern Statistical Physics [35, 36, 37]. The idea behind this procedure is to continuously create smaller replicas of a given object, retaining at the same time the essential structural features, and hoping that the coarse-grained copies will be more amenable to analytic treatment.

The idea of renormalizing a network emerges naturally from the concept of fractality described above. If a network is self-similar, then it will look more or less the same under different scales. The way to observe these different length scales is based on renormalization principles, while the criterion to decide whether a renormalized structure retains its form is the invariance of the main structural features, expressed mainly through the degree distribution.

The method works as follows. Start by fixing the value of ℓ_B and applying the box-covering algorithm in order to cover the entire network with boxes (see Appendix). In the renormalized network each box is replaced by a single node and two nodes are connected if there existed at least one connection between the two corresponding boxes in the original network. The resulting structure represents the first stage of the renormalized network. We can apply the same procedure to this new network as well, resulting in the second renormalization stage network, and so on, until we are left with a single node. The second column of the panels in Fig. 2a shows this step in the renormalization procedure for the schematic network, while Fig. 2b shows the results for the same procedure applied to the entire WWW for ℓ_B = 3.

The renormalized network gives rise to a new probability distribution of links, P(k') (we use a prime ' to denote quantities in the renormalized network). This distribution remains invariant under the renormalization:

P(k) → P(k') ∼ (k')^{−γ}.    (7)

Fig. 4 supports the validity of this scale transformation by showing a data collapse of all distributions with the same γ according to (7) for the WWW.

Here, we present the basic scaling relations that characterize renormalizable networks. The degree k' of each node in the renormalized network can be seen to scale with the largest degree k in the corresponding original box as

k → k' = s(ℓ_B) k.    (8)

This equation defines the scaling transformation in the connectivity distribution. Empirically, it was found that the scaling factor s (< 1) scales with ℓ_B with a new exponent, d_k, as s(ℓ_B) ∼ ℓ_B^{−d_k}, so that

k' ∼ ℓ_B^{−d_k} k.    (9)

This scaling is verified for many networks, as shown in Fig. 4a.

The exponents γ, d_B, and d_k are not all independent of each other. The proof starts from the density balance equation n(k) dk = n'(k') dk', where n(k) = N P(k) is the number of nodes with degree k and n'(k') = N' P(k') is the number of nodes with degree k' after the renormalization (N' is the total number of nodes in the renormalized network). Substituting Eq. (8) leads to N' = s^{γ−1} N. Since the total number of nodes in the renormalized network is the number of boxes needed to cover the unrenormalized network at any given ℓ_B, we have the identity N' = N_B(ℓ_B). Finally, from Eqs. (3) and (9) one obtains the relation between the three indices:

γ = 1 + d_B/d_k.    (10)

The use of Eq. (10) yields the same γ exponent as that obtained in the direct calculation of the degree distribution. The significance of this result is that the scale-free properties characterized by γ can be related to a more fundamental length-scale-invariant property, characterized by the two new indices d_B and d_k.

We have seen, thus, that concepts introduced originally for the study of critical phenomena in statistical physics are also valid in the characterization of a different class of phenomena: the topology of complex networks. A large number of scale-free networks are fractals, and an even larger number remain invariant under a scale transformation. The influence of these features on the network properties will be delayed until the sixth chapter, after we introduce some algorithms for efficient numerical calculations and two theoretical models that give rise to fractal networks.

MODELS: DETERMINISTIC FRACTAL AND TRANSFRACTAL NETWORKS
The first model of a scale-free fractal network was presented in 1979, when N. Berker and S. Ostlund [38] proposed a hierarchical network that served as an exotic example where renormalization group techniques yield exact results, including the percolation phase transition and the q-state Potts model.

The Song-Havlin-Makse Model
The correlations between degrees in a network [42, 43, 44, 45] are quantified through the probability P(k1, k2) that a node of degree k1 is connected to another node of degree k2. In Fig. 5 we can see the degree correlation profile R(k1, k2) = P(k1, k2)/P_r(k1, k2) of the cellular metabolic network of E. coli [46] (known to be a fractal network) and of the Internet at the router level [47] (a non-fractal network), where P_r(k1, k2) is obtained by randomly swapping the links without modifying the degree distribution.

Fig. 5 shows a dramatic difference between the two networks. The network of E. coli, which is a fractal network, presents an anti-correlation of the degrees (or disassortativity [42, 43]), which means that mostly high-degree nodes are linked to low-degree nodes. This property leads to fractal networks. On the other hand, the Internet exhibits a high correlation between degrees, leading to a non-fractal network.

With this idea in mind, in 2006 C. Song, S. Havlin and H. Makse presented a model that elucidates the way new nodes must be connected to the old ones in order to build a fractal network, a non-fractal network, or a mixture between a fractal and non-fractal network [18]. This model shows that, indeed, the correlations between the degrees of the nodes are a determinant factor for the fractality of a network.

FIG. 5: Degree correlation profile for (a) the cellular metabolic network of E. coli, and (b) the Internet at the router level.

FIG. 6: The model grows from a small network, usually two nodes connected to each other. During each step and for every link in the system, each endpoint of a link produces m offspring nodes (in this drawing m = 3). In this case, with probability e = 1 the original link is removed and x new links between randomly selected nodes of the new generation are added. Notice that the case of x = 1 results in a tree structure, while loops appear for x > 1.
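The randomized reference P_r described above is commonly obtained by degree-preserving link swaps: two links (a, b) and (c, d) are rewired to (a, d) and (c, b), which leaves every node's degree unchanged while destroying degree-degree correlations. A minimal sketch (the function name and edge-list encoding are illustrative):

```python
import random

def degree_preserving_shuffle(edges, n_swaps, seed=0):
    """Randomize an edge list while keeping every node's degree:
    rewire pairs (a,b),(c,d) -> (a,d),(c,b), rejecting moves that
    would create a self-loop or a duplicate link."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = 0
    while done < n_swaps:
        i = rng.randrange(len(edges))
        j = rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:          # shared endpoint: reject
            continue
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue                        # duplicate link: reject
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges
```

Because the degree sequence is untouched, P(k) is exactly preserved, so any difference between P and P_r can be attributed to correlations alone.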
This model was later extended [48] to allow loops in the network, while preserving the self-similarity and fractality properties.

The algorithm is as follows (see Fig. 6): In generation n = 0, start with two nodes connected by one link. Then, generation n + 1 is obtained recursively by attaching m new nodes to the endpoints of each link l of generation n. In addition, with probability e remove link l and add x new links connecting pairs of new nodes attached to the endpoints of l.

The degree distribution, diameter and fractal dimension can be easily calculated. For example, if e = 1 (pure fractal network), the degree distribution follows a power law P(k) ∼ k^{−γ} with exponent γ = 1 + log(2m + x)/log m, and the fractal dimension is d_B = log(2m + x)/log m. The diameter L scales, in this case, as a power of the number of nodes, L ∼ N^{1/d_B} [18, 24]. Later, in Section "Properties of Fractal and Transfractal Networks", several topological properties are shown for this model network.

(u,v)-flowers

In 2006, H. Rozenfeld, S. Havlin and D. ben-Avraham proposed a new family of recursive deterministic scale-free networks, the (u,v)-flowers, that generalize both the original scale-free model of Berker and Ostlund [38] and the pseudo-fractal network of Dorogovtsev, Goltsev and Mendes [49], and that, by appropriately varying the two parameters u and v, lead to either fractal or non-fractal networks [50, 51]. The algorithm to build the (u,v)-flowers is the following: In generation n = 1 one starts with a cycle graph (a ring) consisting of u + v ≡ w links and nodes (other choices are possible). Then, generation n + 1 is obtained recursively by replacing each link by two parallel paths of u and v links long. Without loss of generality, u ≤ v.
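The recursive link-replacement step of the (u,v)-flower construction can be sketched directly. The code below builds the edge list generation by generation (node labels and function names are illustrative choices):

```python
def uv_flower(u, v, n):
    """Edge list of the (u,v)-flower at generation n: start from a
    ring of w = u + v links and, at every generation, replace each
    link by two parallel paths of u and v links."""
    w = u + v
    edges = [(i, (i + 1) % w) for i in range(w)]  # generation 1
    label = w                                     # next fresh node id
    for _ in range(n - 1):
        new_edges = []
        for a, b in edges:
            for path_len in (u, v):
                prev = a
                for _ in range(path_len - 1):     # interior path nodes
                    new_edges.append((prev, label))
                    prev = label
                    label += 1
                new_edges.append((prev, b))
        edges = new_edges
    return edges

def count_nodes(edges):
    return len({x for e in edges for x in e})
```

The resulting counts can be checked against the closed-form expressions for the number of links and nodes of a generation-n flower: for instance, a (2,2)-flower at generation 3 has 4^3 = 64 links and 44 nodes.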
Examples are shown in Fig. 7. The pseudo-fractal network of Dorogovtsev, Goltsev and Mendes corresponds to u = 1 and v = 2, and the Berker and Ostlund model corresponds to u = 2 and v = 2.

An essential property of the (u,v)-flowers is that they are self-similar, as evident from an equivalent method of construction: to produce generation n + 1, make w = u + v copies of the net in generation n and join them at the hubs.

FIG. 7: (u,v)-flowers with u + v = 4 (γ = 3). (a) u = 1 (dotted line) and v = 3 (broken line). (b) u = 2 and v = 2. The graphs may also be iterated by joining four replicas of generation n at the hubs A and B, for (a), or A and C, for (b).

The number of links of a (u,v)-flower of generation n is

M_n = (u + v)^n = w^n,    (11)

and the number of nodes is

N_n = ((w − 2)/(w − 1)) w^n + w/(w − 1).    (12)

The degree distribution of the (u,v)-flowers can also be easily obtained since, by construction, (u,v)-flowers have only nodes of degree k = 2^m, m = 1, 2, ..., n. As in the DGM case, (u,v)-flowers follow a scale-free degree distribution, P(k) ∼ k^{−γ}, of degree exponent

γ = 1 + ln(u + v)/ln 2.    (13)

Recursive scale-free trees may be defined in analogy to the flower nets. If v is even, one obtains generation n + 1 of a (u,v)-tree by replacing every link in generation n with a chain of u links, and attaching to each of its endpoints chains of v/2 links; if v is odd, one attaches to the endpoints (of the chain of u links) chains of length (v ± 1)/2. The trees may also be constructed by successively joining w replicas at the appropriate hubs, and they too are self-similar. They share many of the fundamental scaling properties with (u,v)-flowers: their degree distribution is also scale-free, with the same degree exponent as (u,v)-flowers.

The self-similarity of (u,v)-flowers, coupled with the fact that different replicas meet at a single node, makes them amenable to exact analysis by renormalization techniques. The lack of loops, in the case of (u,v)-trees, further simplifies their analysis [38, 50, 51, 52].

FIG. 8: The (1,2)-tree. (a) Each link in generation n is replaced by a chain of u = 1 links, to whose ends one attaches chains of v/2 links. (b) Alternatively, u + v = 3 replicas of generation n are joined at the hubs. (c) Generations n = 1, 2, 3.

Dimensionality of the (u,v)-flowers

There is a vast difference between (u,v)-nets with u = 1 and u > 1. If u = 1, the diameter L_n of the n-th generation flower scales linearly with n. For example, L_n = 2n for the (1,3)-flower. The diameter of the (1,v)-flower, for v odd, is L_n = (v − 1)n + (3 − v)/2, and, in general, one can show that L_n ∼ (v − 1)n.

For u > 1, however, the diameter grows exponentially with n. For example, for the (2,2)-flower L_n = 2^n, and, more generally, the diameter satisfies L_n ∼ u^n. To summarize,

L_n ∼ (v − 1)n for u = 1;  L_n ∼ u^n for u > 1    (flowers).    (14)

Similar results are quite obvious for the case of (u,v)-trees, where

L_n ∼ vn for u = 1;  L_n ∼ u^n for u > 1    (trees).    (15)

Since N_n ∼ (u + v)^n [Eq. (12)], we can recast these relations as

L ∼ ln N for u = 1;  L ∼ N^{ln u/ln(u+v)} for u > 1.    (16)

Thus, (u,v)-nets are small world only in the case of u = 1. For u > 1, the diameter increases as a power of N, just as in finite-dimensional objects, and the nets are in fact fractal. For u > 1, the change of mass upon the rescaling of length by a factor b is

N(bL) = b^{d_B} N(L),    (17)

where d_B is the fractal dimension [16]. In this case, N(uL) = (u + v) N(L), so

d_B = ln(u + v)/ln u,  u > 1.    (18)

Transfinite Fractals
Small world nets, such as (1 , v )-nets, are infinite -dimensional. Indeed, their mass ( N , or M ) increases faster thanany power (dimension) of their diameter. Also, note that a naive application of (4) to u → d f → ∞ . In thecase of (1 , v )-nets one can use their weak self-similarity to define a new measure of dimensionality, ˜ d f , characterizinghow mass scales with diameter: N ( L + ℓ ) = e ℓ ˜ d f N ( L ) . (19)Instead of a multiplicative rescaling of length, L bL , a slower additive mapping, L L + ℓ , that reflects thesmall world property is considered. Because the exponent ˜ d f usefully distinguishes between different graphs of infinitedimensionality, ˜ d f has been termed the transfinite fractal dimension of the network. Accordingly, objects that areself-similar and have infinite dimension (but finite transfinite dimension), such as the (1 , v )-nets, are termed transfinitefractals, or transfractals , for short.For (1 , v )-nets, we see that upon ‘zooming in’ one generation level the mass increases by a factor of w = 1 + v , whilethe diameter grows from L to L + v − L + v (trees). Hence their transfractal dimension is˜ d f = ln(1+ v ) v (1 , v )-trees, ln(1+ v ) v − (1 , v )-flowers. (20)There is some arbitrariness in the selection of e as the base of the exponential in the definition (19). However thebase is inconsequential for the sake of comparison between dimensionalities of different objects. Also, scaling relations between various transfinite exponents hold, irrespective of the choice of base: consider the scaling relation of Eq. 10valid for fractal scale-free nets of degree exponent γ [17, 18]. For example, in the fractal ( u, v )-nets (with u > b = u and all degrees are reduced by a factor of 2, so b d k = 2. 
Thus d_k = ln 2/ln u, and since d_B = ln(u + v)/ln u and γ = 1 + ln(u + v)/ln 2, as discussed above, the relation (10) is indeed satisfied.

For transfractals, renormalization reduces distances by an additive length, ℓ, and we express the self-similarity manifest in the degree distribution as

P'(k) = e^{ℓ d̃_k} P(e^{ℓ d̃_k} k),   (21)

where d̃_k is the transfinite exponent analogous to d_k. Renormalization of the transfractal (1, v)-nets reduces the link lengths by ℓ = v − 1 (flowers), or ℓ = v (trees), while all degrees are halved. Thus,

d̃_k = ln 2/v   for (1, v)-trees,   d̃_k = ln 2/(v − 1)   for (1, v)-flowers.

Along with (20), this result confirms that the scaling relation

γ = 1 + d̃_f/d̃_k   (22)

is valid also for transfractals, and regardless of the choice of base. A general proof of this relation is practically identical to the proof of (10) [17], merely replacing fractal with transfractal scaling throughout the argument.

For scale-free transfractals, following m = L/ℓ renormalizations the diameter and mass reduce to order one, and the scaling (19) implies L ~ mℓ, N ~ e^{mℓ d̃_f}, so that L ~ (1/d̃_f) ln N, in accordance with their small-world property. At the same time the scaling (21) implies K ~ e^{mℓ d̃_k}, or K ~ N^{d̃_k/d̃_f}. Using the scaling relation (22), we rederive K ~ N^{1/(γ−1)}, which is indeed valid for scale-free nets in general, be they fractal or transfractal.

PROPERTIES OF FRACTAL AND TRANSFRACTAL NETWORKS
The existence of fractality in complex networks immediately raises the question of the importance of such a structure for network properties. In general, most of the relevant applications are modified to a greater or lesser extent, so that fractal networks can be considered to form a separate network sub-class: they share the main properties that result from the wide degree distribution of regular scale-free networks, but at the same time bear novel properties. Moreover, from a practical point of view a fractal network is usually more amenable to analytic treatment. In this section we summarize some of the applications that distinguish fractal from non-fractal networks.
Modularity
Modularity is a property closely related to fractality. Although this term does not have a unique well-defineddefinition we can claim that modularity refers to the existence of areas in the network where groups of nodes sharesome common characteristics, such as preferentially connecting within this area (the ‘module’) rather than to the restof the network. The isolation of modules into distinct areas is a complicated task and in most cases there are manypossible ways (and algorithms) to partition a network into modules.Although networks with significant degree of modularity are not necessarily fractals, practically all fractal networksare highly modular in structure. Modularity naturally emerges from the effective ‘repulsion’ between hubs. Since thehubs are not directly connected to each other, they usually dominate their neighborhood and can be considered asthe ‘center of mass’ for a given module. The nodes surrounding hubs are usually assigned to this module.The renormalization property of self-similar networks is very useful for estimating how modular a given network is,and especially for how this property is modified under varying scales of observation. We can use a simple definitionfor modularity M , based on the idea that the number of links connecting nodes within a module, L in i , is higher thanthe number of link connecting nodes in different modules, L out i . For this purpose, the boxes that result from thebox-covering method at a given length-scale ℓ B are identified as the network modules for this scale. This partitioningassumes that the minimization of the number of boxes corresponds to an increase of modularity, taking advantage ofthe idea that all nodes within a box can reach each other within less than ℓ B steps. 
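Once a box partition is given, the modularity measure defined in Eq. (23) below is straightforward to evaluate. The following sketch (plain Python; the edge list and partition are toy inputs, not data from the text) computes M for one partition, counting each inter-box link as an out-link of both of its boxes:

```python
def modularity(edges, boxes):
    # M(l_B) = (1/N_B) * sum_i L_in_i / L_out_i  -- Eq. (23);
    # every inter-box link counts as an out-link of both of its boxes.
    box_of = {v: i for i, box in enumerate(boxes) for v in box}
    l_in = [0] * len(boxes)
    l_out = [0] * len(boxes)
    for a, b in edges:
        if box_of[a] == box_of[b]:
            l_in[box_of[a]] += 1
        else:
            l_out[box_of[a]] += 1
            l_out[box_of[b]] += 1
    # A box with no outgoing links would make M diverge; here we assume
    # every box has at least one link to the rest of the network.
    return sum(li / lo for li, lo in zip(l_in, l_out)) / len(boxes)
```

For instance, two triangles joined by a single bridge link, partitioned into the two triangles, give L_i^{in} = 3 and L_i^{out} = 1 for each box, hence M = 3.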
This constraint tends to assign the largest possible number of nodes in a given neighborhood to the same box, resulting in an optimized modularity function.

A definition of the modularity function M that takes advantage of the special features of the renormalization process is, thus, the following [48]:

M(ℓ_B) = (1/N_B) Σ_{i=1}^{N_B} L_i^{in}/L_i^{out},   (23)

where the sum is over all boxes. The value of M through Eq. (23) for a given ℓ_B value is of little usefulness on its own, though. We can gather more information on the network structure if we measure M for different values of ℓ_B. If the dependence of M on ℓ_B has the form of a power law, as is often the case in practice, then we can define the modularity exponent d_M through

M(ℓ_B) ~ ℓ_B^{d_M}.   (24)

The exponent d_M carries the important information of how modularity scales with length, and separates modular from non-modular networks. The value of d_M is easy to compute in a d-dimensional lattice, since the number of links within any module scales with its bulk, L_i^{in} ~ ℓ_B^d, while the number of links leaving the module scales with the length of its interface, i.e. L_i^{out} ~ ℓ_B^{d−1}. The resulting scaling is M ~ ℓ_B, i.e. d_M = 1. This is also the borderline value that separates non-modular structures (d_M < 1) from modular ones (d_M > 1). These regimes can be explored in the fractal growth model described earlier, in which at every step each node at the ends of a link gives rise to m new links, while x extra links connect new nodes of different modules. When x = 1, the network is a tree, with well-defined modules. Larger values of x mean that a larger number of links connect different modules, creating more loops and 'blurring' the discreteness of the modules, so that we can vary the degree of modularity in the network. For this model, it is also possible to analytically calculate the value of the exponent d_M. During the growth process at step t, the diameter of the network model increases multiplicatively as L(t + 1) = 3L(t). The number of links within a module grows by a factor 2m + x (each of the two nodes at the ends of a link gives rise to m new links, and x extra links connect the new nodes), while the number of links pointing out of a module is by definition proportional to x. Thus, the modularity M(ℓ_B) of a network is proportional to (2m + x)/x. Eq. (24) can then be used to calculate d_M for the model:

(2m + x)/x ~ 3^{d_M},   (25)

which finally yields

d_M = ln(2m/x + 1)/ln 3.   (26)

So, in this model the important quantity that determines the degree of modularity in the system is the ratio of the growth parameters m/x. Most of the real-life networks that have been measured display some sort of modular character, i.e. d_M > 1, although structures with d_M < 1 also exist. Most interesting, though, is the case of d_M values much larger than 1, where a large degree of modularity is observed, and this trend is more pronounced at larger length-scales.

The importance of modularity as described above can be demonstrated in biological networks. There, it has been suggested that the boxes may correspond to functional modules; in protein interaction networks, for example, there may be an evolutionary drive behind the development of the modular structure.

Robustness
Shortly after the discovery of the scale-free property, perhaps the first important application of this structure was the extreme resilience of scale-free networks to removal of random nodes [7, 9, 51, 53, 54, 55, 56]. At the same time, such networks were found to be quite vulnerable to an intentional attack, where nodes are removed in decreasing order of their degree [8, 57]. The resilience of a network is usually quantified through the size of the largest remaining connected cluster, S_max(p), when a fraction p of the nodes has been removed according to a given strategy. At a critical point p_c where this size becomes S_max(p_c) ≃ 0, we consider that the network has been completely disintegrated. For the random removal case, this threshold is p_c ≃ 1, i.e. practically all nodes need to be destroyed. In striking contrast, for intentional attacks p_c is in general of the order of only a few percent, although the exact value depends on the system details.

Fractality in networks considerably strengthens the robustness against intentional attacks, compared to non-fractal networks with the same degree exponent γ. In Fig. 9 the comparison between two such networks clearly shows that the critical fraction p_c increases almost 4 times, from p_c ≃ 0.02 (non-fractal topology) to p_c ≃ 0.09 (fractal topology). These networks have the same γ exponent, the same number of links, number of nodes, number of loops and the same clustering coefficient, differing only in whether hubs are directly connected to each other. The fractal property thus provides a way of increasing resistance against network collapse in the case of a targeted attack.

The main reason behind this behavior is the dispersion of hubs in the network. A hub is usually a central node that helps other nodes connect to the main body of the system. When the hubs are directly connected to each other, this central core is easy to destroy in a targeted attack, leading to a rapid collapse of the network. On the contrary, isolating the hubs in different areas helps the network retain connectivity for longer, since destroying the hubs is then not similarly catastrophic, with most of the nodes finding alternative paths through other connections. The advantage of increased robustness, derived from the combination of modular and fractal network character, may provide valuable hints on why most biological networks have evolved towards a fractal architecture (better chance of survival against lethal attacks).

Degree Correlations
We have already mentioned the importance of hub-hub correlations or anti-correlations for fractality. Generalizing this idea to nodes of any degree, we can ask for the joint degree probability P(k_1, k_2) that a randomly chosen link connects two nodes with degrees k_1 and k_2, respectively. Obviously, this is a meaningful question only for networks with a wide degree distribution; otherwise the answer is more or less trivial, with all nodes having similar degrees. A similar and perhaps more useful quantity is the conditional degree probability P(k_1|k_2), defined as the probability that a random link from a node having degree k_2 points to a node with degree k_1. In general, the following balance condition is satisfied:

k_1 P(k_2|k_1) P(k_1) = k_2 P(k_1|k_2) P(k_2).   (27)

FIG. 9: Vulnerability under intentional attack of a non-fractal Song-Makse-Havlin network (for e = 0) and a fractal Song-Makse-Havlin network (for e = 1). The plot shows the relative size of the largest cluster, S, and the average size of the remaining isolated clusters, ⟨s⟩, as a function of the removal fraction f of the largest hubs for both networks.

It is quite straightforward to calculate P(k_1|k_2) for completely uncorrelated networks. In this case, P(k_1|k_2) does not depend on k_2, and the probability to choose a node with degree k_1 becomes simply P(k_1|k_2) = k_1 P(k_1)/⟨k⟩. In the case where degree-degree correlations are present, though, the calculation of this function is very difficult, even when restricting ourselves to a direct numerical evaluation, due to the emergence of huge fluctuations. We can still estimate this function, though, using again the self-similarity principle.
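The k-bias that enters the uncorrelated expression P(k_1|k_2) = k_1 P(k_1)/⟨k⟩ can be illustrated directly: for any graph, the mean degree seen at the end of a randomly chosen link equals ⟨k²⟩/⟨k⟩, an exact identity. A minimal sketch in plain Python (toy input, not data from the text):

```python
def endpoint_degree_bias(edges):
    # Mean degree at the end of a randomly chosen link, compared with the
    # k-biased prediction <k^2>/<k>; the two coincide exactly for any graph.
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    end_degrees = [deg[x] for e in edges for x in e]
    mean_end = sum(end_degrees) / len(end_degrees)
    ks = list(deg.values())
    prediction = sum(k * k for k in ks) / sum(ks)
    return mean_end, prediction
```

On a star with four leaves both numbers equal 2.5, even though the average degree is only 1.6, showing how strongly a random link is biased towards high-degree endpoints.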
If we consider that the function P(k_1, k_2) remains invariant under the network renormalization scheme described above, then it is possible to show that

P(k_1, k_2) ~ k_1^{−(γ−1)} k_2^{−ε}   (k_1 > k_2),   (28)

and similarly

P(k_1|k_2) ~ k_1^{−(γ−1)} k_2^{−(ε−γ+1)}   (k_1 > k_2).   (29)

In the above equations we have also introduced the correlation exponent ε, which characterizes the degree of correlations in a network. For example, the case of uncorrelated networks is described by the value ε = γ − 1.

The exponent ε can be measured quite accurately using an appropriate quantity. For this purpose, we can introduce a measure such as

E_b(k) ≡ ∫_{bk}^{∞} P(k'|k) dk' / ∫_{bk}^{∞} P(k') dk',   (30)

which estimates the probability that a node with degree k has neighbors with degree larger than bk; here b is an arbitrary parameter that has been shown not to influence the results. It is easy to show that

E_b(k) ~ k^{−ε}/k^{−γ} = k^{−(ε−γ)}.   (31)

This relation allows us to estimate ε for a given network, after calculating the quantity E_b(k) as a function of k.

The above discussion applies equally to fractal and non-fractal networks. If we restrict ourselves to fractal networks, we can develop the theory a bit further. If we consider the probability E(ℓ_B) that the largest-degree node in each box is connected directly to the largest-degree nodes of other boxes (after optimally covering the network), then this quantity scales as a power law with ℓ_B:

E(ℓ_B) ~ ℓ_B^{−d_e},   (32)

where d_e is a new exponent describing the probability of hub-hub connection [18]. The exponent ε, which describes correlations over any degree, is related to d_e, which refers to correlations between hubs only. The resulting relation is

ε = 2 + d_e/d_k = 2 + (γ − 1) d_e/d_B.   (33)

For an infinite fractal dimension, d_B → ∞, which is the onset of non-fractal networks that cannot be described by the above arguments, we have the limiting case ε = 2.
This value separates fractal from non-fractal networks, so that fractality is indicated by ε > 2. Also, we have seen that the line ε = γ − 1 corresponds to uncorrelated networks, so that deviations above or below ε = γ − 1 signal anti-correlated or correlated degrees, respectively. The description of degree-degree correlations is thus condensed into a single exponent ε, which is nevertheless capable of delivering a wealth of information on the network's topological properties.

Diffusion and Resistance
Scale-free networks have been described as objects of infinite dimensionality. For a regular structure this statement would suggest that one can simply use the known diffusion laws for d = ∞. Diffusion on scale-free structures, however, is much harder to study, mainly due to the lack of translational symmetry in the system and the different local environments. Although exact results are still not available, the scaling theory of fractal networks provides the tools to better understand processes such as diffusion and electrical resistance.

In the following, we describe diffusion through the average first-passage time T_{AB}, which is the average time for a diffusing particle to travel from node A to node B. At the same time, assuming that each link in the network has an electrical resistance of 1 unit, we can describe the electrical properties through the resistance R_{AB} between the two nodes A and B.

The connection between diffusion (first-passage time) and electric networks has long been established in homogeneous systems. This connection is usually expressed through the Einstein relation [16]. The Einstein relation is of great importance because it connects a static quantity, R_{AB}, with a dynamic quantity, T_{AB}. In other words, the behavior of a diffusing particle can be inferred simply from knowledge of a static topological property of the network.

In any renormalizable network the scaling of T and R follows the form

T'/T = ℓ_B^{−d_w},   R'/R = ℓ_B^{−ζ},   (34)

where T' (R') and T (R) are the first-passage time (resistance) for the renormalized and original networks, respectively. The dynamical exponents d_w and ζ characterize the scaling in any lattice or network that remains invariant under renormalization. The Einstein relation connects these two exponents through the dimensionality d_B of the substrate, according to

d_w = ζ + d_B.   (35)

The validity of this relation in inhomogeneous complex networks, however, is not yet clear.
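A concrete, exactly solvable instance of this static-dynamic connection is the classical commute-time identity for finite resistor networks, T_{AB} + T_{BA} = 2|E| R_{AB} (a well-known result for unit-resistance graphs, not specific to this article). The sketch below, in plain Python on a toy graph, computes mean first-passage times and the effective resistance by solving the corresponding linear systems:

```python
def solve(A, b):
    # Gauss-Jordan elimination with partial pivoting (small dense systems)
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def hitting_time(adj, target):
    # Mean first-passage times h[v] to `target`: h = 1 + average over neighbors
    nodes = [v for v in adj if v != target]
    idx = {v: i for i, v in enumerate(nodes)}
    A = [[0.0] * len(nodes) for _ in nodes]
    for v in nodes:
        A[idx[v]][idx[v]] = 1.0
        for w in adj[v]:
            if w != target:
                A[idx[v]][idx[w]] -= 1.0 / len(adj[v])
    h = solve(A, [1.0] * len(nodes))
    return {v: h[idx[v]] for v in nodes}

def effective_resistance(adj, a, b):
    # Unit resistors on links: inject 1 A at a, ground b, read the voltage at a
    nodes = [v for v in adj if v != b]
    idx = {v: i for i, v in enumerate(nodes)}
    L = [[0.0] * len(nodes) for _ in nodes]
    rhs = [0.0] * len(nodes)
    for v in nodes:
        L[idx[v]][idx[v]] = float(len(adj[v]))
        for w in adj[v]:
            if w != b:
                L[idx[v]][idx[w]] -= 1.0
    rhs[idx[a]] = 1.0
    return solve(L, rhs)[idx[a]]
```

On a small test graph with 5 links, the sum of the two hitting times between a pair of nodes matches 2·5 times their effective resistance to machine precision, illustrating how a purely static quantity determines a dynamic one.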
Still, in fractal and transfractal networks there are many cases where this relation has been proved to be valid, hinting towards a wider applicability. For example, in Refs. [50, 52] it has been shown that the Einstein relation [16] holds in (u, v)-flowers and (u, v)-trees for any u and v, that is, for both fractal and transfractal networks. In general, in terms of the scaling theory we can study diffusion and resistance (or conductance) in a similar manner [48].

Because of the highly inhomogeneous character of the structure, though, we are interested in how these quantities behave as a function of the end-node degrees k_1 and k_2 when the nodes are separated by a given distance ℓ.

FIG. 10: Typical behavior of the probability distributions for the resistance R vs R' and the diffusion time T vs T', respectively, for a given ℓ_B value. Similar plots for other ℓ_B values verify that the ratios of these quantities during a renormalization stage are roughly constant for all pairs of nodes in a given biological network.

FIG. 11: Average value of the ratio of resistances R/R' and diffusion times T/T', as measured for different ℓ_B values (each point corresponds to a different value of ℓ_B). Results are presented for both biological networks (the metabolic network of E. coli and the protein interaction network of yeast) and two fractal network models with different d_M values (d_M = 1.46 and d_M = 1.26). The slopes of the curves correspond to the exponents ζ/d_B (top panel) and d_w/d_B (bottom panel).

Thus, we are looking for the full dependence of T(ℓ; k_1, k_2) and R(ℓ; k_1, k_2). Obviously, for lattices or networks with a narrow degree distribution there is no degree dependence, and these results should be a function of ℓ only.

For self-similar networks, we can rewrite Eq. (34) above as

T'/T = (N'/N)^{d_w/d_B},   R'/R = (N'/N)^{ζ/d_B},   (36)

where we have taken into account Eq. (3). This approach offers the practical advantage that the variation of N'/N is larger than the variation of ℓ_B, so that the calculation of the exponents can be more accurate. To calculate these exponents, we fix the box size ℓ_B and measure the diffusion time T and resistance R between any two points in the network before and after renormalization. If for every such pair we plot the corresponding times and resistances in T' vs T and R' vs R plots, as shown in Fig. 10, then all these points fall in a narrow area, suggesting a constant value of the ratio T'/T over the entire network. Repeating this procedure for different ℓ_B values yields other ratio values. The plot of these ratios vs N'/N (Fig. 11) finally exhibits a power-law dependence, verifying Eq. (36). We can then easily calculate the exponents d_w and ζ from the slopes in the plot, since the exponent d_B is already known through the standard box-covering methods. It has been shown that the results for many different networks are consistent, within statistical error, with the Einstein relation [48, 50].

The dependence on the degrees k_1, k_2 and the distance ℓ can also be calculated in a scaling form using the self-similarity properties of fractal networks.
After renormalization, a node with degree k in a given network will have a degree k' = ℓ_B^{−d_k} k, according to Eq. (9). At the same time, all distances ℓ are scaled down according to ℓ' = ℓ/ℓ_B.

FIG. 12: Rescaling of (a) the resistance and (b) the diffusion time according to Eqs. (41) and (42) for the protein interaction network of yeast (upper symbols) and the Song-Havlin-Makse model for e = 1 (lower filled symbols). The data for PIN have been vertically shifted upwards by one decade for clarity. Each symbol corresponds to a fixed ratio k_1/k_2 and the different colors denote a different value for k_2. Inset: Resistance R as a function of distance ℓ, before rescaling, for constant ratio k_1/k_2 = 1 and different k_2 values.

This means that Eqs. (36) can be written as

R'(ℓ'; k'_1, k'_2) = ℓ_B^{−ζ} R(ℓ; k_1, k_2),   (37)
T'(ℓ'; k'_1, k'_2) = ℓ_B^{−d_w} T(ℓ; k_1, k_2).   (38)

Substituting the renormalized quantities we get:

R'(ℓ_B^{−1} ℓ; ℓ_B^{−d_k} k_1, ℓ_B^{−d_k} k_2) = ℓ_B^{−ζ} R(ℓ; k_1, k_2).   (39)

The above equation holds for all values of ℓ_B, so we can select this quantity to be ℓ_B = k_2^{1/d_k}. This constraint allows us to reduce the number of variables in the equation, with the final result:

R(ℓ/k_2^{1/d_k}; k_1/k_2, 1) = k_2^{−ζ/d_k} R(ℓ; k_1, k_2).   (40)

This equation suggests a scaling form for the resistance R:

R(ℓ; k_1, k_2) = k_2^{ζ/d_k} f_R(ℓ/k_2^{1/d_k}, k_1/k_2),   (41)

where f_R() is an undetermined function. All the above arguments can be repeated for the diffusion time, with a similar expression:

T(ℓ; k_1, k_2) = k_2^{d_w/d_k} f_T(ℓ/k_2^{1/d_k}, k_1/k_2),   (42)

where the form of the right-hand function may be different. The final result is the scaling form of Eqs. (41) and (42), which is also supported by the numerical data collapse in Fig. 12. Notice that in the case of homogeneous networks, where there is almost no k-dependence, the unknown functions on the rhs reduce to the forms f_R(x, 1) = x^ζ and f_T(x, 1) = x^{d_w}, leading to the well-established classical relations R ~ ℓ^ζ and T ~ ℓ^{d_w}.

Future Directions
Fractal networks combine features met in fractal geometry and in network theory. As such, they present many unique aspects. Many of their properties have been well studied and understood, but a great number of open and unexplored questions remain.

Concerning the structural aspects of fractal networks, we have described that in most networks the degree distribution P(k), the joint degree distribution P(k_1, k_2) and a number of other quantities remain invariant under renormalization. Are there any quantities that are not invariant, and what would their importance be?

Of central importance is the relation of topological features with functionality. The optimal network covering leads to the partitioning of the network into boxes. Do these boxes carry a message other than node proximity? For example, the boxes could be used as an alternative definition for separated communities, and fractal methods could serve as a novel method for community detection in networks [58, 59, 60, 61, 62].

The networks that we have presented are all static, with no temporal component, and time evolution has been ignored in all our discussions above. Clearly, biological networks, the WWW, and other networks have grown (and continue to grow) from some earlier, simpler state to their present fractal form. Has fractality always been there, or has it emerged as an intermediate stage obeying certain evolutionary driving forces? Is fractality a stable condition, or will growing networks eventually fall into a non-fractal form?

Finally, we want to know the inherent reason behind fractality. Of course, we have already described how hub-hub anti-correlations can give rise to fractal networks. However, can this be directly related to some underlying mechanism, so that we gain some information on the process? In general, in Biology we already have some idea of the advantages of adopting a fractal structure.
Still, the question remains: why does fractality exist in certain networks and not in others? Why are both fractal and non-fractal networks needed? It seems that we will be able to increase our knowledge of network evolutionary mechanisms through fractality studies.

In conclusion, a deeper understanding of the self-similarity, fractality and transfractality of complex networks will help us analyze and better understand many fundamental properties of real-world networks.

APPENDIX: THE BOX COVERING ALGORITHMS
The estimation of the fractal dimension and of the self-similar features of networks has become a standard part of the study of real-world systems. For this reason, in the last three years many box covering algorithms have been proposed [24, 63]. This section presents four of the main algorithms, along with a brief discussion of the advantages and disadvantages that they offer.

Recalling the original definition of box covering by Hausdorff [13, 15, 64], for a given network G and box size ℓ_B, a box is a set of nodes where all distances ℓ_ij between any two nodes i and j in the box are smaller than ℓ_B. The minimum number of boxes required to cover the entire network G is denoted by N_B. For ℓ_B = 1, each box encloses only 1 node and therefore N_B is equal to the size N of the network. On the other hand, N_B = 1 for ℓ_B ≥ ℓ_B^{max}, where ℓ_B^{max} is the diameter of the network plus one.

The ultimate goal of a box-covering algorithm is to find the minimum number of boxes N_B(ℓ_B) for any ℓ_B. It has been shown that this problem belongs to the family of NP-hard problems [65], which means that no polynomial-time algorithm for the exact solution is expected to exist. In other words, for a relatively large network there is no algorithm that can provide an exact solution in a reasonably short amount of time. This limitation requires treating the box covering problem with approximations, using for example optimization algorithms.

The greedy coloring algorithm
The box-covering problem can be mapped onto another NP-hard problem [65]: the graph coloring problem. An algorithm that approximates well the optimal solution of this problem was presented in [24]. For an arbitrary value of ℓ_B, first construct a dual network G', in which two nodes are connected if the distance between them in G (the original network) is greater than or equal to ℓ_B. Fig. 13 shows an example of a network G which yields such a dual network G' for ℓ_B = 3 (upper row of Fig. 13).

Vertex coloring is a well-known procedure, where labels (or colors) are assigned to each vertex of a network so that no edge connects two identically colored vertices. It is clear that such a coloring of G' gives rise to a natural box covering of the original network G, in the sense that vertices of the same color will necessarily form a box, since the distance between any two of them must be less than ℓ_B. Accordingly, the minimum number of boxes N_B(G) is equal to the minimum required number of colors (the chromatic number) of the dual network G', χ(G').

In simpler terms: (a) if the distance between two nodes in G is greater than or equal to ℓ_B, these two nodes cannot belong to the same box. By the construction of G', these two nodes will be connected in G' and thus they cannot have the same color; since they have different colors they will not belong to the same box in G. (b) On the contrary, if the distance between two nodes in G is less than ℓ_B, it is possible that these nodes belong to the same box.

FIG. 13: Illustration of the solution for the network covering problem via mapping to the graph coloring problem. Starting from G (upper left panel) we construct the dual network G' (upper right panel) for a given box size (here ℓ_B = 3), where two nodes are connected if they are at a distance ℓ ≥ ℓ_B. We use a greedy algorithm for vertex coloring in G', which is then used to determine the box covering in G, as shown in the plot.

In G'
these two nodes will not be connected, and it is allowed for them to carry the same color, i.e. they may belong to the same box in G (whether they actually do depends on the exact implementation of the coloring algorithm).

The algorithm that follows both constructs the dual network G' and assigns the proper node colors, for all ℓ_B values, in one go. For this implementation a two-dimensional matrix c_{iℓ} of size N × ℓ_B^{max} is needed, whose values represent the color of node i for a given box size ℓ = ℓ_B.

1. Assign a unique id from 1 to N to all network nodes, without assigning any colors yet.
2. For all ℓ_B values, assign the color value 0 to the node with id = 1, i.e. c_{1ℓ} = 0.
3. Set the id value i = 2. Repeat the following until i = N.
(a) Calculate the distance ℓ_ij from i to all the nodes in the network with id j less than i.
(b) Set ℓ_B = 1.
(c) Select one of the unused colors c_{jℓ_{ij}} from all nodes j < i for which ℓ_ij ≥ ℓ_B. This is the color c_{iℓ_B} of node i for the given ℓ_B value.
(d) Increase ℓ_B by one and repeat (c) until ℓ_B = ℓ_B^{max}.
(e) Increase i by 1.

The results of the greedy algorithm may depend on the original coloring sequence. The quality of this algorithm was investigated by randomly reshuffling the coloring sequence and applying the greedy algorithm several times to different models [24]. The result was that the probability distribution of the number of boxes N_B (for all box sizes ℓ_B) is a narrow Gaussian distribution, which indicates that almost any implementation of the algorithm yields a solution close to the optimal one.

Strictly speaking, the calculation of the fractal dimension d_B through the relation N_B ~ ℓ_B^{−d_B} is valid only for the minimum possible value of N_B for any given ℓ_B value, so any box covering algorithm must aim to find this minimum N_B.
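A compact, single-ℓ_B version of the greedy coloring scheme can be sketched as follows (plain Python; distances via BFS, colors assigned in node-id order as described above, which is one possible coloring sequence):

```python
from collections import deque
from itertools import count

def all_distances(adj):
    # All-pairs shortest-path lengths via one BFS per node
    dist = {}
    for s in adj:
        d = {s: 0}
        q = deque([s])
        while q:
            x = q.popleft()
            for y in adj[x]:
                if y not in d:
                    d[y] = d[x] + 1
                    q.append(y)
        dist[s] = d
    return dist

def greedy_boxes(adj, lb):
    # Nodes i, j conflict (are linked in the dual network G', hence need
    # different colors, i.e. different boxes) whenever d(i, j) >= lb.
    dist = all_distances(adj)
    color = {}
    for i in sorted(adj):          # the coloring sequence, in id order
        used = {color[j] for j in color if dist[i][j] >= lb}
        color[i] = next(c for c in count() if c not in used)
    return len(set(color.values()))   # number of boxes N_B
```

On a 4-node path this returns N_B = 4, 2, 1 for ℓ_B = 1, 2, 4, in line with N_B = N at ℓ_B = 1 and N_B = 1 at ℓ_B ≥ diameter + 1.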
Although there is no rule to determine when this minimum value has actually been reached (since this would require an exact solution of the NP-hard coloring problem), it has been shown [66] that the greedy coloring algorithm can, in many cases, identify a coloring sequence which yields the optimal solution.

Burning algorithms
This section presents three box covering algorithms based on the more traditional breadth-first search algorithm. A box is called compact when it includes the maximum possible number of nodes, i.e. when there do not exist any other network nodes that could be included in it. A connected box means that any node in the box can be reached from any other node in the box without having to leave the box. Equivalently, a disconnected box denotes a box where certain nodes can be reached by other nodes in the box only by visiting nodes outside the box. For a demonstration of these definitions see Fig. 14.

FIG. 14: Our definitions for a box that is (a) non-compact for ℓ_B = 3, i.e. could include more nodes, (b) compact, (c) connected, and (d) disconnected (the nodes in the right box are not connected within the box). (e) For this box, the values ℓ_B = 5 and r_B = 2 verify the relation ℓ_B = 2r_B + 1. (f) One of the pathological cases where this relation is not valid, since ℓ_B = 3 and r_B = 2.

FIG. 15: Illustration of the CBB algorithm for ℓ_B = 3. (a) Initially, all nodes are candidates for the box. (b) A random node is chosen, and nodes at a distance further than ℓ_B from this node are no longer candidates. (c) The node chosen in (b) becomes part of the box and another candidate node is chosen. The above process is then repeated until the box is complete.

Burning with the diameter ℓ_B, and the Compact-Box-Burning (CBB) algorithm

The basic idea of the CBB algorithm for the generation of a box is to start from a given box center and then expand the box so that it includes the maximum possible number of nodes, satisfying at the same time the maximum distance ℓ_B between nodes in the box. The CBB algorithm is as follows (see Fig. 15):

1. Initially, mark all nodes as uncovered.
2. Construct the set C of all yet uncovered nodes.
3. Choose a random node p from the candidate set C and remove it from C.
4. Remove from C all nodes i whose distance from p is ℓ_pi ≥ ℓ_B, since by definition they cannot belong in the same box.
5. Repeat steps (3) and (4) until the candidate set is empty; the chosen nodes form a complete box.
6. Repeat from step (2) until the entire network has been covered.

Random Box Burning
In 2006, J. S. Kim et al. presented a simple algorithm for the calculation of the fractal dimension of networks [67, 68, 69]:

1. Pick a randomly chosen node in the network as a seed of the box.
2. Search using a breadth-first search algorithm up to distance ℓ_B from the seed. Assign all newly burned nodes to the new box. If no new node is found, discard the box and start from (1) again.

FIG. 16: Burning with the radius r_B from (a) a hub node or (b) a non-hub node results in very different network coverage. In (a) we need just one box of r_B = 1, while in (b) 5 boxes are needed to cover the same network. This is an intrinsic problem when burning with the radius. (c) Burning with the maximum distance ℓ_B (in this case ℓ_B = 2r_B + 1 = 3) we avoid this situation, since independently of the starting point we would still obtain N_B = 1.
3. Repeat (1) and (2) until all nodes have a box assigned.

This Random Box Burning algorithm has the advantage of being fast and simple. However, at the same time there is no inherent optimization employed during the network coverage. Thus, it is almost certain that this simple Monte-Carlo method will yield a solution far from the optimal one, and one needs to implement many different realizations and retain only the smallest number of boxes found among them.
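The Random Box Burning procedure, together with the 'keep the best of many realizations' remedy just mentioned, can be sketched as follows (plain Python; ℓ_B is used as the burning distance from the seed, following the description above):

```python
import random

def random_box_burning(adj, lb, realizations=50, seed=0):
    # Repeated Monte-Carlo coverings; keep the smallest box count found.
    rng = random.Random(seed)
    best = len(adj)
    for _ in range(realizations):
        covered, n_boxes = set(), 0
        while len(covered) < len(adj):
            s = rng.choice(list(adj))          # random seed of the box
            box, frontier = {s}, {s}
            for _ in range(lb):                # burn up to distance lb
                frontier = {y for x in frontier for y in adj[x]} - box
                box |= frontier
            new = box - covered
            if not new:                        # nothing new burned: discard
                continue
            covered |= new
            n_boxes += 1
        best = min(best, n_boxes)
    return best
```

A single realization often lands on a poor covering (for a 5-node path with ℓ_B = 1, seeding at an end node forces 3 boxes), while the minimum over many realizations recovers the optimal 2 boxes.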
Burning with the radius rB, and the Maximum-Excluded-Mass-Burning (MEMB) algorithm

A box of size ℓB includes nodes where the distance between any pair of nodes is less than ℓB. It is possible, though, to grow a box from a given central node, so that all nodes in the box are within a distance less than a given box radius rB (the maximum distance from the central node). This way one can still recover the same fractal properties of a network. For the original definition of the box, ℓB corresponds to the box diameter (the maximum distance between any two nodes in the box) plus one. Thus, ℓB and rB are connected through the simple relation ℓB = 2rB + 1. This relation is exact for loopless configurations, but there exist cases where it does not hold (Fig. 14).

Burning with the radius from a randomly chosen center always yields the optimal solution for homogeneous (non scale-free) networks, since the choice of the central node is not important. However, in inhomogeneous networks with a wide-tailed degree distribution, such as scale-free networks, this simple approach fails to achieve an optimal solution because of the presence of hubs (Fig. 16).

The MEMB algorithm, in contrast to the Random Box Burning and the CBB, attempts to locate some optimal central nodes, which act as the burning origins for the boxes. It contains as a special case the choice of the hubs as centers of the boxes, but it also allows low-degree nodes to be burning centers, which is sometimes convenient for finding a solution closer to the optimal one.

The algorithm uses the basic idea of box optimization, in which each box covers the maximum possible number of nodes. For a given burning radius rB, we define the excluded mass of a node as the number of uncovered nodes within a chemical distance less than rB. First, calculate the excluded mass for all the uncovered nodes. Then, seek to cover the network with boxes of maximum excluded mass.

FIG. 17: Illustration of the MEMB algorithm for rB = 1. Upper row: calculation of the box centers. (a) We calculate the excluded mass of each node. (b) The node with maximum mass becomes a center and the excluded masses are recalculated. (c) A new center is chosen. Now the entire network is covered with these two centers. Bottom row: calculation of the boxes. (d) Each box initially includes only its center. Starting from the centers we calculate the distance of each network node to the closest center. (e) We assign each node to its nearest box.

The details of this algorithm are as follows (see Fig. 17):

1. Initially, all nodes are marked as uncovered and non-centers.
2. For all non-center nodes (including the already covered nodes) calculate the excluded mass, and select the node p with the maximum excluded mass as the next center.
3. Mark all the nodes with chemical distance less than rB from p as covered.
4. Repeat steps (2) and (3) until all nodes are either covered or centers.

Notice that the excluded mass has to be recalculated in each step, because it may have changed in the meantime. A box center can also be an already covered node, since this may lead to a larger box mass. After the above procedure, the number of selected centers coincides with the number of boxes NB that completely cover the network. However, the non-center nodes have not yet been assigned to a box. This is performed in the next stage:

1. Give a unique box id to every center node.
2. For all nodes, calculate the "central distance", that is, the chemical distance to the nearest center. The central distance has to be less than rB, and the center identification algorithm above guarantees that such a center always exists. Obviously, all center nodes have a central distance equal to 0.
3. Sort the non-center nodes in a list according to increasing central distance.
4. For each non-center node i, at least one of its neighbors has a central distance less than its own. Assign to i the same id as one such neighbor; if several such neighbors exist, select an id among them at random. Remove i from the list.
5. Repeat step (4), following the order of the list built in step (3), for all non-center nodes.

Comparison between algorithms
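To make the comparison concrete, the two stages of MEMB described above (center selection by maximum excluded mass, then assignment of every node to its nearest center) can be sketched in Python. This is an illustrative sketch under assumptions of my own: the names `memb` and `_ball` are invented, the network is given as an adjacency list, the ball around a node is taken to include all nodes up to chemical distance rB, and ties are broken deterministically rather than at random.

```python
from collections import deque

def _ball(adj, source, r_B):
    """All nodes within chemical distance r_B of source (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == r_B:           # do not expand beyond distance r_B
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

def memb(adj, r_B):
    """Sketch of MEMB: returns (centers, node -> box id) for radius r_B."""
    uncovered = set(adj)
    centers = []
    # Stage 1: repeatedly pick the non-center node of maximum excluded
    # mass (number of uncovered nodes in its ball) as the next center.
    # The mass is recomputed from scratch each round, as the text requires.
    while uncovered:
        best = max((n for n in adj if n not in centers),
                   key=lambda n: len(_ball(adj, n, r_B) & uncovered))
        centers.append(best)
        uncovered -= _ball(adj, best, r_B)
    # Stage 2: assign every node to its nearest center via a
    # multi-source BFS (ties broken by traversal order here).
    box_of = {c: i for i, c in enumerate(centers)}
    queue = deque(centers)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in box_of:
                box_of[v] = box_of[u]   # inherit a nearer neighbor's id
                queue.append(v)
    return centers, box_of
```

The repeated mass recalculation makes this sketch quadratic in the network size; it favors clarity over the bookkeeping optimizations one would use in practice.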
The choice of algorithm for a given problem depends on the details of the problem itself. If connected boxes are a requirement, MEMB is the most appropriate algorithm; if one is only interested in obtaining the fractal dimension of a network, the greedy-coloring or the Random Box Burning algorithms are more suitable, since they are the fastest.

As explained previously, any algorithm should aim to find the optimal solution, that is, the minimum number of boxes that cover the network. Fig. 18 shows the performance of each algorithm. The greedy-coloring, the CBB and the MEMB algorithms exhibit a narrow distribution of the number of boxes, evidence that they cover the network with a number of boxes close to the optimal solution. In contrast, the Random Box Burning returns a wider distribution, and its average is far above that of the other algorithms. Because of the great ease and speed with which this technique can be implemented, it would be useful to show that the average number of covering boxes is overestimated by a fixed proportionality constant; in that case, despite the error, the predicted number of boxes would still yield the correct scaling and fractal dimension.

∗ Electronic address: [email protected]
† Electronic address: [email protected]
‡ Electronic address: chaoming [email protected]
§ Electronic address: [email protected]

[1] R. Albert and A.-L. Barabási, Rev. Mod. Phys., 47 (2002); A.-L. Barabási, Linked: How Everything Is Connected to Everything Else and What It Means (Plume, 2003); M. E. J. Newman, SIAM Review, 167 (2003); S. N. Dorogovtsev, J. F. F. Mendes, Advances in Physics; S. N. Dorogovtsev, J. F. F. Mendes, Evolution of Networks: From Biological Nets to the Internet and WWW (Oxford University Press, Oxford, 2003); S. Bornholdt and H. G. Schuster,
Handbook of Graphs and Networks (Wiley-VCH, Berlin, 2003); R. Pastor-Satorras and A. Vespignani, Evolution and Structure of the Internet (Cambridge University Press, Cambridge, UK, 2004); Amaral, L. A. N. & Ottino, J. M. Complex networks - augmenting the framework for the study of complex systems. Eur. Phys. J. B, 147-162 (2004).

FIG. 18: Comparison of the distribution of NB for 10 realizations of the four network covering methods presented in this paper. Notice that three of these methods yield very similar results with narrow distributions and comparable minimum values, while the random burning algorithm fails to reach a value close to this minimum (and yields a broad distribution).

[2] P. Erdős and A. Rényi. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci., 17-61 (1960).
[3] Han, J.-D. J., et al. Nature, 88-93 (2004).
[4] Motter, A. E., de Moura, A. P. S., Lai, Y.-C. and Dasgupta, P. Phys. Rev. E, 065102 (2002).
[5] D. Butler, Nature, 528 (2006).
[6] Faloutsos, M., Faloutsos, P. & Faloutsos, C. Computer Communications Review, 251-262 (1999).
[7] R. Cohen, K. Erez, D. ben-Avraham, S. Havlin, Phys. Rev. Lett., 4626 (2000).
[8] R. Cohen, K. Erez, D. ben-Avraham, S. Havlin, Phys. Rev. Lett., 3682 (2001).
[9] R. Cohen, D. ben-Avraham, S. Havlin, Phys. Rev. E, 036113 (2002).
[10] N. Schwartz, R. Cohen, D. ben-Avraham, A.-L. Barabási, S. Havlin, Phys. Rev. E, 015104 (2002).
[11] L. K. Gallos, R. Cohen, P. Argyrakis, A. Bunde, S. Havlin, Phys. Rev. Lett., 188701 (2005).
[12] J. P. Bagrow, E. M. Bollt, J. D. Skufca, D. ben-Avraham, arXiv:cond-mat/0703470v1 [cond-mat.dis-nn].
[13] Bunde, A. & Havlin, S. Fractals and Disordered Systems, edited by A. Bunde and S. Havlin, 2nd edition (Springer-Verlag, Heidelberg, 1996).
[14] Vicsek, T. Fractal Growth Phenomena, 2nd ed., Part IV (World Scientific, Singapore, 1992).
[15] Feder, J. Fractals (Plenum Press, New York, 1988).
[16] D. ben-Avraham and S. Havlin, Diffusion and Reactions in Fractals and Disordered Systems (Cambridge University Press, Cambridge, 2000).
[17] C. Song, S. Havlin, H. A. Makse, Nature, 392 (2005).
[18] C. Song, S. Havlin, H. A. Makse, Nature Physics, 275 (2006).
[19] K.-I. Goh, G. Salvi, B. Kahng, and D. Kim, Phys. Rev. Lett., 018701 (2006).
[20] B. Mandelbrot, The Fractal Geometry of Nature (W. H. Freeman and Company, New York, 1982).
[21] L. P. Kadanoff, Statistical Physics: Statics, Dynamics and Renormalization (World Scientific, 2000).
[22] H. E. Stanley, Introduction to Phase Transitions and Critical Phenomena (Oxford University Press, Oxford, 1971).
[23] J. J. Binney, N. J. Dowrick, A. J. Fisher, and M. E. J. Newman, The Theory of Critical Phenomena: An Introduction to the Renormalization Group (Oxford University Press, Oxford, 1992).
[24] C. Song, L. K. Gallos, S. Havlin, H. A. Makse, Journal of Statistical Mechanics, P03006 (2007).
[25] Albert, R., Jeong, H. & Barabási, A.-L. Diameter of the World Wide Web. Nature, 130-131 (1999).
[26] Milgram, S. Psychol. Today, 60 (1967).
[27] Bollobás, B. Random Graphs (Academic Press, London, 1985).
[28] Watts, D. J. & Strogatz, S. H. Collective dynamics of "small-world" networks. Nature, 440-442 (1998).
[29] I. Xenarios et al., Nucleic Acids Res., 289-291 (2000).
[30] Database of Interacting Proteins (DIP). http://dip.doe-mbi.ucla.edu
[31] Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N. and Barabási, A.-L. Nature, 651-654 (2000).
[32] Overbeek, R. et al., Nucleic Acids Res., 509 (1999).
[35] J. Cardy, Scaling and Renormalization in Statistical Physics (Cambridge University Press, 1996).
[36] L. P. Kadanoff, Statistical Physics: Statics, Dynamics and Renormalization (World Scientific Publishing Company, 2000).
[37] M. Salmhofer, Renormalization: An Introduction (Springer, 1999).
[38] A. N. Berker and S. Ostlund, J. Phys. C, 4961 (1979).
[39] M. Kaufman and R. B. Griffiths, Phys. Rev. B, 496(R) (1981); 244 (1984).
[40] M. Hinczewski and A. N. Berker, Phys. Rev. E, 066126 (2006).
[41] F. Comellas, Complex Networks: Deterministic Models. Physics and Theoretical Computer Science. From Numbers and Languages to (Quantum) Cryptography. NATO Security through Science Series: Information and Communication Security, Vol. 7. J.-P. Gazeau, J. Nesetril and B. Rovan (Eds). IOS Press, Amsterdam. 348 pages. ISBN 1-58603-706-4. pp. 275-293.
[42] M. E. J. Newman, Phys. Rev. Lett., 208701 (2002).
[43] M. E. J. Newman, Phys. Rev. E, 026126 (2003).
[44] S. Maslov, K. Sneppen, Science, 910-913 (2002).
[45] R. Pastor-Satorras, A. Vázquez, A. Vespignani, Phys. Rev. Lett., 258701 (2001).
[46] H. Jeong, B. Tombor, R. Albert, Z. Oltvai, A.-L. Barabási, Nature, 651-654 (2000).
[47] H. Burch, W. Cheswick. Mapping the Internet. IEEE Computer, 97-98 (1999).
[48] L. K. Gallos, C. Song, S. Havlin, H. A. Makse, PNAS, 7746 (2007).
[49] S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes, Phys. Rev. E, 066122 (2002).
[50] H. Rozenfeld, S. Havlin and D. ben-Avraham, New J. Phys. (2007) 175.
[51] H. Rozenfeld and D. ben-Avraham, Phys. Rev. E, 061102 (2007).
[52] E. Bollt, D. ben-Avraham, New J. Phys., 26 (2005).
[53] R. Albert, H. Jeong and A.-L. Barabási, Nature, 378 (2000).
[54] A. Beygelzimer, G. Grinstein, R. Linsker, I. Rish, Physica A: Statistical Mechanics and its Applications, 593-612 (2005).
[55] M. A. Serrano and M. Boguna, Phys. Rev. Lett., 088701 (2006).
[56] M. A. Serrano and M. Boguna, Phys. Rev. E, 056115 (2006).
[57] L. K. Gallos, P. Argyrakis, A. Bunde, R. Cohen, S. Havlin, Physica A (2004) 504-509.
[58] M. E. J. Newman and M. Girvan, Phys. Rev. E, 026113 (2004).
[59] A. Clauset, M. E. J. Newman, and C. Moore, Phys. Rev. E, 066111 (2004).
[60] J. P. Bagrow and E. M. Bollt, Phys. Rev. E, 046108 (2005).
[61] J. P. Bagrow, arXiv:0706.3880v1 [physics.data-an].
[62] Gergely Palla, A.-L. Barabási and T. Vicsek, Nature, 664-667 (2007).
[63] W.-X. Zhou, Z.-Q. Jiang, D. Sornette, Physica A, 741-752 (2007).
[64] Peitgen, H. O., Jurgens, H., and Saupe, D., Chaos and Fractals: New Frontiers of Science (Springer, 1993).
[65] Garey, M. and Johnson, D., Computers and Intractability: A Guide to the Theory of NP-Completeness (W. H. Freeman, New York, 1979).
[66] Cormen, T. H., Leiserson, C. E., Rivest, R. L. and Stein, C., Introduction to Algorithms (MIT Press, 2001).
[67] J. S. Kim, K.-I. Goh, G. Salvi, E. Oh, B. Kahng and D. Kim, Phys. Rev. E, 016110 (2007).
[68] J. S. Kim, K.-I. Goh, B. Kahng and D. Kim, CHAOS, 026116 (2007).
[69] J. S. Kim, K.-I. Goh, B. Kahng and D. Kim, New J. Phys. 9.