Nonbacktracking expansion of finite graphs
Gábor Timár, Rui A. da Costa, Sergey N. Dorogovtsev, José F. F. Mendes
Departamento de Física da Universidade de Aveiro & I3N, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
A. F. Ioffe Physico-Technical Institute, 194021 St. Petersburg, Russia
Message passing equations yield a sharp percolation transition in finite graphs, as an artifact of the locally treelike approximation. For an arbitrary finite, connected, undirected graph we construct an infinite tree having the same local structural properties as this finite graph, when observed by a nonbacktracking walker. Formally excluding the boundary, this infinite tree is a generalization of the Bethe lattice. We indicate an infinite, locally treelike, random network whose local structure is exactly given by this infinite tree. Message passing equations for various cooperative models on this construction are the same as for the original finite graph, but here they provide the exact solutions of the corresponding cooperative problems. These solutions are good approximations to observables for the models on the original graph when it is sufficiently large and not strongly correlated. We show how to express these solutions in the critical region in terms of the principal eigenvector components of the nonbacktracking matrix. As representative examples we formulate the problems of the random and optimal destruction of a connected graph in terms of our construction, the nonbacktracking expansion. We analyze the limitations and the accuracy of the message passing algorithms for different classes of networks and compare the complexity of the message passing calculations to that of direct numerical simulations. Notably, in a range of important cases, simulations turn out to be more efficient computationally than the message passing.
I. INTRODUCTION
The Bethe lattice is a valuable substrate, on top of which many cooperative models are solved exactly [1] and show critical behaviors with mean-field critical exponents and Gaussian critical fluctuations. A regular Bethe lattice is equivalent to an infinite random regular graph, which is locally treelike and has only infinite cycles (loops) and no boundary. This is in contrast with an infinite Cayley tree, in which the boundary suppresses phase transitions; e.g., in the Ising model on a Cayley tree, finite or infinite, long-range order never emerges. The configuration model [2, 3] of a sparse uncorrelated network and its generalizations represent a heterogeneous version of the Bethe lattice. This basic model also provides exact solutions for percolation and other problems in infinite random networks [4].

Self-consistency equations for the probability of reaching a finite branch by following a link yield the solution of percolation problems for infinite locally treelike random networks [4]. Recently it was found in Ref. [5] that such self-consistency equations produce a good approximation of the percolation behavior of large but still finite, sparse graphs. This conclusion was supported by numerical simulations. The equations are written for a given graph from which a fraction 1 − p of links is removed. These self-consistency equations, termed message passing equations in the context of finite graphs, predict a sharp continuous phase transition for any finite graph, e.g., for the 4-clique (four interconnected nodes). In reality, a continuous percolation phase transition cannot be defined in finite systems. The same is true for other models on finite graphs. In this paper, instead of describing a given finite system with some unknown accuracy, we address the issue of this useful approximation from a complementary perspective. Namely, if the message passing equations in principle cannot precisely describe physical models on finite graphs, then what do they describe?
For each given finite graph, we construct a related infinite network for which these equations provide an exact description of the cooperative models. This infinite network is a generalization of the Bethe lattice with percolation properties that are described exactly by the message passing equations on the finite graph [5]. The closer the original finite graph is to this construction, the better the approximation provided by the message passing equations. We therefore give a physical interpretation of the solutions of these equations for percolation on a finite graph. We express these solutions in the critical region in terms of the principal eigenvector components of the nonbacktracking matrix. As another example we consider the problem of optimal percolation [6, 7]. We show that, using our network construction, an approximate algorithm for finding the so-called decycling number [8] can be formulated in simple terms of the principal eigenvector of the nonbacktracking matrix. We introduce a way to quantify the accuracy of the message passing algorithm and indicate two classes of networks for which the approximation fails. We calculate the time complexity associated with message passing algorithms and show that in a wide range of important situations this method is less efficient than direct numerical simulations.

II. CONSTRUCTION
To build the nonbacktracking expansion (NBE) of an arbitrary finite, connected, undirected graph G, we first construct the tree T(ℓ, i) in the following way. For any given node i of G, T(1, i) is a star graph consisting of a node labeled i in the center, and for any node j in G for which a link j ← i exists, a node labeled j is attached to the center node of T(1, i). For general ℓ > 1 we construct T(ℓ, i) in a recursive way. To any leaf labeled k in T(ℓ − 1, i), we attach a node labeled l if and only if B_{l←k, k←j} = 1, where B is the nonbacktracking matrix of graph G and j is the label of the parent node of leaf k in T(ℓ − 1, i). The nonbacktracking matrix [9, 10] is defined as B_{i←j, k←l} = δ_{jk}(1 − δ_{il}), where i, j and k, l are end node pairs of links in the original graph. Note that both directions must be taken into account for each link. We see that the branches emanating from the root i in T(ℓ, i) correspond exactly to nonbacktracking walks of, at most, length ℓ in the original graph G starting from node i. (Note that some of these walks may be shorter than ℓ, when a dead end in graph G is encountered.) In this construction the nodes of the resulting tree carry only labels of the original finite graph, i = 1, 2, . . . , N, so different nodes in this tree can have the same labels. The same is true for nodes in each individual branch of the tree. Figure 1(b) explains T(ℓ, i) for the finite graph in Fig. 1(a). The construction T(ℓ, i) is also known as a computation tree in the computer science community [11]. Taking the limit ℓ → ∞ and formally excluding the boundary, we obtain a network in which all nodes of the same label i and all links of the same label j ← i are topologically equivalent, in the sense that the structure of any finite neighborhood of each node of label i (and of each link of label j ← i) is the same. This construction is a generalization of the Bethe lattice, and we call it the nonbacktracking expansion of graph G. It is a specific Bethe lattice preserving the local topology of connections of the graph G.

FIG. 1. Construction of (b) the tree T(3, 1) and (c) the branch B(2, ←1) for an example graph (a). The root (starting node) in T(3, 1) is shown by the open circle and the starting link in B(2, ←1) is represented by the dotted line.
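The definition of B can be turned into code directly. The following sketch (plain Python; the function name and the choice of the 4-clique as a test graph are ours, not from the paper) assembles the nonbacktracking matrix of a small graph from the rule B_{i←j, k←l} = δ_{jk}(1 − δ_{il}):

```python
from itertools import combinations

def nonbacktracking_matrix(edges):
    """Build the nonbacktracking matrix B of an undirected graph.

    B is indexed by directed links; B[(i, j)][(k, l)] = 1 iff j == k and
    l != i, i.e., iff the walk ... -> l -> j can be continued to i without
    backtracking (definition B_{i<-j, k<-l} = delta_{jk} (1 - delta_{il})).
    """
    # Each undirected edge {i, j} yields two directed links, i<-j and j<-i.
    links = []
    for i, j in edges:
        links.append((i, j))
        links.append((j, i))
    B = {a: {b: 0 for b in links} for a in links}
    for (i, j) in links:
        for (k, l) in links:
            if j == k and l != i:
                B[(i, j)][(k, l)] = 1
    return links, B

# 4-clique: four fully interconnected nodes.
edges = list(combinations(range(4), 2))
links, B = nonbacktracking_matrix(edges)
print(len(links))  # 6 undirected edges give 12 directed links, so B is 12 x 12
```

For the 4-clique every row of B sums to q_j − 1 = 2: each directed link can be continued in exactly two nonbacktracking ways.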
For example, the NBE of the 4-clique is simply a regular Bethe lattice of coordination number 3.

A more explicit definition of the NBE can be given using the "m-cloned" network [12] of the given graph G. This network is constructed as follows. Let the nodes of G be labeled i = 1, 2, . . . , N. We make m replicas of all these nodes, still labeled i = 1, 2, . . . , N within each of the m replicas. The mN nodes obtained in this way are the nodes of the m-cloned network N(m) defined as follows. Let N(m) be the network that is maximally random under the constraint that if and only if nodes i and j are connected in G, then any of the m nodes labeled i in N(m) is connected to exactly one node labeled j (of, in total, m such nodes). In other words, make m copies of graph G and consider all possible rewirings that preserve the labels of end nodes of all links, with equal statistical weights (uniform randomness). Figure 2 explains this construction for m = 2.

FIG. 2. Construction of network N(2) for the finite graph in Fig. 1(a), according to Ref. [12]. Three of the members of the statistical ensemble of graphs constituting N(2) are shown as examples.

Let us take the limit m → ∞ and consider the infinite network N(m → ∞). This network is locally treelike: any finite neighborhood of any node labeled i is given by the tree T(ℓ, i). The relative number of finite loops in this infinite network is negligible, almost all loops are infinite, and almost all nodes of label i (and links of label j ← i) are topologically equivalent. The network N(m → ∞) gives exactly the NBE of graph G.

To discuss the properties of the NBE it is helpful to define the branch B(ℓ, j ← i) as the tree that is recursively generated by the nonbacktracking walks of length ℓ in G starting from the link j ← i, as shown in Fig. 1(c). Clearly, T(ℓ + 1, i) consists of all branches B(ℓ, j ← i) (all j for which there is a link j ← i in graph G).

III. PROPERTIES OF THE NONBACKTRACKING EXPANSION
Let us consider properties of these constructions. Let n_{l←k}(ℓ, j←i) be the number of links of label l←k on the surface of B(ℓ, j←i) (these are links from a parent node k on the surface of sphere ℓ − 1 to a child node l on the surface of sphere ℓ). Let n(ℓ, j←i) be the size (number of nodes) of the surface of B(ℓ, j←i), so that f_{l←k}(ℓ, j←i) ≡ n_{l←k}(ℓ, j←i)/n(ℓ, j←i) is the relative number of links of label l←k on the surface of B(ℓ, j←i). Every link on the surface of B(ℓ+1, j←i) must emanate from the end node of a link on the surface of B(ℓ, j←i), so

  n_{l←k}(ℓ+1, j←i) = Σ_{l′←k′} n_{l′←k′}(ℓ, j←i) B_{l←k, l′←k′},   (1)

that is,

  n(ℓ+1, j←i) f_{l←k}(ℓ+1, j←i) = n(ℓ, j←i) Σ_{l′←k′} f_{l′←k′}(ℓ, j←i) B_{l←k, l′←k′}.   (2)

Taking ℓ → ∞, the ratio n(ℓ+1, j←i)/n(ℓ, j←i) converges to an asymptotic mean branching b, and the numbers f_{l←k}(ℓ, j←i) must also converge to some asymptotic values f*_{l←k}(j←i). For ℓ → ∞, therefore, we get

  b f*_{l←k}(j←i) = Σ_{l′←k′} f*_{l′←k′}(j←i) B_{l←k, l′←k′}   (3)

or, in vector form,

  b f*(j←i) = B f*(j←i).   (4)

Equation (4) is a right eigenvector equation for the nonbacktracking matrix of the original graph G. According to the Perron-Frobenius theorem [13], the nonbacktracking matrix B has exactly one strictly non-negative right eigenvector, and this is the eigenvector corresponding to the leading eigenvalue. All of the components of f*(j←i) are, by definition, non-negative, therefore the mean branching b is equal to the leading eigenvalue λ of B. We note that the leading eigenvalue and the corresponding right eigenvector are independent of the starting link j←i, so we have the following general result. The asymptotic mean branching of B(ℓ, j←i), for any j←i, is given by the leading eigenvalue λ of the nonbacktracking matrix of graph G.
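This result is easy to check numerically: power iteration on B converges to λ and to the principal right eigenvector. A minimal sketch in plain Python (function names are ours; for the 4-clique the NBE is a Bethe lattice with branching λ = q − 1 = 2):

```python
from itertools import combinations

def leading_eig_nb(edges, iters=200):
    """Power iteration for the leading eigenvalue lambda and the principal
    right eigenvector v of the nonbacktracking matrix.

    The matrix is applied implicitly:
    (B v)_{i<-j} = sum over l in neighbors(j) \ {i} of v_{j<-l}.
    """
    neighbors = {}
    for i, j in edges:
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)
    links = [(i, j) for i in neighbors for j in neighbors[i]]
    v = {a: 1.0 for a in links}
    lam = 0.0
    for _ in range(iters):
        w = {(i, j): sum(v[(j, l)] for l in neighbors[j] if l != i)
             for (i, j) in links}
        lam = max(w.values())            # normalize by the sup norm
        v = {a: w[a] / lam for a in links}
    return lam, v

# 4-clique: every node has degree q = 3, so the branching is lambda = 2.
edges = list(combinations(range(4), 2))
lam, v = leading_eig_nb(edges)
print(round(lam, 6))  # 2.0
```

For a regular graph the uniform vector is already the exact fixed point, so the iteration converges immediately; for irregular graphs more iterations are needed.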
From the construction of T(ℓ, i), it is clear that T(ℓ, i) must also have the same asymptotic mean branching for any root i. The asymptotic relative numbers of links l←k on the surface of B(ℓ, j←i) [or T(ℓ, i)] are given by the components of the principal right eigenvector v^(λ) of B. These asymptotic relative numbers are, again, independent of the starting link (or node).

Now we can find the asymptotic relative number g_i of nodes of label i on the surface [of either B(ℓ, r′←r) or T(ℓ, r), independent of the starting link r′←r or root r]:

  g_i = Σ_{j∈∂i} v^(λ)_{i←j} / Σ_{k=1}^N Σ_{j∈∂k} v^(λ)_{k←j},   (5)

where ∂i denotes the set of nearest neighbors of node i in graph G. The asymptotic relative number of nodes i must also be the same on the layer just beneath the surface, and each node in this layer (apart from dead ends) must be the starting node of a link on the surface, so we can also write

  g_i = (q_i − 1)^{-1} Σ_{j∈∂i} v^(λ)_{j←i} / Σ_{k=1}^N (q_k − 1)^{-1} Σ_{j∈∂k} v^(λ)_{j←k},   (6)

where q_i is the degree of node i and q_i > 1. Equations (5) and (6) give a general relationship between the components of the eigenvector v^(λ). Note that g_i is the asymptotic probability of finding a nonbacktracking random walker at node i after an infinitely long walk. This quantity coincides with the nonbacktracking centrality of [14].

Let us return to n(ℓ, j←i), the size of the surface of B(ℓ, j←i). Let n(ℓ) ≡ Σ_{j←i} n(ℓ, j←i) be the sum of the sizes of all such surfaces, and h_{j←i}(ℓ) ≡ n(ℓ, j←i)/n(ℓ) be the relative size of the surface of B(ℓ, j←i). Then

  n(ℓ+1, j←i) = Σ_{l←k} n(ℓ, l←k) B_{l←k, j←i}   (7)

or

  n(ℓ+1) h_{j←i}(ℓ+1) = n(ℓ) Σ_{l←k} h_{l←k}(ℓ) B_{l←k, j←i}.   (8)

In the limit ℓ → ∞, the ratio n(ℓ+1)/n(ℓ) converges to the asymptotic mean branching b and the numbers h_{j←i}(ℓ) must also converge to some asymptotic values h*_{j←i}.
We can write

  b h*_{j←i} = Σ_{l←k} h*_{l←k} B_{l←k, j←i}   (9)

or, in vector form,

  b h* = h* B,   (10)

which is just a left eigenvector equation for B. The same reasoning applies as before, so b is equal to the leading eigenvalue λ of B. The vector h* is equal to the corresponding left eigenvector w^(λ). Then the asymptotic relative sizes of the surfaces of B(ℓ, j←i) are given by the components of w^(λ). We express the asymptotic relative sizes s_i of the surfaces of T(ℓ, i) in terms of these components:

  s_i = Σ_{j∈∂i} w^(λ)_{j←i} / Σ_{k=1}^N Σ_{j∈∂k} w^(λ)_{j←k}.   (11)

From the symmetric relationship between the left and the right eigenvectors we see that w^(λ)_{j←i} = v^(λ)_{i←j}, therefore

  s_i = g_i.   (12)

Thus the asymptotic relative size of the surface of T(ℓ, i) is equal to the asymptotic relative number of nodes of label i on the surface of any T(ℓ, r), independently of r. This quantity is expressed in terms of the components of the principal (left or right) eigenvector of the nonbacktracking matrix of the original graph G.

It is clear from the above constructions that for an arbitrary graph G, the interior of the infinite network N(m → ∞) is given exactly by T(ℓ → ∞, r), independently of r. Note, however, that the distribution of the numbers of nodes of different labels in T(ℓ → ∞, r) is given by Eq. (5), while the same distribution for N(m → ∞) is uniform: we made the same number m of copies of every node. This discrepancy is reconciled by the presence of infinite loops: nodes in the infinite network N(m → ∞) are "matched up" at infinity to provide the overall uniform distribution of node labels. It is informative to consider the dispersion σ(d) of the relative number of node labels within a distance d of an arbitrarily chosen starting node in both T(ℓ, r) and N(m). Suppose that ℓ and m are such that the number of nodes in both T(ℓ, r) and N(m) is V and that V is large.
The dispersion σ(d) in both cases will converge to

  σ* = ⟨g²⟩ − ⟨g⟩²,   (13)

but will go to 0 for N(m) as d approaches ln V, while it remains the same for any d in the case of T(ℓ, r). Here the variables g are the asymptotic values given by Eq. (5). Figure 3 qualitatively illustrates this behavior.

IV. EXAMPLE PROBLEMS
We now consider the problem of approximating percolation on finite graphs. The critical singularity in percolation occurs only in infinite networks, where the giant connected component can be strictly defined. In a large though finite random graph the so-called scaling window, in which the behavior of the largest connected cluster noticeably deviates from the singular behavior of the giant connected component, is narrow. In a recent work Karrer et al. [5] considered this problem of "approximate" percolation on a finite graph by solving self-consistency equations for the probabilities of reaching finite clusters by following a link in a given direction. Such self-consistency equations provide an exact solution for percolation in infinite locally treelike random networks, and in finite graphs they give an approximation to the behavior of the largest connected cluster. The larger and more "locally treelike" the finite graph is, the better this approximation is expected to be. For an arbitrary connected finite graph the "approximate critical point" is given by the inverse of the leading eigenvalue of the nonbacktracking matrix of the graph. It was shown in Ref. [15] that for infinite graphs, this is a tight lower bound on the exact percolation threshold. The question arises: More than treating it purely as an approximation, what is the exact meaning of the solutions of self-consistency equations on a finite graph?

FIG. 3. Dispersion σ(d) of the relative number of node labels within a distance d of an arbitrarily chosen node in the nonbacktracking expansion N(m) and the tree T(ℓ, r). The number of nodes in both N(m) and T(ℓ, r) is denoted by V.

The results in Ref.
[5] rely on solving the following set of basic equations:

  H_{i←j}(x) = 1 − p + p x Π_{k∈∂j\i} H_{j←k}(x),   (14)

where H_{i←j}(x) is the probability generating function for the distribution of the sizes of finite clusters reachable by following links i←j from i to j. (In the context of finite graphs, the meaning of finite clusters is: not the largest cluster.) Here ∂j\i denotes the set of the nearest neighbors of node j excluding node i, and p is the existence probability of any link. Clearly, in finite graphs, these equations do not hold exactly, but may be approximately correct if the graph in question is large and sparse. In our nonbacktracking expansion [the network N(m → ∞)], however, Eqs. (14) hold exactly, and by solving them we actually obtain the exact percolation results for N(m → ∞). In Ref. [5] it is shown that the value of p at which a non-trivial solution to Eqs. (14) appears is given by the inverse of the leading eigenvalue of the nonbacktracking matrix of the given graph. This coincides with our result that the mean branching of the network N(m → ∞) is also given by the leading eigenvalue of the nonbacktracking matrix. The inverse of this value is a true critical point in our NBE. The order parameter is the relative size of the giant connected component, and is given by

  S = (1/N) Σ_i [1 − Π_{j∈∂i} H_{i←j}(1)],   (15)

where H_{i←j}(1) are the solutions of Eqs. (14) at x = 1, for a given value of p. The degree distribution of any NBE coincides with the degree distribution of the original finite graph, therefore it necessarily has a cutoff: the maximum degree in the original graph. Consequently the order parameter exponent is β = 1, S ≅ Ω(p − p_c)^β, for the NBE of any finite graph.

We express the slope Ω of the order parameter at p_c in terms of the components of the principal eigenvector of the nonbacktracking matrix, which naturally emerges when linearizing Eq. (14). We cannot do this directly, unlike, e.g., in Refs.
[16] and [17], since the nonbacktracking matrix is not symmetric, and so the full set of its eigenvectors does not form an orthogonal basis. To surmount this difficulty, we introduce a new symmetric matrix

  M = A − λ^{-1}(D − I),   (16)

where λ is the leading eigenvalue of the nonbacktracking matrix and matrices A, D, and I are the adjacency matrix, the degree matrix (D_{ij} = q_i δ_{ij}), and the identity matrix, respectively. Matrix M is a special case of the Bethe Hessian matrix [18]. Its eigenvectors already constitute a complete orthogonal basis. The leading eigenvalue of the nonbacktracking matrix, λ, is also an eigenvalue of M, and the corresponding eigenvector components can be expressed in terms of the components of the principal eigenvector of the nonbacktracking matrix. These useful properties of matrix M enable us to calculate the slope of the order parameter at p_c = λ^{-1} (see the Appendix for details of the derivation):

  Ω ≅ λ (Σ_i Σ_{j∈∂i} v^(λ)_{i←j}) (Σ_i (q_i − 1 − λ²)(Σ_{j∈∂i} v^(λ)_{i←j})²) / { N Σ_i (Σ_{j∈∂i} v^(λ)_{i←j}) Σ_{j∈∂i} [ Σ_{k,k′>k∈∂i\j} v^(λ)_{i←k} v^(λ)_{i←k′} − λ Σ_{k,k′>k∈∂j\i} v^(λ)_{j←k} v^(λ)_{j←k′} ] }.   (17)

Figure 4 compares the slopes of the order parameter S(p), calculated in the critical region by solving Eq. (15) for the 4-clique, an Erdős–Rényi random graph, and three real networks, with the slopes found by using formula (17). Note the perfect agreement at p_c. The curves in Fig. 4(b) also reveal the sizes of the regions in which the relation S ≅ Ω(p − p_c) is applicable for these networks.

The slope of the order parameter may also be obtained using a different method, which does not rely on symmetric matrices.
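Before turning to that alternative, the stated relation between B and M = A − λ^{-1}(D − I) can be checked numerically: if B v = λv, then the vector with components u_i = Σ_{j∈∂i} v^(λ)_{i←j} satisfies M u = λu. A sketch in plain Python (the helper name and the small test graph, a triangle and a square sharing a node, are our own choices):

```python
def check_bethe_hessian_relation(edges, max_iters=5000):
    """Verify numerically that u_i = sum_{j in neighbors(i)} v_{i<-j}
    is an eigenvector, with eigenvalue lambda, of M = A - (1/lambda)(D - I),
    where (lambda, v) is the principal right eigenpair of the
    nonbacktracking matrix, obtained by power iteration."""
    nbrs = {}
    for i, j in edges:
        nbrs.setdefault(i, set()).add(j)
        nbrs.setdefault(j, set()).add(i)
    links = [(i, j) for i in nbrs for j in nbrs[i]]
    v = {a: 1.0 for a in links}
    lam = 1.0
    for _ in range(max_iters):
        w = {(i, j): sum(v[(j, l)] for l in nbrs[j] if l != i)
             for (i, j) in links}
        lam = max(w.values())
        new_v = {a: w[a] / lam for a in links}
        delta = max(abs(new_v[a] - v[a]) for a in links)
        v = new_v
        if delta < 1e-15:
            break
    u = {i: sum(v[(i, j)] for j in nbrs[i]) for i in nbrs}
    # (M u)_i = sum_{j in neighbors(i)} u_j - ((q_i - 1)/lambda) u_i
    Mu = {i: sum(u[j] for j in nbrs[i]) - (len(nbrs[i]) - 1) / lam * u[i]
          for i in nbrs}
    err = max(abs(Mu[i] - lam * u[i]) for i in nbrs)
    return lam, err

# Triangle 0-1-2 and square 0-3-4-5 sharing node 0 (a non-regular graph).
edges = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 5), (5, 0)]
lam, err = check_bethe_hessian_relation(edges)
print(lam > 1.0, err < 1e-8)
```

The cycle lengths 3 and 4 are coprime, so the nonbacktracking matrix of this graph is primitive and the power iteration converges.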
This alternative method uses the fact that, for any diagonalizable matrix, the complex conjugate of any left eigenvector corresponding to one eigenvalue is orthogonal to all right eigenvectors corresponding to a different eigenvalue (and vice versa, the conjugate of any right eigenvector corresponding to one eigenvalue is orthogonal to all left eigenvectors corresponding to a different eigenvalue). This method yields a simpler expression for the slope (see the Appendix for details of the derivation):

  Ω = λ² (Σ_{i←j} v^(λ)_{i←j}) (Σ_{i←j} v^(λ)_{j←i} v^(λ)_{i←j}) / [ N Σ_{i←j} v^(λ)_{j←i} Σ_{k,k′>k∈∂j\i} v^(λ)_{j←k} v^(λ)_{j←k′} ].   (18)

Both approaches have their respective merits and they provide equivalent expressions for the slope of the order parameter, Eqs. (17) and (18). The first method introduces a new symmetric matrix, which may be useful for various problems on undirected graphs, where the underlying matrix, the adjacency matrix, is symmetric. The second approach requires only that the relevant matrix be diagonalizable, therefore it may be used to treat a wide range of problems in nonsymmetric systems, e.g., percolation on directed networks. A detailed discussion of these issues will be presented elsewhere.

The above methods can be applied to the general class of problems defined by the set of message passing equations

  a_{i←j} = F_{i←j}(p, {a_{j←k} : k∈∂j\i}),   (19)

with the condition that

  ∂F_{i←j}/∂a_{j←k} |_{p=p_c} = c,   (20)

where c is a constant, independent of i←j and j←k. For any such model the solutions a_{i←j} of Eq. (19) and the order parameter S near p_c can be calculated using our schemes (see the Appendix).

As another example of the application of our results, we consider an algorithm approximating optimal percolation, suggested in Ref. [6].
It was shown that an efficient way of destroying a network (removing the lowest number of nodes in order to disconnect the giant component) is by sequentially removing nodes according to their collective influence CI_ℓ(i). This characteristic of node i is defined as

  CI_ℓ(i) = (q_i − 1) Σ_{j∈∂Ball(i,ℓ)} (q_j − 1),   (21)

where q_i is the degree of node i and ∂Ball(i, ℓ) is the set of the nodes that are at a distance ℓ from node i. In this algorithm, ℓ must be large but smaller than the diameter of the graph under consideration. Let us consider the same problem on the nonbacktracking expansion of a finite graph. We see that in this case the collective influences CI_ℓ(i), taking ℓ → ∞, are just the asymptotic relative sizes of the surfaces of T(ℓ, i), multiplied by q_i − 1. We have already expressed these relative sizes in terms of the components of the principal eigenvector of the nonbacktracking matrix of the original graph [see Eq. (11)]. If the finite graph under consideration is large and has no short loops, we can assume that using our expression is a good approximation to the optimal way of disconnecting the graph.

FIG. 4. (Color online) (a) The order parameter S as a function of p, computed using Eq. (15), for the 4-clique, an Erdős–Rényi graph, the Zachary Karate Club network [19], a subgraph of the Gnutella peer-to-peer file sharing network [20], and a snapshot of the Internet at the autonomous systems level [21]. (b) Normalized slopes S/(Ω(p − p_c)) of the same example networks, obtained using Eq. (17) [or Eq. (18)]. All normalized slopes go to 1 at p = p_c.

An improvement on the collective influence method is suggested in Ref. [7]. It was shown empirically that better attack performance can be achieved by sequentially removing nodes according to the following centrality measure (collective influence propagation):

  CI_p(i) = Σ_{j∈∂i} [ w^(λ)_{i←j} v^(λ)_{i←j} + w^(λ)_{j←i} v^(λ)_{j←i} ],   (22)

where w^(λ)_{i←j} and v^(λ)_{i←j} are the components of the principal left and right eigenvectors of the nonbacktracking matrix, respectively. It was also shown analytically in [7] that removing a node according to Eq. (22) results in the biggest decrease in λ, the leading eigenvalue of the nonbacktracking matrix, i.e., the mean branching of the corresponding NBE. Thus, this method produces "stepwise optimal" percolation. From the result w^(λ)_{j←i} = v^(λ)_{i←j} we see that using the centrality index of Eq. (22) is actually equivalent to using

  CI_p(i) = Σ_{j∈∂i} w^(λ)_{i←j} v^(λ)_{i←j} = Σ_{j∈∂i} v^(λ)_{i←j} v^(λ)_{j←i}.   (23)

Considering the nonbacktracking expansion, we can give a physical interpretation of the product w^(λ)_{i←j} v^(λ)_{i←j}. This is the relative number of infinite paths in the NBE passing through a link i←j (or link j←i). The relative number of infinite paths arriving at a link i←j is given by v^(λ)_{i←j} and the relative number of infinite paths starting from a link i←j is given by w^(λ)_{i←j} (see Sec. III). Summing the product w^(λ)_{i←j} v^(λ)_{i←j} over all neighbors j of node i, we get the relative number of infinite paths in the NBE passing through a node i, which is exactly the CI_p centrality index.
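Since Eq. (23) needs only the right principal eigenvector, CI_p is straightforward to compute. A sketch in plain Python (the function name and the toy graph are ours; we use a triangle and a square sharing a hub node, so every cycle, and hence every infinite nonbacktracking path, passes through the hub):

```python
def ci_p(edges, iters=3000):
    """Collective influence propagation, Eq. (23):
    CI_p(i) = sum_{j in neighbors(i)} v_{i<-j} v_{j<-i},
    where v is the principal right eigenvector of the nonbacktracking
    matrix, obtained here by power iteration."""
    nbrs = {}
    for i, j in edges:
        nbrs.setdefault(i, set()).add(j)
        nbrs.setdefault(j, set()).add(i)
    links = [(i, j) for i in nbrs for j in nbrs[i]]
    v = {a: 1.0 for a in links}
    for _ in range(iters):
        w = {(i, j): sum(v[(j, l)] for l in nbrs[j] if l != i)
             for (i, j) in links}
        norm = max(w.values())
        v = {a: w[a] / norm for a in links}
    return {i: sum(v[(i, j)] * v[(j, i)] for j in nbrs[i]) for i in nbrs}

# Triangle 0-1-2 plus square 0-3-4-5 sharing the hub node 0.
edges = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 5), (5, 0)]
scores = ci_p(edges)
hub = max(scores, key=scores.get)
print(hub)  # 0: the hub carries all infinite nonbacktracking paths
```

A stepwise optimal attack would remove the node with the largest CI_p, recompute the eigenvector on the reduced graph, and repeat.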
One can consider the stepwise optimal removal of edges instead of nodes. For this problem, the expression (23) should be replaced by the edge centrality measure

  CI_e(ij) = w^(λ)_{i←j} v^(λ)_{i←j} = v^(λ)_{i←j} v^(λ)_{j←i}   (24)

for edge ij. Note that in the recent work [22] a method for finding the decycling number of graphs was presented, outperforming all previous approaches to optimal percolation. A computationally much more efficient method, providing a similarly high performance, was introduced in [23].

V. ACCURACY AND LIMITATIONS OF THE MESSAGE PASSING APPROACH
We have shown that when we use message passing equations to approximately solve any given problem on a finite graph, we are actually solving the problem exactly on the nonbacktracking expansion of the given finite graph. At present there is no theory available to determine how good this approximation is in the general case. We now briefly discuss the accuracy of this approximation, through the example of random percolation, and how well it may be expected to work in certain situations.

Note that although we defined the NBE for finite graphs, it may also be defined for infinite networks as a limit of finite networks. From Refs. [5] and [15] we know that for infinite networks the critical point (for percolation) of the NBE is a lower bound on that of the original network. An even stronger statement is also true: the relative size of the giant component in the NBE is an upper bound on that of the original network (for any link activation probability p). This is easily seen by considering the following. The solutions H_{i←j}(1) of Eqs. (14) are a lower bound on the actual probabilities of reaching a finite cluster when following link i←j in the direction of j, due to the presence of loops in the original network (see [5]). According to Eq. (15), therefore, the order parameter of the NBE is an upper bound on that of the original network. With this result in mind we can quantify the badness R of the NBE approximation as

  R = ∫ |S_NBE(p) − S(p)| dp / ∫ S(p) dp,   (25)

where S(p) is the order parameter of the original network and S_NBE(p) is the order parameter of the corresponding NBE, for a given link activation probability p. For infinite networks the absolute value in Eq. (25) is not necessary, as S_NBE(p) − S(p) ≥ 0 for any p in this case. For finite graphs, however, S_NBE(p) − S(p) < 0 for p < p_c,NBE, where p_c,NBE is the critical point of the NBE and S(p) is the relative size of the largest component. Therefore the absolute value in Eq.
(25) is required to make this a universally applicable, meaningful measure of the badness of the approximation. For uncorrelated random networks the NBE coincides with the network itself (in the infinite size limit), therefore R = 0.

We now consider two representative examples of networks, having the same NBE, where the approximation fails: a two-dimensional square lattice and a synthetic modular network given by the following construction. Consider n subnetworks, each of which is a random regular network of m nodes and degree 4. In each of the subnetworks, let us remove two randomly selected links, leaving four nodes of degree 3. Then, let us connect these affected nodes to similar affected nodes in other subnetworks randomly, restoring the uniform degree 4 for every node in the network. In the limit n, m → ∞ this network has no finite loops and has modularity Q = 1, according to the definition of modularity in Ref. [24]. The corresponding NBE is exactly the same as the one for a two-dimensional square lattice. This NBE also coincides with that of a 5-clique; it is a random regular network of degree 4. The above examples are particularly badly approximated by the NBE, as shown in Fig. 5(a).

These two examples represent two classes of networks that are not well approximated by the NBE: low-dimensional networks and highly modular networks. Low-dimensional networks have many short loops and, therefore, are essentially non-treelike. A modification of the original message passing method has been suggested in [27] to deal with clustering and in [28] for networks with loopy motifs. Low dimensionality, however, implies many intertwined loops of many different sizes, which appears to be a significant factor in rendering such systems untreatable by the locally treelike approximation. Interestingly, even if there are no finite loops in the network, the NBE may still be a bad approximation if the modularity is very high [see the model modular network "Communities" in Fig. 5(a)].
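The comparisons in this section can be reproduced in outline with a few dozen lines of code. The sketch below (plain Python; the function names and the 4-clique test graph are ours) solves Eqs. (14)-(15) at x = 1 by fixed-point iteration, starting from H = 0 so that the iteration converges to the stable physical solution, and estimates S(p) for the same finite graph by direct bond-percolation simulation with union-find. Below the message passing critical point the two necessarily disagree: the simulated largest cluster is nonzero while S_NBE vanishes.

```python
import random
from itertools import combinations

def S_message_passing(edges, p, iters=500):
    """Eqs. (14)-(15) at x = 1: iterate
    H_{i<-j} = 1 - p + p * prod_{k in neighbors(j) \ {i}} H_{j<-k},
    then S = (1/N) sum_i [1 - prod_{j in neighbors(i)} H_{i<-j}]."""
    nbrs = {}
    for i, j in edges:
        nbrs.setdefault(i, set()).add(j)
        nbrs.setdefault(j, set()).add(i)
    H = {(i, j): 0.0 for i in nbrs for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in H:
            prod = 1.0
            for k in nbrs[j]:
                if k != i:
                    prod *= H[(j, k)]
            new[(i, j)] = 1.0 - p + p * prod
        H = new
    S = 0.0
    for i in nbrs:
        prod = 1.0
        for j in nbrs[i]:
            prod *= H[(i, j)]
        S += 1.0 - prod
    return S / len(nbrs)

def S_simulation(edges, p, runs=400, seed=0):
    """Mean relative size of the largest cluster under bond percolation,
    estimated with a union-find over the retained links."""
    rng = random.Random(seed)
    nodes = sorted({u for e in edges for u in e})
    idx = {u: t for t, u in enumerate(nodes)}
    N = len(nodes)
    acc = 0.0
    for _ in range(runs):
        parent = list(range(N))
        size = [1] * N
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x
        for i, j in edges:
            if rng.random() < p:
                a, b = find(idx[i]), find(idx[j])
                if a != b:
                    if size[a] < size[b]:
                        a, b = b, a
                    parent[b] = a
                    size[a] += size[b]
        acc += max(size[find(t)] for t in range(N)) / N
    return acc / runs

# 4-clique: the NBE critical point is 1/lambda = 1/2, but on the finite
# graph the largest cluster has relative size >= 1/4 at any p.
edges = list(combinations(range(4), 2))
print(S_message_passing(edges, 0.3))  # essentially 0: below the NBE p_c
print(S_simulation(edges, 0.3))       # at least 0.25 on the finite graph
```

The gap between the two curves at small p is exactly the finite-graph discrepancy that the badness measure R of Eq. (25) integrates over.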
A message passing method for modular networks is suggested in [29]; however, to use it, one needs to know the modular structure of the network a priori. The problem is that there is no unique definition of modularity in networks. Also, finding the community structure of a network, using any meaningful definition of modularity, typically leads to NP-hard decision problems [30]. In Fig. 5(b) we present real-world examples of the two classes of networks discussed above: a planar network (the road network of California [25]) and a modular network (the Amazon copurchase network [26]). The respective badness values are listed in Table I.

FIG. 5. (Color online) Simulation results and message passing solutions for the size of the giant connected component; examples of networks badly approximated by the nonbacktracking expansion. (a) Synthetic networks: a two-dimensional square lattice and a model network of maximal modularity Q = 1. (b) Real-world networks: the road network of California [25] and an Amazon copurchase network [26]. In all cases the simulation results differ significantly from the message passing solutions. See Table I for details.

An instructive example that simultaneously demonstrates the pitfalls and the remarkable power of this approximation is the neural network of the roundworm Caenorhabditis elegans (C. elegans) [31]. We investigated the hermaphrodite C. elegans, combining both chemical and electrical synapses and removing multiple links and self-loops. This neural network contains two obvious communities: the pharynx and the main body of the roundworm. The two communities are separated by only two links, hence the obvious modular structure of the network. Figure 6 shows percolation simulation results and the corresponding message passing solutions for the whole network and also for the two subnetworks
TABLE I. Size N (number of nodes), mean degree ⟨q⟩, and badness value R [defined in Eq. (25)] for the networks considered in Sec. V. (a) Computer generated. (b) Downloaded from: http://snap.stanford.edu/data/. (c) Downloaded from: . (d) Downloaded from: .

FIG. 6. (Color online) Simulation results and message passing solutions for the size of the giant connected component S as a function of the activation probability p, for the neural network of C. elegans [31]. The full neural network consists of two well-defined modules: the main body and the pharynx. The message passing equations are a bad approximation for the network as a whole, but a surprisingly good one for the modules separately. (Note the very small size, N = 51, of the pharynx network!)

(obtained by cutting the two links separating the modules). The NBE approximation is not particularly good when the whole network is considered, but is remarkably accurate for the two subnetworks. These subnetworks are very small and have high clustering coefficients (C = 0.47 for the pharynx and C = 0.26 for the main body), implying that modularity is a much more important factor than clustering in determining the accuracy of the approximation.

Figure 7 shows "less correlated" networks, also studied in [5], where the badness values of the approximation are very close to 0 (see Table I). These networks are small worlds, i.e., infinite-dimensional, and have low modularities. We showed above that low dimensionality and high modularity are two attributes that cause the approximation to fail. Modular networks are ubiquitous
FIG. 7. (Color online) Simulation results and message passing solutions for networks where the nonbacktracking expansion gives an excellent approximation: an Erdős–Rényi network, a subgraph of the Gnutella peer-to-peer file sharing network [20], and a snapshot of the Internet at the level of autonomous systems [21]. The simulation results practically coincide with the message passing curves.

in nature; therefore, such excellent agreement as in Fig. 7 (or Ref. [5]) may not actually be a very common case.

Finally, we discuss another limitation of message passing methods: their time complexity. A significant advantage of using message passing equations, as opposed to doing simulations, is that, supposedly, the computational time required to find the solutions is much less than the time involved in doing the actual simulations and averaging over many realizations. This is not always true, however. The number of operations required in message passing methods (i.e., when solving the message passing equations by iteration) is

\[ T_{\mathrm{MP}} \sim n_{\mathrm{conv}} N \langle q \rangle \sim n_{\mathrm{conv}} L, \tag{26} \]

where n_conv denotes the number of iterations needed for convergence and L is the number of links in the network. To achieve this complexity one may do the following. Considering the percolation problem, Eq. (14), at each iteration, for every node, first compute the product of all incoming messages. This operation has time complexity ∼ L. Then, to update each outgoing message from every node, divide the precalculated product for the starting node of this message by the incoming message along the same link. This, again, has complexity ∼ L, resulting in the overall complexity given in Eq. (26). In order for the message passing equations to converge to a solution, the messages in any part of the network must receive information from every other part of the network. Therefore, n_conv cannot be less than the network diameter, i.e., the longest of all the shortest paths between node pairs in the network.
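The O(L)-per-iteration scheme just described can be sketched in Python. This is a minimal illustration, not the authors' code: the update is written for the percolation messages in the form a_{i←j} = p[1 − Π_{k∈∂j\i}(1 − a_{j←k})], and the division trick assumes messages stay strictly below 1, which holds for p < 1.

```python
def mp_giant_component(adj, p, n_iter=200):
    """Iterate the percolation message passing equations with ~L operations
    per sweep: precompute the full product of (1 - a) factors at every
    node, then divide out one factor per outgoing message."""
    a = {(i, j): 0.5 for j in adj for i in adj[j]}   # a[(i, j)] = a_{i<-j}
    for _ in range(n_iter):
        prod = {j: 1.0 for j in adj}                 # ~L operations
        for j in adj:
            for k in adj[j]:
                prod[j] *= 1.0 - a[(j, k)]
        a = {(i, j): p * (1.0 - prod[j] / (1.0 - a[(j, i)]))   # ~L more
             for j in adj for i in adj[j]}
    # order parameter: probability that a random node is in the giant cluster
    S = 0.0
    for i in adj:
        prod_i = 1.0
        for j in adj[i]:
            prod_i *= 1.0 - a[(i, j)]
        S += p * (1.0 - prod_i)
    return S / len(adj)

# toy check on a 4-clique, where p_c = 1/(q - 1) = 1/2:
K4 = {i: [j for j in range(4) if j != i] for i in range(4)}
S_above = mp_giant_component(K4, p=0.9)   # well above p_c
S_below = mp_giant_component(K4, p=0.3)   # below p_c, S -> 0
```

Each sweep touches every directed link a constant number of times, which is exactly the T_MP ∼ n_conv L scaling of Eq. (26).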
In a typical network demonstrating the small-world effect, the diameter grows logarithmically with the system size N. In this case n_conv can be expected to grow at least as n_conv ∼ ln N, and so for networks with sufficiently rapidly decaying degree distributions T_MP ∼ N ln N. (Note that in networks with a divergent second moment of the degree distribution, the diameter may increase with N even more slowly than logarithmically [32, 33]. At the other extreme, namely in the case of a chain, the diameter grows as N.) Let us compare these results with the time complexity of doing actual simulations.

Say we want to determine the size of the "giant" percolating cluster in a large graph to a given accuracy (given standard error). The quantity we are interested in is an average over n samples, S = ⟨s⟩, where s is a binary quantity (s = 1 or s = 0) indicating whether or not a given node of the graph is in the giant component. The standard error of the estimate for S is simply σ/√n, where σ is the standard deviation of s. Therefore, for a given desired error in S, the number of node samples needed is independent of the system size. To determine whether a node belongs to the giant component, we must explore the neighborhood of this node (e.g., using a simple breadth-first search), visiting up to l other nodes, where l ≫ l*. Here l* is the typical size of a finite cluster. The typical cluster size depends on the distance from the critical point but not on the system size. Considering the above, we arrive at the following conclusion. At a given distance from the critical point, for a given desired accuracy, the order parameter can be determined using simulations in constant time for any network architecture. This is in strong contrast with the message passing approach, where the accuracy is unknown and the time complexity is significantly higher.
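The constant-time sampling estimate described above can be sketched as follows. This is our illustration, not the authors' code; for simplicity the site dilution is redrawn for each sampled node, so the result approximates, rather than exactly reproduces, the quenched average.

```python
import random
from collections import deque

def estimate_S(adj, p, n_samples=400, l_max=100, seed=0):
    """Monte-Carlo estimate of the giant-component size S: sample random
    nodes and declare a node to be in the giant component when a
    breadth-first search through active nodes reaches more than l_max
    others (l_max should far exceed the typical finite-cluster size l*).
    Returns the estimate and its standard error sigma/sqrt(n)."""
    rng = random.Random(seed)
    nodes = list(adj)
    hits = 0
    for _ in range(n_samples):
        active = {}                       # lazily drawn node activations
        def is_active(v):
            if v not in active:
                active[v] = rng.random() < p
            return active[v]
        start = rng.choice(nodes)
        if not is_active(start):
            continue                      # inactive nodes are never in S
        seen, queue = {start}, deque([start])
        while queue and len(seen) <= l_max:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen and is_active(v):
                    seen.add(v)
                    queue.append(v)
        hits += len(seen) > l_max
    S = hits / n_samples
    return S, (S * (1.0 - S) / n_samples) ** 0.5

# toy check on a 200-node complete graph, where finite clusters are tiny:
K = {i: [j for j in range(200) if j != i] for i in range(200)}
S_high, err = estimate_S(K, p=0.9)
S_low, _ = estimate_S(K, p=0.05)
```

The cost per sample is bounded by the exploration cutoff l_max, not by N, which is the point of the constant-time argument in the text.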
While the message passing method may be faster in certain small networks, it is always inferior to simple simulations for sufficiently large networks.

VI. DISCUSSION AND EXTENSIONS
Message passing equations often provide a good approximation to the behavior of interacting models on large sparse networks. In this paper, for any finite graph, we have presented a network construction for which the message passing equations are exact, thereby giving a physical interpretation of this approximation. For a wide range of problems represented by message passing equations, we have shown how to express the solutions near the critical point in terms of the principal eigenvector components of the nonbacktracking matrix.

It is well established that this approximation should work well for infinite-dimensional (small-world) networks which are sparse and have no strong degree-degree correlations. It is not clear how well it should work for networks of low dimensionality, many loops, and correlations. We have introduced a way of quantifying the accuracy of the approximation and elaborated on two distinct classes of networks where this approach fails: low-dimensional and highly modular networks. As real-world examples of these cases we considered the road network of California and an Amazon copurchase network, where the message passing equations, indeed, proved a bad approximation.

Message passing methods based on the locally treelike approximation have a wide range of applicability and, in some cases, provide good results. However, as we have shown, for increasingly large systems these methods have a high computational cost compared with simple simulations. Also, while one can always find an estimate of the error associated with simulations in a straightforward way, the accuracy of message passing methods is entirely unknown.

All the results in this paper apply also to weighted graphs. In this case one must work with a weighted nonbacktracking matrix, obtained by multiplying each row of the unweighted nonbacktracking matrix by the weight of the corresponding link.
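For concreteness, here is one way to build the (weighted) nonbacktracking matrix from a link list. This is a sketch of ours, dense and O(L²), suitable only for small graphs; the row-weighting convention follows the description above.

```python
import numpy as np

def nonbacktracking_matrix(links, weights=None):
    """Nonbacktracking matrix B of an undirected graph, indexed by the 2L
    directed edges i->j: B[(i->j), (j->k)] = 1 for every k != i.  In the
    weighted variant, each row i->j is multiplied by the weight of the
    link (i, j)."""
    directed = [(i, j) for i, j in links] + [(j, i) for i, j in links]
    index = {e: n for n, e in enumerate(directed)}
    w = {}
    for n, (i, j) in enumerate(links):
        w[(i, j)] = w[(j, i)] = 1.0 if weights is None else weights[n]
    B = np.zeros((len(directed), len(directed)))
    for (i, j), row in index.items():
        for (j2, k), col in index.items():
            if j2 == j and k != i:        # step j->k may follow i->j
                B[row, col] = w[(i, j)]
    return B, directed

# percolation threshold estimate p_c = 1/lambda_1 for a 4-clique:
links = [(i, j) for i in range(4) for j in range(i + 1, 4)]
B, _ = nonbacktracking_matrix(links)
lam1 = max(np.linalg.eigvals(B).real)     # leading eigenvalue is real
p_c = 1.0 / lam1                          # = 1/2 for a q-regular graph, q = 3
```

For large networks one would instead keep B implicit and use a sparse eigensolver or power iteration, but the indexing convention is the same.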
Using this modified matrix, the critical point and the solutions near the critical point are determined exactly as in the unweighted case.

The network construction presented here may be generalized straightforwardly to directed graphs. The directed percolation critical point is also given by the inverse of the leading eigenvalue of the nonbacktracking matrix. Solutions near the critical point can be found using the method we introduced for nonsymmetric systems (see Method 2 in the Appendix).

The nonbacktracking expansion can be generalized to various percolation problems for interdependent and multiplex networks [34–40]. In the case of multiplex networks, the only difference from single-layer (simplex) networks is that the resulting infinite network (the NBE) will consist of links of different colors, corresponding to links of different layers. The message passing equations written for any given model on finite multiplex networks [35] will be exact in the corresponding NBE, just as in simplex networks. A number of percolation problems on multiplex and interdependent networks exhibit discontinuous transitions. In order to retain the interesting physics in these systems, one has to account for the nonlinear terms in the basic equations [38]. Finding analytical solutions similar to our Eqs. (17) and (18) at discontinuous transitions in multiplex networks is an interesting challenge for the future.

We thank L. Zdeborová for useful comments. This work was supported by the FET proactive IP project MULTIPLEX 317532.
APPENDIX: FINDING SOLUTIONS NEAR p_c

We stated that in the class of physical problems on undirected networks defined by Eqs. (19) and (20) the solution near p_c can be expressed in terms of the components of the principal eigenvector of the nonbacktracking matrix. Here we derive this expression in the particular case of the message passing equations, (14), for percolation. We expand the message passing equations, Eq. (14), as

\[ a_{i\leftarrow j} = p \Big[ \sum_{k \in \partial j \setminus i} a_{j\leftarrow k} - \sum_{k,k'>k \,\in\, \partial j \setminus i} a_{j\leftarrow k} a_{j\leftarrow k'} + \ldots \Big], \tag{A1} \]

where a_{i←j} = 1 − H_{i←j}(1). The set of eigenvectors of the nonbacktracking matrix, v^{(λ)}, forms a complete basis, enabling us to express the vector a = {a_{i←j}} as a linear combination of the eigenvectors,

\[ a = \sum_{\lambda} C_{\lambda} v^{(\lambda)}. \tag{A2} \]

Notice that both a and the coefficients C_λ vary with p, approaching 0 when δ = p − p_c → 0. Replacing a_{i←j} with Σ_λ C_λ v^{(λ)}_{i←j} in Eq. (A1), we obtain the equation

\[ \sum_{\lambda} C_{\lambda} v^{(\lambda)}_{i\leftarrow j} (p\lambda - 1) = p \sum_{\lambda,\lambda'} C_{\lambda} C_{\lambda'} \sum_{k,k'>k \,\in\, \partial j \setminus i} v^{(\lambda)}_{j\leftarrow k} v^{(\lambda')}_{j\leftarrow k'} + \ldots, \tag{A3} \]

where we have used the identity Σ_{k∈∂j\i} v^{(λ)}_{j←k} = λ v^{(λ)}_{i←j}, since v^{(λ)} is the eigenvector of the nonbacktracking matrix with eigenvalue λ. Since all C_λ approach 0 as δ → 0, Eq. (A3) implies that there is an eigenvalue λ_1 = 1/p_c such that C_{λ_1} = O(δ), and that for λ ≠ λ_1 the coefficients C_λ approach 0 as δ² or more rapidly. Furthermore, by the Perron–Frobenius theorem, the eigenvalue whose eigenvector has all non-negative components is real, positive, and largest in absolute value (hence the subscript 1). In this way we find that at small δ

\[ a_{i\leftarrow j} \cong C_{\lambda_1} v^{(\lambda_1)}_{i\leftarrow j}, \tag{A4} \]

where v^{(λ_1)}_{i←j} are the components of the principal eigenvector of the nonbacktracking matrix. From this point onwards one may use two different approaches. Method 1 introduces an auxiliary symmetric matrix to proceed with the derivation in the standard way.
Method 2 exploits a useful property of the left and right eigenvectors of diagonalizable matrices, resulting in a much shorter derivation and a more compact final result.
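The property that Method 2 exploits for undirected graphs, namely that reversing the directed edges of a right eigenvector of B yields a left eigenvector with the same eigenvalue (w_{i←j} = v_{j←i}), can be checked numerically. This sketch is ours, and the small test graph is arbitrary.

```python
import numpy as np

# Build the nonbacktracking matrix of a small undirected graph (a square
# with one diagonal), take the principal right eigenvector, reverse each
# directed edge, and verify that the result is a left eigenvector.
links = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
directed = links + [(j, i) for i, j in links]
idx = {e: n for n, e in enumerate(directed)}
B = np.zeros((len(directed), len(directed)))
for (i, j) in directed:
    for (j2, k) in directed:
        if j2 == j and k != i:            # step j->k may follow i->j
            B[idx[(i, j)], idx[(j2, k)]] = 1.0

vals, vecs = np.linalg.eig(B)
lead = np.argmax(vals.real)               # principal (Perron) eigenvalue
lam, v = vals[lead], vecs[:, lead]
w = np.array([v[idx[(j, i)]] for (i, j) in directed])   # reversed edges
assert np.allclose(w @ B, lam * w)        # w is a left eigenvector of B
```

The check rests on the identity B^T = P B P, where P is the edge-reversal permutation, which holds for the nonbacktracking matrix of any undirected graph.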
Method 1
Let us consider the following sums, over the set of nearest neighbors ∂i of node i,

\[ b_{i\leftarrow} = \sum_{j \in \partial i} a_{i\leftarrow j} \quad \text{and} \quad b_{\leftarrow i} = \sum_{j \in \partial i} a_{j\leftarrow i}. \tag{A5} \]

We sum both sides of Eq. (A1):

\[ b_{i\leftarrow} = p \Big[ \sum_{j \in \partial i} b_{j\leftarrow} - b_{\leftarrow i} - \sum_{j \in \partial i} \sum_{k,k'>k \,\in\, \partial j \setminus i} a_{j\leftarrow k} a_{j\leftarrow k'} + \ldots \Big], \]
\[ b_{\leftarrow i} = p \Big[ (q_i - 1)\, b_{i\leftarrow} - \sum_{j \in \partial i} \sum_{k,k'>k \,\in\, \partial i \setminus j} a_{i\leftarrow k} a_{i\leftarrow k'} + \ldots \Big]. \tag{A6} \]

Replacing b_{←i} in the first of these equations with the right-hand side of the second equation, we obtain

\[ b_{i\leftarrow} = p \sum_{j \in \partial i} b_{j\leftarrow} - p^2 (q_i - 1)\, b_{i\leftarrow} + p \sum_{j \in \partial i} \Big[ p \sum_{k,k'>k \,\in\, \partial i \setminus j} a_{i\leftarrow k} a_{i\leftarrow k'} - \sum_{k,k'>k \,\in\, \partial j \setminus i} a_{j\leftarrow k} a_{j\leftarrow k'} \Big] + \ldots, \tag{A7} \]

where q_i is the degree of node i. Let us introduce the matrix M with elements M_ij defined as

\[ M_{ij} = \begin{cases} A_{ij} & \text{if } i \neq j, \\ -p_c (q_i - 1) & \text{if } i = j, \end{cases} \tag{A8} \]

where A_ij are the elements of the adjacency matrix. The matrix M is symmetric, and so we can express b = {b_{1←}, b_{2←}, …, b_{N←}} as a linear combination of its eigenvectors, u^{(λ̃)}, which form a complete orthogonal basis,

\[ b = \sum_{\tilde\lambda} C_{\tilde\lambda} u^{(\tilde\lambda)}. \tag{A9} \]

In the following we denote the eigenvalues of M by λ̃. These eigenvalues are different from the eigenvalues λ of the nonbacktracking matrix. We replace b_{i←} and a_{i←j} in Eq. (A7) with their decompositions, Eqs. (A9) and (A4), respectively:

\[ \sum_{\tilde\lambda} C_{\tilde\lambda} u^{(\tilde\lambda)}_i (p_c \tilde\lambda - 1) - \delta \sum_{\tilde\lambda} \big[ p_c (q_i - 1) - \tilde\lambda \big] C_{\tilde\lambda} u^{(\tilde\lambda)}_i = - p_c \sum_{\lambda,\lambda'} C_{\lambda} C_{\lambda'} \sum_{j \in \partial i} \Big[ p_c \sum_{k,k'>k \,\in\, \partial i \setminus j} v^{(\lambda)}_{i\leftarrow k} v^{(\lambda')}_{i\leftarrow k'} - \sum_{k,k'>k \,\in\, \partial j \setminus i} v^{(\lambda)}_{j\leftarrow k} v^{(\lambda')}_{j\leftarrow k'} \Big] + \ldots. \tag{A10} \]

This equation implies that among the eigenvalues λ̃ there is one, say λ̃*, equal to 1/p_c = λ_1. Note that the eigenvalue λ̃* is not the leading eigenvalue of M. Note also that, according to Eqs. (A4), (A5), and (A9), C_{λ̃*} u^{(λ̃*)}_i = C_{λ_1} Σ_{j∈∂i} v^{(λ_1)}_{i←j} ∼ δ, while the rest of the coefficients C_λ and C_{λ̃} approach 0 as δ² or more rapidly. To obtain C_{λ_1} we multiply both sides of Eq. (A10) by u^{(λ̃*)}_i and sum over i. The result is

\[ C_{\lambda_1} \cong \frac{\lambda_1 \sum_i (q_i - 1 - \lambda_1^2) \Big( \sum_{j \in \partial i} v^{(\lambda_1)}_{i\leftarrow j} \Big)^2}{\sum_i \Big( \sum_{j \in \partial i} v^{(\lambda_1)}_{i\leftarrow j} \Big) \sum_{j \in \partial i} \Big( \sum_{k,k'>k \,\in\, \partial i \setminus j} v^{(\lambda_1)}_{i\leftarrow k} v^{(\lambda_1)}_{i\leftarrow k'} - \lambda_1 \sum_{k,k'>k \,\in\, \partial j \setminus i} v^{(\lambda_1)}_{j\leftarrow k} v^{(\lambda_1)}_{j\leftarrow k'} \Big)}\, \delta. \tag{A11} \]

Note that the contribution of the first term on the left-hand side of Eq. (A10) is 0 because the eigenvectors u^{(λ̃)} are orthogonal. Finally, we express the order parameter near p_c in terms of the principal eigenvector components of the nonbacktracking matrix as

\[ S = \frac{1}{N} \sum_i \Big[ 1 - \prod_{j \in \partial i} H_{i\leftarrow j}(1) \Big] \cong \frac{1}{N} \sum_i \sum_{j \in \partial i} a_{i\leftarrow j} \cong \frac{C_{\lambda_1}}{N} \sum_i \sum_{j \in \partial i} v^{(\lambda_1)}_{i\leftarrow j} = \Omega \cdot (p - p_c), \tag{A12} \]

where

\[ \Omega \cong \frac{\lambda_1 \Big( \sum_i \sum_{j \in \partial i} v^{(\lambda_1)}_{i\leftarrow j} \Big) \sum_i (q_i - 1 - \lambda_1^2) \Big( \sum_{j \in \partial i} v^{(\lambda_1)}_{i\leftarrow j} \Big)^2}{N \sum_i \Big( \sum_{j \in \partial i} v^{(\lambda_1)}_{i\leftarrow j} \Big) \sum_{j \in \partial i} \Big( \sum_{k,k'>k \,\in\, \partial i \setminus j} v^{(\lambda_1)}_{i\leftarrow k} v^{(\lambda_1)}_{i\leftarrow k'} - \lambda_1 \sum_{k,k'>k \,\in\, \partial j \setminus i} v^{(\lambda_1)}_{j\leftarrow k} v^{(\lambda_1)}_{j\leftarrow k'} \Big)}. \tag{A13} \]

Method 2
Let us denote the right eigenvectors of the nonbacktracking matrix v^{(λ)} and the left eigenvectors w^{(λ)}. The complex conjugate of any v^{(λ)} is orthogonal to all w^{(λ')} (λ ≠ λ'), and the conjugate of any w^{(λ)} is orthogonal to all v^{(λ')} (λ ≠ λ'). [Indeed, let λ ≠ λ' be two different eigenvalues of B. Then clearly λ w^{(λ)} v^{(λ')} = w^{(λ)} B v^{(λ')} = λ' w^{(λ)} v^{(λ')}, giving (λ − λ') w^{(λ)} v^{(λ')} = 0, and therefore w^{(λ)} v^{(λ')} = 0.]

Recall Eq. (A3),

\[ \sum_{\lambda} C_{\lambda} v^{(\lambda)}_{i\leftarrow j} (p\lambda - 1) = p \sum_{\lambda,\lambda'} C_{\lambda} C_{\lambda'} \sum_{k,k'>k \,\in\, \partial j \setminus i} v^{(\lambda)}_{j\leftarrow k} v^{(\lambda')}_{j\leftarrow k'} + \ldots, \]

and that v^{(λ)}_{i←j} are components of the right eigenvector of the nonbacktracking matrix corresponding to eigenvalue λ. Multiplying both sides by w^{(λ_1)}_{i←j} and summing over i←j, we see that the terms with λ ≠ λ_1 become 0. This procedure results in a considerably simpler expression for the coefficient C_{λ_1},

\[ C_{\lambda_1} \cong \frac{\lambda_1^2 \sum_{i\leftarrow j} v^{(\lambda_1)}_{j\leftarrow i} v^{(\lambda_1)}_{i\leftarrow j}}{\sum_{i\leftarrow j} v^{(\lambda_1)}_{j\leftarrow i} \sum_{k,k'>k \,\in\, \partial j \setminus i} v^{(\lambda_1)}_{j\leftarrow k} v^{(\lambda_1)}_{j\leftarrow k'}}\, \delta, \tag{A14} \]

where we have used the property of the nonbacktracking matrix of undirected graphs, w^{(λ)}_{i←j} = v^{(λ)}_{j←i}, for simplification. The singularity of the giant component, S = Ω δ, has the amplitude, Eq. (A12),

\[ \Omega = \frac{\lambda_1^2 \Big( \sum_{i\leftarrow j} v^{(\lambda_1)}_{i\leftarrow j} \Big) \Big( \sum_{i\leftarrow j} v^{(\lambda_1)}_{j\leftarrow i} v^{(\lambda_1)}_{i\leftarrow j} \Big)}{N \sum_{i\leftarrow j} v^{(\lambda_1)}_{j\leftarrow i} \sum_{k,k'>k \,\in\, \partial j \setminus i} v^{(\lambda_1)}_{j\leftarrow k} v^{(\lambda_1)}_{j\leftarrow k'}}. \tag{A15} \]

Equations (A14) and (A15) are equivalent to the expressions, Eqs. (A11) and (A13).

[1] R. J. Baxter, Exactly Solved Models in Statistical Mechanics (Courier Corp., Mineola, NY, 2007).
[2] E. A. Bender and E. R. Canfield, "The asymptotic number of labeled graphs with given degree sequences," J. Combin. Theory Ser. A, 296 (1978).
[3] B. Bollobás, "A probabilistic proof of an asymptotic formula for the number of labelled regular graphs," Eur. J. Combin., 311 (1980).
[4] M. E. J. Newman, S. H. Strogatz, and D. J. Watts, "Random graphs with arbitrary degree distributions and their applications," Phys. Rev. E, 026118 (2001).
[5] B. Karrer, M. E. J. Newman, and L. Zdeborová, "Percolation on sparse networks," Phys. Rev. Lett., 208702 (2014).
[6] F. Morone and H. A. Makse, "Influence maximization in complex networks through optimal percolation," Nature, 65 (2015).
[7] F. Morone, B. Min, L. Bo, R. Mari, and H. A. Makse, "Collective influence algorithm to find influencers via optimal percolation in massively large social media," Sci. Rep., 30062 (2016).
[8] S. Bau, N. C.
Wormald, and S. Zhou, "Decycling numbers of random regular graphs," Random Struct. Algor., 397 (2002).
[9] K. Hashimoto, "Zeta functions of finite graphs and representations of p-adic groups," in Automorphic Forms and Geometry of Arithmetic Varieties: Advanced Studies in Pure Mathematics, Vol. 15, edited by K. Hashimoto and Y. Namikawa (Kinokuniya, Tokyo, 1989), p. 211.
[10] F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborová, and P. Zhang, "Spectral redemption in clustering sparse networks," Proc. Natl. Acad. Sci. USA, 20935 (2013).
[11] Y. Weiss and W. T. Freeman, "On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs," IEEE Trans. Info. Theory, 736 (2001).
[12] A. Faqeeh, S. Melnik, and J. P. Gleeson, "Network cloning unfolds the effect of clustering on dynamical processes," Phys. Rev. E, 052807 (2015).
[13] H. Minc, Nonnegative Matrices (John Wiley and Sons, New York, 1988).
[14] T. Martin, X. Zhang, and M. E. J. Newman, "Localization and centrality in networks," Phys. Rev. E, 052808 (2014).
[15] K. E. Hamilton and L. P. Pryadko, "Tight lower bound for percolation threshold on an infinite graph," Phys. Rev. Lett., 208701 (2014).
[16] A. V. Goltsev, S. N. Dorogovtsev, and J. F. F. Mendes, "Percolation on correlated networks," Phys. Rev. E, 051105 (2008).
[17] A. V. Goltsev, S. N. Dorogovtsev, J. G. Oliveira, and J. F. F. Mendes, "Localization and spreading of diseases in complex networks," Phys. Rev. Lett., 128702 (2012).
[18] A. Saade, F. Krzakala, and L. Zdeborová, "Spectral clustering of graphs with the Bethe Hessian," in Adv. Neural Inf. Process. Syst. (2014), p. 406.
[19] W. W. Zachary, "An information flow model for conflict and fission in small groups," J. Anthropol. Res., 452 (1977).
[20] M. Ripeanu, A. Iamnitchi, and I. Foster, "Mapping the Gnutella network," IEEE Internet Comput., 50 (2002).
[21] D. Meyer, "University of Oregon Route Views Archive Project" (2001).
[22] A. Braunstein, L. Dall'Asta, G. Semerjian, and L. Zdeborová, "Network dismantling," Proc. Natl. Acad. Sci. USA, 12368 (2016).
[23] L. Zdeborová, P. Zhang, and H. J. Zhou, "Fast and simple decycling and dismantling of networks," Sci. Rep., 37954 (2016).
[24] M. E. J. Newman and M. Girvan, "Finding and evaluating community structure in networks," Phys. Rev. E, 026113 (2004).
[25] J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney, "Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters," Internet Math., 29 (2009).
[26] J. Yang and J. Leskovec, "Defining and evaluating network communities based on ground-truth," Knowl. Inf. Syst., 181 (2015).
[27] F. Radicchi and C. Castellano, "Beyond the locally treelike approximation for percolation on real networks," Phys. Rev. E, 030302 (2016).
[28] S. Yoon, A. V. Goltsev, S. N. Dorogovtsev, and J. F. F. Mendes, "Belief-propagation algorithm and the Ising model on networks with arbitrary distributions of motifs," Phys. Rev. E, 041144 (2011).
[29] A. Faqeeh, S. Melnik, P. Colomer-de Simón, and J. P. Gleeson, "Emergence of coexisting percolating clusters in networks," Phys. Rev. E, 062308 (2016).
[30] S. E. Schaeffer, "Graph clustering," Comput. Sci. Rev., 27 (2007).
[31] T. A. Jarrell, Y. Wang, A. E. Bloniarz, C. A. Brittin, M. Xu, J. N. Thomson, D. G. Albertson, D. H. Hall, and S. W. Emmons, "The connectome of a decision-making neural network," Science, 437 (2012).
[32] R. Cohen and S. Havlin, "Scale-free networks are ultrasmall," Phys. Rev. Lett., 058701 (2003).
[33] S. N. Dorogovtsev, J. F. F. Mendes, and A. N. Samukhin, "Metric structure of random networks," Nucl. Phys. B, 307 (2003).
[34] S. V. Buldyrev, R. Parshani, G. Paul, H. E. Stanley, and S. Havlin, "Catastrophic cascade of failures in interdependent networks," Nature, 1025 (2010).
[35] G. Bianconi and S. N. Dorogovtsev, "Multiple percolation transitions in a configuration model of a network of networks," Phys. Rev. E, 062814 (2014).
[36] S. Boccaletti, G. Bianconi, R. Criado, C. I. del Genio, J. Gómez-Gardeñes, M. Romance, I. Sendiña-Nadal, Z. Wang, and M. Zanin, "The structure and dynamics of multilayer networks," Phys. Rep., 1 (2014).
[37] M. Kivelä, A. Arenas, M. Barthelemy, J. P. Gleeson, Y. Moreno, and M. A. Porter, "Multilayer networks," J. Complex Networks, 203 (2014).
[38] F. Radicchi, "Percolation in real interdependent networks," Nature Phys., 597 (2015).
[39] F. Radicchi and G. Bianconi, "Redundant interdependencies boost the robustness of multiplex networks," Phys. Rev. X, 011013 (2017).
[40] G. J. Baxter, G. Bianconi, R. A. da Costa, S. N. Dorogovtsev, and J. F. F. Mendes, "Correlated edge overlaps in multiplex networks," Phys. Rev. E 94