K-Scaffold subgraphs of Complex networks
Bernat Corominas-Murtra, Sergi Valverde, Carlos Rodríguez-Caso, Ricard V. Solé
arXiv [physics.soc-ph]
ICREA-Complex Systems Lab, Universitat Pompeu Fabra, Dr. Aiguader 80, 08003 Barcelona, Spain
Santa Fe Institute, 1399 Hyde Park Road, New Mexico 87501, USA
Complex networks with high numbers of nodes or links are often difficult to analyse. However, not all elements contribute equally to their structural patterns. A small number of elements (the hubs) seem to play a particularly relevant role in organizing the overall structure around them. But other parts of the architecture (such as hub-hub connecting elements) are also important. In this letter we present a new type of substructure, to be named the K-scaffold subgraph, able to capture all the essential network components. Their key features, including the so-called critical scaffold graph, are analytically derived.

Introduction. Networks pervade complexity [1, 2, 3, 4]. How networks are organized at different scales is one of the main topics of complex network research [5, 6, 7, 8, 9, 10]. Some approaches are based on the study of given subgraphs, from the smaller network motifs [7, 8, 11] to K-cores [6, 12], spanning trees [10] or gradient subgraphs obtained from a given internal system's dynamics [9]. One of the most studied subgraphs is the so-called K-core, formally defined by Bollobás in [13]. The K-core of a graph $G$, $C_k(G)$, is the largest subgraph whose vertices have, at least, degree $k \geq K$. The behaviour of such a subgraph and its percolation properties have been widely studied [6, 13, 14, 15, 16]. K-cores display interesting features with several implications in the study of real networks, both at the theoretical and applied level [6, 12, 14, 17].

Hubs are the center of attention of the K-core. They are responsible for the efficient communication among network units, and their failure or removal can have dramatic consequences [18]. But other graph components are also relevant to understand network behavior. In particular, hubs are often related through other elements exhibiting low connectivity, the so-called connectors. Despite its relevance, the K-core fails to capture the hub-connector structure.
This pattern is essential in highly disassortative or modular networks, where hub-hub connectors play a crucial role [2]. In such networks, robustness against failures is strongly tied to hubs, but also to the hub-connector structure. Moreover, connectors can display high betweenness centrality [2] despite their low connectivity, reinforcing the role of this kind of node in the non-local organization of the global topology and dynamics.

To overcome these limitations, we introduce a subgraph definition which captures the previous traits. Specifically, we consider a subgraph that includes the most connected nodes and their connectors, if any. In doing so, we want to explore whether there is some fundamental hub-connector subgraph and what its relevant properties are. Such a graph, the so-called K-scaffold subgraph, was recently introduced (in qualitative terms) within the context of the human proteome [19]. This network included only transcription factors, i.e., proteins linking to DNA and thus involved in regulating gene expression (Fig. 1(a)). Specifically, it was shown that an appropriate choice of relevant hubs and their connectors made it possible to define a functionally meaningful subgraph. This subgraph contained a large number of cancer-related proteins around which well-defined modules were organized as evolutionarily and functionally related subsets. Here, we define this subgraph in a rigorous way. We analytically characterize its properties and degree distributions, as well as the presence of a special class of minimal scaffold graph based on a critical percolation threshold.

K-Scaffold subgraphs. Let us consider a graph $G(V, \Gamma)$, where $V$ is the set of nodes and $\Gamma$ the set of edges connecting them.
The K-scaffold subgraph $S_K(G)$ will consider the degree of nodes $k(e_i)$, $e_i \in V$, but it will also take into account correlations. Specifically, a node $e_i \in V$ belongs to $S_K(G)$ if and only if (1) $K \leq k_i$ or (2) $e_i$ is connected to some $e_j \in V$ such that $K \leq k_j$. Thus, given a graph $G(V, \Gamma)$ with adjacency matrix $a_{ij}$, the K-scaffold of $G$ is defined as:

$$S_{ij} = \begin{cases} a_{ij} & \text{if } (K \leq k_i \lor K \leq k_j) \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$

An example of such a K-scaffold subgraph is shown in Fig. 1(b). This allows us to define, from a given graph $G$, a nested hierarchy of subgraphs $S_K(G)$ such that:

$$\ldots S_{K+1}(G) \subseteq S_K(G) \subseteq S_{K-1}(G) \ldots \qquad (2)$$

To clarify which elements are really relevant, we also define a naked K-scaffold subgraph. From the K-scaffold subgraph, $S_K(G)$, the naked K-scaffold, $\gamma_K(G)$, is obtained by removing all nodes having a single link (the "hair" of the graph) (Fig. 1(c)). Thus, from $S_{ij}$, it is easy to compute the adjacency matrix of the naked K-scaffold subgraph, namely:

$$\gamma_{ij} = S_K(a_{ij})\,(1 - \delta_{k_i,1})(1 - \delta_{k_j,1}) \qquad (3)$$

Additionally, if two or more connectors have an identical pattern of connectivity in $\gamma_K(G)$ (i.e., they are connected to exactly the same hubs, understanding hubs as nodes with $k \geq K$), we renormalize these sets of connectors by replacing each set with a single node. In this way, the renormalized K-scaffold subgraph, $\gamma_K(G)$, keeps the relevant elements without redundancies (Fig. 1(d)).

FIG. 1: (a) The Human Transcription Factor interaction Network (HTFN). (b) Its 11-scaffold subgraph. (c) The naked 11-scaffold subgraph and (d) the naked and renormalized 11-scaffold subgraph. The K-scaffold subgraph displays a fundamental hub-connector structure that organizes the general topology of the whole system. Data from [19].

Statistical Properties.
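The constructions of Eqs. (1)-(3) and the renormalization step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the edge-list representation and the toy graph are hypothetical, and the graph is assumed to be simple and undirected. Two reading choices are also assumptions: the "single link" test of Eq. (3) is applied once, using degrees measured inside $S_K(G)$, and hubs are identified by their degree in $G$.

```python
from collections import defaultdict


def degree_map(edges):
    """Degree of every node in an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def k_scaffold(edges, K):
    """Eq. (1): keep edge (u, v) iff k_u >= K or k_v >= K, degrees taken in G."""
    deg = degree_map(edges)
    return [(u, v) for u, v in edges if deg[u] >= K or deg[v] >= K]


def naked_k_scaffold(edges, K):
    """Eq. (3): drop the 'hair', i.e. nodes with a single link.

    Assumption: degrees are measured inside S_K(G) and pruning is done once.
    """
    scaffold = k_scaffold(edges, K)
    sdeg = degree_map(scaffold)
    return [(u, v) for u, v in scaffold if sdeg[u] != 1 and sdeg[v] != 1]


def renormalized_k_scaffold(edges, K):
    """Merge connectors attached to exactly the same set of hubs."""
    deg = degree_map(edges)
    naked = naked_k_scaffold(edges, K)
    hubs_of = defaultdict(set)  # connector -> set of hubs it links to
    for u, v in naked:
        if deg[u] < K:
            hubs_of[u].add(v)
        if deg[v] < K:
            hubs_of[v].add(u)
    representative = {}  # hub set -> first connector seen with that set
    label = {}           # connector -> node that replaces it
    for node, hubs in hubs_of.items():
        key = frozenset(hubs)
        representative.setdefault(key, node)
        label[node] = representative[key]
    merged, seen = [], set()
    for u, v in naked:
        edge = (label.get(u, u), label.get(v, v))
        if edge not in seen:
            seen.add(edge)
            merged.append(edge)
    return merged


# Hypothetical toy graph: hubs A and B (degree 4), shared connectors c1, c2.
example_edges = [
    ("A", "a1"), ("A", "a2"), ("A", "c1"), ("A", "c2"),
    ("B", "b1"), ("B", "c1"), ("B", "c2"),
    ("a1", "x"), ("x", "y"),
]
scaffold_3 = k_scaffold(example_edges, 3)
naked_3 = naked_k_scaffold(example_edges, 3)
renormalized_3 = renormalized_k_scaffold(example_edges, 3)
```

In this toy graph with $K = 3$, the edges a1-x and x-y fall outside the scaffold because neither endpoint is a hub. Under the stated assumption, node a1 is removed as hair even though its degree in $G$ is 2, since only one of its links survives in $S_3(G)$. The connectors c1 and c2 share the hub set {A, B} and are merged into one node.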
Here we derive the main statistical features of the K-scaffold subgraph from an arbitrary, uncorrelated network $G$. First, we compute the fraction of nodes in $G$ belonging to $S_K(G)$, i.e., the probability for a randomly chosen node of $G$ to belong to $S_K(G)$. If we define $q$ [...]

[...] 3. No $K_c$ can be properly identified (see text). The presence of the cut-off can be due to the finite-size effects of the simulation; (c) Exponential net with $P(k) \propto e^{-k/K}$, $K = 13$; (d) Power law with exponential cut-off net with $P(k) \propto k^{-\alpha} e^{-k/K}$, for $\alpha = 2.$[...] $K = 52$.

[...] and since equation (20) is equivalent to:

$$\sum_{k}^{\infty} k\,(k f_k - f_k - 1)\,P(k) = 0 \qquad (22)$$

the percolation condition for a K-scaffold, $S_K$, is:

$$\sum_{k} k\,(k f_k - f_k - 1)\,P(k) > 0$$

[...] K-scaffold, $S_K(G)$:

$$\sum_{k}^{\infty} k\,(k - 2)\,P(k) > \sum_{k} [\ldots]$$

[...] 1, thus, ($q$ [...] 1. Thus, the sum $\sum_{k}$ [...]

The authors thank the members of the Complex Systems Lab. This work has been supported by grants FIS2004-0542, IST-FET ECAGENTS, project of the European Community funded under EU R&D contract 01194, by the EU within the 6th Framework Program under contract 001907 (DELIS), by NIH 113004, FIS2004-05422 and by the Santa Fe Institute.

[1] Albert, R. and Barabási, A. (2001) Rev. Mod. Phys. 74, 47.
[2] Newman, M. E. J. (2003) SIAM Review 45, 167.
[3] Dorogovtsev, S. N. and Mendes, J. F. F. (2002) Adv. Phys. 51, 1079-1187.
[4] Boccaletti, S.; Latora, V.; Moreno, Y.; Chavez, M. and Hwang, D.-U. (2006) Physics Reports 424, 175.
[5] Palla, G.; Derenyi, I.; Farkas, I. and Vicsek (2005) Nature 435, 814.
[6] S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes (2006) Phys. Rev. Lett. 96, 040601 1-4.
[7] S. Itzkovitz, R. Milo, N. Kashtan, G. Ziv and U. Alon (2003) Phys. Rev. E 68, 026127.
[8] Milo R., Shen-Orr S., Itzkovitz S., Kashtan N., Chklovskii D. and Alon U. (2002) Science 298, 824-827.
[9] Z. Toroczkai, B. Kozma, K. E. Bassler, N. W. Hengartner, and G. Korniss (2004) arXiv:cond-mat/0408262.
[10] Kim, D. H.; Noh, J. D.; Jeong, H. (2004) Phys. Rev. Lett. 96, 018701.
[11] Valverde, S. and Solé, R. V. (2005) Phys. Rev. E 72, 026107.
[12] J. I. Alvarez-Hamelin, L. Dall'Asta, A. Barrat and A. Vespignani, cs.NI/0504107; cs.NI/0511007.
[13] Bollobás, B. (1984)