Absence of a resolution limit in in-block nestedness
Manuel S. Mariani, María J. Palazzi, Albert Solé-Ribalta, Javier Borge-Holthoefer, Claudio J. Tessone
Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, PR China
URPP Social Networks, Universität Zürich, Switzerland
Internet Interdisciplinary Institute (IN3), Universitat Oberta de Catalunya, Barcelona, Catalonia, Spain
Originally a speculative pattern in ecological networks, the hybrid or compound nested-modular pattern has been confirmed, during the last decade, as a relevant structural arrangement that emerges in a variety of contexts, in ecological mutualistic systems and beyond. This implies shifting the focus from the measurement of nestedness as a global property (macro level) to the detection of blocks (meso level) that internally exhibit a high degree of nestedness. Unfortunately, the availability and understanding of the methods to properly detect in-block nested partitions lag behind the empirical findings: while a precise quality function of in-block nestedness has been proposed, we lack an understanding of its possible inherent constraints. Specifically, while it is well known that Newman-Girvan's modularity, and related quality functions, notoriously suffer from a resolution limit that impairs their ability to detect small blocks, the potential existence of resolution limits for in-block nestedness is unexplored. Here, we provide empirical, numerical and analytical evidence that the in-block nestedness function lacks a resolution limit, and thus our capacity to detect correct partitions in networks via its maximization depends solely on the accuracy of the optimization algorithms.
I. INTRODUCTION
In-block nestedness has emerged, in the last few years, as an interesting pattern in complex networks. Initially proposed merely as a hypothetical configuration [1], the idea of hybrid nested-modular structures has gained traction after empirical evidence has shown that such arrangements may play a prominent role in many systems, natural [2–5] and artificial [6].

From a scientific perspective, the presence of in-block nestedness in real networks is not surprising. Modularity, a mesoscale pattern that considers the organization of nodes in a network as a set of cohesive subgroups [7], is almost ubiquitous in network structures [8–12]. Nestedness [13–15], where the interactions of nodes with low degree are a subset of those of nodes with larger degree, is also a prominent macroscale pattern in ecology [14, 16] and beyond [17–20]. Both structures emerge as a result of different evolutionary pressures and, following this logic, if two such mechanisms are concurrent, then hybrid nested-modular (in-block nested) arrangements are expected to appear.

And yet, practical approaches to properly identify in-block nestedness are still scarce. In general, the identification of such compound structures operates sequentially: after the identification of a network partition (usually in terms of modularity [7]), nestedness (usually in terms of NODF [21] or nestedness temperature [14]) is computed locally for each block. A possible solution to the limitations of such a sequential approach has been recently proposed, in the form of a precise formulation of an appropriate quality function, $I$ [22]. Formally similar to the popular Newman-Girvan modularity $Q$, the advantage of $I$ is that it can be maximized algorithmically, leading to a proper identification of in-block nested partitions, just like other mesoscale patterns that have recently appeared, e.g. multiple core-periphery structure [23]. Thanks to these developments, the presence of in-block nested structures in very diverse systems has been confirmed [6, 22].
However, the mathematical properties of the in-block nestedness function remain largely unknown.

Here, we examine whether the in-block nestedness function exhibits a resolution limit, similar to the one found for the modularity function [24]. The existence of such a limit would imply the impossibility of detecting interaction blocks smaller than a given scale [24], potentially making the interpretation of the detected nested blocks ambiguous [25]. After some preliminary tests on empirical networks, we show numerically and analytically that the in-block nestedness function does not exhibit a resolution limit. This striking result implies that the identification of the correct in-block nested partition in a network depends only on the accuracy of the heuristics used in the optimization process, and not on any inherent constraint in the formulation of $I$ itself.

The rest of the paper is organized as follows. Section II introduces the theoretical concepts examined in this work. Section III presents results on some empirical networks, which provide valuable intuitions for the following sections. Section IV presents the analytic derivations for in-block nestedness in an idealized family of synthetic networks, proving the absence of a resolution limit. Section V provides numerical evidence that generalizes the analytical findings. Finally, Section VI summarizes the main takeaways and open questions for future research.

II. BACKGROUND ON MODULARITY AND IN-BLOCK NESTEDNESS
Heterogeneity is a landmark feature of real complex networks. At the global scale, for example, the distribution of the number of neighbors of a node is broad, with a tail that often follows a power law. Interestingly, the mesoscale often presents a similar situation: the distribution of edges is not only globally, but also locally inhomogeneous, with high concentrations of edges within groups of nodes, and low concentrations between these groups [12].

TABLE I. Basic notation for unipartite networks [26].
Symbol | Variable
$s \in \{1, \dots, N\}$ | Node
$N$ | Total number of nodes
$(s, t)$ | Edge
$A = \{A_{st};\ A_{st} = 1 \text{ iff } (s, t) \text{ is observed}\}$ | Adjacency matrix
$E = \sum_{s,t} A_{st}$ | Total number of edges
$\Omega_s = \{t \mid A_{st} = 1\}$ | Neighborhood of node $s$
$k_s = \sum_t A_{st}$ | Degree of node $s$
$\alpha_s$ | Community to which node $s$ belongs
$\kappa_s = \sum_{t \in \alpha_s} A_{st}$ | Internal degree of node $s$
$\kappa_\alpha = \sum_{s \in \alpha} \kappa_s$ | Internal degree of community $\alpha$
$O_{st} = \sum_u A_{su} A_{tu}$ | Common neighbors / overlap

Such a feature of real networks, community structure, can be translated into a quantitative criterion. Radicchi et al. [27] propose the following: a block (also called community, module, compartment, or cluster depending on the research field [15]) constitutes a weak community if and only if its internal degree $\kappa_\alpha$ exceeds its external degree (i.e., the total degree of its nodes when only links with nodes that do not belong to the block are considered). Conversely, a block constitutes a strong community if and only if, for each of its nodes, the node's internal degree $\kappa_s$ is larger than the node's external degree.

Notably, this definition presupposes that a partition of the network is at hand, whereas in real situations this is most often not the case. This explains why, historically, there have been many efforts to define suitable quality functions, and to design associated optimization heuristics, that aim at the identification of good (ideally optimal) partitions.
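As a small illustration of the weak and strong community criteria of Radicchi et al. described above, the following Python sketch evaluates both conditions for every block of a given partition. The function names (`community_criteria`, `adjacency`) and the dictionary-of-sets graph representation are choices made for this example only, and the sketch assumes an undirected network without self-loops.

```python
def adjacency(edges):
    """Build an undirected adjacency structure (node -> set of neighbours)."""
    adj = {}
    for s, t in edges:
        adj.setdefault(s, set()).add(t)
        adj.setdefault(t, set()).add(s)
    return adj


def community_criteria(adj, labels):
    """Evaluate the weak and strong criteria for every block of a partition.

    adj    : dict mapping each node to the set of its neighbours
    labels : dict mapping each node to the block it belongs to
    """
    report = {}
    for block in set(labels.values()):
        members = [s for s in adj if labels[s] == block]
        # kappa_s: number of neighbours of s inside its own block
        internal = {s: sum(1 for t in adj[s] if labels[t] == block) for s in members}
        # external degree of s: neighbours outside the block
        external = {s: len(adj[s]) - internal[s] for s in members}
        report[block] = {
            # weak: block-level internal degree exceeds block-level external degree
            "weak": sum(internal.values()) > sum(external.values()),
            # strong: the inequality holds node by node
            "strong": all(internal[s] > external[s] for s in members),
        }
    return report


# Two triangles joined by one link: both blocks are strong (hence weak) communities.
triangles = adjacency([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
triangle_labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

# A path split in the middle: blocks are weak but not strong communities.
path = adjacency([(0, 1), (1, 2), (2, 3)])
path_labels = {0: "a", 1: "a", 2: "b", 3: "b"}

print(community_criteria(triangles, triangle_labels))
print(community_criteria(path, path_labels))
```

In the path example, the node at the cut has one internal and one external neighbour, so the strict node-level inequality fails even though the block-level one holds.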
Without doubt, the most popular method in network science is the maximization of a fitness function called modularity $Q$ [7]. Yet, other definitions [28] and structures have also attracted the attention of researchers [23, 29, 30].

In this section we focus on two of those functions, which constitute the core of this work (modularity and in-block nestedness [22]), emphasizing their inherent shortcomings, i.e. limitations that are intrinsic to their definition rather than to the weaknesses of the corresponding optimization strategies. For the sake of simplicity, we only report definitions for unipartite networks; see Table I for the notation used in this work. The extension to bipartite systems is not difficult, but requires more intricate notation, as well as the consideration of the different number of nodes that may compose each network dimension.

A. Modularity
One of the most popular methods to identify communities is the maximization of the modularity $Q$ [12, 25]. For a unipartite network, the modularity function is defined as [7]

$$Q = \frac{1}{2E} \sum_{s,t} \left( A_{st} - E_{st} \right) \delta(\alpha_s, \alpha_t), \qquad (1)$$

where $E_{st} = k_s k_t / (2E)$ denotes the expected number of links between nodes $s$ and $t$ under the Chung-Lu configuration model [31, 32]. The problem of community detection via modularity optimization is particularly tricky, and has been the subject of discussion in various disciplines. It is an NP-complete problem [33], which explains why several methods have been proposed to reduce the complexity of the task [34–36]. However, alongside the constraints of the algorithmic strategies, the formulation of $Q$ has an inherent limitation of its own, which prevents its optimization from detecting blocks that are smaller than a given size. Intuitively, for the modularity function, this limit can be understood in a toy unipartite network formed by a set of cliques placed on a ring, where each pair of adjacent cliques is connected by a single inter-clique link, see Figure 1 (top left and middle left panels). This is the most modular connected network [24]. In this setting, one can show that the modularity has a scale detection problem. Even if the network contains more than $B \simeq \sqrt{E}$ cliques, the modularity function will still favor partitions in which only about $B$ blocks are detected. This imposes a detection scale, which can be intuitively understood by noticing that the expected number of edges between two blocks $\alpha$ and $\beta$ is, approximately, $E_{\alpha\beta} = k_\alpha k_\beta / (2E)$, where $k_\alpha = \sum_{s \in \alpha} k_s$ denotes the total degree of block $\alpha$.
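The ring-of-cliques argument can be checked numerically with a short Python sketch. It evaluates $Q$ in its block form, $Q = \sum_\alpha [\, l_\alpha / L - (k_\alpha / 2L)^2 \,]$, which is equivalent to Eq. (1) for a network without self-loops when $L$ counts each link once (a convention assumed here). The helper names `ring_of_cliques` and `modularity` are hypothetical, and the partition that merges pairs of adjacent cliques is used only as a reference point for comparison.

```python
from itertools import combinations


def ring_of_cliques(n_cliques, clique_size):
    """Cliques placed on a ring, adjacent cliques joined by a single link."""
    edges, labels = [], {}
    for c in range(n_cliques):
        nodes = [c * clique_size + i for i in range(clique_size)]
        for node in nodes:
            labels[node] = c
        edges.extend(combinations(nodes, 2))          # intra-clique links
        first_of_next = ((c + 1) % n_cliques) * clique_size
        edges.append((nodes[-1], first_of_next))      # one inter-clique link
    return edges, labels


def modularity(edges, labels):
    """Block form of Eq. (1); each link is counted once in n_links."""
    n_links = len(edges)
    internal, block_degree = {}, {}
    for s, t in edges:
        block_degree[labels[s]] = block_degree.get(labels[s], 0) + 1
        block_degree[labels[t]] = block_degree.get(labels[t], 0) + 1
        if labels[s] == labels[t]:
            internal[labels[s]] = internal.get(labels[s], 0) + 1
    return sum(
        internal.get(block, 0) / n_links - (k / (2 * n_links)) ** 2
        for block, k in block_degree.items()
    )


def delta_q(n_cliques, clique_size):
    """Q of the clique partition minus Q of the pairwise-merged partition."""
    edges, single = ring_of_cliques(n_cliques, clique_size)
    pairs = {node: c // 2 for node, c in single.items()}  # needs even n_cliques
    return modularity(edges, single) - modularity(edges, pairs)


for n_cliques in (6, 30):
    print(n_cliques, delta_q(n_cliques, clique_size=3))
```

For triangles, the sketch gives a positive difference for a ring of 6 cliques and a negative one for a ring of 30 cliques. Under the link-counting convention assumed here, the crossover for this toy family sits where the squared number of cliques equals $2L$.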
When both $k_\alpha$ and $k_\beta$ are of order $\sqrt{E}$ or smaller, $E_{\alpha\beta}$ becomes of order one or smaller, meaning that even a single link between blocks $\alpha$ and $\beta$ is interpreted by the modularity function as a non-random connection, thereby favoring their merging into a single block [25].

In the maximally-modular network above, an alternative demonstration of the resolution limit can be obtained by comparing the modularity of the correct partition of the nodes into cliques, $Q_{\mathrm{single}}$, against the modularity of the (wrong) partition obtained by merging pairs of adjacent cliques, $Q_{\mathrm{pairs}}$. It turns out that $\Delta Q := Q_{\mathrm{single}} - Q_{\mathrm{pairs}} > 0$ if and only if $N < \sqrt{E}$. If we gradually increase $N$ by adding new cliques, as soon as $N$ becomes larger than the modularity's intrinsic scale $\sqrt{E}$, the modularity of the wrong partition, $Q_{\mathrm{pairs}}$, exceeds the modularity of the correct partition, $Q_{\mathrm{single}}$ ($\Delta Q < 0$). Alternative examples can be drawn to further prove the modularity's resolution limit in various scenarios [24].

FIG. 1. Illustration of a ring of weakly-interconnected blocks. Central row: representation of a ring of subgraphs (blue circles) connected by a single link, $\ell = 1$ (left), and connected through several links, $\ell = 5$ (right). The subgraphs, represented as blue circles, can take the form of identical cliques or perfect nested blocks. A matrix representation for the considered cases is shown in the top (identical cliques) and bottom (identical perfect nested blocks) rows, respectively.

B. In-block nestedness
Among mesoscale quality functions other than modularity, in-block nestedness corresponds to a hybrid or combined pattern in which nested-modular arrangements appear. To be precise, an in-block nested network presents an overall compartmentalized organization, where blocks present a nested connectivity within. Naturally, it follows that the in-block nestedness quality function $I$ inherits aspects from nestedness measurement (in particular, from the NODF descriptor [21, 37]), as well as ingredients from modularity. For a unipartite network, the in-block nestedness function $I$ is defined as [22]

$$I = \frac{2}{N} \sum_{s,t} \frac{O_{st} - \langle O_{st} \rangle}{k_t \, (C_s - 1)} \, \Theta(k_s - k_t) \, \delta(\alpha_s, \alpha_t), \qquad (2)$$

where each node can only belong to one block $\alpha$, $C_\alpha$ denotes the number of nodes that belong to block $\alpha$ (so that $C_s$ in Eq. (2) is the size of the block containing node $s$), $O_{st}$ is the number of shared neighbours between nodes $s$ and $t$ (i.e. their overlap), $\langle O_{st} \rangle = k_s k_t / N$ is its expected value under the null model, $\Theta$ is the Heaviside function and $\delta$ is the Kronecker delta. For subsequent analytic developments, it is convenient to rewrite the previous expression for $I$ as a sum over the network's blocks:

$$I = \sum_{\alpha = 1}^{B} \mathcal{N}_\alpha, \qquad (3)$$

where $B$ denotes the total number of blocks and

$$\mathcal{N}_\alpha := \frac{2}{N (C_\alpha - 1)} \sum_{s,t \in \alpha} \left( \frac{O_{st}}{k_t} - \frac{k_s}{N} \right) \Theta(k_s - k_t) \qquad (4)$$

can be interpreted as the level of block $\alpha$'s internal nestedness.

Measuring the level of in-block nestedness of a given network requires the optimization of the in-block nestedness function, which is, again, an NP problem. Leaving aside computational aspects, resolution limits can arise when optimizing a quality function other than modularity [38]. For instance, recent works have introduced and examined a quality function that assumes that each block has a core-periphery internal structure [23, 39]. While this quality function can detect multiple core-periphery structures in a network, it inherits from the modularity function a similar resolution limit [39], which has motivated the introduction of a multiscale variant of the original algorithm [40].

Turning to in-block nestedness, the function defined by Eq. (2) is substantially different from the modularity function, because it is based on the overlaps between nodes that belong to the same cluster, and not on link density. This suggests that the resolution limit of this function may behave in a radically different way from that of modularity. Examining this conjecture is the main goal of the rest of the paper.

III. EMPIRICAL INSIGHTS: PRELIMINARY INTUITIONS ON Q AND I RESOLUTION LIMIT
To test the absence (or presence) of a resolution limit for the in-block nestedness, we first perform an exploration with empirical data, following an approach similar to the one in [24]. Specifically, for each network, the quality function of interest is optimized by means of the same optimization strategy (the extremal optimization algorithm [41] in our case, see Appendix B). Then, all the links between the detected blocks are removed, and the optimization algorithm is applied again to the resulting blocks. With two partitions at hand, we compute the Jaccard index to measure how similar they are. We iterate this procedure (remove links between communities, optimize the quality function) until the Jaccard index $J$ between consecutive partition vectors is 1, i.e. the algorithm is no longer able to split the current partition into one with a higher score.

This general scheme is applied separately for modularity $Q$ and in-block nestedness $I$ over a set of 82 real networks from two different domains: ecological in most cases [42], with some collaboration networks taken from socio-technological systems [6, 43] (see Appendix A for details). We have restricted the size of these networks to a bounded range, with at least 50 nodes.

The idea behind this approach is to get a first intuition on whether a resolution limit for in-block nestedness exists and, if it does, how severe it is when compared to the resolution limit of modularity. If the quality function lacks a resolution limit (and assuming that the heuristics can reach the optimal partition), one should expect that, after the initial optimization step, the algorithm is not able to further split the detected blocks into smaller ones.

The result of this experiment is summarized in Figure 2 (left panel), which shows a scatter plot of the number of attempts needed to reach $J = 1$ after optimizing in-block nestedness, plotted against the corresponding number of attempts to reach $J = 1$ for modularity, for each network.
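The recursive splitting test described at the start of this section can be sketched in Python as follows. Several choices here are assumptions rather than details given in the text: the Jaccard index is taken in its pair-counting form (pairs of nodes co-assigned in both partitions over pairs co-assigned in at least one), an "attempt" is counted as one re-optimization after link removal, and `optimize` is a placeholder for any partition optimizer. The `components` function is only a toy stand-in for the extremal optimization used in the paper.

```python
def pair_jaccard(labels_a, labels_b):
    """Pair-counting Jaccard index between two partitions of the same nodes."""
    nodes = sorted(labels_a)
    both = either = 0
    for i, s in enumerate(nodes):
        for t in nodes[i + 1:]:
            same_a = labels_a[s] == labels_a[t]
            same_b = labels_b[s] == labels_b[t]
            both += same_a and same_b
            either += same_a or same_b
    return both / either if either else 1.0


def components(nodes, edges):
    """Toy stand-in optimizer: one block per connected component."""
    parent = {s: s for s in nodes}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    for s, t in edges:
        parent[find(s)] = find(t)
    return {s: find(s) for s in nodes}


def count_attempts(nodes, edges, optimize, max_attempts=50):
    """Remove inter-block links and re-optimize until the partition is stable."""
    labels = optimize(nodes, edges)
    for attempt in range(1, max_attempts + 1):
        # keep only the links that lie inside the currently detected blocks
        edges = [(s, t) for s, t in edges if labels[s] == labels[t]]
        new_labels = optimize(nodes, edges)
        if pair_jaccard(labels, new_labels) == 1.0:
            return new_labels, attempt
        labels = new_labels
    return labels, max_attempts


nodes = list(range(6))
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
final_labels, attempts = count_attempts(nodes, edges, components)
print(attempts, len(set(final_labels.values())))
```

With the toy optimizer the partition is stable after the first re-optimization. With a quality function that suffers from a resolution limit, one would expect the loop to need more rounds before the Jaccard index reaches 1.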
The size of the points in the scatter plot is proportional to the size of each network. To ease comparison, the numbers of attempts for $Q$ and $I$ have been plotted on the same (log-log) scale, and the function $y = x$ is plotted as a dashed black curve as a visual aid. Marginal box plots show the distribution of the number of attempts needed for each network, for both $Q$ and $I$. Without exception, the number of iterations needed to reach the stopping condition is substantially larger for modularity.

FIG. 2. Comparing the resolution limit of modularity and in-block nestedness in empirical data. Scatter plots of the number of attempts needed to reach a Jaccard index $J = 1$ (left) and a relaxed threshold $J \geq \tau$ with $\tau < 1$ (right) for modularity and in-block nestedness. Marginal box plots show the distribution of the number of attempts needed for each network. The size of the points in the scatter plot is proportional to the total number of nodes of each network.

Taken strictly, this result can be interpreted as informal evidence of a milder effect of the resolution limit for in-block nestedness (compared to $Q$). At the same time, this result is not a formal proof that the resolution limit is entirely absent: even if the resolution limit is absent, additional optimization steps could arise because the extremal optimization algorithm is unable to reach the optimal partition in each step. Relaxing the stopping criterion to $J \geq \tau$, with a threshold $\tau < 1$, strengthens this informal evidence: the number of attempts needed to reach $J \geq \tau$ for $I$ drops to 1 for several networks, while the number of attempts for $Q$ remains large in most cases, see Figure 2 (right panel).

IV. ABSENCE OF RESOLUTION LIMIT IN I: ANALYTIC APPROACH

In this Section, we aim to provide an analytic explanation for the previous empirical intuitions, in an idealized family of synthetic networks. For the sake of analytic tractability, we consider a ring of interconnected blocks of equal size $C$, where each block has internally a stepwise structure. That is, the degrees of subsequent rows (columns) of the adjacency matrix differ by one (see Fig. 1, bottom-left panel). Additionally, contiguous blocks are interconnected by $\ell = 1$ link that connects the generalists of the two blocks; in total, there are $B$ inter-block links that connect the $B$ generalists along the ring. Our strategy to perform the calculation is to first compute the in-block nestedness $I$ of a perfectly in-block nested network composed of $B$ disconnected blocks, and then add to $I$ the terms due to the interactions between the hubs.
To compute $I$, it is sufficient to derive the nestedness of a single stepwise block, see Appendix D 1. In the case of equally-sized blocks, we obtain

$$I = 1 - \frac{2}{3N} - \frac{2}{3B}. \qquad (5)$$

From this equation, we can obtain the in-block nestedness of the ring by adding the contribution to $I$ from the inter-block links that connect the hubs, see Appendix D 2. We obtain

$$I_{\mathrm{single}} = 1 - \frac{2}{3N} - \frac{2}{3B} - \frac{4B}{N^2}. \qquad (6)$$

To study the possible existence of a resolution limit of $I$, we need to compare the in-block nestedness of the correct partition, $I_{\mathrm{single}}$, with the in-block nestedness $I_{\mathrm{pairs}}$ of a partition where pairs of contiguous "true" blocks, $\alpha_i$ and $\alpha_{i+1}$ ($i = 1, \dots, B$), are merged and assigned to the same block, $\alpha_{i,i+1} = \alpha_i \cup \alpha_{i+1}$. The in-block nestedness of the wrong partition, $I_{\mathrm{pairs}}$, can be calculated by adding up the contributions from pairs of nodes that belong to the same true block, and those from pairs of nodes that belong to different true blocks. This results in

$$I_{\mathrm{pairs}} = \frac{C - 1}{2C - 1} \, I_{\mathrm{single}} + \frac{2}{C (2C - 1)} \left( H_{C-1} - \frac{g(C)}{3N} \right), \qquad (7)$$

where $H_{C-1} := \sum_{t=1}^{C-1} t^{-1}$ denotes the $(C-1)$-th harmonic number, and we defined the polynomial function $g(C) := (C - 1)(C^2 + C + 6)$. Putting together Eqs. (6) and (7), we obtain

$$\Delta I = I_{\mathrm{single}} - I_{\mathrm{pairs}} = \frac{C}{2C - 1} \, I_{\mathrm{single}} - \frac{2}{C (2C - 1)} \left( H_{C-1} - \frac{g(C)}{3N} \right). \qquad (8)$$

We refer to Appendix D 2 for the full derivation of this equation. The numerical results in Figure 3 match the analytical expressions in Eq. (8) (left panel) and in Eqs. (6), (7) and (9) (right panel).

For a fixed value $C \gg 1$, in the limit $N \to \infty$ (or equivalently, $B \to \infty$), we obtain

$$I_{\mathrm{pairs}} \to I_{\mathrm{single}} / 2, \qquad (9)$$

confirming the numerical intuitions in [22], and in accordance with the right panel of Figure 3. This implies that no matter how large the network is, the in-block nestedness of the partition with pairwise-merged blocks remains significantly smaller than the in-block nestedness of the partition with the original blocks. The same holds true for small values of $C$, because the second term in the r.h.s. of Eq. (8) tends to be substantially smaller than the first term. The reason is that the contribution from the null model is negligible compared to the penalty due to the merging of two blocks into a single one. Therefore, in this idealized example, the penalization for larger blocks in the in-block nestedness function prevents the resolution limit, allowing the in-block nestedness of the partition composed of the individual blocks to stay always larger than the in-block nestedness of the partition composed of pairwise-merged blocks.

FIG. 3. Analytical and numerical agreement. The left panel reports the perfect match between Eq. (8) (symbols) and the actual calculation performed on synthetic graphs (lines). Similarly, the right panel shows the agreement between analytical and numerical results for Eqs. (6) and (7). Gray symbols and the dotted line in the right panel confirm Eq. (9).
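The comparison between $I_{\mathrm{single}}$ and $I_{\mathrm{pairs}}$ in this section can be reproduced with a brief Python sketch that evaluates Eq. (2) directly on a ring of stepwise blocks and compares the result with the closed forms of Eqs. (6) and (7). The construction is an assumption of this example, not the authors' code: node $i$ of a block (0-indexed) is linked to node $j$ of the same block whenever $i + j \leq C - 1$, which produces degrees $C, C-1, \dots, 1$ and includes self-loops, and the Heaviside function is taken with $\Theta(0) = 0$. The sketch also assumes an even number of blocks $B \geq 4$, and all function names are hypothetical.

```python
from collections import Counter


def ring_of_stepwise_blocks(n_blocks, block_size):
    """Stepwise nested blocks on a ring; hubs of adjacent blocks are linked."""
    adj = {s: set() for s in range(n_blocks * block_size)}
    for b in range(n_blocks):
        base = b * block_size
        for i in range(block_size):
            for j in range(block_size - i):      # i + j <= C - 1
                adj[base + i].add(base + j)
        next_hub = ((b + 1) % n_blocks) * block_size
        adj[base].add(next_hub)                  # single inter-block link
        adj[next_hub].add(base)
    return adj


def in_block_nestedness(adj, labels):
    """Direct evaluation of Eq. (2), with <O_st> = k_s k_t / N."""
    n = len(adj)
    degree = {s: len(adj[s]) for s in adj}
    size = Counter(labels.values())
    total = 0.0
    for s in adj:
        for t in adj:
            # Kronecker delta and strict Heaviside step
            if labels[s] != labels[t] or degree[s] <= degree[t]:
                continue
            overlap = len(adj[s] & adj[t])
            null = degree[s] * degree[t] / n
            total += (overlap - null) / (degree[t] * (size[labels[s]] - 1))
    return 2 * total / n


def closed_forms(n_blocks, block_size):
    """Eqs. (6) and (7) for the ring with one link between adjacent hubs."""
    n, c = n_blocks * block_size, block_size
    i_single = 1 - 2 / (3 * n) - 2 / (3 * n_blocks) - 4 * n_blocks / n ** 2
    harmonic = sum(1 / t for t in range(1, c))
    g = (c - 1) * (c ** 2 + c + 6)
    i_pairs = (c - 1) / (2 * c - 1) * i_single \
        + 2 / (c * (2 * c - 1)) * (harmonic - g / (3 * n))
    return i_single, i_pairs


def numerical_values(n_blocks, block_size):
    adj = ring_of_stepwise_blocks(n_blocks, block_size)
    single = {s: s // block_size for s in adj}
    pairs = {s: s // (2 * block_size) for s in adj}
    return in_block_nestedness(adj, single), in_block_nestedness(adj, pairs)


print(numerical_values(10, 5))
print(closed_forms(10, 5))
```

Under these assumptions the two printed pairs of values should coincide up to floating-point error, and the value for the correct partition stays above that of the pairwise-merged one.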
V. GENERALIZING THE ABSENCE OF RESOLUTION LIMIT IN I: NUMERICAL APPROACH ON BENCHMARK GRAPHS

Supported by the excellent agreement between analytical and numerical results in Figure 3, we now carry out a numerical validation considering less idealized scenarios. We do so by examining numerically whether the in-block nestedness function presents a resolution limit in scenarios beyond $\ell = 1$, where modularity does. To this end, we analyze benchmark networks along the lines of Figure 1 (middle-right and bottom-right panels), that is, we build unipartite synthetic networks composed of a growing ring of blocks that internally exhibit a nested structure. We study a wide range of these networks, modifying the number of blocks $B$ that form the ring, and the number of inter-block links $\ell$. We start with a network composed of $B = 3$ (perfectly nested) stepwise blocks connected as a ring, and then consider a growing number of blocks (up to $B = 200$). Regarding the inter-block connectivity $\ell$, we start with $\ell = 1$, which corresponds to the analytical calculations above, and go up to $\ell = C(C-1)/2$, which corresponds to the maximum possible connectivity between contiguous blocks. Details on the generation of the internal nested structure of the blocks, and on how it determines the internal block density, are available in Appendix C.

We carry out the numerical validation in two flavors. In the first one, we consider a random strategy, where the blocks are connected by adding a link between two randomly selected nodes from each block. For this case, we report results averaged over 25 realizations. In the second one, the addition of inter-block links ($\ell \geq 1$) is deterministic, connecting the most-generalist available nodes in each pair of adjacent communities (generalist-based strategy).
Note that, strictly speaking, this latter strategy is the logical generalization of our analytical results (where a single link was laid between adjacent blocks, connecting the generalist nodes in them).

For both strategies, we compare numerically the in-block nestedness of the ground-truth partition, $I_{\mathrm{single}}$, against the in-block nestedness of the wrong partition obtained by considering pairs of adjacent blocks as a single block, $I_{\mathrm{pairs}}$. If the in-block nestedness has a resolution limit beyond the scenario presented in the previous section, then for some value of $B$ we would observe a crossover from $\Delta I := I_{\mathrm{single}} - I_{\mathrm{pairs}} > 0$ to $\Delta I < 0$, as indeed happens with $Q$. All these results are shown in Figure 4, where the top and middle rows present the results for $\Delta I$ in the random and generalist strategies, respectively. The bottom row, conversely, corresponds to the results for $\Delta Q$ for both strategies, since they present, quite surprisingly, identical behavior. For the sake of clarity, a black vertical line is drawn in each panel marking the weak community criterion. Beyond this limit, no recognizable block structure is available, and therefore it becomes irrelevant whether a given quality function identifies a "correct" block or not. Each column of the figure corresponds to a different block size $C$.

For the random strategy, each point in the parameter space $(B, \ell)$ of the panels in Figure 4 reports the average value of $\Delta I$ (top row) and $\Delta Q$ (bottom row) over 25 different realizations. There are at least three remarkable lessons from Figure 4, equally valid for both linking strategies. First, only $Q$ shows the existence of a resolution limit consistently: no matter the number of inter-block links $\ell$, it is always possible to find a large enough number of blocks such that the resolution limit appears, i.e. $Q_{\mathrm{pairs}}$ is larger than $Q_{\mathrm{single}}$.
Second (a consequence of the first), the appearance of the resolution limit for $Q$ is independent of the criterion of weak community: the crossover to $\Delta Q < 0$ can occur anywhere in the $\ell$ spectrum, and it depends on $B$ only (i.e., on network size, in line with the analytic results in [24]). Of course, increasing $\ell$ reduces the number of blocks $B$ needed to reach the crossover (note the logarithmic scale on the $B$ axis). Finally, the robustness of the single block as the best partitioning scheme for in-block nestedness (i.e. $\Delta I > 0$) is remarkably high: note that $I_{\mathrm{single}}$ remains systematically larger than $I_{\mathrm{pairs}}$ until $\ell$ has almost reached the weak community criterion. In other words, $I$ identifies the correct block-by-block structure up to the point where such a partition (or any other one) becomes unrecognizable.

The only relevant difference between the random (top) and the generalist (middle) linking strategies is related to $\ell$: the area of the parameter space where the in-block nestedness cannot detect the correct block partition ($\Delta I < 0$) is substantially smaller in the generalist strategy, compared to the same area under the random strategy. This indicates that when inter-block connections are preferentially established by local hubs (or generalists), in-block nestedness can detect blocks of locally nested interactions even when these blocks are not communities in the traditional sense. Other than this important remark, the previous conclusion holds: $I$ does not show a dependency on $B$ (and thus on $N$) by which its ability to detect the right partition is affected, and thus $I$ appears to lack a size-related resolution limit.

FIG. 4. Resolution limit in random- and generalist-connected rings of nested blocks. The panels represent three-dimensional plots in the parameter space $(B, \ell)$ showing, on the $z$-axis (color code), the values of $\Delta I$ (top and middle panels) and $\Delta Q$ (bottom panel). In the random linking strategy (top and bottom), results are averaged over 25 different realizations. The solid black line indicates the transition from weak communities to no communities, as defined by Radicchi et al. [27].

VI. CONCLUSIONS
During the last few years, in-block nestedness has become a relevant structural arrangement in complex networks, and a precise formulation of an appropriate quality function to detect in-block nested patterns has been recently introduced. Nonetheless, the possible inherent constraints of this quality function are still largely unknown. In particular, the potential existence of a resolution limit for in-block nestedness, similar to the one found for modularity, had remained unexplored.

In this work, we have examined whether the in-block nestedness function exhibits a modularity-like resolution limit, i.e., the inability to identify blocks smaller than a certain scale. We have approached the question of in-block nestedness' resolution limit as a three-step process. First, we have performed an informal test on empirical networks, to assess the extent to which a network can be recursively split into smaller and smaller blocks, which is an indication of the existence of a resolution limit [24]. From there, building on the intuition that in-block nestedness lacks a resolution limit (or, at least, that it is less severe than that of $Q$), we have provided a formal proof that $I$ does not have a resolution limit, at least in a specific setting: that in which different blocks are connected by a single link. Finally, we have numerically generalized and confirmed the analytical argument, exhaustively studying a large parameter space with varying network size and inter-block connectivity.

A limitation of our study is that we have focused on the resolution limit that characterizes the modularity function [24]. Our results do not rule out the possibility that in-block nestedness might exhibit different kinds of biases in favor of different properties, e.g. specific intra-block densities, or block relative size distribution.
Additional studies on alternative sources of biases of existing quality functions for network analysis are of utmost importance in order to accurately understand the architecture of ecological and socioeconomic systems.

ACKNOWLEDGEMENTS
MSM and CJT acknowledge financial support from the URPP Social Networks at the University of Zurich, and the Swiss National Science Foundation (Grant No. 200021-182659). MSM acknowledges financial support from the Science Strength Promotion Program of the UESTC, and the UESTC professor research start-up (Grant No. ZYGX2018KYQD215). M.J.P., A.S-R. and J.B-H. acknowledge the support of the Spanish MICINN project PGC2018-096999-A-I00. M.J.P. acknowledges as well the support of a doctoral grant from the Universitat Oberta de Catalunya (UOC).
Appendix A: Empirical datasets
The empirical ecological networks analyzed here represent bipartite mutualistic and competitive systems, including macroscopic and microscopic environments. Network data can be downloaded from [42] in different formats, and can be filtered depending on the type of interaction of the system (e.g. plant-pollinator, host-parasite) and the type of data, e.g. binary or weighted. In this work, we have analyzed a subset of these networks, all of them in their binary form. Thus, these networks are represented as a rectangular $N \times M$ matrix, where rows and columns refer to interacting species. An entry of the matrix is $a_{ij} = 1$ if species $i$ of one guild interacts with species $j$ of the other guild at least once, and 0 otherwise.

On the other hand, for the collaboration networks we collected data from an open source software projects dataset through GitHub [43], a social coding platform that provides source code management and collaboration features. Similar to the ecological networks above, for each project we build a bipartite unweighted network as a rectangular $N \times M$ matrix, where rows and columns refer to the contributors and source files of each open source software project, respectively. An entry of the matrix is $a_{ij} = 1$ if contributor $i$ has edited file $j$ at least once, and 0 otherwise. More details on this dataset can be found in [6]. The dataset with the OSS projects is available at http://cosin3.rdi.uoc.edu, under the Resources section.

Appendix B: Optimization algorithm
As mentioned in the main text, we have employed the extremal optimization algorithm to maximize the modularity and in-block nestedness quality functions. This algorithm was adapted for modularity optimization by Duch and Arenas [41]. Notably, it offers a good trade-off between accuracy and computational speed. Additionally, the "simplicity" of the algorithm, based on the optimization of local variables, facilitates its adaptation to the case of the in-block nestedness quality function.

The algorithm proceeds as follows: starting from a random partition of a network into two groups with the same number of nodes, at each step, a local fitness measure is calculated for each node by dividing the node's local contribution to the quality function by its degree. With some probability, the node with the lowest fitness is moved to the other group. Each movement implies a change in the partition, and a recalculation of the fitness is performed. The process is then repeated until the global fitness score can no longer be improved. Once such a bipartition is at hand, each subgraph is considered as a graph on its own, and the procedure is repeated recursively for each one, as long as the fitness function increases with each subsequent partition.

The corresponding software codes for modularity and in-block nestedness optimization (for uni- and bipartite cases) can be downloaded from the web page http://cosin3.rdi.uoc.edu/, under the Resources section.
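To make the bisection step described above more concrete, here is a minimal Python sketch of an extremal-optimization style bisection for modularity. It is an illustration only, not the authors' released code, and it makes several assumptions: node fitness is taken as $\lambda_s = \kappa_s / k_s - a_r$, with $a_r$ the fraction of link ends in the node's group; the node to move is drawn with probability proportional to $\mathrm{rank}^{-\tau}$ over the fitness ranking (the parameter `tau` and its default are hypothetical); and the loop stops after `patience` consecutive moves without improvement. The network is assumed to have no self-loops and no isolated nodes.

```python
import random


def modularity(adj, labels):
    """Block form of modularity for an undirected graph given as node -> set."""
    two_e = sum(len(neigh) for neigh in adj.values())
    q = 0.0
    for group in set(labels.values()):
        members = [s for s in adj if labels[s] == group]
        degree = sum(len(adj[s]) for s in members)
        internal_ends = sum(1 for s in members for t in adj[s] if labels[t] == group)
        q += internal_ends / two_e - (degree / two_e) ** 2
    return q


def extremal_bisection(adj, tau=1.4, patience=None, seed=0):
    """Split a graph in two groups by repeatedly moving low-fitness nodes."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    shuffled = nodes[:]
    rng.shuffle(shuffled)
    # random initial bisection with (almost) equal group sizes
    labels = {s: int(i >= len(nodes) // 2) for i, s in enumerate(shuffled)}
    two_e = sum(len(adj[s]) for s in nodes)
    best_labels, best_q = dict(labels), modularity(adj, labels)
    patience = 10 * len(nodes) if patience is None else patience
    weights = [rank ** (-tau) for rank in range(1, len(nodes) + 1)]
    stale = 0
    while stale < patience:
        group_degree = [0, 0]
        for s in nodes:
            group_degree[labels[s]] += len(adj[s])

        def fitness(s):
            internal = sum(1 for t in adj[s] if labels[t] == labels[s])
            return internal / len(adj[s]) - group_degree[labels[s]] / two_e

        ranked = sorted(nodes, key=fitness)          # worst-fit node first
        mover = rng.choices(ranked, weights=weights, k=1)[0]
        labels[mover] = 1 - labels[mover]            # move to the other group
        q = modularity(adj, labels)
        if q > best_q + 1e-12:
            best_labels, best_q, stale = dict(labels), q, 0
        else:
            stale += 1
    return best_labels, best_q


# Two 4-cliques joined by a single link.
clique_edges = [(a, b) for a in range(4) for b in range(a + 1, 4)]
clique_edges += [(a + 4, b + 4) for a, b in clique_edges] + [(3, 4)]
graph = {s: set() for s in range(8)}
for a, b in clique_edges:
    graph[a].add(b)
    graph[b].add(a)

labels, q = extremal_bisection(graph)
print(labels, q)
```

A full optimizer would apply this bisection recursively to each resulting subgraph, as described above. Adapting the sketch to in-block nestedness would require replacing both `modularity` and the node fitness with quantities derived from Eq. (2).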
Appendix C: Benchmark graph model
The internal nested structure of the blocks is generated by defining a separatrix line that divides the filled and empty regions of the adjacency matrix. Following [22, 44], we partition the $[0,1] \times [0,1]$ bi-dimensional plane into $N \times N$ cells, and then we only add a link to the cells whose center lies above the separatrix. We define the separatrix as

$$f(x) = 1 - \left(1 - x^{1/\xi}\right)^{\xi}, \tag{C1}$$

where $x \in [0,1]$. Parameter $\xi \in (0, \infty)$ controls the slimness of the nested structure and, as a consequence, the internal density of the blocks. In the following, we shall often set $\xi = 1$, which corresponds to a stepwise block where the degrees of two consecutive rows $i$ and $i+1$ differ by one, $k_{i+1} - k_i = 1$.

Note that this procedure generates unipartite networks with self-loops, which might be an unrealistic trait for some empirical networks. Nevertheless, our results are robust with respect to the removal of self-loops.

The corresponding software codes to generate nested, modular, and in-block nested networks, for uni- and bipartite cases, can be downloaded from the web page of the group, http://cosin3.rdi.uoc.edu/, under the Resources section.

Appendix D: Proving the absence of a resolution limit in a ring of weakly-interconnected blocks
In this Section, we derive the analytic results presented in the main text, namely, the in-block nestedness of a set of disconnected stepwise blocks (Eq. (5), derivation in Appendix D 1) and the results needed to prove the absence of a resolution limit (Eqs. (6)–(8), derivation in Appendix D 2).
1. Derivation of the in-block nestedness of a set of disconnected stepwise blocks
As each block has an internally nested structure, by definition, $O_{st} = k_t$ if $k_t < k_s$. Therefore, Eq. (4) becomes

$$\mathcal{N}_\alpha = \frac{2\, f(\mathbf{k}^{(\alpha)})}{N\,(C_\alpha - 1)}, \tag{D1}$$

where $f(\mathbf{k}^{(\alpha)})$ denotes the following function of the vector $\mathbf{k}^{(\alpha)}$ composed of the degrees of the nodes that belong to block $\alpha$:

$$f(\mathbf{k}^{(\alpha)}) := \sum_{s \in \alpha} \left(1 - \frac{k_s}{N}\right) \sum_{t \in \alpha} \Theta(k_s - k_t). \tag{D2}$$

Note that, in general, the function $f(\mathbf{k}^{(\alpha)})$ depends on the perfectly-nested block's internal shape or, equivalently, on the density of the perfectly-nested block. The factor $\sum_{t \in \alpha} \Theta(k_s - k_t)$ represents the number of nodes with degree strictly smaller than $k_s$. As we are considering stepwise perfectly nested networks, we have $\sum_{t \in \alpha} \Theta(k_s - k_t) = k_s - 1$. Hence, after rearranging some terms,

$$f(\mathbf{k}^{(\alpha)}) = \left(1 + \frac{1}{N}\right) \sum_{s \in \alpha} k_s - \frac{1}{N} \sum_{s \in \alpha} k_s^2 - C_\alpha. \tag{D3}$$

For stepwise perfectly-nested networks, the following identities hold:

$$\sum_{s \in \alpha} k_s = \sum_{s=1}^{C_\alpha} s = \frac{C_\alpha (C_\alpha + 1)}{2}, \qquad \sum_{s \in \alpha} k_s^2 = \sum_{s=1}^{C_\alpha} s^2 = \frac{C_\alpha (C_\alpha + 1)(2 C_\alpha + 1)}{6}, \qquad \sum_{t} \Theta(k_s - k_t) = k_s - 1. \tag{D4}$$

By replacing (D4) into (D3), and rearranging some terms, we finally obtain

$$f(\mathbf{k}^{(\alpha)}) = \left(-\frac{1}{2} + \frac{1}{3N}\right) C_\alpha + \frac{C_\alpha^2}{2} - \frac{C_\alpha^3}{3N}. \tag{D5}$$

By plugging (D5) into (D1) and rearranging some terms, we obtain

$$\mathcal{N}_\alpha = \frac{C_\alpha}{N} - \frac{2}{3 N^2}\, C_\alpha (C_\alpha + 1). \tag{D6}$$

This represents the nestedness of a stepwise block $\alpha$ composed of $C_\alpha$ nodes.

We now calculate the in-block nestedness, $I$, of a network composed of a set of disconnected stepwise blocks. This is readily obtained by summing the contributions $\mathcal{N}_\alpha$, given by Eq. (D6), over all the blocks that compose the network. In the limit scenario where all the blocks are small compared to the network ($C_\alpha \ll N$), we obtain $\mathcal{N}_\alpha \simeq C_\alpha / N$ and, as a consequence, $I \simeq 1$. In the general case, using $\sum_\alpha C_\alpha = N$, we obtain

$$I = 1 - \frac{2}{3N} - \frac{2}{3 N^2} \sum_\alpha C_\alpha^2. \tag{D7}$$

In the case of equally-sized blocks, $C_\alpha = C = N/B$, we have $\sum_\alpha C_\alpha^2 = N^2/B$ and therefore $I = 1 - \frac{2}{3}\left(\frac{1}{N} + \frac{1}{B}\right)$, i.e. Eq. (5). We verified numerically that this relation is correct.
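A numerical check of Eq. (D7) of the kind mentioned above can be sketched as follows. The sketch assumes that Eq. (4) of the main text has the form $I = \frac{2}{N} \sum_\alpha \frac{1}{C_\alpha - 1} \sum_{s,t \in \alpha} \left(\frac{O_{st}}{k_t} - \frac{k_s}{N}\right) \Theta(k_s - k_t)$ with a strict $\Theta$, which is the form consistent with Eqs. (D1) and (D2). The stepwise block is built by linking nodes $i$ and $j$ whenever $i + j \geq C - 1$ (zero-based indices), one illustrative way to obtain degrees $1, \dots, C$ with self-loops. All function names are hypothetical.

```python
from itertools import product


def stepwise_block(size):
    """Adjacency matrix of a stepwise perfectly nested block."""
    # Node i is linked to every node j with i + j >= size - 1, so its degree
    # (self-loop included) is i + 1 and neighbourhoods are nested.
    return [[1 if i + j >= size - 1 else 0 for j in range(size)]
            for i in range(size)]


def block_diagonal(block_sizes):
    """Adjacency matrix and block labels of disconnected stepwise blocks."""
    n = sum(block_sizes)
    adjacency = [[0] * n for _ in range(n)]
    labels = []
    offset = 0
    for label, size in enumerate(block_sizes):
        block = stepwise_block(size)
        for i, j in product(range(size), repeat=2):
            adjacency[offset + i][offset + j] = block[i][j]
        labels.extend([label] * size)
        offset += size
    return adjacency, labels


def in_block_nestedness(adjacency, labels):
    """Assumed form of Eq. (4): same-block pairs with k_s strictly above k_t."""
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    block_size = {lab: labels.count(lab) for lab in set(labels)}
    total = 0.0
    for s, t in product(range(n), repeat=2):
        if labels[s] != labels[t] or degree[s] <= degree[t]:
            continue
        # Number of common neighbours of s and t.
        overlap = sum(a * b for a, b in zip(adjacency[s], adjacency[t]))
        term = overlap / degree[t] - degree[s] / n
        total += term / (block_size[labels[s]] - 1)
    return 2.0 * total / n


def closed_form(block_sizes):
    """Right-hand side of Eq. (D7)."""
    n = sum(block_sizes)
    sum_sq = sum(c * c for c in block_sizes)
    return 1.0 - 2.0 / (3.0 * n) - 2.0 * sum_sq / (3.0 * n * n)
```

Calling `in_block_nestedness(*block_diagonal(sizes))` and `closed_form(sizes)` with the same list of block sizes, equal or unequal, should return matching values up to floating-point error.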
2. Proving the absence of a resolution limit
As detailed in Section IV, to prove the absence of a resolution limit, we need to calculate $I_{\text{single}}$ and $I_{\text{pairs}}$, and evaluate their difference. To calculate $I_{\text{single}}$, we perturb the perfectly in-block nested structure described above by connecting all the hubs of the $B$ blocks; each hub is now connected with two other hubs ($B$ inter-block links, in total); see Fig. 1 for an illustration. Because of their links with the hubs of the two adjacent blocks, the hubs have degree $C_\alpha + 2$. For simplicity, we assume that the blocks have the same size $C = N/B$; the hubs' degree is therefore $C + 2 = N/B + 2$. In the definition of in-block nestedness, the $O_{st}/k_t$ term remains always equal to one if $k_s > k_t$ because, internally, the blocks remain perfectly nested. The negative term now receives, for each block, an additional contribution given by the two extra links of each hub. Therefore, the in-block nestedness of the network can be expressed as $I_{\text{single}} = I + I_{\text{int}}$, where $I$ is given by Eq. (D7) and $I_{\text{int}}$ is the "interaction" term that results from the edges that connect the hubs. Overall, this extra term is

$$I_{\text{int}} = -\frac{2}{N} \sum_{\alpha=1}^{B} \frac{1}{C_\alpha - 1} \sum_{t \in \alpha} \frac{2}{N}\, \Theta(C + 2 - k_t) = -\frac{4B}{N^2}, \tag{D8}$$

where we used the fact that there are $C_\alpha - 1$ nodes of degree smaller than $C + 2$ in each block (simply, all the non-hub nodes). Therefore, we obtain Eq. (6). We verified numerically that this relation is correct.

The challenge is now to calculate $I_{\text{pairs}}$, i.e., the in-block nestedness for a partition where pairs of adjacent blocks are merged. For a given pair of blocks, $(\alpha_1, \alpha_2)$, merged into a block $\alpha_{12}$, we define the useful quantities:

$$f(\mathbf{k}^{\alpha_{12}}) = \sum_{s,t \in \alpha_{12}} \left(\frac{O_{st}}{k_t} - \frac{k_s}{N}\right) \Theta(k_s - k_t),$$
$$f(\mathbf{k}^{\alpha_1}) = \sum_{s,t \in \alpha_1} \left(\frac{O_{st}}{k_t} - \frac{k_s}{N}\right) \Theta(k_s - k_t),$$
$$f(\mathbf{k}^{\alpha_1,\alpha_2}) = \sum_{s \in \alpha_1,\, t \in \alpha_2} \left(\frac{O_{st}}{k_t} - \frac{k_s}{N}\right) \Theta(k_s - k_t). \tag{D9}$$

We stress the fundamental difference among these quantities: $f(\mathbf{k}^{\alpha_{12}})$ is obtained from the contribution of all pairs of nodes that belong to the merged block $\alpha_{12}$; $f(\mathbf{k}^{\alpha_1})$ only receives contributions from pairs of nodes that belong to the same in-block nested block $\alpha_1$; $f(\mathbf{k}^{\alpha_1,\alpha_2})$ only includes the contributions from pairs of nodes that belong to the same merged block $\alpha_{12}$, but to different in-block nested blocks $\alpha_1$ and $\alpha_2$, respectively. Based on symmetry with respect to permutations of the blocks, we obtain:

$$I_{\text{pairs}} = \frac{B}{2}\, \frac{2}{N\,(2C - 1)}\, f(\mathbf{k}^{\alpha_{12}}). \tag{D10}$$

Note that the block-size normalization factor is given by $1/(2C - 1)$ and there is an overall factor $B/2$, which reflects the property that the partition comprises $B/2$ merged blocks which contain $2C$ nodes each. By symmetry with respect to the permutation of $\alpha_1$ and $\alpha_2$, we also have:

$$f(\mathbf{k}^{\alpha_{12}}) = 2 f(\mathbf{k}^{\alpha_1}) + 2 f(\mathbf{k}^{\alpha_1,\alpha_2}). \tag{D11}$$

By using this identity and $B/N = 1/C$, we obtain

$$I_{\text{pairs}} = \frac{C - 1}{2C - 1}\, I_{\text{single}} + \frac{2}{C\,(2C - 1)}\, f(\mathbf{k}^{\alpha_1,\alpha_2}), \tag{D12}$$

where we used the identity

$$I_{\text{single}} = \frac{2B}{N\,(C - 1)}\, f(\mathbf{k}^{\alpha_1}). \tag{D13}$$

In order to compare $I_{\text{pairs}}$ against $I_{\text{single}}$, the calculation of $f(\mathbf{k}^{\alpha_1,\alpha_2})$ is left. We obtain

$$f(\mathbf{k}^{\alpha_1,\alpha_2}) = \sum_{t=1}^{C-1} t^{-1} - \frac{(C - 1)(C + 2)}{N} - \sum_{s=1}^{C-1} \frac{s\,(s - 1)}{N}. \tag{D14}$$

The three terms on the r.h.s. have a clear interpretation. The first term is the positive contribution that comes from the overlap between the hub of block $\alpha_1$ and the $C - 1$ non-hubs of block $\alpha_2$. The second term is the negative contribution that comes from the expected overlap between the hub of block $\alpha_1$ (with degree $C + 2$) and the $C - 1$ non-hubs of block $\alpha_2$. The third term is the negative contribution that comes from the expected overlap between the non-hubs of block $\alpha_1$ and the non-hubs of block $\alpha_2$; note that there is no overlap between the neighborhoods of the non-hubs of block $\alpha_1$ and the non-hubs of block $\alpha_2$. By using again the identities (D4) and rearranging some terms, we obtain

$$f(\mathbf{k}^{\alpha_1,\alpha_2}) = H_{C-1} - \frac{g(C)}{3N}, \tag{D15}$$

where $H_{C-1} := \sum_{t=1}^{C-1} t^{-1}$ denotes the $(C-1)$-th harmonic number, and we defined the polynomial function $g(C) := (C - 1)(C^2 + C + 6)$. Note that the two terms on the r.h.s. represent the contributions of the observed and expected overlap between the nodes that belong to the two different original blocks that are joined together in the merged partition. By plugging Eq. (D15) into Eq. (D12), we obtain Eq. (7) which, combined with Eq. (6), implies Eq. (8).

[1] T. M. Lewinsohn, P. Inácio Prado, P. Jordano, J. Bascompte, and J. M. Olesen, Oikos , 174 (2006).
[2] C. O. Flores, J. R. Meyer, S. Valverde, L. Farr, and J. S. Weitz, Proceedings of the National Academy of Sciences , E288 (2011).
[3] C. O. Flores, S. Valverde, and J. S. Weitz, The ISME journal , 520 (2013).
[4] S. J. Beckett and H. T. Williams, Interface Focus , 20130033 (2013).
[5] M. A. Mello, G. M. Felix, R. B. Pinheiro, R. L. Muylaert, C. Geiselman, S. E. Santana, M. Tschapka, N. Lotfi, F. A. Rodrigues, and R. D. Stevens, Nature ecology & evolution , 1 (2019).
[6] M. J. Palazzi, J. Cabot, J. L. C. Izquierdo, A. Solé-Ribalta, and J. Borge-Holthoefer, Scientific Reports , 13890 (2019).
[7] M. E. Newman and M. Girvan, Physical review E , 026113 (2004).
[8] W. W. Zachary, Journal of Anthropological Research , 452 (1977).
[9] R. Guimerà and L. A. N. Amaral, Nature , 895 (2005).
[10] K. A. Eriksen, I. Simonsen, S. Maslov, and K. Sneppen, Physical Review Letters , 148701 (2003).
[11] L. A. Adamic and N. Glance, in Proceedings of the 3rd international workshop on Link discovery (ACM, 2005) pp. 36–43.
[12] S. Fortunato, Physics Reports , 75 (2010).
[13] B. D. Patterson and W. Atmar, Biological Journal of the Linnean Society , 65 (1986).
[14] W. Atmar and B. D. Patterson, Oecologia , 373 (1993).
[15] M. S. Mariani, Z.-M. Ren, J. Bascompte, and C. J. Tessone, Physics Reports , 1 (2019).
[16] J. Bascompte, P. Jordano, C. J. Melián, and J. M. Olesen, Proceedings of the National Academy of Sciences , 9383 (2003).
[17] S. Saavedra, D. B. Stouffer, B. Uzzi, and J. Bascompte, Nature , 233 (2011).
[18] S. Bustos, C. Gomez, R. Hausmann, and C. A. Hidalgo, PLOS ONE , e49393 (2012).
[19] M. D. König, C. J. Tessone, and Y. Zenou, Theoretical Economics , 695 (2014).
[20] J. Borge-Holthoefer, R. A. Baños, C. Gracia-Lázaro, and Y. Moreno, Scientific Reports (2017).
[21] M. Almeida-Neto, P. Guimarães, P. R. Guimarães, R. D. Loyola, and W. Ulrich, Oikos , 1227 (2008).
[22] A. Solé-Ribalta, C. J. Tessone, M. S. Mariani, and J. Borge-Holthoefer, Physical Review E , 062302 (2018).
[23] S. Kojaku and N. Masuda, Physical Review E , 052313 (2017).
[24] S. Fortunato and M. Barthélemy, Proceedings of the National Academy of Sciences , 36 (2007).
[25] S. Fortunato and D. Hric, Physics Reports , 1 (2016).
[26] M. Newman, Networks: an introduction (Oxford university press, 2010).
[27] F. Radicchi, C. Castellano, F. Cecconi, V. Loreto, and D. Parisi, Proceedings of the National Academy of Sciences , 2658 (2004).
[28] M. Rosvall and C. T. Bergstrom, Proceedings of the National Academy of Sciences , 1118 (2008).
[29] S. P. Borgatti and M. G. Everett, Social Networks , 375 (2000).
[30] P. Rombach, M. A. Porter, J. H. Fowler, and P. J. Mucha, SIAM Review , 619 (2017).
[31] F. Chung and L. Lu, Proceedings of the National Academy of Sciences , 15879 (2002).
[32] F. Chung and L. Lu, Annals of Combinatorics , 125 (2002).
[33] M. R. Garey and D. S. Johnson, Computers and intractability: a guide to the theory of NP-completeness (W. H Freeman, New York, 1979).
[34] L. Danon, A. Diaz-Guilera, J. Duch, and A. Arenas, Journal of Statistical Mechanics: Theory and Experiment , P09008 (2005).
[35] T. P. Peixoto, Physical Review Letters , 148701 (2013).
[36] S. Sobolevsky, R. Campari, A. Belyi, and C. Ratti, Physical Review E , 012811 (2014).
[37] W. Ulrich, M. Almeida-Neto, and N. J. Gotelli, Oikos , 3 (2009).
[38] V. A. Traag, P. Van Dooren, and Y. Nesterov, Physical Review E , 016114 (2011).
[39] S. Kojaku and N. Masuda, New Journal of Physics , 043012 (2018).
[40] S. Kojaku, M. Xu, H. Xia, and N. Masuda, Scientific Reports , 404 (2019).
[41] J. Duch and A. Arenas, Physical Review E , 027104 (2005).
[42] “Web of life: ecological networks database,”
[43] “Github: software development platform,” https://github.com.
[44] M. J. Palazzi, J. Borge-Holthoefer, C. Tessone, and A. Solé-Ribalta, Journal of the Royal Society Interface 16