Modularity maximisation for graphons
Florian Klimm†, Nick S. Jones‡, and Michael T. Schaub§
Abstract.
Networks are a widely-used tool to investigate the large-scale connectivity structure in complex systems and graphons have been proposed as an infinite size limit of dense networks. The detection of communities or other meso-scale structures is a prominent topic in network science as it allows the identification of functional building blocks in complex systems. When such building blocks may be present in graphons is an open question. In this paper, we define a graphon-modularity and demonstrate that it can be maximised to detect communities in graphons. We then investigate specific synthetic graphons and show that they may show a wide range of different community structures. We also reformulate the graphon-modularity maximisation as a continuous optimisation problem and so prove the optimal community structure or lack thereof for some graphons, something that is usually not possible for networks. Furthermore, we demonstrate that estimating a graphon from network data as an intermediate step can improve the detection of communities, in comparison with exclusively maximising the modularity of the network. While the choice of graphon-estimator may strongly influence the accord between the community structure of a network and its estimated graphon, we find that there is a substantial overlap if an appropriate estimator is used. Our study demonstrates that community detection for graphons is possible and may serve as a privacy-preserving way to cluster network data.
Key words. networks, community detection, modularity maximisation, graphs, graphons, privacy
AMS subject classifications.
1. Introduction.
Networks have become nearly ubiquitous abstractions for complex systems arising in biological, technical, and social contexts, and many other applications [1]. Mathematically, such networks are represented as graphs in which nodes represent entities and edges represent relations between such entities. Accordingly, graph-based tools have been employed to study and reveal properties of such network systems. Of particular interest has been the detection of community structure, i.e., a grouping of nodes that are more similar to each other according to some criterion than to the rest of the network. There is no standard definition of community structure [2, 3, 4] and different notions of community structure exist. We adopt here the common viewpoint of defining communities as groups of nodes that are densely connected when compared to the remainder of the network.

Detecting such communities has led to insights about the function of proteins [5], social networks [6], neuroscience [7], and many other fields [8]. A wide range of community-detection algorithms exists, such as maximum-likelihood estimation of generative network models, spectral methods, and matrix-decomposition approaches [2, 3]. Despite some known limitations, such as an inherent resolution limit [9], one of the most popular community detection algorithms is modularity maximisation, which seeks to optimize the modularity score proposed in a seminal paper by Newman and Girvan [10].

A current challenge in the analysis of real-world systems is that increased measurements and data availability have been creating a need for algorithms and analysis tools that scale to very large networks. In this context graphons have emerged as one promising non-parametric generative model for the study of large networks (a more detailed introduction to graphons will be given in Section 2; briefly, graphons are functions on the unit square originally proposed as continuous limiting objects for dense graph sequences [11]). Using the Aldous–Hoover representation theorem [12, 13, 14], it can be shown that graphons encapsulate a large number of popular existing generative graph models, such as the Erdős–Rényi graph, the stochastic block model [15] and its variants, random dot product graphs [16], and many further latent variable graph models [17]. Graphons are thus very flexible probabilistic models that can represent a wide range of network structures.

The price for this flexibility, however, is that a graphon-based network model may still be quite complex and not easy to interpret for a practitioner. Hence, when inferring a graphon from empirical data, we might end up trading one large complex network for another complex object, which has arguably impeded the adoption of general graphon models by applied scientists (see Figure 1a for an illustration). Indeed, one reason for the interest in community detection is that communities enable a simplified description of a large network, by decomposing the network into modular "building blocks". To address this issue in the context of graphons, in this work we develop community detection using a form of modularity optimization for graphons. Our work thereby serves as a first step towards an effective, more interpretable summarization of an inferred graphon describing a complex network.

Submitted to the editors January 5, 2021.
Funding: F.K. and N.S.J. thank the EPSRC (Centre for Mathematics of Precision Healthcare; EP/N014529/1). M.T.S. acknowledges funding from the Ministry of Culture and Science (MKW) of the German State of North Rhine-Westphalia ("NRW Rückkehrprogramm").
† Department of Mathematics, Imperial College London & MRC Mitochondrial Biology Unit, University of Cambridge.
‡ Department of Mathematics, Imperial College London.
§ Department of Computer Science, RWTH Aachen University.
Approximating and simplifying graphons via communities.
As outlined above, one main motivation for developing a form of community detection for graphons arises from the need to simplify an empirically obtained graphon further. As graphons encapsulate many popular generative graph models, obtaining a community structure for graphons allows us to estimate community structure for these models without the need to construct networks from them. While we here concentrate on providing an "assortative block simplification" in terms of community structure via modularity maximisation, other approaches are conceivable as well. For instance, we may use low-rank approximations or corresponding spectral embeddings to obtain a simplified description of a graphon. In fact, we may be interested in other types of potential simplifications and analyses of graphons, including further structural decompositions such as core-periphery approximations [18] or centrality measures [19]. See also Subsection 1.3 for a brief overview of the emerging area of graphon analysis for applications.
Community detection via continuous optimization.
As graphons are objects defined in a continuous domain (functions on the unit square), some techniques from continuous mathematics become applicable for their analysis. This may provide a potential reservoir of new algorithms and analysis tools for networks. While not the main focus of this paper, we will see a simple example of this kind in Section 4, where an analytical solution to the modularity optimization problem on certain graphons is derived. More generally, a fruitful endeavour could be to characterise how classical network algorithms might be seen as discrete approximations of certain problems defined on graphons, which could lead to a deeper understanding of these problems.
Privacy-preserving computation.
The protection of sensitive data against unauthorised access is a crucial part of data warehousing and data analysis. Large-scale network data can be such sensitive information. For instance, network data may comprise health records such as brain connectomics [20], individuals' social contact data such as their Facebook network [21], or commercially confidential information. Facebook friendships, for example, may be used to expose an individual's sexual orientation [22]. Graphons are one way to represent such data as an approximation that partially preserves large- and meso-scale features of the data, while providing some anonymity for individuals. Importantly, it has been demonstrated that we can estimate graphons from network data while preserving the privacy of individual nodes [23, 24].

By sharing such a privacy-preserving graphon, a data-collecting entity can thus enable further analysis of such network data. This may lead to valuable insights into the system, while simultaneously preserving the privacy of the involved individuals. Figure 1b shows a schematic for such an approach for the case of community detection, in which an external entity is given access to a graphon created from a system graph G in a privacy-preserving way. The external entity can now detect communities and analyse the graphon to gain insights about the system, without compromising the privacy of the individual nodes. This procedure may also be of interest if the particular network analysis cannot be performed by the data-collecting entity, e.g., because of a lack of sufficient computational resources.

In this paper, we introduce the problem of modularity optimization for graphons. We define a modularity function for graphons and introduce corresponding algorithms to detect community structure in synthetic and empirically estimated graphons. Specifically, we adapt the popular Louvain algorithm for modularity maximization in networks to the context of graphons.
Additionally, we discuss a simple continuous optimization variant for modularity that we call sliced modularity optimization. For selected graphons, this enables us to derive analytical expressions of the optimal community structure. Our characterization of community structure in terms of the modularity of a graphon further provides a necessary and sufficient criterion for graphons to contain no modular structure. Namely, for graphons that have a product form (the graphon operator has rank 1), any partition has modularity zero, i.e., there is no evidence for the presence of communities in terms of the modularity function.

Figure 1. Schematic: graphon modularity enables the detection of community structure, which is, as for networks, a simplified representation of the mesoscale connectivity structure. Furthermore, graphon-based computations enable privacy-preserving network analysis. (a) Classic community detection infers a community structure in a network. Networks can be generated from graphons. The inverse process is the inference of graphons from network data. We establish a community detection for graphons which allows the inference of communities in graphons. (b) For the privacy-preserving network analysis, the data-hosting entity infers a graphon from a network and keeps the node positions $x_i$ private, while sharing the graphon with external entities (or publicly). The external entity can then compute the community structure of the graphon and return it to the data-hosting entity, which combines it with the node positions $x_i$ to obtain the community membership of each node in the network. During the whole process, the privacy of individual nodes is preserved.

As there already exist a number of algorithms for inferring a graphon from network data, we focus mainly on a scenario where the graphon of interest is already given when developing our modularity optimization approach. In Section 7, however, we also study graphons inferred from empirically observed and synthetic network data. We report on how the graphon inference steps can impact the results of modularity optimization using numerical simulations. Using stochastic blockmodels, we also illustrate that combining graphon estimation with graphon community detection may improve the performance of modularity-based community detection in comparison with standard, graph-based modularity maximisation.
Intuitively, by first fitting a graphon to the network we smooth out random fluctuations in the data, which can impede modularity maximization. At the same time, the full graphon description can be kept for a more refined interrogation of the network structure, e.g., in terms of centrality measures [19] or for sampling surrogate network data.

While in the setting of stochastic blockmodels there is a planted ground truth partition to compare against, for real networks this is, of course, not the case. We thus investigate numerically for several real networks the extent to which standard modularity optimization (i.e., directly applying modularity maximization to an observed graph) and graphon-based modularity maximization (i.e., first inferring a graphon, then finding the partition maximising graphon modularity) lead to the same results. For empirical data, we find that the accord between the community structure detected in a network and its estimated graphon depends on the chosen estimation algorithm. Assuming that partitions that optimize modularity in the graph are those we want to detect in the graphon, this is an important aspect if we want to use the graphon-based scheme for privacy-preserving computation.
Graphon-based network analysis.
Our work relates to the development of graphon analysis tools and metrics, which is an active field of research. Indeed, centrality measures [19], controllability [25], random-walk Laplacians [26], epidemic spreading [27], and subgraph counts [28] have all been recently derived for graphons.

Graphons have also been proposed as a general tool for encapsulating sensitive information in a node-private way [23, 24]. In light of our previous discussion, the graphon-modularity we present here may thus also be interpreted as a privacy-preserving method for community detection.
Graphon estimation.
The estimation of graphons from network data has been studied in a number of publications, e.g., [29, 30, 31]. Approaches include stochastic blockmodel approximations [32], neighbourhood-smoothing algorithms [33], and sorting-and-smoothing algorithms [34]. Estimating a general graphon exactly is only possible under certain identifiability conditions [35]. In practice, however, many approaches estimate a graphon as a mixture of stochastic blockmodels, each consisting of many blocks of equal size (also called network histograms [30]).
Community detection in networks.
Many different approaches have been developed to detect communities in networks [2, 3] and modularity maximisation is one of the most widely-used paradigms. The Louvain algorithm is a fast heuristic to solve the modularity-maximisation problem [36]. An alternative to community detection is to identify an embedding of discrete nodes in a continuous latent space (e.g., [37]). These latent embeddings may be interpreted in terms of generalized community structure [38, 39], or indeed graphon estimation [17].
The rest of the paper is structured as follows. After briefly reviewing some preliminaries in Section 2, we formally define graphon-modularity in Section 3. We then present approaches to optimise the graphon-modularity in Section 4 and we discuss different synthetic graphons in Section 5. In Section 6 we consider the issue of modularity maximisation on graphons inferred from finite graphs and graphs sampled from latent graphons. We finally perform numerical experiments on the performance of our methods for empirical networks in Section 7 and conclude with a discussion in Section 8.
2. Preliminaries.
2.1. Graphs.
We represent networks mathematically as finite, unweighted, and undirected graphs. A graph is an ordered pair $G = (V, E)$ composed of a set $V$ of nodes (vertices) and a set $E$ of edges (links), where each edge $e \in E$ corresponds to an unordered pair of two nodes [1, 40]. We denote the number of nodes in a graph as $N = |V|$ and the number of edges as $M = |E|$. Without loss of generality we will assume that the nodes of the graph have been labeled by the positive integers such that $V = \{1, \ldots, N\}$. Using this labeling we can define the adjacency matrix $A$ of the (labeled) graph as the $N \times N$ matrix with elements $A_{ij} = 1$ if $(i, j) \in E$ and $A_{ij} = 0$ otherwise. Based on this algebraic representation we can compute the degree $k_i$ of each node $i$ as $k_i = \sum_{j=1}^{N} A_{ij}$, i.e., $k_i$ is the number of edges incident to node $i$.

Graphons originally emerged in the study of limits of large-scale networks [41, 42, 43, 44, 45], and have been defined as limits of (dense) graphs for which the number of nodes $N \to \infty$. A graphon is a measurable function $W : [0,1]^2 \to [0,1]$ that is symmetric, i.e., $W(x, y) = W(y, x)$. An intuitive way to think of a graphon is in terms of the limiting object of a heatmap image ("pixel picture") [45] of a graph's adjacency matrix as follows. We assume that as the graph size $N \to \infty$, the heatmap image (the pixel picture) of the adjacency matrix is always spatially scaled to maintain the dimension of the unit square. In the limit there are thus $N \to \infty$ nodes associated with the unit interval $[0,1]$ and the values $x$ and $y$ in a graphon may thus be interpreted as the indices of the vertices in an infinite graph.

While this heuristic explanation provides some intuition, it needs refinement. Observe that for any graph we can permute the node labels and thereby change its representation in terms of the adjacency matrix, while leaving the graph structure unchanged. For a graphon to be a valid limiting object of graphs rather than of adjacency matrices (i.e., labeled graphs), a graphon can only be defined as a limiting object up to measure-preserving bijections of its arguments, i.e., $W(x, y) = W(\pi(x), \pi(y))$, where $\pi : [0,1] \to [0,1]$ is a measure-preserving map. A more precise characterization of the equivalence classes of graphons is provided in [44, 45].

Graphons may also be interpreted as nonparametric random graph models, as introduced in [41] under the name $W$-random graphs. We can sample a random graph of size $N$ within this model as follows. First, each node $i \in \{1, \ldots, N\}$ is assigned a latent position $u_i \in [0,1]$ (typically drawn uniformly at random). Second, any pair of nodes $i, j$ is then connected with an edge with probability $P(A_{ij} = 1) = W(u_i, u_j)$.

Similar to graphs we may associate a degree function $k(x)$ to every node $x$ in a graphon via the following Lebesgue integral:

$$k(x) = \int_0^1 W(x, y)\,dy. \tag{2.1}$$

Likewise we define the edge density $\mu$ of a graphon as:

$$\mu = \int_{[0,1]^2} W(x, y)\,dx\,dy = \int_0^1 k(x)\,dx. \tag{2.2}$$

A community is a set of nodes, such that nodes within the same community are more densely connected to each other than to nodes in other communities [2, 3, 8]. We restrict our discussion to non-overlapping communities, such that each node belongs to exactly one community. For convenience, we describe the vertex-to-group assignment by the function $g_V : V \to \{1, 2, \ldots, c\}$, which maps each node to one of the $c$ communities.
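The $W$-random graph sampling procedure and the degree function (2.1) and edge density (2.2) defined in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the midpoint-rule quadrature, the exclusion of self-loops, and the example graphon $W(x, y) = xy$ are all choices made here for illustration.

```python
import random

def sample_w_random_graph(W, n, rng):
    """Sample an n-node W-random graph; returns (latent positions, adjacency matrix)."""
    u = [rng.random() for _ in range(n)]      # latent positions u_i ~ Uniform[0, 1]
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):             # assumption: undirected, no self-loops
            if rng.random() < W(u[i], u[j]):  # edge with probability W(u_i, u_j)
                A[i][j] = A[j][i] = 1
    return u, A

def degree_function(W, x, m=1000):
    """Midpoint-rule approximation of k(x) = int_0^1 W(x, y) dy, Equation (2.1)."""
    return sum(W(x, (j + 0.5) / m) for j in range(m)) / m

def edge_density(W, m=200):
    """Midpoint-rule approximation of mu = int_0^1 k(x) dx, Equation (2.2)."""
    return sum(degree_function(W, (i + 0.5) / m, m) for i in range(m)) / m

def product_graphon(x, y):
    """Illustrative graphon W(x, y) = x * y (an assumption of this sketch)."""
    return x * y

rng = random.Random(0)
u, A = sample_w_random_graph(product_graphon, 50, rng)
k_half = degree_function(product_graphon, 0.5)  # analytically k(1/2) = 1/4
mu = edge_density(product_graphon)              # analytically mu = 1/4
```

For this particular graphon the integrands are linear, so the midpoint rule reproduces the analytical values up to floating-point error; for less regular graphons the grid size `m` controls the accuracy.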
Many different heuristic algorithms to find such a group assignment function have been developed (see [4, 46, 47] for reviews). Among the most widely-used heuristics is the so-called modularity maximisation. For this, one defines a modularity function

$$Q(g_V) = \frac{1}{2M} \sum_{i,j=1}^{N} (A_{ij} - P_{ij})\,\delta[g_V(i), g_V(j)] = \frac{1}{2M} \sum_{i,j=1}^{N} B_{ij}\,\delta[g_V(i), g_V(j)], \tag{2.3}$$

which is a quality index for a group assignment function $g_V$ of a network with adjacency matrix $A$ [48]. Here, the matrix $B = [B_{ij}]$ is the modularity matrix and its entries $B_{ij} = A_{ij} - P_{ij}$ are equal to the entries of the adjacency matrix shifted by a chosen null model term $P_{ij}$. This null model term is typically chosen to be the expected connection strength between nodes $i$ and $j$ under a chosen random graph model. The null model thus serves as a baseline to which the actual connections of the adjacency matrix are compared. Since the Kronecker delta $\delta[\cdot, \cdot]$ in Equation (2.3) is 1 if its arguments are equal and 0 otherwise, the above sum only takes into account elements of the modularity matrix for which nodes $i$ and $j$ belong to the same community, $g_V(i) = g_V(j)$. Accordingly, the modularity of a particular group assignment $g_V$ is equal to the sum (rescaled by $2M$) of the intra-community edges minus the expected weight of intra-community edges.

There are numerous choices for the null model term $P_{ij}$. There exist null models for spatially embedded networks [49, 50], null models for networks constructed from correlation models [51, 52], and for many other situations [3]. Here we focus on the typically considered Newman–Girvan null model $P_{ij} = k_i k_j / (2M)$ [10], also known as the Chung–Lu model [53], which preserves the expected degree distribution of the graph.

With this choice for $P_{ij}$ the modularity $Q$ can be written as:

$$Q(g_V) = \frac{1}{2M} \sum_{i,j=1}^{N} \left( A_{ij} - \frac{k_i k_j}{2M} \right) \delta[g_V(i), g_V(j)]. \tag{2.4}$$

Clearly there are a number of equivalent group assignments, as the group labels can be permuted without changing the induced partition of the nodes. It is thus the partition of the nodes induced by the group labels that is important for the modularity score, rather than the labels per se.

The task of community detection can now be formalized as the following modularity-maximisation problem [52]: Find a partition of the nodes (respectively a group assignment) that maximises the modularity function $Q$,

$$\max_{g_V} Q(g_V) = \max_{g_V} \frac{1}{2M} \sum_{i,j=1}^{N} B_{ij}\,\delta[g_V(i), g_V(j)], \tag{2.5}$$

where $g_V$ is a group assignment function, mapping each node to one community.
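As a concrete check of Equation (2.4), the following sketch evaluates the Newman–Girvan modularity of a given group assignment directly from the double sum. The function name and the toy graph (two triangles joined by a single edge) are illustrative assumptions and do not appear in the paper.

```python
def modularity(A, g):
    """Newman-Girvan modularity (2.4) of assignment g for a symmetric 0/1 matrix A."""
    n = len(A)
    k = [sum(row) for row in A]   # degrees k_i
    two_m = sum(k)                # 2M equals the sum of all degrees
    q = 0.0
    for i in range(n):
        for j in range(n):
            if g[i] == g[j]:      # Kronecker delta: same community only
                q += A[i][j] - k[i] * k[j] / two_m
    return q / two_m

# Toy graph: triangle {0, 1, 2}, triangle {3, 4, 5}, and the bridging edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

q_two_groups = modularity(A, [1, 1, 1, 2, 2, 2])  # one community per triangle
q_one_group = modularity(A, [1, 1, 1, 1, 1, 1])   # all nodes in a single community
```

Working the sum by hand for this graph ($M = 7$) gives $Q = 5/14$ for the two-triangle partition and $Q = 0$ for the trivial partition, which illustrates that placing all nodes in one community always yields zero modularity under this null model.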
Note that in the optimization problem Equation (2.5) the number $c$ of communities is not fixed. The modularity maximization problem may thus be viewed as searching over the set of all possible partitions of the nodes. Since this set becomes extremely large even for moderately sized graphs, Equation (2.5) is computationally difficult to optimize. In fact, it has been shown that modularity maximisation is an NP-hard problem [54]. In practice, the modularity optimization problem is thus solved approximately using (greedy) heuristic procedures, such as the Louvain algorithm or the Leiden algorithm [36, 55, 56], which have been shown to yield good empirical performance.
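To give a sense of how quickly the search space grows, note that the number of partitions of $N$ labeled nodes is the Bell number $B_N$ (a standard combinatorial fact, not a result of this paper). The sketch below computes Bell numbers with the Bell triangle; the function name is an illustrative choice.

```python
def bell_numbers(n_max):
    """Return [B_0, ..., B_{n_max}], computed with the Bell triangle."""
    bells = [1]                # B_0 = 1
    row = [1]
    for _ in range(n_max):
        new_row = [row[-1]]    # each row starts with the last entry of the previous row
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])   # the first entry of row n is B_n
    return bells

partition_counts = bell_numbers(10)
```

Already for ten nodes there are more than $10^5$ candidate partitions, so exhaustive enumeration is only feasible for very small graphs.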
3. A modularity function for graphons.
In this section, we define the modularity function for graphons, which we will later employ for community detection in graphons. Analogously to the community-detection problem for graphs, the community-detection problem for graphons can be expressed as the identification of a group assignment function

$$g : [0,1] \to \{1, 2, \ldots, c\}, \tag{3.1}$$

which assigns each node position $x \in [0,1]$ to one of $c$ communities. To find such a community-assignment function, we define a modularity function for graphons.

Definition 3.1 (Graphon modularity). For a graphon $W(x, y)$, a graphon null model $P(x, y)$ and a group assignment function $g$, we define the graphon-modularity as

$$Q(g) = \frac{1}{\mu} \int_0^1 \!\! \int_0^1 \underbrace{\left[ W(x, y) - P(x, y) \right]}_{\text{modularity surface } B(x, y)} \delta(g(x), g(y))\,dx\,dy, \tag{3.2}$$

where $\delta(\cdot, \cdot)$ denotes the Kronecker delta, which is 1 if its arguments are equal and 0 otherwise, and we have defined the modularity surface $B(x, y) = W(x, y) - P(x, y)$ as the analog of the modularity matrix for graphons.

Analogous to the graph case, the graphon modularity function may be interpreted as a measure of the quality of a node partition induced by the group assignment function $g$. Likewise, the modularity surface $B(x, y)$ indicates how well a graphon $W(x, y)$ at location $(x, y)$ is connected compared to the null model term $P(x, y)$.

As for graphs, different choices for the null model $P(x, y)$ may be sensible. For simplicity we here restrict the discussion again to a Newman–Girvan-type null model:

$$P(x, y) = \frac{1}{\mu}\,k(x)\,k(y). \tag{3.3}$$

Note that since graphs generated from sufficiently smooth graphons [11, 19] converge to graphons in the limit, the modularity surface of Definition 3.1 corresponds precisely to the (scaled) limiting object of the modularity matrix.

Proof (sketch). It can be shown that the normalized degree of node $i$ converges as $N \to \infty$ [19] and accordingly the degree density of the graph will converge to the graphon density $\mu$. This implies that the null model term will be well defined in the limit. Since furthermore the (scaled) adjacency matrix $A/N$ will converge to the graphon $W$, both terms that define the modularity surface converge and are well defined.

Given a group assignment $g(x)$, we further define the (relative) size $S_i \in (0, 1]$ of a community $c_i$ as

$$S_i = \int_0^1 \delta(g(x), c_i)\,dx. \tag{3.4}$$

The maximal size $S_i$ of any community equals one, which indicates that there exists a single group consisting of all nodes.

Analogously to graphs, we can now detect communities in graphons by finding a function $g$ that maximises the graphon modularity (3.2).

Definition 3.2 (Modularity-maximisation problem for graphons). Given a non-empty graphon $W(x, y)$ and a graphon null model $P(x, y)$ that define the modularity surface $B(x, y) = W(x, y) - P(x, y)$, the modularity-maximisation problem is:

$$\max_{g(x)} \int_0^1 \!\! \int_0^1 B(x, y)\,\delta(g(x), g(y))\,dx\,dy, \tag{3.5}$$

where $g$ is a group assignment function such that $S_i > 0$ for all communities $c_i$, i.e., each community has a nonzero measure.

Remark (Measure zero sets and $L^2$ equivalence). Recall that a graphon can only be defined meaningfully up to measure-preserving transformations. The same is true, mutatis mutandis, for the group assignment function $g$. Indeed, from the above Definition 3.1 of graphon modularity, it should be clear that a group assignment function can only be meaningfully defined up to $L^2$ equivalence. Specifically, let $L^2([0,1])$ denote the space of square-integrable functions $f : [0,1] \to \mathbb{R}$ with inner product $\langle f_1, f_2 \rangle = \int_0^1 f_1(x) f_2(x)\,dx$ and norm $\|f\| = \sqrt{\langle f, f \rangle}$. The elements of $L^2([0,1])$ are equivalence classes of functions: we identify two functions $f_1 \equiv f_2$ with each other if $\|f_1 - f_2\| = 0$.

For instance, changing the group assignment value $g(x)$ for any single $x \in [0,1]$ will not alter the modularity $Q(g)$ or the size of the communities $S_i$. This measure-preserving change of $g$ leads to a non-identifiability of the precise function $g$ in the optimization of graphon modularity. However, analogous to the possibility of permuting the group labels, this non-identifiability does not lead to practical problems, as in practice we are only concerned with group assignment functions up to $L^2$ equivalence. Similarly, we only consider communities with nonzero measure, i.e., we will require that $S_i > 0$ for all $c_i$, as specified in Definition 3.2.

As for modularity optimization for graphs, finding an optimal solution for the graphon modularity optimization problem is only possible for special cases. In Section 4, we thus explore two procedures to find either analytical expressions of the optimal community structure, or approximate solutions via numerical algorithms.
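The graphon modularity of Definition 3.1 with the null model (3.3) can be approximated numerically by evaluating the graphon on a grid. The sketch below is one possible discretisation under stated assumptions: the midpoint grid, the function names, and the two-block stochastic-blockmodel graphon with connection probabilities 0.8 and 0.2 are illustrative choices, not taken from the paper.

```python
def graphon_modularity(W, g, n=200):
    """Approximate Q(g) from Definition 3.1 on an n x n midpoint grid."""
    xs = [(i + 0.5) / n for i in range(n)]
    Wm = [[W(x, y) for y in xs] for x in xs]
    k = [sum(row) / n for row in Wm]   # degree function k(x), Equation (2.1)
    mu = sum(k) / n                    # edge density mu, Equation (2.2)
    labels = [g(x) for x in xs]
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:             # Kronecker delta
                q += Wm[i][j] - k[i] * k[j] / mu   # modularity surface B(x, y)
    return q / (n * n * mu)

def sbm_graphon(x, y):
    """Illustrative two-block graphon: 0.8 within blocks, 0.2 between blocks."""
    return 0.8 if (x < 0.5) == (y < 0.5) else 0.2

def planted(x):
    """Group assignment that follows the two blocks."""
    return 1 if x < 0.5 else 2

def trivial(x):
    """Group assignment with a single community."""
    return 1

q_planted = graphon_modularity(sbm_graphon, planted)
q_trivial = graphon_modularity(sbm_graphon, trivial)
```

For this graphon $k(x) = \mu = 1/2$, so $B(x, y) = \pm 0.3$ and the planted partition has $Q = 0.3$, whereas the single-community assignment has $Q = 0$. Because the grid cells align with the block boundary, the discretised values match these numbers up to floating-point error.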
4. Optimising graphon-modularity.
Here we explore two approaches to identify a maximum-modularity partition for graphons (see Figure 2 for schematic representations). The first approach is to discretise the modularity surface $B(x, y)$ (see Definition 3.1) and use a generalised Louvain algorithm (GenLouvain) to find a group assignment function $g$. The second approach is (semi-)analytical and works if we can constrain the set of group assignment functions $g$ to be monotonically increasing, which we can do for certain synthetic graphons. In this case we can find the exact maxima of modularity. We will show in Section 5 that for selected synthetic graphons both methods return essentially identical results, providing some further validation for the Louvain heuristic.

We use the GenLouvain algorithm [55] on a piecewise constant approximation of $W(x, y)$, as illustrated in Figure 2, to heuristically optimize graphon modularity. GenLouvain is a variant of the fast Louvain algorithm [36], which was originally designed for standard modularity optimization on simple graphs. To apply the GenLouvain algorithm, we first need to discretize the graphon $W(x, y)$ appropriately by trading off two aspects. First, we need to choose a fine enough grid to capture the variation of the graphon $W(x, y)$ in the $x$ and $y$ directions. Second, we would like to choose a grid that is as coarse as possible, to limit the computational costs of the optimization performed via GenLouvain.

In the following we approximate $W(x, y)$ using a uniformly spaced grid of size $2000 \times 2000$. This discretisation of $W(x, y)$ is fine enough for all the problems considered in this paper. Increasing the resolution of the grid further has essentially no effect on the results in practice.

We remark that more elaborate discretization schemes are conceivable and might result in computational gains. For instance, one could employ multigrid discretization schemes to obtain a better approximation of local features of the modularity surface, without incurring a large extra computational cost (for a review of multigrid schemes see [57]).

Figure 2. Optimizing graphon modularity. We discuss two procedures to maximise graphon-modularity. (Left panel) We discretise the graphon into intervals $[x, x + \Delta x]$ and apply a Louvain algorithm to optimise the modularity function for the resulting matrix. (Right panel) We restrict the modules to be continuous intervals (in this example $[0, x_1]$ and $(x_1, 1]$). We optimise the modularity by continuously varying the border $x_1$ between the communities. In this schematic, there exist only two communities, although in general there can be any number of communities.

We now consider a setup in which we can analytically establish the optimal partitions of a graphon in terms of the modularity function. This provides us with a way to validate the results we obtain from the discretization-based graphon-modularity optimization outlined above. Moreover, it highlights that (within certain situations) tools from continuous optimization can be employed to analyse modularity-maximizing partitions of graphons.

To this end, we restrict the possible group assignment functions $g(x)$ to be piecewise constant on $c$ intervals:

$$g(x) = \begin{cases} 1 & \text{if } x \in [0, x_1) \\ 2 & \text{if } x \in [x_1, x_2) \\ \;\vdots & \\ c & \text{if } x \in [x_{c-1}, 1]. \end{cases} \tag{4.1}$$

If the group assignment function $g(x)$ is of such a form, we can rewrite the graphon modularity function, with $x_0 = 0$ and $x_c = 1$, as:

$$Q(g) = \frac{1}{\mu} \sum_{i=1}^{c} L(x_{i-1}, x_i), \quad \text{with} \tag{4.2}$$

$$L(a, b) = \int_a^b \!\! \int_a^b B(x, y)\,dx\,dy = 2 \int_a^b \!\! \int_a^x B(x, y)\,dy\,dx, \tag{4.3}$$

where the last equality follows from the symmetry ($B(x, y) = B(y, x)$) of the modularity surface. We call the function $L(a, b)$ a modularity slice as the modularity function may be seen as a simple linear sum of these slices. Note that within the above formulation, we have thus restricted the (combinatorial) optimization problem to a much simpler optimization with $c - 1$ free variables, namely the $c - 1$ interior boundaries of the intervals on which $g(x)$ is constant. For instance, if we know that there are $c = 3$ communities we obtain:

$$Q(g) \propto L(0, x_1) + L(x_1, x_2) + L(x_2, 1), \tag{4.4}$$

which has only $x_1$ and $x_2$ as free variables. This continuous optimisation problem can in many cases be solved analytically. When this is not directly possible, however, we can optimise it using standard optimisation procedures, such as the Nelder–Mead method [58].

In the following, we will use the sliced-modularity approach to prove the maximum-modularity partition for a synthetic graphon. In the appendix, we discuss a generalised sliced-modularity approach, which allows the detection of communities in settings when less is known about the location of communities on the line.
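The sliced-modularity formulation (4.2)–(4.3) can be illustrated for $c = 2$ communities, where the only free variable is the boundary $x_1$. The sketch below is a simplified stand-in, not the procedure used in the paper: it replaces a continuous optimiser such as Nelder–Mead with a plain scan over grid-aligned boundary candidates, and it assumes a hypothetical two-block modularity surface with $B(x, y) = \pm 0.3$ and $\mu = 0.5$ (the surface of a two-block graphon with connection probabilities 0.8 and 0.2).

```python
def sliced_modularity_scan(B, mu, n=100):
    """Scan the boundary x_1 = b / n of the two-community partition [0, x_1), [x_1, 1]."""
    mid = [(i + 0.5) / n for i in range(n)]
    Bm = [[B(x, y) for y in mid] for x in mid]

    def slice_value(lo, hi):
        # Midpoint-rule approximation of the modularity slice L(lo / n, hi / n)
        return sum(Bm[i][j] for i in range(lo, hi) for j in range(lo, hi)) / (n * n)

    best_x1, best_q = None, float("-inf")
    for b in range(1, n):                               # candidate boundaries b / n
        q = (slice_value(0, b) + slice_value(b, n)) / mu  # Equation (4.2) with c = 2
        if q > best_q:
            best_x1, best_q = b / n, q
    return best_x1, best_q

def sbm_surface(x, y):
    """Assumed modularity surface: +0.3 within blocks, -0.3 between blocks."""
    return 0.3 if (x < 0.5) == (y < 0.5) else -0.3

best_x1, best_q = sliced_modularity_scan(sbm_surface, mu=0.5)
```

For this surface one can check by hand that $Q(x_1) = 1.2\,x_1^2$ for $x_1 \le 1/2$ and, by symmetry, $Q(x_1) = 1.2\,(1 - x_1)^2$ for $x_1 \ge 1/2$, so the scan recovers the block boundary $x_1 = 1/2$ with $Q = 0.3$. For more than two communities, or for surfaces without grid-aligned structure, a derivative-free continuous optimiser over the boundaries would be the natural replacement for the scan.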
5. Modularity optimization for synthetic graphons.
In this section we explore the detection of community structure for several synthetic graphons, and show how the obtained group assignments provide a simplified block description of the graphons.
Maximising the graphon-modularity returns a group assignment function $g(x)$ with the highest graphon-modularity $Q$. If the modularity surface of the graphon is $B(x, y) = 0$, however, all functions $g(x)$ have the same modularity $Q = 0$. In this case, it is not possible to find a partition that has a higher modularity than any other. Accordingly, we will say that such a graphon does not exhibit a community structure. Note that in contrast to the graph case, such a degenerate situation is possible even if the graphon itself is non-zero and well defined.

Proposition 5.1. Graphons of the form $W(x, y) = f(x) f(y)$ do not have community structure.

Proof. Let $W(x, y)$ be a graphon of the form $W(x, y) = f(x) f(y)$. The degree function is thus $k(x) = f(x) \int_0^1 f(y)\, dy$, the edge density is $\mu = \int_0^1 f(y)\, dy \int_0^1 f(x)\, dx$, and the null model term can be computed as $P(x, y) = k(x) k(y) / \mu = f(x) f(y)$. Therefore the modularity surface is $B(x, y) = 0$ and it follows that the modularity function is zero for all partitions.

An example of a graphon without community structure is the graphon associated to the Erdős–Rényi (ER) graph model $G(n, p)$, which can be represented as a graphon $W(x, y) = p$ with a constant connection probability $p \in [0, 1]$. This graphon can be written as $W(x, y) = \sqrt{p} \sqrt{p} = f(x) f(y)$ and thus has no community structure. In fact, the reverse is also true and all non-empty graphons without community structure are multiplicative (up to L equivalence).

Proposition 5.2. Let $W(x, y)$ be a non-empty graphon with vanishing modularity surface $B(x, y) = 0$. Then, the graphon is of the form $W(x, y) = f(x) f(y)$.

Proof. As $W(x, y)$ and $P(x, y)$ are positive, it follows from $B(x, y) = 0$ that $W(x, y) \equiv P(x, y)$ (in the sense of L equivalence), where the null model term is given by $P(x, y) = k(x) k(y) / \mu$. We can therefore write $W(x, y) = f(x) f(y)$ with $f(x) = k(x) / \sqrt{\mu}$.

Note that this implies that the linear (graphon) integral operator, i.e., the integral operator whose kernel is given by the graphon, has rank 1.

$\lambda$-graphon. We consider a graphon with core-periphery structure

$$W(x, y) = (1 - \lambda) x y + \lambda,$$

modulated by a parameter $\lambda \in [0, 1]$. We call $W(x, y)$ the $\lambda$-graphon in the following. Note that the $\lambda$-graphon may be seen as a convex mixture of (a) a flat connectivity profile with probability $\lambda$, corresponding to random ER-like connectivity, and (b) a coordinate-dependent connectivity profile $xy$, which may be interpreted in terms of a continuous core-periphery structure. Indeed, the larger the coordinate of the node, the higher its connectivity, as computing the degree function of the $\lambda$-graphon confirms:

$$k(x) = \lambda + (1 - \lambda) x / 2.$$

The edge density of the $\lambda$-graphon is $\mu = (1 - \lambda)/4 + \lambda$ and, accordingly, the modularity surface is

$$B(x, y) = \frac{(1 - \lambda) \lambda (2x - 1)(2y - 1)}{1 + 3\lambda}.$$

Note that when $\lambda = 0$ or $\lambda = 1$ we have $B(x, y) = 0$ and we cannot detect communities for these cases, as discussed above. For $\lambda = 1$, the $\lambda$-graphon becomes the constant graphon $W(x, y) = 1$, which clearly does not have a community structure as all nodes are equivalent. For $\lambda = 0$ we obtain $W(x, y) = xy$, which does not have a community structure because of its multiplicative structure.
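As a check on the quantities stated for the $\lambda$-graphon, the modularity surface can be derived step by step from the definitions $k(x) = \int_0^1 W(x, y)\, dy$, $\mu = \int_0^1 k(x)\, dx$, and $P(x, y) = k(x) k(y)/\mu$ used in the propositions above. The intermediate expansion is a worked reconstruction, not text taken from the paper.

```latex
\begin{align*}
k(x) &= \int_0^1 \bigl[(1-\lambda)xy + \lambda\bigr]\,dy
      = \lambda + \tfrac{1}{2}(1-\lambda)\,x, \\
\mu  &= \int_0^1 k(x)\,dx
      = \lambda + \tfrac{1}{4}(1-\lambda)
      = \tfrac{1}{4}(1+3\lambda), \\
P(x,y) &= \frac{k(x)\,k(y)}{\mu}
      = \frac{4\lambda^2 + 2\lambda(1-\lambda)(x+y) + (1-\lambda)^2 xy}{1+3\lambda}, \\
B(x,y) &= W(x,y) - P(x,y)
      = \frac{\lambda(1-\lambda)\,(4xy - 2x - 2y + 1)}{1+3\lambda}
      = \frac{\lambda(1-\lambda)\,(2x-1)(2y-1)}{1+3\lambda}.
\end{align*}
```

The factor $(2x-1)(2y-1)$ is positive exactly when $x$ and $y$ lie on the same side of $1/2$, which already suggests a split of the unit interval at $x = 1/2$ whenever $\lambda \in (0, 1)$.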
Numerically optimising the graphon modularity with the GenLouvain approach for $\lambda = 0$ and $\lambda = 1$ indeed yields a single community with $Q = 0$.

Let us now focus on scenarios for which $\lambda \in (0, 1)$. In Fig. 3 we show the degree function $k(x)$, the modularity surface $B(x, y)$, and the detected community structure $g(x)$ for $\lambda = 0.2$ as an example. Optimising modularity with the GenLouvain approach for general $\lambda \in (0, 1)$ yields two communities with the border at $x_1 = 1/2$.

Figure 3. Graphon modularity maximization for a graphon with core-periphery structure. In a core-periphery $\lambda$-graphon (see text) with $\lambda = 0.2$ we detect two communities. The panels show the graphon $W(x, y)$, the degree $k(x)$, the modularity surface $B(x, y)$, and the community structure $g(x)$ returned by the GenLouvain algorithm, respectively. The sliced-modularity approach returns similar results for the community structure.

To confirm these numerical results, let us use the sliced-modularity approach to analytically compute the optimal community structure for two continuous groups. First we compute the modularity slice:

$$L(a, b) = \frac{(a - b)^2 (-1 + a + b)^2 (1 - \lambda) \lambda}{3\lambda + 1}. \tag{5.1}$$

To obtain the optimal border $x_1$ between the two communities we maximise

$$L(0, x_1; \lambda) + L(x_1, 1; \lambda) = \frac{2 (1 - \lambda) \lambda}{1 + 3\lambda} (x_1 - 1)^2 x_1^2 = \kappa\, (x_1 - 1)^2 x_1^2. \tag{5.2}$$

As $\lambda \in (0, 1)$, the constant $\kappa = 2(1 - \lambda)\lambda / (1 + 3\lambda)$ is greater than zero, and maximising the modularity is therefore equivalent to maximising $h(x_1) = (x_1 - 1)^2 x_1^2$. The local extrema of $h(x_1)$ with $dh/dx_1 = 0$ are $0$, $1/2$, and $1$. Evaluating the second derivative reveals that $0$ and $1$ are local minima and that $x_1 = 1/2$ is a maximum. For all $\lambda \in (0, 1)$ the optimal modular structure thus indeed consists of two equal-sized communities with the border at $x_1 = 1/2$, confirming the numerical results from the GenLouvain approach.

Using the above information about the optimal split for $\lambda \in (0, 1)$, we can compute the maximum modularity as a function of the $\lambda$ parameter:

$$Q_{\max}(\lambda) = \frac{L(0, 1/2; \lambda) + L(1/2, 1; \lambda)}{\mu(\lambda)} = \frac{\lambda (1 - \lambda)}{2 (3\lambda + 1)^2}, \tag{5.3}$$

which has its maximum $Q_{\max} = 1/32 = 0.03125$ at $\lambda = 1/5$. The example shown in Fig. 3 is therefore the $\lambda$-graphon with the largest possible modularity score. Note that as $\lim_{\lambda \to 0} Q_{\max}(\lambda) = 0$ and $\lim_{\lambda \to 1} Q_{\max}(\lambda) = 0$, the maximum modularity $Q_{\max}$ vanishes when the $\lambda$-graphons approach the boundary cases without community structure. In this case the optimal value of modularity is thus continuous in $\lambda$.

max graphon. Synthetic models for growing networks have been widely used to model phenomena observed in real-world networks, such as long-tailed degree distributions [59]. These models typically consist of an iterative procedure that adds nodes and edges sequentially until a certain number of nodes is reached.

Some graphons can be seen as a limit of such a network growing process [60]. The sequence of growing uniform attachment graphs we discuss next is a particular example of a growth process that exhibits such a convergence. We start with a graph $G_1$ consisting of a single node and no edges. For $n \geq 2$, we then construct $G_n$ from $G_{n-1}$ by adding a new vertex and adding every possible edge that is not already present in the network with probability $1/n$. It can be shown [60] that this graph sequence almost surely converges to the max graphon displayed in Fig. 4, which is defined as

$$W(x, y) = 1 - \max(x, y). \tag{5.4}$$

Computing the degree function of the max graphon as $k(x) = (1 - x^2)/2$, we obtain a modularity surface of

$$B(x, y) = 1 - \max(x, y) - \frac{3}{4} (x^2 - 1)(y^2 - 1). \tag{5.5}$$

Optimising this function with the GenLouvain approach yields a partition of the unit interval into three sets. For the sliced-modularity approach with $c = 3$ we obtain a modularity slice $L(a, b)$ that is a polynomial of order six in $a$ and $b$. To estimate the maxima of the sixth-order polynomial in terms of the partition endpoints $x_1$, $x_2$ we use Mathematica's Brent–Dekker method [61] and obtain $x_1 \approx 0.369$ and $x_2 \approx 0.463$ (see online material). With higher computational effort, it is also possible to obtain the optimal community borders with the sliced-modularity approach for a larger number $c$ of communities. For $c = 5$, for example, we obtain $x_1 = 0$ and $x_4 = 1$ together with two interior borders $x_2$ and $x_3$, which represents a community structure with two communities of vanishing size, yielding de facto the same community structure as with $c = 3$.

These results are well in line with the values obtained with GenLouvain, given the discretisation of the graphon for the GenLouvain approach and possible numerical inaccuracies encountered when finding the maxima using the sliced-modularity method.
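The sliced-modularity computation for the max graphon with $c = 3$ can be reproduced approximately without a computer algebra system. The Python sketch below is an illustration under stated assumptions: the closed-form slice was derived here from $W(x, y) = 1 - \max(x, y)$, $k(x) = (1 - x^2)/2$ and $\mu = 1/3$, and a coarse-to-fine grid search stands in for the Brent–Dekker routine used in the text. All function names are hypothetical.

```python
def slice_max_graphon(a, b):
    """Closed-form modularity slice L(a, b) for W(x, y) = 1 - max(x, y)."""
    d = b - a
    w_part = d * d * (1.0 - a - 2.0 * d / 3.0)   # integral of W over [a, b]^2
    u = d - (b ** 3 - a ** 3) / 3.0              # integral of (1 - x^2) over [a, b]
    return w_part - 0.75 * u * u                 # subtract the null-model term


def q_three(x1, x2, mu=1.0 / 3.0):
    """Graphon-modularity for the intervals [0, x1), [x1, x2), [x2, 1]."""
    slices = (slice_max_graphon(0.0, x1),
              slice_max_graphon(x1, x2),
              slice_max_graphon(x2, 1.0))
    return sum(slices) / mu


def best_borders(coarse=100, fine=1000):
    """Coarse grid scan over x1 < x2, then a local refinement on a finer grid."""
    _, i, j = max((q_three(i / coarse, j / coarse), i, j)
                  for i in range(1, coarse)
                  for j in range(i + 1, coarse))
    r = fine // coarse
    ci, cj = i * r, j * r
    q, i, j = max((q_three(i / fine, j / fine), i, j)
                  for i in range(max(1, ci - r), ci + r + 1)
                  for j in range(max(i + 1, cj - r), min(fine - 1, cj + r) + 1))
    return i / fine, j / fine, q


x1, x2, q = best_borders()
print(x1, x2, q)
```

The objective is quite flat around its maximum, so a grid search resolves the borders only to about the grid spacing; a proper optimiser would be needed to match the reported values to more digits.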
6. Modularity of graphs sampled from unstructured and structured graphons.
Figure 4.
Graphon modularity maximisation for a graphon with uniform attachment.
For the max graphon discussed in the text, we obtain a partition consisting of three communities. The panels show the graphon $W(x, y)$, the degree $k(x)$, the modularity surface $B(x, y)$, and the community structure $g(x)$ returned by the GenLouvain algorithm, respectively. The sliced-modularity approach returns similar results for the community structure.
In a number of benchmark tests for community detection on graphs, the characterisation of a graphon with no community structure in Subsection 5.1 is implicitly assumed to hold also for finite graphs sampled from a corresponding graphon model. For instance, in [3, 62], it is advocated that for graphs sampled from an ER model, a community detection algorithm should return a null result that indicates that there are no communities present. While the idea is intuitively appealing, there are some issues when adopting this viewpoint in the finite regime.

In particular, a finite network sampled from a sparse ER model may not necessarily be representative of the underlying ER model, i.e., the random samples may not be concentrated around the expected featureless ER graphon [63, 64]. This is in accordance with earlier results that state that samples drawn from an ER model can have a weak community structure arising from statistical fluctuations [65]. In particular, in [65] the authors showed that for large networks the maximum modularity approaches $Q \sim (pN)^{-2/3}$ (for a constant $p$, independent of $N$). Therefore the modularity vanishes for $N \to \infty$, which matches the infinite size limit that graphons represent.

The planted partition (PP) model is a prominent random graph model with group structure, which can be described as a graphon. For a $c$-block PP model each node $x$ belongs to exactly one group $g^\star(x)$, where $g^\star : [0, 1] \to \{1, \dots, c\}$ is the group assignment function. The $c$-block PP graphon can then be written as

$$W(x, y) = \begin{cases} p_{\rm in} & \text{if } g^\star(x) = g^\star(y), \\ p_{\rm ex} & \text{otherwise}, \end{cases} \tag{6.1}$$

which means that nodes have an internal connection probability $p_{\rm in} \in [0, 1]$ if they are in the same group, and an external connection probability $p_{\rm ex} \in [0, 1]$ otherwise. To obtain an assortative community structure we assume $p_{\rm ex} < p_{\rm in}$.

For the PP graphon, we can compute the degree as

$$k_{\rm PP}(x) = \frac{p_{\rm in} + (c - 1) p_{\rm ex}}{c}, \tag{6.2}$$

and the total connectivity as

$$\mu_{\rm PP} = \frac{p_{\rm in} + (c - 1) p_{\rm ex}}{c}. \tag{6.3}$$

Accordingly, the modularity surface is

$$B_{\rm PP}(x, y) = \begin{cases} \dfrac{c - 1}{c} (p_{\rm in} - p_{\rm ex}) & \text{if } g^\star(x) = g^\star(y), \\[2mm] \dfrac{p_{\rm ex} - p_{\rm in}}{c} & \text{otherwise}, \end{cases} \tag{6.4}$$

which is positive for nodes belonging to the same block and negative across blocks. Accordingly, we maximise the graphon-modularity if we choose the community structure $g(x)$ equal to the planted one $g^\star(x)$ and obtain a maximum modularity of

$$Q_{\max} = \frac{1}{\mu} \frac{c - 1}{c^2} (p_{\rm in} - p_{\rm ex}) = \frac{c - 1}{c} \, \frac{p_{\rm in} - p_{\rm ex}}{p_{\rm in} + (c - 1) p_{\rm ex}}. \tag{6.5}$$

Thus for all $c > 1$ and $p_{\rm ex} < p_{\rm in}$, the graphon-modularity is positive, as desired for a model with community structure. In Fig. 5, we show an example PP graphon, for which we indeed perfectly recover the planted partition $g^\star(x)$.

Figure 5. Graphon modularity maximization for planted partition graphons. In a planted partition graphon with $c = 3$ blocks we recover the planted community structure. Each community has an internal connection probability $p_{\rm in}$ and the connection probability between communities is $p_{\rm ex}$. The degree function is constant, $k(x) = p_{\rm in}/3 + 2 p_{\rm ex}/3$. The modularity surface $B(x, y)$ is positive in the communities and negative between communities. The community function $g(x)$ indicates that we correctly identify the three planted communities.

We now consider the numerical performance of modularity optimization for a PP model with $c = 3$ groups in three scenarios: (i) if we had access to the correct graphon, (ii) for a graph sampled from such a graphon, and (iii) for a graphon inferred from such a sampled graph.

We start by considering the simple baseline case, in which we have access to an appropriately discretized version of the true graphon. In this case, we can simply use the GenLouvain algorithm, or any other approach, and recover the $c = 3$ planted communities, as long as $p_{\rm ex} < p_{\rm in}$ (results not shown).

Next we consider modularity optimization for graphs sampled from such a planted partition model graphon, and for the graphons inferred from such samples. We construct planted partition graphons with $c = 3$ planted partitions and internal connection probability $p_{\rm in} = 0.05$, with varying external link probability $p_{\rm ex}$. We sample graphs of different sizes $N$ from this graphon as follows. First, we create $N$ nodes with associated uniformly spaced coordinates in $[0, 1]$, where $x_i = (i - 1)/(N - 1)$ is the position of node $i$. We then draw edges between all (unordered) pairs $(x_i, x_j)$ of nodes with probability $W(x_i, x_j)$ to obtain the symmetric adjacency matrix of an undirected graph. Finally, we estimate the graphon from the sampled graph with a matrix-completion approach, which we choose because it is fast and does not have free hyperparameters [66].

Figure 6. Numerical comparison of graphon modularity and graph-based modularity optimization. We measure the alignment of the planted partition with the community structure detected in graphs sampled from the graphon (dotted lines) and in graphons estimated from the sampled graphs (solid lines) as the adjusted mutual information (AMI). The larger the size $N$ of the sampled networks, the better the recovery of the planted partition. In all cases, there exists a regime for which the graphon estimation improves the detected community structure. Here we fix $p_{\rm in} = 0.05$.

We use GenLouvain to optimise the modularity function of the sampled graphs and the inferred graphons and compare the detected partition with the planted one by computing the adjusted mutual information (AMI) [67]. A maximal value of AMI = 1 indicates that two partitions are identical and a minimal value of AMI = 0 indicates that two partitions do not provide more information about each other than expected by random chance.

The results of our numerical comparison are shown in Fig. 6. The dotted lines correspond to the results of modularity optimization on the sampled graphs; the solid lines show the results when we first infer a graphon and then apply modularity optimization to the inferred graphon.

We find that for modularity optimization both on the sampled graph and on the inferred graphon, for small external connection probabilities $p_{\rm ex}$ the AMI is one and thus we recover the planted partitions perfectly. For larger connection probabilities $p_{\rm ex}$ the AMI decreases, which indicates that we do not fully recover the planted partitions. We further see that if we increase the number of sampled nodes, the results obtained from modularity optimization improve, and we are able to find the correct partition even for larger values of $p_{\rm ex}$ (i.e., for smaller differences $p_{\rm in} - p_{\rm ex}$), corresponding to the fact that the sampled graphs converge to the graphon, for which the detection of the planted structures is always possible if $p_{\rm in} > p_{\rm ex}$.
Indeed, it is known that for dense graphs described by graphons the accurate detection of planted communities is a problem that can be efficiently solved by many algorithms, in contrast to the case of sparse graphs, for which a detectability limit exists [68].

We find that the AMI curves for the estimated graphons follow a behaviour similar to the AMI curves for modularity optimization on the sampled graphs. Interestingly, however, for all considered graph sizes $N$ there exists a range of connection probabilities $p_{\rm ex}$ for which modularity optimization on the estimated graphon yields a better performance. This indicates that graphon estimation can improve the performance of modularity optimization for recovering planted partitions in graphs by smoothing fluctuations in the observed connectivity structure.
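The sampling procedure described in this section, together with the closed-form maximum modularity of Eq. (6.5), can be sketched in plain Python. This is an illustrative sketch and not the authors' code: the function names, the choice of equal-sized blocks, the random seed, and the example parameters are assumptions made here.

```python
import random


def planted_group(x, c):
    """Group index in {0, ..., c-1} of coordinate x for c equal-sized blocks."""
    return min(int(x * c), c - 1)


def pp_graphon(x, y, c, p_in, p_ex):
    """Planted partition graphon W(x, y) as in Eq. (6.1)."""
    return p_in if planted_group(x, c) == planted_group(y, c) else p_ex


def sample_graph(n, c, p_in, p_ex, seed=0):
    """Sample an undirected graph on n uniformly spaced nodes of [0, 1]."""
    rng = random.Random(seed)
    xs = [i / (n - 1) for i in range(n)]       # x_i = (i - 1)/(N - 1), i = 1..N
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):              # each unordered pair once
            if rng.random() < pp_graphon(xs[i], xs[j], c, p_in, p_ex):
                A[i][j] = A[j][i] = 1          # symmetric adjacency matrix
    return xs, A


def q_max(c, p_in, p_ex):
    """Maximum graphon-modularity of the PP graphon, Eq. (6.5)."""
    return (c - 1) / c * (p_in - p_ex) / (p_in + (c - 1) * p_ex)


xs, A = sample_graph(60, 3, 0.8, 0.2, seed=1)
print(sum(map(sum, A)) // 2, q_max(3, 0.8, 0.2))
```

The sampled adjacency matrix `A` can then be handed to any modularity optimiser, or to a graphon estimator first, to mimic scenarios (ii) and (iii) described above.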
7. Modularity optimization and community structure for graphons estimated from empirical data.
Real-world network data can be large but is always of finite size. We thus cannot observe graphons directly but rather finite graphs consisting of discrete nodes and edges. Accordingly, we need to estimate graphons from finite observations. Many different methods have been proposed for this estimation task [29, 30, 33, 35]. Most of these graphon estimators are consistent, i.e., the estimation error vanishes as the number of nodes $N \to \infty$. However, different graphon estimators may estimate different graphons for the same finite graph, and identifying the most appropriate estimator for a certain data set is an open research question [69].

In the following, we employ three prototypical methods for graphon estimation from empirically observed graphs: (i) a sorting-and-smoothing algorithm, which is a consistent histogram estimator [35], (ii) a matrix-completion approach [66], and (iii) universal singular value thresholding (USVT) [31]. We choose these three methods because they do not have hyperparameters that have to be chosen by the user. Using the estimated graphons we then employ modularity maximisation using the GenLouvain approach to obtain a simplified picture of the graphons in terms of community structure. We find that the graphon-estimation approach can have a strong influence on the community structure detected by graphon-modularity maximisation.

To quantify the extent to which the community structure $g_{\rm graphon}(x)$ obtained from the estimated graphon resembles the community structure $g_{\rm network}(x)$ obtained from the graph itself, we compute the AMI between both for six empirical networks (see Table 1). It is important to note that there is no ground truth in this setting [70], i.e., the communities obtained from direct modularity optimization on the observed graph are but one possible clustering of the network, similar to the clustering obtained from the graphon.
In fact, in some cases one may even argue that the graphon modularity results are less prone to random fluctuations, as the estimation strategies typically involve some kind of smoothing procedure (as observed for the PP graphon in subsection 6.2). We find that, for all networks, the choice of the graphon estimator has a strong influence on the community structure that we detect in the graphon. The sort-and-smooth estimator leads to a community structure that differs strongly from the one detected in the network itself, as indicated by small AMI values. Modularity maximization using an inferred graphon based on a matrix-completion approach yields partitions that are commensurate with the partitions found from direct modularity optimization on the graph (AMI > 0.4) for all data sets. The "best results" in this sense are provided by the USVT estimator. Interestingly, for the data sets analysed here, the USVT estimator always yields the highest AMI if it does not give a zero result. This zero AMI score occurs if the USVT estimator returns a constant graphon $W(x, y) = \text{const.}$, which means all partitions will have the same modularity score of 0 (and thus no partition will be detected). This suggests that the USVT algorithm is a good 'first choice' for the estimation step of graphons for modularity optimization. In the following we concentrate on the matrix completion and the USVT estimators, as the sort-and-smooth approach yields results that are largely incomparable to the other methods.

Our results indicate that if there is strong community structure, direct modularity optimization and graphon-based modularity optimization yield the same results. For instance, in the US-senate voting network [71], which has a strong community structure, we find that the modularity maximisation for graphon and network yield virtually the same partition. In cases when the community structure in the graphon differs from the one in the network, we usually detect fewer communities in the graphon because the graphon estimation smooths some of the connectivity signal. For the Facebook network, for instance, the AMI is below one. Our analysis indicates that a privacy-preserving community detection via graphon modularity is indeed possible, but the extent to which we lose information in comparison to the graph partition depends on the data set and the type of graphon-estimation algorithm. We postpone a more detailed investigation of this behaviour to future work.
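To make the idea of a histogram-type graphon estimator concrete, the following Python sketch orders the nodes by degree and averages the adjacency matrix over blocks of consecutive nodes. It is a deliberately stripped-down illustration: it covers only the sorting and histogram steps, omits the smoothing step of the sort-and-smooth estimator [35], and is not the implementation used for the results above. The function name and the toy graph are assumptions.

```python
def sort_and_histogram(A, bins):
    """Block-averaged adjacency matrix after sorting the nodes by degree."""
    n = len(A)
    # order nodes by decreasing degree (ties keep their original order)
    order = sorted(range(n), key=lambda i: sum(A[i]), reverse=True)
    group = [r * bins // n for r in range(n)]   # bin of the node with rank r
    sums = [[0.0] * bins for _ in range(bins)]
    counts = [[0] * bins for _ in range(bins)]
    for r in range(n):
        for s in range(n):
            if r == s:
                continue                        # ignore the diagonal
            sums[group[r]][group[s]] += A[order[r]][order[s]]
            counts[group[r]][group[s]] += 1
    return [[sums[p][q] / counts[p][q] if counts[p][q] else 0.0
             for q in range(bins)] for p in range(bins)]


# toy example: two disconnected triangles on nodes {0, 1, 2} and {3, 4, 5}
toy = [[1 if i != j and (i < 3) == (j < 3) else 0 for j in range(6)]
       for i in range(6)]
H = sort_and_histogram(toy, 2)
print(H)
```

The resulting step function `H` is a discretised graphon estimate that can be passed to a modularity optimiser in the same way as a discretised synthetic graphon.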
8. Discussion.
In this manuscript, we considered the problem of modularity optimization from the perspective of graphons. We showed how a generalised modularity-maximisation algorithm for graphs can be used for modularity optimization on graphons after suitable discretization, and discussed how in certain cases analytical solutions for the modularity optimization problem for graphons can be obtained. For future research, it would be of interest to explore how far further insights into trace maximisation problems, such as modularity optimization, can be obtained by using such a perspective based on operators defined on a continuous domain. Interestingly, it has been shown recently that maximum-likelihood estimation of a stochastic block model (a problem closely related to modularity optimization [77, 78]) is equivalent to a discrete surface tension [79], thus providing a connection to partial differential equations and continuous problem formulations.

There are also other avenues to explore in the future. In this manuscript, we discussed graphon-modularity with the popular Newman–Girvan null model. Our framework does, however, also allow the use of other null models or a resolution parameter, which might reveal a hierarchical community structure in graphons.

Table 1. The agreement between partitions obtained from clustering a graph directly and clustering a graphon estimated from the graph depends strongly on the data set and the estimation algorithm used. We show the adjusted mutual information (AMI) between the partition $g_{\rm network}$ obtained from a graph and the partition $g_{\rm graphon}$ obtained from the graphons that we estimated from the graph, using three different approaches. We also provide a reference for the data and the number $N$ of nodes in each network. For each data set, we highlight the estimation method that yields the best result. While the matrix-completion method always yields decent results (AMI > 0.4), the USVT is better for some data sets. For all data sets, the sort-and-smooth algorithm yields small AMIs.

                                          AMI(g_network, g_graphon)
data set              reference   N     Matrix completion   USVT   Sort-and-smooth
Zachary Karate Club   [74]        34    …                   ~0     0.09
Senate voting         [71]        102   0.9286              …      …
…

One limitation of graphons is that they describe limits of dense graphs [17]. Many empirical networks, however, are sparse. Exchangeable random measures have been proposed as a way to construct sparse graph variants [80]. It would be relevant for applications to extend modularity-based approaches to also detect community structure in these objects.

From a privacy-preserving computing point of view, it would be interesting to explore to what extent a graphon description could be de-anonymised when obtaining information about the network from which it was estimated. Furthermore, an investigation of how far other graph measures that can be extended to graphons allow for privacy-preserving computation would be interesting.
9. Code availability.
Matlab and Mathematica code to implement the discussed methods and reproduce all figures is available under http://github.com/floklimm/graphon.
REFERENCES
[1] Mark E J Newman. Networks: An Introduction (2nd Edition). Oxford University Press, 2018.
[2] Santo Fortunato. Community detection in graphs. Physics Reports, 486(3):75–174, 2010.
[3] Santo Fortunato and Darko Hric. Community detection in networks: A user guide. Physics Reports, 659:1–44, 2016.
[4] Michael T Schaub, Jean-Charles Delvenne, Martin Rosvall, and Renaud Lambiotte. The many facets of community detection in complex networks. Applied Network Science, 2(1):4, 2017.
[5] Anna C F Lewis, Nick S Jones, Mason A Porter, and Charlotte M Deane. The function of communities in protein interaction networks at multiple scales. BMC Systems Biology, 4(1):100, 2010.
[6] Amanda L Traud, Peter J Mucha, and Mason A Porter. Social structure of Facebook networks. Physica A: Statistical Mechanics and its Applications, 391(16):4165–4180, 2012.
[7] Danielle S Bassett and Olaf Sporns. Network neuroscience. Nature Neuroscience, 20(3):353, 2017.
[8] Mason A Porter, Jukka-Pekka Onnela, and Peter J Mucha. Communities in networks. Notices of the AMS, 56(9):1082–1097, 1164–1166, 2009.
[9] Santo Fortunato and Marc Barthelemy. Resolution limit in community detection. Proceedings of the National Academy of Sciences of the United States of America, 104(1):36–41, 2007.
[10] Mark E J Newman and Michelle Girvan. Finding and evaluating community structure in networks. Physical Review E, 69(2):026113, 2004.
[11] László Lovász and Balázs Szegedy. Limits of dense graph sequences. Journal of Combinatorial Theory, Series B, 96(6):933–957, 2006.
[12] Abigail Z Jacobs and Aaron Clauset. A unified view of generative models for networks: models, methods, opportunities, and challenges. arXiv preprint arXiv:1411.4070, 2014.
[13] David J Aldous. Representations for partially exchangeable arrays of random variables. Journal of Multivariate Analysis, 11(4):581–598, 1981.
[14] Douglas N Hoover. Relations on probability spaces and arrays of random variables. Preprint, Institute for Advanced Study, Princeton, NJ, 2, 1979.
[15] Emmanuel Abbe. Community detection and stochastic block models: recent developments. The Journal of Machine Learning Research, 18(1):6446–6531, 2017.
[16] Avanti Athreya, Donniell E Fishkind, Minh Tang, Carey E Priebe, Youngser Park, Joshua T Vogelstein, Keith Levin, Vince Lyzinski, and Yichen Qin. Statistical inference on random dot product graphs: a survey. The Journal of Machine Learning Research, 18(1):8393–8484, 2017.
[17] Peter Orbanz and Daniel M Roy. Bayesian models of graphs, arrays and other exchangeable random structures. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(2):437–461, 2014.
[18] M Puck Rombach, Mason A Porter, James H Fowler, and Peter J Mucha. Core-periphery structure in networks. SIAM Journal on Applied Mathematics, 74(1):167–190, 2014.
[19] Marco Avella-Medina, Francesca Parise, Michael T Schaub, and Santiago Segarra. Centrality measures for graphons: Accounting for uncertainty in networks. IEEE Transactions on Network Science and Engineering, 2018.
[20] Patric Hagmann, Leila Cammoun, Xavier Gigandet, Reto Meuli, Christopher J Honey, Van J Wedeen, and Olaf Sporns. Mapping the structural core of human cerebral cortex. PLoS Biology, 6(7), 2008.
[21] Benjamin F Maier and Dirk Brockmann. Cover time for random walks on arbitrary complex networks. Physical Review E, 96(4):042307, 2017.
[22] Carter Jernigan and Behram F T Mistree. Gaydar: Facebook friendships expose sexual orientation. First Monday, 2009.
[23] Christian Borgs, Jennifer Chayes, and Adam Smith. Private graphon estimation for sparse graphs. In Advances in Neural Information Processing Systems, pages 1369–1377, 2015.
[24] Christian Borgs, Jennifer Chayes, Adam Smith, and Ilias Zadik. Revealing network structure, confidentially: Improved rates for node-private graphon estimation. In […], pages 533–543. IEEE, 2018.
[25] Shuang Gao and Peter E Caines. The control of arbitrary size networks of linear systems via graphon limits: An initial investigation. In […], pages 1052–1057. IEEE, 2017.
[26] Julien Petit, Renaud Lambiotte, and Timoteo Carletti. Random walks on dense graphs and graphons. arXiv preprint arXiv:1909.11776, 2019.
[27] R. Vizuete, P. Frasca, and F. Garin. Graphon-based sensitivity analysis of SIS epidemics. IEEE Control Systems Letters, 4(3):542–547, 2020.
[28] Matthew Coulson, Robert E Gaunt, and Gesine Reinert. Poisson approximation of subgraph counts in stochastic block models and a graphon model. ESAIM: Probability and Statistics, 20:131–142, 2016.
[29] Patrick J Wolfe and Sofia C Olhede. Nonparametric graphon estimation. arXiv preprint arXiv:1309.5936, 2013.
[30] Sofia C Olhede and Patrick J Wolfe. Network histograms and universality of blockmodel approximation. Proceedings of the National Academy of Sciences of the United States of America, 111(41):14722–14727, 2014.
[31] Sourav Chatterjee. Matrix estimation by universal singular value thresholding. The Annals of Statistics, 43(1):177–214, 2015.
[32] Edo M Airoldi, Thiago B Costa, and Stanley H Chan. Stochastic blockmodel approximation of a graphon: Theory and consistent estimation. In Advances in Neural Information Processing Systems, pages 692–700, 2013.
[33] Yuan Zhang, Elizaveta Levina, and Ji Zhu. Estimating network edge probabilities by neighbourhood smoothing. Biometrika, 104(4):771–783, 2017.
[34] Justin Yang, Christina Han, and Edoardo Airoldi. Nonparametric estimation and testing of exchangeable graph models. In Artificial Intelligence and Statistics, pages 1060–1067, 2014.
[35] Stanley Chan and Edoardo Airoldi. A consistent histogram estimator for exchangeable graph models. In International Conference on Machine Learning, pages 208–216, 2014.
[36] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008.
[37] Aditya Grover and Jure Leskovec. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, pages 855–864, 2016.
[38] Mark E J Newman and Tiago P Peixoto. Generalized communities in networks. Physical Review Letters, 115(8):088701, 2015.
[39] Peter D Hoff, Adrian E Raftery, and Mark S Handcock. Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460):1090–1098, 2002.
[40] Geir Agnarsson and Raymond Greenlaw. Graph Theory: Modeling, Applications, and Algorithms. Prentice-Hall, Inc., 2006.
[41] László Lovász and Balázs Szegedy. Limits of dense graph sequences. Journal of Combinatorial Theory, Series B, 96(6):933–957, 2006.
[42] Christian Borgs, Jennifer T Chayes, László Lovász, Vera T Sós, and Katalin Vesztergombi. Convergent sequences of dense graphs I: Subgraph frequencies, metric properties and testing. Advances in Mathematics, 219(6):1801–1851, 2008.
[43] Christian Borgs and Jennifer Chayes. Graphons: A nonparametric method to model, estimate, and design algorithms for massive networks. In Proceedings of the 2017 ACM Conference on Economics and Computation, pages 665–672. ACM, 2017.
[44] László Lovász. Large networks and graph limits, volume 60. American Mathematical Soc., 2012.
[45] Daniel Glasscock. What is a graphon? Notices of the AMS, 62(1), 2015.
[46] Zhao Yang, René Algesheimer, and Claudio J Tessone. A comparative analysis of community detection algorithms on artificial networks. Scientific Reports, 6:30750, 2016.
[47] Martin Rosvall, Jean-Charles Delvenne, Michael T Schaub, and Renaud Lambiotte. Different approaches to community detection. Advances in Network Clustering and Blockmodeling, pages 105–119, 2019.
[48] Mark E J Newman. Modularity and community structure in networks. Proceedings of the National Academy of Sciences of the United States of America, 103(23):8577–8582, 2006.
[49] Paul Expert, Tim S Evans, Vincent D Blondel, and Renaud Lambiotte. Uncovering space-independent communities in spatial networks. Proceedings of the National Academy of Sciences of the United States of America, 108(19):7663–7668, 2011.
[50] Marta Sarzynska, Elizabeth A Leicht, Gerardo Chowell, and Mason A Porter. Null models for community detection in spatially embedded, temporal networks. Journal of Complex Networks, 4(3):363–[…]
[51] […] arXiv preprint arXiv:1311.1924, 2013.
[52] Marya Bazzi, Mason A Porter, Stacy Williams, Mark McDonald, Daniel J Fenn, and Sam D Howison. Community detection in temporal multilayer networks, with an application to correlation networks. Multiscale Modeling & Simulation, 14(1):1–41, 2016.
[53] Fan Chung and Linyuan Lu. Connected components in random graphs with given expected degree sequences. Annals of Combinatorics, 6(2):125–145, 2002.
[54] Ulrik Brandes, Daniel Delling, Marco Gaertler, Robert Gorke, Martin Hoefer, Zoran Nikoloski, and Dorothea Wagner. On modularity clustering. IEEE Transactions on Knowledge and Data Engineering, 20(2):172–188, 2007.
[55] Lucas G. S. Jeub, Marya Bazzi, Inderjit S. Jutla, and Peter J. Mucha. A generalized Louvain method for community detection implemented in Matlab, Version 2.1, 2011-2014.
[56] Vincent A Traag, Ludo Waltman, and Nees Jan van Eck. From Louvain to Leiden: guaranteeing well-connected communities. Scientific Reports, 9(1):1–12, 2019.
[57] Klaus Stüben. A review of algebraic multigrid. In Numerical Analysis: Historical Developments in the 20th Century, pages 331–359. Elsevier, 2001.
[58] John A Nelder and Roger Mead. A simplex method for function minimization. The Computer Journal, 7(4):308–313, 1965.
[59] Réka Albert and Albert-László Barabási. Statistical mechanics of complex networks. Reviews of Modern Physics, 74(1):47, 2002.
[60] Christian Borgs, Jennifer Chayes, László Lovász, Vera Sós, and Katalin Vesztergombi. Limits of randomly grown graph sequences. European Journal of Combinatorics, 32(7):985–999, 2011.
[61] Richard P. Brent. An algorithm with guaranteed convergence for finding a zero of a function. The Computer Journal, 14(4):422–425, 1971.
[62] Andrea Lancichinetti and Santo Fortunato. Community detection algorithms: A comparative analysis. Physical Review E, 80:056117, Nov 2009.
[63] Can M Le, Elizaveta Levina, and Roman Vershynin. Concentration and regularization of random graphs. Random Structures & Algorithms, 51(3):538–561, 2017.
[64] Antony Joseph, Bin Yu, et al. Impact of regularization on spectral clustering. The Annals of Statistics, 44(4):1765–1791, 2016.
[65] Roger Guimera, Marta Sales-Pardo, and Luís A Nunes Amaral. Modularity from fluctuations in random graphs and complex networks.
Physical Review E , 70(2):025101, 2004.[66] Raghunandan H Keshavan, Andrea Montanari, and Sewoong Oh. Matrix completion from a fewentries.
IEEE Transactions on Information Theory , 56(6):2980–2998, 2010.[67] Nguyen Xuan Vinh, Julien Epps, and James Bailey. Information theoretic measures for clusteringscomparison: Variants, properties, normalization and correction for chance.
The Journal of Ma-chine Learning Research , 11:2837–2854, 2010.[68] Aurelien Decelle, Florent Krzakala, Cristopher Moore, and Lenka Zdeborov´a. Inference and phasetransitions in the detection of modules in sparse networks.
Physical Review Letters , 107(6):065701,2011.[69] Chao Gao, Yu Lu, Harrison H Zhou, et al. Rate-optimal graphon estimation.
The Annals of Statistics ,43(6):2624–2652, 2015.[70] Leto Peel, Daniel B Larremore, and Aaron Clauset. The ground truth about metadata and communitydetection in networks.
Science Advances , 3(5):e1602548, 2017.[71] Andrew Scott Waugh, Liuyi Pei, James H Fowler, Peter J Mucha, and Mason Alexander Porter. Partypolarization in congress: A network science approach. 2009.[72] Florian Klimm, Javier Borge-Holthoefer, Niels Wessel, J¨urgen Kurths, and Gorka Zamora-L´opez.Individual node’s contribution to the mesoscale of complex networks.
New Journal of Physics ,16(12):125006, 2014. [73] Florian Klimm, Danielle S Bassett, Jean M Carlson, and Peter J Mucha. Resolving structural vari-ability in network models and the brain.
PLoS Computational Biology , 10(3), 2014.[74] Wayne W Zachary. An information flow model for conflict and fission in small groups.
Journal ofAnthropological Research , 33(4):452–473, 1977.[75] Lada A Adamic and Natalie Glance. The political blogosphere and the 2004 us election: divided theyblog. In
Proceedings of the 3rd International Workshop on Link Discovery , pages 36–43, 2005.[76] Florian Klimm, Charlotte M Deane, and Gesine Reinert. Hypergraphs for predicting essential genesusing multiprotein complex data. bioRxiv , 2020.[77] A Roxana Pamfil, Sam D Howison, Renaud Lambiotte, and Mason A Porter. Relating modularitymaximization and stochastic block models in multilayer networks.
SIAM Journal on Mathematicsof Data Science , 1(4):667–698, 2019.[78] Mark E J Newman. Equivalence between modularity optimization and maximum likelihood methodsfor community detection.
Physical Review E , 94(5):052315, 2016.[79] Zachary M Boyd, Mason A Porter, and Andrea L Bertozzi. Stochastic block models are a discretesurface tension.
Journal of Nonlinear Science , pages 1–34, 2019.[80] Fran¸cois Caron and Emily B Fox. Sparse graphs using exchangeable random measures.
Journal ofthe Royal Statistical Society: Series B (Statistical Methodology) , 79(5):1295–1366, 2017.
Appendix A. Modularity slices for noncontiguously ordered communities.
In the main text, we introduce the sliced-modularity approach for contiguously ordered communities. Here, we demonstrate that it is possible to define modularity slices for noncontiguously ordered communities. This might be necessary because, although we can hope that methods for inferring graphons will assign close coordinates to nodes from the same community, they will not always assign all nodes in a community a contiguous set of coordinates. Instead, the nodes of one community may be cut into several contiguous subsets, and these subsets may be assigned noncontiguous locations in the graphon embedding (see Fig. 7 for an example).

To this end, we restrict the possible group assignment functions $g(x)$ to be piecewise constant on $s$ intervals:

$$
g(x) =
\begin{cases}
G(1) & \text{if } x \in [0, x_1), \\
G(2) & \text{if } x \in [x_1, x_2), \\
\;\;\vdots \\
G(s) & \text{if } x \in [x_{s-1}, 1],
\end{cases}
\tag{A.1}
$$

where the, possibly many-to-one, slice–community function $G : \{1, 2, \dots, s\} \to \{1, 2, \dots, c\}$ maps each slice to one of the $c$ communities with $c \leq s$. This allows noncontiguously ordered communities, for example with $s = 4$ slices but $c = 3$ communities, in which the first slice and the fourth slice are in the same community, such that $G(1) = G(4) \neq G(2) \neq G(3)$ (see Fig. 7). For $c = s$, each slice belongs to its own community and we recover the case of contiguously ordered communities, as discussed in the main text.

[Figure 7 (plot of $g(x)$, titled "Example of noncontiguously ordered community structure", with the borders $x_1$, $x_2$, and $x_3$ marked on the horizontal axis).]

Figure 7. The sliced-modularity approach can be extended to encompass noncontiguously ordered communities. In this example, there exist $s = 4$ slices, which are sorted into $c = 3$ communities such that $G(1) = G(4) = 1$, $G(2) = 2$, and $G(3) = 3$. Using the sliced-modularity approach for noncontiguously ordered communities, we can now optimise the boundaries $x_1$, $x_2$, and $x_3$ between the communities.

If the group assignment function $g(x)$ is of such a form, we can rewrite the graphon modularity function, with the convention $x_0 = 0$ and $x_s = 1$, as

$$
Q(g) = \frac{1}{2\mu} \sum_{i=1}^{s} \sum_{j=1}^{s} L^{(2)}(x_{i-1}, x_i, x_{j-1}, x_j)\, \delta[G(i), G(j)] ,
\tag{A.2}
$$

with

$$
L^{(2)}(a, b, c, d) = \int_a^b \int_c^d B(x, y)\, dx\, dy ,
\tag{A.3}
$$

where the modularity slice $L^{(2)}(a, b, c, d)$ is now a function of four interval points.

If we fix the slice–community function $G(s')$ a priori (i.e., we know the order of intervals that belong to the same community), we have to optimise the community borders $x_i$, which is an optimisation with $s - 1$ free variables. The problem is harder if $G(s')$ is unknown (i.e., we have to optimise over all possible slice–community functions $G(s')$ as well as the community borders $x_i$). For a small number $s$ of slices, we may test all possible partitions, but for a large number $s$ of slices this might become infeasible and heuristics would be required. This is especially a problem if the appropriate number $s$ of slices is unknown. This points to an algorithm that is initialised by assuming that all slices are associated with distinct communities and assumes that ss
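The sliced modularity in (A.2) and (A.3) can be evaluated numerically for a toy graphon. The following is a minimal sketch, not the authors' implementation. All names (`label`, `W`, `k`, `MU`, `B`, `slice_integral`, `slice_matrix`, `sliced_modularity`, `best_assignment`) are hypothetical. The example graphon is invented for illustration, and the kernel $B(x, y) = W(x, y) - k(x)k(y)/(2\mu)$ with $\mu = \tfrac{1}{2}\iint W$ is an assumption that mirrors the network definition; the definitions in the main text take precedence. The sketch fixes the borders, precomputes all slice integrals with a midpoint rule, and then tests every slice–community function by brute force, which is only practical for a small number $s$ of slices.

```python
from itertools import product

# Hypothetical example graphon: three planted communities, the first of which
# occupies two noncontiguous intervals, [0, 0.25) and [0.75, 1].
def label(x):
    if x < 0.25 or x >= 0.75:
        return 0
    return 1 if x < 0.5 else 2

def W(x, y):
    return 0.8 if label(x) == label(y) else 0.2

# Degree function k(x) = int_0^1 W(x, y) dy, in closed form for this W.
def k(x):
    return 0.5 if label(x) == 0 else 0.35

# Assumption: mu = (1/2) * double integral of W, which is 0.425 for this W.
MU = 0.5 * 0.425

def B(x, y):
    # Assumed modularity kernel, mirroring the network case.
    return W(x, y) - k(x) * k(y) / (2.0 * MU)

def slice_integral(f, a, b, c, d, n=20):
    """Midpoint-rule estimate of the modularity slice L2(a, b, c, d) in (A.3)."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for p in range(n):
        x = a + (p + 0.5) * hx
        for q in range(n):
            total += f(x, c + (q + 0.5) * hy)
    return total * hx * hy

def slice_matrix(f, borders):
    """All s*s slice integrals for interior borders x_1 < ... < x_{s-1}."""
    xs = [0.0] + list(borders) + [1.0]  # convention x_0 = 0, x_s = 1
    s = len(xs) - 1
    return [[slice_integral(f, xs[i], xs[i + 1], xs[j], xs[j + 1])
             for j in range(s)] for i in range(s)]

def sliced_modularity(L, G, mu):
    """Q(g) in (A.2): sum the slices over pairs assigned to the same community."""
    s = len(G)
    total = sum(L[i][j] for i in range(s) for j in range(s) if G[i] == G[j])
    return total / (2.0 * mu)

def best_assignment(L, mu):
    """Brute-force search over slice-community functions G (small s only)."""
    s = len(L)
    best_G, best_Q = None, float("-inf")
    for G in product(range(s), repeat=s):
        # Skip relabelled duplicates: keep only restricted-growth labellings.
        if any(G[i] > max(G[:i], default=-1) + 1 for i in range(s)):
            continue
        Q = sliced_modularity(L, G, mu)
        if Q > best_Q:
            best_G, best_Q = G, Q
    return best_G, best_Q

borders = [0.25, 0.5, 0.75]
L = slice_matrix(B, borders)
Q_contiguous = sliced_modularity(L, (0, 1, 2, 3), MU)
Q_noncontiguous = sliced_modularity(L, (0, 1, 2, 0), MU)
G_best, Q_best = best_assignment(L, MU)
print(Q_contiguous, Q_noncontiguous, G_best)
```

For this toy graphon, assigning the first and fourth slice to the same community gives a larger modularity than treating all four slices as separate communities. Precomputing the slice matrix once keeps the search over $G$ cheap when the borders are fixed; if the borders $x_i$ are also unknown, the matrix has to be recomputed for every candidate set of borders.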