Some remarks on modularity density
Alberto Costa
Abstract
A “quantitative function” for community detection called modularity density has been proposed by Li, Zhang, Wang, Zhang, and Chen in [Phys. Rev. E, 036109 (2008)]. We study the modularity density maximization problem and we discuss some features of the optimal solution. More precisely, we show that in the optimal solution there can be communities having negative modularity density, and we propose a modification of the original formulation to overcome this issue. Moreover, we show that a clique can be divided into two or more parts when maximizing the modularity density. We also compare the solution found by maximizing the modularity density with that obtained by maximizing the modularity on the Zachary karate club network.

Keywords clustering · community detection · complex networks · modularity density maximization

Financial support by SUTD-MIT International Design Center under grant IDG21300102.
A. Costa, Singapore University of Technology and Design

Networks, or graphs, are often used to describe complex systems, and they find application in many fields, e.g., biology and bioinformatics [14,18], recommender systems [1], and social networks [12]. One of the topics related to networks which has been studied extensively in recent years is community detection: given a network $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges, one wants to find subsets of $V$ (called clusters, or communities, or modules) whose vertices are more connected with vertices in the same community than with vertices in other communities. Hence, a partition is obtained by splitting $V$ into $m$ communities $\{V_1, \dots, V_m\}$ which cover $V$. In general, these communities are non-empty, non-overlapping, and their number $m$ is not known a priori.

There are many ways to define a community. For example, one may specify some rules that each community must respect [3,4,19]. Another approach is to use some heuristics (see for example [5,12]).
Alternatively, one could specify an objective function to maximize or minimize. Concerning the latter, probably the most famous of such functions is modularity, which represents the fraction of edges within communities minus the expected fraction of such edges in a random network with the same degree distribution [12,17]. More precisely, using the notation of [15], modularity is defined as follows:

\[
Q = \sum_{i=1}^{m} \left[ \frac{L(V_i, V_i)}{L(V, V)} - \left( \frac{L(V_i, V)}{L(V, V)} \right)^{2} \right], \tag{1}
\]

where $L(V_i, V_i)$ is twice the number of edges in the community $V_i$, $L(V, V)$ is twice the number of edges of $G$ (i.e., $2|E|$), and $L(V_i, V)$ is equal to the sum of degrees of vertices belonging to the community $V_i$. Notice that, in order to find a good quality partition, modularity should be maximized.

Although modularity is widely used, it presents some issues, such as degeneracy and resolution limit [11,13]. Degeneracy is related to the possible presence of several high modularity partitions, which makes it difficult to find the global optimum. Resolution limit refers to the sensitivity of modularity to the total number of edges in the network, hence small communities may not be identified and remain hidden inside larger ones. To overcome the resolution limit of modularity, a measure called modularity density has been proposed by Li, Zhang, Wang, Zhang, and Chen in [15]. More precisely, modularity density is defined as:

\[
D = \sum_{i=1}^{m} d(G_i) = \sum_{i=1}^{m} \left[ \frac{L(V_i, V_i) - L(V_i, \bar{V}_i)}{|V_i|} \right], \tag{2}
\]

where $d(G_i)$ is the modularity density associated with the community $V_i$, $L(V_i, \bar{V}_i)$ is the number of edges joining a vertex in $V_i$ to a vertex belonging to another community, and $|V_i|$ is the number of vertices belonging to $V_i$.

This paper is organized as follows: in Section 2 we discuss some properties of the modularity density.
In particular, in Section 2.1 we show that in the optimal solution there can be communities having a negative modularity density value, and we propose a constraint to overcome this issue. We also show how this constraint can help to derive a mixed integer linear programming reformulation of the problem, and we point out the relationship between this constraint and the weak definition of Radicchi et al. [19]. In Section 2.2 we show that a clique can be split in the optimal solution. After that, in Section 3 we comment on some wrong and inaccurate statements of [15]. Finally, in Section 4 we present the conclusions.

We discuss in the following some features of modularity density.

2.1 Lower bound for modularity density of a community

As for modularity, one should maximize the modularity density to find a good quality partition. In fact, Li et al. [15] state that “clearly the maximum D value is often achieved when the network is correctly partitioned”. Intuitively, the modularity density of each community should assume a high value, but there are cases where some communities have a negative modularity density value in the optimal solution. To show this, consider the network with 31 vertices of Fig. 1: it consists of 7 cliques, each of them having 4 vertices (square shape), connected to a smaller clique with 3 vertices (circle shape).

Fig. 1 Example of a network for which the optimal solution contains a community with negative modularity density (color online).

The optimal solution we found by solving the modularity density maximization problem using the exact formulations presented in [8] is a partition into 8 communities: 7 communities correspond to the 7 cliques having size 4, whereas the other community corresponds to the smaller clique with 3 vertices. It can be easily checked that the modularity density $D$ of this partition is 18.9167. More precisely, the modularity density value associated with the community formed by the smaller clique is negative, whereas it is positive for each of the other 7 communities. Hence, we cannot assume that in the optimal solution each community has a non-negative modularity density value. Notice that this property is strictly related to the weak definition suggested by Radicchi et al. [19]. We now discuss this point in more detail.

Let $k_i^{in}$ be the number of edges connecting the vertex $v_i$ to other vertices in the same community, and $k_i^{out}$ be the number of edges connecting the vertex $v_i$ to vertices belonging to other communities (hence, the degree of the vertex $v_i$ is $k_i = k_i^{in} + k_i^{out}$). According to Radicchi et al. [19], a subgraph $V_l$ of $V$ is a community in the weak sense if:

\[
\sum_{v_i \in V_l} k_i^{in} > \sum_{v_i \in V_l} k_i^{out}, \tag{3}
\]

which implies that twice the number of edges inside a community is strictly greater than the number of edges connecting a vertex of the community to a vertex in another community (cut edges). Let $x_{il}$ be a binary variable equal to 1 if the vertex $v_i$ is inside the community $l$, and 0 otherwise, and let $a_{ij}$ be an element of the adjacency matrix of the graph $G$ (i.e., $a_{ij}$ is equal to 1 if and only if there is an edge connecting $v_i$ to $v_j$). As shown in [4], the weak condition (3) is equivalent to:

\[
4 \sum_{\{v_i, v_j\} \in E} x_{il} x_{jl} \geq \sum_{v_i \in V} k_i x_{il} + 1. \tag{4}
\]
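The quantities introduced so far can be checked numerically. The following Python sketch evaluates the terms $d(G_i)$ of Eq. (2) and the weak condition in the form (4) on a network built like the one described for Fig. 1. The exact wiring of the figure is not given in the text, so the single edge joining each 4-clique to the 3-clique, the vertex labels, and the helper names (`clique_edges`, `density_terms`, `satisfies_weak`) are assumptions made for illustration; they are chosen because they reproduce the value $D = 18.9167$ reported above.

```python
from itertools import combinations


def clique_edges(vertices):
    """All edges of a clique on the given vertices."""
    return list(combinations(vertices, 2))


def density_terms(edges, partition):
    """Per-community terms d(G_i) = (L(V_i, V_i) - L(V_i, complement)) / |V_i| of Eq. (2)."""
    terms = []
    for community in partition:
        members = set(community)
        inside = sum(1 for u, v in edges if u in members and v in members)
        cut = sum(1 for u, v in edges if (u in members) != (v in members))
        terms.append((2 * inside - cut) / len(members))
    return terms


def satisfies_weak(edges, community, slack=1):
    """Condition (4): 4 * (edges inside) >= (sum of degrees) + slack."""
    members = set(community)
    inside = sum(1 for u, v in edges if u in members and v in members)
    degree_sum = sum((u in members) + (v in members) for u, v in edges)
    return 4 * inside >= degree_sum + slack


# Assumed reconstruction of Fig. 1: a triangle {1, 2, 3} plus seven 4-cliques,
# each joined to one triangle vertex by a single edge (hypothetical wiring).
triangle = [1, 2, 3]
cliques = [list(range(4 * i, 4 * i + 4)) for i in range(1, 8)]
anchors = [1, 1, 2, 2, 3, 3, 3]  # triangle vertex each 4-clique attaches to
edges = clique_edges(triangle)
for clique, anchor in zip(cliques, anchors):
    edges += clique_edges(clique) + [(anchor, clique[0])]

# Partition into the 8 cliques, and partition where the triangle joins
# one of the 4-cliques attached to vertex 3.
eight = cliques + [triangle]
seven = cliques[:-1] + [cliques[-1] + triangle]

terms_eight = density_terms(edges, eight)
terms_seven = density_terms(edges, seven)
print("D with 8 communities:", round(sum(terms_eight), 4))
print("smallest d(G_i):", round(min(terms_eight), 4))
print("D with 7 communities:", round(sum(terms_seven), 4))
print("weak condition per community:", [satisfies_weak(edges, c) for c in eight])
```

Under these assumptions the triangle is the only community with a negative term, and it is also the only one that violates the weak condition, with either `slack=1` or `slack=0`.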
According to Appendix A of [15], modularity density can be expressed as follows:

\[
D = \sum_{l=1}^{m} \frac{\sum_{v_i \in V} \sum_{v_j \in V} a_{ij} x_{il} x_{jl} - \sum_{v_i \in V} \sum_{v_j \in V} a_{ij} x_{il} (1 - x_{jl})}{\sum_{v_i \in V} x_{il}}, \tag{5}
\]

which can be rewritten as:

\[
D = \sum_{l=1}^{m} \frac{4 \sum_{\{v_i, v_j\} \in E} x_{il} x_{jl} - \sum_{v_i \in V} k_i x_{il}}{\sum_{v_i \in V} x_{il}}. \tag{6}
\]

Comparing (4) and (6) it appears that the weak definition is respected if, for each community, the corresponding value of modularity density is strictly positive. Therefore, one could adjoin to the modularity density formulation the constraint (4) without the +1 on the right-hand side to ensure that each community has a non-negative modularity density value, or the constraint (4) to ensure that the partition found is compatible with the weak definition of [19]. The latter has been studied in [4] for modularity maximization. Let $M = \{1, \dots, m\}$ be the set of the indices of the communities. The binary non-linear formulation which includes the weak constraint can be written as:

\[
\max \sum_{l \in M} \frac{4 \sum_{\{v_i, v_j\} \in E} x_{il} x_{jl} - \sum_{v_i \in V} k_i x_{il}}{\sum_{v_i \in V} x_{il}} \tag{7}
\]
\[
\text{s.t.} \quad \forall l \in M \quad 1 \leq \sum_{v_i \in V} x_{il} \leq |V| - 1 \tag{8}
\]
\[
\forall v_i \in V \quad \sum_{l \in M} x_{il} = 1 \tag{9}
\]
\[
\forall l \in M \quad 4 \sum_{\{v_i, v_j\} \in E} x_{il} x_{jl} \geq \sum_{v_i \in V} k_i x_{il} + L \tag{10}
\]
\[
\forall l \in M, \forall v_i \in V \quad x_{il} \in \{0, 1\}, \tag{11}
\]

where (8) ensures that each community is non-empty and that the vertices are not all assigned to the same community (we suppose that there are at least two communities, otherwise the solution would be the trivial partition containing all the vertices), (9) imposes that each vertex belongs to only one community, and (10) is the weak constraint, where the value of $L$ is 1 if we consider the original definition in [19] and 0 if we only want to guarantee that each community assumes a non-negative value of modularity density.

The advantage of this new formulation, which will be discussed in the following, is that we can derive a more efficient exact linearization of the objective function. As noticed in [8], the difficult part is the linearization of the fractions arising in (7) (the products $x_{il} x_{jl}$ involving two binary variables can be easily linearized exactly using the Fortet inequalities [10] or the dual approach presented in [9]). To ease the explanation, we consider the modularity density of the community $V_l$ (the same technique can be applied to linearize the modularity density of all the other communities). The idea used in [8] for the linearization of the modularity density of $V_l$ (formulation MDL) is the following. First, we introduce a variable $\alpha_l$ representing the modularity density of $V_l$:

\[
\alpha_l = \frac{4 \sum_{\{v_i, v_j\} \in E} x_{il} x_{jl} - \sum_{v_i \in V} k_i x_{il}}{\sum_{v_i \in V} x_{il}}. \tag{12}
\]

Thanks to the fact that empty communities are not allowed (see constraint (8)), the denominator of (12) is greater than 0, hence we can write:

\[
4 \sum_{\{v_i, v_j\} \in E} x_{il} x_{jl} - \sum_{v_i \in V} k_i x_{il} = \sum_{v_i \in V} \alpha_l x_{il}. \tag{13}
\]

We now need to linearize each product $\alpha_l x_{il}$. We can derive an exact linearization by means of the McCormick inequalities [16], because $x_{il}$ is binary.
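The McCormick inequalities for a single product $w = \alpha x$, with $x \in [0, 1]$ and $\alpha$ lying between two known bounds, can be sketched in Python as follows. The function name `mccormick_interval` and the numeric values are illustrative assumptions and are not taken from [8] or [16].

```python
def mccormick_interval(alpha, x, lower, upper):
    """Range of w allowed by the McCormick inequalities for w = alpha * x.

    Assumes lower <= alpha <= upper and 0 <= x <= 1:
        w >= lower * x,    w >= alpha - upper * (1 - x),
        w <= upper * x,    w <= alpha - lower * (1 - x).
    """
    w_min = max(lower * x, alpha - upper * (1 - x))
    w_max = min(upper * x, alpha - lower * (1 - x))
    return w_min, w_max


# Binary x: the interval reduces to the single value alpha * x.
at_zero = mccormick_interval(2.0, 0, -10.0, 5.0)
at_one = mccormick_interval(2.0, 1, -10.0, 5.0)

# Fractional x (continuous relaxation): compare a loose and a tight lower bound.
loose = mccormick_interval(2.0, 0.5, -10.0, 5.0)
tight = mccormick_interval(2.0, 0.5, 0.0, 5.0)
print("width with lower = -10:", loose[1] - loose[0])
print("width with lower = 0:", tight[1] - tight[0])
```

For binary $x$ the four inequalities force $w = \alpha x$, which is why the linearization is exact. For fractional $x$ the admissible interval shrinks when the bounds on $\alpha$ are tightened, so the bounds determine the quality of the continuous relaxation.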
However, we need a lower and an upper bound on $\alpha_l$. Indeed, the tighter those bounds, the better the linearization. Concerning the upper bound, it has been computed in [8] by solving an auxiliary problem, whereas a theoretical lower bound $L_\alpha$, defined in terms of the two highest vertex degrees $k^{\max}_1$ and $k^{\max}_2$, has been employed. If constraint (10) holds, then $\alpha_l$ is non-negative for both $L = 1$ and $L = 0$ (and strictly positive for $L = 1$), hence 0 can be used as a lower bound for $\alpha_l$. This value provides in general a lower bound which is much tighter than $L_\alpha$, and which does not depend on the size of the instance (on the other hand, the quality of the bound $L_\alpha$ decreases with the size of the instance, in general). This idea can also be extended to the formulation MDB in [8], where a binary decomposition of the denominator of (12) has been employed to decrease the number of products to linearize with the McCormick inequalities.

Using the formulation (7)-(11) with both $L = 0$ and $L = 1$, the partition into 8 communities represented in Fig. 1 is infeasible. The optimal solution is found when the number of communities is 7: the difference with respect to the previous case is that the clique with 3 vertices is in the same community as one of the cliques having size 4 connected to vertex 3 (which one does not matter, the solution would be symmetric). The modularity density value associated with this new partition is 18.5.

2.2 Splitting of a clique in the optimal solution

Consider the network with 18 vertices presented in Fig. 2.

Fig. 2 Example of a network composed of three cliques with 5 vertices each, connected to a smaller clique having 3 vertices (color online).
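The behavior of modularity density on a network of this kind can be explored with a short Python sketch. The wiring of Fig. 2 is not specified in the text, so the construction below is an assumption: each vertex of the 3-clique is linked to three vertices of a distinct 5-clique, and the vertex labels and the function name `modularity_density` are hypothetical. This wiring is chosen because it reproduces the values $D = 9.2$ and $D = 12$ reported in this section.

```python
from itertools import combinations


def modularity_density(edges, partition):
    """D of Eq. (2): sum over communities of (2 * inside - cut) / size."""
    total = 0.0
    for community in partition:
        members = set(community)
        inside = sum(1 for u, v in edges if u in members and v in members)
        cut = sum(1 for u, v in edges if (u in members) != (v in members))
        total += (2 * inside - cut) / len(members)
    return total


# Assumed reconstruction of Fig. 2 (hypothetical labels and wiring).
triangle = [1, 2, 3]
cliques = [list(range(4, 9)), list(range(9, 14)), list(range(14, 19))]
edges = list(combinations(triangle, 2))
for t, clique in zip(triangle, cliques):
    edges += list(combinations(clique, 2))
    edges += [(t, v) for v in clique[:3]]  # three links per triangle vertex

four = [triangle] + cliques                                    # the 4 cliques
three = [[t] + clique for t, clique in zip(triangle, cliques)]  # triangle split up

d_four = modularity_density(edges, four)
d_three = modularity_density(edges, three)
print("D with 4 communities:", round(d_four, 4))
print("D with 3 communities:", round(d_three, 4))
```

In this reconstruction each triangle vertex has more links to its 5-clique than to the rest of the triangle, which is what makes splitting the small clique advantageous.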
Maximizing the modularity density when the number of communities is 4 results in a partition consisting of the 4 cliques: the small clique with 3 vertices (circle shape) and the three cliques with 5 vertices (diamond, square, and triangle shapes), and the corresponding modularity density value $D$ is equal to 9.2. However, a higher value of modularity density is obtained when the number of communities is 3. The partition found consists of three communities with 6 vertices each (yellow, green, and red colors), and the corresponding value of modularity density is $D = 12$. Hence, in the optimal solution the small clique is split among the three other cliques. Notice that the solution with 4 communities would have been infeasible under the formulation with the weak constraint (7)-(11), regardless of the value of $L$.

Among the properties presented by Li et al. [15], some are not proved, wrong, or need to be clarified. Discussing and commenting on these properties is the subject of this section.

3.1 Non-negative modularity density

Li et al. claim that “Since our purpose is to maximize the modularity density D, every term d(G_i) must be non-negative”. Indeed, this is intuitive, as one may expect that the maximum value of $D$ is obtained when all the terms $d(G_i)$ assume high values. Nevertheless, this is not always true when the number of communities is non-optimal (where the optimal number of communities is that of the partition yielding the highest value of modularity density). Consider for example the journal index network tested in Section V. 3 of [15]. The optimal number of communities is 4. However, when trying to maximize the modularity density with 5 communities, the authors state that “When we intend to split the network into five modules, we get essentially the same partition as with four, only with the singly connected journal Conservation Biology split off by itself as a community”.
It is easy to check that the modularity density value of the community consisting only of the vertex associated with the journal Conservation Biology is -1. Actually, even when the number of communities is optimal, the property may not hold: in some cases having a community with a small negative value of modularity density allows other communities to assume higher modularity density values, thus yielding a higher value of $D$, as shown in Section 2.1. Notice that this wrong statement can yield wrong formulations for the modularity density maximization problem. As pointed out in Section 2.1, in [8] some exact linearizations of the non-linear formulation proposed in [15] are introduced, and they require a lower bound on the modularity density value. Using 0, as suggested in [15], would produce a wrong model. Moreover, the statement “the partition (subgraphs) by optimizing D results in communities consistent with the weak definition suggested by Radicchi et al.” is also not correct. To summarize, there are two mistakes in their statements:

– it is not true that modularity density for a community is always non-negative in the optimal solution (as shown in Fig. 1);
– even if modularity density were non-negative for all communities in the optimal solution, this would not be enough to ensure that the weak condition holds, because that condition requires the modularity density to be strictly positive for all communities (this is because of the strict inequality in (3), which yields the +1 on the right-hand side of (4)).

3.2 Division of cliques in the optimal solution

One of the properties presented by Li et al. is “Given a clique with n vertices, maximizing modularity density or D does not divide it into two or more parts”. This statement should be clarified: the proof of the authors assumes that the whole network is a clique (i.e., the clique has no external edges connecting it with other vertices), and it does not refer to any clique which can be found in a network (even though this property is later employed to prove some other results for networks containing some cliques, see Fig. 1 and Sections III. B-C in [15]). In fact, if the clique is densely connected to external vertices, it could be split, as shown in Section 2.2.

3.3 Complexity of modularity density maximization

Li et al. state that “The search for optimal modularity density D is a NP-hard problem due to the fact that the space of possible partitions grows faster than any power of system size”. This is not an appropriate definition of NP-hardness. Consider for example the shortest path problem [7]: even though the space of the possible solutions is exponential, the problem belongs to P. This does not mean that modularity density maximization is not an NP-hard problem, but the correctness of this statement should be proven in a more appropriate way, for example by means of a reduction from an NP-complete problem to the decision version of the modularity density maximization (as done for modularity [2]). Notice that some papers already cite [15] as reference for the NP-hardness of modularity density maximization [6,21].

3.4 Wrong result for Zachary karate club network

In Section V Li et al. test their function on some artificial and real-world instances. Concerning the latter, they present the results for the famous Zachary karate club network [20]. Commenting on the solution found, the authors claim that “By using our method, the network was partitioned into two communities exactly consistent with real partition when k = 2 (see Fig. 3). However, maximizing the D value, we obtained the “optimal” partition with k = 4 which is also reasonable from the topology of the network”. We now discuss this point in more detail. We show in Fig. 3 (that is, Fig. 3 borrowed from [15]) the partitions into 2 and 4 communities presented by the authors, and in Fig. 4 the same partitions with, in addition, the labels of the vertices.

Fig. 3 Partitions into 2 and 4 communities of the Zachary karate club network presented in [15] (color online).
Fig. 4 Same partitions represented in Fig. 3 with labels for the vertices (color online).
We tried to optimize the $D$ value on the Zachary karate club network, but we obtained different results. The partition with 2 communities found by Li et al. consists of $V_1$ (16 vertices, squares) and $V_2$ (18 vertices, circles). When the number of communities is 4, each community of the previous partition is further split in two. The result is a partition composed of $V_1$ (5 vertices, dark squares), $V_2$ (11 vertices, white squares), $V_3$ (6 vertices, dark circles), and $V_4$ (12 vertices, white circles). We solved the problem of modularity density maximization for the Zachary karate club network with 2, 3, and 4 communities using the exact formulation presented in [8]. The result obtained with 2 communities is consistent with that of the authors, and the value of $D$ is 6.83333. The result obtained with 3 communities is a partition into communities of 12, 17, and 5 vertices, which attains the highest value of $D$ among the cases considered. The result obtained with 4 communities is a partition into communities $V_1$, $V_2$, $V_3$, and $V_4$ of 5, 11, 4, and 14 vertices, with a value of $D$ of 7.54481. The difference with respect to the solution of the authors is that vertices 24 and 28 are moved from the community $V_3$ to $V_4$. Actually, if vertices 24 and 28 belong to $V_3$, the corresponding value of $D$ is 7.50909, which is non-optimal when there are 4 communities. These results are summarized in Table 1, together with the values of modularity $Q$ associated with the partitions found by maximizing the modularity density.

Table 1 Results obtained by maximizing the modularity density $D$ on the Zachary karate club network, and corresponding values of modularity $Q$. The results refer to the optimal partitions obtained with 2, 3, and 4 communities, as well as to the non-optimal partition into 4 communities presented by the authors in [15] (see Fig. 3-4).

m    D    Q

Looking at Table 1, we can also notice that the solution with 4 communities of Fig. 3 corresponds to the highest value of modularity $Q$, whereas the best solution found by maximizing $D$ with 4 communities is different, as explained earlier. To summarize, not only is the solution proposed in [15] non-optimal with respect to the number of communities (which should be 3 instead of 4), but even fixing the number of communities at 4 their solution is non-optimal with respect to the modularity density value (which should be 7.54481 instead of 7.50909). The reason for this behavior is that all the results presented in [15] are based on a method which finds only local optima (as confirmed by the authors), hence there is no guarantee of global optimality.

In this paper we have discussed some properties of modularity density, and we have shown the relationship with the weak definition of community of Radicchi et al. [19]. This remark allowed us to derive a new formulation, which is easier to linearize and which ensures that in the optimal solution each community has a non-negative value of modularity density. Moreover, we have clarified, commented on, and corrected some wrong and inaccurate statements presented in [15]. Despite these issues, modularity density remains a very interesting criterion, due to its capability of fixing the resolution limit issue of modularity. For this reason, we devoted our effort to a better characterization and description of its features.
References
1. Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering (6), 734–749 (2005)
2. Brandes, U., Delling, D., Gaertler, M., Görke, R., Hoefer, M., Nikoloski, Z., Wagner, D.: On modularity clustering. IEEE Transactions on Knowledge and Data Engineering (2), 172–188 (2008)
3. Cafieri, S., Caporossi, G., Hansen, P., Perron, S., Costa, A.: Finding communities in networks in the strong and almost-strong sense. Physical Review E (4), 046113 (2012)
4. Cafieri, S., Costa, A., Hansen, P.: Adding cohesion constraints to models for modularity maximization in networks. Journal of Complex Networks (2014). Accepted
5. Cafieri, S., Costa, A., Hansen, P.: Reformulation of a model for hierarchical divisive graph modularity maximization. Annals of Operations Research (1), 213–226 (2014)
6. Chen, M., Kuzmin, K., Szymanski, B.: Community detection via maximization of modularity and its variants. IEEE Transactions on Computational Social Systems (1), 46–65 (2014)
7. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 3rd edn. The MIT Press (2009)
8. Costa, A.: MILP formulations for the modularity density maximization problem. Tech. Rep. 2014-10-4588, Optimization Online (2014)
9. Costa, A., Liberti, L.: Relaxations of multilinear convex envelopes: Dual is better than primal. In: R. Klasing (ed.) Experimental Algorithms, Lecture Notes in Computer Science, vol. 7276, pp. 87–98. Springer Berlin Heidelberg (2012)
10. Fortet, R.: Applications de l’algèbre de Boole en recherche opérationelle. Revue Française de Recherche Opérationelle (14), 17–26 (1960)
11. Fortunato, S., Barthélemy, M.: Resolution limit in community detection. Proceedings of the National Academy of Sciences of the U.S.A. (1), 36–41 (2007)
12. Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proceedings of the National Academy of Sciences of the U.S.A. (12), 7821–7826 (2002)
13. Good, B.H., de Montjoye, Y.A., Clauset, A.: Performance of modularity maximization in practical contexts. Physical Review E (4), 046106 (2010)
14. Guimerà, R., Amaral, L.A.N.: Functional cartography of complex metabolic networks. Nature, 895–900 (2005)
15. Li, Z., Zhang, S., Wang, R.S., Zhang, X.S., Chen, L.: Quantitative function for community detection. Physical Review E, 036109 (2008)
16. McCormick, G.: Computability of global solutions to factorable nonconvex programs: Part I - Convex underestimating problems. Mathematical Programming, 146–175 (1976)
17. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Physical Review E (2), 026113 (2004)
18. Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature (7043), 814–818 (2005)
19. Radicchi, F., Castellano, C., Cecconi, F., Loreto, V., Parisi, D.: Defining and identifying communities in networks. Proceedings of the National Academy of Sciences of the U.S.A. (9), 2658–2663 (2004)
20. Zachary, W.: An information flow model for conflict and fission in small groups. Journal of Anthropological Research 33