Finding Densest k-Connected Subgraphs
Francesco Bonchi, David García-Soriano, Atsushi Miyauchi, Charalampos E. Tsourakakis
ISI Foundation, Turin, Italy
University of Tokyo, Tokyo, Japan
Boston University, Boston, USA
July 6, 2020
Abstract
Dense subgraph discovery is an important graph-mining primitive with a variety of real-world applications. One of the most well-studied optimization problems for dense subgraph discovery is the densest subgraph problem, where given an edge-weighted undirected graph G = (V, E, w), we are asked to find S ⊆ V that maximizes the density d(S), i.e., half the weighted average degree of the induced subgraph G[S]. This problem can be solved exactly in polynomial time and well-approximately in almost linear time. However, a densest subgraph has a structural drawback, namely, the subgraph may not be robust to vertex/edge failure. Indeed, a densest subgraph may not be well-connected, which implies that the subgraph may be disconnected by removing only a few vertices/edges within it. In this paper, we provide an algorithmic framework to find a dense subgraph that is well-connected in terms of vertex/edge connectivity. Specifically, we introduce the following problems: given a graph G = (V, E, w) and a positive integer/real k, we are asked to find S ⊆ V that maximizes the density d(S) under the constraint that G[S] is k-vertex/edge-connected. For both problems, we propose polynomial-time (bicriteria and ordinary) approximation algorithms, using classic Mader’s theorem in graph theory and its extensions.

Introduction
Dense subgraph discovery is an important graph-mining primitive with a variety of real-world applications [20]. Examples include detecting communities and spam link farms in the Web graph [11, 19], extracting molecular complexes in protein–protein interaction networks [3, 45], finding experts in crowdsourcing systems [29], and real-time story identification from tweets [2].

One of the most well-studied optimization problems for dense subgraph discovery is the densest subgraph problem. Let G = (V, E, w) be a simple undirected graph with edge weight w : E → R_{>0}, where R_{>0} is the set of positive reals. Throughout this paper, we assume that |E| ≥ 1, edge-weighted graphs have only positive weights, and G is connected. For S ⊆ V, let G[S] denote the subgraph induced by S, i.e., G[S] = (S, E(S)), where E(S) = {{u, v} ∈ E | u, v ∈ S}. The density of S ⊆ V is defined as d(S) = w(S)/|S|, where w(S) is the sum of edge weights of G[S], i.e., w(S) = Σ_{e ∈ E(S)} w(e). In the densest subgraph problem, given a graph G = (V, E, w), we are asked to find S ⊆ V that maximizes d(S). An optimal solution to this problem is referred to as a densest subgraph.

Unlike most optimization problems for dense subgraph discovery such as the maximum clique problem [18], the densest subgraph problem is polynomial-time solvable. Indeed, there are some polynomial-time exact algorithms such as Goldberg’s flow-based algorithm [21] and Charikar’s linear-programming-based algorithm [8]. Moreover, it was shown by Charikar [8] that a simple greedy algorithm admits a 1/2-approximation.

A vertex subset S ⊂ V is called a vertex separator of G if its removal disconnects G, i.e., partitions G into at least two non-empty graphs between which there are no edges. Note that no clique has a vertex separator. An edge subset F ⊆ E is called a cut of G if its removal disconnects G. The weight of a cut is defined to be the sum of weights of edges within it. The vertex connectivity of G, denoted by κ(G), is the smallest cardinality of a vertex separator of G if G is not a clique and |V| − 1 otherwise. The edge connectivity of G, denoted by λ(G), is the smallest weight of a cut of G.

A densest subgraph does not necessarily have large vertex/edge connectivity, which means that the subgraph may be disconnected by removing only a few vertices/edges within it. For instance, consider an unweighted graph G (i.e., w(e) = 1 for every e ∈ E) consisting of two equally-sized large cliques that share only a few vertices or are connected by only a few edges.
In both cases, the entire graph is a densest subgraph, but it is easily disconnected by removing the common vertices in the former case and the bridging edges in the latter case.

In this paper, we provide an algorithmic framework to find a dense subgraph that is well-connected in terms of vertex/edge connectivity. An (edge-weighted) graph G is said to be k-vertex-connected if κ(G) is no less than k. On the other hand, an edge-weighted graph G is said to be k-edge-connected if λ(G) is no less than k. Using these criteria, we introduce the following problems:

Problem 1 (Densest k-vertex-connected subgraph). Given an edge-weighted undirected graph G = (V, E, w), where w : E → R_{>0}, and a positive integer k ∈ Z_{>0}, the goal is to find S ⊆ V that maximizes the density d(S) subject to the constraint that the induced subgraph G[S] is k-vertex-connected.

Problem 2 (Densest k-edge-connected subgraph). Given an edge-weighted undirected graph G = (V, E, w), where w : E → R_{>0}, and a positive real k ∈ R_{>0}, the goal is to find S ⊆ V that maximizes the density d(S) subject to the constraint that the induced subgraph G[S] is k-edge-connected.

Figure 1: Densest subgraphs in real-world Web graphs. Panels: (a) web-BerkStan, (b) web-Google, (c) web-NotreDame, (d) web-Stanford.

In the two-cliques example we discussed earlier, an optimal solution to Problems 1 and 2 with a sufficiently large value for k would be one of the cliques, which is robust to vertex/edge failure and nearly as dense as the densest subgraph (i.e., the entire graph). We observe that Problems 1 and 2 are meaningful for real-world data too; Figure 1 visualizes densest subgraphs of the four real-world Web graphs that are publicly available at SNAP (Stanford Network Analysis Project) [32] using a spring layout positioning. As we can visually observe, small separators may exist in real-world densest subgraphs. Table 1 summarizes the detailed statistics.
As can be seen, the densest subgraphs in web-BerkStan and web-NotreDame have surprisingly small vertex connectivity; in fact, they have vertex connectivity of twelve and one, respectively. Note that for both densest subgraphs, vertex connectivity is much smaller than the minimum degree of vertices, a trivial upper bound on that.

Table 1: Densest subgraphs G[S_DS] in four real-world Web graphs: δ(G[S_DS]) represents the minimum degree of vertices in G[S_DS], a trivial upper bound on κ(G[S_DS]) and λ(G[S_DS]).

Graph            |S_DS|   |E(S_DS)|   d(S_DS)   κ(G[S_DS])   λ(G[S_DS])   δ(G[S_DS])
web-BerkStan        392      40,535    103.41           12          201          201
web-Google          123       3,449     28.04           30           30           30
web-NotreDame                                            1
web-Stanford        597      35,456     59.39           60           60           60

Graphs have been made simple undirected by ignoring the direction of edges, and by removing self-loops and multiple edges.

For both problems, we propose polynomial-time (bicriteria and ordinary) approximation algorithms. Let w_max and w_min denote the maximum and minimum weights, respectively, over all edges in G, i.e., w_max = max_{e ∈ E} w(e) and w_min = min_{e ∈ E} w(e).

Our first result is polynomial-time (γ/4 · w_min/w_max, 1/γ)-bicriteria approximation algorithms with parameter γ ∈ [1, 2] for Problems 1 and 2. That is, the algorithm for Problem 1/Problem 2 outputs S ⊆ V having density at least the optimal value times γ/4 · w_min/w_max but only satisfies a (k/γ)-vertex/edge-connectivity constraint (rather than the original k-vertex/edge-connectivity constraint). Note that if we set γ = 1, we can obtain (1/4 · w_min/w_max)-approximation algorithms. The design of our algorithms is based on an elegant theorem in graph theory, proved by Mader [34]. The theorem states that any (unweighted) dense graph contains a highly vertex-connected subgraph wherein the minimum degree of vertices is greater than the density of the entire vertex set. We refer to this subgraph as a Mader subgraph and our algorithm finds a Mader subgraph of a densest subgraph of each maximal k-vertex-connected subgraph of G. It should be noted that to deal with edge-weighted graphs, we generalize Mader’s theorem. Our generalized version cannot be directly obtained from the original statement of Mader’s theorem, and is essential to derive the bicriteria approximation ratio for edge-weighted graphs.

Our second result is polynomial-time (6/19 · w_min/w_max)-approximation algorithms for Problems 1 and 2, which improves the above approximation ratio of 1/4 · w_min/w_max derived directly from the bicriteria approximation ratio. Our algorithm for Problem 1/Problem 2 computes the most highly connected subgraph in terms of vertex/edge connectivity, which can be done using the algorithms in Matula [38]. In the analysis of the approximation ratio, we use a useful variant of Mader’s theorem, recently proved by Bernshteyn and Kostochka [5].

Paper organization. The remainder of this paper is organized as follows. In Section 2, we review related work. In Section 3, we extend Mader’s theorem to edge-weighted graphs and design an algorithm for finding a Mader subgraph. In Sections 4 and 5, we present our bicriteria and ordinary approximation algorithms, respectively. We conclude with some open problems in Section 6.
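The density d(S) = w(S)/|S| and the two-cliques example from the introduction can be checked on a toy instance. The following is a minimal sketch, not the authors’ code: the helper names `density`, `clique_edges`, and `is_connected`, and the choice of two 10-vertex cliques sharing vertices 8 and 9, are assumptions made purely for illustration.

```python
from itertools import combinations


def density(edges, S):
    """Return d(S) = w(S)/|S|, where w(S) sums weights of edges inside S."""
    S = set(S)
    inside = sum(w for (u, v), w in edges.items() if u in S and v in S)
    return inside / len(S)


def clique_edges(vertices, weight=1.0):
    """Edge-weight dictionary of a clique on the given vertices."""
    return {(u, v): weight for u, v in combinations(sorted(vertices), 2)}


def is_connected(edges, S):
    """Depth-first search on the subgraph induced by S."""
    S = set(S)
    if not S:
        return True
    adj = {v: set() for v in S}
    for (u, v) in edges:
        if u in S and v in S:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(S))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == S


# Hypothetical instance: two unweighted 10-cliques sharing vertices 8 and 9.
A = set(range(0, 10))
B = set(range(8, 18))
edges = {**clique_edges(A), **clique_edges(B)}
whole = A | B

d_whole = density(edges, whole)   # 89 edges on 18 vertices
d_clique = density(edges, A)      # 45 edges on 10 vertices
```

In this instance the whole vertex set is denser than either clique, yet deleting the two shared vertices disconnects it, which is the kind of fragility that Problems 1 and 2 are meant to rule out.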
Variations of the densest subgraph problem.
Wu et al. [52] consider the problem of detecting a dense and connected subgraph in dual networks. A dual network is a pair of graphs G = (V, E_G) and H = (V, E_H) defined on the same vertex set V, which encode different types of connections using two edge sets E_G and E_H. Wu et al. [52] introduced the following problem: given a dual network (G, H), we are asked to find S ⊆ V that maximizes d(S) in G under the constraint that H[S] is connected (i.e., 1-edge-connected). They proved that the problem is NP-hard and devised a scalable heuristic. Problem 2 with k = 1 on unweighted graphs, i.e., the densest 1-edge-connected subgraph problem, can be seen as a special case of their problem wherein two graphs G and H are the same, i.e., E_G = E_H. It is easy to see that unlike the general form of their problem, the densest 1-edge-connected subgraph problem (on unweighted graphs) is polynomial-time solvable.

Two closely related papers are due to Tsourakakis [48] and Kawase and Miyauchi [30]. They aim to find a near-clique (which is robust to vertex/edge failure) by extending the densest subgraph problem. Tsourakakis [48] introduced the problem called the k-clique densest subgraph problem. In this problem, given an unweighted graph G = (V, E), we are asked to find S ⊆ V that maximizes the k-clique density w_k(S)/|S|, where w_k(S) is the number of k-cliques (i.e., cliques with size k) in G[S]. Tsourakakis [48] showed that this problem (with constant k) remains polynomial-time solvable, and later, Mitzenmacher et al. [39] proposed a scalable algorithm that obtains a nearly-optimal solution. On the other hand, Kawase and Miyauchi [30] introduced the problem called the f-densest subgraph problem with convex f. In this problem, given an edge-weighted graph G = (V, E, w), we are asked to find S ⊆ V that maximizes w(S)/f(|S|), where f : Z_{≥0} → R_{≥0} is a monotonically non-decreasing function that satisfies (f(x + 2) − f(x + 1)) − (f(x + 1) − f(x)) ≥ 0 for any x ∈ Z_{≥0}. This formulation generalizes the NP-hard optimal quasi-cliques problem due to Tsourakakis et al. [49, 50]. Kawase and Miyauchi [30] studied the hardness of the problem, and proposed a polynomial-time approximation algorithm. Although the above two problems contribute to computing a dense subgraph that is robust to vertex/edge failure, they cannot explicitly impose k-vertex/edge connectivity.

There are also some variants that take into account the robustness to the uncertainty of input graphs. Zou [53] studied the densest subgraph problem on uncertain graphs. Uncertain graphs are a generalization of graphs, which can model the uncertainty of the existence of edges. More formally, an uncertain graph consists of an unweighted graph G = (V, E) and a function p : E → [0, 1], where e ∈ E is present with probability p(e) whereas e ∈ E is absent with probability 1 − p(e). In the problem introduced by Zou [53], given an uncertain graph G = (V, E) with p, we are asked to find S ⊆ V that maximizes the expected value of the density. Zou [53] observed that this problem can be reduced to the original densest subgraph problem, and designed a polynomial-time exact algorithm using the reduction. Very recently, Tsourakakis et al. [51] introduced the problem called the risk-averse DSD. In this problem, given an uncertain graph G = (V, E) with p, we are asked to find S ⊆ V that has a large expected density and at the same time has a small risk. The risk of S ⊆ V is measured by the probability that S is not dense on a given uncertain graph. They showed that the risk-averse DSD can be reduced to the densest subgraph problem with negative edge weights (which is NP-hard), and designed an efficient approximation algorithm based on the reduction.

Miyauchi and Takeda [41] considered the uncertainty of edge weights rather than the existence of edges. To model that, they assumed that they have an edge-weight space W = ×_{e ∈ E} [l_e, r_e] ⊆ ×_{e ∈ E} [0, ∞) that contains the unknown true edge weight w. To evaluate the performance of S ⊆ V without any concrete edge weight, they employed a well-known measure in the field of robust optimization, called the robust ratio. In their scenario, the robust ratio of S ⊆ V under W is defined as the multiplicative gap between the density of S in terms of edge weight w′ and the density of S*_{w′} in terms of edge weight w′ under the worst-case edge weight w′ ∈ W, where S*_{w′} is a densest subgraph of G with w′. Intuitively, S ⊆ V with a large robust ratio has a density close to the optimal value even on G with the edge weight selected adversarially from W. Using the robust ratio, they formulated the robust densest subgraph problem, where given an unweighted graph G = (V, E) and an edge-weight space W = ×_{e ∈ E} [l_e, r_e] ⊆ ×_{e ∈ E} [0, ∞), we are asked to find S ⊆ V that maximizes the robust ratio under W. Miyauchi and Takeda [41] designed an algorithm that returns S ⊆ V with a robust ratio of at least 1/max_{e ∈ E} (r_e/l_e) under some mild condition.

In addition to the above, there are many other problem variations. The most well-studied variants are size restricted ones [1, 6, 13, 31]. For example, in the densest k-subgraph problem [13], given an edge-weighted graph G = (V, E, w) and a positive integer k ∈ Z_{>0}, we are asked to find S ⊆ V that maximizes d(S) subject to the constraint |S| = k. It is known that such a restriction makes the problem much harder; indeed, the densest k-subgraph problem is NP-hard and the best known approximation ratio is Ω(1/n^{1/4+ε}) for any ε > 0.

Vertex and edge connectivity.
In the vertex connectivity problem, we are asked to compute κ(G) for a given graph G = (V, E). For this problem, Gabow [16] developed an O(|V|(κ(G)^2 · min{|V|^{3/4}, κ(G)^{3/2}} + κ(G)|V|))-time algorithm, which also computes a corresponding minimum vertex separator S ⊂ V. This is one of the current fastest deterministic algorithms for the problem, although there are various randomized algorithms (e.g., see [14, 24, 33, 43]). Note that there are linear-time algorithms that decide whether G is 2-vertex-connected and 3-vertex-connected, respectively, due to Tarjan [47] and Hopcroft and Tarjan [25].

Another important problem related to vertex connectivity is to compute the family of maximal k-vertex-connected subgraphs, which will be solved in our bicriteria approximation algorithm for Problem 1. For S ⊆ V and k ∈ Z_{>0}, the induced subgraph G[S] is called a maximal k-vertex-connected subgraph if G[S] is k-vertex-connected and no superset of S has this property. For this task, the first polynomial-time algorithm is given by Matula [37]. Note that maximal k-vertex-connected subgraphs may overlap each other; the design of the algorithm by Matula [37] is based on the fact that the maximum total number of maximal k-vertex-connected subgraphs is O(|V|) [37]. Later, Makino [36] designed an O(|V| · T)-time algorithm, where T is the computation time required to find a vertex separator of size at most k − 1. Combined with the above vertex connectivity algorithm by Gabow [16], the algorithm by Makino [36] yields the running time of O(|V|^2 (k^2 · min{|V|^{3/4}, k^{3/2}} + k|V|)). For some special k, there are some existing algorithms that have better running time. For k = 2 and 3, there are linear-time algorithms by Tarjan [47] and Hopcroft and Tarjan [25], respectively. For any constant k, Henzinger et al. [22] presented an O(|V|^3)-time algorithm.

In the (global) minimum cut problem, given an edge-weighted graph G = (V, E, w), we are asked to find the minimum weight cut F ⊆ E. For this problem, Nagamochi and Ibaraki [42] designed an O(|V|(|E| + |V| log |V|))-time algorithm. Later, Stoer and Wagner [46] and Frank [15] independently presented a very simple algorithm that still has the same running time. For simple unweighted graphs, the seminal work by Karger [27] provides a randomized (Monte Carlo) algorithm that runs in nearly-linear, O(|E| log^3 |V|), time. As this algorithm does not necessarily return the right answer, Karger [27] posed an open question to find a nearly-linear-time deterministic algorithm. In a recent breakthrough, Kawarabayashi and Thorup [28] answered the question; they developed a deterministic algorithm for simple unweighted graphs that runs in O(|E| log^{12} |V|) time. Very recently, Henzinger et al. [23] improved the running time to O(|E| log^2 |V| log log^2 |V|) time, which is better even than that of the randomized algorithm by Karger [27].

As in the vertex connectivity case, computing the family of maximal k-edge-connected subgraphs is also an important problem, which will be solved in our bicriteria approximation algorithm for Problem 2. For S ⊆ V and k ∈ R_{>0}, the induced subgraph G[S] is called a maximal k-edge-connected subgraph if G[S] is k-edge-connected and no superset of S has this property.
The problem can be solved using any minimum cut algorithm as follows: if the weight of the minimum cut of the graph is less than k, divide the graph into two subgraphs along with the cut and then repeat the procedure on the resulting subgraphs. For edge-weighted graphs, we can directly obtain an O(|V|^2 (|E| + |V| log |V|))-time algorithm using one of the above minimum cut algorithms by Nagamochi and Ibaraki [42], Stoer and Wagner [46], and Frank [15]. To the best of our knowledge, there is no existing algorithm that has a better running time. For simple unweighted graphs, we can again directly obtain an O(|E||V| log^2 |V| log log^2 |V|)-time algorithm using the above minimum cut algorithm by Henzinger et al. [23]. Unlike the weighted case, for some special k, there are some existing algorithms that have a better running time. For k = 2, there is a linear-time algorithm by Tarjan [47]. For any constant k, Henzinger et al. [22] presented an O(|V|^2 log |V|)-time algorithm, and more recently, Chechik et al. [9] provided an O(√|V| (|E| + |V| log |V|))-time algorithm. The latter algorithm is efficient particularly for sparse graphs; indeed, the latter is better than the former when |E| = o(|V|^{3/2} log |V|). Very recently, for any k ∈ Z_{>0}, Forster et al. [14] developed a randomized (Las Vegas) algorithm that has expected running time O(k^3 |V|^{3/2} log |V| + k|E| log^2 |V|), which is faster than the algorithm by Chechik et al. [9] (for general k ∈ Z_{>0}).

In this section, we extend Mader’s theorem to edge-weighted graphs and design an algorithm for finding a Mader subgraph.
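The algorithm developed in this section relies on a peeling subroutine (Peel, Algorithm 1), which repeatedly deletes a vertex of minimum weighted degree until every remaining vertex has weighted degree greater than d. A minimal sketch of that idea follows; the adjacency-dictionary representation, the helper `build_adj`, and the lazy-deletion heap are illustrative assumptions rather than the implementation referenced in the paper, and vertices are assumed to be mutually comparable (e.g., integers) so that heap ties can be broken.

```python
import heapq


def build_adj(weighted_edges):
    """Symmetric adjacency dict {u: {v: w}} from (u, v, w) triples."""
    adj = {}
    for u, v, w in weighted_edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def peel(adj, d):
    """Return the vertex set of the maximal induced subgraph whose minimum
    weighted degree exceeds d, or None if no such subgraph exists."""
    deg = {v: sum(nbrs.values()) for v, nbrs in adj.items()}
    alive = set(adj)
    heap = [(deg[v], v) for v in alive]
    heapq.heapify(heap)
    while heap:
        dv, v = heapq.heappop(heap)
        if v not in alive or dv != deg[v]:
            continue  # stale heap entry left over from an earlier update
        if dv > d:
            return alive  # the current minimum degree already exceeds d
        alive.remove(v)
        for u, w in adj[v].items():
            if u in alive:
                deg[u] -= w
                heapq.heappush(heap, (deg[u], u))
    return None


# Hypothetical instance: a unit-weight 4-clique plus a pendant vertex 4.
toy = build_adj([(0, 1, 1), (0, 2, 1), (0, 3, 1), (1, 2, 1),
                 (1, 3, 1), (2, 3, 1), (0, 4, 1)])
core = peel(toy, 2)
```

With a binary heap this sketch runs in O(|E| log |V|) time; it does not attempt to match the O(|E| + |V| log |V|) bound quoted for Algorithm 1.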
Mader’s theorem [34] is a foundational theorem in graph theory. The precise statement is as follows:
Theorem 1 (Mader [34]; see also Theorem 1.4.3 in Diestel [10]). Let G = (V, E) be an unweighted graph and let d be a positive integer. If G has density at least d, then G has a (⌊d/2⌋ + 1)-vertex-connected subgraph wherein the minimum degree of vertices is greater than d.

A straightforward application of Theorem 1 to edge-weighted graphs would yield the following result. Let G = (V, E, w) be an edge-weighted graph, let d be a positive real, and assume that G has density at least d. Now consider an unweighted graph G′ = (V, E) defined on the same vertex set V and edge set E. As G′ has the density of at least d/w_max (i.e., at least ⌊d/w_max⌋), by Theorem 1, we see that G′ has a (⌊⌊d/w_max⌋/2⌋ + 1)-vertex-connected subgraph wherein the minimum degree of vertices is greater than ⌊d/w_max⌋. Therefore, we can deduce that G has a (⌊⌊d/w_max⌋/2⌋ + 1)-vertex-connected subgraph wherein the minimum weighted degree of vertices is greater than w_min⌊d/w_max⌋. However, this is weaker than what we need to prove the approximation guarantee of our algorithms, as we discuss in Section 4.4.

Here we provide a stronger version for edge-weighted graphs. Specifically, we prove the following theorem:

Theorem 2. Let G = (V, E, w) be an edge-weighted graph and let d be a positive real. If G has density at least d, then G has a (⌊⌈d/w_max⌉/2⌋ + 1)-vertex-connected subgraph wherein the minimum weighted degree of vertices is greater than d.

Proof. Let H = (S, E(S)) be a subgraph of G with the minimum number of vertices that satisfies

\[ |S| \ge \lceil d/w_{\max} \rceil \quad \text{and} \quad w(S) > d(V)\left(|S| - \frac{\lceil d/w_{\max} \rceil}{2}\right). \tag{1} \]

There exists such a subgraph H because G itself satisfies the above condition. In fact, since d(V) ≥ d holds, there exists a vertex with the weighted degree of at least 2d, implying that the number of neighbors of such a vertex is at least ⌈d/w_max⌉, thus |V| ≥ ⌈d/w_max⌉ + 1 > ⌈d/w_max⌉ holds, and w(V) = d(V)|V| > d(V)(|V| − ⌈d/w_max⌉/2).

Suppose that |S| = ⌈d/w_max⌉. Then we have

\[ w(S) > d(V)\left(|S| - \frac{\lceil d/w_{\max} \rceil}{2}\right) = d(V)\,\frac{\lceil d/w_{\max} \rceil}{2} \ge w_{\max}\,(d/w_{\max})\,\frac{\lceil d/w_{\max} \rceil}{2} > w_{\max}\binom{\lceil d/w_{\max} \rceil}{2} = w_{\max}\binom{|S|}{2} \ge w(S), \]

a contradiction. Therefore, we see that |S| ≥ ⌈d/w_max⌉ + 1. Suppose also that there exists a vertex v in H whose weighted degree is at most d(V) in H. Let H′ = (S′, E(S′)) be a subgraph constructed by removing v from H. Then we have |S′| = |S| − 1 ≥ ⌈d/w_max⌉ and

\[ w(S') \ge w(S) - d(V) > d(V)\left(|S| - \frac{\lceil d/w_{\max} \rceil}{2} - 1\right) = d(V)\left(|S'| - \frac{\lceil d/w_{\max} \rceil}{2}\right). \]

This means that H′ also satisfies condition (1), which contradicts the minimality of H. Therefore, we see that every vertex in H has weighted degree greater than d(V) ≥ d in H.

From now on, we show that H is (⌊⌈d/w_max⌉/2⌋ + 1)-vertex-connected. Suppose, for contradiction, that there exists T ⊆ S with |T| ≤ ⌊⌈d/w_max⌉/2⌋ whose removal separates H into two non-empty subgraphs H[S_1] and H[S_2] so that there are no edges between them. For any vertex v ∈ S_1, its neighbors in H are all contained in S_1 ∪ T. As v has weighted degree greater than d(V) ≥ d in H, the number of neighbors of v in S_1 ∪ T is at least ⌈d/w_max⌉, thus |S_1 ∪ T| ≥ ⌈d/w_max⌉ + 1. From the minimality of H, we see that the subgraph H[S_1 ∪ T] does not satisfy condition (1), which implies that

\[ w(S_1 \cup T) \le d(V)\left(|S_1 \cup T| - \frac{\lceil d/w_{\max} \rceil}{2}\right) \]

holds. Applying the same argument to S_2, we also have

\[ w(S_2 \cup T) \le d(V)\left(|S_2 \cup T| - \frac{\lceil d/w_{\max} \rceil}{2}\right). \]

Combining these two inequalities, we have

\[ w(S) \le w(S_1 \cup T) + w(S_2 \cup T) \le d(V)\left(|S_1 \cup T| + |S_2 \cup T| - \lceil d/w_{\max} \rceil\right) = d(V)\left(|S_1| + |T| + |S_2| + |T| - \lceil d/w_{\max} \rceil\right) \le d(V)\left(|S| - \frac{\lceil d/w_{\max} \rceil}{2}\right), \]

which contradicts that H satisfies condition (1).

Algorithm 1: Peel(G, d)
Input: G = (V, E, w) and d ∈ R_{>0}
Output: Subgraph of G or Null
S ← V;
while S ≠ ∅ do
    v_min ← argmin_{v ∈ S} deg_S(v);
    if deg_S(v_min) > d then return G[S];
    S ← S \ {v_min};
return Null;

We design an algorithm Mader_subgraph that extracts a Mader subgraph, i.e., the subgraph whose existence is guaranteed by Theorem 2. To this end, we first present a simple subprocedure, which we call Peel. For an edge-weighted graph G = (V, E, w) and a positive real d, the procedure Peel returns the maximal subgraph of G wherein the minimum weighted degree of vertices is greater than d if such a subgraph exists and Null otherwise. Specifically, Peel iteratively removes a vertex with the minimum weighted degree in the currently remaining graph while the minimum weighted degree is no greater than d. Note that this procedure is similar to the procedure to find a k-core. For reference, we describe the entire procedure in Algorithm 1, where deg_S(v) for S ⊆ V and v ∈ S denotes the weighted degree of v in G[S]. This algorithm can be implemented to run in O(|E| + |V| log |V|) time, as mentioned in the literature [40].

Using Algorithm 1, we present Mader_subgraph in Algorithm 2, where the notation V(H′) denotes the vertex set of subgraph H′ of G. Here we briefly explain the behavior of the algorithm. Let G* be a Mader subgraph of a given edge-weighted graph G. The algorithm keeps a family of subgraphs ℋ in which exactly one subgraph contains G* as its subgraph. In each iteration, the algorithm tests whether a subgraph in ℋ is a Mader subgraph or not, and if not, the algorithm divides the subgraph into strictly smaller pieces and adds (a part of) them to ℋ. The algorithm repeats this operation until it finds a Mader subgraph. It should be noted that our algorithm is based on Matula’s algorithm [38, Algorithm A], which finds the most highly connected subgraph in terms of vertex connectivity, i.e., H ∈ argmax{κ(H) | H is a subgraph of G}.

The following theorem verifies the validity of Mader_subgraph. The proof strategy is similar to that for Matula [38, Theorem 3].

Algorithm 2: Mader_subgraph(G)
Input: G = (V, E, w)
Output: Subgraph of G
H ← Peel(G, d(V));
τ ← ⌊⌈d(V)/w_max⌉/2⌋ + 1;
ℋ ← the family of the maximal connected subgraphs of H that have at least τ + 1 vertices;
if there exists a clique K in ℋ then return K;
while True do
    H′ ← an arbitrary element of ℋ;
    C ← the minimum vertex separator of H′;
    if |C| ≥ τ then return H′;
    𝒮 ← the family of the vertex sets of the maximal connected subgraphs of G[V(H′) \ C];
    ℋ′ ← ∅;
    for each S ∈ 𝒮 do
        if Peel(G[S ∪ C], d(V)) has at least τ + 1 vertices then
            ℋ′ ← ℋ′ ∪ {Peel(G[S ∪ C], d(V))};
    if there exists a clique K in ℋ′ then return K;
    ℋ ← (ℋ \ {H′}) ∪ ℋ′;

Theorem 3.
For a given edge-weighted graph G = (V, E, w), Algorithm 2 outputs a Mader subgraph of G in O(|V|^{19/4}) time.

Proof. It is easy to see that if the algorithm terminates, its output is a Mader subgraph of G. Thus, in what follows, we analyze the time complexity of the algorithm.

Specifically, we prove that Algorithm 2 runs in O(|V|^{19/4}) time. The time complexity of the algorithm except for the while-loop is given by O(|E| + |V| log |V|) due to the time complexity of the procedure Peel. We can show that the time complexity of the while-loop is given by O(|V|^{19/4}). To see this, we analyze the time complexity of each iteration and the number of iterations. The time complexity of each iteration is dominated by that required to compute the minimum vertex separator C of H′. As reviewed in Section 2, the current best algorithm completes this task in O(|V(H′)|(|C|^2 · min{|V(H′)|^{3/4}, |C|^{3/2}} + |C||V(H′)|)) time. Hence, the time complexity of each iteration is bounded by O(|V|^{15/4}). Next we show that the number of iterations of the while-loop is bounded by |V|. Let G* be a Mader subgraph of G, that is, G* is a τ-vertex-connected subgraph of G wherein the minimum weighted degree of vertices is greater than d(V). It is easy to see that exactly one subgraph in ℋ contains G* as its subgraph in any iteration of the while-loop. Here we define the surplus of ℋ as

\[ s(\mathcal{H}) = \sum_{H \in \mathcal{H}} (|V(H)| - \tau - 1). \]

For the initial ℋ, we have s(ℋ) ≤ |V| − τ − 1. Note that s(ℋ) ≥ 0 always holds. For a given iteration, let 𝒮′ = {S ∈ 𝒮 : |V(Peel(G[S ∪ C], d(V)))| ≥ τ + 1}. If |𝒮′| ≤ 1, then H′ is simply deleted or replaced by a subgraph with at most |V(H′)| − 1 vertices in ℋ. Thus, the surplus decreases by at least one in the iteration. Assume that |𝒮′| ≥ 2. Then we have

\[ \sum_{H \in \mathcal{H}'} (|V(H)| - \tau - 1) = \sum_{S \in \mathcal{S}'} (|V(\mathrm{Peel}(G[S \cup C], d(V)))| - \tau - 1) \le \sum_{S \in \mathcal{S}'} (|V(G[S \cup C])| - \tau - 1) \le |V(H')| + (|\mathcal{S}'| - 1)(|C| - \tau) - \tau - |\mathcal{S}'| < |V(H')| - \tau - 2, \]

where the last inequality follows from |𝒮′| ≥ 2 and |C| < τ. Note that |C| < τ holds because the algorithm has not yet terminated in the iteration. The above inequality implies that the surplus decreases by at least two in the iteration. Therefore, the number of iterations of the while-loop is bounded by |V| − τ < |V|.

In this section, we first design a polynomial-time (γ/4 · w_min/w_max, 1/γ)-bicriteria approximation algorithm with parameter γ ∈ [1, 2] for Problem 1, and then present a corresponding result for Problem 2.
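The role of the parameter γ can be illustrated numerically. The sketch below assumes that the connectivity guarantee of the weighted Mader theorem (Theorem 2) has the form τ = ⌊⌈d/w_max⌉/2⌋ + 1, and that the bicriteria algorithm switches to a Mader subgraph exactly when k ≤ γ · τ; the function names `mader_threshold` and `takes_mader_branch` are hypothetical and introduced only for this example.

```python
import math


def mader_threshold(d, w_max):
    """Assumed connectivity guarantee tau = floor(ceil(d / w_max) / 2) + 1."""
    return math.ceil(d / w_max) // 2 + 1


def takes_mader_branch(k, gamma, d_ds, w_max):
    """True if k <= gamma * tau, i.e. a Mader subgraph of the densest
    subgraph is already (k / gamma)-vertex-connected under the assumption."""
    return k <= gamma * mader_threshold(d_ds, w_max)


# Hypothetical numbers: a densest subgraph of density 10 with unit weights.
tau = mader_threshold(10, 1)
strict = takes_mader_branch(k=7, gamma=1, d_ds=10, w_max=1)
relaxed = takes_mader_branch(k=7, gamma=2, d_ds=10, w_max=1)
```

Raising γ from 1 to 2 makes the test easier to pass, at the cost of relaxing the connectivity requirement from k to k/γ. Floating-point division may misround `d / w_max` near integer boundaries, so exact rational weights would be preferable in a careful implementation.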
For a given edge-weighted graph G = ( V, E, w ), our algorithm first finds the family of maxi-mal k -vertex-connected subgraphs { G [ S ] , . . . , G [ S p ] } using Makino’s algorithm [36] combined withGabow’s vertex connectivity algorithm [16], which takes O ( | V | ( k · min {| V | / , k / } + k | V | )) time.Note that if there is no k -vertex-connected subgraph found, our algorithm returns INFEASIBLE be-cause the instance is actually infeasible.For each i = 1 , . . . , p , the algorithm initializes S ∗ i as S i . Then the algorithm finds a dens-est subgraph S DS i (without any constraint) in G [ S i ]. This can be done in polynomial time usingCharikar’s linear-programming-based algorithm for the densest subgraph problem [8]. After that,if k ≤ γ (cid:16)(cid:106) (cid:100) d ( S DS i ) /w max (cid:101) (cid:107) + 1 (cid:17) holds, then the algorithm employs as S ∗ i the vertex set of a Madersubgraph of G [ S DS i ], i.e., the vertex set of a (cid:16)(cid:106) (cid:100) d ( S DS i ) /w max ( G [ S DS i ]) (cid:101) (cid:107) + 1 (cid:17) -vertex-connected sub-graph in G [ S DS i ] wherein the minimum weighted degree of vertices is greater than d ( S DS i ), usingthe procedure Mader subgraph (Algorithm 2). Here w max ( G [ S DS i ]) denotes the maximum weightof edges in G [ S DS i ]. Note that w max ( G [ S DS i ]) ≤ w max holds. For G [ S DS i ], Mader subgraph runs in O ( | S DS i | / ) = O ( | V | / ) time.Finally, the algorithm outputs the densest subset among { S ∗ , . . . , S ∗ p } . For reference, wesummarize the entire procedure in Algorithm 3. As the maximum total number of maximal k -vertex-connected subgraphs is O ( | V | ) [37], the overall running time of Algorithm 3 is given by O ( | V | ( | V | / + T DS )), where T DS is the computation time required to find a densest subgraph in(any subgraph of) G . Note that as mentioned above, T DS is polynomial in | V | and | E | . 
Moreover, for unweighted graphs, Goldberg's flow-based algorithm [21] gives $T_{\mathrm{DS}} = O(|E||V|)$, using Orlin's maximum-flow algorithm [44].

Algorithm 3: Bicriteria approximation algorithm with parameter $\gamma \in [1, 2]$ for Problem 1
Input: $G = (V, E, w)$ and $k \in \mathbb{Z}_{>0}$
Output: $S \subseteq V$ or INFEASIBLE
Find the family of maximal $k$-vertex-connected subgraphs $\{G[S_1], \dots, G[S_p]\}$;
if there is no $k$-vertex-connected subgraph found then
    return INFEASIBLE;
else
    for $i = 1, \dots, p$ do
        $S_i^* \leftarrow S_i$;
        Find a densest subgraph $S_i^{\mathrm{DS}}$ (without any constraint) in $G[S_i]$;
        if $k \le \gamma \left(\lfloor \lceil d(S_i^{\mathrm{DS}})/w_{\max} \rceil / 2 \rfloor + 1\right)$ then
            $S_i^* \leftarrow$ the vertex set of Mader subgraph($G[S_i^{\mathrm{DS}}]$);
    return $S \in \operatorname{argmax}_{S \in \{S_1^*, \dots, S_p^*\}} d(S)$;

Using our generalized Mader's theorem (Theorem 2), we provide the bicriteria approximation ratio of Algorithm 3:
Theorem 4. For any $\gamma \in [1, 2]$, Algorithm 3 is a polynomial-time $\left(\frac{\gamma}{4} \cdot \frac{w_{\min}}{w_{\max}}, 1/\gamma\right)$-bicriteria approximation algorithm for Problem 1.

Proof. We first show that the output of Algorithm 3 is $(k/\gamma)$-vertex-connected. To this end, it suffices to confirm $(k/\gamma)$-vertex-connectivity of $G[S_i^*]$ for each $i = 1, \dots, p$. Fix $i \in \{1, \dots, p\}$. If $k \le \gamma \left(\lfloor \lceil d(S_i^{\mathrm{DS}})/w_{\max} \rceil / 2 \rfloor + 1\right)$ does not hold, we are done since $G[S_i^*]$ is given by $G[S_i]$, which is $k$-vertex-connected (thus $(k/\gamma)$-vertex-connected). Consider the case where $k \le \gamma \left(\lfloor \lceil d(S_i^{\mathrm{DS}})/w_{\max} \rceil / 2 \rfloor + 1\right)$ holds. Applying Theorem 2 to $G[S_i^{\mathrm{DS}}]$ with $d = d(S_i^{\mathrm{DS}})$, we see that $G[S_i^{\mathrm{DS}}]$ has a $\left(\lfloor \lceil d(S_i^{\mathrm{DS}})/w_{\max}(G[S_i^{\mathrm{DS}}]) \rceil / 2 \rfloor + 1\right)$-vertex-connected subgraph, which is $(k/\gamma)$-vertex-connected because $w_{\max}(G[S_i^{\mathrm{DS}}]) \le w_{\max}$. Algorithm 3 employs such a subset as $S_i^*$.

We next analyze the first term of the bicriteria approximation ratio. It suffices to show that for each $i = 1, \dots, p$, the subset $S_i^*$ has density at least $\frac{\gamma}{4} \cdot \frac{w_{\min}}{w_{\max}}$ times the optimal value of Problem 1 on $G[S_i]$. Fix $i \in \{1, \dots, p\}$. Clearly, the optimal value of Problem 1 on $G[S_i]$, which we denote by $\mathrm{OPT}_i$, is at most $d(S_i^{\mathrm{DS}})$.

We first consider the case where $k \le \gamma \left(\lfloor \lceil d(S_i^{\mathrm{DS}})/w_{\max} \rceil / 2 \rfloor + 1\right)$ does not hold. In this case, Algorithm 3 just employs $S_i$ as $S_i^*$. As $G[S_i]$ is $k$-vertex-connected, each vertex has weighted degree of at least $w_{\min} \cdot k > \gamma \cdot w_{\min} \left(\lfloor \lceil d(S_i^{\mathrm{DS}})/w_{\max} \rceil / 2 \rfloor + 1\right)$; thus, the density of $S_i$ is greater than

$$\gamma \cdot w_{\min} \left(\left\lfloor \frac{\lceil d(S_i^{\mathrm{DS}})/w_{\max} \rceil}{2} \right\rfloor + 1\right) \Big/ 2 \;\ge\; \gamma \cdot w_{\min} \left(\frac{d(S_i^{\mathrm{DS}})/w_{\max} - 1}{2} + 1\right) \Big/ 2 \;>\; \frac{\gamma}{4} \cdot \frac{w_{\min}}{w_{\max}} \cdot d(S_i^{\mathrm{DS}}) \;\ge\; \frac{\gamma}{4} \cdot \frac{w_{\min}}{w_{\max}} \cdot \mathrm{OPT}_i,$$

which means $\frac{\gamma}{4} \cdot \frac{w_{\min}}{w_{\max}}$-approximation.

We next consider the case where $k \le \gamma \left(\lfloor \lceil d(S_i^{\mathrm{DS}})/w_{\max} \rceil / 2 \rfloor + 1\right)$ holds. Applying Theorem 2 to $G[S_i^{\mathrm{DS}}]$ with $d = d(S_i^{\mathrm{DS}})$, we see that $G[S_i^{\mathrm{DS}}]$ has a $\left(\lfloor \lceil d(S_i^{\mathrm{DS}})/w_{\max}(G[S_i^{\mathrm{DS}}]) \rceil / 2 \rfloor + 1\right)$-vertex-connected subgraph wherein the minimum weighted degree of vertices is greater than $d(S_i^{\mathrm{DS}})$. Algorithm 3 employs such a subset as $S_i^*$. As each vertex has weighted degree greater than $d(S_i^{\mathrm{DS}})$, the density of $S_i^*$ is greater than $d(S_i^{\mathrm{DS}})/2 \ge \mathrm{OPT}_i/2$, which means $1/2$-approximation (thus $\frac{\gamma}{4} \cdot \frac{w_{\min}}{w_{\max}}$-approximation).

From the proof, we see that if the if-condition of Algorithm 3 holds, the output admits $1/2$-approximation. By setting $\gamma = 1$ in the theorem, we can obtain an ordinary $\frac{1}{4} \cdot \frac{w_{\min}}{w_{\max}}$-approximation algorithm for Problem 1. In Section 5, we present an algorithm with a better approximation ratio.

Here we present a bicriteria approximation algorithm for Problem 2, which is an edge-connectivity counterpart of Algorithm 3. For a given edge-weighted graph $G = (V, E, w)$, our algorithm first finds the family of maximal $k$-edge-connected subgraphs $\{G[S_1], \dots, G[S_p]\}$. As reviewed in Section 2, this can be done in $O(|V|^2 (|E| + |V| \log |V|))$ time using one of the minimum cut algorithms by Nagamochi and Ibaraki [42], Stoer and Wagner [46], and Frank [15] as a subroutine. If $G$ is simple unweighted, the time complexity reduces to $O(|E||V| \log^2 |V| (\log\log |V|)^2)$ using the minimum cut algorithm by Henzinger et al. [23].

In the processing of $G[S_i]$ for each $i = 1, \dots, p$, the algorithm computes a variant of a Mader subgraph of $G[S_i^{\mathrm{DS}}]$, i.e., a $w_{\min} \left(\lfloor \lceil d(S_i^{\mathrm{DS}})/w_{\max} \rceil / 2 \rfloor + 1\right)$-edge-connected subgraph in $G[S_i^{\mathrm{DS}}]$ wherein the minimum weighted degree of vertices is greater than $d(S_i^{\mathrm{DS}})$. The existence of such a subgraph is guaranteed by a corollary of Theorem 2, which we will present later. Recall that Algorithm 3 uses the procedure Mader subgraph. On the other hand, the above variant can be computed using the strategy employed by the algorithms for computing the family of maximal $k$-edge-connected subgraphs, presented in Section 2.
Specifically, the strategy in our scenario is as follows: if the weight of the minimum cut of $G[S_i^{\mathrm{DS}}]$ is less than $w_{\min} \left(\lfloor \lceil d(S_i^{\mathrm{DS}})/w_{\max} \rceil / 2 \rfloor + 1\right)$, divide the graph into two subgraphs along the cut and then repeat the procedure on the resulting subgraphs (until it finds the variant of a Mader subgraph). It should be noted that in order to satisfy the minimum weighted degree condition, our algorithm needs to conduct the procedure Peel every time before it processes a new subgraph. For reference, the pseudocode of our algorithm is given in Algorithm 4.

Algorithm 4: Bicriteria approximation algorithm with parameter $\gamma \in [1, 2]$ for Problem 2
Input: $G = (V, E, w)$ and $k \in \mathbb{R}_{>0}$
Output: $S \subseteq V$ or INFEASIBLE
Find the family of maximal $k$-edge-connected subgraphs $\{G[S_1], \dots, G[S_p]\}$;
if there is no $k$-edge-connected subgraph found then
    return INFEASIBLE;
else
    for $i = 1, \dots, p$ do
        $S_i^* \leftarrow S_i$;
        Find a densest subgraph $S_i^{\mathrm{DS}}$ (without any constraint) in $G[S_i]$;
        if $k \le \gamma \cdot w_{\min} \left(\lfloor \lceil d(S_i^{\mathrm{DS}})/w_{\max} \rceil / 2 \rfloor + 1\right)$ then
            $S_i^* \leftarrow$ the vertex set of a $w_{\min} \left(\lfloor \lceil d(S_i^{\mathrm{DS}})/w_{\max} \rceil / 2 \rfloor + 1\right)$-edge-connected subgraph in $G[S_i^{\mathrm{DS}}]$ wherein the minimum weighted degree of vertices is greater than $d(S_i^{\mathrm{DS}})$;
    return $S \in \operatorname{argmax}_{S \in \{S_1^*, \dots, S_p^*\}} d(S)$;

Here we evaluate the running time of Algorithm 4. It is easy to see that the above algorithm for finding the variant of a Mader subgraph still has the same running time as that of algorithms for computing the family of maximal $k$-edge-connected subgraphs. Therefore, the time complexity of the processing of each $G[S_i]$ is bounded by $O(T_{\mathrm{DS}}(S_i) + |S_i|^2 (|E(S_i)| + |S_i| \log |S_i|))$, where $T_{\mathrm{DS}}(S_i)$ is the computation time required to find a densest subgraph in $G[S_i]$. Recalling that maximal $k$-edge-connected subgraphs do not overlap for any $k$, we see that the time complexity of the entire for-loop is bounded by $O(T_{\mathrm{DS}}(G) + |V|^2 (|E| + |V| \log |V|))$, which also bounds the overall running time of Algorithm 4. For simple unweighted graphs, we have the running time of O(|V| + |E||V| log |V| log log |V|).

Finally we analyze the theoretical performance guarantee of Algorithm 4. It is easy to see that any (edge-weighted) $k$-vertex-connected graph $G$ is $w_{\min} k$-edge-connected, which gives the following corollary to Theorem 2:

Corollary 1.
Let $G = (V, E, w)$ be an edge-weighted graph and let $d$ be a positive real. If $G$ has density at least $d$, then $G$ has a $w_{\min} \left(\lfloor \lceil d/w_{\max} \rceil / 2 \rfloor + 1\right)$-edge-connected subgraph wherein the minimum weighted degree of vertices is greater than $d$.

Using this corollary, we can derive the bicriteria approximation ratio of Algorithm 4:

Theorem 5. For any $\gamma \in [1, 2]$, Algorithm 4 is a polynomial-time $\left(\frac{\gamma}{4} \cdot \frac{w_{\min}}{w_{\max}}, 1/\gamma\right)$-bicriteria approximation algorithm for Problem 2.

The proof is similar to that of Theorem 4, and is omitted.
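The if-condition shared by Algorithms 3 and 4 can be illustrated with a short sketch. It assumes the connectivity guarantee has the form $\lfloor \lceil d/w_{\max} \rceil / 2 \rfloor + 1$ used in the proof of Theorem 4; the function names and the optional `w_min` argument are illustrative, not part of the paper's pseudocode.

```python
import math

def mader_connectivity(d, w_max):
    """Connectivity floor(ceil(d / w_max) / 2) + 1 guaranteed for density d."""
    return math.ceil(d / w_max) // 2 + 1

def takes_mader_branch(k, gamma, d_ds, w_max, w_min=None):
    """Evaluate the if-condition for a densest subgraph of density d_ds.

    With w_min=None this is the vertex-connectivity test of Algorithm 3;
    passing w_min scales the threshold as in the edge-connectivity test
    of Algorithm 4.
    """
    scale = 1 if w_min is None else w_min
    return k <= gamma * scale * mader_connectivity(d_ds, w_max)

# Unit-weight K5 has density 10 / 5 = 2, so the guaranteed connectivity is 2.
threshold = mader_connectivity(2.0, 1)
```

With `gamma = 1` the branch is taken only for `k <= 2` on this example, whereas `gamma = 2` admits `k` up to 4 at the price of a weaker connectivity guarantee of `k / gamma`.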
Here we explain that our generalized Mader's theorem (i.e., Theorem 2) is essential to derive the bicriteria approximation ratio given in Theorems 4 and 5. To this end, recall that the straightforward application of the original Mader's theorem to edge-weighted graphs derives the following statement: Let $G = (V, E, w)$ be an edge-weighted graph and let $d$ be a positive real. If $G$ has density at least $d$, then $G$ has a $\left(\lfloor \lfloor d/w_{\max} \rfloor / 2 \rfloor + 1\right)$-vertex-connected subgraph wherein the minimum weighted degree of vertices is greater than $w_{\min} \lfloor d/w_{\max} \rfloor$.

Obviously, the above statement is weaker than Theorem 2. Indeed, vertex connectivity of $\lfloor \lceil d/w_{\max} \rceil / 2 \rfloor + 1$ in Theorem 2 has decreased to $\lfloor \lfloor d/w_{\max} \rfloor / 2 \rfloor + 1$, which is only a slight deterioration, but the minimum weighted degree of $d$ in Theorem 2 has significantly decreased to $w_{\min} \lfloor d/w_{\max} \rfloor$. It is easy to see that to prove Theorems 4 and 5, vertex connectivity of $\lfloor \lfloor d/w_{\max} \rfloor / 2 \rfloor + 1$ is sufficient, but the minimum weighted degree of $w_{\min} \lfloor d/w_{\max} \rfloor$ is insufficient. In fact, in the last paragraph of the proof of Theorem 4, by using the decreased minimum weighted degree, we can only guarantee that the density of $S_i^*$ is greater than $w_{\min} \lfloor d(S_i^{\mathrm{DS}})/w_{\max} \rfloor / 2 \ge w_{\min} \lfloor \mathrm{OPT}_i/w_{\max} \rfloor / 2$ (rather than $d(S_i^{\mathrm{DS}})/2 \ge \mathrm{OPT}_i/2$). However, $w_{\min} \lfloor \mathrm{OPT}_i/w_{\max} \rfloor / 2$ may be less than $\frac{\gamma}{4} \cdot \frac{w_{\min}}{w_{\max}} \cdot \mathrm{OPT}_i$, meaning that the decreased minimum weighted degree is insufficient to prove the theorem. We can see the same issue in the proof of Theorem 5.

In this section, we design a polynomial-time $\left(\frac{6}{19} \cdot \frac{w_{\min}}{w_{\max}}\right)$-approximation algorithm for Problem 1, which improves the approximation ratio of $\frac{1}{4} \cdot \frac{w_{\min}}{w_{\max}}$ that is immediately derived by Algorithm 3. Then we present its counterpart result for Problem 2.

Algorithm 5: Approximation algorithm for Problem 1
Input: $G = (V, E, w)$ and $k \in \mathbb{Z}_{>0}$
Output: $S \subseteq V$ or INFEASIBLE
$H \leftarrow \operatorname{argmax}\{\kappa(H) \mid H \text{ is a subgraph of } G\}$;
if $\kappa(H) \ge k$ then
    return the vertex set of $H$;
else
    return INFEASIBLE;

Our algorithm first computes the most highly connected subgraph in terms of vertex connectivity, i.e., $H \in \operatorname{argmax}\{\kappa(H) \mid H \text{ is a subgraph of } G\}$. This can be done using Matula's algorithm [38, Algorithm A].
Then our algorithm simply returns the subgraph if its vertex connectivity is no less than $k$ and INFEASIBLE otherwise. Our algorithm is described in pseudocode as Algorithm 5.

Matula [38] showed that the time complexity of the algorithm for computing the most highly connected subgraph in terms of vertex connectivity is given by $O(|V| \cdot T)$, where $T$ is the computation time required to find a minimum vertex separator of $G$. If we consider Gabow's vertex connectivity algorithm [16], the time complexity becomes $O(|V|^2 (\kappa(G)^2 \cdot \min\{|V|^{3/4}, \kappa(G)^{3/2}\} + \kappa(G)|V|))$. Clearly, Algorithm 5 has the same time complexity.

5.2 Analysis

From now on, we analyze the theoretical performance guarantee of Algorithm 5. To this end, we use the following theorem, which is a useful variant of Mader's theorem:
Theorem 6 (Bernshteyn and Kostochka [5]). Let $G = (V, E)$ be an unweighted graph and let $t$ be an integer with $t \ge 2$. If $G$ satisfies $|V| \ge \frac{5}{2} t$ and $|E| > \frac{19}{12} t (|V| - t)$, then $G$ has a $(t + 1)$-vertex-connected subgraph.

We provide the approximation ratio of Algorithm 5 in the following theorem:
Theorem 7. Algorithm 5 is a polynomial-time $\left(\frac{6}{19} \cdot \frac{w_{\min}}{w_{\max}}\right)$-approximation algorithm for Problem 1.

Proof. Let $S \subseteq V$ be the output of Algorithm 5. Define $\kappa_{\max} = \max\{\kappa(H) \mid H \text{ is a subgraph of } G\}$. As we assumed that $|E| \ge 1$, we have $\kappa_{\max} \ge 1$. Recall that $H = G[S]$ is a $\kappa_{\max}$-vertex-connected subgraph. We denote by OPT the density of an optimal solution to Problem 1. Let $S^{\mathrm{DS}} \subseteq V$ be a densest subgraph (unconstrained) in $G$. As $d(S^{\mathrm{DS}}) \ge \mathrm{OPT}$, it suffices to show that $d(S) \ge \frac{6}{19} \cdot \frac{w_{\min}}{w_{\max}} \cdot d(S^{\mathrm{DS}})$ holds. Let $n_{\mathrm{DS}}$ and $m_{\mathrm{DS}}$ denote the number of vertices and edges in $G[S^{\mathrm{DS}}]$, respectively.

Case I: $\kappa_{\max} = 1$. In this case, $G$ is a forest; therefore, using the fact that $m_{\mathrm{DS}} \le n_{\mathrm{DS}} - 1$, we have $d(S^{\mathrm{DS}}) = \frac{w(S^{\mathrm{DS}})}{n_{\mathrm{DS}}} \le \frac{w_{\max} \cdot m_{\mathrm{DS}}}{n_{\mathrm{DS}}} < w_{\max}$. Any vertex subset (with size more than one) inducing a connected subgraph, including the output $S$, has density of at least $\frac{w_{\min}}{2} > \frac{6}{19} w_{\min} > \frac{6}{19} \cdot \frac{w_{\min}}{w_{\max}} \cdot d(S^{\mathrm{DS}})$.

Case II: $\kappa_{\max} \ge 2$. Let us define $t = \left\lfloor \frac{12}{19} \cdot \frac{m_{\mathrm{DS}}}{n_{\mathrm{DS}}} \right\rfloor$. As $m_{\mathrm{DS}} \le \binom{n_{\mathrm{DS}}}{2}$ holds, we have $t < \frac{6}{19} n_{\mathrm{DS}}$, and thus $n_{\mathrm{DS}} > \frac{5}{2} t$. As for the value of $m_{\mathrm{DS}}$, if $t \ne 0$, $m_{\mathrm{DS}} \ge \frac{19}{12} t\, n_{\mathrm{DS}} > \frac{19}{12} t (n_{\mathrm{DS}} - t)$ holds. Thus, by Theorem 6, if $t \ge 2$, $G[S^{\mathrm{DS}}]$ has a $(t + 1)$-vertex-connected subgraph, which is also a subgraph of $G$. Hence, we have $\kappa_{\max} \ge t + 1 \ge \frac{12}{19} \cdot \frac{m_{\mathrm{DS}}}{n_{\mathrm{DS}}}$. On the other hand, if $t < 2$, $\kappa_{\max} \ge 2 > \frac{12}{19} \cdot \frac{m_{\mathrm{DS}}}{n_{\mathrm{DS}}}$. In either case, noticing that the output $S$ is $w_{\min} \kappa_{\max}$-edge-connected, we see that $S$ has density at least

$$\frac{w_{\min} \cdot \kappa_{\max}}{2} \ge w_{\min} \cdot \frac{6}{19} \cdot \frac{m_{\mathrm{DS}}}{n_{\mathrm{DS}}} \ge \frac{6}{19} \cdot \frac{w_{\min}}{w_{\max}} \cdot \frac{w(S^{\mathrm{DS}})}{n_{\mathrm{DS}}} \ge \frac{6}{19} \cdot \frac{w_{\min}}{w_{\max}} \cdot d(S^{\mathrm{DS}}),$$

which completes the proof.

Here we present an approximation algorithm for Problem 2, which is an edge-connectivity counterpart of Algorithm 5. Specifically, our algorithm first computes the most highly connected subgraph in terms of edge connectivity, i.e., $H \in \operatorname{argmax}\{\lambda(H) \mid H \text{ is a subgraph of } G\}$. This can be done using a simple recursive algorithm mentioned by Matula [38], which is similar to the algorithms for computing the family of maximal $k$-edge-connected subgraphs. Then our algorithm simply returns the subgraph if its edge connectivity is no less than $k$ and INFEASIBLE otherwise. For reference, we describe the entire procedure in Algorithm 6.

Algorithm 6: Approximation algorithm for Problem 2
Input: $G = (V, E, w)$ and $k \in \mathbb{R}_{>0}$
Output: $S \subseteq V$ or INFEASIBLE
$H \leftarrow \operatorname{argmax}\{\lambda(H) \mid H \text{ is a subgraph of } G\}$;
if $\lambda(H) \ge k$ then
    return the vertex set of $H$;
else
    return INFEASIBLE;

Matula [38] stated that the time complexity of the algorithm for computing the most highly connected subgraph in terms of edge connectivity is given by $O(|V| \cdot T)$, where $T$ is the computation time required to find a minimum cut of $G$. If we consider one of the minimum cut algorithms by Nagamochi and Ibaraki [42], Stoer and Wagner [46], and Frank [15], the time complexity becomes $O(|V|^2 (|E| + |V| \log |V|))$. If $G$ is simple unweighted, the time complexity reduces to $O(|E||V| \log^2 |V| (\log\log |V|)^2)$ using the minimum cut algorithm by Henzinger et al. [23]. Clearly, Algorithm 6 has the same time complexity.

Finally we analyze the theoretical performance guarantee of Algorithm 6. The following corollary is an edge-connectivity counterpart of Theorem 6:

Corollary 2.
Let $G = (V, E, w)$ be an edge-weighted graph and let $t$ be an integer with $t \ge 2$. If $G$ satisfies $|V| \ge \frac{5}{2} t$ and $|E| > \frac{19}{12} t (|V| - t)$, then $G$ has a $w_{\min}(t + 1)$-edge-connected subgraph.

Using this corollary, we can derive the approximation ratio of Algorithm 6:
Theorem 8. Algorithm 6 is a polynomial-time $\left(\frac{6}{19} \cdot \frac{w_{\min}}{w_{\max}}\right)$-approximation algorithm for Problem 2.

The proof is similar to that of Theorem 7, and is omitted.
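As an illustration of the cut-and-recurse idea behind Algorithm 6, the following sketch finds a vertex set inducing a subgraph of maximum weighted edge connectivity. It is one possible reading of the recursive scheme, not the authors' implementation: it relies on the observation that a subgraph more connected than the whole graph must lie on one side of any minimum cut, and it uses a plain quadratic-per-phase version of the Stoer and Wagner minimum cut algorithm [46]. All function names and the dictionary-of-dictionaries graph representation are assumptions.

```python
def min_cut(adj):
    """Stoer-Wagner global minimum cut; returns (weight, one side of the cut).

    adj[u][v] is the weight of edge {u, v}; adj is symmetric, >= 2 vertices.
    """
    g = {v: dict(nb) for v, nb in adj.items()}
    groups = {v: {v} for v in g}          # original vertices merged into v
    best_w, best_side = float("inf"), None
    while len(g) > 1:
        start = next(iter(g))
        added = [start]
        weights = {v: g[start].get(v, 0) for v in g if v != start}
        while weights:                     # one minimum-cut phase
            nxt = max(weights, key=weights.get)
            cut_of_phase = weights.pop(nxt)
            added.append(nxt)
            for u, wt in g[nxt].items():
                if u in weights:
                    weights[u] += wt
        s, t = added[-2], added[-1]
        if cut_of_phase < best_w:
            best_w, best_side = cut_of_phase, set(groups[t])
        for u, wt in g[t].items():         # merge t into s
            if u != s:
                g[s][u] = g[s].get(u, 0) + wt
                g[u][s] = g[u].get(s, 0) + wt
            del g[u][t]
        del g[t]
        groups[s] |= groups.pop(t)
    return best_w, best_side

def induced(adj, keep):
    """Adjacency of the subgraph induced by the vertex set `keep`."""
    return {v: {u: wt for u, wt in adj[v].items() if u in keep} for v in keep}

def most_edge_connected(adj):
    """Return (lambda, S) with G[S] of maximum edge connectivity."""
    if len(adj) < 2:
        return 0, set(adj)
    lam, side = min_cut(adj)
    best = (lam, set(adj))
    for part in (side, set(adj) - side):   # recurse on both sides of the cut
        cand = most_edge_connected(induced(adj, part))
        if cand[0] > best[0]:
            best = cand
    return best

def algorithm6_sketch(adj, k):
    """Return the vertex set if its edge connectivity is at least k, else None."""
    lam, S = most_edge_connected(adj)
    return S if lam >= k else None         # None stands for INFEASIBLE
```

For a unit-weight $K_4$ with a pendant path attached, the sketch discards the path through weight-1 cuts and returns the clique, whose edge connectivity is 3. No attempt is made here to match the running times quoted above.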
There are several directions for future research. The most interesting one is to design a polynomial-time algorithm that has a better (bicriteria or ordinary) approximation ratio. We wish to remark that assuming Mader's conjecture [35], which is a stronger version of Theorem 6, we can improve the approximation ratio of Algorithms 5 and 6, i.e., $\frac{6}{19} \cdot \frac{w_{\min}}{w_{\max}}$, to $\frac{1}{3} \cdot \frac{w_{\min}}{w_{\max}}$. However, Mader [35] also conjectured that the statement is best possible, making it unlikely to obtain an approximation ratio better than $\frac{1}{3} \cdot \frac{w_{\min}}{w_{\max}}$ via similar analysis. Another interesting direction is to investigate the computational complexity of Problems 1 and 2.

Acknowledgments
F.B., D.G-S., and C.T. acknowledge support from Intesa Sanpaolo Innovation Center, who had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. A.M. was supported by Grant-in-Aid for Research Activity Start-up (No. 17H07357) and Grant-in-Aid for Early-Career Scientists (No. 19K20218). This work was partially done while A.M. was at RIKEN AIP, Japan, and visited ISI Foundation, Italy.
References

[1] R. Andersen and K. Chellapilla. Finding dense subgraphs with size bounds. In WAW '09: Proceedings of the 6th Workshop on Algorithms and Models for the Web Graph, pages 25–37, 2009.
[2] A. Angel, N. Sarkas, N. Koudas, and D. Srivastava. Dense subgraph maintenance under streaming edge weight updates for real-time story identification. In VLDB '12: Proceedings of the 38th International Conference on Very Large Data Bases, pages 574–585, 2012.
[3] G. D. Bader and C. W. V. Hogue. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics, 4(1):1–27, 2003.
[4] B. Bahmani, R. Kumar, and S. Vassilvitskii. Densest subgraph in streaming and mapreduce. In VLDB '12: Proceedings of the 38th International Conference on Very Large Data Bases, pages 454–465, 2012.
[5] A. Bernshteyn and A. Kostochka. On the number of edges in a graph with no (k+1)-connected subgraphs. Discrete Mathematics, 339(2):682–688, 2016.
[6] A. Bhaskara, M. Charikar, E. Chlamtac, U. Feige, and A. Vijayaraghavan. Detecting high log-densities: An $O(n^{1/4})$ approximation for densest k-subgraph. In STOC '10: Proceedings of the 42nd ACM Symposium on Theory of Computing, pages 201–210, 2010.
[7] S. Bhattacharya, M. Henzinger, D. Nanongkai, and C. E. Tsourakakis. Space- and time-efficient algorithm for maintaining dense subgraphs on one-pass dynamic streams. In STOC '15: Proceedings of the 47th ACM Symposium on Theory of Computing, pages 173–182, 2015.
[8] M. Charikar. Greedy approximation algorithms for finding dense components in a graph. In APPROX '00: Proceedings of the 3rd International Workshop on Approximation Algorithms for Combinatorial Optimization, pages 84–95, 2000.
[9] S. Chechik, T. D. Hansen, G. F. Italiano, V. Loitzenbauer, and N. Parotsidis. Faster algorithms for computing maximal 2-connected subgraphs in sparse directed graphs. In SODA '17: Proceedings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1900–1918, 2017.
[10] R. Diestel. Graph Theory, volume 173 of Graduate Texts in Mathematics. Springer-Verlag Berlin Heidelberg, 5th edition, 2016.
[11] Y. Dourisboure, F. Geraci, and M. Pellegrini. Extraction and classification of dense communities in the web. In WWW '07: Proceedings of the 16th International Conference on World Wide Web, pages 461–470, 2007.
[12] A. Epasto, S. Lattanzi, and M. Sozio. Efficient densest subgraph computation in evolving graphs. In WWW '15: Proceedings of the 24th International Conference on World Wide Web, pages 300–310, 2015.
[13] U. Feige, D. Peleg, and G. Kortsarz. The dense k-subgraph problem. Algorithmica, 29(3):410–421, 2001.
[14] S. Forster, D. Nanongkai, L. Yang, T. Saranurak, and S. Yingchareonthawornchai. Computing and testing small connectivity in near-linear time and queries via fast local cut algorithms. In SODA '20: Proceedings of the 31st Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2046–2065, 2020.
[15] A. Frank. On the edge-connectivity algorithm of Nagamochi and Ibaraki. Laboratoire Artemis, IMAG, Universite J. Fourier, 1994.
[16] H. N. Gabow. Using expander graphs to find vertex connectivity. Journal of the ACM, 53(5):800–844, 2006.
[17] E. Galimberti, F. Bonchi, and F. Gullo. Core decomposition and densest subgraph in multilayer networks. In CIKM '17: Proceedings of the 26th ACM International Conference on Information and Knowledge Management, pages 1807–1816, 2017.
[18] M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., NY, 1979.
[19] D. Gibson, R. Kumar, and A. Tomkins. Discovering large dense subgraphs in massive graphs. In VLDB '05: Proceedings of the 31st International Conference on Very Large Data Bases, pages 721–732, 2005.
[20] A. Gionis and C. E. Tsourakakis. Dense subgraph discovery: KDD 2015 Tutorial. In KDD '15: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 2313–2314, 2015.
[21] A. V. Goldberg. Finding a maximum density subgraph. University of California Berkeley, 1984.
[22] M. Henzinger, S. Krinninger, and V. Loitzenbauer. Finding 2-edge and 2-vertex strongly connected components in quadratic time. In ICALP '15: Proceedings of the 42nd International Colloquium on Automata, Languages and Programming, pages 713–724, 2015.
[23] M. Henzinger, S. Rao, and D. Wang. Local flow partitioning for faster edge connectivity. SIAM Journal on Computing, 49(1):1–36, 2020.
[24] M. R. Henzinger, S. Rao, and H. N. Gabow. Computing vertex connectivity: New bounds from old techniques. Journal of Algorithms, 34(2):222–250, 2000.
[25] J. E. Hopcroft and R. E. Tarjan. Dividing a graph into triconnected components. SIAM Journal on Computing, 2(3):135–158, 1973.
[26] S. Hu, X. Wu, and T.-H. H. Chan. Maintaining densest subsets efficiently in evolving hypergraphs. In CIKM '17: Proceedings of the 26th ACM International Conference on Information and Knowledge Management, pages 929–938, 2017.
[27] D. R. Karger. Minimum cuts in near-linear time. Journal of the ACM, 47(1):46–76, 2000.
[28] K.-I. Kawarabayashi and M. Thorup. Deterministic edge connectivity in near-linear time. Journal of the ACM, 66(1):Article No. 4, 2018.
[29] Y. Kawase, Y. Kuroki, and A. Miyauchi. Graph mining meets crowdsourcing: Extracting experts for answer aggregation. In IJCAI '19: Proceedings of the 28th International Joint Conference on Artificial Intelligence, pages 1272–1279, 2019.
[30] Y. Kawase and A. Miyauchi. The densest subgraph problem with a convex/concave size function. Algorithmica, 80(12):3461–3480, 2018.
[31] S. Khuller and B. Saha. On finding dense subgraphs. In ICALP '09: Proceedings of the 36th International Colloquium on Automata, Languages and Programming, pages 597–608, 2009.
[32] J. Leskovec and A. Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, 2014.
[33] N. Linial, L. Lovász, and A. Wigderson. Rubber bands, convex embeddings and graph connectivity. Combinatorica, 8(1):91–102, 1988.
[34] W. Mader. Existenz n-fach zusammenhängender teilgraphen in graphen genügend großer kantendichte. Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg, 37(1):86–97, 1972.
[35] W. Mader. Connectivity and edge-connectivity in finite graphs. In B. Bollobas, editor, Surveys in Combinatorics (Proceedings of the Seventh British Combinatorial Conference), volume 38 of London Mathematical Society Lecture Note Series, pages 66–95, 1979.
[36] S. Makino. An algorithm for finding all the k-components of a digraph. International Journal of Computer Mathematics, 24(3-4):213–221, 1988.
[37] D. W. Matula. Graph theoretic techniques for cluster analysis algorithms. In J. V. Ryzin, editor, Classification and Clustering, pages 95–129. Academic Press, 1977.
[38] D. W. Matula. k-blocks and ultrablocks in graphs. Journal of Combinatorial Theory, Series B, 24(1):1–13, 1978.
[39] M. Mitzenmacher, J. Pachocki, R. Peng, C. E. Tsourakakis, and S. C. Xu. Scalable large near-clique detection in large-scale networks via sampling. In KDD '15: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 815–824, 2015.
[40] A. Miyauchi, Y. Iwamasa, T. Fukunaga, and N. Kakimura. Threshold influence model for allocating advertising budgets. In ICML '15: Proceedings of the 32nd International Conference on Machine Learning, pages 1395–1404, 2015.
[41] A. Miyauchi and A. Takeda. Robust densest subgraph discovery. In ICDM '18: Proceedings of the 18th IEEE International Conference on Data Mining, pages 1188–1193, 2018.
[42] H. Nagamochi and T. Ibaraki. Computing edge-connectivity in multigraphs and capacitated graphs. SIAM Journal on Discrete Mathematics, 5(1):54–66, 1992.
[43] D. Nanongkai, T. Saranurak, and S. Yingchareonthawornchai. Breaking quadratic time for small vertex connectivity and an approximation scheme. In STOC '19: Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 241–252, 2019.
[44] J. B. Orlin. Max flows in O(nm) time, or better. In STOC '13: Proceedings of the 45th Annual ACM Symposium on Theory of Computing, pages 765–774, 2013.
[45] V. Spirin and L. A. Mirny. Protein complexes and functional modules in molecular networks. Proceedings of the National Academy of Sciences of the United States of America, 100(21):12123–12128, 2003.
[46] M. Stoer and F. Wagner. A simple min-cut algorithm. Journal of the ACM, 44(4):585–591, 1997.
[47] R. Tarjan. Depth-first search and linear graph algorithms. SIAM Journal on Computing, 1(2):146–160, 1972.
[48] C. E. Tsourakakis. The k-clique densest subgraph problem. In WWW '15: Proceedings of the 24th International Conference on World Wide Web, pages 1122–1132, 2015.
[49] C. E. Tsourakakis. Streaming graph partitioning in the planted partition model. In COSN '15: Proceedings of the 2015 ACM Conference on Online Social Networks, pages 27–35, 2015.
[50] C. E. Tsourakakis, F. Bonchi, A. Gionis, F. Gullo, and M. Tsiarli. Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees. In KDD '13: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 104–112, 2013.
[51] C. E. Tsourakakis, T. Chen, N. Kakimura, and J. Pachocki. Novel dense subgraph discovery primitives: Risk aversion and exclusion queries. In ECML-PKDD '19: Proceedings of the 2019 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2019. No page numbers.
[52] Y. Wu, X. Zhu, L. Li, W. Fan, R. Jin, and X. Zhang. Mining dual networks: Models, algorithms, and applications. ACM Transactions on Knowledge Discovery from Data, 10(4):40:1–40:37, 2016.
[53] Z. Zou. Polynomial-time algorithm for finding densest subgraphs in uncertain graphs. In