Algorithms and Hardness Results for the Maximum Balanced Connected Subgraph Problem

Abstract

The Balanced Connected Subgraph problem (BCS) was recently introduced by Bhore et al. (CALDAM 2019). In this problem, we are given a graph $G$ whose vertices are colored red or blue. The goal is to find a maximum connected subgraph of $G$ having the same number of blue vertices and red vertices. They showed that this problem is NP-hard even on planar graphs, bipartite graphs, and chordal graphs. They also gave some positive results: BCS can be solved in $O(n^3)$ time for trees and $O(n+m)$ time for split graphs and properly colored bipartite graphs, where $n$ is the number of vertices and $m$ is the number of edges. In this paper, we show that BCS can be solved in $O(n^2)$ time for trees and $O(n^3)$ time for interval graphs. The former result can be extended to bounded treewidth graphs. We also consider a weighted version of BCS (WBCS). We prove that this variant is weakly NP-hard even on star graphs and strongly NP-hard even on split graphs and properly colored bipartite graphs, whereas the unweighted counterpart is tractable on those graph classes. Finally, we consider an exact exponential-time algorithm for general graphs. We show that BCS can be solved in $2^{n/2} n^{O(1)}$ time. This algorithm is based on a variant of the Dreyfus-Wagner algorithm for the Steiner tree problem.


Yasuaki Kobayashi, Kensuke Kojima, Norihide Matsubara, Taiga Sone, Akihiro Yamamoto

March 11, 2020

Fairness is one of the most important concepts in recent machine learning studies, and numerous studies on “fair solutions” have been carried out so far, such as the fair bandit problem [16], fair clustering [9], fair ranking [8], and fair regression [7]. This brings us to a new question: Is it easy to find “fair solutions” in classical combinatorial optimization problems? Chierichetti et al. [10] recently addressed a fair version of matroid constrained optimization problems and discussed polynomial time solvability, approximability, and hardness results for those problems.

In this paper, we study the problem of finding a “fair subgraph”.
Here, our goal is to find a maximum cardinality connected “fair subgraph” of a given bicolored graph. To be precise, we are given a graph $G = (B \cup R, E)$ in which the vertices in $B$ are colored blue and those in $R$ are colored red. We say that a subgraph is balanced if it contains an equal number of blue and red vertices. The objective of the problem is to find a balanced connected subgraph with the maximum number of vertices. This problem is called the balanced connected subgraph problem (BCS), recently introduced by Bhore et al. [3]. Although finding a maximum size connected subgraph is trivially solvable in linear time, they proved that BCS is NP-hard even on bipartite graphs, on chordal graphs, and on planar graphs. They also gave positive results on some graph classes: BCS is solvable in polynomial time on trees, on split graphs, and on properly colored bipartite graphs. In particular, they gave an $O(n^3)$ time algorithm for trees, where $n$ is the number of vertices of the input tree.

BCS can be seen as a special case of the graph motif problem in the following sense. We are given a vertex (multi)colored graph $G = (V, E, c)$ with coloring function $c : V \to \{1, 2, \ldots, q\}$ and a multiset $M$ of colors from $\{1, 2, \ldots, q\}$. The objective of the graph motif problem is to find a connected subgraph $H$ of $G$ that agrees with $M$: the multiset $c(H) = \{c(v) : v \in V(H)\}$ coincides with $M$. If $M$ consists of $k/2$ red colors and $k/2$ blue colors, the problem asks for a balanced connected subgraph with $k$ vertices. Björklund et al. [1] proved that there is an $O^*(2^{|M|})$ time randomized algorithm for the graph motif problem, where the $O^*$ notation suppresses the polynomial factor in $n$. This allows us to find a balanced connected subgraph of $k$ vertices in time $O^*(2^k)$, and hence BCS can be solved in $\max\{O^*(2^{0.773n}), O^*(2^{H(1-0.773)n})\} \subseteq O(1.709^n)$ time by using this $O^*(2^k)$ time algorithm for $k \le 0.773n$ and by guessing the complement of an optimal solution otherwise, where $H(x) = -x \log_2 x - (1-x) \log_2 (1-x)$ is the binary entropy function.

In this paper, we improve the previous running time $O(n^3)$ to $O(n^2)$ for trees and also give a polynomial time algorithm for interval graphs, which is in sharp contrast with the hardness result for chordal graphs. The algorithm for trees can be extended to bounded treewidth graphs. These results are given in Sections 3 and 4. For general graphs, we show in Section 6 that BCS can be solved in $O^*(2^{n/2}) = O(1.415^n)$ time. The idea of this exponential-time algorithm is to exploit the Dreyfus-Wagner algorithm [14] for the Steiner tree problem. Let $R$ be the set of red vertices of $G$. Then, for each $S \subseteq R$ and for each $v$ in $G$, we first compute a tree $T$ that contains all the vertices of $S \cup \{v\}$ but excludes all the vertices of $R \setminus (S \cup \{v\})$. This can be done in $O^*(2^{|R|})$ time by the Dreyfus-Wagner algorithm and its improvement due to Björklund et al. [2]. Once we have such a tree for each $S$ and $v$, we can in linear time compute a balanced connected subgraph $H$ with $V(H) \cap R = S$.
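The constants in the bound obtained from the graph motif algorithm can be checked numerically. The sketch below assumes that the threshold $c$ is chosen to balance the two terms, that is, $c = H(1 - c)$; the function names `binary_entropy` and `balance_point` are illustrative and do not come from the paper.

```python
import math


def binary_entropy(x):
    """Binary entropy H(x) = -x log2 x - (1 - x) log2 (1 - x)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def balance_point(tol=1e-12):
    """Solve c = H(1 - c) on [0.5, 1] by bisection.

    g(c) = c - H(1 - c) is increasing on this interval,
    with g(0.5) < 0 and g(1) > 0.
    """
    lo, hi = 0.5, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid - binary_entropy(1 - mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


c = balance_point()
print(round(c, 3), round(2 ** c, 3))  # threshold and base of the exponential
```

Under this assumption the threshold comes out at roughly 0.773, and $2^c$ is slightly below 1.709.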
We also consider a weighted counterpart of BCS, namely WBCS: the input is a vertex-weighted bicolored graph and the goal is to find a maximum weight connected subgraph $H$ in which the total weights of red vertices and of blue vertices are equal. If every vertex has unit weight, the problem exactly corresponds to the normal BCS, and hence the hardness results for BCS on bipartite, chordal, and planar graphs also hold for WBCS. In contrast to the unweighted case, WBCS is particularly hard. In Section 5, we show that WBCS is (weakly) NP-hard even on properly colored star graphs and strongly NP-hard even on split and properly colored bipartite graphs. The hardness result for stars is best possible in the sense that WBCS on trees can be solved in pseudo-polynomial time.

Throughout the paper, all the graphs are simple and undirected. Let $G$ be a graph. We denote by $V(G)$ and $E(G)$ the sets of vertices and edges of $G$, respectively. We use $n$ to denote the number of vertices of the input graph. We say that a vertex set $U \subseteq V(G)$ is connected if its induced subgraph $G[U]$ is connected. A graph is bicolored if every vertex is colored blue or red. Note that this coloring is not necessarily proper, that is, there may be adjacent vertices having the same color. We denote by $B$ (resp. $R$) the set of blue (resp. red) vertices of the input graph. The problems we consider are as follows.

The Balanced Connected Subgraph Problem (BCS)
Input: A bicolored graph $G = (B \cup R, E)$.
Output: A maximum size connected induced subgraph $H$ of $G$ such that $|V(H) \cap B| = |V(H) \cap R|$.

The Weighted Balanced Connected Subgraph Problem (WBCS)
Input: A bicolored vertex-weighted graph $G = (B \cup R, E, w)$ with $w : B \cup R \to \mathbb{N}$.
Output: A maximum weight connected induced subgraph $H$ of $G$ such that $\sum_{v \in V(H) \cap B} w(v) = \sum_{v \in V(H) \cap R} w(v)$.

Here, the size and the weight of a subgraph are measured by the number of vertices and the sum of the weights of the vertices in the subgraph, respectively.
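To make the definition of BCS concrete, the following is a minimal exponential-time reference solver that simply enumerates vertex subsets. It is only meant for cross-checking faster algorithms on tiny instances. The adjacency-dictionary input format and the names `is_connected` and `brute_force_bcs` are assumptions made for this sketch.

```python
from itertools import combinations


def is_connected(vertices, adj):
    """Return True if the non-empty vertex set induces a connected subgraph."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in vertices and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def brute_force_bcs(adj, color):
    """Size of a maximum balanced connected induced subgraph (0 if none).

    adj: dict mapping each vertex to an iterable of its neighbours.
    color: dict mapping each vertex to "B" (blue) or "R" (red).
    """
    nodes = list(adj)
    # A balanced set has even size, so try even sizes from largest to smallest.
    for size in range(len(nodes) - len(nodes) % 2, 0, -2):
        for subset in combinations(nodes, size):
            blue = sum(1 for v in subset if color[v] == "B")
            if 2 * blue == size and is_connected(subset, adj):
                return size
    return 0
```

For example, on the path 0-1-2-3 colored blue, blue, red, blue, the only balanced connected sets are the pairs {1, 2} and {2, 3}.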

This section is devoted to showing that BCS can be solved in $O(n^2)$ time for trees, which improves upon the previous running time $O(n^3)$ of [3]. We also show that, for graphs of bounded treewidth, the running time is $O(n^2)$ as well.

The essential idea behind our algorithm is the same as the one in [3]. Let $T$ be a bicolored rooted tree. For each $v \in V(T)$, we denote by $T_v$ the subtree of $T$ rooted at $v$. For the sake of simplicity, we convert the input tree $T$ into a rooted binary tree by adding uncolored vertices as follows. For each $v \in V(T)$ having $p > 2$ children $v_1, v_2, \ldots, v_p$, we introduce a path of $p - 2$ vertices $u_1, u_2, \ldots, u_{p-2}$ that are all uncolored and make $T$ a rooted binary tree $T'$ as in Fig. 1.

Figure 1: Binarizing a degree-$p$ vertex with $p > 2$.

We also assume that each internal node has exactly two children by appropriately adding uncolored children. This conversion can be done in $O(n)$ time. It is not hard to see that $T$ has a balanced connected subtree of size $2k$ whose root is $v \in V(T)$ if and only if $T'$ has a connected subgraph with $k$ blue vertices and $k$ red vertices that contains $v$ as the root. Moreover, $T'$ has $O(n)$ vertices. Thus, in the following, we seek a connected subgraph with $k$ blue vertices and $k$ red vertices, where $k$ is as large as possible. For each $v \in V(T')$ and each integer $d$ with $-|V(T'_v)| \le d \le |V(T'_v)|$, we say that a set $S \subseteq V(T'_v)$ inducing a connected subgraph is feasible for $(v, d)$ if it satisfies
• if $S$ is not empty, $S$ must contain $v$ and
• $|S \cap B| - |S \cap R| = d$.

We denote by $\mathrm{bcs}(v, d)$ the maximum size of $S$ that is feasible for $(v, d)$, where the size is measured by the number of colored vertices only. Let us note that for any $v \in V(T')$, the empty set is feasible for $(v, d)$ when $d = 0$. Given this, our goal is to compute $\mathrm{bcs}(v, d)$ for all $v$ and $d$.

Let $f : V(T') \to \{-1, 0, 1\}$ be a function, where $f(v) = 1$ if $v$ is a blue vertex, $f(v) = -1$ if $v$ is a red vertex, and $f(v) = 0$ if $v$ is an uncolored vertex. Suppose $v$ is a leaf of $T'$. Then, $\mathrm{bcs}(v, f(v)) = |f(v)|$, $\mathrm{bcs}(v, 0) = 0$, and $\mathrm{bcs}(v, d) = -\infty$ for $d \notin \{0, f(v)\}$. Suppose otherwise that $v$ is an internal node. Let $v_l$ and $v_r$ be the children of $v$. Observe that every feasible solution for $(v, d)$ can be split by $v$ into two feasible solutions for $(v_l, d_l)$ and $(v_r, d_r)$ for some $d_l$ and $d_r$. Conversely, for every pair of feasible solutions for $(v_l, d_l)$ and $(v_r, d_r)$, we can construct a feasible solution for $(v, d_l + d_r + f(v))$. Thus, we have the following straightforward lemma.

Lemma 1. Let $d$ be an integer with $-|V(T'_v)| \le d \le |V(T'_v)|$. Then,
$$\mathrm{bcs}(v, d) = \max_{d_l, d_r :\ d_l + d_r + f(v) = d} \bigl(\mathrm{bcs}(v_l, d_l) + \mathrm{bcs}(v_r, d_r)\bigr) + |f(v)|,$$
where the maximum is taken among all pairs $d_l, d_r$ with $d_l + d_r + f(v) = d$.

A straightforward estimate of the time needed to evaluate this recurrence for all $v \in V(T')$ and $-|V(T'_v)| \le d \le |V(T'_v)|$ is $O(n^3)$ in total, since there are $O(n^2)$ subproblems and solving each subproblem may take $O(n)$ time. However, for a node $v$ with two children $v_l$ and $v_r$, the evaluation can be done in $O(|V(T'_{v_l})| \, |V(T'_{v_r})|)$ time in total for all $d$ by simply joining all the pairs $\mathrm{bcs}(v_l, d_l)$ and $\mathrm{bcs}(v_r, d_r)$. Therefore, the total running time is
$$\sum_{v \in V(T')} O(|V(T'_{v_l})| \, |V(T'_{v_r})|) = O(n^2).$$
This upper bound follows from the fact that the left-hand side can be seen as counting the number of edges in the complete graph on $V(T')$: each pair of vertices is counted only at the node where one lies in the left subtree and the other in the right subtree.

Theorem 1. BCS on trees can be solved in $O(n^2)$ time.

This algorithm can be extended to bounded treewidth graphs.
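The dynamic programming on trees can be sketched as follows. This is a variation on the description above, not the paper's exact formulation: instead of binarizing the tree with uncolored vertices, it merges the tables of the children into the table of their parent one at a time, which plays the same role. It also tracks only non-empty sets containing the current vertex, with the empty set handled by the option of skipping a child. The function name `bcs_on_tree` and the input format (vertices labelled `0..n-1`, an edge list, and a sequence of `"B"`/`"R"` colors) are assumptions.

```python
def bcs_on_tree(n, edges, color):
    """Maximum size of a balanced connected subgraph of a tree (0 if none)."""
    if n == 0:
        return 0
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Root the tree at vertex 0 and record a DFS preorder iteratively.
    parent = [-1] * n
    order = []
    stack = [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if w != parent[u]:
                parent[w] = u
                stack.append(w)

    # table[v][d] = maximum size of a connected set inside the subtree of v
    # that contains v and has (#blue - #red) equal to d.
    table = [None] * n
    best = 0
    for v in reversed(order):  # every child is processed before its parent
        cur = {1 if color[v] == "B" else -1: 1}
        for c in adj[v]:
            if c == parent[v]:
                continue
            merged = dict(cur)  # option: take nothing from the subtree of c
            for d1, s1 in cur.items():
                for d2, s2 in table[c].items():
                    d = d1 + d2
                    if merged.get(d, 0) < s1 + s2:
                        merged[d] = s1 + s2
            cur = merged
            table[c] = None  # the child table is no longer needed
        table[v] = cur
        best = max(best, cur.get(0, 0))
    return best
```

Each merge touches a number of pairs proportional to the product of the sizes of the two vertex sets being combined, so the total work should stay within the same quadratic bound as the sum in the analysis above.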

Treewidth is a well-known graph invariant measuring the “tree-likeness” of graphs. A tree decomposition of a graph $G = (V, E)$ is a rooted tree $T$ that satisfies the following properties: (1) for each $v \in V(T)$, some vertex set $X_v$, called a bag, is assigned and $\bigcup_{v \in V(T)} X_v = V$; (2) for each $e \in E$, there is $v \in V(T)$ such that $e \subseteq X_v$; (3) for each $w \in V$, the set of nodes $v \in V(T)$ containing $w$ (i.e., $w \in X_v$) induces a subtree of $T$. The width of $T$ is the maximum size of its bags minus one. The treewidth of $G$, denoted by $\mathrm{tw}(G)$, is the minimum integer $k$ such that $G$ has a tree decomposition of width $k$.

The algorithm is quite similar to dynamic programming algorithms based on tree decompositions for connectivity problems [13], such as the Steiner tree problem and the Hamiltonian cycle problem. It is worth noting that the property of being balanced may not be expressible by a formula of bounded length in Monadic Second Order Logic (MSO). Thus, we cannot directly apply the famous Courcelle’s theorem [11, 12] to our problem.

Here, we only sketch an overview of the algorithm, as the proof is almost the same as those for connectivity problems. Let $T$ be a tree decomposition of $G$ whose width is $O(\mathrm{tw}(G))$. Such a decomposition can be obtained in $2^{O(\mathrm{tw}(G))} n$ time by the algorithm of Bodlaender et al. [5]. We can assume that $T$ is rooted. For each bag $X$ of $T$, we denote by $T_X$ the subtree rooted at $X$ and by $V_X$ the set of vertices appearing in some bag of $T_X$. For each bag $X$ of $T$, integer $d$ with $-|V_X| \le d \le |V_X|$, $S \subseteq X$, and a partition $\mathcal{S}$ of $S$, we compute the maximum size $\mathrm{bcs}(X, d, S, \mathcal{S})$ of a set of vertices $U \subseteq V_X$ such that $|U \cap B| - |U \cap R| = d$, $X \cap U = S$, and $u, v \in S$ are connected in $G[U]$ if and only if $u$ and $v$ belong to the same block of the partition $\mathcal{S}$. We can compute $\mathrm{bcs}(X, d, S, \mathcal{S})$ guided by the tree decomposition in $2^{O(\mathrm{tw}(G) \log \mathrm{tw}(G))} n^2$ time.
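The three defining properties of a tree decomposition given above can be checked mechanically. The sketch below is purely illustrative: the names `is_tree_decomposition` and `width`, and the representation of bags as a dictionary from tree nodes to vertex sets, are assumptions. It treats $T$ as an unrooted tree, since rooting does not affect validity.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check properties (1)-(3) for bags placed on the nodes of a tree T.

    vertices, edges: the graph G; bags: dict node -> set of vertices of G;
    tree_edges: list of pairs of nodes of T.
    """
    tree_adj = {x: set() for x in bags}
    for a, b in tree_edges:
        tree_adj[a].add(b)
        tree_adj[b].add(a)

    def connected(nodes):
        """True if the non-empty node set induces a connected part of T."""
        nodes = set(nodes)
        if not nodes:
            return False
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in tree_adj[x]:
                if y in nodes and y not in seen:
                    seen.add(y)
                    stack.append(y)
        return seen == nodes

    # T itself must be a tree: connected with |V(T)| - 1 edges.
    if len(tree_edges) != len(bags) - 1 or not connected(bags):
        return False
    # (1) The union of all bags is exactly V.
    if set().union(*bags.values()) != set(vertices):
        return False
    # (2) Every edge of G is contained in some bag.
    for u, v in edges:
        if not any(u in X and v in X for X in bags.values()):
            return False
    # (3) The nodes whose bags contain w induce a subtree of T.
    return all(connected(x for x in bags if w in bags[x]) for w in vertices)


def width(bags):
    """Width of a decomposition: maximum bag size minus one."""
    return max(len(X) for X in bags.values()) - 1
```

For instance, the path a-b-c admits the decomposition with bags {a, b} and {b, c} joined by a single tree edge, which has width 1.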
To improve the running time, we can apply the rank-based approach of Bodlaender et al. [5] to this dynamic programming in the same way as for the Steiner tree problem. The running time is still quadratic in $n$, but the exponential part can be improved to $2^{O(\mathrm{tw}(G))}$.

Theorem 2. BCS can be solved in $2^{O(\mathrm{tw}(G))} n^2$ time.

We can extend the algorithm for trees to the weighted case, namely WBCS. For a tree $T$, instead of computing $\mathrm{bcs}(v, d)$, we compute $\mathrm{wbcs}(v, d)$: for $v \in V(T)$ and for $d$ with $-\sum_{u \in V(T_v)} w(u) \le d \le \sum_{u \in V(T_v)} w(u)$, $\mathrm{wbcs}(v, d)$ is the maximum total weight of a subtree $T'_v$ of $T_v$ that contains $v$ and satisfies $\sum_{u \in V(T'_v) \cap B} w(u) - \sum_{u \in V(T'_v) \cap R} w(u) = d$. The algorithm itself is almost the same as the previous one, but the running time analysis is slightly different. Let $W$ be the total weight of the vertices of $T$. In the straightforward evaluation, for each $v \in V(T)$, the values $\mathrm{wbcs}(v, *)$ are computed by a dynamic programming algorithm that runs in $O(p_v W^2)$ time, where $p_v$ is the number of children of $v$. Therefore, the overall running time is upper bounded by $O(nW^2)$. To improve the quadratic dependency on $W$, we can exploit the heavy-light recursive dynamic programming technique [17]. The authors of [17] proved that, given a tree in which each vertex contains an item and each item has a weight and a value, the problem of maximizing the total value of items that induce a subtree subject to the condition that the total weight is upper bounded by a given budget, called the tree constrained knapsack problem, can be solved in $O(n^{\log 3} W) = O(n^{1.585} W)$ time, where $W$ is the total weight of items. WBCS can be seen as this tree constrained knapsack problem, and almost the same algorithm works as well. Therefore, WBCS can be solved in $O(n^{1.585} W)$ time.

Theorem 3.

WBCS on trees can be solved in $\min\{O(nW^2), O(n^{1.585} W)\}$ time, where $W$ is the total weight of the vertices.

In this section, we show that BCS can be solved in $O(n^3)$ time on interval graphs. Very recently, another polynomial time algorithm for interval graphs has been developed by [4].

A graph $G = (V, E)$ is interval if it has an interval representation: an interval representation of $G$ is a set of intervals that corresponds to its vertex set $V$, such that two vertices $u, v \in V$ are adjacent in $G$ if and only if the corresponding intervals have a non-empty intersection. We denote by $I_v$ the interval corresponding to vertex $v$ and by $l_v$ and $r_v$ the left and right end points of $I_v$, respectively. In what follows, we do not distinguish between vertices and intervals and use them interchangeably. Given an interval graph, we can compute an interval representation in linear time [6]. Moreover, we can assume that, in the interval representation, every end point of the intervals has a unique integer coordinate between $1$ and $2n$.

First, we sort the input intervals in ascending order of their left end points, that is, for any $I_{v_i} = [l_i, r_i]$ and $I_{v_j} = [l_j, r_j]$ with $i < j$, it holds that $l_i < l_j$. The following lemma is crucial for our dynamic programming.

Lemma 2. Let $S$ be a non-empty subset of $V$ such that $G[S]$ is connected and let $v$ be the vertex in $S$ whose index is maximized. Then $G[S \setminus \{v\}]$ is connected.

Proof. Suppose for contradiction that $G[S \setminus \{v\}]$ has at least two connected components, say $C_a$ and $C_b$. An important observation is that an interval graph is connected if and only if the union of its intervals forms a single interval. Thus, $C_a$ and $C_b$ respectively induce intervals $I_a$ and $I_b$ that have no intersection with each other. Without loss of generality, we may assume that $I_a$ is entirely to the left of $I_b$, i.e., the right end of $I_a$ is strictly to the left of the left end of $I_b$. Since $G[S]$ is connected, $I_v$ must have an intersection with both $I_a$ and $I_b$. Hence $l_v$ is at most the right end of $I_a$, which is smaller than $l_u$ for every $u \in C_b$. This contradicts the fact that $l_u < l_v$ for every $u \in S \setminus \{v\}$.

For $0 \le i \le n$, $1 \le k \le 2n$, and $-n \le d \le n$, we say that a non-empty set $S \subseteq \{v_1, v_2, \ldots, v_i\}$ is feasible for $(i, k, d)$ if $S$ induces a connected subgraph of $G$, $\max_{v \in S} r_v = k$, and $|S \cap B| - |S \cap R| = d$. We denote by $\mathrm{bcs}(i, k, d)$ the maximum cardinality of a set that is feasible for $(i, k, d)$. We also define $\mathrm{bcs}(i, k, d) = -\infty$ if there is no feasible subset for $(i, k, d)$. In particular, $\mathrm{bcs}(0, k, d) = -\infty$ for all $1 \le k \le 2n$ and $-n \le d \le n$. Let $f : V \to \{1, -1\}$ be such that $f(v) = 1$ if and only if $v \in B$. The algorithm is based on the following recurrences.

Lemma 3.

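For $i > 0$,
$$
\mathrm{bcs}(i, k, d) =
\begin{cases}
\max\{\mathrm{bcs}(i-1, k, d - f(v_i)) + 1,\ \mathrm{bcs}(i-1, k, d)\} & (k > r_i) \\
\max_{l_i < k' < r_i} \mathrm{bcs}(i-1, k', d - f(v_i)) + 1 & (k = r_i,\ d \ne f(v_i)) \\
\max\{1,\ \max_{l_i < k' < r_i} \mathrm{bcs}(i-1, k', d - f(v_i)) + 1\} & (k = r_i,\ d = f(v_i)) \\
\mathrm{bcs}(i-1, k, d) & (\text{otherwise}).
\end{cases}
$$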
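The recurrences for interval graphs can be sketched as a table-filling procedure. The code below is a minimal illustration under the stated assumption that all end points are pairwise distinct; the function name `bcs_on_intervals` and the `(left, right, color)` input format are hypothetical. It keeps only the tables for $i - 1$ and $i$, and relabels end points as $1, \ldots, 2n$ before running the dynamic programming.

```python
def bcs_on_intervals(intervals):
    """Maximum size of a balanced connected set of intervals (0 if none).

    intervals: list of (left, right, color) with pairwise distinct end points
    and color in {"B", "R"}.
    """
    n = len(intervals)
    if n == 0:
        return 0
    NEG = float("-inf")
    ivs = sorted(intervals)  # ascending order of left end points
    # Relabel the 2n end points as 1..2n.
    coords = sorted(x for l, r, _ in ivs for x in (l, r))
    rank = {x: idx + 1 for idx, x in enumerate(coords)}
    K, off = 2 * n, n  # balance d in [-n, n] is stored at index d + off

    # prev[k][d + off] = bcs(i - 1, k, d); initially bcs(0, k, d) = -infinity.
    prev = [[NEG] * (2 * n + 1) for _ in range(K + 1)]
    for l, r, c in ivs:
        li, ri = rank[l], rank[r]
        f = 1 if c == "B" else -1
        cur = [row[:] for row in prev]  # default: v_i is not selected
        for d in range(-n, n + 1):
            pd = d - f  # balance of the set before adding v_i
            if -n <= pd <= n:
                # Case k > r_i: extend a set whose right end already passes r_i.
                for k in range(ri + 1, K + 1):
                    cur[k][d + off] = max(cur[k][d + off], prev[k][pd + off] + 1)
                # Case k = r_i: extend a set whose right end lies inside v_i.
                inside = max((prev[k][pd + off] for k in range(li + 1, ri)),
                             default=NEG)
                cur[ri][d + off] = inside + 1
            if d == f:
                cur[ri][d + off] = max(cur[ri][d + off], 1)  # singleton {v_i}
        prev = cur
    # A balanced set has d = 0; take the best over all right end points k.
    return max(0, max(prev[k][off] for k in range(1, K + 1)))
```

Each interval triggers $O(n)$ work for each of the $O(n)$ balance values, so the sketch stays within cubic time overall.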

Proof. We first show that the left-hand side is at most the right-hand side in all cases. Let $S \subseteq \{v_1, v_2, \ldots, v_i\}$ be feasible for $(i, k, d)$ with $|S| = \mathrm{bcs}(i, k, d)$. Suppose first that $v_i \notin S$. This implies that $k \ne r_i$. In this case, $S$ is also feasible for $(i-1, k, d)$ and hence we have $\mathrm{bcs}(i, k, d) \le \mathrm{bcs}(i-1, k, d)$. Suppose otherwise that $v_i \in S$. By the definition of feasibility, it holds that $k \ge r_i$. If $S = \{v_i\}$, it holds that $\mathrm{bcs}(i, k, d) = 1$, $k = r_i$, and $d = f(v_i)$. Thus, the left-hand side is at most the right-hand side in the third recurrence. Let $S' = S \setminus \{v_i\}$ be non-empty. Suppose moreover that $k > r_i$, that is, $\max_{v \in S'} r_v = k$. Since $v_i$ has the maximum index in $S$, by Lemma 2, $G[S']$ is connected. Moreover, $|S' \cap B| - |S' \cap R| = |S \cap B| - |S \cap R| - f(v_i)$ holds. Therefore, $S'$ is feasible for $(i-1, k, d - f(v_i))$, and then $\mathrm{bcs}(i, k, d) \le \mathrm{bcs}(i-1, k, d - f(v_i)) + 1$. If $k = r_i$, it holds that $l_i < \max_{v \in S'} r_v < r_i$. This follows from the fact that $S$ is connected and there are no intervals that share end points. Similar to the case where $k > r_i$, $S'$ is feasible for $(i-1, \max_{v \in S'} r_v, d - f(v_i))$. Hence, $\mathrm{bcs}(i, k, d) \le \max_{l_i < k' < r_i} \mathrm{bcs}(i-1, k', d - f(v_i)) + 1$.

Next, we show that the left-hand side is at least the right-hand side. Suppose that $k > r_i$. If there is a feasible set for $(i-1, k, d)$, this is also feasible for $(i, k, d)$ and $\mathrm{bcs}(i, k, d) \ge \mathrm{bcs}(i-1, k, d)$ follows. Let $S'$ be feasible for $(i-1, k, d - f(v_i))$ with $|S'| = \mathrm{bcs}(i-1, k, d - f(v_i))$. Since the intervals are sorted by their left ends and $k > r_i$, $S'$ contains an interval that entirely covers the interval $I_{v_i}$. This means that $S := S' \cup \{v_i\}$ is connected and then feasible for $(i, k, d)$. Therefore, we have $\mathrm{bcs}(i, k, d) \ge |S'| + 1$. Suppose next that $k = r_i$. Let $S'$ be feasible for $(i-1, k', d - f(v_i))$ with $l_i < k' < r_i$ and let $S := S' \cup \{v_i\}$. As $S'$ contains an interval whose right end is strictly in between $l_i$ and $r_i$, $S$ is connected, and hence feasible for $(i, k, d)$. Therefore, $\mathrm{bcs}(i, k, d) \ge |S'| + 1$. Finally, suppose that $k = r_i$ and $d = f(v_i)$. Even if there is no feasible set for $(i-1, k', d - f(v_i))$ with $l_i < k' < r_i$, the singleton $\{v_i\}$ is feasible and hence $\mathrm{bcs}(i, k, d) \ge 1$.

Theorem 4.

Given an $n$-vertex interval graph, BCS can be solved in $O(n^3)$ time.

Proof. For each $i > 0$, we can evaluate the recurrence in Lemma 3 in time $O(n^2)$ using dynamic programming and hence the theorem follows.

Paths are a special case of both interval graphs and trees, and on paths BCS can even be solved in linear time. Let $v_1, v_2, \ldots, v_n$ be a path in which $v_i$ and $v_{i+1}$ are adjacent to each other for $1 \le i < n$. First, we compute $\mathrm{left}(d)$, the minimum integer $i$ such that $|\{v_1, v_2, \ldots, v_i\} \cap B| - |\{v_1, v_2, \ldots, v_i\} \cap R| = d$, for all $d$ with $-n \le d \le n$, and $\mathrm{pref}(i) = |\{v_1, v_2, \ldots, v_i\} \cap B| - |\{v_1, v_2, \ldots, v_i\} \cap R|$ for all $1 \le i \le n$. We can compute these values in $O(n)$ time and store them in a table. Note that $\mathrm{left}(0) = 0$ and that $\mathrm{left}(d)$ is defined to be $\infty$ when there is no $i$ satisfying the above condition. Using these values, for each $1 \le i \le n$, the maximum size of a balanced subpath whose rightmost index is $i$ can be computed as $i - \mathrm{left}(\mathrm{pref}(i))$. Therefore, BCS on paths can be solved in linear time.

Theorem 5.

Given an $n$-vertex path, BCS can be solved in $O(n)$ time.

In this section, we discuss the hardness of the weighted counterpart of BCS, namely WBCS. Bhore et al. [3] proved that BCS is solvable in polynomial time on trees, split graphs, and properly colored bipartite graphs. However, we prove in this section that WBCS is hard even on those graph classes.
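Before turning to the hardness proofs, the linear-time procedure for paths behind Theorem 5 can be sketched in a few lines. This is an illustrative version with the hypothetical name `bcs_on_path`. It stores $\mathrm{left}(d)$ in a dictionary rather than a table indexed by $-n, \ldots, n$, and computes $\mathrm{pref}(i)$ on the fly.

```python
def bcs_on_path(colors):
    """Maximum size of a balanced subpath of the path v_1, ..., v_n.

    colors: sequence of "B"/"R" in path order.
    """
    left = {0: 0}  # left[d] = smallest prefix length whose balance equals d
    pref = 0       # balance (#blue - #red) of the current prefix
    best = 0
    for i, c in enumerate(colors, start=1):
        pref += 1 if c == "B" else -1
        if pref in left:
            # v_{left[pref] + 1}, ..., v_i is the longest balanced subpath ending at v_i.
            best = max(best, i - left[pref])
        else:
            left[pref] = i
    return best
```

A subpath from $v_{j+1}$ to $v_i$ is balanced exactly when the prefixes of lengths $j$ and $i$ have the same balance, which is why only the first occurrence of each balance value needs to be remembered.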

Theorem 6.

WBCS is NP-hard even on properly colored star graphs.

Proof. We can easily encode the subset sum problem into WBCS on star graphs. The subset sum problem asks for, given a set of integers $S = \{s_1, s_2, \ldots, s_n\}$ and an integer $B$, a subset $S' \subseteq S$ whose sum is exactly $B$, which is known to be NP-complete [15]. We take a blue vertex of weight $B$, add a red vertex of weight $s_i$ for each $s_i \in S$, and make each red vertex adjacent to the blue vertex. It is easy to see that the obtained graph has a non-empty feasible solution if and only if the instance of the subset sum problem has a feasible solution.

Let us note that WBCS can be solved in pseudo-polynomial time on trees using the algorithm described in Section 3. However, WBCS is still hard on properly colored bipartite graphs and split graphs even if the total weight is polynomially upper bounded.

Theorem 7.

WBCS is strongly NP-hard even on properly colored bipartite graphs.

Proof. The reduction is performed from the Exact 3-Cover problem, where given a finite set $E$ and a collection of three-element subsets $\mathcal{F} = \{S_1, S_2, \ldots, S_n\}$ of $E$, the goal is to find a subcollection $\mathcal{F}' \subseteq \mathcal{F}$ such that the sets in $\mathcal{F}'$ are mutually disjoint and $\mathcal{F}'$ entirely covers $E$. This problem is known to be NP-complete [15].

For an instance $(E, \mathcal{F})$ of the Exact 3-Cover problem, we construct a bipartite graph $G = (V_E \cup V_{\mathcal{F}} \cup \{w\}, E_{\mathcal{F}} \cup E_w)$ as: $V_E = \{v_e : e \in E\}$, $V_{\mathcal{F}} = \{v_S : S \in \mathcal{F}\}$, $E_{\mathcal{F}} = \{\{v_e, v_S\} : e \in E, S \in \mathcal{F}, e \in S\}$, and $E_w = \{\{w, v_S\} : S \in \mathcal{F}\}$. We color the vertices of $V_{\mathcal{F}}$ red and the other vertices blue. Indeed, the graph obtained is bipartite and properly colored. We may assume that $|E| = 3k$ for some integer $k$. We assign weight one to each $v_e \in V_E$, weight $n$ to each $v_S \in V_{\mathcal{F}}$, and weight $k(n-3)$ to $w$. In the following, we show that $(E, \mathcal{F})$ has a solution if and only if $G$ has a solution of total weight at least $2kn$.

Let $\mathcal{F}' \subseteq \mathcal{F}$ be a solution of Exact 3-Cover. We choose $w$, all the vertices of $V_E$, and $v_S$ for each $S \in \mathcal{F}'$. Clearly, the chosen vertices have total weight $2kn$. As $\mathcal{F}'$ covers $E$, every vertex in $V_E$ is adjacent to some $v_S$ that is chosen in our solution. Moreover, every vertex in $V_{\mathcal{F}}$ is adjacent to $w$. This implies that the chosen vertices are connected. Therefore, $G$ has a feasible solution of total weight at least $2kn$.

Conversely, let $U \subseteq V_E \cup V_{\mathcal{F}} \cup \{w\}$ be a balanced connected set in $G$ with total weight at least $2kn$. Since the total weight of the blue vertices in $G$ is $kn$, we can assume that the total weight of $U$ is exactly $2kn$. This means that $U$ contains all the blue vertices and exactly $k$ vertices of $V_{\mathcal{F}}$. Let $\mathcal{F}' \subseteq \mathcal{F}$ be the subsets corresponding to $U \cap V_{\mathcal{F}}$. We claim that $\mathcal{F}'$ is a solution of Exact 3-Cover. To see this, suppose that $\mathcal{F}'$ does not cover $e \in E$. Since $U$ is connected, $v_e$ has a neighbor $v_S$ in $U \cap V_{\mathcal{F}}$, contradicting that $e$ is not covered by $\mathcal{F}'$. Since the $k$ three-element sets of $\mathcal{F}'$ cover all $3k$ elements of $E$, they are mutually disjoint.

Theorem 8.

WBCS is strongly NP-hard even on split graphs.

Proof. Recall that a graph is split if its vertex set can be partitioned into a clique and an independent set. The proof is almost the same as that of Theorem 7. In the proof of Theorem 7, we construct a bipartite graph $G$ that has a solution of total weight $2kn$ if and only if the instance of the Exact 3-Cover problem has a solution. We construct a split graph $G'$ from $G$ by adding an edge for each pair of vertices of $V_{\mathcal{F}}$. Analogously, we can show that $G'$ has a solution of total weight $2kn$ if and only if the instance of the Exact 3-Cover problem has a solution.

General graphs

Since BCS is NP-hard [3], efficient algorithms for general graphs are unlikely to exist. From the viewpoint of exact exponential-time algorithms, the problem can be solved in time $O^*(1.709^n)$ using the algorithm due to Björklund et al. [1], discussed in Section 1. In this section, we improve this running time to $O^*(2^{n/2})$ by modifying the well-known Dreyfus-Wagner algorithm for the minimum Steiner tree problem [14].

Before describing our algorithm, we briefly sketch the Dreyfus-Wagner algorithm and its improvement by Björklund et al. [2]. The minimum Steiner tree problem asks for, given a graph $G = (V, E)$ and a terminal set $T \subseteq V$, a connected subgraph of $G$ that contains all the vertices of $T$ and has the least number of edges. The Dreyfus-Wagner algorithm solves the minimum Steiner tree problem in time $O^*(3^{|T|})$ by dynamic programming. For $S \subseteq T$ and $v \in V$, we denote by $\mathrm{opt}(S, v)$ the minimum number of edges in a connected subgraph of $G$ that contains $S \cup \{v\}$. Assume that $|S \cup \{v\}| \ge 3$ since otherwise $\mathrm{opt}(S, v)$ can be computed in polynomial time. Let $F$ be a connected subgraph that contains $S \cup \{v\}$ with $|E(F)| = \mathrm{opt}(S, v)$. Note that $F$ must be a tree as otherwise we can delete at least one edge from $F$ without disconnecting it. A key observation for the algorithm below is that every leaf of $F$ belongs to $S \cup \{v\}$. This enables us to decompose $F$ into (at most) three parts. Suppose first that $v$ is a leaf of $F$. Then, there is $w \in V(F)$ such that the edge set of $F$ can be partitioned into three edge disjoint subtrees $F_1$, $F_2$, and $F_3$: $F_1$ is a shortest path between $v$ and $w$, and $F_2$ and $F_3$ induce a non-trivial partition of $S \cup \{w\}$, that is, $V(F_2) \cap (S \cup \{w\})$ and $V(F_3) \cap (S \cup \{w\})$ are both non-empty. Suppose otherwise that $v$ is an internal vertex of $F$. We can also partition the edges of $F$ into two edge disjoint subtrees $F_1$ and $F_2$ such that $F_1$ contains $S' \cup \{w\}$ and $F_2$ contains $(S \cup \{w\}) \setminus S'$ for some non-empty proper subset $S'$ of $S$. This leads to the following recurrence.
$$\mathrm{opt}(S, v) = \min_{w \in V} \Bigl\{ d(v, w) + \min_{\emptyset \ne S' \subset S} \bigl(\mathrm{opt}(S', w) + \mathrm{opt}(S \setminus S', w)\bigr) \Bigr\}.$$
Note that if $v$ is an internal vertex, the minimum is attained when $v = w$ in the above recurrence. A naive evaluation of this recurrence takes $O^*(3^{|T|})$ time in total. Björklund et al. [2] proposed a fast evaluation technique for the above recurrence known as the fast subset convolution, described in Theorem 9, which allows us to compute $\mathrm{opt}(S, v)$ for all $S \subseteq T$ and $v \in V$ in total time $O^*(2^{|T|})$.

Theorem 9 ([2]). Let $U$ be a finite set. Let $M$ be a positive integer and let $f, g : 2^U \to \{0, 1, \ldots, M, \infty\}$. Then, the subset convolution over the min-sum semiring
$$(f * g)(X) = \min_{Y \subseteq X} \bigl(f(Y) + g(X \setminus Y)\bigr)$$
can be computed in $2^{|U|} (|U| + M)^{O(1)}$ time in total for all $X \subseteq U$.

For our problem, namely BCS, we first solve a variant of the minimum Steiner tree problem defined as follows. Let $G = (V, E)$ be the instance of BCS. Without loss of generality, we assume that $|R| \le |B|$. For $S \subseteq R$ and $v \in V \setminus S$, we compute $\mathrm{opt}'(S, v)$, the minimum number of edges of a tree connecting all the vertices of $S \cup \{v\}$ and excluding any vertex of $R \setminus (S \cup \{v\})$. The recurrence for $\mathrm{opt}'$ is quite similar to the one for the ordinary minimum Steiner tree problem, but an essential difference from the above recurrence is that the shortest path between $v$ and $w$ does not contain any red vertex other than those in $S \cup \{v\}$.

Lemma 4.

For $S \subseteq R$ and for $v \in V$,
$$\mathrm{opt}'(S, v) = \min_{w \in V \setminus (R \setminus (S \cup \{v\}))} \Bigl\{ d'(v, w) + \min_{\emptyset \ne S' \subset S} \bigl(\mathrm{opt}'(S', w) + \mathrm{opt}'(S \setminus S', w)\bigr) \Bigr\},$$
where $d'(v, w)$ is the number of edges in a shortest path between $v$ and $w$ excluding red vertices except for its end vertices. If there is no such path, $d'(v, w)$ is defined to be $\infty$.

Proof. The idea of the proof is analogous to the Dreyfus-Wagner algorithm for the ordinary Steiner tree problem. Suppose that $F$ is an optimal solution that contains every vertex in $S \cup \{v\}$ and does not contain any vertex in $R \setminus (S \cup \{v\})$. Similarly, we can assume that every leaf of $F$ belongs to $S \cup \{v\}$ as otherwise such a leaf can be deleted without losing feasibility. Then, the edge set of $F$ can be partitioned into three edge disjoint subtrees $F_1$, $F_2$, and $F_3$ such that $F_1$ is a shortest path between $v$ and $w$, and $F_2$ and $F_3$ induce a non-trivial bipartition of $S$. The only difference from the ordinary Steiner tree problem is that $F_1$, $F_2$, and $F_3$ should avoid any irrelevant red vertex. This can be done since $F$ does avoid such vertices.

Similar to the normal Steiner tree problem, we can improve the naive running time $O^*(3^{|R|})$ to $O^*(2^{|R|})$ by the subset convolution algorithm in Theorem 9.

Now, we are ready to describe the final part of our algorithm for BCS. We already know $\mathrm{opt}'(S, v)$ for all $S \subseteq R$ and $v \in V \setminus (R \setminus S)$. Suppose $\mathrm{opt}'(S, v) < \infty$. Let $F$ be a tree with $|E(F)| = \mathrm{opt}'(S, v)$ such that $V(F) \cap R = (S \cup \{v\}) \cap R$. Such a tree can be obtained in polynomial time using the standard traceback technique. Since $F$ is a tree, we know that $|V(F)| = \mathrm{opt}'(S, v) + 1$. Let $k$ be the number of red vertices in $F$. If $F$ contains more than $k$ blue vertices, we can immediately conclude that there is no balanced connected subgraph $H$ with $V(H) \cap R = V(F) \cap R$, since any such $H$ has at least $|V(F)| > 2k$ vertices. Suppose otherwise. The following lemma ensures that an optimal solution of BCS can be computed from some Steiner tree.

Lemma 5.

Let $R_F = V(F) \cap R$. If there is a balanced connected subgraph $H$ with $V(H) \cap R = R_F$, then there is a balanced connected subgraph $H'$ with $V(H') \cap R = R_F$ such that $F$ is a subtree of $H'$. Moreover, such a subgraph $H'$ can be constructed in linear time from $F$.

Proof. We prove this lemma by giving a linear time algorithm that constructs a balanced connected subgraph $H'$ when given $F$ as in the statement of the lemma. Suppose that there is a balanced connected subgraph $H$ with $V(H) \cap R = R_F$. Since $F$ is a minimum Steiner tree with $V(F) \cap R = R_F$, the number of blue vertices in $F$ is not larger than that of red vertices. We greedily add to $F$ a blue vertex that has a neighbor in $F$ as long as it is not balanced. We claim that it is possible to construct a balanced connected subgraph using this procedure. Let $H'$ be a maximal graph that is obtained by the above procedure. Suppose for contradiction that $|V(H')| < |V(H)|$. Let $v \in (V(H) \setminus V(H')) \cap B$ be a blue vertex and let $r$ be a red vertex in $F$. We choose $v$ and $r$ in such a way that the distance between $v$ and $r$ in $H$ is as small as possible. Consider a shortest path between $v$ and $r$ in $H$. By the choice of $v$ and $r$, there are no red vertices other than $r$ and no blue vertices of $V(H) \setminus V(H')$ other than $v$ on the path. Since $v \notin V(H')$ and $r \in V(H')$, there is a vertex $v'$ on the path such that $v'$ has a neighbor in $V(H')$ but is not contained in $H'$, which contradicts the maximality of $H'$. This greedy algorithm runs in linear time and hence the lemma follows.

Overall, we have the following running time.

Theorem 10.

BCS can be solved in $O^*(2^{n/2})$ time.

Acknowledgements

This work is partially supported by JSPS KAKENHI Grant Number JP17H01788 and JST CREST JPMJCR1401.