Fair Colorful k-Center Clustering
Xinrui Jia†  Kshiteej Sheth†  Ola Svensson†

Abstract
An instance of colorful k-center consists of points in a metric space that are colored red or blue, along with an integer k and a coverage requirement for each color. The goal is to find the smallest radius ρ such that there exist balls of radius ρ around k of the points that meet the coverage requirements. The motivation behind this problem is twofold. First, from fairness considerations: each color/group should receive a similar service guarantee, and second, from the algorithmic challenges it poses: this problem combines the difficulties of clustering along with the subset-sum problem. In particular, we show that this combination results in strong integrality gap lower bounds for several natural linear programming relaxations. Our main result is an efficient approximation algorithm that overcomes these difficulties to achieve an approximation guarantee of 3, nearly matching the tight approximation guarantee of 2 for the classical k-center problem which this problem generalizes.

1 Introduction

In the colorful k-center problem introduced in [5], we are given a set of n points P in a metric space partitioned into a set R of red points and a set B of blue points, along with parameters k, r, and b. The goal is to find a set of k centers C ⊆ P that minimizes ρ so that balls of radius ρ around each point in C cover at least r red points and at least b blue points. More generally, the points can be partitioned into ω color classes C_1, ..., C_ω, with coverage requirements p_1, ..., p_ω. To keep the exposition of our ideas as clean as possible, we concentrate the bulk of our discussion on the version with two colors. In Section 3 we show how our algorithm can be generalized to ω color classes, with an exponential dependence on ω in the running time, in a rather straightforward way, thus getting a polynomial-time algorithm for constant ω.

This generalization of the classic k-center problem has applications in situations where fairness is a concern. For example, if a telecommunications company is required to provide service to at least 90% of the people in a country, it would be cost effective to only provide service in densely populated areas. This is at odds with the ideal that at least some people in every community should receive service. In the absence of color classes, an approximation algorithm could be "unfair" to some groups by completely considering them as outliers. The inception of fairness in clustering can be found in the recent paper [8] (see also [1, 4]), which uses a related but incomparable notion of fairness. Their notion of fairness requires each individual cluster to have a balanced number of points from each color class, which leads to very different algorithmic considerations and is motivated by other applications, such as "feature engineering".

∗ A preliminary version of this work was presented at the 21st Conference on Integer Programming and Combinatorial Optimization (IPCO 2020). An independent work of Anegg et al. [2], presented at the same venue, gave a 4-approximation for colorful k-center with constantly many colors using different techniques. This work is supported by the Swiss National Science Foundation project 200021-184656 "Randomness in Problem Instances and Randomized Algorithms."
† École Polytechnique Fédérale de Lausanne. Emails: {xinrui.jia, kshiteej.sheth, ola.svensson}@epfl.ch
The other motive for studying the colorful k-center problem derives from the algorithmic challenges it poses. One can observe that it generalizes the k-center problem with outliers, which is equivalent to only having red points and needing to cover at least r of them. This outlier version is already more challenging than the classic k-center problem: only recent results give tight 2-approximation algorithms [6, 12], improving upon the 3-approximation guarantee of [7]. In contrast, such algorithms for the classic k-center problem have been known since the '80s [10, 13]. That the approximation guarantee of 2 is tight, even for classic k-center, was proved in [14].

At the same time, a subset-sum problem with polynomial-sized numbers is embedded within the colorful k-center problem. To see this, consider n numbers a_1, ..., a_n and let A = ∑_{i=1}^{n} a_i. Construct an instance of the colorful k-center problem with r = k·A + A/2, b = k·A − A/2, and, for every i ∈ {1, ..., n}, a ball of radius one containing A + a_i red points and A − a_i blue points. These balls are assumed to be far apart, so that any single ball that covers two of these balls must have a very large radius. It is easy to see that the constructed colorful k-center instance has a solution of radius one if and only if there is a size-k subset of the n numbers whose sum equals A/2.
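To make the reduction concrete, here is a minimal sketch of the gadget construction (the function name and encoding are ours, not the paper's; the balls are imagined to be pairwise far apart):

```python
def subset_sum_to_colorful_kcenter(a, k):
    """Reduce subset-sum (choose k of the numbers a[0..n-1] summing to A/2)
    to colorful k-center, following the gadget described above.

    Returns the coverage requirements (r, b) and, for each i, the number of
    red and blue points placed in the i-th unit-radius ball.  We assume A is
    even (e.g., after doubling all numbers)."""
    A = sum(a)
    r = k * A + A // 2                       # red coverage requirement
    b = k * A - A // 2                       # blue coverage requirement
    balls = [(A + ai, A - ai) for ai in a]   # (red, blue) points in ball i
    return r, b, balls

# Choosing a set S of k balls covers k*A + sum_{i in S} a_i red points and
# k*A - sum_{i in S} a_i blue points, so both requirements are met at radius
# one iff sum_{i in S} a_i = A/2.
```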
Indeed, we show that the standard linear programming relaxation of colorful k-center has an unbounded integrality gap even after a linear number of rounds of the powerful Lasserre/Sum-of-Squares hierarchy (see Section 4.1). We remark that the standard linear programming relaxation gives a 2-approximation algorithm for the outliers version even without applying lift-and-project methods. Another natural approach for strengthening the standard linear programming relaxation is to add flow-based inequalities specially designed to solve subset-sum problems. However, in Section 4.2, we prove that they do not improve the integrality gap due to the clustering feature of the problem. This shows that clustering and the subset-sum problem are intricately related in colorful k-center. This interplay makes the problem more complex, and prior to our work only a randomized constant-factor approximation algorithm was known when the points are in ℝ², with an approximation guarantee greater than 6 [5]. Our main result overcomes these difficulties and we give a nearly tight approximation guarantee:

Theorem 1.
There is a 3-approximation algorithm for the colorful k-center problem.

As aforementioned, our techniques can easily be extended to a constant number of color classes, but we restrict the discussion here to two colors. On a very high level, our algorithm manages to decouple the clustering and the subset-sum aspects. First, our algorithm guesses certain centers of the optimal solution that it then uses to partition the point set into a "dense" part P_d and a "sparse" part P_s. The dense part is clustered using a subset-sum instance while the sparse part is clustered using the techniques of Bandyapadhyay, Inamdar, Pai, and Varadarajan [5] (see Section 2.1). Specifically, we use the pseudo-approximation of [5] that satisfies the coverage requirements using k + 1 balls of at most twice the optimal radius. While our approximation guarantee is nearly tight, it remains an interesting open problem to give a 2-approximation algorithm or to show that the ratio 3 is tight. One possible direction is to understand the strength of the relaxation obtained by combining the Lasserre/Sum-of-Squares hierarchy with the flow constraints. While we show that individually they do not improve the integrality gap, we believe that their combination can lead to a strong relaxation.

Independent work.
Independently and concurrently to our work, the authors of [2] obtained a 4-approximation algorithm for the colorful k-center problem with ω = O(1) using techniques different from the ones described in this work. Furthermore, they show that, assuming P ≠ NP, if ω is allowed to be unbounded then the colorful k-center problem admits no algorithm guaranteeing a finite approximation. They also show that, assuming the Exponential Time Hypothesis, colorful k-center is inapproximable if ω grows faster than log n.

LP1:
  ∑_{i ∈ B(j)} x_i ≥ z_j    for all j ∈ P,
  ∑_{i ∈ P} x_i ≤ k,
  ∑_{j ∈ R} z_j ≥ r,    ∑_{j ∈ B} z_j ≥ b,
  z_j, x_i ∈ [0, 1]    for all i, j ∈ P.

LP2:
  maximize    ∑_{j ∈ S} r_j y_j
  subject to  ∑_{j ∈ S} b_j y_j ≥ b,
              ∑_{j ∈ S} y_j ≤ k,
              y_j ∈ [0, 1]    for all j ∈ S.

Figure 1: The linear programs used in the pseudo-approximation algorithm.
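Although the paper treats LP1 purely analytically, it is small enough to write down directly. The following is a minimal sketch, assuming the instance is given by a distance matrix and boolean color masks (this encoding and the function name are our assumptions, not the paper's):

```python
import numpy as np
from scipy.optimize import linprog

def solve_lp1(dist, red, blue, k, r, b, radius=1.0):
    """Feasibility version of LP1 with variables x_1..x_n, z_1..z_n.

    dist : (n, n) array of pairwise distances
    red, blue : boolean arrays marking the color of each point
    """
    n = len(dist)
    A_ub, b_ub = [], []
    # z_j - sum_{i in B(j)} x_i <= 0 for every j
    for j in range(n):
        row = np.zeros(2 * n)
        row[np.where(dist[j] <= radius)[0]] = -1.0   # -x_i for i in B(j)
        row[n + j] = 1.0                             # +z_j
        A_ub.append(row); b_ub.append(0.0)
    # sum_i x_i <= k
    row = np.zeros(2 * n); row[:n] = 1.0
    A_ub.append(row); b_ub.append(k)
    # sum_{j in R} z_j >= r  and  sum_{j in B} z_j >= b
    for mask, need in ((red, r), (blue, b)):
        row = np.zeros(2 * n); row[n:][mask] = -1.0
        A_ub.append(row); b_ub.append(-need)
    res = linprog(c=np.zeros(2 * n), A_ub=np.array(A_ub), b_ub=b_ub,
                  bounds=[(0, 1)] * (2 * n), method="highs")
    return (res.x[:n], res.x[n:]) if res.success else None
```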
Organization. We begin by giving some notation and definitions and describing the pseudo-approximation algorithm of [5]. We then describe a 2-approximation algorithm for a certain class of instances that are well-separated; the 3-approximation follows almost immediately. This 2-approximation proceeds in two phases: the first is dedicated to the guessing of certain centers, while the second processes the dense and sparse sets. Section 3 explains the generalization to ω color classes. In Section 4 we present our integrality gaps under the Sum-of-Squares hierarchy and under additional constraints deriving from a flow network that solves subset-sums.

2 The 3-Approximation Algorithm

In this section we present our 3-approximation algorithm.

2.1 The Pseudo-Approximation Algorithm of [5]

We briefly describe the pseudo-approximation algorithm of Bandyapadhyay et al. [5], since we use it as a subroutine in our algorithm.
Notation:
We assume that our problem instance is normalized to have an optimal radius of one, and we refer to the set of centers in an optimal solution as
OPT. The set of all points at distance at most ρ from a point j is denoted by B(j, ρ), and we refer to this set as a ball of radius ρ at j. We write B(j) for B(j, 1), and by a ball of OPT we mean B(j) for some j ∈ OPT.

The algorithm of Bandyapadhyay et al. [5] first guesses the optimal radius for the instance (there are at most O(n²) distinct values the optimal radius can take), which we assume by normalization to be one, and considers the natural LP relaxation LP1 depicted on the left in Figure 1. The variable x_i indicates how much point i is fractionally opened as a center, and z_j indicates the amount by which j is covered by centers.

Given a fractional solution to LP1, the algorithm of [5] finds a clustering of the points. The clusters that are produced are of radius two and, with a simple modification (details can be found in Appendix B), can be made to have a special structure that we call a flower:

Definition 2.1.
For j ∈ P, a flower centered at j is the set F(j) = ∪_{i ∈ B(j)} B(i).

Given a fractional solution (x, z) to LP1, the clustering algorithm in [5] produces a set of points S ⊆ P and a cluster C_j ⊆ P for every j ∈ S such that:

1. The set S is a subset of the points {j ∈ P : z_j > 0} with positive z-values.
2. For each j ∈ S, we have C_j ⊆ F(j) and the clusters {C_j}_{j ∈ S} are pairwise disjoint.
3. If we let r_j = |C_j ∩ R| and b_j = |C_j ∩ B| for j ∈ S, then the linear program LP2 (depicted on the right in Figure 1) has a feasible solution y of value at least r.

As LP2 has only two non-trivial constraints, any extreme point will have at most two variables attaining strictly fractional values. So at most k + 1 variables of y are non-zero. The pseudo-approximation of [5] now simply takes those non-zero points as centers. Since each flower is of radius two, this gives a 2-approximation algorithm that opens at most k + 1 centers. (Note that, as the clusters {C_j}_{j ∈ S} are pairwise disjoint, at least b blue points are covered, and at least r red points are covered since the value of the solution is at least r.)

Obtaining a constant-factor approximation algorithm that only opens k centers turns out to be significantly more challenging. Nevertheless, the above techniques form an important subroutine in our algorithm. Given a fractional solution (x, z) to LP1, we proceed as above to find S and an extreme point of LP2 of value at least r. However, instead of selecting all points with positive y-value, we, in the case of two fractional values, only select the one whose cluster covers more blue points. This gives us a solution of at most k centers whose clusters cover at least b blue points. Furthermore, the number of red points that are covered is at least r − max_{j ∈ S} r_j, since we disregarded at most one center. As S ⊆ {j : z_j > 0} (see the first property above) and C_j ⊆ F(j) (see the second property above), we have max_{j ∈ S} r_j ≤ max_{j : z_j > 0} |F(j) ∩ R|. We summarize the obtained properties in Lemma 2.2 below.
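The selection step is simple enough to state in code. This sketch (our own illustration) takes an extreme point y of LP2 together with the blue counts of the clusters and returns the chosen centers:

```python
def round_lp2_extreme_point(y, b_vals, eps=1e-9):
    """Select centers from an extreme point y of LP2 (a sketch).
    y: dict mapping cluster center j -> its LP2 value.
    b_vals[j]: number of blue points in cluster C_j.
    All centers with y_j = 1 are kept; of the at most two strictly
    fractional centers, only the one covering more blue points is kept."""
    integral = [j for j, v in y.items() if v >= 1 - eps]
    fractional = [j for j, v in y.items() if eps < v < 1 - eps]
    assert len(fractional) <= 2   # extreme points of LP2 have <= 2 fractional coords
    chosen = list(integral)
    if fractional:
        chosen.append(max(fractional, key=lambda j: b_vals[j]))
    return chosen                 # at most k centers
```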
Lemma 2.2. Given a fractional solution (x, z) to LP1, there is a polynomial-time algorithm that outputs at most k clusters of radius two that cover at least b blue points and at least r − max_{j : z_j > 0} |F(j) ∩ R| red points.

We can thus find a 2-approximate solution that covers sufficiently many blue points but may cover fewer red points than necessary. The idea now is that, if the number of red points in any cluster is not too large, i.e., max_{j : z_j > 0} |F(j) ∩ R| is "small", then we can hope to meet the coverage requirements for the red points by increasing the radius around some opened centers. Our algorithm builds on this intuition to get a 2-approximation algorithm using at most k centers for well-separated instances, as defined below.

Definition 2.3.
An instance of colorful k-center is well-separated if there does not exist a ball of radius three that covers at least two balls of OPT.

Our main result of this section can now be stated as follows:
Theorem 2.
There is a 2-approximation algorithm for well-separated instances.

The above theorem immediately implies Theorem 1, i.e., the 3-approximation algorithm for general instances. Indeed, if the instance is not well-separated, we can find a ball of radius three that covers at least two balls of
OPT by trying all n points and running the pseudo-approximation of [5] on the remaining uncovered points with k − 2 centers. The pseudo-approximation opens at most k − 1 balls of radius at most two, which, together with the ball of radius three that covers two balls of OPT, gives a 3-approximation with at most k balls. A sketch of this outer reduction, at a fixed guess of the optimal radius, is given below.
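Here `solve_well_separated` (Theorem 2) and `pseudo_approx` (the algorithm of [5], called with a reduced budget) are hypothetical placeholders for the subroutines described in the text; the sketch only illustrates the case analysis.

```python
def three_approx_at_unit_radius(points, dist, red, k, r, b,
                                solve_well_separated, pseudo_approx):
    """One radius guess of the 3-approximation (a sketch, not the paper's
    implementation).  Either the instance is well-separated and Theorem 2
    applies, or some radius-3 ball covers two optimal balls."""
    sol = solve_well_separated(points, dist, red, k, r, b)
    if sol is not None:
        return sol                                # k balls of radius <= 2
    for p in points:                              # guess the radius-3 ball
        inside = [q for q in points if dist[p][q] <= 3.0]
        outside = [q for q in points if dist[p][q] > 3.0]
        r_left = max(0, r - sum(1 for q in inside if q in red))
        b_left = max(0, b - sum(1 for q in inside if q not in red))
        balls = pseudo_approx(outside, dist, k - 2, r_left, b_left)
        if balls is not None:                     # at most k - 1 balls of radius <= 2
            return [(p, 3.0)] + balls             # plus one ball of radius 3
    return None
```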
Our algorithm for well-separated instances now proceeds in two phases, with the objective of finding a subset of P on which the pseudo-approximation algorithm produces subsets of flowers containing not too many red points. In addition, we maintain a partial solution set of centers (some guessed in the first phase), so that we can expand the radius around these centers to recover the deficit of red points incurred by closing one of the fractional centers.

2.2 Phase I

In this phase we will guess some balls of OPT that can be used to construct a bound on max_{j : z_j > 0} |R ∩ F(j)|. To achieve this, we define the notion of Gain(p, q) for any point p ∈ P and q ∈ B(p).

Definition 2.4.
For any p ∈ P and q ∈ B(p), let Gain(p, q) := R ∩ (F(q) \ B(p)) be the set of red points added to B(p) by forming a flower centered at q.

Our algorithm in this phase proceeds by guessing three centers c_1, c_2, c_3 of the optimal solution OPT:

For i = 1, 2, 3, guess the center c_i in OPT and calculate the point q_i ∈ B(c_i) such that the number of red points in Gain(c_i, q_i) ∩ P_i is maximized over all possible c_i, where P_1 = P and P_i = P_{i−1} \ F(q_{i−1}) for 2 ≤ i ≤ 4.
The time it takes to guess c_1, c_2, and c_3 is O(n³), and for each c_i we find the q_i ∈ B(c_i) such that |Gain(c_i, q_i) ∩ P_i| is maximized by trying all points in B(c_i) (at most n many). For notation, define Guess := ∪_{i=1}^{3} B(c_i) and let τ = |Gain(c_3, q_3) ∩ P_3|. A sketch of this guessing step, for one fixed guess of the triple, is given below. The important properties guaranteed by the first phase are summarized in Lemma 2.5.
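The following sketch (our own rendering, assuming a distance matrix and a set of red points) computes the q_i, the threshold τ, and the pruned set P_4 for one guessed triple; the enumeration of all O(n³) triples happens outside this function.

```python
def flower(q, points, dist):
    """F(q) = union of the unit balls B(i) over i in B(q)."""
    return {p for i in points if dist[q][i] <= 1.0
              for p in points if dist[i][p] <= 1.0}

def phase_one_given_guess(cs, points, dist, red):
    """For one guessed triple cs = (c1, c2, c3) of optimal centers, compute
    the points q_i maximizing |Gain(c_i, q_i) ∩ P_i|, the threshold tau,
    and the pruned set P_4."""
    P = set(points)                               # P_1 = P
    qs, tau = [], 0
    for c in cs:
        ball_c = {p for p in points if dist[c][p] <= 1.0}
        def gain(q):                              # |Gain(c, q) ∩ P_i|
            return len((flower(q, points, dist) - ball_c) & red & P)
        q = max(ball_c, key=gain)
        tau = gain(q)        # after the last iteration: tau = |Gain(c3, q3) ∩ P_3|
        qs.append(q)
        P -= flower(q, points, dist)              # P_{i+1} = P_i \ F(q_i)
    return qs, tau, P                             # P is now P_4
```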
Lemma 2.5.
Assuming that c_1, c_2, and c_3 are guessed correctly, we have that

1. the k − 3 balls of radius one in OPT \ {c_i}_{i=1}^{3} are contained in P_4 and cover at least b − |B ∩ Guess| blue points and at least r − |R ∩ Guess| red points; and
2. the three clusters F(q_1), F(q_2), and F(q_3) are contained in P \ P_4 and cover at least |B ∩ Guess| blue points and at least |R ∩ Guess| + 3·τ red points.

Proof.
1) We claim that the intersection of any ball of
OPT \ {c_i}_{i=1}^{3} with F(q_i) in P is empty, for all 1 ≤ i ≤
3. Then the k − 3 balls of OPT \ {c_i}_{i=1}^{3} satisfy the statement of (1). To prove the claim, suppose that there is p ∈ OPT \ {c_i}_{i=1}^{3} such that B(p) ∩ F(q_i) ≠ ∅ for some 1 ≤ i ≤ 3. Recall that F(q_i) = ∪_{i′ ∈ B(q_i)} B(i′), so this implies that B(p) ∩ B(q′) ≠ ∅ for some q′ ∈ B(q_i). Hence, a ball of radius three around q′ covers both B(p) and B(c_i), as c_i ∈ B(q_i), which contradicts that the instance is well-separated.

2) Note that for 1 ≤ i ≤ 3 we have B(c_i) ∪ Gain(c_i, q_i) ⊆ F(q_i), and that B(c_i) and Gain(c_i, q_i) are disjoint. The balls B(c_i) cover at least |B ∩ Guess| blue points and |R ∩ Guess| red points, while ∑_{i=1}^{3} |Gain(c_i, q_i) ∩ P_i| ≥ 3·τ. □

2.3 Phase II

Throughout this section we assume c_1, c_2, and c_3 have been guessed correctly in Phase I, so that the properties of Lemma 2.5 hold. Furthermore, by the selection and the definition of τ, we also have

  |Gain(p, q) ∩ P_4| ≤ τ for any p ∈ P_4 ∩ OPT and q ∈ B(p) ∩ P_4.   (1)

This implies that F(p) \ B(p) contains at most τ red points of P_4. However, to apply Lemma 2.2 we need that the number of red points of P_4 in the whole flower F(p) is bounded. To deal with balls with many more than τ red points, we will iteratively remove dense sets from P_4 to obtain a subset P_s of sparse points.

Definition 2.6.
When considering a subset of the points P_s ⊆ P_4, we say that a point j ∈ P_s is dense if the ball B(j) contains strictly more than 2·τ red points of P_s. For a dense point j, we also let I_j ⊆ P_s contain those points i ∈ P_s whose intersection B(i) ∩ B(j) contains strictly more than τ red points of P_s.

We remark that in the above definition, we have in particular that j ∈ I_j for a dense point j ∈ P_s. Our iterative procedure now works as follows (a code sketch is given after the description):

Initially, let I = ∅ and P_s = P_4. While there is a dense point j ∈ P_s:
• Add I_j to I and update P_s by removing the points D_j = ∪_{i ∈ I_j} B(i) ∩ P_s.

Let P_d = P_4 \ P_s denote those points that were removed from P_4. We will cluster the two sets P_s and P_d of points separately. Indeed, the following lemma says that a center in OPT \ {c_i}_{i=1}^{3} either covers points in P_s or in P_d, but not points from both sets. Recall that D_j denotes the set of points that are removed from P_s in the iteration when j was selected, and so P_d = ∪_j D_j.
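A direct transcription of the peeling procedure, under the same assumptions as the earlier sketches:

```python
def split_dense_sparse(P4, points, dist, red, tau):
    """Iteratively peel off dense regions from P_4 (Definition 2.6).
    Returns the groups I_j, the removed sets D_j, and the sparse set P_s."""
    def ball(c):
        return {p for p in points if dist[c][p] <= 1.0}
    Ps = set(P4)
    groups, removed = [], []
    while True:
        dense = next((j for j in Ps
                      if len(ball(j) & Ps & red) > 2 * tau), None)
        if dense is None:
            break
        I_j = [i for i in Ps
               if len(ball(i) & ball(dense) & Ps & red) > tau]
        D_j = set().union(*(ball(i) for i in I_j)) & Ps
        groups.append(I_j); removed.append(D_j)
        Ps -= D_j
    return groups, removed, Ps            # P_d = P_4 \ P_s
```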
Lemma 2.7. For any c ∈ OPT \ {c_i}_{i=1}^{3} and any I_j ∈ I, either c ∈ I_j or B(c) ∩ D_j = ∅.

Proof. Let c ∈ OPT \ {c_i}_{i=1}^{3}, let I_j ∈ I, and suppose c ∉ I_j. If B(c) ∩ D_j ≠ ∅, there is a point p in the intersection B(c) ∩ B(i) for some i ∈ I_j. Suppose first that B(c) ∩ B(j) ≠ ∅, and let p be a point in this intersection. Then, since c ∉ I_j, the intersection B(c) ∩ B(j) contains fewer than τ red points from D_j (recall that D_j contains the points of B(j) in P_s at the time j was selected). But by the definition of dense points, B(j) ∩ D_j has more than 2·τ red points, so (B(j) \ B(c)) ∩ D_j has more than τ red points. This region is a subset of Gain(c, p) ∩ P_4, which contradicts (1). This is shown in Figure 2(a). Now consider the second case, when B(c) ∩ B(j) = ∅ and there is a point p in the intersection B(c) ∩ B(i) for some i ∈ I_j with i ≠ j. Then, by the definition of I_j, B(i) ∩ B(j) has more than τ red points of D_j. However, this set is also a subset of Gain(c, p) ∩ P_4, so we reach the same contradiction. See Figure 2(b). □

Our algorithm now proceeds by guessing the number k_d of balls of OPT \ {c_i}_{i=1}^{3} contained in P_d. We also guess the numbers r_d and b_d of red and blue points, respectively, that these balls cover in P_d. Note that after guessing k_d, we know that the number of balls in OPT \ {c_i}_{i=1}^{3} contained in P_s equals k_s = k − 3 − k_d. Furthermore, by the first property of Lemma 2.5, these balls cover at least b_s = b − |B ∩ Guess| − b_d blue points in P_s and at least r_s = r − |R ∩ Guess| − r_d red points in P_s. As there are O(n³) possible values of k_d, b_d, and r_d (each can take a value between 0 and n), we can try all possibilities by increasing the running time by a multiplicative factor of O(n³). Henceforth, we therefore assume that we have guessed those parameters correctly. In that case, we show that we can recover an equally good solution for P_d and a solution for P_s that covers b_s blue points and almost r_s red points:

Figure 2: The shaded regions are subsets of Gain(c, p), which contain the darkly shaded regions that have > τ red points. (Panels (a) and (b) correspond to the two cases in the proof of Lemma 2.7.)
Lemma 2.8.
There exist two polynomial-time algorithms A_d and A_s such that if k_d, r_d, and b_d are guessed correctly, then

• A_d returns k_d balls of radius one that cover b_d blue points of P_d and r_d red points of P_d;
• A_s returns k_s balls of radius two that cover at least b_s blue points of P_s and at least r_s − 3·τ red points of P_s.

Proof. We first describe and analyze the algorithm A_d, followed by A_s.

The algorithm A_d for the dense point set P_d. By Lemma 2.7, we have that all k_d balls in OPT \ {c_i}_{i=1}^{3} that cover points in P_d are centered at points in ∪_j I_j. Furthermore, we have that each I_j contains at most one center of OPT. This is because every i ∈ I_j is such that B(i) ∩ B(j) ≠ ∅ and so, by the triangle inequality, B(j,
3) contains all balls {B(i)}_{i ∈ I_j}. Hence, by the assumption that the instance is well-separated, the set I_j contains at most one center of OPT.

We now reduce our problem to a 3-dimensional subset-sum problem. For each I_j ∈ I, form a group consisting of an item for each p ∈ I_j. The item corresponding to p ∈ I_j has the 3-dimensional value vector (1, |B(p) ∩ D_j ∩ B|, |B(p) ∩ D_j ∩ R|). Our goal is to find k_d items such that at most one item per group is selected and their 3-dimensional vectors sum up to (k_d, b_d, r_d). Such a solution, if it exists, can be found by standard dynamic programming over a table of size O(n⁴). For completeness, we provide the recurrence and precise details of this standard technique in Appendix A. Furthermore, since the D_j's are disjoint by definition, this gives k_d centers that cover b_d blue points and r_d red points in P_d, as required in the statement of the lemma.

It remains to show that such a solution exists. Let o_1, o_2, ..., o_{k_d} denote the centers of the balls in OPT \ {c_i}_{i=1}^{3} that cover points in P_d. Furthermore, let I_{j_1}, ..., I_{j_{k_d}} be the sets in I such that o_i ∈ I_{j_i} for i ∈ {1, ..., k_d}. Notice that by Lemma 2.7 we have that B(o_i) ∩ P_d is disjoint from P_d \ D_{j_i} and contained in D_{j_i}. It follows that the 3-dimensional vector corresponding to an OPT center o_i equals (1, |B(o_i) ∩ P_d ∩ B|, |B(o_i) ∩ P_d ∩ R|). Therefore, the sum of the vectors corresponding to o_1, ..., o_{k_d} results in the vector (k_d, b_d, r_d), where we used that our guesses of k_d, b_d, and r_d were correct.

The algorithm A_s for the sparse point set P_s. Assuming that the guesses are correct, we have that
OPT \ {c_i}_{i=1}^{3} contains k_s balls that cover b_s blue points of P_s and r_s red points of P_s. Hence, LP1 has a feasible solution (x, z) on the instance defined by the point set P_s, the number of balls k_s, and the constraints b_s and r_s on the number of blue and red points to be covered, respectively. Lemma 2.2 then says that we can, in polynomial time, find k_s balls of radius two such that at least b_s blue points of P_s are covered and at least r_s − max_{j : z_j > 0} |F(j) ∩ R| red points of P_s are covered. Here, F(j) refers to the flower restricted to the point set P_s.

To prove the second part of Lemma 2.8, it is thus sufficient to show that LP1 has a feasible solution where z_j = 0 for all j ∈ P_s such that |F(j) ∩ R| > 3·τ. In turn, this follows by showing that, for any such j ∈ P_s with |F(j) ∩ R| > 3·τ, no point in B(j) is in OPT (since then z_j = 0 in the integral solution corresponding to OPT). Such a feasible solution can be found by adding the constraints x_i = 0 for all i ∈ B(j), for all such j, to LP1. To see why this holds, suppose towards a contradiction that there is a c ∈ OPT such that c ∈ B(j). First, since there are no dense points in P_s, we have that the number of red points in B(c) ∩ P_s is at most 2·τ. Therefore, the number of red points of P_s in F(j) \ B(c) is strictly more than τ. In other words, we have τ < |Gain(c, j) ∩ P_s| ≤ |Gain(c, j) ∩ P_4|, which contradicts (1). □

Equipped with the above lemma, we are now ready to finalize the proof of Theorem 2.

Proof of Theorem 2.
Our algorithm guesses the optimal radius and the centers c_1, c_2, c_3 in Phase I, and k_d, r_d, b_d in Phase II. There are at most n(n − 1)/2 choices of the optimal radius, n choices for each c_i, and n + 1 choices for each of k_d, r_d, b_d (ranging from 0 to n). We can thus try all these possibilities in polynomial time and, since all other steps in our algorithm run in polynomial time, the total running time will be polynomial. The algorithm tries all these guesses and outputs the best solution found over all choices. For the correct guesses, we output a solution with 3 + k_d + k_s = k balls of radius at most two. Furthermore, by the second property of Lemma 2.5 and the two properties of Lemma 2.8, we have that

• the number of blue points covered is at least |B ∩ Guess| + b_d + b_s = b; and
• the number of red points covered is at least |R ∩ Guess| + 3·τ + r_d + r_s − 3·τ = r.

We have thus given a polynomial-time algorithm that returns a solution where the balls are of radius at most twice the optimal radius. □

3 Generalization to ω Color Classes

Our algorithm extends easily to a constant number ω of color classes C_1, ..., C_ω with coverage requirements p_1, ..., p_ω. We use the LPs in Figure 3 for a general number of colors, where p_{j,i} in LP2(ω) indicates the number of points of color class j in cluster i ∈ S, and S is the set of cluster centers obtained from applying the modified clustering algorithm of Appendix B to instances with ω color classes. LP2(ω) has only ω non-trivial constraints, so any extreme point has at most ω variables attaining strictly fractional values, and a feasible solution attaining objective value at least p_1 will have at most k + ω − 1 non-zero variables. By keeping, among the fractional centers, only the one whose cluster covers the most points of C_1, we can cover p_1 points of C_1. We would like to be able to close the remaining fractional centers, so we apply a procedure analogous to the one for two colors. We can guess 3(ω −
1) centers of
OPT for each of the ω − 1 colors C_2, ..., C_ω, making up for the deficits incurred by closing all other fractional centers. In particular, we get a running time with a factor of n^{O(ω²)}. The remainder of this section gives a formal description of the algorithm for ω color classes.

LP1(ω):
  ∑_{m ∈ B(i)} x_m ≥ z_i    for all i ∈ P,
  ∑_{i ∈ P} x_i ≤ k,
  ∑_{i ∈ C_j} z_i ≥ p_j    for all 1 ≤ j ≤ ω,
  z_i, x_i ∈ [0, 1]    for all i ∈ P.

LP2(ω):
  maximize    ∑_{i ∈ S} p_{1,i} y_i
  subject to  ∑_{i ∈ S} p_{j,i} y_i ≥ p_j    for all 2 ≤ j ≤ ω,
              ∑_{i ∈ S} y_i ≤ k,
              y_i ∈ [0, 1]    for all i ∈ S.

Figure 3: Linear programs for ω color classes.

The following is a natural generalization of Lemma 2.2 and summarizes the main properties of the clustering algorithm of Appendix B for instances with ω color classes.

Lemma 1′. Given a fractional solution (x, z) to LP1(ω), there is a polynomial-time algorithm that outputs at most k clusters of radius two that cover at least p_1 points of C_1, and at least p_i − (ω −
1)·max_{j : z_j > 0} |F(j) ∩ C_i| points of C_i for 2 ≤ i ≤ ω.

Since we may not meet the coverage requirements for ω − 1 of the colors, we need to guess 3 centers of OPT for each of those colors and for each fractional center that may be closed. In total we guess 3(ω − 1)² points of OPT, as follows:

For j = 2, ..., ω, for i = 1, 2, ..., 3(ω −
1), guess the center c_{j,i} in OPT and calculate the point q_{j,i} ∈ B(c_{j,i}) such that |C_j ∩ Gain(c_{j,i}, q_{j,i}) ∩ P_{j,i}| is maximized over all possible c_{j,i}, where P_{j,1} = P and P_{j,i} = P_{j,i−1} \ (C_j ∩ F(q_{j,i−1})) for 2 ≤ i ≤ 3(ω −
1) + 1.

This guessing takes O(n^{3(ω−1)²}) rounds. It is possible that some of the c_{j,i} coincide, but this does not affect the correctness of the algorithm. In fact, this can only improve the solution, in the sense that the coverage requirements will be met with fewer than k centers. Let k_c denote the number of distinct c_{j,i} obtained in the correct guess. For notation, define

  Guess := ∪_{j=2}^{ω} ∪_{i=1}^{3(ω−1)} B(c_{j,i})   and   τ_j := |C_j ∩ Gain(c_{j,3(ω−1)}, q_{j,3(ω−1)}) ∩ P_{j,3(ω−1)}|.

To be consistent with previous notation, let P′ := P \ ∪_{j=2}^{ω} ∪_{i=1}^{3(ω−1)} F(q_{j,i}). The important properties guaranteed by the first phase can be summarized in the following lemma, whose proof is the natural extension of that of Lemma 2.5.
Lemma 2′. Assuming that the c_{j,i} are guessed correctly, we have that

1. the k − k_c balls of radius one in OPT \ ∪_{j=2}^{ω} ∪_{i=1}^{3(ω−1)} {c_{j,i}} are contained in P′ and cover at least p_j − |C_j ∩ Guess| points of C_j for every 1 ≤ j ≤ ω; and
2. the clusters F(q_{j,i}) are contained in P \ P′ and cover at least |C_1 ∩ Guess| points of C_1 and at least |C_j ∩ Guess| + 3(ω − 1)·τ_j points of C_j for j = 2, ..., ω.

Now we need to remove points whose balls contain many points from any one of the color classes, in order to partition the instance into dense and sparse parts. This leads to the following generalized definition of dense points.
Definition 4′. When considering a subset of the points P_s ⊆ P′, we say that a point p ∈ P_s is j-dense if |C_j ∩ B(p) ∩ P_s| > 2·τ_j. For a j-dense point p, we also let I_p ⊆ P_s contain those points i ∈ P_s such that |C_j ∩ B(i) ∩ B(p) ∩ P_s| > τ_j, for every 2 ≤ j ≤ ω.

Now we perform an iterative procedure similar to that for two colors:

Initially, let I = ∅ and P_s = P′. While there is a j-dense point p ∈ P_s for any 2 ≤ j ≤ ω:
• Add I_p to I and update P_s by removing the points D_p = ∪_{i ∈ I_p} B(i) ∩ P_s.

As in the case of two colors, set P_d = P′ \ P_s. By naturally extending Lemma 2.7 and its proof, we can ensure that any ball of OPT \ ∪_{j=2}^{ω} ∪_{i=1}^{3(ω−1)} {c_{j,i}} is completely contained in either P_d or P_s. We guess the number k_d of such balls of OPT contained in P_d, and guess the numbers d_1, ..., d_ω of points of C_1, ..., C_ω covered by these balls in P_d. There are O(n^{ω+1}) possible values of k_d, d_1, ..., d_ω, and all the possibilities can be tried by increasing the running time by a multiplicative factor. The number of balls of OPT \ ∪_{j=2}^{ω} ∪_{i=1}^{3(ω−1)} {c_{j,i}} contained in P_s is given by k_s = k − k_c − k_d, and these balls cover at least s_j = p_j − |C_j ∩ Guess| − d_j points of C_j in P_s, for 1 ≤ j ≤ ω. Assuming that the parameters are guessed correctly, we can show, similarly to Lemma 2.8, that the following holds.

Lemma 4′. There exist two polynomial-time algorithms A′_d and A′_s such that if k_d, d_1, ..., d_ω are guessed correctly, then

• A′_d returns k_d balls of radius one that cover d_1, ..., d_ω points of C_1, ..., C_ω in P_d;
• A′_s returns k_s balls of radius two that cover at least s_1 points of C_1 in P_s and at least s_j − 3(ω − 1)·τ_j points of C_j in P_s, for 2 ≤ j ≤ ω.

The algorithm A′_d proceeds as did A_d, with the modification that the dynamic program is now (ω + 1)-dimensional. Algorithm A′_s is also similar to A_s, because LP1(ω) has a feasible solution where z_p = 0 for all p ∈ P_s such that |F(p) ∩ C_j| > 3·τ_j holds for any 2 ≤ j ≤ ω. Hence, we output a solution with k_c + k_d + k_s = k balls of radius at most two, and

• the number of points of C_1 covered is at least |C_1 ∩ Guess| + d_1 + s_1 = p_1; and
• the number of points of C_j covered is at least |C_j ∩ Guess| + 3(ω − 1)·τ_j + d_j + s_j − 3(ω − 1)·τ_j = p_j, for all j = 2, ..., ω.

This is a polynomial-time algorithm for colorful k-center with a constant number of color classes.

4 LP Integrality Gaps
In this section, we present two natural ways to strengthen LP1 and show that they both fail to close the integrality gap, providing evidence that clustering and knapsack feasibility cannot be decoupled in the colorful k-center problem. On the one hand, the Sum-of-Squares hierarchy is ineffective for knapsack problems, while on the other hand, adding knapsack constraints to LP1 is also insufficient due to the clustering aspect of this problem.

4.1 Integrality Gap under the Sum-of-Squares Hierarchy

The Sum-of-Squares hierarchy (equivalently, Lasserre [16, 17]) is a method of strengthening linear programs that has been used in constraint satisfaction problems, set cover, and graph coloring, to name just a few examples [3, 9, 18]. We use the same notation for the Sum-of-Squares hierarchy, abbreviated as SoS, as in Karlin et al. [15]. For a set V of variables, P(V) is the power set of V and P_t(V) is the set of subsets of V of size at most t. Their succinct definition of the hierarchy makes use of the shift operator: for two vectors x, y ∈ ℝ^{P(V)}, the shift operator gives the vector x ∗ y ∈ ℝ^{P(V)} such that

  (x ∗ y)_I = ∑_{J ⊆ V} x_J y_{I ∪ J}.

Analogously, for a polynomial g(x) = ∑_{I ⊆ V} a_I ∏_{i ∈ I} x_i, we have (g ∗ y)_I = ∑_{J ⊆ V} a_J y_{I ∪ J}. In particular, we work with the linear inequalities g_1, ..., g_m, so that the polytope to be lifted is

  K = {x ∈ [0, 1]^n : g_ℓ(x) ≥ 0, ℓ = 1, ..., m}.

Let T be a collection of subsets of V and y a vector in ℝ^T. The matrix M_T(y) is indexed by elements of T such that (M_T(y))_{I,J} = y_{I ∪ J}. We can now define the t-th SoS lifted polytope.

Definition 4.1.
For any 1 ≤ t ≤ n, the t-th SoS lifted polytope SoS_t(K) is the set of vectors y ∈ [0, 1]^{P_{2t}(V)} such that y_∅ = 1, M_{P_t(V)}(y) ⪰ 0, and M_{P_{t−1}(V)}(g_ℓ ∗ y) ⪰ 0 for all ℓ. A point x ∈ [0, 1]^n belongs to the t-th SoS polytope SoS_t(K) if there exists y ∈ SoS_t(K) such that y_{{i}} = x_i for all i ∈ V.

We use a reduction from Grigoriev's SoS lower bound for knapsack [11] to show that the following instance has a fractional solution with small radius that is valid for a linear number of rounds of SoS.
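As a toy illustration of the moment matrices in Definition 4.1 (not part of the argument, and not an SoS solver), one can build M_{P_t(V)}(y) directly from a table of values y:

```python
import itertools
import numpy as np

def moment_matrix(y, V, t):
    """M_{P_t(V)}(y): rows and columns indexed by subsets of V of size at
    most t, with entry (I, J) = y_{I ∪ J}.  Here y is a dict from frozensets
    (of size up to 2t) to floats."""
    index = [frozenset(S) for size in range(t + 1)
             for S in itertools.combinations(V, size)]
    return np.array([[y[I | J] for J in index] for I in index]), index

# y can lie in the t-th lifting only if y[frozenset()] == 1 and the moment
# matrices are PSD, which can be checked numerically, e.g.
# np.linalg.eigvalsh(M).min() >= -1e-9.
```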
Theorem 3 (Grigoriev). At least min{⌊min{k/2, n − k/2}⌋ + 3, n} rounds of SoS are required to recognize that the following polytope contains no integral solution for k ∈ ℤ odd:

  ∑_{i=1}^{n} 2·w_i = k,
  w_i ∈ [0, 1]    for all i.

Figure 4: Integrality gap example for linear rounds of SoS.

Consider an instance of colorful k-center with two colors, 8n points, k = n, and r = b = 2n, where n is odd. Points {4i − 3, 4i − 2, 4i − 1, 4i} for i ∈ [2n] belong to a cluster C_i of radius one. For odd i, C_i has three red points and one blue point, and for even i, C_i has one red point and three blue points. A picture is shown in Figure 4. In an optimal integer solution, one center needs to cover at least two of these clusters, while a fractional solution satisfying LP1 can open each cluster by 1/2. We map a solution of the t-th round of SoS on the system of equations in Theorem 3 to our variables in the t-th round of SoS on LP1 for this instance, to demonstrate that the infeasibility of balls of radius one is not recognized. More precisely, we assign a variable w_i to each pair of clusters of radius one as shown in Figure 4, corresponding to opening each cluster in the pair by w_i amount. Then a fractional opening of balls of radius one can be mapped to variables that satisfy the polytope in Theorem 3. The remainder of this subsection is dedicated to formally describing the reduction from Theorem 3.

Let W denote the set of variables used in the polytope defined in Theorem 3. Let w be in the t-th round of SoS applied to the system in Theorem 3, so that w is indexed by subsets of W of size at most t. Let V = V_x ∪ V_z, where V_x = {x_1, ..., x_{8n}} and V_z = {z_1, ..., z_{8n}}, be the set of variables used in LP1 for the instance shown in Figure 4. We define a vector y with entries indexed by subsets of V, and show that y is in the t-th SoS lifting of LP1. In each ball we pick a representative x_i, i ≡ 1 (mod 4), and define y_I = 0 if x_j ∈ I for some j ≢ 1 (mod 4); otherwise, y_I = w_{π(I)}, where

  π(I) = {w_i : x_{8i−7} ∈ I or x_{8i−3} ∈ I or z_{8i−j} ∈ I for some i ∈ [n], 0 ≤ j ≤ 7}.

We have M_{P_t(W)}(w) ⪰
0, and for g_1 = −n + ∑_{i=1}^{n} 2w_i and g_2 = n − ∑_{i=1}^{n} 2w_i, we have M_{P_{t−1}(W)}(g_ℓ ∗ w) ⪰ 0 for ℓ = 1, 2, since w satisfies the t-th round of SoS. As g_2 = −g_1, this implies that M_{P_{t−1}(W)}(g_ℓ ∗ w) is the zero matrix. To show that M_{P_t(V)}(y) ⪰
0, we start with M_{P_t(W)}(w) and construct a sequence of matrices such that the semidefiniteness of one implies the semidefiniteness of the next, until we arrive at a matrix that is M_{P_t(V)}(y) with rows and columns permuted, i.e., M_{P_t(V)}(y) multiplied on the left and right by a permutation matrix and its transpose. Since the eigenvalues of a matrix are invariant under this operation, M_{P_t(W)}(w) ⪰ 0 implies M_{P_t(V)}(y) ⪰ 0.

Lemma 4.2.
There exists a sequence of square matrices M_{P_t(W)}(w) := M_0, M_1, M_2, ..., M_p such that the rank of M_i is the same as the rank of M_{i+1}, M_i is the leading principal submatrix of M_{i+1} of dimension one less, and M_p is M_{P_t(V)}(y) with rows and columns permuted.

Proof. We claim that this sequence of matrices exists with the following description. Firstly, the matrix M_{i+1} has one extra row and column compared to M_i, and is the same on the leading principal submatrix of the size of M_i. Then there are two possibilities:

(a) The last row and column of M_{i+1} are all zeroes, or
(b) for some j, the last row of M_{i+1} is a copy of the j-th row of M_i, the last column is a copy of the j-th column of M_i, and the last entry is (M_i)_{j,j}.

Either way, the rank of M_{i+1} is the same as the rank of M_i.

To prove this claim, it suffices to consider a sequence of indices of the matrix M_{P_t(V)}(y). The matrix M_0 in our sequence will be the submatrix of M_{P_t(V)}(y) indexed by the first κ indices, where κ is the dimension of M_{P_t(W)}(w), i.e., the number of subsets of W of size at most t. Each subsequent matrix M_i will be the submatrix of M_{P_t(V)}(y) indexed by the first κ + i indices. Note that the rows/columns of M_{P_t(V)}(y) can be considered to be indexed by all the subsets of V of size at most t. With this in mind, consider a sequence of subsets of V of size at most t with the following properties:

1. All subsets of {x_i : i ≡ 1 (mod 4)} of size at most t form a prefix of our sequence.
2. Each set index after the first has exactly one more element than some set index that came earlier in the sequence.

It is clear that it is possible to arrange all the subsets of V of size at most t in a sequence satisfying these properties. It only remains to show that this sequence produces the desired construction for M_0, M_1, ..., M_p. We have

  (M_{P_t(V)}(y))_{I,J} = y_{I ∪ J} = w_{π(I ∪ J)} = (M_{P_t(W)}(w))_{π(I),π(J)},

so property (1) guarantees that we begin with M_0 being M_{P_t(W)}(w), up to the correct permutation of subsets of {x_i : i ≡ 1 (mod 4)}. Now consider the κ′-th index in the sequence, for κ′ > κ. By property (2), it is of the form J ∪ {x}, where J is one of the first κ′ − 1 indices and x ∈ V. There are two cases:

• If x is some x_i with i ≢ 1 (mod 4), then y_{I ∪ J ∪ {x}} = 0 for every index I appearing in the sequence.
• Otherwise, π(J ∪ {x}) is a subset of W of size at most t and therefore equals π(J′) for some index J′ appearing earlier in the sequence (every such subset of W arises as the π-image of a subset of the representatives, which form the prefix).

In the first case, the matrix constructed from the first κ′ indices has property (a), and in the second, property (b). Finally, it is clear that at each step the dimension of the matrices increases by one, and that each matrix is the leading principal submatrix of the following matrix in the sequence, until we end up with M_{P_t(V)}(y) (up to some permutation of its rows and columns). □

By the rank–nullity theorem, M_{i+1} has one more 0 eigenvalue than M_i, so we can apply the following theorem.

Theorem 4 (Cauchy's Interlace Theorem). Let A be a symmetric n × n matrix and B be a principal submatrix of A of dimension (n − 1) × (n − 1). If the eigenvalues of A are α_1 ≥ ⋯ ≥ α_n and the eigenvalues of B are β_1 ≥ ⋯ ≥ β_{n−1}, then α_1 ≥ β_1 ≥ α_2 ≥ β_2 ≥ ⋯ ≥ α_{n−1} ≥ β_{n−1} ≥ α_n.

Setting M_{i+1} = A and M_i = B as in Theorem 4, we have that α_n = 0 (since M_{i+1} and M_i have the same rank, but the dimension of the zero eigenspace of M_{i+1} is one greater than that of M_i). Hence, M_{i+1} has no negative eigenvalues if M_i has no negative eigenvalues.
This is sufficient to show that each matrix in the sequence constructed is positive semidefinite, and concludes the proof that M_{P_t(V)}(y) ⪰ 0.

It remains to show that the moment matrices arising from y and the linear constraints of our polytope are positive semidefinite. Let h_ℓ denote the linear inequalities in LP1. In essence, the corresponding moment matrices M_{P_{t−1}(V)}(h_ℓ ∗ y) are zero matrices, since all h_ℓ are tight for the example in Figure 4. Formally, we have

Lemma 4.3.
Matrices M_{P_{t−1}(V)}(h_ℓ ∗ y) are the zero matrix, for each h_ℓ a linear constraint from LP1.

Proof. Let h_{1,j} be the linear polynomial that corresponds to the first inequality of LP1 for j ∈ P, i.e., h_{1,j} = ∑_{i ∈ B(j,1)} x_i − z_j. First, if i ≢ 1 (mod 4), then y_{I ∪ {x_i}} = 0 for any I ⊆ V. Otherwise, we have

  (M_{P_{t−1}(V)}(h_{1,j} ∗ y))_{I,J} = ∑_{i ∈ B(j,1)} y_{I ∪ J ∪ {x_i}} − y_{I ∪ J ∪ {z_j}} = w_{π(I ∪ J) ∪ π({x_i})} − w_{π(I ∪ J) ∪ π({z_j})} = 0,

since π({x_i}) = π({z_j}) for the representative i ∈ B(j, 1) with i ≡ 1 (mod 4). For the remaining constraints h_2 = n − ∑_{i ∈ P} x_i, h_3 = ∑_{j ∈ R} z_j − r, and h_4 = ∑_{j ∈ B} z_j − b, we have that M_{P_{t−1}(V)}(h_ℓ ∗ y) is the zero matrix because of how we defined the projection onto w:

  (M_{P_{t−1}(V)}(h_2 ∗ y))_{I,J} = n·y_{I ∪ J} − ∑_{x_j ∈ V_x} y_{I ∪ J ∪ {x_j}} = n·w_{π(I ∪ J)} − ∑_{j=1}^{n} 2·w_{π(I ∪ J) ∪ {w_j}} = (M_{P_{t−1}(W)}(g_2 ∗ w))_{π(I),π(J)} = 0,

and

  (M_{P_{t−1}(V)}(h_3 ∗ y))_{I,J} = (M_{P_{t−1}(V)}(h_4 ∗ y))_{I,J} = ∑_{j ∈ R} y_{I ∪ J ∪ {z_j}} − 2n·y_{I ∪ J} = (∑_{i=1}^{n} 4·w_{π(I ∪ J) ∪ {w_i}}) − 2n·w_{π(I ∪ J)} = 2·(M_{P_{t−1}(W)}(g_1 ∗ w))_{π(I),π(J)} = 0.

This concludes the formal proof of the following theorem.
Theorem 5.
The integrality gap of LP1 with n points persists up to Ω(n) rounds of Sum-of-Squares. □

4.2 Flow-Based Constraints

In this section we add additional constraints, based on standard techniques, to LP1. These incorporate knapsack constraints for the fractional centers produced, in the hope of obtaining a better clustering, and we show that this fails to reduce the integrality gap.

We define an instance of a knapsack problem with multiple objectives. Each point p ∈ P corresponds to an item with three dimensions: a dimension of size one to restrict the number of centers, |B ∩ B(p)|, and |R ∩ B(p)|. We set up a flow network with an (n + 1) × n × n × k grid of nodes, and we name the nodes by the coordinates (w, x, y, z) of their positions. The source s is located at (1, 0, 0,
0) and we add an extra node t for the sink. Assign an arbitrary order to the points in P. For the item corresponding to i ∈ P, for each x ∈ [n], y ∈ [n], z ∈ [k]:

1. Add an edge from (i, x, y, z) to (i + 1, x, y, z) with flow variable e_{i,x,y,z}.
2. With b_i := |B ∩ B(i)| and r_i := |R ∩ B(i)|, if z < k add an edge from (i, x, y, z) to (i + 1, min{x + b_i, n}, min{y + r_i, n}, z + 1) with flow variable f_{i,x,y,z}.

For each x ∈ [b, n], y ∈ [r, n]:

3. Add an edge from (n + 1, x, y, k) to t with flow variable g_{x,y}.

Set the capacities of all edges to one. In addition to the usual flow constraints, add to LP1 the constraints

  x_i = ∑_{x,y ∈ [n], z ∈ [k]} f_{i,x,y,z}    for all i ∈ P,   (2)
  1 − x_i = ∑_{x,y ∈ [n], z ∈ [k]} e_{i,x,y,z}    for all i ∈ P.   (3)

We refer to the resulting linear program as LP3. Notice that an integral solution to LP1 defines a path from s to t through which one unit of flow can be sent; hence LP3 is a valid relaxation. On the other hand, any path P from s to t defines a set C_P of at most k centers by taking those points c for which f_{c,x,y,z} ∈ P for some x, y, and z. Moreover, as t can only be reached from a coordinate with x ≥ b and y ≥ r, we have that ∑_{c ∈ C_P} |B(c) ∩ B| ≥ b and ∑_{c ∈ C_P} |B(c) ∩ R| ≥ r. It follows that C_P forms a solution to the problem of radius one if the balls are disjoint. In particular, our integrality gap instances for the Sum-of-Squares hierarchy do not fool LP3. The example in Figure 5 (an integrality gap instance for LP3 with k = 3 and r = b = 8) shows that in an instance where balls overlap, the integrality gap remains large. Here, the fractional assignment of open centers is 1/2 for each of the depicted balls.

References

[1] Anagnostopoulos, A., Becchetti, L., Böhm, M., Fazzone, A., Leonardi, S., Menghini, C., Schwiegelshohn, C.: Principal fairness: Removing bias via projections. CoRR abs/1905.13651 (2019)
[2] Anegg, G., Angelidakis, H., Kurpisz, A., Zenklusen, R.: A technique for obtaining true approximations for k-center with covering constraints. In: International Conference on Integer Programming and Combinatorial Optimization (IPCO), pp. 52–65 (2020)
[3] Arora, S., Ge, R.: New tools for graph coloring. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pp. 1–12. Springer (2011)
[4] Backurs, A., Indyk, P., Onak, K., Schieber, B., Vakilian, A., Wagner, T.: Scalable fair clustering. In: Proceedings of the 36th International Conference on Machine Learning (ICML), pp. 405–413 (2019)
[5] Bandyapadhyay, S., Inamdar, T., Pai, S., Varadarajan, K.R.: A constant approximation for colorful k-center. In: 27th Annual European Symposium on Algorithms (ESA), pp. 12:1–12:14 (2019)
[6] Chakrabarty, D., Goyal, P., Krishnaswamy, R.: The non-uniform k-center problem. In: 43rd International Colloquium on Automata, Languages, and Programming (ICALP), pp. 67:1–67:15 (2016)
[7] Charikar, M., Khuller, S., Mount, D.M., Narasimhan, G.: Algorithms for facility location problems with outliers. In: Proceedings of the 12th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 642–651 (2001)
[8] Chierichetti, F., Kumar, R., Lattanzi, S., Vassilvitskii, S.: Fair clustering through fairlets. In: Advances in Neural Information Processing Systems (NIPS), pp. 5029–5037 (2017)
[9] Chlamtáč, E., Friggstad, Z., Georgiou, K.: Understanding set cover: Sub-exponential time approximations and lift-and-project methods. CoRR abs/1204.5489 (2012)
[10] Gonzalez, T.F.: Clustering to minimize the maximum intercluster distance.
Theoretical Computer Science 38, 293–306 (1985)
[11] Grigoriev, D.: Complexity of Positivstellensatz proofs for the knapsack. Computational Complexity 10(2), 139–154 (2001)
[12] Harris, D.G., Pensyl, T., Srinivasan, A., Trinh, K.: A lottery model for center-type problems with outliers. ACM Transactions on Algorithms 15(3), 36:1–36:25 (2019)
[13] Hochbaum, D.S., Shmoys, D.B.: A best possible heuristic for the k-center problem. Mathematics of Operations Research 10(2), 180–184 (1985)
[14] Hsu, W.L., Nemhauser, G.L.: Easy and hard bottleneck location problems. Discrete Applied Mathematics 1(3), 209–215 (1979)
[15] Karlin, A.R., Mathieu, C., Nguyen, C.T.: Integrality gaps of linear and semi-definite programming relaxations for knapsack. In: Integer Programming and Combinatorial Optimization (IPCO), pp. 301–314 (2011)
[16] Lasserre, J.B.: An explicit exact SDP relaxation for nonlinear 0-1 programs. In: International Conference on Integer Programming and Combinatorial Optimization (IPCO), pp. 293–303 (2001)
[17] Lasserre, J.B.: Global optimization with polynomials and the problem of moments. SIAM Journal on Optimization 11(3), 796–817 (2001)
[18] Tulsiani, M.: CSP gaps and reductions in the Lasserre hierarchy. In: Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC), pp. 303–312 (2009)

A Dynamic Programming for Dense Points
In this section we describe the dynamic programming algorithm discussed in Lemma 2.8. As stated in the proof of Lemma 2.8, given I = {I_{j_1}, ..., I_{j_m}} and correct guesses for k_d, b_d, r_d, we need to find k_d balls of radius one centered at points in ∪_j I_j covering b_d blue and r_d red points, with at most one point from each I_j ∈ I picked as a center. To do this, we first order the sets in I arbitrarily as I = {I_{j_1}, ..., I_{j_m}}, m = |I|. We create a 4-dimensional table T of dimension (m, b_d, r_d, k_d), where T[m′, b′, r′, k′] stores whether there is a set of k′ balls centered in the first m′ sets of I covering b′ blue and r′ red points. The recurrence relation for T is
  T[0, 0, 0, 0] = True,
  T[0, b′, r′, k′] = False    for any (b′, r′, k′) ≠ (0, 0, 0),
  T[m′, b′, r′, k′] = True if T[m′ − 1, b′, r′, k′] = True;
                      True if there exists c ∈ I_{j_{m′}} such that T[m′ − 1, b″, r″, k′ − 1] = True,
                           where b″ = b′ − |B(c) ∩ B| and r″ = r′ − |B(c) ∩ R|;
                      False otherwise.
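A bottom-up rendering of this recurrence, with the extra bookkeeping needed to recover the chosen centers, could look as follows (our sketch; the input encoding is an assumption):

```python
def dense_dp(groups, k_d, b_d, r_d):
    """groups[m] lists, for each candidate center c in the m-th set of I,
    its contribution (|B(c) ∩ D_j ∩ B|, |B(c) ∩ D_j ∩ R|); at most one
    center per group may be opened.  Returns one valid selection as
    (group, center) index pairs, or None if nothing sums to (k_d, b_d, r_d)."""
    reach = {(0, 0, 0, 0): None}   # (m, b, r, k) -> (previous state, choice)
    for m, items in enumerate(groups):
        for state in [s for s in reach if s[0] == m]:
            _, b, r, k = state
            # option 1: open nobody in this group
            reach.setdefault((m + 1, b, r, k), (state, None))
            # option 2: open exactly one center of the group
            for idx, (bc, rc) in enumerate(items):
                s = (m + 1, b + bc, r + rc, k + 1)
                if s[1] <= b_d and s[2] <= r_d and s[3] <= k_d:
                    reach.setdefault(s, (state, idx))
    state = (len(groups), b_d, r_d, k_d)
    if state not in reach:
        return None
    chosen = []
    while reach[state] is not None:
        prev, idx = reach[state]
        if idx is not None:
            chosen.append((state[0] - 1, idx))    # center idx opened in group
        state = prev
    return chosen
```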
The table T has size O((m + 1)·(n + 1)·(n + 1)·(n + 1)) = O(n⁴), since the first parameter ranges from 0 to m and the other parameters can each take a value between 0 and n. Moreover, since |I_j| ≤ n for all I_j ∈ I, we can compute the whole table in time O(n⁵) using, e.g., the bottom-up approach. We can also remember the choices in a separate table, and so we can find a solution in time O(n⁵) if it exists.

B The Clustering Algorithm
In this section we present the clustering algorithm used in [5], with a simple modification. The algorithm is described in pseudo-code in Algorithm 1; a Python transcription is given at the end of this appendix.
Algorithm 1: Clustering Algorithm
  S ← ∅, P′ ← P
  while P′ ≠ ∅ and max_{j ∈ P′} z_j > 0 do
    let j ∈ P′ be a point with maximum z_j; set S ← S ∪ {j}
    y_j ← min{1, ∑_{i ∈ B(j)} x_i}; z̃_j ← y_j
    C_j ← F(j) ∩ P′
    for all j′ ∈ C_j with j′ ≠ j, set y_{j′} ←
0 and z̃_{j′} ← z̃_j
    P′ ← P′ \ C_j
  end

Now we state the theorem summarizing the properties of this clustering algorithm, as used in Section 2.1.
Theorem 6.
Given a feasible fractional solution (x, z) to LP1, the set of points S ⊆ P and the clusters C_j ⊆ P for every j ∈ S produced by Algorithm 1 satisfy:

1. The set S is a subset of the points {j ∈ P : z_j > 0} with positive z-values.
2. For each j ∈ S, we have C_j ⊆ F(j) and the clusters {C_j}_{j ∈ S} are pairwise disjoint.

Moreover, if we let R_j = C_j ∩ R and B_j = C_j ∩ B with r_j = |R_j| and b_j = |B_j| for j ∈ S, then y is a feasible solution to LP2 (depicted on the right in Figure 1) with objective value at least r.

Proof. The proof of the first statement is clear from the condition in the while loop of the algorithm. For the second statement, observe that, by the definition of C_j as stated in the algorithm, C_j ⊆ ∪_{i ∈ B(j)} B(i) = F(j). Since in each iteration the cluster is removed from P′, the clusters are clearly disjoint.

In order to prove that y is feasible, we first state some useful observations.

• Firstly, for any i ∈ P there is at most one j ∈ S such that d(i, j) ≤
1. This is true because if there were j, j′ ∈ S such that both j, j′ ∈ B(i), then, assuming w.l.o.g. that j was considered earlier in the while loop, we would have j′ ∈ C_j, and thus j′ could not be in S, which is a contradiction.

• Secondly, note that for any j′ ∈ P such that j′ ∈ C_j for some j, we have z̃_{j′} = z̃_j ≥ z_{j′}. This is trivially true if z̃_j = 1; otherwise, z̃_j = ∑_{i ∈ B(j)} x_i ≥ z_j ≥ z_{j′}, where the first inequality follows from the LP1 constraints and the second from the fact that when C_j was removed, z_j had the highest z-value.

Now we show that y is feasible for LP2 with objective value at least r. Firstly, we show that ∑_{j ∈ S} r_j y_j ≥ r. To see this,

  ∑_{j ∈ S} r_j y_j = ∑_{j ∈ S} |R_j| y_j = ∑_{j ∈ S} ∑_{j′ ∈ R_j} z̃_j       (y_j = z̃_j for any j ∈ S)
                    ≥ ∑_{j ∈ S} ∑_{j′ ∈ R_j} z_{j′}                         (from the second observation, z̃_j ≥ z_{j′} for any j′ ∈ C_j)
                    = ∑_{j′ ∈ R : z_{j′} > 0} z_{j′}                        (since the C_j's are disjoint and contain all j with z_j > 0)
                    = ∑_{j′ ∈ R} z_{j′} ≥ r                                 (since z satisfies LP1).

Similarly, ∑_{j ∈ S} b_j y_j ≥ b. Finally, we show that ∑_{j ∈ S} y_j ≤ k:

  ∑_{j ∈ S} y_j ≤ ∑_{j ∈ S} ∑_{j′ ∈ B(j)} x_{j′}    (since y_j ≤ ∑_{j′ ∈ B(j)} x_{j′})
                ≤ ∑_{j′ ∈ P} x_{j′}                  (from the first observation)
                ≤ k                                  (since x satisfies LP1).

This concludes the proof of the claim that y is a feasible solution to LP2 with objective value at least r. □
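For completeness, the following is a direct Python transcription of Algorithm 1 (our own rendering, assuming the input is a distance matrix and dicts x, z holding a feasible LP1 solution):

```python
def clustering_algorithm(points, dist, x, z):
    """Algorithm 1 in Python.  Returns the set S, the clusters C_j, and the
    values y_j and z~_j analyzed in Theorem 6."""
    def ball(c):
        return [i for i in points if dist[c][i] <= 1.0]
    S, C, y, ztil = [], {}, {}, {}
    P_prime = set(points)
    while P_prime and max(z[j] for j in P_prime) > 0:
        j = max(P_prime, key=lambda p: z[p])      # point with maximum z_j
        S.append(j)
        y[j] = min(1.0, sum(x[i] for i in ball(j)))
        ztil[j] = y[j]
        C[j] = {p for i in ball(j) for p in ball(i)} & P_prime   # F(j) ∩ P'
        for jp in C[j]:
            if jp != j:
                y[jp], ztil[jp] = 0.0, ztil[j]
        P_prime -= C[j]
    return S, C, y, ztil
```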