Hardness of Approximation of Euclidean k-Median

Abstract

The Euclidean $k$-median problem is defined in the following manner: given a set $\mathcal{X}$ of $n$ points in $\mathbb{R}^d$, and an integer $k$, find a set $C \subset \mathbb{R}^d$ of $k$ points (called centers) such that the cost function $\Phi(C, \mathcal{X}) \equiv \sum_{x \in \mathcal{X}} \min_{c \in C} \|x - c\|_2$ is minimized. The Euclidean $k$-means problem is defined similarly by replacing the distance with squared distance in the cost function. Various hardness of approximation results are known for the Euclidean $k$-means problem. However, no hardness of approximation results were known for the Euclidean $k$-median problem. In this work, assuming the unique games conjecture (UGC), we provide the first hardness of approximation result for the Euclidean $k$-median problem. Furthermore, we study the hardness of approximation for the Euclidean $k$-means/$k$-median problems in the bi-criteria setting where an algorithm is allowed to choose more than $k$ centers. That is, bi-criteria approximation algorithms are allowed to output $\beta k$ centers (for constant $\beta > 1$) and the approximation ratio is computed with respect to the optimal $k$-means/$k$-median cost. In this setting, we show the first hardness of approximation result for the Euclidean $k$-median problem for any $\beta < 1.015$, assuming UGC. We also show a similar bi-criteria hardness of approximation result for the Euclidean $k$-means problem with a stronger bound of $\beta < 1.28$, again assuming UGC.


Anup Bhattacharya (Indian Statistical Institute Kolkata, [email protected]), Dishant Goyal and Ragesh Jaiswal (Indian Institute of Technology Delhi, {dishant.goyal, rjaiswal}@cse.iitd.ac.in)

arXiv preprint [cs.CC]

1 Introduction

We start by giving the definition of the Euclidean $k$-median problem.

Definition 1.1 ($k$-median). Given a set $\mathcal{X}$ of $n$ points in $\mathbb{R}^d$, and a positive integer $k$, find a set of centers $C \subset \mathbb{R}^d$ of size $k$ such that the cost function $\Phi(C, \mathcal{X}) \equiv \sum_{x \in \mathcal{X}} \min_{c \in C} \|x - c\|$ is minimized.

The Euclidean $k$-means problem is defined similarly by replacing the distance with squared distance in the cost function (i.e., replacing $\|x - c\|$ with $\|x - c\|^2$). These problems are also studied in the discrete setting where the centers are restricted to be chosen from a specific set $L \subset \mathbb{R}^d$, also given as input. This is known as the discrete version, whereas the former version (with $L = \mathbb{R}^d$) is known as the continuous version. In this work, we discuss only the continuous version of the problem. Henceforth, we will refer to the continuous Euclidean $k$-median/$k$-means problem as the Euclidean $k$-median/$k$-means problem or simply the $k$-median/$k$-means problem.

The relevance of the $k$-means and $k$-median problems in various computational domains such as resource allocation, big data analysis, pattern mining, and data compression is well known. A significant amount of work has been done to understand the computational aspects of the $k$-means/median problems. The $k$-means problem is known to be NP-hard even for fixed $k$ or $d$ [ADHP09, Das08, MNV12, Vat09]. A similar NP-hardness result is also known for the $k$-median problem [MS84]. Even the 1-median problem, popularly known as the Fermat-Weber problem [DH01], is a hard problem, and designing efficient algorithms for this problem is a separate line of research in itself; see e.g. [KV97, WEI37, CT89, BHPI02, CLM+…]. Polynomial-time approximation schemes (PTASs) are known for $k$-means and $k$-median when $k$ is fixed (or constant) [Mat00, KSS10, FMS07, Che09, JKS14]. Similarly, various PTASs are known for fixed $d$ [CAKM16, FRS16, CA18]. Various constant factor approximation algorithms are known for $k$-means and $k$-median even considering $k$ and $d$ as part of the input instead of fixed constants.
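To make the two cost functions of Definition 1.1 concrete, the following minimal sketch evaluates $\Phi(C, \mathcal{X})$ for a given center set, with and without squaring the distances. The function name `clustering_cost` and the toy data are assumptions introduced here for illustration; they do not come from the paper.

```python
import math

def clustering_cost(points, centers, squared=False):
    """Sum, over all points, of the distance to the nearest center.

    squared=False gives the k-median cost, squared=True the k-means cost.
    """
    total = 0.0
    for x in points:
        d = min(math.dist(x, c) for c in centers)
        total += d * d if squared else d
    return total

# Hypothetical toy instance: four points on a line and two candidate centers.
points = [(0.0,), (2.0,), (10.0,), (14.0,)]
centers = [(1.0,), (12.0,)]

kmedian_cost = clustering_cost(points, centers)               # 1 + 1 + 2 + 2
kmeans_cost = clustering_cost(points, centers, squared=True)  # 1 + 1 + 4 + 4
```

The sketch only evaluates the cost of a fixed center set; finding the minimizing set $C$ is the hard part that the rest of the paper is about.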
For the $k$-means problem, constant approximation algorithms have been given [KMN+02, ANSW17], the best being a 6.357-approximation algorithm by Ahmadian et al. [ANSW17]. On the negative side, the $k$-means problem is NP-hard to approximate within any factor smaller than a particular constant greater than one [ACKS15, LSW17, CAC19]. In other words, there exists a constant $\varepsilon > 0$ such that there is no efficient $(1 + \varepsilon)$-approximation algorithm for the $k$-means problem, assuming P ≠ NP. The best-known hardness of approximation result for the problem is 1.07 due to Cohen-Addad and Karthik [CAC19]. Constant factor approximation algorithms for the $k$-median problem are known [CGTS99, AGK+04, LS13, BPR+17, ANSW17]. The best known approximation guarantee for $k$-median is 2.633 due to Ahmadian et al. [ANSW17]. However, unlike the $k$-means problem, no hardness of approximation result was known for the $k$-median problem. In fact, hardness of approximation for the $k$-median problem was left as an open problem in the work of Awasthi et al. [ACKS15], who proved the hardness of approximation for the $k$-means problem. In this work, we solve this open problem by obtaining a hardness of approximation result for the Euclidean $k$-median problem, assuming that the Unique Games Conjecture holds. Following is one of the main results of this work.

Theorem 1 (Main Theorem). There exists a constant $\varepsilon > 0$ such that the Euclidean $k$-median problem cannot be approximated to a factor better than $(1 + \varepsilon)$, assuming the Unique Games Conjecture.

Footnote: In the approximation setting, the continuous version is not harder than its discrete counterpart since it is known (e.g., [FMS07, Mat00]) that an $\alpha$-approximation for the discrete problem gives an $(\alpha + \varepsilon)$-approximation for the continuous version, for an arbitrarily small constant $\varepsilon > 0$.

Important note: We would like to note that a similar hardness of approximation result for the Euclidean $k$-median problem using different techniques has been obtained independently by Vincent Cohen-Addad, Karthik C. S., and Euiwoong Lee. We came to know about their results through personal communication with the authors. Since their manuscript has not been published online yet, we are not able to add a citation to their work.

Now, having hardness of approximation for $k$-means and $k$-median, the next natural step in the beyond worst-case discussion is to allow more flexibility to the algorithm. One possible relaxation is to allow an approximation algorithm to choose more than $k$ centers. In other words, allow the algorithm to choose $\beta k$ centers (for some constant $\beta > 1$) and produce a solution that is close to the optimal solution with respect to $k$ centers. This is known as bi-criteria approximation, and the following definition formalizes this notion.

Definition 1.2 ($(\alpha, \beta)$-approximation algorithm). An algorithm $A$ is called an $(\alpha, \beta)$-approximation algorithm for the Euclidean $k$-means/median problem if, given any instance $I = (\mathcal{X}, k)$ with $\mathcal{X} \subset \mathbb{R}^d$, $A$ outputs a center set $F \subset \mathbb{R}^d$ of size $\beta k$ whose cost is at most $\alpha$ times the optimal cost with $k$ centers. That is,
\[
\sum_{x \in \mathcal{X}} \min_{f \in F} \{ D(x, f) \} \;\le\; \alpha \cdot \min_{\substack{C \subseteq \mathbb{R}^d \\ |C| = k}} \Big( \sum_{x \in \mathcal{X}} \min_{c \in C} \{ D(x, c) \} \Big).
\]
For the Euclidean $k$-means problem, $D(p, q) \equiv \|p - q\|^2$, and for the $k$-median problem, $D(p, q) \equiv \|p - q\|$.

One expects that as $\beta$ grows, there exist efficient $(\alpha, \beta)$-approximation algorithms with smaller values of $\alpha$. This is indeed observed in the work of Makarychev et al. [MMSW16]. For example, their algorithm gives a $(9 + \varepsilon)$-approximation for $\beta = 1$; a 2.59-approximation for $\beta = 2$; and a 1.[…]-approximation for $\beta = 3$. The approximation factor of their algorithm decreases as the value of $\beta$ increases. Furthermore, their algorithm gives a $(1 + \varepsilon)$-approximation guarantee with $O(k \log(1/\varepsilon))$ centers. Bandyapadhyay and Varadarajan [BV16] gave a $(1 + \varepsilon)$-approximation algorithm that outputs $(1 + \varepsilon)k$ centers in constant dimension. There are various other bi-criteria approximation algorithms that use distance-based sampling techniques and achieve better approximation guarantees than their non-bi-criteria counterparts [AJM09, ADK09, Wei16]. Unfortunately, in these bi-criteria algorithms, at least one of $\alpha, \beta$ is large. Ideally, we would like to obtain a PTAS with a small violation of the number of output centers. More specifically, we would like to address the following question:

Does the $k$-median or $k$-means problem admit an efficient $(1 + \varepsilon, 1 + \varepsilon)$-approximation algorithm?

Note that bi-criteria approximation algorithms of this type, which output $(1 + \varepsilon)k$ centers, have been extremely useful in obtaining a constant approximation for the capacitated $k$-median problem [Li16, Li17], for which no true constant approximation is known yet. Therefore, the above question is worth exploring. Note that here we are specifically aiming for a PTAS since the $k$-means and $k$-median problems already admit a constant factor approximation algorithm. In this work, we give a negative answer to the above question by showing that there exists a constant $\varepsilon > 0$ such that an efficient $(1 + \varepsilon, 1 + \varepsilon)$-approximation algorithm for the $k$-means and $k$-median problems does not exist, assuming the Unique Games Conjecture. The following two theorems state this result more formally.

Footnote: In the capacitated $k$-median/$k$-means problem there is an additional constraint on each center that it cannot serve more than a specified number of clients (or points).

Theorem 2 ($k$-median). For any constant $1 < \beta < 1.015$, there exists a constant $\varepsilon > 0$ such that there is no $(1 + \varepsilon, \beta)$-approximation algorithm for the $k$-median problem, assuming the Unique Games Conjecture.

Theorem 3 ($k$-means). For any constant $1 < \beta < 1.28$, there exists a constant $\varepsilon > 0$ such that there is no $(1 + \varepsilon, \beta)$-approximation algorithm for the $k$-means problem, assuming the Unique Games Conjecture. Moreover, the same result holds for any $1 < \beta < […]$ under the assumption that P ≠ NP.

Dimensionality reduction: Note that we can use the dimensionality reduction techniques of Makarychev et al. [MMR19] to show that our hardness of approximation results hold for $O(\log(k/\varepsilon)/\varepsilon^2)$-dimensional instances.

In the next subsection, we discuss the known results on hardness of approximation of the $k$-means and $k$-median problems in more detail.

The first hardness of approximation result for the Euclidean $k$-means problem was given by Awasthi et al. [ACKS15]. They obtained their result using a reduction from Vertex Cover on triangle-free graphs of bounded degree $\Delta$ to Euclidean $k$-means instances. Their reduction yields a $(1 + \varepsilon/\Delta)$ hardness factor for the $k$-means problem for a particular constant $\varepsilon > 0$. However, due to an unspecified value of $\Delta$, the authors did not deduce the exact hardness factor for the $k$-means problem. To overcome this barrier of unspecified bounded degree, Lee et al. [LSW17] showed the hardness of approximation of Vertex Cover on triangle-free graphs of bounded degree four. Using $\Delta = 4$, they obtained a 1.[…] hardness of approximation for the $k$-means problem. Subsequently, Cohen-Addad and Karthik [CAC19] improved the hardness of approximation to 1.07 using a reduction from the vertex coverage problem instead of a reduction from the vertex cover problem. Moreover, they also gave several improved hardness results for the discrete $k$-means/$k$-median problems in general and $\ell_p$ metric spaces. In their more recent work, they also improved the hardness of approximation results for the continuous $k$-means/$k$-median problem in general metric spaces [CASL20].

Unlike the Euclidean $k$-means problem, no hardness of approximation result was known for the Euclidean $k$-median problem. In this work, we give a hardness of approximation result for the Euclidean $k$-median problem assuming the Unique Games Conjecture. However, we do not deduce the exact constant hardness factor since we use the same reduction as in [ACKS15] and hence run into the same problem of unspecified degree that we discussed in the previous paragraph. As mentioned earlier, in an unpublished work communicated to us through personal communication, Vincent Cohen-Addad, Karthik C. S., and Euiwoong Lee have independently obtained a hardness of approximation result for the Euclidean $k$-median problem using a different set of techniques. They also gave bi-criteria hardness of approximation results in the $\ell_\infty$-metric for the 2-means and 2-median problems. We would like to point out that in the bi-criteria setting, our result is the first hardness of approximation result for the Euclidean $k$-means/$k$-median problem, to the best of our knowledge.

All of our hardness of approximation results are based on the reduction from Vertex Cover on bounded degree, triangle-free graphs. As we mentioned earlier, the same reduction is used in [ACKS15, LSW17] to obtain the hardness of approximation for the Euclidean $k$-means problem. However, extending this gap-preserving reduction to the Euclidean $k$-median setting is non-trivial. This problem was left as an open problem by Awasthi et al. [ACKS15].
In the next subsection, we discuss this reduction and the related difficulties in obtaining the hardness of approximation for the Euclidean $k$-median problem.

1.2 Comparison with [ACKS15] and Technical Contribution

Awasthi et al. [ACKS15] showed that the $k$-means problem is hard to approximate within a factor $(1 + \varepsilon)$ for a particular constant $0 < \varepsilon < 1$. We would like to show the same for the $k$-median problem, i.e., hardness within a factor $(1 + \varepsilon)$ for a particular constant $0 < \varepsilon < 1$. However, this task is challenging and non-trivial. We will discuss these challenges in this subsection. First, let us briefly discuss the results and techniques of [ACKS15].

Awasthi et al. [ACKS15] first gave a $(1 + \varepsilon)$-approximation preserving reduction from Vertex Cover on bounded degree graphs to Vertex Cover on bounded degree triangle-free graphs. Then, they gave a reduction from Vertex Cover on bounded degree triangle-free graphs to Euclidean $k$-means instances. The first reduction immediately gives a 1.36 hardness of approximation for Vertex Cover on bounded degree, triangle-free graphs, since a 1.36 hardness of approximation is already known for Vertex Cover on bounded degree graphs [DS05]. Following is a formal statement for this.

Theorem 4 (Corollary 5.3 [ACKS15]). Given any unweighted bounded degree, triangle-free graph $G$, it is NP-hard to approximate Vertex Cover within any factor smaller than 1.36.

Here is the description of the second reduction.

Construction of $k$-means instance: Let $(G, k)$ be a hard Vertex Cover instance where $G = (V, E)$ has bounded degree $\Delta$. Let $n$ denote the number of vertices in the graph and $m$ denote the number of edges in the graph. A $k$-means instance $I := (\mathcal{X}, k)$ with $\mathcal{X} \subset \mathbb{R}^n$ is constructed as follows. For every vertex $i \in V$, there is an $n$-dimensional vector $x_i := (0, \dots, 1, \dots, 0)$ in $\{0, 1\}^n$, which has 1 at the $i$-th coordinate and 0 at the rest of the coordinates. For each edge $e = (i, j) \in E$, a point $x_e := x_i + x_j$ is defined in $\{0, 1\}^n$. The point set $\mathcal{X} := \{ x_e \mid e \in E \}$ and the parameter $k$ define the $k$-means instance.

The following theorem based on the above construction is given in [ACKS15].

Theorem 5 (Theorem 4.1 [ACKS15]). There is an efficient reduction from Vertex Cover on bounded degree, triangle-free graphs to Euclidean $k$-means instances that satisfies the following properties:
1. If the Vertex Cover instance has value $k$, then the $k$-means instance has a cost at most $(m - k)$.
2. If the Vertex Cover instance has value at least $(1 + \varepsilon) \cdot k$, then the optimal cost of the $k$-means instance is at least $(m - k + \delta k)$.
Here, $\varepsilon$ is some fixed constant $> 0$ and $\delta = \Omega(\varepsilon)$.

The above reduction is only valid for bounded degree triangle-free graphs. This is the main reason why Vertex Cover was shown to be APX-hard on these graph instances. Furthermore, the above theorem implies that the $k$-means problem is APX-hard. This is stated formally in Corollary 5.1 (see Section 4 of [ACKS15] for the proof of this result).
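The construction above and the first property of Theorem 5 can be checked on a small example. The sketch below builds the points $x_e = x_i + x_j$, assigns every edge to one vertex of a given vertex cover, and sums the optimal 1-means costs (squared distances to the centroid) of the resulting star clusters. The helper names and the example graph are hypothetical and chosen only for illustration; the code assumes that `cover` really is a vertex cover and that every cover vertex receives at least one edge.

```python
def edge_points(n, edges):
    """Map each edge (i, j) to the point x_e = x_i + x_j in {0,1}^n."""
    pts = []
    for i, j in edges:
        x = [0.0] * n
        x[i] = 1.0
        x[j] = 1.0
        pts.append(tuple(x))
    return pts

def one_means_cost(pts):
    """Optimal 1-means cost: sum of squared distances to the centroid."""
    dim = len(pts[0])
    centroid = [sum(p[t] for p in pts) / len(pts) for t in range(dim)]
    return sum(sum((p[t] - centroid[t]) ** 2 for t in range(dim)) for p in pts)

def star_clustering_cost(n, edges, cover):
    """Assign each edge to a covering endpoint and add up the cluster costs."""
    clusters = {v: [] for v in cover}
    for i, j in edges:
        owner = i if i in clusters else j  # arbitrary tie-break; assumes a valid cover
        clusters[owner].append((i, j))
    return sum(one_means_cost(edge_points(n, es)) for es in clusters.values() if es)

# Hypothetical triangle-free graph on 6 vertices with vertex cover {0, 3}.
n = 6
edges = [(0, 1), (0, 2), (0, 3), (3, 4), (3, 5)]
cover = [0, 3]
m, k = len(edges), len(cover)
cost = star_clustering_cost(n, edges, cover)
```

On this instance the star clusters have 3 and 2 edges, with 1-means costs 2 and 1, so the total is $m - k = 3$, matching the bound in property 1.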

Corollary 5.1. There exists a constant $\varepsilon > 0$ such that it is NP-hard to approximate the Euclidean $k$-means problem to any factor better than $(1 + \varepsilon)$.

Note that we can further reduce the above $k$-means instances to $k$-means instances in a smaller dimensional space by applying standard dimensionality reduction techniques [DG03, MMR19]. Finally, the authors conclude with the following important question (see Section 6 of [ACKS15]):

"It would also be interesting to study whether our techniques give hardness of approximation results for the Euclidean k-median problem."

In other words, if we employ the same construction as in Awasthi et al. [ACKS15], and let $I = (\mathcal{X}, k)$ denote the Euclidean $k$-median instance, then can we show this reduction to be gap preserving for $k$-median? This question is challenging due to the hardness of the 1-median problem. Unlike the 1-means problem, where the optimal center is the centroid of the point set, the 1-median problem does not have any closed-form expression for the optimal center. As we mentioned earlier, this problem is also popularly known as the Fermat-Weber problem [DH01]. However, despite these difficulties, we can show that the above reduction is gap-preserving for the Euclidean $k$-median problem. This is made possible by the observation that we do not require the exact optimal cost of a 1-median instance; instead, good lower and upper bounds on the optimal 1-median cost suffice. Following are the two main ideas that we use here. In order to obtain an upper bound, we simply compute the 1-median cost of a point set with respect to its centroid. In order to obtain a lower bound, we use a decomposition technique that decomposes a 1-median instance into many smaller instances and bounds the total cost in terms of the cost of simpler instances, the 1-median costs of which can be easily computed. We elaborate on these ideas later in Section 4.
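The centroid upper bound mentioned above is easy to illustrate numerically. The sketch below compares the 1-median cost at the centroid with the cost reached by a Weiszfeld-style fixed-point iteration started at the centroid. The iteration is a standard heuristic for the Fermat-Weber problem and is used here only as an assumption-laden stand-in: it is not the decomposition technique of the paper, and the value it returns is itself only an upper bound on the optimum, not a certified lower bound. All function names and both point sets are hypothetical.

```python
import math

def median_cost(pts, c):
    """1-median cost of the point set with respect to the center c."""
    return sum(math.dist(p, c) for p in pts)

def centroid(pts):
    dim = len(pts[0])
    return tuple(sum(p[t] for p in pts) / len(pts) for t in range(dim))

def weiszfeld(pts, iters=200, eps=1e-12):
    """Weiszfeld-style iteration: re-weight points by inverse distance."""
    c = centroid(pts)
    for _ in range(iters):
        # eps guards against division by zero if c lands on an input point
        weights = [1.0 / max(math.dist(p, c), eps) for p in pts]
        total = sum(weights)
        c = tuple(sum(w * p[t] for w, p in zip(weights, pts)) / total
                  for t in range(len(c)))
    return c

# Edge points of a star with 3 edges (pairwise equidistant points).
star = [(1.0, 1.0, 0.0, 0.0), (1.0, 0.0, 1.0, 0.0), (1.0, 0.0, 0.0, 1.0)]
# Edge points of the path 0-1-2-3 (not equidistant).
path = [(1.0, 1.0, 0.0, 0.0), (0.0, 1.0, 1.0, 0.0), (0.0, 0.0, 1.0, 1.0)]

star_centroid_cost = median_cost(star, centroid(star))
path_centroid_cost = median_cost(path, centroid(path))
path_iterated_cost = median_cost(path, weiszfeld(path))
```

For the star, the points are equidistant and the centroid cost equals $\sqrt{3 \cdot 2} = \sqrt{6}$; for the path, the iterated cost is no larger than the centroid cost, which shows that the centroid gives an upper bound that need not be tight.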
Overall, the novelty of this work lies in bounding the optimal cost of the $k$-median instances and further deducing its relationship with the vertex cover of the bounded degree triangle-free graphs. Furthermore, we extend these techniques to show the bi-criteria hardness of approximation results for the Euclidean $k$-means and $k$-median problems.

In this section, we discuss some basic facts and inequalities that we will frequently use in our proofs. First, we note that the Fermat-Weber problem is not difficult for all 1-median instances. We can efficiently obtain the 1-median for some special instances. For example, for a set of equidistant points, the 1-median is simply the centroid of the point set. We give a proof of this statement in the next section. Most importantly, we use the following two facts to compute the 1-median cost.

Fact 1 ([MD87]). For a set of non-collinear points, the optimal 1-median is unique.

Fact 2 ([LR91, HAL48]). The 1-median cost is preserved if pairwise distances between the input points are preserved.

Footnote: Even though this statement is not explicitly mentioned in these references, it can be derived from them.

We use the above fact in vector spaces where it is tricky to compute the optimal 1-median exactly. In such cases, we transform the space to a different vector space, where computing the 1-median is relatively simpler. More specifically, we employ a rigid transformation since it preserves pairwise distances. Next, we give a simple lemma that is used to prove various bounds related to the quantity $\sqrt{m(m-1)}$.

Lemma 6. Let $m$ and $t$ be any positive real numbers greater than one. If $m \ge t$, the following bound holds:
\[
m - \big(t - \sqrt{t(t-1)}\big) \;\le\; \sqrt{m(m-1)} \;\le\; m - 1/2.
\]

Proof. The upper bound follows from the following sequence of inequalities:
\[
\sqrt{m(m-1)} < \sqrt{m^2 - m + 1/4} = \sqrt{(m - 1/2)^2} = m - 1/2.
\]
The lower bound follows from:
\begin{align*}
\sqrt{m(m-1)} &= m + \big(\sqrt{m(m-1)} - m\big) \\
&= m + m \cdot \Big(\sqrt{\tfrac{m-1}{m}} - 1\Big) \\
&\ge m + t \cdot \Big(\sqrt{\tfrac{t-1}{t}} - 1\Big) \qquad \Big(\because \tfrac{a+1}{b+1} \ge \tfrac{a}{b} \text{ for } b \ge a\Big) \\
&= m - \big(t - \sqrt{t(t-1)}\big).
\end{align*}

We now turn to the hardness of approximation of the Euclidean $k$-median problem, assuming the Unique Games Conjecture (UGC).

We show a gap-preserving reduction from Vertex Cover on bounded degree triangle-free graphs to Euclidean $k$-median instances. In addition, we use the following result, which follows from [ACKS15] and [AKS09].

Theorem 7. Given any unweighted triangle-free graph $G$ of bounded degree, Vertex Cover cannot be approximated within a factor smaller than $2 - \varepsilon$, for any constant $\varepsilon > 0$, assuming the Unique Games Conjecture.

Footnote: [AKS09] showed that Vertex Cover on $d$-degree graphs is hard to approximate within any factor smaller than $2 - \varepsilon$, for $\varepsilon = (2 + o_d(1)) \cdot \frac{\log \log d}{\log d}$, assuming the Unique Games Conjecture. Therefore, $\varepsilon$ can be set to an arbitrarily small value by taking a sufficiently large value of $d$.

In Section 1.2, we described the construction of a $k$-means instance from a Vertex Cover instance. We use the same construction for the $k$-median instances. Let $G = (V, E)$ denote a triangle-free graph of bounded degree $\Delta$. Let $I = (\mathcal{X}, k)$ denote the Euclidean $k$-median instance constructed from $G$. We establish the following theorem based on this construction.

Theorem 8. There is an efficient reduction from instances of Vertex Cover on triangle-free graphs with $m$ edges to those of Euclidean $k$-median that satisfies the following properties:
1. If the graph has a vertex cover of size $k$, then the $k$-median instance has a solution of cost at most $m - k/2$.
2. If the graph has no vertex cover of size $\le (2 - \varepsilon) \cdot k$, then the cost of any $k$-median solution on the instance is at least $m - k/2 + \delta k$.
Here, $\varepsilon$ is some fixed constant and $\delta = \Omega(\varepsilon)$.

The graphs with a vertex cover of size at most $k$ are said to be "Yes" instances and the graphs with no vertex cover of size $\le (2 - \varepsilon)k$ are said to be "No" instances. Now, the above theorem gives the following inapproximability result for the Euclidean $k$-median problem.

Corollary 8.1. There exists a constant $\varepsilon > 0$ such that the Euclidean $k$-median problem cannot be approximated to a factor better than $(1 + \varepsilon)$, assuming the Unique Games Conjecture.

Proof. Since the hard Vertex Cover instances have bounded degree $\Delta$, the maximum matching of such graphs has size at least $\lceil m/(2\Delta) \rceil$. First, let us prove this statement. Let $M$ be a matching that is initially empty, i.e., $M = \emptyset$. We construct $M$ in an iterative manner. First, we pick an arbitrary edge of the graph and add it to $M$. Then, we remove this edge and all the edges incident on it. We repeat this process for the remaining graph until the graph contains no edge. In each iteration, we remove at most $2\Delta$ edges. Therefore, the matching size of the graph is at least $\lceil m/(2\Delta) \rceil$.

Now, suppose $k < m/(2\Delta)$. Then, the graph does not have a vertex cover of size $k$ since the matching size is at least $\lceil m/(2\Delta) \rceil$. Therefore, such graph instances can be classified as "No" instances in polynomial time. So, they are not the hard Vertex Cover instances. Therefore, we can assume $k \ge m/(2\Delta)$ for all the hard Vertex Cover instances. In that case, the second property of Theorem 8 implies that the cost of the $k$-median instance is at least $(m - k/2) + \delta k \ge \big(1 + \tfrac{\delta}{2\Delta}\big) \cdot (m - k/2)$. Thus, the $k$-median problem cannot be approximated within any factor smaller than $1 + \tfrac{\delta}{2\Delta} = 1 + \Omega(\varepsilon)$.

Let us define some notation before proving Theorem 8. For any subgraph $S$ of the graph $G$, we denote its number of edges by $m(S)$. Recall that a point in $\mathcal{X}$ corresponds to an edge of the graph. Therefore, a subgraph $S$ of $G$ defines a subset of points $\mathcal{X}(S) := \{ x_e \mid e \in E(S) \}$ of $\mathcal{X}$. We define the 1-median cost of $\mathcal{X}(S)$ with respect to a center $c \in \mathbb{R}^n$ as:
\[
\Phi(c, S) \equiv \sum_{x \in \mathcal{X}(S)} d(x, c),
\]
where $d(x, c) = \|x - c\|$ is the Euclidean distance.
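The greedy argument in the proof of Corollary 8.1 can be sketched directly: repeatedly pick a remaining edge, add it to the matching, and delete every edge that shares an endpoint with it. The function names and the example graph (a 6-cycle, which is triangle-free with $\Delta = 2$) are assumptions made for illustration.

```python
import math

def greedy_matching(edges):
    """Pick an arbitrary remaining edge, then drop all edges touching it."""
    remaining = list(edges)
    matching = []
    while remaining:
        u, v = remaining[0]
        matching.append((u, v))
        remaining = [(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)]
    return matching

def max_degree(edges):
    """Maximum vertex degree of the graph given by its edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return max(deg.values())

# Hypothetical example: the 6-cycle.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
matching = greedy_matching(cycle)
lower_bound = math.ceil(len(cycle) / (2 * max_degree(cycle)))
```

Each round removes at most $2\Delta$ edges, so the loop runs at least $\lceil m/(2\Delta) \rceil$ times; on the 6-cycle this lower bound is 2, and the greedy matching found here has 3 edges.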
Furthermore, we define the optimal 1-median cost of $\mathcal{X}(S)$ as $\Phi^*(S)$, i.e.,
\[
\Phi^*(S) \equiv \min_{c \in \mathbb{R}^n} \Phi(c, S).
\]
In the discussion that follows, we often use the statement "optimal 1-median cost of a graph $S$", which simply means "optimal 1-median cost of the cluster $\mathcal{X}(S)$".

Let $\{ v_1, \dots, v_k \}$ be a vertex cover of $G$. Let $S_i$ denote the set of edges covered by $v_i$. If an edge is covered by two vertices $v_i$ and $v_j$, then we arbitrarily keep the edge either in $S_i$ or $S_j$. Let $m_i$ denote the number of edges in $S_i$. We define $\{ \mathcal{X}(S_1), \dots, \mathcal{X}(S_k) \}$ as a clustering of the point set $\mathcal{X}$. Now, we show that the cost of this clustering is at most $m - k/2$. Note that each $S_i$ forms a star graph centered at $v_i$. Moreover, the point set $\mathcal{X}(S_i)$ forms a regular simplex of side length $\sqrt{2}$. We compute the optimal cost of $\mathcal{X}(S_i)$ using the following lemma.

Lemma 9. For a regular simplex on $r$ vertices and side length $s$, the optimal 1-median is the centroid of the simplex. Moreover, the optimal 1-median cost is $s \cdot \sqrt{\frac{r(r-1)}{2}}$.

Proof. The statement is simple for $r = 1, 2$. So for the rest of the proof, we assume that $r > 2$. Let $A = \{ a_1, a_2, \dots, a_r \}$ denote the vertex set of a regular simplex. Let $s$ be the side length of the simplex. Using Fact 2, we can represent each point $a_i$ in an $r$-dimensional space as follows:
\[
a_1 := \Big( \tfrac{s}{\sqrt{2}}, 0, \dots, 0 \Big), \quad a_2 := \Big( 0, \tfrac{s}{\sqrt{2}}, \dots, 0 \Big), \quad \dots, \quad a_r := \Big( 0, 0, \dots, \tfrac{s}{\sqrt{2}} \Big).
\]
Note that the distance between any $a_i$ and $a_j$ with $i \ne j$ is $s$, which is the side length of the simplex. Let $c^* = (c_1, \dots, c_r)$ be an optimal 1-median of $A$. Then, the 1-median cost is the following:
\[
\Phi(c^*, A) = \sum_{i=1}^{r} \| a_i - c^* \| = \sum_{i=1}^{r} \Big( \sum_{j=1}^{r} c_j^2 - c_i^2 + \Big( \tfrac{s}{\sqrt{2}} - c_i \Big)^2 \Big)^{1/2}.
\]
Suppose $c_i \ne c_j$ for some $i \ne j$. Then, we can swap $c_i$ and $c_j$ to create a different median, while keeping the 1-median cost the same. This contradicts the fact that there is only one optimal 1-median, by Fact 1. Therefore, we can assume $c^* = (c, c, \dots, c)$. Now, the optimal 1-median cost is:
\[
\Phi^*(A) = \Phi(c^*, A) = r \cdot \sqrt{ \Big( c - \tfrac{s}{\sqrt{2}} \Big)^2 + (r - 1) \cdot c^2 }.
\]
The function $\Phi(c^*, A)$ is strictly convex in $c$ and attains its minimum at $c = \frac{s}{r \sqrt{2}}$, which is the centroid of $A$. The optimal 1-median cost is therefore $\Phi(c^*, A) = s \cdot \sqrt{\frac{r(r-1)}{2}}$. This completes the proof of the lemma.

The following corollary establishes the cost of a star graph $S_i$.

Corollary 9.1. Any star graph $S_i$ with $r$ edges has the optimal 1-median cost of $\sqrt{r(r-1)}$.

Similarly, the point set corresponding to $r$ pairwise vertex-disjoint edges forms a regular simplex in $\mathcal{X}$ of side length 2. The following corollary establishes the cost of such clusters.

Corollary 9.2. Let $F$ be any non-star graph with $r$ pairwise vertex-disjoint edges. Then the optimal 1-median cost of $F$ is $\sqrt{2} \cdot \sqrt{r(r-1)}$.

We now bound the optimal $k$-median cost of $\mathcal{X}$. Let $OPT(\mathcal{X}, k)$ denote the optimal $k$-median cost of $\mathcal{X}$. The following sequence of inequalities proves the first property of Theorem 8.
\[
OPT(\mathcal{X}, k) \;\le\; \sum_{i=1}^{k} \Phi^*(S_i) \;\overset{\text{(Corollary 9.1)}}{=}\; \sum_{i=1}^{k} \sqrt{m_i (m_i - 1)} \;\overset{\text{(Lemma 6)}}{\le}\; \sum_{i=1}^{k} \Big( m_i - \tfrac{1}{2} \Big) = m - \tfrac{k}{2}.
\]

Now, we prove the second property of Theorem 8. For this, we prove the equivalent contrapositive statement: If the optimal $k$-median clustering of $\mathcal{X}$ has cost at most $\big( m - \tfrac{k}{2} + \delta k \big)$, for some constant $\delta > 0$, then $G$ has a vertex cover of size at most $(2 - \varepsilon)k$, for some constant $\varepsilon > 0$. Let $\mathcal{C}$ denote an optimal $k$-median clustering of $\mathcal{X}$. We classify its optimal clusters into two categories: (1) star and (2) non-star. Let $F_1, F_2, \dots, F_t$ denote the non-star clusters, and $S_1, \dots, S_{k-t}$ denote the star clusters. For any star cluster, the vertex cover size is exactly one. Moreover, the optimal 1-median cost of any star cluster with $r$ edges is $\sqrt{r(r-1)}$. On the other hand, the optimal 1-median cost of a non-star cluster $F$ on $r$ edges is given as $\sqrt{r(r-1)} + \delta(F)$. We denote by $\delta(F)$ the extra cost of a cluster $F$. In the entire discussion, we will use $|\cdot|$ to denote the number of edges in a given graph. When used in the context of a set, $|\cdot|$ denotes the cardinality of the given set. Using this, we write:
\[
\delta(F) \equiv \Phi^*(F) - \sqrt{|F| (|F| - 1)}.
\]
In the next two lemmas, we bound the vertex cover size of a non-star cluster $F$ in terms of $\delta(F)$.

Lemma 10. Any non-star cluster $F$ with a maximum matching of size two has a vertex cover of size at most […].62 + ([…] √ […]) $\delta(F)$.

Lemma 11. Any non-star cluster $F$ with a maximum matching of size at least three has a vertex cover of size at most […] ([…] √ […]) $\delta(F)$.

These lemmas are the key to proving the main result. We will prove these lemmas later. First, let us see how they give a vertex cover of size at most $(2 - \varepsilon)k$. Let us classify the star clusters into the following two sub-categories:
(a) Clusters composed of exactly one edge. Let these clusters be $P_1, P_2, \dots, P_{t_1}$.
(b) Clusters composed of at least two edges. Let these clusters be $S_1, S_2, \dots, S_{t_2}$.
Similarly, we classify the non-star clusters into the following two sub-categories:
(i) Clusters with a maximum matching of size two. Let these clusters be $W_1, W_2, \dots, W_{t_3}$.
(ii) Clusters with a maximum matching of size at least three. Let these clusters be $Y_1, Y_2, \dots, Y_{t_4}$.

Note that $t_1 + t_2 + t_3 + t_4$ equals $k$. Now, consider the following strategy for computing the vertex cover of $G$. Suppose we compute the vertex cover for every cluster separately. Let $C_i$ be any cluster, and let $|VC(C_i)|$ denote the vertex cover size of $C_i$. Then, the vertex cover of $G$ can be simply bounded in the following manner:
\[
|VC(G)| \le \sum_{i=1}^{t_1} |VC(P_i)| + \sum_{i=1}^{t_2} |VC(S_i)| + \sum_{i=1}^{t_3} |VC(W_i)| + \sum_{i=1}^{t_4} |VC(Y_i)|.
\]
However, we can obtain a vertex cover of smaller size using a slightly different strategy. In this strategy, we first compute a minimum vertex cover of all the clusters except the single-edge clusters $P_1, P_2, \dots, P_{t_1}$. Suppose that vertex cover is $VC'$. Then we compute a vertex cover for $P_1, P_2, \dots, P_{t_1}$. Now, let us see why this strategy gives a vertex cover of smaller size than before. Note that some vertices in $VC'$ may also cover the edges in $P_1, \dots, P_{t_1}$. Suppose there are $t_1'$ clusters in $P_1, \dots, P_{t_1}$ that remain uncovered by $VC'$. Without loss of generality, assume these clusters to be $P_1, \dots, P_{t_1'}$.
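The two-stage strategy described above can be compared with the cluster-by-cluster strategy on a tiny example. The sketch below is a brute-force illustration under stated assumptions: `min_vertex_cover` enumerates vertex subsets and is only suitable for very small graphs, the function names are hypothetical, and the clusters are made-up data rather than clusters arising from an optimal $k$-median solution.

```python
from itertools import combinations

def min_vertex_cover(edges):
    """Brute-force minimum vertex cover; only suitable for tiny graphs."""
    vertices = sorted({v for e in edges for v in e})
    for size in range(len(vertices) + 1):
        for cand in combinations(vertices, size):
            chosen = set(cand)
            if all(u in chosen or v in chosen for u, v in edges):
                return chosen
    return set(vertices)

def naive_cover_size(clusters):
    """Cover every cluster separately and add up the sizes."""
    return sum(len(min_vertex_cover(c)) for c in clusters)

def two_stage_cover(clusters):
    """First cover all multi-edge clusters, then the still-uncovered single edges."""
    multi = [e for c in clusters if len(c) > 1 for e in c]
    single = [c[0] for c in clusters if len(c) == 1]
    vc_prime = min_vertex_cover(multi)
    uncovered = [(u, v) for (u, v) in single
                 if u not in vc_prime and v not in vc_prime]
    return vc_prime | min_vertex_cover(uncovered)

# Hypothetical clusters: one star with two edges and three single-edge clusters.
clusters = [[(0, 1), (0, 2)], [(0, 3)], [(4, 5)], [(5, 6)]]
all_edges = [e for c in clusters for e in c]
cover = two_stage_cover(clusters)
naive = naive_cover_size(clusters)
```

Here the separate strategy uses one vertex per cluster, four in total, while the two-stage strategy covers the star and the edge $(0, 3)$ with vertex 0 and the two remaining single edges with vertex 5.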
Now, the vertex cover of G is bounded in the following manner: | V C ( G ) | ≤ | V C (cid:16) ∪ t i =1 P i (cid:17) | + | V C | = | V C (cid:16) ∪ t i =1 P i (cid:17) | + | V C (cid:16) ( ∪ t j =1 S j ) ∪ ( ∪ t k =1 W k ) ∪ ( ∪ t l =1 Y l ) (cid:17) |≤ | V C (cid:16) ∪ t i =1 P i (cid:17) | + t X i =1 | V C ( S i ) | + t X i =1 | V C ( W i ) | + t X i =1 | V C ( Y i ) | Now, we will try to bound the size of the vertex cover of P ∪ ... ∪ P t . Note that we can coverall these single-edge clusters with t vertices by choosing one vertex per cluster. However, it maybe possible to obtain a vertex cover of smaller size if we collectively consider all these clusters.Suppose E P denote the set of all edges in P , . . . , P t and V P denote the vertex set spanned bythem. We deﬁne a graph G P = ( V P , E P ). Note that G P is a subgraph of G . Further, we deﬁneanother subgraph G P of G that is obtained after removing the edges of E P from G . That is, G P = ( V, E \ E P ). In other words, G P is the graph spanned by the remaining clusters: S , . . . , S t ; W , . . . , W t ; Y , . . . , Y t ; and P t +1 , . . . , P t . An important property of G P is that any edge of G P does not have its both endpoints in V P . In other words, every edge of G P is incident on at mostone edge of G P . This is because every edge of G P has at least one endpoint in V C , and G P isonly deﬁned on the edge that are not incident on V C . This property will help us in obtaining abetter vertex cover for P , . . . , P t . We bound the size of the vertex cover in the following lemma.10 emma 12. Let δ > be any constant and G P be as deﬁned above. If G P does not have a vertexcover of size ≤ ( t + 8 δk ) , then G has a vertex cover of size at most (2 k − δk ) .Proof. Let M P be a maximal matching of G P . We divide the analysis into two cases based on thesize of M P . In the ﬁrst case, we consider | M P | ≤ ( t / δk ) and show that G P has a vertex coverof size at most (2 t / δk ). 
In the second case, we consider | M P | > ( t / δk ) and show that G has a vertex cover of size at most (2 k − δk ). Let us consider these cases one by one.• Case-I: (cid:0) | M P | ≤ t / δk (cid:1) In this case, we simply cover all edges of G P by picking both endpoints of every edge in M P .Thus, G P can be covered using only (2 t / δk ) vertices.• Case-II: (cid:0) | M P | > t / δk (cid:1) In this case, we will incrementally construct a vertex cover

V C G of the graph G of size atmost (2 k − δk ). First, let us discuss the main idea of this incremental construction. Duringthe construction, we maintain a maximal matching M G of the graph G . Then, for every edgein M G , we add both its endpoints to V C G , except at least 2 δk edges for which we choose onlyone endpoint in V C G . Suppose, V C G covers all edges in G . Then, we can claim that V C G hasa size at most (2 k − δk ). Note that for the hard Vertex Cover instances, we can assume thatmaximum matching size is at most k . This is because the graphs with a matching of size > k have a minimum vertex cover of size > k . Therefore, such instances can be simply classiﬁedas “No” instances in polynomial time. Since the size of a maximal matching is always lessthan the size of a maximum matching, we can further assume | M G | ≤ k for the hard VertexCover instances. This implies that

V C G has a size at most 2( k − δk ) + 2 δk = (2 k − δk ).Now, let us discuss the construction of such a matching M G and correspondingly the vertexcover V C G .Initially, both M G and V C G are empty sets. That is, M G = ∅ and V C G = ∅ . Recall that G P is the graph spanned by E P and G P is the graph spanned by E \ E P . Let E I denotethe set of edges in G P that are incident on M P . Based on this, we deﬁne two new graph:(1) G R := ( V, E R ) where E R = E P ∪ E I , and (2) G := ( V, E ) where E = E \ ( E P ∪ E I ).In other words, G R is the graph spanned by the edges of G P and the edges of G P that areincident on M P ; and G is the graph spanned by the edges of G P that are not incident on M P . Now, we compute a maximal matching M of G , and execute the following procedure: Procedure 1 (1) M G ← M (2) For every edge e ≡ ( u, v ) ∈ M G :(3) V C G ← V C G ∪ { u, v } (4) Update G R by removing all the edges in them that are incident on u and v Figure 1
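The matching-based covering step used throughout this case can be sketched in a few lines. This is a minimal illustration, not the paper's Procedure 1 itself: the greedy matching routine and the example graph are assumptions introduced here. It shows only the basic fact the construction relies on, namely that taking both endpoints of every edge of a maximal matching $M$ yields a vertex cover of size $2|M|$.

```python
def maximal_matching(edges):
    """Greedily build a maximal matching: keep an edge if both endpoints are still free."""
    matched = set()
    matching = []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


def cover_from_matching(matching):
    """Take both endpoints of every matching edge."""
    return {w for edge in matching for w in edge}


def is_vertex_cover(edges, cover):
    """Check that every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u, v in edges)


# Hypothetical example graph: the path 0-1-2-3-4 plus a pendant edge 2-5.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)]
matching = maximal_matching(edges)
cover = cover_from_matching(matching)
print(matching, sorted(cover), is_vertex_cover(edges, cover))
```

The saving in the proof comes from the edges of the matching for which only one endpoint is needed; the sketch above covers only the baseline where both endpoints are taken.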

Note that the above procedure removes every edge of G since M is a maximal matching of G . Now, we will ﬁnd a vertex cover and maximal matching of updated G R . Note that anymaximal matching of G R can be combined with that of M G to form a maximal matching ofthe original graph G since we already removed the edges which were incident on M G .Note that G R is composed of the edge sets E P and a subset of edges from E I . Therefore, G P is also a subgraph of G R . Recall that M P is a maximal matching of G P . We deﬁne a new11dge set U P that denote the set of unmatched edges in G P , i.e., U P = E P \ M P . Note that | U P | ≤ t / − δk since | M P | > t / δk . Moreover, every edge in U P is incident on M P ,since M P is a maximal matching of G P . In the next two procedures, we remove some edgesfrom U P and M P such that the updated U P only contains those edges that are incident onone edge of the updated M P . Then, we will use this graph to obtain a vertex cover of G ofsize at most 2 k − δk . Following is the ﬁrst procedure: Procedure 2 (1) while there is an edge e ≡ ( u, v ) ∈ M P that is incident on two edges of U P (2) M G ← M G ∪ { e } (3) V C G ← V C G ∪ { u, v } (4) Update G R , M P , and U P by removing all the edges in them that areincident on u and v Figure 2

Note that before the beginning of this procedure there were at most ≤ t / − δk edges in U P and at least t / δk edges in M P . Then, the above procedure removes at least twoedges from U P and one edge from M P in each iteration of the while loop. Therefore, at theend of the procedure at least 6 δk edges remain in M P . Moreover, at the end of the procedure, M P has the property that no two edges in U P are incident on the same edge of M P . Next,consider the following procedure: Procedure 3 (1) while there is an edge e ≡ ( u, v ) ∈ U P that is incident on two edges e , e ∈ M P (2) Arbitrarily pick one edge from { e , e } . W.l.o.g., let e ≡ ( u, v ) be that edge.(3) M G ← M G ∪ { e } (4) V C G ← V C G ∪ { u, v } (5) Update G R , M P , and U P by remove all the edges in them that areincident on u and v Figure 3

Let p ≥ δk denote the number of edges in M P , before the beginning of the above procedure.Now, observe that the above procedure removes one edge from U P and one edge from M P in each iteration of the while loop. Furthermore, the while loop executes at most p/ M P had the property that two edges of U P do not incident on the same edge of M P .Therefore, at the end of the procedure, M P contains at least p/ ≥ δk edges. Moreover, U P has obtained the property that all its edges are incident on exactly one edge of M P . Now,recall that all edges in E I are also incident on exactly one edge of M P . We proved thisproperty earlier (just before we stated Lemma 12) for every edge of G p that was incident on G p . Therefore, at this point, G R consists of the edge set M P of size at least 3 δk , and theedges that are incident on exactly one edge of M P . Now, we will ﬁnd a maximal matchingand vertex cover of the remaining graph G R .We color the edges of M P with red color and the remaining edges of G R with blue color.Now, let us deﬁne the concept of “ plank edge ”. A plank edge is red edge e ≡ ( u, v ) ∈ M P thatsatisﬁes the following two conditions. The ﬁrst condition is that at least one blue edge in G R is incident on u and at least one incident on v . Let e u and e v denote the blue edges incident12n u and v , respectively. The second condition is that e u and e v should be vertex disjointfrom every edge of M G (with respect to the current set M G which keeps getting updated). Inother words, we should be able to add e u and e v to M G . Note that e u and e v do not shareany common vertex; otherwise it would form a triangle. Therefore, we can add both of themto M G . Now, we complete the construction of the maximal matching M G using the followingprocedure. 
Procedure 4
(1) $T \leftarrow \emptyset$
(2) $M_Y \leftarrow \emptyset$ (this variable accumulates the plank edges)
(3) $M_N \leftarrow M_P$ (this variable holds the non-plank edges)
(4) while there is a plank edge $e \equiv (u, v) \in M_N$:
(5)     $M_Y \leftarrow M_Y \cup \{e\}$
(6)     $M_N \leftarrow M_N \setminus \{e\}$
(7)     $T \leftarrow T \cup \{e_u, e_v\}$
(8)     $M_G \leftarrow M_G \cup \{e_u, e_v\}$
(9) $M_G \leftarrow M_G \cup M_N$
Figure 4

Note that the above procedure adds one edge in M Y and two edges in T in every iteration ofthe while loop. Therefore, | T | = 2 · | M Y | . Also, note that T ⊆ M G . Now, we complete theconstruction of the vertex cover V C G . We consider two sub-cases based on the size of M Y .And, for each of the sub-cases, we construct the vertex cover separately.1. Sub-case: ( | M Y | ≥ δk )For every edge in M P , we simply add both its endpoints to V C G . This completes theconstruction of V C G . It covers all edges of G R since all edges are incident on some edgeof M P . Let us compute the size of V C G . Note that for every edge in M Y ⊆ M P , thereare two edges in T ⊆ M G . Therefore, the size of vertex cover is: | V C G | = 2 | M G | − | T | = 2 | M G | − | M Y | ≤ | M G | − δk ≤ k − δk

2. Sub-case: ( | M Y | < δk )For this sub-case, we construct the vertex cover in the following manner. For every edgein T , we add its both endpoints to V C G . And, we remove all the edges in G R coveredby them. The remaining graph contains the set M N , and some blue edges incident on it.Since M N is deﬁned on non-plank edges, the remaining blue edges can not incident onboth endpoints of any edge of M N . Now, for every edge in M N , we pick its that endpointin the vertex cover that it shares with the blue edges incident on it. It completes theconstruction of V C G , and it covers all edges of G R . Note that for every edge in M G , weadded its both endpoints to V C G except the edges that came from M N for which, wejust added one endpoint in V C G . Also, note that | M N | = | M P |−| M Y | > δk − δk = 2 δk .Therefore, the size of the vertex cover is: | V C G | = 2 | M G | − | M N | < | M G | − δk ≤ k − δk Hence, we have a vertex cover of size at most 2 k − δk . This completes the proof of the lemma.13ased on the above lemma, we will assume that all single edge clusters can be covered with ( t +8 δk ) ≤ ( t + 8 δk ) vertices; otherwise the graph has a vertex cover of size at most (2 k − δk ) andthe soundness proof would be complete. Now, we bound the vertex cover of the entire graph in thefollowing manner. | V C ( G ) | ≤ | V C (cid:16) ∪ t i =1 P i (cid:17) | + | V C | = | V C (cid:16) ∪ t i =1 P i (cid:17) | + | V C (cid:16) ( ∪ t j =1 S j ) ∪ ( ∪ t k =1 W k ) ∪ ( ∪ t l =1 Y l ) (cid:17) |≤ t X i =1 | V C ( P i ) | + t X i =1 | V C ( S i ) | + t X i =1 | V C ( W i ) | + t X i =1 | V C ( Y i ) |≤ (cid:18) t δk (cid:19) + t + t X i =1 (cid:16)(cid:16) √ (cid:17) δ ( W i ) + 1 . (cid:17) + t X i =1 (cid:16)(cid:16) √ (cid:17) δ ( Y i ) + 1 . (cid:17) , (using Lemmas 10, 11, and 12)= (0 . t + 8 δk + t + (1 . t + (1 . t + (cid:16) √ (cid:17) t X i =1 δ ( W i ) + t X i =1 δ ( Y i ) ! Since the optimal cost

OP T ( X , k ) = k X j =1 q m j ( m j −

1) + t X i =1 δ ( W i ) + t X i =1 δ ( Y i ) ≤ m − k/ δk , we get t X i =1 δ ( W i ) + t X i =1 δ ( Y i ) ≤ m − k/ δk − k X j =1 q m j ( m j − | V C ( G ) | ≤ (0 . t + 8 δk + t + (1 . t + (1 . t + (cid:16) √ (cid:17) ·  m − k/ − k X j =1 q m j ( m j −

1) + δk  Using Lemma 6, we obtain the following inequalities:1. For P j , q | P j | ( | P j | − ≥ | P j | − | P j | = 12. For S j , q | S j | ( | S j | − ≥ | S j | − (2 − √

2) since | S j | ≥

23. For W j , q | W j | ( | W j | − ≥ | W j | − (2 − √

2) since | W j | ≥

24. For Y j , q | Y j | ( | Y j | − ≥ | Y j | − (3 − √

6) since | Y j | ≥ | V C ( G ) | ≤ (0 . t + 8 δk + t + (1 . t + (1 . t + (cid:16) √ (cid:17) ·  m − k/ − t X j =1 ( | P j | −

1) + − t X j =1 (cid:16) | S j | − (2 − √ (cid:17) − t X j =1 (cid:16) | W j | − (2 − √ (cid:17) − t X j =1 (cid:16) | Y j | − (3 − √ (cid:17) + δk  | G | = t X j =1 | P j | + t X j =1 | S j | + t X j =1 | W j | + t X j =1 | Y j | , we get the followinginequality: | V C ( G ) | ≤ (0 . t + 8 δk + t + (1 . t + (1 . t + (cid:16) √ (cid:17) · − k/ t + t · (cid:16) − √ (cid:17) ++ t · (cid:16) − √ (cid:17) + t · (cid:16) − √ (cid:17) + δk ! We substitute k = t + t + t + t , and obtain the following inequality: | V C ( G ) | ≤ (0 . t + 8 δk + t + (1 . t + (1 . t + (cid:16) √ (cid:17) · (cid:18) t t

10 + t

10 + 3 t

50 + δk (cid:19) = (1 . t + (1 . t + (1 . t + (1 . t + (cid:16) √ (cid:17) δk< (1 . k + (cid:16) √ (cid:17) δk (using t + t + t + t = k ) ≤ (2 − ε ) k, for appropriately small constants ε, δ > In this section, we bound the vertex cover size of any non-star graph F . We aim to obtain thisbound in terms of δ ( F ), i.e., the extra cost of the graph F . To do so, we require a bound on theextra cost. The 1-median cost of an arbitrary non-star graph is tricky to compute. Fortunately,we do not require the exact optimal cost of a graph; a lower bound on the optimal cost suﬃces.Furthermore, for some graph instances we can compute the exact optimal cost. For example, a stargraph with r edges corresponds to a regular simplex of size length s that has the optimal 1-mediancost: s · q r ( r − . We proved this earlier in Lemma 9. For the more complex graph instances, weuse the following decomposition lemma to bound their optimal cost. Lemma 13 (Decomposition lemma) . Let G = ( V, E ) be any graph and let E , ..., E t be any partitionof edges and let G , G , ..., G t be the subgraphs induced by these edges respectively. The followinginequality bounds the optimal cost of G in terms of the optimal costs of subgraphs G , ..., G t . Φ ∗ ( G ) ≥ Φ ∗ ( G ) + . . . + Φ ∗ ( G t ) Proof.

Let $f^*$ be an optimal median of $X(G)$. For any edge $e \in E$, let $x_e$ denote the corresponding point in $X(G)$. The proof follows from the following sequence of inequalities:

$$\Phi^*(G) = \sum_{e \in E} d(x_e, f^*) = \sum_{i=1}^{t} \sum_{e \in E_i} d(x_e, f^*) \ge \sum_{i=1}^{t} \Phi^*(G_i).$$

The last step holds because $f^*$ is a feasible, though not necessarily optimal, center for each point set $X(G_i)$.

If we can compute the optimal cost of each subgraph, then we can lower bound the overall cost of the graph using the above decomposition lemma. In general, a graph may be decomposed in various ways; we prefer the decomposition that gives the best bound on the optimal cost. In the next subsection, we use this decomposition lemma to bound the extra cost of any non-star graph. We will use that bound in subsequent subsections to prove Lemmas 10 and 11.

4.1 1-Median Cost of Non-Star Graphs

In this section, we show that any non-star graph on $m$ edges has optimal 1-median cost at least $m - 0.342$, which is at least $\sqrt{m(m-1)} + 0.158$. Recall that the hard Vertex Cover instances are triangle-free. Therefore, we do not explicitly mention the "triangle-freeness" of graphs whenever we state a lemma related to graphs.

To obtain a lower bound on the optimal cost, we decompose a graph into so-called "fundamental non-star graphs". We will show this decomposition process later. For now, we first bound the 1-median cost of these fundamental graphs. We then bound the cost of any graph using the decomposition lemma. Here is the formal description of fundamental graphs.

Definition 4.1 (Fundamental Non-Star Graph). A fundamental non-star graph is a graph that becomes a star graph when any pair of vertex-disjoint edges is removed from it. A graph that consists of only two vertex-disjoint edges is also a fundamental non-star graph.
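The definition can be checked mechanically on small graphs. The following is a brute-force sketch under stated assumptions: graphs are given as lists of vertex pairs, the input is triangle-free (as the paper assumes for the hard instances), and an empty edge set is treated as a trivial star so that a pair of vertex-disjoint edges qualifies, matching the second sentence of the definition. The function names are introduced here for illustration.

```python
from itertools import combinations


def is_star(edges):
    """True if all edges share one common vertex (an empty edge set counts as a star here)."""
    if not edges:
        return True
    common = set(edges[0])
    for e in edges[1:]:
        common &= set(e)
    return bool(common)


def is_fundamental_non_star(edges):
    """Non-star graph that becomes a star after removing ANY pair of vertex-disjoint edges."""
    edges = list(edges)
    if is_star(edges):
        return False
    for e, f in combinations(edges, 2):
        if set(e) & set(f):
            continue  # not vertex-disjoint, so this pair is irrelevant
        rest = [g for g in edges if g != e and g != f]
        if not is_star(rest):
            return False
    return True


# Hypothetical small instances of the graph families discussed in this section.
three_disjoint_edges = [(0, 1), (2, 3), (4, 5)]
a_2 = [(0, 1), (0, 2), (3, 4)]           # star with 2 edges plus a disjoint edge
l_2 = [(0, 1), (0, 2), (0, 3), (3, 4)]   # star, bridge edge (0, 3), one pendant on vertex 3
path_4 = [(0, 1), (1, 2), (2, 3), (3, 4)]

for name, g in [("3 disjoint edges", three_disjoint_edges), ("A_2", a_2),
                ("L_2", l_2), ("path with 4 edges", path_4)]:
    print(name, is_fundamental_non_star(g))
```

The path with four edges fails the test because removing its first and third edges leaves two vertex-disjoint edges, which is the same obstruction used in the case analysis of Lemma 14.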

Figure 5: Fundamental non-star graphs: 3-$P_2$, $A_n$, and $L_n$.

The following lemma shows that there are exactly three types of fundamental non-star graphs. These are shown in Figure 5.

Lemma 14. There are only three types of fundamental non-star graphs: 3-$P_2$, $A_n$, and $L_n$, for $n \ge 1$. These are shown in Figure 5.

Proof. It is easy to see that the three types of graphs 3-$P_2$, $A_n$, and $L_n$ shown in Figure 5 are fundamental non-star graphs. We need to argue that these are the only fundamental non-star graphs. For this we do a case analysis. Let $M$ denote a maximum matching of any graph $G = (V, E)$. We divide the analysis into three cases based on the size of the maximum matching.

1. Case 1: $|M| \ge 4$.
Remove any pair of edges of $M$. Then, the remaining graph is still a non-star graph due to a matching of size at least two. Therefore, any such graph cannot be a fundamental non-star graph.

2. Case 2: $|M| = 3$.
Let $U$ denote the set of unmatched edges in the graph, i.e., $U = E \setminus M$. Suppose $U \neq \emptyset$, and let $l$ be any edge in $U$. Observe that $l$ can be incident on at most two edges of $M$. Since $M$ has size exactly three, there is always an edge in $M$ that is vertex-disjoint from $l$. Let this edge be $m_e \in M$. The set of edges $\{l, m_e\}$ forms a 2-$P_2$ (i.e., two vertex-disjoint edges), and we can remove it from $G$. The remaining graph still has a matching of size at least two. Therefore, such a graph cannot be a fundamental non-star graph. Now, let us consider the case when $U = \emptyset$. In this case, the graph is equivalent to a 3-$P_2$.

3. Case 3: $|M| = 2$.
Let $e_1 := (u_1, v_1)$ and $e_2 := (u_2, v_2)$ be the matching edges in $M$. Let $U = E \setminus M$ denote the set of unmatched edges. If $U$ forms a non-star graph, we can remove a pair of vertex-disjoint edges from it. The remaining graph is a non-star graph due to $|M| = 2$. Therefore, any graph with $|M| = 2$ and $U$ a non-star graph cannot be a fundamental non-star graph. So let us consider the case when $U$ forms a star graph. Suppose all edges of $U$ share a common vertex $w$. There are two possibilities: $w$ is an endpoint of some matching edge or not. Let us consider these two possibilities one by one.

(a) Subcase: $w$ is an endpoint of some matching edge.
Without loss of generality, we can assume $w \equiv u_1$. Now, no two edges of $U$ can be incident on $u_2$ and $v_2$; otherwise, they would form a triangle $(w, u_2, v_2)$. Therefore, at most one edge of $U$ can be incident on either $u_2$ or $v_2$. If such an edge exists, without loss of generality, we can assume that it is incident on $u_2$. This graph is of type $L_n$. On the other hand, if no edge of $U$ is incident on either $u_2$ or $v_2$, the graph is of type $A_n$.

(b) Subcase: $w$ is not an endpoint of any matching edge.
If $|U| = 1$, the graph is simply $A_2$. Now, let us assume that $|U| \ge 2$. Let $l_1$ and $l_2$ be any two edges in $U$. The edges $l_1$ and $l_2$ cannot be incident on the same matching edge, say $(u_1, v_1)$; otherwise they form a triangle $(w, u_1, v_1)$. Also, note that every edge $e \in U$ must be incident on some matching edge; otherwise $M \cup \{e\}$ would form a matching of size greater than $|M|$, contradicting that $M$ is a maximum matching. Therefore, without loss of generality, we can assume that $l_1$ is incident on $u_1$, and $l_2$ is incident on $u_2$. Now, this forms a path of length four: $(v_1, u_1, w, u_2, v_2)$. We can always remove an alternating pair of edges from the path, and the resulting graph still contains a pair of vertex-disjoint edges, i.e., a 2-$P_2$. Therefore, any such graph cannot be a fundamental non-star graph.

From the above case analysis we conclude that all fundamental non-star graphs are of type 3-$P_2$, $A_n$, or $L_n$. This completes the proof of the lemma.

Next, we bound the optimal 1-median cost of each fundamental non-star graph. In this discussion, we will use $r$ to denote the number of edges in the various cases.

Lemma 15.

Let $r := 3$ denote the number of edges in 3-$P_2$. The optimal cost of 3-$P_2$ is at least $(r + 0.46)$.

Proof. It is easy to see that $X(3\text{-}P_2)$ forms a simplex of side length 2. Therefore, we can use Corollary 9.2 to compute its optimal 1-median cost:

$$\Phi^*(3\text{-}P_2) = \sqrt{2} \cdot \sqrt{3(3-1)} = 2\sqrt{3} \ge r + 0.46 \quad (\text{since } r = 3).$$

Lemma 16.

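Let $r := n + 1$ denote the number of edges in $A_n$. The optimal cost of $A_n$ is at least:
1. $r$, for $n \ge 1$.
2. $r + 0.095$, for $n \ge 2$.
3. $r + 0.135$, for $n \ge 3$.

Proof. Consider the point set $X(A_n)$. It forms a simplex with $(r - 1)$ points at a distance of $\sqrt{2}$ from each other, and one remaining point at a distance of 2 from each of them. Based on Fact 2, we represent the coordinates of the simplex in the following way:

$$a_1 = (1, 0, \dots, 0),\quad a_2 = (0, 1, \dots, 0),\quad \dots,\quad a_{r-1} = (0, \dots, 0, 1),\quad a_r = (u, \dots, u).$$

Here $u = \frac{1}{r-1} + \sqrt{\frac{1}{(r-1)^2} + \frac{3}{r-1}}$, which places $a_r$ at distance 2 from every other point. Let $(c_1, \dots, c_{r-1})$ be an optimal 1-median of $S = \{a_1, \dots, a_r\}$. If $c_i \neq c_j$ for any $i \neq j$, we can swap $c_i$ and $c_j$ to create a different median with the same 1-median cost. Since the 1-median is always unique for a set of non-collinear points (by Fact 1), we assume $c^* = (c, \dots, c)$ as the optimal median. Then, the optimal 1-median cost is:

$$\Phi(c^*, S) = \Phi(c^*, a_r) + \sum_{i=1}^{r-1} \Phi(c^*, a_i) = (u - c) \cdot \sqrt{r-1} + (r-1) \cdot \sqrt{(1-c)^2 + (r-2)\,c^2}.$$

The function is strictly convex and attains its minimum at $c = \frac{\sqrt{r} + 1}{\sqrt{r} \cdot (r-1)}$. We get the following optimal cost on substituting the values of $u$ and $c$ in the previous equation:

$$\Phi(c^*, S) = \sqrt{r(r-1)} + \sqrt{3 + \tfrac{1}{r-1}} - \sqrt{\tfrac{r}{r-1}} = \sqrt{r(r-1)} + \frac{2}{\sqrt{3 + \frac{1}{r-1}} + \sqrt{\frac{r}{r-1}}}.$$

The denominator of the last fraction decreases as $r$ grows. Hence, for $r \ge t$, we get:

$$\Phi(c^*, S) \ge \sqrt{r(r-1)} + \frac{2}{\sqrt{3 + \frac{1}{t-1}} + \sqrt{\frac{t}{t-1}}} \quad (\because r \ge t)$$
$$= \sqrt{r(r-1)} + \sqrt{3 + \tfrac{1}{t-1}} - \sqrt{\tfrac{t}{t-1}}$$
$$\ge r - \left(t - \sqrt{t(t-1)}\right) + \sqrt{3 + \tfrac{1}{t-1}} - \sqrt{\tfrac{t}{t-1}} \quad (\text{using Lemma 6}).$$

Substituting $t = 2$, we get: $\Phi(c^*, S) \ge r - (2 - \sqrt{2}) + \sqrt{4} - \sqrt{2} \ge r$.
Substituting $t = 3$, we get: $\Phi(c^*, S) \ge r - (3 - \sqrt{6}) + \sqrt{7/2} - \sqrt{3/2} > r + 0.095$.
Substituting $t = 4$, we get: $\Phi(c^*, S) \ge r - (4 - \sqrt{12}) + \sqrt{10/3} - \sqrt{4/3} > r + 0.135$.
This completes the proof of the lemma.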
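The bounds of Lemma 16 can be sanity-checked numerically. The sketch below rests on two assumptions that are not stated in this passage: that $X(G)$ maps each edge to the 0/1 indicator vector of its two endpoints (which reproduces the distances $\sqrt{2}$ and 2 used in the proof), and that Weiszfeld's iteration is an acceptable way to approximate the 1-median; the solver is our choice, not the paper's. The helper names are hypothetical. The closed form compared against is $\sqrt{r(r-1)} + \sqrt{3 + 1/(r-1)} - \sqrt{r/(r-1)}$.

```python
import math


def edge_points(edges):
    """Assumed embedding: each edge becomes the 0/1 indicator vector of its endpoints."""
    vertices = sorted({v for e in edges for v in e})
    index = {v: i for i, v in enumerate(vertices)}
    points = []
    for u, v in edges:
        p = [0.0] * len(vertices)
        p[index[u]] = p[index[v]] = 1.0
        points.append(p)
    return points


def one_median_cost(points, iters=2000):
    """Approximate the optimal 1-median cost with Weiszfeld's iteration."""
    dim = len(points[0])
    # Start from the centroid of the point set.
    c = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    for _ in range(iters):
        # Guard against division by zero if the iterate lands on a data point.
        w = [1.0 / max(math.dist(p, c), 1e-12) for p in points]
        total = sum(w)
        c = [sum(wi * p[i] for wi, p in zip(w, points)) / total for i in range(dim)]
    return sum(math.dist(p, c) for p in points)


def a_n_edges(n):
    """A_n: a star with n edges centered at vertex 0, plus one vertex-disjoint edge."""
    return [(0, i) for i in range(1, n + 1)] + [(n + 1, n + 2)]


def closed_form(r):
    """Closed-form optimal cost of A_n with r = n + 1 edges."""
    return math.sqrt(r * (r - 1)) + math.sqrt(3 + 1 / (r - 1)) - math.sqrt(r / (r - 1))


for n in range(1, 6):
    r = n + 1
    cost = one_median_cost(edge_points(a_n_edges(n)))
    print(n, round(cost, 4), round(closed_form(r), 4))
```

Because the cost at any candidate center upper-bounds the optimum, the numerical values can only confirm the lower bounds when the iteration has converged; agreement with the closed form is the evidence of convergence here.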

Lemma 17.

Let $r := n + 2$ denote the number of edges in $L_n$. Then the optimal cost of $L_n$ is at least:
1. $r - 0.268$, for $n = 1$.
2. $r - 0.334$, for $n = 2$.
3. $r - 0.342$, for $n \ge 3$.

Proof. Let us prove the first statement for $r = 3$, which corresponds to the graph being $L_1$ (i.e., $n = 1$). In $X(L_1)$, there are two points at a distance of 2 from each other, and the third point at a distance of $\sqrt{2}$ from the other two. These points form a simplex $S$ of dimension two. Based on Fact 2, we represent the coordinates of the simplex in the following way:

$$a_1 = (0, 0),\quad a_2 = (\sqrt{2}, 0),\quad a_3 = (0, \sqrt{2}).$$

Note that the pairwise distances are preserved in this representation. Let $(c_1, c_2)$ be the optimal 1-median of $S$. If $c_i \neq c_j$ for any $i \neq j$, we can swap $c_i$ and $c_j$ to create a different median with the same 1-median cost. Therefore, we consider $c^* := (c, c)$ as the optimal 1-median of $S$. Then, we get the following optimal 1-median cost of $S$:

$$\Phi(c^*, S) = \Phi(c^*, a_1) + \Phi(c^*, a_2) + \Phi(c^*, a_3) = \sqrt{2} \cdot c + 2 \cdot \sqrt{\left(c - \sqrt{2}\right)^2 + c^2}.$$

The function is strictly convex and attains its minimum at $c = \sqrt{1/2} - \sqrt{1/6}$. Substituting the value of $c$ in $\Phi(c^*, S)$, we get $\Phi(c^*, S) = 1 + \sqrt{3} > r - 0.268$ for $r = 3$. This completes the proof of the first statement.

Let us prove the second statement. Here, we have $r = 4$ (or $n = 2$). We create three copies of $L_2$ (i.e., 3-$L_2$), and decompose them into three subgraphs: 2-$P_4$, $A_2$, and $S_3$. The decomposition is shown in Figure 6. Note that $P_4$ is the same as $L_1$. There are also other ways to decompose the graph. However, some of those decompositions give a weak bound on the optimal cost, whereas this decomposition gives a sufficiently good bound on the optimal cost of $L_2$.

Figure 6: Decomposition of 3-$L_2$.

Let $c^*$ be the optimal 1-median for $L_2$. Based on the decomposition, we bound the optimal 1-median cost of $L_2$ as follows:

$$3 \cdot \Phi^*(L_2) \ge \Phi^*(A_2) + 2 \cdot \Phi^*(P_4) + \Phi^*(S_3) \qquad (1)$$

We already have the bounds on the optimal costs of $A_2$, $P_4$, and $S_3$. That is,
• For $A_2$, we have $\Phi^*(A_2) \ge 3.095$, by Lemma 16.
• For $P_4$, we have $\Phi^*(P_4) \ge |P_4| - 0.268 = 2.732$, by the first statement. Note that $P_4$ is the same as $L_1$, and that the number of edges in $P_4$, denoted by $|P_4|$, equals 3.
• For $S_3$, we have $\Phi^*(S_3) = \sqrt{3(3-1)} = \sqrt{6}$. This follows from Corollary 9.1, for $r = 3$.

Substituting the above values in Equation (1), we get the following inequality:

$$3 \cdot \Phi^*(L_2) \ge 3.095 + 2 \cdot (2.732) + \sqrt{6} > 11.$$

Therefore, the optimal 1-median cost of $L_2$ is at least $11/3 \ge 3.666 = |L_2| - 0.334$.

Now, let us prove the third statement. Here, we have $r \ge 5$ (or $n \ge 3$). We create two copies of $L_n$, and decompose them into three subgraphs: $A_n$, $S_n$, and $P_4$. The decomposition is shown in Figure 7. Again, note that there are many ways to decompose the graph. However, those decompositions may yield a weak bound on the optimal cost, whereas this decomposition gives a sufficiently good bound on the optimal 1-median cost of $L_n$.

Figure 7: Decomposition of 2-$L_n$ for $n \ge 3$.

Let $c^*$ be the optimal 1-median for $L_n$. Based on the decomposition, we bound its optimal 1-median cost in the following manner:

$$2 \cdot \Phi^*(L_n) \ge \Phi^*(A_n) + \Phi^*(S_n) + \Phi^*(P_4) \qquad (2)$$

We already know the bounds on the optimal costs of $A_n$, $S_n$, and $P_4$. That is,
• For $A_n$, we have $\Phi^*(A_n) \ge (n + 1) + 0.135$, by Lemma 16 ($\because n \ge 3$).
• For $S_n$, we have $\Phi^*(S_n) = \sqrt{n(n-1)} \ge n - (3 - \sqrt{6})$, using Lemma 6 since $n \ge 3$.
• Since $P_4$ is the same as $L_1$, we have $\Phi^*(P_4) \ge |P_4| - 0.268$.

Substituting these bounds in Equation (2) and using $|P_4| = 3$, we obtain the following inequality:

$$2 \cdot \Phi^*(L_n) \ge (2n + 1 + |P_4|) + (0.135 + \sqrt{6} - 3 - 0.268) \ge 2 \cdot (n + 2) - 0.684.$$

Therefore, the optimal 1-median cost of $L_n$ is at least $(n + 2) - 0.342 = |L_n| - 0.342$. This completes the proof of the lemma.

Corollary 17.1.

The cost of any fundamental non-star graph with $r$ edges is at least $r - 0.342$.

Proof. The proof simply follows from Lemmas 15, 16, and 17.

We will now bound the cost of any non-star graph by decomposing it into fundamental non-star graphs. For this, we define the concept of a "safe pair". A safe pair is a pair of vertex-disjoint edges in the graph such that when we remove it from the graph, the remaining graph remains a non-star graph. Let us see why such a pair is important. First, note that the optimal cost of a 2-$P_2$ is exactly two, using Corollary 9.2. Suppose we remove a 2-$P_2$ from the graph $F$. Let the remaining graph be $F'$. Suppose $\Phi^*(F') = |F'| + \gamma$, where $\gamma$ is some constant. Then it is easy to see that $\Phi^*(F) \ge \Phi^*(F') + \Phi^*(2\text{-}P_2) = |F'| + \gamma + 2 = |F| + \gamma$. Note that the value of $\gamma$ is preserved in this decomposition. Suppose we keep removing a safe pair from $F$ until we obtain a graph that does not contain any safe pair. Then the remaining graph is simply a fundamental non-star graph, by the definition of a fundamental non-star graph. Moreover, we showed earlier that the optimal cost of any fundamental non-star graph with $r$ edges is at least $r - 0.342$, i.e., $\gamma \ge -0.342$. Therefore, $F$ has optimal cost at least $|F| - 0.342$. The decomposition procedure is given in Figure 8.

Decompose($F$)
Input: A non-star graph $F$.
Output: A fundamental non-star graph $D \in \{3\text{-}P_2, A_n, L_n\}$
(1) $D \leftarrow F$
(2) while $D \notin \{3\text{-}P_2, A_n, L_n\}$:
(3)     Let $\{e, e'\} \subseteq E(D)$ be a safe pair
(4)     $E(D) \leftarrow E(D) \setminus \{e, e'\}$
(5) return $D$

Figure 8: Decomposition of any non-star graph $F$ into the fundamental non-star graphs.

The above discussion is formalised as the next lemma.

Lemma 18. The cost of any non-star graph $F$ is at least $(|F| - 0.342) \ge \sqrt{|F|(|F| - 1)} + 0.158$.

Proof. Suppose the procedure Decompose($F$) runs the while loop $t$ times. This means that $F$ is composed of $t$ safe pairs and a fundamental non-star graph $D \in \{3\text{-}P_2, A_n, L_n\}$. We call $D$ the residual graph of $F$. Note that $D$ has exactly $|F| - 2t$ edges. Also, note that 2-$P_2$ is a fundamental non-star graph since it is the same as $A_1$. Based on the decomposition of $F$ into 2-$P_2$'s and $D$, we bound the optimal cost of $F$ as follows:

$$\Phi^*(F) \ge t \cdot \Phi^*(2\text{-}P_2) + \Phi^*(D)$$
$$= 2t + \Phi^*(D) \quad (\text{using Corollary 9.2})$$
$$\ge 2t + (|F| - 2t) - 0.342 \quad (\text{using Corollary 17.1})$$
$$= |F| - 0.342$$
$$> \sqrt{|F|(|F| - 1)} + 1/2 - 0.342 \quad (\text{using Lemma 6})$$
$$= \sqrt{|F|(|F| - 1)} + 0.158.$$

For integers $p, q \ge 1$, let us define a new graph $L_{p,q}$. This graph is composed of two star graphs $S_p$ and $S_q$, such that there is an edge between the center vertices of $S_p$ and $S_q$. Here, the center vertex is the vertex that is the common endpoint of all edges in a star graph, and the remaining vertices are called pendent vertices. Let $s_1$ and $s_2$ denote the center vertices of $S_p$ and $S_q$ respectively. We call the edge $(s_1, s_2)$ the bridge edge, and the graph $L_{p,q}$ the bridge graph. Also, we call the pendent vertices of $S_p$ and $S_q$ the left and right pendent vertices respectively. Please see Figure 9 for the pictorial depiction of $L_{p,q}$. Note that when $p = n$ and $q = 1$, the bridge graph is the same as $L_n$.

Figure 9: A Bridge Graph: $L_{p,q}$, for $p, q \ge 1$.

Lemma 19.

Suppose $F$ is a non-star non-bridge graph. Then $F$ has optimal 1-median cost at least $|F| \ge \sqrt{|F|(|F| - 1)} + 0.5$.

Proof. Here, we need to define the new concept of an "ultra-safe" pair. An ultra-safe pair is a pair of vertex-disjoint edges such that removing it from the graph does not make the resulting graph a star or a bridge graph. We decompose $F$ in a similar way as we did before. However, instead of removing a safe pair from the graph, we remove an ultra-safe pair in every iteration of the while loop. We decompose $F$ using the following procedure.

UltraDecompose($F$)
Input: A non-star non-bridge graph $F$.
Output: A fundamental non-star graph $D \in \{3\text{-}P_2, A_n, L_n\}$
(1) $D \leftarrow F$
(2) while $D \notin \{3\text{-}P_2, A_n, L_n\}$:
(3)     Let $\{e, e'\} \subseteq E(D)$ be an ultra-safe pair
(4)     $E(D) \leftarrow E(D) \setminus \{e, e'\}$
(5) return $D$

Figure 10: Decomposition of a non-star non-bridge graph $F$ into fundamental non-star graphs.

First, note that the procedure UltraDecompose($F$) produces a residual graph of type 3-$P_2$ or $A_n$. It does not produce a residual graph of type $L_n$ since we are always removing an ultra-safe pair from the graph, and $L_n$ is equivalent to the bridge graph $L_{n,1}$. Next, we show that we can always remove an ultra-safe pair from the graph until we obtain a graph of type 3-$P_2$ or $A_n$. Consider the $i$-th iteration of the while loop, given that it is executed. Let $D$ be the graph at the start of this iteration. It is clear that $D$ is neither a 3-$P_2$ nor an $A_n$; otherwise, this while loop would not have been executed. Also, $D$ can neither be a star nor a bridge graph since an ultra-safe pair was removed in the previous iteration. This fact also holds for the first iteration since the input graph is neither a star nor a bridge graph. It implies that $D$ is a non-star graph but not a fundamental non-star graph. Since the graph is not a fundamental non-star graph, it must contain a safe pair. Let $e_1 \equiv (u_1, v_1)$ and $e_2 \equiv (u_2, v_2)$ form a safe pair in $D$. If $\{e_1, e_2\}$ is also an ultra-safe pair, we are done. On the other hand, if $\{e_1, e_2\}$ is not an ultra-safe pair, we show that there is another ultra-safe pair in $D$. Since $\{e_1, e_2\}$ is a safe pair but not an ultra-safe pair, removing it would make the resulting graph a bridge graph $L_{p,q}$. Therefore, $D$ is composed of a graph of type $L_{p,q}$ and the two additional edges $e_1$ and $e_2$. Let $b \equiv (s_1, s_2)$ denote the bridge edge of $L_{p,q}$. We split the analysis into two cases, based on the orientation of $e_1$ and $e_2$ in the graph. For each case, we show that $D$ contains an ultra-safe pair.

• Case-I: At least one of the two edges $e_1, e_2$ connects a left pendent vertex with a right one.
Without loss of generality, let $e_1 \equiv (u_1, v_1)$ be the edge that connects a left pendent vertex $u_1$ with a right pendent vertex $v_1$. We claim that $e_1$ and the bridge edge $b$ form an ultra-safe pair. Indeed, suppose we remove this pair from the graph. The resulting graph would be a non-star graph since $(s_1, u_1)$ and $(s_2, v_1)$ are still present in the graph, and they form a 2-$P_2$. Therefore, the pair $\{e_1, b\}$ satisfies the condition of being a safe pair. Furthermore, the resulting graph is a non-bridge graph since there is no common edge incident on $(s_1, u_1)$ and $(s_2, v_1)$. This proves that $\{e_1, b\}$ is an ultra-safe pair.

• Case-II: Neither $e_1$ nor $e_2$ connects any left pendent vertex with a right one.
First, let us consider the possibility that both the edges $e_1$ and $e_2$ are incident on the bridge edge, i.e., $e_1$ is incident on $s_1$ and $e_2$ is incident on $s_2$. Then the graph $D$ is a bridge graph of the form $L_{p+1,q+1}$, which is not possible as per the earlier discussion. Hence, without loss of generality, we can assume that $e_1$ is not incident on $b$. Now, we claim that the pair $\{e_1, b\}$ forms an ultra-safe pair. Note that it forms a 2-$P_2$ since both edges are vertex-disjoint. Now, suppose we remove this pair from the graph. Then in the resulting graph $S_p$ and $S_q$ are not connected by any edge, since $e_2$ does not connect left and right pendent vertices. Such a graph can be neither a bridge graph nor a star graph, since both of these graphs are connected.

The above discussion implies that $D$ always contains an ultra-safe pair unless $D$ is of type 3-$P_2$ or $A_n$. This means that if the procedure UltraDecompose($F$) runs the while loop $t$ times, then $F$ is composed of $t$ ultra-safe pairs and a fundamental non-star graph $D \in \{3\text{-}P_2, A_n\}$. Based on this decomposition, we bound the optimal cost of $F$ in the following manner:

$$\Phi^*(F) \ge t \cdot \Phi^*(2\text{-}P_2) + \Phi^*(D)$$
$$= 2t + \Phi^*(D) \quad (\text{using Corollary 9.2})$$
$$\ge 2t + (|F| - 2t) \quad (\text{using Lemmas 15 and 16})$$
$$= |F| > \sqrt{|F|(|F| - 1)} + 1/2 \quad (\text{using Lemma 6}).$$

This completes the proof of the lemma.

In the next two subsections, we bound the vertex cover size of any non-star graph $F$ in terms of the extra cost $\delta(F)$.

In this section, we show that any graph with a maximum matching of size exactly two has a vertex cover of size at most $(\sqrt{2} + 1)\,\delta(F) + 1.62$. Let $C_5$ denote a cycle on five vertices. In the following lemma, we show that $C_5$ is the only graph with a maximum matching of size two and a vertex cover of size three. The rest of the graphs with a maximum matching of size two have a vertex cover of size two.

Lemma 20.

Let $F$ be any graph other than $C_5$. If $F$ has a maximum matching of size two, it has a vertex cover of size two. Furthermore, $C_5$ has a vertex cover of size three.

Proof. Let $M$ be a maximum matching of $F$. Let $e_1 \equiv (u_1, v_1)$ and $e_2 \equiv (u_2, v_2)$ denote the edges in $M$. Let $V_M$ denote the vertex set spanned by $M$, i.e., $V_M := \{u_1, v_1, u_2, v_2\}$. Let $U$ denote the unmatched edges of $F$, i.e., $U := E(F) \setminus M$. Note that every edge in $U$ is incident on at least one of the matching edges; otherwise it would form a matching of size three together with $M$, contradicting the fact that $F$ has a maximum matching of size two. Let $U_1$ denote the edges in $U$ that are incident on exactly one of the matching edges, and let $U_2 = U \setminus U_1$ be the remaining unmatched edges. In other words, $U_2$ contains the edges that have both endpoints in $V_M$.

First, we claim that no two edges in $U_1$ can be incident on different endpoints of the same matching edge. For the sake of contradiction, suppose $(x, u_1)$ and $(y, v_1)$ are edges in $U_1$ such that $x, y \notin V_M$. If $x = y$, they form a triangle $(x, u_1, v_1)$, which is not allowed. If $x \neq y$, we have a matching of size three: $\{(x, u_1), (y, v_1), (u_2, v_2)\}$. This contradicts the fact that $F$ has a maximum matching $M$ of size two. Therefore, $U_1$ cannot contain both $(x, u_1)$ and $(y, v_1)$. Similarly, $U_1$ cannot contain both $(x, u_2)$ and $(y, v_2)$. Now, without loss of generality, we can assume that all the edges in $U_1$ have one endpoint in the vertex set $\{u_1, u_2\}$. We divide the remaining analysis into the following two cases based on the existence of the edge $(v_1, v_2)$ in the graph.

• Case 1: $(v_1, v_2) \notin E(F)$.
$U_2$ can only contain the following edges: $(u_1, v_2)$, $(u_2, v_1)$, and $(u_1, u_2)$. Note that all edges in $U_2$ have at least one endpoint in $\{u_1, u_2\}$. Previously, we showed that all edges of $U_1$ are incident on $\{u_1, u_2\}$. Based on both of these facts, we can cover all edges in $U$ using only two vertices: $\{u_1, u_2\}$. Furthermore, these vertices also cover the matching edges in $M$. Therefore, all edges of the graph are covered, and we have a vertex cover of size two.

• Case 2: $(v_1, v_2) \in E(F)$.
Now, note that $U_2$ cannot contain the edges $(u_1, v_2)$ and $(u_2, v_1)$, since they form the triangles $(v_1, v_2, u_1)$ and $(v_1, v_2, u_2)$. However, $U_2$ can contain the edge $(u_1, u_2)$. We consider the following two sub-cases based on the existence of $(u_1, u_2)$ in the graph.

(a) Sub-case: $(u_1, u_2) \in E(F)$.
We claim that either all the edges in $U_1$ are incident on $u_1$ or all of them are incident on $u_2$. For the sake of contradiction, let $(x, u_1)$ and $(y, u_2)$ be edges in $U_1$ such that $x, y \notin V_M$. If $x = y$, they form a triangle $(x, u_1, u_2)$, which is not allowed. If $x \neq y$, we get a matching of size three, $\{(x, u_1), (y, u_2), (v_1, v_2)\}$, which contradicts the fact that $F$ has a maximum matching $M$ of size two. Without loss of generality, we can assume that all edges in $U_1$ are incident on $u_1$. Therefore, we can cover all edges of $U_1$ using only $u_1$. Furthermore, $u_1$ covers the matching edge $(u_1, v_1) \in M$ and the edge $(u_1, u_2) \in U_2$. Only two edges remain uncovered in the graph, namely $(u_2, v_2) \in M$ and $(v_1, v_2) \in U_2$. We cover both of these edges by picking the vertex $v_2$. Thus, we get a vertex cover of size two.

(b) Sub-case: $(u_1, u_2) \notin E(F)$.
First consider the case when all the edges in $U_1$ are incident on $u_1$, or all of them are incident on $u_2$. In this case, either $\{u_1, v_2\}$ or $\{u_2, v_1\}$ forms a vertex cover of size two, and we are done. Now consider the other case. Suppose there are two edges $(x, u_1)$ and $(y, u_2)$ in $U_1$ such that $x, y \notin V_M$. If $x \neq y$, we get a matching of size three, $\{(x, u_1), (y, u_2), (v_1, v_2)\}$, which is not possible. On the other hand, if $x = y$, then the only possibility is that $F$ is a cycle of length five, $C_5 : (x, u_1, v_1, v_2, u_2)$. In this case, the vertex cover of $F$ is of size 3.

We showed that every graph with a maximum matching of size 2 has a vertex cover of size 2, except for $C_5$, which has a vertex cover of size 3. This completes the proof of the lemma.
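The case analysis above can be sanity-checked by exhaustive search on small instances. The following is only a sketch, not part of the original proof: it enumerates all triangle-free graphs on five labeled vertices (the function names are illustrative) and collects those with matching number two whose minimum vertex cover exceeds two. Under the lemma, these should be exactly the labeled copies of the 5-cycle.

```python
from itertools import combinations


def is_triangle_free(vertices, edges):
    """Return True if no three vertices are pairwise adjacent."""
    edge_set = {frozenset(e) for e in edges}
    return not any(
        {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= edge_set
        for a, b, c in combinations(vertices, 3)
    )


def max_matching_size(edges):
    """Size of a largest set of pairwise vertex-disjoint edges (brute force)."""
    if not edges:
        return 0
    (u, v), rest = edges[0], edges[1:]
    skip = max_matching_size(rest)
    take = 1 + max_matching_size([e for e in rest if u not in e and v not in e])
    return max(skip, take)


def min_vertex_cover_size(vertices, edges):
    """Size of a smallest vertex set touching every edge (brute force)."""
    for size in range(len(vertices) + 1):
        for cover in combinations(vertices, size):
            chosen = set(cover)
            if all(u in chosen or v in chosen for u, v in edges):
                return size


def exceptions_on_five_vertices():
    """Triangle-free graphs on 5 labeled vertices with matching number 2
    whose minimum vertex cover is larger than 2."""
    vertices = list(range(5))
    all_edges = list(combinations(vertices, 2))
    found = []
    for mask in range(1 << len(all_edges)):
        edges = [e for i, e in enumerate(all_edges) if (mask >> i) & 1]
        if not is_triangle_free(vertices, edges):
            continue
        if max_matching_size(edges) == 2 and min_vertex_cover_size(vertices, edges) > 2:
            found.append(edges)
    return found
```

Every graph returned by `exceptions_on_five_vertices()` has exactly five edges, which is consistent with the lemma's claim that the 5-cycle is the only exception. The search is limited to five vertices for speed; it illustrates the lemma rather than replacing the proof.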
Corollary 20.1. Let $F$ be any graph with a maximum matching of size two. If the graph is not a $C_5$, it has a vertex cover of size at most $(\sqrt{2}+1)\,\delta(F) + 1.62$.

Proof. In Lemma 18, we showed that any graph $F$ has an extra cost of at least 0.158. In other words, $\delta(F) \geq 0.158$. Therefore, $|VC(F)| = 2 \leq (\sqrt{2}+1)\,\delta(F) + 1.62$. Hence proved.

Next, we consider the particular case of $C_5$. The following lemma bounds the optimal 1-median cost of $C_5$. Lemma 21.

The optimal -median cost of C is at least p | C | ( | C | −

1) + 0 . .Proof. Let C be ( u, v, w, x, y ). We decompose the graph into two fundamental non-star graphs: A : { ( u, v ) , ( w, x ) } and A : { ( v, w ) , ( x, y ) , ( y, u ) } . The following sequence of inequalities bound theoptimal cost of C .Φ ∗ ( C ) ≥ Φ ∗ ( A ) + Φ ∗ ( A ) ≥ . √

20 + (cid:16) − √ (cid:17) + 0 . q | C | ( | C | −

1) + 0 .

622 ( ∵ | C | = 5)This completes the proof of the lemma. Corollary 21.1.

The graph $C_5$ has a vertex cover of size at most $(\sqrt{2}+1)\,\delta(C_5) + 1.62$.

Proof. Since $\delta(C_5) \geq 0.622$ (Lemma 21), $|VC(C_5)| = 3 \leq (\sqrt{2}+1)\,\delta(C_5) + 1.62$, which proves the corollary.
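The arithmetic behind the two corollaries for graphs with a maximum matching of size two can be checked with a few lines. This is a sketch under two assumptions: the coefficient of the extra cost is read as $\sqrt{2}+1$ and the additive constant as 1.62, both reconstructed from the damaged formulas. The helper names are illustrative.

```python
import math

# Assumed reading of the bound |VC(F)| <= (sqrt(2) + 1) * delta(F) + 1.62.
FACTOR = math.sqrt(2) + 1
OFFSET = 1.62


def vertex_cover_bound(extra_cost):
    """Right-hand side of the assumed bound for a given extra cost delta(F)."""
    return FACTOR * extra_cost + OFFSET


def min_extra_cost_needed(vc_size):
    """Smallest extra cost for which the assumed bound reaches vc_size."""
    return (vc_size - OFFSET) / FACTOR


# A vertex cover of size 2 needs delta(F) of roughly 0.1574 or more,
# which the stated lower bound 0.158 satisfies.
print(round(min_extra_cost_needed(2), 4))
# A vertex cover of size 3 (the 5-cycle) needs delta of roughly 0.5716 or more.
print(round(min_extra_cost_needed(3), 4))
```

The second threshold shows why the 5-cycle needs its own, stronger lower bound on the extra cost: the generic bound of 0.158 would not be enough to cover a vertex cover of size three.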

In this section, we show that any non-star graph $F$ with a maximum matching of size at least three has a vertex cover of size at most $1.8 + (\sqrt{2}+1)\,\delta(F)$. First, let us define some notation. Let $M$ denote a maximum matching of $F$, and $G_M$ denote the subgraph spanned by $M$. Let $F'$ be the graph obtained by removing $M$ from $F$, i.e., $F' = (V, E(F) \setminus M)$. Let $L$ denote a maximum matching of $F'$, and $G_L$ denote the subgraph spanned by $L$. We call $L$ the second maximum matching of $F$ after $M$. Now, we remove $L$ from the graph. Let $F''$ be the graph obtained by removing $L$ from $F'$, i.e., $F'' = (V, E(F) \setminus (M \cup L))$. Recall that throughout this discussion, we use $|\cdot|$ to denote the number of edges in a given graph.

Now, we obtain a relation between the vertex cover size and the extra cost of a graph. To establish this relation, we show that both of them are proportional to the number of vertex-disjoint edges in the graph. For example, a graph with a maximum matching $M$ has a vertex cover of size at most $2|M|$. Similarly, a set of $m$ vertex-disjoint edges has an extra cost of $(\sqrt{2}-1)\cdot\sqrt{m(m-1)}$ (using Corollary 9.2). Also, note that a star graph, which has a maximum matching of size one, has an extra cost of zero. In the next two lemmas, we formally establish these relations of the vertex cover size and the extra cost in terms of the number of vertex-disjoint edges in the graph. Then we use the two lemmas to bound the vertex cover size in terms of the extra cost. First, let us bound the vertex cover size in terms of $|L|$ and $|M|$.

Lemma 22. Any non-star graph $F$ has a vertex cover of size at most $|M| + |L| - 1$.

Proof. Let $G_{ML}$ denote the graph spanned by the edge set $M \cup L$. Note that there are no odd cycles in $G_{ML}$; otherwise there would be two adjacent edges in the cycle that belong to the same matching $L$ or $M$. Thus, $G_{ML}$ is a bipartite graph. It is well known that in a bipartite graph, the size of a maximum matching equals the size of a minimum vertex cover [Bon76]. Therefore, $G_{ML}$ has a vertex cover of size exactly $|M|$. Let $S'$ denote such a vertex cover of $G_{ML}$.

Now, we give an incremental construction of a vertex cover of the entire graph $F$. Let this vertex cover be denoted by $S$. Initially, we add all vertices of $S'$ to $S$. At this stage, $S$ covers all edges in $L$ and $M$, which means that for every edge in $L$, at least one of its endpoints belongs to $S$. Now, we include its other endpoint in $S$ as well, and we do this for all edges in $L$. We observe that $S$ now covers all edges in $F''$, since $L$ is a maximum matching of $F'$ and hence all edges in $F''$ are incident on $L$. Therefore, $S$ covers all edges in $F$, and has a size of $|M| + |L|$.

Our main goal is to obtain a vertex cover of size $|M| + |L| - 1$. We again give an incremental construction, and let $S$ now denote this new incrementally constructed vertex cover. Initially, $S$ is empty. Let us color the edges of the graph. We color the edges in $M$ red, the edges in $L$ green, and the edges in $E(F'')$ blue. Note that the non-red edges are the edges of the graph $F'$. Now, for every edge in $L$ except one, we add both its endpoints to $S$. Let $e \equiv (u, v) \in L$ be the remaining edge of $L$. Now, we remove all the edges of $F$ covered by $S$. Let the resulting graph be $G_S$. $G_S$ contains some red edges, some blue edges, and exactly one green edge $e$. Also, note that all non-red edges in $G_S$ form a star graph. This is because, if they formed a non-star graph, it would have a matching of size at least two, and this matching together with the removed green edges would form a matching of $F'$ of size $\geq |L| + 1$. This contradicts the fact that $F'$ has a maximum matching of size $|L|$. Therefore, the non-red edges of $G_S$ form a star graph. Now, let us construct a vertex cover of $G_S$. Let $R$ be the set of red edges in $G_S$. Let $N_R$ be the set of non-red edges in $G_S$. Further, assume that $N_R := \{(u, v_1), (u, v_2), \ldots, (u, v_t)\}$, i.e., all non-red edges are incident on a common vertex $u$. We consider three different cases depending on the number of red edges in $R$. For each of these cases, we construct a vertex cover for $G_S$.

1. Case 1: $|R| \leq |M| - |L|$.
Since $N_R$ forms a star graph, we cover it using the single vertex $u$. For every edge in $R$, we pick one vertex per edge in the vertex cover. Thus, all edges of $G_S$ are covered. So, the size of the entire vertex cover of $F$ is at most $(|M| - |L|) + 1 + 2\cdot(|L| - 1) = |M| + |L| - 1$.

2. Case 2: $|R| \geq |M| - |L| + 2$.
In this case, $R$ and the removed green edges form a matching of size at least $|M| + 1$, which contradicts the fact that $F$ has a maximum matching of size $|M|$. Therefore, we can rule out this case.

3. Case 3: $|R| = |M| - |L| + 1$.
Here, we claim that every non-red edge in $N_R$ must be incident on some red edge in $R$. For the sake of contradiction, suppose this is not true and there is a non-red edge $e_i \in N_R$ that is not incident on any of the red edges in $R$. It is easy to see that $\{e_i\} \cup R \cup (L \setminus \{e\})$ forms a matching of size $|M| + 1$. This contradicts the fact that $F$ has a maximum matching of size $|M|$. Therefore, each non-red edge in $N_R$ must be incident on some red edge in $R$. Moreover, note that no two edges in $N_R$ can be incident on the same red edge $(r_1, r_2) \in R$. Otherwise, they would form a triangle $(u, r_1, r_2)$, which is not allowed. Now, for every edge in $R$, we pick exactly one of its endpoints. This is the endpoint that it shares with some non-red edge in $N_R$ if one exists; otherwise an arbitrary endpoint is picked. Thus, we cover the edges in $G_S$ using only $|R|$ vertices. So, the size of the vertex cover of the entire graph $F$ in this case is $|R| + 2|L| - 2 = |M| + |L| - 1$.

This completes the proof of the lemma.

Next, we bound the extra cost of a graph in terms of $|M|$ and $|L|$. There are some special graph instances for which we do the analysis separately. For the following lemma, we assume that $|L| \geq 3$ and $F$ is a non-star non-bridge graph. Lemma 23.

Let $|L| \geq 3$, and let $F$ be a non-star non-bridge graph. Then the extra cost of $F$ is at least $(\sqrt{2}-1)\cdot(|M| + |L|) - 1.06$.

Proof. We decompose $F$ into three subgraphs: $G_M$, $G_L$, and $F''$. This gives the following bound on the optimal cost of $F$:

$$\Phi^*(F) \geq \Phi^*(G_M) + \Phi^*(G_L) + \Phi^*(F'') \qquad (3)$$

We already know the bounds on the optimal costs of $G_M$, $G_L$ and $F''$. That is,

• $\Phi^*(G_M) \geq \sqrt{2}\cdot\sqrt{|M|(|M|-1)}$ (Corollary 9.2) $\geq \sqrt{2}\cdot\left(|M| - (3 - \sqrt{6})\right)$ (Lemma 6, $|M| \geq 3$) $\geq \sqrt{2}\cdot|M| - 0.78$.

• $\Phi^*(G_L) = \sqrt{2}\cdot\sqrt{|L|(|L|-1)}$ (Corollary 9.2) $\geq \sqrt{2}\cdot\left(|L| - (3 - \sqrt{6})\right)$ (Lemma 6, $|L| \geq 3$) $\geq \sqrt{2}\cdot|L| - 0.78$.

• $\Phi^*(F'') \geq |F''|$ (Lemma 19).

Substituting the above values in Equation (3), we obtain the following inequality:

$$\begin{aligned}
\Phi^*(F) &\geq |M| + |L| + |F''| + (\sqrt{2}-1)\cdot(|M| + |L|) - 1.56 \\
&= |F| + (\sqrt{2}-1)\cdot(|M| + |L|) - 1.56 \\
&> \sqrt{|F|(|F|-1)} + 0.5 + (\sqrt{2}-1)\cdot(|M| + |L|) - 1.56 \qquad \text{(using Lemma 6)} \\
&= \sqrt{|F|(|F|-1)} + (\sqrt{2}-1)\cdot(|M| + |L|) - 1.06
\end{aligned}$$

This completes the proof of the lemma.

In Lemma 22, we bounded the vertex cover size in terms of $|M|$ and $|L|$. Then in Lemma 23, we bound the extra cost in terms of $|M| + |L|$. Now, we put these two results together and obtain a relation between the extra cost and the vertex cover size. Corollary 23.1.

Let | L | ≥ and F is a non-star non-bridge graph. Then F has a vertex coverof size at most . (cid:16) √ (cid:17) δ ( F ) .Proof. The proof follows from the following sequence of inequalities:1 . (cid:16) √ (cid:17) δ ( F ) Lemma 23 ≥ | M | + | L | + 1 . − (cid:16) √ (cid:17) (1 . > | M | + | L | − Lemma 22 = | V C ( F ) | . There are some special graph instances for which either Lemma 22 gives a weak bound on thevertex cover size or Lemma 23 gives a weak bound the extra cost of the graph. This would givean overall weak relation between the vertex cover size and extra cost of the instances. Therefore,we analyse such instances separately. We divide the remaining instances into the following ﬁvecategories. 27. | L | = 0: In this case, we show | V C ( F ) | ≤ ( √ δ ( F ) + 0 . | L | = 1: In this case, we show | V C ( F ) | ≤ ( √ δ ( F ) + . .3. | L | = 2 and F is a bridge graph: In this case, we show | V C ( F ) | ≤ ( √ δ ( F ) + 1 . | L | = 2 and F is a non-bridge graph: In this case, we show | V C ( F ) | ≤ ( √ δ ( F ) + 1 . | L | ≥ F is a bridge graph: In this case, we show | V C ( F ) | ≤ ( √ δ ( F ) + 1 . | M | and | L | . Then, we obtain a lower bound on theextra cost in terms of | M | and | L | . And, ﬁnally we state a corollary (similar to the corollary above)combining these two results. Also, note that for all the following cases we will consider | M | ≥ | M | = 2 in Section 4.2. | L | = The following lemma is trivial.

Lemma 24 (Vertex Cover). If $|L| = 0$, $F$ has a vertex cover of size exactly $|M|$.

Lemma 25 (Extra Cost). If $|L| = 0$, the extra cost of $F$ is exactly $(\sqrt{2}-1)\cdot\sqrt{|M|(|M|-1)}$.

Proof. The proof simply follows from Corollary 9.2.
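The extra cost of a matching can be reproduced numerically. The sketch below assumes the usual embedding of the reduction, in which an edge $(u, v)$ becomes the 0/1 vector $e_u + e_v$; that embedding is defined outside this section and is an assumption here. For a matching or a star the embedded points form a regular simplex, so by symmetry the centroid is the optimal 1-median, and the code simply evaluates the cost at the centroid. Function names are illustrative.

```python
import math


def embed(edges, n):
    """Map each edge (u, v) to the vector e_u + e_v in R^n (assumed embedding)."""
    points = []
    for u, v in edges:
        x = [0.0] * n
        x[u] = x[v] = 1.0
        points.append(x)
    return points


def cost_at_centroid(points):
    """Sum of Euclidean distances from the points to their centroid."""
    m, n = len(points), len(points[0])
    centroid = [sum(p[i] for p in points) / m for i in range(n)]
    return sum(math.dist(p, centroid) for p in points)


def extra_cost(edges, n):
    """Cost at the centroid minus the star baseline sqrt(m(m-1))."""
    m = len(edges)
    return cost_at_centroid(embed(edges, n)) - math.sqrt(m * (m - 1))


matching = [(0, 1), (2, 3), (4, 5), (6, 7)]   # four vertex-disjoint edges
star = [(0, 1), (0, 2), (0, 3), (0, 4)]       # four edges sharing vertex 0

print(extra_cost(matching, 8))   # compare with (sqrt(2) - 1) * sqrt(4 * 3)
print(extra_cost(star, 5))       # a star has (numerically) zero extra cost
```

The two printed values illustrate the contrast the section relies on: vertex-disjoint edges pay an extra cost that grows with their number, while a star pays none.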

Corollary 25.1. If $|L| = 0$, $F$ has a vertex cover of size at most $0.551 + (\sqrt{2}+1)\,\delta(F)$.

Proof. The proof follows from the following series of inequalities:

$$\begin{aligned}
0.551 + (\sqrt{2}+1)\,\delta(F) &= 0.551 + \sqrt{|M|(|M|-1)} && \text{(Lemma 25)} \\
&\geq 0.551 + |M| - (3 - \sqrt{6}) && \text{(Lemma 6, } |M| \geq 3\text{)} \\
&> |M| = |VC(F)| && \text{(Lemma 24)}
\end{aligned}$$

$|L| = 1$

Note that the condition $|L| = 1$ is equivalent to $F'$ being a star graph.

Lemma 26 (Vertex Cover). If $|L| = 1$, $F$ has a vertex cover of size exactly $|M|$.

Proof.

The proof follows from Lemma 22 and substituting | L | = 1. Lemma 27 (Extra Cost) . If | L | = 1 , the extra cost of F is at least (cid:16) √ − (cid:17) ( | M | ) − . Proof.

Let e be some edge in E ( F ). The edge e must incident on some edge of M ; otherwise M would not be a maximum matching. Furthermore, e can only be incident on at most two edges of M . Let us deﬁne the edges l and l in the graph depending on the orientation of e in the graph.• If e is incident on two edges of M , then l and l are deﬁned as the corresponding incidentedges in M .• If e is incident on only one edge of M , then l ∈ M is deﬁned as the incident edge and l isdeﬁned as any other edge in M . 28et M := ( M \ { ‘ , ‘ } ) ∪ { e } and L = E ( F ) ∪ { ‘ , ‘ } \ { e } . Given this, note that M formsa matching of size ( | M | −

1) and L spans a graph of type A n for n ≥

1. Let G M denote thegraph spanned by M , and G L denote the graph spanned by L . We decompose F into these twosubgraphs, i.e., G M and G L . It gives the following bound on the optimal cost of F .Φ ∗ ( F ) ≥ Φ ∗ ( G M ) + Φ ∗ ( G L ) (4)We already know the bounds on the optimal costs of G M and G L . That is,• Φ ∗ ( G M ) Corollary 9.2 ≥ √ · p | M | ( | M | − (Lemma 6, | M | ≥ ≥ √ · (cid:16) | M | − (2 − √ (cid:17) . • Φ ∗ ( G L ) (Lemma 16 statement 1) ≥ | L | . We substitute the above values in Equation (4). This gives the following inequality:Φ ∗ ( F ) ≥ | M | + | L | + (cid:16) √ − (cid:17) | M | + 2 − √ | M | + | F | + (cid:16) √ − (cid:17) | M | + 3 − √ | M | = | M | − | L | = | F | + 1)= | F | + (cid:16) √ − (cid:17) | M | + 3 − √ (cid:0) ∵ | F | = | M | + | F | (cid:1) > q | F | ( | F | −

1) + 0 . (cid:16) √ − (cid:17) | M | + 3 − √ > q | F | ( | F | −

1) + (cid:16) √ − (cid:17) | M | − . Corollary 27.1. If | L | = 1 , then F has a vertex cover of size at most . (cid:16) √ (cid:17) δ ( F ) .Proof. The proof follows from the following sequence of inequalities:1 . (cid:16) √ (cid:17) δ ( F ) Lemma 27 ≥ | M | + 1 . − (cid:16) √ (cid:17) (0 . > | M | Lemma 26 = | V C ( F ) | . | L | = 2 and F is Bridge Graph Since F is a bridge graph, | L | = 2. For this case, Lemma 22 gives a vertex cover of size at most | M | + 1. However, we show a stronger bound than this in the following lemma. Lemma 28 (Vetex Cover) . If F is a bridge graph L p,q for some p, q ≥ , then F has a vertexcover of size | M | .Proof. Let b ≡ ( u, v ) be the bridge edge of L p,q . Suppose b be incident on an edge e ∈ M . Withoutloss of generality, we can assume that u is the common endpoint of e and b . Let us pick the vertex u in the vertex cover and remove the edges covered by it. Let G denote the resulting graph. Further,let M denote a maximum matching of G . Now, we claim that | M | = | M | −

1. For the sake ofcontradiction, assume that | M | ≥ | M | . Then the edge b and matching set M would together forma matching of size | M | + 1 and this would contradict that F has a maximum matching M of size | M | . Now, suppose we choose M ≡ M \ { e } as the maximum matching of G and let L be the29econd maximum matching after M . If we remove the edges of M from G , the remaining graphwould be a star graph. Therefore, the size of second maximum matching L is exactly one. Now,using Lemma 22, we can cover G using | M | + | L | − u ) of the entire graph F has a size at most | M | + | L | = | M | . This proves the lemma. Lemma 29 (Extra Cost) . If F is a bridge graph L p,q for some p, q ≥ , then the extra cost of F is at least (cid:16) √ − (cid:17) · | M | − . .Proof. We decompose F into two subgraphs: G M and F . It gives the following bound on theoptimal cost of F . Φ ∗ ( F ) ≥ Φ ∗ ( G M ) + Φ ∗ ( F ) (5)We already know the bounds on the optimal costs of G M and F . That is,• Φ ∗ ( G M ) Corollary 9.2 ≥ √ · p | M | ( | M | − (Lemma 6, | M | ≥ ≥ √ · (cid:16) | M | − (3 − √ (cid:17) ≥ √ ·| M |− . . • Φ ∗ ( F ) Lemma 18 ≥ | F | − . . We substitute the above values in Equation (5). It gives the following inequality:Φ ∗ ( F ) ≥ | F | + | M | + (cid:16) √ − (cid:17) | M | − . | F | + (cid:16) √ − (cid:17) | M | − . > q | F | ( | F | −

1) + 0 . (cid:16) √ − (cid:17) | M | − .

122 (using Lemma 6) > q | F | ( | F | −

1) + (cid:16) √ − (cid:17) | M | − . Corollary 29.1. If F is a bridge graph L p,q for some p, q ≥ , then F has a vertex cover of sizeat most .

53 + (cid:16) √ (cid:17) δ ( F ) .Proof. The proof follows from the following sequence of inequalities:1 .

53 + (cid:16) √ (cid:17) δ ( F ) Lemma 29 ≥ | M | + 1 . − (cid:16) √ (cid:17) (0 . > | M | Lemma 28 = | V C ( F ) | . | L | = 2 and F is Non-Bridge GraphLemma 30 (Vertex Cover) . If | L | = 2 and F is a non-bridge graph, then F has a vertex cover ofsize at most | M | + 1 .Proof. The proof simply follows from Lemma 22 for | L | = 2. Lemma 31 (Extra Cost) . If | L | = 2 and F is a non-bridge graph, the extra cost of F is at least (cid:16) √ − (cid:17) · | M | + 0 . Proof.

We decompose F into two subgraphs: G M and F . It gives the following bound on theoptimal cost of F . Φ ∗ ( F ) ≥ Φ ∗ ( G M ) + Φ ∗ ( F ) (6)We already know the bounds on the optimal costs of G M and F . That is,30 Φ ∗ ( G M ) Corollary 9.2 ≥ √ · p | M | ( | M | − (Lemma 6, | M | ≥ ≥ √ · (cid:16) | M | − (3 − √ (cid:17) ≥ √ ·| M |− . . • Φ ∗ ( F ) Lemma 19 ≥ | F | . We substitute the above values in Equation (6). It gives the following inequality:Φ ∗ ( F ) ≥ | F | + | M | + (cid:16) √ − (cid:17) | M | − . | F | + (cid:16) √ − (cid:17) | M | − . (cid:0) ∵ | F | = | F | + | M | (cid:1) > q | F | ( | F | −

1) + 0 . (cid:16) √ − (cid:17) | M | − .

78 (using Lemma 6)= q | F | ( | F | −

1) + (cid:16) √ − (cid:17) | M | − . Corollary 31.1. If F is a non-star non-bridge graph, then F has a vertex cover of size at most .

68 + (cid:16) √ (cid:17) δ ( F ) .Proof. The proof follows from the following sequence of inequalities:1 .

68 + (cid:16) √ (cid:17) δ ( F ) Lemma 31 ≥ | M | + 1 . − (cid:16) √ (cid:17) (0 . > | M | + 1 Lemma 30 = | V C ( F ) | . | L | ≥ and F is Bridge Graph Since | L | ≥

3, Lemma 22 gives a vertex cover of size at most | M | + | L | −

1, which is at least | M | + 2.However, we can obtain a stronger bound than this if F is a bridge graph as shown in the followinglemma. Lemma 32. If | L | ≥ and F is a bridge graph L p,q for some p, q ≥ , then F has a vertex coverof size at most | M | + 1 .Proof. We will incrementally construct a vertex cover S of size | M | + 1. Initially, S is empty, i.e., S = ∅ . Let b ≡ ( u, v ) be the bridge edge of L p,q . We will add both vertices u and v to the set S ,so that it covers all edges in F . Now, we remove all the edges in the graph that are covered by u and v . Let M and L be the remaining sets corresponding to M and L , respectively. Let G bethe graph spanned by the edge set M ∪ L . Now, observe that G does not contain any odd cycles;otherwise there would be two adjacent edges in the cycle that would belong to the same set M or L . Moreover, G has a maximum matching of size at most | M | −

1. This is because, the edge b is vertex-disjoint from every edge of G and if G has a matching of size at least | M | , then thismatching together with b form a matching of size | M | + 1. This contradicts the fact that F has themaximum matching of size | M | . Since G is bipartite and has a matching of size at most | M | −

1, itadmits a vertex cover of size | M | − Kőnig’s Theorem [Bon76]). Thus, the vertex cover(including the vertices u and v ) of the entire graph has a size at most | M | + 1. This completes theproof of the lemma. Lemma 33. If | L | ≥ and F is a bridge graph L p,q for some p, q ≥ , then the extra cost of F isat least ( √ − · ( | M | + | L | ) − . .Proof. We decompose F into three subgraphs: G M , G L , and F . Then, it gives the following boundon the optimal cost of F . Φ ∗ ( F ) ≥ Φ ∗ ( G M ) + Φ ∗ ( G L ) + Φ ∗ ( F ) (7)We already know the bounds on the optimal costs of G M , G L and F . That is,31 Φ ∗ ( G M ) Corollary 9.2 ≥ √ · p | M | ( | M | − (Lemma 6, | M | ≥ ≥ √ · (cid:16) | M | − (3 − √ (cid:17) ≥ √ ·| M |− . . • Φ ∗ ( G L ) (using Corollary 9.2) = √ · p | L | ( | L | − (Lemma 6, | L | ≥ ≥ √ · (cid:16) | L | − (3 − √ (cid:17) ≥ √ ·| L |− . . • Φ ∗ ( F ) Lemma 18 ≥ | F | − . . We substitute the above values in Equation (7). It gives the following inequality:Φ ∗ ( F ) ≥ | M | + | L | + | F | + ( √ − · ( | M | + | L | ) − . | F | + ( √ − · ( | M | + | L | ) − . > q | F | ( | F | −

1) + 0 . √ − · ( | M | + | L | ) − .

902 (using Lemma 6) > q | F | ( | F | −

1) + ( √ − · ( | M | + | L | ) − . Corollary 33.1. If | L | ≥ and F is a bridge graph L p,q for some p, q ≥ , then F has a vertexcover of size at most . (cid:16) √ (cid:17) δ ( F ) .Proof. The proof follows from the following sequence of inequalities:1 . (cid:16) √ (cid:17) δ ( F ) Lemma 33 ≥ | M | + | L | + 1 . − (cid:16) √ (cid:17) (1 . ( | L | ≥ > | M | + 1 Lemma 32 = | V C ( F ) | . This completes the analysis for all graph instances.
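The quantities used throughout this case analysis, namely a maximum matching $M$, a second maximum matching $L$ in what remains, and the minimum vertex cover, can be computed by brute force on small graphs. The sketch below is illustrative only (exponential time, hypothetical function names) and compares the minimum vertex cover with the general bound $|M| + |L| - 1$ for triangle-free non-star graphs.

```python
from itertools import combinations


def maximum_matching(edges):
    """Return one maximum matching (a list of edges) by brute force."""
    if not edges:
        return []
    (u, v), rest = edges[0], edges[1:]
    skip = maximum_matching(rest)
    take = [(u, v)] + maximum_matching(
        [e for e in rest if u not in e and v not in e]
    )
    return take if len(take) > len(skip) else skip


def min_vertex_cover_size(edges):
    """Size of a smallest vertex set touching every edge (brute force)."""
    vertices = sorted({x for e in edges for x in e})
    for size in range(len(vertices) + 1):
        for cover in combinations(vertices, size):
            chosen = set(cover)
            if all(u in chosen or v in chosen for u, v in edges):
                return size


def matching_profile(edges):
    """Return (|M|, |L|, minimum vertex cover size) for a small graph."""
    M = maximum_matching(edges)
    remaining = [e for e in edges if e not in M]   # edges of F' = F minus M
    L = maximum_matching(remaining)                # second maximum matching
    return len(M), len(L), min_vertex_cover_size(edges)


five_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]
six_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
for graph in (five_cycle, six_cycle):
    m, l, vc = matching_profile(graph)
    print(m, l, vc, vc <= m + l - 1)
```

On the 5-cycle the bound $|M| + |L| - 1$ is met with equality, while on the 6-cycle it is loose, which is one reason the special cases are analysed separately.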

In the previous section, we showed that the k -median problem cannot be approximated to anyfactor smaller than (1 + ε ), where ε is some positive constant. The next step in the beyond worst-case discussion is to discuss bi-criteria approximation algorithms. That is, suppose we allow thealgorithm to choose more than k centers. Then does it produce a solution that is close to theoptimal solution with respect to k centers? Since the algorithm is allowed to output more numberof centers we can hope to get a better approximate solution. An interesting question in this regardwould be: Is there a PTAS (polynomial time approximation scheme) for the k -median/ k -meansproblem when the algorithm is allowed to choose βk centers for some constant β > ? In otherwords, is there an (1 + ε, β )-approximation algorithm? Note that here we compare the cost of βk centers with the optimal cost with respect to k centers. See Section 1 for the deﬁnition of ( α, β )bi-criteria approximation algorithms.In this section, we show that even with βk centers, the k -means/ k -median problems cannotbe approximated within any factor smaller than (1 + ε ), for some constant ε > Theorem 34 ( k -median) . For any constant < β < . , there exists a constant ε > suchthat there is no (1 + ε, β ) -approximation algorithm for the k -median problem assuming the UniqueGames Conjecture. Theorem 35 ( k -means) . For any constant < β < . , there exists a constant ε > suchthat there is no (1 + ε, β ) -approximation algorithm for the k -means problem assuming the UniqueGames Conjecture. Moreover, the same result holds for any < β < . under the assumption that P = NP . First, let us prove the bi-criteria inapproximability result corresponding to the k -median objective.32 .1 Bi-criteria Inapproximability: k -Median In this subsection, we give a proof of Theorem 34. Let us deﬁne a few notations. Suppose I = ( X , k )is some k -median instance. Then, OP T ( X , k ) denote the optimal k -median cost of X . 
Similarly, $OPT(\mathcal{X}, \beta k)$ denotes the optimal $\beta k$-median cost of $\mathcal{X}$ (that is, the optimal cost of $\mathcal{X}$ with $\beta k$ centers). We use the same reduction as in the previous section for showing the hardness of approximation of the $k$-median problem. Based on the reduction, we establish the following theorem.

Theorem 36. There is an efficient reduction from Vertex Cover on bounded degree triangle-free graphs $G$ (with $m$ edges) to Euclidean $k$-median instances $I = (\mathcal{X}, k)$ that satisfies the following properties:

1. If $G$ has a vertex cover of size $k$, then $OPT(\mathcal{X}, k) \leq m - k/2$.

2. For any constant $1 < \beta < 1.015$, there exist constants $\varepsilon, \delta > 0$ such that if $G$ has no vertex cover of size $\leq (2 - \varepsilon)\cdot k$, then $OPT(\mathcal{X}, \beta k) \geq m - k/2 + \delta k$.

Proof. Since the reduction is the same as the one discussed in Sections 1.2 and 3, we keep all notation the same as before. Also, note that Property 1 in this theorem is the same as Property 1 of Theorem 8. Therefore, its proof is the same as in Section 3.1. We now move directly to the proof of Property 2.

The proof is almost the same as the one given in Section 3.2. However, it has some minor differences since we consider the optimal cost with respect to $\beta k$ centers instead of $k$ centers. We prove the following contrapositive statement: “For any constants $1 < \beta < 1.015$ and $\varepsilon > 0$, there exist constants $\varepsilon, \delta > 0$ such that if

OP T ( X , βk ) < ( m − k/ δk ) then G has a vertex cover of size atmost (2 − ε ) k ”. Let C denote an optimal clustering of X with βk centers. We classify its optimalclusters into two categories: (1) star and (2) non-star . Further, we sub-classify the star clustersinto the following two sub-categories:(a) Clusters composed of exactly one edge. Let these clusters be: P , P , . . . , P t .(b) Clusters composed of at least two edges. Let these clusters be: S , S , . . . , S t .Similarly, we sub-classify the non-star clusters into the following two sub-categories:(i) Clusters with a maximum matching of size two. Let these clusters be: W , W , . . . , W t (ii) Clusters with a maximum matching of size at least three. Let these clusters be: Y , Y , . . . , Y t Note that t + t + t + t equals βk . Suppose, we ﬁrst compute a vertex cover of all the clustersexcept the single edge clusters: P , . . . , P t . Let that vertex cover be V C . Now, some vertices in V C might also cover the edges in P , . . . , P t . Suppose there are t single edge clusters that remainuncovered by V C . Without loss of generality, we assume that these clusters are P , . . . , P t . ByLemma 12, we can cover these cluster with ( t + 8 δk ) ≤ ( t + 8 δk ) vertices; otherwise the graphwould have a vertex cover of size at most (2 k − δk ), and the proof of Property 2 would be complete.Now, we bound the vertex cover of the entire graph in the following manner. | V C ( G ) | ≤ t X i =1 | V C ( P i ) | + t X i =1 | V C ( S i ) | + t X i =1 | V C ( W i ) | + t X i =1 | V C ( Y i ) |≤ (cid:18) t δk (cid:19) + t + t X i =1 (cid:16)(cid:16) √ (cid:17) δ ( W i ) + 1 . (cid:17) + t X i =1 (cid:16)(cid:16) √ (cid:17) δ ( Y i ) + 1 . (cid:17) , . t + 8 δk + t + (1 . t + (1 . t + (cid:16) √ (cid:17) t X i =1 δ ( W i ) + t X i =1 δ ( Y i ) ! Since the optimal cost

OP T ( X , βk ) = βk X j =1 q m j ( m j −

1) + t X i =1 δ ( W i ) + t X i =1 δ ( Y i ) ≤ m − k/ δk , weget t X i =1 δ ( W i ) + t X i =1 δ ( Y i ) ≤ m − k/ δk − βk X j =1 q m j ( m j − | V C ( G ) | ≤ (0 . t + 8 δk + t + (1 . t + (1 . t + (cid:16) √ (cid:17) ·  m − k/ − βk X j =1 q m j ( m j −

1) + δk  Using Lemma 6, we obtain the following inequalities:1. For P j , q m ( P j ) ( m ( P j ) − ≥ m ( P j ) − m ( P j ) = 12. For S j , q m ( S j ) ( m ( S j ) − ≥ m ( S j ) − (2 − √

2) since m ( S j ) ≥

23. For W j , q m ( W j ) ( m ( W j ) − ≥ m ( W j ) − (2 − √

2) since m ( W j ) ≥

24. For Y j , q m ( Y j ) ( m ( Y j ) − ≥ m ( Y j ) − (3 − √

6) since m ( Y j ) ≥ | V C ( G ) | ≤ (0 . t + 8 δk + t + (1 . t + (1 . t + (cid:16) √ (cid:17) ·  m − k/ − t X j =1 ( m ( P j ) −

1) + − t X j =1 (cid:16) m ( S j ) − (2 − √ (cid:17) − t X j =1 (cid:16) m ( W j ) − (2 − √ (cid:17) − t X j =1 (cid:16) m ( Y j ) − (3 − √ (cid:17) + δk  Since m = t X j =1 m ( P j ) + t X j =1 m ( S j ) + t X j =1 m ( W j ) + t X j =1 m ( Y j ), we get the following inequality: | V C ( G ) | ≤ (0 . t + 8 δk + t + (1 . t + (1 . t + (cid:16) √ (cid:17) · − k/ t + t · (cid:16) − √ (cid:17) ++ t · (cid:16) − √ (cid:17) + t · (cid:16) − √ (cid:17) + δk ! = (0 . t + 8 δk + t + (1 . t + (1 . t + (cid:16) √ (cid:17) · ( β − k − βk t + t · (cid:16) − √ (cid:17) ++ t · (cid:16) − √ (cid:17) + t · (cid:16) − √ (cid:17) + δk ! βk = t + t + t + t , and obtain the following inequality: | V C ( G ) | ≤ (0 . t + 8 δk + t + (1 . t + (1 . t + (cid:16) √ (cid:17) · (cid:18) ( β − k t t

10 + t

10 + 3 t

50 + δk (cid:19) = (1 . t + (1 . t + (1 . t + (1 . t + (cid:16) √ (cid:17) · ( β − k (cid:16) √ (cid:17) δk< (1 . βk + (cid:16) √ (cid:17) · ( β − k (cid:16) √ (cid:17) δk (using t + t + t + t = βk ) < (3 . βk − (1 . k + (cid:16) √ (cid:17) δk ≤ (2 − ε ) k, for β < .

015 and appropriately small constants ε, δ > k -medianproblem. Corollary 36.1.

There exists a constant ε > such that for any constant < β < . , thereis no (1 + ε , β ) -approximation algorithm for the k -median problem assuming the Unique GamesConjecture.Proof. In the proof of Corollary 8.1, we showed that k ≥ m for all the hard Vertex Cover instances.Therefore, the second property of Theorem 36, implies that

OP T ( X , βk ) ≥ ( m − k ) + δk ≥ (1 + δ ) · ( m − k ). Thus, the k -median problem can not be approximated within any factor smallerthan 1 + δ = 1 + Ω( ε ), with βk centers for any β < . k -means objective. k -means Here, we again use the same reduction that we used earlier for the k -median problem in Sec-tions 1.2, 3, and 5.1. Using this, we establish the following theorem. Theorem 37.

There is an efficient reduction from Vertex Cover on bounded degree triangle-free graphs $G$ (with $m$ edges) to Euclidean $k$-means instances $I = (\mathcal{X}, k)$ that satisfies the following properties:

1. If $G$ has a vertex cover of size $k$, then $OPT(\mathcal{X}, k) \leq m - k$.

2. For any $1 < \lambda \leq 2$ and $\beta < \frac{2}{7}\cdot\left(\lambda + \frac{5}{2}\right)$, there exist constants $\varepsilon, \delta > 0$ such that if $G$ has no vertex cover of size $\leq (\lambda - \varepsilon)\cdot k$, then $OPT(\mathcal{X}, \beta k) \geq m - k + \delta k$.

This theorem is simply an extension of the result of Awasthi et al. [ACKS15] to the bi-criteria setting. Now, let us prove this theorem.
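The range of $\beta$ in Property 2 can be related to the Vertex Cover hardness factor $\lambda$ with exact arithmetic. The sketch below assumes the bound has the form $\beta k + \frac{5}{2}(\beta - 1)k < \lambda k$, that is, $\beta < \frac{2}{7}(\lambda + \frac{5}{2})$; these constants are reconstructed from the damaged formulas and should be read as an assumption. The function name is illustrative.

```python
from fractions import Fraction


def beta_threshold(lam):
    """Exclusive upper limit on beta implied by (7/2) * beta - 5/2 < lam."""
    return Fraction(2, 7) * (Fraction(lam) + Fraction(5, 2))


# With lam = 2, the Vertex Cover factor under the Unique Games Conjecture:
limit = beta_threshold(2)
print(limit, float(limit))   # 9/7, slightly above the stated range beta < 1.28
```

With $\lambda = 2$ the threshold is $9/7$, which is consistent with the range $\beta < 1.28$ quoted in the abstract.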

Note that the proof of completeness is already given in [ACKS15]. Therefore, we just describe the main components of the proof for the sake of clarity. To understand the proof, let us define some notation used in [ACKS15]. Suppose $F$ is a subgraph of $G$. For a vertex $v \in V(F)$, let $d_F(v)$ denote the number of edges in $F$ that are incident on $v$. Note that the optimal center for the 1-means problem is simply the centroid of the point set. Therefore, we can compute the optimal 1-means cost of $F$. The following lemma states the optimal 1-means cost of $F$.

Lemma 38 (Claim 4.3 [ACKS15]). Let $F$ be a subgraph of $G$ with $r$ edges. Then, the optimal 1-means cost of $F$ is $\sum_{v} d_F(v)\left(1 - \frac{d_F(v)}{r}\right)$.

The following corollary bounds the optimal 1-means cost of a star cluster. This corollary is implicitly stated in the proof of Claim 4.4 of [ACKS15].
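The closed form of Lemma 38 can be compared against a direct computation. This sketch assumes the embedding used by the reduction, in which an edge $(u, v)$ is mapped to the vector $e_u + e_v$; the embedding is not restated in this section, so treat it as an assumption. Function names are illustrative.

```python
from collections import Counter


def one_means_cost_formula(edges):
    """Closed form: sum over vertices of d(v) * (1 - d(v) / r)."""
    r = len(edges)
    degree = Counter(x for e in edges for x in e)
    return sum(d * (1 - d / r) for d in degree.values())


def one_means_cost_direct(edges):
    """Sum of squared distances from the embedded edges to their centroid."""
    vertices = sorted({x for e in edges for x in e})
    index = {v: i for i, v in enumerate(vertices)}
    points = []
    for u, v in edges:
        x = [0.0] * len(vertices)
        x[index[u]] = x[index[v]] = 1.0   # assumed embedding e_u + e_v
        points.append(x)
    r = len(points)
    centroid = [sum(p[i] for p in points) / r for i in range(len(vertices))]
    return sum(sum((a - b) ** 2 for a, b in zip(p, centroid)) for p in points)


star = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5)]          # r = 5 edges
five_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]    # a non-star cluster
for graph in (star, five_cycle):
    print(one_means_cost_formula(graph), one_means_cost_direct(graph))
```

For the star both computations give $r - 1$ up to rounding, matching the corollary stated next, while the 5-cycle costs strictly more than $r - 1$.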

Corollary 38.1. The optimal 1-means cost of a star cluster with $r$ edges is $r - 1$.

Using the above corollary, we give the proof of completeness. Let $V = \{v_1, \ldots, v_k\}$ be a vertex cover of $G$. Let $S_i$ denote the set of edges covered by $v_i$. If an edge is covered by two vertices $v_i$ and $v_j$, then we arbitrarily keep the edge either in $S_i$ or in $S_j$. Let $m(S_i)$ denote the number of edges in $S_i$. We define $\{\mathcal{X}(S_1), \ldots, \mathcal{X}(S_k)\}$ as a clustering of the point set $\mathcal{X}$. Now, we show that the cost of this clustering is at most $m - k$. Note that each $S_i$ forms a star graph with its edges sharing the common vertex $v_i$. The following sequence of inequalities bounds the optimal $k$-means cost of $\mathcal{X}$:

$$OPT(\mathcal{X}, k) \leq \sum_{i=1}^{k} \Phi^*(S_i) = \sum_{i=1}^{k} \left(m(S_i) - 1\right) = m - k \qquad \text{(Corollary 38.1)}$$

For the proof of soundness, we prove the following contrapositive statement: “For any constant $1 < \lambda \leq 2$ and $\beta < \frac{2}{7}\cdot\left(\lambda + \frac{5}{2}\right)$, there exist constants $\varepsilon, \delta > 0$ such that if $OPT(\mathcal{X}, \beta k) \leq (m - k + \delta k)$ then $G$ has a vertex cover of size at most $(\lambda - \varepsilon)k$, for $\varepsilon = \Omega(\delta)$.” Let $\mathcal{C}$ denote an optimal clustering of $\mathcal{X}$ with $\beta k$ centers. We classify its optimal clusters into two categories: (1) star and (2) non-star. Suppose there are $t_1$ star clusters $S_1, \ldots, S_{t_1}$, and $t_2$ non-star clusters $F_1, F_2, \ldots, F_{t_2}$. Note that $t_1 + t_2$ equals $\beta k$. The following lemma bounds the optimal 1-means cost of a non-star cluster.

Lemma 39 (Lemma 4.8 [ACKS15]). The optimal 1-means cost of any non-star cluster $F$ with $m$ edges is at least $m - 1 + \delta(F)$, where $\delta(F) \geq 2/3$. Furthermore, there is an edge $(u, v) \in E(F)$ such that $d_F(u) + d_F(v) \geq m + 1 - \delta(F)$.

In the actual statement of the lemma in [ACKS15], the authors mention a weaker bound of $\delta(F) > 1/2$. However, in the proof of their lemma they show $\delta(F) > 2/3 > 1/2$. This difference does not matter when we consider inapproximability of the $k$-means problem. However, it improves the $\beta$ value in the bi-criteria inapproximability of the $k$-means problem.

Corollary 39.1 ([ACKS15]). Any non-star cluster $F$ has a vertex cover of size at most $1 + \frac{5}{2}\cdot\delta(F)$.

Proof. Let $(u, v)$ be an edge in $F$ that satisfies the property $d_F(u) + d_F(v) \geq m + 1 - \delta(F)$, which exists by Lemma 39. This means that $u$ and $v$ cover at least $m(F) - \delta(F)$ edges of $F$. We pick $u$ and $v$ in the vertex cover, and for the remaining $\delta(F)$ edges we pick one vertex per edge. Therefore, $F$ has a vertex cover of size at most $2 + \delta(F)$. Since $\delta(F) \geq 2/3$ by Lemma 39, we get $2 + \delta(F) \leq 1 + \frac{5}{2}\cdot\delta(F)$. Hence, $F$ has a vertex cover of size at most $1 + \frac{5}{2}\cdot\delta(F)$. This proves the corollary.

Now, the following sequence of inequalities bounds the vertex cover size of the entire graph $G$:

$$\begin{aligned}
|VC(G)| &\leq \sum_{i=1}^{t_1} |VC(S_i)| + \sum_{i=1}^{t_2} |VC(F_i)| \\
&\leq t_1 + \sum_{i=1}^{t_2} \left(1 + \frac{5}{2}\cdot\delta(F_i)\right) \qquad \text{(using Corollary 39.1)} \\
&= t_1 + t_2 + \frac{5}{2}\cdot\sum_{i=1}^{t_2} \delta(F_i)
\end{aligned}$$

Since the optimal $k$-means cost is $OPT(\mathcal{X}, \beta k) = \sum_{i=1}^{t_1} \left(m(S_i) - 1\right) + \sum_{i=1}^{t_2} \left(m(F_i) - 1 + \delta(F_i)\right) \leq m - k + \delta k$, and $t_1 + t_2 = \beta k$, we get $\sum_{i=1}^{t_2} \delta(F_i) \leq (\beta - 1)k + \delta k$. On substituting this value in the previous equation, we get the following inequality:

$$\begin{aligned}
|VC(G)| &\leq t_1 + t_2 + \frac{5}{2}\cdot(\beta - 1)k + \frac{5}{2}\cdot\delta k \\
&= \beta k + \frac{5}{2}\cdot(\beta - 1)k + \frac{5}{2}\cdot\delta k \qquad (\because\ t_1 + t_2 = \beta k) \\
&\leq (\lambda - \varepsilon)k,
\end{aligned}$$

for $\beta < \frac{2}{7}\cdot\left(\lambda + \frac{5}{2}\right)$ and appropriately small constants $\varepsilon, \delta > 0$. This completes the proof of soundness. The following corollary gives the bi-criteria inapproximability result for the $k$-means problem. Corollary 39.2.

For any constant $1 < \beta < 1.28$, there exists a constant $\varepsilon > 0$ such that there is no $(1 + \varepsilon, \beta)$-approximation algorithm for the $k$-means problem assuming the Unique Games Conjecture. Moreover, the same result holds for any $1 < \beta < 1.1$ under the assumption that $\mathsf{P} \neq \mathsf{NP}$.

Proof. Suppose Vertex Cover cannot be approximated to any factor smaller than $\lambda - \varepsilon$, for some constants $\varepsilon, \lambda > 0$. In the proof of Corollary 8.1, we showed that $k$ is at least a constant fraction of $m$ for all the hard Vertex Cover instances. In that case, the second property of Theorem 37 implies that $OPT(\mathcal{X}, \beta k) \ge (m - k) + \delta k \ge (1 + \delta') \cdot (m - k)$ for some constant $\delta' = \Omega(\delta)$. Thus, the $k$-means problem cannot be approximated within any factor smaller than $1 + \delta' = 1 + \Omega(\varepsilon)$ with $\beta k$ centers. Now, let us compute the value of $\beta$ based on the value of $\lambda$. We know that $\beta < \frac{2}{7} \cdot \left(\lambda + \frac{5}{2}\right)$. Consider the following two cases:

• By Theorem 7, Vertex Cover is hard to approximate within any factor smaller than $2 - \varepsilon$ on bounded degree triangle-free graphs assuming the Unique Games Conjecture. Hence $\lambda = 2$, and since $\frac{2}{7} \cdot \left(2 + \frac{5}{2}\right) = \frac{9}{7} > 1.28$, we get $\beta < 1.28$ assuming the Unique Games Conjecture.

• By Theorem 4, Vertex Cover is hard to approximate within any factor smaller than $1.36$ on bounded degree triangle-free graphs assuming $\mathsf{P} \neq \mathsf{NP}$. Hence $\lambda = 1.36$, and since $\frac{2}{7} \cdot \left(1.36 + \frac{5}{2}\right) > 1.1$, we get $\beta < 1.1$ assuming $\mathsf{P} \neq \mathsf{NP}$.

This completes the proof of the corollary.

We showed that the Euclidean $k$-median problem cannot be approximated to any factor smaller than $(1 + \varepsilon)$ for some constant $\varepsilon > 0$, assuming the Unique Games Conjecture. We also showed hardness of approximation in the bi-criteria setting for the Euclidean $k$-median and $k$-means problems. Besides trying to improve the inapproximability bounds, one interesting future direction is to check if the Euclidean $k$-means/$k$-median problems are hard to approximate in the bi-criteria setting with $2k$ or more centers. It would also be interesting to try designing a $(1 + \varepsilon, \beta)$-approximation algorithm for $k$-means and $k$-median, for an arbitrarily small constant $\varepsilon > 0$ [...] $\beta >$ [...] $\varepsilon$.

References

[ACKS15] Pranjal Awasthi, Moses Charikar, Ravishankar Krishnaswamy, and Ali Kemal Sinop. The Hardness of Approximation of Euclidean k-Means. In Lars Arge and János Pach, editors, , volume 34 of Leibniz International Proceedings in Informatics (LIPIcs), pages 754–767, Dagstuhl, Germany, 2015. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.

[ADHP09] Daniel Aloise, Amit Deshpande, Pierre Hansen, and Preyas Popat. NP-hardness of Euclidean sum-of-squares clustering.

Mach. Learn., 75(2):245–248, May 2009.

[ADK09] Ankit Aggarwal, Amit Deshpande, and Ravi Kannan. Adaptive sampling for k-means clustering. In Irit Dinur, Klaus Jansen, Joseph Naor, and José Rolim, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 15–28, Berlin, Heidelberg, 2009. Springer Berlin Heidelberg.

[AGK+04] Vijay Arya, Naveen Garg, Rohit Khandekar, Adam Meyerson, Kamesh Munagala, and Vinayaka Pandit. Local search heuristics for k-median and facility location problems. SIAM Journal on Computing, 33(3):544–562, 2004.

[AJM09] Nir Ailon, Ragesh Jaiswal, and Claire Monteleoni. Streaming k-means approximation. In Y. Bengio, D. Schuurmans, J. D. Lafferty, C. K. I. Williams, and A. Culotta, editors, Advances in Neural Information Processing Systems 22, pages 10–18. Curran Associates, Inc., 2009.

[AKS09] P. Austrin, S. Khot, and M. Safra. Inapproximability of vertex cover and independent set in bounded degree graphs. In , pages 74–80, 2009.

[ANSW17] S. Ahmadian, A. Norouzi-Fard, O. Svensson, and J. Ward. Better guarantees for k-means and Euclidean k-median by primal-dual algorithms. In , pages 61–72, Oct 2017.

[BHPI02] Mihai Bădoiu, Sariel Har-Peled, and Piotr Indyk. Approximate clustering via core-sets. In Proceedings of the Thirty-Fourth Annual ACM Symposium on Theory of Computing, STOC '02, pages 250–257, New York, NY, USA, 2002. Association for Computing Machinery.

[Bon76] John Adrian Bondy. Graph Theory With Applications. Elsevier Science Ltd., GBR, 1976.

[BPR+17] Jarosław Byrka, Thomas Pensyl, Bartosz Rybicki, Aravind Srinivasan, and Khoa Trinh. An improved approximation for k-median and positive correlation in budgeted optimization. ACM Trans. Algorithms, 13(2), March 2017.

[BV16] Sayan Bandyapadhyay and Kasturi Varadarajan. On Variants of k-means Clustering. In Sándor Fekete and Anna Lubiw, editors, , volume 51 of Leibniz International Proceedings in Informatics (LIPIcs), pages 14:1–14:15, Dagstuhl, Germany, 2016. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.

[CA18] Vincent Cohen-Addad. A fast approximation scheme for low-dimensional k-means. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '18, pages 430–440, USA, 2018. Society for Industrial and Applied Mathematics.

[CAC19] Vincent Cohen-Addad and Karthik C. S. Inapproximability of clustering in lp metrics. In , pages 519–539, 2019.

[CAKM16] Vincent Cohen-Addad, Philip N. Klein, and Claire Mathieu. Local search yields approximation schemes for k-means and k-median in Euclidean and minor-free metrics. , 00:353–364, 2016.

[CASL20] Vincent Cohen-Addad, Karthik C. S., and Euiwoong Lee. On approximability of clustering problems without candidate centers, 2020.

[CGTS99] Moses Charikar, Sudipto Guha, Éva Tardos, and David B. Shmoys. A constant-factor approximation algorithm for the k-median problem (extended abstract). In Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, STOC '99, pages 1–10, New York, NY, USA, 1999. Association for Computing Machinery.

[Che09] Ke Chen. On coresets for k-median and k-means clustering in metric and Euclidean spaces and their applications. SIAM Journal on Computing, 39(3):923–947, 2009.

[CLM+16] Michael B. Cohen, Yin Tat Lee, Gary Miller, Jakub Pachocki, and Aaron Sidford. Geometric Median in Nearly Linear Time, pages 9–21. Association for Computing Machinery, New York, NY, USA, 2016.

[CT89] Ramaswamy Chandrasekaran and Arie Tamir. Open questions concerning Weiszfeld's algorithm for the Fermat-Weber location problem. Mathematical Programming, 44(1-3):293–295, 1989.

[Das08] Sanjoy Dasgupta. The hardness of k-means clustering. Technical Report CS2008-0916, Department of Computer Science and Engineering, University of California San Diego, 2008.

[DG03] Sanjoy Dasgupta and Anupam Gupta. An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures & Algorithms, 22(1):60–65, 2003.

[DH01] Zvi Drezner and Horst W. Hamacher. Facility location: applications and theory. Springer Science & Business Media, 2001.

[DS05] Irit Dinur and Samuel Safra. On the hardness of approximating minimum vertex cover. Annals of Mathematics, 162(1):439–485, 2005.

[FMS07] Dan Feldman, Morteza Monemizadeh, and Christian Sohler. A PTAS for k-means clustering based on weak coresets. In Proceedings of the Twenty-Third Annual Symposium on Computational Geometry, SCG '07, pages 11–18, New York, NY, USA, 2007. ACM.

[FRS16] Zachary Friggstad, Mohsen Rezapour, and Mohammad R. Salavatipour. Local search yields a PTAS for k-means in doubling metrics. , 00:365–374, 2016.

[HAL48] J. B. S. Haldane. Note on the median of a multivariate distribution. Biometrika, 35(3-4):414–417, 12 1948.

[JKS14] Ragesh Jaiswal, Amit Kumar, and Sandeep Sen. A simple $D^2$-sampling based PTAS for k-means and other clustering problems. Algorithmica, 70(1):22–46, 2014.

[KMN+02] Tapas Kanungo, David M. Mount, Nathan S. Netanyahu, Christine D. Piatko, Ruth Silverman, and Angela Y. Wu. A local search approximation algorithm for k-means clustering. In Proceedings of the Eighteenth Annual Symposium on Computational Geometry, SCG '02, pages 10–18, New York, NY, USA, 2002. Association for Computing Machinery.

[KSS10] Amit Kumar, Yogish Sabharwal, and Sandeep Sen. Linear-time approximation schemes for clustering problems in any dimensions. J. ACM, 57(2):5:1–5:32, February 2010.

[KV97] J. Krarup and S. Vajda. On Torricelli's geometrical solution to a problem of Fermat. IMA Journal of Management Mathematics, 8(3):215–224, 1997.

[Li16] Shi Li. Approximating capacitated k-median with (1 + ε)k open facilities. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '16, pages 786–796, USA, 2016. Society for Industrial and Applied Mathematics.

[Li17] Shi Li. On uniform capacitated k-median beyond the natural LP relaxation. ACM Trans. Algorithms, 13(2), January 2017.

[LR91] Hendrik P. Lopuhaa and Peter J. Rousseeuw. Breakdown points of affine equivariant estimators of multivariate location and covariance matrices. The Annals of Statistics, 19(1):229–248, 1991.

[LS13] Shi Li and Ola Svensson. Approximating k-median via pseudo-approximation. In Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing, STOC '13, pages 901–910, New York, NY, USA, 2013. Association for Computing Machinery.

[LSW17] Euiwoong Lee, Melanie Schmidt, and John Wright. Improved and simplified inapproximability for k-means. Information Processing Letters, 120:40–43, 2017.

[Mat00] Jiří Matoušek. On approximate geometric k-clustering. Discrete & Computational Geometry, 24(1):61–84, 2000.

[MD87] P. Milasevic and G. R. Ducharme. Uniqueness of the spatial median. Ann. Statist., 15(3):1332–1333, 09 1987.

[MMR19] Konstantin Makarychev, Yury Makarychev, and Ilya Razenshteyn. Performance of Johnson-Lindenstrauss transform for k-means and k-medians clustering. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, pages 1027–1038, New York, NY, USA, 2019. Association for Computing Machinery.

[MMSW16] Konstantin Makarychev, Yury Makarychev, Maxim Sviridenko, and Justin Ward. A Bi-Criteria Approximation Algorithm for k-Means. In Klaus Jansen, Claire Mathieu, José D. P. Rolim, and Chris Umans, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016), volume 60 of Leibniz International Proceedings in Informatics (LIPIcs), pages 14:1–14:20, Dagstuhl, Germany, 2016. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.

[MNV12] Meena Mahajan, Prajakta Nimbhorkar, and Kasturi Varadarajan. The planar k-means problem is NP-hard. Theoretical Computer Science, 442:13–21, 2012. Special Issue on the Workshop on Algorithms and Computation (WALCOM 2009).

[MS84] Nimrod Megiddo and Kenneth J. Supowit. On the complexity of some common geometric location problems. SIAM Journal on Computing, 13(1):182–196, 1984.

[Vat09] Andrea Vattani. The hardness of k-means clustering in the plane. Technical report, Department of Computer Science and Engineering, University of California San Diego, 2009.

[WEI37] E. Weiszfeld. Sur le point pour lequel la somme des distances de n points donnés est minimum. Tohoku Mathematical Journal, First Series, 43:355–386, 1937.

[Wei16] Dennis Wei. A constant-factor bi-criteria approximation guarantee for k-means++. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, editors,