Approximation Algorithms for Hard Capacitated k-facility Location Problems
Karen Aardal, Pieter van den Berg, Dion Gijswijt, Shanfei Li
Delft Institute of Applied Mathematics, Delft University of Technology, The Netherlands
Centrum Wiskunde en Informatica, The Netherlands
Abstract
We study the capacitated $k$-facility location problem, in which we are given a set of clients with demands, a set of facilities with capacities and a constant number $k$. It costs $f_i$ to open facility $i$, and $c_{ij}$ for facility $i$ to serve one unit of demand from client $j$. The objective is to open at most $k$ facilities serving all the demands and satisfying the capacity constraints while minimizing the sum of service and opening costs.

In this paper, we give the first fully polynomial time approximation scheme (FPTAS) for the single-sink (single-client) capacitated $k$-facility location problem. Then, we show that the capacitated $k$-facility location problem with uniform capacities is solvable in polynomial time if the number of clients is fixed, by reducing it to a collection of transportation problems. Third, we analyze the structure of extreme point solutions, and examine the efficiency of this structure in designing approximation algorithms for capacitated $k$-facility location problems. Finally, we extend our results to obtain an improved approximation algorithm for the capacitated facility location problem with uniform opening cost.

1 Introduction

In the capacitated $k$-facility location problem (CKFL), we are given a set $D$ of clients and a set $F$ of potential facilities (locations where we can potentially open a facility) in a metric space. Each facility $i \in F$ has a capacity $s_i$. Each client $j$ has a demand $d_j$ that must be served. Establishing facility $i$ incurs an opening cost $f_i$. Shipping $x_{ij}$ units from facility $i$ to client $j$ incurs service costs $c_{ij} x_{ij}$, where $c_{ij}$ is proportional to the distance between $i$ and $j$. The goal is to serve all the clients by using at most $k$ facilities and satisfying the capacity constraints such that the total cost is minimized. In this paper, we consider hard capacities, that is, we allow at most one facility to be opened at any location.
(Note that in the soft capacities case multiple facilities can be opened in a single location [33].) CKFL can be formulated as the following mixed integer program (MIP), where variable $x_{ij}$ indicates the amount of the demand of client $j$ that is served by facility $i$, and $y_i$ indicates whether facility $i$ is open.

\[ \min \sum_{i \in F} \sum_{j \in D} c_{ij} x_{ij} + \sum_{i \in F} f_i y_i \tag{1} \]
subject to:
\[ \sum_{i \in F} x_{ij} = d_j, \quad \forall j \in D, \tag{2} \]
\[ \sum_{j \in D} x_{ij} \le s_i y_i, \quad \forall i \in F, \tag{3} \]
\[ \sum_{i \in F} y_i \le k, \tag{4} \]
\[ x_{ij} \ge 0, \quad \forall i \in F,\ \forall j \in D, \tag{5} \]
\[ y_i \in \{0, 1\}, \quad \forall i \in F. \tag{6} \]
If we replace constraints (6) by
\[ 0 \le y_i \le 1, \quad \forall i \in F, \tag{7} \]
we obtain the LP-relaxation of CKFL. Without loss of generality we suppose that $s_i$ and $d_j$, for each $i \in F$, $j \in D$, are all integral.

CKFL is related to the capacitated $k$-median problem (CKM), in which inputs and goal are the same as for CKFL except that there is no opening cost for facilities. A constant factor approximation algorithm is still unknown for CKM, let alone CKFL. All previous attempts with constant approximation ratios for these problems violate either the capacity constraints or the cardinality constraint that at most $k$ facilities are allowed to be used. We call these approximation algorithms pseudo-approximation algorithms. Recently, Byrka et al. [7] gave a constant factor approximation algorithm for CKM with uniform capacities while violating the capacities by a factor $2 + \epsilon$, where $\epsilon > 0$. When the cardinality constraint is violated instead, a constant factor approximation algorithm [27] for CKM with uniform capacities uses at most $(5 + \epsilon)k$ facilities. It seems that obtaining a better constant factor approximation algorithm that violates the cardinality constraint has not received much attention yet.

In this paper, we give an improved approximation algorithm for CKFL (with arbitrary capacities) with uniform opening cost by using at most $2k$ facilities.
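To make the formulation (1)-(6) concrete, the following minimal Python sketch evaluates objective (1) for a candidate solution and rejects it if any of the constraints (2)-(6) fails. The function name `ckfl_cost`, the dictionary-based data layout and the three-facility instance are illustrative assumptions and do not come from the paper; the sketch only checks a given solution and does not solve the MIP.

```python
def ckfl_cost(c, f, s, d, k, x, y, tol=1e-9):
    """Return objective (1) of (x, y); raise ValueError if (2)-(6) is violated.

    c[(i, j)]: unit service cost, f[i]: opening cost, s[i]: capacity,
    d[j]: demand, x[(i, j)]: shipped amount (missing keys mean 0), y[i]: 0/1.
    """
    F, D = list(f), list(d)
    if any(y[i] not in (0, 1) for i in F):                      # constraint (6)
        raise ValueError("y must be binary")
    if sum(y[i] for i in F) > k:                                # constraint (4)
        raise ValueError("more than k facilities are open")
    if any(v < -tol for v in x.values()):                       # constraint (5)
        raise ValueError("negative shipment")
    for j in D:                                                 # constraint (2)
        if abs(sum(x.get((i, j), 0) for i in F) - d[j]) > tol:
            raise ValueError(f"demand of client {j} is not met exactly")
    for i in F:                                                 # constraint (3)
        if sum(x.get((i, j), 0) for j in D) > s[i] * y[i] + tol:
            raise ValueError(f"capacity of facility {i} is exceeded")
    service = sum(c[i, j] * v for (i, j), v in x.items())
    opening = sum(f[i] * y[i] for i in F)
    return service + opening


# Hypothetical instance: three facilities, two clients, k = 2.
f = {"a": 3, "b": 2, "c": 4}
s = {"a": 5, "b": 4, "c": 6}
d = {"u": 4, "v": 3}
c = {("a", "u"): 1, ("a", "v"): 2, ("b", "u"): 2,
     ("b", "v"): 1, ("c", "u"): 3, ("c", "v"): 3}
k = 2
y = {"a": 1, "b": 1, "c": 0}
x = {("a", "u"): 4, ("b", "v"): 3}
print(ckfl_cost(c, f, s, d, k, x, y))
```

For this instance the service cost is 4 + 3 and the opening cost is 3 + 2. Lowering `k` to 1 makes the same solution violate the cardinality constraint (4).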
To show the potential power of this algorithm, we improve the approximation ratio for the capacitated facility location problem with uniform opening cost [30], by combining this algorithm with a pseudo-approximation algorithm for the $k$-median problem derived from a bifactor approximation algorithm for the uncapacitated facility location problem [10]. That is, pseudo-approximation algorithms for capacitated $k$-facility location problems may be extended to get approximation algorithms for well-studied capacitated facility location problems. We believe that this technique has the potential to further improve approximation ratios for capacitated facility location problems.

Additionally, in Section 2 we give the first fully polynomial time approximation scheme (FPTAS) for the single-sink (hard) capacitated $k$-facility location problem. In Section 3, we give a polynomial time algorithm for the uniform capacitated $k$-facility location problem with a fixed number of clients.

The $k$-facility location problem has been studied since the early 90s [14, 21]. It is a common generalization of the $k$-median problem (in which at most $k$ facilities are allowed to be opened, and there are no opening costs) and the uncapacitated facility location problem, which are classical problems in computer science and operations research and have a wide variety of applications in clustering, data mining, logistics [6, 22, 28], even for the single-sink (single client) case [19].

For the uncapacitated $k$-facility location problem (UKFL), Charikar et al. [11] gave the first constant factor approximation algorithm with performance guarantee 9.8, by modifying their $6\frac{2}{3}$-approximation algorithm for the uncapacitated $k$-median problem. Later, the approximation ratio was improved by Jain and Vazirani [25], who made use of a primal-dual scheme and Lagrangian relaxation techniques to obtain a 6-approximation algorithm. Jain et al. [23, 24] further improved the ratio to 4 by using a greedy approach and the so-called Lagrangian Multiplier Preserving property of the algorithms. The best known approximation algorithm for this problem, due to Zhang [38], achieves a factor of $2 + \sqrt{3} + \epsilon$ using a local search technique. The $k$-median problem, as a special case of UKFL, was studied extensively [2, 3, 8, 10, 11, 24, 25, 31] and the best known approximation algorithm was recently given by Byrka et al. [8] with approximation ratio $2.611 + \epsilon$ by improving the algorithm of Li and Svensson [31]. In addition, Edwards [15] gave a constant factor approximation algorithm, with ratio between 7 and 8, for a related $k$-facility location problem by extending the $6\frac{2}{3}$-approximation algorithm by Charikar et al. [11] for the uncapacitated $k$-median problem.

Unfortunately, the capacitated $k$-facility location problem is much less understood, although the presence of capacity constraints is natural in practice. The difficulty of the problem lies in the fact that two kinds of hard constraints appear together: the cardinality constraint and the capacity constraints. This seems to defeat methods such as LP-rounding and the primal-dual method used to solve the $k$-median problem, and even the local search algorithms used to solve the capacitated facility location problem and the $k$-median problem.

The capacitated $k$-facility location problem is related to the capacitated facility location problem (CFL), whose inputs and goal are the same as for CKFL but without the cardinality constraint. Most known approximation algorithms for CFL are based on local search, since the natural linear programming relaxation has an unbounded integrality gap for the general case [34]. For nonuniform capacities, Pál, Tardos, and Wexler [34] proposed the first constant factor approximation algorithm with a factor of 8.53. Later, Mahdian and Pál [32] improved this factor to 7.88. Zhang, Chen, and Ye [37] reduced this factor to $3 + 2\sqrt{2} + \epsilon$ by introducing a multi-exchange operation. The currently best known approximation algorithm, due to Bansal, Garg, and Gupta [4], achieves the approximation ratio 5. As it was expected that the problem is easier for uniform capacities, Korupolu, Plaxton, and Rajaraman (KPR) [27] gave the first constant factor approximation algorithm with a factor of 8. Later, this factor was improved to 5.83 by Chudak and Williamson [12]. The currently best approximation algorithm, due to Aggarwal et al. [1], has a performance guarantee of 3.

Additionally, Levi, Shmoys, and Swamy [30] showed that the linear programming relaxation has a bounded integrality gap for CFL with uniform opening costs, and gave a 5-approximation algorithm for this case by an LP-rounding technique.

The capacitated $k$-median problem (CKM), which is a special case of CKFL, is already difficult to handle. The natural linear programming relaxation has an unbounded integrality gap (see Remark 1). We have to blow up the capacity or increase the number of open facilities by a factor of at least 2 if we use the cost of the LP solution as a lower bound to obtain an integral solution [11].

For the hard uniform capacity case, Charikar et al. [11] gave a constant factor approximation algorithm while violating the capacities within a constant factor 3 by LP-rounding. Recently, Byrka et al. [7] improved this violation ratio to $2 + \epsilon$ by designing a $(32l^2 + 28l + 7)$-approximation algorithm increasing the capacity by a factor of $2 + \frac{3}{l-1}$, $l \in \{2, 3, 4, \dots\}$. Based on a local search technique, Korupolu et al. [27] proposed a $(1 + 5/\epsilon)$-approximation algorithm by using at most $(5 + \epsilon)k$ facilities, and a $(1 + \epsilon)$-approximation algorithm by using at most $(5 + 5/\epsilon)k$ facilities.

For soft non-uniform capacities, based on primal-dual and Lagrangian relaxation methods, Chuzhoy and Rabani [13] presented a 40-approximation algorithm by violating the capacities within a constant factor of 50. Bartal et al. [5] proposed a $19.3(1 + \delta)/\delta^2$-approximation algorithm ($\delta > 0$) by using at most $(1 + \delta)k$ facilities.

To the best of our knowledge, for hard non-uniform capacities, a constant factor approximation algorithm is still unknown even if we allow for a violation of the two kinds of hard constraints: the cardinality constraint and the capacity constraints. Without violating any constraint, a constant factor approximation algorithm remains unknown even for the single-sink capacitated $k$-median problem in which $|D| = 1$, let alone the capacitated $k$-facility location problem.

(i) The single-sink facility location problem has several applications in practice [19]. We show that the single-sink hard capacitated $k$-facility location problem (SCKFL), in which $D$ contains exactly one client, is NP-hard even when $f_i = 0$, $i \in F$. We give the first FPTAS for SCKFL by extending the FPTAS for the knapsack problem. To the best of our knowledge, this is also the first FPTAS for the single-sink capacitated facility location problem, which answers a question by Görtz and Klose [17].

(ii) For the hard capacitated $k$-facility location problem with uniform capacities, in which $s_i = s$, $i \in F$, we observe that for $|D| = 1$ it is easy to find an optimal solution. A natural question is whether this extends to any fixed number $m := |D|$ of clients. We give a polynomial time algorithm for this setting that runs in time $O(\binom{n}{m} \cdot n^3)$, where $n = |F|$. Using the structure of the graph consisting of the fractional valued edges in any extreme solution, the problem is reduced to a number of transportation problems.

(iii) We observe that the number of fractionally open facilities can be bounded by analyzing the rank of the constraint matrix corresponding to the tight constraints at a fractional extreme point solution.
Then, we give approximation algorithms for two variants of the hard capacitated $k$-facility location problem based on this upper bound.

Another example showing the potential power of the structure of extreme point solutions is that we can slightly improve the previous best approximation ratio 5 obtained by Levi, Shmoys, and Swamy [30], and Bansal, Garg, and Gupta [4] for the capacitated facility location problem with uniform opening costs, by combining our technique with a pseudo-approximation algorithm for the $k$-median problem.

2 The Single-sink Capacitated $k$-facility Location Problem

In this section, we consider the single-sink capacitated $k$-facility location problem (SCKFL). Since we only have one client with demand $d$, the formulation for CKFL reduces to the following mixed integer program.

\[ Z_{MIP} = \min \sum_{i \in F} (c_i x_i + f_i y_i) \tag{8} \]
subject to:
\[ \sum_{i \in F} x_i = d, \tag{9} \]
\[ \sum_{i \in F} y_i \le k, \tag{10} \]
\[ 0 \le x_i \le s_i y_i, \quad \forall i \in F, \tag{11} \]
\[ y_i \in \{0, 1\}, \quad \forall i \in F. \tag{12} \]
Again, the natural LP relaxation of SCKFL can be obtained by replacing constraints (12) by (7).

Lemma 1.
The single-sink capacitated $k$-facility location problem is NP-hard even when $f_i = 0$ for all $i \in F$.

Proof. Consider the case that $s_i > 0$, $c_i := 1 - \frac{1}{s_i}$ and $f_i = 0$ for all $i \in F$. We claim that
\[ Z_{MIP} \le d - k \iff \text{there exists } I \subseteq F \text{ with } |I| = k \text{ and } \sum_{i \in I} s_i = d. \tag{13} \]
Indeed, for the objective value we find
\[ \sum_{i \in F} c_i x_i = d - \sum_{i \in F} \frac{x_i}{s_i} = d - \sum_{i \,:\, y_i = 1} \frac{x_i}{s_i} \ge d - k, \]
where the last inequality holds because $x_i \le s_i$ and $y_i = 1$ for at most $k$ values of $i$. Equality holds if and only if exactly $k$ facilities are open and $x_i = s_i$ for all $i \in F$ with $y_i = 1$. That is, if and only if $\sum \{ s_i \mid y_i = 1 \} = d$.

The claim above allows us to reduce SUBSET-SUM to SCKFL as follows. Let positive integers $s_1, \dots, s_n$ and $d$ form an instance of SUBSET-SUM. Now there exists a subset $I \subseteq \{1, 2, \dots, n\}$ such that $\sum_{i \in I} s_i = d$ if and only if the objective value of SCKFL is at most $d - k$ for some $k \in \{1, \dots, n\}$.

Remark 1.
The integrality gap $Z_{MIP}/Z_{LP}$ is unbounded. Take the instance shown in Figure 1 with four facilities $\{1, 2, 3, 4\}$, $s_1 = s_2 = s$, $s_3 = Ms$, $s_4 = s + 1$, $d = 2s + 1$, $f_1 = f_2 = f_3 = f_4 = 0$, and $c_1 = c_2 = 0$, $c_3 = 100$, $c_4 = 1$, $k = 2$ and $M \gg s \gg 1$. Then $Z_{MIP} = s + 1$ and $Z_{LP} = \frac{100M}{M-1}$. Thus,
\[ \frac{Z_{MIP}}{Z_{LP}} = \frac{s+1}{100M/(M-1)} > \frac{s+1}{200}, \]
which can be arbitrarily large. In addition, a simple LP-rounding technique does not work for SCKFL. For the above instance, an optimal solution for the LP-relaxation is $y_1 = 1$, $y_2 = \frac{Ms - s - 1}{(M-1)s}$, $y_3 = \frac{1}{(M-1)s}$, $x_1 = s$, $x_2 = \frac{Ms - s - 1}{M-1}$, $x_3 = \frac{M}{M-1}$. A natural idea is to round one of the two fractional variables $y_2$, $y_3$ to 1 and the other to 0. Rounding $y_2$ up leaves a total capacity of $2s < d$, so the demand cannot be served, while rounding $y_3$ up gives a solution of cost $100(s+1)$. It is clear that the objective value of a feasible solution obtained by this simple rounding is still really large.

We aim to design a fully polynomial time approximation scheme (FPTAS) for SCKFL. Before introducing our algorithm, we present a key observation (Pál, Tardos, and Wexler gave a similar observation in the proof of Lemma 3.3 in [34]).

Observation 1.
For the single-sink capacitated $k$-facility location problem, there is an optimal solution $(x^*, y^*)$ in which at most one open facility $t$ is not fully used, i.e., $x^*_i \in \{0, s_i\}$ for $i \ne t$.

Figure 1:
An instance for SCKFL. An optimal solution for the LP-relaxation of this instance is $y_1 = 1$, $y_2 = \frac{Ms - s - 1}{(M-1)s}$, $y_3 = \frac{1}{(M-1)s}$, $x_1 = s$, $x_2 = \frac{Ms - s - 1}{M-1}$, $x_3 = \frac{M}{M-1}$, with the total cost $\frac{100M}{M-1}$. An optimal solution for the MIP is $y_1 = y_4 = 1$, $x_1 = s$, $x_4 = s + 1$ with the total cost $s + 1$.

Without loss of generality we suppose that $c_{ij}$ and $f_i$, for each $i \in F$, $j \in D$, are all integral. Given the facility $t$ that is allowed not to be fully used in an optimal integral solution $(x^*, y^*)$, in order to solve SCKFL it is sufficient to solve the following problem for a given integer $p$:
\[ \max \Big\{ \sum_{i \in F'} s_i \;\Big|\; F' \subseteq F \setminus \{t\},\ |F'| \le k - 1,\ \sum_{i \in F'} (c_i s_i + f_i) = p \Big\}. \tag{14} \]
In words, we find for each total cost $p$ a set of at most $k - 1$ facilities (excluding $t$) to open and use to full capacity, maximizing the total capacity.

We can recursively solve the above problem by dynamic programming. Without loss of generality, suppose $F \setminus \{t\} = \{1, 2, \dots, n-1\}$, where $n = |F|$. For nonnegative integers $p$ and $g \le b \le n - 1$, define
\[ S_g(b, p) := \max \Big\{ \sum_{i \in F'} s_i \;\Big|\; F' \subseteq \{1, \dots, b\},\ |F'| \le g,\ \sum_{i \in F'} (c_i s_i + f_i) = p \Big\}, \]
and let $F_g(b, p)$ be an optimal solution $F'$. If $\sum_{i \in F'} (c_i s_i + f_i) = p$ does not hold for any $F' \subseteq \{1, \dots, b\}$ with $|F'| \le g$, we set $S_g(b, p) := -\infty$ and $F_g(b, p) := \emptyset$. Clearly, $S_g(0, 0) = 0$ and $S_g(0, p) = -\infty$ for $p > 0$. The other values $S_g(b, p)$, and the corresponding optimum solutions $F_g(b, p)$, can be computed recursively since
\[ S_g(b+1, p) = \max \big( S_g(b, p),\ s_{b+1} + S_{g-1}(b,\ p - (f_{b+1} + c_{b+1} s_{b+1})) \big) \]
for $0 < g \le b$. In the maximum, the two values correspond to not opening and opening facility $b + 1$, respectively.

For computing the maximum in (14), it suffices to restrict to values $0 \le p \le (k-1)P \le nP$, where $P = \max \{ c_i s_i + f_i \mid i \in \{1, \dots, n-1\} \}$. The table then has at most $n$ values of $g$, $n$ values of $b$ and $nP + 1$ values of $p$, and each entry is computed in constant time. Hence we can solve (14) in time $O(n^3 P)$.

Since $P$ may be exponential in the size of the input of SCKFL, the computing time could be non-polynomial. We overcome this difficulty by a scaling-and-rounding technique. The resulting Algorithm 1 may be seen as a generalization of the FPTAS for the knapsack problem (with cardinality constraints) [9, 29].

Assumption 1.
For each $i \in F$, $C_i > 0$, where $C_i := c_i s_i + f_i$.

Note that if $C_i = 0$ and $s_i < d$ we directly open $i$ and serve demand $s_i$ of the single client by $i$ without increasing the cost. If $C_i = 0$ and $s_i \ge d$, the optimal total cost is 0.

Remark 2.
Note that in Algorithm 1, for nonnegative integers $p$ and $g \le b \le \bar{r}$,
\[ S_g(b, p) := \max \Big\{ \sum_{i \in F'} s_i \;\Big|\; F' \subseteq \{1, \dots, b\} - \{t\},\ |F'| \le g,\ \sum_{i \in F'} (c_i s_i + f_i) = p \Big\}. \]

Theorem 1.
Let $OPT$ be the cost of an optimal solution, and $SOL$ be the cost of the solution returned by Algorithm 1. Then, $SOL \le (1 + \epsilon) OPT$. The running time of Algorithm 1 is $O(n^6/\epsilon)$, for any $\epsilon > 0$.

Algorithm 1 An FPTAS for the single-sink capacitated $k$-facility location problem

Input
Finite set $F$ of facilities, costs $c \in \mathbb{Z}^F_{\ge 0}$, costs $f \in \mathbb{Z}^F_{\ge 0}$, demand $d \in \mathbb{Z}_{\ge 0}$, capacities $s \in \mathbb{Z}^F_{\ge 0}$, integer $1 \le k \le n := |F|$, $\epsilon > 0$.

Output
A feasible solution $(x, y)$ that is within a factor $1 + \epsilon$ of optimum, if a feasible solution exists.

Description
1. Order facilities such that $C_1 \le C_2 \le \dots \le C_n$, where $C_i := c_i s_i + f_i$.
2. for $t = 1$ to $n$ do
     for $r = 1$ to $n$ do
       if $\{1, \dots, r\} - \{t\} = \emptyset$ then
         Let $C_r = 0$, $\bar{f}_t = f_t$, $\bar{c}_t = c_t$, $W = 1$.
       end if
       if $\{1, \dots, r\} - \{t\} \ne \emptyset$ then
         Let $W = \frac{\epsilon C_r}{k}$, where $C_r = \max \{ C_i \mid i \in \{1, \dots, r\} - \{t\} \}$.
         For each facility $i \in \{1, \dots, r\} - \{t\}$, define $\bar{C}_i = \lfloor C_i / W \rfloor$.
         Let $\bar{f}_t = f_t / W$, $\bar{c}_t = c_t / W$.
       end if
       Consider the subproblem $P_{rt}$ involving items $\{1, \dots, r\} \cup \{t\}$, in which only $t$ can be not fully used, that is, $x_i \in \{0, s_i\}$, $i \in \{1, \dots, r\} - \{t\}$; $0 \le x_t \le s_t$. With the above scaled costs, compute $S_g(\bar{r}, p)$ for each $0 \le g \le k - 1$, $0 \le p \le (k-1) \lfloor C_r / W \rfloor$, where $\bar{r} = r$ if $r \ne t$, and $\bar{r} = r - 1$ otherwise. Find a solution attaining
         $\min \{ p + (d - S_g(\bar{r}, p)) \bar{c}_t + \bar{f}_t \mid 0 \le d - S_g(\bar{r}, p) \le s_t,\ 0 \le g \le k - 1,\ 0 \le p \le (k-1) \lfloor C_r / W \rfloor \}$,
       if a feasible solution exists.
     end for
   end for
3. for $r = 1$ to $n$ do
     if $s_r \ge d$ then
       find a solution with total cost $d c_r + f_r$.
     end if
   end for
4. Output the solution with the minimum total original cost.

Proof. Suppose $(x^*, y^*)$ is an optimal solution in which at most one open facility is not fully used. Let $t_0$ be the open facility in $(x^*, y^*)$ that is not fully used if it exists. Otherwise, let $t_0$ be some open facility in $(x^*, y^*)$. Then, we define $F^* = \{ i \in F \mid y^*_i = 1 \} - \{t_0\}$ as the set of opened facilities in $(x^*, y^*)$ excluding $t_0$.

If $F^* = \emptyset$, clearly our algorithm can find an optimal solution in Step 3. If $F^* \ne \emptyset$, let $C_{i_0} = \max \{ C_i \mid i \ne t_0,\ y^*_i = 1 \}$. Note that $C_{i_0} \le OPT$. Moreover, let $i_1 = \max \{ i \in F^* \mid C_i = C_{i_0} \}$. Thus, $C_{i_1} = C_{i_0}$. Suppose in iteration $t = t_0$, $r = i_1$ of Step 2 we get an optimal solution $(x, y)$. Let $F_1 = \{ i \in F \mid y_i = 1 \} - \{t_0\}$. Let $\mathrm{Cost}(x, y)$ and $\mathrm{ScaledCost}(x, y)$ be the original and scaled total cost of solution $(x, y)$ respectively. So, $\mathrm{Cost}(x, y) = (\sum_{i \in F_1} C_i) + x_{t_0} c_{t_0} + f_{t_0}$, and $\mathrm{ScaledCost}(x, y) = (\sum_{i \in F_1} \bar{C}_i) + x_{t_0} \bar{c}_{t_0} + \bar{f}_{t_0}$, where the definition of $\bar{C}_i$ is given in Algorithm 1. We will show that $\mathrm{Cost}(x, y) \le (1 + \epsilon) OPT$, which then implies $SOL \le (1 + \epsilon) OPT$.

Recall that $W = \frac{\epsilon C_r}{k}$, and that $C_r = C_{i_0}$ in this iteration because the facilities are ordered by cost. We have
\[ \mathrm{Cost}(x, y) = \Big( \sum_{i \in F_1} C_i \Big) + x_{t_0} c_{t_0} + f_{t_0} \le \Big( \sum_{i \in F_1} (W \bar{C}_i + W) \Big) + W x_{t_0} \bar{c}_{t_0} + W \bar{f}_{t_0} \]
\[ \le W \Big( \Big( \sum_{i \in F_1} \bar{C}_i \Big) + x_{t_0} \bar{c}_{t_0} + \bar{f}_{t_0} \Big) + kW \le W \cdot \mathrm{ScaledCost}(x, y) + kW, \]
where the second inequality holds as $|F_1| \le k - 1$. The solution $(x^*, y^*)$ is feasible for the subproblem of this iteration, and its scaled cost in this iteration is $(\sum_{i \in F^*} \bar{C}_i) + x^*_{t_0} \bar{c}_{t_0} + \bar{f}_{t_0}$. Clearly,
\[ \mathrm{ScaledCost}(x, y) \le \Big( \sum_{i \in F^*} \bar{C}_i \Big) + x^*_{t_0} \bar{c}_{t_0} + \bar{f}_{t_0}, \]
since $(x, y)$ is optimal in this iteration. That is,
\[ \mathrm{ScaledCost}(x, y) \le \Big( \sum_{i \in F^*} \Big\lfloor \frac{C_i}{W} \Big\rfloor \Big) + x^*_{t_0} \frac{c_{t_0}}{W} + \frac{f_{t_0}}{W}. \]
Then, we have
\[ W \cdot \mathrm{ScaledCost}(x, y) \le W \Big( \sum_{i \in F^*} \Big\lfloor \frac{C_i}{W} \Big\rfloor \Big) + W x^*_{t_0} \frac{c_{t_0}}{W} + W \frac{f_{t_0}}{W} \]
\[ \Rightarrow\ W \cdot \mathrm{ScaledCost}(x, y) + kW \le \Big( \sum_{i \in F^*} C_i \Big) + x^*_{t_0} c_{t_0} + f_{t_0} + kW. \]
Therefore, we get
\[ \mathrm{Cost}(x, y) \le \Big( \sum_{i \in F^*} C_i \Big) + x^*_{t_0} c_{t_0} + f_{t_0} + kW = OPT + \epsilon C_{i_0} \le (1 + \epsilon) OPT, \]
where the equality holds by the definition of $W$ and the last inequality holds as $C_{i_0} \le OPT$.

For fixed $t$ and $r$, the table $S_g(\bar{r}, p)$ of the subproblem $P_{rt}$ has at most $k$ values of $g$, $n$ values of $b$ and $(k-1) \lfloor C_r / W \rfloor + 1$ values of $p$, so its running time is $O(n^3 \lfloor C_r / W \rfloor)$. That is, $O(n^3 k / \epsilon)$. Thus, the total running time of our algorithm is $O(n^6 / \epsilon)$, as we have $O(n^2)$ subproblems and $k \le n$.

3 The Capacitated $k$-facility Location Problem with Uniform Capacities

In this section, we aim to show the following result for the capacitated $k$-facility location problem with uniform capacities (CKFLU). Let $m = |D|$, $n = |F|$ and $s_i = s$, $i \in F$.

Theorem 2.
For fixed $m$, the capacitated $k$-facility location problem with uniform capacities can be solved in polynomial time $O(\binom{n}{m} \cdot n^3)$.

We need new notation to describe our idea. We consider an optimal solution $(x, y)$ for CKFLU as a weighted bipartite graph $G = (V, E)$, where $V = \{ i \in F \mid y_i = 1 \} \cup D$ and $E = \{ \{i, j\} \mid x_{ij} > 0,\ i \in F,\ j \in D \}$. To be more precise, if $x_{ij} > 0$, we add an edge $\{i, j\}$ between facility $i$ and client $j$ with weight $x_{ij}$. Moreover, let $\bar{E} = \{ \{i, j\} \in E \mid 0 < x_{ij} < s \}$ and $\bar{V} = \bigcup_{e \in \bar{E}} e$. We call $(\bar{V}, \bar{E})$ the untight weighted subgraph of $G$.

Define $r_j := d_j / s$ for all $j \in D$. If all $r_j$ are integral, we say that the CKFLU is divisible.

Lemma 2.
The divisible capacitated $k$-facility location problem with uniform capacities can be solved in $O(n^3)$ time.

Proof. We transform the divisible CKFLU to a balanced transportation problem, in which the total capacity is equal to the total demand. Then, to get an integer solution to this transportation problem, we can consider this problem as a minimum weight perfect matching problem that can be solved in $O(n^3)$ time [16], by splitting the demands. Since the problem is infeasible if $k < \sum_{j \in D} r_j$, we only consider the case $|F| \ge k \ge \sum_{j \in D} r_j$.

By dividing the capacity and demand constraints by $s$, we get an equivalent formulation for the divisible CKFLU, in which the new capacity of each facility is 1 and the new demand of each client $j$ is $r_j$.

First, we show that there exists an optimal integral solution for this equivalent formulation. We add a dummy client $j'$ to $D$ with demand $r_{j'} = n - \sum_{j \in D} r_j$. Take the cost of shipping one unit from $i \in F$ to $j \in D \setminus \{j'\}$ to be $s c_{ij} + f_i$, and from $i \in F$ to $j'$ to be 0. Now the divisible CKFLU can be considered as a balanced transportation problem with total demand $n$. Since $r_j$, $j \in D$, are integers, there is an integer optimal solution for this transportation problem (see for instance [20], or Theorem 21.14 in [35]). Note that based on the optimal integer solution for this transportation problem, we can easily construct an optimal solution for our original problem.

Then, to get an optimal integer solution for the constructed transportation problem, we can split each $j \in D$ into $r_j$ copies, each with demand 1. Now we can consider the balanced transportation problem as a minimum weight perfect matching problem that can be solved in $O(n^3)$ time [16].

Note that if we know the exact structure of $(\bar{V}, \bar{E})$, then according to the definition of $G$ the remaining part $(V, E \setminus \bar{E})$ can be generated by an optimal integer solution to an instance of the divisible CKFLU problem. Thus, the high-level idea is that we reduce our original problem to a collection of divisible CKFLU problems by checking all the possible structures of $(\bar{V}, \bar{E})$. To prove that we can examine all the structures in polynomial time, we first show some useful properties of the untight weighted subgraph of $G$.

Lemma 3.
Let $G = (V, E)$ be the graph corresponding to a vertex $(x, y)$ of the convex hull of feasible solutions of the MIP for CKFLU, and $H = (\bar{V}, \bar{E})$ be its corresponding untight subgraph. Then,
(a) $G$ is acyclic;
(b) in each connected component of $H$, there is at most one $i \in F \cap \bar{V}$ with $0 < \sum_{j \in D} x_{ij} < s$.

Proof. (a). Suppose for contradiction that $G$ contains a cycle $O$. Since $G$ is bipartite, $O$ has an even number of edges; let $\chi_O$ be the vector that is alternately $+1$ and $-1$ on the edges of $O$ and 0 elsewhere. Every edge of $O$ has $x_{ij} > 0$, and adding a multiple of $\chi_O$ to $x$ changes neither the total amount shipped from any facility nor the amount received by any client. Hence, for sufficiently small $\epsilon > 0$, both $(x + \epsilon \chi_O, y)$ and $(x - \epsilon \chi_O, y)$ are feasible solutions, contradicting the fact that $(x, y)$ is a vertex.

(b). The idea is similar to (a). Consider any connected component $B$ of $H$. Suppose for contradiction that we have two facilities $i_1, i_2$ in $B$ with $0 < \sum_{j \in D} x_{i_1 j} < s$ and $0 < \sum_{j \in D} x_{i_2 j} < s$. Let $Q$ be a path in $B$ from $i_1$ to $i_2$, and let $\chi_Q$ be the vector that is alternately $+1$ and $-1$ on the edges of $Q$ and 0 elsewhere. Adding a multiple of $\chi_Q$ to $x$ changes only the total amounts shipped from $i_1$ and $i_2$, both of which lie strictly between 0 and $s$, and every edge of $Q$ has $x_{ij} > 0$. Hence, for sufficiently small $\epsilon > 0$, both $(x + \epsilon \chi_Q, y)$ and $(x - \epsilon \chi_Q, y)$ are feasible solutions, again contradicting the fact that $(x, y)$ is a vertex.

Lemma 4. For any untight and acyclic subgraph $H = (\bar{V}, \bar{E})$, given the set $I = \{ i \in F \cap \bar{V} \mid 0 < \sum_{j \in D} x_{ij} < s \}$, we can get the unique weight $x_{ij}$ for each edge $\{i, j\} \in \bar{E}$ in $O(m)$ time.

Proof. Consider any connected component of $H$. Note that each connected component must be a tree. If there is a facility $i^* \in I$ in this component, then take $i^*$ as the root. Otherwise, take an arbitrary facility $i^*$ in this component as the root. Then, all leaves are clients, since $\sum_{j \in D} x_{ij} = s$ for each facility $i \ne i^*$ in the considered connected component (Lemma 3(b)) and $0 < x_{ij} < s$ for each edge $\{i, j\}$.

We will show that in each connected component, if node (client) $j$ is a leaf, we can obtain the exact value of $x_{ij}$, where $i$ is the father of $j$; and for each other node in this tree, we can compute the value of the edge between this node and its father based on the values of the edges between this node and its children. Then, we can obtain the values of all edges in the tree by induction.

Consider a client $j$. Let $f(j)$ be the father node (facility) of $j$ in the tree and $c(j)$ be the set of children (facilities) of $j$. If $j$ is a leaf, that is $c(j) = \emptyset$, then $f(j)$ is the only neighbour of $j$ in $H$, so $|\{ i \in F \mid 0 < x_{ij} < s \}| = 1$. Thus, we can get the exact value $x_{f(j), j} = d_j - \lfloor d_j / s \rfloor \cdot s$, since $j$ has exactly one father. If $j$ is not a leaf, the value is
\[ x_{f(j), j} = \Big( d_j - \sum_{i \in c(j)} x_{ij} \Big) - \Big\lfloor \frac{d_j - \sum_{i \in c(j)} x_{ij}}{s} \Big\rfloor \cdot s, \]
as $x_{tj} \in \{0, s\}$, $\forall t \in V \setminus \bar{V}$.

Consider a facility $i \ne i^*$. Let $f(i)$ be the father node (client) of $i$ in the tree and $c(i)$ be the set of children (clients) of $i$. We can obtain the value of $x_{i, f(i)}$ as long as all values of $x_{ij}$, $j \in c(i)$, are known, since $i$ must be fully used by Lemma 3.
That is, $x_{i, f(i)} = s - \sum_{j \in c(i)} x_{ij}$. Note that if $i = i^*$, we can stop since $i^*$ has no father. Moreover, the computing time is $O(m)$ since each edge is examined only once.

Consider an optimal integer vertex $(x, y)$ of the convex hull of feasible solutions for CKFLU whose corresponding graph $G = (V, E)$ is a forest. The graph $H = (\bar{V}, \bar{E})$ (the untight subgraph of $G$) can be viewed as a subgraph of some spanning tree of the complete bipartite graph $K_{\bar{F}, D}$, where $\bar{F} = F \cap \bar{V}$. Consequently, checking all the possible structures of $H$ means checking all the subgraphs of these spanning trees. Note that $H$ and $K_{\bar{F}, D}$ have the same vertices. Then, it now suffices to answer the following questions:
1. how many different complete bipartite graphs do we have for $K_{\bar{F}, D}$?
2. how to list all the spanning trees for a complete bipartite graph?
3. how many subgraphs, that have the same vertices as the considered spanning tree, does a spanning tree have?
4. for a fixed structure of $H$, how to compute the corresponding total cost?
If all the above questions can be answered in polynomial time, we can get all the possibilities of $H$ in polynomial time. Consequently, Theorem 2 can be proved by Lemmas 2 and 4.

Proof of Theorem 2.
Because $H = (\bar{V}, \bar{E})$ contains at most $m$ facilities by Lemma 3, the number of all the possible cases for $K_{\bar{F}, D}$ can be bounded by $\sum_{t=1}^{m} \binom{n}{t} \le m \cdot \binom{n}{m}$. So, we can answer question 1.

Lemmas 5 and 6 answer question 2. The time to list all the spanning trees for the complete bipartite graph is $O(m^{2m-2} + 2m + m^2)$, since we have at most $m$ facilities and $m$ clients in $K_{\bar{F}, D}$ by Lemma 3. Note that at this stage, we do not need to consider the weight $x_{ij}$ of edge $\{i, j\}$.

By Lemma 3, we know that the number of edges of each spanning tree is at most $2m - 1$, so each spanning tree has at most $2^{2m-1}$ subgraphs that have the same vertices as the spanning tree. This answers question 3.

Then, the total time to list all the possible untight subgraphs is $O(m \cdot \binom{n}{m} \cdot (m^{2m-2} + 2m + m^2) \cdot 2^{2m-1})$.

By Lemma 4, we can get the cost for any untight subgraph in polynomial time $O(m)$ as long as $I = \{ i \in F \cap \bar{V} \mid 0 < \sum_{j \in D} x_{ij} < s \}$ is fixed. Note that the opening costs for facilities are easy to get if we know the structure of $H$. Indeed, they are $\sum_{i \in F \cap \bar{V}} f_i$. The remaining part $(V, E \setminus \bar{E})$ can be considered as an optimal integer solution to a divisible CKFLU, which means we can get the total cost in polynomial time $O(n^3) + O(m)$ by Lemma 2. This answers question 4. Moreover, the number of all the choices for $I$ is bounded by $2^m$ since there are at most $m$ facilities in each spanning tree by Lemma 3.

Combining all the pieces together, we can get all the possibilities of solutions in computing time
\[ O\Big( m \cdot \binom{n}{m} \cdot (m^{2m-2} + 2m + m^2) \cdot 2^{2m-1} \cdot 2^m \cdot (m + n^3) \Big) = O\Big( \binom{n}{m} \cdot (m^{2m-2} + 2m + m^2) \cdot 2^{3m-1} \cdot (m + n^3) \Big), \]
that is, $O(\binom{n}{m} \cdot n^3)$ for fixed $m$. Finally, we output the solution with at most $k$ open facilities and the smallest total cost.

Lemma 5. [26]
For an undirected unweighted graph $G = (V, E)$, all spanning trees can be correctly generated in $O(N + |V| + |E|)$ time, where $N$ is the number of spanning trees.

Lemma 6. [36]
The number of spanning trees of a complete bipartite graph is $m^{n-1} n^{m-1}$, where $m$ and $n$ are respectively the cardinalities of the two disjoint vertex sets of this bipartite graph.

4 The Capacitated $k$-facility Location Problem with Non-uniform Capacities

In this section, we show how to bound the number of fractionally open facilities by a simple rank-counting argument on an extreme point solution. Then, together with an algorithm to group clients, we give a simple constant factor approximation algorithm for the hard capacitated $k$-facility location problem with non-uniform capacities (CKFL) (with uniform opening cost) with approximation ratio $7 + \epsilon$ by using at most $2k$ facilities. As a simple illustration of the techniques used, we first give a 2-approximation algorithm for the single-sink hard capacitated $k$-facility location problem (SCKFL). Note that this ratio is worse than that of the FPTAS in Section 2. Here we aim to show that this upper bound is helpful in designing approximation algorithms, and the approach is totally different from the FPTAS.

The Structure of Extreme Point Solutions to SCKFL

Definition 1.
Let $Ax \le a$, $Bx \ge b$, $Cx = c$ be a system of linear (in)equalities. For a feasible solution $z$ we define the rank at $z$ of the system to be the (row) rank of $[A_z^T \; B_z^T \; C^T]^T$, where $A_z x \le a_z$, $B_z x \ge b_z$, $Cx = c$ is the subsystem consisting of the (in)equalities that are satisfied with equality by $z$.

Note that for two subsystems, the sum of the ranks at $z$ of those two subsystems is at least the rank at $z$ of their union.

Let $P$ be the set of feasible solutions to the system SCKFL-LP consisting of (7), (9), (11) and $\sum_{i \in F} y_i = k$ (note that in this section we consider the constraint $\sum_{i \in F} y_i = k$ instead of the corresponding inequality (10)). That is, $P := \{(x, y) : \text{SCKFL-LP}\}$, where SCKFL-LP is the system of constraints given below:

$$\sum_{i \in F} x_i = d, \qquad \sum_{i \in F} y_i = k, \qquad (15)$$
$$0 \le x_i \le s_i y_i, \ \forall i \in F, \qquad 0 \le y_i \le 1, \ \forall i \in F.$$

Lemma 7.
Let $(x, y)$ be a vertex of $P$. Then either $y$ is integer, or $y$ has exactly two noninteger components and for every $i \in F$ we have $x_i = 0$ or $x_i = s_i y_i$.

Proof. Let $F' := \{i \in F \mid 0 < y_i < 1\}$. If $|F'| = 0$ we are done. As $|F'| = 1$ is ruled out because the sum of the $y_i$ is $k$, $k \in \mathbb{Z}$, we may assume that $|F'| \ge 2$. The rank of the system SCKFL-LP at $(x, y)$ is equal to $2n$, $n = |F|$ (Theorem 5.7 in [35]). We partition the (in)equalities in this system and bound the rank at $(x, y)$ for each subsystem:
• The rank at $(x, y)$ of the subsystem $\sum_{i \in F} x_i = d$, $\sum_{i \in F} y_i = k$ is at most 2.
• For every $i \in F'$, the rank at $(x, y)$ of the subsystem $0 \le x_i$, $x_i \le s_i y_i$, $0 \le y_i$, $y_i \le 1$ is at most 1, and equality holds if and only if $x_i = 0$ or $x_i = s_i y_i$.
• For every $i \in F \setminus F'$, the rank at $(x, y)$ of the subsystem $0 \le x_i$, $x_i \le s_i y_i$, $0 \le y_i$, $y_i \le 1$ is at most 2, and equality holds if and only if $x_i = 0$ or $x_i = s_i y_i$.
Since the rank is subadditive, we find that the rank at $(x, y)$ of SCKFL-LP is at most
$$2 + |F'| + 2|F \setminus F'| = 2n + 2 - |F'| \le 2n,$$
where the inequality holds as $|F'| \ge 2$, with equality only if $|F'| = 2$ and for each $i$ we have $x_i = 0$ or $x_i = s_i y_i$. Since the rank at $(x, y)$ equals $2n$, equality must hold, which proves the lemma.

We give a 2-approximation algorithm for SCKFL to show the potential power of this nice structure.

We give an alternative approach to obtain an approximate solution for SCKFL, compared to the FPTAS in Section 2. This approach can be viewed as an incomplete implementation of a branch-and-bound technique, branching on the 0-1 variables $y_i$. To obtain a 2-approximation algorithm that runs in polynomial time, we use two key ideas. First, by Lemma 7, we know that in any vertex of the feasible region of the LP-relaxation either 0 or 2 components of $y$ are fractional. We exploit this to guide the branching.
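As a minimal sketch of how the structure of Lemma 7 can be used, the following Python fragment locates the two fractional components of $y$ and shifts their flow onto the facility with the larger capacity, which is opened fully while the other one is closed; this mirrors Steps 2 and 3 of Algorithm 2. The function names and the toy instance are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction


def fractional_indices(y):
    """Return the indices i with 0 < y[i] < 1."""
    return [i for i, yi in enumerate(y) if 0 < yi < 1]


def merge_fractional_pair(x, y, s):
    """Round a vertex (x, y) that has exactly two fractional y-components.

    All flow of the fractional pair is moved to the facility with the
    larger capacity, which is opened fully; the other facility is closed.
    """
    frac = fractional_indices(y)
    if len(frac) != 2:
        raise ValueError("expected exactly two fractional components")
    # i1 is the fractional facility with the larger capacity.
    i1, i2 = sorted(frac, key=lambda i: s[i], reverse=True)
    x1, y1 = list(x), list(y)
    x1[i1] = x[i1] + x[i2]
    x1[i2] = 0
    y1[i1] = 1
    y1[i2] = 0
    return x1, y1


# Hypothetical instance with d = 15 and k = 2; here x_i = s_i * y_i for all i.
s = [Fraction(10), Fraction(6), Fraction(4)]
y = [Fraction(1), Fraction(1, 2), Fraction(1, 2)]
x = [Fraction(10), Fraction(3), Fraction(2)]

x1, y1 = merge_fractional_pair(x, y, s)
```

On this toy instance the merged solution still ships 15 units, opens exactly two facilities, and respects the capacity of the facility that absorbs the flow, because that facility has the larger capacity of the pair.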
Secondly, we show that for a branch $y_{i_1} = 1$ either there is no 2-approximate solution, or we can find a 2-approximate solution in polynomial time by again exploiting the structure of the vertices of the feasible region of the LP-relaxation. A precise description of this algorithm is given in Algorithm 2.

Algorithm 2
A 2-approximation algorithm for the single-sink hard capacitated $k$-facility location problem
Input
Finite set $F$ of facilities, costs $c \in \mathbb{Z}^F_{\ge 0}$, costs $f \in \mathbb{Z}^F_{\ge 0}$, capacities $s \in \mathbb{Z}^F_{\ge 0}$, demand $d \in \mathbb{Z}_{\ge 0}$, integer $k \in \mathbb{Z}_{\ge 0}$.
Output
A feasible solution $(x, y)$ to MIP: (8), (11), (12), and (15), that is within a factor 2 of optimum, if a feasible solution exists.
Description
1. Find an optimal vertex $(x, y)$ of the feasible region of the LP-relaxation. If no solution exists then stop. If $y$ is integer then return $(x, y)$ and stop.
2. Let $i_1 \ne i_2$ in $F$ with $y_{i_1}, y_{i_2} \in (0, 1)$ and $s_{i_1} \ge s_{i_2}$.
3. Define $x^1$ by $x^1_{i_1} := x_{i_1} + x_{i_2}$, $x^1_{i_2} := 0$ and $x^1_i := x_i$ for $i \ne i_1, i_2$. Define $y^1$ by $y^1_{i_1} := 1$, $y^1_{i_2} := 0$, $y^1_i := y_i$ for $i \ne i_1, i_2$.
4. Recursively compute a 2-approximate solution $(x^2, y^2)$ for the restriction to $F \setminus \{i_1\}$ and extend it by setting $x^2_{i_1} := 0$ and $y^2_{i_1} := 0$.
5. Set $F_0 := \emptyset$. While $|F_0| \le |F| - k$ do:
   a. Find an optimal vertex $(x', y')$ of the feasible region of the LP-relaxation intersected with $\{(x, y) \mid y_{i_1} = 1, \ y_i = 0 \ \forall i \in F_0\}$.
   b. If $y'$ is integer, return the best solution among $(x', y')$, $(x^1, y^1)$ and $(x^2, y^2)$ and stop.
   c. If $x'_{i_1} = s_{i_1}$, return the best solution among $(x^1, y^1)$ and $(x^2, y^2)$ and stop.
   d. Let $i_3 \ne i_4$ in $F$ with $y'_{i_3}, y'_{i_4} \in (0, 1)$ and $f_{i_3} \le f_{i_4}$.
   e. Define $y''$ by $y''_{i_1} := 0$, $y''_{i_3} := y''_{i_4} := 1$ and $y''_i := y'_i$ for $i \ne i_1, i_3, i_4$. If $(x', y'')$ has smaller value than $(x^1, y^1)$, set $(x^1, y^1) \leftarrow (x', y'')$.
   f. Set $F_0 \leftarrow F_0 \cup \{i_4\}$.

Theorem 3.
For the single-sink hard capacitated $k$-facility location problem, Algorithm 2 finds a solution that is within a factor 2 of optimum, or it concludes correctly that there is no feasible solution. The running time is polynomially bounded in the number $|F|$ of facilities.

Proof. Notice that an optimal vertex of the feasible region of the LP-relaxation can be found in polynomial time (see for instance [18]). Furthermore, since the number of recursive calls is no more than $|F| - 1$, the polynomial running time is evident. It now suffices to show that when the MIP: (8), (11), (12), and (15) is feasible, the solution given by Algorithm 2 is within a factor two of optimum.

Clearly, if $y$ is integer in Step 1 of Algorithm 2, then the output $(x, y)$ is an optimal feasible solution. Hence, by Lemma 7, we may assume that $y$ has exactly two fractional components $y_{i_1}$ and $y_{i_2}$. Then, we know $y_{i_1} + y_{i_2} = 1$ since $\sum_{i \in F} y_i = k$, and all $y_i$, $i \in F$, are integer except $y_{i_1}$ and $y_{i_2}$. Without loss of generality we can assume that $s_{i_1} \ge s_{i_2}$.

To see that $(x^1, y^1)$ defined in Step 3 of Algorithm 2 is indeed a feasible solution, it suffices to show that $x^1_{i_1} \le s_{i_1}$. This follows directly from the fact that $s_{i_1} \ge s_{i_2}$, since
$$x^1_{i_1} = x_{i_1} + x_{i_2} \le y_{i_1} s_{i_1} + y_{i_2} s_{i_2} \le y_{i_1} s_{i_1} + y_{i_2} s_{i_1} = s_{i_1}.$$
Further, we find an upper bound for the value of $(x^1, y^1)$,
$$c^T x^1 + f^T y^1 \le (c^T x + f^T y) + (c_{i_1} s_{i_1} + f_{i_1}), \qquad (16)$$
which is at most the optimum plus $c_{i_1} s_{i_1} + f_{i_1}$.

To conclude the proof, we analyse Step 5 of Algorithm 2. Observe that the initial solution $(x^1, y^1)$ may be replaced, but only by a better solution. Also observe that the solution that is returned is always at least as good as $(x^1, y^1)$ and $(x^2, y^2)$. Hence, we may assume that $(x^1, y^1)$ (at the end of the algorithm) and $(x^2, y^2)$ are not 2-approximations. Let $(x^*, y^*)$ be an optimal solution. We have $y^*_{i_1} = 1$, since otherwise $(x^2, y^2)$ would be a 2-approximation already at Step 4. It suffices to show that $(x^*, y^*)$ remains feasible throughout the iterations of Step 5, until a solution of the same value is returned in Step 5b.
For this, we observe that while $(x^*, y^*)$ is feasible, the situation $x'_{i_1} = s_{i_1}$ as in Step 5c cannot occur, because otherwise, by (16), we would have
$$c^T x^1 + f^T y^1 \le c^T x + f^T y + (c_{i_1} s_{i_1} + f_{i_1}) \le c^T x + f^T y + c^T x' + f^T y' \le 2(c^T x^* + f^T y^*),$$
contradicting the fact that $(x^1, y^1)$ is not a 2-approximation.

In Step 5d, the fact that $y'$ has exactly two fractional components follows from Lemma 7, as $y'$ is a vertex of a face of the feasible region of SCKFL-LP, and hence of that region itself. Observe that this implies that $y'_{i_3} + y'_{i_4} = 1$, hence $(x', y'')$ defined in Step 5e is a feasible solution.

In Step 5f, we have $y^*_{i_4} = 0$. Indeed, for the cost of $(x', y'')$ we find:
$$c^T x' + f^T y'' = (c^T x' + f^T y') - f_{i_1} + (1 - y'_{i_3}) f_{i_3} + (1 - y'_{i_4}) f_{i_4} \le (c^T x' + f^T y') + f_{i_4} \le (c^T x^* + f^T y^*) + f_{i_4}.$$
Since $(x^1, y^1)$, and hence $(x', y'')$, is not a 2-approximation, we find that $f_{i_4} > c^T x^* + f^T y^*$ and hence $y^*_{i_4} = 0$. This shows that $(x^*, y^*)$ remains feasible after adding $i_4$ to $F_0$.

In this section, we consider the capacitated $k$-facility location problem with uniform opening costs, i.e., $f_i = f$, $i \in F$. Since we have an upper bound on the number of fractionally open facilities based on Lemma 8 below, a natural idea is to design a constant-factor approximation algorithm for CKFL by relaxing the cardinality constraint by a constant factor.
We give a simple algorithmic framework that can extend any $\alpha$-approximation algorithm for the (uncapacitated) $k$-median problem (UKM) to a $(1 + 2\alpha)$-approximation algorithm for CKFL using at most $2k$ facilities ($2k - 1$ facilities in the case of uniform capacities).

The (uncapacitated) $k$-median problem (UKM) can be formulated as follows, where variable $x_{ij}$ indicates the fraction of the demand of client $j$ that is served by facility $i$, and $y_i$ indicates whether facility $i$ is open.

$$\min \sum_{i \in F} \sum_{j \in D} d_j c_{ij} x_{ij}$$
subject to:
$$\sum_{i \in F} x_{ij} = 1, \quad \forall j \in D,$$
$$x_{ij} \le y_i, \quad \forall i \in F, \ \forall j \in D,$$
$$\sum_{i \in F} y_i \le k,$$
$$x_{ij}, y_i \in \{0, 1\}, \quad \forall i \in F, \ \forall j \in D.$$

The Structure of Extreme Point Solutions to CKFL
Let $Q$ be the set of feasible solutions $(x, y)$ to the system CKFL-LP consisting of (2), (3), (4), (5) and (7). That is, $Q := \{(x, y) : \text{CKFL-LP}\}$, where CKFL-LP is the system of constraints given below:

$$\sum_{i \in F} x_{ij} = d_j, \quad \forall j \in D; \qquad \sum_{i \in F} y_i \le k,$$
$$\sum_{j \in D} x_{ij} \le s_i y_i, \quad \forall i \in F,$$
$$x_{ij} \ge 0, \quad \forall i \in F, \ \forall j \in D, \qquad 0 \le y_i \le 1, \quad \forall i \in F.$$

Lemma 8.
Let $(x, y)$ be a vertex of $Q$. Then $y$ has at most $m + 1$ noninteger components, where $m = |D|$.

Proof. The proof is similar to the proof of Lemma 7. Let $F' = \{i \in F \mid 0 < y_i < 1\}$. The rank of the system CKFL-LP at $(x, y)$ is equal to $(m + 1)n$, $n = |F|$, $m = |D|$. We partition the (in)equalities in this system and bound the rank at $(x, y)$ for each subsystem:
• The rank at $(x, y)$ of the subsystem $\sum_{i \in F} x_{ij} = d_j$, $\forall j \in D$; $\sum_{i \in F} y_i \le k$ is at most $m + 1$.
• For every $i \in F'$, the rank at $(x, y)$ of the subsystem $\sum_{j \in D} x_{ij} \le s_i y_i$; $0 \le x_{ij}$, $j \in D$; $0 \le y_i$; $y_i \le 1$ is at most $m$, and equality holds if and only if $x_{ij} = 0$ or $x_{ij} = s_i y_i$ for each $x_{ij}$.
• For every $i \in F \setminus F'$, the rank at $(x, y)$ of the subsystem $\sum_{j \in D} x_{ij} \le s_i y_i$; $0 \le x_{ij}$, $j \in D$; $0 \le y_i$; $y_i \le 1$ is at most $m + 1$, and equality holds if and only if $x_{ij} = 0$ or $x_{ij} = s_i y_i$ for each $x_{ij}$.
Since the rank is subadditive, we find that the rank of CKFL-LP is at most
$$m + 1 + m|F'| + (m + 1)|F \setminus F'| = m + 1 + (m + 1)n - |F'|.$$
So, we have $|F'| \le m + 1$, as $m + 1 + (m + 1)n - |F'| \ge (m + 1)n$.

For the uniform capacities case ($s_i = s > 0$, $\forall i \in F$), we will show the stronger property that there is an optimal solution $(x, y)$ to the LP-relaxation with at most $m$ noninteger components in $y$. Indeed, consider an optimal solution $(x, y)$ with $|\{i \mid 0 < y_i < 1\}|$ minimal. Suppose for contradiction that $y$ has more than $m$ fractional components. Then there exist a client $j$ and two facilities $i_1, i_2$ such that $y_{i_1}$ and $y_{i_2}$ are fractional and $x_{i_1 j}, x_{i_2 j} > 0$, and $x_{i_1 j} = s y_{i_1}$, $x_{i_2 j} = s y_{i_2}$ by Lemma 8. Without loss of generality assume that $c_{i_1 j} \le c_{i_2 j}$. Let $\epsilon := \min\{s y_{i_2}, s(1 - y_{i_1})\}$. Now modify $(x, y)$ by setting
$$x_{i_1 j} := x_{i_1 j} + \epsilon, \qquad y_{i_1} := y_{i_1} + \epsilon / s,$$
$$x_{i_2 j} := x_{i_2 j} - \epsilon, \qquad y_{i_2} := y_{i_2} - \epsilon / s,$$
to obtain a new optimal solution, while $|\{i \mid 0 < y_i < 1\}|$ decreases, a contradiction. Thus, we can find an optimal solution $(x, y)$ to the LP-relaxation for which $y$ has at most $m$ noninteger components.

The Algorithm
We convert our original instance to a new instance with at most k clients while incurringsome bounded extra costs. Then, at most 2 k facilities are (fractionally or fully) opened forthe new instance according to Lemma 8. Theorem 4.
By Algorithm 3, each $\alpha$-approximation algorithm for UKM can be extended to get a $(1 + 2\alpha)$-approximation algorithm for CKFL with uniform opening costs using at most $2k$ facilities.

Proof. Without loss of generality, suppose exactly $k$ facilities are opened in an optimal solution to our original problem (as we check all the cases in our algorithm).

Let $OPT(\ast)$ denote the optimal cost of the instance $\ast$. We consider the following instances.
• $I$: the original instance.
• $I_1$: the instance constructed in Step 1, which is an (uncapacitated) $k$-median problem.
• $I_2$: the instance constructed in Step 2, in which we have at most $k$ clients.

Let $COST(\cdot, \cdot)$ be the total cost of the obtained solution $(\cdot, \cdot)$. We consider the following solutions.
• $(x', y')$: the integral solution obtained by the $\alpha$-approximation algorithm for instance $I_1$.
• $(x, y)$: an optimal fractional solution of instance $I_2$.
• $(x^*, y^*)$: an integral solution of instance $I$ using at most $2k$ facilities.

Clearly, we have $COST(x, y) \le OPT(I_2)$ and $COST(x', y') \le \alpha \, OPT(I_1)$. By the process used to construct instance $I_2$, we have $OPT(I) + COST(x', y') \ge OPT(I_2)$. Moreover, we know that $OPT(I) \ge OPT(I_1) + kf$.

We will prove
$$COST(x^*, y^*) \le COST(x', y') + COST(x, y) + kf.$$
We first show that we can obtain an integer solution for $I_2$ with total cost at most $COST(x, y) + kf$ in Step 4 of Algorithm 3. We have $|\{i \mid 0 < y_i < 1\}| \le k + 1$, since Lemma 8 still holds when $\sum_{i \in F} y_i = k$. Moreover, if $|\{i \mid 0 < y_i < 1\}| > 0$, then $|\{i \mid y_i = 1\}| \le k - 1$, so we open at most $2k$ facilities. Thus, the total cost of the obtained solution for $I_2$ is at most $COST(x, y) + kf$.

Then, based on the above solution for $I_2$ we can construct an integer solution for $I$ by moving the demand of $t_r$, which is located at the same position as facility $r$, back to all

Algorithm 3 A $(1 + 2\alpha)$-approximation algorithm for CKFL with uniform opening costs using at most $2k$ facilities.
Input
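Finite set $F$ of facilities, $D$ of clients, costs $c \in \mathbb{Q}^{F \times D}_{\ge 0}$, opening cost $f \in \mathbb{Q}_{\ge 0}$, capacities $s \in \mathbb{Q}^F_{\ge 0}$, demands $d \in \mathbb{Q}^D_{\ge 0}$, integer $k \in \mathbb{Z}_{\ge 0}$.
Output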
A solution $(x, y)$ to MIP (1)-(6) using at most $2k$ facilities that is within a factor $1 + 2\alpha$ of optimum, if a feasible solution exists.
Description
Suppose exactly $l$ facilities are opened in an optimal solution. That is, we can consider the stronger constraint $\sum_{i \in F} y_i \le l$.

Step 1. Reduce the input instance $I$ of CKFL to an instance $I_1$ of UKM as follows. Let $F$ and $D$ be the sets of facilities and clients of our input instance $I$, respectively. Let $F' = F$ (located at the same sites) be the set of facilities of UKM, with infinite capacities and without opening costs. Let $D' = D$ be the set of clients of UKM. Solve this constructed instance (denoted by $I_1$) by the existing $\alpha$-approximation algorithm for UKM. Suppose we get an integer solution $(x', y')$. Note that for UKM, there is an optimal solution in the form of so-called stars. That is, each client is served by exactly one open facility. Without loss of generality, suppose $y'_1 = \cdots = y'_l = 1$. Then, we can consider $(x', y')$ as $l$ stars $\{T_1, \cdots, T_l\}$, where $T_r = \{j \in D' \mid x'_{rj} = 1\}$ and the center of $T_r$ is the facility $r$.

Step 2. Consolidate clients and construct a new instance $I_2$ of CKFL with at most $l$ clients as follows. For each star $T_r$ in $(x', y')$, we set a client $t_r$ at the location of facility $r$ with the total demand of the clients in $T_r$, i.e., $d_{t_r} = \sum_{j \in T_r} d_j$. Let $\bar{D} = \{t_1, \cdots, t_l\}$ be the set of our new clients. Now we get a new instance of CKFL, denoted by $I_2$, with facilities $F$ and clients $\bar{D}$.

Step 3. Find an optimal vertex $(x, y)$ of the feasible region of the LP-relaxation of the instance $I_2$ constructed in Step 2, with $\sum_{i \in F} y_i = l$.

Step 4. We simply open all the facilities with $y_i > 0$ in instance $I$ and then solve a transportation problem to get an integer solution $(x^*, y^*)$.

Since we do not know in advance how many facilities are opened in an optimal solution, we repeat the above 4 steps for $l := 1, \cdots, k$. Then, output the solution with smallest total cost.
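The consolidation in Step 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the UKM solution is represented as a dictionary mapping each client to the center of its star, and the names `consolidate_stars`, `moving_cost` and the toy data are hypothetical, not taken from the paper.

```python
from collections import defaultdict


def consolidate_stars(assignment, demand):
    """Merge every star T_r into a single client t_r located at facility r.

    assignment maps client j to its star center r (x'_{rj} = 1),
    demand maps client j to d_j. Returns {r: total demand of T_r}.
    """
    merged = defaultdict(int)
    for j, r in assignment.items():
        merged[r] += demand[j]
    return dict(merged)


def moving_cost(assignment, demand, dist):
    """Cost of shipping each demand between its client and its star center,
    i.e. the sum over clients j of d_j * c_{r(j), j}."""
    return sum(demand[j] * dist[(r, j)] for j, r in assignment.items())


# Hypothetical toy instance: facilities 0 and 2 are the star centers.
assignment = {"a": 0, "b": 0, "c": 2}
demand = {"a": 3, "b": 2, "c": 5}
dist = {(0, "a"): 1, (0, "b"): 2, (2, "c"): 4}

new_demand = consolidate_stars(assignment, demand)
extra = moving_cost(assignment, demand, dist)
```

Here `moving_cost` evaluates the service cost of the star solution, which is the quantity that bounds the extra cost of moving the consolidated demands back to the original clients.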
clients in $T_r = \{j \in D' \mid x'_{rj} = 1\}$, which increases the cost by at most $COST(x', y')$, as $COST(x', y') = \sum_{r=1}^{k} \sum_{j \in D'} d_j c_{rj} x'_{rj}$. Therefore, the solution obtained by Step 4 has $COST(x^*, y^*) \le COST(x', y') + COST(x, y) + kf$.

Then, we have
$$COST(x^*, y^*) \le COST(x', y') + OPT(I_2) + kf$$
$$= (OPT(I_2) - COST(x', y')) + 2\,COST(x', y') + kf$$
$$\le OPT(I) + 2\,COST(x', y') + kf$$
$$\le OPT(I) + 2\alpha\,OPT(I_1) + kf$$
$$\le OPT(I) + 2\alpha\,(OPT(I_1) + kf)$$
$$\le (1 + 2\alpha)\,OPT(I).$$
That is, the approximation ratio is $1 + 2\alpha$.

We can obtain the following result, as there is a $(3 + \epsilon)$-approximation algorithm for the (uncapacitated) $k$-median problem in [3], and we can make sure that at most $2k - 1$ facilities are opened when the capacities are uniform.

Corollary 1.
Algorithm 3 can get an integer solution within $7 + \epsilon$ times the optimal cost by using at most $2k$ facilities ($2k - 1$ facilities) for the hard capacitated $k$-facility location problem with uniform opening costs (with uniform opening costs and uniform capacities).

We show how to combine the algorithm in Section 4.2 with the algorithm of Charikar and Guha [10] to improve the approximation ratio for the capacitated facility location problem (CFL) with uniform opening cost. In this section, we only consider uniform opening cost. To simplify the description, we sometimes omit "with uniform opening cost" when we refer to the problems.

A $(\beta, \delta)$-approximation algorithm for the (uncapacitated) $k$-median problem (UKM) outputs an integer solution using at most $\delta k$ facilities, with service cost at most $\beta$ times the optimal total cost.

Theorem 5.
Each $(\beta, \delta)$-approximation algorithm for the $k$-median problem gives rise to a $\max\{2\beta + 1, \delta + 1\}$-approximation algorithm for the CFL with uniform opening costs.

Proof. A crucial observation is that if exactly $k$ facilities are opened in the optimal solution for an instance $I$ of CFL, then this solution is also an optimal solution to the corresponding instance of CKFL, where the input is the same as that in $I$ but with the extra constraint that at most $k$ facilities can be opened. Thus, if for each $k = 1, \cdots, n$, $n = |F|$, we can obtain the optimal solution for CKFL, then the solution with smallest total cost is the optimal solution for CFL.

Our algorithm for CFL is as follows: repeat the 4 steps in Algorithm 3 for $l := 1, \cdots, n$, $n = |F|$. In each iteration, we consider the constraint $\sum_{i \in F} y_i \le l$. Then, output the solution with smallest total cost.

To get a better approximation ratio for CFL, we replace the $\alpha$-approximation algorithm for the $k$-median problem in Step 1 of Algorithm 3 by a $(\beta, \delta)$-approximation algorithm. Then, for each iteration $l$, we obtain an integer solution for the instance $I_1$ with at most $\delta l$ open facilities. Thus, we have at most $\delta l$ clients in instance $I_2$.

Again, without loss of generality, suppose exactly $k$ facilities are opened in an optimal solution for the original instance $I$.

We keep all the notation from the proof of Theorem 4. Notice that $COST(x, y) \le OPT(I_2)$, $OPT(I) + COST(x', y') \ge OPT(I_2)$ and $OPT(I) \ge OPT(I_1) + kf$ still hold. Moreover, $COST(x', y') \le \beta\,OPT(I_1)$.

We will prove
$$COST(x^*, y^*) \le COST(x', y') + COST(x, y) + \delta k f.$$
The proof is similar to that in the proof of Theorem 4. We first show that we can obtain an integer solution for $I_2$ with total cost at most $COST(x, y) + \delta k f$ in Step 4. Note that $|\{i \mid 0 < y_i < 1\}| \le \delta k + 1$, since Lemma 8 still holds when $\sum_{i \in F} y_i = k$.
So, at most $(\delta + 1)k$ facilities are opened at the end, since if $|\{i \mid 0 < y_i < 1\}| > 0$, then $|\{i \mid y_i = 1\}| \le k - 1$. Thus, the total cost of the obtained solution for $I_2$ is at most $COST(x, y) + \delta k f$.

Then, by moving the demand of $t_r$ back to all clients in $T_r = \{j \in D' \mid x'_{rj} = 1\}$, we can construct an integer solution for $I$. This operation increases the cost by at most $COST(x', y')$, as $COST(x', y') = \sum_{r=1}^{k} \sum_{j \in D'} d_j c_{rj} x'_{rj}$. Therefore, the solution obtained by Step 4 has $COST(x^*, y^*) \le COST(x', y') + COST(x, y) + \delta k f$.

Then, we have
$$COST(x^*, y^*) \le COST(x', y') + OPT(I_2) + \delta k f$$
$$\le COST(x', y') + OPT(I) + COST(x', y') + \delta k f$$
$$= OPT(I) + 2\,COST(x', y') + \delta k f$$
$$\le OPT(I) + 2\beta\,OPT(I_1) + \delta k f$$
$$\le OPT(I) + 2\beta\,(OPT(I) - kf) + \delta k f$$
$$\le (1 + 2\beta)(OPT(I) - kf) + (\delta + 1) k f.$$
Recall that we assume that exactly $k$ facilities are opened in the optimal solution to $I$. So, the total service cost of the optimal solution to $I$ is $OPT(I) - kf$. Then,
$$COST(x^*, y^*) \le \max\{2\beta + 1, \delta + 1\}\,OPT(I).$$

Theorem 6. ([10]) Let
$SOL$ be any solution to the uncapacitated facility location problem (possibly fractional), with facility cost $F_{SOL}$ and service cost $C_{SOL}$. For any $\gamma > 0$, the local search heuristic proposed (together with scaling) gives a solution with facility cost at most $(1 + 2/\gamma) F_{SOL}$ and service cost at most $(1 + \gamma) C_{SOL}$. The approximation is up to multiplicative factors of $(1 + \epsilon)$ for arbitrarily small $\epsilon > 0$.

Based on Theorem 6, we can obtain the following corollary.
Corollary 2.
For any $\gamma > 0$, there exists a $((1 + \epsilon)(1 + \gamma), (1 + \epsilon)(1 + 2/\gamma))$-approximation algorithm for the $k$-median problem, where $\epsilon > 0$ can be arbitrarily small.

Proof. Let $(x', y')$ be an optimal solution with total cost $T$ to the LP relaxation of UKM. A crucial observation is that $(x', y')$ is also a feasible fractional solution to UFL with uniform opening cost $f > 0$. Let $SOL = (x', y')$ with total facility cost $F_{SOL}$ and total service cost $C_{SOL}$. Note that $C_{SOL} = T$ and $F_{SOL} \le kf$.

Now, it is easy to see that, based on the Charikar and Guha algorithm [10], we can get an integer solution with at most $(1 + \gamma)(1 + \epsilon)$ times the optimal cost while using at most $(1 + 2/\gamma)(1 + \epsilon) k$ facilities for the $k$-median problem. That is, there exists a $((1 + \epsilon)(1 + \gamma), (1 + \epsilon)(1 + 2/\gamma))$-approximation algorithm for the $k$-median problem, where $\epsilon > 0$ can be arbitrarily small.

Combining Theorem 5 with Corollary 2 and taking $\gamma = 0.7808$, which balances the two terms $2\beta + 1$ and $\delta + 1$, we obtain the following result.

Theorem 7.
There is a $(4.562 + \epsilon)$-approximation algorithm for the capacitated facility location problem with uniform opening costs, where $\epsilon > 0$ can be arbitrarily small.

References

[1] A. Aggarwal, A. Louis, M. Bansal, N. Garg, N. Gupta, S. Gupta, and S. Jain. A 3-approximation algorithm for the facility location problem with uniform capacities. Mathematical Programming, 141(1-2):527–547, 2013.
[2] A. Archer, R. Rajagopalan, and D. B. Shmoys. Lagrangian relaxation for the $k$-median problem: New insights and continuity properties. In G. D. Battista and U. Zwick, editors, Algorithms - ESA 2003, 11th Annual European Symposium, Budapest, Hungary, September, 2003, Proceedings, volume 2832 of
LNCS, pages 31–42. Springer, 2003.
[3] V. Arya, N. Garg, R. Khandekar, A. Meyerson, K. Munagala, and V. Pandit. Local search heuristic for $k$-median and facility location problems. In Proceedings of 33th Annual ACM Symposium on Theory of Computing, pages 11–20, Heraklion, Crete, Greece, 2001. ACM, New York, NY, USA.
[4] M. Bansal, N. Garg, and N. Gupta. A 5-approximation for capacitated facility location. In L. Epstein and P. Ferragina, editors, Algorithms - ESA 2012 - 20th Annual European Symposium, Ljubljana, Slovenia, September, 2012. Proceedings, volume 7501 of Lecture Notes in Computer Science, pages 133–144. Springer, 2012.
[5] Y. Bartal, M. Charikar, and D. Raz. Approximating min-sum $k$-clustering in metric spaces. In Proceedings of 33th Annual ACM Symposium on Theory of Computing, pages 11–20, Heraklion, Crete, Greece, 2001. ACM, New York, NY, USA.
[6] P. S. Bradley, U. M. Fayyad, and O. L. Mangasarian. Mathematical programming for data mining: Formulations and challenges. INFORMS Journal on Computing, 11(3):217–238, 1999.
[7] J. Byrka, K. Fleszar, B. Rybicki, and J. Spoerhase. A constant-factor approximation algorithm for uniform hard capacitated $k$-median. CoRR, abs/1312.6550, 2013.
[8] J. Byrka, T. Pensyl, B. Rybicki, A. Srinivasan, and K. Trinh. An improved approximation for $k$-median, and positive correlation in budgeted optimization. CoRR, abs/1406.2951, 2014.
[9] A. Caprara, H. Kellerer, U. Pferschy, and D. Pisinger. Approximation algorithms for knapsack problems with cardinality constraints. European Journal of Operational Research, 123(2):333–345, 2000.
[10] M. Charikar and S. Guha. Improved combinatorial algorithms for the facility location and $k$-median problems. In Proceedings of 40th Annual Symposium on Foundations of Computer Science, pages 378–388, New York, NY, USA, 1999. IEEE Computer Society.
[11] M. Charikar, S. Guha, É. Tardos, and D. B. Shmoys. A constant-factor approximation algorithm for the $k$-median problem (extended abstract). In Proceedings of the 31th Annual ACM Symposium on Theory of Computing, pages 1–10, Atlanta, Georgia, USA, 1999. ACM, New York, NY, USA.
[12] F. A. Chudak and D. P. Williamson. Improved approximation algorithms for capacitated facility location problems. Mathematical Programming, 102(2):207–222, 2005.
[13] J. Chuzhoy and Y. Rabani. Approximating $k$-median with non-uniform capacities. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 952–958, Vancouver, British Columbia, Canada, 2005. SIAM, Philadelphia, PA, USA.
[14] G. Cornuéjols, G. L. Nemhauser, and L. A. Wolsey. Discrete Location Theory, chapter The uncapacitated facility location problem, pages 119–171. Wiley, New York, 1990.
[15] N. J. Edwards. Approximation algorithms for the multi-level facility location problem. PhD thesis, Cornell University, 2001.
[16] H. N. Gabow. An efficient implementation of Edmonds' algorithm for maximum matching on graphs. Journal of the ACM, 23(2):221–234, 1976.
[17] S. Görtz and A. Klose. Analysis of some greedy algorithms for the single-sink fixed-charge transportation problem. Journal of Heuristics, 15(4):331–349, 2009.
[18] M. Grötschel, L. Lovász, and A. Schrijver. Geometric Algorithms and Combinatorial Optimization. Springer, Berlin, 1988.
[19] Y. T. Herer, M. J. Rosenblatt, and I. Hefter. Fast algorithms for single-sink fixed charge transportation problems with applications to manufacturing and transportation. Transportation Science, 30(4):276–290, 1996.
[20] A. J. Hoffman and J. B. Kruskal. Linear Inequalities and Related Systems, chapter Integral boundary points of convex polyhedra, pages 233–246. Princeton Univ. Press, Princeton, NJ, 1956.
[21] V. N. Hsu, T. J. Lowe, and A. Tamir. Structured $p$-facility location problems on the line solvable in polynomial time. Operations Research Letters, 21(4):159–164, 1997.
[22] A. K. Jain and R. C. Dubes. Algorithms for clustering data. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1988.
[23] K. Jain, M. Mahdian, E. Markakis, A. Saberi, and V. V. Vazirani. Greedy facility location algorithms analyzed using dual fitting with factor-revealing LP. Journal of the ACM, 50(6):795–824, 2003.
[24] K. Jain, M. Mahdian, and A. Saberi. A new greedy approach for facility location problems. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing, pages 731–740, Montréal, Québec, Canada, 2002. ACM, New York, NY, USA.
[25] K. Jain and V. V. Vazirani. Approximation algorithms for metric facility location and $k$-median problems using the primal-dual schema and Lagrangian relaxation. Journal of the ACM, 48(2):274–296, 2001.
[26] S. Kapoor and H. Ramesh. Algorithms for enumerating all spanning trees of undirected and weighted graphs. SIAM Journal on Computing, 24(2):247–265, 1995.
[27] M. R. Korupolu, C. G. Plaxton, and R. Rajaraman. Analysis of a local search heuristic for facility location problems. Journal of Algorithms, 37(1):146–188, 2000.
[28] A. A. Kuehn and M. J. Hamburger. A heuristic program for locating warehouses. Management Science, 9(4):643–666, 1963.
[29] E. L. Lawler. Fast approximation algorithms for knapsack problems. Mathematical Methods of Operations Research, 4(4):339–356, 1979.
[30] R. Levi, D. B. Shmoys, and C. Swamy. LP-based approximation algorithms for capacitated facility location. Mathematical Programming, 131(1-2):365–379, 2012.
[31] S. Li and O. Svensson. Approximating $k$-median via pseudo-approximation. In Proceedings of the 2013 ACM Symposium on Theory of Computing, pages 901–910, Palo Alto, California, USA, 2013. ACM, New York, NY, USA.
[32] M. Mahdian and M. Pál. Universal facility location. In G. D. Battista and U. Zwick, editors, Algorithms - ESA 2003, 11th Annual European Symposium, Budapest, Hungary, September 2003, Proceedings, volume 2832 of Lecture Notes in Computer Science, pages 409–421. Springer, 2003.
[33] M. Mahdian, Y. Ye, and J. Zhang. A 2-approximation algorithm for the soft-capacitated facility location problem. In S. Arora, K. Jansen, J. D. P. Rolim, and A. Sahai, editors, Approximation, Randomization, and Combinatorial Optimization: Algorithms and Techniques, 6th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2003 and 7th International Workshop on Randomization and Approximation Techniques in Computer Science, RANDOM 2003, Princeton, NJ, USA, August 24-26, 2003, Proceedings, volume 2764 of Lecture Notes in Computer Science, pages 129–140. Springer, 2003.
[34] M. Pál, É. Tardos, and T. Wexler. Facility location with nonuniform hard capacities. In Proceedings of 42th Annual Symposium on Foundations of Computer Science, pages 329–338, Las Vegas, Nevada, USA, 2001. IEEE Computer Society.
[35] A. Schrijver. Combinatorial Optimization: Polyhedra and Efficiency. Springer-Verlag, Berlin, 2003.
[36] H. I. Scoins. The number of trees with nodes of alternate parity. Mathematical Proceedings of the Cambridge Philosophical Society, 58(1):12–16, 1962.
[37] J. Zhang, B. Chen, and Y. Ye. A multiexchange local search algorithm for the capacitated facility location problem.