Packing-Based Approximation Algorithm for the $k$-Set Cover Problem

Martin Fürer ⋆ and Huiwen Yu

Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA

Abstract.

We present a packing-based approximation algorithm for the $k$-Set Cover problem. We introduce a new local search-based $k$-set packing heuristic and call it Restricted $k$-Set Packing. We analyze its tight approximation ratio via a complicated combinatorial argument. Equipped with the Restricted $k$-Set Packing algorithm, our $k$-Set Cover algorithm is composed of the $k$-Set Packing heuristic [7] for $k \geq 7$, Restricted $k$-Set Packing for $k = 6, 5, 4$, and the semi-local $(2,1)$-improvement [2] for 3-Set Cover. We show that our algorithm obtains a tight approximation ratio of $H_k - 0.6402 + \Theta(\frac{1}{k})$, where $H_k$ is the $k$-th harmonic number. For small $k$, our results are 1.8667 for $k = 6$, 1.7333 for $k = 5$ and 1.5208 for $k = 4$. Our algorithm improves the currently best approximation ratio for the $k$-Set Cover problem for any $k \geq 4$.

Given a set of elements $U$ and a collection $\mathcal{S}$ of subsets of $U$, with each subset in $\mathcal{S}$ having size at most $k$ and the union of $\mathcal{S}$ being $U$, the $k$-Set Cover problem is to find a minimum-size sub-collection of $\mathcal{S}$ whose union remains $U$. Without loss of generality, we assume that $\mathcal{S}$ is closed under subsets. Then the objective of the $k$-Set Cover problem can be viewed as finding a disjoint union of sets of $\mathcal{S}$ which covers $U$.

The $k$-Set Cover problem is NP-hard for any $k \geq 3$. For $k = 2$, the 2-Set Cover problem is polynomial-time solvable by a maximum matching algorithm. The greedy approach for approximating the $k$-Set Cover problem chooses a maximal collection of $i$-sets (sets of size $i$) for each $i$ from $k$ down to 1. It achieves a tight approximation ratio $H_k$ (the $k$-th harmonic number) [9]. The hardness result by Feige [3] shows that for $n = |U|$, the Set Cover problem is not approximable within $(1 - \epsilon) \ln n$ for any $\epsilon > 0$ unless NP $\subseteq$ DTIME($n^{\log \log n}$). For the $k$-Set Cover problem, Trevisan [12] shows that no polynomial-time algorithm has an approximation ratio better than $\ln k - \Omega(\ln \ln k)$ unless subexponential-time deterministic algorithms for NP-hard problems exist.
Therefore, it is unlikely that a tremendous improvement of the approximation ratio is possible.

There is no evidence that the $\ln k - \Omega(\ln \ln k)$ lower bound can be achieved. Research on approximating the $k$-Set Cover problem has focused on improving the positive constant $c$ in the approximation ratio $H_k - c$. Small improvements on the constant might lead us closer to the optimal ratio. One of the main ideas based on greedy algorithms is to handle small sets separately. Goldschmidt et al. [4] give a heuristic using a matching computation to deal with sets of size 2 and obtain an $H_k - \frac{1}{6}$ approximation ratio. Halldórsson [5] improves $c$ to $\frac{1}{3}$ via his "t-change" and "augmenting path" techniques. Duh and Fürer [2] give a semi-local search algorithm for the 3-Set Cover problem and further improve $c$ to $\frac{1}{2}$. They also present a tight example for their semi-local search algorithm.

⋆ Research supported in part by NSF Grant CCF-0728921 and CCF-0964655.

A different idea is to replace the greedy approach by a set-packing approach. Levin [10] uses a set-packing algorithm for packing 4-sets and improves $c$ to 0.5026 for $k \geq 4$. Athanassopoulos et al. [1] substitute the greedy phases for $k \geq 6$ with packing phases and reach an approximation ratio $H_k - 0.5902$ for $k \geq 6$.

The goal of this paper is not to provide an incremental improvement in the approximation ratio for $k$-Set Cover. We rather want to obtain the best such result achievable by current methods. It might be the best possible result, as we conjecture the lower bound presented in [12] not to be optimal.

In this paper, we give a complete packing-based approximation algorithm (in short, PRPSLI) for the $k$-Set Cover problem. For $k \geq 7$, we use the $k$-set packing heuristic introduced by Hurkens and Schrijver [7], which achieves the best approximation ratio known to date, $\frac{2}{k} - \epsilon$, for the $k$-Set Packing problem for any $\epsilon > 0$. On the other hand, the best hardness result by Hazan et al. 
[6] shows that it is NP-hard to approximate the $k$-Set Packing problem within $\Omega(\frac{\ln k}{k})$.

For $k = 6, 5, 4$, we use the same packing heuristic with the restriction that any local improvement should not increase the number of 1-sets which are needed to finish the disjoint set cover. We call this new heuristic Restricted $k$-Set Packing. We prove that for any $k \geq 5$, the Restricted $k$-Set Packing algorithm achieves the same approximation ratio as the corresponding unrestricted set packing heuristic. For $k = 4$, this is not the case. The approximation ratio of the Restricted 4-Set Packing algorithm is $\frac{7}{16}$, which is worse than the $\frac{1}{2} - \epsilon$ ratio of the 4-set packing heuristic, but it is also tight. For $k = 3$, we use the semi-local optimization technique [2]. We thereby obtain the currently best approximation ratio for the $k$-Set Cover problem. Table 1 (in Appendix Section 5) includes a comparison of the approximation ratio of our algorithm with GRSLI k, [2], Levin's algorithm [10] and PRSLI k, [1]. We also show that our result is indeed tight. Thus, $k$-Set Cover algorithms which are based on packing heuristics can hardly be improved.

Our novel Restricted $k$-Set Packing algorithm is quite simple and natural, but its analysis is complicated. It is essentially based on combinatorial arguments. We use the factor-revealing linear programming analysis for the $k$-Set Cover problem. The factor-revealing linear program was introduced by Jain et al. [8] for analyzing the facility location problem. Athanassopoulos et al. [1] were the first to apply it to the $k$-Set Cover problem.

The paper is organized as follows. In Section 2, we give the description of our algorithm and present the main results. In Section 3, we prove the approximation ratio of the Restricted $k$-Set Packing algorithm. In Section 4, we analyze our $k$-Set Cover algorithm via the factor-revealing linear program.
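As a point of reference for the greedy baseline mentioned in the introduction, the following is a minimal Python sketch, not the authors' code. It assumes the input collection is implicitly closed under subsets, so a set may contribute only its still-uncovered elements; the function name `greedy_k_set_cover` is chosen here for illustration.

```python
def greedy_k_set_cover(universe, sets):
    """Greedy baseline: for i = k down to 1, take a maximal collection of i-sets.

    Closure under subsets is simulated by taking only the uncovered part of a set.
    Assumes the union of `sets` equals `universe`.
    """
    uncovered = set(universe)
    sets = [frozenset(s) for s in sets]
    k = max(len(s) for s in sets)
    cover = []
    for i in range(k, 0, -1):
        # After phase i + 1, no set has more than i uncovered elements,
        # so a single pass yields a maximal collection of disjoint i-sets.
        for s in sets:
            rest = s & uncovered
            if len(rest) == i:
                cover.append(rest)
                uncovered -= rest
    return cover
```

The returned sets are pairwise disjoint, which matches the view of $k$-Set Cover as finding a disjoint union of sets that covers $U$.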
In this section, we describe our packing-based $k$-Set Cover approximation algorithm. We first give an overview of some existing results.

Duh and Fürer [2] introduce a semi-local $(s, t)$-improvement for the 3-Set Cover problem. First, it greedily selects a maximal disjoint union of 3-sets. Then each local improvement replaces $t$ […]. The semi-local $(2,1)$-improvement algorithm gives the best performance ratio for the 3-Set Cover problem among all semi-local $(s, t)$-improvement algorithms. The ratio is proved to be tight.

Theorem 1 ([2]).

The semi-local $(2,1)$-optimization algorithm for 3-Set Cover produces a solution with performance ratio $\frac{4}{3}$. It uses a minimal number of 1-sets.

We use the semi-local $(2,1)$-improvement as the basis of our $k$-Set Cover algorithm. Other phases of the algorithm are based on the set packing heuristic [7]. For fixed $s$, the heuristic starts with an arbitrary maximal packing and replaces $p \leq s$ sets in the packing with $p + 1$ sets if the resulting collection is still a packing. Hurkens and Schrijver [7] show the following result.

Theorem 2 ([7]).

For all $\epsilon > 0$, the local search $k$-Set Packing algorithm for parameter $s = O(\log_k \frac{1}{\epsilon})$ has an approximation ratio $\frac{2}{k} - \epsilon$.

The worst-case ratio is also known to be tight. We apply this packing heuristic for $k \geq 7$. For $k = 6, 5, 4$, we follow the intuition of the semi-local improvement and modify the local search of the packing heuristic, requiring that any improvement does not increase the number of 1-sets. We use the semi-local $(2,1)$-improvement for 3-Set Cover to compute the number of 1-sets required to finish the cover. Lemma 2.2 in [2] guarantees that the number of 1-sets returned by the semi-local $(2,1)$-improvement is no more than this number in any optimal solution. We compute this number first at the beginning of the restricted phase. Each time we want to make a replacement via the packing heuristic, we compute the number of 1-sets needed to finish the cover after making the replacement. If this number increases, the replacement is prohibited.

To summarize, we call our algorithm the Restricted Packing-based $k$-Set Cover algorithm (PRPSLI) and give the pseudo-code in Algorithm 1. For input parameter $\epsilon > 0$, $s_i$ is the parameter of the local improvement in Phase $i$. For any $i \neq 5, 6$, we set $s_i$ in the same way as in Theorem 2. For $i = 5, 6$, we set s i = ⌈ iǫ ⌉ .

The algorithm clearly runs in polynomial time. The approximation ratio of PRPSLI is presented in the following main theorem. For completeness, we also state the approximation ratio for the 3-Set Cover problem, which is obtained by Duh and Fürer [2] and remains the best result. Let $\rho_k$ be the approximation ratio of the $k$-Set Cover problem.

Theorem 3 (Main).

For all $\epsilon > 0$, the Packing-based $k$-Set Cover algorithm has an approximation ratio

$\rho_k = 2H_k - H_{k/2} + \frac{2}{k} - \frac{1}{k-1} - \frac{4}{3} + \epsilon$ for even $k$ and $k \geq 6$;

$\rho_k = 2H_k - H_{(k-1)/2} - \frac{4}{3} + \epsilon$ for odd $k$ and $k \geq 7$;

$\rho_5 = 1.7333$; $\rho_4 = 1.5208$; $\rho_3 = \frac{4}{3}$.
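The closed form in Theorem 3 can be checked numerically against the PRPSLI column of Table 1. The sketch below is illustrative only: it assumes the formula reads $2H_k - H_{k/2} + \frac{2}{k} - \frac{1}{k-1} - \frac{4}{3}$ for even $k \geq 6$ and $2H_k - H_{(k-1)/2} - \frac{4}{3}$ for odd $k \geq 7$, drops the $\epsilon$ term, and uses the hypothetical helper names `harmonic` and `rho`.

```python
from fractions import Fraction


def harmonic(n):
    """Exact n-th harmonic number H_n as a Fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


def rho(k):
    """Ratio of Theorem 3 with the epsilon term dropped (general formula, k >= 6)."""
    if k < 6:
        raise ValueError("rho_3, rho_4 and rho_5 are stated separately in Theorem 3")
    if k % 2 == 0:
        return (2 * harmonic(k) - harmonic(k // 2)
                + Fraction(2, k) - Fraction(1, k - 1) - Fraction(4, 3))
    return 2 * harmonic(k) - harmonic((k - 1) // 2) - Fraction(4, 3)


# Compare with the PRPSLI column of Table 1 for a few values of k.
for k, table_value in [(6, 1.8667), (7, 2.0190), (8, 2.1262), (9, 2.2413), (10, 2.3302)]:
    assert abs(float(rho(k)) - table_value) < 1e-4
```

Exact rational arithmetic is used so that the comparison with the four-digit table entries is not affected by floating-point accumulation.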

Algorithm 1 Packing-based $k$-Set Cover Algorithm (PRPSLI)

// The $k$-Set Packing Phase
for $i \leftarrow k$ down to 7 do
    Select a maximal collection of disjoint $i$-sets.
    repeat
        Select $p \leq s_i$ $i$-sets and replace them with $p + 1$ $i$-sets.
    until there exist no more such improvements.
end for
// The Restricted $k$-Set Packing Phase
Run the semi-local $(2,1)$-improvement algorithm for 3-Set Cover on the remaining uncovered elements to obtain the number of 1-sets.
for $i \leftarrow 6$ down to 4 do
    repeat
        Try to replace $p \leq s_i$ $i$-sets with $p + 1$ $i$-sets. Commit to the replacement only if the number of 1-sets computed by the semi-local $(2,1)$-improvement algorithm for 3-Set Cover on the remaining uncovered elements does not increase.
    until there exist no more such improvements.
end for
// The Semi-Local Optimization Phase
Run the semi-local $(2,1)$-improvement algorithm on the remaining uncovered elements.

Remark 1.

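For odd $k \geq 7$, the approximation ratio $\rho_k$ is derived from the expression $\rho_k = \frac{2}{k} + \cdots + \frac{2}{5} + \frac{1}{3} + 1 + \epsilon$. We can further obtain the asymptotic representation of $\rho_k$, i.e., $\rho_k = 2H_k - H_{(k-1)/2} - \frac{4}{3} + \epsilon = H_k + \ln 2 - \frac{4}{3} + \Theta(\frac{1}{k}) + \epsilon = H_k - 0.6402 + \Theta(\frac{1}{k}) + \epsilon$. Similarly, for even $k \geq 6$, $\rho_k = \frac{2}{k} + \frac{1}{k-1} + \frac{2}{k-3} + \cdots + \frac{2}{5} + \frac{1}{3} + 1 + \epsilon = 2H_k - H_{k/2} + \frac{2}{k} - \frac{1}{k-1} - \frac{4}{3} + \epsilon = H_k + \ln 2 - \frac{4}{3} + \Theta(\frac{1}{k}) + \epsilon = H_k - 0.6402 + \Theta(\frac{1}{k}) + \epsilon$. Finally, $\rho_5 = \frac{2}{5} + \frac{1}{3} + 1$ and $\rho_4 = \frac{7}{16} + \frac{1}{12} + 1$.

Remark 2.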

Restriction on Phase 6 is only required for obtaining the approximation ratio $\rho_k$ for even $k$ and k ≤ . In other cases, only the restrictions on Phase 5 and Phase 4 are necessary.

We prove the main theorem in Section 4. Before that, we analyze the approximation ratio of the Restricted $k$-Set Packing algorithm for $k \geq 4$ in Section 3. We state the result on the approximation ratio of the Restricted $k$-Set Packing algorithm as follows.

Theorem 4 (Restricted $k$-Set Packing). There exists a Restricted 4-Set Packing algorithm which has an approximation ratio $\frac{7}{16}$. For all $\epsilon > 0$ and for any $k \geq 5$, there exists a Restricted $k$-Set Packing algorithm which has an approximation ratio $\frac{2}{k} - \epsilon$.

Remark 3. Without loss of generality, we assume that the optimal solution of the Restricted $k$-Set Packing problem also has the property that it does not increase the number of 1-sets needed to finish the cover of the remaining uncovered elements. This assumption is justified by Lemma 2 in Appendix 9.1.

The Restricted $k$-Set Packing Algorithm

We fix one optimal solution $\mathcal{O}$ of the Restricted $k$-Set Packing problem. We refer to the sets in $\mathcal{O}$ as optimal sets. For fixed $s$, a local improvement replaces $p \leq s$ $k$-sets with $p + 1$ $k$-sets. We pick a packing of $k$-sets $\mathcal{A}$ that cannot be improved by the Restricted $k$-Set Packing algorithm. We say an optimal set is an $i$-level set if exactly $i$ of its elements are covered by sets in $\mathcal{A}$. For the sake of analysis, we call a local improvement an $i$-$j$-improvement if it replaces $i$ sets in $\mathcal{A}$ with $j$ sets in $\mathcal{O}$. As a convention in the rest of the paper, small letters represent elements, capital letters represent subsets of $U$, and calligraphic letters represent collections of sets. We first introduce the notion of blocking.

The main difference between unrestricted $k$-set packing and restricted $k$-set packing is the restriction on the number of 1-sets which are needed to finish the covering via the semi-local $(2,1)$-improvement.
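The local improvement step, and the way a restriction can veto it, can be sketched as follows. This is a brute-force illustration under stated assumptions, not the paper's implementation: the function name `local_search_packing` and the `accept` callback are hypothetical, and `accept` merely stands in for the test "the number of 1-sets computed by the semi-local $(2,1)$-improvement does not increase", which is not implemented here.

```python
from itertools import combinations


def local_search_packing(sets, s, accept=None):
    """Swap p <= s packed sets for p + 1 pairwise disjoint sets until no swap applies.

    `accept(candidate_packing)` is a hypothetical hook for the restriction;
    with accept=None this is the unrestricted heuristic.
    """
    sets = [frozenset(x) for x in sets]
    packing = []  # indices into `sets`, pairwise disjoint
    improved = True
    while improved:
        improved = False
        outside = [i for i in range(len(sets)) if i not in packing]
        for p in range(min(s, len(packing)) + 1):
            for removed in combinations(packing, p):
                kept = [i for i in packing if i not in removed]
                covered = set().union(*(sets[i] for i in kept))
                free = [i for i in outside if not (sets[i] & covered)]
                for added in combinations(free, p + 1):
                    if any(sets[a] & sets[b] for a, b in combinations(added, 2)):
                        continue  # the added sets must be pairwise disjoint
                    candidate = kept + list(added)
                    if accept is None or accept([sets[i] for i in candidate]):
                        packing, improved = candidate, True
                        break
                if improved:
                    break
            if improved:
                break
    return [sets[i] for i in packing]
```

Every committed swap grows the packing by one set, so the loop terminates. The enumeration is exponential in $s$ and is meant only to make the definition concrete; the case $p = 0$ adds a single set and thereby enforces maximality.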
This restriction can prohibit a local improvement. If any $i$-$j$-improvement is prohibited because of an increase of 1-sets, we say there exists a blocking. In Example 1 given in Appendix Section 6.1, we construct an instance of 4-set packing to help explain how blocking works.

We now define blocking formally. We are given a fixed optimal $k$-set packing $\mathcal{O}$ of $U$ and a $k$-set packing $\mathcal{A}$ chosen by the Restricted $k$-Set Packing algorithm. We consider all possible extensions of $\mathcal{A}$ to a disjoint cover of $U$ by 1-sets, 2-sets and 3-sets. We order these extensions lexicographically, first by the number of 1-sets, second by the total number of 2-sets and 3-sets which are not within a $k$-set of $\mathcal{O}$, and third by the number of 3-sets which are not within a $k$-set of $\mathcal{O}$. We are interested in the lexicographically first extension. Notice that we pick this specific extension for analysis only. We cannot obtain this ordering without access to $\mathcal{O}$. We explain how we order the extensions in Example 2 (Appendix Section 6.2).

Suppose we finish the cover from the packing $\mathcal{A}$ with the lexicographically first extension. Let $F$ be an undirected graph such that each vertex in $F$ represents an optimal set. Two vertices are adjacent if and only if there is a 2-set in the extension intersecting with the corresponding optimal sets. Since the number of 2-sets and 3-sets not within an optimal set is minimized, there are no multiple edges in the graph. For brevity, when we talk about a node $V$ in $F$, we also refer to $V$ as the corresponding optimal set. Moreover, when we say the degree of a node $V$, we refer to the number of neighbors of $V$.

Proposition 1. $F$ is a forest.

Proposition 2.

For any $i < k - 1$, there is no 1-set inside an $i$-level set, i.e., a 1-set can only appear in $(k-1)$-level sets.

Proposition 3.

For any tree $T$ in $F$, there is at most one node which represents an $i$-level set such that the degree of the node is smaller than $k - i$.
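To make the objects in Propositions 1 to 3 concrete, the sketch below computes the level of each optimal set and builds the graph $F$ from the 2-sets of an extension. It is a minimal illustration with hypothetical names (`levels`, `blocking_graph`, `is_forest`) and assumes the optimal sets are pairwise disjoint and that each 2-set is given as a pair of elements lying in optimal sets.

```python
def levels(optimal_sets, packing):
    """Level of each optimal set: how many of its elements the packing covers."""
    covered = set().union(*packing)
    return [len(set(o) & covered) for o in optimal_sets]


def blocking_graph(optimal_sets, two_sets):
    """Edges of F: one edge per pair of distinct optimal sets joined by a 2-set."""
    owner = {e: i for i, o in enumerate(optimal_sets) for e in o}
    edges = set()
    for a, b in two_sets:
        u, v = owner[a], owner[b]
        if u != v:  # a 2-set inside one optimal set contributes no edge
            edges.add((min(u, v), max(u, v)))
    return edges


def is_forest(n, edges):
    """Union-find cycle check on n vertices."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return True
```

For an arbitrary extension the graph need not be acyclic; Proposition 1 asserts acyclicity only for the lexicographically first extension, which this sketch does not compute.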

For any tree, if there exists a node with the property in Proposition 3, we define it to be the root. Otherwise, we know that all degree 1 nodes represent $(k-1)$-level sets. We define an arbitrary node not representing a $(k-1)$-level set to be the root. If there are only $(k-1)$-level sets in the tree, i.e., the tree degenerates to one edge or a single point, we define an arbitrary $(k-1)$-level set to be the root. All leaves represent $(k-1)$-level sets. (The root is not considered to be a leaf.) We call such a tree a blocking tree. For any subtree, we say that the leaves block the nodes in this subtree. We also call the set represented by a leaf a blocking set.

We consider one further property of the root.

Proposition 4.

Let $k \geq 4$. In any blocking tree, there exists at most one node of either 0-level or 1-level that is of degree 2. If such a node exists, it is the root.

The proofs of Propositions 1 to 4 are given in Appendix Section 6.3. Based on these simple structures of the blocking tree, we are now ready to prove the approximation ratio of the Restricted $k$-Set Packing algorithm.

We prove in this section that the Restricted 4-Set Packing algorithm has an approximation ratio $\frac{7}{16}$. We first explain how this ratio is derived. We use the unit $U$ defined in Example 1 (Appendix Section 6.1). Assume that when the algorithm stops, we have $n \gg 1$ copies of $U$ and a relatively small number of 3-level sets. We denote the $i$-th copy of $U$ by $U_i$. For each $i$ and ≤ $j$ ≤ , the sets $O_j$ in $U_i$ and $U_{i+1}$ are adjacent. Each such chain of $O_j$'s starts from and ends at a 3-level set. Then the performance ratio of this instance is slightly larger than $\frac{7}{16}$. We first prove that the approximation ratio of the Restricted 4-Set Packing algorithm is at least $\frac{7}{16}$.

Given $F$, a collection of blocking trees, we assign 4 tokens to every element covered by sets chosen by the restricted packing algorithm. We say a set has a free token if, after distributing the token, this set retains at least 7 tokens. We show that we can always distribute the tokens among all the optimal sets $\mathcal{O}$ so that there are at least 7 tokens in each optimal set. Since every chosen 4-set carries 16 tokens, this yields $16 |\mathcal{A}| \geq 7 |\mathcal{O}|$.

Proof.

We present the ﬁrst round of redistribution.

Round 1 - Redistribution in each blocking tree $T$. Every leaf in $T$ has 4 free tokens to distribute. Every internal node $V$ of degree $d$ requests $4d - 8$ tokens from a leaf. We consider each node with nonzero request in the reverse order given by breadth-first search (BFS).
– If $d = 3$, $V$ requests 4 tokens from any leaf in the subtree rooted at $V$ which has 12 tokens.
– If $d = 4$, $V$ has three children $V_1, V_2, V_3$. $V$ sends requests of 4 tokens to any leaf in the subtrees rooted at $V_1, V_2, V_3$, one for each subtree. $V$ takes any two donations of 4 tokens.
– The root of degree $r$ receives the rest of the tokens contributed by the leaves.

Proposition 5.

After Round 1, every internal node in $T$ has at least 8 tokens, and the root of degree $r$ has $4r$ tokens.

Proposition 5 is proved in Appendix Section 7.1. According to Proposition 4, we know that after the first round of redistribution every node has at least 8 tokens, except any 0-level roots which are of degree 1 and any singletons in $F$ which are 1-level sets.

We first consider the collection $\mathcal{S}_1$ of 1-level sets which are singletons in $F$. Let $S$ be such a 1-level set that intersects with a 4-set $A$ chosen by the algorithm. Assume $A$ also intersects with $j$ other optimal sets $\{O_i\}_{i=1}^{j}$.

We point out that no $O_i$ belongs to $\mathcal{S}_1$. Otherwise suppose $O_i \in \mathcal{S}_1$. Then there is a 1-2-improvement (replace $A$ with $S$ and $O_i$).

We give the second round of redistribution, such that after this round, every set in $\mathcal{S}_1$ has at least 7 tokens. For optimal sets $O, W$, consider each token request sent to $O$ from $W$. We say it is an internal request if $W \in \mathcal{S}_1$, and $W$ and $O$ intersect with a set $A$ chosen by the algorithm. Otherwise, we say it is an external request.

Round 2 - Redistribution for $S \in \mathcal{S}_1$. $S$ sends $\tau$ requests of one token to $O_i$ if $|O_i \cap A| = \tau$. For each node $V$ in $T$, internal requests are considered prior to external requests. For each request sent to $V$ from $W$:
– If $V$ has at least 8 tokens, give one to $W$.
– If $V$ has only 7 tokens:
(1) If $V$ is a leaf, it requests from the node which has received 4 tokens from it during the first round of redistribution.
(2) If $V$ is not a leaf:
(2.1) If $W \in \mathcal{S}_1$, $V$ requests a token from a leaf which has at least 8 tokens in the subtree rooted at $V$.
(2.2) If $W$ is a node in $T$, suppose $V$ has children $V_1, \ldots, V_d$ and $W$ belongs to the subtree rooted at $V_1$. $V$ requests from a leaf which has at least 8 tokens in the subtrees rooted at $V_2, \ldots, V_d$.

We now prove the correctness of the second round of redistribution.

Proposition 6.

Every singleton node O of level j has at most j − requests. Everyleaf has at most k − internal requests. Every internal node of level j has at most j requests. The root of level s has at most s requests. Proposition 7.

Singleton node O can satisfy all the requests. Proposition 8.

The root R of level s and degree r can satisfy all the requests if s + r ≥ . Proposition 9.

There is no external request sent to a root which is of level 0 and degree 1.

Proposition 10.

Any external request sent from a leaf $L$ can be satisfied.

Proposition 11.

Any external request sent from an internal node of degree d ≥ canbe satisﬁed. Proposition 12.

Any external request sent from an internal node $V$ of degree 2 can be satisfied.

The proofs of Propositions 6 to 12 are given in Appendix Section 7.1. From Propositions 7, 8, 10, 11 and 12, we know that all requests can be satisfied. Hence after the second round of redistribution, every 1-level singleton set in $\mathcal{S}_1$ has at least 7 tokens. From Proposition 9, we know that every root of level 0 and degree 1 retains 4 tokens.

We consider a root $R$ which is a 0-level set of degree 1. $R$ receives 4 tokens from leaf $B$. Assume $B$ is covered by $\{A_i\}_{i=1}^{j} \subseteq \mathcal{A}$ and $\{A_i\}_{i=1}^{j}$ intersect with $\{O_i\}_{i=1}^{l} \subseteq \mathcal{O}$. We first prove the following.

Proposition 13. For all $O \in \{O_i\}_{i=1}^{l}$, $O$ does not receive any token request during Round 2 of redistribution.

The proof of Proposition 13 is given in Appendix Section 7.1. Based on Proposition 13, for any set $O \in \{O_i\}_{i=1}^{l}$, we can think of a token request from $R$ to $B$ as an internal request to $O$. We describe the third round of redistribution.

Round 3 - Redistribution for a root of level 0 and degree 1. Request 1 token from each of $O_1, \ldots, O_l$ following Round 2.

Proposition 14. $l \geq 3$.

The correctness of Round 3 follows from Proposition 14. The proof of Proposition 14 is given in Appendix Section 7.1. We thus prove that a root of level 0 and degree 1 has 7 tokens after the third round of redistribution.

Therefore, after the three rounds of token redistribution, each optimal set has at least 7 tokens, so the approximation ratio of the Restricted 4-Set Packing algorithm is at least $\frac{7}{16}$. ⊓⊔

We give the construction of a tight example in Appendix Section 7.2. We thus conclude that the approximation ratio of the Restricted 4-Set Packing algorithm is $\frac{7}{16}$.

The Restricted $k$-Set Packing Algorithm, $k \geq 5$

For $k \geq 5$, we prove that the approximation ratio of the Restricted $k$-Set Packing algorithm is the same as that of the set packing heuristic [7]. For fixed $s$, a local improvement can replace at most $s$ sets with $s + 1$ sets.
We prove that for any $\epsilon > 0$, there exists an $s$ such that the approximation ratio of the Restricted $k$-Set Packing algorithm is at least $\frac{2}{k} - \epsilon$.

The proof strategy is similar to that of the Restricted 4-Set Packing algorithm. We first create a forest of blocking trees $F$. We give every element covered by sets chosen by the algorithm one unit of tokens. We then redistribute the tokens among all the optimal sets. We claim that for s ≥ kǫ , after redistribution, every optimal set gets at least $2 - k\epsilon$ units of tokens. We use a different parameter of local improvement from Theorem 2. The algorithm still runs in polynomial time.

Proof.

The ﬁrst round of redistribution goes as follows.

Round 1 - Redistribution in each blocking tree $T$. Every leaf in $T$ has $k - 3$ units of free tokens to distribute. Every internal node $V$ of degree $d$ receives $d - 2$ units of tokens from a leaf. The root receives the remaining tokens.

Proposition 15.

After Round 1, every node has at least 2 units of tokens, except singletons which are 1-level sets.

We consider the collection $\mathcal{S}_1$ of 1-level sets which are singletons in $F$. Let $S$ be such a 1-level set that intersects with a $k$-set $A$ chosen by the algorithm. Assume $A$ intersects with $j$ other optimal sets $\{Q_i\}_{i=1}^{j}$.

Round 2 - Redistribution for $S \in \mathcal{S}_1$.
– For every singleton node of level $i$ ($i \geq 3$): send 1 unit of tokens each to $i - 2$ arbitrary internal requests.
– If $S$ receives one unit of tokens, we are done. Otherwise, pick an arbitrary singleton node $Q_i$ from $\{Q_i\}_{i=1}^{j}$. Let $Q_i = O_1$. Let $A_1 \in \mathcal{A}$ be a set intersecting with $Q_i$ while it does not intersect with any set in $\mathcal{S}_1$. (The existence of $A_1$ follows from Proposition 6.)
– If $A_1$ intersects with some singleton node which has at least 3 units of tokens, we move 1 unit to $S$. Otherwise pick an arbitrary singleton node $O_2$ which intersects with $A_1$. Let $A_2 \in \mathcal{A}$ be a set intersecting with $O_2$ while it does not intersect with any set in $\mathcal{S}_1$.
– Repeat this procedure and form a chain of singleton sets $O_1 = Q_i, O_2, \ldots, O_p$ such that $S$ and $O_1$ intersect with $A = A_0$, and $O_i$ and $O_{i+1}$ intersect with a set $A_i$ chosen by the algorithm, for $i = 1, \ldots, p - 1$, until one of the following holds.
(1) The chain ends when, excluding $O_p$, every set intersecting with $A_{p-1}$ is not a singleton node. Denote the collection of such $S$ by $\mathcal{S}_T$. Move 1 unit of tokens to $S$ from any node in the collection of non-degenerate blocking trees which has at least 3 units of tokens.
(2) The chain ends where there exists a 1-level set $S' \in \mathcal{S}_1$ which intersects with $A'$, such that $S'$ starts another chain of length $q$ which ends at $O_p$ ($= O'_q$) (as illustrated in Fig. 4 (Appendix Section 8.1)). Construct a graph $G$ such that every vertex in the graph represents a node in a chain, and $V_1, V_2$ are connected if the corresponding nodes intersect with a set $A$ chosen by the algorithm. Consider every connected component $C$ in $G$. Distribute all tokens equally among the vertices in $C$.

The correctness of step (1) and step (2) follows from Propositions 16 and 17. We give the proofs in Appendix Section 8.2.
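The last operation of step (2), distributing all tokens of a connected component of $G$ equally among its vertices, can be sketched as follows. The function name `equalize_tokens` and the dictionary-based input format are assumptions made for illustration; building $G$ from the chains is not shown.

```python
from collections import defaultdict
from fractions import Fraction


def equalize_tokens(tokens, edges):
    """Give every vertex the average token count of its connected component.

    tokens: dict vertex -> units of tokens (all vertices of G must be keys)
    edges:  iterable of (u, v) pairs, the edges of G
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    result, seen = {}, set()
    for start in tokens:
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        component, stack = [], [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            component.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        share = Fraction(sum(tokens[u] for u in component), len(component))
        for u in component:
            result[u] = share
    return result
```

The total number of tokens is preserved, so a component whose average is at least $2 - k\epsilon$ leaves each of its optimal sets with at least that many units, which is the quantity bounded in Proposition 17.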
Proposition 16.

The collection of non-degenerate blocking trees has at least $|\mathcal{S}_T|$ free units of tokens. After step (1), every set in $\mathcal{S}_T$ has 2 units of tokens.

Proposition 17.

After step (2), every optimal set has at least $2 - k\epsilon$ units of tokens.

We conclude that after the second round of redistribution, every optimal set gets at least $2 - k\epsilon$ units of tokens. Therefore, the approximation ratio of the Restricted $k$-Set Packing algorithm is at least $\frac{2}{k} - \epsilon$. ⊓⊔

Moreover, the tight example of the $k$-set packing heuristic [7] can also serve as a tight example of the Restricted $k$-Set Packing algorithm for $k \geq 5$. Hence, the approximation ratio of the Restricted $k$-Set Packing algorithm is $\frac{2}{k} - \epsilon$.

We use the factor-revealing linear program introduced by Jain et al. [8] to analyze the approximation ratio of the algorithm PRPSLI. Athanassopoulos et al. [1] first applied this method to the $k$-Set Cover problem. Notice that the cover produced by the restricted set packing algorithms is a cover which minimizes the number of 1-sets. In Appendix Section 9.1, we first show that for any k ≥ , there exists a $k$-set cover which simultaneously minimizes the size of the cover and the number of 1-sets in the cover. We then present the set-ups and notations of the factor-revealing linear program (LP).

The proof of Theorem 3 is similar to the proof of Theorem 6 in [1]. Namely, we find a feasible solution to the dual program of (LP) which makes the objective function of the dual program equal to the value of $\rho_k$ defined in Theorem 3; thus $\rho_k$ is an upper bound on the approximation ratio of PRPSLI. We give the proof of this upper bound in Appendix Section 9.2. On the other side, in Appendix Section 9.3 we give an instance for each $k$ such that PRPSLI does not achieve a better ratio than $\rho_k$ on this instance. We thus prove Theorem 3.

References

1. S. Athanassopoulos, I. Caragiannis, and C. Kaklamanis. Analysis of approximation algorithms for k-set cover using factor-revealing linear programs. Theory of Computing Systems, 45(3):555–576, 2009.
2. R. Duh and M. Fürer. Approximation of k-set cover by semi-local optimization. Proceedings of the 29th Annual ACM Symposium on Theory of Computing, pages 256–264, 1997.
3. U. Feige. A threshold of ln n for approximating set cover. Journal of the ACM, 45(4):634–652, 1998.
4. O. Goldschmidt, D.S. Hochbaum, and G. Yu. A modified greedy heuristic for the set covering problem with improved worst case bound. Information Processing Letters, 48:305–310, 1993.
5. M.M. Halldórsson. Approximating k-set cover and complementary graph coloring. Proceedings of the 5th Conference on Integer Programming and Combinatorial Optimization, LNCS 1084:118–131, 1996.
6. E. Hazan, S. Safra, and O. Schwartz. On the complexity of approximating k-set packing. Computational Complexity, 15:20–39, 2006.
7. C.A.J. Hurkens and A. Schrijver. On the size of systems of sets every t of which have an SDR, with an application to the worst-case ratio of heuristics for packing problems. SIAM Journal on Discrete Mathematics, 2(1):68–72, 1989.
8. K. Jain, M. Mahdian, E. Markakis, A. Saberi, and V.V. Vazirani. Greedy facility location algorithms analyzed using dual fitting with factor-revealing LP. Journal of the ACM, 50(6):795–824, 2003.
9. D.S. Johnson. Approximation algorithms for combinatorial problems. Journal of Computer and System Sciences, 9:256–278, 1974.
10. A. Levin. Approximating the unweighted k-set cover problem: greedy meets local search. SIAM Journal on Discrete Mathematics, 23(1):251–264, 2008.
11. R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.
12. L. Trevisan. Non-approximability results for optimization problems on bounded degree instances. Proceedings of the 33rd Annual ACM Symposium on Theory of Computing, pages 453–461, 2001.

Appendix

5 Comparison on the Approximation Ratio of the $k$-Set Cover Problem with Previous Works

Table 1. Comparison on the Approximation Ratio of the $k$-Set Cover Problem

k         GRSLI k, [2]   [10]           PRSLI k, [1]   PRPSLI
3         1.3333         1.3333         1.3333         1.3333
4         1.5833         1.5808         1.5833         1.5208
5         1.7833         1.7801         1.7833         1.7333
6         1.9500         1.9474         1.9208         1.8667
7         2.0929         2.0903         2.0690         2.0190
8         2.2179         2.2153         2.1762         2.1262
9         2.3290         2.3264         2.2917         2.2413
10        2.4290         2.4264         2.3802         2.3302
20        3.0977         3.0952         3.0305         2.9779
21        3.1454         3.1428         3.0784         3.0284
50        3.9992         3.9966         3.9187         3.8683
75        4.4014         4.3988         4.3178         4.2678
100       4.6874         4.6848         4.6021         4.5520
large k   H_k − 0.5      H_k − 0.5026   H_k − 0.5902   H_k − 0.6402

Example 1 (Blocking).

Consider an instance ( U, S ) of the 4-Set Cover problem. Sup-pose there is an optimal solution O of the Restricted 4-Set Packing algorithm whichconsists of only disjoint 4-sets that cover all elements. Let O = { O i } i =1 S { B i } mi =1 , m > . Let { A i } i =1 be a collection of 4-sets chosen by the algorithm. { O i } i =1 isa collection of 1-level or 2-level sets. Denote the j -th element of O i by o ji , for j =1 , , , . If A i = ( o j i , o j i , o j i , o j i ) , we say that A i covers the elements o j i , o j i , o j i and o j i . Denote the following unit by U , U =  A = ( o , o , o , o ) A = ( o , o , o , o ) A = ( o , o , o , o ) A = ( o , o , o , o ) A = ( o , o , o , o ) A = ( o , o , o , o ) A = ( o , o , o , o ) acking-Based Approximation Algorithm for the k -Set Cover Problem 13 We visualize this construction in Fig. 1. Let each cube represent a 4-set in { A i } i =1 and we place { O i } i =1 vertically within a × square (not shown in the ﬁgure), suchthat each O i intersects with one or two sets in { A i } i =1 . { A i } i =1 are placed horizontally.Notice that O , O , O , O which intersects with A , A , A , A respectively are1-level sets. The other 12 sets in { O i } i =1 are 2-level sets. { B i } mi =1 is a collection of 3-level sets. Notice that for our ﬁxed optimal solution, allelements can be covered by 4-sets, so there is no 1-set needed to ﬁnish the cover. Forgiven S , when we compute an extension of the packing to a full cover via the semi-local (2,1)-improvement, assume the unpacked element of B i ( ≤ i ≤ ) can only becovered by a 2-set intersecting with both B i and O i , or it introduces a 1-set in the cover.The remaining unpacked elements of { O i } i =1 and { B i } mi =1 can be covered arbitrarilyby 2-sets and 3-sets. In unrestricted packing, one of the local improvements consistsof replacing A , A , A , A , A by O , O , O , O , O , O , O , O . 
However, inrestricted packing, for ≤ i ≤ , adding any O i to the packing would create a 1-setcovering the unpacked element of B i during the semi-local (2,1)-improvement. Hencethis local improvement is prohibited as a result of restricting on the number of 1-sets.We remark that blocking can be much more complicated than in this simple exam-ple. As we shall see later in Section 3.2, for the Restricted 4-Set Packing problem, a3-level set can initiate a blocking of many optimal sets. A A A A A A A Fig. 1.

Placement of A to A Example 2 (Finish the cover by 1-sets, 2-sets and 3-sets).

In Fig. 2, Fig. 3 and Fig. 4, rectangles placed vertically represent optimal sets. Circles represent 1-sets. Ellipses placed horizontally represent 2-sets or 3-sets, where the smaller ones stand for 2-sets and the larger ones stand for 3-sets. The cross symbol represents an element covered by a $k$-set in $\mathcal{A}$. Let $(n_1, n_2, n_3)$ be an ordered triple such that $n_1$ is the number of 1-sets, $n_2$ is the total number of 2-sets and 3-sets which are not within one optimal set, and $n_3$ is the number of 3-sets which are not within one optimal set. These 1-sets, 2-sets and 3-sets are used to finish the cover. The right picture is always a cover which comes before the cover in the left picture in the lexicographic order.

Fig. 2. (2 , , ⇒ (2 , ,

Fig. 3. (0 , , ⇒ (0 , ,

Fig. 4. (0 , , ⇒ (0 , ,

Proof (Proposition 1).

It is sufficient to show that there is no cycle in $F$. Suppose $V_1, V_2, \dots, V_l$ form a cycle in $F$. We remove the 2-sets intersecting with adjacent nodes in the cycle and add the 2-sets inside each $V_i$, for $i = 1, \dots, l$. Then the total number of 2-sets and 3-sets not within an optimal set decreases. ⊓⊔

Proof (Proposition 2).

Let $V$ be an $i$-level set which is covered by a 1-set, for $i < k - 1$. If there is more than one 1-set covering $V$, we can replace them with a 2-set or 3-set inside $V$. Hence we only consider the case that there is only one 1-set covering $V$. If there is a 2-set or 3-set inside $V$, we can remove the 1-set and replace the 2-set or 3-set with a 3-set or two 2-sets respectively. If $V$ is a singleton node, since $i < k - 1$, there exists a 2-set or 3-set inside $V$. Otherwise let $V_l$ be any node of degree 1 which connects to $V$ via a simple path V , V , ..., V l − , V l . We remove all 2-sets intersecting with adjacent nodes in this path and move the 1-set from $V$ to $V_l$, then add 2-sets inside V , ..., V l − . If there is a 1-set in any $V_i$, ≤ i ≤ l − , it can be replaced, together with the 2-set just added inside $V_i$, by a 3-set. If there is another 1-set in $V_l$, it can be combined with the 1-set moved from $V$ into a 2-set. Hence, there is no 1-set in any $i$-level set for $i < k - 1$.

However, a 1-set can remain in a $(k-1)$-level set. In this case, the $(k-1)$-level set is a singleton node in $F$. ⊓⊔

Proof (Proposition 3).

We consider a non-degenerate tree. Suppose there are two nodes, an $i$-level set $V_1$ and a $j$-level set $V_2$, which satisfy the requirements. Consider a simple path connecting $V_1$ and $V_2$. We remove all 2-sets intersecting with adjacent nodes on this simple path. For the nodes on the path other than $V_1$ and $V_2$, we can add a 2-set inside each optimal set. For $V_1$, since $V_1$ is not a $(k-1)$-level set, there is no 1-set inside $V_1$. Moreover, the degree of $V_1$ is smaller than $k - i$, hence there is a 2-set or 3-set inside $V_1$, which we can then replace with a 3-set or two 2-sets inside $V_1$ respectively. The same argument applies to $V_2$. Hence, there can be at most one $i$-level set with degree smaller than $k - i$. ⊓⊔

Proof (Proposition 4).

We first prove that there is at most one node of either 0-level or 1-level which is of degree 2. Assume $V_1$ and $V_2$ are 0-level or 1-level sets of degree 2. We replace the 2-sets intersecting with two adjacent nodes along the simple path connecting $V_1$ and $V_2$ with 2-sets or 3-sets inside the optimal sets represented by these nodes; then the total number of 2-sets and 3-sets not within an optimal set decreases. In this way, $V_1$ and $V_2$ become nodes of degree 1.

Suppose we have one 0-level or 1-level set $V$ of degree 2. We prove that there is no node of degree 1 representing an $i$-level set for any i ≤ k − , thus $V$ is the root. Assume $W$ is an $i$-level set of degree 1. We replace the 2-sets intersecting with two adjacent nodes along the simple path connecting $V$ and $W$ with 2-sets or 3-sets inside the optimal sets represented by these nodes; then the total number of 2-sets and 3-sets not within an optimal set decreases. In this way, $V$ becomes a node of degree 1 and $W$ becomes a node of degree 0. ⊓⊔

Proof (Proposition 5).

Consider any subtree with root $V$ of degree $d + 1$, d ≥ ($V$ is not the root of $T$). Since any internal node of degree 2 does not request any token, for simplicity we assume there is no internal node of degree 2 in the subtree. Assume $V$ has children $V_1, \dots, V_d$. If every child of $V$ is a leaf, since the requests are processed in the reverse order given by BFS, we know that there is no other node requesting tokens from $V_1, \dots, V_d$, hence $V$'s requests can be satisfied. Assume that in every subtree rooted at $V_1, \dots, V_d$, all the requests have been satisfied. Since for any tree with root of degree $r$, the quantity

$$\sum_{V \text{ is an internal node}} (d_V - 2) + r \tag{7.1}$$

equals the number of leaves of the tree, we know that there are 4 free tokens in each subtree rooted at $V_1, \dots, V_d$. Hence, $V$'s requests can be satisfied.

By (7.1), the root of $T$ receives r tokens. ⊓⊔

Proof (Proposition 6).

Suppose a singleton node $O$ is covered by $\{A_i\}_{i=1}^{l}$. If every $A_i$ intersects with an $S_i \in S$, we have a local improvement by replacing $A_1, \dots, A_l$ with $S_1, \dots, S_l$ and $O$. Hence, $\{A_i\}_{i=1}^{l}$ intersect with at most $l - 1 \le j - 1$ sets in $S$, and $O$ has at most $j - 1$ requests of tokens. Similarly, since a leaf is a $(k-1)$-level set, it has at most $k - 2$ internal requests.

Every internal node of level $j$ can have $j$ requests, and the root of level $s$ can have $s$ requests. ⊓⊔

Proof (Proposition 7).

Assume O is of level j and j ≥ . From Proposition 6, O has atmost j − request. After giving j − tokens, O has j + 1 ≥ tokens left. Hence, O can satisfy all the requests. ⊓⊔ Proof (Proposition 8).

The root has s + r ) tokens and it has at most s + r requests.If it has at most s + r − requests, it retains s + 3 r + 1 ≥ tokens after satisfying allthe requests.It remains to prove that it cannot have s + r requests. Otherwise, suppose R iscovered by { A i } li =1 and every A i intersects with S i ∈ S . There are leaves L , ..., L r which request token from R for 1-level singleton S ′ , ..., S ′ r . S ′ i intersects with A ′ i ∈ A .Then we have a ( l + r ) - ( l + r + 1) -improvement (replace S , ..., S l , S ′ , ..., S ′ r with A , ..., A l , A ′ , ..., A ′ r and R ). ⊓⊔ Proof (Proposition 9).

If such a root has an external request from $V$, then $V$ is a leaf and $V$ has an internal request from an $S \in S$, where $S$ intersects with an $A \in A$. Then we have a 1-2-improvement (replace $A$ with $S$ and the root). ⊓⊔

Proof (Proposition 10).

By Proposition 6, $L$ makes an external request if and only if it has 2 internal requests. As in Step (1), $L$ sends a request to $V$, which has received 4 tokens from it. If $V$ has at least 8 tokens, it gives one to $L$. Otherwise, we proceed with Step (2.2). Hence, it is sufficient to prove that in (2.2), there exists a leaf in some subtree rooted at $V_2, \dots, V_d$ which has 8 tokens.

Otherwise, suppose every leaf in the subtrees rooted at $V_2, \dots, V_d$ has an internal request. We pick $L_2, \dots, L_d$ belonging to the subtrees rooted at $V_2, \dots, V_d$ respectively. $L_i$ has an internal request from $S_i$, which intersects with $A_i \in A$. Moreover, we know that $L$ has an internal request from $S$ intersecting with $A \in A$. We have a $d$-$(d+1)$-improvement (replace $A_2, \dots, A_d, A$ with $S_2, \dots, S_d, S$ and $V$). ⊓⊔

Proof (Proposition 11).

An internal node of degree at least 3 sends a request if and only if it receives a request from a leaf but fails to satisfy it. The proof is contained in the proof of Proposition 10. ⊓⊔

Proof (Proposition 12). If $V$ makes an external request, then by Proposition 6, $V$ has two internal requests, say from $S_1$ and $S_2$, where $S_1$ intersects with $A_1 \in A$ and $S_2$ intersects with $A_2 \in A$. Let $X$ be a closest node to $V$ which belongs to the subtree rooted at $X$ and has a degree d ≥ . If no such $X$ exists, let $X = V$ and $d = 2$. If, on the contrary, $V$'s request cannot be satisfied, we pick a leaf in each subtree rooted at a child of $X$. Let $L_1, \dots, L_{d-1}$ be these leaves; then $L_i$ has an internal request from $S'_i$ intersecting with $A'_i \in A$. We then have a local improvement by replacing $A'_1, \dots, A'_{d-1}, A_1, A_2$ with $S'_1, \dots, S'_{d-1}, S_1, S_2$ and $V$. ⊓⊔

Proof (Proposition 13).

First, $O \notin S$. Otherwise, assume $O$ intersects with $A \in A$; we have a 1-2-improvement (replace $A$ with $O$ and $R$). Hence, $O$ does not have any internal request.

$O$ does not have any external request either. Recall that in Round 2 of redistribution, a leaf has an external request if there exists an internal node $V$ of degree $d$ such that $O$ belongs to the subtree rooted at $V$, and

- If $d = 2$, $V$ has two internal requests from $S_1, S_2$, where $S_1, S_2$ intersect with $A_1, A_2 \in A$ respectively. However in this case, we have a 3-4-improvement (replace $A, A_1, A_2$ with $R, S_1, S_2, V$).
- If $d = 3$, there is an external request sent to $V$ from a leaf $L$, $L$ has an internal request from $S_1$ which intersects with $A_1 \in A$, and there is an internal request sent to $V$ from $S_2$. Let $S_2$ intersect with $A_2 \in A$. We have a 3-4-improvement (replace $A, A_1, A_2$ with $R, S_1, S_2, V$).
- If $d = 4$, there are two external requests sent to $V$ from leaves $L_1, L_2$. $L_1, L_2$ have internal requests from $S_1, S_2$, and $S_1, S_2$ intersect with $A_1, A_2 \in A$ respectively. There is a 3-4-improvement (replace $A, A_1, A_2$ with $R, S_1, S_2, V$).

Hence, $O$ does not have any token request during Round 2 of redistribution. ⊓⊔

Proof (Proposition 14).

First, j ≥ . Otherwise, we have a 1-2-improvement (replace $A$ with $B$ and $R$).

If $j = 2$, then the number of elements contained in $\{O_i\}_{i=1}^{l} \cap \{A_i\}_{i=1}^{j}$ is · − . Hence l ≥ . If $l = 2$, there exists an $O_i$ which is a 3-level set and completely covered by $\{A_i\}_{i=1}^{j}$. However in this case, we have a 2-3-improvement (replace $A_1, A_2$ with $O_i, B, R$). Hence, l ≥ .

If $j = 3$, then the number of elements contained in $\{O_i\}_{i=1}^{l} \cap \{A_i\}_{i=1}^{j}$ is · − . In this case, l ≥ . ⊓⊔

We construct an example showing that for any fixed $s$, which is the parameter of the local improvement, and any $\epsilon > 0$, there exists an instance on which the Restricted 4-Set Packing algorithm has a performance ratio at most + ǫ . We thereby conclude that the approximation ratio of the Restricted 4-Set Packing algorithm is .

Our construction is randomized. We take $n$ copies of the unit U defined in Example 1. Then there are n m S , then propagate through an arbitrary number of 2-level sets S , S , ..., S i ( i ≥ ). More specifically, we form the blocking by covering the remaining uncovered elements of S , S , ..., S i by 2-sets $T_1, T_2, \dots, T_i$, such that $T_j$ intersects $S_{j-1}$ and $S_j$, for $j = 1, 2, \dots, i$.

In our example, we assign each 2-level set to one of the $m$ blocking sets independently and uniformly at random. If a 2-level set is assigned to a blocking set, that blocking set is a leaf of the blocking tree containing the 2-level set. For fixed $s$, we consider local $p$-$(p+1)$-improvements for any p ≤ s . For each subset of $A$ with size $p$, suppose after removing one covering set of $\lambda$ blocking sets, we can replace these $p$ sets with at most $q$ optimal sets. Then q ≤ p ≤ s . If q − p − λ ≥ , i.e., λ ≤ q − p − ≤ s ≤ s , there is a local improvement.

We assume $n \gg s$. Let $t$ be the maximum number of 2-level sets which can be added to the solution by the local improvement. Then t ≤ s . Let $O$ be the collection of 2-level sets.
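The random assignment step described above can be simulated directly. The following is a minimal sketch under stated assumptions: the names `assign_blocking_sets` and `blocking_sets_hit` and the toy sizes are hypothetical, and the code only illustrates the sampling step and the count of distinct blocking sets met by a subset of 2-level sets, not the full instance.

```python
import random
from typing import Dict, Iterable


def assign_blocking_sets(num_two_level: int, m: int, rng: random.Random) -> Dict[int, int]:
    """Assign each 2-level set (indexed from 0) to one of m blocking sets,
    independently and uniformly at random."""
    return {j: rng.randrange(m) for j in range(num_two_level)}


def blocking_sets_hit(assignment: Dict[int, int], subset: Iterable[int]) -> int:
    """Number of distinct blocking sets assigned to the 2-level sets in `subset`."""
    return len({assignment[j] for j in subset})


# Toy sizes (hypothetical): two copies of the unit, each with 12 two-level sets,
# and m = 1000 blocking sets.
rng = random.Random(0)
assignment = assign_blocking_sets(num_two_level=24, m=1000, rng=rng)
hits = blocking_sets_hit(assignment, subset=[0, 1, 2, 3])
```

When $m$ is much larger than the subset size, the 2-level sets of a small subset tend to land on distinct blocking sets, which is the intuition behind the probability bound that follows.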
Let {E , E , ..., E N } be a set of random variables, where N = P ti =1 (cid:0) ni (cid:1) ≈ n t is the number of nonempty subsets of O with size at most t . Assume we enumerateevery 2-level sets and arrange all subsets of O lexicographically. Let E i be the eventthat there exists a local improvement which adds the i -th subset with size at most t of O to the solution. Let Y be the number of blocking sets assigned to the 2-level sets inthis subset. Then, Pr( E i ) ≤ Pr( Y ≤ s ) = s X λ =1 Pr( Y = λ ) . (7.2)Since the assignments of 2-level sets to blocking sets are independently and uni-formly at random, we bound (7.2) as follows, Pr( E i ) ≤ s X λ =1 (cid:18) mλ (cid:19) ( λm ) N ≤ s X λ =1 m λ λ N m N ≤ (2 s ) N · m − ( N − s ) ≈ ( 2 sm ) N . (7.3)Assume that m ≫ s . Since each E i depends on less than N ≈ n t elements in {E i } Ni =1 , and N · Pr( E i ) = o (1) . By the following lemma, we have Pr( T Ni =1 E i ) > . Lemma 1 (Corollary 5.12 [11]).

Let $\{E_1, E_2, \dots, E_n\}$ be events in a probability space with $\Pr(E_i) \le p$ for all $i$. If each event is mutually independent of all other events except for at most $d$, and $ep(d + 1) \le 1$, then $\Pr\left(\bigcap_{i=1}^{n} \overline{E_i}\right) > 0$.

Therefore, there exists an assignment of 2-level sets to blocking sets such that no local $p$-$(p+1)$-improvement is possible, for p ≤ s .

Assume all 3-level sets but a constant few of them are blocking sets. The performance ratio of the Restricted 4-Set Packing algorithm on this instance is n +9 m n +12 m = + O ( n ) . The ratio tends arbitrarily close to when n → ∞ .

Fig. 5. Rectangles represent optimal $k$-sets. Ellipses represent $k$-sets chosen by the algorithm.

Proof (Proposition 15).

According to Proposition 4 and (7.1), we know that after Round 1, every node has at least 2 units of tokens except singletons which are 1-level sets. Note that, contrary to the Restricted 4-Set Packing problem, in restricted $k$-set packing for k ≥ , every leaf contributes at least 2 units of tokens. Hence, even if a root is of level 0 and degree 1, it still receives at least 2 units of tokens after the first round of redistribution.

We remark that it does not matter from which leaf an internal node or the root receives its tokens. ⊓⊔

Proof (Proposition 16).

Let N = | S T | . Without loss of generality, assume there isonly one non-degenerate tree T . Let T have an s -level root of degree r and a set ofinternal nodes with degree set { d V } V is an internal node .From (7.1), we know that the leaves contribute ( k − P ( d V −

2) + ( k − r units of tokens. Here the summation goes through the set of internal nodes. There are P ( d V − k − d V −

2) + ( k − P ( d V −

2) + ( k − r + s − , i.e. ( k − P ( d V −

1) + ( k − r + s − free units of tokens in T which can be distributed to sets in S T .Assume on the contrary that N > ( k − X ( d V −

1) + ( k − r + s − . (8.1)We derive an upper bound on N .We ﬁrst claim that any | A i T O j | ≤ . If | A i T O j | ≥ , we know that O j has atleast 3 units of tokens.If A p − = A , the k − elements of A which do not intersect with S , intersect with T . If A p − = A , for each S , there are k − corresponding elements intersecting with T . Here, these k − elements belong to A p − . Hence, T covers at least ( k − N elements. On the other hand, there are ( k − P ( d V −

2) + r ] + P ( k − d V ) + s elements covered by T . We have, ( k − N ≤ ( k − X ( d V −

2) + r ] + X ( k − d V ) + s . (8.2)Combining (8.1) and (8.2), we get ( k − k − P ( d V −

1) + ( k − r + s − < ( k − P ( d V −

2) + r ] + P ( k − d V ) + s , which implies X ( k − k + 6)( d V −

1) + ( k − k + 5) r + ( k − s − k − < . (8.3)(8.3) only holds for the case that k = 5 , r = 1 , s = 0 and there are at most 2 internalnodes. In this case, we observe that the leaf cannot intersect with any set A whichintersects with S ∈ S , or we have a 1-2-improvement by replacing A with S and theroot. Hence, we modify (8.2) and get ( k − N < X ( k − d V ) + s . (8.4)Combining (8.4) and (8.1), we have P (5 d V − < . Since d V ≥ , it leads to acontradiction. ⊓⊔ Proof (Proposition 17).

Let $e$ be the number of edges and $v$ be the number of vertices in $C$.

If $C$ contains a cycle, then $e \ge v$. The average number of tokens in $C$ is e/v ≥ .

If $C$ is a tree, then $e = v - 1$. We claim that $e \ge s + 1$. Otherwise, there exists an $e$-$v$-improvement (replace all the sets corresponding to edges in $C$ with all the sets corresponding to vertices in $C$). The average number of tokens in $C$ is e/v ≥ − v ≥ − s +2 ≥ − kǫ for s ≥ kǫ .

Hence, in any case, after collecting all tokens of $C$ and distributing them equally among the vertices of $C$, each vertex gets at least − kǫ units of tokens. ⊓⊔

We first prove that there exists a $k$-set cover which simultaneously minimizes the size and the number of 1-sets in the cover.

Lemma 2.

Suppose there are k -set covers C , C ′ , where C has b sets and b C ′ has b ′ sets and b ′ k -set cover C ′′ which has min( b, b ′ ) sets and min( b , b ′ ) We create a bipartite graph G = ( V, V ′ , E ) , where every vertex in V ( V ′ ) repre-sents a set in C ( C ′ ), and every edge in E represents an element in the universe. Hence,the degree of a vertex represents the size of the corresponding set. Two vertices beingadjacent means the corresponding two sets covers a same element. We show how to ﬁnd C ′′ .For simplicity, assume G is connected and b ≤ b ′ . If b ≤ b ′ , take C ′′ to be C .Otherwise, consider a vertex v ∈ V of degree 1 such that its neighbor has degree atleast 2. If there exists a vertex v l ∈ V of degree at least 3, assume v l is the one with theshortest distance to v . Consider the path v , v ′ , ..., v ′ l − , v l connecting v and v l , wereplace the corresponding sets of v , ..., v l − with v ′ , ..., v ′ l − and delete the element in v l which is covered by both v l and v ′ l − . If any v ′ i ( ≤ i ≤ l − ) has degree at least 3,delete the elements in v ′ i which are not covered by v , ..., v l − . In this way, b decreasesby 1 while b remains the same. If there is no v l ∈ V which has degree at least 3, since b > b ′ , and C and C ′ cover the same universe of elements, b must be greater than b ′ ,a contradiction. Hence, we can eventually decrease b to be at most b ′ . Finally, we take C ′′ to be this modiﬁed C . ⊓⊔ We are now ready to set-up the factor-revealing linear program for the k -set coverproblem.Let ( U, S ) be an instance of the k -Set Cover problem, where U is the set of ele-ments to be covered, S is a collection of sets, and S S ∈ S S = U . For i = k, k − , ..., ,let ( U i , S i ) be the instance for phase i of Algorithm PRPSLI, where U i is the set of el-ements which have not been covered before Phase i and S i is the collection of sets in S which contain only the elements in U i . 
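The phase instance $(U_i, S_i)$ just defined can be computed directly from the sets chosen in earlier phases. The following is a minimal sketch, not the authors' implementation; the function name `phase_instance` and the toy data are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple


def phase_instance(
    universe: Set[int],
    collection: Iterable[FrozenSet[int]],
    chosen: Iterable[FrozenSet[int]],
) -> Tuple[Set[int], List[FrozenSet[int]]]:
    """Return (U_i, S_i): the elements not yet covered by `chosen`, and the
    sets of `collection` that contain only such elements."""
    covered: Set[int] = set().union(*chosen)
    remaining = universe - covered
    # Keep only non-empty sets lying entirely inside the uncovered elements.
    residual = [s for s in collection if s and s <= remaining]
    return remaining, residual


# Toy instance (hypothetical): one 3-set was chosen in an earlier phase.
universe = {1, 2, 3, 4, 5, 6}
collection = [frozenset(x) for x in ({1, 2, 3}, {3, 4}, {4, 5}, {6}, {5, 6})]
chosen = [frozenset({1, 2, 3})]
remaining, residual = phase_instance(universe, collection, chosen)
```

In this toy run the set {3, 4} is dropped because it touches the already covered element 3; under the paper's assumption that the collection is closed under subsets, its uncovered part {4} would be present as a separate set.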
Let OP T i be an optimal solution of ( U i , S i ) for i ≥ . For i ≤ , OP T i is an optimal solution of ( U i , S i ) with minimal numberof 1-sets. OP T is an optimal solution of ( U, S ) . Let b i,j be the ratio of the number of j -sets in OP T i over the number of sets in OP T . Let ̺ i be the approximation ratio ofthe set packing algorithm used in Phase i . Let a be the ratio of the number of -setschosen by the semi-local optimization phase over the number of sets in OP T . Since | OP T i | ≤ | OP T | , we have for i = k, k − , ..., , i X j =1 b i,j ≤ . (9.1)In each phase of PRPSLI, the number of i -sets chosen by the algorithm is n i = | U i \ U i − | i . Since U i − ⊆ U i , then | U i \ U i − | = | U i | − | U i − | = ( P ij =1 jb i,j − P i − j =1 jb i − ,j ) | OP T | . Let ̺ i be the approximation ratio of the set packing algorithmused in Phase i . At the beginning of Phase i , there are b i,i | OP T | i -sets. Thus, n i = ( P ij =1 jb i,j − P i − j =1 jb i − ,j ) | OP T | i (9.2) ≥ ̺ i b i,i | OP T | . (9.3) i.e. i − X j =1 jb i − ,j − i − X j =1 jb i,j − i (1 − ̺ i ) b i,i ≤ . (9.4)We consider additional constraints imposed by the restricted phases, namely forPhase 6 to 3. Let a j be the ratio of the number of j -sets chosen by the semi-localoptimization phase over the number of sets in OP T , for j = 1 , , . In each restrictedphase, the number of 1-sets does not increase. Hence, for i = 3 , , , , a ≤ b i, . (9.5)Next, we obtain an upper bound of the approximation ratio of PRPSLI. From Lemma2.3 in [2], we have a + a ≤ b , + b , + b , . Also notice that a + 2 a + 3 a = b , + 2 b , + 3 b , . Thus we have an upper bound of n , namely, n = ( a + a + a ) | OP T | = ( a a + a a + 2 a + 3 a | OP T |≤ ( a b , + b , + b , b , + 2 b , + 3 b , | OP T | = ( 13 a + 23 b , + b , + 43 b , ) | OP T | . 
(9.6)Combining (9.2) and (9.6), we have an upper bound of the approximation ratio ofPRPSLI as, P ki =3 n i | OP T | ≤ k X i =4 P ij =1 jb i,j − P i − j =1 jb i − ,j i + 13 a + 23 b , + b , + 43 b , = k X j =1 jk b k,j + k − X i =4 i X j =1 ji ( i + 1) b i,j + 13 a + 512 b , + 12 b , + 712 b , . (9.7)Moreover, b i,j ≥ for j = 1 , .., i ; i = k, ..., . (9.8) a ≥ . (9.9)Hence, we deﬁne the factor-revealing linear program of PRPSLI with objectivefunction (9.7) and constraints (9.1), (9.4), (9.5), (9.8), (9.9) as follows, acking-Based Approximation Algorithm for the k -Set Cover Problem 23 max k X j =1 jk b k,j + k − X i =4 i X j =1 ji ( i + 1) b i,j + 13 a + 512 b , + 12 b , + 712 b , s . t . i X j =1 b i,j ≤ , i = 3 , ..., k, i − X j =1 jb i − ,j − i − X j =1 jb i,j − i (1 − ̺ i ) b i,i ≤ , i = 4 , ..., k,a − b i, ≤ , i = 3 , .., ,b i,j ≥ , i = 3 , ..., k, j = 1 , ..., i,a ≥ . (LP)We also prove that Lemma 3.

For any k ≥ , the approximation ratio of Algorithm PRPSLI is upper-bounded by the maximized objective function value of the factor-revealing linear pro-gram (LP). Proof.

Plug ̺ i = i − ǫ for i = k, ..., and ̺ = in (LP). The dual of (LP) is, min k X i =3 β i s . t . (1) δ + δ + δ + δ ≥ , (2) β + γ − δ ≥ , (3) β + 2 γ ≥ , (4) β + 3 γ ≥ , (5) β i + γ i +1 − γ i − δ i ≥ i ( i + 1) , i = 4 , , , (6) β i + jγ i +1 − jγ i ≥ ji ( i + 1) , i = 4 , ..., k − , j = 1 , ..., i − , (7) β i + iγ i +1 − . γ i ≥ i + 1 , i = 4 , (8) β i + iγ i +1 − ( i − ǫ ) γ i ≥ i + 1 , i = 5 , ..., k − , (9) β k − jγ k ≥ jk , j = 1 , ..., k − , k ≥ . β k − γ k − δ k ≥ k , k = 4 , , , (10) β k − ( k − ǫ ) γ k ≥ , (11) β i ≥ , i = 3 , ..., k, (12) γ i ≥ , i = 4 , ..., k, (13) δ i ≥ , i = 3 , , , . (Dual) For k = 4 , set γ = , δ = 0 , δ = , β = , β = 1 + · . We have P i =3 β i = + + 1 .For k = 5 , set γ = , γ = 0 , β = , β = + 3 γ , β = 1 , δ = 0 , δ = + 2 γ , δ = . We have P i =3 β i = + + 1 .For k ≥ .Set δ = 0 ; γ = , γ i = γ i +2 + i ( i +1)( i +2) for i = 6 , ..., k − , γ k − = 0 , γ k = k − k ; β = , β i = i +1 − iγ i +1 + ( i − ǫ ) γ i for i = 6 , ..., k − , β k = 1 + ( k − ǫ ) γ k .For i ≥ , γ i + γ i +1 − i ( i +1) = γ i +1 + γ i +2 + i ( i +1)( i +2) − i ( i +1) = γ i +1 + γ i +2 − i +1)( i +2) . By induction, γ i + γ i +1 − i ( i +1) = γ k − + γ k − k − k = 0 . Hence, γ i + γ i +1 = 1 i ( i + 1) , for i = 6 , ..., k − . Then, constraints (2),(3) and (4) hold as equality.In constraint (6) for i ≥ , j ≤ i − , β i + jγ i +1 − jγ i = i +1 − iγ i +1 + ( i − ǫ ) γ i + jγ i +1 − jγ i = i +1 − ( i − j ) γ i +1 + ( i − − j + ǫ )( i ( i +1) − γ i +1 ) = i +1 (1 + i − j − ǫi ) − (2 i − − j + ǫ ) γ i +1 ≥ i +1 + i − − j + ǫi ( i +1) − i − − j + ǫi ( i +1) = ji ( i +1) .In constraint (8) for i ≥ , inequalities hold as equality.Constraint (9) holds, β k − jγ k = 1 + ( k − γ k − jγ k ≥ − γ k ≥ jk .Constraint (10) holds as equality.Set γ = − γ , δ = 2 γ − γ + , δ = 2 γ − γ + , δ = 0 for odd k and δ = for even k . 
β ≥ max { + γ − γ , + γ − γ + δ , + 3 γ − γ } = + 3 γ − γ , β ≥ max { +(3+ ǫ ) γ − γ , + γ − γ + δ , +4 γ − γ } = +(3+ ǫ ) γ − γ .Let β = + 3 γ − γ , β = + (3 + ǫ ) γ − γ .For k is odd and k ≥ , γ = γ − γ + · · · + γ k − − γ k − + γ k − = 2( · · + · · · + k − k − k − ) increases monotonically with respect to k . Thus, for any odd k and k ≥ , γ = 2 H k − − H k − + k − + ≤ − < .For k = 7 , γ = 0 .Hence, Constraint (1) holds, δ + δ + δ + δ = 2 γ + + − γ = − γ > . acking-Based Approximation Algorithm for the k -Set Cover Problem 25 For k is even and k ≥ , γ = γ − γ + · · · + γ k − − γ k + γ k = 2( · · + · · · + k − k − k ) + k − k = 2 H k − − H k − + k − + decreases monotonically withrespect to k . For k = 6 , γ = . Thus, for any even k and k ≥ , γ ≤ . Moreover, γ > − > Hence, Constraint (1) holds, δ + δ + δ + δ = 2 γ + + − γ + = − γ ≥ .Constraint (5) for i = 4 , , constraint (6) for i = 4 , , constraint (7) and constraint(8) for i = 5 hold directly as a result of the settings of these parameters.Constraint (5) for i = 6 holds, β + γ − γ − δ − = + 3 γ − γ − = + 3 γ − − γ ) − − = 8 γ − > − > .Constraint (9.1) holds for k = 6 , β − γ − δ > γ − > .Moreover, constraint (11), (12), (13) hold.Finally, we compute the value of the objective function.For odd k and k ≥ , P ki =3 β i = β + β + β + P ki =6 β i = + + + 3 γ + ǫγ − γ + P k − i =6 1 i +1 − iγ i +1 + ( i − ǫ ) γ i + 1 + ( k − ǫ ) γ k = + + + 3 γ − γ +1+ − γ +4 γ + P ki =8 1 i − ( · + · + · · · + k − k )+ ǫ P ki =5 γ i = 1+ + + + − γ − ( − γ )+ 4 γ + 2( + · · · k )+ ǫ P ki =5 γ i ≤ + + + · · · + k + ǫ .Last inequality holds because P ki =5 γ i ≤ P ki =5 1( i +1) i ≤ .Similarly, for even k and k ≥ , P ki =3 β i = β + β + β + P ki =6 β i = + + +3 γ + ǫγ − γ + P k − i =6 1 i +1 − iγ i +1 +( i − ǫ ) γ i +1+( k − ǫ ) γ k = + + +3 γ − γ +1+ − γ +4 γ + P ki =8 1 i − ( · + · + · · · + k − k − + k − k )+ ǫ P ki =5 γ i =1 + + + + − γ − ( − γ ) + 4 γ + 2( + · · · k − ) + k − + k + ǫ P ki =5 γ i ≤ + + + · · · + k − + k − + k + ǫ 
.Therefore, the approximation ratio of PRPSLI for odd k and k ≥ can be upperbounded by + + + · · · + k + ǫ . For even k and k ≥ , it is upper boundedby + + + · · · + k − + k − + k + ǫ . ⊓⊔ For every k ≥ and any ǫ > , we give a tight example of PRPSLI based on the tightexample of the semi-local (2 , -improvement [2] for 3-Set Cover, the tight example ofthe Restricted 4-Set Packing algorithm we give in Appendix Section 7.2, and the tightexample of the Restricted k -Set Packing algorithm for k ≥ , which is the same as thetight example of the k -set packing heuristic [7]. We assume that the optimal solution O consists of only disjoint k -sets. To calculatethe performance ratio on this instance, we charge a cost of 1 for each set chosen by thealgorithm, and the cost is uniformly distributed to every element of the chosen set [2]. – k = 4 . In Phase 4, the Restricted 4-Set Packing algorithm covers 1 element ofeach set in a (1 − ǫ ) fraction of O , 2 elements of each set in a (1 − ǫ ) frac-tion, and 3 elements of each set in the remaining ǫ fraction. Denote the three partsof O by O , O and O respectively. In Phase 3, the semi-local optimization cov-ers 1 element in each set of O by 3-sets, and the remaining uncovered elementsof O are covered by 2-sets. The performance ratio of PRPSLI on this instance is (1 − ǫ )( + 1) + (1 − ǫ )( + + 1) + ǫ ( + ) = + + 1 − ǫ . – k = 5 . In Phase 5, the Restricted 5-Set Packing algorithm covers 2 elements ofeach set in a − ǫ fraction of O , 1 element of each set in the remaining ǫ fraction.Denote the two parts of O by O and O respectively. The algorithm switches to 4-Set Cover on O and it performs a semi-local optimization on O . The performanceratio of Algorithm 1 on this instance is (1 − ǫ )( + + 1) + ǫ ( + + + 1) = + + 1 − ǫ . – k odd and k ≥ . 
In Phase k , the k -Set Packing algorithm covers 2 elements ofeach set in a − ǫ fraction of O , 1 element of each set in the remaining ǫ fraction.Denote the two parts of O by O and O respectively. In Phase k − , the algorithmcovers 1 element of each set in O . Then it switches to ( k − -Set Cover on theremaining uncovered elements. The performance ratio of PRPSLI on this instanceis at least (1 − ǫ ) k + ǫ ( k + k − ) + ρ k − , i.e. k + ρ k − + ( k − − k ) ǫ , which is + + + 1 + ǫ for k = 7 , and by induction k + k − + · · · + + + 1 +( k − − k + k − − k − + · · · + − + ) ǫ for k ≥ . The coefﬁcient of ǫ isupper bounded by k − − k + k − − k − + · · · + − + = − k + ≤ .Hence, the performance ratio is at most k + k − + · · · + + + 1 + ǫ for k ≥ . – k even and k ≥ . In Phase k , the k -Set Packing algorithm covers 2 elements ofeach set in a − ǫ fraction of O , 1 elements of each set in the remaining ǫ fraction.Denote the two parts of O by O and O respectively. In Phase k − , the ( k − -Set Packing algorithm covers 1 element of each set in O and then 1 element ofeach set in O . Then the algorithm switches to ( k − -Set Cover on the remaininguncovered elements. The performance ratio of Algorithm 1 on this instance is atleast (1 − ǫ ) k + ǫ ( k + k − ) + k − + ρ k − , i.e. k + k − + ρ k − + ( k − − k ) ǫ ,which is + + + 1 + ǫ for k = 6 and by a similar argument as the abovecase, at most k + k − + k − + · · · + + + 1 + ǫ for k ≥8
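The charging scheme used in these calculations (each chosen set costs 1, split evenly over its elements) can be illustrated with a short sketch. The helper names `element_charges` and `charge_per_optimal_set` and the toy instance are hypothetical; the instance is only an illustration and is not one of the tight examples above.

```python
from fractions import Fraction
from typing import Dict, FrozenSet, List


def element_charges(cover: List[FrozenSet[str]]) -> Dict[str, Fraction]:
    """Charge 1 per chosen set, split evenly over its elements.
    The chosen sets are assumed to be pairwise disjoint."""
    charges: Dict[str, Fraction] = {}
    for chosen in cover:
        share = Fraction(1, len(chosen))
        for element in chosen:
            charges[element] = share
    return charges


def charge_per_optimal_set(
    cover: List[FrozenSet[str]], optimal: List[FrozenSet[str]]
) -> List[Fraction]:
    """Total charge collected by the elements of each optimal set."""
    charges = element_charges(cover)
    return [sum((charges[e] for e in opt), Fraction(0)) for opt in optimal]


# Toy instance (hypothetical): two disjoint optimal 4-sets, covered by the
# algorithm with one 4-set and two 2-sets.
optimal = [frozenset({"a1", "a2", "a3", "a4"}), frozenset({"b1", "b2", "b3", "b4"})]
cover = [
    frozenset({"a1", "a2", "b1", "b2"}),
    frozenset({"a3", "a4"}),
    frozenset({"b3", "b4"}),
]
per_set = charge_per_optimal_set(cover, optimal)
ratio = sum(per_set, Fraction(0)) / len(optimal)
```

Because the charges of all elements add up to the number of chosen sets, averaging the per-set charge over the optimal sets gives the performance ratio on the instance.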