A performance study of some approximation algorithms for minimum dominating set in a graph

Abstract

We implement and test the performance of several approximation algorithms for computing the minimum dominating set of a graph. These algorithms are the standard greedy algorithm, the recent LP rounding algorithms, and a hybrid algorithm that we design by combining the greedy and LP rounding algorithms. All algorithms perform better than anticipated in their theoretical analysis, and have small performance ratios, measured as the size of the output divided by the LP objective lower bound. However, each may have advantages over the others. For instance, the LP rounding algorithm normally outperforms the other algorithms on sparse real-world graphs. On a graph with 400,000+ vertices, LP rounding took less than 15 seconds of CPU time to generate a solution with performance ratio 1.011, while the greedy and hybrid algorithms generated solutions of performance ratio 1.12 in similar time. For synthetic graphs, the hybrid algorithm normally outperforms the others, whereas for hypercubes and k-Queens graphs, greedy outperforms the rest. Another advantage of the hybrid algorithm is that it can solve very large problems where LP solvers crash, as demonstrated on a real-world graph with 7.7 million+ vertices.


Jonathan S. Li, Rohan Potru, Farhad Shahrokhi
The University of Texas at Austin; University of North Texas
Work done at UNT and supported by the Texas Academy of Math and Science.


Domination theory has its roots in the $k$-Queens problem of the 18th century. Later, in 1957, Berge [4] formally introduced the domination number of a graph. The problem of computing the domination number of a graph has extensive applications, including the design of telecommunication networks, facility location, and social networks. We refer the reader to the book by Haynes, Hedetniemi, and Slater [22] as a general reference in domination theory.

We assume that the reader is familiar with general concepts of graph theory as in [12], the theory of algorithms as in [11], and linear and integer programming concepts as in [14].

Throughout this paper $G = (V, E)$ denotes an undirected graph on vertex set $V$ and edge set $E$ with $n = |V|$ and $m = |E|$. Two vertices $x, y \in V$ with $x \neq y$ are adjacent (or are neighbors) if $(x, y) \in E$. For any $x \in V$, the degree of $x$, denoted by $\deg(x)$, is the number of vertices adjacent to $x$ in $G$. For any $x \in V$, let $N(x)$ denote the set of all vertices in $G$ that are adjacent to $x$, and let $N[x]$ denote $N(x) \cup \{x\}$. The arboricity of $G$, denoted by $a(G)$, is the minimum number of spanning acyclic subgraphs of $G$ that $E$ can be partitioned into. By a theorem of Nash-Williams, $a(G) = \max_S \lceil m_S / (n_S - 1) \rceil$, where $n_S$ and $m_S$ are the number of vertices and edges, respectively, of the induced subgraph on the vertex set $S$ [30]. Consequently $m \leq a(G)(n - 1)$, and thus $a(G)$ measures how dense $G$ is. It is known that $a(G)$ can be computed in polynomial time [19].

Let $D \subseteq V$. $D$ is a dominating set if for every $x \in V \setminus D$ there exists $y \in D$ such that $(x, y) \in E$. The domination number of $G$, denoted by $\gamma(G)$, is the cardinality of a minimum (smallest) dominating set of $G$. Computing $\gamma(G)$ is known to be an NP-hard problem even for unit disc graphs and grids [13].
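As a minimal sketch of the definitions above, the following Python fragment checks whether a vertex set is dominating. The adjacency-dictionary representation and the names `closed_neighborhood` and `is_dominating_set` are illustrative assumptions, not part of the implementation used in the paper.

```python
def closed_neighborhood(adj, x):
    """Return N[x] = N(x) united with {x}; adj maps each vertex to a set of neighbors."""
    return set(adj[x]) | {x}


def is_dominating_set(adj, D):
    """True if every vertex is in D or adjacent to some vertex of D."""
    D = set(D)
    return all(bool(closed_neighborhood(adj, x) & D) for x in adj)


# Hypothetical example: the path 0 - 1 - 2 - 3 - 4.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(is_dominating_set(path, {1, 3}))  # every vertex is in or next to {1, 3}
print(is_dominating_set(path, {0, 1}))  # vertices 3 and 4 are left undominated
```

Checking a candidate set takes time linear in the size of the graph, in contrast to the NP-hardness of finding a smallest one.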
A simple greedy algorithm attributed to Chvátal [9] and Lovász [25] (for approximating the set cover problem) is known to approximate $\gamma(G)$ within a multiplicative factor of $H(\Delta(G))$ from its optimal value, where $\Delta(G)$ is the maximum degree of $G$ and $H(k) = \sum_{i=1}^{k} (1/i)$ is the $k$-th harmonic number. The algorithm initially labels all vertices uncovered. At iteration one, the algorithm selects a vertex $v_1$ of maximum degree in $G$, places $v_1$ in a set $D$, and labels all vertices adjacent to it as covered. In general, at iteration $i \geq 2$, the algorithm selects a vertex $v_i \in V - \{v_1, v_2, \ldots, v_{i-1}\}$ with the largest number of uncovered vertices adjacent to it, adds $v_i$ to $D$, and labels all of its uncovered adjacent vertices as covered. The algorithm stops when $D$ becomes a dominating set. It is easy to implement the algorithm in $O(n + m)$ time. It is known that approximating $\gamma(G)$ within a factor of $(1 - \epsilon) \ln(\Delta)$ from the optimal is NP-hard [17]. Hence, no algorithm for approximating $\gamma(G)$ can improve the asymptotic worst-case performance ratio achieved by the greedy algorithm. Different variations of the greedy algorithm to approximate $\gamma(G)$ have been developed, and some have been tested in practice; see the work of Chalupa [9], Campan et al. [8], Eubank et al. [18], Parekh [26], Sanchis [27], and Siebertz [28].

Below are two examples of worst-case graphs (one sparse and one dense) for the greedy algorithm, which are derived from an instance of the set cover problem provided in [6]. For both instances, the solutions provided by the greedy algorithm are actually $O(\ln(\Delta))$ times the optimal.

Example 1.1.

Let $p \geq$ […] $i = 1, 2, \ldots, p$, let $S_i$ be a star on $2^i$ vertices. Consider a graph $G$ on $n = 2^{p+1}$ vertices whose vertices are the disjoint union of the vertices of the $S_i$'s ($i = 1, 2, \ldots, p$) plus two additional vertices $t_1$ and $t_2$. Now, place edges from $t_1$ and $t_2$ to the first half of the vertices in each $S_i$ (including the root), and the second half of the vertices in each $S_i$, respectively. Note that the root of each $S_i$ has degree $2^i$ and the degree of both $t_1$ and $t_2$ is $2^p - 1$. Initially, greedy chooses the root of $S_p$, which can cover $2^p +$ […] $i \geq 2$, there is a tie between the root of $S_{p+1-i}$ and $t_{[…]}$, since each can cover […] uncovered vertices. If tie breaking does not result in selecting $t_{[…]}$, there will be a tie in every iteration until the algorithm returns the set of roots of the $S_i$'s ($i = 1, 2, \ldots, p$). This dominating set has cardinality $p = \log(\Delta) - 1$, but $\gamma(G) = 2$, since $\{t_1, t_2\}$ is a minimum dominating set. Note that $G$ is a planar graph.

Note that $\ln(k + 1) \leq H(k) \leq \ln(k) + 1$.

Example 1.2.

Let $p \geq$ […] $G$ be a graph with vertices $V_1 \cup V_2$, where $V_1 = \{s_1, s_2, \ldots, s_p, t_1, t_2\}$ and $V_2 = \{v_1, v_2, \ldots, v_{2^{p+1}-2}\}$. Now make $V_1$ a clique and $V_2$ an independent set of vertices, respectively. Next, consider a linear ordering $L$ on $V_2$: for $i = 1, 2, \ldots, p$, the set of neighbors of $s_i$ in $V_2$, denoted by $W_i$, has cardinality $2^i$ and is disjoint from $W_k$ for any $k < i$. Finally, for $i = 1, 2, \ldots, p$, place edges between $t_1$ and the first half of the vertices in each $W_i$, and place edges between $t_2$ and the second half of the vertices in each $W_i$. Now note that the greedy algorithm will be forced to pick the vertices $s_p, s_{p-1}, \ldots, s_1$, in that order, but the minimum dominating set in $G$ is $\{t_1, t_2\}$ and $\Delta = 2^p + p + 1$.

One can formulate the computation of $\gamma(G)$ as an integer programming problem stated below. However, since integer programming problems are known to be NP-hard [23], the direct application of the integer programming method would not be computationally fruitful.

IP1:

Minimize $I = \sum_{v \in V} x_v$
Subject to $\sum_{u \in N[v]} x_u \geq 1, \ \forall v \in V$
$x_v \in \{0, 1\}, \ \forall v \in V$

Now observe that by relaxing the integer program IP1 one obtains the following linear program.

LP1:

Minimize $L = \sum_{v \in V} x_v$
Subject to $\sum_{u \in N[v]} x_u \geq 1, \ \forall v \in V$
$0 \leq x_v \leq 1, \ \forall v \in V$

Note that $L^* \leq \gamma(G) = I^*$, where $L^*$ and $I^*$ are the values of $L$ and $I$ at optimality. Since linear programming problems are solvable in polynomial time [24], LP1 can be solved in polynomial time. Very recently, Bansal and Umboh [3] and Dvořák [16] have shown that an appropriate rounding of fractional solutions of LP1 gives integer solutions to IP1 whose values are at most $3 \cdot a(G) \cdot L^*$ and $(2 \cdot a(G) + 1) \cdot L^*$, respectively, in polynomial time. Hence, for sparse graphs (graphs with bounded arboricity), one can get a better approximation ratio than the $O(\ln(\Delta))$ achieved by the greedy algorithm. To our knowledge, and in contrast to the greedy algorithm, the performance of the LP rounding approaches has not been tested in practice.

1.3 Other approximation algorithms

There are other approximation algorithms for very specific classes of graphs, including planar graphs, which have better than constant performance ratio in the worst case but are more complex than the algorithms described here. See [28] for a brief reference to some related papers.
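For concreteness, the greedy algorithm described earlier can be sketched in a few lines of Python. This is a simple quadratic-time illustration, not the $O(n + m)$ C++ implementation used in the experiments. It also counts uncovered vertices in the closed neighborhood $N[v]$, a common variant assumed here so that isolated vertices are handled. The function name and the tie-breaking rule (first vertex in iteration order) are likewise assumptions.

```python
def greedy_dominating_set(adj):
    """Greedy heuristic; adj maps each vertex to the set of its neighbors.

    Repeatedly picks the vertex whose closed neighborhood contains the most
    uncovered vertices. Ties go to the first such vertex in iteration order.
    """
    uncovered = set(adj)
    D = []
    while uncovered:
        best = max(adj, key=lambda v: len((adj[v] | {v}) & uncovered))
        D.append(best)
        uncovered -= adj[best] | {best}
    return D


# Hypothetical example: a star with center 0 and leaves 1..4.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(greedy_dominating_set(star))
```

On the worst-case instances of Examples 1.1 and 1.2, the result depends on how ties are broken, which is why the tie-breaking rule is stated explicitly in the sketch.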

Greedy is simple and fast, since it can be implemented in linear time. Its performance ratio in the worst case is logarithmic. Linear programming works in polynomial time but is more time consuming than greedy. For sparse graphs, the recent linear programming rounding methods in [3, 16] have a constant performance ratio, but there has not been any experimental study of their performance.

In this paper, we implement three types of algorithms and compare and contrast their performance in practice. These algorithms are the greedy algorithm, the LP rounding algorithms, and a hybrid algorithm that combines the greedy and LP approaches. The hybrid algorithm first solves the problem using the greedy algorithm and finds a dominating set $D$, $|D| = d$. It then takes a portion of the vertices in $D$, forces their weights to be 1 in linear program LP1, solves the resulting (partial) linear program, and then properly rounds the solution to the partial LP. Finally, it returns the rounded solution plus the portion of the greedy solution that was forced into LP1.

We used a laptop with modest computational power, an 8th generation Intel i5 (1.6 GHz) with 8 GB RAM, to perform the experiments. We implemented the $O(n + m)$ time version of the greedy algorithm in C++. We used IBM Decision Optimization CPLEX Modeling (DOCPLEX) for Python to solve the LP relaxation of the problem. Python and DOCPLEX were used to implement the LP rounding and hybrid algorithms.

The graph generator at https://github.com/joklawitter/GraphGenerators was used to create the planar graphs, trees, k-planar graphs (graphs embedded in the plane with at most k crossings per edge), and k-trees (graphs with tree width k with the largest number of edges) with up to 20,000 vertices. We created the k-Queens graphs, hypercubes (up to 12 dimensions), and graph implementations of the cases described in Examples 1.1 and 1.2 ourselves. We also used publicly available Google+ and Pokec social-network graphs, as well as real-world DIMACS graphs with up to more than 7,700,000 vertices.

https://snap.stanford.edu/data/com-Youtube.html [8]

In our experiments, all algorithms perform better than anticipated in their theoretical analysis, particularly with respect to the performance ratios (measured with respect to the LP objective lower bound). However, each may have advantages over the others for specific data sets. For instance, LP rounding normally outperforms the other algorithms on real-world graphs. On a graph with 400,000+ vertices, LP rounding took less than 15 seconds of CPU time to generate a solution with performance ratio 1.011, while the greedy and hybrid algorithms generated solutions of performance ratio 1.12 in similar time. For synthetic graphs (generated k-trees, k-planar) the hybrid algorithm normally outperforms the others, whereas for hypercubes and k-Queens graphs, greedy outperforms the rest. In particular, on the 12-dimensional hypercube, greedy finds a solution with performance ratio 1.7 in 0.01 seconds. On the other hand, the LP rounding and hybrid algorithms produce solutions with performance ratios 13 and 3.3 using 7.5 and 0.08 seconds of CPU time, respectively. It is notable that greedy gives optimal results in some cases where the domination number is known. Specifically, the greedy algorithm produces an optimal solution on hypercubes with dimensions $d = 2^k - 1$ […] $L^*$ cannot be computed, and hence the performance ratio of the hybrid algorithm cannot be determined. We resolved this problem by decomposing LP1 into two smaller linear programs so that each of them has an objective value not exceeding $L^*$, and used the maximum objective value of the two smaller LPs, instead of $L^*$, to measure the performance ratio of the hybrid algorithm.

Sections 3, 4, and 5 contain results for planar, k-planar, and k-tree graphs; hypercubes and k-Queens graphs; and real-world graphs, respectively.

The following algorithm is due to Bansal and Umboh [3].

Algorithm $A_1$ ([3])
Solve LP1, and let $H$ be the set of all vertices that have weight at least $1/(3a(G))$, where $a(G)$ is the arboricity of graph $G$. Let $U$ be the set of all vertices not adjacent to any vertex in $H$, and return $H \cup U$.

Dvořák [15, 16] studied the $d$-domination problem, in which a vertex dominates all vertices at distance at most $d$ from it, and its combinatorial dual, the $2d$-independent set problem [1]. In [16] he employed the LP rounding approach of Bansal and Umboh as a part of his framework and consequently, for $d = 1$, he improved the approximation ratio of Algorithm $A_1$ by showing that the algorithm $A_2$ given below provides a $(2a(G) + 1)$-approximation.

Algorithm $A_2$ ([16])
Solve LP1, and let $H$ be the set of all vertices that have weight at least $1/(2a(G) + 1)$, where $a(G)$ is the arboricity of graph $G$. Let $U$ be the set of all vertices that are not adjacent to any vertex of $H$, and return $H \cup U$.

Remark 2.1.

Graph $G$ in Example 1.1 is planar, so $a(G) \leq 3$. Thus, algorithms $A_1$ and $A_2$ have a worst-case performance ratio of nine and seven, respectively, whereas greedy exhibits a worst-case $O(\log(n))$ performance ratio. Throughout our experiments, the rounding algorithms returned an optimal solution of size two for both examples, whereas greedy returned a set of size three for Example 1.1. Furthermore, in Example 1.2, it can be verified that $a(G) \geq (p + 2)/2$ for graph $G$, and hence in theory the worst-case performance ratios of the rounding algorithms are not constant either. Interestingly enough, in our experiments, $L^*$ was always two for graphs of the type in Example 1.2, and the LP rounding algorithms also always found a solution of size two, which is the optimal value. Thus the performance ratio was always one, much smaller than the predicted worst case.

Next, we provide a description of the decomposition approach for approximating LP1 and our hybrid algorithm. Recall that a separation in $G = (V, E)$ is a partition $A \cup B \cup C$ of $V$ so that no vertex of $A$ is adjacent to any vertex of $C$. In this case $B$ is called a vertex separator in $G$. Let $X = \{x_v \mid v \in V\}$ be a feasible solution to LP1, and let $V' \subseteq V$. Then $X(V')$ denotes $\sum_{v \in V'} x_v$.

Lemma 2.1.

Let $A \cup B \cup C$ be a separation in $G = (V, E)$ and consider the following linear programs:

LP2:

Minimize $M = \sum_{v \in A \cup B} x_v$
Subject to $\sum_{u \in N[v]} x_u \geq 1, \ \forall v \in A$
$0 \leq x_v \leq 1, \ \forall v \in A \cup B$

and

LP3:

Minimize $N = \sum_{v \in C \cup B} x_v$
Subject to $\sum_{u \in N[v]} x_u \geq 1, \ \forall v \in C$
$0 \leq x_v \leq 1, \ \forall v \in B \cup C$

Then $\max\{M^*, N^*\} \leq L^*$.

Proof.

Let $X = \{x_v \mid v \in V\}$ be an optimal solution to LP1. Note that the restrictions of $X$ to $A \cup B$ and $C \cup B$ give feasible solutions for LP2 and LP3 of values $X(A \cup B)$ and $X(B \cup C)$, respectively. Since both values are at most $X(V) = L^*$, the claimed lower bound on $L^*$ follows. $\Box$

Note that in LP2 and LP3 the constraints are not written for all variables, and the rounding method in [3] may not be applied directly.

Theorem 2.1.

Let $G = (V, E)$, let $A \subset V$, let $B = E(A)$ and let $C = V - (A \cup B)$. Let $X$ be an optimal solution for LP3, and let $X(C)$ denote the sum of the weights assigned to all vertices in $C$. Then there is a dominating set in $G$ of size at most $|A| + 3a(G)X(C) \leq |A| + 3a(G)N^*$.

Proof.

Let $H$ be the set of all vertices $v$ in $C$ with $x(v) \geq 1/(3a)$, and let $U = C - (H \cup E(H))$. Now apply the method in [3] to $C$ to obtain a rounded solution, or a dominating set $D$, of at most $|U| + |H| \leq 3a(G)X(C)$ vertices in $C$. Finally, note that $A \cup D$ is a dominating set in $G$ with cardinality at most $|A| + 3a(G)X(C) \leq |A| + 3a(G)N^*$. $\Box$

Algorithm H (Hybrid Algorithm)
Apply the greedy algorithm to $G$ to obtain a dominating set $D = \{x_1, x_2, \ldots, x_d\}$, and let $S = \{x_1, x_2, \ldots, x_{\alpha d}\}$ be the first $\alpha d$ vertices in $D$. Now solve the following linear program on the induced subgraph of $G$ with the vertex set $V - S$.

Minimize $J = \sum_{v \in V - S} x_v$    (1)
Subject to $\sum_{u \in N[v]} x_u \geq 1, \ \forall v \in V - (S \cup N[S])$    (2)
$0 \leq x_v \leq 1, \ \forall v \in V - S$    (3)

Next, let $A = S$, $B = E(S)$ and $C = V - (A \cup B)$, apply the rounding scheme in algorithms $A_1$ or $A_2$ to $C$, let $H$ and $U$ be the corresponding sets, and output the set $S \cup H \cup U$.

Remark 2.2.

Note that by Theorem 2.1 Algorithm H can be implemented in polynomial time. Furthermore, $|S \cup H \cup U| \leq \alpha d + 3a(G)N^* \leq (\alpha(\ln(\Delta) + 1) + 3a(G)) \, \gamma(G)$, and thus Algorithm H has a bounded performance ratio.

In this section, we compare the performance ratios of Greedy, $A_1$, $A_2$, $A_1$ Hybrid, and $A_2$ Hybrid on planar graphs, k-planar graphs, and k-trees. In Tables 2 and 3, we present the performance of the algorithms on k-trees where $k = \lfloor |V|$ […] $\rfloor$ and k-planar graphs where $k = \lfloor \ln(|V|) \rfloor$, respectively. These graphs are dense. We also present the algorithms' performance on sparse k-trees and sparse k-planar graphs in Tables 4 and 5. The planar graphs, k-trees, and k-planar graphs were all made using the graph generator described in Section 1.5.

In most cases, the $A_1$ and $A_2$ variants of the hybrid algorithm outperformed the others, producing the lowest performance ratio relative to the LP lower bound $L^*$. Greedy performs close to hybrid and outperforms it for the larger dense k-trees and a few of the k-planar graphs. The LP rounding algorithms performed the worst across the board. All algorithms were able to compute dominating sets in less than 2 seconds across the different types of graphs and their range of sizes.

The arboricity of each of the planar graphs is at most 3. For k-trees, we use $\lceil k - (k/2)(k - 1)/(N - 1) \rceil$ for arboricity. For k-planar graphs, we use the upper bound of $\lceil$ […] $\sqrt{k} \rceil$ on arboricity.

Table 1: Results for Planar Graphs
n, m    $L^*$    Greedy/$L^*$    $A_1$/$L^*$    $A_1$ Hybrid/$L^*$    $A_2$/$L^*$    $A_2$ Hybrid/$L^*$

Table 2: Results for k-Trees where $k = \lfloor |V|$ […] $\rfloor$
n, m    $L^*$    Greedy/$L^*$    $A_1$/$L^*$    $A_1$ Hybrid/$L^*$    $A_2$/$L^*$    $A_2$ Hybrid/$L^*$

Table 3: Results for k-Planar Graphs where $k = \lfloor \ln(|V|) \rfloor$
n, m    $L^*$    Greedy/$L^*$    $A_1$/$L^*$    $A_1$ Hybrid/$L^*$    $A_2$/$L^*$    $A_2$ Hybrid/$L^*$

Table 4: Results for k-Trees where $k =$ […]
n, m    $L^*$    Greedy/$L^*$    $A_1$/$L^*$    $A_1$ Hybrid/$L^*$    $A_2$/$L^*$    $A_2$ Hybrid/$L^*$

Table 5: Results for k-Planar Graphs where $k =$ […]
n, m    $L^*$    Greedy/$L^*$    $A_1$/$L^*$    $A_1$ Hybrid/$L^*$    $A_2$/$L^*$    $A_2$ Hybrid/$L^*$

In this section, we present the performance of Greedy, $A_1$, $A_2$, $A_1$ Hybrid, and $A_2$ Hybrid on hypercubes of 5 to 12 dimensions and k-Queens graphs.

Table 6 compares the performance ratios of the algorithms on hypercubes. We use the arboricity $a = \lfloor d/2 + 1 \rfloor$ for hypercubes in LP rounding and hybrid [21]. For k-Queens graphs, arboricity is unknown, so we use the upper bound $3(k -$ […]$)$, where $k$ is the length of the chessboard.

For both hypercubes and k-Queens graphs, Greedy performs the best, followed by $A_1$ Hybrid and $A_2$ Hybrid. $A_1$ and $A_2$ LP rounding perform the worst by far. This is not surprising, as LP rounding approaches are known in general to perform worse on dense graphs than on sparse graphs. Solutions were computed in under 8 seconds for all graphs and algorithms.

Table 6: Results for Hypercubes
n, m        $L^*$     Greedy/$L^*$   $A_1$/$L^*$   $A_1$ Hybrid/$L^*$   $A_2$/$L^*$   $A_2$ Hybrid/$L^*$
5, 80       5.33      1.50           3.00          1.50                 3.00          1.50
6, 192      9.14      1.75           7.00          1.75                 7.00          1.75
7, 448      16.00     1.00           1.00          1.00                 1.00          1.00
8, 1024     28.44     1.13           9.00          1.13                 9.00          1.13
9, 2304     51.20     1.25           7.07          2.99                 7.07          2.99
10, 5120    93.09     1.38           11.00         2.70                 11.00         2.70
11, 11264   170.67    1.50           6.59          2.85                 6.59          2.85
12, 24576   315.08    1.63           13.00         3.14                 13.00         3.14
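The $L^*$ column of Table 6 agrees with a closed form. In a $d$-regular graph on $n$ vertices, giving every vertex weight $1/(d+1)$ is feasible both for LP1 and for its dual packing program, so $L^* = n/(d+1)$. This standard duality argument is not stated in the paper and is added here only as a cross-check. The sketch below, with the hypothetical helper `hypercube_lp_bound`, recomputes the edge counts and $L^*$ values from the dimension in the first column of the table.

```python
def hypercube_lp_bound(d):
    """Return (n, m, lp) for the d-dimensional hypercube.

    n = 2**d vertices, m = d * 2**(d - 1) edges, and lp = n / (d + 1),
    the LP1 optimum of a d-regular graph under the argument in the lead-in.
    """
    n = 2 ** d
    m = d * 2 ** (d - 1)
    return n, m, n / (d + 1)


for d in range(5, 13):
    n, m, lp = hypercube_lp_bound(d)
    print(f"d={d:2d}  n={n:4d}  m={m:5d}  L*={lp:.2f}")
```

For $d = 7$ the bound is exactly 16, which matches the performance ratio of 1.00 reported for every algorithm in that row.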

Table 7: Results for k-Queens Graphs
n, m    $L^*$    Greedy/$L^*$    $A_1$/$L^*$    $A_1$ Hybrid/$L^*$    $A_2$/$L^*$    $A_2$ Hybrid/$L^*$

In this section, we present the performance of LP rounding, greedy, and hybrid on the real-world social network graphs from Google+ [9], Pokec [9], and DIMACS [2]. Each of these graphs is sparse, but their arboricity is unknown. Since arboricity is unknown, we experiment with the threshold applied during LP rounding and hybrid, starting with $1/(3a)$, where $a = \lceil |E| / (|V| - 1) \rceil$ is a lower bound on arboricity. We call LP rounding with this threshold Algorithm $A_1$. Similarly, Algorithm $A_2$ has threshold $1/(2a + 1)$. Through experimentation, the best threshold which we found was $2/a$; the resulting algorithm is called $A_3$.

In Table 8, we compare the solution size of $A_1$, $A_2$, and $A_3$, along with their hybrid analogs and greedy, to the LP lower bound $L^*$ on the Google+ graphs. Table 9 compares the same algorithms on the Pokec graphs. In Table 10, we compare the performance ratio to the LP lower bound for these algorithms on 3 social network graphs from DIMACS. In Tables 8, 9 and 10, LP rounding performs better than the greedy and hybrid approaches, with greedy being the worst out of the algorithms tested. Out of the LP rounding approaches, $A_3$ performs the best.

Table 8: Results for Google+ Graphs
n, m            $L^*$   Greedy   $A_1$   $A_2$   $A_3$   $A_1$ Hybrid   $A_2$ Hybrid   $A_3$ Hybrid
500, 1006       42      42       42      42      42      42             42             42
2000, 5343      170     176      170     170     170     176            176            176
10000, 33954    860     900      864     864     864     893            893            893
20000, 81352    1715    1817     1730    1730    1716    1800           1800           1800
50000, 231583   4565    4849     4651    4607    4585    4790           4790           4790
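The LP rounding algorithms compared in this section differ only in the threshold they apply. The rounding step can be sketched as follows, assuming a fractional solution of LP1 is already available (solving the LP itself, done with DOCPLEX in the paper, is outside this sketch). The function name, the dictionary representation, and the example weights are illustrative assumptions.

```python
def round_lp_solution(adj, x, threshold):
    """Threshold rounding of a fractional LP1 solution.

    adj maps each vertex to its set of neighbors; x maps each vertex to a
    weight in [0, 1]. H holds the vertices with weight >= threshold, and U
    holds the vertices that are neither in H nor adjacent to a vertex of H.
    The returned set H | U is always a dominating set.
    """
    H = {v for v in adj if x[v] >= threshold}
    U = {v for v in adj if v not in H and not (adj[v] & H)}
    return H | U


# Hypothetical example: the path 0 - 1 - 2 - 3 with a feasible fractional solution.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
weights = {0: 0.25, 1: 0.75, 2: 0.75, 3: 0.25}
print(round_lp_solution(path, weights, 1 / 3))  # H = {1, 2}, U is empty
print(round_lp_solution(path, weights, 0.8))    # H is empty, so U is every vertex
```

The two calls show why the choice of threshold matters: a threshold that is too high leaves $H$ empty and forces every vertex into $U$.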

Table 9: Results for Pokec Graphs
n, m             $L^*$   Greedy   $A_1$   $A_2$   $A_3$   $A_1$ Hybrid   $A_2$ Hybrid   $A_3$ Hybrid
500, 993         16      16       16      16      16      16             16             16
2000, 5893       75      75       75      75      75      75             75             75
10000, 44745     413     413      413     413     413     413            413            413
20000, 102826    921     928      921     921     921     923            923            923
50000, 281726    2706    2773     2712    2712    2712    2757           2757           2743
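The arboricity lower bound $a = \lceil |E| / (|V| - 1) \rceil$ used in this section is cheap to evaluate from the $n, m$ column alone. A small sketch, with the hypothetical helper `arboricity_lower_bound` and the graph sizes copied from the Pokec table:

```python
def arboricity_lower_bound(n, m):
    """Return ceil(m / (n - 1)) using integer arithmetic; requires n >= 2."""
    return -(-m // (n - 1))


# (n, m) pairs of the Pokec graphs in Table 9.
pokec_sizes = [(500, 993), (2000, 5893), (10000, 44745), (20000, 102826), (50000, 281726)]
for n, m in pokec_sizes:
    print(n, m, arboricity_lower_bound(n, m))
```

Integer floor division is used instead of floating-point `ceil` so that the result stays exact for large edge counts.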

Compared to the best results from [9], which used a randomized local search algorithm run for up to one hour, the LP rounding approaches generally produced a solution that is smaller or as good, using significantly less run time: less than 0.5 seconds for each graph.

Table 10: Results for DIMACS Graphs
Graph              n, m               $L^*$      Greedy/$L^*$   $A_1$/$L^*$   $A_1$ Hybrid/$L^*$   $A_2$/$L^*$   $A_2$ Hybrid/$L^*$   $A_3$/$L^*$   $A_3$ Hybrid/$L^*$
coAuthorsDBLP      299067, 977676     43969.00   1.02           1.00          1.02                 1.00          1.02                 1.00          1.02
coPapersCiteseer   434102, 16036720   26040.92   1.12           1.01          1.12                 1.01          1.12                 1.01          1.12
citationCiteseer   268495, 1156647    43318.85   1.04           1.03          1.04                 1.03          1.04                 1.02          1.04

Table 11 shows an example of a graph with 7 million+ vertices where $A_1$ and $A_2$ cannot be run as a result of the large size. For the hybrid approaches, using the first $d/$[…] $d$ is the size of the greedy solution, resulted in the use of too much memory. We instead used the first $3d/$[…] $A_1$ Hybrid and $A_2$ Hybrid performed better than greedy. Greedy took 14 seconds to produce a solution while hybrid took 107 seconds. $\max\{M^*, N^*\}$ is provided as a lower bound on $L^*$, and therefore on $\gamma(G)$.

Table 11: Results for the DIMACS Great Britain Street Network
n, m               $M^*$     $N^*$     $\max\{M^*, N^*\}$   Greedy    $A_1$ Hybrid   $A_2$ Hybrid
7733822, 8156517   1314133   1357189   1357189              2732935   2724608        2724608
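Because $\max\{M^*, N^*\} \leq L^* \leq \gamma(G)$, dividing a solution size by $\max\{M^*, N^*\}$ gives an upper bound on its true performance ratio. The arithmetic for the street-network row can be sketched as follows; the helper name `ratio_upper_bound` is an assumption, and the numbers are copied from Table 11.

```python
def ratio_upper_bound(solution_size, m_star, n_star):
    """Upper bound on the performance ratio, using max{M*, N*} in place of L*."""
    return solution_size / max(m_star, n_star)


m_star, n_star = 1314133, 1357189
for name, size in [("Greedy", 2732935), ("Hybrid", 2724608)]:
    print(f"{name}: {ratio_upper_bound(size, m_star, n_star):.4f}")
```

Both values come out slightly above 2, so on this graph the decomposition bound certifies that each solution is within a factor of roughly two of optimal.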