Approximation Algorithm for N-distance Minimal Vertex Cover Problem
Tarun Yadav
Scientist, Scientific Analysis Group, Defence R & D Organisation, INDIA
Email: [email protected]
Koustav Sadhukhan ∗, Rao Arvind Mallari †
Scientist, Defence Research and Development Organisation, INDIA
Email: ∗ [email protected], † [email protected]

Abstract: The evolution of large-scale networks demands efficient ways of communicating within them. One way to propagate information in a network is to find a vertex cover. In this paper we describe a variant of the vertex cover problem, which we name the N-distance Minimal Vertex Cover (N-MVC) Problem, to optimize information propagation throughout the network. A minimum subset of vertices of an unweighted and undirected graph G = (V, E) is called an N-MVC if ∀ v ∈ V, v is at distance ≤ N from at least one of the vertices in the N-MVC. In this paper, the problem is defined and formulated, and an approximation algorithm is proposed, with a discussion of its correctness and upper bound.

Index Terms: Minimal Vertex Cover, Approximation, N-Trail, N-distance, Maximal Matching, Graph Reduction, Extended Graph
I. INTRODUCTION
Networks such as the Internet, the human body, malicious botnets and mobile networks are part of everyday life. Networks in the physical world are mapped to graphs in the computer world. A graph G(V, E) is a set of nodes called vertices (V) and connections between these nodes called edges (E). The essence of a network is communication between nodes, and in large networks it is a big challenge to perform this task efficiently. There are two primary objectives: first, information should reach each node of the graph; second, this should be done efficiently, with minimum resources and under the given constraints. Given unlimited resources and no constraints, information can propagate to each node, but this is not the case in real-life scenarios. In practice, there is a need to select some of the nodes which can propagate the given information to the other nodes. Examples of such scenarios can be found in social networks [7], P2P botnets and similar densely connected graphs of various networks. As a specific example, consider the case where one has to select the minimal set of influential nodes in a social network such that some critical information is propagated to all nodes in the network in a finite number of hops.

One solution to this problem is to determine an approximate Minimal Vertex Cover (MVC) of the network and use the nodes in the MVC for propagating information. Due to the property of MVC, the information will be propagated to all nodes in the network in a single hop. But in practice, the cardinality of the MVC in huge networks will be very large. Hence, the resources used to propagate information using a single hop from the vertices of the MVC will be very high. A smaller set of nodes can be used to propagate the information in multiple hops, to reduce the number of resources used. The challenge is to find a set of nodes given a constraint on the maximum number of hops (N) within which information has to be propagated to all nodes in the network. We propose a solution to estimate this smaller set of nodes and term it the approximated N-distance minimum vertex cover (N-MVC), where N is the maximum number of hops within which information has to reach all nodes in the network. If the network is static and does not change over time, then once the N-MVC is computed only its nodes require propagation capability; all other nodes may behave as sink nodes (which do not propagate the information). In dynamic networks, where nodes or connections change over time, the N-MVC needs to be recomputed as soon as any change happens. We can say N is the capacity of nodes to propagate the information. We consider the homogeneous situation where N is the same for all nodes. In this paper, we discuss the problem statement, present the approximation algorithm and its correctness, and discuss an upper bound on the solution.

A similar problem is the k-path Vertex Cover [2], which has been discussed thoroughly [4] [5]. A k-path Vertex Cover ensures the existence of special nodes (having some defined properties) in any path of k vertices in the network. Variants of the vertex cover problem are part of research in domains related to secure communication in sensor networks [6], topology analysis of malicious bot networks, and security and resource optimization in various types of network.

This paper is organized as follows. Section II defines the problem statement, Section III describes the algorithm in detail, Section IV proves the correctness of the algorithm by contradiction, and Section V discusses the upper bound for the solution with reference to N. The paper ends with a summary and concluding remarks in Section VI.

II. N-DISTANCE MINIMAL VERTEX COVER PROBLEM
Given an unweighted and undirected graph G(V, E), we define the following.

Vertex Cover (VC): A vertex cover of a graph G(V, E) is a subset of vertices S ⊆ V(G) such that every edge has at least one endpoint in S, that is, (u, v) ∈ E(G) ⟹ u ∈ S ∨ v ∈ S. Alternatively, a vertex cover of a given graph G(V, E) is a set S of vertices such that any vertex of the graph is either in S or at most 1 hop distance (or edge) away from at least one of the vertices in S.

Fig. 1: An example graph to understand N-distance vertex cover

Minimum Vertex Cover Problem (MVCP) [1]: For a graph G, the Minimum Vertex Cover Problem (MVCP) is the optimization problem of finding the vertex cover of G with the least possible cardinality.

N-distance Vertex Cover (N-VC): An N-distance Vertex Cover (N-VC) for a given graph G(V, E) is a subset of vertices S ⊆ V(G) such that every vertex is either in S or at a distance of at most N from at least one of the vertices in the vertex cover:
  ∀ v ∈ V(G), either v ∈ S or ∃ u ∈ S such that d(v, u) ≤ N,
where d(u, v) denotes the geodesic distance between vertices u and v.

N-distance Minimum Vertex Cover Problem (N-MVC): The N-distance Minimum Vertex Cover (N-MVC) Problem for a graph G is the optimization problem of finding the N-distance vertex cover with the least possible cardinality.
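The N-VC condition above can be checked mechanically. The following is a minimal sketch, not part of the original paper: it assumes the graph is given as an adjacency dictionary, and the names `is_n_distance_cover` and `brute_force_n_mvc` are hypothetical helpers introduced only for illustration. The check runs a breadth-first search from all vertices of S at once and stops expanding at depth N; the brute-force search tries subsets in increasing size and is only practical for very small graphs.

```python
from collections import deque
from itertools import combinations


def is_n_distance_cover(adj, cover, n):
    """Return True if every vertex of adj is within n hops of some vertex in cover.

    adj   -- dict mapping each vertex to an iterable of its neighbours (assumed format)
    cover -- iterable of candidate vertices S
    n     -- maximum allowed hop distance N
    """
    dist = {v: 0 for v in cover}      # multi-source BFS: all of S starts at distance 0
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        if dist[u] == n:              # do not expand beyond n hops
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return all(v in dist for v in adj)


def brute_force_n_mvc(adj, n):
    """Exhaustively find a smallest N-distance vertex cover (exponential time)."""
    vertices = list(adj)
    for size in range(len(vertices) + 1):
        for candidate in combinations(vertices, size):
            if is_n_distance_cover(adj, candidate, n):
                return set(candidate)
    return set(vertices)
```

On a path with five vertices 0-1-2-3-4, for instance, the middle vertex alone satisfies the condition for N = 2 but not for N = 1, which is the kind of saving in cover size that motivates the multi-hop formulation.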
Explanation of Problem Statement: An N-distance Minimal Vertex Cover (N-MVC) is a minimum set of nodes (S) such that every node of the network either belongs to the N-MVC or is at most N hops away from at least one node in S. In other words, in any arbitrary network, information (I) given to the nodes in the N-MVC is guaranteed to reach each and every node in the network within N hops. This is the problem that the authors would like to solve in this paper.

The problem being formulated and solved is specifically for unweighted and undirected graphs. The distance between two nodes u, v in a graph G is denoted as d(u, v) and defined as the number of edges on the geodesic (shortest path) connecting them, if it exists. If no geodesic exists, by convention the distance is taken to be infinite. Since we are dealing with undirected graphs, d(u, v) = d(v, u).

Example: In the example graph of Fig. 1, a 3-distance minimum vertex cover is S = { v , v }. It is easily verifiable that ∀ v ∈ V(G), ∃ u ∈ S such that d(u, v) ≤ 3. Similarly, the reader can verify that in the same graph a 2-distance minimum vertex cover is S = { v , v }.

In the next section, we present an approximation algorithm for N-MVCP because N-MVCP is an NP-Complete problem. As we will discuss, there is a reduction from N-distance to 1-distance, which is the original MVCP, and this implies that N-MVCP is NP-Complete.

III. APPROXIMATION ALGORITHM TO FIND N-DISTANCE MINIMAL VERTEX COVER

Before discussing the approximation algorithm, we revisit the definitions and concepts of graph theory being used.
Walk: Given a graph G = (V, E), a walk W(v_0, v_n) joining v_0 and v_n is defined as an alternating sequence of vertices and edges of G,
  W(v_0, v_n) = v_0, e_1, v_1, e_2, v_2, ···, e_n, v_n
such that e_i = (v_{i−1}, v_i), 1 ≤ i ≤ n. The length of a walk W, denoted by ℓ(W), is the number of edges in W.

Trail: A walk W(v_0, v_n) is called a trail if all edges in the walk are different.

N-Trail: A walk W(v_0, v_N) is called an N-Trail if there are N edges in the walk and all are different.

Degree(v, E): The number of edges of E incident to the vertex v. In this paper Degree(v, E) is denoted as deg(v, E).

For a given G(V, E) and value of N (where G has at least one N-Trail as described in Algorithm 1 and N ≥ 2, because for N = 1 the algorithm reduces to the original vertex cover problem), the approximation algorithm is described as Algorithm 3 in subsection III-C, which calls Algorithm 1 (III-A) and Algorithm 2 (III-B) as subalgorithms.

A. Finding N-Trail (Trail of length N)

Algorithm 1 Algorithm to Find N-Trail
Input: G(V, E) where V = {v_1, v_2, ...}, E = {e_1, e_2, ...}
Output: Trail E_N such that ℓ(E_N) = N
  E_N ← φ
  E_N ← E_N ∪ e_i where e_i = {v_j, v_k} s.t. e_i ∈ E & deg(v_k, E) ≥ 2
  repeat
    E' = φ
    E' ← {e_i s.t. e_i = (v_k, v_m) & e_i ∈ E}
    E_N ← E_N ∪ e_j where e_j = {v_k, v_m} s.t. e_j ∈ E' & deg(v_m, E) ≥ 2 if |E_N| < N
    v_k ← v_m
  until |E_N| = N
  return E_N
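As a rough illustration of what Algorithm 1 is asked to produce, the sketch below searches for a trail of exactly N distinct edges. It is not the authors' procedure: the paper extends the trail greedily through vertices of degree at least 2, whereas this sketch swaps in a plain backtracking search so that it cannot get stuck on a bad early choice. The adjacency-dictionary input and the name `find_n_trail` are assumptions made for the example, and the graph is assumed to be simple (no self-loops or parallel edges).

```python
def find_n_trail(adj, n):
    """Return a list of n distinct edges (u, w) forming a trail, or None if none exists.

    adj -- dict mapping each vertex to a list of neighbours (undirected, simple graph)
    n   -- required number of edges in the trail
    """
    def extend(v, used, trail):
        # The trail currently ends at vertex v; stop once it has n edges.
        if len(trail) == n:
            return list(trail)
        for w in adj[v]:
            edge = frozenset((v, w))      # undirected edge, orientation-free key
            if edge in used:
                continue                  # a trail may not reuse an edge
            used.add(edge)
            trail.append((v, w))
            found = extend(w, used, trail)
            if found is not None:
                return found
            used.remove(edge)             # backtrack and try another neighbour
            trail.pop()
        return None

    for start in adj:
        result = extend(start, set(), [])
        if result is not None:
            return result
    return None
```

Because only edges must be distinct, the returned trail may revisit a vertex, which matches the Trail definition above. The endpoints of the trail are the first vertex of the first edge and the last vertex of the last edge.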
1) Compute the degree of each vertex v_i ∈ V and store it.
2) Randomly pick an edge e_i = {v_j, v_k} s.t. the degree of v_k is at least 2. Add e_i to E_N.
3) Pick the vertex v_k, which has degree at least 2. Find all edges E' s.t. each edge in E' has v_k as one endpoint.
4) One endpoint of each e_j ∈ E' is v_k; call the other endpoint v_m. Pick any edge e_j ∈ E' s.t. degree(v_m) ≥ 2 and add it to E_N. Repeat step 3 with v_k ← v_m until the size of E_N is N.
5) We define the endpoints of E_N as the vertices {v_x, v_y} s.t. deg(v_x) = 1 and deg(v_y) = 1, considering only edges e_i ∈ E_N.

Algorithm 2 Algorithm for Graph Reduction
Input: G(V, E) and distance(e) = 1 ∀ e ∈ E
Output: Reduced Graph G'(V, E*)
  G'(V, E*) ← G(V, E)
  loop:
  while N ≥ 2 do
    if N-Trail not exists in G' then
      N ← N − 1
      go to loop
    end if
    V' ← φ, V'' ← φ, E' ← φ, E'' ← φ, E''' ← φ, E'''' ← φ
    Pick an N-Trail E_N from Algorithm 1 with endpoints(E_N) = (v_0, v_N) & edges(E_N) ∈ E
    V' ← {v_i} s.t. either d(v_0, v_i) = N or (2 …

Initialization: Initialize G' ← G, compute the degree of each vertex of G' and store it. Assign each edge of G'(V, E) a distance of 1 unit.
1) Find a Trail of N connected edges {e_1, e_2, ..., e_N} (using the algorithm discussed in subsection III-A) connecting N + 1 vertices {v_0, v_1, ..., v_N} from G' s.t. e_i = {v_{i−1}, v_i}. We define the endpoints of the N connected edges as {v_0, v_N}.
2) Find the sets of vertices V' and V'', which are either at distance N (where N ≥ 2) from v_0 or v_N, or at distance < N from v_0 or v_N and having degree one, s.t. each vertex from V' or V'' is connected to v_0 or v_N respectively by a trail.
3) For each vertex v_i from V', add an edge between v_0 and v_i and mark the edges connecting v_0 and v_i for deletion. For each vertex v_j from V'', add an edge between v_j and v_N and mark the edges connecting v_j and v_N for deletion. Now delete all the edges marked for deletion, recompute the degrees, and update the degree table w.r.t. edges with a distance assigned (e ∈ E).
4) Repeat steps 1, 2 and 3 with a new Trail of length N found using Algorithm 1 s.t. edges(Trail) ∈ E. Note: newly added edges (not assigned any distance) of G' are not used to form an N-Trail in Algorithm 1.
5) If no N-Trail is found, go to step 1 with N ← N − 1, as long as N ≥ 2.

C. Approximation algorithm for N-distance Minimal Vertex Cover Problem (NMVCP)

The approximation algorithm applies graph reduction and then the Approx-Vertex-Cover [3] algorithm.
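The second stage named above, Approx-Vertex-Cover, is the standard maximal-matching heuristic for vertex cover. A minimal sketch follows, assuming the graph is supplied as a list of edge pairs; the function name `approx_vertex_cover` is illustrative. Instead of picking an arbitrary edge and physically deleting incident edges, it scans the edges in the given order and skips any edge that already has an endpoint in the cover, which has the same effect.

```python
def approx_vertex_cover(edges):
    """Greedy 2-approximation for vertex cover via a maximal matching.

    edges -- iterable of (u, v) pairs of an undirected graph
    Returns a set of vertices touching every edge.
    """
    cover = set()
    for u, v in edges:
        # An edge with an endpoint already in the cover counts as "deleted".
        if u not in cover and v not in cover:
            cover.update((u, v))      # take both endpoints of the picked edge
    return cover
```

The picked edges form a maximal matching M, so the returned cover has exactly 2|M| vertices. That is the quantity the upper-bound discussion in Section V works with.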
Algorithm 3 Approximation Algorithm for NMVCP
Input: G(V, E) s.t. V = {v_1, v_2, ...}, E = {e_1, e_2, ...}
Output: Approximated Solution of NMVCP for G(V, E)
  G'(V, E') ← Graph Reduction Algorithm 2 with input G(V, E)
  NMVC ← Approx-Vertex-Cover [3] Algorithm 4 with input G'(V, E')
  return NMVC

Algorithm 4 Approx-Vertex-Cover(G) [3]
Input: G(V, E)
Output: Approx-Vertex-Cover solution AVC
  AVC ← φ
  while E ≠ φ do
    pick any (u, v) ∈ E
    AVC ← AVC ∪ {u, v}
    delete all edges incident to either u or v
  end while
  return AVC

D. Example

As described in subsection III-C (Algorithm 3), the first step in finding an N-distance vertex cover is to apply the graph reduction algorithm to the given graph. If we consider the graph shown in Fig. 1 as input, then step 1 of Algorithm 3, graph reduction (Algorithm 2) with N = 3, proceeds as follows:

Initialization: G' ← G; each edge is assigned a distance of 1 unit and the degree of each vertex is computed.
1) From Algorithm 1 we get an N(=3)-Trail as described in subsection III-A. Suppose we get the set of three edges { e v − v , e v − v , e v − v } as one of the N-Trails. The endpoints of this N-Trail are { v , v }.
2) a) Vertices at distance N (= 3) from v are { v , v , v , v }.
      Vertices at distance < N (= 3) but ≥ 2 from v and with degree 1 are { v }.
      Therefore V' = { v , v , v , v , v }.
   b) Vertices at distance N (= 3) from v are { v , v , v , v , v }.
      Vertices at distance < N (= 3) but ≥ 2 from v and with degree 1 are { φ }.
      Therefore V'' = { v , v , v , v , v }.
3) a) Add an edge e v − v i ∀ v i ∈ V' and mark the connecting edges (of the corresponding Trail) for deletion, and add an edge e v j − v ∀ v j ∈ V'' and mark the connecting edges for deletion, as shown in Fig. 2.
      E' ← { e v − v , e v − v , e v − v , e v − v , e v − v }
      E'' ← { e v − v , e v − v , e v − v , e v − v , e v − v }
      E''' ← { e v − v , e v − v , e v − v , e v − v , e v − v , e v − v , e v − v , e v − v }
      E'''' ← { e v − v , e v − v }
   b) After adding the edges (E' & E''), remove the edges (E''' & E'''') marked for deletion and update the degree table considering edges from G (e ∈ E).
4) If we now search for a new N(=3)-Trail in the reduced graph in Fig. 3, there are no such edges. There is no N(=2)-Trail either; therefore the only edges left are e v − v and e v − v with N (= 1), which is not considered in the algorithm.

Fig. 2: Graph after one iteration of Graph Reduction Algorithm
Fig. 3: Graph after one iteration of Graph Reduction Algorithm

After the graph reduction process, G' is the graph shown in Fig. 3 but without distances assigned to edges. As described in step 2 of Algorithm 3, Approx-Vertex-Cover (Algorithm 4) applied to G' will return the solution NMVC.

Approx-Vertex-Cover Algorithm on G'
1) Select an edge randomly from G', add its endpoints to the solution AVC, and remove all edges connected to the endpoints of this edge. Let us pick the edge e v − v ; then all edges connected to v or v are removed, and only one edge, e v − v , remains after this process.
2) We need to pick another edge randomly, but since only one edge is left in the graph we pick e v − v and add its endpoints to the solution AVC.
3) No edges remain in the graph, therefore the solution is AVC = { v , v , v , v }, which is NMVC in Algorithm 3.

Here we can see that the optimal solution for NMVC is { v , v , v }, but the Approx-Vertex-Cover algorithm outputs an approximate solution with 1 extra vertex.

IV. PROOF OF CORRECTNESS OF ALGORITHM

We prove the correctness of the proposed algorithm by contradiction.

By the graph reduction described in Algorithm 2 we reduce G(V, E) → G'(V, E') and solve G' for Vertex Cover using the Approx-Vertex-Cover algorithm, which provides a solution for N-distance vertex cover for G.

By the reduction algorithm, any edge e' = {u, v} ∈ E' is either an edge from the original graph (e ∈ E) or added by the graph reduction algorithm.

Let us assume there is an edge e ∈ E which is not covered by the N-distance vertex cover solution (NMVC) given by the proposed algorithm. This means both endpoints of e are at distance > N from each vertex in NMVC.

From the reduction algorithm, either e ∈ E' or e is removed during the reduction G → G'.

If e ∈ E', then e has to be covered by the Approx-Vertex-Cover algorithm for G'(V, E') because of the correctness of the Approx-Vertex-Cover algorithm (a contradiction to the assumption).

If e is removed during the graph reduction algorithm, then by the properties of the reduction algorithm:
1) An edge is only removed when its endpoints are at distance ≤ N from one of the endpoints of E_N.
2) During reduction all the edges of E_N are removed, but one new edge is added between the endpoints of E_N.

From subsection III-B step 3, e could be an edge connecting v_0 and v_i or connecting v_j and v_N. When the edges connecting these vertices are removed, one edge connecting the endpoint vertices is added. So one of the endpoints of the newly added edge has to be in the solution (by the Approx-Vertex-Cover algorithm property), and e is at distance ≤ N from both endpoints. Therefore e is at distance ≤ N from one of the vertices in the solution, which contradicts the assumption.

V. DISCUSSION ON UPPER BOUND ON SOLUTION

Given a graph instance (I) G(V, E), the solution for N-distance vertex cover depends on the reduction G → G'.
From the Approx-Vertex-Cover upper bound on the solution for G(V, E):
  OPT(I) ≥ |M|   (1)
  A(I) = 2|M| ≤ 2 * OPT(I)   (2)
where M is a maximal matching for G and |M| is the size of the maximal matching. OPT(I) is the optimal solution for the given instance I.

After the Graph Reduction Algorithm has performed G → G', the new instance (I') is G'(V, E'). To get the final solution, the Approx-Vertex-Cover algorithm is applied to G' in Algorithm 3. Therefore we can say
  OPT(I') ≥ |M'|   (3)
  A(I') = 2|M'| ≤ 2 * OPT(I')   (4)
where M' is a maximal matching for G' and |M'| is the size of the maximal matching. OPT(I') is the optimal solution for the given instance I'.

Fig. 4: Graph G'' - stretched version of G w.r.t. the example described in subsection III-D

We know each edge of the maximal matching M' is added by removing a k-Trail (k ≤ N) of ≤ N connected edges in G selected using Algorithm 1.

We will modify the graph G → G'' s.t. the N-distance vertex cover solutions for G' and G'' are the same. We initialize G'' ← G. When the graph reduction algorithm executes, two conditions are checked when an edge is added between vertices with distance < N:
1) Algorithm 2 steps 10 & 11: Vertices with distance < N but degree = 1 are added to V' and V'' respectively, and then edges E' and E'' are added in steps 12 & 13.
2) Algorithm 2 step 4: If there is no N-Trail, N is decreased (to n) and the algorithm is repeated. When N is decreased and the reduction algorithm is processed, new edges are added between the vertices with distance < N (= n).

While executing the algorithm, in both conditions we extend the vertex other than the endpoint of E_N in G'' by splitting the edge connected to that endpoint into N − n edges and vertices, as shown in Fig. 4.
This extension also applies to the vertices at unit distance (N = 1), which we have avoided in the reduction algorithm. By this construction we have made exactly an N-Trail from one of the endpoints of E_N to each vertex in V' & V''.

The above construction guarantees that every edge of a maximal matching in G' is an edge added by the removal of exactly N edges of an N-Trail in G''. If we apply the Approx-Vertex-Cover algorithm to G'' then
  OPT(I'') ≥ |M''|   (5)
  A(I'') = 2|M''| ≤ 2 * OPT(I'')   (6)
where M'' is a maximal matching for G'' and |M''| is the size of the maximal matching. OPT(I'') is the optimal solution for the given instance I''.

As discussed, each edge of the maximal matching M' is a replacement of an N-Trail in G''. Therefore, from the construction of G'' we can write
  |M''| ≥ |M'| * N/2   (7)
  |M'| ≤ (2/N) * |M''|   (8)
(N ← N + 1 if N is odd)

From equations 4 and 8,
  A(I') = 2|M'| ≤ 2 * (2/N) * |M''|   (9)
From equations 6 and 9,
  A(I') ≤ (2/N) * 2|M''| ≤ (4/N) * OPT(I'')   (10)

From equation 10 we can say that, for a given graph G, if we extend G to G'' and reduce G to G', then the solution to the N-distance Vertex Cover Problem for G is the solution from the Approx-Vertex-Cover algorithm for G', which has an upper bound w.r.t. G'', the extended version of G.

Utility of Extended Graph: We constructed the extended graph to discuss the upper bound on the solution, but the extended version of a graph has other uses. As discussed, extending a graph ensures the traversal of an exact N-Trail, which inherently suggests a process to maximize a network's capability to propagate information. Fig. 4 shows the extended graph, where dotted edges are extended edges and diamond-shaped nodes are extended nodes. In a network scenario, if more nodes are to be added, these extended nodes are the possible places where one should add the new nodes without worrying about propagating the information to them. These new nodes could be sink nodes, which do not require the capability to propagate information.

VI. SUMMARY

This paper presents a variant of the vertex cover problem, called the N-distance vertex cover problem, which addresses the challenge of propagating information efficiently in networks. We proposed a solution by an approximation algorithm using graph reduction and the Approx-Vertex-Cover algorithm. Correctness of the proposed solution is proved by contradiction, and an upper bound is discussed by construction of an extended graph.