A Simpler NP-Hardness Proof for Familial Graph Compression


Ammar Ahmed ∗†, Zohair Raza Hassan ∗†, Mudassir Shabbir ∗†

Abstract

This document presents a simpler proof showcasing the NP-hardness of Familial Graph Compression.

Familial Graph Compression (FGC) is a problem introduced in [1]. The problem entails determining whether it is possible to convert a given graph $G$ to a target graph $H$ via a series of “compressions” based on the presence of certain sub-graphs in $G$, specified in a set $\mathcal{F}$. A complete definition is given in the next section. A single instance of FGC involves $G$, $H$, and $\mathcal{F}$ as input. This problem was proven to be NP-complete in [1]:

Theorem 1.1. The FGC problem is NP-complete when:

1. $G$ is a simple graph on $n$ nodes, $H$ is the single-node graph, and the family $\mathcal{F}$ contains a single motif $C_n$, i.e. a cycle on $n$ nodes.
2. $G$ is a simple graph on $n = 3k$ nodes, $H$ is the single-node graph, and $\mathcal{F}$ contains a single motif consisting of $k$ disjoint triangles.
3. $G$ is a simple graph, $H$ is a forest of isolated nodes, and $\mathcal{F}$ is a family of graphlets.

In this work, we provide an easier proof for the third setting.

We adopt the same notation and terminology as in [1]. The relevant preliminaries are reiterated below.

∗ Department of Computer Science, Information Technology University, Pakistan
† All authors contributed equally to this article.
‡ Email addresses: [email protected] (Ammar Ahmed), [email protected] (Zohair Raza Hassan), [email protected] (Mudassir Shabbir).

arXiv [cs.CC], Sep.

Preliminaries

A graph $G$ is a collection of nodes $V$ and edges $E \subseteq V \times V$, i.e. pairwise interactions between nodes. For a node $u$, its neighborhood $N(u)$ is defined as the set of all nodes $v \in V$ such that there exists an edge $(u, v)$ in $E$. The degree $d(u)$ is defined as the size of the neighborhood of $u$. $G$ is undirected and unweighted, i.e. for $u, v \in V$, the edge $(u, v)$ is the same as the edge $(v, u)$.

For a fixed graph $G = (V, E)$, a given $F = (V_F, E_F)$ is called a motif of $G$ if $F$ is isomorphic to a sub-graph of $G$, i.e. if there exist $V' \subseteq V$ and a bijection $\varphi : V_F \to V'$ such that for every edge $(u, v) \in E_F$ there is an edge $(\varphi(u), \varphi(v)) \in E$. Similarly, $F = (V_F, E_F)$ is called a graphlet of $G$ if $F$ is isomorphic to an induced sub-graph of $G$, i.e. if there exist $V' \subseteq V$ and a bijection $\varphi : V_F \to V'$ such that $(u, v) \in E_F$ if and only if $(\varphi(u), \varphi(v)) \in E$. We will use the term motif (and similarly graphlet) for both $F$ and any of its isomorphic copies in $G$.

For a given equivalence relation $\sim$ on the node set of a graph $G$, the quotient graph, denoted by $G/{\sim}$, is the graph whose node set is the set of equivalence classes defined by $\sim$ and in which there is an edge between a pair of classes if and only if there is an edge in $G$ between some pair of nodes, one from each of the two classes. Intuitively, in quotient graphs, prescribed subsets of nodes are merged and the incidence is preserved without creating multi-edges [2].
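The quotient construction can be illustrated with a minimal Python sketch. It assumes a graph given as a node list and an edge list, and a partition given as a list of disjoint sets; the function name `quotient_graph` and this representation are illustrative choices and do not come from [1]. Edges whose endpoints fall in the same class are dropped, which is an assumption made here so that the result stays a simple graph.

```python
def quotient_graph(nodes, edges, classes):
    """Return (class_ids, class_edges) for the quotient of a graph.

    nodes   : iterable of hashable node labels
    edges   : iterable of undirected edges (u, v)
    classes : list of disjoint sets that partition `nodes`
    """
    # Map every node to the index of its equivalence class.
    class_of = {u: idx for idx, cls in enumerate(classes) for u in cls}
    assert set(class_of) == set(nodes), "classes must partition the node set"

    class_edges = set()
    for u, v in edges:
        a, b = class_of[u], class_of[v]
        if a != b:  # assumption: edges inside a class do not become self-loops
            # Storing frozensets in a set prevents multi-edges.
            class_edges.add(frozenset((a, b)))
    return list(range(len(classes))), class_edges


# Path 0-1-2-3 with nodes 0 and 1 merged into one class.
q_nodes, q_edges = quotient_graph(
    [0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)], [{0, 1}, {2}, {3}]
)
```

In this example the merged class keeps its adjacency to node 2, so the quotient is a path on three classes.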
We will repeatedly deal with graphs named $G$, $H$, and $F_i$; their node and edge sets will be denoted by $(V_G, E_G)$, $(V_H, E_H)$, and $(V_{F_i}, E_{F_i})$, respectively. Finally, for a set $V$ and a positive integer $c$, $\binom{V}{c}$ is defined as the set of all subsets of $V$ with exactly $c$ elements.

We start by defining an equivalence relation on the node set $V$ of $G$ based on a motif (or a graphlet) $F$. Consider the relation $R_F$ where node $u$ is related to $v$ whenever both $u$ and $v$ lie in a sub-graph of $G$ isomorphic to $F$. We define $\sim_F$ to be the transitive closure of $R_F$, with every node additionally related to itself. Intuitively, if two motifs (resp. graphlets) share a common node in $G$, then all nodes in both motifs (resp. graphlets) are related in $\sim_F$. Clearly, $\sim_F$ is an equivalence relation on $V$. Then, an $F$-compression step (referred to as a compression step when $F$ is clear from the context) is defined as computing the quotient graph $G/{\sim_F}$. Recall that the quotient graph $G/{\sim_F}$ is a graph on the classes of the partition induced by $\sim_F$, where two classes are adjacent if some pair of nodes, one from each class, is adjacent in $G$.

The familial compression of a graph $G$ for a family $\mathcal{F}$ is the process of repeatedly applying $F_i$-compression steps on $G$, where after each step $G$ is replaced by the quotient graph of the previous step. Thus, we say that a graph $H$ can be constructed by an $\mathcal{F}$-compression of $G$ if there exists a sequence of graphs $[G_0, G_1, G_2, \ldots, G_k = H]$ where $G_0 = G$ and $G_j = G_{j-1}/{\sim_{F_i}}$, i.e. $G_j$ is the result of an $F_i$-compression on the graph $G_{j-1}$ for some $F_i \in \mathcal{F}$. Note that a graph $H$ may be constructed in several different ways via different compression steps. To avoid trivial compressions, we require that each $F \in \mathcal{F}$ contains at least three nodes. The following is the FGC problem:

Problem 2.1 (Familial Graph Compression). Given simple graphs $G$ and $H$, and a family of motifs (or graphlets) $\mathcal{F}$, can $H$ be constructed from an $\mathcal{F}$-compression of $G$?
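A single compression step can be sketched in Python for very small graphs. The sketch assumes graphs given as node lists and edge lists, finds copies of $F$ by brute force over injective maps (exponential time, so for illustration only), merges overlapping copies with a union-find structure, and returns the quotient graph. The names `find_copies` and `compression_step` and the `induced` flag (graphlet when true, motif when false) are hypothetical and are not taken from [1].

```python
from itertools import permutations


def find_copies(g_nodes, g_edges, f_nodes, f_edges, induced=False):
    """Yield the node set of every copy of F in G (brute force)."""
    g_adj = {frozenset(e) for e in g_edges}
    f_adj = {frozenset(e) for e in f_edges}
    f_nodes = list(f_nodes)
    # Every ordered selection of distinct G-nodes is a candidate injective map.
    for image in permutations(g_nodes, len(f_nodes)):
        phi = dict(zip(f_nodes, image))
        ok = True
        for i, a in enumerate(f_nodes):
            for b in f_nodes[i + 1:]:
                in_f = frozenset((a, b)) in f_adj
                in_g = frozenset((phi[a], phi[b])) in g_adj
                if in_f and not in_g:
                    ok = False  # an edge of F is missing in G
                if induced and in_g and not in_f:
                    ok = False  # graphlet: G has an extra edge
        if ok:
            yield set(image)


def compression_step(g_nodes, g_edges, f_nodes, f_edges, induced=False):
    """Merge the classes of ~_F and return (nodes, edges) of the quotient."""
    g_nodes = list(g_nodes)
    parent = {u: u for u in g_nodes}  # union-find forest

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    # Nodes in one copy are related; shared nodes chain copies together,
    # which realises the transitive closure.
    for copy in find_copies(g_nodes, g_edges, f_nodes, f_edges, induced):
        first, *rest = copy
        for u in rest:
            parent[find(u)] = find(first)

    rep = {u: find(u) for u in g_nodes}
    new_nodes = list(dict.fromkeys(rep[u] for u in g_nodes))
    new_edges = {
        frozenset((rep[u], rep[v])) for u, v in g_edges if rep[u] != rep[v]
    }
    return new_nodes, new_edges
```

For instance, compressing a triangle with one pendant node by the triangle motif leaves two adjacent nodes. Compressing a triangle by a three-node path merges everything when the path is treated as a motif, and changes nothing when it is treated as a graphlet, because a triangle contains no induced path on three nodes.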
In the original proof of Theorem 1.1-(3), a reduction is provided from a variant of the 3-SAT problem to FGC. In this section we show the same result via a reduction from Exact Cover by Three Sets (XC3), defined below.

Problem 3.1 (Exact Cover by Three Sets [3]). Let $X = \{x_1, x_2, \ldots, x_{3k}\}$, and let $S$ be a collection of 3-element subsets of $X$, in which no element of $X$ appears in more than three subsets. For $s_j \in S$, write $s_j = \{x_{j_1}, x_{j_2}, x_{j_3}\}$. The problem consists of determining whether $S$ has an exact cover for $X$, i.e. a subcollection $S' \subseteq S$ such that every element of $X$ occurs in exactly one member of $S'$.

This problem was proven to be NP-complete in [3]. Note that for our reduction, the fact that “each element appears in no more than three subsets” is inconsequential.
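To make the definition of an exact cover concrete, the following Python sketch checks a candidate subcollection and searches for an exact cover by brute force. The function names `is_exact_cover` and `find_exact_cover` are illustrative, triples are represented as tuples of element labels, and the exhaustive search is exponential in general, so it is meant only for tiny instances.

```python
from itertools import combinations


def is_exact_cover(X, chosen):
    """True if every element of X lies in exactly one triple of `chosen`."""
    covered = [x for s in chosen for x in s]
    # Same multiset size and same support means no repeats and no gaps.
    return len(covered) == len(X) and set(covered) == set(X)


def find_exact_cover(X, S):
    """Return a list of triples forming an exact cover of X, or None."""
    k, remainder = divmod(len(X), 3)
    if remainder:
        return None  # |X| must be a multiple of three
    # An exact cover by 3-element sets uses exactly k = |X| / 3 triples.
    for chosen in combinations(S, k):
        if is_exact_cover(X, chosen):
            return list(chosen)
    return None


X = [1, 2, 3, 4, 5, 6]
cover = find_exact_cover(X, [(1, 2, 3), (3, 4, 5), (4, 5, 6)])
no_cover = find_exact_cover(X, [(1, 2, 3), (3, 4, 5), (2, 5, 6)])
```

In the first instance the triples (1, 2, 3) and (4, 5, 6) form an exact cover; in the second, every pair of triples overlaps, so no exact cover exists.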

Theorem 3.1. XC3 $\leq_P$ FGC.

Proof. Suppose we are given an instance of XC3, i.e. the sets $X$ and $S$. We show how to construct graphs $G$ and $H$ and a family $\mathcal{F}$ for an FGC instance that is solvable if and only if the given XC3 instance is solvable.

Let $C_i$ denote a cycle on $i$ vertices, and let $f(i) = i + 2$ for $i \in \{1, 2, 3, \ldots\}$. The graph $G$ is the union of $3k$ disjoint cycles: $G = \bigcup_{x_i \in X} C_{f(i)}$. For each $s_j \in S$, we define a graph $Z_j$ which is the union of three disjoint cycles: $Z_j = C_{f(j_1)} \cup C_{f(j_2)} \cup C_{f(j_3)}$. The family $\mathcal{F}$ contains $Z_j$ for each $s_j \in S$: $\mathcal{F} = \{Z_j : s_j \in S\}$. Finally, the target graph $H$ is a graph on $k$ isolated vertices, i.e. $|V_H| = k$ and $E_H = \emptyset$.

Intuitively, compressing a $Z_j$ in $G$ corresponds to selecting $s_j \in S$ for the exact cover of $X$: the three cycles of $Z_j$ collapse into a single isolated vertex. Since $f$ is injective, the cycles of $G$ have pairwise distinct lengths, so $C_{f(i)}$ occurs in $G$ only as the cycle of $x_i$. Observe that FGC does not allow the same element to be covered by different $s_j$'s, since the cycles corresponding to covered elements no longer exist in the quotient graph and therefore cannot be compressed (selected) again. We get $k$ isolated vertices if and only if $k$ disjoint 3-element subsets form an exact cover of $X$. Clearly, the reduction can be performed in polynomial time.

Observe that the $G$, $H$, and $\mathcal{F}$ used in Theorem 3.1 are exactly as described in Theorem 1.1-(3). We note that this reduction holds even when $\mathcal{F}$ is a family of motifs. We also observe that some simple changes to the provided reduction can be made to show Theorem 3.2, stated below.
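The construction in the proof of Theorem 3.1 can be illustrated with a short Python sketch. As a simplifying assumption made here (not in the proof), each graph is summarised by the lengths of its cycles instead of explicit edge lists; this relies on the cycles of $G$ having pairwise distinct lengths. The names `xc3_to_fgc` and `compressible` are hypothetical, and elements of $X$ are indexed $1, \ldots, 3k$.

```python
def f(i):
    """Cycle length assigned to element x_i, for i = 1, 2, 3, ..."""
    return i + 2


def xc3_to_fgc(num_elements, triples):
    """Build (G, family, H) from an XC3 instance.

    num_elements : |X| = 3k, with elements indexed 1..3k
    triples      : list of 3-tuples of element indices (the collection S)
    G is the set of cycle lengths, each Z_j is a set of three cycle
    lengths, and H is the number k of isolated vertices.
    """
    assert num_elements % 3 == 0
    G = frozenset(f(i) for i in range(1, num_elements + 1))
    family = [frozenset(f(i) for i in s) for s in triples]
    H = num_elements // 3
    return G, family, H


def compressible(G, family, H):
    """Search for a sequence of Z_j-compressions that turns G into H."""

    def search(cycles, isolated):
        if not cycles:
            return isolated == H
        # Z_j occurs in the current graph only if all three of its cycles
        # survive; compressing it replaces them by one isolated vertex.
        return any(
            search(cycles - Z, isolated + 1) for Z in family if Z <= cycles
        )

    return search(G, 0)


yes_instance = xc3_to_fgc(6, [(1, 2, 3), (3, 4, 5), (4, 5, 6)])
no_instance = xc3_to_fgc(6, [(1, 2, 3), (3, 4, 5), (2, 5, 6)])
```

Here `compressible(*yes_instance)` succeeds by compressing the graphs built from the triples (1, 2, 3) and (4, 5, 6), while `compressible(*no_instance)` fails because every pair of triples shares an element. The search itself is exponential; only the construction in `xc3_to_fgc` corresponds to the polynomial-time reduction.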

Theorem 3.2. FGC is NP-complete when $G$ is a connected, simple graph, $H$ is the single-node graph, and $\mathcal{F}$ is a family of graphlets or motifs.

References

[1] A. Ahmed, Z. R. Hassan, and M. Shabbir, “Interpretable multi-scale graph descriptors via structural compression,” Information Sciences, 2020.
[2] M. C. Golumbic and I. B.-A. Hartman, Graph theory, combinatorics and algorithms: Interdisciplinary applications, vol. 34. Springer Science & Business Media, 2006.
[3] T. F. Gonzalez, “Clustering to minimize the maximum intercluster distance,”