Complexity of acyclic colorings of graphs and digraphs with degree and girth constraints
Tomás Feder
268 Waverley St., Palo Alto, CA 94301, USA
theory.stanford.edu/~tomas

Pavol Hell
School of Computing Science, Simon Fraser University
Burnaby, B.C., Canada V5A 1S6

Carlos Subi

Let $G$ be either a graph or a digraph. An acyclic $r$-coloring of $G$ is an assignment of $r$ colors to the vertices of $G$ so that the vertices of each color $i$ induce an acyclic subgraph $G_i$ of $G$. Note that the vertex sets $V(G_i)$ partition $V(G)$. An equivalent condition on the $r$-coloring is that no cycle $C$ in $G$ is monochromatic. (In the case of digraphs, $C$ is a directed cycle.) The least $r$ for which $G$ admits an acyclic $r$-coloring is called the arboricity of $G$ if $G$ is a graph [16], and the dichromatic number of $G$ if $G$ is a digraph [1]. In both cases, the problem of deciding if $G$ has an acyclic $r$-coloring is NP-complete, for any fixed $r \geq$ […] $G$. For digraphs, we define the directed girth to be the minimum length of a directed cycle, if one exists, and leave it undefined otherwise; and we take the degree of a vertex to be the sum of its in-degree and out-degree.

We focus on a combination of girth and degree constraints, and we look at the two opposite ends of the spectrum: small girth with high degree on the one hand, and large girth with small degree on the other. For the former problem, graphs of high degree and small girth are typified by complete graphs, and the arboricity of complete graphs is trivial; but digraphs of high degree and small directed girth are typified by tournaments, and the dichromatic number of tournaments is already a hard problem: even just deciding acyclic two-colorability of a tournament is known to be NP-complete [3] (see also [11]). We prove that for random acyclically $r$-colorable tournaments $T$ we can recover the unknown acyclic $r$-coloring in deterministic linear time, with high probability over the choices of $T$.
(Such a coloring is unique with high probability, as long as $r$ is a constant.) This underscores the fact that the NP-completeness does not come from random instances. We placed this discussion in the last section, as it is quite technical.

For the latter problem, we consider graphs and digraphs of low degree and high girth. It is known that in the context of classical graph colorings, for each $r$ and $k$ there exists a $d$ such that deciding $r$-colorability of graphs with girth at least $k$ and all degrees at most $d$ is NP-complete [5]. We offer a simple proof of this fact to illustrate our techniques, and then prove an analogous result for acyclic colorings. For both graphs and digraphs, we consider the special case of $r = 2$ separately, as we can offer simpler proofs and/or better bounds. In any case, even at this opposite end of the scale, the arboricity and the dichromatic number remain mostly NP-complete.

Our NP-completeness constructions depend on gadgets constructed from graphs and digraphs with high girth and high arboricity or dichromatic number. There are well-known constructions of graphs and digraphs with high girth and high chromatic and dichromatic numbers [6, 3]. As far as we were able to determine, there does not appear to be such a result for arboricity, so we provide a short proof. The proof depends on a general result for high girth relational systems from [9]; in fact the same result implies the corresponding results for the chromatic and dichromatic numbers as well.

To highlight the gap for digraphs between the largest acyclic induced subgraph and the largest acyclic induced subgraph that can be found in polynomial time, we prove that even for digraphs (without digons) that have an acyclic $n^{\epsilon}$-coloring, and hence must have acyclic subgraphs of size $n^{1-\epsilon}$, it is hard to find one of size greater than $n^{1/2+\epsilon}$ (Theorem 4.2).

The following prototype result for ordinary $r$-coloring of undirected graphs is known [5]. We offer an easy proof to illustrate our technique.
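Before turning to the reductions, the definition of an acyclic $r$-coloring given in the introduction can be made concrete. The following Python sketch checks that no cycle is monochromatic, using union-find for graphs and repeated removal of in-degree-zero vertices for digraphs. The function name `is_acyclic_coloring` and the edge-list representation are illustrative assumptions, not notation from the paper.

```python
from collections import defaultdict


def is_acyclic_coloring(vertices, edges, color, directed=False):
    """Return True if every color class induces an acyclic sub(di)graph.

    vertices: iterable of hashable vertex names
    edges:    list of pairs (u, v); read as arcs u -> v when directed=True
    color:    dict mapping each vertex to its color
    (Hypothetical helper; the representation is an assumption.)
    """
    vertices = list(vertices)
    # Only monochromatic edges can lie on a monochromatic cycle.
    mono = [(u, v) for u, v in edges if color[u] == color[v]]

    if directed:
        # Kahn's algorithm: repeatedly delete vertices of in-degree 0.
        out = defaultdict(list)
        indeg = {v: 0 for v in vertices}
        for u, v in mono:
            out[u].append(v)
            indeg[v] += 1
        stack = [v for v in vertices if indeg[v] == 0]
        removed = 0
        while stack:
            u = stack.pop()
            removed += 1
            for v in out[u]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    stack.append(v)
        # Every vertex is deleted exactly when no directed cycle remains.
        return removed == len(vertices)

    # Undirected case: union-find detects an edge that closes a cycle.
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in mono:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True


triangle = [(0, 1), (1, 2), (2, 0)]
# Two colors on an undirected triangle: no monochromatic cycle.
print(is_acyclic_coloring(range(3), triangle, {0: 0, 1: 0, 2: 1}))
# One color on a directed 3-cycle: the cycle itself is monochromatic.
print(is_acyclic_coloring(range(3), triangle, {0: 0, 1: 0, 2: 0}, directed=True))
```

The check runs in time nearly linear in the number of edges, which is why membership of these coloring problems in NP is immediate; the hardness lies in finding the coloring.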
Theorem 2.1
There exists a function $d = d(r, k)$ such that, given $r, k \geq 3$, the $r$-colorability of a graph $G$ of girth at least $k$ and of maximum degree at most $d = d(r, k)$ is NP-complete.

Proof.
We shall reduce from the problem of $r$-colorability. Given a graph $G$, we shall construct a graph $G'$ with maximum degree at most $d$ and girth at least $k$ that is $r$-colorable if and only if $G$ is $r$-colorable. The first step is to replace each vertex $x$ with a binary tree $T_x$ having $\deg(x)$ leaves, and place each edge $xy$ of $G$ between the $y$-th leaf of $T_x$ and the $x$-th leaf of $T_y$. The resulting graph $G^*$ has maximum degree three. In the second step we shall replace each edge of every tree $T_x$ by a gadget $J$ designed to ensure that the girth of the resulting graph $G'$ is at least $k$ and that all vertices that were leaves of any one $T_x$ obtain the same color in any $r$-coloring of $G'$. This ensures that $G'$ is $r$-colorable if and only if $G$ is $r$-colorable. The maximum degree of $G'$ is then three times the maximum degree $\Delta(J)$ of the gadget $J$, so $d(r, k) = 3\Delta(J)$.

It remains to construct $J$: it is well known that for any $k$ and $r$ there exists a graph $K$ which is not $r$-colorable and has girth at least $k$ [6] (for a constructive proof see [15]). We may assume that $K$ contains an edge $uv$ such that $K - uv$ is $r$-colorable. We let $J$ be the graph $K - uv$, and replace each edge $st$ of every $T_x$ by a copy of $J$, identifying $s$ with $u$ and $t$ with $v$. Since the girth of $K$ is at least $k$, each path joining $u$ and $v$ in $J$ has at least $k$ vertices, and the girth of the entire graph $G'$ is at least $k$. Any $r$-coloring of $G'$ assigns the same color to $u$ and $v$ in each copy of $J$, since otherwise $K$ would be $r$-colorable.

The same technique can be applied to acyclic coloring problems. We start by discussing the computational complexity of the arboricity of graphs of high girth and low degree, and prove that it remains NP-complete. We begin with the special case of $r = 2$.

Theorem 2.2
There is a function $d(k)$ such that, given $k \geq 3$, the problem of acyclic two-coloring a graph $G$ of girth at least $k$ and of maximum degree at most $d(k)$ is NP-complete.

Proof. Fix $k \geq 3$. In this case we reduce from the not-all-equal $k$-satisfiability problem with three occurrences of each variable, which is NP-complete by Feder and Ford [8]. An instance of this problem is a set of variables with binary values, and a set of clauses each consisting of $k$ variables. The question is whether values can be chosen so that no clause has all variables of the same value. (This is also known as the two-colorability problem for $k$-uniform hypergraphs [12].) Given such an instance, we replace each clause by a disjoint cycle of $k$ vertices, one corresponding to each variable. Clearly, in every acyclic two-coloring each of these cycles will receive both colors. We need to ensure that all three occurrences of a variable are given the same value. We shall add for every variable $x$ a simple claw $T_x = K_{1,3}$, with the three leaves identified with the three occurrences of $x$. The resulting graph $G'$ has maximum degree three. For this construction we shall similarly use a gadget $J$ to replace each edge of every claw $T_x$; the gadget $J$ will ensure that the final graph $G$ has girth at least $k$ and has the same color on the three occurrences of each variable, in every acyclic two-coloring of $G$. This guarantees that $G$ is acyclically two-colorable if and only if the original instance is satisfiable. We construct $J$ from a graph $K$ of girth at least $k$ that does not admit an acyclic $r$-coloring (here $r = 2$), but contains an edge $uv$ such that $J = K - uv$ is acyclically $r$-colorable. We then replace each edge of each $T_x$ by a copy of $J$, identifying $u$ and $v$ with the two endpoints of that edge. The construction of such a graph $K$ is discussed in the next section.

The general result is the following.

Theorem 2.3
There exists a function $d = d(r, k)$ such that, given $r, k \geq 3$, the problem of acyclic $r$-coloring a graph $G$ of girth at least $k$ and of maximum degree at most $d = d(r, k)$ is NP-complete.

Proof.
Here we combine both of the previous tricks. We again reduce from the problem of graph $r$-colorability. Thus let $G$ be an instance. We again replace each vertex $x$ by a binary tree $T_x$ with $\deg(x)$ leaves. We will use two gadgets, $J_1$ and $J_2$, with the following properties:
• $J_1$ has vertices $u_1, v_1$ such that any acyclic $r$-coloring of $J_1$ assigns the same color to $u_1$ and $v_1$;
• $J_2$ has vertices $u_2, v_2$ such that any acyclic $r$-coloring of $J_2$ assigns different colors to $u_2$ and $v_2$;
• $J_1$ has girth at least $k$ and contains no path with fewer than $k$ vertices between $u_1$ and $v_1$; and
• $J_2$ has girth at least $k$.

Each edge $st$ of each $T_x$ will be replaced by a copy of $J_1$, identifying $s$ with $u_1$ and $t$ with $v_1$, and each edge $xy$ of $G$ will similarly be replaced by a copy of $J_2$ between the corresponding leaves of $T_x$ and $T_y$. The resulting graph $G'$ has girth at least $k$ because every cycle in $G'$ is either inside a copy of some $J_i$, or passes through some $J_1$. Clearly, $G'$ is acyclically $r$-colorable if and only if $G$ is $r$-colorable in the usual sense. The degrees of $G'$ are bounded by three times the maximum degree of any of $u_1, u_2, v_1, v_2$ in $J_1, J_2$.

It remains to explain how to construct $J_1, J_2$. Let $K$ be a graph of girth $k$ that is not acyclically $r$-colorable but has an edge $uv$ such that $K - uv$ is acyclically $r$-colorable. (These graphs are constructed in the next section.) Let $J_1 = K - uv$ and let $J_2$ be obtained from $K$ by subdividing the edge $uv$ by a new vertex $w$. We also take $u_1 = u$, $v_1 = v$ and $u_2 = u$, $v_2 = w$. Then it is easy to verify that $J_1, J_2$ satisfy the required properties. Indeed, in any acyclic $r$-coloring of $J_1$, the vertices $u_1 = u$, $v_1 = v$ must obtain the same color, otherwise $J_1 \cup uv = K$ would also be acyclically $r$-colorable. The same argument holds for $J_2$ and $u_2 = u$ and $v$, and therefore $u_2 = u$ and $v_2 = w$ must obtain different colors. Any path between $u_1$ and $v_1$ in $J_1$ contains at least $k$ vertices, otherwise $K$ would contain a cycle shorter than $k$.

We are ready to tackle the desired result for the dichromatic number.

Theorem 2.4
There exists a function $d = d(r, k)$ such that, given $r, k \geq 3$, the problem of acyclically $r$-coloring a digraph $G$ of directed girth at least $k$, and of in-degrees and out-degrees at most $d = d(r, k)$, is NP-complete.

Proof.
The proof is similar to the undirected case above. We again reduce from graph $r$-colorability. Let $G$ be an instance, and replace each vertex $x$ by an oriented binary tree $T_x$ with $\deg(x)$ leaves. The tree is first rooted at a non-leaf vertex, then oriented away from the root. We will use two digraph gadgets, $J_1$ and $J_2$, with the following properties:
• $J_1$ has vertices $u_1, v_1$ such that any acyclic $r$-coloring of $J_1$ assigns the same color to $u_1$ and $v_1$;
• $J_2$ has vertices $u_2, v_2$ such that any acyclic $r$-coloring of $J_2$ assigns different colors to $u_2$ and $v_2$;
• $J_1$ has directed girth at least $k$ and contains no directed path with fewer than $k$ vertices from $u_1$ to $v_1$; and
• $J_2$ has directed girth at least $k$.

Each directed edge $st$ of each $T_x$ will be replaced by a copy of $J_1$, identifying $s$ with $u_1$ and $t$ with $v_1$, and each edge $xy$ of $G$ will similarly be replaced by a copy of $J_2$ between the corresponding leaves of $T_x$ and $T_y$. The resulting digraph $G'$ has directed girth at least $k$ because every directed cycle in $G'$ is either inside a copy of some $J_i$, or passes through some $J_1$. Clearly, $G'$ is acyclically $r$-colorable if and only if $G$ is $r$-colorable in the usual sense. The in- and out-degrees of $G'$ are bounded by three times the maximum in- and out-degree of any of $u_1, u_2, v_1, v_2$ in $J_1, J_2$.

In this case there is again a simpler construction when $r = 2$.

Theorem 2.5
Given $k \geq 3$, the problem of acyclic 2-coloring a digraph $G$ of directed girth at least $k$ and of in-degrees and out-degrees at most $k + 1$ is NP-complete.

Proof.
We proceed as in Theorem 2.2, reducing from the not-all-equal $k$-satisfiability problem with three occurrences of each variable $x$, replacing each clause with a disjoint directed $k$-cycle. To ensure that the three occurrences of a variable $x$ in clauses have the same value, we consider for each variable $x$ a separate digraph $H_k$ whose vertices are partitioned into $k$ sets $S_0, S_1, \ldots, S_{k-1}$, with $S_0$ of size one, $S_1$ an independent set of size three, and each $S_i$ for $2 \leq i < k$ inducing a directed $k$-cycle. In addition, we include all edges from $S_i$ to $S_{i+1}$, and from $S_{k-1}$ to $S_0$. In any acyclic two-coloring of $H_k$, each $S_i$ for $2 \leq i < k$ must have both colors, so the colors in $S_1$ must all be different from the color in $S_0$, and hence the same. The three elements of $S_1$ can thus be identified with the three occurrences of $x$.

3 High Girth Graphs and Digraphs
In this section we discuss the existence of high-girth graphs and digraphs without acyclic $r$-colorings. For ordinary graph $r$-colorings, we have the following well-known result of [6].

Theorem 3.1
For any $r, k \geq$ […], there exists a graph with girth at least $k$ which is not $r$-colorable.

For the dichromatic number we have the following theorem [3].
Theorem 3.2
For any $r, k \geq$ […], there exists a digraph with directed girth at least $k$ which is not acyclically $r$-colorable.

A very general version of such results is proved in [9] (Theorem 5). The proof in [9] is probabilistic, but there is a constructive proof in [14]. We refer the reader to [9] for the definition of a constraint satisfaction problem, the girth of an instance, and equivalence of problems. We explain below the special case sufficient for our applications here.
Theorem 3.3
For every constraint satisfaction problem $P$, any instance $I$ of $P$, and any integer $k \geq$ […], there exists an instance $I'$, equivalent to $I$, with girth at least $k$.

The $r$-valued not-all-equal $k$-satisfiability problem is an example of a constraint satisfaction problem. Here an instance is a set of variables $x_1, x_2, \ldots, x_n$, each taking one of $r$ possible values, and a set of constraints $C_1, C_2, \ldots, C_m$, each consisting of exactly $k$ variables. A solution of an instance is an assignment of values to the variables such that no constraint $C_i$ has all its variables assigned the same value. (This can also be viewed as the $r$-coloring problem of $k$-uniform hypergraphs; cf. the special case $r = 2$ in the proof of Theorem 2.2.) In this case, the girth of an instance is the size of the smallest set of variables $y_0, y_1, \ldots, y_{k-1}$ such that any two consecutive $y_j, y_{j+1}$ (subscripts modulo $k$) occur together in some constraint $C_i$. We say that an instance $I$ is equivalent to an instance $I'$ if $I$ has a solution if and only if $I'$ has a solution. There obviously are instances without a solution, for example $n = (k-1)r + 1$ variables and all $m = \binom{n}{k}$ constraints imposed on each subset of size $k$. (If each variable is assigned one of $r$ values, some $k$ variables will have the same value, so $I$ has no solution.) We obtain the following corollary of the theorem.

Theorem 3.4
For any $r, k \geq$ […], there exists an instance of the $r$-valued not-all-equal $k$-satisfiability problem with girth at least $k$, which has no solution.

We can transform the instance into a digraph by taking a vertex for each variable $x_i$ and forming a directed $k$-cycle on the set of $k$ variables occurring in each constraint $C_i$. Clearly, this digraph has directed girth at least $k$. We obtain a new proof of Theorem 3.2.

By replacing each constraint with an undirected $k$-cycle, we similarly conclude the following useful fact.

Theorem 3.5
For any $r, k \geq$ […], there exists a graph with girth at least $k$ which is not acyclically $r$-colorable.

We close this section by noting that graph $r$-coloring is another example of a constraint satisfaction problem, and applying Theorem 3.3 to the graph $K_{r+1}$, which is not $r$-colorable, we obtain Theorem 3.1.

The digraph $K$ from Theorem 3.2 may be assumed to contain an edge $uv$ such that $K - uv$ is acyclically $r$-colorable, e.g., by assuming that $K$ is minimal with respect to inclusion. A similar remark applies to the graph $K$ from Theorem 3.5.

The obvious question is whether a more explicit construction for $H$ and $H'$ can be given, thus avoiding randomization [6, 9] or a more complex construction [14]. For example, in the case $k = 3$, a random tournament as in Theorem 5.2 of size $O(r \log r)$ suffices for $H$, yet it remains hard to find $H$ and $H'$. Our construction below gives $H = H'$ of size polynomial in $k$ for $r$ fixed, or polynomial in $r$ for $k$ fixed.

Theorem 3.6
For every $r \geq 1$, $k \geq$ […], there exists a digraph $H^k_r$ with the following properties.
1. $H^k_r$ has at most $k^r$ vertices;
2. moreover, if $k \leq r$, then $H^k_r$ has at most $k^{\lceil k(1+\log(r/k)) \rceil} \leq k\,(er/k)^{k \log k}$ vertices;
3. $H^k_r$ can be constructed in time linear in the number of vertices;
4. $H^k_r$ does not have an acyclic $r$-coloring;
5. for each edge $uv$, the graph $H^k_r - uv$ does have an acyclic $r$-coloring; and
6. $H^k_r$ has directed girth $k$.

This gives the bound $d(r, k) \leq |V(H^k_r)|$ in Theorem 2.3.

Proof.
We fix $k$, and let $H^k_1$ be a directed $k$-cycle, satisfying all conditions. For $H^k_r$ with $r \geq 2$, write $r - 1 = a\lfloor (r-1)/k \rfloor + b\lceil (r-1)/k \rceil$ with $a, b \geq 0$ and $a + b = k$. Let $r' = r - 1 - \lfloor (r-1)/k \rfloor$ and $r'' = r - 1 - \lceil (r-1)/k \rceil$. Define $H^k_r$ as the disjoint union of $a$ copies of $H^k_{r'}$ and $b$ copies of $H^k_{r''}$, for a total of $k$ copies, with all edges joining each such copy to the next, or the last one to the first one.

We prove the last three conditions by induction on $r$. The first $a$ copies need at least $r' + 1$ colors, avoiding at most $\lfloor (r-1)/k \rfloor$ colors. The last $b$ copies need at least $r'' + 1$ colors, avoiding at most $\lceil (r-1)/k \rceil$ colors. By the definition of $a, b$, at most $r - 1$ colors are avoided in total, so some color $i$ appears in all the copies, and this gives a cycle of color $i$ of length $k$ across all the copies. This proves condition (4).

For condition (5), suppose the removed edge $uv$ is inside the $j$-th copy $H_j$. Then $H_j$ can be colored with only $r'$ colors ($r''$ colors), giving one more avoided color than in the definition of $a, b$, for a total of $r$ avoided colors across all the copies, so no color appears in all the copies, and there is no cycle across all the copies that has the same color in all copies. This proves condition (5) in this case.

If the removed edge $uv$ joins, say, $H_j$ and $H_{j+1}$, then color $H_j - u$ and $H_{j+1} - v$ with only $r'$ (or $r''$) colors by condition (5), avoiding one more color in each of $H_j, H_{j+1}$, with only color $i$ avoided in both cases. Assign color $i$ to $u, v$. This again gives $r - 1$ avoided colors, and the color $i$ appearing in all copies does not give a cycle of length $k$ of color $i$, since the cycle would have to go through the edge $uv$. This proves condition (5).

For condition (6), note that all cycles either go through only one copy and are thus inductively of length at least $k$, or go through all the copies and must thus be of length at least $k$.

For conditions (1, 2), note that each step of the induction has $r', r'' \leq r(1 - 1/\min(r, k))$ and $k\,|V(H^k_{r'})| \geq |V(H^k_r)|$.

Note that this last result allows us to prove Theorems 2.5 and 2.3 without necessarily assuming that $r, k$ are constants; they may depend on $n$, as long as the bound $|V(H^k_r)| \leq n^{1-\epsilon}$ holds with $\epsilon > 0$.

We now prove a hardness of approximation result.
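As a small sanity check of the recursive construction in the proof of Theorem 3.6, the following Python sketch builds $H^k_r$ and verifies conditions (4) and (5) by brute force in the smallest nontrivial case. It rests on one assumption that the text does not state: when the recursion reaches $r'' = 0$, the copy is taken to be a single vertex (which has no coloring with zero colors). The names `build_H`, `is_acyclic` and `has_acyclic_coloring` are hypothetical, and the exhaustive search is only feasible for tiny parameters.

```python
from itertools import product


def build_H(k, r):
    """Return (n, arcs) for the recursive digraph on vertices 0..n-1."""
    if r == 0:
        return 1, []  # ASSUMPTION (not in the text): a single vertex
    if r == 1:
        return k, [(i, (i + 1) % k) for i in range(k)]  # directed k-cycle
    lo = (r - 1) // k        # floor((r-1)/k)
    hi = -(-(r - 1) // k)    # ceil((r-1)/k)
    b = (r - 1) - k * lo     # copies built with r'' (zero if k divides r-1)
    a = k - b                # copies built with r'
    parts = [build_H(k, r - 1 - lo)] * a + [build_H(k, r - 1 - hi)] * b
    offsets, n = [], 0
    for size, _ in parts:
        offsets.append(n)
        n += size
    arcs = []
    for j, (size, inner) in enumerate(parts):
        # arcs inside copy j, shifted to its own vertex range
        arcs += [(offsets[j] + u, offsets[j] + v) for u, v in inner]
        # all arcs from copy j to the next copy, cyclically
        nxt = (j + 1) % k
        arcs += [(offsets[j] + u, offsets[nxt] + v)
                 for u in range(size) for v in range(parts[nxt][0])]
    return n, arcs


def is_acyclic(n, arcs, color):
    """True if no directed cycle is monochromatic (Kahn's algorithm)."""
    out = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in arcs:
        if color[u] == color[v]:
            out[u].append(v)
            indeg[v] += 1
    stack = [v for v in range(n) if indeg[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return removed == n


def has_acyclic_coloring(n, arcs, r):
    """Exhaustive search over all r^n colorings (tiny inputs only)."""
    return any(is_acyclic(n, arcs, c) for c in product(range(r), repeat=n))


n, arcs = build_H(3, 2)
print(n, len(arcs))
print(has_acyclic_coloring(n, arcs, 2))                     # condition (4)
print(all(has_acyclic_coloring(n, [e for e in arcs if e != x], 2)
          for x in arcs))                                   # condition (5)
```

For $k = 3$, $r = 2$ the sketch produces two directed 3-cycles and one single vertex joined cyclically, seven vertices in all, which is within the bound $k^r = 9$ of condition (1).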
Lemma 4.1
Let $0 < \epsilon < 1$ be a constant. Given a complete bipartite graph $H = (U, V, E)$ with $|U| = |V| = n$, let $H'$ be the digraph obtained from $H$ by orienting the edges in either direction independently with equal probability $1/2$. Then with probability approaching 1 as $n$ goes to infinity, for every two subsets $U' \subseteq U$, $V' \subseteq V$ having $|U'|, |V'| \geq n^{\epsilon}$, the digraph induced by $U' \cup V'$ contains a cycle.

Proof.
Say $|U'| = |V'| = n^{\epsilon}$ and choose $U', V'$ ordered in at most $n^{2n^{\epsilon}}$ ways. If $U' \cup V'$ induces an acyclic subgraph consistent with these orderings, then the order of the neighbors of a vertex $u \in U'$, for some such order, will have first the edges incoming to $u$ from $V'$, then the outgoing edges from $u$ to $V'$. This can happen in $n^{\epsilon} + 1$ ways out of $2^{n^{\epsilon}}$ possible choices for the edges joining $u$. Multiplying the resulting probability over all $n^{\epsilon}$ choices of $u$, and over the choices of subsets $U', V'$, gives a probability of there being an acyclic $U' \cup V'$ of at most $n^{2n^{\epsilon}} (n^{\epsilon} + 1)^{n^{\epsilon}} 2^{-n^{2\epsilon}}$, which tends to zero as $n$ goes to infinity.

Noga Alon informed us that the known, relatively recent explicit construction for bipartite Ramsey graphs [2] will give a derandomization of this lemma. Indeed, suppose $U' \cup V'$ induces an acyclic digraph with a corresponding linear order $L$, and let $u, v$ be the middle vertices of $U', V'$ in $L$, respectively. Say the edge joining $u, v$ goes in the direction $uv$. Then all edges between the vertices below $u$ in $U'$ and the vertices above $v$ in $V'$ go from the former to the latter. (The other case is symmetric, from below $v$ in $V'$ to above $u$ in $U'$ if the direction is $vu$.) We can define a bipartite graph from $H'$ by including only the edges directed from $U$ to $V$. Then we just saw that we would have either a bipartite clique or a bipartite independent set with $k$ vertices on each side, $k = n^{\epsilon}/2$. The bipartite Ramsey construction in [2] guarantees this does not happen even with $k = n^{o(1)}$.

Feige and Kilian [10] proved that it is NP-hard to find an independent set of size greater than $n^{\epsilon}$ (thus hard to $n^{1-\epsilon}$-color) in a graph $G$ that is colorable with $n^{\epsilon}$ colors, for any $\epsilon > 0$. As a result, it is equally hard to find a large acyclic subgraph in a digraph, since we could replace the edges of $G$ with digons (girth 2), so that acyclic subgraphs in the resulting digraph correspond to independent sets in $G$.

Theorem 4.2
Fix $0 < \epsilon < 1/2$. It is NP-hard to find an acyclic induced subgraph of size greater than $N^{1/2+\epsilon}$ (thus hard to find an acyclic $N^{1/2-\epsilon}$-coloring) in an $N$-vertex digraph $G'$ without digons, i.e., of girth at least 3, that has an acyclic $r$-coloring with $r \leq N^{\epsilon}$.

Proof.
Let $G$ be an instance of the NP-hard question of Feige and Kilian. For each vertex $v_i \in V(G)$, let $U_i$ be a set with $|U_i| = n$. For each edge $v_i v_j \in E(G)$, join $U_i, U_j$ with the random bipartite digraph as in the lemma (which can be derandomized). This gives a digraph $G'$ with $N = n^2$ vertices that has an acyclic $r$-coloring with $r \leq N^{\epsilon}$, by copying each color of a $v_i$ into the corresponding $U_i$.

However, an acyclic induced subgraph $S$ can only meet sets $U_i$ in at least $n^{\epsilon}$ vertices if the corresponding $v_i$ form an independent set, by the lemma. Therefore only $n^{\epsilon}$ such large intersections can be found, by the result of Feige and Kilian, giving us $n^{1+\epsilon}$ vertices of $S$, plus small intersections, of size at most $n^{\epsilon}$ for the remaining $U_i$, for a total $|S| \leq 2n^{1+\epsilon} = 2N^{(1+\epsilon)/2}$.

We begin with two simple observations to introduce random tournaments, as in [7].
Theorem 5.1
Every tournament $T$ on $n$ vertices contains a transitive subtournament on $\lceil \log_2(n+1) \rceil$ vertices, and therefore has an acyclic coloring with $\frac{n}{(1-\epsilon)\log_2 n} + n^{1-\epsilon} = O(n/\log n)$ colors.

Proof.
Greedily select a vertex $v$ from $T$ of out-degree at least $\frac{n-1}{2}$, and remove $v$ and its in-neighbors from $T$ to obtain $T'$ of size at least $\frac{n-1}{2}$. This halving can be done $\lceil \log_2(n+1) \rceil - 1$ times, and the $\lceil \log_2(n+1) \rceil$ selected vertices $v$ will form a transitive tournament. For the acyclic coloring, greedily select and remove transitive tournaments from $T$, each of size at least $(1-\epsilon)\log_2 n$, until we reach a tournament $T'$ of size at most $n^{1-\epsilon}$, and we use at most this many colors for $T'$.

Random tournaments essentially match the preceding bound.

Theorem 5.2
With high probability, a random tournament $T$ on $n$ vertices only contains transitive subtournaments on $O(\log n)$ vertices, and therefore every acyclic coloring of $T$ uses $\Omega(n/\log n)$ colors.

Proof.
Selecting a sequence of $m$ distinct vertices $v$ of $T$ can be done in at most $s = n^m$ ways. The probability that such a sequence gives the ordering of a transitive tournament is at most $1/t$ for $t = 2^{\binom{m}{2}}$. For a suitable $m = O(\log n)$ (any integer $m \geq 2\log_2 n + 2$ works), the ratio $s/t$ tends to 0 as $n$ goes to infinity.

We now define a random model for acyclically $r$-colorable tournaments $T$ on $n$ vertices. Consider $r$ integers $s_1 \geq s_2 \geq \cdots \geq s_r \geq 1$ with $s_1 + s_2 + \cdots + s_r = n$. To define $T$, first consider $r$ disjoint sets of vertices $S_i$ with $|S_i| = s_i$, and impose on each $S_i$ a transitive (acyclic) tournament. Finally, orient each edge joining vertices in two different $S_i$ independently, with probability $1/2$ in either direction. The tournaments $T$ so generated have an acyclic $r$-coloring, obtained by assigning color $i$ to the vertices in $S_i$. We consider the problem of acyclic $r$-coloring such a $T$ when the vertices of $T$ are given in arbitrary order, and the $S_i$ and $s_i$ are not given. We give a deterministic algorithm that finds such a coloring with probability arbitrarily close to 1. If $r$ is fixed, the algorithm runs in time $O(n^2)$, linear in the size of the input $T$.

The algorithm runs in three phases, which we describe below.

The first phase starts with the tournament $T$ itself and operates in rounds. At the beginning of the $j$-th round, we have a tournament $T_{n_j, r_j}$ on $n_j$ vertices with $r_j$ remaining sets $S_i$; initially $n_j = n$ and $r_j = r$.

We define $d_j = c\sqrt{n_j \log n_j}$ for some constant $c$. Given a tournament $R$ and a vertex $v$ in $R$, we define
$$d^{R}_{\mathrm{diff}}(v) = d^{R}_{\mathrm{out}}(v) - d^{R}_{\mathrm{in}}(v),$$
which is the difference between the out-degree and the in-degree of $v$ in $R$. Note that $E\big(d^{T_{n_j,r_j}}_{\mathrm{diff}}(v)\big) = d^{S_i}_{\mathrm{diff}}(v)$ if $v \in S_i$. Consider the Chernoff bounds for $X$ equal to the sum of independent Bernoulli random variables, with $\mu = E(X)$:
$$\Pr(X \leq (1-\delta)\mu) \leq e^{-\delta^2\mu/2}, \quad 0 \leq \delta \leq 1,$$
$$\Pr(X \geq (1+\delta)\mu) \leq e^{-\delta^2\mu/3}, \quad 0 \leq \delta \leq 1.$$
Letting $X = d^{T_{n_j,r_j}}_{\mathrm{out}}(v) - d^{S_i}_{\mathrm{out}}(v)$, so that $E(X) = \frac{n_j - s_i}{2}$, we have that
$$\Pr\Big(\Big|X - \frac{n_j - s_i}{2}\Big| \geq d_j\Big) \leq 2e^{-\frac{2d_j^2}{3(n_j - s_i)}} \leq 2e^{-\frac{2}{3}c^2 \log n_j}$$
and therefore
$$\Pr\Big(\big|d^{T_{n_j,r_j}}_{\mathrm{diff}}(v) - d^{S_i}_{\mathrm{diff}}(v)\big| \geq 2d_j\Big) \leq 2e^{-\frac{2}{3}c^2 \log n_j}.$$
By the union bound, the probability that this happens for some $v$ in $T_{n_j,r_j}$ is at most $n_j$ times the bound. Let $u^*$ be the vertex that maximizes $|d^{T_{n_j,r_j}}_{\mathrm{diff}}(v)|$; say the quantity within the absolute value is nonnegative. Then $u^* \in S^* = S_{i^*}$ with $s^* = |S^*|$. Let $S$ be the largest of the $S_i$ in $T_{n_j,r_j}$, and let $u$ be the starting vertex of the transitive tournament $S$. Then
$$|S| - |S^*| \leq d^{S}_{\mathrm{diff}}(u) - d^{S^*}_{\mathrm{diff}}(u^*) \leq \big(d^{S}_{\mathrm{diff}}(u) - d^{T_{n_j,r_j}}_{\mathrm{diff}}(u)\big) + \big(d^{T_{n_j,r_j}}_{\mathrm{diff}}(u) - d^{T_{n_j,r_j}}_{\mathrm{diff}}(u^*)\big) + \big(d^{T_{n_j,r_j}}_{\mathrm{diff}}(u^*) - d^{S^*}_{\mathrm{diff}}(u^*)\big) \leq 2d_j + 0 + 2d_j = 4d_j$$
with probability at least $1 - 2(n_j + 1)e^{-\frac{2}{3}c^2 \log n_j}$.

The algorithm seeks to determine $S^*$ given the vertex $u^* \in S^*$. For $v \neq u^*$ and $w \neq u^*, v$, the probability that $u^* v w$ is a directed 3-cycle in either direction is $\frac{1}{4}$, unless $v, w \in S^*$, in which case the probability is zero. Let $X(v)$ be the number of $w$'s that form such a directed 3-cycle with $u^*$ and $v$. Then $E(X(v)) = \frac{1}{4}(n_j - 2)$ if $v \notin S^*$, and $E(X(v)) = \frac{1}{4}(n_j - s^*)$ if $v \in S^*$. Then
$$\Pr\big(|X(v) - E(X(v))| \geq d_j\big) \leq 2n_j e^{-\frac{4d_j^2}{3n_j}} \leq 2n_j e^{-\frac{4}{3}c^2 \log n_j},$$
where the factor of $n_j$ is needed to account for the fact that $u^*$ could be any vertex in $T_{n_j,r_j}$. With probability at most $n_j$ times this much, the bound fails for some $X(v)$.

Suppose $s^* - 2 > 8d_j$. With probability at least $1 - 2n_j^2 e^{-\frac{4}{3}c^2 \log n_j}$, vertices $v \in S^*$ have $X(v) \leq A' = \frac{1}{4}(n_j - s^*) + d_j$, and vertices $v \notin S^*$ have $X(v) \geq B' = \frac{1}{4}(n_j - 2) - d_j$, and $B' > A'$.

Thus for as long as $s > 12d_j + 2$, we have $s^* > 8d_j + 2$, and the algorithm can determine $u^*$ and test the $A', B'$ bounds to determine $S^*$. This works with probability at least
$$1 - p_j = 1 - 2(n_j + 1)e^{-\frac{2}{3}c^2 \log n_j} - 2n_j^2 e^{-\frac{4}{3}c^2 \log n_j} = 1 - e^{-(1 - \epsilon_j)\left(\frac{2}{3}c^2 \log n_j\right)},$$
where $\epsilon_j = O(1/c^2)$.

Lemma 5.3
Suppose the $j$-th round starts with $T_{n_j,r_j}$. Let $d_j = c\sqrt{n_j \log n_j}$. With probability at least $1 - p_j$, if the largest $S$ has $s > 12d_j + 2$, then we find $S^*$ with $s^* \geq s - 4d_j > 8d_j + 2$, and reduce the problem to $T_{n_{j+1},r_{j+1}} = T_{n_j - s^*,\, r_j - 1}$.

The last round of the first phase takes $T_{n_j,r_j}$ to $T_{n_{j+1},r_{j+1}}$. We let $n^- = n_j$, $r^- = r_j$, $n' = n_{j+1}$, $r' = r_{j+1}$, and $\hat{j} = j$ for this $j$. Note that after phase one is over, we have $s' \leq 12d' + 2 = O(c\sqrt{n' \log n'})$, with $s'$ and $d'$ defined similarly with $j = \hat{j}$. When phase one no longer applies, $r' \geq \frac{n'}{s'} = \Omega\Big(\frac{\sqrt{n'}}{c\sqrt{\log n'}}\Big)$. In particular, if $r = O(1)$, then $r' = O(1)$ and $n' = O(1)$, and the rest of the problem can be solved in $O(1)$ time.

We may assume $n^- > \max\Big(\frac{\log n}{\log r}, \frac{\log n}{\log\log n}\Big)$, since otherwise the problem can be solved in linear time, avoiding the $\hat{j}$-th round.

Theorem 5.4
Finding $S^*$ takes $O(n^2)$ time, linear in the size of the input $T$. This yields a total running time of $O(rn^2)$ over at most $r$ rounds. The probability bound for the first phase is $1 - n_j p_j = 1 - e^{-\Omega(c^2 \log n^-)}$ with $j = \hat{j}$.

For $r$ constant, we can find an acyclic $r$-coloring of $T$ on $n$ vertices in time $O(n^2)$, linear in the size of the input, with probability as above.

After the first phase is over, we have $T' = T_{n',r'}$. The second phase first identifies all sets $U$ in $T'$ with $|U| \leq c\log n'$ that could be the least elements of an $S_i$. The number of possible such sets $U$ is at most $n'^{\,c\log n'}$. First, $U$ must be a transitive tournament. Suppose $U$ of size $c\log n'$ is the bottom of an $S_i$. Let $V$ be $U$ plus all the elements above all of $U$.

We claim that $|V \setminus S_i| < c\log n'$. Otherwise choose $W$ with $|W| = c\log n'$ contained in $V \setminus S_i$. The sets $U, W$ are joined by edges joining different colors, thus the probability that they will all be oriented upwards is at most $2^{-c^2\log^2 n'}$. There are at most $n'^{\,2c\log n'}$ possible choices of $U, W$, so with probability at least $1 - e^{(2c - c^2\log 2)\log^2 n'}$ the claim follows.

Suppose $k \geq c\log n'$ for some sufficiently large constant $c$. The algorithm repeatedly chooses sets $Z = V \setminus W$ that define transitive (acyclic) tournaments within $T_{n',r'}$. The sets $Z$ are considered in nonincreasing order of $z = |Z| \geq k$. We claim that with high probability, we will have $Z = S_i$ for one of the sets $S_i$, so we choose such a $Z$ and discard all later $Z'$ that intersect $Z$, with $|Z'| \leq |Z|$. The algorithm ends when there are no more $Z$ with $z \geq k$.

Theorem 5.5
The second phase correctly selects the remaining $S_i$ with $s_i \geq k$, with probability at least $1 - e^{(2c - c^2\log 2)\log^2 n'} - 2^{-\Omega(k)}$, for $k \geq 24\log n'$ and $n'$ sufficiently large. The running time is bounded by $e^{O(c\log^2 n')}$.

Proof.
It only remains to show that all chosen $Z$ are $S_i$, with probability at least $1 - 2^{-\Omega(k)}$. If not, $Z$ meets at least two $S_i$. Let $Z = \bigcup_i Z_i$, where $Z_i = Z \cap S_i$, with $z_i = |Z_i|$. Order the $Z_i$ in decreasing order of $z_i$, and let $\hat{i}$ be such that $\sum_{i < \hat{i}} z_i < z/2$ and $\sum_{i > \hat{i}} z_i < z/2$.

Let $\ell = z - z_{\hat{i}}$. The $z_i$ can be partitioned into two sets in one of the two ways $A = \{z_1, \ldots, z_{\hat{i}}\}$, $B = \{z_{\hat{i}+1}, \ldots, z_t\}$ or $A = \{z_1, \ldots, z_{\hat{i}-1}\}$, $B = \{z_{\hat{i}}, \ldots, z_t\}$, and one of these two partitions has one side of total size at least $z/2$ and the other of total size at least $\ell/2$.

Once the edges within the side $A$ with $|A| \geq z/2$ have been oriented, the probability that a vertex $w$ in the side $B$ with $|B| \geq \ell/2$ will fit in some order among $A$ is at most $(|A| + 1)2^{-|A|} \leq (z/2 + 1)2^{-z/2}$, or $\big((z/2 + 1)2^{-z/2}\big)^{\ell/2}$ over $B$.

The $\ell$ vertices in $Z - Z_{\hat{i}}$ and at most $\ell$ vertices in $S_{\hat{i}} - Z_{\hat{i}}$ (since $s_{\hat{i}} \leq z$) can be chosen in at most $n^{2\ell}$ ways, giving the probability bound
$$n^{2\ell}\big((z/2 + 1)2^{-z/2}\big)^{\ell/2} = 2^{-\ell\left(-2\log_2 n - \frac{1}{2}\log_2(z/2 + 1) + \frac{z}{4}\right)} = 2^{-\frac{\ell z}{4}\left(1 - \frac{8\log_2 n}{z} - \frac{2\log_2(z/2+1)}{z}\right)} \leq 2^{-\Omega(\ell z)} \leq 2^{-\Omega(\ell k)}$$
for $z \geq k \geq 24\log n'$ and $n'$ sufficiently large. Summing over all $\ell \geq 1$ gives a bound of $2^{-\Omega(k)}$.

The third phase starts with a resultant $T_{n'',r''}$ and possible sets $Z$ with $z < 24\log n'$, so $r'' \geq \frac{n''}{24\log n'}$. Each $S_i$ has $e^{O(c\log^2 n')}$ choices of $U, W$, for a total of $e^{O(c\,r''\log^2 n')}$ choices for the $r''$ sets $S_i$ to be selected, giving running time $e^{O(c\,r''\log^2 n')}$.

The third phase seems the most expensive, since the running time is exponential in $r$, versus quasi-polynomial in the second phase and polynomial in the first phase. We can avoid the third phase by approximating the bound $24\log n'$ on color classes with the bound $\log_2 n$ from Theorem 5.1, for an approximation factor of $24\log 2$.

References
[1] J. Bang-Jensen, F. Havet, Finding good 2-partitions of digraphs I. Hereditary properties, Theoretical Computer Science 636 (2016) 88–94.
[2] B. Barak, A. Rao, R. Shaltiel, and A. Wigderson, 2-source dispersers for $n^{o(1)}$