RAINBOW SPANNING TREES IN RANDOM EDGE-COLORED GRAPHS
PETER BRADSHAW
Abstract.
A well-known result of Erdős and Rényi states that if $p = \frac{c \log n}{n}$ and $G$ is a random graph constructed from $G(n, p)$, then $G$ is a.a.s. disconnected when $c < 1$, and $G$ is a.a.s. connected when $c > 1$. When $c > 1$, we may equivalently say that $G$ a.a.s. contains a spanning tree. We find analogous thresholds in the setting of random edge-colored graphs. Specifically, we consider a family $\mathcal{G}$ of $n - 1$ graphs on a common set $X$ of $n$ vertices, each of a different color, and each randomly chosen from $G(n, p)$, with $p = \frac{c \log n}{n^2}$. We show that when $c > 2$, there a.a.s. exists a spanning tree on $X$ using exactly one edge of each color, and we show that such a spanning tree a.a.s. does not exist when $c < 2$.

1. Introduction
Consider a family $\mathcal{G} = \{G_1, \ldots, G_s\}$ of $s$ graphs on a common set $X$ of $n$ vertices. We define a transversal on $\mathcal{G}$ to be a set of $s$ edges $E \subseteq \binom{X}{2}$ for which there exists a bijective function $\phi : E \to [s]$ such that for all $e \in E$, the edge $e$ appears in $G_{\phi(e)}$. If $|E| < s$ and there exists an injective function $\phi : E \to [s]$ satisfying the same property, then we say that $E$ is a partial transversal on $\mathcal{G}$. Informally, we may describe $\mathcal{G}$ as an edge-colored graph, where each color class of edges is given by a graph $G_i \in \mathcal{G}$; a transversal on $\mathcal{G}$ is then a rainbow edge set, that is, a set of colored edges from $\mathcal{G}$ in which each color appears exactly once. (Note that with this description, we allow a single vertex pair of $\binom{X}{2}$ to have multiple edges as long as no two of these edges have the same color.) Similarly, a partial transversal is a set of colored edges from $\mathcal{G}$ in which each color appears at most once.

A graph transversal is a specific type of transversal, where a transversal on a set family $\mathcal{F}$ is defined as a set $S \subseteq \bigcup_{A \in \mathcal{F}} A$ that contains at least one element of each set $A \in \mathcal{F}$. Transversals are often given additional restrictions depending on the setting in which they are being considered (see, for example, [5] and [2]). The notion of a graph transversal allows one to extend certain classical theorems of graph theory into a more general setting. For instance, Aharoni et al. [1] give a transversal analogue of Mantel's theorem, showing that given a graph family $\mathcal{G} = \{G_1, G_2, G_3\}$ on a common set of $n$ vertices in which each $G_i$ has roughly at least $0.2557 n^2$ edges, there must exist a transversal on $\mathcal{G}$ isomorphic to $K_3$, that is, three edges $e_1 \in E(G_1)$, $e_2 \in E(G_2)$, and $e_3 \in E(G_3)$ that form a triangle. Additionally, Joos and Kim [11] give a transversal analogue of Dirac's theorem, showing that given a graph family $\mathcal{G} = \{G_1, \ldots, G_n\}$ on a common vertex set of $n$ vertices, if the minimum degree of each graph $G_i$ is at least $n/2$, then $\mathcal{G}$ has a transversal isomorphic to a Hamiltonian cycle.

In this paper, we will consider graph families $\mathcal{G} = \{G_1, \ldots, G_{n-1}\}$ with $n - 1$ graphs on a common set $X$ of $n$ vertices, where each graph $G_i \in \mathcal{G}$ is constructed randomly. If a statement holds for a random set with probability tending to 1 as $n$ tends to infinity, then we will say that the statement holds asymptotically almost surely, or a.a.s. for short. We will construct each graph $G_i$ using the model $G(n, p)$, often called the Erdős-Rényi model, where each edge $e \in \binom{X}{2}$ is added to $G_i$ independently with some probability $p$. We will be interested in answering the following question: for which values of $p$ does $\mathcal{G}$ a.a.s. contain a spanning tree transversal? If we consider our graph family $\mathcal{G}$ as an edge-colored graph, then a spanning tree transversal on $\mathcal{G}$ is simply a rainbow spanning tree in our edge-colored graph (allowing multiple edges, provided they have different colors, as described above). Many sufficient conditions exist for the existence of one or more rainbow spanning trees in edge-colored complete graphs and general edge-colored graphs (see, for example, [9] and [10]). In particular, Schrijver [14, Chapter 41.1a] and, independently, Suzuki [15] show that a graph on $n$ vertices whose edges are colored with $n - 1$ colors contains a rainbow spanning tree if and only if the removal of all edges of any $k$ ($0 \le k \le n - 2$) colors leaves at most $k + 1$ components, and one may easily show from Suzuki's method that this condition holds even when allowing multiple edges that are colored with distinct colors. Therefore, if we wish to consider rainbow spanning trees in random graphs with random edge-colorings, constructing a random edge-colored graph as the union of sparse random graphs, each of which provides edges of a single color, is a natural model. We note that this is not the only model that can be used to consider random graphs with random edge-colorings; one could also first choose a random graph $G$ from $G(n, p)$ and then give $G$ an edge-coloring using some probability distribution. A similar method is used by Casteigts et al. [4] when considering the related problem of temporal connectivity in random graphs.

Erdős and Rényi show in [6] that when $p = \frac{c \log n}{n}$ and $G$ is randomly constructed from $G(n, p)$, then $G$ a.a.s. contains a spanning tree when $c > 1$, and $G$ a.a.s. contains no spanning tree when $c < 1$. In the setting of a graph family $\mathcal{G} = \{G_1, \ldots, G_{n-1}\}$ on a common $n$-vertex set, when each graph $G_i \in \mathcal{G}$ is chosen according to $G(n, p)$ with $p = \frac{c \log n}{n^2}$, the probability that a given edge $e \in \binom{V(\mathcal{G})}{2}$ appears in the union of all graphs $G_i$ is $(c - o(1)) \frac{\log n}{n}$. Therefore, when $c > 1$, the union $\bigcup_{i=1}^{n-1} G_i$ a.a.s. contains a spanning tree, and when $c < 1$, the union $\bigcup_{i=1}^{n-1} G_i$ a.a.s. contains no spanning tree. It is therefore natural to ask: For which values of $c$ does $\mathcal{G}$ a.a.s. contain a spanning tree transversal? Does the existence of a spanning tree transversal in $\mathcal{G}$ have a similar probability threshold? These questions are answered by the following theorem, which is our main result.

Theorem 1.1.
Let $\mathcal{G} = \{G_1, \ldots, G_{n-1}\}$ be a family of graphs on a common set of $n$ vertices, each constructed randomly from $G(n, p)$, with $p = \frac{c \log n}{n^2}$. When $c > 2$, $\mathcal{G}$ a.a.s. contains a spanning tree transversal, and when $c < 2$, $\mathcal{G}$ a.a.s. does not contain a spanning tree transversal.

The reason that the family $\mathcal{G}$ in Theorem 1.1 a.a.s. contains no spanning tree transversal when $c < 2$, as we will see, is that $\mathcal{G}$ a.a.s. contains a graph with no edge. Hence, the probability threshold for the existence of a spanning tree transversal in $\mathcal{G}$ coincides with the probability threshold for the trivial necessary condition that each graph in $\mathcal{G}$ is nonempty. This type of threshold behavior, in which the threshold for a random event coincides with the threshold for a trivial necessary condition of this event, is common for many events in random graphs; Bollobás gives more examples in [3].

The paper is organized as follows. In Section 2, we prove the nonexistence statement of Theorem 1.1. In Section 3, we prove the existence statement of Theorem 1.1. Finally, in Section 4, we pose some open questions.

2. Lower half of the threshold
In this section, we prove the nonexistence statement of Theorem 1.1, which gives the lower half of the probability threshold in the theorem. We will show in the following proposition that when $c < 2$, $\mathcal{G}$ a.a.s. contains a graph with no edge, which immediately prevents the existence of a transversal on $\mathcal{G}$.

Proposition 2.1.
Let $\mathcal{G} = \{G_1, \ldots, G_{n-1}\}$ be a family of graphs on a common set of $n$ vertices, each constructed randomly from $G(n, p)$, with $p = \frac{c \log n}{n^2}$. If $c < 2$, then a.a.s. at least one graph of $\mathcal{G}$ contains no edge.

Proof. Consider a graph $G_i \in \mathcal{G}$. The probability that $G_i$ contains no edge is equal to
$$(1 - p)^{\binom{n}{2}} > \exp\left( - \frac{p \binom{n}{2}}{1 - p} \right).$$
Thus, the probability $p_0$ that each graph $G_i$ contains at least one edge satisfies
$$p_0 < \left( 1 - \exp\left( - \frac{p \binom{n}{2}}{1 - p} \right) \right)^{n-1} < \exp\left( - \exp\left( \log(n - 1) - \frac{c (n - 1) \log n}{2 n (1 - p)} \right) \right).$$
By applying the equation $\log(n - 1) = \log n + \log\left(\frac{n-1}{n}\right)$ and doing some simplification, we see that
$$p_0 < \exp\left( - \exp\left( \log\left(\frac{n-1}{n}\right) + \frac{-p + 1 - \frac{c}{2} \cdot \frac{n-1}{n}}{1 - p} \cdot \log n \right) \right).$$
When $c < 2$, the coefficient of $\log n$ tends to $1 - \frac{c}{2} > 0$, so the argument of the inner exp function approaches $\infty$, and hence the entire expression bounding $p_0$ above approaches 0, showing that a.a.s. some graph $G_i$ has no edge. □

3. Upper half of the threshold
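Before the existence argument begins, the estimate in the proof of Proposition 2.1 can be illustrated numerically. The following is a minimal sketch and not part of the original argument: the function name `prob_all_nonempty` and the sample values of $n$ and $c$ are assumptions chosen for illustration. It evaluates the exact probability $\left(1 - (1-p)^{\binom{n}{2}}\right)^{n-1}$ that every graph of the family has at least one edge, working with logarithms for numerical stability.

```python
import math


def prob_all_nonempty(n: int, c: float) -> float:
    """Probability that all n - 1 independent G(n, p) graphs are nonempty,
    where p = c * log(n) / n**2 (natural logarithm)."""
    p = c * math.log(n) / n**2
    pairs = n * (n - 1) // 2
    # log of the probability that one graph has no edge: C(n, 2) * log(1 - p)
    log_empty = pairs * math.log1p(-p)
    # log of the probability that every one of the n - 1 graphs has an edge
    log_all = (n - 1) * math.log1p(-math.exp(log_empty))
    return math.exp(log_all)


if __name__ == "__main__":
    # Illustrative values only; the theorem itself concerns n -> infinity.
    for c in (1.0, 1.9, 2.1, 3.0):
        print(c, prob_all_nonempty(10**6, c))
```

For moderate $n$ the transition near $c = 2$ is gradual; the a.a.s. statements describe only the limiting behavior, which is why the sketch is best read with values of $c$ well away from 2.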
In this section, we will prove the existence statement of Theorem 1.1. We fix $c' > 2$, and we will consider a family $\mathcal{G}' = \{G'_1, \ldots, G'_{n-1}\}$ of $n - 1$ graphs on a common set $X$ of $n$ vertices, each chosen randomly from $G(n, p')$ with $p' = \frac{c' \log n}{n^2}$. We will show that a.a.s. we can construct a spanning tree transversal on $\mathcal{G}'$. (It will later become clear that it is more notationally convenient to call our graph family $\mathcal{G}'$ and not $\mathcal{G}$.)

Throughout our argument, we will often use language that suggests that the edges of the graphs in our family $\mathcal{G}'$ have colors. Specifically, for each graph $G'_i \in \mathcal{G}'$, we will say that the edges of $G'_i$ have the color $i$. Then, as we have explained in the introduction, we will be able to consider $\mathcal{G}'$ as a loopless multigraph on $X$ in which no two parallel edges have the same color. We also tacitly assume that $n$ is large, and we omit floors and ceilings, as they do not affect our arguments.

3.1. Tools.
We establish some tools that we will use throughout the proof of Theorem 1.1. We give these tools their own subsection so as not to interrupt the flow of the proof. First, we will need the well-known Markov inequality, which appears, for instance, as Theorem 3.1 of [13].
Theorem 3.1.
Let $Y$ be a random variable that takes only nonnegative values. Then for any real number $a > 0$,
$$\Pr(Y \ge a) \le \frac{\mathbb{E}[Y]}{a}.$$
Next, we will use the following forms of the Chernoff bound, which all appear, for example, in Chapter 4 of [13].
Theorem 3.2.
Let $Y$ be a random variable that is the sum of mutually independent indicator variables, that is, variables taking values in $\{0, 1\}$. Let $\mu$ be the expected value of $Y$. Then for any value $\delta \in (0, 1)$,
$$\Pr(Y < (1 - \delta)\mu) \le \left( \frac{e^{-\delta}}{(1 - \delta)^{1 - \delta}} \right)^{\mu},$$
and for any value $\delta > 0$,
$$\Pr(Y > (1 + \delta)\mu) \le \left( \frac{e^{\delta}}{(1 + \delta)^{1 + \delta}} \right)^{\mu}.$$

Theorem 3.3.
Let $Y$ be a random variable that is the sum of mutually independent indicator variables, that is, variables taking values in $\{0, 1\}$. Let $\mu$ be the expected value of $Y$. Then for any value $\delta \in (0, 1]$,
$$\Pr(Y > (1 + \delta)\mu) \le \exp\left( - \frac{\delta^2 \mu}{3} \right),$$
and for any value $\delta \in (0, 1)$,
$$\Pr(Y < (1 - \delta)\mu) \le \exp\left( - \frac{\delta^2 \mu}{2} \right).$$

Lastly, we will need a lemma about the number of edges between components of a disconnected forest.
Lemma 3.4. If F is a forest on X with at least n − (log n ) edges and κ ≥ components, then there existat least nκ edges e ∈ (cid:0) X (cid:1) for which the endpoints of e are in distinct components of F .Proof. We let M denote the number of vertices in the largest component in F . We let E be the set of edges e ∈ (cid:0) X (cid:1) for which e has endpoints in two distinct components of F . We consider two cases.(1) If M ≤ n − κ , then we use the fact that | E | is at least the product of M and the size of the secondlargest component of F , which must be at least n − Mκ − . Thus, | E | ≥ n − Mκ − · M. This lower bound on | E | , when considered as a function of M , is a concave parabola and thus isminimized by making M either as small or as large as possible. Since κ ≤ (log n ) + 1, we must havethat M ≥ n (log n ) +1 , and when this inequality on M is tight, | E | ≥ n − n (log n ) +1 κ − · n (log n ) + 1 > κn. n the other hand, when M = n − κ , | E | > κ ( n − κ ) > κn. This completes the first case.(2) If
M > n − κ , then we use the fact that | E | ≥ M ( κ − > κn , and the second case is complete. □

3.2. Strategy outline.
We will let our graph family $\mathcal{G}'$ be realized as the union of many random graph families as follows. First, we let $\mathcal{G} = \{G_1, \ldots, G_{n-1}\}$ be a family of $n - 1$ graphs on $X$. We let $c = \frac{1}{2} c' + 1$, and we let each graph $G_i \in \mathcal{G}$ be randomly chosen from $G(n, p)$, with $p = \frac{c \log n}{n^2}$. We note that $2 < c < c'$. Additionally, we define a collection of $(\log n)^2$ graph families that are much sparser than $\mathcal{G}$. For $2 \le \kappa \le (\log n)^2 + 1$, we let $\mathcal{S}^{(\kappa)}$ be a family of $n - 1$ graphs $S^{(\kappa)}_1, \ldots, S^{(\kappa)}_{n-1}$, each constructed randomly from $G(n, s_\kappa)$, where $s_\kappa = \frac{\sqrt{\log n}}{n^2 \kappa}$. Then, we define $\mathcal{G}'$ so that for each graph $G'_i \in \mathcal{G}'$,
$$G'_i = G_i \cup S^{(2)}_i \cup \cdots \cup S^{((\log n)^2 + 1)}_i.$$
Using a generous upper bound for harmonic numbers, we estimate that for sufficiently large $n$,
$$p + s_2 + s_3 + \cdots + s_{(\log n)^2 + 1} < p + \frac{3 \sqrt{\log n} \, \log \log n}{n^2} < p' = \frac{c' \log n}{n^2},$$
and hence it follows that an edge $e \in \binom{X}{2}$ is added to a graph $G'_i$ with probability less than $p'$. Therefore, in order to prove Theorem 1.1, it is sufficient to show that $\mathcal{G}'$ a.a.s. contains a spanning tree transversal when constructed as a union of random graph families in this way. We note that as $p$ is much larger than $s_2 + s_3 + \cdots + s_{(\log n)^2 + 1}$, most of the edges in our graph family $\mathcal{G}'$ will come from $\mathcal{G}$, and only a few edges will come from our families $\mathcal{S}^{(\kappa)}$. In fact, for a given value $\kappa \gg \sqrt{\log n}$, we expect almost all of the graphs in $\mathcal{S}^{(\kappa)}$ to be empty.

The technique of letting each graph $G'_i \in \mathcal{G}'$ be equal to the union of multiple random graphs is known as multiple exposure. The reason we use this technique is that at certain points in our argument, we need to estimate the probability that certain edges exist within specific subsets of $\binom{X}{2}$. However, if we consider a certain subset of $\binom{X}{2}$ multiple times in our argument, we cannot be sure that the probabilities we need to estimate in a later part of our argument are independent of probabilities that we have estimated in earlier parts of our argument. Therefore, it is often convenient to use a "fresh" random set of edges that is independent of all previous choices, so that we can guarantee that the probabilities we estimate are independent of all previously measured probabilities.
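The probability budget behind this multiple exposure can be checked with a short computation. The sketch below is illustrative only: the function names and the sample inputs are assumptions, not part of the paper. It works with $\log n$ as the input and scales all probabilities by $n^2$, since the inequality $p + \sum_\kappa s_\kappa < p'$ holds only once $\log n$ is fairly large, far beyond values of $n$ that can be sampled directly.

```python
import math


def scaled_exposure_budget(log_n: float, c_prime: float):
    """Return (n^2 * p, n^2 * sum of s_kappa, n^2 * p') for given log n and c'.

    Uses c = c'/2 + 1 and s_kappa = sqrt(log n) / (n^2 * kappa) for
    kappa = 2, ..., floor((log n)^2) + 1, as in the strategy outline.
    """
    c = c_prime / 2 + 1                     # satisfies 2 < c < c' when c' > 2
    num_families = int(log_n ** 2)          # floor of (log n)^2
    harmonic_tail = sum(1.0 / kappa for kappa in range(2, num_families + 2))
    scaled_p = c * log_n
    scaled_s_total = math.sqrt(log_n) * harmonic_tail
    scaled_p_prime = c_prime * log_n
    return scaled_p, scaled_s_total, scaled_p_prime


def fits_in_budget(log_n: float, c_prime: float) -> bool:
    """Check whether p + s_2 + ... + s_{(log n)^2 + 1} < p'."""
    scaled_p, scaled_s_total, scaled_p_prime = scaled_exposure_budget(log_n, c_prime)
    return scaled_p + scaled_s_total < scaled_p_prime


if __name__ == "__main__":
    print(fits_in_budget(math.log(10**6), 4.0))  # a small n: budget exceeded
    print(fits_in_budget(100.0, 4.0))            # log n = 100: budget holds
```

By the union bound, the probability that an edge lies in $G'_i$ is at most $p + \sum_\kappa s_\kappa$, so the check above is sufficient for the coupling with $G(n, p')$ whenever it succeeds.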
A simpler form of multiple exposure is used to prove a classical result by Fernandez de la Vega [8] (explained in [3, Chapter 8]) that finds a long path in a random graph.

As $\mathcal{G}$ contains the bulk of the edges in our random graph family $\mathcal{G}'$, we will rely mainly on $\mathcal{G}$ to find our tree partial transversal on $\mathcal{G}'$. It is only when we get stuck using $\mathcal{G}$ that we will turn to our sparse families $\mathcal{S}^{(\kappa)}$ for assistance. We will use our graph families $\mathcal{G}$ and $\mathcal{S}^{(2)}, \ldots, \mathcal{S}^{((\log n)^2 + 1)}$ as follows. First, we will show that by a random method, and with the help of a classical result of Erdős and Rényi on perfect matchings in random bipartite graphs, we can a.a.s. find a forest partial transversal $F$ on $\mathcal{G}$ that contains at least $n - (\log n)^2$ edges. Then, we will use our families $\mathcal{S}^{(\kappa)}$ to add edges of colors not appearing in $F$ between components of $F$ in order to extend $F$ into a tree partial transversal on $\mathcal{G}'$. However, if we attempt to do this directly, we will likely find that our families $\mathcal{S}^{(\kappa)}$ are too sparse and that we cannot find an edge in an available color between two components of $F$. Therefore, in order to complete $F$, we will need to use the idea of replacing edges in $F$, which we define as follows.

Definition 3.5. Let $F$ be a forest. We say that an edge $r \in E(F)$ is replaceable by an edge $e \in \binom{V(F)}{2}$ if $F + e$ has a cycle containing $r$.

In Definition 3.5, if $e$ shares both endpoints with $r$, then we consider $F + e$ to be a multigraph with a 2-cycle consisting of the edges $e$ and $r$, and thus we say that $r$ is replaceable by $e$.

We give an informal explanation of how the idea of replacing edges in $F$ might help us. Suppose that $F$ has two components, and the only color missing from $F$ is red. We would like to find a red edge $e^*$ in one of our families $\mathcal{S}^{(\kappa)}$ that connects the two components of $F$ and add $e^*$ to $F$ in order to extend $F$ to a tree partial transversal. However, it is possible that no such red edge $e^*$ exists.
In that case, we may instead consider a red edge $e$ in $\mathcal{G}$, which a.a.s. must exist. If $e$ connects the two components of $F$, then we have our tree transversal; otherwise, $F + e$ contains a cycle $C$. Then, we may modify $F$ by adding $e$ and removing some other edge $r \in E(C)$ of a different color, say yellow. Now, we may look for a yellow edge in one of our families $\mathcal{S}^{(\kappa)}$ that connects the two components of $F$. If no such yellow edge exists, then we can repeat the process above, replacing an edge of $F$ and searching for edges in some $\mathcal{S}^{(\kappa)}$ of a third color that connect the components of $F$. The idea is that if we can repeat this process for enough colors, then a family $\mathcal{S}^{(\kappa)}$ must eventually have an edge connecting the components of $F$ in a desired color. We will also be able to apply this idea when considering a forest with more than two components.

3.3. Some properties of $\mathcal{G}$.
We will establish some properties of our family $\mathcal{G}$ that a.a.s. hold. After establishing these properties, we will be able to make certain deterministic statements about $\mathcal{G}$ without invoking probability.

Claim 3.6. $\mathcal{G}$ a.a.s. contains a forest partial transversal with at least $n - (\log n)^2$ edges.

Proof. We consider a random digraph family $\mathcal{H} = \{H_1, \ldots, H_{n-1}\}$. For each non-loop arc $a \in X \times X$ and each graph $H_i \in \mathcal{H}$, we add $a$ to $H_i$ with probability $q = \frac{c \log n}{2 n^2}$. We observe that for each $i \in [n - 1]$ and for an edge $e \in \binom{X}{2}$, $e$ appears in $H_i$ with some orientation with probability less than $p$. Therefore, if we can show that the claim holds for $\mathcal{H}$ after removing edge orientations, then this will also show that the claim holds for $\mathcal{G}$.

We construct an auxiliary bipartite graph $B$. For one partite set of $B$, we use the set $[n - 1]$. For the other partite set of $B$, we use the set $X^-$, which is obtained from $X$ by removing an arbitrary vertex. For a value $j \in [n - 1]$ and a vertex $v \in X^-$, we add an edge between $j$ and $v$ if and only if there exists an arc $a \in A(H_j)$ outgoing from $v$. (Here, $A(H_j)$ is the set of arcs in $H_j$.)

We claim that $B$ is a random bipartite graph, in the sense that every potential edge of $[n - 1] \times X^-$ is added to $B$ independently with a fixed probability. Indeed, let $j \in [n - 1]$ and $v \in X^-$. In $X \times X$, there are $n - 1$ non-loop arcs outgoing from $v$, and each arc belongs to $H_j$ with probability $q$. Therefore, the probability that $v$ is adjacent to $j$ in $B$ is equal to
$$q^* = 1 - (1 - q)^{n-1},$$
and thus we see that $B$ is a randomly constructed bipartite graph. We note that by a rough estimate using the binomial theorem,
$$q^* > q(n - 1) - q^2 \binom{n-1}{2} = q(n - 1)\left(1 - \tfrac{1}{2} q (n - 2)\right) > \frac{(1 + \epsilon) \log n}{n},$$
for some sufficiently small constant $\epsilon > 0$. Then, by a classical theorem of Erdős and Rényi [7] (originally stated for the permanents of random matrices), $B$ a.a.s. contains a perfect matching.

Next, we note that there exists a bijection between perfect matchings in $B$ and pairs $(D, \phi)$, where $D$ is a digraph on $X$ in which every vertex of $X^-$ has out-degree exactly 1, and $\phi : A(D) \to [n - 1]$ is a bijective function such that for each $a \in A(D)$, $a \in A(H_{\phi(a)})$. As $B$ a.a.s. contains a perfect matching, such a pair $(D, \phi)$ a.a.s. exists, and we randomly choose such a pair $(D, \phi)$. Note that after removing the orientations from the arcs in $A(D)$ (and keeping any multiple edges), $(D, \phi)$ is a transversal on $\mathcal{H}$, and hence it follows that any subgraph of $D$ admits a partial transversal after removing edge orientations. Therefore, we may obtain a forest partial transversal from $D$ by deleting an edge from each cycle. (Note that deleting an edge from each cycle of $D$ will delete an edge from each multiple edge pair, so the resulting graph will be simple.) With this in mind, we estimate the number of cycles in $D$.

Since only the starting points of the arcs of the digraphs in $\mathcal{H}$ affect our choice of $D$, we may consider the endpoints of the arcs in $D$ to be chosen at random. Therefore, we may consider that $D$ is randomly constructed by starting an arc at each vertex $v \in X^-$ and ending the arc at a random vertex of $X \setminus \{v\}$. We also observe that if $D$ contains any cycle $C$, then we must have $V(C) \subseteq X^-$, and furthermore, as each vertex of $D$ has out-degree at most 1, $C$ must be a directed cycle. Therefore, if $C$ is a fixed undirected cycle and $|V(C)| = k$, then the probability that $C$ appears in $D$ is equal to $2 \left( \frac{1}{n-1} \right)^k$. Then, since the number of undirected cycles of length $k$ on $X^-$ is at most $\frac{(n-1)^k}{2k}$, we may estimate the expected number of cycles in $D$ as follows:
$$\mathbb{E}(\text{number of cycles in } D) \le \sum_{k=2}^{n-1} \frac{(n-1)^k}{2k} \cdot 2 \left( \frac{1}{n-1} \right)^k = \sum_{k=2}^{n-1} \frac{1}{k} < \log n.$$
Therefore, by Markov's inequality (Theorem 3.1), it a.a.s. holds that $D$ has at most $(\log n)^2 - 1$ cycles. By deleting one edge from each cycle of $D$, we obtain a forest partial transversal on $\mathcal{H}$ containing at least $n - (\log n)^2$ edges. □

Lemma 3.7.
For any unbounded increasing function $\omega(n)$, it a.a.s. holds that for any subset $\mathcal{H} \subseteq \mathcal{G}$ satisfying $|\mathcal{H}| \le \frac{n}{\omega(n) \log n}$, at least $|\mathcal{H}| \cdot \frac{\log n}{\omega(n)}$ vertices of $X$ have an incident edge in a graph of $\mathcal{H}$.

Proof. Let $\mathcal{H} \subseteq \mathcal{G}$ be fixed, and let $|\mathcal{H}| = k$. We order the vertices of $X$ as $v_0, v_1, \ldots, v_{n-1}$, and we consider the random graph $H$ on $X$ with the edge set $E(H) = \bigcup_{G \in \mathcal{H}} E(G)$. For each edge $e = v_i v_j \in E(H)$, assuming without loss of generality that $i < j$, we orient $e$ from $v_i$ to $v_j$.

We would like to count the number $Y$ of vertices in $X$ with positive in-degree in $H$. For each value $i \in \{0, 1, \ldots, n - 1\}$, $v_i$ has $i$ possible in-neighbors. Therefore, the probability that a vertex $v_i$ has no in-neighbor in $H$ is equal to $(1 - p)^{ik}$, and the expected value of $Y$ is equal to
$$\sum_{i=0}^{n-1} \left( 1 - (1 - p)^{ik} \right) = n - \sum_{i=0}^{n-1} (1 - p)^{ik} = n - \frac{1 - (1 - p)^{kn}}{1 - (1 - p)^k}.$$
We write
$$\frac{1}{1 - (1 - p)^k} = \frac{1 + \zeta}{kp}, \quad \text{where } \zeta = \frac{kp}{1 - (1 - p)^k} - 1.$$
By applying the binomial theorem to $\zeta$, we may estimate that
$$\zeta = \frac{kp - \left(1 - (1 - p)^k\right)}{1 - (1 - p)^k} = \frac{\binom{k}{2} p^2 - \binom{k}{3} p^3 + \cdots}{kp - \binom{k}{2} p^2 + \binom{k}{3} p^3 - \cdots} < kp = o\left( \frac{1}{n} \right).$$
Then, by applying the binomial theorem, the expected value $\mu = \mathbb{E}[Y]$ is
$$\mu = n - \left( 1 - (1 - p)^{kn} \right) \left( \frac{1 + \zeta}{kp} \right) = n - \left( pkn - p^2 \binom{kn}{2} + p^3 \binom{kn}{3} - \cdots \right) \left( \frac{1 + \zeta}{kp} \right)$$
$$= n - (1 + \zeta) \left( n - \frac{1}{2} p n (kn - 1) + \frac{1}{6} p^2 n (kn - 1)(kn - 2) - \cdots \right).$$
Since $k \le \frac{n}{\omega(n) \log n}$, $pkn \to 0$. Therefore, the binomial expansion above is of the order $\Theta(n)$, and hence $\zeta$ will make an overall contribution of $o(1)$ to the final value of $\mu$. Hence, we may ignore $\zeta$ and cancel our $n$ terms, after which we see that
$$\mu > \left( \frac{1}{2} - o(1) \right) k p n^2.$$
Then, using the fact that $c$ is a constant satisfying $c > 2$, we see that
$$\mu > (1 + 2\beta) k \log n$$
for some constant $\beta > 0$. We would like to show that with high probability $Y \ge \frac{k \log n}{\omega(n)}$. The indicator variables summed by $Y$ depend on disjoint sets of potential edges and are therefore mutually independent, so a Chernoff bound may be applied to $Y$. We use the first Chernoff bound from Theorem 3.2, which states
$$\Pr(Y < (1 - \delta)\mu) \le \left( \frac{e^{-\delta}}{(1 - \delta)^{1 - \delta}} \right)^{\mu} = \left( e (1 - \delta)^{\frac{1 - \delta}{\delta}} \right)^{-\delta \mu}.$$
We let $\delta = 1 - \frac{1}{\omega(n)}$; then, since $\lim_{\delta \to 1} (1 - \delta)^{\frac{1 - \delta}{\delta}} = 1$, we have that
$$\Pr\left( Y < \frac{k \log n}{\omega(n)} \right) \le \Pr(Y < (1 - \delta)\mu) \le \exp\left( -(1 - o(1)) \delta \mu \right) \le \exp\left( -(1 + \beta) k \log n \right).$$
Therefore, summing over all valid subsets $\mathcal{H} \subseteq \mathcal{G}$, the probability that the lemma does not hold for some subset $\mathcal{H} \subseteq \mathcal{G}$ of appropriate size is less than
$$\sum_{k=1}^{\infty} \binom{n-1}{k} \exp(-(1 + \beta) k \log n) < \sum_{k=1}^{\infty} \exp(k \log n - (1 + \beta) k \log n) = \sum_{k=1}^{\infty} \exp(-\beta k \log n) = \frac{\exp(-\beta \log n)}{1 - \exp(-\beta \log n)} = o(1).$$
Therefore, a.a.s., the lemma holds for all subsets $\mathcal{H} \subseteq \mathcal{G}$ of appropriate size. □
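The expectation computed in the proof of Lemma 3.7 can be sanity-checked numerically. The following is a small sketch under stated assumptions: the function names and the sample values $n = 10^4$, $k = 5$, $c = 3$ are hypothetical and chosen only for illustration. It compares the direct sum $\sum_{i=0}^{n-1} \left(1 - (1-p)^{ik}\right)$ with its geometric-series closed form and with the leading term $\frac{1}{2} k p n^2$.

```python
import math


def expected_positive_indegree(n: int, k: int, c: float) -> float:
    """Direct sum over i = 0..n-1 of 1 - (1 - p)^(i*k), with p = c*log(n)/n^2."""
    p = c * math.log(n) / n**2
    log_one_minus_p = math.log1p(-p)
    # 1 - exp(x) is computed as -expm1(x) to avoid cancellation for small |x|.
    return sum(-math.expm1(i * k * log_one_minus_p) for i in range(n))


def closed_form(n: int, k: int, c: float) -> float:
    """Closed form n - (1 - r^n) / (1 - r), where r = (1 - p)^k."""
    p = c * math.log(n) / n**2
    log_r = k * math.log1p(-p)
    return n - math.expm1(n * log_r) / math.expm1(log_r)


if __name__ == "__main__":
    n, k, c = 10**4, 5, 3.0
    mu = expected_positive_indegree(n, k, c)
    leading = 0.5 * k * (c * math.log(n) / n**2) * n**2
    print(mu, closed_form(n, k, c), leading)
```

For these sample values the exact expectation is slightly below $\frac{1}{2} k p n^2$, in line with the $(\frac{1}{2} - o(1))$ factor in the proof.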
3.4. Growing our forest partial transversal on $\mathcal{G}'$: iterative replacement.
Now we will begin to build our tree partial transversal on $\mathcal{G}'$. We will start with a forest partial transversal $F$ on $\mathcal{G}$ with a corresponding injective function $\lambda : E(F) \to [n - 1]$, where, as guaranteed a.a.s. by Claim 3.6, $F$ contains at least $n - (\log n)^2$ edges. As explained in Section 3.2, we wish to grow our forest partial transversal by taking edges from our sparse families $\mathcal{S}^{(\kappa)}$ and adding them to $F$. In order to ensure that we can add edges of appropriate colors, we will often need to replace edges in our forest partial transversals. In the following claim, we show that by using edges of $\mathcal{G}$ to make replacements on a forest partial transversal sufficiently many times, we can always find many forest partial transversals that altogether miss a large number of colors in $[n - 1]$.

In what follows, if $F^*$ is a forest partial transversal on $X$ with a corresponding injective function $\phi$, then a partial transversal on $\mathcal{G} \cup E(F^*)$ is a partial transversal on the graph family obtained from $\mathcal{G}$ by adding to each $G_i$ the edge $e \in E(F^*)$ for which $\phi(e) = i$, if one exists. In other words, we consider a partial transversal on $\mathcal{G} \cup E(F^*)$ to be a rainbow graph that uses colored edges from both $\mathcal{G}$ and $F^*$.

Claim 3.8.
Let $F^*$ be a forest partial transversal on $\mathcal{G}'$ with a corresponding injective function $\phi$. Let $\sigma \in [n - 1] \setminus \phi(E(F^*))$; that is, let $\sigma$ be a color "missed" by $F^*$. Let $J \subseteq \phi(E(F^*)) \cup \{\sigma\}$ denote the set of values $j$ for which there exists a forest partial transversal $F$ on $\mathcal{G} \cup E(F^*)$ with an associated injective function $\psi$ satisfying the following properties:
• The components of $F$ partition $X$ in an identical way as the components of $F^*$.
• $\psi(E(F)) = \left( \phi(E(F^*)) \cup \{\sigma\} \right) \setminus \{j\}$.
If there exists no forest partial transversal on $\mathcal{G} \cup E(F^*)$ with $|E(F^*)| + 1$ edges, then $|J| \ge \frac{n}{\log \log n}$.

We explain the second condition on the pairs $(F, \psi)$ considered by Claim 3.8 in plain English. The second condition states that we would only like to consider those forest partial transversals $F$ whose colors consist only of colors used by the edges of $F^*$ and possibly also $\sigma$. Having found such a forest partial transversal $F$, the unique color among $\phi(E(F^*)) \cup \{\sigma\}$ missed by $\psi$ should then be added to the set $J$. Then, the conclusion of the claim states that if we cannot construct a rainbow forest with more edges than $F^*$ using just colored edges of $F^*$ and $\mathcal{G}$, then this set $J$ of colors missed by our forests $F$ must be large. The remainder of Section 3.4 will be dedicated to proving this claim.

We note that Claim 3.8 uses the random graph family $\mathcal{G}$ but makes a deterministic statement without any mention of probability. The reason that we may make deterministic statements about $\mathcal{G}$ without invoking probability is that we assume that Lemma 3.7 holds for $\mathcal{G}$. It may be helpful to the reader to know that we first intend to apply Claim 3.8 with $F^* = F$ and $\phi = \lambda$, but afterward we will need to be able to apply the claim to other forest partial transversals $F^*$.

Proof of Claim 3.8:
We assume that there exists no forest partial transversal on $\mathcal{G} \cup E(F^*)$ with $|E(F^*)| + 1$ edges. Let $\sigma \in [n - 1] \setminus \phi(E(F^*))$. We define an infinite rooted tree $\mathcal{T}$ which will store information about forest partial transversals that satisfy the properties of Claim 3.8. Each node of $\mathcal{T}$ will store a pair $(F, j)$, where $F$ is a forest partial transversal spanning $X$ with an injective function $\psi$, and $\{j\} = \left( \phi(E(F^*)) \cup \{\sigma\} \right) \setminus \psi(E(F))$. With this definition, $j$ is the unique color from $\phi(E(F^*)) \cup \{\sigma\}$ that is "missed" by the edges of $F$. For a pair $(F, j)$ stored in a node $\nu$ of $\mathcal{T}$, we say that $j$ is the color of $\nu$. If a node $\nu$ of $\mathcal{T}$ stores a pair $(F, j)$, then we will often identify $\nu$ and $(F, j)$.

We construct $\mathcal{T}$ by a recursive procedure that considers each leaf of $\mathcal{T}$ and then adds children in $\mathcal{T}$ to that leaf according to certain rules. The procedure is carried out as follows.

Construct $\mathcal{T}$:
(1) Let $\mathcal{T}$ have a root $(F^*, \sigma)$.
(2) Execute (3) infinitely many times.
(3) For each leaf $L$ of $\mathcal{T}$ storing a pair $(F, j)$ with an injective function $\psi$, execute the following steps:
(a) Let $R$ denote the set of edges $r \in E(F)$ replaceable by an edge of $E(G_j)$.
(b) For each edge $r \in R$ replaceable by an edge $e \in E(G_j)$, add a child node of $L$ storing the pair $(F + e - r, \psi(r))$. Let $F + e - r$ have an injective function $\psi'$ that agrees with $\psi$ on all of $E(F) \setminus \{r\}$ and such that $\psi'(e) = j$.

We show an example of a tree $\mathcal{T}$ constructed by this process in Figure 1. We note that with this construction, a pair $(F, j)$ that appears once in $\mathcal{T}$ will appear infinitely often in $\mathcal{T}$. This is not a concern for us.

We argue that each pair $(F, j)$ stored by $\mathcal{T}$ satisfies the properties of Claim 3.8. The pair $(F^*, \sigma)$ clearly satisfies the properties of Claim 3.8. Next, suppose that some pair $(F, j)$ satisfies the properties of Claim 3.8; we show that the children of $(F, j)$ also satisfy these properties. In more detail, let $e \in E(G_j)$, let $r \in E(F)$ be replaceable by $e$, and let $F$ have an injective function $\psi$. We show that the pair $(F + e - r, \psi(r))$ satisfies the properties of Claim 3.8. First, exchanging the edge $r$ with $e$ in $F$ does not change the way that the forest's components partition $X$, since $r$ lies on the cycle of $F + e$. Next, $F + e - r$ is a forest on $X$. Furthermore, as $j \notin \psi(E(F))$, the function $\psi'$ given in the definition of $\mathcal{T}$ is an injective function satisfying the properties of a partial transversal. Finally, $\psi(r)$ is the only color missed by $\psi'$ on $F + e - r$. Hence, every node $(F, j) \in \mathcal{T}$ satisfies the properties of Claim 3.8.

We would like to show that $\mathcal{T}$ stores a large number of distinct colors in its nodes, as this will ultimately allow us to show that the set $J$ defined in Claim 3.8 is large. In order to show that $\mathcal{T}$ stores a large number of distinct colors, we will compute a finite subtree $\mathcal{T}_f \subseteq \mathcal{T}$, each of whose nodes stores a distinct color, and we will show that $\mathcal{T}_f$ contains many nodes.

Figure 1.
The figure shows the tree $\mathcal{T}$ as constructed when $F^*$ is a forest partial transversal consisting of two 2-paths, with $\phi(E(F^*)) = \{1, 2, 3, 4\}$, and with $\sigma = 5$. The root of the tree contains the pair $(F^*, 5)$. The children of the root of $\mathcal{T}$ are computed by considering the forests obtained after replacing an edge of $F^*$ with a new edge of color 5. The leftmost child of the root of $\mathcal{T}$ is obtained by replacing the edge of color 4 in $F^*$. This child contains a pair $(F', 4)$, where $F'$ is the forest shown in the figure. The children of $(F', 4)$ are computed similarly, by considering the forests obtained by replacing an edge of $F'$ by an edge of color 4. The children of the other nodes are computed similarly. In each depicted node $\nu = (F, j)$, the dashed edges represent edges in $\mathcal{G} \cup E(F^*)$ of the color $j$ missed by $F$, and the bolded edges represent edges of $F$ that may be replaced to produce children of $\nu$.

Throughout the proof, we write $\omega(n) = \log \log \log n$. We compute $\mathcal{T}_f$ in two parts. The first of these parts is the following algorithm, which will output a finite subtree $\mathcal{T}_f \subseteq \mathcal{T}$ containing at least $\frac{n}{\omega(n) \log n}$ nodes, each of whose nodes stores a distinct color.

Construct $\mathcal{T}_f$ (I):
(1) Let $\mathcal{T}_f$ have a root $(F^*, \sigma)$.
(2) Set $h \leftarrow 0$.
(3) Let $\mathcal{L}_h$ be the set of vertices of $\mathcal{T}_f$ at height $h$.
(4) For each pair $(F, j) \in \mathcal{L}_h$, add to $\mathcal{T}_f$ all children of $(F, j)$ in $\mathcal{T}$ whose color has not yet appeared in $\mathcal{T}_f$, adding at most one child of each color.
(5) Set $h \leftarrow h + 1$.
(6) Go to (3).

As $\mathcal{T}_f$ can contain at most $n - 1$ nodes, one for each color, the Construct $\mathcal{T}_f$ (I) procedure must terminate. We first would like to show that $\mathcal{T}_f$ grows at a predictable rate and eventually contains at least $\frac{n}{\omega(n) \log n}$ nodes. To this end, let $M$ be the maximum integer value for which all values $h < M$ satisfy $|\mathcal{L}_h| \le \frac{n}{\omega(n) \log n}$. If no such maximum exists, we let $M = \infty$. We define $t_{-1} = -1 + \frac{\omega(n)}{\log \log n}$.

Claim 3.9.
For $0 \le h \le M$, the total number of elements in $\mathcal{L}_h$ is a value $(\log n)^{t_h}$, where $t_h \ge t_{h-1} + 1 - \frac{\omega(n)}{\log\log n}$.

Proof of Claim 3.9: We prove this claim by induction. When $h = 0$, $\mathcal{L}_h$ consists of the single element $(F^*, \sigma)$. Therefore, the number of elements in $\mathcal{L}_0$ is $1 = (\log n)^0$; hence, $t_0 = 0 = t_{-1} + 1 - \frac{\omega(n)}{\log\log n}$.

Next, we show that if the claim holds for values up to some $h < M$, then the claim also holds for $h + 1$. In other words, we assume that $\mathcal{L}_h$ contains $(\log n)^{t_h}$ pairs $(F, j)$, and we would like to show that $\mathcal{L}_{h+1}$ contains at least $(\log n)^{t_h + 1 - \frac{\omega(n)}{\log\log n}}$ pairs. When $\mathcal{L}_{h+1}$ is constructed, for each pair $(F, j) \in \mathcal{L}_h$, we compute all children $(F', j')$ of $(F, j)$ for which $j'$ has not yet appeared as a color in $\mathcal{T}_f$. We would like to estimate the total number of these child nodes $(F', j')$, as these child nodes make up $\mathcal{L}_{h+1}$.

In order to estimate the number of nodes in $\mathcal{L}_{h+1}$, we will need to count the total number of colors among edges in forests of $\mathcal{L}_h$ that are replaced during the iteration of Step (4) that generates $\mathcal{L}_{h+1}$. In order to make this estimation, consider a pair $(F, j) \in \mathcal{L}_h$. Every child of $(F, j)$ is obtained by choosing an edge $e \in E(G_j)$ and using $e$ to replace some edge $r \in E(F)$. If the color of $r$ has not yet appeared in a node of $\mathcal{T}_f$, then the color of $r$ will appear in a child node of $(F, j)$. Furthermore, we know that every edge of $F$ that also belongs to $F^*$ has a distinct color; therefore, if we can show that the number of replaced edges in $E(F) \cap E(F^*)$ is much larger than the number of colors already appearing in $\mathcal{T}_f$, then this will show that $(F, j)$ has many children. In fact, we will not consider single nodes $(F, j)$, but rather all nodes of $\mathcal{L}_h$ at the same time; our process for estimating the overall number of child nodes that make up $\mathcal{L}_{h+1}$ will otherwise be just as we have described here.

We now rigorously compute a lower bound for the number of nodes in $\mathcal{L}_{h+1}$. For a pair $(F, j) \in \mathcal{L}_h$, let $R_{(F,j)} \subseteq E(F)$ denote the set of edges in $E(F)$ that are replaceable by $E(G_j)$. Then, let
$$\mathcal{R} = \bigcup_{(F,j) \in \mathcal{L}_h} R_{(F,j)}.$$
We also let
$$A_h = \bigcup_{(F,j) \in \mathcal{L}_h} E(G_j),$$
and by Lemma 3.7,
$$|V(A_h)| \ge (\log n)^{t_h + 1 - \frac{\omega(n)}{\log\log n}}.$$
Now, recall that we have assumed that there is no forest partial transversal on
$\mathcal{G} \cup E(F^*)$ with $|E(F^*)| + 1$ edges, and therefore, for each pair $(F, j)$ that we consider, every edge of $G_j$ must create a cycle when added to $F$. Therefore, for each $(F, j) \in \mathcal{L}_h$, $R_{(F,j)}$ is an edge cover of $V(E(G_j))$; that is, every vertex incident to an edge of $E(G_j)$ must also be incident to an edge of $R_{(F,j)}$. Therefore, $\mathcal{R}$ is an edge cover of $V(A_h)$. As a single edge is incident with only two vertices, it must follow that
$$|\mathcal{R}| \ge \frac{1}{2} |V(A_h)| \ge \frac{1}{2} (\log n)^{t_h + 1 - \frac{\omega(n)}{\log\log n}}.$$

We would like to estimate the number of edges in the intersection of colored edges of $\mathcal{R}$ and colored edges of $E(F^*)$. More precisely, for each $(F, j) \in \mathcal{L}_h$, let $F$ have an associated injective function $\phi_F$. We would like to estimate the number of edges $r \in \mathcal{R}$ for which $r \in E(F^*)$ and there exists a pair $(F, j) \in \mathcal{L}_h$ such that $\phi_F(r) = \phi(r)$. (Recall that here, $\phi$ is the injective function associated with $F^*$.) We give the name $\mathcal{R}^*$ to the set of edges $r \in \mathcal{R}$ satisfying this property.

As each color appearing in $E(F^*)$ is distinct, we observe that each edge $r \in \mathcal{R}^*$ has a distinct color. Furthermore, we know that for any node $(F, j) \in \mathcal{T}_f$ and any child $(F', j')$ of $(F, j)$, the forests $F$ and $F'$ differ by at most one edge. We let
$$K = \left| \bigcup_{i=0}^{h} \mathcal{L}_i \right|,$$
and we see that immediately before $\mathcal{L}_{h+1}$ is constructed,
$$\left| \bigcup_{(F,j) \in \mathcal{T}_f} E(F) \setminus E(F^*) \right| \le K.$$
Therefore, the number of edges in $\mathcal{R}^*$ is at least $|\mathcal{R}| - K$. Furthermore, the number of distinct colors already appearing in $\bigcup_{i=0}^{h} \mathcal{L}_i$ is equal to $K$. Therefore, as each edge of $\mathcal{R}^*$ whose color does not already appear in $\bigcup_{i=0}^{h} \mathcal{L}_i$ gives a new node in $\mathcal{L}_{h+1}$, the number of new nodes in $\mathcal{L}_{h+1}$ satisfies
$$|\mathcal{L}_{h+1}| \ge |\mathcal{R}| - K \ge \frac{1}{2} (\log n)^{t_h + 1 - \frac{\omega(n)}{\log\log n}} - K.$$

Hence, to finish the induction step, we only need to show that $K$ is not too large. By the induction hypothesis,
$$K = \sum_{i=0}^{h} |\mathcal{L}_i| = (\log n)^{t_h} + (\log n)^{t_{h-1}} + \dots + (\log n)^{t_0} < (\log n)^{t_h} \left( 1 + (\log n)^{-\left(1 - \frac{\omega(n)}{\log\log n}\right)} + (\log n)^{-2\left(1 - \frac{\omega(n)}{\log\log n}\right)} + \dots \right) < 2 (\log n)^{t_h}.$$
Therefore, after subtracting $K$, the number of nodes in $\mathcal{L}_{h+1}$ is still at least $(\log n)^{t_h + 1 - \frac{\omega(n)}{\log\log n}}$, and the induction is complete. This proves Claim 3.9. □

By Claim 3.9, for some value
$M < \log n$, $\mathcal{L}_M$ contains at least $\frac{n\omega(n)}{\log n}$ elements. Now, we may extend $\mathcal{T}_f$ so that it contains $\frac{n}{\log\log n}$ nodes, using the following procedure:

Construct $\mathcal{T}_f$ (II):
(1) Delete nodes from $\mathcal{L}_M$ until $\mathcal{L}_M$ has exactly $\frac{n\omega(n)}{\log n}$ nodes.
(2) For each pair $(F, j) \in \mathcal{L}_M$, add to $\mathcal{T}_f$ all children of $(F, j)$ in $\mathcal{T}$ whose color has not yet appeared in $\mathcal{T}_f$, adding at most one child of each color. Call this new set of nodes $\mathcal{L}_{M+1}$.

By the argument used in Claim 3.9, $\mathcal{L}_{M+1}$ must contain at least
$$\frac{n\omega(n)}{\log n} \cdot (\log n)^{1 - \frac{\omega(n)}{\log\log n}} = n\omega(n) \cdot e^{-\omega(n)} > n \cdot e^{-\omega(n)} = \frac{n}{\log\log n}$$
nodes. As each node of $\mathcal{T}_f$ must contribute a distinct element to $J$, it follows that $|J| \ge \frac{n}{\log\log n}$. This completes the proof of Claim 3.8. □

Adding edges of the remaining colors:
Recall that we have a partial transversal $F$ on $\mathcal{G}$ with at least $n - (\log n)$ edges. We will finally show that with the help of our edge replacement technique, we may successively build larger forest partial transversals until we have a tree partial transversal on $\mathcal{G}'$. We iterate the following procedure, which will build a tree partial transversal on $\mathcal{G}'$.

Connect Forest Components:
(1) Let $\phi_0 = \lambda$ denote the injective function associated with $F_0 = F$.
(2) Set $i \leftarrow 1$.
(3) If there exists a forest partial transversal $F$ on $\mathcal{G} \cup E(F_{i-1})$ with $|E(F_{i-1})| + 1$ edges, then let $F_i = F$, and let $\phi_i$ be the injective function associated with $F$. Then set $i \leftarrow i + 1$ and Goto (3).
(4) If $F_{i-1}$ is a tree, then terminate.
(5) Let $\sigma \in [n-1] \setminus \phi_{i-1}(E(F_{i-1}))$ be some color missed by $F_{i-1}$.
(6) Let $J_i \subseteq \phi_{i-1}(E(F_{i-1})) \cup \{\sigma\}$ denote the set of values $j$ for which there exists a forest partial transversal $F$ on $\mathcal{G} \cup E(F_{i-1})$ with an injective function $\psi$ satisfying $\psi(E(F)) = (\phi_{i-1}(E(F_{i-1})) \cup \{\sigma\}) \setminus \{j\}$, such that $F$ partitions $X$ in the same way as $F_{i-1}$.
(7) Choose a value $j \in J_i$ for which there exists an edge $e^* \in E(S^{(\kappa)}_j)$ with endpoints in distinct components of $F_{i-1}$, where $\kappa$ is the number of components in $F_{i-1}$.
(8) Let $F_i$ be a forest partial transversal partitioning $X$ in the same way as $F_{i-1}$ with an injective function $\phi_i$ for which $\phi_i(E(F_i)) = (\phi_{i-1}(E(F_{i-1})) \cup \{\sigma\}) \setminus \{j\}$.
(9) Add $e^*$ to $F_i$, and extend $\phi_i$ so that $\phi_i(e^*) = j$.
(10) Set $i \leftarrow i + 1$, and Goto (3).

Assuming that the Connect Forest Components procedure always iterates successfully, the procedure will produce increasingly large forest partial transversals $F_i$ on $\mathcal{G}'$ with associated injective functions $\phi_i$. Furthermore, when the procedure finally terminates at Step (4), we will have a tree partial transversal on $\mathcal{G}'$. Therefore, it remains only to show that the procedure a.a.s. never fails to execute any step.
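To make the role of edge replacement concrete, the following Python sketch implements only the simplest ingredient of the procedure: repeatedly adding an edge of an unused color that joins two components of the current forest. It is an illustration under stated assumptions, not the procedure of the paper: the names `DisjointSets` and `greedy_rainbow_forest` are hypothetical, each graph is represented as a set of vertex pairs, and the replacement steps (5) to (9) and the sparse subgraphs $S^{(\kappa)}_j$ are omitted entirely.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


class DisjointSets:
    """Union-find over vertices 0..n-1, tracking the components of a forest."""

    def __init__(self, n: int) -> None:
        self.parent = list(range(n))

    def find(self, v: int) -> int:
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def union(self, u: int, v: int) -> bool:
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False  # u and v already lie in the same component
        self.parent[ru] = rv
        return True


def greedy_rainbow_forest(n: int, family: List[Set[Edge]]) -> Dict[Edge, int]:
    """Greedy extension only (no edge replacement).

    While some unused color has an edge joining two components of the
    current forest, add that edge. Returns the map phi from chosen edges
    to their colors, where a color is an index into `family`.
    """
    components = DisjointSets(n)
    phi: Dict[Edge, int] = {}
    unused = set(range(len(family)))
    progress = True
    while progress:
        progress = False
        for color in sorted(unused):
            for (u, v) in sorted(family[color]):
                if components.union(u, v):
                    phi[(u, v)] = color
                    unused.discard(color)
                    progress = True
                    break
            if progress:
                break  # restart the scan over the remaining unused colors
    return phi
```

On the family `[{(0, 1), (1, 2)}, {(0, 1)}]` with `n = 3`, this greedy pass assigns the edge `(0, 1)` to color 0 and then stalls with a single edge, even though a spanning tree transversal exists (take `(1, 2)` in color 0 and `(0, 1)` in color 1). Situations of this kind are what the replacement steps of the procedure are designed to resolve.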
In fact, Step(7) of the Connect Forest Components procedure is the only step that might fail, so we only need to checkthat Step (7) is a.a.s. executed successfully each time that it is called.By Claim 3.8, for each i , the set J i produced in Step (6) has at least n log log n elements. Also, by Lemma 3.4, F i − admits at least nκ edges with endpoints in distinct components, where κ is the number of componentsof F i − . Furthermore, each graph family S ( κ ) observed during Step (7) is never observed at any other time,and therefore the edges in S ( κ ) are independent of any previous observations. Therefore, the probabilitythat Step (7) fails on a given step is at most(1 − s κ ) n log log n · nκ < exp (cid:18) − s κ · n κ n (cid:19) = exp (cid:18) − √ log n n (cid:19) = o (cid:0) (log n ) − (cid:1) . Therefore, since at most (log n ) edges need to be added to F to obtain a tree transversal, Step (7) iscalled at most (log n ) times. Thus, it a.a.s. holds that the Construct Forest Components procedure willsuccessfully iterate until producing a tree transversal on G . This completes the proof of Theorem 1.1. (cid:3) Conclusion
In this section, we pose some open questions. One immediate question is whether we can improve the lower bound for $c$ in the existence statement of Theorem 1.1. We suspect that a lower bound for $c$ of the form $2 + O\left(\frac{\log\log n}{\log n}\right)$ is enough to guarantee a.a.s. that a spanning tree transversal exists in our graph family, but we have been unable to prove this.

Additionally, in [12], Koršunov shows that a graph $G$ randomly constructed from $G(n, p)$ with $p = \frac{c \log n}{n}$ and $c > 1$ is a.a.s. Hamiltonian. (A treatment of this result in the $G(n, p)$ setting appears in [3, Chapter 8].) This result shows that the probability threshold for a random graph to be connected coincides with the probability threshold for a random graph to be Hamiltonian. Therefore, it is natural to ask if the probability threshold for the existence of a spanning tree transversal given in Theorem 1.1 also coincides with a threshold for certain stronger events. In each of the following questions, we assume that $\mathcal{G}$ is a family of graphs on a common set of $n$ vertices, and that each graph $G_i \in \mathcal{G}$ is chosen randomly from $G(n, p)$, with $p = \frac{c \log n}{n}$.

Question 4.1.
Suppose that $\mathcal{G}$ contains $n$ graphs and $c > 2$. Does $\mathcal{G}$ a.a.s. contain a transversal in the form of a Hamiltonian cycle?

Question 4.2.
Suppose that $\mathcal{G}$ contains $n - 1$ graphs and $c > 2$. Does $\mathcal{G}$ a.a.s. contain a transversal in the form of a Hamiltonian path?

The following easier question would also be interesting.

Question 4.3.
Suppose that $\mathcal{G}$ contains $n - 1$ graphs and $c > 2$. Does $\mathcal{G}$ a.a.s. contain a spanning tree transversal with diameter $(1 - o(1))n$?

Acknowledgments
I am grateful to the graph theory group of Simon Fraser University for listening to and commenting onseveral presentations of the main ideas of this paper. In particular, I am grateful to Kevin Halasz, BojanMohar, and Ladislav Stacho for helpful discussions.
References
[1] R. Aharoni, M. DeVos, S. González Hermosillo de la Maza, A. Montejano, and R. Šámal. A rainbow version of Mantel's theorem. Adv. Comb., 2020. Paper No. 2, 12.
[2] R. Aharoni, C. S. J. A. Nash-Williams, and S. Shelah. A general criterion for the existence of transversals. Proc. London Math. Soc. (3), 47(1):43–68, 1983.
[3] B. Bollobás. Random Graphs. Cambridge University Press, 2nd edition, 2001.
[4] A. Casteigts, M. Raskin, M. Renken, and V. Zamaraev. Sharp thresholds in random simple temporal graphs. 2020. arXiv:2011.03738.
[5] P. Erdős, A. Hajnal, and E. C. Milner. On sets of almost disjoint subsets of a set. Acta Math. Acad. Sci. Hungar., 19:209–218, 1968.
[6] P. Erdős and A. Rényi. On random graphs. I. Publicationes Mathematicae, 6:290–297, 1959.
[7] P. Erdős and A. Rényi. On random matrices. Magyar Tud. Akad. Mat. Kutató Int. Közl., 8:455–461 (1964), 1964.
[8] W. Fernandez de la Vega. Long paths in random graphs. Studia Sci. Math. Hungar., 14(4):335–340, 1979.
[9] H.-L. Fu, Y.-H. Lo, K. E. Perry, and C. A. Rodger. On the number of rainbow spanning trees in edge-colored complete graphs. Discrete Math., 341(8):2343–2352, 2018.
[10] P. Horn and L. Nelsen. Many edge-disjoint rainbow spanning trees in general graphs. arXiv e-prints, 03 2017.
[11] F. Joos and J. Kim. On a rainbow version of Dirac's theorem. Bull. Lond. Math. Soc., 52(3):498–504, 2020.
[12] A. D. Koršunov. Solution of a problem of P. Erdős and A. Rényi on hamiltonian cycles in nonoriented graphs. Diskret. Analiz, (31, Metody Diskret. Anal. v Teorii Upravljajuščih Sistem):17–56, 90, 1977.
[13] M. Mitzenmacher and E. Upfal. Probability and computing. Cambridge University Press, Cambridge, 2005. Randomized algorithms and probabilistic analysis.
[14] A. Schrijver. Combinatorial optimization. Polyhedra and efficiency. Vol. B, volume 24 of Algorithms and Combinatorics. Springer-Verlag, Berlin, 2003. Matroids, trees, stable sets, Chapters 39–69.
[15] K. Suzuki. A necessary and sufficient condition for the existence of a heterochromatic spanning tree in a graph. Graphs Combin., 22(2):261–269, 2006.
Department of Mathematics, Simon Fraser University, Burnaby, Canada
Email address : [email protected]@sfu.ca