A (Slightly) Improved Approximation Algorithm for Metric TSP

Abstract

For some $\epsilon > 10^{-36}$ we give a $(3/2 - \epsilon)$-approximation algorithm for metric TSP.


arXiv preprint, [cs.DS]

Anna R. Karlin ∗, Nathan Klein †, and Shayan Oveis Gharan ‡
University of Washington
September 1, 2020

∗ [email protected]. Research supported by Air Force Office of Scientific Research grant FA9550-20-1-0212 and NSF grant CCF-1813135.
† [email protected]. Research supported in part by NSF grants CCF-1813135 and CCF-1552097.
‡ [email protected]. Research supported by Air Force Office of Scientific Research grant FA9550-20-1-0212, NSF grants CCF-1552097, CCF-1907845, ONR YIP grant N00014-17-1-2429, and a Sloan fellowship.

1 Introduction

One of the most fundamental problems in combinatorial optimization is the traveling salesperson problem (TSP), formalized as early as 1832 (c.f. [App+07, Ch 1]). In an instance of TSP we are given a set of $n$ cities $V$ along with their pairwise symmetric distances, $c : V \times V \to \mathbb{R}_{\ge 0}$. The goal is to find a Hamiltonian cycle of minimum cost. In the metric TSP problem, which we study here, the distances satisfy the triangle inequality. Therefore, the problem is equivalent to finding a closed Eulerian connected walk of minimum cost.

It is NP-hard to approximate TSP within a factor of $123/122$ [KLS15]. An algorithm of Christofides-Serdyukov [Chr76; Ser78] from four decades ago gives a $3/2$-approximation for TSP (see [BS20] for a historical note about TSP). This remains the best known approximation algorithm for the general case of the problem despite significant work, e.g., [Wol80; SW90; BP91; Goe95; CV00; GLS05; BEM10; BC11; SWZ12; HNR17; HN19; KKO20].

In contrast, there have been major improvements to this algorithm for a number of special cases of TSP. For example, polynomial-time approximation schemes (PTAS) have been found for Euclidean [Aro96; Mit99], planar [GKP95; Aro+98; Kle05], and low-genus metric [DHM07] instances. In addition, the case of graph metrics has received significant attention. In 2011, the third author, Saberi, and Singh [OSS11] found a $(3/2 - \epsilon_0)$-approximation for this case. Mömke and Svensson [MS11] then obtained a combinatorial algorithm for graphic TSP with an approximation ratio of 1.461. This ratio was later improved by Mucha [Muc12] to $13/9 \approx 1.444$.

Theorem 1.1.

For some absolute constant $\epsilon > 10^{-36}$, there is a randomized algorithm that outputs a tour with expected cost at most $3/2 - \epsilon$ times the cost of the optimum solution.

We note that while the algorithm makes use of the Held-Karp relaxation, we do not prove that the integrality gap of this polytope is bounded away from 3/2. We also remark that although our approximation factor is only slightly better than Christofides-Serdyukov, we are not aware of any example where the approximation ratio of the algorithm we analyze exceeds 4/3 in expectation.

Following a new exciting result of Traub, Vygen, and Zenklusen [TVZ20] we also get the following theorem.

Theorem 1.2.

For some absolute constant $\epsilon > 0$ there is a randomized algorithm that outputs a TSP path with expected cost at most $3/2 - \epsilon$ times the cost of the optimum solution.

First, we recall the classical Christofides-Serdyukov algorithm: Given an instance of TSP, choose a minimum spanning tree and then add the minimum cost matching on the odd degree vertices of the tree. (Given such an Eulerian cycle, we can use the triangle inequality to shortcut vertices visited more than once to get a Hamiltonian cycle.) The algorithm we study is very similar, except we choose a random spanning tree based on the standard linear programming relaxation of TSP.

Let $x^0$ be an optimum solution of the following TSP linear program relaxation [DFJ59; HK70]:

$$
\begin{aligned}
\min\ & \sum_{u,v} x(u,v)\, c(u,v) \\
\text{s.t.}\ & \sum_{u} x(u,v) = 2 && \forall v \in V, \\
& \sum_{u \in S,\, v \notin S} x(u,v) \ge 2 && \forall S \subsetneq V, \\
& x(u,v) \ge 0 && \forall u, v \in V.
\end{aligned}
\tag{1}
$$

Given $x^0$, we pick an arbitrary node, $u$, split it into two nodes $u_0, v_0$, set $x(u_0, v_0) = 1$, $c(u_0, v_0) = 0$, and assign half of every edge incident to $u$ to $u_0$ and the other half to $v_0$. This allows us to assume without loss of generality that $x^0$ has an edge $e_0 = (u_0, v_0)$ such that $x^0_{e_0} = 1$, $c(e_0) = 0$. Let $E_0 = E \cup \{e_0\}$ be the support of $x^0$, let $x$ be $x^0$ restricted to $E$, and let $G = (V, E)$. Then $x^0$ restricted to $E$ is in the spanning tree polytope (3).

For a vector $\lambda : E \to \mathbb{R}_{\ge 0}$, a $\lambda$-uniform distribution $\mu_\lambda$ over spanning trees of $G = (V, E)$ is a distribution where for every spanning tree $T \subseteq E$,

$$ \mathbb{P}_{\mu_\lambda}[T] = \frac{\prod_{e \in T} \lambda_e}{\sum_{T'} \prod_{e \in T'} \lambda_e}. $$

Now, find a vector $\lambda$ such that for every edge $e \in E$, $\mathbb{P}_{\mu_\lambda}[e \in T] = x_e (1 \pm \epsilon)$, for some $\epsilon < 2^{-n}$. Such a vector $\lambda$ can be found using the multiplicative weight update algorithm [Asa+10] or by applying interior point methods [SV12] or the ellipsoid method [Asa+10]. (We note that the multiplicative weight update method can only guarantee $\epsilon < 1/\mathrm{poly}(n)$ in polynomial time.)

Theorem 1.3 ([Asa+10]).

Let $z$ be a point in the spanning tree polytope (see (3)) of a graph $G = (V, E)$. For any $\epsilon > 0$, a vector $\lambda : E \to \mathbb{R}_{\ge 0}$ can be found such that the corresponding $\lambda$-uniform spanning tree distribution, $\mu_\lambda$, satisfies

$$ \sum_{T \in \mathcal{T} : T \ni e} \mathbb{P}_{\mu_\lambda}[T] \le (1 + \epsilon) z_e, \quad \forall e \in E, $$

i.e., the marginals are approximately preserved. In the above, $\mathcal{T}$ is the set of all spanning trees of $(V, E)$. The running time is polynomial in $n = |V|$, $-\log \min_{e \in E} z_e$ and $\log(1/\epsilon)$.

Finally, we sample a tree $T \sim \mu_\lambda$ and then add the minimum cost matching on the odd degree vertices of $T$.

Algorithm 1 An Improved Approximation Algorithm for TSP
  Find an optimum solution $x^0$ of Eq. (1), and let $e_0 = (u_0, v_0)$ be an edge with $x^0_{e_0} = 1$, $c(e_0) = 0$.
  Let $E_0 = E \cup \{e_0\}$ be the support of $x^0$, let $x$ be $x^0$ restricted to $E$, and let $G = (V, E)$.
  Find a vector $\lambda : E \to \mathbb{R}_{\ge 0}$ such that for any $e \in E$, $\mathbb{P}_{\mu_\lambda}[e] = x_e (1 \pm 2^{-n})$.
  Sample a tree $T \sim \mu_\lambda$.
  Let $M$ be the minimum cost matching on odd degree vertices of $T$.
  Output $T \cup M$.

The above algorithm is a slight modification of the algorithm proposed in [OSS11]. We refer the interested reader to exciting work of Genova and Williamson [GW17] on the empirical performance of the max-entropy rounding algorithm. We also remark that although the algorithm implemented in [GW17] is slightly different from the above algorithm, we expect the performance to be similar.

1.2 New Techniques

Here we discuss new machinery and technical tools that we developed for this result which could be of independent interest.
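Before turning to these tools, the tree-plus-matching pipeline shared by Christofides-Serdyukov and Algorithm 1 can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the paper's implementation: the instance is a hypothetical set of points in the plane, the matching and the optimum are found by brute force (so it only scales to a handful of cities), and a minimum spanning tree stands in for the tree sampled from $\mu_\lambda$, since computing $\lambda$ requires an LP solver. All function names are made up for the example.

```python
import itertools
import math


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph (stand-in for T ~ mu_lambda)."""
    n, in_tree, edges = len(points), {0}, []
    while len(in_tree) < n:
        u, v = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist(points[e[0]], points[e[1]]),
        )
        edges.append((u, v))
        in_tree.add(v)
    return edges


def min_cost_perfect_matching(vertices, points):
    """Brute-force min-cost perfect matching; assumes an even number of vertices."""
    if not vertices:
        return 0.0, []
    first, rest = vertices[0], vertices[1:]
    best_cost, best = math.inf, []
    for i, partner in enumerate(rest):
        cost, matching = min_cost_perfect_matching(rest[:i] + rest[i + 1:], points)
        cost += dist(points[first], points[partner])
        if cost < best_cost:
            best_cost, best = cost, [(first, partner)] + matching
    return best_cost, best


def eulerian_circuit(n, edges):
    """Hierholzer's algorithm on a connected multigraph with all degrees even."""
    adj = {v: [] for v in range(n)}
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    used = [False] * len(edges)
    stack, circuit = [0], []
    while stack:
        v = stack[-1]
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()  # discard edges already traversed from the other endpoint
        if adj[v]:
            w, idx = adj[v].pop()
            used[idx] = True
            stack.append(w)
        else:
            circuit.append(stack.pop())
    return circuit


def tree_plus_matching_tour(points, tree_edges):
    """Add a min-cost matching on the odd-degree vertices of the tree, then shortcut."""
    n = len(points)
    degree = [0] * n
    for u, v in tree_edges:
        degree[u] += 1
        degree[v] += 1
    odd = [v for v in range(n) if degree[v] % 2 == 1]
    _, matching = min_cost_perfect_matching(odd, points)
    walk = eulerian_circuit(n, list(tree_edges) + matching)
    tour, seen = [], set()
    for v in walk:  # shortcutting never increases cost under the triangle inequality
        if v not in seen:
            seen.add(v)
            tour.append(v)
    return tour


def tour_cost(points, tour):
    return sum(dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))


def optimal_tour_cost(points):
    """Exhaustive search over tours starting at city 0."""
    rest = range(1, len(points))
    return min(tour_cost(points, [0] + list(perm)) for perm in itertools.permutations(rest))


points = [(0, 0), (2, 0), (3, 1), (2, 3), (0, 2), (1, 1)]  # hypothetical instance
tour = tree_plus_matching_tour(points, minimum_spanning_tree(points))
print(tour, tour_cost(points, tour), optimal_tour_cost(points))
```

With a minimum spanning tree as input this is exactly the classical algorithm, so the printed tour cost is at most 3/2 times the printed optimum. Algorithm 1 differs only in which tree is passed to `tree_plus_matching_tour`.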

Let $G = (V, E, x)$ be an undirected graph equipped with a weight function $x : E \to \mathbb{R}_{\ge 0}$ such that for any cut $(S, \overline{S})$ with $u_0, v_0 \notin S$, $x(\delta(S)) \ge 2$. For $\eta \ge 0$, consider the family of $\eta$-near min cuts of $G$. Let $\mathcal{C}$ be a connected component of crossing $\eta$-near min cuts. Given $\mathcal{C}$ we can partition the vertices of $G$ into sets $a_0, \dots, a_{m-1}$ (called atoms); this is the coarsest partition such that for each $a_i$, and each $(S, \overline{S}) \in \mathcal{C}$, we have $a_i \subseteq S$ or $a_i \subseteq \overline{S}$. Here $a_0$ is the atom that contains $u_0, v_0$.

There have been several works studying the structure of edges between these atoms and the structure of cuts in $\mathcal{C}$ w.r.t. the $a_i$'s. The cactus structure (see [DKL76]) shows that if $\eta = 0$, then we can arrange the $a_i$'s around a cycle, say $a_1, \dots, a_m$ (after renaming), such that $x(E(a_i, a_{i+1})) = 1$ for all $i$.

Benczúr and Goemans [Ben95; BG08] studied the case of positive $\eta$ up to a fixed constant and introduced the notion of a polygon representation, in which atoms can be placed on the sides of an equilateral polygon and some atoms placed inside the polygon, such that every cut in $\mathcal{C}$ can be represented by a diagonal of this polygon. Later, [OSS11] studied the structure of edges of $G$ in this polygon when $\eta$ is a sufficiently small constant. Here we consider a polygon representation of a connected component $\mathcal{C}$ of $\eta$-near min cuts of $G$ such that
• No atom is mapped inside,
• If we identify each cut $(S, \overline{S}) \in \mathcal{C}$ with the interval along the polygon that does not contain $a_0$, then any interval is only crossed on one side (only on the left or only on the right).

Then, we have (i) for any atom $a_i$, $x(\delta(a_i)) \le 2 + O(\delta)$ and (ii) for any pair of atoms $a_i, a_{i+1}$, $x(E(a_i, a_{i+1})) \ge 1 - \Omega(\eta)$ (see Theorem 4.9 for details). We expect to see further applications of our theorem in studying variants of TSP.

Given a real stable polynomial $p \in \mathbb{R}_{\ge 0}[z_1, \dots, z_n]$ (with non-negative coefficients), Gurvits proved the following inequality [Gur06; Gur08]:

$$ \frac{n!}{n^n} \inf_{z > 0} \frac{p(z_1, \dots, z_n)}{z_1 \cdots z_n} \;\le\; \partial_{z_1} \cdots \partial_{z_n} p \,\big|_{z = 0} \;\le\; \inf_{z > 0} \frac{p(z_1, \dots, z_n)}{z_1 \cdots z_n}. \tag{2} $$

As an immediate consequence, one can prove the following theorem about strongly Rayleigh (SR) distributions.

Theorem 1.4.

Let $\mu : 2^{[n]} \to \mathbb{R}_{\ge 0}$ be SR and $A_1, \dots, A_m$ be random variables corresponding to the number of elements sampled in $m$ disjoint subsets of $[n]$ such that $\mathbb{E}[A_i] = n_i$ for all $i$. If $n_i = 1$ for all $1 \le i \le m$, then $\mathbb{P}[\forall i, A_i = 1] \ge \frac{m!}{m^m}$.

A natural question is what happens when the vector $\vec{n} = (n_1, \dots, n_m)$ in the above theorem is not equal but close to the all ones vector, $\mathbf{1}$. A related theorem was proved in [OSS11].

Theorem 1.5.

Let $\mu : 2^{[n]} \to \mathbb{R}_{\ge 0}$ be SR and $A, B$ be random variables corresponding to the number of elements sampled in two disjoint sets. If $\mathbb{P}[A + B = 2] \ge \epsilon$, $\mathbb{P}[A \le 1], \mathbb{P}[B \le 1] \ge \alpha$ and $\mathbb{P}[A \ge 1], \mathbb{P}[B \ge 1] \ge \beta$, then $\mathbb{P}[A = B = 1] \ge \epsilon \alpha \beta / 3$.

We prove a generalization of both of the above statements; roughly speaking, we show that as long as $\|\vec{n} - \mathbf{1}\|_1 < 1 - \epsilon$ then $\mathbb{P}[\forall i, A_i = 1] \ge f(\epsilon, m)$, where $f(\epsilon, m)$ has no dependence on $n$, the number of underlying elements in the support of $\mu$.

Theorem 1.6 (Informal version of Proposition 5.1).

Let $\mu : 2^{[n]} \to \mathbb{R}_{\ge 0}$ be SR and let $A_1, \dots, A_m$ be random variables corresponding to the number of elements sampled in $m$ disjoint subsets of $[n]$. Suppose that there are integers $n_1, \dots, n_m$ such that for any set $S \subseteq [m]$, $\mathbb{P}\left[\sum_{i \in S} A_i = \sum_{i \in S} n_i\right] \ge \epsilon$. Then, $\mathbb{P}[\forall i, A_i = n_i] \ge f(\epsilon, m)$.

The above statement is even stronger than Theorem 1.4, as we only require $\mathbb{P}\left[\sum_{i \in S} A_i = \sum_{i \in S} n_i\right]$ to be bounded away from 0 for any set $S \subseteq [m]$ and we do not need a bound on the expectation. Our proof of the above theorem has double exponential dependence on $\epsilon$. We leave it as an open problem to find the optimum dependency on $\epsilon$. Furthermore, our proof of the above theorem is probabilistic in nature; we expect that an algebraic proof based on the theory of real stable polynomials will provide a significantly improved lower bound. Unlike the above theorem, such a proof may possibly extend to the more general class of completely log-concave distributions [AOV18].

Consider an SR distribution $\mu : 2^{[n]} \to \mathbb{R}_{\ge 0}$ and let $x : [n] \to \mathbb{R}_{\ge 0}$, where for all $i$, $x_i = \mathbb{P}_{T \sim \mu}[i \in T]$, be the marginals. Let $A, B \subseteq [n]$ be two disjoint sets such that $\mathbb{E}[A_T], \mathbb{E}[B_T] \approx 1$. It follows from Theorem 1.6 that $\mathbb{P}[A_T = B_T = 1] \ge \Omega(1)$. Here, however, we are interested in a stronger event; let $\nu = \mu \mid A_T = B_T = 1$ and let $y_i = \mathbb{P}_{T \sim \nu}[i \in T]$. It turns out that the $y$ vector can be very different from the $x$ vector; in particular, for some $i$'s we can have $|y_i - x_i|$ bounded away from 0. We show that there is an event of non-negligible probability that is a subset of $A_T = B_T = 1$ under which the marginals of elements in $A, B$ are almost preserved.

Theorem 1.7 (Informal version of Proposition 5.6).

Let $\mu : 2^{[n]} \to \mathbb{R}_{\ge 0}$ be an SR distribution and let $A, B \subseteq [n]$ be two disjoint subsets such that $\mathbb{E}[A_T], \mathbb{E}[B_T] \approx 1$. For any $\alpha \ll 1$ there is an event $\mathcal{E}_{A,B}$ such that $\mathbb{P}[\mathcal{E}_{A,B}] \ge \Omega(\alpha^2)$ and
• $\mathbb{P}[A_T = B_T = 1 \mid \mathcal{E}_{A,B}] = 1$,
• $\sum_{i \in A} |\mathbb{P}[i] - \mathbb{P}[i \mid \mathcal{E}_{A,B}]| \le \alpha$,
• $\sum_{i \in B} |\mathbb{P}[i] - \mathbb{P}[i \mid \mathcal{E}_{A,B}]| \le \alpha$.

We remark that the quadratic lower bound on $\alpha$ is necessary in the above theorem for a sufficiently small $\alpha > 0$. The above theorem can be seen as a generalization of Theorem 1.4 in the special case of two sets.

We leave it as an open problem to extend the above theorem to arbitrary $k$ disjoint sets. We suspect that in such a case the ideal event $\mathcal{E}_{A_1, \dots, A_k}$ occurs with probability $\Omega(\alpha)^k$ and preserves all marginals of elements in each of the sets $A_1, \dots, A_k$ up to a total variation distance of $\alpha$.

We write $[n] := \{1, \dots, n\}$ to denote the set of integers from 1 to $n$. For a set of edges $A \subseteq E$ and (a tree) $T \subseteq E$, we write $A_T = |A \cap T|$.

For a set $S \subseteq V$, we write $E(S) = \{(u, v) \in E : u, v \in S\}$ to denote the set of edges in $S$, and we write $\delta(S) = \{(u, v) \in E : |\{u, v\} \cap S| = 1\}$ to denote the set of edges that leave $S$. For two disjoint sets of vertices $A, B \subseteq V$, we write $E(A, B) = \{(u, v) \in E : u \in A, v \in B\}$.

For a set $A \subseteq E$ and a function $x : E \to \mathbb{R}$ we write $x(A) := \sum_{e \in A} x_e$.

For two sets $A, B \subseteq V$, we say $A$ crosses $B$ if all of the following sets are non-empty: $A \cap B$, $A \smallsetminus B$, $B \smallsetminus A$, $\overline{A \cup B}$.

We write $G = (V, E, x)$ to denote an (undirected) graph $G$ together with special vertices $u_0, v_0$ and a weight function $x : E \to \mathbb{R}_{\ge 0}$ such that $x(\delta(S)) \ge 2$ for all $S \subsetneq V$ with $u_0, v_0 \notin S$. For such a graph, we say a cut $S \subseteq V$ is an $\eta$-near min cut w.r.t. $x$ (or simply an $\eta$-near min cut when $x$ is understood) if $x(\delta(S)) \le 2 + \eta$. Unless otherwise specified, in any statement about a cut $(S, \overline{S})$ in $G$, we assume $u_0, v_0 \notin S$.

2.2 Polyhedral Background

For any graph $G = (V, E)$, Edmonds [Edm70] gave the following description for the convex hull of spanning trees of $G$, known as the spanning tree polytope:

$$
\begin{aligned}
z(E) &= |V| - 1 \\
z(E(S)) &\le |S| - 1 && \forall S \subseteq V \\
z_e &\ge 0 && \forall e \in E.
\end{aligned}
\tag{3}
$$

Edmonds [Edm70] proved that the extreme point solutions of this polytope are the characteristic vectors of the spanning trees of $G$.

Fact 2.1.

Let $x^0$ be a feasible solution of (1) such that $x^0_{e_0} = 1$ with support $E_0 = E \cup \{e_0\}$. Let $x$ be $x^0$ restricted to $E$; then $x$ is in the spanning tree polytope of $G = (V, E)$.

Proof. For any set $S \subseteq V$ such that $u_0, v_0 \notin S$, $x(E(S)) = \frac{2|S| - x^0(\delta(S))}{2} \le |S| - 1$. If $u_0 \in S$, $v_0 \notin S$, then $x(E(S)) = \frac{2|S| - 1 - (x^0(\delta(S)) - 1)}{2} \le |S| - 1$. Finally, if $u_0, v_0 \in S$, then $x(E(S)) = \frac{2|S| - 2 - x^0(\delta(S))}{2} \le |S| - 2$. The claim follows because $x(E) = x^0(E_0) - 1 = n - 1$.

Since $c(e_0) = 0$, the following fact is immediate.

Fact 2.2.

Let $G = (V, E, x)$ where $x$ is in the spanning tree polytope. Let $\mu$ be any distribution of spanning trees with marginals $x$; then $\mathbb{E}_{T \sim \mu}[c(T \cup e_0)] = c(x)$.

To bound the cost of the min-cost matching on the set $O$ of odd degree vertices of the tree $T$, we use the following characterization of the $O$-join polytope due to Edmonds and Johnson [EJ73]. (The standard name for this is the $T$-join polytope. Because we reserve $T$ to represent our tree, we call this the $O$-join polytope, where $O$ represents the set of odd vertices in the tree.)

Proposition 2.3.

For any graph $G = (V, E)$, cost function $c : E \to \mathbb{R}_+$, and a set $O \subseteq V$ with an even number of vertices, the minimum weight of an $O$-join equals the optimum value of the following integral linear program.

$$
\begin{aligned}
\min\ & c(y) \\
\text{s.t.}\ & y(\delta(S)) \ge 1 && \forall S \subseteq V,\ |S \cap O| \text{ odd} \\
& y_e \ge 0 && \forall e \in E
\end{aligned}
\tag{4}
$$

Definition 2.4 (Satisfied cuts).

For a set $S \subseteq V$ such that $u_0, v_0 \notin S$ and a spanning tree $T \subseteq E$ we say a vector $y : E \to \mathbb{R}_{\ge 0}$ satisfies $S$ if one of the following holds:
• $\delta(S)_T$ is even, or
• $y(\delta(S)) \ge 1$.

To analyze our algorithm, we will see that the main challenge is to construct a (random) vector $y$ that satisfies all cuts and $\mathbb{E}[c(y)] \le (1/2 - \epsilon)\, OPT$.

2.3 Structure of Near Minimum Cuts

Lemma 2.5 ([OSS11]).

For $G = (V, E, x)$, let $A, B \subsetneq V$ be two crossing $\epsilon_A$, $\epsilon_B$ near min cuts respectively. Then, $A \cap B$, $A \cup B$, $A \smallsetminus B$, $B \smallsetminus A$ are $\epsilon_A + \epsilon_B$ near min cuts.

Proof. We prove the lemma only for $A \cap B$; the rest of the cases can be proved similarly. By submodularity,

$$ x(\delta(A \cap B)) + x(\delta(A \cup B)) \le x(\delta(A)) + x(\delta(B)) \le 4 + \epsilon_A + \epsilon_B. $$

Since $x(\delta(A \cup B)) \ge 2$, we have $x(\delta(A \cap B)) \le 2 + \epsilon_A + \epsilon_B$, as desired.

The following lemma is proved in [Ben97]:

Lemma 2.6 ([Ben97, Lem 5.3.5]).

For $G = (V, E, x)$, let $A, B \subsetneq V$ be two crossing $\epsilon$-near minimum cuts. Then,

$$ x(E(A \cap B, A - B)),\ x(E(A \cap B, B - A)),\ x(E(\overline{A \cup B}, A - B)),\ x(E(\overline{A \cup B}, B - A)) \ge 1 - \epsilon/2. $$

Lemma 2.7.

For $G = (V, E, x)$, let $A, B \subsetneq V$ be two $\epsilon$ near min cuts such that $A \subsetneq B$. Then

$$ x(\delta(A) \cap \delta(B)) = x(E(A, \overline{B})) \le 1 + \epsilon, \quad \text{and} \quad x(\delta(A) \smallsetminus \delta(B)) \ge 1 - \epsilon/2. $$

Proof.

Notice

$$
\begin{aligned}
2 + \epsilon &\ge x(\delta(A)) = x(E(A, B \smallsetminus A)) + x(E(A, \overline{B})), \\
2 + \epsilon &\ge x(\delta(B)) = x(E(B \smallsetminus A, \overline{B})) + x(E(A, \overline{B})).
\end{aligned}
$$

Summing these up, we get

$$ 2x(E(A, \overline{B})) + x(E(A, B \smallsetminus A)) + x(E(B \smallsetminus A, \overline{B})) = 2x(E(A, \overline{B})) + x(\delta(B \smallsetminus A)) \le 4 + 2\epsilon. $$

Since $B \smallsetminus A$ is non-empty, $x(\delta(B \smallsetminus A)) \ge 2$, which implies the first inequality. To see the second one, let $C = B \smallsetminus A$ and note

$$ 4 \le x(\delta(A)) + x(\delta(C)) = 2x(E(A, C)) + x(\delta(B)) \le 2x(E(A, C)) + 2 + \epsilon, $$

which implies $x(E(A, C)) \ge 1 - \epsilon/2$.

2.4 Strongly Rayleigh Distributions and $\lambda$-uniform Spanning Tree Distributions

Let $\mathcal{B}_E$ be the set of all probability measures on the Boolean algebra $2^E$. Let $\mu \in \mathcal{B}_E$. The generating polynomial $g_\mu \in \mathbb{R}[\{z_e\}_{e \in E}]$ of $\mu$ is defined as follows:

$$ g_\mu(z) := \sum_{S} \mu(S) \prod_{e \in S} z_e. $$

We say $\mu$ is a strongly Rayleigh distribution if $g_\mu \ne 0$ over all $\{z_e\}_{e \in E} \in \mathbb{C}^E$ where $\mathrm{Im}(z_e) > 0$ for all $e \in E$. We say $\mu$ is $d$-homogeneous if for any $\lambda \in \mathbb{R}$, $g_\mu(\lambda z) = \lambda^d g_\mu(z)$. Strongly Rayleigh (SR) distributions were defined in [BBL09], where it was shown that any $\lambda$-uniform spanning tree distribution is strongly Rayleigh. In this subsection we recall several properties of SR distributions proved in [BBL09; OSS11] which will be useful to us.

Closure Operations of SR Distributions. SR distributions are closed under the following operations.

• Projection.

For any $\mu \in \mathcal{B}_E$, and any $F \subseteq E$, the projection of $\mu$ onto $F$ is the measure $\mu_F$ where for any $A \subseteq F$, $\mu_F(A) = \sum_{S : S \cap F = A} \mu(S)$.

• Conditioning.

For any $e \in E$, $\{\mu \mid e \text{ out}\}$ and $\{\mu \mid e \text{ in}\}$.

• Truncation.

For any integer $k \ge 0$ and $\mu \in \mathcal{B}_E$, the truncation of $\mu$ to $k$ is the measure $\mu_k$ where for any $A \subseteq E$,

$$ \mu_k(A) = \begin{cases} \dfrac{\mu(A)}{\sum_{S : |S| = k} \mu(S)} & \text{if } |A| = k, \\ 0 & \text{otherwise.} \end{cases} $$

• Product.

For any two disjoint sets $E, F$, and $\mu_E \in \mathcal{B}_E$, $\mu_F \in \mathcal{B}_F$, the product measure $\mu_{E \times F}$ is the measure where for any $A \subseteq E$, $B \subseteq F$, $\mu_{E \times F}(A \cup B) = \mu_E(A) \mu_F(B)$.

Throughout this paper we will repeatedly apply the above operations. We remark that SR distributions are not necessarily closed under truncation of a subset, i.e., if we require exactly $k$ elements from $F \subsetneq E$.

Since $\lambda$-uniform spanning tree distributions are special classes of SR distributions, if we perform any of the above operations on a $\lambda$-uniform spanning tree distribution $\mu$ we get another SR distribution. Below, we see that by performing the following particular operations we still have a $\lambda$-uniform spanning tree distribution (perhaps with a different $\lambda$).

Closure Operations of $\lambda$-uniform Spanning Tree Distributions
• Conditioning. For any $e \in E$, $\{\mu \mid e \text{ out}\}$, $\{\mu \mid e \text{ in}\}$.
• Tree Conditioning. For $G = (V, E)$, a spanning tree distribution $\mu \in \mathcal{B}_E$, and $S \subseteq V$, $\{\mu \mid S \text{ tree}\}$.

Note that arbitrary spanning tree distributions are not necessarily closed under truncation and projection. We remark that SR measures are also closed under an analogue of tree conditioning, i.e., for a set $F \subseteq E$, let $k = \max_{S \in \mathrm{supp}\,\mu} |S \cap F|$. Then, $\{\mu \mid |S \cap F| = k\}$ is SR. But if $\mu$ is a spanning tree distribution we get an extra independence property. The following independence is crucial to several of our proofs.

Fact 2.8.

For a graph $G = (V, E)$, and a vector $\lambda(G) : E \to \mathbb{R}_{\ge 0}$, let $\mu_{\lambda(G)}$ be the corresponding $\lambda$-uniform spanning tree distribution. Then for any $S \subsetneq V$,

$$ \{\mu_{\lambda(G)} \mid S \text{ tree}\} = \mu_{\lambda(G[S])} \times \mu_{\lambda(G/S)}. $$

Proof.

Intuitively, this holds because in the max entropy distribution, conditioned on $S$ being a tree, any tree chosen inside $S$ can be composed with any tree chosen on $G/S$ to obtain a spanning tree on $G$. So, to maximize the entropy these trees should be chosen independently. More formally, for any $T_1 \in G[S]$ and $T_2 \in G/S$,

$$
\begin{aligned}
\mathbb{P}[T = T_1 \cup T_2 \mid S \text{ is a tree}]
&= \frac{\lambda^{T_1} \lambda^{T_2}}{\sum_{T_1' \in G[S],\, T_2' \in G/S} \lambda^{T_1'} \lambda^{T_2'}} \\
&= \frac{\lambda^{T_1}}{\sum_{T_1' \in G[S]} \lambda^{T_1'}} \cdot \frac{\lambda^{T_2}}{\sum_{T_2' \in G/S} \lambda^{T_2'}} \\
&= \mathbb{P}_{T_1' \sim G[S]}\left[T_1' = T_1\right] \, \mathbb{P}_{T_2' \sim G/S}\left[T_2' = T_2\right],
\end{aligned}
$$

where $\lambda^{T} := \prod_{e \in T} \lambda_e$, giving independence.

Negative Dependence Properties. An upward event, $\mathcal{A}$, on $2^E$ is a collection of subsets of $E$ that is closed under upward containment, i.e., if $A \in \mathcal{A}$ and $A \subseteq B \subseteq E$, then $B \in \mathcal{A}$. Similarly, a downward event is closed under downward containment. An increasing function $f : 2^E \to \mathbb{R}$ is a function where for any $A \subseteq B \subseteq E$, we have $f(A) \le f(B)$. We also say $f : 2^E \to \mathbb{R}$ is a decreasing function if $-f$ is an increasing function. So, an indicator of an upward event is an increasing function. For example, if $E$ is the set of edges of a graph $G$, then the existence of a Hamiltonian cycle is an increasing function, and the 3-colorability of $G$ is a decreasing function.

Definition 2.9 (Negative Association). A measure $\mu \in \mathcal{B}_E$ is negatively associated if for any increasing functions $f, g : 2^E \to \mathbb{R}$ that depend on disjoint sets of edges,

$$ \mathbb{E}_\mu[f] \cdot \mathbb{E}_\mu[g] \ge \mathbb{E}_\mu[f \cdot g]. $$

It is shown in [BBL09; FM92] that strongly Rayleigh measures are negatively associated.
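To make the definition of a $\lambda$-uniform spanning tree distribution and the negative association property concrete, the following sketch enumerates all spanning trees of a small graph by brute force, builds $\mu_\lambda$ exactly with rational arithmetic, and computes edge marginals. The graph ($K_4$), the weights `lam`, and all function names are hypothetical choices made for illustration; brute-force enumeration is only feasible for tiny graphs and is not how the paper computes with these distributions.

```python
from fractions import Fraction
from itertools import combinations


def spanning_trees(n, edges):
    """All spanning trees of the graph on vertices 0..n-1, as tuples of edge indices."""
    trees = []
    for subset in combinations(range(len(edges)), n - 1):
        parent = list(range(n))

        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]
                v = parent[v]
            return v

        acyclic = True
        for idx in subset:
            ru, rv = find(edges[idx][0]), find(edges[idx][1])
            if ru == rv:  # this edge would close a cycle
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:  # n - 1 acyclic edges on n vertices form a spanning tree
            trees.append(subset)
    return trees


def lambda_uniform(n, edges, lam):
    """Return {tree: probability} with P[T] proportional to prod_{e in T} lam[e]."""
    weights = {}
    for tree in spanning_trees(n, edges):
        w = Fraction(1)
        for idx in tree:
            w *= lam[idx]
        weights[tree] = w
    total = sum(weights.values())
    return {tree: w / total for tree, w in weights.items()}


def marginal(mu, *edge_indices):
    """Probability that all the given edges belong to the sampled tree."""
    return sum(p for tree, p in mu.items() if all(i in tree for i in edge_indices))


n = 4
edges = list(combinations(range(n), 2))            # K_4, six edges
lam = [Fraction(w) for w in (1, 2, 3, 1, 2, 1)]    # hypothetical weights
mu = lambda_uniform(n, edges, lam)

for e, f in combinations(range(len(edges)), 2):
    # Indicators of "e in T" and "f in T" are increasing functions on disjoint
    # edge sets, so negative association predicts P[e, f in T] <= P[e] P[f].
    print(edges[e], edges[f], marginal(mu, e, f) <= marginal(mu, e) * marginal(mu, f))
```

Every line of the output should end in `True`, and the marginals sum to $n - 1$, as they must for any distribution over spanning trees.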

Stochastic Dominance.

For two measures $\mu, \nu : 2^E \to \mathbb{R}_{\ge 0}$, we say $\mu \preceq \nu$ if there exists a coupling $\rho : 2^E \times 2^E \to \mathbb{R}_{\ge 0}$ such that

$$ \sum_{B} \rho(A, B) = \mu(A), \ \forall A \in 2^E, \qquad \sum_{A} \rho(A, B) = \nu(B), \ \forall B \in 2^E, $$

and for all $A, B$ such that $\rho(A, B) > 0$ we have $A \subseteq B$ (coordinate-wise).

Theorem 2.10 (BBL). If $\mu$ is strongly Rayleigh and $\mu_k, \mu_{k+1}$ are well-defined, then $\mu_k \preceq \mu_{k+1}$.

Note that in the above particular case the coupling $\rho$ satisfies the following: for any $A, B \subseteq E$ where $\rho(A, B) > 0$, $B \supseteq A$ and $|B \smallsetminus A| = 1$, i.e., $B$ has exactly one more element.

Let $\mu$ be a strongly Rayleigh measure on edges of $G$. Recall that for a set $A \subseteq E$, we write $A_T = |A \cap T|$ to denote the random variable indicating the number of edges in $A$ chosen in a random sample $T$ of $\mu$. The following facts immediately follow from the negative association and stochastic dominance properties. We will use these facts repeatedly in this paper.

Fact 2.11.

Let $\mu$ be any SR distribution on $E$; then for any $F \subset E$, and any integer $k$:
1. (Negative Association) If $e \notin F$, then $\mathbb{P}_\mu[e \mid F_T \ge k] \le \mathbb{P}_\mu[e]$ and $\mathbb{P}_\mu[e \mid F_T \le k] \ge \mathbb{P}_\mu[e]$.
2. (Stochastic Dominance) If $e \in F$, then $\mathbb{P}_\mu[e \mid F_T \ge k] \ge \mathbb{P}_\mu[e]$ and $\mathbb{P}_\mu[e \mid F_T \le k] \le \mathbb{P}_\mu[e]$.

Fact 2.12.

Let $\mu$ be a homogeneous SR distribution on $E$. Then,
• (Negative association with homogeneity) For any $A \subseteq E$, and any $B \subseteq \overline{A}$,

$$ \mathbb{E}_\mu[B_T \mid A_T = 0] \le \mathbb{E}_\mu[B_T] + \mathbb{E}_\mu[A_T]. \tag{5} $$

• Suppose that $\mu$ is a spanning tree distribution. For $S \subseteq V$, let $q := |S| - 1 - \mathbb{E}_\mu[E(S)_T]$. For any $A \subseteq E(S)$, $B \subseteq \overline{E(S)}$,

$$
\begin{aligned}
\mathbb{E}_\mu[B_T] - q &\le \mathbb{E}_\mu[B_T \mid S \text{ is a tree}] \le \mathbb{E}_\mu[B_T] && \text{(Negative association and homogeneity)} \\
\mathbb{E}_\mu[A_T] &\le \mathbb{E}_\mu[A_T \mid S \text{ is a tree}] \le \mathbb{E}_\mu[A_T] + q && \text{(Stochastic dominance and tree)}
\end{aligned}
$$

Rank Sequence.

The rank sequence of $\mu$ is the sequence $\mathbb{P}[|S| = 0], \mathbb{P}[|S| = 1], \dots, \mathbb{P}[|S| = m]$, where $S \sim \mu$. Let $g_\mu(z)$ be the generating polynomial of $\mu$. The diagonal specialization of $\mu$ is the univariate polynomial

$$ \bar{g}_\mu(z) := g_\mu(z, z, \dots, z). $$

Observe that $\bar{g}_\mu(\cdot)$ is the generating polynomial of the rank sequence of $\mu$. It follows that if $\mu$ is SR then $\bar{g}_\mu$ is real rooted.

It is not hard to see that the rank sequence of $\mu$ corresponds to a sum of independent Bernoullis iff $\bar{g}_\mu$ is real rooted. It follows that the rank sequence of an SR distribution has the law of a sum of independent Bernoullis. As a consequence, it follows (see [HLP52; Dar64; BBL09]) that the rank sequence of any strongly Rayleigh measure is log-concave (see below for the definition), unimodal, and its mode differs from the mean by less than 1.

Definition 2.13 (Log-concavity [BBL09, Definition 2.8]). A real sequence $\{a_k\}_{k=0}^m$ is log-concave if $a_k^2 \ge a_{k-1} \cdot a_{k+1}$ for all $1 \le k \le m - 1$, and it is said to have no internal zeros if the indices of its non-zero terms form an interval (of non-negative integers).

2.5 Sum of Bernoullis

In this section, we collect a number of properties of sums of Bernoulli random variables.
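As a concrete illustration of the rank-sequence facts above, the sketch below computes the exact law of a sum of independent Bernoullis by repeated convolution and checks that it is log-concave and that its mode is within 1 of its mean. The success probabilities in `probs` and the function names are arbitrary choices made for the example.

```python
from fractions import Fraction


def bernoulli_sum_pmf(probs):
    """Exact law of B_1 + ... + B_n for independent Bernoullis with the given probabilities."""
    pmf = [Fraction(1)]  # law of the empty sum: P[0] = 1
    for p in probs:
        nxt = [Fraction(0)] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)   # the new Bernoulli is 0
            nxt[k + 1] += mass * p     # the new Bernoulli is 1
        pmf = nxt
    return pmf


def is_log_concave(seq):
    """Check a_k^2 >= a_{k-1} * a_{k+1} for all interior indices k."""
    return all(seq[k] ** 2 >= seq[k - 1] * seq[k + 1] for k in range(1, len(seq) - 1))


probs = [Fraction(1, 2), Fraction(1, 3), Fraction(3, 4), Fraction(1, 5)]
pmf = bernoulli_sum_pmf(probs)
mean = sum(probs)
mode = max(range(len(pmf)), key=lambda k: pmf[k])

print([str(a) for a in pmf])
print("log-concave:", is_log_concave(pmf))
print("|mode - mean| < 1:", abs(mode - mean) < 1)
```

Using `Fraction` keeps the log-concavity comparison exact, which avoids spurious failures from floating-point rounding when neighboring terms are close.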

Definition 2.14 (Bernoulli Sum Random Variable). We say $BS(q)$ is a Bernoulli-Sum random variable if it has the law of a sum of independent Bernoulli random variables, say $B_1 + B_2 + \dots + B_n$ for some $n \ge 1$, with $\mathbb{E}[B_1 + \dots + B_n] = q$.

We start with the following theorem of Hoeffding.

Theorem 2.15 ([Hoe56, Corollary 2.1]). Let $g : \{0, 1, \dots, n\} \to \mathbb{R}$ and $0 \le q \le n$ for some integer $n \ge 0$. Let $B_1, \dots, B_n$ be $n$ independent Bernoulli random variables with success probabilities $p_1, \dots, p_n$, where $\sum_{i=1}^n p_i = q$, that minimize (or maximize)

$$ \mathbb{E}[g(B_1 + \dots + B_n)] $$

over all such distributions. Then, $p_1, \dots, p_n \in \{0, x, 1\}$ for some $0 < x < 1$. In particular, if only $m$ of the $p_i$'s are nonzero and $\ell$ of the $p_i$'s are 1, then the remaining $m - \ell$ are $\frac{q - \ell}{m - \ell}$.

Fact 2.16. Let $B_1, \dots, B_n$ be independent Bernoulli random variables each with expectation $0 \le p \le 1$. Then

$$ \mathbb{P}\left[\sum_i B_i \text{ even}\right] = \frac{1}{2}\left(1 + (1 - 2p)^n\right). $$

Proof.

Note that

$$ (p + (1 - p))^n = \sum_{k=0}^n p^k (1 - p)^{n-k} \binom{n}{k} \quad \text{and} \quad ((1 - p) - p)^n = \sum_{k=0}^n (-p)^k (1 - p)^{n-k} \binom{n}{k}. $$

Summing them up we get

$$ 1 + (1 - 2p)^n = 2 \sum_{0 \le k \le n,\ k \text{ even}} p^k (1 - p)^{n-k} \binom{n}{k}. $$

Corollary 2.17.

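Given a $BS(q)$ random variable with $0 < q \le 1.2$, then

$$ \mathbb{P}[BS(q) \text{ even}] \le \frac{1}{2}\left(1 + e^{-2q}\right). $$

Proof.

First, if $q \le 1$, then by Hoeffding's theorem we can write $BS(q)$ as a sum of $n$ Bernoullis with success probability $p = q/n$. If $n = 1$, then the statement obviously holds. Otherwise, by the previous fact, we have (for some $n$)

$$ \mathbb{P}[BS(q) \text{ even}] \le \frac{1}{2}\left(1 + (1 - 2p)^n\right) \le \frac{1}{2}\left(1 + e^{-2q}\right), $$

where we used that $|1 - 2p| \le e^{-2p}$ for $p \le 1/2$.

So, now assume $q > 1$. Write $BS(q)$ as the sum of $n$ Bernoullis, each with success probability 1 or $p$. First assume we have no ones. Then, either we only have two non-zero Bernoullis with success probability $q/2$, in which case $\mathbb{P}[BS(q) \text{ even}] \le 0.6^2 + 0.4^2$ and we are done. Otherwise, $n \ge 3$ so $p \le 1/2$ and similar to the previous case we get $\mathbb{P}[BS(q) \text{ even}] \le \frac{1}{2}(1 + e^{-2q})$.

Finally, if $q > 1$ and one of the Bernoullis is always 1, i.e., $BS(q) = BS(q - 1) + 1$, then we get

$$ \mathbb{P}[BS(q) \text{ even}] = \mathbb{P}[BS(q - 1) \text{ odd}] = \frac{1}{2}\left(1 - (1 - 2p)^{n-1}\right) \le \frac{1}{2}\left(1 - e^{-4(q - 1)}\right) \le \frac{1}{2}, $$

where we used that $1 - x \ge e^{-2x}$ for all sufficiently small $x \ge 0$.

Lemma 2.18.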

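Let $p_0, \dots, p_n$ be a log-concave sequence. If for some $i$, $\gamma p_i \ge p_{i+1}$ for some $\gamma < 1$, then

$$ \sum_{j=k}^n p_j \le \frac{p_k}{1 - \gamma}, \ \forall k \ge i, \qquad \sum_{j=i+1}^n p_j \cdot j \le \frac{p_{i+1}}{1 - \gamma}\left(i + 1 + \frac{\gamma}{1 - \gamma}\right). $$

Proof. Since we have a log-concave sequence we can write

$$ \frac{1}{\gamma} \le \frac{p_i}{p_{i+1}} \le \frac{p_{i+1}}{p_{i+2}} \le \dots \tag{6} $$

Since all of the above ratios are at least $1/\gamma$, for all $k \ge 1$ we can write

$$ p_{i+k} \le \gamma^{k-1} p_{i+1} \le \gamma^k p_i. $$

Therefore, the first statement is immediate and the second one follows:

$$ \sum_{j=i+1}^n p_j\, j \le \sum_{k=0}^\infty \gamma^k p_{i+1} (i + k + 1) = p_{i+1}\left(\frac{i + 1}{1 - \gamma} + \frac{\gamma}{(1 - \gamma)^2}\right). $$

Corollary 2.19.

Let $X$ be a $BS(q)$ random variable such that $\mathbb{P}[X = k] \ge 1 - \epsilon$ for some integer $k \ge 1$, $\epsilon < 1/10$. Then, $k(1 - \epsilon) \le q \le k(1 + 2\epsilon) + 3\epsilon$.

Proof. The left inequality simply follows since $X \ge 0$. Since $\mathbb{P}[X = k + 1] \le \epsilon$, we can apply Lemma 2.18 with $\gamma = \epsilon / (1 - \epsilon)$ to get

$$ \mathbb{E}[X \mid X \ge k + 1]\, \mathbb{P}[X \ge k + 1] \le \frac{\epsilon (1 - \epsilon)}{1 - 2\epsilon}\left(k + 1 + \frac{\epsilon}{1 - 2\epsilon}\right). $$

Therefore,

$$ q = \mathbb{E}[X] \le k(1 - \epsilon) + \frac{\epsilon (1 - \epsilon)}{1 - 2\epsilon}\left(k + 1 + \frac{\epsilon}{1 - 2\epsilon}\right) \le k(1 + 2\epsilon) + 3\epsilon, $$

as desired.

Fact 2.20.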

For integers $k < t$ and $k - 1 \le p \le k$,

$$ \prod_{i=1}^{k-1} (1 - i/t)\,(1 - p/t)^{t-k} \ge e^{-p}. $$

Proof.

We show that the LHS is a decreasing function of $t$. Since $\ln$ is monotone, it is enough to show

$$ 0 \ge \partial_t \ln(\text{LHS}) = \partial_t \left( \sum_{i=1}^{k-1} \ln(1 - i/t) + (t - k) \ln(1 - p/t) \right) = \sum_{i=1}^{k-1} \frac{1}{t^2/i - t} + \ln(1 - p/t) + \frac{(t - k)p}{t(t - p)}. $$

Using

$$ \sum_{i=1}^{k-2} \frac{1}{t^2/i - t} \le \int_0^{k-1} \frac{dx}{t^2/x - t} = -\frac{k - 1}{t} - \ln\left(1 - \frac{k - 1}{t}\right), $$

and keeping the $i = k - 1$ term separately, it is enough to show

$$
\begin{aligned}
0 &\ge -\frac{k - 1}{t} - \ln\left(1 - \frac{k - 1}{t}\right) + \ln(1 - p/t) + \frac{(t - k)p}{t(t - p)} + \frac{k - 1}{t(t - k + 1)} \\
&= \ln\frac{t - p}{t - k + 1} + \frac{p - k}{t - p} + \frac{1}{t} + \frac{k - 1}{t(t - k + 1)},
\end{aligned}
$$

or equivalently

$$ \ln\left(1 + \frac{p - k + 1}{t - p}\right) \ge \frac{p - k}{t - p} + \frac{1}{t - k + 1}. $$

Since $p \ge k - 1$, using the Taylor series of $\ln$, to prove the above it is enough to show

$$ \frac{p - k + 1}{t - p} - \frac{(p - k + 1)^2}{2(t - p)^2} \ge \frac{p - k}{t - p} + \frac{1}{t - k + 1}, $$

that is,

$$ \frac{p - k + 1}{(t - p)(t - k + 1)} \ge \frac{(p - k + 1)^2}{2(t - p)^2} \iff \frac{1}{t - k + 1} \ge \frac{p - k + 1}{2(t - p)}. $$

Finally, the latter holds because $(t - k + 1)(p - k + 1) \le (t - k + 1) \le 2(t - p)$, where we use $t \ge k + 1$ and $p \le k$.

Let $\mathrm{Poi}(p, k) = e^{-p} p^k / k!$ be the probability that a Poisson random variable with rate $p$ is exactly $k$; similarly, define $\mathrm{Poi}(p, \le k)$, $\mathrm{Poi}(p, \ge k)$ as the probability that a Poisson with rate $p$ is at most $k$ or at least $k$.

Lemma 2.21.

Let $X$ be a Bernoulli sum $BS(p)$ for some $n$. For any integer $k \ge 0$ such that $k - 1 < p < k + 1$, the following holds true:

$$ \mathbb{P}[X = k] \ge \min_{0 \le \ell \le p, k} \mathrm{Poi}(p - \ell, k - \ell) \left(1 - \frac{p - \ell}{k - \ell + 1}\right)^{(p - k)^+}, $$

where the minimum is over all nonnegative integers $\ell \le p, k$, and for $z \in \mathbb{R}$, $z^+ = \max\{z, 0\}$.

Proof. Let $X = B_1 + \dots + B_n$ where $B_i$ is a Bernoulli. Applying Hoeffding's theorem, if $\ell$ of them have success probability 1, we need to prove a lower bound of $\mathrm{Poi}(p - \ell, k - \ell)\left(1 - \frac{p - \ell}{k - \ell + 1}\right)^{(p - k)^+}$. So, assuming none have success probability 1, it follows that each has success probability $p/n$. If $k \ge p$,

$$ \mathbb{P}[X = k] = \binom{n}{k} \left(\frac{p}{n}\right)^k (1 - p/n)^{n-k} = \prod_{i=1}^{k-1} (1 - i/n)\, \frac{p^k}{k!}\, (1 - p/n)^{n-k} \ge \frac{p^k}{k!} e^{-p} = \mathrm{Poi}(p, k), $$

where in the inequality we used Fact 2.20 (also note if $n = k$ the inequality follows from Stirling's formula and that $p \ge k - 1$). If $k < p < k + 1$, then as above

$$ \mathbb{P}[X = k] = \prod_{i=1}^{k-1} (1 - i/n)\, \frac{p^k}{k!}\, (1 - p/n)^{n-p} (1 - p/n)^{p-k} \ge \mathrm{Poi}(p, k)\,(1 - p/n)^{p-k}, $$

where again we used Fact 2.20.

Note that if we further know $X \ge a$ with probability 1 we can restrict $\ell$ in the statement to be in the interval $[a, \min(p, k)]$.

Lemma 2.22. Let $X$ be a Bernoulli sum $BS(p)$ and let $k = \lceil p \rceil$. Then,

$$ \mathbb{P}[X \ge k] \ge \min_{0 \le \ell \le p} \mathrm{Poi}(p - \ell, \ge k - \ell), $$

where the minimum is over all non-negative integers $\ell \le p$.

Proof. Suppose that $X$ is a $BS(p)$ with $n$ Bernoullis with probabilities $p_1, \dots, p_n$. If $p - 1 < k - 1 < p$, by [Hoe56, Thm 4, (25)],

$$ \mathbb{P}[X \le k - 1] \le \max_{0 \le \ell < p} \sum_{i=0}^{k - 1 - \ell} \binom{n - \ell}{i} q^i (1 - q)^{n - \ell - i}, \tag{7} $$

where $q = \frac{p - \ell}{n - \ell}$.

If $Y$ is a $BS(p)$ with $m > n$ Bernoullis with probabilities $q_1, \dots, q_m$, the same upper bound applies of course, with $m$ replacing $n$. Also, note that

$$ \max_{p_1, \dots, p_n} \mathbb{P}[X \le k - 1] \le \max_{q_1, \dots, q_m} \mathbb{P}[Y \le k - 1], $$

since it is always possible to set $q_i = p_i$ for $i \le n$ and $q_j = 0$ for $j > n$.

Therefore, the upper bound in (7) obtained by taking the limit as $n$ goes to infinity applies, from which it follows that

$$ \mathbb{P}[X \le k - 1] \le \max_{0 \le \ell < p} \sum_{i=0}^{k - 1 - \ell} \mathrm{Poi}(p - \ell, i), $$

and therefore

$$ \mathbb{P}[X \ge k] \ge \min_{0 \le \ell < p} \mathrm{Poi}(p - \ell, \ge k - \ell). $$

2.6 Random Spanning Trees

Lemma 2.23.

Let $G = (V, E, x)$, and let $\mu$ be any distribution over spanning trees with marginals $x$. For any $\epsilon$-near min cut $S \subseteq V$ (such that none of the endpoints of $e_0 = (u_0, v_0)$ are in $S$), we have

$$ \mathbb{P}_{T \sim \mu}[T \cap E(S) \text{ is tree}] \ge 1 - \epsilon/2. $$

Moreover, if $\mu$ is a max-entropy distribution with marginals $x$, then for any set of edges $A \subseteq E(S)$ and $B \subseteq E \smallsetminus E(S)$,

$$ \mathbb{E}[A_T] \le \mathbb{E}[A_T \mid S \text{ is tree}] \le \mathbb{E}[A_T] + \epsilon/2, \qquad \mathbb{E}[B_T] - \epsilon/2 \le \mathbb{E}[B_T \mid S \text{ is tree}] \le \mathbb{E}[B_T]. $$

Proof.

First, observe that

$$ \mathbb{E}[E(S)_T] = x(E(S)) \ge \frac{2|S| - x(\delta(S))}{2} \ge |S| - 1 - \epsilon/2, $$

where we used that since $u_0, v_0 \notin S$, for any $v \in S$, $\mathbb{E}[\delta(v)_T] = x(\delta(v)) = 2$. Let $p_S = \mathbb{P}[S \text{ is tree}]$. Then, we must have

$$ |S| - 1 - (1 - p_S) = p_S (|S| - 1) + (1 - p_S)(|S| - 2) \ge \mathbb{E}[E(S)_T] \ge |S| - 1 - \epsilon/2. $$

Therefore, $p_S \ge 1 - \epsilon/2$. The second part of the claim follows from Fact 2.12.

Corollary 2.24. Let $A, B \subseteq V$ be disjoint sets such that $A$, $B$, $A \cup B$ are $\epsilon_A$, $\epsilon_B$, $\epsilon_{A \cup B}$-near minimum cuts w.r.t. $x$ respectively, where none of them contain endpoints of $e_0$. Then for any distribution $\mu$ of spanning trees on $E$ with marginals $x$,

$$ \mathbb{P}_{T \sim \mu}[E(A, B)_T = 1] \ge 1 - (\epsilon_A + \epsilon_B + \epsilon_{A \cup B})/2. $$

Proof.

By the union bound, with probability at least $1 - (\epsilon_A + \epsilon_B + \epsilon_{A \cup B})/2$, $A$, $B$, and $A \cup B$ are trees. But this implies that we must have exactly one edge between $A$ and $B$.

The following simple fact also holds by the union bound.

Fact 2.25.

Let G = ( V , E , x ) and let µ be a distribution over spanning trees with marginals x. For anyset A ⊆ E , we have P T ∼ µ [ T ∩ A = ∅ ] ≥ − x ( A ) . Lemma 2.26.

Let $G = (V, E, x)$, and let $\mu$ be a $\lambda$-uniform random spanning tree distribution with marginals $x$. For any edge $e = (u, v)$ and any vertex $w \ne u, v$ we have
$$\mathbb{E}[W_T \mid e \notin T] \le \mathbb{E}[W_T] + \mathbb{P}[w \in P_{u,v} \mid e \notin T] \cdot \mathbb{P}[e \in T],$$
where $W_T = |T \cap \delta(w)|$ and for a spanning tree $T$ and vertices $u, v \in V$, $P_{u,v}(T)$ is the set of vertices on the path from $u$ to $v$ in $T$.

Proof. Define $E' = E \setminus \{e\}$. Let $\mu' = \mu|_{E'}$ be $\mu$ projected on all edges except $e$. Define $\mu_i = \mu'_{n-2}$ (corresponding to $e$ in the tree) and $\mu_o = \mu'_{n-1}$ (corresponding to $e$ out of the tree). Observe that any tree $T$ has positive measure in exactly one of these distributions.

By Theorem 2.10, $\mu_i \preceq \mu_o$, so there exists a coupling $\rho : 2^{E'} \times 2^{E'} \to \mathbb{R}_{\ge 0}$ between them such that for any $T_i, T_o$ with $\rho(T_i, T_o) > 0$, the tree $T_o$ has exactly one more edge than $T_i$. Also, observe that $T_o$ is always a spanning tree whereas $T_i \cup \{e\}$ is a spanning tree. The added edge (i.e., the edge in $T_o \setminus T_i$) is always along the unique path from $u$ to $v$ in $T_o$.

For intuition for the rest of the proof, observe that if $w$ is not on the path from $u$ to $v$ in $T_o$, then the same set of edges is incident to $w$ in both $T_i$ and $T_o$. So, if $w$ is almost never on the path from $u$ to $v$, the distribution of $W_T$ is almost independent of $e$. On the other hand, whenever $w$ is on the path from $u$ to $v$, then in the worst case, we may replace $e$ with one of the edges incident to $w$, so conditioned on $e$ out, $W_T$ increases by at most the probability that $e$ is in the tree.

Say $x_e$ is the marginal of $e$. Then,
$$\begin{aligned}
\mathbb{E}[W_T] &= \mathbb{E}[W_T \mid e \notin T](1 - x_e) + \mathbb{E}[W_T \mid e \in T]\, x_e \\
&= \sum_{T_i, T_o} \rho(T_i, T_o) W_o (1 - x_e) + \sum_{T_i, T_o} \rho(T_i, T_o) W_i x_e \\
&= \sum_{T_i, T_o} \rho(T_i, T_o)\big((1 - x_e) W_o + x_e W_i\big),
\end{aligned} \tag{8}$$
where we write $W_i$ / $W_o$ instead of $W_{T_i}$ / $W_{T_o}$. Then,
$$\begin{aligned}
\mathbb{E}[W_T \mid e \notin T] &= \sum_{T_i, T_o} \rho(T_i, T_o) W_o \\
&= \sum_{T_i, T_o : w \in P_{u,v}(T_o)} \rho(T_i, T_o) W_o + \sum_{T_i, T_o : w \notin P_{u,v}(T_o)} \rho(T_i, T_o) W_o \\
&\le \sum_{T_i, T_o : w \in P_{u,v}(T_o)} \rho(T_i, T_o)\big(x_e (W_i + 1) + (1 - x_e) W_o\big) + \sum_{T_i, T_o : w \notin P_{u,v}(T_o)} \rho(T_i, T_o)\big(x_e W_i + (1 - x_e) W_o\big) \\
&= \mathbb{E}[W_T] + \sum_{T_i, T_o : w \in P_{u,v}(T_o)} \rho(T_i, T_o)\, x_e \\
&= \mathbb{E}[W_T] + \sum_{T_o : w \in P_{u,v}(T_o)} \mu_o(T_o)\, x_e \\
&= \mathbb{E}[W_T] + \mathbb{P}[w \in P_{u,v} \mid e \text{ out}] \cdot \mathbb{P}[e \text{ in}],
\end{aligned}$$
where in the inequality we used the following: when $w \notin P_{u,v}(T_o)$ we have $W_i = W_o$, and when $w \in P_{u,v}(T_o)$ we have $W_o \le W_i + 1$. Finally, in the third to last equality we used (8).
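The inequality of Lemma 2.26 can be checked by brute force on a tiny instance. The sketch below is illustrative only: it uses the uniform distribution over spanning trees of $K_4$ (a $\lambda$-uniform distribution with all $\lambda_e = 1$, whose marginals are not an LP solution), and the function names are hypothetical rather than taken from the paper. It evaluates both sides exactly for $e = (0, 1)$ and $w = 2$.

```python
from fractions import Fraction
from itertools import combinations


def spanning_trees(n, edges):
    """Yield every spanning tree of the graph as a tuple of edges."""
    for subset in combinations(edges, n - 1):
        parent = list(range(n))

        def find(a):
            while parent[a] != a:
                parent[a] = parent[parent[a]]
                a = parent[a]
            return a

        acyclic = True
        for a, b in subset:
            ra, rb = find(a), find(b)
            if ra == rb:
                acyclic = False
                break
            parent[ra] = rb
        if acyclic:  # n - 1 edges and no cycle means a spanning tree
            yield subset


def path_vertices(tree, src, dst):
    """Return the set of vertices on the unique src-dst path of a tree."""
    adj = {}
    for a, b in tree:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        if node == dst:
            return set(path)
        for nxt in adj.get(node, []):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    raise ValueError("tree is not connected")


def lemma_2_26_sides(n, edges, e, w):
    """Return (lhs, rhs) of Lemma 2.26 under the uniform spanning tree measure."""
    u, v = e
    trees = list(spanning_trees(n, edges))
    without_e = [t for t in trees if e not in t]

    def deg_w(tree):
        return sum(1 for a, b in tree if w in (a, b))

    exp_w = Fraction(sum(deg_w(t) for t in trees), len(trees))
    exp_w_given_out = Fraction(sum(deg_w(t) for t in without_e), len(without_e))
    p_e_in = Fraction(len(trees) - len(without_e), len(trees))
    p_w_on_path = Fraction(
        sum(1 for t in without_e if w in path_vertices(t, u, v)), len(without_e)
    )
    return exp_w_given_out, exp_w + p_w_on_path * p_e_in


K4_EDGES = list(combinations(range(4), 2))
lhs, rhs = lemma_2_26_sides(4, K4_EDGES, e=(0, 1), w=2)
print(lhs, rhs)  # 7/4 29/16
```

On this instance the left side is $7/4$ and the right side is $3/2 + (5/8)(1/2) = 29/16$, so the inequality holds with a small gap.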

Figure 1: Setting of Lemma 2.27

Lemma 2.27.

Let G = ( V , E , x ) , and let µ be a λ -uniform spanning tree distribution with marginals x.For any pair of edges e = ( u , v ) , f = ( v , w ) such that | P [ e ] − | , | P [ f ] − | < ǫ (see Fig. 1), if ǫ < , then E [ W T | e T ] + E [ U T | f T ] ≤ E [ W T + U T ] + where U = δ ( u ) − e and W = δ ( w ) − f .Proof. All probabilistic statements are with respect to ν so we drop the subscript. First, byLemma 2.26, and negative association we can write, E [ W T | e T ] ≤ E [ W T ] + P [ w ∈ P u , v | e T ] P [ e ∈ T ] ≤ E [ W T ] + P [ w ∈ P u , v ∧ e / ∈ T ] + ǫ Note that the lemma only implies E [ δ ( w ) T | e / ∈ T ] ≤ E [ δ ( w ) T ] + P [ w ∈ P u , v | e / ∈ T ] P [ e ∈ T ] . Toderive the ﬁrst inequality we also exploit negative association which asserts that the marginal ofevery edge only goes up under e / ∈ T , so any subset of δ ( w ) (in particular W ) also goes up by atmost P [ e / ∈ T ∧ w ∈ P u , v ] . Also, the second inequality uses P [ e ∈ T ] ≤ P [ e / ∈ T ] + ǫ . Using asimilar inequality for U T , to prove the lemma it is enough to show that P [ w ∈ P u , v ∧ e T ] + P [ u ∈ P v , w ∧ f T ] ≤ u is on the v − w path and w ison the u − v path. Therefore P [ u ∈ P v , w | e , f T ] + P [ w ∈ P u , v | e , f T ] ≤ P [ e T ∧ w ∈ P u , v ] + P [ f T ∧ u ∈ P v , w ] ≤ P [ e , f T ∧ w ∈ P u , v ] + P [ e / ∈ T , f ∈ T ] + P ν [ e , f T ∧ u ∈ P v , w ] + P [ f / ∈ T , e ∈ T ] ≤ P [ e , f / ∈ T ] + P [ e / ∈ T , f ∈ T ] + P [ f / ∈ T , e ∈ T ]= − P [ e , f ∈ T ] .It remains to upper bound the RHS. Let α = P [ f ∈ T | e / ∈ T ] . 
Observe that P [ e , f ∈ T ] = P [ f ∈ T ] − P [ f ∈ T , e / ∈ T ] ≥ − ǫ − ( + ǫ ) α .If α ≤ P [ e , f ∈ T ] ≥ ǫ < P [ f | e / ∈ T ] ≥ P [ e | f / ∈ T ] ≥ E [ W T | e / ∈ T ] ≤ E [ W T ] + P [ e ] − ( P [ f | e / ∈ T ] − P [ f ]) ≤ E [ W T ] + ǫ + ≤ E [ W T ] + E [ U T | f / ∈ T ] ≤ E [ U T ] + As alluded to earlier, the crux of the proof of Theorem 1.1 is to show that the expected cost of theminimum cost matching on the odd degree vertices of the sampled tree is at most

$OPT(1/2 - \epsilon)$. We do this by showing the existence of a cheap feasible $O$-join solution to (4).

First, recall that if we only wanted to get an $O$-join solution of value at most $OPT/2$, to satisfy all cuts, it is enough to set $y_e := x_e/2$ for each edge [Wol80]. To do better, we want to take advantage of the fact that we only need to satisfy a constraint in the $O$-join for $S$ when $\delta(S)_T$ is odd. Here, we are aided by the fact that the sampled tree is likely to have many even cuts because it is drawn from a Strong Rayleigh distribution.

If an edge $e$ is exclusively on even cuts then $y_e$ can be reduced below $x_e/2$. This, more or less, was the approach in [OSS11] for graphic TSP, where it was shown that a constant fraction of LP edges will be exclusively on even near min cuts with constant probability. The difficulty in implementing this approach in the metric case comes from the fact that a high cost edge can be on many cuts and it may be exceedingly unlikely that all of these cuts will be even simultaneously.

Overall, our approach to addressing this is to start with $y_e := x_e/2$ and then modify it with a random slack vector $s : E \to \mathbb{R}$: when certain special (few) cuts that $e$ is on are even we let $s_e = -x_e\eta/8$ (for a carefully chosen constant $\eta > 0$); for other cuts that contain $e$, whenever they are odd, we will increase the slack of other edges on that cut to satisfy them. The bulk of our effort is to show that we can do this while guaranteeing that $\mathbb{E}[s_e] < -\epsilon\eta x_e$ for some $\epsilon > 0$, where the randomness comes from the random sampling of the tree. Note that we do not need to worry about any cut $S$ such that $x(\delta(S)) > 2(1 + \eta)$. Since we always have $s_e \ge -x_e\eta/8$, any such cut is always satisfied, even if every edge in $\delta(S)$ is decreased and no edge is increased.

Let $OPT$ be the optimum TSP tour, i.e., a Hamiltonian cycle, with set of edges $E^*$; throughout the paper, we write $e^*$ to denote an edge in $E^*$.
To bound the expected cost of the O -join for arandom spanning tree T ∼ µ λ , we also construct a random slack vector s ∗ : E ∗ → R ≥ such that ( x + OPT ) /4 + s + s ∗ is a feasible for Eq. (4) with probability 1. In Section 3.1 we explain how touse s ∗ to satisfy all but a linear number of near mincuts. Theorem 3.1 (Main Technical Theorem) . Let x be a solution of LP (1) with support E = E ∪ { e } ,and x be x restricted to E. Let z : = ( x + OPT ) /2 , η ≤ − and let µ be the max-entropy distributionwith marginals x. Also, let E ∗ denote the support of OPT. There are two functions s : E → R ands ∗ : E ∗ → R ≥ (as functions of T ∼ µ ), , such thati) For each edge e ∈ E, s e ≥ − x e η /8 .ii) For each η -near-min-cut S of z, if δ ( S ) T is odd, then s ( δ ( S )) + s ∗ ( δ ( S )) ≥ iii) For every OPT edge e ∗ , E [ s ∗ e ∗ ] ≤ η and for every LP edge e = e , E [ s e ] ≤ − x e ǫ P η /2 for ǫ P deﬁned in (31) . In the next subsection, we explain the main ideas needed to prove this technical theorem. Butﬁrst, we show how our main theorem follows readily from Theorem 3.1.
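Before the proof, here is a quick sanity check of the earlier claim that cuts with $x(\delta(S)) > 2(1+\eta)$ need no attention. With $y_e = x_e/2 + s_e$ and $s_e \ge -x_e\eta/8$, the worst case gives $y(\delta(S)) \ge x(\delta(S))(1/2 - \eta/8)$, which at the threshold equals $(1+\eta)(1-\eta/4)$. The Python sketch below verifies this with exact rationals; the function name and the sample values of $\eta$ are illustrative assumptions, not constants from the paper.

```python
from fractions import Fraction


def worst_case_cut_value(x_cut, eta):
    """Lower bound on y(delta(S)) when y_e = x_e/2 + s_e and s_e >= -x_e*eta/8.

    The worst case is every edge of the cut being reduced by the full
    x_e * eta / 8 while no edge is increased.
    """
    return x_cut * (Fraction(1, 2) - eta / 8)


for eta in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 10**6)):
    threshold = 2 * (1 + eta)  # cuts heavier than this are handled for free
    bound = worst_case_cut_value(threshold, eta)
    # (1 + eta)(1 - eta/4) = 1 + 3*eta/4 - eta^2/4, which is >= 1 for 0 <= eta <= 3
    assert bound == 1 + Fraction(3, 4) * eta - eta**2 / 4
    assert bound >= 1
```

Since the bound is increasing in $x(\delta(S))$, any cut above the threshold satisfies its $O$-join constraint regardless of how the slack is assigned.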

Proof of Theorem 1.1.

Let $x^0$ be an extreme point solution of LP (1), with support $E_0$, and let $x$ be $x^0$ restricted to $E$. By Fact 2.1, $x$ is in the spanning tree polytope. Let $\mu = \mu_{\lambda^*}$ be the max-entropy distribution with marginals $x$, and let $s, s^*$ be as defined in Theorem 3.1. We will define $y : E_0 \to \mathbb{R}_{\ge 0}$ and $y^* : E^* \to \mathbb{R}_{\ge 0}$. Let
$$y_e = \begin{cases} x_e/4 + s_e & \text{if } e \in E, \\ \infty & \text{if } e = e_0, \end{cases}$$
and let $y^*_{e^*} = 1/4 + s^*_{e^*}$ for any edge $e^* \in E^*$. We will show that $y + y^*$ is a feasible solution to (4). First, observe that for any $S$ where $e_0 \in \delta(S)$, we have $y(\delta(S)) + y^*(\delta(S)) \ge$

1. Otherwise,we assume u , v / ∈ S . If S is an η -near min cut w.r.t., z and δ ( S ) T is odd, then by property (ii) ofTheorem 3.1, we have y ( δ ( S )) + y ∗ ( δ ( S )) = z ( δ ( S )) + s ( δ ( S )) + s ∗ ( δ ( S )) ≥ S is not an η -near min cut (w.r.t., z ), y ( δ ( S )) + y ∗ ( δ ( S )) ≥ z ( δ ( S )) − η x ( δ ( S )) ≥ z ( δ ( S )) − η ( z ( δ ( S )) − ) ≥ z ( δ ( S ))( − η /4 ) + η /4 ≥ ( + η )( − η /4 ) + η /4 ≥ Recall that we merely need to prove the existence of a cheap O-join solution. The actual optimal O-join solutioncan be found in polynomial time. s e ≥ x e η /8with probability 1 for all LP edges and that s ∗ e ∗ ≥ z = ( x + OPT ) /2, so, since OPT ≥ x ( δ ( S )) ≤ ( z ( δ ( S )) − ) .Therefore, y + y ∗ is a feasible O -join solution.Finally, using c ( e ) = E [ c ( y ) + c ( y ∗ )] = OPT /4 + c ( x ) /4 + E [ c ( s ) + c ( s ∗ )] ≤ OPT /4 + c ( x ) /4 + η OPT − ǫ P η c ( x ) /2 ≤ ( − ǫ P η /4 ) OPT choosing η such that 45 η = ǫ P /4.1 and that c ( x ) ≤ OPT .Now, we are ready to bound approximation factor of our algorithm. First, since x is anextreme point solution of (1), min e ∈ E x e ≥ n ! . So, by Theorem 1.3, in polynomial time we canﬁnd λ : E → R ≥ such that for any e ∈ E , P µ λ [ e ] ≤ x e ( + δ ) for some δ that we ﬁx later. Itfollows that ∑ e ∈ E | P µ [ e ] − P µ λ [ e ] | ≤ n δ .By stability of maximum entropy distributions (see [SV19, Thm 4] and references therein), wehave that k µ − µ λ k ≤ O ( n δ ) = : q . Therefore, for some δ ≪ n − we get k µ − µ λ k = q ≤ ǫ P η .That means that E T ∼ µ λ [ min cost matching ] ≤ E T ∼ µ [ c ( y ) + c ( y ∗ )] + q ( OPT /2 ) ≤ (cid:18) − ǫ P η + ǫ P η (cid:19) OPT ,where we used that for any spanning tree the cost of the minimum cost matching on odd degreevertices is at most

OPT /2. Finally, since E T ∼ µ λ [ c ( T )] ≤ OPT ( + δ ) and ǫ P = · − we geta 3/2 − · − approximation algorithm for TSP. The ﬁrst step of the proof is to show that it sufﬁces to construct a slack vector s for a “cactus-like”structure of near min-cuts that we call a hierarchy . Informally, a hierarchy H is a laminar familyof mincuts , consisting of two types of cuts: triangle cuts and degree cuts . A triangle S is the unionof two min-cuts X and Y in H such that x ( E ( X , Y )) =

1. See Fig. 2 for an example of a hierarchy with three triangles.

We will refer to the set of edges $E(X, \overline{S})$ (resp. $E(Y, \overline{S})$) as $A$ (respectively $B$) for a triangle cut $S$. In addition, we say a triangle cut $S$ is happy if $A_T$ and $B_T$ are both odd. All other cuts are called degree cuts. A degree cut $S$ is happy if $\delta(S)_T$ is even.

Theorem 3.2 (Main Payment Theorem (informal)). Let $G = (V, E, x)$ for LP solution $x$ and let $\mu$ be the max-entropy distribution with marginals $x$. Given a hierarchy $\mathcal{H}$, there is a slack vector $s : E \to \mathbb{R}$ such that
i) For each edge $e \in E$, $s_e \ge -x_e\eta/8$.
ii) For each cut $S \in \mathcal{H}$, if $\delta(S)_T$ is not happy, then $s(\delta(S)) \ge 0$.
iii) For every LP edge $e \ne e_0$, $\mathbb{E}[s_e] \le -\eta\epsilon_P x_e$ for $\epsilon_P > 0$.

(Footnote to the definition of a hierarchy: This is really a family of near-min-cuts, but for the purpose of this overview, assume $\eta = 0$.)

Figure 2: An example of part of a hierarchy with three triangles. The graph on the left shows part of a feasible LP solution where dashed (and sometimes colored) edges have fraction 1/2 and solid edges have fraction 1. The dotted ellipses on the left show the min-cuts $u_1, u_2, u_3$ in the graph. (Each vertex is also a min-cut.) On the right is a representation of the corresponding hierarchy. Triangle $u_1$ corresponds to the cut $\{a, b\}$, $u_2$ corresponds to $\{c, d\}$ and $u_3$ corresponds to $\{a, b, c, d\}$. Note that, for example, the edge $(a, c)$, represented in green, is in $\delta(u_1)$, $\delta(u_2)$, and inside $u_3$. For triangle $u_1$, we have $A = \delta(a) \setminus (a, b)$ and $B = \delta(b) \setminus (b, d)$.

In the following subsection, we discuss how to prove this theorem. Here we explain at a high level how to define the hierarchy and reduce Theorem 3.1 to this theorem. The details are in Section 4.

First, observe that, given Theorem 3.2, cuts in $\mathcal{H}$ will automatically satisfy (ii) of Theorem 3.1. The approach we take to satisfying all other cuts is to introduce additional slack, the vector $s^*$, on OPT edges.

Consider the set of all near-min-cuts of $z$, where $z := (x + OPT)/2$.
Starting with $z$ rather than $x$ allows us to restrict attention to a significantly more structured collection of near-min-cuts. The key observation here is that in $OPT$, all min-cuts have value 2, and any non-min-cut has value at least 4. Therefore averaging $x$ with $OPT$ guarantees that every $\eta$-near min-cut of $z$ must consist of a contiguous sequence of vertices (an interval) along the OPT cycle. Moreover, each of these cuts is a $2\eta$-near min-cut of $x$. Arranging the vertices in the OPT cycle around a circle, we identify every such cut with the interval of vertices that does not contain $(u_0, v_0)$. Also, we say that a cut is crossed on both sides if it is crossed on the left and on the right.

To ensure that any cut $S$ that is crossed on both sides is satisfied, we first observe that $S$ is odd with probability $O(\eta)$. To see this, let $S_L$ and $S_R$ be the cuts crossing $S$ on the left and right with minimum intersection with $S$ and consider the two (bad) events $\{E(S \cap S_L, S_L \setminus S)_T \ne 1\}$ and $\{E(S \cap S_R, S_R \setminus S)_T \ne 1\}$. Recall that if $A$, $B$ and $A \cup B$ are all near-min-cuts, then $\mathbb{P}[E(A, B)_T \ne 1] = O(\eta)$ (see Corollary 2.24). Applying this fact to the two aforementioned bad events implies that each of them has probability $O(\eta)$. Therefore, we will let the two OPT edges in $\delta(S)$ be responsible for these two events, i.e., we will increase the slack $s^*$ on these two OPT edges by $O(\eta)$ when the respective bad event happens. This gives $\mathbb{E}[s^*(e^*)] = O(\eta^2)$ for each OPT edge $e^*$. As we will see, this simple step will reduce the number of near-min-cuts of $z$ that we need to worry about satisfying to $O(n)$.

Next, we consider the set of near-min-cuts of $z$ that are crossed on at most one side. Partition these into maximal connected components of crossing cuts. Each such component corresponds to an interval along the OPT cycle and, by definition, these intervals form a laminar family.

A single connected component $\mathcal{C}$ of at least two crossing cuts is called a polygon. We prove the following structural theorem about the polygons induced by $z$:

Theorem 3.3 (Polygons look like cycles (Informal version of Theorem 4.9)).
Given a connected component $\mathcal{C}$ of near-min-cuts of $z$ that are crossed on one side, consider the coarsest partition of vertices of the OPT cycle into a sequence $a_1, \dots, a_{m-1}$ of sets called atoms (together with $a_0$, which is the set of vertices not contained in any cut of $\mathcal{C}$). Then
• Every cut in $\mathcal{C}$ is the union of some number of consecutive atoms in $a_1, \dots, a_{m-1}$.
• For each $i$ such that $0 \le i < m - 1$, $x(E(a_i, a_{i+1})) \approx 1$ and similarly $x(E(a_{m-1}, a_0)) \approx 1$.
• For each $i > 0$, $x(\delta(a_i)) \approx 2$.

The main observation used to prove Theorem 3.3 is that the cuts in $\mathcal{C}$ crossed on one side can be partitioned into two laminar families $\mathcal{L}$ and $\mathcal{R}$, where $\mathcal{L}$ (resp. $\mathcal{R}$) is the set of cuts crossed on the left (resp. right). This immediately implies that $|\mathcal{C}|$ is linear in $m$. Since cuts in $\mathcal{L}$ cannot cross each other (and similarly for $\mathcal{R}$), the proof boils down to understanding the interaction between $\mathcal{L}$ and $\mathcal{R}$.

The approximations in Theorem 3.3 are correct up to $O(\eta)$. Using additional slack in $OPT$, at the cost of an additional $O(\eta^2)$ per edge, we can treat these approximate equations as if they are exact. Observe that if $x(E(a_i, a_{i+1})) = 1$, and $x(\delta(a_i)) = x(\delta(a_{i+1})) = 2$ for $1 \le i \le m - 2$, then with probability 1, $E(a_i, a_{i+1})_T = 1$. Therefore, any cut in $\mathcal{C}$ which doesn't include $a_1$ or $a_{m-1}$ is even with probability 1. The cuts in $\mathcal{C}$ that contain $a_1$ are even precisely when $E(a_0, a_1)_T = 1$, and similarly the cuts in $\mathcal{C}$ that contain $a_{m-1}$ are even when $E(a_0, a_{m-1})_T = 1$. These observations are what allow us to imagine that each polygon is a triangle, i.e., assume $m = 3$.

The hierarchy $\mathcal{H}$ is the set of all $\eta$-near mincuts of $z$ that are not crossed at all (these will be the degree cuts), together with a triangle for every polygon. In particular, for a connected component $\mathcal{C}$ of size more than 1, the corresponding triangle cut is $a_1 \cup \dots \cup a_{m-1}$, with $A = E(a_0, a_1)$ and $B = E(a_0, a_{m-1})$. Observe that from the discussion above, when a triangle cut is happy, then all of the cuts in the corresponding polygon $\mathcal{C}$ are even.

Summarizing, we show that if we can construct a good slack vector $s$ for a hierarchy of degree cuts and triangles, then there is a nonnegative slack vector $s^*$ that satisfies all near-minimum cuts of $z$ not represented in the hierarchy, while maintaining slack for each OPT edge $e^*$ such that $\mathbb{E}[s^*(e^*)] = O(\eta^2)$.

Remarks:

The reduction that we sketched above only uses the fact that $\mu$ is an arbitrary distribution of spanning trees with marginals $x$ and not necessarily a maximum-entropy distribution.

We also observe that to prove Theorem 1.1, we crucially used that $45\eta \ll \epsilon$. This forces us to take $\eta$ very small, which is why we get only a "very slightly" improved approximation algorithm for TSP. Furthermore, since we use OPT edges in our construction, we don't get a new upper bound on the integrality gap. We leave it as an open problem to find a reduction to the "cactus" case that doesn't involve using a slack vector for OPT (or a completely different approach).

(Footnote from the preceding discussion of polygons: Roughly, this corresponds to the definition of the polygon being left-happy.)

3.2 Proof ideas for Theorem 3.2

We now address the problem of constructing a good slack vector $s$ for a hierarchy of degree cuts and triangle cuts. For each LP edge $f$, consider the lowest cut in the hierarchy that contains both endpoints of $f$. We call this cut $p(f)$. If $p(f)$ is a degree cut, then we call $f$ a top edge and otherwise, it is a bottom edge. We will see that bottom edges are easier to deal with, so we start by discussing the slack vector $s$ for top edges.

Let $S$ be a degree cut and let $e = (u, v)$ (where $u$ and $v$ are children of $S$ in $\mathcal{H}$) be the set of all top edges $f = (u', v')$ such that $u' \in u$ and $v' \in v$. We call $e$ a top edge bundle and say that $u$ and $v$ are the top cuts of each $f \in e$. We will also sometimes say that $e \in S$.

Ideally, our plan is to reduce the slack of every edge $f \in e$ when it is happy, that is, both of its top cuts are even in $T$. Specifically, we will set $s_f := -\eta x_f$ when $\delta(u)_T$ and $\delta(v)_T$ are even. When this happens, we say that $f$ is reduced, and refer to the event $\{\delta(u)_T, \delta(v)_T \text{ even}\}$ as the reduction event for $f$.
Since this latter event doesn't depend on the actual endpoints of $f$, we view this as a simultaneous reduction of $s_e$.

Now consider the situation from the perspective of the degree cut $u$ (where $p(u) = S$) and consider any incident edge bundle in $S$, e.g., $e = (u, v)$. Either its top cuts are both even and $s_e := -\eta x_e$, or they aren't even, because, for example, $\delta(u)_T$ is odd. In this latter situation, edges in $\delta^{\uparrow}(u) := \delta(u) \cap \delta(S)$ might have been reduced (because their top two cuts are even), which a priori could leave $\delta(u)$ unsatisfied. In such a case, we increase $s_e$ for edge bundles in $\delta^{\rightarrow}(u) := \delta(u) \setminus \delta(S)$ to compensate for this reduction. Our main goal is then to prove that for any edge bundle its expected reduction is greater than its expected increase. The next example shows this analysis in an ideal setting.

Example 3.4. Fix a top edge bundle $e = (u, v)$ with $p(e) = S$. Let $x_u := x(\delta^{\uparrow}(u))$ and let $x_v := x(\delta^{\uparrow}(v))$. Suppose we have constructed a (fractional) matching between edges whose top two cuts are children of $S$ in $\mathcal{H}$ and the edges in $\delta(S)$, and this matching satisfies the following three conditions: (a) $e = (u, v) \in S$ is matched (only) to edges going higher from its top two cuts (i.e., to edges in $\delta^{\uparrow}(u)$ and $\delta^{\uparrow}(v)$), (b) $e$ is matched to an $m_{e,u}$ fraction of every edge in $\delta^{\uparrow}(u)$ and to an $m_{e,v}$ fraction of each edge in $\delta^{\uparrow}(v)$, where
$$m_{e,u} + m_{e,v} = x_e,$$
and (c) the fractional value of edges in $\delta^{\rightarrow}(u) := \delta(u) \setminus \delta^{\uparrow}(u)$ matched to edges in $\delta^{\uparrow}(u)$ is equal to $x_u$. That is, for each $u \in S$, $\sum_{f \in \delta^{\rightarrow}(u)} m_{f,u} = x_u$.

The plan is for $e \in S$ to be tasked with part of the responsibility for fixing the cuts $\delta(u)$ and $\delta(v)$ when they are odd and edges going higher are reduced. Specifically, $s_e$ is increased to compensate for an $m_{e,u}$ fraction of the reductions in edges in $\delta^{\uparrow}(u)$ when $\delta(u)_T$ is odd. (And similarly for reductions in $v$.) Thus,
$$\begin{aligned}
\mathbb{E}[s_e] = {} & -\mathbb{P}[e \text{ reduced}]\,\eta x_e \\
& + m_{e,u} \sum_{g \in \delta^{\uparrow}(u)} \mathbb{P}[\delta(u)_T \text{ odd} \mid g \text{ reduced}]\, \mathbb{P}[g \text{ reduced}]\, \eta \frac{x_g}{x(\delta^{\uparrow}(u))} \\
& + m_{e,v} \sum_{g \in \delta^{\uparrow}(v)} \mathbb{P}[\delta(v)_T \text{ odd} \mid g \text{ reduced}]\, \mathbb{P}[g \text{ reduced}]\, \eta \frac{x_g}{x(\delta^{\uparrow}(v))}.
\end{aligned} \tag{9}$$

(Footnote to the definition of top and bottom edges: For example, in Fig. 2, $p(a, c) = u_3$, and $(a, c)$ is a bottom edge.)

We will lower bound $\mathbb{P}[\delta(u)_T \text{ even} \mid g \text{ reduced}]$. We can write this as
$$\mathbb{P}\big[\delta^{\rightarrow}(u)_T \text{ and } \delta^{\uparrow}(u)_T \text{ have same parity} \mid g \text{ reduced}\big].$$
Unfortunately, we do not currently have a good handle on the parity of $\delta^{\uparrow}(u)_T$ conditioned on $g$ reduced. However, we can use the following simple but crucial property: since $x(\delta(S)) = 2$, $T$ consists of two independent trees, one on $S$ and one on $V \setminus S$, each with the corresponding marginals of $x$. Therefore, we can write
$$\mathbb{P}[\delta(u)_T \text{ even} \mid g \text{ reduced}] \ge \min\big(\mathbb{P}[(\delta^{\rightarrow}(u))_T \text{ even}],\ \mathbb{P}[(\delta^{\rightarrow}(u))_T \text{ odd}]\big).$$
This gives us a reasonable bound when $\epsilon \le x_u, x_v \le 1 - \epsilon$ since, because $x(\delta(u)) = x(\delta(v)) = 2$, $(\delta^{\rightarrow}(u))_T$ (and similarly $(\delta^{\rightarrow}(v))_T$) is the sum of Bernoullis with expectation in $[1 + \epsilon, 2 - \epsilon]$. From this it follows that
$$\min\big(\mathbb{P}[(\delta^{\rightarrow}(u))_T \text{ even}],\ \mathbb{P}[(\delta^{\rightarrow}(u))_T \text{ odd}]\big) = \Omega(\epsilon).$$
We can therefore conclude that $\mathbb{P}[\delta(u)_T \text{ odd} \mid g \text{ reduced}] \le 1 - O(\epsilon)$.

The rest of the analysis of this special case follows from (a) the fact that our construction will guarantee that for all edges $g$, the probability that $g$ is reduced is exactly $p$, i.e., it is the same for all edges, and (b) the fact that $m_{e,u} + m_{e,v} = x_e$. Plugging these facts back into (9) gives
$$\mathbb{E}[s_e] \le -p\eta x_e + m_{e,u}(1 - \epsilon)p\eta + m_{e,v}(1 - \epsilon)p\eta \le -p\eta x_e + (1 - \epsilon)p\eta x_e = -\epsilon p\eta x_e. \tag{10}$$
If we could prove (10) for every edge $f$ in the support of $x$, that would complete the proof that the expected cost of the min $O$-join for a random spanning tree $T \sim \mu$ is at most $(1/2 - \epsilon)OPT$.

Remark:

Throughout this paper, we repeatedly use a mild generalization of the above "independent trees fact": that if $S$ is a cut with $x(\delta(S)) \le 2 + \epsilon$, then $S_T$ is very likely to be a tree. Conditioned on this fact, marginals inside $S$ and outside $S$ are nearly preserved and the trees inside $S$ and outside $S$ are sampled independently (see Lemma 2.23).

Ideal reduction:

In the example, we were able to show that $\mathbb{P}[\delta(u)_T \text{ odd} \mid g \text{ reduced}]$ was bounded away from 1 for every edge $g \in \delta^{\uparrow}(u)$, and this is how we proved that the expected reduction for each edge was greater than the expected increase on each edge, yielding negative expected slack. This motivates the following definition: a reduction for an edge $g$ is $k$-ideal if, conditioned on $g$ reduced, every cut $S$ that is in the top $k$ levels of cuts containing $g$ is odd with probability that is bounded away from 1.

Moving away from an idealized setting: In Example 3.4, we oversimplified in four ways:
(a) We assumed that it would be possible to show that each top edge is good. That is, that its top two cuts are even simultaneously with constant probability.
(b) We considered only top edge bundles (i.e., edges whose top cuts were inside a degree cut).
(c) We assumed that $x_u, x_v \in [\epsilon, 1 - \epsilon]$.
(d) We assumed the existence of a nice matching between edges whose top two cuts were children of $S$ and the edges in $\delta(S)$.
Our proof needs to address all four anomalies that result from deviating from these assumptions.

Figure 3: An Example with Bad Edges. A feasible solution of (1) is shown; dashed edges have fraction 1/2 and solid edges have fraction 1. Writing $E = E_0 \setminus \{e_0\}$ as a maximum entropy distribution $\mu$ we get the following: Edges $(a, b)$, $(c, d)$ must be completely negatively correlated (and independent of all other edges). So, $(b, u_0)$, $(a, u_0)$ are also completely negatively correlated. This implies $(a, b)$ is a bad edge.

Bad edges. Consider first (a). Unfortunately, it is not the case that all top edges are good. Indeed, some are bad. However, it turns out that bad edges are rare in the following senses: First, for an edge to be bad, it must be a half edge, where we say that an edge $e$ is a half edge if $x_e \in 1/2 \pm \epsilon$ for a suitably chosen constant $\epsilon$. Second, of any two half edge bundles sharing a common endpoint in the hierarchy, at least one is good. For example, in Fig. 3, $(a, u_0)$ and $(b, u_0)$ are good half-edge bundles. We advise the reader to ignore half edges in the first reading of the paper. Correspondingly, we note that our proofs would be much simpler if half-edge bundles never showed up in the hierarchy. It may not be a coincidence that half edges are hard to deal with, as it is conjectured that TSP instances with half-integral LP solutions are the hardest to round [SWZ12; SWZ13].

Our solution is to never reduce bad edges. But this in turn poses two problems. First, it means that we need to address the possibility that the bad edges constitute most of the cost of the LP solution. Second, our objective is to get negative expected slack on each good edge and non-positive expected slack on bad edges. Therefore, if we never reduce bad edges, we can't increase them either, which means that the responsibility for fixing an odd cut with reduced edges going higher will have to be split amongst fewer edges (the incident good ones).

We deal with the first problem by showing that in every cut $u$ in the hierarchy at least 3/4 of the fractional mass in $\delta(u)$ is good and these edges suffice to compensate for reductions on the edges going higher. Moreover, because there are sufficiently many good edges incident to each cut, we can show that either using the slack vector $\{s_e\}$ gives us a low-cost $O$-join, or we can average it out with another $O$-join solution concentrated on bad edges to obtain a reduced cost matching of odd degree vertices.

We deal with the second problem by proving Lemma 6.2, which guarantees a matching between good edge bundles $e = (u, v)$ and fractions $m_{e,u}$, $m_{e,v}$ of edges in $\delta^{\uparrow}(u)$, $\delta^{\uparrow}(v)$ such that, roughly, $m_{e,u} + m_{e,v} = (1 + O(\epsilon))x_e$.

Figure 4: In the triangle $u$ corresponding to the cut $\delta(a_1 \cup a_2)$, when $A_T$ and $B_T$ are odd, all 3 cuts ($\delta(a_1)_T$, $\delta(a_2)_T$ and $\delta(a_1 \cup a_2)_T = \delta(u)_T$) are even (since $f_T$ is always 1). (Recall also that the edges in the bundle $e$ must have one endpoint in $a_1 \cup a_2$ and one endpoint in $a_3 \cup a_4$, as was the case, e.g., for the edge $(a, c)$ in Fig. 2.)

Dealing with triangles.

Turning to (b), consider a triangle cut $S$, for example $\delta(a_1 \cup a_2)$ in Fig. 4. Recall that in a triangle, we can assume that there is an edge of fractional value 1 connecting $a_1$ and $a_2$ in the tree, and this is why we defined the cut to be happy when $A_T$ and $B_T$ are odd: this guarantees that all 3 cuts defined by the triangle ($\delta(a_1)$, $\delta(a_2)$, $\delta(a_1 \cup a_2)$) are even.

Now suppose that $e = (u, v)$ is a top edge bundle, where $u$ and $v$ are both triangles, as shown in Fig. 4. Then we'd like to reduce $s_e$ when both cuts $u$ and $v$ are happy. But this would require more than simply both cuts being even. This would require all of $A_T$, $B_T$, $A'_T$, $B'_T$ to be odd. Note that if, for whatever reason, $e$ is reduced only when $\delta(u)_T$ and $\delta(v)_T$ are both even, then it could be, for example, that this only happens when $A_T$ and $B_T$ are both even. In this case, both $\delta(a_1)_T$ and $\delta(a_2)_T$ will be odd with probability 1 (recalling that $f_T = 1$), which necessitates an increase in $s_f$ whenever $e$ is reduced. In other words, the reduction will not even be 1-ideal.

It turns out to be easier for us to get a 1-ideal reduction rule for $e$ as follows: Say that $e$ is 2-1-1 happy with respect to $u$ if $\delta(u)_T$ is even and both $A'_T$, $B'_T$ are odd. We reduce $e$ with probability $p/2$ when it is 2-1-1 happy with respect to $u$ and with probability $p/2$ when it is 2-1-1 happy with respect to $v$. This means that when $e$ is reduced, half of the time no increase in $s_f$ is needed since $u$ is happy. Similarly for $v$.

The 2-1-1 criterion for reduction introduces a new kind of bad edge: a half edge that is good, but not 2-1-1 good. We are able to show that non-half-edge bundles are 2-1-1 good (Lemmas 5.22 and 5.23), and that if there are two half edges which are both in $A$ or are both in $B$, then at least one of them is 2-1-1 good (Lemma 5.25). Finally, we show that if there are two half edges, where one is in $A$ and the other is in $B$, and neither is 2-1-1 good, then we can apply a different reduction criterion. When that criterion applies, we are guaranteed to decrease both of the half edge bundles simultaneously. All together, the various considerations discussed in this paragraph force us to come up with a relatively more complicated set of rules under which we reduce $s_e$ for a top edge bundle $e$ whose children are triangle cuts. Section 5 focuses on developing the relevant probabilistic statements.

Bottom edge reduction.

Next, consider a bottom edge bundle $f = (a_1, a_2)$ where $p(a_1) = p(a_2)$ is a triangle. Our plan is to reduce $s_f$ (i.e., set it to $-\eta x_f$) when the triangle is happy, that is, $A_T = B_T = 1$. The good news here is that every triangle is happy with constant probability. However, when a triangle is not happy, $s_f$ may need to increase to make sure that the $O$-join constraints for $\delta(a_1)$ and $\delta(a_2)$ are satisfied, if edges in $A$ and $B$ going higher are reduced. Since $x_f = x(A) = x(B) = 1$, this means that $f$ may need to compensate at twice the rate at which it is getting reduced. This would result in $\mathbb{E}[s_f] >$

0, which is the opposite of what we seek.We use two key ideas to address this problem. First, we reduce top edges and bottom edgesby different amounts: Speciﬁcally, when the relevant reduction event occurs, we reduce a bottomedge f by β x f and top edges e by τ x e , where β > τ (and τ is a multiple of η ).Thus, the expected reduction in s f is p β x f = p β , whereas the expected increase (due tocompensation of, say, top edges going higher) is p τ ( x ( A ) + x ( B )) q = p τ q , where q = P [ triangle happy | reductions in A and B ] .Thus, so long as 2 τ q < β − ǫ , we get the expected reduction in s f that we seek.The discussion so far suggests that we need to take τ smaller than β /2 q , which is β /2 if q is 1, for example. On the other hand, if τ = β /2, then when a top edge needs to ﬁx a cut dueto reductions on bottom edges, we have the opposite problem – their expected increase will begreater than their expected reduction, and we are back to square one.Coming to our aid is the second key idea, already discussed in Section 1.2.3. We reducebottom edges only when A T = B T = and the marginals of edges in A , B are approximatelypreserved (conditioned on A T = B T = ∞ -ideal.It turns out that the combined effects of (a) choosing τ = β , and (b) getting better boundson the probability that a lower cut is even given that a bottom edge is reduced, sufﬁce to dealwith the interaction between the reductions and the increases in slack for top and bottom edges. a a f ˆ S ˆ A ˆ Ba ′ a ′ A → − α B → α A ↑ α B ↑ − α Figure 5: Setting of Example 3.5. Note that the set A = δ ( a ) ∩ δ ( a ′ ) decomposes into two sets ofedges, A ↑ , those that are also in δ ( S ) , and the rest, which we call A → . Similarly for B . Example . [Bottom-bottom case] To see how preserving marginals helps us handle the inter-action between bottom edges at consecutive levels, consider a triangle cut a ′ = { a , a } whose26arent cut ˆ S = { a ′ , a ′ } is also a triangle cut (as shown in Fig. 5). 
Let’s analyze E [ s f ] where f = ( a , a ) . Observe ﬁrst that A → ∪ B → is a bottom edge bundle in the triangle ˆ S and all edgesin this bundle are reduced simultaneously when ˆ A T = ˆ B T = A ∪ ˆ B are approximately preserved. (For the purposes of this overview, we’ll assume they are preservedexactly). Let x ( A ↑ ) = α . Then since A = A ↑ ∪ A → and x ( A ) =

1, we have x ( A → ) = − α . More-over, since ˆ A = A ↑ ∪ B ↑ and x ( ˆ A ) =

1, we also have x ( B ↑ ) = − α and x ( B → ) = α .Therefore, using the fact that when A → ∪ B → is reduced, exactly one edge in A ↑ ∪ B ↑ isselected (and also exactly one edge in A → ∪ B → is selected since it is a bottom edge bundle), andmarginals are preserved given the reduction, we conclude that P (cid:2) a ′ happy | A → ∪ B → reduced (cid:3) = P [ A T = B T = | A → ∪ B → reduced ] = α + ( − α ) .Now, we calculate E [ s f ] . First, note that f may have to increase to compensate either for reducededges in A ↑ ∪ B ↑ or in A → ∪ B → . For the sake of this discussion, suppose that A ↑ ∪ B ↑ is a set oftop edges. Then, in the worst case we need to increase f by p τ in expectation to ﬁx the cuts a , a due to the reduction in A ↑ ∪ B ↑ . Now, we calculate the expected increase due to the reduction in A → ∪ B → . The crucial observation is that edges in A → ∪ B → are reduced simultaneously, so bothcuts δ ( a ) and δ ( a ) can be ﬁxed simultaneously by an increase in s f . Therefore, when they areboth odd, it sufﬁces for f to increase bymax { x ( A → ) , x ( B → ) } β = max { α , 1 − α } β ,to ﬁx cuts a , a . Putting this together, we get E [ s f ] = − p β + E [ increase due to A → ∪ B → ] + E h increase due to A ↑ ∪ B ↑ i ≤ − p β + p β max α ∈ [ ] α [ − α − ( − α ) ] + p τ which, since max α ∈ [ ] α [ − α − ( − α ) ] = τ = β is = p β ( − + + ) = − p β . Dealing with x u close to . Now, suppose that e = ( u , v ) is a top edge bundle with x u : = x ( δ ↑ ( u )) is close to 1. Then, the analysis in Example 3.4, bounding r : = P [ δ ( u ) T odd | g reduced ] away from 1 for an edge g ∈ δ ↑ ( u ) doesn’t hold. To address this, we consider two cases: Theﬁrst case, is that the edges in δ ↑ ( u ) break up into many groups that end at different levels inthe hierarchy. 
In this case, we can analyze r separately for the edges that end at any givenlevel, taking advantage of the independence between the trees chosen at different levels of thehierarchy.The second case is when nearly all of the edges in δ ↑ ( u ) end at the same level, for example,they are all in δ → ( u ′ ) where p ( u ′ ) is a degree cut. In this case, we introduce a more complex(2-1-1) reduction rule for these edges. The observation is that from the perspective of these edges u ′ is a "pseudo-triangle". That is, it looks like a triangle cut, with atoms u and u ′ r u where δ ( u ) ∩ δ ( u ′ ) corresponds to the “ A ”-side of the triangle. Some portions of this discussion might be easier to understand after reading the rest of the paper. f = ( u ′ , v ′ ) ∈ δ → ( u ′ ) . So far, we only considered the following reduction rule for f : If both u ′ , v ′ are degree cuts, f reduces when they are both even in the tree; otherwise if say u ′ is a triangle cut, f reduces whenit is 2-1-1 good w.r.t., u ′ (and similarly for v ′ ). But clearly these rules ignore the pseudo triangle.The simplest adjustment is, if u ′ is a pseudo triangle with partition ( u , u ′ r u ) , to require f toreduce when A T = B T = v ′ is happy. However, as stated, it is not clear that the sets A and B are well-deﬁned. For example, u ′ could be an actual triangle or there could be multiple waysto see u ′ as a pseudo triangle only one of which is ( u , u ′ r u ) . Our solution is to ﬁnd the smallest disjoint pair of cuts a , b ⊂ u ′ in the hierarchy such that x ( δ ( a ) ∩ δ ( u ′ )) , x ( δ ( b ) ∩ δ ( u ′ )) ≥ − ǫ ,where ǫ is a ﬁxed universal constant, and then let A = δ ( a ) ∩ δ ( u ′ ) , B = δ ( b ) ∩ δ ( u ′ ) and C = δ ( u ′ ) r A r B (see Fig. 6 for an example). Then, we say f is 2-1-1 happy w.r.t., u ′ if A T = B T = C T = a a + ǫ a − ǫ a b b + ǫ b − ǫ b c c + ǫ c − ǫ c − ǫǫ u u a a a u a u b b b v b v c c c w c w u Figure 6:

Part of the hierarchy of the graph is shown on top. Edges of the same color have the same fraction and ǫ ≫ η is a small constant. u corresponds to the degree cut {a, a, a}, u corresponds to the triangle cut {u, a} and u corresponds to the degree cut containing all of the vertices shown. Observe that edges in δ↑(a) are top edges in the degree cut u. If ǫ < ǫ then the (A, B, C)-degree partitioning of edges in δ(u) is as follows: A = δ(a) ∩ δ(u) are the blue highlighted edges each of fractional value 1/2 − ǫ, B = δ(a) ∩ δ(u) are the green highlighted edges of total fractional value 1, and C are the red highlighted edges each of fractional value ǫ. The cuts that contain edge (a, c) are highlighted in the hierarchy at the bottom.

A few observations are in order: • Since u is a candidate for, say a, it must be that a is a descendant of u in the hierarchy (or equal to u). In addition, b cannot simultaneously be in u, since a ∩ b = ∅ and x(δ(u) ∩ δ(u′)) ≤ f is 2-1-1 happy w.r.t. u′ we get (δ(u) ∩ δ(u′))_T = • If u′ = (X, Y) is an actual triangle cut, then we must have a ⊆ X, b ⊆ Y. So, when f is 2-1-1 happy w.r.t. u′, we know that u′ is a happy triangle, i.e., (δ(X) ∩ δ(u′))_T = (δ(Y) ∩ δ(u′))_T = δ(u′) are 2-1-1 good w.r.t. u′. Then, when an edge g ∈ δ(u) ∩ δ(u′) is reduced, (δ(u) ∩ δ(u′))_T =

1, so P [ δ ( u ) T odd | g reduced ] ≤ P (cid:2) E ( u , u ′ r u ) T even | g reduced (cid:3) ≤ E ( u , u ′ r u ) are in the tree independent of the reduction and E [ E ( u , u ′ r u ) T ] ≈ Dealing with x u close to 0 and the matching. We already discussed how the matching ismodiﬁed to handle the existence of bad edges. We now observe that we can handle the case x u ≈ x ( δ → ( u )) ≫ x ( δ ↑ ( u )) . Roughly speaking, this enables us to ﬁnd a matching in which each edge in δ → ( u ) hasto increase about half as much as would normally be expected to ﬁx the cut of u . This eliminatesthe need to prove a nontrivial bound on P [ δ ( u ) T odd | g reduced ] . The details of the matchingare in Section 6. Let OPT be a minimum TSP solution, i.e., minimum cost Hamiltonian cycle and without lossof generality assume it visits u and v consecutively (recall that c ( u , v ) = E ∗ to denote the edges of OPT and we write e ∗ to denote an edge of OPT. Analogously, we use s ∗ : E ∗ → R ≥ to denote the slack vector that we will construct for OPT edges.Throughout this section we study η -near minimum cuts of G = ( V , E , z ) Note that these cutsare 2 η -near minimum cuts w.r.t., x . For every such near minimum cut, ( S , S ) , we identify the cutwith the side, say S , such that u , v / ∈ S . Equivalently, we can identify these cuts with an intervalalong the optimum cycle, OPT, that does not contain u , v .We will use “left" synonymously with “clockwise" and “right" synonymously with “counter-clockwise." We say a vertex is to the left of another vertex if it is to the left of that vertex and tothe right of edge e = ( u , v ) . Otherwise, we say it is to the right (including the root itself in thiscase). Deﬁnition 4.1 (Crossed on the Left/Right, Crossed on Both Sides) . For two crossing near minimumcuts S , S ′ , we say S crosses S ′ on the left if the leftmost endpoint of S on the optimal cycle is to the leftof the leftmost endpoint of S. 
Otherwise, we say S crosses S′ on the right. A near minimum cut is crossed on both sides if it is crossed on both the left and the right. We also say a near minimum cut is crossed on one side if it is either crossed on the left or on the right, but not both.

4.1 Cuts Crossed on Both Sides

The following theorem is the main result of this section:

Theorem 4.2.

Given OPT TSP tour with set of edges E ∗ , and a feasible LP solution x of (1) with supportE = E ∪ { e } and let x be x restricted to E. For any distribution µ of spanning trees with marginals x,if η < , then there is a random vector s ∗ : E ∗ → R ≥ (the randomness in s ∗ depends exclusively onT ∼ µ ) such that • For any vector s : E → R where s e ≥ − x e η /8 for all e and for any η -near minimum cut S w.r.t.,z = ( x + OPT ) /2 crossed on both sides where δ ( S ) T is odd, we have s ( δ ( S )) + s ∗ ( δ ( S )) ≥ ; • For any e ∗ ∈ E ∗ , E [ s ∗ e ∗ ] ≤ η . L ( e ∗ ) R ( e ∗ ) e ∗ u v Figure 7: L and R for an OPT edge e ∗ .For an OPT edge e ∗ = ( u , v ) , let L ( e ∗ ) be the largest η -near minimum cut (w.r.t. z ) containing u and not v which is crossed on both sides. Let R ( e ∗ ) be the largest near minimum cut containing v and not u which is crossed on both sides. For example, see Fig. 7. S L S R S Figure 8: S is crossed on the left by S L and on the right by S R . In green are edges in δ ( S ) L , inblue edges in δ ( S ) R , and in red are edges in δ ( S ) O .30 eﬁnition 4.3. For a near minimum cut S that is crossed on both sides let S L be the near minimum cutcrossing S on the left which minimizes the intersection with S, and similarly for S R ; if there are multiplesets crossing S on the left with the same minimum intersection, choose the smallest one to be S L (andsimilar do for S R ).We partition δ ( S ) into three sets δ ( S ) L , δ ( S ) R and δ ( S ) O as in Fig. 8 such that δ ( S ) L = E ( S ∩ S L , S L r S ) δ ( S ) R = E ( S ∩ S R , S R r S ) δ ( S ) O = δ ( S ) r ( δ ( S ) L ∪ δ ( S ) R ) For an OPT edge e ∗ deﬁne an (increase) event (of second type) I ( e ∗ ) as the event that at leastone of the following does not hold. | T ∩ δ ( L ( e ∗ )) R | = | T ∩ δ ( R ( e ∗ )) L | = T ∩ δ ( L ( e ∗ )) O = ∅ , and T ∩ δ ( R ( e ∗ )) O = ∅ . (11)In the proof of Theorem 4.2 we will increase an OPT edge e ∗ whenever I ( e ∗ ) occurs. Lemma 4.4.

For any OPT edge e ∗ , P [ I ( e ∗ )] ≤ η .Proof. Fix e ∗ . To simplify notation we abbreviate L ( e ∗ ) , R ( e ∗ ) to L , R . Since L is crossed on bothsides, L L , L R are well deﬁned. Since by Lemma 2.5 L L ∩ L , L L r L are 4 η -near min cuts and L is 2 η -near mincut with respect to x , by Corollary 2.24, P [ | T ∩ δ ( L ) L ) | = ] ≥ − η . Simi-larly, P [ | T ∩ δ ( R ) L | = ] ≥ − η . On the other hand, since L , L L , L R are 2 η -near min cuts, byLemma 2.6, x ( E ( L ∩ L R , L R )) , x ( E ( L ∩ L L , L L )) ≥ − η . Therefore x ( δ ( L ) O ) ≤ + η − x ( E ( L ∩ L R , L R )) − x ( E ( L ∩ L L , L L )) ≤ η .It follows that P [ T ∩ δ ( L ) O = ∅ ] ≥ − η . Similarly, P [ T ∩ δ ( R ) O = ∅ ] ≥ − η . Finally, by theunion bound, all events occur simultaneously with probability at least 1 − η . So, P [ I ( e ∗ )] ≤ η as desired. SS L S R e ∗ L e ∗ R I ( e ∗ L ) does not occur then E ( S ∩ S L , S L r S ) T = Lemma 4.5.

Let S be a cut which is crossed on both sides and let e*_L, e*_R be the OPT edges on its interval where e*_L is the edge further clockwise. Then, if δ(S)_T ≠ 2, at least one of I(e*_L), I(e*_R) occurs.

Proof. We prove by contradiction. Suppose none of I(e*_L), I(e*_R) occur; we will show that this implies δ(S)_T = 2. Let R = R(e*_L); note that S is a candidate for R(e*_L), so S ⊆ R. Therefore, S_L = R_L and we have

δ(R)_L = E(R ∩ R_L, R_L ∖ R) = E(R ∩ S_L, S_L ∖ R) = δ(S)_L,

where we used S ∩ S_L = R ∩ S_L and that S_L ∖ S = S_L ∖ R. Similarly let L = L(e*_R), and we have δ(L)_R = δ(S)_R.

Now, since I(e*_L) has not occurred, 1 = |T ∩ δ(R)_L| = |T ∩ δ(S)_L|, and since I(e*_R) has not occurred, 1 = |T ∩ δ(L)_R| = |T ∩ δ(S)_R|, where L = L(e*_R). So, to get δ(S)_T =

2, it remains to show that T ∩ δ(S)_O = ∅. Consider any edge e = (u, v) ∈ δ(S)_O where u ∈ S. We need to show e ∉ T. Assume that v is to the left of S (the other case can be proven similarly). Then e ∈ δ(R). So, since e goes to the left of R, either e ∈ E(R ∩ R_L, R_L ∖ R) or e ∈ δ(R)_O. But since e ∉ δ(S)_L = δ(R)_L, we must have e ∈ δ(R)_O. So, since I(e*_L) has not occurred, e ∉ T as desired.

Proof of Theorem 4.2.

For any OPT edge e ∗ whenever I ( e ∗ ) occurs, deﬁne s ∗ e ∗ = η /3.9. Then, byLemma 4.4, E [ s e ∗ ] ≤ η /3.9 and for any 2 η -near min cut S (w.r.t., x ) that is crossed on bothsides if δ ( S ) T is odd, then at least one of I ( e ∗ L ) , I w ( e ∗ R ) occurs, so s ( δ ( S )) + s ∗ ( δ ( S )) ≥ − x ( δ ( S )) η /8 + s ∗ e ∗ L + s ∗ e ∗ R ≥ − ( + η ) η /8 + η /3.9 ≥ η < The following theorem is the main result of this section.

Theorem 4.6.

Let x be a feasible solution of LP (1) with support E = E ∪ { e } and x be x restricted to E.Let µ be the max entropy distribution with marginals x. For η ≤ − , there is a set E g ⊂ E r δ ( { u , v } ) of good edges and two functions s : E → R and s ∗ : E ∗ → R ≥ (as functions of T ∼ µ ) such that(i) For each edge e ∈ E g , s e ≥ − x e η /8 and for any e ∈ E r E g , s e = .(ii) For each η -near-min-cut S w.r.t. z, if δ ( S ) T is odd, then s ( δ ( S )) + s ∗ ( δ ( S )) ≥ (iii) We have E [ s e ] ≤ − ǫ P η x e for all edges e ∈ E g and E [ s ∗ e ∗ ] ≤ η for all OPT edges e ∗ ∈ E ∗ . for ǫ P deﬁned in (31) .(iv) For every η -near minimum cut S of z crossed on (at most) one side such that S = V r { u , v } ,x ( δ ( S ) ∩ E g ) ≥ Theorem 3.1 (Main Technical Theorem) . Let x be a solution of LP (1) with support E = E ∪ { e } ,and x be x restricted to E. Let z : = ( x + OPT ) /2 , η ≤ − and let µ be the max-entropy distributionwith marginals x. Also, let E ∗ denote the support of OPT. There are two functions s : E → R ands ∗ : E ∗ → R ≥ (as functions of T ∼ µ ), , such that ) For each edge e ∈ E, s e ≥ − x e η /8 .ii) For each η -near-min-cut S of z, if δ ( S ) T is odd, then s ( δ ( S )) + s ∗ ( δ ( S )) ≥ iii) For every OPT edge e ∗ , E [ s ∗ e ∗ ] ≤ η and for every LP edge e = e , E [ s e ] ≤ − x e ǫ P η /2 for ǫ P deﬁned in (31) .Proof of Theorem 3.1. Let E g be the good edges deﬁned in Theorem 4.6 and let E b : = E r E g be theset of bad edges. We deﬁne a new vector ˜ s : E ∪ { e } → R as follows:˜ s ( e ) ←  ∞ if e = e − x e η /8 if e ∈ E b , x e η /4 otherwise. (12)Let ˜ s ∗ the vector in Theorem 4.2. We claim that for any η -near minimum cut S w.r.t., z such that δ ( S ) T is odd, we have ˜ s ( δ ( S )) + ˜ s ∗ ( δ ( S )) ≥ S = V r { u , v } crossed on at most oneside, ˜ s ( δ ( S )) + ˜ s ∗ ( δ ( S )) ≥ ˜ s ( δ ( S )) ≥ η s ( E g ∩ δ ( S )) − η s ( E b ∩ δ ( S )) ≥

0. (13)For S = V r { u , v } , we have δ ( S ) T = δ ( u ) T + δ ( v ) T = S crossed on both sides, by Theorem 4.2 and the fact that ˜ s e ≥ − η x e /8 for all e , theinequality holds.Now, we are ready to deﬁne s , s ∗ . Let ˆ s , ˆ s ∗ be the s , s ∗ of Theorem 4.6 respectively. Deﬁne s = α ˜ s + ( − α ) ˆ s and similarly deﬁne s ∗ = α ˜ s ∗ + ( − α ) ˆ s ∗ for some α that we choose later. Weprove all three conclusions for s , s ∗ . (i) follows by (i) of Theorem 4.6 and Eq. (12). (ii) follows by(ii) Theorem 4.6 and Eq. (13) above. It remains to verify (iii). For any OPT edge e ∗ , E [ s ∗ e ∗ ] ≤ η by (iii) of Theorem 4.6 and the construction of ˜ s ∗ . On the other hand, by (iii) of Theorem 4.6 andEq. (12), E [ s e ] ≤ x e ( αη /4 − ( − α ) ǫ P η ) , ∀ e ∈ E g , E [ s e ] = − x e ( − α ) η /8, ∀ e ∈ E b .Setting α = ǫ P we get E [ s e ] ≤ − ǫ P η x e /2 for e ∈ E g and E [ s e ] ≤ − x e η /9 for e ∈ E b as desired. Deﬁnition 4.7 (Connected Component of Crossing Cuts) . Given a family of cuts crossed on at mostone side, construct a graph where two cuts are connected by an edge if they cross. Partition this graph into maximal connected components . We call a path in this graph, a path of crossing cuts . In the rest of this section we will focus on a single connected component C of cuts crossed on(at most) one side. Deﬁnition 4.8 (Polygon) . For a connected component C of crossing near min cuts that are crossed onone side, let a , . . . , a m − be the coarsest partition of the vertices V , such that for all ≤ i ≤ m − and or any A ∈ C either a i ⊆ A or a i ∩ A = ∅ . These are called atoms. We assume a is the atom thatcontains the special edge e , and we call it the root . Note that for any A ∈ C , a ∩ A = ∅ .Since every cut A ∈ C corresponds to an interval of vertices in V in the optimum Hamiltonian cycle,we can arrange a , . . . , a m − around a cycle (in the counter clockwise order). 
We label the arcs in this cyclefrom 1 to m, where i + is the arc connecting a i and a i + (and m is the name of the arc connecting a m − and a ). Then every cut A ∈ C can be identiﬁed by the two arcs surrounding its atoms. Speciﬁcally, A isidentiﬁed with arcs i , j (where i < j) if A contains atoms a i , . . . , a j − , and we write ℓ ( A ) = i , r ( A ) = j.Note that A does not contain the root a .By construction for every arc ≤ i ≤ m, there exists a cut A such that ℓ ( A ) = i or r ( A ) = i.Furthermore, A , B ∈ C (with ℓ ( A ) ≤ ℓ ( B ) ) cross iff ℓ ( A ) < ℓ ( B ) < r ( A ) < r ( B ) .See Fig. 10 for a visual example. Notice that every atom of a polygon is an interval of the optimal cycle. In this section, weprove the following structural theorem about polygons of near minimum cuts crossed on oneside.

Theorem 4.9 (Polygon Structure) . For ǫ η ≥ η and any polygon with atoms a ... a m − (where a isthe root) the following holds: • For all adjacent atoms a i , a i + (also including a , a m − ), we have x ( E ( a i , a i + )) ≥ − ǫ η . • All atoms a i (including the root) have x ( δ ( a i )) ≤ + ǫ η . • x ( E ( a , { a , . . . , a m − } )) ≤ ǫ η . The interpretation of this theorem is that the structure of a polygon converges to the structureof an actual integral cycle as η →

0. The proof of the theorem follows from the lemmas in the rest of this subsection.
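As an informal illustration of this interpretation (not part of the proof), the sketch below checks the three structural conditions on a small fractional graph. It rests on assumptions: the function names `cut_weight` and `polygon_structure_holds` are hypothetical, the slack `eps` is passed in rather than derived from η, the atom bound is read as 2 + eps, and the third condition is read as bounding the weight between the root a_0 and the non-neighbouring atoms a_2, ..., a_{m-2}.

```python
def cut_weight(x, S, T):
    """Total fractional weight x(E(S, T)) of edges with one endpoint in S, the other in T."""
    return sum(w for (u, v), w in x.items()
               if (u in S and v in T) or (u in T and v in S))


def polygon_structure_holds(x, atoms, eps):
    """Check the three polygon-structure conditions for a given slack eps.

    x     : dict mapping an edge (u, v) to its fractional value
    atoms : list [a_0, ..., a_{m-1}] of disjoint vertex sets, a_0 being the root
    eps   : slack parameter (an input here; its dependence on eta is not modelled)
    """
    m = len(atoms)
    everything = set().union(*atoms)
    # Condition 1: cyclically adjacent atoms share weight at least 1 - eps.
    adjacent_ok = all(
        cut_weight(x, atoms[i], atoms[(i + 1) % m]) >= 1 - eps for i in range(m)
    )
    # Condition 2: every atom (root included) has boundary weight at most 2 + eps.
    degree_ok = all(cut_weight(x, a, everything - a) <= 2 + eps for a in atoms)
    # Condition 3 (assumed indexing): the root sends at most eps to a_2, ..., a_{m-2}.
    far = set().union(*atoms[2:m - 1])
    root_ok = cut_weight(x, atoms[0], far) <= eps
    return adjacent_ok and degree_ok and root_ok


# An integral 6-cycle with singleton atoms satisfies all conditions with eps = 0.
cycle = {(i, (i + 1) % 6): 1.0 for i in range(6)}
atoms = [{i} for i in range(6)]
print(polygon_structure_holds(cycle, atoms, 0.0))

# Adding a chord of weight 0.2 from the root violates the conditions for small eps.
with_chord = dict(cycle)
with_chord[(0, 3)] = 0.2
print(polygon_structure_holds(with_chord, atoms, 0.1))
```

The integral cycle is the limiting case: every adjacent pair carries weight exactly 1 and no weight leaves the cycle, so the conditions hold with zero slack, while any chord at the root must be absorbed by a larger eps.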

Definition 4.10 (Left and Right Hierarchies). For a polygon u corresponding to a connected component C of cuts crossed on one side, let L (the left hierarchy) be the set of all cuts A ∈ C that are not crossed on the left. We call any cut in L open on the left. Similarly, we let R be the set of cuts that are open on the right. So, L, R is a partitioning of all cuts in C.

For two distinct cuts A, B ∈ L we say A is an ancestor of B in the left polygon hierarchy if A ⊇ B. We say A is a strict ancestor of B if, in addition, ℓ(A) ≠ ℓ(B). We define the right hierarchy similarly: A is a strict ancestor of B if A ⊇ B and r(A) ≠ r(B).

We say B is a strict parent of A if among all strict ancestors of A in the (left or right) hierarchy, B is the one closest to A.

See Fig. 10 for examples of sets and their parent/ancestor relationships.

Fact 4.11.

If A, B are in the same hierarchy and they are not ancestors of each other, then A ∩ B = ∅.

Proof. If A ∩ B ≠ ∅ then they cross. So, they cannot be open on the same side.

This lemma immediately implies that the cuts in each of the left (and right) hierarchies form a laminar family.

Lemma 4.12.

For A, B ∈ R where B is a strict parent of A, there exists a cut C ∈ L that crosses both A, B. Similarly, if A, B ∈ L and B is a strict parent of A, there exists a cut C ∈ R that crosses A, B.

Figure 10: An example of a polygon with contracted atoms. In black are the cuts in the left polygon hierarchy, in red the cuts in the right polygon hierarchy. OPT edges around the cycle are shown in green. Here R is an ancestor of R, however it is not a strict ancestor of R since they have the same right endpoint. L is a strict ancestor and the strict parent of L. By Theorem 4.9, every edge in the bottom picture represents a set of LP edges of total fraction at least 1 − ǫ_η.

Proof.

Since we have a connected component of near min cuts, there exists a path of crossing cutsfrom A to B . Let P = ( A = C , C , . . . , C k = B ) be the shortest such path. We need to show that k = C crosses C and C is open on right, we have ℓ ( C ) < ℓ ( C ) < r ( C ) < r ( C ) .Let I be the closed interval [ ℓ ( C ) , r ( C )] . Note that C k = B has an endpoint that does not belongto I . Let C i be the ﬁrst cut in the path with an endpoint not in I (deﬁnitely i > C i − ⊆ I ; so, since C i − crosses C i , exactly one of the endpoints of C i is strictly inside I . Weconsider two cases: Case 1: r ( C i ) > r ( C ) . In this case, C i must be crossed on the left (by C i − ) and C i ∈ R and itdoes not cross C . So, C ( C i and ℓ ( C ) < ℓ ( C i ) ≤ ℓ ( C ) where the ﬁrst inequality uses that the left endpoint of C i is strictly inside I . Therefore, C crossesboth of C , C i , and C i is a strict ancestor of A = C . If C i = B we are done, otherwise, A ⊆ B ⊆ C i ,but since C crosses both A and C i , it also crosses B and we are done.35 ase 2: ℓ ( C i ) < ℓ ( C ) . In this case, C i must be crossed on the right (by C i − ) and C i ∈ L andit does not cross C . So, we must have r ( C ) ≤ r ( C i ) < r ( C ) ,where the second inequality uses that the right endpoint of C i is strictly inside I . But, this impliesthat C i also crosses C . So, we can obtain a shorter path by excluding all cuts C , . . . , C i − andthat is a contradiction. Lemma 4.13.

Let A, B ∈ R such that A ∩ B = ∅, i.e., they are not ancestors of each other. Then, they have a common ancestor, i.e., there exists a set C ∈ R such that A, B ⊆ C.

Proof.

WLOG assume r ( A ) ≤ ℓ ( B ) . Let C be the highest ancestor of A in the hierarchy, i.e., C has no ancestor. For the sake of contradiction suppose B ∩ C = ∅ (otherwise, C is an ancestorof B and we are done). So, r ( C ) ≤ ℓ ( B ) . Consider the path of crossing cuts from C to B , say C = C , . . . , C k = B .Let C i be the ﬁrst cut in this path such that r ( C i ) > r ( C ) . Note that such a cut always existsas r ( B ) > r ( C ) . Since C i − crosses C i and r ( C i − ) ≤ r ( C ) , C i − crosses C i on the left and C i is open on the right. We show that C i is an ancestor of C = C and we get a contradiction to C having no ancestors (in R ). If ℓ ( C ) < ℓ ( C i ) , then C i crosses C on the right and that is acontradiction. So, we must have C ⊆ C i , i.e., C i is an ancestor of C .It follows from the above lemma that each of the left and right hierarchies have a unique cutwith no ancestors. Lemma 4.14.

If A is a cut in R such that r(A) < m, then A has a strict ancestor. And, similarly, if A ∈ L satisfies ℓ(A) > 1, then it has a strict ancestor.

Proof. Fix a cut A ∈ R. If there is a cut B ∈ R such that r(B) > r(A), then either B is a strict ancestor of A, in which case we are done, or A ∩ B = ∅, but then by Lemma 4.13 A, B have a common ancestor C, and C must be a strict ancestor of A and we are done.

Now, suppose for any R ∈ R, r(R) ≤ r(A). So, there must be a cut B ∈ L such that r(B) > r(A) (otherwise we should have fewer than m atoms in our polygon). The cut B must be crossed on the right by a cut C ∈ R. But then, we must have r(C) > r(B) > r(A), which is a contradiction.

Corollary 4.15.

If A ∈ C has no strict ancestor, then r(A) = m if A ∈ R and ℓ(A) = 1 otherwise.

Lemma 4.16 (Polygons are Near Minimum Cuts). x(δ(a ∪ · · · ∪ a m)) ≤ + η.

Proof.

Let A ∈ L and B ∈ R be the unique cuts in the left/right hierarchy with no ancestors. Note that A and B are crossing (because there is a cut C that crosses A on the right, and B is an ancestor of C). Therefore, since A, B are both 2 + η near min cuts, by Lemma 2.5, A ∪ B is a 2 + η near min cut.

Lemma 4.17 (Root Neighbors). x(E(a, a)), x(E(a, a m −)) ≥ − η.

Proof. Here we prove x(E(a, a)) ≥ − η. One can prove x(E(a, a m −)) ≥ − η similarly. Let A ∈ L and B ∈ R be the unique cuts in the left/right hierarchy with no ancestors. First, observe that if ℓ(B) =

2, then since A , B are crossing, by Lemma 2.6 we have x ( E ( A r B , A ∪ B )) = x ( E ( a , a )) ≥ − η .as desired.By deﬁnition of atoms, there exists a cut C ∈ C such that either ℓ ( C ) = r ( C ) =

2; but if r ( C ) = ℓ ( C ) = C cannot be crossed, so this does not happen. So,we must have ℓ ( C ) =

2. If C ∈ R , then since C is a descendent of B , we must have ℓ ( B ) =

2, andwe are done by the previous paragraph.Otherwise, suppose C ∈ L . We claim that B crosses C . This is because, C is crossed onthe right by some cut B ′ and B is an ancestor of B ′ , so B ∩ C = ∅ and C B since ℓ ( B ) > B ∪ C is a 2 + η near min cut. Since A crosses B ∪ C , by Lemma 2.6 wehave x ( E ( A r ( B ∪ C ) , A ∪ B ∪ C )) = x ( E ( a , a )) ≥ − η as desired. Lemma 4.18.

For any pair of atoms a i , a i + where ≤ i ≤ m − we have x ( δ ( { a i , a i + } )) ≤ + η ,so x ( E ( a i , a i + )) ≥ − η .Proof. We prove the following claim: There exists j ≤ i such that x ( δ ( { a j , . . . , a i + } )) ≤ + η .Then, by a similar argument we can ﬁnd j ′ ≥ i + x ( δ ( { a i , . . . , a j ′ } )) ≤ + η . ByLemma 2.5 it follows that x ( δ ( { a i , a i + } )) ≤ + η . Since x ( δ ( a i )) , x ( δ ( a i + )) ≥

2, we have x ( δ ( { a i , a i + } )) + x ( E ( a i , a i + )) ≥ x ( δ ( { a i , a i + } )) we must have x ( E ( a i , a i + )) ≥ − η as desired.It remains to prove the claim. First, observe that there is a cut A separating a i + , a i + (Notethat if i + = m − a i + = a ); so, either ℓ ( A ) = i + r ( A ) = i +

2. If r ( A ) = i + A is the cut we are looking for and we are done. So, assume ℓ ( A ) = i + Case 1: A ∈ L . Let L ∈ L be the strict parent of A . If ℓ ( L ) ≤ i then we are done (since thereis a cut R ∈ R crossing A , L on the right so L r ( A ∪ R ) is the cut that we want. If ℓ ( L ) = i + L ′ be the strict parent of L ). Then, there is a cut R ∈ R crossing A , L and a cut R ′ crossing L , L ′ . First, since both R , R ′ cross L (on the right) they have a non-empty intersection, so one ofthem say R ′ is an ancestor of the other ( R ) and therefore R ′ must intersect A . On the other hand,since R ′ crosses L and ℓ ( L ) = i + ℓ ( R ′ ) ≥ i + = ℓ ( A ) . Since R ′ intersect A , either they cross,or A ⊆ R ′ , so we must have x ( δ ( A ∪ R )) ≤ + η . Finally, since R ′ crosses L ′ (on the right) wehave x ( δ ( L ′ r ( A ∪ R ))) ≤ + η and L ′ r ( A ∪ R ) is our desired set. Case 2: A ∈ R . We know that A is crossed on the left by, say, L ∈ L . If ℓ ( L ) ≤ i , we are done,since then L r A is the cut that we seek and we get x ( δ ( L r A )) ≤ + η .Suppose then that ℓ ( L ) = i +

1. Let L′ be the strict parent of L, which must have ℓ(L′) ≤ i. If L′ crosses A, then L′ ∖ A is the cut we seek and we get x(δ(L′ ∖ A)) ≤ + η.

Finally, if L′ doesn't cross A, i.e., r(A) ≤ r(L′), then consider the cut R ∈ R that crosses L and L′ on the right. Since r(L) < r(A), and A is not crossed on the right, it must be that ℓ(R) = i + L′ ∖ R is the cut we want, and we get x(δ(L′ ∖ R)) ≤ + η.

Lemma 4.19 (Atoms are Near Minimum Cuts). For any ≤ i ≤ m − , we have x(δ(a_i)) ≤ + η.

Proof.

By Lemma 4.18, x ( E ( { a i , a i + } )) ≤ + η (note that in the special case i = m − a i − , a i ). There must be a 2 η -near minimum cut C (w.r.t., x ) separating a i from a i + .Then either a i = C ∩ { a i , a i + } or a i = { a i , a i + } r C . In either case, we get x ( δ ( a i )) ≤ + η byLemma 2.5. Deﬁnition 4.20 ( A , B , C -Polygon Partition) . Let u be a polygon with atoms a , . . . , a m − with root a where a , a m − are the atoms left and right of the root. The A , B , C-polygon partition of u is a partition ofedges of δ ( u ) into sets A = E ( a , a ) and B = E ( a m − , a ) , C = δ ( u ) r A r B. Note that by Theorem 4.9, x ( A ) , x ( B ) ≥ − ǫ η and x ( C ) ≤ ǫ η where ǫ η = η is deﬁned inTheorem 4.9. Deﬁnition 4.21 (Leftmost and Rightmost cuts) . Let u be a polygon with atoms a , . . . , a m and arcslabelled

1, . . . , m corresponding to a connected component C of η-near minimum cuts (w.r.t. z). We call any cut C ∈ C with ℓ(C) = 1 a leftmost cut of u and any cut C ∈ C with r(C) = m a rightmost cut of u. We also call a_1 the leftmost atom of u (resp. a_{m−1} the rightmost atom). Observe that by Corollary 4.15, any cut that is not a leftmost or a rightmost cut has a strict ancestor.
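To make the interval bookkeeping concrete, here is a small sketch that represents each cut by its arc pair (ℓ, r), tests crossing via ℓ(A) < ℓ(B) < r(A) < r(B), splits a family into the left and right hierarchies, and lists strict ancestors. It is an illustration under assumptions: the helper names (`crosses`, `hierarchies`, `strict_ancestors`) are hypothetical, the three-cut family is a toy example chosen here rather than taken from the paper, and strictness is read as "the relevant endpoints differ", in line with the caption of Fig. 10.

```python
def crosses(a, b):
    """Cuts are (l, r) arc pairs; after ordering by l they cross iff l(A) < l(B) < r(A) < r(B)."""
    (la, ra), (lb, rb) = sorted([a, b])
    return la < lb < ra < rb


def crossed_on_left(a, cuts):
    # Some crossing cut starts to the left of a.
    return any(crosses(a, b) and b[0] < a[0] for b in cuts)


def crossed_on_right(a, cuts):
    # Some crossing cut starts strictly inside a and extends past its right endpoint.
    return any(crosses(a, b) and b[0] > a[0] for b in cuts)


def hierarchies(cuts):
    """Left hierarchy: cuts not crossed on the left. Right hierarchy: not crossed on the right."""
    left = [a for a in cuts if not crossed_on_left(a, cuts)]
    right = [a for a in cuts if not crossed_on_right(a, cuts)]
    return left, right


def strict_ancestors(a, family, side):
    """Supersets of a within one hierarchy whose left (resp. right) endpoint differs from a's."""
    k = 0 if side == "left" else 1
    return [b for b in family
            if b != a and b[0] <= a[0] and a[1] <= b[1] and b[k] != a[k]]


# Toy polygon with m = 6 arcs; every arc 1..6 is an endpoint of some cut.
m = 6
cuts = [(1, 4), (2, 6), (3, 5)]
left, right = hierarchies(cuts)

# Every cut that is neither leftmost (l = 1) nor rightmost (r = m) has a strict ancestor.
for cut in cuts:
    if cut[0] != 1 and cut[1] != m:
        family, side = (left, "left") if cut in left else (right, "right")
        print(cut, "has strict ancestors", strict_ancestors(cut, family, side))
```

In this toy family the cut (1, 4) is open on the left, while (2, 6) and (3, 5) are open on the right, and (3, 5) is the only cut that is neither leftmost nor rightmost.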

Definition 4.22 (Happy Polygon). Let u be a polygon with polygon partition A, B, C. For a spanning tree T, we say that u is happy if A_T and B_T odd, C_T = 0. We say that u is left-happy (respectively right-happy) if A_T odd, C_T = 0 (respectively B_T odd, C_T = 0).

Definition 4.23 (Relevant Cuts). Given a polygon u corresponding to a connected component C of cuts crossed on one side with atoms a_0, . . . , a_{m−1}, define a family of relevant cuts C′ = C ∪ {a_i : 1 ≤ i ≤ m − 1, z(δ(a_i)) ≤ 2 + η}.

Note that atoms of u are always ǫ_η/2-near minimum cuts w.r.t. z but not necessarily η-near minimum cuts. The following theorem is the main result of this section.

Theorem 4.24 (Happy Polygons and Cuts Crossed on One Side). Let G = (V, E, x) where x is an LP solution and z = (x + OPT)/2. For a connected component C of near minimum cuts of z, let u be the polygon with atoms a_0, a_1, . . . , a_{m−1} with polygon partition A, B, C. For µ an arbitrary distribution of spanning trees with marginals x, there is a random vector s* : E* → R_{≥0} (as a function of T ∼ µ) such that for any vector s : E → R where s_e ≥ −η x_e/8 for all e ∈ E the following holds:
• If u is happy then, for any cut S ∈ C′, if δ(S)_T is odd then we have s(δ(S)) + s*(δ(S)) ≥ 0.
• If u is left-happy, then for any S ∈ C′ that is not a rightmost cut or the rightmost atom, if δ(S)_T is odd, then we have s(δ(S)) + s*(δ(S)) ≥ 0. Similarly, if u is right-happy then for any cut S ∈ C′ that is not a leftmost cut or the leftmost atom, the latter inequality holds.
• E[s*_{e*}] ≤ η.

Before proving the above theorem, we study a special case.

Lemma 4.25 (Triangles as Degenerate Polygons). Let S = X ∪ Y where X, Y, S are ǫ_η-near min cuts (w.r.t. x) and each of these sets is a contiguous interval around the OPT cycle. Then, viewing X as a_1 and Y as a_2 (and a_0 as the complement of X ∪ Y), the above theorem holds viewing S as a degenerate polygon.

Proof.

In this case A = E ( a , a ) , B = E ( a , a ) , C = ∅ . For the OPT edge e ∗ between X , Y wedeﬁne I ( e ∗ ) to be the event that at least one of T ∩ E ( X ) , T ∩ E ( Y ) , T ∩ E ( S ) is not a tree.Whenever this happens we deﬁne s ∗ e ∗ = η /3.9. If S is left-happy we need to show when δ ( X ) T is odd, then s ( δ ( X )) + s ∗ ( δ ( X )) ≥

0. This is because when S is left-happy we have A T = C T = I ( e ∗ ) does not happen and we get δ ( X ) T = s ( δ ( X )) + s ∗ ( δ ( X )) ≥ s ( δ ( X )) ≥ − ( + η ) η /8 and s ∗ e ∗ = η /3.9. Finally, observe that byCorollary 2.24, P [ I ( e ∗ )] ≤ ǫ η , so E [ s ∗ e ∗ ] = ǫ η η /3.9 ≤ η . Lemma 4.26.

For every cut A ∈ C that is not a leftmost or a rightmost cut, P [ δ ( A ) T = ] ≥ − η .Proof. Assume A ∈ R ; the other case can be proven similarly. Let B be the strict parent of A . ByLemma 4.12 there is a cut C ∈ L which crosses A , B on their left. It follows by Lemma 2.5 that C r A , C ∩ A are 4 η near minimum cuts (w.r.t., x ). So, by Corollary 2.24, P [ E ( A ∩ C , C r A ) T = ] ≥ − η . On the other hand, B r ( A ∪ C ) is a 6 η near minimum cut and A r C , B r C are 4 η nearmin cuts (w.r.t., x ). So, by Corollary 2.24 P [ E ( A r C , B r ( A ∪ C )) T = ] ≥ − η .Finally, by Lemma 2.6, x ( E ( A ∩ C , C r A )) , x ( A r C , B r ( A ∪ C )) ≥ − η . Since A is a2 η near min cut (w.r.t., x ), all remaining edges have fractional value at most 8 η , so with prob-ability 1 − η , T does not choose any of them. Taking a union bound over all of these events, P [ δ ( A ) T = ] ≥ − η . Lemma 4.27.

For any atom a i ∈ C ′ that is not the leftmost or the rightmost atom we have P [ δ ( a i ) T = ] ≥ − η . Proof.

By Lemma 4.18, x ( δ ( { a i , a i + } )) ≤ + η , and by Lemma 4.19, x ( δ ( a i + )) ≤ + η (alsorecall by the assumption of lemma x ( δ ( a i )) ≤ + η , Therefore, by Corollary 2.24, P [ E ( a i , a i + ) T = ] , P [ E ( a i − , a i ) T = ] ≥ − η ,where the second inequality holds similarly. Also, by Lemma 4.18, x ( E ( a i − , a i )) , x ( E ( a i , a i + )) ≥ − η . Since x ( δ ( a i )) ≤ + η , x ( E ( a i , a i − ∪ a i ∪ a i + )) ≤ η . So, P [ T ∩ E ( a i , a i − ∪ a i ∪ a i + ) = ∅ ] ≥ − η .Finally, by the union bound all events occur with probability at least 1 − η .Let e ∗ , . . . , e ∗ m be the OPT edges mapped to the arcs 1, . . . , m of the component C respectively.39 emma 4.28. There is a mapping of cuts in C ′ to OPT edges e ∗ , . . . e ∗ m − such that each OPT edge hasat most 4 cuts mapped to it.Proof. Consider ﬁrst the set of cuts in C ′ R : = R ∪ { a i : 1 ≤ i ≤ m − z ( δ ( a i )) ≤ + η } . Observethat this is also a laminar family. We deﬁne a map from cuts in C ′ R to OPT edges such that everyOPT edge e ∗ , . . . , e ∗ m − gets at most 2 cuts mapped to it. A similar argument works for cuts in L .For any 2 ≤ i ≤ m −

1, we mapargmax A ∈R : ℓ ( A )= i | A | and argmax A ∈R : r ( A )= i | A | to e ∗ i . By construction, each OPT edge gets at most two cuts mapped to it. Furthermore, weclaim every cut A ∈ C ′ R gets mapped to at least one OPT edge. For the sake of contradiction let A ∈ C ′ R be a cut that is not mapped to any OPT edge. Note that A is not a or a m − and that if A ∈ R , ℓ ( A ) =

1. Furthermore, if A ∈ R and r ( A ) = m , then A is deﬁnitely the largest cut withleft endpoint ℓ ( A ) . So assume, 1 < ℓ ( A ) = r ( A ) < m . Let B = argmax B ∈R : ℓ ( B )= ℓ ( A ) | B | and let C = argmax B ∈R : r ( C )= r ( A ) | C | . Since A is not mapped to any OPT edge, we must have B , C = A .But that implies A ( B , C . And this means B , C cross; but this is a contradiction with R being alaminar family.If A is mapped to two OPT edges in the above construction, we choose one of them arbitrarily.Recall for a cut L ∈ L , L R is the near minimum cut crossing L on the right that minimizes theintersection (see Deﬁnition 4.3). Deﬁnition 4.29 (Happy Cut) . We say a leftmost cut L ∈ L is happy if | T ∩ E ( L R ∩ L , L R r L ) | = δ ( L , a ∪ L ∪ L R ) = ∅ . Also, we say the leftmost atom a is happy if | T ∩ E ( a , a ) | = E ( a , a ∪ · · · ∪ a m − ) = ∅ . Similarly, deﬁne rightmost cuts in u or the rightmost atom in u to be happy.

Note that, by definition, if a leftmost cut $L$ is happy and $u$ is left-happy, then $L$ is even, i.e., $\delta(L)_T = 2$. Similarly, the leftmost atom $a_1$ is even if it is happy and $u$ is left-happy.

Lemma 4.30.

For every leftmost or rightmost cut A in u that is η -near min cut w.r.t. z, P [ A happy ] ≥ − η , and for the leftmost atom a (resp. rightmost atom a m − ), if it is an η -near min cut then P [ a happy ] ≥ − η (resp. P [ a m − happy ] ≥ − η ).Proof. Recall that if A is a η -near min cut w.r.t. z then it is a 2 η -near min cut w.r.t. x . We provethis for the leftmost cuts and the leftmost atom; the other case can be proven similarly. Considera cut L ∈ L . Since by Lemma 2.5 L R ∩ L , L R r L are 4 η near min cuts (w.r.t., x ) and L R is a2 η near min cut, by Corollary 2.24, P [ E ( L R ∩ L , L R r L ) T = ] ≥ − η . On the other hand, byLemma 2.6, x ( E ( L R ∩ L , L R r L )) ≥ − η , and by Lemma 4.17, x ( E ( L , a )) ≥ − η . It followsthat x ( L , a ∪ L ∪ L R ) ≤ + η − ( − η ) − ( − η ) ≤ η .Therefore, by the union bound, P [ L happy ] ≥ − η a , and suppose it is an η near min cut. By Lemma 4.18, x ( δ ( { a , a } )) ≤ + η and by Lemma 4.19, x ( δ ( a )) ≤ + η . Therefore, by Corollary 2.24, P [ E ( a , a ) T = ] ≥ − η . On the other hand, by Lemma 4.18, x ( E ( a , a )) ≥ − η and by Lemma 4.17, x ( E ( a , a )) ≥ − η . Therefore, x ( E ( a , a ∪ · · · ∪ a m − )) ≤ + η − ( − η ) − ( − η )) ≤ η .Therefore, by the union bound, P [ a happy ] ≥ − η as desired. Proof of Theorem 4.24.

Consider an OPT edge e ∗ i for 1 < i < m . We deﬁne an increase event I ( e ∗ i ) of the ﬁrst type as follows: This event occurs if at least one of the possible 4 cuts mapped to e ∗ i in Lemma 4.28 is odd or if a leftmost cut L ∈ L assigned to u with r ( L ) = i is not happy or arightmost cut R ∈ R assigned to u with l ( R ) = i is not happy (note in the special case that i = L will be the leftmost atom if it is a near min cut, and similarly when i = m − R will be therightmost atom if it is a near min cut).Whenever I ( e ∗ i ) occurs, we deﬁne s ∗ e ∗ i = η /3.9, otherwise we let it be 0. First, observe thatfor any cut S ∈ C ′ that is not a leftmost or a rightmost cut/atom, if δ ( S ) T is odd, then if e ∗ i is theOPT edge that S is mapped to, it satisﬁes s ∗ e ∗ i = η /3.9, so s ( δ ( S )) + s ∗ ( δ ( S )) ≥ − x ( δ ( S )) η /8 + s ∗ ( e ∗ i ) ≥ − ( + η ) η /8 + η /3.9 ≥ η < S ∈ L is the leftmost cut assigned to u and δ ( S ) T is odd, and r ( S ) = i . If u is not left-happy there is nothing to prove. If u is left-happy, then we must have S isnot happy, so I ( e ∗ i ) occurs, so similar to the above inequality s ( δ ( S )) + s ∗ ( δ ( S )) ≥

0. The sameholds for rightmost cuts and the leftmost/rightmost atoms if assigned to u .It remains to upper bound E [ e ∗ i ] for 1 < i < m . By Lemma 4.28 at most 4 cuts that arenot leftmost or rightmost are mapped to e ∗ i and at most two are atoms. By Lemma 4.26 andLemma 4.27 and a union bound, all of these 4 cuts are even with probability at least 1 − η .Also, by Lemma 4.30, at most one leftmost cut and rightmost cut are mapped to e ∗ i , and s ∗ e ∗ i is setto η /3.9 if either is not happy.. Both of these cut will be happy with probability at least 1 − η .In the special case that i = i = m − − η . Putting everything together, P [ I ( e ∗ i )] ≤ η . So, E [ s ∗ ( e ∗ i )] ≤ η /3.9 η = η asdesired. Deﬁnition 4.31 (Hierarchy) . For an LP solution x with support E = E ∪ { e } and x be x restrictedto E, a hierarchy H is a laminar family of ǫ η -near min cuts of G = ( V , E , x ) with root V r { u , v } ,where every cut S ∈ H is either a polygon cut (including triangles) or a degree cut and u , v / ∈ S. Forany (non-root) cut S ∈ H , deﬁne the parent of S, p ( S ) , to be the smallest cut S ′ ∈ H such that S ( S ′ .For a cut S ∈ H , let A ( S ) : = { u ∈ H : p ( u ) = S } . If S is a polygon cut, then we can order cuts in A ( S ) , u , . . . , u m − such that • A = E ( S , u ) , B = E ( u m − , S ) satisfy x ( A ) , x ( B ) ≥ − ǫ η . • For any ≤ i < m − , x ( E ( u i , u i + ) ≥ − ǫ η . C = ∪ m − i = E ( u i , S ) satisﬁes x ( C ) ≤ ǫ η .We call the sets A , B , C the polygon partition of edges in δ ( S ) . We say S is left-happy when A T is oddand C T = and right happy when B T is odd and C T = and happy when A T , B T are odd and C T = .We abuse notation, and for an (LP) edge e = ( u , v ) that is not a neighbor of u , v , let p ( e ) denote thesmallest cut S ′ ∈ H such that u , v ∈ S ′ . We say edge e is a bottom edge if p ( e ) is a polygon cut and wesay it is a top edge if p ( e ) is a degree cut. Note that when S is a polygon cut u , . . . 
, u m − will be the atoms a , . . . , a m − that we deﬁnedin the previous section, but a reader should understand this deﬁnition independent of the poly-gon deﬁnition that we discussed before; in particular, the reader no longer needs to worry aboutthe details of speciﬁc cuts C that make up a polygon. Also, note that since V r { u , v } is the rootof the hierarchy, for any edge e ∈ E that is not incident to u or v , p ( e ) is well-deﬁned; so allthose edges are either bottom or top, and edges which are incident to u or v are neither bottomedges nor top edges.The following observation is immediate from the above deﬁnition. Observation 4.32.

For any polygon cut S ∈ H , and any cut S ′ ∈ H which is a descendant of S letD = δ ( S ′ ) ∩ δ ( S ) . If D = ∅ , then exactly one of the following is true: D ⊆ A or D ⊆ B or D ⊆ C. Theorem 4.33 (Main Payment Theorem) . For an LP solution x and x be x restricted to E and ahierarchy H for some ǫ η ≤ − , the maximum entropy distribution µ with marginals x satisﬁes thefollowing:i) There is a set of good edges E g ⊆ E r δ ( { u , v } ) such that any bottom edge e is in E g and for any(non-root) S ∈ H such that p ( S ) is a degree cut, we have x ( E g ∩ δ ( S )) ≥ .ii) There is a random vector s : E g → R (as a function of T ∼ µ ) such that for all e, s e ≥ − x e η /8 (with probability 1), andiii) If a polygon cut u with polygon partition A , B , C is not left happy, then for any set F ⊆ E with p ( e ) = u for all e ∈ F and x ( F ) ≥ − ǫ η /2 , we haves ( A ) + s ( F ) + s − ( C ) ≥ where s − ( C ) = ∑ e ∈ C min { s e , 0 } . A similar inequality holds if u is not right happy.iv) For every cut S ∈ H such that p ( S ) is not a polygon cut, if δ ( S ) T is odd, then s ( δ ( S )) ≥ .v) For a good edge e ∈ E g , E [ s e ] ≤ − ǫ P η x e (see Eq. (31) for deﬁnition of ǫ P ) . The above theorem is the main part of the paper in which we use that µ is a SR distribution.See Section 7 for the proof. We use this theorem to construct a random vector s such that essen-tially for all cuts S ∈ H in the hierarchy z /2 + s is feasible; furthermore for a large fraction of“good” edges we have that E [ s e ] is negative and bounded away from 0.As we will see in the following subsection, using part (iii) of the theorem we will be able toshow that every leftmost and rightmost cut of any polygon is satisﬁed. in the sense of the number of vertices that it contains

42n the rest of this section we use the above theorem to prove Theorem 4.6. We start byexplaining how to construct H . Given the vector z = ( x + OPT ) /2 run the following procedureon the OPT cycle with the family of η -near minimum cuts of z that are crossed on at most oneside:For every connected component C of η near minimum cuts (w.r.t., z ) crossed on at most oneside, if |C| = C to the hierarchy. Otherwise, C corresponds to apolygon u with atoms a , . . . , a m − (for some m > a , . . . , a m − and ∪ m − i = a i to H . Notethat since z ( { u , v } ) =

2, the root of the hierarchy is always V r { u , v } .Now, we name every cut in the hierarchy. For a cut S if there is a connected component ofat least two cuts with union equal to S , then call S a polygon cut with the A , B , C partitioning asdeﬁned in Deﬁnition 5.18. If S is a cut with exactly two children X , Y in the hierarchy, then alsocall S a polygon cut , A = E ( X , X r Y ) , B = E ( Y , Y r X ) and C = ∅ . Otherwise, call S a degreecut. Fact 4.34.

The above procedure produces a valid hierarchy.

Proof.

First observe that whenever |C| = C is a 2 η near min cut (w.r.t, x ) whichis not crossed. For a polygon cut S in the hierarchy, by Lemma 4.16, the set S is a ǫ η near min cutw.r.t., x . If S is an atom of a polygon, then by Lemma 4.19 S is a ǫ η near min cut.Now, it remains to show that for a polygon cut S we have a valid ordering u , . . . , u k of cutsin A ( S ) . If S is a non-triangle polygon cut, the u , . . . , u k are exactly atoms of the polygon of S and x ( A ) , x ( B ) ≥ − ǫ η and x ( C ) ≤ ǫ η and x ( E ( u i , u i + )) ≥ − ǫ η follow by Theorem 4.9. Fora triangle cut S = X ∪ Y because S , X , Y are ǫ η -near min cuts (by the previous paragraph), weget x ( A ) , x ( B ) ≥ − ǫ η as desired, by Lemma 2.7. Finally, since x ( δ ( X )) , x ( δ ( Y )) ≥ x ( E ( X , Y )) ≥ − ǫ η .The following observation is immediate: Observation 4.35.

Each cut S ∈ H corresponds to a contiguous interval around OPT cycle. For apolygon u (or a triangle) with atoms a , . . . , a m − for m ≥ we say an OPT edge e ∗ is interior to u ife ∗ ∈ E ∗ ( a i , a i + ) for some ≤ i ≤ m − . Any OPT edge e ∗ is interior to at most one polygon. Theorem 4.6.

Let x be a feasible solution of LP (1) with support E = E ∪ { e } and x be x restricted to E.Let µ be the max entropy distribution with marginals x. For η ≤ − , there is a set E g ⊂ E r δ ( { u , v } ) of good edges and two functions s : E → R and s ∗ : E ∗ → R ≥ (as functions of T ∼ µ ) such that(i) For each edge e ∈ E g , s e ≥ − x e η /8 and for any e ∈ E r E g , s e = .(ii) For each η -near-min-cut S w.r.t. z, if δ ( S ) T is odd, then s ( δ ( S )) + s ∗ ( δ ( S )) ≥ (iii) We have E [ s e ] ≤ − ǫ P η x e for all edges e ∈ E g and E [ s ∗ e ∗ ] ≤ η for all OPT edges e ∗ ∈ E ∗ . for ǫ P deﬁned in (31) .(iv) For every η -near minimum cut S of z crossed on (at most) one side such that S = V r { u , v } ,x ( δ ( S ) ∩ E g ) ≥ Notice that an atom may already correspond to a connected component, in such a case we do not add it in thisstep. Think about such set as a degenerate polygon with atoms a : = X , a : = Y , a : = X ∪ Y . So, for the rest of thissection we call them triangles and in later section we just think of them as polygon cuts. roof. Let E g , s be as deﬁned in Theorem 4.33, and let s e = ∞ . Also, let s ∗ be the sum of the s ∗ vectors from Theorem 4.2 and Theorem 4.24. (i) follows (ii) of Theorem 4.33. E [ s ∗ e ∗ ] ≤ η follows from Theorem 4.2 and Theorem 4.24 and the fact that every OPT edge is interior to atmost one polygon. Also, E [ s e ] ≤ − ǫ P x e for edges e ∈ E g follows from (v) of Theorem 4.33.Now, we verify (iv): For any (non-root) cut S ∈ H such that p ( S ) is not a polygon cut x ( δ ( S ) ∩ E g ) ≥ η -near minimum cuts are sets S which are either atoms or near minimum cuts in the component C corresponding to a polygon u .So, by Lemma 2.7, x ( δ ( S ) ∩ δ ( u )) ≤ + ǫ η . By (i) of Theorem 4.33 all edges in δ ( S ) r δ ( u ) are in E g . Therefore, x ( δ ( S ) ∩ E g )) ≥ − ǫ η ≥ Type 1 : Near minimum cuts S such that e ∈ δ ( S ) . 
Then, since s e = ∞ , s ( δ ( S )) + s ∗ ( δ ( S )) ≥ Type 2 : Near minimum cuts S ∈ H where p ( S ) is not a polygon cut. By (iv) of Theorem 4.33and that s ∗ ≥ Type 3:

Near minimum cuts $S$ crossed on both sides. Then, the inequality follows by Theorem 4.2 and the fact that $s_e \ge -\eta/8$ for all $e \in E$.

Type 4: Near minimum cuts $S$ that are crossed on one side (and not in $\mathcal{H}$), or $S \in \mathcal{H}$ and $p(S)$ is a (non-triangle) polygon cut. In this case $S$ must be an atom or an $\eta$-near minimum cut (w.r.t. $z$) in some polygon $u \in \mathcal{H}$. If $S$ is not a leftmost cut/atom or a rightmost cut/atom, then the inequality follows by Theorem 4.24. Otherwise, say $S$ is a leftmost cut. If $u$ is left-happy, then by Theorem 4.24 the inequality is satisfied. Otherwise, for $F = \delta(S) \setminus \delta(u)$, by Lemma 2.7 we have $x(F) \ge 1 - \epsilon_\eta/2$. Therefore, by (iii) of Theorem 4.33,
$$s(\delta(S)) + s^*(\delta(S)) \ge s(A) + s(F) + s^-(C) \ge 0.$$
Here we use that, since $S$ is a leftmost cut, we always have $A \subseteq \delta(S)$. But $C$ may have an unpredictable intersection with $\delta(S)$; in particular, in the worst case only edges of $C$ with negative slack belong to $\delta(S)$. A similar argument holds when $S$ is the leftmost atom or a rightmost cut/atom.

Type 5:

Near min cut $S$ is the leftmost atom or the rightmost atom of a triangle $u$. This is similar to the previous case, except that we use Lemma 4.25 to argue that the inequality is satisfied when $u$ is left-happy.

In the rest of the paper we will not work with $z$, OPT edges, or the notion of polygons. So, practically, by Definition 4.31, from now on a reader can just think of every polygon as a triangle.

In the rest of the paper we adopt the following notation. We abuse notation and call any $u \in \mathcal{A}(S)$ an atom of $S$.

Definition 4.36 (Edge Bundles, Top Edges, and Bottom Edges). For every degree cut $S$ and every pair of atoms $u, v \in \mathcal{A}(S)$, we define a top edge bundle $\mathbf{f} = (u, v)$ such that
$$\mathbf{f} = \{ e = (u', v') \in E : p(e) = S,\ u' \in u,\ v' \in v \}.$$
Note that in the above definition, $u', v'$ are actual vertices of $G$. For every polygon cut $S$, we define the bottom edge bundle $\mathbf{f} = \{ e : p(e) = S \}$.
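To make the bundle definition concrete, the following is a minimal sketch (not part of the paper's algorithm) that groups LP edges into top and bottom bundles for a toy laminar family. The names `parent_cut`, `atoms`, `edge_bundles`, and the toy hierarchy are hypothetical. The sketch assumes cuts are given as frozensets of vertices, that the family contains singletons so that the atoms of every cut partition it, and that each cut is already labelled as a polygon cut or a degree cut.

```python
from collections import defaultdict

def parent_cut(edge, cuts):
    """p(e): the smallest cut of the laminar family containing both endpoints."""
    a, b = edge
    containing = [S for S in cuts if a in S and b in S]
    return min(containing, key=len) if containing else None

def atoms(S, cuts):
    """A(S): the cuts whose parent (smallest strict superset in the family) is S."""
    result = []
    for C in cuts:
        supersets = [D for D in cuts if C < D]
        if supersets and min(supersets, key=len) == S:
            result.append(C)
    return result

def edge_bundles(edges, cuts, kind):
    """Group LP edges into top bundles (degree cuts) and bottom bundles (polygon cuts)."""
    top, bottom = defaultdict(list), defaultdict(list)
    for e in edges:
        S = parent_cut(e, cuts)
        if S is None:
            continue  # edge leaves the root cut: neither a top nor a bottom edge
        if kind[S] == "polygon":
            bottom[S].append(e)
        else:
            children = atoms(S, cuts)
            u = next(A for A in children if e[0] in A)
            v = next(A for A in children if e[1] in A)
            top[(S, frozenset({u, v}))].append(e)
    return dict(top), dict(bottom)

# Toy instance (an assumption for illustration): vertices 0 and 5 play the
# role of the two vertices excluded from the root cut.
ROOT = frozenset({1, 2, 3, 4})
TRI = frozenset({1, 2})
SINGLES = [frozenset({i}) for i in (1, 2, 3, 4)]
CUTS = [ROOT, TRI] + SINGLES
KIND = {ROOT: "degree", TRI: "polygon", **{s: "degree" for s in SINGLES}}
EDGES = [(0, 1), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]

TOP, BOTTOM = edge_bundles(EDGES, CUTS, KIND)
```

In this toy instance the edges (1, 3) and (2, 3) land in the same top bundle, because both run between the atoms {1, 2} and {3} of the root; the edge (1, 2) forms the bottom bundle of the polygon cut {1, 2}.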

We will always use bold letters to distinguish top edge bundles from actual LP edges. Also, we abuse notation and write $x_{\mathbf{e}} := \sum_{f \in \mathbf{e}} x_f$ to denote the total fractional value of all edges in this bundle.

In the rest of the paper, unless otherwise specified, we work with edge bundles and sometimes we just call them edges.

For any $u \in \mathcal{H}$ with $p(u) = S$ we write
$$\delta^{\uparrow}(u) := \delta(u) \cap \delta(S), \qquad \delta^{\rightarrow}(u) := \delta(u) \setminus \delta(S),$$
$$E^{\rightarrow}(S) := \{ e = (u_i, u_j) : u_i, u_j \in \mathcal{A}(S),\ u_i \ne u_j \}.$$
Also, for a set of edges $A \subseteq \delta(u)$ we write $A^{\rightarrow}, A^{\uparrow}$ to denote $A \cap \delta^{\rightarrow}(u)$ and $A \cap \delta^{\uparrow}(u)$ respectively. Note that $E^{\rightarrow}(S) \subseteq E(S)$ includes only edges between atoms of $S$ and not all edges between vertices in $S$.

The following is the main result of this subsection.

Proposition 5.1. Given a SR distribution $\mu : 2^{[n]} \to \mathbb{R}_+$, let $A_1, \dots, A_m$ be random variables corresponding to the number of elements sampled from $m$ disjoint sets, and let integers $n_1, \dots, n_m \ge 0$ be such that for any $S \subseteq [m]$,
$$\mathbb{P}\Big[\sum_{i \in S} A_i \ge \sum_{i \in S} n_i\Big] \ge \epsilon, \qquad \mathbb{P}\Big[\sum_{i \in S} A_i \le \sum_{i \in S} n_i\Big] \ge \epsilon.$$
Then
$$\mathbb{P}[\forall i : A_i = n_i] \ge f(\epsilon)\, \mathbb{P}[A_1 + \dots + A_m = n_1 + \dots + n_m],$$
where $f(\epsilon) \ge \epsilon^{2^m} \prod_{k=2}^{m} \frac{1}{\max\{n_k,\ n_1 + \dots + n_{k-1}\} + 1}$.

We remark that in applications of the above statement, it is enough to know that for any set $S \subseteq [m]$, $\sum_{i \in S} n_i - 1 < \mathbb{E}\big[\sum_{i \in S} A_i\big] < \sum_{i \in S} n_i + 1$, because then by Lemma 2.21 we can prove a lower bound on the probability that $\sum_{i \in S} A_i = \sum_{i \in S} n_i$.

We also remark that the above lower bound on $f(\epsilon)$ is not tight; in particular, we expect that the dependency on $m$ should only be exponential (not doubly exponential). We leave it as an open problem to find a tight lower bound on $f(\epsilon)$.

Proof.

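Let $\mathcal{E}$ be the event $A_1 + \dots + A_m = n_1 + \dots + n_m$. Then
$$\mathbb{P}[\forall\, 1 \le i \le m : A_i = n_i] = \mathbb{P}[\mathcal{E}]\; \mathbb{P}[A_m = n_m \mid \mathcal{E}]\; \mathbb{P}[A_{m-1} = n_{m-1} \mid A_m = n_m, \mathcal{E}] \cdots \mathbb{P}[A_1 = n_1 \mid A_2 = n_2, \dots, A_m = n_m, \mathcal{E}].$$
The last factor equals 1, since $A_2 = n_2, \dots, A_m = n_m$ and $\mathcal{E}$ together determine $A_1 = n_1$. So it suffices to show that for all $2 \le k \le m$,
$$\mathbb{P}[A_k = n_k \mid A_{k+1} = n_{k+1}, \dots, A_m = n_m, \mathcal{E}] \ge \frac{\epsilon^{2^{m-k+1}}}{\max\{n_k,\ n_1 + \dots + n_{k-1}\} + 1}. \tag{14}$$
By the claim below,
$$\mathbb{P}[A_k \ge n_k \mid A_{k+1} = n_{k+1}, \dots, A_m = n_m, \mathcal{E}] \ge \epsilon^{2^{m-k+1}}, \qquad \mathbb{P}[A_k \le n_k \mid A_{k+1} = n_{k+1}, \dots, A_m = n_m, \mathcal{E}] \ge \epsilon^{2^{m-k+1}}.$$
So, (14) simply follows by Lemma 5.3. Now we prove this claim.

Claim 5.2.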

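Let $[k] := \{1, \dots, k\}$. For any $1 \le k \le m$ and any set $S \subsetneq [k]$,
$$\mathbb{P}\Big[\sum_{i \in S} A_i \ge \sum_{i \in S} n_i \,\Big|\, A_{k+1} = n_{k+1}, \dots, A_m = n_m, \mathcal{E}\Big] \ge \epsilon^{2^{m-k+1}},$$
$$\mathbb{P}\Big[\sum_{i \in S} A_i \le \sum_{i \in S} n_i \,\Big|\, A_{k+1} = n_{k+1}, \dots, A_m = n_m, \mathcal{E}\Big] \ge \epsilon^{2^{m-k+1}}.$$

Proof.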

We prove the claim by induction. First, notice that for $k = m$ the statement holds just by the lemma's assumption and Lemma 5.4. Now, suppose the statement holds for $k + 1$, and fix a set $S \subsetneq [k]$. Let $\overline{S} = [k] \setminus S$. Define $A = \sum_{i \in S} A_i$ and $B = \sum_{i \in \overline{S}} A_i$, and similarly define $n_A, n_B$. By the induction hypothesis,
$$\epsilon^{2^{m-k}} \le \mathbb{P}[A \le n_A \mid A_{k+2} = n_{k+2}, \dots, A_m = n_m, \mathcal{E}].$$
The same statement holds for the events $A \ge n_A$, $B \le n_B$, $B \ge n_B$, $A + B \ge n_A + n_B$, and $A + B \le n_A + n_B$. Let $\mathcal{E}_{k+1}$ be the event $A_{k+2} = n_{k+2}, \dots, A_m = n_m, \mathcal{E}$. Then, by Lemma 5.3, $\mathbb{P}[A + B = n_A + n_B \mid \mathcal{E}_{k+1}] > 0$. Therefore, by Lemma 5.4,
$$\mathbb{P}[A \ge n_A \mid A + B = n_A + n_B, \mathcal{E}_{k+1}],\ \ \mathbb{P}[A \le n_A \mid A + B = n_A + n_B, \mathcal{E}_{k+1}] \ \ge\ \big(\epsilon^{2^{m-k}}\big)^2 = \epsilon^{2^{m-k+1}}$$
as desired. Note that here we are using that $A + B = n_A + n_B$ together with $\mathcal{E}_{k+1}$ implies that $A_{k+1} = n_{k+1}$.

This finishes the proof of Proposition 5.1.

Lemma 5.3.

Let $\mu : 2^{[n]} \to \mathbb{R}_{\ge 0}$ be a $d$-homogeneous SR distribution. If for an integer $0 \le k \le d$, $\mathbb{P}_{S \sim \mu}[|S| \ge k] \ge \epsilon$ and $\mathbb{P}_{\mu}[|S| \le k] \ge \epsilon$, then
$$\mathbb{P}[|S| = k] \ge \min\Big\{\frac{\epsilon}{k+1},\ \frac{\epsilon}{d-k+1}\Big\},$$
$$\mathbb{P}[|S| = k] \ge \min\Big\{p_m,\ \epsilon\Big(1 - \Big(\frac{\epsilon}{p_m}\Big)^{1/\max\{k,\, d-k\}}\Big)\Big\},$$
where $p_m \le \max_{0 \le i \le d} \mathbb{P}[|S| = i]$ is a lower bound on the mode of $|S|$.

Proof. Since $\mu$ is SR, the sequence $s_0, s_1, \dots, s_d$ where $s_i = \mathbb{P}[|S| = i]$ is log-concave and unimodal. So, either the mode is in the interval $[0, k]$ or in $[k, d]$. We assume the former and prove the lemma; the latter can be proven similarly. First, observe that since $s_k \ge s_{k+1} \ge \dots \ge s_d$, we get $s_k \ge \epsilon/(d-k+1)$. In the rest of the proof, we show that $s_k \ge \epsilon\big(1 - (\epsilon/p_m)^{1/k}\big)$.

Suppose $s_i$ is the mode. It follows that there is $i \le j \le k-1$ such that $\frac{s_j}{s_{j+1}} \ge \big(\frac{s_i}{s_k}\big)^{1/(k-i)}$. So, by Eq. (6),
$$\epsilon \le s_k + \dots + s_d \le \frac{s_k}{1 - \big(\frac{s_k}{s_i}\big)^{1/(k-i)}}.$$
If $s_k \ge p_m$ or $s_k \ge \epsilon$ then we are done. Otherwise,
$$s_k \ge \epsilon\Big(1 - (s_k/p_m)^{1/(k-i)}\Big) \ge \epsilon\Big(1 - (\epsilon/p_m)^{1/k}\Big),$$
where we used $s_i \ge p_m$ and $s_k \le \epsilon$.

Lemma 5.4.

Given a strongly Rayleigh distribution $\mu : 2^{[n]} \to \mathbb{R}_{\ge 0}$, let $A, B$ be two (nonnegative) random variables corresponding to the number of elements sampled from two disjoint sets such that $\mathbb{P}[A + B = n] > 0$, where $n = n_A + n_B$. Then,
$$\mathbb{P}[A \ge n_A \mid A + B = n] = \mathbb{P}[B \le n_B \mid A + B = n] \ge \mathbb{P}[A \ge n_A]\, \mathbb{P}[B \le n_B], \tag{15}$$
$$\mathbb{P}[A \le n_A \mid A + B = n] = \mathbb{P}[B \ge n_B \mid A + B = n] \ge \mathbb{P}[A \le n_A]\, \mathbb{P}[B \ge n_B]. \tag{16}$$

Proof. We prove the second statement; the first one can be proven similarly. First, notice
$$\begin{aligned}
\mathbb{P}[A \le n_A, A + B \ge n] + \mathbb{P}[B \ge n_B, A + B < n]
&= \mathbb{P}[B \ge n_B, A \le n_A, A + B \ge n] + \mathbb{P}[A \le n_A, B \ge n_B, A + B < n] \\
&= \mathbb{P}[B \ge n_B, A \le n_A] \ \ge\ \mathbb{P}[B \ge n_B]\, \mathbb{P}[A \le n_A] =: \alpha,
\end{aligned}$$
where the last inequality follows by negative association. Say $q = \mathbb{P}[A + B \ge n]$. From above, either $\mathbb{P}[A \le n_A, A + B \ge n] \ge \alpha q$ or $\mathbb{P}[B \ge n_B, A + B < n] \ge \alpha(1 - q)$. In the former case, we get $\mathbb{P}[A \le n_A \mid A + B \ge n] \ge \alpha$, and in the latter we get $\mathbb{P}[B \ge n_B \mid A + B < n] \ge \alpha$. Now the lemma follows by the stochastic dominance property:
$$\mathbb{P}[A \le n_A \mid A + B = n] \ge \mathbb{P}[A \le n_A \mid A + B \ge n],$$
$$\mathbb{P}[B \ge n_B \mid A + B = n] \ge \mathbb{P}[B \ge n_B \mid A + B < n].$$
Note that in the special case that $A + B < n$ never happens, the lemma holds trivially.

Combining the previous two lemmas, we get

Corollary 5.5.

Let $\mu : 2^{[n]} \to \mathbb{R}_{\ge 0}$ be a SR distribution. Let $A, B$ be two random variables corresponding to the number of elements sampled from two disjoint sets of elements. If $\mathbb{P}[A \ge n_A], \mathbb{P}[B \ge n_B] \ge \epsilon_1$ and $\mathbb{P}[A \le n_A], \mathbb{P}[B \le n_B] \ge \epsilon_2$, then
$$\mathbb{P}[A = n_A \mid A + B = n_A + n_B] \ge \epsilon \min\Big\{\frac{1}{n_A + 1},\ \frac{1}{n_B + 1}\Big\},$$
$$\mathbb{P}[A = n_A \mid A + B = n_A + n_B] \ge \min\Big\{p_m,\ \epsilon\big(1 - (\epsilon/p_m)^{1/\max\{n_A, n_B\}}\big)\Big\},$$
where $\epsilon = \epsilon_1 \epsilon_2$ and $p_m \le \max_{0 \le k \le n_A + n_B} \mathbb{P}[A = k \mid A + B = n_A + n_B]$ is a lower bound on the mode of $A$. For $n_A = n_B = 1$, if $\mathbb{P}[A = 1 \mid A + B = 2] \le \epsilon$, then since the distribution of $A$ is unimodal, we get $p_m \ge \frac{1 - \epsilon}{2}$. Therefore, if $\epsilon \le 1/3$,
$$\mathbb{P}[A = 1 \mid A + B = 2] \ge \max\Big\{\epsilon/2,\ \epsilon\Big(1 - \frac{2\epsilon}{1 - \epsilon}\Big)\Big\}.$$

The following proposition is the main statement of this subsection.
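As a numerical sanity check on the two ingredients of Corollary 5.5, the sketch below works with a product measure, i.e. independent Bernoulli trials, which is a standard example of a strongly Rayleigh distribution. It verifies that the count distribution is log-concave (the property used in the proof of Lemma 5.3) and compares both sides of inequality (15) from Lemma 5.4. The function names and the specific probabilities are hypothetical; this illustrates the statements on one instance and is not a proof.

```python
def count_pmf(probs):
    """PMF of the number of successes among independent Bernoulli(p) trials."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)      # trial fails: count stays at k
            nxt[k + 1] += mass * p        # trial succeeds: count moves to k + 1
        pmf = nxt
    return pmf

def is_log_concave(s, tol=1e-12):
    """Check s_i^2 >= s_{i-1} * s_{i+1} for every interior index i."""
    return all(s[i] ** 2 + tol >= s[i - 1] * s[i + 1] for i in range(1, len(s) - 1))

def inequality_15_sides(probs_a, probs_b, n_a, n_b):
    """Return (lhs, rhs) of (15): P[A >= n_A | A + B = n] and P[A >= n_A] * P[B <= n_B]."""
    a, b = count_pmf(probs_a), count_pmf(probs_b)
    n = n_a + n_b
    # A and B are independent here because they count disjoint independent trials.
    joint = {i: a[i] * b[n - i] for i in range(len(a)) if 0 <= n - i < len(b)}
    total = sum(joint.values())           # P[A + B = n], assumed positive
    lhs = sum(mass for i, mass in joint.items() if i >= n_a) / total
    rhs = sum(a[n_a:]) * sum(b[: n_b + 1])
    return lhs, rhs

# Illustrative marginals (assumptions, not taken from the paper).
PMF = count_pmf([0.2, 0.5, 0.9, 0.4])
LHS, RHS = inequality_15_sides([0.3, 0.7], [0.5, 0.6, 0.1], n_a=1, n_b=1)
```

For the marginals chosen here, the left-hand side of (15) is roughly 0.82 and the right-hand side roughly 0.51, so the conditional probability is comfortably above the product bound.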

Proposition 5.6.

Let µ : 2 E → R ≥ be a homogeneous SR distribution. For any ǫ < ζ < and disjoint sets A , B ⊆ E such that − ǫ ≤ E [ A T ] , E [ B T ] ≤ + ǫ (where T ∼ µ ) there is an event E A , B ( T ) such that P [ E A , B ( T )] ≥ ζ ( − ζ /3 − ǫ ) and it satisﬁes the following three properties.i) P [ A T = B T = |E A , B ( T )] = ,ii) ∑ e ∈ A | P [ e ] − P [ e |E A , B ( T )] | ≤ ζ , andiii) ∑ e ∈ B | P [ e ] − P [ e |E A , B ( T )] | ≤ ζ . In other words, under event E A , B which has a constant probability, A T = B T = A , B are preserved up to total variation distance ζ . We also remark thatabove statement holds for a much larger value of ζ at the expense of a smaller lower bound on P [ E A , B ( T )] .Before, proving the above statement we prove the following lemma. Lemma 5.7.

Let µ : 2 E → R ≥ be a homogeneous SR distribution. Let A , B ⊆ E be two disjoint sets suchthat − ǫ ≤ E [ A T ] , E [ B T ] ≤ + ǫ (where T ∼ µ ), A ′ ⊂ A and B ′ ⊆ B and E [ A ′ T ∪ B ′ T ] ≥ + α forsome α > ǫ . If α < , we have P (cid:2) A ′ T = B ′ T = A T = B T = (cid:3) ≥ α . Proof.

First, condition on ( A r A ′ ) T = ( B r B ′ ) T =

0. This happens with probability at least α − ǫ ≥ α because E [ A T ] + E [ B T ] ≤ + ǫ and E [ A ′ T ] + E [ B ′ T ] ≥ + α . Call this measure ν . It follows by negative association that E ν (cid:2) A ′ T (cid:3) , E ν (cid:2) B ′ T (cid:3) ∈ [ α − ǫ , 2 + ǫ − α ] . (17) • Case 1: E ν [ A ′ T + B ′ T ] > . Since E ν [ A ′ T + B ′ T ] ≤ + ǫ , by Lemma 2.21, P ν [ A ′ T + B ′ T = ] ≥ P ν (cid:2) A ′ T ≥ (cid:3) , P ν (cid:2) B ′ T ≥ (cid:3) ≥ − e − ( α − ǫ ) ≥ α (Lemma 2.22, α < P ν (cid:2) A ′ T ≤ (cid:3) , P ν (cid:2) B ′ T ≤ (cid:3) ≥ α /2 − ǫ (Markov’s Inequality)Therefore, by Corollary 5.5 and using α ≤ P [ A ′ T = | A ′ T + B ′ T = ] ≥ α . Itfollows that P (cid:2) A T = B T = A ′ T = B ′ T = (cid:3) ≥ ( α ) P ν (cid:2) A ′ T = B ′ T = (cid:3) ≥ ( α ) ( α ) ≥ α .48 Case 2: E [ A ′ T + B ′ T ] ≤ E ν [ A ′ T + B ′ T ] ≥ + α , by Lemma 2.21, P [ A ′ T + B ′ T = ] ≥ α e − α ≥ α . But now E [ A ′ T ] , E [ B ′ T ] ≤ P ν (cid:2) A ′ T ≤ (cid:3) , P ν (cid:2) B ′ T ≤ (cid:3) ≥ P ν [ A ′ T ≥ ] , P ν [ B ′ T ≥ ] ≥ ( α − ǫ ) e − α + ǫ ≥ α . Itfollows by Corollary 5.5 that P [ A ′ T = | A ′ T + B ′ T = ] ≥ α . Therefore, P (cid:2) A T = B T = A ′ T = B ′ T = (cid:3) ≥ ( α ) P ν (cid:2) A ′ T = B ′ T = (cid:3) ≥ ( α )( α )( α ) ≥ α as desired.It is worth noting that α dependency is necessary in the above example. For an explicitStrongly Rayleigh distribution consider the following product distribution: ( α x + ( − α ) y )( α y + ( − α ) z )( α z + ( − α ) x ) ,and let A = { x , x } , B ′ = B = { y , y } , and A ′ = { x } . Observe that P (cid:2) A = B = A ′ = B ′ = (cid:3) = P [ x = y = z = ] = α . Proof of Proposition 5.6.

To prove the lemma, we construct an instance of the max-ﬂow, min-cutproblem. Consider the following graph with vertex set { s , A , B , t } . For any e ∈ A , f ∈ B connect e to f with a directed edge of capacity y e , f = P [ e , f ∈ T | A T = B T = ] . For any e ∈ E , let x e : = P [ e ∈ T ] . Connect s to e ∈ A with an arc of capacity β x e and similarly connect f ∈ B to t with arc of capacity β x f , where β is a parameter that we choose later. We claim that the min-cutof this graph is at least β ( − ǫ − ζ /3 ) . Assuming this, we can prove the lemma as follows: let z be the maximum ﬂow, where z e , f is the ﬂow on the edge from e to f . We deﬁne the event E A , B ( T ) = E ( T ) to be the union of events z e , f . More precisely, conditioned on A T = B T = e , f ∈ T | A T = B T = e ∈ A , f ∈ B , so we know that wehave a speciﬁc e , f in the tree T with probability y e , f . And, of course, ∑ e ∈ A , f ∈ B y e , f =

1. So, for e ∈ A , f ∈ B we include a z e , f measure of trees, T , such that A T = B T = e , f ∈ T . First, observethat P [ E ] = ∑ e ∈ A , f ∈ B z e , f P [ A T = B T = ] ≥ β ( − ζ /3 − ǫ ) P [ A T = B T = ] . (18)Part (i) of the proposition follows from the deﬁnition of E . Now, we check part (ii): Say z = ∑ e ∈ A , f ∈ B z e , f , and the ﬂow into e is z e . Then, ∑ e ∈ A | x e − P [ e ∈ T |E ] | = ∑ e ∈ A (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) x e − ∑ f z e , f z (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) = ∑ e ∈ A | x e − z e z | Note that both x and z e / z deﬁne a probability distribution on edges in A ; so the RHS is just thetotal variation distance between these two distributions. We can write ∑ e ∈ X | x e − P [ e ∈ T |E ] | = ∑ e ∈ X : z e / z > x e (cid:16) z e z − x e (cid:17) ≤ ∑ e ∈ X : z e / z > x e (cid:18) β x e β ( − ζ /3 − ǫ ) − x e (cid:19) ≤ · ∑ e x e ζ /3 + ǫ − ζ /3 − ǫ ≤ ( + ǫ )( ζ /3 + ǫ ) − ζ /3 − ǫ ≤ ζ .49he ﬁrst inequality uses that the max-ﬂow is at least β ( − ζ /3 ) and that the incoming ﬂow of e is at most β x e , and the last inequality follows by ζ < ǫ < ζ /300. (iii) can be checkedsimilarly.It remains to lower-bound the max-ﬂow or equivalently the min-cut. Consider an s , t -cut S , S ,i.e., assume s ∈ S and t / ∈ S . Deﬁne S A = A ∩ S and S B = B ∩ S . We writecap ( S , S ) = β x ( S A ) + β x ( S B ) + ∑ e ∈ S A , f ∈ S B y e , f = β x ( S A ∪ S B ) + P (cid:2) S A = S B = | A = B = (cid:3) If x ( S B ) ≥ x ( S A ) − ζ /3, thencap ( S , S ) ≥ β x ( S A ∪ S B ) ≥ β ( x ( S A ∪ S A ) − ζ /3 ) ≥ β ( − ǫ − ζ /3 ) ,and we are done. Otherwise, say x ( S B ) + γ = x ( S A ) , for some γ > ζ /3. 
So, x ( S B ) + x ( S A ) = x ( S B ) + x ( S B ) + γ ≥ − ǫ + γ So, by Lemma 5.7 with ( α = γ − ǫ ) P (cid:2) S A = S B = | A = B = (cid:3) ≥ P (cid:2) S A = S B = A = B = (cid:3) P [ A = B = ] ≥ ( γ − ǫ ) P [ A = B = ] .It follows that cap ( S , S ) ≥ β x ( S A ∪ S B ) + ( γ − ǫ ) P [ A = B = ] ≥ β ( x ( S A ∪ S A ) − γ ) + ( γ − ǫ ) P [ A = B = ] ≥ β ( − ǫ − γ ) + ( γ − ǫ ) P [ A = B = ] To prove the lemma we just need to choose β such that RHS is at least β ( − ǫ − ζ /3 ) . Orequivalently, 0.1 ( γ − ǫ ) P [ A = B = ] ≥ β ( γ − ζ /3 ) .In other words, it is enough to choose β ≤ ( γ − ǫ ) P [ A = B = ]( γ − ζ /3 ) . Since γ ≥ ζ /3 and ζ /3 > ǫ , wecertainly have γ − ǫ ≥ ζ /6. Therefore, we can set β = ζ /6 P [ A = B = ] . Finally, this plus (18) gives P [ E ] ≥ ( − ζ /3 − ǫ ) β P [ A = B = ] = ( ζ /6 )( − ζ /3 − ǫ ) ≥ ζ ( − ζ /3 − ǫ ) as desired. Deﬁnition 5.8 (Max-ﬂow Event) . For a polygon cut S ∈ H with polygon partition A , B , C, let ν bethe max-entropy distribution conditioned on S is a tree and C T = . By Lemma 2.23, we can write ν : ν S × ν G / S , where ν S is supported on trees in E ( S ) and ν G / S on trees in E ( G / S ) . For a sample ( T S , T G / S ) ∼ ν S × ν G / S , we say E S occurs if E A , B ( T G / S ) occurs, where E A , B ( . ) is the event deﬁned inProposition 5.6 for sets A , B and ζ = ǫ M : = and ǫ = ǫ η . orollary 5.9. For a polygon cut S ∈ H with polygon partition A , B , C, we have,i) P [ E S ] ≥ ǫ M . ii) For any set F ⊆ δ ( S ) conditioned on E S marginals of edges in F are preserved up to ǫ M + ǫ η intotal variation distance.iii) For any F ⊆ E ( S ) ∪ δ ( S ) where either F ∩ A = ∅ or F ∩ B = ∅ , there is some q ∈ x ( F ) ± ( ǫ M + ǫ η ) such that the law of F T |E S is the same as a BS ( q ) .Proof. Condition S to be a tree and C T = ν be the resulting measure. It follows that P [ E S ] = P ν [ E S ] P [ C T = S tree ] ≥ ǫ M ( − ǫ M /3 − ǫ ) P [ C T = S tree ] ≥ ǫ M .which proves (i).Now, we prove (ii). 
By Proposition 5.6, the marginals of edges in δ ( S ) are preserved up to atotal variation distance of ǫ M , so E ν [( F ∩ δ ( S )) T |E A , B ( T G / S )] = E ν [( F ∩ δ ( S )) T ] ± ǫ M .Since x ( C ) ≤ ǫ η and x ( δ ( S )) ≤ + ǫ η , by negative association, x ( F ∩ δ ( S )) − ǫ η /2 ≤ E ν [( F ∩ δ ( S )) T ] ≤ x ( F ∩ δ ( S )) + ǫ η .This proves (ii). Also observe that since conditioned on E S , we choose at most one edge of F ∩ δ ( S ) , ( F ∩ δ ( S )) T is a BS ( q G / S ) for some q G / S = x ( F ) ± ( ǫ M + ǫ η ) .On the other hand, observe that conditioned on E S , S is a tree, so x ( F ∩ E ( S )) ≤ E [( F ∩ E ( S )) T |E S ] ≤ x ( F ∩ E ( S )) + ǫ η /2.Since the distribution of ( F ∩ E ( S )) T under ν |E S is SR, there is a random variable BS ( q S ) =( F ∩ E ( S )) T where x ( F ∩ E ( S )) ≤ q S ≤ x ( F ∩ E ( S )) + ǫ η /2.Finally, F T |E S is exactly BS ( q S ) + BS ( q G / S ) = BS ( q ) for q = x ( F ) ± ( ǫ M + ǫ η ) . Corollary 5.10.

For u ∈ H and a polygon cut S ∈ H that is an ancestor of u, P [ δ ( u ) T odd |E S ] ≤ Proof.

First, notice by Observation 4.32, δ ( u ) ∩ δ ( S ) is either a subset of A , B , or C . Therefore,we can write δ ( u ) T |E S as a BS ( q ) for q ∈ ± [ ] (where we use that ǫ M + ǫ η < δ ( u ) T = + BS ( q − ) . Therefore,by Corollary 2.17, P [ δ ( u ) T odd |E S ] = P [ BS ( q − ) even ] ≤ ( + e − ( q − ) ) ≤ ( + e − ) ≤ Corollary 5.11.

For a polygon cut u ∈ H and a polygon cut S ∈ H that is an ancestor of u, P [ u not left happy |E S ] ≤ and the same follows for right happy. roof. Let A , B , C be the polygon partition of u . Recall that for u to be left-happy, we need C T = A T odd. Similar to the previous statement, we can write A T |E S as a BS ( q A ) for q A ∈ ± [ ] (where we used that ǫ M = ǫ η ≤ ǫ M /300). Therefore, by Corollary 2.17, P [ A T even |E S ] ≤ ( + e − q A ) ≤ ( + e − ) ≤ E [ C T |E S ] ≤ x ( C T ) + ǫ M + ǫ η ≤ P [ u not left happy | E S ] ≤ + ≤ Deﬁnition 5.12 (Half Edges) . We say an edge bundle e = ( u , v ) in a degree cut S ∈ H , i.e., p ( e ) = S,is a half edge if | x e − | ≤ ǫ . Deﬁnition 5.13 (Good Edges) . We say a top edge bundle e = ( u , v ) in a degree cut S ∈ H is (2-2)good, if one of the following holds:1. e is not a half edge or2. e is a half edge and P [ δ ( u ) T = δ ( v ) T = | u , v trees ] ≥ ǫ . We say a top edge e is bad otherwise. We say every bottom edge bundle is good (but generally do not referto bottom edges as good or bad). We say any edge e that is a neighbor of u or v is bad. In the next subsection we will see that for any top edge bundle e = ( u , v ) which is not a halfedge, P [( δ ( u )) T = ( δ ( v )) T = | u , v trees ] = Ω ( ) . The following theorem is the main result ofthis subsection: Theorem 5.14.

For ǫ ≤ , ǫ η ≤ ǫ , a top edge bundle e = ( u , v ) is bad only if the followingthree conditions hold simultaneously: • e is a half edge, • x ( δ ↑ ( u )) , x ( δ ↑ ( v )) ≤ + ǫ , • Every other half edge bundle incident to u or v is (2-2) good.
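The classification behind this theorem can be written as a short predicate. The sketch below assumes the thresholds as they can be read from the surrounding text: a half edge has $x_{\mathbf{e}}$ within $\epsilon_{1/2}$ of $1/2$, and a half edge is (2-2) good when $\mathbb{P}[\delta(u)_T = \delta(v)_T = 2 \mid u, v \text{ trees}] \ge 3\epsilon_{1/2}$. The function names, the argument `p_22` (that conditional probability, supplied by the caller), and the numeric values are hypothetical.

```python
def is_half_edge(x_e, eps_half):
    """True if the bundle's fractional value is within eps_half of 1/2."""
    return abs(x_e - 0.5) <= eps_half

def is_good_top_edge(x_e, p_22, eps_half):
    """A top edge bundle is good if it is not a half edge, or if it is a
    half edge whose (2-2) probability p_22 is at least 3 * eps_half."""
    return (not is_half_edge(x_e, eps_half)) or p_22 >= 3 * eps_half

# Illustrative values only; eps_half = 0.01 is not the paper's constant.
EXAMPLES = [
    is_good_top_edge(0.7, 0.0, 0.01),     # not a half edge, hence good
    is_good_top_edge(0.505, 0.02, 0.01),  # half edge with p_22 below 0.03
    is_good_top_edge(0.505, 0.05, 0.01),  # half edge with p_22 above 0.03
]
```

Under this reading, only half edges can be bad, which matches the first condition of the theorem.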

The proof of this theorem follows from Lemma 5.16 and Lemma 5.17 below.In this subsection, we use repeatedly that for any atom u in a degree cut S , x ( δ ( u )) ≤ + ǫ η .We also repeatedly use that for a half edge bundle e = ( u , v ) in a degree cut, conditioned on u , v trees, e is in or out with probability at least 1/2 − ǫ − ǫ η > Lemma 5.15.

Let e = (u, v) be a good half edge bundle in a degree cut S ∈ H. Let A = δ(u) − e and B = δ(v) − e. If ε ≤ and ε_η < ε/100, then P[A_T + B_T ≤ 2 | u, v trees], P[A_T + B_T ≥ 4 | u, v trees] ≥ ε

Proof. Throughout the proof all probabilistic statements are with respect to the measure µ conditioned on u, v trees. Let p_{≤2} = P[A_T + B_T ≤ 2] and similarly define p_{≥4}. Observe that whenever δ(u)_T = δ(v)_T = 2, we must have A_T + B_T ≠ 3: since u and v are trees, at most one edge of the bundle e lies in T, so A_T + B_T is 2 if e contributes an edge to T and 4 otherwise. Since e is 2-2 good, this event happens with probability at least 3ε, i.e.,
p_{≤2} + p_{≥4} ≥ 3ε. (19)
By Lemma 2.21, using the fact that p =

0, we get p = ≥ p ≤ ≥ ǫ . We have3 + ǫ ≥ E [ A T + B T ] ≥ p ≥ + p = + ( − p ≥ − p ≤ ) = + p ≥ − p = − p = .Again, we are using p =

0. By log-concavity p = ≥ p = p = , so since p = ≥ p = ≤ p = ≤ p ≤ . Therefore, p ≥ − ǫ ≤ p = + p = = p ≤ + p = ≤ p ≤ ( + p ≤ ) .Finally, since ǫ < p ≥ into Eq. (19) we get p ≤ ≥ ǫ .Now, we show p ≥ ≥ ǫ /2. Assume p ≥ < ǫ /2 (otherwise we are done). Since p = ≥ γ ≤ ( ǫ /2 ) / ( ) = ǫ E [ A T + B T | A T + B T ≥ ] · p ≥ ≤ p ≥ − ǫ ( + ǫ ) Therefore,3 − ǫ − ǫ η ≤ E [ A T + B T ] ≤ p ≤ + p ≥ − ǫ ( + ǫ ) + ( − p ≤ − p ≥ ) So, 1.01 p ≥ ≥ p ≤ − ǫ where we used ǫ ≤ ǫ η < ǫ /100. Now, p ≥ ≥ ǫ follows by Eq. (19). Su v e W δ ↑ ( u ) Figure 11: Setting of Lemma 5.16
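As a sanity check on the parity step in the proof of Lemma 5.15, the following minimal sketch enumerates which values of A_T + B_T are compatible with δ(u)_T = δ(v)_T = 2. It rests on an assumption that we state explicitly: conditioned on u and v being trees, the bundle e contributes either 0 or 1 edges to T. The function name and its parameters are hypothetical and only serve the illustration.

```python
from itertools import product


def reachable_sums(target_degree: int = 2, max_count: int = 4) -> set:
    """Return all values of A_T + B_T compatible with
    delta(u)_T = delta(v)_T = target_degree.

    Assumption (ours): since u and v are trees, the bundle e contributes
    e_T in {0, 1} edges to T, so delta(u)_T = A_T + e_T and
    delta(v)_T = B_T + e_T.
    """
    sums = set()
    counts = range(max_count + 1)
    for e_t, a_t, b_t in product((0, 1), counts, counts):
        if a_t + e_t == target_degree and b_t + e_t == target_degree:
            sums.add(a_t + b_t)
    return sums


if __name__ == "__main__":
    # The value 3 never appears among the reachable sums.
    print(sorted(reachable_sums()))
```

Under this assumption the 2-2 happy event is contained in the event that A_T + B_T differs from 3, which is the form in which Eq. (19) uses it.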

Lemma 5.16.

Let e = ( u , v ) be a half edge bundle in a degree cut S ∈ H , and suppose x ( δ ↑ ( u )) ≥ + k ǫ . If k ≥ and ǫ ≤ , then, e is 2-2 good.Proof. First, condition u , v , S to be trees. Let W = S r { u } . Since S is a near mincut, x ( δ ( W )) = x ( δ ( S )) + x ( δ ( u )) − x ( δ ↑ ( u )) ≤ ( + ǫ η ) − ( + k ǫ ) = − k ǫ + ǫ η P [ W is tree ] ≥ + k ǫ − ǫ η − ǫ η . Note that the extra − ǫ η comes fromthe fact that conditioning u be a tree can decrease marginals of edges in E ( W ) by at most ǫ η .Let ν be the measure in which we also condition on W to be a tree. Note that ν is a stronglyRayleigh distribution on the set of edges in E ( W ) ∪ E ( u , W ) ∪ E ( G / S ) ; this is because ν is aproduct of 3 SR distributions each supported on one of the aforementioned sets.Let X = δ ↑ ( u ) T and Y = δ ( v ) T −

1. Observe that, under ν , X = Y = δ ( u ) T = δ ( v ) T = Y ≥ v is connected to the rest of the graph. So, we justto lower P ν [ X = Y = ] . First, notice E ν [ X ] ∈ [ + k ǫ − ǫ η , 1 + ǫ η ] E ν [ Y ] ∈ [ + k ǫ − ǫ η , 1.5 − k ǫ + ǫ η ] (20)Note that using Proposition 5.1, we can immediately argue that P ν [ X = Y = ] ≥ Ω ( ǫ ) . Wedo the following more reﬁned analysis to make sure that this probability is at least 3 ǫ (for ǫ ≤ k ≥ Case 1: P ν [ X + Y = ] ≥ . By Lemma 2.22, P ν [ X ≥ ] P ν [ Y ≥ ] ≥ − e − ≥ P ν [ X ≤ ] , P ν [ Y ≤ ] ≥ P ν [ X = | X + Y = ] ≥ ( − ) = P ν [ X = Y = ] ≥ ( )( ) ≥ S and W being trees, we get P [ X = Y = ] ≥ ( )( ) = ≥ ǫ since ǫ ≤ Case 2: P ν [ X + Y = ] < . We know that E ν [ X + Y ] ≤ P [ X + Y = ] ≥ E ν [ X + Y ] < P [ X + Y = ] ≥ P ν [ X + Y = ] < γ = i = k = P [ X + Y > ] < E ν [ X + Y ] < E ν [ X ] , E ν [ Y ] ≤ P ν [ X ≥ ] , P ν [ Y ≥ ] ≥ − e − ≥ P ν [ X ≥ | X + Y = ] ≥ P ν [ X ≥ | X + Y ≤ ] ≥ P ν [ X ≥ X + Y ≤ ] ≥ P ν [ X ≥ ] − P ν [ X + Y > ] ≥ P ν [ X ≥ ] − ≥ P [ X ≤ | X + Y = ] = P [ Y ≥ | X + Y = ] ≥ P [ Y ≥ ] − ≥ X conditioned on X + Y = p and p , we can minimize p ( − p ) +( − p ) p subject to 1 − p p ≥ − ( − p )( − p ) ≥ P [ X = | X + Y = ] ≥ E ν [ X + Y ] ≥ + ( k − ) ǫ , by Lemma 2.21 we canwrite P ν [ X + Y = ] ≥ ( k − ) ǫ e − ( k − ) ǫ ≥ ( k − ) ǫ .Finally, P [ δ ( u ) T = δ ( v ) T = ] ≥ P [ W tree ] P ν [ X = | X + Y = ] P ν [ X + Y = ] ≥ · ( k − ) ǫ To get the RHS to be at least 3 ǫ it sufﬁces that k ≥ U vV wW e f

Figure 12: Setting of Lemma 5.17
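The first step in the proof of Lemma 5.16 bounds x(δ(W)) for W = S ∖ {u} through a cut identity. Writing δ↑(u) = δ(u) ∩ δ(S), the identity reads x(δ(W)) = x(δ(S)) + x(δ(u)) − 2x(δ↑(u)); the factor 2 is our reconstruction from the definitions, since the edges of δ↑(u) are counted once in δ(S) and once in δ(u) but do not belong to δ(W). The sketch below checks this numerically on a small random weighted graph. It treats the atom u as a single contracted vertex, and all function names are hypothetical.

```python
import itertools
import random


def cut_weight(x, side):
    """Total weight of edges with exactly one endpoint in `side`."""
    return sum(w for (a, b), w in x.items() if (a in side) != (b in side))


def up_weight(x, u, S):
    """Weight of edges joining u to a vertex outside S, i.e. x(delta^up(u))."""
    return sum(
        w
        for (a, b), w in x.items()
        if (a == u and b not in S) or (b == u and a not in S)
    )


def check_identity(n=7, seed=0):
    """Return both sides of x(d(W)) = x(d(S)) + x(d(u)) - 2 x(d^up(u))."""
    rng = random.Random(seed)
    # Random nonnegative weights on the complete graph on n vertices.
    x = {e: rng.random() for e in itertools.combinations(range(n), 2)}
    S = {0, 1, 2, 3}
    u = 0
    W = S - {u}
    lhs = cut_weight(x, W)
    rhs = cut_weight(x, S) + cut_weight(x, {u}) - 2 * up_weight(x, u, S)
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = check_identity()
    print(abs(lhs - rhs) < 1e-9)
```

The identity holds for any weights, so the check does not depend on x being an LP solution.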

Lemma 5.17.

Let e = ( u , v ) , f = ( v , w ) be two half edge bundles in a degree cut S ∈ H . If ǫ < ,then one of e or f is good.Proof. We use the following notation V = δ ( v ) − e − f , U = δ ( u ) − e , W = δ ( w ) − f . For a set A ofedges and an edge bundle e we write A + e = A ∪ { e } . Furthermore, for a measure ν we write ν − e to denote ν conditioned on e / ∈ T .Condition u , v , w to be trees. This occurs with probability at least 1 − ǫ η . Let ν be thismeasure. By Lemma 2.27, without loss of generality, we can assume E ν [ W T | e / ∈ T ] ≤ E ν [ W T ] + E ν [ V T | e / ∈ T ] ≥ E ν [ V T ] + e is 2-2 good. First, E ν − e [( V + f ) T ] ∈ [ − ǫ , 2 + ǫ η ] , E ν − e [ U T ] ∈ [ − ǫ , 2 + ǫ η ] , E ν − e [( V + f ) T + U T ] ∈ [ − ǫ , 3.5 + ǫ ] .Therefore, by Lemma 2.21, P ν − e [( V + f ) T + U T = ] ≥ U T ≥ ( V + f ) T ≥ ν − e and apply this and the remaining calculations to U T − ( V + f ) T −

1. In addition, we have P ν − e [ U T ≤ ] , P ν − e [( V + f ) T ≤ ] ≥ P ν − e [ U T ≥ ] , P ν − e [( V + f ) T ≥ ] ≥ ǫ = p m = P ν − e [ U T = | U T + ( V + f ) T = ] ≥ P [ δ ( u ) T = δ ( v ) T = ] ≥ P [ u , v , w trees, e / ∈ T ] P ν − e [ U T = ( V + f ) T = ] ≥ ( )( )( ) ≥ e is 2-2 good) since 0.0018 ≥ ǫ for ǫ ≤ E ν [ V T | e / ∈ T ] ≤ E ν [ V T ] + f is 2/2 good. We have, E ν + f [( V + e ) T ] , E ν + f [ W T ] ∈ [ − ǫ , 1.5 + ǫ ] P ν + f [( V + e ) T ≤ ] , P ν + f [ W T ≤ ] ≥ P ν + f [( V + e ) T ≥ ] , P ν + f [ W T ≥ ] ≥ ǫ = p m = P ν + f [ W T = | ( V + e ) T + W T = ] ≥ P ν + f [( V + e ) T + W T = ] ≥ P ν + f [ e / ∈ T ] P ν + f − e [( V + e ) T + W T = ] ≥ ( )( ) ≥ P ν + f − e [( V + e ) T + W T = ] ≥ E ν + f − e [( V + e ) T + W T ] = E ν + f − e [ V T + W T ] ≤ E ν − e [ V T + W T ] ≤ E ν [ W T ] + + E ν [ V T ] + ≤ ( V + e ) T + W T is always at least 1, so by Theorem 2.15, in the worst case, P ν − e + f [( V + e ) T + W T = ] is the probability that the sum of two Bernoullis with success probability 1.94/2 is 1, which is0.0582.Therefore, similar to the previous case, P [ δ ( u ) T = δ ( v ) T = ] ≥ P [ u , v , w trees, f ∈ T ] P ν + f [( V + e ) T = W T = ] P ν + f [ W T = | ( V + e ) T = W T = ] ≥ ( )( )( ) ≥ ǫ for ǫ ≤ Deﬁnition 5.18 ( A , B , C -Degree Partitioning) . For u ∈ H , we deﬁne a partitioning of edges in δ ( u ) :Let a , b ( u be minimal cuts in the hierarchy, i.e., a , b ∈ H , such that a = b and x ( δ ( a ) ∩ δ ( u )) , x ( δ ( b ) ∩ δ ( u )) ≥ − ǫ . Note that since the hierarchy is laminar, a , b cannot cross. Let A = δ ( a ) ∩ δ ( u ) , B = δ ( b ) ∩ δ ( u ) , C = δ ( u ) r A r B.If there is no cut a ( u (in the hierarchy) such that x ( δ ( a ) ∩ δ ( u )) ≥ − ǫ , we just let A , B bearbitrary set of edges in δ ( u ) which x ( A ) , x ( B ) ≥ − ǫ .If there is just one minimal cut a ( u (in the hierarchy) with x ( δ ( a ) ∩ δ ( u )) ≥ − ǫ , i.e., b doesnot exist in the above deﬁnition, then we deﬁne A = δ ( a ) ∩ δ ( u ) . 
Let a ′ ∈ H be the unique child of usuch that a ⊆ a ′ , i.e., a is equal to a ′ or a descendant of a ′ . Then we deﬁne C = δ ( a ′ ) ∩ δ ( u ) r δ ( a ) andB = ( δ ( u ) r A ) r C. Note that in this case since x ( δ ↑ ( a ′ )) ≤ + ǫ η , we have x ( B ) ≥ − ǫ η ≥ − ǫ . See Fig. 6 for an example. The following inequalities on A , B , C degree partitioning will beused in this section: 1 − ǫ ≤ x ( A ) , x ( B ) ≤ + ǫ η , x ( C ) ≤ ǫ + ǫ η . (22)For an edge bundle e = ( u , v ) and degree partitioning A , B , C of δ ( u ) we write e ( A ) = e ∩ A .Note that e ( A ) is not really an edge bundle.In this section we will deﬁne a constant p > Deﬁnition 5.19 (2-1-1 Happy/Good) . Let e = ( u , v ) be a top edge bundle. Let A , B , C ⊆ δ ( u ) be aDegree Partitioning of edges δ ( u ) as deﬁned in Deﬁnition 5.18. We say that e is 2-1-1 happy with respectto u if the event A T = B T = C T = δ ( v ) T = and u and v are both trees ccurs.We say e is u if P [ e is 2-1-1 happy wrt u ] ≥ p . Deﬁnition 5.20 (2-2-2 Happy/Good) . Let e = ( u , v ) , f = ( v , w ) be top half-edge bundles (with p ( e ) = p ( f ) ). We say e , f are 2-2-2 happy (with respect to v) if δ ( u ) T = δ ( v ) T = δ ( w ) T = and u , v , ware all trees.We say e , f are 2-2-2 good with respect to v if P [ e , f ] ≥ p. We will use the following notation: For a set of edges D , and an edge bundle e , let e ( D ) : = e ∩ D .The following theorem is the main result of this section. Theorem 5.21.

Let v, S ∈ H where p(v) = S, and let A, B, C be the degree partitioning of δ(v). For p ≥ ε, with ε ≤ , ε ≤ ε/12 and ε_η ≤ ε, at least one of the following is true:
i) δ→(v) has at least − ε fraction of bad edges,
ii) δ→(v) has at least − ε − ε_η fraction of 2/1/1 good edges with respect to v,
iii) there are two (top) half edge bundles e, f ∈ δ→(v) such that x_{e(B)} ≤ ε, x_{f(A)} ≤ ε, and e, f are 2/2/2 good (with respect to v).

We will prove this theorem after proving several intermediate lemmas (whose proofs can be found in Appendix A).

Lemma 5.22.

Let e = ( u , v ) be a top edge bundle such that x e ≤ − ǫ . If ǫ ≤ ǫ ≤ then, e is 2/1/1 happy with probability at least ǫ . Lemma 5.23.

Let e = ( u , v ) be a top edge bundle such that x e ≥ + ǫ . If ǫ ≤ ǫ ≤ ,then, e is 2/1/1 happy with respect to u with probability at least ǫ . Lemma 5.24.

For a good half top edge bundle e = (u, v), let A, B, C be the degree partitioning of δ(u), and let V = δ(v) − e (see Fig. 14). If x_{e(B)} ≤ ε and P[(A − e)_T + V_T ≤ ] ≥ ε then e is 2-1-1 good, P [ e ] ≥ ε

Lemma 5.25.

Let e = (v, u) and f = (v, w) be good half top edge bundles and let A, B, C be the degree partitioning of δ(v) such that x_{e(B)}, x_{f(B)} ≤ ε. Then, one of e, f is 2-1-1 happy with probability at least ε.

Lemma 5.26.

Let e = (u, v) be a good half edge bundle and let A, B, C be the degree partitioning of δ(u) (see Fig. 15). If ε ≤ ε ≤ and x_{e(A)}, x_{e(B)} ≥ ε, then P [ e ] ≥ ε.

Lemma 5.27. Let e = (u, v), f = (v, w) be two good top half edge bundles and let A, B, C be the degree partitioning of δ(v) such that x_{e(B)}, x_{f(A)} ≤ ε. If e, f are not 2-1-1 good with respect to v, and ε ≤ ε ≤ , then e, f are 2-2-2 happy with probability at least .

Proof of Theorem 5.21. Suppose case (i) does not happen. Since every bad edge has fraction at least 1/2 − ε, this means that δ(v) has no bad edges. First, notice that by Lemma 5.22 and Lemma 5.23 any non-half edge in δ→(v) is 2/1/1 good (with respect to v). If there is only one half edge in δ→(v), then at least a 1 − ε_η − ( + ε) fraction of the edges are 2-1-1 good and we are done with case (ii). Otherwise, there are two good half edges e, f ∈ δ→(v).

First, by Lemma 5.26, if x_{e(A)}, x_{e(B)} ≥ ε, then e is 2/1/1 good (w.r.t. v) and we are done. Similarly, if x_{f(A)}, x_{f(B)} ≥ ε, then f is good. So assume neither of these happens.

Furthermore, by Lemma 5.25, if x_{e(B)}, x_{f(B)} ≤ ε (or x_{e(A)}, x_{f(A)} ≤ ε) then one of e, f is 2/1/1 good.

So, the only remaining case is when e, f are not 2-1-1 good and x_{e(B)}, x_{f(A)} ≤ ε. But in this case, by Lemma 5.27, e, f are 2/2/2 good; so (iii) holds.

Lemma 5.28.

For a degree cut S ∈ H, and u ∈ A(S), let A, B, C be the degree partition of u. Then, A ∩ δ→(u) has fraction at most + ε of good edges that are not 2-1-1 good (w.r.t. u).

Proof. Suppose by way of contradiction that there is a set D ⊆ A^→ of good edges that are not 2-1-1 good w.r.t. u with x(D) ≥ + ε. By Lemma 5.22 and Lemma 5.23, every edge in D is part of a half edge bundle.

There are at least two half edge bundles e, f such that x(D ∩ e), x(D ∩ f) ≥ ε, as there are at most four half edge bundles in δ→(u) (and using that for any half edge bundle e, x_e ≤ 1/2 + ε). Since D ⊆ A^→, we have x(A ∩ e), x(A ∩ f) ≥ ε.

Since x(A ∩ e) ≥ ε, if x(B ∩ e) ≥ ε then, by Lemma 5.26, e is 2-1-1 good. But since every edge in D is not 2-1-1 good w.r.t. u, we must have x(B ∩ e) < ε. The same also holds for f.

Finally, since x(B ∩ e) < ε and x(B ∩ f) < ε, by Lemma 5.25 at least one of e, f is 2-1-1 good w.r.t. u. This is a contradiction.

Definition 6.1 (ε_F fractional edge). For z ≥ we say that z is ε_F-fractional if ε_F ≤ z ≤ − ε_F.

The following lemma is the main result of this section.

Lemma 6.2 (Matching Lemma). For a degree cut S ∈ H, let F(S) ⊆ A(S) denote the set of atoms u such that x(δ↑(u)) is ε_F-fractional. Then for any ε_F ≤ , ε_B ≥ ε, α ≥ ε_η, there is a matching from good edges (see Definition 5.13) in E→(S) to edges in δ(S) such that every good edge bundle e = (u, v) (where u, v ∈ A(S)) is matched to a fraction m_{e,u} of edges in δ↑(u) and a fraction m_{e,v} of δ↑(v), where
m_{e,u} F_u + m_{e,v} F_v ≤ x_e (1 + α), (23)
m_{e,u} = m_{e,v} = 0 if e is bad, and for every atom u ∈ A(S),
∑_{e ∈ δ→(u)} m_{e,u} = x(δ↑(u)) Z_u. (24)
Here, for an atom u ∈ A(S),
F_u = 1 − ε_B · I{x(δ↑(u)) is ε_F-fractional},
i.e., the indicator is active if u is ε_F-fractional, and
Z_u := 1 + I{|A(S)| ≥ 4 and x(δ↑(u)) ≤ ε_F},
where the indicator is active when x(δ↑(u)) is very close to zero.

Roughly speaking, the intention of the above lemma is to match edges in E→(S) to a similar fraction of edges from endpoints that go higher. Eq. (23) says that if x(δ↑(u)) is fractional then edges incident to u can be matched to a larger fraction of edges in δ↑(u). On the other hand, Eq. (24) says that if x(δ↑(u)) ≈ 0, then a larger fraction of edges will match to edges in δ↑(u). This is the matching that we use in order to decide which edges will have positive slack to compensate for the negative slack of edges going higher.

Throughout this section we adopt the following notation: For a cut S ∈ H and a set W ⊆ A(S), we write
E(W, S ∖ W) := ∪_{a ∈ W, b ∈ A(S) ∖ W} E(a, b),
δ↑(W) := ∪_{a ∈ W} δ↑(a) = δ(W) ∩ δ(S),
δ→(W) := ∪_{a ∈ W} δ→(a).
Note that, in general, δ→(W) ⊄ δ(W), since it includes edge bundles between atoms in W.

Before proving the main lemma we record the following facts.

Lemma 6.3.

For any S ∈ H and W ( A ( S ) , we havex ( δ → ( W )) ≥ ∑ a ∈ W x ( δ ( a )) − ǫ /2 ≥ | W | − ǫ /2 where the sum is over the vertices in W.Proof. We have x ( δ → ( W )) = ∑ a ∈ W ( x ( δ ( a )) + x ( E ( W , S r W )) − x ( δ ↑ ( W )) ! .Since x ( δ ( S r W )) ≥ x ( δ ( S )) ≤ + ǫ , we have:(a) x ( E ( W , S r W )) + x ( δ ↑ ( S r W ))) ≥ x ( δ ↑ ( W )) + x ( δ ↑ ( S r W )) ≤ + ǫ .Subtracting (b) from (a), we get x ( E ( W , S r W )) − x ( δ ↑ ( W )) ≥ − ǫ ,which after substituting into the above equation, completes the proof of the ﬁrst inequality in thelemma statement. The second inequality follows from the fact that δ ( a ) ≥ a .59 emma 6.4. For S ∈ H , if |A ( S ) | = then there are no bad edges in E → ( S ) .Proof. Suppose A ( S ) = { u , v , w } and e = ( u , v ) is a bad edge bundle. Then | x e − | ≤ ǫ . Inaddition, by Theorem 5.14, x ( δ ↑ ( u )) , x ( δ ↑ ( v )) ≤ + ǫ . Therefore, x ( u , w ) = x ( δ ( u )) − x e − x ( δ ↑ ( u )) ≥ − ǫ .Similarly, x ( v , w ) ≥ − ǫ . Finally, since x ( δ ( S )) ≥

2, and x ( δ ↑ ( u )) , x ( δ ↑ ( v )) ≤ + ǫ , wemust have x ( δ ↑ ( w )) ≥ − ǫ . But, this contradicts the assumption that w ∈ H must satisfy x ( δ ( w )) ≤ + ǫ η . Proof of Lemma 6.2.

We will prove this by setting up a max-flow min-cut problem. Construct a graph with vertex set {s, X, Y, t}, where s, t are the source and sink. We identify X with the set of good edge bundles in E→(S) and Y with the set of atoms in A(S). For every edge bundle e ∈ X, add an arc from s to e of capacity c(s, e) := (1 + α) x_e. For every u ∈ A(S), there is an arc (u, t) with capacity
c(u, t) = x(δ↑(u)) F_u Z_u.
Finally, connect e = (u, v) ∈ X to nodes u and v ∈ Y with a directed edge of infinite capacity, i.e., c(e, u) = c(e, v) = ∞. We will show below that there is a flow saturating t, i.e., there is a flow of value
c(t) := ∑_{u ∈ A(S)} c(u, t) = ∑_{u ∈ A(S)} x(δ↑(u)) F_u Z_u.
Suppose that in the corresponding max-flow, there is a flow of value f_{e,u} on the edge (e, u). Define
m_{e,u} := f_{e,u} / F_u.
Then (23) follows from the fact that the flow leaving e is at most the capacity of the edge from s to e, and (24) follows by conservation of flow on the node u (after cancelling out F_u from both sides).

It remains to show that for any s-t cut (A, Ā) with s ∈ A and t ∈ Ā, the capacity of this cut is at least c(t).

Claim 6.5.

If A = { s } , then capacity of ( A , A ) is at least c ( t ) .Proof. If |A ( S ) | = Z u = u ∈ A ( S ) and by Lemma 6.4 all edges are good. Also, x ( E → ( S )) = ∑ a ∈A ( S ) ( x ( δ ( a )) − x ( δ ↑ ( a ))) ≥ |A ( S ) | − ( + ǫ η ) = − ǫ η /2.So, x ( E → ( S )) ≥ − ǫ η /2. Thus, for α ≥ ǫ η we have c ( s )( + α ) ≥ ( − ǫ η /2 )( + α ) ≥ + ǫ η ≥ x ( δ ( S )) = c ( t ) as desired.Now, suppose |A ( S ) | ≥

4. By Theorem 5.14 there is at most one bad half edge adjacent toevery vertex. Therefore there are at most |A ( S ) | /2 bad edges in total (the bound is met if theyform a perfect matching) which contributes to at most a total of ( + ǫ ) |A ( S ) | /2 = : x B x G : = x ( E → ( S )) − x B ≥ |A ( S ) | − − ǫ η /2 − x B of goodedges. So, ( + α ) x G ≥ ( + α ) (cid:18) |A ( S ) | ( − ǫ ) − − ǫ η (cid:19) ≥ ( + α ) (cid:18) |A ( S ) | ( − ǫ − ǫ F ) − − ǫ η + ǫ F |A ( S ) | . (cid:19) ≥ + ǫ η + ǫ F |A ( S ) | ≥ c ( t ) .where the ﬁnal inequality holds, e.g., for α ≥ ǫ η and since |A ( S ) | ≥ ǫ F ≤ |A ( S ) | =

4, and we have 0 or 1 bad edges, then x G ≥ − ǫ η /2 − ǫ , so ( + α ) x G ≥ ( + ǫ η )( − ǫ η /2 − ǫ ) ≥ + ǫ η + ǫ F for ǫ F < (and noting that with |A ( S ) | = δ ↑ ( u ) ≤ ǫ F ).Finally, suppose that |A ( S ) | = S and for each u ∈ A ( S ) , x ( δ ↑ ( u )) ≤ + ǫ (see Theorem 5.14).It also must be the case that x ( δ ↑ ( u )) ≥ ǫ F for each u ∈ A ( S ) . If not, there would have to bea node u ′ ∈ A ( S ) such that x ( δ ↑ ( u ′ )) ≥ ( − ǫ F ) /3 > + ǫ , which is a contradiction to u ′ having an incident bad edge. Thus, for each u ∈ A ( S ) , x ( δ ↑ ( u )) is ǫ F -fractional, i.e., F u = − ǫ B and Z u = c ( t ) ≤ ( + ǫ η )( − ǫ B ) . Therefore, we have c ( s ) = ( + α ) x G ≥ ( + α )( − ǫ η /2 − ( + ǫ )) = ( + α )( − ǫ − ǫ η /2, ) and the rightmost quantity is at least c ( t ) for ǫ B ≥ ǫ and α ≥ ǫ η .From now on, we assume that the min s-t cut A = { s } . In the following we will prove thatfor any set of atoms W ( S , we have: c ( s , δ → ( W )) = ( + α ) x G ( δ → ( W )) ≥ c ( δ ↑ ( W ) , t ) (25)where for a set F of edges we write x G ( F ) to denote the total fractional value of good edges in F .Let A X = A ∩ X , A Y = A ∩ Y and so on. Assuming the above inequality, let us prove thelemma: First, for the set of edges A X chosen from X , let Q be the set of endpoints of all edgebundles in A X (in A ( S ) ).Observe that we must choose all atoms in Q inside A Y due to the inﬁnite capacity arcs, i.e., Q ⊆ A Y . Let W = S r Q . Note that W = S . Then:cap ( A , A ) = c ( A Y , t ) + c ( s , A X ) ≥ c ( δ ↑ ( Q ) , t ) + c ( s , δ → ( W ))= c ( δ ↑ ( S ) , t ) − c ( δ ↑ ( W )) + c ( s , δ → ( W )) ≥ c ( δ ↑ ( S ) , t ) ,where the last inequality follows by (25).Finally, we prove (25). Suppose atoms in W are adjacent to k bad edges. Then x G ( δ → ( W )) = x ( δ → ( W )) − x B ( δ → ( W )) which by Lemma 6.3 and the fact that each bad edge has fraction at most 1/2 + ǫ , is ≥ | W | − ǫ η /2 − k ( + ǫ ) . 
(26)61o upper bound c ( δ ↑ ( W ) , t ) , we observe that for any u ∈ A ( S ) , c ( u , t ) ≤  x ( δ ↑ ( u )) Z u ≤ x ( δ ↑ ( u )) < ǫ F ( + ǫ )( − ǫ B ) if x ( δ ↑ ( u )) > ǫ F and u incident to bad edge1 + ǫ η otherwise, using Lemma 2.7.Therefore, we can write, c ( δ ↑ ( W ) , t ) ≤ k ( + ǫ )( − ǫ B ) + ( | W | − k )( + ǫ η ) .Now, to prove (25), using (26), it is enough to choose α and ǫ B such that, ( + α ) (cid:0) | W | − ǫ η /2 − k ( + ǫ ) (cid:1) ≥ k ( + ǫ )( − ǫ B ) + ( | W | − k )( + ǫ η ) ,or equivalently, | W | ( α − ǫ η ) ≥ k ( α /2 + ǫ + αǫ − ǫ B /2 − ǫ B ǫ − ǫ η ) + ǫ η ( + α ) Since every atom is adjacent to at most one bad edge, k ≤ | W | and | W | ≥

1, the inequality followsusing ǫ B ≥ ǫ and α > ǫ η and ǫ ≤ ǫ η ≤ ǫ . In this section we prove Theorem 4.33.In Section 5 we deﬁned a number of happy events, such as 2-1-1 happy or 2-2-2 happy andshowed that each of these events occurs with probability at least p . In this section, we willsubsample these events to deﬁne a corresponding decrease event that occurs with probability exactly p . Reduction Events. i) For each polygon cut S ∈ H , let R S be the indicator of a uniformly random subset ofmeasure p of the max ﬂow event E S . Note that when R S = S is happy.ii) For a top edge bundle e = ( u , v ) deﬁne H e , u =  e is 2-1-1 happy and good w.r.t. u e is 2-2-2 happy and good w.r.t. u , but not 2-1-1 good with respect to u e is 2-2 happy and good, but not 2-1-1 or 2-2-2 good with respect to u Suppose that under the distribution µ on spanning trees, some event D ′ has probability q ≥ p and we seek todeﬁne an event D ⊆ D ′ that has probability exactly p . To this end, one can copy every tree T in the support of µ ,exactly ⌊ kqp ⌋ times for some integer k > T we choose a copy uniformly at random. So,to get a probability exactly p for an event, we say this event occurs if for a “feasible” tree T one of the ﬁrst k copiesare sampled. Now, as k → ∞ the probability that D occurs converges to p . Now, for a number of decreasing events, D , D , . . . , that occur with probabilities q , q , . . . (respectively), we just need to let k be the least common multiple of p / q , p / q , . . . and follow the above procedure. Another method is to choose an independent Bernoulli with successprobability p / q for any such event D . H e , v be deﬁned similarly. Since p is a lower bound on the probability an edge isgood, we may now let R e , u and R e , v be indicators of subsets of measure p of H e , u and H e , v respectively (note R e , u and R e , v may overlap). 
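The second subsampling method mentioned above (an independent Bernoulli with success probability p/q) can be sketched as follows. This is a minimal illustration under stated assumptions, not the construction used in the analysis: the event D′ is replaced by a stand-in coin of probability q, and the names `subsample_event` and `estimate_probability` are hypothetical.

```python
import random


def subsample_event(occurs: bool, q: float, p: float, rng: random.Random) -> bool:
    """Thin an event of probability q down to probability exactly p.

    The thinned event happens only if the original event happens and an
    independent Bernoulli(p / q) coin succeeds, so it is a sub-event of
    the original one and has probability q * (p / q) = p.
    """
    if not 0 < p <= q <= 1:
        raise ValueError("need 0 < p <= q <= 1")
    return occurs and rng.random() < p / q


def estimate_probability(q: float, p: float, trials: int, seed: int) -> float:
    """Monte Carlo estimate of the probability of the thinned event."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        occurs = rng.random() < q  # stand-in for "the happy event D' holds for T"
        hits += subsample_event(occurs, q, p, rng)
    return hits / trials


if __name__ == "__main__":
    # The estimate should be close to p = 0.1 although q = 0.3.
    print(estimate_probability(q=0.3, p=0.1, trials=200_000, seed=0))
```

Because the coin is independent of the tree, conditioning on the thinned event leaves the conditional law of T given D′ unchanged.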
In this way every top edge bundle e = ( u , v ) is associated with indicators R e , u and R e , v .We set β = η /8 and τ = β . Deﬁne r : E → R ≥ as follows: For any (non-bundle) edge e , r e = ( β x e R S if p ( e ) = S for a polygon cut S ∈ H τ x e ( R f , u + R f , v ) if e ∈ f for a top edge bundle f = ( u , v ) Increase Events

Let E be the set of edge bundles, i.e., top/bottom edge bundles. Now, wedeﬁne the increase vector I : E → R ≥ as follows: • Top edges.

Let m_{e,u} be defined as in Lemma 6.2. For each top edge bundle e = (u, v), let
I_{e,u} := ∑_{g ∈ δ↑(u)} r_g · ( m_{e,u} / ∑_{f ∈ δ→(u)} m_{f,u} ) · I{u is odd}, (27)
and define I_{e,v} analogously. Let I_e = I_{e,u} + I_{e,v}.
• Bottom edges.

For each bottom edge bundle S with polygon partition A, B, C, let r(A) := ∑_{f ∈ A} r_f, r(B) := ∑_{f ∈ B} r_f, and r(C) := ∑_{f ∈ C} r_f. Then set
I_S := (1 + ε_η) ( max{ r(A) · I{S not left happy}, r(B) · I{S not right happy} } + r(C) · I{S not happy} ). (28)

The following theorem is the main technical result of this section.

Theorem 7.1.

For any good top edge bundle e, E[I_e] ≤ (1 − ε) p τ x_e, and for any bottom edge bundle S, E[I_S] ≤ β p.

Using this theorem, we can prove the desired theorem:

Theorem 4.33 (Main Payment Theorem). For an LP solution x and x be x restricted to E and a hierarchy H for some ε_η ≤ − , the maximum entropy distribution µ with marginals x satisfies the following:
i) There is a set of good edges E_g ⊆ E ∖ δ({u, v}) such that any bottom edge e is in E_g and for any (non-root) S ∈ H such that p(S) is a degree cut, we have x(E_g ∩ δ(S)) ≥ .
ii) There is a random vector s : E_g → R (as a function of T ∼ µ) such that for all e, s_e ≥ −x_e η/8 (with probability 1), and
iii) If a polygon cut u with polygon partition A, B, C is not left happy, then for any set F ⊆ E with p(e) = u for all e ∈ F and x(F) ≥ − ε_η/2, we have
s(A) + s(F) + s^−(C) ≥
where s^−(C) = ∑_{e ∈ C} min{s_e, 0}. A similar inequality holds if u is not right happy.
iv) For every cut S ∈ H such that p(S) is not a polygon cut, if δ(S)_T is odd, then s(δ(S)) ≥ .
v) For a good edge e ∈ E_g, E[s_e] ≤ −ε_P η x_e (see Eq. (31) for the definition of ε_P).

Proof of Theorem 4.33. First, we set the constants: ε = ε = ε

12 , p = ǫ , ǫ M = τ = β , β = η /8. (29)Deﬁne E g to be the set of bottom edges together with any edge e which is part of a good topedge bundle. Now, we verify (i): We show for any S ∈ H such that p ( S ) is a degree cut, x ( E g ∩ δ ( S )) ≥ x ( δ ↑ ( S )) ≥ + ǫ then all edges in δ → ( S ) are good, so the claim follows because by Lemma 2.7, x ( δ → ( S )) ≥ − ǫ η ≥ x ( δ ↑ ( S )) ≤ + ǫ . Then, by Theorem 5.14 there is at most one bad edge in δ → ( S ) . Therefore,there is a fraction at least x ( δ → ( S )) − ( + ǫ ) ≥ δ → ( S ) .For any edge e ∈ E ′ deﬁne s e = − r e + ( I f x e x f if e ∈ f for a top edge bundle f , I S x e if p ( e ) = S for a polygon cut S ∈ H . (30)Now, we verify (ii): First, we observe that s e = e is part of a bad edgebundle since we deﬁned decrease events only for good edges and m e , u is non-zero only for goodedge bundles. Since r e ≤ β x e for bottom edges and r e ≤ τ x e for top edges, and τ ≤ β ≤ η /8, wehave that r e ≤ x e η /8. It follows that s e ≥ − x e η /8.Now, we verify (iii): Suppose a polygon cut u is not left-happy. Since u is not happy we musthave R u = r e = e ∈ F . Therefore, s ( A ) + s ( F ) + s − ( C ) = s ( A ) + I S x ( F ) + s − ( C ) ≥ − r ( A ) + ( + ǫ η )( r A + r C )( − ǫ η /2 ) − r ( C ) ≥ x ( F ) ≥ − ǫ η /2.Now, we verify (iv): Let S ∈ H , where p ( S ) is a degree cut. If S is odd, then r e = e ∈ δ → ( S ) ; so by Eq. (27) s ( δ ( S )) ≥ − ∑ g ∈ δ ↑ ( S ) r g + ∑ e ∈ δ → ( S ) I e , S = − ∑ g ∈ δ ↑ ( S ) r g + ∑ e ∈ δ → ( S ) ∑ g ∈ δ ↑ ( S ) r g m e , S ∑ f ∈ δ → ( S ) m f , S = e that is part of a topedge bundle f we have E [ s e ] = − E [ r e ] + E [ I f ] x e x f ≤ − τ px e + ( − ǫ ) p τ x e = − ǫ p τ x e .On the other hand, for a bottom edge e with p ( e ) = S , then E [ s e ] = − E [ r e ] + E [ I S ] x e ≤ − β px e + p β x e ≤ − p β x e .64inally, we can let ǫ P : = ǫ p ( τ / η ) = ǫ

72 0.005 ε ≥ ε ≥ · − (31)
as desired.

In the rest of this section we prove Theorem 7.1. Throughout the proof, we will repeatedly use the following facts proved in Section 5: If a top edge e = (u, v) that is part of a bundle f is reduced (equivalently H f , u = H f , v = u and v are trees, which means that tree sampling inside u and v is independent of the reduction of e.

Note, however, that conditioning on a near-min-cut or atom to be a tree increases marginals inside and reduces marginals outside as specified by Lemma 2.23. Since for any S ∈ H, x(δ(S)) ≤ 2 + ε_η, the overall change is ± ε_η/2.

The proof of Theorem 7.1 simply follows from Lemma 7.2 and Lemma 7.7 that we will prove in the following two sections. The following lemma is the main result of this subsection.

Lemma 7.2 (Top Edge Increase). Let S ∈ H be a degree cut and e = (u, v) a good edge bundle with p(e) = S. If ε ≤ , ε ≤ ε/12 and ε_η ≤ ε, ε_F = then
E[I_{e,u}] + E[I_{e,v}] ≤ p τ x_e ( − ε ).
We will use the following technical lemma to prove the above lemma.

Lemma 7.3.

Let S ∈ H be a degree cut with an atom u ∈ A ( S ) . If x ( δ ↑ ( u )) > ǫ F , ǫ ≤ , ǫ ≤ ǫ /12 , and ǫ F = then we have ∑ g ∈ δ ↑ ( u ) , g ∈ f =( u ′ , v ′ ) good top τ x g · ( P [ δ ( u ) T odd |R f , u ′ ] + P [ δ ( u ) T odd |R f , v ′ ]) (32) + ∑ g ∈ δ ↑ ( u ) , p ( g )= S β x g · P [ δ ( u ) T odd |R S ] ≤ τ ( − ǫ ) x ( δ ↑ ( u )) F u , where recall we set F u : = − ǫ B I { u ∈ F ( S ) } in Lemma 6.2, where we take ǫ B : = ǫ .Proof of Lemma 7.2. By linearity of expectation and using Eq. (27): E [ I e , u ] = m e , u ∑ f ∈ δ → ( u ) m f , u E  ∑ g ∈ δ ↑ ( u ) r g · I { u is odd }  = m e , u ∑ f ∈ δ → ( u ) m f , u (cid:16) ∑ g ∈ δ ↑ ( u ) : g ∈ f =( u ′ , v ′ ) good top τ x g ( P [ R f , u ′ , δ ( u ) T odd ] + P [ R f , v ′ , δ ( u ) T odd ]) (33) + ∑ g ∈ δ ↑ ( u ) : p ( g )= S polygon β x g P [ R S , δ ( u ) T odd ] (cid:17)

65 similar equation holds for E [ I e , v ] .The case where x ( δ ↑ ( u )) ≤ ǫ F or x ( δ ↑ ( v )) ≤ ǫ F is dealt with in Lemma 7.6. So, consider thecase where x ( δ ↑ ( u )) , x ( δ ↑ ( v )) > ǫ F . Now recall that from (24), ∑ f ∈ δ → ( u ) m f , u = Z u δ ↑ ( u ) (34)where Z u = + I (cid:8) | S | ≥ x ( δ ↑ ( u )) ≤ ǫ F (cid:9) . In this case, Z u = Z v = P [ R f , u ′ , δ ( u ) T odd ] = p P [ δ ( u ) T odd |R f , u ′ ] , and plugging (32) into (33) for u and v , weget (and using Eq. (34)): E [ I e , u ] + E [ I e , v ] ≤ p τ ( − ǫ ) (cid:18) x ( δ ↑ ( u )) F u m e , u x ( δ ↑ ( u )) + x ( δ ↑ ( v )) F v m e , v x ( δ ↑ ( v )) (cid:19) (35) = p τ ( − ǫ )( F u m e , u + F v m e , v )] ≤ p τ ( − ǫ )( + ǫ η ) x e < p τ x e ( − ǫ ) .where on the ﬁnal line we used (23) and ǫ η < ǫ . Proof of Lemma 7.3.

Suppose that S i ∈ H are the ancestors of S in the hierarchy (in order) such S = S and for each i , S i + = p ( S i ) . Let δ ≥ i : = δ ( u ) ∩ δ ( S i ) and δ i : = δ ( u ) ∩ δ → ( S i ) .Each group of edges δ i is either entirely top edges or entirely bottom edges. First note that if g ∈ δ i and g is a bottom edge, i.e., S i + is a polygon cut, then by Corollary 5.10, P (cid:2) δ ( u ) T odd |R S i + (cid:3) = P (cid:2) δ ( u ) T odd |E S i + (cid:3) ≤ E S i + , R i + ) where in the equality we used that R S i + is a uniformly random event chosen in E S i + . Therefore, to prove Eq. (32) it is enough toshow ∑ g ∈ δ ↑ good ( u ) : g ∈ f =( u ′ , v ′ ) top, τ x g ( P [ δ ( u ) T odd |R f , u ′ ] + P [ δ ( u ) T odd |R f , v ′ ]) ≤ τ (cid:16) ( − ǫ ) F u (cid:16) x ( δ ↑ good ( u )) + x ( δ ↑ bad ( u )) (cid:17) + x ( δ ↑ β ( u )) (cid:17) (36)where we write δ β ( u ) , δ good ( u ) , δ bad ( u ) to denote the set of bottom edges, good top edges, andbad (top) edges in δ ( u ) respectively and we used that τ ( − ǫ )( − ǫ B ) − β ≥ τ since τ = β and ǫ ≤ ǫ and ǫ ≤ h ( f ) : = ( P [ δ ( u ) T odd |R f , u ′ ] + P [ δ ( u ) T odd |R f , v ′ ]) ≤ ( − ǫ ) F u is nearly 1, ineach of the following cases x ( δ ↑ β ( u )) ≥ ( F u = x ( δ ↑ ( u )) when F u = − ǫ B or x ( δ ↑ bad ( u )) ≥ F u ≥ − ǫ B ,(37)6636) holds. 
To see this, just plug in ǫ ≤ ǫ , ǫ ≤ ǫ B = ǫ , ǫ η ≤ − , x ( δ ↑ ( u )) ≤ + ǫ η and any inequality from (37) into (36), using the upper bound h ( f ) = δ top ( u ) = δ good ( u ) ∪ δ bad ( u ) be the set of top edges in δ ( u ) , if we can showthe existence of a set D ⊆ δ ↑ top ( u ) such that x ( D ) · max g ∈ D : g ∈ f =( u ′ , v ′ ) good − P [ δ ( u ) T odd |R f , u ′ ] + P [ δ ( u ) T odd |R f , v ′ ] ≥ (cid:16) ǫ + − F u (cid:17) x ( δ ↑ top ( u )) ,(38)then, again, (36) holds.In the rest of the proof, we will consider a number of cases and show that in each of them,either one of the inequalities in (37) or the inequality in (38) for some set D is true, which willimply the lemma. uS = S x ( δ ≥ ℓ ) ≥ ǫ η + ǫ S j S k S ℓ x ( δ ≥ k ) ≥ ǫ η + ǫ F x ( δ ≥ j ) ≥ − ǫ δ δ δ j S S First, let j = max { i : x ( δ ≥ i ) ≥ − ǫ } k = max { i : x ( δ ≥ i ) ≥ ǫ η + ǫ F /2 } , ℓ = max { i : x ( δ ≥ i ) ≥ ǫ η + ǫ } Just note j ≤ k ≤ ℓ . Note that levels ℓ and k exist since x ( δ ↑ ( u )) ≥ ǫ F , whereas level j may notexist (if x ( δ ↑ ( u )) < − ǫ ). We consider three cases: Case 1: x ( δ ↑ ( u )) ≥ − ǫ : Then j exists and S j has a valid A , B , C degree partitioning(Deﬁnition 5.18) where A = δ ( u ′ ) ∩ δ ( S j ) such that either u = u ′ or u ′ is a descendant of u in H . Note that, x ( δ ( u ) ∩ δ ( S j )) ≥ − ǫ , and that, in this case, x ( δ ↑ ( u )) is not ǫ F fractional(see Lemma 6.2), so F u = Case 1a: x ( δ j ) ≥ . If δ j are bottom edges then (37) holds. So, suppose that δ j is a set of topedges. By Lemma 5.28, at most 1/2 + ǫ fraction of edges in A ∩ δ j are good but not2-1-1 good (w.r.t., u ). So, the rest of the edges in A ∩ δ j are either bad or 2-1-1 good. Since x ( A ∩ δ j ) ≥ − x ( C ) ≥ − ǫ − ǫ η ,67 j either has a mass of ( − ǫ − ǫ η − ǫ ) > − ǫ of bad edges or of 2-1-1good edges. The former case implies that (37) holds. 
In the latter case, by Claim 7.4 forany 2-1-1 good edge g ∈ δ j with g ∈ f = ( u ′ , v ′ ) we have P [ δ ( u ) T odd |R f , u ′ ] ≤ ǫ η + ǫ ;so (38) holds for D deﬁned as the set of 2-1-1 good edges in δ j . Case 1b: x ( δ j ) < . If x ( δ ↑ β ( u )) ≥ ǫ = ǫ to all top edges in D = δ ≥ j + r δ ≥ ℓ + and we get that12 ( P [ δ ( u ) T odd |R f , u ′ ] + P [ δ ( u ) T odd |R f , v ′ ]) ≤ − ǫ + ǫ .Since x ( D ) ≥ − ǫ − − ǫ η − ǫ − > Case 2: − ǫ F < x ( δ ↑ ( u )) < − ǫ . Again we have F u =

1. So we can either show that x ( δ ↑ β ( u )) ≥ D to be the top edges in δ ↑ ( u ) r δ ≥ ℓ + and use Claim 7.5 with ǫ = ǫ .This will enable us to show that (38) holds as in the previous case. Case 3: ǫ F < x δ ↑ ( u ) < − ǫ F : In this case F u = − ǫ B . If at least 4/5 of the edges in δ ↑ ( u ) arebottom edges, then we are done by (37).Otherwise, let u ′ = p ( u ) . For any top edge e ∈ δ ↑ ( u ) where e ∈ f = ( u ′′ , v ′′ ) we have P [ δ ( u ) T odd |R f , u ′′ ] ≤ P (cid:2) u ′ tree |R f , u ′′ (cid:3) P (cid:2) δ ( u ) T odd | u ′ tree, R f , u ′′ (cid:3) + P (cid:2) u ′ not tree |R f , u ′′ (cid:3) Using that u ′ ⊆ u ′′ is a tree under |R f , u ′′ with probability at least 1 − ǫ η /2, and applying Claim 7.5(to u and u ′ ) with ǫ = ǫ F we have P [ δ ( u ) T odd | u ′ tree, R f , u ′′ ] ≤ − ǫ F + ǫ F we get P [ δ ( u ) T odd |R f , u ′′ ] ≤ − ǫ F + ǫ F + ǫ η /2.Now, let D be all top edges in δ ↑ ( u ) . Then, we apply Eq. (38) to this set of mass at least x ( δ ↑ ( u )) /5,and we are done, using that ( ǫ F − ǫ F ) /5 ≥ ( ǫ + ǫ B ) which holds for ǫ F ≥ ǫ B = ǫ ,and ǫ ≤ Claim 7.4.

For u ∈ H and a top edge e ∈ f = ( u ′ , v ′ ) for some u ′ ∈ H that is an ancestor of u, if x ( δ ( u ) ∩ δ ( u ′ )) ≥ − ǫ and f is 2-1-1 good, then P [ δ ( u ) T odd |R f , u ′ ] ≤ ǫ η + ǫ .

Proof.

Let A , B , C be the degree partitioning of δ ( u ′ ) . By the assumption of the claim, without loss of generality, assume A ⊆ δ ( u ) ∩ δ ( u ′ ) . This means that if R f , u ′ = u ′ is a tree and A T = = ( δ ( u ) ∩ δ ( u ′ )) T (also using C T =

P [ δ ( u ) T odd |R f , u ′ ] = P [ ( δ ( u ) ∖ δ ( u ′ )) T even |R f , u ′ ] .

To upper bound the RHS first observe that

E [ ( δ ( u ) ∖ δ ( u ′ )) T |R f , u ′ ] ≤ ǫ η /2 + x ( δ ( u ) ∖ δ ( u ′ )) ≤ ǫ η /2 + x ( δ ( u )) − x ( A ) < + ǫ η + ǫ .

We are using the fact that ǫ = ǫ /12 and that ǫ η is tiny by comparison to these. |R f , u ′ , u ′ is a tree, so u must be connected inside u ′ , i.e., ( δ ( u ) ∖ δ ( u ′ )) T ≥

P [ ( δ ( u ) ∖ δ ( u ′ )) T even |R f , u ′ ] ≤ P [ ( δ ( u ) ∖ δ ( u ′ )) T − = |R f , u ′ ] ≤ ǫ η + ǫ

as desired.

Claim 7.5.

For u , u ′ ∈ H such that u ′ is an ancestor of u, let ν = ν u ′ × ν G / u ′ be the measure resulting from conditioning u ′ to be a tree. If x ( δ ( u ) ∩ δ ( u ′ )) ∈ [ ǫ , 1 − ǫ ] , then

P ν [ δ ( u ) odd | ( δ ( u ) ∩ δ ( u ′ )) T ] ≤ − ǫ + min { ǫ η , ǫ } . (39)

Proof.

Let D = δ ( u ) ∖ δ ( u ′ ) . By assumption, u ′ is a tree, so D T ≥ ( δ ( u ) ∩ δ ( u ′ )) T

P ν [ δ ( u ) T even | ( δ ( u ) ∩ δ ( u ′ )) T ] ≥ min { P [ D T − | u ′ tree ] , P [ D T − = | u ′ tree ] }

where we removed the conditioning by taking the worst case over ( δ ( u ) ∩ δ ( u ′ )) T even, ( δ ( u ) ∩ δ ( u ′ )) T odd.

First, observe that by the assumption of the claim and the fact that x ( δ ↑ ( u )) ≤ + ǫ η we have

E [ D T − | u ′ tree ] ∈ [ ǫ , 1 − ǫ + ǫ η ] .

Furthermore, since we have a SR distribution on G [ u ′ ] , D T − P [ D T − = | u ′ tree ] ≥ ǫ − ǫ η and by Corollary 2.17 P [ D T − | u ′ tree ] ≥ − ( + e − ǫ ) ≥ ǫ − ǫ as desired.

Lemma 7.6.

Let S ∈ H be a degree cut and e = ( u , v ) a good edge bundle with p ( e ) = S. If x ( δ ↑ ( u )) < ǫ F , ǫ ≤ , ǫ ≤ ǫ /10 , then

E [ I e , u ] + E [ I e , v ] ≤ p τ x e ( − ǫ )

Proof.

First, we consider the case |A ( S ) | = 3, where A ( S ) = { u , v , w } . Let f = ( u , w ) , g = ( v , w ) (and of course e = ( u , v ) ). We will use the following facts below:

x e + x f ≥ − ǫ F ( x ( δ ( u )) ≥ x ( δ ↑ ( u )) ≤ ǫ F )
x ( δ ↑ ( v )) + x ( δ ↑ ( w )) ≥ − ǫ F ( x ( δ ( S )) ≥ )
x f , x ( δ ↑ ( w )) ≤ + ǫ η , (Lemma 2.7)

so we have,

x e , x ( δ ↑ ( v )) ≥ − ǫ F − ǫ η . (40)

Now we bound E [ I e , u ] + E [ I e , v ] . Since x ( δ ↑ ( v )) ≥ ǫ F , applying (32) and (33) to I e , v and using Z v =

E [ I e , v ] ≤ m e , v ∑ f ∈ δ → ( v ) m f , v p τ ( − ǫ ) x ( δ ↑ ( v )) F v = m e , v p τ ( − ǫ ) F v (41)

On the other hand, since by Corollary 5.10 for any bottom edge g ∈ δ ↑ ( u ) with p ( g ) = S ′ , we have P [ δ ( u ) T odd |R S ′ ] = P [ δ ( u ) T odd |E S ′ ] ≤ β ≤ τ and Z u =

E [ I e , u ] ≤ ∑ h ∈ δ ↑ ( u ) x h p τ F u · m e , u Z u x ( δ ↑ ( u )) = p τ F u m e , u . (42)

Therefore,

E [ I e , u ] + E [ I e , v ] ≤ p τ F u m e , u + p τ ( − ǫ ) F v m e , v
= p τ ( F u m e , u + F v m e , v ) − ǫ p τ F v m e , v
≤ p τ ( + ǫ η ) x e − ǫ p τ F v m e , v (43)

where the final inequality follows from (23). To complete the proof, we lower bound m e , v . Using (24) for v and w , we can write,

x ( δ ↑ ( v )) + x ( δ ↑ ( w )) = m e , v + m g , v + m f , w + m g , w
≤ m e , v + ( + ǫ η )( − ǫ B ) ( x f + x g ) (using (23))
= m e , v + ( + ǫ η )( − ǫ B ) ( ∑ a ∈ A ( S ) x ( δ ( a )) − x ( δ ( S )) − x e )
≤ m e , v + ( + ǫ η )( − ǫ B ) ( + ǫ η − x e )

and using the fact that x ( δ ↑ ( v )) + x ( δ ↑ ( w )) ≥ − ǫ F , we get

m e , v ≥ x e − ǫ F − ǫ B ≥ ( − ǫ F ) x e ,

where the second inequality follows from (40) and ǫ B = ǫ and ǫ η < ǫ and ǫ F ≥ F v ≥ − ǫ B = − ǫ we get

E [ I e , u ] + E [ I e , v ] ≤ p τ x e ( + ǫ η − ǫ ( − ǫ F )( − ǫ ) ) ≤ p τ x e ( − ǫ )

as desired. In the last inequality we used ǫ F ≤ ǫ ≤

Case 2: | S | ≥ . In this case, Z u = 2. Therefore, by Eq. (42)

E [ I e , u ] ≤ ∑ e ∈ δ ↑ ( u ) x e p τ F u m e , u Z u x ( δ ↑ ( u )) = p τ F u m e , u .

Now, either x ( δ ↑ ( v )) < ǫ F and we get the same inequality for I e , v , or x ( δ ↑ ( v )) ≥ ǫ F , in which case by (41) we get

E [ I e , v ] ≤ m e , v p τ F v ( − ǫ /5 ) .

Putting these together proves the lemma.

The following lemma is the main result of this subsection.

Lemma 7.7 (Bottom Edge Increase) . If ǫ ≤ , ǫ η ≤ ǫ , for any polygon cut S ∈ H , E [ I S ] ≤ β p . Proof.

For a set of edges D ⊆ δ ( S ) define the random variable

I S ( D ) : = ( + ǫ η )( max { r ( A ∩ D ) I { S not left happy } , r ( B ∩ D ) I { S not right happy } } + r ( C ∩ D ) I { S not happy } ) . (44)

Note that by definition I S ( δ ( S )) = I S and for any two disjoint sets D 1 , D 2 , I S ( D 1 ∪ D 2 ) ≤ I S ( D 1 ) + I S ( D 2 ) . Also, define I ↑ S = I S ( δ ↑ ( S )) and I → S = I S ( δ → ( S )) .

First, we upper bound E [ I ↑ S ] . Let f ∈ δ ↑ ( S ) and suppose that f with p ( f ) = S ′ is a bottom edge. Say we have f ∈ A ↑ ( S ) . We write,

E [ I S ( f )] = ( + ǫ η ) β x f P [ R S ′ ] P [ S not left happy |R S ′ ] ≤ x f p β

where in the inequality we used Corollary 5.11 and that P [ S not left happy |R S ] = P [ S not left happy |E S ] since R S is a uniformly random subset of E S . On the other hand, if f is a top edge, then we use the trivial bound

E [ I S ( f )] = ( + ǫ η ) τ px f . (45)

Therefore,

E [ I ↑ S ] ≤ ( + ǫ η ) τ px ( δ ↑ ( S )) ≤ ( + ǫ η )( ) β px ( δ ↑ ( S )) (46)

Now, we consider three cases:

Case 1: ˆ S = p ( S ) is a degree cut. Combining (46) and Lemma 7.8 below, we get

E [ I S ] ≤ ( + ǫ η ) p ( ) β ( + ǫ + ǫ η ) ≤ β p

using ǫ ≤ ǫ η ≤ ǫ .

Case 2: ˆ S = p ( S ) is a polygon cut with ordering u 1 , . . . , u k of A ( ˆ S ) , and S = u 1 or S = u k . Then, by Lemma 7.9 below,

E [ I S ] ≤ ( + ǫ η ) β p ( x ( δ ↑ ) + ) ≤ β p

where we used x ( δ ↑ ( S )) ≤ + ǫ η .

Case 3: ˆ S = p ( S ) is a polygon cut with ordering u 1 , . . . , u k of A ( ˆ S ) , and S ≠ u 1 , u k . Then, by Lemma 7.12 below,

E [ I S ] ≤ ( + ǫ η ) β p ( x ( δ ↑ ) + ) ≤ β p

where we use that x ( δ ↑ ( S )) ≤ ǫ η since we have a hierarchy. This concludes the proof.

7.2.1 Case 1: ˆ S is a degree cut

Lemma 7.8. Let S ∈ H be a polygon cut with parent ˆ S which is a degree cut. Then

E [ I → S ] ≤ ( + ǫ η ) p τ ( x ( δ → ( S )) − ( − ǫ )) .

Proof.

Let A , B , C be the polygon partition of S . We will show that for a constant fraction of the edges in δ → ( S ) , we can improve over the trivial bound in (45). To this end, consider the cases given by Theorem 5.21.

Case 1: There is a set of 2-1-1 good edges (w.r.t. S ) D ⊆ δ → ( S ) , such that x D ≥ − ǫ − ǫ η . For any (top) edge e ∈ f = ( S , u ) such that e ∈ D , if R f , S , then S is happy, that is A T = B T = C T =

E [ I S ( D )] ≤ ∑ e ∈ D : e ∈ f =( S , u ) + ǫ η τ x e P [ S not happy |R f , u ] P [ R f , u ] ≤ + ǫ η p τ x ( D ) .

Using the trivial inequality Eq. (45) for edges in δ → ( S ) ∖ D we get

E [ I → S ] ≤ ( + ǫ η ) p τ ( x ( D ) + x ( δ → ( S )) − x ( D )) ≤ ( + ǫ η ) p τ ( x ( δ → ( S )) − ( − ǫ ))

as desired. In the last inequality we used x ( D ) ≥ − ǫ − ǫ η .

Case 2: There are two 2-2-2 good top half edge bundles, e = ( S , v ) , f = ( S , w ) in δ → ( S ) , such that x e ( B ) , x f ( A ) ≤ ǫ . (Recall that e ( A ) = e ∩ A .) Let D = e ( A ) ∪ f ( B ) . In this case, e and f are reduced simultaneously by τ when they are 2-2-2 happy (w.r.t. S ), i.e., when R e , S = R f , S = δ ( S ) T = δ ( v ) T = δ ( w ) T = 2. Therefore,

E [ I S ( D )] ≤ ( + ǫ η ) E [ max { r ( A ∩ D ) , r ( B ∩ D ) } ]
≤ ( + ǫ η ) τ { x e ( A ) , x f ( B ) } ( P [ e , f ] + P [ R e , v ] + P [ R f , w ])
≤ ( + ǫ η ) τ p x ( D ) ( + ǫ ) = ( + ǫ η ) τ px ( D ) ( + ǫ )

− ǫ − x ( C ) ≤ x e ( A ) , x f ( B ) ≤ + ǫ and that x ( C ) ≤ ǫ η . Using the trivial inequality Eq. (45) for edges in δ → ( S ) ∖ D we get

E [ I → S ] ≤ ( + ǫ η ) p τ ( x ( D )( + ǫ ) + x ( δ → ( S )) − x ( D )) ≤ ( + ǫ η ) p τ ( x ( δ → ( S )) − ( − ǫ ))

where we used x ( D ) ≥ − ǫ − ǫ η .

Case 3: There is a bad half edge e in δ → ( S ) . Since bad edges never decrease, no corresponding increase occurs, so by the trivial bound Eq. (45)

E [ I → S ] ≤ ( + ǫ η ) p τ ( x ( δ → ( S )) − ( − ǫ )) .

This concludes the proof.

7.2.2 Case 2: S and its parent ˆ S are both polygon cuts

In this subsection we prove two lemmas: Lemma 7.9, which bounds E [ I → S ] when S is the leftmost or rightmost atom of ˆ S , and Lemma 7.12, which bounds this quantity when S is not leftmost or rightmost.

Lemma 7.9.

Let S ∈ H be a polygon cut with p ( S ) = ˆ S also a polygon cut. Let u 1 , . . . , u k be the ordering of cuts in A ( ˆ S ) (as defined in Definition 4.31). If ǫ M ≤ , ǫ η ≤ ǫ M , S = u 1 or S = u k , then E [ I → S ] ≤ β p .

Proof.

Let S be the leftmost atom of ˆ S and let A , B , C be the polygon partition of δ ( S ) . First, note

E [ I → S ] ≤ ( + ǫ η ) ( E [ max ( r ( A → ) , r ( B → )) · I { S not happy } ] + E [ r ( C → ) I { S not happy } ]) , (47)

where recall that A → = A ∩ δ → ( S ) . WLOG assume x ( A → ) ≥ x ( B → ) . Then,

E [ max { r ( A → ) , r ( B → ) } I { S not happy } ] = β px ( A → ) · P [ S not happy |R ˆ S ]

By Lemma 7.10 we have

x ( A → ) · P [ S not happy |R ˆ S ] ≤ x ( A → ) ( − (( − x ( A → )) + ( x ( A → )) − ǫ M − ǫ η ) )
≤ ( x ( A → ) − x ( A → ) + ǫ M x ( A → ) + ǫ η x ( A → ) )
≤ ( + ǫ M + ǫ η ) ,

where in the final inequality we used that the function x ↦ x ( − x ) is maximized at x = ǫ M ≤ ǫ η < ǫ M .

Plugging this back into (47), and using x ( C ) ≤ ǫ η , we get

E [ I → S ] ≤ ( + ǫ η ) β p ( + ǫ M + ǫ η ) ≤ β p ,

where the last inequality follows since ǫ M ≤ ǫ η < ǫ M .

Lemma 7.10. Let S ∈ H be a polygon cut with p ( S ) = ˆ S also a polygon cut. Let u 1 , . . . , u k be the ordering of cuts in A ( ˆ S ) . If S = u 1 (or S = u k ) then

P [ S happy |R ˆ S ] ≥ ( − x ( A → )) + ( x ( A → )) − ǫ M − ǫ η .

Proof.

Let A , B , C , ˆ A , ˆ B , ˆ C be the polygon partition of S , ˆ S respectively. Observe that since S = u 1 , we have ˆ A = E ( u 1 , ˆ S ) = A ↑ ∪ B ↑ ∪ C ↑ and ˆ B , ˆ C ∩ ( A ∪ B ∪ C ) = ∅ . Conditioned on R ˆ S , ˆ S is a tree, and the marginals of all edges in ˆ A are changed by a total variation distance at most ǫ ′ M : = ǫ M + ǫ η from x (see Corollary 5.9) and they are independent of edges inside ˆ S . The tree conditioning increases marginals inside by at most ǫ η /2. Since after the changes just described

E [ C T ] ≤ x C + ǫ η + ǫ ′ M ≤ ǫ η + ǫ M ,

it follows that P [ C T = |R ˆ S ] ≥ − ǫ η − ǫ M . So,

P [ S happy | R ˆ S ] ≥ ( − ǫ η − ǫ M ) P [ A T = B T = | C T = 0 , R ˆ S ] . (48)

Let ν be the conditional measure C T = 0 , R ˆ S . We see that

P ν [ A T = B T = ] = P ν [ A ↑ T = B ↑ T = A → T = B → T = ] + P ν [ A ↑ T = B ↑ T = A → T = B → T = ]

so using independence of ( δ ↑ ( S )) T and ( δ → ( S )) T ,

= P ν [ A ↑ T = B ↑ T = ] P ν [ A → T = B → T = ] + P ν [ A ↑ T = B ↑ T = ] P ν [ A → T = B → T = ]
≥ ( x ( A ↑ ) − ǫ ′ M ) P ν [ A → T = B → T = ] + ( x ( B ↑ ) − ǫ ′ M ) P ν [ A → T = B → T = ] .

In the final inequality, we used the fact that conditioned on R ˆ S , ˆ A = ( A ↑ ∪ B ↑ ∪ C ↑ ) T = A ↑ and B ↑ are approximately preserved.

Now, we lower bound P ν [ A → T = B → T = ] . Let ǫ A , ǫ B be such that

E ν [ A → T ] = P ν [ A → T = B → T = ] + ǫ A , E ν [ B → T ] = P ν [ A → T = B → T = ] + ǫ B

Therefore,

ǫ A + ǫ B = E ν [ A → T + B → T ] − P ν [ A → T + B → T = ]
≤ x ( δ ( S )) − x ( δ ↑ ( S )) + ǫ η
≤ ( + ǫ η + ǫ η − ( − ǫ η )) − ( − ǫ η ) ≤ ǫ η ,

where in the inequality we used x ( δ ( S )) ≤ + ǫ η , that conditioning ˆ S to be a tree and C to 0 increases marginals by at most 1.5 ǫ η , that x ( δ ↑ ( S )) ≥ − ǫ η (by Lemma 4.17) and Claim 7.11. Therefore,

P ν [ A T = B T = ] ≥ ( x ( A ↑ ) − ǫ ′ M )( E ν [ B → T ] − ǫ B ) + ( x ( B ↑ ) − ǫ ′ M )( E ν [ A → T ] − ǫ A )
≥ ( x ( A ↑ ) − ǫ ′ M )( x ( B → ) − ǫ η ) + ( x ( B ↑ ) − ǫ ′ M )( x ( A → ) − ǫ η )

where the second inequality uses that the tree conditioning and C → T = A → and B → . Simplify the above using x ( A ↑ ) + x ( A → ) ≥ − ǫ η , and similarly for B ,

P ν [ A T = B T = ] ≥ ( − x ( A → ) − ǫ η − ǫ ′ M )( x ( B → ) − ǫ η ) + ( − x ( B → ) − ǫ η − ǫ ′ M )( x ( A → ) − ǫ η )

x ( A → ) + x ( B → ) ≥ − ǫ η (because x ( A ↑ ) + x ( B ↑ ) ≤ + ǫ η and x C ≤ ǫ η ), this is

≥ ( − x ( A → ) − ǫ η − ǫ ′ M )( − x ( A → ) − ǫ η ) + ( x ( A → ) − ǫ η − ǫ ′ M )( x ( A → ) − ǫ η )
≥ ( − x ( A → )) + ( x ( A → )) − ǫ ′ M − ǫ η .

Plugging this into Eq. (48), we obtain

P [ A T = B T = C T = | R ˆ S ] ≥ ( − ǫ η − ǫ ′ M ) P [ A T = B T = | C T = 0 , R ˆ S ]
≥ ( − ǫ η − ǫ ′ M )(( − x ( A → )) + ( x ( A → )) − ǫ ′ M − ǫ η )
≥ ( − x ( A → )) + ( x ( A → )) − ǫ ′ M − ǫ η ,

which, noting ǫ ′ M = ǫ M + ǫ η , completes the proof of the lemma.

Claim 7.11.

Let ν be the conditional measure C T = 0 , R ˆ S . Then P ν [ A → T + B → T = ] ≥ − ǫ η .

Proof.

First, notice that under ν , ˆ S is a tree, and δ → ( S ) is independent of edges in G / ˆ S , though E ν [ δ → ( S ) T ] may be increased by 2 ǫ η due to R ˆ S , C T = 0, so

E ν [ δ → ( S ) T ] ≤ + ǫ η − ( − ǫ η ) + ǫ η ≤ + ǫ η ,

where we used that S = u 1 , so x ( δ ↑ ( S )) ≥ − ǫ η .

Also, since C T = 0, under ν we have δ → ( S ) T = A → T + B → T . Furthermore, since δ → ( S ) T ≥ ν ,

P ν [ A → T + B → T = ] = P ν [ δ → ( S ) T = ] ≥ − ǫ η

as desired.

Lemma 7.12.

Let S ∈ H be a polygon cut with p ( S ) = ˆ S also a polygon cut, and let u 1 , . . . , u k be the ordering of cuts in A ( ˆ S ) . If S ≠ u 1 , u k , then E [ I → S ] ≤ β p .

Proof.

Let S = u i for some 2 ≤ i ≤ k − 1. Let A , B , C be the polygon partitioning of δ ( u i ) and ˆ A , ˆ B , ˆ C be the polygon partition of ˆ S . Since u i is in the hierarchy A ↑ ∪ B ↑ ∪ C ↑ ⊆ ˆ C . So, conditioned on R ˆ S , A ↑ T = B ↑ T = C ↑ T = ν be the conditional measure C T = 0 , R ˆ S . Similar to the previous case, we will lower-bound

P [ S happy |R ˆ S ] ≥ ( − ǫ η ) P [ A → T = B → T = | C T = 0 , R ˆ S ]
= ( − ǫ η ) P ν [ A → T = | A → T + B → T = ] P ν [ A → T + B → T = ] (49)

where we used E [ C → T |R ˆ S ] ≤ ǫ η in the first inequality. So, it remains to lower-bound each of the two terms in the RHS.

We start with the first one. Since x ( A ) ∈ [ − ǫ η , 1 + ǫ η ] and x ( A ↑ ) ≤ ǫ η we have

E ν [ A → T ] ∈ [ − ǫ η , 1 + ǫ η ] .

The same bounds hold for E ν [ x ( B → )] . Therefore,

P ν [ A → T ≥ ] , P ν [ B → T ≥ ] ≥ − e − + ǫ η (Lemma 2.22)
P ν [ A → T ≤ ] , P ν [ B → T ≤ ] ≥

ǫ = ( − e − + ǫ η ) ≥ P ν [ A → T = | A → T + B → T = ] ≥

P ν [ E ( u i − , u i ) T = ] ≥ − ǫ η . Similarly, P ν [ E ( u i , u i + ) T = ] ≥ − ǫ η . And,

P ν [ δ → ( u i ) T − E ( u i − , u i ) T − E ( u i , u i + ) T = ] ≥ − ǫ η

So, by a union bound all of these events happen simultaneously and we get P ν [ δ → ( u i ) T = ] ≥ − ǫ η . Therefore,

P ν [( A → ) T = ( B → ) T = ] ≥ ( − ǫ η ) ≥

P [ S happy |R ˆ S ] ≥ ( − ǫ η ) ≥

E [ I → S ] ≤ ( + ǫ η ) β p P [ S not happy |R ˆ S ] ( max { x ( A → ) , x ( B → ) } + x ( C → ))
≤ ( + ǫ η ) β p ( − )( + ǫ η + ǫ η ) ≤ β p

as desired.

Figure 13: Setting of Lemma 5.22
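The proof of Lemma 7.12 cites Lemma 2.22 for lower bounds that appear to have the form P[X ≥ 1] ≥ 1 − e^{−μ}, where μ is the expected number of tree edges in a set. As a sanity check on the shape of such a bound, the sketch below verifies it in the simplest setting, a sum of independent Bernoulli variables, where P[X = 0] = ∏(1 − p_i) ≤ e^{−∑ p_i}. The function names and the example marginals are illustrative assumptions; the paper works with strongly Rayleigh distributions rather than independent edges, and the exact statement of Lemma 2.22 is not reproduced here.

```python
import math


def prob_at_least_one(marginals):
    """Exact P[X >= 1] for X a sum of independent Bernoulli(p_i) variables."""
    prob_zero = 1.0
    for p in marginals:
        prob_zero *= 1.0 - p  # all trials fail
    return 1.0 - prob_zero


def exponential_lower_bound(marginals):
    """The bound 1 - exp(-mu), where mu is the sum of the marginals."""
    return 1.0 - math.exp(-sum(marginals))


# Hypothetical marginals with mean mu = 1 (not taken from the paper).
example_marginals = [0.5, 0.3, 0.2]
exact = prob_at_least_one(example_marginals)
bound = exponential_lower_bound(example_marginals)
assert exact >= bound  # uses 1 - p <= exp(-p) for each factor
```

The inequality holds for any marginals in [0, 1], since each factor 1 − p_i is at most e^{−p_i}.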

A Proofs from Section 5

Lemma 5.22.

Let e = ( u , v ) be a top edge bundle such that x e ≤ − ǫ . If ǫ ≤ ǫ ≤ then, e is 2/1/1 happy with probability at least ǫ .

Proof. Let A , B , C be the degree partitioning of δ ( u ) . Let V : = δ ( v ) − e (see Fig. 13). Condition u , v be trees, e and C to 0, let ν be the resulting measure. This happens with probability at least 0.5 and increases marginals in A − e , B − e , V by at most x e + ǫ + ǫ η ≤ x e + ǫ and by tree conditioning decreases marginals by at most 2 ǫ η . After conditioning, we have

E ν [ A T ] ∈ x ( A ) − x e ( A ) + [ − ǫ η , x e + ǫ ] ⊂ [ ] , similarly E ν [ B T ] ⊂ [ ]
E ν [ V T ] ∈ x ( δ ( v )) − x e + [ − ǫ η , x e + ǫ ] ⊂ [ ]
E ν [ B T + V T ] ∈ x ( B ) + x ( δ ( v )) − x e − x e ( B ) + [ − ǫ η , x e + ǫ ] ⊂ [ + ǫ , 3.01 ] ,
E ν [ A T + B T ] ∈ x ( A ) + x ( B ) − x e ( A ) − x e ( B ) + [ − ǫ η , x e + ǫ ] ⊂ [ ] ,
E ν [ A T + B T + V T ] ∈ x ( A ) + x ( B ) + x ( δ ( v )) − x e − x e ( A ) − x e ( B ) + [ − ǫ η , x e + ǫ ] ⊂ [ + ǫ , 3.51 ] ,

where we used ǫ ≤ ǫ < ǫ and x e ( A ) , x e ( B ) , x e ( A ) + x e ( B ) ≤ x e ≤ − ǫ . It immediately follows from Proposition 5.1 that P ν [ A T = B T = V T = ] is at least a constant. In the rest of the proof, we do a more refined analysis. Using A T + B T ≥ V T ≥

P ν [ A T + B T + V T = ] ≥ ( ǫ ) e − ǫ ≥ ǫ , (Lemma 2.21)
P ν [ A T + B T ≥ ] , P ν [ V T ≥ ] ≥
P ν [ A T + B T ≤ ] , P ν [ V T ≤ ] ≥ A T + B T ≥ V T ≥ ν )
P ν [ A T ≤ ] ≥ P ν [ B T + V T ≤ ] ≥ V T ≥ ν )
P ν [ A T ≥ ] ≥ P ν [ B T + V T ≥ ] ≥ ǫ , (Lemma 2.22)

It follows by Corollary 5.5 (with ǫ = p m ≥ − ǫ ≥ P ν [ V T = | A T + B T + V T = ] ≥ V T ≥ A T + B T ≥ V T − A T + B T − P ν [ A T ≥ | A T + B T + V T = ] ≥ P ν [ A T ≤ | A T + B T + V T = ] ≥ ǫ . The same holds for B T .

Therefore, by Corollary 5.5 (with ǫ = ǫ ), using that ǫ < P ν [ A T = | A T + B T = V T = ] ≥ ǫ .

Putting these together we have

P [ e ] ≥ P ν [ A T = B T = V T = ]
= P ν [ A T + B T + V T = ] P ν [ V T = | A T + B T + V T = ] · P ν [ A T = | V T = A T + B T = ]
≥ ( ǫ )( )( ǫ ) ≥ ǫ

as desired.

Lemma 5.23.

Let e = ( u , v ) be a top edge bundle such that x e ≥ + ǫ . If ǫ ≤ ǫ ≤ , then, e is 2/1/1 happy with respect to u with probability at least ǫ .

Proof. Let A , B , C be the degree partitioning of the edges in δ ( u ) , V = δ − e ( v ) . Condition u , v be trees, C T = u ∪ v to be a tree (in order). This happens with probability at least + ǫ − ǫ η − ǫ ≥ ν be the resulting measure restricted to edges in A , B , V . Note that ν on edges in A , B , V is SR. This is because ν is a product of two strongly Rayleigh distributions on the following two disjoint sets of edges: (i) the edges between u , v and (ii) the edges in A − e , B − e , V . Furthermore, observe that under ν , every set of edges in A − e , B − e , V increases by at most 2 ǫ + ǫ η < ǫ (using 12 ǫ ≤ ǫ ), and decreases by at most 1 − x e + ǫ η . Therefore,

E ν [ A T ] ∈ x ( A ) + [ − ( − x e ) − ǫ η , 1 − x e + ǫ ] ⊂ [ ] , similarly, E ν [ B T ] ∈ [ ]
E ν [ V T ] ∈ x ( δ ( v )) − x e + [ − ( − x e ) − ǫ η , 0.2 ǫ ] ⊂ [ ] .
E ν [ A T + B T ] ∈ x ( A ) + x ( B ) + − x e ( A ) − x e ( B ) + [ − ( − x e ) − ǫ η , 0.2 ǫ ] ⊂ [ ] ,
E ν [ B T + V T ] ∈ x ( B ) + x ( δ ( v )) − x e + [ − ( − x e ) − ǫ η , 1 − x e + ǫ ] ⊂ [ − ǫ ] .
E ν [ A T + B T + V T ] ∈ x ( A ) + x ( B ) + x ( δ ( v )) + − x e − x e ( A ) − x e ( B ) + [ − ( − x e ) − ǫ η , 0.2 ǫ ] ⊂ [ − ǫ ] ,

where in the upper bound on E ν [ A T ] , E ν [ B T ] , E ν [ B T + V T ] we used that the marginals of edges in the bundle e can only increase by 1 − x e (in total) when conditioning u ∪ v to be a tree. So,

P ν [ A T + B T + V T = ] ≥ ǫ , (By Theorem 2.15)
P ν [ A T + B T ≥ ] ≥ P ν [ V T ≥ ] ≥ A T + B T ≥
P ν [ A T + B T ≤ ] ≥ P ν [ V T ≤ ] ≥ A T + B T ≥
P ν [ A T ≥ ] ≥ P ν [ B T + V T ≥ ] ≥
P ν [ A T ≤ ] ≥ P ν [ B T + V T ≤ ] ≥ ǫ , (Markov, in worst case P [ B T + V T < ] =

ǫ = p m = P ν [ A T + B T = | A T + B T + V T = ] ≥ A T + B T ≥ A T + B T − V T .

Furthermore, by Lemma 5.4, P ν [ A T ≥ | A T + B T + V T = ] ≥ ǫ and P ν [ A T ≤ | A T + B T + V T = ] ≥ B T .

Therefore, by Corollary 5.5, P ν [ A T = | A T + B T = V T = ] ≥ ǫ ,

where we used ǫ < P [ e ] ≥ ( ǫ ) ( ǫ ) ≥ ǫ , as desired.

Figure 14: Setting of Lemma 5.24
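One of the probability bounds in the proof of Lemma 5.23 is attributed to Markov's inequality. For a nonnegative integer-valued random variable X with mean μ, Markov gives P[X ≥ k + 1] ≤ μ / (k + 1), and hence P[X ≤ k] ≥ 1 − μ / (k + 1). The sketch below compares this bound with the exact distribution of a sum of independent Bernoulli variables. The marginals and function names are hypothetical, chosen only for illustration; the paper applies the inequality to edge counts under a strongly Rayleigh measure.

```python
def poisson_binomial_pmf(marginals):
    """Exact pmf of a sum of independent Bernoulli(p_i), by dynamic programming."""
    pmf = [1.0]  # distribution of the empty sum
    for p in marginals:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails
            nxt[k + 1] += mass * p      # this trial succeeds
        pmf = nxt
    return pmf


def markov_lower_bound(mean, k):
    """Markov: P[X <= k] >= 1 - mean / (k + 1) for a nonnegative integer X."""
    return max(0.0, 1.0 - mean / (k + 1))


# Hypothetical marginals with mean 1.8 (not taken from the paper).
marginals = [0.6, 0.5, 0.4, 0.3]
pmf = poisson_binomial_pmf(marginals)
exact = sum(pmf[:3])  # P[X <= 2]
assert exact >= markov_lower_bound(sum(marginals), 2)
```

Markov's bound is usually far from tight, which is consistent with the proof remarking on the worst case separately.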

Lemma 5.24.

For a good half top edge bundle e = ( u , v ) , let A , B , C be the degree partitioning of δ ( u ) , and let V = δ ( v ) − e (see Fig. 14). If x e ( B ) ≤ ǫ and P [( A − e ) T + V T ≤ ] ≥ ǫ then e is 2-1-1 good, P [ e ] ≥ ǫ

Proof.

The proof is similar to Lemma 5.23. We condition u , v to be trees, C T = u ∪ v to be a tree. Let ν be the resulting SR measure on edges in A , B , V . The main difference is since x e + ǫ we use the lemma's assumptions to lower bound P ν [ A T + B T + V T = ] , P ν [ A T + V T ≤ ] , P ν [ B T + V T ≤ ] .

First, since e is 2-2 good, by Lemma 5.15 and negative association,

P ν [( δ ( u ) − e ) T + V T ≤ ] ≥ P [( δ ( u ) − e ) T + V T ≤ ] − P [ C T = ] ≥ ǫ − ǫ − ǫ η ≥ ǫ ,

where we used ǫ < ǫ . Letting p i = P [( δ ( u ) − e ) T + V T = i ] , we therefore have p ≤ ≥ ǫ . In addition, by Lemma 2.21, p ≥ p < ǫ , then from p / p ≤ ǫ , we could use log-concavity to derive a contradiction to p ≤ ≥ ǫ (analogously to what is done in the proof of Lemma 2.18). Therefore, we must have

P ν [ A T + B T + V T = ] = P ν [( δ ( u ) − e ) T + V T = ] ≥ ǫ .

Next, notice since P [ u , v , u ∪ v trees, C T = ] ≥ P ν [ e ( B )] ≤ ǫ . Therefore,

E ν [ B T + V T ] ≤ x ( V ) + x ( B ) + ǫ + ǫ + ǫ η ≤

P ν [ B T + V T ≤ ] ≥

P ν [ A T + V T ≤ ] ≥ P ν [( A − e ) T + V T ≤ ] ≥ P [( A − e ) T + V T ≤ ] − P [ C T = ] ≥ ǫ

where we used the lemma's assumption.

Now, following the same line of arguments as in Lemma 5.23, we have P ν [ A T + B T = | A T + B T + V T = ] ≥ P ν [ A T ≥ | A − T + B T + V T = ] ≥ P ν [ A T = | A T + B T = V T = ] ≥ ǫ . This implies

P [ e ] ≥ ( ǫ ) ( ǫ ) ≥ ǫ

as desired.

Lemma 5.25. Let e = ( v , u ) and f = ( v , w ) be good half top edge bundles and let A , B , C be the degree partitioning of δ ( v ) such that x e ( B ) , x f ( B ) ≤ ǫ . Then, one of e , f is 2-1-1 happy with probability at least ǫ .

Proof. Let U = δ ( u ) − e . By Lemma 2.27, we can assume, without loss of generality, that

E [ U T | f ∉ T , u , v , w tree ] ≤ x ( U T ) + + ǫ η . (50)

On the other hand,

E [( A − e − f ) T ] ≥ E [( A − e − f ) T | f ∉ T , u , v , w tree ] P [ f ∉ T , u , v , w tree ] ≥ E [( A − e − f ) T | f ∉ T , u , v , w tree ]

E [( A − e − f ) T | f ∉ T , u , v , w tree ] ≤ x ( A − e − f ) ≤ ( ǫ + ǫ η ) ≤ ǫ . (51)

Combining (50) and (51), we get

E [ U T + ( A − e ) | f ∉ T , u , v , w tree ] ≤ ǫ ≤

P [ U T + ( A − e ) T ≤ ] ≥ P [ U T + ( A − e ) T ≤ | f ∉ T , u , v , w tree ] ≥ ǫ ≤ e is 2-1-1 good.

Figure 15: Setting of Lemma 5.26.
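The proofs of Lemma 5.24 above and Lemma 5.26 below both appeal to log-concavity of the distribution of an edge count: writing p_i for the probability that the count equals i, log-concavity means p_i² ≥ p_{i−1} · p_{i+1}, so a middle term is at least the geometric mean of its neighbours. The sketch below checks this property on a binomial distribution, used here only as a convenient stand-in (an assumption of this example); the paper applies the property to counts under strongly Rayleigh measures. The helper names are hypothetical.

```python
import math


def is_log_concave(seq, tol=1e-12):
    """Check p_k^2 >= p_{k-1} * p_{k+1} for every interior index k."""
    return all(
        seq[k] ** 2 + tol >= seq[k - 1] * seq[k + 1]
        for k in range(1, len(seq) - 1)
    )


def binomial_pmf(n, p):
    """pmf of Binomial(n, p), a simple example of a log-concave sequence."""
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


pmf = binomial_pmf(4, 0.5)
assert is_log_concave(pmf)
# Consequence used in such arguments: the middle term dominates the
# geometric mean of its two neighbours.
assert pmf[1] >= math.sqrt(pmf[0] * pmf[2])
# A sequence with a dip in the middle is not log-concave.
assert not is_log_concave([0.4, 0.1, 0.5])
```

This is how a bound on the ratio of two consecutive probabilities propagates to the rest of the sequence, which is the kind of step the proofs rely on when deriving a contradiction.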

Lemma 5.26.

Let e = ( u , v ) be a good half edge bundle and let A , B , C be the degree partitioning of δ ( u ) (see Fig. 15). If ǫ ≤ ǫ ≤ and x e ( A ) , x e ( B ) ≥ ǫ , then P [ e ] ≥ ǫ . Proof.

Condition C T to be zero, u , v and u ∪ v be trees. This happens with probability at least 0.49. Let ν be the resulting measure. Let X = A − e ∪ B − e , Y = δ ( v ) − e . Since e is 2/2 good by Lemma 5.15 and stochastic dominance,

P ν [ X T + Y T ≤ ] ≥ P [( δ ( u ) − e ) T + Y T ≤ ] − P [ C T = ] ≥ ǫ − ǫ − ǫ η ≥ ǫ ,

where we used ǫ < ǫ . It follows by log-concavity of X T + Y T that P ν [ X T + Y T = ] ≥ ǫ . Now,

E ν [ X T ] , E ν [ Y T ] ∈ [ − ǫ , 1.5 + ǫ + ǫ + ǫ η ] ⊂ [ ]

P ν [ X T ≥ ] , P ν [ Y T ≥ ] ≥ P ν [ X T ≤ ] , P ν [ Y T ≤ ] ≥ P ν [ X T = | X T + Y T = ] ≥

P ν [ X T = Y T = ] ≥ ( ǫ ) ≥ ǫ ,

Let E be the event { X T = Y T = | ν } . Note that in ν we always choose exactly 1 edge from the e bundle and that is independent of edges in X , Y , in particular the above event. Therefore, we can correct the parity of A , B by choosing from e A or e B . It follows that

P [ e u ] ≥ P ν [ E ] ( ǫ ) ≥ ǫ ,

where we used that E ν [ e ( A ) T ] ≥ ǫ , and the same fact for e ( B ) T . To see why this latter fact is true, observe that conditioned on u , v trees, we always sample at most one edge between u , v . Therefore, since under ν we choose exactly one edge between u , v , the probability of choosing from e ( A ) (and similarly choosing from e ( B ) ) is at least

E [ e ( A ) T | u , v trees, C T = ] P [ e | u , v trees, C T = ] ≥ x e ( A ) − ǫ η x e + ǫ ≥ ǫ − ǫ η + ǫ ≥ ǫ

as desired.

Figure 16: Setting of Lemma 5.27. We assume that the dotted green/blue edges are at most ǫ . Note that edges of C are not shown.

Lemma 5.27.

Let e = (u, v), f = (v, w) be two good top half edge bundles and let A, B, C be the degree partitioning of δ(v) such that x_{e(B)}, x_{f(A)} ≤ ε. If e, f are not 2-1-1 good with respect to v, and ε ≤ ε ≤ , then e, f are 2-2-2 happy with probability at least .
Proof. First, observe that by Lemma 5.24 if P[U_T + (A − e)_T ≤ ] ≥ ε, where ε ≥ ε is a constant that we fix later, then e is 2-1-1 good and we are done. So, assume P[U_T + (A − e)_T ≥ ] ≥ − ε. Furthermore, let q = P[U_T + (A − e)_T ≥ ]. Since
x(U) + x(A − e) ≤ + ε + ε + ε_η ≤ + ε
(where we used x_{e(A)} ≥ x_e − x_{e(B)} − x_C ≥ − ε − ε − ε_η and 12ε ≤ ε),
2( − q − ε) + q ≤ + ε.
This implies that q ≤ ε + ε ≤ ε (for ε ≥ ε). Therefore,
P[U_T + (A − e)_T = ], P[W_T + (B − f)_T = ] ≥ − ε (52)
where the second inequality follows by a similar argument.
Claim A.1.

Let Z = δ(u) ∩ δ(w). If ε < , then either E[Z | u, v, w tree] ≤ ε or E[Z | u, v, w tree] ≥ ( − ε).
Proof. For the whole proof we work with µ conditioned on u, v, w being trees. Let z = E[Z]. Let D = U ∪ W ∪ (A − e) ∪ (B − f) ∖ Z. Note that D_T + Z_T = U_T ∪ W_T ∪ (A − e)_T ∪ (B − f)_T. By Eq. (52) and a union bound, P[D_T + Z_T = ] ≥ − ε − ε_η. Therefore,
2.1ε ≥ ε + ε_η ≥ P[D_T + Z_T = ] ≥ P[D_T = ] ≥ √(P[D_T = ] P[D_T = ])
where the last inequality follows by log-concavity. On the other hand,
z = P[Z = ] ≤ P[D_T = Z = ] + P[D_T + Z_T = ] ≤ P[D_T = ] + ε,
1 − z = P[Z = ] ≤ P[D_T = Z = ] + P[D_T + Z_T = ] ≤ P[D_T = ] + ε.
Putting everything together,
(ε) ≥ (z − ε)( − z − ε) = z( − z) − ε + ε.
Therefore, using ε ≤ z ≤ ε or z ≥ − ε.
So, for the rest of the proof we assume E[Z_T | u, v, w trees] < ε. A similar proof shows e, f are 2-2-2 good when E[Z_T | u, v, w trees] > − ε. We run the following conditionings in order: u, v, w trees, Z_T = C_T = e(B), f ∉ T, e(A) ∈ T. Note that e(A) ∈ T is equivalent to u ∪ v being a tree. Call this event E (i.e., the event that all things we conditioned on happen). First, notice
P[E] ≥ ( − ε_η)( − ε − ε − ε_η − ε − ( + ε))( − ε) ≥ ≥ µ|E is strongly Rayleigh. The main statement we will show is that
P[e, f | E] ≥ P[U_T = (A − e)_T = (B − f)_T = W_T = | E] = Ω( ).
The main insight of the proof is that Eq. (52) holds (up to a larger constant of ε), even after conditioning on E, B − f = A − e = 1; so, we can bound the preceding event by just a union bound. The main non-trivial statement is to argue that the expectations of B − f and A − e do not change much under E.
Combining (52) and (53),
P[U_T + (A − e)_T = | E], P[W_T + (B − f)_T = | E] ≥ − ε. (54)
We claim that
E[B_T | E] = E[(B − f)_T | E] ≤ x(B − f) + ε_η + ε + ε + ε ≤ ε < ε = ε.
To see this, observe that after each conditioning in E either all marginals increase or all decrease. Furthermore, the events C_T = Z_T = e(B)_T = ε_η + ε + ε; the only other event that can increase B − f is f ∉ T. Now we know P[(B − f)_T + W_T = | E] ≥ − ε before and after conditioning f ∉ T. Therefore, by Corollary 2.19, 2 − ε ≤ E[(B − f)_T + W_T] ≤ + ε. But if E[(B − f)_T] increased by more than 35ε, then either before conditioning f ∉ T, E[(B − f)_T + W_T] < − ε or afterwards it is more than 2 + ε, which is a contradiction, and completes the proof of (55). A similar argument shows that
E[(A − e)_T | E] ≤
E[(A − e)_T | E] ≥ x(A − e) − ε_η − ε ≥ E increases E[(A − e)_T] except for possibly e(A) ∈ T.
As above, we know that P[U_T + (A − e)_T = | E] ≥ − ε before and after e(A) ∉ T. So again applying Corollary 2.19, we see that it cannot decrease by more than 35ε.
It follows that
0.33 ≤ E[(A − e)_T | E] ≤
E[(A − e)_T | E, (B − f)_T = ] ≤ + ≤
P[(A − e)_T = | E, (B − f)_T = ] ≥ e − .33 ≥
P[E, (A − e)_T = (B − f)_T = ] ≥ ( )( )( ) ≥
P[U_T = | E, (A − e)_T = (B − f)_T = ], P[W_T = | E, (A − e)_T = (B − f)_T = ] ≥ − ε/0.019
Finally, by union bound
P[U_T = W_T = | E, (A − e)_T = (B − f)_T = ] ≥ − ε/0.009
Using ε = ε and ε ≤ e, f are 2-2-2-happy with probability 0.019( − ε/0.009) >>