Strongly Exponential Separation Between Monotone VP and Monotone VNP
Srikanth Srinivasan∗
Department of Mathematics, IIT Bombay
August 3, 2020
Abstract
We show that there is a sequence of explicit multilinear polynomials P_n(x_1, …, x_n) ∈ ℝ[x_1, …, x_n] with non-negative coefficients that lies in monotone VNP such that any monotone algebraic circuit for P_n must have size exp(Ω(n)). This builds on (and strengthens) a result of Yehudayoff (2018), who showed a lower bound of exp(Ω̃(√n)).

This paper deals with a problem in
Algebraic Complexity, which is the study of the complexity of computing multivariate polynomials over some underlying field F. The model of computation is the algebraic circuit model, which computes polynomials from F[x_1, …, x_n] using the basic sum and product operations in this ring. This model and its variants have been studied by a large body of work (see, e.g., the surveys [18, 15]).

The central question in the area is Valiant's [19] VP vs. VNP question. The class VP contains sequences (P_n(x_1, …, x_n))_{n≥1} of polynomials of polynomially bounded degree (i.e., deg(P_n) ≤ n^{O(1)}) that can be computed by polynomial-sized algebraic circuits. The class VNP contains sequences (Q_n(x_1, …, x_n))_{n≥1} where

Q_n(x_1, …, x_n) = Σ_{b_1, …, b_m ∈ {0,1}} P_{n+m}(x_1, …, x_n, b_1, …, b_m),

where m is polynomially bounded in n and (P_r(x_1, …, x_r))_{r≥1} is in VP.

Like its Boolean analogue, the VP vs. VNP question has proved stubbornly hard to resolve, the principal bottleneck being our inability to prove explicit algebraic circuit lower bounds. Given this, it is natural to look at variants of this question.

∗ Email: [email protected]
In a recent paper [21], Yehudayoff considered the monotone version of the VP vs. VNP question, which is defined as follows. The underlying field is ℝ and the polynomials being computed have non-negative coefficients. A monotone algebraic circuit is one where all the constants appearing in the circuit are non-negative. The monotone versions of VP and VNP, denoted MVP and MVNP respectively, are defined analogously: MVP contains (sequences of) polynomials that have small monotone algebraic circuits; MVNP contains (sequences of) polynomials that can be written as exponential Boolean sums over polynomials in MVP.

Monotone algebraic circuits have been studied since the 80s, and explicit exponential lower bounds are known for this model via the work of Schnorr [16] and Jerrum and Snir [9] (see also [20, 17, 6, 13]). However, as Yehudayoff [21] pointed out, these results do not imply a separation between MVP and MVNP. In fact, most of the monotone circuit lower bounds proved in earlier work also imply that the same polynomials do not belong to MVNP, and hence do not imply a separation between these two classes.

The main result of [21] was the resolution of the MVP vs. MVNP question. More precisely, Yehudayoff showed that there is an explicit sequence of multilinear polynomials (P_n(x_1, …, x_n))_{n≥1} in MVNP such that any monotone algebraic circuit for P_n must have size exp(Ω̃(√n)). In this paper, we strengthen this result to a strongly exponential lower bound.
Theorem 1.
There is an explicit sequence of multilinear polynomials (P_n(x_1, …, x_n))_{n≥1} in MVNP such that any monotone algebraic circuit for P_n must have size exp(Ω(n)).

This theorem bears a similar relation to Yehudayoff's result as some later works [6, 13] bear to the result of Schnorr [16]. Schnorr [16] proved a lower bound of exp(Ω(√n)) for an explicit family of polynomials; a similar lower bound was also proved for an explicit family of polynomials by Jerrum and Snir [9]. These bounds were strengthened to strongly exponential lower bounds by a series of works of Kuznetsov, Kasim-Zade, and Gashkov in the USSR in the 80s [11, 5, 6], and independently by a more recent result of Raz and Yehudayoff [13].

High level idea.
We rely on a connection between monotone algebraic circuit lower bounds and communication complexity that was made explicit by Raz and Yehudayoff [13]. As shown in [13], if a multilinear polynomial P ∈ ℝ[x_1, …, x_n] has a monotone algebraic circuit of size s, then we get a decomposition

P = Σ_{i=1}^{s} g_i h_i    (1)

where each term g_i h_i satisfies the property that g_i and h_i are non-negative multilinear polynomials that depend on disjoint sets of at least n/3 variables each. We call such a term a non-negative product polynomial. Thus, to prove a lower bound on the monotone circuit complexity of P, it suffices to lower bound the number of terms in any decomposition as in (1).

As noted by Jerrum and Snir [9], one way to do this is via the support of the polynomial P, by which we mean the set of monomials that have non-zero coefficients in P. We think of this set, denoted Supp(P), as a subset of 2^{[n]} by identifying each multilinear monomial on x_1, …, x_n with a subset of [n] in the natural way. Given a decomposition of P into non-negative product polynomials as in (1), we immediately get Supp(P) = ∪_{i ∈ [s]} Supp(g_i · h_i).

∗ The one exception to this seems to be a lower bound of Raz and Yehudayoff [13]. Here, it is unclear whether the hard polynomials lie in MVNP, and we are unable to rule it out.
∗ These explicit polynomials were based on the Clique and the Permanent, respectively.
∗ Unfortunately, journal versions of these papers are not easily available, but we refer to a survey of Gashkov and Sergeev [6] for a very interesting account of this line of work, along with details of some of these results.
And so it suffices to obtain a P such that any such decomposition of Supp(P) must have large size.

Such decompositions are closely related to a model of communication complexity known as Multipartition Communication Complexity, introduced by Ďuriš, Hromkovič, Jukna, Sauerhoff and Schnitger [4] (see also the earlier result of Borodin, Razborov and Smolensky [2]). The multipartition communication complexity of a subset
S ⊆ 2^{[n]} (or equivalently, of a Boolean function f : {0,1}^n → {0,1}) is defined as follows. We define a rectangle R ⊆ 2^{[n]} to be any set of the form {A ∪ B | A ∈ 𝒜, B ∈ ℬ}, where 𝒜 ⊆ 2^Y and ℬ ⊆ 2^Z and (Y, Z) is a partition of [n]. Further, we say that both the partition and the rectangle R are balanced if |Y|, |Z| ≥ n/3. Finally, the multipartition communication complexity of S is defined to be ⌈log k⌉ where k is the smallest integer such that S can be decomposed as the union of k many balanced rectangles.

To see the connection to algebraic complexity, note that if P ∈ ℝ[x_1, …, x_n] has monotone algebraic circuits of size s, then (1) implies that Supp(P) has multipartition communication complexity at most ⌈log s⌉. In particular, a linear lower bound in this model for some explicit S implies that any non-negative polynomial P with support exactly S cannot be computed by monotone algebraic circuits of subexponential size.

Polynomial (but sublinear) lower bounds for multipartition communication complexity were implicit in the work of Borodin et al. [2] and were extended to linear (but somewhat non-explicit) lower bounds in the work of Ďuriš et al. [4]. An explicit linear lower bound for this model is implicit in a result of Bova, Capelli, Mengel and Slivovsky [3]. (See also the related work of Hayes [7]. Similar constructions are attributed to Wigderson in [13] and carried out by Jukna [10].) The hard problem of [3] is quite easy to describe. Fix a regular expander graph G on vertex set [n] with constant degree d. The associated hard problem is given by taking S to be the set of all vertex covers in G. Said differently, we consider the Boolean function f_G(x_1, …, x_n) = ⋀_{{i,j} ∈ E(G)} (x_i ∨ x_j).

As mentioned above, the communication complexity lower bound on S immediately yields a strongly exponential lower bound on the monotone algebraic complexity of some explicitly defined polynomial.
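As a concrete illustration, the function f_G and its support (the set of vertex covers of G) can be tabulated by brute force on a toy graph. The 4-cycle below merely stands in for the constant-degree expander; this is a sanity-check sketch, not part of the construction.

```python
from itertools import product

def f_G(edges, x):
    """Vertex-cover predicate: 1 iff every edge {i, j} has x_i = 1 or x_j = 1."""
    return int(all(x[i] or x[j] for (i, j) in edges))

# Toy instance: a 4-cycle stands in for the constant-degree expander G.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
covers = [x for x in product([0, 1], repeat=4) if f_G(edges, x)]
# The support of the hard polynomial of [3] is exactly this set of vertex covers.
```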
Unfortunately, as observed by Yehudayoff [21], this does not yield a separation between MVNP and MVP. This is because the above argument implies that any P that has support S requires monotone algebraic circuits of exponential size. Yehudayoff showed that for any polynomial P in MVNP, there is a polynomial-sized monotone algebraic circuit that computes a polynomial Q with the same support. In particular, the polynomial P cannot be in MVNP, as that would contradict our lower bound above. Thus, to obtain a separation between MVNP and MVP along these lines, some new idea is necessary.

We take our cue from the multipartition communication complexity lower bound above, but modify it suitably to obtain a somewhat different lower bound candidate polynomial P. Our proof method for the lower bound, as in [21], is based not just on the support of P, but rather on the sizes of the coefficients of P. We define a probability distribution µ on the monomials of P and show that for any non-negative product polynomial g_i h_i in a decomposition as in (1), a random monomial (chosen according to µ) has much smaller coefficient in the product polynomial than in P. As the product polynomials sum to P, there must be many of them. This yields the lower bound. We explain this in some more detail below.

∗ Recall that we call a family of d-regular graphs (G_n)_{n≥1} (with G_n a graph on n vertices) an expander sequence if the second largest (in absolute value) eigenvalue of its adjacency matrix is at most d^{1−Ω(1)}. For the problem defined above, take G = G_n in such a sequence.

Detailed outline.
The heart of the multipartition communication complexity lower bound for the function f_G is a more standard lower bound for the non-deterministic communication complexity of the Disjointness problem. Here, the non-deterministic communication complexity of a function f (or equivalently, of the set system S ⊆ 2^{[n]} given by f^{−1}(1)) is defined in a similar way to multipartition communication complexity, except that each balanced rectangle R is defined over the same equipartition (Y, Z) of [n], which we can take to be the sets [n/2] and [n] \ [n/2] respectively; and the Disjointness function D(x) is defined by the Boolean predicate ⋀_{i ∈ [n/2]} (x_i ∨ x_{i+n/2}).

The lower bound for the Disjointness function is proved by a standard
fooling set argument (see, e.g., [12]). We consider the 2^{n/2} × 2^{n/2} communication matrix M, where the rows and columns are labelled by Boolean settings to the variables indexed by Y and Z respectively, and the (i, j)th entry of M is the Disjointness predicate evaluated on the corresponding input. Further, assume that the rows are ordered using the lexicographic ordering of {0,1}^Y, and the columns are ordered according to the reverse lexicographic ordering of {0,1}^Z. This ensures that for any i ∈ [2^{n/2}], the diagonal entry M(i, i) corresponds to an input of the form (a, ā), where a ∈ {0,1}^Y and ā is the bitwise complement of a. From the definition of the Disjointness function, one can check that each diagonal entry of M is 1; further, given i ≠ j, either M(i, j) or M(j, i) is 0. This implies that any rectangle over (Y, Z) that contains the ith diagonal entry cannot contain the jth diagonal entry for any j ≠ i. In particular, the number of rectangles required to cover all the diagonal entries is 2^{n/2}, implying a linear lower bound on the non-deterministic communication complexity of the Disjointness function.

For the multipartition setting, we can follow the above strategy to prove a lower bound for the function f_G defined above. The intuition is that for any graph G, the function f_G contains many copies of the Disjointness function above. In particular, taking any induced matching M of size m in G and setting the variables corresponding to vertices i ∉ V(M) to 1, we get a copy f_M of the Disjointness function on 2m bits. Given any rectangle R over the partition (Y, Z), one can similarly prove that R cannot contain many (suitably defined) "diagonal entries" of the communication matrix of f_M, as long as M contains many (say Ω(m)) edges from the cut defined by (Y, Z) in G.

But there is a subtle question of how to choose M as above.

∗ Strictly speaking, the Disjointness function is ⋀_{i ∈ [n/2]} (¬x_i ∨ ¬x_{i+n/2}), but we keep this definition for simplicity.
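The two properties of the communication matrix used above (every diagonal entry is 1; for i ≠ j, at least one of M(i, j) and M(j, i) is 0) can be checked exhaustively for small n. The sketch below uses the simplified predicate ⋀_i (x_i ∨ x_{i+n/2}) from the text.

```python
from itertools import product

def D(a, b):
    """The simplified Disjointness-style predicate: AND over i of (a_i or b_i)."""
    return int(all(ai or bi for ai, bi in zip(a, b)))

half = 3  # plays the role of n/2
rows = list(product([0, 1], repeat=half))
comp = lambda a: tuple(1 - v for v in a)

# All diagonal entries (a, complement of a) equal 1 ...
assert all(D(a, comp(a)) == 1 for a in rows)
# ... and for a != b at least one of the two cross entries is 0, so a single
# rectangle can cover at most one diagonal entry: 2^(n/2) rectangles are needed.
assert all(D(a, comp(b)) == 0 or D(b, comp(a)) == 0
           for a in rows for b in rows if a != b)
```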
In the multipartition setting, the partition (Y, Z) is not known ahead of time and, furthermore, each rectangle comes with its own underlying partition. This is where the expanding nature of the graph G comes in. Standard facts about expander graphs imply that given any balanced partition (Y, Z) (i.e., |Y|, |Z| ≥ n/3), a constant fraction of the edges of G lie in the cut defined by (Y, Z). In particular, choosing M randomly guarantees that many edges of M lie in the cut with high probability. This leads to a proof of the multipartition communication complexity lower bound.

We now describe how this connects to the lower bounds of this paper for monotone algebraic circuits. We will follow a similar strategy, but instead of the 0s and 1s of the Boolean predicate, we will analyze the coefficients of the multilinear monomials in P and in the terms of the decomposition in (1). The polynomial P is defined using an expander graph G on vertex set [n] (let us skip over the precise definition of P for the moment), and the hard distribution µ over the monomials of P is again just the process of choosing a random induced matching M of size m in G and considering the monomial ∏_{i ∈ V(M)} x_i.

The proof of the lower bound then proceeds as follows. Assume that P has a circuit of size s and consider the decomposition given in (1). Given a term g_i h_i of the decomposition, we get a balanced partition (Y_i, Z_i) of the underlying variable set x_1, …, x_n. We argue that for a random monomial m chosen according to the distribution µ, the expected value of the coefficient of m in g_i h_i is much smaller than its coefficient in P. To do this, we use a numerical analogue of the fooling set technique outlined above. Again, we consider the "communication matrix" M, which now is a 2^{|Y_i|} × 2^{|Z_i|} matrix whose rows and columns are labelled by multilinear monomials in Y_i and Z_i respectively, and such that the entry corresponding to a pair of monomials (m_1, m_2) is the coefficient of the product monomial m_1 · m_2 in P.
The main technical part of the proof shows the following: for independently sampled monomials m′ and m″ (chosen from the distribution µ) that factor as m′ = m′_1 · m′_2 and m″ = m″_1 · m″_2 respectively, where m′_1, m″_1 are monomials over Y_i and m′_2, m″_2 are monomials over Z_i, the coefficients of the "cross monomials" m̂ := m′_1 · m″_2 and m̃ := m″_1 · m′_2 in P are much smaller than the coefficients of m′ and m″ in P. This immediately implies that the coefficients of m′ and m″ in g_i h_i are smaller than they are in P, by the following simple argument. If we let Coeff(m, Q) denote the coefficient of a monomial m in a polynomial Q, then we see that

Coeff(m′, g_i h_i) · Coeff(m″, g_i h_i) = Coeff(m′_1, g_i) Coeff(m′_2, h_i) Coeff(m″_1, g_i) Coeff(m″_2, h_i)
= Coeff(m̂, g_i h_i) · Coeff(m̃, g_i h_i).

The latter term is upper bounded by the product of the coefficients of the monomials m̂ and m̃ in P (because of the decomposition (1)), which we already argued are much smaller than the coefficients of m′ and m″ in P. This implies that a randomly chosen monomial m has a much smaller coefficient in any product term g_i h_i than in P. Therefore, there must be many such terms in the decomposition (1). This implies the lower bound.

The above outline also indicates the property of P that allows the lower bound proof to work: we would like the coefficients of m′ and m″ to be much larger than those of the monomials m̂ and m̃. We do this by designing a polynomial P in MVNP where the coefficient of any monomial ∏_{i ∈ S} x_i grows with the number of edges in the subgraph of G induced by the set S. Recall that for a random monomial m chosen according to µ, S is the vertex set of a matching of size m and hence this induced subgraph has m edges.

∗ For some technical reasons, we will actually choose M so that the non-adjacent vertices of M are at distance at least 3 from each other. But this can be ignored for now.
However, if m is sufficiently smaller than n (say m ≤ αn for a small enough constant α > 0), then for two independently sampled monomials we do not expect the corresponding vertex sets to have too many edges between them. This is what allows us to bound the coefficients of m̂ and m̃, and prove the lower bound as above.

Notation.
Throughout, let n ≥ 1 and let X = {x_1, …, x_n} be a set of indeterminates. We use x_S to denote the monomial ∏_{i ∈ S} x_i. Given a polynomial P ∈ ℝ[x_1, …, x_n] and S ⊆ [n], we use Coeff(x_S, P) to denote the coefficient of the monomial x_S in the polynomial P.

Let (G_n)_{n>d} be an explicit sequence of d-regular expander graphs on n vertices with second largest eigenvalue (in absolute value) at most d^{1−Ω(1)}. Here, d is a large enough constant as specified below. Such an explicit sequence of expander graphs can be constructed using, say, [14]. The only fact we will use about expanders is the following, which is an easy consequence of the Expander Mixing Lemma [1] (see also [8, Lemma 2.5]).

For any pair of disjoint sets U, V ⊆ V(G_n), we use E(U, V) to denote the set of edges {u, v} ∈ E(G_n) such that u ∈ U and v ∈ V. Also, let E(U) denote the set of edges e = {u, v} ∈ E(G_n) such that u, v ∈ U.

Lemma 2 (Corollary to Expander Mixing Lemma). Let G_n be as above. Then, for any disjoint sets U, V ⊆ [n] such that |U|, |V| ∈ [n/3, 2n/3], we have |E(U, V)| ≥ |E(G_n)|/5, as long as d is a large enough constant.

From now on, d will be fixed to be a large enough constant so that the inequality in Lemma 2 holds.

We define the polynomial P_n(x_1, …, x_n) as follows. We assume that V(G_n) = [n]. For each edge e ∈ E(G_n), introduce a variable x′_e and let X′ = {x′_e | e ∈ E(G_n)}. Notice that each Boolean assignment to the variables in X′ yields a subgraph H of G_n (consisting of the edges e with x′_e = 1). In particular, if the variables in X′ are set randomly to Boolean values, we get a random subgraph H of G_n with the same vertex set [n]. We use deg_H(i) to denote the degree of the vertex i in the graph H.

We now define P_n(x_1, …
, x_n) by

P_n(x_1, …, x_n) = E_{x′_e ∈ {0,1} ∀e ∈ E(G_n)} [ ∏_{i ∈ [n]} (1 + x_i · 2^{deg_H(i)}) ]    (2)

= E_{x′_e ∈ {0,1} ∀e ∈ E(G_n)} [ Σ_{S ⊆ [n]} x_S · 2^{Σ_{i ∈ S} deg_H(i)} ]    (3)

where the variables x′_e are set to 0 or 1 independently and uniformly at random.

Lemma 3.
The sequence of polynomials (P_n) as defined above is in MVNP.

Proof.
Using (2), we see that

P_n(x_1, …, x_n) = (1/2^{|E(G_n)|}) Σ_{x′_e ∈ {0,1} : e ∈ E(G_n)} ∏_{i ∈ [n]} (1 + x_i · 2^{Σ_{e ∋ i} x′_e}).

Since G_n is d-regular, it suffices to show that each function f : {0,1}^d → ℝ defined by f(x′_1, …, x′_d) = 2^{Σ_{j ∈ [d]} x′_j} can be represented by a constant-sized polynomial over x′_1, …, x′_d with non-negative coefficients. But this is clear, since f(x′_1, …, x′_d) = Σ_{S ⊆ [d]} ∏_{i ∈ S} x′_i on Boolean inputs. ∎

The main theorem of this section is the following.
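The identity underlying this proof — that 2 raised to a sum of Boolean values equals the monotone multilinear polynomial Σ_{S ⊆ [d]} ∏_{i ∈ S} x′_i = ∏_j (1 + x′_j) on Boolean inputs — can be verified exhaustively:

```python
from itertools import product

def equals_sum_over_subsets(d):
    """Check that 2^(x'_1 + ... + x'_d) agrees with the non-negative
    multilinear polynomial prod_j (1 + x'_j), i.e. the sum over all S of
    prod_{i in S} x'_i, on every Boolean input."""
    for bits in product([0, 1], repeat=d):
        poly = 1
        for b in bits:
            poly *= 1 + b
        if poly != 2 ** sum(bits):
            return False
    return True
```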
Theorem 4.
Any monotone circuit computing P_n has size exp(Ω(n)).

We need the following lemma from [13]. We say that a pair of multilinear polynomials (g, h) ∈ ℝ[X] forms a non-negative product pair if g, h are polynomials with non-negative coefficients, and there is a partition X = Y ∪ Z with n/3 ≤ |Y|, |Z| ≤ 2n/3 such that g ∈ ℝ[Y] and h ∈ ℝ[Z].

Lemma 5 ([13], Lemma 3.3). Assume that P_n has a monotone circuit of size s. Then

P_n(X) = Σ_{i=1}^{s+1} g_i h_i

where for each i ∈ [s+1], (g_i, h_i) forms a non-negative product pair.

Corollary 6.
Assume that P_n has a monotone circuit of size s. Let µ be any probability distribution on subsets S ⊆ [n]. Then, there is a non-negative product pair (g, h) such that

• gh ≤ P_n, i.e., Coeff(x_S, gh) ≤ Coeff(x_S, P_n) for each S ⊆ [n],
• E_{S∼µ}[Coeff(x_S, gh)/Coeff(x_S, P_n)] ≥ 1/(s+1).

(The quantity Coeff(x_S, gh)/Coeff(x_S, P_n) is well defined since, by (3), the denominator is non-zero for all S ⊆ [n].)

Proof. Write P_n = Σ_{i ≤ s+1} g_i h_i as in Lemma 5. Since all the terms have non-negative coefficients, each satisfies g_i h_i ≤ P_n. For any fixed S ⊆ [n] and a uniformly random i ∈ [s+1], we have

E_{i ∈ [s+1]}[Coeff(x_S, g_i h_i)/Coeff(x_S, P_n)] = (1/(s+1)) Σ_{i ∈ [s+1]} Coeff(x_S, g_i h_i)/Coeff(x_S, P_n) = 1/(s+1).

In particular, the above also holds when S is chosen according to µ. The result now follows by averaging over i ∈ [s+1]. ∎

Given Corollary 6, to prove Theorem 4, it suffices to show the following.

Lemma 7.
There is a probability distribution µ on subsets S ⊆ [n] such that for any non-negative product pair (g, h) with gh ≤ P_n, we have

E_{S∼µ}[Coeff(x_S, gh)/Coeff(x_S, P_n)] ≤ exp(−Ω(n)).    (4)

We need some preparatory work before proving Lemma 7.

Lemma 8.
There exist constants A, B > 0 such that

P_n(X) = Σ_{S ⊆ [n]} x_S · B^{|S|} · A^{|E(S)|}.

Proof. Using (3), we obtain

P_n(x_1, …, x_n) = E_{x′_e : e ∈ E(G_n)} [ Σ_{S ⊆ [n]} x_S · 2^{Σ_{i ∈ S} deg_H(i)} ] = Σ_{S ⊆ [n]} x_S · E_{x′_e : e ∈ E(G_n)} [ 2^{Σ_{i ∈ S} deg_H(i)} ],

where H is the random subgraph of G_n defined by a uniformly random Boolean assignment to the variables in X′. Note that

Σ_{i ∈ S} deg_H(i) = Σ_{i ∈ S} Σ_{e ∋ i} x′_e = Σ_{e ∈ E(S, S̄)} x′_e + 2 Σ_{e ∈ E(S)} x′_e.

Hence, we get for any S ⊆ [n],

E_{x′_e : e ∈ E(G_n)} [ 2^{Σ_{i ∈ S} deg_H(i)} ] = ∏_{e ∈ E(S, S̄)} E_{x′_e}[2^{x′_e}] · ∏_{e ∈ E(S)} E_{x′_e}[2^{2x′_e}]
= (3/2)^{|E(S, S̄)|} · (5/2)^{|E(S)|} = (3/2)^{|S|d} · (5/2)^{|E(S)|} / (3/2)^{2|E(S)|},

where for the last equality we have used the fact that 2|E(S)| + |E(S, S̄)| = |S|d. Note that this proves the lemma with B = (3/2)^d and A = (5/2)/(3/2)² = 10/9. ∎
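Lemma 8 can be sanity-checked by brute force on a small d-regular graph (a 4-cycle with d = 2 below, standing in for the expander; exact arithmetic via fractions). The check computes Coeff(x_S, P_n) directly from definition (3) and compares it with B^{|S|} · A^{|E(S)|}.

```python
from fractions import Fraction
from itertools import combinations, product

# A 2-regular toy graph (4-cycle) standing in for the d-regular expander.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n, d = 4, 2
B = Fraction(3, 2) ** d   # (3/2)^d
A = Fraction(10, 9)

def coeff(S):
    """Coeff(x_S, P_n) from (3): expectation over uniform x' in {0,1}^{E(G)}
    of 2^(sum over i in S of deg_H(i)), where H keeps edges with x'_e = 1."""
    total = Fraction(0)
    for bits in product([0, 1], repeat=len(edges)):
        H = [e for e, b in zip(edges, bits) if b]
        total += 2 ** sum(sum(i in e for e in H) for i in S)
    return total / 2 ** len(edges)

# Lemma 8: Coeff(x_S, P_n) = B^|S| * A^|E(S)| for every subset S.
for k in range(n + 1):
    for S in combinations(range(n), k):
        ES = [(u, v) for (u, v) in edges if u in S and v in S]  # E(S)
        assert coeff(S) == B ** len(S) * A ** len(ES)
```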
We now define the probability distribution µ that will be shown to have the property in (4). The distribution is defined by the following sampling process. Let m = αn, where α ∈ (0, 1) is a small constant specified below.

Sampling Algorithm S:
1. Set M = ∅. (Eventually, M will be a matching of size m/2 in G_n.)
2. For i = 1 to m/2:
   (a) Remove all vertices of G_n that are at distance at most 2 from any vertex in the matching M. Let G^{(i)}_n be the resulting graph.
   (b) Choose a uniformly random edge e_i from E(G^{(i)}_n) and add it to M.
3. Output M.

The above algorithm defines a distribution ν over matchings M in G_n of size m/2. We define S = V(M) to be the set of vertices sampled by the algorithm. This defines a probability distribution µ over subsets of [n].

We will need the following properties of the above algorithm.

Lemma 9 (Properties of S). Let M be sampled as in S above and let S = V(M). Then we have:

1. Assuming that α ≤ 1/(100 · d²), we have |M| = m/2, |S| = m and E(S) = M with probability 1.

2. Let (U, V) be any partition of V(G_n) such that n/3 ≤ |U|, |V| ≤ 2n/3. Then, as long as α ≤ 1/(100 · d²), for some absolute constant γ > 0, we have

Pr_M[|M ∩ E(U, V)| ≤ γm] ≤ exp(−γm).
3. Let M_1 and M_2 be two independent samples obtained by running S twice, and let S_i = V(M_i) (i ∈ [2]). Let (U, V) be a partition of V(G_n) as above. Define R_i = S_i ∩ U and T_i = S_i ∩ V. Then, for α ≤ γ ln A/(100 · A⁴ · d²), we have

E_{M_1,M_2}[A^{|E(R_1,T_2)| + |E(R_2,T_1)|}] ≤ A^{γm/2}.

Here, γ is as in the previous item and A is as in the statement of Lemma 8.

Proof. Item 1 easily follows from the definition of the sampling algorithm S. Note that in each iteration of Step 2, we remove at most 2 · (1 + d + d²) vertices and hence at most 2(d + d² + d³) edges from the graph G_n. Hence, the upper bound on α guarantees that after i < m/
2 iterations of the for loop, the number of edges removed from the graph is at most 2i · (d + d² + d³) ≤ 3md³ = 3αnd³ < nd/4, so there is always at least one edge left in G^{(i)}_n to add to the matching M. Moreover, since all vertices within distance 2 of M are removed before each step, distinct edges of M end up at distance at least 3 from each other; hence M is an induced matching, |S| = m, and E(S) = M.

For Item 2, we proceed as follows. For i ∈ {1, …, m/2}, let e_i be the edge chosen by the sampling algorithm S in the ith iteration of Step 2. Fix any choices of all the e_j with j < i and consider the ith iteration of Step 2. The probability that e_i lies in E(U, V) is |E_i(U, V)|/|E(G^{(i)}_n)|, where E_i(U, V) is the set of edges in G^{(i)}_n with one endpoint each in U and V. Note that

|E_i(U, V)| ≥ |E(U, V)| − |E(G_n) \ E(G^{(i)}_n)| ≥ nd/10 − 3αnd³ ≥ nd/20,

since |E(U, V)| ≥ |E(G_n)|/5 = nd/10 by Lemma 2 and 3αnd³ ≤ 3nd/100 ≤ nd/20 by the bound on α. Hence, we have shown that for each i,

Pr[e_i ∈ E(U, V) | e_1, …, e_{i−1}] = |E_i(U, V)|/|E(G^{(i)}_n)| ≥ (nd/20)/(nd/2) = 1/10.

In particular, for any T ⊆ [m/2], the probability that e_i ∉ E(U, V) for every i ∈ T can be upper bounded by (9/10)^{|T|}. Thus, the probability that |M ∩ E(U, V)| ≤ ℓ = γm can be bounded by

Pr_M[∃ T ⊆ [m/2] with |T| = m/2 − ℓ s.t. ∀ i ∈ T, e_i ∉ E(U, V)] ≤ Σ_T Pr_M[∀ i ∈ T, e_i ∉ E(U, V)]
≤ C(m/2, ℓ) · (9/10)^{m/2−ℓ} ≤ (em/2ℓ)^ℓ · (9/10)^{m/2−ℓ} = ((e/2γ) · (9/10)^{1/(2γ)−1})^{γm} ≤ exp(−γm)

as long as γ is bounded by a small enough absolute constant. This finishes the proof of Item 2.

We now prove Item 3. Fix any possible matching M_1 as sampled by the algorithm S. It suffices to bound E_{M_2}[A^{|E(R_1,T_2)| + |E(R_2,T_1)|}] for each such M_1. Let S̃_1 denote the set of vertices that are at distance at most 1 from S_1 and let E_1 denote the set of edges that have at least one endpoint in S̃_1. Note that |E_1| ≤ |S̃_1| · d ≤ 2|S_1| · d² = 2md² = 2αnd².

We claim that |E(R_1, T_2)| + |E(R_2, T_1)| ≤ 4|E_1 ∩ M_2|.
The reason for this is that if a vertex i ∈ S_2 is incident to an edge e in E(R_1, T_2) ∪ E(R_2, T_1), then i ∈ S̃_1 and hence the edge e′ ∈ M_2 involving i is an edge in E_1 ∩ M_2. In particular, the number of such vertices i ∈ S_2 is at most 2|E_1 ∩ M_2|. Further, each such vertex i is adjacent to at most 2 vertices in S_1, since vertices in S_1 that are not adjacent via an edge in M_1 are at distance at least 3 from each other. Thus each such vertex i contributes at most 2 to |E(R_1, T_2)| + |E(R_2, T_1)|. This yields the claimed inequality. Thus it suffices to bound E_{M_2}[A^{4|E_1 ∩ M_2|}].
We start with a tail bound for |E_1 ∩ M_2|. Let M_2 = {e′_1, …, e′_{m/2}}, where e′_j is the jth edge added to M_2 by the algorithm S. Conditioned on e′_1, …, e′_{j−1}, the probability that e′_j ∈ E_1 is at most

|E_1|/|E(G^{(j)}_n)| = |E_1|/(|E(G_n)| − |E(G_n) \ E(G^{(j)}_n)|) ≤ 2αnd²/((nd/2) − 3αnd³) ≤ 2αnd²/(nd/4) = 8αd²,

where for the first inequality we have bounded |E(G_n) \ E(G^{(j)}_n)| and |E_1| as above, and for the second inequality we have used the bound on α. Hence, we have

Pr_{M_2}[|E_1 ∩ M_2| ≥ i] ≤ Σ_{T ⊆ [m/2], |T| = i} Pr_{M_2}[∀ j ∈ T, e′_j ∈ E_1] ≤ C(m/2, i) · (8αd²)^i.

This allows us to bound E_{M_2}[A^{4|E_1 ∩ M_2|}] for any fixed M_1 output by S:

E_{M_2}[A^{4|E_1 ∩ M_2|}] ≤ Σ_{i=0}^{m/2} A^{4i} · Pr_{M_2}[|E_1 ∩ M_2| ≥ i] ≤ Σ_{i=0}^{m/2} A^{4i} · C(m/2, i) · (8αd²)^i = (1 + 8αd²A⁴)^{m/2} ≤ exp(4αd²A⁴m) ≤ exp((mγ ln A)/2) = A^{γm/2},

where the last inequality follows from the bound α ≤ γ ln A/(100 · A⁴ · d²) assumed in the statement of the lemma. ∎

We are now ready to prove Lemma 7, which will complete the proof of Theorem 4.

Proof of Lemma 7.
We set m = αn so that α is a positive constant upper bounded by γ ln A/(100 · A⁴ · d²) and m is even. Assume that M is sampled as above by the sampling algorithm S and S = V(M). This defines the distribution µ on subsets of [n].

Let (g, h) be any non-negative product pair such that gh ≤ P_n. Consequently, there exists a partition (U, V) of V(G_n) = [n] such that n/3 ≤ |U|, |V| ≤ 2n/3 with g ∈ ℝ[x_i : i ∈ U] and h ∈ ℝ[x_j : j ∈ V].

Let E = E(M) denote the event that |M ∩ E(U, V)| ≤ γm. By Lemma 9, Item 2, we know that Pr_M[E] ≤ exp(−γm) = exp(−Ω(n)), and hence we have

E_M[Coeff(x_S, gh)/Coeff(x_S, P_n)]
≤ E_M[Coeff(x_S, gh)/Coeff(x_S, P_n) | E] · Pr_M[E] + E_M[Coeff(x_S, gh)/Coeff(x_S, P_n) | Ē] · Pr_M[Ē]
≤ Pr_M[E] + E_M[Coeff(x_S, gh)/Coeff(x_S, P_n) | Ē]
≤ exp(−Ω(n)) + (1/(B^m A^{m/2})) · E_M[Coeff(x_S, gh) | Ē],    (5)

where for the second inequality we have used that gh ≤ P_n (so each ratio is at most 1), and for the final inequality we have used our bound on Pr_M[E] along with Lemma 8 and Lemma 9, Item 1 (which together give Coeff(x_S, P_n) = B^{|S|} A^{|E(S)|} = B^m A^{m/2} with probability 1).

We now bound the latter term in (5). For any i, j, k, let E_{i,j,k} = E_{i,j,k}(M) denote the event that |M ∩ E(U, V)| = i, |M ∩ E(U)| = j, and |M ∩ E(V)| = k. The event Ē is partitioned into the events E_{i,j,k} where i + j + k = m/
2 and i > γm. Let T denote the set of such triples (i, j, k). We have

E_M[Coeff(x_S, gh) | Ē] = Σ_{(i,j,k) ∈ T} E_M[Coeff(x_S, gh) | E_{i,j,k}] · Pr_M[E_{i,j,k} | Ē].

Call a triple (i, j, k) ∈ T heavy if Pr_M[E_{i,j,k}] ≥ A^{−γm/4} and light otherwise. Note that, as Pr_M[Ē] = 1 − exp(−Ω(n)) ≥ 1/2, we have Pr_M[E_{i,j,k} | Ē] ≤ 2 Pr_M[E_{i,j,k}]. In particular, if (i, j, k) is light, we have Pr_M[E_{i,j,k} | Ē] ≤ 2A^{−γm/4} = exp(−Ω(n)). Plugging this into the expression above, and using Coeff(x_S, gh) ≤ Coeff(x_S, P_n) = B^m A^{m/2}, we get

E_M[Coeff(x_S, gh) | Ē] = Σ_{(i,j,k) ∈ T} E_M[Coeff(x_S, gh) | E_{i,j,k}] · Pr_M[E_{i,j,k} | Ē]
≤ |{(i, j, k) : (i, j, k) light}| · B^m A^{m/2} · exp(−Ω(n)) + max_{(i,j,k) heavy} E_M[Coeff(x_S, gh) | E_{i,j,k}]
≤ exp(−Ω(n)) · B^m A^{m/2} + max_{(i,j,k) heavy} E_M[Coeff(x_S, gh) | E_{i,j,k}].    (6)

It therefore suffices to bound E_M[Coeff(x_S, gh) | E_{i,j,k}] for any heavy (i, j, k). This is the main part of the proof.

Fix some (i, j, k) ∈ T that is heavy. Let C = E_M[Coeff(x_S, gh) | E_{i,j,k}]. Thus, we get

C² = E_{M_1,M_2}[Coeff(x_{S_1}, gh) · Coeff(x_{S_2}, gh)],

where M_1 and M_2 are independent samples of M conditioned on the event E_{i,j,k}(M), and S_ℓ = V(M_ℓ) for ℓ ∈ {1, 2}. Define R_ℓ = S_ℓ ∩ U and T_ℓ = S_ℓ ∩ V. We make some simple observations. For each ℓ ∈ [2]:
1. |R_ℓ| = i + 2j and |T_ℓ| = i + 2k,
2. |E(R_ℓ)| = j, |E(T_ℓ)| = k,
3. Coeff(x_{S_ℓ}, gh) = Coeff(x_{R_ℓ}, g) · Coeff(x_{T_ℓ}, h).
Thus, we have

C² = E_{M_1,M_2}[Coeff(x_{R_1}, g) Coeff(x_{T_1}, h) Coeff(x_{R_2}, g) Coeff(x_{T_2}, h)]
= E_{M_1,M_2}[Coeff(x_{R_1 ∪ T_2}, gh) · Coeff(x_{R_2 ∪ T_1}, gh)]
≤ E_{M_1,M_2}[Coeff(x_{R_1 ∪ T_2}, P_n) · Coeff(x_{R_2 ∪ T_1}, P_n)]
= E_{M_1,M_2}[B^{|R_1|+|T_2|} · A^{|E(R_1 ∪ T_2)|} · B^{|R_2|+|T_1|} · A^{|E(R_2 ∪ T_1)|}]
= E_{M_1,M_2}[B^{2m} · A^{|E(R_1)|+|E(T_2)|+|E(R_2)|+|E(T_1)|+|E(R_1,T_2)|+|E(R_2,T_1)|}]
= B^{2m} A^{m−2i} · E_{M_1,M_2}[A^{|E(R_1,T_2)|+|E(R_2,T_1)|}]
≤ B^{2m} A^{m(1−2γ)} · E_{M_1,M_2}[A^{|E(R_1,T_2)|+|E(R_2,T_1)|}],    (7)

where we used the observations above for the equalities (in particular, |R_1| + |T_2| = |R_2| + |T_1| = 2(i + j + k) = m and |E(R_1)| + |E(T_2)| + |E(R_2)| + |E(T_1)| = 2j + 2k = m − 2i), and for the final inequality we used the fact that i > γm for all (i, j, k) ∈ T.

To bound the latter term in (7), we consider a similar expression where M_1 and M_2 are replaced by M′_1 and M′_2, which are independent random outputs of the algorithm S (without any conditioning). In this case, by Lemma 9, Item 3, we have

E_{M′_1,M′_2}[A^{|E(R′_1,T′_2)|+|E(R′_2,T′_1)|}] ≤ A^{γm/2},

where R′_ℓ, T′_ℓ are defined analogously for ℓ ∈ [2]. Thus, using Bayes' rule, we have

E_{M_1,M_2}[A^{|E(R_1,T_2)|+|E(R_2,T_1)|}] ≤ E_{M′_1,M′_2}[A^{|E(R′_1,T′_2)|+|E(R′_2,T′_1)|}] / Pr_{M′_1,M′_2}[E_{i,j,k}(M′_1) ∧ E_{i,j,k}(M′_2)] ≤ A^{γm/2} · A^{γm/2} = A^{γm},

where the last inequality uses the fact that (i, j, k) is heavy, so that Pr[E_{i,j,k}(M′_1) ∧ E_{i,j,k}(M′_2)] ≥ A^{−γm/2}. Plugging the above into (7), we have

C² ≤ B^{2m} A^{m(1−2γ)} · A^{γm} = B^{2m} A^{m(1−γ)}, and hence C ≤ B^m A^{m/2 − γm/2} = B^m A^{m/2} exp(−Ω(n)).

As this holds for any heavy (i, j, k), using (6) and (5), we obtain the statement of the lemma. ∎
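For intuition, the sampling algorithm S can be sketched in a few lines. The sketch below uses a plain cycle as a stand-in for the expander (an illustration only, with a hypothetical toy instance) and checks the guarantee of Lemma 9, Item 1: the output is an induced matching whose vertex set S satisfies E(S) = M.

```python
import random

def sample_matching(adj, num_edges, rng):
    """Sketch of the sampler S: repeatedly pick a uniformly random edge of the
    graph left after deleting every vertex within distance 2 of the current
    matching, so that distinct matched edges end up at distance >= 3 apart."""
    alive = set(adj)
    M = []
    for _ in range(num_edges):
        cand = [(u, v) for u in alive for v in adj[u] if v in alive and u < v]
        if not cand:  # graph exhausted; cannot happen in the m = alpha*n regime
            break
        e = rng.choice(cand)
        M.append(e)
        ball = set(e)                                     # the new edge ...
        ball |= {w for u in e for w in adj[u]}            # ... its neighbours ...
        ball |= {w2 for w in set(ball) for w2 in adj[w]}  # ... distance 2
        alive -= ball
    return M

# Toy instance: a 30-cycle standing in for the d-regular expander G_n.
n = 30
adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
M = sample_matching(adj, 3, random.Random(0))
S = {v for e in M for v in e}
```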
Acknowledgements.
The author is grateful to Mrinal Kumar and Amir Yehudayoff for very helpful discussions and encouragement. This work was done during a visit to the "Lower Bounds in Computational Complexity" program at the Simons Institute for the Theory of Computing; the author is grateful to the organizers of this program and the Simons Institute for their hospitality. The author is also grateful to Igor Sergeev and an anonymous reviewer for pointing out the large body of work on monotone circuit lower bounds carried out by the Russian mathematical community (see [6]). Finally, the author would like to thank the anonymous reviewers (who reviewed this paper for the ACM Transactions on Computation Theory) for their comments and suggestions.

References

[1] N. Alon and F. Chung. Explicit construction of linear sized tolerant networks.
Discrete Mathematics, 72(1):15–19, 1988.

[2] A. Borodin, A. A. Razborov, and R. Smolensky. On lower bounds for read-k-times branching programs. Computational Complexity, 3:1–18, 1993.

[3] S. Bova, F. Capelli, S. Mengel, and F. Slivovsky. Expander CNFs have exponential DNNF size. CoRR, abs/1411.1995, 2014.

[4] P. Ďuriš, J. Hromkovič, S. Jukna, M. Sauerhoff, and G. Schnitger. On multi-partition communication complexity. Inf. Comput., 194(1):49–75, 2004.

[5] S. B. Gashkov. On the complexity of monotone computations of polynomials. Vestn. Mosk. Univ., Ser. I, 1987(5):7–13, 1987.

[6] S. B. Gashkov and I. S. Sergeev. A method for deriving lower bounds for the complexity of monotone arithmetic circuits computing real polynomials. Sbornik: Mathematics, 203(10), 2012.

[7] T. P. Hayes. Separating the k-party communication complexity hierarchy: an application of the Zarankiewicz problem. Discrete Mathematics & Theoretical Computer Science, 13(4):15–22, 2011.

[8] S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications. Bulletin of the American Mathematical Society, 43(4):439–561, 2006.

[9] M. Jerrum and M. Snir. Some exact complexity results for straight-line computations over semirings. J. ACM, 29(3):874–897, 1982.

[10] S. Jukna. Lower bounds for tropical circuits and dynamic programs. Theory Comput. Syst., 57(1):160–194, 2015.

[11] O. M. Kasim-Zade. The complexity of monotone polynomials. In Proceedings of the All-Union seminar on discrete mathematics and its applications (Russian) (Moscow, 1984), pages 136–138. Moskov. Gos. Univ., Mekh.-Mat. Fak., Moscow, 1986.

[12] A. Rao and A. Yehudayoff. Communication Complexity and Applications. Cambridge University Press, 2020.

[13] R. Raz and A. Yehudayoff. Multilinear formulas, maximal-partition discrepancy and mixed-sources extractors. J. Comput. Syst. Sci., 77(1):167–190, 2011.

[14] O. Reingold, S. Vadhan, and A. Wigderson. Entropy waves, the zig-zag graph product, and new constant-degree expanders. Annals of Mathematics, 155(1):157–187, 2002.

[15] R. Saptharishi. A survey of lower bounds in arithmetic circuit complexity. GitHub survey, 2015.

[16] C.-P. Schnorr. A lower bound on the number of additions in monotone computations. Theoret. Comput. Sci., 2(3):305–315, 1976.

[17] E. Shamir and M. Snir. Lower bounds on the number of multiplications and the number of additions in monotone computations. IBM Thomas J. Watson Research Division, 1977.

[18] A. Shpilka and A. Yehudayoff. Arithmetic circuits: A survey of recent results and open questions. Foundations and Trends in Theoretical Computer Science, 5(3–4):207–388, 2010.

[19] L. G. Valiant. Completeness classes in algebra. In Proceedings of the 11th Annual ACM Symposium on Theory of Computing (STOC), pages 249–261, 1979.

[20] L. G. Valiant. Negation can be exponentially powerful. Theor. Comput. Sci., 12:303–314, 1980.

[21] A. Yehudayoff. Separating monotone VP and VNP. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing (STOC), 2019.