Tight Size-Degree Bounds for Sums-of-Squares Proofs
Massimo Lauria, KTH Royal Institute of Technology
Jakob Nordström, KTH Royal Institute of Technology
April 8, 2015
Abstract
We exhibit families of 4-CNF formulas over $n$ variables that have sums-of-squares (SOS) proofs of unsatisfiability of degree (a.k.a. rank) $d$ but require SOS proofs of size $n^{\Omega(d)}$ for values of $d = d(n)$ from constant all the way up to $n^{\delta}$ for some universal constant $\delta$. This shows that the $n^{O(d)}$ running time obtained by using the Lasserre semidefinite programming relaxations to find degree-$d$ SOS proofs is optimal up to constant factors in the exponent. We establish this result by combining NP-reductions expressible as low-degree SOS derivations with the idea of relativizing CNF formulas in [Krajíček ’04] and [Dantchev and Riis ’03], and then applying a restriction argument as in [Atserias, Müller, and Oliva ’13] and [Atserias, Lauria, and Nordström ’14]. This yields a generic method of amplifying SOS degree lower bounds to size lower bounds, and also generalizes the approach in [ALN14] to obtain size lower bounds for the proof systems resolution, polynomial calculus, and Sherali-Adams from lower bounds on width, degree, and rank, respectively.

1 Introduction

Let $f_1, \ldots, f_s \in \mathbb{R}[x_1, \ldots, x_n]$ be real, multivariate polynomials. Then the Positivstellensatz proven in [Kri64, Ste73] says (as a special case) that the system of equations
\[ f_1 = 0, \ldots, f_s = 0 \tag{1.1} \]
has no solution over $\mathbb{R}^n$ if and only if there exist polynomials $g_j, q_\ell \in \mathbb{R}[x_1, \ldots, x_n]$ such that
\[ \sum_{j=1}^{s} g_j f_j = -1 - \sum_{\ell} q_\ell^2 . \tag{1.2} \]
That there can exist no solution given an expression of the form (1.2) is clear, but what is more interesting is that there always exists such an expression to certify unsatisfiability. We refer to (1.2) as a Positivstellensatz proof or sums-of-squares (SOS) proof of unsatisfiability, or as an
SOS refutation, of (1.1). (All proofs for systems of polynomial equations or for formulas in conjunctive normal form (CNF) in this paper will be proofs of unsatisfiability, and we will therefore use the two terms “proof” and “refutation” interchangeably.) We remark that the Positivstellensatz also applies if we add inequalities $h_1 \geq 0, \ldots, h_t \geq 0$ to the system of equations and allow terms $-h_j \sum_{\ell} q_{j,\ell}^2$ on the right-hand side in (1.2).

The degree of an SOS refutation is the maximal degree of any $g_j f_j$. (This is sometimes also referred to as the “rank,” but we will stick to the term “degree” in this paper.) The search for proofs of constant degree $d$ is automatizable as shown in a sequence of works by Shor [Sho87], Nesterov [Nes00], Lasserre [Las01], and Parrilo [Par00]. What this means is that if there exists a degree-$d$ SOS refutation for a system of polynomial equalities (and inequalities) over $n$ variables, then such a refutation can be found in polynomial time $n^{O(d)}$. Briefly, one can view (1.2) as a linear system of equations in the coefficients of $g_j$ and $u = \sum_{\ell} q_\ell^2$ with the added constraint that $u$ is a sum of squares, and such a system can be solved by semidefinite programming in $d/2$ rounds of the Lasserre SDP hierarchy.

∗ This is the full-length version of the paper with the same title to appear in Proceedings of the 30th Annual Computational Complexity Conference (CCC ’15).

In the last few years there has been renewed interest in sums-of-squares in the context of constraint satisfaction problems (CSPs) and hardness of approximation, as witnessed by, for instance, [BBH+ […] Positivstellensatz or Lasserre). This proof system was introduced by Grigoriev and Vorobjov [GV01] as an extension of the Nullstellensatz proof system studied by Beame et al. [BIK+ […] $\mathbb{F}_2$-linear equations [Gri01b] (also referred to as the 3-XOR problem when each equation involves at most 3 variables) and for the knapsack problem [Gri01a].

Given the connections to semidefinite programming and the Lasserre SDP hierarchy, it is perhaps not surprising that most works on SOS lower bounds have focused on the degree measure. However, from a proof complexity point of view it is also natural to ask about the minimal size of SOS proofs, measured as the number of monomials when all polynomials in each term in (1.2) are expanded out as linear combinations of monomials. Such SOS size lower bounds were proven for knapsack in [GHP02] and $\mathbb{F}_2$-linear systems of equations in [KI06], and tree-like size lower bounds for other formulas were also obtained in [PS12].

A wider interest in this area of research was awakened when Schoenebeck [Sch08] (essentially) rediscovered Grigoriev’s result [Gri01b], which together with further work by Tulsiani [Tul09] led to integrality gaps for a number of constraint satisfaction problems. There have also been papers such as [BPS07] and [GP14] focusing on semantic versions of the proof system, with less attention to the actual syntactic derivation rules used.
We refer the reader to, for instance, the introductory section of [OZ13] for more background on sums-of-squares and connections to hardness of approximation, and to the survey [BS14] for an in-depth discussion of SOS as an approximation algorithm and the intriguing connections to the so-called Unique Games Conjecture [Kho02].

As discussed above, if a system of polynomial equalities and inequalities over $n$ variables can be shown inconsistent by SOS in degree $d$, then by using semidefinite programming one can find an SOS refutation of the system in time $n^{O(d)}$. It is natural to ask whether this is optimal, or whether there might exist “shortcuts” that could lead to SOS refutations more quickly.

We prove that there are no such shortcuts in general, but that the running time obtained by using the Lasserre semidefinite programming relaxations to find SOS proofs is optimal up to the constant in the exponent. We show this by constructing formulas on $n$ variables (which can be translated to systems of polynomial equalities in a canonical way) that have SOS refutations of degree $d$ but require refutations of size $n^{\Omega(d)}$. Our lower bound proof works for $d$ from constant all the way up to $n^{\delta}$ for some constant $\delta$.

Theorem 1.1 (informal).
Let $d = d(n) \leq n^{\delta}$ where $\delta > 0$ is a universal constant. Then there is a family of 4-CNF formulas $\{F_n\}_{n \in \mathbb{N}^+}$ with $O(n^2)$ clauses over $O(n)$ variables such that $F_n$ is refutable in sums-of-squares in degree $\Theta(d)$ but any SOS refutation of $F_n$ requires size $n^{\Omega(d)}$.

This theorem extends an analogous joint result by the two authors with Atserias in [ALN14] for the proof systems resolution, polynomial calculus, and Sherali-Adams, where upper bounds on refutation size in terms of width, degree, and rank, respectively, were shown to be tight up to the multiplicative constant in the exponent. (The exact details of these proof systems are not important for this discussion, and so we choose not to elaborate further here, instead referring the interested reader to [ALN14].) Theorem 1.1 works for all of these proof systems, since the upper bound is in fact on resolution width (i.e., the size of a largest clause in a resolution refutation), not just SOS degree, […] $n^{\Omega(d)}$ size lower bound is very much worse, however, and therefore the gap between upper and lower bounds is very much larger than in [ALN14].

(It might be worth pointing out that definitions and terminology in this area have suffered from a certain lack of standardization, and so what [KI06] refers to as “static Lovász-Schrijver calculus” is closer to what we mean by SOS/Lasserre.)

We want to emphasize that the size lower bound in Theorem 1.1 holds for SOS proofs of arbitrary degree. Thus, going to higher degree (i.e., higher levels of the Lasserre SDP hierarchy) does not help, since even arbitrarily large degree cannot yield shorter proofs. This is an interesting parallel to the paper [LRST14] exhibiting problems for which a (symmetric) SDP relaxation of arbitrary degree but bounded size $n^{d}$ does not do much better than the systematic relaxation of degree $d$.
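To get a rough feel for the $n^{O(d)}$ scale discussed above, the following sketch counts the monomials of degree at most $d$ in which every variable occurs at most once (the multilinear convention adopted in Section 2). It is only an illustration under that assumption and is not taken from the paper; the function name `num_multilinear_monomials` is ours.

```python
from math import comb


def num_multilinear_monomials(n: int, d: int) -> int:
    """Count multilinear monomials of degree at most d over n variables."""
    # A multilinear monomial of degree i is a choice of i distinct variables.
    return sum(comb(n, i) for i in range(d + 1))


if __name__ == "__main__":
    # For fixed d the count grows polynomially in n.
    for d in (2, 4, 8):
        print(d, num_multilinear_monomials(100, d))
```

For fixed $d$ this count is at most $(n+1)^d$, which is the kind of polynomial bound that underlies the $n^{O(d)}$ upper bound; Theorem 1.1 says that, in general, proof size cannot be pushed below $n^{\Omega(d)}$.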
We obtain the result in Theorem 1.1 as a special case of a more general method of amplifying lower bounds on width (in resolution), degree (in polynomial calculus) and rank/degree (in Sherali-Adams and Lasserre/SOS) to size lower bounds in the corresponding proof systems. This method is in some sense already implicit in [ALN14], which in turn relies heavily on an earlier paper by Atserias et al. [AMO13], but it turns out that extracting the essential ingredients and making them explicit is helpful for extending the results in [ALN14] to an analogue for sums-of-squares. We give a brief, informal description of the three main ingredients of the method below.

(i) Find a base CNF formula hard with respect to width/degree/rank
To start, we need to find a base problem, encoded as an unsatisfiable CNF formula, that is “moderately hard” for the proof system at hand. What this means is that we should be able to prove asymptotically tight bounds on width if we are dealing with resolution, on degree for polynomial calculus, and on degree/rank for Sherali-Adams and sums-of-squares. It then follows by a generic argument (as discussed briefly above for SOS) that a bound $O(d)$ on width/degree/rank implies an upper bound $n^{O(d)}$ on proof size.

In [AMO13, ALN14] the pigeonhole principle served as the base problem. This principle, which has been extensively studied in proof complexity, is encoded in CNF as pigeonhole principle (PHP) formulas saying that there is a one-to-one mapping of $m$ pigeons into $n$ pigeonholes for $m > n$. For sums-of-squares we cannot use PHP formulas, however, since they are not hard with respect to SOS degree. Instead we construct an SOS reduction in low degree from inconsistent systems of $\mathbb{F}_2$-linear equations to the clique problem, and then appeal to the result in [Gri01b, Sch08] briefly discussed above to obtain the following degree lower bound.

Theorem 1.2 (informal).
Given $k \in \mathbb{N}^+$, there is a graph $G$ and a 3-CNF formula $k$-Clique($G$) of size polynomial in $k$ with the following properties:
1. The graph $G$ does not contain a $k$-clique, but the formula $k$-Clique($G$) claims that it does.
2. Resolution can refute $k$-Clique($G$) in width $k$.
3. Any sums-of-squares refutation of $k$-Clique($G$) requires degree $\Omega(k)$.

(ii) Relativize the CNF formulas
The second step is to take the formulas for which we have established width/degree/rank lower bounds and relativize them. Relativization is an idea that seems to have been considered for the first time in the context of proof complexity by Krajíček [Kra04] and that was further developed by Dantchev and Riis [DR03]. Very loosely, it can be described as follows.

Suppose that we have a CNF formula encoding (the negation of) a combinatorial principle saying that some set $S$ has a property. For instance, the CNF formula could encode the pigeonhole principle discussed above, or could claim the existence of a totally ordered set of $n$ elements where no element in the set is minimal with respect to the ordering (these latter CNF formulas are known as ordering principle formulas, least number principle formulas, or graph tautologies in the literature).

The formula at hand is then relativized by constructing another formula encoding that there is a (potentially much larger) set $T$ containing a subset $S \subseteq T$ for which the same combinatorial principle holds. For the ordering principle, we can encode that there exists a non-empty ordered subset $S \subseteq T$ of arbitrary size such that it is possible for all elements in $S$ to find a smaller element inside $S$. This relativization step transforms the previously very easy ordering principle formulas into relativized versions that are exponentially hard for resolution [Dan06, DM14].
For the PHP formulas, we specify that we have a set of $M \gg m$ pigeons mapped into $n < m$ holes such that there exists a subset of $m$ pigeons that are mapped injectively.

In our setting, it will be important that the relativization does not make the formulas too hard. We do not want the hardness to blow up exponentially and instead would like the upper bound obtained in the first step above to scale nicely with the size of the relativization. For our general approach to work, we therefore need formulas talking about some domain being mapped to some range, where we can enlarge the domain while keeping the range fixed, and where in addition the mapping is symmetric in the sense that permuting the domain does not change the formula.

For this reason, relativizing the ordering principle formulas does not work for our purposes. Pigeonhole principle formulas have this structure, however, which is exactly why the proofs in [ALN14] go through. As already mentioned, PHP formulas will not work for sums-of-squares, but we can relativize the formulas in Theorem 1.2 by saying that there is a large subset of vertices such that there is a $k$-clique hiding inside such a subset.

(iii) Apply random restrictions to show proof size lower bounds
In the final step, we use random restrictions to establish lower bounds on proof size for the relativized CNF formulas obtained in the second step. This part of the proof is relatively standard, except for a crucial twist in the restriction argument introduced in [AMO13].

Assume that there is a small refutation in sums-of-squares (or whatever proof system we are studying) of the relativized formula claiming the existence of a subset of size $m \ll M$ with the given combinatorial property. Now hit the formula (and the refutation) with a random restriction that in effect chooses a subset of size $m$, and hence gives us back the original, non-relativized formula.
This restriction will be fairly aggressive in terms of the number of variables set to fixed truth values, and hence it will hold with high probability that the restricted refutation has no monomials of high degree (or, for resolution, no clauses of high width), since all such monomials will either have been killed by the restriction or at least have shrunk significantly. (We remark that making use of this shrinking in the analysis is the crucial extra feature added in [AMO13].) But this means that we have a refutation of the original formula in degree smaller than the lower bound established in the first step. Hence, no small refutation can exist, and the lower bound on proof size follows.

This concludes the overview of our method to amplify lower bounds on width/degree/rank to size. It is our hope that developing such a systematic approach for deriving this kind of lower bound, and making explicit what conditions are needed for this approach to work, can also be useful in other contexts.

The rest of this paper is organized as follows. We start in Section 2 by reviewing the definitions and notation used, and also stating some basic facts that we will need. In Section 3, we prove a degree lower bound for CNF formulas encoding a version of the clique problem. We then present in Section 4 a general method for obtaining SOS size lower bounds from degree lower bounds (or from width, degree, and rank, respectively, for proof systems such as resolution, polynomial calculus, and Sherali-Adams). We conclude with a brief discussion of some possible directions for future research in Section 5.
2 Preliminaries

For a positive integer $n$, we use the standard notation $[n] = \{1, 2, \ldots, n\}$. All logarithms in this paper are to base 2. A CNF formula $F$ is a conjunction of clauses, denoted $F = \bigwedge_j C_j$, where each clause $C$ is a disjunction of literals, denoted $C = \bigvee_i a_i$. Each literal $a$ is either a propositional variable $x$ (a positive literal) or its negation $\overline{x}$ (a negative literal). We think of formulas and clauses as sets, so that there is no repetition and order does not matter. We consider polynomials on the same propositional variables, with the convention that, as an algebraic variable, $x$ evaluates to 1 when it is true and to 0 when it is false. All polynomials in this paper are evaluated on 0/1-assignments, and live in the ring of real multilinear polynomials, which is the ring of real polynomials modulo the ideal generated by polynomials $x_i^2 - x_i$ for all variables $x_i$. In other words, all variables in all monomials have degree at most one, and monomial multiplication is defined by $\bigl(\prod_{i \in A} x_i\bigr) \cdot \bigl(\prod_{i \in B} x_i\bigr) = \prod_{i \in A \cup B} x_i$.

Since sums-of-squares derivations operate with polynomial equations and inequalities, in order to reason about CNF formulas we need to encode them in this language. For a clause $C = C^+ \vee C^-$, where we write $C^+$ and $C^-$ to denote the subsets of positive and negative literals, respectively, we define
\[ S(C) = \sum_{x \in C^+} x + \sum_{\overline{x} \in C^-} (1 - x) \tag{2.1} \]
and encode $C$ as the inequality
\[ S(C) \geq 1 . \tag{2.2} \]
Clearly, a clause $C$ is satisfied by a 0/1-assignment if and only if the same assignment satisfies the inequality $S(C) \geq 1$. For a variable $x$ and a bit $\beta \in \{0, 1\}$, we define
\[ \delta_{x = \beta} = \begin{cases} 1 - x & \text{if } \beta = 0, \\ x & \text{if } \beta = 1; \end{cases} \tag{2.3} \]
and for a sequence of variables $\vec{x} = (x_{i_1}, \ldots, x_{i_w})$ and a binary string $\beta = (\beta_1, \ldots, \beta_w)$, we define the indicator polynomial
\[ \delta_{\vec{x} = \beta} = \prod_{j=1}^{w} \delta_{x_{i_j} = \beta_j} \tag{2.4} \]
expanded out as a linear combination of monomials.
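The encoding above can be sketched in a few lines of Python. This is a minimal illustration, assuming a multilinear polynomial is stored as a dictionary from monomials (frozensets of variable names) to coefficients; this representation and all function names are ours, not the paper's.

```python
from itertools import product

# A multilinear polynomial maps a monomial (frozenset of variable names)
# to its coefficient; frozenset() is the constant monomial.


def const(c):
    return {frozenset(): c} if c != 0 else {}


def var(x):
    return {frozenset([x]): 1}


def add(p, q):
    r = dict(p)
    for m, c in q.items():
        r[m] = r.get(m, 0) + c
    return {m: c for m, c in r.items() if c != 0}


def mul(p, q):
    r = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = m1 | m2  # multilinearity: x_i * x_i = x_i, so index sets are united
            r[m] = r.get(m, 0) + c1 * c2
    return {m: c for m, c in r.items() if c != 0}


def one_minus(x):
    return add(const(1), {frozenset([x]): -1})


def clause_sum(pos, neg):
    """S(C) as in (2.1): pos and neg list the variables of C+ and C-."""
    s = const(0)
    for x in pos:
        s = add(s, var(x))
    for x in neg:
        s = add(s, one_minus(x))
    return s


def indicator(xs, beta):
    """The indicator polynomial (2.4), expanded into monomials."""
    p = const(1)
    for x, b in zip(xs, beta):
        p = mul(p, var(x) if b == 1 else one_minus(x))
    return p


def evaluate(p, a):
    """Evaluate p on a 0/1-assignment given as a dict."""
    return sum(c for m, c in p.items() if all(a[x] == 1 for x in m))


if __name__ == "__main__":
    s = clause_sum(pos=["x"], neg=["y"])  # the clause x OR NOT y
    for bx, by in product((0, 1), repeat=2):
        print(bx, by, evaluate(s, {"x": bx, "y": by}) >= 1)
```

Taking the union of index sets in `mul` is exactly the monomial multiplication rule stated above, so every product stays multilinear without a separate reduction step.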
That is, $\delta_{\vec{x} = \beta}$ is the polynomial that evaluates to 1 for 0/1-assignments satisfying the equalities $x_{i_j} = \beta_j$ for $j = 1, \ldots, w$ and to 0 for all other 0/1-assignments. We have the following useful fact.

Fact 2.1.
For every sequence of variables $\vec{x}$ the syntactic equality $\sum_{\beta \in \{0,1\}^w} \delta_{\vec{x} = \beta} = 1$ holds (after cancellation of terms).

Let $F$ be a CNF formula over some set of variables denoted as $Vars(F)$, and let $\rho$ be a partial assignment on $Vars(F)$. We write $F{\restriction}_{\rho}$ to denote the formula $F$ restricted by $\rho$, where all clauses $C \in F$ satisfied by $\rho$ are removed and all literals falsified by $\rho$ in other clauses are removed. For a polynomial $p$ over variables $Vars(F)$ (written, as always, as a linear combination of distinct monomials), we let $p{\restriction}_{\rho}$ denote the polynomial obtained by substituting values for assigned variables and removing monomials that evaluate to 0. We extend this definition to sets of formulas or polynomials in the obvious way by taking unions.

Definition 2.2 (Sums-of-squares proof system). A sums-of-squares derivation, or SOS derivation for short, of the polynomial inequality $p \geq 0$ from the system of polynomial constraints
\[ f_1 = 0, \ldots, f_s = 0, \quad h_1 \geq 0, \ldots, h_t \geq 0 \tag{2.5} \]
is a sum
\[ p = \sum_{j=1}^{s} g_j f_j + \sum_{j=1}^{t} u_j h_j + u_0 , \tag{2.6} \]
where $g_1, \ldots, g_s$ are arbitrary polynomials and each $u_j$ is expressible as a sum of squares $\sum_{\ell} q_{j,\ell}^2$. A derivation of the equation $p = 0$ is a pair of derivations of $p \geq 0$ and $-p \geq 0$. A sums-of-squares refutation of (2.5) is a derivation of the inequality $-1 \geq 0$ from (2.5).

The degree of an SOS derivation is the maximum degree among all the polynomials $g_j f_j$, $u_j h_j$, and $u_0$ in (2.6). The size of an SOS derivation is the total number of monomials (counted with repetition) in all polynomials $g_j f_j$, $u_j h_j$, and $u_0$ (all expanded out as linear combinations of distinct monomials). The size and degree of refuting an unsatisfiable system of polynomial constraints are defined by taking the minimum over all SOS refutations of the system with respect to the corresponding measure.

Remark 2.3.
Readers more familiar with the usual definition of Positivstellensatz/sums-of-squares in the literature might be a bit puzzled by the use of multilinearity in Definition 2.2, and might also wonder where the axioms $x_i^2 - x_i = 0$, $x_i \geq 0$, and $1 - x_i \geq 0$ for every variable $x_i$ disappeared. It is important to note that we have these axioms in our multilinear setting as well, although they are not explicitly mentioned. Equations of the form $x_i^2 - x_i = 0$ are tautological due to multilinearity, and the inequalities $x_i \geq 0$ and $1 - x_i \geq 0$ are derivable by the squaring rule since in the multilinear setting we have $x_i = x_i^2$ and $1 - x_i = (1 - x_i)^2$.

Our choice of the multilinear setting is without any loss of generality and only serves to simplify the technical arguments slightly. It is easy to see that applying the multilinearization operator mapping $x_i^{\ell}$ to $x_i$ for every $\ell \geq 1$ to any SOS derivation over real polynomials yields a legal SOS derivation over multilinear real polynomials in at most the same size and degree. Thus, working in the multilinear setting can only make our lower bounds stronger. As to the upper bounds in this paper, we prove them in the resolution proof system discussed below, and the simulation of resolution by sums-of-squares in Lemma 2.6 below works also in the standard setting without multilinearization.

Let us state some useful basic properties of multilinear polynomials for later reference (and also provide a proof just for completeness).

Proposition 2.4 (Unique multilinear representation).
Every function $f : \{0,1\}^n \to \mathbb{R}$ has a unique representation as a multilinear polynomial. In particular, if $p$ is a multilinear polynomial such that $p(\alpha) \in \{0,1\}$ for all $\alpha \in \{0,1\}^n$, then for every positive integer $\ell$ the equality $p^{\ell} = p$ holds (where this is a syntactic equality of multilinear polynomials expanded out as linear combinations of distinct monomials).

Proof. The set of functions from $\{0,1\}^n$ to $\mathbb{R}$ is a vector space of dimension $2^n$. Any function $f(\vec{x})$ in this space can be represented as a linear combination $\sum_{\beta \in \{0,1\}^n} f(\beta) \cdot \delta_{\vec{x} = \beta}(\vec{x})$. Since each $\delta_{\vec{x} = \beta}$ is a multilinear polynomial, the multilinear monomials on $n$ variables are a set of $2^n$ generators of the vector space. Since the number of generators equals the dimension, they are linearly independent and hence form a basis, so the representation of a function as a linear combination of multilinear monomials is unique. The second part of the proposition now follows immediately since $p^{\ell}$ and $p$ compute the same function.

The upper bounds in this paper are shown in the weaker proof system resolution, which is defined as follows. A resolution derivation of a clause $D$ from a CNF formula $F$ is a sequence of clauses $(D_1, D_2, \ldots, D_\tau)$ such that $D_\tau = D$ and for every clause $D_i$ it holds that it is either a clause of $F$ (an axiom), or is obtained by weakening from some $D_j \subseteq D_i$ for $j < i$, or can be inferred from two clauses $D_\ell, D_j$, $\ell < j < i$, by the resolution rule, which allows us to derive the clause $A \vee B$ from two clauses $A \vee x$ and $B \vee \overline{x}$ (where we say that $A \vee x$ and $B \vee \overline{x}$ are resolved on $x$ to yield the resolvent $A \vee B$). If in a resolution derivation $(D_1, D_2, \ldots, D_\tau)$ each clause $D_j$ is only used once in a weakening or resolution step to derive some $D_i$ for $i > j$, we say that the derivation is tree-like (such derivations may contain multiple copies of the same clause).
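As a small illustration of the resolution rule just defined, the sketch below derives the empty clause from three axioms. It assumes a DIMACS-style convention in which a literal is a nonzero integer and negation is a sign flip; this convention and the names `resolve` and `width` are ours.

```python
def resolve(c1, c2, x):
    """Resolve c1 (containing literal x) with c2 (containing -x) on x."""
    if x not in c1 or -x not in c2:
        raise ValueError("clauses cannot be resolved on this variable")
    return frozenset((c1 - {x}) | (c2 - {-x}))


def width(clauses):
    """Maximal number of literals in any clause of the collection."""
    return max((len(c) for c in clauses), default=0)


# Axioms: x1, (NOT x1 OR x2), NOT x2.
axioms = [frozenset({1}), frozenset({-1, 2}), frozenset({-2})]
d1 = resolve(axioms[0], axioms[1], 1)  # resolvent on x1
d2 = resolve(d1, axioms[2], 2)         # resolvent on x2
derivation = axioms + [d1, d2]
```

Here `d2` is the clause with no literals, and every clause in `derivation` is used at most once as a premise, so this particular derivation is tree-like.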
A resolution refutation of $F$, or resolution proof for $F$, is a derivation of the empty clause (the clause containing no literals) from $F$.

The width of a clause is the number of literals in it, and the width of a CNF formula or resolution derivation is the maximal width of any clause in the formula or derivation. The size of a resolution derivation is the total number of clauses in it (counted with repetitions). The size and width of refuting an unsatisfiable CNF formula $F$ are defined by taking the minimum over all resolution refutations of $F$ with respect to the corresponding measure.

The following standard fact is easy to establish by forward induction over resolution derivations. We omit the proof.

Fact 2.5.
Consider a partial assignment $\rho$ which assigns $\ell$ variables. Let $A$ be the unique clause of width $\ell$ such that $A$ evaluates to false under $\rho$. If resolution can derive $C$ in width $w$ and size $S$ from $F{\restriction}_{\rho}$, then resolution can derive $A \vee C$ in width at most $w + \ell$ and size at most $S + 1$ from $F$.

Let us also state for the record the formal claim that SOS is more powerful than resolution in terms of degree (and for constant degree also in terms of size). The next lemma is essentially Lemma 4.6 in [ALN14], except that there the lemma is stated for the Sherali-Adams proof system. Since SOS simulates Sherali-Adams efficiently with respect to both size and degree, however, the same bounds apply also for SOS. Referring to the discussion in Remark 2.3, it should also be pointed out that the lemma in [ALN14] is proven in the more common non-multilinear setting with explicit axioms $x_i^2 - x_i = 0$, $x_i \geq 0$, and $1 - x_i \geq 0$ for all variables $x_i$.

Lemma 2.6 (SOS simulation of resolution).
If a CNF formula $F = \bigwedge_{j=1}^{t} C_j$ has a resolution refutation of size $S$ and width $w$, then the constraints $\{S(C_j) \geq 1\}_{j=1}^{t}$ as defined in (2.1) and (2.2) have an SOS refutation of size $O\bigl(w 2^{w} S\bigr)$ and degree at most $w + 1$.

The next lemma will be useful as a subroutine when we prove upper bounds in resolution.
Lemma 2.7.
Let $k$ and $m_1, m_2, \ldots, m_k$ be positive numbers. Then the CNF formula consisting of the clauses
\begin{align*}
& y_{i,0} && i \in [k], \tag{2.7a} \\
& \overline{y}_{i,j-1} \vee x_{i,j} \vee y_{i,j} && i \in [k],\ j \in [m_i], \tag{2.7b} \\
& \overline{y}_{i,m_i} && i \in [k], \tag{2.7c} \\
& \overline{x}_{1,j_1} \vee \overline{x}_{2,j_2} \vee \cdots \vee \overline{x}_{k,j_k} && (j_1, \ldots, j_k) \in [m_1] \times \cdots \times [m_k], \tag{2.7d}
\end{align*}
has a resolution refutation of width $k + 1$ and size $O\bigl(\prod_{i=1}^{k} m_i\bigr)$.

Proof. We prove the lemma by backward induction over the index $i \in [k]$. Consider any clause $A$ of the form
\[ A = \overline{x}_{1,j_1} \vee \overline{x}_{2,j_2} \vee \cdots \vee \overline{x}_{(i-1),j_{(i-1)}} \tag{2.8} \]
for $1 \leq i \leq k$ (and note that for $i = 1$ this is the empty clause). We will show how to derive $A$ in width $i + 1$ given clauses $A \vee \overline{x}_{i,1}, A \vee \overline{x}_{i,2}, \ldots, A \vee \overline{x}_{i,m_i}$.

We start by resolving the axioms $y_{i,0}$ and $\overline{y}_{i,0} \vee x_{i,1} \vee y_{i,1}$, and then we apply the resolution rule again on this resolvent and the clause $A \vee \overline{x}_{i,1}$ (available by the induction hypothesis) to get $A \vee y_{i,1}$. We now deduce $A \vee y_{i,j}$ for increasing $j$. Suppose we have already obtained $A \vee y_{i,j-1}$. Using the inductively derived clause $A \vee \overline{x}_{i,j}$ and the axiom $\overline{y}_{i,j-1} \vee x_{i,j} \vee y_{i,j}$, we can resolve on variables $y_{i,j-1}$ and $x_{i,j}$ to obtain $A \vee y_{i,j}$. Once $A \vee y_{i,m_i}$ has been derived, we resolve it with the axiom $\overline{y}_{i,m_i}$ to get $A$. By backward induction we reach the empty clause for $i = 1$, which concludes the resolution refutation.

Since $i \leq k$, the refutation has width $k + 1$. It is easy to verify that all axioms and intermediate clauses in the refutation are used exactly once. Thus, the refutation is tree-like, and has size exactly twice the number of axiom clauses minus one, which, in particular, is $O\bigl(\prod_{i=1}^{k} m_i\bigr)$.

When we construct formulas to be relativized as described in Section 1.2, it is convenient to use variables $x_{i,\vec{\jmath}}$, where $i$ ranges over some specific domain $D$ and $\vec{\jmath}$ is a collection of other indices. We say that the variable $x_{i,\vec{\jmath}}$ mentions the element $i \in D$. The domain-width of a clause is the number of distinct elements of $D$ mentioned by its variables.
The domain-width of a CNF formula or resolution proof is defined by taking the maximum domain-width over all its clauses, and the domain-width of refuting a CNF formula $F$ is the minimal domain-width of any resolution refutation of $F$. Similarly, the domain-degree of a monomial is the number of distinct elements in $D$ mentioned by its variables, the domain-degree of a polynomial or SOS proof is the maximal domain-degree of any monomial in it, and the domain-degree of refuting an unsatisfiable system of polynomial constraints is defined by taking the minimum over all refutations.

3 A Degree Lower Bound for Clique Formulas

In this section we state and prove the formal version of Theorem 1.2, namely a lower bound for the domain-degree needed in SOS to prove that a graph $G$ has no $k$-clique. Let us start by describing how we encode the $k$-clique problem as a CNF formula.

Definition 3.1 ($k$-clique formula). Let $k$ be a positive integer, $G = (V, E)$ be an undirected graph on $N$ vertices, and $(v_1, v_2, \ldots, v_N)$ be an enumeration of $V(G) = V$. Then the formula $k$-Clique($G$) consists of the clauses
\begin{align*}
& \overline{x}_{i,u} \vee \overline{x}_{i',v} && i, i' \in [k],\ i \neq i',\ \{u, v\} \notin E(G), \tag{3.1a} \\
& \overline{x}_{i,u} \vee \overline{x}_{i,v} && i \in [k],\ u, v \in V(G),\ u \neq v, \tag{3.1b} \\
& z_{i,0} && i \in [k], \tag{3.1c} \\
& \overline{z}_{i,(j-1)} \vee x_{i,v_j} \vee z_{i,j} && i \in [k],\ j \in [N], \tag{3.1d} \\
& \overline{z}_{i,N} && i \in [k]. \tag{3.1e}
\end{align*}
The formula $k$-Clique($G$) encodes the claim that $G$ has a clique of size $k$. The intended meaning of the variable $x_{i,v}$ for $v \in V(G)$ is that $v$ is the $i$th vertex of the clique. The clauses in (3.1a) enforce that any two members of the clique are distinct and are connected by an edge. The clauses in (3.1b) enforce that at most one vertex is chosen for each $i \in [k]$. The clauses in (3.1c)–(3.1e) are simply the 3-CNF encoding (using extension variables) of the clause $\bigvee_{j=1}^{N} x_{i,v_j}$ enforcing that at least one vertex is chosen for each $i \in [k]$.
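The clause families (3.1a)–(3.1e) can be generated mechanically. The sketch below builds them for a small graph and checks by brute force that the formula is satisfiable exactly when a $k$-clique exists in the two toy cases tried. The literal representation and the function names are our own assumptions, and the exhaustive search is only feasible for very small instances.

```python
from itertools import combinations, product


def k_clique_formula(k, vertices, edges):
    """Return the clauses (3.1a)-(3.1e) as a set of frozensets of literals.

    A literal is a pair (variable, polarity); variables are ("x", i, v) and
    ("z", i, j), and polarity False means the variable is negated.
    """
    E = {frozenset(e) for e in edges}
    N = len(vertices)
    neg = lambda v: (v, False)
    pos = lambda v: (v, True)
    clauses = set()
    for i in range(1, k + 1):
        for i2 in range(1, k + 1):
            if i == i2:
                continue
            for u, v in product(vertices, repeat=2):
                if frozenset((u, v)) not in E:  # non-edges, including u == v
                    clauses.add(frozenset({neg(("x", i, u)), neg(("x", i2, v))}))  # (3.1a)
        for u, v in combinations(vertices, 2):
            clauses.add(frozenset({neg(("x", i, u)), neg(("x", i, v))}))  # (3.1b)
        clauses.add(frozenset({pos(("z", i, 0))}))  # (3.1c)
        for j in range(1, N + 1):
            clauses.add(frozenset({neg(("z", i, j - 1)),
                                   pos(("x", i, vertices[j - 1])),
                                   pos(("z", i, j))}))  # (3.1d)
        clauses.add(frozenset({neg(("z", i, N))}))  # (3.1e)
    return clauses


def satisfiable(clauses):
    """Exhaustive satisfiability check; exponential in the number of variables."""
    variables = sorted({v for c in clauses for v, _ in c})
    for bits in product((False, True), repeat=len(variables)):
        a = dict(zip(variables, bits))
        if all(any(a[v] == pol for v, pol in c) for c in clauses):
            return True
    return False


def domain_width(clause):
    """Number of distinct indices i in [k] mentioned by the clause's variables."""
    return len({v[1] for v, _ in clause})


vertices = ["a", "b", "c"]
with_edge = k_clique_formula(2, vertices, [("a", "b")])  # has a 2-clique
no_edge = k_clique_formula(2, vertices, [])              # has no 2-clique
```

Because clauses are stored in a set, the two orderings of $(i, i')$ in (3.1a) collapse into one clause, in line with the convention that formulas and clauses are sets.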
The variables of $k$-Clique($G$) are indexed by $i$ over the domain $[k]$ and the domain-width of the formula is 2. The next proposition shows that the naive brute-force approach to decide $k$-Clique($G$) can be carried out in resolution (and hence by Lemma 2.6 also in SOS).

Proposition 3.2. If $G$ has no clique of size $k$, then $k$-Clique($G$) has a resolution refutation of size $O\bigl(|V|^{k}\bigr)$ and width $k + 1$.

Proof. We first use the weakening rule to derive all clauses of the form
\[ \overline{x}_{1,u_1} \vee \overline{x}_{2,u_2} \vee \cdots \vee \overline{x}_{k,u_k} \tag{3.2} \]
for every sequence of vertices $(u_1, u_2, \ldots, u_k)$. This is possible since either the sequence contains a repetition or it includes two vertices with no edge between them, and in both cases this means that the clause (3.2) is a superclause of some clause of the form (3.1a). Then we derive the empty clause by applying Lemma 2.7 to the clauses (3.1c)–(3.1e) and (3.2).

In order to obtain suitably hard instances of $k$-Clique($G$) we construct a reduction from 3-XORs to $k$-partite graphs. It is convenient for us to describe the special case of $k$-clique on $k$-partite graphs directly as an encoding as polynomial equations and inequalities as follows.

Definition 3.3 (Polynomial encoding of $k$-clique on $k$-partite graphs). For a $k$-partite graph $G$ with $V(G) = V_1 \mathbin{\dot\cup} V_2 \mathbin{\dot\cup} \cdots \mathbin{\dot\cup} V_k$ we let $k$-Block($G$) denote the following collection of polynomial constraints:
\begin{align*}
& \sum_{v \in V_i} x_v = 1 && i \in [k], \tag{3.3a} \\
& x_u + x_v \leq 1 && u \in V_i,\ v \in V_{i'},\ i \neq i',\ \{u, v\} \notin E(G). \tag{3.3b}
\end{align*}
It is straightforward to verify that these constraints encode the claim that $G$ has a clique with one element in each block $V_i$, since exactly one element is chosen from each block by (3.3a) and all the chosen elements have to be pairwise connected by (3.3b).

Any lower bound on degree that we establish for $k$-Block($G$) will hold also for $k$-Clique($G$) as stated in the following proposition.

Proposition 3.4.
Consider a $k$-partite graph $G$, where $V(G) = V_1 \mathbin{\dot\cup} V_2 \mathbin{\dot\cup} \cdots \mathbin{\dot\cup} V_k$. If $k$-Clique($G$) has an SOS refutation in domain-degree $d$, then $k$-Block($G$) has an SOS refutation in domain-degree $d$.

Proof. The proof is by transforming a refutation of $k$-Clique($G$) into a refutation of $k$-Block($G$) of the same domain-degree. To give an overview, we start with a refutation of $k$-Clique($G$) of domain-degree $d$ and replace its variables with polynomials of degree at most 1 mentioning only variables from $k$-Block($G$). In this way we get an SOS refutation of domain-degree at most $d$ from the substituted axioms of $k$-Clique($G$). The latter polynomials are not necessarily axioms of $k$-Block($G$), but we show that they have SOS derivations of domain-degree 1 from the axioms of $k$-Block($G$). This concludes the proof.

The variable substitution has two steps: first we substitute every variable $z_{i,j}$ with the linear form $\sum_{t=j+1}^{N} x_{i,v_t}$, where $\{v_j\}_{j=1}^{N}$ is the enumeration of $V(G)$ in Definition 3.1, and then we set $x_{i,v_j}$ to 0 whenever $v_j \notin V_i$.

As mentioned above, we now need to give SOS derivations of domain-degree 1 of all transformed axioms in $k$-Clique($G$) from $k$-Block($G$). For the axioms (3.1c)–(3.1e), the SOS encoding is
\begin{align*}
& z_{i,0} \geq 1 && i \in [k], \tag{3.4a} \\
& \bigl(1 - z_{i,(j-1)}\bigr) + x_{i,v_j} + z_{i,j} \geq 1 && i \in [k],\ j \in [N], \tag{3.4b} \\
& (1 - z_{i,N}) \geq 1 && i \in [k]. \tag{3.4c}
\end{align*}
After the first step of the substitution the inequalities (3.4a), (3.4b) and (3.4c) become, respectively, the inequality $\sum_{j=1}^{N} x_{i,v_j} \geq 1$, and two occurrences of the tautology $1 \geq 1$. Furthermore, after the second step of the substitution the inequality (3.4a) becomes $\sum_{v \in V_i} x_{i,v} \geq 1$, which is subsumed by Equation (3.3a). Each of the axioms (3.1a) and (3.1b) is encoded as
\[ 1 - x_{i,u} - x_{i',v} \geq 0 \tag{3.5} \]
for some pair of indices $i, i'$ and vertices $u, v$.
We assume that u ∈ V i and v ∈ V i ′ , because otherwisethe variable substitution turns the inequality into either a tautology or into − x i,u ≥ , where thelatter follows from (1 − x i,u ) ≥ by multilinearity. If i = i ′ then the inequality (3.5) is an axiom of k - Block( G ) . If that is not the case, then we can obtain − x i,u − x i,v in domain-degree using thederivation − X v ∈ V i x i,w | {z } from Equation (3.3a) + X w u,v } ( x i,w ) | {z } sum of squares = 1 − X v ∈ V i x i,w + X w u,v } x i,w = 1 − x i,u − x i,v (3.6)where the first identity holds by multilinearity. The proposition follows.What we want to do now is to prove a domain-degree lower bound for instances of k - Block( G ) where the graph G is obtained by a reduction from (unsatisfiable) sets of F -linear equations. We relyon the version of Grigoriev’s degree lower bound [Gri01b] shown by Schoenebeck [Sch08], which isconveniently stated for random -XOR formulas as encoded next. Definition 3.5 (Polynomial encoding of random -XOR). A random -XOR formula φ represents asystem of ∆ n linear equations modulo defined over n variables. Each equation is sampled at randomamong all equations of the form x ⊕ y ⊕ z = b as follows: x , y , z are sampled uniformily withoutreplacement from the set of n variables and b is sampled uniformly in { , } . The polynomial encodingof any such linear equation modulo is (1 − x )(1 − y ) z = 0 (3.7a) (1 − x ) y (1 − z ) = 0 (3.7b) x (1 − y )(1 − z ) = 0 (3.7c) xyz = 0 (3.7d)9IGHTSIZE-DEGREEBOUNDSFORSUMS-OF-SQUARESPROOFSwhen b = 0 and (1 − x )(1 − y )(1 − z ) = 0 (3.7e) xy (1 − z ) = 0 (3.7f) x (1 − y ) z = 0 (3.7g) (1 − x ) yz = 0 (3.7h)when b = 1 .Fixing δ = 1 / and ∆ = 8 in [Sch08] we have the following theorem. Theorem 3.6 ([Sch08]).
There exists an α , < α < , such that for every ǫ > there exists an n ǫ ∈ N such that a random -XOR formula φ in n ≥ n ǫ variables and n constraints has the followingproperties with probability at least − ǫ .1. At most n parity constraints of φ can be simultaneously satisfied.2. Any sums-of-squares refutation of φ requires degree αn . Now we are ready to describe how to transform a -XOR formula φ into a k -partite graph G kφ thathas a clique of size k if and only if φ is satisfiable. Definition 3.7 ( -XOR graph). Given k ∈ N and a -XOR formula φ with n constraints over n vari-ables, where we assume for simplicity that k divides n , we construct a -XOR graph G kφ as follows.We arbitrarily split the formula φ into k linear systems with n/k constraints each, denoted as φ , φ , . . . φ k . For each φ i we let V i be a set of at most N ≤ n/k vertices labelled by all possibleassignments to the at most n/k variables appearing in φ i . For two distinct vertices u ∈ V i and v ∈ V i ′ there is an edge between u and v in G kφ if the two assignments corresponding to u and v are compatible,i.e., when they assign the same values to the common variables, and also the union of the two assign-ments does not violate any constraint in φ . (In particular, each V i is an independent set, since two distinctassignments to the same set of variables are not compatible.)The key property of the reduction in Definition 3.7 is that it allows small domain-degree refutationsof k - Block (cid:0) G kφ (cid:1) to be converted into small degree refutations of φ . Lemma 3.8. If k - Block (cid:0) G kφ (cid:1) has an SOS refutation of domain-degree d , then φ has an SOS refutationof degree dn/k .Proof. 
Again we start by giving an overview of the proof, which works by transforming a refutation of k - Block (cid:0) G kφ (cid:1) of domain-degree d into a refutation of φ of degree dn/k .Given a refutation of k - Block (cid:0) G kφ (cid:1) of domain-degree d , we replace every variable x v with a polyno-mial over the variables of φ . In this way we get an SOS refutation from the polynomials correspondingto the substituted axioms of k - Block (cid:0) G kφ (cid:1) . The latter polynomials need not be axioms of φ , but we showthat they can be efficiently derived in SOS from φ . We thus obtain an SOS refutation of φ , the degree ofwhich is easily verified to be as in the statement of the lemma.We now describe the substitution in detail. Consider a block V i and suppose that the corresponding -XOR formula φ i mentions t variables. Let us write ~x to denote this set of variables. Then everyvertex v ∈ V i represents an assignment β ∈ { , } t to ~x . In what follows, we denote the indicatorpolynomial δ ~x = β in (2.4) by δ v for brevity, and we substitute for each variable x v the polynomial δ v ofdegree t ≤ n/k .Before the substitution each monomial in the original refutation has domain-degree at most d byassumption. Two important observations are that ( δ v ) = δ v for every v ∈ V i and that δ u δ v = 0 forevery two distinct u, v in the same block V i . Therefore, after the substitution each monomial is eitheridentically zero or the product of at most d indicator polynomials, and hence its degree is at most dn/k .10 ADegree LowerBound for Clique FormulasTo verify these observations, note that the identity ( δ v ) = δ v holds by Proposition 2.4. The equality δ u δ v = 0 holds because δ u and δ v are the indicator polynomials of two incompatible assignments, and sotheir product always evaluates to zero. 
Applying Proposition 2.4 again, we conclude that the (multilinear) polynomial $\delta_u \delta_v$ is identically zero.

In order to complete the proof outline above, we now need to present SOS derivations, starting from the 3-XOR constraints of $\phi$, of all polynomial constraints resulting from the substitutions in the axioms of $k$-Block$(G^k_\phi)$ described above, and to do so in degree at most $n/k$.

Let us first look at the axioms (3.3a). By Fact 2.1, the identity
\[ \sum_{v \in V_i} \delta_v = \sum_{\beta \in \{0,1\}^t} \delta_{\vec{x} = \beta} = 1 \tag{3.8} \]
holds syntactically, so substitutions in axioms of the form (3.3a) result in tautologies of the form $1 = 1$.

The remaining axioms of $k$-Block$(G^k_\phi)$ in (3.3b) have the form $x_u + x_v \leq 1$ for non-edges $(u, v)$ between vertices in different blocks. By construction of $G^k_\phi$, the reason $u$ and $v$ are not connected is either that the partial assignments corresponding to the two vertices are incompatible, or that their union violates some constraint in $\phi$.

In the first case, $1 - \delta_u - \delta_v \geq 0$ is an SOS axiom because of the identity
\[ (1 - \delta_u - \delta_v)^2 = 1 - \delta_u - \delta_v, \tag{3.9} \]
which follows from the observation that $\delta_u$ and $\delta_v$ are the indicator polynomials of two incompatible assignments and cannot evaluate to $1$ simultaneously, and so $(1 - \delta_u - \delta_v)$ evaluates to either $0$ or $1$ and is identical to its square by Proposition 2.4. The degree of (3.9) is $n/k$.

In the second case, the two assignments corresponding to $u$ and $v$ are compatible but their union violates some initial equation $f = 0$ of the form (3.7a)–(3.7h). Any such $f$ is a degree-3 indicator polynomial which evaluates to $1$ whenever the assignment satisfies the equation $\delta_u \delta_v = 1$. This means that $\delta_u \delta_v$ contains $f$ as a factor. We factorize $f$ as $f_u f_v$ so that $\delta_u = f_u \delta'_u$ and $\delta_v = f_v \delta'_v$. Given this notation, we can derive $0 \leq 1 - \delta_u - \delta_v$ using the identity
\[ (1 - f_u - f_v)^2 + (f_u - \delta_u)^2 + (f_v - \delta_v)^2 - 2 f_u f_v = 1 - \delta_u - \delta_v \tag{3.10} \]
of degree at most $n/k$.
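As a sanity check, identity (3.10) can be tested numerically on a toy instance. The sketch below rests on assumptions that are not in the text: the block of $u$ is taken to mention variables $x, y, a$, the block of $v$ to mention $z, b$, and the violated constraint is $f = (1-x)(1-y)z$ as in (3.7a), split as $f_u = (1-x)(1-y)$ and $f_v = z$. The coefficient $2$ in front of $f_u f_v$ is our reading of (3.10). Since the identity is meant modulo multilinearity, it is enough (in the spirit of Proposition 2.4) to compare both sides at every Boolean point.

```python
from itertools import product


def identity_holds(coeff=2):
    """Check identity (3.10) pointwise on a hypothetical toy instance.

    coeff is the multiplier of f_u * f_v on the left-hand side; the
    identity is expected to hold for coeff == 2.
    """
    for x, y, a, z, b in product((0, 1), repeat=5):
        f_u = (1 - x) * (1 - y)   # factor of f over the variables of u's block
        f_v = z                   # factor of f over the variables of v's block
        delta_u = f_u * a         # indicator of the assignment x=0, y=0, a=1
        delta_v = f_v * (1 - b)   # indicator of the assignment z=1, b=0
        lhs = ((1 - f_u - f_v) ** 2 + (f_u - delta_u) ** 2
               + (f_v - delta_v) ** 2 - coeff * f_u * f_v)
        if lhs != 1 - delta_u - delta_v:
            return False
    return True
```

Calling `identity_holds()` checks all 32 Boolean points; passing a different `coeff` shows that the multiple of the axiom $f = f_u f_v$ cannot be chosen freely.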
To verify (3.10), observe that the left-hand side is the sum of some squared polynomials and of the term $-2 f_u f_v = -2f$, which is $0$ by the axiom $f = 0$. Expanding the squared polynomials and using Proposition 2.4 repeatedly, we have that $(f_u)^2 = f_u$, $(f_v)^2 = f_v$, $(\delta_u)^2 = \delta_u$, and $(\delta_v)^2 = \delta_v$, from which we also conclude that
\[ f_u \delta_u = f_u (f_u \delta'_u) = (f_u)^2 \delta'_u = f_u \delta'_u = \delta_u \tag{3.11} \]
and
\[ f_v \delta_v = f_v (f_v \delta'_v) = (f_v)^2 \delta'_v = f_v \delta'_v = \delta_v, \tag{3.12} \]
which establishes that (3.10) holds. The lemma follows.

Now we can put together all the material in this section to prove a formal version of Theorem 1.2 as stated next.

Theorem 3.9.
There are universal constants N ∈ N + and α , < α < , such that for every k ≥ there exists a graph G k with at most kN = O( k ) vertices and a -CNF formula k - Clique( G k ) of sizepolynomial in k with the following properties:1. Resolution can refute k - Clique( G k ) in size O( k log k ) and width k + 1 .2. Any SOS refutation of k - Clique( G k ) requires domain-degree α k . Proof.
Fix any positive ǫ < and let N = 2 n ǫ , α = α and n = kn ǫ , where n ǫ and α are theuniversal constants from Theorem 3.6. To build the graph G k we take a -XOR formula φ on n variablesand n equations from the distribution in Definition 3.5. Since n ≥ n ǫ , Theorem 3.6 implies that there isa formula in the support of the distribution that is unsatisfiable and that requires degree αn to be refutedin SOS. We fix φ to be that formula and let G k be the graph G kφ constructed as in Definition 3.7. Then G kφ is k -partite, with each part having at most n/k = N vertices, and the graph has no k -clique becauseotherwise φ would be satisfiable.Suppose that there is an SOS refutation of k - Clique (cid:0) G kφ (cid:1) of domain-degree d . We want to argue that d ≥ α k . Since G kφ is k -partite, by Proposition 3.4 the formula k - Block (cid:0) G kφ (cid:1) also has an SOS refutationin domain-degree d . By Lemma 3.8, this in turn yields an SOS refutation of φ in degree dn/k . NowTheorem 3.6 implies that dn/k ≥ αn , and hence d ≥ α k = α k .To conclude the proof, we can just observe that the resolution width and size upper bounds are adirect application of Proposition 3.2. Using the material developed in Section 3, we can now describe how to relativize formulas in order toto amplify degree lower bounds to size lower bounds in SOS . This method works for formulas that are“symmetric” in a certain sense, and so we start by explaining exactly what is meant by this.
Definition 4.1 (Symmetric formula).
Consider a CNF formula $F$ on variables $x_{i,\vec{\jmath}}$, where $i$ is an index in some domain $D$ and $\vec{\jmath}$ denotes a collection of other indices. For every subset of indices $\vec{\imath} = \{i_1, i_2, \ldots, i_s\} \subseteq D$ we identify the subformula $F_{\vec{\imath}}$ of $F$ such that each clause $C \in F_{\vec{\imath}}$ mentions exactly the indices in $\vec{\imath}$, so that a formula $F$ of domain-width $d$ can be written as
\[ F = \bigwedge_{s=0}^{d} \bigwedge_{\substack{\vec{\imath} \subseteq D \\ |\vec{\imath}| = s}} F_{\vec{\imath}}. \tag{4.1} \]
We say that $F$ is symmetric with respect to $D$ if it is invariant with respect to permutations of $D$, i.e., if for every $F_{\vec{\imath}} \subseteq F$ it also holds that $F_{\pi(\vec{\imath})} \subseteq F$, where $\pi$ is any permutation on $D$ and $\pi(\vec{\imath})$ is the set of images of the indices in $\vec{\imath}$. Phrased differently, $F$ is symmetric with respect to $D$ if for any permutation $\pi$ on $D$ the syntactic equality $F = \bigwedge_{\vec{\imath} \subseteq D} F_{\pi(\vec{\imath})}$ holds (where we recall that we treat CNF formulas as sets of clauses). We apply this terminology for systems of polynomial equations and inequalities in the same way.

Let us illustrate Definition 4.1 by giving perhaps the most canonical example of a formula that is symmetric in this sense.

Example 4.2.
Recall that the CNF encoding of the pigeonhole principle with a set of pigeons $D$ and holes $[n]$ claims that there is a mapping from pigeons in $D$ to holes such that no hole gets two pigeons. For every pigeon $i \in D$ there is a clause $\bigvee_{j \in [n]} x_{i,j}$, and for every two distinct pigeons $i, i'$ and hole $j$ there is a clause $\overline{x}_{i,j} \vee \overline{x}_{i',j}$. Since any permutation of the set of pigeons $D$ gives us back exactly the same set of clauses (only listed in a different order), the pigeonhole principle formula is symmetric with respect to $D$.

By now, the reader will already have guessed that another example of a symmetric formula, which will be more interesting to us in the current context, is the $k$-clique formula discussed in Section 3.

Observation 4.3.
The k - Clique( G ) formula in Definition 3.1 over variables x i,v is symmetric with re-spect to the indices i ∈ [ k ] .
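The symmetry condition of Definition 4.1 can be checked mechanically on small instances. The following is a minimal sketch for the pigeonhole formula of Example 4.2, treating a formula as a set of clauses; the literal encoding `(pigeon, hole, sign)` and the function names are illustrative choices, not notation from the paper.

```python
from itertools import combinations, permutations


def php_clauses(pigeons, holes):
    """Pigeonhole CNF as a set of clauses (frozensets of literals).

    A literal (i, j, True) stands for x_{i,j}; (i, j, False) for its negation.
    """
    clauses = set()
    for i in pigeons:
        # Pigeon i is mapped to some hole.
        clauses.add(frozenset((i, j, True) for j in holes))
    for i, i2 in combinations(pigeons, 2):
        for j in holes:
            # Pigeons i and i2 do not both go to hole j.
            clauses.add(frozenset({(i, j, False), (i2, j, False)}))
    return clauses


def rename(clauses, pi):
    """Apply a permutation pi (a dict on the domain) to every literal."""
    return {frozenset((pi[i], j, sign) for (i, j, sign) in c) for c in clauses}


def is_symmetric(clauses, domain):
    """True if every permutation of the domain maps the clause set to itself."""
    domain = list(domain)
    return all(rename(clauses, dict(zip(domain, image))) == clauses
               for image in permutations(domain))
```

For instance, `is_symmetric(php_clauses(range(4), range(3)), range(4))` tries all 24 permutations of four pigeons, whereas deleting a single pigeon clause from the formula breaks the invariance.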
Starting with any formula $F$ symmetric with respect to a domain $D$, we can build a family of similar formulas by varying the size of the domain. If $F$ has domain-width $d$, then for each $s$, $0 \leq s \leq d$, the subformulas $F_{\vec{\imath}}$ with $|\vec{\imath}| = s$ in (4.1) are the same up to renaming of the domain indices in $\vec{\imath}$. Hence, we can arbitrarily pick one such subformula to represent them all, and denote it as $F_s$. The formulas $\{F_s\}_{s=0}^{d}$ are completely determined by $F$, and together with $D$ they in turn completely determine $F$. Using this observation, we can generalize the formula $F$ over domain $D$ to any domain $D'$ with $|D'| \geq d$ by defining $F[D']$ to be the formula
\[ F[D'] = \bigwedge_{s=0}^{d} \bigwedge_{\substack{\vec{\imath} \subseteq D' \\ |\vec{\imath}| = s}} F_{\vec{\imath}}, \tag{4.2} \]
where each $F_{\vec{\imath}}$ for $|\vec{\imath}| = s$ is an isomorphic copy of $F_s$ with its domain indices renamed according to $\vec{\imath}$.

Let us state some simple but useful facts that can be read off directly from (4.2):
1. For any formula $F$ of domain-width $d$ symmetric with respect to domain $D$, it holds that $F[D]$ is (syntactically) equal to $F$.
2. For any domains $D', D''$ with $|D'| = |D''| \geq d$, the two formulas $F[D']$ and $F[D'']$ are isomorphic.
3. For any $D'' \supsetneq D'$ with $|D'| \geq d$, the formula $F[D'']$ contains many isomorphic copies of $F[D']$.

When we want to emphasize the domain $D$ of a formula $F$ in what follows, we will denote the formula $F$ as $F[D]$. When the domain is $D = [t]$, we abuse notation slightly and write $F[t]$ instead of $F[[t]]$. As discussed above, from a symmetric formula $F$ of domain-width $d$ we can obtain a well-defined sequence of formulas $F[t]$ for all $t \geq d$. We say that the unsatisfiability threshold of such a sequence of formulas is the least $t$ such that $F[t]$ is unsatisfiable. For instance, the pigeonhole principle formula in Example 4.2 has unsatisfiability threshold $n + 1$.
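The notion of unsatisfiability threshold can be illustrated by brute force on the pigeonhole formulas $F[t]$ with $t$ pigeons and $n$ holes. The sketch below is only meant for very small parameters, since it enumerates all assignments. The function names are hypothetical, and starting the search at $t = 2$ reflects our assumption that the domain-width of the pigeonhole formula is 2 (each clause mentions at most two pigeons).

```python
from itertools import combinations, product


def php(t, n):
    """Clauses of the pigeonhole formula F[t] with pigeons 0..t-1, holes 0..n-1.

    A literal (i, j, True) stands for x_{i,j}; (i, j, False) for its negation.
    """
    clauses = [[(i, j, True) for j in range(n)] for i in range(t)]
    for i, i2 in combinations(range(t), 2):
        clauses += [[(i, j, False), (i2, j, False)] for j in range(n)]
    return clauses


def satisfiable(clauses):
    """Exhaustive satisfiability test over all truth assignments."""
    variables = sorted({(i, j) for c in clauses for (i, j, _) in c})
    for bits in product((False, True), repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[i, j] == sign for (i, j, sign) in c) for c in clauses):
            return True
    return False


def unsat_threshold(n, t_max):
    """Least t in [2, t_max] such that F[t] is unsatisfiable, or None."""
    for t in range(2, t_max + 1):
        if not satisfiable(php(t, n)):
            return t
    return None
```

On these small cases the search agrees with the threshold $n + 1$ stated above; larger parameters are out of reach for exhaustive enumeration.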
Given a formula $F = F[m]$ symmetric with respect to $[m]$ and a parameter $k < m$, we now want to define the $k$-relativization of $F[m]$, which is intended to encode the claim that there exists a subset $D \subseteq [m]$ of size $|D| \geq k$ such that the subformula $F[D] \subseteq F[m]$ is satisfiable. We remark that a CNF formula encoding such a claim will be unsatisfiable when $k$ is at least the unsatisfiability threshold of $F$.

In order to express the existence of the subset $D$ we use selectors $s_1, s_2, \ldots, s_m$ as indicators of membership in the subset and encode the constraint on the subset size $|D| = \sum_{i=1}^{m} s_i \geq k$ as described in the next definition.

Definition 4.4.
The threshold-$k$ formula for variables $\vec{s} = \{s_1, \ldots, s_m\}$ is the 3-CNF formula $\mathrm{Thr}_k(\vec{s})$ that consists of the clauses
\begin{align}
& y_{\ell,0} && \ell \in [k], \tag{4.3a} \\
& \overline{y}_{\ell,i-1} \vee p_{\ell,i} \vee y_{\ell,i} && \ell \in [k],\ i \in [m], \tag{4.3b} \\
& \overline{y}_{\ell,m} && \ell \in [k], \tag{4.3c} \\
& \overline{p}_{\ell,i} \vee \overline{p}_{\ell',i} && \ell, \ell' \in [k],\ \ell \neq \ell',\ i \in [m], \tag{4.3d} \\
& \overline{p}_{\ell,i} \vee s_i && \ell \in [k],\ i \in [m]. \tag{4.3e}
\end{align}
To see that $\mathrm{Thr}_k(\vec{s})$ indeed enforces a cardinality constraint, note that the variables $p_{\ell,i}$ encode a mapping between $[k]$ and $[m]$ (with $p_{\ell,i}$ being true if and only if $\ell$ maps to $i$). The clauses (4.3a)–(4.3c) force every $\ell \in [k]$ to have an image in $[m]$, since they form the 3-CNF representation of the clauses $\bigvee_i p_{\ell,i}$. The clauses (4.3d) forbid two distinct elements of $[k]$ to have the same image, so there must be at least $k$ elements in the range of the map, and for each of them the corresponding selector must be true because of the clauses (4.3e). We will need the following properties of the threshold formula.

Observation 4.5.
The formula
$\mathrm{Thr}_k(\vec{s})$ in Definition 4.4 has the following properties:
1. $\mathrm{Thr}_k(\vec{s})$ has size polynomial in both $k$ and $m$.
2. For any partial assignment to $\vec{s}$ with at least $k$ ones there is an assignment to the extension variables that satisfies $\mathrm{Thr}_k(\vec{s})$.
3. There is a resolution refutation of the set of clauses $\mathrm{Thr}_k(\vec{s}) \cup \bigl\{ \bigvee_{i \in D} \overline{s}_i \bigm| D \subseteq [m],\ |D| = k \bigr\}$ of size $O(k m^k)$ and width $k + 1$.

Proof. The first two items are immediate. In order to show the third item we can first derive each clause $\overline{p}_{1,i_1} \vee \cdots \vee \overline{p}_{k,i_k}$ by resolving $\overline{s}_{i_1} \vee \cdots \vee \overline{s}_{i_k}$ with clauses of the form (4.3e), and then apply Lemma 2.7.

Using the formula in Definition 4.4 to encode cardinality constraints on subsets, we can now define formally what we mean by the relativization of a symmetric formula.

Definition 4.6 (Relativization).
Given a CNF formula $F$ symmetric with respect to a domain $[m]$ and a parameter $k < m$, the $k$-relativization (or $k$-relativized formula) $F[k; m]$ is the formula consisting of
1. the threshold formula $\mathrm{Thr}_k(\vec{s})$ over selectors $\vec{s} = \{s_1, \ldots, s_m\}$;
2. a selectable clause $\overline{s}_{i_1} \vee \cdots \vee \overline{s}_{i_s} \vee C$ for each clause $C \in F[m]$, where $\{i_1, i_2, \ldots, i_s\}$ are the indices mentioned by $C$.

Since we are dealing with refutations of unsatisfiable formulas, it will always be the case that the parameter $k$ in Definition 4.6 is at least the unsatisfiability threshold of $F$. An important property of relativized formulas is that the hardness of $F[k; m]$ scales nicely with $m$. In particular, if $F[k]$ is not too hard, then the relativization $F[k; m]$ also is not too hard.

Proposition 4.7. If $F[k]$ has a resolution refutation of size $S$ and width $w$, then $F[k; m]$ has a resolution refutation of size $S \cdot \binom{m}{k} + O(k m^k)$ and width $w + k$.

Proof. For every set $D \subseteq [m]$ with $|D| = k$ we show how to derive
\[ \bigvee_{i \in D} \overline{s}_i \tag{4.4} \]
in size $S + 1$ and width $w + k$ from $F[k; m]$. Without loss of generality (because of symmetry) we assume that $D = [k]$, so that we want to derive $\overline{s}_1 \vee \cdots \vee \overline{s}_k$. Consider the assignment $\rho = \{s_1 = 1, \ldots, s_k = 1\}$. In the restricted formula $F[k; m]{\restriction}_\rho$ the selectable clauses in Definition 4.6, item 2, with all indices in $[k]$ become the clauses of $F[k]$, which has a refutation of size $S$ and width $w$. Thus the clause $\overline{s}_1 \vee \cdots \vee \overline{s}_k$ can be derived in size $S + 1$ and width $w + k$ from $F[k; m]$ by Fact 2.5. After we have derived all clauses of the form (4.4) in this way, we can obtain the empty clause in width $k + 1$ and in size at most $O(k m^k)$ using Observation 4.5.

To prove size lower bounds on refutations of relativized formulas $F[k; m]$ we use random restrictions sampled as follows.

Definition 4.8 (Random restrictions for relativized formulas).
Given a relativized formula $F[k; m]$, we define a distribution $\mathcal{R}$ of partial assignments over the variables of this formula by the following process.
1. Pick uniformly at random a set $D \subseteq [m]$ of size $k$.
2. Fix $s_i$ to $1$ if $i \in D$ and to $0$ otherwise.
3. Extend this to any assignment to the remaining variables of the formula $\mathrm{Thr}_k(\vec{s})$ that satisfies this threshold formula.
4. For every variable $x_{i,\vec{\jmath}}$ that has index $i \notin D$, fix $x_{i,\vec{\jmath}}$ to $0$ or $1$ uniformly and independently at random.
5. All remaining variables $x_{i,\vec{\jmath}}$ for the indices $i \in D$ are left unset.

It is straightforward to verify that the distribution $\mathcal{R}$ is constructed in such a way as to give us back $F[k]$ from $F[k; m]$.

Observation 4.9.
For any relativized formula F [ k ; m ] and any ρ ∈ R it holds that F [ k ; m ] ↾ ρ is equalto F [ k ] up to renaming of variables. The key technical ingredient in the size lower bound on sums-of-squares proofs is the followingproperty of the distribution R , which was proven in [AMO13, ALN14] but is rephrased below using thenotation and terminology in this paper. We also provide a brief proof sketch just to give the reader asense of how the argument goes. Lemma 4.10 ( [AMO13, ALN14] ).
Let k, ℓ, m be positive integers such that m ≥ and ℓ ≤ k ≤ m/ (4 log m ) . Let M be a monomial over the variables of F [ k ; m ] and let ρ be a random restrictionsampled from the distribution R in Definition 4.8. Then the domain-degree of M ↾ ρ is less than ℓ withprobability at least − (4 k log m ) k /m ℓ .Proof sketch. Ley ℓ ′ be the domain-degree of M . The restriction ρ will set independently and uniformlyat random at least ℓ ′ − k of its variables, so if ( ℓ ′ − k ) is larger than ℓ log m , the restricted monomial M ↾ ρ is non zero with probability at most /m ℓ . Otherwise we upper bound the probability that M ↾ ρ hasdomain-degree ℓ with the probability that the ℓ ′ indices in M contain ℓ of the k surviving indices. By aunion bound this probability is at most (4 k log m ) k /m ℓ .Using Lemma 4.10, it is now straightforward to show that relativization amplifies degree lowerbounds to size lower bounds. Theorem 4.11.
Let k, ℓ, m be positive integers such that m ≥ and ℓ ≤ k ≤ m/ (4 log m ) . Ifthe CNF formula F [ k ] requires sums-of-squares refutations of domain-degree ℓ , then the relativizedformula F [ k ; m ] requires sums-of-squares refutations of size m ℓ / (4 k log m ) k .Proof. Suppose that there is a sums-of-squares refutation of F [ k ; m ] in size S , i.e., containing S mono-mials. For ρ sampled from R , we see that the probability that some monomial in the refutation restrictedby ρ has domain-degree at least ℓ is at most S · (4 k log m ) k m ℓ (4.5)by appealing to Lemma 4.10 and taking a union bound.As noted in Observation 4.9, the formula F [ k ; m ] ↾ ρ is equal to F [ k ] up to renaming of variables, andso it cannot have a refutation of domain-degree ℓ or less. This implies that the bound on the probabil-ity (4.5) is greater than one, and thus we obtain S > m ℓ (4 k log m ) k , (4.6)which proves the theorem. 15IGHTSIZE-DEGREEBOUNDSFORSUMS-OF-SQUARESPROOFS Putting everything together, we can establish the formal version of our main results in Theorem 1.1 asfollows.
Theorem 4.12.
Let k = k ( m ) be any monotone non-decreasing integer-valued function such that k ( m ) ≤ m/ (4 log m ) . Then there is a family of -CNF formulas { F m,k } m ≥ with O (cid:0) km (cid:1) clausesover O( km ) variables such that:1. Resolution can refute F m,k in size k O( k ) m k and width k + 1 .2. Any sums-of-squares refutation of F m,k requires size Ω (cid:0) m α k / (4 k log m ) k (cid:1) , where α is a uni-versal constant.Proof. Let G be a graph with properties as in Theorem 3.9 and let F [ k ] be the CNF formula k - Clique( G ) in Definition 3.1. Since F [ k ] is symmetric, we can relativize it as in Definition 4.6 to obtain F [ k ; m ] ,which will be our -CNF formula F m,k . Theorem 3.9 says that F [ k ] has a resolution refutation ofsize k O( k ) and width k + 1 , and appealing to Proposition 4.7 we get a resolution refutation of F m,k insize k O( k ) m k and width k + 1 . Since we have a domain-degree lower bound of α k for refuting F [ k ] according to Theorem 3.9, we can use Theorem 4.11 to deduce that the required size to refute F m,k insums-of-squares is at least Ω (cid:0) m α k / (4 k log m ) k (cid:1) . The theorem follows.We remark that straightforward calculations show that when k ( m ) = O (cid:0) m δ (cid:1) for δ < α the upperbound in Theorem 4.12 is m O( k ) and the lower bound is m Ω( k ) .Let us now discuss a couple of the parameters in Theorem 4.12 and how they could be improvedslightly. We stated our main theorem for -CNF formulas, since that is the clause size that resultsnaturally from our construction. However, if one wants to minimize the clause width and obtain ananalogous result for -CNF formulas this is also possible to achieve, just as was done in [ALN14] forother proof systems. To prove a version of Theorem 4.12 for -CNF formulas we need a simple butrather ad-hoc variation of the relativization argument presented above. 
Let us briefly describe what modifications are needed.

The way we presented the construction above, we started with the 3-CNF formula $k$-Clique$(G)$ and then applied relativization, which turned the clauses (3.1c)–(3.1e) into the 4-CNF formula
\begin{align}
& \overline{s}_i \vee z_{i,0} && i \in [k], \tag{4.7a} \\
& \overline{s}_i \vee \overline{z}_{i,(j-1)} \vee x_{i,v_j} \vee z_{i,j} && i \in [k],\ j \in [N], \tag{4.7b} \\
& \overline{s}_i \vee \overline{z}_{i,N} && i \in [k]. \tag{4.7c}
\end{align}
An alternative approach would be to first encode $k$-Clique$(G)$ with wide clauses $\bigvee_{j=1}^{N} x_{i,v_j}$ instead of clauses of the form (3.1c)–(3.1e), relativize this new, wide formula, and then convert the relativized formula into 3-CNF using extension variables. Instead of clauses (4.7a)–(4.7c), this would yield the collection of clauses
\begin{align}
& \overline{s}_i \vee z_{i,0} && i \in [k], \tag{4.8a} \\
& \overline{z}_{i,(j-1)} \vee x_{i,v_j} \vee z_{i,j} && i \in [k],\ j \in [N], \tag{4.8b} \\
& \overline{z}_{i,N} && i \in [k]. \tag{4.8c}
\end{align}
This causes a small technical problem in that some of these clauses mention $i \in [m]$ but lack the literal $\overline{s}_i$, and so a random restriction sampled as in Definition 4.8 may actually falsify these clauses. The solution to this is to change the random assignment so that when $s_i = 0$, we fix each $x_{i,v_j}$ uniformly at random in $\{0, 1\}$, set each $z_{i,(j-1)}$ equal to the value assigned to $x_{i,v_j}$, and finally fix $z_{i,N}$ to $0$. The new restriction satisfies all clauses (4.8a)–(4.8c), and the proof of Lemma 4.10 still goes through.

Another parameter in Theorem 4.12 that could be improved is the value of $\alpha$, which determines how tightly the size lower bound matches the upper bound implied by width/degree and also how high we can push $k(m)$. In our reduction from a 3-XOR formula $\phi$ to the clique formula $k$-Clique$(G^k_\phi)$ we start by splitting the constraints into $k$ blocks.
The vertices in each block correspond to assignmentsto n/k variables, and because of this an SOS refutation in domain-degree d of k - Clique (cid:0) G kφ (cid:1) can beconverted to a refutation in degree dn/k of φ .If we want to obtain a more efficient reduction, we could instead split the n variables , rather than the n constraints, into k parts. In this way each vertex in G kφ would correspond to an assigment to n/k vari-ables, and an SOS refutation in domain-degree d would translate to a refutation of φ in degree dn/k . Butnow we cannot reduce to the clique problem anymore. Splitting with respect to constraints allows us toenforce pairwise consistency between vertices in different blocks referring to common variables. Whensplitting with respect to variables, the vertices in different blocks correspond to partial assigments ondisjoint domains and so are always pairwise compatible. However, we must still require that these partialassignments are consistent with the constraints in φ . Each such constraint refers to up to three blocks.Thus, any satisfying assignment to φ corresponds to k vertices such that no triple of vertices violates an -XOR constraint. This reduces to the problem of finding a k -hyperclique in a -uniform hypergraph.The rest of the reduction can be made to work as in Lemma 3.8. In the end we get an analogous resultof that in Theorem 3.9 but with α equal to α instead of α , which also improves Theorem 4.12. In thispaper we instead presented a reduction to the k -clique problem for standard graphs, partly because webelieve that a degree lower bound for this problem can be considered to be of independent interest. In this paper, we show that using Lasserre semidefinite programming relaxations to find degree- d sums-of-squares proofs is optimal up to constant factors in the exponent of the running time. 
More precisely, we show that there are constant-width CNF formulas on $n$ variables that are refutable in sums-of-squares in degree $d$ but require proofs of size $n^{\Omega(d)}$.

As for so many other results for the sums-of-squares proof system, in the end our proof boils down to a reduction from 3-XOR using Schoenebeck’s version [Sch08] of Grigoriev’s degree lower bound [Gri01b]. It would be very interesting to obtain other SOS degree lower bounds by different means than by reducing from Grigoriev’s results for 3-XOR and knapsack.

Another interesting problem would be to prove average-case SOS degree lower bounds for $k$-clique formulas over Erdős–Rényi random graphs, or size lower bounds for (non-relativized) $k$-clique formulas over any graphs. In this context, it might be worth pointing out that the problem of establishing proof size lower bounds for $k$-clique formulas for constant $k$, which has been discussed, for instance, in [BGLR12], still remains open even for the resolution proof system (although lower bounds have been shown for tree-like resolution in [BGL13] and for full resolution for a version of clique formulas using a different encoding more amenable to lower bound techniques in [LPRT13]).

Acknowledgements
We are grateful to Albert Atserias for numerous discussions about (and explanations of) Lasserre/SOS and other LP and SDP hierarchies, as well as for help with correcting some references in a preliminary version of this manuscript. We thank Per Austrin for valuable suggestions and feedback during the initial stages of this work, and Michael Forbes for comments on an early version of the overall proof construction. Finally, we are thankful for the comments from the anonymous reviewers, which helped improve the exposition in this paper considerably.

The authors were funded by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007–2013) / ERC grant agreement no. 279611. The second author was also supported by Swedish Research Council grants 621-2010-4797 and 621-2012-5645.
References

[ALN14] Albert Atserias, Massimo Lauria, and Jakob Nordström. Narrow proofs may be maximally long. Technical Report TR14-118, Electronic Colloquium on Computational Complexity (ECCC), September 2014. Preliminary version appeared in CCC ’14.

[AMO13] Albert Atserias, Moritz Müller, and Sergi Oliva. Lower bounds for DNF-refutations of a relativized weak pigeonhole principle. In Proceedings of the 28th Annual IEEE Conference on Computational Complexity (CCC ’13), pages 109–120, June 2013.

[BBH+12] Boaz Barak, Fernando G. S. L. Brandão, Aram Wettroth Harrow, Jonathan A. Kelner, David Steurer, and Yuan Zhou. Hypercontractivity, sum-of-squares proofs, and their applications. In Proceedings of the 44th Annual ACM Symposium on Theory of Computing (STOC ’12), pages 307–326, May 2012.

[BGL13] Olaf Beyersdorff, Nicola Galesi, and Massimo Lauria. Parameterized complexity of DPLL search procedures.
ACM Transactions on Computational Logic , 14(3):20:1–20:21, August2013. Preliminary version appeared in
SAT ’11 .[BGLR12] Olaf Beyersdorff, Nicola Galesi, Massimo Lauria, and Alexander A. Razborov. Parame-terized bounded-depth Frege is not optimal.
ACM Transactions on Computation Theory ,4:7:1–7:16, September 2012. Preliminary version appeared in
ICALP ’11 .[BIK +
94] Paul Beame, Russell Impagliazzo, Jan Kraj´ıˇcek, Toniann Pitassi, and Pavel Pudl´ak. Lowerbounds on Hilbert’s Nullstellensatz and propositional proofs. In
Proceedings of the 35thAnnual IEEE Symposium on Foundations of Computer Science (FOCS ’94) , pages 794–806,November 1994.[BPS07] Paul Beame, Toniann Pitassi, and Nathan Segerlind. Lower bounds for Lov´asz–Schrijversystems and beyond follow from multiparty communication complexity.
SIAM Journal onComputing , 37(3):845–869, 2007. Preliminary version appeared in
ICALP ’05 .[BS14] Boaz Barak and David Steurer. Sum-of-squares proofs and the quest toward optimal algo-rithms. Technical Report TR14-059, Electronic Colloquium on Computational Complexity(ECCC), April 2014.[Dan06] Stefan Dantchev. Relativisation provides natural separations for resolution-based proofsystems. In
Proceedings of the 1st International Computer Science Symposium in Russia(CSR ’06) , volume 3967 of
Lecture Notes in Computer Science , pages 147–158. Springer,June 2006.[DM14] Stefan Dantchev and Barnaby Martin. Relativization makes contradictions harder for reso-lution.
Annals of Pure and Applied Logic , 165(3):837–857, March 2014.[DR03] Stefan Dantchev and Søren Riis. On relativisation and complexity gap for resolution-basedproof systems. In
Proceedings of the 17th International Workshop on Computer ScienceLogic (CSL ’03) , volume 2803 of
Lecture Notes in Computer Science , pages 142–154.Springer, August 2003.[GHP02] Dima Grigoriev, Edward A. Hirsch, and Dmitrii V. Pasechnik. Exponential lower boundfor static semi-algebraic proofs. In
Proceedings of the 29th International Colloquium onAutomata, Languages and Programming (ICALP ’02) , volume 2380 of
Lecture Notes inComputer Science , pages 257–268. Springer, July 2002.[GP14] Mika G ¨o¨os and Toniann Pitassi. Communication lower bounds via critical block sensitivity.In
Proceedings of the 46th Annual ACM Symposium on Theory of Computing (STOC ’14), pages 847–856, May 2014.
[Gri01a] Dima Grigoriev. Complexity of Positivstellensatz proofs for the knapsack.
Computational Complexity, 10(2):139–154, December 2001.
[Gri01b] Dima Grigoriev. Linear lower bound on degrees of Positivstellensatz calculus proofs for the parity.
Theoretical Computer Science, 259(1–2):613–622, May 2001.
[GV01] Dima Grigoriev and Nicolai Vorobjov. Complexity of Null- and Positivstellensatz proofs.
Annals of Pure and Applied Logic, 113(1–3):153–160, December 2001.
[Kho02] Subhash Khot. On the power of unique 2-prover 1-round games. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC ’02), pages 767–775, May 2002.
[KI06] Arist Kojevnikov and Dmitry Itsykson. Lower bounds of static Lovász–Schrijver calculus proofs for Tseitin tautologies. In
Proceedings of the 33rd International Colloquium on Automata, Languages and Programming (ICALP ’06), volume 4051 of
Lecture Notes in Computer Science, pages 323–334. Springer, July 2006.
[Kra04] Jan Krajíček. Combinatorics of first order structures and propositional proof systems.
Archive for Mathematical Logic, 43(4):427–441, May 2004.
[Kri64] Jean-Louis Krivine. Anneaux préordonnés.
Journal d’Analyse Mathématique, 12(1):307–326, 1964.
[Las01] Jean B. Lasserre. An explicit exact SDP relaxation for nonlinear 0-1 programs. In
Proceedings of the 8th International Conference on Integer Programming and Combinatorial Optimization (IPCO ’01), volume 2081 of
Lecture Notes in Computer Science, pages 293–303. Springer, June 2001.
[LPRT13] Massimo Lauria, Pavel Pudlák, Vojtěch Rödl, and Neil Thapen. The complexity of proving that a graph is Ramsey. In
Proceedings of the 40th International Colloquium on Automata, Languages and Programming (ICALP ’13), volume 7965 of
Lecture Notes in Computer Science, pages 684–695. Springer, July 2013.
[LRST14] James R. Lee, Prasad Raghavendra, David Steurer, and Ning Tan. On the power of symmetric LP and SDP relaxations. In
Proceedings of the 29th Annual IEEE Conference on Computational Complexity (CCC ’14), pages 13–21, June 2014.
[Nes00] Yurii Nesterov. Squared functional systems and optimization problems. In H. Frenk, K. Roos, T. Terlaky, and S. Zhang, editors,
High Performance Optimization, pages 405–440. Kluwer Academic Publisher, 2000.
[OZ13] Ryan O’Donnell and Yuan Zhou. Approximability and proof complexity. In
Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA ’13), pages 1537–1556, January 2013.
[Par00] Pablo A. Parrilo.
Structured Semidefinite Programs and Semialgebraic Geometry Methods in Robustness and Optimization. PhD thesis, California Institute of Technology, May 2000. Available at http://resolver.caltech.edu/CaltechETD:etd-05062004-055516 .
[PS12] Toniann Pitassi and Nathan Segerlind. Exponential lower bounds and integrality gaps for tree-like Lovász–Schrijver procedures.
SIAM Journal on Computing, 41(1):128–159, 2012. Preliminary version appeared in
SODA ’09.
[Sch08] Grant Schoenebeck. Linear level Lasserre lower bounds for certain k-CSPs. In Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS ’08), pages 593–602, October 2008.
[Sho87] N. Z. Shor. An approach to obtaining global extremums in polynomial mathematical programming problems.
Cybernetics, 23(5):695–700, 1987. Translated from
Kibernetika, No. 5, pages 102–106, 1987.
[Ste73] Gilbert Stengle. A Nullstellensatz and a Positivstellensatz in semialgebraic geometry.
Mathematische Annalen, 207(2):87–97, 1973.
[Tul09] Madhur Tulsiani. CSP gaps and reductions in the Lasserre hierarchy. In