The #ETH is False, #k-SAT is in Sub-Exponential Time
Giorgio Camerani†
Rome, Italy - 2 February 2021
† [email protected]

Abstract.
We orchestrate a randomized algorithm for k-SAT which counts the exact number of satisfying assignments in 2^{o(n)} time. The existence of such an algorithm signifies that ⊕ETH, ETH, ⊕SETH and SETH are all false.
Keywords: k-SAT, counting, sub-exponential time

One of the most useful principles of enumeration in discrete probability and combinatorial theory is the celebrated principle of inclusion-exclusion. When skillfully applied, this principle has yielded the solution to many a combinatorial problem.
Gian-Carlo Rota

1. Introduction
In our previous paper [1] we have presented a simple deterministic algorithm A₀ for random k-SAT, which counts the exact number of satisfying assignments in 2^{εn} time, with lim_{k→∞} ε = 0 as long as ∆ = m/n ∈ o(k), where n is the number of variables and m is the number of clauses. When k was allowed to grow with n rather than remaining constant, this led us to a sub-exponential time algorithm. The existence of A₀ revealed to us that, at least in the ∆ ∈ o(k) realm, the hardness of random instances decreases as k increases: the longer the clause length, the shorter the running time. The algorithm runs faster and faster as k gets higher and higher. The key insight to obtain such behaviour was the possibility, thanks to the inclusion-exclusion principle, to count satisfying assignments without even searching for them.

This paper is devoted to improving A₀ by gradually eliminating its points of weakness: the ∆ ∈ o(k) restriction, the non-constant k restriction, and the restriction to random instances. In the end, such gradual improvements will culminate in a more general sub-exponential time algorithm for k-SAT, able to deal with any number m of clauses, with any constant k, and with worst-case instances as well. The existence of such an algorithm will constitute a single shot confutation of all of ⊕ETH, ETH, ⊕SETH and SETH.

1.1. Contents.
The rest of this paper is organized as follows:
Section 2
Conceives a deterministic algorithm A₁ for any random k-SAT instance, which computes the exact count of satisfying assignments in time 2^{εn}, where ε ∈ Θ(log k / k). The clause density ∆ is no longer present in the expression of ε as it was the case in A₀, thus A₁ works for any ∆, be it critical or dense: the ∆ ∈ o(k) weakness exhibited by A₀ is therefore circumvented here by A₁. As lim_{k→∞} ε = 0, the existence of such a counting algorithm is already enough to refute ETH on random k-SAT, the reason being that ETH is known to imply [2] that ε increases infinitely often as k → ∞, whereas here in reality ε is monotonically strictly decreasing.

Section 3
Uses A₁ to devise a more general randomized algorithm A₂, working for any k-SAT instance with constant k, which counts the exact number of satisfying assignments in time 2^{O((log log log n / log log n) · n)}. Such final sub-exponential time algorithm A₂ eliminates both the dependency on k in the running time, and the restriction to random input formulae. The randomness here is only used to turn the input formula into a formula which looks random to A₁.

1.2. A quick note on randomization.
The literature is not unanimous on whether ETH and its close relatives allow randomized algorithms: in some works they do, in some they do not, in some others no explicit statement is made. In [3], where #ETH was introduced, and in [4], where SETH was introduced, randomized algorithms are explicitly allowed for both ETH and SETH. We therefore feel justified in adopting the assumption that all these hypotheses permit the usage of randomized algorithms. We see no reasonable motivation for forbidding them.

1.3. A quick note on k. The literature is also not unanimous on what k means: whether each clause has exactly k literals, or at most k literals. Papers on random k-SAT adopt the former definition, while papers on ETH use the latter. Consistently, in Section 2 we assume exactly k literals per clause, while in Section 3 we will assume at most k.
2. Solving Random k-SAT in 2^{Θ(log k / k) · n} Deterministic Time

Let Φ = Φ(n, m, k) be a k-CNF formula on n boolean variables with m clauses, each one having length exactly k and being chosen uniformly at random among the 2^k \binom{n}{k} possible candidates. See how such a definition forbids the usage of the same variable more than once in the same clause. In this section we are going to excogitate a deterministic algorithm A₁ for counting the exact number of satisfying assignments of any such Φ in time 2^{εn} where ε = Θ(log k / k). The probability that the returned counting is wrong is ≈ n^{−σ log e}, where the integer constant σ ≥ 1 is just a tuning parameter of the algorithm which as such does not depend on the input instance (neither on n nor on k), and which lets us control such probability. The reader might be wondering how it is possible that the algorithm is deterministic, yet it has an error probability greater than zero. This is the vanishingly small price we have to pay in order to eliminate the ∆ ∈ o(k) restriction that A₀ had: as we will see, the only difference between A₀ and A₁ is just a simple observation which allows us to ignore a massive portion of the search space (recall that, as shown in [1], the search space here is not the space of satisfying assignments, but the space of monotone sub-formulae): such ignored portion is so massive that it lets us eliminate the dependency of the exponent on ∆, and the aforementioned error probability is the probability that ignoring it will jeopardise the correct counting of satisfying assignments.

A few notational remarks. All the logarithms in this paper are base 2. Moreover, we voluntarily omit polynomial factors, in order not to encumber the aesthetics: each time we write 2^t we actually mean O*(2^t), where the O* notation suppresses potentially existing factors of magnitude at most polynomial in the instance size. Finally, an additional pedantic statement: it is self-evident that the variable in such Θ notation is k, certainly not n. For reference, the quantities in terms of which ETH and SETH are usually phrased are s_k = inf{δ : ∃ a 2^{δn} algorithm for solving k-SAT} and, allowing randomization, s_k = inf{δ : ∃ a randomized algorithm for k-SAT with time complexity poly(m) · 2^{δn}}.

2.1. Notations and definitions.
Let Φ = {c_1, ···, c_m}, and let V = {v_1, ···, v_n} be the set of variables of Φ. Each clause c_i = {ℓ_{i,1}, ···, ℓ_{i,k}} is a set of literals, where each literal is either a variable v ∈ V or its negation. Let A = {v_1, ¬v_1} × ··· × {v_n, ¬v_n} denote the set of all the 2^n possible boolean assignments to the n variables in V. Let S = {b ∈ A : ∀c ∈ Φ, c ∩ b ≠ ∅} be the set of satisfying assignments of Φ. Let U = A \ S = {b ∈ A : ∃c ∈ Φ, c ∩ b = ∅} be the set of unsatisfying assignments of Φ.

Definition 2.1 (Sub-formula of Φ). A sub-formula Ψ of Φ is any formula Ψ ⊆ Φ.

Definition 2.2 (Monotone formula). A formula is monotone if and only if each of its variables always appears with the same sign: either always positive or always negated. (Note that it is not required that all the variables carry the same sign. Different variables can have different signs. The only restriction is that every same variable always carries the same sign.)

Definition 2.3 (Compatible clause). Given a monotone formula Ψ and a clause c ∉ Ψ, we say that c is compatible with Ψ if and only if Ψ ∪ {c} is still monotone. (We also say that c is incompatible with Ψ if and only if c is not compatible with Ψ.)

Definition 2.4 (Maximal monotone sub-formula). A monotone sub-formula Ψ of Φ is maximal if and only if ∀c ∈ Φ \ Ψ it is the case that c is incompatible with Ψ.

2.2. How A₀ worked. Let O_ν (respectively E_ν) be the number of monotone sub-formulae of Φ having ν variables and an odd (respectively even) number of clauses. In [1] we have proven the following:

Theorem 2.1.

(2.1)   |S| = 2^n − \sum_{ν=1}^{n} (O_ν − E_ν) · 2^{n−ν}

The above identity shows how the number |U| of unsatisfying assignments of any CNF formula can be expressed as a function of the space of its monotone sub-formulae (it can be further shrunk to |S| = \sum_{ν=0}^{n} (E_ν − O_ν) · 2^{n−ν}, and it holds for as generic as possible CNF expressions). This meant we could count satisfying assignments by merely enumerating all the monotone sub-formulae of the input instance Φ, without even trying to search for a single satisfying assignment: we can count without search. We have seen how, as k → ∞, for random k-SAT instances the cardinality |Ψ| that any maximal monotone sub-formula Ψ ⊂ Φ can possibly have grows at most as ≈ n/k, as long as ∆ ∈ o(k). This led us to devise A₀, which perlustrated the whole space of monotone sub-formulae of Φ by simply brute-forcedly enumerating all the subsets of at most ≈ n/k clauses picked from the m = ∆n available clauses. The number of such subsets is 2^{Θ(log(∆k)/k) · n}, hence the running time of A₀. The Achilles' heel of such an approach is that when ∆ ∉ o(k), e.g. as is the case for the hardest random instances existing at the critical threshold [5] around 2^k, the running time of A₀ spirals out of control as k grows, possibly behaving even worse than naïve exhaustive search. The main reason for such a collapse in performance is that, when ∆ ∉ o(k), it is no longer necessarily true that every monotone sub-formula Ψ has at most ≈ n/k clauses. Even when it is still the case, like with ∆ = 2^{αk} for α < 1, the final running time is 2^{(α + log k / k) n}, thus it is no longer true that lim_{k→∞} ε = 0. To overcome such a fatal vulnerability, some new insight was needed.
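To make Theorem 2.1 concrete, here is a minimal Python sketch, not part of the original paper, that brute-force checks the identity on a tiny random formula: it enumerates every monotone sub-formula, tallies the O_ν and E_ν, and compares the result of 2.1 with a direct enumeration of satisfying assignments. All function names (random_formula, is_monotone, and so on) are ours, introduced purely for illustration; literals are encoded as signed integers.

import random
from itertools import combinations, product

def random_formula(n, m, k, rng):
    # m distinct clauses, each on k distinct variables with random signs (no repeated variable).
    clauses = set()
    while len(clauses) < m:
        vars_ = rng.sample(range(1, n + 1), k)
        clauses.add(frozenset(v if rng.random() < 0.5 else -v for v in vars_))
    return list(clauses)

def is_monotone(clauses):
    # Monotone: no variable occurs both positively and negatively across the sub-formula.
    lits = set().union(*clauses) if clauses else set()
    return all(-lit not in lits for lit in lits)

def brute_force_count(n, clauses):
    count = 0
    for signs in product([1, -1], repeat=n):
        assignment = {i + 1: s for i, s in enumerate(signs)}
        if all(any(assignment[abs(l)] == (1 if l > 0 else -1) for l in c) for c in clauses):
            count += 1
    return count

def count_by_theorem_2_1(n, clauses):
    # |S| = 2^n - sum_{nu >= 1} (O_nu - E_nu) * 2^(n - nu), summing over monotone sub-formulae.
    total = 2 ** n
    for size in range(1, len(clauses) + 1):
        for psi in combinations(clauses, size):
            if not is_monotone(psi):
                continue
            nu = len({abs(l) for c in psi for l in c})   # variables mentioned by Psi
            sign = 1 if size % 2 == 1 else -1            # odd |Psi| counts in O_nu, even in E_nu
            total -= sign * 2 ** (n - nu)
    return total

rng = random.Random(0)
phi = random_formula(n=8, m=6, k=3, rng=rng)
print(brute_force_count(8, phi), count_by_theorem_2_1(8, phi))  # the two counts agree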
2.3. A new insight: how to prune the search space.
In order to illustrate the crucial, yet very simple, observation that lets us ignore a remarkable portion of the space of monotone sub-formulae we have to explore, let us imagine we are visiting such whole space by starting with Ψ = ∅ and by scanning the m clauses sequentially. For each c_i ∈ Φ compatible with Ψ, we branch: either we add it to Ψ, or we do not. Each time we add such a c_i to Ψ, the number of its variables increases by an amount between 0 and k.

Definition 2.5 (Fruitless clause). Given a monotone sub-formula Ψ ⊂ Φ and a clause c_j ∉ Ψ compatible with Ψ, we say that c_j is fruitless for Ψ if and only if Ψ and Ψ ∪ {c_j} have the same number of variables.

In other words, a fruitless clause is just a clause that, should it be added to Ψ, would not bring any new variable to it: all the k literals it has are already mentioned in Ψ. Now, our observation is as simple as this:

Adding a fruitless clause to Ψ is a completely useless operation, which does not affect at all the counting of the exact number of satisfying assignments of Φ.

To intuitively see why such observation is correct, think about this: as the number ν of variables of Ψ is equal to the number of variables of Ψ ∪ {c_j}, this means that, in 2.1, Ψ will be counted among the O_ν and Ψ ∪ {c_j} among the E_ν, or vice-versa: they cancel each other out, bringing a null and void contribution to |U|. See how far this can go: while gradually assembling a monotone sub-formula Ψ, as soon as we detect the existence of a fruitless clause c_j ahead, it does not just mean that Ψ and Ψ ∪ {c_j} only are useless for our counting purpose. It also means that the entire Ψ built so far is totally useless, and there is no need to go on any further: just because c_j exists, it makes no sense to visit any of the remaining clauses still to be considered. The mere existence of such a c_j invalidates Ψ as a whole. To epitomize such intuition:

Just because ∃ c_j fruitless for Ψ, every Ψ′ ⊇ Ψ brings no contribution to |U|.

Every larger Ψ′ having Ψ as a subset would be subject to such very same annihilation: for c_j could be either present or absent in Ψ′ without affecting its number ν′ of variables, thereby causing a worthless +1 − 1 contribution to the quantity O_ν′ − E_ν′. This means that, as soon as we determine that such a fruitless c_j exists somewhere down there in the remaining sequence of clauses yet to be considered, we can legitimately stop here and overthrow Ψ wholly, thereby pruning the search space by ignoring all the Ψ′ ⊇ Ψ: their entire recursion sub-tree, rooted at Ψ, gets discarded without even being perlustrated. We can then roll back to the last monotone sub-formula we had before Ψ, continuing the recursion from there onwards.
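The following short Python sketch, ours and purely illustrative, spells out the pruning rule just stated (it complements the recursive visit depicted in Figure 1 below). The helper names is_fruitless and subtree_is_prunable are hypothetical; clauses are sets of signed integers.

def clause_vars(clause):
    # Variables mentioned by a clause; literals are signed integers (e.g. -3 means "not v3").
    return {abs(lit) for lit in clause}

def is_compatible(clause, psi_literals):
    # c is compatible with Psi iff adding it keeps Psi monotone (no variable with both signs).
    return all(-lit not in psi_literals for lit in clause)

def is_fruitless(clause, psi_vars):
    # c is fruitless for Psi iff it brings no new variable to Psi (Definition 2.5).
    return clause_vars(clause) <= psi_vars

def subtree_is_prunable(psi_literals, psi_vars, remaining_clauses):
    # The sub-tree rooted at Psi contributes nothing as soon as some clause still ahead of us
    # is both compatible with Psi and fruitless for it: every Psi' containing Psi then pairs up
    # with Psi' + {c_j} and the two cancel in O_nu - E_nu.
    return any(is_compatible(c, psi_literals) and is_fruitless(c, psi_vars)
               for c in remaining_clauses)

# Toy usage: Psi = {(v1 or v2), (not v3)} already mentions variables {1, 2, 3},
# so the clause (v1 or not v3) ahead of us is compatible and fruitless; the sub-tree is prunable.
psi = [frozenset({1, 2}), frozenset({-3})]
psi_literals = {lit for c in psi for lit in c}
psi_vars = {abs(lit) for lit in psi_literals}
print(subtree_is_prunable(psi_literals, psi_vars, [frozenset({1, -3})]))  # True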
Figure 1. Visiting and pruning the search space of monotone sub-formulae

The above figure offers a visual imprinting of the narration held so far. See how every node of the recursion tree corresponds to a certain monotone sub-formula Ψ ⊂ Φ. Which one? The Ψ we have built down to that point, thanks to the choices we have made in the ancestor nodes, while deciding whether or not to add each of the compatible clauses so far considered (see how when a clause is incompatible, there is no choice to be made, as we only have the left branch). We conclude here by stressing once again this crucial detail: as Figure 1 suggests, the fruitless c_j which lets us completely disregard the sub-tree rooted at Ψ need not be at the same level where Ψ itself is situated: it can be anywhere down such sub-tree, any arbitrary number of levels below. With such consideration clear in mind, we step into describing A₁.

2.4. How A₁ works. Let ourselves be wandering somewhere in the recursion tree of the search space of monotone sub-formulae. Let us be standing on a certain node Ψ of such tree. Let ν_Ψ be the number of variables that Ψ has.

Definition 2.6 (Saturation). s_Ψ = ν_Ψ / n.

The saturation s_Ψ of a monotone sub-formula Ψ is just a number between 0 and 1 which represents the amount of variables that Ψ has, compared to the overall number n of variables mentioned in the input formula Φ. As we keep adding clauses, the saturation clearly grows. Let L_Ψ ∈ [1, ···, m] be the level of the tree where Ψ is situated.

Definition 2.7 (Pruning probability). P^p_Ψ is the probability that ∃ c_j fruitless for Ψ with j ≥ L_Ψ (the p in the superscript is a mnemonic for pruning).

Let T_Ψ denote the sub-tree rooted at Ψ. The pruning probability P^p_Ψ is thus the probability that the whole T_Ψ can be disregarded without even being scrutinized by A₁, due to the existence of at least one fruitless clause for Ψ at any level of T_Ψ. We are now ready to formulate the following, naturally arising question:

As the saturation s_Ψ grows, how does the pruning probability P^p_Ψ evolve?

Our aim would be to express P^p_Ψ as a function of s_Ψ. To do so, we need to introduce the following first:

Definition 2.8 (Fruitless probability). P^f_Ψ is the probability that a randomly picked clause is fruitless for Ψ (the f in the superscript is a mnemonic for fruitless).

The following lemma relates the fruitless probability to the saturation, and will be used to compute P^p_Ψ:

Lemma 2.2.

(2.2)   P^f_Ψ ≈ 2^{−k(1 − log s_Ψ)}

Proof.
Each compatible clause that we might add to Ψ has 0 to k literals in common with Ψ itself. In order for such a randomly generated clause to be fruitless, it has to have all of its literals in common, thus all k of them shall be picked among the s_Ψ n variables of Ψ, with the same signs (otherwise it would be incompatible). P^f_Ψ can thus be expressed as the ratio between favourable outcomes (fruitless clauses) and all outcomes (all available clauses):

P^f_Ψ = \binom{s_Ψ n}{k} / (2^k \binom{n}{k})

Using Stirling's approximation, namely log \binom{a}{b} ≈ b log(a/b) + (a − b) log(a/(a − b)) = b log(a/b) − (a − b) log(1 − b/a), which, as log(1 − x) ≈ −x for small x, finally simplifies to log \binom{a}{b} ≈ b log(a/b) + b − b²/a (this holds whichever the base of the logarithm is, because all the terms such as −x log e cancel each other out), we can write:

log P^f_Ψ ≈ k log(s_Ψ n / k) + k − k²/(s_Ψ n) − k − k log(n/k) − k + k²/n = k log s_Ψ − k − k²/(s_Ψ n) + k²/n

The rightmost terms monotonically descend towards 0 as n → ∞ and as we keep adding clauses to Ψ; they can therefore be ignored asymptotically, leading us to the following expression, which closes the proof:

log P^f_Ψ ≈ k (log s_Ψ − 1) = −k (1 − log s_Ψ)

We conclude with an observation: the smallest term log(2πx) coming from the Stirling approximation, which was obviously ignored in the above reasoning, would have given an overall contribution of log(s_Ψ(n − k) / (s_Ψ n − k)), which evidently collapses to 0 as n → ∞. □

We are interested in how P^f_Ψ behaves as we keep halving s_Ψ. For each h ≥ 1, let us set:

(2.3)   s_Ψ = 2^{−h+1}

That is to say, we start with a saturation s_Ψ = 1 (corresponding to h = 1) and we keep increasing h: each time h is increased by 1, the saturation gets halved. Plugging 2.3 into 2.2 gives:

(2.4)   P^f_Ψ ≈ 2^{−kh}

Now we set h = (log n − log log n) / k and see what happens to both the saturation:

(2.5)   s_Ψ = 2 log^{1/k} n / n^{1/k}

and the fruitless probability:

(2.6)   P^f_Ψ ≈ log n / n

Let us call 2.5 the critical saturation.
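As a quick numerical sanity check of Lemma 2.2 (ours, not from the paper): since 2^{−k(1 − log s)} = (s/2)^k, the approximation can be compared directly against the exact ratio \binom{s_Ψ n}{k} / (2^k \binom{n}{k}) from the proof, for a few sample values of n, k and s.

from math import comb

def exact_fruitless_probability(n, k, s):
    # Exact ratio from the proof of Lemma 2.2: fruitless clauses over all available clauses.
    return comb(int(s * n), k) / (2 ** k * comb(n, k))

def approx_fruitless_probability(k, s):
    # Lemma 2.2: P^f ~ 2^(-k (1 - log2 s)), i.e. (s/2)^k.
    return (s / 2) ** k

for n, k, s in [(10_000, 5, 0.5), (100_000, 8, 0.25), (1_000_000, 10, 0.125)]:
    exact = exact_fruitless_probability(n, k, s)
    approx = approx_fruitless_probability(k, s)
    print(f"n={n}, k={k}, s={s}: exact={exact:.3e}, approx={approx:.3e}")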
Gluing it together, here is what it all roughly means, on average:

Once we have reached the critical saturation, we should expect to find log n fruitless clauses every further n clauses we scan.

We are now ready to come back to P^p_Ψ, and to formulate an expression for it telling us how it behaves in correspondence of the critical saturation as n → ∞.

Lemma 2.3. If Ψ has saturation at least critical, the following holds:

(2.7)   lim_{n→∞} P^p_Ψ = 1

Proof.
We focus on the last σn clauses of Φ and consider the Bernoulli process X_1, ···, X_{σn}, where X_i is the random variable defined as follows:

X_i = 1 if the i-th clause is fruitless for Ψ, and X_i = 0 otherwise

Clearly we mean the i-th clause of Φ among its last σn clauses: we are standing somewhere on a certain Ψ of the recursion tree; from our node having critical saturation we look down toward the last σn layers of the tree and conduct our Bernoulli process on them. (As Φ is a set of clauses, there is no notion of last. We therefore mean last with respect to a certain ordering. Which ordering? Say, the ordering in which the m clauses have been randomly picked in the first place, from the set of 2^k \binom{n}{k} candidates.) By 2.6, X_i = 0 with probability ≈ 1 − log n / n. The random variable X = \sum_{i=1}^{σn} X_i tracks the number of fruitless clauses for Ψ among the last σn clauses of Φ, and the pruning probability P^p_Ψ is clearly at least equal to the probability that X > 0 (in general P^p_Ψ is higher than that, because a fruitless clause might very well exist also before the last σn clauses). Considering that the probability that X = 0 is:

P(X = 0) ≈ (1 − log n / n)^{σn}

and that P(X = 0) ≈ n^{−σ log e} as n → ∞, this obvious conclusion follows asymptotically:

P^p_Ψ ≥ 1 − n^{−σ log e}

which trivially means lim_{n→∞} P^p_Ψ = 1, thereby concluding the proof. □

That is to say: at or above the critical saturation, every sub-tree T_Ψ is asymptotically almost surely prunable. The probability that there is no fruitless clause among the last σn clauses drops to 0 as a power of n, and we can control such power by tuning σ at our will. Once Ψ reaches the critical saturation, the probability that the sub-tree T_Ψ rooted at Ψ is totally worthless to be explored, and can therefore be ignored tout-court without being visited and without affecting at all the correctness of the final counting, quickly approaches 1 as n → ∞. Epitomizing it:

It makes literally no sense to use the first m − σn clauses to build monotone sub-formulae having a saturation higher than the critical. We can enumerate them only up to the critical saturation, not more.

Doing so we are going to avoid the exploration of a massive portion of the search space, because each of the Ψ we are going to consider will be built as follows: by picking few clauses from the m − σn side, and combining them with clauses picked from the σn side (where A₀ works nicely, because σ is independent of k). We are now going to formalize such intuitive statement: firstly by ending this sub-section presenting the pseudo-code of A₁, and secondly by proving its running time immediately after.
Algorithm A₁: Computes the exact number of satisfying assignments of random Φ
1:  procedure CountRandom(Φ, σ)
2:    Let Φ↑ be the sub-formula of Φ obtained by selecting its first m − σn clauses
3:    Let Φ↓ be the sub-formula of Φ obtained by selecting its last σn clauses
4:    Initialize ⟨ν, O_ν, E_ν⟩ ← ⟨ν, 0, 0⟩, ∀ν ∈ [k, n]
5:    for each monotone sub-formula Ψ↑ of Φ↑ having saturation s_Ψ↑ less than critical do
6:      for each monotone sub-formula Ψ↓ of Φ↓ do
7:        Let Ψ = Ψ↑ ∪ Ψ↓ be monotone and have ν variables
8:        if |Ψ| is odd then
9:          ⟨ν, O_ν, E_ν⟩ ← ⟨ν, O_ν + 1, E_ν⟩
10:       else
11:         ⟨ν, O_ν, E_ν⟩ ← ⟨ν, O_ν, E_ν + 1⟩
12:       end if
13:     end for
14:   end for
15:   count ← 0
16:   for ν ∈ [k, n] do
17:     count ← count + (O_ν − E_ν) · 2^{n−ν}
18:   end for
19:   Return 2^n − count
20: end procedure

Some quick observations, all pretty obvious. Firstly, it is clear that, at line 7, if Ψ does not happen to be monotone we just skip it and go on with the inner iteration, to the next Ψ↓. Secondly, it is also clear that the above algorithm A₁ is not optimal: taking into account the existence of fruitless clauses only among the last σn, whereas a fruitless clause might very well exist at any index after the last clause of Ψ↑, is a rough simplification. However, such simplification renders A₁ more amenable to a straightforward running time analysis: as we are about to see, the resulting asymptotic running time will not depend on ∆ in the end, which is sufficient for our purpose. Thirdly, see how the above algorithm is deterministic, as there is no usage of random bits in it, yet it is not such an algorithm stricto sensu either: rather, it returns the correct answer with probability 1, asymptotically almost surely. The probability that the returned answer might be wrong, which drops to 0 as n → ∞, is due to the remote, vanishing possibility that there might exist some Ψ↑ having saturation equal to or higher than critical, yet T_Ψ↑ being not prunable due to the nonexistence of fruitless clauses for Ψ↑ (meaning that it might furnish a non-null contribution to the final counting).
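To give a feel for how quickly the failure probability just mentioned vanishes, here is a small Python check, ours and purely illustrative, of the quantity appearing in the proof of Lemma 2.3: it compares (1 − log n / n)^{σn} with the asymptotic form n^{−σ log e} for growing n and two values of σ (all logarithms base 2).

from math import log2, e

def prob_no_fruitless(n, sigma):
    # Probability that none of the last sigma*n clauses is fruitless (proof of Lemma 2.3).
    return (1 - log2(n) / n) ** (sigma * n)

def asymptotic_estimate(n, sigma):
    # The paper's asymptotic form: n^(-sigma * log2(e)).
    return n ** (-sigma * log2(e))

for n in [10**3, 10**4, 10**5, 10**6]:
    for sigma in [1, 2]:
        print(n, sigma, f"{prob_no_fruitless(n, sigma):.3e}", f"{asymptotic_estimate(n, sigma):.3e}")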
2.5. Proof of A₁ running time. We are now ready to prove the following:

Theorem 2.4. A₁ runs in 2^{εn} time, with lim_{k→∞} ε = 0.

Proof.
We are going to show that ε ∈ Θ(log k / k). First of all, we call Ψ↑ ⊂ Φ critical if and only if its saturation s_Ψ↑ is equal to the critical saturation. We shall then ask the following natural question: how many clauses does a critical Ψ↑ have? We need an upper bound on the number of clauses that such a critical Ψ↑ might possibly exhibit. By repeating a very basic reasoning already held in [1], it is easy to see that, due to the random nature of Φ, each time we add a new compatible clause of k literals to the Ψ↑ we are assembling, at least k/2 of such literals will be new variables for Ψ↑: as we keep adding clauses, we imagine to stack the growing number of variables of Ψ↑ on the "left" half of the n variables; doing so, each randomly generated clause will roughly have half of its literals falling on the "left" side (whose occupancy is growing), and half of them on the "right" side (which keeps on remaining empty). This is true as long as Ψ↑ has at most n/2 variables. As we have to keep on adding clauses only up to the critical saturation s_Ψ↑ = 2 log^{1/k} n / n^{1/k}, which corresponds to an amount of variables equal to s_Ψ↑ n = 2 n^{(k−1)/k} log^{1/k} n, and by the truism that such amount is clearly less than n/2 for large enough n, we can legitimately assume that in order to build our critical Ψ↑ we need to invest 1 compatible clause for every k/2 variables we want in it, which leads to the following upper bound on the number of clauses of any critical Ψ↑:

(2.8)   |Ψ↑| ≤ 4 n^{(k−1)/k} log^{1/k} n / k

By invoking once again Stirling's approximation as we did in the proof of Lemma 2.2, we can determine how many critical Ψ↑ can be assembled by picking clauses among the first m − σn clauses of Φ:

(2.9)   log \binom{(∆ − σ)n}{4 n^{(k−1)/k} log^{1/k} n / k} ≈
          (4 n^{(k−1)/k} log^{1/k} n / k) · log( k(∆ − σ) n^{1/k} / (4 log^{1/k} n) )   [leading term]
        + 4 n^{(k−1)/k} log^{1/k} n / k − (4 n^{(k−1)/k} log^{1/k} n / k)² / ((∆ − σ)n)

See how the leading term belongs to o(n) for any clause density ∆. Let us sculpture it more evidently:

The outer iteration of A₁ cycles 2^{o(n)} many times, whatever ∆ is.

It must be observed that the expression in 2.9 is a gross overestimation, because we are considering all the sub-formulae of Φ, even the non-monotone ones: reality is therefore much better than that. We now focus on the inner iteration of A₁, that is to say on the last σn clauses. Since σ is a constant which does not depend on k, we can apply the result we have already proven in [1] for A₀ (see Theorem 4.1 there), where we have shown that as long as σn/n = σ ∈ o(k) it is the case that every maximal monotone sub-formula Ψ has at most ≈ n/k clauses as k → ∞. This fact allowed us to conclude that A₀ had to perlustrate not more than \binom{m}{n/k} such maximal Ψs. By repeating that very same simple reasoning here on the last σn clauses, we can state the following:

The inner iteration of A₁ cycles at most 2^{Θ(log k / k) · n} many times.

The above follows from applying Stirling's approximation to \binom{σn}{n/k} and observing how the resulting exponent asymptotically behaves as εn with ε = log(σk)/k + 1/k − 1/(σk²) < log k/k + log σ/k + 1/k ∈ Θ(log k / k). Gluing it all together: we have 2 nested iterations, the outer one scanning over 2^{o(n)} elements (the Ψ↑), and the inner one scanning over 2^{εn} elements (the Ψ↓), which obviously means the overall number of considered merged objects (the Ψ = Ψ↑ ∪ Ψ↓) is given by their product, which translates into the sum of the two (outer and inner) exponents.
The amount of times we are going to execute lines 7 to 12 can therefore be written as follows (considering only critical Ψ↑s and maximal Ψ↓s), as 2 raised to the sum of the two exponents:

(4 n^{(k−1)/k} log^{1/k} n / k) · log( k(∆ − σ) n^{1/k} / (4 log^{1/k} n) )   [negligible term, outer iteration: o(n) for any ∆ and k]
+ Θ(log k / k) · n   [leading term, inner iteration: lim_{k→∞} ε = 0]
By the very definition of Θ notation, we can assert that the total number of steps performed by A₁ is asymptotically upper bounded (up to a polynomial factor) by:

2^{o(n) + Θ(log k / k) · n} ∈ 2^{Θ(log k / k) · n}

(that is to say, we pretend that the cheaper outer iteration has the same cost as the inner one), thereby concluding the proof. Such polynomial factor depends on 3 sources: the fact that we ignored the log(2πx) term in Stirling's approximation, the fact that line 7 requires visiting all the k literals of each clause, and the fact that we focused on maximal Ψs only whereas we need to enumerate all Ψs up to maximal size (very roughly, this can be brutally adjusted thanks to an n/k factor squared, that is to say by pretending that there are, for each size up to the maximal size, as many sub-formulae as there are for the maximal size, the squaring being due to the two nested cycles). We have a deterministic algorithm, whose probability n^{−σ log e} of returning a wrong answer collapses to 0 as n grows, running faster and faster on random k-SAT instances as k → ∞, for any clause density, whether ∆ ∈ Θ(2^k) or ∆ ∈ Θ(n^{k−1}). □
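As a rough numerical illustration of the inner-iteration bound used in the proof above (ours, not from the paper, and only indicative, since the paper's Stirling approximation is itself crude): the empirical exponent (1/n) · log \binom{σn}{n/k} can be compared with the estimate log(σk)/k + 1/k − 1/(σk²) ∈ Θ(log k / k) for a few values of k.

from math import comb, log2

def empirical_exponent(n, k, sigma):
    # (1/n) * log2 of the number of subsets of size n/k taken from sigma*n clauses.
    return log2(comb(sigma * n, n // k)) / n

def paper_estimate(k, sigma):
    # epsilon = log(sigma*k)/k + 1/k - 1/(sigma*k^2), as in the proof of Theorem 2.4.
    return log2(sigma * k) / k + 1 / k - 1 / (sigma * k ** 2)

for k in [4, 8, 16, 32]:
    n, sigma = 4096, 2
    print(k, round(empirical_exponent(n, k, sigma), 4), round(paper_estimate(k, sigma), 4))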
3. Solving k-SAT in 2^{O((log log log n / log log n) · n)} Time with the help of Randomness

In this section, we are now going to orchestrate a plot to turn our non-randomized algorithm working on random instances into a randomized algorithm working on non-random instances:
We derandomize the instance by randomizing the algorithm.
In order to do that, we will devise a randomized reduction which, given a generic formula Φ having n variables, m clauses and at most k literals per clause, outputs another formula Φ′ having n′ = n variables, m′ ≤ mσ log n clauses and at least log log n literals per clause. Then we will invoke A₁ on Φ′: to do that, we only have to make sure that the clauses of Φ′ look randomly generated to A₁, that is to say the last σn of them shall be indistinguishable by A₁ from a random instance, whereas the first m′ − σn shall only have the property that each variable is mentioned roughly the same number of times. Clearly, the same definitions introduced in the previous section also apply to the present section, the unique couple of exceptions being, as already anticipated, the ≤ k assumption used here instead of the = k assumption used there, and the fact that here we require k to be constant whereas there we did not.

Definition 3.1 (Random inflation R_{c,z}). Given a clause c ∈ Φ and an integer z > 0, R_{c,z} is a randomly generated set of 2^z clauses c_1, ···, c_{2^z}, each one having exactly |c| + z literals. Being V_c the set of variables mentioned in c, the generation of R_{c,z} consists in randomly picking z variables from V \ V_c and in building the 2^z inflated clauses by adding z literals to c, one such clause for each possible combination of signs.

Definition 3.2 (Random inflation R_{Φ,z}). R_{Φ,z} = ∪_{c ∈ Φ} R_{c,z}.

Thus, given a generic Φ, we randomly inflate it by randomly inflating each of its clauses. Let us state the obvious: the z variables used to inflate Φ are re-picked again and again for each clause (and of course thrown back in the V basket after being used), in order for each set R_{c,z} to appear random with respect to each other.
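A minimal Python sketch of the random inflation of Definitions 3.1 and 3.2 (ours; the function names inflate_clause and inflate_formula are hypothetical): given a clause and z variables drawn from outside it, it emits the 2^z inflated clauses, one per sign pattern of the added literals. By Lemma 3.1 below, the inflated formula keeps exactly the same satisfying assignments as Φ.

import random
from itertools import product

def inflate_clause(clause, all_vars, z, rng):
    # Definition 3.1: pick z variables not mentioned in c, then build 2^z clauses,
    # each adding the z variables to c with one of the 2^z possible sign combinations.
    c_vars = {abs(lit) for lit in clause}
    extra = rng.sample(sorted(all_vars - c_vars), z)
    inflated = []
    for signs in product([1, -1], repeat=z):
        inflated.append(frozenset(clause) | {s * v for s, v in zip(signs, extra)})
    return inflated

def inflate_formula(clauses, n, z, rng):
    # Definition 3.2: R_{Phi,z} is the union of the inflations of the single clauses,
    # with the z extra variables re-picked independently for every clause.
    all_vars = set(range(1, n + 1))
    out = []
    for c in clauses:
        out.extend(inflate_clause(c, all_vars, z, rng))
    return out

rng = random.Random(0)
phi = [frozenset({1, -2, 3}), frozenset({-1, 4, 5})]       # a toy 3-CNF on 6 variables
print(len(inflate_formula(phi, n=6, z=2, rng=rng)))        # 2 clauses * 2^2 = 8 inflated clauses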
Lemma 3.1. Φ and R_{Φ,z} have the same set of satisfying assignments.

Proof. Each set of clauses R_{c,z} implies c, by applying 2^z − 1 resolution steps on the z added literals. The clause c we get back subsumes every clause in R_{c,z}, which thus all disappear, leaving c only. Repeating this process for every c lets us transform R_{Φ,z} back into Φ in m(2^z − 1) resolution steps. □

Clearly such proof relies on the fact that the z variables which were used to inflate clauses were all already mentioned in Φ: none of them was a fresh new variable (in which case the above Lemma 3.1 would have been false). So far we have constructed, from our input formula Φ having n variables, m clauses and at most k literals per clause, another random looking formula having n variables, 2^z m clauses and at least z literals per clause. We are now ready to present our randomized reduction:

Algorithm R: Reduces Φ to Φ′
1:  procedure Inflate(Φ)
2:    Initialize Φ′↑ = Φ′↓ = ∅
3:    for each c ∈ Φ do
4:      Let R_c = R_{c, log log n}
5:      Let c↓ ∈ R_c be randomly picked
6:      Φ′↑ ← Φ′↑ ∪ R_c \ {c↓}
7:      Φ′↓ ← Φ′↓ ∪ {c↓}
8:    end for
9:    Return Φ′↑ ∪ Φ′↓
10: end procedure
11: procedure Inflate(Φ, σ)
12:   if m ≥ σn then
13:     Return Inflate(Φ)
14:   else
15:     Initialize Φ′↑ = Φ′↓ = ∅
16:     for each i = 1, ···, σn/m do
17:       Let Φ′_i = Inflate(Φ)
18:       Φ′↑ ← Φ′↑ ∪ Φ′_{i,↑}
19:       Φ′↓ ← Φ′↓ ∪ Φ′_{i,↓}
20:     end for
21:     Return Φ′↑ ∪ Φ′↓
22:   end if
23: end procedure

At line 9 above, we mean that Φ′↓ are the last m clauses of the returned Φ′ (similarly for the Φ′_{i,↓} accumulated at line 19). As promised in the beginning of this section, such reduction R allows us to obtain, from any input formula Φ having at most a constant number k of literals per clause, another formula Φ′ having as many variables as Φ, a number of clauses at most σ log n times higher, at least log log n literals per clause, and exactly the same set of satisfying assignments as Φ. Moreover, the sub-formula composed by the last σn clauses of Φ′, denoted as Φ′↓, is a full-fledged random formula: from the point of view of A₁, it will behave for all intents no differently than a random CNF instance having exactly log log n literals per clause. The randomized algorithm A₂ which makes us able to solve k-SAT in sub-exponential time is therefore the following:
Algorithm A₂: Computes the exact number of satisfying assignments of Φ
1: procedure Count(Φ, σ)
2:   Let Φ′ = Inflate(Φ, σ)
3:   Return CountRandom(Φ′, σ)
4: end procedure

We are now ready to complete the paper by finally proving our main result, as follows:
Theorem 3.2. A₂ runs in 2^{o(n)} time.

Proof.
We only have to show that the proof of Theorem 2.4 goes through as well with Φ′. In order to do so, we can consider the k in Theorem 2.4 to be equal to log log n. So let us read that proof again and check if every step of it stands valid with Φ′. The first step is to verify whether the upper bound 2.8 on the number of clauses that any critical Ψ↑ might possibly have still holds: it is evident that it does, for its underlying hypothesis (that at least k/2 = (log log n)/2 new variables are added to Ψ↑ for each new clause inserted in it) is still valid, due to the fact that, thanks to the non-constant z random inflation, any variable occurs in Φ′ roughly the same number of times as any other variable (whereas it might not be the case in Φ). We also observe that the critical saturation is s_Ψ = 4 n^{−1/log log n}, which means s_Ψ n ∈ o(n). By plugging k = log log n into 2.8, we can write the following:

(3.1)   |Ψ↑| ≤ 8 n^{1 − 1/log log n} / log log n

By our usual notation, being ∆ (respectively ∆′) the clause density of Φ (respectively Φ′), we can write:

(3.2)   ∆′ ≤ ∆ σ 2^{log log n} = ∆ σ log n

By plugging 3.2 into 2.9, and displaying the leading term only, we can write:

(3.3)   log \binom{(∆′ − σ)n}{8 n^{1 − 1/log log n} / log log n} ≈ (8 n^{1 − 1/log log n} / log log n) · log( log log n · (∆ σ log n − σ) · n^{1/log log n} / 8 )

where ∆ σ log n ≥ ∆′ by 3.2. We observe how n^{1 − 1/log log n} ∈ o(n) and how the increased ∆′ does not behave substantially differently from the original ∆ in terms of its impact on the overall expression in 3.3, which as a whole remains sub-exponential in any case. This means that:

The outer iteration of A₁ cycles 2^{o(n)} many times also when fed with Φ′.

We repeat here the same observation made in the previous section: we are overestimating the number of loops of the outer cycle, because we are pretending that every clause can coexist in Ψ↑ with every other clause (see how this is clearly false for every pair of clauses picked from the same R_c). We shall now focus on the last σn clauses of Φ′: thanks to our randomized reduction R, such last clauses have been built and arranged in such a way as to be indistinguishable by A₁ from a random log log n-SAT instance. What do we mean by "indistinguishable by A₁"? We mean that the crucial property exploited by A₀ (and re-used by A₁) holds: asymptotically, every monotone sub-formula assembled using the last σn clauses has cardinality at most ≈ n / log log n, as long as σ ∈ o(log log n) (which is obviously true as σ is a constant). This means we can re-apply that very same argument and conclude that the number of maximal monotone sub-formulae to be perlustrated among the last σn clauses by the inner iteration of A₁ behaves as 2^{εn}, where ε = log log log n / log log n + log σ / log log n + 1 / log log n − 1/(σ log² log n), with the first term clearly being the leading term:

When fed with Φ′, the inner iteration of A₁ cycles 2^{O((log log log n / log log n) · n)} many times.

The proof ends the same way it ended in the previous section: the running time of A₂, which is given by the running time of A₁ when fed with Φ′, is the product of the outer and inner iterations, both of them executing a sub-exponential amount of cycles. The exponent of the overall running time is the sum of the two exponents of those iterations, now both of them being o(n):

(8 n^{1 − 1/log log n} / log log n) · log( log log n · (∆ σ log n − σ) · n^{1/log log n} / 8 )   [negligible term, outer iteration: o(n) for any ∆]
+ O( (log log log n / log log n) · n )   [leading term, inner iteration: o(n)]

As n → ∞, by definition the above can be rewritten as:

O( (log log log n / log log n) · n ) ∈ o(n)

thereby concluding the proof. We have a randomized counting algorithm which returns the exact number of satisfying assignments with probability 1, running in 2^{o(n)} time on generic k-SAT instances, for any clause density and any constant k. □
References

[1] Giorgio Camerani. "The Long, the Short and the Random". In: CoRR abs/2011.01649 (2020). arXiv: 2011.01649. URL: https://arxiv.org/abs/2011.01649.
[2] Russell Impagliazzo and Ramamohan Paturi. "The Complexity of k-SAT". In: Proceedings of the 14th IEEE Conference on Computational Complexity (1999), pp. 237-240.
[3] Holger Dell, Thore Husfeldt, and Martin Wahlén. "Exponential Time Complexity of the Permanent and the Tutte Polynomial". In: International Colloquium on Automata, Languages, and Programming (2010), pp. 426-437.
[4] Chris Calabro, Russell Impagliazzo, and Ramamohan Paturi. "The Complexity of Satisfiability of Small Depth Circuits". In: Parameterized and Exact Computation, 4th International Workshop (2009), pp. 75-85.
[5] Amin Coja-Oghlan. "The Asymptotic k-SAT Threshold". In: Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing. STOC '14. New York, NY: Association for Computing Machinery, 2014, pp. 804-813. ISBN: 9781450327107. DOI: 10.1145/2591796.2591822. URL: https://doi.org/10.1145/2591796.2591822.