arXiv [cs.CC]

SAT Has No Wizards
Silvano Di Zenzo
Department of Computer Science, University of Rome
Abstract
An (encoded) decision problem over Σ is a pair (E, F) where E = words that encode instances of the problem and F = words to be accepted. We use strings in a technical sense, borrowed from Computability. With any NP problem (E, F) we associate a set of strings |Log_E(F)| called the reduced logogram of F relative to E, which conveys structural information on E, F, and how F is embedded in E. We define notions of internal independence of decision problems in terms of |Log_E(F)|. The kernel Ker(P) of a program P that solves (E, F) is the set of those strings in |Log_E(F)| that are actually used by P in making decisions. There are strict relations between Ker(P) and the complexity of P. We develop an application to SAT that relies upon a property of strong internal independence of SAT. We show that SAT cannot have in its reduced logogram certain strings that, when present, serve as collective certificates. As a consequence, all the programs P that solve SAT have the same kernel Ker(P) = |Log_CNF(SAT)|.

We develop an application to
SAT, to be positioned in the current stream of interest in the structure of Boolean satisfiability [1]. We use strings in a technical sense, borrowed from Computability [2].

An encoded decision problem over Σ is a pair (E, F) where E = words that encode instances of the problem and F = words to be accepted. On input x, a decision program P for (E, F) either accepts x (if x is in F), or rejects x (if x is in E − F), or else discards x (for x outside E).

Our fundamental construct is a set of strings Log_E(F), called the logogram of F relative to E, that conveys structural information on E, F, and how F is embedded in E. We mostly use the reduced version |Log_E(F)|, consisting of those strings in Log_E(F) that do not include other strings in Log_E(F). The kernel Ker(P) of a program P that solves (E, F) is the set of those strings in |Log_E(F)| that are actually used by P in making decisions. There are strict relationships between the composition in terms of strings of the kernel Ker(P) of a program solving (E, F) and the complexity of P.

Our application to SAT uses a property of internal independence of a decision problem that we call "strong internal independence." Think of a computation in which the result of any computation step does not change the results that are possible for subsequent steps. Internal independence is defined in terms of a relation of entanglement ⊒_E between sets of strings relative to a reference set E.

Our main results are the following. We show that (CNF, SAT) exhibits the strong internal independence property: Intuitively, no "entanglement at distance" between strings in |Log_CNF(SAT)| is possible. Besides, we show that problem (CNF, SAT) cannot have, in its reduced logogram, certain collective certificates that we call wizards. As a consequence, the decision programs P that solve (CNF, SAT) all have the same kernel Ker(P) = |Log_CNF(SAT)|.

We first recall notions regarding the certificates of membership in NP theory. As a next step, we illustrate a possible use of strings to represent certificates. We conclude the section reviewing basic algebraic properties of strings.

Let G ⊆ Σ* × Σ*, so that G is a relation on words over Σ. Let Dom(G) and Cod(G) be the first and second projections of G. A relation G which is both polynomial-time decidable and polynomially balanced is an NP relation. L is in NP if and only if there exists an NP relation G such that L = Dom(G). We interchange problems with languages: (E, F) ∈ NP and F ∈ NP amount to the same.

Let (E, F) be an NP problem. Then there exists a sequence y_1, y_2, .. of words (over some appropriate alphabet) called solutions, or else certificates of membership, for problem (E, F). For any problem instance x ∈ E we have that x can possibly be satisfied by some of the y_i s. We also have "unsatisfiable" instances. What "satisfaction" means operationally is proper of problem (E, F).

Cardinality function α(n) of an NP problem: We may arrange notations so that all solutions that can possibly satisfy an x of size n are between y_1 and y_α(n).

Associated with the solutions y_1, y_2, .. there is a decomposition of the target set F into subsets F_i called regions, where F_i is the set of those x's that are satisfied by y_i. Regions satisfy the obvious relation F = ∪_i F_i.

In this paper we replace certificates with generalized certificates. These are represented by strings, defined to be functions N → Σ with finite domain (N = positive integers). In loose words, a string g being included (or subsumed) in a word x is that which remains by canceling zero or more letters in x, while leaving blanks in place of the letters. Note that words are certain special strings, thus the solutions y_1, y_2, .. continue to be certificates. This generalization allows us to introduce certain more general certificates that we call wizards.

We assume that satisfiability, being a property exhibited by certain words, is accompanied by characteristic signs, which we think of as distinctive marks, or signatures, somehow inscribed within the word x under study. A detailed discussion would yield strings as the proper formalization of such notions as "mark" or "signature." Thus, we assume that signs are strings interspersed in x. Since strings represent words in shorthand, we call their set a logogram.

We define Σ^∞ to be the set of all strings over Σ. Look at g ∈ Σ^∞ as a prescription that a word x over Σ may or may not satisfy.
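Concretely, a string in this sense can be modeled as a finite partial function from positions to letters. The sketch below (our illustrative Python encoding as dictionaries, not part of the paper's formalism) implements inclusion of a string in a word, together with the compatibility and join operations reviewed next:

```python
def word_to_string(w):
    """A word is the special string defined on the initial segment 1..len(w)."""
    return {i + 1: c for i, c in enumerate(w)}

def includes(x, g):
    """x >= g: the word x includes (subsumes) the string g."""
    sx = word_to_string(x)
    return all(sx.get(pos) == letter for pos, letter in g.items())

def compatible(f, g):
    """f and g agree wherever both are defined."""
    return all(g[p] == v for p, v in f.items() if p in g)

def join(f, g):
    """f + g: the least common extension of two compatible strings."""
    assert compatible(f, g)
    return {**f, **g}

g = {1: "a", 3: "c"}          # prescribes 'a' at position 1 and 'c' at position 3
print(includes("abc", g))     # True
print(includes("bac", g))     # False
print(join({1: "a"}, {3: "c"}))
```

Positions are kept fixed: a string is not a subsequence pattern but a prescription anchored at definite places, which is what the blank-preserving cancellation above describes.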
If Dom(g) is an initial segment of N then g is an ordinary word: Thus, words are certain special strings. The length (or size) |g| is the greatest number in Dom(g).

Σ^∞ is partially ordered. Given f, g ∈ Σ^∞, g is an extension of f, written f ≤ g, as soon as Dom(f) ⊆ Dom(g) and g takes the same values as f on Dom(f). If f ≤ g and g ≤ f then f = g. If f ≤ g but not g ≤ f, write f < g and say that g is a proper extension of f, or else that f is a proper restriction of g. The empty partial function N → Σ, noted ⊥, is the void string, and Dom(⊥) = ∅. Any f in Σ^∞ is an extension of ⊥, thus (Σ^∞, ≤) has a least element ⊥.

Two strings f and g are compatible as soon as f(x) = g(x) for any x in Dom(f) ∩ Dom(g). If f, g are disjoint, which is to say Dom(f) ∩ Dom(g) = ∅, then f and g are certainly compatible. The meet f ∧ g of any pair f, g is the restriction of f (or g) to that portion of the intersection Dom(f) ∩ Dom(g) where f and g agree. The join of two compatible strings f, g, noted f + g, is the least string which is an extension of both f and g. Thus f, g ≤ f + g and Dom(f + g) = Dom(f) ∪ Dom(g). Equipped with meet and join, Σ^∞ is an upward directed complete meet-semilattice [3].

Entanglement among Strings
The cylinders defined below are as in Computability (the formalism is slightly different). The logogram is a newcomer in Computer Science. Entanglement is a key concept for dealing with the internal structure of computational problems.
Given H ⊆ Σ^∞ we define

Exp(H) = { x ∈ Σ* : ∃ a ∈ H (x ≥ a) }    (1)

Thus, Exp(H) = set of all words that include strings from H. Call Exp(H) the absolute expansion or, equivalently, the absolute cylinder associated with H. Note that Exp(H) is the union of the elementary cylinders Exp(g) for g ∈ H.

Given any recursive set of words E, we write Σ^∞(E) for the set of all strings that happen to be included in words of E, thus

Σ^∞(E) = { g ∈ Σ^∞ : Exp(g) ∩ E ≠ ∅ }    (2)

Σ^∞(E) is the set of those strings g in Σ^∞ whose associated cylinder Exp(g) intersects E. We think of E as the set of words over Σ that encode instances of some fixed reference computational problem Π. (Whenever we talk of a reference set E there is implicit reference to some fixed abstract decision problem Π as well as to a program P solving Π.) For H ⊆ Σ^∞(E) we write

Exp_E(H) = E_H = { x ∈ E : ∃ a ∈ H (x ≥ a) } = E ∩ Exp(H)    (3)

Thus, E_H is the set of those words in E that contain strings from H. E_H is the expansion of H relative to base E. We actually regard E_H as a relativized cylinder, equivalently, as being a cylinder relative to a reference set E. Note that for E = Σ* we regain the absolute expansion of the set H.

Given H, K ⊆ Σ^∞(E), the correspondence Exp_E : H → E_H exhibits the properties:

E_H ∪ E_K = E_{H∪K},    E_H ∩ E_K = E_{H+K}    (4)

Thus, unions and intersections of sets that are cylinders relative to the reference set E are cylinders in E. Also note that, for any H, K ⊆ Σ^∞(E),

H ⊆ K ⇒ E_H ⊆ E_K    (5)

Exp_E(Exp_E(H)) = Exp_E(E_H) = E_H    (6)

Logograms

In this section we introduce the logogram of a set of words F relative to a reference set E. Given F ⊆ E, we define

Log_E(F) = { g ∈ Σ^∞(E) : ∀ x ∈ E (x ≥ g ⇒ x ∈ E_F) }    (7)

Since ordinary words are strings, F can be regarded as a set of strings, hence E_F is defined. E_F is the relative cylindrification of F in E, and this in turn is the set of all words in E that are prefixed by words in F.
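On a toy reference set, Exp_E and Log_E can be computed by brute force directly from Equations 3 and 7. A sketch under illustrative assumptions (the dictionary encoding of strings and the tiny sets E, F are ours, not the paper's):

```python
from itertools import product

SIGMA = "ab"
E = {"aa", "ab", "ba", "bb"}           # toy reference set: all 2-letter words
F = {"ab", "bb"}                       # target: words whose 2nd letter is b

def includes(x, g):
    # x >= g : the word x agrees with the string g on every position of Dom(g)
    return all(pos <= len(x) and x[pos - 1] == v for pos, v in g.items())

def exp_E(H):
    # Equation 3: E_H, the words of E that include some string from H
    return {x for x in E if any(includes(x, g) for g in H)}

# Equation 7 (here E_F = F, since F consists of full-length words):
# Log_E(F) = strings, included in some word of E, whose presence forces x in F
strings = []
for mask in product([None] + list(SIGMA), repeat=2):
    g = {p + 1: v for p, v in enumerate(mask) if v is not None}
    if any(includes(x, g) for x in E) and all(x in F for x in E if includes(x, g)):
        strings.append(g)

# reduced logogram: drop strings that include other strings of the logogram
reduced = [g for g in strings
           if not any(h != g and all(g.get(p) == v for p, v in h.items())
                      for h in strings)]
print(sorted(exp_E([{2: "b"}])))   # ['ab', 'bb']
print(reduced)                     # [{2: 'b'}]
```

The reduced logogram here collapses to the single minimal string prescribing b at position 2, exactly the "sign" whose presence is necessary and sufficient for membership in F.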
Remark. If F is a cylinder in E, i.e., F = E_H for some H ⊆ Σ^∞(E), then E_F = F by Equation 6.

Remark. For E = Σ* Equation 7 gives the absolute logogram.

The main property of the correspondence Log_E : E → Σ^∞(E) is the following. Given A, B ⊆ E,

Log_E(A ∪ B) ⊇ Log_E(A) ∪ Log_E(B).    (8)

Let us understand this inclusion. Log_E(A ∪ B) is the set of all strings that, for x in E, are able to trigger the event x ∈ E_{A∪B} = E_A ∪ E_B. A string that triggers x ∈ E_A certainly belongs to Log_E(A ∪ B). Analogously, a string that triggers x ∈ E_B certainly belongs to Log_E(A ∪ B). Thus, Log_E(A) ∪ Log_E(B) certainly is a subset of Log_E(A ∪ B). However, there can be strings f whose inclusion in a word x ∈ E is a sufficient condition for the event x ∈ E_A ∪ E_B but not for x ∈ E_A or x ∈ E_B. Thus, in the general case Log_E(A ∪ B) is not the same set as Log_E(A) ∪ Log_E(B).

The presence of certain strings in a word may entail that of certain others. Given H, K ⊆ Σ^∞(E), we write K ⊒_E H if the following happens: Every word in E which includes strings from K also includes strings from H. (Think of the strings in K as spies, or else symptoms, for the presence in an input word x of strings from H.) If H ⊒_E K and K ⊒_E H then we write H ≡_E K and say that H, K are isoexpansive relative to E. Clearly, ≡_E is an equivalence relation. It is easily seen that H ≡_E K if and only if E_H = E_K.

For E = Σ* we rewrite ⊒_E as ⊒ and ≡_E as ≡. Note that f ⊒_E g if and only if every word x (within E) which includes f also includes g.

We mention a few easy facts. (i) If f ≥ g then f ⊒_E g for any possible E. (ii) In the general case f ⊒_E g does not imply f ≥ g. (It is well possible that this holds for specific sets E. For example, if E = Σ* then f ⊒_E g if and only if f ≥ g.) (iii) If f, g are incompatible, then it cannot be that f ⊒_E g. (iv) Given any H, K ⊆ Σ^∞(E),

H ⊆ K ⇒ H ⊒ K ⇒ H ⊒_E K    (9)

We ask: Is there any easy piece of algebra linking expansion, logogram, and entanglement? To get an answer, we define a Galois connection that will provide us with a closure operation in Σ^∞(E), noted H → H^{αβ}. We will see that H and H^{αβ} are isoexpansive relative to E. What is more, there can be distinct subsets K, I, .. of H being isoexpansive (mod E) to H^{αβ} while possibly exhibiting different computational behaviors.

We define our connection to be a pair (α, β) of correspondences between sets of strings and sets of words. The first correspondence α carries a set of strings H ⊆ Σ^∞(E) into a corresponding set of words H^α ⊆ E. The second carries a set of words A ⊆ E into a set of strings A^β ⊆ Σ^∞(E), according to (here H ⊑_E K means K ⊒_E H):

H ⊑_E K ⇒ H^α ⊇ K^α    (10)

A ⊆ B ⇒ A^β ⊒_E B^β    (11)

H ⊑_E H^{αβ},    A ⊆ A^{βα}    (12)

The connection is formally defined through the explicit expressions:

H^α = E_H    (13)

A^β = Log_E(A)    (14)

We emphasize that A is any subset of E. Thus, given any subset A of the reference set E the set A^β = Log_E(A) is defined.
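The connection can be checked mechanically on a small example. The sketch below (toy E and an illustrative encoding of strings as frozensets of position-letter pairs, both our assumptions) computes α and β by brute force and verifies the second of Equations 12 together with the derived identity H^α = H^{αβα}:

```python
from itertools import product

SIGMA = "ab"
E = {"".join(w) for w in product(SIGMA, repeat=2)}   # reference set: all 2-letter words

def includes(x, g):
    return all(pos <= len(x) and x[pos - 1] == v for pos, v in g.items())

# Sigma^inf(E): strings over positions {1, 2}, kept hashable as frozensets
STRINGS = set()
for mask in product([None] + list(SIGMA), repeat=2):
    g = frozenset((p + 1, v) for p, v in enumerate(mask) if v is not None)
    if any(includes(x, dict(g)) for x in E):
        STRINGS.add(g)

def alpha(H):
    # Equation 13: H^alpha = E_H
    return {x for x in E if any(includes(x, dict(g)) for g in H)}

def beta(A):
    # Equation 14: A^beta = Log_E(A); for sets of full-length words, E_A = A
    return {g for g in STRINGS if all(x in A for x in E if includes(x, dict(g)))}

A = {"ab", "bb"}                            # words ending in b
assert A <= alpha(beta(A))                  # second of Equations 12
H = {frozenset({(2, "b")})}
assert alpha(H) == alpha(beta(alpha(H)))    # H^alpha = H^{alpha beta alpha}
print("closure laws hold on the toy example")
```

For this A the inclusion A ⊆ A^{βα} is in fact an equality, which matches the fact that A is a cylinder in E and hence closed.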
However, not all subsets A of E happen to be the conjugate set H^α of some set H ⊆ Σ^∞(E). When A = H^α for some H, we say that A is closed. Note that A closed implies E_A = A.

Theorem 1. (α, β) is a Galois connection.

Proof. We must derive Equations 10-12 from Equations 13-14.

(I) Let
H, K ⊆ Σ^∞(E) be given, and assume H ⊑_E K. Let g ∈ K, and let x be any word in E such that x ≥ g. Then x ∈ E_K, hence x ∈ K^α. Since H ⊑_E K, there exists f ∈ H such that x ≥ f. Then x is in E_H, hence x ∈ H^α. Equation 10 is proved.

(II) Next, we prove Equation 11. Let A, B ⊆ E and assume A ⊆ B. We must prove that if a word x ∈ E includes a string g from A^β then x also includes a string f from B^β. Let g ∈ Log_E(A), so that g ∈ A^β by Equation 14. Thus, for all x ∈ E we have x ≥ g ⇒ x ∈ E_A. But A ⊆ B, hence x ∈ E_A ⇒ x ∈ E_B by Equation 5. Thus, for all x ∈ E we have x ≥ g ⇒ x ∈ E_B. By Equation 14 this is to say g ∈ Log_E(B) = B^β. We have shown that A^β ⊆ B^β. Equation 11 follows by virtue of Equation 9.

(III) Next, we prove the first of Equations 12. Let g ∈ H^{αβ} and let x be any word in E such that x ≥ g. We have H^α = E_H, hence H^{αβ} = Log_E(E_H). Thus, g ∈ Log_E(E_H). By Equation 7 we have x ∈ Exp_E(E_H), and then, by virtue of Equation 6, x ∈ E_H. We conclude that there exists f ∈ H such that x ≥ f.

(IV) Next, we prove the second of Equations 12. It follows from Equations 13, 14 that A^{βα} = E_{Log_E(A)}. On the other hand one has A ⊆ E_{Log_E(A)} for any A ⊆ E. Indeed, A ⊆ E_A and E_A = E_{Log_E(A)} from the definitions, taking into account Equation 6. We have proved the second of Equations 12.

Theorem 2.
The following equations hold:

H ⊆ H^{αβ},    A ⊆ A^{βα}    (15)

H^α = H^{αβα},    A^β = A^{βαβ}    (16)

Proof.
From the theory of Galois connections [4].
Theorem 3.
The map H → H^{αβ} is a closure operation in Σ^∞(E), and A → A^{βα} is a closure operation in E.

Only for a closed A do we have that, for all x in E, x ∈ A if and only if there exists a g ∈ Log_E(A) such that x ≥ g. Accordingly, we employ the reduced logogram |Log_E(A)| only when A is a closed subset of the reference set E. Regarding the reduced logogram |Log_E(A)| of a closed set A ⊆ E, we explicitly note that, for any x in E, x ∈ A if and only if there is g ∈ |Log_E(A)| such that x ≥ g.

We begin with a few remarks on the nature of the strings that happen to occur in the reduced logogram |Log_E(F)|. For any NP decision problem (E, F) we assume F to be a relative cylinder in E. (It is known that SAT is a cylinder [5].) The strings in |Log_E(F)| are certificates of membership for F relative to E: For words in E, to include one or more strings from |Log_E(F)| is necessary and sufficient for membership in F. In principle, we cannot exclude that |Log_E(F)| may contain strings that behave as collective witnesses, also called wizards. (There exist problems, e.g. PRIMES, where |Log_E(F)| has wizards.) In that case a program P solving (E, F) might do calculations that are functionally equivalent to testing the input x for wizards.

Let P solve problem (E, F). The computations that P performs are functionally equivalent to sequences of tests done on the input x. This is part of Scott's view of computations [6] [7]. (The term "test" is ours: Dana Scott uses "token" or else "piece of information" according to context.) Note that Scott's theory is consistent with our developments as soon as we identify Scott's tokens with strings. In this view what P actually does is search the input x for strings in |Log_E(F)|. That yields a view of computations as sequences of tests in disguise.

Let program P solve problem (E, F). The tests in |Log_E(F)| are those that P can use: They are, so to speak, at disposal for a program P.
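A program reduced to its tests is, in this view, nothing but a scan of the input for a given set of strings. A minimal sketch (the toy problem, and the representation of tests as dictionaries, are our illustrative assumptions):

```python
def includes(x, g):
    # x >= g : the word x carries the string g (a dict position -> letter, 1-based)
    return all(pos <= len(x) and x[pos - 1] == letter for pos, letter in g.items())

def decide(x, E, tests):
    """Accept/reject words of E by searching x for the given test strings only."""
    if x not in E:
        return "discard"
    return "accept" if any(includes(x, g) for g in tests) else "reject"

# Toy problem: E = length-3 binary words, F = words containing a 1
E = {f"{i:03b}" for i in range(8)}
kernel = [{1: "1"}, {2: "1"}, {3: "1"}]   # a complete set of tests for F
print(decide("010", E, kernel))  # accept
print(decide("000", E, kernel))  # reject
print(decide("42x", E, kernel))  # discard
```

Correctness of such a decider depends exactly on the completeness of the test set, which is the content of Theorem 4 below.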
Which of these tests are actually used by P is a different story. We define the kernel of program P, noted Ker(P), to be the set of the strings from |Log_E(F)| that P actually uses for making decisions. The strings in Ker(P) are uniquely identified by the algorithm that P implements. The composition of Ker(P) in terms of strings can also be determined through experiments with the executable of P.

A concept of great relevance for the sequel is that of a complete subset of the reduced logogram |Log_E(F)| of a decision problem (E, F): We define a set H ⊆ |Log_E(F)| to be complete for problem (E, F) as soon as, for any x ∈ E, one has x ∈ F ⇔ ∃ f ∈ H (f ≤ x).

The proofs of the following two theorems are not difficult and are omitted.

Theorem 4. A necessary condition for P to correctly solve (E, F) is that Ker(P) be complete for (E, F).

Let H ⊆ |Log_E(F)| be a complete set of strings for (E, F). We define H to be irreducible for (E, F) as soon as no proper subset K ⊂ H happens to be complete for (E, F).

Theorem 5. Let |Log_E(F)| be irreducible and let programs P, Q both solve (E, F). Then Ker(P) = Ker(Q).

We first introduce a notion of pairwise independence of strings relative to a reference set E. As a next step, we define a notion of internal independence of a set E. Next we define notions of internal and strong internal independence of a decision problem.

Mutual Independence of Strings

Let f, g be any two strings in Σ^∞(E), where E is any infinite recursive set of words over the alphabet Σ. According to the definitions, f entangles g relative to E as soon as, for all x ∈ E, x ≥ f ⇒ x ≥ g. We agreed that f ⊒_E g means that f entangles g relative to E.

Observe that f fails to entangle g relative to E if and only if there exists x ∈ E such that x contains f and does not contain g. If neither f ⊒_E g nor g ⊒_E f, then f and g are said to be mutually independent relative to E; f and g are mutually dependent relative to E when they fail to be mutually independent relative to E. If f, g are incompatible, then certainly f, g are mutually independent relative to any E.

Independence of a Recursive Set
Our next step is to define the internal independence of a recursive set E. We define E to be internally independent as soon as, given any f, g ∈ Σ^∞(E), one has f ⊒_E g if and only if g is part of f, that is to say, if and only if f ≥ g.

Independence of a Decision Problem
Now we are ready to introduce the simple internal independence of a decision problem (E, F). We call (E, F) internally independent as soon as the strings in |Log_E(F)| are mutually independent taken two by two.

Theorem 6. If E is internally independent, then any decision problem (E, F) based on E as reference set exhibits the simple internal independence property.

Proof. Let E be any infinite recursive set exhibiting the internal independence property. Let (E, F) be any decision problem based on E as reference set. Let f, g be any two strings in the reduced logogram |Log_E(F)| of the problem.

(I) Assume f, g incompatible. Since f ∈ Σ^∞(E), we have E ∩ Exp(f) ≠ ∅. Let x ∈ E ∩ Exp(f). Then x is in E, x includes f and does not include g. Analogously, one can find a y ∈ E which includes g and does not include f. Thus, f, g are mutually independent in E.

(II) Assume f, g compatible. By the minimality property of the reduced logogram |Log_E(F)| it cannot be that f ≥ g. By the internal independence of the reference set E one has f ⊒_E g if and only if f ≥ g. Then, it also cannot be the case that f ⊒_E g. As a consequence, there exists x ∈ E which includes f and does not include g. Analogously, there exists y ∈ E which includes g and does not include f. Thus, again we have that f, g are mutually independent.

We conclude that problem (E, F) exhibits the simple internal independence property.

Strong Independence of a Decision Problem
Let us now come to the strong internal independence of a decision problem. We know that (E, F) is internally independent as soon as the strings in its reduced logogram |Log_E(F)| are mutually independent taken two by two.

The simple internal independence of a decision problem (E, F) certainly is a form of internal independence, but we may indeed ask for more independence: We may ask for independence of the elements of the reduced logogram |Log_E(F)| taken m by m, for all m. The following notion of internal independence of a decision problem captures this extreme form of internal independence of a problem.

We shall say that the decision problem (E, F) exhibits the property of strong internal independence if, for any choice of s distinct strings f_1, .., f_s in |Log_E(F)|, the following is true: For every i between 1 and s there exists a word x_i ∈ E such that x_i contains f_i and fails to contain any of the remaining strings in {f_1, .., f_s}. It is left to the reader to show that strong internal independence of a decision problem implies simple internal independence.

From Equation 8 we have
Log_E A_1 ∪ .. ∪ Log_E A_m ⊆ Log_E(A_1 ∪ .. ∪ A_m)    (17)

for closed A_1, .., A_m ⊆ E. Now replace m with α(n) and A_i with F_i:

Log_E F_1 ∪ .. ∪ Log_E F_α(n) ⊆ Log_E(F_1 ∪ .. ∪ F_α(n))    (18)

The strings in Log_E F_1, .., Log_E F_α(n) are witnesses. The possible strings in

Log_E(F_1 ∪ .. ∪ F_α(n)) − (Log_E F_1 ∪ .. ∪ Log_E F_α(n))    (19)

we call "wizards", since they are, so to speak, able to perceive that an input x shall be in some one of the F_i s but could not say which. The possible existence of this type of strings in the reduced logogram |Log_E(F)| of a decision problem (E, F) can be demonstrated by examples. Wizards have been found to exist in the reduced logograms of the following problems: (i) to decide if a symmetric loop-free graph is connected, (ii) to decide if a given positive integer is composite (note, incidentally, that PRIMES is in P [8]).

In a situation in which the target set F is decomposed according to ∪_i F_i = F, the witnesses are always there in the reduced logogram of the set F relative to E. On the contrary, the wizards may be missing. It pertains to the structure of the computational problem at hand whether the target set F has wizards. We conclude this section proving a theorem:

Theorem 7. If F = ∪_i F_i where the F_i s are cylinders in E, then ∪_{i=1}^{α(n)} |Log_E F_i| is complete for (E, F).

Proof. Being a union of cylinders in E, F is a cylinder in E. Being cylinders in E, the F_i s are endowed with reduced logograms. This is to say that, for i = 1, .., α(n) and any x ∈ E, one has x ∈ F_i if and only if there is g ∈ |Log_E(F_i)| such that x ≥ g. Since the target set ∪_i F_i = F is itself a cylinder in E, Equation 18 holds.

(I) Let f ∈ |Log_E F_1| ∪ .. ∪ |Log_E F_α(n)| and let x be an input word of length n such that x ∈ E and x ≥ f. We must prove x ∈ F. Very obviously we have f ∈ Log_E F_1 ∪ .. ∪ Log_E F_α(n). Since the sequence F_i has cardinality function α(n), then, for input words x ∈ E of length n, the equation ∪_i F_i = F can be rewritten F = F_1 ∪ .. ∪ F_α(n). Given F_1 ∪ .. ∪ F_α(n) = F, f ∈ Log_E(F) follows from Equation 18. Then x ∈ F follows from x ≥ f (taking into account that F is a cylinder in E).

(II) Let x ∈ E be any input word of length |x| = n. Assume x ∈ F. We must prove that there exists f ∈ |Log_E F_1| ∪ .. ∪ |Log_E F_α(n)| such that x ≥ f. Since x ∈ F and F = F_1 ∪ .. ∪ F_α(n), there exists i, 1 ≤ i ≤ α(n), such that x ∈ F_i. Since F_i is a cylinder in E, the reduced logogram |Log_E(F_i)| exists. This implies that, if y ∈ E includes a string g ∈ |Log_E(F_i)| then certainly y ∈ F_i. Conversely, if y ∈ F_i then y includes at least a string g ∈ |Log_E(F_i)|. But this is just to say that |Log_E(F_i)| is a complete subset of |Log_E(F)| for F_i relative to E, which is to say, for problem (E, F_i). Since |Log_E(F_i)| is complete for F_i relative to E, it follows from x ∈ F_i that there exists a string f ∈ |Log_E(F_i)| such that x ≥ f. Then we also have f ∈ |Log_E F_1| ∪ .. ∪ |Log_E F_α(n)|.

We have shown that, given any input word x ∈ E such that |x| = n, one has x ∈ F if and only if there exists a string f in |Log_E F_1| ∪ .. ∪ |Log_E F_α(n)| such that f ≤ x. Thus, |Log_E F_1| ∪ .. ∪ |Log_E F_α(n)| is a complete subset of |Log_E F| for F relative to E.

The encoding scheme that we adopt converts
CNF formulas into words over Σ = {0, 1, 2}. In what follows, E = CNF and F = SAT.

We represent clauses over x_1, .., x_n by sequences of n codes from Σ. Code 0 denotes absence of the variable, code 1 presence without minus, code 2 presence with minus. E.g., for n = 4 the clause x_1 ∨ x_3 ∨ −x_4 becomes 1012. A whole formula is encoded as a sequence of clauses. We define F^{nm} = satisfiable formulas with n variables and m clauses.

We introduce the sequence y_1, y_2, .. of solutions, and the corresponding sequence F_1, F_2, .. of recursive subsets of F. Here the solutions y_i are value assignments. The cardinality function is α(n) = 2^n. We assume that F = SAT as well as the regions F_1, F_2, .. are closed sets in E = CNF. Thus, all these sets are assumed to be relative cylinders in E. These assumptions correspond to known properties of SAT [5] [9].

Essentially, our application consists in investigating whether |Log_E(F)| might possibly contain strings not already in some of the |Log_E(F_i)|. Before we discuss the propositions that we were able to derive, let us spend a few words on the logogram of SAT. A string in |Log_E(F^{nm})| is a prescription that a word in F^{nm} may or may not be conformant with. We may represent a string in |Log_E(F^{nm})| as a word of length nm over {♭} ∪ Σ. Example for n = m = 3: the string ♭♭11♭2♭2♭ prescribes that the first clause shall include x_3, the second shall include x_1 and −x_3, and the third shall include −x_2. Note that the strings in |Log_E(F^{nm})| only prescribe either 1 or 2 as values (by the minimality property of the reduced logogram).

Theorem 8.
Problem (CNF, SAT) exhibits the strong internal independence property.

Proof. We consider s distinct strings f_1, .., f_s in |Log_E(F^{nm})|. Thus, regarded as a partial function, each f_i will assign only the values 1 or 2. We must prove that for each i = 1, .., s there exists a word x_i ∈ E^{nm} = CNF^{nm} such that x_i includes f_i and does not include any of the remaining strings f_1, .., f_s.

Let i be any one of the indices 1, .., s. Then Dom(f_i) ⊆ {1, .., nm} and, for all h ∈ Dom(f_i), we either have f_i(h) = 1 or f_i(h) = 2. Let x_i be that word of length nm over Σ = {0, 1, 2} such that for all h ∈ Dom(f_i) it holds that x_i(h) = f_i(h), while for h not in Dom(f_i) one has x_i(h) = 0. Then certainly x_i includes f_i.

Let f_j be any one of the strings f_1, .., f_s different from f_i. Thus, f_j ≠ f_i. We must prove that x_i does not include f_j.

(I) Assume Dom(f_j) = Dom(f_i). Since f_i and f_j are different, there is k ∈ Dom(f_i) such that f_i(k) ≠ f_j(k). But x_i(k) = f_i(k), hence x_i(k) ≠ f_j(k). Then x_i does not include f_j.

(II) Assume Dom(f_j) ≠ Dom(f_i). Then either there is a ∈ Dom(f_j) such that a ∉ Dom(f_i), or there exists b ∈ Dom(f_i) such that b ∉ Dom(f_j). Assume that a exists. Then x_i does not include f_j, since x_i(a) = 0 while f_j(a) ≠ 0, hence x_i(a) ≠ f_j(a). If only b exists, then Dom(f_j) ⊂ Dom(f_i), and x_i ≥ f_j would force f_j to be a proper restriction of f_i, against the minimality property of the reduced logogram; so again x_i does not include f_j.

Theorem 9.
The reduced logogram |Log_CNF(SAT)| does not contain wizards.

Proof. We must prove:

Log_E F^{nm}_1 ∪ .. ∪ Log_E F^{nm}_α(n) = Log_E(F^{nm})    (20)

where α(n) is the cardinality function of the sequence F_1, F_2, .. . Here F_i is the region of the value assignment y_i (the set of formulas in F that are satisfied by y_i) and is a cylinder in E. Since Equation 18 holds, we just have to prove that the right-hand side of Equation 20 does not contain wizards. We actually will prove:

|Log_E F^{nm}_1| ∪ .. ∪ |Log_E F^{nm}_α(n)| = |Log_E(F^{nm})|    (21)

which is evidently equivalent to Equation 20.

We write K^{nm} for |Log_E(F^{nm})| and, for every integer i = 1, .., α(n), we write K^{nm}_i = |Log_E(F^{nm}_i)|. We must prove K^{nm} = K^{nm}_1 ∪ .. ∪ K^{nm}_α(n).

First of all, note that the set of all witnesses K^{nm}_1 ∪ .. ∪ K^{nm}_α(n) is complete for the target set F = SAT relative to the reference set E = CNF by Theorem 7. This implies that, if x ∈ F^{nm}, then x includes a string f ∈ K^{nm}_1 ∪ .. ∪ K^{nm}_α(n).

Let h ∈ K^{nm}. Since K^{nm} is included in Σ^∞(E), we have h ∈ Σ^∞(E). Then there is an x ∈ E^{nm} such that x ≥ h. On the other side, if x is in E^{nm} and includes the string h, then x ∈ F; hence, since K^{nm}_1 ∪ .. ∪ K^{nm}_α(n) is complete for F relative to E, there shall exist a string k ∈ K^{nm}_1 ∪ .. ∪ K^{nm}_α(n) such that x ≥ k (and h, k shall have to be compatible with one another).

We then set h → k to mean that (i) k is a member of K^{nm}_1 ∪ .. ∪ K^{nm}_α(n), and (ii) there exists x ∈ E^{nm} such that both h ≤ x and k ≤ x. (Thus, h → k implies that h and k are compatible.) Besides, we introduce the set U(h) = { k | h → k } of those witnesses (members of the set K^{nm}_1 ∪ .. ∪ K^{nm}_α(n)) that are related to h.

Now, by way of contradiction, we assume that h does not belong to U(h). We then have that the elements in the set {h} ∪ U(h) are all distinct. By the strong internal independence of SAT, in correspondence to each string f ∈ {h} ∪ U(h) there exists a word x ∈ E^{nm} such that f ≤ x and for no g ∈ {h} ∪ U(h) distinct from f one has g ≤ x.

Let x ∈ E^{nm} be such that x ≥ h and for no g ∈ U(h) one has the inclusion g ≤ x. The word x is in F = SAT, since x includes h, which is an element of |Log_E(F^{nm})|. Besides, x does not contain any element from U(h). But that in turn means that x does not contain any string from the witness set K^{nm}_1 ∪ .. ∪ K^{nm}_α(n). (Should x include a string k from K^{nm}_1 ∪ .. ∪ K^{nm}_α(n), that would mean that both x ≥ h and x ≥ k hold, hence k would be related to h, which would imply k ∈ U(h).) This is absurd, since K^{nm}_1 ∪ .. ∪ K^{nm}_α(n) is complete for SAT relative to CNF.

We conclude that h is a member of U(h), and hence is in K^{nm}_1 ∪ .. ∪ K^{nm}_α(n). Since we already know that K^{nm}_1 ∪ .. ∪ K^{nm}_α(n) is a subset of K^{nm}, we conclude that K^{nm} = K^{nm}_1 ∪ .. ∪ K^{nm}_α(n). Thus, SAT has no wizards.
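For very small parameters the absence of wizards can be checked exhaustively. The sketch below (an illustrative brute-force computation of ours, with n = m = 2) recomputes the reduced logogram of F^{nm} from the encoding scheme above and asserts that every string in it is a witness prescribing exactly one literal per clause block, as Theorems 9 and 10 predict:

```python
from itertools import product

n, m = 2, 2                     # tiny instance: 2 variables, 2 clauses
SIGMA = "012"
E = ["".join(w) for w in product(SIGMA, repeat=n * m)]   # all encoded CNF formulas

def satisfiable(x):
    # code 0 = variable absent, 1 = positive literal, 2 = negated literal
    clauses = [x[i * n:(i + 1) * n] for i in range(m)]
    return any(
        all(any((c == "1" and a[j]) or (c == "2" and not a[j])
                for j, c in enumerate(cl)) for cl in clauses)
        for a in product([True, False], repeat=n))

F = {x for x in E if satisfiable(x)}

def includes(x, g):
    return all(x[p - 1] == v for p, v in g.items())

# Log_E(F^{nm}) by brute force: strings whose inclusion forces satisfiability
log = []
for mask in product([None] + list(SIGMA), repeat=n * m):
    g = {p + 1: v for p, v in enumerate(mask) if v is not None}
    if all(x in F for x in E if includes(x, g)):
        log.append(g)

# reduce: keep only strings that include no other string of the logogram
reduced = [g for g in log
           if not any(h != g and all(g.get(p) == v for p, v in h.items())
                      for h in log)]

# every reduced string is a witness: one value in {1, 2} per clause block
for g in reduced:
    assert all(v in "12" for v in g.values())
    assert sorted((p - 1) // n for p in g) == list(range(m))
print(len(reduced), "reduced-logogram strings, all of them witnesses")
```

Each surviving string selects one consistent literal per clause, i.e., it conveys a single value assignment; no collective certificate appears, in agreement with the theorem.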
Theorem 10.
The reduced logogram |Log_CNF(SAT)| is irreducible.

Proof. Let g ∈ |Log_E(F^{nm})|. By Theorem 9 we know that g must be a witness. Thus, g is a string conveying the specification of exactly one value assignment. Besides, g is minimal (no proper restriction of g is a sufficient condition for the event x ∈ F). These two facts make it a straightforward task to specify the general shape that the string g shall exhibit.

First of all, Dom(g) shall have to be a set of exactly m numbers taken from {1, .., nm}. The first of these numbers is to be taken from the first block {1, .., n} (where the first clause is allocated), the second is to be taken from the second block {n+1, .., 2n}, .., the m-th is from the m-th block {n(m−1)+1, .., nm} (where the last clause is allocated). Thus, there are n^m possible determinations for Dom(g). We know that, regarded as a prescription, g can only prescribe the two values 1 and 2. (To help intuition, the string g can be thought of as a sequence of flats ♭♭..♭ of length nm in which some of the flats (as many as m) have been replaced with 1s or 2s.)

With any g that satisfies the above requirements we associate a formula γ(g) as follows. We note that, regarded as a prescription, g prescribes the presence of exactly one literal in each clause of a formula x consisting of m clauses: We then state that the i-th clause of γ(g) shall consist of exactly the single literal that g prescribes to the i-th clause of x.

Evidently, γ(g) is satisfiable and g ≤ γ(g). We claim that γ(g) does not include members of |Log_E(F^{nm})| other than g. Indeed, the strings in |Log_E(F^{nm})| never prescribe 0 as a value, and g is the largest string included in the codeword of γ(g) which does not prescribe 0 as a value. Thus, the only strings that do not prescribe 0 as a value and happen to be included in the codeword of γ(g) are exactly the string g itself and the proper restrictions of the string g. Since g is minimal, its proper restrictions are not members of |Log_E(F^{nm})|. Thus g is the only string included in the codeword of γ(g) to be found in |Log_E(F^{nm})|. Hence, |Log_E(F^{nm})| − {g} is not complete for F^{nm} relative to E^{nm}.

The search version of a decision problem consists in obtaining solutions for a given instance x. Thus, with any NP problem (E, F) we associate the following search problem: Given x, find a solution y for x, or state that no such y exists.

It is known that, by the self-reducibility of SAT, if we had a polynomial algorithm for
SAT, then we would also have a polynomial algorithm for the search problem associated with SAT [9]. The results of the previous sections show that we can say more: It is impossible to solve SAT without at the same time solving the search problem associated with SAT.

These remarks suggest that we may wish to focus on the search problem associated with SAT. This is what we do in this section.

Given any NP problem (E, F), we introduce the cover of the target set F associated with |Log_E(F)| to be the family of sets

    D_E(F) = { Exp_E(g) ⊆ F : g ∈ |Log_E(F)| }.    (22)

Its members are the charts, or else regions, of the cover. The cover that is associated with the kernel of a program P solving (E, F) is then

    F_P(E, F) = { Exp_E(g) ⊆ F : g ∈ Ker(P) }.    (23)

Both D_E(F) and F_P(E, F) are families of subsets of the target set F whose union is F, with F_P(E, F) being a subfamily of D_E(F).

For SAT we have the following situation: F_P(E, F) = D_E(F) by Theorem 10, and the strings in |Log_E(F)| are all witnesses by Theorem 9. Thus any of these strings, call it g, has an associated relativized cylinder Exp_E(g) fully included in only one of the regions F_i.

Since for E = CNF, F = SAT the set Exp_E(g) is actually the intersection of two absolute cylinder sets, Exp(g) and E, Exp_E(g) itself is an absolute cylinder. In general, Exp_E(g) will intersect certain other regions F_h, F_k, .., but there exists only one region F_j which completely includes Exp_E(g). Besides, every region F_i shall have to include at least one such elementary relativized cylinder Exp_E(g).

As a consequence, the cardinality of the cover D_E(F) cannot be smaller than that of the family of regions F_i, hence it is exponential.

8 Remarks on the Time Complexity of SAT
In the rest of this section we make remarks on the time complexity of SAT in the light of Theorems 8, 9, 10. We will be less formal than in the previous sections. Our remarks consist of two parts:
Part One
It follows from Theorem 10 that there is a unique subfamily G of D_E(F) such that F = ⋃G, namely D_E(F) itself. As a consequence, for any proper subfamily G ⊂ D_E(F) one has F ≠ ⋃G.

We then have that F_P(E, F) cannot be a proper subfamily of the full cover D_E(F): Otherwise we would have F ≠ ⋃F_P(E, F), and then P could not solve (E, F). Since D_E(F) is exponential, F_P(E, F) is not allowed to be a polynomial subfamily of D_E(F): No search algorithm for SAT can search only a polynomial family of sets.
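The counting behind this exponential lower bound can be sketched with a toy enumeration, under the encoding used in the irreducibility proof above (m blocks of n positions, admissible values 1 and 2); the helper names are ours, not the paper's:

```python
from itertools import product

def witness_domains(n, m):
    """All possible domains Dom(g): one position chosen from each of
    the m consecutive blocks of length n (positions 1 .. n*m)."""
    blocks = [range(i * n + 1, (i + 1) * n + 1) for i in range(m)]
    return list(product(*blocks))

def witnesses(n, m):
    """Witness strings g represented as {position: value} maps; each
    chosen position carries one of the two admissible values 1, 2."""
    return [dict(zip(dom, vals))
            for dom in witness_domains(n, m)
            for vals in product((1, 2), repeat=m)]

n, m = 3, 4
assert len(witness_domains(n, m)) == n ** m    # choices for Dom(g)
assert len(witnesses(n, m)) == (2 * n) ** m    # exponential in m
```

Since each witness g contributes its own region Exp_E(g) to the cover, already this toy count grows exponentially with the number of clauses m.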
Part Two
It remains for us to discuss the possibility that one single algorithm can solve the full search problem for x by directly searching the full exponential family D_E(F) in polynomial time. However, this can scarcely be the case, due to the complete absence of any form of dependence among subsets in the reduced logogram |Log_E(F)| for E = CNF, F = SAT. By this lack of internal dependence, any computation of a program P solving (CNF, SAT) is such that the result of any computation step does not change the results that are left possible for the subsequent steps. In the rest of this part we make a few informal remarks on how this lack of dependence comes into play.

We take a general purpose program machine M as computation model. (That M is a program machine means that the process carried out by M is determined by a running program.) We assume that only one program is running at any moment of time within M. We keep machine M fixed while we consider an infinite set of programs solving SAT (actually the set of all programs that run on M and solve SAT). We emphasize that the hardware is kept fixed while different programs all running on that hardware are compared.

Let B(x, m) be a program which, for any given input x of size n and every integer m between 1 and 2^n, will decide if x has solutions in the range between y_1 and y_m. Let Time_B(x, m) be the number of time units that B uses on inputs x, m.

We will make remarks that convey evidence for the following statement: If for any x and m < 2^n we have Time_B(x, m) = Time_B(x, m+1), then we may replace B with a new program C running on M and such that Time_C(x, m) < Time_C(x, m+1) = Time_B(x, m+1).

Indeed, under the above hypotheses on M, we can speak of the class of all programs B, C, .. that solve SAT on machine M, and we can introduce a most efficient program A in this class.
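For concreteness, the range-restricted decider B(x, m) can be sketched as follows; the CNF encoding (clauses as lists of signed integers) and the fixed enumeration order y_1, y_2, .. of assignments are our own assumptions, used only for illustration:

```python
from itertools import islice, product

def assignments(n_vars):
    """A fixed enumeration y_1, y_2, .., y_{2^n} of all assignments."""
    return product([False, True], repeat=n_vars)

def satisfies(cnf, y):
    """Check assignment y against a CNF given as lists of signed
    integers: +i means x_i, -i means the negation of x_i."""
    return all(any(y[abs(l) - 1] == (l > 0) for l in c) for c in cnf)

def B(cnf, n_vars, m):
    """Decide whether cnf has a solution among the first m
    assignments y_1, .., y_m of the enumeration."""
    return any(satisfies(cnf, y) for y in islice(assignments(n_vars), m))

x = [[1, -2], [-1, 2]]            # encodes x1 <-> x2
assert B(x, 2, 1) is True         # y_1 = (False, False) already satisfies x
assert B([[1], [-1]], 1, 2) is False
```

Running B for m = 2^n answers the full decision problem, which is why the timing behavior of B as m grows is the quantity of interest here.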
We understand that A is a most efficient program as soon as Time_A(x, n) ≤ Time_C(x, n) for any other program C on any input word x.

It is sufficient for us to give a hint for Time_A(x, 1) < Time_A(x, 2). Since |Log_E(F_1 ∪ F_2)| = |Log_E(F_1)| ∪ |Log_E(F_2)| and |Log_E(F_1)| ∩ |Log_E(F_2)| = ∅, a computation that implements the collection of tests in |Log_E(F_1 ∪ F_2)| consists of two distinct computations, one implementing collection |Log_E(F_1)| and the other implementing collection |Log_E(F_2)|. Thus, computation A(x, 1) being a proper prefix of computation A(x, 2) is compatible with the assumed optimality of A, whence Time_A(x, 1) < Time_A(x, 2).

Our theory has roots in the body of formalized concepts referred to as Scott's theory of computation [10]. Thus, the reduced logogram |Log_E(F)| associated with problem (E, F) is an information system [6], [7] (however, the very important relation is not entailment but entanglement). Even more relevant are the relationships with the "dynamical" part of Scott's theory, the one regarding computations as sequences of steps through which the running program's knowledge increases [3]. We also, in this latter respect, used concepts from the model-theoretic analysis of program knowledge [11].

In this section we briefly review relationships of the above theory with formalisms that ascribe knowledge to a running program.

In Scott's theory the computations that program P does are functionally equivalent to sequences of tokens (or tests) being consistent with the input string x. In our developments, the "tests" or "tokens" are identified with the strings in Ker(P). In Scott's theory, the state of knowledge of a running program P consists of a pile of assertions. These are consistent (indeed, they are propositions that are true of one and the same object x). As soon as the pile becomes a decisive one, the program makes its decision and stops. Our addition is: The "assertions" are of the form x ∈ Exp(g) or else x ∉ Exp(f), where f, g ∈ Ker(P).

Searching x for a string g amounts to the same as asking whether x happens to belong to the absolute elementary cylinder Exp(g) associated with g. We thus arrive at the conclusion that all that P can possibly do to make a decision consists in asking questions of this form. Thus, the computations that P performs are just sequences of tests in disguise.
Note that P does not have to ask whether x is in Exp_E(g), since P already knows that x is in E. (This is an important point, since asking whether x is in Exp_E(g) would be more computationally expensive.)

In this theory, information regarding x is acquired by P in lumps. The acquisition of a piece of information occurs at the moment when the execution of a sequence of tests is completed (i.e., when the computation that implements that sequence of tests is completed). We may well think of a piece of information as a piece of paper carrying a written note such as "x is in Exp(g)" or "x fails to be in Exp(g)." These notes stack one upon the other until the pile becomes a decisive one: This is the case when the data that was gathered entails one of the events x ∈ F or else x ∈ E − F.

Note that loading an input x in memory does not imply computations; hence no tests are made on x while loading, and hence no knowledge is acquired about x. After loading x, the pile of assertions that represents the program's knowledge is empty.
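This view of a decision program as a pile of cylinder-membership assertions can be made concrete in a minimal sketch. Everything here is illustrative: strings are modeled as {position: letter} maps (0-based positions), and the sample kernel is hypothetical, not an actual |Log_E(F)|:

```python
def in_cylinder(x, g):
    """Test the assertion x ∈ Exp(g): word x agrees with string g
    at every position in Dom(g)."""
    return all(pos < len(x) and x[pos] == ch for pos, ch in g.items())

def decide(x, kernel):
    """A decision program as 'a sequence of tests in disguise':
    stack assertions 'x ∈ Exp(g)' / 'x ∉ Exp(g)' until the pile
    becomes decisive."""
    pile = []
    for g in kernel:
        hit = in_cylinder(x, g)
        pile.append((g, hit))   # one lump of information
        if hit:
            return True         # decisive: x ∈ F
    # all tests negative: decisive for x ∈ E - F, assuming the
    # kernel is complete for F relative to E
    return False

kernel = [{0: 'a', 2: 'b'}, {1: 'c'}]   # hypothetical strings
assert decide("acb", kernel) is True    # first test already decisive
assert decide("xyz", kernel) is False
```

Note that, as discussed above, the tests are against the absolute cylinders Exp(g): the sketch never tests membership in E, which is taken as already known.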
10 Conclusions
We advocated strings (with a special meaning for the term) as a fundamental notion for studies of computation. So to speak, strings are needed to express the notions of internal and strong internal independence of a decision problem that underlie our theory of decision problems. We were led to formulate strings in order to become able to derive the very basic notion of internal independence of a decision problem. Strings seem to be useful since they are absolutely elementary. Note that they are already at work in Computability. The "restrictions" that are often used in the study of circuit complexity are finite Boolean versions of the strings [9].

Strings are not made of consecutive letters. A string can be interspersed in a word: By canceling zero or more letters in a word x, and by leaving blanks in place of the letters, we get a string f which is a substring of the original word x. In a string, one has information associated with the spaces between letters (and hence with the possible multiple periodicities with which letters may occur).

As soon as we have the strings, we are able to define the kernel Ker(P) of a decision program P, a set of strings which captures structural features of both program P and the decision problem (E, F) that P solves. Ker(P) is a subset of the reduced logogram |Log_E(F)| of target set F in base E. The reduced logogram consists of substrings of the words in F which exhibit the following property: If a word in E includes one of these substrings then it belongs to F. We may think of the strings in |Log_E(F)| as a kind of genes of the words in F. (In early notes the logogram was the jinnee, or genie, of problem (E, F).) The idea clearly comes from biology, where it is known that certain occurrences at given intervals of certain letters within DNA sequences convey structural information, and yield observable characters in the macroscopic development of the structures.

Our application to SAT uses a structural property of that problem that seems to have escaped attention so far. We called it "strong internal independence." Theorem 8 shows that SAT exhibits the strong internal independence property. Theorem 9 shows that, by that property, SAT cannot have collective certificates in its reduced logogram. As a consequence, all the programs that solve SAT have the same kernel (Theorem 10).

The remarks in Section 8 suggest how Theorems 8, 9, 10 can possibly be used to put SAT under scrutiny. Our ultimate concern in this paper has been to set forth our developments as a possible new technique to attack decision problems, where "technique" is used in the sense that Hemaspaandra and Ogihara gave to this term in the preface of their "Companion."
11 Acknowledgements
In the development of this research I received advice from Professors Fabrizio Luccio, Johan Hastad, Giancarlo Mauri, and Claudio Procesi. These results would not have been achieved without that help.
References

[1] Kirousis L. and Kolaitis P. The complexity of minimal satisfiability problems. Information and Computation, 187 (2003), 20-39.

[2] Odifreddi P. Classical Recursion Theory. North-Holland, 1989.

[3] Gierz G., Hofmann K., Keimel K., Lawson J. D., Mislove M. W., and Scott D. Continuous Lattices and Domains. Cambridge University Press, 2003.

[4] Birkhoff G. Lattice Theory. AMS Colloquium Publications, Volume 25.

[5] Balcazar J., Diaz J., and Gabarro J. Structural Complexity II. Springer, 1990.

[6] Scott D. Domains for denotational semantics. ICALP 82, Lecture Notes in Computer Science 140, Springer, 1982.

[7] Larsen K. G. and Winskel G. Using information systems to solve recursive domain equations. Information and Computation, 91 (1991), 232-258.

[8] Agrawal M., Kayal N., and Saxena N. PRIMES is in P. Annals of Mathematics, 160 (2004), 781-793.

[9] Hemaspaandra L. A. and Ogihara M. The Complexity Theory Companion. Springer, 2002.

[10] Di Zenzo S., Bottoni P., and Mussio P. A notion of information related to computation. Information Processing Letters, 64 (1997), 207-215.

[11] Fagin R., Halpern J. Y., and Vardi M. Y.