A First Class Boolean Sort in First-Order Theorem Proving and TPTP
Evgenii Kotelnikov, Laura Kovács ⋆⋆, and Andrei Voronkov ⋆⋆⋆
Chalmers University of Technology, Gothenburg, Sweden
The University of Manchester, Manchester, UK
Abstract.
To support reasoning about properties of programs operating with boolean values, one needs theorem provers to be able to natively deal with the boolean sort. This way, program properties can be translated to first-order logic and theorem provers can be used to prove program properties efficiently. However, in the TPTP language, the input language of automated first-order theorem provers, the use of the boolean sort is limited compared to other sorts, thus hindering the use of first-order theorem provers in program analysis and verification. In this paper, we present an extension FOOL of many-sorted first-order logic, in which the boolean sort is treated as a first-class sort. Boolean terms are indistinguishable from formulas and can appear as arguments to functions. In addition, FOOL contains if-then-else and let-in constructs. We define the syntax and semantics of FOOL and its model-preserving translation to first-order logic. We also introduce a new technique of dealing with boolean sorts in superposition-based theorem provers. Finally, we discuss how the TPTP language can be changed to support FOOL.

Automated program analysis and verification requires discovering and proving program properties. Typical examples of such properties are loop invariants or Craig interpolants. These properties are usually expressed in combined theories of various data structures, such as integers and arrays, and hence require reasoning with both theories and quantifiers. Recent approaches in interpolation and loop invariant generation [14,12,10] present initial results of using first-order theorem provers for generating quantified program properties. First-order theorem provers can also be used to generate program properties with quantifier alternations [12]; such properties could not be generated fully automatically by any previously known method. Using first-order theorem provers to generate, and not only prove, program properties opens new directions in analysis and verification of real-life programs.

⋆ The final publication is available at http://link.springer.com.
⋆⋆ The first two authors were partially supported by the Wallenberg Academy Fellowship 2014, the Swedish VR grant D0497701, and the Austrian research project FWF S11409-N23.
⋆⋆⋆ Partially supported by the EPSRC grant "Reasoning in Verification and Security".

First-order theorem provers, such as iProver [11], E [18], and Vampire [13], however lack various features that are crucial for program analysis. For example, first-order theorem provers do not yet efficiently handle (combinations of) theories; nevertheless, sound but incomplete theory axiomatisations can be used in a first-order prover even for theories having no finite axiomatisation. Another difficulty in modelling properties arising in program analysis using theorem provers is the gap between the semantics of expressions used in programming languages and the expressiveness of the logic used by the theorem prover. A similar gap exists between the language used in presenting mathematics and the logic of the prover. For example, a standard way to capture assignment in program analysis is to use a let-in expression, which introduces a local binding of a variable, or a function for array assignments, to a value. There is no local binding expression in first-order logic, which means that any modelling of imperative programs using first-order theorem provers at the backend should implement a translation of let-in expressions. Similarly, mathematicians commonly use local definitions within definitions and proofs. Some functional programming languages also contain expressions introducing local bindings. In all three cases, to facilitate the use of first-order provers, one needs a theorem prover implementing let-in constructs natively.

Efficiency of reasoning-based program analysis largely depends on how programs are translated into a collection of logical formulas capturing the program semantics. The boolean structure of a program property that can be efficiently treated by a theorem prover is however very sensitive to the architecture of the reasoning engine of the prover.
Deriving and expressing program properties in the "right" format therefore requires solid knowledge about how theorem provers work and are implemented, something that a user of a verification tool might not have. Moreover, it can be hard to efficiently reason about certain classes of program properties unless special inference rules and heuristics are added to the theorem prover; see e.g. [8] for proving properties of data collections with extensionality axioms.

In order to increase the expressiveness of program properties generated by reasoning-based program analysis, the language of logical formulas accepted by a theorem prover needs to be extended with constructs of programming languages. This way, a straightforward translation of programs into first-order logic can be achieved, thus relieving users from designing translations which can be efficiently treated by the theorem prover. One example of such an extension was recently added to the TPTP language [19] of first-order theorem provers, resembling if-then-else and let-in expressions that are common in programming languages. Namely, special functions $ite_t and $ite_f can respectively be used to express a conditional statement on the level of logical terms and formulas, and $let_tt, $let_tf, $let_ff and $let_ft can be used to express local variable bindings for all four possible combinations of logical terms (t) and formulas (f). While satisfiability modulo theory (SMT) solvers, such as Z3 [6] and CVC4 [2], integrate if-then-else and let-in expressions, in the first-order theorem proving community so far only Vampire supports such expressions.

To illustrate the advantage of using if-then-else and let-in expressions in automated provers, let us consider the following simple example.
We are interested in verifying the partial correctness of the code fragment below:

    if (r(a)) {a := a + 1} else {a := a + q(a)}

using the pre-condition \(((\forall x)\, P(x) \Rightarrow x \geq 0) \land ((\forall x)\, q(x) > 0) \land P(a)\) and the post-condition \(a > 0\). Let a1 denote the value of the program variable a after the execution of the if-statement. Using if-then-else and let-in expressions, the next state function for a can naturally be expressed by the following formula:

    a1 = if r(a) then let a = a + 1 in a
                 else let a = a + q(a) in a

This formula can further be encoded in TPTP, and hence used by a theorem prover as a hypothesis in proving partial correctness of the above code snippet. We illustrate below the TPTP encoding of the first-order problem corresponding to the partial program correctness problem we consider. Note that the pre-condition becomes a hypothesis in TPTP, whereas the proof obligation given by the post-condition is a TPTP conjecture. All formulas below are typed first-order formulas (tff) in TPTP that use the built-in integer sort ($int).

    tff(1, type, p : $int > $o).
    tff(2, type, q : $int > $int).
    tff(3, type, r : $int > $o).
    tff(4, type, a : $int).
    tff(5, type, a1 : $int).
    tff(6, hypothesis, ! [X : $int] : (p(X) => $greatereq(X, 0))).
    tff(7, hypothesis, ! [X : $int] : ($greater(q(X), 0))).
    tff(8, hypothesis, p(a)).
    tff(9, hypothesis,
        a1 = $ite_t(r(a), $let_tt(a, $sum(a, 1), a),
                          $let_tt(a, $sum(a, q(a)), a))).
    tff(10, conjecture, $greater(a1, 0)).

Running a theorem prover that supports $ite_t and $let_tt on this TPTP problem would prove the partial correctness of the program we considered. Note that without the use of if-then-else and let-in expressions, a more tedious translation is needed for expressing the next state function of the program variable a as a first-order formula. When considering more complex programs containing multiple conditional expressions, assignments and composition, computing the next state function of a program variable results in a formula of size exponential in the number of conditional expressions.
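The size blow-up can be illustrated by a minimal sketch, under the assumption (made here only for illustration) that the conditional of the example is executed k times in sequence. The helper names `size`, `inlined` and `with_let` are hypothetical; terms are nested tuples and `size` counts symbol occurrences.

```python
# Terms are nested tuples ("symbol", arg1, ..., argn); variables and numerals are strings.

def size(term):
    """Number of symbol occurrences in a term."""
    if isinstance(term, str):
        return 1
    return 1 + sum(size(arg) for arg in term[1:])

def inlined(k):
    """Next-state term for k sequential conditionals, with every earlier
    value of `a` substituted in place (no let-in)."""
    a = "a"
    for _ in range(k):
        a = ("ite", ("r", a), ("sum", a, "1"), ("sum", a, ("q", a)))
    return a

def with_let(k):
    """The same next-state function, but each intermediate value is bound
    once by let-in, so every step adds a constant number of symbols."""
    step = ("ite", ("r", "a"), ("sum", "a", "1"), ("sum", "a", ("q", "a")))
    body = "a"  # innermost body: the final value of a
    for _ in range(k):
        body = ("let", "a", step, body)  # `a` inside `step` refers to the enclosing binding
    return body

sizes = [(k, size(inlined(k)), size(with_let(k))) for k in range(1, 6)]
for k, n_inlined, n_let in sizes:
    print(k, n_inlined, n_let)
```

In this toy setting the inlined term roughly quadruples in size with every additional conditional, whereas the let-in form grows by a fixed amount per step.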
This problem of computing the next state function of variables is well known in the program analysis community, where it is addressed by computing so-called static single assignment (SSA) forms. Using the if-then-else and let-in expressions recently introduced in TPTP and already implemented in Vampire [7], one can have a linear-size translation instead.

Let us however note that the usage of conditional expressions in TPTP is somewhat limited. The first argument of $ite_t and $ite_f is a logical formula, which means that a boolean condition from the program definition should be translated as such. At the same time, the same condition can be treated as a value in the program, for example, in the form of a boolean flag passed as an argument to a function. Yet we cannot mix terms and formulas in the same way in a logical statement. A possible solution would be to map the boolean type of programs to a user-defined boolean sort, postulate axioms about its semantics, and manually convert boolean terms into formulas where needed. This approach, however, suffers from the disadvantages mentioned earlier, namely the need to design a special translation and its possible inefficiency.

Handling boolean terms as formulas is needed not only in applications of reasoning-based program analysis, but also in various problems of formalisation of mathematics. For example, if one looks at the two largest kinds of attempts to formalise mathematics and proofs, those performed by interactive proof assistants, such as Isabelle [16], and the Mizar project [21], one can see that first-order theorem provers are the main workhorses behind computer proofs in both cases; see e.g. [5,22]. Interactive theorem provers, such as Isabelle, routinely use quantifiers over booleans.
Let us illustrate this by the following examples, chosen among 490 properties about (co)algebraic datatypes, featuring quantifiers over booleans, generated by Isabelle and kindly found for us by Jasmin Blanchette. Consider the distributivity of a conditional expression (denoted by the ite function) over logical connectives, a pattern that is widely used in reasoning about properties of data structures. For lists and the contains function that checks that its first argument contains the second one, we have the following example:

\[
\begin{aligned}
&(\forall p : \mathit{bool})(\forall l : \mathit{list}_A)(\forall x : A)(\forall y : A) \\
&\quad \mathit{contains}(l, \mathrm{ite}(p, x, y)) \doteq (p \Rightarrow \mathit{contains}(l, x)) \land (\neg p \Rightarrow \mathit{contains}(l, y))
\end{aligned}
\tag{1}
\]

A more complex example with a heavy use of booleans is the unsatisfiability of the definition of subset_sorted. The subset_sorted function takes two sorted lists and checks that its first argument is a sublist of the second one.

\[
\begin{aligned}
&(\forall l_1 : \mathit{list}_A)(\forall l_2 : \mathit{list}_A)(\forall p : \mathit{bool}) \\
&\neg(\mathit{subset\_sorted}(l_1, l_2) \doteq p \;\land \\
&\quad (\forall l_2' : \mathit{list}_A)\, \neg(l_1 \doteq \mathit{nil} \land l_2 \doteq l_2' \land p) \;\land \\
&\quad (\forall x_1 : A)(\forall l_1' : \mathit{list}_A)\, \neg(l_1 \doteq \mathit{cons}(x_1, l_1') \land l_2 \doteq \mathit{nil} \land \neg p) \;\land \\
&\quad (\forall x_1 : A)(\forall l_1' : \mathit{list}_A)(\forall x_2 : A)(\forall l_2' : \mathit{list}_A) \\
&\qquad \neg(l_1 \doteq \mathit{cons}(x_1, l_1') \land l_2 \doteq \mathit{cons}(x_2, l_2') \;\land \\
&\qquad\quad p \doteq \mathrm{ite}(x_1 < x_2, \mathit{false}, \mathrm{ite}(x_1 \doteq x_2, \mathit{subset\_sorted}(l_1', l_2'), \mathit{subset\_sorted}(\mathit{cons}(x_1, l_1'), l_2')))))
\end{aligned}
\tag{2}
\]

Formulas with boolean terms are also common in the SMT-LIB project [3], the collection of benchmarks for SMT-solvers. Its core logic is a variant of first-order logic that treats boolean terms as formulas, in which logical connectives and conditional expressions are defined in the core theory.

In this paper we propose a modification FOOL of first-order logic, which includes a first-class boolean sort and if-then-else and let-in expressions, aimed at being used in automated first-order theorem proving. It is the smallest logic that contains both the SMT-LIB core theory and the monomorphic first-order subset of TPTP.
The syntax and semantics of the logic are given in Section 2. We further describe how FOOL can be translated to the ordinary many-sorted first-order logic in Section 3. Section 4 discusses superposition-based theorem proving and proposes a new way of dealing with the boolean sort in it. In Section 5 we discuss the support of the boolean sort in TPTP and propose changes to it required to support a first-class boolean sort. We point out that such changes can also partially simplify the syntax of TPTP. Section 6 discusses related work and Section 7 contains concluding remarks.

The main contributions of this paper are the following:
1. the definition of FOOL and its semantics;
2. a translation from FOOL to first-order logic, which can be used to support FOOL in existing first-order theorem provers;
3. a new technique of dealing with the boolean sort in superposition theorem provers, allowing one to replace boolean sort axioms by special rules;
4. a proposal of a change to the TPTP language, intended to support FOOL and also simplify if-then-else and let-in expressions.

First-order logic with the boolean sort (FOOL) extends many-sorted first-order logic (FOL) in two ways:
1. formulas can be treated as terms of the built-in boolean sort; and
2. one can use if-then-else and let-in expressions defined below.
FOOL is the smallest logic containing both the SMT-LIB core theory and the monomorphic first-order part of the TPTP language. It extends the SMT-LIB core theory by adding let-in expressions defining functions and TPTP by the first-class boolean sort.

We assume a countably infinite set of variables.

Definition 1. A signature of first-order logic with the boolean sort is a triple \(\Sigma = (S, F, \eta)\), where:
1. \(S\) is a set of sorts, which contains a special sort bool. A type is either a sort or a non-empty sequence \(\sigma_1, \ldots, \sigma_n, \sigma\) of sorts, written as \(\sigma_1 \times \ldots \times \sigma_n \to \sigma\). When \(n = 0\), we will simply write \(\sigma\) instead of \(\to \sigma\).
   We call a type assignment a mapping from a set of variables and function symbols to types, which maps variables to sorts.
2. \(F\) is a set of function symbols. We require \(F\) to contain binary function symbols \(\lor\), \(\land\), \(\Rightarrow\) and \(\Leftrightarrow\), used in infix form, a unary function symbol \(\neg\), used in prefix form, and nullary function symbols true, false.
3. \(\eta\) is a type assignment which maps each function symbol \(f\) into a type \(\tau\). When the signature is clear from the context, we will write \(f : \tau\) instead of \(\eta(f) = \tau\) and say that \(f\) is of the type \(\tau\).
We require the symbols \(\lor\), \(\land\), \(\Rightarrow\), \(\Leftrightarrow\) to be of the type \(\mathit{bool} \times \mathit{bool} \to \mathit{bool}\), \(\neg\) to be of the type \(\mathit{bool} \to \mathit{bool}\) and true, false to be of the type bool. ❏

In the sequel we assume that \(\Sigma = (S, F, \eta)\) is an arbitrary but fixed signature.

To define the semantics of FOOL, we will have to extend the signature and also assign sorts to variables. Given a type assignment \(\eta\), we define \(\eta, x : \sigma\) to be the type assignment that maps a variable \(x\) to \(\sigma\) and coincides otherwise with \(\eta\). Likewise, we define \(\eta, f : \tau\) to be the type assignment that maps a function symbol \(f\) to \(\tau\) and coincides otherwise with \(\eta\).

Our next aim is to define the set of terms and their sorts with respect to a type assignment \(\eta\). This will be done using a relation \(\eta \vdash t : \sigma\), where \(\sigma \in S\); terms can then be defined as all such expressions \(t\).

Definition 2.
The relation \(\eta \vdash t : \sigma\), where \(t\) is an expression and \(\sigma \in S\), is defined inductively as follows. If \(\eta \vdash t : \sigma\), then we will say that \(t\) is a term of the sort \(\sigma\) w.r.t. \(\eta\).
1. If \(\eta(x) = \sigma\), then \(\eta \vdash x : \sigma\).
2. If \(\eta(f) = \sigma_1 \times \ldots \times \sigma_n \to \sigma\), \(\eta \vdash t_1 : \sigma_1\), ..., \(\eta \vdash t_n : \sigma_n\), then \(\eta \vdash f(t_1, \ldots, t_n) : \sigma\).
3. If \(\eta \vdash \varphi : \mathit{bool}\), \(\eta \vdash t_1 : \sigma\) and \(\eta \vdash t_2 : \sigma\), then \(\eta \vdash (\text{if } \varphi \text{ then } t_1 \text{ else } t_2) : \sigma\).
4. Let \(f\) be a function symbol and \(x_1, \ldots, x_n\) pairwise distinct variables. If \(\eta, x_1 : \sigma_1, \ldots, x_n : \sigma_n \vdash s : \sigma\) and \(\eta, f : (\sigma_1 \times \ldots \times \sigma_n \to \sigma) \vdash t : \tau\), then \(\eta \vdash (\text{let } f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s \text{ in } t) : \tau\).
5. If \(\eta \vdash s : \sigma\) and \(\eta \vdash t : \sigma\), then \(\eta \vdash (s \doteq t) : \mathit{bool}\).
6. If \(\eta, x : \sigma \vdash \varphi : \mathit{bool}\), then \(\eta \vdash (\forall x : \sigma)\varphi : \mathit{bool}\) and \(\eta \vdash (\exists x : \sigma)\varphi : \mathit{bool}\). ❏

We only defined a let-in expression for a single function symbol. It is not hard to extend it to a let-in expression that binds multiple pairwise distinct function symbols in parallel; the details of such an extension are straightforward.

When \(\eta\) is the type assignment function of \(\Sigma\) and \(\eta \vdash t : \sigma\), we will say that \(t\) is a \(\Sigma\)-term of the sort \(\sigma\), or simply that \(t\) is a term of the sort \(\sigma\). It is not hard to argue that every \(\Sigma\)-term has a unique sort.

According to our definition, not every term-like expression has a sort. For example, if \(x\) is a variable and \(\eta\) is not defined on \(x\), then \(x\) is not a term w.r.t. \(\eta\). To make the relation between term-like expressions and terms clear, we introduce a notion of free and bound occurrences of variables and function symbols. We call the following occurrences of variables and function symbols bound:
1. any occurrence of \(x\) in \((\forall x : \sigma)\varphi\) or in \((\exists x : \sigma)\varphi\);
2. in the term \(\text{let } f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s \text{ in } t\), any occurrence of a variable \(x_i\) in \(f(x_1 : \sigma_1, \ldots, x_n : \sigma_n)\) or in \(s\), where \(i = 1, \ldots, n\);
3. in the term \(\text{let } f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s \text{ in } t\), any occurrence of the function symbol \(f\) in \(f(x_1 : \sigma_1, \ldots, x_n : \sigma_n)\) or in \(t\).
All other occurrences are called free. We say that a variable or a function symbol is free in a term \(t\) if it has at least one free occurrence in \(t\). A term is called closed if it has no occurrences of free variables.

Theorem 1.
Suppose \(\eta \vdash t : \sigma\). Then
1. for every free variable \(x\) of \(t\), \(\eta\) is defined on \(x\);
2. for every free function symbol \(f\) of \(t\), \(\eta\) is defined on \(f\);
3. if \(x\) is a variable not free in \(t\), and \(\sigma'\) is an arbitrary sort, then \(\eta, x : \sigma' \vdash t : \sigma\);
4. if \(f\) is a function symbol not free in \(t\), and \(\tau\) is an arbitrary type, then \(\eta, f : \tau \vdash t : \sigma\). ❏

Definition 3. A predicate symbol is any function symbol of the type \(\sigma_1 \times \ldots \times \sigma_n \to \mathit{bool}\). A \(\Sigma\)-formula is a \(\Sigma\)-term of the sort bool. All \(\Sigma\)-terms that are not \(\Sigma\)-formulas are called non-boolean terms. ❏

Note that, even apart from the use of let-in and if-then-else, FOOL is a proper extension of first-order logic. For example, in FOOL formulas can be used as arguments to functions and one can quantify over booleans. As a consequence, every quantified boolean formula is a formula in FOOL.
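The typing relation of Definition 2 can be read as a recursive sort checker. The following is a minimal sketch, not a reference implementation: the tuple encoding of expressions, the name `sort_of`, and the representation of a type assignment as a Python dictionary (variables map to sort names, function symbols to pairs of argument sorts and result sort) are all assumptions made for illustration.

```python
# Illustrative encoding of expressions (not TPTP syntax):
#   ("var", x)                           variable
#   ("app", f, [t1, ..., tn])            application; constants have n = 0
#   ("ite", phi, t1, t2)                 if phi then t1 else t2
#   ("let", f, [(x1, s1), ...], s, t)    let f(x1 : s1, ...) = s in t
#   ("eq", s, t)                         equality
#   ("forall" | "exists", x, sort, phi)  quantifiers
BOOL = "bool"

def sort_of(eta, e):
    """Return sigma such that eta |- e : sigma, or raise TypeError."""
    tag = e[0]
    if tag == "var":                                    # rule 1
        sort = eta.get(e[1])
        if not isinstance(sort, str):
            raise TypeError(f"unbound variable {e[1]}")
        return sort
    if tag == "app":                                    # rule 2
        _, f, args = e
        ftype = eta.get(f)
        if not isinstance(ftype, tuple):
            raise TypeError(f"undeclared function symbol {f}")
        arg_sorts, result = ftype
        if [sort_of(eta, a) for a in args] != list(arg_sorts):
            raise TypeError(f"ill-sorted arguments of {f}")
        return result
    if tag == "ite":                                    # rule 3
        _, phi, t1, t2 = e
        if sort_of(eta, phi) != BOOL:
            raise TypeError("condition is not boolean")
        s1, s2 = sort_of(eta, t1), sort_of(eta, t2)
        if s1 != s2:
            raise TypeError("branches have different sorts")
        return s1
    if tag == "let":                                    # rule 4
        _, f, params, s, t = e
        if len({x for x, _ in params}) != len(params):
            raise TypeError("parameters must be pairwise distinct")
        sigma = sort_of({**eta, **dict(params)}, s)     # f is not bound in s
        ftype = (tuple(srt for _, srt in params), sigma)
        return sort_of({**eta, f: ftype}, t)
    if tag == "eq":                                     # rule 5
        _, s, t = e
        if sort_of(eta, s) != sort_of(eta, t):
            raise TypeError("equality between different sorts")
        return BOOL
    if tag in ("forall", "exists"):                     # rule 6
        _, x, sort, phi = e
        if sort_of({**eta, x: sort}, phi) != BOOL:
            raise TypeError("quantified body is not boolean")
        return BOOL
    raise TypeError(f"unknown expression {e!r}")

eta = {
    "and": ((BOOL, BOOL), BOOL),
    "p": (("int",), BOOL),
    "flag": ((BOOL,), "int"),   # hypothetical function with a boolean argument
    "c": ((), "int"),
}
# flag(p(c) and (forall b : bool) b): a formula used as an argument
term = ("app", "flag", [("app", "and", [("app", "p", [("app", "c", [])]),
                                         ("forall", "b", BOOL, ("var", "b"))])])
print(sort_of(eta, term))
```

The example term uses a formula as an argument of a function and quantifies over a boolean variable, the two features named above that take FOOL beyond first-order logic.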
As usual, the semantics of FOOL is defined by introducing a notion of interpretation and defining how a term is evaluated in an interpretation.
Definition 4.
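Let \(\eta\) be a type assignment. An \(\eta\)-interpretation \(I\) is a map, defined as follows. Instead of \(I(e)\) we will write \(\llbracket e \rrbracket_I\), for every element \(e\) in the domain of \(I\).
1. Each sort \(\sigma \in S\) is mapped to a nonempty domain \(\llbracket \sigma \rrbracket_I\). We require \(\llbracket \mathit{bool} \rrbracket_I = \{0, 1\}\).
2. If \(\eta \vdash x : \sigma\), then \(\llbracket x \rrbracket_I \in \llbracket \sigma \rrbracket_I\).
3. If \(\eta(f) = \sigma_1 \times \ldots \times \sigma_n \to \sigma\), then \(\llbracket f \rrbracket_I\) is a function from \(\llbracket \sigma_1 \rrbracket_I \times \ldots \times \llbracket \sigma_n \rrbracket_I\) to \(\llbracket \sigma \rrbracket_I\).
4. We require \(\llbracket \mathit{true} \rrbracket_I = 1\) and \(\llbracket \mathit{false} \rrbracket_I = 0\). We require \(\llbracket \land \rrbracket_I\), \(\llbracket \lor \rrbracket_I\), \(\llbracket \Rightarrow \rrbracket_I\), \(\llbracket \Leftrightarrow \rrbracket_I\) and \(\llbracket \neg \rrbracket_I\) respectively to be the logical conjunction, disjunction, implication, equivalence and negation, defined over \(\{0, 1\}\) in the standard way.

Given an \(\eta\)-interpretation \(I\) and a function symbol \(f\), we define \(I^g_f\) to be the mapping that maps \(f\) to \(g\) and coincides otherwise with \(I\). Likewise, for a variable \(x\) and value \(a\) we define \(I^a_x\) to be the mapping that maps \(x\) to \(a\) and coincides otherwise with \(I\).

Definition 5.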
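Let \(I\) be an \(\eta\)-interpretation, and \(\eta \vdash t : \sigma\). The value of \(t\) in \(I\), denoted as \(\mathrm{eval}_I(t)\), is a value in \(\llbracket \sigma \rrbracket_I\) inductively defined as follows:

\[
\begin{aligned}
\mathrm{eval}_I(x) &= \llbracket x \rrbracket_I. \\
\mathrm{eval}_I(f(t_1, \ldots, t_n)) &= \llbracket f \rrbracket_I(\mathrm{eval}_I(t_1), \ldots, \mathrm{eval}_I(t_n)). \\
\mathrm{eval}_I(\text{if } \varphi \text{ then } s \text{ else } t) &=
  \begin{cases} \mathrm{eval}_I(s), & \text{if } \mathrm{eval}_I(\varphi) = 1; \\ \mathrm{eval}_I(t), & \text{otherwise.} \end{cases} \\
\mathrm{eval}_I(\text{let } f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s \text{ in } t) &= \mathrm{eval}_{I^g_f}(t),
\end{aligned}
\]

where \(g\) is such that for all \(i = 1, \ldots, n\) and \(a_i \in \llbracket \sigma_i \rrbracket_I\), we have \(g(a_1, \ldots, a_n) = \mathrm{eval}_{I^{a_1 \ldots a_n}_{x_1 \ldots x_n}}(s)\).

\[
\begin{aligned}
\mathrm{eval}_I(s \doteq t) &=
  \begin{cases} 1, & \text{if } \mathrm{eval}_I(s) = \mathrm{eval}_I(t); \\ 0, & \text{otherwise.} \end{cases} \\
\mathrm{eval}_I((\forall x : \sigma)\varphi) &=
  \begin{cases} 1, & \text{if } \mathrm{eval}_{I^a_x}(\varphi) = 1 \text{ for all } a \in \llbracket \sigma \rrbracket_I; \\ 0, & \text{otherwise.} \end{cases} \\
\mathrm{eval}_I((\exists x : \sigma)\varphi) &=
  \begin{cases} 1, & \text{if } \mathrm{eval}_{I^a_x}(\varphi) = 1 \text{ for some } a \in \llbracket \sigma \rrbracket_I; \\ 0, & \text{otherwise.} \end{cases}
\end{aligned}
\]

Theorem 2.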
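Let \(\eta \vdash \varphi : \mathit{bool}\) and \(I\) be an \(\eta\)-interpretation. Then
1. for every free variable \(x\) of \(\varphi\), \(I\) is defined on \(x\);
2. for every free function symbol \(f\) of \(\varphi\), \(I\) is defined on \(f\);
3. if \(x\) is a variable not free in \(\varphi\), \(\sigma\) is an arbitrary sort, and \(a \in \llbracket \sigma \rrbracket_I\), then \(\mathrm{eval}_I(\varphi) = \mathrm{eval}_{I^a_x}(\varphi)\);
4. if \(f\) is a function symbol not free in \(\varphi\), \(\sigma_1, \ldots, \sigma_n, \sigma\) are arbitrary sorts and \(g \in \llbracket \sigma_1 \rrbracket_I \times \ldots \times \llbracket \sigma_n \rrbracket_I \to \llbracket \sigma \rrbracket_I\), then \(\mathrm{eval}_I(\varphi) = \mathrm{eval}_{I^g_f}(\varphi)\). ❏

Let \(\eta \vdash \varphi : \mathit{bool}\). An \(\eta\)-interpretation \(I\) is called a model of \(\varphi\), denoted by \(I \models \varphi\), if \(\mathrm{eval}_I(\varphi) = 1\). If \(I \models \varphi\), we also say that \(I\) satisfies \(\varphi\). We say that \(\varphi\) is valid, if \(I \models \varphi\) for all \(\eta\)-interpretations \(I\), and satisfiable, if \(I \models \varphi\) for at least one \(\eta\)-interpretation \(I\). Note that Theorem 2 implies that any interpretation which coincides with \(I\) on free variables and free function symbols of \(\varphi\) is also a model of \(\varphi\).

3 Translation of FOOL to FOL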
FOOL is a modification of FOL. Every FOL formula is syntactically a FOOL formula and has the same models, but not the other way around. In this section we present a model-preserving translation from FOOL to FOL. This translation can be used for proving theorems of FOOL using a first-order theorem prover. We do not claim that this translation is efficient; more research is required on designing translations friendly for first-order theorem provers.

We do not formally define many-sorted FOL with equality here, since FOL is essentially a subset of FOOL, which we will discuss now.

We say that an occurrence of a subterm \(s\) of the sort bool in a term \(t\) is in a formula context if it is an argument of a logical connective or the occurrence in either \((\forall x : \sigma)s\) or \((\exists x : \sigma)s\). We say that an occurrence of \(s\) in \(t\) is in a term context if this occurrence is an argument of a function symbol, different from a logical connective, or an equality. We say that a formula of FOOL is syntactically first-order if it contains no if-then-else and let-in expressions, no variables occurring in a formula context and no formulas occurring in a term context. By restricting the definition of terms to the subset of syntactically first-order formulas, we obtain the standard definition of many-sorted first-order logic, with the only exception of having a distinguished boolean sort and constants true and false occurring in a formula context.

Let \(\varphi\) be a closed \(\Sigma\)-formula of FOOL. We will perform the following steps to translate \(\varphi\) into a first-order formula. During the translation we will maintain a set of formulas \(D\), which initially is empty. The purpose of \(D\) is to collect a set of formulas (definitions of new symbols), which guarantee that the transformation preserves models.
1. Make a sequence of translation steps obtaining a syntactically first-order formula \(\varphi'\). During this translation we will introduce new function symbols and add their types to the type assignment \(\eta\).
   We will also add formulas describing properties of these symbols to \(D\). The translation will guarantee that the formulas \(\varphi\) and \(\bigwedge_{\psi \in D} \psi \land \varphi'\) are equivalent, that is, have the same models restricted to \(\Sigma\).
2. Replace the constants true and false, standing in a formula context, by nullary predicates \(\top\) and \(\bot\) respectively, obtaining a first-order formula.
3. Add special boolean sort axioms.

During the translation, we will say that a function symbol or a variable is fresh if it neither appears in \(\varphi\) nor in any of the definitions, nor in the domain of \(\eta\).

We also need the following definition. Let \(\eta \vdash t : \sigma\), and \(x\) be a variable occurrence in \(t\). The sort of this occurrence of \(x\) is defined as follows:
1. any free occurrence of \(x\) in a subterm \(s\) in the scope of \((\forall x : \sigma')s\) or \((\exists x : \sigma')s\) has the sort \(\sigma'\);
2. any free occurrence of \(x_i\) in a subterm \(s_1\) in the scope of \(\text{let } f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s_1 \text{ in } s_2\) has the sort \(\sigma_i\), where \(i = 1, \ldots, n\);
3. a free occurrence of \(x\) in \(t\) has the sort \(\eta(x)\).
If \(\eta \vdash t : \sigma\), \(s\) is a subterm of \(t\) and \(x\) a free variable in \(s\), we say that \(x\) has a sort \(\sigma'\) in \(s\) if its free occurrences in \(s\) have this sort.

The translation steps are defined below. We start with an empty set \(D\) and an initial FOOL formula \(\varphi\), which we would like to change into a syntactically first-order formula. At every translation step we will select a formula \(\chi\), which is either \(\varphi\) or a formula in \(D\), which is not syntactically first-order, replace a subterm in \(\chi\) by another subterm, and maybe add a formula to \(D\). The translation steps can be applied in any order.
1. Replace a boolean variable \(x\) occurring in a formula context by \(x \doteq \mathit{true}\).
2. Suppose that \(\psi\) is a formula occurring in a term context such that (i) \(\psi\) is different from true and false, (ii) \(\psi\) is not a variable, and (iii) \(\psi\) contains no free occurrences of function symbols bound in \(\chi\). Let \(x_1, \ldots, x_n\) be all free variables of \(\psi\) and \(\sigma_1, \ldots, \sigma_n\) be their sorts.
   Take a fresh function symbol \(g\), add the formula \((\forall x_1 : \sigma_1) \ldots (\forall x_n : \sigma_n)(\psi \Leftrightarrow g(x_1, \ldots, x_n) \doteq \mathit{true})\) to \(D\) and replace \(\psi\) by \(g(x_1, \ldots, x_n)\). Finally, change \(\eta\) to \(\eta, g : \sigma_1 \times \ldots \times \sigma_n \to \mathit{bool}\).
3. Suppose that \(\text{if } \psi \text{ then } s \text{ else } t\) is a term containing no free occurrences of function symbols bound in \(\chi\). Let \(x_1, \ldots, x_n\) be all free variables of this term and \(\sigma_1, \ldots, \sigma_n\) be their sorts. Take a fresh function symbol \(g\), add the formulas \((\forall x_1 : \sigma_1) \ldots (\forall x_n : \sigma_n)(\psi \Rightarrow g(x_1, \ldots, x_n) \doteq s)\) and \((\forall x_1 : \sigma_1) \ldots (\forall x_n : \sigma_n)(\neg\psi \Rightarrow g(x_1, \ldots, x_n) \doteq t)\) to \(D\) and replace this term by \(g(x_1, \ldots, x_n)\). Finally, change \(\eta\) to \(\eta, g : \sigma_1 \times \ldots \times \sigma_n \to \sigma\), where \(\sigma\) is such that \(\eta, x_1 : \sigma_1, \ldots, x_n : \sigma_n \vdash s : \sigma\).
4. Suppose that \(\text{let } f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s \text{ in } t\) is a term containing no free occurrences of function symbols bound in \(\chi\). Let \(y_1, \ldots, y_m\) be all free variables of this term and \(\tau_1, \ldots, \tau_m\) be their sorts. Note that the variables in \(x_1, \ldots, x_n\) are not necessarily disjoint from the variables in \(y_1, \ldots, y_m\). Take a fresh function symbol \(g\) and a fresh sequence of variables \(z_1, \ldots, z_n\). Let the term \(s'\) be obtained from \(s\) by replacing all free occurrences of \(x_1, \ldots, x_n\) by \(z_1, \ldots, z_n\), respectively. Add the formula \((\forall z_1 : \sigma_1) \ldots (\forall z_n : \sigma_n)(\forall y_1 : \tau_1) \ldots (\forall y_m : \tau_m)(g(z_1, \ldots, z_n, y_1, \ldots, y_m) \doteq s')\) to \(D\). Let the term \(t'\) be obtained from \(t\) by replacing all bound occurrences of \(y_1, \ldots, y_m\) by fresh variables and each application \(f(t_1, \ldots, t_n)\) of a free occurrence of \(f\) in \(t\) by \(g(t_1, \ldots, t_n, y_1, \ldots, y_m)\). Then replace \(\text{let } f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s \text{ in } t\) by \(t'\). Finally, change \(\eta\) to \(\eta, g : \sigma_1 \times \ldots \times \sigma_n \times \tau_1 \times \ldots \times \tau_m \to \sigma\), where \(\sigma\) is such that \(\eta, x_1 : \sigma_1, \ldots, x_n : \sigma_n, y_1 : \tau_1, \ldots, y_m : \tau_m \vdash s : \sigma\).
The translation terminates when none of the above rules apply.

We will now formulate several properties of this translation, which will imply that, in a way, it preserves models. These properties are not hard to prove; we do not include proofs in this paper.

Lemma 1.
Suppose that a single step of the translation changes a formula \(\varphi_1\) into \(\varphi_2\), \(\delta\) is the formula added at this step (for step 1 we can assume \(\mathit{true} \doteq \mathit{true}\) is added), \(\eta\) is the type assignment before this step and \(\eta'\) is the type assignment after. Then for every \(\eta'\)-interpretation \(I\) we have \(I \models \delta \Rightarrow (\varphi_1 \Leftrightarrow \varphi_2)\). ❏
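Lemma 1 can be sanity-checked by brute force on a small closed instance of step 3. The sketch below assumes, purely for illustration, that p, s and t are nullary function symbols of the sort bool, so that the term "if p then s else t" has no free variables and the fresh symbol g is a constant of the type bool. The function name `lemma1_holds_for_ite` is hypothetical.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def lemma1_holds_for_ite():
    """Check delta => (phi1 <=> phi2) in every interpretation of p, s, t, g, where
    phi1 = (if p then s else t), phi2 = g and
    delta = (p => g = s) and (not p => g = t)."""
    for p, s, t, g in product([False, True], repeat=4):
        phi1 = s if p else t   # value of the if-then-else term
        phi2 = g               # value of the fresh symbol that replaces it
        delta = implies(p, g == s) and implies(not p, g == t)
        if not implies(delta, phi1 == phi2):
            return False
    return True

print(lemma1_holds_for_ite())
```

The check enumerates all sixteen interpretations of the four boolean constants; it is an illustration of the statement on one instance, not a proof of the lemma.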
By repeated applications of this lemma we obtain the following result.
Lemma 2.
Suppose that the translation above changes a formula \(\varphi\) into \(\varphi'\), \(D\) is the set of definitions obtained during the translation, \(\eta\) is the initial type assignment and \(\eta'\) is the final type assignment of the translation. Let \(I'\) be any interpretation of \(\eta'\). Then \(I' \models \bigwedge_{\psi \in D} \psi \Rightarrow (\varphi \Leftrightarrow \varphi')\). ❏

We also need the following result.
Lemma 3.
Any sequence of applications of the translation rules terminates. ❏

The lemmas stated so far imply that the translation terminates and the final formula is equivalent to the initial formula in every interpretation satisfying all definitions in \(D\). To prove model preservation, we also need to prove some properties of the introduced definitions.

Lemma 4.
Suppose that one of the steps 2–4 of the translation translates a formula \(\varphi_1\) into \(\varphi_2\), \(\delta\) is the formula added at this step, \(\eta\) is the type assignment before this step, \(\eta'\) is the type assignment after, and \(g\) is the fresh function symbol introduced at this step. Let also \(I\) be an \(\eta\)-interpretation. Then there exists a function \(h\) such that \(I^h_g \models \delta\). ❏

These properties imply the following result on model preservation.
Theorem 3.
Suppose that the translation above translates a formula \(\varphi\) into \(\varphi'\), \(D\) is the set of definitions obtained during the translation, \(\eta\) is the initial type assignment and \(\eta'\) is the final type assignment of the translation.
1. Let \(I\) be any \(\eta\)-interpretation such that \(I \models \varphi\). Then there is an \(\eta'\)-interpretation \(I'\) such that \(I'\) is an extension of \(I\) and \(I' \models \bigwedge_{\psi \in D} \psi \land \varphi'\).
2. Let \(I'\) be an \(\eta'\)-interpretation and \(I' \models \bigwedge_{\psi \in D} \psi \land \varphi'\). Then \(I' \models \varphi\). ❏

This theorem implies that \(\varphi\) and \(\bigwedge_{\psi \in D} \psi \land \varphi'\) have the same models, as far as the original type assignment (the type assignment of \(\Sigma\)) is concerned. The formula \(\bigwedge_{\psi \in D} \psi \land \varphi'\) in this theorem is syntactically first-order. Denote this formula by \(\gamma\). Our next step is to define a model-preserving translation from syntactically first-order formulas to first-order formulas.

To make \(\gamma\) into a first-order formula, we should get rid of true and false occurring in a formula context. To preserve the semantics, we should also add axioms for the boolean sort, since in first-order logic all sorts are uninterpreted, while in FOOL the interpretations of the boolean sort and constants true and false are fixed.

To fix the problem, we will add axioms expressing that the boolean sort has two elements and that true and false represent the two distinct elements of this sort.

\[
(\forall x : \mathit{bool})(x \doteq \mathit{true} \lor x \doteq \mathit{false}) \land \mathit{true} \not\doteq \mathit{false}. \tag{3}
\]

Note that this formula is a tautology in FOOL, but not in FOL.

Given a syntactically first-order formula \(\gamma\), we denote by \(\mathit{fol}(\gamma)\) the formula obtained from \(\gamma\) by replacing all occurrences of true and false in a formula context by logical constants \(\top\) and \(\bot\) (interpreted as always true and always false), respectively, and adding formula (3).

Theorem 4. Let \(\eta\) be a type assignment and \(\gamma\) be a syntactically first-order formula such that \(\eta \vdash \gamma : \mathit{bool}\).
1. Suppose that \(I\) is an \(\eta\)-interpretation and \(I \models \gamma\) in FOOL. Then \(I \models \mathit{fol}(\gamma)\) in first-order logic.
2. Suppose that \(I\) is an \(\eta\)-interpretation and \(I \models \mathit{fol}(\gamma)\) in first-order logic. Consider the FOOL-interpretation \(I'\) that is obtained from \(I\) by changing the interpretation of the boolean sort bool to \(\{0, 1\}\) and the interpretations of true and false to the elements 1 and 0, respectively, of this sort. Then \(I' \models \gamma\) in FOOL. ❏

Theorems 3 and 4 show that our translation preserves models. Every model of the original formula can be extended to a model of the translated formulas by adding values of the function symbols introduced during the translation. Likewise, any first-order model of the translated formula becomes a model of the original formula after changing the interpretation of the boolean sort to coincide with its interpretation in FOOL.
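As an illustration of translation step 3 (elimination of if-then-else), here is a minimal sketch restricted to terms without quantifiers and let-in, so that every variable occurrence is free. The tuple encoding, the function names and the naming scheme g0, g1, ... for fresh symbols are assumptions; the sketch also omits extending the type assignment with the type of each new symbol.

```python
from itertools import count

# Illustrative encoding: ("var", x), ("app", f, t1, ..., tn), ("ite", cond, s, t).

def free_vars(e, acc=None):
    """Variables of a binder-free term, in order of first occurrence."""
    acc = [] if acc is None else acc
    if e[0] == "var":
        if e[1] not in acc:
            acc.append(e[1])
    else:
        for sub in e[1:]:
            if isinstance(sub, tuple):   # skip the function symbol name
                free_vars(sub, acc)
    return acc

def eliminate_ite(e, var_sorts, definitions, fresh):
    """Replace every if-then-else subterm, innermost first, by g(x1, ..., xn)
    and append the two defining formulas of g to `definitions`."""
    if e[0] == "var":
        return e
    if e[0] == "app":
        return ("app", e[1]) + tuple(
            eliminate_ite(a, var_sorts, definitions, fresh) for a in e[2:])
    cond, s, t = (eliminate_ite(a, var_sorts, definitions, fresh) for a in e[1:])
    xs = free_vars(("ite", cond, s, t))
    g = ("app", f"g{next(fresh)}") + tuple(("var", x) for x in xs)  # assumed fresh
    quantified = [(x, var_sorts[x]) for x in xs]
    definitions.append(("forall", quantified, ("=>", cond, ("eq", g, s))))
    definitions.append(("forall", quantified, ("=>", ("not", cond), ("eq", g, t))))
    return g

# f(if p(x) then x else y)
term = ("app", "f", ("ite", ("app", "p", ("var", "x")), ("var", "x"), ("var", "y")))
definitions = []
result = eliminate_ite(term, {"x": "int", "y": "int"}, definitions, count())
print(result)
for d in definitions:
    print(d)
```

The paper allows the steps to be applied in any order; processing innermost conditionals first is a choice made here only to keep the recursion simple.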
In Section 3 we presented a model-preserving syntactic translation of FOOL to FOL. Based on this translation, automated reasoning about FOOL formulas can be done by translating a FOOL formula into a FOL formula, and using an automated first-order theorem prover on the resulting FOL formula. State-of-the-art first-order theorem provers, such as Vampire [13], E [18] and Spass [23], implement the superposition calculus for proving first-order formulas. Naturally, we would like to have a translation exploiting such provers in an efficient manner.

Note however that our translation adds the two-element domain axiom \((\forall x : \mathit{bool})(x \doteq \mathit{true} \lor x \doteq \mathit{false})\) for the boolean sort. This axiom will be converted to the clause

\[
x \doteq \mathit{true} \lor x \doteq \mathit{false}, \tag{4}
\]

where \(x\) is a boolean variable. In this section we explain why this axiom requires a special treatment and propose a solution to overcome problems caused by its presence.

We assume some basic understanding of first-order theorem proving and superposition calculus, see, e.g. [1,15]. We fix a superposition inference system for first-order logic with equality, parametrised by a simplification ordering \(\succ\) on literals and a well-behaved literal selection function [13], that is, a function that guarantees completeness of the calculus. We denote selected literals by underlining them. We assume that equality literals are treated by a dedicated inference rule, namely, the ordered paramodulation rule [17]:

\[
\frac{\underline{l \doteq r} \lor C \qquad \underline{L[s]} \lor D}{(L[r] \lor C \lor D)\theta} \quad \text{if } \theta = \mathrm{mgu}(l, s),
\]

where \(C, D\) are clauses, \(L\) is a literal, \(l, r, s\) are terms, \(\mathrm{mgu}(l, s)\) is a most general unifier of \(l\) and \(s\), and \(r\theta \not\succeq l\theta\). The notation \(L[s]\) denotes that \(s\) is a subterm of \(L\); then \(L[r]\) denotes the result of replacement of \(s\) by \(r\).

Suppose now that we use an off-the-shelf superposition theorem prover to reason about FOL formulas obtained by our translation. W.l.o.g., we assume that \(\mathit{true} \succ \mathit{false}\) in the term ordering used by the prover.
Then self-paramodulation (from true to true) can be applied to clause (4) as follows:

\[ \frac{\underline{x \doteq \mathit{true}} \lor x \doteq \mathit{false} \qquad \underline{y \doteq \mathit{true}} \lor y \doteq \mathit{false}}{x \doteq y \lor x \doteq \mathit{false} \lor y \doteq \mathit{false}} \]

The derived clause x ≐ y ∨ x ≐ false ∨ y ≐ false is a recipe for disaster, since the literal x ≐ y must be selected and can be used for paramodulation into every non-variable term of a boolean sort. Very soon the search space will contain many clauses obtained as logical consequences of clause (4) and results of paramodulation from variables applied to them. This will cause a rapid degradation of performance of superposition-based provers.

To get around this problem, we propose the following solution. First, we will choose term orderings ≻ having the following properties: true ≻ false, and true and false are the smallest ground terms w.r.t. ≻. Consider now all ground instances of (4). They have the form s ≐ true ∨ s ≐ false, where s is a ground term. When s is either true or false, this instance is a tautology, and hence redundant. Therefore, we should only consider instances for which s ≻ true. This prevents self-paramodulation of (4).

Now the only possible inferences with (4) are inferences of the form

\[ \frac{\underline{x \doteq \mathit{true}} \lor x \doteq \mathit{false} \qquad C[s]}{C[\mathit{true}] \lor s \doteq \mathit{false}}, \]

where s is a non-variable term of the sort bool. To implement this, we can remove clause (4) and add the following extra inference rule to the superposition calculus:

\[ \frac{C[s]}{C[\mathit{true}] \lor s \doteq \mathit{false}}, \]

where s is a non-variable term of the sort bool.

The typed monomorphic first-order formulas subset, called TFF0, of the TPTP language [20], is a representation language for many-sorted first-order logic. It contains if-then-else and let-in constructs (see below), which are useful for applications, but is inconsistent in its treatment of the boolean sort. It has a predefined atomic sort symbol $o denoting the boolean sort.
However, unlike all other sort symbols, $o can only be used to declare the return type of predicate symbols. This means that one cannot define a function having a boolean argument, use boolean variables or equality between booleans.

Such an inconsistent use of the boolean sort results in having two kinds of if-then-else expressions and four kinds of let-in expressions. For example, a FOOL-term let f(x₁ : σ₁, …, xₙ : σₙ) = s in t can be represented using one of the four TPTP alternatives $let_tt, $let_tf, $let_ft and $let_ff, depending on whether s and t are terms or formulas.

Since the boolean type is second-class in TPTP, one cannot directly represent formulas coming from program analysis and interactive theorem provers, such as formulas (1) and (2) of Section 1.

We propose to modify the TFF0 language of TPTP to coincide with FOOL. It is not too late to do so, since there is no general support for if-then-else and let-in. To the best of our knowledge, Vampire is currently the only theorem prover supporting full TFF0. Note that such a modification of TPTP would make multiple forms of if-then-else and let-in redundant. It will also make it possible to directly represent the SMT-LIB core theory.

We note that our changes and modifications to TFF0 can also be applied to the TFF1 language of TPTP [4]. TFF1 is a polymorphic extension of TFF0 and its formalisation does not treat the boolean sort. Extending our work to TFF1 should not be hard but has to be done in detail.

Handling boolean terms as formulas is common in the SMT community. The SMT-LIB project [3] defines its core logic as first-order logic extended with the distinguished first-class boolean sort and the let-in expression used for local bindings of variables. The core theory of SMT-LIB defines logical connectives as boolean functions and the ad-hoc polymorphic if-then-else (ite) function, used for conditional expressions.
The language FOOL defined here extends the SMT-LIB core language with local function definitions, using let-in expressions defining functions of arbitrary, and not just zero, arity. Thus, FOOL contains both this language and the TFF0 subset of TPTP. Further, we present a translation of FOOL to FOL and show how one can improve superposition theorem provers to reason with the boolean sort.

Efficient superposition theorem proving in finite domains, such as the boolean domain, is also discussed in [9]. The approach of [9] sometimes falls back to enumerating instances of a clause by instantiating finite domain variables with all elements of the corresponding domains. We point out here that for the boolean (i.e., two-element) domain there is a simpler solution. However, the approach of [9] also allows one to handle domains with more than two elements. One can also generalise our approach to arbitrary finite domains by using binary encodings of finite domains; however, this will necessarily result in loss of efficiency, since a single variable over a domain with 2^k elements will become k variables in our approach, and similarly for function arguments.

Conclusion
We defined first-order logic with the first class boolean sort (FOOL). It extends ordinary many-sorted first-order logic (FOL) with (i) the boolean sort such that terms of this sort are indistinguishable from formulas and (ii) if-then-else and let-in expressions. The semantics of let-in expressions in FOOL is essentially their semantics in functional programming languages, when they are not used for recursive definitions. In particular, non-recursive local functions can be defined and function symbols can be bound to a different sort in nested let-in expressions.

We argued that these extensions are useful in reasoning about problems coming from program analysis and interactive theorem proving. The extraction of properties from certain program definitions (especially in functional programming languages) into FOOL formulas is more straightforward than into ordinary FOL formulas and potentially more efficient. In a similar way, a more straightforward translation of certain higher-order formulas into FOOL can facilitate proof automation in interactive theorem provers.

FOOL is a modification of FOL and reasoning in it reduces to reasoning in FOL. We gave a translation of FOOL to FOL that can be used for proving theorems in FOOL in a first-order theorem prover. We further discussed a modification of superposition calculus that can reason efficiently in the presence of the boolean sort. Finally, we pointed out that the TPTP language can be changed to support FOOL, which will also simplify some parts of the TPTP syntax.

Implementation of theorem proving support for FOOL, including its superposition-friendly translation to CNF, is an important task for future work. Further, we are also interested in extending FOOL with theories, such as the theory of integer linear arithmetic and arrays.

References
1. Bachmair, L., Ganzinger, H.: Resolution Theorem Proving. In: Handbook of Automated Reasoning, pp. 19–99. Elsevier and MIT Press (2001)
2. Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanovic, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: Proc. of CAV. pp. 171–177 (2011)
3. Barrett, C., Stump, A., Tinelli, C.: The SMT-LIB Standard: Version 2.0. Tech. rep., Department of Computer Science, The University of Iowa (2010), available at
4. Blanchette, J.C., Paskevich, A.: TFF1: The TPTP Typed First-Order Form with Rank-1 Polymorphism. In: Proc. of CADE-24. pp. 414–420. Springer (2013)
5. Böhme, S., Nipkow, T.: Sledgehammer: Judgement Day. In: Proc. of IJCAR. pp. 107–121 (2010)
6. de Moura, L., Bjørner, N.: Z3: An Efficient SMT Solver. In: Proc. of TACAS. pp. 337–340 (2008)
7. Dragan, I., Kovács, L.: Lingva: Generating and Proving Program Properties Using Symbol Elimination. In: Proc. of PSI. pp. 67–75 (2014)
8. Gupta, A., Kovács, L., Kragl, B., Voronkov, A.: Extensionality Crisis and Proving Identity. In: Proc. of ATVA. pp. 185–200 (2014)
9. Hillenbrand, T., Weidenbach, C.: Superposition for Bounded Domains. In: Automated Reasoning and Mathematics - Essays in Memory of William W. McCune. pp. 68–100 (2013)
10. Hoder, K., Kovács, L., Voronkov, A.: Playing in the grey area of proofs. In: Proc. of POPL. pp. 259–272 (2012)
11. Korovin, K.: iProver - An Instantiation-Based Theorem Prover for First-Order Logic (System Description). In: Proc. of IJCAR. pp. 292–298 (2008)
12. Kovács, L., Voronkov, A.: Finding Loop Invariants for Programs over Arrays Using a Theorem Prover. In: Proc. of FASE. pp. 470–485 (2009)
13. Kovács, L., Voronkov, A.: First-Order Theorem Proving and Vampire. In: Proc. of CAV. pp. 1–35 (2013)
14. McMillan, K.L.: Quantified Invariant Generation Using an Interpolating Saturation Prover. In: Proc. of TACAS. pp. 413–427 (2008)
15. Nieuwenhuis, R., Rubio, A.: Paramodulation-Based Theorem Proving. In: Robinson, A., Voronkov, A. (eds.) Handbook of Automated Reasoning, vol. I, chap. 7, pp. 371–443. Elsevier Science (2001)
16. Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL - A Proof Assistant for Higher-Order Logic (2002)
17. Robinson, G., Wos, L.: Paramodulation and theorem-proving in first-order theories with equality. Machine intelligence 4, 135–150 (1969)
18. Schulz, S.: System Description: E 1.8. In: Proc. of LPAR. pp. 735–743 (2013)
19. Sutcliffe, G.: The TPTP Problem Library and Associated Infrastructure. J. Autom. Reasoning 43(4), 337–362 (2009)
20. Sutcliffe, G., Schulz, S., Claessen, K., Baumgartner, P.: The TPTP Typed First-Order Form with Arithmetic. In: Proc. of LPAR. pp. 406–419. Springer (2012)
21. Trybulec, A.: Mizar. In: The Seventeen Provers of the World, Foreword by Dana S. Scott. pp. 20–23 (2006)
22. Urban, J., Hoder, K., Voronkov, A.: Evaluation of Automated Theorem Proving on the Mizar Mathematical Library. In: ICMS. pp. 155–166 (2010)
23. Weidenbach, C., Dimova, D., Fietzke, A., Kumar, R., Suda, M., Wischnewski, P.: Spass version 3.5. In: CADE. pp. 140–145 (2009)