A First Class Boolean Sort in First-Order Theorem Proving and TPTP⋆


Evgenii Kotelnikov, Laura Kovács⋆⋆, and Andrei Voronkov⋆⋆⋆

Chalmers University of Technology, Gothenburg, Sweden
The University of Manchester, Manchester, UK

arXiv [cs.LO]

Abstract.

To support reasoning about properties of programs operating with boolean values, one needs theorem provers to be able to deal natively with the boolean sort. This way, program properties can be translated to first-order logic and theorem provers can be used to prove program properties efficiently. However, in the TPTP language, the input language of automated first-order theorem provers, the use of the boolean sort is limited compared to other sorts, thus hindering the use of first-order theorem provers in program analysis and verification. In this paper, we present an extension FOOL of many-sorted first-order logic, in which the boolean sort is treated as a first-class sort. Boolean terms are indistinguishable from formulas and can appear as arguments to functions. In addition, FOOL contains if-then-else and let-in constructs. We define the syntax and semantics of FOOL and its model-preserving translation to first-order logic. We also introduce a new technique of dealing with boolean sorts in superposition-based theorem provers. Finally, we discuss how the TPTP language can be changed to support FOOL.

Automated program analysis and verification requires discovering and proving program properties. Typical examples of such properties are loop invariants or Craig interpolants. These properties are usually expressed in combined theories of various data structures, such as integers and arrays, and hence require reasoning with both theories and quantifiers. Recent approaches in interpolation and loop invariant generation [14,12,10] present initial results of using first-order theorem provers for generating quantified program properties. First-order theorem provers can also be used to generate program properties with quantifier alternations [12]; such properties could not be generated fully automatically by any previously known method. Using first-order theorem provers to generate, and not only prove, program properties opens new directions in analysis and verification of real-life programs.

⋆ The final publication is available at http://link.springer.com.
⋆⋆ The first two authors were partially supported by the Wallenberg Academy Fellowship 2014, the Swedish VR grant D0497701, and the Austrian research project FWF S11409-N23.
⋆⋆⋆ Partially supported by the EPSRC grant “Reasoning in Verification and Security”.

First-order theorem provers, such as iProver [11], E [18], and Vampire [13], however lack various features that are crucial for program analysis. For example, first-order theorem provers do not yet efficiently handle (combinations of) theories; nevertheless, sound but incomplete theory axiomatisations can be used in a first-order prover even for theories having no finite axiomatisation. Another difficulty in modelling properties arising in program analysis using theorem provers is the gap between the semantics of expressions used in programming languages and the expressiveness of the logic used by the theorem prover. A similar gap exists for the language used in presenting mathematics. For example, a standard way to capture assignment in program analysis is to use a let-in expression, which introduces a local binding of a variable, or a function for array assignments, to a value. There is no local binding expression in first-order logic, which means that any modelling of imperative programs using first-order theorem provers at the backend should implement a translation of let-in expressions. Similarly, mathematicians commonly use local definitions within definitions and proofs. Some functional programming languages also contain expressions introducing local bindings. In all three cases, to facilitate the use of first-order provers, one needs a theorem prover implementing let-in constructs natively.

Efficiency of reasoning-based program analysis largely depends on how programs are translated into a collection of logical formulas capturing the program semantics. The boolean structure of a program property that can be efficiently treated by a theorem prover is however very sensitive to the architecture of the reasoning engine of the prover.
Deriving and expressing program properties in the “right” format therefore requires solid knowledge about how theorem provers work and are implemented, something that a user of a verification tool might not have. Moreover, it can be hard to reason efficiently about certain classes of program properties unless special inference rules and heuristics are added to the theorem prover; see e.g. [8] on proving properties of data collections with extensionality axioms.

In order to increase the expressiveness of program properties generated by reasoning-based program analysis, the language of logical formulas accepted by a theorem prover needs to be extended with constructs of programming languages. This way, a straightforward translation of programs into first-order logic can be achieved, thus relieving users from designing translations which can be efficiently treated by the theorem prover. One example of such an extension was recently added to the TPTP language [19] of first-order theorem provers, resembling if-then-else and let-in expressions that are common in programming languages. Namely, special functions $ite_t and $ite_f can respectively be used to express a conditional statement on the level of logical terms and formulas, and $let_tt, $let_tf, $let_ff and $let_ft can be used to express local variable bindings for all four possible combinations of logical terms (t) and formulas (f). While satisfiability modulo theory (SMT) solvers, such as Z3 [6] and CVC4 [2], integrate if-then-else and let-in expressions, in the first-order theorem proving community so far only Vampire supports such expressions.

To illustrate the advantage of using if-then-else and let-in expressions in automated provers, let us consider the following simple example.
We are interested in verifying the partial correctness of the code fragment below:

    if (r(a)) {a := a + 1} else {a := a + q(a)}

using the pre-condition ((∀x)(P(x) ⇒ x ≥ 0)) ∧ ((∀x)(q(x) > 0)) ∧ P(a) and the post-condition a > 0. Let a1 denote the value of the program variable a after the execution of the if-statement. Using if-then-else and let-in expressions, the next state function for a can naturally be expressed by the following formula:

    a1 = if r(a) then let a = a + 1 in a else let a = a + q(a) in a

This formula can further be encoded in TPTP, and hence used by a theorem prover as a hypothesis in proving partial correctness of the above code snippet. We illustrate below the TPTP encoding of the first-order problem corresponding to the partial program correctness problem we consider. Note that the pre-condition becomes a hypothesis in TPTP, whereas the proof obligation given by the post-condition is a TPTP conjecture. All formulas below are typed first-order formulas (tff) in TPTP that use the built-in integer sort ($int).

    tff(1, type, p : $int > $o).
    tff(2, type, q : $int > $int).
    tff(3, type, r : $int > $o).
    tff(4, type, a : $int).
    % a1 is the value of a after the if-statement
    tff(5, type, a1 : $int).
    tff(6, hypothesis, ! [X : $int] : (p(X) => $greatereq(X, 0))).
    % the pre-condition requires q(x) > 0, hence $greater
    tff(7, hypothesis, ! [X : $int] : ($greater(q(X), 0))).
    tff(8, hypothesis, p(a)).
    tff(9, hypothesis,
        a1 = $ite_t(r(a), $let_tt(a, $sum(a, 1), a),
                          $let_tt(a, $sum(a, q(a)), a))).
    tff(10, conjecture, $greater(a1, 0)).

Running a theorem prover that supports $ite_t and $let_tt on this TPTP problem would prove the partial correctness of the program we considered. Note that without the use of if-then-else and let-in expressions, a more tedious translation is needed for expressing the next state function of the program variable a as a first-order formula. When considering more complex programs containing multiple conditional expressions, assignments and composition, computing the next state function of a program variable results in a formula of size exponential in the number of conditional expressions. This problem of computing the next state function of variables is well known in the program analysis community, where it is addressed by computing so-called static single assignment (SSA) forms.
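The size blow-up can be made concrete with a small sketch. It is only an illustration under stated assumptions: the program is a hypothetical sequence of n conditionals of the form `if c_i(a) then a := a + 1 else a := a + 2`, formulas are plain strings, and the names `inlined` and `with_let` are ours, not part of TPTP or of any prover.

```python
# Hypothetical illustration: next-state terms for n sequential conditionals
#   if c_i(a) then a := a + 1 else a := a + 2
# written once by substitution and once with let-in bindings.

def inlined(n: int) -> str:
    """Next-state term for `a` with every earlier value substituted in place."""
    term = "a"
    for i in range(1, n + 1):
        # the previous term is copied three times: condition, then, else
        term = f"ite(c{i}({term}), {term} + 1, {term} + 2)"
    return term


def with_let(n: int) -> str:
    """Same next-state term, re-binding `a` with one let-in per conditional."""
    bindings = "".join(
        f"let a = ite(c{i}(a), a + 1, a + 2) in " for i in range(1, n + 1)
    )
    return bindings + "a"


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, len(inlined(n)), len(with_let(n)))
```

Each conditional triples the substituted term, so the first column of sizes grows at least like 3^n, while the let-in version grows by a constant amount per conditional.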
Using the if-then-else and let-in expressions recently introduced in TPTP and already implemented in Vampire [7], one can have a linear-size translation instead.

Let us however note that the usage of conditional expressions in TPTP is somewhat limited. The first argument of $ite_t and $ite_f is a logical formula, which means that a boolean condition from the program definition should be translated as such. At the same time, the same condition can be treated as a value in the program, for example, in the form of a boolean flag passed as an argument to a function. Yet we cannot mix terms and formulas in the same way in a logical statement. A possible solution would be to map the boolean type of programs to a user-defined boolean sort, postulate axioms about its semantics, and manually convert boolean terms into formulas where needed. This approach, however, suffers from the disadvantages mentioned earlier, namely the need to design a special translation and its possible inefficiency.

Handling boolean terms as formulas is needed not only in applications of reasoning-based program analysis, but also in various problems of formalisation of mathematics. For example, if one looks at the two largest kinds of attempts to formalise mathematics and proofs, those performed by interactive proof assistants, such as Isabelle [16], and the Mizar project [21], one can see that first-order theorem provers are the main workhorses behind computer proofs in both cases; see e.g. [5,22]. Interactive theorem provers, such as Isabelle, routinely use quantifiers over booleans. Let us illustrate this by the following examples, chosen among 490 properties about (co)algebraic datatypes, featuring quantifiers over booleans, generated by Isabelle and kindly found for us by Jasmin Blanchette. Consider the distributivity of a conditional expression (denoted by the ite function) over logical connectives, a pattern that is widely used in reasoning about properties of data structures.
For lists and the contains function that checks that its first argument contains the second one, we have the following example:

$$(\forall p : \mathit{bool})(\forall l : \mathit{list}_A)(\forall x : A)(\forall y : A)\; \mathit{contains}(l, \mathrm{ite}(p, x, y)) \doteq (p \Rightarrow \mathit{contains}(l, x)) \land (\neg p \Rightarrow \mathit{contains}(l, y)) \qquad (1)$$

A more complex example with a heavy use of booleans is the unsatisfiability of the definition of subset_sorted. The subset_sorted function takes two sorted lists and checks that its second argument is a sublist of the first one.

$$\begin{aligned}
&(\forall l_1 : \mathit{list}_A)(\forall l_2 : \mathit{list}_A)(\forall p : \mathit{Bool})\; \neg(\mathit{subset\_sorted}(l_1, l_2) \doteq p \;\land \\
&\quad (\forall l_2' : \mathit{list}_A)\, \neg(l_1 \doteq \mathit{nil} \land l_2 \doteq l_2' \land p) \;\land \\
&\quad (\forall x_1 : A)(\forall l_1' : \mathit{list}_A)\, \neg(l_1 \doteq \mathit{cons}(x_1, l_1') \land l_2 \doteq \mathit{nil} \land \neg p) \;\land \\
&\quad (\forall x_1 : A)(\forall l_1' : \mathit{list}_A)(\forall x_2 : A)(\forall l_2' : \mathit{list}_A)\, \neg(l_1 \doteq \mathit{cons}(x_1, l_1') \land l_2 \doteq \mathit{cons}(x_2, l_2') \;\land \\
&\qquad p \doteq \mathrm{ite}(x_1 < x_2, \mathit{false}, \mathrm{ite}(x_1 \doteq x_2, \mathit{subset\_sorted}(l_1', l_2'), \mathit{subset\_sorted}(\mathit{cons}(x_1, l_1'), l_2')))))
\end{aligned} \qquad (2)$$

Formulas with boolean terms are also common in the SMT-LIB project [3], the collection of benchmarks for SMT solvers. Its core logic is a variant of first-order logic that treats boolean terms as formulas, in which logical connectives and conditional expressions are defined in the core theory.

In this paper we propose a modification FOOL of first-order logic, which includes a first-class boolean sort and if-then-else and let-in expressions, intended for use in automated first-order theorem proving. It is the smallest logic that contains both the SMT-LIB core theory and the monomorphic first-order subset of TPTP. The syntax and semantics of the logic are given in Section 2. We further describe how FOOL can be translated to the ordinary many-sorted first-order logic in Section 3. Section 4 discusses superposition-based theorem proving and proposes a new way of dealing with the boolean sort in it. In Section 5 we discuss the support of the boolean sort in TPTP and propose changes to it required to support a first-class boolean sort.
We point out that such changes can also partially simplify the syntax of TPTP. Section 6 discusses related work and Section 7 contains concluding remarks.

The main contributions of this paper are the following:
1. the definition of FOOL and its semantics;
2. a translation from FOOL to first-order logic, which can be used to support FOOL in existing first-order theorem provers;
3. a new technique of dealing with the boolean sort in superposition theorem provers, allowing one to replace boolean sort axioms by special rules;
4. a proposal of a change to the TPTP language, intended to support FOOL and also simplify if-then-else and let-in expressions.

First-order logic with the boolean sort (FOOL) extends many-sorted first-order logic (FOL) in two ways:
1. formulas can be treated as terms of the built-in boolean sort; and
2. one can use if-then-else and let-in expressions defined below.

FOOL is the smallest logic containing both the SMT-LIB core theory and the monomorphic first-order part of the TPTP language. It extends the SMT-LIB core theory by adding let-in expressions defining functions, and TPTP by the first-class boolean sort.

We assume a countably infinite set of variables.

Definition 1. A signature of first-order logic with the boolean sort is a triple $\Sigma = (S, F, \eta)$, where:
1. $S$ is a set of sorts, which contains a special sort $\mathit{bool}$. A type is either a sort or a non-empty sequence $\sigma_1, \ldots, \sigma_n, \sigma$ of sorts, written as $\sigma_1 \times \ldots \times \sigma_n \to \sigma$. When $n = 0$, we will simply write $\sigma$ instead of $\to \sigma$. We call a type assignment a mapping from a set of variables and function symbols to types, which maps variables to sorts.
2. $F$ is a set of function symbols. We require $F$ to contain binary function symbols $\lor$, $\land$, $\Rightarrow$ and $\Leftrightarrow$, used in infix form, a unary function symbol $\neg$, used in prefix form, and nullary function symbols $\mathit{true}$, $\mathit{false}$.
3. $\eta$ is a type assignment which maps each function symbol $f$ into a type $\tau$. When the signature is clear from the context, we will write $f : \tau$ instead of $\eta(f) = \tau$ and say that $f$ is of the type $\tau$.
We require the symbols $\lor$, $\land$, $\Rightarrow$, $\Leftrightarrow$ to be of the type $\mathit{bool} \times \mathit{bool} \to \mathit{bool}$, $\neg$ to be of the type $\mathit{bool} \to \mathit{bool}$ and $\mathit{true}$, $\mathit{false}$ to be of the type $\mathit{bool}$. ❏

In the sequel we assume that $\Sigma = (S, F, \eta)$ is an arbitrary but fixed signature.

To define the semantics of FOOL, we will have to extend the signature and also assign sorts to variables. Given a type assignment $\eta$, we define $\eta, x : \sigma$ to be the type assignment that maps a variable $x$ to $\sigma$ and coincides otherwise with $\eta$. Likewise, we define $\eta, f : \tau$ to be the type assignment that maps a function symbol $f$ to $\tau$ and coincides otherwise with $\eta$.

Our next aim is to define the set of terms and their sorts with respect to a type assignment $\eta$. This will be done using a relation $\eta \vdash t : \sigma$, where $\sigma \in S$; terms can then be defined as all such expressions $t$.

Definition 2.

The relation $\eta \vdash t : \sigma$, where $t$ is an expression and $\sigma \in S$, is defined inductively as follows. If $\eta \vdash t : \sigma$, then we will say that $t$ is a term of the sort $\sigma$ w.r.t. $\eta$.
1. If $\eta(x) = \sigma$, then $\eta \vdash x : \sigma$.
2. If $\eta(f) = \sigma_1 \times \ldots \times \sigma_n \to \sigma$, $\eta \vdash t_1 : \sigma_1$, ..., $\eta \vdash t_n : \sigma_n$, then $\eta \vdash f(t_1, \ldots, t_n) : \sigma$.
3. If $\eta \vdash \varphi : \mathit{bool}$, $\eta \vdash t_1 : \sigma$ and $\eta \vdash t_2 : \sigma$, then $\eta \vdash (\texttt{if}\ \varphi\ \texttt{then}\ t_1\ \texttt{else}\ t_2) : \sigma$.
4. Let $f$ be a function symbol and $x_1, \ldots, x_n$ pairwise distinct variables. If $\eta, x_1 : \sigma_1, \ldots, x_n : \sigma_n \vdash s : \sigma$ and $\eta, f : (\sigma_1 \times \ldots \times \sigma_n \to \sigma) \vdash t : \tau$, then $\eta \vdash (\texttt{let}\ f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s\ \texttt{in}\ t) : \tau$.
5. If $\eta \vdash s : \sigma$ and $\eta \vdash t : \sigma$, then $\eta \vdash (s \doteq t) : \mathit{bool}$.
6. If $\eta, x : \sigma \vdash \varphi : \mathit{bool}$, then $\eta \vdash (\forall x : \sigma)\varphi : \mathit{bool}$ and $\eta \vdash (\exists x : \sigma)\varphi : \mathit{bool}$. ❏

We only defined a let-in expression for a single function symbol. It is not hard to extend it to a let-in expression that binds multiple pairwise distinct function symbols in parallel; the details of such an extension are straightforward.

When $\eta$ is the type assignment function of $\Sigma$ and $\eta \vdash t : \sigma$, we will say that $t$ is a $\Sigma$-term of the sort $\sigma$, or simply that $t$ is a term of the sort $\sigma$. It is not hard to argue that every $\Sigma$-term has a unique sort.

According to our definition, not every term-like expression has a sort. For example, if $x$ is a variable and $\eta$ is not defined on $x$, then $x$ is not a term w.r.t. $\eta$. To make the relation between term-like expressions and terms clear, we introduce a notion of free and bound occurrences of variables and function symbols. We call the following occurrences of variables and function symbols bound:
1. any occurrence of $x$ in $(\forall x : \sigma)\varphi$ or in $(\exists x : \sigma)\varphi$;
2. in the term $\texttt{let}\ f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s\ \texttt{in}\ t$, any occurrence of a variable $x_i$ in $f(x_1 : \sigma_1, \ldots, x_n : \sigma_n)$ or in $s$, where $i = 1, \ldots, n$;
3. in the term $\texttt{let}\ f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s\ \texttt{in}\ t$, any occurrence of the function symbol $f$ in $f(x_1 : \sigma_1, \ldots, x_n : \sigma_n)$ or in $t$.
All other occurrences are called free. We say that a variable or a function symbol is free in a term $t$ if it has at least one free occurrence in $t$. A term is called closed if it has no occurrences of free variables.

Theorem 1.

Suppose $\eta \vdash t : \sigma$. Then
1. for every free variable $x$ of $t$, $\eta$ is defined on $x$;
2. for every free function symbol $f$ of $t$, $\eta$ is defined on $f$;
3. if $x$ is a variable not free in $t$, and $\sigma'$ is an arbitrary sort, then $\eta, x : \sigma' \vdash t : \sigma$;
4. if $f$ is a function symbol not free in $t$, and $\tau$ is an arbitrary type, then $\eta, f : \tau \vdash t : \sigma$. ❏

Definition 3. A predicate symbol is any function symbol of the type $\sigma_1 \times \ldots \times \sigma_n \to \mathit{bool}$. A $\Sigma$-formula is a $\Sigma$-term of the sort $\mathit{bool}$. All $\Sigma$-terms that are not $\Sigma$-formulas are called non-boolean terms. ❏

Note that FOOL is a proper extension of first-order logic even without let-in and if-then-else. For example, in FOOL formulas can be used as arguments to terms and one can quantify over booleans. As a consequence, every quantified boolean formula is a formula in FOOL.
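The sorting relation of Definition 2 can be read as a recursive sort checker. The following is a minimal sketch under our own assumptions, not the authors' implementation: terms are small Python dataclasses, a type assignment is a dictionary that maps variable names to sorts and function symbols to pairs (argument sorts, result sort), and the connective names `and`, `imp`, `not` are hypothetical spellings. The existential quantifier is omitted because it is handled exactly like the universal one.

```python
from dataclasses import dataclass

BOOL = "bool"

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class App:            # f(t1, ..., tn); connectives are ordinary function symbols
    f: str
    args: tuple = ()

@dataclass(frozen=True)
class Ite:            # if cond then then_ else else_
    cond: object
    then_: object
    else_: object

@dataclass(frozen=True)
class Let:            # let f(x1 : s1, ..., xn : sn) = body in scope
    f: str
    params: tuple     # ((x1, s1), ..., (xn, sn))
    body: object
    scope: object

@dataclass(frozen=True)
class Eq:
    lhs: object
    rhs: object

@dataclass(frozen=True)
class Forall:
    x: str
    sort: str
    body: object

def sort_of(eta, t):
    """Return the sort s such that eta |- t : s, or raise TypeError."""
    if isinstance(t, Var):
        s = eta.get(t.name)
        if not isinstance(s, str):
            raise TypeError(f"no sort for variable {t.name}")
        return s
    if isinstance(t, App):
        sig = eta.get(t.f)
        if not isinstance(sig, tuple):
            raise TypeError(f"no type for function symbol {t.f}")
        arg_sorts, result = sig
        if tuple(sort_of(eta, a) for a in t.args) != tuple(arg_sorts):
            raise TypeError(f"ill-sorted arguments of {t.f}")
        return result
    if isinstance(t, Ite):
        if sort_of(eta, t.cond) != BOOL:
            raise TypeError("condition must have sort bool")
        s = sort_of(eta, t.then_)
        if sort_of(eta, t.else_) != s:
            raise TypeError("branches must have the same sort")
        return s
    if isinstance(t, Let):
        # the parameters are visible in the body, the new symbol f in the scope
        body_sort = sort_of({**eta, **dict(t.params)}, t.body)
        f_type = (tuple(s for _, s in t.params), body_sort)
        return sort_of({**eta, t.f: f_type}, t.scope)
    if isinstance(t, Eq):
        if sort_of(eta, t.lhs) != sort_of(eta, t.rhs):
            raise TypeError("equality between different sorts")
        return BOOL
    if isinstance(t, Forall):
        if sort_of({**eta, t.x: t.sort}, t.body) != BOOL:
            raise TypeError("quantified body must have sort bool")
        return BOOL
    raise TypeError(f"not a term: {t!r}")

# The body of formula (1): the boolean variable p is used as a term and as a formula.
eta = {
    "p": BOOL, "l": "list", "x": "A", "y": "A",
    "contains": (("list", "A"), BOOL),
    "and": ((BOOL, BOOL), BOOL), "imp": ((BOOL, BOOL), BOOL), "not": ((BOOL,), BOOL),
}
p, l, x, y = Var("p"), Var("l"), Var("x"), Var("y")
lhs = App("contains", (l, Ite(p, x, y)))
rhs = App("and", (App("imp", (p, App("contains", (l, x)))),
                  App("imp", (App("not", (p,)), App("contains", (l, y))))))
print(sort_of(eta, Eq(lhs, rhs)))
```

Because formulas are just terms of the sort bool, the checker needs no separate case for formulas; this is the design point that distinguishes FOOL from ordinary many-sorted first-order logic.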

As usual, the semantics of FOOL is defined by introducing a notion of interpretation and defining how a term is evaluated in an interpretation.
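Before the formal definitions, a small sketch may help to fix intuitions about evaluation, in particular about let-in. It rests on our own assumptions: terms are nested tuples, an interpretation is a dictionary mapping sort names to finite domains, variables to values and function symbols to Python callables (constants are nullary callables), and the symbols `r`, `q`, `sum` and `one` are hypothetical stand-ins for the example from the introduction.

```python
def evaluate(I, t):
    """Value of the term t in the interpretation I (booleans are 0 and 1)."""
    tag = t[0]
    if tag == "var":
        return I[t[1]]
    if tag == "app":
        _, f, args = t
        return I[f](*(evaluate(I, a) for a in args))
    if tag == "ite":
        _, cond, s, u = t
        return evaluate(I, s) if evaluate(I, cond) == 1 else evaluate(I, u)
    if tag == "let":
        _, f, params, s, u = t
        # g interprets the locally defined symbol f: bind the parameters, evaluate s
        g = lambda *vals: evaluate({**I, **dict(zip(params, vals))}, s)
        return evaluate({**I, f: g}, u)
    if tag == "eq":
        return int(evaluate(I, t[1]) == evaluate(I, t[2]))
    if tag == "forall":
        _, x, sort, body = t
        return int(all(evaluate({**I, x: v}, body) == 1 for v in I[sort]))
    raise ValueError(f"unknown term: {t!r}")


I = {
    "bool": (0, 1),
    "a": lambda: 5,                 # current value of the program variable a
    "one": lambda: 1,
    "r": lambda v: int(v > 3),      # hypothetical predicate r
    "q": lambda v: 2,               # hypothetical function q
    "sum": lambda u, v: u + v,
}
a, one = ("app", "a", []), ("app", "one", [])
# if r(a) then let a = a + 1 in a else let a = a + q(a) in a
next_a = ("ite", ("app", "r", [a]),
          ("let", "a", [], ("app", "sum", [a, one]), a),
          ("let", "a", [], ("app", "sum", [a, ("app", "q", [a])]), a))
print(evaluate(I, next_a))                     # a = 5, r(5) holds
print(evaluate({**I, "a": lambda: 1}, next_a)) # a = 1, r(1) does not hold
```

Note that in `let a = a + 1 in a` the occurrence of `a` in the defining term refers to the outer interpretation, while the occurrence after `in` refers to the new binding; the sketch reproduces this by evaluating `s` in `I` and `u` in the updated interpretation.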

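Definition 4.

Let $\eta$ be a type assignment. An $\eta$-interpretation $I$ is a map, defined as follows. Instead of $I(e)$ we will write $\llbracket e \rrbracket_I$, for every element $e$ in the domain of $I$.
1. Each sort $\sigma \in S$ is mapped to a nonempty domain $\llbracket \sigma \rrbracket_I$. We require $\llbracket \mathit{bool} \rrbracket_I = \{0, 1\}$.
2. If $\eta \vdash x : \sigma$, then $\llbracket x \rrbracket_I \in \llbracket \sigma \rrbracket_I$.
3. If $\eta(f) = \sigma_1 \times \ldots \times \sigma_n \to \sigma$, then $\llbracket f \rrbracket_I$ is a function from $\llbracket \sigma_1 \rrbracket_I \times \ldots \times \llbracket \sigma_n \rrbracket_I$ to $\llbracket \sigma \rrbracket_I$.
4. We require $\llbracket \mathit{true} \rrbracket_I = 1$ and $\llbracket \mathit{false} \rrbracket_I = 0$. We require $\llbracket \land \rrbracket_I$, $\llbracket \lor \rrbracket_I$, $\llbracket \Rightarrow \rrbracket_I$, $\llbracket \Leftrightarrow \rrbracket_I$ and $\llbracket \neg \rrbracket_I$ respectively to be the logical conjunction, disjunction, implication, equivalence and negation, defined over $\{0, 1\}$ in the standard way.

Given an $\eta$-interpretation $I$ and a function symbol $f$, we define $I^g_f$ to be the mapping that maps $f$ to $g$ and coincides otherwise with $I$. Likewise, for a variable $x$ and value $a$ we define $I^a_x$ to be the mapping that maps $x$ to $a$ and coincides otherwise with $I$.

Definition 5.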

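Let $I$ be an $\eta$-interpretation, and $\eta \vdash t : \sigma$. The value of $t$ in $I$, denoted as $\mathrm{eval}_I(t)$, is a value in $\llbracket \sigma \rrbracket_I$ inductively defined as follows:

$$\mathrm{eval}_I(x) = \llbracket x \rrbracket_I.$$

$$\mathrm{eval}_I(f(t_1, \ldots, t_n)) = \llbracket f \rrbracket_I(\mathrm{eval}_I(t_1), \ldots, \mathrm{eval}_I(t_n)).$$

$$\mathrm{eval}_I(\texttt{if}\ \varphi\ \texttt{then}\ s\ \texttt{else}\ t) = \begin{cases} \mathrm{eval}_I(s), & \text{if } \mathrm{eval}_I(\varphi) = 1; \\ \mathrm{eval}_I(t), & \text{otherwise.} \end{cases}$$

$$\mathrm{eval}_I(\texttt{let}\ f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s\ \texttt{in}\ t) = \mathrm{eval}_{I^g_f}(t),$$

where $g$ is such that for all $i = 1, \ldots, n$ and $a_i \in \llbracket \sigma_i \rrbracket_I$, we have $g(a_1, \ldots, a_n) = \mathrm{eval}_{I^{a_1 \ldots a_n}_{x_1 \ldots x_n}}(s)$.

$$\mathrm{eval}_I(s \doteq t) = \begin{cases} 1, & \text{if } \mathrm{eval}_I(s) = \mathrm{eval}_I(t); \\ 0, & \text{otherwise.} \end{cases}$$

$$\mathrm{eval}_I((\forall x : \sigma)\varphi) = \begin{cases} 1, & \text{if } \mathrm{eval}_{I^a_x}(\varphi) = 1 \text{ for all } a \in \llbracket \sigma \rrbracket_I; \\ 0, & \text{otherwise.} \end{cases}$$

$$\mathrm{eval}_I((\exists x : \sigma)\varphi) = \begin{cases} 1, & \text{if } \mathrm{eval}_{I^a_x}(\varphi) = 1 \text{ for some } a \in \llbracket \sigma \rrbracket_I; \\ 0, & \text{otherwise.} \end{cases}$$

Theorem 2.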

Let $\eta \vdash \varphi : \mathit{bool}$ and $I$ be an $\eta$-interpretation. Then
1. for every free variable $x$ of $\varphi$, $I$ is defined on $x$;
2. for every free function symbol $f$ of $\varphi$, $I$ is defined on $f$;
3. if $x$ is a variable not free in $\varphi$, $\sigma$ is an arbitrary sort, and $a \in \llbracket \sigma \rrbracket_I$, then $\mathrm{eval}_I(\varphi) = \mathrm{eval}_{I^a_x}(\varphi)$;
4. if $f$ is a function symbol not free in $\varphi$, $\sigma_1, \ldots, \sigma_n, \sigma$ are arbitrary sorts and $g \in \llbracket \sigma_1 \rrbracket_I \times \ldots \times \llbracket \sigma_n \rrbracket_I \to \llbracket \sigma \rrbracket_I$, then $\mathrm{eval}_I(\varphi) = \mathrm{eval}_{I^g_f}(\varphi)$. ❏

Let $\eta \vdash \varphi : \mathit{bool}$. An $\eta$-interpretation $I$ is called a model of $\varphi$, denoted by $I \models \varphi$, if $\mathrm{eval}_I(\varphi) = 1$. If $I \models \varphi$, we also say that $I$ satisfies $\varphi$. We say that $\varphi$ is valid if $I \models \varphi$ for all $\eta$-interpretations $I$, and satisfiable if $I \models \varphi$ for at least one $\eta$-interpretation $I$. Note that Theorem 2 implies that any interpretation which coincides with $I$ on free variables and free function symbols of $\varphi$ is also a model of $\varphi$.

3 Translation of FOOL to FOL

FOOL is a modification of FOL. Every FOL formula is syntactically a FOOL formula and has the same models, but not the other way around. In this section we present a translation from FOOL to FOL which preserves models. This translation can be used for proving theorems of FOOL using a first-order theorem prover. We do not claim that this translation is efficient; more research is required on designing translations friendly for first-order theorem provers.

We do not formally define many-sorted FOL with equality here, since FOL is essentially a subset of FOOL, which we will discuss now.

We say that an occurrence of a subterm $s$ of the sort $\mathit{bool}$ in a term $t$ is in a formula context if it is an argument of a logical connective or the occurrence in either $(\forall x : \sigma)s$ or $(\exists x : \sigma)s$. We say that an occurrence of $s$ in $t$ is in a term context if this occurrence is an argument of a function symbol different from a logical connective, or of an equality. We say that a formula of FOOL is syntactically first-order if it contains no if-then-else and let-in expressions, no variables occurring in a formula context and no formulas occurring in a term context. By restricting the definition of terms to the subset of syntactically first-order formulas, we obtain the standard definition of many-sorted first-order logic, with the only exception of having a distinguished boolean sort and constants $\mathit{true}$ and $\mathit{false}$ occurring in a formula context.

Let $\varphi$ be a closed $\Sigma$-formula of FOOL. We will perform the following steps to translate $\varphi$ into a first-order formula. During the translation we will maintain a set of formulas $D$, which initially is empty. The purpose of $D$ is to collect a set of formulas (definitions of new symbols), which guarantee that the transformation preserves models.
1. Make a sequence of translation steps obtaining a syntactically first-order formula $\varphi'$. During this translation we will introduce new function symbols and add their types to the type assignment $\eta$.
We will also add formulas describing properties of these symbols to $D$. The translation will guarantee that the formulas $\varphi$ and $\bigwedge_{\psi \in D} \psi \land \varphi'$ are equivalent, that is, have the same models restricted to $\Sigma$.
2. Replace the constants $\mathit{true}$ and $\mathit{false}$, standing in a formula context, by nullary predicates $\top$ and $\bot$ respectively, obtaining a first-order formula.
3. Add special boolean sort axioms.

During the translation, we will say that a function symbol or a variable is fresh if it appears neither in $\varphi$, nor in any of the definitions, nor in the domain of $\eta$.

We also need the following definition. Let $\eta \vdash t : \sigma$, and $x$ be a variable occurrence in $t$. The sort of this occurrence of $x$ is defined as follows:
1. any free occurrence of $x$ in a subterm $s$ in the scope of $(\forall x : \sigma')s$ or $(\exists x : \sigma')s$ has the sort $\sigma'$;
2. any free occurrence of $x_i$ in a subterm $s_1$ in the scope of $\texttt{let}\ f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s_1\ \texttt{in}\ s_2$ has the sort $\sigma_i$, where $i = 1, \ldots, n$;
3. a free occurrence of $x$ in $t$ has the sort $\eta(x)$.
If $\eta \vdash t : \sigma$, $s$ is a subterm of $t$ and $x$ a free variable in $s$, we say that $x$ has a sort $\sigma'$ in $s$ if its free occurrences in $s$ have this sort.

The translation steps are defined below. We start with an empty set $D$ and an initial FOOL formula $\varphi$, which we would like to change into a syntactically first-order formula. At every translation step we will select a formula $\chi$, which is either $\varphi$ or a formula in $D$, and which is not syntactically first-order, replace a subterm of $\chi$ by another subterm, and possibly add a formula to $D$. The translation steps can be applied in any order.
1. Replace a boolean variable $x$ occurring in a formula context by $x \doteq \mathit{true}$.
2. Suppose that $\psi$ is a formula occurring in a term context such that (i) $\psi$ is different from $\mathit{true}$ and $\mathit{false}$, (ii) $\psi$ is not a variable, and (iii) $\psi$ contains no free occurrences of function symbols bound in $\chi$. Let $x_1, \ldots, x_n$ be all free variables of $\psi$ and $\sigma_1, \ldots, \sigma_n$ be their sorts.
Take a fresh function symbol $g$, add the formula $(\forall x_1 : \sigma_1) \ldots (\forall x_n : \sigma_n)(\psi \Leftrightarrow g(x_1, \ldots, x_n) \doteq \mathit{true})$ to $D$ and replace $\psi$ by $g(x_1, \ldots, x_n)$. Finally, change $\eta$ to $\eta, g : \sigma_1 \times \ldots \times \sigma_n \to \mathit{bool}$.
3. Suppose that $\texttt{if}\ \psi\ \texttt{then}\ s\ \texttt{else}\ t$ is a term containing no free occurrences of function symbols bound in $\chi$. Let $x_1, \ldots, x_n$ be all free variables of this term and $\sigma_1, \ldots, \sigma_n$ be their sorts. Take a fresh function symbol $g$, add the formulas $(\forall x_1 : \sigma_1) \ldots (\forall x_n : \sigma_n)(\psi \Rightarrow g(x_1, \ldots, x_n) \doteq s)$ and $(\forall x_1 : \sigma_1) \ldots (\forall x_n : \sigma_n)(\neg\psi \Rightarrow g(x_1, \ldots, x_n) \doteq t)$ to $D$ and replace this term by $g(x_1, \ldots, x_n)$. Finally, change $\eta$ to $\eta, g : \sigma_1 \times \ldots \times \sigma_n \to \sigma$, where $\sigma$ is such that $\eta, x_1 : \sigma_1, \ldots, x_n : \sigma_n \vdash s : \sigma$.
4. Suppose that $\texttt{let}\ f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s\ \texttt{in}\ t$ is a term containing no free occurrences of function symbols bound in $\chi$. Let $y_1, \ldots, y_m$ be all free variables of this term and $\tau_1, \ldots, \tau_m$ be their sorts. Note that the variables in $x_1, \ldots, x_n$ are not necessarily disjoint from the variables in $y_1, \ldots, y_m$. Take a fresh function symbol $g$ and a fresh sequence of variables $z_1, \ldots, z_n$. Let the term $s'$ be obtained from $s$ by replacing all free occurrences of $x_1, \ldots, x_n$ by $z_1, \ldots, z_n$, respectively. Add the formula $(\forall z_1 : \sigma_1) \ldots (\forall z_n : \sigma_n)(\forall y_1 : \tau_1) \ldots (\forall y_m : \tau_m)(g(z_1, \ldots, z_n, y_1, \ldots, y_m) \doteq s')$ to $D$. Let the term $t'$ be obtained from $t$ by replacing all bound occurrences of $y_1, \ldots, y_m$ by fresh variables and each application $f(t_1, \ldots, t_n)$ of a free occurrence of $f$ in $t$ by $g(t_1, \ldots, t_n, y_1, \ldots, y_m)$. Then replace $\texttt{let}\ f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s\ \texttt{in}\ t$ by $t'$. Finally, change $\eta$ to $\eta, g : \sigma_1 \times \ldots \times \sigma_n \times \tau_1 \times \ldots \times \tau_m \to \sigma$, where $\sigma$ is such that $\eta, x_1 : \sigma_1, \ldots, x_n : \sigma_n, y_1 : \tau_1, \ldots, y_m : \tau_m \vdash s : \sigma$.

The translation terminates when none of the above rules apply.

We will now formulate several properties of this translation, which will imply that, in a way, it preserves models. These properties are not hard to prove, so we do not include proofs in this paper.

Lemma 1.

Suppose that a single step of the translation changes a formula $\varphi_1$ into $\varphi_2$, $\delta$ is the formula added at this step (for step 1 we can assume $\mathit{true} \doteq \mathit{true}$ is added), $\eta$ is the type assignment before this step and $\eta'$ is the type assignment after. Then for every $\eta'$-interpretation $I$ we have $I \models \delta \Rightarrow (\varphi_1 \Leftrightarrow \varphi_2)$. ❏

By repeated applications of this lemma we obtain the following result.

Lemma 2.

Suppose that the translation above changes a formula $\varphi$ into $\varphi'$, $D$ is the set of definitions obtained during the translation, $\eta$ is the initial type assignment and $\eta'$ is the final type assignment of the translation. Let $I'$ be any interpretation of $\eta'$. Then $I' \models \bigwedge_{\psi \in D} \psi \Rightarrow (\varphi \Leftrightarrow \varphi')$. ❏

We also need the following result.

Lemma 3.

Any sequence of applications of the translation rules terminates. ❏

The lemmas stated so far imply that the translation terminates and the final formula is equivalent to the initial formula in every interpretation satisfying all definitions in $D$. To prove model preservation, we also need to prove some properties of the introduced definitions.

Lemma 4.

Suppose that one of the steps 2–4 of the translation translates a formula $\varphi_1$ into $\varphi_2$, $\delta$ is the formula added at this step, $\eta$ is the type assignment before this step, $\eta'$ is the type assignment after, and $g$ is the fresh function symbol introduced at this step. Let $I$ also be an $\eta$-interpretation. Then there exists a function $h$ such that $I^h_g \models \delta$. ❏

These properties imply the following result on model preservation.

Theorem 3.

Suppose that the translation above translates a formula $\varphi$ into $\varphi'$, $D$ is the set of definitions obtained during the translation, $\eta$ is the initial type assignment and $\eta'$ is the final type assignment of the translation.
1. Let $I$ be any $\eta$-interpretation. Then there is an $\eta'$-interpretation $I'$ such that $I'$ is an extension of $I$ and $I' \models \bigwedge_{\psi \in D} \psi \land \varphi'$.
2. Let $I'$ be an $\eta'$-interpretation and $I' \models \bigwedge_{\psi \in D} \psi \land \varphi'$. Then $I' \models \varphi$. ❏

This theorem implies that $\varphi$ and $\bigwedge_{\psi \in D} \psi \land \varphi'$ have the same models, as far as the original type assignment (the type assignment of $\Sigma$) is concerned. The formula $\bigwedge_{\psi \in D} \psi \land \varphi'$ in this theorem is syntactically first-order. Denote this formula by $\gamma$. Our next step is to define a model-preserving translation from syntactically first-order formulas to first-order formulas.

To make $\gamma$ into a first-order formula, we should get rid of $\mathit{true}$ and $\mathit{false}$ occurring in a formula context. To preserve the semantics, we should also add axioms for the boolean sort, since in first-order logic all sorts are uninterpreted, while in FOOL the interpretations of the boolean sort and constants $\mathit{true}$ and $\mathit{false}$ are fixed.

To fix the problem, we will add axioms expressing that the boolean sort has two elements and that $\mathit{true}$ and $\mathit{false}$ represent the two distinct elements of this sort.

$$(\forall x : \mathit{bool})(x \doteq \mathit{true} \lor x \doteq \mathit{false}) \land \mathit{true} \not\doteq \mathit{false}. \qquad (3)$$

Note that this formula is a tautology in FOOL, but not in FOL.

Given a syntactically first-order formula $\gamma$, we denote by $\mathit{fol}(\gamma)$ the formula obtained from $\gamma$ by replacing all occurrences of $\mathit{true}$ and $\mathit{false}$ in a formula context by logical constants $\top$ and $\bot$ (interpreted as always true and always false), respectively, and adding formula (3).

Theorem 4. Let $\eta$ be a type assignment and $\gamma$ be a syntactically first-order formula such that $\eta \vdash \gamma : \mathit{bool}$.
1. Suppose that $I$ is an $\eta$-interpretation and $I \models \gamma$ in FOOL. Then $I \models \mathit{fol}(\gamma)$ in first-order logic.
2. Suppose that $I$ is an $\eta$-interpretation and $I \models \mathit{fol}(\gamma)$ in first-order logic. Consider the FOOL-interpretation $I'$ that is obtained from $I$ by changing the interpretation of the boolean sort $\mathit{bool}$ to $\{0, 1\}$ and the interpretations of $\mathit{true}$ and $\mathit{false}$ to the elements 1 and 0, respectively, of this sort. Then $I' \models \gamma$ in FOOL. ❏

Theorems 3 and 4 show that our translation preserves models. Every model of the original formula can be extended to a model of the translated formulas by adding values of the function symbols introduced during the translation. Likewise, any first-order model of the translated formula becomes a model of the original formula after changing the interpretation of the boolean sort to coincide with its interpretation in FOOL.
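The role of the definitions collected in D can be checked by brute force on a tiny instance of translation step 3 (elimination of if-then-else). The sketch below is only an illustration under our own assumptions: a single sort with the hypothetical finite domain {0, 1, 2}, hypothetical interpretations of a predicate p, a function f and a constant c, and the formula f(if p(x) then x else c) ≐ x with x free. It enumerates every possible interpretation of the fresh symbol g and keeps those that satisfy the two definitions added by step 3.

```python
from itertools import product

D = (0, 1, 2)                      # hypothetical finite domain of a sort sigma
p = lambda x: x % 2 == 0           # interpretation of a predicate symbol p
f = lambda x: (2 * x) % 3          # interpretation of a function symbol f
c = 2                              # interpretation of a constant c


def phi(x):
    """FOOL formula  f(if p(x) then x else c) = x,  with x free."""
    return f(x if p(x) else c) == x


def phi_prime(g, x):
    """Result of step 3: the conditional is replaced by g(x)."""
    return f(g[x]) == x


def definitions_hold(g):
    """(forall x)(p(x) => g(x) = x)  and  (forall x)(not p(x) => g(x) = c)."""
    return all((not p(x) or g[x] == x) and (p(x) or g[x] == c) for x in D)


# Enumerate every interpretation of the fresh symbol g : sigma -> sigma.
candidates = [dict(zip(D, values)) for values in product(D, repeat=len(D))]
good = [g for g in candidates if definitions_hold(g)]

print(good)
for g in good:
    print([phi(x) == phi_prime(g, x) for x in D])
```

In this instance exactly one interpretation of g satisfies the definitions, which matches the existence claim of Lemma 4, and under it the original and the translated formula agree on every value of x, as Lemma 2 requires. Interpretations of g that violate the definitions may disagree with the original formula, which is why the definitions must be conjoined with the translated formula.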

In Section 3 we presented a model-preserving syntactic translation of FOOL to FOL. Based on this translation, automated reasoning about FOOL formulas can be done by translating a FOOL formula into a FOL formula, and using an automated first-order theorem prover on the resulting FOL formula. State-of-the-art first-order theorem provers, such as Vampire [13], E [18] and Spass [23], implement the superposition calculus for proving first-order formulas. Naturally, we would like to have a translation exploiting such provers in an efficient manner.

Note however that our translation adds the two-element domain axiom $(\forall x : \mathit{bool})(x \doteq \mathit{true} \lor x \doteq \mathit{false})$ for the boolean sort. This axiom will be converted to the clause

$$x \doteq \mathit{true} \lor x \doteq \mathit{false}, \qquad (4)$$

where $x$ is a boolean variable. In this section we explain why this axiom requires a special treatment and propose a solution to overcome problems caused by its presence.

We assume some basic understanding of first-order theorem proving and superposition calculus, see, e.g. [1,15]. We fix a superposition inference system for first-order logic with equality, parametrised by a simplification ordering $\succ$ on literals and a well-behaved literal selection function [13], that is, a function that guarantees completeness of the calculus. We denote selected literals by underlining them. We assume that equality literals are treated by a dedicated inference rule, namely, the ordered paramodulation rule [17]:

$$\frac{\underline{l \doteq r} \lor C \qquad \underline{L[s]} \lor D}{(L[r] \lor C \lor D)\theta} \quad \text{if } \theta = \mathrm{mgu}(l, s),$$

where $C, D$ are clauses, $L$ is a literal, $l, r, s$ are terms, $\mathrm{mgu}(l, s)$ is a most general unifier of $l$ and $s$, and $r\theta \not\succeq l\theta$. The notation $L[s]$ denotes that $s$ is a subterm of $L$; then $L[r]$ denotes the result of replacing $s$ by $r$.

Suppose now that we use an off-the-shelf superposition theorem prover to reason about FOL formulas obtained by our translation. Without loss of generality, we assume that $\mathit{true} \succ \mathit{false}$ in the term ordering used by the prover.
Then self-paramodulation (from \(\mathit{true}\) to \(\mathit{true}\)) can be applied to clause (4) as follows:

\[
\frac{x \doteq \mathit{true} \lor x \doteq \mathit{false} \qquad y \doteq \mathit{true} \lor y \doteq \mathit{false}}{x \doteq y \lor x \doteq \mathit{false} \lor y \doteq \mathit{false}}
\]

The derived clause \(x \doteq y \lor x \doteq \mathit{false} \lor y \doteq \mathit{false}\) is a recipe for disaster, since the literal \(x \doteq y\) must be selected and can be used for paramodulation into every non-variable term of the boolean sort. Very soon the search space will contain many clauses obtained as logical consequences of clause (4) and as results of paramodulation from variables applied to them. This will cause a rapid degradation of the performance of superposition-based provers.

To get around this problem, we propose the following solution. First, we choose term orderings \(\succ\) with the following properties: \(\mathit{true} \succ \mathit{false}\), and \(\mathit{true}\) and \(\mathit{false}\) are the smallest ground terms w.r.t. \(\succ\). Consider now all ground instances of (4). They have the form \(s \doteq \mathit{true} \lor s \doteq \mathit{false}\), where \(s\) is a ground term. When \(s\) is either \(\mathit{true}\) or \(\mathit{false}\), this instance is a tautology, and hence redundant. Therefore, we should only consider instances for which \(s \succ \mathit{true}\). This prevents self-paramodulation of (4).

Now the only possible inferences with (4) are inferences of the form

\[
\frac{x \doteq \mathit{true} \lor x \doteq \mathit{false} \qquad C[s]}{C[\mathit{true}] \lor s \doteq \mathit{false}},
\]

where \(s\) is a non-variable term of the sort \(\mathit{bool}\). To implement this, we can remove clause (4) and add the following extra inference rule to the superposition calculus:

\[
\frac{C[s]}{C[\mathit{true}] \lor s \doteq \mathit{false}},
\]

where \(s\) is a non-variable term of the sort \(\mathit{bool}\). For instance, from a clause \(p(f(a))\), where \(f(a)\) has sort \(\mathit{bool}\), the rule derives \(p(\mathit{true}) \lor f(a) \doteq \mathit{false}\).

The typed monomorphic first-order formulas subset of the TPTP language [20], called TFF0, is a representation language for many-sorted first-order logic. It contains if-then-else and let-in constructs (see below), which are useful for applications, but it is inconsistent in its treatment of the boolean sort. It has a predefined atomic sort symbol $o denoting the boolean sort.
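As a small illustration of where $o shows up in a TFF0 problem, the following fragment declares a sort, a constant of that sort and a predicate whose return type is $o. It is only a sketch: the symbol names (person, bob, happy) and the formula names are hypothetical and chosen for this example.

```tptp
% Hypothetical symbols, for illustration only.
tff(person_type,  type,  person: $tType).     % a user-declared sort
tff(bob_decl,     type,  bob: person).        % a constant of that sort
tff(happy_decl,   type,  happy: person > $o). % a predicate: $o is its return type
tff(bob_is_happy, axiom, happy(bob)).         % an atomic formula built from the predicate
```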
However, unlike all other sort symbols, $o can only be used to declare the return type of predicate symbols. This means that one cannot define a function having a boolean argument, use boolean variables or use equality between booleans.

Such an inconsistent use of the boolean sort results in having two kinds of if-then-else expressions and four kinds of let-in expressions. For example, a FOOL-term \(\mathtt{let}\ f(x_1 : \sigma_1, \ldots, x_n : \sigma_n) = s\ \mathtt{in}\ t\) can be represented using one of the four TPTP alternatives $let_tt, $let_tf, $let_ft and $let_ff, depending on whether \(s\) and \(t\) are terms or formulas.

Since the boolean type is second-class in TPTP, one cannot directly represent formulas coming from program analysis and interactive theorem provers, such as formulas (1) and (2) of Section 1.

We propose to modify the TFF0 language of TPTP to coincide with FOOL. It is not too late to do so, since there is no general support for if-then-else and let-in. To the best of our knowledge, Vampire is currently the only theorem prover supporting full TFF0. Note that such a modification of TPTP would make multiple forms of if-then-else and let-in redundant. It will also make it possible to directly represent the SMT-LIB core theory.

We note that our changes and modifications to TFF0 can also be applied to the TFF1 language of TPTP [4]. TFF1 is a polymorphic extension of TFF0, and its formalisation does not treat the boolean sort. Extending our work to TFF1 should not be hard but has to be done in detail.

Handling boolean terms as formulas is common in the SMT community. The SMT-LIB project [3] defines its core logic as first-order logic extended with the distinguished first-class boolean sort and the let-in expression used for local bindings of variables. The core theory of SMT-LIB defines logical connectives as boolean functions and the ad-hoc polymorphic if-then-else (ite) function, used for conditional expressions.
The language FOOL defined here extends the SMT-LIB core language with local function definitions, using let-in expressions defining functions of arbitrary, and not just zero, arity. Thus, FOOL contains both this language and the TFF0 subset of TPTP. Further, we present a translation of FOOL to FOL and show how one can improve superposition theorem provers to reason with the boolean sort.

Efficient superposition theorem proving in finite domains, such as the boolean domain, is also discussed in [9]. The approach of [9] sometimes falls back to enumerating instances of a clause by instantiating finite domain variables with all elements of the corresponding domains. We point out here that for the boolean (i.e., two-element) domain there is a simpler solution. However, the approach of [9] also allows one to handle domains with more than two elements. One can also generalise our approach to arbitrary finite domains by using binary encodings of finite domains; however, this will necessarily result in a loss of efficiency, since a single variable over a domain with \(2^k\) elements will become \(k\) variables in our approach, and similarly for function arguments.

Conclusion

We defined first-order logic with the first-class boolean sort (FOOL). It extends ordinary many-sorted first-order logic (FOL) with (i) the boolean sort such that terms of this sort are indistinguishable from formulas and (ii) if-then-else and let-in expressions. The semantics of let-in expressions in FOOL is essentially their semantics in functional programming languages, when they are not used for recursive definitions. In particular, non-recursive local functions can be defined and function symbols can be bound to a different sort in nested let-in expressions.

We argued that these extensions are useful in reasoning about problems coming from program analysis and interactive theorem proving. The extraction of properties from certain program definitions (especially in functional programming languages) into FOOL formulas is more straightforward than into ordinary FOL formulas and potentially more efficient. In a similar way, a more straightforward translation of certain higher-order formulas into FOOL can facilitate proof automation in interactive theorem provers.

FOOL is a modification of FOL and reasoning in it reduces to reasoning in FOL. We gave a translation of FOOL to FOL that can be used for proving theorems in FOOL in a first-order theorem prover. We further discussed a modification of the superposition calculus that can reason efficiently in the presence of the boolean sort. Finally, we pointed out that the TPTP language can be changed to support FOOL, which will also simplify some parts of the TPTP syntax.

Implementation of theorem proving support for FOOL, including its superposition-friendly translation to CNF, is an important task for future work. Further, we are also interested in extending FOOL with theories, such as the theory of integer linear arithmetic and arrays.

References

1. Bachmair, L., Ganzinger, H.: Resolution Theorem Proving. In: Handbook of Automated Reasoning, pp. 19–99. Elsevier and MIT Press (2001)
2. Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanovic, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: Proc. of CAV. pp. 171–177 (2011)
3. Barrett, C., Stump, A., Tinelli, C.: The SMT-LIB Standard: Version 2.0. Tech. rep., Department of Computer Science, The University of Iowa (2010)
4. Blanchette, J.C., Paskevich, A.: TFF1: The TPTP Typed First-Order Form with Rank-1 Polymorphism. In: Proc. of CADE-24. pp. 414–420. Springer (2013)
5. Böhme, S., Nipkow, T.: Sledgehammer: Judgement Day. In: Proc. of IJCAR. pp. 107–121 (2010)
6. de Moura, L., Bjørner, N.: Z3: An Efficient SMT Solver. In: Proc. of TACAS. pp. 337–340 (2008)
7. Dragan, I., Kovács, L.: Lingva: Generating and Proving Program Properties Using Symbol Elimination. In: Proc. of PSI. pp. 67–75 (2014)
8. Gupta, A., Kovács, L., Kragl, B., Voronkov, A.: Extensionality Crisis and Proving Identity. In: Proc. of ATVA. pp. 185–200 (2014)
9. Hillenbrand, T., Weidenbach, C.: Superposition for Bounded Domains. In: Automated Reasoning and Mathematics - Essays in Memory of William W. McCune. pp. 68–100 (2013)
10. Hoder, K., Kovács, L., Voronkov, A.: Playing in the grey area of proofs. In: Proc. of POPL. pp. 259–272 (2012)
11. Korovin, K.: iProver - An Instantiation-Based Theorem Prover for First-Order Logic (System Description). In: Proc. of IJCAR. pp. 292–298 (2008)
12. Kovács, L., Voronkov, A.: Finding Loop Invariants for Programs over Arrays Using a Theorem Prover. In: Proc. of FASE. pp. 470–485 (2009)
13. Kovács, L., Voronkov, A.: First-Order Theorem Proving and Vampire. In: Proc. of CAV. pp. 1–35 (2013)
14. McMillan, K.L.: Quantified Invariant Generation Using an Interpolating Saturation Prover. In: Proc. of TACAS. pp. 413–427 (2008)
15. Nieuwenhuis, R., Rubio, A.: Paramodulation-Based Theorem Proving. In: Robinson, A., Voronkov, A. (eds.) Handbook of Automated Reasoning, vol. I, chap. 7, pp. 371–443. Elsevier Science (2001)
16. Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL - A Proof Assistant for Higher-Order Logic (2002)
17. Robinson, G., Wos, L.: Paramodulation and theorem-proving in first-order theories with equality. Machine Intelligence 4, 135–150 (1969)
18. Schulz, S.: System Description: E 1.8. In: Proc. of LPAR. pp. 735–743 (2013)
19. Sutcliffe, G.: The TPTP Problem Library and Associated Infrastructure. J. Autom. Reasoning 43(4), 337–362 (2009)
20. Sutcliffe, G., Schulz, S., Claessen, K., Baumgartner, P.: The TPTP Typed First-Order Form with Arithmetic. In: Proc. of LPAR. pp. 406–419. Springer (2012)
21. Trybulec, A.: Mizar. In: The Seventeen Provers of the World, Foreword by Dana S. Scott. pp. 20–23 (2006)
22. Urban, J., Hoder, K., Voronkov, A.: Evaluation of Automated Theorem Proving on the Mizar Mathematical Library. In: ICMS. pp. 155–166 (2010)
23. Weidenbach, C., Dimova, D., Fietzke, A., Kumar, R., Suda, M., Wischnewski, P.: Spass version 3.5. In: CADE. pp. 140–145 (2009)