Superposition with Lambdas
Alexander Bentkamp, Jasmin Blanchette, Sophie Tourret, Petar Vukmirović, Uwe Waldmann
J. Autom. Reasoning manuscript No. (will be inserted by the editor)
Received: date / Accepted: date
Abstract
We designed a superposition calculus for a clausal fragment of extensional polymorphic higher-order logic that includes anonymous functions but excludes Booleans. The inference rules work on βη-equivalence classes of λ-terms and rely on higher-order unification to achieve refutational completeness. We implemented the calculus in the Zipperposition prover and evaluated it on TPTP and Isabelle benchmarks. The results suggest that superposition is a suitable basis for higher-order reasoning.

Keywords superposition calculus · higher-order logic · refutational completeness

Introduction

Superposition [6] is widely regarded as the calculus par excellence for reasoning about first-order logic with equality. To increase automation in proof assistants and other verification tools based on higher-order formalisms, we propose to generalize superposition to an extensional, polymorphic, clausal version of higher-order logic (also called simple type theory). Our ambition is to achieve a graceful extension, which coincides with standard superposition on first-order problems and smoothly scales up to arbitrary higher-order problems.

Bentkamp, Blanchette, Cruanes, and Waldmann [12] designed a family of superposition-like calculi for a λ-free clausal fragment of higher-order logic, with currying and applied variables. We adapt their extensional nonpurifying calculus to support λ-terms (Sect. 3). Our calculus does not support interpreted Booleans; it is conceived as the penultimate milestone towards a superposition calculus for full higher-order logic.
Alexander Bentkamp (✉) · Jasmin Blanchette · Petar Vukmirović
Vrije Universiteit Amsterdam, Department of Computer Science, Section of Theoretical Computer Science, De Boelelaan 1111, 1081 HV Amsterdam, the Netherlands
E-mail: {a.bentkamp,j.c.blanchette,p.vukmirovic}@vu.nl

Jasmin Blanchette · Sophie Tourret · Uwe Waldmann
Max-Planck-Institut für Informatik, Saarland Informatics Campus E1 4, 66123 Saarbrücken, Germany
E-mail: {jblanche,stourret,uwe}@mpi-inf.mpg.de

If desired, Booleans can be encoded in our logic fragment using an uninterpreted type and uninterpreted "proxy" symbols corresponding to equality, the connectives, and the quantifiers.

Designing a higher-order superposition calculus poses three main challenges:
1. Standard superposition is parameterized by a ground-total simplification order ≻, but such orders do not exist for λ-terms equal up to β-conversion. The relations designed for proving termination of higher-order term rewriting systems, such as HORPO [40] and CPO [22], lack many of the desired properties (e.g., transitivity, stability under grounding substitutions).

2. Higher-order unification is undecidable and may give rise to an infinite set of incomparable unifiers. For example, the constraint f (y a) ≟ y (f a) admits infinitely many independent solutions of the form {y ↦ λx. fⁿ x}.
3. In first-order logic, to rewrite into a term s using an oriented equation t ≈ t′, it suffices to find a subterm of s that is unifiable with t. In higher-order logic, this is insufficient. Consider superposition from f c ≈ a into y c ≉ y b. The left-hand sides can obviously be unified by {y ↦ f}, but the more general {y ↦ λx. z x (f x)} also gives rise to a subterm f c after β-reduction. The corresponding inference generates the clause z c a ≉ z b (f b).

To address the first challenge, we adopt the η-short β-normal form to represent βη-equivalence classes of λ-terms. In the spirit of Jouannaud and Rubio's early joint work [39], we state requirements on the term order only for ground terms (i.e., closed monomorphic βη-equivalence classes); the nonground case is connected to the ground case via stability under grounding substitutions. Even on ground terms, we cannot obtain all desirable properties. We sacrifice compatibility with arguments (the property that s′ ≻ s implies s′ t ≻ s t), compensating with an argument congruence rule (ArgCong), as in Bentkamp et al. [12].

For the second challenge, we accept that there might be infinitely many incomparable unifiers and enumerate a complete set (including the notorious flex–flex pairs [37]), relying on heuristics to postpone the combinatorial explosion. The saturation loop must also be adapted to interleave this enumeration with the theorem prover's other activities (Sect. 6). Despite its reputation for explosiveness, higher-order unification is a conceptual improvement over SK combinators, because it can often compute the right unifier. Consider the conjecture ∃z. ∀x y. z x y ≈ f y x. After negation, clausification, and skolemization (which are as for first-order logic), the formula becomes z (sk_x z) (sk_y z) ≉ f (sk_y z) (sk_x z). Higher-order unification quickly computes the unique unifier: {z ↦ λx y. f y x}.
In contrast, an encoding approach based on combinators, similar to the one implemented in Sledgehammer [52], would blindly enumerate all possible SK terms for z until the right one, S (K (S f)) K, is found. Given the definitions S z y x ≈ z x (y x) and K x y ≈ x, the E prover [59] in auto mode needs to perform 3757 inferences to derive the empty clause.

For the third challenge, the idea is that, when applying t ≈ t′ to perform rewriting inside a higher-order term s, we can encode an arbitrary context as a fresh higher-order variable z, unifying s with z t; the result is (z t′)σ, for some unifier σ. This is performed by a dedicated fluid subterm superposition rule (FluidSup).

Functional extensionality is also considered a quintessential higher-order challenge [14], although similar difficulties arise with first-order sets and arrays [34]. Our approach is to add extensionality as an axiom and provide optional rules as optimizations (Sect. 5). With this axiom, our calculus is refutationally complete w.r.t. extensional Henkin semantics (Sect. 4). Our proof employs the new saturation framework by Waldmann et al. [69] to derive dynamic completeness of a given clause prover from ground static completeness.

We implemented the calculus in the Zipperposition prover [28] (Sect. 6). Our empirical evaluation includes benchmarks from the TPTP [63] and interactive verification problems exported from Isabelle/HOL [23] (Sect. 7). The results clearly demonstrate the calculus's potential. The 2020 edition of the CADE ATP System Competition (CASC) provides further confirmation: Zipperposition finished 20 percentage points ahead of its closest rival. This suggests that an implementation inside a high-performance prover such as E [59] or Vampire [48] could fulfill the promise of strong proof automation for higher-order logic (Sect. 8).

An earlier version of this article was presented at CADE-27 [11].
This article extends the conference paper with more explanations, detailed soundness and completeness proofs, including dynamic completeness, and new optional inference rules. We have also updated the empirical evaluation and extended the coverage of related work. Finally, we tightened side condition 4 of FluidSup, making the rule slightly less explosive.

Logic

Our extensional polymorphic clausal higher-order logic is a restriction of full TPTP THF [16] to rank-1 (top-level) polymorphism, as in TH1 [41]. In keeping with standard superposition, we consider only formulas in conjunctive normal form, without explicit quantifiers or Boolean type. We use Henkin semantics [15, 31, 35], as opposed to the standard semantics that is commonly considered the foundation of the HOL systems [33]. However, both of these semantics are compatible with the notion of provability employed by the HOL systems. By admitting nonstandard models, Henkin semantics is not subject to Gödel's first incompleteness theorem, allowing us to claim not only soundness but also refutational completeness of our calculus.
Syntax
We fix a set Σ_ty of type constructors with arities and a set V_ty of type variables. We require at least one nullary type constructor and a binary function type constructor → to be present in Σ_ty. A type (typically denoted τ or υ) is either a type variable α ∈ V_ty or has the form κ(τ̄ₙ) for an n-ary type constructor κ ∈ Σ_ty and types τ̄ₙ. We use the notation āₙ or ā to stand for the tuple (a₁, ..., aₙ) or product a₁ × ··· × aₙ, where n ≥ 0. We write κ for κ() and τ → υ for →(τ, υ). Type declarations have the form Πᾱₘ. τ (or simply τ if m = 0), where all type variables occurring in τ belong to ᾱₘ.

We fix a set Σ of (function) symbols a, b, c, f, g, h, ..., with type declarations, written as f : Πᾱₘ. τ or f, and a set V of term variables with associated types, written as x : τ or x. The notation t : τ will also be used to indicate the type of arbitrary terms t. We require the presence of a symbol of type Πα. α and of a symbol diff : Πα, β. (α → β) → (α → β) → α in Σ. We use diff to express the polymorphic functional extensionality axiom. A signature is a pair (Σ_ty, Σ).

In the following, we will define terms in three layers of abstraction: raw λ-terms, λ-terms, and terms, where λ-terms will be α-equivalence classes of raw λ-terms and terms will be βη-equivalence classes of λ-terms.

The raw λ-terms over a given signature and their associated types are defined inductively as follows. Every x : τ ∈ V is a raw λ-term of type τ. If f : Πᾱₘ. τ ∈ Σ and ῡₘ is a tuple of types, called type arguments, then f⟨ῡₘ⟩ (or f if m = 0) is a raw λ-term of type τ{ᾱₘ ↦ ῡₘ}. If x : τ and t : υ, then the λ-expression λx. t is a raw λ-term of type τ → υ. If s : τ → υ and t : τ, then the application s t is a raw λ-term of type υ.

The function type constructor → is right-associative; application is left-associative. Using the spine notation [26], raw λ-terms can be decomposed in a unique way as a nonapplication head t applied to zero or more arguments: t s₁ ... sₙ or t s̄ₙ (abusing notation).

A raw λ-term s is a subterm of a raw λ-term t, written t = t[s], if t = s, if t = (λx. u[s]), if t = (u[s]) v, or if t = u (v[s]) for some raw λ-terms u and v. A proper subterm of a raw λ-term t is any subterm of t that is distinct from t itself.
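The subterm relation and the spine decomposition just defined can be sketched on a simple untagged-tuple representation of raw λ-terms. The following Python sketch is our own illustration (not the paper's implementation); types are omitted for brevity, and the constructor names are hypothetical.

```python
# Raw lambda-terms as tuples: ('var', x), ('sym', f), ('lam', x, t), ('app', s, t).

def subterms(t):
    """All subterms of t per the inductive definition: t itself, plus the
    subterms of a lambda body or of either side of an application."""
    yield t
    if t[0] == 'lam':
        yield from subterms(t[2])
    elif t[0] == 'app':
        yield from subterms(t[1])
        yield from subterms(t[2])

def spine(t):
    """Decompose t uniquely as a nonapplication head applied to arguments."""
    args = []
    while t[0] == 'app':
        args.append(t[2])
        t = t[1]
    return t, tuple(reversed(args))

f = ('sym', 'f'); a = ('sym', 'a'); x = ('var', 'x')
# t = f (lambda x. f x) a
t = ('app', ('app', f, ('lam', 'x', ('app', f, x))), a)

head, args = spine(t)
assert head == f and args == (('lam', 'x', ('app', f, x)), a)
assert a in subterms(t) and x in subterms(t)
proper = [s for s in subterms(t) if s != t]   # proper subterms exclude t
assert t not in proper
```

Note that the λ-body subterm `f x` is reachable here, in contrast to the green subterms introduced later, which stop at λ-expressions.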
A variable occurrence is free in a raw λ-term if it is not bound by a λ-expression. A raw λ-term is ground if it is built without using type variables and contains no free term variables.

The α-renaming rule is defined as (λx. t) →α (λy. t{x ↦ y}), where y does not occur free in t and is not captured by a λ-binder in t. Raw λ-terms form equivalence classes modulo α-renaming, called λ-terms. We lift the above notions on raw λ-terms to λ-terms.

A substitution ρ is a function from type variables to types and from term variables to λ-terms such that it maps all but finitely many variables to themselves. We require that it is type-correct—i.e., for each x : τ ∈ V, xρ is of type τρ. The letters θ, π, ρ, σ are reserved for substitutions. Substitutions α-rename λ-terms to avoid capture; for example, (λx. y){y ↦ x} = (λx′. x). The composition ρσ applies ρ first: tρσ = (tρ)σ. The notation σ[x̄ₙ ↦ s̄ₙ] denotes the substitution that replaces each xᵢ by sᵢ and that otherwise coincides with σ.

The β- and η-reduction rules are specified on λ-terms as (λx. t) u →β t{x ↦ u} and (λx. t x) →η t. For β, bound variables in t are implicitly renamed to avoid capture; for η, the variable x must not occur free in t. The λ-terms form equivalence classes modulo βη-reduction, called βη-equivalence classes or simply terms.

Convention 1 When defining operations that need to analyze the structure of terms, we will use the η-short β-normal form t↓βη, obtained by applying →β and →η exhaustively, as a representative of the equivalence class t. In particular, we lift the notions of subterms and occurrences of variables to βη-equivalence classes via their η-short β-normal representative. Many authors prefer the η-long β-normal form [37, 39, 51], but in a polymorphic setting it has the drawback that instantiating a type variable with a functional type can lead to η-expansion. We reserve the letters s, t, u, v for terms and x, y, z for variables.

An equation s ≈ t is formally an unordered pair of terms s and t. A literal is an equation or a negated equation, written ¬s ≈ t or s ≉ t. A clause L₁ ∨ ··· ∨ Lₙ is a finite multiset of literals Lⱼ. The empty clause is written as ⊥.

A complete set of unifiers on a set X of variables for two terms s and t is a set U of unifiers of s and t such that for every unifier θ of s and t there exists a member σ ∈ U and a substitution ρ such that xσρ = xθ for all x ∈ X. We let CSU_X(s, t) denote an arbitrary (preferably minimal) complete set of unifiers on X for s and t. We assume that all σ ∈ CSU_X(s, t) are idempotent on X—i.e., xσσ = xσ for all x ∈ X. The set X will consist of the free variables of the clauses in which s and t occur and will be left implicit.

Given a substitution σ, the σ-instance of a term t or clause C is the term tσ or the clause Cσ, respectively. If tσ or Cσ is ground, we call it a σ-ground instance.

Semantics

A type interpretation I_ty = (U, J_ty) is defined as follows. The universe U is a nonempty collection of nonempty sets, called domains. The function J_ty associates a function J_ty(κ) : Uⁿ → U with each n-ary type constructor κ, such that for all domains D₁, D₂ ∈ U, the set J_ty(→)(D₁, D₂) is a subset of the function space from D₁ to D₂.
The semantics is standard if J_ty(→)(D₁, D₂) is the entire function space for all D₁, D₂.

A type valuation ξ is a function that maps every type variable to a domain. The denotation of a type for a type interpretation I_ty and a type valuation ξ is defined by ⟦α⟧^ξ_{I_ty} = ξ(α) and ⟦κ(τ̄)⟧^ξ_{I_ty} = J_ty(κ)(⟦τ̄⟧^ξ_{I_ty}). We abuse notation by applying an operation on a tuple when it must be applied elementwise; thus, ⟦τ̄ₙ⟧^ξ_{I_ty} stands for ⟦τ₁⟧^ξ_{I_ty}, ..., ⟦τₙ⟧^ξ_{I_ty}. A type valuation ξ can be extended to be a valuation by additionally assigning an element ξ(x) ∈ ⟦τ⟧^ξ_{I_ty} to each variable x : τ. An interpretation function J for a type interpretation I_ty associates with each symbol f : Πᾱₘ. τ and domain tuple D̄ₘ ∈ Uᵐ a value J(f, D̄ₘ) ∈ ⟦τ⟧^ξ_{I_ty}, where ξ is the type valuation that maps each αᵢ to Dᵢ.

The comprehension principle states that every function designated by a λ-expression is contained in the corresponding domain. Loosely following Fitting [31, Sect. 2.4], we initially allow λ-expressions to designate arbitrary elements of the domain, to be able to define the denotation of a term. We impose restrictions afterwards using the notion of a proper interpretation. A λ-designation function L for a type interpretation I_ty is a function that maps a valuation ξ and a λ-expression of type τ to elements of ⟦τ⟧^ξ_{I_ty}. A type interpretation, an interpretation function, and a λ-designation function form an (extensional) interpretation I = (I_ty, J, L). For an interpretation I and a valuation ξ, the denotation of a term is defined as ⟦x⟧^ξ_I = ξ(x), ⟦f⟨τ̄ₘ⟩⟧^ξ_I = J(f, ⟦τ̄ₘ⟧^ξ_{I_ty}), ⟦s t⟧^ξ_I = ⟦s⟧^ξ_I(⟦t⟧^ξ_I), and ⟦λx. t⟧^ξ_I = L(ξ, λx. t). For ground terms t, the denotation does not depend on the choice of the valuation ξ, which is why we sometimes write ⟦t⟧_I for ⟦t⟧^ξ_I.

An interpretation I is proper if ⟦λx. t⟧^ξ_I(a) = ⟦t⟧^{ξ[x ↦ a]}_I for all λ-expressions λx. t, all valuations ξ, and all a. If a type interpretation I_ty and an interpretation function J can be extended by a λ-designation function L to a proper interpretation (I_ty, J, L), then this L is unique [31, Proposition 2.18]. Given an interpretation I and a valuation ξ, an equation s ≈ t is true if ⟦s⟧^ξ_I and ⟦t⟧^ξ_I are equal and it is false otherwise. A disequation s ≉ t is true if s ≈ t is false. A clause is true if at least one of its literals is true. A clause set is true if all its clauses are true. A proper interpretation I is a model of a clause set N, written I ⊨ N, if N is true in I for all valuations ξ.

Axiomatization of Booleans
Our clausal logic lacks a Boolean type, but it can easily be axiomatized as follows. We extend the signature with a nullary type constructor bool ∈ Σ_ty equipped with the proxy constants t, f : bool, not : bool → bool, and, or, impl, equiv : bool → bool → bool, forall, exists : Πα. (α → bool) → bool, eq : Πα. α → α → bool, and choice : Πα. (α → bool) → α, characterized by the axioms

  t ≉ f                          x ≈ t ∨ x ≈ f
  not t ≈ f                      not f ≈ t
  and t x ≈ x                    and f x ≈ f
  or t x ≈ t                     or f x ≈ x
  impl t x ≈ x                   impl f x ≈ t
  x ≉ y ∨ eq⟨α⟩ x y ≈ t          x ≈ y ∨ eq⟨α⟩ x y ≈ f
  equiv x y ≈ and (impl x y) (impl y x)
  forall⟨α⟩ (λx. t) ≈ t          y ≈ (λx. t) ∨ forall⟨α⟩ y ≈ f
  exists⟨α⟩ y ≈ not (forall⟨α⟩ (λx. not (y x)))
  y x ≈ f ∨ y (choice⟨α⟩ y) ≈ t

This axiomatization of Booleans can be used in a prover to support full higher-order logic with or without Hilbert choice, corresponding to the TPTP THF format variants TH0 (monomorphic) [64] and TH1 (polymorphic) [41]. The prover's clausifier would transform the outer first-order skeleton of a formula into a clause and use the axiomatized Booleans within the terms. It would also add the proxy axioms to the clausal problem. As an alternative to this complete axiomatization, Vukmirović and Nummelin [68] present a possibly refutationally incomplete calculus extension with dedicated rules to support Booleans. This approach works better in practice and contributed to Zipperposition's victory at CASC 2020.
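Although the proxy symbols are uninterpreted in the calculus, one way to convince oneself that the axioms are consistent is to check them exhaustively in the intended two-element model. The following sketch is our own illustration (the interpretations of the proxies are the obvious Boolean operations, not part of the paper):

```python
# Intended two-element model {T, F} for the proxy symbols.
T, F = True, False
not_  = lambda x: not x
and_  = lambda x, y: x and y
or_   = lambda x, y: x or y
impl  = lambda x, y: (not x) or y
eq    = lambda x, y: x == y          # eq<alpha> at the bool instance
equiv = lambda x, y: eq(x, y)

assert T != F                        # t /~ f
assert not_(T) == F and not_(F) == T
for x in (T, F):
    assert x == T or x == F          # x ~ t  or  x ~ f
    assert and_(T, x) == x and and_(F, x) == F
    assert or_(T, x) == T and or_(F, x) == x
    assert impl(T, x) == x and impl(F, x) == T
    for y in (T, F):
        assert (x != y) or eq(x, y) == T     # x /~ y  or  eq x y ~ t
        assert (x == y) or eq(x, y) == F     # x ~ y   or  eq x y ~ f
        assert equiv(x, y) == and_(impl(x, y), impl(y, x))

# forall, exists, and choice over the bool domain itself:
dom = (T, F)
forall = lambda p: all(p(d) for d in dom)
exists = lambda p: not forall(lambda d: not p(d))
choice = lambda p: next((d for d in dom if p(d)), dom[0])
assert forall(lambda d: True)                # forall (lambda x. t) ~ t
for p in (lambda d: True, lambda d: d, lambda d: not d, lambda d: False):
    if exists(p):
        assert p(choice(p))                  # y x ~ f  or  y (choice y) ~ t
```

All assertions pass, exhibiting a model of the axioms in which bool has exactly the two elements t and f.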
The Calculus

Our Boolean-free λ-superposition calculus presented here is inspired by the extensional nonpurifying Boolean-free λ-free higher-order superposition calculus described by Bentkamp et al. [12]. The text of this and the next section is partly based on that paper and the associated journal submission [10] (with Cruanes's permission). The central idea is that superposition inferences are restricted to unapplied subterms occurring in the first-order outer skeleton of clauses—that is, outside λ-expressions and outside the arguments of applied variables. We call these "green subterms." Thus, g ≈ (λx. f x x) cannot be used directly to rewrite g a to f a a, because g is applied in g a. A separate inference rule, ArgCong, takes care of deriving g x ≈ f x x, which can be oriented independently of its parent clause and used to rewrite g a or f a a.

Definition 2 (Green positions and subterms)
The green positions and green subterms of a term (i.e., a βη-equivalence class) are defined inductively as follows. A green position is a tuple of natural numbers. For any term t, the empty tuple ε is a green position of t, and t is the green subterm of t at position ε. For all symbols f ∈ Σ, types τ̄, and terms ū, if t is a green subterm of uᵢ at some position p for some i, then i.p is a green position of f⟨τ̄⟩ ū, and t is the green subterm of f⟨τ̄⟩ ū at position i.p. We denote the green subterm of s at the green position p by s|ₚ.

In f (g a) (y b) (λx. h c (g x)), the proper green subterms are a, g a, y b, and λx. h c (g x). The last two of these do not look like first-order terms and hence their subterms are not green.

Definition 3 (Green contexts)
We write t = s⟨u⟩ₚ to express that u is a green subterm of t at the green position p and call s⟨ ⟩ₚ a green context. We omit the subscript p if there are no ambiguities.

In a βη-normal representative of a green context, the hole never occurs applied. Therefore, inserting a βη-normal term into the context produces another βη-normal term.

Another key notion is that of a fluid term:

Definition 4 (Fluid terms)
A term t is called fluid if (1) t↓βη is of the form y ūₙ where n ≥ 1, or (2) t↓βη is a λ-expression and there exists a substitution σ such that tσ↓βη is not a λ-expression (due to η-reduction). Case (2) can arise only if t contains an applied variable. Intuitively, fluid terms are terms whose η-short β-normal form can change radically as a result of instantiation. For example, λx. y a (z x) is fluid because applying {z ↦ λx. x} makes the λ vanish: (λx. y a x) = y a. Similarly, λx. f (y x) x is fluid because (λx. f (y x) x){y ↦ λx. a} = (λx. f a x) = f a.

The calculus is parameterized by a strict and a nonstrict term order as well as a selection function. These concepts are defined below.
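The green-subterm and fluidity notions above (Definitions 2 and 4) can be sketched in code. The following Python fragment is a hypothetical illustration on a simplified representation, with terms assumed to be in η-short β-normal form already; the fluidity test for λ-expressions is a conservative approximation of case (2), which formally quantifies over all substitutions.

```python
# Terms: ('sym', f, [args]), ('var', y, [args]), ('lam', x, body).

def green_subterms(t, pos=()):
    """Yield (position, subterm) pairs per Definition 2: recurse only
    through the arguments of symbol-headed terms."""
    yield pos, t
    if t[0] == 'sym':                     # f<taus> u1 ... un
        for i, u in enumerate(t[2], start=1):
            yield from green_subterms(u, pos + (i,))
    # applied variables and lambda-expressions have no proper green subterms

def mentions_applied_var(t):
    if t[0] == 'var':
        return bool(t[2])
    if t[0] == 'lam':
        return mentions_applied_var(t[2])
    return any(mentions_applied_var(u) for u in t[2])

def is_fluid(t):
    """Case (1) exactly; case (2) approximated: a lambda-expression is
    treated as (potentially) fluid if it contains an applied variable."""
    if t[0] == 'var' and t[2]:
        return True
    return t[0] == 'lam' and mentions_applied_var(t[2])

# f (g a) (y b) (lam x. h c (g x)) from the running example:
a = ('sym', 'a', []); b = ('sym', 'b', []); c = ('sym', 'c', [])
x = ('var', 'x', [])
t = ('sym', 'f', [('sym', 'g', [a]),
                  ('var', 'y', [b]),
                  ('lam', 'x', ('sym', 'h', [c, ('sym', 'g', [x])]))])
positions = {p for p, _ in green_subterms(t)}
assert positions == {(), (1,), (1, 1), (2,), (3,)}   # eps, g a, a, y b, lam
assert is_fluid(('var', 'y', [b]))                   # applied variable
assert not is_fluid(('sym', 'g', [a]))
```

The computed positions match the example after Definition 2: the subterms below y b and below the λ-expression are not green.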
Definition 5 (Strict ground term order) A strict ground term order is a well-founded strict total order ≻ on ground terms satisfying the following criteria, where ⪰ denotes the reflexive closure of ≻:
– green subterm property: t⟨s⟩ ⪰ s;
– compatibility with green contexts: s′ ≻ s implies t⟨s′⟩ ≻ t⟨s⟩.
Given a strict ground term order, we extend it to literals and clauses via the multiset extensions in the standard way [6, Sect. 2.4].

Two properties that are not required are compatibility with λ-expressions (s′ ≻ s implies (λx. s′) ≻ (λx. s)) and compatibility with arguments (s′ ≻ s implies s′ t ≻ s t). The latter would even be inconsistent with totality. To see why, consider the symbols c ≻ b ≻ a and the terms λx. b and λx. x. Owing to totality, one of the terms must be larger than the other, say, (λx. b) ≻ (λx. x). By compatibility with arguments, we get (λx. b) c ≻ (λx. x) c, i.e., b ≻ c, a contradiction. A similar line of reasoning applies if (λx. b) ≺ (λx. x), using a instead of c.

Definition 6 (Strict term order) A strict term order is a relation ≻ on terms, literals, and clauses such that its restriction to ground entities is a strict ground term order and such that it is stable under grounding substitutions (i.e., t ≻ s implies tθ ≻ sθ for all substitutions θ grounding the entities t and s).

Definition 7 (Nonstrict term order)
Given a strict term order ≻ and its reflexive closure ⪰, a nonstrict term order is a relation ≿ on terms, literals, and clauses such that t ≿ s implies tθ ⪰ sθ for all θ grounding the entities t and s.

Although we call them orders, a strict term order ≻ is not required to be transitive on nonground entities, and a nonstrict term order ≿ does not need to be transitive at all. Normally, t ⪰ s should imply t ≿ s, but this is not required either. A nonstrict term order ≿ allows us to be more precise than the reflexive closure ⪰ of ≻. For example, we cannot have y b ⪰ y a, because y b ≠ y a and y b ⊁ y a by stability under grounding substitutions (with {y ↦ λx. c}). But we can have y b ≿ y a if b ≻ a. In practice, the strict and the nonstrict term order should be chosen so that they can compare as many pairs of terms as possible while being computable and reasonably efficient.

Definition 8 (Maximality)
An element x of a multiset M is ⊵-maximal for some relation ⊵ if for all y ∈ M with y ⊵ x, we have y ⊴ x. It is strictly ⊵-maximal if it is ⊵-maximal and occurs only once in M.

Definition 9 (Selection function) A selection function is a function that maps each clause to a subclause consisting of negative literals, which we call the selected literals of that clause. A literal L[y] must not be selected if y ūₙ, with n > 0, is a ⪰-maximal term of the clause.

The restriction on the selection function is needed for our proof, but it is an open question whether it is actually necessary for refutational completeness.

Our calculus is parameterized by a strict term order ≻, a nonstrict term order ≿, and a selection function HSel. The calculus rules depend on the following auxiliary notions.
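Definition 8 can be illustrated concretely. In the sketch below (our own example, not from the paper), we take ⊵ to be the reflexive closure of a strict order given as a set of (greater, smaller) pairs; ⊵-maximality of x then amounts to no element of the multiset being strictly above x:

```python
def maximal(x, M, strictly_above):
    """x is maximal in multiset M (given the reflexive closure of a strict
    order) iff no y in M satisfies y strictly-above x."""
    return all((y, x) not in strictly_above for y in M)

def strictly_maximal(x, M, strictly_above):
    """Strictly maximal: maximal and occurring only once in M."""
    return maximal(x, M, strictly_above) and M.count(x) == 1

# Ground terms with c > b > a, closed under transitivity:
order = {('c', 'b'), ('b', 'a'), ('c', 'a')}
M = ['a', 'b', 'c', 'c']
assert maximal('c', M, order)
assert not maximal('b', M, order)            # c is strictly above b
assert strictly_maximal('b', ['a', 'b'], order)
assert not strictly_maximal('c', M, order)   # c occurs twice in M
```

With a partial (non-total) relation, several distinct elements can be maximal at once, which is why the calculus rules below speak of maximal rather than greatest literals.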
Definition 10 (Eligibility)
A literal L is (strictly) ⊵-eligible w.r.t. a substitution σ in C for some relation ⊵ if it is selected in C or there are no selected literals in C and Lσ is (strictly) ⊵-maximal in Cσ. If σ is the identity substitution, we leave it implicit.

Definition 11 (Deep occurrences)
A variable occurs deeply in a clause C if it occurs inside a λ-expression or inside an argument of an applied variable.

For example, x and z occur deeply in f x y ≈ y x ∨ z (λw. z a), whereas y does not occur deeply. The purpose of this definition is to capture all variables with an occurrence that corresponds to a position inside a λ-expression in some ground instances of C.

The first rule of our calculus is the superposition rule. We regard positive and negative superposition as two cases of a single rule:

    D′ ∨ t ≈ t′        C′ ∨ s⟨u⟩ ≐ s′
    ───────────────────────────────── Sup
         (D′ ∨ C′ ∨ s⟨t′⟩ ≐ s′)σ

where we abbreviate the premises as D = D′ ∨ t ≈ t′ and C = C′ ∨ s⟨u⟩ ≐ s′, and ≐ denotes either ≈ or ≉. The following side conditions apply:

1. u is not fluid;
2. u is not a variable deeply occurring in C;
3. variable condition: if u is a variable y, there must exist a grounding substitution θ such that tσθ ≻ t′σθ and Cσθ ≺ C″σθ, where C″ = C{y ↦ t′};
4. σ ∈ CSU(t, u);
5. tσ ≾̸ t′σ;
6. s⟨u⟩σ ≾̸ s′σ;
7. Cσ ≾̸ Dσ;
8. t ≈ t′ is strictly ≿-eligible in D w.r.t. σ;
9. s⟨u⟩ ≐ s′ is ≿-eligible in C w.r.t. σ, and strictly ≿-eligible if it is positive.

There are four main differences with the statement of the standard superposition rule: Contexts s[ ] are replaced by green contexts s⟨ ⟩. The standard condition u ∉ V is generalized by conditions 2 and 3. Most general unifiers are replaced by complete sets of unifiers. And ⪯̸ is replaced by the more precise ≾̸.

The second rule is a variant of Sup that focuses on fluid green subterms:

    D′ ∨ t ≈ t′        C′ ∨ s⟨u⟩ ≐ s′
    ───────────────────────────────── FluidSup
        (D′ ∨ C′ ∨ s⟨z t′⟩ ≐ s′)σ

with the following side conditions, in addition to Sup's conditions 5 to 9:

1. u is either a fluid term or a variable deeply occurring in C;
2. z is a fresh variable;
3. σ ∈ CSU(z t, u);
4. (z t′)σ ≠ (z t)σ.

The equality resolution and equality factoring rules are almost identical to their standard counterparts:

    C′ ∨ u ≉ u′                    C′ ∨ u′ ≈ v′ ∨ u ≈ v
    ─────────── ERes               ─────────────────────── EFact
        C′σ                        (C′ ∨ v ≉ v′ ∨ u ≈ v′)σ

For ERes: σ ∈ CSU(u, u′) and u ≉ u′ is ≿-eligible in C w.r.t. σ. For EFact: σ ∈ CSU(u, u′), uσ ≾̸ vσ, and u ≈ v is ≿-eligible in C w.r.t. σ.

Argument congruence, a higher-order concern, is embodied by the rule

    C′ ∨ s ≈ s′
    ───────────────────── ArgCong
    C′σ ∨ sσ x̄ₙ ≈ s′σ x̄ₙ

where σ is the most general type substitution that ensures well-typedness of the conclusion. In particular, if the result type of s is not a type variable, σ is the identity substitution; and if the result type is a type variable, it is instantiated with α₁ → ··· → αₘ → β, where ᾱₘ and β are fresh. This yields infinitely many conclusions, one for each m. The literal s ≈ s′ must be strictly ≿-eligible in C w.r.t. σ, and x̄ₙ is a nonempty tuple of distinct fresh variables.

The rules are complemented by the polymorphic functional extensionality axiom:

    y (diff⟨α, β⟩ y z) ≉ z (diff⟨α, β⟩ y z) ∨ y ≈ z    (Ext)

From now on, we will omit the type arguments to diff since they can be inferred from the term arguments.

The calculus realizes the following division of labor: Sup and FluidSup are responsible for green subterms, which are outside λs, ArgCong effectively gives access to the remaining positions outside λs, and the extensionality axiom takes care of subterms inside λs.

Example 12
Prefix subterms such as g in the term g a are not green subterms and thus cannot be superposed into. ArgCong gives us access to those positions. Consider the clauses g a ≉ f a and g ≈ f. An ArgCong inference from g ≈ f generates g x ≈ f x. This clause can be used for a Sup inference into the first clause, yielding f a ≉ f a and thus ⊥ by ERes.

Example 13
Applied variables give rise to subtle situations with no counterparts in first-order logic. Consider the clauses f a ≈ c and h (y b) (y a) ≉ h (g (f b)) (g c), where f a ≻ c. It is easy to see that the clause set is unsatisfiable, by grounding the second clause with θ = {y ↦ λx. g (f x)}. However, to mimic the superposition inference that can be performed at the ground level, it is necessary to superpose at an imaginary position below the applied variable y and yet above its argument a, namely, into the subterm f a of g (f a) = (λx. g (f x)) a = (y a)θ. FluidSup's z variable effectively transforms f a ≈ c into z (f a) ≈ z c, whose left-hand side can be unified with y a by taking {y ↦ λx. z (f x)}. The resulting clause is h (z (f b)) (z c) ≉ h (g (f b)) (g c), from which ⊥ follows by ERes.

Example 14
The clause set consisting of f a ≈ c, f b ≈ d, and g c ≉ y a ∨ g d ≉ y b has a similar flavor. ERes is applicable on either literal of the third clause, but the computed unifier, {y ↦ λx. g c} or {y ↦ λx. g d}, is not the right one. Again, we need FluidSup.

Example 15
Third-order clauses containing subterms of the form y (λx. t) can be even more stupefying. The clause set consisting of f a ≈ c and h (y (λx. g (f x)) a) y ≉ h (g c) (λw x. w x) is unsatisfiable. To see why, apply θ = {y ↦ λw x. w x} to the second clause, yielding h (g (f a)) (λw x. w x) ≉ h (g c) (λw x. w x). Let f a ≻ c. A Sup inference is possible between the first clause and this ground instance of the second one. But at the nonground level, the subterm f a is not clearly localized: g (f a) = (λx. g (f x)) a = (λw x. w x) (λx. g (f x)) a = (y (λx. g (f x)) a)θ. The FluidSup rule can cope with this. One of the unifiers of z (f a) and y (λx. g (f x)) a will be {y ↦ λw x. w x, z ↦ g}, yielding the clearly unsatisfiable clause h (g c) (λw x. w x) ≉ h (g c) (λw x. w x).

Example 16
The FluidSup rule is concerned not only with applied variables but also with λ-expressions that, after substitution, may be η-reduced to reveal new applied variables or green subterms. Consider the clauses g a ≈ b, h (λy. x y g z) ≈ c, and h (f b) ≉ c. Applying {x ↦ λy′ w z′. f (w a) y′} to the second clause yields h (λy. (λy′ w z′. f (w a) y′) y g z) ≈ c, which β-reduces to h (λy. f (g a) y) ≈ c and βη-reduces to h (f (g a)) ≈ c. A Sup inference is possible between the first clause and this new ground clause, generating the clause h (f b) ≈ c. By also considering λ-expressions, the FluidSup rule is applicable at the nonground level to derive this clause.

Example 17
Consider the clause set consisting of the facts C_succ = succ x ≉ zero, C_div = n ≈ zero ∨ div n n ≈ one, C_prod = prod K (λk. one) ≈ one, and the negated conjecture C_conj = prod K (λk. div (succ k) (succ k)) ≉ one. Intuitively, the term prod K (λk. u) is intended to denote the product ∏_{k∈K} u, where k ranges over a finite set K of natural numbers. The calculus derives the empty clause as follows:

1. FluidSup from C_div into (Ext):
   w (diff⟨α, ι⟩ (λk. div (w k) (w k)) z) ≈ zero ∨ one ≉ z (diff⟨α, ι⟩ (λk. div (w k) (w k)) z) ∨ (λk. div (w k) (w k)) ≈ z
2. ERes on the middle literal:
   w (diff⟨α, ι⟩ (λk. div (w k) (w k)) (λk. one)) ≈ zero ∨ (λk. div (w k) (w k)) ≈ (λk. one)
3. Sup with C_succ:
   zero ≉ zero ∨ (λk. div (succ k) (succ k)) ≈ (λk. one)
4. ERes: (λk. div (succ k) (succ k)) ≈ (λk. one)
5. Sup into C_conj: prod K (λk. one) ≉ one
6. Sup with C_prod: one ≉ one
7. ERes: ⊥

Since the calculus does not superpose into λ-expressions, we need to use the extensionality axiom to refute this clause set. We perform a FluidSup inference into the extensionality axiom with the unifier {β ↦ ι, z′ ↦ λx. x, n ↦ w (diff⟨α, ι⟩ (λk. div (w k) (w k)) z), y ↦ λk. div (w k) (w k)} ∈ CSU(z′ (div n n), y (diff⟨α, β⟩ y z)). Then we apply ERes with the unifier {z ↦ λk. one} ∈ CSU(one, z (diff⟨α, ι⟩ (λk. div (w k) (w k)) z)) to eliminate the negative literal. Next, we perform a Sup inference into C_succ with the unifier {α ↦ ι, w ↦ succ, x ↦ diff⟨α, ι⟩ (λk. div (w k) (w k)) (λk. one)} ∈ CSU(w (diff⟨α, ι⟩ (λk. div (w k) (w k)) (λk. one)), succ x). To eliminate the trivial literal, we apply ERes. We then apply a Sup inference into C_conj and superpose into the resulting clause with C_prod.
Finally, we derive the empty clause by ERES. The unifiers in this example were chosen to keep the clauses reasonably small.

Because it gives rise to flex–flex pairs, which are unification constraints where both sides are variable-headed, FLUIDSUP can be very prolific. With variable-headed terms on both sides of its maximal literal, the extensionality axiom is another prime source of flex–flex pairs. Flex–flex pairs can also arise in the other rules (SUP, ERES, and EFACT). Due to order restrictions and fairness, we cannot postpone solving flex–flex pairs indefinitely. Thus, we cannot use Huet's pre-unification procedure [37] and must instead choose a full unification procedure such as Jensen and Pietrzykowski's [38], Snyder and Gallier's [61], or the procedure recently developed by Vukmirović, Bentkamp, and Nummelin [66]. On the positive side, optional inference rules can efficiently cover many cases where FLUIDSUP or the extensionality axiom would otherwise be needed (Sect. 5), and heuristics can help postpone the explosion. Moreover, flex–flex pairs are not always as bad as their reputation; for example, y a b ≟ z c d admits a most general unifier: {y ↦ λw x. y′ w x c d, z ↦ y′ a b}.

The calculus is a graceful generalization of standard superposition, except for the extensionality axiom. From simple first-order clauses, the axiom can be used to derive clauses containing λ-expressions, which are useless if the problem is first-order. For instance, the clause g x ≈ f x x can be used for a FLUIDSUP inference into the axiom (EXT), yielding the clause w t (f t t) ≉ z t ∨ (λu. w u (g u)) ≈ z via the unifier {α ↦ ι, β ↦ ι, x ↦ t, v ↦ λu. w t u, y ↦ λu. w u (g u)} ∈ CSU(v (g x), y (diff⟨α, β⟩ y z)), where t = diff⟨ι, ι⟩ (λu. w u (g u)) z, the variable w is freshly introduced by unification, and v is the fresh variable introduced by FLUIDSUP (named z in the definition of the rule). By ERES, with the unifier {z ↦ λu. w u (f u u)} ∈ CSU(w t (f t t), z t), we can then derive (λu. w u (g u)) ≈ (λu. w u (f u u)), an equality of two λ-expressions, although we started with a simple first-order clause. This could be avoided if we could find a way to make the positive literal y ≈ z of (EXT) larger than the other literal, or to select y ≈ z without losing refutational completeness. The literal y ≈ z interacts only with green subterms of functional type, which do not arise in first-order clauses.

To show soundness of the inferences, we need the substitution lemma for our logic:
Lemma 18 (Substitution lemma)
Let I = (I_ty, J, L) be a proper interpretation. Then ⟦τρ⟧_{I_ty}^ξ = ⟦τ⟧_{I_ty}^ξ′ and ⟦tρ⟧_I^ξ = ⟦t⟧_I^ξ′ for all terms t, all types τ, and all substitutions ρ, where ξ′(α) = ⟦αρ⟧_{I_ty}^ξ for all type variables α and ξ′(x) = ⟦xρ⟧_I^ξ for all term variables x.

Proof
First, we prove that ⟦τρ⟧_{I_ty}^ξ = ⟦τ⟧_{I_ty}^ξ′ by induction on the structure of τ. If τ = α is a type variable, then ⟦αρ⟧_{I_ty}^ξ = ξ′(α) = ⟦α⟧_{I_ty}^ξ′. If τ = κ(ῡ) for some type constructor κ and types ῡ, then ⟦κ(ῡ)ρ⟧_{I_ty}^ξ = I_ty(κ)(⟦ῡρ⟧_{I_ty}^ξ) = I_ty(κ)(⟦ῡ⟧_{I_ty}^ξ′) = ⟦κ(ῡ)⟧_{I_ty}^ξ′, using the induction hypothesis in the second step.

Next, we prove ⟦tρ⟧_I^ξ = ⟦t⟧_I^ξ′ by induction on the structure of a λ-term representative of t, allowing arbitrary substitutions ρ in the induction hypothesis. If t = y, then by the definition of the denotation of a variable, ⟦yρ⟧_I^ξ = ξ′(y) = ⟦y⟧_I^ξ′. If t = f⟨τ̄⟩, then by the definition of the term denotation, ⟦f⟨τ̄⟩ρ⟧_I^ξ = J(f, ⟦τ̄ρ⟧_{I_ty}^ξ) = J(f, ⟦τ̄⟧_{I_ty}^ξ′) = ⟦f⟨τ̄⟩⟧_I^ξ′, using the induction hypothesis in the second step. If t = u v, then by the definition of the term denotation, ⟦(u v)ρ⟧_I^ξ = ⟦uρ⟧_I^ξ (⟦vρ⟧_I^ξ) = ⟦u⟧_I^ξ′ (⟦v⟧_I^ξ′) = ⟦u v⟧_I^ξ′, again using the induction hypothesis in the second step. If t = λz. u, let ρ′(z) = z and ρ′(x) = ρ(x) for x ≠ z. Using properness of I in the second and the last step, we have ⟦(λz. u)ρ⟧_I^ξ (a) = ⟦λz. uρ′⟧_I^ξ (a) = ⟦uρ′⟧_I^{ξ[z↦a]} = ⟦u⟧_I^{ξ′[z↦a]} = ⟦λz. u⟧_I^ξ′ (a), where the third step is by the induction hypothesis. ⊓⊔

Lemma 19 If I ⊨ C for some interpretation I and some clause C, then I ⊨ Cρ for all substitutions ρ.

Proof
We have to show that Cρ is true in I for all valuations ξ. Given a valuation ξ, define ξ′ as in Lemma 18. Then, by Lemma 18, a literal in Cρ is true in I for ξ if and only if the corresponding literal in C is true in I for ξ′. There must be at least one such literal because I ⊨ C, and hence C is in particular true in I for ξ′. Therefore, Cρ is true in I for ξ. ⊓⊔

Theorem 20 (Soundness)
The inference rules SUP, FLUIDSUP, ERES, EFACT, and ARGCONG are sound (even without the variable condition and the side conditions on fluidity, deeply occurring variables, order, and eligibility).
Proof
We fix an inference and an interpretation I that is a model of the premises. We need to show that it is also a model of the conclusion.

From the definition of the denotation of a term, it is obvious that congruence holds in our logic, at least for subterms that are not inside a λ-expression. In particular, it holds for green subterms and for the left subterm t of an application t s.

By Lemma 19, I is a model of the σ-instances of the premises as well, where σ is the substitution used for the inference. Let ξ be a valuation. By making case distinctions on the truth under I, ξ of the literals of the σ-instances of the premises, using the conditions that σ is a unifier, and applying congruence, it follows that the conclusion is true under I, ξ. Hence, I is a model of the conclusion. ⊓⊔

As in the λ-free higher-order logic of Bentkamp et al. [10], skolemization is unsound in our logic. As a consequence, axiom (EXT) does not hold in all interpretations, but the axiom is consistent with our logic; i.e., there exist models of (EXT).

A redundant clause is usually defined as a clause whose ground instances are entailed by smaller (≺) ground instances of existing clauses. This would be too strong for our calculus, as it would make most clauses produced by ARGCONG redundant. The solution is to base the redundancy criterion on a weaker ground logic—ground monomorphic first-order logic—in which argument congruence and extensionality do not hold. The resulting notion of redundancy gracefully generalizes the standard first-order notion.

We employ an encoding F to translate ground higher-order terms into ground first-order terms. F indexes each symbol occurrence with the type arguments and the number of term arguments. For example, F(f a) = f₁(a) and F(g⟨κ⟩) = g₀^κ. In addition, F conceals λ-expressions by replacing them with fresh symbols. These measures effectively disable argument congruence and extensionality.
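As a concrete illustration of how F works, here is a toy Python version of the encoding (the term representation and symbol-naming scheme are our own; type arguments are folded into the head name for brevity). It indexes each symbol occurrence with its argument count and conceals λ-expressions behind fresh symbols:

```python
def encode(t):
    """Toy version of the encoding F on ground terms. A term is either
    ('lam', label) for a λ-expression or (head, [args]). F indexes each
    symbol occurrence with its number of term arguments and conceals
    λ-expressions behind fresh symbols."""
    if t[0] == 'lam':
        return ('lam_' + t[1], [])          # fresh symbol; no congruence below it
    head, args = t
    return (f'{head}_{len(args)}', [encode(u) for u in args])

a, b = ('a', []), ('b', [])
# f b b and f a a both become applications of the same symbol f_2 ...
assert encode(('f', [b, b])) == ('f_2', [('b_0', []), ('b_0', [])])
assert encode(('f', [a])) == ('f_1', [('a_0', [])])
# ... but an unapplied g and an applied g get *different* first-order symbols,
# which is what disables argument congruence on the GF level:
assert encode(('g', []))[0] == 'g_0' and encode(('g', [a]))[0] == 'g_1'
# λ-expressions are concealed entirely:
assert encode(('lam', 'x. f x x')) == ('lam_x. f x x', [])
```

Because an applied g becomes g₁ while an unapplied g becomes g₀, encoded clause sets can be satisfiable on the first-order level even when the originals are higher-order unsatisfiable, as the examples in the text show.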
For example, the clause sets {g ≈ f, g a ≉ f a} and {b ≈ a, (λx. b) ≉ (λx. a)} are unsatisfiable in higher-order logic, but the encoded clause sets {g₀ ≈ f₀, g₁(a) ≉ f₁(a)} and {b₀ ≈ a₀, lam_{λx.b} ≉ lam_{λx.a}} are satisfiable in first-order logic, where lam_{λx.t} is a family of fresh symbols.

Given a higher-order signature (Σ_ty, Σ), we define a ground first-order signature (Σ_ty, Σ_GF) as follows. The type constructors Σ_ty are the same in both signatures, but → is uninterpreted in first-order logic. For each ground instance f⟨ῡ⟩ : τ₁ → ⋯ → τₙ → τ of a symbol f ∈ Σ, we introduce a first-order symbol f_j^ῡ ∈ Σ_GF with argument types τ̄_j and return type τ_{j+1} → ⋯ → τₙ → τ, for each j. Moreover, for each ground term λx. t, we introduce a symbol lam_{λx.t} ∈ Σ_GF of the same type.

Thus, we consider three levels of logics: the higher-order level H over a given signature (Σ_ty, Σ), the ground higher-order level GH, which is the ground fragment of H, and the ground monomorphic first-order level GF over the signature (Σ_ty, Σ_GF) defined above. We use T_H, T_GH, and T_GF to denote the respective sets of terms, Ty_H, Ty_GH, and Ty_GF to denote the respective sets of types, and C_H, C_GH, and C_GF to denote the respective sets of clauses. Each of the three levels has an entailment relation ⊨. A clause set N entails a clause set N′, denoted N ⊨ N′, if every model of N is also a model of N′. For H and GH, we use higher-order models; for GF, we use first-order models. This machinery may seem excessive, but it is essential to define redundancy of clauses and inferences properly, and it will play an important role in the refutational completeness proof (Sect. 4).

The three levels are connected by two functions G and F:

Definition 21 (Grounding function G on terms and clauses) The grounding function G maps terms t ∈ T_H to the set of their ground instances, i.e., the set of all tθ ∈ T_GH where θ is a substitution. It also maps clauses C ∈ C_H to the set of their ground instances, i.e., the set of all Cθ ∈ C_GH where θ is a substitution.

Definition 22 (Encoding F on terms and clauses) The encoding F : T_GH → T_GF is recursively defined as

F(λx. t) = lam_{λx.t}        F(f⟨ῡ⟩ s̄_j) = f_j^ῡ(F(s̄_j))

using η-short β-normal representatives of terms. The encoding F is extended to map from C_GH to C_GF by mapping each literal and each side of a literal individually.

Schematically, the three levels are connected as follows: G maps the higher-order level H to the ground higher-order level GH, and F maps GH to the ground first-order level GF.

The mapping F is clearly bijective. Using the inverse mapping, the order ≻ can be transferred from T_GH to T_GF and from C_GH to C_GF by defining t ≻ s as F⁻¹(t) ≻ F⁻¹(s) and C ≻ D as F⁻¹(C) ≻ F⁻¹(D). The property that ≻ on clauses is the multiset extension of ≻ on literals, which in turn is the multiset extension of ≻ on terms, is maintained because F⁻¹ maps the multiset representations elementwise.

For example, let C = y b ≈ y a ∨ y ≉ f a ∈ C_H. Then G(C) contains, among many other clauses, Cθ = f b b ≈ f a a ∨ (λx. f x x) ≉ f a ∈ C_GH, where θ = {y ↦ λx. f x x}. On the GF level, this clause corresponds to F(Cθ) = f₂(b, b) ≈ f₂(a, a) ∨ lam_{λx. f x x} ≉ f₁(a) ∈ C_GF.

A key property of F is that green subterms in T_GH correspond to subterms in T_GF. This allows us to show that well-foundedness, totality on ground terms, compatibility with contexts, and the subterm property hold for ≻ on T_GF.

Lemma 23
Let s, t ∈ T_GH. We have F(t⟨s⟩_p) = F(t)[F(s)]_p. In other words, s is a green subterm of t at position p if and only if F(s) is a subterm of F(t) at position p.

Proof
Analogous to Lemma 3.13 of Bentkamp et al. [10]. ⊓⊔ Lemma 24
Well-foundedness, totality, compatibility with contexts, and the subterm property hold for ≻ on T_GF.

Proof
Analogous to Lemma 3.15 of Bentkamp et al. [10], using Lemma 23. ⊓⊔ The saturation procedures of superposition provers aggressively delete clauses that arestrictly subsumed by other clauses. A clause
C subsumes D if there exists a substitution σ such that Cσ ⊆ D. A clause C strictly subsumes D if C subsumes D but D does not subsume C. For example, x ≈ c strictly subsumes both a ≈ c and b ≉ a ∨ x ≈ c. The proof of refutational completeness of resolution and superposition provers relies on the well-foundedness of the strict subsumption relation. Unfortunately, this property does not hold for higher-order logic, where f x x ≈ c is strictly subsumed by f (x a) (x b) ≈ c, which is strictly subsumed by f (x a a′) (x b b′) ≈ c, and so on. To prevent such infinite chains, we use a well-founded partial order ⊐ on C_H. We can define ⊐ as the intersection of the "subsumed by" relation and >_size, where D >_size C if either size(D) > size(C) or size(D) = size(C) and D contains fewer distinct variables than C; the size function is some notion of syntactic size, such as the number of constants and variables contained in a clause. This yields, for instance, a ≈ c ⊐ x ≈ c and f (x a a) ≈ c ⊐ f (y a) ≈ c. To justify the deletion of subsumed clauses, we set up our redundancy criterion to cover subsumption, following Waldmann et al. [69].

We define the sets of redundant clauses w.r.t. a given clause set as follows:

– Given C ∈ C_GF and N ⊆ C_GF, let C ∈ GFRed_C(N) if {D ∈ N | D ≺ C} ⊨ C.
– Given C ∈ C_GH and N ⊆ C_GH, let C ∈ GHRed_C(N) if F(C) ∈ GFRed_C(F(N)).
– Given C ∈ C_H and N ⊆ C_H, let C ∈ HRed_C(N) if for every D ∈ G(C), we have D ∈ GHRed_C(G(N)) or there exists C′ ∈ N such that C ⊐ C′ and D ∈ G(C′).

For example, (h g) x ≈ (h f) x is redundant w.r.t. g ≈ f, but g x ≈ f x and (λx. g) ≈ (λx. f) are not, because F translates an unapplied g to g₀, whereas an applied g is translated to g₁, and the expression λx. g is translated to lam_{λx.g}. These different translations prevent entailment on the GF level.
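The >_size component of ⊐ is easy to make concrete. The following Python sketch (the clause and term representations are our own; variables are marked with a leading '$') computes the size and distinct-variable measures just described:

```python
def occurrences(t):
    """All symbol/variable occurrences of a term, given as nested tuples
    like ('f', ('$x', ('a',), ('a',))); variable names start with '$'."""
    yield t[0]
    for arg in t[1:]:
        yield from occurrences(arg)

def measures(clause):
    """Syntactic size and number of distinct variables of a clause,
    a clause being a list of (lhs, rhs) literals."""
    occs = [o for lhs, rhs in clause for t in (lhs, rhs) for o in occurrences(t)]
    return len(occs), len({o for o in occs if o.startswith('$')})

def gt_size(d, c):
    """D >_size C: larger size, or equal size and fewer distinct variables."""
    (sd, vd), (sc, vc) = measures(d), measures(c)
    return sd > sc or (sd == sc and vd < vc)

# a ≈ c vs. x ≈ c: equal size, but a ≈ c has fewer distinct variables, so
# together with "a ≈ c is subsumed by x ≈ c" this gives a ≈ c ⊐ x ≈ c.
assert gt_size([(('a',), ('c',))], [(('$x',), ('c',))])
# f (x a a) ≈ c is strictly larger than f (y a) ≈ c.
assert gt_size([(('f', ('$x', ('a',), ('a',))), ('c',))],
               [(('f', ('$y', ('a',))), ('c',))])
```

Intersecting this well-founded measure with the "subsumed by" relation is what blocks the infinite descending subsumption chains shown above.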
For an example of subsumption, we assume that a ≈ c ⊐ x ≈ c holds, for instance using the above definition of ⊐. Then a ≈ c is redundant w.r.t. x ≈ c.

Along with the three levels of logics, we consider three inference systems: HInf, GHInf, and GFInf. HInf is the inference system described in Sect. 3.1. For uniformity, we regard the extensionality axiom as a premise-free inference rule EXT whose conclusion is axiom (EXT). The rules of GHInf include SUP, ERES, and EFACT from HInf, but with the restriction that premises and conclusion are ground and with all references to ≿ replaced by ≽. In addition, GHInf contains a premise-free rule GEXT whose infinitely many conclusions are the ground instances of (EXT), and the following ground variant of ARGCONG:

C′ ∨ s ≈ s′
―――――――――――――――― GARGCONG
C′ ∨ s ū_n ≈ s′ ū_n

where s ≈ s′ is strictly ≽-eligible in C′ ∨ s ≈ s′ and ū_n is a nonempty tuple of ground terms. GFInf contains all SUP, ERES, and EFACT inferences from GHInf translated by F. It coincides with standard first-order superposition.

Each of the three inference systems is parameterized by a selection function. For HInf, we globally fix one selection function HSel. For GHInf and GFInf, we need to consider different selection functions. We write GHInf^GHSel for GHInf and GFInf^GFSel for GFInf to make the dependency on the respective selection functions GHSel and GFSel explicit. Let G(HSel) denote the set of all selection functions on C_GH such that for each clause C ∈ C_GH, there exists a clause D ∈ C_H with C ∈ G(D) and corresponding selected literals. For each selection function GHSel on C_GH, via the bijection F, we obtain a corresponding selection function on C_GF, which we denote by F(GHSel).

We extend the functions F and G to inferences:

Notation 25
Given an inference ι, we write prems(ι) for the tuple of premises, mprem(ι) for the main (i.e., rightmost) premise, and concl(ι) for the conclusion.

Definition 26 (Encoding F on inferences) Given a SUP, ERES, or EFACT inference ι ∈ GHInf, let F(ι) ∈ GFInf denote the inference defined by prems(F(ι)) = F(prems(ι)) and concl(F(ι)) = F(concl(ι)).

Definition 27 (Grounding function G on inferences) Given an inference ι ∈ HInf and a selection function GHSel ∈ G(HSel), we define the set G^GHSel(ι) of ground instances of ι to be all inferences ι′ ∈ GHInf^GHSel such that prems(ι′) = prems(ι)θ and concl(ι′) = concl(ι)θ for some grounding substitution θ.

This will map SUP and FLUIDSUP to SUP, EFACT to EFACT, ERES to ERES, EXT to GEXT, and ARGCONG to GARGCONG inferences, but it is also possible that G^GHSel(ι) is the empty set for some inferences ι.

We define the sets of redundant inferences w.r.t. a given clause set as follows:

– Given ι ∈ GFInf^GFSel and N ⊆ C_GF, let ι ∈ GFRed_I^GFSel(N) if prems(ι) ∩ GFRed_C(N) ≠ ∅ or {D ∈ N | D ≺ mprem(ι)} ⊨ concl(ι).
– Given ι ∈ GHInf^GHSel and N ⊆ C_GH, let ι ∈ GHRed_I^GHSel(N) if
  – ι is not a GARGCONG or GEXT inference and F(ι) ∈ GFRed_I^{F(GHSel)}(F(N)); or
  – ι is a GARGCONG or GEXT inference and concl(ι) ∈ N ∪ GHRed_C(N).
– Given ι ∈ HInf and N ⊆ C_H, let ι ∈ HRed_I(N) if G^GHSel(ι) ⊆ GHRed_I^GHSel(G(N)) for all GHSel ∈ G(HSel).

Occasionally, we omit the selection function in the notation when it is irrelevant. A clause set N is saturated w.r.t. an inference system and the inference component Red_I of a redundancy criterion if every inference from clauses in N is in Red_I(N).

The redundancy criterion (HRed_I, HRed_C) is strong enough to support most of the simplification rules implemented in Schulz's first-order prover E [57, Sections 2.3.1 and 2.3.2], some only with minor adaptations. Deletion of duplicated literals, deletion of resolved literals, syntactic tautology deletion, negative simplify-reflect, and clause subsumption adhere to our redundancy criterion. Positive simplify-reflect and equality subsumption are supported by our criterion if they are applied in green contexts t⟨u⟩ instead of arbitrary contexts t[u]. Semantic tautology deletion can be applied as well, but we must use the entailment relation of the GF level, i.e., only rewriting in green contexts can be used to establish the entailment. Similarly, rewriting of positive and negative literals (demodulation) can only be applied in green contexts. Moreover, for positive literals, the rewriting clause must be smaller than the rewritten clause, a condition that is also necessary with the standard first-order redundancy criterion but not always fulfilled by Schulz's rule. As for destructive equality resolution, even in first-order logic the rule cannot be justified with the standard redundancy criterion, and it is unclear whether it preserves refutational completeness.

We stated some requirements on the term orders ≻ and ≿ in Sect. 3.1 but have not shown how to fulfill them.
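The following paragraphs derive such orders via an encoding O into untyped first-order terms, whose key ingredient is the replacement of bound variables by De Bruijn symbols. As a preview, here is a toy Python version of that replacement (types and fluid terms are ignored; the term representation is our own):

```python
def to_db(t, env=()):
    """Toy version of the O encoding: named λ-term -> untyped first-order term.
    Types and fluid terms are not handled. A term is ('lam', x, body) or
    (head, arg, ...) where head is a symbol or variable name. Bound variables
    become De Bruijn symbols db<i>, and every head is indexed with its number
    of arguments, as in O."""
    if t[0] == 'lam':
        _, x, body = t
        return ('lam', to_db(body, (x,) + env))
    head, *args = t
    if head in env:
        head = f'db{env.index(head)}'   # index = number of λs crossed since binding
    return (f'{head}_{len(args)}', *(to_db(a, env) for a in args))

# s = λy. f y (λw. g (y w)) from the running example below
s = ('lam', 'y', ('f', ('y',), ('lam', 'w', ('g', ('y', ('w',))))))
assert to_db(s) == ('lam', ('f_2', ('db0_0',),
                            ('lam', ('g_1', ('db1_1', ('db0_0',))))))
# α-renaming the bound variables does not change the encoding:
s2 = ('lam', 'u', ('f', ('u',), ('lam', 'v', ('g', ('u', ('v',))))))
assert to_db(s2) == to_db(s)
```

The last assertion illustrates why De Bruijn symbols are used: the encoding, and hence any order derived from it, is stable under α-renaming.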
To derive a suitable strict term order ≻, we propose to encode η-short β-normal forms into untyped first-order terms and apply an order ≻_fo on first-order terms such as the Knuth–Bendix order [45] or the lexicographic path order [43].

The encoding, denoted by O, indexes symbols with their number of term arguments, similarly to the F encoding. Unlike the F encoding, O translates λx : τ. t to lam(O(τ), O(t)) and uses De Bruijn [25] symbols to represent bound variables. The O encoding replaces fluid terms t by fresh variables z_t and maps type arguments to term arguments, while erasing any other type information. For example, O(λx : κ. f (f (a⟨κ⟩)) (y b)) = lam(κ, f₂(f₁(a₀(κ)), z_{y b})). The use of De Bruijn indices and the monolithic encoding of fluid terms ensure stability under both α-renaming and substitution.

Definition 28 (Encoding O) Given a signature (Σ_ty, Σ), O encodes types and terms as terms over the untyped first-order signature Σ_ty ⊎ {f_k | f ∈ Σ, k ∈ ℕ} ⊎ {lam} ⊎ {db_k^i | i, k ∈ ℕ}. We reuse higher-order type variables as term variables in the target untyped first-order logic. Moreover, let z_t be an untyped first-order variable for each higher-order term t. The auxiliary function B_x(t) replaces each free occurrence of the variable x by a symbol db_i, where i is the number of λ-expressions surrounding the variable occurrence. The type-to-term version of O is defined by O(α) = α and O(κ(τ̄)) = κ(O(τ̄)). The term-to-term version is defined by

O(t) = z_t                      if t = x or t is fluid
O(t) = lam(O(τ), O(B_x(u)))     if t = λx : τ. u and t is not fluid
O(t) = f_k(O(τ̄), O(ū_k))        if t = f⟨τ̄⟩ ū_k

For example, let s = λy. f y (λw. g (y w)), where y has type κ → κ and w has type κ. We have B_y(f y (λw. g (y w))) = f db₀ (λw. g (db₁ w)) and B_w(g (db₁ w)) = g (db₁ db₀). Neither s nor λw. g (y w) is fluid. Hence, we have O(s) = lam(→(κ, κ), f₂(db₀⁰, lam(κ, g₁(db₁¹(db₀⁰))))).

Definition 29 (Derived strict term order)
Let the strict term order derived from ≻_fo be ≻_λ, where t ≻_λ s if O(t) ≻_fo O(s). We will show that the derived order ≻_λ fulfills all properties of a strict term order (Definition 6) if ≻_fo fulfills the corresponding properties on first-order terms. For the nonstrict term order ≿, we can use the reflexive closure ≽_λ of ≻_λ.

Lemma 30
Let ≻_fo be a strict partial order on first-order terms and ≻_λ the derived term order on βη-equivalence classes. If the restriction of ≻_fo to ground terms enjoys well-foundedness, totality, the subterm property, and compatibility with contexts (w.r.t. first-order terms), then the restriction of ≻_λ to ground terms enjoys well-foundedness, totality, the green subterm property, and compatibility with green contexts (w.r.t. βη-equivalence classes).

Proof
Transitivity and irreflexivity of ≻_fo imply transitivity and irreflexivity of ≻_λ.

WELL-FOUNDEDNESS: If there existed an infinite chain t₁ ≻_λ t₂ ≻_λ ⋯ of ground terms, there would also be the chain O(t₁) ≻_fo O(t₂) ≻_fo ⋯, contradicting the well-foundedness of ≻_fo on ground λ-free terms.

TOTALITY: By ground totality of ≻_fo, for any ground terms t and s we have O(t) ≻_fo O(s), O(t) ≺_fo O(s), or O(t) = O(s). In the first two cases, it follows that t ≻_λ s or t ≺_λ s. In the last case, it follows that t = s because O is clearly injective.

GREEN SUBTERM PROPERTY: Let s be a term. We show that s ≽_λ s|_p by induction on p, where s|_p denotes the green subterm at position p. If p = ε, this is trivial. If p = p′.i, we have s ≽_λ s|_{p′} by the induction hypothesis. Hence, it suffices to show that s|_{p′} ≽_λ s|_{p′.i}. From the existence of the position p′.i, we know that s|_{p′} must be of the form s|_{p′} = f⟨τ̄⟩ ū_k. Then s|_{p′.i} = u_i. The encoding yields O(s|_{p′}) = f_k(O(τ̄), O(ū_k)) and hence O(s|_{p′}) ≽_fo O(s|_{p′.i}) by the ground subterm property of ≻_fo. Hence, s|_{p′} ≽_λ s|_{p′.i} and thus s ≽_λ s|_p.

COMPATIBILITY WITH GREEN CONTEXTS: By induction on the depth of the context, it suffices to show that t ≻_λ s implies f⟨τ̄⟩ ū t v̄ ≻_λ f⟨τ̄⟩ ū s v̄ for all t, s, f, τ̄, ū, and v̄. This amounts to showing that O(t) ≻_fo O(s) implies O(f⟨τ̄⟩ ū t v̄) = f_k(O(τ̄), O(ū), O(t), O(v̄)) ≻_fo f_k(O(τ̄), O(ū), O(s), O(v̄)) = O(f⟨τ̄⟩ ū s v̄), which follows directly from ground compatibility of ≻_fo with contexts and the induction hypothesis. ⊓⊔

Lemma 31
Let ≻_fo be a strict partial order on first-order terms. If ≻_fo is stable under grounding substitutions (w.r.t. first-order terms), then the derived term order ≻_λ is stable under grounding substitutions (w.r.t. βη-equivalence classes).

Proof
Assume s ≻_λ s′ for some terms s and s′. Let θ be a higher-order substitution grounding s and s′. We must show sθ ≻_λ s′θ. We will define a first-order substitution ρ grounding O(s) and O(s′) such that O(s)ρ = O(sθ) and O(s′)ρ = O(s′θ). Since s ≻_λ s′, we have O(s) ≻_fo O(s′). By stability of ≻_fo under grounding substitutions, O(s)ρ ≻_fo O(s′)ρ. It follows that O(sθ) ≻_fo O(s′θ) and hence sθ ≻_λ s′θ.

We define the first-order substitution ρ by αρ = αθ for type variables α and z_u ρ = O(uθ) for terms u. Strictly speaking, the domain of a substitution must be finite, so we restrict this definition of ρ to the finitely many variables that occur in the computation of O(s) and O(s′). Clearly O(τ)ρ = O(τθ) for all types τ occurring in the computation of O(s) and O(s′). Moreover, O(t)ρ = O(tθ) for all t occurring in the computation of O(s) and O(s′), which we show by induction on the definition of the encoding. If t = x or if t is fluid, then O(t)ρ = z_t ρ = O(tθ). If t = f⟨τ̄⟩ ū, then O(t)ρ = f_k(O(τ̄)ρ, O(ū)ρ) = f_k(O(τ̄θ), O(ūθ)) = O(f⟨τ̄θ⟩ (ūθ)) = O(tθ), using the induction hypothesis in the second step. If t = λx : τ. u and t is not fluid, then O(t)ρ = lam(O(τ)ρ, O(B_x(u))ρ) = lam(O(τθ), O(B_x(u)θ)) = lam(O(τθ), O(B_x(u)θ[x ↦ x])) = O(λx : τθ. uθ[x ↦ x]) = O((λx : τ. u)θ) = O(tθ), using the induction hypothesis in the second step. ⊓⊔

Besides soundness, the most important property of the Boolean-free λ-superposition calculus introduced in Sect. 3 is refutational completeness. We will prove static and dynamic refutational completeness of HInf w.r.t. (HRed_I, HRed_C), which is defined as follows:

Definition 32 (Static refutational completeness)
Let Inf be an inference system and let (Red_I, Red_C) be a redundancy criterion. The inference system Inf is statically refutationally complete w.r.t. (Red_I, Red_C) if we have N ⊨ ⊥ if and only if ⊥ ∈ N for every clause set N that is saturated w.r.t. Inf and Red_I.

Definition 33 (Dynamic refutational completeness) Let Inf be an inference system and let (Red_I, Red_C) be a redundancy criterion. Let (N_i)_i be a finite or infinite sequence over sets of clauses. Such a sequence is a derivation if N_i \ N_{i+1} ⊆ Red_C(N_{i+1}) for all i. It is fair if all Inf-inferences from clauses in the limit inferior ⋃_i ⋂_{j≥i} N_j are contained in ⋃_i Red_I(N_i). The inference system Inf is dynamically refutationally complete w.r.t. (Red_I, Red_C) if for every fair derivation (N_i)_i such that N₀ ⊨ ⊥, we have ⊥ ∈ N_i for some i.

The proof proceeds in three steps, corresponding to the three levels GF, GH, and H introduced in Sect. 3.4:

1. We use Bachmair and Ganzinger's work on the refutational completeness of standard (first-order) superposition [6] to prove static refutational completeness of GFInf.
2. From the first-order model constructed in Bachmair and Ganzinger's proof, we derive a clausal higher-order model and thus prove static refutational completeness of GHInf.
3. We use the saturation framework by Waldmann et al. [69] to lift the static refutational completeness of GHInf to static and dynamic refutational completeness of HInf.

In the first step, since the inference system GFInf is standard ground superposition, we can make use of Bachmair and Ganzinger's results. Given a saturated clause set N ⊆ C_GF with ⊥ ∉ N, Bachmair and Ganzinger prove refutational completeness by constructing a term rewriting system R_N and showing that it can be viewed as an interpretation that is a model of N. This first step deals exclusively with ground first-order clauses.

In the second step, we derive refutational completeness of GHInf. Given a saturated clause set N ⊆ C_GH with ⊥ ∉ N, we use the first-order model R_{F(N)} of F(N) constructed in the first step to derive a clausal higher-order interpretation that is a model of N. Under the encoding F, occurrences of the same symbol with different numbers of arguments are regarded as different symbols, e.g., F(f) = f₀ and F(f a) = f₁(a). All λ-expressions λx. t are regarded as uninterpreted symbols lam_{λx.t}. The difficulty is to construct a higher-order interpretation that merges the first-order denotations of all f_i into a single higher-order denotation of f and to show that the symbols lam_{λx.t} behave like λx. t. This step relies on saturation w.r.t. the GARGCONG rule, which connects a term of functional type with its value when applied to an argument x, and on the presence of the extensionality rule GEXT.

In the third step, we employ the saturation framework by Waldmann et al. [69], which is based on Bachmair and Ganzinger's framework [7, Sect. 4], to prove refutational completeness of HInf. Both saturation frameworks help calculus designers prove static and dynamic refutational completeness of nonground calculi. In addition, the framework by Waldmann et al. explicitly supports the redundancy criterion defined in Sect. 3.4, which can be used to justify the deletion of subsumed clauses.
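To get a feel for the candidate-model construction mentioned in the first step (spelled out in Definition 34), here is a drastically simplified Python sketch: clauses are ground unit equations over constants, so conditions (a)–(d) of the construction collapse, leaving only orientation and irreducibility of the left-hand side. All names and representations here are our own:

```python
def build_candidate_model(equations, precedence):
    """Miniature candidate-model construction in the spirit of Definition 34,
    restricted to ground unit equations over constants. Clauses are processed
    in increasing clause order; a clause produces a rule s -> t only if s is
    still irreducible w.r.t. the rules produced by smaller clauses."""
    order = {c: i for i, c in enumerate(precedence)}   # precedence[0] is smallest
    def clause_key(eq):                                # clause order on {s ≈ t}
        return tuple(sorted((order[eq[0]], order[eq[1]]), reverse=True))
    rules = {}
    for s, t in sorted(equations, key=clause_key):     # induction on clause order
        if order[s] < order[t]:
            s, t = t, s                                # orient: larger side rewrites
        if s not in rules:                             # irreducibility of s
            rules[s] = t                               # the clause produces s -> t
    return rules

def normal_form(x, rules):
    while x in rules:
        x = rules[x]
    return x

R = build_candidate_model([('b', 'a'), ('c', 'b'), ('c', 'a')], 'abc')
assert R == {'b': 'a', 'c': 'a'}            # c ≈ b produces nothing: c is reducible
assert all(normal_form(s, R) == normal_form(t, R)   # R is a model of all equations
           for s, t in [('b', 'a'), ('c', 'b'), ('c', 'a')])
```

Although c ≈ b produces no rule (its left-hand side is already reducible), the resulting rewrite system still satisfies all three equations, which is exactly the invariant the completeness proof exploits.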
Moreover, their saturation framework provides completeness theorems for prover architectures, such as the DISCOUNT loop.

The main proof obligation we must discharge to use the framework is that there should exist nonground inferences in HInf corresponding to all nonredundant inferences in GHInf. We face two specifically higher-order difficulties. First, in standard superposition, we can avoid SUP inferences into variables x by exploiting the clause order's compatibility with contexts: If t′ ≺ t, we have C{x ↦ t′} ≺ C{x ↦ t}, which allows us to show that SUP inferences into variables are redundant. This technique fails for higher-order variables x that occur applied in C, because the order lacks compatibility with arguments. This is why our SUP rule must perform some inferences into variables. The other difficulty also concerns applied variables. We must show that any nonredundant SUP inference in level GH into a position corresponding to a fluid term or a deeply occurring variable in level H can be lifted to a FLUIDSUP inference. This involves showing that the z variable in FLUIDSUP can represent arbitrary contexts around a term t.

For the entire proof of refutational completeness, βη-normalization is the proverbial dog that did not bark. On level GH, the rules SUP, ERES, and EFACT preserve η-short β-normal form, and so does first-order term rewriting. Thus, we can completely ignore −→_β and −→_η. On level H, instantiation can cause β- and η-reduction, but this poses no difficulties thanks to the clause order's stability under grounding substitutions.

We use Bachmair and Ganzinger's results on standard superposition [6] to prove refutational completeness of GF. In the subsequent steps, we will also make use of specific properties of the model Bachmair and Ganzinger construct. The basis of Bachmair and Ganzinger's proof is that a term rewriting system R defines an interpretation T_GF/R such that for every ground equation s ≈ t, we have T_GF/R ⊨ s ≈ t if and only if s ←→*_R t. Formally, T_GF/R denotes the monomorphic first-order interpretation whose universes U_τ consist of the R-equivalence classes over T_GF containing terms of type τ. The interpretation T_GF/R is term-generated, that is, for every element a of the universe of this interpretation and for any valuation ξ, there exists a ground term t such that ⟦t⟧_{T_GF/R}^ξ = a. To lighten notation, we will write R to refer to both the term rewriting system R and the interpretation T_GF/R.

The term rewriting system is constructed as follows:

Definition 34
Let N ⊆ C_GF. We first define sets of rewrite rules E_C^N and R_C^N for all C ∈ N by induction on the clause order. Assume that E_D^N has already been defined for all D ∈ N such that D ≺ C. Then R_C^N = ⋃_{D≺C} E_D^N. Let E_C^N = {s → t} if the following conditions are met:

(a) C = C′ ∨ s ≈ t;
(b) s ≈ t is ≿-maximal in C;
(c) s ≻ t;
(d) C′ is false in R_C^N;
(e) s is irreducible w.r.t. R_C^N.

Then C is said to produce s → t. Otherwise, E_C^N = ∅. Finally, R_N = ⋃_D E_D^N.

Based on Bachmair and Ganzinger's work, Bentkamp et al. [10, Lemma 4.2 and Theorem 4.3] prove the following properties of R_N:

Lemma 35
Let ⊥ ∉ N and N ⊆ C_GF be saturated w.r.t. GFInf and GFRed_I. If C = C′ ∨ s ≈ t ∈ N produces s → t, then s ≈ t is strictly ≽-eligible in C and C′ is false in R_N.

Theorem 36 (Ground first-order static refutational completeness) The inference system GFInf is statically refutationally complete w.r.t. (GFRed_I, GFRed_C). More precisely, if N ⊆ C_GF is a clause set saturated w.r.t. GFInf and GFRed_I such that ⊥ ∉ N, then R_N is a model of N.

In this subsection, let GHSel be a selection function on C_GH, and let N ⊆ C_GH be a clause set saturated w.r.t. GHInf^GHSel and GHRed_I^GHSel such that ⊥ ∉ N. Clearly, F(N) is then saturated w.r.t. GFInf^{F(GHSel)} and GFRed_I^{F(GHSel)}. We abbreviate R_{F(N)} as R. Given two terms s, t ∈ T_GH, we write s ∼ t to abbreviate R ⊨ F(s) ≈ F(t), which is equivalent to ⟦F(s)⟧_R = ⟦F(t)⟧_R.

Lemma 37
For all terms t, s : τ → υ in T_GH, the following statements are equivalent:

1. t ∼ s;
2. t (diff t s) ∼ s (diff t s);
3. t u ∼ s u for all u ∈ T_GH.

Proof (3) ⇒ (2): Take u := diff t s.

(2) ⇒ (1): Since N is saturated, the GEXT inference that generates the clause C = t (diff t s) ≉ s (diff t s) ∨ t ≈ s is redundant, i.e., C ∈ N ∪ GHRed_C(N), and hence R ⊨ F(C) by Theorem 36 and the assumption that ⊥ ∉ N. Therefore, it follows from t (diff t s) ∼ s (diff t s) that t ∼ s.

(1) ⇒ (3): We assume that t ∼ s, i.e., F(t) ←→*_R F(s). By induction on the number of rewrite steps between F(t) and F(s) and by transitivity of ∼, it suffices to show that F(t) −→_R F(s) implies t u ∼ s u. If the rewrite step F(t) −→_R F(s) is not at the top level, then neither s↓_βη nor t↓_βη can be λ-expressions. Therefore, (s↓_βη) (u↓_βη) and (t↓_βη) (u↓_βη) are in η-short β-normal form, and there is an analogous rewrite step F(t u) −→_R F(s u) using the same rewrite rule. It follows that t u ∼ s u. If the rewrite step F(t) −→_R F(s) is at the top level, F(t) −→ F(s) must be a rule of R. This rule must originate from a productive clause of the form F(C) = F(C′ ∨ t ≈ s). By Lemma 35, F(t ≈ s) is strictly ≽-eligible in F(C) w.r.t. F(GHSel), and hence t ≈ s is strictly ≽-eligible in C w.r.t. GHSel. Thus, the following GARGCONG inference ι is applicable:

C′ ∨ t ≈ s
―――――――――――――― GARGCONG
C′ ∨ t u ≈ s u

By saturation, ι is redundant w.r.t. N, i.e., concl(ι) ∈ N ∪ GHRed_C(N). By Theorem 36 and the assumption that ⊥ ∉ N, F(concl(ι)) is then true in R. By Lemma 35, F(C′) is false in R. Therefore, F(t u ≈ s u) must be true in R. ⊓⊔

Lemma 38
Let s ∈ T_H, and let θ and θ′ be grounding substitutions such that xθ ∼ xθ′ for all variables x and αθ = αθ′ for all type variables α. Then sθ ∼ sθ′.

Proof
In this proof, we work directly on λ-terms. To prove the lemma, it suffices to prove it for any λ-term s. Here, for λ-terms t₁ and t₂, the notation t₁ ∼ t₂ is to be read as t₁↓βη ∼ t₂↓βη because F is only defined on η-short β-normal terms.

Definition
We extend the syntax of λ-terms with a new polymorphic function symbol ⊕ : Πα. α → α → α. We will omit its type argument. It is equipped with two reduction rules: ⊕ t s → t and ⊕ t s → s. A β⊕-reduction step is either a rewrite step following one of these rules or a β-reduction step.

The computability path order ≻_CPO [22] guarantees that

– ⊕ t s ≻_CPO s by applying rule @⊲;
– ⊕ t s ≻_CPO t by applying rule @⊲ twice;
– (λx. t) s ≻_CPO t[x ↦ s] by applying rule @β.

Since this order is moreover monotone, it decreases with β⊕-reduction steps. The order is also well founded; thus, β⊕-reductions terminate. And since the β⊕-reduction steps describe a finitely branching term rewriting system, by Kőnig's lemma [44], there is a maximal number of β⊕-reduction steps from each λ-term.

Definition A λ-term is term-ground if it does not contain free term variables. It may contain polymorphic type arguments.

Definition
We introduce an auxiliary function S that essentially measures the size of a λ-term but assigns a size of 1 to term-ground λ-terms:

S(s) = 1           if s is term-ground or is a bound or free variable or a symbol
S(s) = 1 + S(t)    if s is not term-ground and has the form λx. t
S(s) = S(t) + S(u) if s is not term-ground and has the form t u

We prove sθ ∼ sθ′ by well-founded induction on s, θ, and θ′ using the left-to-right lexicographic order on the triple (n₁(s), n₂(s), n₃(s)) ∈ ℕ³, where

– n₁(s) is the maximal number of β⊕-reduction steps starting from sσ, where σ is the substitution mapping each term variable x to ⊕ xθ xθ′;
– n₂(s) is the number of free term variables occurring more than once in s;
– n₃(s) = S(s).

Case
1: The λ-term s is term-ground. Then the lemma is trivial.

Case
2: The λ-term s contains k ≥ 2 free term variables. Then we prove sθ ∼ sθ′ as follows. Let x be one of the free term variables in s. Let ρ = {x ↦ xθ} be the substitution that maps x to xθ and ignores all other variables. Let ρ′ = θ′[x ↦ x].

We want to invoke the induction hypothesis on sρ and sρ′. This is justified because sσ β⊕-reduces to sρσ and to sρ′σ. These ⊕-reductions have at least one step because x occurs in s and k ≥
2. Hence, n₁(s) > n₁(sρ) and n₁(s) > n₁(sρ′).

This application of the induction hypothesis gives us sρθ ∼ sρθ′ and sρ′θ ∼ sρ′θ′. Since sρθ = sθ and sρ′θ′ = sθ′, this is equivalent to sθ ∼ sρθ′ and sρ′θ ∼ sθ′. Since moreover sρθ′ = sρ′θ, we have sθ ∼ sθ′ by transitivity of ∼. In summary, sθ ∼ sρθ′ = sρ′θ ∼ sθ′, where both ∼-steps hold by the induction hypothesis.

Case
3: The λ-term s contains a free term variable that occurs more than once. Then we rename variable occurrences apart by replacing each occurrence of each free term variable x by a fresh variable xᵢ, for which we define xᵢθ = xθ and xᵢθ′ = xθ′. Let s′ be the resulting λ-term. Since sσ = s′σ, we have n₁(s) = n₁(s′). All free term variables occur only once in s′. Hence, n₂(s) > 0 = n₂(s′). Therefore, we can invoke the induction hypothesis on s′ to obtain s′θ ∼ s′θ′. Since sθ = s′θ and sθ′ = s′θ′, it follows that sθ ∼ sθ′.

Case
4: The λ-term s contains only one free term variable x, which occurs exactly once.

Case 4.1: The λ-term s is of the form f⟨τ̄⟩ t̄ for some symbol f, some types τ̄, and some λ-terms t̄. Then let u be the λ-term in t̄ that contains x. We want to apply the induction hypothesis to u, which can be justified as follows. Consider the longest sequence of β⊕-reductions from uσ. This sequence can be replicated inside sσ = (f⟨τ̄⟩ t̄)σ. Therefore, the longest sequence of β⊕-reductions from sσ is at least as long—i.e., n₁(s) ≥ n₁(u). Since both s and u have only one free term variable occurrence, we have n₂(s) = 0 = n₂(u). But n₃(s) > n₃(u) because u is a term-nonground subterm of s.

Applying the induction hypothesis gives us uθ ∼ uθ′. By definition of F, we have F((f⟨τ̄⟩ t̄)θ) = f_τ̄θ^m(F(t̄θ)) and analogously for θ′, where m is the length of t̄. By congruence of ≈ in first-order logic, it follows that sθ ∼ sθ′.

Case 4.2: The λ-term s is of the form x t̄ for some λ-terms t̄. Then we observe that, by assumption, xθ ∼ xθ′. By applying Lemma 37 repeatedly, we have xθ t̄ ∼ xθ′ t̄. Since x occurs only once, t̄ is term-ground and hence sθ = xθ t̄ and sθ′ = xθ′ t̄. Therefore, sθ ∼ sθ′.

Case 4.3: The λ-term s is of the form λz. u for some λ-term u. Then we observe that to prove sθ ∼ sθ′, it suffices to show that sθ (diff sθ sθ′) ∼ sθ′ (diff sθ sθ′) by Lemma 37. Via βη-conversion, this is equivalent to uρθ ∼ uρθ′, where ρ = {z ↦ diff (sθ↓βη) (sθ′↓βη)}. To prove uρθ ∼ uρθ′, we apply the induction hypothesis on uρ.

It remains to show that the induction hypothesis is applicable on uρ. Consider the longest sequence of β⊕-reductions from uρσ. Since zρ starts with the diff symbol, zρ will not cause more β⊕-reductions than z. Hence, the same sequence of β⊕-reductions can be applied inside sσ = (λz. u)σ, proving that n₁(s) ≥ n₁(uρ).
Since both s and uρ have only one free term variable occurrence, n₂(s) = 0 = n₂(uρ). But n₃(s) = S(s) = 1 + S(u) because s is term-nonground. Moreover, S(u) ≥ S(uρ) = n₃(uρ) because ρ replaces a variable by a ground λ-term. Hence, n₃(s) > n₃(uρ), which justifies the application of the induction hypothesis.

Case 4.4: The λ-term s is of the form (λz. u) t₀ t̄ for some λ-terms u, t₀, and t̄. We apply the induction hypothesis on s′ = u{z ↦ t₀} t̄. To justify it, consider the longest sequence of β⊕-reductions from s′σ. Prepending the reduction sσ →_β s′σ to it gives us a longer sequence from sσ. Hence, n₁(s) > n₁(s′). The induction hypothesis gives us s′θ ∼ s′θ′. Since ∼ is invariant under β-reductions, it follows that sθ ∼ sθ′. ⊓⊔

We proceed by defining a higher-order interpretation I_GH = (U_GH, J_GH^ty, J_GH, L_GH) derived from R. The interpretation R is an interpretation in monomorphic first-order logic. Let U_τ be its universe for type τ and J its interpretation function.

To illustrate the construction, we will employ the following running example. Let the higher-order signature be Σ_ty = {ι, →} and Σ = {f : ι → ι, a : ι, b : ι}. The first-order signature accordingly consists of Σ_ty and Σ_GF = {f₀, f₁, a, b} ∪ {lam_{λx.t} | λx.t ∈ T_GH}. We write [t] for the equivalence class of t ∈ T_GF modulo R. We assume that [f₀] = [lam_{λx.x}], [a] = [f₁(a)], [b] = [f₁(b)], and that f₀, lam_{λx.a}, lam_{λx.b}, a, and b are in disjoint equivalence classes. Hence, U_{ι→ι} = {[f₀], [lam_{λx.a}], [lam_{λx.b}], . . .} and U_ι = {[a], [b]}.

When defining the universe U_GH of the higher-order interpretation, we need to ensure that it contains subsets of function spaces, since J_GH^ty(→)(D₁, D₂) must be a subset of the function space from D₁ to D₂ for all D₁, D₂ ∈ U_GH. But the first-order universes U_τ consist of equivalence classes of terms from T_GF w.r.t.
the rewriting system R, not of functions. To repair this mismatch, we will define a family of functions E_τ that give a meaning to the elements of the first-order universes U_τ. We will define a domain D_τ for each ground type τ and then let U_GH be the set of all these domains D_τ. Thus, there will be a one-to-one correspondence between ground types and domains. Since the higher-order and first-order type signatures are identical (including →, which is uninterpreted in first-order logic), we can identify higher-order and first-order types.

We define E_τ and D_τ in a mutual recursion and simultaneously prove that E_τ is a bijection. We start with nonfunctional types τ: Let D_τ = U_τ, and let E_τ : U_τ → D_τ be the identity. We proceed by defining E_{τ→υ} and D_{τ→υ}. We assume that E_τ, E_υ, D_τ, and D_υ have already been defined and that E_τ and E_υ are bijections. To ensure that E_{τ→υ} will be bijective, we first define an injective function Ê_{τ→υ} : U_{τ→υ} → (D_τ → D_υ), define D_{τ→υ} as its image Ê_{τ→υ}(U_{τ→υ}), and finally define E_{τ→υ} as Ê_{τ→υ} with its codomain restricted to D_{τ→υ}:

Ê_{τ→υ} : U_{τ→υ} → (D_τ → D_υ)
Ê_{τ→υ}(⟦F(s)⟧_R)(E_τ(⟦F(u)⟧_R)) = E_υ(⟦F(s u)⟧_R)

This is a valid definition because each element of U_{τ→υ} is of the form ⟦F(s)⟧_R for some s, and each element of D_τ is of the form E_τ(⟦F(u)⟧_R) for some u. This function is well defined if it does not depend on the choice of s and u. To show this, we assume that there are other ground terms t and v such that ⟦F(s)⟧_R = ⟦F(t)⟧_R and E_τ(⟦F(u)⟧_R) = E_τ(⟦F(v)⟧_R). Since E_τ is bijective, we have ⟦F(u)⟧_R = ⟦F(v)⟧_R. Using the ∼-notation, we can write this as u ∼ v.
Applying Lemma 38 to the term x y and the substitutions {x ↦ s, y ↦ u} and {x ↦ t, y ↦ v}, we obtain s u ∼ t v—i.e., ⟦F(s u)⟧_R = ⟦F(t v)⟧_R. Thus, Ê_{τ→υ} is well defined. It remains to show that Ê_{τ→υ} is injective as a function from U_{τ→υ} to D_τ → D_υ. Assume two terms s, t ∈ T_GH such that for all u ∈ T_GH, we have ⟦F(s u)⟧_R = ⟦F(t u)⟧_R. By Lemma 37, it follows that ⟦F(s)⟧_R = ⟦F(t)⟧_R, which concludes the proof that Ê_{τ→υ} is injective.

We define D_{τ→υ} = Ê_{τ→υ}(U_{τ→υ}) and E_{τ→υ}(a) = Ê_{τ→υ}(a). This ensures that E_{τ→υ} is bijective and concludes the inductive definition of D and E. In the following, we will usually write E instead of E_τ, since the type τ is determined by the first argument of E_τ.

In our running example, we thus have D_ι = U_ι = {[a], [b]}, and E_ι is the identity U_ι → D_ι, c ↦ c. The function E_{ι→ι} maps [f₀] to the identity D_ι → D_ι, c ↦ c; it maps [lam_{λx.a}] to the constant function D_ι → D_ι, c ↦ [a]; and it maps [lam_{λx.b}] to the constant function D_ι → D_ι, c ↦ [b]. The swapping function [a] ↦ [b], [b] ↦ [a] is not in the image of E_{ι→ι}. Therefore, D_{ι→ι} contains only the identity and the two constant functions, but not this swapping function.

We define the higher-order universe as U_GH = {D_τ | τ ground}. Moreover, we define J_GH^ty(κ)(D_τ̄) = U_{κ(τ̄)} for all κ ∈ Σ_ty, completing the type interpretation I_GH^ty = (U_GH, J_GH^ty). We define the interpretation function as J_GH(f, D_ῡₘ) = E(J(f_ῡₘ)) for all f : Πᾱₘ. τ. In our example, we thus have J_GH(f) = E([f₀]), which is the identity on D_ι.

Finally, we need to define the designation function L_GH, which takes a valuation ξ and a λ-expression as arguments.
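The finite core of the running example can be checked concretely. The following sketch is our own toy encoding, not part of the paper's formal development: the labels '[f0]', '[lam x.a]', and '[lam x.b]' merely name equivalence classes, and functions in D_ι → D_ι are encoded as dicts. It verifies that E_{ι→ι} is injective and that the swapping function lies outside its image.

```python
# Toy rendition of the running example: D_ι = {'[a]', '[b]'}, and E maps
# the three first-order classes of type ι → ι to functions D_ι → D_ι,
# encoded as dicts.  The string labels are illustrative, not formal syntax.

D_iota = {'[a]', '[b]'}

E = {
    '[f0]':      {'[a]': '[a]', '[b]': '[b]'},   # E([f0]) is the identity
    '[lam x.a]': {'[a]': '[a]', '[b]': '[a]'},   # constant function to [a]
    '[lam x.b]': {'[a]': '[b]', '[b]': '[b]'},   # constant function to [b]
}

# E is injective: no two classes denote the same function.
assert len({tuple(sorted(f.items())) for f in E.values()}) == len(E)

# The swapping function inhabits the full function space D_ι → D_ι but is
# not in the image of E, so it is absent from D_{ι→ι}.
swap = {'[a]': '[b]', '[b]': '[a]'}
assert swap not in E.values()
print('D_(ι→ι) has', len(E), 'elements')
```

The point of the check is exactly the one made above: D_{ι→ι} is a proper subset of the set-theoretic function space, containing only those functions that are denotations of first-order equivalence classes.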
Given a valuation ξ, we choose a grounding substitution θ such that D_αθ = ξ(α) and E(⟦F(xθ)⟧_R) = ξ(x) for all type variables α and all term variables x. Such a substitution can be constructed as follows: We can fulfill the first equation in a unique way because there is a one-to-one correspondence between ground types and domains. Since E⁻¹(ξ(x)) is an element of a first-order universe and R is term-generated, there exists a ground term t such that ⟦t⟧_R = E⁻¹(ξ(x)). Choosing one such t and defining xθ = F⁻¹(t) gives us a grounding substitution θ with the desired property.

We define L_GH(ξ, (λx. t)) = E(⟦F((λx. t)θ)⟧_R). To prove that this is well defined, we assume that there exists another substitution θ′ with the properties D_αθ′ = ξ(α) for all α and E(⟦F(xθ′)⟧_R) = ξ(x) for all x. Then we have αθ = αθ′ for all α due to the one-to-one correspondence between domains and ground types. We have ⟦F(xθ)⟧_R = ⟦F(xθ′)⟧_R for all x because E is injective. By Lemma 38, it follows that ⟦F((λx. t)θ)⟧_R = ⟦F((λx. t)θ′)⟧_R, which proves that L_GH is well defined.

In our example, for all ξ we have L_GH(ξ, λx. x) = E([lam_{λx.x}]) = E([f₀]), which is the identity. If ξ(y) = [a], then L_GH(ξ, λx. y) = E([lam_{λx.a}]), which is the constant function c ↦ [a]. Similarly, if ξ(y) = [b], then L_GH(ξ, λx. y) is the constant function c ↦ [b].

This concludes the definition of the interpretation I_GH = (U_GH, J_GH^ty, J_GH, L_GH). It remains to show that I_GH is proper. In a proper interpretation, the denotation ⟦t⟧_{I_GH} of a term t does not depend on the representative of t modulo βη, but since we have not yet shown I_GH to be proper, we cannot rely on this property.
For this reason, we use λ-terms in the following three lemmas and mark all βη-reductions explicitly. The higher-order interpretation I_GH relates to the first-order interpretation R as follows:

Lemma 39
Given a ground λ-term t, we have ⟦t⟧_{I_GH} = E(⟦F(t↓βη)⟧_R).

Proof
By induction on t. Assume that ⟦s⟧_{I_GH} = E(⟦F(s↓βη)⟧_R) for all proper subterms s of t. If t is of the form f⟨τ̄⟩, then

⟦t⟧_{I_GH} = J_GH(f, D_τ̄) = E(J(f_τ̄)) = E(⟦f_τ̄⟧_R) = E(⟦F(f⟨τ̄⟩)⟧_R) = E(⟦F(f⟨τ̄⟩↓βη)⟧_R) = E(⟦F(t↓βη)⟧_R)

If t is an application t = t₁ t₂, where t₁ is of type τ → υ, then

⟦t₁ t₂⟧_{I_GH} = ⟦t₁⟧_{I_GH}(⟦t₂⟧_{I_GH})
  = E_{τ→υ}(⟦F(t₁↓βη)⟧_R)(E_τ(⟦F(t₂↓βη)⟧_R))   by the induction hypothesis
  = E_υ(⟦F((t₁ t₂)↓βη)⟧_R)   by the definition of E

If t is a λ-expression, then

⟦λx. u⟧^ξ_{I_GH} = L_GH(ξ, (λx. u)) = E(⟦F((λx. u)θ↓βη)⟧_R) = E(⟦F((λx. u)↓βη)⟧_R)

where θ is a substitution such that D_αθ = ξ(α) and E(⟦F(xθ)⟧_R) = ξ(x). ⊓⊔

We need to show that the interpretation I_GH = (U_GH, J_GH^ty, J_GH, L_GH) is proper. In the proof, we will need to employ the following lemma, which is very similar to the substitution lemma (Lemma 18), but we must prove it here for our particular interpretation I_GH because we have not shown that I_GH is proper yet.

Lemma 40 (Substitution lemma) ⟦τρ⟧^ξ_{I_GH^ty} = ⟦τ⟧^{ξ′}_{I_GH^ty} and ⟦tρ⟧^ξ_{I_GH} = ⟦t⟧^{ξ′}_{I_GH} for all λ-terms t, all τ ∈ Ty_H, and all grounding substitutions ρ, where ξ′(α) = ⟦αρ⟧^ξ_{I_GH^ty} for all type variables α and ξ′(x) = ⟦xρ⟧^ξ_{I_GH} for all term variables x.

Proof
We proceed by induction on the structure of τ and t. The proof is identical to the one of Lemma 18, except for the last step, which uses properness of the interpretation, a property we cannot assume here. However, here we have the assumption that ρ is a grounding substitution. Therefore, if t is a λ-expression, we argue as follows:

⟦(λz. u)ρ⟧^ξ_{I_GH}
= ⟦λz. uρ′⟧^ξ_{I_GH}   where ρ′(z) = z and ρ′(x) = ρ(x) for x ≠ z
= L_GH(ξ, (λz. uρ′))   by the definition of the term denotation
= E(⟦F((λz. u)ρθ↓βη)⟧_R)   for some θ, by the definition of L_GH
= E(⟦F((λz. u)ρ↓βη)⟧_R)   because (λz. u)ρ is ground
=* L_GH(ξ′, λz. u)   by the definition of L_GH and Lemma 39
= ⟦λz. u⟧^{ξ′}_{I_GH}   by the definition of the term denotation

The step * is justified as follows: We have L_GH(ξ′, λz. u) = E(⟦F((λz. u)θ′↓βη)⟧_R) by the definition of L_GH, if θ′ is a substitution such that D_αθ′ = ξ′(α) for all α and E(⟦F(xθ′↓βη)⟧_R) = ξ′(x) for all x. By the definition of ξ′ and by Lemma 39, ρ is such a substitution. Hence, L_GH(ξ′, λz. u) = E(⟦F((λz. u)ρ↓βη)⟧_R). ⊓⊔

Lemma 41
The interpretation I_GH is proper.

Proof
We must show that ⟦(λx. t)⟧^ξ_{I_GH}(a) = ⟦t⟧^{ξ[x↦a]}_{I_GH} for all λ-expressions λx. t, all valuations ξ, and all values a:

⟦λx. t⟧^ξ_{I_GH}(a)
= L_GH(ξ, λx. t)(a)   by the definition of ⟦ ⟧_{I_GH}
= E(⟦F((λx. t)θ↓βη)⟧_R)(a)   by the definition of L_GH, for some θ such that E(⟦F(zθ)⟧_R) = ξ(z) for all z and D_αθ = ξ(α) for all α
= E(⟦F(((λx. t)θ s)↓βη)⟧_R)   by the definition of E, where E(⟦F(s)⟧_R) = a
= E(⟦F(t(θ[x ↦ s])↓βη)⟧_R)   by β-reduction
= ⟦t(θ[x ↦ s])⟧_{I_GH}   by Lemma 39
= ⟦t⟧^{ξ[x↦a]}_{I_GH}   by Lemma 40   ⊓⊔

Lemma 42 I_GH is a model of N.

Proof
By Lemma 39, we have ⟦t⟧_{I_GH} = E(⟦F(t)⟧_R) for all t ∈ T_GH. Since E is a bijection, it follows that any (dis)equation s ˙≈ t ∈ C_GH is true in I_GH if and only if F(s ˙≈ t) is true in R. Hence, a clause C ∈ C_GH is true in I_GH if and only if F(C) is true in R. By Theorem 36 and the assumption that ⊥ ∉ N, R is a model of F(N)—that is, for all clauses C ∈ N, F(C) is true in R. Hence, all clauses C ∈ N are true in I_GH, and therefore I_GH is a model of N. ⊓⊔

We summarize the results of this subsection in the following theorem:
Theorem 43 (Ground static refutational completeness)
Let GHSel be a selection function on C_GH. Then the inference system GHInf^GHSel is statically refutationally complete w.r.t. (GHRed_I, GHRed_C). In other words, if N ⊆ C_GH is a clause set saturated w.r.t. GHInf^GHSel and GHRed_I^GHSel, then N |= ⊥ if and only if ⊥ ∈ N.

The construction of I_GH relies on specific properties of R. It would not work with an arbitrary first-order interpretation. Transforming a higher-order interpretation into a first-order interpretation is easier:

Lemma 44
Given a clausal higher-order interpretation I on GH, there exists a first-order interpretation I_GF on GF such that for any clause C ∈ C_GH the truth values of C in I and of F(C) in I_GF coincide.

Proof
Let I = (I_ty, J, L) be a clausal higher-order interpretation. Let U_GF^τ = ⟦τ⟧_{I_ty} be the first-order type universe for the ground type τ. For a symbol f_ῡ^j ∈ Σ_GF, let J_GF(f_ῡ^j) = ⟦f⟨ῡ⟩⟧_I (up to currying). For a symbol lam_{λx.t} ∈ Σ_GF, let J_GF(lam_{λx.t}) = ⟦λx. t⟧_I. This defines a first-order interpretation I_GF = (U_GF, J_GF).

We need to show that for any C ∈ C_GH, I |= C if and only if I_GF |= F(C). It suffices to show that ⟦t⟧_I = ⟦F(t)⟧_{I_GF} for all terms t ∈ T_GH. We prove this by induction on the structure of the η-short β-normal form of t. If t is a λ-expression, this is obvious. If t is of the form f⟨ῡ⟩ s̄_j, then F(t) = f_ῡ^j(F(s̄_j)) and hence ⟦F(t)⟧_{I_GF} = J_GF(f_ῡ^j)(⟦F(s̄_j)⟧_{I_GF}) = ⟦f⟨ῡ⟩⟧_I(⟦F(s̄_j)⟧_{I_GF}) = ⟦f⟨ῡ⟩⟧_I(⟦s̄_j⟧_I) = ⟦t⟧_I, where the third step holds by the induction hypothesis. ⊓⊔

To lift the result to the nonground level, we employ the saturation framework of Waldmann et al. [69]. It is easy to see that the entailment relation |= on GH is a consequence relation in the sense of the framework. We need to show that our redundancy criterion on GH is a redundancy criterion in the sense of the framework and that G is a grounding function in the sense of the framework:

Lemma 45
The redundancy criterion for GH is a redundancy criterion in the sense of Sect. 2 of the saturation framework.
Proof
We must prove the conditions (R1) to (R4) of the saturation framework. Adapted to our context, they state the following for all clause sets N, N′ ⊆ C_GH:

(R1) if N |= ⊥, then N \ GHRed_C(N) |= ⊥;
(R2) if N ⊆ N′, then GHRed_C(N) ⊆ GHRed_C(N′) and GHRed_I(N) ⊆ GHRed_I(N′);
(R3) if N′ ⊆ GHRed_C(N), then GHRed_C(N) ⊆ GHRed_C(N \ N′) and GHRed_I(N) ⊆ GHRed_I(N \ N′);
(R4) if ι ∈ GHInf and concl(ι) ∈ N, then ι ∈ GHRed_I(N).

The proof is analogous to the proof of Lemma 4.10 of Bentkamp et al. [10], using Lemma 44. ⊓⊔

Lemma 46
The grounding functions G^GHSel for GHSel ∈ G(HSel) are grounding functions in the sense of Sect. 3 of the saturation framework.

Proof We must prove the conditions (G1), (G2), and (G3) of the saturation framework. Adapted to our context, they state the following:

(G1) G(⊥) = {⊥};
(G2) for every C ∈ C_H, if ⊥ ∈ G(C), then C = ⊥;
(G3) for every ι ∈ HInf, G^GHSel(ι) ⊆ GHRed_I^GHSel(G(concl(ι))).

Clearly, C = ⊥ if and only if ⊥ ∈ G(C) if and only if G(C) = {⊥}, proving (G1) and (G2). For every ι ∈ HInf, by the definition of G^GHSel, we have concl(G^GHSel(ι)) ⊆ G(concl(ι)), and thus (G3) by (R4). ⊓⊔

To lift the completeness result of the previous subsection to the nonground calculus
HInf, we employ Theorem 14 of the saturation framework, which, adapted to our context, is stated as follows. The theorem uses the notation Inf(N) to denote the set of Inf-inferences whose premises are in N, for an inference system Inf and a clause set N. Moreover, it uses Herbrand entailment |=_G on C_H, which is defined so that N₁ |=_G N₂ if and only if G(N₁) |= G(N₂).

Theorem 47 (Lifting theorem) If GHInf^GHSel is statically refutationally complete w.r.t. (GHRed_I^GHSel, GHRed_C) for every GHSel ∈ G(HSel), and if for every N ⊆ C_H that is saturated w.r.t. HInf and HRed_I there exists a GHSel ∈ G(HSel) such that GHInf^GHSel(G(N)) ⊆ G^GHSel(HInf(N)) ∪ GHRed_I^GHSel(G(N)), then HInf is also statically refutationally complete w.r.t. (HRed_I, HRed_C) and |=_G.

Proof
This is almost an instance of Theorem 14 of the saturation framework. We take C_H for F, C_GH for G, and G(HSel) for Q. It is easy to see that the entailment relation |= on GH is a consequence relation in the sense of the framework. By Lemmas 45 and 46, (GHRed_I^GHSel, GHRed_C) is a redundancy criterion in the sense of the framework, and the G^GHSel are grounding functions in the sense of the framework, for all GHSel ∈ G(HSel). The redundancy criterion (HRed_I, HRed_C) matches exactly the intersected lifted redundancy criterion Red^∩,⊐ of the saturation framework. Their Theorem 14 states the theorem only for ⊐ = ∅. By their Lemma 16, it also holds if ⊐ ≠ ∅. ⊓⊔

Let N ⊆ C_H be a clause set saturated w.r.t. HInf and
HRed_I. We assume that HSel fulfills the selection restriction that a literal L must not be selected if y ū_n, with n > 0, is a ⪰-maximal term of the clause, as required in Definition 9. For the above theorem to apply, we need to show that there exists a selection function GHSel ∈ G(HSel) such that all inferences ι ∈ GHInf^GHSel with prems(ι) ∈ G(N) are liftable or redundant. Here, for ι to be liftable means that ι is a G^GHSel-ground instance of a HInf-inference from N; for ι to be redundant means that ι ∈ GHRed_I^GHSel(G(N)).

To choose the right selection function GHSel ∈ G(HSel), we observe that each ground clause C ∈ G(N) must have at least one corresponding clause D ∈ N such that C is a ground instance of D. We choose one of them for each C ∈ G(N), which we denote by G⁻¹(C). Then let GHSel select those literals in C that correspond to literals selected by HSel in G⁻¹(C). With respect to this selection function GHSel, we can show that all inferences from G(N) are liftable or redundant:

Lemma 48
Let G⁻¹(C) = D ∈ N and Dθ = C. Let σ and ρ be substitutions such that xσρ = xθ for all variables x in D. Let L be a (strictly) ⪰-eligible literal in C w.r.t. GHSel. Then there exists a (strictly) ≿-eligible literal L′ in D w.r.t. σ and HSel such that L′θ = L.

Proof If L ∈ GHSel(C), then there exists L′ such that L′θ = L and L′ ∈ HSel(D) by the definition of G⁻¹. Otherwise, L is ⪰-maximal in C. Since C = Dσρ, there are literals L′ in Dσ such that L′ρ = L. Choose L′ to be ≿-maximal among them. Then L′ is ≿-maximal in Dσ because for any literal L″ ∈ Dσ with L″ ≿ L′, we have L″ρ ⪰ L′ρ = L and hence L″ρ = L by ⪰-maximality of L.

If L is strictly ⪰-maximal in C, then L′ is also strictly ≿-maximal in Dσ because a duplicate of L′ in Dσ would imply a duplicate of L in C. ⊓⊔

Lemma 49 (Lifting of ERes, EFact, GArgCong, and GExt) All ERes, EFact, GArgCong, and GExt inferences from G(N) are liftable.

Proof ERes: Let ι ∈ GHInf
GHSel be an ERes inference with prems(ι) ∈ G(N). Then ι is of the form

    Cθ = C′θ ∨ sθ ≉ s′θ
    ─────────────────── ERes
    C′θ

where G⁻¹(Cθ) = C = C′ ∨ s ≉ s′ and the literal sθ ≉ s′θ is ⪰-eligible w.r.t. GHSel. Since sθ and s′θ are unifiable and ground, we have sθ = s′θ. Thus, there exists an idempotent σ ∈ CSU(s, s′) such that for some substitution ρ and for all variables x in C, we have xσρ = xθ. By Lemma 48, we may assume without loss of generality that s ≉ s′ is ≿-eligible in C w.r.t. σ and HSel. Hence, the following inference ι′ ∈ HInf is applicable:

    C′ ∨ s ≉ s′
    ─────────── ERes
    C′σ

Then ι is the σρ-ground instance of ι′ and is therefore liftable.

EFact: Analogously, if ι ∈ GHInf
GHSel is an EFact inference with prems(ι) ∈ G(N), then ι is of the form

    Cθ = C′θ ∨ s′θ ≈ t′θ ∨ sθ ≈ tθ
    ────────────────────────────── EFact
    C′θ ∨ tθ ≉ t′θ ∨ sθ ≈ t′θ

where G⁻¹(Cθ) = C = C′ ∨ s′ ≈ t′ ∨ s ≈ t, the literal sθ ≈ tθ is ⪰-eligible in C w.r.t. GHSel, and sθ ≻ tθ. Then s ⋠ t. Moreover, sθ and s′θ are unifiable and ground. Hence, sθ = s′θ, and there exists an idempotent σ ∈ CSU(s, s′) such that for some substitution ρ and for all variables x in C, we have xσρ = xθ. By Lemma 48, we may assume without loss of generality that s ≈ t is ≿-eligible in C w.r.t. σ and HSel. It follows that the following inference ι′ ∈ HInf is applicable:

    C′ ∨ s′ ≈ t′ ∨ s ≈ t
    ──────────────────── EFact
    (C′ ∨ t ≉ t′ ∨ s ≈ t′)σ

Then ι is the σρ-ground instance of ι′ and is therefore liftable.

GArgCong: Let ι ∈ GHInf
GHSel be a GArgCong inference with prems(ι) ∈ G(N). Then ι is of the form

    Cθ = C′θ ∨ sθ ≈ s′θ
    ────────────────────── GArgCong
    C′θ ∨ sθ ū_n ≈ s′θ ū_n

where G⁻¹(Cθ) = C = C′ ∨ s ≈ s′, the literal sθ ≈ s′θ is strictly ⪰-eligible w.r.t. GHSel, and sθ and s′θ are of functional type. It follows that s and s′ have either a functional or a polymorphic type. Let σ be the most general substitution such that sσ and s′σ take n arguments. By Lemma 48, we may assume without loss of generality that s ≈ s′ is strictly ≿-eligible in C w.r.t. σ and HSel. Hence the following inference ι′ ∈ HInf is applicable:

    C′ ∨ s ≈ s′
    ──────────────────────── ArgCong
    C′σ ∨ sσ x̄_n ≈ s′σ x̄_n

Since σ is the most general substitution that ensures well-typedness of the conclusion, ι is a ground instance of ι′ and is therefore liftable.

GExt: The conclusion of a GExt inference in GHInf is by definition a ground instance of the conclusion of an Ext inference in HInf. Hence, the GExt inference is a ground instance of the Ext inference. Therefore, it is liftable. ⊓⊔

Some of the Sup inferences in GHInf are liftable as well:
Lemma 50 (Instances of green subterms)
Let s be a λ-term in η-short β-normal form, let σ be a substitution, and let p be a green position of both s and sσ↓βη. Then (s|_p)σ↓βη = (sσ↓βη)|_p.

Proof
By induction on p. If p = ε, then (s|_p)σ↓βη = sσ↓βη = (sσ↓βη)|_p. If p = i.p′, then s = f⟨τ̄⟩ s₁ . . . s_n and sσ = f⟨τ̄σ⟩ (s₁σ) . . . (s_nσ), where 1 ≤ i ≤ n and p′ is a green position of s_i. Clearly, βη-normalization steps of sσ can take place only in proper subterms. So sσ↓βη = f⟨τ̄σ⟩ (s₁σ↓βη) . . . (s_nσ↓βη). Since p = i.p′ is a green position of sσ↓βη, p′ must be a green position of (s_iσ)↓βη. By the induction hypothesis, (s_i|_{p′})σ↓βη = (s_iσ↓βη)|_{p′}. Therefore, (s|_p)σ↓βη = (s|_{i.p′})σ↓βη = (s_i|_{p′})σ↓βη = (s_iσ↓βη)|_{p′} = (sσ↓βη)|_p. ⊓⊔

Lemma 51 (Lifting of Sup) Let ι ∈ GHInf
GHSel be a Sup inference

    Dθ = D′θ ∨ tθ ≈ t′θ    Cθ = C′θ ∨ sθ⟨tθ⟩_p ˙≈ s′θ
    ────────────────────────────────────────────────── Sup
    D′θ ∨ C′θ ∨ sθ⟨t′θ⟩_p ˙≈ s′θ

where G⁻¹(Dθ) = D = D′ ∨ t ≈ t′ ∈ N, sθ = sθ⟨tθ⟩_p, and G⁻¹(Cθ) = C = C′ ∨ s ˙≈ s′ ∈ N. We assume that s, t, sθ, and tθ are represented by λ-terms in η-short β-normal form. Let p′ be the longest prefix of p that is a green position of s. Since ε is a green position of s, the longest prefix always exists. Let u = s|_{p′}. Suppose one of the following conditions applies:

(i) u is a deeply occurring variable in C;
(ii) p = p′ and the variable condition holds for D and C; or
(iii) p ≠ p′ and u is not a variable.

Then ι is liftable.

Proof
The Sup inference conditions for ι are that tθ ≈ t′θ is strictly ⪰-eligible, sθ ˙≈ s′θ is strictly ⪰-eligible if positive and ⪰-eligible if negative, Dθ ⋡ Cθ, tθ ⋠ t′θ, and sθ ⋠ s′θ. We assume that s, t, sθ, and tθ are represented by λ-terms in η-short β-normal form. By Lemma 50, uθ agrees with sθ|_{p′} (considering both as terms rather than as λ-terms).

Case
1: We have (a) p = p′, (b) u is not fluid, and (c) u is not a variable deeply occurring in C. Then uθ = sθ|_{p′} = sθ|_p = tθ. Since θ is a unifier of u and t, there exists an idempotent σ ∈ CSU(t, u) such that for some substitution ρ and for all variables x occurring in D and C, we have xσρ = xθ. The inference conditions can be lifted: (strict) eligibility of tθ ≈ t′θ and sθ ˙≈ s′θ w.r.t. GHSel implies (strict) eligibility of t ≈ t′ and s ˙≈ s′ w.r.t. σ and HSel; Dθ ⋡ Cθ implies D ⋡ C; tθ ⋠ t′θ implies t ⋠ t′; and sθ ⋠ s′θ implies s ⋠ s′. Moreover, by (a) and (c), condition (ii) must hold, and thus the variable condition holds for D and C. Hence there is the following Sup inference ι′ ∈ HInf:

    D′ ∨ t ≈ t′    C′ ∨ s⟨u⟩_p ˙≈ s′
    ──────────────────────────────── Sup
    (D′ ∨ C′ ∨ s⟨t′⟩_p ˙≈ s′)σ

Then ι is the σρ-ground instance of ι′ and therefore liftable.

Case
2: We have (a) p ≠ p′, or (b) u is fluid, or (c) u is a variable deeply occurring in C. We will first show that (a) implies (b) or (c). Suppose (a) holds but neither (b) nor (c) holds. Then condition (iii) must hold—i.e., u is not a variable. Moreover, since (b) does not hold, u cannot have the form y ū_n for a variable y and n ≥ 1. If u were of the form f⟨τ̄⟩ s₁ . . . s_n with n ≥ 1, then uθ would have the form f⟨τ̄θ⟩ (s₁θ) . . . (s_nθ), but then there would be some 1 ≤ i ≤ n such that p′.i is a prefix of p and s|_{p′.i} is a green subterm of s, contradicting the maximality of p′. So u must be a λ-expression, but since tθ is a proper green subterm of uθ, uθ cannot be a λ-expression, yielding a contradiction. We may thus assume that (b) or (c) holds.

Let p = p′.p″. Let z be a fresh variable. Define a substitution θ′ that maps this variable z to λy. (sθ|_{p′})⟨y⟩_{p″} and any other variable w to wθ. Clearly, (z t)θ′ = (sθ|_{p′})⟨tθ⟩_{p″} = sθ|_{p′} = uθ = uθ′. Since θ′ is a unifier of u and z t, there exists an idempotent σ ∈ CSU(z t, u) such that for some substitution ρ, for x = z, and for all variables x in C and D, we have xσρ = xθ′. As in Case 1, (strict) eligibility of the ground literals implies (strict) eligibility of the nonground literals. Moreover, by construction of θ′, tθ′ = tθ ≠ t′θ = t′θ′ implies (z t)θ′ ≠ (z t′)θ′, and thus (z t)σ ≠ (z t′)σ. Since we also have (b) or (c), there is the following inference ι′:

    D′ ∨ t ≈ t′    C′ ∨ s⟨u⟩_{p′} ˙≈ s′
    ─────────────────────────────────── FluidSup
    (D′ ∨ C′ ∨ s⟨z t′⟩_{p′} ˙≈ s′)σ

Then ι is the σρ-ground instance of ι′ and therefore liftable. ⊓⊔

The other Sup inferences might not be liftable, but they are redundant:

Lemma 52
Let ι ∈ GHInf^GHSel be a Sup inference from G(N) not covered by Lemma 51. Then ι ∈ GHRed_I^GHSel(G(N)).

Proof
Let Cθ = C′θ ∨ sθ ˙≈ s′θ and Dθ = D′θ ∨ tθ ≈ t′θ be the premises of ι, where sθ ˙≈ s′θ and tθ ≈ t′θ are the literals involved in the inference, sθ ≻ s′θ, tθ ≻ t′θ, and C′, D′, s, s′, t, t′ are the respective subclauses and terms in C = G⁻¹(Cθ) and D = G⁻¹(Dθ). Then the inference ι has the form

    D′θ ∨ tθ ≈ t′θ    C′θ ∨ sθ⟨tθ⟩ ˙≈ s′θ
    ───────────────────────────────────── Sup
    D′θ ∨ C′θ ∨ sθ⟨t′θ⟩ ˙≈ s′θ

To show that ι ∈ GHRed_I^GHSel(G(N)), it suffices to show {D ∈ F(G(N)) | D ≺ F(Cθ)} |= F(concl(ι)). To this end, let I be an interpretation in GF such that I |= {D ∈ F(G(N)) | D ≺ F(Cθ)}. We need to show that I |= F(concl(ι)). If F(D′θ) is true in I, then obviously I |= F(concl(ι)). So we assume that F(D′θ) is false in I. Since Cθ ≻ Dθ by the Sup order conditions, it follows that I |= F(tθ ≈ t′θ). Therefore, it suffices to show I |= F(Cθ).

Let p be the position in sθ where ι takes place and p′ be the longest prefix of p that is a green position of s. Let u = s|_{p′}. Since Lemma 51 does not apply to ι, u is not a deeply occurring variable; if p = p′, the variable condition does not hold for D and C; and if p ≠ p′, u is a variable. This means either the position p does not exist in s, because it is below an unapplied variable that does not occur deeply in C, or s|_p is an unapplied variable that does not occur deeply in C and for which the variable condition does not hold.

Case 1: The position p does not exist in s because it is below a variable x that does not occur deeply in C. Then tθ is a green subterm of xθ and hence a green subterm of xθ w̄ for any arguments w̄. Let v be the term that we obtain by replacing tθ by t′θ in xθ at the relevant position. Since I |= F(tθ ≈ t′θ), by congruence, I |= F(xθ w̄ ≈ v w̄) for any arguments w̄. Hence, I |= F(Cθ) if and only if I |= F(C{x ↦ v}θ) by congruence. Here, it is crucial that the variable does not occur deeply in C because congruence does not hold in F-encoded terms below λ-binders. By the inference conditions, we have tθ ≻ t′θ, which implies F(Cθ) ≻ F(C{x ↦ v}θ) by compatibility with green contexts. Therefore, by the assumption about I, we have I |= F(C{x ↦ v}θ) and hence I |= F(Cθ).

Case 2: The term s|_p is a variable x that does not occur deeply in C and for which the variable condition does not hold. From this, we know that Cθ ⪰ C″θ, where C″ = C{x ↦ t′}. We cannot have Cθ = C″θ because xθ = tθ ≠ t′θ and x occurs in C. Hence, we have Cθ ≻ C″θ. By the definition of I, Cθ ≻ C″θ implies I |= F(C″θ). We will use equalities that are true in I to rewrite F(Cθ) into F(C″θ), which implies I |= F(Cθ) by congruence.

By saturation, every ArgCong inference ι′ from D is in HRed_I(N)—i.e., G(concl(ι′)) ⊆ G(N) ∪ GHRed_C(G(N)). Hence, D′θ ∨ tθ ū ≈ t′θ ū is in G(N) ∪ GHRed_C(G(N)) for any ground arguments ū.

We observe that whenever tθ ū and t′θ ū are smaller than the ⪰-maximal term of Cθ for some arguments ū, we have

    I |= F(tθ ū) ≈ F(t′θ ū)   (∗)

To show this, we assume that tθ ū and t′θ ū are smaller than the ⪰-maximal term of Cθ, and we distinguish two cases: If tθ is smaller than the ⪰-maximal term of Cθ, all terms in D′θ are smaller than the ⪰-maximal term of Cθ and hence D′θ ∨ tθ ū ≈ t′θ ū ≺ Cθ. If, on the other hand, tθ is equal to the ⪰-maximal term of Cθ, then tθ ū and t′θ ū are smaller than tθ. Hence tθ ū ≈ t′θ ū ≺ tθ ≈ t′θ and D′θ ∨ tθ ū ≈ t′θ ū ≺ Dθ ≺ Cθ. In both cases, since D′θ is false in I, by the definition of I, we have (∗).

Next, we show the equivalence of Cθ and C″θ via rewriting with equations of the form (∗) where tθ ū and t′θ ū are smaller than the ⪰-maximal term of Cθ. Since x does not occur deeply in C, no occurrence of x in C is inside a λ-expression or inside an argument of an applied variable. Therefore, all occurrences of x in C are in a green subterm of the form x v̄ for some terms v̄ that do not contain x.
GHSel I ( G ( N )) , it suffices to show { D ∈ F ( G ( N )) | D ≺ F ( C θ ) } | = F ( concl ( ι )) . To this end, let I be an interpretation in GF such that I | = { D ∈ F ( G ( N )) | D ≺ F ( C θ ) } . We need to show that I | = F ( concl ( ι )) . If F ( D ′ θ ) is true in I , then obviously I | = F ( concl ( ι )) . So we assume that F ( D ′ θ ) is false in I . Since C θ ≻ D θ by the S UP orderconditions, it follows that I | = F ( t θ ≈ t ′ θ ) . Therefore, it suffices to show I | = F ( C θ ) .Let p be the position in s θ where ι takes place and p ′ be the longest prefix of p that isa green subterm of s . Let u = s | p ′ . Since Lemma 51 does not apply to ι , u is not a deeplyoccurring variable; if p = p ′ , the variable condition does not hold for D and C ; and if p = p ′ , u is a variable. This means either the position p does not exist in s , because it is below anunapplied variable that does not occur deeply in C , or s | p is an unapplied variable that doesnot occur deeply in C and for which the variable condition does not hold.C ASE
1: The position p does not exist in s because it is below a variable x that does not occur deeply in C. Then tθ is a green subterm of xθ and hence a green subterm of xθ w̄ for any arguments w̄. Let v be the term that we obtain by replacing tθ with t′θ in xθ at the relevant position. Since I ⊨ ℱ(tθ ≈ t′θ), by congruence, I ⊨ ℱ(xθ w̄ ≈ v w̄) for any arguments w̄. Hence, I ⊨ ℱ(Cθ) if and only if I ⊨ ℱ(C{x ↦ v}θ) by congruence. Here, it is crucial that the variable does not occur deeply in C because congruence does not hold in ℱ-encoded terms below λ-binders. By the inference conditions, we have tθ ≻ t′θ, which implies ℱ(Cθ) ≻ ℱ(C{x ↦ v}θ) by compatibility with green contexts. Therefore, by the assumption about I, we have I ⊨ ℱ(C{x ↦ v}θ) and hence I ⊨ ℱ(Cθ).

Case
2: The term s|ₚ is a variable x that does not occur deeply in C and for which the variable condition does not hold. From this, we know that Cθ ⪰ C″θ, where C″ = C{x ↦ t′}. We cannot have Cθ = C″θ because xθ = tθ ≠ t′θ and x occurs in C. Hence, we have Cθ ≻ C″θ. By the definition of I, Cθ ≻ C″θ implies I ⊨ ℱ(C″θ). We will use equalities that are true in I to rewrite ℱ(Cθ) into ℱ(C″θ), which implies I ⊨ ℱ(Cθ) by congruence.

By saturation, every ArgCong inference ι′ from D is in HRed_I(N)—i.e., 𝒢(concl(ι′)) ⊆ 𝒢(N) ∪ GHRed_C(𝒢(N)). Hence, D′θ ∨ tθ ū ≈ t′θ ū is in 𝒢(N) ∪ GHRed_C(𝒢(N)) for any ground arguments ū.

We observe that whenever tθ ū and t′θ ū are smaller than the ⪰-maximal term of Cθ for some arguments ū, we have

  I ⊨ ℱ(tθ ū) ≈ ℱ(t′θ ū)    (∗)

To show this, we assume that tθ ū and t′θ ū are smaller than the ⪰-maximal term of Cθ and we distinguish two cases: If tθ is smaller than the ⪰-maximal term of Cθ, all terms in D′θ are smaller than the ⪰-maximal term of Cθ and hence D′θ ∨ tθ ū ≈ t′θ ū ≺ Cθ. If, on the other hand, tθ is equal to the ⪰-maximal term of Cθ, then tθ ū and t′θ ū are smaller than tθ. Hence tθ ū ≈ t′θ ū ≺ tθ ≈ t′θ and D′θ ∨ tθ ū ≈ t′θ ū ≺ Dθ ≺ Cθ. In both cases, since D′θ is false in I, by the definition of I, we have (∗).

Next, we show the equivalence of Cθ and C″θ via rewriting with equations of the form (∗) where tθ ū and t′θ ū are smaller than the ⪰-maximal term of Cθ. Since x does not occur deeply in C, no occurrence of x in C is inside a λ-expression or inside an argument of an applied variable. Therefore, all occurrences of x in C are in a green subterm of the form x v̄ for some terms v̄ that do not contain x. Hence, every occurrence of x in C corresponds to a subterm ℱ((x v̄)θ) = ℱ(tθ v̄θ) in ℱ(Cθ) and to a subterm ℱ((x v̄){x ↦ t′}θ) = ℱ(t′θ v̄{x ↦ t′}θ) = ℱ(t′θ v̄θ) in ℱ(C″θ). These are the only positions where Cθ and C″θ differ.

To justify the necessary rewrite steps from ℱ(tθ v̄θ) into ℱ(t′θ v̄θ) using (∗), we must show that ℱ(tθ v̄θ) and ℱ(t′θ v̄θ) are smaller than the ⪰-maximal term in ℱ(Cθ) for the relevant v̄. If v̄ is the empty tuple, we do not need to show this because I ⊨ ℱ(tθ ≈ t′θ) follows from ℱ(Dθ)'s being true and ℱ(D′θ)'s being false. If v̄ is nonempty, it suffices to show that x v̄ is not a ⪰-maximal term in C. Then ℱ(tθ v̄θ) and ℱ(t′θ v̄θ), which correspond to the term x v̄ in C, cannot be ⪰-maximal in ℱ(Cθ) and ℱ(C″θ). Hence they must be smaller than the ⪰-maximal term in ℱ(Cθ) because they are subterms of ℱ(Cθ) and ℱ(C″θ) ≺ ℱ(Cθ), respectively.

To show that x v̄ is not a ⪰-maximal term in C, we make a case distinction on whether sθ ≈̇ s′θ is selected in Cθ or sθ is the ⪰-maximal term in Cθ. One of these must hold because sθ ≈̇ s′θ is ⪰-eligible in Cθ. If it is selected, by the selection restrictions, x cannot be the head of a ⪰-maximal term of C. If sθ is the ⪰-maximal term in Cθ, we can argue that x is a green subterm of s and, since x does not occur deeply, s cannot be of the form x v̄ for a nonempty v̄. This justifies the necessary rewrites between ℱ(Cθ) and ℱ(C″θ), and it follows that I ⊨ ℱ(Cθ). ⊓⊔

With these properties of our inference systems in place, Theorem 47 guarantees static and dynamic refutational completeness of
HInf w.r.t. HRed_I. However, this theorem gives us refutational completeness w.r.t. the Herbrand entailment ⊨_𝒢, defined as N ⊨_𝒢 N′ if 𝒢(N) ⊨ 𝒢(N′), whereas our semantics is Tarski entailment ⊨, defined as N ⊨ N′ if any model of N is a model of N′. To repair this mismatch, we use the following lemma, which can be proved along the lines of Lemma 4.16 of Bentkamp et al. [10], using Lemmas 18 and 19.

Lemma 53 For N ⊆ C_H, we have N ⊨_𝒢 ⊥ if and only if N ⊨ ⊥.

Theorem 54 (Static refutational completeness) The inference system HInf is statically refutationally complete w.r.t. (HRed_I, HRed_C). In other words, if N ⊆ C_H is a clause set saturated w.r.t. HInf and HRed_I, then we have N ⊨ ⊥ if and only if ⊥ ∈ N.

Proof We apply Theorem 47. By Theorem 43, GHInf^GHSel is statically refutationally complete for all GHSel ∈ 𝒢(HSel). By Lemmas 49, 51, and 52, for every saturated N ⊆ C_H, there exists a selection function GHSel ∈ 𝒢(HSel) such that all inferences ι ∈ GHInf^GHSel with prems(ι) ∈ 𝒢(N) either are 𝒢^GHSel-ground instances of HInf-inferences from N or belong to GHRed_I^GHSel(𝒢(N)).

Theorem 47 implies that if N ⊆ C_H is a clause set saturated w.r.t. HInf and HRed_I, then N ⊨_𝒢 ⊥ if and only if ⊥ ∈ N. By Lemma 53, this also holds for the Tarski entailment ⊨. That is, if N ⊆ C_H is a clause set saturated w.r.t. HInf and HRed_I, then N ⊨ ⊥ if and only if ⊥ ∈ N. ⊓⊔

From static refutational completeness, we can easily derive dynamic refutational completeness.
Theorem 55 (Dynamic refutational completeness)
The inference system HInf is dynamically refutationally complete w.r.t. (HRed_I, HRed_C), as defined in Definition 33.

Proof
By Theorem 17 of the saturation framework, this follows from Theorem 54 andLemma 53. ⊓⊔ The core calculus can be extended with various optional rules. Although these are not nec-essary for refutational completeness, they can allow the prover to find more direct proofs.Most of these rules are concerned with the areas covered by the F
LUID S UP rule and theextensionality axiom.Two of the optional rules below rely on the notion of “orange subterms.” Definition 56 A λ -term t is an orange subterm of a λ -term s if s = t ; or if s = f h ¯ τ i ¯ s and t isan orange subterm of s i for some i ; or if s = x ¯ s and t is an orange subterm of s i for some i ;or if s = ( λ x . u ) and t is an orange subterm of u .For example, in the term f ( g a ) ( y b ) ( λ x . h c ( g x )) , the orange subterms are all the greensubterms— a , g a , y b , λ x . h c ( g x ) and the whole term—and in addition b , c , x , g x , and h c ( g x ) . Following Convention 1, this notion is lifted to βη -equivalence classes via repre-sentatives in η -short β -normal form. We write t = s ¯ x n . u to indicate that u is an orangesubterm of t , where ¯ x n are the variables bound in the orange context around u , from outer-most to innermost. If n =
0, we simply write t = s u .Once a term s ¯ x n . u has been introduced, we write s ¯ x n . u ′ η to denote the samecontext with a different subterm u ′ at that position. The η subscript is a reminder that u ′ is notnecessarily an orange subterm of s ¯ x n . u ′ η due to potential applications of η -reduction.For example, if s x . g x x = h a ( λ x . g x x ) , then s x . f x η = h a ( λ x . f x ) = h a f .Demodulation, which destructively rewrites using an equality t ≈ t ′ , is available at greenpositions. In addition, a variant of demodulation rewrites in orange contexts: t ≈ t ′ C s ¯ x . t σ λ D EMOD E XT t ≈ t ′ C s ¯ x . t ′ σ η s ¯ x . t σ ≈ s ¯ x . t ′ σ η where the term t σ may refer to the bound variables ¯ x . The following side conditions apply:1. s ¯ x . t σ ↓ βη is a λ -expression or a term of the form y ¯ u n with n > s ¯ x . t σ ≻ s ¯ x . t ′ σ η ; 3. C s ¯ x . t σ ≻ s ¯ x . t σ ≈ s ¯ x . t ′ σ η Condition 3 ensures that the second premise is redundant w.r.t. the conclusions and maybe removed. The double bar indicates that the conclusions collectively make the premisesredundant and can replace them.The third conclusion, which is entailed by t ≈ t ′ and (E XT ), could be safely omitted if thecorresponding (E XT ) instance is smaller than the second premise. But in general, the thirdconclusion is necessary for the proof, and the variant of λ D EMOD E XT that omits it—let uscall it λ D EMOD —might not preserve refutational completeness. uperposition with Lambdas 33
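The traversal behind Definition 56, which descends into arguments of symbols, arguments of applied variables, and (unlike the green-subterm traversal) λ-bodies, can be sketched on a toy term representation. This is an illustrative sketch, not Zipperposition code; the App/Lam AST and helper names are assumptions, and type arguments and βη-equivalence classes are deliberately ignored.

```python
# Illustrative sketch of orange subterm enumeration (Definition 56) on a toy
# untyped AST. App(head, args) covers both f⟨τ̄⟩ s̄ and x s̄; Lam is a
# λ-expression. Type arguments and βη-equivalence are ignored here.
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class App:
    head: str                      # function symbol or variable
    args: Tuple["Term", ...] = ()

@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"

Term = Union[App, Lam]

def orange_subterms(s: Term):
    """Yield every orange subterm of s: s itself, the orange subterms of
    each argument, and (unlike for green subterms) the orange subterms of
    a λ-expression's body."""
    yield s
    if isinstance(s, App):
        for arg in s.args:
            yield from orange_subterms(arg)
    else:
        yield from orange_subterms(s.body)
```

On f (g a) (y b) (λx. h c (g x)), this enumeration yields exactly the ten orange subterms listed in the example above, including b, c, x, g x, and h c (g x), which are not green subterms.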
An instance of λDemodExt, where g z is rewritten to f z z under a λ-binder, follows:

  g x ≈ f x x    k (λz. h (g z)) ≈ c
  ═══════════════════════════════════════════════════════ λDemodExt
  g x ≈ f x x    k (λz. h (f z z)) ≈ c    (λz. h (g z)) ≈ (λz. h (f z z))

Lemma 57 λDemodExt is sound and preserves refutational completeness of the calculus.

Proof
Soundness of the first conclusion is obvious. Soundness of the second and third con-clusion follows from congruence and extensionality using the premises. Preservation ofcompleteness is justified by redundancy. Specifically, we justify the deletion of the sec-ond premise by showing that it is redundant w.r.t. the conclusions. By definition, it is re-dundant if for every ground instance
C s ¯ x . t σ θ ∈ G ( C s ¯ x . t σ ) , its encoding F ( C s ¯ x . t σ θ ) is entailed by F ( G ( N )) , where N are the conclusions of λ D EMOD E XT .The first conclusion cannot help us prove redundancy because s ¯ x . t σ θ ↓ βη might be a λ -expression and then F ( s ¯ x . t σ θ ) is a symbol that is unrelated to F ( t σθ ) . Instead, weuse the θ -instances of the last two conclusions. By Lemma 23, F ( C s ¯ x . t ′ σ η θ ) has F ( s ¯ x . t ′ σ η θ ) as a subterm. If this subterm is replaced by F ( s ¯ x . t σ θ ) , we obtain F ( C s ¯ x . t σ θ ) . Hence, the F -encodings of the θ -instances of the last two conclusionsentail the F -encoding of the θ -instance of the second premise by congruence. Due to theside condition that the second premise is larger than the second and third conclusion, bystability under grounding substitutions, the θ -instances of the last two conclusions must besmaller than the θ -instance of the second premise. Thus, the second premise is redundant. ⊓⊔ The next simplification rule can be used to prune arguments of applied variables if the ar-guments can be expressed as functions of the remaining arguments. For example, the clause C [ y a b ( f b a ) , y b d ( f d b )] , in which y occurs twice, can be simplified to C [ y ′ a b , y ′ b d ] .Here, for each occurrence of y , the third argument can be computed by applying f to thesecond and first arguments. The rule can also be used to remove the repeated arguments in y b b y a a , the static argument a in y a c y a b , and all four arguments in y a b z b d . Itis stated as C P RUNE A RG C σ where the following conditions apply:1. σ = { y λ ¯ x j . y ′ ¯ x j − } ; 2. y ′ is a fresh variable; 3. C ⊐ C σ ;4. the minimum number k of arguments passed to any occurrence of y in the clause C is atleast j ;5. there exists a term t containing no variables bound in the clause such that for all termsof the form y ¯ s k occurring in the clause we have s j = t ¯ s j − s j + . . . 
s k .Clauses with a static argument correspond to the case t : = ( λ ¯ x j − x j + . . . x k . u ) , where u is the static argument (containing no variables bound in t ) and j is its index in y ’s argumentlist. The repeated argument case corresponds to t : = ( λ ¯ x j − x j + . . . x k . x i ) , where i is theindex of the repeated argument’s mate. Lemma 58 P RUNE A RG issoundandpreservesrefutationalcompletenessofthecalculus. Proof
The rule is sound because it simply applies a substitution to C . It preserves complete-ness because the premise C is redundant w.r.t. the conclusion C σ . This is because the sets of ground instances of C and C σ are the same and C ⊐ C σ . Clearly C σ is an instance of C .We will show the inverse: that C is an instance of C σ . Let ρ = { y ′ λ ¯ x j − x j + . . . x k . y ¯ x j − ( t ¯ x j − x j + . . . x k ) x j + . . . x k } . We show C σρ = C . Consider an occurrence of y in C . By theside conditions, it will have the form y ¯ s k ¯ u , where s j = t ¯ s j − s j + . . . s k . Hence, ( y ¯ s k ) σρ =( y ′ ¯ s j − s j + . . . s k ) ρ = y ¯ s j − ( t ¯ s j − s j + . . . s k ) s j + . . . s k = y ¯ s k . Thus, C σρ = C . ⊓⊔ We designed an algorithm that efficiently computes the subterm u of the term t =( λ x . . . x j − x j + . . . x k . u ) occurring in the side conditions of P RUNE A RG . The algorithmis incomplete, but our tests suggest that it discovers most cases of prunable arguments thatoccur in practice. The algorithm works by maintaining a mapping of pairs ( y , i ) of functionalvariables y and indices i of their arguments to a set of candidate terms for u . For an occur-rence y ¯ s n of y and for an argument s j , the algorithm approximates this set by computing allpossible ways in which subterms of s j that are equal to any other s i can be replaced with thevariable x i corresponding to the i th argument of y . The candidate sets for all occurrences of y are then intersected. An arbitrary element of the final intersection is returned as the term u .For example, suppose that y a ( f a ) b and y z ( f z ) b are the only occurrences of y in theclause C . The initial mapping is { T H , T H , T H } . After computing the ways inwhich each argument can be expressed using the remaining ones for the first occurrence andintersecting the sets, we get {
1 ↦ {a}, 2 ↦ {f a, f x₁}, 3 ↦ {b}}, where x₁ represents y's first argument. Finally, after computing the corresponding sets for the second occurrence of y and intersecting them with the previous candidate sets, we get {1 ↦ ∅, 2 ↦ {f x₁}, 3 ↦ {b}}. The final mapping shows that we can remove the second argument, since it can be expressed as a function of the first argument: t = (λx₁ x₃. f x₁). We can also remove the third argument, since its value is fixed: t = (λx₁ x₂. b). An example where our procedure fails is the pair of occurrences y (λx. a) (f a) c and y (λx. b) (f b) d. PruneArg can be used to eliminate the second argument by taking t := (λx₁ x₃. f (x₁ x₃)), but our algorithm will not detect this.

Following the literature [34, 62], we provide a rule for negative extensionality:

  C′ ∨ s ≉ s′
  ─────────────────────────────────── NegExt
  C′ ∨ s (sk⟨ᾱ⟩ ȳ) ≉ s′ (sk⟨ᾱ⟩ ȳ)

where the following conditions apply:

1. sk is a fresh Skolem symbol;
2. s ≉ s′ is ≿-eligible in the premise;
3. ᾱ and ȳ are the type and term variables occurring free in the literal s ≉ s′.

Negative extensionality can be applied as an inference rule at any time or as a simplification rule during preprocessing of the initial problem. The rule uses Skolem terms sk ȳ rather than diff s s′ because they tend to be more compact.

Lemma 59 (NegExt's satisfiability preservation) Let N ⊆ C_H and let E be the conclusion of a NegExt inference from N. If N ∪ {(Ext)} is satisfiable, then N ∪ {(Ext), E} is satisfiable.

Proof
Let I be a model of N ∪ {(Ext)}. We need to construct a model of N ∪ {(Ext), E}. Since (Ext) holds in I, so does its instance s (diff s s′) ≉ s′ (diff s s′) ∨ s ≈ s′. We extend the model I to a model I′, interpreting sk such that I′ ⊨ sk⟨ᾱ⟩ ȳ ≈ diff s s′. The Skolem symbol sk takes the free type and term variables of s ≉ s′ as arguments, which include all the free variables of diff s s′, allowing us to extend I in this way.

By assumption, the premise C′ ∨ s ≉ s′ is true in I and hence in I′. Since the above instance of (Ext) holds in I, it also holds in I′. Hence, the conclusion C′ ∨ s (sk⟨ᾱₘ⟩ ȳₙ) ≉ s′ (sk⟨ᾱₘ⟩ ȳₙ) also holds, which can be seen by resolving the premise against the (Ext) instance and unfolding the defining equation of sk. ⊓⊔

One reason why the extensionality axiom is so prolific is that both sides of its maximal literal, y (diff y z) ≉ z (diff y z), are fluid. As a pragmatic alternative to the axiom, we introduce the "abstracting" rules AbsSup, AbsERes, and AbsEFact with the same premises as the core Sup, ERes, and EFact, respectively. We call these rules collectively Abs. Each new rule shares all the side conditions of the corresponding core rule except the one of the form σ ∈ CSU(s, t). Instead, it lets σ be the most general unifier of the types of s and t and adds this condition: Let v⟨s₁, …, sₙ⟩ = sσ and v⟨t₁, …, tₙ⟩ = tσ, where v is the largest common green context of sσ and tσ. If any sᵢ is of functional type and the core rule has conclusion Eσ, the new rule has conclusion Eσ ∨ s₁ ≉ t₁ ∨ ⋯ ∨ sₙ ≉ tₙ. The NegExt rule can then be applied to those literals sᵢ ≉ tᵢ whose sides have functional type. Essentially the same idea was proposed by Bhayat and Reger as unification with abstraction in the context of combinatory superposition [19, Sect. 3.1].
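The decomposition step underlying the Abs rules, splitting sσ and tσ into a largest common context v and the mismatched pairs sᵢ, tᵢ, can be sketched in a first-order flavor. This is an illustrative sketch under assumed conventions (tuples (symbol, arg₁, …, argₙ) for compound terms, strings for variables); a faithful green-context version would additionally stop at λ-binders and below applied variables.

```python
# Sketch of the Abs decomposition: compute the largest common context of two
# terms and collect the mismatched subterm pairs, which would become the
# extra negative literals s_i ≉ t_i. Toy first-order representation:
# a compound term is a tuple (symbol, arg_1, ..., arg_n); a variable is a str.

def decompose(s, t, pairs):
    """Return the common context of s and t, with '_' marking the holes,
    appending each mismatched pair (s_i, t_i) to `pairs`."""
    if s == t:
        return s
    if (not isinstance(s, str) and not isinstance(t, str)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(decompose(si, ti, pairs)
                               for si, ti in zip(s[1:], t[1:]))
    pairs.append((s, t))
    return "_"
```

For example, decomposing f (g a) b against f (g c) b yields the context f (g _) b and the single mismatched pair (a, c).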
The approach regrettably does not fully eliminate the need for axiom (Ext), as Visa Nummelin demonstrated via the following example.

Example 60
Consider the unsatisfiable clause set consisting of h x ≈ f x , k h ≈ k g , and k g k f , where k takes at most one argument and h ≻ g ≻ f . The only nonredundant A BS inference applicable is A BS ER ES on the third clause, resulting in g f . Applying E XT N EG further produces g sk f sk . The set consisting of all five clauses is saturated.A different approach is to instantiate the extensionality axiom with arbitrary terms s , s ′ of the same functional type: E XT I NST s ( diff s s ′ ) s ′ ( diff s s ′ ) ∨ s ≈ s ′ We would typically choose s , s ′ among the green subterms occurring in the current clauseset. Intuitively, if we think in terms of eligibility, E XT I NST demands s ( diff s s ′ ) ≈ s ′ ( diff s s ′ ) to be proved before s ≈ s ′ can be used. This can be advantageous because simplifying in-ferences (based on matching) will often be able to rewrite the applied terms s ( diff s s ′ ) and s ′ ( diff s s ′ ) . In contrast, A BS assume s ≈ s ′ and delay the proof obligation that s ( diff s s ′ ) ≈ s ′ ( diff s s ′ ) . This can create many long clauses, which will be subject to expensive generatinginferences (based on full unification).Superposition can be generalized to orange subterms as follows: D ′ ∨ t ≈ t ′ C ′ ∨ s ¯ x . u ˙ ≈ s ′ λ S UP ( D ′ ∨ C ′ ∨ s ¯ x . t ′ η ˙ ≈ s ′ ) σρ where the substitution ρ is defined as follows: Let P y = { y } for all type and term variables y ¯ x . For each i , let P x i be recursively defined as the union of all P y such that y occursfree in the λ -expression that binds x i in s ¯ x . u σ or that occurs free in the correspondingsubterm of s ¯ x . t ′ η σ . Then ρ is defined as { x i sk i h ¯ α i i ¯ y i for each i } , where ¯ y i are theterm variables in P x i and ¯ α i are the type variables in P x i and the type variables occurring inthe type of the λ -expression binding x i . In addition, S UP ’s side conditions and the followingconditions apply:10. ¯ x has length n >
0; 11. ¯ x σ = ¯ x ;12. the variables ¯ x do not occur in y σ for all variables y in u .The substitution ρ introduces Skolem terms to represent bound variables that wouldotherwise escape their binders. The rule can be justified in terms of paramodulation and extensionality, with the Skolem terms standing for diff terms. We can shorten the derivationof Example 17 by applying this rule to the clauses C div and C conj as follows: n ≈ zero ∨ div n n ≈ one prod K ( λ k . div ( succ k ) ( succ k )) one λ S UP succ sk ≈ zero ∨ prod K ( λ k . one ) one From this conclusion, ⊥ can be derived using only S UP and E Q R ES inferences. We thusavoid both F LUID S UP and (E XT ). Lemma 61 ( λ S UP ’s satisfiability preservation) Let N ⊆ C H andlet E betheconclusionofa λ S UP inference from N . If N ∪ { (E XT ) } issatisfiable,then N ∪ { (E XT ) , E } issatisfiable. Proof
Let I be a model of N ∪ { (E XT ) } . We need to construct a model of N ∪ { (E XT ) , E } . For each i , let v i be the λ -expression binding x i in the term s ¯ x . u σ in the rule. Let v ′ i be the variant of v i in which the relevant occurrence of u σ is replaced by t ′ σ . We define asubstitution π recursively by x i π = diff ( v i π ) ( v ′ i π ) for all i . This definition is well foundedbecause the variables x j with j ≥ i do not occur freely in v i and v ′ i . We extend the model I to a model I ′ , interpreting sk i such that I ′ | = sk i h ¯ α i i ¯ y i ≈ diff ( v i π ) ( v ′ i π ) for each i . Since thefree type and term variables of any x i π are necessarily contained in P x i , the arguments of sk i include the free variables of diff ( v i π ) ( v ′ i π ) , allowing us to extend I in this way.By assumption, the premises of the λ S UP inference are true in I and hence in I ′ . Weneed to show that the conclusion ( D ′ ∨ C ′ ∨ s ¯ x . t ′ η ˙ ≈ s ′ ) σρ is also true in I ′ . Let ξ bea valuation. If I ′ , ξ | = ( D ′ ∨ C ′ ) σρ , we are done. So we assume that D ′ σρ and C ′ σρ arefalse in I ′ under ξ . In the following, we omit ‘ I ′ , ξ | = ’, but all equations ( ≈ ) are meant tobe true in I ′ under ξ . Assuming D ′ σρ and C ′ σρ are false, we will show inductively that v i π ≈ v ′ i π for all i = k , . . .,
1. By this assumption, the premises imply that t σρ ≈ t ′ σρ and s ¯ x . u σρ ˙ ≈ s ′ σρ . Due to the way we constructed I ′ , we have w π ≈ w ρ for any term w .Hence, we have t σπ ≈ t ′ σπ . The terms v k π ( diff ( v k π ) ( v ′ k π )) and v ′ k π ( diff ( v k π ) ( v ′ k π )) arethe respective result of applying π to the body of the λ -expressions v k and v ′ k . Therefore, bycongruence, t σπ ≈ t ′ σπ and t σ = u σ imply that v k π ( diff ( v k π )( v ′ k π )) ≈ v ′ k π ( diff ( v k π )( v ′ k π )) . The extensionality axiom then implies v k π ≈ v ′ k π .It follows directly from the definition of π that for all i , v i π ( diff ( v i π )( v ′ i π )) = s i v i + π and v ′ i π ( diff ( v i π ) ( v ′ i π )) = s i v ′ i + π for some context s i . The subterms v i + π of s i v i + π and v ′ i + π of s i v ′ i + π may be below applied variables but not below λ s. Sincesubstitutions avoid capture, in v i and v ′ i , π only substitutes x j with j < i , but in v i + and v ′ i + ,it substitutes all x j with j ≤ i . By an induction using these equations, congruence, and theextensionality axiom, we can derive from v k π ≈ v ′ k π that v π ≈ v ′ π. Since I ′ | = w π ≈ w ρ forany term w , we have v ρ ≈ v ′ ρ. By congruence, it follows that s ¯ x . u σρ ≈ s ¯ x . t ′ η σρ. With s ¯ x . u σρ ˙ ≈ s ′ σρ, it follows that ( s ¯ x . t ′ η ˙ ≈ s ′ ) σρ. Hence, the conclusion of the λ S UP inference is true in I ′ . ⊓⊔ The next rule, duplicating flex subterm superposition , is a lightweight alternative toF
LUID S UP : D ′ ∨ t ≈ t ′ C ′ ∨ s y ¯ u n ˙ ≈ s ′ D UP S UP ( D ′ ∨ C ′ ∨ s z ¯ u n t ′ ˙ ≈ s ′ ) ρσ where n > ρ = { y λ ¯ x n . z ¯ x n ( w ¯ x n ) } , and σ ∈ CSU ( t , w ( ¯ u n ρ )) for fresh variables w , z .The order and eligibility restrictions are as for S UP . The rule can be understood as thecomposition of an inference that applies the substitution ρ and of a paramodulation inferenceinto the subterm w ( ¯ u n ρ ) of s z ( ¯ u n ρ ) ( w ( ¯ u n ρ )) . D UP S UP is general enough to replace uperposition with Lambdas 37 F LUID S UP in Examples 13 and 14 but not in Example 15. On the other hand, F LUID S UP ’sunification problem is usually a flex–flex pair, whereas D UP S UP yields a less explosiveflex–rigid pair unless t is variable-headed.The last rule, flex subterm superposition , is an even more lightweight alternative toF LUID S UP : D ′ ∨ t ≈ t ′ C ′ ∨ s y ¯ u n ˙ ≈ s ′ F LEX S UP ( D ′ ∨ C ′ ∨ s t ′ ˙ ≈ s ′ ) σ where n > σ ∈ CSU ( t , y ¯ u n ) . The order and eligibility restrictions are as for S UP . Zipperposition [27,28] is an open source superposition prover written in OCaml. Originallydesigned for polymorphic first-order logic (TF1 [21]), it was later extended with an incom-plete higher-order mode based on pattern unification [53]. Bentkamp et al. [12] extendedit further with a complete λ -free clausal higher-order mode. We have now implemented aclausal higher-order mode based on our calculus. We use the order ≻ λ (Sect. 3.6) derivedfrom the Knuth–Bendix order [45] and the lexicographic path order [43]. We currently usethe corresponding nonstrict order (cid:23) λ as % .Except for F LUID S UP , the core calculus rules already existed in Zipperposition in asimilar form. To improve efficiency, we extended the prover to use a higher-order general-ization [66] of fingerprint indices [58] to find inference partners for all new binary inferencerules. 
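Fingerprint indexing prunes unification candidates by comparing terms at a few fixed sample positions before attempting full unification. The following first-order sketch conveys the idea behind the higher-order generalization used here; the term representation, the sampled positions, and all helper names are illustrative assumptions, not Zipperposition's actual data structures.

```python
# First-order sketch of fingerprint-style filtering (after Schulz's
# fingerprint indexing). A compound term is a tuple (symbol, arg_1, ...);
# a variable is a str. Symbols are assumed not to be named 'A', 'B', or 'N'.

POSITIONS = [(), (1,), (2,), (1, 1)]  # illustrative sample positions

def feature(t, pos):
    """Feature of t at pos: a head symbol, 'A' (variable at pos),
    'B' (pos lies below a variable), or 'N' (pos does not exist)."""
    for i in pos:
        if isinstance(t, str):
            return "B"
        if i >= len(t):
            return "N"
        t = t[i]
    return "A" if isinstance(t, str) else t[0]

def fingerprint(t):
    return tuple(feature(t, p) for p in POSITIONS)

def clash(x, y):
    """True if two features rule out unifiability."""
    if x == y:
        return False
    if "N" in (x, y):
        return {x, y} != {"N", "B"}   # 'N' is only compatible with 'B'
    if "A" in (x, y) or "B" in (x, y):
        return False                  # variables fit anything that exists
    return True                       # two distinct head symbols clash

def may_unify(s, t):
    """Over-approximation: False means s and t certainly do not unify."""
    return all(not clash(x, y)
               for x, y in zip(fingerprint(s), fingerprint(t)))
```

For instance, f (a, X) passes the filter against f (a, b) but is rejected against g (X) and f (b, b). Like the real index, this is only a sound filter: it may admit pairs that ultimately fail to unify, but it never rejects a unifiable pair.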
To speed up the computation of the S UP conditions, we omit the condition C σ - D σ in the implementation, at the cost of performing some additional inferences. Among theoptional rules, we implemented λ D EMOD , P
RUNE A RG , N EG E XT , A BS , E XT I NST , λ S UP ,D UP S UP , and F LEX S UP . For λ D EMOD and λ S UP , demodulation, subsumption, and otherstandard simplification rules (as implemented in E [59]), we use pattern unification. Forgenerating inference rules that require enumerations of complete sets of unifiers, we use thecomplete procedure of Vukmirovi´c et al. [66]. It has better termination behavior, producesfewer redundant unifiers, and can be implemented more efficiently than procedures suchas Jensen and Pietrzykowski’s [38] and Snyder and Gallier’s [61]. The set of fluid terms isoverapproximated in the implementation by the set of terms that are either nonground λ -expressions or terms of the form y ¯ u n with n >
0. To efficiently retrieve candidates for A BS inferences without slowing down superposition term indexing structures, we implementeddedicated indexing for clauses that are eligible for A BS inferences [68, Sect. 3.3].Zipperposition implements a DISCOUNT-style given clause procedure [5]. The proofstate is represented by a set A of active clauses and a set P of passive clauses. To interleavenonterminating unification with other computation, we added a set T containing possiblyinfinite sequences of scheduled inferences. These sequences are stored as finite instructionsof how to compute the inferences. Initially, all clauses are in P . At each iteration of the mainloop, the prover heuristically selects a given clause C from P . If P is empty, sequences from T are evaluated to generate more clauses into P ; if no clause can be produced in this way, A is saturated and the prover stops. Assuming a given clause C could be selected, it is firstsimplified using A . Clauses in A are then simplified w.r.t. C , and any simplified clause ismoved to P . Then C is added to A and all sequences representing nonredundant inferences https://github.com/sneeuwballen/zipperposition between C and A are added to T . This maintains the invariant that all nonredundant infer-ences between clauses in A have been scheduled or performed. Then some of the scheduledinferences in T are performed and the conclusions are put into P .We can view the above loop as an instance of the abstract Zipperposition loop prover ZL of Waldmann et al. [69, Example 34]. 
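The loop just described, with a passive set P, an active set A, and a set T of lazily evaluated, possibly nonterminating inference streams, can be sketched as follows. This is a heavily simplified illustration, not Zipperposition's actual loop: simplification is stubbed out, fairness details are elided, and a stream may yield None to model the unification procedure handing back control before a conclusion is found.

```python
# Simplified sketch of the given-clause loop with scheduled inference
# streams (the set T). Streams are Python generators; yielding None models
# the unification procedure returning an empty set to hand back control.
_DONE = object()

def saturate(initial, infer_stream, select=min):
    A, P, T = set(), set(initial), []
    while P or T:
        if P:
            given = select(P)       # heuristic clause selection (stubbed)
            P.remove(given)
            # forward/backward simplification w.r.t. A would happen here
            A.add(given)
            for other in list(A):   # schedule inferences of given with A
                T.append(infer_stream(given, other))
        else:
            # evaluate every scheduled stream one step
            for stream in list(T):
                item = next(stream, _DONE)
                if item is _DONE:
                    T.remove(stream)            # stream exhausted
                elif item is not None and item not in A:
                    P.add(item)                 # None: no conclusion yet
    return A

def bounded_sums(c, d):
    """Toy 'inference' for demonstration: first yield None (conclusion not
    found yet), then the conclusion c + d if it is within a fixed bound."""
    yield None
    if c + d <= 20:
        yield c + d
```

With clauses standing in as numbers, saturate({3, 5}, bounded_sums) closes the set under bounded pairwise addition, illustrating how conclusions produced by scheduled streams flow back into P.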
Their Theorem 32 allows us to obtain dynamiccompleteness for this prover architecture from our static completeness result (Theorem 54).This requires that the sequences in T are visited fairly, that clauses in P are chosen fairly,and that simplification terminates, all of which are guaranteed by our implementation.The unification procedure we use returns a sequence of either singleton sets containingthe unifier or an empty set signaling that a unifier is still not found. Empty sets are returnedto give back control to the caller of unification procedure and avoid getting stuck on nonter-minating problems. These sequences of unifier subsingletons are converted into sequencescontaining subsingletons of clauses representing inference conclusions. We evaluated our prototype implementation of the calculus in Zipperposition with otherhigher-order provers and with Zipperposition’s modes for less expressive logics. All of theexperiments were performed on StarExec nodes equipped with Intel Xeon E5-2609 0 CPUsclocked at 2.40 GHz. Following CASC 2019, we use 180 s as the CPU time limit.We used both standard TPTP benchmarks [63] and Sledgehammer-generated bench-marks [52]. From the TPTP, version 7.2.0, we used 1000 randomly selected first-order(FO) problems in CNF, FOF, or TFF syntax without arithmetic and all 499 monomorphichigher-order theorems in TH0 syntax without interpreted Booleans and arithmetic. We par-titioned the TH0 problems into those containing no λ -expressions (TH0 λ f, 452 problems)and those containing λ -expressions (TH0 λ , 47 problems). The Sledgehammer benchmarks,corresponding to Isabelle’s Judgment Day suite [23], were regenerated to target clausalhigher-order logic. They comprise 2506 problems, divided in two groups: SH- λ preserves λ -expressions, whereas SH-ll encodes them as λ -lifted supercombinators [52] to make theproblems accessible to λ -free clausal higher-order provers. 
Each group of problems is gen-erated from 256 Isabelle facts (definitions and lemmas). Our results are publicly available. Evaluation of Extensions
To assess the usefulness of the extensions described in Sect. 5, we fixed a base configuration of Zipperposition parameters. For each extension, we then changed the corresponding parameters and observed the effect on the success rate. The base configuration uses the complete variant of the unification procedure of Vukmirović et al. [66]. It also includes the optional rules NegExt and PruneArg, substitutes FlexSup for the highly explosive FluidSup, and excludes axiom (Ext). The base configuration is not refutationally complete.

The rules NegExt (NE) and PruneArg (PA) were added to the base configuration because our informal experiments showed that they usually help. Fig. 1 confirms this, although the effect is small. In all tables, +R denotes the inclusion of a rule R not present in the base, and −R denotes the exclusion of a rule R present in the base. Numbers given in parentheses denote the number of problems that are solved only by the given configuration and no other configuration in the same table.

CASC 2019: http://tptp.cs.miami.edu/CASC/27/
Evaluation data: https://doi.org/10.5281/zenodo.4032969

        −NE,−PA   −NE       −PA       Base
TH0     446 (0)   446 (0)   447 (0)   447 (0)
SH-λ    431 (0)   433 (0)   433 (0)   436 (1)

Fig. 1 Number of problems proved without rules included in the base configuration

        Base      +λD       +λS0      +λS1      +λS2      +λS4      +λS8      +λS1024
TH0     447 (0)   448 (0)   449 (0)   449 (0)   449 (0)   449 (0)   449 (0)   449 (0)
SH-λ    436 (1)   435 (4)   430 (1)   429 (0)   429 (0)   429 (0)   429 (0)   429 (0)

Fig. 2 Number of problems proved using rules that perform rewriting under λ-binders

        Base      +Abs      +ExtInst  +(Ext)
TH0     447 (0)   450 (1)   450 (1)   376 (0)
SH-λ    436 (11)  430 (11)  402 (1)   365 (2)

Fig. 3 Number of problems proved using rules that perform extensionality reasoning

        −FlexSup  Base      −FlexSup,+DupSup   −FlexSup,+FluidSup
TH0     446 (0)   447 (0)   448 (1)            447 (0)
SH-λ    469 (10)  436 (4)   451 (3)            461 (7)

Fig. 4 Number of problems proved with rules that perform superposition into fluid terms
The rules λDemod (λD) and λSup extend the calculus to perform some rewriting under λ-binders. While experimenting with the calculus, we noticed that, in some configurations, λSup performs better when the total number of fresh Skolem symbols it introduces is bounded by some parameter n. As Fig. 2 shows, the inclusion of these rules has a different effect on the two benchmark sets. Different choices of n for λSup (denoted by λSn) do not seem to influence the success rate much.

The evaluation of the Abs and ExtInst rules and axiom (Ext), presented in Fig. 3, confirms our intuition that including the extensionality axiom is severely detrimental to performance. The +(Ext) configuration solved two unique problems on SH-λ benchmarks, but its success on these problems appears to be due to a coincidental influence of the axiom on heuristics: the axiom is not referenced in the generated proofs.

The FlexSup rule included in the base configuration did not perform as well as we expected. Even the FluidSup and DupSup rules outperformed FlexSup, as shown in Fig. 4. This effect is especially visible on SH-λ benchmarks. On TPTP, the differences are negligible.

Most of the extensions had a stronger effect on SH-λ than on TH0. A possible explanation is that the Boolean-free TH0 benchmark subset consists mostly of problems that are simple to solve with most prover parameters, whereas the SH-λ benchmarks are of varying difficulty and can thus benefit more from changes to the prover parameters.

Main Evaluation
We selected all contenders in the THF division of CASC 2019 as representatives of the state of the art: CVC4 1.8 prerelease [9], Leo-III 1.4 [62], Satallax 3.4 [24], and Vampire 4.4 [18]. We also included Ehoh [67], the λ-free clausal higher-order mode of E 2.4. Leo-III and Satallax are cooperative higher-order provers that can be set up to regularly invoke first-order provers as terminal proof procedures. To assess the performance of their core calculi, we evaluated them with first-order backends disabled. We denote these "uncooperative" configurations by Leo-III-uncoop and Satallax-uncoop, respectively, as opposed to the standard versions Leo-III-coop and Satallax-coop.

To evaluate the overhead our calculus incurs on first-order or λ-free higher-order problems, we ran Zipperposition in first-order (FOZip) and λ-free (λfreeZip) modes, as well as in a mode that encodes curried applications using a distinguished binary symbol @ before invoking first-order Zipperposition (@+FOZip). We evaluated the implementation of our calculus in Zipperposition (λZip) in three configurations: base, pragmatic, and full. Pragmatic builds on base by disabling FlexSup and replacing complete unification with the pragmatic variant pv of the procedure of Vukmirović et al. Full is a refutationally complete extension of base that substitutes FluidSup for FlexSup and includes axiom (Ext). Finally, we evaluated Zipperposition in a portfolio mode that runs the prover in various configurations (Zip-uncoop). We also evaluated a cooperative version of the portfolio that, in some configurations, invokes Ehoh as a backend on higher-order problems after a predefined time (Zip-coop). In this version, Zipperposition encodes heuristically selected clauses from the current proof state into the λ-free higher-order logic supported by Ehoh [67]. On first-order problems, we ran Ehoh, Vampire, and Zip-uncoop in the provers' respective first-order modes.

A summary of these experiments is presented in Fig. 5. In the pragmatic configuration, our calculus outperformed λfreeZip on TH0λf problems and incurred less than 1% overhead compared with FOZip, but fell behind λfreeZip on SH-ll problems. The full configuration suffers greatly from the explosive extensionality axiom and FluidSup rule.

Except on TH0λ problems, both the base and pragmatic configurations outperformed Leo-III-uncoop, which runs a fixed configuration, by a substantial margin. Zip-uncoop outperformed Satallax-uncoop, which uses a portfolio. Our most competitive configuration, Zip-coop, emerges as the winner on both problem sets containing λ-expressions. On higher-order TPTP benchmarks, this configuration does not solve any problems that no other (cooperative) higher-order prover solves. By contrast, on SH-ll benchmarks Zip-coop solves 21 problems no other higher-order prover solves, and on SH-λ benchmarks, it uniquely solves 27 problems.

                 FO    TH0λf  TH0λ  SH-ll  SH-λ
CVC4             539   424    31    696    650
Ehoh             681   418    –     691    –
Leo-III-uncoop   198   389    42    226    234
Leo-III-coop     582          43    683    674
Satallax-uncoop  –     398    43    489    507
Satallax-coop    –     432    43    602    616
Vampire                432    42
λfreeZip         395   398    –     538    –
λZip-base        388   408    39    420    436
λZip-pragmatic   396   411    33    496    503
λZip-full        177   339    34    353    361
Zip-uncoop       514   426          661    677
Zip-coop         625   434

Fig. 5  Number of problems proved by the different provers
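The @ encoding used by @+FOZip is the standard applicative encoding of curried higher-order terms into first-order terms. A minimal sketch of the idea, with our own illustrative representation (constants and variables as strings, an application f a as the pair (f, a)):

```python
def applicative(t):
    """Encode a curried application as a first-order term over a single
    distinguished binary symbol @."""
    if isinstance(t, str):
        return t            # constant or variable
    f, a = t                # application f a
    return "@({}, {})".format(applicative(f), applicative(a))

# The curried term f x y, i.e. (f x) y, becomes @(@(f, x), y):
print(applicative((("f", "x"), "y")))  # prints @(@(f, x), y)
```

This makes curried problems syntactically first-order, at the price of deeper terms and weaker term indexing, which is consistent with @+FOZip trailing FOZip in Fig. 5.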
Related Work

Bentkamp et al. [12] introduced four calculi for λ-free clausal higher-order logic organized along two axes: intensional versus extensional, and nonpurifying versus purifying. The purifying calculi flatten the clauses containing applied variables, thereby eliminating the need for superposition into variables. As we extended their work to support λ-expressions, we found the purification approach problematic and abandoned it: it needs x to be smaller than x t, which is impossible to achieve with a term order on βη-equivalence classes. We also quickly gave up our attempt at supporting intensional higher-order logic. Extensionality is the norm for higher-order unification [30] and is mandated by the TPTP THF format [64] and by proof assistants such as HOL4, HOL Light, Isabelle/HOL, Lean, Nuprl, and PVS.

Bentkamp et al. viewed their approach as "a stepping stone towards full higher-order logic." It already included a notion analogous to green subterms and an ArgCong rule, which help cope with the complications occasioned by β-reduction.

Our Boolean-free λ-superposition calculus joins the family of proof systems for higher-order logic. It is related to Andrews's higher-order resolution [1], Huet's constrained resolution [36], Jensen and Pietrzykowski's ω-resolution [38], Snyder's higher-order E-resolution [60], Benzmüller and Kohlhase's extensional higher-order resolution [14], Benzmüller's higher-order unordered paramodulation and RUE resolution [13], and Bhayat and Reger's combinatory superposition [19]. A noteworthy variant of higher-order unordered paramodulation is Steen and Benzmüller's higher-order ordered paramodulation [62], whose order restrictions undermine refutational completeness but yield better empirical results. Other approaches are based on analytic tableaux [8, 46, 47, 55], connections [2], sequents [50], and satisfiability modulo theories (SMT) [9].
Andrews [3] and Benzmüller and Miller [15] provide excellent surveys of higher-order automation.

Combinatory superposition was developed shortly after λ-superposition and is closely related. It is modeled on the intensional nonpurifying calculus by Bentkamp et al. and targets extensional polymorphic clausal higher-order logic. Both combinatory and λ-superposition gracefully generalize the highly successful first-order superposition rules without sacrificing refutational completeness, and both are equipped with a redundancy criterion, which earlier refutationally complete higher-order calculi lack. In particular, PruneArg is a versatile simplification rule that could be useful in other provers. Combinatory superposition's distinguishing feature is that it uses SKBCI combinators to represent λ-expressions. Combinators can be implemented more easily starting from a first-order prover; β-reduction amounts to demodulation. However, according to its developers [19], "Narrowing terms with combinator axioms is still explosive and results in redundant clauses. It is also never likely to be competitive with higher-order unification in finding complex unifiers." Among the drawbacks of λ-superposition are the need to solve flex–flex pairs eagerly and the explosion caused by the extensionality axiom. We believe that this is a reasonable trade-off, especially for large problems with a substantial first-order component.

Our prototype Zipperposition joins the league of automatic theorem provers for higher-order logic. We list some of its rivals. TPS [4] is based on the connection method and expansion proofs. LEO [14] and LEO-II [17] implement variants of RUE resolution. Leo-III [62] is based on higher-order paramodulation. Satallax [24] implements a higher-order tableau calculus guided by a SAT solver. LEO-II, Leo-III, and Satallax integrate first-order provers as terminal procedures. AgsyHOL [50] is based on a focused sequent calculus guided by narrowing.
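As an aside, the combinator-based representation of λ-expressions can be illustrated by classic bracket abstraction. The sketch below is our own minimal illustration using only S, K, and I (practical encodings, including combinatory superposition's, also use B and C to keep terms small); terms are strings for constants and variables, ("app", f, a) for applications, and ("lam", x, b) for λ-abstractions. The `reduce_ski` function shows why β-reduction amounts to demodulation: it is plain first-order rewriting with the S, K, and I equations.

```python
S, K, I = "S", "K", "I"  # combinator constants; avoid these names for variables

def free_in(x, t):
    """Does variable x occur free in term t?"""
    if isinstance(t, str):
        return t == x
    if t[0] == "lam":
        return t[1] != x and free_in(x, t[2])
    return free_in(x, t[1]) or free_in(x, t[2])

def abstract(x, t):
    """Bracket abstraction: a combinator term behaving like (lambda x. t).
    Assumes t contains no lambdas (translate eliminates them bottom-up)."""
    if t == x:
        return I
    if not free_in(x, t):
        return ("app", K, t)
    _, f, a = t  # t must be an application with x free
    return ("app", ("app", S, abstract(x, f)), abstract(x, a))

def translate(t):
    """Eliminate all lambda-abstractions, innermost first."""
    if isinstance(t, str):
        return t
    if t[0] == "lam":
        return abstract(t[1], translate(t[2]))
    return ("app", translate(t[1]), translate(t[2]))

def reduce_ski(t):
    """Normalize S/K/I redexes by first-order rewriting:
    I x = x,  K x y = x,  S f g x = f x (g x)."""
    head, args = t, []
    while isinstance(head, tuple) and head[0] == "app":
        args.insert(0, head[2])
        head = head[1]
    if head == I and len(args) >= 1:
        head, args = args[0], args[1:]
    elif head == K and len(args) >= 2:
        head, args = args[0], args[2:]
    elif head == S and len(args) >= 3:
        f, g, a = args[:3]
        head, args = ("app", ("app", f, a), ("app", g, a)), args[3:]
    else:
        for a in map(reduce_ski, args):  # atomic head: normalize arguments
            head = ("app", head, a)
        return head
    for a in args:                       # a redex fired: rebuild and continue
        head = ("app", head, a)
    return reduce_ski(head)
```

For example, translating λx. λy. x yields S (K K) I, and rewriting its application to two constants a and b returns a, mirroring β-reduction by demodulation.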
The SMT solvers CVC4 and veriT have recently been extended to higher-order logic [9]. Vampire now implements both combinatory superposition and a version of standard superposition with first-order unification replaced by restricted combinatory unification [18].

Half a century ago, Robinson [56] proposed to reduce higher-order logic to first-order logic via a translation. "Hammer" tools such as Sledgehammer [54], MizAR [65], HOLyHammer [42], and CoqHammer [29] have since popularized this approach in proof assistants. The translation must eliminate the λ-expressions, typically using SKBCI combinators or λ-lifting [52], and encode typing information [20].

Conclusion

We presented the Boolean-free λ-superposition calculus, which targets a clausal fragment of extensional polymorphic higher-order logic. With the exception of a functional extensionality axiom, it gracefully generalizes standard superposition. Our prototype prover Zipperposition shows promising results on TPTP and Isabelle benchmarks. In future work, we plan to pursue five main avenues of investigation.

We first plan to extend the calculus to support Booleans and Hilbert choice. Booleans are notoriously explosive. We want to experiment with both axiomatizations and native support in the calculus. Native support would likely take the form of a primitive substitution rule that enumerates predicate instantiations [2], delayed clausification rules [32], and rules for reasoning about Hilbert choice.

We want to investigate techniques to curb the explosion caused by functional extensionality.
The extensionality axiom reintroduces the search space explosion that the calculus's order restrictions aim at avoiding. Perhaps we can replace it by more restricted inference rules without compromising refutational completeness.

We will also look into approaches to curb the explosion caused by higher-order unification.
Our calculus suffers from the need to solve flex–flex pairs. Existing procedures [38, 61, 67] enumerate redundant unifiers. This can probably be avoided to some extent. It could also be useful to investigate unification procedures that delay imitation/projection choices via special schematic variables, inspired by Libal's representation of regular unifiers [49].

We clearly need to fine-tune and develop heuristics.
We expect heuristics to be a fruitful area for future research in higher-order reasoning. Proof assistants are an inexhaustible source of easy-looking benchmarks that are beyond the power of today's provers. Whereas "hard higher-order" may remain forever out of reach, we believe that there is a substantial "easy higher-order" fragment that awaits automation.

Finally, we plan to implement the calculus in a state-of-the-art prover.
A suitable basis for an optimized implementation of the calculus would be Ehoh, the λ-free clausal higher-order version of E developed by Vukmirović, Blanchette, Cruanes, and Schulz [67].

Acknowledgment
Simon Cruanes patiently explained Zipperposition's internals and allowed us to continue the development of his prover. Christoph Benzmüller and Alexander Steen shared insights and examples with us, guiding us through the literature and clarifying how the Leos work. Maria Paola Bonacina and Nicolas Peltier gave us some ideas on how to treat the extensionality axiom as a theory axiom, ideas we have yet to explore. Mathias Fleury helped us set up regression tests for Zipperposition. Ahmed Bhayat, Tomer Libal, and Enrico Tassi shared their insights on higher-order unification. Andrei Popescu and Dmitriy Traytel explained the terminology surrounding the λ-calculus. Haniel Barbosa, Daniel El Ouraoui, Pascal Fontaine, Visa Nummelin, and Hans-Jörg Schurr were involved in many stimulating discussions. Christoph Weidenbach made this collaboration possible. Ahmed Bhayat, Wan Fokkink, Mark Summerfield, and the anonymous reviewers suggested several textual improvements. The maintainers of StarExec let us use their service for the evaluation. We thank them all.

Bentkamp, Blanchette, and Vukmirović's research has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement No. 713999, Matryoshka). Bentkamp and Blanchette also benefited from the Netherlands Organization for Scientific Research (NWO) Incidental Financial Support scheme. Blanchette has received funding from the NWO under the Vidi program (project No. 016.Vidi.189.037, Lean Forward).

References
1. Andrews, P.B.: Resolution in type theory. J. Symb. Log. (3), 414–432 (1971)
2. Andrews, P.B.: On connections and higher-order logic. J. Autom. Reason. (3), 257–291 (1989)
3. Andrews, P.B.: Classical type theory. In: J.A. Robinson, A. Voronkov (eds.) Handbook of Automated Reasoning, vol. II, pp. 965–1007. Elsevier and MIT Press (2001)
4. Andrews, P.B., Bishop, M., Issar, S., Nesmith, D., Pfenning, F., Xi, H.: TPS: A theorem-proving system for classical type theory. J. Autom. Reason. (3), 321–353 (1996)
5. Avenhaus, J., Denzinger, J., Fuchs, M.: DISCOUNT: A system for distributed equational deduction. In: J. Hsiang (ed.) RTA-95, LNCS, vol. 914, pp. 397–402. Springer (1995)
6. Bachmair, L., Ganzinger, H.: Rewrite-based equational theorem proving with selection and simplification. J. Log. Comput. (3), 217–247 (1994)
7. Bachmair, L., Ganzinger, H.: Resolution theorem proving. In: J.A. Robinson, A. Voronkov (eds.) Handbook of Automated Reasoning, vol. I, pp. 19–99. Elsevier and MIT Press (2001)
8. Backes, J., Brown, C.E.: Analytic tableaux for higher-order logic with choice. J. Autom. Reason. (4), 451–479 (2011)
9. Barbosa, H., Reynolds, A., Ouraoui, D.E., Tinelli, C., Barrett, C.W.: Extending SMT solvers to higher-order logic. In: P. Fontaine (ed.) CADE-27, LNCS, vol. 11716, pp. 35–54. Springer (2019)
10. Bentkamp, A., Blanchette, J., Cruanes, S., Waldmann, U.: Superposition for lambda-free higher-order logic. arXiv preprint arXiv:2005.02094v1 (2020). https://arxiv.org/abs/2005.02094v1
11. Bentkamp, A., Blanchette, J., Tourret, S., Vukmirović, P., Waldmann, U.: Superposition with lambdas. In: P. Fontaine (ed.) CADE-27, LNCS, vol. 11716, pp. 55–73. Springer (2019)
12. Bentkamp, A., Blanchette, J.C., Cruanes, S., Waldmann, U.: Superposition for lambda-free higher-order logic. In: D. Galmiche, S. Schulz, R. Sebastiani (eds.) IJCAR 2018, LNCS, vol. 10900, pp. 28–46. Springer (2018)
13. Benzmüller, C.: Extensional higher-order paramodulation and RUE-resolution. In: H. Ganzinger (ed.) CADE-16, LNCS, vol. 1632, pp. 399–413. Springer (1999)
14. Benzmüller, C., Kohlhase, M.: Extensional higher-order resolution. In: C. Kirchner, H. Kirchner (eds.) CADE-15, LNCS, vol. 1421, pp. 56–71. Springer (1998)
15. Benzmüller, C., Miller, D.: Automation of higher-order logic. In: J.H. Siekmann (ed.) Computational Logic, Handbook of the History of Logic, vol. 9, pp. 215–254. Elsevier (2014)
16. Benzmüller, C., Paulson, L.C.: Multimodal and intuitionistic logics in simple type theory. Log. J. IGPL (6), 881–892 (2010)
17. Benzmüller, C., Sultana, N., Paulson, L.C., Theiss, F.: The higher-order prover LEO-II. J. Autom. Reason. (4), 389–404 (2015)
18. Bhayat, A., Reger, G.: Restricted combinatory unification. In: P. Fontaine (ed.) CADE-27, LNCS, vol. 11716, pp. 74–93. Springer (2019)
19. Bhayat, A., Reger, G.: A combinator-based superposition calculus for higher-order logic. In: N. Peltier, V. Sofronie-Stokkermans (eds.) IJCAR 2020, Part I, LNCS, vol. 12166, pp. 278–296. Springer (2020)
20. Blanchette, J.C., Böhme, S., Popescu, A., Smallbone, N.: Encoding monomorphic and polymorphic types. Log. Meth. Comput. Sci. (4) (2016)
21. Blanchette, J.C., Paskevich, A.: TFF1: The TPTP typed first-order form with rank-1 polymorphism. In: M.P. Bonacina (ed.) CADE-24, LNCS, vol. 7898, pp. 414–420. Springer (2013)
22. Blanqui, F., Jouannaud, J.P., Rubio, A.: The computability path ordering. Log. Meth. Comput. Sci. (4) (2015)
23. Böhme, S., Nipkow, T.: Sledgehammer: Judgement Day. In: J. Giesl, R. Hähnle (eds.) IJCAR 2010, LNCS, vol. 6173, pp. 107–121. Springer (2010)
24. Brown, C.E.: Satallax: An automatic higher-order prover. In: B. Gramlich, D. Miller, U. Sattler (eds.) IJCAR 2012, LNCS, vol. 7364, pp. 111–117. Springer (2012)
25. de Bruijn, N.G.: Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church–Rosser theorem. Indag. Math. (5), 381–392 (1972)
26. Cervesato, I., Pfenning, F.: A linear spine calculus. J. Log. Comput. (5), 639–688 (2003)
27. Cruanes, S.: Extending superposition with integer arithmetic, structural induction, and beyond. Ph.D. thesis, École polytechnique (2015)
28. Cruanes, S.: Superposition with structural induction. In: C. Dixon, M. Finger (eds.) FroCoS 2017, LNCS, vol. 10483, pp. 172–188. Springer (2017)
29. Czajka, Ł., Kaliszyk, C.: Hammer for Coq: Automation for dependent type theory. J. Autom. Reason. (1-4), 423–453 (2018)
30. Dowek, G.: Higher-order unification and matching. In: J.A. Robinson, A. Voronkov (eds.) Handbook of Automated Reasoning, vol. II, pp. 1009–1062. Elsevier and MIT Press (2001)
31. Fitting, M.: Types, Tableaus, and Gödel's God. Kluwer (2002)
32. Ganzinger, H., Stuber, J.: Superposition with equivalence reasoning and delayed clause normal form transformation. Information and Computation (1–2), 3–23 (2005)
33. Gordon, M.J.C., Melham, T.F. (eds.): Introduction to HOL: A Theorem Proving Environment for Higher Order Logic. Cambridge University Press (1993)
34. Gupta, A., Kovács, L., Kragl, B., Voronkov, A.: Extensional crisis and proving identity. In: F. Cassez, J. Raskin (eds.) ATVA 2014, LNCS, vol. 8837, pp. 185–200. Springer (2014)
35. Henkin, L.: Completeness in the theory of types. J. Symb. Log. (2), 81–91 (1950)
36. Huet, G.P.: A mechanization of type theory. In: N.J. Nilsson (ed.) IJCAI-73, pp. 139–146. William Kaufmann (1973)
37. Huet, G.P.: A unification algorithm for typed lambda-calculus. Theor. Comput. Sci. (1), 27–57 (1975)
38. Jensen, D.C., Pietrzykowski, T.: Mechanizing ω-order type theory through unification. Theor. Comput. Sci. (2), 123–171 (1976)
39. Jouannaud, J.P., Rubio, A.: Rewrite orderings for higher-order terms in eta-long beta-normal form and recursive path ordering. Theor. Comput. Sci. (1–2), 33–58 (1998)
40. Jouannaud, J.P., Rubio, A.: Polymorphic higher-order recursive path orderings. J. ACM (1), 2:1–2:48 (2007)
41. Kaliszyk, C., Sutcliffe, G., Rabe, F.: TH1: The TPTP typed higher-order form with rank-1 polymorphism. In: P. Fontaine, S. Schulz, J. Urban (eds.) PAAR-2016, CEUR Workshop Proceedings, vol. 1635, pp. 41–55. CEUR-WS.org (2016)
42. Kaliszyk, C., Urban, J.: HOL(y)Hammer: Online ATP service for HOL Light. Math. Comput. Sci. (1), 5–22 (2015)
43. Kamin, S., Lévy, J.J.: Two generalizations of the recursive path ordering. Unpublished manuscript, University of Illinois (1980)
44. Kőnig, D.: Über eine Schlussweise aus dem Endlichen ins Unendliche. Acta Sci. Math. (Szeged) (3:2–3), 121–130 (1927)
45. Knuth, D.E., Bendix, P.B.: Simple word problems in universal algebras. In: J. Leech (ed.) Computational Problems in Abstract Algebra, pp. 263–297. Pergamon Press (1970)
46. Kohlhase, M.: Higher-order tableaux. In: P. Baumgartner, R. Hähnle, J. Posegga (eds.) TABLEAUX '95, LNCS, vol. 918, pp. 294–309. Springer (1995)
47. Konrad, K.: HOT: A concurrent automated theorem prover based on higher-order tableaux. In: J. Grundy, M.C. Newey (eds.) TPHOLs '98, LNCS, vol. 1479, pp. 245–261. Springer (1998)
48. Kovács, L., Voronkov, A.: First-order theorem proving and Vampire. In: N. Sharygina, H. Veith (eds.) CAV 2013, LNCS, vol. 8044, pp. 1–35. Springer (2013)
49. Libal, T.: Regular patterns in second-order unification. In: A.P. Felty, A. Middeldorp (eds.) CADE-25, LNCS, vol. 9195, pp. 557–571. Springer (2015)
50. Lindblad, F.: A focused sequent calculus for higher-order logic. In: S. Demri, D. Kapur, C. Weidenbach (eds.) IJCAR 2014, LNCS, vol. 8562, pp. 61–75. Springer (2014)
51. Mayr, R., Nipkow, T.: Higher-order rewrite systems and their confluence. Theor. Comput. Sci. (1), 3–29 (1998)
52. Meng, J., Paulson, L.C.: Translating higher-order clauses to first-order clauses. J. Autom. Reason. (1), 35–60 (2008)
53. Miller, D.: A logic programming language with lambda-abstraction, function variables, and simple unification. J. Log. Comput. (4), 497–536 (1991)
54. Paulson, L.C., Blanchette, J.C.: Three years of experience with Sledgehammer, a practical link between automatic and interactive theorem provers. In: G. Sutcliffe, S. Schulz, E. Ternovska (eds.) IWIL-2010, EPiC, vol. 2, pp. 1–11. EasyChair (2012)
55. Robinson, J.: Mechanizing higher order logic. In: B. Meltzer, D. Michie (eds.) Machine Intelligence, vol. 4, pp. 151–170. Edinburgh University Press (1969)
56. Robinson, J.: A note on mechanizing higher order logic. In: B. Meltzer, D. Michie (eds.) Machine Intelligence, vol. 5, pp. 121–135. Edinburgh University Press (1970)
57. Schulz, S.: E - a brainiac theorem prover. AI Commun. (2-3), 111–126 (2002)
58. Schulz, S.: Fingerprint indexing for paramodulation and rewriting. In: B. Gramlich, D. Miller, U. Sattler (eds.) IJCAR 2012, LNCS, vol. 7364, pp. 477–483. Springer (2012)
59. Schulz, S., Cruanes, S., Vukmirović, P.: Faster, higher, stronger: E 2.3. In: P. Fontaine (ed.) CADE-27, LNCS, vol. 11716, pp. 495–507. Springer (2019)
60. Snyder, W.: Higher order E-unification. In: M.E. Stickel (ed.) CADE-10, LNCS, vol. 449, pp. 573–587. Springer (1990)
61. Snyder, W., Gallier, J.H.: Higher-order unification revisited: Complete sets of transformations. J. Symb. Comput. (1/2), 101–140 (1989)
62. Steen, A., Benzmüller, C.: The higher-order prover Leo-III. In: D. Galmiche, S. Schulz, R. Sebastiani (eds.) IJCAR 2018, LNCS, vol. 10900, pp. 108–116. Springer (2018)
63. Sutcliffe, G.: The TPTP problem library and associated infrastructure—from CNF to TH0, TPTP v6.4.0. J. Autom. Reason. (4), 483–502 (2017)
64. Sutcliffe, G., Benzmüller, C., Brown, C.E., Theiss, F.: Progress in the development of automated theorem proving for higher-order logic. In: R.A. Schmidt (ed.) CADE-22, LNCS, vol. 5663, pp. 116–130. Springer (2009)
65. Urban, J., Rudnicki, P., Sutcliffe, G.: ATP and presentation service for Mizar formalizations. J. Autom. Reason. (2), 229–241 (2013)
66. Vukmirović, P., Bentkamp, A., Nummelin, V.: Efficient full higher-order unification. In: Z.M. Ariola (ed.) FSCD 2020, LIPIcs, vol. 167, pp. 5:1–5:17. Schloss Dagstuhl—Leibniz-Zentrum für Informatik (2020)
67. Vukmirović, P., Blanchette, J.C., Cruanes, S., Schulz, S.: Extending a brainiac prover to lambda-free higher-order logic. In: T. Vojnar, L. Zhang (eds.) TACAS 2019, LNCS, vol. 11427, pp. 192–210. Springer (2019)
68. Vukmirović, P., Nummelin, V.: Boolean reasoning in a higher-order superposition prover. In: Practical Aspects of Automated Reasoning (PAAR 2020) (2020)
69. Waldmann, U., Tourret, S., Robillard, S., Blanchette, J.: A comprehensive framework for saturation theorem proving. In: N. Peltier, V. Sofronie-Stokkermans (eds.) IJCAR 2020, Part I,