Lassie: HOL4 Tactics by Example

Abstract

Proof engineering efforts using interactive theorem proving have yielded several impressive projects in software systems and mathematics. A key obstacle to such efforts is the requirement that the domain expert is also an expert in the low-level details in constructing the proof in a theorem prover. In particular, the user needs to select a sequence of tactics that lead to a successful proof, a task that in general requires knowledge of the exact names and use of a large set of tactics. We present Lassie, a tactic framework for the HOL4 theorem prover that allows individual users to define their own tactic language by example and give frequently used tactics or tactic combinations easier-to-remember names. The core of Lassie is an extensible semantic parser, which allows the user to interactively extend the tactic language through a process of definitional generalization. Defining tactics in Lassie thus does not require any knowledge in implementing custom tactics, while proofs written in Lassie retain the correctness guarantees provided by the HOL4 system. We show through case studies how Lassie can be used in small and larger proofs by novice and more experienced interactive theorem prover users, and how we envision it to ease the learning curve in a HOL4 tutorial.



Heiko Becker

MPI-SWS, Saarland Informatics Campus (SIC), Germany

Nathaniel Bos ∗

McGill University, Canada

Ivan Gavran

MPI-SWS, Germany

Eva Darulova

MPI-SWS, Germany

Rupak Majumdar

MPI-SWS, Germany


CCS Concepts: • Software and its engineering → Formal software verification; Programming by example; Macro languages.

∗ Nathaniel Bos was supported by a DAAD RISE Internship.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

Keywords: Interactive Theorem Proving, HOL4, Semantic Parsing, Tactic Programming

ACM Reference Format:

Heiko Becker, Nathaniel Bos, Ivan Gavran, Eva Darulova, and Rupak Majumdar. 2021. Lassie: HOL4 Tactics by Example. In Proceedings of the 10th ACM SIGPLAN International Conference on Certified Programs and Proofs (CPP ’21), January 18–19, 2021, Virtual, Denmark. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3437992.3439925

Interactive theorem proving is increasingly replacing “pen-and-paper” correctness proofs in domains such as compilers [22, 24], operating system kernels [23], and formalized mathematics [14, 16]. Interactive theorem provers (ITPs) provide strong guarantees: all proof steps are formalized and machine-checked by a kernel using only a small set of generally accepted proof rules.

These guarantees come at a cost. Writing proofs in an ITP requires expertise both in the target research area and in the particulars of the interactive theorem prover. Formally proving a theorem requires an expert to manually translate the general high-level proof idea from a pen-and-paper proof into detailed, low-level kernel proof steps, which makes writing formal proofs tedious and time-consuming. Theorem provers thus provide tactic languages that allow users to programmatically combine low-level proof steps [10, 15, 26, 37]. While this makes proofs less tedious, users need to build up a vocabulary of appropriate tactics, which constitutes a steep learning curve for novice ITP users.

Controlled natural language interfaces [1, 11] have been explored as an alternative, more intuitive interface to an ITP. However, these systems cannot be combined with a general tactic language and are thus constrained to a specific subset of proofs.

In this paper, we present the tactic framework Lassie that allows HOL4 users to define their own tactic language on top of the existing ones by example, effectively providing an individualized interface. Each example consists of the to-be-defined tactic (a natural language expression, called utterance) and its definition using existing HOL4 tactics with concrete arguments. For instance, we can define

    instantiate 'x' with '⊤'

as

    qpat_x_assum 'x' (qspec_then '⊤' assume_tac)

Newly defined Lassie tactics map directly and transparently to the underlying HOL4 tactics, and can be freely combined.

The main novelty compared to existing tactic languages is that Lassie allows users to define tactics by example and thus does not require knowledge of tactic programming. A tactic defined by example is automatically generalized into a parametric tactic by Lassie to make the tactic applicable in different contexts, which takes Lassie beyond a simple macro system.

Our key technical contribution is that Lassie realizes this definition-by-example using an extensible semantic parser [4, 35]. Lassie tactics are defined as grammar rules that map to HOL4 tactics. Lassie starts with an initial core grammar that is gradually extended through user-provided examples. For each example, the semantic parser finds matchings between the utterance and its definition. These matchings are used to create new rules for the grammar. Effectively, the semantic parser identifies the parameters of the newly given command, and thus generalizes from the given example. In our illustrative example, Lassie will identify 'x' and ⊤ as arguments and add a rule that will work with arbitrary terms in place of 'x' and ⊤.

Typically, extending a grammar through examples leads to ambiguity: for a single utterance-definition pair there may be different possible matchings and thus several new parsing rules introduced. In previous work [35], this ambiguity was resolved through user interaction, e.g., by showing the user a visualization of different parses and letting them choose the parse with the intended effect.
However, it is non-trivial to visualize intermediate steps in a general-purpose programming language. Our core insight is that ITPs offer an ideal setting to resolve this ambiguity. We show that by carefully designing the core grammar and by making use of type information, the ambiguity can be resolved automatically. Furthermore, ITPs “visualize” individual steps by showing the intermediate proof state, and rule out wrong tactic definitions by forcing proofs to be checked by the ITP system’s kernel.

Lassie’s target audience are trained ITP users who implement decision procedures and simple tactic descriptions in Lassie. Lassie allows them to define their own individualized language by defining easy-to-remember names for individual tactics, or (frequently used) combinations of tactics. A tactic language implemented in Lassie can then be used by non-expert users with prior programming experience but without necessarily in-depth experience with an ITP.

Compared to general tactic languages like ssreflect [15], Ltac [10], and Eisbach [26], Lassie requires less expert knowledge, at the expense of expressiveness. Similar to Lassie, structured tactic languages like Isar [36] have an extended parser. Extending a language like Isar requires editing the source code, while Lassie supports different tactic languages that can be defined simply by example. While Lassie can be used to define a tactic language that is closer to a natural language, by not requiring the interface to be entirely natural, Lassie is more general and flexible than systems like Mizar [1] and Naproche-SAD [11].

We implement Lassie as a library for the HOL4 [32] ITP system, but our technique is applicable to other theorem provers as well. Lassie is fully compatible with standard HOL4 proofs. Since all Lassie tactics map to standard HOL4 tactics, Lassie allows exporting a Lassie proof into standard HOL4 to maintain portability of proofs.
On the other hand, the learned grammar can be ported as well and can be used, for example, by a teacher to predefine a domain-specific (tactic) language with Lassie, which is used by learners to ease proofs in a particular area.

We demonstrate Lassie on a number of case studies proving theorems involving logic, and natural and real numbers. In particular, we show the generality of the naturalized tactics by reusing them across different proofs, and we show that Lassie can be incrementally used for proofs inside larger code bases. Finally, by predefining a tactic language with Lassie, we develop a tutorial for the HOL4 theorem prover.

Contributions.

In summary, this paper presents:

• an interactive, extensible framework called Lassie for writing tactics in an ITP by example;
• an implementation of this approach inside HOL4 (available at https://github.com/HeikoBecker/Lassie);
• a number of case studies and a HOL4 tutorial (available at https://github.com/HeikoBecker/HOL4-Tutorial) showing the effectiveness of Lassie.

We start by demonstrating Lassie on a small example, before explaining our approach in detail in Section 3.

For our initial example we choose to prove that the inverse function (x⁻¹) on real numbers is inverse monotonic for ≤. Figure 2 shows the formal statement of this theorem, together with an (informal) proof that one may find in a textbook (the proof uses a previously proven theorem about <).

Proofs in HOL4.

Figure 1a shows the corresponding HOL4 theorem statement and proof. We can be sure that this proof is correct, because it is machine-checked by HOL4. HOL4 [32] is an ITP system from the HOL family. It is based on higher-order logic, and all proofs are justified by inference rules from a small, trusted kernel. Its implementation language is Standard ML (SML), and similar to other HOL provers like HOL-Light [18] and Isabelle/HOL [27], proof steps are described using so-called tactics that manipulate a goal state until the goal has been derived from true.

(a) HOL4 proof

    Theorem REAL_INV_LE_AMONO:
      ∀ x y. 0 < x ∧ 0 < y ⇒ x⁻¹ ≤ y⁻¹ ⇔ y ≤ x
    Proof
      rpt strip_tac
      \\ `x⁻¹ < y⁻¹ ⇔ y < x`
           by (MATCH_MP_TAC REAL_INV_LT_ANTIMONO \\ fs [])
      \\ EQ_TAC
      \\ fs [REAL_LE_LT]
      \\ STRIP_TAC
      \\ fs [REAL_INV_INJ]
    QED

(b) Lassie proof

    Theorem REAL_INV_LE_AMONO:
      ∀ x y. 0 < x ∧ 0 < y ⇒ x⁻¹ ≤ y⁻¹ ⇔ y ≤ x
    Proof
      nltac `
        introduce assumptions.
        show 'inv x < inv y <=> y < x'
          using (use REAL_INV_LT_ANTIMONO
                 THEN follows trivially).
        case split.
        simplify with [REAL_LE_LT].
        introduce assumptions.
        simplify with [REAL_INV_INJ]. trivial.`
    QED

Figure 1. HOL4 proof (a) and Lassie proof (b) for theorem REAL_INV_LE_AMONO

    Theorem 1. ∀ x y, 0 < x ∧ 0 < y ⇒ x⁻¹ ≤ y⁻¹ ⇔ y ≤ x

    Proof 1. We show both sides of the implication separately.
    To show (x⁻¹ ≤ y⁻¹ ⇒ y ≤ x), we do a case split on whether x⁻¹ < y⁻¹ or x⁻¹ = y⁻¹. If x⁻¹ < y⁻¹, the claim follows because the inverse function is inverse monotonic for <. If x⁻¹ = y⁻¹, the claim follows from injectivity of the inverse.
    To show the case (y ≤ x ⇒ x⁻¹ ≤ y⁻¹), we do a case split on whether y < x or y = x. If y < x, the claim follows because the inverse function is inverse monotonic for <. If y = x, the claim follows trivially.

Figure 2. Textbook proof that the inverse function is inverse monotonic for ≤

When doing a HOL4 proof, one first states the theorem to be proven and starts an interactive proof. Figure 3 shows the example proof statement from Figure 1a on the left and the interactive session on the right. To show that the theorem holds, the user would write a tactic proof at the place marked with (* Proof *), starting with the initial tactic rpt strip_tac, sending each tactic to the interactive session on the right.

(left)

    Theorem REAL_INV_LE_AMONO:
      ∀ x y. 0 < x ∧ 0 < y ⇒ (inv x ≤ inv y ⇔ y ≤ x)
    Proof
      rpt strip_tac
      (* Proof *)
    QED

(right)

    ... x⁻¹ ≤ y⁻¹ ⇔ y ≤ x
    : proof
    >

Figure 3. HOL4 theorem (left) and interactive proof session (right)

A HOL4 tactic implements, e.g., a single kernel step, such as assume_tac thm, which introduces thm as a new assumption, but a tactic can also implement more elaborate steps, like fs, which implements a stateful simplification algorithm, and imp_res_tac thm, which resolves thm with the current assumptions to derive new facts. In our example, rpt strip_tac repeatedly introduces universally quantified variables and introduces left-hand sides of implications as assumptions.

After each tactic application, the HOL4 session prints the goal state that the user still needs to show, keeping track of the state of the proof. Once the HOL4 session prints Initial goal proved, the proof is finished. To make sure that the proof can be checked by HOL4 when run non-interactively, the separate tactics used in each step are chained together using the infix operator \\. As this operator returns a tactic after taking some additional inputs, it is called a tactical.

Proofs in Lassie.

Figure 1b shows the proof of our theorem using Lassie. This proof follows the same steps as the standard HOL4 proof, but each tactic is called using a name that we have previously defined in Lassie by example. We chose the Lassie tactics to be more descriptive (for us at least), and while they make the proof slightly more verbose, they also make it easier to follow for (non-)experts. Each of our Lassie tactics maps to corresponding formal HOL4 tactics, so that the proof is machine-checked by HOL4 as before, retaining all correctness guarantees.

(left)

    Theorem REAL_INV_LE_AMONO:
      ∀ x y. 0 < x ∧ 0 < y ⇒ (inv x ≤ inv y ⇔ y ≤ x)
    Proof
      nlexplain ()
      introduce assumptions.
      we show 'inv x < inv y <=> y < x'
        using (use REAL_INV_LT_ANTIMONO
               THEN follows trivially).
      case split.
      simplify with [REAL_LE_LT].
      introduce assumptions.
      simplify with [REAL_INV_INJ]. trivial.
    QED

(right)

    rpt strip_tac \\
    `inv x < inv y ⇔ y < x`
      by (irule REAL_INV_LT_ANTIMONO THEN fs [ ])

    0. 0 < x
    1. 0 < y
    2. x⁻¹ < y⁻¹ ⇔ y < x
    -------------------------------------
    x⁻¹ ≤ y⁻¹ ⇔ y ≤ x
    | >

Figure 4. Intermediate proof state using goalTree’s and nlexplain

Unlike existing tactic languages, Lassie allows users to define custom tactics by example and thus does not require any knowledge of tactic programming. For instance, for our example proof, we defined a new tactic by

    def `simplify with [REAL_LE_LT]` `fs [REAL_LE_LT]`;

Lassie automatically generalizes from this example so that we can later use this tactic with a different argument:

    simplify with [REAL_INV_INJ]

To achieve this automated generalization, Lassie internally uses an extensible semantic parser [4]. That is, Lassie tactics are defined as grammar rules. Lassie initially comes with a relatively small core grammar, supporting commonly used HOL4 tactics. This grammar is gradually and interactively extended with additional tactic descriptions by giving example mappings. For instance, our definition above would add the following rule to the grammar:

    simplify with [THM1, THM2, ...] → fs [THM1, THM2, ...]

Note that this rule allows simplify with to be called with a list of theorems, not just a single theorem as in the example given. This generalization happens completely automatically in the semantic parser and does not require any programming by the user.

The Lassie-defined tactics can be used in a proof using the function nltac, which sends tactic descriptions to the semantic parser, which in turn returns the corresponding HOL4 tactic. Because nltac has the same return type as all other standard HOL4 tactics, it can be used as a drop-in replacement for standard HOL4 tactics, and can be freely combined with other HOL4 tactics in a proof.

Explaining Proofs with Lassie.

Lassie also comes with a function nlexplain. Instead of being a drop-in replacement, like nltac, nlexplain decorates the proof state with the HOL4 tactic that is internally used to perform the current proof step. Figure 4 shows an intermediate state when using nlexplain to prove our example theorem. All Lassie tactics inside the red dashed box on the left-hand side have been passed to nlexplain. The goal state on the right-hand side shows the current state of the proof as well as the HOL4 tactic script that has the same effect as the Lassie tactics.

We envision nlexplain to be used, for example, in a HOL4 tutorial to ease the learning curve when learning interactive theorem proving. Lassie allows a teacher to first define a custom tactic language that follows the same structure as the HOL4 proof, but that uses descriptive names and may thus be easier to follow for a novice. In a second step, one can use nlexplain to teach the actual underlying HOL4 tactics.

Function nlexplain can furthermore be used for sharing Lassie proofs without introducing additional dependencies on the semantic parser. While sharing Lassie proof scripts directly is possible, it requires sharing the state of the semantic parser as well. Alternatively, one can send the Lassie proof to nlexplain and obtain a HOL4 tactic script that can then be shared without depending on the semantic parser.

More Complex Tactics.

While the target user that we had in mind when developing Lassie is not an ITP expert, experts may nonetheless find Lassie useful to, e.g., group commonly used combinations of tactics. For example, to make the proofs of simple subgoals easier, an expert can define a tactic that uses different simplification algorithms and an automated decision procedure to attempt to solve a goal automatically:

    def `prove with [ADD_ASSOC]`
        `all_tac THEN (fs [ADD_ASSOC] THEN NO_TAC)
         ORELSE (rw [ADD_ASSOC] THEN NO_TAC)
         ORELSE metis_tac [ADD_ASSOC]`

The HOL4 tactic will first attempt to solve the goal using the simplification algorithms implemented in tactics fs and rw, and if both fail, it will call into the automated decision procedure metis_tac, based on first-order resolution. (Tactical t1 ORELSE t2 first applies tactic t1, and if t1 fails, t2 is applied. THEN NO_TAC makes the simplification fail if it does not solve the goal.)

The resulting tactic description prove with [THM1, THM2, ...] is parametric in the used list of theorems, making it applicable in different contexts.

Defined tactic descriptions are added to the grammar and are as such part of the generalization algorithm. Thus we can reuse the just defined tactic description to define an even more elaborate version:

    def `'T' from [CONJ_COMM]` `'T' by (prove with [CONJ_COMM])`;

This tactic description, once generalized by the semantic parser, completely hides the fact that we may need to call into three different algorithms to prove a subgoal, while allowing us to enrich our assumptions with arbitrary goals, as long as they are provable by the underlying HOL4 tactics.

(left)

    $ROOT → $tactic                  (λx. x)
    $tactic → $TOKEN                 (λx. lookup "tactic" x)
    $tactic → $thm->tactic $thm      (λx y. x y)
    $thm->tactic → $TOKEN            (λx. lookup "thm list->tactic" x)
    $thm → $TOKEN                    (λx. x)

(right)

    gen_tac : tactic
    all_tac : tactic
    strip_tac : tactic
    fs : thm list->tactic
    simp : thm list->tactic

Figure 5. Excerpt from Lassie grammar (left) and the database (right), parsing tactics and thm list tactics
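The control flow of the prove with tactic can be illustrated with a small, self-contained Standard ML model. This is only a sketch under strong assumptions and not HOL4's implementation: a goal is modelled as a string, a tactic as a function returning the remaining subgoals, and all names (allTac, noTac, thenT, orElse, simplifier, fsModel, rwModel, metisModel, proveWith) are hypothetical stand-ins. The grouping of the branches is one plausible reading of the definition.

```sml
(* Toy model of tacticals in plain Standard ML; NOT HOL4's implementation. *)
exception TacticFailed

type goal = string
type tactic = goal -> goal list   (* remaining subgoals; [] means solved *)

(* Model of all_tac: succeeds and leaves the goal unchanged. *)
val allTac : tactic = fn g => [g]

(* Model of NO_TAC: always fails. *)
val noTac : tactic = fn _ => raise TacticFailed

(* Model of t1 THEN t2: apply t1, then t2 to every resulting subgoal. *)
fun thenT (t1 : tactic, t2 : tactic) : tactic =
  fn g => List.concat (List.map t2 (t1 g))

(* Model of t1 ORELSE t2: if t1 fails, apply t2 to the original goal. *)
fun orElse (t1 : tactic, t2 : tactic) : tactic =
  fn g => (t1 g handle TacticFailed => t2 g)

(* Hypothetical simplifier: solves the listed goals, leaves others unchanged. *)
fun simplifier (solvable : goal list) : tactic =
  fn g => if List.exists (fn s => s = g) solvable then [] else [g]

val fsModel : tactic = simplifier ["a"]
val rwModel : tactic = simplifier ["a", "b"]

(* Hypothetical decision procedure: solves "c", fails on anything else. *)
val metisModel : tactic =
  fn g => if g = "c" then [] else raise TacticFailed

(* Same shape as the definition of "prove with" in the text. *)
val proveWith : tactic =
  orElse (orElse (thenT (allTac, thenT (fsModel, noTac)),
                  thenT (rwModel, noTac)),
          metisModel)
```

In this model, a simplifier that leaves a subgoal open is followed by noTac and therefore fails, so the next alternative is tried on the original goal. A simplifier that solves the goal leaves no subgoal for noTac to fail on.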

Existing approaches to tactic languages, like Eisbach [26] and ssreflect [15], are implemented as domain-specific languages (DSLs), usually within the theorem prover’s implementation language. In these approaches, defining a new tactic is the same as defining a function in the implemented DSL. If a tactic should be generalized over, e.g., a list of theorems, this generalization must be performed manually by the user of the tactic language.

In contrast, Lassie’s tactics are defined in a grammar that is extended interactively by example using a semantic parser [4] that performs parameter generalization automatically. We define an initial core grammar (Section 3.1) that users can extend by example (Section 3.2). Each description defined in this way (a Lassie tactic) maps to a (sequence of) HOL4 tactics, which is then applied to the proof state and checked by the HOL4 kernel. Note that a Lassie user does not directly modify, and thus does not have to be aware of, the underlying (core) grammar; the extension happens by example.

The left-hand side of Figure 5 shows a subset of Lassie’s core grammar. $ROOT is the symbol for the root node in the grammar and must always be a valid tactic. The core grammar is used to parse theorems, tactics, and tacticals (of type thm list -> tactic), and looks up functions of these types.

Each rule has the form $left → $right (λx. ...). While $left → $right works just as in a standard context-free grammar, the λ-abstraction, called logical form, is applied to the result of parsing $right using the grammar. The logical form allows us to manipulate parsing results after they have been parsed by the grammar, essentially interpreting them within the parser. In Lassie we use it to implement function applications when combining tactics, and to look up names in a database.

We have built a core grammar for Lassie that supports the most common tactics and tacticals of HOL4. For instance, the core grammar will parse fs [REAL_INV_INJ] unambiguously into the equivalent SML code as its logical form. We think of this core grammar as the starting point for users to define Lassie tactics on top of the HOL4 tactics.

Adding every HOL4 tactic and tactical as a separate terminal to the grammar would clutter it unnecessarily and make it hard to maintain. That is why the grammar allows so-called lookup rules that check a dictionary for elements of predefined sets. The right-hand side of Figure 5 shows a subset of the database used for the lookups. In the grammar in Figure 5, a tactic can then either be looked up from the database (second rule), or a tactic can be a combination of a function of type thm -> tactic and a theorem (third rule). We refer to functions of type thm -> tactic as theorem tactics, as they take a theorem as input and return a HOL4 tactic. Theorem tactics are again looked up from the database, whereas theorems can be any possible string, denoted in the grammar by $TOKEN.
In addition to HOL4 tactics and theorem tactics, our core grammar also uses a combination of rules (not shown in Figure 5) to support tactic-returning functions of the following types:

• thm list -> tactic
• tactic -> tactic
• term quotation -> tactic
• (thm -> tactic) -> tactic
• tactic -> tactic -> tactic
• term quotation -> (thm -> tactic) -> thm -> tactic
• term quotation list -> (thm -> tactic) -> thm -> tactic

These types capture most of the tactics implemented in HOL4, and we add a subset of 53 commonly used tactics into the database.
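The lookup rules of Figure 5 can be mimicked in a few lines of plain Standard ML. The following is a toy model, not SEMPRE's parser: the names ty, database, lookup, and parseTactic are hypothetical, only two of the types are modelled, and wrapping the single theorem token in a list is an assumption made so that the output resembles fs [REAL_INV_INJ].

```sml
(* Toy model of the typed lookup in Figure 5; NOT SEMPRE's implementation. *)
datatype ty = Tactic | ThmListTactic

(* Excerpt of the database: each name is paired with its type. *)
val database =
  [("gen_tac", Tactic), ("all_tac", Tactic), ("strip_tac", Tactic),
   ("fs", ThmListTactic), ("simp", ThmListTactic)]

(* lookup t x: SOME x if x is registered with type t, NONE otherwise. *)
fun lookup (t : ty) (x : string) : string option =
  if List.exists (fn (n, t') => n = x andalso t' = t) database
  then SOME x
  else NONE

(* Parse whitespace-separated tokens into SML tactic source text.
   One token: must be a known tactic (second rule of Figure 5).
   Two tokens: a known function applied to an arbitrary theorem name
   (third rule); the theorem is wrapped in a list by assumption. *)
fun parseTactic (input : string) : string option =
  case String.tokens Char.isSpace input of
      [tok] => lookup Tactic tok
    | [f, thm] =>
        (case lookup ThmListTactic f of
             SOME f' => SOME (f' ^ " [" ^ thm ^ "]")
           | NONE => NONE)
    | _ => NONE
```

Because the type of each name is checked during the lookup, an input such as gen_tac REAL_INV_INJ is rejected rather than parsed into ill-typed SML, which is the sense in which the grammar acts as a type checker.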

Non-Ambiguity.

A common issue in semantic parsing is grammar ambiguity. In Lassie, having an ambiguous grammar is not desirable as it would require users to disambiguate each ambiguous Lassie tactic while proving theorems. We thus aim to have an unambiguous grammar and achieve this by a careful design of our core grammar. By encoding the types of the tactics as non-terminals, our core grammar acts as a type checker for our supported subset of HOL4 tactics. Even after defining custom tactics, the semantic parser will always parse Lassie tactics into the subset it can type check, thus keeping the grammar unambiguous. During our experiments we have not found a case where extending the grammar introduced any ambiguity, which supports this design choice.

With our core grammar, Lassie can parse the HOL4 tactics we have added to the grammar into their (equivalent) SML code. We now explain how this grammar can be interactively extended by example in order to provide custom names for (sequences of) tactics.

Lassie’s tactic learning mechanism relies on a semantic parser. A semantic parser converts a natural language utterance into a corresponding (executable) logical form or, due to ambiguity, a ranked list of candidates. Semantic parsers can be implemented in many ways, e.g., they can be rule-based or learned from data [25]. SEMPRE [4], which we use, is a toolkit for developing semantic parsers for different tasks. It provides commonly used natural language processing methods, and different ways of encoding logical forms.

Lassie’s semantic parser is implemented on top of the interactive version of SEMPRE [35]. It starts with a core formal grammar, which can be expanded through interactions with the user. Users can add new concepts to the grammar by example using Lassie’s library function def, which invokes the semantic parser. Each example consists of an (utterance, definition) pair, where the utterance is the new tactic to be defined and the definition is an expression that is already part of the grammar. For instance, we can give as example:

    def `simplify with REAL_ADD_ASSOC` (*utterance*)
        `fs [REAL_ADD_ASSOC]` (*definition*)

Note that the command demonstrates the new tactic (simplify with) with a particular argument (REAL_ADD_ASSOC), but does not explicitly state what the argument is.

The definition has to already be part of the grammar and thus fully parsable, otherwise the parser will reject the pair, whereas only some parts of the utterance may be parsable. That is, the definition needs to be already understood by the semantic parser, either because it is part of the core grammar or because it was previously defined by the user.

The function def first obtains a logical form for the definition (which exists since the definition is part of the grammar). The semantic parser then induces one or more grammar rules from the utterance-definition pair and attaches the logical form of the definition to those rules.

The induction of new grammar rules relies on finding correspondences between parsable parts of the utterance and its definition. As an example, consider our simplify with command. Because REAL_ADD_ASSOC can be parsed into the category $thm, the two new production rules added to the grammar are:

    $tactic → simplify with REAL_ADD_ASSOC   (λx. fs [REAL_ADD_ASSOC])
    $tactic → simplify with $thm             (λthm. fs [thm])

Based on the second added rule, we can now use the Lassie tactic simplify with together with any other description that is parsed as a $thm, because the parser identified REAL_ADD_ASSOC as an argument and generalized from our example by learning the λ-abstraction over the variable thm.

The next time the user calls, for instance,

    nltac `simplify with REAL_ADD_COMM`

Lassie’s semantic parser will parse this command into the tactic fs [REAL_ADD_COMM] using the second added rule.

Lassie is implemented as a HOL4 library, which can be loaded into a running HOL4 session with open LassieLib;. This will start a SEMPRE process, and the library captures its input and output as SML streams. Whenever nltac or nlexplain are run, the input is sent to SEMPRE over the input stream, and if it can be parsed with the currently learned grammar, SEMPRE writes the resulting HOL4 tactic to the output stream as a string. If parsing fails, i.e., SEMPRE does not recognize the description, LassieLib raises an exception, such that an end-user can define the tactic with a call to def.

We want nltac to act as a drop-in replacement for HOL4 tactics. Therefore, nltac must not only be able to parse single tactics, but must also be able to parse full tactic scripts, performing a proof from start to finish. During our case studies, we noticed that SEMPRE was not built for parsing large strings of text, but rather for smaller examples. To speed up parsing, we have defined a global constant, LassieSep, which is used to split input strings of nltac. For example, calling

    nltac `case split. simplify with [REAL_LE_LT].`

will lead to two separate calls to the semantic parser: one for case split and one for simplify with [REAL_LE_LT]. The resulting HOL4 tactics are joined together using the THEN_LT tactical, which is a more general version of the tactical \\, as it has an additional argument for selecting the subgoal to which the given tactic is applied. When proving a goal interactively, some tactics, like induction and case splitting, can lead to multiple subgoals being generated. We use the THEN_LT tactical to implement selecting subgoals in nltac.

There are some differences in how nltac and nlexplain are used. Function nltac can be used as a drop-in replacement for HOL4 tactics, and thus supports selection of subgoals.
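The input splitting that nltac performs before calling the semantic parser can be sketched as follows. This is a toy model in plain Standard ML, not LassieLib's code: the separator value "." is assumed from the example above, and the names lassieSep, trim, and splitScript are hypothetical.

```sml
(* Toy model of input splitting; NOT LassieLib's implementation. *)
val lassieSep = #"."   (* assumed separator, taken from the example in the text *)

(* Drop leading and trailing whitespace. *)
fun trim (s : string) : string =
  let
    val ss = Substring.full s
    val ss = Substring.dropl Char.isSpace ss
    val ss = Substring.dropr Char.isSpace ss
  in
    Substring.string ss
  end

(* Split a proof script into individual tactic descriptions,
   discarding the empty pieces that follow a trailing separator. *)
fun splitScript (script : string) : string list =
  List.filter (fn s => s <> "")
    (List.map trim (String.fields (fn c => c = lassieSep) script))
```

Each element of the resulting list would be sent to the parser separately. A plain character split like this one would also cut inside any description that itself contains the separator, so it should be read as an illustration of the idea only.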
Function nlexplain, in contrast to nltac, is meant to be used interactively, and therefore parses Lassie tactics, but does not support selection of subgoals. Instead, subgoals are proven in order of appearance. The main purpose of nlexplain is to show how Lassie tactics are translated back into HOL4 tactics. To do so, it modifies HOL4’s interactive read-eval-print loop (REPL), and thus can only be used interactively, but not to replace plain HOL4 tactics in proof scripts like nltac.

To differentiate between SML expressions and HOL4 expressions, HOL4 requires HOL4 expressions to be wrapped in quotes (`), but quotes are also a way of allowing multiline strings in HOL4 proof scripts. Therefore we choose quotes to denote the start and end of a Lassie proof script, and use apostrophes (') to denote the start and the end of a HOL4 expression in a Lassie proof script.

Lassie currently does not support debugging tactic applications. While an end-user can easily define new tactics by example using the semantic parser, figuring out a tactic’s exact behavior and fixing bugs still requires the user to manually step through the corresponding HOL4 tactic in an interactive proof and manually inspect the steps. We see extending Lassie with debugging support as future work.

Our initial core grammar supports only a fixed set of the most commonly used HOL4 tactics. However, it is common in ITPs to develop custom tactics on a per-project basis, possibly including full-blown decision procedures [33]. To make sure that users can add their own HOL4 tactics as well as custom decision procedures to Lassie, the library provides the functions addCustomTactic, addCustomThmTactic, and addCustomThmlistTactic.

The difference between def and addCustom[*]Tactic is in where the elements are added to the semantic parser’s grammar. Function def uses SEMPRE’s generalization algorithm and adds rules to the grammar that may contain non-terminals (e.g. follows from [$thms]).
Function addCustomTactic always adds a new terminal to the grammar.

We explain addCustomTactic by example. Suppose a user wants to reuse an existing linear decision procedure for real numbers (REAL_ASM_ARITH_TAC) to close simple proof goals. Running addCustomTactic REAL_ASM_ARITH_TAC adds the new production rule $tactic → REAL_ASM_ARITH_TAC to the SEMPRE grammar. Tactic REAL_ASM_ARITH_TAC can then be used in subsequent calls to def to provide Lassie-based descriptions, or immediately in nltac and nlexplain.

Now that SEMPRE accepts the decision procedure as a valid tactic, we extend our expert automation tactic from before to try to solve a goal with this decision procedure too:

    def `prove with [ADD_ASSOC]`
        `all_tac THEN (fs [ADD_ASSOC] THEN NO_TAC)
         ORELSE (rw [ADD_ASSOC] THEN NO_TAC)
         ORELSE REAL_ASM_ARITH_TAC
         ORELSE metis_tac [ADD_ASSOC]`

Functions addCustomThmTactic and addCustomThmlistTactic work similarly, adding grammar rules for $thm->tactic and $thm list->tactic.

Users can define libraries of their own Lassie tactics using the function registerLibrary, which takes as its first input a string giving the library a unique name, and as its second input a function of type unit -> unit, which should call def on the definitions to be added, following Section 3.2. The defined libraries can then be shared and loaded simply by calling the function loadLibraries. We defined libraries for proofs using logic, natural numbers, and real numbers from our case studies and used these in our HOL4 tutorial (Section 5).
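The fallback behaviour of the prove with definition earlier in this section can be illustrated with a small toy model. The sketch below is written in Python rather than SML, and it is not Lassie or HOL4 code: goals are plain strings, a tactic is a function from a goal to the list of subgoals it leaves open, and the names simplifier, decision_procedure, and first_success are hypothetical helpers introduced only for this illustration. The leading all_tac of the original definition is a no-op and is omitted.

```python
from typing import Callable, List

Goal = str
Tactic = Callable[[Goal], List[Goal]]  # a tactic maps a goal to the subgoals still open


class TacticFailure(Exception):
    """Raised when a tactic cannot be applied to a goal."""


def no_tac(goal: Goal) -> List[Goal]:
    # Always fails. It is only harmless when no subgoal is left to apply it to.
    raise TacticFailure("NO_TAC")


def then(first: Tactic, second: Tactic) -> Tactic:
    # Apply `first`, then apply `second` to every subgoal that `first` left open.
    def tac(goal: Goal) -> List[Goal]:
        remaining: List[Goal] = []
        for subgoal in first(goal):
            remaining.extend(second(subgoal))
        return remaining
    return tac


def orelse(first: Tactic, second: Tactic) -> Tactic:
    # Try `first`; if it fails, fall back to `second` on the original goal.
    def tac(goal: Goal) -> List[Goal]:
        try:
            return first(goal)
        except TacticFailure:
            return second(goal)
    return tac


def simplifier(closes: set) -> Tactic:
    # Hypothetical stand-in for fs/rw: closes some goals, leaves the rest open.
    def tac(goal: Goal) -> List[Goal]:
        return [] if goal in closes else [goal]
    return tac


def decision_procedure(closes: set) -> Tactic:
    # Hypothetical stand-in for REAL_ASM_ARITH_TAC/metis_tac: closes the goal or fails.
    def tac(goal: Goal) -> List[Goal]:
        if goal in closes:
            return []
        raise TacticFailure("decision procedure could not prove " + goal)
    return tac


def first_success(*tactics: Tactic) -> Tactic:
    # Chain the alternatives with orelse, left to right.
    chain = tactics[0]
    for tactic in tactics[1:]:
        chain = orelse(chain, tactic)
    return chain


fs = simplifier({"g_fs"})
rw = simplifier({"g_rw"})
arith = decision_procedure({"g_arith"})
metis = decision_procedure({"g_metis"})

# `t THEN NO_TAC` succeeds only if `t` closes the goal completely, so a
# simplifier that merely makes partial progress does not block later alternatives.
prove_with = first_success(then(fs, no_tac), then(rw, no_tac), arith, metis)
```

In this model, prove_with("g_rw") returns an empty list of subgoals: fs leaves the goal open, NO_TAC turns that into a failure, and the next alternative is tried. This is the reason for guarding the simplification tactics with THEN NO_TAC while leaving the decision procedures unguarded.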

We evaluate Lassie on three case studies and show how it can be used for developing a HOL4 tutorial. In the paper, we show only the main theorems for the case studies, but the full developments can be found in the Lassie repository.

First, we prove Euclid’s theorem from the HOL4 tutorial [32] that is distributed with the HOL4 theorem prover documentation. Euclid’s theorem states that the prime numbers form an infinite sequence. Its HOL equivalent states that for any natural number 𝑛, there exists a natural number 𝑝 which is greater than 𝑛 and a prime number.

To prove the final theorem, shown in Figure 6, we have proven 19 theorems in total. To prove these theorems, we defined a total of 22 new tactics using LassieLib.def. Some tactics have been used only once, but, for example, the tactic [...] solves the goal was reused 16 times. Another example is the tactic thus PRIME_FACTOR for 'FACT n + 1', which introduces a specialized version of the theorem PRIME_FACTOR, proving the existence of a prime factor for every natural number. Note how the tactic description can freely mix text descriptions with the parameters for the underlying tactic. Similarly, the first step of the HOL4 proof reads CCONTR_TAC, which initiates a proof by contradiction. For an untrained user, figuring out and remembering this name can be cumbersome, even though the user might know

the high-level proof step. Instead, in Lassie we have used the (for us) more intuitive name suppose not.

Finally, each sub-step of the HOL4 proof is closed using the tactic metis_tac. For an expert user, it is obvious that metis_tac can be used, because the expert knows that it performs first-order resolution to prove the goal. In the Lassie proof, we hide metis_tac [] in combination with the simplification tactics fs [] and rw [] under the description [] solves the goal. To further automate proving simple subgoals, we combine the tactic [] solves the goal with our Lassie tactic for proving subgoals (show 'T' using (gen_tac)) by defining show 'T' using [...] as show 'T' using ([...] solves the goal).

CPP ’21, January 18–19, 2021, Virtual, Denmark. Heiko Becker, Nathaniel Bos, Ivan Gavran, Eva Darulova, and Rupak Majumdar

    Theorem EUCLID:
      ∀n. ∃p. n < p ∧ prime p
    Proof
      CCONTR_TAC \\ fs[]
      \\ `FACT n + 1 ≠ 1` by rw[FACT_LESS, neq_zero]
      \\ qspec_then `FACT n + 1` assume_tac PRIME_FACTOR
      \\ `∃q. prime q ∧ q divides (FACT n + 1)` by fs[]
      \\ `q ≤ n` by metis_tac[NOT_LESS_EQUAL]
      \\ `0 < q` by metis_tac[PRIME_POS]
      \\ `q divides FACT n` by metis_tac [DIVIDES_FACT]
      \\ `q = 1` by metis_tac[DIVIDES_ADDL, DIVIDES_ONE]
      \\ `prime 1` by fs[]
      \\ fs[NOT_PRIME_1]
    QED

    Theorem EUCLID: (* Lassie *)
      ∀n. ∃p. n < p ∧ prime p
    Proof
      nltac `
        suppose not. simplify.
        we can derive 'FACT n + 1 <> 1' from [FACT_LESS, neq_zero].
        thus PRIME_FACTOR for 'FACT n + 1'.
        we further know '∃q. prime q and q divides (FACT n + 1)'.
        show 'q <= n' using [NOT_LESS_EQUAL].
        show '0 < q' using [PRIME_POS].
        show 'q divides FACT n' using [DIVIDES_FACT].
        show 'q=1' using [DIVIDES_ADDL, DIVIDES_ONE].
        show 'prime 1' using (simplify).
        [NOT_PRIME_1] solves the goal.`
    QED

Figure 6. HOL4 proof (left, shown first) and Lassie proof (right, shown second) of Euclid’s theorem

Next, we will show how Lassie can be used in more involved proofs about both real and natural numbers. As an example, we prove that for any natural number 𝑛, the sum of the cubes of the first 𝑛 natural numbers is the same as the square of the sum. The Lassie proof of the final theorem is in Figure 7. We have proven a total of 5 theorems: two (real-numbered) binomial laws, the closed form for summing the first 𝑛 natural numbers, a side lemma on exponentiation, and the main result about cubing the first 𝑛 numbers. All our proofs in this case study have been performed using the HOL4 theory of real numbers simply for convenience, as we found real number arithmetic easier for proving theorems that involve subtractions, powers, and divisions. We defined a total of 42 tactics by example using LassieLib.def and added 3 custom tactics using LassieLib.addCustomTactic and LassieLib.addCustomThmTactic. Again, some of the tactics were used only once or twice, but our Lassie tactics for rewriting with a theorem (two calls to LassieLib.def to support rewriting from left to right, and right to left) are reused 13 times within the proofs.

This case study also shows how Lassie can be extended with custom tactics. Our restricted core grammar of Lassie does not include HOL4’s decision procedure for reals. Nevertheless, a user may want to provide this tactic as part of some automation. Because Lassie supports on-the-fly grammar extensions, we add the decision procedure for reals (REAL_ASM_ARITH_TAC) to the grammar: addCustomTactic REAL_ASM_ARITH_TAC. Having added this tactic, it can be used just like the HOL4 tactics we support in the base grammar. Thus we define a Lassie tactic using the decision procedure:

    def `we know 'T'` `'T' by (REAL_ASM_ARITH_TAC ORELSE DECIDE_TAC)`

The semantic parser now automatically generalizes the grammar rule for this tactic, learning the rule

    $tactic → we know '$term' (λt. 't' by (REAL_ASM_ARITH_TAC ORELSE DECIDE_TAC))

With this, we can use more complicated tactics like we know '2 * &n * (1 + &n) * inv 2 = 2 * inv 2 * &n * (1 + &n)'.

In general, combining the extensibility of Lassie and the generalization of SEMPRE allows us to support arbitrary settings where trained experts can implement domain-specific decision procedures and provide simple tactic descriptions to novice users that want to use them in a HOL4 proof, essentially decoupling the automation from its implementation. Equally, any user can define personalized and more intuitive names for often-used tactics.
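To make the generalization step for we know 'T' more concrete, the following toy sketch abstracts the quoted term shared by an example description and its example tactic into a hole, and then instantiates the resulting rule on a new description. It is written in Python and is only an illustration of the idea: it is not SEMPRE’s generalization algorithm, and the names Rule and generalize are hypothetical and introduced here.

```python
import re
from dataclasses import dataclass

# Matches a term quoted with apostrophes, as in Lassie proof scripts.
QUOTED = re.compile(r"'([^']*)'")


@dataclass
class Rule:
    pattern: "re.Pattern"  # matches new descriptions of the same shape
    template: str          # tactic text with numbered holes {0}, {1}, ...

    def apply(self, description: str):
        # Return the instantiated tactic text, or None if the description does not match.
        match = self.pattern.fullmatch(description)
        if match is None:
            return None
        return self.template.format(*match.groups())


def generalize(example_description: str, example_tactic: str) -> Rule:
    # Quoted terms that occur in both the description and the tactic become holes.
    terms = QUOTED.findall(example_description)
    shared = [t for t in dict.fromkeys(terms) if f"'{t}'" in example_tactic]

    pattern = re.escape(example_description)
    # Protect literal braces so that str.format only sees our numbered holes.
    template = example_tactic.replace("{", "{{").replace("}", "}}")
    for index, term in enumerate(shared):
        quoted = f"'{term}'"
        # Only the first occurrence in the description is abstracted, which keeps
        # regex groups and template holes aligned in this simplified model.
        pattern = pattern.replace(re.escape(quoted), "'([^']*)'", 1)
        template = template.replace(quoted, "'{" + str(index) + "}'")
    return Rule(re.compile(pattern), template)


rule = generalize("we know 'T'",
                  "'T' by (REAL_ASM_ARITH_TAC ORELSE DECIDE_TAC)")
tactic_text = rule.apply("we know 'a < 0'")
```

Here tactic_text is the string 'a < 0' by (REAL_ASM_ARITH_TAC ORELSE DECIDE_TAC), mirroring the learned rule with the λ-abstraction over the term. A real semantic parser works on grammar categories such as $term and $tactic rather than on regular expressions, so this sketch captures only the shape of the learned rule.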

    Theorem sum_of_cubes_is_squared_sum:
      ∀n. sum_of_cubes n = (sum n) pow 2
    Proof
      nltac `
        induction on 'n'.
        simplify conclusion with [sum_of_cubes_def, sum_def].
        rewrite with [POW_2, REAL_LDISTRIB, REAL_RDISTRIB, REAL_ADD_ASSOC].
        showing '&SUC n pow 3 = &SUC n * &SUC n + &SUC n * sum n + sum n * &SUC n'
          closes the proof because (simplify conclusion with [REAL_EQ_LADD]).
        we know '& SUC n * sum n + sum n * &SUC n = 2 * (sum n * & SUC n)'.
        rewrite once [<- REAL_ADD_ASSOC].
        rewrite last assumption.
        rewrite with [pow_3, closed_form_sum, real_div, REAL_MUL_ASSOC].
        we know '2 * &n * (1 + &n) * inv 2 = 2 * inv 2 * & n * (1 + &n)'.
        rewrite last assumption.
        simplify conclusion with [REAL_MUL_RINV].
        we show 'n + 1 = SUC n' using (simplify conclusion).
        rewrite last assumption. simplify conclusion.
        we show '2 = (SUC (SUC 0))' using (simplify conclusion).
        rewrite last assumption. rewrite last assumption.
        rewrite with [EXP].
        we show 'SUC n = n + 1' using (simplify conclusion).
        rewrite last assumption.
        rewrite with [GSYM REAL_OF_NUM_ADD, pow_3].
        rewrite with [REAL_OF_NUM_ADD, REAL_OF_NUM_MUL, MULT_RIGHT_1,
                      RIGHT_ADD_DISTRIB, LEFT_ADD_DISTRIB, MULT_LEFT_1].
        simplify.`
    QED

Figure 7. Lassie proof that the sum of the cubes of the first 𝑛 natural numbers is the same as the square of their sum

In our final example, we show how Lassie can be integrated into larger developments by proving a soundness theorem from a library of FloVer [3]. FloVer is a verified checker for finite-precision roundoff error bounds implemented in HOL4. Its HOL4 definitions and proofs span approximately 10000 lines of code, and the interval library is one of the critical components, used in most of the soundness proofs. As the FloVer proofs are performed over real numbers, we reuse the tactic descriptions from our previous example and do not need to add additional definitions. In Figure 8 we

show that if we have an interval 𝑖𝑣 and a real number 𝑎 ∈ 𝑖𝑣, then the inverse of 𝑎 is contained in the inverse of 𝑖𝑣. This example shows that Lassie’s tactic definitions are expressive enough to build libraries of common tactic descriptions that can be shared between projects.

    Theorem interval_inversion_valid:
      ∀iv a.
        (SND iv < 0 \/ 0 < FST iv) /\ contained a iv ==>
        contained (inv a) (invertInterval iv)
    Proof
      nltac `
        introduce variables.
        case split for 'iv'.
        simplify with [contained_def, invertInterval_def].
        introduce assumptions.
        rewrite once [<- REAL_INV_1OVER].
        Next Goal.
        rewrite once [<- REAL_LE_NEG].
        we know 'a < 0'. thus 'a <> 0'.
        we know 'r < 0'. thus 'r <> 0'.
        'inv(-a) <= inv (-r) <=> (- r) <= -a' using
          (use REAL_INV_LE_AMONO THEN simplify).
        resolve with REAL_NEG_INV.
        rewrite assumptions.
        follows trivially.
        Next Goal.
        rewrite once [<- REAL_LE_NEG].
        we know 'a < 0'. thus 'a <> 0'. we know 'q <> 0'.
        resolve with REAL_NEG_INV.
        'inv (-q) <= inv (-a) <=> (-a) <= (-q)' using
          (use REAL_INV_LE_AMONO THEN simplify THEN trivial).
        rewrite assumptions. follows trivially.
        Next Goal.
        rewrite with [<- REAL_INV_1OVER].
        'inv r <= inv a <=> a <= r' using
          (use REAL_INV_LE_AMONO THEN trivial).
        follows trivially.
        Next Goal.
        rewrite with [<- REAL_INV_1OVER].
        'inv a <= inv q <=> q <= a' using
          (use REAL_INV_LE_AMONO THEN trivial).
        follows trivially.`
    QED

Figure 8. Soundness of FloVer’s interval inversion in Lassie

We have used Lassie to write a new tutorial for HOL4 with the goal of decoupling the learning of the basic structure of formal proofs from the particular syntax and tactic names of HOL4, and by this easing the learning curve. Our tutorial


is based on the existing HOL4 tutorial [32] and the HOL4 emacs interaction guide.

First, the new HOL4 user uses nltac and the Lassie tactics that we defined for our three case studies (i.e. loads them as libraries) to do the proofs. He or she can thus learn the syntax of theorems and definitions, as well as the structure of proofs, without having to also learn the often unintuitive tactic names. For example, we show the proof of the closed form for summing the first 𝑛 natural numbers from our tutorial in Figure 10. The example proof shows Lassie tactics that abstract from the tactic names, but not the theorem names. Lassie has limited support for defining descriptions of theorems, similar to how Lassie tactics are defined, which could be used when developing individual languages.

In the second step, the new HOL4 user is introduced to the HOL4 tactics using nlexplain. For instance, they can step through the proof and see the HOL4 tactics underlying each Lassie tactic. We show an example in Figure 9. The left-hand side shows the HOL4 proof state obtained by applying Lassie tactics with nlexplain, and the right-hand side the modified HOL4 REPL with the current proof goal and a partial HOL4 tactic script. The red dashed box on the left-hand side marks all Lassie tactics that have been passed to nlexplain.

Left-hand side:

    Definition sum_def:
      sum (n:num) = if n = 0 then 0 else sum (n-1) + n
    End

    Theorem closed_form_sum:
      ∀n. sumEq n = n * (n + 1) DIV 2
    Proof
      nlexplain ()
      Induction on 'n'.
      simplify with [sumEq_def]. `
      simplify with [sumEq_def, GSYM ADD_DIV_ADD_DIV].
      '2 * SUC n + n * (n + 1) = SUC n * (SUC n + 1)' suffices to show the goal.
      show 'SUC n * (SUC n + 1) = (SUC n + 1) + n * (SUC n + 1)'
        using (simplify with [MULT_CLAUSES]).
      simplify.
      show 'n * (n + 1) = SUC n * n'
        using (trivial using [MULT_CLAUSES, MULT_SYM]).
      rewrite assumptions. simplify.
    QED

Right-hand side:

    Induct_on `n`
    >- (fs [sum_def])
    >- (fs [sum_def, GSYM ADD_DIV_ADD_DIV]
        \\ ` ` suffices_by (fs [ ])
        \\

    0. sum n = n * (n + 1 DIV 2)
    ----------------------------
    2 * SUC n + n * (n + 1) = SUC n * (SUC n + 1)
    | >

Figure 9. Intermediate state of nlexplain in our tutorial

Our tutorial is split into six separate parts. We start by explaining how HOL4 (and Lassie) are installed and configured on a computer such that the tutorial can be followed interactively. Next, we explain how one interacts with HOL4 in an interactive session. The first technical section uses the proof from Figure 10 as a first example of an interactive HOL4 proof, using only nltac to perform proofs. Having introduced the reader to the basics of interactive proofs in HOL4, we show how a simple library of proofs can be developed. The library is a re-implementation of our first case study, and hence follows the structure of the original HOL4 tutorial. It spans a total of two definitions and 13 theorems. For each of the theorems we show a proof using nltac. Only after these introductory sections, where a user will have already gained an intuition both about how one interacts with the HOL4 REPL and about how proofs are stored in reusable theories, does the next section introduce nlexplain and explain how HOL4 proofs are performed with plain HOL4 tactics. Finally, the tutorial concludes with some helpful tips and tricks that we have collected.

We defined the tutorial using definitions that we personally found intuitive. However, Lassie’s ability to define tactics by example allows each teacher to define their own individual language in a straightforward way.

In this section, we review approaches designed to ease the user burden when writing proofs in an ITP.

Hammers. So-called “hammers” use automated theorem provers (ATP) to discharge proof obligations by translating a proof goal into the logic of an ATP and a proof back into the logic of the interactive prover. Examples are Sledgehammer [28] for Isabelle, HolyHammer [21] for HOL4, and a hammer for Coq [8]. A general overview is given in the survey paper by Blanchette et al. [5]. Some of these use learning to predict which premises need to be sent to the ATP, in order not to overwhelm the prover. In contrast to Lassie, the main focus of such hammers is not to make the proofs more accessible but to solve simple proof obligations using a push-button method. As Lassie is open to adding custom decision procedures, we think that integrating a hammer with Lassie could provide for even richer and easier-to-define tactic languages by automating simple proofs.

    Theorem closed_form_sum:
      ∀n. sum n = (n * (n + 1)) DIV 2
    Proof
      nltac `
        Induction on 'n'.
        Goal 'sum 0 = 0 * (0 + 1) DIV 2'.
          simplify.
        End.
        Goal 'sum (SUC n) = SUC n * (SUC n + 1) DIV 2'.
          use [sum_def, GSYM ADD_DIV_ADD_DIV] to simplify.
          '2 * SUC n + n * (n + 1) = SUC n * (SUC n + 1)' suffices to show the goal.
          show 'SUC n * (SUC n + 1) = (SUC n + 1) + n * (SUC n + 1)'
            using (simplify with [MULT_CLAUSES]).
          simplify.
          show 'n * (n + 1) = SUC n * n'
            using (trivial using [MULT_CLAUSES, MULT_SYM]).
          '2 * SUC n = SUC n + SUC n' follows trivially.
          'n * (SUC n + 1) = SUC n * n + n' follows trivially.
          rewrite assumptions. simplify.
        End.`
    QED

Figure 10. Example proof of the closed form for summing 𝑛 numbers using Lassie in our HOL4 tutorial

Learning-based. While hammers try to automate the proof with the help of automated theorem provers, other systems use statistical methods to recommend tactics to the end user to finish a proof. DeepHOL [2] learns a neural network that, given a proof goal, predicts a potential next tactic in HOL Light. GamePad [19] and the work by Yang et al. [38] similarly use machine learning to predict tactics for Coq. TacticToe [13] uses A* search, guided by previous tactic-level proofs, to predict tactics in HOL4.

Programming Language-based. Languages like Eisbach [26], Ltac [10], Ltac2 [29] and Mtac2 [20] use rigorous programming language foundations to give more control to expert users when writing tactics. Eisbach and Ltac are tactic languages similar to the one of HOL4. Mtac2 formalizes “Coq in Coq”, allowing tactics to be defined as Coq programs, whereas Ltac2 is a strongly typed language for writing Coq tactics. The tactic language of the Lean theorem prover [9] additionally implements equational reasoning on top of its tactics, which allows for more textbook-like proofs. Recently, the Lean theorem prover has also been extended with a hygienic macro system [34]. A core contribution of that work is excluding unintentional capturing in tactic programming, thus making tactic programming more robust. In Lassie we did not experience any hygiene issues, as the definition by example relies on the semantic parser to do the generalization and as such keeps variable levels separate. Using any of the languages above requires all the desired generality to be stated explicitly in the tactic definition, usually in the form of function definitions. In contrast, Lassie’s definition by example makes it easier to define new tactics and generalizes automatically.

Natural Language Interfaces. Several systems provide an interface to a theorem prover that is as close as possible to natural language. Languages like Isar [36], Mizar [1], and the work by Corbineau [6] follow a similar approach to Lassie by having an extended parser. Their supported naturalized proof descriptions are fixed to the authors’ style of declarative proofs, and extending or changing these would require editing the tool code. In contrast, Lassie is extensible enough to support different tactic languages that can coexist without interfering if not loaded simultaneously.

The Naproche system [11] provides a controlled natural language, which maps natural language utterances into first-order logic proof obligations, to be checked by an (automated) theorem prover (e.g. E Prover [31]). The extensions to Alfa by Hallgren et al. [17] also use natural language processing technology to extend the Alfa proof editor with a more natural language. The book by Ganesalingam [12] gives a comprehensive explanation of the relation between natural language and mathematics. Similarly, Ranta et al. [30] provide more sophisticated linguistic techniques to translate between natural language and predicate logic. An orthogonal approach to the above is presented in the work by Coscoy et al. [7]. Instead of translating from natural language to tactics, they provide a translation from Coq proof terms to natural language. The main goal of these systems is to provide an interface that supports as much natural language as possible. A major limitation, however, is that their grammars are fixed, i.e. only the naturalized tactics implemented by the authors are available. Our work does not strive to be a full natural language interface, but in return provides an extensible grammar, which adapts to different users and proofs.


We have presented the Lassie tactic language framework for the HOL4 theorem prover. Using a semantic parser with an extensible grammar, Lassie learns individualized tactics from user-provided examples. Our example case studies show that these learned tactics can be easily reused across different proofs and can ease both the writing and reading of HOL4 proofs by providing a more intuitive, personalized interface to HOL4’s tactics.

Acknowledgments

The authors would like to thank Magnus Myreen, Zachary Tatlock, and the anonymous reviewers of ITP 2020 and CPP 2021 for providing feedback on Lassie and (initial) drafts of the paper. Gavran and Majumdar were supported in part by the DFG project 389792660 TRR 248–CPEC and by the European Research Council under the Grant Agreement 610150 (ERC Synergy Grant ImPACT).

References

[1] Grzegorz Bancerek, Czeslaw Bylinski, Adam Grabowski, Artur Kornilowicz, Roman Matuszewski, Adam Naumowicz, Karol Pak, and Josef Urban. 2015. Mizar: State-of-the-art and Beyond. In International Conference on Intelligent Computer Mathematics (CICM). https://doi.org/10.1007/978-3-319-20615-8_17
[2] Kshitij Bansal, Sarah Loos, Markus Rabe, Christian Szegedy, and Stewart Wilcox. 2019. HOList: An Environment for Machine Learning of Higher Order Logic Theorem Proving. In International Conference on Machine Learning (ICML).
[3] Heiko Becker, Nikita Zyuzin, Raphaël Monat, Eva Darulova, Magnus O Myreen, and Anthony Fox. 2018. A Verified Certificate Checker for Finite-Precision Error Bounds in Coq and HOL4. In FMCAD (Formal Methods in Computer Aided Design). https://doi.org/10.23919/FMCAD.2018.8603019
[4] Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang. 2013. Semantic Parsing on Freebase from Question-Answer Pairs. In Conference on Empirical Methods in Natural Language Processing (EMNLP).
[5] Jasmin Christian Blanchette, Cezary Kaliszyk, Lawrence C. Paulson, and Josef Urban. 2016. Hammering towards QED. Journal of Formalized Reasoning 9, 1 (2016). https://doi.org/10.6092/issn.1972-5787/4593
[6] Pierre Corbineau. 2007. A Declarative Language for the Coq Proof Assistant. In International Workshop on Types for Proofs and Programs (TYPES). https://doi.org/10.1007/978-3-540-68103-8_5
[7] Yann Coscoy, Gilles Kahn, and Laurent Théry. 1995. Extracting Text from Proofs. In International Conference on Typed Lambda Calculi and Applications (TLCA). https://doi.org/10.1007/BFb0014048
[8] Łukasz Czajka and Cezary Kaliszyk. 2018. Hammer for Coq: Automation for dependent type theory. Journal of Automated Reasoning. https://doi.org/10.1007/s10817-018-9458-4
[9] Leonardo Mendonça de Moura, Soonho Kong, Jeremy Avigad, Floris van Doorn, and Jakob von Raumer. 2015. The Lean Theorem Prover (System Description). In International Conference on Automated Deduction (CADE). https://doi.org/10.1007/978-3-319-21401-6_26
[10] David Delahaye. 2000. A Tactic Language for the System Coq. In International Conference on Logic for Programming Artificial Intelligence and Reasoning (LPAR). https://doi.org/10.1007/3-540-44404-1_7
[11] Steffen Frerix and Peter Koepke. 2019. Making Set Theory Great Again: The Naproche-SAD Project. Conference on Artificial Intelligence and Theorem Proving (AITP) (2019).
[12] Mohan Ganesalingam. 2013. The Language of Mathematics - A Linguistic and Philosophical Investigation. Lecture Notes in Computer Science, Vol. 7805. Springer. https://doi.org/10.1007/978-3-642-37012-0
[13] Thibault Gauthier, Cezary Kaliszyk, Josef Urban, Ramana Kumar, and Michael Norrish. 2020. TacticToe: Learning to Prove with Tactics. Journal of Automated Reasoning (2020).
[14] Georges Gonthier. 2008. Formal proof–the four-color theorem. Notices of the AMS 55, 11 (2008).
[15] Georges Gonthier and Assia Mahboubi. 2010. An introduction to small scale reflection in Coq. Journal of Formalized Reasoning 3, 2 (2010). https://doi.org/10.6092/issn.1972-5787/1979
[16] Thomas C. Hales. 2006. Introduction to the Flyspeck Project. In Mathematics, Algorithms, Proofs.
[17] Thomas Hallgren and Aarne Ranta. 2000. An Extensible Proof Text Editor. In International Conference on Logic for Programming and Automated Reasoning (LPAR). https://doi.org/10.1007/3-540-44404-1_6
[18] John Harrison. 2009. HOL light: An overview. In International Conference on Theorem Proving in Higher Order Logics (TPHOL).
[19] Daniel Huang, Prafulla Dhariwal, Dawn Song, and Ilya Sutskever. 2019. GamePad: A Learning Environment for Theorem Proving. In International Conference on Learning Representations (ICLR).
[20] Jan-Oliver Kaiser, Beta Ziliani, Robbert Krebbers, Yann Régis-Gianas, and Derek Dreyer. 2018. Mtac2: typed tactics for backward reasoning in Coq. Proc. ACM Program. Lang. 2, ICFP (2018), 78:1–78:31. https://doi.org/10.1145/3236773
[21] Cezary Kaliszyk and Josef Urban. 2014. Learning-Assisted Automated Reasoning with Flyspeck. Journal of Automated Reasoning 53, 2 (2014). https://doi.org/10.1007/s10817-014-9303-3
[22] Yong Kiam Tan, Magnus O. Myreen, Ramana Kumar, Anthony Fox, Scott Owens, and Michael Norrish. 2019. The verified CakeML compiler backend. Journal of Functional Programming 29 (2019). https://doi.org/10.1017/S0956796818000229
[23] Gerwin Klein, Kevin Elphinstone, Gernot Heiser, June Andronick, David Cock, Philip Derrin, Dhammika Elkaduwe, Kai Engelhardt, Rafal Kolanski, Michael Norrish, et al. 2009. seL4: Formal verification of an OS kernel. In ACM Symposium on Operating Systems Principles (SOSP). https://doi.org/10.1145/1629575.1629596
[24] Xavier Leroy. 2009. Formal Verification of a Realistic Compiler. Commun. ACM 52, 7 (2009). https://doi.org/10.1145/1538788.1538814
[25] Percy Liang. 2016. Learning executable semantic parsers for natural language understanding. Commun. ACM 59, 9 (2016). https://doi.org/10.1145/2866568
[26] Daniel Matichuk, Toby C. Murray, and Makarius Wenzel. 2016. Eisbach: A Proof Method Language for Isabelle. Journal of Automated Reasoning 56, 3 (2016). https://doi.org/10.1007/s10817-015-9360-2
[27] Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. 2002. Isabelle/HOL - A Proof Assistant for Higher-Order Logic. Lecture Notes in Computer Science, Vol. 2283. Springer. https://doi.org/10.1007/3-540-45949-9
[28] Lawrence C. Paulson and Kong Woei Susanto. 2007. Source-Level Proof Reconstruction for Interactive Theorem Proving. In International Conference on Theorem Proving in Higher Order Logics (TPHOL). https://doi.org/10.1007/978-3-540-74591-4_18
[29] Pierre-Marie Pédrot. 2019. Ltac2: Tactical Warfare. CoqPL 2019 (2019).
[30] Aarne Ranta. 2011. Translating between Language and Logic: What Is Easy and What Is Difficult. In International Conference on Automated Deduction (CADE). https://doi.org/10.1007/978-3-642-22438-6_3
[31] Stephan Schulz. 2013. System Description: E 1.8. In International Conference on Logic for Programming, Artificial Intelligence and Reasoning (LPAR). https://doi.org/10.1007/978-3-642-45221-5_49
[32] Konrad Slind and Michael Norrish. 2008. A Brief Overview of HOL4. In International Conference on Theorem Proving in Higher Order Logics (TPHOL). https://doi.org/10.1007/978-3-540-71067-7_6
[33] Alexey Solovyev and Thomas C. Hales. 2013. Formal Verification of Nonlinear Inequalities with Taylor Interval Approximations. In NASA Formal Methods Symposium (NFM). https://doi.org/10.1007/978-3-642-38088-4_26
[34] Sebastian Ullrich and Leonardo de Moura. 2020. Beyond Notations: Hygienic Macro Expansion for Theorem Proving Languages. In International Joint Conference on Automated Reasoning (IJCAR). https://doi.org/10.1007/978-3-030-51054-1_10
[35] Sida I. Wang, Samuel Ginn, Percy Liang, and Christopher D. Manning. 2017. Naturalizing a Programming Language via Interactive Learning. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL. https://doi.org/10.18653/v1/P17-1086
[36] Markus Wenzel. 1999. Isar - A Generic Interpretative Approach to Readable Formal Proof Documents. In International Conference on Theorem Proving in Higher Order Logics (TPHOL). https://doi.org/10.1007/3-540-48256-3_12
[37] Markus Wenzel and Lawrence C. Paulson. 2006. Isabelle/Isar. In The Seventeen Provers of the World. Lecture Notes in Computer Science, Vol. 3600. Springer, 41–49. https://doi.org/10.1007/11542384_8
[38] Kaiyu Yang and Jia Deng. 2019. Learning to Prove Theorems via Interacting with Proof Assistants. In