LeoPARD --- A Generic Platform for the Implementation of Higher-Order Reasoners

Abstract

LeoPARD supports the implementation of knowledge representation and reasoning tools for higher-order logic(s). It combines a sophisticated data structure layer (polymorphically typed λ-calculus with nameless spine notation, explicit substitutions, and perfect term sharing) with an ambitious multi-agent blackboard architecture (supporting prover parallelism at the term, clause, and search level). Further features of LeoPARD include a parser for all TPTP dialects, a command line interpreter, and generic means for the integration of external reasoners.


Max Wisniewski, Alexander Steen and Christoph Benzmüller

Dept. of Mathematics and Computer Science, Freie Universität Berlin, Germany
max.wisniewski|a.steen|[email protected]


LeoPARD (Leo's Parallel ARchitecture and Datastructures) is designed as a generic system platform for implementing higher-order (HO) logic based knowledge representation and reasoning tools. In particular, LeoPARD provides the base layer of the new HO automated theorem prover (ATP) Leo-III, the successor of the well-known provers LEO-I [4] and LEO-II [7].

Previous experiments with LEO-I and the OAnts mechanism [5] indicate that a flexible, multi-agent blackboard architecture is well suited for automating HO logic [6]. However, due to project constraints, such an approach was not realized in LEO-II. Instead, the focus was on the proof search layer in combination with a simple, sequential collaboration with an external first-order (FO) ATP. LEO-II also provides improved term data structures, term indexing, and term sharing mechanisms, which unfortunately have not been optimally exploited at the clause and the proof search layer. For the development of Leo-III, the philosophy has therefore been to allocate sufficient resources to the initial development of a flexible and reusable system platform. The goal has been to bundle, improve, and extend the features with the highest potential of the predecessor systems LEO-I, LEO-II and OAnts.

The result of this initiative is LeoPARD, which is written in Scala and currently consists of approx. 13000 lines of code. LeoPARD combines a sophisticated data structure layer [21] (polymorphically typed λ-calculus with nameless spine notation, explicit substitutions, and perfect term sharing) with a multi-agent blackboard architecture [25] (supporting prover parallelism at the term, clause, and search level) and further tools including a parser for all TPTP [22,23] syntax dialects, generic support for interfacing with external reasoners, and a command line interpreter. Such a combination of features and support tools is, to the best of the authors' knowledge, not matched in related HO reasoning frameworks.

The intended users of the LeoPARD package are implementors of HO knowledge representation and reasoning systems, including novel ATPs and model finders. In addition, we advocate the system as a platform for the integration and coordination of heterogeneous (external) reasoning tools.

This work has been supported by the DFG under grant BE 2501/11-1 (Leo-III). The final publication is available at http://link.springer.com. LeoPARD can be downloaded at: https://github.com/cbenzmueller/LeoPARD.git.

Data structure choices are a critical part of a theorem prover and permit reliable increases of overall performance when implemented and exploited properly. Key aspects for efficient theorem proving have been an intensive research topic and have reached maturity within FO-ATPs [19,20]. Naturally, one would expect an even higher impact of the data structure choices in HO-ATPs. However, in the latter context, comparably little effort has been invested yet, probably also because of the inherently more complex nature of HO logic.

Term Language. The LeoPARD term language extends the simply typed λ-calculus with parametric polymorphism, yielding the second-order polymorphically typed λ-calculus (corresponding to λ2 in the λ-cube [3]). In particular, the system under consideration was independently developed by Reynolds [16] and Girard [14] and is commonly called System F today. Further extensions, for example to admit dependent types, are future work. Thus, LeoPARD supports the following type and term language:

$$
\begin{array}{rcll}
\tau, \nu & ::= & t \in T & \text{(Base type)} \\
 & \mid & \alpha & \text{(Type variable)} \\
 & \mid & \tau \to \nu & \text{(Abstraction type)} \\
 & \mid & \forall \alpha.\, \tau & \text{(Polymorphic type)} \\[4pt]
s, t & ::= & X_\tau \in V_\tau \mid c_\tau \in \Sigma & \text{(Variable / Constant)} \\
 & \mid & (\lambda x_\tau.\, s_\nu)_{\tau \to \nu} \mid (s_{\tau \to \nu}\; t_\tau)_\nu & \text{(Term abstr. / appl.)} \\
 & \mid & (\Lambda \alpha.\, s_\tau)_{\forall \alpha.\, \tau} \mid (s_{\forall \alpha.\, \tau}\; \nu)_{\tau[\alpha/\nu]} & \text{(Type abstr. / appl.)}
\end{array}
$$

An example term of this language is:

$$\Lambda \alpha.\, \lambda P_{\alpha \to o}.\, \big( (f_{\forall \beta.\, (\beta \to o) \to o \to o}\; \alpha)\; (\lambda Y_\alpha.\, P\, Y) \big)\; T_o$$

Nameless Representation. Internally, LeoPARD employs a locally nameless representation (both at the type and term level) that extends de-Bruijn indices to (bound) type variables [15]. The definition of de-Bruijn indices [11] for type variables is analogous to the one for term variables. Thus, the above example term is represented namelessly as

$$\Lambda.\, \lambda_{\underline{1} \to o}.\, \big( (f_{\forall.\, (\underline{1} \to o) \to o \to o}\; \underline{1})\; (\lambda_{\underline{1}}.\, 2\; 1) \big)\; T_o$$

where de-Bruijn indices for type variables are underlined.

Spine Notation and Explicit Substitutions. On top of nameless terms, LeoPARD employs spine notation [12] and explicit substitutions [1]. The first technique allows quick head symbol queries and efficient left-to-right traversal, e.g. for unification algorithms. The latter augments the calculus with substitution closures that admit efficient (partial) β-normalization runs. Internally, the above example reads

$$\Lambda.\, \lambda_{\underline{1} \to o}.\, f_{\forall.\, (\underline{1} \to o) \to o \to o} \cdot (\underline{1};\; \lambda_{\underline{1}}.\, 2 \cdot (1);\; T_o)$$

where · combines function heads with argument lists (spines), in which ; denotes concatenation of arguments.

Term Sharing/Indexing. Terms are perfectly shared within LeoPARD, meaning that each term is only constructed once and then reused between different occurrences. This not only reduces memory consumption in large knowledge bases, but also allows constant-time term comparison for syntactic equality using the term's pointer to its unique physical representation. For fast (sub-)term retrieval based on syntactical criteria (e.g. head symbols, subterm occurrences, etc.) from the term indexing mechanism, terms are kept in β-normal η-long form.

Suite of Normalization Strategies. LeoPARD comes with a number of different (heuristic) β-normalization strategies that adjust the standard leftmost-outermost strategy with different combinations of strict and lazy substitution composition resp. normalization and closure construction. η-normalization is invariant wrt. β-normalization of spine terms and hence η-normalization (to long form) is applied only once for each freshly created term.

Evaluation and Findings. A recent empirical evaluation [21] has shown that there is no single best reduction strategy for HO-ATPs. More precisely, for different TPTP problem categories this study identified different best reduction strategies. This motivates future work in which machine learning techniques could be used to suggest suitable strategies.
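The interplay of nameless terms, spine notation and perfect term sharing described in this section can be illustrated with a minimal sketch. It is written in Python for brevity rather than in LeoPARD's Scala, it omits types, type abstraction and explicit substitutions, and all names (`TermBank`, `Root`, `lam`, and so on) are hypothetical and not taken from the LeoPARD code base. Sharing is realized here by hash-consing, a common way to obtain one physical representation per term; whether LeoPARD uses exactly this mechanism is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union


@dataclass(frozen=True, eq=False)  # eq=False: equality is object identity
class Bound:
    index: int  # de Bruijn index of a bound term variable


@dataclass(frozen=True, eq=False)
class Const:
    name: str


@dataclass(frozen=True, eq=False)
class Abs:
    body: "Term"  # nameless: the binder carries no variable name


@dataclass(frozen=True, eq=False)
class Root:
    head: "Term"               # head symbol, readable without any traversal
    spine: Tuple["Term", ...]  # arguments, stored left to right


Term = Union[Bound, Const, Abs, Root]


class TermBank:
    """Hypothetical term factory that constructs every term at most once."""

    def __init__(self) -> None:
        self._table: Dict[tuple, Term] = {}

    def __len__(self) -> int:
        return len(self._table)

    def _share(self, key: tuple, build: Callable[[], Term]) -> Term:
        term = self._table.get(key)
        if term is None:
            term = build()
            self._table[key] = term
        return term

    # Subterms are already shared, so their identity serves as the lookup key.
    def bound(self, index: int) -> Term:
        return self._share(("bound", index), lambda: Bound(index))

    def const(self, name: str) -> Term:
        return self._share(("const", name), lambda: Const(name))

    def lam(self, body: Term) -> Term:
        return self._share(("abs", id(body)), lambda: Abs(body))

    def root(self, head: Term, *args: Term) -> Term:
        key = ("root", id(head), tuple(id(a) for a in args))
        return self._share(key, lambda: Root(head, tuple(args)))


def example(bank: TermBank) -> Term:
    """Untyped analogue of the running example: lambda. f . (lambda. 2 . (1); T)."""
    inner = bank.lam(bank.root(bank.bound(2), bank.bound(1)))
    return bank.lam(bank.root(bank.const("f"), inner, bank.const("T")))


bank = TermBank()
first, second = example(bank), example(bank)
same_object = first is second  # syntactic equality is a pointer comparison
```

Building the example term a second time allocates nothing new, and the head symbol `f` is reachable as `first.body.head` without walking the argument list, which is the property the spine notation is meant to provide.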

In addition to supporting classical, sequential theorem proving procedures, LeoPARD offers means for breaking the global ATP loop down into a set of subtasks that can be computed in parallel. This also includes support for subprover parallelism as successfully employed, for example, in Isabelle/HOL's Sledgehammer tool [8]. More generally, LeoPARD is designed to enable parallelism at various levels inside an ATP, including the term, clause, and search level [9]. For this, LeoPARD provides a flexible multi-agent blackboard architecture.

Blackboard Architecture. Process communication in LeoPARD is realized indirectly via a blackboard architecture [24]. The LeoPARD blackboard [25] is a collection of globally shared and accessible data structures which any process, i.e. agent, can query and manipulate at any time in parallel. From the blackboard's perspective each process is a specialist responsible for exactly one kind of problem. The blackboard is generic in the data structures, i.e. it allows the programmer to add various kinds of data structures for any kind of data. Insertion into the data structures is handled by the blackboard. Hence, each specialist can indeed be specialized on a single data structure.

The LeoPARD blackboard mechanism and associated data structures provide specific support for nested and-or search trees, meaning that sets of formulae can be split into (nested) and-or contexts. Moreover, for each supercontext, the respective TPTP SZS status [22] information is automatically inferred from the statuses of its subcontexts.
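The status inference over nested and-or contexts might look as follows. This is only a sketch under stated assumptions: the text does not spell out the propagation rule, so the rule used here (an and-context is a theorem if all subcontexts are, an or-context if at least one is, everything else stays unknown) is an assumed simplification restricted to the two SZS values `Theorem` and `Unknown`. The `Context` class and `infer_status` function are hypothetical names, written in Python rather than LeoPARD's Scala.

```python
from dataclasses import dataclass, field
from typing import List

THEOREM, UNKNOWN = "Theorem", "Unknown"  # two values of the SZS status ontology


@dataclass
class Context:
    kind: str                                   # "and", "or", or "leaf"
    children: List["Context"] = field(default_factory=list)
    status: str = UNKNOWN                       # only consulted for leaves


def infer_status(ctx: Context) -> str:
    """Derive the status of a supercontext from its subcontexts (assumed rule)."""
    if ctx.kind == "leaf":
        return ctx.status
    results = [infer_status(child) for child in ctx.children]
    if ctx.kind == "and":
        # Every conjunctive branch has to be closed.
        solved = bool(results) and all(r == THEOREM for r in results)
    elif ctx.kind == "or":
        # One successful alternative is enough.
        solved = any(r == THEOREM for r in results)
    else:
        raise ValueError(f"unknown context kind: {ctx.kind}")
    return THEOREM if solved else UNKNOWN


# An and-context with one solved leaf and one or-context of two alternatives.
alternatives = Context("or", [Context("leaf"), Context("leaf", status=THEOREM)])
search_tree = Context("and", [Context("leaf", status=THEOREM), alternatives])
overall = infer_status(search_tree)
```

A full treatment would have to cover the remaining SZS values (for instance counter-satisfiability), which this sketch deliberately leaves out.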

Agents. In LeoPARD, specialist processes can be modeled as agents [25]. Classically, agents are composed of three components: environment perception, decision making, and action execution [24].

The perception of LeoPARD agents is trigger-based, meaning that each agent is notified of a change in the blackboard. LeoPARD agents are to be seen as homomorphisms on the blackboard data together with a filter determining when to apply an action. Depending on the perceived change of the respective state of the blackboard, an agent decides on an action it wants to execute.
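The trigger-based perception described above, with an agent given by a filter plus an action on blackboard data, can be sketched as follows. The sketch is in Python rather than LeoPARD's Scala, runs sequentially instead of in parallel, and uses hypothetical names throughout (`Blackboard`, `Agent`, `applies`, `action`, the `"formulas"` store); it is not LeoPARD's interface. Formulas are modeled as nested tuples purely for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Agent:
    name: str
    store: str                                   # data structure the agent specializes on
    applies: Callable[[Any], bool]               # filter: should this change trigger the agent?
    action: Callable[["Blackboard", Any], None]  # what the agent does with the changed item


class Blackboard:
    def __init__(self) -> None:
        self.stores: Dict[str, List[Any]] = {}
        self.agents: List[Agent] = []
        self.pending: List[Tuple[Agent, Any]] = []  # triggered but not yet executed

    def add_store(self, name: str) -> None:
        self.stores[name] = []

    def register(self, agent: Agent) -> None:
        self.agents.append(agent)

    def insert(self, store: str, item: Any) -> None:
        """Insertion goes through the blackboard, which notifies interested agents."""
        self.stores[store].append(item)
        for agent in self.agents:
            if agent.store == store and agent.applies(item):
                self.pending.append((agent, item))

    def run(self) -> None:
        """Execute triggered actions one by one (a real scheduler would choose among them)."""
        while self.pending:
            agent, item = self.pending.pop(0)
            agent.action(self, item)


def is_double_negation(formula: Any) -> bool:
    return (isinstance(formula, tuple) and formula[0] == "not"
            and isinstance(formula[1], tuple) and formula[1][0] == "not")


def strip_double_negation(board: Blackboard, formula: Any) -> None:
    board.insert("formulas", formula[1][1])


board = Blackboard()
board.add_store("formulas")
board.register(Agent("simp", "formulas", is_double_negation, strip_double_negation))
board.insert("formulas", ("not", ("not", ("not", ("not", "p")))))
board.run()
```

Each insertion made by the agent triggers the filter again, so the quadruple negation is reduced in two steps until the filter no longer applies and the pending list is empty.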

Auction Scheduler. Action execution in LeoPARD is coordinated by an auction-based scheduler, which implements its own approximation algorithm [25] for combinatorial auctions [2]. More precisely, each LeoPARD agent computes and places a bid for the execution of its action(s). The auction-based scheduler then tries to maximize the global benefit of the set of actions it chooses.

This selection mechanism works uniformly for all agents that can be implemented in LeoPARD. Balancing the value of the actions is therefore crucial for the performance and the termination of the overall system. A possible generic solution for the agents' bidding is to apply machine learning techniques to optimize the bids for the best overall performance. This is future work.

Note that the use of advanced agent technology in LeoPARD is optional. A traditional ATP can still be implemented, for example, as a single, sequential reasoner instantiating exactly one agent in the LeoPARD framework.
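To make the idea of auction-based action selection concrete, the following sketch uses a simple greedy heuristic for a combinatorial auction. It is explicitly not the approximation algorithm of [25], whose details are not given here. The conflict model is an assumption as well: each bid names the blackboard items its action needs exclusively, and two actions can run together only if these sets are disjoint. The names `Bid` and `schedule` and the item labels are hypothetical; the code is Python, not LeoPARD's Scala.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Bid:
    agent: str
    items: FrozenSet[str]  # blackboard items the action needs exclusively (assumed model)
    value: float           # the agent's bid for executing its action


def schedule(bids: List[Bid]) -> List[Bid]:
    """Greedily pick non-conflicting bids, preferring a high value per claimed item."""
    ranked = sorted(bids, key=lambda b: b.value / max(len(b.items), 1), reverse=True)
    chosen: List[Bid] = []
    taken: Set[str] = set()
    for bid in ranked:
        if taken.isdisjoint(bid.items):  # no item is claimed by an earlier winner
            chosen.append(bid)
            taken |= bid.items
    return chosen


bids = [
    Bid("simplification", frozenset({"clause1"}), 3.0),
    Bid("paramodulation", frozenset({"clause1", "clause2"}), 8.0),
    Bid("skolemization", frozenset({"clause3"}), 2.0),
]
winners = [bid.agent for bid in schedule(bids)]
```

In this example the paramodulation and skolemization bids are selected, while the simplification bid loses because it competes for `clause1`. A greedy choice like this is fast but can miss the optimal allocation, which is one reason the balance of bid values matters for overall performance.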

Agent Implementation Examples. For illustration purposes, some example agent implementations have been included in the LeoPARD package. For example, simple agents for simplification, skolemization, prenex-form, negation-normal-form and paramodulation are provided. Moreover, the agent-based integration of external ATPs is demonstrated and their parallelization is enabled by the LeoPARD agent framework. This includes agents embodying LEO-II and Satallax [10] running remotely on the SystemOnTPTP [22] servers in Miami. These example agents can be easily adapted for other TPTP compliant ATPs.

Each example agent comes with an applicability filter, an action definition and an auction value computation. The provided agents suffice to illustrate the working principles of the LeoPARD multi-agent blackboard architecture to interested implementors. After the official release of Leo-III, further, more sophisticated agents will be included and offered for academic reuse.

The LeoPARD framework provides further useful components. For example, a generic parser is provided that supports all TPTP syntax dialects. Moreover, a command line interpreter supports fine-grained interaction with the system. This is useful not only for debugging but also for training and demonstration purposes. As pointed out above, useful support is also provided for the integration of external reasoners based on the TPTP infrastructure. This also includes comprehensive support for the TPTP SZS result ontology. Moreover, ongoing and future work aims at generic means for the transformation and integration of (external) proof protocols, ideally by exploiting results of projects such as ProofCert.

There is comparably little related work to LeoPARD, since higher-order theorem provers typically implement their own data structures. Related systems (mostly concerning term representation) include λProlog and Teyjus [17], the Abella interactive theorem prover [13], and the logical framework Twelf [18].

Acknowledgements. We thank the reviewers for their valuable feedback. Moreover, we thank Tomer Libal and the students of the Leo-III project for their contributions to LeoPARD.

References

1. M. Abadi, L. Cardelli, P.-L. Curien, and J.-J. Levy. Explicit substitutions. In Proc. of the 17th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '90, pages 31–46, New York, NY, USA, 1990. ACM.
2. K.J. Arrow. Social Choice and Individual Values. Wiley, New York, 1951.
3. H. P. Barendregt. Introduction to Generalized Type Systems. J. of Functional Programming, 1(2):125–154, 1991.
4. C. Benzmüller and M. Kohlhase. LEO – A Higher-Order Theorem Prover. In Proc. of CADE-15, number 1421 in LNCS, pages 139–143. Springer, 1998.
5. C. Benzmüller and V. Sorge. OANTS – Combining Interactive and Automated Theorem Proving. In M. Kerber and M. Kohlhase, editors, Symbolic Computation and Automated Reasoning, pages 81–97. A.K.Peters, 2001.
6. C. Benzmüller, V. Sorge, M. Jamnik, and M. Kerber. Combined Reasoning by Automated Cooperation. J. of Applied Logic, 6(3):318–342, 2008.
7. C. Benzmüller, F. Theiss, L. Paulson, and A. Fietzke. LEO-II - A Cooperative Automatic Theorem Prover for Higher-Order Logic (system description). In Proc. of IJCAR 2008, volume 5195 of LNCS, pages 162–170. Springer, 2008.
8. J. Blanchette, S. Böhme, and L. Paulson. Extending Sledgehammer with SMT Solvers. Journal of Automated Reasoning, 51(1):109–128, 2013.
9. M.P. Bonacina. A Taxonomy of Parallel Strategies for Deduction. Annals of Mathematics and Artificial Intelligence, 29(1–4):223–257, 2000.
10. C.E. Brown. Satallax: An Automatic Higher-Order Prover. In Automated Reasoning, volume 7364 of LNCS, pages 111–117. Springer Berlin Heidelberg, 2012.
11. N.G. De Bruijn. Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church-Rosser theorem. INDAG. MATH, 34:381–392, 1972.
12. I. Cervesato and F. Pfenning. A linear spine calculus. J. Logic and Computation, 13(5):639–688, 2003.
13. A. Gacek. The Abella interactive theorem prover (system description). In Proc. Automated Reasoning, IJCAR 2008, Sydney, Australia, pages 154–161, 2008.
14. J.Y. Girard. Interprétation fonctionnelle et élimination des coupures de l'arithmétique d'ordre supérieur. PhD thesis, Paris VII, 1972.
15. A. J. Kfoury, S. Ronchi Della Rocca, J. Tiuryn, and P. Urzyczyn. Alpha-conversion and typability. Inf. Comput., 150(1):1–21, 1999.
16. J. C. Reynolds. Towards a theory of type structure. In Symposium on Programming, volume 19 of LNCS, pages 408–423. Springer, 1974.
17. C. Liang and D. Mitchell. System Description: Teyjus - A Compiler and Abstract Machine Based Implementation of λProlog. In Automated Deduction, CADE-16, volume 1632 of LNCS, pages 287–291. Springer Berlin Heidelberg, 1999.
18. F. Pfenning and C. Schürmann. System description: Twelf - A Meta-Logical Framework for Deductive Systems. In Automated Deduction, CADE-16, Trento, Italy, July 7-10, 1999, Proceedings, pages 202–206, 1999.
19. A. Riazanov. Implementing an efficient theorem prover. PhD thesis, University of Manchester, 2003.
20. R. Sekar, I. V. Ramakrishnan, and A. Voronkov. Term Indexing. In Handbook of Automated Reasoning, pages 1853–1964. Elsevier Science Publishers B. V., Amsterdam, The Netherlands, 2001.
21. A. Steen. Efficient Data Structures for Automated Theorem Proving in Expressive Higher-Order Logics. Master's thesis, Freie Universität Berlin, 2014. http://userpage.fu-berlin.de/~lex/drop/steen_datastructures.pdf
22. G. Sutcliffe. The TPTP Problem Library and Associated Infrastructure. J. Automated Reasoning, 43(4):337–362, 2009.
23. G. Sutcliffe and C. Benzmüller. Automated Reasoning in Higher-Order Logic using the TPTP THF Infrastructure. J. Formalized Reasoning, 3(1):1–27, 2010.
24. Gerhard Weiss, editor. Multiagent Systems. MIT Press, 2013.
25. M. Wisniewski. Agent-based Blackboard Architecture for a Higher-Order Theorem Prover. Master's thesis, Freie Universität Berlin, 2014. http://userpage.fu-berlin.de/~lex/drop/wisniewski_architecture.pdf

ProofCert project page: see https://team.inria.fr/parsifal/proofcert/.