Theorem Proving and Algebra

Joseph A. Goguen (1941–2006)
Department of Computer Science and Engineering
University of California, San Diego
9500 Gilman Drive, La Jolla CA 92093-0114 USA

April 2006

© Joseph A. Goguen, 1990–2006.

Edited by:
Kokichi Futatsugi
Narciso Martí-Oliet
José Meseguer

Text reviewed and commented by:
Kokichi Futatsugi
Daniel Mircea Gaina
Narciso Martí-Oliet
José Meseguer
Masaki Nakamura
Miguel Palomino

Book designed and typeset by:
Alberto Verdejo

Contents
Foreword by Editors xiii

1 Introduction 1
Logic for Foundations versus Logic for Applications 2
1.4 Why Equational Logic and Algebra? 3
1.5 What Kind of Algebra? 4
1.6 Term Rewriting 5
1.7 Logical Systems and Proof Scores 5
1.8 Semantics and Soundness 6
1.9 Loose versus Standard Semantics 7
1.10 Human Interface Design 8
1.11 OBJ 9
1.12 Some History and Related Work 10
1.13 Using this Text 12
1.13.1 Synopsis 13
1.13.2 Novel Features 13
1.14 Acknowledgements 14
(⋆) Parse 52
3.8 Literature 55
(⋆) An Alternative Congruence Rule 72
4.5.2 Discussion 73
4.6 Deduction using OBJ 73
4.7 Two More Rules of Deduction 76
4.8 Conditional Deduction and its Completeness 77
4.8.1 Deduction with Conditional Equations in OBJ3 79
4.9 Conditional Subterm Replacement 81
4.10 (⋆) Specification Equivalence 83
4.11 (⋆) A More Abstract Formulation of Deduction 88
4.12 Literature 89
(⋆) Noetherian Orderings 141
5.8.4 Proving Church-Rosser 149
5.9 (⋆) Relation between Abstract and Term Rewriting Systems 152
5.10 Literature 154
(⋆) Initial Horn Models 250
8.3 First-Order Logic 253
10 Order-Sorted Algebra and Term Rewriting 319
11 Generic Modules 365
12 Unification 367
13 Hidden Algebra 369
14 A General Framework (Institutions) 371
A OBJ3 Syntax and Usage 373
B Exiled Proofs 379
B.1 Many-Sorted Algebra 379
B.2 Rewriting 381
B.2.1 (⋆) Orthogonal Term Rewriting Systems 384
B.3 Rewriting Modulo Equations 389
B.4 First-Order Logic 389
B.5 Order-Sorted Algebra 390
C Some Background on Relations 397
C.1 OBJ Theories for Relations 401
D Social Implications 405
Bibliography 407
Editors’ Notes 423
List of Figures

k = F(A) 154
6.1 Proof for Uniqueness of Quotients 164
A cmos Transistor: n on Left, p on Right 214
7.4 A cmos not Gate 220
7.5 A cmos xor Gate 222
7.6 A cmos nor Gate 223
A cmos Cell 226
8.1 A Ripple Carry Adder 284
9.1 Series Connected Inverters 314
9.2 Parity of a Bit Stream 316
10.1 Visualizing Regularity 324
10.2 Condition (2) of Universal Property of Quotient 337
10.3 Subsort Structure for Number System 355
B.1 Factorization of θ 393

Foreword by Editors

Two of us, Futatsugi and Meseguer, had the privilege of working closely with Joseph Goguen, were influenced by his very creative and fundamental ideas, and, on the occasion of the Festschrift organized in his honor for his 65th birthday in San Diego, California, we wrote:
Joseph Goguen is one of the most prominent computer scientists worldwide. His numerous research contributions span many topics and have changed the way we think about many concepts. Our views about data types, programming languages, software specification and verification, computational behavior, logics in computer science, semiotics, interface design, multimedia, and consciousness, to mention just some of the areas, have all been enriched in fundamental ways by his ideas.
Sadly, Joseph Goguen's life was cut short by a fatal illness a few days after the Festschrift Symposium in his honor, which he was still able to attend. He was at that time working on Theorem Proving and Algebra (TPA), a long-term project still unfinished, yet quite advanced. The TPA book provides the definitive mathematical foundation for algebraic theorem proving. Professor Goguen also presents formal methods with the OBJ language system, which is a unique and important feature of the book.

We are convinced that Joseph Goguen's ideas in the TPA book have a fundamental and lasting value and should be made available to the research community. Furthermore, as we explain below, they have influenced subsequent work in several algebraic languages originating in the OBJ language on which two of us, Futatsugi and Meseguer, worked closely with Joseph Goguen, namely, CafeOBJ and Maude. However, the TPA book should be his book, with no efforts to complete parts of the manuscript that were unfinished or in any way modify its contents.

Our approach to this task has been the usual one in editing any part of the nachlass of a scholar: to make only small corrections of typos or small mistakes that, clearly, the author would himself have wished to be done; and to add a few explanatory editorial notes (clearly marked as such, and different from the text itself) to help the reader better understand some specific points in the text: again, making the best guess possible about what the author himself might have wished to add as explanations, given the unfinished nature of the text. Both the small corrections and the editorial notes are based on careful revisions of the original text by the editors with the additional help of Daniel Mircea Gaina, Masaki Nakamura, and Miguel Palomino.
Impact on CafeOBJ and Maude
CafeOBJ (https://cafeobj.org/) and Maude (https://maude.cs.uiuc.edu/) are two sibling languages of OBJ which draw significant inspiration from Joseph Goguen's work presented in the TPA book. In what follows we explain several ways in which the ideas in the TPA book have influenced further developments in both CafeOBJ and Maude.
CafeOBJ inherits from OBJ distinctive features such as user-defined mix-fix syntax, subtyping by ordered sorts, a module system with parameterized module expressions, conditional rewriting with associative/commutative matching, loose and tight (or initial) semantics for modules, and theorem proving with proof scores.

CafeOBJ adds to OBJ new features such as behavioural (or observational) abstraction with hidden algebra, rewriting logic à la Maude for specifying transition systems, and their combinations with order-sorted algebra. This multiparadigm approach has a mathematical semantics based on multiple institutions. Some theorem-proving capabilities are also added, including behavioral rewriting, observational coinduction, and built-in search predicates.

Transition systems can be specified with observational abstraction or with rewriting rules in CafeOBJ. The observational style is more abstract/algebraic, and there is no need to determine the state configurations that are instead needed to define state transitions via rewriting rules. The built-in search predicates facilitate verification of rewriting-based transition systems. Both styles have their own merits and it is worthwhile to support both in CafeOBJ.

In the TPA book's introduction (1 Introduction) Professor Goguen states:
We do not pursue the lofty goal of mechanizing proofs like those of which mathematicians are justly so proud; instead, we seek to take steps towards providing mechanical assistance for proofs that are useful for computer scientists in developing software and hardware. This more modest goal has the advantage of both being achievable and having practical benefits.

He continues (1.7 Logical Systems and Proof Scores):
The first step of our approach is to construct proof scores, which are instructions such that when executed (or "played"), if everything evaluates as expected, then the desired theorem is proved. A proof score is executed by applying proof measures, which progressively transform formulae in a language of goals into expressions which can be directly executed. We will see that equational logic is adequate for implementing versions of first- and second-order logic in this way, as well as many other logical systems.

and (1.8 Semantics and Soundness):
This text justifies proof measures for a logical system by demonstrating their soundness with respect to the notions of model and satisfaction for that system. In this sense, it places Semantics First! In fact, users are primarily concerned with truth; they want to know whether certain properties are true of certain models, which may be realized in software and/or hardware. From this point of view, proof is a necessary nuisance that we tolerate only because we have no other way to effectively demonstrate truth. Moreover, it is usually easier and more intuitive to justify proof measures on semantic rather than syntactic grounds.
Inspired by the above-stated OBJ proof score approach and its potential for realizing well-structured and reusable proof documents, theorem proving with proof scores has been pursued extensively in CafeOBJ. Many case studies have been done in a variety of application areas, and proof scores have been found to be usable for practical theorem proving.

Constructor-based algebra and its proof calculus, which are not included in the TPA book, were formalized as a theoretical foundation for proof scores in CafeOBJ. The Constructor-based Inductive Theorem Prover (CITP) was first implemented in the Maude metalevel, and its variant is now incorporated into CafeOBJ as the Proof Tree Calculus (PTcalc) subsystem. The "Semantics First!" principle played an important role in the design and implementation of PTcalc. As a result, PTcalc encourages model-based analyses and proofs of the properties to be verified, which is an important merit of algebraic theorem proving.

The harmony of (1) model satisfaction semantics, (2) equational deduction, and (3) rewriting execution constitutes the core of algebraic theorem proving, and the TPA book provides the most reliable and comprehensive account of it. A proof score applies proof measures by executing equations as rewriting rules to prove model satisfaction. The harmony of semantics, deduction, and execution makes it possible to formalize effective and transparent proof measures. Major proof measures in CafeOBJ include (i) case split with exhaustive equations and (ii) well-founded induction via term refinement.

CafeOBJ's module system is basically the same as OBJ's, except for succinct notation for inline view definitions and a gradually developed, reliable, and efficient implementation. The module system is an important feature of algebraic language systems and its power has been well appreciated.
CafeOBJ's module system serves not only for constructing specifications and proof scores but also for preparing libraries of generic data structures and proof measures. The PTcalc subsystem of CafeOBJ, where proof nodes are modules, depends on the module system in a significant way.
Maude is a language based on rewriting logic, a simple computational logic to specify and program concurrent systems as initial models of rewrite theories. In Maude they are specified as system modules of the form mod FOO is (Σ, E, R) endm, where the rewrite theory (Σ, E, R) specifies a concurrent system whose concurrent states belong to the algebraic data type (initial algebra) T_{Σ/E} (with Σ a typed signature of function symbols and E a set of equations) and whose local concurrent transitions are specified by the rewrite rules R. When R = ∅, a rewrite theory (Σ, E, R) becomes an equational theory (Σ, E). In Maude this gives rise to a sublanguage of functional modules of the form fmod BAR is (Σ, E) endfm with initial algebra semantics, which is a superset of OBJ, because Maude is based on the more expressive membership equational logic, which contains OBJ's order-sorted equational logic as a special case.

Maude naturally extends OBJ, because membership equational logic is itself a sublogic (the case R = ∅) of rewriting logic. The practical meaning of this extension is that equational logic is well-suited to specify deterministic systems, whereas rewriting logic naturally specifies non-deterministic and concurrent systems.
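This division of labor between equations E (deterministic simplification to a canonical form) and rules R (non-deterministic transitions explored by search) can be illustrated with a small folklore vending-machine example of the kind often used to introduce rewriting logic. What follows is a plain Python sketch, not Maude syntax; the state representation, the particular rules ("a dollar buys a cake, or an apple with a quarter in change"), and all function names are our own illustrative choices:

```python
from collections import Counter, deque

# States are multisets of tokens, canonicalized as sorted tuples so they hash.
# '$' = dollar, 'q' = quarter, 'a' = apple, 'c' = cake.

def simplify(state):
    """Apply the equation E: four quarters equal one dollar."""
    c = Counter(state)
    c['$'] += c['q'] // 4
    c['q'] %= 4
    return tuple(sorted(c.elements()))

def successors(state):
    """Apply the rules R: $ => cake, and $ => apple + quarter in change."""
    c = Counter(state)
    nexts = []
    if c['$']:
        spent = c.copy()
        spent['$'] -= 1
        for purchase in (('c',), ('a', 'q')):
            n = spent.copy()
            n.update(purchase)
            nexts.append(simplify(tuple(n.elements())))
    return nexts

def reachable(start, goal):
    """Breadth-first reachability analysis over R, with states kept in
    E-canonical form throughout."""
    start, goal = simplify(start), simplify(goal)
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return True
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

The equation is applied functionally, always yielding a single canonical state, while the rules generate a branching transition system that must be searched; this mirrors the deterministic/non-deterministic contrast drawn above.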
Furthermore, Maude has the following additional features: (i) reachability analysis; (ii) LTL model checking; (iii) a strategy language to guide the execution of rewrite theories; (iv) concurrent object-oriented system specification, including external objects that allow Maude objects to interact with any other entities and support distributed implementations; (v) reflection, thanks to the existence of a universal theory that can simulate deduction in all other theories (including itself) and represent functional and system modules as data for meta-programming purposes; and (vi) symbolic computation features such as semantic unification, variants, symbolic reachability analysis, and SMT solving.

The ideas in Joseph Goguen's TPA book have stimulated further developments in the formal verification of Maude modules, including the following: (1) the use of reflection and of symbolic methods to automate constructor-based inductive theorem-proving verification of functional modules; (2) tools to check the confluence, sufficient completeness, and termination of functional modules; and (3) theorem proving of properties of system modules (rewrite theories).

All the above-described advances in CafeOBJ and Maude illustrate some of the ways in which Joseph Goguen's TPA book has stimulated further developments in algebraic verification. But they do not at all exhaust the possible ways in which this fundamental book could stimulate other readers: we are convinced that it will continue to stimulate us and others. We have undertaken the task of making it available to the research community, as it was Joseph Goguen's desire, precisely for this purpose.

December 2020
K. Futatsugi
N. Martí-Oliet
J. Meseguer
1 Introduction

This book can be seen either as a text on theorem proving that uses techniques from general algebra, or else as a text on general algebra illustrated and made concrete by practical exercises in theorem proving. This introductory chapter provides background and motivation, though some points may only become fully clear in light of subsequent chapters, in part because of terminology not yet defined. Section 1.13.1 is a synopsis. The book considers several different logical systems, including first-order logic, Horn clause logic, equational logic, and first-order logic with equality. Similarly, several different proof paradigms are considered. However, we do emphasize equational logic, and for simplicity we use only the OBJ3 software system, though it is used in a rather flexible manner.

We do not pursue the lofty goal of mechanizing proofs like those of which mathematicians are justly so proud; instead, we seek to take steps towards providing mechanical assistance for proofs that are useful for computer scientists in developing software and hardware. This more modest goal has the advantage of both being achievable and having practical benefits.
You can think of theorem proving as game playing where there are very precise rules, initial positions, and goals; you win if you reach the goal from the initial position by correctly following the rules. Different logical systems have different rules and different notions of position, while different problems in the same system have different goals and/or different initial positions. Playing is called inference or deduction, a move is a step of inference or deduction, and positions are (usually) sets of formulae; the formulae in an initial position may be called axioms, assumptions, or hypotheses. Thus a logical system consists of a language whose sentences (which are formulae) are used to state goals and axioms, plus some rules of inference for deriving new sentences from old ones. There may also be notions of model and of satisfaction, and these will play a key role in this text. Models relate to rules of inference much as a chessboard with its pieces relates to the rules of chess; in this setting, satisfaction means that a sentence accurately describes a given position.

The most classical example is Euclidean plane geometry. More recently, first-order predicate calculus (usually called "First-Order Logic" and often abbreviated "FOL") has been the most important game in town, but there are many, many other logical systems, a few of which are seen later in this book.
Mechanical theorem proving has many practical applications:

• The verification of digital hardware circuits, especially VLSI, where the economic cost of design errors can be enormous.

• The design and verification of so-called critical systems, such as nuclear power plants, heart pacemakers, and aircraft guidance systems, where failure can endanger human life or property on a large scale.

• Tools to make programming more reliable and robust, for example, to help with debugging, modifying, and optimizing programs, based on their semantics.

• The technology of theorem proving has been used in a number of modern programming languages that are based upon logic, including OBJ (which is described in Section 1.11 and used throughout this text) and Prolog.

• The technology of theorem proving has also been used in systems for robot vision, motion planning, drug discovery, DNA sequencing, and many similar applications.

Logic for Foundations versus Logic for Applications
Logic has been mainly concerned with the foundations of mathematics since the rude shock of the paradoxes discovered around the turn of the twentieth century by Russell and others. Such foundational work tends to simplify notation, axioms, and inference rules to the bare minimum, in order to facilitate the study of meta-mathematical issues such as consistency and completeness. But logic is used in applications for completely different reasons. In particular, computer scientists and engineers build hardware and software systems that are actually used in the real worlds of science, commerce, and technology, for which very different approaches to logic are more appropriate. In particular, the logical systems used for applications are often far more complex than those used in foundations; there may be many more symbols, axioms, and rules; and some data types may be "built in," such as natural numbers or lists. The ability to add new definitions and notations and then use them is also important, and some applications even require the use of more than one logical system.

1.4 Why Equational Logic and Algebra?

This text gives a privileged role to general algebra (also called universal algebra) and to its logic, which is equational logic. One reason for this is that equational logic is the logic of substituting equals for equals, which is a basic common denominator among many different logics. Also, equational logic is attractive as the foundation for a theorem prover because of the simplicity, familiarity, and elegance of equational reasoning, and because there is a great deal of relevant theory, including the extensive literature on abstract data types. Moreover, equational reasoning can be implemented efficiently by term rewriting, which can then serve as a workhorse for a general-purpose theorem prover. In addition, many other interesting and important logics can be embedded within or built on top of equational logic, as we will see.

(Footnote: The terms "algebra" and "equational logic" in their narrow senses refer to models and deductions, respectively, but in their broad sense, both terms refer to both models and deductions.)

We will also see that equational logic is ideal as a meta-logic for describing other logical systems, because the syntax of a logic is a free algebra, while the rules of deduction can be implemented by (conditional) rewrite rules. Thus, we can use equational logic at the meta level (for describing logical systems and justifying proof scores), as well as at the object level (for proving theorems).

Any computable function over any computable data structure can be defined in equational logic [11], and order-sorted equational logic, which adds subsorts [82], extends this to encompass the partial computable functions. Thus, equational logic is sufficiently powerful to describe any standard model of interest. Although not every property that one might want to prove about some real system can be expressed using just equational logic, much more can be expressed than might at first be thought. In particular, we will see that many typical results about higher-order functional programs and most of the usual digital hardware verification examples fall within this setting, and it seems reasonable to use the simplest possible logic for any given application. However, we do not restrict ourselves to the most traditional kind of equational logic, but rather extend it in various ways, as discussed further in the next section; also, Chapter 8 considers first-order logic.
1.5 What Kind of Algebra?

This text does not view algebra as having a single all-encompassing logic, but rather as having a family of related logics, ranging from the classical unsorted case toward first-order logic with equality, and even second-order logic. The following are brief character sketches of certain versions that are developed in detail later on:
• Many Sorts. Computing applications typically involve more than one sort of data, and it can be awkward, or even impossible, to treat these applications adequately with unsorted algebra. Still, it is not unusual to see papers that treat only the unsorted case, perhaps with a remark that "everything generalizes easily." Although this is true in essence, Section 4.3 shows that significant difficulties can arise if the generalization is not done carefully.

• Conditional. Many applications involve equations that are only true under certain conditions; examples include defining the transitive closure of a relation (see Appendix C) as well as many abstract data types (see Chapter 6), and the rules of inference for first-order logic (see Chapter 8). See Chapter 3 for details.

• Overloaded. Many computing applications involve overloaded operation symbols, where arguments may have different sort patterns. Examples include overloading in ordinary programming languages (such as Ada), polymorphism in functional programming languages (such as ML) and the λ-calculus, and overwriting in object-oriented languages (such as Smalltalk and Eiffel). See Chapter 2 for more detail.

• Ordered Sorts. This rather substantial extension of many-sorted algebra involves having a partial ordering on the set of sorts, called the subsort relation, which is interpreted semantically as a subset relation on the sets that interpret the sorts. This has many interesting applications, including exception handling and partially defined functions.

• Second Order. Another substantial extension allows quantification over functions as well as over elements. This has significant applications to digital hardware verification. Surprisingly, much of general algebra extends without difficulty, as shown in Chapter 9.

• Additional Connectives. The basic formulae of equational logic are universally quantified equations, but we can build more complex formulae from these, using conjunction, implication, disjunction, negation, and existential quantification. Satisfaction of such formulae can be defined in terms of the satisfaction of their constituents. Chapter 8 gives the details.

• Hidden Sorts. Computing applications typically involve states, and it can be awkward to treat these applications in a purely functional style. Hidden sorted algebra substantially extends ordinary algebra by distinguishing sorts used for data from sorts used for states, calling them respectively visible and hidden sorts, and it changes the notion of satisfaction to behavioral (also called observational) satisfaction, so that equations need only appear to be satisfied under all the relevant experiments. Hidden algebra is powerful enough to give a semantics for the object paradigm, including inheritance and concurrency. See Chapter 13 for details.

Putting all this together gives possibilities that are far from classical general algebra. The major difference from the usual first- and second-order logic is that the only relation symbol used is equality. However, other relations can be represented by Boolean-valued functions.
1.6 Term Rewriting

A significant part of this text is devoted to explaining and using term rewriting. The general idea can be expressed as follows: terms (or expressions) over a fixed syntax Σ form an algebra. A rewrite rule is a rule for rewriting some terms into others. Each rewrite rule has a left side (L), which is an expression defining a pattern that may or may not match a given expression. A match of an expression E to L consists of a subexpression E′ of E and an assignment of values to the variables in L such that substituting those values into L yields E′.

A rewrite rule also has a right side (R), which is a second expression containing only variables that already occur in L. If there is a match of L to a subexpression of a given expression, then the matched subexpression is replaced by the corresponding substitution instance of R. This process is called term rewriting, term reduction, or subterm replacement, and is the basis for the OBJ system (see Section 1.11) used in this text.

1.7 Logical Systems and Proof Scores

A very basic question in theorem proving is what logical system to use. The dominant modern logical system is first-order predicate logic, but advances in computer science have spawned a huge array of new logics, e.g., for database systems, knowledge representation, and the semantic web; these include variants of propositional logic, modal logic, intuitionistic logic, higher-order logic, and equational logic, among many others.

The view of this text is that the choice of logical system should be left to the user, and that a mechanical theorem prover should be a
basic engine for applying rewrite rules, so that a wide variety of logical systems can be implemented by supplying appropriate definitions to the rewrite engine. This view is influenced by the theory of institutions [67], and also resembles that of the Edinburgh Logical Framework [98, 1], Paulson's Isabelle system [148], and the use of Maude as a meta-tool for theorem proving [29], in avoiding commitment to any particular logical system. However, it differs in using equational logic and term rewriting as a basis.

The first step of our approach is to construct proof scores, which are instructions such that when executed (or "played"), if everything evaluates as expected, then the desired theorem is proved. A proof score is executed by applying proof measures, which progressively transform formulae in a language of goals into expressions which can be directly executed. We will see that equational logic is adequate for implementing versions of first- and second-order logic in this way, as well as many other logical systems.

This approach can be further mechanized by implementing the meta level of goals and proof measures in OBJ itself, and providing a translator to an object level of computation, also in OBJ. If each proof measure is sound, and the computer implementations are correct, then each resulting proof score is guaranteed to be sound, in the sense that if it executes as desired, then its goal has been proved. But the converse, that a proof score will prove its goal (if it is true), does not hold in general. In addition, the advanced module facilities of OBJ can be used to express the structure of proofs. The Kumo [74, 83, 73] and 2OBJ [170, 84] systems provide even more direct support for such an approach. Equational logic is demonstrably adequate for our purpose, because the syntax, rules of deduction, and rules of translation of a logical system must be computable, and therefore (see Section 1.4) can be expressed with equations.
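The notions of match, substitution, and subterm replacement from Section 1.6, and the way a proof score checks a goal by reduction, can be sketched in a few lines of Python. This is an illustrative toy, not how OBJ3 is implemented; the term representation (tuples headed by an operation name, with variables as capitalized strings), the leftmost-outermost strategy, and all function names are our own choices:

```python
def is_var(t):
    """Variables are strings beginning with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, term, subst=None):
    """Match `pattern` against `term`; return a substitution dict or None."""
    if subst is None:
        subst = {}
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        subst[pattern] = term
        return subst
    if (isinstance(pattern, tuple) and isinstance(term, tuple)
            and pattern[0] == term[0] and len(pattern) == len(term)):
        for p, t in zip(pattern[1:], term[1:]):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None

def substitute(t, subst):
    """Instantiate the variables of t; a right side only uses left-side variables."""
    if is_var(t):
        return subst[t]
    return (t[0],) + tuple(substitute(a, subst) for a in t[1:])

def rewrite_step(term, rules):
    """One subterm replacement, trying the root first; None if no rule applies."""
    for lhs, rhs in rules:
        s = match(lhs, term)
        if s is not None:
            return substitute(rhs, s)
    for i in range(1, len(term)):
        new = rewrite_step(term[i], rules)
        if new is not None:
            return term[:i] + (new,) + term[i + 1:]
    return None

def normalize(term, rules):
    """Rewrite until no rule applies (assumes the rules terminate)."""
    while True:
        new = rewrite_step(term, rules)
        if new is None:
            return term
        term = new

def proves(goal_lhs, goal_rhs, rules):
    """A one-line 'proof score': both sides reduce to the same normal form."""
    return normalize(goal_lhs, rules) == normalize(goal_rhs, rules)

# Peano addition defined by two rewrite rules (equations read left to right).
PLUS = [
    (("+", ("0",), "N"), "N"),
    (("+", ("s", "M"), "N"), ("s", ("+", "M", "N"))),
]
```

For example, `proves(("+", ("s", ("s", ("0",))), ("s", ("0",))), ("s", ("s", ("s", ("0",)))), PLUS)` reduces both sides of the goal 2 + 1 = 3 and compares their normal forms, which is the "if everything evaluates as expected" check described above.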
1.8 Semantics and Soundness

This text justifies proof measures for a logical system by demonstrating their soundness with respect to the notions of model and satisfaction for that system. In this sense, it places Semantics First!

In fact, users are primarily concerned with truth; they want to know whether certain properties are true of certain models, which may be realized in software and/or hardware. From this point of view, proof is a necessary nuisance that we tolerate only because we have no other way to effectively demonstrate truth. Moreover, it is usually easier and
more intuitive to justify proof measures on semantic rather than syntactic grounds.

The above slogan has many further implications. For example, it suggests that to define some "expressive" (i.e., complex) syntax, write some code that is driven by it, and then call the result a "theorem prover" if it usually prints TRUE when you want it to, is unwise. Similarly, it may not be a good idea to "give a semantics" for some system after it has already been implemented (or designed), because such a semantics may well be too complex to be of much use. Finally, it is dangerous to try to combine several logics, unless precise and reasonably simple notions of model and satisfaction are known for the combination. Unfortunately, many theorem-proving projects have failed to observe these basic rules of logical hygiene. Nevertheless, as emphasized by the metaphor in Section 1.1, theorem proving is syntactic manipulation, and hence syntax is fundamental and unavoidable for the enterprise of this book. We can state our view in a balanced way as follows:

Semantics is fundamental at the meta level (of correctness for proof rules), while syntax is fundamental at the object level (of actual proofs).
1.9 Loose versus Standard Semantics

An important distinction concerns the intended semantics of a logical system: is it meant to capture formulae that are true of all models of some set of axioms, or just formulae that are true of a fixed standard model of those axioms? Let us call the first case loose semantics, and the second case standard (or tight) semantics. The usual first-order logic has loose semantics, and captures properties that are true of all models of some given axioms. This fits many applications. For example, a logic intended to capture group theory must be loose, since it must apply to all groups. On the other hand, a logic for arithmetic should capture properties of a single standard model, consisting of the numbers with their usual operations.

A completeness theorem for a logical system says that all the formulae that are true of all the intended models of a given set of formulae (the axioms) can be proved from those formulae. There are completeness theorems for some well-known loose logics, including first-order logic and equational logic. However, completeness cannot in general be expected for logics under standard semantics, because the class of formulae true of a fixed model is not in general recursively enumerable; Gödel's famous incompleteness theorem shows that this holds even for the natural numbers. On the other hand, the familiar and powerful techniques for induction are not (usually) sound for loose semantics, but only for standard semantics. It seems that completeness has been overemphasized in the theorem-proving literature, because many computer science applications actually concern properties of a standard model of some set of formulae, rather than properties of all its models.

This text treats theorem proving for both standard and loose semantics, as well as combinations of the two. Given a set A of formulae and a formula e, we will write A ⊨ e to indicate that e is true in all models of A, and A ⊫ e to indicate that e is true in the standard model of A. The already well-developed theory of abstract data types is helpful in studying the relation ⊫. In particular, we will see that formalizing the notion of "standard model" with initiality leads to simple proofs at both the object and the meta levels, i.e., of properties of standard models and of theorems about such proofs. We also consider loose extensions of standard models, standard extensions of models of a loose theory, and so on recursively.

(Footnote: These points should not be seen as rejecting constructivist approaches like that of Martin-Löf and others who identify the syntax and semantics of logical systems. In fact, there is much to recommend such approaches, especially for the foundations of mathematics. But note that what we are calling soundness problems reappear in this context as consistency problems.)

(Footnote: Unless perhaps you are prepared to go through several iterations, modifying both the implementation and the logic until they are consistent and elegant.)

1.10 Human Interface Design

Logical systems are so very precise and detailed that human beings often find it difficult and/or unpleasant to use them. Usually mathematics is conducted in a quite informal way, with only infrequent reference to any underlying logical system, much as the inhabitants of a house usually ignore its foundations [51].
Indeed, unlike a house, it is not clear that mathematics really needs foundations, though they may help you sleep better at night.

Computers can greatly lighten the burden of rigorously following the rules of a logical system, and fully automatic theorem proving attempts to entirely eliminate the pain of applying rules, although of course users must still state their axioms and goals precisely. In fact, fully automatic theorem proving has not been very successful, and no new theorems of real interest to mathematics have been proved in this way. One difficulty is that users often have to trick some built-in heuristics into doing what they want. However, fully automatic theorem proving remains an important area for research. An approach at the other extreme is proof checking, where rules are explicitly invoked one at a time by the user, and then actually applied by the machine. This can be quite tedious, but it can detect many errors, and even correct certain errors.

This text avoids both these extremes, taking the view that humans should do the interesting parts of theorem proving, such as inventing proof strategies and inductive hypotheses, while machines should do the tedious parts, mechanically applying sets of rewrite rules (see Section 1.6 above) that lead closer to subgoals. An important advantage of this approach is that partially successful proofs may return useful information about what to try next; for example, the output may suggest a new lemma that would further advance the proof. The reader who does the exercises will see many examples of this phenomenon. A variant of this approach provides a tactic language in which possibly quite complex combinations of proof measures can be expressed, and a tactic interpreter to apply these compound tactics; many theorem-proving systems take this approach, including HOL [93], Isabelle [148], and Kumo [74, 83, 73].

Much research has been done on the use of graphics in theorem proving.
But we have found that for even modest-size proofs, graphical representations of proof trees are not only unhelpful, but actually obstructive and confusing [64, 65]. Instead, we recommend structuring proofs by using modules and other features of OBJ, as illustrated extensively in what follows; we will see that this also supports proof reuse.

1.11 OBJ
OBJ [47, 90] integrates specification, prototyping, and verification into a single system, with a single underlying logic, which is (first-order conditional order-sorted) equational logic. OBJ3, which is the implementation of OBJ used in this text, allows a module P to be either:

1. an object, whose intended interpretation is a standard model of P; or
2. a theory, whose intended interpretation is the variety of all models of P.

In OBJ3, objects are executable, while theories describe properties; both have sets of equations as their bodies, but the former have standard (initial algebra) semantics and are executed as rewrite rules, while the latter have loose semantics, which can be “executed” in a loose sense by applying rules of inference to derive new equations. Although theories have been studied more extensively in the theorem-proving literature, they often play a lesser role in practice, because most real applications require particular data structures and operations upon them. (More precisely, by a standard model we mean an initial model of P, in a sense made precise in Theorem 3.2.1 of Chapter 3. Although much of the terminology in this paragraph may be unfamiliar now, it is all defined later on.)

OBJ also has generic modules, module inheritance, and module expressions which describe interconnections of modules and actually create the described subsystem when evaluated. The OBJ module system is a practical realization of ideas originally developed in the Clear language [22, 23]; these ideas have directly influenced the module systems of the ML, Ada, C++, and Modula-2 languages. OBJ’s user-definable mixfix syntax allows users to tailor their notation to their application, and LaTeX [119] symbols can be used to produce pretty output.
Rewriting modulo associativity and/or commutativity is also supported, and can eliminate a great deal of tedium, and the subsorts provided by order-sorted algebra (see Chapter 10) support error messages and exception handling in a smooth and convenient way.

OBJ can be used directly as a theorem prover for equational logic only because its semantics is the semantics of equational logic. Every OBJ computation is a proof of some theorem. It is not true that any other functional programming language would do just as well. Although most functional languages have an operational semantics that is based on higher-order rewriting, they do not have a declarative, logical semantics for all of their features. It is also important that the OBJ module facility has a rigorous semantics, as explained in Chapter 11.

OBJ began as an algebraic specification language at UCLA about 1976, and has been further developed at SRI International, Oxford [47, 90], UCSD, and several other sites [26, 33, 166] as a declarative specification and rapid prototyping language; Appendix A gives more detail on OBJ, for which see also [90, 77]. The systematic use of OBJ as a theorem prover stems from [59]. The latest members of the OBJ family are CafeOBJ [43], Maude [30], and BOBJ [76, 75]. These systems go beyond OBJ3 in significant ways which are not needed for this text (rewriting logic for Maude, hidden algebra for BOBJ, and both of these for CafeOBJ); they could be used instead of OBJ3, though some syntactic changes would be needed. This book provides everything about the syntax and semantics of OBJ3 needed for theorem proving, including practical details on getting started (for which see Appendix A).
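The claim that every OBJ computation is a proof can be made concrete: an object’s equations, oriented left to right as rewrite rules, compute normal forms, and each rewrite step is a small equational deduction. The following Python sketch is not OBJ itself; the term encoding, the rule format, and all helper names are our own illustrative choices. It reduces Peano-style terms with the usual equations for addition.

```python
# Peano terms as nested tuples: ("0",), ("s", t), ("+", t, u).
# Rule variables are strings beginning with "?". All names here are
# illustrative; this is a toy rewriter, not OBJ.

ZERO = ("0",)

def s(t): return ("s", t)
def plus(t, u): return ("+", t, u)

# Equations for addition, oriented left-to-right as rewrite rules.
RULES = [
    (plus("?n", ZERO), "?n"),                    # n + 0 = n
    (plus("?n", s("?m")), s(plus("?n", "?m"))),  # n + s(m) = s(n + m)
]

def match(pattern, term, subst):
    """Extend subst so that pattern instantiates to term, or return None."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if len(pattern) != len(term) or pattern[0] != term[0]:
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def instantiate(pattern, subst):
    if isinstance(pattern, str) and pattern.startswith("?"):
        return subst[pattern]
    return (pattern[0],) + tuple(instantiate(p, subst) for p in pattern[1:])

def rewrite_once(term):
    """One innermost rewrite step; None if term is in normal form."""
    for i, arg in enumerate(term[1:], start=1):
        new = rewrite_once(arg)
        if new is not None:
            return term[:i] + (new,) + term[i + 1:]
    for lhs, rhs in RULES:
        subst = match(lhs, term, {})
        if subst is not None:
            return instantiate(rhs, subst)
    return None

def normal_form(term):
    while True:
        new = rewrite_once(term)
        if new is None:
            return term
        term = new

# 2 + 1 reduces to 3 in Peano notation, i.e., s(s(s(0))).
assert normal_form(plus(s(s(ZERO)), s(ZERO))) == s(s(s(ZERO)))
```

An innermost-first strategy is used here, though for these two rules any strategy reaches the same normal form; the chain of rewrite steps is exactly a proof in equational logic that 2 + 1 = 3.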
1.12 Some History and Related Work

There is such a large literature on theorem proving that an adequate survey would take several volumes. Consequently, we limit the following discussion to systems that have particularly influenced the approach in this book, or that seem related in some other significant way.

Much of the early work on mechanical theorem proving was done in the context of so-called “Artificial Intelligence” (“AI”), and much of it was not very rigorous. One inspiration was to give computers some ability to reason, and then see how far this could be extended and applied; there were even dreams of replacing mathematicians by programs. The collection
Theorem Proving: After 25 Years [14] summarizes the state of theorem proving as of about 1984. The papers by Loveland and by Wang in this collection contain many interesting historical details. For a long time, the dominant approach for first-order logic was resolution, introduced by Alan Robinson in 1965 [157]. This technique for loose semantics is most suitable for fully automatic theorem proving, because its representations make it hard for users to understand or use to guide proofs. Chang and Lee [27] give a readable exposition, but Leitsch [122] is more precise and up to date. This tradition is well represented by the OTTER [132] and Gandalf [174] systems.

A wide range of interesting results were first verified with the Boyer-Moore Nqthm prover [19], using clever heuristics for induction. Its basic logic is untyped first-order universally quantified with function symbols and with equality as its only predicate; users can define new data structures by induction and recursion. Users influence its behavior by requesting intermediate results to be proved in a certain order, since it recalls what it has already proved; users can also set certain parameters. However, this can be a very awkward way to control the prover. Its successor system, ACL2, is interactive instead of automatic [112].

Another tradition arises from Milner’s LCF system [95, 147], in which a higher-order strongly typed functional language (namely ML) is used for writing tactics that guide the application of elementary steps of deduction for achieving goals. Soundness is guaranteed by having a type “thm” that can only be inhabited by formulae that have actually been proved. One problem with this approach is that, because a proof is described by a single functional expression, for difficult problems this expression can be hard to understand and to edit.
Gordon’s HOL system [93], which has been successful for hardware verification, is an important development in this tradition; HOL is now commonly run on Isabelle [148]. Work by Stickel on the Prolog Technology Theorem Prover [171] should also be mentioned, as should burgeoning work based on type theory, e.g., the ELF [98, 1] and IPE [156] systems from Edinburgh, and Coq from INRIA; another important development in this area is NuPRL [35]. There is also much work using term rewriting techniques generalizing Knuth-Bendix completion, some of which is discussed in
Chapter 12, although we have found that inductive proofs often work better in practice.

Every approach mentioned above has some drawbacks, and so does the one in this book. In fact, every approach must be unsatisfying in some ways, because general theorem proving is recursively unsolvable. Even though there is a completeness theorem for first-order logic with loose semantics, the problem is still only semi-decidable; there is no way to know whether an attempt to prove a given formula will ever halt, although if there is a proof, it will eventually be found (unless the available memory is exceeded). Theorem proving for standard semantics is not in general even semi-decidable, so that any automatic prover for this domain will necessarily fail to find proofs of some true formulae, even if given arbitrarily much time. However, such results no more prevent machines from proving theorems than they do humans.

The view of theorem proving as very strict “game playing” (see Section 1.1) comes from the formalist view of the foundations of mathematics advocated by Hilbert and by Whitehead and Russell, among others. In this view, mathematics is a purely formal activity, rigidly governed by sets of rules. This view is opposed by several other schools of the philosophy of mathematics, including idealists (e.g., Platonists) and intuitionists, both of whom argue that mathematics has some inherent meaning. The mechanization of theorem proving is necessarily consistent with a formalist view, because computer manipulations are necessarily formal; but that does not mean that one has to believe the formalist position to engage in mechanical theorem proving, and in fact the author of this book does not accept the formalist position, but rather subscribes to a view like that of Wittgenstein, that mathematical proofs are a kind of (socially situated) language game [51].
In any case, formal semantics cannot capture the sort of meaning that Platonists and intuitionists talk about, because meaning in their sense is inherently non-formal. Despite all this, we will see that formal semantics can be very useful.

Some discussion of the history and literature of algebra is given in Sections 2.8 and 3.8.
1.13 Using this Text

This text was developed for advanced undergraduates (that is, third or fourth year), but it can also be used at the graduate level by including more of the difficult material. It can be used in courses on general algebra, on the practice of mechanical theorem proving, and on the mathematical foundations of theorem proving. The second choice (which was taken at Oxford) would give precedence to the OBJ exercises, while the first and third choices would give precedence to the mathematics; alternatively, all three goals could be pursued at once. This text could perhaps be used as the basis for a course on discrete structures, but for this purpose it should be supplemented. In any case, the exercises that use OBJ should be done if at all possible, because they give a much more concrete feeling for the more theoretical material; OBJ is available by ftp (see Appendix A for details).

The choice of what to include (or to develop, in the case of new results) has always preferred material that is directly useful in computer science, especially theorem proving, although this does sometimes require including other material that is not itself directly useful. Propositional and predicate logic, and basic set theory, are necessary for understanding this text; some prior exposure to algebra would be helpful, as would some experience with computing. Appendix C reviews some basic mathematical concepts, such as transitive closure, and many others are reviewed within the body of the text.

Results, definitions, and examples are numbered on the same counter, which is reset for each section; exercises are on a separate counter, which is also reset. Material marked “(⋆)” can be skipped without loss of continuity, and probably should be skipped on a first reading. The more difficult proofs have been relegated to Appendix B.
More advanced topics that could be skipped in an introductory class include initial Horn models, order-sorted algebra and rewriting, generic modules, hidden algebra, and the object paradigm.

1.13.1 Synopsis

The following topics are covered: many-sorted signature, algebra and homomorphism; term algebra and substitution; equation and satisfaction; conditional equations; equational deduction and its completeness; deduction for conditional equations; the theorem of constants; interpretation and equivalence of theories; term rewriting, termination, confluence and normal form; abstract rewrite systems; standard models, abstract data types, initiality, and induction; rewriting and deduction modulo equations; first-order logic, models, and proof planning; second-order algebra; order-sorted algebra and rewriting; modules; unification and completion; and hidden algebra. In parallel with these are a gradual introduction to OBJ3, applications to group theory, various abstract data types (such as number systems, lists, and stacks), propositional calculus, hardware verification, the λ-calculus, correctness of functional programs, and other topics. Some social aspects of formal methods are discussed in Appendix D.

1.13.2 Novel Features

Novel features of this book include the following: the use of arrows rather than set-theoretic functions; commutative diagrams; overloaded many-sorted algebra; an emphasis on signatures for logics and term rewriting; an algebraic treatment of first-order logic; second-order general algebra; applications to VLSI (especially CMOS) transistor circuits; use of an executable specification language for proofs; the notion of proof score; systematic use of the theorem of constants; algebraic treatments of termination proofs and rewriting modulo equations.
More advanced novel topics include: an algebraic treatment of parsing; an adjunction between term rewriting systems and abstract rewriting systems; an algebraic treatment of Horn clause logic and its initial models; and results on hierarchical term rewriting systems.
1.14 Acknowledgements

I thank the Computer Science Lab at SRI, where this research began, the Programming Research Group at Oxford University, where this text began, and the Department of Computer Science and Engineering at UCSD, where it was finished. In particular, I thank Frances Page, Joan Arnold, and Sarah Farrer for help with the diagrams and corrections. Special thanks to Dr. José Meseguer, in collaboration with whom many of the ideas behind this text were developed, and to Mr. Timothy Winkler, who implemented most of OBJ3, as well as Drs. Kokichi Futatsugi, Jean-Pierre Jouannaud, Claude Kirchner, Hélène Kirchner, David Plaisted, Joseph Tardo, Patrick Lincoln, and Aristide Megrelis, all of whom helped get OBJ3 where it is today. In addition, I thank Prof. Rod Burstall and Drs. James Thatcher, Eric Wagner, and Jesse Wright, who all helped get the theory to the point where a text like this became possible. I thank Monica Marcus, Răzvan Diaconescu, Grigore Roșu, Yoshihito Toyama, Virgil-Emil Căzănescu, and José Barros for help with Chapter 5, and Răzvan Diaconescu and Grigore Roșu for help with Chapter 8. Paulo Borba, Jason Brown, Steven Eker, Healfdene Goguen, Ranko Lazić, Alexander Leitsch, Dorel Lucanu, Sula Ma, Monica Marcus, Chiyo Matsumiya, Oege de Moor, and Adolfo Socorro have all helped spot bugs and typos. Finally, I thank the students in my classes for their patience and comments. It has been a great pleasure for me to work with all these people.
A Note for Lecturers:
It is not necessary to spend time on the material in this chapter, because most of it arises naturally as the actual content of the course unfolds. Instead, it can just be assigned as reading; it should probably be assigned twice, once at the beginning and once at the end of the course.
Signature and Algebra
In order to prove theorems, we need formulae to express assumptions and goals, and we need precise notions of “model” and of “satisfaction of a formula by a model” to ensure correctness. This chapter develops many-sorted general (or universal) algebras as models, while the next chapter considers equations and their satisfaction by algebras.
We first briefly summarize some notation that will be used throughout this book, assuming that basic set theory is already familiar. ω denotes the set {0, 1, 2, ...} of all natural numbers, and #S denotes the cardinality of a finite set S. We let S* denote the set of all lists (or strings) of elements from S, including the empty list, which we denote []. We write the elements of a list in sequence without any punctuation. For example, if S = {a, b, c, d} then some elements of S* are a, ac, acb, d and []. Notice that in this notation, S ⊆ S*. Given w ∈ S*, we let #w denote the length of w; in particular, #[] = 0, and for s ∈ S as a one-element list, #s = 1. Let S+ = S* − {[]}.

This book makes systematic use of arrows (also called maps), which are functions with given source and target sets (elsewhere these may be called domain and codomain sets). f : A → B designates an arrow from A to B, i.e., a function defined on source A with image contained in target B. For example, the successor map s : ω → ω is defined by s(n) = n + 1. We let [A → B] denote the set of all arrows from A to B. In Appendix C, this approach is compared with the usual set-theoretic approach to functions.

Given arrows f : A → B and g : B → C, then f ; g denotes their composition, which is an arrow A → C (this notation follows the conventions of computing science, rather than of mathematics). An arrow f : A → B is injective iff f(a) = f(a′) implies a = a′, is surjective iff for each b ∈ B there is some a ∈ A such that f(a) = b, and is bijective iff it is both injective and surjective. We let 1_A denote the identity arrow at A, defined by 1_A(a) = a for all a ∈ A. Notice that 1_A ; f = f and f ; 1_B = f for any f : A → B.

[Figure 2.1: Signature for Automata — an ADJ diagram with three nodes Input, State, and Output, an arrow f from Input and State to State, an arrow g from State to Output, and a constant s into State.]

Some further set-theoretic topics are reviewed in Appendix C, which should be consulted by readers for whom the above concepts are not yet entirely comfortable.
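The arrow conventions above lend themselves to a small executable sketch; the Python helpers below (`arrow`, `compose`, `identity`, and so on) are our own names, chosen only to illustrate diagrammatic composition and the identity laws on finite sets.

```python
# Arrows between finite sets as (source, target, mapping) triples.
# Helper names are illustrative only, not notation from the text.

def arrow(source, target, mapping):
    assert set(mapping) == set(source) and set(mapping.values()) <= set(target)
    return (frozenset(source), frozenset(target), dict(mapping))

def compose(f, g):
    """Diagrammatic composition f ; g : first apply f, then g."""
    (A, B, fm), (B2, C, gm) = f, g
    assert B == B2, "target of f must equal source of g"
    return (A, C, {a: gm[fm[a]] for a in A})

def identity(A):
    return (frozenset(A), frozenset(A), {a: a for a in A})

def is_injective(f):
    A, _, fm = f
    return len(set(fm.values())) == len(A)

def is_surjective(f):
    _, B, fm = f
    return set(fm.values()) == B

# Successor on {0, 1, 2, 3}, truncated at 3: neither injective nor surjective.
f = arrow({0, 1, 2, 3}, {0, 1, 2, 3}, {0: 1, 1: 2, 2: 3, 3: 3})
assert compose(identity({0, 1, 2, 3}), f) == f   # 1_A ; f = f
assert compose(f, identity({0, 1, 2, 3})) == f   # f ; 1_B = f
assert not is_injective(f) and not is_surjective(f)
```

Note that `compose(f, g)` applies f first, matching the f ; g convention of computing science rather than the g ∘ f convention of mathematics.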
Any theorem-proving problem needs a vocabulary in which to express its goals and assumptions. This vocabulary will include a sort set S, to classify (or “sort”) the entities (or “data items”) that are involved. Names for the operations involved are also needed, and for each given operation name, the sorts of arguments that it takes, and the sort of value that it returns, should be indicated. A vocabulary with such information is called a signature.

For example, let us consider automata. These have three sorts of entity, namely input, state, and output, so that S = {Input, State, Output}. Also, there are transition and output operations, say f and g, plus an initial state s. It is convenient to present this structure graphically with the “ADJ diagram” in Figure 2.1, which clearly shows that f takes an input and a state as its arguments and returns a state, while g takes a single state as input, returning an output, and s is a constant of sort State.

We wish to formalize the signature concept in such a way that operation symbols can be overloaded, that is, so that an operation symbol can have more than one type; a rather sophisticated word sometimes used to describe this phenomenon is “polymorphism.” For example, S might contain Nat, Bool and
List, and we might want “+” to denote operations for adding natural numbers, for taking exclusive-or of booleans, and for concatenating lists.

Our first step towards formalizing all this is to capture the notion of providing a set of elements for each sort s ∈ S. For example, an automaton A will have three such sets, for its elements of sorts Input, State, and Output; these three sets are denoted A_Input, A_State, and A_Output, respectively, and should be thought of as a “family” of sets of elements. Our mathematical formalization of this concept is as follows:

Definition 2.2.1
Given a set S, whose elements are called sorts (or indices), an S-sorted (or S-indexed) set A is a set-valued map with source S, whose value at s ∈ S is denoted A_s; we will use the notation { A_s | s ∈ S } for this. Also, we let |A| = ⋃_{s∈S} A_s, and we let a ∈ A mean that a ∈ |A|. Finally, we may sometimes write {a}_s for the singleton S-sorted set A with A_s = {a} and with A_{s′} = ∅ for s′ ≠ s; we may also extend this notation, to write {a, a′}_s for {a}_s ∪ {a′}_s, etc. □

It is significant that the sets A_s need not be disjoint, because it is this that supports overloading. For example, in many important examples of automata, the sets A_Input, A_State, and A_Output are all the same, e.g., the natural numbers, or perhaps the Booleans. These sorted sets will also be used just a little later as the basis for our notion of (overloaded) signature. (Our use of the notation { A_s | s ∈ S } should not be confused with the set of all the sets A_s; an S-indexed set A is not a set of sets, but rather a map from S to sets. Notations like ⟨ A_s | s ∈ S ⟩ and { A_s }_{s∈S} might make this distinction clearer, but for this book a notation stressing the analogy of indexed sets with ordinary sets is more desirable.)

A different approach to sorting elements provides an arrow τ : |A| → S, where τ(a) gives the sort of a ∈ A. Unfortunately, this approach does not permit overloading, and therefore does not allow many important kinds of syntactic ambiguity to be studied, because they cannot even exist without overloading. Applications that require overloading include the refinement of data representations, and parsing in modern programming languages, such as Ada; in addition, most object-oriented languages allow a form of overloading that is resolved at run-time by so-called dynamic binding.

In general, concepts extend component-wise from ordinary sets to S-sorted sets.
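As an executable illustration of this component-wise style, here is a small Python sketch; the dict-of-frozensets encoding and the helper names are our own choices, not notation from the text.

```python
# An S-sorted set as a map from sorts to sets: here, a dict of frozensets.
# The encoding and helper names are illustrative only.

S = ("Input", "State", "Output")

def sorted_set(**components):
    return {s: frozenset(components.get(s, ())) for s in S}

def union(A, B):
    return {s: A[s] | B[s] for s in S}            # component-wise

def intersection(A, B):
    return {s: A[s] & B[s] for s in S}            # component-wise

EMPTY = sorted_set()                              # the empty S-sorted set

A = sorted_set(Input={"a", "b"}, State={"q0"})
B = sorted_set(State={"q0", "q1"}, Output={"y"})

# Laws lift component-wise from ordinary sets:
assert union(EMPTY, A) == A                       # empty set is a unit for union
assert intersection(A, B) == intersection(B, A)   # intersection is commutative
```

Because every operation is defined one component at a time, each familiar law of ordinary sets holds for S-sorted sets exactly when it holds at every sort.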
For example, A ⊆ B means that A_s ⊆ B_s for each s ∈ S, the empty S-sorted set ∅ has ∅_s = ∅ for each s ∈ S, and A ∪ B is defined by (A ∪ B)_s = A_s ∪ B_s for each s ∈ S. Because of these component-wise definitions, many laws about sets also extend from simple sets to S-sorted sets. For example, we can show that ∅ ∪ A = A for any S-sorted set A, by checking that it is true for each component, as follows: (∅ ∪ A)_s = ∅_s ∪ A_s = ∅ ∪ A_s = A_s.

Exercise 2.2.1
Define intersection for S-sorted sets, and show that A ∩ B = B ∩ A, and that A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C), where A, B, C are all S-sorted sets. □

Definition 2.2.2 An S-sorted (or S-indexed) arrow f : A → B between S-sorted sets A and B is an S-sorted family { f_s | s ∈ S } of arrows f_s : A_s → B_s. Given S-sorted arrows f : A → B and g : B → C, their composition is the S-sorted family { f_s ; g_s | s ∈ S } of arrows. Each S-sorted set A has an identity arrow, 1_A = { 1_{A_s} | s ∈ S }. An S-sorted arrow h : M → M′ is injective iff each component h_s : M_s → M′_s is injective, is surjective iff each h_s : M_s → M′_s is surjective, and is bijective iff it is injective and surjective. □

Exercise 2.2.2 If f : A → B is an S-sorted arrow, show that 1_A ; f = f and that f ; 1_B = f. □

We are now ready for the following basic concept:
Definition 2.3.1
Given a sort set S, then an S-sorted signature Σ is an indexed family { Σ_{w,s} | w ∈ S*, s ∈ S } of sets, whose elements are called operation symbols, or possibly function symbols. A symbol σ ∈ Σ_{w,s} is said to have arity w, sort s, and rank (or “type”) ⟨w, s⟩, also written w → s; in particular, any σ ∈ Σ_{[],s} is called a constant symbol. (Operation and constant symbols will later be interpreted as actual operations and constants.)

A symbol σ ∈ |Σ| is overloaded iff σ ∈ Σ_{w,s} ∩ Σ_{w′,s′} for some ⟨w, s⟩ ≠ ⟨w′, s′⟩. Σ is a ground signature iff Σ_{[],s} ∩ Σ_{[],s′} = ∅ whenever s ≠ s′, and Σ_{w,s} = ∅ unless w = [], i.e., iff it consists only of non-overloaded constant symbols. □
Example 2.3.2 (Automata) Because an automaton consists of an input set X, a state set W, an output set Y, an initial state s ∈ W, a transition function f : X × W → W, and an output function g : W → Y, we have S = {Input, State, Output}, with Σ_{[],State} = {s}, Σ_{Input State, State} = {f}, Σ_{State, Output} = {g}, and Σ_{w,s} = ∅ for all other ranks ⟨w, s⟩, as shown in Figure 2.1. □

Example 2.3.3 (Peano Natural Numbers) There is just one sort of interest, say S = {Nat}. To describe all natural numbers, it suffices to have symbols for the constant zero and for the successor operation, say 0 and s, respectively. Then we can describe the number n as n applications of s to 0; thus, 0 is represented by 0, 1 by s(0), 2 by s(s(0)), etc. This is sometimes called Peano notation; we could also speak of “caveman numbers,” since this is counting in base 1. The signature has Σ_{[],Nat} = {0}, Σ_{Nat,Nat} = {s} and Σ_{w,s} = ∅ for all other ranks ⟨w, s⟩; or in the singleton notation of Definition 2.2.1, Σ = {0}_{[],Nat} ∪ {s}_{Nat,Nat}. □

Notice that the natural interpretation for Example 2.3.3 is a certain particular standard model, whereas any model provides a suitable interpretation for Example 2.3.2. These two kinds of semantics are called standard (or tight) and loose, respectively. Later, we will make this distinction precise.

[Figure 2.2: Signature for Numerical Expressions — an ADJ diagram with one node Nat, the constant 0, the unary operation s, and the binary operations + and ∗.]

[Figure 2.3: Signature for Graphs — an ADJ diagram with nodes Edge and Node and two parallel arrows ∂0, ∂1 from Edge to Node.]
Example 2.3.4 (Numerical Expressions) Again, there is just one sort of interest, say S = {Nat}, and assuming that we are only interested in the operation symbols shown in Figure 2.2, then Σ_{[],Nat} = {0}, Σ_{Nat,Nat} = {s}, Σ_{Nat Nat,Nat} = {+, ∗}, and Σ_{w,s} = ∅ for all other ranks ⟨w, s⟩. The intended semantics of this example is standard rather than loose, because there is just one intended model for these expressions. □

Example 2.3.5 (Graphs) A (directed, unordered) graph G consists of a set E of edges, a set N of nodes, and two arrows, ∂0, ∂1 : E → N, which give the source and target node of each edge, respectively. Thus S = {Edge, Node} and Σ = {∂0, ∂1}_{Edge,Node} in the notation of Definition 2.2.1. This signature is shown in Figure 2.3. The intended semantics is loose, because there are many possible graphs. □
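The examples above can be rendered as a small Python sketch, with a signature represented as a map from ranks ⟨w, s⟩ to sets of symbols; the encoding and all helper names are ours, purely for illustration.

```python
# A signature assigns to each rank ⟨w, s⟩ a set of operation symbols.
# Ranks are encoded as (arity-tuple, sort) pairs; helper names are ours.

def signature(ranks):
    return {rank: frozenset(syms) for rank, syms in ranks.items()}

# The numerical-expression signature of Example 2.3.4:
NATEXP = signature({
    ((), "Nat"): {"0"},                   # constant
    (("Nat",), "Nat"): {"s"},             # successor
    (("Nat", "Nat"), "Nat"): {"+", "*"},  # addition and multiplication
})

# "+" at two different ranks, as in the Nat/Bool discussion above:
OVER = signature({
    (("Nat", "Nat"), "Nat"): {"+"},
    (("Bool", "Bool"), "Bool"): {"+"},    # exclusive-or on booleans
})

def is_overloaded(sigma, symbol):
    return sum(symbol in syms for syms in sigma.values()) > 1

def is_ground(sigma):
    """Only constant symbols, with no symbol at two different sorts."""
    all_syms = []
    for (w, _sort), syms in sigma.items():
        if w != () and syms:
            return False
        all_syms.extend(syms)
    return len(all_syms) == len(set(all_syms))

assert is_overloaded(OVER, "+") and not is_overloaded(NATEXP, "+")
assert not is_ground(NATEXP)
assert is_ground(signature({((), "Nat"): {"0"}}))
```

Ranks not mentioned are implicitly empty, matching the convention “Σ_{w,s} = ∅ for all other ranks” used in the examples.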
Notation 2.3.6
Because a ground signature X has all its sets X_{w,s} empty unless w = [], we can identify such a signature with the S-indexed set X with X_s = X_{[],s}, and we shall often do so in the following; note that in this case, X_s is disjoint from X_{s′} whenever s ≠ s′, by the definition of ground signature (Definition 2.3.1).

By our conventions about sorted sets, |Σ| = ⋃_{w,s} Σ_{w,s}, and Σ′ ⊆ Σ means that Σ′_{w,s} ⊆ Σ_{w,s} for each w ∈ S* and s ∈ S. Similarly, the union of two signatures is defined by (Σ ∪ Σ′)_{w,s} = Σ_{w,s} ∪ Σ′_{w,s}. A common special case is union with a ground signature X. We will use the notation

Σ(X) = Σ ∪ X

for this, but always assuming that |X| and |Σ| are disjoint, and X is a disjoint family. When X is an S-indexed set, the above equation may be rewritten as

Σ(X)_{[],s} = Σ_{[],s} ∪ X_s
Σ(X)_{w,s} = Σ_{w,s} when w ≠ []. □

2.4 Signatures in OBJ

OBJ modules that are intended to be interpreted loosely begin with the keyword theory (which may be abbreviated th) and close with the keyword endth. Between these two keywords come (optional) declarations for sorts and operations, plus (as discussed in detail later on) variables, equations, and imported modules. For example, the following specifies the theory of automata of Example 2.3.2:

th AUTOM is
  sorts Input State Output .
  op s : -> State .
  op f : Input State -> State .
  op g : State -> Output .
endth

Notice that each of the four internal lines begins with a keyword which tells what kind of declaration it is, and terminates with a period. Any number of sorts can be declared following sorts, and operations are declared with both their arity, between the : and the ->, and their sort, following the ->. Because a constant like s has empty arity, nothing appears between the : and the -> in its declaration.
It is conventional (but not necessary) in OBJ for sort identifiers to begin with an uppercase letter, and for module names to be all uppercase.

Graphs as defined in Example 2.3.5 may be specified as follows:

th GRAPH is
  sorts Edge Node .
  op ∂0 : Edge -> Node .
  op ∂1 : Edge -> Node .
endth

(Actually, OBJ3 can only read ASCII characters; the ∂ symbols that you see were produced by a LaTeX macro whose name consists of all ASCII characters. ASCII provides a certain fixed set of characters with a certain fixed binary encoding.)

Also, the Peano natural numbers of Example 2.3.3 may be specified by the following:

obj NATP is
  sort Nat .
  op 0 : -> Nat .
  op s_ : Nat -> Nat .
endo
Here the keyword pair obj ... endo indicates that standard semantics is intended. Also, notice the use of sort instead of sorts in this example; actually, sort and sorts are synonyms in OBJ3, so that the choice of which to use is just a matter of style.

This example also uses "mixfix" syntax for the successor operation symbol: in the expression before the colon, the underbar character _ is a place holder, showing where the operation's arguments should go; there must be the same number of underbars as there are sorts in the arity; the other symbols before the colon go between or around the arguments. Thus, the notation s_ defines prefix syntax, while _+_ defines infix syntax; similarly, _! is postfix, {_} is outfix, and if_then_else_fi is general mixfix. When there are no underbars, a default prefix-with-parentheses syntax is assumed, as with f and g in AUTOM above. Notice that the formal definition of signature does not specify "fixity", but only arity and rank; this issue is discussed further in Section 3.7 below.

Here is an OBJ specification for the expressions over the natural numbers introduced in Example 2.3.4:

  obj NATEXP is
    sort Nat .
    op 0 : -> Nat .
    op s_ : Nat -> Nat .
    op _+_ : Nat Nat -> Nat .
    op _*_ : Nat Nat -> Nat .
  endo
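The role of the underbars can be made concrete with a tiny Python sketch (ours, not OBJ3's actual parser): splitting a declared pattern at its underbars and interleaving the arguments reproduces prefix, infix, postfix, outfix, and general mixfix notation.

```python
# Render a mixfix operation pattern applied to argument strings.
# Each '_' is a place holder for one argument.

def mixfix(pattern, args):
    """Replace each '_' in pattern by the next argument string."""
    parts = pattern.split("_")
    assert len(parts) == len(args) + 1, "need as many underbars as arguments"
    out = parts[0]
    for part, arg in zip(parts[1:], args):
        out += arg + part
    return out

print(mixfix("s_", ["0"]))            # prefix:  s0
print(mixfix("_+_", ["s0", "0"]))     # infix:   s0+0
print(mixfix("_!", ["s0"]))           # postfix: s0!
print(mixfix("{_}", ["0"]))           # outfix:  {0}
```

A pattern with no underbars would get the default prefix-with-parentheses syntax instead, as the text notes for f and g in AUTOM.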
One way to characterize the intuitive difference between loose and standard semantics is to consider what entities are of major interest in each case. For loose semantics (OBJ theories), the entities of greatest interest are the models; for example, for the theory GRAPH, we are interested in graphs, which are algebras. In such cases, we may say that the theory denotes the class of all graphs. On the other hand, for standard semantics (OBJ objects), we are interested in the elements of the standard model; for example, for the specification NATP, we are interested in the natural numbers. Of course, the algebra of all natural numbers is also of great interest, because it contains all the natural numbers, as well as certain operations upon them. In such cases, we may say that the OBJ specification denotes the algebra of natural numbers. (We will later see that this is only defined up to isomorphism.)
Notation 2.4.1: If FOO is the name of an OBJ module, then we let Σ_FOO denote its signature. For example, |Σ_NATP| = {0, s}. □

Algebras

Signatures specify the syntax of theorem-proving problems, but for many problems we are really interested in semantics, that is, in particular entities of the given sorts, and particular functions that interpret the given function symbols. This is formalized by the following basic concept:
Definition 2.5.1: A Σ-algebra M consists of an S-sorted set also denoted M, i.e., a set M_s for each s ∈ S, plus

(0) an element M_σ in M_s for each σ ∈ Σ_{[],s}, interpreting the constant symbol σ as an actual element, and

(1) a function M_σ : M_{s_1} × ··· × M_{s_n} → M_s for each σ ∈ Σ_{w,s} where w = s_1...s_n (for n > 0), interpreting σ as an actual function.

These elements and functions are called the interpretation of Σ in M. Often we will write just σ for M_σ. Also, we may write M_w instead of M_{s_1} × ··· × M_{s_n}. For example, using this notation we can write M_σ : M_w → M_s for σ ∈ Σ_{w,s}. When a symbol σ is overloaded, the notation M_σ is ambiguous, and we may instead write M^{w,s}_σ to explicitly indicate the rank that is intended for a particular interpretation of σ. Finally, we may sometimes write σ_M instead of M_σ, especially in examples. The set M_s is called the carrier of M of sort s. □ Example 2.5.2
For example, we could have S = {Int, Bool} with the symbol 0 in both Σ_{[],Int} and Σ_{[],Bool}. Then we might have a Σ-algebra M in which the two interpretations M^{[],Int}_0 and M^{[],Bool}_0 are distinct. □

Example 2.5.3 (Automata): An automaton is a Σ-algebra where Σ = Σ_AUTOM is the signature of Example 2.3.2, i.e., it consists of an input set X, a state set W, an output set Y, an initial state s_0 ∈ W, a transition function f : X × W → W, and an output function g : W → Y.

Here is a simple Σ_AUTOM-algebra A: let A_Input = A_State = A_Output = ω, the natural numbers; let (s_0)_A = 0, let f_A(m, n) = m + n, and let g_A(n) = n + 1. Then A is an automaton whose state records the sum of the inputs received, and whose output is one more than the sum of the inputs received. AUTOM denotes the class of all automata. □

Figure 2.4: Views of a Graph
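The automaton A just described is concrete enough to execute. Here is a small Python sketch (the encoding is ours, not the book's): all three carriers are the natural numbers, and the interpretations of s0, f, and g are exactly those of Example 2.5.3.

```python
# The automaton A of Example 2.5.3: state records the sum of the inputs
# received, and the output is one more than that sum.

class A:
    s0 = 0                       # initial state: (s0)_A = 0

    @staticmethod
    def f(x, state):             # transition: f_A(m, n) = m + n
        return x + state

    @staticmethod
    def g(state):                # output: g_A(n) = n + 1
        return state + 1

def run(inputs):
    """Feed a list of inputs to A from the initial state; return the output."""
    state = A.s0
    for x in inputs:
        state = A.f(x, state)
    return A.g(state)
```

For instance, running A on the inputs 2, 3, 4 leaves it in state 9 and produces output 10.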
Example 2.5.4 (Expressions): The standard semantics for Example 2.3.4 is a one-sorted algebra E whose carrier consists of all well-formed expressions in 0, s, +, ∗, such as s(+(s(0), s(s(0)))). Here the value of +_E(e_1, e_2) is just the expression +(e_1, e_2); similarly, the value of s_E(e) is s(e), of ∗_E(e_1, e_2) is ∗(e_1, e_2), and the interpretation 0_E of the symbol 0 is 0. Of course, there is another Σ_NATEXP-algebra which has carrier ω, and which interprets 0, s, +, ∗ as the expected operations on natural numbers. However, the intended denotation is the algebra of all well-formed expressions — or more precisely, any isomorphic algebra, as will be discussed later on. □

Example 2.5.5 (Graphs): If we let Σ be the signature Σ_GRAPH of Example 2.3.5, then a Σ-algebra G consists of a set E of edges, a set N of nodes, and two arrows, ∂_0, ∂_1 : E → N, which give the source and target node of each edge, respectively; that is, G is a (directed, unordered) graph, which we may write as (E, N, ∂_0, ∂_1).

A typical graph is shown to the left in Figure 2.4; here E = {a, b, c, d}, N = {1, 2, 3, 4}, ∂_0(a) = ∂_0(c) = 1, ∂_1(a) = ∂_0(b) = 2, ∂_1(c) = ∂_0(d) = 3, and ∂_1(b) = ∂_1(d) = 4. It is usual to draw such a graph as shown in the center of Figure 2.4, omitting the names of nodes and edges, so that labels can be attached instead, as shown in the rightmost diagram of Figure 2.4, and as explained further in Example 2.5.6 below. □
Example 2.5.6 (Labelled Graphs): To the signature of Example 2.3.5, let us add a single new sort Nlabel, and a single new operation symbol l ∈ Σ_{Node,Nlabel}. An algebra over this signature is a node labelled graph, and may be written (E, N, L, ∂_0, ∂_1, l). The most typical interpretations are strict in L, but loose in everything else. An algebra with the underlying graph shown at the left of Figure 2.4 and with node labels from ω is shown at the right of Figure 2.4. We can also label edges, by adding another sort Elabel and another operation symbol, l′ ∈ Σ_{Edge,Elabel}. Thus, in the rightmost diagram of Figure 2.4, l′(a) = l′(d) = f and l′(c) = l′(b) = g. □ Exercise 2.5.1
Write an OBJ specification for the node labelled graphs of Example 2.5.6. □
Example 2.5.7 (Overloading): Now let's consider an example with overloading, given by the following OBJ code:

  th OL is
    sorts Nat Bool .
    ops 0 1 : -> Nat .
    ops 0 1 : -> Bool .
    op s_ : Nat -> Nat .
    op n_ : Bool -> Bool .
    op _+_ : Nat Nat -> Nat .
    op _+_ : Bool Bool -> Bool .
  endth

Here the keyword ops indicates that a number of operations with the same rank will be defined together. Writing this signature out the hard way, we have S = {Nat, Bool}, Σ_{[],Bool} = Σ_{[],Nat} = {0, 1}, Σ_{Bool,Bool} = {n}, Σ_{Nat,Nat} = {s}, Σ_{Bool Bool,Bool} = Σ_{Nat Nat,Nat} = {+}, and Σ_{w,s} = ∅ for all other ranks ⟨w, s⟩. Then 0 and 1 are overloaded, and so is +.

One algebra for this signature, usually denoted T_Σ, has the natural number terms in its carrier of sort Nat, and the Boolean terms in its carrier of sort Bool. Many of these terms are ambiguous in the sense that there is no unique s ∈ S such that they lie in T_{Σ,s}. For example, the terms 0 + 1 and 1 + (0 + 1) are ambiguous, as of course are 0 and 1; but s(0) and 1 + (n(0) + 0) are unambiguous. (Proposition 3.7.2 in Section 3.7 will give a necessary and sufficient condition for non-ambiguity.) We will also see later how to disambiguate terms. □

Term Algebras

The terms over a given signature Σ form a Σ-algebra which will be especially useful and important to us in the following. Indeed, it is a kind of "universal" Σ-algebra, which can serve as a standard model for specifications that do not have any equations.

Definition 2.6.1
Given an S-sorted signature Σ, the S-sorted set T_Σ of all (ground) Σ-terms is the smallest set of lists over the set |Σ| ∪ {(, )} (where ( and ) are special symbols disjoint from Σ) such that

(0) Σ_{[],s} ⊆ T_{Σ,s} for all s ∈ S, and

(1) given σ ∈ Σ_{s_1...s_n,s} and t_i ∈ T_{Σ,s_i} for i = 1, ..., n, then σ(t_1 ... t_n) ∈ T_{Σ,s}. □

When the operations are not constants, parentheses are needed to separate the different operation forms.

Figure 2.5: Trees for Some Terms

Notice that this representation of terms does not use mixfix syntax, but rather uses a default prefix-with-parentheses syntax; nevertheless, we will use mixfix notation in examples, and will later give some theory to support its use. Also, we will usually omit the underbars on the parentheses. For example, using the signature Σ_NATEXP, we can form terms like 0, s(0), s(s(0)), 0 + s(0), and s(0) + s(s(0)). It is common to picture such terms as node labelled trees, as shown in Figure 2.5. This correspondence is made precise in Example 3.6.3 below.

Notice also that the carriers of T_Σ need not be disjoint when Σ is overloaded. For example, if Σ is the signature of Example 2.5.7, then (T_Σ)_Nat and (T_Σ)_Bool both contain 0 and 1.

We can use an operation symbol σ in Σ as a "constructor," that is, as a template into whose argument slots terms of appropriate sorts can be placed, yielding new terms. For example, if t_1 and t_2 are two Σ-terms of sort s, and if + is in Σ_{ss,s}, then +(t_1, t_2) is another Σ-term, constructed by placing t_1 and t_2 into the form +(_,_). Similarly, we can think of a constant symbol σ ∈ Σ_{[],s} as constructing the constant term σ itself. In this way, T_Σ becomes a Σ-algebra. More precisely now,
We can view T_Σ as a Σ-algebra as follows:

(0) interpret σ ∈ Σ_{[],s} in T_Σ as the singleton list σ, and

(1) interpret σ ∈ Σ_{s_1...s_n,s} in T_Σ as the operation which sends t_1, ..., t_n to the list σ(t_1 ... t_n), where t_i ∈ T_{Σ,s_i} for i = 1, ..., n.

Thus, (T_Σ)_σ(t_1, ..., t_n) = σ(t_1 ... t_n), and from here on we usually use the first notation. T_Σ is called the term algebra, or sometimes the word algebra, over Σ. □ Example 2.6.3
Let us consider the term algebra for the signature Σ_AUTOM of Example 2.3.2: since there are no terms of sort Input, the only term of sort State is s_0, and hence the only term of sort Output is g(s_0); that is, T_{Σ_AUTOM,Input} = ∅, while T_{Σ_AUTOM,State} = {s_0}, and T_{Σ_AUTOM,Output} = {g(s_0)}. The moral of this example is that the term algebras of signatures that are intended to be interpreted loosely are not necessarily very interesting. In fact, Example 2.5.4 is a more typical term algebra. □
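Ground terms in the sense of Definition 2.6.1 are easy to prototype. In the sketch below (our own encoding, not the book's), a term is a pair (symbol, subterms), and the sort of a term is computed by checking clauses (0) and (1) of the definition against the signature Σ_NATEXP, which has no overloading, so each symbol has exactly one rank:

```python
# Each symbol of NATEXP mapped to its unique rank (arity, result sort).
NATEXP = {
    "0": ((), "Nat"),
    "s": (("Nat",), "Nat"),
    "+": (("Nat", "Nat"), "Nat"),
    "*": (("Nat", "Nat"), "Nat"),
}

def sort_of(term, sig):
    """Return the sort of a ground term, checking well-formedness
    against clauses (0) and (1) of Definition 2.6.1."""
    sym, subs = term
    arity, result = sig[sym]
    assert len(subs) == len(arity), "wrong number of subterms"
    for sub, expected in zip(subs, arity):
        assert sort_of(sub, sig) == expected, "subterm of wrong sort"
    return result

zero = ("0", ())
t = ("+", (("s", (zero,)), zero))        # the term s(0) + 0
```

Calling sort_of on a badly formed pair raises an error, reflecting that such a list is simply not in T_Σ.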
Most expositions of logic treat the unsorted case, rather than the many-sorted case. Indeed, there has been a belief that many-sorted logic is just a special case of unsorted logic. However, this fails for a variety of reasons, including that many-sorted theorem proving can be much more efficient than unsorted theorem proving, and that different rules of deduction must be used in certain cases.

This subsection describes unsorted algebra, and informally shows that it is equivalent to one-sorted algebra, i.e., to the special case where S has just one sort. Rules of deduction and some other logics are considered later.
Definition 2.7.1: An unsorted signature Σ is a family {Σ_n | n ∈ ω} of sets, whose elements are called operation symbols; σ ∈ Σ_0 is called a constant symbol, and σ ∈ Σ_n is said to have arity n. □

Notice that overloading is impossible for unsorted signatures.
Definition 2.7.2
Given an unsorted signature Σ, then a Σ-algebra M is a set, also denoted M and called the carrier, together with

(0) an element M_σ ∈ M for each σ ∈ Σ_0, and

(1) an arrow M_σ : M^n → M for each σ ∈ Σ_n with n > 0. □

Note that when n = 0, M^0 is (by convention defined to be) a one-point set, and so M_σ : M^0 → M determines an element of M in its image, which can be considered its value.

We now consider the relationship with the one-sorted case. Let us assume that we are given a sort set S = {s}. Then an S-sorted set is a family {M_s | s ∈ S} consisting of a single set M_s, and an S-sorted arrow h : M_s → M′_s is a family {h_s | s ∈ S} consisting of a single arrow h_s : M_s → M′_s. So one-sorted sets and arrows are essentially the same thing as ordinary sets and arrows.

Again assuming that S = {s}, an S-sorted signature is a family {Σ_{⟨w,s⟩} | ⟨w,s⟩ ∈ S* × S}, which can be identified with the unsorted signature Σ′ = {Σ′_n | n ∈ ω} where Σ′_n = Σ_{s^n,s}. Hence, a Σ-algebra is essentially the same thing as a Σ′-algebra. In the following, we will usually identify the unsorted and one-sorted concepts.

The use of many-sorted structures is important in computing science because it can model the way that programming and specification languages keep track of the types of entities. Also, syntax in general forms a many-sorted algebra; this observation will later be helpful in formalizing various logical systems. However, most of the mathematics literature and much of the computing literature treat only unsorted algebra, which is inadequate for the applications on which this book focusses.

Literature

The development and use of algebras in the technical sense was a major advance in mathematics.
Alfred North Whitehead [182] prefigured this revolution in the late nineteenth century. Around 1931, Emmy Noether laid the foundations for what is now called "modern" or "abstract algebra" by systematizing and widely applying the concepts of algebra, and particularly homomorphism; for this reason she has been called "the mother of modern algebra" [20, 127]. In 1935, Garrett Birkhoff [12] gave the now standard definitions for the unsorted case, and proved the important completeness and variety theorems. Perhaps the classic mathematical reference for general (unsorted) algebra is Cohn [32]; this book also discusses some category theory and some applications to theoretical computing science.

Many-sorted algebra seems to have been first studied by Higgins [101] in 1963; Benabou [8] gave an elegant category-theoretic development around 1968. The use of sorted sets for many-sorted algebra seems notationally simpler than alternative approaches, such as [101], [8] and [13]; it was introduced by the author in lectures at the University of Chicago in 1968, and first appeared in print in [52]. The definition of signature with overloading (Definition 2.3.1) was first developed in these early lectures, but the idea only reveals its full potential in order-sorted algebra, which adds subsorts, as discussed in Chapter 10. Ideas in the papers [24] and [137] also contributed to the treatment of many-sorted general algebra that is given in this text.

Our systematic use of arrows is influenced by category theory, for which see, e.g., [126, 63].

"ADJ diagrams" were introduced in [87] as a way to visualize the many-sorted signatures used in the theory of abstract data types by the "ADJ group," which was originally defined to be the set {Goguen, Thatcher, Wagner, Wright}. (See [58] for some historical remarks on ADJ.) The name "ADJ diagram" is due to Cliff Jones.

OBJ began as an algebraic specification language at UCLA about 1976 [53, 55, 85], and was further developed at SRI International [47, 90] and several other sites [26, 166, 33] as a declarative specification and rapid prototyping language; Appendix A gives more detail on OBJ3, following [90, 77]. The use of OBJ as a theorem prover stems from [59], as further developed in [62].
A Note for Lecturers:
When lecturing on the material in this chapter, it may help to begin with examples (automata, natural number expressions, graphs, etc.), first giving their ADJ diagrams, then an intuitive explanation, then their OBJ3 syntax, then some models, and then some computations in those models. This is because some students without a sufficient mathematics background can find the formalities of S-sorted sets, arrows, and so on, rather difficult. After these topics have been treated, then the formal definitions of signature and algebra can be introduced. It helps to motivate the material by reminding students frequently that signatures provide the syntax for a domain within which we wish to prove theorems, and that algebras provide the semantics (models).

Homomorphism, Equation and Satisfaction
Homomorphisms can express many important relationships between algebras, including isomorphism, in which two structures differ only in how they represent their elements, as well as the subalgebra and quotient algebra relationships. In addition, we will use homomorphisms to characterize standard models and to define substitutions, two basic concepts that will play an important role throughout this book.

Homomorphism and Isomorphism

Homomorphisms formalize the idea of interpreting one Σ-algebra into another, by mapping elements to elements in such a way that all sorts, operations, and constants are preserved. This concept may already be familiar from linear transformations, which map vectors to vectors in such a way as to preserve the (constant) vector 0, as well as the operations of vector addition and scalar multiplication. The following equations express this,

  T(0) = 0
  T(a + b) = T(a) + T(b)
  T(r • a) = r • T(a)

where a, b are vectors, r is a scalar, and • is scalar multiplication. The general notion is:

Definition 3.1.1
Given an S-sorted signature Σ and Σ-algebras M, M′, a Σ-homomorphism h : M → M′ is an S-sorted arrow h : M → M′ such that the following homomorphism condition holds:

(0) h_s(M_c) = M′_c for each constant symbol c ∈ Σ_{[],s}, and

(1) h_s(M_σ(m_1, ..., m_n)) = M′_σ(h_{s_1}(m_1), ..., h_{s_n}(m_n)) whenever n > 0, σ ∈ Σ_{s_1...s_n,s} and m_i ∈ M_{s_i} for i = 1, ..., n.
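For finite (or finitely checkable) algebras, the homomorphism condition can be verified mechanically. The following Python sketch (our own encoding, with a one-sorted signature for simplicity) checks conditions (0) and (1) for the map h(n) = n mod 3, from the naturals with zero and successor onto Z_3 with successor mod 3:

```python
# An algebra is given as a dict from operation symbols to constants
# (arity 0) or unary functions (arity 1); is_hom checks the
# homomorphism condition on a finite sample of the carrier.

def is_hom(h, M_ops, M2_ops, carrier, syms):
    ok = True
    for sym, arity in syms:
        f, g = M_ops[sym], M2_ops[sym]
        if arity == 0:
            ok = ok and h(f) == g                 # condition (0)
        else:
            for m in carrier:
                ok = ok and h(f(m)) == g(h(m))    # condition (1)
    return ok

M_ops = {"0": 0, "s": lambda n: n + 1}            # naturals with successor
M2_ops = {"0": 0, "s": lambda n: (n + 1) % 3}     # Z_3 with successor mod 3
h = lambda n: n % 3
```

The check passes for h, but fails for, say, n ↦ min(n, 5), which does not commute with successor.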
The composition of Σ-homomorphisms g : M → M′ and h : M′ → M″ is their composition as S-sorted arrows, denoted g;h : M → M″. If h : M′ → M is an inclusion and a homomorphism, then M′ is said to be a sub-Σ-algebra of M; in this case, h may be called an inclusion homomorphism. □

Exercise 3.1.1: Show that a sub-Σ-algebra M′ of M is a subset (S-indexed, of course) that is closed under all the operations in Σ, i.e., that satisfies: (1) M′_s ⊆ M_s for all s ∈ S; and (2) for every σ ∈ Σ_{s_1...s_n,s}, M_σ(a_1, ..., a_n) ∈ M′_s whenever a_i ∈ M′_{s_i}. □

Note that to cover signatures with overloading, we should really have written the two conditions of Definition 3.1.1 as:

(0) h_s(M^{[],s}_c) = M′^{[],s}_c for each constant symbol c ∈ Σ_{[],s}, and

(1) h_s(M^{w,s}_σ(m_1, ..., m_n)) = M′^{w,s}_σ(h_{s_1}(m_1), ..., h_{s_n}(m_n)) whenever w = s_1...s_n, n > 0, σ ∈ Σ_{w,s} and m_i ∈ M_{s_i} for i = 1, ..., n.

Exercise 3.1.2
Show that a composition of two Σ-homomorphisms is a Σ-homomorphism, and that the identity 1_M on a Σ-algebra M is a Σ-homomorphism. □

It may be interesting to see explicitly what the homomorphisms of graphs and automata, as defined in the previous chapter, are:
Example 3.1.2
Given two graphs (in the sense of Example 2.5.5), say G = (E, N, ∂_0, ∂_1) and G′ = (E′, N′, ∂′_0, ∂′_1), then a homomorphism h : G → G′ consists of two arrows, h_E : E → E′ and h_N : N → N′, that satisfy the homomorphism condition for each σ ∈ Σ. In this case, there are just two σ in Σ, and the corresponding equations are

  h_N(∂_0(e)) = ∂′_0(h_E(e))
  h_N(∂_1(e)) = ∂′_1(h_E(e))

for e ∈ E. These equations say that graph homomorphisms preserve source and target. □

Example 3.1.3: If A = (X, W, Y, f, g, s_0) and A′ = (X′, W′, Y′, f′, g′, s′_0) are two automata (in the sense of Example 2.5.3), then a homomorphism h : A → A′ consists of three arrows, which we may denote h_Input : X → X′, h_State : W → W′, and h_Output : Y → Y′, satisfying the following three equations

  h_State(f(x, s)) = f′(h_Input(x), h_State(s))
  h_Output(g(s)) = g′(h_State(s))
  h_State(s_0) = s′_0

which just say that automaton homomorphisms preserve the operations of automata. □

For those not already familiar with it, this notion is defined in Appendix C.

Thus, Σ-homomorphisms are arrows that preserve the structure of Σ-algebras. One of the most important kinds of homomorphism is the isomorphism, which provides a translation between the data representations of two Σ-algebras that are "abstractly the same." Before giving a formal definition, we illustrate the concept with the following: Example 3.1.4
Let us consider two different ways of representing the natural numbers, each a variant of the Peano representation of Example 2.3.3. For the first, we have a one-sorted algebra P whose carrier consists of the lists 0, s 0, s s 0, ..., in which 0 denotes the list 0, and the operation s maps an expression e to the expression s e. For the second, we have an algebra P′ whose carrier consists of the lists 0, 0′, 0″, .... We can now define h : P → P′ recursively by the equations

  h(0) = 0
  h(s e) = h(e)′.

Intuitively, P and P′ provide two different representations of the same thing, and h describes a translation between these representations. □

Definition 3.1.5: A Σ-homomorphism h : M → M′ is a Σ-isomorphism iff there is another Σ-homomorphism g : M′ → M such that h;g = 1_M and g;h = 1_{M′} (i.e., such that for each s ∈ S, g_s(h_s(m)) = m for all m ∈ M_s and h_s(g_s(m′)) = m′ for all m′ ∈ M′_s). In this case, g is called the inverse of h, and is denoted h^{-1}; also, we write M ≅_Σ M′ if there exists a Σ-isomorphism between M and M′, and we may omit the subscript Σ if it is clear from context. □ Exercise 3.1.3
Prove that h as defined in Example 3.1.4 above really is an isomorphism. □ Example 3.1.6
We now consider the binary representation of the natural numbers, forming a one-sorted algebra B. Its carrier consists of the symbol 0 plus all (finite) lists of 0's and 1's not beginning with 0; 0 denotes the list 0 (i.e., 0_B = 0), and s_B is binary addition of 1. Then there is an arrow h : P → B, with P as defined in Example 3.1.4, such that

  h(0) = 0
  h(s e) = 1 + h(e). □ Exercise 3.1.4
Prove that h as defined in Example 3.1.6 is an isomorphism. (cid:2) Exercise 3.1.5
Prove that a Σ-homomorphism h is an isomorphism iff each h_s is bijective. □

The following summarizes some of the most useful properties of isomorphisms:
Proposition 3.1.7: If f : M → M′ and g : M′ → M″ are Σ-isomorphisms, then

(a) (f^{-1})^{-1} = f.
(b) (f;g)^{-1} = g^{-1};f^{-1}.
(c) (1_M)^{-1} = 1_M.
(d) ≅_Σ is an equivalence relation on the class of all Σ-algebras. □ Exercise 3.1.6
Prove the assertions in Proposition 3.1.7. (cid:2)
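The translations of Examples 3.1.4 and 3.1.6 can both be run directly. Below is a Python sketch (the string encodings are ours): to_ticks maps the Peano list "s s ... s 0" to tick notation, and to_binary maps it to the binary representation, with binary addition of 1 implemented by carry propagation.

```python
def to_ticks(e):
    """h : P -> P' of Example 3.1.4:  h(0) = 0,  h(s e) = h(e)'."""
    if e == "0":
        return "0"
    return to_ticks(e[2:]) + "'"      # e has the form "s e'"

def add1(b):
    """Binary successor on a bit string with no leading zeros."""
    bits = list(b)
    i = len(bits) - 1
    while i >= 0 and bits[i] == "1":  # propagate the carry over trailing 1s
        bits[i] = "0"
        i -= 1
    if i < 0:
        return "1" + "".join(bits)    # overflow: prepend a new leading 1
    bits[i] = "1"
    return "".join(bits)

def to_binary(e):
    """h : P -> B of Example 3.1.6:  h(0) = 0,  h(s e) = 1 + h(e)."""
    if e == "0":
        return "0"
    return add1(to_binary(e[2:]))
```

For instance, the Peano term for 5 translates to "0'''''" and to "101" respectively; both maps are bijective, as the exercises ask you to prove.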
In Chapter 6, we will see that if there is an injective Σ-homomorphism h : M → M′, then M is isomorphic to a subalgebra of M′, and if there is a surjective Σ-homomorphism h : M → M′, then M′ is isomorphic to a quotient algebra of M; the converses also hold. These results are Corollaries 6.1.8 and 6.1.9, respectively, and their converses are given in Exercise 6.1.2.

Definition 3.1.8: An S-sorted arrow f : M → M′ is a left inverse iff there is another S-sorted arrow g : M′ → M such that f;g = 1_M. In this case, we also say that g is a right inverse of f; we may also say that f has a right inverse and that g has a left inverse. □ Exercise 3.1.7
Show that if an S-sorted arrow has a right inverse then it is injective, and if it has a left inverse then it is surjective. □ Exercise 3.1.8

Show that if an S-sorted arrow is injective and has a left inverse, then it is bijective. Similarly, show that if an arrow is surjective and has a right inverse, then it is bijective. □

These results imply the following, which will be very useful in certain proofs later on:
Proposition 3.1.9
An injective Σ-homomorphism with a left inverse is an isomorphism, and so is a surjective Σ-homomorphism with a right inverse. □

We now consider homomorphisms for unsorted algebra in the sense of Section 2.7.
Definition 3.1.10
Given unsorted Σ-algebras M and M′, then a Σ-homomorphism h : M → M′ is an arrow M → M′, also denoted h, such that

(0) h(M_σ) = M′_σ whenever σ ∈ Σ_0, and

(1) h(M_σ(m_1, ..., m_n)) = M′_σ(h(m_1), ..., h(m_n)) whenever σ ∈ Σ_n and m_i ∈ M for i = 1, ..., n with n > 0. □

This concept is defined in Appendix C.

Consistently with the results of Section 2.7, such a Σ-homomorphism is essentially the same thing as a one-sorted Σ′-homomorphism, where Σ_n = Σ′_{s^n,s} with S = {s}.

Initiality of the Term Algebra

For many signatures Σ, the term algebra T_Σ of Section 2.6 has a very special (and important) property: there is a unique way to interpret each of its elements in any Σ-algebra M. For example, if we let Σ = Σ_NATEXP and let M = ω, then t = s(s(s(s(0))) ∗ (s(0) + s(0))) should be interpreted as 7. And if we let M = {true, false}, with + interpreted as "or", with ∗ interpreted as "and", s as "not", and 0 as false, then t should be interpreted as false. This section formalizes this property and explores some of its consequences. The key property of T_Σ is stated below; its proof is given in Appendix B. We later construct a Σ-algebra that has this property for Σ that may be overloaded (Theorem 3.2.10).

Theorem 3.2.1 (Initiality): Given a signature Σ without overloading and any Σ-algebra M, there is a unique Σ-homomorphism T_Σ → M. □

The property that there is a unique Σ-homomorphism to any other Σ-algebra is called initiality, and any such algebra is called an initial algebra. We may think of the operation symbols in a signature Σ as elementary operations or commands (or microinstructions), and then think of T_Σ as the collection of all expressions (or simple programs) formed from Σ, and finally think of a Σ-algebra M as a machine (or microprocessor) that can execute the commands in Σ.
For example, a constant symbol f in Σ can be thought of as an instruction to load the value of f in M. Then Theorem 3.2.1 tells us that each such simple program has one and only one value when executed on M. Thus initiality expresses a very basic intuition about computation on a machine.

We will later see that any two initial algebras are isomorphic, so that initiality defines a "standard model" for a signature that is unique up to the renaming of its elements.

Many interesting arrows arise as unique Σ-homomorphisms from some Σ-algebra; indeed, defining arrows by initiality is essentially the same as defining functions by induction. Let us consider some examples.

Example 3.2.2 (Evaluating Terms over the Naturals): If Σ is the signature Σ_NATEXP of Example 2.3.4, then we can give ω the structure of a Σ-algebra in which the operation symbol 0 is interpreted as the number 0, the operation symbol s is interpreted as the successor operation, the operation symbol + is interpreted as the addition function, and ∗ as multiplication. Then the unique Σ-homomorphism from T_Σ to ω computes the values of the arithmetic expressions in T_Σ in exactly the expected way. □ Exercise 3.2.1

Compute the values of the terms in Figure 2.5 in the algebra of Example 3.2.2. □
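The unique homomorphism from the term algebra is just structural evaluation, and can be sketched in a few lines of Python (term encoding as in our earlier sketch: a term is a pair (symbol, subterms)). Below, the term t from the start of this section is evaluated both in ω and in the two-element Boolean algebra described there:

```python
def evaluate(term, interp):
    """The unique homomorphism T_Sigma -> M: evaluate subterms, then
    apply (or return) the interpretation of the top symbol."""
    sym, subs = term
    vals = [evaluate(sub, interp) for sub in subs]
    f = interp[sym]
    return f(*vals) if callable(f) else f

# t = s(s(s(s(0))) * (s(0) + s(0)))
zero = ("0", ())
one = ("s", (zero,))
three = ("s", (("s", (one,)),))
t = ("s", (("*", (three, ("+", (one, one)))),))

into_omega = {"0": 0, "s": lambda n: n + 1,
              "+": lambda m, n: m + n, "*": lambda m, n: m * n}
into_bool = {"0": False, "s": lambda p: not p,
             "+": lambda p, q: p or q, "*": lambda p, q: p and q}
```

Evaluating t via into_omega gives 7, and via into_bool gives false, matching the two interpretations discussed above; the recursion is exactly "definition by initiality."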
Example 3.2.3: If Σ = Σ_NATEXP, then T_Σ is not isomorphic to ω; but for Σ′ = Σ_NATP, T_Σ′ is isomorphic to ω. Indeed, T_Σ′ is the natural numbers in Peano notation. □

Example 3.2.4 (Depth of a Term): Given an arbitrary signature Σ, we can make the natural numbers into a Σ-algebra Ω by letting Ω_s = ω for each s ∈ S, and by interpreting

(0) each σ ∈ Σ_{[],s} as 0 ∈ Ω_s, and

(1) each σ ∈ Σ_{s_1...s_n,s} for n > 0 as the function sending n natural numbers i_1, ..., i_n to the number 1 + max{i_1, ..., i_n}.

Then the unique Σ-homomorphism d : T_Σ → Ω computes the depth of Σ-terms, that is, the maximum amount of nesting in terms. □

Exercise 3.2.2: Compute the depth of the terms shown in Figure 2.5 (page 25) using the algebra of Example 3.2.4. □

Example 3.2.5 (Size of a Term): Let Σ be the signature Σ_NATEXP of Example 2.3.4, and let ω be the carrier of an algebra A in which 0 ∈ Σ is interpreted as 1, s_A(n) = n + 1, and +_A(m, n) = ∗_A(m, n) = m + n + 1. Then the unique Σ-homomorphism h : T_Σ → A computes the size of a term, that is, the number of operation symbols that occur in it. □

Exercise 3.2.3: Compute the size of the terms shown in Figure 2.5 (page 25) using the algebra of Example 3.2.5. □
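Examples 3.2.4 and 3.2.5 say that depth and size are the unique homomorphisms into two particular algebras on ω, i.e., structural recursions. Here is a hedged Python sketch (same term encoding as before), applied to s(0) + s(s(0)), one of the terms pictured in Figure 2.5:

```python
def depth(term):
    """Evaluate in the algebra Omega of Example 3.2.4:
    constants go to 0, and each operation adds 1 to the max of its arguments."""
    sym, subs = term
    return 0 if not subs else 1 + max(depth(s) for s in subs)

def size(term):
    """Evaluate in the algebra A of Example 3.2.5:
    constants count 1, and each operation adds 1 to the sum of its arguments."""
    sym, subs = term
    return 1 + sum(size(s) for s in subs)

zero = ("0", ())
t = ("+", (("s", (zero,)), ("s", (("s", (zero,)),))))   # s(0) + s(s(0))
```

For this t the depth is 3 and the size is 6, which you can check against the tree picture in Figure 2.5.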
Exercise 3.2.4
Use initiality to define an arrow from Σ-terms which gives the number of interior (i.e., non-leaf) nodes in the corresponding tree. □

This way of defining functions is a special case of a much more general method called initial algebra semantics [52, 88]. This method regards terms in T_Σ as objects to which some meaning is to be assigned, constructs a Σ-algebra M of suitable meanings, and then lets the unique Σ-homomorphism that automatically exists do the work. For example, T_Σ might contain the various syntactic elements of a programming language in its various sorts, such as expressions, procedures, and of course programs, with M containing suitable denotations for these, e.g., in the style of denotational semantics [92, 161]; many examples of this approach are given in [77].

Example 3.2.6 (Final Algebra): There is a trivial but interesting algebra denoted F_Σ that can be constructed for any signature Σ: let (F_Σ)_s = {s} for each sort s ∈ S; and given σ ∈ Σ_{w,s}, let F_σ(s_1, ..., s_n) = s when w = s_1...s_n. Then the unique homomorphism T_Σ → F_Σ gives the sort of a Σ-term. □

The following generalizes this to any signature Σ and any Σ-algebra; the unique Σ-homomorphism again gives the sorts of elements.

Proposition 3.2.7
Given any signature Σ and any Σ-algebra M, there is one and only one Σ-homomorphism h : M → F_Σ.

Proof: Given m ∈ M_s, we have to define h(m) = s, because h(m) must be in (F_Σ)_s = {s}. It is straightforward to check that this gives a Σ-homomorphism. □

This property is called finality; it is dual to initiality.
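The final algebra F_Σ can be run in the same structural-recursion style as our earlier evaluation sketch (the encoding is ours): interpreting each operation symbol as a function that ignores its arguments and returns its result sort, the unique homomorphism into F_Σ computes the sort of a term, as in Example 3.2.6. Here it is for the AUTOM signature of Example 2.6.3:

```python
# Each operation symbol of AUTOM mapped to its result sort; in F_Sigma
# every operation simply returns the result sort of its symbol.
RESULT = {"s0": "State", "f": "State", "g": "Output"}

def sort_of_term(term):
    """Evaluate a term in F_Sigma: recurse on subterms (their sorts are
    computed but not needed), then return the top symbol's result sort."""
    sym, subs = term
    for sub in subs:
        sort_of_term(sub)
    return RESULT[sym]
```

On the two terms of Example 2.6.3, s0 evaluates to State and g(s0) to Output, which is exactly the claimed "sort of a Σ-term."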
Exercise 3.2.5
Let Σ be the signature Σ_NATP of Example 2.3.3, and let D be the Σ-algebra with carrier {0, 1}, with 0_D = 0, and with s_D(0) = s_D(1) = 0. Give a direct proof that there is one and only one Σ-homomorphism T_Σ → D. □

Example 3.2.8: When there is overloading, terms do not always have a unique sort or parse. For example, if Σ is the signature of Example 2.5.7, then 0 and 1 are ambiguous in the sense that there is no unique s ∈ S such that they lie in (T_Σ)_s; the terms 0 + 1, 1 + (0 + 1) and many others are also ambiguous, although for example, the terms s(0) and 1 + (n(0) + 0) are unambiguous. Proposition 3.7.2 in Section 3.7 below gives a necessary and sufficient condition on Σ such that no Σ-terms are ambiguous.

For Σ as in Example 2.5.7, T_Σ is initial even though it has ambiguous terms. However, there are overloaded signatures such that T_Σ is not initial. For example, let S = {A, B, C}, let Σ_{[],A} = Σ_{[],B} contain a single constant symbol, say b, and let Σ_{A,C} = Σ_{B,C} = {f}. Now define a Σ-algebra M as follows: M_A = {0}; M_B = {1}; M_C = {0, 1}; M^{[],A}_b = 0, M^{[],B}_b = 1, M^{A,C}_f(0) = 0, and M^{B,C}_f(1) = 1. Then there can be no Σ-homomorphism h : T_Σ → M, because the term f(b) has two distinct parses of the same sort, which are T^{A,C}_f(T^{[],A}_b) and T^{B,C}_f(T^{[],B}_b). Therefore f(b) must be mapped to two different elements of M,

  h(T^{A,C}_f(T^{[],A}_b)) = M^{A,C}_f(M^{[],A}_b) = 0,
  h(T^{B,C}_f(T^{[],B}_b)) = M^{B,C}_f(M^{[],B}_b) = 1,

which is impossible. □

Because initiality is so important for this book, the above example means that T_Σ is not adequate for our purposes. However, there is a closely related Σ-algebra that is initial for any signature; its terms are annotated with their sorts.
Definition 3.2.9 Given any S-sorted signature Σ, the S-sorted set 𝒯_Σ of all sorted (ground) Σ-terms is the smallest set of lists over the set S ∪ |Σ| ∪ {·, (, )} (where ·, ( and ) are special symbols disjoint from Σ) such that

(0) if σ ∈ Σ_{[],s} then σ·s ∈ 𝒯_{Σ,s} for all s ∈ S, and
(1) if σ ∈ Σ_{s1...sn,s} and t_i ∈ 𝒯_{Σ,si} for i = 1, …, n, then σ·s(t_1 … t_n) ∈ 𝒯_{Σ,s}.

The S-sorted set 𝒯_Σ can be given the structure of a Σ-algebra in the same way that T_Σ was:

(0) interpret σ ∈ Σ_{[],s} in 𝒯_Σ as the singleton list σ·s, and
(1) interpret σ ∈ Σ_{s1...sn,s} in 𝒯_Σ as the function that sends t_1, …, t_n to the list σ·s(t_1 … t_n), where t_i ∈ 𝒯_{Σ,si} for i = 1, …, n with n > 0.

As before, we will usually write σ·s(t_1, …, t_n) instead of σ·s(t_1 … t_n). Call 𝒯_Σ the (sort) annotated term algebra over Σ. □

The following result is proved in Appendix B:
Theorem 3.2.10 (Initiality) Given any signature Σ and any Σ-algebra M, there is a unique Σ-homomorphism 𝒯_Σ → M. □

It follows that 𝒯_Σ is "almost an initial Σ-algebra," because its terms differ from those of T_Σ only in the sort annotations; for many signatures, including most of those that come up in practice, T_Σ actually is initial. This motivates the following

Convention 3.2.11
We will usually write terms without sort annotations, and will usually annotate operations only in so far as is necessary to determine a unique fully annotated term with the given partial annotation. Moreover, we will usually write T_Σ when we really mean 𝒯_Σ. □

Proposition 3.2.12 below characterizes when T_Σ is initial. It uses the following:

Exercise 3.2.6
Show that the arrow h : 𝒯_Σ → T_Σ that strips sort annotations off operation symbols is a Σ-homomorphism. It may be defined as follows:

(0) h_s(σ·s) = σ for σ ∈ Σ_{[],s}, and
(1) h_s(σ·s(t_1 … t_n)) = σ(h_{s1}(t_1), …, h_{sn}(t_n)) for σ ∈ Σ_{s1...sn,s} and t_i ∈ 𝒯_{Σ,si} for i = 1, …, n with n > 0. □

Proposition 3.2.12
The term algebra T_Σ is initial iff any two distinct sorted terms of the same sort remain distinct after the sorts are stripped off operation symbols.

Proof: That the unique Σ-homomorphism h : 𝒯_Σ → T_Σ strips off sorts is due to its homomorphic property, and since it is surjective, it is an isomorphism iff it is injective. □

Recall that a sufficient condition for T_Σ to be initial is that Σ has no overloading. The result above says that T_Σ is initial iff whatever overloading may be present does not produce ambiguity. From this, it follows by induction that it suffices for there to be no overloaded constants.

Structural induction [21] is an important proof technique that is closely related to initiality. To prove that a certain property P is true of all Σ-terms, we (0) prove that P is true of all the constants in Σ, and then (1) prove that if P is true of t_1, …, t_n (of appropriate sorts), then P is true of σ(t_1, …, t_n), for every nonconstant symbol σ in Σ. It then follows that P is true of all (ground) Σ-terms, because they can all be built from the symbols in Σ, working upward from the constants. Somewhat more formally, let P ⊆ T_Σ be the (S-sorted) subset of all Σ-terms for which the desired property holds. Then the two properties to be shown imply that P is a Σ-subalgebra of T_Σ. But it can be shown (see the next paragraph) that T_Σ does not have any proper Σ-subalgebras, from which it follows that P = T_Σ.

More formally, the following steps are required for proving that some S-indexed family P of predicates holds for all Σ-terms by structural induction:

(0) prove that P_s holds for every σ ∈ Σ_{[],s}, for each s ∈ S; and
(1) prove that P_s holds for σ(t_1, …, t_n) for all σ ∈ Σ_{s1...sn,s}, where P_{si} is assumed to hold for each t_i, a Σ-term of sort s_i, for i = 1, …, n.

This proof method is carefully stated and validated in Chapter 6 (Theorem 6.4.4), but the idea is as follows: let i denote the inclusion of P into T_Σ and let h : T_Σ → P be the unique Σ-homomorphism given by initiality. Then h ; i : T_Σ → T_Σ is also a Σ-homomorphism.
But initiality implies that there is only one Σ-homomorphism T_Σ → T_Σ, which is necessarily the identity 1_{T_Σ}, because that is a Σ-homomorphism. Therefore h ; i = 1_{T_Σ}, and so by Exercise 3.1.8, i is a Σ-isomorphism, because it is injective and a right inverse.

3.3 Equation and Satisfaction

This section defines the basic concepts of equation, and of satisfaction of an equation by an algebra. This gives us a semantic notion of truth for equational logic, and hence a standard by which to judge the soundness of rules of deduction for that system. A number of examples are also given.

To discuss equations, we need terms with variables. It can seem quite difficult to say exactly what a variable actually is in some branches of mathematics. But in general algebra, this is not so hard: a Σ-term with variables in X is just an element of T_{Σ(X)}, where X is a ground signature (see Notation 2.3.6) disjoint from Σ; that is, a variable is just a new constant symbol.

Definition 3.3.1 A Σ-equation consists of a ground signature X of variable symbols (disjoint from Σ) plus two Σ(X)-terms of the same sort s ∈ S; we may write such an equation abstractly in the form

(∀X) t = t′

and concretely in the form

(∀x, y, z) t = t′

when (for example) |X| = {x, y, z} and the sorts of x, y, z can be inferred from their uses in t and in t′. A specification is a pair (Σ, A), consisting of a signature Σ and a set A of Σ-equations. A Σ-specification is a specification whose signature is Σ. □

Example 3.3.2 (Semigroups) This specification has just one sort, say
Elt, and just one operation, say _*_ : Elt Elt -> Elt, which must obey the associative law,

(x ∗ y) ∗ z = x ∗ (y ∗ z)

where x, y, z are variables of sort Elt. If we let X be the ground signature with X_Elt = {x, y, z}, then this can be written more accurately as

(∀X) (x ∗ y) ∗ z = x ∗ (y ∗ z)

and slightly less formally as

(∀x, y, z) (x ∗ y) ∗ z = x ∗ (y ∗ z).

In OBJ, we would write

  th SEMIGROUP is
    sort Elt .
    op _*_ : Elt Elt -> Elt .
    vars X Y Z : Elt .
    eq (X * Y)* Z = X *(Y * Z).
  endth
This follows the convention that variable names begin with an uppercase letter; like the convention for sort names, it is not enforced by the OBJ3 system. However, the systematic use of these conventions does help users to distinguish sort and variable names from keywords and operation symbols, and thus helps make specifications more readable. (For a diagram of the signature of this specification, delete the edges labelled by e and by -1 from Figure 3.1.) □

(Recall that a ground signature means that all of the variable symbols in X are distinct.)

[Figure 3.1: Signature for Groups]
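Whether a particular finite algebra satisfies the associative law can be checked by brute force over all assignments of the three variables. A small Python sketch (our own helper names, not OBJ), with max on {0, 1, 2} as one associative interpretation of *:

```python
from itertools import product

def satisfies_assoc(carrier, star):
    """Check M |= (forall x,y,z) (x*y)*z = x*(y*z) by trying every
    assignment of the three variables into the carrier."""
    return all(star(star(x, y), z) == star(x, star(y, z))
               for x, y, z in product(carrier, repeat=3))

carrier = [0, 1, 2]
```

For example, `satisfies_assoc(carrier, max)` holds, while subtraction fails the law and is rejected by the same check.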
Example 3.3.3 (Monoids) Similarly, monoids are specified as follows:

  th MONOID is
    sort Elt .
    op e : -> Elt .
    op _*_ : Elt Elt -> Elt .
    vars X Y Z : Elt .
    eq X * e = X .
    eq e * X = X .
    eq (X * Y)* Z = X *(Y * Z).
  endth
This theory denotes the class of all monoids. (We no longer give a set-theoretic version.) Our convention names both objects and sorts with the singular version of the structure involved, rather than the plural; thus, we write MONOID and Elt rather than MONOIDS and Elts. □

Exercise 3.3.1
Write out a formal set-theoretic definition of the above OBJ specification of monoids, in the style of Example 2.5.3. □
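A concrete model of this theory is lists under concatenation, with the empty list as e. The following Python sketch (ours) spot-checks the three MONOID equations on a finite sample of elements; this is a test, not a proof:

```python
from itertools import product

# Interpret MONOID in lists: e is the empty list, * is concatenation.
e = []

def star(x, y):
    return x + y

def check_monoid(sample):
    """Spot-check the three MONOID equations on all assignments
    drawn from `sample` (a finite subset of the carrier)."""
    ident = all(star(x, e) == x and star(e, x) == x for x in sample)
    assoc = all(star(star(x, y), z) == star(x, star(y, z))
                for x, y, z in product(sample, repeat=3))
    return ident and assoc
```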
Example 3.3.4 (Groups) It is little more work to specify groups than to specify monoids:

  th GROUP is
    sort Elt .
    op _-1 : Elt -> Elt .
    op e : -> Elt .
    op _*_ : Elt Elt -> Elt .
    vars X Y Z : Elt .
    eq X * e = X .
    eq X *(X -1) = e .
    eq (X * Y)* Z = X *(Y * Z).
  endth

Notice that only half of the usual pairs of equations for the identity and inverse laws are given. We will later prove that the other halves follow from these laws (and vice versa). The signature for this specification is shown in Figure 3.1.

If h : G′ → G is an inclusion homomorphism, then G′ is said to be a subgroup of G. □

Example 3.3.5 (Integers) We can specify the integers as follows:

  obj INT is
    sort Int .
    op 0 : -> Int .
    op s_ : Int -> Int .
    op p_ : Int -> Int .
    var I : Int .
    eq s p I = I .
    eq p s I = I .
  endo
Here s_ is the successor operation, and p_ is the predecessor operation. This specification defines the algebra of integers with the given operations; we may also say that it denotes the class of all standard models of the integers, as initial algebras with these operations.

The following specification also defines addition and negation on the integers:

  obj INT is
    sort Int .
    op 0 : -> Int .
    op s_ : Int -> Int .
    op p_ : Int -> Int .
    op -_ : Int -> Int .
    op _+_ : Int Int -> Int .
    vars I J : Int .
    eq s p I = I .
    eq p s I = I .
    eq - 0 = 0 .
    eq - s I = p - I .
    eq - p I = s - I .
    eq I + 0 = I .
    eq I + s J = s(I + J).
    eq I + p J = p(I + J).
  endo
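The defining equations above can be read, left to right, as a recursive program. The following Python sketch (ours, not OBJ) mirrors them on the standard integer model, recursing on whether the argument was built with s_ or p_:

```python
# The INT equations read as a recursive program on the standard model:
# integers built from 0 by s (successor) and p (predecessor).

def s(i): return i + 1
def p(i): return i - 1

def neg(i):
    # eq - 0 = 0 .   eq - s I = p - I .   eq - p I = s - I .
    if i == 0:
        return 0
    return p(neg(p(i))) if i > 0 else s(neg(s(i)))

def add(i, j):
    # eq I + 0 = I .   eq I + s J = s(I + J) .   eq I + p J = p(I + J) .
    if j == 0:
        return i
    return s(add(i, p(j))) if j > 0 else p(add(i, s(j)))
```

The identity add(i, neg(i)) == 0 checks the inverse law on samples, anticipating the group remark made next.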
Here -_ is the negation operation and _+_ is addition. It is interesting to notice that the integers with 0 as identity, -_ as inverse and _+_ as "multiplication" form a group; however, we do not yet have the tools needed to prove this. □

(As will be discussed in more detail later, neither s_ nor p_ is a "constructor" in the usual sense, because there are non-trivial relations between them.)

Our commitment to semantics requires that we not only formalize what equations are, but also what they mean. This is done by the concept of satisfaction, which will use the following:

Notation 3.3.6
Recall that a Σ-algebra M provides an interpretation for each operation symbol in Σ, and in particular, for each constant symbol in Σ. If X is a ground signature (e.g., a set of variables), then an interpretation for X is just a (many-sorted) arrow a : X → M. Thus a Σ-algebra M and an arrow a : X → M give an interpretation in M for all of Σ(X), allowing M to be seen as a Σ(X)-algebra. Theorem 3.2.1 now gives a unique Σ(X)-homomorphism from the initial Σ(X)-algebra T_{Σ(X)} to M as a Σ(X)-algebra, using a.

In such a situation, we call a : X → M an interpretation or an assignment of the variable symbols in X, and we let ā : T_{Σ(X)} → M denote the unique extension of a to a Σ(X)-homomorphism from the term algebra T_{Σ(X)}. □

Definition 3.3.7 A Σ-algebra M satisfies a Σ-equation (∀X) t = t′ iff for any assignment a : X → M we have ā(t) = ā(t′) in M. In this case we write

M ⊨_Σ (∀X) t = t′.

We call ⊨ the "satisfaction relation," and we generally omit the subscript Σ when it is clear from context. A Σ-algebra M satisfies a set A of Σ-equations iff it satisfies each e ∈ A, and in this case we write M ⊨_Σ A.
We may also say that M is a P-algebra, and write M ⊨ P, where P is a specification (Σ, A). The class of all algebras that satisfy P is called the variety defined by P, and we may also say that the denotation of P is this variety.

Finally, for A a set of Σ-equations, we let A ⊨_Σ (∀X) t = t′ mean that M ⊨_Σ A implies M ⊨_Σ (∀X) t = t′. □

Example 3.3.8 If Σ is the signature Σ_MONOID of Example 3.3.3, then a Σ-algebra is a monoid iff it satisfies the equations in Example 3.3.3, i.e., iff it satisfies the specification MONOID of monoids. The denotation of the theory
MONOID is the variety of all monoids. For example, given a set S, recall that S* denotes the set of all lists of elements from S, including the empty list []. Then S* is a monoid with e = [] and with ∗ interpreted as concatenation of lists: for any choice of x, y, z ∈ S*, it is true that (x ∗ y) ∗ z = x ∗ (y ∗ z), because each term yields the concatenation of the same three lists. □

Example 3.3.9 If Σ is the signature Σ_GROUP of Example 3.3.4, then a Σ-algebra is a group iff it satisfies the equations in Example 3.3.4, i.e., iff it satisfies the specification
GROUP. For example, S* satisfies the first and third axioms of GROUP, because it is a monoid, but there is no way to define an operation i : S* → S* such that x ∗ i(x) = [] for all x ∈ S*, because a concatenation of two lists always yields a list that is at least as long as each of its arguments. Indeed, the only concatenation that yields the value [] is [] ∗ []. This is an example of non-satisfaction. □

Exercise 3.3.2
Let S be a set.

1. Show that the bijections f : S → S form a group under composition (of functions), with identity 1_S.
2. Let G be a group of bijections on a set S, and let F ⊆ S, called a "figure." Show that {f ∈ G | f(F) = F} is a subgroup of G, called the group of symmetries of F. □

Exercise 3.3.3 An endomorphism of a Σ-algebra M is a Σ-homomorphism h : M → M, and an automorphism is an endomorphism that is bijective (i.e., an isomorphism).

1. Show that the set of all endomorphisms of a given Σ-algebra M has the structure of a monoid under composition.
2. Show that the set of all automorphisms of a given Σ-algebra M has the structure of a group under composition. □

Example 3.3.10
A rather cute specification that is not very well known defines pairs of natural numbers using just one constant, two unary operations, and a single equation:

  obj 2NAT is
    sort 2Nat .
    op 0 : -> 2Nat .
    ops (s1_) (s2_) : 2Nat -> 2Nat .
    var P : 2Nat .
    eq s1 s2 P = s2 s1 P .
  endo

We can show that pairs of natural numbers form an initial algebra for 2NAT as follows: Let P = {⟨m, n⟩ | m, n ∈ ω}, let 0_P be ⟨0, 0⟩, let s1 send ⟨m, n⟩ to ⟨sm, n⟩, and let s2 send ⟨m, n⟩ to ⟨m, sn⟩. Now if M is any 2NAT-algebra, then define h : P → M to send ⟨m, n⟩ to (the value that is denoted by the term) s1^m s2^n 0 in M. Then h is a Σ-homomorphism because h(0_P) = h(⟨0, 0⟩) = 0_M, and h(s1⟨m, n⟩) = s1(h(⟨m, n⟩)) because each equals the value of s1^{m+1} s2^n 0 in M, and similarly for preservation of s2. We leave it as an exercise to show that if g : P → M is also a Σ-homomorphism, then necessarily g = h.

Another 2NAT-algebra has as carrier the set consisting of all terms of the form s1^m s2^n 0, with s1(s1^m s2^n 0) = s1^{m+1} s2^n 0 and s2(s1^m s2^n 0) = s1^m s2^{n+1} 0. □

The fact that variables are just unconstrained constants suggests that equations with variables can be regarded as ground equations in which the variables are treated as new constants. The following is a formal statement of this basic intuition:
Theorem 3.3.11 (Theorem of Constants) Given a signature Σ, a ground signature X disjoint from Σ, a set A of Σ-equations, and t, t′ ∈ T_{Σ(X)}, then

A ⊨_Σ (∀X) t = t′  iff  A ⊨_{Σ∪X} (∀∅) t = t′.

Proof: Each condition is equivalent to the condition that ā(t) = ā(t′) for every Σ(X)-algebra M satisfying A and every assignment a : X → M. □

It is very pleasing that this proof is so simple. This is because it is based on the semantics of satisfaction, rather than on some particular rules of deduction, and because it exploits the initiality of the term algebra. The intuition behind this result is again that variables "are" constants about which we do not know anything.

Although Example 3.3.4 gives perhaps the most common way to specify groups, there are many other equivalent ways. Therefore it is interesting to examine what "equivalent" means in this context. Of course, we first define equivalence semantically, and after that give syntactic methods for proving equivalence. This section considers only the special case where the two specifications have the same signature; Section 4.10 extends this to allow different signatures.
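For a finite algebra, satisfaction of the three GROUP equations of Example 3.3.4 can be decided by exhaustive search over assignments. A Python sketch (helper names ours) on the integers mod 3:

```python
from itertools import product

def is_group_model(carrier, e, inv, star):
    """Check the GROUP equations: right identity, right inverse,
    and associativity, over all assignments into `carrier`."""
    right_id = all(star(x, e) == x for x in carrier)
    right_inv = all(star(x, inv(x)) == e for x in carrier)
    assoc = all(star(star(x, y), z) == star(x, star(y, z))
                for x, y, z in product(carrier, repeat=3))
    return right_id and right_inv and assoc

mod3 = [0, 1, 2]
add3 = lambda x, y: (x + y) % 3
neg3 = lambda x: (-x) % 3
```

Addition mod 3 passes; max on the same carrier has a right identity but no right inverse, so it is rejected.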
Definition 3.3.12 Σ-specifications P and P′ are equivalent iff for each Σ-algebra M, M ⊨ P iff M ⊨ P′. We can now define a theory to be an equivalence class of specifications. □

It is usual to identify a specification P with the theory that it represents, that is, with its equivalence class; thus, we may say that GROUP is the theory of groups, rather than saying that GROUP represents (or presents) the theory of groups.

Example 3.3.13 (Left Groups) Example 3.3.4 specified the theory
GROUP of groups with right identity and inverse equations; here is a specification with left-handed versions of these equations:

  th GROUPL is
    sort Elt .
    op _-1 : Elt -> Elt .
    op e : -> Elt .
    op _*_ : Elt Elt -> Elt .
    vars X Y Z : Elt .
    eq e * X = X .
    eq (X -1) * X = e .
    eq X *(Y * Z) = (X * Y)* Z .
  endth

In Chapter 4 we will prove that GROUPL is equivalent to GROUP. □

3.4 Conditional Equations

There are many cases where an equation (or other formula) holds only under certain conditions. The extension of equations and their satisfaction to the conditional case is straightforward.
Definition 3.4.1 A conditional Σ-equation consists of a ground signature X disjoint from Σ, a finite set C of pairs of Σ(X)-terms, and a pair t, t′ of Σ(X)-terms; we will use the notation

(∀X) t = t′ if C.

Given a Σ-algebra M, define

M ⊨_Σ (∀X) t = t′ if C

to mean that, given any interpretation a : X → M, if ā(u) = ā(v) for each ⟨u, v⟩ ∈ C, then ā(t) = ā(t′). If A is a set of conditional Σ-equations, then we say that M satisfies A iff M satisfies each equation in A; and if e is a conditional equation, then A ⊨_Σ e iff M ⊨_Σ e whenever M ⊨_Σ A. □

Conditional equations make sense even when C is not finite; but without that restriction, neither equational deduction nor term rewriting with such equations would be possible in finite space, and in particular, we could not write down (finite) proof scores that use equations with infinite conditions.

Fact 3.4.2
Given any Σ-equation e = (∀X) t = t′, let e′ = (∀X) t = t′ if ∅. Then for each Σ-algebra M, M ⊨_Σ e iff M ⊨_Σ e′. □

Consequently, we can regard any ordinary equation as a conditional equation with the empty condition, and vice versa; we will feel free to do so hereafter.

The following result gives us a technique for proving conditional equations with equational deduction, and hence with reduction; the application of this result to deduction is given in Theorem 4.8.4.
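Satisfaction of a conditional equation by a finite algebra can likewise be checked by enumerating all assignments. A Python sketch, with our own term encoding (a bare string is a variable; a tuple is an operation application):

```python
from itertools import product

def eval_term(t, algebra, assignment):
    if isinstance(t, str):            # a variable
        return assignment[t]
    op, *args = t                     # an operation application
    return algebra[op](*(eval_term(u, algebra, assignment) for u in args))

def satisfies_cond(carrier, algebra, variables, lhs, rhs, conditions):
    """M |= (forall X) lhs = rhs if C: for every assignment, if all
    pairs in C evaluate equal, then lhs and rhs must evaluate equal."""
    for values in product(carrier, repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(eval_term(u, algebra, a) == eval_term(v, algebra, a)
               for u, v in conditions):
            if eval_term(lhs, algebra, a) != eval_term(rhs, algebra, a):
                return False
    return True

# Example: (forall x, y) x = y if { dbl(x) = dbl(y) }, where dbl doubles.
mod3_alg = {"dbl": lambda x: (2 * x) % 3}   # doubling is injective mod 3
mod4_alg = {"dbl": lambda x: (2 * x) % 4}   # but not mod 4
cond = [(("dbl", "x"), ("dbl", "y"))]
```

Mod 3 the conditional equation holds (doubling is injective there); mod 4 it fails, witnessed by x = 0, y = 2.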
Proposition 3.4.3 Given a conditional Σ-equation (∀X) t = t′ if C and a set A of Σ-equations, then

A ⊨_Σ (∀X) t = t′ if C  iff  (A ∪ C′) ⊨_{Σ(X)} (∀∅) t = t′,

where C′ is defined to be {(∀∅) u = v | ⟨u, v⟩ ∈ C}.

Proof: Each condition is equivalent to the following:

for each Σ-algebra M and interpretation a : X → M, if M ⊨ A and if ā(u) = ā(v) for each ⟨u, v⟩ ∈ C, then ā(t) = ā(t′),

where ā : T_{Σ(X)} → M is the unique homomorphism. □

Once again, initiality enables us to give a very simple proof.

[Figure 3.2: Free Algebra Property]
3.5 Substitution

The substitution of terms into other terms will play a basic role in later chapters, especially those on deduction and term rewriting. To help define this concept, we first consider the so-called free algebras, using a technique already employed in setting up Definition 3.3.7. Given a signature Σ and a ground signature X disjoint from Σ, we can form the Σ(X)-algebra T_{Σ(X)} and then view it as a Σ-algebra by just "forgetting" about the constants in X; this works because T_{Σ(X)} already has all the operations it needs to be a Σ-algebra, and it does no harm that it also has some others. Let us denote this Σ-algebra by T_Σ(X). It is called the free Σ-algebra generated by (or over) X, and it has the following characteristic property, called free generation by (or over) X (see also Figure 3.2):

Proposition 3.5.1
Given a signature Σ, a ground signature X disjoint from Σ, a Σ-algebra M, and a map a : X → M, there is a unique Σ-homomorphism ā : T_Σ(X) → M which extends a, in the sense that ā_s(x) = a_s(x) for each s ∈ S and x ∈ X_s. (This property is illustrated in Figure 3.2, where i_X is the S-sorted inclusion.) We may call a an assignment from X to M.

Proof: Let j be the interpretation of Σ in M. Then combining j with a gives an interpretation of Σ(X) in M, and hence makes M into a Σ(X)-algebra. Therefore, by initiality of T_{Σ(X)}, there is a unique Σ(X)-homomorphism from T_{Σ(X)} to M. But this is exactly the same thing as a Σ-homomorphism from T_Σ(X) to M that extends a. □

We have already noted that a Σ-term with variables in an S-sorted ground signature Y is just an element of T_Σ(Y). Then an assignment a : X → T_Σ(Y) assigns Σ-terms with variables from Y to variables from X in a way that respects the sorts involved, and the Σ-homomorphism ā : T_Σ(X) → T_Σ(Y) given by Proposition 3.5.1 substitutes the term a(x) for each variable x ∈ X into each term t in T_Σ(X), yielding a term ā(t) in T_Σ(Y). Hence we have the following:

Definition 3.5.2 A substitution of Σ-terms with variables in Y for variables in X is an arrow a : X → T_Σ(Y); we may also use the notation a : X → Y. The application of a to t ∈ T_Σ(X) is ā(t). Given substitutions a : X → T_Σ(Y) and b : Y → T_Σ(Z), their composition a ; b (as substitutions) is the S-sorted arrow a ; b̄ : X → T_Σ(Z). □

Notation 3.5.3
The following notation makes substitutions look less abstract: given t ∈ T_Σ(X) and a : X → T_Σ(Y) such that |X| = {x_1, …, x_n} and a(x_i) = t_i for i = 1, …, n, then we may write ā(t) in the form

t(x_1 ← t_1, x_2 ← t_2, …, x_n ← t_n),

and whenever t_i is the variable x_i, we may omit the pair x_i ← t_i. □

Exercise 3.5.1
Let Σ be the signature of Example 2.3.4, let X = {x, y, z}_{[],Nat}, let t = x + s(s(y) + z), let Y = {u, v}_{[],Nat}, and define a : X → T_Σ(Y) by a(x) = u + s(v), a(y) = 0, and a(z) = v + s(0). Now compute ā(t). □

Exercise 3.5.2 If i_X : X → T_Σ(X) is the inclusion, show that ī_X(t) = t for each t ∈ T_Σ(X). □

Exercise 3.5.3
Given a substitution a : X → T_Σ(Y), show that i_X ; a = a and a ; i_Y = a. □

Notation 3.5.4
Because i_X serves as an identity for the composition of substitutions, we may write 1_X for i_X in the following. □

It is natural to expect that term substitution is associative, in the sense that given substitutions a : W → T_Σ(X), b : X → T_Σ(Y) and c : Y → T_Σ(Z), we have (a ; b) ; c = a ; (b ; c). We will see in the next section that there is a simple and beautiful proof of this using the free property of term algebras.

Exercise 3.5.4
Is substitution commutative? I.e., given a, b : X → T_Σ(X), does a ; b = b ; a? Give a proof or a counterexample. □

3.6 Pasting and Chasing

There is a very nice way to graphically represent systems of equations, such as those that arise from the homomorphism condition; this is the method of commutative diagrams. We will later see that this method not only allows us to represent systems of equations graphically, but also to reason about them graphically; such reasoning with diagrams is often called diagram chasing because of the characteristic way in which one follows arrows around the diagram. In order to explain this in a precise way, we first need some further concepts from graph theory.

Definition 3.6.1 A path p in a graph G is a list e_1, …, e_m of edges of G such that ∂_1(e_i) = ∂_0(e_{i+1}) for i = 1, …, m − 1; the source of p is ∂_0(p) = ∂_0(e_1) and the target of p is ∂_1(p) = ∂_1(e_m); we will write p : n → n′ when ∂_0(p) = n and ∂_1(p) = n′. If m = 0, then p would be the empty list [], and ∂_0([]) and ∂_1([]) would not be defined; so instead, we make the source and target of [] explicit, writing []_n. Given p : n → n′ and q : n′ → n″, we define their composition p ; q : n → n″ to be their concatenation as lists. Note that []_n is an identity for this composition, in the sense that []_n ; p = p and p ; []_m = p, for any path p with source n and target m; in practice, we may omit the subscripts on [].

A graph G is a tree iff it has a node r, called its root, such that for each node n of G, there is a unique path from r to n in G. □

One motivation for being so formal about all of this is that mechanical theorem proving is necessarily formal to this extent.
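The bookkeeping in this definition — matching each edge's target to the next edge's source, and the explicit empty path []_n — translates directly into code. A Python sketch with our own names, representing each edge as a (source, target) pair:

```python
class Path:
    """A path: a list of edges, each a (source, target) pair, together
    with explicit source and target nodes so that the empty path []_n
    is well defined."""
    def __init__(self, edges, source, target):
        node = source
        for (a, b) in edges:          # d1(e_i) must equal d0(e_{i+1})
            assert a == node, "edges do not compose"
            node = b
        assert node == target
        self.edges, self.source, self.target = edges, source, target

    def then(self, other):
        """Composition p ; q: defined when target(p) = source(q),
        given by concatenation of the edge lists."""
        assert self.target == other.source
        return Path(self.edges + other.edges, self.source, other.target)

def empty(n):
    """The empty path []_n at node n."""
    return Path([], n, n)

p = Path([(1, 2), (2, 3)], 1, 3)
q = Path([(3, 1)], 3, 1)
```

Note that `empty(n).then(p)` and `p.then(empty(target))` both return paths with the same edge list as p, mirroring the identity laws for []_n.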
Exercise 3.6.1 Show that the root of a tree is necessarily unique. □

Exercise 3.6.2 Given a graph G with at most one edge between each pair of nodes, show that a path p = e_1 … e_k in G is uniquely determined by the sequence n_0 n_1 … n_{k−1} n_k of the nodes that it passes through, where n_0 = ∂_0(e_1), n_1 = ∂_1(e_1) = ∂_0(e_2), …, n_{k−1} = ∂_1(e_{k−1}) = ∂_0(e_k) and n_k = ∂_1(e_k). □

Definition 3.6.2 A diagram (of (sorted) sets) is a graph whose nodes are labelled by (sorted) sets, and whose edges are labelled by (sorted) arrows, in such a way that

• if the arrow f : A → A′ labels the edge e : n → n′, then the label of n is A and the label of n′ is A′.

A diagram commutes iff

• whenever p, q are two paths, at least one of which has length at least 2, each with (say) source n and target n′, such that the labels along the edges of p are f_1, …, f_m and along q are g_1, …, g_k, then f_1 ; … ; f_m = g_1 ; … ; g_k.

That is, the arrows obtained by composition along any two (non-trivial) paths from one node to another are equal. □
[Figure 3.3: Length One Paths]

[Figure 3.4: Commutative Diagrams for Graph Homomorphism]

Thus, a commutative diagram is a geometrical presentation of a system of (non-trivial) equations among arrows. The reason for excluding paths of length 1 in the above definition is that we want diagrams of the form in Figure 3.3 to say that f ; g = f ; h, without also saying that g = h.

For example, we can express the two equations which say that an S-sorted arrow h : G → G′ is a graph homomorphism by the two commutative diagrams shown in Figure 3.4, in which h_E and h_N denote the Edge and Node components of h, respectively. Similarly, the three diagrams in Figure 3.5 express the conditions for a sorted arrow to be an automaton homomorphism (here h, i, j denote the Input, State and Output components of the homomorphism, respectively).

[Figure 3.5: Commutative Diagrams for Automaton Homomorphism]

We now give a more complex example (it can be skipped at first reading):

Example 3.6.3 (⋆) (Tree of a Term) Given a signature Σ, we will construct a "Σ-algebra of labelled graphs," some of whose elements will be the trees that represent Σ-terms. Let S be the sort set of Σ, and recall that |Σ| = ⋃_{w,s} Σ_{w,s}. Now define G_Σ to be the Σ-algebra where, for each s ∈ S, G_{Σ,s} is the set of all node-labelled graphs G = (E, N, L, ∂_0, ∂_1, l) having E, N ⊆ ω* and L = |Σ|, with each σ ∈ Σ_{[],s} interpreted as the graph G_σ = (∅, {[]}, L, ∂_0, ∂_1, l_σ) where l_σ([]) = σ, and each σ ∈ Σ_{s1...sm,s} interpreted as the arrow which sends graphs G_1, …, G_m to the graph G = (E, N, L, ∂_0, ∂_1, l) in which

• N = {[]} ∪ ⋃_{i=1}^m i · N_i, where · denotes concatenation for lists of naturals, and N_i is the node set of the graph G_i,
• E = N − {[]},
• ∂_0(i_1 … i_n) = i_1 … i_{n−1},
• ∂_1(i_1 … i_n) = i_1 … i_n,
• l([]) = σ, and
• l(i_1 … i_n) = l_{i_1}(i_2 … i_n) for n > 0, where l_k is the label function of G_k for k = 1, …, m.

Then the unique Σ-homomorphism h : T_Σ → G_Σ sends each Σ-term to its |Σ|-labelled tree representation. □

Exercise 3.6.3 (⋆) Show that the trees shown in Figure 2.5 actually do arise in the manner of Example 3.6.3 from the terms shown after Definition 2.6.1. □
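Commutativity of a diagram of finite sets can be tested pointwise: compose the functions along each of two parallel paths and compare them on every element of the common source set. A Python sketch (toy graph-homomorphism square; all names are ours):

```python
def compose(*fs):
    """Compose functions in diagrammatic order: compose(f, g) = f ; g."""
    def composite(x):
        for f in fs:
            x = f(x)
        return x
    return composite

def paths_agree(source_set, p, q):
    """Do two parallel paths (lists of functions) have equal composites
    on every element of the shared source set?"""
    return all(compose(*p)(x) == compose(*q)(x) for x in source_set)

# Toy square for a graph homomorphism h: edges and nodes of G are
# 0, 1, 2; G' relabels them as 10, 11, 12; d0 takes an edge to its source.
edges = [0, 1, 2]
d0 = lambda e: e            # in G, edge e starts at node e
d0p = lambda e: e           # same shape in G'
hE = lambda e: e + 10       # edge component of h
hN = lambda n: n + 10       # node component of h
```

Here the two paths around the square, [d0, hN] and [hE, d0p], agree on every edge, so the square commutes.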
Commutative diagrams are a well-established proof technique in modern algebra, and are increasingly used in computing science as well. In this technique, a diagram represents a system of simultaneous equations (among compositions of arrows), and geometrical operations on diagrams correspond to algebraic operations on systems of equations. One such operation is called "pasting," because geometrically it amounts to pasting commutative diagrams together, whereas algebraically it amounts to combining systems of equations.

(Later we will see how equations among other kinds of entities fall into the same framework.)

For example, we can prove that the composition of two graph homomorphisms is a graph homomorphism by diagram pasting, rather than by calculation (as in Exercise 3.1.2): assume that we are given homomorphisms h : G → G′ and h′ : G′ → G″. Then Figure 3.4 shows the diagrams for h; those for h′ are similar, but with an additional ′ symbol everywhere. For the operation symbol ∂_0, Figure 3.6 shows the two diagrams, let us call them P_1 and P_2, that we wish to paste together, with their common subdiagram P_0 and their union P. The fact that P
[Figure 3.6: Commutative Diagrams for Graph Homomorphism Proof]

commutes then gives us that the rightmost diagram commutes, which is what we really want. (The case of ∂_1 is similar.)

It is easy to give a formal justification for this assertion using equational reasoning. The leftmost two squares represent the two equations

∂_0 ; h_N = h_E ; ∂_0′
∂_0′ ; h_N′ = h_E′ ; ∂_0″

and so we can prove commutativity of the square in which we are interested as follows:

∂_0 ; (h_N ; h_N′) = (∂_0 ; h_N) ; h_N′ = (h_E ; ∂_0′) ; h_N′ = h_E ; (∂_0′ ; h_N′) = h_E ; (h_E′ ; ∂_0″) = (h_E ; h_E′) ; ∂_0″.

Geometrically, this argument simply says that the functions along each outside path are equal to the function along the path through the central edge, namely h_E ; ∂_0′ ; h_N′.

This argument is typical of those used to justify diagram pasting. It is also a typical (though rather simple) diagram chase. Usually such arguments are done geometrically, preferably on a black (or white) board, and are omitted in written documents.

Example 3.6.4
Although pasting commutative diagrams works just as well withtriangles, pentagons and other polygons as it does with squares, it isworth remarking that there are cases where the union of a collection ofcommutative diagrams is not commutative. Hence some caution must be observed. Consider, for example, the diagram in Figure 3.7, in which N is the natural numbers, Z is the integers, each edge labelled 1 isthe identity on N , each diagonal is the inclusion map N → Z , and thefour outer maps ( a, b, c, d ) are arbitrary except that they restrict to theidentity on N . For example, we might choose the following, for i ∈ Z , a(i) = (cid:40) i if i ∈ N − i (cid:54)∈ N b(i) = (cid:40) i if i ∈ N − i (cid:54)∈ N asting and Chasing Z ZZZ N NNN c dba 1 111 (cid:45) (cid:63)(cid:45)(cid:63) (cid:45) (cid:63)(cid:45)(cid:63)(cid:64)(cid:64)(cid:64)(cid:64)(cid:73) (cid:0)(cid:0)(cid:0)(cid:0)(cid:18)(cid:64)(cid:64)(cid:64)(cid:64)(cid:82)(cid:0)(cid:0)(cid:0)(cid:0)(cid:9)
Figure 3.7: A Non-Commutative Diagram

    c(i) = { i if i ∈ N; −2i if i ∉ N }     d(i) = { i if i ∈ N; −2i if i ∉ N }

Then for all i ∈ Z, (a ; b)(i) = b(i) and (c ; d)(i) = d(i); so in particular,

    (a ; b)(−2) = 2
    (c ; d)(−2) = 4.

So a ; b ≠ c ; d. □

Because "diagram chasing" refers to arguments made using diagrams, diagram pasting may be considered a particular kind of diagram chasing. Another common form involves using initiality (or freeness) to argue that because there are two arrows between two nodes (with some property), they must be equal. Both are illustrated in the following elegant proof of the associativity of substitution:
Proposition 3.6.5 ( Associativity of Substitution ) Given substitutions a : W → T Σ (X) , b : X → T Σ (Y ) , c : Y → T Σ (Z) , then (a ; b) ; c = a ; (b ; c). Proof:
The assertion to be proved translates to (a ; b̄) ; c̄ = a ; (b ; c̄)‾, where ";" indicates composition of ordinary (many-sorted) arrows. By the usual associative law for such arrows, it suffices to show that

    b̄ ; c̄ = (b ; c̄)‾.

By the uniqueness condition of Proposition 3.5.1, the above equation will follow from showing that b̄ ; c̄ is a Σ-homomorphism extending b ; c̄.

Figure 3.8: Associativity of Substitution Proof Diagram
If we let i : X → T_Σ(X) denote the injection, then what we have to show is that

    i ; (b̄ ; c̄) = b ; c̄.

But this follows from i ; b̄ = b, which is just commutativity of the middle bottom triangle. □

This proof looks much simpler and more elegant if done by chasing the right hand two thirds of the diagram in Figure 3.8 on a white or blackboard. By contrast, to prove the result by direct manipulation of the set-theoretic representation of terms would require several pages of very tedious calculation. It is worth drawing out the following key element of the proof, because it is needed later on:
Corollary 3.6.6
Given substitutions a : W → T_Σ(X), b : X → T_Σ(Y), then ā ; b̄ = (a ; b)‾. □

3.7 (⋆) Parse
Because a signature Σ can be overloaded, Σ-terms can also be overloaded, and it is useful to characterize when this can happen.

Definition 3.7.1 An S-sorted signature Σ is regular iff σ ∈ Σ_{w,s} ∩ Σ_{w,s′} implies s = s′. A Σ-term t is overloaded iff there are distinct sorts s, s′ ∈ S such that t ∈ T_{Σ,s} ∩ T_{Σ,s′}. □

Notice that regularity implies in particular that the sets of constant symbols of distinct sorts are disjoint, i.e., that no constant symbol has more than one sort.
Proposition 3.7.2
A signature Σ is regular iff there are no overloaded Σ -terms. Proof:
Assume that Σ is regular. By induction on the depth of terms, we will show that Σ-terms t, t′ of depths ≤ d with distinct sorts s, s′ must be different. For d = 0, suppose that t = σ and t′ = σ′; then w = w′ = [] and the result follows directly from regularity. For depth d > 0, t = σ(t₁, . . . , t_m) and t′ = σ′(t′₁, . . . , tₙ′). Because Σ-terms are lists, and lists have unique factorizations, t = t′ implies σ = σ′, m = n and tᵢ = tᵢ′ for i = 1, . . . , n. Now the induction hypothesis implies that tᵢ and tᵢ′ have the same sort for i = 1, . . . , n. Therefore w = w′, and hence regularity gives s = s′.

Conversely, suppose that t = σ(t₁, . . . , tₙ) ∈ T_{Σ,s} ∩ T_{Σ,s′} is a minimal overloading, in the sense that s ≠ s′ and none of the tᵢ are overloaded. Then necessarily σ ∈ Σ_{w,s} ∩ Σ_{w,s′} where w = s₁ . . . sₙ and sᵢ is the sort of tᵢ (for i = 1, . . . , n). Thus Σ is not regular. □

This result does not consider ambiguities due to mixfix syntax, because it only uses the prefix-with-parentheses syntax. These more elaborate kinds of ambiguities are considered below.
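Regularity and overloading are easy to check mechanically. The following Python sketch is not from the text; it assumes our own encoding of a signature as a dict from operation symbols to sets of ranks (w, s), with w a tuple of argument sorts, and of terms as nested tuples:

```python
def is_regular(sig):
    """A signature is regular iff no symbol has two ranks with the same
    arity w but different result sorts s."""
    for ranks in sig.values():
        seen = {}
        for (w, s) in ranks:
            if w in seen and seen[w] != s:
                return False
            seen[w] = s
    return True

def sorts_of(term, sig):
    """All sorts a term (op, subterms...) can receive; for a regular
    signature this set has at most one element (Proposition 3.7.2)."""
    op, *args = term
    argsorts = [sorts_of(a, sig) for a in args]
    return {s for (w, s) in sig[op]
            if len(w) == len(args)
            and all(wi in ai for wi, ai in zip(w, argsorts))}

# A non-regular signature: the constant 'c' is declared with two sorts,
# so the term c is overloaded; 'good' repairs this.
bad = {'c': {((), 'A'), ((), 'B')}, 'f': {(('A',), 'A')}}
good = {'c': {((), 'A')}, 'f': {(('A',), 'A'), (('B',), 'B')}}
```

Running `sorts_of(('c',), bad)` returns both sorts, exhibiting the overloaded term promised by the proposition.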
Exercise 3.7.1
Give a simple non-regular signature Σ and a simple overloaded Σ-term. □

We now generalize the definitions of signature and term to the case of mixfix syntax:
Definition 3.7.3
Let A be some fixed set of characters that does not include the underbar character "_" or the three special symbols ·, (, and ). Then a form is a list in (A ∪ { _ })*, and the arity of a form is the number of _'s that occur in it. A (many-sorted) mixfix signature is an indexed family { Σ_{w,s} | w ∈ S*, s ∈ S } for some set S of sorts, where each Σ_{w,s} is a set of forms whose arity is the length of w. □

Example 3.7.4
Let A = { a, b, + }. Then the following are all forms:

    a,  a_,  _a,  a_b,  _+_,  __.

The first has arity 0, the next three have arity 1, and the last two have arity 2. The first defines syntax for a constant, while the second through last (respectively) define syntax for prefix, postfix, outfix, infix, and juxtaposition operations. □
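The arity count and the filling of underbars can be animated in a few lines of Python (our own sketch, not part of the text; forms are encoded as ordinary strings):

```python
def arity(form):
    """The arity of a form is the number of underbar place-holders in it."""
    return form.count('_')

def fill(form, args):
    """Fill the underbars of `form` with the strings in `args`, left to
    right, yielding the concatenation k1 t1 k2 t2 ... kn tn k(n+1)."""
    ks = form.split('_')                 # k1, ..., k(n+1); some may be empty
    assert len(args) == len(ks) - 1, "arity mismatch"
    out = [ks[0]]
    for t, k in zip(args, ks[1:]):
        out.append(t)
        out.append(k)
    return ''.join(out)
```

For instance, filling the infix form `_+_` with `a` and `b` yields `a+b`, and filling the juxtaposition form `__` yields `ab`.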
We now give a recursive construction for mixfix Σ-terms:

Definition 3.7.5 If Σ is an S-sorted mixfix signature disjoint from S, then the S-sorted set M_Σ of all mixfix (ground) Σ-terms is the smallest set of lists over the set A ∪ { ·, (, ) } ∪ S such that

(0) if f ∈ Σ_{[],s} then f · s ∈ M_{Σ,s} for all s ∈ S, and

(1) if f ∈ Σ_{s1...sn,s} for n > 0 and tᵢ ∈ M_{Σ,sᵢ} for i = 1, . . . , n, then (k₁ t₁ k₂ . . . kₙ tₙ kₙ₊₁) · s ∈ M_{Σ,s}, where f = k₁ _ k₂ _ . . . kₙ _ kₙ₊₁ (note that some of the kᵢ may be the empty list).
As with terms in T_Σ, we will usually omit the sort annotation unless it is necessary. □

Every mixfix signature Σ is also an ordinary signature. However, the Σ-terms will look rather different in the two cases. For example, if _+_ ∈ Σ_{ss,s} and a, b ∈ Σ_{[],s}, then a + b is in M_Σ but not in T_Σ, whereas _+_(a, b) is in T_Σ but not in M_Σ. If extra clarity is needed, we will explicitly distinguish the ordinary signature corresponding to a mixfix signature Σ.

Definition 3.7.6
Given a mixfix signature Σ disjoint from S, we can give M_Σ the structure of a Σ-algebra in the following way:

(0) interpret f ∈ Σ_{[],s} in M_Σ as the singleton list f · s, and

(1) interpret f ∈ Σ_{s1...sn,s} with n > 0 in M_Σ as the function sending t₁, . . . , tₙ to the list (k₁ t₁ k₂ . . . kₙ tₙ kₙ₊₁) · s, where tᵢ ∈ M_{Σ,sᵢ} for i = 1, . . . , n and f = k₁ _ k₂ _ . . . kₙ _ kₙ₊₁.

Thus, we have that (M_Σ)_f(t₁, . . . , tₙ) = (k₁ t₁ k₂ . . . kₙ tₙ kₙ₊₁) · s, although it will usually be written k₁ t₁ k₂ . . . kₙ tₙ kₙ₊₁. □

It follows that there is a unique Σ-homomorphism T_Σ → M_Σ; the following uses this fact in comparing the two kinds of Σ-terms:

Definition 3.7.7
A mixfix signature Σ is sort ambiguous iff the carriers of M_Σ are non-disjoint, and is mixfix ambiguous iff the unique homomorphism h : T_Σ → M_Σ is non-injective. Given m ∈ M_Σ, if h(t) = m then t is said to be a parse of m. □

Exercise 3.7.2 Suppose that a mixfix signature Σ has a single sort A, and also has Σ_{[],A} = { a }, and Σ_{A,A} = { a_, a_a, _a }, with all other Σ_{w,s} = ∅. Then
1. Show that Σ is mixfix ambiguous; in particular, show that aaa has five distinct parses, and write each one out.

2. How many parses does aaaa have?

3. Noting that Σ is not sort ambiguous, construct an ordinary many-sorted signature that is sort ambiguous. □

Exercise 3.7.3
Show that a mixfix signature is mixfix ambiguous iff some term has at least two distinct parses. □

As usual, T_Σ here really means the terms of the ordinary signature corresponding to the mixfix signature Σ.

3.8 Literature

Relatively few books develop many-sorted general algebra in any depth. Bergstra et al. [9], Ehrig and Mahr [45] and van Horenbeck [181] each develop a certain amount for the algebraic specification theory which is their main concern. I am not aware of any mathematics text that develops many-sorted general algebra in any detail. The notation and approach of this chapter continues that of the previous chapter, following ideas from [52] as further developed in [137, 78] and other publications.

Initial algebra semantics (as discussed in Section 3.2) originated in [52], and was further developed in [88]. It can be seen as an algebraic formulation of the so-called attribute semantics of Knuth [116].

It is common to treat both variables and substitutions either intuitively, or else with extreme logical formalism; the approach given here tries to find a middle ground. Our Theorem of Constants (Theorem 3.3.11) is analogous to a well-known result in first-order logic. This result is not usually treated in the computing or the general algebra literature, although it is not difficult, and it plays an important role in justifying proofs by term rewriting.

It is known that conditional equations have more expressive power than unconditional equations, in the sense that there are algebras that are initial models of a specification having conditional equations but are not initial models of any specification having only unconditional equations [176].

The proof of associativity for substitution (Proposition 3.6.5) follows [89], and is the same as the more abstract proof which a category theorist would call "associativity of composition in a Kleisli category" (the necessary concepts are beyond the scope of this book, but may be found, for example, in [126]).
Structural induction was introduced to computer science by Burstall [21] in 1969.

The formulation of theory equivalence at the end of Section 3.3 is a more concrete and many-sorted version of Lawvere's category-theoretic formulation for theories [121]; see [130] and [5] for further information on Lawvere theories, and see Section 4.10 for some techniques for proving equivalence.
The emphasis on models and satisfaction in this and the previous chapter (as well as in subsequent chapters) was influenced by the theory of institutions [67], which axiomatizes the notion of "logical system" using satisfaction. But as Wittgenstein is said to have remarked,
Is a proof not also part of an institution? and as Thomas Jefferson said on July 12, 1816,
Laws and institutions must go hand in hand with the progress of the human mind.
And indeed, laws and proofs play a major role in the rest of this book, as they must in any study of theorem proving.
A Note for Lecturers:
Emphasizing the "microprocessor" interpretation of initiality in the discussion after Theorem 3.2.1 can considerably sharpen students' intuitions, and some concrete examples with drawings can strengthen this process.

The proof of Proposition 3.6.5 should be done as a live diagram chase on the board; this is a lot of fun, and it is also the best way to bring out the essential simplicity of this proof.
Equational Deduction
This chapter considers how to correctly deduce new equations from old ones. We give several finite sets of rules for equational deduction that are both sound, i.e., truth preserving, and complete for loose semantics, in the sense that every equation that is true in all models of a given set of equations can be deduced from that set using these rules. Such results are important because they say that we can find out what is true by using formal, finitary manipulations of finite syntactic objects, whereas the semantic definition of truth (by satisfaction, Definition 3.3.7 of the previous chapter) in general requires examining infinite sets of infinite objects (since algebras in general have infinite carriers); obviously, such an examination cannot be done on any real computer, which has only a finite amount of memory. Nonetheless, satisfaction remains fundamental, because it provides the standard of correctness for deduction. Equational deduction also has many important applications in computer science and elsewhere (see Section 4.12).
4.1 Rules of Deduction

Equational deduction is reasoning with just the properties of equality. Basic properties of equality include the following:

(1) Anything is equal to itself; this is the reflexivity of equality.

(2) If t equals t′, then t′ equals t; this is the symmetry of equality.

(3) If t equals t′ and t′ equals t″, then t equals t″; this is the transitivity of equality.

(4) If t₁ equals t₁′, and t₂ equals t₂′, . . . , and tₙ equals tₙ′, and if t has variables x₁, . . . , xₙ, then the result of substituting tᵢ for xᵢ in t equals the result of substituting tᵢ′ for xᵢ in t; this is called the congruence property of equality, and may be paraphrased as saying that substituting equal expressions into the same expression yields equal expressions.

(5) If t equals t′ where t and t′ involve variables x₁, . . . , xₙ, and if t₁, . . . , tₙ are terms, then the result of substituting tᵢ for xᵢ in t equals the result of substituting tᵢ for xᵢ in t′; this is called the substitutivity (or instantiation) property of equality, and may be paraphrased as saying that any substitution instance of an equation is an equation.

In these properties, the various t's are terms, which may involve variables; furthermore, both (4) and (5) involve substituting terms for variables. We will see that it is necessary to keep careful track of variables during equational deduction, or else soundness can be lost. This motivates the following:

Notation 4.1.1
We will use the notation of Definition 3.3.1 for equations that appear in deduction, writing (∀X) t = t′, where all variables in t and t′ are taken from X. Also, we will write θ : X → T_Σ(Y) for a substitution of terms from T_Σ(Y) for variables in X, as in Definition 3.5.2. Finally, we adopt the convention that if t ∈ T_Σ(X) is a Σ-term with variables in X, then the result of substituting θ(x) for each x in X into t may be written θ(t), rather than θ̄(t) as in Definition 3.5.2. □

The simple example below reviews notation and concepts, in preparation for more complex material to come.
Example 4.1.2
Suppose there is just one sort, say Elt, and that X has three variables of that sort, say x, y, z. Let Y = { x, w } and define θ : X → T_Σ(Y) by

    θ(x) = x⁻¹⁻¹,  θ(y) = w⁻¹,  θ(z) = x ∗ x⁻¹.

Now if t = (x ∗ y)⁻¹, then θ(t) = (x⁻¹⁻¹ ∗ w⁻¹)⁻¹. □

We can now give the following formal versions of the above properties of equality:
Definition 4.1.3
Given a signature Σ and a set A of Σ -equations, called the ax-ioms or assumptions , the following rules of deduction define the Σ -equations that are deducible (or provable or inferable ) ( from A ):(0) Assumption . Each equation in A is deducible.(1) Reflexivity . Each equation of the form ( ∀ X) t = t is deducible.(2) Symmetry . If ( ∀ X) t = t (cid:48) is deducible, then so is ( ∀ X) t (cid:48) = t .(3) Transitivity . If the equations ( ∀ X) t = t (cid:48) , ( ∀ X) t (cid:48) = t (cid:48)(cid:48) are deducible, then so is ( ∀ X) t = t (cid:48)(cid:48) . ules of Deduction (4) Congruence . If θ, θ (cid:48) : Y → T Σ (X) are substitutions such that foreach y ∈ Y , the equation ( ∀ X) θ(y) = θ (cid:48) (y) is deducible, then given any t ∈ T Σ (Y ) , the equation ( ∀ X) θ(t) = θ (cid:48) (t) is also deducible.(5) Instantiation . If ( ∀ Y ) t = t (cid:48) is in A , and if θ : Y → T Σ (X) is a substitution, then the equation ( ∀ X) θ(t) = θ(t (cid:48) ) is deducible. (cid:2) The next section will give a formal definition of equational deductionusing these rules, and will define a relation A (cid:96) e indicating that e can be deduced from A . But first, we illustrate what it is that we wish toformalize: Example 4.1.4 ( Left Groups ) Suppose we want to prove that the right inverselaw ( ∀ x) x ∗ x − = e holds in the specification GROUPL of Example 3.3.4, which is reproducedbelow, except that the variables A , B , and C are used instead of X , Y ,and Z , and a precedence declaration has been added for the inverseoperation -1 .Precedence provides a way to declare that some operation symbolsare “stronger” or “more binding” than others. For example, the usualconventions for mathematical notation assume that x ∗ x − means x ∗ (x − ) rather than (x ∗ x) − , because − binds more tightly than ∗ . In OBJ3, precedence is defined by giving a natural number p as an“attribute” of an operation, in the form [prec p] following the opera-tion’s sort. 
Lower precedence means tighter binding, and a binary infix operation symbol like * has a default precedence of 41. (We do not give a formal treatment of precedence in this text, although it is not difficult to do so; see [90] for further discussion.)

th GROUPL is
  sort Elt .
  op _*_ : Elt Elt -> Elt .
  op e : -> Elt .
  op _-1 : Elt -> Elt [prec 2] .
  var A B C : Elt .
  eq e * A = A .
  eq A -1 * A = e .
  eq A * (B * C) = (A * B) * C .
endth

Let G.1, G.2 and G.3 denote the three equations in the specification
GROUPL above, in their order of appearance. Equational Deduction
As an illustration, let us apply rule (4) to the equation

    (∀x) e = x⁻¹⁻¹ ∗ x⁻¹,

with X = { x } and Y = { z, x } and t = z ∗ (x ∗ x⁻¹). Then we want θ(z) = e and θ′(z) = x⁻¹⁻¹ ∗ x⁻¹, while θ(x) = θ′(x) = x, so that the result is the equation

    (∀x) e ∗ (x ∗ x⁻¹) = (x⁻¹⁻¹ ∗ x⁻¹) ∗ (x ∗ x⁻¹).

The following is a deduction for the right inverse law, using the rules of Definition 4.1.3. In this display, some of the deduced equations are named by bracketed numbers given to their left. Also, each step of deduction is annotated to the right of its equation by the rule used, together with the names of any equations used (in addition to the one on the preceding line) after the word "on".

        (∀x) e ∗ (x ∗ x⁻¹) = x ∗ x⁻¹                                  (5) on G.1
    [1] (∀x) x ∗ x⁻¹ = e ∗ (x ∗ x⁻¹)                                  (2)
        (∀x) x⁻¹⁻¹ ∗ x⁻¹ = e                                          (5) on G.2
    [2] (∀x) e = x⁻¹⁻¹ ∗ x⁻¹                                          (2)
        (∀x) e ∗ (x ∗ x⁻¹) = (x⁻¹⁻¹ ∗ x⁻¹) ∗ (x ∗ x⁻¹)                (4) on [2] with t = z ∗ (x ∗ x⁻¹)
    [3] (∀x) x ∗ x⁻¹ = (x⁻¹⁻¹ ∗ x⁻¹) ∗ (x ∗ x⁻¹)                      (3) on [1]
        (∀x) (x⁻¹⁻¹ ∗ x⁻¹) ∗ (x ∗ x⁻¹) = ((x⁻¹⁻¹ ∗ x⁻¹) ∗ x) ∗ x⁻¹    (5) on G.3
    [4] (∀x) x ∗ x⁻¹ = ((x⁻¹⁻¹ ∗ x⁻¹) ∗ x) ∗ x⁻¹                      (3) on [3]
        (∀x) x⁻¹⁻¹ ∗ (x⁻¹ ∗ x) = (x⁻¹⁻¹ ∗ x⁻¹) ∗ x                    (5) on G.3
        (∀x) (x⁻¹⁻¹ ∗ x⁻¹) ∗ x = x⁻¹⁻¹ ∗ (x⁻¹ ∗ x)                    (2)
        (∀x) ((x⁻¹⁻¹ ∗ x⁻¹) ∗ x) ∗ x⁻¹ = (x⁻¹⁻¹ ∗ (x⁻¹ ∗ x)) ∗ x⁻¹    (4) with t = z ∗ x⁻¹
    [5] (∀x) x ∗ x⁻¹ = (x⁻¹⁻¹ ∗ (x⁻¹ ∗ x)) ∗ x⁻¹                      (3) on [4]
        (∀x) x⁻¹ ∗ x = e                                              (5) on G.2
        (∀x) (x⁻¹⁻¹ ∗ (x⁻¹ ∗ x)) ∗ x⁻¹ = (x⁻¹⁻¹ ∗ e) ∗ x⁻¹            (4) with t = (x⁻¹⁻¹ ∗ z) ∗ x⁻¹
    [6] (∀x) x ∗ x⁻¹ = (x⁻¹⁻¹ ∗ e) ∗ x⁻¹                              (3) on [5]
        (∀x) x⁻¹⁻¹ ∗ (e ∗ x⁻¹) = (x⁻¹⁻¹ ∗ e) ∗ x⁻¹                    (5) on G.3
        (∀x) (x⁻¹⁻¹ ∗ e) ∗ x⁻¹ = x⁻¹⁻¹ ∗ (e ∗ x⁻¹)                    (2)
    [7] (∀x) x ∗ x⁻¹ = x⁻¹⁻¹ ∗ (e ∗ x⁻¹)                              (3) on [6]
        (∀x) e ∗ x⁻¹ = x⁻¹                                            (5) on G.1
        (∀x) x⁻¹⁻¹ ∗ (e ∗ x⁻¹) = x⁻¹⁻¹ ∗ x⁻¹                          (4) with t = x⁻¹⁻¹ ∗ z
    [8] (∀x) x ∗ x⁻¹ = x⁻¹⁻¹ ∗ x⁻¹                                    (3) on [7]
        (∀x) x⁻¹⁻¹ ∗ x⁻¹ = e                                          (5) on G.2
    [9] (∀x) x ∗ x⁻¹ = e                                              (3) on [8]

This rather tedious proof is illustrated in the "proof tree" shown in Figure 4.1, in which each arrow indicates the application of the rule of deduction with which it is labelled (the last step is omitted). Section 4.5 will give a more powerful rule that will allow us to give a much easier proof of this result. □
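The mechanical character of these rules can be seen by animating two of the steps above in Python. The sketch below is our own (terms are encoded as nested tuples, and the postfix ⁻¹ as a hypothetical 'inv' symbol; none of this is OBJ3 syntax):

```python
def subst(theta, t):
    """Apply a substitution (dict from variables to terms) to a term;
    variables are strings, other terms are tuples (op, subterms...)."""
    if isinstance(t, str):
        return theta.get(t, t)       # variables not in theta map to themselves
    return (t[0],) + tuple(subst(theta, s) for s in t[1:])

def mul(a, b): return ('*', a, b)
def inv(a): return ('inv', a)        # stands for the postfix _-1
E = ('e',)

G1 = (mul(E, 'A'), 'A')              # G.1:  e * A = A

# Rule (5), Instantiation: a substitution instance of an axiom is deducible.
theta = {'A': mul('x', inv('x'))}
first_step = tuple(subst(theta, side) for side in G1)

# Rule (4), Congruence: substituting the provably equal terms e and
# x-1-1 * x-1 for z in t = z * (x * x-1), as in the illustration above.
t = mul('z', mul('x', inv('x')))
lhs = subst({'z': E}, t)
rhs = subst({'z': mul(inv(inv('x')), inv('x'))}, t)
```

Note that the left side of the instantiated G.1 coincides with the left side of the congruence step, which is exactly what lets transitivity chain the two deduced equations together.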
4.2 Equational Proof

The rules of equational deduction are used to prove or deduce a new equation e from a given set A of equations, by repeatedly applying the rules (0–5) to previously deduced equations. We let A ⊢ e mean that e is deducible from A. A proof for the assertion A ⊢ e is a sequence of rule applications that really proves e from A. A fully annotated proof provides all of the information that is used in each step of deduction, including the name of the rule involved, and any substitutions that are used. We formalize this as follows:

Definition 4.2.1
Given a signature Σ and a set A of Σ-equations, a (bare) proof from A is a sequence e₁, . . . , eₙ of Σ-equations where each eᵢ is deducible from A ∪ { e₁, . . . , eᵢ₋₁ } by a single application of a single rule; then we say that e₁, . . . , eₙ is a (bare) proof of eₙ from A. A (fully) annotated proof is a sequence a₁, . . . , aₙ, where each aᵢ has one of the following forms:

(0) ⟨eᵢ, (0)⟩ where eᵢ ∈ A.

(1) ⟨eᵢ, (1)⟩ where eᵢ is of the form (∀X) t = t.

(2) ⟨eᵢ, eⱼ, (2)⟩ where j < i and eᵢ is of the form (∀X) t = t′ and eⱼ is of the form (∀X) t′ = t.

(3) ⟨eᵢ, eⱼ, eₖ, (3)⟩ where j, k < i, and eᵢ is of the form (∀X) t = t″ and eⱼ, eₖ are of the forms (∀X) t = t′, (∀X) t′ = t″ respectively.

(4) ⟨eᵢ, θ, θ′, ϕ, t, (4)⟩ where t ∈ T_Σ(Y), where ϕ : Y → ω, and where θ, θ′ : Y → T_Σ(X) are substitutions such that for each y ∈ Y, each equation (∀X) θ(y) = θ′(y) is some e_{ϕ(y)} where ϕ(y) < i, and eᵢ is of the form (∀X) θ(t) = θ′(t).
Figure 4.1: A Proof Tree

(5) ⟨eᵢ, θ, e, t, t′, (5)⟩ where t, t′ ∈ T_Σ(Y), and where θ : Y → T_Σ(X) is a substitution such that e ∈ A has the form (∀Y) t = t′ and eᵢ is of the form (∀X) θ(t) = θ(t′).

Let A ⊢_Σ e, or usually A ⊢ e when Σ is clear from context, indicate that there is a proof of e from A. We will also use notations like "A ⊢(0–4) e" or "A ⊢(1–3,5) e" to indicate that e is deducible using only the rules (0–4), or (1–3) plus (5), respectively; by default, A ⊢ e will mean A ⊢(0–5) e. Also, the set of all equations deducible from A using rules (0–5) is called the deductive closure or theory of A. □

Notice that if a₁, . . . , aₙ is a fully annotated proof, then the sequence e₁, . . . , eₙ of the first components of the aᵢ is a bare proof.

We now illustrate these concepts by proving that any equation that can be deduced using (0) can also be deduced using (5). First, suppose that (∀X) t = t′ is in A, let Y = X, and define θ : X → T_Σ(X) by θ(x) = x for all x ∈ X. Then (by Exercise 3.5.2) θ(t) = t and θ(t′) = t′, so (5) tells us that (∀X) t = t′ is deducible, as desired. The following is a formal statement of what we have just shown:

Fact 4.2.2
Given a set A of Σ-equations and a Σ-equation e, then e ∈ A implies A ⊢(5) e. □

From this we get the following:
Fact 4.2.3
Given a set A of Σ-equations, then for any Σ-equation e, A ⊢(0–5) e iff A ⊢(1–5) e.

Proof:
Let P be a proof of e from A using (0–5). Then for each use of the rule (0) in P, substitute the corresponding use of (5) according to Fact 4.2.2, resulting in a proof P′ of e from A that does not use rule (0). □

Thus we can get by with a set of five rules, instead of six. We will see later on that there are many other rule sets for equational deduction, and that the number of rules can be further reduced. A more abstract formulation of deduction is given in (the optional) Section 4.11.
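Fact 4.2.2 itself can be checked mechanically: applying rule (5) with the identity substitution reproduces the axiom unchanged. The Python sketch below uses our own nested-tuple encoding of terms (not from the text):

```python
def subst(theta, t):
    """Apply a substitution to a term; variables are strings,
    non-variable terms are tuples (op, subterms...)."""
    if isinstance(t, str):
        return theta.get(t, t)
    return (t[0],) + tuple(subst(theta, s) for s in t[1:])

def rule5(axiom, theta):
    """Instantiation: from an axiom (Y, t, t') and a substitution theta,
    deduce the instance (theta(t), theta(t'))."""
    Y, t, t2 = axiom
    return (subst(theta, t), subst(theta, t2))

# An axiom (forall {x}) f(x) = x, and the identity substitution on {x};
# the instance is the axiom itself, so rule (0) is redundant.
axiom = (['x'], ('f', 'x'), 'x')
identity = {x: x for x in axiom[0]}
```

This is exactly the transformation used in the proof of Fact 4.2.3 to eliminate every use of rule (0).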
4.3 Soundness and Counterexamples

This section shows that equational deduction is sound, in the sense that if we can deduce e from A, then e is true in all models of A. To this end, the following lemmas show the soundness of each rule separately:
Lemma 4.3.1 Given any Σ-algebra M such that M ⊨ A, and given e ∈ A, then M ⊨ e.

Proof: This is immediate from the definition of satisfaction. □

Lemma 4.3.2 Given any Σ-algebra M and given t ∈ T_Σ(X), then M ⊨ (∀X) t = t.

Proof: Let a : X → M. Then certainly ā(t) = ā(t). □

Lemma 4.3.3 Given any Σ-algebra M and given t, t′ ∈ T_Σ(X), then M ⊨ (∀X) t = t′ implies M ⊨ (∀X) t′ = t.

Proof: Let a : X → M. Then ā(t) = ā(t′) implies ā(t′) = ā(t). □

Lemma 4.3.4 Given any Σ-algebra M and given t, t′, t″ ∈ T_Σ(X), then M ⊨ (∀X) t = t′ and M ⊨ (∀X) t′ = t″ imply M ⊨ (∀X) t = t″.

Proof: Let a : X → M. Then ā(t) = ā(t′) and ā(t′) = ā(t″) imply ā(t) = ā(t″). □

Lemma 4.3.5 Given any Σ-algebra M, given t ∈ T_Σ(Y), and given θ, θ′ : Y → T_Σ(X) such that M ⊨ (∀X) θ(y) = θ′(y) for each y ∈ Y, then M ⊨ (∀X) θ(t) = θ′(t).

Proof: Let a : X → M. Then ā(θ(y)) = ā(θ′(y)), i.e., (θ ; ā)(y) = (θ′ ; ā)(y) for each y ∈ Y. But now the freeness of T_Σ(Y) implies that (θ ; ā)(t) = (θ′ ; ā)(t), i.e., that ā(θ(t)) = ā(θ′(t)). □

Lemma 4.3.6 Given any Σ-algebra M, given t, t′ ∈ T_Σ(Y) such that M ⊨ (∀Y) t = t′, and given θ : Y → T_Σ(X), then M ⊨ (∀X) θ(t) = θ(t′).

Proof: Let a : X → M. Then θ ; ā : Y → M, and so M ⊨ (∀Y) t = t′ implies (θ ; ā)(t) = (θ ; ā)(t′), i.e., ā(θ(t)) = ā(θ(t′)), i.e., M ⊨ (∀X) θ(t) = θ(t′). □

Exercise 4.3.1
The notation used in the proof above conceals a use of Corollary 3.6.6. Identify the gap and show how to fill it using this result. □
We can now use induction on proof length to show soundness of equational deduction:
Proposition 4.3.7 (Soundness) Given a set A of Σ-equations, a Σ-equation e, and a Σ-algebra M, then M ⊨ A and A ⊢(0–5) e imply M ⊨ e.

Proof:
Let M be a Σ-algebra such that M ⊨ A.

If e has a proof of length 1 from A, then e is derived using exactly one instance of exactly one of the rules (0–5); then Lemmas 4.3.1–4.3.6 show that M ⊨ e for each of these six cases, thus concluding the base of the induction.

For the inductive step, assume that if e has a proof of length n then M ⊨ e, and let e′ have a proof e₁, . . . , eₙ₊₁ of length n + 1. The inductive hypothesis gives us that M ⊨ eᵢ for i = 1, . . . , n, and from this we can conclude that M ⊨ eₙ₊₁ by applying one of Lemmas 4.3.1–4.3.6. □

Suppose we are given a set A of axioms and an equation e, all over the same signature Σ, and we want to prove that e cannot be deduced from A. (For example, we may have put some effort into proving A ⊢ e without success, and now suspect that it is not possible.) The impossibility of giving a proof can be demonstrated by giving a counterexample, which is a Σ-algebra M that satisfies A but does not satisfy e. The proof that counterexamples work only depends on the soundness of deduction, because if A ⊢ e then for any Σ-algebra M, if M ⊨ A then M ⊨ e. Therefore if M ⊨ A but M ⊨ e is false, we cannot have A ⊢ e.

In order to show that M ⊨ e is false, we need only give a single assignment where e fails: if e is (∀X) t = t′, then we need only exhibit θ : X → M such that θ̄(t) ≠ θ̄(t′). Thus, the way to show that an equation cannot be proved is to give an algebra M, an assignment θ into that algebra, and a proof that the assignment has different values on the two terms of the equation. We will use this in the next subsection and elsewhere.

The most common formulations of equation and equational deduction do not involve explicit universal quantifiers for variables. However, we will show that explicit quantifiers are necessary for an adequate treatment of satisfaction. Our demonstration will use the following specification:

th FOO is
  sorts B A .
  ops T F : -> B .
  ops (_∨_) (_&_) : B B -> B .
  op ¬_ : B -> B [prec 2] .
  op foo : A -> B .
  var B : B .
  var A : A .
  eq B ∨ ¬ B = T .
  eq B & ¬ B = F .
  eq B ∨ B = B .
  eq B & B = B .
  eq ¬ F = T .
  eq ¬ T = F .
  eq ¬ foo(A) = foo(A) .
endth

The OBJ3 keyword ops allows two or more operation symbols having the same rank to be declared together; for non-constant operation symbols, parentheses must be used to separate the different operation forms. The notation T, F, ∨, &, ¬, and the first four equations should be familiar from Boolean algebra, and we can think of foo as a kind of "test" on elements of sort A. This example therefore resembles specifications found in many applications, except perhaps for the last equation.

Now consider the Σ_FOO-algebra I with I_A = ∅ and I_B = { T, F }, where T, F are distinct, and where &, ∨, ¬ are interpreted as expected for the booleans (F ∨ F = F, etc.), and where foo is the empty function. (This is actually the initial Σ_FOO-algebra.) It is easy to check that I satisfies the equation (∀x) F = T where x is of sort A, and that I does not satisfy the equation (∀∅) F = T. Since these two equations have different meanings, they cannot be identified, and therefore the quantifier really is necessary.

Example 4.3.8 below will show that with unsorted equational deduction, the unquantified equation F = T can be proved from the equations in FOO, from which, given the above discussion of I, it follows that unsorted equational deduction is in general not sound. This refutes the apparently common misconception that unsorted and many-sorted equational deduction are equivalent; see [78] for a detailed discussion of this issue.

To make our discussion precise, we need an explicit formulation of unsorted equational deduction. Recall that a Σ-equation consists of a ground signature X disjoint from Σ, plus two terms t, t′ ∈ T_Σ(X)_s for some sort s; that is, a Σ-equation is a triple ⟨X, t, t′⟩, by convention written (∀X) t = t′.
By contrast, equations in unsorted equational logic do not have explicit quantifiers; they are just pairs ⟨t, t′⟩, conventionally written in the form t = t′. The unsorted rules of deduction are exactly the same as the many-sorted rules (1–5) of Definition 4.1.3 except that all quantifiers (e.g., (∀X) and (∀Y)) are omitted.
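The role of the empty carrier is easy to check mechanically. The Python sketch below is our own encoding (not from the text): satisfaction is decided by enumerating assignments, so a variable whose sort has an empty carrier makes an equation hold vacuously, exactly as for the algebra I above:

```python
from itertools import product

def ev(t, ops, env):
    """Value of a term: variables are strings, other terms tuples (op, args...)."""
    if isinstance(t, str):
        return env[t]
    return ops[t[0]](*[ev(a, ops, env) for a in t[1:]])

def satisfies(carriers, ops, eq):
    """M |= (forall X) t = t': t and t' agree under every assignment of
    carrier elements to the variables in X (a dict from variables to sorts)."""
    X, t, t2 = eq
    domains = [carriers[s] for s in X.values()]
    return all(ev(t, ops, dict(zip(X, vals))) == ev(t2, ops, dict(zip(X, vals)))
               for vals in product(*domains))

# The algebra I: I_A empty, I_B = {T, F}; only the constants matter here.
carriers = {'A': [], 'B': ['T', 'F']}
ops = {'T': lambda: 'T', 'F': lambda: 'F'}

quantified = ({'x': 'A'}, ('F',), ('T',))   # (forall x) F = T, x of sort A
unquantified = ({}, ('F',), ('T',))         # (forall 0) F = T
```

Since there is no assignment of sort A at all, the quantified equation is satisfied, while the unquantified one is evaluated once and fails; the two equations really do have different meanings.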
Example 4.3.8 (An Unsound Deduction) We will show that unsorted equational deduction can prove an equation that is untrue in some models of the specification FOO above. We apply the unsorted versions of the rules (1–5), letting F.1, ..., F.7 denote the equations in FOO in the order of their appearance, and letting x be a new variable symbol:

      ¬foo(x) = foo(x)                     (5) on F.7
  [0] foo(x) = ¬foo(x)                     (2)
      foo(x) ∨ ¬foo(x) = T                 (5) on F.1
      foo(x) ∨ foo(x) = foo(x) ∨ ¬foo(x)   (4) on [0] with t = foo(x) ∨ z
      foo(x) ∨ foo(x) = T                  (3)
      foo(x) ∨ foo(x) = foo(x)             (5) on F.3
      foo(x) = foo(x) ∨ foo(x)             (2)
  [1] foo(x) = T                           (3)
      foo(x) & foo(x) = foo(x)             (5) on F.4
      foo(x) = foo(x) & foo(x)             (2)
      foo(x) & foo(x) = foo(x) & ¬foo(x)   (4) on [0] with t = foo(x) & z
  [2] foo(x) = foo(x) & ¬foo(x)            (3)
      foo(x) & ¬foo(x) = F                 (5) on F.2
      foo(x) = F                           (3) on [2]
      F = foo(x)                           (2)
  [3] F = T                                (3) on [1]

The algebra I is a counterexample to the equation F = T that was proved above. But since the proof really does use the unsorted rules of deduction correctly, we must conclude that these rules are not sound for this many-sorted algebra. It should however be noted that the unsorted rules of deduction are sound and complete for the classical case (studied by Birkhoff and others) where only unsorted (i.e., one-sorted) algebras are used as models. □

We will see later that by adding quantifiers to the proof, we get a proof of (∀x) F = T, and we will also see that this does not mean that F = T is satisfied by all models of FOO. The counterexample is only possible because I_A = ∅, and indeed, it can be shown that F = T does hold in every model of FOO that has all of its carriers non-empty. Moreover, it can be shown that unsorted equational deduction is sound if restricted to models that have all their carriers non-empty. Hence, it might seem that the way out is just to restrict signatures so that no carrier can possibly be empty; for example, this approach is advocated in [109]. But such a restriction would exclude many important examples, such as the theory of partially ordered sets. Another possible way out (and this is the approach of classical logic) is simply to require that all models have all their carriers non-empty.
However, we do not want to abandon the possibility of empty carriers, because then not all specifications would have initial models, as demonstrated by the above example, and many others. It therefore follows that we cannot use the unsorted rules of deduction with their unsorted notation for equations, and instead must use a version of many-sorted equational deduction in which equations have explicit quantifiers.
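The effect of the quantifier can be checked mechanically. The following is a small Python sketch (ours, not part of the text; the representation is illustrative) of satisfaction in the algebra I above: an equation quantified over a variable of a sort with empty carrier holds vacuously, while the same equation with empty quantifier fails.

```python
from itertools import product

# A finite model mirroring the initial FOO-algebra I of the text:
# the carrier of sort A is empty, the carrier of sort B is {"T", "F"}.
I = {"A": [], "B": ["T", "F"]}

def satisfies(model, quantified_vars, lhs, rhs):
    """Check (forall quantified_vars) lhs = rhs in `model`.

    quantified_vars maps variable names to sorts; lhs/rhs are functions
    from an assignment (dict) to a carrier element.  Satisfaction means
    lhs and rhs agree under EVERY assignment of the variables.
    """
    names = list(quantified_vars)
    carriers = [model[quantified_vars[n]] for n in names]
    for values in product(*carriers):
        a = dict(zip(names, values))
        if lhs(a) != rhs(a):
            return False
    return True  # vacuously true if some quantified carrier is empty

# (forall x : A) F = T: there are no assignments for x, so it holds vacuously.
vac = satisfies(I, {"x": "A"}, lambda a: "F", lambda a: "T")

# (forall {}) F = T: exactly one (empty) assignment, under which F != T.
ground = satisfies(I, {}, lambda a: "F", lambda a: "T")

print(vac, ground)  # True False
```

The two calls differ only in the quantifier, exactly as the two equations above differ; this is why the quantifier cannot be dropped from the notation.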
Completeness

The main result about equational deduction is that it is complete for loose semantics. The following extensions of the notation for satisfaction enable us to state this in a simple way:
Definition 4.4.1
Let A and A′ be sets of Σ-equations, and let e be a Σ-equation. Then we write A ⊨_Σ e iff for all Σ-algebras M, M ⊨_Σ A implies M ⊨_Σ e, that is, iff every Σ-algebra that satisfies A also satisfies e. Also, we write A ⊨_Σ A′ iff A ⊨_Σ e′ for all e′ ∈ A′. Similarly, we write A ⊢_Σ A′ iff A ⊢_Σ e′ for all e′ ∈ A′. □

Now the main result:
Theorem 4.4.2 (Completeness) Given a signature Σ and a set A of Σ-equations, then for any Σ-equation e, A ⊢ e iff A ⊨ e. □

One direction of this equivalence is the soundness of the rules, which has already been proved (Proposition 4.3.7); it says that anything that can be proved by equational deduction really is true of all models. The other direction, which is completeness in the narrow sense, is much more difficult, and is proved in Appendix B (actually, the more general case of conditional order-sorted equations is proved there). Theorem 4.4.2 is very comforting, because it says every equation e that is true in all models of A can be deduced using our rules. For example, we can conclude that every equation that is true of all groups can be proved from the group axioms.

We will soon see that there are other rule sets that can make proofs much easier than they are with (1–5). In fact, the particular rules (1–5) were chosen because each rule is relatively simple and intuitive, and because this formulation facilitates proving the completeness theorem.

The following slightly more general formulation of completeness follows from Theorem 4.4.2 and Definition 4.4.1:

Corollary 4.4.3
Let A and A′ be sets of Σ-equations. Then A ⊢ A′ iff A ⊨ A′. □

Before leaving this section, we show transitivity for the extended notion of satisfaction given in Definition 4.4.1:
Fact 4.4.4
Let
A, A′, A′′ be sets of Σ-equations. Then A ⊨ A′ and A′ ⊨ A′′ imply A ⊨ A′′.

Proof:
We are assuming that M ⊨ A implies M ⊨ A′ and that M ⊨ A′ implies M ⊨ A′′. Therefore, by transitivity of implication, M ⊨ A implies M ⊨ A′′. □

Subterm Replacement

A specialized rule of inference using subterm replacement is the basis for term rewriting, a powerful technique for mechanical inference that is discussed in the next chapter. We will develop this rule gradually, starting with a special case of rule (4) in which only one variable is substituted for.

Suppose (using the notation of Definition 2.2.1) that X = Y ∪ {z}_s where z ∉ Y, and that θ, θ′ : X → T_Σ(Y) are substitutions such that θ(y) = θ′(y) = y for all y ∈ Y and such that the equation (∀Y) θ(z) = θ′(z) is deducible. Since (∀Y) y = y is deducible for all y ∈ Y, rule (4) implies that, for any t ∈ T_Σ(Y ∪ {z}_s), (∀Y) t(z ← t₁) = t(z ← t₂) is also deducible, where t₁ = θ(z) and t₂ = θ′(z), noting that t₁, t₂ have the same sort s as z. Therefore the following rule is sound, because we have shown that it is a special case of (4):

(4₁) One Variable Congruence. Given t ∈ T_Σ(Y ∪ {z}) where z ∉ Y, if (∀Y) t₁ = t₂ is of sort s and is deducible, then (∀Y) t(z ← t₁) = t(z ← t₂) is also deducible.

Example 4.5.1
Let us use the specification
FOO of Example 4.3.8. Consider the equation (∀x) foo(x) ∨ ¬foo(x) = T, which is shown deducible in Example 4.3.8. Now let t = foo(x) ∨ z. Then rule (4₁) gives us that (∀x) foo(x) ∨ (foo(x) ∨ ¬foo(x)) = foo(x) ∨ T is also deducible. □

We can get the effect of (4) by repeated applications of (4₁) (i.e., the formal proof is by induction on the number of variables in X, using the transitivity of equality). Notice that in (4₁), t₁ (respectively, t₂) is substituted for all occurrences of z in t; there may be many such occurrences, or none. We will see later that OBJ3 implements the case where there is exactly one occurrence of z.

Proposition 4.5.2
Given a set A of Σ-equations, then for any Σ-equation e, A ⊢(3,4) e iff A ⊢(3,4₁) e. □

That is, (4) and (4₁) are interchangeable so long as (3) is present. This gives the following:

Corollary 4.5.3
Given a set A of Σ-equations, then for any Σ-equation e, A ⊢(1–5) e iff A ⊢(1–3,4₁,5) e. □

And of course, both rule sets are complete, by Theorem 4.4.2. Our next step is to combine (4₁) and (5) into the following rule:

(6) Forward Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, and given a substitution θ : Y → T_Σ(X), if (∀Y) t₁ = t₂ is of sort s and is in A, then (∀X) t(z ← θ(t₁)) = t(z ← θ(t₂)) is also deducible.
Exercise 4.5.1
Show that rule (0) is a special case of rule (6). □
Exercise 4.5.2
Show that if A ≠ ∅ then rule (1) is a special case of rule (6). □

Exercise 4.5.3
Show that (5) is a special case of (6), but (4₁) is not. □

The following symmetrical variant of (6) is just as useful:

(–6)
Reverse Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, and given a substitution θ : Y → T_Σ(X), if (∀Y) t₁ = t₂ is of sort s and is in A, then (∀X) t(z ← θ(t₂)) = t(z ← θ(t₁)) is also deducible.

The soundness of (–6) follows from that of (6), by applying (6) and then using the symmetry rule (2) on the resulting equation. For clarity and emphasis, we may write (+6) instead of (6). We now combine (+6) and (–6) into a single rule, as follows:

(±6) Bidirectional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, and given a substitution θ : Y → T_Σ(X), if either (∀Y) t₁ = t₂ or (∀Y) t₂ = t₁ is of sort s and is in A, then (∀X) t(z ← θ(t₁)) = t(z ← θ(t₂)) is also deducible.

This rule is sound because it is the disjunction of two sound rules. It includes (5) and basic cases of (2) and (4). In fact, the following can be shown (see Appendix B):

Theorem 4.5.4
For any set A of Σ-equations and any (unconditional) Σ-equation e, A ⊢ e iff A ⊢(1,3,±6) e. □

This result says that ⊢ is the reflexive and transitive closure of ⊢(±6); consequently, we might write (±6*) instead of (1,3,±6). It is equivalent to take the reflexive, symmetric and transitive closure of (6), which justifies writing (≡6). Based on this, we could get a single rule of deduction based on (6) that is complete all by itself. However, this rule would be rather complex, and we do not give it here.

Theorem 4.5.4 has the important consequence that the reflexive, transitive closure of (±6) is complete, by Theorem 4.4.2. (The "basic cases" mentioned above are those where the deduced equation in the premise is actually in A; Appendix C contains a brief review of closure concepts.)
Given a set A of Σ-equations, show that for any Σ-equation e, A ⊢ e iff A ⊢(1,2,3,6) e. Hint: Show A ⊢(±6) e iff A ⊢(2,6) e. □

Example 4.5.5 (Groups) Now let us use this to prove the right inverse law (∀x) x ∗ x⁻¹ = e for the specification GROUPL. By the Theorem of Constants (Theorem 3.3.11), it suffices to introduce a new constant a and then prove the equation (∀∅) a ∗ a⁻¹ = e. Let GL.1, GL.2, GL.3 denote the three equations in
GROUPL. Then:

  [1] a ∗ a⁻¹ = e ∗ (a ∗ a⁻¹)                  (–6) on GL.1
  [2]         = (a⁻¹⁻¹ ∗ a⁻¹) ∗ (a ∗ a⁻¹)      (–6) on GL.2 with A = a⁻¹
  [3]         = ((a⁻¹⁻¹ ∗ a⁻¹) ∗ a) ∗ a⁻¹      (6) on GL.3
  [4]         = (a⁻¹⁻¹ ∗ (a⁻¹ ∗ a)) ∗ a⁻¹      (–6) on GL.3
  [5]         = (a⁻¹⁻¹ ∗ e) ∗ a⁻¹              (6) on GL.2
  [6]         = a⁻¹⁻¹ ∗ (e ∗ a⁻¹)              (–6) on GL.3
  [7]         = a⁻¹⁻¹ ∗ a⁻¹                    (6) on GL.1
  [8]         = e                              (6) on GL.2

This proof is much simpler than that given in Section 4.1. Also, notice that each step builds on the one before it, which makes the proof much easier to understand. □
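Since each step of this chain is an instance of (+6) or (–6) on a group axiom, soundness says that every model of GROUPL must make all nine terms equal. As a quick sanity check (ours, not part of the text), the following Python sketch evaluates the whole chain in one concrete group, the integers mod 5 under addition, with e = 0, inverse given by negation, and ∗ given by +:

```python
# Spot-check the Example 4.5.5 proof chain in the group Z/5 under addition.
# This only checks soundness in one model; it does not replace the formal
# deduction.  All naming is ours.
N = 5
e = 0
mul = lambda x, y: (x + y) % N   # interpretation of _*_
inv = lambda x: (-x) % N         # interpretation of _-1

for a in range(N):
    chain = [
        mul(a, inv(a)),                                 # a * a-1
        mul(e, mul(a, inv(a))),                         # step [1]
        mul(mul(inv(inv(a)), inv(a)), mul(a, inv(a))),  # step [2]
        mul(mul(mul(inv(inv(a)), inv(a)), a), inv(a)),  # step [3]
        mul(mul(inv(inv(a)), mul(inv(a), a)), inv(a)),  # step [4]
        mul(mul(inv(inv(a)), e), inv(a)),               # step [5]
        mul(inv(inv(a)), mul(e, inv(a))),               # step [6]
        mul(inv(inv(a)), inv(a)),                       # step [7]
        e,                                              # step [8]
    ]
    assert len(set(chain)) == 1, (a, chain)
print("all steps agree in Z/5")
```

Of course, agreement in one model proves nothing by itself; it is the deduction above, plus soundness, that guarantees agreement in every model.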
The rules (+6), (–6) and (±6) can all be specialized to the case where t has exactly one occurrence of z, which corresponds to what OBJ3 implements. In particular, the specialized form of (±6) is the following rule:

(±6₁) Bidirectional One Occurrence Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with exactly one occurrence of z, where z ∉ X, and given a substitution θ : Y → T_Σ(X), if either (∀Y) t₁ = t₂ or (∀Y) t₂ = t₁ is of sort s and is in A, then (∀X) t(z ← θ(t₁)) = t(z ← θ(t₂)) is also deducible.

It is a bit tricky to formalize the concept that t ∈ T_Σ(X ∪ {z}_s) has exactly one occurrence of z. One way is to use the initial algebra approach of Section 3.2. Letting Σ′ = Σ(X ∪ {z}_s), we define a Σ′-homomorphism #z : T_Σ′ → ω which counts the number of occurrences of z in terms, by giving ω a Σ′-structure, using the convention that ω denotes an S-sorted set of copies of the natural numbers, where S is the sort set of Σ: if σ ∈ Σ has arity n > 0, define σ on ω by σ(i₁, ..., iₙ) = i₁ + ··· + iₙ; if σ is a constant in Σ, define σ on ω to be 0; and finally, define x ∈ X to be 0, and z to be 1, in ω.

Now we can state the main result of this section:

Theorem 4.5.6
Given a set A of Σ-equations, then for any Σ-equation e, A ⊢(1–5) e iff A ⊢(1,3,±6₁) e.

Proof: We have already shown the soundness of rules (1), (3) and (±6₁). Therefore A ⊢(1,3,±6₁) e implies A ⊨ e, and then the completeness of ⊢ gives us that A ⊢ e. For the converse, by Theorem 4.5.4 it suffices to show that we can derive the rule (±6) from rules (1), (3), and (±6₁). This can be done by using induction and rule (3) for the two cases (6) and (–6) separately. □

Because of Theorem 4.4.2, this result implies that any equation that is true of all groups can be proved using just (±6₁) plus transitivity and reflexivity.

Exercise 4.5.5
Show that if we weaken (6₁) to "at most one occurrence," then (1) is a special case of this weaker rule. □
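The occurrence-counting homomorphism #z described above (z counts 1, every other leaf counts 0, and an application sums the counts of its arguments) is easy to realize concretely. Here is a Python rendering of ours, with terms represented as nested tuples; the representation and names are illustrative, not OBJ3's:

```python
# Terms: a string is a variable or constant; a tuple (op, arg1, ..., argn)
# is an operation applied to arguments.  occ(t, z) mirrors the
# homomorphism #z of the text.

def occ(t, z):
    """Count the occurrences of variable z in term t."""
    if isinstance(t, str):
        return 1 if t == z else 0   # z counts 1, other leaves count 0
    _op, *args = t
    return sum(occ(a, z) for a in args)  # applications sum their arguments

t1 = ("or", ("foo", "x"), "z")       # foo(x) v z : one occurrence
t2 = ("and", "z", ("not", "z"))      # z & -z     : two occurrences
t3 = ("foo", "x")                    # foo(x)     : no occurrence

print(occ(t1, "z"), occ(t2, "z"), occ(t3, "z"))  # 1 2 0
```

The side condition of (6₁)-style rules is then simply occ(t, "z") == 1.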
Corollary 4.5.7
Given a set A of Σ-equations, then for any Σ-equation e, A ⊢ e iff A ⊢(1,2,3,6₁) e.

Proof: This follows from Theorem 4.5.6, since (2) and (6₁) are together equivalent to (±6₁). □

Completeness of ⊢(1,2,3,6₁) means that every equation valid for A can be proved from A without ever having to apply a rule backwards. This perhaps surprising result motivates and justifies term rewriting, a computational method based on ⊢(1,3,6₁), which is the topic of the next chapter.

(★) An Alternative Congruence Rule
This section shows that an apparently weaker congruence rule is in fact equivalent to the original formulation (4) on page 58. The new rule is:

(4′) Congruence. Given σ ∈ Σ_{s1...sn,s} and given deducible equations (∀X) t_i = t_i′ of sort s_i for i = 1, ..., n, then (∀X) σ(t₁, ..., tₙ) = σ(t₁′, ..., tₙ′) is also deducible.

Proposition 4.5.8 Given a set A of Σ-equations, then for any Σ-equation e,

  A ⊢(1–5) e iff A ⊢(1–3,4′,5) e.
  A ⊢(4) e iff A ⊢(4′) e.

Proof: Since the first assertion follows from the second by induction on the length of proofs, it suffices to prove the second assertion.

If A ⊢(4′) e, then A ⊢(4) e, because e necessarily has the form (∀X) σ(t₁, ..., tₙ) = σ(t₁′, ..., tₙ′), and in rule (4) we can take t = σ(y₁, ..., yₙ) with θ(y_i) = t_i and θ′(y_i) = t_i′ for i = 1, ..., n.

For the converse, assume A ⊢(4) e. We use structural induction (see Section 3.2.1) on the form of t to show that A ⊢(4′) e. For the base, if t ∈ Y, then (∀X) θ(t) = θ′(t) is deducible by hypothesis. Now suppose that t = σ(t₁, ..., tₙ) and that (∀X) θ(t_i) = θ′(t_i) is deducible for i = 1, ..., n. Then (4′) gives us that (∀X) σ(θ(t₁), ..., θ(tₙ)) = σ(θ′(t₁), ..., θ′(tₙ)) is deducible, i.e., that (∀X) θ(t) = θ′(t) is deducible. □

Notice that this result implies that ⊢(1–3,4′,5) is complete.

Discussion

We have given many rules of deduction for many-sorted equational logic, and shown that various subsets are complete for loose semantics. The first variant, consisting of the rules (1–5) in Definition 4.1.3, is not very convenient for calculation, but each rule is relatively intuitive, and this system is convenient for proving the completeness theorem. The final variant, consisting of rules (1), (3) and (±6₁), is much more convenient for calculation, although the rule (±6₁) may seem somewhat complex at first. Some intermediate variants helped to bridge the gap between these two.

Deduction using OBJ

OBJ3 not only supports writing theories (such as that of groups), but also deducing new equations from theories, by applying subterm replacement. This section introduces some features of OBJ3 that are useful for such proofs, through an example proving the right inverse law for the following left-handed theory
GROUPL for groups (it is the same as the theory GROUPL in Example 4.1.4 on page 59):

th GROUPL is sort Elt .
  op _*_ : Elt Elt -> Elt .
  op e : -> Elt .
  op _-1 : Elt -> Elt [prec 2] .
  var A B C : Elt .
  eq e * A = A .
  eq A -1 * A = e .
  eq A *(B * C) = (A * B)* C .
endth
OBJ3 automatically assigns numbers to equations. If there are no other equations in the current environment, then the above equations will be numbered, starting from 1, in the order that they occur, and can be referred to as "GROUPL.1", "GROUPL.2" and "GROUPL.3", or more compactly, as ".1", ".2" and ".3", provided that GROUPL is the module currently in focus.

You can also give your own name to an equation, by placing that name in square brackets in front of the equation. For example, if you had written

  [e] eq e * A = A .
  [i] eq A -1 * A = e .
  [a] eq A *(B * C) = (A * B)* C .

in GROUPL, then you could refer to these equations with the names "GROUPL.e", "GROUPL.i" and "GROUPL.a", or more compactly, with ".e", ".i" and ".a". When there are multiple modules around, this can be much more convenient, because introducing new modules can cause the numbers of old equations to change, whereas the user-assigned names will not change.

As in Example 4.5.5, the proof given below exploits the Theorem of Constants to get rid of a quantifier, and instead reason with a new constant. Assuming that GROUPL has just been read into OBJ3, the command "open ." permits us to begin working within the module
GROUPL, and "op a : -> Elt" temporarily adds a new constant symbol "a" of sort Elt, so that we can form terms that involve this symbol (it represents the universally quantified variable). The command "start a * a -1 ." declares an initial term to which subterm replacement can be applied, yielding a series of equal new terms.

The lines beginning with "***>" are comments, in this case used to say what term we expect the command above it to produce. Each apply command applies the rule (6₁) or (–6₁) to the term produced by the command above it. In each case, an equation in GROUPL is mentioned, either in the form .n or in the form -.n, depending on whether (6₁) or (–6₁) is to be used. A substitution θ is indicated in the form "with A = a -1", and "at term" indicates that t = z in rule (±6₁). Other subterms (i.e., other, non-trivial choices of t) can be selected using so-called "occurrence notation." For example, the left subterm of (a ∗ b) ∗ (c ∗ d) is selected with (1), and the right subterm with (2); moreover, b is selected with (1 2), c with (2 1), and d with (2 2). Finally, "close" exits this special mode of OBJ3, forgetting any operations and equations that may have been added. Further details about apply appear in Section 7.2.2.

open .
  op a : -> Elt .
  start a * a -1 .
  apply -.1 at term .
  ***> should be: e * (a * a -1)
  apply -.2 with A = (a -1) at (1) .
  ***> should be: (a -1 -1 * a -1) * (a * a -1)
  apply .3 at term .
  ***> should be: ((a -1 -1 * a -1)* a)* a -1
  apply -.3 at (1) .
  ***> should be: (a -1 -1 * (a -1 * a)) * a -1
  apply .2 at (1 2) .
  ***> should be: (a -1 -1 * e) * a -1
  apply -.3 at term .
  ***> should be: a -1 -1 * (e * a -1)
  apply .1 at (2) .
  ***> should be: a -1 -1 * a -1
  apply .2 at term .
  ***> should be: e
close

Exercise 4.6.1
Try this yourself. □
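The occurrence notation used by apply is just a path of argument positions into the term. The following Python sketch (ours; the nested-tuple representation is illustrative) makes the lookup precise, using 1-based positions as OBJ3 does:

```python
# select(t, pos) returns the subterm of t at the 1-based position path pos,
# mirroring OBJ3's occurrence notation, e.g. (2 1) means "first argument
# of the second argument".  Terms are tuples (op, arg1, ..., argn).

def select(t, pos):
    for i in pos:
        t = t[i]   # index 0 holds the operator, so argument i is t[i]
    return t

# (a * b) * (c * d) as a nested tuple:
t = ("*", ("*", "a", "b"), ("*", "c", "d"))

print(select(t, ()))      # the whole term
print(select(t, (1,)))    # ('*', 'a', 'b'), the left subterm
print(select(t, (1, 2)))  # 'b'
print(select(t, (2, 1)))  # 'c'
print(select(t, (2, 2)))  # 'd'
```

The 1-based argument positions line up with tuple indices here only because index 0 is reserved for the operation symbol.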
In conjunction with Theorem 4.5.6, the completeness theorem implies that every equation that is true in the theory of groups can be proved using OBJ3 in the style of Example 4.5.5; therefore, we know that the proofs requested in Exercises 4.6.2–4.6.4 are possible (provided the equations are true).

The following additional features of OBJ3 are also useful in doing such proofs: To add an equation that you have just proved, while still inside an open...close environment, you can just type (for example)

  [ri] eq A * A -1 = e .

and then use this equation in proving another. You can also add more variables, e.g.,

  vars C D : Elt .

And of course you can add more operations. At any time, you can see what the current environment contains just by typing show . Note that open must be followed by a period, while close must not be followed by a period!
You may be surprised to see that a version of the Booleans has been automatically included; it will not be included if before the specification you type

  set include BOOL off .
The command "show rules ." causes all the equations currently known to OBJ to be printed together with their numbers and names (if any). In examples that involve importing other modules, it can be hard to predict what ordering OBJ will give to the rules. Therefore, it is important to check what order they actually have, if you want to apply them by their numbers. (A potentially confusing point is that in the terminology of OBJ, "equations" are also called "rules," because they are applied as rewrite rules in OBJ computations; thus, one must be careful to distinguish whether a given instance of the word "rule" means "rule of deduction" in talk about the theory of OBJ, or "rewrite rule" in talk about computations in OBJ.)
Exercise 4.6.2 Prove the right identity law for the specification GROUPL. □

Exercise 4.6.3 Prove the left identity law for the specification GROUP. □

Exercise 4.6.4 Prove the left inverse law for the specification GROUP. □

Exercise 4.6.5 Use OBJ3 to prove that T = F for the specification FOO of Example 4.3.8. What equation have you really proved? Does it follow that T = F in every FOO-algebra? Why? □
Two More Rules of Deduction

This section gives two more rules of deduction for equational logic; they throw an interesting light on cases like that of Example 4.3.8.

(7) Abstraction. If (∀X) t = t′ is deducible from A, and if Y is a ground signature disjoint from X, then (∀X ∪ Y) t = t′ is also deducible from A.

(This rule also applies when X = ∅, where there are originally no variables and some are added.)

For the next rule, we need a preliminary concept: let us say that a sort s ∈ S is void in a signature Σ iff (T_Σ)_s = ∅.

(8) Concretion. If (∀X ∪ Y) t = t′ is deducible from A, if no sort of a variable in Y is void in Σ, and if t, t′ ∈ T_Σ(X), then (∀X) t = t′ is also deducible from A.

Exercise 4.7.1
Show the soundness of rule (7). □
Fact 4.7.1
Rule (8) is sound.

Proof: We have to show that if M ⊨ (∀X ∪ Y) t = t′, then M ⊨ (∀X) t = t′; i.e., that if a(t) = a(t′) for all assignments a : X ∪ Y → M, then b(t) = b(t′) for all b : X → M. This will follow if we can extend any b : X → M to some a : X ∪ Y → M. But we can always pick an arbitrary element m_y ∈ M_s for each y ∈ Y_s, and then set a(y) = m_y, unless there are some s ∈ S and y ∈ Y_s such that M_s = ∅. However, the non-voidness of each s ∈ S such that there is some y ∈ Y_s guarantees that this cannot happen. □

From the proof, we see that it is unsound to remove a quantifier over a void sort, because there really can exist models where the carrier of that sort is void. For example, in Example 4.3.8, we are unable to apply the concretion rule to remove the variable in the equation (∀x) F = T, because the sort A is void. We have seen that the resulting equation F = T is not satisfied by the model I, although the quantified version is satisfied.

By contrast, the abstraction rule (7) is sound if some, or even all, sorts in Y are void.

Conditional Deduction and its Completeness

We can deduce unconditional equations from conditional equations using rules of deduction very similar to those in Definition 4.1.3, except that rule (5) must be modified to account for conditional equations in A. (Recall that conditional equations and their satisfaction have already been defined in Section 3.4.) We will see later that the resulting rule set is complete. Here is the modified rule:

(5C) Conditional Instantiation. If (∀Y) t = t′ if C is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then (∀X) θ(t) = θ(t′) is deducible.
We will write concrete instances of conditional rules in forms like

  (∀x, y, z, w) x + z = y + w if x = y, z = w,

separating pairs in the condition by commas, and using the equality sign. We now show that the rule (5C) is sound:

Lemma 4.8.1
Given a Σ-algebra M satisfying A and a substitution θ : Y → T_Σ(X), if M ⊨ (∀X) θ(u) = θ(v) for all ⟨u, v⟩ ∈ C, then also M ⊨ (∀X) θ(t) = θ(t′).

Proof: Let a : X → M and assume that M ⊨ (∀X) θ(u) = θ(v) for each ⟨u, v⟩ ∈ C. Then (θ ; a)(u) = (θ ; a)(v), and so by the definition of conditional satisfaction, (θ ; a)(t) = (θ ; a)(t′), i.e., a(θ(t)) = a(θ(t′)), by Corollary 3.6.6. □

The proof above shows that the rule (5C) is sound even if C is infinite; but of course, in that case we could never write a finite proof score using the rule. Here are some examples of the use of rule (5C):

Example 4.8.2
In the context of a specification for the natural numbers with a Boolean-valued inequality function >, consider the conditional equation

  (∀x, y, z) x = y if z ∗ x = z ∗ y, z > 0 = true,

and suppose that at some point we have deduced that 5 ∗ a = 5 ∗ b. Then we can use the above conditional equation and rule (5C) to deduce that a = b, since 5 > 0 = true.

In the context of the same specification, now consider the equation

  (∀x, y, z) x > z = true if x > y = true, y > z = true.

Then if at some point we have deduced that a + b > a + c = true and a + c > d = true, then we can use rule (5C) and the above to deduce that a + b > d = true. □

Let us write A ⊢C_Σ e if e is deducible from A using the rules (1, 2, 3, 4, 5C); also as usual, let us omit the subscript Σ and the superscript C if they are clear from context. As with Proposition 4.3.7 in Section 4.3, it now follows by induction on the length of derivations that ⊢C is sound. As with deduction for unconditional equations, we have a completeness theorem:

Theorem 4.8.3 (Completeness) Given a set A of (possibly conditional) Σ-equations, then for any unconditional Σ-equation e, A ⊢C e iff A ⊨ e. □

The proof is given in Appendix B. (Actually, Appendix B proves the more general Theorem 10.3.2 for the case where the equations in A may be order-sorted.)

The following result gives the most generally useful approach to proving a conditional equation from a given set of (possibly conditional) equations:

Theorem 4.8.4
Given a set A of (possibly conditional) Σ-equations and a conditional Σ-equation (∀X) t = t′ if C, let A′ = A ∪ { (∀∅) u = v | ⟨u, v⟩ ∈ C }. Then A ⊨_Σ (∀X) t = t′ if C iff A′ ⊢C_{Σ(X)} (∀∅) t = t′.

Proof: By letting C′ = { (∀∅) u = v | ⟨u, v⟩ ∈ C } and using Proposition 3.4.3, A ⊨_Σ (∀X) t = t′ if C is equivalent to A ∪ C′ ⊨_{Σ(X)} (∀∅) t = t′, which by Theorem 4.8.3 is in turn equivalent to A ∪ C′ ⊢C_{Σ(X)} (∀∅) t = t′, as desired. □

This result, although not very difficult, gives an important completeness theorem for conditional equations, since it says that any conditional equation e that is satisfied by all models of A can be proved by equational deduction from A plus the conditions of e. In practice, it is often possible to do the proof using just (conditional) rewriting.

Exercise 4.8.1
Show that the result of modifying Theorem 4.8.4 by defining A′ to be A ∪ { (∀X) u = v | ⟨u, v⟩ ∈ C }, and then asserting A ⊨_Σ (∀X) t = t′ if C iff A′ ⊢C_Σ (∀X) t = t′, is false. □

Conditional equations in OBJ are a special case of conditional equations as defined above: there is only one pair ⟨u, v⟩ in the set C of conditions, and it must have sort Bool with v = true. As a result, conditional equations in OBJ3 can have the simplified syntactic form

  eq t = t′ if u .

where u is a term of sort Bool. However, this is not really much of a restriction, because equalities (as well as inequalities) can (usually) be expressed as Boolean terms, and u can also be a conjunction of conditions.

For example, the transitive law for a relation > viewed as a Boolean-valued function takes the following form:

  eq X > Z = true if X > Y and Y > Z .
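Rule (5C) can be mimicked by checking the instantiated conditions against already-deduced facts before asserting the instantiated conclusion. The following is a minimal Python sketch (our own modeling, not OBJ3's machinery), using the transitivity law above; the term representation and all names are illustrative:

```python
# deduced: ground equations already known, as (lhs, rhs) pairs.
# A conditional rule is given by its conditions and conclusion, with
# variables as strings; theta is the substitution of rule (5C).

def apply_5C(theta, conditions, conclusion, deduced):
    """Return the instantiated conclusion, or None if a condition fails."""
    def inst(term):  # apply theta to a term (nested tuples of strings)
        return tuple(inst(x) for x in term) if isinstance(term, tuple) \
               else theta.get(term, term)
    for u, v in conditions:
        if (inst(u), inst(v)) not in deduced:
            return None          # condition not deducible: rule blocked
    u, v = conclusion
    return (inst(u), inst(v))

# x > z = true if x > y = true, y > z = true
trans_conds = [((">", "x", "y"), "true"), ((">", "y", "z"), "true")]
trans_concl = ((">", "x", "z"), "true")

# Already deduced: a+b > a+c = true and a+c > d = true.
deduced = {((">", "a+b", "a+c"), "true"), ((">", "a+c", "d"), "true")}
theta = {"x": "a+b", "y": "a+c", "z": "d"}

print(apply_5C(theta, trans_conds, trans_concl, deduced))
# (('>', 'a+b', 'd'), 'true'), i.e. a+b > d = true
```

This mirrors the second part of Example 4.8.2: both instantiated conditions are found among the deduced facts, so the instantiated conclusion a + b > d = true may be asserted.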
The OBJ3 commands for explicitly applying conditional equations have exactly the same syntax as for unconditional equations. However, the rewrite will not actually be done unless there is not only a match of the leftside, but OBJ3 is also able to reduce the condition to true. There are two modes within which this reduction might be accomplished:

1. If reduce conditions is off (which is the default) then the focus for application shifts to the condition, and the user can explicitly invoke apply to try to prove that the condition equals true.

2. If reduce conditions is set on (which must be done explicitly, using the set command) then OBJ3 will just compute the normal form of the condition, and apply the equation iff that form is true.

If a conditional equation is applied when the focus is on the condition of a previously applied equation, then the focus shifts to the condition of the latest equation; these foci can be nested arbitrarily deeply, and a given focus is abandoned in favour of the previous one iff the proof that it is true has been completed. OBJ3 does all this automatically when reduce conditions is on, as can be seen by setting trace on.

Exercise 4.8.2
Recall that a function f : A → B is injective iff it satisfies the conditional equation (∀x, y) x = y if f(x) = f(y), and has a right inverse iff there is a function g : B → A such that g ; f = 1_B, i.e., such that (∀x) f(g(x)) = x is satisfied. Now do the following:

(a) Write an OBJ theory which expresses the two assumptions above.

(b) Write an OBJ proof score for showing that (∀y) g(f(y)) = y holds under these assumptions.

(c) Explain why this proof score proves the equation in (b).

Note that this proves Exercise 3.1.8 in Section 3.1 for the unsorted case. □
Given the following code

obj INT is sort Int .
  ops (inc_)(dec_) : Int -> Int .
  op 0 : -> Int .
  vars X Y : Int .
  eq inc dec X = X .
  eq dec inc X = X .
  op _+_ : Int Int -> Int .
  eq 0 + Y = Y .
  eq (inc X)+ Y = inc(X + Y).
  eq (dec X)+ Y = dec(X + Y).
endo

give an OBJ proof score for the conditional equation (∀x, y) x = y if inc x = inc y, and justify it. □

Conditional Subterm Replacement

This section develops subterm replacement for the case of conditional equations. The basic rule, generalizing rule (6), is as follows:

(6C)
Forward Conditional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, if

  (∀Y) t₁ = t₂ if C

is of sort s and is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then (∀X) t(z ← θ(t₁)) = t(z ← θ(t₂)) is also deducible.

Exercise 4.9.1
Show that rule (6C) is sound. □

Exercise 4.9.2 Show that rule (0) is a special case of (6C). □

Exercise 4.9.3 Show that if A ≠ ∅ then rule (1) is a special case of (6C). □

Exercise 4.9.4 Show that (5C) is a special case of (6C). □

Exercise 4.9.5 Show that (4₁) is not a special case of (6C). □

As with unconditional rewriting, there is also a very useful symmetrical variant of (6C):

(–6C)
Reverse Conditional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, if

  (∀Y) t₁ = t₂ if C

is of sort s and is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then (∀X) t(z ← θ(t₂)) = t(z ← θ(t₁)) is also deducible.

The soundness of (–6C) follows from that of (6C), by applying (6C) and then using the symmetry rule (2) on the resulting equation. For clarity and emphasis, we may write (+6C) instead of (6C). We now combine (+6C) and (–6C) into a single rule, as follows:

(±6C) Bidirectional Conditional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, if either (∀Y) t₁ = t₂ if C or (∀Y) t₂ = t₁ if C is of sort s and is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then (∀X) t(z ← θ(t₁)) = t(z ← θ(t₂)) is also deducible.

This rule is sound because it is the disjunction of two sound rules. It includes (5C) and basic cases (where the given equation is in A) of (2) and (4). In fact, the following can be shown:

Theorem 4.9.1 (Completeness of Subterm Replacement) For any set A of (possibly conditional) Σ-equations and any unconditional Σ-equation e, A ⊢C e iff A ⊢(1,3,±6C) e. □

This result follows from the more general Theorem 10.3.3 proved in Appendix B, for the case where the equations in A may be order-sorted. Note also that Theorem 4.5.4 is a special case of Theorem 4.9.1, and hence also of Theorem 10.3.3.

The rules (+6C), (–6C) and (±6C) can each be specialized to the case where t has exactly one occurrence of z. In particular, the specialized form of (±6C) is the following rule:

(±6C₁) Bidirectional Conditional One Occurrence Subterm Replacement.
Given t ∈ T_Σ(X ∪ {z}_s) with exactly one occurrence of z, where z ∉ X, and given a substitution θ : Y → T_Σ(X), if either (∀Y) t1 = t2 if C or (∀Y) t2 = t1 if C is of sort s and is in A, then

  (∀X) t(z ← θ(t1)) = t(z ← θ(t2))

is deducible if for each pair ⟨u, v⟩ ∈ C, (∀X) θ(u) = θ(v) is also deducible.

We can now obtain the following:

Corollary 4.9.2
Given a set A of (possibly conditional) Σ-equations, then for any unconditional Σ-equation e,

  A ⊢_{(1–6C)} e  iff  A ⊢_{(1,3,±6C)} e  iff  A ⊢_{(1,3,±6C1)} e .

Proof:
We have already shown the soundness of rules (1), (3) and (±6C1). Therefore A ⊢_{(1,3,±6C1)} e implies A ⊨ e, and then the completeness of ⊢_C gives us that A ⊢_C e. For the converse, by Theorem 4.9.1 it suffices to show that we can derive the rule (±6C) from rules (1), (3), and (±6C1). This can be done by using induction and rule (3) for the two cases (+6C) and (–6C) separately. The second "iff" follows as in Corollary 4.5.7. □

Results in this section provide a foundation for conditional term rewriting, discussed in Section 5.8 of the next chapter. In particular, the above result, extending Corollary 4.5.7, gives completeness of the transitive, reflexive, symmetric closure of forward conditional subterm replacement, which becomes conditional term rewriting when symmetry is dropped.

4.10 (⋆) Specification Equivalence
This section discusses the equivalence of specifications that may involve different signatures. For example, the following gives a rather different specification of groups:
Example 4.10.1
If we define a/b = a * b⁻¹ in the theory GROUPL of groups in Example 4.1.4, and then try to find enough properties of this operation to define groups, we might get the following:

th GROUPD- is sort Elt .
  op _/_ : Elt Elt -> Elt .
  var A B C : Elt .
  eq A /(B / B) = A .
  eq (A / A)/(B / C) = C / B .
  eq (A / C)/(B / C) = A / B .
endth
Even though these equations are enough, this specification is not equivalent to GROUPL, because the empty set is a model of GROUPD-, whereas it is not a model of GROUPL. However, if we add an identity to the specification, we do get an equivalent theory:

th GROUPD is sort Elt .
  op _/_ : Elt Elt -> Elt .
  op e : -> Elt .
  var A B C : Elt .
  eq A /(B / B) = A .
  eq (A / A)/(B / C) = C / B .
  eq (A / C)/(B / C) = A / B .
  eq (A / A) = e .
endth
The last axiom says that e is an identity. □

But how can we prove that GROUPD and GROUPL are equivalent? We will certainly need a more general definition of equivalence than the one given in Section 3.3, because the signatures of the two specifications are different.

Before we can give this definition, we need some more notation. Each Σ-algebra has an interpretation of each operation symbol σ ∈ Σ as an actual operation; we show how this extends to an interpretation for Σ-terms with variables. Given w = s1 ... sn ∈ S*, we let wX denote an S-sorted ground signature disjoint from Σ such that #(wX_s) = #{ i | s_i = s }. One way to construct such a signature is to let |wX| = {x1, ..., xn} where n = |w|, and then let wX_s = {x_i | s_i = s}. For example, if S = {a, b, c} and w = abbac, then wX has wX_a = {x1, x4}, wX_b = {x2, x3} and wX_c = {x5}. We shall use this construction in the following:

Definition 4.10.2
Given a signature Σ, the signature of all derived Σ-operations is the S-sorted signature Der(Σ) with Der(Σ)_{w,s} = T_Σ(wX)_s for all w ∈ S* and s ∈ S.
Any t ∈ Der(Σ)_{w,s} defines an actual operation M_t : M^w → M_s on any Σ-algebra M as follows: given a ∈ M^w, there is a naturally corresponding S-indexed map a : wX → M, which lets us view M as a Σ(wX)-algebra; hence there is a unique Σ(wX)-homomorphism ā : T_Σ(wX) → M, which lets us define M_t(a) to be ā(t). This is called the derived operation defined by t. In this way, we can view any Σ-algebra M as a Der(Σ)-algebra, also denoted M. □

Definition 4.10.3
Given signatures Σ and Σ′ with sort sets S and S′ respectively, a signature morphism (or map) ϕ : Σ → Σ′ consists of a map f : S → S′ and an S* × S-indexed map g with components g_{w,s} : Σ_{w,s} → Σ′_{f(w),f(s)}, where f is extended to lists by f([]) = [] and f(s1 ... sn) = f(s1) ... f(sn). Given s ∈ S and w ∈ S*, we may write ϕ(s) and ϕ(w) instead of f(s) and f(w), respectively; and given σ ∈ Σ_{w,s}, we may write ϕ(σ) instead of g(σ).

Given a signature morphism ϕ : Σ → Σ′ and a Σ′-algebra M, we get a Σ-algebra, called the reduct of M under ϕ and denoted ϕM, as follows:

• Given s ∈ S, let (ϕM)_s = M_{ϕ(s)};
• Given σ ∈ Σ_{w,s}, let (ϕM)_σ = M_{ϕ(σ)} : M_{ϕ(w)} → M_{ϕ(s)}.

In particular, given a signature morphism ϕ : Σ → Der(Σ′) and a Σ′-algebra M, we can view M as a Der(Σ′)-algebra by Definition 4.10.2, and then get a Σ-algebra denoted ϕM from the construction above. We will call a signature morphism ϕ : Σ → Der(Σ′) a derivor from Σ to Σ′. □

It follows that any derivor ϕ : Σ → Der(Σ′) induces a unique Σ-homomorphism T_Σ → ϕT_{Σ′}, because ϕT_{Σ′} is a Σ-algebra by the above. Let us denote this homomorphism ϕ.

Definition 4.10.4 An interpretation of specifications ϕ : (Σ, A) → (Σ′, A′) is a derivor ϕ : Σ → Der(Σ′) such that for every Σ′-algebra M′, M′ ⊨ A′ implies ϕM′ ⊨ A.
Specifications (Σ, A) and (Σ′, A′) are equivalent iff there exist interpretations ϕ : (Σ, A) → (Σ′, A′) and ψ : (Σ′, A′) → (Σ, A) such that

  ϕ(ψM) = M
  ψ(ϕM′) = M′

for all (Σ, A)-algebras M and (Σ′, A′)-algebras M′. □

If specifications (Σ, A) and (Σ′, A′) are equivalent, then any (Σ, A)-algebra can be seen as a (Σ′, A′)-algebra, and vice versa. We will see that the specifications GROUPL and GROUPD are equivalent in this sense. Also, Exercises 4.6.1–4.6.4 show that the specifications GROUP and GROUPL are equivalent in the sense of the less general definition of equivalence given in Section 3.3, which suffices because the signatures are the same.

It is worth remarking that we can get an even more general definition of equivalence by weakening the conditions in the above definition to the following:

  ϕ(ψM) ⊨ e  iff  M ⊨ e
  ψ(ϕM′) ⊨ e′  iff  M′ ⊨ e′

for all Σ-equations e and Σ′-equations e′. Another generalization (which however involves some category theory) is to require the two categories of models to be isomorphic.

Exercise 4.10.1
Use OBJ3 to prove that the specifications GROUPL and GROUPD (of Example 4.10.1) are equivalent in the sense of Definition 4.10.4. □
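One direction of this exercise can be set up as an OBJ3 proof score; the following is only a sketch under our own assumptions: the constants a, b, c are fresh, the equation A / B = A * (B -1) defines division as in Example 4.10.1, and the auxiliary group lemmas added below (right identity, right inverse, e -1 = e, inverse of a product, double inverse, and left cancellation) are assumed to have been proved first, e.g., along the lines of Exercises 4.6.1–4.6.4.

```
open GROUPL .
  op _/_ : Elt Elt -> Elt .
  ops a b c : -> Elt .
  vars A B : Elt .
  eq A / B = A * (B -1) .
  *** assumed lemmas, each provable from GROUPL:
  eq A * e = A .
  eq A * (A -1) = e .
  eq e -1 = e .
  eq (A * B) -1 = (B -1) * (A -1) .
  eq (A -1) -1 = A .
  eq (A -1) * (A * B) = B .
  ***> the GROUPD axioms; each reduction should yield true:
  red a / (b / b) == a .
  red (a / a) / (b / c) == c / b .
  red (a / c) / (b / c) == a / b .
  red a / a == e .
close
```

Sketching the converse direction, with A * B defined as A / (e / B) and A -1 as e / A in GROUPD, would complete the equivalence.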
Exercise 4.10.2
Show that if two specifications are equivalent in the sense of Section 3.3, then they are equivalent in the sense of Definition 4.10.4. □
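When every operation is mapped to an operation (rather than to a general derived term), a signature morphism in the sense of Definition 4.10.3 can be written down directly in OBJ3 as a view. The following sketch assumes a theory MONOID and a module NAT providing a sort Nat with _+_ and 0; all the names here are our own:

```
th MONOID is sort M .
  op _*_ : M M -> M .
  op e : -> M .
  vars A B C : M .
  eq (A * B) * C = A * (B * C) .
  eq e * A = A .
  eq A * e = A .
endth

*** the signature morphism sends M to Nat, _*_ to _+_, and e to 0;
*** because NAT satisfies the translated axioms, it is in fact an
*** interpretation of specifications in the sense of Definition 4.10.4:
view PLUS from MONOID to NAT is
  sort M to Nat .
  op _*_ to _+_ .
  op e to 0 .
endv
```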
Exercise 4.10.3
Show that replacing the three equations in GROUPD- by the single equation

  eq A /((((A / A)/ B)/ C)/(((A / A)/ A)/ C)) = B .

yields an equivalent specification. □
Definition 4.10.5
A derivor ϕ : Σ → Der(Σ′) induces a signature morphism ϕ* : Der(Σ) → Der(Σ′) as follows: suppose that ϕ = ⟨f, g⟩, and note that g_{w,s} : Σ_{w,s} → T_{Σ′}(f(w)X)_{f(s)} for each pair ⟨w, s⟩. Define ϕ*_w : T_Σ(wX) → ϕT_{Σ′}(f(w)X) for each w ∈ S* to send x_i (of sort w_i) in wX to x_i (of sort f(w_i)) in f(w)X, noting that T_Σ(wX) is the free Σ-algebra generated by wX and that ϕT_{Σ′}(f(w)X) is also a Σ-algebra. Then, noting that ϕ*_w is an S-sorted map, the collection ϕ*_{w,s} forms a signature morphism ϕ* : Der(Σ) → Der(Σ′).

Now given interpretations ϕ : (Σ, A) → (Σ′, A′) and ψ : (Σ′, A′) → (Σ″, A″), we can define their composition as interpretations to be the signature morphism ϕ ; ψ* : Σ → Der(Σ″). □

Exercise 4.10.4
Show that the composition of two interpretations ϕ : (Σ, A) → (Σ′, A′) and ψ : (Σ′, A′) → (Σ″, A″) is also an interpretation. □

Exercise 4.10.5
Let i_Σ : Σ → Der(Σ) send σ ∈ Σ_{w,s} to the term σ(x1, ..., xn) ∈ T_Σ(wX)_s, where w has length n. Then show that (i_Σ)* = 1_{Der(Σ)} and that i_Σ serves as an identity for the composition of interpretations. □

In Section 5.9 we will need a more syntactical formulation of interpretation. We first give some auxiliary notions:
Definition 4.10.6
Let X be an S-sorted variable set and let f : S → S′ be a map. Then the S′-sorted variable set denoted fX is defined for s′ ∈ S′ by

  (fX)_{s′} = ⋃ { X_s | f(s) = s′ } .

Note that fX is again a variable set because the variable symbols in X are all distinct.

Now let t be a Σ-term with variables in X and let ϕ = (f, g) : Σ → Σ′ be a signature morphism. Then we extend ϕ to a function ϕ : T_Σ(X) → ϕT_{Σ′}(fX) as follows:

  ϕ(x) = x  for x ∈ X
  ϕ(σ) = g_{[],s}(σ)  for σ ∈ Σ_{[],s}
  ϕ(σ(t1, ..., tn)) = g_{w,s}(σ)(ϕt1, ..., ϕtn)  for σ ∈ Σ_{w,s}, where w = s1 ... sn.

Finally, let e be a Σ-equation (∀X) t = t′ and let ϕ = (f, g) : Σ → Σ′ be a signature morphism. Then the Σ′-equation denoted ϕe is defined to be (∀fX) ϕt = ϕt′. In this context, we may also write ϕX instead of fX.

Note that ϕ : T_Σ(X) → ϕT_{Σ′}(ϕX) is a Σ-homomorphism, and that it could also have been defined using the freeness of T_Σ(X). □

We will need the following result:

Theorem 4.10.7 (Satisfaction Condition) Given a signature morphism ϕ : Σ → Σ′ and a Σ′-algebra M′, then for any Σ-equation e,

  M′ ⊨_{Σ′} ϕe  iff  ϕM′ ⊨_Σ e . □

A proof may be found in [66]; it is not trivial. A general discussion of the importance of this kind of result for abstract model theory is given in [67]. We can now state the result that we have been working towards:
Theorem 4.10.8
A derivor ϕ : Σ → Der(Σ′) is an interpretation of specifications ϕ : (Σ, A) → (Σ′, A′) iff for each Σ-equation e, A ⊨_Σ e implies A′ ⊨_{Σ′} ϕe.
Call the conditions of Definition 4.10.4 and of this theorem (A) and (B), respectively. To show (A) implies (B), we assume (A) and A ⊨ e, and then show that A′ ⊨ ϕe, i.e., that M′ ⊨ A′ implies M′ ⊨ ϕe. So assuming M′ ⊨ A′, (A) gives us ϕM′ ⊨ A, and then A ⊨ e gives us ϕM′ ⊨ e. Now we apply the satisfaction condition to obtain M′ ⊨ ϕe, as desired.

For the converse, we assume (B) and M′ ⊨ A′, and wish to show that ϕM′ ⊨ A. If we let e ∈ A, then A ⊨ e, so that (B) reduces to M′ ⊨ A′ implies M′ ⊨ ϕe. Our assumption then gives M′ ⊨ ϕe, and the satisfaction condition gives ϕM′ ⊨ e. Therefore ϕM′ ⊨ A. □

The following is now a direct consequence of the completeness of equational deduction:
Corollary 4.10.9
A derivor ϕ : Σ → Der(Σ′) is an interpretation of specifications ϕ : (Σ, A) → (Σ′, A′) iff for each equation e ∈ A, A ⊢ e implies A′ ⊢ ϕe. Furthermore, a pair of derivors ϕ : Σ → Der(Σ′) and ψ : Σ′ → Der(Σ) is an equivalence of the specifications (Σ, A) and (Σ′, A′) iff

  A ⊢ e implies A′ ⊢ ϕe
  A′ ⊢ e′ implies A ⊢ ψe′

for each e ∈ A and e′ ∈ A′, and in addition

  ϕ(ψM) = M
  ψ(ϕM′) = M′

for each (Σ, A)-algebra M and (Σ′, A′)-algebra M′. □
Exercise 4.10.6
Given a specification (Σ, A), show that its closure (Σ, Ā) is given by

  Ā = { e | M ⊨ A implies M ⊨ e, for all Σ-algebras M } ;

that is, Ā equals the set of all equations that are true of all models of A. Now show that if ϕ and ψ are an equivalence of specifications (Σ, A) and (Σ′, A′), then

  ϕĀ = Ā′
  ψĀ′ = Ā . □

4.11 (⋆) A More Abstract Formulation of Deduction
This section gives a more abstract formulation of deduction; it is rather specialized, and can safely be skipped on a first reading or in an introductory course.
Definition 4.11.1
Let Sen be a set whose elements are called sentences. Then an inference rule over Sen is a pair ⟨H, c⟩ where H ⊆ Sen is finite and c ∈ Sen; we call H the hypothesis of the rule, and c its conclusion. A subset C ⊆ Sen is closed under ⟨H, c⟩ iff H ⊆ C implies c ∈ C. Given a set R of inference rules, a set C of sentences is closed under R iff C is closed under each rule in R. Two sets R, R′ of inference rules are called equivalent iff they have the same closed sets of sentences.

Given a set R of inference rules, a proof for a sentence e is a finite sequence of sentences e1 e2 ... en with en = e, such that for i = 1, ..., n there exists a rule ⟨H, e_i⟩ with H ⊆ {e1, ..., e_{i−1}}; in this case, we say that e can be proved using R. Let Th(R) denote the set of all sentences that can be proved using R. □

In the case of equational logic,
Sen would be the set of all equations over a given signature, and a rule (e.g., congruence) is represented by the set of its instances for all ground equations. A rule with no hypothesis, such as reflexivity, has H = ∅.

Fact 4.11.2 If C_i is closed under a set R of inference rules for each i ∈ I, then ⋂_{i∈I} C_i is also closed under R. □

Proposition 4.11.3
Given any set R of inference rules, Th(R) is the least set of sentences that is closed under R.

Proof:
We first show that Th(R) is closed under R. Let ⟨H, c⟩ be in R with H = {e1, ..., en} such that H ⊆ Th(R). Then for i = 1, ..., n there exists a proof p_i of e_i. It now follows that the sequence p1 ... pn c is a proof of c, so that c ∈ Th(R).

Now assuming that T ⊆ Sen is closed under R, we will show that Th(R) ⊆ T. Let e ∈ Th(R) and let e1 ... en be a proof of e = en. We have to show e ∈ T. For this purpose, we show by induction that e_i ∈ T for i = 1, ..., n. Suppose that e_j ∈ T for each j < i. Because e1 ... e_i is a proof, there exists a rule ⟨H, e_i⟩ such that H ⊆ {e1, ..., e_{i−1}}. Since H ⊆ T and T is closed under R, we have that e_i ∈ T. □

This result can be used to prove a property of Th(R) by showing that the set of sentences having that property is closed under R. This proof technique may be called "structural induction over proof rules" (see also the discussion in Section 3.2.1).

4.12 Literature

It is typical of logical systems that there are many different variants of their rules of deduction, each variant more suitable for some purposes than for others. Although the intuitions behind equational logic are very familiar (basically, substituting equals for equals in equals), the details can be surprisingly subtle, and there are many errors in the published literature. In fact, it is also typical that reasoning about deducibility (such reasoning is called "proof theory") can be very subtle, and the reasons for emphasizing semantics in this text include avoiding such reasoning as much as possible, as well as checking its soundness. (The proof of Theorem 4.5.4 in Appendix B is a good example of proof-theoretic reasoning.)

In 1935 Birkhoff [12] first proved a completeness theorem for equational logic, in the unsorted case.
Example 4.3.8, showing that the unsorted rules can be unsound for many-sorted algebras that may have empty carriers, is from [78] and [137], which first gave rules of deduction that are sound and complete for the general case. The rules in Section 4.7 are also from [78]. The discussion of completeness (especially Theorem 4.4.2) follows [136], although the proof in Appendix B is from [82] for the order-sorted case. The discussion in Section 4.3 follows that in [80]. A more detailed historical discussion of various versions of completeness for equational logic is given in [136], along with further discussion of the non-equivalence of one-sorted and many-sorted equational logic that was demonstrated in Section 4.3.2.

Subterm replacement is the logical basis for term rewriting, which is the topic of Chapter 5. Some historical remarks on term rewriting are also given there. The extension of OBJ3 to permit applying rules one at a time was done at Oxford by Timothy Winkler, mostly during September 1989.

The discussion of deduction with conditional equations in Sections 4.8 and 4.9 parallels that in the preceding sections for unconditional equations. The conditional Completeness Theorem (4.8.3) is a special case of the corresponding theorem for the order-sorted case, first proved in (the first version of) [82]. Theorems 4.9.1 and 4.8.4 seem not to appear in the literature, though they are certainly important, and will probably not surprise experts in the field.

Lawvere's categorical formulation of algebraic theories for the unsorted case [121] embodies the same notion of specification equivalence as that discussed in Section 4.10; however, the formulation that we give is new. Readers with a categorical background may be interested to note that
Der is a left adjoint to the forgetful functor from Lawvere theories to signatures; many properties in Section 4.10 follow from this property. The composition operation for interpretations is the Kleisli category composition. The one-equation specification of groups in Section 4.10 is due to Higman and Neumann [102]. The satisfaction condition (Theorem 4.10.7) plays a central role in the theory of institutions, which is an abstract theory of the relationship between syntax and semantics that has been used for a number of computing science applications, including specification [67].

Equational deduction has many applications in computer science and other areas. For example, it is applied to modelling and verifying several different kinds of hardware circuits in Chapter 5, and it has even been used directly as a model of computation [70, 135].

I thank Prof. Virgil Emil Cazanescu for the formulations in Section 4.11, and for several other valuable suggestions, including finding several bugs in this chapter and suggesting fixes for them.
A Note to Lecturers:
When lecturing on Section 4.5, it would make sense to treat all the preceding rules and results as leading up to rule (±6) and Theorem 4.5.6. In this case, the variant rules (such as (6C)) and the results about them can be regarded as parts of the proof of Theorem 4.5.6, and assigned as reading rather than covered in lectures.

Much of the material on conditional equations can be covered quickly by relying on the analogy with the unconditional case. However, rule (5C) and the two completeness theorems (4.8.3 and 4.8.4) should be covered explicitly.

The material in Section 4.10 is difficult and should not be attempted in an introductory course. Section 4.11 is rather specialized and somewhat abstract, and should also be omitted in an introductory course. Section 4.5.1 is also rather specialized.
5 Rewriting
This chapter studies term rewriting, the restricted form of equational deduction in which equations are applied in the forward direction only, starting from a given term and "chaining" with the transitivity of equality, that is, repeatedly applying the rule (+6), or (+6C) for the conditional case, in the style of Example 4.5.5. Corollary 4.5.7 (or Corollary 4.9.2) shows that term rewriting with symmetry is complete for equational logic. Without symmetry, completeness is lost, but we will see that certain natural assumptions restore it, giving a decision procedure for equality of ground terms under loose semantics, and a computational semantics for sets of equations. Term rewriting has important applications in algebraic specification, computer algebra, the λ-calculus, implementation of declarative languages, and much else.

OBJ implements term rewriting with a command which, given a term, searches for a match of a rule to a subterm of that term, applies the rule, and iterates this process until there are no more matches; the final term (if one exists) is considered the result of the computation. This process is sound, even when not complete.

This chapter mainly presents and proves results that are useful for theorem proving; many are new, especially those on termination and conditional term rewriting. Abstract rewrite systems are also discussed, with optional sections (5.8.3 and 5.9) on Noetherian orderings and on the relationship between term rewriting and abstract rewriting systems.

5.1 Term Rewriting

Syntactically, rewrite rules are a special kind of equation; their definition (5.1.3 below) uses:
Definition 5.1.1
Given t ∈ T_Σ(X), the set of variables in t, denoted var(t), is the least subsignature Y ⊆ X such that t ∈ T_Σ(Y). □

We can define var(t) formally using initial algebra semantics (Section 3.2). Let V be the S-sorted set with each V_s the set of all finite subsets of the elements of X, given Σ(X)-structure by V_x = {x} ∈ V_s for x ∈ X_s, V_σ = ∅ ∈ V_s for σ ∈ Σ_{[],s}, and
Notation 5.1.2
From now on, we will usually just say "Σ-term" for what we were previously careful to call a "Σ-term with variables," i.e., for t ∈ T_Σ(X) for some ground signature X without overloading. A Σ-term without variables will be called a ground term. □

Notice that t is a ground term iff var(t) = ∅. Of course, every Σ-term t is a ground term over any signature that contains Σ(var(t)). The usual literature on term rewriting is not very careful with bookkeeping for the variables involved, but we have seen in Section 4.3.2 that this can be very important for theorem proving.

Definition 5.1.3 A Σ-rewrite rule is a Σ-equation (∀X) t1 = t2 with var(t2) ⊆ var(t1) = X. It follows that the notation t1 → t2 is unambiguous, because X is determined by t1. A Σ-term rewriting system (abbreviated Σ-TRS) is a set of Σ-rewrite rules; we may denote such a system (Σ, A), and we may omit Σ here and elsewhere if it is clear from context. □

Most equations that users write in OBJ are rewrite rules, with variables exactly those that occur in their leftsides. For example, all the equations in all of our group specifications are rewrite rules in this way. Notice that if some equation is not a rewrite rule, then its converse (with its left and right sides reversed) may be a rewrite rule.

The rule (+6₁) of Chapter 4 replaces exactly one subterm, moving in the forward direction only. We now further restrict it to equations that are rewrite rules, to get the following:

(rw) Rewriting. Given t ∈ T_Σ({z}_s ∪ Y) with exactly one occurrence of z, and given a substitution θ : X → T_Σ(Y), if t1 → t2 is a Σ-rewrite rule of sort s in A with var(t1) = X, then

  (∀Y) t(z ← θ(t1)) = t(z ← θ(t2))

is deducible.

As usual, it is assumed that {z}_s and Y are disjoint. This rule is sound because it is a restriction of a rule that we have already proved to be sound.
The successive application of this rule to a term gives a method of reasoning that is formalized in the following:

Definition 5.1.4
Given a Σ-TRS A, the one-step rewriting relation is defined for Σ-terms t, t′ by t ⇒_A t′ iff there exist a rule t1 → t2 of sort s in A, a term t0 ∈ T_Σ({z}_s ∪ Y) with exactly one occurrence of the variable z, and a substitution θ : X → T_Σ(Y) where X = var(t1), such that t = t0(z ← θ(t1)) and t′ = t0(z ← θ(t2)).

In this case, the pair (t0, θ) is called a match to (a subterm of) t by (the term t1 of) the rule t1 → t2. The term rewriting relation is the transitive, reflexive closure of the one-step rewriting relation, for which we write t ⇒*_A t′, and we say that t rewrites to t′ (under A). We may also write t ⇒+_A t′ if t rewrites to t′ in one or more steps (i.e., ⇒+_A is the transitive closure of ⇒_A), and ⇔*_A for the transitive, reflexive, symmetric closure of ⇒_A. We may omit the subscript A if it is clear from context.

(The footnote on var from Definition 5.1.1 concludes: V_σ(v1, ..., vn) = ⋃_{i=1}^{n} v_i for σ ∈ Σ_{w,s}, where w = s1 ... sn and v_i ∈ V_{s_i}. Then var is the restriction to T_Σ(X) of the unique Σ(X)-homomorphism T_Σ(X) → V.)

It is common to assume that the leftside of a rewrite rule is not just a single variable, because rules of this kind, which we call lapse rules, are not very useful and moreover have some bad properties. (The term "lapse" is a joke based on the facts that a rule with its rightside a variable is called a collapse rule, and that "co" indicates duality.) However, few results actually need the no lapse assumption, and we will invoke it only where necessary. A TRS that has no lapse rules will be called lapse free.

Also, the variables of a rightside need not be exactly those of its leftside; for example, the variables that occur in the two sides could even be disjoint.
The subterm θ(t1) of t is sometimes called the redex (for reducible expression) of the rewrite t ⇒ t′, and the subterm θ(t2) of t′ is sometimes called the contractum of the rewrite, while t is called the source and t′ the target or the result of the rewrite. □

For reasons that will soon become clear, it is worth emphasizing that the above definitions apply only to ground terms, i.e., to T_Σ. We consider rewriting terms with variables later.

Example 5.1.5
Consider the following specification for the natural numbers with addition:

obj NATP+ is sort Nat .
  op 0 : -> Nat .
  op s_ : Nat -> Nat [prec 2] .
  op _+_ : Nat Nat -> Nat .
  var N M : Nat .
  eq N + 0 = N .
  eq N + s M = s(N + M) .
endo
We can tell OBJ to regard this specification as a TRS and apply its equations as rewrite rules to a term t, just by giving the command

  reduce t .

where the final period is required, and must be separated from the term by a space unless the last character of t is a parenthesis. Note that, by default, reductions are executed in the context of the most recently preceding module. Also, "reduce" can be abbreviated "red". Here are two examples:

  red s s 0 + s s 0 .
  red s(s s 0 + s s s 0)+ 0 .

(The concept of a TRS is reviewed in Appendix C.)
The results are as you would expect, namely s s s s 0 and s s s s s s 0, respectively, i.e., 2 + 2 = 4 and (1 + (2 + 3)) + 0 = 6. The OBJ output from the first reduction looks as follows,

==========================================
reduce in NATP+ : s (s 0) + s (s 0)
rewrites: 3
result Nat: s (s (s (s 0)))
==========================================

and the steps of this reduction are

  s s 0 + s s 0 ⇒ s(s s 0 + s 0) ⇒ s s(s s 0 + 0) ⇒ s s(s s 0) = s s s s 0 .

If the trace mode is on, then OBJ will display each rewriting step of each reduction it executes. The commands

  set trace on .
  set trace off .

respectively turn trace mode on and off. □
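A reduction can also name its module explicitly, in the same form that already appears in the output header above; for example (a small sketch):

```
red in NATP+ : s 0 + s 0 .
```

Here the expected result is s s 0, by the two steps s 0 + s 0 ⇒ s(s 0 + 0) ⇒ s(s 0).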
Exercise 5.1.1
Show the rewriting steps for the second reduction in Example 5.1.5. □
Examples like the above suggest how term rewriting can be considered a model of computation: given a Σ-term t0 in T_Σ(X), each rewrite in a sequence

  t0 ⇒ t1 ⇒ t2 ⇒ ...

is considered a step of computation, and a term that cannot be rewritten any further is considered a result. Note that given t with var(t) = X, all these computations occur in the Σ-algebra T_Σ(X), or T_Σ(Y) for any X ⊆ Y. The following formalizes this notion of a computation result:
Given a Σ-TRS A, a Σ-term t is irreducible, also called a normal or reduced form (under A), iff there is no match to t by any rule in A. If t ⇒*_A t′ and t′ is a normal form, then we say that t′ is a normal (or reduced) form of t (under A). □

Here is another example of computation with rewrite rules:
Example 5.1.7
The following simple version of the Booleans is sufficient for computing the values of ground terms that involve only true, false, and, and not:

obj ANDNOT is sort Bool .
  ops true false : -> Bool .
  op _and_ : Bool Bool -> Bool .
  op not_ : Bool -> Bool [prec 2] .
  var X : Bool .
  eq true and X = X .
  eq false and X = false .
  eq not true = false .
  eq not false = true .
endo

The following are some sample reductions using this code:

  red not true and not false .
  red not (true and not false) .
  red (not not true and true) and not false .
We can also run reductions that involve variables, e.g.,

  red X and not X .
  red not not X .

(We don't need to declare X here because it is already declared in ANDNOT, but any other variables would need to be declared.) The results of these two reductions show that this specification is not powerful enough to prove every true Boolean equation with variables. □
Exercise 5.1.2
Show the rewrites and the results for each reduction in Example 5.1.7. □
Although this book contains many "natural" examples of TRS's, artificial examples like the two below can be illuminating, for example, as counterexamples to conjectured general results, or to illustrate certain concepts:
Example 5.1.8
Consider a TRS A having just one sort, with a, b, c, d constants of that sort, and with the following rewrite rules: a → b, a → c, b → a, and b → d. □
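For concreteness, the abstract TRS of Example 5.1.8 can be transcribed into OBJ syntax (a sketch; the module name is ours):

```
obj ABCD is sort Elt .
  ops a b c d : -> Elt .
  eq a = b .
  eq a = c .
  eq b = a .
  eq b = d .
endo
```

But one should not ask OBJ to reduce a here: since a ⇒ b and b ⇒ a, reduction need not terminate, which is exactly why such artificial systems are useful as counterexamples rather than as programs.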
Show that a ⇒+ c, a ⇒+ d, and also that a ⇒+ a, for the TRS of Example 5.1.8. □

The rest of this section explores the crucial relationship between term rewriting and equational deduction. We first extend rewriting to terms with variables X, which we define as ground term rewriting in T_Σ(X), and denote by ⇒_{A,X} when rules in A are used.
Proposition 5.1.9
Given t, t′ ∈ T_Σ(Y), Y ⊆ X, and a Σ-TRS A, then t ⇒_{A,X} t′ iff t ⇒_{A,Y} t′, and in both cases var(t′) ⊆ var(t).

Proof:
The converse implication is easy, so we assume t ⇒_{A,X} t′ with t = t0(z ← θ(t1)) and t′ = t0(z ← θ(t2)) for some rule t1 → t2 in A with θ : var(t1) → T_Σ(X) and t0 ∈ T_Σ(X ∪ {z}). Since t, t′ ∈ T_Σ(Y), we must have t0 ∈ T_Σ(Y ∪ {z}) as well as θ(t1), θ(t2) ∈ T_Σ(Y), so that θ : var(t1) → T_Σ(Y). Therefore t ⇒_{A,Y} t′, and var(t′) ⊆ var(t) since var(t2) ⊆ var(t1). □
Given t, t′ ∈ T_Σ(Y), Y ⊆ X, and a Σ-TRS A, then t ⇒*_{A,X} t′ iff t ⇒*_{A,Y} t′, and in both cases var(t′) ⊆ var(t).

Proof:
By induction using Proposition 5.1.9. □
Thus both ⇒_{A,X} and ⇒*_{A,X} restrict and extend well over X, so we can drop the variable set subscript and write just t ⇒*_A t′, with the understanding that any X such that var(t) ⊆ X may be used.
Show that for any finite X, there are A, t, t′ such that t ⇒_A t′ but t ⇒_{A,X} t′ fails. □
Proposition 5.1.11
Given a Σ-TRS A and t1, t2 ∈ T_Σ(X), then t1 ⇒*_A t2 implies A ⊢ (∀X) t1 = t2.

Proof:
Each single step of rewriting is an application of the rule (rw), so soundness follows from the fact that (rw) is a special case of the rule (+6₁) of Chapter 4, which we have already shown sound. Induction then extends this from ⇒_A to ⇒*_A. □
Given a Σ-TRS A and t1, t2 ∈ T_Σ(X), write t1 ↓_{A,X} t2 iff there is some Σ-term t such that t1 ⇒*_A t and t2 ⇒*_A t; in this case, we say that t1 and t2 are convergent, or converge (to t). We may refer to a configuration t1 ⇒*_A t, t2 ⇒*_A t as a join or a V. □
Given Σ -TRS A and t , t ∈ T Σ (Y ) with Y ⊆ X , then t ↓ A,X t iff t ↓ A,Y t , so the variable set subscript can be dropped. Moreover, t ↓ A t implies A (cid:96) ( ∀ X) t = t . (cid:2) Exercise 5.1.5
Prove Proposition 5.1.13. (cid:2)
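The joinability relation of Definition 5.1.12 is easy to compute when the rewrite relation is finite. The following Python sketch is an editorial illustration (not part of OBJ, and the element names are made up): it works with an abstract one-step relation given as a dictionary, and decides whether two elements converge by intersecting their sets of ⇒*-reachable elements.

```python
# One-step rewrite relation on a finite set, as a dict from each
# element to the set of its one-step successors.  This is an
# abstract rewrite system, with just enough structure to
# illustrate joins; the element names are hypothetical.
REL = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set(), "e": set()}

def reachable(x, rel):
    """All y such that x =>* y (reflexive-transitive closure)."""
    seen, todo = {x}, [x]
    while todo:
        for y in rel[todo.pop()]:
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def joinable(x, y, rel):
    """x and y converge iff some z has x =>* z and y =>* z."""
    return bool(reachable(x, rel) & reachable(y, rel))
```

Here joinable("b", "c", REL) holds, with join term "d", while "e" converges with nothing but itself.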
Proposition 5.1.14
Given a Σ-TRS A and t, t′ ∈ TΣ(X), then t ⇔*A t′ iff there are t1, ..., tn ∈ TΣ(X) such that t ↓A t1, ti ↓A ti+1 for i = 1, ..., n−1, and tn ↓A t′.

Proof: If R denotes the transitive closure of ↓A, then we wish to show that t ⇔*A t′ iff t R t′. Since ⇒A ⊆ ↓A ⊆ ⇔*A, it follows that ⇒*A ⊆ R ⊆ ⇔*A. But R is also reflexive and symmetric because ↓A is. Therefore R = ⇔*A. □

Corollary 5.1.10 and Proposition 5.1.13 show that ⇒*A,X and ↓A,X restrict and extend reasonably over variables, whereas ⇔*A,X does not, because in Example 5.1.15 below, T ⇔*FOO,{x} F holds but T ⇔*FOO,∅ F fails. Nevertheless, it makes sense to let t ⇔*A t′ mean that there exists an X such that t ⇔*A,X t′, and we shall do so.

Example 5.1.15
Using the specification of Example 4.3.8 and letting A = FOO, then over the signature Σ({x}), where x has sort A, we have

T ↓ (foo(x) ∨ ¬foo(x)) ↓ foo(x) ↓ (foo(x) & ¬foo(x)) ↓ F,

which implies the valid equation (∀x) T = F, but not the invalid equation (∀∅) T = F. □

In fact, ⇔*A is complete, even though ⇔*A,X may not be when X is finite:

Theorem 5.1.16
Given a Σ-TRS A and t, t′ ∈ TΣ(X), then t ⇔*A t′ if and only if A ⊢ (∀X) t = t′.

Proof:
By Proposition 5.1.11, we need only prove the converse direction, so assume we have a proof of A ⊢ (∀X) t = t′. Any such proof necessarily starts with (1) and then chains forward using the other rules until an application of (2) occurs. Unless this chain gives t′ or is a dead end, its final term must also be the final term of another chain, in which case we have a join. Similarly, the whole proof is a set of joins, which can only be put together without dead ends if they form a sequence as in Proposition 5.1.14, which then gives t ⇔*A t′. □

One might think that Proposition 5.1.14 could give a decision procedure for ⇔*A, and hence for equality under A, since only term rewriting is involved, but this is not the case, because it can be difficult to find the appropriate ti. In fact, the problem is unsolvable, for reasons discussed in Section 5.10.

5.2 Canonical Form

When we compute, we usually hope to get a unique well-defined answer in the end. The following gives one necessary condition for this to occur for every term using a given set of rules; we also give a version that applies to all terms of a particular sort.
Definition 5.2.1  A Σ-TRS A is terminating (also called Noetherian) iff there is no infinite sequence t1, t2, t3, ... of Σ-terms such that t1 ⇒ t2 ⇒ t3 ⇒ ... . Similarly, A is ground terminating iff there is no such infinite sequence of ground terms, and A is terminating (or ground terminating) for sort s iff there is no such sequence (of ground terms), all of sort s. □

For example, if we add a commutative law for addition to the specification NATP+ of Example 5.1.5, then there are computations like the following that do not terminate:

0 + s 0 ⇒ s 0 + 0 ⇒ 0 + s 0 ⇒ ... .

If A is terminating, then every Σ-term has a normal form, but some Σ-terms may have more than one normal form. However, the following condition will guarantee the uniqueness of normal forms for terminating TRS's; again we also give a version for terms of a particular sort.

Definition 5.2.2  A Σ-TRS A is Church-Rosser (also called confluent) iff for every Σ-term t, whenever t ⇒* t1 and t ⇒* t2, there is some Σ-term t3 such that t1 ⇒* t3 and t2 ⇒* t3. Similarly, A is Church-Rosser for sort s iff this condition holds for all t of sort s. A Σ-TRS A is canonical (or sometimes convergent or complete) iff it is terminating and Church-Rosser, and is canonical for sort s iff it is terminating and Church-Rosser for sort s. In these cases, normal forms may also be called canonical forms.

A Σ-TRS A is ground Church-Rosser (also called ground confluent) iff for every ground Σ-term t, whenever t ⇒* t1 and t ⇒* t2, there is a ground Σ-term t3 such that t1 ⇒* t3 and t2 ⇒* t3. Similarly, A is ground canonical iff it is ground terminating and ground Church-Rosser, and is ground Church-Rosser for sort s iff the condition holds for all t of sort s. In these cases, normal forms may also be called ground canonical forms.

Similarly, a TRS A is locally Church-Rosser (or locally confluent) iff for every Σ-term t, whenever t ⇒ t1 and t ⇒ t2, there is a Σ-term t3 such that t1 ⇒* t3 and t2 ⇒* t3. Also, a TRS A is ground locally Church-Rosser iff the above condition holds for all ground terms t, and is locally Church-Rosser for sort s (or ground locally Church-Rosser for sort s) iff it holds for all (ground) terms t of sort s. □

The generalizations to a particular sort are needed because rewriting with a many-sorted TRS may well have different properties for different sorts.
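The gap between the Church-Rosser and locally Church-Rosser properties can be seen already in an abstract setting, with no term structure at all. The Python sketch below is an editorial illustration using Kleene's classic four-element example b ← a ⇄ c → d, which is locally confluent but not confluent; the dictionary encoding is an assumption of the sketch.

```python
# Kleene's example:  a => b,  a => c,  c => a,  c => d.
# Not terminating (a and c rewrite to each other), locally
# Church-Rosser, but not Church-Rosser: b and d are distinct
# normal forms of a.
REL = {"a": {"b", "c"}, "b": set(), "c": {"a", "d"}, "d": set()}

def reachable(x, rel):
    """All y such that x =>* y."""
    seen, todo = {x}, [x]
    while todo:
        for y in rel[todo.pop()]:
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def locally_church_rosser(rel):
    """Every one-step peak t1 <= t => t2 is joinable."""
    return all(bool(reachable(y1, rel) & reachable(y2, rel))
               for ys in rel.values() for y1 in ys for y2 in ys)

def church_rosser(rel):
    """Every many-step peak t1 <=* t =>* t2 is joinable."""
    return all(bool(reachable(y1, rel) & reachable(y2, rel))
               for x in rel
               for y1 in reachable(x, rel)
               for y2 in reachable(x, rel))
```

Note that Newman's Lemma (local confluence plus termination implies confluence) does not apply here, precisely because this system is not terminating.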
Figures 5.1(a) and 5.1(b) illustrate the Church-Rosser and local Church-Rosser properties graphically.*

[Figure 5.1: Church-Rosser Properties. (a) Church-Rosser; (b) Locally Church-Rosser.]

For example, our discussion of FOO from Example 4.3.8 shows that the resulting TRS is not Church-Rosser, because if we let t = foo(x) & ¬foo(x), then t ⇒* F and t ⇒* foo(x), each of which is irreducible; however, this system is terminating. The TRS of Example 5.1.8 is not terminating; it is also not Church-Rosser, because both c and d are normal forms of a. The following result is immediate from Definitions 5.2.1 and 5.2.2:

Fact 5.2.3  If (Σ, A) is Church-Rosser, then it is ground Church-Rosser; if it is terminating, then it is ground terminating; and if it is canonical, then it is ground canonical. □

However, a ground Church-Rosser TRS is not necessarily Church-Rosser, and a ground canonical TRS is not necessarily canonical, as the following shows:

*But the French school does not use the terms "Church-Rosser" and "confluent" synonymously, e.g., [109].
Example 5.2.4
Consider the following variant of the theory of monoids, in which the direction of the associative law has been reversed:

  th RMON is
    sort Elt .
    op e : -> Elt .
    op _*_ : Elt Elt -> Elt .
    vars X Y Z : Elt .
    eq X * e = X .
    eq (X * Y) * Z = X * (Y * Z) .
  endth
Viewed as a TRS, this has only one reduced ground term, namely e, so it is certainly ground Church-Rosser, ground locally Church-Rosser, ground terminating, and ground canonical. However, it is not Church-Rosser; for example, the term (X * e) * X rewrites to both X * X and X * (e * X), each of which is reduced. □
Exercise 5.2.1
Show that the following specification of the Peano natural numbers with addition gives a ground canonical TRS that is not Church-Rosser, and hence not canonical:

  obj RNATP+ is
    sort Nat .
    op 0 : -> Nat .
    op s_ : Nat -> Nat [prec 2] .
    op _+_ : Nat Nat -> Nat .
    vars X Y Z : Nat .
    eq 0 + X = X .
    eq (s X) + Y = s(X + Y) .
    eq X + (Y + Z) = (X + Y) + Z .
  endo
□
Exercise 5.2.2
Show that if a TRS A has a lapse rule of sort s with its rightside a ground term, then A is Church-Rosser for sort s. □

Exercise 5.2.3  Show that the TRS of Example 5.1.8 is locally Church-Rosser. □
We call the next result a "Theorem" and give its proof in detail, even though it is trivial, because it is such a fundamental result about term rewriting:
Theorem 5.2.5
If a Σ-TRS A is canonical, then every Σ-term t has a unique normal form, denoted [[t]]A, or just [[t]] if A is clear from context, and called the canonical form of t.

Proof: Each Σ-term t has at least one normal form, by the Noetherian property. Suppose that t1 and t2 are two normal forms for t. Then by the Church-Rosser property, because t ⇒* t1 and t ⇒* t2, there is a term t3 such that t1 ⇒* t3 and t2 ⇒* t3. But because t1 and t2 are both normal forms, we get that t1 = t3 and t2 = t3, and hence that t1 = t2. □

For example, it will follow from later results that the TRS of Example 5.1.5 is canonical, and that its ground normal forms all have the form s s ... 0, with zero or more s's. The results below bring out the very important consequence of Theorem 5.2.5 that every canonical TRS has a natural procedure for deciding the equality of ground terms; the proposition below is proved in Section 5.7 as a consequence of the more abstract Proposition 5.7.6 there.

Proposition 5.2.6  If A is a Church-Rosser TRS, then A ⊨ (∀X) t = t′ iff t ↓ t′. □

Corollary 5.2.7  If A is a canonical TRS, then A ⊨ (∀X) t = t′ iff [[t]]A = [[t′]]A, where the last equality is syntactical identity.

Proof: By Proposition 5.2.6, because t ↓ t′ iff [[t]]A = [[t′]]A when A is canonical. □

This says that when A is canonical, we can decide whether or not an equation (∀X) t = t′ is satisfied by all models of A just by comparing the canonical forms of its two sides. Equivalently, it says that we can decide whether or not terms t, t′ can be proved equal using the equations in A, just by checking whether or not their canonical forms are identical. OBJ provides a function that does exactly this: t == t′ computes the normal forms of t and t′, and then returns true if these terms are identical, and false otherwise.

For a simple illustration using the TRS of Example 5.1.5, if we show that the terms s s 0 + s s s 0 and s(s s 0 + s s 0) each reduce to the same thing (namely s s s s s 0), then the equation

  (∀∅) s s 0 + s s s 0 = s(s s 0 + s s 0)

holds for all models; this is very conveniently done by executing

  red s s 0 + s s s 0 == s(s s 0 + s s 0) .

However, this method cannot be used to prove either (∀x) x + s s s 0 = s s s 0 + x, or the more general equation (∀x, y) x + y = y + x, and, in fact, both of these are false in some models of NATP+.

The situation is as follows: Term rewriting over a canonical Σ-TRS gives a decision procedure for equality of Σ-terms with respect to loose semantics. But often we are really interested in the initial semantics of a specification, e.g., NATP+. In such a case, we can decide the equality of any two ground Σ-terms, i.e., of any two elements of the initial Σ-algebra, but we cannot decide the equality of two Σ-terms whose variables are restricted to range over ground terms only, i.e., over the initial algebra. For example, the commutative law is true for every pair x, y of ground terms over NATP+, and hence it is true for the initial algebra, but this cannot be proved just by reduction; it requires induction. In fact, the commutative law is not true for every choice of elements from every model of NATP+. Thus it is important to remember that this kind of decision procedure only decides equality for loose semantics.
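The decision procedure behind == can be sketched in a few lines of Python (an editorial illustration; the tuple encoding of Peano terms is an assumption, and OBJ itself is not involved). Ground terms are reduced to canonical form with the rules 0 + X → X and (s X) + Y → s(X + Y), and two terms are provably equal iff their canonical forms are identical.

```python
# Ground Peano terms as nested tuples:
#   ('0',)  |  ('s', t)  |  ('+', t, u)
ZERO = ('0',)

def s(t):
    return ('s', t)

def nf(t):
    """Canonical form under  0 + X -> X  and  (s X) + Y -> s(X + Y)."""
    if t[0] == '0':
        return t
    if t[0] == 's':
        return s(nf(t[1]))
    a, b = nf(t[1]), nf(t[2])   # reduce both arguments first
    while a[0] == 's':          # (s X) + Y  =>  s(X + Y)
        a, b = a[1], s(b)
    return b                    # 0 + Y  =>  Y

def eq(t, u):
    """The analogue of OBJ's ==: compare canonical forms."""
    return nf(t) == nf(u)
```

For instance, with two = s(s(ZERO)) and three = s(two), the call eq(('+', two, three), s(('+', two, two))) succeeds, mirroring the red ... == ... command above; but no amount of reduction of this kind can decide (∀x, y) x + y = y + x.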
Exercise 5.2.4
1. Use the Completeness Theorem to prove that there are models of NATP+ where the commutative law fails, without explicitly giving such a model.
2. Now give such a model. □
Clearly it is useful to know if a specification is canonical as a TRS. Sections 5.5 and 5.6 will give several useful tools for proving termination and confluence, respectively. However, it is important (and perhaps surprising) to note that for proving equality, it is not necessary for a TRS A to be Church-Rosser or even terminating: if t and t′ have equal normal forms under A, then the equation (∀X) t = t′ is provable from A, whether or not A is canonical; canonicity is only needed to guarantee that if [[t]]A ≠ [[t′]]A then the equation (∀X) t = t′ is not provable from A. This explains why OBJ does not require checking confluence or termination before it accepts code, and why we may use the notation [[t]]A to denote an arbitrary normal form of t even when A is not canonical. We have found that in practice, it is more irritating than useful to prove canonicity, although it would be desirable to have algorithms that could help with this on an optional basis. Experience also shows that OBJ specifications written for many application areas, including programming, are essentially always canonical.

OBJ also provides a function =/= that is the negation of ==. However, it can be dangerous to use when A is not Church-Rosser for the sort of the terms involved, because even if the terms are provably equal, OBJ could compute different normal forms for them. But the result will always be correct if A is canonical for the common sort of t, t′ and the subset of rules actually used in computing t =/= t′ (however, complications can arise for conditional rules, as discussed in Section 5.8).
Example 5.2.8 (Groups)  This OBJ code gives a canonical TRS for the theory of groups:

  th GROUPC is
    sort Elt .
    op _*_ : Elt Elt -> Elt .
    op e : -> Elt .
    op _-1 : Elt -> Elt [prec 2] .
    vars A B C : Elt .
    eq e * A = A .
    eq A -1 * A = e .
    eq A * e = A .
    eq e -1 = e .
    eq (A * B) * C = A * (B * C) .
    eq A -1 -1 = A .
    eq A * A -1 = e .
    eq A * (A -1 * B) = B .
    eq A -1 * (A * B) = B .
    eq (A * B) -1 = B -1 * A -1 .
  endth

(Termination is shown in Exercise 5.5.4, while the Church-Rosser property is shown in Exercise 5.6.7.)

For example, suppose we want to know whether or not the equation

  (∀w, x, y, z) ((w ∗ x) ∗ (y ∗ z))⁻¹ = ((z⁻¹ ∗ y⁻¹) ∗ x⁻¹) ∗ w⁻¹

is true in all groups. In OBJ, we can open the module GROUPC, introduce the variables, and then reduce the left and right sides to see if they have the same normal form:

  open .
    vars W X Y Z : Elt .
    red ((W * X)*(Y * Z))-1 .
    red ((Z -1 * Y -1)* X -1)* W -1 .
  close

We could also use the OBJ built-in operation == for this:

  open .
    vars W X Y Z : Elt .
    red ((W * X)*(Y * Z))-1 == ((Z -1 * Y -1)* X -1)* W -1 .
  close

This tells us whether or not the equation is true of all groups, but it does not tell us what the normal forms are, and that additional information is often useful when trying to build a proof.

Alternatively, using the Theorem of Constants, we can add new constants a, b, c, d, and then reduce the left and right sides to see if they have the same normal form. This is equivalent because variables are really just new constants. Here is how that looks in OBJ:

  open .
    ops a b c d : -> Elt .
    red ((a * b)*(c * d))-1 .
    red ((d -1 * c -1)* b -1)* a -1 .
  close
It is not necessary to use open and close for this example; we could instead define a new module which enriches GROUPC, and then do the reduction in that context, as follows:

  th GROUPC+ is
    inc GROUPC .
    ops a b c d : -> Elt .
  endth
  red ((a * b)*(c * d))-1 == ((d -1 * c -1)* b -1)* a -1 .

This uses a feature of OBJ that we have not yet discussed: the previously defined theory GROUPC is "imported" into the current theory by the declaration "including GROUPC", here abbreviated "inc GROUPC"; the effect is exactly the same as if the code in GROUPC were copied into GROUPC+.

New material introduced within an open...close pair is forgotten after the close. If you want it to be added to the module in focus and retained as part of it for future use, you should instead use the pair openr...close. If you introduced the module GROUPC+ after having done the above proof within an openr...close pair, then you would get parsing errors, because now there would be two copies each of a, b, c, d (you can see the problem by typing "show ."). In order to get around this, you could re-enter the theory GROUPC, which has the effect of restoring it to its original state; OBJ will then warn you that GROUPC is being "redefined," but this should not worry you, because that is exactly what you wanted to do. Another approach is to type

  select GROUPC .
  red ((a * b)*(c * d))-1 == ((d -1 * c -1)* b -1)* a -1 .

This returns focus to the original GROUPC module, which will have retained (one copy of each of) the constants a, b, c, d, provided you previously used openr...close; otherwise you will get a parse error. Thus we see that there is considerable flexibility in how OBJ can be used in proofs of this kind. □
Exercise 5.2.5
Experiment with the new features of OBJ introduced above, including "select", "openr...close" and "include". □

Exercise 5.2.6  Assuming canonicity of the TRS of Example 5.2.8, check whether or not the following equations are true of all groups:
1. (∀x, y, z) (x ∗ y)⁻¹ ∗ (x ∗ z) = y⁻¹ ∗ z .
2. (∀x, y) (x⁻¹ ∗ y⁻¹)⁻¹ = y ∗ x .
3. (∀x, y, z) ((x⁻¹ ∗ e) ∗ (y⁻¹ ∗ z)⁻¹)⁻¹ = ((y⁻¹ ∗ e) ∗ (z ∗ x))⁻¹ . □

Exercise 5.2.7  Given the following theory of monoids,

  th MONOID is
    sort Elt .
    op e : -> Elt .
    op _*_ : Elt Elt -> Elt .
    vars X Y Z : Elt .
    eq X * e = X .
    eq e * X = X .
    eq (X * Y) * Z = X * (Y * Z) .
  endth

check whether or not the following equations are true of all monoids, assuming canonicity of the rules in MONOID:
1. (∀x, y) (x ∗ e) ∗ y = y ∗ x .
2. (∀x, y, z, w) x ∗ (y ∗ (z ∗ w)) = ((x ∗ y) ∗ z) ∗ w .
3. (∀x, y, z) (x ∗ e) ∗ (y ∗ z) = (x ∗ y) ∗ (z ∗ e) .
(Exercise 5.6.7 shows that MONOID is a Church-Rosser TRS.) □
The following fundamental result tells us that for canonical specifications, the ground canonical forms give an initial algebra. This is another justification for the use of irreducible terms as the results of computations. The proof makes good use of the initiality of TΣ.

Theorem 5.2.9  If a specification P = (Σ, A) is a ground canonical TRS, then the canonical forms of ground terms under A form a P-algebra, called the canonical term algebra of P, denoted NP or NΣ,A or just NA, in the following way:
(0) interpret σ ∈ Σ[],s as [[σ]] in NP,s ; and
(1) interpret σ ∈ Σs1...sn,s with n > 0 as the function sending (t1, ..., tn) with ti ∈ NP,si to [[σ(t1, ..., tn)]] in NP,s .
Furthermore, if M is any P-algebra, there is one and only one Σ-homomorphism NP → M.

Proof: Since NP is a Σ-algebra by definition, we check that it satisfies A. Given (∀X) t = t′ in A and a : X → NP, we also get b : X → TΣ since NP ⊆ TΣ.* Moreover, a(t) = [[b(t)]] for any t ∈ TΣ(X), because [[b(_)]] is a Σ-homomorphism since [[_]] is, and there is a unique Σ-homomorphism TΣ(X) → NP that extends a. Applying the given rule to t with the substitution b gives b(t) ⇒A b(t′), and so these two terms have the same canonical form, i.e., [[b(t)]] = [[b(t′)]]. Therefore a(t) = a(t′) for every a, and so we are done.

Now let h : TΣ → M be the unique Σ-homomorphism. Noting that NP ⊆ TΣ, let us define g : NP → M to be the restriction of h to NP. We now check that g is a Σ-homomorphism, using structural induction on Σ:
(0) Given σ ∈ Σ[],s, we get g(σNP) = h([[σ]]) by definition. Now Proposition 5.1.16 gives us that h(σ) = h([[σ]]) because σ ⇒*A [[σ]]. Therefore g(σNP) = σM, as desired, because h(σ) = σM since h is a Σ-homomorphism.
(1) Given σ ∈ Σs1...sn,s with n > 0, by definition we get

  g(σNP(t1, ..., tn)) = h([[σ(t1, ..., tn)]]).

Then Proposition 5.1.16 gives us that

  h(σ(t1, ..., tn)) = h([[σ(t1, ..., tn)]]),

and the fact that h is a Σ-homomorphism gives us

  g(σNP(t1, ..., tn)) = σM(h(t1), ..., h(tn)) = σM(g(t1), ..., g(tn)),

as desired.
To show uniqueness, suppose g′ : NP → M is another Σ-homomorphism. Let r : TΣ → NP be the map that sends t to [[t]], and note that it is a Σ-homomorphism by the definition of [[_]]. Next, note that if i : NP → TΣ denotes the inclusion Σ-homomorphism, then i ; r = 1NP, the identity on NP. Finally, note that r ; g = r ; g′ = h, by the uniqueness of h. It now follows that i ; r ; g = i ; r ; g′, which implies g = g′. □

*The assignments a, b are different functions (in the sense of Appendix C) because they have different targets; even though a, b have the same values, their extensions to TΣ(X) have quite different values.

Exercise 5.2.8
Draw a commutative diagram that brings out the simple equational character of the above uniqueness argument. □
Exercise 5.2.9
Show that any terminating TRS is lapse free. □
5.3 Adding Constants

This section discusses the preservation of some basic properties of a TRS when new constants are added to its signature. This is important because it allows us to conclude that a TRS is terminating from a proof that it is ground terminating, and it can also justify using the Theorem of Constants in theorem proving. Although we may speak of adding "variables" to a signature, of course they are really constants. Recalling that (Σ, A) indicates that A is a Σ-TRS, if X is a suitable variable set, it is convenient to let A(X) denote the TRS (Σ(X), A). We begin with a simple but important result:

Proposition 5.3.1  If a TRS A is terminating, then so is A(X), for any signature of constants X for Σ. On the other hand, if A is Church-Rosser or locally Church-Rosser, then so is A(X). □

The intuition is that the variables in X are really just constants; however, the formal justifications in Propositions 5.3.4 and 5.3.1 use abstract rewrite systems, which have not yet been introduced at this point in the chapter. Although proofs of the Church-Rosser property generally apply to the non-ground case, so that one does not have to worry about added constants causing trouble, this is not so for termination, which is usually easier to prove for the ground case. By definition (or Fact 5.2.3), any terminating TRS is also ground terminating, but the converse does not hold, as shown by the following simple but important TRS (and also by Example 5.2.4):

Example 5.3.2
Let Σ have just one sort, one unary function symbol f, and one rule,

  f(Z) → f(f(Z)),

where Z is a variable. Because there are no ground terms, there are no ground rewrite sequences at all, and so this TRS is necessarily ground terminating. However, the term f(X) is the start of an infinite rewrite sequence, so it is not terminating; similarly, if we add a constant to the signature, the resulting TRS also fails to terminate. □

This motivates the following:
Definition 5.3.3
A signature Σ is non-void iff (TΣ)s ≠ ∅ for each sort s. □

That is, Σ is non-void iff each of its sorts is non-void in the sense of Section 4.7. A simple sufficient condition is that each sort has a constant; also, a signature cannot be non-void if it has no constants at all. The next result follows from a more abstract version, Proposition 5.8.10, on page 131:

Proposition 5.3.4  If Σ is non-void, then a TRS (Σ, A) is ground terminating iff (Σ(X), A) is ground terminating, where X is any signature of constants for Σ. Also, (Σ(X), A) is ground terminating if (Σ, A) is ground terminating. If Σ is non-void, then a TRS is ground terminating iff it is terminating. □

Therefore when Σ is non-void, we know that A(X) is terminating if we know that A is ground terminating; also, when Σ(X) is non-void, it is sufficient that A(X) is ground terminating. Although every Church-Rosser TRS is ground Church-Rosser, the converse is false, and the same holds for the local Church-Rosser property. Moreover, these converse implications fail even when the signature is non-void, as shown by the following:
Example 5.3.5
Let Σ have just one sort, with one constant a plus three unary function symbols f, g, h, and with the following rules:

  f(X) → g(X)    g(a) → a
  f(X) → h(X)    h(a) → a

Then every ground term has the same normal form, namely a, so this TRS is certainly ground Church-Rosser and also ground locally Church-Rosser; furthermore, it is terminating and ground terminating. However, it is neither Church-Rosser nor locally Church-Rosser. □

(Example 5.3.5 is also a counterexample to the conjecture that every ground locally Church-Rosser TRS is locally Church-Rosser; it also shows that even assuming termination does not help.) Nevertheless, we have the following:
Proposition 5.3.6
Given a sort set S, let X^ω_S be the signature of constants with (X^ω_S)s = { x^s_i | i ∈ ω } for each s ∈ S (that is, X^ω_S has a countable number of distinct variable symbols for each sort in S, different from those in Σ). Then a TRS (Σ, A) is Church-Rosser iff (Σ(X^ω_S), A) is ground Church-Rosser. Also, (Σ, A) is locally Church-Rosser iff (Σ(X^ω_S), A) is ground locally Church-Rosser. □

The proof is given in Section 5.7, as an application of ideas developed there. The following example and exercise show that the above result is optimal with respect to the number of additional constants.
Example 5.3.7
Let Σ have one sort and three binary function symbols f, g, h, with the rules:

  f(X, Y) → g(X, Y)    g(X, X) → X
  f(X, Y) → h(X, Y)    h(X, X) → X

This TRS is neither Church-Rosser nor locally Church-Rosser, but it is ground Church-Rosser and ground locally Church-Rosser. If we add one constant, the resulting TRS remains ground Church-Rosser and ground locally Church-Rosser, but if we add two constants, that TRS is neither ground Church-Rosser nor ground locally Church-Rosser. So it is not sufficient for A({a}) to have the ground property in order for A to have the non-ground property. □

Exercise 5.3.1
Generalize the above example to show that adding two constants will not suffice. Then show that there is no natural number n such that n additional constants will suffice. □

Exercise 5.3.2
Apply the results in this section to discuss the situation resulting from adding constants to the theory FOO of Example 4.3.8, and then using rewriting of ground terms to prove equations with variables. □
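Example 5.3.5 can also be checked mechanically. The Python sketch below is an editorial illustration (the tuple encoding, and the treatment of a variable x as a fresh constant, are assumptions of the sketch): it computes all normal forms reachable from a term, confirming that every ground term reduces only to a, while f(x) has the two distinct normal forms g(x) and h(x).

```python
# Terms over Example 5.3.5's signature as nested tuples; the
# extra constant ('x',) plays the role of a variable, since a
# variable is really just a new constant.
A = ('a',)
X = ('x',)

def steps(t):
    """One-step rewrites under f(X) -> g(X), f(X) -> h(X),
    g(a) -> a, h(a) -> a, including rewrites of subterms."""
    out = set()
    if t[0] == 'f':
        out |= {('g', t[1]), ('h', t[1])}
    if t == ('g', A) or t == ('h', A):
        out.add(A)
    if len(t) == 2:             # congruence: rewrite inside
        out |= {(t[0], u) for u in steps(t[1])}
    return out

def normal_forms(t):
    """All normal forms reachable from t (the TRS terminates,
    so this search is finite)."""
    seen, todo, nfs = {t}, [t], set()
    while todo:
        u = todo.pop()
        nxt = steps(u)
        if not nxt:
            nfs.add(u)
        for v in nxt - seen:
            seen.add(v)
            todo.append(v)
    return nfs
```

One fresh constant suffices to expose the failure of confluence here; as Example 5.3.7 and Exercise 5.3.1 show, in general no fixed number of extra constants is enough, which is why Proposition 5.3.6 uses countably many.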
5.4 Evaluation Strategies

In general, a large tree will have many different sites where rewrite rules might apply, and the choice of which rules to try at which sites can strongly affect both efficiency and termination. Most modern functional programming languages have a uniform lazy (i.e., top-down, or outermost, or call-by-name) semantics. But because raw lazy evaluation is slow, lazy evaluation enthusiasts have built clever compilers that figure out when an "eager" (i.e., bottom-up, or innermost, or call-by-value) evaluation can be used with exactly the same result; this is called "strictness analysis" (for example, see [141, 110]).

OBJ is much more flexible, because each operator can be given its own evaluation strategy. Syntactically, a local strategy, also called an E-strategy (E is for "evaluation"), is a sequence of integers in parentheses, given as an operator attribute following the keyword strategy, or just strat for short. For example, OBJ's built-in conditional operator has the following declaration

  op if_then_else_fi : Bool Int Int -> Int [strat (1 0)] .

which says its local strategy is to evaluate its first argument until it is reduced, and then apply rules at the top (indicated by "0"). Similarly,

  op _+_ : Int Int -> Int [strat (1 2 0)] .

indicates that _+_ on Int has strategy (1 2 0), which evaluates both arguments before attempting to add them.

Moreover, the flexibility of local evaluation strategies requires minimum effort, because OBJ determines a default strategy for each operator if none is explicitly given. This default strategy is computed very quickly, because only a very simple form of strictness analysis is done, and it is surprisingly effective, though of course it does not fit all possible needs. In OBJ3, the default local strategy for a given operator is determined from its equations by requiring that all argument places that contain a non-variable term in some rule are evaluated before equations are applied at the top. If an operator with a user-supplied local strategy has a tail recursive rule (in the weak sense that the top operator occurs in its rightside), then it may apply an optimization that repeatedly applies that rule, and thus violates the strategy. In those rare cases where it is desirable to prevent this optimization from being applied, you can just give an explicit local strategy that does not have an initial 0.

There are actually two ways to get lazy evaluation. The simplest is to omit a given argument number from the strategy; then that argument is not evaluated unless some rewrite exposes it from underneath the given operator. For example, taking this approach to "lazy cons" gives

  op cons : Sexp Sexp -> Sexp [strat (0)] .
The second approach involves giving a negative number -j in a strategy, which indicates that the jth argument is to be evaluated "on demand," where a "demand" is an attempt to match a pattern to the term that occurs in the jth argument position. Under this approach, lazy cons has the declaration

  op cons : Sexp Sexp -> Sexp [strat (-1 -2)] .

Then a reduce command at the top level of OBJ is interpreted as a top-level demand that may force the evaluation of certain arguments. This second approach cannot be applied to operators with an associative or commutative attribute.

A local strategy is called non-lazy if it requires that all arguments of its operator are reduced in some order, and either the operator has no rules, or the strategy ends with a final "0". In general, for the results of a reduction command to actually be fully reduced, it is necessary that all local strategies be non-lazy. All of the default local strategies computed by the system are non-lazy.

Giving an operator the memo attribute causes the results of evaluating a term headed by this operator to be saved, and then used if this term needs to be reduced later in the same context [139]. In OBJ3, users can give any operator the memo attribute, and memoization is implemented efficiently with hash tables. More precisely, given a memoized operator symbol f and given a term f(t1, ..., tn) to be reduced (possibly as part of some larger term), a table entry for f(t1, ..., tn) giving its fully reduced value is added to the memo table, and entries giving this fully reduced value are also added for each term f(r1, ..., rn) that, according to the evaluation strategy for f, could arise while reducing f(t1, ..., tn) just before a rule for f is applied at the top. This is necessary because at that moment the function symbol f could disappear.
In some cases, memoizing these intermediate reductions is more valuable than memoizing just the original expression. For example, if f has the strategy (2 3 0 1 0), let r be the reduced form of the term f(t1,t2,t3,t4), and let ri be the reduced form of ti for i = 1, 2, 3. Then the memo table will contain the following pairs:

  (f(t1,t2,t3,t4), r)
  (f(t1,r2,r3,t4), r)
  (f(r1,r2,r3,t4), r)
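The core idea of the memo attribute, a table from terms to their fully reduced values, can be sketched in Python (an editorial illustration; the tuple encoding and the restriction to NATP+'s two rules are assumptions, and OBJ3's actual mechanism also records the intermediate instances described above). Because terms are hashable tuples, a plain dictionary serves as the memo table.

```python
# Memo table from ground Peano terms (nested tuples) to their
# canonical forms under  0 + X -> X  and  (s X) + Y -> s(X + Y).
memo = {}

def nf(t):
    """Canonical form of t, with memoization: every subterm
    reduced along the way gets a table entry."""
    if t in memo:
        return memo[t]
    if t[0] == '0':
        r = t
    elif t[0] == 's':
        r = ('s', nf(t[1]))
    else:
        a, b = nf(t[1]), nf(t[2])   # reduce both arguments
        while a[0] == 's':          # (s X) + Y  =>  s(X + Y)
            a, b = a[1], ('s', b)
        r = b                       # 0 + Y  =>  Y
    memo[t] = r
    return r
```

After reducing (1 + 1) + (1 + 1) once, the shared subterm 1 + 1 is found in the table and is not reduced again; memo.clear() plays the role of the do clear memo command discussed below.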
Memoization gives the effect of structure sharing for common subterms, and this can greatly reduce term storage requirements in some problems. Whether or not the memo tables are re-initialized before each reduction can be controlled with the top-level commands

  set clear memo on .
  set clear memo off .

The default is that the tables are not reinitialized. However, they can be reinitialized at any time with the command

  do clear memo .

Of course, none of this has any effect on the result of a reduction, but only on its speed. A possible exception to this is the case where the definitions of operators appearing in the memo table have been altered. (When rules are added to an open module, previous computations may become obsolete. Therefore, you may need to explicitly give the command "do clear memo .") Memoization is an area where term-rewriting-based systems seem to have an advantage over unification-based systems like Prolog.
It is known that (ground) termination is undecidable; that is, there is no algorithm which, given a TRS, can decide whether or not it is terminating. Nonetheless one can often prove termination by assigning a "weight" ρ(t) to each term t, i.e., by giving a function ρ : T_Σ → ω, such that ρ(t) > ρ(t′) whenever t ⇒ t′. Because there are no infinite strictly decreasing sequences of natural numbers, it follows that if such a function exists, then the TRS is terminating. The converse also holds under a rather mild assumption.

Proposition 5.5.1  A Σ-TRS A is ground terminating if there is a function ρ : T_Σ → ω such that for all ground Σ-terms t, t′, if t ⇒_A t′ then ρ(t) > ρ(t′). Moreover, the converse holds when A is globally finite, in the sense that for each term, there are only a finite number of rewrite sequences that begin with it.

Proof: If t0 ⇒ t1 ⇒ t2 ⇒ ··· ⇒ tn, then ρ(t0) > ρ(t1) > ρ(t2) > ··· > ρ(tn), that is, a strictly decreasing sequence of n + 1 natural numbers. Hence, because there are no infinite strictly decreasing natural number sequences, there cannot be infinite proper rewrite sequences; therefore A is terminating.

For the converse, assume A is terminating and let ρ(t) be the maximum of the lengths of all rewrite sequences that reduce t to a normal form; there are only a finite number of these, because of global finiteness. Then t ⇒_A t′ implies ρ(t) ≥ 1 + ρ(t′). ∎

The converse direction of this result is of mainly theoretical interest, since it can be difficult to prove global finiteness without knowing termination. But note that a terminating TRS is globally finite if it has a finite rule set.
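Proposition 5.5.1 can be seen in action mechanically. The following Python sketch is our own illustration (the term encoding and the particular weight are our choices, not from the text): for terms over 0, s, + with the rules N + 0 → N and N + s M → s(N + M), it checks that the weight ρ(0) = 2, ρ(s t) = 1 + ρ(t), ρ(t + t′) = 1 + ρ(t)ρ(t′) strictly decreases under every one-step rewrite of some sample terms.

```python
# Check that a weight function rho strictly decreases on every one-step
# rewrite.  Terms: ("0",), ("s", t), ("+", t1, t2).  Illustrative sketch.

def rho(t):
    if t[0] == "0":
        return 2
    if t[0] == "s":
        return 1 + rho(t[1])
    return 1 + rho(t[1]) * rho(t[2])       # t = t1 + t2

def rewrites(t):
    """Yield all terms reachable from t by one rewrite, at any position."""
    if t[0] == "+":
        m, n = t[1], t[2]
        if n == ("0",):
            yield m                        # N + 0 -> N
        if n[0] == "s":
            yield ("s", ("+", m, n[1]))    # N + s M -> s(N + M)
        for m2 in rewrites(m):
            yield ("+", m2, n)
        for n2 in rewrites(n):
            yield ("+", m, n2)
    elif t[0] == "s":
        for u in rewrites(t[1]):
            yield ("s", u)

zero = ("0",)
one = ("s", zero)
samples = [("+", one, zero), ("+", one, one),
           ("+", ("+", one, one), ("s", one)), ("s", ("+", zero, one))]

decreasing = all(rho(t) > rho(t2) for t in samples for t2 in rewrites(t))
```

Of course such a finite check is evidence rather than proof; the proof requires the symbolic inequalities discussed below.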
Example 5.5.2
Here is a simple TRS showing that the converse of Proposition 5.5.1 does not hold without the additional assumption of global finiteness. The signature Σ has just one sort, say s, with Σ_[],s = ω and Σ_w,s = ∅ for w ≠ []; therefore T_Σ = ω. The rule set A is

  0 → n   for each n > 0
  n → n − 1   for each n > 1.

Note that A is infinite here. (I thank Prof. Yoshihito Toyama for providing this example. In the many-sorted case, the target of ρ is the S-sorted set with each component ω; we will see later that it is sometimes convenient to replace ω by certain other ordered sets.)
Then there is a rewrite sequence 0 ⇒ n ⇒ n − 1 ⇒ ··· ⇒ 1 for every n > 0. Now suppose there is a function ρ : T_Σ → ω such that t ⇒ t′ implies ρ(t) > ρ(t′), and let ρ(0) = K. Because there is a rewrite sequence of length K + 1 beginning 0 ⇒ K + 1, we get that K = ρ(0) > ρ(K + 1) > ρ(K) > ρ(K − 1) > ··· > ρ(1), which is impossible, since a strictly decreasing sequence of K + 2 natural numbers cannot start at K. ∎

There are two difficulties with using Proposition 5.5.1: (1) it can be hard to find an appropriate function ρ; and (2) it can be hard to prove the required inequalities. We discuss the first difficulty a little later; regarding the second, it is natural to reduce it by using the structure of terms, as in the following definition and result:

Definition 5.5.3
Given a poset P and ρ : T_Σ → P, a Σ-rewrite rule r : t → t′ of sort s is strict ρ-monotone iff ρ(θ(t)) > ρ(θ(t′)) for each applicable ground substitution θ. An operation symbol σ ∈ Σ is strict ρ-monotone iff ρ(t) > ρ(t′) implies ρ(t0(z ← t)) > ρ(t0(z ← t′)) for each t, t′ ∈ T_Σ and any t0 ∈ T_Σ({z}s) of the form σ(t1, ..., tn) where each ti except one is ground, and that one is just z. Σ-substitution is strict ρ-monotone iff ρ(t) > ρ(t′) implies ρ(t0(z ← t)) > ρ(t0(z ← t′)) for any t, t′ ∈ T_Σ and t0 ∈ T_Σ({z}s) having a single occurrence of z. In any of the above, we speak of weak ρ-monotonicity if > is replaced by ≥. ∎

The first condition says that every application of the rule is weight decreasing; the second says that replacing one argument of an operation symbol by a term of smaller weight decreases the weight of the result; and the third says the same for an arbitrary context with a single occurrence of the replaced subterm.
Proposition 5.5.4
Given a Σ-TRS A, if there is a function ρ : T_Σ → ω such that each rule in A is strict ρ-monotone, and Σ-substitution is strict ρ-monotone, then A is ground terminating.

Proof: If t1 ⇒ t2 then the two assumptions imply that ρ(t1) > ρ(t2), because t1 = t0(z ← θ(t)) and t2 = t0(z ← θ(t′)) for some rule t → t′, applicable substitution θ, and context t0 with a single occurrence of z. Therefore A is ground terminating by Proposition 5.5.1. ∎

Monotonicity for single operation symbols is a special case of monotonicity of Σ-substitution which is nevertheless sufficient to imply it; this fact can greatly simplify many termination proofs. The proof of the following result is given in Appendix B:
Proposition 5.5.5
Given a Σ-TRS A and a function ρ : T_Σ → ω, Σ-substitution is strict ρ-monotone if every operation symbol in Σ is strict ρ-monotone; the same holds for weak ρ-monotonicity. ∎

We can now directly combine Propositions 5.5.4 and 5.5.5 to get the following useful result:
Proposition 5.5.6
Given a Σ-TRS A, if (1) each rule in A is strict ρ-monotone, and (2) each σ ∈ Σ is strict ρ-monotone, then A is ground terminating. ∎

Note that Section 5.3 gives easy-to-check conditions for a ground terminating TRS to be terminating, so it is not a problem that the above results, and others that follow, only show ground termination.

It is very natural to reduce the tedium of defining an appropriate function ρ by using initial algebra semantics, that is, by giving ω a Σ-algebra structure and letting ρ be the unique Σ-homomorphism. Then to prove that A is terminating, we can prove that each rule is weight decreasing, and that each operation σ on ω is strict monotone in each argument. The two hypotheses can be stated using variables that range over terms of the appropriate sorts, and the resulting inequalities can be proved by rewriting, as illustrated by examples in this section and Section 5.8.2. A subtle point is that if we know that ρ(t) ≥ 1 for all terms t, then we can assume that all variables are ≥ 1 in proving the inequalities over ω that are induced by the two hypotheses, as illustrated in the following examples.

Example 5.5.7
Consider the TRS for Boolean conjunction that corresponds to the following OBJ specification:

  obj AND is sort Bool .
    ops tt ff : -> Bool .
    op _&_ : Bool Bool -> Bool .
    var X : Bool .
    eq X & tt = X .
    eq tt & X = X .
    eq X & ff = ff .
    eq ff & X = ff .
  endo
Let Σ denote its signature, and give ω the structure of a Σ-algebra by defining ω_tt = ω_ff = 1 and ω_&(m, n) = m + n. Then by Proposition 5.5.6, it suffices to prove the following, where ρ is the unique Σ-homomorphism T_Σ → ω:

  ρ(x & tt) > ρ(x)
  ρ(tt & x) > ρ(x)
  ρ(x & ff) > ρ(ff)
  ρ(ff & x) > ρ(ff)

for all x ∈ T_Σ; and

  ρ(t) > ρ(t′) implies ρ(t1 & t) > ρ(t1 & t′)
  ρ(t) > ρ(t′) implies ρ(t & t2) > ρ(t′ & t2)

for all ground Σ-terms t, t′, t1, t2. (The first four inequalities come from (1) and the next two from (2).) Each of these six assertions can be proved straightforwardly, using ρ(t) ≥ 1. For example, the first and third amount to ρ(x) + 1 > ρ(x) and ρ(x) + 1 > 1, and the last two to: i > j implies k + i > k + j. All the other cases are similar, and so this TRS is ground terminating. Therefore it is terminating by Proposition 5.3.4, and since Exercise 5.6.7 shows it is Church-Rosser, it is also canonical. ∎
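The six assertions of Example 5.5.7 can also be spot-checked numerically. This Python sketch is ours, for illustration: it evaluates ρ over all ground conjunction terms up to a small depth and verifies that every rule application, at any position, strictly decreases ρ.

```python
# Numeric spot-check of the interpretation for AND:
# rho(tt) = rho(ff) = 1 and rho(a & b) = rho(a) + rho(b).
# Terms: "tt", "ff", or ("&", a, b).  Illustrative sketch only.

def rho(t):
    return 1 if t in ("tt", "ff") else rho(t[1]) + rho(t[2])

def rewrites(t):
    """One-step rewrites with the four AND rules, at any position."""
    if isinstance(t, tuple):
        a, b = t[1], t[2]
        if b == "tt": yield a            # X & tt -> X
        if a == "tt": yield b            # tt & X -> X
        if b == "ff": yield "ff"         # X & ff -> ff
        if a == "ff": yield "ff"         # ff & X -> ff
        for a2 in rewrites(a): yield ("&", a2, b)
        for b2 in rewrites(b): yield ("&", a, b2)

def terms(depth):
    """All ground terms of nesting depth at most depth."""
    if depth == 0:
        return ["tt", "ff"]
    smaller = terms(depth - 1)
    return smaller + [("&", a, b) for a in smaller for b in smaller]

decreasing = all(rho(t) > rho(u) for t in terms(2) for u in rewrites(t))
```

Again this is evidence, not proof; the proof is the symbolic argument in the example.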
Exercise 5.5.1
Fill in the missing cases and details in Example 5.5.7. (cid:2)
Exercise 5.5.2
Show that the TRS of Example 5.1.5 is terminating by applying Proposition 5.5.6 with ρ the unique homomorphism into ω satisfying the following:

  ρ(0) = 2
  ρ(s t) = 1 + ρ(t)
  ρ(t + t′) = 1 + ρ(t)ρ(t′) . ∎

Exercise 5.5.3
Show that the TRS of Example 5.1.7 is terminating. (cid:2)
Exercise 5.5.4
Show that the group TRS of Example 5.2.8 is terminating by applying Proposition 5.5.6 with ρ the unique homomorphism into ω satisfying the following:

  ρ(e) = 2
  ρ(t -1) = 2^ρ(t)
  ρ(t ∗ t′) = ρ(t)² ρ(t′) .

You may also enjoy mechanizing the proofs using OBJ, in the manner illustrated in Exercise 5.5.5 below. ∎
Termination proofs in the literature tend to use polynomials, but theinitial algebra point of view makes it evident that any monotone func-tion at all can be used, e.g., exponentials, as in the function for group inverse in the above exercise. Moreover, Proposition 5.5.6 generalizesto partially ordered sets other than ω , provided they are Noetherian , inthe sense that they have no infinite sequence of strictly decreasing ele-ments; this is discussed in detail in Section 5.8.3 below. OBJ can oftenbe used to do termination proofs based on Proposition 5.5.6, becauseas we have seen, these boil down to proving a set of inequalities, whichare often rather fun in themselves. We illustrate this in the following:
Exercise 5.5.5
The purpose of this exercise is to show termination for the following specification of the function half(n), which computes the largest natural number k such that 2k ≤ n:

  obj NATPH is sort Nat .
    op 0 : -> Nat .
    op s_ : Nat -> Nat [prec 2] .
    op _+_ : Nat Nat -> Nat .
    op half : Nat -> Nat .
    var N M : Nat .
    eq N + 0 = N .
    eq N + s M = s(N + M) .
    eq half(0) = 0 .
    eq half(s 0) = 0 .
    eq half(s s M) = s half(M) .
  endo

The first step is to define an appropriate function ρ : T_Σ → ω, where Σ is the signature of NATPH.

1. Give a fourth equation which, when added to the three below, uniquely defines a function ρ, which should furthermore satisfy the properties in items 2–5 below:

     ρ(0) = 2
     ρ(s(t)) = 1 + ρ(t)
     ρ(t + t′) = 1 + ρ(t)ρ(t′) .

   Explain why this uniquely defines ρ.

The following should first be proved by hand, and then proved using OBJ proof scores based on the object NATP+*> given below:

2. ρ(t) > ρ(t′) implies ρ(s(t)) > ρ(s(t′)) for every t, t′ ∈ T_Σ.

3. ρ(t) > ρ(t′) implies ρ(t + t″) > ρ(t′ + t″) for every t, t′, t″ ∈ T_Σ.

4. ρ(t) > ρ(t′) implies ρ(half(t)) > ρ(half(t′)).

5. ρ(θ(t)) > ρ(θ(t′)) for each rule t → t′ in NATPH and each substitution θ.

Hints:
You may assume that adding the equations

  eq L + M > L + N = M > N .
  cq L * M > L * N = M > N if L > 0 .
  cq s M > N = M > N if M > N .
  cq L + M > L = true if M > 0 .
  cq M > 0 = true if M > s 0 .

has already been justified by earlier OBJ proofs; you may also need some other similar lemmas. (We will later see how to prove such results by induction.) You may need to use the fact that ρ(t) > 1.

  obj NATP+*> is sort Nat .
    ops 0 1 2 : -> Nat .
    op s_ : Nat -> Nat [prec 1] .
    eq 1 = s 0 .
    eq 2 = s 1 .
    vars L M N : Nat .
    op _+_ : Nat Nat -> Nat [assoc comm prec 3] .
    eq M + 0 = M .
    eq M + s N = s(M + N) .
    op _*_ : Nat Nat -> Nat [assoc comm prec 2] .
    eq M * 0 = 0 .
    eq M * s N = M * N + M .
    eq L * (M + N) = L * M + L * N .
    op _>_ : Nat Nat -> Bool .
    eq M > M = false .
    eq s M > 0 = true .
    eq s M > M = true .
    eq 0 > M = false .
    eq s M > s N = M > N .
  endo
6. Explain why the above results prove that
NATPH is terminating. ∎
Exercise 5.5.6
Give mechanical proofs for the other termination examples inthis section. (cid:2)
Like termination, the Church-Rosser property is undecidable for term rewriting systems. But once again, there are useful techniques for many special cases. We start with the following important result, the clever proof of which is due to Barendregt [3]; this proof is easier to understand when done dynamically on a whiteboard or applet than statically on paper.
Proposition 5.6.1 (Newman Lemma)  If a TRS A is terminating, then it is Church-Rosser if and only if it is locally Church-Rosser.
Proof:
The "only if" direction is trivial. For the converse, let us call a term t ambiguous iff it has (at least) two distinct normal forms. If we can show that when t is ambiguous, there is some ambiguous t′ such that t ⇒ t′, it then follows that if there are any ambiguous terms, then the system is non-terminating. Hence by contradiction, the system cannot have any ambiguous terms. The Church-Rosser property follows from this.

We now prove the auxiliary claim. Assume that t is ambiguous, and let t1 and t2 be two distinct normal forms for t, where t ⇒ t1′ ∗⇒ t1 and t ⇒ t2′ ∗⇒ t2. If t1′ = t2′, let t′ = t1′. If t1′ ≠ t2′, apply local confluence to get t″ such that t1′ ∗⇒ t″ and t2′ ∗⇒ t″, and let t3 be a normal form for t″. Then t3 is also a normal form for t, so that t3 ≠ t1 or t3 ≠ t2. If t3 ≠ t1 we let t′ = t1′, and if t3 ≠ t2 we let t′ = t2′. See Figure 5.2.

Figure 5.2: Barendregt's Proof of the Newman Lemma
∎

Notice that although the TRS of Example 5.1.8 is locally Church-Rosser by Exercise 5.2.3, it is neither terminating nor Church-Rosser, nor does it have unique normal forms for all its terms, by Exercise 5.1.3. This shows that confluence and local confluence are not equivalent concepts.

Theorem 5.6.9 below gives a method for showing the local Church-Rosser property, and then Corollary 5.6.10 applies the Newman Lemma to get a method for showing the Church-Rosser property, and hence the canonicity, of a terminating TRS. Chapter 12 will show how to make this method into a general algorithm.
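The need for termination in the Newman Lemma can be checked mechanically on a small abstract example (anticipating the abstract view of rewriting developed later in this chapter). The following Python sketch is ours: it encodes the classical four-element relation a → b, a → c, c → a, c → d, and verifies that it is locally Church-Rosser but not Church-Rosser; note that it is not terminating, because a → c → a.

```python
# A finite relation that is locally confluent but not confluent:
# a -> b, a -> c, c -> a, c -> d  (non-terminating: a -> c -> a).

succ = {"a": {"b", "c"}, "b": set(), "c": {"a", "d"}, "d": set()}

def reach(x):
    """Everything reachable from x, including x (reflexive-transitive closure)."""
    seen, todo = {x}, [x]
    while todo:
        for y in succ[todo.pop()]:
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def joinable(x, y):
    return bool(reach(x) & reach(y))

locally_confluent = all(joinable(y, z)
                        for x in succ
                        for y in succ[x] for z in succ[x])

confluent = all(joinable(y, z)
                for x in succ
                for y in reach(x) for z in reach(x))
```

Here b and d are distinct normal forms of a, so confluence fails even though every one-step divergence can be joined.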
Definition 5.6.2
A TRS is left linear iff no rule has more than one instance of the same variable in its leftside, and is right linear iff no rule has more than one instance of the same variable in its rightside.

Two rules with leftsides t, t′ overlap iff there are substitutions θ, θ′ such that θ(t0) = θ′(t′) with t0 a subterm of t that is not just a variable, i.e., with t = t1(z ← t0) where t1 has just one occurrence of the new variable z and t0 is not a variable. If the two rules are actually the same, it is additionally required that t1 ≠ z, and the rule is called self-overlapping. A TRS is overlapping iff it has two rules (possibly the same) that overlap, and then the term θ(t0) = θ′(t′) is called an overlap of t, t′; otherwise the TRS is non-overlapping.

A TRS is orthogonal iff it is left linear and non-overlapping. ∎

Example 5.6.3
The idempotent rule, of the form B + B = B, is not left linear, but the associative and commutative rules are left linear.

We now show that the associative and commutative rules overlap. Let their leftsides be t = (A + B) + C and t′ = A + B, respectively. Let t0 be the subterm (A + B) of t; then t0 is not a variable, and t = t1(z ← t0) with t1 = z + C. Define substitutions θ, θ′ as follows:

  θ(A) = a    θ′(A) = a
  θ(B) = b    θ′(B) = b
  θ(C) = C .
Then θ(t0) = θ′(t′) = a + b.

The associative rule is self-overlapping. As before, let its leftside be t = (A + B) + C. Let t0 be the subterm (A + B) of t; then t1 = z + C ≠ z, as required for self-overlap. Now define the substitutions θ and θ′ by:

  θ(A) = a + b    θ′(A) = a
  θ(B) = c        θ′(B) = b
  θ(C) = C        θ′(C) = c .

Then θ(t0) = θ′(t′) = (a + b) + c. ∎

Exercise 5.6.1
Prove that the commutative rule is not self-overlapping. (cid:2)
The rather complex proof of the following theorem is given in Appendix B:
Theorem 5.6.4 (Orthogonality)
A TRS is Church-Rosser if it is lapse free and orthogonal. ∎
Exercise 5.6.2
Show that a lapse rule overlaps with any non-lapse rule. Give a TRS showing that the lapse free hypothesis is needed in Theorem 5.6.4. ∎
Example 5.6.5
The TRS of the object NATP+ of Example 5.1.5 is lapse free and orthogonal, and therefore Church-Rosser. To prove this, we check for overlap of each rule with itself and with the other rule; this gives 4 cases, and the reader can verify that each one fails because of incompatible function symbols. Combining this with Exercise 5.5.2, it follows that NATP+ is canonical. ∎
Exercise 5.6.3
1. Show that the TRS AND of Example 5.5.7 is not orthogonal.

2. Show that the TRS's MONOID of Exercise 5.2.7 and GROUPC of Example 5.2.8 are not orthogonal. ∎
Chapter 12 shows that proofs of canonicity by orthogonality can be fully mechanized, by using an algorithm that checks if a given pair of rules is overlapping, and noting that it is trivial to check left linearity.
Example 5.6.6 (Combinatory logic)  The motivation for this classical logic is similar to that for the lambda calculus, namely to axiomatize a theory of functions, in this case a certain collection of higher-order functions called combinators. Here we give it as an equational theory.
The basic operation is to apply one combinator to another. The traditional notation for this (which might seem a bit confusing at first) is simple juxtaposition, i.e., the syntactic form denoted __ in OBJ. For example, A B means apply A to B; this might be more explicitly written something like App(A,B), or A . B. This calculus has just one sort, which is denoted T in the OBJ code below; it is the type of functions. Thus, particular combinators will be constants of sort T, even though they represent functions.

The attribute gather (E e) of the operation __ makes it parse left associatively ([90] gives a detailed explanation of how these "gathering patterns" work in OBJ3). For example, A B C would be more explicitly written as App(App(A,B),C), or (A . B) . C. Finally, the let construction used below is just a convenient shorthand for first declaring a constant and then letting it equal a given term; OBJ3 computes the sort for the constant by parsing the term.

  obj COMBL is sort T .
    op __ : T T -> T [gather (E e)] .
    ops S K I : -> T .
    vars L M N : T .
    eq K M N = M .
    eq I M = M .
    eq S M N L = (M L)(N L) .
  endo
  open .
    ops m n p : -> T .
    red S K K m == I m .
    red S K S m == I m .
    red S I I I m == I m .
    red K m n == S(S(K S)(S(K K)K))(K(S K K)) m n .
    red S m n p == S(S(K S)(S(K(S(K S)))(S(K(S(K K)))S)))(K(K(S K K))) m n p .
    red S(K K) m n p == S(S(K S)(S(K K)(S(K S)K)))(K K) m n p .
    let X = S I .
    red X X X X m == X(X(X X)) m .
  close
The last reduction takes 27 rewrites, which is more than one would like to do by hand. ∎
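Combinator reduction is also easy to experiment with outside OBJ. This Python sketch is our own illustration (names and representation are ours): it implements the three rules with a leftmost-outermost strategy and a step bound, since terms such as ω ω do not terminate.

```python
# Combinator reduction sketch.  An application M N is the pair (M, N);
# S, K, I and other combinators are strings.  Rules:
#   I x -> x,  K x y -> x,  S x y z -> (x z)(y z).

def step(t):
    """One leftmost-outermost rewrite of t, or None if t is a normal form."""
    if not isinstance(t, tuple):
        return None
    f, a = t
    if f == "I":
        return a                                   # I x -> x
    if isinstance(f, tuple):
        g, b = f
        if g == "K":
            return b                               # K x y -> x
        if isinstance(g, tuple) and g[0] == "S":
            return ((g[1], a), (b, a))             # S x y z -> (x z)(y z)
    f2 = step(f)                                   # otherwise rewrite a subterm
    if f2 is not None:
        return (f2, a)
    a2 = step(a)
    if a2 is not None:
        return (f, a2)
    return None

def nf(t, limit=1000):
    """Reduce to normal form, giving up after limit steps (COMBL can loop)."""
    for _ in range(limit):
        t2 = step(t)
        if t2 is None:
            return t
        t = t2
    raise RuntimeError("no normal form within the step limit")

def app(*ts):
    """Left-associated application, e.g. app('S','K','K','m') = ((S K) K) m."""
    t = ts[0]
    for u in ts[1:]:
        t = (t, u)
    return t
```

For example, nf(app("S", "K", "K", "m")) yields the atom m, matching the first reduction in the example.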
Exercise 5.6.4
The following refer to Example 5.6.6, and if possible should be done with OBJ.

1. Define B = S(K S)K. Then show that B x y z = x(y z), and hence that B x y is the composition of functions x and y.

2. Define C by C x y z = x z y and prove that S I (K K) = C I K.

3. Define ω = S I I and show that ω x = x x, so that ω ω ∗⇒ ω ω, which implies that the TRS COMBL is non-terminating. ∎
Exercise 5.6.5
Show that the TRS COMBL of Example 5.6.6 is orthogonal and so Church-Rosser. ∎
Exercise 5.6.6
Show that the following TRS’s are orthogonal, and since alreadyknown to be terminating, therefore canonical:1.
NATPH of Exercise 5.5.5; and2.
ANDNOT of Example 5.1.7. (cid:2)
Unfortunately, many important Church-Rosser TRS's are not orthogonal, and therefore cannot be checked using Theorem 5.6.4. The following material, which builds on concepts in Definition 5.6.2, and which is further developed in Chapter 12, is much more powerful. The next result is proved in Chapter 12:
Proposition 12.0.1
If terms t, t′ overlap at a subterm t0 of t, then there is a most general overlap p, in the sense that any other overlap of t, t′ at t0 is a substitution instance of p. ∎

Note that if the leftsides t, t′ of two rules in a TRS have the overlap θ(t0) = θ′(t′), then the term θ(t) can be rewritten in two ways (one for each rule).

Definition 5.6.7
A most general overlap (in the sense of Proposition 12.0.1) is called a superposition of t and t′, and the pair of rightsides resulting from applying the two rules to the term θ(t) is called a critical pair. If the two terms of a critical pair can be rewritten to a common term using rules in A, then that critical pair is said to converge or to be convergent. ∎

Theorem 5.6.9 below is our main result, while the following illustrates the definition above:
Example 5.6.8
The fourth and sixth rules of Example 5.2.8 overlap. Their leftsides are t = A -1 -1 and t′ = e -1, while t0 = A -1, with θ(A) = e and θ′ = ∅ (the empty substitution); the superposition is e -1 -1, and the critical pair is e, e -1, each term of which rewrites to e, so that the two different rewrites of the superposition both yield e. ∎

Theorem 5.6.9 (Critical Pair Theorem)  A TRS is locally Church-Rosser if and only if all its critical pairs are convergent.
Sketch of Proof:
The converse is easy. Suppose that all critical pairs converge, and consider a term with two distinct rewrites. Then their redexes are either disjoint or else one of them is a subterm of the other, since if two subterms of a given term are not disjoint, one must be contained in the other. If the redexes are disjoint, then the result of applying both rewrites is the same in either order. If the redexes are not disjoint, then either the rules overlap (in the sense of Definition 5.6.2), or else the subredex results from substituting for a variable in the leftside of the rule producing the larger redex. In the first case, the result terms of the two rewrites rewrite to a common term by hypothesis, since the overlap is a substitution instance of the superposition giving some critical pair, by Proposition 12.0.1. In the second case, the result of applying both rules is the same in either order, though the subredex may have to be rewritten multiple (or zero) times if the variable involved is non-linear. ∎
The full proof is in Appendix B. This and the Newman Lemma (Proposition 5.6.1) give:
Corollary 5.6.10
A terminating TRS is Church-Rosser if and only if all its critical pairs are convergent, in which case it is also canonical. ∎
Chapter 12 introduces unification, an algorithm that can be used to compute all critical pairs of a TRS, and hence to decide the Church-Rosser property for any terminating TRS.
Exercise 5.6.7
Use Corollary 5.6.10 to show the Church-Rosser property, and hence the canonicity, of the following TRS's:

1. GROUPC of Example 5.2.8;

2. AND of Example 5.5.7; and

3. MONOID of Exercise 5.2.7. ∎
Many important results about term rewriting are actually special cases of much more general results about a binary relation on a set. Although this abstraction of term rewriting to the one-step rewrite relation ignores the structure of terms, it still includes a great deal. The classical approach takes an unsorted view of the elements to be rewritten, but here we generalize to sorted sets of elements, enabling applications to many-sorted term rewriting and equational deduction that appear to be new.
Definition 5.7.1  An abstract rewrite system (abbreviated ARS) consists of a (sorted) set T and a (similarly sorted) binary relation → on T, i.e., → ⊆ T × T. We may denote such a system as a pair (T, →), or possibly as a triple (S, T, →), if the sort set S needs to be emphasized.

An ARS (T, →) is terminating if and only if there is no infinite sequence a1, a2, ... of elements of T such that ai → ai+1 for i = 1, 2, ... (note that ai ∈ Ts and ai →s ai+1 for the same s ∈ S). Also, t ∈ T is called reduced, or a reduced form, or a normal form, iff there is no t′ ∈ T such that t → t′.

Let ∗→ denote the reflexive, transitive closure of →. Then t0 is called a normal form of t ∈ T iff t ∗→ t0 and t0 is a normal form. Let t1 ↓ t2 mean there is some t0 ∈ T such that t1 ∗→ t0 and t2 ∗→ t0; we say t1, t2 are convergent, or converge to t0. An ARS is Church-Rosser (also called confluent) iff for every t ∈ T, whenever t ∗→ t1 and t ∗→ t2 then t1 ↓ t2. An ARS is canonical iff it is terminating and Church-Rosser; in this case, normal forms are called canonical forms. Let ∗↔ denote the reflexive, symmetric, transitive closure of →, and let =→ denote the reflexive closure of →.

An ARS is locally Church-Rosser (or locally confluent) iff for every t ∈ T, whenever t → t1 and t → t2 then t1 ↓ t2. An ARS is globally finite iff for every t ∈ T, there are only a finite number of distinct maximal rewrite sequences (finite or infinite) that begin with t. ∎

We can relativize all these concepts to a single sort s just as in Definitions 5.2.1 and 5.2.2, to take account of the fact that rewriting over different sorts may have different properties. Three of the more useful TRS results that generalize to ARS's are as follows:

Theorem 5.7.2
Given a canonical ARS, every t ∈ T has a unique normal form, denoted [[t]] and called the canonical form of t. ∎

Proposition 5.7.3
Given an ARS (T, →) and t, t′ ∈ T, then t ∗↔ t′ iff there are t1, ..., tn ∈ T such that t ↓ t1, ti ↓ ti+1 for i = 1, ..., n − 1, and tn ↓ t′. ∎

Proposition 5.7.4 (Newman Lemma)  A terminating ARS is Church-Rosser iff it is locally Church-Rosser. Hence any ARS that is terminating and locally Church-Rosser is canonical. ∎
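When T is finite, the ARS notions above are directly computable. The following Python sketch is ours, for illustration: it computes normal forms via the reflexive-transitive closure for a small terminating, confluent relation, so that every element has a unique canonical form as in Theorem 5.7.2.

```python
# Normal forms in a finite ARS.  succ[x] is the set of one-step successors;
# the relation a -> b, a -> c, b -> d, c -> d is terminating and
# Church-Rosser, so every element has exactly one normal form.

succ = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set()}

def reach(x):
    """Reflexive-transitive closure: everything reachable from x."""
    seen, todo = {x}, [x]
    while todo:
        for y in succ[todo.pop()]:
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def normal_forms(x):
    """The normal forms of x: reachable elements with no successor."""
    return {y for y in reach(x) if not succ[y]}

canonical_form = {x: normal_forms(x) for x in succ}
```

Replacing succ by a non-confluent relation makes some normal_forms(x) contain more than one element, which is exactly the failure of canonicity.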
These results are proved essentially the same way as the corresponding TRS results (the second generalizes Proposition 5.1.14). Alternatively, they can be proved directly from the TRS results, by using the connection between ARS's and TRS's that we now describe; a more detailed discussion of this connection appears in Section 5.9.

Given a TRS T = (Σ, A) where Σ has sort set S, we get an ARS (S, T, →) by letting Ts = (T_Σ)s and, for t1, t2 ∈ Ts, defining t1 →s t2 iff t1 ⇒_A t2; denote this ARS by R(T). It is suitable for dealing with ground properties of its TRS.

Exercise 5.7.1
Prove that a TRS T is ground terminating iff the ARS R(T) is terminating. Prove that a TRS T is ground Church-Rosser iff the ARS R(T) is Church-Rosser. Also prove corresponding results for the local Church-Rosser and canonicity properties. ∎

Next, an ARS A = (T, →) gives rise to a TRS F(A) = (Σ_T, A_→) as follows: define Σ_T by letting (Σ_T)[],s = Ts and (Σ_T)w,s = ∅ for all other pairs w, s, with the rules in A_→ the equations (∀∅) t1 = t2 such that t1 →s t2 in A for some sort s.

Exercise 5.7.2
Prove that an ARS A is terminating iff the TRS F(A) is ground terminating. Do the same for the Church-Rosser, local Church-Rosser, and canonicity properties. ∎

It is usually easier to prove results about ARS's than about TRS's, but like most bridges, this one can be used in either direction, as illustrated in the following:
Exercise 5.7.3
Prove Theorem 5.7.2 and Proposition 5.7.4 by reducing them to the corresponding results for TRS’s. Also do the reverse for the groundcase. (cid:2)
The following gives another tool for showing the Church-Rosser prop-erty:
Proposition 5.7.5 ( Hindley-Rosen Lemma ) For each i ∈ I , let (T , → i ) be a Church-Rosser ARS, and assume that for all i, j ∈ I the relations → i and → j commute in the sense that for all a, b, c ∈ T , if a ∗ → i b and a ∗ → j c thenthere is some d ∈ T such that b ∗ → j d and c ∗ → i d . Now define → on T by a → b iff there is some i ∈ I such that a → i b . Then (T , → ) isChurch-Rosser. Proof:
We begin by showing that it suffices to prove this result for the case where I has just two indices. First, notice that since any particular rewrites a ∗→ b and a ∗→ c can only involve a finite set of relations →i, it suffices to consider finite sets I of indices. Now assuming that Hindley-Rosen holds for index sets of cardinality 2, we show that it holds for any finite cardinality k, by induction on k. For k = 1, there is nothing to prove. Now assume Hindley-Rosen for some k ≥ 1, and suppose we are given relations →i for i = 1, ..., k + 1. Then →′ = →1 ∪ ··· ∪ →k is Church-Rosser by the induction hypothesis. Therefore if we show that →′ and →k+1 commute, we are done by Hindley-Rosen for two indices.

To show that →′ and →k+1 commute, let a ∗→′ b and a ∗→k+1 c. The proof is by induction on the length n of the rewrite sequence for a ∗→′ b. If n = 0 there is nothing to show, and if n = 1 the commutation of the single relation involved with →k+1 gives the result. Now assuming the hypothesis for some n ≥ 1, we prove it for a ∗→′ b of length n + 1: let the first step of a ∗→′ b be a →i b1. Then by commutation of →i with →k+1, there is some d1 such that c ∗→i d1 and b1 ∗→k+1 d1. We now conclude the proof by applying the induction hypothesis to b1 ∗→′ b and b1 ∗→k+1 d1, noting that the former has length n, to get d such that b ∗→k+1 d and d1 ∗→′ d, and hence also c ∗→′ d (see Figure 5.3, in which every arrow has an omitted ∗, and each downward arrow is →k+1).

Figure 5.3: Hindley-Rosen Proof Reduction

We now prove Hindley-Rosen for the case of two indices. Let →1 and →2 be commuting Church-Rosser relations, let → = →1 ∪ →2, and let →3 be arbitrarily many applications of →1 followed by arbitrarily many applications of →2, i.e., →3 = ∗→1 ∘ ∗→2. Then the argument suggested by Figure 5.4 shows that →3 is Church-Rosser, where all leftside horizontal arrows are ∗→1, all rightside horizontal arrows are ∗→2, all top downward arrows are ∗→1, and all bottom downward arrows are ∗→2. Next, one can check that → ⊆ →3 ⊆ ∗→, which implies that ∗→3 = ∗→. Therefore ∗→ is also Church-Rosser, and hence so is → (by Exercise 5.7.6). ∎

Figure 5.4: Hindley-Rosen Proof for k = 2

Exercise 5.7.4
Show that there is no analog of the Hindley-Rosen Lemma for termination: give terminating ARS's (T, →1) and (T, →2) which commute in the sense of Proposition 5.7.5, such that (T, →) is not terminating, where → = →1 ∪ →2. ∎

Exercise 5.7.5
Prove a more convenient version of Hindley-Rosen, which replaces commutation with the following notion of strong commutation: if a →i b and a →j c then there is some d ∈ T such that b =→j d and c ∗→i d, where =→j indicates the reflexive closure of →j. ∎

Although the definition of strong commutation is asymmetric, in practice it is used in situations where it holds symmetrically, in both orders.
Prove that an ARS (T, →) is Church-Rosser iff (T, ∗→) is Church-Rosser. ∎

There are also ARS versions of Proposition 5.2.6 and its Corollary 5.2.7. For the first of these, recall that ∗↔ denotes the reflexive, symmetric, transitive closure of the relation →.

Proposition 5.7.6  If (T, →) is a Church-Rosser ARS, then t ∗↔ t′ iff t ↓ t′.

Proof:
We use induction on the number n of rewrites involved in t ∗↔ t′. If n = 0 then t = t′, so that t ↓ t′ trivially. Now suppose that t ∗↔ t′ with n + 1 rewrites. There are two cases: (1) t ∗↔ t″ ← t′, and (2) t ∗↔ t″ → t′, where in both cases t ∗↔ t″ in n rewrites, so that t ↓ t″ by the induction hypothesis, say with t1 such that t ∗→ t1 and t″ ∗→ t1. For case (1), since t″ ∗→ t1 and t′ → t″, we get t ↓ t′. For case (2), from t″ ∗→ t1 and t″ → t′, the Church-Rosser property gives t2 such that t1 ∗→ t2 and t′ ∗→ t2, from which it follows that t ∗→ t2, so that t ↓ t′. ∎

Corollary 5.7.7  If (T, →) is a canonical ARS, then t ∗↔ t′ iff t ↓ t′ iff [[t]] = [[t′]].

Proof:
This is because t ↓ t′ iff [[t]] = [[t′]] for a canonical ARS, noting that the equality used here is syntactic identity. ∎

Propositions 5.8.16 on page 134 and 5.8.19 on page 135 give ways to prove termination of ARS's.

So far we have related ARS's to ground term rewriting; extending the relationship to non-ground rewriting can be somewhat tricky, because we must take account not only of the sorts of terms, but also of the sets of variables that appear in terms through universal quantification. As a first example, we provide the proof promised in Section 5.3 for the following result:
Proposition 5.3.6
Given a sort set S, let XωS be the signature of constants with (XωS)s = { xi,s | i ∈ ω }, i.e., with a countable set of distinct variable symbols for each s ∈ S. Then a TRS (Σ, A) is Church-Rosser iff (Σ(XωS), A) is ground Church-Rosser. Also, (Σ, A) is locally Church-Rosser iff (Σ(XωS), A) is ground locally Church-Rosser.

Proof:
Given a term t with var(t) = X, let T = TΣ(X) and let G = TΣ(XωS). That the properties for (Σ, A) imply the corresponding ground properties for (Σ(XωS), A) is direct. For the converse, form the ARS's T = (T, →T) and G = (G, →G) using rewriting with A on T and on G, respectively. Let f : X → XωS be an injection, which we can without loss of generality assume is an inclusion, and let f also denote its free extension to terms, which is again an inclusion, T → G. Then t →T t′ iff t →G t′, and hence t →T* t′ iff t →G* t′. Now suppose that (Σ(XωS), A) is ground Church-Rosser and let t →T* t1 and t →T* t2. Then t →G* t1 and t →G* t2, with X = var(t). Therefore there exists t3 such that t1 →G* t3 and t2 →G* t3, and hence t1 →T* t3 and t2 →T* t3, from which it follows that T is Church-Rosser, and hence that (Σ, A) is Church-Rosser, since t was arbitrary. An analogous proof works for the local Church-Rosser property. □

The above proof involves a rewrite relation explicitly indexed over the sorts in S, and implicitly indexed over variable sets X. To make the latter explicit, we could index over I = PF(XωS) × S, where PF(U) denotes the set of all finite subsets of U, and where for simplicity all variables are assumed drawn from the fixed signature XωS introduced above.

For the next result, we need the following construction: Given a TRS T = (Σ, A), define the ARS N(T) = (T, →) by Ts = (TΣ(XωS))s and t →s t′ iff t ⇒A t′, for t, t′ ∈ Ts. We now apply this machinery to get the proofs that were promised for some results in Section 5.2:

Proposition 5.2.6 If T = (Σ, A) is a Church-Rosser TRS, then A ⊨ (∀X) t = t′ iff t ↓ t′.

Proof:
First form the ARS N(T) as described above, and notice that A ⊢ (∀X) t = t′ iff t ↔s* t′ where t, t′ both have sort s. Now apply Proposition 5.7.6 to N(T), and finally appeal to the Completeness Theorem. □

Definition 5.7.8
Let (T, →) and (T′, →′) be ARS's. Then (T′, →′) is a sub-ARS of (T, →) iff T′ ⊆ T and t1 →′ t2 implies t1 → t2. Also, an ARS isomorphism of (T, →) and (T′, →′) is a bijective function f : T → T′ such that t1 → t2 iff f(t1) →′ f(t2). □

Exercise 5.7.7
Show that if (T, →) is a terminating ARS and if (T′, →′) is a sub-ARS of (T, →), then (T′, →′) is also terminating. Show that, by contrast, a sub-ARS of a Church-Rosser ARS need not be Church-Rosser. Also show that if two ARS's are isomorphic, then each of them is terminating, or Church-Rosser, or locally Church-Rosser iff the other one is. □

We now prove the result stated in Section 5.3 about adding constants to a TRS:
Proposition 5.3.4 If Σ is non-void, then a TRS (Σ, A) is ground terminating iff (Σ(X), A) is ground terminating, where X is any signature of constants for Σ. Also, (Σ(X), A) is ground terminating if (Σ, A) is ground terminating, E15 and if Σ is non-void, then a TRS is ground terminating iff it is terminating.

Proof:
The reader should first check that a TRS (Σ, A) is terminating iff N(Σ, A) is. Next, given an S-sorted set Y of constants and a signature isomorphism f : XωS → Y, we can show that N(Σ, A) and (TΣ(Y), →A) are isomorphic ARS's, from which it follows by Exercise 5.7.7 that one of them is terminating iff the other is. Finally, since X is countable, XωS and XωS ∪ X are isomorphic, from which it follows that the TRS (Σ, A) is terminating iff (Σ(X), A) is, since N(Σ(X), A) = (TΣ(X ∪ XωS), →A). □

Proposition 5.3.1
If a TRS A is Church-Rosser or locally Church-Rosser, then so is A(X) , for any suitable countable set X of constants. Proof:
Let P stand for either of the above properties. The reader should check that a TRS (Σ, A) is P iff N(Σ, A) is P. Next, given an S-sorted set Y of constants and a signature isomorphism f : XωS → Y, N(Σ, A) and (TΣ(Y), →A) are isomorphic ARS's, from which it follows by Exercise 5.7.7 that one of them is P iff the other is. Finally, since X is countable, XωS and XωS ∪ X are isomorphic, from which it follows that the TRS (Σ, A) is P iff (Σ(X), A) is P, since N(Σ(X), A) = (TΣ(X ∪ XωS), →A). □

5.8 Conditional Term Rewriting

Conditional term rewriting arises naturally from the desire to implement algebraic specifications that have conditional equations in the same way that unconditional rewriting implements unconditional equational specifications. There are many examples of such specifications, and they are very useful in practice, as well as strictly more expressive [176]. Just as unconditional rewrite rules are a special kind of unconditional equation, so conditional rewrite rules are a special kind of conditional equation:
Definition 5.8.1 A conditional Σ-rewrite rule is a conditional Σ-equation (∀X) t = t′ if C such that var(t′) ∪ var(C) ⊆ var(t) = X, where var(C) = ⋃⟨u,v⟩∈C (var(u) ∪ var(v)).

A conditional Σ-term rewriting system (abbreviated Σ-CTRS) is a set of (possibly) conditional Σ-rewrite rules; we denote such a system by (Σ, A), and we may omit Σ here and elsewhere if it is clear from context. □
Notation and terminology for conditional term rewriting follow those for the unconditional case. Instead of (∀Y) t = t′ if C, we usually write (∀Y) t → t′ if C, and in concrete cases we may write (∀Y) t → t′ if u = v, or (∀Y) t → t′ if u = v, u′ = v′, etc. The notation t → t′ if C is unambiguous because X is determined by t. Also, when t ⇒ t′ using a rule in A having leftside ℓ, with substitution θ where t = t0(z ← θ(ℓ)), the pair (t0, θ) is called a match to (a subterm of) t by that rule. Unconditional rules are the special case where C = ∅.

Unfortunately, there is no easy way to generalize the rule (rw) for term rewriting to the conditional case, e.g., by specializing the rule (+6C) from Section 4.9 to replace exactly one subterm using a substitution instance of a conditional rewrite rule. This is because the conditions must be checked, which may lead to further conditional term rewriting, including further condition checking, and so on recursively. We therefore need a recursive definition of the conditional term rewriting relation, and so will define (sorted) relations temporarily denoted Rk and R̄k, with the rewriting relation the union of the Rk and with the R̄k used for evaluating conditions.

Definition 5.8.2
Given a CTRS (Σ, A) and a set X of variables, let R0 = R̄0 = {⟨t, t⟩ | t ∈ TΣ(X)}, and for k ≥ 0, let ⟨t, t′⟩ ∈ Rk+1 iff there exist a conditional rule (∀Y) t1 = t2 if C of sort s in A and a substitution θ : Y → TΣ(X) such that t = t0(z ← θ(t1)) and t′ = t0(z ← θ(t2)) for some t0 ∈ TΣ(X ∪ {z}s), and such that for each ⟨u, v⟩ ∈ C there is some r such that ⟨θ(u), r⟩, ⟨θ(v), r⟩ ∈ R̄k. Also let R̄k+1 = (Rk+1 ∪ R̄k)* and let R = ⋃k≥0 Rk. Then R is the conditional term rewriting relation, hereafter denoted t ⇒A t′. As usual, ⇒A* denotes its transitive, reflexive closure, and when X = ∅ we get the ground case. □

Note that it is possible to go into an infinite loop when evaluating the condition of an instance of a rule, in which case the corresponding head is simply not included in R. Note also that there exists r such that ⟨θ(u), r⟩, ⟨θ(v), r⟩ ∈ R̄k iff θ(u) and θ(v) converge under ⋃j≤k Rj. E16
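The stratification in Definition 5.8.2 can be sketched concretely, at least for the degenerate case of ground rules over constants only, where substitution and subterm replacement disappear. The following Python toy (the rule set and all function names are invented for illustration) shows how a conditional rewrite only becomes available one level above the level at which its condition first converges:

```python
# Toy illustration of the stratified relations of Definition 5.8.2, for
# ground rules over constants only (no substitutions or subterms).
# Each rule is (leftside, rightside, conditions); a condition (u, v)
# must converge at the previous level.
RULES = [
    ("a", "b", [("c", "d")]),   # a -> b if c = d
    ("c", "d", []),             # c -> d  (unconditional)
]

def step(t, k):
    """One-step rewrites of t in R_{k+1}; conditions checked at level k."""
    return {rhs for lhs, rhs, conds in RULES
            if t == lhs and all(joinable(u, v, k) for u, v in conds)}

def reducts(t, k):
    """Terms reachable from t using the steps available at level k+1."""
    seen, todo = {t}, [t]
    while todo:
        for v in step(todo.pop(), k):
            if v not in seen:
                seen.add(v); todo.append(v)
    return seen

def joinable(u, v, k):
    """u and v converge at level k (level 0 is just syntactic identity)."""
    if k == 0:
        return u == v
    return bool(reducts(u, k - 1) & reducts(v, k - 1))

# c -> d is unconditional, so it appears at the lowest level; a -> b
# needs c and d to join, which first holds one level higher.
assert step("a", 0) == set()
assert step("a", 1) == {"b"}
```

The levels thus play the role of a recursion depth for condition evaluation: a rewrite never depends on condition checks at its own level or above.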
It is not hard to check that t ⇒A t′ iff ⟨t, t′⟩ ∈ Rk for some k > 0, and that R* = R̄, where R̄ = ⋃k≥0 R̄k is the union of the auxiliary relations used for evaluating conditions. The soundness results in Proposition 5.8.8 are also straightforward. However, we do not prove these results here, because they follow from more general results in Section 7.7.

Many results developed earlier in this chapter for unconditional term rewriting extend to the conditional case. The easiest extensions use the fact that CTRS's give rise to ARS's just as in the unconditional case, so we can directly apply general ARS definitions and results to conditional term rewriting. More specifically, if P = (Σ, A) is a CTRS, we let R(P) be the ARS (T, →) with Ts = TΣ(X)s and with t →s t′ iff t ⇒A t′ for t, t′ of sort s. This gives us the correct notions of termination, normal form, Church-Rosser, local Church-Rosser, and canonicity for CTRS's. For example, P is terminating iff R(P, s), the restriction of R(P) to sort s, is terminating for each s ∈ S. As with ordinary TRS's, we let X = ∅ for the ground case, and we choose X large enough for the general case, e.g., XωS as defined in Proposition 5.3.6. The ARS results Theorem 5.7.2, Proposition 5.7.3, and Proposition 5.7.4 give the following:

Theorem 5.8.3
Given a canonical CTRS, every t ∈ T has a unique normal form, denoted [[t]] and called the canonical form of t. □

Proposition 5.8.4
Given a CTRS (Σ, A) and t, t′ ∈ TΣ(X), then t ↔A* t′ iff there are terms t1, . . . , tn ∈ TΣ(X) such that t ↓A t1, ti ↓A ti+1 for i = 1, . . . , n − 1, and tn ↓A t′. □

Proposition 5.8.5 (Newman Lemma) A terminating CTRS is Church-Rosser if and only if it is locally Church-Rosser. □
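These ARS-level notions are easy to experiment with on small finite examples. The following Python sketch (the ARS data is invented for illustration) computes joinability and canonical forms for a terminating, locally Church-Rosser ARS, where the Newman Lemma guarantees that normal forms are unique:

```python
# Joinability and canonical forms on a small finite ARS (illustrative
# data): the system is terminating and locally Church-Rosser, so by the
# Newman Lemma every term has a unique normal form.
STEPS = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set(),
         "e": {"f"}, "f": set()}

def reducts(t):
    """All terms reachable from t in zero or more rewrite steps."""
    seen, todo = {t}, [t]
    while todo:
        for v in STEPS[todo.pop()]:
            if v not in seen:
                seen.add(v); todo.append(v)
    return seen

def canonical(t):
    """[[t]]: the unique normal form of t."""
    nfs = {u for u in reducts(t) if not STEPS[u]}
    assert len(nfs) == 1            # uniqueness, by the Newman Lemma
    return nfs.pop()

def joinable(t, u):                 # t joins u: a common reduct exists
    return bool(reducts(t) & reducts(u))

assert canonical("a") == "d"
assert joinable("b", "c") and not joinable("a", "e")
```

Here a and e lie in different connected components, so they are neither joinable nor convertible, while b and c join at the shared normal form d.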
Other results generalize, not through ARS's, but because their proofs generalize. We begin with two results from Section 5.1 that connect rewriting with deduction:
Proposition 5.8.6
For t, t′ ∈ TΣ(Y), Y ⊆ X, and (Σ, A) a CTRS, t ⇒A,X t′ iff t ⇒A,Y t′, and in both cases var(t′) ⊆ var(t). □
For t, t′ ∈ TΣ(Y), Y ⊆ X, and (Σ, A) a CTRS, t ⇒A,X* t′ iff t ⇒A,Y* t′, and in both cases var(t′) ⊆ var(t); moreover, t ↓A,X t′ iff t ↓A,Y t′. □

As before, this shows that ⇒A,X* and ↓A,X restrict and extend well over variables, which permits dropping the variable set subscripts. On the other hand, noting that TRS's are a special case of CTRS's, Example 5.1.15 shows that ↔A,X* does not restrict and extend well over variables. The next result gives soundness:
Proposition 5.8.8
Given a CTRS (Σ, A) and t, t′ ∈ TΣ(X), then t ⇒A* t′ implies A ⊢ (∀X) t = t′. Also t ↓A t′ and t ↔A* t′ both imply A ⊢ (∀X) t = t′. □

We cannot hope for completeness here, because it is possible, for a condition t1 = t2, that A ⊢ t1 = t2 but t1 ↓ t2 fails. The literature includes several different ways to define conditional term rewriting; the one in Definition 5.8.2 is called join conditional rewriting. OBJ implements a special case of this, where there is just one condition in each rule, with its leftside a Bool-sorted term, and its implicit rightside the constant true. Although conditional rewriting can be difficult, the special case implemented in OBJ is much more efficient, because the convergence of a condition can be checked just by rewriting its leftside term.
Perhaps surprisingly, OBJ's restrictions do not limit its power in practice. In particular, evaluation of an OBJ equation of the form t = t′ if u == v agrees with Definition 5.8.2, despite the implicit true on the rightside of the condition, because of the operational semantics of ==. In any case, soundness implies that any rewriting computation is a proof, so if you get the result you want, then you have proved the result you wanted to prove, whether or not the CTRS that you used was Church-Rosser or terminating. Also, it is rare in practice that when OBJ evaluates u == v, a term r exists such that u ⇒A* r and v ⇒A* r but OBJ does not find this r, because u and v reduce to different normal forms, or because at least one of them does not terminate. There is an important obstacle to soundness for non-canonical CTRS's: if == occurs in a negative position (such as =/=) in a condition, then failure of == to find a common reduced form may lead to its negation returning an unsound true. However, soundness can be guaranteed for such conditional rules if A is canonical for the sorts of terms that occur in such positions, using the subset of rules that are actually applied in the particular computation. As noted before, some uses of == have to be considered carefully, because they can take one outside the mathematical semantics of OBJ.

Theorem 5.2.9 on initiality of the algebra of normal forms also generalizes; we do not prove it here, because it is a special case of Theorem 7.7.8, which is proved in Section 7.7.

Theorem 5.8.9
If a (conditional) specification P = (Σ, A) is a ground canonical CTRS, then the canonical forms of ground terms under A form a P-algebra, called the canonical term algebra of P and denoted NP, in the following way:

(0) interpret σ ∈ Σ[],s as [[σ]] in NP,s; and
(1) interpret σ ∈ Σs1...sn,s with n > 0 as the map sending (t1, . . . , tn) with ti ∈ NP,si to [[σ(t1, . . . , tn)]] in NP,s.

Furthermore, if M is any P-algebra, there is one and only one Σ-homomorphism NP → M. □

This subsection extends results from Section 5.3 from TRS's to CTRS's, on how properties can change when new constants are added. As before, these results are important because they help us conclude that rewriting systems terminate, are Church-Rosser, locally Church-Rosser, or canonical; moreover, they can justify using the Theorem of Constants in theorem proving. Proposition 5.3.1 extends to the conditional case to support this assertion; the proof appears in Appendix B:
Proposition 5.8.10
A CTRS (Σ(X), A) is ground terminating, where X is a signature of constants, if (Σ, A) is ground terminating. E17 Moreover, if Σ is non-void, then (Σ, A) is ground terminating iff (Σ(X), A) is ground terminating. □

As with the unconditional case, proofs of the (local) Church-Rosser property usually cover the general case, not just the ground case, so that constants can be added without worry. The following result is still of interest:
Proposition 5.8.11
A CTRS (Σ, A) is (locally) Church-Rosser if and only if the CTRS (Σ(XωS), A) is (locally) ground Church-Rosser, where XωS is as defined in Proposition 5.3.6. □

The proof is omitted, since it is the same as that for Proposition 5.3.6 on page 108 for the unconditional case.
Proving termination of a CTRS can be much more difficult than for the unconditional case. But we can often reduce to the unconditional case, and then apply the techniques of Section 5.5. In the following result, the "unconditional version" of a conditional rule is defined to be the rule obtained by deleting its condition:
Proposition 5.8.12
Given a CTRS C, let CU be the TRS whose rules are those of C with their conditions (if any) removed. Then C is terminating (or ground terminating) if CU is.

Proof: Any rewrite sequence of C is also a rewrite sequence of CU, and is therefore finite. □

Notice that the normal forms of C may be different from those of CU, because in general CU has more rewrites than C. The following illustrates the use of this result to prove termination of a CTRS, and because we give rather a lot of detail, it can also serve as a review of the technique of Proposition 5.5.6:

Example 5.8.13
The function max, which gives the maximum of two natural numbers, is often defined using conditional equations as follows:

obj NATMAX is sort PNat .
  op 0 : -> PNat .
  op s_ : PNat -> PNat .
  op _<=_ : PNat PNat -> Bool .
  op max : PNat PNat -> PNat .
  vars N M : PNat .
  eq 0 <= N = true .
  eq s N <= 0 = false .
  eq s N <= s M = N <= M .
  cq max(N,M) = N if M <= N .
  cq max(N,M) = M if N <= M .
endo
We will show that this CTRS is terminating. It suffices to prove that the corresponding unconditional TRS is ground terminating, by Propositions 5.8.10 and 5.8.12. (It is interesting to notice that this TRS is not Church-Rosser, although the original CTRS is Church-Rosser.)

We define ρ : TΣ → ω by initiality, by making ω a Σ-algebra as follows: ωtrue = ωfalse = ω0 = 1; ωs(N) = N + 1; ω<=(N, M) = 1 + N + M; and ωmax(N, M) = 2 + N + M. To apply Proposition 5.5.6, we must check a number of inequalities. The following arise from condition (1), and must hold for any x, y ∈ TΣ,PNat:

ρ(0 <= x) > ρ(true)
ρ(s x <= 0) > ρ(false)
ρ(s x <= s y) > ρ(x <= y)
ρ(max(x, y)) > ρ(x) if (y <= x) = true
ρ(max(x, y)) > ρ(y) if (x <= y) = true

For condition (2′), we must check the following for any x, y, z ∈ TΣ,PNat under the assumption that ρ(x) > ρ(y):

ρ(s x) > ρ(s y)
ρ(x <= z) > ρ(y <= z)
ρ(z <= x) > ρ(z <= y)
ρ(max(x, z)) > ρ(max(y, z))
ρ(max(z, x)) > ρ(max(z, y))

All of these translate to inequalities over the natural numbers that are easily checked mechanically, e.g., with appropriate reductions under the following definition, noting that we must introduce only the syntax of NATMAX, not its equations, and that the version of NAT used, in this case NATP+*>, must contain enough facts about addition and > to make the proofs work:

obj NATMAXPF is sort PNat .
  pr NATP+*> .
  op 0 : -> PNat .
  op s_ : PNat -> PNat .
  op _<=_ : PNat PNat -> Bool .
  op max : PNat PNat -> PNat .
  op r : PNat -> Nat .
  op r : Bool -> Nat .
  vars X Y : PNat .
  eq r(0) = 1 .
  eq r(true) = 1 .
  eq r(false) = 1 .
  eq r(s X) = s r(X) .
  eq r(X <= Y) = s(r(X) + r(Y)) .
  eq r(max(X,Y)) = s s(r(X) + r(Y)) .
endo

Thus the following proves the first set of inequalities, where we have introduced constants to eliminate the universal quantifiers, and then a lemma that was not already in NATP+*>:

openr .
  ops x y : -> PNat .
  vars N M : Nat .
  eq s(N + M) > N = true .
  red r(0 <= x) > r(true) .
  red r(s x <= 0) > r(false) .
  red r(s x <= s y) > r(x <= y) .
  red r(max(x,y)) > r(x) .
  red r(max(x,y)) > r(y) .
close

The last two inequalities are true without their conditions, although the conditions could have been added as assumptions for the proofs if they had been needed. □
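The first set of inequalities can also be spot-checked numerically outside OBJ. The sketch below mirrors the weight function r of NATMAXPF in plain Python (the helper names are mine, and a finite sample of weight values is of course only a sanity check, not a proof):

```python
# Numeric spot-check (not a proof) of the weight function r used in
# NATMAXPF: every NATMAX rule should strictly decrease r, for any
# weights x, y >= 1 of the argument terms.
def r_zero():        return 1          # r(0)
def r_bool():        return 1          # r(true) and r(false)
def r_s(n):          return n + 1      # r(s X)
def r_le(n, m):      return 1 + n + m  # r(X <= Y)
def r_max(n, m):     return 2 + n + m  # r(max(X,Y))

for x in range(1, 30):
    for y in range(1, 30):
        assert r_le(r_zero(), x) > r_bool()        # eq 0 <= N = true
        assert r_le(r_s(x), r_zero()) > r_bool()   # eq s N <= 0 = false
        assert r_le(r_s(x), r_s(y)) > r_le(x, y)   # eq s N <= s M = N <= M
        assert r_max(x, y) > x                     # cq max(N,M) = N ...
        assert r_max(x, y) > y                     # cq max(N,M) = M ...
```

Since every ground term has weight at least 1, sampling weights from 1 upward covers the shape of all the inequalities; the genuine proof is the symbolic one given above.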
Exercise 5.8.1
Give mechanical proofs for the second set of inequalities in Example 5.8.13, similar to those given for the first set of inequalities there. □
The conclusion of Proposition 5.8.12 is that infinite rewrite sequences cannot occur. However, a related phenomenon can occur for terminating CTRS's, whereby a process of determining whether a conditional rewrite applies does not stop. This is illustrated in the following:
Example 5.8.14
Let Σ have just one sort plus four constants, a, b, c, d, and let A contain the following two conditional rewrite rules:

a → b if c = d
c → d if a = b

Then given the term a, to check whether the first rule applies, we must consult the second, which in turn requires that we consult the first rule again, etc., etc. According to the formal definition of conditional term rewriting, the result of such an infinite regress is simply that the original rule does not apply to the given term; so this does not lead to non-termination in the sense of Definition 5.2.1, and in fact this CTRS is terminating, as is easily seen using Proposition 5.8.12. Intuitively, neither rule applies, because neither condition can ever be satisfied. What does occur here is that a certain algorithm that might be used to implement conditional term rewriting fails to terminate. In fact, each of a, b, c, d is a reduced form under this CTRS (however, only b and d are reduced under its unconditional version).

It is possible to get the same phenomenon with just one rule. Let Σ now have one sort plus two constants, a, b, and one unary function s. Then the rule

a → b if s(a) = s(b)

leads to an infinite condition evaluation regress similar to that of the above two-rule example. Here too the CTRS is terminating, and a, b are both reduced, for similar reasons. □

Exercise 5.8.2
Write out the details of the infinite regress and of the termination proof for the one-rule CTRS above. □
Example 5.8.15
The following OBJ code for the two examples above aborts, producing the error message "Value stack overflow.", because of infinite conditional evaluation regress:

obj CTRS1 is sort S .
  ops a b c d : -> S .
  cq a = b if c == d .
  cq c = d if a == b .
endo
red a .

obj CTRS2 is sort S .
  ops a b : -> S .
  op s : S -> S .
  cq a = b if s(a) == s(b) .
endo
red a .

□
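The regress can be reproduced outside OBJ with a naive recursive condition evaluator. In the Python sketch below (illustrative code, not OBJ's actual algorithm), checking the condition of a → b recurses into the condition of c → d and back again until the stack overflows:

```python
# A naive recursive implementation of condition checking loops forever
# on the two-rule CTRS of Example 5.8.14: checking the condition of
# a -> b requires checking the condition of c -> d, and vice versa.
import sys

RULES = {"a": ("b", ("c", "d")),   # a -> b if c = d
         "c": ("d", ("a", "b"))}   # c -> d if a = b

def reduce_(t):
    """Reduce constant t, recursively evaluating rule conditions."""
    if t in RULES:
        rhs, (u, v) = RULES[t]
        if reduce_(u) == reduce_(v):    # recurse into the condition
            return reduce_(rhs)
    return t

sys.setrecursionlimit(100)              # fail fast instead of hanging
try:
    reduce_("a")
    outcome = "terminated"
except RecursionError:
    outcome = "infinite condition-evaluation regress"
print(outcome)   # -> infinite condition-evaluation regress
```

As the text notes, a smarter algorithm could detect this particular loop, but no algorithm can decide condition-evaluation termination in general.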
Of course, it is interesting to know when condition evaluation terminates, as well as when rewriting terminates, but we do not address that problem here.

Proposition 5.5.1 on page 111 generalizes to abstract rewrite systems, and hence applies to the conditional case just as well as to the unconditional case, and Example 5.5.2 again shows the necessity of global finiteness for the converse.
Proposition 5.8.16
An ARS A on a set T is terminating if there is a function ρ : T → ω such that for all t, t′ ∈ T, if t →A t′ then ρ(t) > ρ(t′). Furthermore, if A is globally finite, then A is terminating iff such a function exists. □

(Although it is easy to design an algorithm that does terminate on simple examples of this kind, just by checking for loops, it is impossible to write an algorithm that works for all examples, because the problem is unsolvable.)
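On a finite ARS, the weight-function criterion of Proposition 5.8.16 can be checked mechanically. The sketch below (the ARS and the weights are invented for illustration) verifies that every step strictly decreases ρ, and observes that the weights then bound the length of any rewrite sequence:

```python
# A mechanical check of the weight-function criterion of Proposition
# 5.8.16 on a small finite ARS (the ARS and weights are illustrative).
STEPS = {("a", "b"), ("b", "c"), ("a", "c")}
RHO   = {"a": 2, "b": 1, "c": 0}

# Every rewrite step strictly decreases rho, so the ARS terminates:
assert all(RHO[s] > RHO[t] for s, t in STEPS)

# Indeed, no rewrite sequence can be longer than max(RHO.values()):
def longest(t):
    succs = [v for u, v in STEPS if u == t]
    return 0 if not succs else 1 + max(longest(v) for v in succs)

assert max(longest(t) for t in RHO) <= max(RHO.values())
```

The converse direction of the proposition is where global finiteness matters: without it, a terminating ARS can have rewrite sequences of unbounded length from a single element, so no ω-valued weight exists.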
With this we can generalize Proposition 5.5.6 to CTRS's, using the following terminology:
Definition 5.8.17
Given a poset P and ρ : TΣ → P, a conditional Σ-rewrite rule t → t′ if C is strict ρ-monotone iff ρ(θ(t)) > ρ(θ(t′)) for each applicable ground substitution θ such that θ(u) ↓ θ(v) for each ⟨u, v⟩ ∈ C; we speak of weak ρ-monotonicity if > is replaced by ≥ above. See Definition 5.5.3 on page 112 for related concepts. □
Given a CTRS (Σ, A), if there is a function ρ : TΣ → ω such that

(1) each rule in A is strict ρ-monotone, and
(2′) each σ ∈ Σ is strict ρ-monotone,

then A is ground terminating. □

The proof is like that of Proposition 5.5.6 on page 113, and is therefore omitted. Rather than give an example using this result now, we will further generalize it to the very common case of a (C)TRS that we know terminates, to which we add some new rules, and then want to show that the resulting system also terminates. The following easy but useful ARS result is the basis for this generalization:
Proposition 5.8.19
Let A be an ARS on a set T, let B be a terminating "base" ARS contained in A, and let N denote the "new" rewrites of A on T, i.e., let →N = →A − →B. Then A is terminating if there is a function ρ : T → ω such that

(1) if t →B t′ then ρ(t) ≥ ρ(t′), and
(2) if t →N t′ then ρ(t) > ρ(t′).

Proof:
Any A-rewrite sequence can be put in the form t0 →B* t1 →N t2 →B* t3 →N ⋯, from which it follows that ρ(t0) ≥ ρ(t1) > ρ(t2) ≥ ρ(t3) > ⋯. Hence there is some k such that no N-rewrite applies to tk. Because B is terminating, there can be only a finite number of rewrites after tk, so the sequence must be finite. □

The two levels of this result can be iterated to form a multi-level hierarchy, in which one proves the termination of each layer assuming the one below it. The following is a conditional hierarchical version of Proposition 5.5.6; of course, it also applies to unconditional TRS's. The proof is not entirely trivial. (Recall that the inequality only needs to hold when all the conditions of the rule converge.)
Theorem 5.8.20
Let (Σ, A) be a CTRS with Σ non-void, let (Σ, B) be a terminating sub-CTRS of (Σ, A), and let N = A − B. If there is a function ρ : TΣ → ω such that

(1) every rule in B is weak ρ-monotone,
(2) every rule in N is strict ρ-monotone,
(3) every σ ∈ Σ is strict ρ-monotone,

then A is ground terminating.

Proof:
We will use Proposition 5.8.19. Let A, B, N be the ARS's for A, B, N respectively, on the (indexed) set T = TΣ, and let Σ′ be the minimal signature for B. Notice that rules in B may apply to terms with operations in Σ − Σ′; therefore B must apply such rewrites. This means we cannot assume termination for B, and hence to apply Proposition 5.8.19, we must first establish that assumption for B on TΣ. We will do this by induction on the depth of nesting of new operation symbols in a Σ-term t.

For the base case, a Σ-term t has depth zero iff it contains no operations in Σ − Σ′, and then we have termination by our assumption that B is ground terminating on TΣ′.

Next, suppose t is a Σ-term with depth d > 0 of the form g(t1, . . . , tn) with g ∈ Σ − Σ′ and with each of t1, . . . , tn of depth less than d. Then by the inductive assumption, rewriting with B is terminating on each ti, and hence is terminating on t, because only a lapse rule in B could be applied at the top of t, and any such application reduces us to the case of the previous paragraph, because the rightside of the lapse rule must be a ground term, or else B would not be terminating.

Now consider the general case of a Σ-term t with depth d > 0, which will have the form t = t0(z1 ← t1, . . . , zn ← tn) with t0 involving only operations in Σ′, with each of t1, . . . , tn of depth d or less, and with the top operation of each ti in Σ − Σ′. Then any rewrite of t must be either inside some ti or else inside t0. There can be only a finite number of rewrites of the first kind, by the argument of the previous paragraph, and there can be only a finite number of rewrites of the second kind, noting that our signature is non-void and applying Proposition 5.8.10 on page 131 of Section 5.8.1 (which generalizes Proposition 5.3.4, about the effect on termination of adding constants, to the conditional case), because the zi are just new constants. Hence rewriting with B terminates on any such term t, and we have therefore proved termination of B.

Next, observe that our assumptions (1) and (3) above imply assumption (1) of Proposition 5.8.19, by the same reasoning that was used to prove Proposition 5.5.5 in Appendix B. Similarly, our assumptions (2) and (3) imply assumption (2) of Proposition 5.8.19. □

In many cases, ρ is already defined on B, and we only need check the conditions for the new rules and new operations. Notice that if a CTRS A has a terminating sub-CTRS B such that the new rules in N = A − B cannot be used in evaluating the conditions of rules in N, then infinite condition evaluation regress cannot occur.

We first apply Theorem 5.8.20 to a case where all the new rules are unconditional; here the hierarchical specification greatly simplifies the termination proof, using the fact that termination was previously shown for the base system.

Example 5.8.21
Suppose we are given some (C)TRS B for the natural numbers that we already know is terminating, such as NATP+, and then define the Fibonacci numbers over B by:

obj FIBO is pr NATP+ .
  op f : Nat -> Nat .
  var N : Nat .
  eq f(0) = 0 .
  eq f(s 0) = s 0 .
  eq f(s s N) = f(s N) + f(N) .
endo

Letting Σ be the signature for the union TRS A, and Σ′ the signature for B (which is NATP+), we define ρ : TΣ → ω by letting each σ ∈ Σ′ have its usual meaning in ω, and letting ωf(N) = 2^N. Then for t ∈ TΣ′, the value of ρ(t) is the number that it denotes, and so all the B-rules are weak monotone (in fact, with equality). For condition (2), strict monotonicity of the three N-rules for the Fibonacci function follows from the corresponding inequalities, of which the most interesting is the third, 4 · 2^N > 3 · 2^N. Condition (3) of Theorem 5.8.20 is easy to check from the definitions of the functions defined on ω. Hence this TRS is terminating. □

Notice that proving termination for the specification of a function like Fibonacci gives much more than just termination of the underlying algorithm, because it applies to terms with any number of occurrences of the function, in any combination with functions from the base rewriting system, to any level of nesting.

We next do an example with a conditional rule such that the method of Proposition 5.8.12 cannot be used, because the unconditional version of this CTRS fails to terminate; this example is the bubblesort algorithm.
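Before moving on, the Fibonacci inequalities can be spot-checked numerically. The sketch below assumes the weight interpretation ρ(f(t)) = 2^ρ(t), with numerals and + given their usual meanings (the natural choice for this example, stated here as an assumption):

```python
# Spot-check of the weight argument for FIBO, assuming rho(f(t)) =
# 2**rho(t) and the usual meanings of numerals and +: each of the three
# rules for f strictly decreases rho.
def rho_f(n):
    return 2 ** n          # weight of f(t), given rho(t) = n

assert rho_f(0) > 0        # eq f(0) = 0      : 2**0 > 0
assert rho_f(1) > 1        # eq f(s 0) = s 0  : 2**1 > 1

# eq f(s s N) = f(s N) + f(N) : 4 * 2**n > 2 * 2**n + 2**n = 3 * 2**n
for n in range(50):
    assert rho_f(n + 2) > rho_f(n + 1) + rho_f(n)
```

The third assertion is exactly the inequality 4 · 2^N > 3 · 2^N cited in the example, since 2^(N+2) = 4 · 2^N and 2^(N+1) + 2^N = 3 · 2^N.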
Example 5.8.22
Assume that the following specification for lists of natural numbers has been shown to be terminating as a TRS:

obj NATLIST is sorts Nat List .
  op 0 : -> Nat .
  op s : Nat -> Nat .
  op _<_ : Nat Nat -> Bool .
  op nil : -> List .
  op _._ : Nat List -> List .
  *** vars and eqs omitted ...
endo
Now add to this the following new operation and rule, which define the so-called bubblesort algorithm for sorting lists of naturals:

obj BSORT is pr NATLIST .
  op sort : List -> List .
  vars N M : Nat .
  var L : List .
  cq sort(N . (M . L)) = sort(M . (N . L)) if M < N .
endo
The conditional rewrite rule above switches two adjacent list elements iff they are out of order. We want to show that this hierarchical CTRS is terminating. Notice that the above equation without the condition is definitely not terminating; for example, the term sort(0 . (s 0 . nil)) can be rewritten to sort(s 0 . (0 . nil)), which can be rewritten back to the original term, etc., etc. Even though we only sketch the proof, the specification really needs to have an operation and equation such as

op sorted : List -> Bool .
cq sort(L) = L if sorted(L) .

to get rid of the sort function symbol when the list is finally sorted. However, the essence of bubblesort is the conditional rule in the BSORT module above.

The Σ-algebra structure of ω is defined by interpreting the operations on the naturals as themselves, interpreting true, false, and nil as 0, letting ω<(N, M) = N + M, letting ωsort(L) = L + 1, and letting ω.(N, L) = d(N, L) + d(L), where d is the "displacement" function, i.e., the number of pairs that are out of order, defined by

d(nil) = 0
d(N . L) = d(N, L) + d(L)
d(N, nil) = 0
d(N, M . L) = 1 + d(N, L) if N > M
d(N, M . L) = d(N, L) if N ≤ M

Proving strict monotonicity of the new rule depends on the lemma

d(N . (M . L)) = 1 + d(M . (N . L)) if M < N,

which is not hard to prove by case analysis. The strict monotonicity of ω. can be checked from the definition of d. It is easy to check the other monotonicity conditions for both rules and operations, and so we are done. By the way, we can actually write the above definition of d in OBJ and then define

eq sorted(L) = d(L) == 0 .  □

Exercise 5.8.3
Show that the equations defining d in Example 5.8.22 above are terminating, when viewed as rewrite rules over NATLIST. Hint: Show that the unconditional version is terminating with Theorem 5.8.20, and then apply Proposition 5.8.12. □
Exercise 5.8.4
Give OBJ proofs for the results of Example 5.8.22 and Exercise 5.8.3. □
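The displacement function d of Example 5.8.22 is just an inversion count, and the termination argument can be animated directly; in the plain-Python rendering below (list representation and function names are mine), each application of the BSORT rule decreases d by exactly one, which is the content of the lemma:

```python
# The displacement d of Example 5.8.22, rendered on Python lists: d
# counts the pairs that are out of order, and one application of the
# BSORT rule (swapping an adjacent out-of-order pair) decreases d by
# exactly one, which is the lemma behind the termination proof.
def d(lst):
    """Number of pairs (i, j) with i < j and lst[i] > lst[j]."""
    return sum(1 for i in range(len(lst))
                 for j in range(i + 1, len(lst)) if lst[i] > lst[j])

def bubble_step(lst):
    """Apply the conditional rule once; None if no pair is out of order."""
    for i in range(len(lst) - 1):
        if lst[i] > lst[i + 1]:
            out = list(lst)
            out[i], out[i + 1] = out[i + 1], out[i]
            return out
    return None

l = [3, 1, 2, 0]
while True:
    nxt = bubble_step(l)
    if nxt is None:
        break
    assert d(nxt) == d(l) - 1      # the measure strictly decreases
    l = nxt
assert l == [0, 1, 2, 3] and d(l) == 0
```

Since d is a natural number that strictly decreases with each rule application, rewriting must stop, and it stops exactly when d reaches 0, i.e., when the list is sorted.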
It should not be thought that proving termination of conditional term rewriting systems is always an easy task. While the results given in this subsection seem adequate for the most common examples, there are many others for which they are not. The following two examples constitute a somewhat entertaining partial digression on non-proofs of termination.
Example 5.8.23
We give two TRS's that are ground terminating separately but combine to give a TRS that is not; this is called "Toyama's example" [177]. We also prove that Theorem 5.8.20 could never be used to demonstrate the termination of this TRS.

obj B is sort S .
  ops 0 1 : -> S .
  op f : S S S -> S .
  var X : S .
  eq f(0,1,X) = f(X,X,X) .
endo

obj A is pr B .
  op g : S S -> S .
  vars X Y : S .
  eq g(X,Y) = X .
  eq g(X,Y) = Y .
endo
A term that demonstrates that this TRS is not ground terminating is

t = f(g(0,1), g(0,1), g(0,1)),

which rewrites first to f(0, g(0,1), g(0,1)), then to f(0, 1, g(0,1)), and then back to the initial term t. Note that the equation in B could also have been given as the conditional equation

cq f(X,Y,Z) = f(Z,Z,Z) if X == 0 and Y == 1 .
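The cycle can be traced concretely. In the sketch below, terms are encoded as nested Python tuples and each rewrite choice is applied by hand (this is only a trace of the cycle, not a rewrite engine):

```python
# Toyama's cycle, with terms as nested tuples and each rewrite choice
# made by hand.
def g(x, y):       return ("g", x, y)
def f(x, y, z):    return ("f", x, y, z)

t0 = f(g(0, 1), g(0, 1), g(0, 1))
t1 = f(0, g(0, 1), g(0, 1))          # g(X,Y) -> X in the first argument
t2 = f(0, 1, g(0, 1))                # g(X,Y) -> Y in the second argument
t3 = f(g(0, 1), g(0, 1), g(0, 1))    # f(0,1,X) -> f(X,X,X), X = g(0,1)
assert t3 == t0                      # back to the start: a cycle, so the
                                     # combined TRS is not terminating
```

The first two steps use the new rules of A and the last step uses the rule of B, which is exactly the pattern of inequalities exploited in the argument that follows.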
Now suppose we have ρ : T_Σ → ω (where Σ is the signature of A) that is weak ρ-monotone on the rule in B and strict ρ-monotone on the new rules of A, such that all operations in Σ are strict ρ-monotone. Let us write [t] for ρ(t). Then

[t] = [f(g(0,1), g(0,1), g(0,1))]
    > [f(0, g(0,1), g(0,1))]
    > [f(0, 1, g(0,1))]
    ≥ [t],

which is a contradiction. Therefore Theorem 5.8.20 could never be used to prove ground termination of this TRS (which is of course consistent with the fact that this TRS is not ground terminating). □

Example 5.8.24
Using the same technique as in Example 5.8.23, we sketch a proof that Theorem 5.8.20 cannot be used to prove ground termination of the specification for the greatest common divisor given below (which is essentially Euclid's algorithm), viewed as a hierarchical CTRS over some suitable terminating specification NAT of the natural numbers with subtraction and >, where ρ is defined homomorphically. Termination of this CTRS is proved in Example 5.8.32 using much more sophisticated methods.

obj GCD is pr NAT .
  op gcd : Nat Nat -> Nat .
  vars M N : Nat .
  eq gcd(M,0) = M .
  eq gcd(0,N) = N .
  cq gcd(M,N) = gcd(M - N, N) if M >= N and N > 0 .
  cq gcd(M,N) = gcd(M, N - M) if N >= M and M > 0 .
endo

The proof will be by contradiction, so we assume that there are a Σ-algebra structure on ω and a weight function ρ : T_Σ → ω satisfying all the conditions of Theorem 5.8.20, where A is GCD plus NAT, Σ is the signature of A, and B is NAT. We first prove a lemma, that
M(x, y) ≥ x for all x, y in ω, where M(x, y) denotes the function ω_−(x, y) on ω. The proof is by contradiction, so we suppose that there exist x, y such that M(x, y) < x, which yields an infinite strictly decreasing sequence; but this is impossible because ω is Noetherian (i.e., has no infinite strictly decreasing sequences). By the same reasoning, the analogous inequality holds for the function G(x, y) = ω_gcd(x, y).

Now we are ready for the main part of the proof, in which we write [t] for ρ(t), as well as M for ω_− and G for ω_gcd as above. Let x, y be natural number terms (i.e., ground terms in the base rewriting system NAT) with x > y. Then

[gcd(x, y)] > [gcd(x − y, y)]
            = G([x − y], [y])
            = G(M([x], [y]), [y])
            ≥ G([x], [y])
            = [gcd(x, y)],

which is a contradiction (the first step results from monotonicity when applying the first conditional rule, and the next to last step uses the lemma for M and then monotonicity for G). □

The final calculation in the above example suggests that the reason this termination proof method fails for gcd is the monotonicity requirement for operations combined with homomorphicity. Because these assumptions are not needed for Proposition 5.8.19, the possibility remains of applying something like that result directly, as is done in the next subsection.
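The rules of GCD compute Euclid's algorithm by repeated subtraction; a direct Python transcription (an illustrative sketch, not OBJ) mirrors the four rules, with the conditions N > 0 and M > 0 guarding the recursive calls:

```python
def gcd_sub(m, n):
    """gcd by repeated subtraction, one clause per rule of GCD."""
    if n == 0:                 # eq gcd(M,0) = M
        return m
    if m == 0:                 # eq gcd(0,N) = N
        return n
    if m >= n:                 # cq gcd(M,N) = gcd(M-N,N) if M >= N and N > 0
        return gcd_sub(m - n, n)
    return gcd_sub(m, n - m)   # cq gcd(M,N) = gcd(M,N-M) if N >= M and M > 0
```

Semantically, termination is clear because m + n strictly decreases at each recursive call; the point of the example above is that the homomorphic weight functions of Theorem 5.8.20 cannot express this, which is why Example 5.8.32 needs a more elaborate ordering.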
Exercise 5.8.5 Assuming a suitable terminating specification NAT for the natural numbers with inequality >, prove termination of the following CTRS for binary search trees:

obj BTREE is sort BTree .
  pr NAT .
  op empty : -> BTree .
  op make : BTree Nat BTree -> BTree .
  op insert : Nat BTree -> BTree .
  vars T1 T2 T3 : BTree .
  vars N M : Nat .
  eq insert(M,empty) = make(empty,M,empty) .
  cq insert(M,make(T1,N,T2)) = make(insert(M,T1),N,T2) if N > M .
  cq insert(M,make(T1,N,T2)) = make(T1,N,insert(M,T2)) if M > N .
endo

□

(⋆) Noetherian Orderings

This subsection develops the remark after Example 5.5.4 that it is useful to allow weight functions that take values in Noetherian partial orderings other than ω (see Appendix C for a review of partially ordered sets, also called posets), where a poset is Noetherian (also called well founded) iff it has no infinite sequence of strictly decreasing elements. The key observation is that Proposition 5.8.19 and Theorem 5.8.20 generalize to any Noetherian poset, because their proofs depend only on the Noetherian property; note also that a different Noetherian ordering could be used for each sort, since we are really dealing with a sorted set of posets. As with ω, the key intuition is that rewrites should strictly decrease weight. Some examples, including the greatest common divisor as computed by Euclid's algorithm, need rather complicated orderings. To help with this, we introduce some ways to build new orderings out of old ones, such that if the old orderings are Noetherian then so are the new ones. Unfortunately, much of the material in this subsection is rather technical.

Definition 5.8.25 Let P, Q be posets, with both their orderings denoted ≥. Then their (Cartesian) product poset, denoted P × Q, has as its elements the pairs (p, q) with p ∈ P and q ∈ Q, and has (p, q) ≥ (p′, q′) iff p ≥ p′ and q ≥ q′.
Their lexicographic product, here denoted P ⋉ Q, again has as elements the pairs (p, q) with p ∈ P and q ∈ Q, but now ordered by (p, q) ≥ (p′, q′) iff p > p′, or else p = p′ and q ≥ q′. To avoid confusion with pairs (p, q) ∈ P × Q, we will hereafter use the notation p ⋉ q for elements of P ⋉ Q. The sum of posets P₁, P₂, denoted P₁ + P₂, has as its elements pairs (i, p) with p ∈ Pᵢ for i = 1, 2, ordered by (i, p) ≥ (i′, p′) iff i = i′ and p ≥ p′ in Pᵢ. A poset Q is a subposet of a poset P iff Q ⊆ P and q ≥ q′ in Q iff q ≥ q′ in P, for all q, q′ ∈ Q. □

The following result is rather straightforward to prove:

Proposition 5.8.26 If P, Q are both Noetherian posets, then so are P × Q, P ⋉ Q and P + Q. Moreover, the discrete ordering on any set X, defined by x ≥ y iff x = y, is also a Noetherian poset, and any subposet of a Noetherian poset is Noetherian. □

Example 5.8.27 Motivated by the applications of term rewriting to verifying hardware circuits that are developed in Section 7.4, a system T of Σ(X)-equations is said to be triangular iff X is finite, there is a subset of X called input variables, say i₁, ..., iₙ, and there is an ordering of the non-input variables, say p₁, ..., pₘ, such that the equations in T have the form

p_k = t_k(i₁, ..., iₙ, p₁, ..., p_{k−1}) for k = 1, ..., m,

where each t_k is a Σ(X)-term involving only input variables and those non-input variables p_j with j < k (in particular, t₁ must contain only input variables).

We first prove that any triangular system T is terminating as a TRS. Let P = ⋉ᵐᵢ₌₁ ω, the m-fold lexicographic product of ω with itself, and define ρ : T_Σ → P by letting ρ(t) = (ℓₘ, ...
, ℓ₁), where ℓ_k is the number of occurrences of p_k in t. (If P₁ and P₂ are disjoint, then the elements of P₁ + P₂ can be taken as just those in P₁ ∪ P₂; the purpose of the construction with the pairs (i, p) is just to enforce disjointness in case P₁, P₂ were not already disjoint.) Rewriting a Σ(X)-term t with any equation in T will decrease ρ(t), because it will decrease the number ℓ_k of occurrences of the non-input variable p_k in the rule's leftside by one, while possibly increasing the numbers ℓ_j of occurrences of variables p_j with j < k. Therefore T is terminating by Proposition 5.5.1 generalized to Noetherian posets.

Next, we use the Newman Lemma (Proposition 5.7.4) to show that T is Church-Rosser, by proving that the local Church-Rosser property holds. For this purpose, we first note that if a Σ(X)-term t can be rewritten in two distinct ways, it must have the form t₀(z₁ ← p_i, z₂ ← p_j), where z₁, z₂ are distinct new variables, each occurring just once in t₀. To prove this, pick one of the rewrites and note that, since its redex is a non-input variable, t must have the form t′(z₁ ← p_i). Because there is just one rule for each non-input variable, the redex for the second rewrite is disjoint from that for the first, so that t′, and hence t, has the form t₀(z₁ ← p_i, z₂ ← p_j), for which we will use the shorter notation t₀(p_i, p_j). It now follows that the two rewrites have the forms t ⇒ t₀(t_i, p_j) and t ⇒ t₀(p_i, t_j). Therefore each target term can be rewritten to t₀(t_i, t_j) by applying the other rule once. We now conclude that any triangular system is canonical.

Finally, we show that the only variables that can occur in a normal form of a triangular system are input variables, by proving the contrapositive: if a term t contains a non-input variable, then it can be rewritten using the rule with that variable as its leftside, and hence it is not reduced.
□

A more complex construction of a new Noetherian poset from an old one is given by multisets. Intuitively, multisets generalize ordinary sets by allowing elements to occur multiple times. A multiset is often defined to be a function A : D → ω₊, where D is the domain and A(d) is the multiplicity of d ∈ D. It is common to use set notation for multisets, so that for example the multiset denoted by {1, 1, 2} would have D = {1, 2}, with A(1) = 2 and A(2) = 1, indicating two instances of 1 and one of 2. Then the most natural notion of a submultiset of A would be a subset D′ of D and a function A′ : D′ → ω₊ such that A′(d) ≤ A(d) for all d ∈ D′; for example, {1, 2} ≤ {1, 1, 2}.

However, this approach is inadequate for our applications, which require multisets of elements drawn from a Noetherian poset P, with an ordering such that, for example where P is ω with the usual ordering, {1} < {2} < {3} and {1, 1} < {1, 2}. Also, in term rewriting theory, the phrase "multiset ordering" usually refers to an ordering that allows even more possibilities, such as {1, 1, 1} < {2} and {1, 1, 1, 1} < {2, 2}; but because our applications do not need this extra sophistication, we will develop only a somewhat simplified special case.

Our mathematical formulation of multisets involves a possibly surprising reversal of the approach sketched above, in that we dispense with ω₊, and instead rely on abstract sets whose elements represent instances of elements of P. For example, {1, 1, 2} is represented by the function A : {x, y, z} → P with A(x) = 1, A(y) = 2, and A(z) = 1.

Definition 5.8.28 Given a poset P with an ordering ≥, a multiset over P is a function A : X → P with underlying set X; call a multiset A : X → P finite iff its underlying set is finite; the empty multiset, denoted ∅, has the empty underlying set. Given multisets A : X → P and B : Y → P, define A ≥ B iff there is an injective function f : Y → X such that A(f(y)) ≥ B(y) for all y ∈ Y.
Let M(P) denote the class of all finite multisets, where all multisets A, B such that A ≥ B and B ≥ A are identified. (In order to avoid set-theoretic worries, it is desirable to restrict the underlying sets that are used, for example, to finite subsets of ω; so technically speaking, we have an ordering on the quotient set.) □

Exercise 5.8.6 Prove the following, where P is ω with the usual ordering:

{1, 1} > {1} > ∅
{1, 2} > {1, 1} > {1}
{2, 2} > {2, 1} > {1, 1},

where (as in Appendix C) A > B means A ≥ B and A ≠ B (which, because of the equivalence on multisets, means A ≥ B and not B ≥ A). However, it is not possible to show (for example) that {2, 2} > {1, 1, 1, 1}, which would be required by the more usual and powerful multiset ordering. □

Although this multiset ordering is weaker than the usual one, it is easier to reason about, and it is sufficient for the applications in this chapter. (To obtain the multiset ordering that is more usual in term rewriting, the restriction to injective functions should be relaxed to asserting of f : Y → X that if f(y) = f(y′) with y ≠ y′ then A(f(y)) > B(y) and A(f(y′)) > B(y′).)

Proposition 5.8.29 If P is a Noetherian poset, then so is M(P).

Proof: Reflexivity is easy. For anti-symmetry, use the lemma that A ≥ B and B ≥ A iff there is some bijective f : Y → X such that A(f(y)) = B(y) for all y ∈ Y. For transitivity, given A ≥ B ≥ C with underlying sets X, Y, Z and injections f : Z → Y and g : Y → X, then f ; g : Z → X is also injective and satisfies A(g(f(z))) ≥ B(f(z)) ≥ C(z) for all z ∈ Z.

For the Noetherian property, suppose that A₁ > A₂ > ··· > Aₙ > ··· is an infinite strictly decreasing sequence, where Aᵢ has underlying set Xᵢ. This gives rise to an infinite sequence of injections X₁ ← X₂ ← ··· ← Xₙ ← ···. Then because X₁ is finite, there must exist some n such that (up to isomorphism) Xₙ = X_{n+k} for all k ≥ 1. Then for each
k ≥ 1 there is some x ∈ Xₙ such that A_{n+k}(x) > A_{n+k+1}(x). But for each particular x ∈ Xₙ, there can only be a finite number of such k, because P is Noetherian. Now because Xₙ is finite, there can only be a finite number of pairs (k, x) such that the above inequality holds, which contradicts our initial assumption. □

We now make one further identification, of p ∈ P with {p} ∈ M(P), noting that p ≥ p′ in P iff {p} ≥ {p′} in M(P), so that the inclusion map P ⊆ M(P) is order preserving. Therefore, defining M^{n+1}(P) = M(M^n(P)) with of course M¹(P) = M(P), we get

P ⊆ M(P) ⊆ M²(P) ⊆ ··· ⊆ M^n(P) ⊆ ···,

and can therefore form the union of all these to get M^ω(P) = ∪ₙ M^n(P), the union ordering on which is called the nested multiset ordering.

Fact 5.8.30 The nested multiset ordering M^ω(P) = ∪ₙ M^n(P) is Noetherian if P is.

Proof: Each M^n(P) is Noetherian by induction using Proposition 5.8.29, and each element of the union lies in M^n(P) for some least n, as do all elements less than any given element. (Those who know some category theory may recognize this as a colimit construction; the result just proved is that this colimit of an increasing sequence of Noetherian posets is Noetherian. We also note that the product and sum constructions of Definition 5.8.25 are the categorical product and coproduct.) □

Similar constructions are used in Exercise 5.8.8 and Example 5.8.32 below.

Exercise 5.8.7 Given a poset P and an equivalence relation ≡ on the carrier of P (which is also denoted P), let P/≡ denote the set P/≡ ordered by [p] ≤ [q] for p, q ∈ P iff p′ ≤ q′ for some p′ ≡ p and q′ ≡ q. Show that if P/≡ is a poset, and if it has only a finite number of non-trivial equivalence classes, then it is Noetherian if P is. Give an example showing that the hypothesis about a finite number of non-trivial equivalence classes is necessary. Hint: Let P have a₁ > a₂, a′₂ > a₃, a′₃ > a₄, ..., and then identify aᵢ with a′ᵢ for i = 2, 3, 4, ....
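The simplified multiset ordering of Definition 5.8.28 is easy to test by brute force; the following Python sketch (illustrative only, with finite multisets over ω represented as lists) searches for an injection f with A(f(y)) ≥ B(y), and reproduces comparisons in the style of Exercise 5.8.6:

```python
from itertools import permutations

def ms_ge(A, B):
    """A >= B iff some injection maps each element of B to a
    distinct element of A that dominates it (Definition 5.8.28)."""
    if len(B) > len(A):
        return False          # no injection Y -> X can exist
    return any(all(A[p[i]] >= B[i] for i in range(len(B)))
               for p in permutations(range(len(A)), len(B)))

def ms_gt(A, B):
    """Strict order: A >= B but not B >= A."""
    return ms_ge(A, B) and not ms_ge(B, A)

assert ms_gt([1, 1], [1]) and ms_gt([1], [])
assert ms_gt([2, 2], [2, 1]) and ms_gt([2, 1], [1, 1])
# unlike the usual multiset ordering, {2,2} > {1,1,1,1} fails here:
assert not ms_ge([2, 2], [1, 1, 1, 1])
```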
□

Exercise 5.8.8 Given a poset P, let ⊥ be a new element not already in P, and let P_⊥ denote the poset having underlying set P ∪ {⊥} with the ordering of P plus ⊥ < p for all p ∈ P. Show that P_⊥ is Noetherian if P is. Now given a Noetherian poset P with a unique least element ⊥, form ⋉P = P ⋉ P, and identify p ∈ P with the element p ⋉ ⊥ ∈ ⋉P, so that there is an order-preserving inclusion P ⊆ ⋉P. Iterate this to obtain ⋉ⁿP ⊆ ⋉ⁿ⁺¹P, and note that each ⋉ⁿP is Noetherian by induction and Proposition 5.8.26. Now form ⋉^ωP = ∪ₙ ⋉ⁿP, show that ⋉^ωP corresponds to the usual lexicographic ordering on the set of finite strings from P, and give an example showing that ⋉^ωP in general is not Noetherian. Hint: If b > a, then b > ab > aab > aaab > ···. □

A straightforward generalization of Proposition 5.8.19 requires defining a weight function ρ : T_Σ → P where P is a Noetherian poset, and then showing that each new rewrite is strict ρ-monotone and each old rewrite is weak ρ-monotone. A less straightforward generalization weakens the assumption that P is Noetherian to assuming that each particular item, and everything to which it can be rewritten, lies within some Noetherian subposet of P.

Proposition 5.8.31 Let A be an ARS on an (S-indexed) set T, let B be a terminating "base" ARS contained in A, let N denote the "new" rewrites of A on T (i.e., →_N = →_A − →_B), and let P be a poset.
Then A is terminating if there is a function ρ : T → P such that

(1) if t →_B t′ then ρ(t) ≥ ρ(t′),
(2) if t →_N t′ then ρ(t) > ρ(t′), and
(3) P is Noetherian, or if not, then for each t ∈ T_s there is a Noetherian poset P_{ts} ⊆ P_s such that t →*_A t′ implies ρ(t′) ∈ P_{ts}.

(When T is S-indexed, P is really a family {P_s | s ∈ S} of posets, although in practice they are often all the same.)

Proof: By exactly the same reasoning that was used for Proposition 5.8.19. □

Example 5.8.32 (⋆) We show termination of the GCD CTRS of Example 5.8.24 using Proposition 5.8.31 with a rather complex ordering. To define this ordering, for a given poset P, let

N(P) = (P × P) + ((ω × ω) ⋉ (P × P)) + M(P),

with its ordering given by Definition 5.8.25. Then N(P) is Noetherian if P is, by Propositions 5.8.26 and 5.8.29, and because P ⊆ M(P) is an order-preserving inclusion, so is P ⊆ N(P). Therefore, defining N^{n+1}(P) = N(N^n(P)) with N⁰(P) = P, we have that each N^n(P) is Noetherian if P is. However, the union N^ω(P) of the chain

P ⊆ N(P) ⊆ N²(P) ⊆ ··· ⊆ N^n(P) ⊆ ···

is in general not Noetherian, for the reasons considered in Exercise 5.8.8.

The case in which we are most interested takes P = ω_⊥. For each N^n(ω_⊥), identify p ∈ P with p ⋉ ⊥ ∈ ⋉P, and with p ⋉ ⊥ ⋉ ⊥ ∈ ⋉²P, etc., recursively as in Exercise 5.8.8, and also identify ⊥ in P with (⊥, ⊥) in (P × P) and with ∅ in M(P); then the result of these identifications is Noetherian for each n, by Exercise 5.8.7. Finally, for all p, q, add the inequalities

1. (p, q) > p, q if p, q ≠ ⊥
2. (m, n) ⋉ (p, q) > p, q if p, q ≠ ⊥.

To simplify notation, denote the resulting poset at level n by N_n and the union by N; let N₀ = {∅}.
We leave the reader to check that the above new inequalities do not violate the poset axioms or the Noetherian condition for each N_n.

In order to apply Proposition 5.8.31, let T be T_Σ where Σ is the total signature of GCD, let A consist of all rewrites induced on T by the rules in GCD, let B consist of all rewrites induced on T by the rules in NAT, and let N = A − B. We must define a weight function ρ : T → N such that each rewrite in N is strict ρ-monotone, such that each rewrite in B is weak ρ-monotone, and such that for each Σ-term t, everything to which it can be rewritten lies within some fixed Noetherian subposet N_t of N. We take N_t to be the poset that was denoted N_n above, with n = d, where d is the maximum depth of nesting of gcd's inside of t. In the following, we let T_d denote the set of ground Σ-terms of maximum nesting depth not greater than d, we let Σ′ denote the signature of NAT, and we let g abbreviate gcd. Notice that for any d, rewriting on T_d with A always remains within T_d, because none of the rules in GCD or NAT can increase the depth of nesting of gcd's in terms.

We give a recursive definition for ρ, in which a subterm of t ∈ T is called top if it is a maximal subterm of t having g as its head, and as before we write [t] for ρ(t):

(a) [n] = ∅ (the empty multiset) if n is a Σ′-term
(b) [g(t, t′)] = (m, n) ⋉ ([t], [t′]), where t, t′ reduce to Peano terms m, n
(c) [t] = {[t₁], ..., [tₙ]} if t is not top and t₁, ..., tₙ are its top subterms.

By a "Peano term" we mean a term of the form s ... s 0.
We can show that rewriting on T_{Σ′} always terminates with such a term using an argument like that given in Example 5.5.5, and then we can show that B is terminating on T_Σ with an argument like that given in Theorem 5.8.20 on page 136, noting that Σ is non-void and using Proposition 5.3.4. To apply (b), we need to know that the Σ-terms t and t′ reduce to Peano terms under A, whereas in general we don't even know whether rewriting with A terminates for arbitrary Σ-terms. Therefore we should demonstrate termination with a Peano term result along with conditions (1) and (2) of Proposition 5.8.31, as part of our induction on the maximum depth of nesting of gcd's in Σ-terms. (Generalizing the proof in Example 5.8.24 to poset weights shows that the generalization of Proposition 5.8.19 to poset weights, stated below as Theorem 5.8.33, cannot be made to work for this example.)

Our induction hypothesis is the conjunction of four subsidiary hypotheses: (A_d) rewriting on T_d with B preserves weight; (B_d) rewriting on T_d with N is strict monotone; (C_d) the weights of terms in T_d always lie in N_d; and (D_d) rewriting on T_d with A always terminates with a Peano term. Notice that (A_d) implies that rewriting with B is weak monotone.

The base case takes d = 0 and t ∈ T₀ = T_{Σ′}. By (a) of the definition of ρ, we have [t] = ∅; therefore rewrites with old rules are weight preserving, and weights remain within N₀ = {∅} because rewriting remains within T₀. Also, because no new rules can be applied to t ∈ T_{Σ′} and no operations from Σ − Σ′ can be introduced by rewriting Σ′-terms, rewrites using new rules are vacuously strict monotone, because there aren't any.

The induction step assumes the four induction hypotheses (A_d, B_d, C_d, D_d) for some d ≥ 0. We first prove a preliminary lemma, which says that any rewrite induced by applying a new rule at the top of a term in T_{d+1} is strict monotone.
For the first rule, g(M, 0) = M: because any t ∈ T_d reduces to a Peano term (say) m by (D_d), we get [g(t, 0)] = (m, 0) ⋉ ([t], ⊥), while for the rightside we get just [t]. Therefore [g(t, 0)] > [t] by inequality 2. The argument for the second rule, g(0, N) = N, is the same. Of the two conditional rules in GCD, we check only the first, because the second follows the same way. This rule is g(M, N) = g(M − N, N) if M ≥ N and N > 0. By (D_d), t, t′ reduce to Peano terms, say m, n; then [g(t, t′)] = (m, n) ⋉ ([t], [t′]), while for the rightside we have [g(t − t′, t′)] = (m − n, n) ⋉ ([t − t′], [t′]), and the desired inequality follows because n > 0, so that (m, n) > (m − n, n), and hence (m, n) ⋉ ([t], [t′]) > (m − n, n) ⋉ ([t − t′], [t′]), by the definitions of the product and lexicographic orderings.

The induction step for the first two inductive assertions has two cases. The first case considers t = g(t₁, t₂) with t₁, t₂ ∈ T_d. Then by (b), [t] = (n₁, n₂) ⋉ ([t₁], [t₂]), where the reduced forms of t₁, t₂ are respectively n₁, n₂, which are Peano terms by (D_d). For assertion (A_{d+1}), any application of an old rule is weight preserving because it rewrites either t₁ or t₂, which preserves the weight of t by (A_d). For assertion (B_{d+1}), any application of a new rule at the top is strict monotone by our lemma, and otherwise is strict monotone by (B_d).

The second case of the induction step considers a Σ-term t having depth d + 1 of the form t₀(g₁, ..., g_k), where k > 0, each g_i has the form g(t_{i,1}, t_{i,2}) with t_{i,j} ∈ T_d, and t₀ ∈ T_{Σ′}({z₁, ..., z_k}). Then [g_i] = (n_{i,1}, n_{i,2}) ⋉ ([t_{i,1}], [t_{i,2}]) as in the first case, and so we have

[t] = {(n_{1,1}, n_{1,2}) ⋉ ([t_{1,1}], [t_{1,2}]), ...
, (n_{k,1}, n_{k,2}) ⋉ ([t_{k,1}], [t_{k,2}])}

by (c). For assertion (A_{d+1}), once again any application of an old rule preserves the weight of t, because it either rewrites some t_{i,j}, which preserves the weight of t because it preserves the weight of g_i by (A_d), or else it rewrites within t₀, which also preserves the weight of t, because of (c). For assertion (B_{d+1}), any application of a new rule is strict monotone on any such t, either by (B_d), or else by the lemma.

Finally, we consider the remaining two inductive assertions. For (C_{d+1}), when rewriting with A on T_{d+1}, all weights remain within N_{d+1} = N(N_d) because of the form of [t] and (C_d). For (D_{d+1}), rewriting always terminates for terms in T_{d+1} by Proposition 5.8.31 with T = T_{d+1} and P = N_{d+1}, plus the induction hypotheses; moreover, the result must be a Peano term because of the form of [t], using (D_d) and (a).

It now follows that (A_d, B_d, C_d, D_d) hold for every d ≥ 0, and in particular that rewriting on any ground Σ-term t ∈ T = ∪_d T_d necessarily terminates with a Peano term as its result. □

The main result of Section 5.8.2, Theorem 5.8.20 on page 136, generalizes to Noetherian orderings, though we do not use this result in this book:

Theorem 5.8.33 Let (Σ, A) be a CTRS with Σ non-void, let (Σ, B) be a terminating sub-CTRS of (Σ, A), let P be a poset, and let N = A − B. If there is a function ρ : T_{Σ,B} → P such that

(1) each rule in B is weak ρ-monotone,
(2) each rule in N is strict ρ-monotone,
(3) each operation in Σ is strict ρ-monotone, and
(4) P is Noetherian, or at least, for each t ∈ (T_{Σ,B})_s there is some Noetherian poset P_{ts} ⊆ P_s such that t ⇒*_A t′ implies ρ(t′) ∈ P_{ts},

then (Σ, A, B) is ground terminating. □

The proof depends on the straightforward generalization from ρ : T_Σ → ω to ρ : T_Σ → P of results in Section 5.8.2, and hence is omitted here.
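The hint in Exercise 5.8.8 can be checked directly: Python's built-in string comparison is the usual lexicographic ordering, and with b > a it exhibits the infinite strictly decreasing chain b > ab > aab > ···, so that ordering is not Noetherian (an illustration, not from the text):

```python
# Lexicographic order on strings over {a, b}, with 'a' < 'b'.
# Prepending copies of the smaller letter keeps strictly decreasing:
chain = ['b', 'ab', 'aab', 'aaab', 'aaaab']
for hi, lo in zip(chain, chain[1:]):
    assert hi > lo   # Python compares strings lexicographically

# The chain extends forever, so there is no lower bound:
assert all('a' * k + 'b' > 'a' * (k + 1) + 'b' for k in range(100))
```

This is exactly why the constructions above stop at each finite level N_n rather than using the full union N, which need not be Noetherian.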
Proving Church-Rosser

When applying a conditional rule t = t′ if u == v over some theory A in OBJ, it is possible that a term r exists such that θ(u) ⇒*_A r and θ(v) ⇒*_A r, but rewriting does not find this r, because θ(u) and θ(v) reduce to different normal forms. In fact, it cannot be guaranteed that OBJ will always evaluate a condition u == v to true when θ(u) ↓ θ(v), unless the set of rules that can be used for evaluating conditions is both Church-Rosser and terminating. This provides some motivation for checking the Church-Rosser property in the conditional case. But as we continue to emphasize, because of soundness, any rewriting computation is a proof; so if you do get the result that you want, then you have proved the result that you wanted to prove, whether or not the CTRS is terminating or Church-Rosser. In our experience, practical examples can usually be handled without bothering to check canonicity, though of course it is comforting.

This subsection extends the techniques presented for proving confluence in Section 5.6 to the conditional case. Many basic results extend just because they follow directly from the corresponding results about ARS's, as discussed at the beginning of Section 5.8. The situation for Proposition 5.3.6 is slightly different; its proof, which involves passing to two different ARS's, generalizes to the conditional case without any change.

Proposition 5.8.34 Given a sort set S, let X^ω_S be the ground signature with (X^ω_S)_s = {x_{is} | i ∈ ω} for each s ∈ S. Then a CTRS (Σ, A) is Church-Rosser iff the CTRS (Σ(X^ω_S), A) is ground Church-Rosser. Similarly, (Σ, A) is locally Church-Rosser iff the CTRS (Σ(X^ω_S), A) is ground locally Church-Rosser. □

Example 5.8.35 The analog of Proposition 5.8.12 for confluence is not true, not even for orthogonal CTRS's. Let C be the CTRS corresponding to the following equations (with x a variable):

f(x) = a if x = f(x)
b = f(b).

Then b ⇒ f(b) ⇒ a and f(b) ⇒* f(a).
However, it is not true that f(a) ↓ a; hence C is not Church-Rosser. However, C_U is Church-Rosser. Note also that C is orthogonal. (This example is due to Bergstra and Klop [10].) □

The Orthogonality Theorem (Theorem 5.6.4) can be generalized to CTRS's by generalizing the notion of non-overlapping, but we do not do so here, because orthogonality is a rather strong property, and in any case the generalization of the Newman Lemma handles most examples of practical interest; to maximize practicality, we give a hierarchical version that allows checking the key properties "incrementally," that is, one level at a time, as was previously done for termination.

Proposition 5.8.36 Let A be a terminating Σ-CTRS, let B be a "base" Church-Rosser CTRS contained in A, and let N = A − B. Then A is Church-Rosser (and hence canonical) if:

(1) N is locally Church-Rosser, i.e., if t ⇒_N t₁ and t ⇒_N t₂ then there is some t′ such that t₁ ⇒*_N t′ and t₂ ⇒*_N t′; and
(2) if t ⇒_B t₁ and t ⇒_N t₂ then there is some t′ such that t₁ ⇒*_N t′ and t₂ ⇒*_B t′.

Proof: (1) and (2) imply that A is locally Church-Rosser, and then the Newman Lemma gives the full Church-Rosser property. □

Note that by the original Newman Lemma, it would be equivalent to assume that B is locally Church-Rosser. Condition (2) is a local version of the Hindley-Rosen property from Proposition 5.7.5. We now give some applications of the above result.

Example 5.8.37 We first consider the maximum function of Example 5.8.13, which has already been shown terminating. Assume that the natural number part of this CTRS, here denoted B, has been shown Church-Rosser. Then it remains to check conditions (1) and (2). Condition (1) can be checked completely mechanically by using the Knuth-Bendix algorithm [117] (see Chapter 12), and condition (2) can be checked by a variant of the same algorithm. But here we give a rather informal argument, which will serve to motivate the more formal developments of Chapter 12. The idea is to determine which rules could give rise to the two given rewrites, and then show the existence of a suitable t′ for each such case. Note that unless the two rewrites overlap, it is straightforward to see that t′ exists.

We will abbreviate max by just m in the detailed arguments below. For (1), we suppose that t ⇒_N t₁ and t ⇒_N t₂. The only way that two new rules can overlap is if the redex has the form m(u, u) for some Σ-term u with t₁ = t₂ = u, so that we have t₁ ⇒*_N t′ and t₂ ⇒*_N t′ with t′ = u. For (2), we consider t ⇒_B t₁ and t ⇒_N t₂. But it is impossible for a new rule to overlap with a base rule in this specification, because the leftsides of the rules in B and N have disjoint function symbols; so there is nothing to check. Thus Proposition 5.8.36 implies that this specification is Church-Rosser, and hence canonical.

Now we consider the greatest common divisor function of Example 5.8.24, which was already shown terminating in Example 5.8.32. For (1), note that in this specification there is no overlap between new rules, so there is nothing to check. Similarly for (2), there is no overlap between new and old rules, because the leftsides involve disjoint function symbols, and so again there is nothing to check. Therefore Proposition 5.8.36 shows that this specification is Church-Rosser, and hence canonical. □

There is also a version of Proposition 5.8.36 that does not assume termination, but instead requires stronger confluence conditions; however, we are usually less interested in the Church-Rosser property if termination does not hold, so we do not give this result here.

Exercise 5.8.9 Given a CTRS C, let C_U be the TRS whose rules are those of C with their conditions (if any) removed. Then C is Church-Rosser (or ground Church-Rosser) if C_U is.
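The Newman-Lemma reasoning behind Proposition 5.8.36 can be illustrated on a small finite abstract rewriting system; this Python sketch (not from the book) computes normal forms by exhaustive search and confirms that a terminating, locally confluent relation gives every element a unique normal form:

```python
def normal_forms(x, step):
    """All normal forms reachable from x, where step(x) is the set
    of one-step rewrites of x (the relation is assumed terminating)."""
    nexts = step(x)
    if not nexts:
        return {x}
    return set().union(*(normal_forms(y, step) for y in nexts))

# A terminating, locally confluent ARS on {0, 1, 2, 3}:
#   3 -> 1, 3 -> 2, 1 -> 0, 2 -> 0
edges = {3: {1, 2}, 1: {0}, 2: {0}, 0: set()}
step = lambda x: edges[x]

# Local confluence holds (the peak 1 <= 3 => 2 is joined at 0), and
# every element has a unique normal form, as Newman's Lemma predicts:
assert all(len(normal_forms(x, step)) == 1 for x in edges)
assert normal_forms(3, step) == {0}
```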
□

(⋆) Relation between Abstract and Term Rewriting Systems

This section describes a relationship between abstract rewriting systems and term rewriting systems, including a construction of each from the other, and proves that these two constructions are the best possible with respect to each other, in a sense that is made precise by saying that they form an "adjoint pair of functors." However, the notion of adjointness is not needed to understand our statement of this result, and in fact no category theory at all is used in this section, although we do mention categories and functors in some exercises. (A category is just a collection of objects, such as TRS's, and maps between them, also often called morphisms, satisfying certain axioms. There are many places to learn about these concepts, including [91, 6] and [126]; [63] discusses the intuitive meanings of these and other categorical concepts.)

We can say much more about the relationship between TRS's and ARS's than in Section 5.7 after we introduce morphisms of TRS's and ARS's. First, we add a little more information to TRS's, by including a sort s from the signature. Then TRS morphisms are interpretations (in the sense of Definition 4.10.4) that preserve the designated sort and all one-step rewrites.

Definition 5.9.1 A TRS morphism (Σ, A) → (Σ′, A′) is a signature morphism h : Σ → Der(Σ′) such that whenever t₁ ⇒_A t₂ then h̄(t₁) ⇒_{A′} h̄(t₂), where h̄ is as defined just after Definition 4.10.3: the unique Σ-homomorphism to T_{Σ′} obtained by looking at T_{Σ′} first as a Der(Σ′)-algebra, and then, through h, as the reduct Σ-algebra hT_{Σ′} (see page 84 for this). Given TRS morphisms h : (Σ, A) → (Σ′, A′) and h′ : (Σ′, A′) → (Σ″, A″), their composition h ; h′ : (Σ, A) → (Σ″, A″) is defined to be their composition as derivors in the sense of Definition 4.10.5. □

Exercise 5.9.1 Show that the composition of TRS morphisms is a TRS morphism. Use Exercise 4.10.5 to show that i_Σ : Σ → Der(Σ) (sending σ ∈ Σ_{w,s} to the term σ(x₁, ...
, xₙ) ∈ T_Σ(X_w)_s) is the identity for TRS morphism composition. Show that TRS morphism composition is associative (whenever the compositions involved are defined). These results show that TRS's form a category;¹ let us denote it TRS. ∎

Definition 5.9.2 An ARS morphism (S, T, →) → (S′, T′, →′) is a pair (f, g) where f: S → S′ and g_s: T_s → T′_{f(s)} for each s ∈ S, such that t₁ → t₂ implies g(t₁) →′ g(t₂). Given ARS morphisms (f, g): (S, T, →) → (S′, T′, →′) and (f′, g′): (S′, T′, →′) → (S″, T″, →″), their composition is the pair (f;f′, {g_s; g′_{f(s)} | s ∈ S}): (S, T, →) → (S″, T″, →″). ∎

¹A category is just a collection of objects (such as TRS's) and maps (also often called morphisms) between them, such that certain axioms are satisfied. There are many places to learn about these concepts, including [91, 6] and [126]; [63] discusses the intuitive meanings of these and other categorical concepts.

Exercise 5.9.2 Show that the composition of ARS morphisms is an ARS morphism. Show that the pair of identity maps (1_S, 1_T) serves as an identity for (S, T, →) under ARS morphism composition. Show that ARS morphism composition is associative (when the compositions involved are defined). These results show ARS's form a category; let us denote it ARS. ∎

Now we are ready for the first of our two main constructions:

Definition 5.9.3 Let R send a TRS (Σ, A) to the ARS (T_Σ, ⇒_A), and send a TRS morphism h: (Σ, A) → (Σ′, A′), i.e., h: Σ → Der(Σ′), to R(h) = (f, h̄): T_Σ → T_{Σ′}, where f is the sort component of the signature morphism h.
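On finite data, the conditions of Definition 5.9.2 can be checked mechanically. The encoding below is ours, not the book's: an ARS is a triple of a sort set, a sorted carrier, and a set of steps (sort, t₁, t₂); a morphism is the pair (f, g) and must send every step to a step.

```python
def is_ars_morphism(A, B, f, g):
    """Check the ARS morphism conditions for (f, g): A -> B on finite data."""
    sorts_A, carriers_A, rel_A = A
    sorts_B, carriers_B, rel_B = B
    for s in sorts_A:
        if f[s] not in sorts_B:
            return False
        # g_s must land in the carrier of the image sort f(s)
        if any(g[s][t] not in carriers_B[f[s]] for t in carriers_A[s]):
            return False
    # every step t1 -> t2 must be sent to a step g(t1) ->' g(t2)
    return all((f[s], g[s][t1], g[s][t2]) in rel_B for (s, t1, t2) in rel_A)

A = ({'n'}, {'n': {0, 1, 2}}, {('n', 2, 1), ('n', 1, 0)})
B = ({'m'}, {'m': {'a', 'b'}}, {('m', 'b', 'a'), ('m', 'a', 'a')})
f = {'n': 'm'}
g = {'n': {0: 'a', 1: 'a', 2: 'b'}}
assert is_ars_morphism(A, B, f, g)
# collapsing 2 -> 1 onto a non-step breaks the condition:
assert not is_ars_morphism(A, B, f, {'n': {0: 'b', 1: 'a', 2: 'a'}})
```

The same checker, with the membership test replaced by reachability in rel_B, gives the generalized morphisms of Exercise 5.9.5, where a step may map to a many-step rewrite.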
∎

Exercise 5.9.3 Show that R(h) is an ARS morphism, and that R: TRS → ARS preserves composition and identities, i.e., that it is a functor. Hint: To show that R(h) is an ARS morphism, check that t₁ ⇒_A t₂ implies h̄(t₁) ⇒_{A′} h̄(t₂). ∎

Here is our second construction:

Definition 5.9.4 Let F send an ARS (S, T, →) to (Σ^T, A_→), where Σ^T is defined by Σ^T_{[],s} = T_s and Σ^T_{w,s} = ∅ for all other w, s, and where A_→ contains a rewrite rule (∀∅) t₁ = t₂ iff t₁ → t₂ in (S, T, →). Also, if (f, g): (S, T, →) → (S′, T′, →′) is an ARS morphism, then define F(f, g): (Σ^T, A_→) → (Σ^{T′}, A_{→′}) to be h: Σ^T → Der(Σ^{T′}) defined by h_s(t) = g_s(t) ∈ Σ^{T′}_{[],f(s)} = T′_{f(s)} for t ∈ Σ^T_{[],s} = T_s. ∎

Exercise 5.9.4 Show that F(f, g) is a TRS morphism, and that F: ARS → TRS preserves composition and identities, i.e., that it is a functor. Hint: To show that F(f, g) is a TRS morphism, show that t₁ ⇒_{A_→} t₂ implies h̄(t₁) ⇒_{A_{→′}} h̄(t₂), where h = F(f, g). ∎

Before stating the main result, we need the following:

Fact 5.9.5 R(F(A)) = A, for any ARS A.

Proof: If A = (S, T, →), then F(A) = (Σ^T, A_→), where Σ^T_{[],s} = T_s and Σ^T_{w,s} = ∅ for all other w, s, and where the rewrite rule (∀∅) t₁ = t₂ is in A_→ iff t₁ → t₂ in A. Then R(F(A)) = (T_{Σ^T}, ⇒_{A_→}) = (T, →). ∎

The theorem below says that the functor F is left adjoint to R, but it is stated as a so-called "universal property" that does not use any category theory. (Figure 5.5 shows the traditional commutative diagram for this property.)

[Figure 5.5: Universal Property of F(A) — commutative diagram with (f, g): A → R(T), R(u): R(F(A)) → R(T), and u: F(A) → T]

Theorem 5.9.6 For every ARS A, TRS T, and ARS morphism (f, g): A → R(T), there is a unique TRS morphism u: F(A) → T such that R(u) = (f, g).
Proof: Let A be (S, T, →) and let T be (Σ′, A′). Then F(A) = (Σ^T, A_→). If we assume that R(u) = (f, g) as morphisms (S, T, →) → (T_{Σ′}, ⇒_{A′}), then for u: (Σ^T, A_→) → (Σ′, A′), which is really u: Σ^T → Der(Σ′), we must have u_{[],s}(t) = g_s(t) for each t ∈ Σ^T_{[],s} = T_s, and that all the other maps Σ^T_{w,s} → T_{Σ′}(X_w)_s are empty. Furthermore, with this definition of u, we have that R(u) = (f, ū): T_{Σ^T} = T → T_{Σ′}, where ū_s(t) = g_s(t) for t ∈ T_s, so that indeed R(u) = (f, g). Finally, to show that u is a TRS morphism, we must show that t₁ ⇒_{A_→} t₂ implies ū(t₁) ⇒_{A′} ū(t₂). But t₁ ⇒_{A_→} t₂ iff t₁ → t₂, and since (f, g) is an ARS morphism, from this we get g_s(t₁) ⇒_{A′} g_s(t₂), which occurs iff ū(t₁) ⇒_{A′} ū(t₂). ∎

Exercise 5.9.5 Substitute "t₁ ⇒_A t₂ implies h̄(t₁) ⇒*_{A′} h̄(t₂)" for "t₁ ⇒_A t₂ implies h̄(t₁) ⇒_{A′} h̄(t₂)" in Definition 5.9.1, substitute "t₁ → t₂ implies g(t₁) →′* g(t₂)" for "t₁ → t₂ implies g(t₁) →′ g(t₂)" in Definition 5.9.2, and then show that Theorem 5.9.6 still holds. Give an interpretation for this result. Show that the local Church-Rosser property is not preserved by either F or R when morphisms are generalized in this way. ∎

5.10 Literature

Term rewriting captures a basic computational aspect of equational logic, and is fundamental for theorem proving. However, expositions of term rewriting typically have a combinatorial, syntactic flavor, rather than an algebraic, semantic flavor. This is due in part to the historical fact that term rewriting arose as an abstraction of the lambda calculus, especially the so-called Normalization (i.e., Church-Rosser) Theorem, which was first proved by Church and Rosser, and which is the origin of the "Church-Rosser property."

There is a very large literature on term rewriting.
This chapter does not faithfully represent that literature, because it emphasizes results that are of practical value for theorem proving, as opposed to results that are largely of theoretical interest, and it leans heavily towards algebra. Huet and Oppen gave a good survey that developed some of the connections with algebra [109]. Klop [114, 115], Dershowitz and Jouannaud [39], and Plaisted [151] have also written useful surveys; the latter two describe some more recent developments. A nice self-contained introductory textbook has been written by Baader and Nipkow [2]. Newman proved his lemma for the unsorted case in 1942 [143]; the elegant proof given here is due to Barendregt [3], but generalized to overloaded many-sorted rewriting. Theorem 5.2.9 is from [56]; it expresses a fundamental connection between term rewriting and algebra. The Noetherian condition is named after Emmy Noether, the great pioneer in abstract algebra mentioned in Section 2.8.

Combinatory logic was developed by Schönfinkel [160] to eliminate bound variables from predicate logic, and was later independently developed further by Haskell Curry as a foundation for mathematics that he called "Illative Combinatory Logic" [37, 38]. Combinatory logic also plays an important role in implementing functional programming languages, as described in [4], [178] and many other places. The so-called categorical combinators developed more recently by Curien and others have played a similar role [36].

Although not discussed here, the lambda calculus is a TRS closely related to combinatory logic. It was developed by Alonzo Church [28] as a calculus of functions, again as part of a foundational programme for mathematics. This TRS played a key role in formalizing the notion of computability (the so-called Church-Turing thesis), and following work of Landin [120] and Strachey [172], it became the basis for the "denotational semantics" [162] method for defining the meaning of programming languages.
Lambda calculus has also been an important influence on the design of programming languages, including Lisp [131] and more recently, higher-order functional languages like ML [99], Miranda [179], and Haskell [107]. Term rewriting plays a basic role in proving properties of abstract data types, including their correctness and implementation, and also in the study of their computability properties [137]. In addition, term rewriting has played an important role in developing languages that combine the functional and logic paradigms, through an operational semantics based on so-called narrowing [79, 40, 105]. As far back as 1951, Evans [46] used term rewriting to prove the decidability of the equational theory called "loops."

Most expositions of term rewriting do not make the signature explicit, so that the distinction between (for example) confluence and ground confluence can seem mysterious, and various confusions can easily arise. Similarly, it is not usual to be careful about the variables and constants involved in rewriting a given term. The results of Propositions 5.3.4 and 5.3.6, and of Corollary ??, which address these issues, do not seem to be in the literature, nor do the corresponding results for the conditional case, Propositions 5.8.10 and 5.8.11.
This is presumably because they cannot even be stated without the additional care for variables and constants that we have taken.

Section 5.4 on evaluation strategies has been largely taken from [90]. There is an interesting literature on proving termination of rewriting when operations have local strategies; for example, see [50], which cites many other papers.

The unsolvability of equality mentioned in connection with Proposition 5.1.14 is shown by the unsolvability of the so-called word problem for groups, posed by Max Dehn in 1911: given a group presentation, determine whether or not two terms over the generators are equal in that equational theory; this was shown unsolvable by Pyotr Novikov [144] and William Boone [16] in the 1950s. Unsolvability of equality also follows from the word problem for semigroups, posed by Axel Thue in 1914 and shown unsolvable by Emil Post in 1947 [153].

That orthogonality implies Church-Rosser (Proposition 5.6.4) has been proved many times, perhaps first by Rosen [158], but our version may be the first that goes beyond the unsorted case, and our proof also appears to be novel. The term "orthogonal" is due to Dershowitz. Many other results in this chapter are also new, in the same limited sense that they are proved for overloaded many-sorted rewriting. Theorem 5.6.9 and the notions of superposition and critical pair are part of a larger story about unification and the Knuth-Bendix method covered in Chapter 12; that material appears in this chapter because of its value for showing the Church-Rosser property. Hindley's original proof of the Hindley-Rosen Lemma appears in [103].

The literature includes several different notions of conditional rewriting; e.g., see the survey of Klop [115]. The join conditional rewriting approach of our Definition 5.8.2 is the most satisfactory for OBJ because it includes the computations done in the common case when == occurs in a condition.
The alternative notions are either less general, or else too general, for example, going beyond term rewriting by requiring conditions to be evaluated using the full power of equational deduction.

Although Propositions 5.5.1 and 5.8.16 are very simple, they express the fundamental relationship between termination and weight functions; they have not been emphasized in the literature, and may even be partially new. Results 5.8.18, 5.8.19, 5.8.20, and 5.8.33 all appear to be new, and have practical value for proving termination, especially Theorems 5.8.20 and 5.8.33, which also handle conditional rules.

The results on constructing Noetherian orderings in Section 5.8.3 are standard, although the particular constructions given for the multiset and lexicographic orderings may be new. The observation that colimits appear in several places is new, as is the termination criterion in Theorem 5.8.31 and its application in Example 5.8.32. Though the proof in this example is a bit elaborate for a result that is intuitively relatively obvious, it does provide a fairly thorough illustration of the machinery introduced in Section 5.8.3. The material on the Church-Rosser property in Section 5.8.4 may be new; although the special case of Proposition 5.8.36 with B = ∅ is of course familiar, the generalization to hierarchical CTRS's is very useful in practice.

The use of ARS's to study TRS's is standard in the literature, although terminology and definitions vary. Klop [114, 115] considers sets with an indexed family of relations (as in Proposition 5.7.5), calling them "abstract reduction systems," and using the name "replacement system" for the case of just one relation, which we call an abstract rewrite system; actually, our formulation is a bit more general, because it is S-indexed, which enables some novel applications to many-sorted term rewriting and equational deduction.
The results in Section 5.9 are new, especially Theorem 5.9.6 on the adjoint relation between ARS's and TRS's. This material suggests many questions for further research, such as exploring properties of the two categories involved, and more ambitiously, reformulating term rewriting theory in a more categorical style.

José Meseguer [134] has developed rewriting logic, which gives sound and complete rules of inference for term rewriting; these rules are the same as those for equational deduction, except that the symmetry law is omitted. This logic can also be seen as a logic for the term rewriting model of computation, and as such has many interesting applications, including a comprehensive unification of different theories of concurrency, a nice operational semantics for inference systems, and a uniform meta-logic in which inference systems can be described and implemented [29].

I thank Prof. Virgil-Emil Căzănescu for his help with the proofs of Propositions 5.2.6 and 5.3.4, and Dr. Răzvan Diaconescu for help with the proof of Theorem 5.9.6. I also thank José Barros and Răzvan Diaconescu for their help with some of the examples, Kai Lin for several very useful discussions, as well as for significant help with the examples in Section 5.8.2, and Grigore Roşu for the proof of the Orthogonality Theorem (Theorem 5.6.4) in Appendix B, and for several valuable suggestions. Finally, I thank Ms. Chiyo Matsumiya and especially Prof. Yoshihito Toyama, and Dr. Monica Marcus, for their valuable comments and corrections to this chapter. The proof of Theorem 5.6.9 in Appendix B is due to Dr. Marcus.

A Note to Lecturers: This chapter contains a great deal of material, some of which is rather difficult. Except in the case of an advanced course of some duration, the lecturer will have to omit a fair amount, certainly including all the starred sections.
Beyond that, the material to be covered may be determined by the taste of the lecturer and the choice of material to be covered from later chapters. In particular, it is safe to omit most of the detailed material on proving termination and the Church-Rosser property, since little of that is needed for later chapters. It could also be a good idea to interleave material from this chapter with parts of Chapter 6, to create a bit more variety.

6 Initial Algebras, Standard Models and Induction

This chapter shows that every equational specification has an initial algebra, gives further characterizations for these structures, and justifies and illustrates the use of induction for verifying their properties. It also investigates abstract data types and standard models for equational specifications, showing that they are initial models. Congruence and quotients are important technical tools, and we prove some of their main properties.

6.1 Quotient and Initiality

This section discusses congruences, quotients, initial and free algebras satisfying equations, and then substitutions modulo equations. Main results include the so-called homomorphism theorem, the universal characterization of quotients, and the existence of initial algebras.

Initial algebras for specifications with equations are constructed as quotients of term algebras, a construction that relies upon the following:

Definition 6.1.1 A Σ-congruence relation on a Σ-algebra M is an S-sorted equivalence relation ≡ = {≡_s | s ∈ S} on M, for S the sort set of Σ, where each ≡_s is an equivalence relation on M_s, such that whenever σ ∈ Σ_{s₁…sₙ,s}, then a_i ≡_{s_i} a′_i for i = 1, …, n implies M_σ(a₁, …, aₙ) ≡_s M_σ(a′₁, …, a′ₙ), for a_i, a′_i ∈ M_{s_i} for i = 1, …, n. ∎

Example 6.1.2 Define a signature Σ by the OBJ fragment

  sorts Nat Bool .
  op 0 : -> Nat .
  op s : Nat -> Nat .
  ops T F : -> Bool .
  op odd : Nat -> Bool .
and let N be the Σ-algebra with N_Nat = ω, with N_Bool = {T, F}, and with the operations interpreted as expected. Then we can define a Σ-congruence Q₁ on N as follows: n Q₁,Nat n′ iff n − n′ is divisible by 8; and b Q₁,Bool b′ iff b = b′. We can also define another Σ-congruence Q₂ on N as follows: n Q₂,Nat n′ iff n − n′ is divisible by 2; and b Q₂,Bool b′ iff b = b′. ∎

Exercise 6.1.1 In the context of Example 6.1.2, prove that Q₁ and Q₂ are Σ-congruences on N. Now define Q₃ to mean having the same remainder under division by 3, and show that Q₃ is not a Σ-congruence on N. ∎

Proposition 6.1.3 Given a Σ-algebra M and a Σ-congruence ≡ on M, the quotient of M by ≡, denoted M/≡, is a Σ-algebra, interpreting constant symbols σ ∈ Σ_{[],s} as [M_σ], and operations σ ∈ Σ_{s₁…sₙ,s} with n > 0 as sending [a₁], …, [aₙ] to [M_σ(a₁, …, aₙ)], for a_i ∈ M_{s_i}.

Proof: We have to show that [M_σ(a₁, …, aₙ)] is well defined. So let us assume, for a_i, a′_i ∈ M_{s_i}, that a_i ≡_{s_i} a′_i for i = 1, …, n, i.e., that [a_i] = [a′_i]. Then the definition of congruence gives us that [M_σ(a₁, …, aₙ)] = [M_σ(a′₁, …, a′ₙ)]. ∎

Example 6.1.4 The equivalence classes of ground Σ-terms under a set A of Σ-equations form a nice Σ-algebra, which is in fact a quotient of the term algebra by a congruence based on equational deduction, as follows: Given Σ-terms t, t′ with variables in X, let

  t ≃^X_A t′  iff  A ⊢ (∀X) t = t′,

where ≃^X_A for sort s has t, t′ also of sort s. That ≃^X_A is an equivalence relation follows directly from rules (1), (2), (3) of Definition 4.1.3 on page 58, which are the reflexivity, symmetry and transitivity of equational deduction, respectively. In the special case where X = ∅, we write just ≃_A instead of ≃^∅_A.
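Example 6.1.2 and Exercise 6.1.1 can be sanity-checked mechanically. The following sketch (ours, not the book's) tests, on a finite sample of naturals, whether the relation "n − n′ divisible by k" commutes with the operations s and odd; the interesting constraint comes from odd, which is well defined on the classes exactly when k is even.

```python
def respects(k, limit=50):
    """Does 'divisible by k' on naturals commute with s and odd (up to limit)?"""
    pairs = [(n, m) for n in range(limit) for m in range(limit)
             if (n - m) % k == 0]
    # compatibility with s: n ~ m implies s(n) ~ s(m)  (always true here)
    s_ok = all((n + 1 - (m + 1)) % k == 0 for n, m in pairs)
    # compatibility with odd: n ~ m implies odd(n) = odd(m)
    odd_ok = all(n % 2 == m % 2 for n, m in pairs)
    return s_ok and odd_ok

assert respects(8)        # Q1 is a congruence
assert respects(2)        # Q2 is a congruence
assert not respects(3)    # Q3 is not: e.g. 0 ~ 3 but odd(0) != odd(3)
```

Of course a finite test only refutes, never proves, the congruence property; the positive cases still need the easy proof asked for in Exercise 6.1.1.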
Now we take the quotient T_Σ/≃_A as an S-sorted set,¹ and let [t]_A, or (usually) just [t], denote the equivalence class of a term t under ≃_A.

For these equivalence classes to form a Σ-algebra, we need to give interpretations for the constant and operation symbols in Σ. It seems clear that we should interpret the constant symbol σ ∈ Σ_{[],s} as [σ], and interpret σ ∈ Σ_{s₁…sₙ,s} with n > 0 as sending [t₁], …, [tₙ] to [σ(t₁, …, tₙ)], where t_i ∈ T_{Σ,s_i} for i = 1, …, n. But it may not be clear that this definition makes sense. For, if we had picked some other t′₁, …, t′ₙ such that [t_i] = [t′_i], then we would need to know that

  [σ(t₁, …, tₙ)] = [σ(t′₁, …, t′ₙ)]

in order to know that the proposed interpretation for σ gives the same result, no matter which representatives we happen to have chosen for the equivalence classes. Translating back to the notation of equational deduction, the property we need is

  A ⊢ (∀∅) t_i = t′_i for i = 1, …, n implies A ⊢ (∀∅) σ(t₁, …, tₙ) = σ(t′₁, …, t′ₙ).

But this follows directly from the rule of deduction (4) of Definition 4.1.3: let X = {x₁, …, xₙ} with x_i of sort s_i, let Y = ∅, let θ(x_i) = t_i, let θ′(x_i) = t′_i, and let t = σ(x₁, …, xₙ); then (∀∅) σ(t₁, …, tₙ) = σ(t′₁, …, t′ₙ) is deducible, because θ(t) = σ(t₁, …, tₙ) and θ′(t) = σ(t′₁, …, t′ₙ). Thus T_Σ/≃_A is a Σ-algebra, in fact an initial (Σ, A)-algebra, though we do not prove it directly in this way.

¹Appendix C reviews this construction for S-sorted sets.
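As an unofficial concrete illustration of Example 6.1.4, take Σ with 0, s and +, and let A be the usual defining equations x + 0 = x and x + s(y) = s(x + y) (our choice of specification, not the book's). Each class of ground terms then contains exactly one numeral sⁿ(0), so evaluating a term to a number computes a canonical representative of its class; in particular, the interpretation of + on classes does not depend on which representatives are chosen.

```python
def eval_term(t):
    """Terms are nested tuples: ('0',), ('s', t), ('+', t1, t2).
    Returns the unique numeral (as an int) in the class of t."""
    if t[0] == '0':
        return 0
    if t[0] == 's':
        return 1 + eval_term(t[1])
    return eval_term(t[1]) + eval_term(t[2])     # the '+' case

Z = ('0',)
def S(t): return ('s', t)

t1 = ('+', S(Z), S(S(Z)))     # s(0) + s(s(0))
t2 = ('+', S(S(Z)), S(Z))     # s(s(0)) + s(0)
# two syntactically different terms in the same class of T_Sigma / ~_A:
assert eval_term(t1) == eval_term(t2) == 3
```

The well-definedness argument in the text is exactly what licenses this shortcut: since A proves t1 = t2 for ground terms with the same value, applying an operation to either representative lands in the same class.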
∎

Definition 6.1.5 Given a Σ-homomorphism h: M → M′, the kernel of h is the S-sorted family of equivalence relations ≡_h on M, defined on M_s by a ≡_{h,s} a′ iff h_s(a) = h_s(a′); the kernel of h may be denoted ker(h). The image of h, denoted im(h) or h(M), is the Σ-subalgebra of M′ with h(M)_s = h_s(M_s) for each s ∈ S, and with operations those of M′ suitably restricted. ∎

Proposition 6.1.6 The kernel of a Σ-homomorphism h: M → M′ is a Σ-congruence, and its image is a Σ-algebra.

Proof: Each ≡_{h,s} is an equivalence relation, for any S-indexed function h: M → M′. To prove the congruence property, let σ ∈ Σ_{w,s} with w = s₁…sₙ, and assume a_i ≡_{h,s_i} a′_i, i.e., that h_{s_i}(a_i) = h_{s_i}(a′_i) for i = 1, …, n. Then

  h_s(M_σ(a₁, …, aₙ)) = M′_σ(h_{s₁}(a₁), …, h_{sₙ}(aₙ)) = M′_σ(h_{s₁}(a′₁), …, h_{sₙ}(a′ₙ)) = h_s(M_σ(a′₁, …, a′ₙ)),

so that M_σ(a₁, …, aₙ) ≡_{h,s} M_σ(a′₁, …, a′ₙ), as desired.

For the second assertion, we first check condition (2) of the definition of subalgebra given in Exercise 3.1.1: let σ ∈ Σ_{w,s} with w = s₁…sₙ, let b_i ∈ h(M)_{s_i} for i = 1, …, n, and let a_i ∈ M_{s_i} be such that b_i = h_{s_i}(a_i) for i = 1, …, n. Then M′_σ(b₁, …, bₙ) ∈ h(M)_s, since M′_σ(b₁, …, bₙ) = h_s(M_σ(a₁, …, aₙ)). ∎

The following is one of the most important elementary results of general algebra. Due to Emmy Noether in its original form, called the "first isomorphism theorem," it relates homomorphisms, quotients, and subalgebras in a very elegant (and useful) way.

Theorem 6.1.7 (Homomorphism Theorem) For any Σ-homomorphism h: M → M′, there is a Σ-isomorphism M/ker(h) ≅_Σ im(h).
Proof: Let ≡ denote ker(h), let Q denote M/≡, and define f: Q → h(M) as follows: given some ≡-class c, let f(c) be h(m), where m is any element of M such that [m] = c; by the definition of ≡, if m₁, m₂ are two such elements, then h(m₁) = h(m₂), so that f is well-defined. Also, f is surjective, since for any h(m) ∈ h(M), we have f([m]) = h(m). So it remains to show that f is a Σ-homomorphism and is injective.

For the first, if σ ∈ Σ_{[],s} then f(Q_σ) = f([M_σ]) = h(M_σ) = M′_σ. Also, if σ ∈ Σ_{w,s} with w = s₁…s_k then

  f(Q_σ([m₁], …, [m_k])) = f([M_σ(m₁, …, m_k)]) = h(M_σ(m₁, …, m_k)) = M′_σ(h(m₁), …, h(m_k)) = M′_σ(f([m₁]), …, f([m_k])).

To show that f is injective, assume that [m₁] ≠ [m₂] but f([m₁]) = f([m₂]), which by definition of f means that h(m₁) = h(m₂), which by definition of ≡ means [m₁] = [m₂], contradicting our assumption. ∎

The following two corollaries and one exercise spell out some easy consequences of the above:

Corollary 6.1.8 If h: M → M′ is an injective Σ-homomorphism, then M is isomorphic to the subalgebra h(M) of M′. ∎

Corollary 6.1.9 If h: M → M′ is a surjective Σ-homomorphism, then M′ is isomorphic to the quotient M/ker(h) of M. ∎

Exercise 6.1.2 Show that the converses of the above two corollaries also hold, i.e., show that M is isomorphic to a subalgebra of M′ iff there is an injective Σ-homomorphism h: M → M′, and show that M′ is isomorphic to a quotient of M iff there is a surjective Σ-homomorphism h: M → M′. ∎

Example 6.1.10 There is a nice example of the homomorphism theorem in automaton theory. Define a state system to consist of an input set X, a state set Z, and a transition function t: X × Z → Z; it is conventional to use a tuple notation (X, Z, t) for such systems.
Recall that X* is the set of all finite sequences from X, with the empty sequence denoted []. We can extend t to a function t: X* × Z → Z by defining t([], z) = z and t(wx, z) = t(x, t(w, z)) for x ∈ X, w ∈ X*, z ∈ Z; this gives the state reached from z after a sequence of inputs. State systems with input set X are Σ-algebras with Σ the one-sorted signature having Σ₁ = X and Σₙ = ∅ for all n ≠ 1, where the "action" of x ∈ X on z ∈ Z is defined to be t(x, z). It is conventional to write x·z instead of t(x, z), and also to extend this notation to write w·z instead of t(w, z) for w ∈ X*.

Next, define an automaton to be a state system plus a function o: Z → Y, and define the behavior of an automaton A = (X, Z, t, o) at state z ∈ Z to be the function b_z: X* → Y defined by b_z(w) = o(t(w, z)). Now let B be the Σ-algebra of all possible behaviors for A, with carrier [X* → Y], by defining (x·b)(w) = b(xw) for x ∈ X, b ∈ B, w ∈ X*. Then the function b that sends z ∈ Z to b_z ∈ B is a Σ-homomorphism, and thus Theorem 6.1.7 gives the Σ-isomorphism

  A/ker(b) ≅_Σ im(b).

The Σ-algebra A/ker(b) is called the minimal realization of A; let us denote it M. The above isomorphism says that M is the state system with the minimal set of states that realizes the same behaviors as A. We can extend M to an automaton by defining M_o([z]) = o(z); the reader may check that this is well defined, in that if [z] = [z′] then o(z) = o(z′).

The literature often adds an initial state σ₀ ∈ Z to state systems and/or automata; then the signature Σ is extended by adding Σ₀ = {σ₀}, and the algebra B is extended by adding B_{σ₀} = b_{σ₀}. Attention is often restricted to automata that are reachable, in the sense that for every z ∈ Z there is some w ∈ X* such that w·σ₀ = z, and minimal realizations for such automata are again obtained by taking the quotient by the kernel of the behavior map.
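The quotient by the kernel of the behavior map can be sketched in Python on a toy machine of our own devising. Here behaviors are compared only on words up to a length bound, which suffices for this small automaton (in general a bound related to the number of states is enough).

```python
from itertools import product

# A toy automaton (our example): inputs X, states Z, transition t, output o.
X = ['a', 'b']
Z = [0, 1, 2, 3]
t = lambda x, z: (z + 1) % 4 if x == 'a' else z   # 'a' steps, 'b' idles
o = lambda z: z % 2                               # output set Y = {0, 1}

def run(w, z):
    """The state w . z reached from z after the input word w."""
    for x in w:
        z = t(x, z)
    return z

def behavior(z, max_len=4):
    """b_z restricted to words of length <= max_len."""
    return tuple(o(run(w, z))
                 for n in range(max_len + 1) for w in product(X, repeat=n))

# kernel classes of the behavior map = states of the minimal realization
classes = {}
for z in Z:
    classes.setdefault(behavior(z), []).append(z)

# states 0 and 2 behave alike, as do 1 and 3: two minimal states
assert sorted(map(sorted, classes.values())) == [[0, 2], [1, 3]]
```

The classes computed here are exactly the Nerode-style equivalence classes: the minimal realization has one state per class, with transitions and output inherited from any representative, which is well defined by the argument in the example.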
The kernel of b is often called the Nerode equivalence, after Anil Nerode, who first defined it and the minimal realization of a machine, though in a different way. ∎

A slightly more general notion of quotient than that developed above starts with an arbitrary relation on a Σ-algebra:

Definition 6.1.11 Given a Σ-algebra M and a subset R_s of M_s × M_s for each sort s of Σ, let ≡_R be the Σ-congruence generated by R on M, which is the least Σ-congruence on M that contains R, and let M/R denote the quotient M/≡_R. ∎

The relation ≡_R exists because any intersection of Σ-congruences on M containing R is another such, necessarily the least, and the intersection is non-empty because M × M is a Σ-congruence on M containing R. The following states a fundamental property of quotients:

Proposition 6.1.12 Given a Σ-algebra M and a relation R on M, the quotient map q: M → M/R satisfies the following:

[Figure 6.1: Proof for Uniqueness of Quotients]

(1) R ⊆ ker(q); and

(2) if h: M → B is a Σ-homomorphism such that R ⊆ ker(h), then there is a unique Σ-homomorphism u: M/R → B such that q;u = h.

Proof: For (1), it suffices to note that ker(q) is the least congruence containing R. For (2), Theorem 6.1.7 gives M/ker(h) ≅ im(h) ⊆ B, so R ⊆ ker(h) implies ker(q) ⊆ ker(h), which by Lemma 6.1.13 below implies that h factors as q;q′;i, where q′: Q → M/ker(h) with Q = M/R, and where i: M/ker(h) → B is given by the isomorphism M/ker(h) ≅ im(h) followed by the inclusion of im(h) in B. Therefore we get u = q′;i: Q → B such that q;u = h. To show uniqueness, if u′: Q → B is such that q;u′ = h, then q;u′ = q;u, and so the surjectivity of q implies u = u′.
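The least congruence ≡_R of Definition 6.1.11 can be computed by fixpoint iteration on a finite algebra. The sketch below (ours) handles a one-sorted algebra with unary operations, closing R under reflexivity, symmetry, transitivity, and compatibility with each operation.

```python
def congruence_closure(carrier, ops, R):
    """The least congruence containing R on a finite one-sorted algebra
    whose operations (all unary here) are the functions in ops."""
    rel = {(a, a) for a in carrier} | set(R) | {(b, a) for a, b in R}
    changed = True
    while changed:
        changed = False
        new = set()
        for (a, b) in rel:
            for (c, d) in rel:
                if b == c and (a, d) not in rel:
                    new.add((a, d))                 # transitivity
            for f in ops:
                if (f(a), f(b)) not in rel:
                    new.add((f(a), f(b)))           # compatibility with f
        if new:
            rel |= new
            changed = True
    return rel

carrier = range(6)
succ = lambda n: (n + 1) % 6
rel = congruence_closure(carrier, [succ], {(0, 2)})
# identifying 0 with 2 forces n ~ n+2 for all n: classes {0,2,4} and {1,3,5}
assert (1, 3) in rel and (4, 0) in rel and (0, 1) not in rel
```

Symmetry is preserved automatically: the seed is symmetric, and both the transitivity and compatibility steps send symmetric relations to symmetric ones, so the fixpoint is an equivalence relation compatible with the operations, i.e. a congruence.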
∎

Lemma 6.1.13 Given Σ-congruences ≡, ≡′ on a Σ-algebra M with ≡ ⊆ ≡′, let Q, Q′ and q, q′ be the respective quotients and quotient maps for M. Then q′ factors as q;q″, where q″ is also surjective.

Proof: Define q″([m]) = [m]′, where [m]′ is the ≡′ congruence class of m ∈ M. This is well defined because if [m] = [m′] then [m]′ = [m′]′, since ≡ ⊆ ≡′. Then q″(q(m)) = [m]′ = q′(m). ∎

An assertion that a unique map exists satisfying certain conditions is often called a universal property; the above is an example, as are initiality assertions (e.g., Theorems 3.2.1, 3.2.10 and 6.1.15). In each case, the universal property characterizes a structure uniquely up to isomorphism. Proposition 6.2.1 shows this for initial algebras, and the following proves it for quotients:

Proposition 6.1.14 If both q: M → Q and q′: M → Q′ satisfy conditions (1) and (2) of Proposition 6.1.12, then Q and Q′ are isomorphic.

Proof: First notice that taking B = Q and h = q in Proposition 6.1.12, uniqueness implies that any u with q;u = q must be 1_Q, since 1_Q satisfies this condition. Now under our assumptions, we get Σ-homomorphisms u, u′ such that q;u = q′ and q′;u′ = q, from which it follows that q′;u′;u = q′ and q;u;u′ = q, which by our initial remark implies that u′;u = 1_{Q′} and u;u′ = 1_Q, so that Q and Q′ are isomorphic. See Figure 6.1. ∎

[Figure 6.2: Proof for Initiality with Equations]

The following is the main result of this section.
It says that given a set A of Σ-equations, there is a Σ-algebra T_{Σ,A}, also denoted T_P when P = (Σ, A), with the property that given any Σ-algebra M satisfying A, there is a unique Σ-homomorphism T_{Σ,A} → M; i.e., it says that every equational specification has an initial model. Note that, strictly speaking, we are dealing with sorted (or annotated) terms here, in the sense of Definition 3.2.9.

Theorem 6.1.15 (Initiality) Given a set A of (possibly conditional) Σ-equations, let ≡ be the Σ-congruence on T_Σ generated by the relation R having the components

  R_s = { ⟨t, t′⟩ | A ⊢ (∀∅) t = t′, where t, t′ are of sort s }.

Then T_Σ/R, denoted T_{Σ,A}, is an initial (Σ, A)-algebra.

Proof: Given any (Σ, A)-algebra M, let v: T_Σ → M be the unique homomorphism, and note that R ⊆ ker(v), because M ⊨ A implies M ⊨ (∀∅) t = t′ for every ⟨t, t′⟩ ∈ R, by the soundness of equational deduction. Now let v = q;u with u: T_{Σ,A} → M be the factorization of v given by Proposition 6.1.12; see Figure 6.2. This shows existence. For uniqueness, if also u′: T_{Σ,A} → M, then q;u′ = v by the initiality of T_Σ, and R ⊆ ker(v) because M ⊨ A. Therefore u = u′ by (2) of Proposition 6.1.12. ∎

(In fact, Example 4.3.8 shows that R is a Σ-congruence, there denoted ≃_A.)

Theorem 6.1.15 is very fundamental in algebraic specification; it is basic to the theory of abstract data types developed later in this chapter, as well as to the theory of rewriting modulo equations in Chapter 7, and several other topics. Moreover, it has a satisfying intuitive interpretation similar to that given for T_Σ in Section 3.2: we can view T_{Σ,A} as a kind of "universal language" of (simple expression-like) programs for the Σ-algebras that satisfy A, which are the "processors" that are able to correctly evaluate the programs in T_{Σ,A}.
Initiality says that every such program has a unique result on each such processor. Let's consider an example, which will help to motivate our approach to substitutions modulo equations in Section 6.1.3:

Example 6.1.16 Let Σ = Σ_GROUPL({a, b, c}) and let A contain just the associative law. Then expressions like a ∗ b ∗ c and (a ∗ a ∗ b ∗ b)⁻¹ have unique interpretations in any group in which a, b, c have been given interpretations. For example, if M is the group of non-zero rational numbers under multiplication with a = b = 2 and c = 3, then the first expression has value 12, while the second has value 1/16. □

We now generalize freeness to the case where there are equations. Given an S-sorted set X, define T_{Σ,A}(X) to be the algebra T_{Σ∪X}/≃_A viewed as a Σ-algebra, with the A-equivalence class of t denoted [t]_A. This algebra is called the free (Σ, A)-algebra generated by X, and as was the case with Theorem 3.2.1, it has the following important universal property:

Theorem 6.1.17 Given a set A of Σ-equations, a Σ-algebra M satisfying A, and an assignment a : X → M, there is a unique Σ-homomorphism ā : T_{Σ,A}(X) → M that extends a, i.e., such that ā(x) = a(x) for all x ∈ X.

Proof: A (Σ∪X, A)-algebra M is exactly the same thing as a (Σ, A)-algebra M and an assignment a : X → M. For T_{Σ,A}(X), the assignment is the injective morphism i_X : X → T_{Σ,A}(X) that sends x to [x]_A. Theorem 6.1.15 implies that there is a unique (Σ∪X)-homomorphism ā : T_{Σ∪X,A} → M, which is exactly the same as saying that there is a unique Σ-homomorphism ā : T_{Σ,A}(X) → M that extends a. □

Exercise 6.1.3 Show that any two Σ-algebras that satisfy the freeness universal property of Theorem 6.1.17 are Σ-isomorphic, and show that any Σ-algebra that is Σ-isomorphic to T_{Σ,A}(X) also satisfies the same universal property.
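The "programs and processors" reading can be made concrete in a small sketch (Python rather than OBJ, with a term encoding and function names of our own devising): ground terms over the group signature are the "programs", any particular group is a "processor", and evaluation is the unique homomorphism into it. Two A-equivalent programs, such as different associations of a product, must then get the same result on every processor. The generator values below (a = b = 2, c = 3) are one possible choice, picked so that a ∗ b ∗ c evaluates to the value 12 mentioned in Example 6.1.16.

```python
from fractions import Fraction

# Ground terms over the group signature: a generator name, ("inv", t), or
# ("mul", t1, t2).  A "processor" is any group interpreting the generators;
# here, the non-zero rationals under multiplication.
# (Illustrative encoding, not from the text.)

def evaluate(term, env):
    """The unique homomorphism from terms into the model: evaluate recursively."""
    if isinstance(term, str):
        return env[term]
    op = term[0]
    if op == "inv":
        return 1 / evaluate(term[1], env)
    if op == "mul":
        return evaluate(term[1], env) * evaluate(term[2], env)
    raise ValueError(f"unknown operation {op}")

env = {"a": Fraction(2), "b": Fraction(2), "c": Fraction(3)}

# a * (b * c) and (a * b) * c are distinct terms but A-equivalent (associativity),
# so they must get the same value in every model satisfying A:
t1 = ("mul", "a", ("mul", "b", "c"))
t2 = ("mul", ("mul", "a", "b"), "c")
assert evaluate(t1, env) == evaluate(t2, env) == Fraction(12)

# (a * a * b * b)^-1, associated arbitrarily:
t3 = ("inv", ("mul", ("mul", "a", "a"), ("mul", "b", "b")))
assert evaluate(t3, env) == Fraction(1, 16)
```

Evaluation is defined on terms, but because it is a homomorphism into a model of A, it is constant on each A-equivalence class, so it is well defined on T_{Σ,A}.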
□

The following consequence of the above free algebra theorem and the homomorphism theorem will be used in Chapter 7:

Proposition 6.1.18 Every (Σ, A)-algebra is a quotient of a free (Σ, A)-algebra.

Proof: Let M be a (Σ, A)-algebra, and let |M| denote its underlying (S-indexed) carrier set. We will show that M is a quotient of the free algebra P = T_{Σ,A}(|M|). Let h be the unique Σ-homomorphism ā from P to M given by Theorem 6.1.17, with a the identity map on |M|. Notice that h is surjective, because a is already surjective. Then by Corollary 6.1.9, P/≡_h is isomorphic to M. □

We conclude this section with substitution modulo A, generalizing Section 3.5; this also is needed in Chapter 7.

Definition 6.1.19 Given a set A of Σ-equations, a substitution modulo A of (Σ, A)-terms with variables in Y for variables in X is an arrow a : X → T_{Σ,A}(Y); we may use the notation a : X → Y. The application of a to t ∈ T_{Σ,A}(X) is ā(t). Given substitutions a : X → T_{Σ,A}(Y) and b : Y → T_{Σ,A}(Z), then their composition (as substitutions), denoted a ; b, is the S-sorted arrow a ; b̄ : X → T_{Σ,A}(Z). □

Again as in Section 3.5, an alternative notation makes this look more familiar: Given t ∈ T_{Σ,A}(X) and a : X → T_{Σ,A}(Y) such that |X| = {x_1, …, x_n} and a(x_i) = [t_i]_A for i = 1, …, n, then ā(t) can also be written t(x_1 ← t_1, x_2 ← t_2, …, x_n ← t_n), and whenever t_i is just x_i, the pair x_i ← t_i can be omitted from this notation.

Exercise 6.1.4 The following assume a set A of Σ-equations.

1. If i_X : X → T_{Σ,A}(X) is the inclusion, show that ī_X([t]_A) = [t]_A for each [t]_A ∈ T_{Σ,A}(X).

2. Given a substitution a : X → T_{Σ,A}(Y), show that i_X ; a = a and that a ; i_Y = a; as before, this justifies writing 1_X for i_X.

3.
Show that substitution modulo A is associative, in the sense that given substitutions a : W → T_{Σ,A}(X), b : X → T_{Σ,A}(Y) and c : Y → T_{Σ,A}(Z), then (a ; b) ; c = a ; (b ; c). Hint: The "magical" proof for the ordinary case (Proposition 3.6.5) generalizes, using the free property of term algebras modulo A (Theorem 6.1.17).

4. Show that Corollary 3.6.6, asserting ā ; b̄ = (a ; b)‾, generalizes to substitution modulo A. □

6.2 Abstract Data Types

This section motivates abstract data types from the viewpoint of software engineering, then gives a precise definition for this concept, and finally proves some of its most basic properties, especially that an abstract data type is uniquely determined by its specification as an initial algebra, and that abstract data types are indeed abstract. Section 6.6 discusses some limitations of this chapter's approach, placing it along a path having important further developments.

It is well known that most of the effort in programming goes into debugging and maintenance (i.e., into improving and updating programs) [15]. Therefore anything that can be done to ease these processes has enormous economic potential. One step in this direction is to "encapsulate data representations"; this means to make the actual structure of data invisible, and to provide access to it only through a given set of operations which retrieve and modify the hidden data structure. Then the implementing code can be changed without having any effect on other code that uses it. On the other hand, if client code relies on properties of the representation, it may be extremely hard to track down all the consequences of modifying a given data structure (say, changing a doubly linked list to an array), because the client code may be scattered all over the program, without any clear identifying marks.
The so-called Y2K problem is one relatively dramatic example of this phenomenon.

An encapsulated data structure with its accompanying operations is called an abstract data type. The crucial advance was to recognize that operations should be associated with data representations; this is exactly the same insight that advanced algebra from mere sets to algebras, which are sets with their associated operations. In software engineering this insight seems to have been due to David Parnas [146], and in algebra to Emmy Noether [20, 127]. The parallel between developments in software engineering and in abstract algebra is a major subtheme of this chapter.

A theory of abstract data types should enable us to check whether or not implementations are correct, by verifying their properties. This chapter presents some of the basics of such a theory. Abstract data types also provide the foundation for many theorem-proving problems: before we can prove something about the natural numbers, or about lists, we need a precise characterization of the structure that is involved. Even results about groups often use the natural numbers. More elaborate problems in computer science, such as proving the correctness of a compiler, usually involve more elaborate data structures, such as queues, stacks, arrays, or lists of stacks of integers. We usually want such proofs to be independent of how the underlying data types happen to be represented; for example, we are usually not interested in properties of the decimal or binary representations of natural numbers, but instead are interested in abstract properties of the natural numbers, like the commutativity of addition.

We have already seen several examples where the Σ-term algebra T_Σ serves as a standard model for a specification P = (Σ, ∅) with no equations.
For example, if Σ = Σ_NATP (from Example 2.3.3) then T_Σ is the natural numbers in Peano notation, and if Σ = Σ_NATEXP (from Example 2.3.4) then T_Σ consists of all expressions formed from the operation symbols 0, s, + and *.

There are also many examples that need equations, such as Example 5.1.5, the natural numbers with addition, for which we now repeat the OBJ code:

  obj NATP+ is
    sort Nat .
    op 0 : -> Nat .
    op s_ : Nat -> Nat .
    op _+_ : Nat Nat -> Nat .
    vars N M : Nat .
    eq N + 0 = N .
    eq N + s M = s(N + M) .
  endo

Theorem 6.1.15 tells us that such specifications do indeed have initial models, in which the elements of the carriers are the equivalence classes of terms modulo the equations. However, Theorem 5.2.9 gives a different initial algebra for specifications that are also canonical as term rewriting systems, namely as the normal forms of terms under reduction. Moreover, a specification like NATP+ may well have still other representations that are preferred, such as natural numbers in the usual decimal positional notation. The choice of representation is just a matter of convenience, because all initial algebras are "essentially the same" in the sense that they are isomorphic algebras, as shown by the following:

Proposition 6.2.1 Given a specification P = (Σ, A), any two initial P-algebras are Σ-isomorphic; in fact, if M and M′ are two initial P-algebras, then the unique Σ-homomorphisms M → M′ and M′ → M are both isomorphisms, and indeed, are inverse to each other.

[Figure 6.3: Uniqueness of Initial Algebras]

Proof: The diagram in Figure 6.3 pertains to this proof.
Because M and M′ are both initial, there are Σ-homomorphisms f : M → M′ and g : M′ → M. Thus there are Σ-homomorphisms f ; g : M → M and g ; f : M′ → M′. But because the identity on M is a Σ-homomorphism and there is a unique Σ-homomorphism from M to M by the initiality of M, we necessarily have f ; g = 1_M. Similarly, g ; f = 1_M′. □

For example, if P = (Σ, A) is NATP+, then the Σ-algebra N_P of normal forms under A (of Theorem 5.2.9) and the Σ-algebra T_Σ/≃_A of equivalence classes of ground terms under A are isomorphic, and in fact, both are isomorphic to ω.

The following result shows that satisfaction of an equation by an algebra is an "abstract" property, in the sense that it is independent of how the algebra happens to be represented; more precisely, it is invariant under isomorphism. This is fortunate, because these are usually the properties in which we are most interested. This and Proposition 6.2.1 imply that exactly the same equations are true of any one initial P-algebra as any other.

Proposition 6.2.2 Given isomorphic Σ-algebras M and M′, and given a Σ-equation e, then M ⊨ e iff M′ ⊨ e.

Proof: Let h : M → M′ be an isomorphism, let e be (∀X) t = t′ and let a : X → M be an interpretation of X in M. Then ā(t) = ā(t′) implies h(ā(t)) = h(ā(t′)). Moreover, any interpretation b : X → M′ is of the form a ; h for some a : X → M, namely a = b ; g, where g is the inverse of h. Hence ā(t) = ā(t′) for all a : X → M implies b̄(t) = b̄(t′) for all b : X → M′. The converse implication follows by symmetry. □

The word "abstract" in the phrase "abstract algebra" means "uniquely defined up to isomorphism." In abstract group theory, we are not interested in properties of representations of groups, but only in those that hold up to isomorphism.
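Proposition 6.2.2 can be illustrated concretely (a Python sketch with encodings of our own devising, not from the text): take two representations of the NATP+ initial algebra, Peano normal forms and machine integers, related by an isomorphism h, and observe on a few instantiations how satisfaction of an equation transfers along h. Checking finitely many points is of course only an illustration, not a proof.

```python
# Two representations of the NATP+ initial algebra.
# M:  Peano normal forms, encoded as strings "0", "s0", "ss0", ...
# M': Python ints.
# (Illustrative encodings, not from the text.)

def plus_m(x, y):          # addition on Peano strings: s^m 0 + s^n 0 = s^(m+n) 0
    return "s" * (len(x) + len(y) - 2) + "0"

def plus_m2(x, y):         # addition on ints
    return x + y

def h(x):                  # the isomorphism h : M -> M', counting the s's
    return len(x) - 1

# h is a homomorphism: it maps 0 to 0 and commutes with +.
assert h("0") == 0
for x in ["0", "s0", "sss0"]:
    for y in ["0", "ss0"]:
        assert h(plus_m(x, y)) == plus_m2(h(x), h(y))

# Satisfaction transfers along h: commutativity holds for an instantiation
# a in M iff it holds for the corresponding instantiation a;h in M'.
for x in ["0", "s0", "ssss0"]:
    for y in ["0", "sss0"]:
        holds_in_m  = plus_m(x, y) == plus_m(y, x)
        holds_in_m2 = plus_m2(h(x), h(y)) == plus_m2(h(y), h(x))
        assert holds_in_m == holds_in_m2
```

The inverse g of h (sending n to "s" * n + "0") witnesses that the two representations differ only in notation, which is exactly the sense in which satisfaction is an "abstract" property.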
Because Proposition 6.2.1 implies that all the initial models of a specification P = (Σ, E) are abstractly the same in precisely this sense, the word "abstract" in "abstract data type" has exactly the same meaning. This is not a mere pun, but a significant fact about software engineering.

Another fact which strongly suggests that we are on the right track is that any computable abstract data type has an equational specification; moreover, this specification tends to be reasonably simple and intuitive in practice. The following result from [137] somewhat generalizes the original version due to Bergstra and Tucker [11] (M is reachable iff the unique Σ-homomorphism T_Σ → M is surjective):

Theorem 6.2.3 (Adequacy of Initiality) Given any computable reachable Σ-algebra M with Σ finite, there is a finite specification P = (Σ′, A′) such that Σ ⊆ Σ′, such that Σ′ has the same sorts as Σ, and such that M is Σ-isomorphic to T_P viewed as a Σ-algebra. □

We do not here define the concept of a "computable algebra," but it corresponds to what one would intuitively expect: all carrier sets are decidable and all operations are total computable functions; see [137]. What this result tells us is that all of the data types that are of interest in computer science can be defined using initiality, although sometimes it may be necessary to add some auxiliary functions. All of this motivates the following:

Definition 6.2.4 The abstract data type (abbreviated ADT) defined by a specification P is the class of all initial P-algebras. □

6.3 Standard Models are Initial Models

We now address the basic question of what a standard model is by giving two intuitively motivated properties of a standard model, and then showing that any model satisfying these properties is in fact an initial model; because we already know that initial models are unique up to isomorphism, this settles the question.
Suppose we are given a theorem-proving problem that involves a signature Σ and a set A of equations that characterize the operations in Σ. Suppose further that M is a standard model of P = (Σ, A), and let h denote the unique Σ-homomorphism T_Σ → M. Then the two properties are as follows:

1. No Junk. For each m ∈ M, there is some t ∈ T_Σ such that m = h(t).
2. No Confusion. Given t, t′ ∈ T_Σ, then h(t) = h(t′) iff A ⊢ (∀∅) t = t′.

The intuitive justification for these principles is as follows: Because the elements of M are supposed to represent the entities that exist in the "world" of the problem, it would be wrong to allow entirely new entities. Similarly, it is necessary that all entities are distinct unless it follows from the statement of the problem that they must be the same. For example, consider the "Missionaries and Cannibals" problem, in which n Missionaries and n Cannibals are on one shore of a river, and all of them wish to get to the other shore, using a boat which can hold at most k people. If ever there are more Missionaries than Cannibals, either on one shore or the other, or in the boat, then all the Cannibals present in that place are converted to Christianity. The problem is to get everyone to the other shore without any conversions.

Clearly, it would not be legitimate to postulate a bridge over which everyone could just walk, or a second larger boat into which everyone could fit; this would be "junk." Similarly, it would not be legitimate to postulate some number of extra Cannibals to stand guard.
A different kind of illegitimate solution would simply assume that all the Missionaries are actually the same individual, with a number of different names; this would be a "confusion" of identities.

We can also give a "fair mystery story" interpretation of these two conditions: the first says that the butler didn't do it unless he was actually introduced into the story as a suspect ("no deus ex machina"), while the second says that all the characters are distinct unless the author has explicitly said otherwise ("no artificial aliases"). Thus, if the clues point to two different characters, the author would be cheating if he resolved the apparent conflict by saying that these two characters are really the same. Rather, he should give sufficient evidence to narrow the suspects down to just one.

[Figure 6.4: No Junk, No Confusion Proof]

Theorem 6.3.1 Given a specification P = (Σ, A), then a Σ-algebra M has no junk and no confusion relative to P iff it is an initial P-algebra.

Proof: The diagram in Figure 6.4 pertains to this proof.

If M is an initial P-algebra, then M ≅ T_Σ/≃_A, and the no junk and no confusion conditions are obvious for this algebra.

For the converse, we first show that if a Σ-algebra M has no junk and no confusion relative to P = (Σ, A), then it satisfies A. Let (∀X) t = t′ be in A, and let θ : X → M be a substitution; then we must show that θ̄(t) = θ̄(t′) in M. Let h : T_Σ → M be the unique Σ-homomorphism, and let |X| = {x_1, …, x_n}. By no junk, we may assume that θ(x_i) = h(t_i) for some t_i ∈ T_Σ. Because A ⊢ (∀∅) t(x_1 ← t_1, …, x_n ← t_n) = t′(x_1 ← t_1, …, x_n ← t_n) follows by equational deduction, and because θ̄(t) = h(t(x_1 ← t_1, …, x_n ← t_n)) and θ̄(t′) = h(t′(x_1 ← t_1, …, x_n ← t_n)), the no confusion condition gives us that θ̄(t) = θ̄(t′), as desired.

To show that M is initial, let M′ be another P-algebra, and let h : T_Σ → M and h′ : T_Σ → M′ be the unique homomorphisms. Now given m ∈ M, by no junk we may assume that m = h(t) for t ∈ T_Σ, and then define u : M → M′ by u(m) = h′(t). To show that this is well-defined, let us also assume that h(t) = h(t′); then by no confusion, A ⊢ (∀∅) t = t′, and so M′ ⊨ (∀∅) t = t′, which implies that h′(t) = h′(t′), as desired. Moreover, h ; u = h′ by construction, and u is unique because if it exists, it must satisfy the equation h ; u = h′, which as we have just seen determines its value. □

6.4 Initial Truth and Subalgebras

This section defines initial satisfaction, and proves the fundamental result that an initial algebra has no proper subalgebras. The proof is, I think, surprisingly simple and beautiful, and many important results about induction will follow from it in subsequent sections of this chapter.

Definition 6.4.1 Given a specification P = (Σ, A) and a Σ-equation e, we say that P initially satisfies e iff T_P ⊨ e; in this case we write P |≈ e or A |≈_Σ e, and we may omit the subscript Σ when it is clear from context. □

Notice that this is a semantic property. Because anything that is true of all models is certainly true of initial models, we have that P ⊨ e implies P |≈ e (where P ⊨ e means that A ⊨_Σ e). However, the converse does not hold:

Example 6.4.2 Let Σ contain a constant 0, a unary function symbol s, and a binary function symbol +, which we will write with infix notation; let A contain the equations

  (∀n) 0 + n = n
  (∀m, n) s(m) + n = s(m + n).

Then the commutative equation

  (∀m, n) m + n = n + m

holds in T_P but does not hold in every P-algebra.
For example, it does not hold in the Σ-algebra M with carrier all strings of a's and b's, with 0 ∈ Σ denoting the empty string in M, with m + n denoting the concatenation of the string n after the string m, and with s sending a string m to the string a + m. For example, a + b ≠ b + a, because ab ≠ ba. □

Theorem 6.4.4 below is another fundamental property of initial algebras, with a very simple proof. We will soon see that this result provides the foundation for proofs by induction. But first, we need the following: Recall that given a Σ-algebra M, another Σ-algebra M′ is a subalgebra (or sub-Σ-algebra) of M iff there is an inclusion M′ → M that is a Σ-homomorphism. In this case, we may write M′ ⊆ M. A subalgebra M′ of M is proper iff M′ ≠ M (i.e., iff M′_s ≠ M_s for some s ∈ S).

Example 6.4.3 If ω denotes the natural numbers, and Z denotes the integers (positive, negative and zero), and if Σ = Σ_NATP, then ω is a sub-Σ-algebra of Z. □

Exercise 6.4.1 Given a Σ-algebra M and given a subset M′_s ⊆ M_s for each s ∈ S, show that M′ gives the carriers of a subalgebra of M if and only if M_σ(m_1, …, m_k) ∈ M′_s for each σ ∈ Σ, where m_i ∈ M′_{s_i} for i = 1, …, k and where σ has arity s_1 … s_k and sort s. □

Exercise 6.4.2 Show that if M is a (Σ, A)-algebra and if M′ ⊆ M is a sub-Σ-algebra, then M′ is also a (Σ, A)-algebra. □

[Figure 6.5: No Proper Subalgebra Proof]

Theorem 6.4.4 If P = (Σ, A) is a specification, then an initial P-algebra has no proper sub-Σ-algebras.

Proof: The diagram in Figure 6.5 pertains to this proof.
Let h : T_Σ → T_P be the unique Σ-homomorphism, which is surjective by hypothesis, let j : M → T_P be the inclusion for a sub-Σ-algebra M, and let u : T_Σ → M be the unique Σ-homomorphism. Then by initiality of T_Σ and because j is a Σ-homomorphism, u ; j = h. Hence j is also surjective, and so T_P has no proper sub-Σ-algebra. Therefore, no other initial P-algebra can have a proper sub-Σ-algebra, because they are all isomorphic. □

6.5 Induction

In general, pure equational deduction is inadequate for proving properties of standard models, and many properties require the use of induction. Fortunately, there exist powerful induction principles for initial models, that let us prove that some predicate holds for all values by proving that it holds for each constructor whenever it holds for all arguments of that constructor. These generalize Peano induction from the natural numbers to arbitrary data types, and can be considered forms of "structural induction" [21], as discussed in Section 3.2.1.

The results in this section follow [137], and justify using induction in proof scores for unsorted signatures; Section 6.5.2 extends this to the many-sorted case; much more general results on induction, including its use on first-order formulae, are given in Chapter 8. The following basic definition applies to both the unsorted and many-sorted cases:

Definition 6.5.1 Given a signature Σ and a Σ-algebra M, then a subsignature Φ ⊆ Σ is a signature of constructors for M iff the unique Φ-homomorphism
Ev-ery specification has a signature of constructors and at least one min-imal signature of constructors. Clearly, using a minimal signature ofconstructors will require less effort in proofs. Although specificationsneed not in general have signatures of unique constructors, the specifi-cations that arise in practice often have a unique minimal signature of unique constructors. The properties of reachability and induction cor-respond to the “no junk” and “no confusion” conditions that togetherare equivalent to initiality [24, 137]. Example 6.5.2 If Σ = Σ NATP + , then Φ = Σ NATP is a minimal signature of construc-tors and also a signature of unique constructors for T NATP + . (cid:2) You may wish to review Section 2.7 on unsorted algebra before read-ing the next result. Theorem 6.5.3 ( Structural Induction I ) Given an unsorted specification P = ( Σ , A) and a signature Φ of constructors for P , let V be a subset of T P . Then V = T P if(0) c ∈ Φ implies [c] ∈ V , and(1) f ∈ Φ n for n > [t i ] ∈ V for i = , . . . , n imply [f (t , . . . , t n )] ∈ V . Proof: Because T P has no proper subalgebras by Proposition 6.4.4 and because V ⊆ T P , we need only show that V is closed under Φ ; but that is exactlywhat conditions (0) and (1) say. (cid:2) This very simple proof is possible because we have taken initial alge-bra semantics as our starting point. Note that a complete inductiveproof using a signature Φ of constructors must include a proof that Φ in fact is a signature of constructors, i.e., that there is a surjective Φ -homomorphism h : T Φ → T P . Often this surjective property will beproved using structural induction. But if A is ground canonical and allits normal forms are Φ -terms, then h is just the isomorphism of N P with T P that is discussed in Section 5.2.9. 
The usual formulation of induction follows easily from Theorem 6.5.3:

Corollary 6.5.4 (Structural Induction II) Given an unsorted specification P = (Σ, A) and a signature Φ of constructors for P, let Q(x) be a Σ({x})-sentence. Then A |≈_Σ (∀x) Q(x) if

(0) c ∈ Φ_0 implies A |≈_Σ Q(c), and
(1) f ∈ Φ_n for n > 0, t_i ∈ T_Σ for i = 1, …, n, and A |≈_Σ Q(t_i) for i = 1, …, n imply A |≈_Σ Q(f(t_1, …, t_n)).

Proof: This follows from Theorem 6.5.3 by letting V = { h(x) | A |≈_Σ Q(x) }, where h : T_Φ → T_P is the surjective Φ-homomorphism that exhibits Φ as a signature of constructors. □

Chapter 8 is much more precise about the notion of "sentence," but for now it suffices to think of sentences as including equations and their combinations under conjunction and implication. Corollary 6.5.4 justifies the use of simple induction in proof scores, and is used in examples below.

Example 6.5.5 When Σ = Σ_NATP, the above result states exactly the usual principle of induction for the natural numbers. □

Induction is basic to many theorem-proving systems, including the Boyer-Moore theorem prover [19], although not in the same form as above. Experience [59] shows that it is often easier to prove results by structural induction, as justified by the above results, than by so-called inductionless induction [56, 137, 140] using Knuth-Bendix completion, because structural induction arguments do not require showing the termination of new rule sets, and do not produce uncontrollable explosions of strange new rules that may gradually become less and less relevant; Garland and Guttag [49] report a similar experience.

Note that inductive proof techniques are not valid for loose semantics, because (in general) the results proved by induction are not true of all models, but only of the standard (initial) models.
Also, it is usually much easier to directly exploit the close connection between rewrite rules, initiality, and induction than to try to remain within a "loose" semantics framework by axiomatizing a standard model with explicit reachability and induction schemata, because the first of these requires existential quantification (e.g., Skolem functions) and the second requires second-order quantification.

The following proof scores use the inductive proof techniques introduced above. The first two examples import the following code for the natural numbers with addition:

  obj NAT is sort Nat .
    op 0 : -> Nat .
    op s_ : Nat -> Nat [prec 1] .
    op _+_ : Nat Nat -> Nat [prec 3] .
    vars M N : Nat .
    eq M + 0 = M .
    eq M + s N = s(M + N) .
  endo

Example 6.5.6 (Associativity of Addition) The following score proves that addition of natural numbers is associative:

  open NAT .
  ops l m n : -> Nat .
  ***> base case, n=0: l+(m+0)=(l+m)+0
  reduce l + (m + 0) == (l + m) + 0 .
  ***> induction step
  eq l + (m + n) = (l + m) + n .
  reduce l + (m + s n) == (l + m) + s n .
  close

Therefore we have proved the equation (∀L, M, N) L + (M + N) = (L + M) + N.
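The mechanics of such a score — reduce both sides of the base case, assert the induction hypothesis as an additional rewrite rule, then reduce both sides of the step case — can be replayed in a deliberately naive sketch (Python rather than OBJ; the term encoding and rule functions are of our own devising):

```python
# Ground terms over NAT plus three fresh constants l, m, n:
# "l" | "m" | "n" | "0" | ("s", t) | ("+", t1, t2).

def rewrite(t, rules):
    """Innermost rewriting to normal form with the given rules."""
    if isinstance(t, str):
        return t
    t = (t[0],) + tuple(rewrite(s, rules) for s in t[1:])
    for rule in rules:
        r = rule(t)
        if r is not None:
            return rewrite(r, rules)
    return t

def plus_zero(t):            # M + 0 = M
    if t[0] == "+" and t[2] == "0":
        return t[1]

def plus_succ(t):            # M + s N = s(M + N)
    if t[0] == "+" and isinstance(t[2], tuple) and t[2][0] == "s":
        return ("s", ("+", t[1], t[2][1]))

def hypothesis(t):           # induction hypothesis: l + (m + n) = (l + m) + n
    if t == ("+", "l", ("+", "m", "n")):
        return ("+", ("+", "l", "m"), "n")

nat = [plus_zero, plus_succ]

# Base case: l + (m + 0) and (l + m) + 0 reduce to the same normal form.
assert rewrite(("+", "l", ("+", "m", "0")), nat) == \
       rewrite(("+", ("+", "l", "m"), "0"), nat)

# Step case, with the hypothesis available as an extra rule:
lhs = ("+", "l", ("+", "m", ("s", "n")))
rhs = ("+", ("+", "l", "m"), ("s", "n"))
assert rewrite(lhs, nat + [hypothesis]) == rewrite(rhs, nat + [hypothesis])
```

Both pairs reduce to a common normal form, mirroring the two reduce … == … checks in the score above; the fresh constants l, m, n play the role of the arbitrary values over which the conclusion is universally quantified.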
□

Example 6.5.7 (Commutativity of Addition) The following proof score shows that addition of natural numbers is commutative:

  open NAT .
  ops m n : -> Nat .
  ***> first lemma0: 0 + n = n, by induction on n
  ***> base for lemma0, n=0
  reduce 0 + 0 == 0 .
  ***> induction step
  eq 0 + n = n .
  reduce 0 + s n == s n .
  *** thus we can assert
  eq 0 + N = N .
  ***> show lemma1: s m + n = s(m + n), again by induction on n
  ***> base for lemma1, n=0
  reduce s m + 0 == s(m + 0) .
  ***> induction step
  eq s m + n = s(m + n) .
  reduce s m + s n == s(m + s n) .
  *** thus we can assert
  eq s M + N = s(M + N) .
  ***> show m + n = n + m, again by induction on n
  ***> base case, n=0
  reduce m + 0 == 0 + m .
  ***> induction step
  eq m + n = n + m .
  reduce m + s n == s n + m .
  close

Of course, we should not assert commutativity as a rewrite rule, or we may get non-terminating behavior. □

We will see in Chapter 7 that the above results imply that we can use associative-commutative rewriting for addition in doing more complex examples.

It is interesting to contrast the above proofs with corresponding proofs due to Paulson in Cambridge LCF [147]. The LCF proofs are much more complex, in part because LCF allows partial functions, and then must prove them total, whereas functions are automatically total (on their domain) in equational logic.

Exercise 6.5.1 Use OBJ3 to prove that the equation (∀N) 0 + N = N holds for the specification NAT above. □

Exercise 6.5.2 Given the following code:

  obj INT is sort Int .
    ops (inc_)(dec_) : Int -> Int .
    op 0 : -> Int .
    vars X Y : Int .
    eq inc dec X = X .
    eq dec inc X = X .
    op _+_ : Int Int -> Int .
    eq 0 + Y = Y .
    eq (inc X) + Y = inc(X + Y) .
    eq (dec X) + Y = dec(X + Y) .
  endo

(a) What set of algebras does it denote? What are its signatures of constructors (if any)? What are its minimal signatures of constructors (if any)?
What are its signatures of unique constructors (if any)?

(b) Give an OBJ proof score for the equation (∀Y) Y + 0 = Y, and justify it. □

Exercise 6.5.3 This question refers to the same code as Exercise 6.5.2.

(a) Explain how to represent (−2) + 4, and explain how OBJ would reduce it.
(b) Give an OBJ proof score for (∀X, Y) X + (dec Y) = dec(X + Y), and justify it.
(c) Give an OBJ proof score for (∀X, Y) X + Y = Y + X, and justify it. □

6.5.2 Many-Sorted Induction

This section extends the results of Section 6.5 to the many-sorted case; hence, Σ is an S-sorted signature throughout. Again, much more general results may be found in Chapter 8.

Definition 6.5.8 Given a Σ-algebra M and s ∈ S, then a subsignature ∆ ⊆ Σ is inductive for s over M iff

(0) each δ ∈ ∆ has sort s, and
(1) Σ′ = { σ ∈ Σ | sort(σ) ≠ s } ∪ ∆ is a signature of constructors for M.

A signature ∆ that is inductive for s over M is minimal iff it is an inductive signature for s over M such that no proper subsignature is inductive for s over M. Given a specification P = (Σ, A), a signature ∆ is inductive for s over P iff it is inductive for s over T_P. □

Of course, we want to do as little work as possible in an inductive proof. A minimal inductive signature allows this.

Exercise 6.5.4 Show that a signature ∆ is a minimal inductive signature for s over M iff ∆ ⊆ Φ where Φ is a minimal signature of constructors for M. □

Theorem 6.5.9 (Structural Induction I′) Given a specification P = (Σ, A) and a signature ∆ that is inductive for sort s over P, let V ⊆ T_{P,s}. Then V = T_{P,s} if

(0) c ∈ ∆_{[],s} implies [c] ∈ V, and
(1) f ∈ ∆_{s1…sn,s} for n > 0 and t_i ∈ T_{Σ,s_i} for i = 1, …, n with [t_i] ∈ V if s_i = s imply [f(t_1, …, t_n)] ∈ V.

Proof: We first define an S-sorted set M by M_s = V and M_{s′} = T_{P,s′} for s′ ≠ s. Then (0) and (1) tell us that M is a Σ′-algebra, where Σ′ is as in Definition 6.5.8.
Now because T_P has no proper sub-Σ′-algebras by Theorem 6.4.4, we conclude that V = T_{P,s}. □

As before, the more familiar formulation of induction follows immediately:

Corollary 6.5.10 (Structural Induction II′) Given a specification P = (Σ, A) and a signature ∆ that is inductive for sort s over P, let Q(x) be a Σ({x})-sentence where x is a variable of sort s. Then A |≈_Σ (∀x) Q(x) if

(0) c ∈ ∆_{[],s} implies A |≈_Σ Q(c), and
(1) f ∈ ∆_{s1…sn,s} for n > 0, t_i ∈ T_{Σ,s_i} for i = 1, …, n, and A |≈_Σ Q(t_i) when s_i = s imply A |≈_Σ Q(f(t_1, …, t_n)).

Proof: This follows directly from Theorem 6.5.9 by letting V = { x ∈ T_{P,s} | A |≈_Σ Q(x) }. □

Actually, "if" can be replaced by "iff" in both Theorem 6.5.9 and Corollary 6.5.10; however, these converse "completeness" results do not seem to be useful in practice. Also, note that we can generalize all this a bit further by considering an S-sorted set V = { V_s | s ∈ S } defined by Q = { Q_s(x) | s ∈ S }.

Exercise 6.5.5 Consider the following specification:

  obj SET is sort Set .
    pr NAT .
    op {} : -> Set .
    op ins : Nat Set -> Set .
    op _U_ : Set Set -> Set [id: ({})] .
    vars N N' : Nat .
    vars S S' : Set .
    eq ins(N,ins(N',S)) = ins(N',ins(N,S)) .
    eq ins(N,ins(N,S)) = ins(N,S) .
    eq ins(N,S) U S' = ins(N,S U S') .
  endo

where NAT is the Peano natural numbers. Then do the following:

(a) Write a specification for SET that involves neither module importation (the line "pr NAT") nor attributes ("[id: ({})]"). What would result from executing the following in OBJ3?

  open SET .
  op s0 : -> Set .
  red ins(0,{}) .
  red {} U {} .
  red ins(0,ins(0,{})) .
  red ins(0,ins(0,s0)) .

(b) Give an inductive signature for the sort Set over SET which is minimal, and one which is not.
Explain why.

(c) Give a manual proof that the equation (∀S, S′) S ∪ S′ = S′ ∪ S holds for SET.

(d) Give an OBJ proof score for this equation. □

We can formulate induction as a general rule of inference. Let the notation A ⊩_Σ e indicate that e can be proved from A using the new rule given below plus the usual rules for ⊢_Σ, and assume that Δ is inductive for sort s over P = (Σ, A). Then the new rule is:

(I) Given t, t′ ∈ T_Σ({x}) with x of sort s, if A ⊩_Σ (∀∅) t(x ← c) = t′(x ← c) for each c ∈ Δ_{[],s}, and if A ⊩_Σ (∀∅) t(x ← t_i) = t′(x ← t_i) for i = 1, ..., n and f ∈ Δ_{s1...sn,s} imply A ⊩_Σ (∀∅) t(x ← f(t_1, ..., t_n)) = t′(x ← f(t_1, ..., t_n)), then A ⊩_Σ (∀x) t = t′.

Corollary 6.5.10 and the soundness of ⊢ show that ⊩_Σ is sound for initial truth, i.e., that A ⊩_Σ e implies A ⊨_Σ e. However, ⊩_Σ cannot be complete, i.e., in general the converse implication does not hold [129]. In fact, the set of equations satisfied by T_P is not in general even recursively enumerable, for reasons discussed in Section 5.10.

A Closer Look at State, Encapsulation and Implementation

Initial semantics works very well for static data structures like integers, lists, booleans, and even vectors and matrices, which are typically passed as values in programs unless they are very large. But such an approach is more awkward for dynamic data structures that have dedicated storage and commands that change the internal representation, which is not viewed directly, but only through "attributes." For example, it is usually more appropriate to view stacks as "state machines" with an encapsulated internal state, and with "top" as an attribute.
Although initial models exist for any reasonable specification of stacks, real stacks are more likely to be implemented by a model that is not initial, such as a pointer plus an array. This means that a notion of implementation is needed that differs from initial models. Moreover, in considering (for example) stacks of integers, the sorts for stacks and for integers have a different character, since the latter can still be modelled initially as data. Although an initial framework has been successfully used for some applications of this kind (e.g., see [86]), it is really better to take a viewpoint that explicitly distinguishes between "visible" sorts for data and "hidden" sorts for states, and that defines an implementation to be any model satisfying certain natural constraints; the hidden algebra developed in Chapter 13 takes such an approach.

Literature

The notions of quotient and image algebra, like the notions of homomorphism and isomorphism, developed as abstractions of the corresponding basic notions for groups and rings, mainly from lectures by Emmy Noether in the late 1920s; the same holds for the Homomorphism Theorem (Theorem 6.1.7). The development of minimal automata in Example 6.1.10 using the homomorphism theorem may be original, but algebraic treatments of automata go back to Anil Nerode and others in the earliest days of theoretical computer science [142, 154]. The universal properties of quotient and free algebras (Proposition 6.1.12 and Theorem 6.1.17) are notable for the smooth way that consequences can be drawn from them, as illustrated by the proof of Theorem 6.1.15 and the results at the end of Section 6.1; several proofs in the chapter also make good use of diagrams.
The concept of universal property comes from category theory, where it takes the elegant form of an adjoint pair of functors (see [128] or the end of [126]); it was also developed by Bourbaki in a more concrete form closer to the one in this text, which is also similar to that in the excellent abstract algebra text of Mac Lane and Birkhoff [128].

The importance of initiality for computing has developed gradually. The term "initial algebra semantics" and its first applications, to Knuthian attribute semantics, appear in [52], while the first applications to abstract data types are in [87]; a more complete and rigorous exposition is given in [86]. The terminology "no junk" and "no confusion" and Theorem 6.3.1 are from [24]. Many examples of initiality can be found in [89] and [137]; the latter especially develops connections with induction and computability. Results on the computational adequacy of initiality first appeared in the work of Bergstra and Tucker [11]. The principle that standard models are initial models extends beyond equational logic, for example, to standard models of Horn clause logic as used in (pure) Prolog; [79] discusses this and some other applications of the principle. A significant generalization of algebraic induction is given in Section 8.7.

Deduction and Rewriting Modulo Equations

This chapter generalizes term rewriting to the situation where some equations B are "built in" as part of matching, rather than used as rewrite rules; the phrase "modulo B" refers to this. We also introduce terms, equations, and deduction modulo B, give a decision procedure for the propositional calculus (using rewriting modulo associativity and commutativity), generalize theory from Chapter 5, including ways to prove termination and the Church-Rosser property, and apply all this to several kinds of digital hardware, among other things.
The reader may well have felt that the repeated use of the associative law in Examples 4.1.4 and 4.5.5 (as well as Exercises 4.6.1–4.6.4) was rather tedious, and perhaps even unnecessary. A more precise way to express this feeling is to say that because the associative law tells us that "parentheses are unnecessary," it should be unnecessary to move parentheses around in deductions that assume this law. For example, the two expressions

  (a⁻¹⁻¹ ∗ a⁻¹) ∗ (a ∗ a⁻¹)   and   (a⁻¹⁻¹ ∗ (a⁻¹ ∗ a)) ∗ a⁻¹

are "obviously" equal to each other, because they have the same unparenthesized form, namely a⁻¹⁻¹ ∗ a⁻¹ ∗ a ∗ a⁻¹.

Example 7.1.1 Eliminating parentheses in the proof of Example 4.5.5 gives the following:

  [1]  a ∗ a⁻¹ = e ∗ a ∗ a⁻¹          backwards by (lid) on GL
  [2]  = a⁻¹⁻¹ ∗ a⁻¹ ∗ a ∗ a⁻¹        backwards by (linv) on GL, with A = a⁻¹
  [3]  = a⁻¹⁻¹ ∗ e ∗ a⁻¹              by (linv) on GL
  [4]  = a⁻¹⁻¹ ∗ a⁻¹                  by (lid) on GL
  [5]  = e                            by (linv) on GL  □

Because we want to be precise throughout this book, we must ask what it means to "build in" associativity this way. One approach is to consider expressions that differ only in their parenthesization to be equivalent; the equivalence classes should then form an algebra such that we can do deduction on the classes in essentially the same way that we do deduction on terms. For example, the following is an equivalence class of terms modulo associativity:

  { (a ∗ b) ∗ (c ∗ d),  a ∗ (b ∗ (c ∗ d)),  a ∗ ((b ∗ c) ∗ d),  ((a ∗ b) ∗ c) ∗ d,  (a ∗ (b ∗ c)) ∗ d } .

We also want the unparenthesized expression a ∗ b ∗ c ∗ d to serve as surface syntax for this class, representing it to users in the familiar way. Because there are applications other than associativity, it is worthwhile to develop the theory at a general level.
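The equivalence class displayed above can also be checked mechanically. The following sketch (in Python, for concreteness; it is an illustration outside the book's OBJ framework, and all names in it are ours) generates every parenthesization of a flattened word as a nested pair; for the word a b c d it yields exactly the five terms of the class.

```python
# Enumerate the equivalence class of a term modulo associativity by
# generating every parenthesization (binary tree) over its flattened form.
# This is an illustrative sketch, not part of the OBJ system.

def parenthesizations(xs):
    """All binary trees over the sequence xs, as nested pairs."""
    if len(xs) == 1:
        return [xs[0]]
    out = []
    for i in range(1, len(xs)):           # split point between left and right
        for left in parenthesizations(xs[:i]):
            for right in parenthesizations(xs[i:]):
                out.append((left, right))
    return out

cls = parenthesizations(["a", "b", "c", "d"])
print(len(cls))    # 5, the Catalan number C_3
```

The count 5 is the Catalan number C_3, which in general bounds the size of an associativity class for a word of n factors.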
For example, if we want to study commutative groups, which in addition to the usual group laws also satisfy the commutative law,

  (∀X, Y) X ∗ Y = Y ∗ X ,

then we need to avoid using this equation as a rewrite rule, because it can lead to non-terminating computations, such as

  a ∗ b ⇒ b ∗ a ⇒ a ∗ b ⇒ · · · .

This observation means it is impossible to give a canonical term rewriting system for the theory of commutative groups. However, by regarding terms that differ only in the order of their factors as equivalent, we can build in commutativity and thus avoid non-termination. For example, (a ∗ b) ∗ c has the following class of equivalent terms

  { (a ∗ b) ∗ c,  (b ∗ a) ∗ c,  c ∗ (a ∗ b),  c ∗ (b ∗ a) } ,

and moreover, the class { a ∗ b, b ∗ a } is a normal form under rewriting modulo commutativity.

There are also many examples, including commutative groups, where it is useful to identify terms that are the same up to the order of their factors and their parenthesization. Thus, there are at least three interesting cases: associativity, commutativity, and both together, which are often abbreviated A, C, and AC, respectively.

Exercise 7.1.1 Prove that there are 12 terms in the equivalence class modulo AC of the term (a ∗ b) ∗ c, and write them all out. Also prove that there are 5 terms in the equivalence class of (a ∗ b) ∗ (c ∗ d) modulo associativity. □

Deduction Modulo Equations

The "Semantics First" slogan of Chapter 1 implies that a discussion of deduction should be preceded by a discussion of satisfaction, as a standard against which to test soundness and completeness. We now do this for deduction with a set A of equational axioms, modulo a set B of equations. The first step is to define the kind of equation involved; we begin with B-equivalence classes of Σ-terms, i.e., with elements of T_{Σ,B}(X), as defined in Section 6.1.1.
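The count asked for in Exercise 7.1.1 can be checked mechanically. The sketch below (in Python, as an illustration outside the book's framework; all names are ours) enumerates an AC-class as every parenthesization of every permutation of the factors.

```python
# Enumerate the equivalence class of a term modulo AC: take every
# permutation of the factors, and every parenthesization of each.
# An illustrative sketch, not part of the OBJ system.

from itertools import permutations

def parenthesizations(xs):
    """All binary trees over the sequence xs, as nested pairs."""
    if len(xs) == 1:
        return [xs[0]]
    out = []
    for i in range(1, len(xs)):
        for left in parenthesizations(xs[:i]):
            for right in parenthesizations(xs[i:]):
                out.append((left, right))
    return out

def ac_class(leaves):
    """All terms AC-equal to a product of the given (distinct) leaves."""
    terms = set()
    for p in permutations(leaves):
        terms.update(parenthesizations(list(p)))
    return terms

print(len(ac_class(["a", "b", "c"])))    # 12, as Exercise 7.1.1 claims
```

For three distinct factors there are 3! = 6 orders and 2 shapes each, giving the 12 terms of the exercise.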
Definition 7.2.1 Given a set B of (possibly conditional) Σ-equations, a (conditional) Σ-equation modulo B, or (Σ, B)-equation, is a 4-tuple ⟨X, t, t′, C⟩ with t, t′ ∈ T_{Σ,B}(X) and C a finite set of pairs from T_{Σ,B}(X), usually written (∀X) t =_B t′ if C; we may use the same notation with t, t′, C all Σ-terms, to represent their B-equivalence classes, and we may drop the B subscripts. We write just (∀X) t =_B t′ when C = ∅, and call it an unconditional equation.

Given a (Σ, B)-algebra M, Σ-satisfaction modulo B, written M ⊨_{Σ,B} (∀X) t =_B t′ if C, is defined by: ā(t) = ā(t′) whenever ā(u) = ā(v) for each ⟨u, v⟩ ∈ C, for all a : X → M, where Theorem 6.1.17 provides the unique Σ-homomorphism ā : T_{Σ,B}(X) → M extending a. Given a set A of (Σ, B)-equations, let A ⊨_{Σ,B} e mean that M ⊨_{Σ,B} A implies M ⊨_{Σ,B} e for all B-models M. We may drop the subscripts Σ and/or B if they are clear from context. □

We can get class deduction versions of the rules for equational inference in Chapter 4, just by replacing each occurrence of T_Σ by T_{Σ,B} and each occurrence of = by =_B, assuming that all axioms in A are (Σ, B)-equations. We denote the B-class deduction version of rule (i) by (i_B), and we let A ⊢_{Σ,B} e indicate deduction modulo B, also called class deduction, which is deduction using the class versions of the rules of Chapter 4, including the rule (C_B) below, to deduce e from A modulo B; as above, we may drop either or both subscripts Σ and B if they are clear from context. For example, here is the class deduction version of rule (C):

(C_B) Forward Conditional Subterm Replacement.
Given t₀ ∈ T_{Σ,B}(X ∪ {z}_s) with z ∉ X, if (∀Y) t₁ =_B t₂ if C is of sort s and is in A, and if θ : Y → T_{Σ,B}(X) is a substitution such that (∀X) θ(u) =_B θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then

  (∀X) t₀(z ← θ(t₁)) =_B t₀(z ← θ(t₂))

is also deducible.

Note that this uses substitution modulo B, Definition 6.1.19 of Chapter 6.

The next result lets us carry over soundness and completeness results from Chapter 4 to class deduction. It says that deduction from A modulo B is equivalent to deduction from A ∪ B on representatives, and that satisfaction of an equation modulo B is equivalent to ordinary satisfaction by a representative of the equation. To express this more precisely, given a Σ-equation e of the form (∀X) t = t′, let [e] denote its modulo B version, (∀X) [t] =_B [t′], and similarly for conditional equations. Given a set A of Σ-equations, let [A] denote the set of modulo B versions of equations in A; for more clarity, we may also write [t]_B, [e]_B and [A]_B.

Proposition 7.2.2 (Bridge) Given sets A, B of Σ-equations and another Σ-equation e (with A, B and e possibly conditional), then

  [A] ⊢_B [e]  iff  A ∪ B ⊢ e .

Furthermore, given any (Σ, B)-algebra M and a (possibly conditional) Σ-equation e, then M ⊨_{Σ,B} [e] iff M ⊨_Σ e.

Proof: For the first assertion, if e₁, ..., e_n is a proof of e from A ∪ B, then the subsequence [e_{i1}], ..., [e_{ik}] formed by omitting all steps that used B, and then taking the B-classes of those equations that remain, is a proof of [e] from [A]; and conversely, any proof [e₁], ..., [e_k] of [e] from [A] can be filled out to become a proof of e from A ∪ B, by choosing representatives for each [e_i] and adding intermediate steps that use B.

For the second assertion, let q : T_Σ(X) → T_{Σ,B}(X) be the quotient map, let e be the equation (∀X) t = t′ if C, and (just for now) let ã denote the extension of a : X → M to a Σ-homomorphism T_{Σ,B}(X) → M, with ā the usual Σ-homomorphism T_Σ(X) → M. Then

  (∗)  q ; ã = ā ,

by the universal property of Theorem 6.1.17, because both sides are Σ-homomorphisms T_Σ(X) → M that agree on X, since q(x) = [x] for x ∈ X implies ã(q(x)) = a(x) for x ∈ X. Then M ⊨ e iff for all a : X → M, ā(t) = ā(t′) whenever ā(u) = ā(v) for all ⟨u, v⟩ ∈ C, and composing with q gives us that for all a : X → M, ã([t]) = ã([t′]) whenever ã([u]) = ã([v]) for all ⟨u, v⟩ ∈ C, because of (∗). But this says that M ⊨_B [e]. □

Theorem 7.2.3 (Completeness) Given sets A, B of Σ-equations and another Σ-equation e (all possibly conditional), then the following are equivalent:

  (1) [A] ⊢_B [e]
  (2) [A] ⊨_B [e]
  (3) A ∪ B ⊢ e
  (4) A ∪ B ⊨ e

Proof: The first part of Proposition 7.2.2 gives the equivalence of (1) with (3), the Completeness Theorem (4.8.4) gives the equivalence of (3) with (4), and the second part of Proposition 7.2.2 gives the equivalence of (2) with (4). □

We also have the following completeness result, which (with the Theorem of Constants) justifies the calculation in Example 7.1.1, since each step there is an instance of rule (±C_B):

Theorem 7.2.4 Given sets A, B of (possibly conditional) Σ-equations and an unconditional Σ-equation e, then [A] ⊢_B [e] iff [e] is deducible from [A] using only the class versions of the rules of Theorem 4.9.1, with (±C_B) in place of (±C). Moreover, either holds iff M ⊨_B [e] for all (Σ, A ∪ B)-algebras M.
Proof: The two predicates in the first assertion are equivalent to (A ∪ B) ⊢ e and to deducibility of e from A ∪ B using only the rules of Theorem 4.9.1, respectively, the latter by reasoning analogous to that in Proposition 7.2.2; hence they are equivalent by Theorem 4.9.1. The second assertion now follows by Theorem 7.2.3. □

When B consists of just the associative law, every equivalence class in any T_Σ(X)/≃_B is finite; however, there is no upper bound on the number of terms that may be in these classes. Moreover, there are specifications where the equivalence classes are actually infinite. For example, if B contains an identity law (x ∗ e = x), then equivalence classes modulo B are infinite; the same holds for an idempotent law (x ∗ x = x). Consequently, equivalence classes are not feasible representations for systems like OBJ, either for surface syntax seen by users, or for concrete data structures used internally for calculation. However, they are fundamental for semantics.

When B consists of the associative law, B-equivalence classes have a simple natural representation based on omitting parentheses; for example, the equivalence class of (a ∗ b) ∗ c can be represented to users as a ∗ b ∗ c and internally as ∗(a, b, c), which avoids the ambiguity of the simpler list representation (a, b, c) if more than one binary operation is declared associative. Similarly, bags and sets of terms can represent terms modulo AC, and AC plus idempotency, respectively, and these can be implemented using standard concrete data structures. However, there is no optimal linear representation for built-in commutativity, since any linear representation must choose some ordering for subterms.
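The representation idea just described can be sketched as follows (in Python, for illustration; the function names and data layout are our assumptions, not OBJ3's actual internals): flattening an associative operator's arguments gives the internal form ∗(a, b, c), and additionally sorting the flattened arguments gives a canonical form modulo AC.

```python
# Canonical representations modulo A and modulo AC: terms are nested
# tuples ("op", arg1, arg2). An illustrative sketch, not OBJ3 internals.

def flatten(op, t):
    """Canonical form modulo associativity: (op, arg, arg, ...)."""
    if isinstance(t, tuple) and t[0] == op:
        out = [op]
        for arg in t[1:]:
            out.extend(flatten(op, arg)[1:])   # splice in nested op-terms
        return tuple(out)
    return (op, t)    # anything else is treated as an opaque argument

def canon_assoc(op, t):
    return flatten(op, t)

def canon_ac(op, t):
    head = flatten(op, t)
    # sort arguments by their printed form, so AC-equal terms coincide
    return (op,) + tuple(sorted(head[1:], key=repr))

t1 = ("*", ("*", "a", "b"), "c")    # (a * b) * c
t2 = ("*", "c", ("*", "b", "a"))    # c * (b * a)
print(canon_assoc("*", t1))                      # ('*', 'a', 'b', 'c')
print(canon_ac("*", t1) == canon_ac("*", t2))    # True
```

Note that the AC form must pick some ordering of the arguments (here, by printed form), which is exactly the arbitrary choice mentioned above for linear representations of commutativity.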
Theorem 7.2.3 shows that equivalence classes can be represented by terms, and in fact this is what OBJ3 does, for both calculation and surface syntax, except for associativity, where omitting parentheses is the default; this default can be changed with the command "set print with parens on".

Another difficulty involved with implementing deduction modulo B with equivalence classes is that occurrences of a variable in [t]_B may not be well defined, since different representatives of the class may have different numbers of occurrences. For example, if B contains an idempotent law for a binary operation ∗, then the terms

  x,  x ∗ x,  (x ∗ x) ∗ x,  x ∗ (x ∗ x),  . . .

are all B-equivalent, so that the class [x]_B contains terms with n occurrences of x for every n > 0. Similarly, if B contains a zero law (x ∗ 0 = 0), then [x ∗ 0]_B contains infinitely many terms, e.g., with n occurrences of x for every n ≥ 0. In such cases subterm replacement cannot make sense at a single instance of a variable, i.e., the rule (±) does not generalize to arbitrary equational theories B; this is unfortunate because this rule is the basis for term rewriting.

However, we can make single-occurrence rewriting work on B-classes with two additional assumptions. Recall that an equation (∀X) t = t′ is balanced iff var(t) = var(t′) = X, and note that if an equation in B does not have the same variables on its two sides, then deduction modulo B may require finding values for the unmatched variables, which in general cannot be done automatically. Also, recall that an equation (∀X) t = t′ is linear iff t and t′ each have at most one occurrence of each variable in X. For example, associative, commutative and identity laws are all both linear and balanced, while an idempotent law is balanced but not linear, and a zero law is linear but not balanced.
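The two side conditions just defined are easy to check mechanically. The sketch below (in Python, for illustration only; the term encoding is our assumption: variables are strings beginning with an uppercase letter, compound terms are tuples (op, ...)) classifies the three example laws exactly as in the text.

```python
# Check whether an equation lhs = rhs is balanced (same variable sets on
# both sides) and linear (each side uses each variable at most once).
# An illustrative sketch; the term encoding is an assumption.

from collections import Counter

def var_count(t, acc=None):
    """Multiset of variable occurrences in a term."""
    acc = Counter() if acc is None else acc
    if isinstance(t, tuple):
        for arg in t[1:]:          # t[0] is the operation symbol
            var_count(arg, acc)
    elif t[:1].isupper():          # our convention for variables
        acc[t] += 1
    return acc

def balanced(lhs, rhs):
    """Both sides use exactly the same set of variables."""
    return set(var_count(lhs)) == set(var_count(rhs))

def linear(lhs, rhs):
    """Each side uses each variable at most once."""
    return all(n <= 1 for side in (lhs, rhs)
               for n in var_count(side).values())

comm = (("*", "X", "Y"), ("*", "Y", "X"))    # commutative law
idem = (("*", "X", "X"), "X")                # idempotent law
zero = (("*", "X", "0"), "0")                # zero law
print(balanced(*comm), linear(*comm))    # True True
print(balanced(*idem), linear(*idem))    # True False
print(balanced(*zero), linear(*zero))    # False True
```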
Fact 7.2.5 Given B linear and balanced and [t] ∈ T_{Σ,B}(X), then every term t′ that is B-equivalent to t has the same number of occurrences of any x ∈ X as t does.

Proof: This is because the number of occurrences of a variable symbol is the same in the result of applying a linear balanced equation to a term as it was in the original term. □

Therefore, if B is linear and balanced, the following makes sense:

(±,B) Bidirectional Single Subterm Replacement Modulo B. Given t₀ ∈ T_{Σ,B}({z}_s ∪ Y) with exactly one occurrence of z, where z ∉ Y, and given a substitution θ : X → T_{Σ,B}(Y), if (∀X) t₁ =_B t₂ or (∀X) t₂ =_B t₁ is of sort s and is in A, then

  (∀Y) t₀(z ← θ(t₁)) =_B t₀(z ← θ(t₂))

is deducible.

This suggests the following notion of class rewriting modulo B, based on single subterm replacement: if B is linear and balanced and A is a set of modulo B rewrite rules, define an abstract rewriting system on the (S-sorted) set T_{Σ,B} of B-equivalence classes of ground Σ-terms by c ⇒_{[A/B]} c′ iff there is some c₀ ∈ T_{Σ,B}({z}_s) such that c = c₀(z ← θ(t₁)) and c′ = c₀(z ← θ(t₂)), for some (modulo B) substitution θ and some rule t₁ → t₂ of sort s in A. Later in this chapter, we show that class rewriting can be defined without restricting B to be linear or balanced. As noted above, class rewriting is impractical, because classes can be very large, even infinite; nevertheless, our later general version provides a semantic standard for the correctness of efficient implementations like that in OBJ3, which rewrites representative terms rather than classes. This is discussed in detail in Section 7.3, and also appears in the next subsection.

OBJ3 implements deduction modulo any combination of A, C and I (where "I" stands for identity), for any subset of binary operations in the signature; the equations in B are declared using attributes of operations.
For example, an operation modulo CI is declared by

  op _*_ : S S -> S [comm id: e] .

Note that the identity constant "e" must be declared explicitly, because there could be other constants of an appropriate sort. (For the non-commutative case, an identity attribute asserts both the left and right identity laws.) The keyword "assoc" is used for associativity. We illustrate OBJ's rewriting modulo associativity with the following proof of the right inverse law for left groups, as in Example 7.1.1; the reader may wish to first review Section 4.6.

Example 7.2.6 (Right Identity for Left Groups) We must first give a new version of the specification that treats associativity as a built-in equation rather than as a rewrite rule. Then we do the proof itself, beginning with a constant for the universal quantifier. The "range" notation, as in "[2 .. 3]", is explained after the proof.

  th GROUPLA is sort Elt .
    op _*_ : Elt Elt -> Elt [assoc] .
    op e : -> Elt .
    op _-1 : Elt -> Elt [prec 2] .
    var A : Elt .
    [lid] eq e * A = A .
    [linv] eq A -1 * A = e .
  endth

  open .
  ***> first prove the right inverse law:
  op a : -> Elt .
  start a * a -1 .
  apply -.lid at term .            ***> should be: e * a * a -1
  apply -.linv with A = (a -1) at [1] .
                                   ***> should be: a -1 -1 * a -1 * a * a -1
  apply .linv at [2 .. 3] .        ***> should be: a -1 -1 * e * a -1
  apply .lid at [2 .. 3] .         ***> should be: a -1 -1 * a -1
  apply .linv at term .            ***> should be: e
  [rinv] eq A * A -1 = e .         ***> add the proven equation:
  start a * e .                    ***> now prove the right identity law:
  apply -.linv with A = a at [2] . ***> should be: a * a -1 * a
  apply .rinv at [1 .. 2] .        ***> should be: e * a
  apply .lid at term .             ***> should be: a
  close

The keyword "term" indicates application of the rule at the top of the current term (i.e., with t₀ = z in rule (±,B)), while the notation "[2]" indicates application at its second subterm, and "[2 .. 3]" indicates the subterm consisting of the second and third subterms; the selected rule is applied at most once, and fails if the selected subterm does not match. The next section shows how this example can be simplified even further by using term rewriting modulo equations, in addition to deduction modulo equations. □

We now discuss apply in somewhat more detail; a complete description is given in [90]. The notations "[k]" and "[k .. n]" are used for binary operations that are associative only; incidentally, "[k .. k]" is equivalent to "[k]", and "[]" is not allowed. Because OBJ3 represents terms modulo A, C, and AC with ordinary terms, if you know how the representing term is parenthesized, then in each case you can select subterms using the parenthetic occurrence notation of Section 4.6. Thus, instead of "[2]" above, we could have written "(2)"; however, the square bracket range notation is preferable. You can see the parenthesization with the command "apply print at term", provided "print with parens" is on. The default parenthesization takes the rightmost subterm as the innermost; note that applying an equation may cause re-parenthesization closer to the default form, even in subterms disjoint from the redex.

Occurrence notation must be used for selecting subterms of operations that are commutative only. Note that "()" is a valid occurrence, and is equivalent to "term"; another synonym is "top". The "set" notations "{k}" and "{k , . . . , n}" are used for AC binary operations, analogously to "[k]" and "[k .. n]" for associative operations.

Because specifications can have multiple binary operations with varying combinations of modulo attributes, a notation is needed for composing the three selection methods discussed above. For example, suppose ∗, +, ∘ are respectively associative, AC, and without attributes, and consider the term

  t = (a ∗ b) ∘ (a + b + (a ∗ c)) .
Then the selector

  [1] of {3} of (2)

selects the subterm a, while

  {3,1} of (2)

will select the subterm (a ∗ c) + a, also causing the representing term to be rearranged. A final "of term" is optional for composite selectors. If we knew that the representation of t was parenthesized as (a ∗ b) ∘ (a + (b + (a ∗ c))), then we could also do the first selection above using the occurrence notation (2 3 1); however, the second selection above cannot be done with occurrence notation. Note the reversal of order between "[1] of {3} of (2)" and "(2 3 1)". Note also that the command

  apply print at {3,1} of (2)

will cause the representation of t to be rearranged as (a ∗ b) ∘ (((a ∗ c) + a) + b), even though no deduction is involved.

Instead of "at", the keyword "within" can be used, indicating a single application at some proper subterm of the selected term; this can be convenient when there is a unique subterm that matches. A summary of the syntax for apply is produced by the command "apply ?". To better understand this material, in addition to the two exercises below, the reader should also look at Example 7.3.6 on page 194.

Exercise 7.2.1 (1) Do the proof in Example 7.2.6, everywhere replacing range notation with occurrence notation. (2) Do the proof of Example 7.2.6 using "within" wherever possible. □

Exercise 7.2.2 The following is part of the calculus of relations (Appendix C has some basics):

  th REL is sort Rel .
    op I : -> Rel .
    op _U_ : Rel Rel -> Rel [assoc comm] .
    op _;_ : Rel Rel -> Rel [assoc] .
    vars R R1 R2 : Rel .
    eq I ; R = R .
    eq R ; I = R .
    eq R ;(R1 U R2) = (R ; R1) U (R ; R2) .
    eq (R1 U R2); R = (R1 ; R) U (R2 ; R) .
  endth

These operations and laws constitute a so-called semi-ring. Now add the recursive definition

  op _*_ : Rel Nat -> Rel .
  var R : Rel . var N : Nat .
  eq R * 0 = I .
  eq R * s N = (R * N);(I U R) .

and prove that if R * N = R * s N, then also

  (R U I);(R * N) = (R * N) .
(In this case, R * N is the transitive reflexive closure of R.) Use the OBJ3 apply feature to do all the calculations, although the induction itself remains outside the OBJ3 framework. Hint: Use induction to prove that R ;(R * N) = (R * N); R for all R and all N. □

Term Rewriting Modulo Equations

Term rewriting is the main computational engine for theorem proving in this book, and this section develops rewriting modulo B, which has significant advantages over the class rewriting sketched in Section 7.2.1: (1) it uses ordinary rules instead of modulo B rules; (2) B need be neither balanced nor linear; (3) infinite classes of terms are avoided; and (4) occurrences make sense. We assume throughout that B is unconditional.

Definition 7.3.1 A modulo term rewriting system, abbreviated MTRS, consists of a signature Σ, a set B of Σ-equations, and a Σ-term rewriting system A, written (Σ, A, B).

Given an MTRS (Σ, A, B), define an ARS on T_{Σ,B} by c ⇒_{[A/B]} c′ iff there are t, t′ ∈ T_Σ such that c = [t], c′ = [t′], and t ⇒_A t′. This relation is called one-step class rewriting, and its transitive, reflexive closure is class rewriting.

Given an MTRS (Σ, A, B), define an ARS on T_Σ by t ⇒_{A/B} t′ iff there are t₁, t₂ ∈ T_Σ such that t ≃_B t₁, t′ ≃_B t₂, and t₁ ⇒_A t₂. This relation is called one-step rewriting with A modulo B, and its transitive, reflexive closure is rewriting modulo B.

We extend to rewriting terms with variables by extending Σ to Σ(X), and in this case write c ⇒_{[A/B],X} c′ and t ⇒_{A/B,X} t′, which are defined on T_{Σ(X),B} and T_{Σ(X)} respectively. □

The proof of the following is similar to those of Proposition 5.1.9 and Corollary 5.1.10:

Proposition 7.3.2 Given t, t′ ∈ T_Σ(Y), Y ⊆ X and MTRS (Σ, A, B), then t ⇒_{A/B,X} t′ iff t ⇒_{A/B,Y} t′, and in both cases var(t′) ⊆ var(t).
Also t ⇒*_{A/B,X} t′ iff t ⇒*_{A/B,Y} t′, and in both cases var(t′) ⊆ var(t). □

Thus both ⇒_{A/B,X} and ⇒*_{A/B,X} restrict and extend reasonably over variables, so that we can drop the subscript X, with the understanding that any X such that var(t) ⊆ X may be used. On the other hand, ⇔*_{A/B,X} does not restrict and extend reasonably, as shown by Example 5.1.15. Thus, we define t ⇔*_{A/B} t′ to mean that there exists an X such that t ⇔*_{A/B,X} t′. Example 5.1.15 also shows bad behavior for ≃^X_{A/B} (defined by t ≃^X_{A/B} t′ iff A ∪ B ⊨ (∀X) t = t′), although the concretion rule (8) of Chapter 4 (extended to rewriting modulo) implies that ≃^X_{A/B} does behave reasonably when the signature is non-void. Defining ↓_{A/B,X} as usual from the ARS of an MTRS (Σ, A, B), we can generalize Proposition 5.1.13 to show that ↓_{A/B,X} also restricts and extends reasonably, again allowing the subscript X to be dropped:

Proposition 7.3.3 Given t₁, t₂ ∈ T_Σ(Y), Y ⊆ X and MTRS (Σ, A, B), then we have t₁ ↓_{A/B,X} t₂ if and only if t₁ ↓_{A/B,Y} t₂, and moreover, these imply A ∪ B ⊢ (∀X) t₁ = t₂. □

There are good implementations of rewriting modulo B for those B that are actually available through attributes in OBJ. Although rewriting modulo B accurately describes what OBJ does, it is not how OBJ actually does it, because this would require much needless search for matches; Section 7.3.3 gives some details of what OBJ3 really does, which of course is equivalent to rewriting modulo B.
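The relation t ⇒_{A/B} t′ can be animated on a toy scale. The sketch below (in Python, for illustration; it is a brute-force search over representatives, not OBJ3's matching algorithm, and its names are ours) applies the single rule X + 0 → X at the top of a term by searching the term's commutativity class for a representative that matches.

```python
# One-step rewriting modulo commutativity of '+': to rewrite t, look for
# some t1 with t =_B t1 such that the rule X + 0 -> X applies to t1.
# An illustrative brute-force sketch, not OBJ3's algorithm.

def c_variants(t):
    """All terms B-equal to t, where B is commutativity of '+'."""
    if isinstance(t, tuple) and t[0] == "+":
        _, l, r = t
        out = set()
        for lv in c_variants(l):
            for rv in c_variants(r):
                out.add(("+", lv, rv))
                out.add(("+", rv, lv))
        return out
    return {t}

def rewrite_plus_zero(t):
    """One step of X + 0 -> X at the top, modulo commutativity;
    returns t unchanged if no representative matches."""
    for v in c_variants(t):
        if isinstance(v, tuple) and v[0] == "+" and v[2] == "0":
            return v[1]
    return t

print(rewrite_plus_zero(("+", "0", "a")))    # 'a'
print(rewrite_plus_zero(("+", "a", "b")))    # unchanged: ('+', 'a', 'b')
```

The needless search over B-variants is exactly why a practical implementation matches modulo B directly rather than enumerating representatives.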
The following result (generalizing 5.1.11 and 5.1.16) says that rewriting modulo B is equivalent to class rewriting, and that both are semantically sound and complete:

Theorem 7.3.4 (Completeness) Given an MTRS (Σ, A, B) and t, t′ ∈ T_Σ(X), then

  t ⇒_{A/B} t′ iff [t] ⇒_{[A/B]} [t′] ,
  t ⇒*_{A/B} t′ iff [t] ⇒*_{[A/B]} [t′] ,
  [t] ⇒*_{[A/B]} [t′] implies [A] ⊢ (∀X) [t] = [t′] ,
  t ⇔*_{A/B} t′ iff A ∪ B ⊢ (∀X) t = t′ .

Thus ⇔*_{A/B} is complete for satisfaction of A ∪ B, and ⇔*_{[A/B]} is complete for satisfaction of A modulo B. Moreover, ⇔*_{A/B} and ≃^X_{A∪B} are equal relations on terms with variables in X.

Proof: The first assertion follows from the definitions of ⇒_{A/B} and ⇒_{[A/B]} (Definition 7.3.1), and the second follows from the first by induction. The third follows from ⇒_{[A/B]} being a rephrasing of the rule (+_B). The forward direction of the fourth follows from the second and third, plus (1) implies (4) of Theorem 7.2.3, while the converse follows from Theorem 5.1.16 for A ∪ B and the definition of ⇒*_{A/B}. The fifth and sixth follow from the fourth plus Theorem 7.2.3. □

Example 7.3.5 (Right Identity for Left Groups) Although the proof in Example 7.2.6 using built-in associativity is simpler than the original proof in Section 4.6, it can be simplified even further by using rewriting modulo associativity, replacing the last two apply commands in both the first and the second subproof by "apply reduction within term .".

  open GROUPLA .
  op a : -> Elt .
  start a * a -1 .              ***> first prove the right inverse law:
  apply -.lid at term .         ***> should be: e * a * a -1
  apply -.linv with A = (a -1) within term .
                                ***> should be: a -1 -1 * a -1 * a * a -1
  apply reduction within term . ***> should be: e
  [rinv] eq A * A -1 = e .      ***> add the proven equation
  start a * e .                 ***> now prove the right identity law
  apply -.linv with A = a within term .
                                ***> should be: a * a -1 * a
  apply reduction within term . ***> should be: a
  close

Now the first subproof takes only three apply commands, and the second only two. But it is something of an accident that this works, because a different ordering of rewrites could have blocked this proof by undoing some prior backward applications; thus, the proof style of Example 7.2.6 represents a more safe and sure way to proceed. □

Example 7.3.6 The definition of ring is as follows (e.g., see [128], Chapter IV), noting that the multiplication * need not be commutative:

  th RING is sort R .
    ops 0 1 : -> R .
    op _+_ : R R -> R [assoc comm idr: 0 prec 5] .
    op _*_ : R R -> R [assoc idr: 1 prec 3] .
    op -_ : R -> R [prec 1] .
    vars A B C : R .
    [ri] eq A + (- A) = 0 .
    [ld] eq A *(B + C) = A * B + A * C .
    [rd] eq (B + C) * A = B * A + C * A .
  endth

We will prove that (∀A) A ∗ 0 = 0. For this, we should turn on the print with parens feature, so that when rules are shown, we can check how they are parenthesized.

  open .
  ops a b c : -> R .
  show rules .
  start a * 0 .
  apply -.6 at top .
  apply -.ri with A = a * a at 1 .
  apply -.ld with A = a at [1 .. 3] .
  apply red at term .
  close

Rule .6 is (0 + X_id) = X_id, which was generated by the "idr: 0" declaration (we learn this from the output of the show rules command). The result of the final reduction is 0, and so the proof is done. □

Modulo B versions of the basic term rewriting concepts are obtained from the ARS definitions applied to the class rewriting relation:

Definition 7.3.7 Given an MTRS M = (Σ, A, B), then A is terminating modulo B iff ⇒_{[A/B]} is terminating, and is Church-Rosser modulo B iff ⇒_{[A/B]} is Church-Rosser. Ground terminating, ground Church-Rosser, canonical, ground canonical, etc. modulo B are defined similarly, and we also say that M is terminating, Church-Rosser, canonical, etc.
□

From this and Theorem 7.3.4 we get the following:

Proposition 7.3.8 An MTRS M = (Σ, A, B) is terminating iff ⇒_{A/B} is terminating, i.e., iff there is no infinite sequence t_1, t_2, t_3, ... of Σ-terms such that t_1 ⇒_{A/B} t_2 ⇒_{A/B} t_3 ⇒_{A/B} ···. Similarly, M is ground terminating iff there is no such infinite sequence of ground Σ-terms. Also, M is Church-Rosser iff whenever t ⇒*_{A/B} t_1 and t ⇒*_{A/B} t_1′, there exist t_2, t_2′ equivalent modulo B such that t_1 ⇒*_{A/B} t_2 and t_1′ ⇒*_{A/B} t_2′, and is ground Church-Rosser iff this property holds for all ground terms t. Moreover, a Σ-term t is a normal form for M iff there is no t′ such that t ⇒_{A/B} t′, and is a normal form for a Σ-term t′ iff t is a normal form and t′ ⇒*_{A/B} t. Finally, t is a normal form for ⇒_{A/B} iff [t]_B is a normal form for ⇒_{[A/B]}. □

The following generalizes Theorem 5.2.9 to rewriting modulo B:

Theorem 7.3.9 Given a ground canonical MTRS (Σ, A, B), if t_1, t_2 are two normal forms of a ground term t under ⇒_{A/B} then t_1 ≃_B t_2. Moreover, the B-equivalence classes of ground normal forms under ⇒_{A/B} form an initial (Σ, A ∪ B)-algebra, denoted N_{Σ,A/B} or N_{A/B}, as follows, where [[t]] denotes any arbitrary normal form of t, and [[t]]_B denotes the B-equivalence class of [[t]]:
(0) interpret σ ∈ Σ_{[],s} as [[σ]]_B in N_{Σ,A/B,s}; and
(1) interpret σ ∈ Σ_{s1...sn,s} with n > 0 as the map sending ([[t_1]]_B, ..., [[t_n]]_B) with t_i ∈ T_{Σ,si} to [[σ(t_1, ..., t_n)]]_B in N_{Σ,A/B,s}.
Finally, N_{Σ,A/B} is Σ-isomorphic to T_{Σ,A∪B}.

Proof: For convenience, write N for N_{Σ,A/B}. The first assertion follows from the ARS result Theorem 5.7.2, using Theorem 7.3.4. Note that σ_N is well defined by (1), because of the first assertion, plus the fact that ≃_B is a Σ-congruence relation.

Next, we check[E26] that N satisfies A ∪ B. Satisfaction of B is by definition of N as consisting of B-equivalence classes of normal forms. Now let (∀X) t = t′ be in A; we need a(t) = a(t′) for all a : X → N. Let b : X → T_{Σ,B} denote the extension of a from target N to T_{Σ,B}. Then a(t) = [[b(t)]]_B for any t ∈ T_{Σ,B}(X), because [[b(_)]]_B is a Σ-homomorphism since [[_]]_B is, and there is a unique Σ-homomorphism T_{Σ,B}(X) → N that extends a. Now applying the given rule to t with the substitution b gives b(t) ⇒_{A/B} b(t′), so these two terms have the same canonical form, i.e., [[b(t)]]_B = [[b(t′)]]_B and thus a(t) = a(t′), as desired.

Next, let M be an arbitrary (Σ, A ∪ B)-algebra, and let h : T_{Σ,B} → M be the unique Σ-homomorphism. Noting that N ⊆ T_{Σ,B}, let g : N → M be the restriction of h to N. We now prove that g is a Σ-homomorphism by structural induction over Σ:
(0) Given σ ∈ Σ_{[],s}, we get g(σ_N) = h([[σ]]_B) by definition. Then Theorem 7.3.4 gives h([σ]) = h([[σ]]_B) because [σ] ⇒*_{[A/B]} [[σ]]_B, and then h([σ]) = σ_M because h is a Σ-homomorphism. Therefore g(σ_N) = σ_M, as desired.
(1) Given σ ∈ Σ_{s1...sn,s} with n > 0, we get g(σ_N([[t_1]]_B, ..., [[t_n]]_B)) = h([[σ(t_1, ..., t_n)]]_B) by definition. Then Theorem 7.3.4 gives h([σ(t_1, ..., t_n)]) = h([[σ(t_1, ..., t_n)]]_B), so that h([σ(t_1, ..., t_n)]) = σ_M(h([t_1]), ..., h([t_n])) = σ_M(g([[t_1]]_B), ..., g([[t_n]]_B)) because h is a Σ-homomorphism. Therefore g(σ_N([[t_1]]_B, ..., [[t_n]]_B)) = σ_M(g([[t_1]]_B), ..., g([[t_n]]_B)), as desired.

For uniqueness, suppose g′ : N → M is another Σ-homomorphism. Let r : T_{Σ,B} → N be the map sending [t] to [[t]]_B, and note that it is a Σ-homomorphism by the definition of [[_]]_B. Next, note that if i : N → T_{Σ,B} denotes the inclusion Σ-homomorphism, then i ; r = 1_N. Finally, note that r ; g = r ; g′ = h, by the uniqueness of h. It now follows that i ; r ; g = i ; r ; g′, which implies g = g′. The last assertion follows since both are initial (Σ, A ∪ B)-algebras. □

Theorem 7.3.10 Given a canonical MTRS (Σ, A, B), then [A] ⊢_B (∀X) t = t′ iff A ∪ B ⊢ (∀X) t = t′ iff [[t]] ≃^X_B [[t′]], where as before, [[t]] denotes an arbitrary normal form of t under ⇒_{A/B}.

Proof: The first "iff" is Theorem 7.2.3. The "if" direction of the second "iff" is straightforward. For its "only if," let h and h′ be the unique Σ(X)-homomorphisms from T_{Σ(X)} to T_{Σ(X),A∪B} and N_{Σ(X),A/B}, respectively. Then A ∪ B ⊢ (∀X) t = t′ implies h(t) = h(t′). But T_{Σ(X),A∪B} is Σ(X)-isomorphic to N_{Σ(X),A/B} by Theorem 7.3.9. Therefore h′(t) = h′(t′), i.e., t and t′ have the same normal form modulo B. □

An important consequence of the above theorem is that we can define a function == that works for canonical MTRS's the same way that the function == described in Section 5.2 works for ordinary canonical TRS's, namely, t == t′ returns true over an MTRS (Σ, A, B) iff t, t′ are provably equal under A ∪ B as an equational theory. This function is implemented in OBJ3 by computing the normal forms of t, t′ and checking whether they are equal modulo B. Note that even when (Σ, A, B) is not canonical, if t == t′ does return true then t and t′ are equal under (Σ, A, B), again just as for ordinary TRS's. The function =/= is also available for MTRS's in OBJ3, but just as for ordinary TRS's, it is dangerous if the system is not canonical for the sort involved. The use of == is illustrated by proofs in the following subsection.
This section gives more inductive proofs along the lines of those in Section 6.5.1, but using associative-commutative rewriting. Example 6.5.7 implies we can use AC rewriting for addition, and Exercises 7.3.1 and 7.3.2 below imply we can also use AC rewriting for multiplication.

Example 7.3.11 (Formula for 1 + 2 + ··· + n) We give an inductive proof of a formula for the sum of the first n positive numbers,

  1 + 2 + ··· + n = n(n + 1)/2,

using Exercises 7.3.1 and 7.3.2 by giving + and * the attributes assoc and comm. This saves us from having to worry about the ordering and grouping of subterms within expressions. The second module defines the function sum(n) = 1 + 2 + ··· + n. (What we actually prove is that sum(n) + sum(n) = n(n + 1).)

obj NAT is sort Nat .
  op 0 : -> Nat .
  op s_ : Nat -> Nat [prec 1] .
  op _+_ : Nat Nat -> Nat [assoc comm prec 3] .
  vars M N : Nat .
  eq M + 0 = M .
  eq M + s N = s(M + N) .
  op _*_ : Nat Nat -> Nat [assoc comm prec 2] .
  eq M * 0 = 0 .
  eq M * s N = M * N + M .
endo

obj SUM is protecting NAT .
  op sum : Nat -> Nat .
  var N : Nat .
  eq sum(0) = 0 .
  eq sum(s N) = s N + sum(N) .
endo

open .
  ops m n : -> Nat .
  ***> base case
  reduce sum(0) + sum(0) == 0 * s 0 .
  ***> induction step
  eq sum(n) + sum(n) = n * s n .
  reduce sum(s n) + sum(s n) == s n * s s n .
close

The line "protecting NAT" indicates that the natural numbers are imported in "protecting" mode, which means that they are supposed to have no junk and no confusion for the sort Nat in the models of SUM. We can also use this example to illustrate how unsuccessful proof scores can yield hints about lemmas to prove. If we try the same proof score as above, but without the assoc and comm attributes for multiplication, then the base case works, but the induction step fails, with the two sides being s(s(n + n + (n * n) + n)) and s(s(n + (s n * n) + n)). (The reduction actually evaluates to a rather noninformative false.
However, we can get the desired information either by reducing the two sides separately, or else by replacing _==_ by a Boolean-valued operation _eq_ satisfying the single equation N eq N = true.) The difference between these comes from the terms n + (n * n) and (s n * n), the equality of which differs from the second law for * by commutativity. This might suggest either proving the lemma

  s N * N = N + N * N,

or else proving the commutativity of *. The induction step goes through either way, and we also discover that associativity of multiplication is not needed here. The two lemmas in the proof of the commutativity of addition (Example 6.5.7) were arrived at in the same way. □

Exercise 7.3.1 Use OBJ3 to show the associativity of multiplication of natural numbers. □

Exercise 7.3.2 Use OBJ3 to show the commutativity of multiplication of natural numbers. □

Example 7.3.12 (Fermat's Little Theorem for p = 3) The "little Fermat theorem" says that

  x^p ≡ x (mod p) for any prime p,

i.e., that the remainder of x^p after division by p equals the remainder of x after division by p. The following OBJ3 proof score for the case p = 3 illustrates inductive proof where there are non-trivial relations among the constructors; for in this example, unlike the usual natural numbers, s s s 0 = 0.

obj NAT3 is sort Nat .
  op 0 : -> Nat .
  op s_ : Nat -> Nat [prec 1] .
  op _+_ : Nat Nat -> Nat [assoc comm prec 3] .
  vars L M N : Nat .
  eq M + 0 = M .
  eq M + s N = s(M + N) .
  op _*_ : Nat Nat -> Nat [assoc comm prec 2] .
  eq M * 0 = 0 .
  eq M * s N = M * N + M .
  eq L * (M + N) = L * M + L * N .
endo

open .
  var M : Nat .
  eq M + M + M = 0 .
  op x : -> Nat .
  ***> base case, x = 0
  red 0 * 0 * 0 == 0 .
  ***> induction step
  eq x * x * x = x .
  red s x * s x * s x == s x .
close

The first equation after the open quotients the natural numbers to the naturals modulo 3. □

Exercise 7.3.3 Let (A, ⊕) be an Abelian semigroup, i.e., suppose that ⊕ is a binary associative, commutative operation on A.
Now use OBJ for the following:

1. Define ⊕_{1≤i≤n} a(i), where a(i) ∈ A for 1 ≤ i ≤ n and n > 0.

2. Given a(i), b(i) ∈ A for 1 ≤ i ≤ n, prove that

  ⊕_{1≤i≤n} (a(i) ⊕ b(i)) = (⊕_{1≤i≤n} a(i)) ⊕ (⊕_{1≤i≤n} b(i)) .

Hint: use the following declarations in OBJ:

  op _+_ : A A -> A [assoc comm] .
  ops a b : Nat -> A .

3. Give an example of this formula where A is the integers and ⊕ is addition. Explain how to extend ⊕_{1≤i≤n} a(i) to the case where n = 0, and generalize this to the case of an arbitrary Abelian semigroup. □

[Footnote: I thank Dr. Immanuel Kounalis for doubting that OBJ3 could handle non-trivial relations on constructors, and then presenting the challenge to prove this result.]

A very nice application of term rewriting modulo equations is a decision procedure for the propositional calculus. One way to define the propositional calculus is with the following equational theory, which is written in OBJ3; we override the default inclusion of the Booleans in order to avoid ambiguous parsing for and, or, etc.; the imported module TRUTH provides OBJ's built-in sort Bool with just the two constants true and false, and basic built-in operations like ==.
set include BOOL off .
obj PROPC is protecting TRUTH .
  op _and_ : Bool Bool -> Bool [assoc comm prec 2] .
  op _xor_ : Bool Bool -> Bool [assoc comm prec 3] .
  vars P Q R : Bool .
  eq P and false = false .
  eq P and true = P .
  eq P and P = P .
  eq P xor false = P .
  eq P xor P = false .
  eq P and (Q xor R) = (P and Q) xor (P and R) .
  op _or_ : Bool Bool -> Bool [assoc comm prec 7] .
  op not_ : Bool -> Bool [prec 1] .
  op _implies_ : Bool Bool -> Bool [prec 9] .
  op _iff_ : Bool Bool -> Bool [assoc prec 11] .
  eq P or Q = (P and Q) xor P xor Q .
  eq not P = P xor true .
  eq P implies Q = (P and Q) xor P xor true .
  eq P iff Q = P xor Q xor true .
endo

The main part of this specification involves only the connectives and and xor; the second part defines the remaining propositional connectives in terms of these two, plus the constant true. Because it is already known that these equations (including those for AC) are one way to define the propositional calculus, we know that the above really is a theory of the propositional calculus. The following result (due to Hsiang [106]) explains why PROPC is important:

Theorem 7.3.13 As a term rewriting system, PROPC is canonical modulo B, where B consists of the associative and commutative laws for xor and and. □

This is proved in Exercise 12.1.3. Note also that this B is linear balanced. Moreover,

Fact 7.3.14 The initial algebra of PROPC has just two elements, namely true and false.

Proof: Given Theorem 7.3.13, it suffices to determine the reduced forms, by Theorem 7.3.9. The terms true and false are reduced because no rules apply to them, and a case analysis of the eight terms built from true and false using and and xor shows that they all reduce to either true or false. (We can ignore the other operations because they are defined in terms of and and xor.)
□

It follows from this that PROPC really does protect its imported module TRUTH, as its specification claims (the "protecting" notion was defined[E27] in Chapter 6).

Our commitment to semantics demands that before going further, we should make it clear what we mean by saying that an equation is "true" in the propositional calculus:

Definition 7.3.15 An equation e is a theorem of the propositional calculus iff T_PROPC ⊨ e, and a T_{Σ_PROPC}(X)-term t is a tautology iff (∀X) t = true is a theorem of the propositional calculus. Formulae t, t′ are equivalent iff (∀X) t = t′ is a theorem of the propositional calculus. □

We can prove T_PROPC ⊨ (∀X) t = t′ directly, by checking whether a(t) = a(t′) for all a : X → T_PROPC. Because T_PROPC has exactly two elements, true and false, by Fact 7.3.14, there are exactly 2^N cases to check when X has N variables, one case for each possible assignment a. This is essentially Ludwig Wittgenstein's well-known method of truth tables, also called (Boolean) case analysis; it is easy to apply this method by hand when N is small.

Example 7.3.16 To illustrate the method of truth tables, let's check whether or not the equation (∀P, Q) P and (P or Q) = P or Q is a theorem of the propositional calculus. Since there are two variables, there are four possible assignments; these appear in the left two columns, serving as labels for the four rows of the table below:

  P      Q      P or Q   P and (P or Q)   equal?
  true   true   true     true             yes
  true   false  true     true             yes
  false  true   true     false            no
  false  false  false    false            yes

Thus we see that the equation is false, and that P = false, Q = true is a counterexample. Of course, such calculations do not require an elaborate LaTeX tabular format. □

Theorem 7.3.13 plus some results from Chapter 5 that are generalized to rewriting modulo B later in this chapter will imply that PROPC(X), the enrichment of PROPC by X, is canonical for any X.
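The method of truth tables is easy to mechanize. The following Python sketch (our own illustration; the function names are not from OBJ) checks an equation of the propositional calculus by Boolean case analysis over all 2^N assignments, in the style of Example 7.3.16:

```python
from itertools import product

def equivalent(f, g, variables):
    """Check (forall variables) f = g by trying all 2**N Boolean assignments."""
    for values in product([True, False], repeat=len(variables)):
        a = dict(zip(variables, values))
        if f(a) != g(a):
            return False, a          # a is a counterexample
    return True, None

# P and (P or Q)  versus  P or Q, as in Example 7.3.16:
f = lambda a: a["P"] and (a["P"] or a["Q"])
g = lambda a: a["P"] or a["Q"]
ok, counterexample = equivalent(f, g, ["P", "Q"])
print(ok, counterexample)   # False {'P': False, 'Q': True}
```

The counterexample found is exactly the failing row of the truth table, P = false, Q = true.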
However, the canonical forms of this MTRS are not as well known as they should be. The following is needed to describe those forms:

Definition 7.3.17 A formula of the propositional calculus is an exclusive normal form (abbreviated ENF) iff it has the form

  E_1 xor E_2 xor ... xor E_n,

where each E_i (called an exjunct) has the form

  C_{i,1} and C_{i,2} and ... and C_{i,k_i},

where each C_{i,j} (called a conjunct) either has the form P or else the form not P, where P is a variable of sort Prop; by convention, we say that the empty ENF (n = 0) is false, and that the empty exjunct (k_i = 0) is true. Those P that occur in an exjunct with a preceding not are said to occur negatively, and those that occur without it positively. Given a set X of variable symbols, an exjunct E is complete (with respect to X) iff each variable in X appears in E, and an ENF is complete (with respect to X) iff each of its exjuncts is complete. □

The result mentioned above may now be stated as follows:

Proposition 7.3.18 Every formula of the propositional calculus is equivalent to a unique (modulo B) irredundant positive ENF, defined to be an ENF having only positive conjuncts, involving (at most) the same variables, with no repeated conjuncts and no repeated exjuncts; these irredundant positive ENFs are the canonical forms of PROPC. Moreover, every formula of the propositional calculus is equivalent to a unique (modulo B) complete ENF involving (exactly) the same variables that it has.

Proof: The first assertion follows from noticing that no rule of PROPC(X) applies to any irredundant positive ENF, so that these forms are reduced, and noticing that any formula using only xor and and that is not an irredundant positive ENF can be rewritten using one of the rules in the first part of PROPC, and so cannot be canonical.
Therefore the irredundant positive ENFs must be its canonical forms.

For the second assertion, the complete ENF of a formula can be obtained from its irredundant positive ENF as follows: for each exjunct, if some variable x does not appear in it, conjoin to it the term x xor not x (which equals true), and then simplify the resulting term using only the distributive and idempotent laws; the result will be a complete ENF that is equivalent to the original irredundant positive ENF, because only rules from PROPC were used. This equivalence and the first assertion imply that distinct complete ENFs are inequivalent, and that every term is equivalent to a unique (modulo B) complete ENF. □

Under the correspondence of the above proof, a term t has the irredundant positive ENF true iff its complete ENF contains all 2^N exjuncts. More generally, the exjuncts in the complete ENF of t correspond to those rows in its truth table where it is true. As an example, we find the complete ENF of the term x xor y, using a simplified notation with + for xor, juxtaposition for and, and overbar for negation: calculating with PROPC, modulo AC for both binary operations, we have x = x(y + ȳ) = xy + xȳ and y = y(x + x̄) = yx + yx̄, so that the complete ENF for x + y is xȳ + x̄y (the two copies of xy cancel under xor). Similarly, the complete ENF for xȳ + xz is xyz + xȳz̄.

Corollary 7.3.19 Two propositional calculus formulae over variables X are provably equal in PROPC(X) iff they yield the same Boolean value for every assignment of Boolean values for the variables in X.

Proof: Two terms are provably equal iff they have the same complete ENF, the exjuncts of which give exactly the Boolean assignments to X for which the terms are true.[E28] □

Proposition 7.3.20 Given a set X of N Boolean variables, PROPC has a free algebra on X generators, and it has 2^(2^N) elements.
Proof: Recall that the free PROPC-algebra on X generators is the initial algebra of PROPC(X) viewed as a PROPC-algebra. By the proof of Proposition 7.3.18, the normal forms of PROPC(X) are in bijective correspondence with the complete exclusive normal forms. Each complete exclusive normal form can be seen as a set of complete exjuncts, and then it is easy to see that there are 2^N different complete exjuncts, and therefore 2^(2^N) different sets of complete exjuncts. □

The elements of this free algebra can be seen as all of the possible Boolean functions on N variables, noting that N variables can take 2^N configurations, each of which can have one of 2 values, again giving 2^(2^N) functions.

Exercise 7.3.4 Let B be the set containing true and false, let Σ be the signature of PROPC(X), and let M be the Σ-algebra with carrier [[X → B] → B], with operations from PROPC defined "pointwise" on functions from Boolean operations on B (e.g., with xor_M(f, g)(a) = f(a) xor g(a) for a : X → B), and with x ∈ X interpreted as x_M(a) = a(x). Show that M is a free PROPC-algebra on X generators. □

The PROPC MTRS has the very special property that we can decide whether or not non-ground equations hold in the initial algebra just by comparing the canonical forms of their left- and right sides; this property is unfortunately as rare as it is useful. Note that canonicity alone only allows showing that an equation does hold in the initial algebra; reduction completeness decides whether or not the equation holds. The following provides a precise formulation of what it means to say that PROPC gives a decision procedure for the propositional calculus:

Definition 7.3.21 A TRS (Σ, A) is reduction complete iff it is canonical and for any Σ-equation e, say (∀X) t_1 = t_2, we have T_{Σ,A} ⊨ e iff [[t_1]] = [[t_2]]. An MTRS (Σ, A, B) is reduction complete iff it is canonical and for any Σ-equation e, say (∀X) t_1 = t_2, we have T_{Σ,A∪B} ⊨ e iff [[t_1]] ≃^X_B [[t_2]].
□

Exercise 7.3.5 Show that the TRS's from Examples 5.1.7 and 5.5.7 are not reduction complete. □

Theorem 7.3.22 The MTRS PROPC is reduction complete.

Proof: Let E = A ∪ B. By the completeness theorem, it will suffice to prove that, for any Σ-equation e,

  T_{Σ,E} ⊨ e iff M ⊨ e for every (Σ, E)-algebra M.

That the second condition implies the first is immediate. For the converse, we first treat the free (Σ, E)-algebras. We will use contradiction, and so we suppose that T_{Σ,E} satisfies e but that T_{Σ,E}(Z) does not satisfy e. Then there exists an a : X → T_{Σ,E}(Z) such that ā(t) ≠ ā(t′). By Exercises 7.3.4 and 6.1.3, we can take T_{Σ,E}(Z) to be [[Z → B] → B] with pointwise operations, where B = {true, false}; similarly, we can take T_{Σ,E} to be B. Let u = ā(t) and let u′ = ā(t′). Since u ≠ u′, there exists some b : Z → B such that u(b) ≠ u′(b). Now defining c = a ; b̂ : X → B, we get c̄ = ā ; b̂ : T_{Σ,E}(X) → B, by 4. of Exercise 6.1.4. Next, if we define b̂ : [[Z → B] → B] → B by b̂(u) = u(b), then the reader can check that b̂ is a Σ-homomorphism such that b̂(z) = b(z), where the first z is the function in [[Z → B] → B] defined in Exercise 7.3.4. Then b̂ = b̄ since there is just one such Σ-homomorphism extending b. Therefore c̄(t) = b̂(ā(t)) = b̂(u) = u(b), and similarly c̄(t′) = u′(b). Therefore c̄(t) ≠ c̄(t′), contradicting our assumption that B satisfies e. We next show the desired implication for any (Σ, E)-algebra M. By Proposition 6.1.18, there is some Z such that q : T_{Σ,E}(Z) → M is surjective. But then T_{Σ,E}(Z) ⊨ e implies M ⊨ e, and so we are done.
□

It is easy to apply this result in OBJ, because its built-in operation == returns true iff its two arguments have normal forms that are equivalent modulo the attributes declared for the operations involved; an alternative, which is justified in Exercise 7.3.8, is just to reduce the expression t iff t′.

Exercise 7.3.6 Use OBJ3 to determine whether or not the following are tautologies of the propositional calculus:
1. P implies (P implies P) .
2. P implies (P implies not P) .
3. not P implies (P implies not P) .
4. (P implies Q) implies Q .
5. P iff P iff P .
6. P iff P iff P iff P .
7. (P implies Q) implies (Q implies Q) .
Now use truth tables to check at least three of the above. □

Exercise 7.3.7 Use OBJ3 to determine whether or not the following are theorems of the propositional calculus:
1. (∀P) P = not not P .
2. (∀P, Q) P or Q = not P xor not Q .
3. (∀P) P = P iff P .
4. (∀P, Q, R) P implies (Q and R) = (P implies Q) and (P implies R) .
5. (∀P, Q) not (P and Q) = not P or not Q .
6. (∀P, Q, R) P implies (Q or R) = (P implies Q) or (P implies R) .
Also use truth tables to check at least three of them. □

Exercise 7.3.8 Show that (∀X) t = t′ is a theorem of the propositional calculus iff t iff t′ is a tautology, iff not (t xor t′) is a tautology. □

Exercise 7.3.9 Show that if Σ_PROPC-formulae t, t′ are equivalent, then t is a tautology iff t′ is. □

Exercise 7.3.10 Show that if X ⊆ Y then (∀X) t = t′ is a theorem of the propositional calculus iff (∀Y) t = t′ is. □

Definition 7.3.23 A formula of the propositional calculus is a disjunctive normal form (abbreviated DNF) iff it has the form

  D_1 or D_2 or ... or D_n,

where each D_i (called a disjunct) has the form

  C_{i,1} and C_{i,2} and ...
and C_{i,k_i}, where each C_{i,j} (called a conjunct) either has the form P or else the form not P, where P is a variable of sort Prop; by convention, we say that the empty DNF (n = 0) is false and the empty disjunct (k_i = 0) is true. Those P that occur in a disjunct with a preceding not occur negatively, and those that occur without it occur positively. Given a set X of variable symbols, a disjunct C is complete (with respect to X) iff each variable in X appears in C, and a DNF is complete (with respect to X) iff each of its disjuncts is complete. □

It follows from the above conventions that both true and false are DNFs. It also follows that if X has N elements, then each complete disjunct has N conjuncts. The following is well known:

Proposition 7.3.24 Every formula of the propositional calculus is equivalent to a DNF having (at most) the same variables, and to a unique (modulo B) complete DNF having (exactly) the same variables. □

A nice proof of the above uses the MTRS in Exercise 7.3.11 below to rewrite formulae to disjunctive normal form; this MTRS is shown terminating in Example 7.5.10, and Church-Rosser in Exercise 12.1.1; hence this MTRS is canonical. Exercise 12.1.2 shows that its reduced forms are DNFs.

Exercise 7.3.11 Choose five non-trivial formulae of the propositional calculus and use the TRS below to find their DNFs; explain why the reduced forms of this TRS are necessarily correct if they are DNFs, without using the as yet unproved result that the TRS is canonical.
obj DNF is protecting TRUTH .
  op _and_ : Bool Bool -> Bool [assoc comm prec 2] .
  op _or_ : Bool Bool -> Bool [assoc comm prec 3] .
  op not_ : Bool -> Bool [prec 1] .
  vars P Q R : Bool .
  eq P and false = false .
  eq P and true = P .
  eq P and P = P .
  eq P or false = P .
  eq P or true = true .
  eq P or P = P .
  eq not false = true .
  eq not true = false .
  eq P or not P = true .
  eq not not P = P .
  eq not(P and Q) = not P or not Q .
  eq not(P or Q) = not P and not Q .
  eq P and (Q or R) = (P and Q) or (P and R) .
  op _xor_ : Bool Bool -> Bool [assoc comm prec 7] .
  op _implies_ : Bool Bool -> Bool [prec 9] .
  op _iff_ : Bool Bool -> Bool [assoc prec 11] .
  eq P xor Q = (P and not Q) or (not P and Q) .
  eq P implies Q = not P or Q .
  eq P iff Q = (P and Q) or (not P and not Q) .
  eq P and not P = false .
endo

Please note that once again, BOOL should not be included. Although this TRS is similar to PROPC, it has a quite different purpose; in particular, the formula (p and q) or (p and not q) is reduced under DNF but not under PROPC, where it has the canonical form p. Similarly, not p is reduced under DNF but not under PROPC, where it has canonical form p xor true. □

This section brings us closer to how OBJ3 actually implements term rewriting modulo equations, with the following weaker relation, which overcomes the inefficiency of Definition 7.3.1 because it only requires matching on subterms of the source term:

Definition 7.3.25 Given an MTRS (Σ, A, B), for t, t′ ∈ T_Σ(X), we say t weakly rewrites to t′ under (or with) A modulo B in one step iff there exist a rule t_1 → t_2 of sort s in A with variables Y, a term t_0 ∈ T_Σ({z}_s ∪ X), and a substitution θ : Y → T_Σ(X) such that t = t_0(z ← t*) and t* ≃_B θ(t_1) and t′ = t_0(z ← θ(t_2)). In this case we write t ⇒_{A,B} t′. The relation weakly rewrites under A modulo B is the reflexive, transitive closure of ⇒_{A,B}, denoted ⇒*_{A,B}.
□

As usual, ⇒_{A,B} gives an abstract rewriting system, so we automatically get the appropriate notions of termination, Church-Rosser, canonical, and local Church-Rosser for weak term rewriting modulo B, in both the general and the ground cases; and of course we also get the usual collection of results by specializing the general results about abstract rewrite systems.

It is clear that weak rewriting modulo B implies rewriting modulo B, i.e., ⇒_{A,B} ⊆ ⇒_{A/B}. But the following example shows that weak rewriting modulo B is strictly weaker than rewriting modulo B, so that its reflexive, transitive, symmetric closure cannot be complete.

Example 7.3.26 Let Σ have one sort with a binary operation + and constants 0, a, b, let A contain the left zero law, 0 + X = X, and let B contain the associative law. Then (a + 0) + b ⇒_{A/B} a + b because (a + 0) + b ≃_B a + (0 + b) and a + (0 + b) →_A a + b, but (a + 0) + b is a normal form for ⇒_{A,B}. Therefore ⇒_{A,B} really is weaker than ⇒_{A/B}. □

Despite this incompleteness, many B have "completion procedures," which given a set A of rewrite rules, produce another set A′ such that rewriting with ⇒*_{A/B} and with ⇒*_{A′,B} always yields B-equivalent terms. In fact this is how OBJ3 actually implements rewriting modulo some equations [90, 113]. This allows users to think of computation as being done with ⇒*_{A/B} even though it is really done with ⇒*_{A′,B}; in OBJ3, the new rules generated by completion can be seen with the "show all rules ." command. Completion will be discussed in Chapter 12. Hereafter, we study ⇒_{A/B} since it describes what OBJ3 does, though not how.
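Example 7.3.26 can be replayed in code. The sketch below is our own illustration (assuming a single associative operation + with terms as nested pairs): it applies the left zero rule 0 + X = X first only at subterms of the given term, as in weak rewriting, and then at subterms of every term in the associativity class, as in class rewriting, showing that (a + 0) + b is a weak normal form yet class-rewrites to a + b.

```python
# Terms over one binary operation + : nested pairs ("+", l, r) or constant strings.
RULE = lambda t: t[2] if isinstance(t, tuple) and t[1] == "0" else None  # 0 + X = X

def rewrites(t):
    """All one-step weak rewrites of t (rule applied at t or at a subterm)."""
    out = []
    if isinstance(t, tuple):
        r = RULE(t)
        if r is not None:
            out.append(r)
        _, left, right = t
        out += [("+", l2, right) for l2 in rewrites(left)]
        out += [("+", left, r2) for r2 in rewrites(right)]
    return out

def leaves(t):
    return [t] if isinstance(t, str) else leaves(t[1]) + leaves(t[2])

def all_trees(xs):
    """All re-bracketings of the leaf sequence xs: the associativity class."""
    if len(xs) == 1:
        return [xs[0]]
    return [("+", l, r) for i in range(1, len(xs))
            for l in all_trees(xs[:i]) for r in all_trees(xs[i:])]

t = ("+", ("+", "a", "0"), "b")          # (a + 0) + b
print(rewrites(t))                        # [] : a weak normal form
class_results = {s2 for s in all_trees(leaves(t)) for s2 in rewrites(s)}
print(class_results)                      # {('+', 'a', 'b')} : class rewriting succeeds
```

Enumerating the whole B-equivalence class, as done here, is exactly the inefficiency that weak rewriting (and completion of A modulo B) is designed to avoid.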
Verification of Hardware Circuits

Hardware verification is a natural application for equational logic, because both circuits and their behaviors are described by sets of equations in a very natural way; moreover, equational logic is simple and well understood, with efficient algorithms for many relevant decision problems. We first treat so-called combinatorial (or combinational) circuits, which have no loops or memory, illustrated by combinations of "logic gates" and also by simple CMOS transistor circuits. Bidirectional logic circuits are then treated in Section 7.4.5; these may have loops and memory. Chapter 9 extends equational logic with second-order universal quantification, enabling us to verify so-called sequential circuits, which have time-dependent behavior. Much more could be said about solving the equations that arise from hardware circuits, but our intention here is not to develop a complete theory, but rather to provide a collection of enticing applications for term rewriting modulo equations.

[Figure 7.1: Power and Ground]

Our circuit models involve two voltage levels, called "power" and "ground," diagrammed as shown in Figure 7.1, and identified with true and false, respectively. Wires in circuits are assumed to have either the value power or else the value ground, and are modeled by Boolean variables. Wires that are directly connected must share the same voltage level, and are therefore represented by the same variable. These assumptions allow us to use the term rewriting decision procedure for the propositional calculus in our proofs.

The simplest kind of combinatorial circuit features a direct "flow" from some given input wires, through some logic gates, to some output wires.
The gates are modeled by the corresponding Boolean functions, and in this situation, all the computation can be done by the propositional calculus decision procedure, as illustrated in the following:

Example 7.4.1 Figure 7.2 is a circuit diagram for a 1-bit full adder, built from the usual logic gates, including xor (exclusive or).

[Footnote: Although this approach ignores issues such as load (i.e., current flow), resistance, timing, and capacitance, it does fully capture the logical aspects of circuits. Moreover, it seems likely that many other issues can be handled by using larger sets of values on wires, for example, in the style of Winskel [183], and that these larger value sets can also be implemented with term rewriting.]

[Figure 7.2: 1-bit Full Adder, with inputs i1, i2, cin, internal wires p1, ..., p5, and outputs cout, sout]

The equations in the OBJ module FADD below say the same thing as this diagram, using constants for the values of wires: i1, i2, cin are inputs, and cout, sout are the outputs. To verify that this circuit has the logical behavior of a full adder, we must prove the following first-order formula (Section 7.4.3 explains why we prove this particular formula):

  (∀Z) (T ⇒ e_1 ∧ e_2),

where T specifies the circuit, Z consists of its variables, and e_1, e_2 are the two equations

  cout = (i1 and i2) or (i1 and cin) or (i2 and cin)
  sout = (i1 and i2 and cin) or (i1 and not i2 and not cin) or
         (not i1 and i2 and not cin) or (not i1 and not i2 and cin)

with the equations in T as in the module FADD below.
The following OBJ proof score for this verification first introduces constants for the variables representing the wires, then gives the equations that describe the circuit, and finally checks whether the output variables satisfy their specifications by using reduction over the PROPC Boolean decision procedure. Proposition 7.4.3 below and familiar results on first-order logic (fully explained in Chapter 8) justify that this score actually proves the desired formula.

    th FADD is extending PROPC .
      ops i1 i2 cin p1 p2 p3 p4 p5 cout sout : -> Bool .
      eq p1 = i1 and i2 .
      eq p2 = i1 and cin .
      eq p3 = p1 or p2 .
      eq p4 = cin and i2 .
      eq p5 = cin xor i2 .
      eq cout = p3 or p4 .
      eq sout = i1 xor p5 .
    endth
    reduce cout iff (i1 and i2) or (i1 and cin) or (i2 and cin) .
    reduce sout iff (i1 and i2 and cin) or
                    (i1 and not i2 and not cin) or
                    (not i1 and i2 and not cin) or
                    (not i1 and not i2 and cin) .

No manual application of rules is needed, since OBJ does all the work. By contrast, [25] gives a six-step, one-and-a-half-page outline of a proof for just the sout formula. □

The equations for this circuit have a very special form, which is described in the following, as a first step towards an algebraic theory of hardware circuits:

Definition 7.4.2  A set T of Σ(Z)-equations is an unconditional triangular propositional system iff Σ is the signature of PROPC (the propositional calculus specification given in Section 7.3.2), Z is a finite set of constants of sort Bool called variables, there is a subset X of Z, say x_1, …, x_n, called the input variables, and there is an ordering of the rest of Z, say y_1, …, y_m, called the dependent variables, such that T consists of equations having the form y_k = t_k for k = 1, …, m, where each t_k is a Σ(Z)-term involving only input variables and those non-input variables y_j with j < k; there must be exactly one equation for each k.
In addition, some of the non-input variables may be designated as output variables, with the rest being called internal (or "test point") variables. □

Hereafter we may omit "unconditional," and we may also use the phrases "combinatorial system" and "triangular propositional system" interchangeably, often omitting the word "propositional." The following display demonstrates why we chose the term "triangular":

    y_1 = t_1(x_1, …, x_n)
    ⋮
    y_k = t_k(x_1, …, x_n, y_1, …, y_{k−1})
    ⋮
    y_m = t_m(x_1, …, x_n, y_1, …, y_{m−1})

Note that each equation has sort Bool since that is the only sort in Σ(Z), and that t_1 can only contain input variables. Also, the equations in a triangular system are usually considered (implicitly) quantified by (∀∅), which means that, although we call them variables, all the x_i and y_i are technically constants; however, they will sometimes be universally quantified in formulae that describe the intended behavior of hardware circuits. In particular, we are interested in solving the equations in the triangular system over the initial model of PROPC, so that the variables in the triangular system are constrained to be either true or false, i.e., power or ground; we will see that these solutions correspond to certain Σ(X)-models of the system. Here PROPC has initial semantics, while triangular systems over it have loose semantics.

[Footnote: In practice we may let Σ and PROPC contain some propositional functions not in the original version of PROPC that could have been defined in it, such as p nor q = not (p or q).]

Triangular systems are not in general term rewriting systems over Σ, because the rightsides in general contain variables that do not occur in the leftsides. However, they are MTRS's over Σ(Z), since for this signature the "variables" are really constants.
Example 5.8.27 showed that unconditional triangular systems (in the sense of that example) are canonical as TRS's, and that the only variables in their normal forms are input variables. We want similar results for triangular systems over PROPC. We will generalize techniques from Chapter 5 to rewriting modulo B to show that enriching PROPC with a triangular system T again yields a canonical system, modulo the same B used for PROPC (Theorem 7.7.24). We use this in the following:

Proposition 7.4.3  Given an unconditional triangular system T with variables Z, let B be the associative and commutative laws for and and xor, P the equations of PROPC except B, and A = T ∪ P. Then the following are equivalent, for t, t′ any Σ(Z)-terms:

1. PROPC ⊨_Σ (∀Z) (T ⇒ t = t′);
2. (A ∪ B) ⊨_{Σ(Z)} (∀∅) t = t′;
3. (t iff t′) ⇒*_{A/B} true;
4. [[t]]_A ≃_B [[t′]]_A;
5. t ↓_{A/B} t′;
6. (t == t′) ⇒*_{A/B} true.

Proof: We omit subscripts B from ≃_B, =_B, ⇒_{A/B}, ⇒_{T/B} and ⇒_{P/B}. Conditions 1. and 2. are equivalent by rules of first-order logic. Conditions 2. and 4. are equivalent by Theorem 7.3.10. Conditions 3. and 4. are equivalent because ([[t]]_T iff [[t′]]_T) ⇒*_P true iff [[[[t]]_T]]_P ≃ [[[[t′]]_T]]_P by Exercise 7.3.8, and [[[[t]]_T]]_P ≃ [[t]]_A because both T modulo B and A modulo B are canonical, by Theorem 7.7.24. Finally, 4. and 5. are equivalent by Corollary 5.7.7, and 4. and 6. are equivalent by the definition of ==.

[Footnote: Chapter 8 gives full details, including the first-order version of the Theorem of Constants, which gives us that P ⊨_{Σ(Z)} (∀∅) (T ⇒ t = t′), and an implication elimination rule which moves T over to conjoin with P.]
□

This result justifies proving universally quantified implications for triangular systems as in Example 7.4.1, by reducing t iff t′ to true, to conclude that (∀Z) (T ⇒ t = t′). We call this the "method of reduction." Note that the variables in T, t, t′ are constants for reduction, while those in PROPC are not.

Fact 7.4.4  The canonical forms of an unconditional triangular propositional system contain only input variables.

Proof: We prove the contrapositive: if a term t contains an occurrence of a non-input variable y_k, then t is not reduced, because the rewrite rule y_k = t_k can be applied to it. □

Proof scores like that in Example 7.4.1 can be generated completely automatically from the circuit diagram and the sentences to be proved, because there is an exact correspondence between circuit diagrams like that in Figure 7.2 and triangular propositional systems. Although it would be too tedious to spell out this correspondence in detail here, we note that if a circuit cannot be put in triangular form, then either it is not combinatorial because it has some loops, or else it has internal variables that should instead have been declared as input variables, or vice versa.

Exercise 7.4.1  Design a circuit with inputs a_1, a_2, a_3, and with one output z which is true iff exactly two of the inputs are true; you may use any (2-input) logic gates you like. Prove that your design is correct using the method of reduction and OBJ3. □

Exercise 7.4.2  Design a circuit using only not, nand and or gates, having inputs a_1, a_2, a_3, a_4, and an output z which is true iff exactly two of the inputs are true. Use as few gates as you can (there is a solution with just 19). Prove the correctness of your design using OBJ3. □

Definition 7.4.5  A Boolean (or ground) solution to a triangular system is an assignment of Boolean values to all its variables such that all its equations are satisfied.
A system is consistent iff each assignment of Boolean values to input variables extends to a Boolean solution. A system is underdetermined iff it is consistent and some assignment of Boolean values to input variables extends to more than one Boolean solution. Two systems are Boolean equivalent iff they have exactly the same Boolean solutions. A model of a triangular system is a Σ(Z)-model that satisfies PROPC and the equations of the system (considered quantified by (∀∅)), and a Boolean (or protected) model is a model having the set {true, false} as its carrier. □

We are interested in Boolean solutions because these correspond to possible behaviors of the circuit. They are bijective with Boolean models, by letting a : Z → {true, false} correspond to the model that interprets each z ∈ Z as a(z); we might say that Boolean models satisfy the so-called Law of the Excluded Middle, in that the only values allowed are true and false, with everything else excluded. It follows that two systems are Boolean equivalent iff they have exactly the same Boolean models.

Underdetermination is similar to the situation for a system of linear equations where there are more variables than (independent) equations. The next subsection will show that transistors are consistent and underdetermined; this is possible because these are conditional rather than unconditional systems. A circuit consisting of an inverter (i.e., negation) with its output connected to its input is inconsistent, because its equation

    p = not p

has no solutions; but no system containing such an equation can be triangular. It is also possible for an unwise choice of input variables to produce inconsistency. For example, a system that contains the equation

    i1 = not i2

is unsolvable if both i1 and i2 are input variables. To avoid this, one of these two variables could instead be declared internal.
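These notions are easy to explore by brute-force enumeration. The following Python sketch (the representation and names are mine) enumerates the Boolean solutions of small systems, confirming that the inverter loop p = not p has no solutions, while a one-equation triangular system extends each input assignment to exactly one solution:

```python
from itertools import product

def solutions(equations, variables):
    """Enumerate all Boolean solutions of a propositional system:
    assignments satisfying every equation (lhs, rhs), where lhs and
    rhs are functions from an assignment dict to a Boolean."""
    sols = []
    for values in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, values))
        if all(lhs(a) == rhs(a) for lhs, rhs in equations):
            sols.append(a)
    return sols

# The (non-triangular) system { p = not p } is inconsistent: no solutions.
inverter_loop = [(lambda a: a["p"], lambda a: not a["p"])]
assert solutions(inverter_loop, ["p"]) == []

# The triangular system { y = not x } is consistent and not
# underdetermined: each value of the input x extends to exactly one
# Boolean solution.
sys_ = [(lambda a: a["y"], lambda a: not a["x"])]
assert len(solutions(sys_, ["x", "y"])) == 2
```

Enumerating all assignments takes 2^|Z| steps, which is why the symbolic methods developed below matter for larger circuits.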
Proposition 7.4.6  Every unconditional triangular propositional system is consistent, and no unconditional triangular propositional system is underdetermined.

Proof: The proof is by induction. Let a be a Boolean assignment to the input variables. Then by the first equation, t_1(a(x_1), …, a(x_n)) gives a value for y_1; let's denote it a(y_1). Similarly, t_2(a(x_1), …, a(x_n), a(y_1)) gives a value for y_2, denoted a(y_2). And so on, until we get a value

    a(y_m) = t_m(a(x_1), …, a(x_n), a(y_1), …, a(y_{m−1}))

for y_m. This assignment a on Z is a solution by construction, and since its values are computed directly from the equations, it is the only possible solution extending the original assignment. □

[Footnote: This is more than an analogy, because systems of linear equations over the field Z_2 (the two-element field of the integers modulo 2), with 0 representing false and 1 representing true, describe certain kinds of circuit. Our systems are more general since their equations may be non-linear (they are multi-linear) and/or conditional.]

[Figure 7.3: The Two Kinds of mos Transistor: n on Left, p on Right]

This section considers conditional triangular propositional systems and their solutions. A key difference between conditional and unconditional systems is that a conditional system may fail to determine the values of some of its variables under some conditions. The most basic (and most important) example of such a system is a transistor.

Example 7.4.7  The two main types of transistor are the n-transistor and the p-transistor, diagramed as shown in Figure 7.3, and having logical behavior given by conditional equations of the respective forms

    a = b if g = true
    a = b if g = false

where the variables a, b, g are Boolean, with a, g as inputs and b as output.
The system consisting of a single n-transistor is consistent and underdetermined, because if g = false, then according to its equation (the first above), b can have any value, no matter what value a has; the same holds for p-transistors with g = true. (These single-equation systems are clearly triangular.) □

More complex systems can be built by putting several transistors together, as illustrated in examples in Section 7.4.4 and thereafter. We now generalize Definition 7.4.2 to conditional equations, and then develop some theory for such systems.

Definition 7.4.8  A system T of (possibly conditional) Σ(Z)-equations is a (conditional) triangular (propositional) system iff Σ is the signature of PROPC (the propositional calculus specification of Section 7.3.2), Z is a finite set of constants of sort Bool called variables, there is a subset X of Z, say x_1, …, x_n, called the input variables, and there is an ordering of the rest of Z, say y_1, …, y_m, called the dependent variables, such that T consists of equations of the form y_k = t if C, where each t is a Σ(Z)-term involving only input variables and variables y_j with j < k, where each C is a finite set of pairs of such terms, and where there may be any number (including zero) of equations for each k. In addition, some of the dependent variables may be designated as output
A triangular system has disjoint conditions iff whenever C, C (cid:48) arethe conditions of distinct equations with the same dependent variableas leftside, then C ∧ C (cid:48) is provably false for each assignment of Booleanvalues to input variables. A triangular system is total iff every depen-dent variable has equations, and for each k and each choice of Boolean values for its input variables, the disjunction of the conditions in itsequations with leftside y k , say C k = C k, ∨ C k, ∨ · · · ∨ C k,(cid:96) , is prov-ably true, where each C k,i is considered the conjunction of its pairs asequations; otherwise the system is called partial . (cid:2) When a conditional triangular system has all C k = ∅ , it is equivalentto an unconditional system. We may assume that triangular systemshave Boolean conditions, since this is convenient and entails no loss ofgenerality. As with unconditional triangular systems, the equations areusually considered to be quantified by ( ∀∅ ) , with all variables consid-ered as constants, although they may sometimes appear universallyquantified, e.g., in formulae that describe intended circuit behavior.As in the unconditional case, conditional triangular systems are notin general rewriting systems over Σ , but are over Σ (Z) . However, unlikethe unconditional case, conditional triangular systems are not alwaysChurch-Rosser, which motivates the next result. Since the concepts inDefinition 7.4.5, including solution of a system, consistent system, andunderdetermined system, carry over completely unchanged to condi-tional triangular systems, we do not repeat this material here. Proposition 7.4.9 A conditional triangular system is consistent if its conditionsare consistent. Moreover, a conditional triangular system with disjoint conditions has consistent conditions, and also is Church-Rosser. Proof: Let a be an assignment to the input variables. 
Then if the condition of the first equation is true for that assignment, its rightside gives a value t_1(a(x_1), …, a(x_n)) for y_1; let's denote it a(y_1). Otherwise, if there is a subsequent equation with leftside y_1 the condition of which is true, let the value of its rightside be a(y_1); if there is no such equation, pick an arbitrary value for a(y_1). Similarly, we get a value a(y_2) for y_2, and so on, until we get a value a(y_m) for y_m. The resulting assignment a is by construction a solution.

The second and third assertions follow since for each assignment, each dependent variable can be rewritten in at most one way. □

The next result generalizes Proposition 7.4.3 to conditional triangular systems, and applying its equivalence of 1. and 2. to conditional rewrite rules reassures us that our join semantics for conditional term rewriting modulo equations is adequate for our hardware applications (see Section 7.7 for the technical details of this semantics).

Proposition 7.4.10  Using the notation of Proposition 7.4.3, the following are equivalent for any conditional triangular system T with variables Z that is Church-Rosser as a rewrite system, for t, t′ any Σ(Z)-terms:

1. t ↓_{A/B} t′;
2. [[t]]_A ≃_B [[t′]]_A;
3. (t == t′) ⇒*_{A/B} true;
4. (A ∪ B) ⊢_{Σ(Z)} (∀∅) t = t′;
5. PROPC ⊨_Σ (∀Z) (T ⇒ t = t′);
6. (t iff t′) ⇒*_{A/B} true.

Proof: We omit subscripts B from ≃_B, =_B, ⇒_{A/B}, ⇒_{T/B} and ⇒_{P/B}. A is terminating by Proposition 7.7.19 and Church-Rosser by hypothesis, so it is canonical. Therefore 1. and 2. are equivalent by Corollary 5.7.7, and 2. and 3. are equivalent by definition of ==. Also, 1. implies 4. and 4. implies 2., while 4. and 5. are equivalent by the Completeness Theorem and first-order logic (see the footnote to the proof of Proposition 7.4.3). Finally, 2. and 6.
are equivalent, since ([[t]]_T iff [[t′]]_T) ⇒*_P true iff [[[[t]]_T]]_P ≃ [[[[t′]]_T]]_P by Exercise 7.3.8, and [[[[t]]_T]]_P ≃ [[t]]_A because T modulo B and A modulo B are canonical. □

This result justifies proving sentences of the form (∀Z) (T ⇒ t = t′) by reducing t iff t′ to true when T has conditional equations. However, for hardware verification problems, this reduction will not in general work without using case analysis on the input variables, for reasons that are discussed in Section 7.4.3.

For underdetermined systems, it is often convenient to use parameters in solutions.

[Footnote: The requirement that a general solution include all dependent variables, not just the output variables, is reasonable, because a circuit designer should know what all his wires are supposed to do; indeed, the redundancy involved in checking the mutual consistency of solutions for all internal variables is desirable in itself.]

Definition 7.4.11  A general (Boolean) solution of a conditional triangular system T with dependent variables y_1, …, y_m is a family f_k(x_1, …, x_n, w_1, …, w_ℓ) of terms for k = 1, …, m such that for every assignment of Boolean values a_1, …, a_n to the input variables x_1, …, x_n and of Boolean values b_1, …, b_ℓ to the parameter variables w_1, …, w_ℓ, the values of f_k(a_1, …, a_n, b_1, …, b_ℓ) are a Boolean solution extending the original input variable assignment. A most general (Boolean) solution of T is a general Boolean solution such that the Boolean ground solutions of T are exactly its Boolean substitution instances with their corresponding input assignments. A set F of equations has the form of an unparameterized general solution iff its equations are y_k = f_k(x_1, …, x_n) for k = 1, …, m, and has the form of a parameterized general solution iff its equations are y_k = f_k(x_1, …
, x_n, w_1, …, w_ℓ) for k = 1, …, m. □

To check if a family of terms for dependent variables is a general solution of a system T, it is by definition sufficient to substitute the terms for the corresponding variables in each equation of T, and check if the two sides are equal for all Boolean values of the input and parameter variables. Notice that equations having the form of general solutions are in particular unconditional triangular systems.

Example 7.4.12  Consider the following conditional triangular system T,

    y1 = x1 or x2
    y2 = not y1 if not x1

which is underdetermined, since y2 can have any value when x1 is true. The following is a proposed general solution F for T,

    y1 = x1 or x2
    y2 = (not x1 and not x2) or (w and x1)

where w is a parameter variable. We can check that F is a most general solution of T by enumerating the solutions of T and then checking that they all are substitution instances of F. Using the format (x1, x2, y1, y2), and representing true by 1 and false by 0, the solutions of T are (0, 0, 0, 1), (0, 1, 1, 0), (1, 0, 1, ⋆), and (1, 1, 1, ⋆), where ⋆ can be either 0 or 1. The reader may now verify that exactly the same set of six Boolean 4-tuples arises from F. □

Proposition 7.4.13  Every unconditional triangular system has a most general solution, obtained by progressively substituting its equations into later equations; this solution has no parameters, and is unique in the sense that the corresponding terms for each y_i are equal under PROPC. Such systems have exactly 2^n Boolean solutions, where n is the number of input variables.

Proof: The construction is like that of Proposition 7.4.9, except that we do the substitutions with the terms in the triangular system, instead of with Boolean values. First, let f_1 be t_1. Next, substitute f_1 for each instance of y_1 in t_2 and call the result f_2, noting that both f_1 and f_2 contain no non-input variables.
Then, in t_3, substitute f_1 for each instance of y_1 and f_2 for each instance of y_2, and call the result f_3, noting that it too contains no non-input variables. Continuing in this way, after the appropriate substitutions t_k becomes f_k, which by induction also contains no non-input variables. Since this construction involves only equational reasoning, the result is sound, and because each f_k involves only input variables, the result is indeed a solution with no parameters.

Moreover, this process is reversible, i.e., we can also derive the original triangular system from this solution. Therefore the two sets of equations are equivalent as theories. Because equivalent theories have exactly the same models, they also have exactly the same Boolean models, and hence exactly the same Boolean solutions. For uniqueness, Corollary 7.3.19 implies that two Σ(X)-terms are equivalent if they are equal for all Boolean values of the input variables.

Finally, the form of a general solution ensures that it produces exactly one Boolean solution for each Boolean assignment to the input variables, and since there are 2^n such distinct assignments, that is also the number of Boolean solutions. □

Throughout this subsection, Σ denotes the signature of PROPC, and all terms are over Σ(Z) for some variable symbols Z containing input variables X = {x_1, …, x_n} and parameter variables W = {w_1, …, w_ℓ} (if any). If T is a finite set of equations (perhaps conditional), we also write T for the conjunction of the equations in T, without quantification and with conditional equations represented as implications. Note that an assignment a : Z → M to a Σ(Z)-model M is a solution of T iff a(T) is true in M, where a(T) denotes the truth value of T in M under a.
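The enumeration check used in Example 7.4.12 can be mechanized directly; here is a Python sketch (the encoding is mine) that compares the Boolean solutions of that T with the substitution instances of its proposed solution F, as sets of (x1, x2, y1, y2) tuples:

```python
from itertools import product

B = [False, True]

def T_holds(x1, x2, y1, y2):
    """The conditional triangular system T of Example 7.4.12:
    y1 = x1 or x2, and y2 = not y1 whenever not x1."""
    if y1 != (x1 or x2):
        return False
    if (not x1) and y2 != (not y1):
        return False
    return True

# Ground solutions of T, as (x1, x2, y1, y2) tuples.
T_solutions = {(x1, x2, y1, y2)
               for x1, x2, y1, y2 in product(B, repeat=4)
               if T_holds(x1, x2, y1, y2)}

# Instances of the proposed general solution F, with parameter w:
#   y1 = x1 or x2,  y2 = (not x1 and not x2) or (w and x1)
F_instances = {(x1, x2,
                x1 or x2,
                (not x1 and not x2) or (w and x1))
               for x1, x2, w in product(B, repeat=3)}

assert T_solutions == F_instances   # F is a most general solution of T
assert len(T_solutions) == 6        # the six 4-tuples listed in the text
```

Set equality here expresses exactly the "most general" condition: every ground solution of T is an instance of F, and every instance of F solves T.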
Proposition 7.4.14  If T is a (conditional) triangular system, then a set F of equations having the form of an unparameterized general solution is a most general solution for T if the formula (∀Z)(F ⇔ T) can be proved assuming PROPC.

Proof: The formula says that the two sets of equations are equivalent as theories extending PROPC, which implies that they have exactly the same models, and therefore in particular, have exactly the same Boolean models, and hence exactly the same Boolean solutions. □

In examples, this formula can be proved by checking it for all possible Boolean values of the input and parameter variables, since values of the other variables are determined by these using reduction, due to the forms of T and F. There are 2^{n+ℓ} cases to check, which is manageable in comparison with 2^{n+ℓ+m}, which could be much larger for a complex circuit.

The formula (∀Z)(F ⇒ T) says that if the y_k are defined according to F, then they satisfy T. Its converse, (∀Z)(T ⇒ F), says that if some assignment of values to Z satisfies T, then it also satisfies F; this is the "most general" part of "most general solution." Together these give the equivalence of the two theories. Note that the converse can be satisfied without F actually being a solution: for example, if T is inconsistent (i.e., has no solutions), then the converse formula is valid for any F whatsoever. Nevertheless, Proposition 7.4.16 shows that under some mild assumptions, it suffices to prove just the "most general" part of the equivalence; this explains why we only proved that direction in Example 7.4.1, and why we do the same in several examples below. Recall that B = {true, false}.

Lemma 7.4.15  If T is a total triangular system with consistent conditions, then every Boolean assignment i : X → B to input variables extends to a unique solution i* : Z → B for T.
Proof: Since T is total and has consistent conditions, the conditions of at least one equation with leftside y_1 evaluate to true, and all the rightsides of such equations evaluate to the same value. Therefore the value of y_1 is uniquely determined by the values of i. Similarly, the value of y_2 is uniquely determined by the values of i and y_1, and so on by induction, so that all the dependent variables in a solution are uniquely determined. □

Proposition 7.4.16  If T is a total triangular system with consistent conditions and if F has the form of an unparameterized general solution, then (∀Z)(T ⇒ F) implies (∀Z)(T ⇔ F), and hence implies that F is a most general solution.

Proof: Given a Boolean assignment i : X → B for input variables, there are unique extensions i*_T and i*_F that are solutions to T and F, by Lemma 7.4.15, noting that F is also total with consistent conditions. The formula in the hypothesis implies that i*_T = i*_F for any i : X → B, so we can write just i*. Uniqueness implies that a : Z → B is a solution to T iff a = i* when the restriction of a to X is i; the same holds for F. Hence, a : Z → B is a solution to T iff it is a solution to F, from which (∀Z)(F ⇔ T) follows by Corollary 7.3.19. □

In practice, it suffices to prove (∀X)(T ⇒ F), regarding the input variables as quantified and the dependent variables as constants, whose values are determined by X; this is considerably easier by case analysis, since there are considerably fewer cases than required by Z.

[Footnote: Using the inference rule ((∀Z)P) ∧ ((∀Z)Q) = (∀Z)(P ∧ Q), which is discussed in Chapter 8.]

The situation is more complex for parameterized solutions, where for Z′ = Z ∪ W, with W the parameter variables, the relevant formula is (∀Z)(((∃W)F) ⇔ T), for which it often suffices to prove just
(∀Z)(T ⇒ (∃W)F), noting that (∀Z)(((∃W)F) ⇒ T) is equivalent to (∀Z′)(F ⇒ T). We omit details, which are similar to those for Proposition 7.4.16.

The method of reduction of Proposition 7.4.3, and its extension to conditional equations in Proposition 7.4.10, provide a way to verify universally quantified implications of the kind considered here. (But note that rewriting with conditional rules modulo equations is not treated until Section 7.7.) Although often not applicable, the method of reduction is efficient when it is applicable, and Boolean case analysis is available when it is not. Also note that while proving a conditional equation by checking that its two sides reduce to the same thing, we should assume that its condition is true when doing the reduction, as is of course familiar mathematical practice (Chapter 8 gives a formal treatment). All this together gives a powerful and flexible tool set for verifying hardware circuits.

This subsection gives some examples of conditional triangular systems and their solutions, mainly so-called "cmos" circuits, which use n- and p-transistors in balanced pairs.

[Figure 7.4: A cmos not Gate]

Example 7.4.17 (not Gate)  We prove that the cmos circuit shown in Figure 7.4 implements a not gate (also called a negation, or an inverter gate), i.e., we prove that o = not i is a most general solution for T, which contains the two conditional equations of the module NOT below, describing the behavior of the two transistors:

    th NOT is extending PROPC .
      ops i o : -> Bool .
      cq o = true if not i .
      cq o = false if i .
    endth

We show that negation is a most general solution by case analysis on the variable i (the validity of case analysis was shown in Section 7.4.3):

    open NOT .
      eq i = true .
      red o iff not i .
    close
    open NOT .
      eq i = false .
      red o iff not i .
    close

Since both reductions give true and the circuit is easily seen to be total and consistent, the proof is done by Proposition 7.4.16. Although it is now unnecessary, we can also show directly that negation is a solution, by assuming it as an equation and then checking that the two conditional equations of the circuit hold:

    th BEH is ex PROPC .
      ops i o : -> Bool .
      eq o = not i .
    endth
    open .
      eq not i = true .
      red o == true .
    close
    open .
      eq i = true .
      red o == false .
    close

Example 7.4.18 (xor Gate)  We show that the six-transistor circuit of Figure 7.5 realizes the exclusive or (xor) function, i.e., that

    (∀Z) (T ⇔ (o = i1 xor i2)),

where T consists of the equations in the module XOR below, describing the behavior of the circuit, and where Z contains the four variables i1, i2, p1, o. The variables i1 and i2 are inputs, while p1 is internal and o is the output. To model this circuit, we write one conditional equation for each transistor:

    th XOR is extending PROPC .
      ops i1 i2 p1 o : -> Bool .
      cq p1 = false if i1 .
      cq p1 = true if not i1 .
      cq o = i1 if not i2 .
      cq o = p1 if i2 .
      cq o = i2 if not i1 .
      cq o = i2 if p1 .
    endth

Example 7.4.17 shows that the internal variable p1 is the negation of i1, but we nevertheless prove this again in the present context. Because this circuit is total, Proposition 7.4.16 implies that it suffices to show that xor is most general:

[Figure 7.5: A cmos xor Gate]

    open XOR .
      eq i1 = true . eq i2 = true .
      red p1 iff not i1 .
      red o iff i1 xor i2 .
    close
    open XOR .
      eq i1 = true . eq i2 = false .
      red p1 iff not i1 .
      red o iff i1 xor i2 .
    close
    open XOR .
      eq i1 = false . eq i2 = true .
      red p1 iff not i1 .
      red o iff i1 xor i2 .
    close
    open XOR .
      eq i1 = false . eq i2 = false .
      red p1 iff not i1 .
      red o iff i1 xor i2 .
    close

Since all reductions give true, the proof is done.
□

We can summarize our method for verifying a proposed unparameterized most general solution for a total consistent conditional system as follows: use case analysis on the input variables and then reduction to prove that the circuit equations imply the solution equations. This does not require human intervention: the OBJ proof score can be generated automatically from a circuit diagram and a proposed solution. Moreover, it is a kind of decision procedure under the conditions of Proposition 7.4.16, because an unparameterized system fails to be a most general solution for a circuit iff reduction fails to give true for some case of some equation (but the system could still be a solution, even though not most general). In some cases, this method can be more efficient than more traditional approaches, and it works for combinatorial circuits built from any (combinatorial) components, including, e.g., both transistors and logic gates.

[Figure 7.6: A cmos nor Gate]

Exercise 7.4.3  Use OBJ to prove correctness of the nor circuit shown in Figure 7.6 (where x nor y = not (x or y)). □

The method for checking solutions described above extends to partial solutions, where the desired behavior of a circuit is not determined for some input conditions. This can arise in specifications where it is known that certain inputs will never occur, so it doesn't matter what the circuit does in these cases. Partial specifications are preferable to choosing arbitrary values for these inputs, since this allows engineers more freedom to produce better designs.

Definition 7.4.19  A partial solution is a family of conditional equations, each of the form

    y_k = f_k(x_1, …, x_n) if c_k(x_1, …, x_n),

where y_k is a non-input variable, where x_1, …, x_n are the input variables, and where f_k and c_k are propositional terms. The cases where the predicates c_k(x_1, …
., xn) are not true correspond to what are often called don't care conditions. □

Although we do not give all the details here, when verifying a partial solution, it is necessary to determine when the condition of a conditional equation is true on Boolean models: first, substitute the expressions of the proposed solution into the condition; then compute the complete disjunctive normal form (see Proposition 7.3.24) of the result; next, consider each disjunct as a "case," in which the input variables that occur positively (i.e., those that are not negated) must be true, while those that occur negatively must be false; and finally, check that for each case the two sides evaluate to the same reduced form, using the values for the variables that belong to that case. For example, the condition of the sixth equation in Example 7.4.18 is p1; substituting the equation for p1 yields not i1, which is already in disjunctive normal form, with only one disjunct and hence only one case, in which i1 = false; the last open above sets up this assumption before evaluating the equation. More complex conditions can easily arise, and can be handled the same way. It is also valid to use exclusive normal form in the same manner.

The problem of verifying a proposed solution is inherently simpler than the problem of finding a solution, because we can look at the latter as having the form

  (∀X)(∃N) PROPC ⊨Σ T,

where T is a propositional system and X, N are its input and non-input variables, respectively. Although standard solution methods like Gaussian elimination require linearity for their completeness, essentially the same method of successive substitution and simplification can often be used to find solutions for general propositional systems; sometimes even most general solutions can be found this way. Note that the above formula only concerns Boolean solutions; a formula defining general solutions would require second-order variables.
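The don't-care discipline just described can also be prototyped by brute-force enumeration of Boolean models instead of computing disjunctive normal forms. The Python encoding below, including the helper name check_partial_equation, is ours and only illustrates the idea on the xor circuit of Example 7.4.18:

```python
from itertools import product

# Proposed (total) solution for the xor circuit of Example 7.4.18;
# this dictionary encoding is an illustration, not the book's notation.
solution = {
    "p1": lambda i1, i2: not i1,
    "o":  lambda i1, i2: i1 != i2,
}

def check_partial_equation(lhs, rhs, cond):
    """Return the input cases where 'lhs = rhs if cond' fails under the
    proposed solution; cases where the condition is false are don't-cares
    and are skipped, mirroring the case analysis described in the text."""
    failures = []
    for i1, i2 in product([False, True], repeat=2):
        env = {name: f(i1, i2) for name, f in solution.items()}
        env.update(i1=i1, i2=i2)
        if cond(env) and lhs(env) != rhs(env):
            failures.append((i1, i2))
    return failures

# Sixth circuit equation, o = i2 if p1: substituting the solution into
# the condition gives not i1, so only cases with i1 = false are checked.
assert check_partial_equation(lambda e: e["o"], lambda e: e["i2"],
                              lambda e: e["p1"]) == []
```

Enumerating models where the condition holds gives exactly the cases that the complete disjunctive normal form would produce, one assignment per disjunct.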
Exercise 7.4.4 Design and verify a circuit that implements the conditional equation a = b if c = d. (Note that this is a partial specification.) □

Recall that bidirectional circuits may involve feedback loops, in the sense that they are not triangular. It may not be obvious that a function-based formalism can deal with bidirectional logic circuits, due to the input/output character of functions. But equational logic is based on the relation of equality, which is symmetric, and conditional equations provide additional expressive power. Section 7.4.4 showed that this framework is sufficient for mos transistor circuits. We now generalize beyond the simple input-output flow-oriented structure of triangular systems, to circuits described by arbitrary conditional propositional systems, but with a designated subset of "external" variables, instead of designated input and output variables as in Definition 7.4.8. The transistor and the cell (Example 7.4.21 below) are examples.

Definition 7.4.20 A propositional system (or propositional theory) over a variable set Z is a finite set of (possibly conditional) Σ(Z)-equations, where Σ is the signature of PROPC, with a designated subset of Z called the external variables, some of which may be input variables, and with the remainder called the internal variables. □

The notions of Boolean solution, consistency, underdetermination, Boolean equivalence, and Boolean model for propositional systems are the same as in Definition 7.4.5, and the bijection between Boolean solutions and Boolean models also carries over, so that two propositional systems are Boolean equivalent iff they have the same Boolean models.
Moreover, the notions of general solution and most general solution in Definition 7.4.11 also carry over.

The task of proving that some equations E are satisfied if the equations T describing a circuit are satisfied has the form (∀Z)(T ⇒ E), and can therefore be proved by showing PROPC ⊨Σ(Z) (T ⇒ E), which in turn can be proved by showing PROPC ∪ T ⊨Σ(Z) E, in which the variables in the equations of PROPC remain variables, while those in T and E become constants. The method of Proposition 7.4.16 is not available, because neither set of equations is in triangular form. Therefore to show that a proposed solution is most general, it is necessary both to prove that it is a solution by proving the formula (∀Z)(T ⇒ E) as above, and to prove the converse formula, (∀Z)(E ⇒ T); of course, we hope these can be done by reduction based on the forms PROPC ∪ T ⊨Σ(Z) E and PROPC ∪ E ⊨Σ(Z) T, but in general some case analysis is required.

Example 7.4.21 (Cell) Figure 7.7 is a cmos circuit for a simple 1-bit memory cell. This circuit is underdetermined, and in fact is bistable, i.e., it has exactly two distinct Boolean solutions, which are its possible stably persistent memory states. The system describing this circuit is

  p1 = true if not p2
  p1 = false if p2
  p2 = true if not p1
  p2 = false if p1

where p1 and p2 are external variables (so there are no internal variables). This system is equivalent to

  p1 = not p2
  p2 = not p1

as well as to p1 = not p2. This circuit has no designated inputs or outputs because the wires p1 and p2 are used bidirectionally, for both reading and writing.

[Figure 7.7: A 1-bit cmos Cell]
Because solutions are Boolean expressions in the internal variables, of which there are none, and because there are only two Boolean expressions that depend on no variables, namely true and false, it is clear that the two systems

  p1 = true          p1 = false
  p2 = false   and   p2 = true

are the only possible unparameterized solutions for this circuit, and it is also clear that they do indeed satisfy the equations.

We can introduce a parameter q for a most general parameterized solution

  p1 = not q
  p2 = q

and observe that it is indeed a most general solution because it has exactly the above two Boolean solutions as its Boolean instances.

We can also give a mechanical proof that the above parameterized system is indeed a solution. The OBJ3 proof score for this is a bit more subtle than previous examples because the cases when the condition of a conditional equation is true must be expressed in terms of the parameter variable q.

  th BEH is extending PROPC .
    ops p1 p2 q : -> Bool .
    eq p1 = not q .
    eq p2 = q .
  endth
  *** p1 = true if not p2 .
  open BEH . eq q = false .
  reduce p1 iff true .
  close
  *** p1 = false if p2 .
  open BEH . eq q = true .
  reduce p1 iff false .
  close
  *** p2 = true if not p1 .
  open BEH . eq q = true .
  reduce p2 iff true .
  close
  *** p2 = false if p1 .
  open BEH . eq q = false .
  reduce p2 iff false .
  close

Since all these reductions give true, we are done. □

Exercise 7.4.5 Design and verify a cmos circuit that can store and read two bits. □

7.5 Proving Termination Modulo Equations

This section considers ways to prove termination modulo equations, generalizing results from Chapter 5 on rewriting with unconditional rules, and illustrating their use on examples from earlier in this book. We first note that any TRS result proved from an ARS result will of course generalize to MTRS's, although this doesn't get us very far with termination proofs. We begin by showing that a terminating TRS need not be terminating modulo B.
Example 7.5.1 Let A contain the rules a + b → c and c → b + a, where a, b, c are constants, + is binary, and there is just one sort. It is not hard to see that A is terminating. Now let B contain the commutative law for +. Then a + b ⇒A/B c ⇒A/B a + b, because b + a ≃B a + b. Therefore rewriting with A modulo B is nonterminating. □

Nevertheless, there is a simple condition that sometimes allows us to infer termination of A modulo B from termination of A; we will see later that the same condition also works for the Church-Rosser and local Church-Rosser properties.

Definition 7.5.2 A Σ-TRS A and set B of Σ-equations are said to commute iff for any Σ-term t, whenever t ≃B t′ and t′ ⇒A t1, there is a Σ-term t′1 such that t ⇒A t′1 and t′1 ≃B t1. □

Lemma 7.5.3 Given a Σ-TRS A commuting with Σ-equations B, if t0 ⇒A/B t1 ··· ⇒A/B tn then there exist t′1, ..., t′n such that t0 ⇒A t′1 ··· ⇒A t′n and t′n ≃B tn. The same result also holds for weak rewriting modulo B.

Proof: The first step, t0 ⇒A t′1 (with t′1 ≃B t1), is direct from commutativity; next, because t1 ⇒A/B t2 and t′1 ≃B t1, we get t′1 ⇒A t′2 with t′2 ≃B t2; and we can continue in this same way until we get t′n-1 ⇒A t′n with t′n ≃B tn. The result for weak rewriting modulo B now follows because ⇒A,B is a subrelation of ⇒A/B. □

Proposition 7.5.4 Given a Σ-TRS A commuting with Σ-equations B, if A is terminating, then A is also terminating modulo B; this also holds for weak rewriting modulo B.

Proof: We prove the contrapositive. Suppose there is an infinite sequence t0 ⇒A/B t1 ··· ⇒A/B tn ⇒A/B ···. Then we can construct an infinite sequence t0 ⇒A t′1 ··· ⇒A t′n ⇒A ··· by repeatedly using Lemma 7.5.3.
This also holds for weak rewriting because ⇒A,B is a subrelation of ⇒A/B. □

We now generalize some results from Chapter 5 to rewriting modulo B. For the results that arise from ARS's, we can simply re-apply the ARS result, without doing any new work. For example, instead of generalizing the proof of Proposition 5.5.1 on page 111, we can simply apply its ARS version, which is Proposition 5.8.16 on page 134, to obtain the following, in which we also generalize from ω to an arbitrary Noetherian poset P:

Proposition 7.5.5 An MTRS M = (Σ, A, B) is ground terminating if there is a function ρ : TΣ,B → P, where P is a Noetherian poset, such that for all ground (Σ, B)-terms t, t′, if t ⇒A/B t′ then ρ(t) > ρ(t′). Moreover, the converse holds provided M is globally finite. □

Note that if we want to define ρ using the initiality of TΣ,B, then we have to check that the Σ-structure given to P actually satisfies B.

Of course, we also want to generalize Proposition 5.5.4, which gives a termination criterion that is easier to apply in practice, by taking account of the structure of terms. Here we cannot rely on an ARS result, but fortunately the proof generalizes from terms to B-classes of terms; again we generalize to an arbitrary Noetherian poset P:

Proposition 7.5.6 Given an MTRS M = (Σ, A, B) and ρ : TΣ,B → P with P Noetherian, if

(1) ρ(θ(t)) > ρ(θ(t′)) for each t → t′ in A and applicable substitution θ : X → TΣ,B, and

(2) ρ(t) > ρ(t′) implies ρ(t0(z ← t)) > ρ(t0(z ← t′)) for each t, t′ ∈ (TΣ,B)s and any t0 ∈ TΣ,B({z}s) having a single occurrence of z,

then M is ground terminating. □

Note that expressions of the form ρ(t) really mean ρ([t]) above. The proof is the same as that of Proposition 5.5.4, with ⇒A/B substituted for ⇒A.
We next generalize Proposition 5.5.6 in the same way, noting that all the concepts in Definition 5.5.3 generalize from Σ-terms to B-classes of Σ-terms by substituting ρ([__]) for ρ(__), and that the proof of Proposition 5.5.5 in Appendix B also generalizes, using induction on the Σ-structure of Σ-terms that represent (Σ, B)-terms:

Proposition 7.5.7 Given an MTRS M = (Σ, A, B) and ρ : TΣ,B → P with P Noetherian, if

(1) each rule in A is strict ρ-monotone, and

(2′) each σ ∈ Σ is strict ρ-monotone,

then M is ground terminating. □

Here the rules must be seen as class rewriting rules, and monotonicity must be interpreted on classes. As before, it is often easiest to define the function ρ by giving its target a (Σ, B)-algebra structure and then letting initiality define ρ, as in the following:

Example 7.5.8 We can use Proposition 7.5.7 with P = ω to simplify the ground termination proof of Example 5.5.7 by building in commutativity, i.e., putting it into B. The resulting simplified specification is as follows:

  obj ANDCOMM is sort Bool .
    ops tt ff : -> Bool .
    op _&_ : Bool Bool -> Bool [comm] .
    var X : Bool .
    eq X & tt = X .
    eq X & ff = ff .
  endo

Letting Σ denote the signature of ANDCOMM, we give ω the structure of a Σ-algebra by defining

  ωtt = ωff = 1 and ω&(m, n) = m + n.

Note that ω with this structure is a (Σ, B)-algebra (because addition is commutative), and let ρ denote the resulting unique Σ-homomorphism TΣ,B → ω. Then by Proposition 7.5.7, we only need to prove

  ρ(x & tt) > ρ(x)
  ρ(x & ff) > ρ(ff)

for all x ∈ TΣ, plus (from condition (2′)) that ρ(t) > ρ(t′) implies

  ρ(t2 & t) > ρ(t2 & t′)

for all t, t′, t2 ∈ TΣ. (Note again that there is an implicit [_] inside each instance of ρ above.) As in Example 5.5.7, the proofs are all trivial, but there are only half as many of them.
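The inequalities for ANDCOMM can be checked mechanically. The Python sketch below is our own illustration, with nested tuples standing for ground terms and assuming the interpretation ωtt = ωff = 1 and ω&(m, n) = m + n stated above:

```python
import random

# rho interprets tt and ff as 1, and x & y as rho(x) + rho(y); since
# addition is commutative, rho is well defined on B-equivalence classes.
def rho(t):
    if t in ("tt", "ff"):
        return 1
    _, x, y = t          # t has the form ("&", x, y)
    return rho(x) + rho(y)

def random_term(depth):
    """Generate a random ground ANDCOMM term of bounded depth."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(["tt", "ff"])
    return ("&", random_term(depth - 1), random_term(depth - 1))

random.seed(0)
for _ in range(200):
    x = random_term(4)
    assert rho(("&", x, "tt")) > rho(x)        # rule x & tt -> x
    assert rho(("&", x, "ff")) > rho("ff")     # rule x & ff -> ff
    t, t2 = random_term(4), random_term(4)
    if rho(t) > rho(t2):                       # condition (2'): & is
        assert rho(("&", x, t)) > rho(("&", x, t2))  # strict rho-monotone
```

Random sampling is of course no substitute for the (trivial) proofs, but it is a quick sanity check on a proposed interpretation before doing them.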
□

Example 7.5.9 We can use Proposition 7.5.7 with P = ω to prove first ground termination and then (non-ground) termination of the basic propositional calculus specification that we have been using as a decision procedure:

  obj BPROPC is sort Bool .
    ops true false : -> Bool .
    op _and_ : Bool Bool -> Bool [assoc comm prec 2] .
    op _xor_ : Bool Bool -> Bool [assoc comm prec 3] .
    vars P Q R : Bool .
    eq P and false = false .
    eq P and true = P .
    eq P and P = P .
    eq P xor false = P .
    eq P xor P = false .
    eq P and (Q xor R) = (P and Q) xor (P and R) .
  endo

We let B contain the associative and commutative laws for and and xor, and we define a B-algebra structure on ω as follows, where p, q range over ω:

  ωtrue = ωfalse = 2
  ωand(p, q) = pq
  ωxor(p, q) = p + q + 1.

Now observe that ω with this structure satisfies B, because product and addition are both associative and commutative, and that the resulting unique (Σ, B)-homomorphism ρ satisfies

  ρ(true) = ρ(false) = 2
  ρ(P and Q) = ρ(P)ρ(Q)
  ρ(P xor Q) = ρ(P) + ρ(Q) + 1,

and also (by induction) ρ(P) > 1 for all P. It is now easy to check that all rules and both operations are strict ρ-monotone, the least trivial being the distributive law. It then follows that BPROPC is ground terminating. If constants, such as p, q, r, are added to the signature, then by defining ρ on them to be 2, the above results still hold, and termination again follows; but since variables are just such constants, we get termination, not just ground termination. □

Exercise 7.5.1 Use OBJ3 to check all equalities and inequalities needed in Example 7.5.9. □

Example 7.5.10 We can prove termination of the MTRS DNF of Example 7.3.11 in much the same way. We first consider just the operations and, or and not, with the first fourteen equations, and we let B contain the associative and commutative laws for and and or.
Then as in Example 7.5.9 above, we can give a B-algebra structure for ω:

  ωtrue = ωfalse = 3
  ωand(p, q) = pq
  ωor(p, q) = p + q + 1
  ωnot(p) = p + 1.

Because ω with this structure satisfies B (since product and addition are both associative and commutative), the unique (Σ, B)-homomorphism ρ satisfies

  ρ(true) = ρ(false) = 3
  ρ(P and Q) = ρ(P)ρ(Q)
  ρ(P or Q) = ρ(P) + ρ(Q) + 1
  ρ(not P) = ρ(P) + 1,

and also (by induction) ρ(P) > 2 for all P. It is easy to check that all fourteen rewrite rules and all three operations are strict ρ-monotone, the least trivial again being the distributive law. It follows that DNF is ground terminating, and because after adding constants, such as p, q, r, to the signature and defining ρ on them to be 3, all of the above results still hold, we get termination.

To extend this to the full MTRS of Example 7.3.11, we define each additional operation on ω to be one more than ρ of the rightside of its defining equation. For example,

  ωxor(p, q) = p(q + 1) + q(p + 1) + 2
  ωimplies(p, q) = (p + 1) + q + 2.

It is easy to see that these operations and their rules are strict monotone, so the full MTRS is ground terminating by Proposition 7.5.7, and then termination follows using the same trick. □

The above gives a general method for proving ground termination of an extension of an MTRS by derived operations when the MTRS without them has already been proved terminating using Proposition 7.5.7.

Exercise 7.5.2 Prove termination of the entire PROPC MTRS using the above method for handling derived operations. □

7.6 Proving Church-Rosser Modulo Equations

This section generalizes results from Chapter 5 for proving the Church-Rosser property to unconditional rewriting modulo equations. We defer some proofs and even statements of results to Section 7.7.3, which treats the more general case of conditional rewriting modulo equations.
We first show that a Church-Rosser TRS is not necessarily Church-Rosser modulo equations:

Example 7.6.1 Let Σ be one-sorted with constants a, b, 0, let A have the rules (a + 0) + b → 0 and a + b → a, and let B have the equations 0 + X = X and X + (Y + Z) = (X + Y) + Z. Then (a + 0) + b ⇒A/B 0, and (a + 0) + b ≃B a + b ⇒A/B a, where 0 and a are both reduced. Therefore A modulo B is not Church-Rosser, although A without B is Church-Rosser. □

The same simple commutativity condition that let us infer termination modulo B from termination without B also works for the Church-Rosser property; the result below is a special case of Proposition 7.7.23, which is proved in Section 7.7.3.

Proposition 7.6.2 If A is a Church-Rosser (or locally Church-Rosser) Σ-TRS commuting with Σ-equations B, then A is Church-Rosser (or locally Church-Rosser) modulo B. □

The following is a straightforward application of the above plus Proposition 7.5.4:

Proposition 7.6.3 Any triangular propositional system T is canonical modulo the associative and commutative laws for and and xor.

Proof: Associative and commutative laws commute with any rule having a constant as leftside, and Example 5.8.27 showed that any such T is terminating and Church-Rosser (without B). □

Exercise 7.6.1 Give a proof or counterexample for the assertion that if A is Church-Rosser and commutes with B then weak rewriting modulo B is also Church-Rosser. □

Results in Chapter 5 for proving the Church-Rosser property that follow from ARS results immediately extend to MTRS's. One such is the Hindley-Rosen Lemma, which is stated in even greater generality in Section 7.7.3. Here we state another modulo B result, which follows directly from Proposition 5.7.4:

Proposition 7.6.4 (Newman Lemma) A terminating MTRS is Church-Rosser iff it is locally Church-Rosser.
□

Since Example 7.5.9 shows termination of PROPC, it would be very desirable to use Proposition 7.6.4 to prove Hsiang's Theorem (Theorem 7.3.13) by proving the local Church-Rosser property for PROPC. This provides strong motivation for generalizing the Critical Pair Theorem (Theorem 5.6.9) to MTRS's. Unfortunately, this is far from straightforward; some aspects of the problem are treated in Chapter 12. Here we content ourselves with some simple, specialized results, and with showing that the Critical Pair and Orthogonality Theorems do not generalize straightforwardly to the modulo B case. First we generalize the relevant definitions:

Definition 7.6.5 Two rules of an MTRS with leftsides ℓ, ℓ′ overlap iff there exist a subterm ℓ0 of ℓ that is not just a variable, and substitutions θ, θ′ such that θ(ℓ0) ≃B θ′(ℓ′). If the two rules are the same, it is required in addition that the corresponding substitution instances of the rightsides are not equivalent modulo B, and in this case the rule is called self-overlapping. An MTRS is overlapping iff it has two rules (possibly the same) that overlap, and then (the B-class of) θ(ℓ0) is called an overlap of the rules; otherwise the MTRS is called non-overlapping. A most general overlap p of ℓ, ℓ′ at ℓ0 is an overlap of ℓ, ℓ′ at ℓ0 such that any other is equal (modulo B) to a substitution instance of p, and a complete overlap set for ℓ, ℓ′ at ℓ0 is a set of overlaps of ℓ, ℓ′ at ℓ0 such that any other is equal (modulo B) to a substitution instance of some overlap in the set. □

Note that the subredex need not be proper in the self-overlapping case, as was required in Definition 5.6.2 for ordinary term rewriting.
However, it is still true that if the leftsides ℓ, ℓ′ of two rules in A overlap at θ(ℓ0), then that overlap can be rewritten in two different ways (one for each rule). The following, due to Dr. Monica Marcus, refutes the straightforward modulo B generalization of the TRS orthogonality theorem that replaces all concepts by their modulo B counterparts:

Example 7.6.6 Let Σ be one-sorted with a binary operation + and constants a, b, let A have the rules a + b → b and b + a → a, and let B have the associative law. Then A is non-overlapping modulo B, but a + b + a ⇒A/B a + a, and a + b + a ⇒A/B b + a ⇒A/B a, which are both reduced modulo B. Therefore A modulo B is not Church-Rosser. (And it is not hard to see that this MTRS terminates.) □

The following is proved in Chapter 12, recalling that the associative, commutative, and identity laws are abbreviated A, C, I, respectively:

Proposition 12.0.2 Given an MTRS (Σ, A, B), if B consists of any combination of A, C, I laws for operations in Σ, except A, I and AI, and if the leftsides ℓ, ℓ′ of two rules in A overlap at a subterm ℓ0 of ℓ, then there is a finite complete overlap set for ℓ, ℓ′ at ℓ0. □

Note that any finite complete overlap set contains a minimal such set, in the sense that no proper subset of it is a complete overlap set; however, there may be more than one such subset.

Definition 7.6.7 An MTRS (Σ, A, B) is said to have complete overlaps iff whenever leftsides ℓ, ℓ′ of rules in A overlap at a subterm ℓ0 of ℓ, they have a finite complete overlap set. Each such overlap is called a superposition of ℓ, ℓ′, and the pair of rightsides resulting from applying the two rules to the overlap θ(ℓ0) is called a critical pair.
If the two terms of a critical pair can be rewritten modulo B to a common term using A, then that critical pair is said to converge or to be convergent. □

The following illustrates the definitions above:

Example 7.6.8 The first two rules of PROPC (see Example 7.5.9), with leftsides t = P and false and t′ = P and true, have the overlap true and false modulo B, with θ(P) = true and θ′(P) = false. Then the term θ(t) ≃B θ′(t′), namely true and false, rewrites to false in two different ways. (Only the commutative law for and is actually used here.) □

For the Critical Pair Theorem (Theorem 5.6.9) to generalize to the modulo B case would mean that an MTRS with complete overlaps is locally Church-Rosser if all its critical pairs are convergent. This does not hold, and in fact Example 7.6.6 is a counterexample. Chapter 12 discusses some algorithms that generalize unification for computing an analog of critical pairs for MTRS's over certain sets of equations, and thus deciding the local Church-Rosser property.

The following weak modulo B orthogonality result, which follows from the ordinary version, is the best we can do here:

Proposition 7.6.9 (Weak Orthogonality Modulo B) Given an MTRS M = (Σ, A, B), let R = A ∪ B ∪ B⌣ (where B⌣ denotes the converse of B). If R is lapse free and orthogonal, and if B is balanced, then M is Church-Rosser.

Proof: (Σ, R) is Church-Rosser by Theorem 5.6.4, and since B is balanced and lapse free, any rewrite sequence t *⇒A/B t′ expands to a rewrite sequence t *⇒R t′, which implies that M is also Church-Rosser.
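The divergence in Example 7.6.6 can be confirmed by machine. In the Python sketch below (our own encoding, not the book's), associativity is built into the representation by flattening +-terms into sequences of constants, so rewriting modulo B becomes contiguous subsequence replacement:

```python
def rewrites(term):
    """One-step A/B-rewrites of a +-term, for A = { a+b -> b, b+a -> a };
    associativity (the set B) is built in by flattening terms to tuples."""
    rules = [(("a", "b"), ("b",)), (("b", "a"), ("a",))]
    out = set()
    for i in range(len(term) - 1):
        for lhs, rhs in rules:
            if term[i:i + 2] == lhs:
                out.add(term[:i] + rhs + term[i + 2:])
    return out

def normal_forms(term):
    """All A/B-normal forms reachable from term; terminates because
    every rewrite step strictly shortens the term."""
    succ = rewrites(term)
    if not succ:
        return {term}
    return set().union(*(normal_forms(s) for s in succ))

# a + b + a has two distinct normal forms modulo associativity, so the
# system of Example 7.6.6 is terminating but not Church-Rosser modulo B.
assert normal_forms(("a", "b", "a")) == {("a",), ("a", "a")}
```

The two normal forms a and a + a are exactly the endpoints of the two rewrite sequences displayed in the example.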
□

Unfortunately, this is not very useful: the associative law is disqualified because it is self-overlapping; although the commutative law satisfies the assumptions for B, it is unlikely to be non-overlapping with interesting rule sets A; moreover, B cannot contain identity or idempotent laws, since these (or else their converses) are not lapse free rewrite rules.

Exercise 7.6.2 Use Propositions 7.6.9 and 7.6.2 for an alternative proof of Proposition 7.6.3. □

The results of Section 5.3 on adding new constants generalize straightforwardly to the modulo B setting, and as before, are important for theorem proving; however, we do not explicitly state them here, but refer the reader to Section 7.7.1, which gives more general results for conditional rules modulo B.

7.7 Conditional Term Rewriting Modulo Equations

We first develop an ARS version of join conditional term rewriting, which will let us define conditional rewriting modulo equations more easily than by developing it directly; we will also use it again for order-sorted term rewriting in Chapter 10. An unconditional ARS (Definition 5.7.1) consists of a set of "rules," which are really just pairs of elements of the same sort from an indexed set T, by convention written t → t′. We generalize this as follows:

Definition 7.7.1 A join conditional ARS, abbreviated JCARS or just CARS, is a pair (T, W), where T is S-sorted and W is a set of conditional rules on T, which are (n + 1)-tuples, for n ≥ 0, of pairs of elements of T of the same sort, by convention written in one of the forms

  t → t′ if t1 = t′1, ..., tn = t′n
  (⋀i=1..n ti = t′i) ⇒ t → t′

where the first pair, or head, of the tuple is t → t′. Note that unconditional rules are the special case where n = 0.
Now given a CARS (T, W), define an ordinary ARS on T by

  R0 = {⟨t, t⟩ | t ∈ T}
  Rk+1 = {⟨t, t′⟩ | (⋀i=1..n ti = t′i) ⇒ t → t′ in W and ti ↓ t′i by Rk for i = 1, ..., n} ∪ R*k

for each k ≥ 0, where R*k denotes the transitive, reflexive closure of Rk, and then let R = ⋃k≥0 Rk.

We often write W⋄ for the relation R in the following. Now define an ordinary ARS on T by t →W t′ iff there is a rule (⋀i=1..n ti = t′i) ⇒ t → t′ in W such that ti ↓ t′i using W⋄ for i = 1, ..., n. We call this the ARS defined by W, and we may write (T, →W) or even just (T, →) for it. We say that a relation R on T is join closed under W iff whenever t → t′ if t1 = t′1, ..., tn = t′n is in W and ti ↓ t′i by R for i = 1, ..., n, then ⟨t, t′⟩ is in R. (When n = 0, this just means ⟨t, t′⟩ ∈ R.) □

Proposition 7.7.2 Given a CARS (T, W), then W⋄ (as above) is the least transitive, reflexive relation on T that is join closed under W. Moreover, the relation *→W is equal to W⋄.

Proof: We write R for W⋄. Reflexivity of R follows from the inclusion of R0. To show transitivity, suppose ⟨t1, t2⟩, ⟨t2, t3⟩ ∈ R. Then there is some k such that ⟨t1, t2⟩, ⟨t2, t3⟩ ∈ Rk, so that ⟨t1, t3⟩ is in R*k and hence in R. To show join closure under W, suppose t → t′ if t1 = t′1, ..., tn = t′n is in W and that ti ↓ t′i by R for i = 1, ..., n. Then there is some k such that ti ↓ t′i by Rk for i = 1, ..., n, so that ⟨t, t′⟩ is in Rk+1 and hence is in R. To show minimality, suppose R′ is join closed under W.
Then R0 ⊆ R′, and also Rk ⊆ R′ implies Rk+1 ⊆ R′. Therefore R ⊆ R′.

For the second assertion, we first show →W ⊆ W⋄, which implies *→W ⊆ W⋄ since W⋄ is transitive and reflexive. So we suppose t →W t′. Then there exists k such that ti ↓ t′i by Rk, which implies that ⟨t, t′⟩ is in Rk+1 and hence in W⋄. To prove the converse, we show Rk ⊆ *→W for all k. R0 ⊆ *→W since *→ is reflexive by definition. Next suppose ⟨t, t′⟩ is in Rk+1 but not in R*k. Then there is a rule t → t′ if t1 = t′1, ..., tn = t′n in W with ti ↓ t′i by Rk for i = 1, ..., n. Therefore t →W t′, since ti ↓ t′i also by W⋄, since Rk ⊆ W⋄. □

This result can be used to prove properties of *→W. We now apply the CARS machinery to conditional term rewriting modulo equations:

Definition 7.7.3 A conditional modulo term rewriting system, abbreviated CMTRS, is (Σ, A, B), where A is a set of (possibly) conditional Σ-rewrite rules, and B is a set of unconditional Σ-equations. From a given CMTRS (Σ, A, B), we define two different CARS's, where t → t′ if t1 = t′1, ..., tn = t′n has sort s and is in A, Y = var(t), θ : Y → TΣ, and u ∈ TΣ({z}s):

1. For class rewriting, let W be the set of rules of the form

  c → c′ if c1 = c′1, ..., cn = c′n

where c = [u(z ← θ(t))], c′ = [u(z ← θ(t′))], ci = [θ(ti)] and c′i = [θ(t′i)] for i = 1, ..., n.

2. For term rewriting, let W be the set of rules of the form

  v → v′ if v1 = v′1, ..., vn = v′n

where v ≃B u(z ← θ(t)), v′ ≃B u(z ← θ(t′)), vi = θ(ti) and v′i = θ(t′i) for i = 1, ..., n.
Definition 7.7.1 now yields an ARS for each of these CARS's. We write ⇒[A/B] for the first and ⇒A/B for the second; these are conditional class rewriting modulo equations and conditional term rewriting modulo equations, respectively. The pair (u, θ) is called a match. As before, rewriting extends to terms with variables by extending Σ to Σ(X), in which case we write c ⇒[A/B],X c′ and t ⇒A/B,X t′, defined on TΣ(X),B and TΣ(X) respectively. □

All ARS results, e.g., the Newman lemma and the multi-level termination results in Section 5.8.2, apply, because ⇒[A/B] and ⇒A/B are defined by ARS's; Theorem 7.7.7 below shows an equivalence of ⇒[A/B] and ⇒A/B. The following is proved similarly to Proposition 7.3.2:

Proposition 7.7.4 Given t, t′ ∈ TΣ(Y), Y ⊆ X and CMTRS (Σ, A, B), then t ⇒A/B,X t′ iff t ⇒A/B,Y t′, and in both cases var(t′) ⊆ var(t). Therefore t *⇒A/B,X t′ iff t *⇒A/B,Y t′, and in both cases var(t′) ⊆ var(t). □

Thus both ⇒A/B,X and *⇒A/B,X restrict and extend reasonably over variables, so we can drop the subscript X and use any X with var(t) ⊆ X; also as before, *⇔A/B,X does not restrict and extend reasonably, as shown by Example 5.1.15, so we define t *⇔A/B t′ to mean that there exists an X such that t *⇔A/B,X t′. Example 5.1.15 also shows bad behavior for ≃^X_{A/B} (defined by t ≃^X_{A/B} t′ iff A ∪ B ⊨ (∀X) t = t′), although again, rule (8) in Chapter 4 (extended to rewriting modulo B) implies that ≃^X_{A,B} does behave reasonably when the signature is non-void.
Defining ↓A/B,X from the ARS, we generalize Proposition 5.1.13, again allowing the subscript X to be dropped:

Proposition 7.7.5 Given t, t′ ∈ TΣ(Y), Y ⊆ X and CMTRS (Σ, A, B), then we have t ↓A/B,X t′ if and only if t ↓A/B,Y t′, and moreover, these imply A ∪ B ⊢ (∀X) t = t′. □

We next give a cute proof to illustrate the approach (some additional theory needed for its justification is discussed after Theorem 7.7.10):

Example 7.7.6 We continue Example 7.3.6 by showing that a ring has no zero divisors (i.e., non-zero elements a, b such that a * b = 0) if it satisfies the left cancellation law, that a * b = a * c and a ≠ 0 imply b = c. For this proof, we turn on the "reduce conditions" feature, so that when the conditional rewrite rule for the cancellation law is applied, its condition is automatically checked by reduction; because this rule involves Boolean "and" it is also important that include BOOL be turned on. The result of Example 7.3.6 is used as a lemma.

  set reduce conditions on .
  open RING . vars-of .
  ops a b c : -> R .
  eq A * 0 = 0 .    *** the lemma
  [lc] cq B = C if A * B == A * C and A =/= 0 .
  eq a * b = 0 .    *** the assumption
  show rules .
  start b .
  apply .lc with C = 0, A = a at top .
  close

Since the result is 0, as expected, the proof is done. Note that both == and =/= are used in the conditional equation lc. □

Exercise 7.7.1 Prove the converse of the last result of Example 7.3.6, that a ring satisfies the left cancellation law if it has no zero divisors. □

We now generalize some basic semantic results on term rewriting modulo equations to the conditional case, beginning with Theorem 7.3.4, noting that Proposition 7.2.2 and Theorem 7.2.3 were already stated for conditional equations. As in the unconditional case, we cannot hope for completeness due to the semantics of join conditional rewriting.
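The iterative construction of W⋄ in Definition 7.7.1 can be prototyped on a finite carrier. The Python sketch below, including the toy rule set and the helper names, is our own illustration: it computes the limit of the Rk by fixpoint iteration.

```python
from itertools import product

def star(rel, carrier):
    """Transitive, reflexive closure of a relation on a finite carrier."""
    closure = set(rel) | {(t, t) for t in carrier}
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

def joinable(rel, a, b, carrier):
    """a and b rewrite to a common element under rel."""
    return any((a, c) in rel and (b, c) in rel for c in carrier)

def cars_relation(rules, carrier):
    """Iterate R0 = identity, R(k+1) = fired rules together with Rk*,
    until the sequence is stationary; on a finite carrier the limit is
    the relation written W-diamond in Definition 7.7.1."""
    r = {(t, t) for t in carrier}
    while True:
        fired = {(t, u) for (t, u, conds) in rules
                 if all(joinable(r, x, y, carrier) for x, y in conds)}
        nxt = fired | star(r, carrier)
        if nxt == r:
            return r
        r = nxt

# Toy conditional system: 1 -> 2 unconditionally, and 2 -> 3 provided
# 1 and 2 are joinable, which becomes true once 1 -> 2 is available.
carrier = {1, 2, 3}
rules = [(1, 2, []), (2, 3, [(1, 2)])]
w = cars_relation(rules, carrier)
assert (1, 3) in w and (2, 3) in w
```

The conditional rule 2 → 3 only fires at the second stage of the iteration, exactly as in the stratified definition of the Rk.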
Theorem 7.7.7 (Soundness) Given a CMTRS (Σ, A, B) and t, t′ ∈ T_Σ(X), then

  t ⇒A/B t′ iff [t] ⇒[A/B] [t′],
  t ⇒*A/B t′ iff [t] ⇒*[A/B] [t′],
  [t] ⇒*[A/B] [t′] implies [A] ⊢_B (∀X) [t] = [t′],
  t *⊣A/B t′ implies A ∪ B ⊢ (∀X) t = t′.

Also *⊣A/B is sound for satisfaction of A ∪ B, and *⊣[A/B] is sound for satisfaction of A modulo B. Moreover, *⊣A/B ⊆ ≃^X_{A∪B} on terms with variables in X.

Proof: The first assertion follows from the definitions of ⇒A/B and ⇒[A/B] (Definition 7.7.3), and the second follows from the first by induction. The third follows because ⇒[A/B] rephrases the rule (+C B). The fourth follows from the second and third, plus Theorem 7.2.3. The fifth and sixth follow from the third and fourth plus Theorem 7.2.3. □

The following generalizes Theorem 7.3.9 to the conditional case:

Theorem 7.7.8 Given a ground canonical CMTRS (Σ, A, B), if t′, t′′ are both normal forms of a ground term t under ⇒A/B then t′ ≃_B t′′. Moreover, the B-equivalence classes of ground normal forms under ⇒A/B form an initial (Σ, A ∪ B)-algebra, denoted N_{Σ,A/B}, in the following way, where [[t]] denotes any arbitrary normal form of t, and where [[t]]_B denotes the B-equivalence class of [[t]]:

(0) interpret σ ∈ Σ_{[],s} as [[σ]]_B in N_{Σ,A/B,s}; and
(1) interpret σ ∈ Σ_{s1...sn,s} with n > 0 as sending ([[t_1]]_B, …, [[t_n]]_B) with t_i ∈ T_{Σ,s_i} to [[σ(t_1, …, t_n)]]_B in N_{Σ,A/B,s}.

Finally, N_{Σ,A/B} is Σ-isomorphic to T_{Σ,A∪B}.

Proof: For convenience, write N for N_{Σ,A/B}. The first assertion follows from the ARS result Theorem 5.7.2, using also Theorem 7.7.7. Note that σ_N is well defined by (1), by the first assertion, plus the fact that ≃_B is a Σ-congruence. Next, we check that N satisfies A ∪ B.
Satisfaction of B is by definition of N as consisting of B-equivalence classes of normal forms. Now let (∀X) t = t′ if C be in A; we need to prove that a(t) = a(t′) for all a : X → N that satisfy C. The proof follows that of Theorem 7.3.9, except that we must restrict to assignments that satisfy the condition, and that uses of Theorem 7.3.4 must be replaced by uses of Theorem 7.7.7. □

Definition 7.7.9 A CMTRS (Σ, A, B) is join condition canonical iff the CMTRS (Σ′, A′, B) is canonical, where: (1) Σ′ ⊆ Σ is least such that if t_i = t′_i is a condition in some rule r in A, then θ(t_i) and θ(t′_i) are in T_{Σ′} for all θ : X → T_Σ, where X = var(t) and t is the leftside of the head of rule r; and (2) A′ ⊆ A is least such that all conditional rules are in A′, and all unconditional rules that can be used in evaluating the conditions of rules in A are also in A′. □

It is of course possible that Σ′ = Σ, but in many real examples, conditions are relatively simple tests on data types that involve relatively few operations and rules. Consequently, (Σ′, A′, B) is often significantly simpler than (Σ, A, B). In OBJ, the operations used in conditions may be builtin rather than defined by explicit equations, but in this case, they should be considered defined by a canonical TRS.

Theorem 7.7.10 (Completeness) Given a join condition canonical CMTRS (Σ, A, B), the following four conditions are equivalent for any t, t′ ∈ T_Σ(X):

  t *⊣A/B t′
  A ∪ B ⊢ (∀X) t = t′
  [A] ⊢_B (∀X) t = t′
  t ≃_{A∪B} t′

Moreover, if (Σ, A, B) is Church-Rosser, then t ↓A/B t′ is also equivalent to the above. Finally, if (Σ, A, B) is canonical, then [[t]]_A ≃_B [[t′]]_A is also equivalent.
Proof: Equivalence of the last three of the first four assertions has already been proved. The first implies the second by the soundness of rewriting. For the converse, because (±C*B) is complete and is equivalent to bidirectional rewriting, we are done if the conditions of rules can always be evaluated; but this is given by the join condition canonical assumption. Equivalence of the fifth condition assuming Church-Rosser is a general ARS result, and equivalence of the final condition with the fourth is immediate when ⇒A/B is canonical. □

This generalization of Theorem 7.3.10 implies that we can define an operation == that works for a canonical CMTRS (Σ, A, B) the same way as for a TRS, CTRS, or MTRS that is canonical: t == t′ returns true over (Σ, A, B) iff t, t′ have the same normal form modulo B iff they are provably equal under A ∪ B as an equational theory. As before, even when (Σ, A, B) is not canonical, if t == t′ returns true then t and t′ are equal modulo B. Also as before, if (Σ, A, B) is non-canonical, rewriting can be unsound if == occurs in a negative position (such as =/= in a positive position) in a condition; however, it is sound if the system is canonical for the sorts of terms that occur in negative positions, with respect to the subset of rules actually used.

Exercise 7.7.2 Replace ↓ in Definition 7.7.1 by ↔* to obtain equality condition abstract rewrite systems, apply this definition to term rewriting modulo B to obtain equality condition term rewriting modulo equations, and then prove the analogs of Proposition 7.7.2 and Theorem 7.7.10, where the latter asserts completeness, not just soundness. □

The OBJ implementation does not use equality condition rewriting, because it is too inefficient for use in a practical system.
On the other hand, most of the literature on conditional rewriting, e.g., [113], uses equality condition rewriting, because it allows stronger theorems to be proved.

Fact 7.7.11 If a CMTRS has no variables in its rules, then it is terminating iff it is ground terminating, Church-Rosser iff ground Church-Rosser, and hence canonical iff ground canonical.

Proof: Let t be a term with variables. Because only subterms (modulo the equations) without variables can be redexes, and because rewriting on these ground subterms of t is terminating (or Church-Rosser), so is rewriting on all of t. Therefore the ground properties imply the general properties. The converse is immediate. □

Applying this result to triangular systems tells us that when variables are treated as constants, ground canonicity is equivalent to general canonicity.

The results of Section 5.3 on adding new constants generalize straightforwardly to conditional rewriting modulo B; we state these generalizations explicitly because of their importance for theorem proving, and because they appear to be new in this context.

Proposition 7.7.12 If a CMTRS (Σ, A, B) is terminating, or Church-Rosser, or locally Church-Rosser, then so is (Σ(X), A, B), for any suitable countable set X of variable symbols. □

The ARS proof of Proposition 5.3.1 generalizes, using the ARS (T_{Σ,B}(X^ω_S), ⇒[A/B]), where the reader should recall that for a given set S of sorts, X^ω_S denotes the ground signature with (X^ω_S)_s = { x_is | i ∈ ω } for each s ∈ S, a countable set of new variable symbols distinct from the symbols in Σ. The proof of Proposition 5.3.4 in Appendix B also generalizes to conditional rewriting modulo B, giving the following:

Proposition 7.7.13 A CMTRS (Σ, A, B) is ground terminating if (Σ(X), A, B) is ground terminating, where X is a variable set for Σ; moreover, if Σ is non-void, then (Σ, A, B) is ground terminating iff (Σ(X), A, B) is ground terminating.
□

Corollary 7.7.14 If Σ is non-void, then a CMTRS (Σ, A, B) is ground terminating iff it is terminating. □

Exercise 7.7.3 Show that adding any set of constants to DNF gives a terminating MTRS. □

Proposition 5.3.6 in Section 5.7 can be generalized to the following:

Proposition 7.7.15 A CMTRS (Σ, A, B) is Church-Rosser if and only if (Σ(X^ω_S), A, B) is ground Church-Rosser, and (Σ, A, B) is locally Church-Rosser if and only if (Σ(X^ω_S), A, B) is ground locally Church-Rosser.

Proof: If we let T = T_{Σ,B}(X) and G = T_{Σ,B}(X^ω_S), then the proof given for Proposition 5.3.6 on page 125 goes through as it stands. □

Exercise 7.7.4 Use Corollary 7.7.14 and Proposition 7.7.15 to show that PROPC(X) is canonical if PROPC is canonical. □

Definition 7.5.2 (of commutativity) generalizes to conditional rules, as do Lemma 7.5.3 and Proposition 7.5.4; these results are stated below, and their proofs are exactly the same as for the unconditional case, except that Proposition 7.7.17 uses Lemma 7.7.16.

Lemma 7.7.16 Given a Σ-CMTRS A commuting with Σ-equations B, if t ⇒A/B t_1 ⇒A/B ⋯ ⇒A/B t_n then there exist t′_1, …, t′_n such that t ⇒A t′_1 ⇒A ⋯ ⇒A t′_n and t′_n ≃_B t_n. The same result also holds for weak rewriting modulo B. □

Proposition 7.7.17 If a CTRS (Σ, A) commuting with Σ-equations B is terminating, then the CMTRS (Σ, A, B) is also terminating, and the same holds under weak rewriting. □

We also have the following:

Proposition 7.7.18 Given a CMTRS C, let C_U be the MTRS whose rules are those of C with their conditions (if any) removed. Then C is terminating (or ground terminating) if C_U is.

Proof: Any rewrite sequence of C is also a rewrite sequence of C_U and therefore finite.
□

The following is a nice application of several results earlier in this chapter:

Theorem 7.7.19 Given a conditional triangular propositional system, let T be its set of equations, let A = T ∪ P where P is the set of equations in PROPC, and let B be the associative and commutative laws for and and xor. Then T and A are terminating modulo B.

Proof: Let T′, A′ be the unconditional non-modulo versions of T, A, respectively, so that A′ = T′ ∪ P. Example 5.8.27 showed that unconditional triangular propositional systems are terminating under rewriting modulo no equations, so T′ is terminating. Therefore T is terminating by Proposition 5.8.12, and Proposition 7.7.17 implies that T modulo B is terminating, since B commutes with T because it commutes with any rule with a constant as its leftside.

An argument similar to that of Example 5.8.27 shows that A′ is terminating. Let N = ω^{m+1}, where m is the number of dependent variables in T, and note that N is Noetherian. For t a Σ(Z)-term, let ψ_i(t) be the number of occurrences of y_i in t, let ρ(t) be as in Example 7.5.9 with ρ(z) = z for z ∈ Z, and let τ(t) = (ψ_1(t), ψ_2(t), …, ψ_m(t), ρ(t)). Then τ satisfies the hypotheses of Proposition 7.5.7, and thus A′ is terminating. Therefore Proposition 5.8.12 implies A is terminating, and so A is also terminating modulo B by Proposition 7.7.17. □

Given a poset P, we can define weak and strong ρ-monotonicity of conditional rewrite rules modulo B, of substitutions modulo B, and of operations in Σ, just as in Definition 5.5.3, except that T_Σ and T_Σ({z}_s) are replaced by T_{Σ,B} and T_{Σ,B}({z}_s), respectively. Note that as before, the inequalities for a rule are only required to hold when all the conditions of the rule converge (modulo B).
The following is the modulo B generalization of the most powerful termination result (Theorem 5.8.33) for conditional rules in Chapter 5:

Theorem 7.7.20 Let (Σ, A, B) be a CMTRS with Σ non-void and with (Σ′, A′, B) a CMTRS with Σ′ ⊆ Σ and A′ ⊆ A ground terminating; let P be a poset, and let N = A − A′ with Σ′ the minimal signature. If there is a ρ : T_{Σ,B} → P such that

(1) each rule in A′ is weak ρ-monotone,
(2) each rule in N is strict ρ-monotone,
(3) each operation in Σ is strict ρ-monotone, and
(4) P is Noetherian, or if not, then for each t ∈ T_{Σ,s} there is some Noetherian poset P^t_s ⊆ P_s such that t ⇒*[A/B] t′ implies ρ(t′) ∈ P^t_s,

then (Σ, A, B) is ground terminating. □

The proof, which is much like that of Theorem 5.8.33, is sketched in Appendix B.

Exercise 7.7.5 Given a CMTRS C, let C_U be the MTRS whose rules are those of C with their conditions (if any) removed. Show that C is Church-Rosser (or ground Church-Rosser) if C_U is. □

As always, ARS results apply directly, including the Newman Lemma and the Hindley-Rosen Lemma (Exercise 5.7.5), so we do not state them here. The results stated here are actually rather weak. Perhaps the most generally useful methods for proving Church-Rosser are based on the Newman Lemma, because it is usually much easier to prove the local Church-Rosser property. As discussed in Section 7.6, and in more detail in Chapter 12, although the Critical Pair Theorem (5.6.9) does not generalize to modulo B rewriting, in many cases the local Church-Rosser property can still be checked by a variant of an algorithm introduced by Knuth and Bendix [117] for the unsorted unconditional non-modulo case.
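On a finite abstract rewrite system, the local Church-Rosser property that the Newman Lemma takes as input can be checked by brute force. The following Python sketch is our own illustration (the relation, encoded as a dictionary of one-step successors, is an assumed toy example, not from the book):

```python
# Hedged sketch: brute-force check of the local Church-Rosser property
# on a finite ARS, given as a map from each element to its one-step successors.

def reachable(step, x):
    """All y with x =>* y (reflexive-transitive closure from x)."""
    seen, todo = {x}, [x]
    while todo:
        for y in step(todo.pop()):
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def locally_church_rosser(step, elems):
    """True iff every one-step peak y1 <= x => y2 has a common reduct."""
    return all(reachable(step, y1) & reachable(step, y2)
               for x in elems
               for y1 in step(x) for y2 in step(x))

# A terminating example: a => b, a => c, b => d, c => d.
# Locally Church-Rosser, hence Church-Rosser by the Newman Lemma.
DIAMOND = {'a': {'b', 'c'}, 'b': {'d'}, 'c': {'d'}, 'd': set()}
```

Deleting the two steps into d destroys the property, since the peak b ⇐ a ⇒ c then has no common reduct.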
The Hindley-Rosen Lemma applied to conditional rewriting modulo equations gives the following:

Proposition 7.7.21 Given Church-Rosser CMTRS's M_i = (Σ, A_i, B) for i ∈ I, which strongly commute in the sense that

  if t ⇒A_i/B t_1 and t ⇒A_j/B t_2 for some i, j ∈ I, then there is some t_3 such that t_1 ⇒⁼A_j/B t_3 and t_2 ⇒*A_i/B t_3,

then M = (Σ, ∪_{i∈I} A_i, B) is Church-Rosser, where ⇒⁼ indicates reflexive closure. □

Proposition 5.2.6 also generalizes, since it follows from the ARS result Proposition 5.7.6:

Proposition 7.7.22 If (Σ, A, B) is a Church-Rosser CMTRS, A ⊨ (∀X) t =_B t′ iff t ↓A/B t′. □

We next prove the following generalization of Proposition 7.6.2:

Proposition 7.7.23 Given a CTRS (Σ, A) commuting with B, if A is (locally) Church-Rosser, then the CMTRS (Σ, A, B) is also (locally) Church-Rosser.

Proof: For the Church-Rosser property, suppose t ⇒*A/B t_1 and t ⇒*A/B t_2. Then by Lemma 7.7.16, we can find t′_1 and t′_2 such that t ⇒*A t′_1 and t ⇒*A t′_2 with t′_1 ≃_B t_1 and t′_2 ≃_B t_2. Then by the Church-Rosser property, there is a term t_3 such that t′_1 ⇒*A t_3 and t′_2 ⇒*A t_3. Hence t_1 ⇒*A/B t_3 and t_2 ⇒*A/B t_3. So again by Lemma 7.7.16, there exist terms t′_3 and t′′_3 such that t_1 ⇒*A t′_3 and t_2 ⇒*A t′′_3 with t′_3 ≃_B t_3 and t′′_3 ≃_B t_3, so that t′_3 ≃_B t′′_3. Thus t_1 ⇒*A/B t′_3 and t_2 ⇒*A/B t′′_3 with t′_3 ≃_B t′′_3, and we are done. A similar proof works for the local Church-Rosser property.
□

The next result uses the above and Proposition 7.7.18:

Theorem 7.7.24 Given a conditional triangular propositional system T with consistent conditions, let B be the associative and commutative laws for and and xor, and let A = T ∪ P where P is the equations in PROPC excluding B. Then T and A are ground Church-Rosser modulo B, and hence ground canonical.

Proof: Theorem 7.7.19 shows termination modulo B for T and A. We now show T locally Church-Rosser modulo B, using the fact that only the symbols y_k can be rewritten. Suppose t ⇒A/B t_1 and t ⇒A/B t_2, with redexes y_i and y_j respectively. If y_i = y_j and they are at the same occurrence in t, then t_1 =_B t_2 by the consistent conditions hypothesis. Otherwise, we can rewrite y_j in t_1 and y_i in t_2 to get the same term t′. This gives the local Church-Rosser property, so that the Newman lemma implies that T modulo B is Church-Rosser and thus canonical. Since P modulo B is Church-Rosser, and it is not difficult to check that ⇒P/B and ⇒T/B strongly commute in the sense of Proposition 7.7.21 (the Hindley-Rosen Lemma), it follows that A is Church-Rosser modulo B. Therefore A is also canonical. □

7.8 Literature

The treatment of equational deduction modulo equations in Section 7.2 may be novel, but it is similar to the treatment of term rewriting modulo equations in Section 7.3, which follows the work of Huet [108] and others, although our exposition is more semantic and algebraic than the standard literature. Basic semantic results on rewriting modulo equations include its equivalence with class rewriting and its relationship with equational deduction (Theorem 7.3.4 and Proposition 7.2.2). Theorem 7.3.9 is also very fundamental, and although it will not surprise experts, it does not appear in the literature. Hsiang's Theorem (7.3.13) was first proved by Hsiang [106].
The exclusive normal form results such as Proposition 7.3.18 are not as well known as they should be, although they are direct consequences of Hsiang's Theorem. Theorem 7.3.22 says that reduction gives a decision procedure for the propositional calculus; it is of basic importance to this book, and the proof given here, which appears to be new, is a nice example of algebraic techniques in term rewriting theory.

Our algebraic approach to hardware verification was first outlined in [59]. Although influenced by Mike Gordon's clear expositions of hardware verification using higher-order logic, e.g., [93, 94], we disagree with Gordon's claim that higher-order logic is necessary for hardware verification. A key insight for this chapter is that equality is already a "bidirectional" (i.e., symmetric) relation, which of course is axiomatized by equational logic. Bidirectionality is needed for many important circuits, and conditional equational logic adds important expressive power. The results about triangular propositional systems seem to be new, and Proposition 7.4.3 and Theorem 7.4.10 justify the use of reduction to prove properties of combinatorial circuits. Proposition 7.4.16 and the subsequent discussion in Section 7.4.3 are very useful. Several of the main techniques from this chapter are illustrated in the proofs of Theorems 7.7.19 and 7.7.24. It is reasonable to conjecture that every triangular system has a most general partial solution, and that general conditional solutions are unique (even though most general unconditional solutions are not unique).

This approach to hardware verification has been applied to many examples, including some that are non-trivial [168, 169, 84], using the 2OBJ theorem prover. An early application of OBJ to hardware specification and testing appears in [159].
Most general solutions may remind some readers of unifiers, and indeed, they are a special case of unification understood in a sufficiently broad sense, as for example in [60], though this is not the place to discuss such abstract notions.

Many results from Section 7.5 onward are novel, in that few have been proved for many-sorted rewriting, let alone overloaded many-sorted rewriting, and several are new even for the unsorted case. The results about rules commuting with equations in Proposition 7.6.2 and its generalization Proposition 7.7.23 do not appear to be in the literature. Conditional abstract rewrite systems (Definition 7.7.1) appear to be a new and useful concept. The construction of W⋄ can be described more abstractly using concepts from Section 8.2: because the relation of the ARS is defined using only Horn clauses, an initiality theorem for Horn clause theories gives the least relation satisfying those sentences. The CARS approach could also have been applied to ordinary CTRS's, simplifying Section 5.8. Theorems 7.7.7, 7.7.8 and 7.7.10 extend the main semantic results of Chapter 5 to conditional term rewriting modulo equations. In particular, Theorem 7.7.8 makes explicit the connection with initial algebra semantics. Proposition 7.7.17 (about commutativity) and the very general Theorem 7.7.20 also appear to be useful new results for proving termination modulo equations.

I thank Prof. Mitsuhiro Okada and Dr. Monica Marcus for many valuable comments on the material in this chapter; the latter has read many parts of the chapter carefully and offered many corrections, though she is of course not responsible for any remaining errors.

A Note to Lecturers: This chapter contains a great deal of difficult material, much of which would have to be skipped in a one semester or one quarter course.
I have found it possible to present the hardware examples with only a pointer to the theory, since intuitions about hardware are strong enough to make the computations convincing. Of course, many proofs can be skipped in lectures and even in readings, especially since this chapter has the structure of a sequence of generalizations, in which similar results appear in gradually increasing generality.

First-Order Logic and Proof Planning

This chapter extends our algebraic approach to full first-order logic. Section 8.2 treats the special case of Horn clause logic, which we show is essentially the same as equational logic. Section 8.3 presents first-order logic syntax and some basics of its model theory; this development is unusual in treating the many-sorted case and in allowing (partially) empty models. Section 8.4 discusses proof planning. Proof rules for existential quantifiers, case analysis, and induction are given in Sections 8.5, 8.6, and 8.7, respectively. The notion of induction is unusually general.

First-order signatures provide symbols for building first-order sentences, but actually defining these sentences and their satisfaction is put off to Section 8.3.

Definition 8.1.1 Given a set S of sorts, an S-sorted first-order signature Φ is a pair (Σ, Π), where Σ is an S-sorted algebraic signature, i.e., an indexed set of the form { Σ_{w,s} | w ∈ S*, s ∈ S } whose elements are called function (or operation) symbols, and Π is an indexed set of the form { Π_w | w ∈ S* } whose elements are called predicate symbols, where π ∈ Π_w is said to have arity w. □

Example 8.1.2 Let Σ be the algebraic signature Σ_NATP of Example 2.3.3, with one sort, Nat, and operations 0, s; let Π have Π_Nat = { pos }, Π_NatNat = { geq }, and Π_w = ∅ otherwise. The signature Φ_NAT = (Σ, Π) is adequate for expressing many simple properties of natural numbers.
□

Our discussion of semantics for a given signature starts with its models:

Definition 8.1.3 Given a first-order signature Φ = (Σ, Π), a first-order Φ-model M consists of a Σ-algebra, also denoted M, together with, for each π ∈ Π_w, a subset M_π ⊆ M_w, where M_{s1...sn} denotes the set M_{s1} × ⋯ × M_{sn}. A model M is nonempty iff each M_s is nonempty. □

Think of M_π as the set of values where π is true in M. When w = [], then π ∈ Π_w should represent a relation that is constant, i.e., a truth value. Because M_[] is a one-point set, say { ⋆ }, there are only two possible values for M_π ⊆ { ⋆ }, namely ∅ and { ⋆ }. We let the first case mean π is false, and the second mean it is true.

Example 8.1.4 If we let Φ be the signature Φ_NAT of Example 8.1.2 above, then the standard Φ-model M has M_Nat = T_Σ (the natural numbers in Peano notation, with 0 and s interpreted as usual in T_Σ), with M_pos = { s(0), s(s(0)), s(s(s(0))), … } and with M_geq = { ⟨m, n⟩ | m ≥ n } (where ≥ has the usual meaning, and n, m are Peano numbers). Of course, there are many other Φ-models, most of which are not isomorphic to the natural numbers. For example, there are Φ-models with just one element. □

Exercise 8.1.1 For Φ the signature of Example 8.1.2 above, how many Φ-models M are there with M_Nat = { 0 }? How many are there with M_Nat = { 0, 1 }? □

Exercise 8.1.2 (a) Give a first-order signature Φ that is adequate for partially ordered sets, and interpret the natural numbers as a Φ-model. (b) Give a first-order signature Φ that is adequate for equivalence relations, and interpret the natural numbers as a Φ-model. □

Definition 8.1.5 Given a first-order signature Φ = (Σ, Π) and given Φ-models M, M′, then a Φ-morphism h : M → M′ is a Σ-homomorphism h : M → M′ such that for each π ∈ Π_{s1...sn},

  (m_1, …, m_n) ∈ M_π implies (h_{s1}(m_1), …
, h_{sn}(m_n)) ∈ M′_π

for all m_i ∈ M_{si}. The composition of two Φ-morphisms is their composition as Σ-homomorphisms. The identity Φ-morphism on M, denoted 1_M, is the identity on M.

A Φ-morphism h : M → M′ is a Φ-isomorphism iff there is a Φ-morphism g : M′ → M such that h ; g = 1_M and g ; h = 1_{M′}. Such a morphism g is called an inverse to h. □

Exercise 8.1.3 (a) Show that the composition of two Φ-morphisms is also a Φ-morphism. (b) Show that 1_M satisfies the identity law for composition of Φ-morphisms. (c) Show that if h is a Φ-isomorphism then it has a unique inverse. (d) Show that a bijective Φ-morphism is not necessarily a Φ-isomorphism. □

8.2 Horn Clause Logic

Horn clause logic is a sublogic of first-order logic that is essentially the same as conditional equational logic. Although Horn clause notation uses first-order logic symbols, this section will develop its syntax and satisfaction independently.

Definition 8.2.1 Given a first-order signature Φ = (Σ, Π), a Φ-Horn clause is an expression of the form

  (∀X) p_1 ∧ ⋯ ∧ p_n ⇒ p

where X is an S-sorted set of variable symbols, and the p_i, called atoms, are each of the form π(t_1, …, t_k) such that π ∈ Π_{s1...sk} and t_j ∈ T_Σ(X)_{sj} for j = 1, …, k. We may say that p is the head of the clause, and that p_1, …, p_n is its body. As usual, we assume that the components X_s of X are mutually disjoint, and are also disjoint from the symbols in Σ and Π. For n = 0, we include Horn clauses of the form (∀X) p, which are just universally quantified atoms. □

Note that the symbols ∀, ∧ and ⇒ in a Horn clause do not have any separate meanings, but are parts of one single mixfix symbol, as are the symbols ∀ and if in conditional equations.
Example 8.2.2 For Φ the signature of Example 8.1.2 above, the following are all Horn clauses:

  (∀n) geq(n, 0)
  (∀n, m) geq(n, m) ⇒ geq(s(n), m)
  (∀n, m) geq(n, m) ⇒ geq(s(n), s(m))
  (∀n) pos(s(n)) . □

Exercise 8.2.1 Are all the axioms for partially ordered sets Horn clauses? What about those for equivalence relations? □

Section 3.5 discussed extending an assignment θ : X → M, where X is a variable set and M is a Σ-algebra, to a Σ-homomorphism θ : T_Σ(X) → M, where θ(t) is the result of simultaneously substituting θ(x) for each x into t ∈ T_Σ(X). The following uses this for Φ-models M when Σ is the algebraic part of Φ.

Definition 8.2.3 Given a first-order signature Φ = (Σ, Π), a first-order Φ-model M, and a Φ-Horn clause hc of the form (∀X) p_1 ∧ ⋯ ∧ p_n ⇒ p_0, then we say M satisfies hc, and write M ⊨_Φ hc, iff for every assignment a : X → M,

  a(t_i) ∈ M_{π_i} for i = 1, …, n implies a(t_0) ∈ M_{π_0},

where p_i = π_i(t_i) with t_i = (t_{i1}, …, t_{ik(i)}), and where a(t_i) = (a(t_{i1}), …, a(t_{ik(i)})), for i = 0, …, n.

A Horn specification consists of a first-order signature Φ = (Σ, Π) and a set H of Φ-Horn clauses; we may write (Σ, Π, H), or (Φ, H), or even just H. A Φ-model M satisfies (Φ, H) iff it satisfies each clause in H, and then we write M ⊨_Φ H and call M an H-model. □

Let HCL denote the institution of Horn clause logic, consisting of its signatures (first-order signatures), its sentences (Horn clauses), its models and morphisms (first-order models and morphisms), and the notion of satisfaction in Definition 8.2.3 above.

Exercise 8.2.2 Show that each Horn clause in Example 8.2.2 is satisfied by the model M of Example 8.1.4, or else that it is not.
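Definition 8.2.3 is directly executable on a finite model: satisfaction can be decided by trying every assignment. The Python sketch below is our own illustration; since the standard model of Example 8.1.4 is infinite, we use an assumed finite stand-in with carrier {0, …, 4} and the successor capped at 4, which still satisfies all four clauses of Example 8.2.2.

```python
# Hedged sketch: brute-force Horn-clause satisfaction on a finite model.
from itertools import product

N = range(5)                                   # a finite carrier {0..4}
ops = {'0': lambda: 0, 's': lambda n: min(n + 1, 4)}   # capped successor
preds = {'geq': {(m, n) for m in N for n in N if m >= n},
         'pos': {(n,) for n in N if n > 0}}

def ev(term, a):
    """Evaluate a term such as ('s', 'n') under assignment a."""
    if isinstance(term, str):                  # a variable
        return a[term]
    op, *args = term
    return ops[op](*(ev(t, a) for t in args))

def holds(atom, a):
    pi, *args = atom
    return tuple(ev(t, a) for t in args) in preds[pi]

def satisfies(variables, body, head):
    """M |= (forall variables) body_1 /\\ ... /\\ body_n => head."""
    return all(holds(head, a)
               for vals in product(N, repeat=len(variables))
               for a in [dict(zip(variables, vals))]
               if all(holds(p, a) for p in body))

clauses = [  # the Horn clauses of Example 8.2.2
    (['n'], [], ('geq', 'n', ('0',))),
    (['n', 'm'], [('geq', 'n', 'm')], ('geq', ('s', 'n'), 'm')),
    (['n', 'm'], [('geq', 'n', 'm')], ('geq', ('s', 'n'), ('s', 'm'))),
    (['n'], [], ('pos', ('s', 'n'))),
]
```

By contrast, the non-clause (∀n) pos(n) fails at the assignment n ↦ 0, which the same check detects.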
□

(This chapter uses the word "institution" for the signatures, sentences, models, model morphisms, and satisfaction associated with some logical system. This is useful because we work with many different logical systems. Institutions are formalized in [67]; see also the discussion of the satisfaction condition in Section 4.10.)

Example 8.2.4 Letting Φ be the signature of Example 8.1.2 and H the Horn clauses of Example 8.2.2 gives a Horn specification for the natural numbers with predicates for inequality and positivity. The intended model is the initial model, which exists by Theorem 8.2.6 below. □

(⋆) Initial Horn Models

In a precise analogy with the equational case, we have the following definition and theorem:

Definition 8.2.5 Given a Horn specification (Φ, H), then an H-model I is initial iff given any H-model M, there is a unique Φ-morphism from I to M. □

Theorem 8.2.6 (Initiality) Every Horn specification has an initial model.

Proof: Given a Horn specification (Σ, Π, H), we construct an initial model T_H as follows:

1. If S is the sort set of Φ = (Σ, Π), let Ŝ = S ∪ { B }, where B ∉ S (think of B as the Booleans).

2. Define an algebraic Ŝ-sorted signature Π̂ by Π̂_{w,B} = Π_w for all w ∈ S⁺, and Π̂_{[],B} = { true } ∪ Π_[], and Π̂_{w,s} = ∅ otherwise. Let Φ̂ = (Ŝ, Σ ∪ Π̂).

3. Given a Φ-model M, let M̂ be the Φ̂-algebra constructed as follows:
(a) M̂_s = M_s for all s ∈ S;
(b) M̂_B = { π(m_1, …, m_n) | π ∈ Π and ⟨m_1, …, m_n⟩ ∉ M_π } ∪ { true };
(c) M̂_σ = M_σ for all σ ∈ Σ;
(d) M̂_π(m_1, …, m_n) = true if ⟨m_1, …, m_n⟩ ∈ M_π; and
(e) M̂_π(m_1, …, m_n) = π(m_1, …, m_n) if ⟨m_1, …, m_n⟩ ∉ M_π.

4. Conversely, any Φ̂-algebra A gives a Φ-model Ǎ by dropping everything involving the sort B, and defining ⟨m_1, …, m_n⟩ ∈ Ǎ_π iff A_π(m_1, …, m_n) = A_true.

5.
Now let Ĥ be the set of Φ̂-conditional equations of the form

  (∀X) p = true if { p_1 = true, …, p_n = true },

one for each Horn clause in H of the form (∀X) p_1 ∧ ⋯ ∧ p_n ⇒ p.

6. Finally, define T_H to be Ť_Ĥ, where T_Ĥ is the initial (Φ̂, Ĥ)-algebra.

We now check that this construction works, i.e., that T_H really is an initial (Φ, H)-model. Let M be any (Φ, H)-model. Then we can use 3 and 5 to check that the algebra M̂ satisfies Ĥ. Hence there is a unique Φ̂-homomorphism ĥ : T_Ĥ → M̂, which then gives us a Φ-morphism h : T_H = Ť_Ĥ → (M̂)ˇ = M. Uniqueness of this morphism follows from the fact that the translations described in 3 and 4 above define a bijective correspondence between Φ-morphisms and Φ̂-homomorphisms. □

This proof translates Horn clauses into conditional equations and Horn models into algebras, and then exploits the existence of initial algebras. It is noteworthy that Π̂ contains true but not false, and that the truth values of false atoms are just the atoms themselves. Note that by 3(a) and 6, the carrier (T_H)_s for s ∈ S is the same as that of T_Σ, because there are no equations among Σ-terms.

Exercise 8.2.3 Give the details of the argument that M̂ satisfies Ĥ for the above proof. □

Exercise 8.2.4 Show that if there are no atoms in H then (T_H)_π = ∅ for each π ∈ Π. □

Theorem 8.2.6 justifies defining relations "by induction" with Horn clauses: if we define a relation π with Horn clauses H and if M is another H-model having the same carriers as the initial H-model T_H, then the unique Φ-morphism h : T_H → M must be the identity, and so we must have (T_H)_π ⊆ M_π. The same argument works for any initial H-model; moreover, for any H-model M, the image of (T_H)_π under the unique homomorphism h to M is contained in M_π, that is, h((T_H)_π) ⊆ M_π. Thus (T_H)_π is the smallest relation satisfying the given formulae.
Example 8.2.7 We use Theorem 8.2.6 to define the transitive closure of a relation, i.e., the least transitive relation containing the given relation. Here Σ has just one sort, Elt, with Σ_{[],Elt} = X for some set X, and with all other Σ_{w,s} empty; and Π has Π_{Elt Elt} = { R, R* }, with all other Π_w empty. Assuming R is already defined on some set X, the following Horn clauses define the transitive closure R* of R, where the variables x, x′, x′′ range over X:

  x R x′ ⇒ x R* x′
  x R* x′ ∧ x′ R* x′′ ⇒ x R* x′′ .

We can use the construction in the proof of Theorem 8.2.6 to justify an OBJ specification for the transitive closure R* of a relation R. Because this specification works for any R, we give a theory for R; this is preceded by a specification for the auxiliary sort B and its one truth value, denoted tt to distinguish it from OBJ's builtin Boolean value true.

  obj B is sort B .
    op tt : -> B .
  endo

  th R is sort Elt .
    ex B .
    op _R_ : Elt Elt -> B .
  endth

  obj R* is pr R . ex B .
    op _R*_ : Elt Elt -> B .
    vars X Y Z : Elt .
    cq X R* Y = tt if X R Y == tt .
    cq X R* Z = tt if X R* Y == tt and Y R* Z == tt .
  endo

Notice that the module B has initial semantics, R has loose semantics, and R* has again initial semantics. The notation "pr R . ex B ." in R* indicates that the sort Elt is protected but the sort B is only extended; although the single truth value of sort B will not be corrupted, it is likely that other terms of sort B will be added to serve as "false" values for relations. On the other hand, the module R is protected in R*, because the equations in R* do not affect the relation R. What all this means is that given any relation R, a unique R* is defined for it; this can be seen by considering the models of R*, and a more formal approach involving second-order quantifiers is also given in Chapter 9.
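The initial-model reading of R*, the least relation closed under the two Horn clauses, can be computed for a finite relation by iterating the clauses to a fixed point. This Python sketch is our own illustration (the sample relation is assumed, not from the book):

```python
# Hedged sketch: the transitive closure R* as the least relation closed
# under the two Horn clauses above, computed by fixed-point iteration.

def transitive_closure(r):
    """r is a finite set of pairs; returns the least transitive
    relation containing r (the initial-model interpretation of R*)."""
    rstar = set(r)                    # clause 1: x R x' => x R* x'
    while True:
        # clause 2: x R* y and y R* z => x R* z
        new = {(x, z) for (x, y1) in rstar for (y2, z) in rstar if y1 == y2}
        if new <= rstar:
            return rstar              # closed under both clauses
        rstar |= new

R = {(1, 2), (2, 3), (3, 4)}
```

The loop adds only pairs forced by the clauses, so the result contains no "junk" pairs, mirroring the initiality argument that (T_H)_π is the smallest relation satisfying the formulae.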
Note that the second equation of the module R* cannot be used for reduction because its condition has variables that are not in its left side; however, OBJ3's apply can be used in equational deduction.

We can give a more idiomatic version of the above specification using builtin truth values; this is valid because if BOOL is protected in the module R, then it is necessarily also protected in R*, because BOOL is protected in the module R below iff the relation R is correctly defined in whatever model is chosen.

  th R is sort Elt .
    op _R_ : Elt Elt -> Bool .
  endth
  obj R* is pr R .
    op _R*_ : Elt Elt -> Bool .
    vars X Y Z : Elt .
    cq X R* Y = true if X R Y .
    cq X R* Z = true if X R* Y and Y R* Z .
  endo

(In OBJ, it is more natural to parameterize the object R* by the theory R, but since parameterization is not discussed until Chapter 11, we take a simpler approach here.) □

Example 8.2.7 illustrates a general method for replacing relations and Horn clauses by functions and equations. This implies we don't need to add relations and Horn clauses to an implementation of equational logic like OBJ.

Exercise 8.2.5 (a) Define a relation R on the Peano numbers by

  eq N R M = s N == M .

and use (one of) the above object(s) R* and OBJ3's apply to show that s 0 R* s s s 0. Now explain what R* is.

(b) Show that any relation is contained in a least equivalence relation, and give corresponding OBJ code defining that relation. Now use OBJ to compute three values of the equivalence closure of some simple (but not wholly trivial) relation. □

Exercise 8.2.6 Show that X R* Y in the second equation of the specification R* in Example 8.2.7 can be replaced by X R Y without changing the meaning; say what "without changing the meaning" means in this context.
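The initial-model reading of Example 8.2.7, that R* is the least relation closed under the two Horn clauses, can also be sketched outside OBJ as a fixed-point computation. The following Python sketch is illustrative only (the function name and the representation of a relation as a set of pairs are our own choices, not from the text): it simply iterates the two clauses until nothing new is added, which is exactly the "smallest relation satisfying the given formulae" of Theorem 8.2.6.

```python
# Illustrative Python sketch (not OBJ): the transitive closure of r is the
# least relation containing r and closed under the second Horn clause,
# computed by iterating that clause to a fixed point.

def transitive_closure(r):
    """Least transitive relation containing r, as a set of pairs."""
    closure = set(r)                       # clause 1: x R x' implies x R* x'
    while True:
        new = {(x, z)                      # clause 2: x R* y and y R* z
               for (x, y) in closure       #           imply x R* z
               for (y2, z) in closure if y == y2}
        if new <= closure:                 # nothing added: least fixed point
            return closure
        closure |= new

# A successor-like relation, as in Exercise 8.2.5(a) where N R M iff s N == M:
r = {(1, 2), (2, 3)}
print(sorted(transitive_closure(r)))   # [(1, 2), (1, 3), (2, 3)]
```

Because the iteration only ever adds pairs forced by the clauses, the result is contained in every relation satisfying them, mirroring the minimality argument after Exercise 8.2.4.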
□

8.3 First-Order Logic

This section gives an algebraic treatment of first-order logic syntax, and then defines first-order satisfaction; after that, adding equality predicates and fixing a "data" model are considered; Gödel's completeness and incompleteness theorems are also informally discussed.

We define the first-order sentences over a first-order signature Φ = (Σ, Π) by first defining Σ-terms and then defining Φ-formulae; we will have to be careful about variables. Let S be the sort set of Φ, let 𝒳 be an S-sorted set of variable symbols disjoint from Σ and Π, such that each sort has an infinite number of symbols, and let X be a fixed S-indexed subset of 𝒳.

A (Φ, X)-term is an element of T_{Σ∪X}. Recall that the (S-indexed) function Var is defined on T_{Σ∪X} as follows:

0. Var_s(σ) = ∅ if σ ∈ Σ_{[],s};
1. Var_s(x) = {x} if x ∈ X_s;
2. Var_s(σ(t_1, …, t_n)) = ⋃_{i=1}^n Var_s(t_i) for n > 0.

(Var is the unique (Σ∪X)-homomorphism T_{Σ∪X} → P(X), where the S-indexed set P(X) of all subsets of X is given an appropriate (Σ∪X)-structure.) We are now ready for the syntax of first-order logic:

Definition 8.3.1 A (well-formed) (Φ, X)-formula is an element of the carrier of the (one-sorted) free algebra WFF_X(Φ) defined to have the following as its (one-sorted) signature, which we denote Ω and call the metasignature:

0. a constant true,
1. a unary prefix operation ¬, called negation,
2. a binary infix operation ∧, called conjunction,
3. a unary prefix operation (∀x) for each x in X, called universal quantification over x,

plus as its generators (i.e., constants not in Ω), the atomic (Φ, X)-formulae, which are the elements of

  G_X = {π(t_1, …, t_n) | π ∈ Π_{s_1…s_n} and t_i ∈ (T_{Σ∪X})_{s_i} for i = 1, …, n}.
Note that Ω is infinite if X is, because then there is an infinite number of unary operations (∀x), but of course all first-order formulae are finite. The symbols in Ω are called logical symbols, whereas those in Φ are called non-logical symbols. Let WFF(Φ) = WFF_𝒳(Φ); it contains every WFF_X(Φ), and its elements are called Φ-formulae.

The functions Var and Free, giving the sets of all variables, and of all free variables, of Φ-formulae, are defined by the following:

0. Var(true) = Free(true) = ∅,
1. Var(π(t_1, …, t_n)) = Free(π(t_1, …, t_n)) = ⋃_{i=1}^n Var(t_i),
2. Var(¬P) = Var(P), and Free(¬P) = Free(P),
3. Var(P ∧ Q) = Var(P) ∪ Var(Q), and Free(P ∧ Q) = Free(P) ∪ Free(Q), and
4. Var((∀x)P) = Var(P) ∪ {x}, and Free((∀x)P) = Free(P) − {x}.

A variable that is not free is called bound; let Bound(P) = Var(P) − Free(P). A Φ-sentence is a Φ-formula P with no free variables, i.e., with Free(P) = ∅; Φ-sentences are also called closed Φ-formulae. A formula that is not closed is called open. Let FoSen(Φ) denote the set of all Φ-sentences. □

Exercise 8.3.1 Show that the functions Var and Free are Ω-homomorphisms, by giving P(𝒳) appropriate Ω-algebra structures. □

We introduce the remaining logical connectives, false, ∨, ⇒, ⇔, and (∃x) (the last four are called disjunction, implication, equivalence, and existential quantification, respectively), as abbreviations for certain terms over the operations already introduced, as follows:

  false = ¬true
  P ∨ Q = ¬(¬P ∧ ¬Q)
  P ⇒ Q = ¬P ∨ Q
  P ⇔ Q = (P ⇒ Q) ∧ (Q ⇒ P)
  (∃x)P = ¬((∀x)¬P).

The symbols P, Q above are variables over formulae, and the five operation symbols on the left sides of the equations extend the metasignature Ω to a new metasignature Ω̄.
Given any Ω-algebra M, the five equations above extend M in a unique way to an Ω̄-algebra; moreover, any Ω-homomorphism is automatically an Ω̄-homomorphism between its extended algebras. In particular, the Ω-algebras WFF_X(Φ) extend to Ω̄-algebras, and the Ω-homomorphisms Var and Free extend to Ω̄-homomorphisms that correctly handle the new logical symbols. Without these symbols, many of our theorem-proving applications would be much more awkward; this illustrates the conflict between logics for foundations and logics for applications discussed in Section 1.3.

Exercise 8.3.2 Extend the recursive definitions of Var and Free so that they directly handle the new symbols in Ω̄. □

It might first appear that formulae like the following are ill-formed or ambiguous:

  (∃x)(∀x) geq(x, x)
  (∃x)(pos(x) ∧ (∀x) geq(x, x))
  (∃x)(((∀x) pos(x)) ∧ geq(x, x)).

But because we defined quantifiers as unary operations on expressions, every quantifier has a unique argument, a subformula called its scope. Every free instance of the quantifying variable within its scope is said to be bound (or captured) by that quantifier. Thus, in the first formula above, the two x's in geq(x, x) are bound to the universal quantifier, not to the existential. In the second formula, the first x is bound to the existential quantifier, and the other two are bound to the universal. In the third formula, the first x is bound to the universal and the next two are bound to the existential quantifier. It is poor style to keep reusing the same variable for quantifiers, and the following equivalent formulae would have been clearer:

  (∃y)(∀x) geq(x, x)
  (∃x)(pos(x) ∧ (∀y) geq(y, y))
  (∃y)(((∀x) pos(x)) ∧ geq(y, y)).
However, the original formulae still have definite structure and meaning, due to our algebraic notion of formula, which does not require any prior definition of scope.

Of course, it is still very possible to write ambiguous formulae, such as

  (∃x) pos(x) ∧ geq(x, x),

where the argument (scope) of the existential quantifier cannot be determined; however, this is a parsing problem, not a problem in first-order logic as such (see Section 3.7). In fact, it is rather common to write ambiguous formulae when it doesn't matter which parse is taken. For example, in the formula

  (∃x) pos(x) ∧ (∀x) geq(x, x),

it is not clear whether the existential quantifier acts on the universally quantified subformula, but it does not matter because that subformula is closed (see E17 of Exercise 8.3.10). This situation is much the same as in arithmetic when we write x + y + z instead of x + (y + z) or (x + y) + z, since we know it doesn't matter because addition is associative. In OBJ, precedence declarations can make quantifiers bind however tightly we wish.

In summary, our algebraic approach to first-order logic syntax has the advantage over the more ad hoc approaches usually found in the literature of a clean separation between structure and parsing, which simply avoids complex and confusing definitions of scope.

Many different systems of deduction have been given for first-order logic. Kurt Gödel first showed completeness for one of these with respect to a "semantic definition of truth" given by Tarski; this is a notion of satisfaction of first-order sentences by first-order models. All sound and complete systems are equivalent in the sense that they give rise to the same theorems for any theory.
This chapter uses (a version of) Tarski's semantics to justify a set of rules that transform complex proof tasks into Boolean combinations of simpler proof tasks; we call these proof planning rules; we do not attempt a completeness proof. More technically, the definitions in this subsection first extend assignments a : X → M from terms to first-order formulae P, and then define the "meaning" or denotation [[P]] of a formula P to be the set of all assignments that make P true.

Definition 8.3.2 Given a first-order signature Φ = (Σ, Π), a Φ-model M, and an assignment (of values in M to variables in X), i.e., an S-indexed function a : X → M, we define a : WFF_X(Φ) → B, where B = {true, false}, by the following:

0. a(true) = true.
1. a(¬P) = ¬a(P).
2. a(P ∧ Q) = a(P) ∧ a(Q).
3. a((∀x)P) = true iff b(P) = true for all b : X → M such that b(z) = a(z) if z ≠ x.
4. a(π(t_1, …, t_k)) = true iff (a(t_1), …, a(t_k)) ∈ M_π.

When X is small, it may be convenient to use the notation P[x_1 ← m_1, x_2 ← m_2, …, x_n ← m_n] instead of a(P), with X = {x_1, …, x_n} and a(x_i) = m_i for i = 1, …, n.

We now define the denotation of a (Φ, X)-formula P, written [[P]]^M_X, or just [[P]], to be

  {a : X → M | a(P) = true}.

Then M satisfies P ∈ WFF_X(Φ), written M ⊨_Φ P, iff [[P]]^M_X = [X → M], i.e., iff all assignments from X to M make P true. Given a set A of well-formed Φ-formulae, let M ⊨_Φ A mean M ⊨_Φ P for each P ∈ A, and let A ⊨_Φ P mean that M ⊨_Φ A implies M ⊨_Φ P for all Φ-models M; when the symbol ⊨ is used in this way, it may be called semantic entailment. Note that A need not be finite. A set A of formulae is closed iff all its elements are closed. As usual, we may omit the subscript Φ on ⊨_Φ when it is clear from context. Let us write FOL for the institution of first-order logic with this notion of satisfaction.
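For a finite model, the clauses of Definition 8.3.2 can be executed directly, since the quantifier clause only needs to range over finitely many assignments. The following Python sketch is illustrative only: the tuple encoding of formulae, the dictionary representation of a model, and the restriction of terms to bare variables are all our own simplifying assumptions, not from the text.

```python
# Illustrative Python sketch of Definition 8.3.2 over a finite model.
# Formulae: ('true',), ('not', P), ('and', P, Q), ('forall', x, P),
# ('atom', pi, v1, ..., vk) where the v's are variable names.

from itertools import product

def holds(phi, model, a):
    """a(phi), for an assignment a given as a dict from variables to elements."""
    tag = phi[0]
    if tag == 'true':                                   # clause 0
        return True
    if tag == 'not':                                    # clause 1
        return not holds(phi[1], model, a)
    if tag == 'and':                                    # clause 2
        return holds(phi[1], model, a) and holds(phi[2], model, a)
    if tag == 'forall':                                 # clause 3: all b that
        _, x, body = phi                                # agree with a off x
        return all(holds(body, model, {**a, x: m}) for m in model['carrier'])
    if tag == 'atom':                                   # clause 4
        _, pi, *terms = phi
        return tuple(a[t] for t in terms) in model['rel'][pi]
    raise ValueError(tag)

def satisfies(phi, model, variables):
    """M |= phi: every assignment from the variables into M makes phi true."""
    carrier = model['carrier']
    return all(holds(phi, model, dict(zip(variables, vals)))
               for vals in product(carrier, repeat=len(variables)))

# geq on the carrier {0, 1, 2}: M |= (forall x) geq(x, x) holds,
# but geq(x, y) fails for the assignment x = 0, y = 1.
M = {'carrier': [0, 1, 2],
     'rel': {'geq': {(i, j) for i in range(3) for j in range(3) if i >= j}}}
print(satisfies(('forall', 'x', ('atom', 'geq', 'x', 'x')), M, ['x', 'y']))
```

Note how `satisfies` quantifies over *all* assignments, exactly as [[P]]^M_X = [X → M] requires, while the `forall` clause only varies the one quantified variable.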
□

Intuitively, 3 in Definition 8.3.2 says [[(∀x)P]] is the set of assignments that make P true no matter what value they assign to x. This suggests that when P has no free variables, a(P) should be independent of the values a(x) for all x ∈ X. This is made precise in the following:

Proposition 8.3.3 Given a (Φ, X)-formula P and assignments a, a′ : X → M, if a(x) = a′(x) for all x ∈ Free(P), then a(P) = a′(P).

Proof: We use induction on the structure of (Φ, X)-formulae. The two base cases are 0 and 4 of Definition 8.3.2. For 0, if P = true, then a(P) = a′(P) for all a, a′. For 4, if P = π(t_1, …, t_k), then Free(P) = Var(P) = ⋃_{i=1}^k Var(t_i), and so if a(z) = a′(z) for all z ∈ Free(P) then a(t_i) = a′(t_i) for i = 1, …, k, and so a(P) = a′(P).

There are three "step" cases, of which 1 and 2 are easy, because Free(¬P) = Free(P) and Free(P ∧ Q) = Free(P) ∪ Free(Q). For 3, assume a(P) = a′(P) if a(z) = a′(z) for all z ∈ Free(P), and let Q = (∀x)P. Then Free(Q) = Free(P) − {x}, call it Z, and suppose a(z) = a′(z) for all z ∈ Z. Now a(Q) = a((∀x)P) = true iff b(P) = true for all b such that b(z) = a(z) if z ≠ x; also a′(Q) = a′((∀x)P) = true iff b′(P) = true for all b′ such that b′(z) = a′(z) if z ≠ x. Now suppose a(Q) = true and let b′ : X → M be such that b′(z) = a′(z) if z ≠ x. Define b : X → M by b(z) = a(z) if z ≠ x and b(x) = b′(x). Then b(z) = b′(z) for each z ∈ Free(P), and so the induction hypothesis gives us b′(P) = b(P) = true, and hence a′(Q) = true. Similarly, we can show that a′(Q) = true implies a(Q) = true. Therefore a(Q) = a′(Q) if a(z) = a′(z) for all z ∈ Z.
□

Corollary 8.3.4 If P is a closed (Φ, X)-formula and M is a Φ-model, then either [[P]]^M_X = ∅ or else [[P]]^M_X = [X → M].

Proof: Since Free(P) = ∅, we have a(z) = a′(z) for all z ∈ Free(P) for any a, a′ at all. Therefore a(P) = a′(P) for all a, a′. Hence a(P) = true for all a, or else a(P) = false for all a. In the first case, [[P]] = [X → M], while in the second [[P]] = ∅. □

That is, any closed formula is either true or else false of any given model.

As usual, [[_]]^M_X is an Ω-homomorphism to a suitable target algebra, in this case with carrier A^M_X = P([X → M]); we usually drop the superscript M and subscript X. The Ω-algebra structure for A is given as follows, for A, B ⊆ [X → M]:

0. A_true = [X → M].
1. A_¬(A) = [X → M] − A.
2. A_∧(A, B) = A ∩ B.
3. A_{(∀x)}(A) = {a : X → M | a′(y) = a(y) for all y ≠ x implies a′ ∈ A}.

If we now define α : G_X → A_X by

  α(π(t_1, …, t_n)) = {a | (a(t_1), …, a(t_n)) ∈ M_π},

then [[_]] is the unique Ω-homomorphism WFF_X(Φ) → A_X extending α; i.e., [[P]] = α(P).

Definition 8.3.5 First-order (Φ, X)-formulae P, Q are (semantically) equivalent, written P ≡ Q, iff [[P]]^M_X = [[Q]]^M_X for all M. □

Note that ≡ is neither a logical nor a non-logical symbol, but a metalogical symbol, used for talking about the satisfaction of formulae. Equivalent formulae are true under exactly the same circumstances and hence can be substituted for each other without changing the truth value of any formula of which they are part. (The ubiquity of this concept reflects the obsession of classical logic with truth.)

Exercise 8.3.3 Given a first-order signature Φ, a Φ-model M, and (Φ, X)-formulae P, Q, show the following:

(a) M ⊨ P ⇒ Q iff [[P]]^M_X ⊆ [[Q]]^M_X.
(b) P ≡ Q iff M ⊨ (P ⇔ Q) for all M.
(c) P ≡ Q implies (M ⊨ P iff M ⊨ Q) for all M.
(d) "implies" cannot be replaced by "iff" in (c) above.
(e) [[(∀x)P]]^M_X ⊆ [[P]]^M_X.

Note that (c) implies that P ≡ Q implies ([[P]] = [X → M] iff [[Q]] = [X → M]). □

Exercise 8.3.4 Given a first-order signature Φ, and Φ-formulae P, Q, R, show the following:

E1. P ∧ Q ≡ Q ∧ P.
E2. P ∧ (Q ∧ R) ≡ (P ∧ Q) ∧ R.
E3. P ∧ P ≡ P.

Also, for Φ-formulae A, A′, P, P′, show that

E4. A ≡ A′ and P ≡ P′ imply (A ⊨ P iff A′ ⊨ P′). □

E4 is a (weak) version of Leibniz's principle, that equal things may be substituted for each other; this supports the importance of ≡, and underlines the somewhat strange view of traditional logic that all true sentences are equal (as are all false sentences). The following builds on E1, E2 and E3:

Notation 8.3.6 In the notation "A ⊨ …" where A is a set, we may write A, P or A ∧ P for A ∪ {P}, and write A, A′ or A ∧ A′ for A ∪ A′. For example, A, P, Q ⊨ P, Q. □

This notation makes sense because both set notation and conjunction are commutative, associative and idempotent. Any finite set A can be regarded as the conjunction of its sentences, although this does not work if A is infinite.

Exercise 8.3.5 Let Φ be the signature of Example 8.1.2 above, let M be the standard Φ-model of Example 8.1.4, and let X = {x, y}. Now describe [[P]], for P each of the following:

  geq(s(s(s(0))), x)
  (∀x) geq(s(s(s(0))), x)
  (∀x)(∀y) geq(x, y) ∧ pos(x)
  (∃y) geq(x, y) ∧ pos(x)
  (∀y)(∃x) geq(x, y) ∧ pos(x)
  (∀x) pos(x) ⇒ geq(y, x)
  (geq(x, s(s(0))) ∧ geq(s(s(s(0))), x) ∧ geq(x, y)) ∨ (geq(x, y) ∧ geq(y, x)) □

Proposition 8.3.7 Given a signature Φ, a Φ-model M, and (Φ, X)-formulae P, Q, then:

P1. M ⊨ P ∧ Q iff M ⊨ P and M ⊨ Q.
P2. M ⊨ P ∨ Q if M ⊨ P or M ⊨ Q.
P3. M ⊨ P ∨ Q iff (M ⊨ P or M ⊨ Q), if P or Q is closed.
P4. M ⊨ P ⇒ Q iff (M ⊨ P implies M ⊨ Q), if P is closed.
P5. M ⊨ ¬¬P iff M ⊨ P.
P5a. M ⊨ ¬P iff M ⊨ P is false, if P is closed and M nonempty (recall that this means that all its carriers are non-empty).
P6. M ⊨ (∀x)P iff M ⊨ P.
P7. M ⊨ P if M ⊨ Q and M ⊨ Q ⇒ P.

Proof:

P1. Since [[P ∧ Q]] = [[P]] ∩ [[Q]], we have [[P ∧ Q]] = [X → M] iff [[P]] = [[Q]] = [X → M].

P2. Since [[P ∨ Q]] = [[P]] ∪ [[Q]], we have [[P ∨ Q]] = [X → M] if [[P]] = [X → M] or [[Q]] = [X → M].

P3. If P is closed then [[P]] = [X → M] or else [[P]] = ∅ by Corollary 8.3.4. Therefore [[P ∨ Q]] = [X → M] iff [[P]] = [X → M] or [[Q]] = [X → M]. The argument is the same for the case where Q is closed.

P4. This follows from Corollary 8.3.4 plus the fact that [[P ⇒ Q]] = ([X → M] − [[P]]) ∪ [[Q]].

P6. We need [[(∀x)P]] = [X → M] iff [[P]] = [X → M], for x ∈ X. By (e) of Exercise 8.3.3, [[(∀x)P]] = [X → M] implies [[P]] = [X → M]. Conversely, if [[P]] = [X → M], then for each a : X → M and each b : X → M such that b(z) = a(z) for all z ≠ x, we have b(P) = true, so that a ∈ [[(∀x)P]]. Therefore [[(∀x)P]] = [X → M].

P5, P5a and P7 are left as exercises. □

P5 is a semantic version of ¬¬P ≡ P, and is often called the law of double negation. P7 is called modus ponens and goes back to the ancient Greeks (though the name is Latin); it is closely related to the corresponding rule of deduction.

Example 8.3.8 The "if" in P2 cannot be strengthened to "iff." Let Φ be the signature of Example 8.1.2, let M be its standard model, and let P, Q be the formulae pos(x), ¬geq(x, s(0)), respectively. Then M ⊨ P ∨ Q holds, but both M ⊨ P and M ⊨ Q are false.
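Example 8.3.8 can be checked mechanically over a small finite carrier. The following Python fragment is an illustrative sketch (with {0, 1, 2} standing in for an initial segment of the standard model, a simplification of our own): it shows P ∨ Q satisfied by every assignment while neither P nor Q is, so the "if" of P2 is indeed not an "iff".

```python
# Illustrative check of Example 8.3.8 / P2 on the carrier {0, 1, 2}:
# P = pos(x) means x >= 1, and Q = not geq(x, s(0)) means x < 1.

carrier = [0, 1, 2]
P = lambda x: x >= 1          # pos(x)
Q = lambda x: x < 1           # not geq(x, s(0))

sat_P_or_Q = all(P(x) or Q(x) for x in carrier)   # every x satisfies a disjunct
sat_P = all(P(x) for x in carrier)                # fails at x = 0
sat_Q = all(Q(x) for x in carrier)                # fails at x = 1

print(sat_P_or_Q, sat_P, sat_Q)                   # True False False
```

The same shape of counterexample shows why P3 needs one disjunct to be closed: a disjunction of open formulae can hold "pointwise" without either disjunct holding for all assignments.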
□

Exercise 8.3.6 The following refer to Proposition 8.3.7 above:

(a) Give a signature Φ, a Φ-model M and Φ-formulae P, Q which show that P4 does not hold without the restriction that P is closed.
(b) Show that M ⊨ P ⇒ Q implies (M ⊨ P implies M ⊨ Q), with neither P, Q required to be closed.
(c) Prove P5 and P5a.
(d) Prove P7. □

The following results for semantic entailment are analogous to those in Proposition 8.3.7:

Proposition 8.3.9 Let Φ be a first-order signature, let A be a set of (Φ, X)-formulae, and let P, Q be (Φ, X)-formulae. Then

R1. A, P ⊨ P.
R2. A ⊨ P ∧ Q iff A ⊨ P and A ⊨ Q.
R3. A ⊨ P ∨ Q if A ⊨ P or A ⊨ Q.
R4. A ⊨ P ⇒ Q iff A, P ⊨ Q, if P is closed.
R4a. A ⊨ ¬P iff A, P ⊨ false, if P is closed.
R5a. A ⊨_Φ (∀x)P iff A ⊨_Φ P, if x is not free in A.
R5. A ⊨_Φ (∀x)P iff A ⊨_{Φ({x})} P, if x is not free in A.
R5b. A, (∀x)P ⊨ Q iff A, P ⊨ Q.
R6. A ⊨ P if A ⊨ Q and A ⊨ Q ⇒ P.
R6a. A ⊨ P ⇒ R if A ⊨ P ⇒ Q and A ⊨ Q ⇒ R.

Proof: R1, R2 and R3 follow directly from the definitions, using P1 and P2 of Proposition 8.3.7. For R4,

  A ⊨ P ⇒ Q iff
  for all models M, (M ⊨ A) ⇒ (M ⊨ P ⇒ Q) iff
  for all models M, ¬(M ⊨ A) ∨ ¬(M ⊨ P) ∨ (M ⊨ Q) iff
  for all models M, ¬(M ⊨ A ∧ P) ∨ (M ⊨ Q) iff
  for all models M, (M ⊨ A ∧ P) ⇒ (M ⊨ Q),

which is equivalent to A, P ⊨ Q, where the first ⇒ in the first line and all the iffs are in the metalanguage, while the second ⇒ is in first-order logic, and where the first iff uses P4. R4a follows from R4, by substituting false for Q and ¬P for P, and using ¬P ≡ (P ⇒ false).
R5a, R5b and R6 follow from the corresponding cases of Proposition 8.3.7. For R5, by R5a it suffices to show that A ⊨_Φ P iff A ⊨_{Φ({x})} P, i.e., to show that the following are equivalent:

  for all models M, (M ⊨_Φ A ⇒ M ⊨_Φ P)
  for all models M′, (M′ ⊨_{Φ({x})} A ⇒ M′ ⊨_{Φ({x})} P),

noting that the M in the first assertion are Φ-models, while the M′ in the second are Φ({x})-models. Since x is not free in A, it is sufficient to show

  for all models M, M ⊨_Φ P iff for all models M′, M′ ⊨_{Φ({x})} P.

These two expressions respectively equal

  for all models M, [[P]]^M_X = [X → M]
  for all models M′, [[P]]^{M′}_{X−{x}} = [(X − {x}) → M′],

which are equivalent because an assignment a : X → M to a Φ-model M is the same thing as an assignment a′ : (X − {x}) → M′ to a Φ({x})-model M′. R6a is left as an exercise. □

R4 is a semantic version of the Theorem of Deduction. R4a justifies proof by contradiction. R5 is a semantic version of the Theorem of Constants, and the heart of its proof is similar to that for the equational case; see also (d) of Exercise 8.3.7 below. Note that in forming Φ({x}), Φ and {x} are disjoint because Φ and X are. Strictly speaking, we have changed the variable set from X to X − {x}, so that occurrences of x in Φ({x})-formulae are constants, not variables. R6 is a semantic version of modus ponens. R5b implies that outermost universal quantifiers can be removed from the left of a turnstile; however, this should not be done automatically, because it precludes substituting for the variable involved (see Section 8.3.4). R6a expresses the transitivity of implication.
Exercise 8.3.7 The following refer to Proposition 8.3.9:

(a) Give Φ, A, P showing that the "if" in R3 cannot be strengthened to "iff".
(b) Prove R4 with "only if" replacing "iff" and without the clause "if P is closed."
(c) Give Φ, A, P showing that neither direction of the assertion A ⊨ ¬P iff A ⊭ P is correct.
(d) Generalize R5, replacing x by a variable set X.
(e) Prove R6a.

Hint: Very simple choices will work for (a) and (c). □

Mathematics, and especially logic, is often said to deal with absolute or eternal truths, true under all possible interpretations in all possible models (or "worlds"). Mathematical truths are also said to be formal truths, "trivially" true for formal, non-empirical reasons (though of course establishing such truths can be non-trivial); the word "tautological" is also used.

Definition 8.3.10 A Φ-formula P is a tautology iff M ⊨ P for every Φ-model M. □

Exercise 8.3.8 For Φ the signature of Example 8.1.2, show that each of the following is a tautology, or else show that it is not:

  (∀x)(geq(x, 0) ⇒ pos(x))
  geq(x, x)
  geq(x, y) ∨ ¬geq(x, y)
  geq(x, y) ∨ geq(y, x)
  (∀x) geq(x, x)
  (∀x)(geq(x, y) ⇒ geq(x, x))
  (∀x)(pos(x) ⇒ pos(x)).

Hint: Don't forget that there is no fixed Φ-model here. □

Exercise 8.3.9 Prove the following equivalences for P, Q first-order formulae:

E5. true ∧ P ≡ P.
E6. false ∧ P ≡ false.
E7. true ∨ P ≡ true.
E8. false ∨ P ≡ P.
E9. P ∧ ¬P ≡ false.
E10. P ∨ ¬P ≡ true.
E11. ¬¬P ≡ P.
E12. (∀x)(∀y)P ≡ (∀y)(∀x)P.
E13. (∀x)(∀x)P ≡ (∀x)P. □

Notation 8.3.11 Let X be a variable set with elements x_1, …, x_n. Then we may write (∀X)P for (∀x_1)…(∀x_n)P. By E12 and E13, ordering and repetition of variables do not matter. Note that (∀X) does not make sense if X is infinite. We extend existential quantifiers in the same way, to (∃X) where X is a finite set of variables.
□

Results about quantifiers generally extend by induction on the number of quantified variables. Let Φ(X) denote the first-order signature (Σ(X), Π) when Φ = (Σ, Π).

Proposition 8.3.12 Let Φ be a first-order signature, A a set of Φ-sentences, and P a Φ-formula. Then

R5aX. A ⊨_Φ (∀X)P iff A ⊨_Φ P.
R5X. A ⊨_Φ (∀X)P iff A ⊨_{Φ(X)} P.

R5X is the classical Theorem of Constants. □

Exercise 8.3.10 Prove the following equivalences for P, P′, Q, R first-order formulae:

E14. (P ⇒ Q) ∧ (P′ ⇒ Q) ≡ (P ∨ P′) ⇒ Q.
E15. (∀X)(P ∧ Q) ≡ (∀X)P ∧ (∀X)Q.
E16. (∀X)P ≡ P if P is closed.
E17. (∃X)(P ∨ Q) ≡ (∃X)P ∨ (∃X)Q.
E18. (∃X)(P ∧ Q) ≡ ((∃X)P) ∧ Q if Q is closed.
E19. (∃X)P ≡ P if P is closed.
E20. (∀X)(∀Y)P ≡ (∀Y)(∀X)P.
E21. (∃X)(∃Y)P ≡ (∃Y)(∃X)P.
E22. (P ∨ Q) ∧ R ≡ (P ∧ R) ∨ (Q ∧ R).

E22, as well as E5–E11, are examples of the very general principle that every equational law of Boolean algebra holds as an equivalence of first-order formulae, i.e., is a tautology; see also E1–E3. □

Exercise 8.3.11 Let Φ be a first-order signature and let P, Q be Φ-formulae.

(a) Show that P is a tautology iff true ⊨_Φ P.
(b) Give an example showing that the condition "Q is closed" is necessary in E18.
(c) Show that if Q is closed then (∀X)(P ∨ Q) ≡ ((∀X)P) ∨ Q. □

We noted earlier that in our syntax for Horn clause logic, the symbols ∀, ∧, and ⇒ are not themselves logical symbols, but instead together constitute a single mixfix logical symbol.
However, this notation does suggest a simple translation into first-order logic, where each symbol is taken as the corresponding logical symbol in first-order logic. This can perhaps be made clearer by adding parentheses, so that the translation of the Horn clause

  h = (∀X) p_1 ∧ ··· ∧ p_n ⇒ p

is the first-order formula

  h′ = (∀X)((p_1 ∧ ··· ∧ p_n) ⇒ p),

where of course the same signature Φ is used in each case. The following enables us to regard HCL as a "subinstitution" of FOL:

Fact 8.3.13 Let M be a Φ-model, let h be a Horn clause, and let h′ be its first-order translation. Then M ⊨_Φ h iff M ⊨_Φ h′. □

Exercise 8.3.12 Prove Fact 8.3.13 from the appropriate definitions of satisfaction. □

The following demonstrates the very important fact that initial models do not always exist for theories over full first-order logic; this implies that it is not (in general) valid to do induction over models defined by sets of first-order sentences, even if they are supposed to be initial.

Example 8.3.14 Let Σ have one sort and two constants, a, b, let Φ have just one relation symbol, π, and let A consist of the axiom π(a) ∨ π(b). This specification has no initial model: clearly the carrier must be {a, b}, but there is no way to get a smallest subset for π; in fact, there are two different equally good (and equally bad) minimal choices, namely {a} and {b}. □

Proposition 8.3.15 Given a (Φ, X)-sentence (∃x)P with Free(P) = {x} and a Φ-model M with all carriers nonempty, then M ⊨ (∃x)P iff there is an assignment a : X → M such that a(P) = true.

Proof: By the following computation:

  M ⊨ (∃x)P iff (by definition of ∃)
  M ⊨ ¬((∀x)¬P) iff (by P5a)
  not (M ⊨ (∀x)¬P) iff (by P6)
  not (M ⊨ ¬P) iff
  not ([[¬P]] = [X → M]) iff
  not ([[P]] = ∅) iff
  there exists a : X → M with a(P) = true.
□

Exercise 8.3.13 Give examples showing how Proposition 8.3.15 fails if either M has empty carriers, or P has free variables other than x. □

This section extends substitution from terms to first-order formulae, and gives the so-called Substitution Theorem, which will be important for several later developments, including that of quantifiers.

Definition 8.3.16 Let Φ = (Σ, Π) be a first-order signature and let θ : X → T_Σ(X) be a substitution. Now define θ̂ : WFF_X(Φ) → WFF_X(Φ) recursively as follows:

0. θ̂(π(t_1, …, t_n)) = π(θ(t_1), …, θ(t_n));
1. θ̂(true) = true;
2. θ̂(¬P) = ¬θ̂(P);
3. θ̂(P ∧ Q) = θ̂(P) ∧ θ̂(Q);
4. θ̂((∀x)P) = (∀x)θ̂_x(P), where θ_x is the substitution that agrees with θ everywhere on X except x, and θ_x(x) = x.

We may write θ(P), or sometimes more elegantly Pθ, for θ̂(P), and call it the result of applying θ to P, or of substituting θ(x) in P for each x ∈ X. When X is small, the notation P[x_1 ← t_1, …, x_n ← t_n] may be more convenient than Pθ. □

The simplicity of this definition, which as usual is recursive over Ω, may come as a pleasant surprise. Notice that θ̂ automatically avoids substituting for bound variables. However, there is a subtle difficulty:

Example 8.3.17 Let Φ be the signature of Example 8.1.2, let X = {x, y, z}, let θ(x) = θ(y) = s(x), and θ(z) = z. Then

  θ(geq(y, x)) = geq(s(x), s(x)).
  θ((∀x)(geq(x, 0) ⇒ pos(x))) = (∀x)(geq(x, 0) ⇒ pos(x)).
  θ((∀x)(geq(x, y) ⇒ geq(x, x))) = (∀x)(geq(x, s(x)) ⇒ geq(x, x)).

Note the capture of the x in s(x) by the quantifier in the last formula, although the variable y that it replaced was free in this formula. This phenomenon is called variable capture.

Now define a substitution τ by τ(x) = τ(y) = τ(z) = z.
Then (θ ; τ)(x) = (θ ; τ)(y) = s(z), and (θ ; τ)(z) = z, so that if we let P denote the third formula above, then

  P(θ ; τ) = (∀x)(geq(x, s(z)) ⇒ geq(x, x)),

whereas

  (Pθ)τ = (∀x)(geq(x, s(x)) ⇒ geq(x, x)).

Thus variable capture thwarts the compositionality of substitution. This motivates Definition 8.3.18 below. □

Exercise 8.3.14 We can extend the notation θ_x to θ_Z for Z ⊆ X, by defining θ_Z(y) = θ(y) for y ∉ Z and θ_Z(y) = y for y ∈ Z. Show the following for any P in WFF_X(Φ) and substitution θ:

1. θ = θ_Z iff θ is the identity on (at least) Z.
2. θ((∀Z)P) = (∀Z)θ_Z(P).
3. θ(P) = P if P is closed.
4. More generally, θ_{Free(P)}(P) = P.
5. Even more generally, θ(P) = τ(P) if θ(y) = τ(y) for y ∈ Free(P). □

Definition 8.3.18 Given a (Φ, X)-formula P and a substitution θ, define θ to be capture free for P as follows:

0. θ is capture free for P if P is atomic;
1. θ is capture free for true;
2. θ is capture free for ¬P if it is for P;
3. θ is capture free for P ∧ Q if it is for P and for Q; and
4. θ is capture free for (∀x)P if θ_x is capture free for P and, if y ≠ x is a free variable of P, then x is not free in θ(y).

Capture freedom extends from the operations in Ω to those in Ω̄; for example, θ is capture free for (∃x)P under exactly the same conditions as those for (∀x)P. □

Proposition 8.3.19 Let θ, τ be two substitutions such that θ is capture free for P. Then

1. (Pθ)τ = P(θ ; τ), and
2. if τ is capture free for Pθ, then θ ; τ is capture free for P.

Proof: We prove 1 by structural induction over Ω. We leave the reader to check the result for P atomic or true, and for negation and conjunction. Suppose P = (∀x)Q. Then (Pθ)τ = (∀x)((Qθ_x)τ_x) and P(θ ; τ) = (∀x)Q((θ ; τ)_x); because θ is capture free for P, so is θ_x for Q; thus by the induction hypothesis, (Qθ_x)τ_x = Q(θ_x ; τ_x).

Now we claim Q(θ_x ; τ_x) = Q((θ ; τ)_x). By 5
of Exercise 8.3.14, it suffices to show (θ_x ; τ_x)(y) = (θ ; τ)_x(y) for all y ∈ Free(Q). If x ∈ Free(Q) then (θ_x ; τ_x)(x) = x = (θ ; τ)_x(x); if y ≠ x is in Free(Q), then (θ ; τ)_x(y) = (θ ; τ)(y) and also (θ_x ; τ_x)(y) = τ_x(θ_x(y)) = τ_x(θ(y)) = τ(θ(y)), the last equality because θ is capture free for P, since x does not occur in θ(y).

We also prove 2 by structural induction over Ω. Because capture freedom commutes with negation and conjunction, as does substitution, it suffices to check the induction step for P = (∀x)Q. But (θ ; τ)_x is capture free for Q because θ_x is capture free for Q and τ_x is capture free for Qθ_x, plus the induction hypothesis. Now let y ≠ x be a free variable in Q; because x ∉ Var(θ(y)) and x ∉ Var(τ(z)) for any free variable z ≠ x of Qθ_x, we have x ∉ Var(τ(θ(y))). Therefore θ ; τ is capture free for P. □

Definition 8.3.20 Given a substitution θ : X → T_Σ(X) and a model M, define [[θ]]_M : [X → M] → [X → M] by [[θ]]_M(a) = θ ; a. As usual, we omit the subscript M when context makes it unnecessary. □

The next result does most of the work involved in proving the main result of this subsection; its rather technical proof has been placed in Appendix B to avoid distraction.

Proposition 8.3.21 If θ is capture free for P, then [[θ(P)]]_M = [[θ]]_M^{-1}([[P]]_M) for any model M. □

The main result now says that any substitution instance of a valid formula is valid:

Theorem 8.3.22 (Substitution) Let Φ = (Σ, Π) be a first-order signature, A a set of Φ-sentences, P a (Φ, X)-formula, and θ : X → T_Σ(X) a substitution that is capture free for P. Then A ⊨ P implies A ⊨ Pθ.

Proof: Fix a model M of A. It suffices to show that [[P]] = [X → M] implies [[Pθ]] = [X → M], which follows directly from Proposition 8.3.21. □

Corollary 8.3.23 Let Φ = (Σ, Π) be a first-order signature, A a set of Φ-sentences, and P in WFF_X(Φ).
Let Y ⊆ X be a finite variable set and let θ : X → T_Σ(X) be a substitution such that θ_Y = θ (i.e., θ can only be non-identity outside Y). Then A ⊨ (∀Y)P implies A ⊨ (∀Y)Pθ.

Proof: Let Q = (∀Y)P and apply Theorem 8.3.22 to get A ⊨ (∀Y)P implies A ⊨ θ((∀Y)P). Now by 2. of Exercise 8.3.14 and because θ_Y = θ, we get A ⊨ (∀Y)P implies A ⊨ (∀Y)Pθ. □

From this, it further follows that A ⊨ (∀Y)P implies A ⊨ (∀Y′)Pθ when Y′ ⊆ Y and θ = θ_(Y−Y′). This says we can substitute values for variables in Z = Y − Y′ and eliminate their quantifiers.

The second result below is needed in Section 8.5.

Lemma 8.3.24 Let P be a Φ-formula, let θ : X → T_Σ(X) be a substitution that is capture free for P, and let a : X → M be an interpretation in a Φ-model M. Then a(Pθ) = (θ ; a)(P).

Proof: More precisely, we need to show that θ̂ ; a = θ ; a, which follows by showing that (θ̂ ; a)(x) = (θ ; a)(x) for all x ∈ X, and that θ̂ ; a satisfies the conditions of Definition 8.3.2. □

Lemma 8.3.25 Let P be a Φ-formula and let substitutions θ, θ′ : X → T_Σ(X) be capture free for P. Given interpretations a, a′ : X → M in a Φ-model M, then a(Pθ) = a′(Pθ′) whenever a(θ(z)) = a′(θ′(z)) for all z ∈ Free(P).

Proof: By Lemma 8.3.24, (θ ; a)(P) = a(Pθ) and (θ′ ; a′)(P) = a′(Pθ′). Because (θ ; a)(x) = (θ′ ; a′)(x) for all x ∈ Free(P), Proposition 8.3.3 gives (θ ; a)(P) = (θ′ ; a′)(P). Thus a(Pθ) = a′(Pθ′).
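As an aside, the recursive clauses of Definition 8.3.18 are easy to animate. The following Python sketch is not part of the book's development: the term and formula encodings are invented for illustration, but the capture-freedom check follows the clauses of the definition directly.

```python
# Sketch of Definition 8.3.18 (capture-free substitution).
# Terms: a variable is a string; an application is ('app', f, [args]).
# Formulas: ('atom', [terms]), 'true', ('not', P), ('and', P, Q),
# ('all', x, P). These constructors are illustrative, not the book's.

def term_vars(t):
    if isinstance(t, str):
        return {t}
    return set().union(*[term_vars(a) for a in t[2]]) if t[2] else set()

def free_vars(p):
    if p == 'true':
        return set()
    op = p[0]
    if op == 'atom':
        return set().union(*map(term_vars, p[1])) if p[1] else set()
    if op == 'not':
        return free_vars(p[1])
    if op == 'and':
        return free_vars(p[1]) | free_vars(p[2])
    return free_vars(p[2]) - {p[1]}          # ('all', x, P)

def capture_free(theta, p):
    """theta is a dict, read as the identity off its domain."""
    if p == 'true' or p[0] == 'atom':
        return True                           # clauses 0 and 1
    if p[0] == 'not':
        return capture_free(theta, p[1])      # clause 2
    if p[0] == 'and':                         # clause 3
        return capture_free(theta, p[1]) and capture_free(theta, p[2])
    x, q = p[1], p[2]                         # clause 4: (all x) q
    theta_x = {y: t for y, t in theta.items() if y != x}
    return capture_free(theta_x, q) and all(
        x not in term_vars(theta.get(y, y))
        for y in free_vars(q) if y != x)

# Substituting s(x) for y under (all x) would capture x:
p = ('all', 'x', ('atom', ['x', 'y']))
# capture_free({'y': ('app', 's', ['x'])}, p)  -> False
# capture_free({'y': ('app', 's', ['z'])}, p)  -> True
```

The rejected case is exactly the compositionality failure of the example above: the substituted term mentions the bound variable x.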
□

The syntax of first-order logic with equality is exactly the same as that of first-order logic, but its signatures are required to have binary (infix) equality predicates, exactly one for each sort s ∈ S, denoted =_s; more precisely, we assume that {=_s} ⊆ Π_ss for each s ∈ S. Semantically, first-order logic with equality restricts its models to those where the equality predicates are interpreted as actual identities; that is, for each model M and s ∈ S,

  M_{=_s} = { ⟨m, m⟩ | m ∈ M_s }.

Satisfaction is as usual. Let us denote this institution FOLEQ. It is important to notice that all our definitions and results for FOL carry over to FOLEQ. This is because the proofs are the same, the only difference being that there are fewer models.

Similarly, Horn clause logic with equality is Horn clause logic with the same given equality predicates, interpreted the same way as above; let us denote this institution HCLEQ.

The first-order logic of equality is the special case of first-order logic with equality where equalities are the only predicates. Since Φ is completely determined by Σ, we may write ⊨_Σ instead of ⊨_Φ in this context. Again, the definitions, results, and proofs for FOL carry over. Let us denote this institution FOLQ.

Similarly, the Horn clause logic of equality is Horn clause logic with equality where the only predicates are the equalities. In fact, the Horn clause logic of equality is the same as conditional equational logic (see the exercise below); therefore it is another way to view the logic of OBJ. Of course, our algebraic orientation prefers the conditional equational formulation to the Horn clause formulation.

Exercise 8.3.15 Let Φ = (Σ, Π) be a first-order signature with exactly one predicate symbol for each sort, namely the equality.
Now define a translation from conditional Σ-equations e to Φ-Horn clauses h_e, and prove that M ⊨_Φ e iff M ⊨_Φ h_e, for any Φ-model M. □

It now follows that the institution CEQL of conditional equational logic is a subinstitution of FOLEQ. Hence the rules of deduction for CEQL are also valid for FOLEQ, of course restricted to the sentences that correspond to conditional equations.

Going further along the line of Section 8.3.5, we can give fixed "standard" interpretations not only for equality symbols, but also for any desired sorts and non-logical symbols. For example, if Ψ is the signature Φ_Nat of Example 8.1.2 and if Φ is some first-order signature with Ψ ⊆ Φ, then we can fix the interpretation of Ψ to be the standard natural numbers. Define a Φ-model over D to be a Φ-model M such that the restriction (reduct) M|_Ψ of M to Ψ is the fixed model D. We denote this institution FOL/D; then all our definitions and results for FOL carry over to FOL/D, because the same proofs work on the reduced collection of models. If A is a set of Φ-axioms, then a (Φ, A)-model over D is a Φ-model over D satisfying A. Note that for some A there may be no such models, for example, if A implies a Ψ-sentence that is false in D (such as 1 = 0). Similarly, we obtain the institution FOLEQ/D of first-order logic with equality over some fixed Ψ-model D.

If we add a few more arithmetic operations to the signature Ψ = Φ_Nat of the natural numbers, we get a system to which Gödel's incompleteness theorem applies. This famous result says that any first-order theory rich enough to talk about a certain fragment of arithmetic will always have true sentences that cannot be proved; in other words, no finite set of axioms can be complete for arithmetic. The situations that arise in our applications are often of this kind, since we need to reason about some fixed data types, e.g., natural numbers, integers, lists of natural numbers, etc.
In practice, when we stumble over a result that cannot be proved by equational reasoning from the axioms in our theory, we try to prove it using induction. Induction is a second-order axiom, not a first-order axiom, but even so, there is no guarantee that we will find the proof we want by using it.

We can also consider the institution of the first-order logic of equality over a fixed model D, denoted FOLQ/D. The definitions and results for FOL again carry over, because the proofs are the same; and the above discussion about incompleteness also applies. FOLQ/D is fundamental for this book, because our method is to state proof tasks using formulae in this logic, and then reduce them to a combination of equational proof tasks that can be handled with reduction (see Section 8.4 below). Because D is usually defined by initiality with respect to some given equational theory, induction can usually be used to prove additional properties of D that are needed (such properties are traditionally called "lemmas").

For proof planning, we will use the 2-bar semantic entailment turnstile, ⊨, in a new way, reading "A ⊨ P" as indicating the task of proving the goal P from the assumptions A. With this in mind, we can reformulate the assertions of Proposition 8.3.9 as "proof planning rules," rewrite rules that transform complex proof tasks into combinations of simpler proof tasks. Given a first-order signature Φ, a set A of Φ-sentences, and Φ-formulae P, Q, these rules are as follows:

T0. A, P ⊨_Φ P ⇝ true.
T1. A ⊨_Φ P ∧ Q ⇝ A ⊨_Φ P and A ⊨_Φ Q.
T2. A ⊨_Φ P ∨ Q ⇝ A ⊨_Φ P or A ⊨_Φ Q.
T3. A ⊨_Φ P ⇒ Q ⇝ A, P ⊨_Φ Q if P is closed.
T4. A ⊨_Φ ¬P ⇝ A, P ⊨_Φ false if P is closed.
T5. A ⊨_Φ (∀X)P ⇝ A ⊨_Φ(X) P.

Thus, T1 says that to prove P ∧ Q from A, we can prove P from A and Q from A.
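To make the flavor of these transformations concrete, here is a small Python sketch; it is not from the book, and the task and formula representations are invented for illustration. It applies rules of exactly this kind, reducing a proof task to a Boolean combination of atomic tasks:

```python
# Sketch of the proof-planning rules as rewrites on proof tasks.
# Formulas: ('and',p,q), ('or',p,q), ('imp',p,q), ('not',p),
# ('all',x,p), or an atom (a string). A task (A, phi, sig) is read
# as "prove phi from assumptions A over working signature sig".
# The representation and helper names are illustrative only.

def is_closed(p):
    # Assumption of this sketch: we treat every atom as closed;
    # a real implementation would track free variables.
    return True

def plan(A, phi, sig):
    """Transform a task into a Boolean combination of atomic tasks."""
    if phi in A:                               # the true rule
        return True
    if isinstance(phi, tuple):
        op = phi[0]
        if op == 'and':                        # conjunction rule
            return ('and', plan(A, phi[1], sig), plan(A, phi[2], sig))
        if op == 'or':                         # disjunction rule (sound only)
            return ('or', plan(A, phi[1], sig), plan(A, phi[2], sig))
        if op == 'imp' and is_closed(phi[1]):  # implication rule
            return plan(A | {phi[1]}, phi[2], sig)
        if op == 'not' and is_closed(phi[1]):  # negation rule
            return plan(A | {phi[1]}, 'false', sig)
        if op == 'all':                        # quantifier rule: grow signature
            return plan(A, phi[2], sig + (phi[1],))
    return ('atom', frozenset(A), phi, sig)    # leave for reduction elsewhere

task = plan(frozenset({'p'}), ('imp', 'q', ('and', 'p', 'q')), ())
# -> ('and', True, True): the implication and conjunction rules fire,
#    then both conjuncts are discharged by the true rule.
```

Note that each clause only replaces a goal by sufficient subgoals, mirroring the left-to-right orientation of the rules above.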
These really are rewrite rules rather than equations, because they have a definite left to right orientation. In particular, T2 can only be applied left to right, not vice versa, because R2 is an "if," not an "iff." Rule T4 is not yet very useful, because so far we have no way to prove false; we must supplement this rule later. Note that the signature subscripts on ⊨ are important for T5, but not for T0 to T4. We will call the signature that appears as a subscript on the turnstile the working signature of the proof task A ⊨_Φ P in this context.

We say a proof planning rule is sound if its rightside is a sufficient condition for its leftside; note that this is opposite to soundness for rules of deduction, where the rightside is a necessary condition for the leftside. For example, if we weaken rule T2 to say that to prove P ∨ Q from A, it suffices to prove P, and it also suffices to prove Q, then the right sides are sufficient for their left sides, but far from necessary:

T2a. A ⊨_Φ P ∨ Q ⇝ A ⊨_Φ P.
T2b. A ⊨_Φ P ∨ Q ⇝ A ⊨_Φ Q.

We will see that rules like these are adequate for many interesting problems, including the ripple carry adder discussed in Section 8.4.2 below. We will also see that these rules can themselves be expressed and executed in OBJ.

As a first step in making the above intuitions more precise, let us consider the language used for expressing proof tasks. We can see that all the terms in T0 to T5 involve expressions of the form A ⊨_Φ B, where A, B are Φ-formulae. Since our proof-planning applications involve atomic sentences from the institution FOLQ/D, we may write ⊨_Σ instead of ⊨_Φ. The terms in T0 to T5 are metasentences about FOLQ/D: they make assertions about combinations of proof tasks involving sentences in FOLQ/D. Of course, most assertions of this form are false.

Our paradigm takes a proof task A ⊨_Φ P and transforms it into a Boolean combination of proof tasks that can be checked by reduction with OBJ.
Proposition 8.3.9 then implies that if we use T0 to T5 to reduce a task to true, then A ⊨_Φ P is true, i.e., the proof score consisting of those reductions really does prove P from A. If some reductions don't do what we want, then we have failed to prove the result, but in general, this does not mean it isn't true (though there are some cases where failure does imply that the original proof task is false).

The rules in the object META below encode the transformations T0 to T5. (Strictly speaking, the equations here are really rewrite rules, so that the full power of equational logic cannot be used, but only term rewriting.) Ground terms of sort Meta are metasentences that describe structures of possible proofs; they could also be called "proof terms," because they are possible proofs expressed as terms. This module uses order-sorted algebra; for example, the line "subsort BType < Type" means that every BType (for "basic type") is also a Type. Order-sorted algebra is developed in Chapter 10, but the OBJ code below should be understandable without a detailed knowledge of Chapter 10.

obj META is sorts Meta Sen Sig Type .
  pr QID .
  dfn BType is QID .
  subsort BType < Type .
  subsort Bool < Sen Meta .
  op _|=[_]_ : Sen Sig Sen -> Meta [prec 11] .
  op (_][_:_) : Sig Id Type -> Sig .
  op _and_ : Meta Meta -> Meta [assoc comm prec 2] .
  op _and_ : Sen Sen -> Sen [assoc comm prec 2] .
  op _or_ : Meta Meta -> Meta [assoc comm prec 7] .
  op _or_ : Sen Sen -> Sen [assoc comm prec 7] .
  op _=>_ : Meta Meta -> Meta [prec 9] .
  op _=>_ : Sen Sen -> Sen [prec 9] .
  op not_ : Meta -> Meta [prec 1] .
  op not_ : Sen -> Sen [prec 1] .
  op (all_:_ _) : Id Type Sen -> Sen .
  vars A P Q : Sen . var X : Id .
  var T : Type .
  var S : Sig .
  [ass] eq A and P |=[S] P = true .
  [and] eq A |=[S] (P and Q) = (A |=[S] P) and (A |=[S] Q) .
  [or] eq A |=[S] (P or Q) = (A |=[S] P) or (A |=[S] Q) .
  [imp] eq A |=[S] (P => Q) = (A and P) |=[S] Q .
  [not] eq A |=[S] (not P) = (A and P) |=[S] false .
  [all] eq A |=[S] (all X : T P) = A |=[S][X : T] P .
endo

Note that the Boolean operations and, or, =>, and not are triply overloaded, because they are defined for both sentences and metasentences, as well as for OBJ's builtin Booleans. Since we have shown these operations to be associative and commutative, we can include these laws as attributes.

Strictly speaking, the rule all should require that the variable X not occur in A, and most texts on first-order logic do give such a "side condition" for this rule. However, it is more natural in our setting to consider this a condition on signatures, since forming Σ(X) already requires X to be disjoint from Σ. This well-formedness condition is easily expressed using so-called "error supersorts" in order-sorted algebra, but because we have not treated that topic, we omit this from the above specification.

Now let's use this machinery to plan some proofs:

Example 8.4.1 Below is a simple reduction of a compound proof task to a simpler proof task. This computation tells us that if we want to prove a sentence of the form A ⊨_Σ (∀ w1, w2 : Bus) P1 ⇒ P2, then it suffices to prove A, P1 ⊨_Σ(w1,w2:Bus) P2. Here is the OBJ code:

open .
  ops A1 P1 P2 : -> Sen .
  op Sigma : -> Sig .
  red A1 |=[Sigma] (all 'w1 : 'Bus (all 'w2 : 'Bus P1 => P2)) .
  ***> should be: A1 and P1 |=[Sigma]['w1 : 'Bus]['w2 : 'Bus] P2
close

Of course, it works; OBJ does just three rewrites, each an application of a proof rule. This reduction justifies the proof score used in the example of Section 8.4.2. □

Example 8.4.2 We can use META to plan a proof that the intersection of two transitive relations is transitive.
Our proof task has the form

(∀X) P1 ⇒ Q1, (∀X) P2 ⇒ Q2 ⊨_Σ (∀X) P12 ⇒ Q12

where X has variables x, y, z of sort Elt, and where the first clause expresses transitivity of a relation R1, with

P1 = (x R1 y) ∧ (y R1 z)
Q1 = x R1 z,

the second clause expresses transitivity of R2, and the third expresses transitivity of their intersection, which is defined to be R1 ∧ R2. This definition justifies adding the two lemmas

P12 = P1 ∧ P2
Q12 = Q1 ∧ Q2.

Now we can write the proof task in OBJ, and reduce it to get a proof plan:

open .
  op all-X_ : Sen -> Sen .
  op Sigma : -> Sig .
  vars-of .
  eq all-X A = (all 'x : 'Elt (all 'y : 'Elt (all 'z : 'Elt A))) .
  ops P1 P2 P12 Q1 Q2 Q12 : -> Sen .
  eq P12 = P1 and P2 .
  eq Q12 = Q1 and Q2 .
  red ((all-X (P1 => Q1)) and (all-X (P2 => Q2))) |=[Sigma] (all-X (P12 => Q12)) .
close

OBJ3 does 10 rewrites and produces a rather large term, suggesting that setting up this proof is not completely trivial. □

Exercise 8.4.1 Execute the reduction in the example above in OBJ3 and interpret the result. Does it make sense? What does it say? Now use OBJ to actually do the proof that has been planned, and interpret the results. □

Exercise 8.4.2 In a way similar to the above example and exercise:
(a) Use OBJ to plan and carry out a proof that the union of two symmetric relations (on the same set) is symmetric.
(b) Use OBJ to plan and carry out a proof that the intersection of two equivalence relations (on the same set) is an equivalence relation. □

The transformation corresponding to modus ponens (R6 on page 261) is

T6. A ⊨_Φ P ⇝ A ⊨_Φ Q and A ⊨_Φ Q ⇒ P,

which is not a rewrite rule, because its rightside contains a variable not in its leftside; hence it cannot be applied automatically by rewriting. But it is still very important for proofs.

We mentioned earlier that to use the rule T4 we need some way to prove false. Rule T7 below does this, by exhibiting a sentence Q that can be both proved and disproved.
Pure equational logic can never prove disequalities (i.e., negations of equations). But initiality gives us a way forward. If a specification is canonical, then different reduced ground terms necessarily denote distinct elements of its initial model. For example, we know that 0 ≠ 1 and false ≠ true are satisfied in the standard models, so if we can prove 0 = 1 or false = true, then we have the desired contradiction.

The second rule below, T8, justifies introducing a "lemma" Q to help prove P from A; of course, Q itself must also be valid for A. In practice, lemmas are often results about D that require induction, such as the associative and commutative laws for addition. We will see some more substantial lemmas in the proof of the next section.

T7. A ⊨_Φ false ⇝ A ⊨_Φ Q and A ⊨_Φ ¬Q.
T8. A ⊨_Φ P ⇝ A ⊨_Φ Q and A, Q ⊨_Φ P.

The following justifies the two rules above:

Proposition 8.4.3 Let Φ be a first-order signature, let A be a set of Φ-sentences, and let P, Q also be Φ-sentences. Then

R7. A ⊨_Φ false iff A ⊨_Φ Q and A ⊨_Φ ¬Q.
R8. A ⊨_Φ P if A ⊨_Φ Q and A, Q ⊨_Φ P.

Proof: The first assertion holds because each side is equivalent to A having no models; the second is immediate from the definitions. □

Exercise 8.4.3
(a) Show that "if" in R8 above cannot be replaced by "iff."
(b) Use R7 and R8 to derive a similar rule for the case where Q is closed. □

Another useful rule allows us to "strengthen" the axioms (or assumptions) used for a proof. Intuitively, if we can prove something from stronger (i.e., more restrictive) assumptions, then it is also valid under the weaker assumptions:

R9. A ⊨ P if A ⊨ A′ and A′ ⊨ P.

The transformational form of this rule is of course

T9. A ⊨ P ⇝ A ⊨ A′ and A′ ⊨ P.

This is not a rewrite rule because its rightside contains a variable not in its leftside.
This rule may be called the "wmawlog" rule, because it justifies the "we may assume without loss of generality" steps that occur at the beginning of many proofs, replacing the original assumptions with others that are stronger or equivalent. (Some proofs that say "we may assume without loss of generality" are actually case analyses, where a relatively easy special case is eliminated; e.g., in showing n² ≥ n, we may assume n ≥ 1.)

Exercise 8.4.4 Prove soundness of R9. □

The module META2 below expresses T6 to T9 in the same style as the META module. None of these are rewrite rules, because each has a variable on its rightside that is not on its left. Hence they must be applied "by hand," e.g., with OBJ3's apply feature. This makes sense, because creativity is required in choosing suitable Q, and this creativity can never be fully automated.

obj META2 is pr META .
  vars A A' P Q : Sen . var S : Sig .
  [modp] eq A |=[S] P = (A |=[S] Q) and (A |=[S] Q => P) .
  [contd] eq A |=[S] false = (A |=[S] Q) and (A |=[S] not Q) .
  [lemma] eq A |=[S] P = (A |=[S] Q) and (A and Q |=[S] P) .
  [astr] eq A |=[S] P = (A' |=[S] P) and (A |=[S] A') .
endo

Two special cases of T9 deserve attention. The first,

R9a. A, P ⇒ Q ⊨ R if A, P′ ⇒ Q ⊨ R and A ⊨ P′ ⇒ P,

is the special case of R9 where A, P ⇒ Q is substituted for A, where A, P′ ⇒ Q is substituted for A′, where R is substituted for P, and then the result is simplified using the rule

(⋆) A, P ⇒ Q ⊨ P′ ⇒ Q if A ⊨ P′ ⇒ P.

The resulting transformation rule is

T9a. A, P ⇒ Q ⊨ R ⇝ A, P′ ⇒ Q ⊨ R and A ⊨ P′ ⇒ P.

In case P′ is closed, we can use R3 to put T9a in the form

T9a. A, P ⇒ Q ⊨ R ⇝ A, P′ ⇒ Q ⊨ R and A, P′ ⊨ P.

Note that this rule applies in particular to the condition of a conditional equation.

Exercise 8.4.5 Prove soundness of (⋆). Show how R9a justifies strengthening the condition of a conditional equation.
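Because each of these rules consumes a user-chosen sentence, in a conventional programming language they would be functions from a task plus a chosen sentence to subtasks, rather than automatic rewrites. A minimal Python sketch, with an invented task representation (not the book's OBJ encoding):

```python
# Sketch: rules like the lemma and modus ponens transformations need
# a user-chosen sentence Q, so they are applied "by hand" as functions
# from one task to a pair of tasks. A task is (assumptions, goal),
# with assumptions a frozenset; this representation is illustrative.

def lemma_rule(task, q):
    """Lemma rule: A |= P  ~>  A |= Q  and  A,Q |= P, for chosen Q."""
    A, p = task
    return (A, q), (A | {q}, p)

def modus_ponens_rule(task, q):
    """Modus ponens: A |= P  ~>  A |= Q  and  A |= Q => P."""
    A, p = task
    return (A, q), (A, ('imp', q, p))

t1, t2 = lemma_rule((frozenset({'ax'}), 'goal'), 'assoc')
# t1 asks us to prove the lemma, t2 lets us assume it:
# t1 == (frozenset({'ax'}), 'assoc')
# t2 == (frozenset({'ax', 'assoc'}), 'goal')
```

The point mirrored here is that no rewriting engine can pick q: that choice is exactly the creative step that cannot be automated.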
□

For our second special case of T9, recall from Corollary 8.3.23 that if Y′ ⊆ Y ⊆ X are variable sets and θ : X → T_Σ(X) is a substitution such that θ_(X−(Y−Y′)) = θ (i.e., θ is non-identity at most outside Y − Y′), then A ⊨ (∀Y)P implies A ⊨ (∀Y′)θP. From this and T9 we get

T9b. A, (∀Y)P ⊨ Q ⇝ A, (∀Y′)θP ⊨ Q,

with Y, Y′ and θ as above.

The first nine rules below are, like the attributes declared for and and or, simple facts about the extended Boolean connectives; the next three rules help us conclude proofs, and the last rule lets us do modus ponens on the leftside of the turnstile. These rules are often useful in simplifying proof plans; it can be shown that applying them never prevents a proof from being found if one exists.

obj META3 is pr META2 .
  vars A P Q R : Sen . var S : Sig .
  eq A and A = A .
  eq A and true = A .
  eq A and false = false .
  eq A and not A = false .
  eq A or false = A .
  eq A or true = true .
  eq A => true = true .
  eq A => false = not A .
  eq not not A = A .
  eq P |=[S] P = true .
Proof: This follows by induction on the length of rule application sequences,using soundness of the individual rules, as stated in assertions R R (cid:2) In fact, we can set things up so that the Boolean combination evaluatesto true iff each atom is true, by using the rules T a and T b instead of T 2. Then an OBJ proof score based on this proof plan will succeed iffeach OBJ evaluation is true .The rule below says that if we can reduce the two sides of an equa- tion to the same thing, then the equation is true: TRW . A (cid:238) Φ ( ∀ X) t = t (cid:48) (cid:45) → t ↓ Φ (X),A R t (cid:48) , where the notation t ↓ Φ (X),A R t (cid:48) means that the terms t, t (cid:48) can be rewrit-ten to the same term using the set A R of rewrite rules of A . Althoughpossible, it is not worthwhile expressing this rule in our META frame-work, because this would require specifying equations, rewriting, etc.in OBJ. Instead, we just note that all atomic clauses should be passed onelsewhere for evaluation, after proof planning is completed. When theinstitution is FOLQ/D , these will all be equations, and rule TRW can beused; competition techniques are also possible (see Chapter 12). More roof Planning interestingly, there are decision procedures for atoms over certain spe-cial domains, e.g., Presburger arithmetic. Example 8.4.6 We explore some ways that things can go wrong in proofs; fail-ures are unpredictable, irregular, and very common. We first try to planthe easy part of the proof of Exercise 8.2.6, that if a relation R satisfiesthe equations cq X R* Y = true if X R Y .cq X R* Z = true if X R* Y and Y R* Z . then it also satisfies the equation cq X R* Z = true if X R Y and Y R* Z . 
Our proof task has the form A ∧ A (cid:238) Σ ( ∀ X) (P ⇒ P ) , and we can generate a proof score for it with the following: open META3 .op all-X_ : Sen -> Sen .ops A1 A2 P1 P2 : -> Sen .op Phi : -> Sig .var A : Sen .eq all-X A = (all ’x : ’Elt (all ’y : ’Elt (all ’z : ’Elt A))).red (A1 and A2) |=[Phi] (all-X (P1 => P2)) .close which yields the proof plan A1 and A2 and P1 |=[Phi] [’x : ’Elt][’y : ’Elt][’z : ’Elt] P2 . But if we translate this into an OBJ proof score, it fails because the second conditional equation ( A2 ) is not a rewrite rule, since the variable Y occurs in its condition but not in its leftside. We could get around thisby using OBJ’s apply feature for A2 ; but it seems easier to make part ofthe necessary substitution by hand (the entire substitution would haveto be entered by hand to use apply anyway), add the resulting rule,and then use reduction. Substituting y for Y in A2 is justified by T a ,yielding the first equation in the proof score below: This is a decidable fragment of arithmetic, usually taken to be so-called extendedquantifier free Presburger arithmetic for the rationals and integers, with unary minus,addition, subtraction, multiplication by constants, equality, disequality, and the rela-tions <, ≤ , ≥ and > [164]. First-Order Logic and Proof Planning open R* .vars-of .ops x y z : -> Elt .cq X R* Z = true if X R* y and y R* Z .eq x R y = true .eq y R* z = true .red x R* z .close However, this does not work either! This is because OBJ goes into aninfinite loop, applying the first equation to itself over and over, withthe substitution X = x , Z = y . We can circumvent this by preventingthe substitution of y for Z by adding to the condition of the rule. Thisis justified by T b , and yields the following rule: cq X R* Z = true if Z =/= y and X R* y and y R* Z . 
However, this still causes an infinite loop, because and does not knowthat if its first argument is false then the whole conjunction is nec-essarily false; hence we define and use a more clever conjunction, thatuses a partially lazy evaluation (see Section 5.4): open R* .vars-of .ops x y z : -> Elt .var B : Bool .op _cand_ : Bool Bool -> Bool [strat (1 0)] .eq false cand B = false .eq true cand B = B .cq X R* Z = true if Z =/= y cand (X R* y and y R* Z) .eq x R y = true .eq y R* z = true .red x R* z .close But this does not work either, since OBJ finds a different infinite loop!This one can be prevented by also prohibiting the instantiation of X by y , by adding another conjunct: cq X R* Z = true if (Z =/= y and X =/= y)cand (X R* y and y R* Z) . This (finally!) works, and the proof is done. (However, OBJ3 failsto parse the condition if either pair of the parentheses is omitted; thiscould be circumvented by declaring a non-default precedence for cand ,but it is not worth the trouble.)It was not so easy to get OBJ3 to execute this simple proof score:four different things went wrong and had to be worked around! These Of course, there were also some typographical errors during the development ofthis proof; these were caught in the usual way by the OBJ parser, and fixed by the user. roof Planning workarounds were: (a) instantiate an equation that was not a rewriterule to make it one; (b) add conditions to an equation to ensure ter-mination (this was done twice); (c) change the order of evaluation byforcing a rule to fail if one conjunct in its condition fails; and (d) addparentheses to help the parser. All of these are standard “tricks ofthe trade” for an experienced OBJ user, and I hope this example willhelp you to use them in the future. 
In particular, please note how ter-mination was handled: we did not attempt to prove that the rule setwas terminating; instead, we discovered experimentally that it was not terminating, and then we just strengthened the rules to prevent theparticular loop that we found, while preserving correctness. The sameapproach applies to the Church-Rosser property: when we failed to getthe order of evaluation we wanted, we just changed OBJ’s evaluation strategy. Our emphasis is on getting a correct proof, rather than ongetting a canonical specification.Things are worse for the other half of the proof of Exercise 8.2.6:the proof score that is automatically generated from the proof task isvery little help; some entirely new ideas are needed, and initiality mustbe used. We omit the details, but underline the moral: the proof scoreautomatically generated by our META rules is only adequate for simpleproof tasks; for slightly more difficult tasks, small modifications may besufficient, but in general, some real creativity must be supplied by theuser. Nevertheless, close adherence to the transformational approachwill guarantee correctness of the proof score, and hence of the proof,if the proof score executes correctly. (cid:2) Our approach to first-order logic has been a bit eccentric: After an al-gebraic treatment of syntax, we developed a number of properties ofsemantic entailment, and then applied them to proof planning; we havenot considered rules of deduction in the traditional sense at all.Rules of deduction are used to deduce (infer) something new fromsomething old, such that if the old is true then so is the new. Ourpurpose in this chapter has been just the opposite: to reduce something we hope is true to some new thing(s), such that if the new are true, thenso is the old. 
Hence, what we call an "elimination rule" corresponds to what is called an "introduction rule" in the standard literature, but applied backwards.

To illustrate this, let's consider the traditional rule for conjunction ("and") introduction:

   P    Q
  ---------
   P ∧ Q

This says that if we have proved P and Q, then we are entitled to say we can prove their conjunction P ∧ Q. Since it is awkward to work with tautologies, a more useful formulation is

  A ⊢ P    A ⊢ Q
  --------------
    A ⊢ P ∧ Q

where A is some set of axioms, and "⊢" indicates first-order proof; this is very much like what we did for equational logic. But in our present context, where we have the task of proving P ∧ Q (from A), the above tells us it is sufficient to prove P and Q separately: that is, we can apply the above rule backwards to eliminate the conjunction from our goal; this is why we call it "conjunction elimination."

The rule that is usually called conjunction elimination is completely different: it says that if we have proved P ∧ Q, then we are entitled to say we have proved P. This may be written:

  A ⊢ P ∧ Q
  ---------
    A ⊢ P

(Of course, there is a similar elimination rule for Q.)

The most important property that a rule of deduction can have is soundness; an unsound rule cannot guarantee correct proofs. A sequent rule is sound iff the result of replacing ⊢ by ⊨ is a valid implication. For example, soundness of the rule that we call conjunction elimination depends upon the result

  A ⊨ P and A ⊨ Q imply A ⊨ P ∧ Q.

Using the traditional rules in the forward direction gives a bottom up proof, starting with what is known, and gradually building up more. By contrast, our proof planning rules build proofs top down, starting with what we want, and working down towards what we know. This second kind of proof organization corresponds (roughly) to what is called natural deduction in the logic literature.
More generally, a rule that transforms what appears on the right of the turnstile is doing top down or backwards inference, and one that transforms what appears on the left of the turnstile is doing bottom up or forwards inference. (A proof calculus where sentences involve ⊢ is sometimes called a sequent calculus.)

It is important to notice that "real" proofs (e.g., from textbooks, research papers, lectures, blackboards, etc.) are usually neither top down nor bottom up! In fact, reading a proof written in either of these styles can be pretty tedious. A bottom up version of a complex proof would first present a long list of assumptions and low level results that are completely unrelated to each other; it would then build on top of these a layer of loosely related low level results; and so on upwards; the result would be incomprehensible until the very end (and probably even then). A top down proof would be easier to follow, but would prohibit the use of lemmas, which can make proofs much easier to follow. Thus, natural deduction is not really very natural after all. A brief discussion of the naturalness of proofs appears in Section 8.8.

Proof planning rules that are rewrite rules can and generally should be applied automatically, but other kinds of rule require more attention. Therefore only the most routine aspects of proof planning can be completely automated by rewriting; the most interesting rules, such as proof by contradiction, adding lemmas, and induction, require some (often considerable!) ingenuity.

We are now in a position to understand what OBJ "proof scores" really are, and why they work: An OBJ proof score contains the equational reductions that result from applying proof planning rules to an original proof task; such a proof score can be proved valid by appeal to the proof planning rules that produced it.

This section verifies a ripple carry adder of arbitrary width, i.e., proves that it really does add.
The verification makes heavy use of OBJ’s ab-stract data type capabilities. Figure 8.1 shows the structure of thisdevice; it is a cascade connection of n “full adders.” In addition, thedevice has two input buses and one output bus, each n bits wide, plusa final carry bit output. (For those not already familiar with hardware,all of these terms are made precise in the OBJ specifications below.)The ADT of natural numbers with addition is needed to handle thecorrectness condition, and the use of multiplication and exponentiationshould not be surprising given the nature of binary numbers; but itis interesting to notice how convenient the integers with subtractionreally are for this example. n -bit wide busses are represented by listsof Booleans of length n ; this abstract data type is defined using order-sorted algebra, so that inductive proofs have the 1-bit case for theirbase, and the operation of postpending a bit for their induction step.The result to be verified is that ( ∀ w , w ) ( | w | = | w | ) ⇒ ( sout ∗ (w , w ) + | w | ∗ cout (w , w ) = w + w ) , where w and w range over buses, where | w | and w are respectivelythe length, and the number denoted by, a bus w , where sout ∗ (w , w ) represents the content of the output bus, and where cout (w , w ) isthe carry bit. In words, this formula says that given two input buses ofthe same width, the number on the output bus together with the carry(as highest bit) equals the sum of the numbers on the input buses.Despite the somewhat complex structure of its terms, this formulais of exactly the form treated in Example 8.4.1. Hence we can prove itby introducing new constants for w and w , then assuming P , and First-Order Logic and Proof Planning FA1 FA2 FAnfalse (cid:116) >> b > b ∧ s >c > b > b ∧ s . . .c >. . . > b n > b n ∧ s n >c n Figure 8.1: A Ripple Carry Adderproving P by checking equality of the reduced forms of its two sides. 
The proofs of the lemmas are by straightforward case analysis and/or induction, and are omitted here. The proof of the main result is by induction on the width of the input buses, starting from width 1.

The first OBJ module below uses order-sorted algebra to specify the integers; thus "subsort Nat < Int" says every natural number is also an integer (see Chapter 10 for details on order-sorted algebra). A number of inductive lemmas are included in this module, such as the distributive law.

  obj INT is sorts Int Nat .
    subsort Nat < Int .
    ops 0 1 2 : -> Nat .
    op s_ : Nat -> Nat [prec 1] .
    ops (s_)(p_) : Int -> Int [prec 1] .
    op (_+_) : Nat Nat -> Nat [assoc comm prec 3] .
    op (_+_) : Int Int -> Int [assoc comm prec 3] .
    op (_*_) : Nat Nat -> Nat [assoc comm prec 2] .
    op (_*_) : Int Int -> Int [assoc comm prec 2] .
    op (_-_) : Int Int -> Int [prec 4] .
    op -_ : Int -> Int [prec 1] .
    vars I J K : Int .
    eq 1 = s 0 .  eq 2 = s 1 .
    eq s p I = I .
    eq p s I = I .
    eq I + 0 = I .
    eq I + s J = s(I + J) .
    eq I + p J = p(I + J) .
    eq I * 0 = 0 .
    eq I * s J = I * J + I .
    eq I * p J = I * J - I .
    eq I * (J + K) = I * J + I * K .
    eq - 0 = 0 .
    eq - - I = I .
    eq - s I = p - I .
    eq - p I = s - I .
    eq I - J = I + - J .
    eq I + - I = 0 .
    eq -(I + J) = - I - J .
    eq I * - J = -(I * J) .
    op 2**_ : Nat -> Nat [prec 1] .
    var N : Nat .
    eq 2** 0 = 1 .
    eq 2** s N = 2** N * 2 .
  endo

  obj BUS is sort Bus .
    extending PROPC + INT .
    subsort Prop < Bus .
    op __ : Prop Bus -> Bus .
    op |_| : Bus -> Nat .
    var B : Prop . var W : Bus .
    eq | B | = 1 .
    eq | B W | = s | W | .

In reducing the above expression to true, OBJ3 did 158 rewrites, many of which were associative-commutative (and many more rewrites were tried but failed), so one would certainly prefer to have this calculation done mechanically, rather than do it oneself by hand! This proof may be about two orders of magnitude easier using induction in OBJ than it would be in a fully manual proof system. In addition, OBJ produced a validated proof score.
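The equations s p I = I and p s I = I in the INT module give every ground term of sort Int a canonical form, either sⁿ 0 or pⁿ 0. As an illustrative sketch outside the OBJ development (the helper names are ours), the following Python code normalizes such constructor words and checks that cancellation preserves the integer denoted.

```python
def normalize(word):
    """Normalize a string of 's'/'p' constructors applied to 0, using the
    rewrite rules s p I = I and p s I = I; adjacent opposite constructors
    cancel, so the canonical form is all-'s' or all-'p'."""
    n = word.count('s') - word.count('p')
    return 's' * n if n >= 0 else 'p' * (-n)

def value(word):
    """Integer denoted by a constructor word applied to 0."""
    return word.count('s') - word.count('p')

# Normalization preserves the denoted integer, and yields canonical forms.
for w in ['', 'sp', 'ps', 'ssp', 'psps', 'ppps']:
    assert value(normalize(w)) == value(w)
assert normalize('ssp') == 's'    # s s p 0  ->  s 0
assert normalize('ppps') == 'pp'  # p p p s 0  ->  p p 0
```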
Exercise 8.4.6 Prove the three lemmas in the object LEMMAS above. □

To summarize, we have used OBJ and reduction in two different ways, at two different levels: first at the meta level, to reduce the original proof task to a form that OBJ can directly handle; and then at the "object" level, to do the actual "dirty work" of the proof. Hence, this proof was completely automatic. (Needless to say, this is not always possible.)

After seeing the kind of tricks with signatures that are used to eliminate quantifiers (rules R10a and R10b in the next subsection), a reader may worry that the truth of a goal depends on the signature. The following shows that is not the case.

Proposition 8.4.7 Let A be a set of Φ-formulae, let P be a Φ-formula, and let Φ′ ⊆ Φ be such that a sort of Φ′ is void iff it is also void in Φ. If A, P are also Φ′-formulae, then A ⊨Φ P iff A ⊨Φ′ P.

Proof: First note that if M′ = M|Φ′ for M a Φ-model, then M ⊨Φ A iff M′ ⊨Φ′ A. Also note that by the non-void assumption, any Φ′-model M′ extends to a Φ-model M* such that M′ = M*|Φ′.

Now suppose A ⊨Φ P and M′ ⊨Φ′ A; we will prove that M′ ⊨Φ′ P. Choose M* such that M′ = M*|Φ′. Then M* ⊨Φ A. Therefore M* ⊨Φ P, and hence M′ ⊨Φ′ P. For the converse, suppose A ⊨Φ′ P and M ⊨Φ A. Let M′ = M|Φ′. Then M′ ⊨Φ′ A. Therefore M′ ⊨Φ′ P and hence M ⊨Φ P. □

As long as all formulae parse, and you don't populate an old void sort or depopulate an old non-void sort, the working signature can be as large or as small as you please.
This implies that a mechanical theorem prover can effectively ignore the working signature, as our OBJ proof scores in fact do; the above result also helps to justify our frequent practice of dropping the signature subscript from ⊨.

Sentences that involve existential quantifiers can occur either on the assumption or the goal side of the symbol ⊨, and must be handled differently in each case. We begin with the assumption case. Because it can be difficult to use assumptions that contain existential quantifiers, it is useful to transform them into a more constructive form. For example, the proof task

  A, (∃ a, b : Pos)(c = a/b) ⊨Φ Q

can be transformed to

  A, c = a/b ⊨Φ(a,b : Pos) Q.

The intuition here is that since we know a, b exist, in trying to prove Q we may as well assume that a, b have been given to us in the signature; in this case a, b are called Skolem constants.

More generally, an existential quantifier may lie within the scope of one or more universal quantifiers, as in

  A, (∀ x : Nat)(∃ y : Nat) f(x, y) = 0 ⊨Φ Q.

In such a case, the choice of y must depend on the value of x, so that what is added to the signature must be a function of x. Hence the result of transforming the above should be

  A, (∀ x : Nat) f(x, y(x)) = 0 ⊨Φ(y : Nat → Nat) Q′,

where Q′ denotes the result of substituting y(x) for y in Q. Transformations of this kind are justified by Proposition 8.5.1 below.

Proposition 8.5.1 Given a set A of Φ-formulae plus Φ-formulae P, Q where Free(P) = X ∪ {y} with X = {x1 : s1, ..., xn : sn} and with y of sort s, then

R10a. A, P′ ⊨Φ(Y) Q implies A, (∃ y : s)P ⊨Φ Q,

where P′ denotes the result of substituting y(x1, ..., xn) for y in P and where Y is the declaration y : s1 ... sn → s. Moreover, under the same assumptions,

R10b.
A, ( ∀ X)P (cid:48) (cid:238) Φ (Y ) Q implies A, ( ∀ X)( ∃ y : s)P (cid:238) Φ Q . Proof: We first prove the implication R a . Let θ be the substitution thattakes y to y(x , . . . , x n ) and is the identity on other variables. Then P (cid:48) = P θ . Let M be a Φ -model satisfying ( ∃ y : s)P . Then for every a : X → M there is some a (cid:48) : X ∪ { y } → M such that a (cid:48) | X = a and a (cid:48) (P ) = true . Let M (cid:48) be a Φ (Y ) -model that extends M with a new func-tion M (cid:48) y : M s ...s n → M s defined by M (cid:48) y (m , . . . , m n ) = a (cid:48) (y) where a (cid:48) : X ∪ { y } → M is an interpretation obtained as above from a : X → M defined by a(x i ) = m i for i = , . . . , n . Note that M (cid:48) | Φ = M because M (cid:48) only adds the new operation M (cid:48) y to M . Also note that M (cid:48) (cid:238) Φ (Y ) P (cid:48) :indeed, each interpretation a : X → M (cid:48) is actually an interpretation a : X → M that takes each x i in X to an m i in M s i such that there isan a (cid:48) : X ∪ { y } → M such that a (cid:48) | X = a , M (cid:48) y (m , . . . , m n ) = a (cid:48) (y) and a (cid:48) (P ) = true . Because a(y(x , . . . , x n )) = M (cid:48) y (a(x ), . . . , a(x n )) = a (cid:48) (y) , then a(θ(z)) = a (cid:48) (id(z)) for all z ∈ Free (P ) , where id is the identity substitution; now Lemma 8.3.25 implies a(P (cid:48) ) = a (cid:48) (P ) = true . Thus M (cid:48) is a Φ (Y ) -model of P (cid:48) . Therefore if M is a Φ -model of both A and ( ∃ y : s)P , then M (cid:48) is a Φ (Y ) -model of both A and P (cid:48) . Hence M (cid:48) is a Φ (Y ) -model of Q , and thus M is a Φ -model of Q . R b now follows from R a by n applications of R b . (cid:2) The new function symbol y is called a Skolem constant or a Skolemfunction , depending on whether the quantifier ( ∀ X) is present. Thecorresponding transformation rules, called Skolemization rules, are T a. 
A, (∃ y : s)P ⊨Φ Q  ↦  A, P′ ⊨Φ(Y) Q′.

T10b. A, (∀ X)(∃ y : s)P ⊨Φ Q  ↦  A, (∀ X)P′ ⊨Φ(Y) Q′.

Note that these rules only apply to formulae on the left of the turnstile; a different approach is needed for establishing goals that contain existential quantifiers. Of course, not all existential quantifiers are so polite as to occur only within the scope of universal quantifiers. But since a first-order formula has only a finite number of quantifiers, by patiently applying T10 wherever it can be applied (outermost first), all of its existential quantifiers will eventually be eliminated. (Second-order existential quantifiers can be Skolemized by adding further arguments to the quantified function, as discussed in Chapter 9.)

Below is OBJ3 (meta-)code for Skolem constants; Skolem functions can be done in a similar way, but this would require modifying some previous meta code.

  obj META4 is pr META3 .
    vars A P Q : Sen . var X : Id . var T : BType .
    var S : Sig .
    op (exist_:_ _) : Id BType Sen -> Sen .
    eq A and (exist X : T P) |=[S] Q = A and P |=[S][X : T] Q .
  endo

To prove a goal that involves an existential quantifier, it is necessary to show that a suitable value actually exists in all models that satisfy the assumptions. In general, the suitable value will depend upon a choice of other values, because the existential quantifier occurs within the scope of some universal quantifiers in the goal. For example, the sentence (∀ x)(∃ y) x + y = 0 holds for the integers because we can take y = −x. This suggests that if we can find a term expressing the dependency of the existential variable on the universal variables, then we can prove our goal. This proof method is supported by the following:

Proposition 8.5.2 Given a Φ-sentence (∀ X)(∃ y : s)P with Free(P) = X ∪ {y}, then

R11. A ⊨Φ (∀ X)P[y ← t] implies A ⊨Φ (∀ X)(∃ y)P,

where t is some term over X of sort s.
Proof: We first assume A (cid:238) Φ ( ∀ X)P [y ← t] , which by the Theorem of Con-stants ( R M (cid:238) Φ (X) A implies M (cid:238) Φ (X) P [y ← t] for all Φ (X) -models M . Then we want to show A (cid:238) Φ ( ∀ X)( ∃ y)P , i.e., that M (cid:238) Φ (X) A implies M (cid:238) Φ (X) ( ∃ y)P , for all Φ (X) -models M . So weassume M (cid:238) Φ (X) A , and from this conclude by the assumption that M (cid:238) Φ (X) P [y ← t] , and hence E38 by Proposition 8.3.15, that M (cid:238) Φ (X) ( ∃ y)P . (cid:2) First-Order Logic and Proof Planning The corresponding transformation rule is: T . A (cid:238) Φ ( ∀ X)( ∃ y)P (cid:45) → A (cid:238) Φ ( ∀ X)P [y ← t] . This rule cannot be expressed in our current meta level formalism, be-cause terms are not specified in it. However, it is easy to express theessence of the rule, by expressing substitutions for variables as newequations, where terms will be handled the usual way in concrete ex-amples. obj META5 is pr META4 .vars A P : Sen . var y : Id . var T : BType .var S : Sig .op all-X_ : Sen -> Sen .op Eqt : Id -> Sen .eq A |=[S] (all-X (exist y : T P)) =A and Eqt(y) |=[S][y : T] (all-X P).*** where Eqt(y) is the equation y = t*** and all-X is one or more universal quantifierendo Notice that in this formulation of the rule, the status of y is changedfrom being a variable to being a constant; this is needed so that the newequation will do the substitution.Proposition 8.5.2 can be applied iteratively to eliminate nested exis-tential quantifiers. For example, a sentence of the form ( ∀ X)( ∃ y)( ∀ Z)( ∃ w) P can be transformed first to ( ∀ X ∪ Z)( ∃ w) P [y ← t] and then to ( ∀ X ∪ Z) P [y ← t][w ← t (cid:48) ] , provided the restrictions on free variables are satisfied — but of coursethese are very natural. Below is a simple example. Example 8.5.3 Suppose we want to prove NAT (cid:238) ( ∀ x, y : Int )( ∃ z, w : Int ) P and P , where P , P are linear equations. 
The goal of this proof task says that these two equations can always be solved for w, z given values for x, y. Below we use META5 to plan a proof for this goal; OBJ3's call-that feature is used to delay applying the equation that defines the universal quantifiers; without this trick, these quantifiers are turned into constants, and then the quantifier elimination rule in META5 cannot be applied.

  open META5 .
    ops INT P1 P2 : -> Sen .
    op Phi : -> Sig .
    op Int : -> BType .
    var A : Sen .
    red INT |=[Phi] (all-X (exist 'z : Int (exist 'w : Int (P1 and P2)))) .
    call-that t .
    eq all-X A = (all 'x : Int (all 'y : Int A)) .
    red t .
  close

The proof plan that results from this is

  (INT and Eqt('z) and Eqt('w)
    |=[Phi]['z : Int]['w : Int]['x : Int]['y : Int] P2) and
  (INT and Eqt('z) and Eqt('w)
    |=[Phi]['z : Int]['w : Int]['x : Int]['y : Int] P1)

For a particular instance, suppose that P1 and P2 are respectively the equations

  x − w + 2y = 3  and  x + w = 2z − 5,

for which the solutions are

  w = x + 2y − 3  and  z = x + y + 1.

Then the original goal is proved with the above proof plan by the following, provided the two reductions give true (which they do):

  open INT .
    ops x y z w : -> Int .
    eq w = x + (2 * y) - 3 .
    eq z = x + y + 1 .
    red x - w + (2 * y) == 3 .
    red x + w == (2 * z) - 5 .
  close  □

Sometimes it is easier to prove a result by breaking the proof (or a part of it) into "cases." For example, in trying to prove a sentence of the form (where n is a natural number variable)

  (∀ n) (n > 0 ⇒ Q(n)),

it might be easier to prove the following two cases separately,

  (∀ n) (n = 1 ⇒ Q(n)),
  (∀ n) (n > 1 ⇒ Q(n)),

than to prove the assertion in its original form. In general, there are many different ways to break a condition like n > 0 into cases; another is

  (∀ n) (n even ∧ n > 0 ⇒ Q(n)),
  (∀ n) (n odd ⇒ Q(n)),

and still another is

  (∀ n) (n = 1 ⇒ Q(n)),
  (∀ n) (n prime ⇒ Q(n)),
  (∀ n) (n composite ⇒ Q(n)).

Such examples suggest that "cases" are "predicates" (i.e., open formulae) P1, ...
, PN such that P1 ∨ ··· ∨ PN and P are equivalent, where the sentence to be proved has the form P ⇒ Q; and they further suggest that a proof by "case analysis" consists of proving Pi ⇒ Q for i = 1, ..., N. We make this more precise as follows:

Proposition 8.6.1 To prove A ⊨ P ⇒ Q, it suffices to give predicates P1, ..., PN such that A ⊨ P ⇒ (P1 ∨ ··· ∨ PN), and then to prove A ⊨ Pi ⇒ Q for i = 1, ..., N.

Proof: Soundness of this proof method follows from the calculation:

  A ⊨ (P1 ⇒ Q) ∧ ··· ∧ (PN ⇒ Q)
  iff A ⊨ (P1 ∨ ··· ∨ PN) ⇒ Q
  implies A ⊨ (P ⇒ Q),

using E14 for the first step and R a for the second. □

This justifies the deduction rule

R12. A ⊨ P ⇒ Q if A ⊨ Pi ⇒ Q for i = 1, ..., N and A ⊨ P ⇒ P1 ∨ ··· ∨ PN,

which in turn justifies the transformation rule

T12. A ⊨ P ⇒ Q  ↦  A ⊨ Pi ⇒ Q for i = 1, ..., N and A ⊨ P ⇒ P1 ∨ ··· ∨ PN.

Example 8.6.2 Often a case analysis succeeds because it makes use of new information. For example, we cannot prove A |≡ (∀ B) not not B = B, where A is a ground specification of the Booleans, by proving

  A ⊨ (∀ B : Bool) not not B = B,

because it is not true of all models of A. However, if we work in the institution FOLQ/D where D includes the Booleans, then we can prove

  A |≡ (∀ B : Bool) B = true ∨ B = false,

using initiality, so that it suffices to prove the two cases,

  A ⊨ not not true = true
  A ⊨ not not false = false.

In fact, exactly this kind of Boolean case analysis justifies the method of truth tables (as in Example 7.3.16). □

Some typical case analyses are given below, for x an integer, y a non-zero integer, and z a positive integer, respectively:

  (x = 0) ∨ (x > 0) ∨ (x < 0);
  (y > 0) ∨ (y < 0);
  (z = 1) ∨ (z > 1).

It is also typical that proving the validity of these disjunctions requires induction.
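The exhaustiveness obligation A ⊨ P ⇒ P1 ∨ ··· ∨ PN for the splits above can be sanity-checked on an initial segment of the numbers with a short Python script. This is only a finite test (as just noted, the actual proofs of these disjunctions require induction), and is_prime is our own helper, not part of the formal development.

```python
def is_prime(n):
    """Trial-division primality test for small n."""
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

# The three ways of splitting the condition n > 0 are exhaustive:
for n in range(1, 500):
    assert (n == 1) or (n > 1)                           # first split
    assert (n % 2 == 0 and n > 0) or (n % 2 == 1)        # even/odd split
    assert (n == 1) or is_prime(n) \
        or (n > 1 and not is_prime(n))                   # 1 / prime / composite

# Typical case analyses for integers: trichotomy.
for x in range(-50, 51):
    assert (x == 0) or (x > 0) or (x < 0)
```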
Exercise 8.6.1 Prove validity of the following case analysis: for all natural numbers n, either n = 0 or (∃ j) n = s(j). □

Example 8.6.3 The proof that √2 is irrational begins by assuming √2 = a/b with a, b positive relatively prime integers. This step can be justified using case analysis. The initial assumption is √2 = a/b with a, b positive. Let gcd(a, b) = g (where gcd denotes the greatest common divisor). Then either g = 1 or g > 1. In the first case, a, b are already relatively prime, while in the second case we have a = a′g and b = b′g with a′, b′ positive and relatively prime. Then √2 = a′/b′. So in either case the initial assumption is justified, and we can proceed with the proof. The rest of the top level proof planning can be done by OBJ:

  open META4 + NAT .
    op NAT : -> Sen .
    op NATSIG : -> Sig .
    ops 'a 'b : -> NzNat .
    op eq : Nat Nat -> Sen .
    let P = eq('a * 'a * 2, 'b * 'b) .
    red NAT |=[NATSIG] not (exist 'a : 'NzNat (exist 'b : 'NzNat P)) .
  close

The result of the reduction is as follows:

  result Meta: NAT and eq('a * 'a * 2,'b * 'b)
    |=[(NATSIG]['a : 'NzNat)]['b : 'NzNat] false

which says we should assume the negation of the goal, Skolemize 'a and 'b, and then try to derive a contradiction; of course, this leaves out the most difficult parts of the proof, which cannot be automated so easily. □

Exercise 8.6.2 Give a complete proof plan and OBJ3 proof score for showing that √2 is irrational. □

This section shows how to generalize, justify and use the familiar form of induction that checks base and step cases; this includes so-called "structural induction" but not well-founded induction and similar potentially transfinite methods. Example 8.3.14 showed that first-order specifications in general do not have initial models; therefore induction is not in general valid for proving sentences about structures defined by first-order theories.
However, induction is valid for proving sen-tences about structures that are defined (or definable) by initial algebrasemantics. Our applications usually have some underlying data valuesthat have been defined in this way, such as the integers or the nat-urals, and simple inductive results about them are usually needed inall but the simplest proofs. Using the language of Section 8.3.6, thismeans we are working in the institution FOLQ /D , where D is an ini-tial ( Ψ , E) -algebra. For this institution, satisfaction differs from that ofthe ordinary first-order logic institution FOL , in that for P a first-order Ψ -formula, ∅ (cid:238) FOLQ /D Ψ P iff D (cid:238) FOLQ Ψ P . We can now state the basicjustification for inductive reasoning as follows: R . ∅ (cid:238) FOLQ /D Ψ P iff D (cid:238) FOLQ Ψ P iff E |(cid:155) Ψ P , where we extend the notation |(cid:155) of Section 6.4 to first-order sentences,so that E |(cid:155) Ψ P means P is satisfied by an initial model of E . More gener- ally, we have A (cid:238) FOLQ /D Ψ P iff E |(cid:155) Ψ P , provided P is a Ψ -sentence and A is consistent with D . We should not neglect to mention a very simple,but still useful further rule, where P is an arbitrary first-order sentence: R . A |(cid:155) Σ P if A (cid:238) Σ P . This rule is sound for FOLQ /D because if P holds for every model of A ,it certainly holds for an initial model of A (a special case was alreadymentioned on page 181 in Chapter 6).In line with R 13, we have the following result, for which as so oftenhappens, there is a nice semantic proof: lgebraic Induction Proposition 8.7.1 If M, M (cid:48) are isomorphic Ψ -models and if P is a first-order ( Ψ , X) -formula, then M (cid:238) Ψ P iff M (cid:48) (cid:238) Ψ P . Proof: Let ψ : M → M (cid:48) be a Ψ -isomorphism with inverse ρ : M (cid:48) → M , andassume M (cid:238) Ψ P . Let θ : X → M (cid:48) be an assignment. E39 Then θ ; ρ : X → M is an assignment, so that P (θ ; ρ) = true . 
Therefore P(θ ; ρ) ; ψ = (true)ψ = true. The proof of the converse is similar. □

Formulae that can be proved by induction include sentences of the form (∀ x : v)P where x is free in P; in this case, we often write (∀ x)P(x), and call x the induction variable. Then M ⊨ (∀ x)P means M ⊨ θm(P) for all m ∈ Mv where θm is the substitution with θm(x) = m; we may write P(m) for θm(P). If the inductive goal has the form (∀ x1)(∀ x2) ... (∀ xn)P, then (by E20) we can reorder the quantifiers to put any one of x1, ..., xn first, say xi, and use it as the induction variable for proving (∀ X − {xi})P where X = {x1, x2, ..., xn}.

Usually more than one induction scheme can be used for a given initial specification; even the natural numbers have many different induction schemes. The most familiar scheme proves (∀ x)P(x) by proving P(0) and then proving that P(n) implies P(sn); let's call this Peano induction. But we could also prove (∀ x)P(x) by proving P(0) and P(s0), and then proving that P(n) implies P(ssn); let's call this even-odd induction. These two schemes correspond to two different choices of generators: the first uses 0, s_, while the second uses 0, s0, ss_. These are the first two of an infinite family of induction schemes: for each n > 0, the n-jump Peano induction scheme has the n constants 0, ..., s^(n−1)0 and the one operation s^n_.

We insist that before an induction scheme is used, it should be proved sound; this requires formalizing the notion of induction scheme for (initial models of) an equational specification (Ψ, E); the formalization will involve a signature Γ of generators defined over (Ψ, E) by some new equations that may involve some new auxiliary function symbols.
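The different generator choices can be illustrated concretely: a scheme for the naturals can only be adequate if its generators reach every number. The following Python sketch (reachable is our own illustrative helper, not part of the formal development) checks this coverage property on an initial segment for the Peano, even-odd, and 5-jump generators.

```python
def reachable(constants, steps, limit):
    """Set of naturals <= limit generated from the given constants by
    repeatedly applying the given unary step functions."""
    seen = set(c for c in constants if c <= limit)
    frontier = list(seen)
    while frontier:
        m = frontier.pop()
        for f in steps:
            v = f(m)
            if v <= limit and v not in seen:
                seen.add(v)
                frontier.append(v)
    return seen

N = 100
full = set(range(N + 1))
# Peano: constant 0, operation s (successor).
assert reachable([0], [lambda n: n + 1], N) == full
# Even-odd: constants 0 and s0, operation ss.
assert reachable([0, 1], [lambda n: n + 2], N) == full
# n-jump with n = 5: constants 0, ..., 4, operation s^5.
assert reachable(list(range(5)), [lambda n: n + 5], N) == full
```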
Definition 8.7.2 An inductive goal for a signature (V , Ψ ) is a V -sorted family P of first-order Ψ -formulae P v , each with exactly one free variable x v ofsort v ∈ V ; we may write P v (x v ) and call x v the induction variable ofsort v . Usually P v (x v ) = true for all v except one, say u ∈ V , in which case we identify P with P u and write x for x u .Given a specification T = ( Ψ , E) , an algebraic induction scheme for T is a V -sorted extension theory T (cid:48) = ( Ψ (cid:48) , E (cid:48) ) of T and a subsignature Γ of Ψ (cid:48) , written ( Ψ (cid:48) , E (cid:48) , Γ ) . Then inductive reasoning for an inductivegoal P over T using the scheme ( Ψ (cid:48) , E (cid:48) , Γ ) says: first show P v (c) for eachconstant c ∈ Γ [],v ; and then show P v (g(t , . . . , t k )) for each g ∈ Γ w,v assuming P v i (t i ) for each i = , . . . , k (where w = v . . . v k and each t i is a ground Ψ -term of sort v i ) using the equations in E (cid:48) . (cid:2) We want to use inductive reasoning to prove E |(cid:155) Ψ ( ∀ x v )P v (x v ) foreach sort v ∈ V , which we may write for short as E |(cid:155) ( ∀ x)P (x) . The First-Order Logic and Proof Planning result below follows from Theorem 6.4.4, that initial algebras have noproper subalgebras: Theorem 8.7.3 Given a specification T = ( Ψ , E) and an algebraic inductionscheme ( Ψ (cid:48) , E (cid:48) , Γ ) over T , then inductive reasoning with ( Ψ (cid:48) , E (cid:48) , Γ ) over T is sound , in the sense that if the steps of inductive reasoning for P using the scheme are carried out, then T |(cid:155) ( ∀ x)P (x) , provided Γ is inductive for ( Ψ (cid:48) , E (cid:48) ) over ( Ψ , E) , in the sense that:(I1) two ground Ψ -terms are equal under E iff they are equal under E (cid:48) ;(I2) every ground Ψ (cid:48) -term equals some Ψ -term under E (cid:48) ; and(I3) every ground Ψ -term equals some Γ -term under E (cid:48) . In this case we may also say that ( Ψ (cid:48) , E (cid:48) , Γ ) is inductive over ( Ψ , E) . 
Proof: We need to show D (cid:238) ( ∀ x v )P v (x v ) for each v ∈ V , where D is aninitial model for T . Because (I1) and (I2) imply that initial models of T and T (cid:48) are Ψ -isomorphic, by Proposition 8.7.1 it suffices to show D (cid:48) (cid:238) ( ∀ x v )P v (x v ) for each v ∈ V , where D (cid:48) is an initial model for T (cid:48) .To this end, define a V -sorted subset M of the initial algebra D (cid:48) = T Ψ (cid:48) ,E (cid:48) by M v = { [t] | D (cid:48) (cid:238) P v (t), t ∈ T Ψ ,v } , for each v ∈ V . Then (I1) and (I2) imply D (cid:48) (cid:238) ( ∀ x) P iff M = D (cid:48) . Toprove M = D (cid:48) , it suffices to show that M is a Ψ (cid:48) -algebra, because D (cid:48) hasno proper Ψ (cid:48) -subalgebras (Theorem 6.4.4). By (I3), it suffices to showthat M is a Γ -algebra. But successfully carrying out the steps of theinductive reasoning shows exactly this. (cid:2) Conditions (I1) and (I2) above say that T (cid:48) is a protecting initial exten-sion of T , i.e., that after enriching T , there are no new ground terms;and (I3) says that all ground terms are (equal to) Γ -terms. Theorem8.7.3 not only justifies the most familiar induction schemes, but alsomany others, as shown in the examples and exercises below. It is worth noticing that the above theorem applies to any reachable model of T ,not just to an initial model of T (recall that D is reachable iff the unique Ψ -homomorphism I → D is surjective, where I is an initial model of T ,i.e., iff D satisfies the “no junk” condition). Example 8.7.4 The following take ( Ψ , E) to be the usual Peano specificationfor the natural numbers, with Ψ having just the sort Nat and functionsymbols 0 , s , and with E = ∅ .1. Of course, Peano induction takes Γ = Ψ (cid:48) = Ψ with E (cid:48) = ∅ . In thiscase, inductivity is trivial. lgebraic Induction 2. Letting Γ contain 0 , s , s , with Ψ (cid:48) = Ψ ∪ Γ and E (cid:48) = { s (n) = s(s(n)) } gives the even-odd induction scheme. 
Then (I1)–(I3) are easy to prove. □

Exercise 8.7.1 Show that the n-jump Peano induction scheme is sound for each n > 0. □

Example 8.7.5 A more sophisticated algebraic induction scheme lets Γ contain 0, s0, and _×p for each prime p, with (Ψ′, E′) defining the usual binary multiplication and (to help define that) the usual binary addition. The resulting scheme, which we call prime induction, says that to prove P(n) for all natural numbers n, prove P(0), P(s0), and that P(n) implies P(n × p) for each prime p.

This scheme is inductive, because in this case, (I1) and (I2) just say that after enriching T with addition, multiplication, and primes, the natural numbers are still its ground terms, while (I3) says that every positive number is a product of primes, which is the so-called Fundamental Theorem of Arithmetic, which was first proved by Gauss. □

Example 8.7.6 Prime induction can be used to prove some pretty facts about the so-called Euler function ϕ, where ϕ(n) is the number of positive integers less than n that are relatively prime to n. One of these is the following, which is rather well known as the Euler formula,

  ϕ(n) = n · ∏_{p | n} (1 − 1/p),

where p varies over primes. We can define the Euler function inductively over the prime induction scheme as follows:

  ϕ(1) = 1
  ϕ(np) = ϕ(n)(p − 1)  if p prime and not p | n
  ϕ(np) = ϕ(n) p       if p prime and p | n.

Or alternatively, we can consider the above as three properties of ϕ that can be proved from its definition as the number of relatively prime numbers less than its argument.

The following is an OBJ proof score for the Euler formula (the specifications NAT and ListOfNat have been omitted):

  obj PRIME is pr NAT .
    op _|_ : NzNat NzNat -> Bool .
    op prime : NzNat -> Bool [memo] .
    op prime : NzNat NzNat -> Bool [memo] .
    vars N M : NzNat .
    eq N | M = gcd(N,M) == N .
    eq prime(s 0) = false .
    cq prime(N) = prime(N,p N) if N > s 0 .
First-Order Logic and Proof Planning eq prime(N, s 0) = true .cq prime(N,M) = false if M > s 0 and M | N .cq prime(N,M) = prime(N,p M) if M > s 0 and not M | N .endoobj PRIME-DIVISORS is pr PRIME + ListOfNat .op pr-div : NzNat -> List .op pr-div : NzNat Nat -> List .vars N P M : NzNat . var L : List .eq pr-div(s 0) = nil .cq pr-div(P) = P if prime(P) .cq pr-div(N * P) = pr-div(N) if P | N .cq pr-div(N * P) = P pr-div(N) if prime(P) and not P | N .cq pr-div(M) = pr-div(M,M) if not prime(M) .eq pr-div(N,s 0) = nil .cq pr-div(N,P) = P pr-div(N,p P) if P > s 0 and prime(P)and P | N .cq pr-div(N,P) = pr-div(N,p P) if P > s 0 andnot (prime(P) and P | N).ops Pi Pip : List -> NzNat .eq Pi(nil) = s 0 .eq Pi(N L) = N * Pi(L) .eq Pip(nil) = s 0 .eq Pip(N L) = (p N) * Pip(L) .endoobj EULER is pr PRIME-DIVISORS .op phi : NzNat -> NzNat .vars N P : NzNat .eq phi(s 0) = s 0 .cq phi(N * P) = phi(N) * P if prime(P) and P | N .cq phi(N * P) = phi(N) * p P if prime(P) and not P | N .endo***> Prove phi(N) * Pi(pr-div(N)) == N * Pip(pr-div(N))***> for each N:NzNat***> First show the formula for N = 1red phi(1) * Pi(pr-div(1)) == 1 * Pip(pr-div(1)) .***> and introduce the basic constants and assumptionsopenr .ops n q pq : -> NzNat .eq prime(q) = true .eq p q = pq .close***> Then suppose the property for n and prove it for n * qopenr .eq phi(n) * Pi(pr-div(n)) = n * Pip(pr-div(n)) .close lgebraic Induction ***> Case where q | nopen .eq q | n = true .red phi(n * q) * Pi(pr-div(n * q)) ==n * q * Pip(pr-div(n * q)) .close***> Case where not q | nopen .eq q | n = false .red phi(n * q) * Pi(pr-div(n * q)) ==n * q * Pip(pr-div(n * q)) .close The unary function p is the predecessor function, which is inverse to the successor function s ; the function pr-div gives the list of primedivisors of a number; the functions Pi and Pip give the products of alist of numbers, and of the list of predecessors of a list of numbers,respectively. 
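The Euler formula itself can be cross-checked numerically, independently of the OBJ proof score. In the following Python sketch (the helper names are ours), phi_count follows the counting definition of ϕ and phi_formula follows the product formula; exact rational arithmetic avoids any floating-point doubt.

```python
from math import gcd
from fractions import Fraction

def phi_count(n):
    """Euler's function by direct count of 1 <= k <= n coprime to n
    (this agrees with the usual convention phi(1) = 1)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def prime_divisors(n):
    """Set of prime divisors of n, by trial division."""
    ps, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            ps.add(d)
            n //= d
        d += 1
    if n > 1:
        ps.add(n)
    return ps

def phi_formula(n):
    """Euler formula: phi(n) = n * product over p | n of (1 - 1/p)."""
    r = Fraction(n)
    for p in prime_divisors(n):
        r *= Fraction(p - 1, p)
    return r

for n in range(1, 200):
    assert phi_formula(n) == phi_count(n)
```

Again, this is a finite check; the OBJ score above carries the actual inductive proof.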
The equation p q = pq tells OBJ that the predecessor of p is positive, an important fact that needs to be proved separately. Atotal of 148 rewrites were executed in doing this proof. (Special thanksto Grigore Ro¸su for help with this example.) (cid:2) Exercise 8.7.2 Prove soundness of the induction scheme for positive naturalnumbers that shows P ( ) and P (p) for each prime p , and then showsthat P (m) and P (n) imply P (mn) for any positive naturals m, n . (cid:2) Example 8.7.7 We now consider another somewhat sophisticated inductionscheme, this one for pairs of natural numbers. The underlying datatype is a simple extension of the naturals, with functions for pairingand unpairing, <_,_> : Nat Nat -> 2Nat .p1,p2 : 2Nat -> Nat . subject to the equations p1(< M, N >) = M .p2(< M, N >) = N . (It is interesting to compare this with the specification for pairs ofnatural numbers in Example 3.3.10.)Our induction scheme for this data type uses the functions a,b : Nat -> 2Natf,g : 2Nat -> 2Nat where First-Order Logic and Proof Planning a(M) = < M, 0 >b(N) = < 0, N >f(< M, N >) = < M + N, N >g(< M, N >) = < M, N + N > where we can think of the a, b as each providing an infinite family ofconstants.Then to prove ( ∀ p : ) P (p) for some first-order formula P , itsuffices to prove the following, where the first two are base cases andthe last two are induction steps, ( ∀ m : Nat ) P ( (cid:104) m, (cid:105) )( ∀ n : Nat ) P ( (cid:104) , n (cid:105) )( ∀ m, n : Nat ) P ( (cid:104) m, n (cid:105) ) ⇒ P ( (cid:104) m + n, n (cid:105) )( ∀ m, n : Nat ) P ( (cid:104) m, n (cid:105) ) ⇒ P ( (cid:104) m, n + m (cid:105) ) provided the scheme is inductive, which is Exercise 8.7.3 below. Noticethat in the last two steps, we can assume that both m, n are positivewithout loss of generality (because 0 is covered by the base cases).To illustrate this induction scheme, we give below an OBJ3 proofscore showing that gcd (m, n) = gcd (n, m) by considering gcd as a function → Nat . 
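The symmetry property, and the subtractive equations used to define gcd in the OBJ score that follows, can be checked on small numbers with a direct Python transcription (an illustrative, finite test only; gcd_sub is our own name).

```python
from math import gcd

def gcd_sub(m, n):
    """Subtractive gcd, transcribing the equations of the GCD module:
    gcd<0,N> = N, gcd<M,0> = M, gcd<M,M> = M,
    gcd<M,N> = gcd<M-N,N> if M > N, and gcd<M,N> = gcd<M,N-M> if N > M."""
    if m == 0:
        return n
    if n == 0:
        return m
    if m == n:
        return m
    if m > n:
        return gcd_sub(m - n, n)
    return gcd_sub(m, n - m)

# Symmetry, and agreement with the built-in gcd, on small arguments.
for m in range(40):
    for n in range(40):
        assert gcd_sub(m, n) == gcd_sub(n, m) == gcd(m, n)
```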
We first define the inequality relation, and then introduce some lemmas.

    openr INT .
      op _>_ : Int Int -> Bool .
      vars I J : Int .
      eq s I > I = true .
      eq I > p I = true .
      eq s I > s J = I > J .
      vars M N : Nat .
      eq 0 > M = false .
      eq s M > 0 = true .
      *** some lemmas
      eq I + J + (- J) = I .
      eq I + J + (- I) = J .
      cq I + J > I = true if I > 0 and J > 0 .
      cq I + J > J = true if I > 0 and J > 0 .
      cq I + J > 0 = true if I > 0 and J > 0 .
    close

    obj 2NAT is sorts 2Nat 2Int .
      pr INT .
      subsort 2Nat < 2Int .
      op <_,_> : Nat Nat -> 2Nat .
      op <_,_> : Int Int -> 2Int .
      ops p1 p2 : 2Nat -> Nat .
      ops p1 p2 : 2Int -> Int .
      vars M N : Int .
      eq p1(< M, N >) = M .
      eq p2(< M, N >) = N .
    endo

    obj GCD is pr 2NAT .
      op gcd_ : 2Int -> Int .
      vars M N : Int .
      eq gcd < 0, N > = N .
      eq gcd < M, 0 > = M .
      eq gcd < M, M > = M .
      cq gcd < M, N > = gcd(< M - N, N >) if M > N and N > 0 .
      cq gcd < M, N > = gcd(< M, N - M >) if N > M and M > 0 .
    endo

    openr GCD .
      ops m n : -> Nat .
      *** base cases:
      red gcd < m,0 > == gcd < 0,m > .
      red gcd < 0,n > == gcd < n,0 > .
      *** for the induction steps:
      eq m > 0 = true .
      eq n > 0 = true .
    close
    *** induction step computations:
    open .
      eq gcd < m, n > = gcd < n, m > . *** induction hypothesis
      red gcd < m, m + n > == gcd < m + n, m > .
    close
    open .
      eq gcd < n, m > = gcd < m, n > . *** induction hypothesis
      red gcd < m + n, n > == gcd < n, m + n > .
    close

These computations require a total of 108 rewrites, many of which check the conditions of equations. □

Exercise 8.7.3 This involves the material introduced in Example 8.7.7.

(a) Show that the scheme of this example is inductive.

(b) Prove that the equation < p1(P), p2(P) > = P holds in any initial model of 2NAT, and use this to conclude that the universal quantification (∀ P : 2Nat) is equivalent to (∀ M, N : Nat). □

Notice that in proving that a sentence P holds for some constructor term t, we are entitled to assume P(t′) for every subterm t′ of t. This is because when t = σ(t₁, …, tₙ), we assume P(t₁), …, P(tₙ), which in turn were proved assuming that P holds for the top-level subterms of t₁, …, tₙ, etc. Some proofs require these additional assumptions. In the case of the usual Peano induction for the natural numbers, the richer induction principle which includes these additional assumptions is called strong induction, or complete induction, or course-of-values induction, and it means that in proving P(n), we can assume P(k) for all k < n. We will use the same names for the corresponding enrichment of our much more general notion of induction. Below is a simple example for the natural numbers:

Example 8.7.8 We define a sequence f of natural numbers by f(0) = 0, f(1) = 0, and f(n + 1) = 3 * f(div2(n)) + 2 for n ≥ 1. The following OBJ proof score shows that f is always even. The module SEQ first defines the auxiliary functions even and div2, which respectively tell if a number is even, and give its quotient when divided by 2. Next four lemmas are stated, a constant n is introduced to eliminate the universal quantifier from the formula to be proved, which is

    (∀ N) even(f(N)) = true ,

and then n is assumed to be at least 2 (we also need to assume it is at least 1, which is easier than introducing another lemma which will deduce that fact). Finally the strong induction hypothesis is stated, the base cases are checked, and then the inductive step is checked, which requires 50 rewrites to get true.
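The evenness claim itself can be checked numerically before giving the score; here is a quick Python sketch of our own (separate from the OBJ development) of the same recursive definition:

```python
def f(n):
    # f(0) = 0, f(1) = 0, f(n + 1) = 3 * f(div2(n)) + 2 for n >= 1
    if n <= 1:
        return 0
    return 3 * f((n - 1) // 2) + 2

# f(n) is even for every n we try; the proof score below shows it for all n
assert all(f(n) % 2 == 0 for n in range(1000))
```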
    obj SEQ is pr NAT .
      op even_ : Nat -> Bool .
      var N : Nat .
      eq even 0 = true .
      eq even s 0 = false .
      eq even s s N = even N .
      op div2_ : Nat -> Nat .
      eq div2 0 = 0 .
      eq div2 s 0 = 0 .
      eq div2 s s N = s div2 N .
      op f_ : Nat -> Nat .
      eq f 0 = 0 .
      eq f s 0 = 0 .
      cq f s N = 3 * f(div2 N) + 2 if N > 0 .
    endo

    openr SEQ .
      vars N M : Nat .
      cq s N > M = true if N > M .
      cq even(N + M) = true if even N and even M .
      cq even(N * M) = true if even M .
      cq N > div2 N = true if N > 0 .
      op n : -> Nat .
      eq n > s 0 = true .
      eq n > 0 = true .
      cq even f N = true if s n > N .
    close
    red even f 0 .
    red even f 1 .
    red even f s n .

□

Exercise 8.7.4 Write a specification for finite sets of naturals having the constructors ∅ and add : Nat Set → Set. Now define union, and prove it is associative, commutative, and idempotent. This is a nice example of induction over a specification that is not anarchic. E41 □

Exercise 8.7.5 (⋆) Explore the idea that if specifications (Ψ, E) and (Ψ′, E′) are equivalent in the (loose) sense of Section 4.10, then they are ground equivalent, in that they have the same initial models (after reducing to a common signature); therefore either one can be used as an induction scheme for the other. Going even further down this road, it might be interesting to consider the two ways of specifying pairs of natural numbers that are given in Examples 3.3.10 and 8.7.3. □

Literature

Many mathematicians, especially logicians, would say that first-order logic is the most important of all logics, because it is the foundation for set theory, and hence for all of mathematics. It is certainly one of the most intensively studied of all mathematical systems, and from the mid twentieth century has been considered the most classic of all logics.
Introductory textbooks on logic include [163, 133] and [48]; there are many more textbooks, and a truly enormous literature of advanced texts and papers.

Most logic texts emphasize the completeness of some set of rules; we have taken a different, some would say eccentric, approach, by developing and using whatever rules we need, subject to proving their soundness based on satisfaction; in this sense, we consider logic to be a kind of open system (see the discussion in Section 1.3). In any case, pure first-order logic is not sufficient for our applications, because of our need for built-in data types, induction, and (in the next chapter) second-order logic.

Many mathematicians would also say that the rules of inference of first-order logic are "self evident" logical truths. But this has been challenged by the so-called intuitionism of Brouwer and others, as well as some even more radical approaches, such as Martin-Löf type theory. In particular, the rules of double negation (P5) and proof by contradiction (R4) have been questioned. However, these challenges have less force for reasoning about relatively finitistic applications like numbers, lists, and circuits, which are our main interest in this book.

The material in Section 8.3 builds on the algebraic exposition of first-order logic given in [67]. There are many works on algebraic approaches to logic; some early books are by Paul Halmos [96], Roger Lyndon [125], and Helena Rasiowa [155]; some of the earliest work in this area was done in Poland, and Rasiowa was one of the pioneers. Halmos and Givant [97] give a nice introductory treatment. More advanced approaches involve cylindric algebra, category theory, sheaves, and topoi [100, 44].

Horn clauses are named after Alfred Horn, who for many years was professor of logic at UCLA.
Horn clauses are the basis for the syntax of so-called "logic programming" languages, such as Prolog [34]; however, most of the syntax of Prolog does not correspond to Horn clause logic, and the part of its syntax that does correspond has a semantics that differs from the model theory of Horn clause logic.

Our definition of well-formed formulae follows the "initial algebra semantics" advocated by [52] and [89] in using the freeness of certain syntactic algebras. We consider this to be both simpler and clearer than the approaches usually taken in logic.

The semantic definition of truth for first-order logic is originally due to Tarski [175]. This important conceptual advance ushered in the era of so-called "model theory" in mathematical logic, and is the original source for the emphasis on semantics in this book.

The proof of Theorem 8.2.6 is due to Diaconescu, and follows [41]. Those familiar with logic programming may be interested to note that the initial H-model T_H is what is usually called a "Herbrand Universe" in that field.

Properties (a) and (b) of Exercise 8.1.3 show that Φ-models with Φ-morphisms form (what is called) a category; then property (c) follows automatically, along with many other useful properties. Category theory also suggests that for some structures bijective morphisms may not be isomorphisms, because bijectivity cannot even be stated for abstract categories (see (d) of Exercise 8.1.3). Such "facts for free" and hints about generalizability motivate the study of category theory, and some basics are given in Chapter 12, along with a few relevant references.

Institutions [67] axiomatize the Tarskian model-theoretic formulation of mathematical logic, by axiomatizing the satisfaction relation between syntax and semantics, indexed by the ambient signature; the main axiom is called the satisfaction condition, a special case of which was treated in Section 4.10.
The theory of institutions is developed to some extent in Chapter 14; this theory has been used for many computing science applications, including specification and modularization. The Eqlog system [79, 42] implements the institution HCLEQ; because of its use of general equality, Eqlog goes well beyond what standard logic programming languages like Prolog provide, though at some cost in efficiency. The institution FOLQ/D is closely related to the hidden algebra institution developed in Chapter 13, which was designed to handle dynamic situations, such as the sequential (i.e., state dependent, time varying) circuits that are discussed in Chapter 9.

Proof planning is a venerable topic in Artificial Intelligence, and there is nothing especially novel about our treatment here, except that we have been very precise about the institution(s) involved, have put a strong emphasis on equational reasoning, and have accepted the need for human involvement. Inference and proof planning are not equational deduction, because not all rules are reversible; instead they are rewrite rules. If a more logical formulation is really desired, then Meseguer's rewriting logic [134] is applicable.

The formulation of the Substitution Theorem (Theorem 8.3.22) appears to be new. Special thanks to Grigore Roşu for help with its proof, especially Proposition 8.3.21, on which it rests. Skolem constants and functions were introduced by the Norwegian logician Thoralf Skolem in his studies of first-order proof theory in the late 1920s.

The material on algebraic induction in Section 8.7 appears to be new. In general, mathematicians have not seen much value in formalizing exotic variants of ordinary induction, though they have done considerable work in formalizing variants of transfinite induction. On the other hand, computer scientists have been very concerned with inductive techniques for complex data structures; Burstall's work on structural induction pioneered this important area [21].
Some related work has been done in computer science under the names of "cover set" and "test set"; for some recent work, with citations of older literature, see [17, 18]; this work concerns semi-automated inductive proofs of equations for the one-sorted case, largely restricted to anarchic constructors. This material, like much else in this book, is also more general than most of the existing literature in that it treats the overloaded many-sorted case.

Humans have a deep-seated desire for the phenomena that they experience to appear coherent, e.g., through some kind of causal explanation. In particular, we want to know why a particular step is taken in a proof, not just that it happens to work. Unfortunately, formal proofs usually leave out the motivations for their steps. However, a proof should be more fun (or at least, relatively easier) to read if it is organized to tell a story, explaining how various difficulties arise and are overcome. Thus it seems likely that much can be learned about how to effectively present proofs by studying narratives, particularly oral stories as they are actually told, and perhaps also movies. References on narratology include [118, 7, 123, 145, 124], and more information on proofs as narratives can be found in [64]. Semiotics, which is the study of signs, should also be able to help make proofs easier to understand; references on semiotics include [149, 104, 64, 83].

A Note for Lecturers: Although much of the material in this chapter looks rather technical, there are usually simple underlying intuitions that can be brought out through discussing the examples. (However the proof of Proposition 8.3.21 in Appendix B really is rather technical.)

Some students are troubled by the seemingly circular use of first-order logic to reason about first-order logic.
Therefore it should be explained that informal first-order (and occasionally higher-order) logic is the language of mathematics, and that the project of this chapter is to formalize first-order logic, that is, to make of it a mathematical object, which can then be reasoned about; this formal system has much the same status as the formalizations of the natural numbers that we have been dealing with all along, but of course it is much more complex. The major purpose of this formalization, for this book, is to justify the various ways that we use to mechanize reasoning.

Second-Order Equational Logic

This chapter generalizes ordinary equational logic, which involves only universal quantification over constants, to second-order equational logic, which in addition permits universal quantification over operations. We then develop full second-order logic with the same techniques used for first-order logic in Chapter 8, and in particular, we extend first-order quantification to second-order quantification using the techniques of Section 8.3, emphasizing the special case where the only predicates are the equality predicates. This treatment of second-order quantification is a natural extension of our treatment of first-order quantification, and the algebraic techniques used are essentially the same as for the first-order case.

These generalizations are crucial for our approach to verifying sequential hardware circuits, the behavior of which changes over time, due to the effects of internal memory and/or external inputs. Although the resulting logic is much simpler than higher-order logic, it is entirely adequate for applications to hardware. These applications extend the approach of Section 7.4 with infinite sequences of Boolean values on wires, instead of single Boolean values, so as to model states, inputs, etc. that vary with (discrete, unbounded) time, as represented by the sequence of natural numbers.
The following illustrates some of what can be done in this framework:

Example 9.0.1 The fact that the union of a relation with its converse is its symmetric closure, i.e., is the smallest symmetric relation containing it, is stated by the second-order formula below, where R, S are relation variables, i.e., they have rank DD → Bool for some sort D (which you should think of as the "domain" of R and S), and x, y are ordinary variables of sort D:

    (∀ R, S) ([(∀ x, y) (S(x, y) = S(y, x)) ∧ (R(x, y) ⇒ S(x, y))]
              ⇒ [(∀ x, y) (R(x, y) ∨ R(y, x) ⇒ S(x, y))]) .

After the relevant formal definitions are presented, Example 9.1.9 will give an OBJ proof score that verifies the above formula. □

We begin our formal development with the syntax of equations:

Definition 9.0.2 A (second-order) Σ-equation is a signature X of variable symbols (disjoint from Σ) plus two (Σ ∪ X)-terms; we write such equations abstractly in the form

    (∀ X) t = t′

and concretely in forms like

    (∀ x, y, z, f, g) f(x, y, z) = g(x, y) + z ,

when X = {x, y, z, f, g} and their sorts can be inferred from their uses in the terms. This definition and notation extend to conditional equations in the usual way. □

To define satisfaction for second-order equations, we extend the notion of an assignment of values in a Σ-algebra from ground signatures to arbitrary signatures. Such an assignment on X is just an interpretation of X in M, that is, an X-algebra structure for M in addition to the Σ-algebra structure it already has. Since Σ and X are disjoint, this means that M gets the structure of a (Σ ∪ X)-algebra. Therefore the Σ-algebra M and the interpretation a : X → M in M of the variable symbols in X determine a unique (Σ ∪ X)-homomorphism a̅ : T_{Σ∪X} → M by the initiality of T_{Σ∪X}. Note that operation symbols in X are interpreted as functions on M in exactly the same way as are operation symbols in Σ.
Now we are ready for

Definition 9.0.3 A Σ-algebra M satisfies a second-order Σ-equation (∀ X) t = t′ iff for any interpretation a : X → M we have a̅(t) = a̅(t′) in M. In this case we may write

    M ⊨_Σ (∀ X) t = t′ ,

omitting some Σ subscripts for simplicity. A Σ-algebra M satisfies a set A of (first- and second-order) Σ-equations iff it satisfies each e ∈ A, and in this case we write M ⊨_Σ A. We may also say that M is a P-algebra, and write M ⊨ P, where P = (Σ, A). Once again, this extends to conditional equations in the obvious way. □

It is very pleasing that this is such a straightforward generalization of first-order equations and their satisfaction, obtained by using an arbitrary signature instead of a ground signature; in particular, note that initiality plays exactly the same role here as it did in the first-order case. Perhaps surprisingly, specifications with second-order equations have initial models (see Theorem 9.1.10 below); the proof is nearly the same as for the first-order case (Theorem 6.1.15), but is deferred to the next section because it requires a result given there. The following illustrates the satisfaction of second-order equations, and also shows how easy it is to write second-order equations that identify all elements of any model that satisfies them:

Example 9.0.4 Many simple second-order equations only have trivial models. For example, given a one-sorted signature, the equation

    (∀ f, x, y) f(x, y) = f(y, x)

implies (∀ x, y) x = y, by taking the function f(x, y) = x. Therefore any model satisfying this equation must have just one element. Some other simple equations that have the same effect are (∀ f, x) f(x) = a, where a is a constant, and (∀ f, g, x) f(x) = g(x).
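This collapsing effect can be checked exhaustively for small carriers. The following Python sketch (ours, purely illustrative) decides satisfaction of the first equation above by quantifying f over all binary operations on a finite set, which is feasible only for very small sets:

```python
from itertools import product

def satisfies(M):
    """Does the carrier M satisfy (forall f, x, y) f(x, y) = f(y, x)?"""
    pairs = list(product(M, repeat=2))
    # enumerate every binary operation f : M x M -> M as a finite table
    for values in product(M, repeat=len(pairs)):
        f = dict(zip(pairs, values))
        if any(f[(x, y)] != f[(y, x)] for x in M for y in M):
            return False   # this interpretation of f is a counterexample
    return True

assert satisfies({0})          # the one-point algebra satisfies the equation
assert not satisfies({0, 1})   # any carrier with two elements does not
```

The failing witness for {0, 1} is exactly the projection f(x, y) = x used in the argument above.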
□

Theorem of Constants

The development of first-order equational logic in Chapters 3 and 4 defined variables to be new constants, and used the Theorem of Constants (Theorem 3.3.11) to justify proving equations with variables by regarding the variables as constants, so that ground term deduction could be used; recall that this was necessary for applications such as inductive proofs. We now extend it to our second-order setting, where it will play the same role:

Theorem 9.1.1 (Theorem of Constants) Given disjoint signatures Σ and X, given a set A of Σ-equations, and given t, t′ ∈ T_{Σ∪X}, then

    A ⊨_Σ (∀ X) t = t′   iff   A ⊨_{Σ∪X} (∀∅) t = t′ .

Proof: Each condition is equivalent to the condition that a̅(t) = a̅(t′) for every (Σ ∪ X)-algebra M satisfying A and every a : X → M, where a̅ : T_{Σ∪X} → M is the unique homomorphism. □

This is exactly the same proof that we gave for the first-order case; as before, its simplicity arises from using satisfaction and initiality, rather than using some set of rules of deduction. This and the Completeness Theorem for first-order equational logic give the following:

Corollary 9.1.2 Given disjoint signatures Σ and X, given a set A of Σ-equations, and given t, t′ ∈ T_{Σ∪X}, then

    A ⊨_Σ (∀ X) t = t′   iff   A ⊢_{Σ∪X} (∀∅) t = t′ . □

Results below on expanding and contracting signatures, which generalize results in Section 4.7, use the following:

Definition 9.1.3 A signature Ψ is non-void over another signature Σ iff Ψ is disjoint from Σ and (T_{Σ∪Ψ})_s is non-empty for every sort s of Ψ. Similarly, a signature Ψ is non-void relative to a subsignature Φ over another signature Σ if and only if Ψ is disjoint from Σ, and (T_{Σ∪Ψ})_s is non-empty for every sort s of Ψ iff (T_{Σ∪Φ})_s is non-empty for every sort s of Φ.
□

Fact 9.1.4 If Φ ⊆ Ψ, if Φ is non-void over Σ, and if Ψ is non-void relative to Φ over Σ, then Ψ is non-void over Σ.

Proof: By the relative non-void hypothesis, (T_{Σ∪Ψ})_s ≠ ∅ for every sort s of Ψ iff (T_{Σ∪Φ})_s ≠ ∅ for every sort s of Φ, and the latter is true by the non-voidness of Φ. Therefore the former is also true. □

Proposition 9.1.5 If (∀ Φ) t = t′ is a (second-order) Σ-equation and if Φ ⊆ Ψ with Ψ non-void relative to Φ over Σ, then (∀ Ψ) t = t′ is also a Σ-equation, and for every Σ-algebra M,

    M ⊨_Σ (∀ Φ) t = t′   iff   M ⊨_Σ (∀ Ψ) t = t′ .

Proof: Let a : Φ → M be such that a̅(t) = a̅(t′). Then Φ must be non-void over Σ, and hence so is Ψ, by the non-voidness of Ψ relative to Φ. Therefore we can extend a to b : Ψ → M such that b̅(t) = b̅(t′), by choosing interpretations for operations in Ψ − Φ. On the other hand, if there are no such interpretations a, then Ψ must be void, so that Φ is also void, by the non-voidness of Ψ relative to Φ, and hence there are no such interpretations b. Therefore the first condition implies the second.

Conversely, if b : Ψ → M is such that b̅(t) = b̅(t′), then Ψ is non-void over Σ, so we can restrict b to a : Φ → M such that a̅(t) = a̅(t′). On the other hand, if there is no such b, then Ψ must be void, and then relative non-voidness implies Φ is void too, so there is also no such a. □

The above result justifies adding extra constant and operation symbols to a proof score under appropriate conditions. Necessity of the non-voidness hypothesis is shown by the specification given in Example 4.3.8, but the following is also interesting:

Exercise 9.1.1 Give signatures Σ, Φ, Ψ plus a Σ-model M and a Σ-sentence (∀ Φ) t = t′ such that Φ ⊆ Ψ, where Ψ adds only non-constant operations to Φ, and M satisfies (∀ Φ) t = t′ but does not satisfy (∀ Ψ) t = t′.
□

The Theorem of Constants also generalizes in a way similar to that of Proposition 9.1.5:

Corollary 9.1.6 Given a (second-order) Σ-equation (∀ Φ) t = t′, a signature Δ disjoint from both Φ and Σ, and a set A of Σ-equations, let Ψ = Φ ∪ Δ. Then

    A ⊨_Σ (∀ Ψ) t = t′   iff   A ⊨_{Σ∪Φ} (∀ Δ) t = t′ ,

provided Ψ is non-void relative to Φ over Σ.

Proof: If M ⊨_Σ A then M ⊨_Σ (∀ Φ) t = t′ iff M ⊨_Σ (∀ Ψ) t = t′ by Proposition 9.1.5, so that

    A ⊨_Σ (∀ Φ) t = t′  iff  A ⊨_Σ (∀ Ψ) t = t′  iff  A ⊨_{Σ∪Φ} (∀ Δ) t = t′ ,

the last step by Theorem 9.1.1. □

(Theorem 9.1.1 is the special case where Δ = ∅.)

Theorem 9.1.1 also justifies a key rule of deduction for second-order equational logic, generalizing the first-order universal quantifier elimination rule (and the transformation T5 of Chapter 8) to the second-order case. Let ⊢ denote the syntactic derivability relation defined by some complete set R of rules for first-order equations, and let ⊢′ be defined by R plus the following new rule, which we will call dropping:

    (D)  A ⊢_{Σ∪Φ} (∀∅) t = t′  implies  A ⊢′_Σ (∀ Φ) t = t′ .

Fact 9.1.7 The rule (D) is sound.

Proof: If A ⊢_{Σ∪Φ} (∀∅) t = t′, then A ⊨_Σ (∀ Φ) t = t′ by Corollary 9.1.2, so it is sound to infer (∀ Φ) t = t′. □

We also have the following:

Theorem 9.1.8 (Completeness) If R is a set of rules of deduction for first-order equational logic, defining a relation ⊢ that is complete, then R′ = R ∪ {(D)} is complete for unconditional second-order equations, in the sense that it defines a relation ⊢′ such that

    A ⊢′_Σ (∀ Φ) t = t′   iff   A ⊨_Σ (∀ Φ) t = t′ ,

for any signature Σ and any set A of first-order Σ-equations.
Proof: The direct implication is soundness of (D), which is Fact 9.1.7, plus soundness of ⊢. For the converse, if A ⊨_Σ (∀ Φ) t = t′, then Corollary 9.1.2 gives A ⊢_{Σ∪Φ} (∀∅) t = t′, and then (D) gives A ⊢′_Σ (∀ Φ) t = t′. □

This result supports deducing a single second-order equation from a set of first-order equations; it does not provide a complete inference system for second-order equational logic, which would instead infer second-order equations from other second-order equations. However, when all sorts involved are non-void, Corollary 9.1.6 allows dropping second-order equations to first-order equations, which could then be used in proofs, although in a limited way, because we cannot substitute for the second-order variables that have been dropped. Nevertheless, the inferences that are supported by the above result are all we need for our applications to the verification of sequential circuits, and many other applications, such as the following:

Example 9.1.9 We use the machinery developed above to prove the formula of Example 9.0.1. As usual, relations are translated into Boolean-valued functions. The quantifiers for R and S may be considered eliminated by their declarations; note that commutativity of S is given as an attribute in its declaration, instead of an equation. The main implication in the formula is eliminated by applying the rule (D). The quantifiers for x and y are eliminated in the usual way, as are the implications within their two scopes. The case split in the proof is justified by applying the disjunction elimination rule (T2 of Chapter 8) to the formula R(x, y) ∨ R(y, x), and then translating the disjuncts to equations.

    th SETUP is sort D .
      op R : D D -> Bool .
      op S : D D -> Bool [comm] .
      vars X Y : D .
      cq S(X,Y) = true if R(X,Y) .
      ops x y : -> D .
    endth
    open . *** first case
      eq R(x,y) = true .
      red S(x,y) .
    close
    open . *** second case
      eq R(y,x) = true .
      red S(x,y) .
    close

As expected, both reductions give true. Of course, this is a very simple example. □

Exercise 9.1.2 Extend Example 8.2.7 by proving that for any relation R, its transitive closure R* as defined there is the least transitive relation containing R. (To do this in OBJ, some ingenuity will be needed in handling the equation for transitivity.) □

Theorem 9.1.10 (Initiality) Given a set A of Σ-equations, possibly conditional, which may be either first- or second-order, let ≡ be the Σ-congruence on T_Σ generated by the relation R having the components

    R_s = { ⟨t, t′⟩ | A ⊢ (∀∅) t = t′, where t, t′ are of sort s } .

Then T_Σ/≡, denoted T_{Σ,A}, is an initial (Σ, A)-algebra.

Proof: E42 Given any (Σ, A)-algebra M, let v : T_Σ → M be the unique homomorphism, and notice that R ⊆ ker(v), because M ⊨ A implies M ⊨ (∀∅) t = t′ for every ⟨t, t′⟩ ∈ R by Theorem 9.1.8. Now let v = q;u with u : T_{Σ,A} → M be the factorization of v given by Proposition 6.1.12; see Figure 6.2. For uniqueness, if also u′ : T_{Σ,A} → M, then q;u′ = v by the initiality of T_Σ, and R ⊆ ker(v) because M ⊨ A. Therefore u = u′ by (2) of Proposition 6.1.12. □

Second-Order Logic

The algebraic development of first-order logic in Chapter 8 extends straightforwardly to second-order quantification. As before, we assume a fixed first-order signature Φ = (Σ, Π) with sort set S, but now we let 𝒳 be an (S* × S)-indexed set of variable symbols, disjoint from Σ and Φ, such that each sort has an infinite number of symbols. For X ⊆ 𝒳, a (Φ, X)-term is an element of T_{Σ∪X}, and the (S* × S)-indexed function Var on these terms is defined by:

0. Var_{w,s}(σ) = ∅ if σ ∈ Σ_{[],s};
1. Var_{w,s}(x) = ∅ if x ∈ X_{[],s} and w ≠ [];
2. Var_{[],s}(x) = {x} if x ∈ X_{[],s};
3. Var_{w,s}(σ(t₁, …, tₙ)) = ⋃ᵢ₌₁ⁿ Var_{w,s}(tᵢ) if n > 0 and σ ∉ X_{w,s};
4. Var_{w,s}(σ(t₁, …, tₙ)) = {σ} ∪ ⋃ᵢ₌₁ⁿ Var_{w,s}(tᵢ) if n > 0 and σ ∈ X_{w,s}.

(As before, Var can be seen as a (Σ ∪ X)-homomorphism T_{Σ∪X} → P(X).)

Then the well-formed (Φ, X)-formulae are defined just as in Definition 8.3.1, as elements of the (one-sorted) algebra WFF_X(Φ), free over the metasignature Ω, except that the universal quantification operations (∀x) will now include non-constant function symbols, and the atomic (Φ, X)-formula generators should be

    G_X = { π(t₁, …, tₙ) | π ∈ Π_{s₁…sₙ} and tᵢ ∈ (T_{Σ∪X})_{sᵢ} for i = 1, …, n } .

A Φ-formula is a (Φ, X)-formula for some X, and the functions Var and Free, giving all variables, and all free variables, of Φ-formulae are defined just as in Definition 8.3.1; the notions of bound variable, closed formula, scope, etc. are also the same.

The semantics of first-order formulae in Section 8.3.3 generalizes directly to second-order formulae, along with the results in that section. In particular, E43 Definition 8.3.2 does not need to be changed at all, except to note that X is an arbitrary signature, not just a ground signature, so that interpretations of X also need to be general; in particular, all the rules given in that section remain sound. The material on substitutions in Section 8.3.4 should be modified a bit, but we will not do so, because we don't need it for this chapter.

Figure 9.1: Series Connected Inverters

Let us denote the institution of second-order logic that results from the above by TOL, and denote the special case where the only predicates are the equality predicates by TOLQ, calling it the second-order logic of equality.
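For intuition, the clauses defining Var above can be transcribed into executable form. The following Python sketch is ours and deliberately simplified: it ignores the rank indexing (w, s) and simply collects all occurrences of symbols drawn from a set X of variable symbols, including head symbols of applications, which is the second-order twist of clause 4; the term representation is hypothetical:

```python
# Terms are pairs (symbol, [subterms]); variables and constants have an
# empty subterm list. X is the set of variable symbols. Clauses 0-2 handle
# symbols with no arguments; clauses 3-4 handle applications, adding the
# head symbol exactly when it is itself a (second-order) variable.
def var(t, X):
    sym, args = t
    occurring = {sym} if sym in X else set()
    for a in args:
        occurring |= var(a, X)
    return occurring

# f is a second-order variable applied to x and to g(y, c)
t = ('f', [('x', []), ('g', [('y', []), ('c', [])])])
assert var(t, {'f', 'x', 'y'}) == {'f', 'x', 'y'}
assert var(t, {'x'}) == {'x'}
```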
Also, as in Section 8.3.6, if we fix a signature Ψ and a Ψ-model D, then we can define the institution TOLQ/D to be TOLQ with the additional requirement that all its signatures must contain Ψ, and that all its models M must be such that their reduct M|_Ψ to Ψ is D.

Verification of Sequential Circuits

Whereas combinational circuits can be described by equations that do not involve time, sequential circuits have behaviors that vary with time, and thus require modeling wires that have time-varying values. One common approach is to model such wires as streams of Boolean values. We can still apply the method of Section 7.4 to obtain a system of equations from a circuit diagram, but the variables that model wires now take values that are functions from Nat (where the natural numbers represent moments of time) to truth values. As before, we use PROPC to represent the values on wires, rather than just BOOL, so that we can exploit its decision procedure for propositional logic. The following simple example illustrates the approach:

Example 9.3.1 (Series Connected Inverters) We prove that the series connection of two NOT gates (i.e., inverters), each with one unit delay, has the same effect as a two unit delay; see Figure 9.1. The system of equations involved here is

    f1(t + 1) = not f0(t)
    f2(t + 1) = not f1(t)

each of which is (implicitly) universally quantified over f0, f1, f2 and t, where each variable fᵢ has rank ⟨Nat, Prop⟩, and where t has sort Nat. We think of f0 as an input variable, f2 as an output variable, and f1 as an internal variable. The behavior that we expect this circuit to have is described by the equation

    f2(t + 2) = f0(t) ,

i.e., it functions as a two unit delay. Moreover, its internal behavior (at f1) is described by the first equation of the system, f1(t + 1) = not f0(t). An OBJ proof score showing that these two terms indeed solve the two inverter system will be given below.
The assertion to be verified has the form

    A ⊨_Σ (∀ Φ) r

where Σ is the union of the signatures of the objects PROPC and NAT, A is the union of their equations, Φ is the signature containing three function symbols f0, f1, f2 of rank ⟨Nat, Prop⟩, and r is of the form (e₁ ∧ e₂) ⇒ e, where

    e₁ = (∀ t) f1(s t) = not f0(t)
    e₂ = (∀ t) f2(s t) = not f1(t)
    e  = (∀ t) f2(s s t) = f0(t) .

We use the transformation rules of Section 8.4 plus R14. By R14 and T6, it suffices to prove that

    A ⊨_{Σ∪Φ} (∀∅) r ,

which by rule T1 is equivalent to

    A ∪ {e₁ ∧ e₂} ⊨_{Σ∪Φ} e ,

which by rule T6 again can be verified by proving

    A ∪ {e₁ ∧ e₂} ⊨_{Σ∪Φ∪{t}} f2(s s t) = f0(t) ,

which by rule T3 is equivalent to

    A ∪ {e₁, e₂} ⊨_{Σ∪Φ∪{t}} f2(s s t) = f0(t) ,

which is exactly what the proof score below does. This series of deductions at the meta level can be automated using the same techniques as were used in Section 8.4, but we do not give details here.

    open NAT + PROPC .
      ops (f0_)(f1_)(f2_) : Nat -> Prop [prec 9] .
      var T : Nat .
      eq f1 s T = not f0 T .
      eq f2 s T = not f1 T .
      op t : -> Nat .
      red f2 s s t iff f0 t .
    close

Notice that the output values of the inverters at time 0 are not determined by the equations given above, and do not enter into the verification. If desired, the following equations could be added

    eq f1 0 = false .
    eq f2 0 = false .

to give them fixed values, but this is not necessary. □

The sentence proved here is typical of a very large class of sequential hardware verification problems, which have the form

    A ⊨_Σ (∀ Φ) (C ⇒ e)

where A defines the abstract data types of the problem, where C is a conjunction of equations defining the circuit, where e is an equation to be proved, and where Φ may involve second-order quantification.

Figure 9.2: Parity of a Bit Stream
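The verified behavior can also be simulated concretely. The following Python sketch (ours, not part of the OBJ development) models the three wires as finite Boolean streams and checks the two unit delay on a random input; the index-0 entries stand for the undetermined initial outputs discussed above:

```python
import random

random.seed(0)
T = 50
f0 = [random.choice([False, True]) for _ in range(T + 2)]
# one unit delay per inverter; the index-0 values are arbitrary, matching
# the fact that the equations leave the outputs at time 0 undetermined
f1 = [False] + [not f0[t] for t in range(T + 1)]   # f1(t+1) = not f0(t)
f2 = [False] + [not f1[t] for t in range(T + 1)]   # f2(t+1) = not f1(t)

# the series connection acts as a two unit delay: f2(t+2) = f0(t)
assert all(f2[t + 2] == f0[t] for t in range(T))
```

Simulation checks only one finite input, of course; the proof score establishes the property for all inputs and all times at once.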
Example 9.3.2 We consider a simple circuit to compute the parity of a stream of bits, using just one T (for "toggle") type flip-flop. In Figure 9.2, f is the input stream, c is the clock stream, and y is the output stream; the clock stream just marks cycles to stabilize the flip-flop, and can be ignored for our (logical) purposes. This flip-flop satisfies the following equations,

    y(t + 1) = f(t) + y(t)
    y(0) = false

from which it follows by induction that

    (∀ f, y, t)  y(t + 1) = ∑ᵢ₌₀ᵗ f(i) .

The proof is as follows: The base case, with t = 0, is

    y(1) = f(0) + y(0) = f(0) + false = f(0) .

For the induction step, we assume the above equation, and then prove it with t + 1 in place of t, by first noting that y(t + 2) = f(t + 1) + y(t + 1), and then applying the above equation. □

Exercise 9.3.1 Prove correctness of the circuit of Example 9.3.2 using OBJ. □

It is worth remarking that any verification of a combinational circuit "lifts" to a verification of the same circuit viewed as a sequential circuit, by replacing each wire variable, whether an input i_k or a non-input p_k, by a function Nat -> Prop, e.g., in the form f_k(t); the reason is that the same proof works for the lifted system, with the same verified property holding at each instant.

Literature and Discussion

The material in this chapter is based on [59], although Proposition 9.1.5 and Theorems 9.1.10 and 9.1.8 are not there, and appear to be new, as does the exposition of second-order logic. The proofs involving universal properties of quotient and freedom seem especially elegant and simple.

It is interesting to compare our method for representing sequential circuits with the more familiar method which represents components using higher-order relations, and represents connections using existential quantification (as in the usual definition of the composition of relations) [93, 94, 25]; see also [167].
By contrast, the representation suggested here uses no relations (except equality, in an implicit way), and it represents interconnection by equality of wires. For sequential circuits, both methods represent wires as variables that range over functions, and both methods use second-order quantifiers. However, the results of this chapter show that existential quantification and higher-order relations can be avoided in favor of a simple extension of first-order equational logic by universal quantification over functions, contrary to claims made in [25]. The higher-order logic approach to hardware verification of [25] was claimed to have many benefits, including the following:

1. natural definitions of data types (using Peano style induction principles);
2. the possibility of leaving certain values undefined (such as the initial output of a delay);
3. dealing with bidirectional devices.

But all these benefits can be realized more simply using just second-order equational logic:

1. Chapter 6 showed that initial algebra semantics supports abstract data type definitions in a very natural way, and also supports the use of structural induction principles for such definitions.
2. It is very easy to leave values undefined, such as the values of inverters at time 0; conditional equations can also be used for this purpose.
3. Although it is often advantageous to exploit causality (in the form of an input/output distinction), we are not limited to that case, because equality is bidirectional.
4. Moreover, because equational logic is simpler than higher-order logic, in general its proofs are also simpler.

Of course, our method also has its limits, but fortunately, the problems of greatest interest for hardware verification fall well within its capabilities.

Although the rules and transformations of Chapter 8 were proved for first-order logic, they extend to second-order quantification.
We did not use such extended rules in this chapter, because (e.g., in Example 9.3.1) we first applied rule (D) to get rid of the second-order universal quantifiers. This is sufficient for assertions of the form A ⊫_Σ (∀Φ) r where r contains only first-order quantifiers, but it is not sufficient if r contains second-order quantifiers. Many of the rules in Chapter 8 actually hold for a wide variety of logics, as can be shown using material on institutions in Chapter 14.

It is worth mentioning that the use of a loose extension of a fixed data theory in the institution TOLQ/D is closely related to the hidden algebra institution developed in Chapter 13; this should not be surprising, because hidden algebra was designed to handle dynamic situations, of which sequential circuits are a prime example.

A Note for Lecturers: This short chapter contains some nice examples and some relatively easy theory, which should be included in a course if possible, after the relevant parts of Chapters 7 and 8 have been covered. Proposition 9.1.5 can be skipped, as can the details in Section 9.2.

Order-Sorted Algebra and Term Rewriting

There are many examples where all items of one sort are necessarily also items of some other sort. For example, every natural number is an integer, and every integer is a rational. We can write this symbolically in the form

  Natural ≤ Integer ≤ Rational ,

where Natural, Integer, and Rational are names for the sorts of entity involved. If we associate to each such name a meaning (i.e., a semantic denotation, also called an extension) which is the set of all items of that sort (e.g., the set of all integers), then the subsort relations appear as set-theoretic inclusions of the corresponding extensions. For example, if the usual extensions of Natural, Integer, and Rational are denoted N, Z, and Q, respectively, then we have N ⊆ Z ⊆ Q. Sort names like Natural and Rational are syntactic, while their extensions are semantic.
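In OBJ notation (introduced formally below), such a subsort chain is declared with subsort and overloaded operation declarations along the following lines; the object name NUMBERS and the choice of operations here are illustrative assumptions, not taken from this text:

```
obj NUMBERS is
  sorts Natural Integer Rational .
  subsorts Natural < Integer < Rational .
  *** one addition symbol, declared at each sort;
  *** the three instances must agree on shared arguments
  op _+_ : Natural Natural -> Natural .
  op _+_ : Integer Integer -> Integer .
  op _+_ : Rational Rational -> Rational .
endo
```

The requirement that the three declarations of _+_ agree wherever their arguments overlap is exactly the subsort polymorphism discussed in what follows.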
This distinction is formalized below with order-sorted signatures, which include a set of sort names with a partial order relation, and a family of operation symbols with sorted arities; an order-sorted algebra for a given signature is then an interpretation for these sort and operation names that respects the subsorts and arities. This area of mathematics is called order-sorted algebra (hereafter abbreviated OSA).

A closely related topic is overloading, which allows a single symbol to be used for several different operations. In applying an overloaded operation symbol, we may not even be aware that we are moving among various sorts and operations. For example, we can add a rational and an integer, or a natural and a rational, or two rationals; and addition operations on naturals, integers, and rationals, together with a subsort relation among them, can be defined in such a way that whichever addition is used, we always get the same result from the same arguments, provided they make sense. We may describe this by saying that + is subsort polymorphic.

However, this is only one of several ways that the term "polymorphic" is used. The term was introduced by Christopher Strachey to express the use of the same operation symbol with different meanings in programming languages. He distinguished two main forms of polymorphism, which he called ad hoc and parametric. In his own words [173]:

    In ad hoc polymorphism there is no simple systematic way of determining the type of the result from the type of the arguments. There may be several rules of limited extent which reduce the number of cases, but these are themselves ad hoc both in scope and content. All the ordinary arithmetic operations and functions come into this category.
    It seems, moreover, that the automatic insertion of transfer functions by the compiling system is limited to this class.

    Parametric polymorphism is more regular, as illustrated by the following example: Suppose f is a function whose argument is of type α and whose result is of type β (so that the type of f might be written α → β), and that L is a list whose elements are all of type α (so that the type of L is α list). We can imagine a function, say Map, which applies f in turn to each member of L and makes a list of the results. Thus Map[f, L] will produce a β list. We would like Map to work on all types of list provided f was a suitable function, so that Map would have to be polymorphic. However its polymorphism is of a particularly simple parametric type which could be written (α → β, α list) → β list, where α and β stand for any types.

Strachey's distinction is based on the kind of semantic relationship that holds between the different interpretations of an operation symbol, and it suggests a more detailed distinction among the different kinds of polymorphism, in which the less ad hoc the relationship is, the easier it is to do type inference, and the closer it is to parametric polymorphism:

• In strongly ad hoc polymorphism, an operation symbol has semantically unrelated uses, such as + for both integer addition and Boolean disjunction. (But even in this case, the two instances of + share the associative, commutative, and identity properties.)

• In multiple representation, the uses are related semantically, but their representations may be different. For example, in an arithmetic system we may have several representations for the number 2, in integer, decimal, and fractional notations. Polar and Cartesian coordinate representations of points in the plane are another example.
• Subsort polymorphism is where the different instances of an operation symbol are related by subset inclusion, such that the result does not depend on the instance used, as with + for natural, integer, and rational numbers. This sense is developed in this chapter.

• Parametric polymorphism, as in Strachey's Map function, appears in many higher-order functional programming languages, including ML [99, 180], Haskell [107], and Miranda [179].

OBJ supports all four kinds of polymorphism. Strongly ad hoc polymorphism is supported by signatures in which the same operation symbol has sorts that are unrelated in the subsort hierarchy. We have already explained that subsort polymorphism is inherent in the nature of OSA. Strachey's implementation of arithmetic involved "transfer functions" (which would now be called "coercions") to change the representation of numbers; but coercions are not needed for subsort polymorphic operations, because subsorts appear as subset inclusions of the data items. Also, for regular signatures (in the sense of Definition 10.2.5 below), expressions involving subsort polymorphism always have a smallest sort. OSA also accommodates coercions and multiple representation polymorphism [138], although we do not treat this topic here. Parametric polymorphism in OBJ is supported by parameterized objects, such as LIST[X], that provide higher-order capabilities in a first-order algebraic setting [61].
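A parameterized object has roughly the following shape in OBJ3; here TRIV is the standard trivial theory with one sort Elt, while the object names PLIST and NAT-PLIST, and the instantiation by a default view, are our own illustrative sketch rather than the book's own list module:

```
obj PLIST[X :: TRIV] is
  sort List .
  op nil : -> List .
  op cons : Elt List -> List .
endo

*** an instantiation, playing the role of Strachey's "α list" at α = Nat
obj NAT-PLIST is pr PLIST[NAT] . endo
```

The parameter X ranges over interpretations of the theory TRIV, which is how a first-order algebraic setting captures the parametric polymorphism of Map-like functions.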
So it seems fair to conclude that Strachey was excessively pessimistic about the amount of structure in polymorphism that is not parametric in his sense, since OSA is a simple but rich mathematical theory that is easily implemented and is far from ad hoc in the pejorative sense of being arbitrary.

This chapter first generalizes our treatment of many-sorted algebra (MSA) to OSA, with illustrative examples, and then treats retracts, which enable error handling, and order-sorted term rewriting, which provides an operational semantics; details similar to MSA are sometimes omitted.

Signature, Algebra, and Homomorphism

This section introduces and briefly illustrates the three most basic concepts of order-sorted algebra; each is a straightforward extension of the corresponding MSA concept.

Definition 10.1.1 An order-sorted signature (S, ≤, Σ) consists of

1. a many-sorted signature (S, Σ), with
2. a partial ordering ≤ on S such that the following monotonicity condition is satisfied,

  σ ∈ Σ_{w1,s1} ∩ Σ_{w2,s2} and w1 ≤ w2 imply s1 ≤ s2 . □

In OBJ notation, the sort set S and the operation set Σ are just the same for OSA signatures as for MSA signatures. The new ingredient is the partial ordering on S, which is declared by giving a set of subsort pairs, of the form S1 < S2; these can be strung together in declarations of the form S1 < S2 < ... < Sn, which abbreviates S1 < S2, S2 < S3, etc. In each case, the declaration must be preceded by the keyword subsort or subsorts, and terminated with a period (preceded by a space, as usual). The partial ordering defined on S is the least such containing the given set of pairs. This syntax is illustrated in the following:

Example 10.1.2 Below is the signature part of a specification for lists of natural numbers in a Lisp-like syntax:

  sorts Nat NeList List .
  subsorts NeList < List .
  op 0 : -> Nat .
  op s_ : Nat -> Nat .
  op nil : -> List .
  op cons : Nat List -> NeList .
  op car : NeList -> Nat .
  op cdr : NeList -> List .
Here cons is a list constructor which adds a new number at the head of a list; car and cdr are the corresponding selectors, which select the head and tail (also called the front and the rest) of non-empty lists; and nil is the empty list. However, equations are needed to express these relationships between cons and its selectors, for which see Example 10.2.19 below. □

Definition 10.1.3 Given an order-sorted signature (S, ≤, Σ), an order-sorted (S, ≤, Σ)-algebra is a many-sorted (S, Σ)-algebra M such that

1. s1 ≤ s2 implies M_{s1} ⊆ M_{s2}, and
2. σ ∈ Σ_{w1,s1} ∩ Σ_{w2,s2} and w1 ≤ w2 imply that M_σ^{w1,s1} = M_σ^{w2,s2} on M_{w1}. □

The second condition says that overloaded operations are consistent under restriction; this expresses subsort polymorphism.

Example 10.1.4 Letting Σ be the signature of Example 10.1.2, define a Σ-algebra F as follows:

  F_Nat = {0}
  F_NeList = {0}
  F_List = {0, nil}

with s(0) = 0, cons(0, L) = 0, car(L) = 0, and cdr(L) = nil. Of course, this is not the "intended" standard or initial model, which instead is defined as follows:

  L_Nat = {0, s0, ss0, ...}
  L_NeList = L_Nat^+
  L_List = L_NeList ∪ {nil}

where (-)^+ denotes the nonempty finite sequence constructor. Finally, the operations of L are defined by

  cons(N, N1 ... Nn) = N N1 ... Nn
  car(N1 ... Nn) = N1
  cdr(N1 ... Nn) = N2 ... Nn   if n > 1
  cdr(N1 ... Nn) = nil         if n = 1  □

Definition 10.1.5 Given order-sorted (S, ≤, Σ)-algebras M, M′, an order-sorted (S, ≤, Σ)-homomorphism h : M → M′ is a many-sorted Σ-homomorphism h : M → M′ such that s1 ≤ s2 implies h_{s1} = h_{s2} on M_{s1}. (This is also called the monotonicity condition.) □

Exercise 10.1.1 Adopting the notation of Example 10.1.4, show that there is a unique Σ-homomorphism L → F. In fact, F is a final Σ-algebra, and we will see later that L is an initial (Σ, E)-algebra for the most reasonable and expected equation set E.
□

Term and Equation

The main topic of this section is order-sorted equations and their satisfaction. This requires that we first treat OSA terms and substitutions. We will see that there are some slightly subtle points about parsing terms, for which we later introduce the notion of a regular signature.

Definition 10.2.1 Given an order-sorted signature Σ, the S-indexed set T_Σ of Σ-terms is defined recursively by the following:

1. Σ_{[],s} ⊆ T_{Σ,s} for s ∈ S,
2. s1 ≤ s2 implies T_{Σ,s1} ⊆ T_{Σ,s2},
3. σ ∈ Σ_{w,s} and t_i ∈ T_{Σ,s_i} for i = 1, ..., n imply σ(t1, ..., tn) ∈ T_{Σ,s}, where w = s1 ... sn and n > 0. □

Example 10.2.2 Let Σ denote the signature of Example 10.1.2 with car and cdr removed. Then the following are terms of sort List:

  nil
  cons(0, nil), cons(s0, nil), ...
  cons(0, cons(0, nil)), cons(s0, cons(0, nil)), cons(s0, cons(s0, nil)), ...
  ...

All except nil are also of sort NeList. □

Notice that conditions 1 and 3 in Definition 10.2.1 are the same as for MSA; condition 2 is needed to satisfy the first condition in Definition 10.1.3. Strictly speaking, we should have used underlined parentheses, σ(t1 ... tn), in the above, as we did for MSA terms in Definition 3.7.5. We make this indexed family of Σ-terms into a Σ-algebra as follows:

Definition 10.2.3 Given σ ∈ Σ_{w,s} with w = s1 ... sn ≠ [], define (T_Σ)_σ : (T_Σ)_w → (T_Σ)_s by (T_Σ)_σ(t1, ..., tn) = σ(t1, ..., tn); and when w = [], define (T_Σ)_σ = σ. The resulting Σ-algebra is called the Σ-term (or sometimes word) algebra, and denoted T_Σ. □

[Figure 10.1: Visualizing Regularity]

Example 10.2.4 The terms listed in Example 10.2.2, plus others hinted at there, form the carrier of sort List of the term algebra T_Σ for the signature of Example 10.1.2; all but nil are also in the carrier of sort NeList, while the carrier of sort Nat contains the usual Peano numbers. □
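Such sorts can be inspected mechanically, since OBJ3 provides a parse command that reports the parse of a term together with its sort. Here is a sketch, with the cons fragment of the signature of Example 10.1.2 wrapped in an object whose name LISTSIG is our own:

```
obj LISTSIG is
  sorts Nat NeList List .
  subsorts NeList < List .
  op 0 : -> Nat .
  op s_ : Nat -> Nat .
  op nil : -> List .
  op cons : Nat List -> NeList .
endo

parse nil .           *** sort List
parse cons(0, nil) .  *** sort NeList
```

The exact output format depends on the OBJ3 implementation, but the sorts reported should agree with the carriers described in Example 10.2.4.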
Exercise 10.2.1 Check that T_Σ of Definition 10.2.3 is an order-sorted Σ-algebra, by checking the two conditions in Definition 10.1.3. □

The next definition is motivated by wanting each Σ-term to have a unique parse of least sort. Notice that this is precisely what happens in the example arithmetic system that we discussed in the introduction to this chapter: we want each term to have the most specific possible sort; for example, the least sort of -2 should be Integer, rather than Rational, and it is not a Natural. It is natural to achieve this by requiring that the set of ranks that each overloaded operation might have (in a given context) has a least element:

Definition 10.2.5 An order-sorted signature (S, ≤, Σ) is regular iff for each σ ∈ Σ_{w1,s1} and each w0 ≤ w1 there is a unique least element in the set {(w, s) | σ ∈ Σ_{w,s} and w ≥ w0}. □

This says that the set of possible ranks for σ with arity greater than any fixed w0 has a smallest element; see Figure 10.1, in which the vertical lines indicate subsort relations. To explore the consequences of this condition, we consider some examples where it is not satisfied:

Example 10.2.6 The signature

  sorts s1 s2 s3 s4 s5 .
  subsort s1 < s3 .
  subsort s2 < s4 .
  op a : -> s1 .
  op b : -> s2 .
  op f : s1 s4 -> s5 .
  op f : s3 s2 -> s5 .

is non-regular, because the set of ranks for f with arity at least s1 s2 consists of the two tuples (s1 s4, s5) and (s3 s2, s5), neither of which is less than the other. Therefore the term f(a,b) does not have a least sort, but instead has two incompatible parses. □

Exercise 10.2.2 Show that the following is a non-regular signature:

  sorts Nat NeList List .
  subsort Nat < NeList < List .
  op 0 : -> Nat .
  op s_ : Nat -> Nat .
  op nil : -> List .
  op cons : NeList List -> NeList .
  op cons : List NeList -> NeList .

This defines a complex kind of list of natural numbers, in which lists can be elements of other lists, with non-empty lists distinguished from the empty list nil.
But its non-regularity shows that this distinction is not sufficiently careful; see Exercise 10.2.4. □

Non-regular signatures can often be made regular by adding some new subsort declarations and/or by changing the ranks of some operations.

Exercise 10.2.3 Show that adding the operation declaration f : s1 s2 -> s5 to the signature of Example 10.2.6 gives a regular signature. □

Exercise 10.2.4 Show how to modify the signature of Exercise 10.2.2 to make it regular. Hint: Add a new operation declaration to further overload cons. □

Proposition 10.2.7 If Σ is regular, then for each t ∈ T_Σ there is a least sort s ∈ S such that t ∈ T_{Σ,s}. This sort is denoted LS(t).

Proof: We proceed by induction on the depth of terms in T_Σ. If t ∈ T_Σ has depth 0, then t = σ for some σ ∈ Σ_{[],s}, and so by regularity with w0 = w1 = [], there is a least s ∈ S such that σ ∈ Σ_{[],s}; this is the least sort of σ. Now consider a well-formed term t = σ(t1 ... tn) ∈ T_{Σ,s} of depth n + 1. Then each t_i has depth ≤ n and therefore by the induction
The following example shows that if Σ is not regular, then T Σ is not necessarily initial: Exercise 10.2.5 Define an algebra M over the signature of Example 10.2.6 asfollows: M s1 = M s2 = M s3 = M s4 = { } , M s5 = { , } , M a = M b = , M s1 s4 , s5f ( , ) = , M s3 s2 , s5f ( , ) = 2. Now show that if T is the termalgebra for the signature given in Example 10.2.6, then T s = { f ( a , b ) } ,and conclude from this that T is not initial. (cid:2) However, [68] shows that initial Σ -algebras do exist even when Σ is notregular. Rather than just Σ -terms, the construction uses terms anno-tated with sort information; the notation T Σ may be used. We omitdetails, which are very similar to those for the many-sorted case (seeSection 3.2). Again as for MSA, we may sometimes write T Σ when wereally mean T Σ , and we may ignore the sort annotations, even thoughthey are necessary, because they are implicit in parsing; this is consis-tent with what is done in implementations of OBJ.As in the many-sorted case, it is convenient (but not always nec-essary) for each variable symbol to have just one sort; therefore weassume that any S -indexed set X = { X s | s ∈ S } used to provide vari-ables is such that X s and X s are disjoint whenever s ≠ s , and suchthat all symbols in X are distinct from those in Σ ; we may use theterm variable set for such indexed sets. Then as in the many-sortedcase, we define the signature Σ (X) by Σ (X) w,s = Σ w,s for w ≠ [] , and Σ (X) [],s = Σ [],s ∪ X s . We can now form the Σ (X) -term algebra T Σ (X) and view it as a Σ -algebra, which is then denoted T Σ (X) . The following isproved in Appendix B: Theorem 10.2.9 If (S, ≤ , Σ ) is regular, then T Σ (X) is a free Σ -algebra on X , inthe sense that for each Σ -algebra M and each assignment a : X → M ,there is a unique Σ -homomorphism a : T Σ (X) → M such that a(x) = a(x) for all x in X . 
□

This result also generalizes to non-regular signatures, using the annotated term algebra in place of T_Σ(X), though we omit the details. The same applies to the following, the proof of which is just the same as for Proposition 3.5.1 for MSA:

Proposition 10.2.10 Given an OSA signature Σ, a ground signature X disjoint from Σ, an OSA Σ-algebra M, and a map a : X → M, there is a unique Σ-homomorphism ā : T_Σ(X) → M which extends a, in the sense that ā_s(x) = a_s(x) for each s ∈ S and x ∈ X_s. □

Substitutions and their composition can now be defined just as in MSA, simply replacing the many-sorted term algebra T_Σ(Y) by its order-sorted counterpart:

Definition 10.2.11 A substitution of Σ-terms with variables in Y for variables in X is an arrow a : X → T_Σ(Y); the notation a : X → Y may also be used. The application of a to t ∈ T_Σ(X) is ā(t). Given substitutions a : X → T_Σ(Y) and b : Y → T_Σ(Z), their composition a ; b (as substitutions) is the S-sorted arrow a ; b̄ : X → T_Σ(Z). □

We also use Notation 3.5.3 for OSA substitutions: Given t ∈ T_Σ(X) and a : X → T_Σ(Y) with |X| = {x1, ..., xn} and a(x_i) = t_i for i = 1, ..., n, write ā(t) as t(x1 ← t1, x2 ← t2, ..., xn ← tn), omitting x_i ← t_i when t_i is x_i.

Proposition 10.2.12 OSA substitutions are sort decreasing, in that LS(θ(x)) ≤ s for any x ∈ X_s, and more generally, LS(θ̄(t)) ≤ LS(t) for any Σ-term t.

Proof: The first assertion follows because θ(x) ∈ T_Σ(Y)_s and LS(t) ≤ s for all t ∈ T_Σ(Y)_s. The second assertion can be proved by an induction similar to that used for Proposition 10.2.7, by applying condition 2 of Definition 10.1.1. □

The composition of substitutions is associative, by exactly the same proof used for the MSA case (Proposition 3.6.5, except using Proposition 10.2.10 instead of 3.5.1):

Proposition 10.2.13 Given substitutions a : W → T_Σ(X), b : X → T_Σ(Y), and c : Y → T_Σ(Z), then (a ; b) ; c = a ; (b ; c).
□

The following consequence of the proof is important for calculations in several proofs:

Corollary 10.2.14 Given substitutions a : W → T_Σ(X) and b : X → T_Σ(Y), then ā ; b̄ = (a ; b)‾ . □

Definition 10.2.15 below introduces concepts which help define the terms for which our order-sorted equational satisfaction makes sense. Their motivation is that the two terms in an equation must have sorts that are somehow connected. For example, an equation such as

  (∀n) 5n = true

really cannot make sense; on the other hand, the equation

  (∀n) 5n = i ,

where n is a natural and i is the square root of minus one, does make sense, even though it is not satisfied by the standard model of the number system. This is because 5n and i are in the same connected component of S, while 5n and true are not, where we say that s1 and s2 are in the same connected component of S iff s1 ≡ s2, where ≡ is the least equivalence relation on S that contains ≤. Here is the formalization:

Definition 10.2.15 A partial ordering (S, ≤) is filtered iff for all s1, s2 ∈ S, there is some s ∈ S such that s1 ≤ s and s2 ≤ s. A partial ordering (S, ≤) is locally filtered iff every connected component of it is filtered. An order-sorted signature (S, ≤, Σ) is locally filtered iff (S, ≤) is locally filtered, and it is a coherent signature iff it is both locally filtered and regular. A partial ordering (S, ≤) has a top iff it contains a (necessarily unique) maximum element u ∈ S, such that s ≤ u for all s ∈ S. □

Hereafter we assume that all OSA signatures are coherent unless otherwise stated. Assuming local filtration is not at all restrictive in practice, because we can always add top elements to connected components, or even to the whole partial ordering (see Exercise 10.2.6). Indeed, OBJ3 does just this, with its Universal sort.
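For example, a top can be added to a component by hand with a declaration of roughly the following form, where the object and sort names are our own illustrative choices:

```
obj TOPPED is
  sorts A B Top .
  *** Top is an upper bound for A and B, so their component is filtered
  subsorts A B < Top .
endo
```

After this declaration, any two sorts in the component have Top as a common upper bound, which is exactly the filtration condition of Definition 10.2.15.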
The need for local filtration is shown in Example 10.2.17 below, and local filtration is also used in the quotient construction of Definition 10.4.6 and in the application of that construction to order-sorted rewriting modulo equations in Section 10.7.

Exercise 10.2.6 Show that any filtered partial order is locally filtered, and that any partial order with a top is filtered. Give examples showing that the converse assertions are false. □

Definition 10.2.16 An order-sorted Σ-equation is a triple ⟨X, t1, t2⟩ where X is a variable set, and t1, t2 ∈ T_Σ(X) are such that LS(t1) and LS(t2) are in the same connected component of (S, ≤); we shall of course write (∀X) t1 = t2, or in concrete cases, things like (∀x, y, z) t1 = t2. A Σ-algebra M satisfies a Σ-equation (∀X) t1 = t2 iff for all assignments a : X → M we have ā(t1) = ā(t2).

A conditional Σ-equation is a quadruple ⟨X, t1, t2, C⟩, where ⟨X, t1, t2⟩ is a Σ-equation and C is a finite set of pairs ⟨u, v⟩ such that ⟨X, u, v⟩ is a Σ-equation; we shall write (∀X) t1 = t2 if C, or more concretely, things like (∀x, y) t1 = t2 if u1 = v1, u2 = v2. A Σ-algebra M satisfies a conditional Σ-equation (∀X) t1 = t2 if C iff for all assignments a : X → M, whenever ā(u) = ā(v) for all ⟨u, v⟩ ∈ C, then ā(t1) = ā(t2).

Finally, we say that a Σ-algebra M satisfies a set A of Σ-equations (conditional or not) iff it satisfies each one of them. In this case, we write M ⊨ A, or possibly M ⊨_Σ A. □

Although satisfaction makes sense when the set of conditions is infinite, we have restricted the definition to finite C because this is needed for both deduction and rewriting. The following gives another reason why local filtration is necessary:

Example 10.2.17 We show that without local filtration, equational satisfaction is not invariant under isomorphism.
Given the specification

  th NON-LF is
    sorts A B C .
    subsorts B < A C .
    op a : -> A .
    op b : -> B .
    op c : -> C .
    eq a = c .
  endth

let Σ denote its signature. Then the initial order-sorted Σ-algebra T_Σ has (T_Σ)_A = {a, b}, (T_Σ)_B = {b}, and (T_Σ)_C = {b, c}, and it does not satisfy the equation, whereas the Σ-isomorphic algebra A with A_A = {b, d}, A_B = {b}, and A_C = {b, d} does satisfy the equation, where both a and c are interpreted as d in A. (See Exercise 10.7.1 for some further related discussion.) □

The undesirable phenomenon of Example 10.2.17 is impossible for locally filtered signatures:

Proposition 10.2.18 If Σ is a coherent OSA signature and A, B are Σ-isomorphic algebras, then A satisfies an equation (∀X) t = t′ iff B does.

Proof: By symmetry of the isomorphism relation, it suffices to prove just one direction. So assume A satisfies the equation, let f : A → B be a Σ-isomorphism, and let β : X → B be an assignment. Then β = α ; f for some assignment α : X → A. Therefore β̄ = ᾱ ; f, so that if s ≥ LS(t), LS(t′), then

  β̄_s(t) = f_s(ᾱ_s(t)) = f_s(ᾱ_s(t′)) = β̄_s(t′)

as desired. □

Exercise 10.2.7 Generalize Proposition 10.2.18 from unconditional to conditional equations. □

Example 10.2.19 (Errors for Lists) Example 10.1.2 noted that equations are needed to give car and cdr the desired meanings. The following gives those equations in an appropriate object (for convenience, we import the natural numbers instead of defining them from scratch):

  obj LIST is sorts List NeList .
    pr NAT .
    subsorts NeList < List .
    op nil : -> List .
    op cons : Nat List -> NeList .
    op car : NeList -> Nat .
    op cdr : NeList -> List .
    var L : List . var N : Nat .
    eq car(cons(N,L)) = N .
    eq cdr(cons(N,L)) = L .
  endo

The initial algebra of this specification is what one would expect, noting that the terms car(nil) and cdr(nil) do not parse, and hence are not in it.
However, because these terms, and the many others of which they are subterms, represent errors, what we really want is for them to be proper terms, but of a different sort, so that we can "handle" or "trap" them as errors, without having to invoke any nasty imperative features, as is done in most functional programming languages. The following shows that OSA provides an elegant solution for this problem, and it has even been shown that MSA cannot provide a satisfactory solution [138].

  obj ELIST is sorts List ErrList ErrNat .
    pr NAT .
    subsort List < ErrList .
    subsort Nat < ErrNat .
    op nil : -> List .
    op cons : Nat List -> List .
    op car : List -> ErrNat .
    op cdr : List -> ErrList .
    var N : Nat . var L : List .
    eq car(cons(N,L)) = N .
    eq cdr(cons(N,L)) = L .
    op nohead : -> ErrNat .
    eq car(nil) = nohead .
    op notail : -> ErrList .
    eq cdr(nil) = notail .
  endo

Now car(nil) is a proper term of sort ErrNat rather than Nat, and similarly for cdr(nil). Therefore we can write equations that have such "error expressions" in their leftsides, as above. Of course, more than this is needed to get the right behavior in realistic situations, as further discussed in Section 10.6. (Note that expressions like cons(2, notail) fail to parse, and hence are not terms for this specification, although if they are executed in OBJ3, some interesting things happen with retracts, as explained in Example 10.6.1.) □

Exercise 10.2.8 Define appropriate error supersorts and error messages for operations applied to the empty stack, using the basic specification below as your starting point:

  obj STACK is pr NAT .
    sort Stack .
    op empty : -> Stack .
    op push : Nat Stack -> Stack .
    op top_ : Stack -> Nat .
    op pop_ : Stack -> Stack .
    var X : Nat .
    var S : Stack .
    eq top push(X,S) = X .
    eq pop push(X,S) = S .
  endo

You should also define and run some test cases for your code. □

Deduction

Equational deduction also generalizes to order-sorted algebra.
The following rules use the same notation as Section 8.4.1, in which the hypotheses of a rule are above a horizontal line, while the conclusion is given below the line.

Definition 10.3.1 (Order-sorted equational deduction) Let A be a set of Σ-equations. If an equation e can be deduced using the rules (1–5C) below from a given set A of (possibly conditional) equations, then we write A ⊢ e, or possibly A ⊢_Σ e.

(1) Reflexivity:

      -------------
      (∀X) t = t

(2) Symmetry:

      (∀X) t = t′
      -------------
      (∀X) t′ = t

(3) Transitivity:

      (∀X) t = t′ ,  (∀X) t′ = t″
      ----------------------------
      (∀X) t = t″

(4) Congruence:

      (∀X) θ(y) = θ′(y)  for each y ∈ Y
      ----------------------------------
      (∀X) θ̄(t) = θ̄′(t)

    where θ, θ′ : Y → T_Σ(X) and where t ∈ T_Σ(Y).

(5C) Conditional Instantiation:

      (∀X) θ̄(v) = θ̄(v′)  for each v = v′ in C
      -----------------------------------------
      (∀X) θ̄(t) = θ̄(t′)

    where θ : Y → T_Σ(X) and where (∀Y) t = t′ if C is in A.

There is also an unconditional version of (5C):

(5) Instantiation:

      -------------------
      (∀X) θ̄(t) = θ̄(t′)

    where θ : Y → T_Σ(X) and where (∀Y) t = t′ is in A.

We can use the same notation for deduction using (1–5) as for using (1–5C), because (5) is a special case of (5C). □

These rules of deduction are essentially the same as for MSA, except for the restriction on substitutions given in Proposition 10.2.12, and the same results as in Chapter 4 for MSA deduction generally carry over. In particular, these rules are sound and complete:

Theorem 10.3.2 (Completeness) Given a coherent order-sorted signature Σ, an unconditional equation e can be deduced from a given set A of (possibly conditional) equations iff it is true in every model of A; that is, A ⊢ e iff A ⊨ e. □

The proof is given in Appendix B, but it uses results from the next section, because for expository purposes, we have stated this result earlier than required by the logical flow of proof.
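Simple consequences of such specifications can also be checked mechanically by reduction, which implements a restricted form of this deduction; here is a sketch using the LIST object of Example 10.2.19, where the constant l is our own addition:

```
open LIST .
op l : -> List .
red car(cons(2, l)) .  *** reduces to 2 by the first equation
red cdr(cons(2, l)) .  *** reduces to l by the second equation
close
```

Each red command applies instances of the equations of LIST left to right, which corresponds to restricted uses of rules (1), (3), and (5) above; the general apply feature of OBJ3 gives finer control, including backward application.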
Exercise 10.3.1 Show that rule (0) (Assumption) of Chapter 4 (but for OSA) is a special case of rule (5) above. □

It is straightforward to generalize the material on subterm replacement for conditional equations in Section 4.9 to the order-sorted case. The basic rule is as follows:

(+6C) Forward Conditional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, if (∀Y) t1 = t2 if C is of sort ≤ s and is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then
      (∀X) t(z ← θ(t1)) = t(z ← θ(t2))
is also deducible.

The substitutions t(z ← θ(t_i)) are valid because LS(θ(t_i)) ≤ LS(t_i) by Proposition 10.2.12 and LS(t_i) ≤ s by assumption.

Exercises 4.9.1–4.9.4 also generalize, as does the reversed version of (+6C):

(−6C) Backward Conditional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, if (∀Y) t1 = t2 if C is of sort ≤ s and is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then
      (∀X) t(z ← θ(t2)) = t(z ← θ(t1))
is also deducible.

Soundness of this rule follows as in the MSA case, by applying the symmetry rule, and so we also get:

(±6C) Bidirectional Conditional Subterm Replacement. Given t ∈ T_Σ(X ∪ {z}_s) with z ∉ X, if (∀Y) t1 = t2 if C or (∀Y) t2 = t1 if C is of sort ≤ s and is in A, and if θ : Y → T_Σ(X) is a substitution such that (∀X) θ(u) = θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, then
      (∀X) t(z ← θ(t1)) = t(z ← θ(t2))
is also deducible.

We now have the following important completeness result, which is proved in Appendix B:

Theorem 10.3.3 Given a coherent signature Σ and a set A of (possibly conditional) Σ-equations, then for any unconditional Σ-equation e,
      A ⊢_{(1–5C)} e  iff  A ⊢_{(1,3,±6C)} e.
□ As in Chapter 4, the rules (+6C), (−6C), and (±6C) can each be specialized to the case where t has exactly one occurrence of z, and these variants are indicated by writing 6¹ instead of 6; we do not write them out here (but see Definition 10.7.1 in Section 10.7 below). The following completeness result can now be proved in much the same way as Corollary 4.9.2:

Corollary 10.3.4 Given a coherent signature Σ and a set A of (possibly conditional) Σ-equations, then for any unconditional Σ-equation e,
      A ⊢_{(1–6C)} e  iff  A ⊢_{(1,3,±6¹C)} e  iff  A ⊢_{(1,2,3,6¹C)} e. □

Exercise 10.3.2 Use order-sorted equational deduction to prove the equation f(a) = f(b) for the following specification:

  th OSRW-EQ is sorts A C .
    subsort A < C .
    ops a b : -> A .
    op c : -> C .
    op f : A -> C .
    eq c = a .
    eq c = b .
  endth

You may use OBJ's apply feature. Hint: First prove a = b as a lemma. □

This section develops some more theoretical topics in order-sorted algebra; in general, they are straightforward extensions of the corresponding MSA topics, the main exception being the treatment of subsorts in quotients. As in MSA, the completeness of deduction (Theorem 10.3.2) can be used to construct initial and free algebras when there are equations, by defining an S-sorted relation ≃_{A,X} on T_Σ(X), for X a variable set, by
      t ≃_{A,X} t′  iff  A ⊢ (∀X) t = t′
using the rules in Definition 10.3.1. Since this relation is an order-sorted congruence in the sense of Definition 10.4.1 immediately below, we can define T_{Σ,A}(X) to be the quotient of T_Σ(X) by ≃_{A,X}, using the quotient construction given in Definition 10.4.6 below.
In preparation for the OSA notion of congruence, one should first recall from Definition 6.1.1 that, given a many-sorted signature (S, Σ), a many-sorted Σ-congruence ≡ on a many-sorted Σ-algebra M is an S-sorted family {≡_s | s ∈ S} of equivalence relations, with ≡_s on M_s, such that

(1) given σ ∈ Σ_{w,s} with w = s1...sn and a_i, a′_i ∈ M_{si} for i = 1,...,n such that a_i ≡_{si} a′_i, then M_σ(a1,...,an) ≡_s M_σ(a′1,...,a′n).

Definition 10.4.1 For (S, ≤, Σ) an order-sorted signature and M an order-sorted Σ-algebra, an order-sorted Σ-congruence ≡ on M is a many-sorted Σ-congruence ≡ such that

(2) if s ≤ s′ in S and a, a′ ∈ M_s, then a ≡_s a′ iff a ≡_{s′} a′.

An order-sorted (S, ≤, Σ)-algebra M′ is an order-sorted subalgebra of another such algebra M iff it is a many-sorted subalgebra such that M′_s ⊆ M′_{s′} whenever s ≤ s′ in S. □

Exercise 10.4.1 Show that the intersection of any set of order-sorted Σ-congruences on an order-sorted Σ-algebra M is also an order-sorted Σ-congruence on M. Conclude from this that any S-sorted family R of binary relations R_s on M_s for s ∈ S is contained in a least order-sorted Σ-congruence on M. Hint: The set of congruences that contain R is non-empty because it contains the relation that identifies everything (for each sort). □

Example 10.4.2 Let M be the initial Σ-algebra T_Σ, where Σ is the ELIST signature from Example 10.2.19, and let ≡ be the congruence generated by its equations, i.e., the least Σ-congruence that contains all ground instances of the equations in ELIST. Then cdr(cons(0,nil)) ≡ nil on both sorts List and ErrList, consistent with List < ErrList, while cdr(cdr(cons(0,nil))) ≡ notail for the sort ErrList. □

Fact 10.4.3 ≃_{A,X} is an order-sorted congruence relation.
Proof: It is easy to see that ≃_{A,X} is reflexive, symmetric, and transitive from rules (1), (2) and (3) of Definition 10.3.1, respectively, and the Σ-congruence property follows from rule (4). To prove (2) of Definition 10.4.1, suppose that s ≤ s′ and that t ≃_{A,X} t′ for t, t′ ∈ T_Σ(X)_s. Then also t, t′ ∈ T_Σ(X)_{s′}, and the same proof that showed A ⊢ (∀X) t = t′ for sort s also works for sort s′, and vice versa. □

Recall from Definition 6.1.5 that, given a many-sorted Σ-homomorphism f : M → M′, the kernel of f, denoted ker(f), is the S-sorted family of equivalence relations ≡_f defined by
      a ≡_{f,s} a′  iff  f_s(a) = f_s(a′),
and the image of f is the subalgebra f(M) with f(M)_s = f(M_s) for each s ∈ S; it may also be denoted im(f). The following shows that these concepts extend easily from MSA to OSA:

Proposition 10.4.4 If f : M → M′ is an order-sorted Σ-homomorphism, then

1. ker(f) is an order-sorted Σ-congruence on M;
2. f(M) is an order-sorted subalgebra of M′.

Proof: Proposition 6.1.6 showed that each ≡_{f,s} is an equivalence relation satisfying the congruence property (i.e., (1) above). Property (2) above follows from the fact that f_s(a) = f_{s′}(a) and f_s(a′) = f_{s′}(a′) for any s ≤ s′ in S and any a, a′ ∈ M_s.

Assertion 2. was proved in Proposition 6.1.6 for MSA, so we need only check the order-sorted subalgebra condition of Definition 10.4.1, which is an easy (set-theoretic) consequence of the fact that f is order-sorted. □

Example 10.4.5 Let LELIST denote the result of making the specification ELIST of Example 10.2.19 entirely loose, including the imported natural numbers, say with the Peano signature, although we will use ordinary decimal notation for convenience.
Let M be the ELIST-algebra with elements of sort List just the lists of natural numbers; with elements of sort ErrList those of sort List plus notail; and with elements of sort ErrNat the natural numbers plus nohead. Note that expressions like cons(0, notail) and cons(nohead, (0,1,2)) are simply not in this algebra.

Now let h : M → M′ be the S-sorted map that sends each natural number to that number modulo 3, each list to the corresponding list of numbers modulo 3, and notail and nohead to themselves. Then h is a Σ-homomorphism, and h(M) is the LELIST-algebra M′ with M′_Nat = {0, 1, 2}, with M′_List the lists of numbers from {0, 1, 2}, with M′_ErrNat = M′_Nat ∪ {nohead}, and M′_ErrList = M′_List ∪ {notail}.

If we let R denote the kernel of h, then n R_Nat n′ iff n − n′ is divisible by 3, and ℓ R_List ℓ′ iff ℓ and ℓ′ have the same length N and ℓ_i − ℓ′_i is divisible by 3 for i = 1,...,N. Also, n R_ErrNat n′ iff n R_Nat n′ or n = n′ = nohead, and ℓ R_ErrList ℓ′ iff ℓ R_List ℓ′ or ℓ = ℓ′ = notail. □

Exercise 10.4.2 If f : M → M′ is an OSA Σ-homomorphism and M0 ⊆ M is a Σ-subalgebra, show that f(M0) is a Σ-subalgebra of f(M).
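The kernel computations in Example 10.4.5 are easy to check concretely. The following is a minimal Python sketch (an illustration only, with an invented encoding: lists as tuples, and the error constants as strings):

```python
# Sketch of the homomorphism h of Example 10.4.5: h reduces every natural
# number mod 3, maps each list pointwise, and fixes the error constants
# nohead and notail. The kernel of h is then just "h gives equal values".
def h(x):
    if x in ("nohead", "notail"):
        return x
    if isinstance(x, tuple):            # a list of naturals
        return tuple(n % 3 for n in x)
    return x % 3                        # a natural number

def in_kernel(x, y):                    # x ker(h) y  iff  h(x) = h(y)
    return h(x) == h(y)

print(in_kernel(4, 7))                  # True: 4 - 7 is divisible by 3
print(in_kernel((0, 1, 2), (3, 4, 5))) # True: same length, pointwise mod 3
print(in_kernel((0, 1), (0, 1, 2)))    # False: different lengths
```

Note how the error constants form their own singleton kernel classes, matching the clauses for R_ErrNat and R_ErrList above.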
□ We now define the quotient of an order-sorted algebra by a congruence relation; following [82], the construction exploits local filtration to enable identifications across subsorts:

Definition 10.4.6 For (S, ≤, Σ) a locally filtered order-sorted signature, M an order-sorted Σ-algebra, and ≡ an order-sorted Σ-congruence on M, the quotient of M by ≡ is the order-sorted Σ-algebra M/≡ defined as follows: for each connected component C, let M_C = ⋃_{s∈C} M_s, and define the congruence relation ≡_C by a ≡_C a′ iff there is a sort s ∈ C such that a ≡_s a′. Then ≡_C is clearly reflexive and symmetric. It is transitive because a ≡_s a′ and a′ ≡_{s′} a″ yield a ≡_{s″} a″ for s″ ≥ s, s′, which exists by local filtration. The inclusion M_s ⊆ M_C induces an injective map M_s/≡_s → M_C/≡_C, because for a, a′ ∈ M_s we have that a ≡_s a′ implies a ≡_C a′ by construction, and conversely a ≡_C a′ implies a ≡_{s′} a′ for some s′ ∈ C, and taking s″ ≥ s, s′ it also implies a ≡_{s″} a′, and therefore it implies a ≡_s a′ by property (2) of the definition of order-sorted congruence. Denoting by q_C the natural projection q_C : M_C → M_C/≡_C of each element a to its ≡_C-equivalence class, we define the carrier (M/≡)_s of sort s in the quotient algebra to be the image q_C(M_s). The order-sorted algebra M/≡ comes equipped with a surjective order-sorted Σ-homomorphism q : M → M/≡, defined to be the restriction of q_C to each of its sorts, and called the quotient map associated to the congruence ≡. The operations are defined by
      (M/≡)_σ([a1],...,[an]) = [M_σ(a1,...,an)],
and are well defined because ≡ is an order-sorted Σ-congruence.
□ The following illustrates the above construction:

Example 10.4.7 For a given Σ-theory B, consider the relation ≡ on T_Σ defined for LS(t), LS(t′) ≤ s by t ≡_s t′ iff there is a proof that t = t′ in which every term used has least sort ≤ s; it is not difficult to check that ≡ is a Σ-congruence. Now define B by the following:

  th OSTH is sorts A C .
    subsort A < C .
    ops a b : -> A .
    op c : -> C .
    eq c = a .
    eq c = b .
  endth

[Figure 10.2: Condition (2) of Universal Property of Quotient — the triangle q : A → A/R, f : A → B, v : A/R → B with q;v = f.]

Then under the ordinary quotient construction (as in Appendix C), the ≡-equivalence class [a]_A of a for sort A is {a}, and also [b]_A = {b}, whereas [a]_C = [b]_C = {a, b, c}. However, under the construction of Definition 10.4.6, the equivalence classes of sort s collect all terms of sort s or less that can be proved equal, no matter what other sorts may be involved. So in this case, [a]_A = {a, b} and [a]_C = {a, b, c}. This shows that the construction of Definition 10.4.6 does useful additional work for certain relations, although ≃_B is not one of these. □

Exercise 10.4.3 Show that the relation ≡ in Example 10.4.7 is a Σ-congruence. □

The following is straightforward from the definitions:

Fact 10.4.8 Under the assumptions of Definition 10.4.6, ker(q) = ≡. □

Exercise 10.4.1 allows us to extend the construction of Definition 10.4.6 to quotients by an arbitrary relation on an algebra.

Definition 10.4.9 Given an arbitrary S-sorted family R of binary relations R_s on M_s for s ∈ S, the quotient of M by R, denoted M/R, is the quotient of M by the smallest order-sorted Σ-congruence on M containing R.
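The contrast drawn in Example 10.4.7 can be checked mechanically. The following hypothetical Python sketch (not part of the OBJ text; the encoding is invented) computes the component-level closure of the OSTH equations with a union-find, and then forms the classes in the style of Definition 10.4.6:

```python
# Sorts A < C, constants a b : A and c : C, equations c = a and c = b,
# as in the theory OSTH of Example 10.4.7.
SORT = {"a": "A", "b": "A", "c": "C"}      # least sort of each constant
EQNS = [("c", "a"), ("c", "b")]

# Union-find over the ground terms of the single connected component.
parent = {t: t for t in SORT}
def find(t):
    while parent[t] != t:
        t = parent[t]
    return t
for l, r in EQNS:
    parent[find(l)] = find(r)

def component_class(t):                    # all terms provably equal to t
    return {u for u in SORT if find(u) == find(t)}

# Definition 10.4.6: the class of t at sort s collects all provably equal
# terms of sort <= s, whatever sorts the proof passes through. (Under the
# ordinary construction, a and b would stay separate at sort A, since
# every proof of a = b passes through c of sort C.)
def cls(t, s):
    leq = {("A", "A"), ("A", "C"), ("C", "C")}
    return {u for u in component_class(t) if (SORT[u], s) in leq}

print(sorted(cls("a", "A")))   # ['a', 'b']: b joins a despite the detour via c
print(sorted(cls("a", "C")))   # ['a', 'b', 'c']
```

This reproduces [a]_A = {a, b} and [a]_C = {a, b, c} from the example.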
□ Proposition 10.4.10 (Universal Property of Quotient) If Σ is a locally filtered order-sorted signature, if M is an order-sorted Σ-algebra, and if R is an S-sorted family of binary relations R_s on M_s for s ∈ S, then the quotient map q : M → M/R satisfies the following:

(1) R ⊆ ker(q), and
(2) if f : M → B is any order-sorted Σ-homomorphism such that R ⊆ ker(f), then there is a unique Σ-homomorphism v : M/R → B such that q;v = f (see Figure 10.2).

Proof: (1) follows from Fact 10.4.8 and the definition of ≡ as the smallest congruence that contains R.

For (2), let f : M → M′ be an order-sorted Σ-homomorphism such that R ⊆ ker(f). Then ker(q) ⊆ ker(f), and both are congruences, so that for each connected component C we have ker(q)_C ⊆ ker(f)_C, and there is a unique function v_C : (M/R)_C → M′_C such that v_C ∘ q_C = f_C, for f_C : M_C → M′_C defined by f_C(a) = f_s(a) if a ∈ M_s (this is well defined by local filtering). It remains only to check that, restricting v_C to each one of the sorts s ∈ C, the family {v_s | s ∈ S} thus obtained is an order-sorted Σ-homomorphism. Property (2) for order-sorted homomorphisms follows by construction. Let σ ∈ Σ_{w,s} with w = s1...sn, and let a_i ∈ M_{si} for i = 1,...,n. Then (omitting sort qualifications throughout) we have
      v((M/R)_σ([a1],...,[an])) = v([M_σ(a1,...,an)]) = f(M_σ(a1,...,an))
                                = M′_σ(f(a1),...,f(an)) = M′_σ(v([a1]),...,v([an])).
The case w = [] is left for the reader to check. □

The proof of Theorem 10.3.2 in Appendix B shows that the relation ≃_{A,X} (defined on page 334) is an order-sorted Σ-congruence. So we now define T_{Σ,A}(X) to be the quotient of T_Σ(X) by ≃_{A,X}. Also, we denote T_{Σ,A}(∅) by T_{Σ,A}.
The following is also proved in Appendix B:

Theorem 10.4.11 If Σ is coherent and A is a set of (possibly conditional) Σ-equations, then T_{Σ,A} is an initial (Σ, A)-algebra, and T_{Σ,A}(X) is a free (Σ, A)-algebra on X, in the sense that for each Σ-algebra M and each assignment a : X → M, there is a unique Σ-homomorphism ā : T_{Σ,A}(X) → M such that ā(x) = a(x) for each x in X. □

Example 10.4.12 Theorem 10.4.11 implies that the algebra M of Example 10.4.5 is an initial model for LELIST, and hence a model for the original specification ELIST of Example 10.2.19. □

The following theorem generalizes Noether's first isomorphism theorem (Theorem 6.1.7) to OSA:

Theorem 10.4.13 (Homomorphism Theorem) For any Σ-homomorphism h : M → M′, there is a Σ-isomorphism M/ker(h) ≅_Σ im(h).

Proof: Let h′ : M → h(M) denote the corestriction (Appendix C reviews this concept) of h to h(M). Then Proposition 10.4.10 with R = ker(h′) = ker(h) gives a (unique) Σ-homomorphism v : M/ker(h) → h(M) such that q;v = h′, which is surjective because h′ is. We will be done if we can show that v is also injective. To this end (and omitting sort subscripts), suppose that v([a1]) = v([a2]); then h(a1) = h(a2), so that [a1] = [a2]. □

Exercise 10.4.4 For M, M′, R as in Example 10.4.5, check that M/R is Σ-isomorphic to M′. □

It is worthwhile making explicit the following consequence of the proof given in Appendix B of the Completeness Theorem (10.3.2):

Corollary 10.4.14 Given a coherent order-sorted signature Σ and a set A of (possibly conditional) Σ-equations, an equation (∀X) t = t′ is satisfied by every Σ-algebra that satisfies A iff it is satisfied by T_{Σ,A}(X). □

The theory of class deduction in Section 7.2 easily generalizes to OSA.
Definition 10.5.1 Given an order-sorted signature Σ and a set B of (possibly conditional) Σ-equations, a (conditional) Σ-equation modulo B, or (Σ, B)-equation, is a 4-tuple ⟨X, t, t′, C⟩ where t, t′ ∈ T_{Σ,B}(X) have sorts in the same connected component of Σ, and C is a finite set of pairs from T_{Σ,B}(X), again with sorts in the same connected components. Usually we write (∀X) t =_B t′ if C, and may use the same notation with t, t′, C all Σ-terms that represent their B-equivalence classes; we may also drop the B subscripts.

Given a (Σ, B)-algebra M, Σ-satisfaction modulo B, written M ⊨_{Σ,B} (∀X) t =_B t′ if C, is defined by ā(t) = ā(t′) whenever ā(u) = ā(v) for each ⟨u, v⟩ ∈ C, for all a : X → M, where ā : T_{Σ,B}(X) → M is the unique Σ-homomorphism extending a. Given a set A of (Σ, B)-equations, A ⊨_{Σ,B} e means that M ⊨_{Σ,B} A implies M ⊨_{Σ,B} e for all B-models M. □

As in Section 7.2, class deduction versions of inference rules are obtained just by substituting T_{Σ,B} for T_Σ and =_B for =, assuming that A contains (Σ, B)-equations; we also use [A] and [e] as in Section 7.2, and the following three results have essentially the same proofs as the corresponding results there. The B-class version of rule (i) is denoted (i_B), and A ⊢_{Σ,B} [e] denotes class deduction modulo B of [e] from A, using rules (1_B)–(5C_B).

Proposition 10.5.2 (Bridge) Given sets A, B of Σ-equations and another Σ-equation e (with A, B and e possibly conditional), then
      [A] ⊢_B [e]  iff  A ∪ B ⊢ e.
Furthermore, given any (Σ, B)-algebra M and a (possibly conditional) Σ-equation e, then M ⊨_{Σ,B} [e] iff M ⊨_Σ e.
□ From this, it is not difficult to prove the following:

Theorem 10.5.3 (Completeness) Given sets A, B of Σ-equations and another Σ-equation e, all possibly conditional, then the following are equivalent:

      (1) [A] ⊢_B [e]
      (2) [A] ⊨_B [e]
      (3) A ∪ B ⊢ e
      (4) A ∪ B ⊨ e □

The above result connects OSA class inference and satisfaction with ordinary OSA inference and satisfaction.

Theorem 10.5.4 (Completeness) Given sets A, B of (possibly conditional) Σ-equations and an unconditional Σ-equation e, then [A] ⊢_B [e] iff [A] ⊢_{(1_B,3_B,±6C_B)} [e]. Moreover, [A] ⊢_{(1_B,3_B,±6C_B)} [e] iff M ⊨_B [e] for all (Σ, A ∪ B)-algebras M. □

The above result says that (6C_B), i.e., class rewriting, is complete for class deduction when combined with the reflexive, symmetric (±), and transitive rules of inference.

In strongly typed languages, some expressions may not type check, even though they have a meaningful value. For example, given a factorial function defined only on natural numbers, the expression ((- 6) / (- 2))! is not well-formed, because the parser can only determine that the argument of the factorial is a rational number, possibly negative. It is desirable to give such expressions the "benefit of the doubt," because they could evaluate to a natural (e.g., the argument above evaluates to 3). Retract functions provide this capability, by lowering the sort of a subexpression to the subsort needed for parsing. In this example, the retract function symbol
      r_{Rational,Natural} : Rational -> Natural
is automatically inserted by OBJ during rewriting, in a process called retract rewriting, to fill the gap between the parsed high sort and the required low sort, yielding the expression (r_{Rational,Natural}((- 6) / (- 2)))!, which does type check.
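The retract-insertion step just described can be sketched outside OBJ. The following hypothetical Python illustration uses the sort names and the r:s>s' notation from this chapter, but the code and its helper names are invented for illustration:

```python
# Sketch of retract insertion during sort checking: if an argument parses
# at a sort strictly above the required one (in the same connected
# component), wrap it in a retract term; if the sorts lie in different
# components, reject the expression as truly nonsensical.
SUBSORT = {("Natural", "Rational")}            # Natural < Rational

def leq(s1, s2):
    return s1 == s2 or (s1, s2) in SUBSORT

def same_component(s1, s2):
    return leq(s1, s2) or leq(s2, s1)

def coerce(term, have, want):
    if leq(have, want):
        return term                            # already well-sorted
    if same_component(have, want):
        return ("r:%s>%s" % (have, want), term)  # insert a retract
    raise TypeError("cannot parse: %s vs %s" % (have, want))

print(coerce("(- 6)/(- 2)", "Rational", "Natural"))
# -> ('r:Rational>Natural', '(- 6)/(- 2)')
```

The third branch corresponds to rejecting expressions like factorial(false) at compile time, as discussed below.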
Then we can use retract elimination equations, of the form
      r_{s,s′}(x) = x
where s′ ≤ s and x is a variable of sort s′, to eliminate retracts when their arguments do have the required sorts. When the argument's least sort is not ≤ s′, the retract remains, providing an error message that pinpoints exactly where the problem occurs and exactly what its sort gap was. For example, such a reduction can end with (r_{Rational,Natural}(1 / 3))!, indicating that the argument to factorial is the rational 1/3. Similar situations arise with the function |_|^2 in Section 10.8 below. And unlike the untyped case, truly nonsensical expressions are detected and rejected at compile time, while any expression that could possibly recover is allowed to be evaluated. By "truly nonsensical" is meant expressions like factorial(false) that contain subexpressions in the wrong connected component (assuming that booleans and natural numbers are in different connected components of the sort poset) and therefore cannot be parsed by inserting retracts. A precise semantics for retracts is given in Section 10.6.1, while the rest of this section is devoted to examples in OBJ.

Example 10.6.1 (Lists with Fewer Errors) As already noted, without retracts, terms like
      car(cdr(cons(1,cons(2,cons(3,nil)))))
do not parse in the context of the ELIST theory of Example 10.2.19, because the subterm beginning with cdr has sort ErrList, while car requires sort List as its argument. However, the correct answer (which is 2) is obtained by inserting a retract and then reducing the result. The term
      car(cdr(cdr(cons(1,nil))))
has a somewhat different behavior when retracts are added: it is temporarily accepted as
      car(r_{ErrList,List}(cdr(r_{ErrList,List}(cdr(cons(1,nil))))))
which is then reduced to the form
      car(r_{ErrList,List}(nil))
which serves as a very informative error message. □

One might think that, since this is a kind of run-time type checking, it is just operational semantics.
But our approach requires that the operational semantics agrees with the logical semantics, and retracts have a very nice logical semantics (see Section 10.6.1 below), as well as an operational semantics, which is developed in Section 10.7 below. Moreover, this kind of run-time type checking is relatively inexpensive, and in combination with the polymorphism given by subsorts and by parameterized modules, it provides the syntactic flexibility of untyped languages with all the advantages of strong typing.

The following shows that if deduction is not treated carefully, it can yield unsound results, and that naive attempts to fix this problem can greatly weaken deduction; the discussion following the example also shows that retracts again provide a nice solution.

Example 10.6.2 Consider the term f(a) for the following object:

  obj NON-DED is sorts A B .
    subsorts A < B .
    op a : -> A .
    op b : -> B .
    ops f g : A -> A .
    var X : A .
    eq f(X) = g(X) .
    eq a = b .
  endo

The first equation can deduce g(a) from f(a), and then the second equation can apparently deduce g(b) from f(a); but g(b) is not a well-formed term! The problem is that, although replacing a by b is sound in itself, it is not sound in the context of g. □

The easiest way to avoid this problem is to prohibit deductions that do not decrease sorts. But this would eliminate many important examples, such as the square norm in Section 10.8 below. A better approach is to prohibit applications yielding terms that don't parse; in fact, Definition 10.3.1 takes this approach, because its rules implicitly assume that every term occurring in them is well-formed. Unfortunately, this prohibits many correct computations, such as that above with factorial, and it also fails to inform the user what went wrong. Retracts allow us to avoid all these difficulties. The result of running OBJ3, which implements retracts, on
      red f(a) .
in the context of NON-DED is the following:

  reduce in NON-DED : f(a)
  rewrites: 2
  result A: g(r:B>A(b))

which is not only a valid deduction, but also an informative error message.

Example 10.6.2 might raise suspicions that the rules of deduction as stated in Definition 10.3.1 are unsound; but recall that those rules assume that all terms in them are well-formed, so without retracts, they disallow the deduction in Example 10.6.2. Once we formalize retracts as an order-sorted theory, the soundness of deduction using retracts follows from Theorem 10.3.2, because deduction with retracts follows exactly the same rules as deduction without them.

Raising and handling exceptions can also be given a nice semantics using retracts. This is significant because exceptions have both inadequate semantic foundations and insufficient flexibility in many programming and specification languages. Some algebraic specification languages use partial functions, which are simply undefined under exceptional conditions. This can be developed rigorously, e.g., in [111], but it has the disadvantage that neither error messages nor error recovery are possible. OSA with retracts supports both, and is fully implemented in OBJ3 and BOBJ. The following illustrates some capabilities of retracts in this respect:

Example 10.6.3 (Lists with Fewer Errors) Again in the context of the ELIST theory of Example 10.2.19, in processing large lists, explicit error messages that pinpoint exceptions might be difficult to understand. In this case, we can add some equations to simplify such expressions,

  var EN : ErrNat .
  var EL : ErrList .
  eq cons(r:ErrNat>Nat(EN), notail) = notail .
  eq cons(nohead, r:ErrList>List(EL)) = notail .
  eq car(r:ErrList>List(EL)) = nohead .
  eq cdr(r:ErrList>List(EL)) = notail .

in which case
      red car(cdr(cdr(cdr(cons(1,nil))))) .
gives just nohead as its reduced form.
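The behavior of this example can be imitated outside OBJ. The following hypothetical Python sketch treats the error constants as ordinary first-class values that propagate through car and cdr, mirroring the extra equations above (the encoding of lists as nested pairs is invented for illustration):

```python
# Error values nohead and notail live in "supersorts" and propagate as
# ordinary values, instead of raising imperative exceptions.
NOHEAD, NOTAIL = "nohead", "notail"    # the ErrNat / ErrList constants

def cons(n, lst):
    return (n, lst)                    # lists as nested pairs; nil is ()

def car(lst):                          # car : List -> ErrNat
    if lst == () or lst == NOTAIL:     # error cases yield nohead
        return NOHEAD
    return lst[0]

def cdr(lst):                          # cdr : List -> ErrList
    if lst == () or lst == NOTAIL:     # error cases yield notail
        return NOTAIL
    return lst[1]

t = cons(1, ())
print(car(cdr(cdr(cdr(t)))))           # nohead, with no exception raised
```

This reproduces the reduced form nohead reported for the OBJ reduction above.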
□ The following somewhat open-ended exercise gives a similar but more complex application of retracts:

Exercise 10.6.1 (★) Write a suitable theory for a relational database in which lists represent tuples, and so-called "null values," such as
      op nullNat : -> ErrNat .
are treated as exceptions when doing arithmetic, but do not collapse tuples to a single error message. Show that it is also possible to have both kinds of exception in a single theory, by declaring two different error supersorts of Nat. □

Retracts also support situations where data is represented in more than one way, and representations are converted to whatever form is most convenient or efficient for a given context. This kind of multiple representation is rather common, but is rarely given a semantics. A good example is Cartesian and polar coordinates for points in the plane, as developed in Section 10.9. There are also many cases involving conversion from one sort of data to another in an irreversible way; for example, to apply integer addition to two rational numbers, one might first truncate them; this is called coercion. In both multiple representation and coercion, applying functions defined on one representation to data of another is mediated by functions that change the representation; but only in the case of multiple representation are the conversions reversible.

A basic theorem about retracts asserts their consistency: the theory that results from adding retract function symbols and retract equations to an order-sorted specification is a conservative extension of the original, in the sense that the equational deduction and initial models of the original theory are not disturbed.
Thus retracts not only combine the flexibility of untyped languages with the discipline of strong typing, and give satisfactory treatments of exception handling and multiple representation, but the semantics of retracts, both deductive and model-theoretic, is just a special case of order-sorted algebra.

Example 10.6.4 The following is similar to Example 10.6.2, but more subtle.

  obj MORE-NON-DED is sorts A B C .
    subsorts A < B < C .
    op f : C -> C .
    ops f h : A -> A .
    op g : B -> B .
    op a : -> A .
    var X : B .
    eq f(X) = g(X) .
  endo

Here f(X) has sort C, which looks reasonable because the sort C is greater than the sort B of g(X). But the term h(f(a)) rewrites to h(r:B>A(g(a))), because the equation obtained from the original by specializing the variable X of sort B to a variable of sort A is not sort decreasing. Therefore, not just the original rules, but also their specializations to rules having variables of smaller sorts should be considered, as discussed in [113]; see also Definition 10.7.10. □

Exercise 10.6.2 Use OBJ to verify the assertions about its computations in Examples 10.6.1, 10.6.3, and 10.6.4. □

We have shown that strong typing is not flexible enough in practice, and have also suggested how OSA can provide the necessary flexibility with retracts. We now develop the formal semantics and prove that retracts are sound under certain mild assumptions. The first step is to extend an order-sorted signature Σ to another order-sorted signature Σ⊗ having the same sorts as Σ, and the same operation symbols as Σ, plus some new ones called retracts, of the form
      r_{s′,s} : s′ → s
for each pair s′, s with s′ > s. The semantics of retracts is then given by retract equations
      (∀x) r_{s′,s}(x) = x
for all s′ > s, where x is a variable of sort s.
Given an order-sorted signature Σ and a set A of conditional Σ-equations, extend Σ to the signature Σ⊗ by adding the retract operations, and extend A to the set of equations A⊗ by adding the retract equations. Our requirement for retracts to be well-behaved is that the theory extension (Σ, A) ⊆ (Σ⊗, A⊗) should be conservative, in the sense that for all t, t′ ∈ T_Σ(X),
      t ≃_{A,X} t′  iff  t ≃_{A⊗,X} t′.
This is equivalent in model-theoretic terms to requiring that the unique order-sorted Σ-homomorphism ψ_X : T_{Σ,A}(X) → T_{Σ⊗,A⊗}(X), which leaves the elements of X fixed, is injective. We prove this under the very natural assumption on the algebras T_{Σ,A}(X) that given X ⊆ X′, then the unique Σ-homomorphism ι_{X,X′} : T_{Σ,A}(X) → T_{Σ,A}(X′), induced by the composite map X ⊆ X′ → T_{Σ,A}(X′) (first the inclusion, then the natural mapping of each variable to the class of terms equivalent to it), is injective. We will say that a presentation (Σ, A) is faithful if it satisfies this injectivity condition. Pathological, unfaithful presentations do exist, and for them the extension with retracts is not conservative, as shown by the following example from [78]:

Example 10.6.5 Let Σ have sorts a, b, u with a, b ≤ u, an operation f : a → b, no constants of sort a, constants 0, 1 of sort b, plus binary infix operations + and & and a unary prefix operation ¬, all of sort b. Let A have the equations
      ¬(f(x)) = f(x),  y + y = y,  y & y = y,
      y + (¬y) = 1,  (¬y) + y = 1,  y & (¬y) = 0,  (¬y) & y = 0,
      ¬0 = 1,  ¬1 = 0.
Then A ⊢ (∀x) 0 = 1, where x is a variable of sort a (since 1 = f(x) + ¬f(x) = f(x) + f(x) = f(x) and 0 = f(x) & ¬f(x) = f(x) & f(x) = f(x)), although (∀∅) 0 = 1 is not deducible from A. Thus (Σ, A) is not faithful. Note that T_{Σ,A} satisfies 1 ≠ 0, but T_{Σ⊗,A⊗} satisfies 1 = 0, because Σ⊗ has ground terms of sort a, such as r_{u,a}(0) and r_{u,a}(1). Thus, the extension (Σ, A) ⊆ (Σ⊗, A⊗) is not conservative.
□ There are simple conditions on both the signature Σ and on the equations A that guarantee faithfulness of a presentation (Σ, A). For arbitrary A, it is necessary and sufficient that Σ has no quasi-empty models, which are algebras B such that B_s = ∅ for some s but B_{s′} ≠ ∅ for some other sort s′ [78]. For arbitrary Σ, it is sufficient that A is Church-Rosser as a term rewriting system [136]. A proof of the following conservative extension result is given in Appendix B:

Theorem 10.6.6 If the signature Σ is coherent and (Σ, A) is faithful, then the extension (Σ, A) ⊆ (Σ⊗, A⊗) is conservative. □

This gives soundness, and Theorem 10.7.18 gives completeness.

Order-sorted rewriting arises from order-sorted equational logic in much the same way that many-sorted rewriting arises from many-sorted equational logic. We first consider unconditional rewriting, and add the assumption that B contains no conditional equations to our prior assumption that all our order-sorted signatures are coherent (Definition 10.2.15).

Definition 10.7.1 Given an order-sorted signature Σ, an unconditional order-sorted Σ-rewrite rule is an order-sorted Σ-equation (∀X) t1 = t2 such that var(t2) ⊆ var(t1) = X; we will write t1 → t2. An order-sorted Σ-term rewriting system (or Σ-OSTRS) is a set A of order-sorted Σ-rewrite rules; we may also write (Σ, A). □

Recall that t1 and t2 must lie in the same connected component of the sort set of Σ (by Definition 10.2.16), so that, by coherence, there always exists a sort s such that LS(t_i) ≤ s for i = 1, 2. We begin by restricting the rule of deduction (+6¹), which replaces one subterm in the forward direction, to equations that are rewrite rules:

(rw) Order-Sorted Rewriting.
Given t1 → t2 in A with LS(t1), LS(t2) ≤ s and var(t1) = Y, and given t0 ∈ TΣ(X ∪ {z}s) with exactly one occurrence of z, where z ∉ X, if θ : Y → TΣ(X) is a substitution, then

  (∀X) t0(z ← θ(t1)) = t0(z ← θ(t2))

is deducible.

This rule is sound because it is a restriction of (+6), which is already known to be sound. Therefore the following is also sound (in a sense stated formally in Proposition 10.7.9 below):

Definition 10.7.2 Given a Σ-OSTRS A, one-step rewriting is defined for Σ-terms t, t′ by t ⇒A t′ iff there exist a rule t1 → t2 in A with var(t1) = Y, a term t0 ∈ TΣ(X ∪ {z}s) with exactly one occurrence of z, where z ∉ X, and a substitution θ : Y → TΣ(X), such that LS(t1), LS(t2) ≤ s and

  t = t0(z ← θ(t1))  and  t′ = t0(z ← θ(t2)).

The rewrite relation is the transitive, reflexive closure of the one-step rewrite relation; we use the notation t ∗⇒A t′ and say t rewrites to t′ (under A). □

The words "match," "redex," etc. are used the same way as in the many-sorted case, and the one-step order-sorted rewrite relation gives an abstract rewrite system, so that termination, Church-Rosser, local Church-Rosser, reduced term, and so on all make sense; moreover, the usual proof methods are available for proving these properties. We do not elaborate this, because we will soon generalize order-sorted rewriting to the conditional and modulo equation cases.

Example 10.7.3 The following illustrates non-trivial overloaded order-sorted rewriting:

  th OSRW is sorts A B .
    subsort A < B .
    op a : -> A .
    op b : -> B .
    ops f g : B -> B .
    op g : A -> A .
    eq b = a .
    var A : A .
    eq f(A) = A .
  endth
  red f(g(b)) .

The result is g(a), after applying each rule once. □

The relation ∗⇔A, the reflexive, symmetric, and transitive closure of ⇒A, is order-sorted replacement of equals by equals.
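The sort-sensitivity of order-sorted rewriting can be animated with a small interpreter. The following Python sketch (our own toy term representation, not OBJ3's implementation) runs Example OSRW above: the rule f(A) = A can only fire after b ⇒ a lowers the least sort of the argument from B to A.

```python
# Toy sketch of order-sorted rewriting for the OSRW example: terms are
# tuples (operator, arg, ...); sorts A < B; the rule f(A) = A fires
# only when the argument's least sort is <= A.

SUBSORT = {("A", "B"), ("A", "A"), ("B", "B")}  # reflexive-transitive <=

def leq(s, t):
    return (s, t) in SUBSORT

def least_sort(t):
    op = t[0]
    if op == "a":
        return "A"
    if op == "b":
        return "B"
    if op == "g":                       # overloaded: g : A -> A and g : B -> B
        return least_sort(t[1])
    if op == "f":                       # f : B -> B only
        return "B"
    raise ValueError(op)

def rewrite_once(t):
    """Apply one rule at the outermost possible position, else recurse."""
    if t == ("b",):                                   # rule: b -> a
        return ("a",)
    if t[0] == "f" and leq(least_sort(t[1]), "A"):    # rule: f(A) -> A
        return t[1]
    for i, arg in enumerate(t[1:], 1):                # try inside a subterm
        r = rewrite_once(arg)
        if r != arg:
            return t[:i] + (r,) + t[i + 1:]
    return t

def normalize(t):
    while True:
        r = rewrite_once(t)
        if r == t:
            return t
        t = r

# f(g(b)) => f(g(a)) => g(a): the outer rule waits for the sort to drop.
print(normalize(("f", ("g", ("b",)))))
```

Note how `least_sort` for the overloaded `g` depends on the argument's sort, exactly the overloading that the example calls non-trivial.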
An important result for many-sorted rewriting (Theorem 7.3.4) is that t ∗⇔A t′ iff A ⊢ (∀X) t = t′, or otherwise put, that ∗⇔A and ≃A,X (provability) are equal relations on TΣ(X). By the completeness of many-sorted equational deduction, this also implies that ∗⇔A is complete. Unfortunately, these nice results fail for order-sorted rewriting:

Exercise 10.7.1 Use the specification OSRW-EQ in Exercise 10.3.2 to show that the relation ∗⇔A is incomplete: f(a) ∗⇔A f(b) does not hold, even though a ∗⇔A b is true, and f(a) and f(b) are provably equal by Exercise 10.3.2. (This observation is due to Gert Smolka.) Examples with this undesirable behavior can be excluded by imposing the conditions in Definition 10.7.10 below, but it is better to use retracts, relying on Theorem 10.7.18; see also Example 10.7.4 below. □

Exercise 10.7.2 It is also interesting to compare the equivalence classes for ∗⇔A under the ordinary quotient construction with those under the construction of Definition 10.4.6. Use the fact that the relation ≡ defined in Example 10.4.7 equals ∗⇔A (though the specification there is different from the one here) to show that the ordinary equivalence classes for ∗⇔A differ from those of Definition 10.4.6. □

Example 10.7.4 If we write the two equations in the theory OSRW-EQ of Exercise 10.3.2 in the converse order, then OBJ can prove the result with just one reduction, due to the way that retracts work:

  th OSRW-EQ-CONV is sorts A C .
    subsort A < C .
    ops a b : -> A .
    op c : -> C .
    op f : A -> C .
    eq a = c .
    eq b = c .
  endth
  red f(a) == f(b) .
  red f(a) .

The first reduction gives true, even though the normal form of f(a) is f(r:C>A(c)), as shown by the second reduction, because f(b) has exactly the same normal form. In fact, the relation ∗⇔ is complete for OSA with retracts (by Theorem 10.7.18 below).
□

The following is the natural extension of Definition 10.7.1:

Definition 10.7.5 A conditional order-sorted rewrite rule is an order-sorted conditional equation (∀X) t1 = t2 if C (in the sense of Definition 10.2.16) such that var(t1) = X, var(t2) ⊆ X, and for each ⟨u, v⟩ ∈ C, var(u) ⊆ X and var(v) ⊆ X. We use the notation t1 → t2 if C. □

As with the many-sorted case, it is not straightforward to define the one-step rewrite relation for conditional rules. However, we can use the (join) Conditional Abstract Rewrite Systems (CARS, Definition 7.7.1), just as we did in Section 7.7 for the many-sorted case; we will include rewriting modulo equations at the same time. For this purpose, we state the order-sorted modulo version of Forward Conditional Subterm Replacement Modulo Equations:

(C6B) Given (∀Y) t1 =B t2 if C in A with LS(t1), LS(t2) ≤ s, and given t0 ∈ TΣ(X ∪ {z}s) with z ∉ X, if θ : Y → TΣ(X) is a substitution such that (∀X) θ(u) =B θ(v) is deducible for each pair ⟨u, v⟩ ∈ C, if t′i = t0(z ← θ(ti)) and t″i ≃B t′i for i = 1, 2, then (∀X) t″1 = t″2 is also deducible.

Now the main concepts:

Definition 10.7.6 An order-sorted conditional term rewriting system modulo equations (MCOSTRS) is (Σ, A, B) where A is a set of (possibly conditional) Σ-rewrite rules, and B is a set of (unconditional) Σ-equations. Given (Σ, A, B), define two CARS's as follows, where t → t′ if t1 = t′1, . . . , tn = t′n is in A with LS(t), LS(t′) ≤ s, Y = var(t), θ : Y → TΣ, u ∈ TΣ({z}s) and z ∉ X:

1. For term rewriting, let W be the set of rules of the form v → v′ if v1 = v′1, . . . , vn = v′n where v ≃B u(z ← θ(t)), v′ ≃B u(z ← θ(t′)), vi = θ(ti) and v′i = θ(t′i) for i = 1, . . . , n.

2.
For class rewriting, let W be the set of rules of the form c → c′ if c1 = c′1, . . . , cn = c′n where c = [u(z ← θ(t))], c′ = [u(z ← θ(t′))], ci = [θ(ti)] and c′i = [θ(t′i)] for i = 1, . . . , n.

The classes in the second item are those defined by the quotient construction of Definition 10.4.6 from the provability relation ≃B. Now Definition 7.7.1 (page 235) yields an ARS W⋄ for each of these. Write ⇒A/B for the first, which is order-sorted conditional term rewriting modulo B, and ⇒[A/B] for the second, which is order-sorted conditional class rewriting modulo B. Also, we will use standard terminology ("match," "redex," etc.) in the usual way. □

As discussed in Section 7.7, OBJ does not use equality semantics for evaluating conditions, even though it is common in the literature (e.g., [113]): it uses join condition semantics for its efficiency and pragmatic adequacy. Because ⇒A/B and ⇒[A/B] are ARS's, all ARS results apply directly, such as the Newman lemma and the multi-level termination results in Section 5.8.2. Also, the following has essentially the same proof as Proposition 7.3.2:

Proposition 10.7.7 Given t, t′ ∈ TΣ(Y), Y ⊆ X and MCOSTRS (Σ, A, B), then t ⇒A/B,X t′ iff t ⇒A/B,Y t′, and in both cases var(t′) ⊆ var(t). Therefore t ∗⇒A/B,X t′ iff t ∗⇒A/B,Y t′, and in both cases var(t′) ⊆ var(t). □

Thus both ⇒A/B,X and ∗⇒A/B,X restrict and extend well over variables, so we can drop the subscript X and use any X with var(t) ⊆ X; also as before, ∗⇔A/B,X does not restrict and extend well, as shown by Example 5.1.15, so we define t ∗⇔A t′ to mean that there exists an X such that t ∗⇔A,X t′.
Example 5.1.15 also shows bad behavior for ≃X A∪B (defined by t ≃X A∪B t′ iff A ∪ B ⊢ (∀X) t = t′), although again, the concretion rule (8) of Chapter 4 (generalized to order-sorted rewriting modulo B) implies that ≃X A∪B does behave reasonably when the signature is non-void. Defining ↓A/B,X from the ARS, we generalize Proposition 5.1.13, which again allows the subscript X to be dropped:

Proposition 10.7.8 Given terms t, t′ ∈ TΣ(Y), Y ⊆ X and MCOSTRS (Σ, A, B), then t ↓A/B,X t′ iff t ↓A/B,Y t′, and moreover, these imply A ∪ B ⊢ (∀X) t = t′. □

Because unconditional order-sorted rewriting modulo no equations is a special case, Exercise 10.7.1 also shows that ∗⇔A/B is not complete for satisfaction of A ∪ B. However, we do have:

Proposition 10.7.9 (Soundness) Given an MCOSTRS (Σ, A, B) and t, t′ ∈ TΣ(X), then

  t ⇒A/B t′ iff [t] ⇒[A/B] [t′],
  t ∗⇒A/B t′ iff [t] ∗⇒[A/B] [t′],
  [t] ∗⇒[A/B] [t′] implies [A] ⊢B (∀X) [t] = [t′],
  [t] ∗⇔[A/B] [t′] implies A ∪ B ⊢ (∀X) t = t′.

Therefore ∗⇔A/B is sound for satisfaction of A ∪ B and ∗⇔[A/B] is sound for satisfaction of A modulo B. Moreover, ∗⇔A/B ⊆ ≃X A∪B on TΣ(X).

Proof: The first assertion follows from the definitions of ⇒A/B and ⇒[A/B] (Definition 10.7.6); then the second follows by induction. The third follows from ⇒[A/B] being a rephrasing of (C6B), and the fourth follows from the third plus Proposition 10.5.2. □

We will consider two ways to render bidirectional term rewriting complete: the first assumes the conditions below, while the second uses retracts (Theorem 10.7.18).

Definition 10.7.10 An MCOSTRS (Σ, A, B) is sort decreasing iff t ⇒A/B t′ implies LS(t) ≥ LS(t′).
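The sort decreasing condition of Definition 10.7.10 is decidable because only the sorts of the substituted terms matter, so it suffices to try every assignment of sorts to a rule's variables. A Python sketch of that check over a toy signature of our own (sorts A ≤ B, an overloaded f, constants c : A and d : B; not an example from the text):

```python
# Decidability sketch for the sort-decreasing check: enumerate every
# sort assignment for the rule's variables and compare least sorts.
# Variables are strings; an operation term is (name, [args]).

from itertools import product

SORTS = ["A", "B"]
LEQ = {("A", "A"), ("A", "B"), ("B", "B")}   # the subsort order A <= B

CONSTANT_SORT = {"c": "A", "d": "B"}

def least_sort(t, asg):
    """Least sort of t when each variable gets the sort asg assigns it."""
    if isinstance(t, str):
        return asg[t]
    op, args = t
    if op in CONSTANT_SORT:
        return CONSTANT_SORT[op]
    if op == "f":            # overloaded f : A -> A and f : B -> B,
        return least_sort(args[0], asg)   # so the least rank matches
    raise ValueError(op)

def sort_decreasing(lhs, rhs, variables):
    """Check LS(theta(lhs)) >= LS(theta(rhs)) for every sort assignment."""
    return all(
        (least_sort(rhs, dict(zip(variables, ss))),
         least_sort(lhs, dict(zip(variables, ss)))) in LEQ
        for ss in product(SORTS, repeat=len(variables))
    )

print(sort_decreasing(("f", ["x"]), "x", ["x"]))   # f(x) -> x: True
print(sort_decreasing(("c", []), ("d", []), []))   # c -> d: False
```

The rule c → d fails the check because its rightside has least sort B while its leftside has least sort A, exactly the situation that retracts are later used to repair.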
A Σ-rule t → t′ if C is sort decreasing iff for any substitution θ : X → TΣ(Y) where X = var(t), we have LS(θ(t)) ≥ LS(θ(t′)). An order-sorted Σ-equation t = t′ is sort preserving iff for any substitution θ : X → TΣ(Y) where X = var(t), we have LS(θ(t)) = LS(θ(t′)). □

Notice that these conditions are decidable (provided Σ is finite). We will see that they also improve the properties of order-sorted rewriting. The next two results follow [113]. Proposition 10.7.11 is straightforward using induction. Since Theorem 10.7.18 gives completeness with retracts but without the sort decreasing assumption, we do not prove Theorem 10.7.11; however, the proofs in [113] carry over to join condition rewriting.

Theorem 10.7.11 If (Σ, B) is sort preserving, then t ≃B t′ iff t ∗⇔B t′. Moreover, (Σ, B) is sort preserving iff t ≃B t′ implies LS(t) = LS(t′). An MCOSTRS (Σ, A, B) is sort decreasing if A is sort decreasing and B is sort preserving. □

Definition 10.7.12 An MCOSTRS (Σ, A, B) is join condition canonical if and only if (Σ′, A′, B) is canonical, where: (1) Σ′ ⊆ Σ is least such that if ti = t′i is a condition of some rule r in A, then θ(ti) and θ(t′i) are in TΣ′ for all θ : X → TΣ where X = var(t) and t is the leftside of the head of rule r; and (2) A′ ⊆ A is least such that all conditional rules are in A′, and all unconditional rules that can be used in evaluating the conditions of rules in A are in A′.
□

The following is proved essentially the same way as Theorem 7.7.10:

Theorem 10.7.13 (Completeness) Given a join condition canonical MCOSTRS (Σ, A, B), the following four conditions are equivalent for any t, t′ ∈ TΣ(X):

  t ∗⇔A/B t′
  A ∪ B ⊢ (∀X) t = t′
  [A] ⊢B (∀X) t =B t′
  t ≃A∪B t′

Moreover, if (Σ, A, B) is Church-Rosser, then t ↓A/B t′ is also equivalent to the above. Finally, if (Σ, A, B) is canonical, then [[t]]A ≃B [[t′]]A is also equivalent. □

The next result is proved essentially the same way as Theorem 7.3.9:

Theorem 10.7.14 Given a ground canonical MCOSTRS (Σ, A, B) with A sort decreasing and B sort preserving, if t1, t2 are two normal forms of a ground term t under ⇒A/B then t1 ≃B t2. Moreover, the B-equivalence classes of ground normal forms under ⇒A/B form an initial (Σ, A ∪ B)-algebra, denoted NΣ,A/B or just NA/B, as follows, where [[t]] denotes any arbitrary normal form of t, and [[t]]B denotes the B-equivalence class of [[t]]:

(0) interpret σ ∈ Σ[],s as [[σ]]B in NΣ,A/B,s; and
(1) interpret σ ∈ Σs1...sn,s with n > 0 as the map sending ([[t1]]B, . . . , [[tn]]B) with ti ∈ TΣ,si to [[σ(t1, . . . , tn)]]B in NΣ,A/B,s.

Finally, NΣ,A/B is Σ-isomorphic to TΣ,A∪B. □

As with previous similar results, this justifies the use of rewrite rules and normal forms to represent abstract data types, but now in the very rich setting of conditional order-sorted rewriting modulo equations; Sections 10.8 and 10.9 give examples showing how powerfully expressive this setting can be.

The important Theorem 10.7.18 below shows why OBJ3 works well even without the sort decreasing assumption.
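The normal-form algebra construction of Theorem 10.7.14 can be seen concretely in a toy unsorted instance with B empty (our own illustration, using the usual Peano addition rules, not a system from the text): an operation is interpreted on normal forms by building the term and normalizing again.

```python
# Concrete sketch of interpreting an operation in the algebra N of
# ground normal forms (toy single-sorted case, B empty).

def normalize(t):
    """Rewrite with +(0,M) -> M and +(s N, M) -> s +(N, M) to normal form."""
    op = t[0]
    if op == "+":
        n, m = normalize(t[1]), normalize(t[2])
        if n == ("0",):
            return m                                  # +(0,M) -> M
        if n[0] == "s":
            return ("s", normalize(("+", n[1], m)))   # +(s N, M) -> s +(N, M)
        return ("+", n, m)
    if op == "s":
        return ("s", normalize(t[1]))
    return t

def interp_plus(n1, n2):
    """Interpret + in the normal-form algebra: build the term, renormalize."""
    return normalize(("+", n1, n2))

def numeral(k):
    return ("0",) if k == 0 else ("s", numeral(k - 1))

# The normal forms are exactly the Peano numerals, and + acts on them
# as expected: [[s s 0]] + [[s 0]] = [[s s s 0]].
print(interp_plus(numeral(2), numeral(1)) == numeral(3))
```

Canonicity of the two rules is what makes `interp_plus` well defined: any order of rewriting reaches the same numeral.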
But first, we make precise the notion of retract rewriting:

Definition 10.7.15 Given an MCOSTRS (Σ, A, B) with B sort preserving, the retract insertion rule is defined as follows: suppose a term t of least sort s can be rewritten at the top by applying a (possibly conditional) rule u → v to yield a term t′ (i.e., there exists θ such that t ≃B θ(u) and t′ ≃B θ(v)), and suppose the least sort s′ of t′ is not less than or equal to s; now let w(z) be a context with variable z of sort s, noting that s ≱ s′; then replace t by r:s″>s(t′), where s″ ≥ s, s′, which exists by local filtration. Retract rewriting consists of rewriting with (Σ, A⊗, B) plus the retract insertion rule. Let ∗⇒A⊗//B denote retract rewriting. □

The retract insertion rule is sound, because it can be decomposed into two sound deductions: first substitute t = t′ into w(r:s″>s(z′)), for z′ a variable of sort s″, to obtain w(r:s″>s(t)) = w(r:s″>s(t′)), and then apply retract elimination to obtain w(t) = w(r:s″>s(t′)). Note that, as an optimization, a non-sort decreasing rule can be replaced by the corresponding rule with a retract on its rightside; OBJ3 in fact does this. Also note that a match with B cannot produce terms that require inserting or deleting retracts (although it may manipulate such terms).

Although the relation ∗⇔A/B is not complete for Σ-terms, the relation ∗⇔A⊗//B is complete (and sound) for Σ-terms. We illustrate this with the specification of Exercise 10.7.1, which was originally used to show the incompleteness of ∗⇔A/B:

Example 10.7.16 Example 10.7.4 successfully used the equations of Exercise 10.3.2 backwards, but Exercise 10.7.1 was restricted to forward rewriting.
Can we show f(a) = f(b) using ∗⇔A/B? No, but we can with ∗⇔A⊗//B: rewrite the retract term f(r:C>A(c)) in two different ways: first to f(r:C>A(a)) and second to f(r:C>A(b)), using respectively the first and second rules. Then by retract elimination, the first equals f(a) and the second equals f(b). □

Lemma 10.7.17 If B is sort preserving, then ≃B = ≃B⊗ for Σ-terms, so that ≃(A∪B)⊗ = ≃A⊗∪B, again for Σ-terms.

Proof: The first assertion follows because equational reasoning with B cannot insert or delete retracts, since B is sort preserving. The second assertion uses (A ∪ B)⊗ = A⊗ ∪ B⊗. □

Recall that t ≃E t′ means E ⊢ (∀X) t = t′.

Theorem 10.7.18 (Completeness) If (Σ, A, B) is an MCOSTRS with B sort preserving and A ∪ B faithful, then t ∗⇔A⊗//B t′ iff A ∪ B ⊢Σ (∀X) t = t′ for any t, t′ ∈ TΣ.

Proof: Because order-sorted rewriting is sound, t ∗⇔A⊗//B t′ implies t ≃A⊗∪B t′ for t, t′ ∈ TΣ, which by Lemma 10.7.17 is equivalent to t ≃(A∪B)⊗ t′. Theorem 10.6.6 now gives equivalence of this to t ≃A∪B t′.

For the converse, suppose t ≃A∪B t′ for t, t′ ∈ TΣ. By the above, this is equivalent to t ≃A⊗∪B t′. We will show t ∗⇔A⊗//B t′ by simulating the proof for t ≃A∪B t′ using ⇔A⊗//B; the essential difference is that the first can only substitute equals for equals, whereas the second allows first proving and then using lemmas. Using a lemma u ≃ v is the same as applying a rule u → v (or v → u, which is treated the same way), unless the rule is non-sort decreasing, in which case, when v′ = θ(v) is substituted for u′ = θ(u) in context w, an ill-formed Σ-term w(v′) may result.
Then retract rewriting will substitute r:s′>s(v′) for u′, thus obtaining the well-formed Σ⊗-term w(r:s′>s(v′)). If the proof of u ≃ v involves other lemmas, the same is done recursively, substituting the simulated proof of u ≃ v using ∗⇔A⊗//B for the use of u → v. Doing this recursively for all lemmas finally yields a rewriting sequence for t ∗⇔A⊗//B t′. □

The implementation of order-sorted rewriting in OBJ3 achieves almost the efficiency of many-sorted rewriting, using clever techniques described in detail in [113]. In addition, retracts are efficiently handled by builtin Lisp code, rather than by interpreting the theory of retracts.

Exercise 10.7.3 Check whether Example 10.6.4 is sort decreasing according to Definition 10.7.10. Now enrich the theory MORE-NON-DED so that it permits a non-trivial deduction similar to that in Example 10.6.2, involving the intermediate use of retracts, and then show how to accomplish this deduction using the relation ∗⇔A/B. □

Exercise 10.7.4 Use OBJ's apply commands to prove the equation f(a) = f(b) for the specification OSRW-EQ in Exercise 10.3.2. □

The results of Section 7.7.1 on adding new constants generalize straightforwardly to conditional order-sorted rewriting modulo equations; we state these generalizations explicitly because of their importance for theorem proving, and because they appear to be new in this context.

Proposition 10.7.19 If an MCOSTRS (Σ, A, B) is terminating, or Church-Rosser, or locally Church-Rosser, then so is (Σ(X), A, B), for any suitable countable variable symbol set X. □

Proposition 10.7.20 An MCOSTRS (Σ, A, B) is ground terminating if (Σ(X), A, B) is ground terminating, where X is a variable set for Σ; moreover, if Σ is non-void, then (Σ, A, B) is ground terminating iff (Σ(X), A, B) is ground terminating.
□

Corollary 10.7.21 If Σ is non-void, then an MCOSTRS (Σ, A, B) is ground terminating iff it is terminating. □

Proposition 10.7.22 An MCOSTRS (Σ, A, B) is Church-Rosser iff (Σ(XωS), A, B) is ground Church-Rosser, and is locally Church-Rosser iff (Σ(XωS), A, B) is ground locally Church-Rosser. □

This subsection generalizes termination results for MCTRS's in Section 7.7.

Exercise 10.7.5 Generalize Propositions 7.5.6 and 7.5.7 to MCOSTRS's, and give proofs. □

Exercise 10.7.6 Apply the generalizations of Propositions 7.5.6 and 7.5.7 to MCOSTRS's (that are not MCTRS's) to prove their termination. □

Given a poset P, we can define weak and strong ρ-monotonicity of order-sorted conditional rewrite rules modulo B, of order-sorted substitution modulo B, just as in Definition 5.5.3, and of operations in Σ, except that TΣ and TΣ({z}s) are replaced by TΣ,B and TΣ,B({z}s), respectively; note that as before, the inequalities for a rule are only required to hold when all the conditions of the rule converge (modulo B). The following generalizes Theorem 7.7.20:

Theorem 10.7.23 Let (Σ, A, B) be an MCOSTRS with Σ non-void and A′ ⊆ A unconditional and ground terminating; let P be a poset and let N = A − A′. If there is ρ : TΣ,B → P such that

(1) each rule in A′ is weak ρ-monotone,
(2) each rule in N is strict ρ-monotone,
(3) each operation in Σ is strict ρ-monotone, and
(4) P is Noetherian, or at least for each t ∈ (TΣ,B)s there is some Noetherian poset Pt,s ⊆ Ps such that t ∗⇒[A/B] t′ implies ρ(t′) ∈ Pt,s,

then (Σ, A, B) is ground terminating. □

Exercise 10.7.7 Prove Theorem 10.7.23. □

Exercise 10.7.8 Show that if C = (Σ, A, B) is an MCOSTRS and CU = (Σ, AU, B), where AU contains the rules in A with their conditions removed, then C is Church-Rosser (or ground Church-Rosser) if CU is.
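The measure argument behind Theorem 10.7.23 can be sketched concretely by taking ρ to be term size, which maps into the Noetherian poset of natural numbers and is strictly monotone in every operation. The Python sketch below (our own sample-based check over two NAT-style rules, an illustration rather than a proof) tests that sampled rule instances strictly decrease ρ:

```python
# Sketch of a measure-based termination check: rho(t) = number of
# operation symbols, a Noetherian measure that every context preserves
# strictly, so strict decrease at the redex lifts to the whole term.

def size(t):
    """rho(t): count the operation symbols in t (variables count as 1)."""
    if isinstance(t, str):
        return 1
    op, args = t
    return 1 + sum(size(a) for a in args)

RULES = [
    (("p", [("s", ["N"])]), "N"),        # p s N -> N
    (("+", ["N", ("0", [])]), "N"),      # N + 0 -> N
]

def substitute(t, binding):
    if isinstance(t, str):
        return binding.get(t, t)
    op, args = t
    return (op, [substitute(a, binding) for a in args])

def decreases_on(instances):
    """True iff every sampled instance of every rule strictly shrinks."""
    return all(
        size(substitute(lhs, b)) > size(substitute(rhs, b))
        for lhs, rhs in RULES
        for b in instances
    )

samples = [{"N": ("0", [])}, {"N": ("s", [("s", [("0", [])])])}]
print(decreases_on(samples))   # True: each sampled instance shrinks
```

For these two rules the decrease in fact holds for every substitution, since each rightside is a proper subterm of its leftside; the sampling is only a spot check of that fact.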
□

As always, ARS results apply directly, including the Newman Lemma, the Hindley-Rosen Lemma (Exercise 5.7.5) and Proposition 7.7.22, so we do not state these here; the results that we do state are actually rather weak. Perhaps the most generally useful methods for proving Church-Rosser are based on the Newman Lemma, since it is usually much easier to prove the local Church-Rosser property. As mentioned in Sections 7.6 and 7.7.3, and in more detail in Chapter 12, although the Critical Pair Theorem (5.6.9) does not generalize to modulo B rewriting, the local Church-Rosser property can still in many cases be checked by a variant of the Knuth-Bendix algorithm [117].

Exercise 10.7.9 Does Proposition 7.6.9 generalize to MCOSTRS's? Give a proof or a counterexample. □

Exercise 10.7.10 Generalize Proposition 7.7.23 to MCOSTRS's, give a proof, and then apply it to an example (not an MCTRS) to prove the Church-Rosser property. Hint: Consider a variant of PROPC where the truth values are a subsort of the propositional expressions. □

10.8 Number System

[Figure 10.3 (subsort diagram): the sorts Zero, NzNat, Nat, NzInt, Int, NzRat, Rat, NzImag, Imag, NzCpx, Cpx, NzJ, J, NzQuat, and Quat, ordered by the subsort declarations of the objects below.]

Figure 10.3: Subsort Structure for Number System

This section presents an extended example, a rather complete number hierarchy, from the naturals up to the quaternions, including also the integer, rational, and complex numbers, with many of the usual operations upon them; however, it does not include the real numbers, and the complex numbers and quaternions are based on the rationals instead of the reals. A number of test cases are given.
This example is from [82], and much of the work on it was done by Prof. José Meseguer and Mr. Tim Winkler. It is interesting to notice that multiplication is not commutative on quaternions, although it is commutative on the subsorts of complexes, rationals, etc., and that this situation is allowed by our notion of overloading, as well as supported by the OBJ implementation. This example is also used in [113], where it is annotated with much information about how its features are efficiently implemented in OBJ3 using techniques that include rule specializations and general variables.

  obj NAT is sorts Nat NzNat Zero .
    subsorts Zero NzNat < Nat .
    op 0 : -> Zero .
    op s_ : Nat -> NzNat .
    op p_ : NzNat -> Nat .
    op _+_ : Nat Nat -> Nat [assoc comm] .
    op _*_ : Nat Nat -> Nat .
    op _*_ : NzNat NzNat -> NzNat .
    op _>_ : Nat Nat -> Bool .
    op d : Nat Nat -> Nat [comm] .
    op quot : Nat NzNat -> Nat .
    op gcd : NzNat NzNat -> NzNat [comm] .
    vars N M : Nat .  vars N' M' : NzNat .
    eq p s N = N .
    eq N + 0 = N .
    eq (s N) + (s M) = s s (N + M) .
    eq N * 0 = 0 .
    eq 0 * N = 0 .
    eq (s N) * (s M) = s (N + (M + (N * M))) .
    eq 0 > M = false .
    eq N' > 0 = true .
    eq s N > s M = N > M .
    eq d(0,N) = N .
    eq d(s N, s M) = d(N,M) .
    eq quot(N,M') = if ((N > M') or (N == M')) then
      s quot(d(N,M'),M') else 0 fi .
    eq gcd(N',M') = if N' == M' then N' else (if N' > M' then
      gcd(d(N',M'),M') else gcd(N',d(N',M')) fi) fi .
  endo

  obj INT is sorts Int NzInt .
    protecting NAT .
    subsorts NzNat < Nat NzInt < Int .
    op -_ : Int -> Int .
    op -_ : NzInt -> NzInt .
    op _+_ : Int Int -> Int [assoc comm] .
    op _*_ : Int Int -> Int .
    op _*_ : NzInt NzInt -> NzInt .
    op quot : Int NzInt -> Int .
    op gcd : NzInt NzInt -> NzNat [comm] .
    vars I J : Int .  vars I' J' : NzInt .
    vars N' M' : NzNat .
    eq - - I = I .
    eq - 0 = 0 .
    eq I + 0 = I .
    eq M' + (- N') = if N' == M' then 0 else
      (if N' > M' then - d(N',M') else d(N',M') fi) fi .
    eq (- I) + (- J) = -(I + J) .
    eq I * 0 = 0 .
    eq 0 * I = 0 .
    eq I * (- J) = -(I * J) .
    eq (- J) * I = -(I * J) .
    eq quot(0,I') = 0 .
    eq quot(- I',J') = - quot(I',J') .
    eq quot(I',- J') = - quot(I',J') .
    eq gcd(- I',J') = gcd(I',J') .
  endo

  obj RAT is sorts Rat NzRat .
    protecting INT .
    subsorts NzInt < Int NzRat < Rat .
    op _/_ : Rat NzRat -> Rat .
    op _/_ : NzRat NzRat -> NzRat .
    op -_ : Rat -> Rat .
    op -_ : NzRat -> NzRat .
    op _+_ : Rat Rat -> Rat [assoc comm] .
    op _*_ : Rat Rat -> Rat .
    op _*_ : NzRat NzRat -> NzRat .
    vars I' J' : NzInt .  vars R S : Rat .  vars R' S' : NzRat .
    eq R / (R' / S') = (R * S') / R' .
    eq (R / R') / S' = R / (R' * S') .
    ceq J' / I' = quot(J',gcd(J',I')) / quot(I',gcd(J',I'))
      if gcd(J',I') =/= s 0 .
    eq R / s 0 = R .
    eq 0 / R' = 0 .
    eq R / (- R') = (- R) / R' .
    eq -(R / R') = (- R) / R' .
    eq R + (S / R') = ((R * R') + S) / R' .
    eq R * (S / R') = (R * S) / R' .
    eq (S / R') * R = (R * S) / R' .
  endo

  obj CPX-RAT is sorts Cpx Imag NzImag NzCpx .
    protecting RAT .
    subsort Rat < Cpx .
    subsort NzRat < NzCpx .
    subsorts NzImag < NzCpx Imag < Cpx .
    subsorts Zero < Imag .
    op _i : Rat -> Imag .
    op _i : NzRat -> NzImag .
    op -_ : Cpx -> Cpx .
    op -_ : NzCpx -> NzCpx .
    op _+_ : Cpx Cpx -> Cpx [assoc comm] .
    op _+_ : NzRat NzImag -> NzCpx [assoc comm] .
    op _*_ : Cpx Cpx -> Cpx .
    op _*_ : NzCpx NzCpx -> NzCpx .
    op _/_ : Cpx NzCpx -> Cpx .
    op _
    eq (R i) + (S i) = (R + S) i .
    eq -(R' + (S' i)) = (- R') + ((- S') i) .
    eq -(S' i) = (- S') i .
    eq R * (S i) = (R * S) i .
    eq (S i) * R = (R * S) i .
    eq (R i) * (S i) = -(R * S) .
    eq C * (A + B) = (C * A) + (C * B) .
    eq (A + B) * C = (C * A) + (C * B) .
    eq R
    eq Q / (C + (C' j)) = Q * (((C

The equation that defines the squared norm function |_|^2 is interesting, because given a non-zero rational as input, it should return a non-zero rational, but the rightside of
the equation does not parse as a non-zero rational, although it can be proved that it always yields one. The attribute [memo] on the constants . . . causes OBJ to cache the normal forms of these terms, and then use these cached values instead of recomputing them each time they are needed.

Exercise 10.8.1 Show that INT, RAT, CPX, and QUAT are rings, where the theory of rings is given by the following theories, where the first defines commutative (also called Abelian) groups with additive notation:

  th ABGP is sort Elt .
    op 0 : -> Elt .
    op _+_ : Elt Elt -> Elt [assoc comm id: 0] .
    op -_ : Elt -> Elt .
    var X : Elt .
    eq X + (- X) = 0 .
  endth

  th RING is us ABGP .
    op 1 : -> Elt .
    op _*_ : Elt Elt -> Elt [assoc id: 1 prec 30] .
    vars X Y Z : Elt .
    eq X * 0 = 0 .  eq 0 * X = 0 .
    eq X * (Y + Z) = (X * Y) + (X * Z) .
    eq (Y + Z) * X = (Y * X) + (Z * X) .
  endth

Show that INT, RAT, and CPX are commutative rings, where the theory of commutative rings is as above, except that a comm attribute is added for * and the last equation is omitted. Show that QUAT is not a commutative ring. □

Exercise 10.8.2 Show that RAT, CPX, and QUAT are fields, where the theory of fields is given by the following:

  th FIELD is us RING .
    sort NzElt .
    subsort NzElt < Elt .
    op _− : NzElt -> NzElt [prec 2] .
    var X : NzElt .
    eq X * X − = 1 .  eq X − * X = 1 .
  endth

Show that RAT and CPX are commutative fields, where the theory of commutative fields is as above, except that it imports the theory of commutative rings. Show that QUAT is not a commutative field.
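The key point of the two exercises above, that QUAT satisfies the ring and field equations but not commutativity, can be spot-checked numerically. A Python sketch using Hamilton's quaternion product on tuples (a, b, c, d) representing a + bi + cj + dk (our own illustration, independent of the OBJ specification):

```python
# Hamilton's quaternion product: associative and unital, but not
# commutative, e.g. i * j = k while j * i = -k.

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)

print(qmul(i, j))   # k:  (0, 0, 0, 1)
print(qmul(j, i))   # -k: (0, 0, 0, -1)
```

This is exactly why the comm attribute can be added to _*_ for INT, RAT, and CPX but not for QUAT, while associativity (an attribute of _*_ in RING) survives.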
□

It is surprising that fields can be specified so simply with order-sorted algebra, but it is not difficult to show that there is an isomorphism between the classes of models of FIELD, and of fields in the ordinary sense, given by the function U, where if F is a model of FIELD, then U(F) is the partial algebra with X − undefined when X = 0.

Exercise 10.8.3 Use OBJ to prove in the theory FIELD that if X, Y are non-zero, then so is X * Y. Conclude from this that the overload declaration

  op _*_ : NzElt NzElt -> NzElt [assoc id: 1] .

can be added to FIELD. □

Exercise 10.8.4 Use OBJ to prove (∀X, Y : Rat) (X + i * Y) * (X − i * Y) = |X|² + |Y|². Hint: Do not neglect the cases where X = Y = 0. □

10.9 Multiple Representation and Coercion

This section gives an example based on one in [81, 57], showing how OSA handles multiple representations of a single abstract data type, by providing automatic coercions among the representations. The type here is points in the plane (or vectors from the origin), and the representations are Cartesian and polar coordinates. The specification below uses the module FLOAT, which is OBJ's approximation to the real number field (which cannot be fully implemented on a computer). BOBJ [71, 72], the newest version of OBJ, is needed for its sort constraints, which allow users to declare new subsorts of old sorts. The keyword for sort constraints is mb, after the syntax of Maude [30]; the first defines a subsort NNeg of non-negatives for Float, and the second defines a further subsort for angles. Automatic coercions are defined by the three equations with retracts; the first two convert a point in polar coordinates to Cartesian coordinates when the context requires it, and the third does the opposite. Since both the sum and distance functions are only defined for the Cartesian representation, applying them to polar points requires coercion, as illustrated in the three reductions below the specification.
  obj POINT is pr FLOAT .
    sorts NNeg Point Cart Polar Angle .
    subsorts Angle < NNeg < Float .
    var N : NNeg .  var A : Angle .  var F : Float .
    mb F : NNeg if F >= 0 .
    op _**2 : Float -> NNeg [prec 2] .
    eq F **2 = F * F .
    mb F : Angle if 0 <= F and F < 2 * pi .
    subsorts Cart Polar < Point .
    op <_,_> : Float Float -> Cart .
    op _+_ : Cart Cart -> Cart .
    vars F1 F2 F3 F4 : Float .
    eq < F1, F2 > + < F3, F4 > = < F1 + F3, F2 + F4 > .
    op [_,_] : Angle NNeg -> Polar .
    eq r:Point>Cart([A, N]) =

Exercise 10.9.1 Add a function to the above code that rotates a point in polar coordinates about the origin, and use it to define a negation function for addition. Now run several test cases on these functions that require coercion. You can download the latest version of BOBJ from ftp://ftp.cs.ucsd.edu/pub/fac/goguen/bobj/ □

10.10 Literature

Order-sorted algebra has evolved through several stages. The older versions OBJT and OBJ1 of OBJ used error algebras [53], which can fail to have initial models [150]. OSA began in 1978 in [54], and was further developed in papers including [82, 113, 152] and [165]. This chapter summarizes many basic results from OSA, mainly following [82]. A rather comprehensive survey up to 1992 is given in [68], and [77] is a less formal introduction with many examples.

Order-sorted rewriting was first treated in depth in [69], and then further developed in [113]. Both papers focus on operational semantics, i.e., on how to efficiently implement order-sorted term rewriting in OBJ: the first translates to many-sorted rewriting, while the second computes a set of rules that work on a term data structure that keeps track of the ranks of operations and the sorts of subterms; this is more efficient, and can be considered a weak form of compilation. Neither of these papers treats join conditional rewriting, so all the results about conditional order-sorted rewriting in Section 10.7 are new (we have argued in previous chapters that join semantics is the most appropriate for OBJ).
The semantics of retracts in [78] does not cover rewriting modulo equations or retract rewriting, although this is discussed in [90] under the name "safe rewriting," again without conditional rules or modulo equations. Theorem 10.7.18 is an important and perhaps surprising new result. The example in Section 10.9 is also new in this simple form.

A Note to Lecturers: This chapter is a culmination of this book, in the sense that we have been gradually building up towards a complete, rigorous treatment of the full operational and algebraic semantics of OBJ3, which is conditional order-sorted rewriting with retracts, modulo equations, with its theories of equational deduction and of algebras as models, together with practical methods for specification and verification, and practical ways to check key properties such as Church-Rosser and termination. This chapter seems to be the only place where such an exposition is available, and many of its results are new, e.g., Theorem 10.7.18. On the other hand, many other results are fairly straightforward generalizations of results in prior chapters. In my opinion, any course on algebraic specification should at least state and illustrate the main theorems of this chapter, making clear the great expressive power that results from combining all these features, although it is not necessary to go over all the counterexamples, auxiliary results, and proofs.

Generic Modules

Insert material from [59] here. With this technique, we can verify the correctness of generic objects in the sense of OBJ, as well as of higher-order functions as used in functional programming.

Unification

Use the approach of "What is Unification?" [60], introducing some basic category theory to define unification; see also the discussion at the end of Chapter 8.

The discussion of the Church-Rosser property in Section 5.6 included a claim that it is decidable for terminating TRSs. We now discharge that claim.
First, recall from Chapter 5 that, given a TRS A, if the left sides t, t′ of two rules have an overlap (in the sense of Definition 5.6.2) θ(t₀) = θ′(t′), where t₀ is a subterm of t, then θ(t) can be rewritten in two ways (one for each rule). The following, which is needed in Chapter 5, follows from the existence of most general unifiers, which can be computed as described above:

Proposition 12.0.1  If terms t, t′ overlap at a subterm t₀ of t, then there is a most general overlap p, in the sense that any other overlap of t, t′ at t₀ is a substitution instance of p.  □

Recall that such a most general overlap is called a superposition, and that the pair of terms resulting from applying the two rules to the term θ(t) is called a critical pair. If the two terms in a critical pair can be rewritten to a common term using rules in A, then that critical pair is said to converge or to be convergent. The following was proved in Chapter 5, and is important since it covers non-orthogonal TRSs that cannot be checked with Theorem 5.6.4:

Theorem 5.6.9  A TRS is locally Church-Rosser if all its critical pairs are convergent.  □

With the Newman Lemma (Proposition 5.6.1), this implies:

Corollary 5.6.10  A terminating TRS is Church-Rosser iff all its critical pairs are convergent, in which case it is also canonical.  □

Proofs of canonicity by orthogonality can be mechanized by using an algorithm that checks if each pair of rules is overlapping, and checks left linearity of each rule (which is trivial).

Section 5.6 promised an algorithm from the above corollary; also the above just repeats material in that section. Section 5.6 also promises the unification algorithm. Discuss the Church-Rosser property for the TRSs GROUPC of Example 5.2.8, AND of Example 5.5.7, and MONOID of Exercise 5.2.7.
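The unification algorithm promised for Section 5.6 can be sketched concretely. The following is a minimal Robinson-style procedure in Python, not the categorical treatment planned for this chapter; the term representation (compound terms as ("op", [args]) pairs, variables as plain strings) and all function names are assumptions made for this illustration.

```python
# Minimal syntactic unification with occurs check: a sketch only.
# Compound terms are ("op", [args]) tuples; any bare string is a variable.

def is_var(t):
    return isinstance(t, str)

def walk(t, subst):
    """Follow variable bindings until an unbound variable or a compound term."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """Occurs check: does variable v appear in term t under subst?"""
    t = walk(t, subst)
    if is_var(t):
        return t == v
    return any(occurs(v, a, subst) for a in t[1])

def unify(t1, t2, subst=None):
    """Return a most general unifier as a dict, or None on failure."""
    if subst is None:
        subst = {}
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return None if occurs(t1, t2, subst) else {**subst, t1: t2}
    if is_var(t2):
        return unify(t2, t1, subst)
    op1, args1 = t1
    op2, args2 = t2
    if op1 != op2 or len(args1) != len(args2):
        return None     # clash of operation symbols or arities
    for a, b in zip(args1, args2):
        subst = unify(a, b, subst)
        if subst is None:
            return None
    return subst
```

For example, unifying f(X, g(Y)) with f(g(Z), X) yields the most general unifier {X ↦ g(Z), Y ↦ Z}, while unifying X with f(X) fails by the occurs check. Bindings in the returned substitution may chain through other variables, so they should be read up to application of walk.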
Are these already in Chapter 5? Discuss Knuth-Bendix [117], and proof by consistency using completion, which in general is not a very satisfying technique. Do not prove the correctness of Knuth-Bendix completion. Do the MSA case first, then OSA. Do Example 5.8.37 in algorithmic detail; can copy and edit. Also discuss completion procedures for matching modulo, as promised in Section 7.3.3. Check carefully against Chapters 5, 7, 10.

We also need the following:

Proposition 12.0.2  For B containing any combination of the commutative, associative, and identity laws, if terms t, t′ overlap at a subterm t₀ of t, then there is a most general overlap p, in the sense that any other overlap of t, t′ at t₀ is a substitution instance of p.  □

The following can now be done using the machinery developed above:

Exercise 12.1.1  Recalling that Example 7.5.10 shows termination of DNF, use Corollary 5.6.10 to show that the MTRS DNF is locally Church-Rosser, and therefore canonical.  □

Exercise 12.1.2  Show that the normal forms of the MTRS DNF are exactly the disjunctive normal forms of the propositional formulae.  □

Exercise 12.1.3  Recalling that Example 7.5.9 shows termination of PROPC, use Corollary 5.6.10 to complete the proof of Hsiang's Theorem (Theorem 7.3.13) that PROPC is a canonical MTRS.  □

Hidden Algebra

Not really sure whether to do this; anyway, the following is just a rough sketch of some preliminary ideas for introductory remarks.

Computing applications typically involve states, and it can be awkward, or even impossible, to treat these applications in a purely functional style. Hidden algebra substantially extends ordinary algebra by distinguishing sorts used for data from sorts used for states, calling them respectively visible and hidden sorts. It also changes the notion of satisfaction to behavioral (also called observational) satisfaction, so that equations need not be literally satisfied, but need only appear to be satisfied under all possible experiments.
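Behavioral satisfaction can be made concrete with a toy example; the following Python fragment is an illustration invented for this sketch, not an example from the book. A stack state is a (pointer, storage) pair; push leaves junk in the storage above the pointer, so the equation pop(push(n, s)) = s fails literally, yet every visible experiment (some number of pops followed by top) gives the same result on both sides.

```python
# Toy illustration of behavioral satisfaction (names are hypothetical).
# States are (pointer, storage) pairs with storage a dict; only top is visible.

def push(n, s):
    p, mem = s
    return (p + 1, {**mem, p + 1: n})

def pop(s):
    p, mem = s                  # junk above the pointer is not erased
    return (p - 1, mem) if p >= 0 else s

def top(s):
    p, mem = s
    return mem.get(p, 0)        # the visible observation; 0 when empty

def behaviorally_equal(s1, s2, depth=5):
    """Compare two states under all experiments pop^k; top, up to a depth."""
    for _ in range(depth):
        if top(s1) != top(s2):
            return False
        s1, s2 = pop(s1), pop(s2)
    return True
```

Here behaviorally_equal only samples experiments up to a fixed depth, which suffices for this finite illustration; a genuine hidden-algebra argument quantifies over all experiments.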
Hidden algebra is powerful enough to give a semantics for the object paradigm, including inheritance and concurrency.

Whereas initial algebra semantics takes a somewhat static view of data structures and systems, as reflected in the central result that the initial model is unique up to isomorphism, hidden algebra takes a more dynamic view, directly addressing behavior and abstracting away from implementation, with its notion of behavioral satisfaction for equations. In philosophical terms, the evolution from initial algebra to hidden algebra is similar to the evolution from Plato's theory of static eternal ideals, to Aristotle's attempts to confront the kinds of change and development that can be observed especially in the biological realm.

A General Framework (Institutions)

This book takes the notions of model and satisfaction as basic, and checks the soundness of each proposed rule of deduction with respect to them before accepting it for use in theorem proving. Since we do not assume a single pre-determined logical system, the concepts of model and satisfaction cannot be fixed once and for all. Instead, we adopt a more general approach, influenced by the theory of institutions [67], and gradually enrich our language and models, working upward from simple equational logic.

Assume a concept of signature. For each signature Σ, assume that there are a class M_Σ of Σ-models, an algebra T_Σ of Σ-terms, an algebra F_Σ of Σ-formulae, a concept of Σ-substitution, and a relation ⊨_Σ of Σ-satisfaction of formulae by terms. Each concept should be defined recursively, and later concepts can be defined recursively over earlier ones.

Let R be a language of sentences that can be directly checked by OBJ, e.g., conjunctions of reductions, let G be a language for goals, and let L be a "meta" language that includes both G and R.
The justification for a proof score is a sequence of applications of proof measures which transforms a goal into an R sentence, which can then be translated into an OBJ program and run.

Elements of G express semantic facts which we wish to verify; a typical goal sentence is E ⊢ e, where E is a conjunction (perhaps represented as a set) of formulae and e is a single formula. We will call sentences of this form (atomic) turnstile sentences. Our aim is then to transform (possibly rather exotic) turnstile sentences into R sentences that can be checked by OBJ.

(The above can be seen as an attempt to give a more down to earth exposition of the theory of institutions [67].)

MORE TO COME HERE.

OBJ3 Syntax and Usage

This appendix gives a formal description of OBJ3 syntax, followed by some practical advice on how to use it. OBJ3 is the latest implementation of OBJ; it is an equational language with a mathematical semantics given by order-sorted equational logic, and a powerful type system featuring subtypes and overloading; these latter features allow us to define and handle errors in a precise way. In addition, OBJ3 has user-definable abstract data types with user-definable mixfix syntax, and a powerful parameterized module facility that includes views and module expressions. OBJ is declarative in the sense that its statements assert properties that solutions should have; i.e., they describe the problem. A subset of OBJ is executable using term rewriting, and all of it is provable. See [90] for more details.

OBJ3 syntax is described using the following extended BNF notation: the symbols { and } are used as meta-parentheses; the symbol | is used to separate alternatives; [ ] pairs enclose optional syntax; ... indicates zero or more repetitions of the preceding unit; and "x" denotes x literally. As an application of this notation, A {, A}... indicates a non-empty list of A's separated by commas.
Finally, --- indicates comments in this syntactic description, as opposed to comments in OBJ3 code.

--- top-level ---
⟨OBJ-Top⟩ ::= { ⟨Object⟩ | ⟨Theory⟩ | ⟨View⟩ | ⟨Make⟩ | ⟨Reduction⟩ |
    in ⟨FileName⟩ | quit | eof | start ⟨Term⟩ . |
    open [ ⟨ModExp⟩ ] . | openr [ ⟨ModExp⟩ ] . | close |
    ⟨Apply⟩ | ⟨OtherTop⟩ }...
⟨Make⟩ ::= make ⟨Interface⟩ is ⟨ModExp⟩ endm
⟨Reduction⟩ ::= reduce [ in ⟨ModExp⟩ : ] ⟨Term⟩ .
⟨Apply⟩ ::= apply {reduction | red | print | retr | -retr with sort ⟨Sort⟩ |
    ⟨RuleSpec⟩ [ with ⟨VarId⟩ = ⟨Term⟩ {, ⟨VarId⟩ = ⟨Term⟩}... ]}
    {within | at} ⟨Selector⟩ {of ⟨Selector⟩}...
⟨RuleSpec⟩ ::= [ - ][ ⟨ModId⟩ ] . ⟨RuleId⟩
⟨RuleId⟩ ::= ⟨Nat⟩ | ⟨Id⟩
⟨Selector⟩ ::= term | top | ( ⟨Nat⟩ ... ) | [ ⟨Nat⟩ [ .. ⟨Nat⟩ ] ] |
    "{" ⟨Nat⟩ {, ⟨Nat⟩}... "}"
    --- note that "()" is a valid selector
⟨OtherTop⟩ ::= ⟨RedLoop⟩ | ⟨Commands⟩ | call-that ⟨Id⟩ . |
    test reduction [ in ⟨ModExp⟩ : ] ⟨Term⟩ expect: ⟨Term⟩ . | ⟨Misc⟩
    --- "call that ⟨Id⟩ ." is an abbreviation for "let ⟨Id⟩ = ."
⟨RedLoop⟩ ::= rl {. | ⟨ModId⟩} { ⟨Term⟩ .}... .
⟨Commands⟩ ::= cd ⟨Sym⟩ | pwd | ls | do ⟨DoOption⟩ . | select [ ⟨ModExp⟩ ] . |
    set ⟨SetOption⟩ . | show [ ⟨ShowOption⟩ ] .
    --- in select, can use "open" to refer to the open module
⟨DoOption⟩ ::= clear memo | gc | save ⟨Sym⟩ ... | restore ⟨Sym⟩ ... | ?
⟨SetOption⟩ ::= {abbrev quals | all eqns | all rules | blips | clear memo |
    gc show | include BOOL | obj2 | verbose | print with parens |
    reduce conditions | show retracts | show var sorts | stats |
    trace | trace whole} ⟨Polarity⟩ | ?
⟨Polarity⟩ ::= on | off
⟨ShowOption⟩ ::= {abbrev | all | eqs | mod | name | ops | params |
    principal-sort | [ all ] rules | select | sign | sorts | subs | vars}
    [ ⟨ParamSpec⟩ | ⟨SubmodSpec⟩ ] [ ⟨ModExp⟩ ] |
    [ all ] modes | modules | pending | op ⟨OpRef⟩ |
    [ all ] rule ⟨RuleSpec⟩ | sort ⟨SortRef⟩ | term | time | verbose |
    ⟨ModExp⟩ | ⟨ParamSpec⟩ | ⟨SubmodSpec⟩ | ?
    --- can use "open" to refer to the open module
⟨ParamSpec⟩ ::= param ⟨Nat⟩
⟨SubmodSpec⟩ ::= sub ⟨Nat⟩
⟨Misc⟩ ::= eval ⟨Lisp⟩ | eval-quiet ⟨Lisp⟩ | parse ⟨Term⟩ . | ⟨Comment⟩
⟨Comment⟩ ::= *** ⟨Rest-of-line⟩ | ***> ⟨Rest-of-line⟩ |
    *** ( ⟨Text-with-balanced-parentheses⟩ ) ⟨Rest-of-line⟩
    --- ⟨Rest-of-line⟩ is the remaining text of the current line

--- modules ---
⟨Object⟩ ::= obj ⟨Interface⟩ is { ⟨ModElt⟩ | ⟨Builtins⟩ }...
    endo
⟨Theory⟩ ::= th ⟨Interface⟩ is ⟨ModElt⟩ ... endth
⟨Interface⟩ ::= ⟨ModId⟩ [ [ ⟨ModId⟩ ... :: ⟨ModExp⟩ {, ⟨ModId⟩ ... :: ⟨ModExp⟩}... ] ]
⟨ModElt⟩ ::= {protecting | extending | including | using} ⟨ModExp⟩ . |
    using ⟨ModExp⟩ with ⟨ModExp⟩ {and ⟨ModExp⟩}... |
    define ⟨SortId⟩ is ⟨ModExp⟩ . |
    principal-sort ⟨Sort⟩ . |
    sort ⟨SortId⟩ ... . |
    subsort ⟨Sort⟩ ... { < ⟨Sort⟩ ... }... . |
    as ⟨Sort⟩ : ⟨Term⟩ if ⟨Term⟩ . |
    op ⟨OpForm⟩ : ⟨Sort⟩ ... -> ⟨Sort⟩ [ ⟨Attr⟩ ] . |
    ops { ⟨Sym⟩ | ( ⟨OpForm⟩ )}... : ⟨Sort⟩ ... -> ⟨Sort⟩ [ ⟨Attr⟩ ] . |
    op-as ⟨OpForm⟩ : ⟨Sort⟩ ... -> ⟨Sort⟩ for ⟨Term⟩ if ⟨Term⟩ [ ⟨Attr⟩ ] . |
    [ ⟨RuleLabel⟩ ] let ⟨Sym⟩ [ : ⟨Sort⟩ ] = ⟨Term⟩ . |
    var ⟨VarId⟩ ... : ⟨Sort⟩ . |
    vars-of [ ⟨ModExp⟩ ] . |
    [ ⟨RuleLabel⟩ ] eq ⟨Term⟩ = ⟨Term⟩ . |
    [ ⟨RuleLabel⟩ ] cq ⟨Term⟩ = ⟨Term⟩ if ⟨Term⟩ . |
    ⟨Misc⟩
⟨Attr⟩ ::= [ {assoc | comm | {id: | idr:} ⟨Term⟩ | idem | memo |
    strat ( ⟨Int⟩ ... ) | prec ⟨Nat⟩ | gather ( {e | E | &}...
    ) | poly ⟨Lisp⟩ | intrinsic}... ]
⟨RuleLabel⟩ ::= ⟨Id⟩ ... {, ⟨Id⟩ ...}...
⟨ModId⟩ --- simple identifier, by convention all caps
⟨SortId⟩ --- simple identifier, by convention capitalised
⟨VarId⟩ --- simple identifier, typically capitalised
⟨OpName⟩ ::= ⟨Sym⟩ {"_" | " " | ⟨Sym⟩}...
⟨Sym⟩ --- any operator syntax symbol (blank delimited)
⟨OpForm⟩ ::= ⟨OpName⟩ | ( ⟨OpName⟩ )
⟨Sort⟩ ::= ⟨SortId⟩ | ⟨SortId⟩ . ⟨SortQual⟩
⟨SortQual⟩ ::= ⟨ModId⟩ | ( ⟨ModExp⟩ )
⟨Lisp⟩ --- a Lisp expression
⟨Nat⟩ --- a natural number
⟨Int⟩ --- an integer
⟨Builtins⟩ ::= bsort ⟨SortId⟩ ⟨Lisp⟩ . |
    [ ⟨RuleLabel⟩ ] bq ⟨Term⟩ = ⟨Lisp⟩ . |
    [ ⟨RuleLabel⟩ ] beq ⟨Term⟩ = ⟨Lisp⟩ . |
    [ ⟨RuleLabel⟩ ] cbeq ⟨Term⟩ = ⟨Lisp⟩ if ⟨BoolTerm⟩ . |
    [ ⟨RuleLabel⟩ ] cbq ⟨Term⟩ = ⟨Lisp⟩ if ⟨BoolTerm⟩ .

--- views ---
⟨View⟩ ::= view [ ⟨ModId⟩ ] from ⟨ModExp⟩ to ⟨ModExp⟩ is ⟨ViewElt⟩ ... endv |
    view ⟨ModId⟩ of ⟨ModExp⟩ as ⟨ModExp⟩ is ⟨ViewElt⟩ ...
    endv

--- terms ---
⟨Term⟩ ::= ⟨Mixfix⟩ | ⟨VarId⟩ | ( ⟨Term⟩ ) |
    ⟨OpName⟩ ( ⟨Term⟩ {, ⟨Term⟩}... ) | ( ⟨Term⟩ ) . ⟨OpQual⟩
    --- precedence and gathering rules used to eliminate ambiguity
⟨OpQual⟩ ::= ⟨Sort⟩ | ⟨ModId⟩ | ( ⟨ModExp⟩ )
⟨Mixfix⟩ --- mixfix operator applied to arguments

--- module expressions ---
⟨ModExp⟩ ::= ⟨ModId⟩ | ⟨ModId⟩ is ⟨ModExpRenm⟩ |
    ⟨ModExpRenm⟩ + ⟨ModExp⟩ | ⟨ModExpRenm⟩
⟨ModExpRenm⟩ ::= ⟨ModExpInst⟩ * ( ⟨RenameElt⟩ {, ⟨RenameElt⟩}... ) | ⟨ModExpInst⟩
⟨ModExpInst⟩ ::= ⟨ParamModExp⟩ [ ⟨Arg⟩ {, ⟨Arg⟩}... ] | ( ⟨ModExp⟩ )
⟨ParamModExp⟩ ::= ⟨ModId⟩ | ( ⟨ModId⟩ * ( ⟨RenameElt⟩ {, ⟨RenameElt⟩}... ) )
⟨RenameElt⟩ ::= sort ⟨SortRef⟩ to ⟨SortId⟩ | op ⟨OpRef⟩ to ⟨OpForm⟩
⟨Arg⟩ ::= ⟨ViewArg⟩ | ⟨ModExp⟩ | [ sort ] ⟨SortRef⟩ | [ op ] ⟨OpRef⟩
    --- may need to precede ⟨SortRef⟩ by "sort" and ⟨OpRef⟩ by "op" to
    --- distinguish from general case (i.e., from a module name)
⟨ViewArg⟩ ::= view [ from ⟨ModExp⟩ ] to ⟨ModExp⟩ is ⟨ViewElt⟩ ...
    endv
⟨ViewElt⟩ ::= sort ⟨SortRef⟩ to ⟨SortRef⟩ . |
    var ⟨VarId⟩ ... : ⟨Sort⟩ . |
    op ⟨OpExpr⟩ to ⟨Term⟩ . |
    op ⟨OpRef⟩ to ⟨OpRef⟩ .
    --- priority given to ⟨OpExpr⟩ case
    --- vars are declared with sorts from source of view (a theory)
⟨SortRef⟩ ::= ⟨Sort⟩ | ( ⟨Sort⟩ )
⟨OpRef⟩ ::= ⟨OpSpec⟩ | ( ⟨OpSpec⟩ ) | ( ⟨OpSpec⟩ ) . ⟨OpQual⟩ | ( ( ⟨OpSpec⟩ ) . ⟨OpQual⟩ )
    --- in views, if have (op).(M) it must be enclosed in (), i.e., ((op).(M))
⟨OpSpec⟩ ::= ⟨OpName⟩ | ⟨OpName⟩ : ⟨SortId⟩ ... -> ⟨SortId⟩
⟨OpExpr⟩ --- a ⟨Term⟩ consisting of a single operator applied to variables

--- equivalent forms ---
assoc = associative       comm = commutative
cq = ceq                  dfn = define
ev = eval                 evq = eval-quiet
jbo = endo                ht = endth
endv = weiv = endview     ex = extending
gather = gathering        id: = identity:
idem = idempotent         idr: = identity-rules:
in = input                inc = including
obj = object              poly = polymorphic
prec = precedence         psort = principal-sort
pr = protecting           q = quit
red = reduce              rl = red-loop
sh = show                 sorts = sort
strat = strategy          subsorts = subsort
th = theory               us = using
vars = var                *** = ---
***> = --->

--- Lexical analysis ---
--- Tokens are sequences of characters delimited by blanks.
--- "(", ")", and "," are always treated as single character symbols.
--- Tabs and returns are equivalent to blanks (except inside comments).
--- Normally, "[", "]", "_", ",", "{", and "}" are also treated as
--- single character symbols.
Although OBJ provides a fully interactive user interface, in practice this is an awkward way to use the system, because users nearly always make mistakes, and mistakes can be very troublesome to correct in an interactive mode. It is much easier to first make a file, then start OBJ, and read the file with the in command; then OBJ will report the bugs it finds, based on which you can re-edit and then re-run the file; for complex examples, this cycle can be repeated many times. The author has found it convenient to edit OBJ files in one Emacs buffer, while another Emacs buffer contains a live OBJ; then you can switch between these by switching windows (or buffers); moreover, the results of execution are easily available for consultation and archiving.

OBJ3 can be obtained by ftp from pages linked to the following URL:

Once OBJ3 is installed, you can invoke it with the command obj. A later version of OBJ called BOBJ is also available via the above URL; whereas OBJ3 is implemented in Lisp, BOBJ is implemented in Java, and provides some additional features, including hidden algebra (as in Chapter 13). BOBJ is almost completely upward compatible with OBJ3, except that apply commands may need some reorganization, because the internal ordering of rules is different in the two systems.

CafeOBJ [43] is another algebraic specification language that could be used in connection with this text, although syntactical conversion will be needed, since its syntax tends to follow that of C. Information on how to obtain CafeOBJ is also available via the above URL.

Also discuss the conversion script once it is available.

Exiled Proofs

This appendix contains proofs considered too distracting to put in the main body of the text.

B.1  Many-Sorted Algebra

Most of the proofs for the main results on many-sorted algebra were omitted in Chapter 4, and are also omitted here, because they follow from the more general results on order-sorted algebra that are restated and proved in Section B.5 below.
An exception is Theorem 4.9.1, which we prove here as a sort of "warm up" for the proof of Theorem 10.3.3 in Section B.5.

Theorem 3.2.1 (Initiality)  Given a signature Σ without overloading and a Σ-algebra M, there is a unique Σ-homomorphism T_Σ → M.  □

Theorem 3.2.10 (Initiality)  Given any signature Σ and any Σ-algebra M, there is a unique Σ-homomorphism T_Σ → M.  □

Theorem 4.5.4  For any set A of unconditional Σ-equations and any unconditional Σ-equation e, A ⊢ e iff A ⊢(1,3,±) e.  □

Theorem 4.8.3 (Completeness)  Given a signature Σ and a set A of (possibly conditional) Σ-equations, then for any unconditional Σ-equation e, A ⊢C e iff A ⊨ e, where ⊢C denotes deduction using the rules (1,2,3,4,5C).  □

Note that Theorem 4.4.2 is the special case of the above where all equations in A are unconditional.

Theorem 4.9.1 (Completeness of Subterm Replacement)  For any set A of (possibly conditional) Σ-equations and any unconditional Σ-equation e, A ⊢C e iff A ⊢(1,3,±) e.

Proof (★): Let X be a fixed but arbitrary set of variable symbols over the sort set of Σ. We will show that for any e quantified by X, A ⊢C e iff A ⊢(1,3,±) e. For this purpose, we define two binary relations on T_Σ(X), for s ∈ S and t, t′ ∈ T_Σ(X)_s, by

  t ≡_s t′   iff  A ⊢C (∀X) t = t′,  and
  t ≡R_s t′  iff  A ⊢(1,3,±) (∀X) t = t′,

and then show they are equal. (The superscript "R" comes from "Replacement" in the name of rule (±).)

Soundness of (±) and completeness of ⊢C give us that A ⊢(1,3,±) e implies A ⊢C e, which gives us ≡R ⊆ ≡.

To show the opposite inclusion, we note that ≡ is the smallest Σ-congruence satisfying a certain property, and then prove that ≡R is another Σ-congruence satisfying that property.
The property is closure under (5C), in the sense that if (∀Y) t = t′ if C is in A and if θ : Y → T_Σ(X) is a substitution such that θ(u) ≡ θ(v) for each pair (u, v) ∈ C, then θ(t) ≡ θ(t′). That ≡ is the least congruence closed under (5C) follows from its definition.

To facilitate proofs about ≡R, we define a family of relations on T_Σ(X), for s ∈ S, by

  ≡R_{0,s} = { (t, t) | t ∈ T_Σ(X)_s },

and for each n > 0,

  ≡R_{n,s} = { (t, t′) | t, t′ ∈ T_Σ(X)_s and A ⊢(3,±) (∀X) t = t′ via a proof of length ≤ n }.

Then ≡R = ⋃_{n∈ω} ≡R_n.

The relation ≡R is reflexive and transitive by definition. To prove its symmetry, we show by induction on n that each relation ≡R_n is symmetric. For the induction step, suppose that t ≡R_{n+1} t′; we show t′ ≡R_{n+1} t using symmetry of ≡R_n. There are just two cases, since the last step in proving (∀X) t = t′ must use either the rule (3) or the rule (±). If the last step used (3), then there exists t″ such that t ≡R_n t″ and t″ ≡R_n t′. By the induction hypothesis, we have that t″ ≡R_n t and t′ ≡R_n t″, which imply that t′ ≡R_{n+1} t. In the second case, where (±) is used, we again conclude t′ ≡R_{n+1} t, this time by symmetry of (±). Thus each ≡R_n is symmetric, and symmetry of ≡R follows from the fact that any union of symmetric relations is symmetric.

To prove ≡R is a congruence, we must show that for each operation σ in Σ, σ(t₁, ..., t_k) ≡R σ(t′₁, ..., t′_k) whenever t_i ≡R t′_i for i = 1, ..., k. For simplicity of presentation (and in fact without loss of generality), we do the proof for k = 2, showing by induction on n that if t₁ ≡R_n t′₁ and t₂ ≡R_n t′₂ then σ(t₁, t₂) ≡R σ(t′₁, t′₂).
For this purpose, we show that σ(t₁, t₂) ≡R σ(t′₁, t₂) and σ(t′₁, t₂) ≡R σ(t′₁, t′₂), and then use transitivity of ≡R. Since these two subgoals are entirely analogous, we concentrate on the first, σ(t₁, t₂) ≡R σ(t′₁, t₂).

The base case (n = 0) is trivial. For the induction step, as in the symmetry proof for ≡R, there are two cases, where the last step in proving t₁ ≡R t′₁ is either (3) or else (±).

In the first case, there exists t₀ ∈ T_Σ(X) such that t₁ ≡R_n t₀ and t₀ ≡R_n t′₁. Then

  σ(t₁, t₂) ≡R σ(t₀, t₂) ≡R σ(t′₁, t₂)

by the induction hypothesis, and transitivity of ≡R gives the desired result.

For the case where (±) is applied, A contains an equation e′, (∀Y) t = t′ if C, such that there is a substitution ψ : Y → T_Σ(X) such that ψ(u) ≡R_n ψ(v) for each pair (u, v) ∈ C, and such that t₀(z ← ψ(t)) = t₁ and t₀(z ← ψ(t′)) = t′₁, for some t₀ ∈ T_Σ(X ∪ {z}) with z ∉ X. By applying (±) with e′ to the term σ(t₀, t₂) instead of t₀, we obtain σ(t₁, t₂) ≡R_{n+1} σ(t′₁, t₂), which concludes our proof that ≡R is a congruence.

We still have to show that ≡R is closed under (5C). But this follows from the fact that (±) includes (5C), which is Exercise 4.9.4.  □

This elegant proof is due to Răzvan Diaconescu. Note that Theorem 4.5.4 is the special case of the above result where all equations in A are unconditional, and that the above result is in turn a special case of Theorem 10.3.3, which covers the order-sorted case.

B.2  Rewriting

The results in this section concern overloaded many-sorted term rewriting, beginning with the following:

Proposition 5.3.4  A TRS (Σ, A) is ground terminating if (Σ(X), A) is ground terminating, where X is a variable set for Σ; moreover, if Σ is non-void, then (Σ, A) is ground terminating iff (Σ(X), A) is ground terminating.
Proof: It is clear that ground termination of (Σ(X), A) implies that of (Σ, A). For the converse, suppose that (Σ, A) is ground terminating and that Σ is non-void, so that for each sort s there is some term a_s ∈ T_{Σ,s}. Now assume that t₀ ⇒ t₁ ⇒ t₂ ⇒ ··· is a non-terminating rewrite sequence for (Σ(X), A), where the rewrite t_i ⇒ t_{i+1} uses the rule l_i → r_i in A with var(l_i) = X_i for i = 0, 1, ..., with u_i and θ_i : X_i → T_Σ(X) such that t_i = u_i(z ← θ_i(l_i)) and t_{i+1} = u_i(z ← θ_i(r_i)). Define g : X ∪ {z} → T_Σ({z}) by g(z) = z and g(x) = a_s whenever x ∈ X has sort s, and let g also denote the free extension T_Σ(X ∪ {z}) → T_Σ({z}). Then t′₀ ⇒ t′₁ ⇒ t′₂ ⇒ ··· is a non-terminating rewrite sequence for (Σ, A), where each rewrite t′_i ⇒ t′_{i+1} uses the rule l_i → r_i with t′_i = u′_i(z ← θ′_i(l_i)) and t′_{i+1} = u′_i(z ← θ′_i(r_i)), where θ′_i : X_i → T_Σ is the composition θ_i ; g, where u′_i = g(u_i), and where t′_i = g(t_i). This works because t′_i = g(t_i) for i = 0, 1, ...; intuitively, the t′_i rewrite sequence is the image under g of the t_i rewrite sequence.  □

The following is used to prove Proposition 5.5.6:

Proposition  Given a Σ-TRS A and a function ρ : T_Σ → ω, Σ-substitution is strictly ρ-monotone if every operation symbol in Σ is strictly ρ-monotone; the same holds for weak ρ-monotonicity.

Proof (★): We use induction on the structure of Σ-terms. First notice that a term u ∈ T_Σ({z}) having a single occurrence of z is either z or else is of the form σ(u₁, ..., u_n), where each u_i except one is ground, and that one has a single occurrence of z. Now define the z-depth d of a term u ∈ T_Σ({z}) having a single occurrence of z as follows:

  d(z) = 0,
  d(σ(u₁, ..., u_n)) = 1 + d(u_i), where u_i contains the occurrence of z.

Notice in particular that d(σ(u₁, ..., u_n)) = 1 iff the u_i containing z is z itself. Now assume that ρ(t) > ρ(t′).
Then substitution is strictly ρ-monotone for all terms u of z-depth 0 or 1; this covers the base cases. For the induction step, assume that strict ρ-monotonicity of substitution holds for all terms u of z-depth less than m > 0, and that we are given some u ∈ T_Σ({z}) with z-depth m and a single occurrence of z. Then u has the form σ(u₁, ..., u_n) for some σ ∈ Σ and some n > 0; furthermore, if u_i is the subterm containing z, then d(u_i) = m − 1. Therefore strict ρ-monotonicity of substitution holds for u_i. We now calculate:

  u(z ← t) = σ(u₁, ..., u_i(z ← t), ..., u_n) = σ(u₁, ..., z, ..., u_n)(z ← v_i),

where v_i = u_i(z ← t); and similarly,

  u(z ← t′) = σ(u₁, ..., z, ..., u_n)(z ← v′_i),

where v′_i = u_i(z ← t′). Now because strict ρ-monotonicity of substitution holds for u_i, we have ρ(u_i(z ← t)) > ρ(u_i(z ← t′)), i.e., ρ(v_i) > ρ(v′_i), and therefore it follows that ρ(u(z ← t)) > ρ(u(z ← t′)), as desired. The argument for the second assertion is essentially the same.  □

Theorem 5.6.9 (Critical Pair Theorem)  A TRS is locally Church-Rosser if and only if all its critical pairs are convergent.

Sketch of Proof: The converse is easy. Suppose that all critical pairs converge, and consider a term with two distinct rewrites. Then their redexes are either disjoint, or else one of them is a subterm of the other, since if two subterms of a given term are not disjoint, one must be contained in the other. If the redexes are disjoint, then the result of applying both rewrites is the same in either order. If the redexes are not disjoint, then either the rules overlap (in the sense of Definition 5.6.2), or else the subredex results from substituting for a variable in the left side of the rule producing the larger redex.
In the first case, the result terms of the two rewrites rewrite to a common term by hypothesis, since the overlap is a substitution instance of the overlap of some critical pair by Proposition 12.0.1. In the second case, the result of applying both rules is the same in either order, though the subredex may have to be rewritten multiple (or zero) times if the variable involved is non-linear. (This sketch duplicates the argument already given in Section 5.6.) □

Proposition 5.8.10 A CTRS (Σ, A) is ground terminating if (Σ(X), A) is ground terminating, where X is a variable set for Σ; moreover, if Σ is non-void, then (Σ, A) is ground terminating iff (Σ(X), A) is ground terminating.

Proof: We extend the proof of Proposition 5.3.4 above. If a rewrite t_i ⇒ t_{i+1} in (Σ(X), A) uses the rule l_i → r_i if C_i, where var(l_i) = X_i and where C_i contains conditions u_{ij} = v_{ij}, then to get the corresponding rewrite g(t_i) ⇒ g(t_{i+1}) in (Σ, A), we apply g to the conditions as well as to the left and right sides, noting that θ_i(u_{ij}) ↓_A θ_i(v_{ij}) on T_Σ(X) implies θ′_i(u_{ij}) ↓_A θ′_i(v_{ij}) on T_Σ. □

B.2.1 (⋆) Orthogonal Term Rewriting Systems

The proof below that orthogonal term rewriting systems are Church-Rosser was provided by Grigore Roșu, following a suggestion of Joseph Goguen to use the Hindley-Rosen Lemma (Proposition 5.7.5). As in the classic proof of Gérard Huet [108], we use "parallel rewriting" (Definition B.2.1); this should not be confused with the concurrent rewriting of [70, 134], as it is a technical notion especially created for this result. Indeed, the entire proof is rather technical, and suggestions for further simplification would be of interest.

Definition B.2.1 Given a Σ-TRS A and Σ-terms t, t′ ∈ T_Σ(Y), the one-step parallel rewriting relation, written t ⇒⇒_A t′, holds iff there exists a Σ-term t_0 ∈ T_Σ({z_1, . . .
, z n } ∪ Y ) having exactly one occurrence of each vari- able z i , and there exist n Σ -rules α i → β i in A and n substitutions θ i : X i → T Σ (Y ) where X i = var (α i ) for 1 ≤ i ≤ n , such that t = t [z ← θ (α ), . . . , z n ← θ n (α n )] and t (cid:48) = t [z ← θ (β ), . . . , z n ← θ n (β n )] . The parallel rewriting relation is the transitive closure of ⇒⇒ A , denoted t ⇒⇒ ∗ A t (cid:48) . (cid:2) The relation ⇒⇒ is reflexive, as can be seen by taking n = t contain exactly one occurrence of eachvariable z i only for technical reasons. Note that the Σ -rules α i → β i are not required to be distinct. We may omit the subscript A when it isclear from context, writing t ⇒⇒ ∗ t (cid:48) instead of t ⇒⇒ ∗ A t (cid:48) , and also writing ⇒⇒ instead of ⇒⇒ A . Exercise B.2.1 Given a Σ -TRS A , terms t , t (cid:48) , . . . , t n , t (cid:48) n ∈ T Σ (Y ) such that t ⇒⇒ A t (cid:48) , . . . , t n ⇒⇒ A t (cid:48) n , and t ∈ T Σ ( { z , . . . , z n } ∪ Y ) , show t [z ← t , . . . , z n ← t n ] ⇒⇒ A t [z ← t (cid:48) , . . . , z n ← t (cid:48) n ] . (cid:2) Three lemmas precede the main part of the proof. The first justifiesusing parallel rewriting to prove results about ordinary rewriting. Lemma B.2.2 Given a Σ -TRS A , ⇒⇒ ∗ A = ⇒ ∗ A . Proof: The inclusion ⇒ ∗ A ⊆ ⇒⇒ ∗ A follows from the fact that one-step rewritingis the special case of one-step parallel rewriting where n = z .Thus it suffices to prove the opposite inclusion, ⇒⇒ A ⊆ ⇒ ∗ A . Supposethat t ⇒⇒ A t (cid:48) and let t ∈ T Σ ( { z , . . . , z n } ∪ Y ) , as in the definition ofone-step parallel rewriting. Let t i ∈ T Σ ( { z } ∪ Y ) denote the term t [z ← θ (β ), . . . , z i − ← θ i − (β i − ), z i ← z, z i + ← θ i + (α i + ), . . . , z n ← θ n (α n )] ewriting and let t i denote the terms t i [z ← θ i (β i )] for 1 ≤ i ≤ n .Because t = t [z ← θ (α )] and t = t [z ← θ (β )] , we get t ⇒ A t by the definition of one-step (non-parallel) rewriting. 
Also because t i = t i + [z ← θ i + (α i + )] and t i + = t i + [z ← θ i + (β i + )] , we get t i ⇒ A t i + for 1 ≤ i < n . Finally, since t n = t (cid:48) , we get the chain of one-steprewrites t ⇒ A · · · ⇒ A t i ⇒ A t i + ⇒ A · · · ⇒ A t (cid:48) , and therefore t ⇒ ∗ A t (cid:48) . (cid:2) From now on, we assume A is a fixed Σ -TRS with Σ -rules α i → β i for1 ≤ i ≤ N . Let A i denote the Σ -TRS containing a single Σ -rule α i → β i ,let ⇒ i denote the relation ⇒ A i , let ⇒⇒ i denote the relation ⇒⇒ A i , and let X i = var (α i ) , the set of variables of α i . The next lemma is the only place where orthogonality of A is used.In reading its proof, it may help to visualize the various constructionsusing the picture below. Lemma B.2.3 If A is orthogonal and if ϕ : X i → T Σ (Y ) is a substitution suchthat ϕ(α i ) ⇒⇒ j t for some 1 ≤ i, j ≤ N , then there is some t (cid:48) ∈ T Σ (Y ) such that ϕ(β i ) ⇒⇒ j t (cid:48) and t ⇒ i t (cid:48) . Proof: Because ϕ(α i ) ⇒⇒ j t , Definition B.2.1 implies there exist a Σ -term t ∈ T Σ ( { z , . . . , z n }∪ Y ) and substitutions θ k : X j → T Σ (Y ) such that ϕ(α i ) = t [z ← θ (α j ), . . . , z n ← θ n (α j )] and t = t [z ← θ (β j ), . . . , z n ← θ n (β j )] . Therefore θ k (α j ) is a subterm of ϕ(α i ) for each 1 ≤ k ≤ n .But because A is nonoverlapping, the terms α i and α j do not over-lap, i.e., there does not exist a non-variable subterm α ki of α i such that θ k (α j ) = ϕ(α ki ) . Consequently, the only possibility for θ k (α j ) to be asubterm of ϕ(α i ) , is to be a subterm (not necessarly proper) of ϕ(x) where x is a variable in X i . Hence for each 1 ≤ k ≤ n , there is a variable x k in X i such that θ k (α j ) is a subterm of ϕ(x k ) .The variables x k need not be distinct for distinct indices k . 
Because t_0 is the term of "positions" of the θ_k(α_j) in ϕ(α_i) for 1 ≤ k ≤ n, and because each θ_k(α_j) is a subterm of ϕ(x_k) and A is left linear, that is, α_i has no more than one occurrence of any variable x in X_i, we can conclude that for each x in X_i there is a subterm t_x of t_0 such that ϕ(x) = t_x[z_1 ← θ_1(α_j), . . . , z_n ← θ_n(α_j)]. The term t_x is the subterm of t_0 that contains the "positions" of each θ_k(α_j) in ϕ(x) for 1 ≤ k ≤ n. It is possible that ϕ(x) does not contain all the θ_k(α_j) as subterms, or even contains none of them, but we still keep the notation t_x[z_1 ← θ_1(α_j), . . . , z_n ← θ_n(α_j)], which means that one substitutes only for those variables z_k that appear in t_x, that is, the variables z_k for those 1 ≤ k ≤ n with x_k = x.

[Figure: the term ϕ(α_i) drawn as a triangle for α_i, with the subtree ϕ(x) hanging at the position of the variable x = x_k = x_{k′}; within ϕ(x), the subterm t_x contains the redexes θ_k(α_j) and θ_{k′}(α_j) at the positions of the variables z_k and z_{k′} of the "top" term t_0.]

Let ε : X_i → T_Σ({z_1, . . . , z_n} ∪ Y) denote the function for which ε(x) = t_x.
Since A is left linear, no α i has more than one instanceof any variable x in X i , and therefore (cid:15)(α i ) = t .Now let θ α , θ β : { z , . . . , z n } → T Σ (Y ) be the substitutions such that θ α (z k ) = θ k (α j ) and θ β (z k ) = θ k (β j ) for all 1 ≤ k ≤ n . Then we have (cid:15) ; θ α = ϕ , because for each x in X i ((cid:15) ; θ α )(x) = θ α ((cid:15)(x)) = θ α (t x ) = t x [z ← θ (α j ), . . . , z n ← θ n (α j )] = ϕ(x) . Let t (cid:48) be the term ((cid:15) ; θ β )(β i ) , that is t (cid:48) = (cid:15)(β i )[z ← θ (β j ), . . . , z n ← θ n (β j )] . Then we can show that ϕ(β i ) = ((cid:15) ; θ α )(β i ) = θ α ((cid:15)(β i )) = (cid:15)(β i )[z ← θ (α j ), . . . , z n ← θ n (α j )] , and by the definition of parallel rewriting, we get ϕ(β i ) ⇒⇒ j t (cid:48) . Although (cid:15)(β i ) may contain multiple occurences of variables z , . . . , z n , this doesnot modify the one-step parallel rewriting relation (the reader shouldprove this). On the other hand, because t = t [z ← θ (β j ), . . . , z n ← θ n (β j )] = (cid:15)(α i )[z ← θ (β j ), . . . , z n ← θ n (β j )] = θ β ((cid:15)(α i )) = ((cid:15) ; θ β )(α i ) , ewriting it follows that t ⇒ i t (cid:48) , by the definition of one-step ordinary rewriting. (cid:2) The above lemma holds even when n = 0, that is, when there are noparallel rewrites. Lemma B.2.4 If A is orthogonal and if t, t , t are Σ -terms such that t ⇒⇒ i t and t ⇒⇒ j t then there is some t (cid:48) such that t ⇒⇒ j t (cid:48) and t ⇒⇒ i t (cid:48) . Proof: There exist Σ -terms t i ∈ T Σ ( { p , . . . , p m } ∪ Y ) , t j ∈ T Σ ( { z , . . . , z n } ∪ Y ) and substitutions ϕ , . . . , ϕ m : X i → T Σ (Y ) and θ , . . . , θ n : X j → T Σ (Y ) such that t = t i [p ← ϕ (α i ), . . . , p m ← ϕ m (α i )]t = t i [p ← ϕ (β i ), . . . , p m ← ϕ m (β i )] , and t = t j [z ← θ (α j ), . . . , z n ← θ n (α j )]t = t j [z ← θ (β j ), . . . , z n ← θ n (β j )] . 
In the picture below, t_i and t_j appear as two different "tops" for the term t:

[Figure: the term t drawn as a triangle, with t_i and t_j two overlapping "top" contexts for t.]

Let δ_{α,α} : {p_1, . . . , p_m, z_1, . . . , z_n} → T_Σ(Y) be a substitution such that δ_{α,α}(p_l) = ϕ_l(α_i) for all 1 ≤ l ≤ m and δ_{α,α}(z_k) = θ_k(α_j) for all 1 ≤ k ≤ n. We get δ_{α,α}(t_i) = t and also δ_{α,α}(t_j) = t, that is, δ_{α,α} is a unifier of the terms t_i and t_j. Because t_i, t_j are unifiable, they have a most general unifier, say ψ : {p_1, . . . , p_m, z_1, . . . , z_n} → T_Σ({p_1, . . . , p_m, z_1, . . . , z_n} ∪ Y). (The notions of unifier and most general unifier are defined in Chapter 12.) In our case, because t_i and t_j have exactly one occurrence of each of the variables p_1, . . . , p_m and z_1, . . . , z_n respectively, each ψ(p_l) is either equal to p_l or else is a subterm of t_j, and each ψ(z_k) is either equal to z_k or else is a subterm of t_i. The reader can now check that ψ ; δ_{α,α} = δ_{α,α}.

Next we introduce two more substitutions, δ_{α,β}, δ_{β,α} : {p_1, . . . , p_m, z_1, . . . , z_n} → T_Σ(Y), such that δ_{α,β}(p_l) = ϕ_l(α_i) and δ_{α,β}(z_k) = θ_k(β_j), and δ_{β,α}(p_l) = ϕ_l(β_i) and δ_{β,α}(z_k) = θ_k(α_j), respectively. We now claim that ϕ_l(α_i) ⇒⇒_j (ψ ; δ_{α,β})(p_l) for each 1 ≤ l ≤ m. This is because ϕ_l(α_i) = δ_{α,α}(p_l) = (ψ ; δ_{α,α})(p_l), and because either ψ(p_l) equals p_l, in which case (ψ ; δ_{α,β})(p_l) = ϕ_l(α_i) and then we use the reflexivity of ⇒⇒_j, or else ψ(p_l) contains only distinct variables in {z_1, . . . , z_n} and then ϕ_l(α_i) = δ_{α,α}(ψ(p_l)) = ψ(p_l)[z_1 ← θ_1(α_j), . . .
, z n ← θ n (α j )] and (ψ ; δ α,β )(p l ) = δ α,β (ψ(p l )) = ψ(p l )[z ← θ (β j ), . . . , z n ← θ n (β j )]. Similarly, θ k (α j ) ⇒⇒ i (ψ ; δ β,α )(z k ) for each 1 ≤ k ≤ n .Suppose that p , . . . , p M are all the variables of t i such that p l ∉ var (ψ(z k )) for 1 ≤ k ≤ n , and that z , . . . , z N are the variables of t j such that z k ∉ var (ψ(p l )) for 1 ≤ l ≤ m . Then there is a term t ∈ T Σ ( { p , . . . , p M } ∪ { z , . . . , z N } ∪ Y ) such that ψ(t ) = ψ(t i ) = ψ(t j ) .In the above picture, t is the “intersection” of t i and t j . The readershould now check that t = t [p ← ϕ (α i ), . . . , p M ← ϕ M (α i ),z ← θ (α j ), . . . , z N ← θ N (α j )]t = t [p ← ϕ (β i ), . . . , p M ← ϕ M (β i ),z ← (ψ ; δ β,α )(z ), . . . , z N ← (ψ ; δ β,α )(z N )]t = t [p ← (ψ ; δ α,β )(p ), . . . , p M ← (ψ ; δ α,β )(p M ),z ← θ (β j ), . . . , z N ← θ N (β j )] . Because ϕ l (α i ) ⇒⇒ j (ψ ; δ α,β )(p l ) , we conclude by Lemma B.2.3 thatthere is some u l such that ϕ l (β i ) ⇒⇒ j u l and (ψ ; δ α,β )(p l ) ⇒ i u l for all 1 ≤ l ≤ M . Similarly, there is some v k such that θ k (β j ) ⇒⇒ i v k and (ψ ; δ β,α )(z k ) ⇒ j v k for all 1 ≤ k ≤ N .Finally, let t (cid:48) = t [p ← u , . . . , p M ← u M , z ← v , . . . , z N ← v N ] .Then by Exercise B.2.1, we have t ⇒⇒ j t (cid:48) and t ⇒⇒ i t (cid:48) . (cid:2) Theorem 5.6.4 A Σ -TRS A is Church-Rosser if it is orthogonal and lapse free. Proof: To prove that ⇒ i and ⇒ j commute for 1 ≤ i, j ≤ N , by Lemma B.2.2 itsuffices to prove that ⇒⇒ i and ⇒⇒ j commute.First, we prove by induction on the length of rewriting with ⇒⇒ ∗ j thatwhenever t ⇒⇒ i t and t ⇒⇒ ∗ j t there exists a t (cid:48) such that t ⇒⇒ ∗ j t (cid:48) ewriting Modulo Equations and t ⇒⇒ i t (cid:48) . If the length of the rewrite sequence t ⇒⇒ ∗ j t is zero,then let t (cid:48) = t . If it is more than zero, let t (cid:48) be a Σ -term such that t ⇒⇒ ∗ j t (cid:48) ⇒⇒ j t . 
That is, suppose t ⇒⇒_i t_1 and t ⇒⇒*_j t′ ⇒⇒_j t_2. By the induction hypothesis, there exists t″ such that t_1 ⇒⇒*_j t″ and t′ ⇒⇒_i t″. Now Lemma B.2.4 gives us t‴ such that t″ ⇒⇒_j t‴ and t_2 ⇒⇒_i t‴. Therefore t_1 ⇒⇒*_j t‴ and t_2 ⇒⇒_i t‴.

Now we prove, by induction on the length of rewriting with ⇒⇒*_i, that whenever t ⇒⇒*_i t_1 and t ⇒⇒*_j t_2 there exists t_3 such that t_1 ⇒⇒*_j t_3 and t_2 ⇒⇒*_i t_3; that is, ⇒⇒_i and ⇒⇒_j commute. If the length of the rewrite sequence is zero, let t_3 = t_2, and otherwise let t′ be a Σ-term such that t ⇒⇒*_i t′ ⇒⇒_i t_1. By the induction hypothesis, there exists a term t″ such that t′ ⇒⇒*_j t″ and t_2 ⇒⇒*_i t″. By the induction above, there exists t_3 such that t_1 ⇒⇒*_j t_3 and t″ ⇒⇒_i t_3. It now follows that t_1 ⇒⇒*_j t_3 and t_2 ⇒⇒*_i t_3. Therefore ⇒⇒_i and ⇒⇒_j commute, and so the Hindley-Rosen Lemma (Proposition 5.7.5) gives us that A is Church-Rosser. □

B.3 Rewriting Modulo Equations

Theorem 7.7.20 Let (Σ, A, B) be a CMTRS with Σ non-void, let (Σ, A′, B) be a ground terminating sub-CMTRS of (Σ, A, B), let P be a poset, and let N = A − B. If there is a ρ : T_{Σ,B} → P such that
(1) each rule in B is weak ρ-monotone,
(2) each rule in N is strict ρ-monotone,
(3) each operation in Σ is strict ρ-monotone, and
(4) P is Noetherian, or at least, for each t ∈ T_{Σ,B} of sort s there is some Noetherian poset P_{t,s} ⊆ P_s such that t ⇒*_{[A/B]} t′ implies ρ(t′) ∈ P_{t,s},
then (Σ, A, B) is ground terminating.

Proof: See the proof of Theorem 5.8.20, page 136. □

B.4 First-Order Logic

This section restates and proves Proposition 8.3.21, which does most of the work for proving the Substitution Theorem (Theorem 8.3.22).
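The capture-free hypothesis of Proposition 8.3.21 is exactly what blocks naive substitution from binding a free variable. As a concrete warm-up (my own illustrative sketch, not part of the text: formulas are nested tuples, and terms are simplified to bare variable names), the following Python fragment exhibits a substitution that is not capture free, and shows naive substitution turning a free occurrence into a bound one:

```python
# Illustrative sketch (not from the text).  Formulas are nested tuples:
# ("forall", x, Q) binds x in Q; ("pred", v1, ..., vn) is atomic.
# Terms are simplified to bare variable names (strings).
def free_vars(P):
    if P[0] == "forall":
        return free_vars(P[2]) - {P[1]}
    return set(P[1:])

def subst_naive(theta, P):
    # Substitutes under binders without renaming, so a free variable of
    # theta(y) can be captured by a quantifier of P -- wrong in general.
    if P[0] == "forall":
        inner = {y: t for y, t in theta.items() if y != P[1]}
        return ("forall", P[1], subst_naive(inner, P[2]))
    return (P[0],) + tuple(theta.get(v, v) for v in P[1:])

def capture_free(theta, P):
    # theta is capture free for P if no quantifier (forall x) of P can
    # bind a variable occurring in theta(y) for y free below it.  Since
    # terms here are bare variables, "x occurs in t" is just x == t.
    if P[0] == "forall":
        x, Q = P[1], P[2]
        inner = {y: t for y, t in theta.items() if y != x}
        return all(x != t for y, t in inner.items() if y in free_vars(Q)) \
               and capture_free(inner, Q)
    return True

P = ("forall", "x", ("pred", "y"))           # (forall x) p(y), y is free
assert not capture_free({"y": "x"}, P)       # sending y to x captures x
assert capture_free({"y": "z"}, P)           # sending y to z is fine
# naive substitution binds the formerly free occurrence of y:
assert free_vars(subst_naive({"y": "x"}, P)) == set()
```

On the hypothetical representation above, `[[θ(P)]] = [[θ]]⁻¹([[P]])` can fail precisely for the first substitution, which is why the proposition assumes capture-freeness.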
Proposition 8.3.21 If θ is capture free for P , then for any model M , [[θ(P )]] M = [[θ]] − M ([[P ]] M ) . Exiled Proofs Proof: We first show by structural induction over Ω that the required equalityholds for every substitution τ that is capture free for P , with τ Free (P) the identity. The reader is left to check the base cases (where P isa generator in G X or true ) and the inductive steps for negation andconjunction. Now suppose P = ( ∀ x)Q . The assertion a ∈ [[τ(( ∀ x)Q)]] is equivalent to b ∈ [[τ x (Q)]] for each b : X → M with b(y) = a(y) for y ≠ x , that is, τ x ; b ∈ [[Q]] , because (τ x ) Free (Q) is the identity, τ x is capture free for Q , plus the induction hypothesis. Similarly, a ∈ [[τ]] − ([[( ∀ x)Q]]) is equivalent to τ ; a ∈ [[( ∀ x)Q]] , that is, b (cid:48) ∈ [[Q]] for each b (cid:48) : X → M with b (cid:48) (y) = a(τ(y)) for y ≠ x .Suppose a ∈ [[τ(( ∀ x)Q)]] and let b (cid:48) : X → M such that b (cid:48) (y) = a(τ(y)) for y ≠ x . Define b : X → M by b(x) = b (cid:48) (x) and b(y) = a(y) for y ≠ x ; then τ x ; b ∈ [[Q]] . We now claim b (cid:48) = τ x ; b . In- deed, (τ x ; b)(x) = b(x) = b (cid:48) (x) , and if y ≠ x then (τ x ; b)(y) = b(τ x (y)) = b(τ(y)) . But x ∉ Var (τ(y)) , because if y ∈ Free (P ) then x ∉ Var (τ(y)) because τ is capture free for P , and if y ∉ Free (P ) then τ(y) = y . Then b(τ(y)) = a(τ(y)) = b (cid:48) (y) , that is, b (cid:48) = τ x ; b .Therefore b (cid:48) ∈ [[Q]] , that is, τ ; a ∈ [[( ∀ x)Q]] .Conversely, suppose τ ; a ∈ [[( ∀ x)Q]] and let b : X → M such that b(y) = a(y) for y ≠ x and let b (cid:48) be τ x ; b . Then b (cid:48) (y) = a(τ(y)) (asabove). Therefore τ x ; b ∈ [[Q]] , that is, a ∈ [[τ(( ∀ x)Q)]] .Now let θ be any substitution and let τ be the substitution θ X − Free (P) .Then τ is capture free for P , and τ Free (P) is the identity; therefore [[τ(P )]] = [[τ]] − ([[P ]]) . By 5. 
of Exercise 8.3.14, θ(P ) = τ(P ) ; there-fore it suffices to prove θ ; a ∈ [[P ]] iff τ ; a ∈ [[P ]] for each a : X → M . ByProposition 8.3.3, it is enough to show that (θ ; a)(y) = (τ ; a)(y) for y ∈ Free (P ) , which is true because θ(y) = τ(y) for y ∈ Free (P ) , byconstruction of τ . (cid:2) B.5 Order-Sorted Algebra This section provides the omitted proofs for results on order-sorted algebra in Chapter 10. Theorem 10.2.8 ( Initiality ) If Σ is regular and if M is any Σ -algebra, then thereis one and only one Σ -homomorphism from T Σ to M . Proof: In this proof we write T for T Σ . Let M be an arbitrary order-sorted Σ -algebra; then we must show that there is a unique order-sorted Σ -homomorphism h : T → M . We will (1) construct h , then (2) show it isan order-sorted Σ -homomorphism, and finally (3) show it is unique.(1) We construct h by induction on the depth of terms in T . Thereare two cases: rder-Sorted Algebra (1a) If t ∈ T has depth 0, then t = σ for some constant σ in Σ . Byregularity, σ has a least sort s . Then for any s (cid:48) ≥ s we define h s (cid:48) (σ ) = M [],sσ (1b) If t = σ (t . . . t n ) ∈ T has depth n + 1, then by regularity thereare least w and s with σ ∈ Σ w,s where w = s . . . s n ≠ [] and LS(t i ) ≤ s i for i = , . . . , n . Then for any s (cid:48) ≥ s we define h s (cid:48) (t) = M w,sσ (h s (t ), . . . , h s n (t n )) , noting that h s (t ), . . . , h s n (t n ) are already defined.(2) We now show that h is an order-sorted Σ -homomorphism. Byconstruction h satisfies the restriction condition E48 of Definition 10.1.5.To see that it also satisfies the homomorphism condition of Defini- tion 10.1.5, we again consider two cases:(2a) σ ∈ Σ [],s is a constant. By regularity and monotonicity, s is theleast sort of σ , and we have already defined h s (σ ) = M [],sσ as needed.(2b) We now consider a term t of depth greater than 0, and let σ ∈ Σ w (cid:48) ,s (cid:48) with w (cid:48) = s (cid:48) . . . 
s (cid:48) n ≠ [] be such that t = σ (t . . . t n ) =T w (cid:48) ,s (cid:48) σ (t , . . . , t n ) . By regularity and Proposition 10.2.7 there are least w = s . . . s n and s = LS(t) such that t = σ (t . . . t n ) = T w,sσ (t , . . . , t n ) .Then w ≤ w (cid:48) and s ≤ s (cid:48) so that (2) of Definition 10.1.3 gives M w (cid:48) ,s (cid:48) σ = M w,sσ on M w . Thus, using the already established fact that h satisfiesthe restriction condition, we have h s (cid:48) (σ (t . . . t n )) = M w,sσ (h s (t ), . . . , h s n (t n )) = M w (cid:48) ,s (cid:48) σ (h s (cid:48) (t ), . . . , h s (cid:48) n (t n )). (3) Finally, we show the uniqueness of h . In fact, we will show thatif h (cid:48) : T → M is an order-sorted Σ -homomorphism, then h = h (cid:48) , byinduction on the depth of terms. For depth 0 consider σ ∈ Σ [],s . Then s is the least sort of σ , and for any s ≥ s (cid:48) , we must have h (cid:48) s (cid:48) (σ ) = h (cid:48) s (σ ) = M [],sσ = h s (σ ) = h s (cid:48) (σ ) , as desired. Now assume the result for depth ≤ n , and consider a term t = σ (t . . . t n ) = T w (cid:48) ,s (cid:48) σ (t , . . . , t n ) of depth n + σ ∈ Σ w (cid:48) ,s (cid:48) and w (cid:48) = s (cid:48) . . . s (cid:48) n . As in (2b), there are least w = s . . . s n and s = LS(t) such that t = σ (t , . . . , t n ) = T w,sσ (t , . . . , t n ) and M w (cid:48) ,s (cid:48) σ = M w,sσ on M w . Then h (cid:48) s (cid:48) (t) = M w (cid:48) ,s (cid:48) σ (h (cid:48) s (cid:48) (t ), . . . , h (cid:48) s (cid:48) n (t n )) = M w (cid:48) ,s (cid:48) σ (h s (cid:48) (t ), . . . , h s (cid:48) n (t n )) ( by the induction hypothesis ) = M w,sσ (h s (t ), . . . , h s n (t n )) = h s (cid:48) (t) as needed. (cid:2) Exiled Proofs Theorem 10.2.9 ( Freeness ) If (S, ≤ , Σ ) is regular, then T Σ (X) is a free Σ -algebraon X , in the sense that for each Σ -algebra M and each assignment a : X → M , there is a unique Σ -homomorphism a : T Σ (X) → M such that a(x) = a(x) for all x in X . 
Proof: The Σ -algebras M with an assignment a : X → M are in bijective cor-respondence with Σ (X) -algebras M . Now the initiality of T Σ (X) amongall Σ (X) -algebras A (Theorem 10.2.8) gives the desired result. (cid:2) Theorem 10.3.2 ( Completeness ) Given a coherent order-sorted signature Σ , given t, t (cid:48) in T Σ (X) , and given a set A of conditional Σ -equations, then the fol-lowing assertions are equivalent:(C1) ( ∀ X) t = t (cid:48) is derivable from A using rules (1)–(4) and (5C).(C2) ( ∀ X) t = t (cid:48) is satisfied by every order-sorted Σ -algebra thatsatisfies A .When all equations in A are unconditional, the same holds replacingrule (5C) by rule (5). Proof: We leave the reader to check soundness , i.e., that (C1) implies (C2); thisfollows as usual by induction from the soundness of each rule of de-duction separately. Here we show completeness , i.e., that (C2) implies(C1). The structure of this proof is as follows: We are given a Σ -equation e = ( ∀ X) t = t (cid:48) that is satisfied by every Σ -algebra that satisfies A , andwe wish to show that e is derivable from A ; to this end, we constructa particular Σ -algebra M such that if M satisfies e then e is derivablefrom A ; then we show that M satisfies A .First, we show that the following property of terms t, t (cid:48) ∈ T Σ (X) s for some sort s , defines an order-sorted Σ -congruence on T Σ (X) :(D) ( ∀ X) t = t (cid:48) is derivable from A using rules (1–4) plus (5C).Let us denote this relation ≡ . Then rules (1–3) say that ≡ is an equiva-lence relation on T Σ (X) s for each sort s . By applying rule (4) to terms t of the form σ (x , . . . , x n ) for σ ∈ Σ , we see that ≡ is a many-sorted Σ -congruence. Finally, ≡ is also an order-sorted Σ -congruence, becauseproperty (D) does not depend upon s .Now we can form the order-sorted quotient of T Σ (X) by ≡ , whichwe denote by T Σ ,A (X) , or within this proof, just M . 
Then by the construction of M, for each t, t′ ∈ T_Σ(X) we have

(*) [t] = [t′] in M iff (D) holds,

where [t] denotes the ≡-equivalence class of t. We next show the key property of M, that

(**) (∀X) t = t′ satisfied in M implies that (D) holds.

[Figure B.1: Factorization of θ — the assignment θ : Y → M factors through the quotient map [_] : T_Σ(X) → M via the substitution ϕ : Y → T_Σ(X).]

Since the equation (∀X) t = t′ is satisfied in M, we can use the inclusion i_X : X → M sending x to [x] as an S-sorted assignment to see that [t] = [t′] in M; then (D) holds by (*).

We now prove that M satisfies A. Let (∀Y) t = t′ if C be a conditional equation in A, and let θ : Y → M be an S-sorted assignment such that θ(u) = θ(v) for each condition u = v in C. Then for each s ∈ S and each y ∈ Y_s, we can choose a representative t_y ∈ T_Σ(X)_s such that θ(y) = [t_y] in M. Now let ϕ : Y → T_Σ(X) be the substitution sending y to t_y. Then θ(y) = [ϕ(y)] for each y ∈ Y, and therefore θ(t) = [ϕ(t)] in M for any t ∈ T_Σ(Y), by the freeness of T_Σ(Y) over Y. See Figure B.1.

Therefore [ϕ(u)] = [ϕ(v)] holds in M, and by the property (*), the equation (∀X) ϕ(u) = ϕ(v) is derivable from A using (1–4) plus (5C) for each u = v in C. Therefore by rule (5C), the equation (∀X) ϕ(t) = ϕ(t′) is derivable from A, and hence by (*), θ(t) = θ(t′) holds in M, and thus the conditional equation (∀Y) t = t′ if C holds in M.

Since an unconditional equation is just a conditional equation whose set C of conditions is empty, when every equation in A is unconditional we are reduced to the simplified special case of the above argument where only the rule (5) is needed. □

This result also gives completeness for ordinary MSA, and of course for unsorted algebra, as special cases.
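The key property (*) above — two terms are identified in the quotient algebra iff the corresponding equation is derivable — becomes effectively computable in the special case of finitely many ground, unconditional equations, by closing a union-find structure under the congruence rule. The following Python sketch is illustrative only (one-sorted, ground, unconditional; all names are mine, not the text's):

```python
# Congruence-closure sketch (illustrative): terms are nested tuples
# ("op", arg1, ..., argn); equations are pairs of ground terms.
def subterms(t, acc=None):
    acc = set() if acc is None else acc
    acc.add(t)
    for a in t[1:]:
        subterms(a, acc)
    return acc

def congruence_classes(equations, extra_terms=()):
    # Returns a `find` function: find(s) == find(t) iff s = t is derivable
    # from the equations by reflexivity/symmetry/transitivity/congruence,
    # restricted to the subterms supplied.
    univ = set()
    for l, r in equations:
        subterms(l, univ); subterms(r, univ)
    for t in extra_terms:
        subterms(t, univ)
    parent = {t: t for t in univ}
    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]   # path halving
            t = parent[t]
        return t
    def union(a, b):
        parent[find(a)] = find(b)
    for l, r in equations:
        union(l, r)
    changed = True
    while changed:                          # close under congruence
        changed = False
        ts = list(univ)
        for i, s in enumerate(ts):
            for t in ts[i + 1:]:
                if find(s) != find(t) and s[0] == t[0] and len(s) == len(t) \
                   and all(find(a) == find(b) for a, b in zip(s[1:], t[1:])):
                    union(s, t)
                    changed = True
    return find

# ground equations: f(a) = a and b = a
a, b = ("a",), ("b",)
fa, ffa, fb = ("f", a), ("f", ("f", a)), ("f", b)
find = congruence_classes([(fa, a), (b, a)], extra_terms=[ffa, fb])
assert find(ffa) == find(a)    # f(f(a)) = a is derivable
assert find(fb) == find(fa)    # congruence: b = a gives f(b) = f(a)
```

This is only the decidable ground fragment; with variables or conditions, derivability is in general undecidable, which is why the text works with the quotient algebra abstractly.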
Now the initiality and freeness results:

Theorem 10.4.11 (Initiality) If Σ is coherent and A is a set of (possibly conditional) Σ-equations, then T_{Σ,A} is an initial (Σ, A)-algebra, and T_{Σ,A}(X) is a free (Σ, A)-algebra on X, in the sense that for each (Σ, A)-algebra M and each assignment a : X → M, there is a unique Σ-homomorphism ā : T_{Σ,A}(X) → M such that ā(x) = a(x) for each x in X.

Proof: First notice that the freeness of T_{Σ,A}(X) specializes to the initiality of T_{Σ,A} when X = ∅, so that it suffices to show the freeness of T_{Σ,A}(X). Let M be an order-sorted algebra satisfying A, and let a : X → M be an assignment for M. Then we have to show that there is a unique order-sorted Σ-homomorphism a& : T_{Σ,A}(X) → M extending a, i.e., such that a&(q(x)) = a(x) for each x ∈ X, where q denotes the quotient homomorphism q : T_Σ(X) → T_{Σ,A}(X). The existence of a& follows from completeness (Theorem 10.3.2), because the fact that M satisfies A implies that a*(t) = a*(t′) for every equation (∀X) t = t′ that is derivable from A with the rules (1–4) plus (5C); this implies that ≡ ⊆ ker(a*), and thus by the universal property of quotients (Proposition 10.4.10), there is a unique order-sorted homomorphism a& : T_{Σ,A}(X) → M with a* = a& ∘ q.

The uniqueness of a& now follows by combining the universal property of T_Σ(X) as a free order-sorted algebra on X with the universal property of q as a quotient, as follows: Let h : T_{Σ,A}(X) → M be another order-sorted homomorphism such that h(q(x)) = a(x) for each x ∈ X. Since T_Σ(X) is a free order-sorted algebra on X, we have a* = h ∘ q, and by the universal property of q as a quotient we have h = a&, as desired. □

Theorem 10.3.3 Given a coherent signature Σ and a set A of (possibly conditional) Σ-equations, then for any unconditional Σ-equation e, A ⊢_C e iff A ⊢_( , ,±) e.
Proof: See page 8 of [82] (OSA1), and Theorem 4.9.1, page 82. □

The model-theoretic proof of Theorem 10.6.6 below uses naturality of the family ψ_X of morphisms, which in particular implies commutativity of the following diagram for X ⊆ X′, where μ_{X,X′} is the unique Σ⊗-homomorphism induced by the composite map X ↪ X′ → T_{Σ⊗,A⊗}(X′):

[Diagram: a commutative square with top edge ψ_X : T_{Σ,A}(X) → T_{Σ⊗,A⊗}(X), left edge ι_{X,X′} : T_{Σ,A}(X) → T_{Σ,A}(X′), right edge μ_{X,X′} : T_{Σ⊗,A⊗}(X) → T_{Σ⊗,A⊗}(X′), and bottom edge ψ_{X′} : T_{Σ,A}(X′) → T_{Σ⊗,A⊗}(X′).]

Theorem 10.6.6 If Σ is coherent and (Σ, A) is faithful, then the extension (Σ, A) ⊆ (Σ⊗, A⊗) is conservative.

Proof: We have to show that ψ_X : T_{Σ,A}(X) → T_{Σ⊗,A⊗}(X) is injective. By the above naturality diagram plus faithfulness, it suffices to show that ψ_{X′} : T_{Σ,A}(X′) → T_{Σ⊗,A⊗}(X′) is injective, where X′ ⊇ X is obtained from X by adding a new variable symbol of sort s for each sort s with X_s = ∅. Now pick an arbitrary variable symbol x_s ∈ X′_s for each s ∈ S. The key step is to make the (Σ, A)-algebra T_{Σ,A}(X′) into a (Σ⊗, A⊗)-algebra by defining r_{s′,s} : T_{Σ,A}(X′)_{s′} → T_{Σ,A}(X′)_s to be the function that sends [t] ∈ T_{Σ,A}(X′)_{s′} to [t] if t also has sort s, and otherwise sends it to x_s. It is now easy to see that the retract equations are satisfied. Thus the freeness of T_{Σ⊗,A⊗}(X′) implies that the natural inclusion X′ → T_{Σ,A}(X′) induces a unique Σ⊗-homomorphism q : T_{Σ⊗,A⊗}(X′) → T_{Σ,A}(X′) such that q ∘ ψ_{X′} is the identity. Therefore ψ_{X′} is injective. □

Some Background on Relations

The first part of this appendix reviews some basic material that is assumed in the body of this text.
The approach is oriented towards use in algebra and OBJ, and differs from traditional set-theoretic formalizations like that in Definition C.0.1. Section C.1 implements some of this material in OBJ.

Definition C.0.1 A set-theoretic relation, from a set A to a set B, is a subset R ⊆ A × B; we let aRb mean that ⟨a, b⟩ ∈ R. The image of R ⊆ A × B is the set {b | aRb for some a ∈ A}, a subset of B, and the coimage of R is {a | aRb for some b ∈ B}, a subset of A. A set-theoretic function from A to B is a set-theoretic relation f from A to B that satisfies the following two properties:
(1) for each a ∈ A there is some b ∈ B such that ⟨a, b⟩ ∈ f, and
(2) if ⟨a, b⟩, ⟨a, b′⟩ ∈ f then b = b′.
When f is a function, we usually write f(a) for the unique b such that ⟨a, b⟩ ∈ f. □

The above is not very satisfactory in some respects. For example, consider the case where A ⊆ B and we want f to be the inclusion function from A to B. As a set-theoretic relation, this is {⟨a, a⟩ | a ∈ A}, which is exactly the same as the set-theoretic relation for the identity function on A (for a specific example, let A = ω+ and B = ω*). But these two functions are not the same; although they have the same graph, image, and coimage, they have different target sets. Indeed, there are many proofs in this text that use inclusion functions, and that would fail if inclusion and identity functions were the same! Hence, the above formalization of the relation concept is not suitable for our purposes. Instead, we use the following:

Definition C.0.2 A relation from A to B is a triple ⟨A, R, B⟩, where R is a set-theoretic relation from A to B, called the graph of the relation. We write aRb if ⟨a, b⟩ ∈ R. If B = A, then R is said to be a relation on A.
A and B are called the source and target of R, respectively, or sometimes the domain and codomain of R, respectively, and we may write R : A → B. We may also use the terms image and coimage, as defined in Definition C.0.1, for the graph R of the relation. If the graph R satisfies (1) and (2) of Definition C.0.1, then the relation is called a function, an arrow, or a map from A to B. □

This differs from Definition C.0.1 in that source and target sets are explicitly given; this allows us to distinguish inclusions from identities, but we may still abbreviate R : A → B by just R.

Definition C.0.3 Given a relation R : A → B and A′ ⊆ A, then {b | ⟨a, b⟩ ∈ R for some a ∈ A′} is called the image of A′ under R, written R(A′). Also, given B′ ⊆ B, then {a | ⟨a, b⟩ ∈ R for some b ∈ B′} is called the inverse image of B′ under R, written R⁻¹(B′). □

The following equivalent formalization of relations is more suitable for mechanization in OBJ (the equivalence is discussed in Chapter 8):

Definition C.0.4 A relation from A to B is an arrow A × B → {true, false}. Its graph is the set {⟨a, b⟩ | R(a, b) = true}, and A, B are called its source and target sets, respectively. □

Here are some further concepts associated with functions that are used in this book:

Definition C.0.5 A function f : A → B is injective iff f(a) = f(a′) implies a = a′ for all a, a′ ∈ A, is surjective iff for all b ∈ B there is some a ∈ A such that f(a) = b, and is bijective iff it is both injective and surjective. □

Exercise C.0.1 Given a function f : A → B, show that f⁻¹(B) = A. □

Exercise C.0.2 Show that a function f : A → B is surjective iff its image is B.
□

Definition C.0.6 Given a function f : A → B and A′ ⊆ A, then the restriction of f to A′ is the function f|A′ : A′ → B with graph {⟨a, b⟩ | a ∈ A′ and ⟨a, b⟩ ∈ f}. Also, given B′ ⊆ B, the corestriction of f to B′ is the function f⁻¹(B′) → B′ with graph {⟨a, b⟩ | b ∈ B′ and ⟨a, b⟩ ∈ f}. □

We now consider a number of different kinds of relation on a set. (Note that we can always recover the source of a set-theoretic function, because of condition (1), though not its target.)

Definition C.0.7 A relation R on a set A is:
• reflexive iff aRa for all a ∈ A,
• symmetric iff aRa′ implies a′Ra for all a, a′ ∈ A,
• anti-reflexive iff aRa for no a ∈ A,
• anti-symmetric iff aRa′ and a′Ra imply a = a′ for all a, a′ ∈ A,
• transitive iff aRa′ and a′Ra′′ imply aRa′′ for all a, a′, a′′ ∈ A,
• a partial ordering iff it is reflexive, anti-symmetric, and transitive,
• a quasi ordering iff it is anti-reflexive and transitive, and
• an equivalence relation iff it is reflexive, symmetric, and transitive.

It is customary to let ≥ and > denote partial and quasi orderings, respectively, and to call a set with a partial ordering a poset, in which case the underlying set A may be called the carrier of the poset. □

Example C.0.8 Let A be the set of all people. Then the relation "ancestor-of" is transitive and anti-reflexive, and is thus a quasi ordering; but it is not symmetric or reflexive. The "cousin-of" relation is symmetric, but not transitive or reflexive, although the "cousin-of-or-equal" relation is reflexive, symmetric and transitive, and thus is an equivalence relation. The "child-of" relation is anti-reflexive, but has none of the other properties in Definition C.0.7.
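On a finite carrier, each property in Definition C.0.7 can be checked by brute force. Here is an illustrative Python sketch (not from the text; the carrier and helper names are my own), using divisibility on a small set of numbers as a concrete partial ordering:

```python
# Brute-force checks of the properties in Definition C.0.7.
# A relation is represented as a set R of pairs over a carrier A.

def reflexive(A, R):      return all((a, a) in R for a in A)
def anti_reflexive(A, R): return all((a, a) not in R for a in A)
def symmetric(A, R):      return all((b, a) in R for (a, b) in R)
def anti_symmetric(A, R): return all(a == b for (a, b) in R if (b, a) in R)
def transitive(A, R):
    return all((a, c) in R for (a, b) in R for (b2, c) in R if b == b2)

def is_partial_order(A, R):
    return reflexive(A, R) and anti_symmetric(A, R) and transitive(A, R)

def is_quasi_order(A, R):
    return anti_reflexive(A, R) and transitive(A, R)

A = {1, 2, 3, 4, 6}
divides = {(a, b) for a in A for b in A if b % a == 0}
print(is_partial_order(A, divides))   # divisibility is a partial ordering

strict = {(a, b) for (a, b) in divides if a != b}
print(is_quasi_order(A, strict))      # proper divisibility is a quasi ordering
```

The "divides" relation plays the role of ≥ here, and its strict version the role of >, matching the notational convention stated after Definition C.0.7.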
□

If X is a set, then P(X) denotes the set of all subsets of X, called the power set of X.

Exercise C.0.3 Let A = P(ω). Then what properties does the "subset-of" relation on A have? What about the "proper-subset-of" relation? If A = P(X), do the answers to these questions vary with X? If so, how? If not, why? □

There is an important bijective correspondence between the partial and the quasi orderings on a set. This is expressed precisely in the following:

Proposition C.0.9 Given a set A and a relation R on A, define R^Q by a R^Q a′ iff aRa′ and a ≠ a′, and define R^P by a R^P a′ iff aRa′ or a = a′. Then R^Q is a quasi order if R is a partial order, and R^P is a partial order if R is a quasi order. Moreover, if R is a quasi order, then (R^P)^Q = R, and if R is a partial order then (R^Q)^P = R.

Proof: The reader can check the following assertions, which complete the proof: if R is transitive and anti-symmetric, then R^Q is transitive; if R is transitive then so is R^P; if R is any relation, then R^Q is anti-reflexive, and R^P is reflexive and anti-symmetric; if R is a partial order, then aRa′ iff a R^Q a′ or a = a′ iff a (R^Q)^P a′; and if R is a quasi order, then aRa′ iff a R^P a′ and a ≠ a′ iff a (R^P)^Q a′. □

We can also have operations and relations on relations. For example:

Definition C.0.10 If R and R′ are relations on A, then R ⊆ R′ means that aRa′ implies aR′a′ for all a, a′ ∈ A. Also, we define the union of R and R′, denoted R ∪ R′, by a(R ∪ R′)a′ iff aRa′ or aR′a′, and their intersection, denoted R ∩ R′, by a(R ∩ R′)a′ iff aRa′ and aR′a′.
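The correspondence of Proposition C.0.9 is easy to observe concretely: passing from a partial order to its quasi (strict) version just removes the diagonal, and passing back adds it. A small illustrative Python sketch (not from the text), checking both round trips on the usual ordering of {1, 2, 3}:

```python
# Sketch of Proposition C.0.9: R^Q removes the diagonal, R^P adds it,
# and the two constructions are mutually inverse.

def to_quasi(R):
    # R^Q: keep only pairs with distinct components
    return {(a, b) for (a, b) in R if a != b}

def to_partial(R, A):
    # R^P: adjoin the diagonal over the carrier A
    return set(R) | {(a, a) for a in A}

A = {1, 2, 3}
leq = {(a, b) for a in A for b in A if a <= b}   # a partial order
lt = to_quasi(leq)                               # the corresponding quasi order
print(to_partial(lt, A) == leq)    # (R^Q)^P = R for a partial order R
print(to_quasi(to_partial(lt, A)) == lt)   # (R^P)^Q = R for a quasi order R
```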
□

Proposition C.0.11 Every relation R on A is contained in a least transitive relation on A, denoted R⁺ and called the transitive closure of R.

Proof: We define R⁺ as follows: aR⁺a′ iff there exists a finite (possibly empty) list a₀, a₁, . . . , aₙ of elements of A such that aRa₀ and a₀Ra₁ and . . . aₙRa′. Then R ⊆ R⁺, and it is not hard to check that R⁺ is transitive.

Now suppose that R ⊆ S and that S is transitive. If aR⁺a′, then there exist a₀, . . . , aₙ such that aRa₀ and a₀Ra₁ and . . . aₙRa′. Thus aSa₀ and a₀Sa₁ and . . . aₙSa′ (because R ⊆ S), and so transitivity of S gives aSa′. Therefore R⁺ ⊆ S, and hence R⁺ is least. □

Example C.0.12 If A is the set of all people and R is the "parent-of" relation, then R⁺ is the "ancestor-of" relation. □

Proposition C.0.13 Every relation R on a set A is contained in a least transitive and reflexive relation on A, denoted R∗ and called the transitive, reflexive closure of R.

Proof: Let us define aR∗a′ iff a = a′ or aR⁺a′. Then R∗ is transitive, reflexive, and contains R. If S is another such relation, then R⁺ ⊆ S by Proposition C.0.11 and hence R∗ ⊆ S (by reflexivity). □

Definition C.0.14 Given a relation R on A, its converse is denoted R⌣, and is defined by aR⌣a′ iff a′Ra. □

Example C.0.15 If A is the set of people and R is the "parent-of" relation, then R⌣ is the "child-of" relation. □

Proposition C.0.16 Every relation R on A is contained in a least symmetric relation on A, namely R ∪ R⌣, called the symmetric closure of R and denoted R±. □

Proposition C.0.17 Every relation R on A is contained in a least equivalence relation on A, namely (R±)∗, called the equivalence relation generated by R, and denoted R≡. □

Exercise C.0.4 Prove Propositions C.0.16 and C.0.17.
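On a finite carrier the closures of Propositions C.0.11 through C.0.17 can all be computed by iteration. The following Python sketch (illustrative only, not from the text; the example carrier is my own) builds each one, and recovers the "ancestor-of" relation of Example C.0.12 from a tiny "parent-of" relation:

```python
# Sketch of the closure constructions: R+ (transitive), R* (transitive,
# reflexive), R± (symmetric), and R≡ (generated equivalence relation).

def transitive_closure(R):
    # R+: add composites (a, c) until nothing new appears
    R = set(R)
    while True:
        new = {(a, c) for (a, b) in R for (b2, c) in R if b == b2} - R
        if not new:
            return R
        R |= new

def refl_trans_closure(R, A):
    # R* = R+ together with the diagonal (Proposition C.0.13)
    return transitive_closure(R) | {(a, a) for a in A}

def symmetric_closure(R):
    # R± = R union its converse (Proposition C.0.16)
    return set(R) | {(b, a) for (a, b) in R}

def equivalence_closure(R, A):
    # R≡ = (R±)* (Proposition C.0.17)
    return refl_trans_closure(symmetric_closure(R), A)

people = {"ann", "bob", "cal"}
parent_of = {("ann", "bob"), ("bob", "cal")}
ancestor_of = transitive_closure(parent_of)   # Example C.0.12
print(("ann", "cal") in ancestor_of)          # True: grandparent is ancestor
```

Each function adds exactly the pairs the corresponding proposition requires, and nothing more, so each result is the least closure of its kind.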
□

Exercise C.0.5 Show that a function f : A → B is bijective iff the converse of its graph is the graph of a function from B to A. □

Definition C.0.18 If ≡ is an equivalence relation on a set A, we let [a] denote the ≡-equivalence class of a ∈ A, defined to be [a] = { a′ ∈ A | a′ ≡ a }, and we define the quotient of A by ≡, denoted A/≡, to be { [a] | a ∈ A }. □

Exercise C.0.6 If ≡ is an equivalence relation, and if [a] and [b] are two distinct ≡-equivalence classes, then show that [a] ∩ [b] = ∅. Also, show that ⋃{ C | C ∈ A/≡ } = A. □

Example C.0.19 If A is the set of all people, alive now or in the past, and if R is again the "parent-of" relation, then the hypothesis that all people are descended from Adam and Eve implies that there is just one equivalence class under the relation R≡. On the other hand, if there is more than one equivalence class, there may be aliens among us, another species, or some other non-interbreeding population. □

Everything in this section extends to S-sorted relations and functions in the style of Section 2.2, using the following set-theoretic representation for S-sorted sets: Let S be a set, and let Set be some set of sets that includes all sets that are candidates for use in indexed sets within the current context; then an S-sorted set A is a function A : S → Set.

C.1 OBJ Theories for Relations

This section gives OBJ3 code for some of the concepts discussed above. It is intended to be read later than the above material, after the relevant concepts from OBJ have been studied.

To get started, here is OBJ3 code for the theory of relations:

  th REL is
    sort Elt .
    op _R_ : Elt Elt -> Bool .
  endth

Notice that nothing at all is assumed about R. We will enrich this theory in various ways in the following development.

Example C.1.1 The theory of partial ordering relations is given below; a set with a partial ordering is called a "poset."

  th POSET is
    sort Elt .
    op _=>_ : Elt Elt -> Bool .
    vars A B C : Elt .
    eq A => A = true .
    cq A = B if A => B and B => A .
    cq A => C = true if A => B and B => C .
  endth

Any initial algebra of the following specification of the natural numbers with their natural ordering will satisfy the above theory:

  th NAT is
    sort Nat .
    op 0 : -> Nat .
    op s : Nat -> Nat .
    op _=>_ : Nat Nat -> Bool .
    vars A B : Nat .
    eq A => A = true .
    eq s(A) => 0 = true .
    eq s(A) => s(B) = A => B .
  endth

□

Exercise C.1.1 Write an OBJ3 theory for quasi orders and show how to get a partial order from a quasi order, and vice versa. Give three examples of a quasi order. □

Example C.1.2 (⋆) The transitive closure of a relation R is specified by the following, in which the imported module REL is assumed to define R:

  obj TRCL is
    pr REL .
    op _R+_ : Elt Elt -> Bool .
    vars A B C : Elt .
    cq A R+ B = true if A R B .
    cq A R+ C = true if A R B and B R+ C .
  endo

However, it is more elegant to define transitive closure as a parameterized theory, as follows (Chapter 11 discusses this concept):

  obj TRCL[R :: REL] is
    op _R+_ : Elt Elt -> Bool .
    vars A B C : Elt .
    cq A R+ B = true if A R B .
    cq A R+ C = true if A R B and B R+ C .
  endo

There are some peculiar points about the OBJ3 specifications above. First, the second conditional equation is not a rewrite rule, because the variable B occurs in the condition but not in the leftside. Therefore OBJ3 will not accept it in an object module, although it will accept it in a theory module. However, initial semantics is needed, because the transitive closure is supposed to be the least transitive relation containing the given one. This means that the above code will not actually run in OBJ3. This is because a variable that is not in the leftside acts as if it were existentially quantified, and OBJ3 cannot handle existential quantifiers.
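To see why the extra variable B behaves existentially, note that the second TRCL equation says: A R+ C holds if there exists some B with A R B and B R+ C. Finding such a witness B requires search rather than rewriting. An illustrative Python sketch of that search on a finite relation (not from the text; the function name and cycle guard are my own):

```python
# The condition "A R B and B R+ C" with B not in the leftside amounts to
# an existential: search for a witness B. The 'seen' set guards against
# cycles in R so the search terminates on finite relations.

def r_plus(R, a, c, seen=None):
    seen = seen if seen is not None else set()
    if (a, c) in R:          # first TRCL equation: A R C gives A R+ C
        return True
    # second TRCL equation: try each b with a R b as a witness --
    # this witness search is the step that plain rewriting cannot perform
    return any(b not in seen and r_plus(R, b, c, seen | {a})
               for (a2, b) in R if a2 == a)

R = {(1, 2), (2, 3)}
print(r_plus(R, 1, 3))   # True, via the witness b = 2
```

This is essentially what a language with initial semantics and existential queries, such as Eqlog, would do for us automatically.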
However, the specifications make perfect semantic sense, and in fact would run in Eqlog [79, 42].

Note that under initial semantics, when a R+ b does not equal true, it also does not equal false, but rather equals the term a R+ b, because that term is itself reduced. This means that if we want a version of transitive closure that does equal false when it is not true, then we should replace occurrences of the expression a R+ b by the expression a R+ b == true. □

Exercise C.1.2 (⋆) Write an OBJ3 parameterized theory specifying the transitive, reflexive closure of a relation. □

Example C.1.3 Here is an OBJ3 theory for equivalence relations:

  th EQV is
    sort Elt .
    op _≡_ : Elt Elt -> Bool .
    vars A B C : Elt .
    eq A ≡ A = true .
    eq A ≡ B = B ≡ A .
    cq A ≡ C = true if A ≡ B and B ≡ C .
  endth

□

Exercise C.1.3 (⋆) Write an OBJ3 parameterized theory specifying the transitive, reflexive, symmetric closure of a relation; this is called its equivalence closure. □

Social Implications

Practical interest in verification technology is particularly acute for so-called "critical systems," which have the property that incorrect operation may result in loss of human life, compromise of national security, massive loss of property, etc. Typical examples are heart pacemakers, flight control systems, automobile brake controllers, encryption systems, nuclear power plants, and electronic fund transfer systems. In this context, it is especially important to understand the limitations of verification, both those limitations that are inherent in the nature of verification, and those that are due to the current state of the art. Unfortunately, there is a temptation to play down, or even cover up, these limitations, due to the lure of fame and fortune.
When verification is just an academic exercise, this does little harm; but there is cause for serious concern when manufacturers make advertising claims about the reliability of a critical system (or component thereof) based on its having been verified. As Dr. Avra Cohn [31] says,

  The use of the word 'verified' must under no circumstances be allowed to confer a false sense of security.

This is because it could lead to unintentionally and unnecessarily taking severe risks. Indeed, one might well argue that to knowingly make false or misleading claims about the reliability of a critical system should be a criminal offense. It is certainly a grave moral offense.

We should realize that nothing in the real world can have the certainty of mathematical truth. Although we might like to think that '8 × 7 = 56' is always and incontrovertibly true in some mathematical heaven, an actual human being can sometimes misremember the multiplication table, and an actual machine can sometimes drop a bit, break a connector, or burn out a power supply. The most that can truthfully be asserted of a "verified chip" is that certain kinds of design errors have been made far less likely. Of course, this is very much worth pursuing, but there remains a long chain of assumptions that must be satisfied in order that an actual physical instance of a chip whose logic has been verified will operate as intended in situ, including the following: the chip must be correctly fabricated; it must be correctly installed; it must be fed correct data and given correct power; the electronic circuits that realize its logic must be correctly designed, and used only within their design limits; the analog circuits that support communication must operate correctly; there must not be excessive electromagnetic radiation around the chip; etc., etc.
In addition, human factors are often involved; for example, the user should not override warning signals, and must correctly interpret the output.

However, I wish to concentrate on certain logical issues that are involved. A major point, emphasized by Cohn [31], is that verification is a relationship between two models, that is, between two mathematical abstractions, one of the chip, and the other of the designers' intentions. However, such models can never capture either the totality of any particular chip, or even of the designers' intentions for that chip. This is due to a number of factors, including errors made in constructing these necessarily complex abstractions, and deliberately ignoring certain factors (which is of course the very nature of abstraction), such as fluctuations in power levels, aging of physical components, overheating, etc. Moreover, since the languages in which designs and specifications are usually expressed are relatively informal, there must be a translation into some formal language, and errors may be introduced when either the designers or the verifiers misunderstand these informal languages. In addition, the theorem prover must correctly implement some logical system, and the formalism for representing the chip in logic must be correct with respect to that system. Unless a theorem prover is rigorously based on a precise and well-understood logical system, there is little basis for confidence in its "proofs." For example, it is seductive but dangerous to "throw together" several different logical systems, since the combination may fail to have any obvious notions of model and satisfaction, even though the components do have them.
Even a theorem prover with a sound logical basis is likely to have some bugs in its code, because it is after all a complex real system itself. Moreover, it is always possible that the assumptions about how formal sentences represent physical devices are flawed in some subtle ways, for example, relating to signal strength, and it is all too easy to use a theorem prover incorrectly, for example, to give it erroneous input, or to interpret its output incorrectly. Finally, we must note that the current state of the art is not adequate to support the verification of really large or complex systems, although recent advances have been both rapid and significant, and the future looks promising.

Bibliography

[1] Arnon Avron, Furio Honsell, and Ian Mason. Using typed lambda calculus to implement formal systems on a computer. Technical Report ECS-LFCS-87-31, Laboratory for Computer Science, University of Edinburgh, 1987.
[2] Franz Baader and Tobias Nipkow. Term Rewriting and All That. Cambridge, 1998.
[3] Henk Barendregt. The Lambda Calculus, its Syntax and Semantics. North-Holland, 1984. Second Revised Edition.
[4] Henk Barendregt. Functional programming and lambda calculus. In Jan van Leeuwen, editor, Handbook of Theoretical Computer Science. North-Holland, 1989.
[5] Michael Barr and Charles Wells. Toposes, Triples and Theories. Springer, 1985. Grundlehren der Mathematischen Wissenschaften, Volume 278.
[6] Michael Barr and Charles Wells. Category Theory for Computing Science. Prentice-Hall, 1990.
[7] Roland Barthes. S/Z: An Essay. Hill and Wang, 1974. Trans. Richard Miller.
[8] Jean Bénabou. Structures algébriques dans les catégories. Cahiers de Topologie et Géométrie Différentielle, 10:1–126, 1968.
[9] Jan Bergstra, Jan Heering, and Paul Klint. Algebraic Specification. Association for Computing Machinery, 1989.
[10] Jan Bergstra and Jan Willem Klop. Conditional rewrite rules: Confluence and termination.
Journal of Computer and System Sciences, 32:323–362, 1986.
[11] Jan Bergstra and John Tucker. Characterization of computable data types by means of a finite equational specification method. In Jaco de Bakker and Jan van Leeuwen, editors, Automata, Languages and Programming, Seventh Colloquium, pages 76–90. Springer, 1980. Lecture Notes in Computer Science, Volume 81.
[12] Garrett Birkhoff. On the structure of abstract algebras. Proceedings of the Cambridge Philosophical Society, 31:433–454, 1935.
[13] Garrett Birkhoff and J. Lipson. Heterogeneous algebras. Journal of Combinatorial Theory, 8:115–133, 1970.
[14] Woodrow Wilson Bledsoe and Donald Loveland, editors. Automated Theorem Proving: After 25 Years. American Mathematical Society, 1984. Volume 29 of Contemporary Mathematics Series.
[15] Barry Boehm. Software Engineering Economics. Prentice-Hall, 1981.
[16] William Boone. The word problem. Ann. Math., 70:207–265, 1959.
[17] Adel Bouhoula. Automated theorem proving by test set induction. Journal of Symbolic Computation, 23(1):47–77, 1997.
[18] Adel Bouhoula and Jean-Pierre Jouannaud. Automata-driven automated induction. In Proceedings, 12th Symposium on Logic in Computer Science, pages 14–25. IEEE, 1997.
[19] Robert Boyer and J Moore. A Computational Logic. Academic, 1980.
[20] J.W. Brewer and Martha K. Smith, editors. Emmy Noether: A Tribute to her Life and Work. Dekker, 1981.
[21] Rod Burstall. Proving properties of programs by structural induction. Computer Journal, 12(1):41–48, 1969.
[22] Rod Burstall and Joseph Goguen. Putting theories together to make specifications. In Raj Reddy, editor, Proceedings, Fifth International Joint Conference on Artificial Intelligence, pages 1045–1058. Department of Computer Science, Carnegie-Mellon University, 1977.
[23] Rod Burstall and Joseph Goguen. The semantics of Clear, a specification language.
In Dines Bjørner, editor, Proceedings, 1979 Copenhagen Winter School on Abstract Software Specification, pages 292–332. Springer, 1980. Lecture Notes in Computer Science, Volume 86.
[24] Rod Burstall and Joseph Goguen. Algebras, theories and freeness: An introduction for computer scientists. In Martin Wirsing and Gunther Schmidt, editors, Theoretical Foundations of Programming Methodology, pages 329–350. Reidel, 1982. Proceedings, 1981 Marktoberdorf NATO Summer School, NATO Advanced Study Institute Series, Volume C91.
[25] Albert Camilleri, Michael J.C. Gordon, and Tom Melham. Hardware verification using higher-order logic. Technical Report 91, University of Cambridge, Computer Laboratory, June 1986.
[26] Carlo Cavenathi, Marco De Zanet, and Giancarlo Mauri. MC-OBJ: a C interpreter for OBJ, 1987.
[27] Chin-Liang Chang and Richard Char-Tung Lee. Symbolic Logic and Mechanical Theorem Proving. Academic, 1973.
[28] Alonzo Church, editor. The Calculi of Lambda-Conversion. Princeton, 1941. Annals of Mathematics Studies, No. 6.
[29] Manuel Clavel, Francisco Durán, Steven Eker, José Meseguer, and M.-O. Stehr. Maude as a formal meta-tool. In Proceedings, FM'99 - Formal Methods, Volume II, pages 1684–1701. Springer, 1999. Lecture Notes in Computer Science, Volume 1709.
[30] Manuel Clavel, Steven Eker, Patrick Lincoln, and José Meseguer. Principles of Maude. In José Meseguer, editor, Proceedings, First International Workshop on Rewriting Logic and its Applications. Elsevier Science, 1996. Volume 4, Electronic Notes in Theoretical Computer Science.
[31] Avra Cohn. Correctness properties of the Viper block model: The second level. In V.P. Subramanyan and Graham Birtwhistle, editors, Current Trends in Hardware Verification and Automated Theorem Proving, pages 1–91. Springer, 1989.
[32] Paul M. Cohn. Universal Algebra. Harper and Row, 1965. Revised edition 1980.
[33] Derek Coleman, Robin Gallimore, and Victoria Stavridou.
The design of a rewrite rule interpreter from algebraic specifications. IEE Software Engineering Journal, July:95–104, 1987.
[34] Alain Colmerauer, H. Kanoui, and M. van Caneghem. Etude et réalisation d'un système Prolog. Technical report, Groupe d'Intelligence Artificielle, U.E.R. de Luminy, Université d'Aix-Marseille II, 1979.
[35] Robert Constable et al. Implementing Mathematics with the Nuprl Proof Development System. Prentice-Hall, 1986.
[36] Pierre-Louis Curien. Categorical Combinators, Sequential Algorithms, and Functional Programming. Pitman and Wiley, 1986. Research Notes in Theoretical Computer Science.
[37] Haskell Curry and R. Feys. Combinatory Logic, Volume I. North-Holland, 1958.
[38] Haskell Curry, J.R. Hindley, and J.P. Seldin. Combinatory Logic, Volume II. North-Holland, 1972. Studies in Logic 65.
[39] Nachum Dershowitz and Jean-Pierre Jouannaud. Rewriting systems. In Jan van Leeuwen, editor, Handbook of Theoretical Computer Science, Volume B: Formal Methods and Semantics, pages 243–320. North-Holland, 1990.
[40] Nachum Dershowitz and David Plaisted. Equational programming. In John Hayes, Donald Michie, and J. Richards, editors, Machine Intelligence 11, pages 21–56. Oxford, 1987.
[41] Răzvan Diaconescu. The logic of Horn clauses is equational. Technical Report PRG–TR–3–93, Programming Research Group, University of Oxford, 1993. Written 1990.
[42] Răzvan Diaconescu. Category-based Semantics for Equational and Constraint Logic Programming. PhD thesis, Programming Research Group, Oxford University, 1994.
[43] Răzvan Diaconescu and Kokichi Futatsugi. CafeOBJ Report: The Language, Proof Techniques, and Methodologies for Object-Oriented Algebraic Specification. World Scientific, 1998. AMAST Series in Computing, Volume 6.
[44] Aubert Daigneault, editor. Studies in Algebraic Logic. Mathematical Association of America, 1974. MAA Studies in Mathematics, Vol. 9.
[45] Hartmut Ehrig and Bernd Mahr.
Fundamentals of Algebraic Specification 1: Equations and Initial Semantics. Springer, 1985. EATCS Monographs on Theoretical Computer Science, Vol. 6.
[46] Trevor Evans. On multiplicative systems defined by generators and relations, I. Proceedings of the Cambridge Philosophical Society, 47:637–649, 1951.
[47] Kokichi Futatsugi, Joseph Goguen, Jean-Pierre Jouannaud, and José Meseguer. Principles of OBJ2. In Brian Reid, editor, Proceedings, Twelfth ACM Symposium on Principles of Programming Languages, pages 52–66. Association for Computing Machinery, 1985.
[48] Jean H. Gallier. Logic for Computer Scientists. Harper and Row, 1986.
[49] Stephen Garland and John Guttag. Inductive methods for reasoning about abstract data types. In Proceedings, Fifteenth Symposium on Principles of Programming Languages, pages 219–229. Association for Computing Machinery, January 1988.
[50] Jürgen Giesl and Aart Middeldorp. Innermost termination of context-sensitive rewriting. In Proceedings, Sixth International Conference on Developments in Language Theory. Springer, 2002. Lecture Notes in Computer Science.
[51] Joseph Goguen. Reality and human values in mathematics. Submitted for publication.
[52] Joseph Goguen. Semantics of computation. In Ernest Manes, editor, Proceedings, First International Symposium on Category Theory Applied to Computation and Control, pages 151–163. Springer, 1975. (San Francisco, February 1974.) Lecture Notes in Computer Science, Volume 25.
[53] Joseph Goguen. Abstract errors for abstract data types. In Eric Neuhold, editor, Proceedings, First IFIP Working Conference on Formal Description of Programming Concepts, pages 21.1–21.32. MIT, 1977. Also in Formal Description of Programming Concepts, Peter Neuhold, Ed., North-Holland, pages 491–522, 1979.
[54] Joseph Goguen. Order-sorted algebra. Technical Report 14, UCLA Computer Science Department, 1978. Semantics and Theory of Computation Series.
[55] Joseph Goguen.
Some design principles and theory for OBJ-0, a language for expressing and executing algebraic specifications of programs. In Edward Blum, Manfred Paul, and Satoru Takasu, editors, Proceedings, Conference on Mathematical Studies of Information Processing, pages 425–473. Springer, 1979. Lecture Notes in Computer Science, Volume 75.
[56] Joseph Goguen. How to prove algebraic inductive hypotheses without induction, with applications to the correctness of data type representations. In Wolfgang Bibel and Robert Kowalski, editors, Proceedings, Fifth Conference on Automated Deduction, pages 356–373. Springer, 1980. Lecture Notes in Computer Science, Volume 87.
[57] Joseph Goguen. Modular algebraic specification of some basic geometrical constructions. Artificial Intelligence, pages 123–153, 1988. Special Issue on Computational Geometry, edited by Deepak Kapur and Joseph Mundy; also, Report CSLI-87-87, Center for the Study of Language and Information at Stanford University, March 1987.
[58] Joseph Goguen. Memories of ADJ. Bulletin of the European Association for Theoretical Computer Science, 36:96–102, October 1989. Guest column in the 'Algebraic Specification Column.' Also in Current Trends in Theoretical Computer Science: Essays and Tutorials, World Scientific, 1993, pages 76–81.
[59] Joseph Goguen. OBJ as a theorem prover, with application to hardware verification. In V.P. Subramanyan and Graham Birtwhistle, editors, Current Trends in Hardware Verification and Automated Theorem Proving, pages 218–267. Springer, 1989.
[60] Joseph Goguen. What is unification? A categorical view of substitution, equation and solution. In Maurice Nivat and Hassan Aït-Kaci, editors, Resolution of Equations in Algebraic Structures, Volume 1: Algebraic Techniques, pages 217–261. Academic, 1989.
[61] Joseph Goguen. Higher-order functions considered unnecessary for higher-order programming. In David Turner, editor, Research Topics in Functional Programming, pages 309–352.
Addison-Wesley, 1990. University of Texas at Austin Year of Programming Series; preliminary version in SRI Technical Report SRI-CSL-88-1, January 1988.
[62] Joseph Goguen. Proving and rewriting. In Hélène Kirchner and Wolfgang Wechler, editors, Proceedings, Second International Conference on Algebraic and Logic Programming, pages 1–24. Springer, 1990. Lecture Notes in Computer Science, Volume 463.
[63] Joseph Goguen. A categorical manifesto. Mathematical Structures in Computer Science, 1(1):49–67, March 1991.
[64] Joseph Goguen. An introduction to algebraic semiotics, with applications to user interface design. In Chrystopher Nehaniv, editor, Computation for Metaphors, Analogy and Agents, pages 242–291. Springer, 1999. Lecture Notes in Artificial Intelligence, Volume 1562.
[65] Joseph Goguen. Social and semiotic analyses for theorem prover user interface design. Formal Aspects of Computing, 11:272–301, 1999. Special issue on user interfaces for theorem provers.
[66] Joseph Goguen and Rod Burstall. Institutions: Abstract model theory for computer science. Technical Report CSLI-85-30, Center for the Study of Language and Information, Stanford University, 1985. A preliminary version appears in Proceedings, Logics of Programming Workshop, Edmund Clarke and Dexter Kozen, editors, Springer Lecture Notes in Computer Science, Volume 164, pages 221–256, 1984.
[67] Joseph Goguen and Rod Burstall. Institutions: Abstract model theory for specification and programming. Journal of the Association for Computing Machinery, 39(1):95–146, January 1992.
[68] Joseph Goguen and Răzvan Diaconescu. An Oxford survey of order-sorted algebra. Mathematical Structures in Computer Science, 4:363–392, 1994.
[69] Joseph Goguen, Jean-Pierre Jouannaud, and José Meseguer. Operational semantics of order-sorted algebra. In Wilfried Brauer, editor, Proceedings, 1985 International Conference on Automata, Languages and Programming, pages 221–231. Springer, 1985. Lecture Notes in Computer Science, Volume 194.
[70] Joseph Goguen, Claude Kirchner, and José Meseguer. Concurrent term rewriting as a model of computation. In Robert Keller and Joseph Fasel, editors, Proceedings, Graph Reduction Workshop, pages 53–93. Springer, 1987. Lecture Notes in Computer Science, Volume 279.
[71] Joseph Goguen and Kai Lin. Behavioral verification of distributed concurrent systems with BOBJ. In Hans-Dieter Ehrich and T.H. Tse, editors, Proceedings, Conference on Quality Software, pages 216–235. IEEE Press, 2003.
[72] Joseph Goguen and Kai Lin. Specifying, programming and verifying with equational logic. In Sergei Artemov, Howard Barringer, Artur d'Avila Garcez, Luis Lamb, and John Woods, editors, We Will Show Them! Essays in honour of Dov Gabbay, Vol. 2, pages …
[73] … Proceedings, Automated Software Engineering, pages 55–62. IEEE, 1997.
[74] Joseph Goguen, Kai Lin, Akira Mori, Grigore Roşu, and Akiyoshi Sato. Tools for distributed cooperative design and validation. In Proceedings, CafeOBJ Symposium. Japan Advanced Institute for Science and Technology, 1998. Numazu, Japan, April 1998.
[75] Joseph Goguen, Kai Lin, and Grigore Roşu. Circular coinductive rewriting. In Automated Software Engineering '00, pages 123–131. IEEE, 2000. Proceedings of a workshop held in Grenoble, France.
[76] Joseph Goguen, Kai Lin, Grigore Roşu, Akira Mori, and Bogdan Warinschi. An overview of the Tatami project. In Kokichi Futatsugi, Ataru Nakagawa, and Tetsuo Tamai, editors, Cafe: An Industrial-Strength Algebraic Formal Method, pages 61–78. Elsevier, 2000.
[77] Joseph Goguen and Grant Malcolm. Algebraic Semantics of Imperative Programs. MIT, 1996.
[78] Joseph Goguen and José Meseguer. Completeness of many-sorted equational logic. Houston Journal of Mathematics, 11(3):307–334, 1985.
[79] Joseph Goguen and José Meseguer. Eqlog: Equality, types, and generic modules for logic programming. In Douglas DeGroot and Gary Lindstrom, editors, Logic Programming: Functions, Relations and Equations, pages 295–363.
Prentice-Hall, 1986. An earlier version appears in Journal of Logic Programming, Volume 1, Number 2, pages 179–210, September 1984.
[80] Joseph Goguen and José Meseguer. Remarks on remarks on many-sorted equational logic. Bulletin of the European Association for Theoretical Computer Science, 30:66–73, October 1986. Also in SIGPLAN Notices, Volume 22, Number 4, pages 41–48, April 1987.
[81] Joseph Goguen and José Meseguer. Order-sorted algebra solves the constructor selector, multiple representation and coercion problems. In Proceedings, Second Symposium on Logic in Computer Science, pages 18–29. IEEE Computer Society, 1987. Also Report CSLI-87-92, Center for the Study of Language and Information, Stanford University, March 1987; revised version in Information and Computation, 103, 1993.
[82] Joseph Goguen and José Meseguer. Order-sorted algebra I: Equational deduction for multiple inheritance, overloading, exceptions and partial operations. Theoretical Computer Science, 105(2):217–273, 1992. Drafts exist from as early as 1985.
[83] Joseph Goguen, Akira Mori, and Kai Lin. Algebraic semiotics, ProofWebs and distributed cooperative proving. In Yves Bertot, editor, UITP'97, User Interfaces for Theorem Provers, pages 25–34. INRIA, 1999. (Sophia Antipolis, 1–2 September 1997).
[84] Joseph Goguen, Andrew Stevens, Keith Hobley, and Hendrik Hilberdink. 2OBJ, a metalogical framework based on equational logic. Philosophical Transactions of the Royal Society, Series A, 339:69–86, 1992. Also in Mechanized Reasoning and Hardware Design, edited by C.A.R. Hoare and Michael J.C. Gordon, Prentice-Hall, 1992, pages 69–86.
[85] Joseph Goguen and Joseph Tardo. OBJ-0 preliminary users manual. Semantics and theory of computation report 10, UCLA, 1977.
[86] Joseph Goguen, James Thatcher, and Eric Wagner. An initial algebra approach to the specification, correctness and implementation of abstract data types.
In Raymond Yeh, editor, Current Trends in Programming Methodology, IV, pages 80–149. Prentice-Hall, 1978.

[87] Joseph Goguen, James Thatcher, Eric Wagner, and Jesse Wright. Abstract data types as initial algebras and the correctness of data representations. In Alan Klinger, editor, Computer Graphics, Pattern Recognition and Data Structure, pages 89–93. IEEE, 1975.

[88] Joseph Goguen, James Thatcher, Eric Wagner, and Jesse Wright. Initial algebra semantics and continuous algebras. Journal of the Association for Computing Machinery, 24(1):68–95, January 1977.

[89] Joseph Goguen, James Thatcher, Eric Wagner, and Jesse Wright. Initial algebra semantics and continuous algebras. Journal of the Association for Computing Machinery, 24(1):68–95, January 1977. An early version is "Initial Algebra Semantics", by Joseph Goguen and James Thatcher, IBM T.J. Watson Research Center, Report RC 4865, May 1974.

[90] Joseph Goguen, Timothy Winkler, José Meseguer, Kokichi Futatsugi, and Jean-Pierre Jouannaud. Introducing OBJ. In Joseph Goguen and Grant Malcolm, editors, Software Engineering with OBJ: Algebraic Specification in Action, pages 3–167. Kluwer, 2000. Also Technical Report SRI-CSL-88-9, August 1988, SRI International.

[91] Robert Goldblatt. Topoi, the Categorial Analysis of Logic. North-Holland, 1979.

[92] Michael J.C. Gordon. The Denotational Description of Programming Languages. Springer, 1979.

[93] Michael J.C. Gordon. HOL: A machine oriented formulation of higher-order logic. Technical Report 85, University of Cambridge, Computer Laboratory, July 1985.

[94] Michael J.C. Gordon. Why higher-order logic is a good formalism for specifying and verifying hardware. In George Milne and P.A. Subrahmanyam, editors, Formal Aspects of VLSI Design. North-Holland, 1986.

[95] Michael J.C. Gordon, Robin Milner, and Christopher Wadsworth. Edinburgh LCF. Springer, 1979. Lecture Notes in Computer Science, Volume 78.

[96] Paul Halmos. Algebraic Logic.
Van Nostrand, 1962.

[97] Paul Halmos and Steven Givant. Logic as Algebra. Mathematical Association of America, 1998. Dolciani Expositions No. 21.

[98] Robert Harper, Furio Honsell, and Gordon Plotkin. A framework for defining logics. In Proceedings, Second Symposium on Logic in Computer Science, pages 194–204. IEEE Computer Society, 1987.

[99] Robert Harper, David MacQueen, and Robin Milner. Standard ML. Technical Report ECS-LFCS-86-2, Department of Computer Science, University of Edinburgh, 1986.

[100] Leon Henkin, Donald Monk, and Alfred Tarski. Cylindric Algebras. North Holland, 1971.

[101] Phillip J. Higgins. Algebras with a scheme of operators. Mathematische Nachrichten, 27:115–132, 1963.

[102] G. Higman and B.H. Neumann. Groups as groupoids with one law. Publ. Math. Debrecen, 2:215–221, 1952.

[103] J.R. Hindley. The Church-Rosser Property and a Result in Combinatory Logic. PhD thesis, University of Newcastle-upon-Tyne, 1964.

[104] Masako K. Hiraga. Diagrams and metaphors: Iconic aspects in language. Journal of Pragmatics, 22:5–21, 1994.

[105] S. Hölldobler, editor. Foundations of Equational Logic Programming. Springer, 1989. Lecture Notes in Artificial Intelligence, Volume 353.

[106] Jieh Hsiang. Refutational Theorem Proving using Term Rewriting Systems. PhD thesis, University of Illinois at Champaign-Urbana, 1981.

[107] Paul Hudak, Simon Peyton Jones, Philip Wadler, Arvind, et al. Report on the functional programming language Haskell. ACM SIGPLAN Notices, 27, May 1992. Version 1.2.

[108] Gérard Huet. Confluent reductions: Abstract properties and applications to term rewriting systems. Journal of the Association for Computing Machinery, 27(4):797–821, 1980. Preliminary version in Proceedings, 18th IEEE Symposium on Foundations of Computer Science, IEEE, 1977, pages 30–45.

[109] Gérard Huet and Derek Oppen. Equations and rewrite rules: A survey. In Ron Book, editor, Formal Language Theory: Perspectives and Open Problems, pages 349–405.
Academic, 1980.

[110] John Hughes. Abstract interpretations of first-order polymorphic functions. In Cordelia Hall, John Hughes, and John O'Donnell, editors, Proceedings of the 1988 Glasgow Workshop on Functional Programming, pages 68–86. Computing Science Department, University of Glasgow, 1989.

[111] Heinz Kaphengst and Horst Reichel. Initial algebraic semantics for non-context-free languages. In Marek Karpinski, editor, Fundamentals of Computation Theory, pages 120–126. Springer, 1977. Lecture Notes in Computer Science, Volume 56.

[112] Matt Kaufmann, Panagiotis Manolios, and J. Strother Moore. Computer-Aided Reasoning: An Approach. Kluwer, 2000.

[113] Claude Kirchner, Hélène Kirchner, and José Meseguer. Operational semantics of OBJ3. In T. Lepistö and Arto Salomaa, editors, Proceedings, 15th International Colloquium on Automata, Languages and Programming, pages 287–301. Springer, 1988. (Tampere, Finland, 11–15 July 1988.) Lecture Notes in Computer Science, Volume 317.

[114] Jan Willem Klop. Term rewriting systems: A tutorial. Bulletin of the European Association for Theoretical Computer Science, 32:143–182, June 1987.

[115] Jan Willem Klop. Term rewriting systems: from Church-Rosser to Knuth-Bendix and beyond. In Samson Abramsky, Dov Gabbay, and Tom Maibaum, editors, Handbook of Logic in Computer Science, pages 1–117. Oxford, 1992.

[116] Donald Knuth. Semantics of context-free languages. Mathematical Systems Theory, 2:127–145, 1968.

[117] Donald Knuth and Peter Bendix. Simple word problems in universal algebra. In J. Leech, editor, Computational Problems in Abstract Algebra. Pergamon, 1970.

[118] William Labov. The transformation of experience in narrative syntax. In Language in the Inner City, pages 354–396. University of Pennsylvania, 1972.

[119] Leslie Lamport. LaTeX User Guide and Reference Manual. Addison-Wesley, 1985.

[120] Peter Landin. A correspondence between ALGOL 60 and Church's lambda notation.
Communications of the Association for Computing Machinery, 8(2):89–101, 1965.

[121] F. William Lawvere. Functorial semantics of algebraic theories. Proceedings, National Academy of Sciences, U.S.A., 50:869–872, 1963. Summary of Ph.D. Thesis, Columbia University.

[122] Alexander Leitsch. The Resolution Calculus. Springer, 1997. Texts in Theoretical Computer Science.

[123] Charlotte Linde. The organization of discourse. In Timothy Shopen and Joseph M. Williams, editors, Style and Variables in English, pages 84–114. Winthrop, 1981.

[124] Charlotte Linde. Private stories in public discourse. Poetics, …

[125] … Notes on Logic. Van Nostrand, 1966. Mathematical Studies, Volume 6.

[126] Saunders Mac Lane. Categories for the Working Mathematician. Springer, 1971.

[127] Saunders Mac Lane. Abstract algebra uses homomorphisms. American Mathematical Monthly, 103(4):330–331, April 1996.

[128] Saunders Mac Lane and Garrett Birkhoff. Algebra. Macmillan, 1967.

[129] David MacQueen and Donald Sannella. Completeness of proof systems for equational specifications. IEEE Transactions on Software Engineering, SE-11(5):454–461, May 1985.

[130] Ernest Manes. Algebraic Theories. Springer, 1976. Graduate Texts in Mathematics, Volume 26.

[131] John McCarthy, Michael Levin, et al. LISP 1.5 Programmer's Manual. MIT, 1966.

[132] William McCune. Otter 3.0 Users Guide, 1995. Technical Report, Argonne National Laboratory.

[133] Elliott Mendelson. Introduction to Mathematical Logic. Academic, 1979. Second edition.

[134] José Meseguer. Conditional rewriting logic: Deduction, models and concurrency. In Stéphane Kaplan and Mitsuhiro Okada, editors, Conditional and Typed Rewriting Systems, pages 64–91. Springer, 1991. Lecture Notes in Computer Science, Volume 516.

[135] José Meseguer. Conditional rewriting logic as a unified model of concurrency. Theoretical Computer Science, 96(1):73–155, 1992.

[136] José Meseguer and Joseph Goguen. Deduction with many-sorted rewrite rules.
Technical Report CSLI-85-42, Center for the Study of Language and Information, Stanford University, December 1985.

[137] José Meseguer and Joseph Goguen. Initiality, induction and computability. In Maurice Nivat and John Reynolds, editors, Algebraic Methods in Semantics, pages 459–541. Cambridge, 1985.

[138] José Meseguer and Joseph Goguen. Order-sorted algebra solves the constructor selector, multiple representation and coercion problems. Information and Computation, 103(1):114–158, March 1993. Revision of a paper presented at LICS 1987.

[139] Donald Michie. 'Memo' functions and machine learning. Nature, …

[140] … Proceedings, 7th Symposium on Principles of Programming Languages. Association for Computing Machinery, 1980.

[141] Alan Mycroft. Abstract Interpretation and Optimising Transformations for Applicative Programs. PhD thesis, University of Edinburgh, 1981.

[142] Anil Nerode. Linear automaton transformations. Proceedings, American Math. Society, 9:541–544, 1958.

[143] M.H.A. Newman. On theories with a combinatorial definition of 'equivalence'. Annals of Mathematics, 43(2):223–243, 1942.

[144] Petr Novikov. On the algorithmic unsolvability of the word problem in group theory. Trudy Mat. Inst. Steklov, 44:143, 1955.

[145] Julian Orr. Narratives at work: Story telling as cooperative diagnostic activity. In Proceedings, Conference on Computer Supported Cooperative Work (SIGCHI). Association for Computing Machinery, 1986.

[146] David Parnas. Information distribution aspects of design methodology. Information Processing '72, 71:339–344, 1972. Proceedings of 1972 IFIP Congress.

[147] Lawrence Paulson. Logic and Computation: Interactive Proof with Cambridge LCF. Cambridge, 1987. Cambridge Tracts in Theoretical Computer Science, Volume 2.

[148] Lawrence C. Paulson. The foundation of a generic theorem prover. Technical Report 130, University of Cambridge, Computer Laboratory, March 1988.

[149] Charles Sanders Peirce. Collected Papers. Harvard, 1965.
In 6 volumes; see especially Volume 2: Elements of Logic.

[150] David Plaisted. An initial algebra semantics for error presentations. SRI International, Computer Science Laboratory, 1982.

[151] David Plaisted. Equational reasoning and term rewriting systems. In Dov Gabbay and Jörg Siekmann, editors, Handbook of Logic in AI and Logic Programming. Oxford, 1993.

[152] Axel Poigné. Parameterization for order-sorted algebraic specification. Journal of Computer and System Sciences, 40(3):229–268, 1990.

[153] Emil Post. Recursive unsolvability of a problem of Thue. J. Symbolic Logic, 12:1–11, 1947.

[154] Michael Rabin and Dana Scott. Finite automata and their decision problems. IBM Journal of Research and Development, 3:114–125, 1959.

[155] Helena Rasiowa and R. Sikorski. The Mathematics of Metamathematics. PAN (Warsaw), 1963.

[156] Brian Ritchie and Paul Taylor. The interactive proof editor: An experiment in interactive theorem proving. In V.P. Subramanyan and Graham Birtwhistle, editors, Current Trends in Hardware Verification and Automated Theorem Proving, pages 303–322. Springer, 1989.

[157] J. Alan Robinson. A machine-oriented logic based on the resolution principle. Journal of the Association for Computing Machinery, 12:23–41, 1965.

[158] Barry Rosen. Tree-manipulating systems and Church-Rosser theorems. Journal of the Association for Computing Machinery, pages 160–187, January 1973.

[159] A.B.C. Sampaio and Kamran Parsaye. The formal specification and testing of expanded hardware building blocks. In Proceedings, ACM Computer Science Conference. Association for Computing Machinery, 1981.

[160] M. Schönfinkel. Über die Bausteine der mathematischen Logik. Mathematische Annalen, 92:305–316, 1924. In From Frege to Gödel, Jean van Heijenoort (editor), Harvard, 1967, pages 355–366.

[161] Dana Scott. Lattice theory, data types and semantics. In Randall Rustin, editor, Formal Semantics of Algorithmic Languages, pages 65–106.
Prentice-Hall, 1972.

[162] Dana Scott and Christopher Strachey. Towards a mathematical semantics for computer languages. In Proceedings, 21st Symposium on Computers and Automata, pages 19–46. Polytechnic Institute of Brooklyn, 1971. Also Programming Research Group Technical Monograph PRG-6, Oxford.

[163] Joseph R. Shoenfield. Mathematical Logic. Addison-Wesley, 1967.

[164] Rob Shostak. Deciding combinations of theories. Journal of the ACM, 31(1):1–12, 1984.

[165] Gert Smolka, Werner Nutt, Joseph Goguen, and José Meseguer. Order-sorted equational computation. In Maurice Nivat and Hassan Aït-Kaci, editors, Resolution of Equations in Algebraic Structures, Volume 2: Rewriting Techniques, pages 299–367. Academic, 1989.

[166] S. Sridhar. An implementation of OBJ2: An object-oriented language for abstract program specification. In K.V. Nori, editor, Proceedings, Sixth Conference on Foundations of Software Technology and Theoretical Computer Science, pages 81–95. Springer, 1986. Lecture Notes in Computer Science, Volume 241.

[167] Victoria Stavridou. Specifying in OBJ, verifying in REVE, and some ideas about time. Technical report, Department of Computer Science, University of Manchester, 1987.

[168] Victoria Stavridou, Joseph Goguen, Steven Eker, and Serge Aloneftis. FUNNEL: A CHDL with formal semantics. In Proceedings, Advanced Research Workshop on Correct Hardware Design Methodologies, pages 117–144. IEEE, 1991.

[169] Victoria Stavridou, Joseph Goguen, Andrew Stevens, Steven Eker, Serge Aloneftis, and Keith Hobley. FUNNEL and 2OBJ: towards an integrated hardware design environment. In Theorem Provers in Circuit Design, volume IFIP Transactions, A-10, pages 197–223. North-Holland, 1992.

[170] Andrew Stevens and Joseph Goguen. Mechanised theorem proving with 2OBJ: A tutorial introduction. Technical report, Programming Research Group, University of Oxford, 1993.

[171] Mark Stickel. A Prolog technology theorem prover.
In First International Symposium on Logic Programming. Association for Computing Machinery, February 1984.

[172] Christopher Strachey. Towards a formal semantics. In Steel, editor, Formal Language Description Languages, pages 198–220. North-Holland, 1966.

[173] Christopher Strachey. Fundamental concepts in programming languages. Lecture Notes from International Summer School in Computer Programming, Copenhagen, 1967.

[174] Tanel Tammet. Gandalf. Journal of Automated Reasoning, 18(2):199–204, 1997.

[175] Alfred Tarski. The semantic conception of truth. Philos. Phenomenological Research, 4:13–47, 1944.

[176] James Thatcher, Eric Wagner, and Jesse Wright. Data type specification: Parameterization and the power of specification techniques. In Proceedings, Sixth Symposium on Principles of Programming Languages. Association for Computing Machinery, 1979. Also in TOPLAS 4, pages 711–732, 1982.

[177] Yoshihito Toyama. Counterexamples to termination for the direct sum of term rewriting systems. Information Processing Letters, …

[178] … Software – Practice and Experience, 9:31–49, 1979.

[179] David Turner. Miranda: A non-strict functional language with polymorphic types. In Jean-Pierre Jouannaud, editor, Functional Programming Languages and Computer Architectures, pages 1–16. Springer, 1985. Lecture Notes in Computer Science, Volume 201.

[180] Jeffrey Ullman. Elements of ML Programming. Prentice Hall, 1998.

[181] Ivo van Horebeck. Formal Specifications Based on Many-Sorted Initial Algebras and their Applications to Software Engineering. University of Leuven, 1988.

[182] Alfred North Whitehead. A Treatise on Universal Algebra, with Applications, I. Cambridge, 1898. Reprinted 1960.

[183] Glynn Winskel. Relating two models of hardware. In David Pitt, Axel Poigné, and David Rydeheard, editors, Proceedings, Second Summer Conference on Category Theory and Computer Science, pages 98–113. Laboratory for Computer Science, University of Edinburgh, 1987.
Editors' Notes

Notes for Chapter 1

E1. [Page 8] The sentence about no new theorems being proved by automated theorem provers is not accurate, because some new theorems in algebra were first proved by automatic first-order provers, like the Robbins conjecture, for example.

E2. [Page 10] The sentence about OBJ semantics can be confusing, because the expression "only because" might suggest that it happens "almost by chance."

E3. [Page 13] The author used to reset counters for each chapter, but the editors have decided to reset them for each section in order to improve readability, since some chapters are quite long.

Notes for Chapter 3

E4. [Page 41] After defining the concept of satisfaction in Definition 3.3.7, it would be convenient to add a discussion about algebras with empty sorts (M_s = ∅ for some sort s ∈ S) and the satisfaction of equations for such algebras: M ⊨ (∀X) t = t′ if there is no assignment a : X → M due to M_s = ∅. See also Section 4.3.2.

E5. [Page 43] The proof of Theorem 3.3.11 is a bit too fast, and it is obscured by using the same notation M for the algebra and its underlying S-sorted set (see Definition 2.5.1).

E6. [Page 44] After defining satisfaction of conditional equations in Definition 3.4.1, it would be helpful to comment that M ⊨ (∀X) t = t′ if C when M does not satisfy the conditions C.

E7. [Page 52] In the proof of Proposition 3.7.2, the second paragraph proves the contrapositive of the first paragraph instead of its converse. In fact, the converse implication does not hold. Consider, for example, the signature ({s, s1, s2}, {f : s → s1, f : s → s2}). There are no overloaded terms because there are no terms, but the signature is not regular.

E8.
[Page 53] In Definition 3.7.5 it seems that M_Σ denotes the annotated form when there is confusion and the non-annotated form when there is no confusion (as stated at the end, "As with terms in T_Σ, we will usually omit the sort annotation unless it is necessary"). However, it also seems that the author's intention is to denote by M_Σ the non-annotated form. Otherwise, Definition 3.7.7 and Exercise 3.7.2 don't make sense. For example, a a a (Exercise 3.7.2) doesn't have five distinct parses, because it should be written either (a(a a.A).A).A or (a(a.A a).A).A or ((a a.A).A a).A or ((a.A a).A a).A or (a a.A a).A. Suggestion: just as there are annotated and non-annotated notations for terms, there could be two notations for M_Σ, corresponding to the non-annotated and annotated forms, respectively.

Notes for Chapter 4

E9. [Page 65] This section on explicit quantification has as title "The Need for Quantifiers." This can be a bit confusing, because what is really necessary is not the quantifiers per se, but the explicit annotation of the set of variables involved in an equation, which instead of being defined as a pair ⟨t, t′⟩ is now defined as a triple ⟨X, t, t′⟩. In books like Johnstone's "Notes on Logic and Set Theory" and Lambek & Scott's "Introduction to Higher-Order Categorical Logic", other notations not involving quantifiers are used for this, such as t =_X t′. It is interesting to compare with the discussion on page 249 about Horn clause symbols.

E10. [Page 72] In Exercise 4.5.5, it seems necessary to add the hypothesis that A ≠ ∅.

E11. [Page 86] In Definition 4.10.6, if Σ′ contains symbols from X, then the translation of (∀X) t = t′ along ϕ is not correctly defined.
This is a serious issue which has been considered in Diaconescu's book "Institution-Independent Model Theory": variables are triples of the form (x, s, Σ), where x is the name of the variable, s is the sort of the variable, and Σ is the signature for which the variable is considered. Then we have ϕ(x, s, Σ) = (x, ϕ(s), Σ′). In this way, variable name clashes are successfully avoided.

Notes for Chapter 5

E12. [Page 107] The second sentence in the statement of Proposition 5.3.4 is not true: if (Σ, A) is ground terminating, then (Σ(X), A) may fail to be ground terminating; consider for example the rule f(Z) → f(f(Z)) in Example 5.3.2. The second sentence should be read as "(Σ, A) is ground terminating if (Σ(X), A) is ground terminating," for the author writes so in the new statement of Proposition 5.3.4 on page 381.

E13. [Page 112] The explanatory paragraph after Definition 5.5.3 is confusing because it does not correspond to the definition statement. Instead, the second definition says that σ preserves the decrease of the weight, and the third definition says that all operations preserve weight decreasingness.

E14. [Page 118] After Theorem 5.6.4 (Orthogonality), an example showing why the left-linear condition is needed would help the reader (in the same way that Exercise 5.6.2 shows why the lapse free condition is needed). The following TRS is non-overlapping and lapse free but not Church-Rosser because it is not left linear: f(x, x) → a, f(x, g(x)) → b, c → g(c).

E15. [Page 126] Proposition 5.3.4 reappears on this page, so note E12 on page 107 also applies here.

E16. [Page 128] In the first paragraph after Definition 5.8.2, it seems unclear whether R_{k−1} should instead be R_k.

E17. [Page 131] Proposition 5.8.10 is the conditional generalization of Proposition 5.3.4 (on pages 107 and 126), and the same comment applies: if (Σ, A) is ground terminating then (Σ(X), A) may fail to be ground terminating. See notes E12 and E15 above.

E18.
[Page 147] It would be necessary to check whether this argument is right.

E19. [Page 152] The comment before Definition 5.9.1 about adding "little more information" seems inconsistent because the definition does not take that into account.

E20. [Page 156] This dangling reference probably corresponds to a corollary which may have been removed.

Notes for Chapter 6

E21. [Page 165] In the proof of the initiality Theorem 6.1.15 there is an important gap, because the author does not prove T_{Σ,A} ⊨ A, and so we don't know whether T_{Σ,A} is a (Σ, A)-algebra or not. Compare with note E42 for Theorem 9.1.10 on page 312.

E22. [Page 174] Exercise 6.4.1 coincides with Exercise 3.1.1 in Section 3.1.

E23. [Page 177] Example 6.5.7 on commutativity of addition could be used to show how lemmas come up in a proof attempt.

E24. [Page 178] Exercise 6.5.1 coincides with lemma0 in the previous Example 6.5.7.

E25. [Page 180] The protecting notion (pr NAT in Exercise 6.5.5) has not been explained before. The notion is quickly explained later in Example 7.3.11 (page 197).

Notes for Chapter 7

E26. [Page 195] In the proof of Theorem 7.3.9, it seems that there is a small gap in the proof of N ⊨ A ∪ B. Since [[_]]_B : T_Σ → N is a surjection, we should consider b : X → T_Σ such that b ; [[_]]_B = a. (Commutative diagram relating X, T_Σ, T_Σ(X), and N omitted.) Since a = b ; [[_]]_B and there exists a unique homomorphism T_Σ(X) → N extending a, we have b̄ ; [[_]]_B = ā. Now applying the given rule to t with substitution b : X → T_Σ gives b(t) ⇒_{A/B} b(t′), so these two terms have the same canonical form, i.e., [[b(t)]]_B = [[b(t′)]]_B and thus a(t) = a(t′).

E27. [Page 201] Here it is stated that the protecting notion was defined in Chapter 6, but this is not really the case, as pointed out about Exercise 6.5.5 on page 180 (see note E25 above). The notion is quickly introduced (in Chapter 7) in Example 7.3.11 (page 197).

E28.
[Page 203] Corollary 7.3.19 deserves a more detailed proof.

E29. [Page 211] For the statement of Proposition 7.4.3, it needs to be explained how the triangular system T becomes a first-order formula.

E30. [Page 216] As above (see note E29), for the statement of Proposition 7.4.10, it needs to be explained how the conditional triangular system T becomes a first-order formula.

E31. [Page 241] Lemma 7.7.16 and Proposition 7.7.17 refer to weak rewriting modulo, which has not been defined in the conditional case.

Notes for Chapter 8

E32. [Page 257] In Definition 8.3.2 the author should denote by â : WFF_X(Φ) → B the extension of a : X → M to well-formed formulae, because he denoted by a : T_Σ → M the extension of a to terms. As we may notice later, in the definition of substitution (Definition 8.3.16 on page 266), he denotes by θ̄ : T_Σ(X) → T_Σ(X) the extension of θ : X → T_Σ(X) to terms, and by θ̂ : WFF_X(Φ) → WFF_X(Φ) the extension of θ to (Φ, X)-formulae. So there seems to be an inconsistency in the notations.

E33. [Page 257] Since formulas and their meaning are defined with respect to a set X of variables, for satisfaction M ⊨_Φ P to be well defined (in Definition 8.3.2), independence from X must be proved. However, the definition of satisfaction is clearly ambiguous because it depends on the set of variables X. Assume, for example, a Φ-model M such that M_s = ∅ for some sort s ∈ S, and [[P]]_{M,∅} = ∅, i.e., M ⊭_Φ P, for some sentence (closed formula) P ∈ WFF_∅(Φ). Let X = {x : s} be the variable set with just one element x of sort s. We have P ∈ WFF_X(Φ) and [X → M] = ∅ = [[P]]_{M,X}. Hence M ⊨_Φ P, which is a contradiction with our initial assumption M ⊭_Φ P. This satisfaction relation should be indexed not only by the signature Φ but also by the set of variables X. For example, the notation ⊨_Φ^X clears up the above ambiguity.

E34.
[Page 257] The concept of institution, used at the end of Definition 8.3.2, has only been mentioned previously in footnote 1 on page 250.

E35. [Page 265] In the statement of Proposition 8.3.15 it could be added that, if P does not contain variables with empty sort, then the nonempty assumption on the carriers of M may be omitted.

E36. [Page 269] In the proof of Corollary 8.3.23, Theorem 8.3.22 is applied to Q = (∀Y) P, which requires that θ is capture free for Q.

E37. [Page 288] The hypothesis Bound(P) ∩ X = ∅ must be added to the statement of Proposition 8.5.1; otherwise, it could be possible for θ not to be capture free for P, and then Lemma 8.3.25 could not be applied. Furthermore, the notation for the function y : M_{s1,...,sn} → M_s should be different from the variable y : s. The reason is that θ has source X ∪ {y} and target T_{Σ(Y)}(X ∪ {y}). Thus, the target contains y regarded both as a variable and as an operation symbol.

E38. [Page 289] Proposition 8.3.15 is used in the proof of Proposition 8.5.2 without checking the nonempty carriers condition.

E39. [Page 295] The notation for assignment application to a formula seems different in the proof of Proposition 8.7.1.

E40. [Page 295] In the paragraph after Proposition 8.7.1, θ_m is not a substitution, but an assignment in general. Then θ_m(P) is not a sentence.

E41. [Page 303] The concept "anarchic" used in Exercise 8.7.4 has not been defined.

Notes for Chapter 9

E42. [Page 312] In the proof of Theorem 9.1.10, the author does not prove T_{Σ,A} ⊨ A. Hence, we don't know whether T_{Σ,A} is a (Σ, A)-algebra or not. Compare with note E21 for Theorem 6.1.15 on page 165.

E43. [Page 313] Here the author invokes Definition 8.3.2, meaning that we have to deal here with the same problems as in Chapter 8 (see above notes E32–E33 about this definition).

Notes for Chapter 10

E44. [Page 326] It seems that the conclusions of Theorem 10.2.9 and Proposition 10.2.10 coincide.
However, Theorem 10.2.9 requires signatures to be regular while Proposition 10.2.10 does not. This is rather strange even with the additional explanations.

E45. [Page 329] The calligraphic notation for algebras in Proposition 10.2.18 and the paragraphs above it has not been used before.

E46. [Page 336] In Example 10.4.7, a function op f : A -> A should be added, in order to point out that [f(a)] = {f(a), f(b)} and f(c) ∉ [f(a)].

E47. [Page 345] In the paragraph before Theorem 10.6.6, the statement "for arbitrary A, it is necessary and sufficient that Σ has no quasi-empty models" needs a proof.

E48. [Page 391] In the proof of Theorem 10.2.8, the restriction condition in fact seems to correspond to what is called the monotonicity condition in Definition 10.1.5.

E49. [Page 393] The star notation used in the proof of Theorem 10.4.11 (Initiality) has not been used before. Instead, the overbar notation was preferred.
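The non-left-linear counterexample given in note E14 is easy to check mechanically. The following Python sketch (the tuple encoding of terms and the function name are ours, not the book's) applies the three rules f(x, x) → a, f(x, g(x)) → b, c → g(c) to the term f(c, c) and exhibits two distinct normal forms, confirming that the system is not Church-Rosser:

```python
# Sketch of note E14's counterexample: the TRS
#   f(x, x)    -> a
#   f(x, g(x)) -> b
#   c          -> g(c)
# is non-overlapping and lapse free but not left linear,
# and f(c, c) has the two distinct normal forms a and b.

# Terms are nested tuples: ('f', t1, t2), ('g', t), ('c',), ('a',), ('b',).

def rewrite_top(t):
    """Try each rule at the root of t; return the reduct, or None if none applies."""
    if t[0] == 'f' and t[1] == t[2]:
        return ('a',)                              # f(x, x) -> a
    if t[0] == 'f' and t[2][0] == 'g' and t[2][1] == t[1]:
        return ('b',)                              # f(x, g(x)) -> b
    if t == ('c',):
        return ('g', ('c',))                       # c -> g(c)
    return None

start = ('f', ('c',), ('c',))                      # the term f(c, c)

# Reduction 1: apply f(x, x) -> a directly at the root.
left = rewrite_top(start)

# Reduction 2: first rewrite the second argument c -> g(c) (a non-root step),
# then apply f(x, g(x)) -> b at the root.
mid = ('f', start[1], rewrite_top(start[2]))       # the term f(c, g(c))
right = rewrite_top(mid)

print(left, right)   # ('a',) ('b',) -- two distinct normal forms
```

Since no rule applies to either a or b, both are normal forms, so f(c, c) has no unique normal form and confluence fails, exactly as the note claims.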